Changes between Version 1 and Version 2 of SolrInterface


Timestamp:
02/23/10 11:22:30 (4 years ago)
Author:
dread
Comment:

--

  • SolrInterface

    v1 v2  
    88There are two main options for getting data into SOLR: 
    99 
    10 * POST the records to SOLR in XML format ([http://wiki.apache.org/solr/UpdateXmlMessages docs]) 
    11 * Direct connection: set up SOLR to read from the database itself ([http://wiki.apache.org/solr/DataImportHandler docs]) 
     10 * POST the records to SOLR in XML format ([http://wiki.apache.org/solr/UpdateXmlMessages docs]) 
     11 * Direct connection: set up SOLR to read from the database itself ([http://wiki.apache.org/solr/DataImportHandler docs]) 
    1212  * Provide SELECT statements to do queries 
    1313  * Process is initiated by doing a GET to a particular SOLR URL 
     
    1515The preference is for the first option, as the abstraction allows more flexibility in the db and more control over what gets indexed. 
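
As a rough illustration of the first option, the sketch below builds a SOLR Update XML `<add>` message from package dictionaries and POSTs it. The function names, the example field names and the update URL are assumptions made for illustration, not CKAN code; the real field set would be whatever schema.xml defines.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_add_xml(packages):
    """Build a SOLR <add> update message from a list of dicts.

    Each dict becomes one <doc>; each key/value becomes a <field>.
    The keys used by callers (e.g. "name", "title") are hypothetical.
    """
    add = ET.Element("add")
    for pkg in packages:
        doc = ET.SubElement(add, "doc")
        for field_name, value in pkg.items():
            field = ET.SubElement(doc, "field", name=field_name)
            field.text = str(value)  # ElementTree escapes &, < and >
    return ET.tostring(add, encoding="utf-8")


def post_update(solr_update_url, body):
    """POST an update message to SOLR and return the HTTP status code.

    solr_update_url is an assumption, e.g. a local instance's /update URL.
    """
    request = urllib.request.Request(
        solr_update_url,
        data=body,
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

A `<commit/>` message would still need to be sent to the same URL before the new documents become searchable.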
    1616 
     17When should a package be indexed? Currently we index it on the database after_insert and after_update triggers. However, this could seriously slow down a large data import, since each indexing operation requires a POST over the internet. One option is to keep the triggers but turn them off for a batch import and then run the indexing manually. Alternatively, store up changes and index them from an hourly cron job. 
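
The trade-off above could be sketched roughly as follows. `IndexQueue` and its `index_func` callback are hypothetical names, not existing CKAN code: in immediate mode each change is indexed straight away (as the triggers do now), while in deferred mode changes are stored up and sent in one batch, for example after a bulk import or from a cron job.

```python
class IndexQueue:
    """Collect changed package names and index them now or later.

    index_func is a hypothetical callable taking a list of package
    names and sending them to SOLR.
    """

    def __init__(self, index_func, immediate=True):
        self.index_func = index_func
        self.immediate = immediate
        self._pending = {}  # dict keys: de-duplicated, insertion-ordered

    def package_changed(self, name):
        """Call this from the after_insert / after_update hooks."""
        if self.immediate:
            self.index_func([name])
        else:
            self._pending[name] = None

    def flush(self):
        """Index everything stored up so far; return how many were sent."""
        names = list(self._pending)
        if names:
            self.index_func(names)
        self._pending.clear()
        return len(names)
```

For a batch import, the queue would be created with `immediate=False` and `flush()` called once at the end.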
     18 
    1719== Tickets == 
    1820 
    19 * Get a SOLR instance running locally and/or eu1, using basic config. 
    20 * Harness one of the three python SOLR libraries to send SOLR Update XML of CKAN Packages. 
    21 * Write tests for SOLR by sending data with SOLR library and using JSON interface for queries. 
    22 * Optimise the SOLR settings for searching our data well - fields description in schema.xml. (Link to schema.xml in developer docs.) 
    23 * Provide option to connect CKAN's search WUI to SOLR back-end. 
     21 * Get a SOLR instance running locally and/or eu1, using basic config. 
     22 * Get indexing and searching working with name and title fields only: 
     23   * Harness one of the three Python SOLR libraries to send SOLR Update XML of CKAN Packages (triggered on the command-line).  
     24   * Write tests for SOLR by sending data with SOLR library and using JSON interface for queries. 
     25 * Get it working with all package fields, optimising the field descriptions in schema.xml.  
     26 * Trigger the indexing sensibly (as decided above). 
     27 * Provide option to connect CKAN's search WUI to SOLR back-end. 
     28 * Developer docs - description of how to set up SOLR, with a link to schema.xml.
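
For the tests that query through the JSON interface, a minimal sketch might look like the following. It assumes SOLR's standard select handler with `wt=json`, whose response carries the matching documents under `response.docs`; the helper names and the `name` field are assumptions for illustration.

```python
import json
from urllib.parse import urlencode


def build_select_url(select_url, query, rows=10):
    """Return a SOLR select URL that asks for a JSON response.

    select_url is an assumption, e.g. a local instance's /select URL.
    """
    params = urlencode({"q": query, "wt": "json", "rows": rows})
    return f"{select_url}?{params}"


def package_names(response_text):
    """Extract the (hypothetical) "name" field from each returned doc."""
    data = json.loads(response_text)
    return [doc.get("name") for doc in data["response"]["docs"]]
```

A test could index a known package, fetch the URL from `build_select_url`, and assert that `package_names` contains the package's name.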