Tuesday, July 6, 2010

ApachesolrExtension module - Part 2


Following up on my earlier post, I have extended the module to address the following:
Current implementation - apachesolr 6.1.x
apachesolr_index_nodes() sends the documents to Solr in chunks of 20.
The documents are uploaded in chunks of 20 during every cron run.

Solution -
  • I have overridden the addDocuments() function in the Drupal_Apache_Solr_Service_Extension class so that it no longer wraps the XML of each chunk in <add> tags. I also added functions that merge all the documents and wrap them in a single <add> tag.
  • solr.xml is cleared at the start of every cron run.
  • A separate process (a cron job running a shell script) posts the XML to the Solr server to generate or update the index after every Drupal cron run.
 I have uploaded the latest version of the module here.
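
The three steps above can be sketched roughly as follows. This is a minimal Python illustration rather than the module's PHP code: the function names, the sample documents, and the update URL are assumptions made for the example, and the wrapping tag is assumed to be Solr's <add> element. Only the solr.xml file name comes from the post. The request is built but never sent.

```python
import os
import tempfile
import urllib.request


def clear_buffer(path):
    """Truncate the buffer file (solr.xml in the post) at the start of a cron run."""
    with open(path, "w", encoding="utf-8"):
        pass


def append_chunk(path, doc_fragments):
    """Append the unwrapped <doc> fragments of one chunk to the buffer file."""
    with open(path, "a", encoding="utf-8") as fh:
        for fragment in doc_fragments:
            fh.write(fragment + "\n")


def wrap_in_add(path):
    """Return all buffered fragments wrapped in a single <add> element."""
    with open(path, encoding="utf-8") as fh:
        body = fh.read()
    return "<add>\n" + body + "</add>\n"


def build_update_request(xml, url):
    """Build (but do not send) an HTTP POST that carries the XML to Solr."""
    return urllib.request.Request(
        url,
        data=xml.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical location and documents, for illustration only.
    path = os.path.join(tempfile.mkdtemp(), "solr.xml")
    clear_buffer(path)
    append_chunk(path, ['<doc><field name="id">1</field></doc>'])
    append_chunk(path, ['<doc><field name="id">2</field></doc>'])
    merged = wrap_in_add(path)
    # Assumed update endpoint; adjust to the actual Solr installation.
    request = build_update_request(merged, "http://localhost:8983/solr/update")
    print(merged)
```

Keeping the wrap step separate from the append step means each chunk can be written without knowing whether more chunks will follow in the same cron run.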

TODO: Find a better way of wrapping the XML in a single <add> tag.
Suggestion: the apachesolr module should pass the total number of documents to be added to the index to the instance of the Drupal_Apache_Solr_Service class.
That would make it possible to identify the last chunk of documents and wrap the entire XML in <add> tags at that point.
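
To illustrate the suggestion, here is a minimal Python sketch of a writer that is told the total document count up front. The class name ChunkedAddWriter and its methods are hypothetical and not part of apachesolr or the Solr PHP client, and the wrapping tag is assumed to be Solr's <add> element. The writer opens the tag on the first chunk and closes it once the running count reaches the total.

```python
import io


class ChunkedAddWriter:
    """Hypothetical writer that knows the total number of documents in advance."""

    def __init__(self, out, total_docs):
        self.out = out                # any file-like object with write()
        self.total_docs = total_docs  # count supplied by the indexing code
        self.written = 0

    def add_documents(self, doc_fragments):
        """Write one chunk, opening or closing the wrapper when needed."""
        if not doc_fragments:
            return
        if self.written == 0:
            # First chunk: open the single wrapping element.
            self.out.write("<add>")
        for fragment in doc_fragments:
            self.out.write(fragment)
        self.written += len(doc_fragments)
        if self.written >= self.total_docs:
            # Last chunk: the running count has reached the total.
            self.out.write("</add>")


if __name__ == "__main__":
    out = io.StringIO()
    writer = ChunkedAddWriter(out, total_docs=3)
    writer.add_documents(["<doc>a</doc>", "<doc>b</doc>"])
    writer.add_documents(["<doc>c</doc>"])
    print(out.getvalue())
```

With the total known, no separate merge pass over the file is needed, since the wrapper is complete as soon as the last chunk is written.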
