  1. XWiki Platform
  2. XWIKI-23239

Improve Solr indexing speed through parallelization and batch processing

    Description

      The goal of this issue is to improve Solr indexing speed, primarily through parallelization and batching. Both re-indexing the whole wiki after an upgrade and large imports can require indexing many pages at once, and indexing is currently quite slow.

      Since current systems usually have spare parallel computing power, the first step is to parallelize Solr indexing, at least to the point that preparing the data to index and calling Solr happen in separate threads. A further step is to explore whether submitting documents to Solr in batches speeds up indexing. Similarly, context setup costs could be reduced by exploiting the fact that several entities of a single document (e.g., objects and properties) are frequently indexed together.
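
      As a rough illustration of the split described above, the following sketch prepares entries on one thread and hands them through a bounded queue to a second thread that groups them into batches. It is not XWiki's implementation: the class name BatchingIndexerSketch, the BatchSink interface, and the use of plain strings in place of Solr documents are all assumptions made for the example, and error handling for a failing sink is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;

final class BatchingIndexerSketch {
    /** Hypothetical stand-in for the code that submits a batch to Solr. */
    public interface BatchSink {
        void send(List<String> batch);
    }

    /**
     * Prepares each reference on the calling thread while a separate sender
     * thread groups the prepared entries into batches of at most batchSize.
     */
    public static void indexAll(List<String> refs, Function<String, String> prepare,
        BatchSink sink, int batchSize) {
        // Bounded queue: preparation blocks when the sender falls behind.
        BlockingQueue<Optional<String>> queue =
            new ArrayBlockingQueue<>(Math.max(1, batchSize * 2));

        Thread sender = new Thread(() -> {
            List<String> batch = new ArrayList<>();
            try {
                while (true) {
                    Optional<String> next = queue.take();
                    if (!next.isPresent()) {
                        break; // an empty Optional marks the end of the input
                    }
                    batch.add(next.get());
                    if (batch.size() >= batchSize) {
                        sink.send(new ArrayList<>(batch));
                        batch.clear();
                    }
                }
                if (!batch.isEmpty()) {
                    sink.send(new ArrayList<>(batch)); // flush the final partial batch
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "sender-sketch");
        sender.start();

        try {
            for (String ref : refs) {
                queue.put(Optional.of(prepare.apply(ref)));
            }
            queue.put(Optional.empty());
            sender.join();
        } catch (InterruptedException e) {
            sender.interrupt(); // stop the sender if preparation is interrupted
            Thread.currentThread().interrupt();
        }
    }
}
```

      In this sketch the bounded queue provides back-pressure, so preparation cannot run arbitrarily far ahead of the sender. A real sink would need to decide how to report a failed batch; here an exception in send would simply end the sender thread.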

          People

            MichaelHamann Michael Hamann