Uploaded image for project: 'XWiki Platform'
  1. XWiki Platform
  2. XWIKI-15166

SOLR reindexation job cause serious perfomance issues on large database

    XMLWordPrintable

Details

    • Improvement
    • Resolution: Unresolved
    • Major
    • None
    • 10.2, 9.11.4
    • Search - Solr
    • MySQL
    • Unknown

    Description

      The reindexer job launch at startup or through the administration uses a select query that list document in batches of 100. This query looks like the following:

      select XWD_WEB, XWD_NAME, XWD_LANGUAGE, XWD_VERSION
      from xwikidoc
      order by XWD_WEB, XWD_NAME, XWD_LANGUAGE
      limit ?, 100
      

      This query requires the sorting of the entire dataset to get only 100 entries, and this sorting is currently not backed up by any optimization (at least under MySQL), causing a simple filesort that for a large will also require temporary file storage, and resorting of chunks. Such request may last several seconds or even minutes... and is repeated thousand times at startups. On a large dataset, the wiki almost unusable during several hours...

      I have investigated possible optimizations. The first that come to mind according to 8.2.1.13 ORDER BY Optimization is to create the following index:

      create unique index DOC_SPACE_NAME_LANGUAGE on xwikidoc (XWD_WEB, XWD_NAME, XWD_LANGUAGE);
      

      But it does not really helps, unless you also add a FORCE INDEX to the query, since MySQL does not optimize the right way such a query. My current workaround is to create the following index:

      create unique index DOC_SPACE_NAME_LANGUAGE_VERSION on xwikidoc (XWD_WEB, XWD_NAME, XWD_LANGUAGE, XWD_VERSION);
      

      Which that one, only the index is looked at, and the query is lightning fast.

      Another option that could be investigated would be to change the query to avoid the LIMIT offset, using the document ids that old the same information as the one currently in the ORDER BY. Maybe this could help the optimizer and avoid creating a new index. I haven't investigated this option, since my purpose was to find a solution without changing the indexer code.

      Attachments

        Activity

          People

            Unassigned Unassigned
            softec Denis Gervalle
            Votes:
            1 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated: