Details
-
Improvement
-
Resolution: Unresolved
-
Major
-
None
-
10.2, 9.11.4
-
MySQL
-
Unknown
-
Description
The reindexer job launch at startup or through the administration uses a select query that list document in batches of 100. This query looks like the following:
select XWD_WEB, XWD_NAME, XWD_LANGUAGE, XWD_VERSION from xwikidoc order by XWD_WEB, XWD_NAME, XWD_LANGUAGE limit ?, 100
This query requires the sorting of the entire dataset to get only 100 entries, and this sorting is currently not backed up by any optimization (at least under MySQL), causing a simple filesort that for a large will also require temporary file storage, and resorting of chunks. Such request may last several seconds or even minutes... and is repeated thousand times at startups. On a large dataset, the wiki almost unusable during several hours...
I have investigated possible optimizations. The first that come to mind according to 8.2.1.13 ORDER BY Optimization is to create the following index:
create unique index DOC_SPACE_NAME_LANGUAGE on xwikidoc (XWD_WEB, XWD_NAME, XWD_LANGUAGE);
But it does not really helps, unless you also add a FORCE INDEX to the query, since MySQL does not optimize the right way such a query. My current workaround is to create the following index:
create unique index DOC_SPACE_NAME_LANGUAGE_VERSION on xwikidoc (XWD_WEB, XWD_NAME, XWD_LANGUAGE, XWD_VERSION);
Which that one, only the index is looked at, and the query is lightning fast.
Another option that could be investigated would be to change the query to avoid the LIMIT offset, using the document ids that old the same information as the one currently in the ORDER BY. Maybe this could help the optimizer and avoid creating a new index. I haven't investigated this option, since my purpose was to find a solution without changing the indexer code.