Details
-
Bug
-
Resolution: Unresolved
-
Major
-
None
-
12.10.6
-
None
-
MariaDB 10.3.27, Debian 10
-
Unknown
-
Description
After upgrading the flavor version on an XWiki instance using Extension Manager, I have restarted the instance and the Solr Index Job started doing its work, however, at the same time, I have noticed the mysqld process going to 60-70% and staying there for a long while.
show full processlist was always displaying the following entry, with the limit updating as the job was slowly progressing through the 200K documents that I had in my wiki:
| 199 | xwiki | localhost:40818 | xwiki | Query | 2 | Creating sort index | select xwikidocum0_.XWD_WEB as col_0_0_, xwikidocum0_.XWD_NAME as col_1_0_, xwikidocum0_.XWD_LANGUAGE as col_2_0_, xwikidocum0_.XWD_VERSION as col_3_0_ from xwikidoc xwikidocum0_ order by xwikidocum0_.XWD_WEB, xwikidocum0_.XWD_NAME, xwikidocum0_.XWD_LANGUAGE limit 60200, 100 | 0.000 |
Execution of the indexing job took over 2 hours and all of that was in order to realize that it did not have to index anything (probably the changes fro my upgrade were indexed on the spot):
2021-07-06 22:53:35,168 [solr/indexer job group daemon thread - org.xwiki.search.solr.internal.job.IndexerJob@540d56f1([solr, indexer])] INFO o.x.s.s.i.j.IndexerJob - Starting job of type [solr.indexer] with identifier [[solr, indexer]] ... 2021-07-07 01:04:59,293 [solr/indexer job group daemon thread - org.xwiki.search.solr.internal.job.IndexerJob@540d56f1([solr, indexer])] INFO o.x.s.s.i.j.IndexerJob - 0 documents added, 0 deleted and 0 updated during the synchronization of the Solr index. 2021-07-07 01:04:59,360 [solr/indexer job group daemon thread - org.xwiki.search.solr.internal.job.IndexerJob@540d56f1([solr, indexer])] INFO o.x.s.s.i.j.IndexerJob - Finished job of type [solr.indexer] with identifier [[solr, indexer]]
Having a better look at the query, I noticed that it was triggering a filesort instead of using indexes:
explain select xwikidocum0_.XWD_WEB as col_0_0_, xwikidocum0_.XWD_NAME as col_1_0_, xwikidocum0_.XWD_LANGUAGE as col_2_0_, xwikidocum0_.XWD_VERSION as col_3_0_ from xwikidoc xwikidocum0_ order by xwikidocum0_.XWD_WEB limit 156300, 100; +------+-------------+--------------+------+---------------+------+---------+------+--------+----------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+--------------+------+---------------+------+---------+------+--------+----------------+ | 1 | SIMPLE | xwikidocum0_ | ALL | NULL | NULL | NULL | NULL | 182534 | Using filesort | +------+-------------+--------------+------+---------------+------+---------+------+--------+----------------+
Running the query as it is, for a batch of 100 results took:
100 rows in set (2.483 sec)
Individual indexes exist for all the clauses in the select and in the order by, but when you have more than 1 clauses and indexes for the order by, no single one is selected and mysql/mariadb turns to a filesort instead, as it ends up being "more efficient" than using the individual indexes.
If we add an explicit index for the order by clauses, we get a significant improvement:
CREATE INDEX solr_iterate_all_documents ON xwikidoc (XWD_WEB, XWD_NAME, XWD_LANGUAGE, XWD_VERSION);
explain select xwikidocum0_.XWD_WEB as col_0_0_, xwikidocum0_.XWD_NAME as col_1_0_, xwikidocum0_.XWD_LANGUAGE as col_2_0_, xwikidocum0_.XWD_VERSION as col_3_0_ from xwikidoc xwikidocum0_ order by xwikidocum0_.XWD_WEB limit 156300, 100; +------+-------------+--------------+-------+---------------+-----------------------------+---------+------+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+--------------+-------+---------------+-----------------------------+---------+------+--------+-------------+ | 1 | SIMPLE | xwikidocum0_ | index | NULL | solr_iterate_all_documents | 2319 | NULL | 156400 | Using index | +------+-------------+--------------+-------+---------------+-----------------------------+---------+------+--------+-------------+
...which can be observed when executing the now indexed query:
100 rows in set (0.067 sec)
So the conclusion is that this index is vital to increasing the performance of the indexing operation, specially for wikis with a large number of documents. In my case, it would have resulted in an execution that would have been 1 hour faster, i.e. 50% performance gain by improving the get documents query, executed in each batch of 100 documents.
This could even be considered a solution/improvement for XWIKI-16755.
The automatic adding of this missing Solr index could probably be handled as part of, or the same way as XWIKI-11660.
Attachments
Issue Links
- is related to
-
XWIKI-16755 Indexing the whole wiki take much longer than in 8.4.6
- Open
-
XWIKI-11660 Automatically create remaining database indexes
- Open