XWiki Platform / XWIKI-7753

Newer versions of (doc, docx) attachments are not indexed by Lucene

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.0-rc-1, 3.5.1
    • Fix Version/s: 4.1, 4.0.1
    • Component/s: Search - Generic
    • Labels:
      None
    • Environment:
      Windows XP, JRE 1.6, reproduced with the install version (hsql) and war version (PostgreSQL 9.1)
    • keywords:
      word, doc, docx, attachment, index, lucene
    • Difficulty:
      Unknown
    • Documentation:
      N/A
    • Documentation in Release Notes:
      N/A

      Description

      When a Word document is attached to a page, its content is indexed properly by Lucene. However, after attaching a new version of the same document, search still finds only the original version. The sample contains two documents (each in both doc and docx format). The first contains the text "1212", the second contains "3434".

      To reproduce the bug, attach test.doc from directory 1212. Search will find the document. Then attach test.doc from directory 3434 to the same page. Search will still find only 1212, not 3434.

      If I delete the attachment from the page and reattach test.doc from 3434, it works correctly: search no longer finds 1212 but does find 3434. (However, the version history is lost this way.)
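
      The reproduction steps above can be written down as a small model of the expected behavior. This is only a sketch: `ToyIndex`, `attach`, `search`, and the page name `Main.Test` are hypothetical and are not XWiki or Lucene APIs, and the report does not state the root cause of the bug.

```python
class ToyIndex:
    """Hypothetical in-memory stand-in for a search index (not Lucene)."""

    def __init__(self):
        # Maps (page, filename) -> text of the latest indexed version.
        self._latest = {}

    def attach(self, page, filename, text):
        # Expected behavior: a new version replaces the indexed content
        # of the previous version of the same attachment.
        self._latest[(page, filename)] = text

    def search(self, term):
        # Sorted (page, filename) keys whose latest text contains the term.
        return sorted(key for key, text in self._latest.items() if term in text)


index = ToyIndex()
index.attach("Main.Test", "test.doc", "1212")  # first version (directory 1212)
first_hit = index.search("1212")               # the attachment is found
index.attach("Main.Test", "test.doc", "3434")  # new version (directory 3434)
old_hits = index.search("1212")                # expected: no longer found
new_hits = index.search("3434")                # expected: found
```

      In the reported bug the last two results are reversed: searching for 1212 still returns the attachment and searching for 3434 returns nothing.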

            People

            • Assignee: aj Andreas Jonsson
            • Reporter: marczi Daniel Marczisovszky