XWiki Platform / XWIKI-16650

Upgrade to Tika 1.22

Details

    • Task
    • Resolution: Fixed
    • Major
    • 11.9
    • 11.6
    • Dependency Upgrades
    • None
    • Unknown
    • N/A

    Description

      See: https://dist.apache.org/repos/dist/release/tika/CHANGES-1.22.txt

      Release 1.22 - 07/29/2019
      
         * NOTE: Known regression: PDFBOX-4587 -- PDF passwords with codepoints
           between 0xF000 and 0xF0000 will cause an exception.
      
         * Add parser for HWP v5 files via SooMyung Lee (soomyung) and
           JinSup Kim (ddoleye) (TIKA-2909).
      
         * Fix order of closing streams to avoid "Failed to close temporary resource"
           exception in TesseractOCRParser (TIKA-2908).
      
         * Improve AutoDetectReader performance by caching encoding
           detector (TIKA-1568).
      
         * Prevent RTFParser from outputting illegal tag combinations (TIKA-2889).
      
         * Fix RereadableInputStream to release all resources (TIKA-2903).
      
         * Implement custom language identifier in the tika-eval module based on
           OpenNLP's language detector; add 18 languages and add common words
           lists for all 121 languages (TIKA-2790).
      
         * Fix NPE in MimeTypesReader.releaseParser() via Eamonn Saunders (TIKA-2896).
      
         * Fix RTFParser to extract more content (TIKA-2883).
      
         * Add clientSubmitTime to the metadata extracted from PST files (TIKA-2898).
      
         * Improve StreamingZipContainerDetector for xltx, xltm and
           several other file formats (TIKA-2886).
      

            People

              tmortagne Thomas Mortagne
              surli Simon Urli
              Votes: 0
              Watchers: 2
