  1. XWiki Platform
  2. XWIKI-15656

Upgrade to Tika 1.19.1



    • Type: Task
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: 10.9
    • Affects Version/s: 10.8
    • Component/s: Dependency Upgrades
    • Labels: None
    • Unknown
    • N/A


      See http://www.apache.org/dist/tika/CHANGES-1.19.txt

         * Require Java 8 (TIKA-2679).
         * Enable building with Java 11 (TIKA-2668).
         * Add an option to make tika-server robust against infinite loops,
           OOMs, and memory leaks (TIKA-2725).
         * Allow configuration of the Tesseract parser via the standard
           tika-config.xml options (TIKA-2705).
         * Improve handling of empty cells across table-based
           formats (TIKA-2479).
         * Add a standards-compliant HTML encoding detector
           via Gerard Bouchar (TIKA-2673).
         * Improved XML parsing -- limited default entity expansions to 20.
           To raise this limit, add -Djdk.xml.entityExpansionLimit=XXX to
           your command line.
         * Mime magic improvements for Olympus RAW (TIKA-2658), interpreted
           server-side languages via HTTP (TIKA-2648), and MHTML (TIKA-2723).
         * Add absolute timeout to ForkParser rather than testing
           for active (TIKA-2656).
         * Make the RecursiveParserWrapper work with the ForkParser (TIKA-2655).
         * Allow the ForkParser to specify a directory containing tika-app.jar
           for use by the ForkServer.  This allows users to keep most of the
           parser dependencies out of their code; and it allows for an easy
           addition of optional jars for Parser dependencies,
           such as the xerial sqlite jar (TIKA-2653).
         * Use a pool for SAXParsers and DOMBuilders rather than creating
           a new parser/builder for every parse.
           For better performance, set XMLReaderUtils.setPoolSize() to the
           number of threads you're using with Tika (TIKA-2645).
         * Add the RecursiveParserWrapperHandler to improve the RecursiveParserWrapper
           API slightly (TIKA-2644).
         * Upgraded to Commons-Compress 1.18 (TIKA-2707).
         * Upgraded to Apache POI 4.0.0 (TIKA-2552).
         * Upgraded to Apache PDFBox 2.0.11 (TIKA-2681).
         * Upgraded to deeplearning4j 1.0.0-beta2 (TIKA-2672).
         * Upgraded jmatio to 1.4 (TIKA-2667).
         * Upgraded Apache Lucene to 7.4.0 in tika-eval and tika-examples (TIKA-2695).
         * Upgraded junrar to 1.0.1 (TIKA-2664).
         * Numerous other upgrades (TIKA-2692).
         * Excluded Spring as a transitive dependency (TIKA-2721).
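
      The entity expansion limit mentioned in the changelog above can be
      illustrated without Tika, using only the JDK's built-in SAX parser. The
      sketch below is an assumption-laden illustration, not Tika's code: the
      class and method names (EntityLimitDemo, documentWithReferences, parses)
      are hypothetical, and the property is set programmatically, before any
      parser is created, in place of the -Djdk.xml.entityExpansionLimit=XXX
      command-line flag. It assumes the JDK parser rejects a document once the
      number of expanded entities exceeds the configured limit.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; it does not use or reproduce any Tika code.
class EntityLimitDemo {
    // Same value as the default the changelog describes for Tika 1.19.
    static final int LIMIT = 20;

    static {
        // Programmatic equivalent of -Djdk.xml.entityExpansionLimit=20.
        // Set before the first parser is created so the JDK picks it up.
        System.setProperty("jdk.xml.entityExpansionLimit", String.valueOf(LIMIT));
    }

    // Builds a small document that references the internal entity "a"
    // the requested number of times.
    static String documentWithReferences(int references) {
        StringBuilder xml = new StringBuilder("<!DOCTYPE r [<!ENTITY a \"x\">]><r>");
        for (int i = 0; i < references; i++) {
            xml.append("&a;");
        }
        return xml.append("</r>").toString();
    }

    // Returns true if the document parses, false if the parser rejects it
    // with a SAXException (which is how a limit violation is reported).
    static boolean parses(int references) {
        byte[] bytes = documentWithReferences(references).getBytes(StandardCharsets.UTF_8);
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new ByteArrayInputStream(bytes), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        } catch (Exception e) {
            // Parser configuration or I/O problems are not expected here.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("5 references parse:  " + parses(5));
        System.out.println("50 references parse: " + parses(50));
    }
}
```

      A document well under the limit should parse, while one with far more
      references than the limit should be rejected. Content that legitimately
      relies on many entity references would need the higher limit the
      changelog describes.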


              tmortagne Thomas Mortagne