XWiki Platform / XWIKI-12274

Upgrade to Tika 1.9

Details

    Description

      See http://www.apache.org/dist/tika/CHANGES-1.9.txt

      Release 1.9 - 6/6/2015
      
        * The ability to use the cTAKES clinical text
          knowledge extraction system for biomedical data is 
          now included as a Tika parser (TIKA-1645, TIKA-1642).
      
        * Tika-server allows a user to specify the Tika config
          from the command line (TIKA-1652, TIKA-1426).
      
        * Matlab file detection has been improved (TIKA-1634).
      
        * The EXIFTool was added as an External parser
          (TIKA-1639).
      
        * If FFMPEG is installed and on the PATH, it is a 
          usable Parser in Tika now (TIKA-1510).
      
        * Fixes have been applied to the ExternalParser to make
          it functional (TIKA-1638).
      
        * Tika service loading can now be more verbose with the 
          org.apache.tika.service.error.warn system property (TIKA-1636).
      
        * Tika Server now allows for metadata extraction from remote
          URLs and in addition it outputs the detected language as a
          metadata field (TIKA-1625).
      
        * Fixed OUTPUT_FILE_TOKEN not being replaced in ExternalParser;
          fix contributed by Pascal Essiembre (TIKA-1620).
      
        * Tika REST server now supports language identification
          (TIKA-1622).
      
        * All of the example code from the Tika in Action book has 
          been donated to Tika and added to tika-examples (TIKA-1562).
      
        * Tika server now logs errors determining ContentDisposition
          (TIKA-1621).
      
        * An algorithm for using Byte Histogram frequencies to construct
          a Neural Network and to perform MIME detection was added
          (TIKA-1582).
      
        * A Bayesian algorithm for MIME detection by probabilistic
          means was added (TIKA-1517).
      
        * Tika now incorporates the Apache Spatial Information
          System capability of parsing Geographic ISO 19139
          files (TIKA-443). It can also detect those files.
      
        * Update the MimeTypes code to support inheritance
          (TIKA-1535).
      
        * Provide ability to parse and identify Global Change 
          Master Directory Interchange Format (GCMD DIF) 
          scientific data files (TIKA-1532).
      
        * Improvements to detect CBOR files by extension (TIKA-1610).
      
        * Change xerial.org's sqlite-jdbc jar to "provided" (TIKA-1511).
          Users will now need to add sqlite-jdbc to their classpath for
          the Sqlite3Parser to work.
      
        * ExternalParser.check now catches (suppresses) SecurityException
          and returns false, so it's OK to run Tika with a security policy
          that does not allow execution of external processes (TIKA-1628).
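
      Two of the entries above change runtime behaviour that an upgrading
      application may want to probe: sqlite-jdbc is no longer pulled in
      transitively (TIKA-1511), and external-tool checks now return false
      instead of failing under a restrictive security policy (TIKA-1628).
      The following is a minimal sketch of both probes using only the JDK.
      It is not Tika's implementation: the class TikaUpgradeProbe, the
      Launcher interface, and the helper names are hypothetical, and
      org.sqlite.JDBC is assumed to be the driver class shipped in the
      sqlite-jdbc jar.

```java
import java.io.IOException;

public class TikaUpgradeProbe {

    /** Hypothetical abstraction over process launching, so the check can be exercised without a real binary. */
    interface Launcher {
        int run(String[] command) throws IOException, InterruptedException;
    }

    /** Returns true if the named class can be loaded, e.g. "org.sqlite.JDBC" (assumed sqlite-jdbc driver class). */
    static boolean isOnClasspath(String className) {
        try {
            // Load without initializing, so no static initializers run as a side effect.
            Class.forName(className, false, TikaUpgradeProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /**
     * Mirrors the behaviour described for TIKA-1628: a SecurityException
     * (process execution forbidden) is suppressed and reported as "not usable".
     */
    static boolean canRun(Launcher launcher, String... command) {
        try {
            return launcher.run(command) == 0;
        } catch (SecurityException | IOException e) {
            // Not permitted, or the binary is missing from the PATH.
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Real launcher: starts the command, discards its output, and returns the exit code. */
    static int startProcess(String[] command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        System.out.println("sqlite-jdbc present: " + isOnClasspath("org.sqlite.JDBC"));
        System.out.println("ffmpeg usable: "
                + canRun(TikaUpgradeProbe::startProcess, "ffmpeg", "-version"));
    }
}
```

      Passing the launcher in as a parameter is only a convenience for
      testing the failure paths; a deployment that relies on the
      Sqlite3Parser would still need sqlite-jdbc added to its own
      classpath, as the changelog entry states.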
      

            People

              tmortagne Thomas Mortagne
