Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 7.3-milestone-1
    • Fix Version/s: 7.3-rc-1
    • Component/s: Dependency Upgrades
    • Labels:
      None
    • Difficulty:
      Unknown
    • Documentation:
      N/A

      Description

      Release 1.11 - 10/18/2015
      
        * Java7 API support for allowing java.nio.file.Path as method arguments
          was added to Tika and to ParsingReader, TikaFileTypeDetector, and to
          Tika Config (TIKA-1745, TIKA-1746, TIKA-1751).
      
        * MIME support was added for WebVTT: The Web Video Text Tracks Format
          files (TIKA-1772).
      
        * MIME magic was improved to ensure emails are detected as
          message/rfc822 (TIKA-1771).
      
        * Upgrade to Jackcess Encrypt 2.1.1 to avoid binary incompatibility
          with Bouncy Castle (TIKA-1736).
        
        * Make div and other markup more consistent between PPT and 
          PPTX (TIKA-1755).
      
        * Parse multiple authors from MSOffice's semicolon-delimited
          author field (TIKA-1765).
        
        * Include CTAKESConfig.properties within tika-parsers resources 
          by default (TIKA-1741).
        
        * Prevent infinite recursion when processing inline images
          in PDF files by limiting extraction of duplicate images
          within the same page (TIKA-1742).
      
        * Upgrade to POI 3.13-final (via Andreas Beeker) (TIKA-1707).
      
        * Upgraded tika-batch to use Path throughout (TIKA-1747 and
          TIKA-1754).
      
        * Upgraded to Path in TikaInputStream (via Yaniv Kunda) (TIKA-1744).
      
        * Changed default content handler type for "/rmeta" in tika-server
          to "xml" to align with "-J" option in tika-app.
          Clients can now specify handler types via PathParam (TIKA-1716).
      
        * GROBID (or Grobid), GeneRation Of BIbliographic Data, a machine
          learning library for extracting bibliographic data from PDF
          files, is now integrated as a Tika parser (TIKA-1699, TIKA-1712).
      
        * The ability to specify the Tesseract Config Path was added
          to the OCR Parser (TIKA-1703).
      
        * Upgraded to ASM 5.0.4 (TIKA-1705).
      
        * Corrected Tika Config XML detector definition explicit loading 
          of MimeTypes (TIKA-1708).
      
        * In Tika Parsers, Batch, Server, App and Examples, use Apache
          Commons IO instead of inlined ex-Commons classes, and the Java 7
          Standard Charset definitions (TIKA-1710).
      
        * Upgraded to Commons Compress 1.10, which enables support for
          zlib-compressed archives (TIKA-1718).
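
      Two of the changes listed above, splitting the semicolon-delimited
      author field (TIKA-1765) and limiting duplicate inline images within
      a page (TIKA-1742), can be illustrated with a minimal Java sketch.
      The class and method names below are hypothetical and are not taken
      from Tika's code; the sketch only shows the general idea under the
      assumption that each inline image can be identified by a string id.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: hypothetical helpers, not Tika's actual implementation.
final class ReleaseNotesSketch {

    private ReleaseNotesSketch() {}

    // TIKA-1765 idea: split a semicolon-delimited author field into
    // separate names, trimming whitespace and dropping empty entries.
    static List<String> splitAuthors(String authorField) {
        List<String> authors = new ArrayList<>();
        if (authorField == null) {
            return authors;
        }
        for (String part : authorField.split(";")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                authors.add(name);
            }
        }
        return authors;
    }

    // TIKA-1742 idea: remember which inline images were already handled on
    // the current page so that a repeated image is skipped instead of being
    // processed again and again.
    static final class PageImageTracker {
        private final Set<String> seenOnPage = new HashSet<>();

        // Returns true only the first time an image id is seen on this page.
        boolean shouldExtract(String imageId) {
            return seenOnPage.add(imageId);
        }

        // Call when moving to the next page; duplicates are limited per page.
        void startNewPage() {
            seenOnPage.clear();
        }
    }
}
```

      In this sketch a caller would check shouldExtract(id) before
      processing each inline image and call startNewPage() between pages,
      so the same image can still be extracted once on every page.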
      

            People

            • Assignee:
              Thomas Mortagne (tmortagne)
            • Reporter:
              Thomas Mortagne (tmortagne)