Details

Type: Task
Resolution: Fixed
Priority: Major
Fix Version/s: 7.3-rc-1
Affects Version/s: 7.3-milestone-1
Component/s: Dependency Upgrades
Labels:
None

Difficulty:
Unknown
Documentation:
N/A
Documentation in Release Notes:
http://www.xwiki.org/xwiki/bin/view/ReleaseNotes/ReleaseNotesXWiki73RC1#HUpgrades

Description

Release 1.11 - 10/18/2015

  * Java7 API support for allowing java.nio.file.Path as method arguments
    was added to Tika and to ParsingReader, TikaFileTypeDetector, and to
    Tika Config (TIKA-1745, TIKA-1746, TIKA-1751).

  * MIME support was added for WebVTT: The Web Video Text Tracks Format
    files (TIKA-1772).

  * MIME magic was improved to ensure emails are detected as
    message/rfc822 (TIKA-1771).

  * Upgrade to Jackcess Encrypt 2.1.1 to avoid binary incompatibility
    with Bouncy Castle (TIKA-1736).
  
  * Make div and other markup more consistent between PPT and 
    PPTX (TIKA-1755).

  * Parse multiple authors from MSOffice's semicolon-delimited
    author field (TIKA-1765).
  
  * Include CTAKESConfig.properties within tika-parsers resources 
    by default (TIKA-1741).
  
  * Prevent infinite recursion when processing inline images
    in PDF files by limiting extraction of duplicate images
    within the same page (TIKA-1742).

  * Upgrade to POI 3.13-final (via Andreas Beeker) (TIKA-1707).

  * Upgraded tika-batch to use Path throughout (TIKA-1747 and
    TIKA-1754).

  * Upgraded to Path in TikaInputStream (via Yaniv Kunda) (TIKA-1744).

  * Changed default content handler type for "/rmeta" in tika-server
    to "xml" to align with "-J" option in tika-app.
    Clients can now specify handler types via PathParam (TIKA-1716).

  * GROBID (GeneRation Of BIbliographic Data), a machine learning
    library for extracting bibliographic data from PDF files, is now
    integrated as a Tika parser (TIKA-1699, TIKA-1712).

  * The ability to specify the Tesseract Config Path was added
    to the OCR Parser (TIKA-1703).

  * Upgraded to ASM 5.0.4 (TIKA-1705).

  * Corrected Tika Config XML detector definition explicit loading 
    of MimeTypes (TIKA-1708).

  * In Tika Parsers, Batch, Server, App and Examples, use Apache
    Commons IO instead of inlined ex-Commons classes, and the Java 7
    Standard Charset definitions (TIKA-1710).

  * Upgraded to Commons Compress 1.10, which enables zlib compressed
    archives support (TIKA-1718).
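
To illustrate the kind of behaviour change described for TIKA-1765 above (splitting a semicolon-delimited author field into individual authors), here is a minimal JDK-only sketch. It is not Tika's actual implementation: the class name `AuthorSplitter` and the method `splitAuthors` are hypothetical, and the choice to trim whitespace and skip empty entries is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Tika's API.
final class AuthorSplitter {

    private AuthorSplitter() {
    }

    // Splits a raw author field on ';', trims each name and drops empty entries.
    static List<String> splitAuthors(String rawAuthorField) {
        List<String> authors = new ArrayList<>();
        if (rawAuthorField == null) {
            return authors;
        }
        for (String part : rawAuthorField.split(";")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                authors.add(name);
            }
        }
        return authors;
    }

    public static void main(String[] args) {
        // prints [Alice Example, Bob Example, Carol Example]
        System.out.println(splitAuthors("Alice Example; Bob Example;; Carol Example"));
    }
}
```

A client that previously received the whole field as a single author value would, under this assumption, see one entry per author after the upgrade.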

People

Assignee: Thomas Mortagne

Reporter: Thomas Mortagne

Votes: 0

Watchers: 1

Dates

Created: 27/Oct/15 12:31

Updated: 29/Oct/15 18:17

Resolved: 29/Oct/15 18:17