Details
Bug
Resolution: Fixed
Major
8.2.2, 8.4.3
Unknown
N/A
N/A
Description
The error generated by the indexer is:
Exception in thread "main" java.io.IOException: Error expected floating point number actual='0.-262'
    at org.apache.pdfbox.cos.COSFloat.<init>(COSFloat.java:81)
    at org.apache.pdfbox.cos.COSNumber.get(COSNumber.java:115)
    at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:939)
This issue is PDFBOX-3500, which is fixed in pdfbox-2.0.4. It is a special case of PDFBOX-3369 that wasn't fixed in pdfbox-2.0.2. Currently, tika-1.14, used in 9.x, depends on pdfbox-2.0.3, but replacing the dependency with pdfbox-2.0.4 seems to work perfectly. For 8.2.2, tika is still at version 1.13, which depends on pdfbox-2.0.1 and is fully affected by PDFBOX-3369. Again, replacing the dependency with pdfbox-2.0.4 seems to work perfectly.
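
For illustration only, here is a minimal Java sketch of why a token like '0.-262' trips a strict number parser, and of one lenient way to read it. This is not PDFBox's actual code: the class MisplacedMinusDemo and the helper parseLenient are hypothetical names, and treating '0.-262' as -0.262 is an assumption about the value the PDF producer intended.

```java
import java.math.BigDecimal;

public class MisplacedMinusDemo {

    // Hypothetical helper (not part of PDFBox): try a strict parse first,
    // then fall back to moving a misplaced minus sign to the front.
    static BigDecimal parseLenient(String token) {
        try {
            return new BigDecimal(token);
        } catch (NumberFormatException e) {
            // Matches tokens such as "0.-262" or "0.00-12": a zero integer
            // part, optional zeros, then a minus sign in the fraction.
            if (token.matches("^0\\.0*-\\d+$")) {
                // Assumption: the writer meant a negative fraction.
                return new BigDecimal("-" + token.replaceFirst("-", ""));
            }
            throw e; // anything else is still reported as malformed
        }
    }

    public static void main(String[] args) {
        // The strict parse of "0.-262" throws NumberFormatException;
        // the lenient fallback reads it as a negative fraction.
        System.out.println(parseLenient("0.-262"));
    }
}
```

A fallback like this only accepts the one malformed shape seen in the error above; any other unparsable token still fails, so genuinely corrupt files are not silently accepted.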