java.io.IOException: Missing root object specification in trailer Error with easy-pdf-merge

Tadashi Shigeoka ·  Tue, May 15, 2018

If the npm package easy-pdf-merge fails with java.io.IOException: Missing root object specification in trailer, the copy of Apache PDFBox (a Java library) bundled inside the package is probably outdated.

A Sudden IOException

Error: Command failed: java -jar "/path/to/myapp/node_modules/easy-pdf-merge/jar/pdfbox.jar" PDFMerger "/tmp/a.pdf" "/tmp/b.pdf"

Exception in thread "main" java.io.IOException: Missing root object specification in trailer.
    org.apache.pdfbox.pdfparser.COSParser.parseTrailerValuesDynamically(COSParser.java:2156)
    org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:222)
    org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:271)
    org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:987)
    org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:943)
    org.apache.pdfbox.debugger.PDFDebugger.parseDocument(PDFDebugger.java:1369)
    org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1283)
    org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1266)
    org.apache.pdfbox.debugger.PDFDebugger.main(PDFDebugger.java:254)
    org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:85)
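
To tell this failure apart from other merge errors in application code, a minimal sketch follows. It assumes the failure reaches your code as an Error whose message contains the Java output, as in the log above; the function name isTrailerParseError is hypothetical and not part of easy-pdf-merge.

```javascript
// The exact phrase PDFBox prints for this failure (taken from the log above).
const TRAILER_ERROR = 'Missing root object specification in trailer';

// Hypothetical helper: returns true when an error (or raw string)
// carries the PDFBox trailer parsing message.
function isTrailerParseError(err) {
  const text = err instanceof Error ? err.message : String(err);
  return text.includes(TRAILER_ERROR);
}

// Example using an abbreviated version of the message shown above.
const sample = new Error(
  'Command failed: java -jar "/path/to/myapp/node_modules/easy-pdf-merge/jar/pdfbox.jar" ' +
  'PDFMerger "/tmp/a.pdf" "/tmp/b.pdf"\n' +
  'Exception in thread "main" java.io.IOException: ' +
  'Missing root object specification in trailer.'
);

console.log(isTrailerParseError(sample));                       // true
console.log(isTrailerParseError(new Error('File not found')));  // false
```

A check like this only classifies the error for logging or alerting; it does not work around the bug itself.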

A Google search turned up the issue [PDFBOX-3717] java.io.IOException: Missing root object specification in trailer - ASF JIRA, so it looks like I had hit a bug in PDFBox itself.

The issue lists Fix Version/s: 2.0.6, so upgrading PDFBox to 2.0.6 or later should resolve it.

Updating easy-pdf-merge

Normally, updating easy-pdf-merge would be enough, but the original easy-pdf-merge repository is no longer maintained, so I considered two options:

  • Forking it myself
  • Finding an updated repository among the existing forks

This time one of the existing forks was usable, so I went with that. The investigation went as follows.

The Original easy-pdf-merge Bundles PDFBox v2.0.1

I set the dependency in package.json as follows and ran npm install:

"easy-pdf-merge": "0.1.3",

Checking the bundled PDFBox showed version 2.0.1:

java -jar node_modules/easy-pdf-merge/jar/pdfbox.jar -version
PDFBox version: "2.0.1"
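
The same check can be automated. The sketch below parses a line in the format printed above and compares it with 2.0.6, the Fix Version named in PDFBOX-3717. The function names parsePdfBoxVersion and hasTrailerFix are hypothetical, and the parser assumes the version is always three dot-separated numbers.

```javascript
// Extract [major, minor, patch] from a line such as: PDFBox version: "2.0.1"
// Returns null when the text does not match the expected format.
function parsePdfBoxVersion(output) {
  const match = /PDFBox version:\s*"?(\d+)\.(\d+)\.(\d+)/.exec(output);
  if (!match) return null;
  return match.slice(1, 4).map(Number);
}

// True when the version is at or above the release containing the fix.
// The default of 2.0.6 comes from the Fix Version/s field of PDFBOX-3717.
function hasTrailerFix(version, fixedIn = [2, 0, 6]) {
  for (let i = 0; i < 3; i++) {
    if (version[i] !== fixedIn[i]) return version[i] > fixedIn[i];
  }
  return true; // identical to the fixed release
}

console.log(hasTrailerFix(parsePdfBoxVersion('PDFBox version: "2.0.1"'))); // false
console.log(hasTrailerFix(parsePdfBoxVersion('PDFBox version: "2.0.6"'))); // true
```

In a real project the input string would come from running the java -jar ... -version command shown above, for example in a postinstall or CI step.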

The easy-pdf-merge Fork Uses PDFBox v2.0.8

I tried f2fgroup/easy-pdf-merge, the most recently updated of the forks.

easy-pdf-merge/network

Its recent commit log suggests that it bundles PDFBox 2.0.8:

rebuild 2.0.8 · f2fgroup/easy-pdf-merge@2931462

I set the dependency in package.json as follows and ran npm install:

"easy-pdf-merge": "git://github.com/f2fgroup/easy-pdf-merge.git#ed345e23f2aef9a9dab62fcae5a85a1b15af3189",

I was able to confirm that the installed PDFBox is indeed version 2.0.8.

java -jar node_modules/easy-pdf-merge/jar/pdfbox.jar -version
PDFBox version: "2.0.8"
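
The package.json entry above points at a full commit hash rather than a branch, so later pushes to the fork cannot silently change the bundled jar. As a rough sketch, a dependency string can be checked for such a pin as follows. The function name parseGitDependency is hypothetical, and the pattern only covers git:// and git+protocol:// style URLs.

```javascript
// Split a git dependency spec into its URL and the ref after '#',
// and report whether the ref is a full 40-character commit hash.
// Returns null when the spec is not a git URL in the supported form.
function parseGitDependency(spec) {
  const match = /^(git(?:\+\w+)?:\/\/[^#]+)(?:#(.+))?$/.exec(spec);
  if (!match) return null;
  const ref = match[2] || null;
  return {
    url: match[1],
    ref,
    pinnedToCommit: ref !== null && /^[0-9a-f]{40}$/.test(ref),
  };
}

const dep = parseGitDependency(
  'git://github.com/f2fgroup/easy-pdf-merge.git#ed345e23f2aef9a9dab62fcae5a85a1b15af3189'
);
console.log(dep.pinnedToCommit); // true

// A branch name is a moving target, so it does not count as pinned.
console.log(parseGitDependency('git://github.com/f2fgroup/easy-pdf-merge.git#master').pinnedToCommit); // false
```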

That’s all from the Gemba where I encountered a PDFBox bug and updated easy-pdf-merge.