Issue 79: Multivalent: java.io.IOException: invalid distance too far back @ 0
Hi, Peter.

I'm getting a stack trace when running pdfsizeopt revision 244 with some files:

----
info: This is pdfsizeopt.py rUNKNOWN size=318356.
info: using Java for Multivalent: /usr/bin/java
info: loading PDF from: Galois.pdf
info: loaded PDF of 518573 bytes
warning: problem with xref table: xref table not found at 508214
warning: trying to load objs without the xref table
info: separated to 513 objs + trailer
info: found 0 Type1 fonts loaded
info: found 30 Type1C fonts loaded
info: writing Type1CParser (90093 font bytes) to: pso.conv.parse.tmp.ps
info: using Ghostscript gs: GPL Ghostscript 9.05 (2012-02-08)
info: executing Type1CParser with Ghostscript: gs -q -dNOPAUSE -dBATCH -sDEVICE=nullpage -sDataFile=pso.conv.parsedata.tmp.ps -f pso.conv.parse.tmp.ps
Type1CParser: using interpreter GPL Ghostscript 905 20120208
Type1CParser: all OK
info: parsed 30 Type1C fonts
info: eliminated 5 duplicate objs
info: saving PDF with 508 objs with Multivalent to: Galois.psom.pdf
info: writing Multivalent input PDF: pso.conv.mi.tmp.pdf
info: generated object stream of 8541 bytes in 364 objects (9%)
info: written 462924 bytes to Multivalent input PDF: pso.conv.mi.tmp.pdf
info: executing Multivalent to optimize PDF: /usr/bin/java -cp /home/rbrito/Desktop/mirrors/pdfsizeopt/trunk/Multivalent.jar -Djava.awt.headless=true tool.pdf.Compress -nopagepiece -noalt -mon pso.conv.mi.tmp.pdf
file:/home/rbrito/Dropbox/documents-to-sort-out/pso.conv.mi.tmp.pdf, 462924 bytes
PDF 1.5, producer=MiKTeX-xdvipdfmx (0.7.8), creator= XeTeX output 2012.12.16:1357
511 objects / 106 pagesjava.io.IOException: invalid distance too far back @ 0 while reading object #204: {Filter=FlateDecode, DATA=118811, Length=3781}
pso.conv.mi.tmp.pdf: java.io.IOException: invalid distance too far back @ 0
info: Multivalent generated pso.conv.mi.tmp-o.pdf of 0 bytes (0%)
Traceback (most recent call last):
  File "/home/rbrito/Desktop/mirrors/pdfsizeopt/trunk/pdfsizeopt.py", line 7887, in <module>
    main(sys.argv)
  File "/home/rbrito/Desktop/mirrors/pdfsizeopt/trunk/pdfsizeopt.py", line 7880, in main
    is_flate_ok=not do_decompress_flate)
  File "/home/rbrito/Desktop/mirrors/pdfsizeopt/trunk/pdfsizeopt.py", line 7579, in Save
    multivalent_java=multivalent_java)
  File "/home/rbrito/Desktop/mirrors/pdfsizeopt/trunk/pdfsizeopt.py", line 7513, in _RunMultivalent
    'Multivalent generated empty output (see its error above)')
AssertionError: Multivalent generated empty output (see its error above)
----

I don't know whether the problem here is with Multivalent or with pdfsizeopt, but I have been getting this java.io.IOException a lot with some new PDF files that I am trying (yes, I am now hitting pdfsizeopt quite hard).

The offending file is attached. Please let me know if any other information is needed.

Thanks.
Feb 26, 2013
The attached Galois.pdf file is corrupt: object 136, of /Length 3781, contains a corrupt /FlateDecode stream that cannot be decompressed. The behavior and output of pdfsizeopt are not defined when it receives invalid input (such as Galois.pdf). All I can do for this issue is to improve the error message pdfsizeopt prints a bit. To get this PDF optimized, please regenerate it correctly first, or run it through a converter which removes invalid parts, and run pdfsizeopt only after that.
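For anyone who wants to confirm this kind of corruption independently, here is a minimal Python 3 sketch using the standard zlib module. It assumes you have already extracted the raw bytes of the /FlateDecode stream from the PDF yourself (the extraction step is not shown), and the function name `check_flate_stream` is made up for this illustration; it is not part of pdfsizeopt.

```python
import zlib


def check_flate_stream(data):
    """Return None if data is a complete zlib stream, else a short error string.

    Hypothetical helper for illustration only; not part of pdfsizeopt.
    """
    decompressor = zlib.decompressobj()
    try:
        # Raises zlib.error on corrupt input, e.g. with a message such as
        # "invalid distance too far back".
        decompressor.decompress(data)
        decompressor.flush()
    except zlib.error as e:
        return str(e)
    if not decompressor.eof:
        # No error was raised, but the end-of-stream marker was never seen.
        return 'truncated stream'
    return None
```

A return value of None means zlib could decode the stream to its end; anything else is the reason it could not. Trailing bytes after the end of the compressed data are tolerated by this sketch.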
Labels:
-Priority-High Priority-Medium
Feb 27, 2013
Hi, Peter. I just got another copy of the document from the author and this new one is fine. I guess that what we can take from this episode is that pdfsizeopt could print an error message instead of dumping a stack trace. Thanks.
Mar 3, 2013
pdfsizeopt indeed prints a useful error message: "Multivalent generated empty output (see its error above)". It also prints a stack trace, which is even more useful, because it can be copy-pasted to the issue tracker. Removing the stack trace would make the output less useful, thus worse.

Making this particular error message (or the corresponding Multivalent error message) more useful would be too much work. Maybe I could add an "Is the input PDF corrupt?" clause here, but that's also too much work to do consistently, because PDFs can be corrupt in many ways.

The easy improvement is to add the following sentence to the documentation: "If your input PDF is corrupt, pdfsizeopt may succeed or it may fail, possibly with an error message which is difficult to understand. If you think your PDF is correct, then please report a bug in the pdfsizeopt issue tracker."

Do you have any specific suggestions for how to better report the failure in this particular case?
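As one possible shape for the "Is the input PDF corrupt?" clause, here is a rough Python sketch. It assumes the Multivalent console output has been captured as a string, and it only recognizes the one message format seen in the log in this issue. The function name `corrupt_input_hint` and the regular expression are hypothetical and do not exist in pdfsizeopt.

```python
import re

# Matches the Multivalent message format seen in this issue's log, e.g.
# "java.io.IOException: <reason> while reading object #204: {...}".
_READ_ERROR_RE = re.compile(
    r'java\.io\.IOException: (.+?)\s+while reading object #(\d+)')


def corrupt_input_hint(multivalent_output):
    """Return a hint string for a stream read failure, or None if not matched.

    Hypothetical helper for illustration only; not part of pdfsizeopt.
    """
    match = _READ_ERROR_RE.search(multivalent_output)
    if match is None:
        return None
    return ('Multivalent failed on object #%s (%s). Is the input PDF corrupt?'
            % (match.group(2), match.group(1)))
```

For the log above this would return "Multivalent failed on object #204 (invalid distance too far back @ 0). Is the input PDF corrupt?". Note that the object number refers to the temporary Multivalent input file (pso.conv.mi.tmp.pdf), which in this report apparently differs from the number in the original PDF (136), so such a hint would need to say which file it is talking about.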
Labels:
-Type-Defect Type-Enhancement
Status: Accepted