Issue 59: Corrupt jbig2 pages in output PDF
What steps will reproduce the problem?
1. Run pdfsizeopt.py Pages1-7.pdf on Windows with the default settings.
What is the expected output? What do you see instead?
I expect the pages to be viewable and compressed. Instead, the attached PDF shows blank pages with the error "Insufficient data for an image".
What version of the product are you using? On what operating system?
Latest from svn. Windows 7 32-bit.
Please provide any additional information below.
The attached log is the output of the run. I'm also attaching the PDF file before compression and the PDF file after compression. I also found another viewer (STDU Viewer) that partially decodes the output PDF file, so I'm attaching a screenshot of what it looks like. Also attached is my jbig2.exe, statically compiled with VS2010 from Adam Langley's source on GitHub.
Thanks, Darren
Jun 27, 2012
That fixed it. Thanks for all your help!!! Attached is my vs2010 compiled jbig2.exe and all the source code in case someone else wants to compile it.
Jun 27, 2012
Thank you for sharing your jbig2.exe and your source tree. jbig2.exe was one of the missing dependencies of pdfsizeopt on Windows. Today I compiled the remaining few dependencies, so now pdfsizeopt is officially available on Windows, and it's easier to install than ever. If you're interested, please check out the new installation page at https://code.google.com/p/pdfsizeopt/wiki/InstallationInstructionWindows . It would be very useful if you could upload all the library dependencies of jbig2enc_20120627.zip , including the URLs where you downloaded them from, and a .cmd file which compiles all the dependencies from scratch. That way, a future developer would only need to install Visual Studio, download and extract a .zip file, run a .cmd file, and wait for jbig2.exe to be built automatically.
Status:
Fixed
Jul 9, 2012
Hey, glad I could help. I followed the instructions here http://tpgit.github.com/UnOfficialLeptDocs/leptonica/README.html#building-on-windows to compile Leptonica (http://leptonica.com/) and download the dependencies. I think you can just download the dependencies (http://leptonica.org/source/leptonica-1.68-win32-lib-include-dirs.zip) and put everything in the right place to compile the jbig2 encoder. I may have done that. I can't remember. ;) Darren
Jul 9, 2012
This is what I get when I run your new windows version.
C:\Users\x991808\Desktop\pdfsizeopt_win32bin>pdfsizeopt.exe 000000.PDF
info: This is pdfsizeopt.py rUNKNOWN size=309327.
info: loading PDF from: 000000.PDF
info: loaded PDF of 515655 bytes
info: separated to 26 objs + xref + trailer
info: found 0 Type1 fonts loaded
info: found 0 Type1C fonts loaded
info: eliminated 2 unused objs in 2 classes
info: saving PDF with 24 objs with Multivalent to: 000000.psom.pdf
info: writing Multivalent input PDF: pso.conv.mi.tmp.pdf
info: generated object stream of 529 bytes in 21 objects (14%)
info: written 513629 bytes to Multivalent input PDF: pso.conv.mi.tmp.pdf
error: Multivalent.jar not found. Make sure it is on the $PATH, or it is
one of the files on the $CLASSPATH.
Traceback (most recent call last):
  File ".\pdfsizeopt.py", line 7698, in <module>
    main(sys.argv)
  File ".\pdfsizeopt.py", line 7694, in main
    may_obj_heads_contain_comments=may_obj_heads_contain_comments)
  File ".\pdfsizeopt.py", line 7425, in Save
    may_obj_heads_contain_comments=may_obj_heads_contain_comments)
  File ".\pdfsizeopt.py", line 7322, in _RunMultivalent
    assert 0, 'Multivalent.jar not found, see above'
AssertionError: Multivalent.jar not found, see above
Jul 9, 2012
"AssertionError: Multivalent.jar not found, see above"
Did you follow the installation instructions? Did you download the newest pdfsizeopt.py (its size is 313571)? If that still doesn't fix the problem, please copy-paste the output of dir /s C:\Users\x991808\Desktop\pdfsizeopt_win32bin
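The error message in the log says Multivalent.jar must be on the $PATH or be one of the files on the $CLASSPATH. For anyone who wants to check their own setup, here is a minimal sketch of that kind of lookup. Note that find_jar is a hypothetical helper written for this thread, not the function pdfsizeopt.py uses internally, and the real lookup may differ in details.

```python
import os


def find_jar(jar_name, environ=None):
    """Return the path of jar_name found via PATH or CLASSPATH, else None.

    Hypothetical diagnostic helper; pdfsizeopt's own lookup may differ.
    """
    if environ is None:
        environ = os.environ
    # PATH entries are directories: look for the jar file inside each one.
    for dirname in environ.get('PATH', '').split(os.pathsep):
        if dirname:
            candidate = os.path.join(dirname, jar_name)
            if os.path.isfile(candidate):
                return candidate
    # CLASSPATH entries can be jar files themselves: compare the base name.
    for entry in environ.get('CLASSPATH', '').split(os.pathsep):
        if os.path.basename(entry) == jar_name and os.path.isfile(entry):
            return entry
    return None
```

Calling find_jar('Multivalent.jar') with no second argument checks the current environment. A result of None would be consistent with the "Multivalent.jar not found" error above.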
Jul 10, 2012
Yes, I followed the instructions but I tried again this morning (re-doing all the instructions) and everything is working fine now. Running a massive PDF to test at the moment. So far so good. I just wish there was a way to speed up pngout. That thing takes forever.
Jul 10, 2012
One last thing you should add is the msvcr100.dll since I compiled jbig2.exe with vs2010. Here's mine.
Jul 10, 2012
About pngout: you can use --use-pngout=no . There is a speed vs size tradeoff here. pngout is slow, but its output is small.
Jul 10, 2012
Based on the information you have provided, I managed to compile a jbig2.exe (see it attached) suitable for use with pdfsizeopt. I compiled it using MinGW (cross-compiling on Linux), so it doesn't need msvcr100.dll . (I also removed the attached msvcr100.dll to avoid copyright issues in the future.) In the near future, I'll release this new jbig2.exe so it will be used by default with pdfsizeopt on Windows. FYI, my jbig2.exe is noticeably smaller than yours, because I removed many unnecessary functions from the leptonica library (editing .c files by hand), and I also removed a few command-line flags which pdfsizeopt doesn't need. Thank you very much for your help in providing patches and compilation instructions. It helped me a lot in understanding jbig2 on Windows and preparing my own version.
Status:
Started
Jul 11, 2012
Excellent! Glad to hear you were able to get it compiled. It wasn't trivial in VS2010 for me, but MinGW is probably the easier choice, especially if you're used to Linux/gcc. Sorry I wasn't able to provide the batch file you requested. Just too much going on right now to mess with it. You might want to try out this alternate version of JBIG2Enc: https://github.com/zdenop/jbig2enc/tree/R.Hatlapatka. It's supposed to have better autothresholding, which I interpret to mean better compression on some images, assuming the thresholding works. I haven't tried it yet. BTW, I tried --use-pngout=no on my 146MB PDF file. It took 20 minutes instead of 2.5 hours and the file sizes were identical. So pngout doesn't seem to help unless you have color images. My test file was all CCITTFaxDecode, so maybe if you see that (which is always bitonal) you shouldn't call pngout? Just an idea to save time.
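The idea above (skip pngout when an image is CCITTFaxDecode, since that is always bitonal) could be sketched roughly as follows. This is only an illustration of the suggestion: should_try_pngout is a hypothetical function, not part of pdfsizeopt.py, and it assumes the caller has already read the image's /Filter entry, which in a PDF can be a single name or an array of names.

```python
def should_try_pngout(image_filter):
    """Return False if the image uses /CCITTFaxDecode (always bitonal).

    image_filter: the PDF /Filter value, given as a single name string,
    a list of name strings, or None when the image has no filter.
    Hypothetical helper illustrating the suggestion in this thread.
    """
    if image_filter is None:
        filters = []
    elif isinstance(image_filter, str):
        filters = [image_filter]
    else:
        filters = list(image_filter)
    # Per the test run reported above, pngout gave no size benefit on
    # CCITTFaxDecode images, so skipping it there only saves time.
    return '/CCITTFaxDecode' not in filters
```

Whether this check alone is enough is untested. It reflects a single 146MB file, so other bitonal inputs may behave differently.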