Issue 51: Add decryption support
Status:  Accepted
Owner: ----
Reported by garbage-collection@gmx.net, Aug 25, 2011
Hi,

I checked out the latest revision (r167) of pdfsizeopt on my Debian squeeze system and ran it on the attached file `test.pdf'.  This file is *not* protected.

$ python pdfsizeopt.py --use-multivalent=false test.pdf

info: This is pdfsizeopt.py r166 size=279220.
info: loading PDF from: test.pdf
info: loaded PDF of 313456 bytes
info: separated to 131 objs + xref + trailer
info: found 0 Type1 fonts loaded
info: found 0 Type1C fonts loaded
info: eliminated 21 duplicate objs
info: eliminated 3 unused objs in 3 classes
info: saving PDF with 107 objs to: test.pso.pdf
info: generated 303933 bytes (97%)

The output file, however, seems to be corrupted; various programs either cannot display it or demand a password:

- pdfinfo, xpdf (using libpoppler)
- GNOME evince (using libcairo)

Can you please have a look into this case.  Please let me know if some further information is needed.

Thanks a lot.
Mathias

test.pdf
306 KB   Download
Aug 25, 2011
Project Member #1 pts...@gmail.com
Thank you for taking the time to submit a useful bug report that includes all the details needed to reproduce the problem.

The attached PDF is indeed encrypted: its trailer contains /Encrypt.

Please note that the PDF file format supports encryption without a (user) password, so viewer software can decrypt it without asking for a password. This is what happens with the attached test.pdf .

pdfsizeopt at r166 doesn't support encrypted input at all, so its behavior on encrypted input is undefined. For test.pdf it finishes successfully but generates an invalid (useless, bad) output PDF, which confuses xpdf and evince. (Most probably there is no bug in xpdf and evince.)
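
For illustration, the /Encrypt check described above could be sketched roughly as follows. This is not the code pdfsizeopt actually uses; the function name looks_encrypted is hypothetical, the sketch is written for Python 3, and it is only a byte-level heuristic that assumes a classic `trailer' keyword is present (PDFs that use cross-reference streams have none, in which case the whole file is scanned and false positives are possible).

```python
import re


def looks_encrypted(pdf_bytes):
    """Heuristic: return True if the last trailer seems to contain /Encrypt.

    Hypothetical helper for illustration; a real parser would resolve the
    trailer dictionary properly instead of searching raw bytes.
    """
    # Only look at the bytes after the last `trailer` keyword, if there is one.
    pos = pdf_bytes.rfind(b"trailer")
    if pos < 0:
        # No classic trailer found: fall back to scanning the whole file.
        tail = pdf_bytes
    else:
        tail = pdf_bytes[pos:]
    # \b keeps longer names such as /EncryptMetadata from matching.
    return re.search(rb"/Encrypt\b", tail) is not None
```

A caller could use this to fail early with a clear error message instead of writing a broken output file.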

As a quick fix, I've added the proper error message to r168 if the input PDF is encrypted, so pdfsizeopt should fail early and clearly. You can use the following command to decrypt the PDF before passing it to pdfsizeopt:

  qpdf --decrypt test.pdf test.decrypted.pdf

Adding support for reading encrypted files to pdfsizeopt would be a huge effort, and I won't have time for that in the foreseeable future, but I'm keeping this issue open to track progress, should there be any.
Status: Accepted
Labels: -Type-Defect Type-Enhancement
Aug 25, 2011
#2 garbage-collection@gmx.net
> Please note that the PDF file format supports encryption without a (user) password, so viewer software can decrypt it without asking for a password. This is what happens with the attached test.pdf .

Sorry for my premature claim that it's not "protected". I could easily have noticed the opposite by checking with e.g. pdfinfo instead of relying on a PDF viewer alone.

Thank you for your quick reaction and for the qpdf workaround in the new revision.

Please note that I deleted the attached file test.pdf from my first post. Anyway, it can be downloaded freely from
http://www.swp-berlin.org/de/produkte/swp-studien-de/swp-studien-detail/article/private_militaerfirmen_im_voelkerrecht.html 

Apr 2, 2012
Project Member #3 pts...@gmail.com
(No comment was entered for this change.)
Summary: Add decryption support
Jul 2, 2012
Project Member #4 pts...@gmail.com
Many users would find this useful, especially if it were automated, so that manual decryption with qpdf.exe would not be necessary.

There are many external tools that can decrypt PDFs:

* qpdf --decrypt (doesn't change anything else)
* Multivalent tool.pdf.Compress
* gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=unencrypted.pdf -c .setpdfwrite -f encrypted.pdf

However, some of them (e.g. Ghostscript and tool.pdf.Compress) apply extra, unwanted transformations as well. We could also integrate decryption into pdfsizeopt, but it may be a bit slow.


Multivalent and qpdf are both able to decrypt PDFs. Maybe addn
Labels: -Priority-Medium Priority-High
Jan 28, 2014
#5 Sebastia...@googlemail.com
qpdf --decrypt removed pdf-pseudo-encryption without unwanted changes in all of my test cases.
Multivalent and Ghostscript had unwanted side-effects and mutool often created damaged PDF files. So I stick with qpdf.
Jan 28, 2014
Project Member #6 pts...@gmail.com
Thanks for the feedback. The newest version of pdfsizeopt recommends using qpdf --decrypt if needed. Does it show this message for your encrypted PDF?

It wouldn't be too hard to extend pdfsizeopt to call qpdf --decrypt automatically by default (this could be disabled with --use-qpdf-decrypt=no). Would you like to have this feature?
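
A minimal sketch of what such an automatic call might look like is given below. It is only an assumption about a possible shape, not pdfsizeopt code: the helper names build_qpdf_decrypt_cmd and decrypt_with_qpdf are hypothetical, the sketch targets Python 3, and it simply runs the same qpdf --decrypt command line mentioned earlier in this thread.

```python
import shutil
import subprocess


def build_qpdf_decrypt_cmd(input_path, output_path, qpdf="qpdf"):
    """Return the argv list for `qpdf --decrypt INPUT OUTPUT`."""
    return [qpdf, "--decrypt", input_path, output_path]


def decrypt_with_qpdf(input_path, output_path, qpdf="qpdf"):
    """Write a decrypted copy of input_path to output_path using qpdf.

    Hypothetical helper: raises RuntimeError if the qpdf executable is not
    on PATH, and subprocess.CalledProcessError if qpdf exits nonzero.
    """
    if shutil.which(qpdf) is None:
        raise RuntimeError(
            "%s not found on PATH; decrypt the PDF manually first" % qpdf)
    # check=True treats every nonzero exit status as a failure. A real
    # integration might choose to be more lenient about warnings.
    subprocess.run(build_qpdf_decrypt_cmd(input_path, output_path, qpdf),
                   check=True)
    return output_path
```

The optimizer would then read the decrypted copy instead of the original file whenever the input is detected as encrypted, unless the user turned the step off.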