Issue 36: assert 'Subrs' not in parsed_font
Status:  Fixed
Owner:
Closed:  Feb 2011
Reported by lemzw...@googlemail.com, Apr 25, 2010
Running

  ./pdfsizeopt --use-pngout=true \
               --use-jbig2=true \
               --use-multivalent=true \
               extending.pdf &> extending.pdfsizeopt.log 

causes

    File "./pdfsizeopt", line 6158, in <module>
      main(sys.argv)
    File "./pdfsizeopt", line 6141, in main
      do_regenerate_all_fonts=do_regenerate_all_fonts)
    File "./pdfsizeopt", line 4291, in UnifyType1CFonts
      assert 'Subrs' not in parsed_font['Private']
  AssertionError

with no PDF output.  Note that it was necessary to apply the patch from
issue #31.

The complete input file and log are attached.

extending.pdf (442 KB)
extending.pdfsizeopt.log (3.9 KB)
May 8, 2010
#1 william.bader@gmail.com
pdfsizeopt scans the fonts in the pdf, groups them by name, and checks if fonts with
similar names can be merged.  The top of the loop has several assertions that make
pdfsizeopt fail if the font has an attribute that pdfsizeopt can't handle.  The patch
below makes pdfsizeopt skip to the next font instead of failing with an assertion
error.  pdfsizeopt tries to merge only the fonts in a table of font groups.  As long
as it doesn't enter the bad font into the font group table, the font shouldn't do any
damage.  With the attached patches, pdfsizeopt can process extending.pdf.
William
williambader@hotmail.com

--- pdfsizeopt/pdfsizeopt.py-   2010-03-24 21:06:15.000000000 +0000
+++ pdfsizeopt/pdfsizeopt.py    2010-05-09 03:52:06.000000000 +0100
@@ -4277,6 +4296,33 @@
       assert obj.stream is None
       parsed_font = parsed_fonts[obj_num]
       parsed_font['FontName'] = obj.Get('FontName')
+      if parsed_font['FontType'] != 2:
+        print >>sys.stderr, 'info: font %s is not Type 2, can not merge.' % parsed_font['FontName']
+        continue
+      if 'CharStrings' not in parsed_font:
+        print >>sys.stderr, 'info: font %s does not have CharStrings, can not merge.' % parsed_font['FontName']
+        continue
+      if 'FontMatrix' not in parsed_font:
+        print >>sys.stderr, 'info: font %s has no FontMatrix, can not merge.' % parsed_font['FontName']
+        continue
+      if 'Private' not in parsed_font:
+        print >>sys.stderr, 'info: font %s has no Private data, can not merge.' % parsed_font['FontName']
+        continue
+      if 'PaintType' not in parsed_font:
+        print >>sys.stderr, 'info: font %s has no PaintType, can not merge.' % parsed_font['FontName']
+        continue
+      if 'FontInfo' not in parsed_font:
+        print >>sys.stderr, 'info: font %s has no FontInfo, can not merge.' % parsed_font['FontName']
+        continue
+      if 'CharStrings' not in parsed_font:
+        print >>sys.stderr, 'info: font %s has no CharStrings, can not merge.' % parsed_font['FontName']
+        continue
+      if 'Subrs' in parsed_font:
+        print >>sys.stderr, 'info: font %s has Subrs, can not merge.' % parsed_font['FontName']
+        continue
+      if 'Subrs' in parsed_font['Private']:
+        print >>sys.stderr, 'info: font %s has Private Subrs, can not merge.' % parsed_font['FontName']
+        continue
       assert parsed_font['FontType'] == 2
       assert 'CharStrings' in parsed_font
       assert 'FontMatrix' in parsed_font

pdfsizeopt.pat (13.1 KB)
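
The skip-instead-of-assert idea in the patch above can also be written as a small standalone helper. This is only a sketch, not the code that went into pdfsizeopt: the names `REQUIRED_KEYS`, `unmergeable_reason` and `mergeable_fonts` are hypothetical, and the dictionaries are assumed to have the same keys that the patch checks.

```python
# Sketch only: the keys mirror the checks in the patch above; the helper
# names are hypothetical and are not part of pdfsizeopt.

REQUIRED_KEYS = ('CharStrings', 'FontMatrix', 'Private', 'PaintType', 'FontInfo')


def unmergeable_reason(parsed_font):
    """Return a short reason why the font cannot be merged, or None."""
    # Unlike the patch, .get() is used so a missing FontType is also skipped.
    if parsed_font.get('FontType') != 2:
        return 'is not Type 2'
    for key in REQUIRED_KEYS:
        if key not in parsed_font:
            return 'has no %s' % key
    if 'Subrs' in parsed_font:
        return 'has Subrs'
    # Safe to index: 'Private' was verified by the loop above.
    if 'Subrs' in parsed_font['Private']:
        return 'has Private Subrs'
    return None


def mergeable_fonts(parsed_fonts):
    """Keep only the fonts that pass every check, keyed by object number."""
    kept = {}
    for obj_num in sorted(parsed_fonts):
        if unmergeable_reason(parsed_fonts[obj_num]) is None:
            kept[obj_num] = parsed_fonts[obj_num]
    return kept
```

Because a rejected font never reaches the returned dictionary, it can never be entered into a font group, which is the property the explanation above relies on.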
May 9, 2010
#2 lemzw...@googlemail.com
Thanks, it continues, but for me it still fails with

  KeyError: 50

See attached log.
extending.pdfsizeopt.log2 (5.5 KB)
May 9, 2010
#3 william.bader@gmail.com
I compared your log to my log.
I have gs 8.71; you have gs 8.70.  The version shouldn't matter, but your log has the
warning "*** Warning: GenericResourceDir doesn't point to a valid resource
directory.", which could indicate that gs isn't installed correctly.  Run "gs -h", check
whether gs is looking for a directory like /usr/local/share/ghostscript/8.70/Resource/Init,
and then check whether that directory contains any files.
The next difference is that my log has
info: loaded PDF of 64069 bytes
info: separated to 74 objs
while your log has
info: loaded PDF of 63834 bytes
info: separated to 73 objs
Are you testing the same extending.pdf that you attached?
The file that I have is 453528 bytes.
I have a version of Multivalent.jar that is 2716363 bytes.
I have Fedora 8 Linux with Python 2.5.1.
I have attached my log and a copy of pdfsizeopt.py with all of my patches.
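
The Ghostscript check suggested above (run "gs -h", then look inside the directories it lists) could be scripted roughly as follows. This is a sketch under assumptions: it expects the help text to contain a "Search path:" heading followed by indented, colon-separated directories, as Unix builds usually print, and the function names are hypothetical.

```python
import os


def parse_search_path(help_text):
    """Extract the directories listed under 'Search path:' in `gs -h` output."""
    dirs = []
    in_section = False
    for line in help_text.splitlines():
        if line.strip() == 'Search path:':
            in_section = True
            continue
        if in_section:
            # Assumption: the section ends at the first non-indented line.
            if not line.startswith((' ', '\t')):
                break
            # Assumption: Unix-style ':' separators between directories.
            for part in line.split(':'):
                part = part.strip()
                if part:
                    dirs.append(part)
    return dirs


def empty_or_missing(dirs):
    """Return the directories that do not exist or contain no entries."""
    bad = []
    for d in dirs:
        if not os.path.isdir(d) or not os.listdir(d):
            bad.append(d)
    return bad
```

The help text itself could be captured with `subprocess.check_output(['gs', '-h'], universal_newlines=True)`. Any directory returned by `empty_or_missing` is a candidate explanation for the GenericResourceDir warning.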

extending.log3 (5.1 KB)
pdfsizeopt.py (241 KB)
May 10, 2010
#4 lemzw...@googlemail.com
Thanks.  I've taken care of the gs warning so that it disappears, and now it works as
expected.  Note, however, that ghostscript's warning is harmless and shouldn't change
its behaviour – maybe Multivalent doesn't like that gs emits warnings?
May 10, 2010
#5 william.bader@gmail.com
Multivalent is a stand-alone PDF optimization program, and I think that it does not
use ghostscript.  pdfsizeopt runs ghostscript before running Multivalent.  If
ghostscript isn't working right, it might generate bad output that causes problems
later in pdfsizeopt.py or in Multivalent.
http://multivalent.sourceforge.net/Tools/pdf/Compress.html
http://multivalent.sourceforge.net/license.html
William
Feb 10, 2011
Project Member #6 pts...@gmail.com
Thank you for the bug report about the Subrs error message. Thanks to William for proposing an excellent patch to fix the issue. I've fixed the bug in r154, based on William's idea.

About the KeyError: if you manage to reproduce the problem with a Ghostscript which doesn't display warnings, please report it by creating a new issue.
Status: Fixed
Feb 10, 2011
#7 lemzw...@googlemail.com
Great that you have some time again to work on pdfsizeopt!
Unfortunately, all your changes to the SVN come with a meaningless
commit message.  An accident?  Or are you using another repository
for the main work?

Feb 11, 2011
Project Member #8 pts...@gmail.com
Yes, I use another repository, and it's not so convenient to copy commits with messages. Maybe it would be easier with Mercurial...
Feb 11, 2011
#9 lemzw...@googlemail.com
What about simply publishing that other repository of yours?
Feb 11, 2011
Project Member #10 pts...@gmail.com
I don't know of a simple way to publish an SVN repository on Google Code. The ways I have tried are slow, complicated and error-prone.
Feb 11, 2011
#11 lemzw...@googlemail.com
OK, then your idea of converting the SVN to Mercurial (or git) sounds sensible.