Issue 38: PDF/A validation broken after running pdfsizeopt
pdfsizeopt breaks PDF/A conformance in two ways:
- It removes the /ID entry.
- It removes the line break before 'endobj' and 'endstream'.

I updated William Bader's patch so that PDF/A optimisation works:

    ./pdfsizeopt.py --use-multivalent=false test.pdfa.pdf test.opt.pdfa.pdf

(Multivalent also breaks PDF/A, so it has to be disabled.)
Feb 10, 2011
Project Member
#1
pts...@gmail.com
Mar 4, 2011
I've integrated most of the attached patch (pdfsizeopt.pat) to the trunk, r158, except for /Type/Page unification (has to be disabled with --do-unify-pages=false explicitly), except for Multivalent -nocore14, and except for these entries:

    @@ -475,9 +475,9 @@
           output.append(self.stream)
           # We don't need '\nendstream' after a non-compressed content stream,
           # 'Qendstream endobj' is perfectly fine (accepted by gs and xpdf).
    -      output.append('endstream endobj\n')
    +      output.append('\nendstream\nendobj\n')
         else:
    -      output.append('%sendobj\n' % space)
    +      output.append('%s\nendobj\n' % space)

       def __GetHead(self):
         if self._head is None and self._cache is not None:
    @@ -3302,7 +3310,7 @@
         trailer_obj.Set('Compress', None)  # emitted by Multivalent.jar
         # Emitted by Multivalent.jar etc., see section 10.3 in
         # pdf_reference_1-7.pdf .
    -    trailer_obj.Set('ID', None)
    +    #trailer_obj.Set('ID', None)
         assert trailer_obj.head.startswith('<<')
         assert trailer_obj.head.endswith('>>')
         output.append('trailer\n%s\n' % trailer_obj.head)
    @@ -5816,7 +5871,7 @@
         # Please note that we save the space of the removed /ID and /Compress
         # below, because /Type/XRef is usually the last object, so we don't
         # need to add padding.
    -    pdf_obj.Set('ID', None)
    +    #pdf_obj.Set('ID', None)
         pdf_obj.Set('Compress', None)
         if pdf_obj.Get('Index') != None:
           raise NotImplementedError('unexpected /Index in xref object')
    @@ -2592,15 +2592,17 @@
         else:
           pdf_obj.Set('BitsPerComponent', pdf_image_data['BitsPerComponent'])
           pdf_obj.Set('ColorSpace', pdf_image_data['ColorSpace'])
    -      pdf_obj.Set('Decode', pdf_image_data.get('Decode'))
    +      if pdf_obj.Get('Decode') == None:
    +        # Update Decode only if it is currently not set
    +        pdf_obj.Set('Decode', pdf_image_data.get('Decode'))
           pdf_obj.Set('Filter', pdf_image_data['Filter'])
           pdf_obj.Set('DecodeParms', pdf_image_data.get('DecodeParms'))
           pdf_obj.Set('Length', len(pdf_image_data['.stream']))
           # Don't pdf_obj.Set('Decode', ...): it is good as is.
           pdf_obj.stream = pdf_image_data['.stream']

       def CompressToZipPng(self):

About PDF/A compatibility: yes, /ID has to be present and endobj/endstream must have whitespace in front of them for PDF/A compatibility. These features will be added to pdfsizeopt later, please post further comments about PDF/A support to https://code.google.com/p/pdfsizeopt/issues/detail?id=13 .
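
To illustrate the endobj/endstream point, here is a minimal Python 3 sketch of writing one indirect object in either style. It is not pdfsizeopt's actual code: the function name serialize_obj and the pdfa_safe parameter are hypothetical, and the compact branch only approximates the original output (which chooses the separator before endobj depending on how the object head ends).

```python
def serialize_obj(obj_num, head, stream=None, pdfa_safe=True):
    """Serialize one indirect PDF object to bytes (illustrative sketch).

    head is the already-serialized object head, e.g. b'<</Length 1>>'.
    stream is the raw stream payload, or None for a non-stream object.
    pdfa_safe is a hypothetical switch: when true, endstream and endobj
    each start on their own line; when false, the compact form is used.
    """
    out = [b'%d 0 obj\n' % obj_num, head]
    if stream is not None:
        out.append(b'\nstream\n')
        out.append(stream)
        if pdfa_safe:
            # Line break in front of both keywords.
            out.append(b'\nendstream\nendobj\n')
        else:
            # Compact form, e.g. b'Qendstream endobj'.
            out.append(b'endstream endobj\n')
    else:
        out.append(b'\nendobj\n' if pdfa_safe else b' endobj\n')
    return b''.join(out)


if __name__ == '__main__':
    print(serialize_obj(1, b'<</Length 1>>', b'Q'))
    print(serialize_obj(1, b'<</Length 1>>', b'Q', pdfa_safe=False))
```

The line-broken form costs a few extra bytes per object, which is the size that the compact output was saving.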
Mar 4, 2011
The /Decode issue has been fixed in https://code.google.com/p/pdfsizeopt/issues/detail?id=37 , and the PDF/A issues are to be discussed in https://code.google.com/p/pdfsizeopt/issues/detail?id=13 , so closing this issue now.
Status: Duplicate
Merged into: 13