Bug 694810: Implement late file repair for PDFs.

Currently, if we spot a bad xref while reading a PDF in, we can repair the PDF by doing a long, exhaustive read of the file. This reconstructs the information that was in the xref, and the file can then be opened (and later saved) as normal. If we later hit an object that is not in the expected place, however, we cannot trigger a repair at that point, so an xref containing bad offsets that still lie within the bounds of the file will never be repaired. This commit solves that by triggering a repair (just once) whenever we fail to parse an object in the expected place.
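
The retry-once behaviour described above can be sketched roughly as follows. This is an illustration only, not the MuPDF implementation: the `sketch_*` names and every struct field other than `repair_attempted` are hypothetical, and it assumes that `try_repair` acts as an out-flag the parser sets when a failure looks like a misplaced object (the header diff shows only the new parameter, not how it is used).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a document. Only repair_attempted mirrors
 * the field added to pdf_document_s; the rest exists for the sketch. */
typedef struct {
	int repair_attempted; /* set once a late repair has been tried */
	int xref_offset;      /* where the (possibly bad) xref says the object is */
	int true_offset;      /* where the object really is */
	int repairable;       /* whether an exhaustive scan can find the object */
	int repair_count;     /* how many times the exhaustive repair ran */
} sketch_doc;

/* Hypothetical parser: succeeds only if the xref offset is correct.
 * On failure it sets *try_repair (when given) to suggest a repair. */
static int sketch_parse_obj(const sketch_doc *doc, int *try_repair)
{
	if (doc->xref_offset == doc->true_offset)
		return 1;
	if (try_repair != NULL)
		*try_repair = 1;
	return 0;
}

/* Hypothetical repair: stands in for the long exhaustive read. */
static void sketch_repair_xref(sketch_doc *doc)
{
	doc->repair_count++;
	if (doc->repairable)
		doc->xref_offset = doc->true_offset;
}

/* Load an object, repairing at most once per document.
 * Returns 1 on success, 0 on failure. */
static int sketch_load_obj(sketch_doc *doc)
{
	int try_repair = 0;

	assert(doc != NULL);

	if (sketch_parse_obj(doc, &try_repair))
		return 1;

	/* Give up if the failure is not repairable or we already tried. */
	if (!try_repair || doc->repair_attempted)
		return 0;

	doc->repair_attempted = 1;
	sketch_repair_xref(doc);

	/* Second and final attempt, using the rebuilt xref. */
	return sketch_parse_obj(doc, NULL);
}
```

Keeping the flag on the document rather than on the individual load means a file that cannot be repaired pays for the exhaustive scan only once, however many of its objects later fail to parse.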
author: Robin Watts <robin.watts@artifex.com> 2013-12-23 11:54:49 +0000
committer: Robin Watts <robin.watts@artifex.com> 2013-12-24 10:07:25 +0000
commit: 6b08b13fc4e95f9f0446a7199f1d9b5d9348d2f9 (patch)
tree: 01968e4ad4fa34c96d831581ec7f0920ff12a82e /include
parent: 3328de5f6432ee25525dca59d4e60d2603477b81 (diff)
download: mupdf-6b08b13fc4e95f9f0446a7199f1d9b5d9348d2f9.tar.xz
2 files changed, 3 insertions, 1 deletions
diff --git a/include/mupdf/pdf/document.h b/include/mupdf/pdf/document.h
index cd0c03ab..73b3692d 100644
--- a/include/mupdf/pdf/document.h
+++ b/include/mupdf/pdf/document.h
@@ -218,6 +218,8 @@ struct pdf_document_s
 
 	int page_count;
 
+	int repair_attempted;
+
 	/* State indicating which file parsing method we are using */
 	int file_reading_linearly;
 	int file_length;
diff --git a/include/mupdf/pdf/parse.h b/include/mupdf/pdf/parse.h
index 0dc52a78..0564a748 100644
--- a/include/mupdf/pdf/parse.h
+++ b/include/mupdf/pdf/parse.h
@@ -28,7 +28,7 @@ pdf_token pdf_lex(fz_stream *f, pdf_lexbuf *lexbuf);
 pdf_obj *pdf_parse_array(pdf_document *doc, fz_stream *f, pdf_lexbuf *buf);
 pdf_obj *pdf_parse_dict(pdf_document *doc, fz_stream *f, pdf_lexbuf *buf);
 pdf_obj *pdf_parse_stm_obj(pdf_document *doc, fz_stream *f, pdf_lexbuf *buf);
-pdf_obj *pdf_parse_ind_obj(pdf_document *doc, fz_stream *f, pdf_lexbuf *buf, int *num, int *gen, int *stm_ofs);
+pdf_obj *pdf_parse_ind_obj(pdf_document *doc, fz_stream *f, pdf_lexbuf *buf, int *num, int *gen, int *stm_ofs, int *try_repair);
 
 /*
 	pdf_print_token: print a lexed token to a buffer, growing if necessary