author | Robin Watts <robin.watts@artifex.com> | 2013-12-23 11:54:49 +0000 |
---|---|---|
committer | Robin Watts <robin.watts@artifex.com> | 2013-12-24 10:07:25 +0000 |
commit | 6b08b13fc4e95f9f0446a7199f1d9b5d9348d2f9 (patch) | |
tree | 01968e4ad4fa34c96d831581ec7f0920ff12a82e /include | |
parent | 3328de5f6432ee25525dca59d4e60d2603477b81 (diff) | |
download | mupdf-6b08b13fc4e95f9f0446a7199f1d9b5d9348d2f9.tar.xz |
Bug 694810: Implement late file repair for PDFs.
Currently, if we spot a bad xref as we are reading a PDF in, we can
repair that PDF by doing a long exhaustive read of the file. This
reconstructs the information that was in the xref, and the file can
be opened (and later saved) as normal.
If we hit an object that is not in the expected place, however, we
cannot trigger a repair at that point, so an xref containing bad
offsets (that still lie within the bounds of the file) will never be
repaired.
This commit solves that by triggering a repair (just once) whenever
we fail to parse an object in the expected place.
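The "repair just once" behaviour described above can be sketched as a small, self-contained C model. This is only an illustration under stated assumptions: `toy_doc`, `object_at`, `repair_xref` and `load_object` are hypothetical names, not MuPDF API, and the "file" is a plain string whose objects are located by their `N 0 obj` headers. Only the `repair_attempted` flag mirrors a name from the commit itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_OBJS 8

/* Hypothetical stand-in for a PDF document (not the MuPDF struct). */
typedef struct {
    const char *data;      /* whole "file" as a string */
    int xref[MAX_OBJS];    /* claimed offset of object N; index 0 unused */
    int nobjs;             /* objects are numbered 1..nobjs */
    int repair_attempted;  /* same idea as the flag added by the commit */
    int repairs_run;       /* counter, only here to observe behaviour */
} toy_doc;

/* Return 1 if the header "num 0 obj" really starts at offset ofs. */
static int object_at(const toy_doc *doc, int num, int ofs)
{
    char header[32];
    int len = snprintf(header, sizeof header, "%d 0 obj", num);
    int total = (int)strlen(doc->data);

    if (ofs < 0 || ofs + len > total)
        return 0;
    return memcmp(doc->data + ofs, header, (size_t)len) == 0;
}

/* Exhaustive scan of the file that rebuilds every xref offset. */
static void repair_xref(toy_doc *doc)
{
    int num;

    doc->repairs_run++;
    for (num = 1; num <= doc->nobjs; num++) {
        char header[32];
        const char *hit;

        snprintf(header, sizeof header, "%d 0 obj", num);
        hit = strstr(doc->data, header);
        doc->xref[num] = hit ? (int)(hit - doc->data) : -1;
    }
}

/* Return the verified offset of object num, or -1 on failure.
 * A failed lookup triggers at most one repair per document. */
static int load_object(toy_doc *doc, int num)
{
    assert(num >= 1 && num <= doc->nobjs);

    if (object_at(doc, num, doc->xref[num]))
        return doc->xref[num];
    if (doc->repair_attempted)
        return -1;              /* already repaired once: give up */

    doc->repair_attempted = 1;
    repair_xref(doc);
    return object_at(doc, num, doc->xref[num]) ? doc->xref[num] : -1;
}
```

In this model, an offset that is wrong but still inside the file (the case the commit message calls out) is only detected when the object is actually loaded. The first such failure pays for one full scan, and any later failure returns an error instead of rescanning.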
Diffstat (limited to 'include')
-rw-r--r-- | include/mupdf/pdf/document.h | 2 |
-rw-r--r-- | include/mupdf/pdf/parse.h | 2 |
2 files changed, 3 insertions, 1 deletions
diff --git a/include/mupdf/pdf/document.h b/include/mupdf/pdf/document.h
index cd0c03ab..73b3692d 100644
--- a/include/mupdf/pdf/document.h
+++ b/include/mupdf/pdf/document.h
@@ -218,6 +218,8 @@ struct pdf_document_s
 
 	int page_count;
 
+	int repair_attempted;
+
 	/* State indicating which file parsing method we are using */
 	int file_reading_linearly;
 	int file_length;
diff --git a/include/mupdf/pdf/parse.h b/include/mupdf/pdf/parse.h
index 0dc52a78..0564a748 100644
--- a/include/mupdf/pdf/parse.h
+++ b/include/mupdf/pdf/parse.h
@@ -28,7 +28,7 @@ pdf_token pdf_lex(fz_stream *f, pdf_lexbuf *lexbuf);
 pdf_obj *pdf_parse_array(pdf_document *doc, fz_stream *f, pdf_lexbuf *buf);
 pdf_obj *pdf_parse_dict(pdf_document *doc, fz_stream *f, pdf_lexbuf *buf);
 pdf_obj *pdf_parse_stm_obj(pdf_document *doc, fz_stream *f, pdf_lexbuf *buf);
-pdf_obj *pdf_parse_ind_obj(pdf_document *doc, fz_stream *f, pdf_lexbuf *buf, int *num, int *gen, int *stm_ofs);
+pdf_obj *pdf_parse_ind_obj(pdf_document *doc, fz_stream *f, pdf_lexbuf *buf, int *num, int *gen, int *stm_ofs, int *try_repair);
 
 /*
 	pdf_print_token: print a lexed token to a buffer, growing if necessary
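The headers only show that `pdf_parse_ind_obj` gains an `int *try_repair` out-parameter; the implementation is not part of this diff. As a rough illustration of how such an out-parameter could work, here is a toy header parser. `parse_ind_header` is a hypothetical function, not MuPDF code, and the convention that a NULL `try_repair` means "the caller cannot repair" is an assumption made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical toy parser (not MuPDF's pdf_parse_ind_obj).
 * Parses an object header of the form "<num> <gen> obj" at the start of s.
 * On success, stores num and gen and returns the characters consumed.
 * On failure, returns -1 and, if try_repair is non-NULL, sets
 * *try_repair to 1 to suggest that the caller rebuild the xref. */
static int parse_ind_header(const char *s, int *num, int *gen, int *try_repair)
{
    int n, g, used = 0;

    assert(num != NULL && gen != NULL);

    /* %n is only reached when the literal "obj" matched, so used stays 0
     * for input such as "12 0 xyz" even though two integers were read. */
    if (sscanf(s, "%d %d obj%n", &n, &g, &used) == 2 && used > 0) {
        *num = n;
        *gen = g;
        return used;
    }

    if (try_repair)
        *try_repair = 1;    /* assumed convention: NULL means "do not ask" */
    return -1;
}
```

Reporting the condition through a flag rather than repairing inside the parser keeps the parser free of document-level state. The caller can then consult a per-document flag such as `repair_attempted` to decide whether a repair is still allowed.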