path: root/pdf/pdf_xref.c
Age | Commit message | Author
2013-05-27 | Solve fuzzing SEGV due to negative object number in xref. | Robin Watts
2013-05-16 | Fix off by one error in xref resizing. | Robin Watts
Found by zeniko in his fuzzing tests. Many thanks!
2013-05-06 | Fix formatting. | Tor Andersson
2013-03-05 | Fix warnings seen on cleaning a document. | Robin Watts
Seen with the test file from bug 693677. When we read a file in, we read the trailer and the encrypt object before we start to decrypt other objects. These objects do not make it into the xref table though. When we write a file out, we run through the file reading in objects prior to writing them out; when we read in the trailer and the encrypt object we therefore try to decrypt them, giving errors. To avoid these errors, put the trailer and the encrypt object into the xref table when they are first read. This solves all but 1 problem when cleaning this file with "-dif" (as the signature object contains a digest block of data that is unencrypted). This solves all but 3 problems when cleaning this file with "-difggg"; the signature object, and one orphan copy of the crypt dictionary that is reported twice.
2013-02-19 | Bug 693639: Use strlcpy instead of strncpy! | Tor Andersson
strncpy is *not* the correct function to use. It does not null terminate, and it needlessly zeroes past the end. It was designed for fixed length database records, not strings. Use fz_strlcpy and strlcat instead.
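A minimal sketch of the behaviour this message asks for, assuming the usual strlcpy contract (always terminate, never pad, return the source length). The name sketch_strlcpy is hypothetical and this is not the fz_strlcpy source:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not MuPDF's fz_strlcpy.
 * Copies src into dst, which has room for n bytes in total.
 * Unlike strncpy, the result is always null terminated when n > 0,
 * and nothing past the terminator is written.
 * Returns strlen(src), so a result >= n means the copy was truncated. */
static size_t sketch_strlcpy(char *dst, const char *src, size_t n)
{
	size_t len = strlen(src);
	if (n > 0)
	{
		size_t copy = (len < n - 1) ? len : n - 1;
		memcpy(dst, src, copy);
		dst[copy] = '\0';
	}
	return len;
}
```

A caller can compare the return value against the buffer size to detect truncation, which strncpy gives no direct way to do.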
2013-02-19 | Fix whitespace. | Tor Andersson
2013-01-11 | Bug 693503: Fix NULL dereference in atoi. | Robin Watts
If a PDF xref subsection is broken in the wrong place, we can get NULL back from fz_strsep, which causes a SEGV when fed to atoi. Add a new fz_atoi that copes with NULL to avoid this. Problem found in a test file, 3959.pdf.SIGSEGV.ad4.3289 supplied by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google Security Team using Address Sanitizer. Many thanks!
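The kind of wrapper described here can be sketched as follows. The name sketch_atoi and the choice to map NULL to 0 are assumptions for illustration; the message only says the new function "copes with NULL":

```c
#include <stdlib.h>

/* Hypothetical stand-in for a NULL-tolerant atoi.
 * Assumption: a missing token is treated as 0 rather than crashing. */
static int sketch_atoi(const char *s)
{
	if (s == NULL)
		return 0;
	return atoi(s);
}
```

With a wrapper like this, a tokenizer that returns NULL for a truncated xref subsection no longer leads to a NULL dereference in the parser.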
2013-01-04 | Make token enum a type to ease debugging | Sebastian Rasmussen
2012-12-13 | Bug 693290: Fix use after free in obj stream handling. | Robin Watts
Thanks to zeniko for pointing this out. If we encounter a new definition for a given object (presumably due to a repair operation), we used to throw the old one away, and keep the new one. This could cause any current holders of the object to be left with a stale pointer. Now we throw the new one away and keep the old one - with a warning if they are different.
2012-11-30 | Bug 693290: Fix for potential infinite recursion reading xrefs. | Robin Watts
Fix an issue spotted by zeniko. The patch is slightly modified from his supplied one to avoid problems with repeated freeing of the buffer, and to avoid abusing fz_buffer, but is largely based on his work. Many thanks.
2012-10-25 | Support separate rendering of the main page contents and the annotations | Paul Gardiner
2012-10-17 | First steps towards supporting transitions. | Robin Watts
Only Fade, Wipe and Blinds supported so far. Hit 'p' in the viewer to go into 'presentation' mode. Page swaps then transition from page to page. Pages auto advance until key or mouse is used.
2012-09-04 | Improve error message when an object is missing from the xref. | Tor Andersson
2012-09-04 | Merge branch 'master' into forms | Paul Gardiner
Conflicts: pdf/pdf_xref_aux.c
2012-08-28 | Add fz_open_document_with_stream function. | Tor Andersson
Use a "magic" string for filetype detection: filename or mime-type.
2012-08-24 | Forms: avoid javascript action execution when engine not available | Paul Gardiner
This was necessary to avoid indirecting through a NULL pointer returned from pdf_js_get_event, but is a generally sensible restriction. Also separate the execution of the document-level javascript actions from the pdf_js constructor, so that doc->js is set during those actions. Also add a missing const.
2012-08-08 | Merge branch 'master' into forms | Paul Gardiner
Conflicts: Makefile apps/mudraw.c pdf/pdf_write.c win32/libmupdf-v8.vcproj
2012-08-06 | Remove old error messages turned into comments when adding exceptions | Sebastian Rasmussen
2012-08-06 | Update PDF metadata to include both PDF encryption version and revision | Sebastian Rasmussen
2012-08-06 | No need to drop object for which no reference has been kept | Sebastian Rasmussen
2012-08-06 | No need to check for NULL before dropping objects | Sebastian Rasmussen
2012-08-02 | Add missing virtual function setup | Paul Gardiner
2012-08-01 | Merge branch 'master' into forms | Paul Gardiner
Conflicts: pdf/mupdf-internal.h pdf/pdf_font.c
2012-08-01 | Enable pdf_print_xref in release builds; required for pdfshow. | Robin Watts
Previously this had been disabled other than in DEBUG builds under the belief that it was only used for debugging.
2012-07-26 | Only resize xref if trailer size entry indicates more objects | Sebastian Rasmussen
This will gracefully handle negative size entries as well, as these would not grow the xref.
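The grow-only rule can be sketched like this. The struct and function names are hypothetical and the table is simplified to an array of ints; the point is that a single comparison rejects both smaller and negative sizes:

```c
#include <stdlib.h>

/* Hypothetical, simplified xref table: len entries of int. */
struct sketch_xref
{
	int len;
	int *table;
};

/* Grow the table to 'size' entries only if that is larger than the
 * current length. A negative or smaller size is ignored, because
 * size <= len is true for every negative size (len is never negative).
 * Returns 0 on success (including "nothing to do"), -1 if allocation fails. */
static int sketch_ensure_size(struct sketch_xref *x, int size)
{
	if (size <= x->len)
		return 0;
	int *t = realloc(x->table, (size_t)size * sizeof *t);
	if (t == NULL)
		return -1;
	for (int i = x->len; i < size; i++)
		t[i] = 0; /* new slots start out empty */
	x->table = t;
	x->len = size;
	return 0;
}
```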
2012-07-26 | Gracefully handle negative xref stream object range boundaries | Sebastian Rasmussen
An xref stream describes objects within a range of object numbers. Fail if either boundary of the range is negative.
2012-07-26 | Gracefully handle negative xref stream offsets | Sebastian Rasmussen
2012-07-26 | Gracefully handle negative offset and objects in object stream | Sebastian Rasmussen
Previously a negative offset of the first object in an object stream or a negative number of objects in an object stream would cause a huge allocation. Detect and throw exception on negative values.
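A sketch of why the check matters: a negative int converted to size_t becomes a very large value, so it has to be rejected before it reaches the allocator. The function name is hypothetical, and it returns NULL where the commit describes throwing an exception:

```c
#include <stdlib.h>

/* Hypothetical helper: allocate the per-object index for an object stream.
 * 'count' is the number of objects, 'first' the offset of the first object.
 * Both come from the file and must not be trusted.
 * Returns NULL if either value is negative or if allocation fails
 * (the real code is described as throwing an exception instead). */
static int *sketch_alloc_objstm_index(int count, int first)
{
	if (count < 0 || first < 0)
		return NULL;
	/* Allocate at least one element so a zero count still yields a valid pointer. */
	return calloc(count > 0 ? (size_t)count : 1, sizeof(int));
}
```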
2012-07-26 | Gracefully handle negative xref stream entry widths | Sebastian Rasmussen
In an xref stream each entry (type, offset and generation) may be of variable width. Warn if these are negative and assume that they are not present.
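One way to picture this, as a sketch with hypothetical names: a negative width is reported and then treated as zero, and a zero-width field reads no bytes. Fields are read as big-endian integers; substituting the proper default for an absent field is left to the caller:

```c
#include <stdio.h>

/* Hypothetical helper: sanitise one field width from an xref stream.
 * A negative width produces a warning and is treated as 0 (field absent). */
static int sketch_clamp_width(int w)
{
	if (w < 0)
	{
		fprintf(stderr, "warning: negative xref stream entry width %d\n", w);
		return 0;
	}
	return w;
}

/* Read a big-endian unsigned field of w bytes starting at p.
 * A width of 0 reads nothing and yields 0. */
static long sketch_read_field(const unsigned char *p, int w)
{
	long v = 0;
	for (int i = 0; i < w; i++)
		v = (v << 8) | p[i];
	return v;
}
```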
2012-07-18 | Update pdf_to_utf8 to handle either a stream or a string | Paul Gardiner
Also change first argument from fz_context to pdf_document in each of pdf_to_utf8, pdf_to_utf8_name, pdf_to_ucs2 and pdf_to_ucs2_name
2012-07-09 | Forms: add widget enumeration, and text-widget content type | Paul Gardiner
Now reusing the internal representation of an annotation for widgets to avoid two separate lists
2012-07-06 | Remove debugging functions for release builds. | Sebastian Rasmussen
2012-07-05 | Merge branch 'master' into forms | Robin Watts
2012-07-05 | Move to static inline functions from macros. | Robin Watts
Instead of using macros for min/max/abs/clamp, we move to using inline functions. These are more typesafe, and should produce equivalent code on compilers that support inline (i.e. pretty much everything we care about these days). People can always do their own macro versions if they prefer.
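As an illustration of the type-safety point, here is a sketch with hypothetical names (not the MuPDF ones). An inline function evaluates each argument exactly once, whereas a macro such as ((a) < (b) ? (a) : (b)) evaluates one of its arguments twice, which matters when an argument has side effects:

```c
/* Hypothetical inline helpers for int; names are illustrative only.
 * Arguments are typed and evaluated exactly once. */
static inline int sketch_mini(int a, int b)
{
	return a < b ? a : b;
}

static inline int sketch_maxi(int a, int b)
{
	return a > b ? a : b;
}

static inline int sketch_clampi(int v, int lo, int hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}
```

Calling sketch_mini(i++, 5) increments i once; the macro form would increment it twice whenever i is the smaller value.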
2012-06-22 | Rework pdf_lexbuf to allow for dynamic parsing buffers. | Robin Watts
Currently pdf_lexbufs use a static scratch buffer for parsing. In the main case this is 64K in size, but in other cases it can be just 256 bytes; this causes problems when parsing long strings. Even the 64K limit is an implementation limit of Acrobat, not an architectural limit of PDF. Change here to allow dynamic buffers. This means a slightly more complex setup and destruction for each buffer, but more importantly requires correct cleanup on errors. To avoid having to insert lots more try/catch clauses this commit includes various changes to the code so we reuse pdf_lexbufs where possible. This keeps the speed up.
2012-06-22 | Rework pdf_lexbuf to allow for dynamic parsing buffers. | Robin Watts
Currently pdf_lexbufs use a static scratch buffer for parsing. In the main case this is 64K in size, but in other cases it can be just 256 bytes; this causes problems when parsing long strings. Even the 64K limit is an implementation limit of Acrobat, not an architectural limit of PDF. Change here to allow dynamic buffers. This means a slightly more complex setup and destruction for each buffer, but more importantly requires correct cleanup on errors. To avoid having to insert lots more try/catch clauses this commit includes various changes to the code so we reuse pdf_lexbufs where possible. This keeps the speed up.
2012-06-20 | Reduce amount of boiler plate by casting function pointers to void*. | Tor Andersson
Remove the shim indirection layer for fz_document. A little less type safe, but a lot less boiler plate.
2012-06-20 | Add better mechanism for enumerating annotation rectangles. | Robin Watts
Rather than having a dedicated call to enumerate the rectangles for the annotations on a page, add an interface for enumerating annotations with accessor functions. Currently the only accessor function is the one to get the annotation rectangle. Use this new scheme in place of fz_bound_annots within mudraw. Also use this scheme to set the caret cursor in the viewer when over a data field.
2012-06-15 | Move javascript loading after encryption/repair has been done. | Robin Watts
Currently we were attempting to load the javascript for a document immediately on opening it. Here we delay it until 1) the encryption for a document has been loaded, and 2) any repair required to a document has been done. This solves various problems, which were leading (indirectly) to bug 693128.
2012-06-14 | Add -j flag to mudraw; create simple mujstest scripts automatically. | Robin Watts
We add a new fz_bound_annots function (and associated pdf_bound_annots function) that calls a given callback with the page rectangle of the annotations on a given page. This is marked as being a 'temporary' function, so we can remove it/change it in future if required. It seems likely that we'll want to have some sort of 'iterate over annotations' function eventually, and this does the job for now. Add a -j flag to mudraw that outputs a simple mujstest script. For each page with annotations, the script jumps to that page, then for each annotation on the page, it sets some text to be entered, and clicks the annotation. In the case of text fields, this will cause the text to be entered into that text field; in the case of buttons it will execute the button. At the end of each page with annotations, the script is told to snapshot the page. These test scripts are not designed to be full tests, but they do at least provide an easy way for us to generate scripts where every field in our test suite is interacted with.
2012-06-13 | Merge branch 'master' into forms | Paul Gardiner
2012-06-13 | Remove unnecessary function and improve naming | Paul Gardiner
2012-06-13 | Merge branch 'master' into forms | Paul Gardiner
Conflicts: fitz/fitz-internal.h fitz/stm_buffer.c pdf/mupdf-internal.h
2012-06-12 | A few general utility functions added for the sake of the forms work | Paul Gardiner
2012-06-01 | Merge branch 'master' into forms | Paul Gardiner
Conflicts: fitz/doc_document.c fitz/fitz-internal.h fitz/fitz.h fitz/stm_buffer.c pdf/mupdf-internal.h pdf/pdf_object.c pdf/pdf_xobject.c pdf/pdf_xref.c win32/mupdf.sln
2012-05-31 | A few general utility functions added for the sake of the forms work | Paul Gardiner
2012-05-31 | Add linearization to pdf_write function. | Robin Watts
Extend mupdfclean to have a new -l flag that writes the file linearized. This should still be considered experimental. When writing a pdf file, analyse object use, flatten resource use, reorder the objects, generate a hintstream and output with linearisation parameters. This is enough for Acrobat to accept the file as being optimised for Fast Web View. We ought to add more tables to the hintstream in some cases, but I doubt anyone actually uses it, the spec is so badly written. Certainly Acrobat accepts the file as being optimised for 'Fast Web View'. Update fz_dict_put to allow for us adding a reference to the dictionary that is the sole owner of that reference already (i.e. don't drop then keep something that has a reference count of just 1). Update pdf_load_image_stream to use the stm_buf from the xref if there is one. Update pdf_close_document to discard any stm_bufs it may be holding. Update fz_dict_put to be pdf_dict_put - this was missed in a renaming ages ago and has been inconsistent since.
2012-05-23 | Bring xref object and stream mutation functions back from the dead. | Tor Andersson
Needs more work to use the linked list of free xref slots.
2012-05-15 | Forms: make forms API separate to the main document API | Paul Gardiner
This also provides a way to test whether interactive methods are supported.
2012-05-11 | Split part of fz_document interface for pdf_document into separate file. | Tor Andersson
Make a separate constructor function that does not link in the interpreter, so we can save space in the mubusy binary by not including the font and cmap resources.