|
Purge several embedded contexts:
Remove embedded context in fz_output.
Remove embedded context in fz_stream.
Remove embedded context in fz_device.
Remove fz_rebind_stream (since it is no longer necessary).
Remove embedded context in svg_device.
Remove embedded context in XML parser.
Add ctx argument to fz_document functions.
Remove embedded context in fz_document.
Remove embedded context in pdf_document.
Remove embedded context in pdf_obj.
Make fz_page independent of fz_document in the interface.
We shouldn't need to pass the document to all functions handling a page.
If a page is tied to the source document, it's redundant; otherwise it's
just pointless.
Fix reference counting oddity in fz_new_image_from_pixmap.
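Taken together, the changes above mean user code passes the fz_context
explicitly to document and page calls. A minimal sketch of how the
resulting API reads, built only from public MuPDF entry points (error
handling trimmed, file name hard-coded for brevity):

#include <stdio.h>
#include "mupdf/fitz.h"

int main(void)
{
	fz_context *ctx = fz_new_context(NULL, NULL, FZ_STORE_UNLIMITED);

	if (!ctx)
		return 1;
	fz_try(ctx)
	{
		fz_document *doc;
		fz_page *page;

		fz_register_document_handlers(ctx);
		doc = fz_open_document(ctx, "example.pdf");
		page = fz_load_page(ctx, doc, 0);  /* ctx is passed in, not found inside doc */
		printf("%d pages\n", fz_count_pages(ctx, doc));
		fz_drop_page(ctx, page);           /* dropping a page no longer needs the document */
		fz_drop_document(ctx, doc);
	}
	fz_catch(ctx)
	{
		fprintf(stderr, "error: %s\n", fz_caught_message(ctx));
	}
	fz_drop_context(ctx);
	return 0;
}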
|
|
Rename fz_close to fz_drop_stream.
Rename fz_close_archive to fz_drop_archive.
Rename fz_close_output to fz_drop_output.
Rename fz_free_* to fz_drop_*.
Rename pdf_free_* to pdf_drop_*.
Rename xps_free_* to xps_drop_*.
|
|
Add a new index that quickly maps object number to the first
xref in which an object appears. This seems to recover the speed
we lost when moving to sparse xrefs.
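A hedged sketch of the idea; the struct, field and helper names below
are illustrative, not MuPDF's actual internals:

#include "mupdf/pdf.h"

/* A per-document array remembers, for each object number, the first xref
 * section worth consulting, so a lookup no longer walks every section of
 * a heavily-updated file from scratch. */
typedef struct
{
	int num_sections;   /* xref sections, newest first */
	int num_objects;    /* size of the index */
	int *first_section; /* first_section[num] = first section known to hold num */
} xref_index;

/* Stand-in for the existing per-section lookup. */
extern pdf_xref_entry *entry_in_section(pdf_document *doc, int section, int num);

static pdf_xref_entry *
lookup_entry(pdf_document *doc, xref_index *idx, int num)
{
	int i = (num >= 0 && num < idx->num_objects) ? idx->first_section[num] : 0;

	for (; i < idx->num_sections; i++)
	{
		pdf_xref_entry *entry = entry_in_section(doc, i, num);
		if (entry)
		{
			idx->first_section[num] = i;  /* cache the hit for next time */
			return entry;
		}
	}
	return NULL;
}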
|
|
Currently each xref in the file results in an array from 0 to
num_objects. If we have a file that has been updated many times,
this causes a huge waste of memory.
Instead we now hold each xref as a list of non-overlapping subsections
(exactly as the file holds them).
Lookup is therefore potentially slower, but only on files where the
xrefs are highly fragmented (i.e. where we would be saving in memory
terms).
Some parts of our code (notably the file writing code that does
garbage collection etc.) assume that looking up an object entry pointer
will not invalidate object entry pointers that were looked up earlier.
To cope with this, and with the case where we are
updating/creating new objects, we introduce the idea of a 'solid'
xref.
A solid xref is one that has a single subsection record spanning the
entire range of valid object numbers for the file. Once we have
ensured that an xref is 'solid', we can safely work on the pointers
within it without fear of them moving.
We ensure that any 'incremental' xref is solid.
We also ensure that any non-incremental write makes the xref solid.
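The layout can be pictured roughly like this; the structures and helper
below are a hypothetical sketch, not MuPDF's actual types:

#include <stdlib.h>
#include <string.h>

/* Each xref section keeps only the subsections the file contains, rather
 * than one array spanning 0..num_objects. */
typedef struct entry { int type; long offset; } entry;

typedef struct subsec
{
	int start, len;   /* covers object numbers start .. start+len-1 */
	entry *table;
	struct subsec *next;
} subsec;

/* Lookup walks the list: slower than direct indexing, but only costly when
 * the xref is highly fragmented. */
static entry *lookup(subsec *s, int num)
{
	for (; s; s = s->next)
		if (num >= s->start && num < s->start + s->len)
			return &s->table[num - s->start];
	return NULL;
}

/* "Solidify": build a single subsection spanning 0..num-1, after which
 * entry pointers handed out from it stay put while further objects are
 * looked up or created. The caller would free the old list. */
static subsec *solidify(subsec *list, int num)
{
	subsec *solid = calloc(1, sizeof *solid);
	subsec *s;

	solid->len = num;
	solid->table = calloc(num, sizeof *solid->table);
	for (s = list; s; s = s->next)
		memcpy(solid->table + s->start, s->table, s->len * sizeof *s->table);
	return solid;
}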
|
|
If garbage is appended to an encrypted document, there could be another
trailer with /ID but without /Encrypt. The repair code currently always
uses the last values encountered, but replacing the /ID value alone can
cause decryption to break. One possible solution is to use the /ID value
only when there is none yet, when there is no /Encrypt, or when the same
trailer carries a matching /Encrypt.
See https://code.google.com/p/sumatrapdf/issues/detail?id=2697 for a
document which Adobe Reader can read but MuPDF cannot (it could before
pdf_lex_no_string was introduced, but only by accident).
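Expressed as a condition, the proposed rule would look roughly like the
hypothetical helper below (not the actual repair code; a real fix would
also verify that the trailer's /Encrypt matches the one already in use):

#include "mupdf/pdf.h"

static int
should_adopt_id(pdf_obj *id_so_far, pdf_obj *encrypt_so_far, pdf_obj *trailer_encrypt)
{
	if (!id_so_far)            /* no /ID adopted yet */
		return 1;
	if (!encrypt_so_far)       /* document is not encrypted */
		return 1;
	if (trailer_encrypt)       /* the same trailer also carries /Encrypt */
		return 1;
	return 0;
}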
|
|
When we meet a broken PDF file, we attempt to repair it. We do this by
reading tokens from the file and attempting to interpret them as a
normal PDF stream.
Unfortunately, if the file is corrupt enough that we start reading
from the middle of a stream and happen to hit an '(' character,
we can go into string reading mode and end up skipping over vast
swathes of the file that we could otherwise repair.
We fix this here by using a new version of the pdf_lex function that
refuses to ever return a string. This means we may spend more time
skipping things than we did before, but we are less likely to skip
data that we could have repaired.
We also tweak other parts of the PDF repair logic here. If we hit a
badly formed piece of data, we clear the stored num/gen so that the
next plausible piece we find does not get assigned to a stale object
number.
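Putting the two tweaks together, the repair scan looks roughly like the
sketch below; all names are hypothetical stand-ins, and only the idea
that the lexer never produces a string token reflects the real
pdf_lex_no_string change:

enum { TOK_EOF, TOK_INT, TOK_OBJ, TOK_OTHER };

struct lexbuf { int i; };

extern int lex_no_string(struct lexbuf *buf);  /* stand-in for pdf_lex_no_string */
extern void record_object(int num, int gen);   /* stand-in for the repair bookkeeping */

static void scan(void)
{
	struct lexbuf buf;
	int num = 0, gen = 0, tok;

	while ((tok = lex_no_string(&buf)) != TOK_EOF)
	{
		if (tok == TOK_INT)
		{
			num = gen;               /* remember the last two integers seen */
			gen = buf.i;
		}
		else if (tok == TOK_OBJ)
			record_object(num, gen); /* plausible "num gen obj" header */
		else
			num = gen = 0;           /* badly formed data: forget stale num/gen */
	}
}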
|
|
The 0 null object is leaked if a document refers to 0 0 obj before
requiring a delayed repair (seen e.g. with 3324.pdf.asan.3.2585).
|
|
At https://code.google.com/p/sumatrapdf/issues/detail?id=2436, there's
a document with an empty xref section which has recently started to
trigger a repair. The repair then stops when pdf_repair_obj_stms fails
on an object which isn't even required for the document to render. Such
broken object streams should instead be ignored, the same way broken
objects are ignored in pdf_init_document.
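In MuPDF's fz_try/fz_catch idiom that amounts to something like the
sketch below, where repair_one_obj_stm stands in for the per-stream work
pdf_repair_obj_stms performs:

#include "mupdf/pdf.h"

extern void repair_one_obj_stm(fz_context *ctx, pdf_document *doc, int num);

static void
repair_obj_stms_tolerantly(fz_context *ctx, pdf_document *doc, int *nums, int count)
{
	int i;

	for (i = 0; i < count; i++)
	{
		fz_try(ctx)
		{
			repair_one_obj_stm(ctx, doc, nums[i]);
		}
		fz_catch(ctx)
		{
			/* A broken object stream no longer aborts the whole repair. */
			fz_warn(ctx, "ignoring broken object stream (%d 0 R)", nums[i]);
		}
	}
}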
|
|
Currently, if we spot a bad xref as we are reading a PDF in, we can
repair that PDF by doing a long exhaustive read of the file. This
reconstructs the information that was in the xref, and the file can
be opened (and later saved) as normal.
If we hit an object that is not in the expected place, however, we
cannot trigger a repair at that point, so xrefs containing bad offsets
(that still lie within the bounds of the file) will never be repaired.
This commit solves that by triggering a repair (just once) whenever
we fail to parse an object in the expected place.
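A sketch of the shape of that change; parse_object_at,
rebuild_xref_by_scanning, object_offset and the repair_attempted flag
are illustrative names, not necessarily the real ones:

#include "mupdf/pdf.h"

extern pdf_obj *parse_object_at(fz_context *ctx, pdf_document *doc, int num, long ofs);
extern void rebuild_xref_by_scanning(fz_context *ctx, pdf_document *doc);
extern long object_offset(pdf_document *doc, int num);

static pdf_obj *
load_object_with_repair(fz_context *ctx, pdf_document *doc, int num, long expected_ofs)
{
	pdf_obj *obj = NULL;

	fz_var(obj);
	fz_try(ctx)
	{
		obj = parse_object_at(ctx, doc, num, expected_ofs);
	}
	fz_catch(ctx)
	{
		if (doc->repair_attempted)          /* only ever repair once */
			fz_rethrow(ctx);
		doc->repair_attempted = 1;
		rebuild_xref_by_scanning(ctx, doc); /* the same exhaustive scan a bad xref triggers */
		obj = parse_object_at(ctx, doc, num, object_offset(doc, num));
	}
	return obj;
}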
|
|
fz_read used to return a negative value on errors. With the
introduction of fz_try/fz_catch, it throws an error instead and
always returns a non-negative byte count. This removes the now-pointless
checks.
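A before/after sketch of a typical call site (the fz_read signature is
shown as it was at the time, before the context purge above; the
surrounding names are illustrative):

/* Before: callers defensively checked for a negative return. */
n = fz_read(stm, buf, sizeof buf);
if (n < 0)
	return 0;   /* error path, now unreachable */

/* After: fz_read reports failure by throwing, so the return value is
 * always a byte count >= 0 and the check can simply be deleted. */
n = fz_read(stm, buf, sizeof buf);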
|
|
|
|
We are testing this using a new -p flag to mupdf that sets a bitrate at
which data will appear to arrive progressively as time goes on. For
example:
mupdf -p 102400 pdf_reference17.pdf
Details of the scheme used here are presented in docs/progressive.txt
|
|
|
|
For historical reasons, lots of the code uses "xref" when talking about
a pdf document. Now that pdf_xref is a separate type, this has become
confusing, so replace 'xref' with 'doc' for clarity.
|
|
Remove the fz_context field to avoid the structure growing.
|
|
* at one place, code returns from inside an fz_try block, which corrupts
the error handling stack (see the sketch after this list)
* pdf_load_xref wrongly assumes that at least one non-empty xref has
been read if no errors were thrown during parsing
* pdf_repair_xref skips integers when object numbers are out of range
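A sketch of the first fix: fz_try pushes a frame on the context's
exception stack that fz_catch pops, so returning from inside fz_try
leaves that stack unbalanced. Hoisting the return out fixes it
(load_thing is a hypothetical stand-in):

#include "mupdf/fitz.h"

extern void *load_thing(fz_context *ctx);

static void *
load_safely(fz_context *ctx)
{
	void *thing = NULL;

	fz_var(thing);
	fz_try(ctx)
	{
		thing = load_thing(ctx);
		/* return thing;  <- wrong: skips the fz_catch bookkeeping */
	}
	fz_catch(ctx)
	{
		thing = NULL;
	}
	return thing;   /* right: leave the try/catch normally, then return */
}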
|