mupdf - MuPDF PDF reader and library

Age	Commit message	Author
2016-09-19	fz_store: Reap passes.	Robin Watts
	A few commits back, we introduced the fz_key_storable concept to allow us to cope with objects that were used both as values within the store and as parts of keys within the store. That commit worked, but showed up performance problems; when the store has several million PDF objects in it, bulk changes (such as dropping a display list or document) could trigger many passes across the store. We therefore introduce a mechanism to ameliorate this. These passes, now known as "reap passes", can be batched together using fz_defer_reap_start and fz_defer_reap_end. We trigger this start/end around display list dropping, and around PDF content stream processing. This should be fine, as deferral will be interrupted if we ever run out of memory during mallocing.
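The batching idea can be sketched with a toy model. This is not MuPDF's implementation: the `toy_store` struct and every `toy_` function are hypothetical, with names loosely modeled on the fz_defer_reap_start and fz_defer_reap_end pair mentioned above. While the deferral depth is non-zero, a reap request only sets a flag, and a single pass runs when the outermost deferral ends.

```c
#include <assert.h>

/* Toy model only: names and layout are assumptions, not MuPDF's real store. */
typedef struct {
    int defer_depth;   /* nesting depth of start/end pairs */
    int reap_pending;  /* a reap was requested while deferred */
    int passes_run;    /* number of full passes actually performed */
} toy_store;

/* Stand-in for the expensive walk over every item in the store. */
static void toy_reap_pass(toy_store *s)
{
    s->passes_run++;
    s->reap_pending = 0;
}

/* Called whenever a drop may have left some keys unreachable. */
static void toy_request_reap(toy_store *s)
{
    if (s->defer_depth > 0)
        s->reap_pending = 1;   /* remember the request, do the work later */
    else
        toy_reap_pass(s);
}

static void toy_defer_reap_start(toy_store *s)
{
    s->defer_depth++;
}

static void toy_defer_reap_end(toy_store *s)
{
    assert(s->defer_depth > 0);
    if (--s->defer_depth == 0 && s->reap_pending)
        toy_reap_pass(s);
}

/* Drop n objects in one bulk operation; returns the passes performed. */
static int toy_bulk_drop(toy_store *s, int n, int batched)
{
    int before = s->passes_run;
    int i;

    if (batched)
        toy_defer_reap_start(s);
    for (i = 0; i < n; i++)
        toy_request_reap(s);
    if (batched)
        toy_defer_reap_end(s);
    return s->passes_run - before;
}
```

In this model, dropping 1000 objects without batching costs 1000 passes, while the same drop wrapped in a start/end pair costs one. The sketch omits the out-of-memory case, where the commit message says deferral is interrupted.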
2016-09-18	Make printing empty hash table entries optional.	Sebastian Rasmussen

2016-09-16	Add some missing #ifdeffery for Memento.	Robin Watts

2016-09-16	Tweak store handling of PDF document destroy.	Robin Watts
	When we destroy a PDF document, currently we bin everything from the store. Instead, drop just the objects that are specifically tied to that document. Any object tied to the document has a pdf_obj with the required document pointer in it as the key.
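A minimal sketch of the selective eviction described above, under the assumption that each store item's key can be asked which document it belongs to. The `toy_doc` and `toy_item` types and the `toy_drop_doc_items` function are hypothetical and only illustrate the filtering step, not the real store layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a document and a store entry. */
typedef struct { int id; } toy_doc;

typedef struct {
    const toy_doc *key_doc;  /* document the key's object is tied to, or NULL */
    int live;                /* 1 while the entry is still in the store */
} toy_item;

/* Evict only the entries whose key is tied to doc; return how many went. */
static int toy_drop_doc_items(toy_item *items, size_t n, const toy_doc *doc)
{
    int dropped = 0;
    size_t i;

    assert(doc != NULL);
    for (i = 0; i < n; i++) {
        if (items[i].live && items[i].key_doc == doc) {
            items[i].live = 0;
            dropped++;
        }
    }
    return dropped;
}
```

Entries belonging to other documents stay cached, which is the point of the change: destroying one document no longer empties the whole store.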
2016-09-16	Extend store to cope with references used in keys.	Robin Watts
	The store is effectively a list of items, where each item is a (key, value) pair. The design is such that we can easily get into the state where the only reference to a value is that held by the store. Subsequent references can then be generated by things being 'found' from within the store. While the only reference to an object is that held by it being a value in the store, the store is free to evict it to save memory. Images present a complication to this design; images are stored both as values within the store (by the pdf agent, so that we do not regenerate images each time we meet them in the file), and as parts of the keys within the store. For example, once an image is decoded to give a pixmap, the pixmap is cached in the store. The key to look that pixmap up again includes a reference to the image from which the pixmap was generated. This means that for document handlers such as gproof that do not place images in the store, we can end up with images that are kept around purely by dint of being used as references in store keys. There is no chance of the value (the decoded pixmap) ever being 'found' from the store, as nothing other than the key holds a reference to the image required. Thus the images/pixmaps are never freed until the store is emptied. This commit offers a fix for this situation. Standard store items are based on an fz_storable type. Here we introduce a new fz_key_storable type derived from that. As well as keeping track of the number of references a given item has to it, it keeps a separate count of the number of references a given item has to it from keys in the store. On dropping a reference, we check to see if the number of references has become the same as the number of references from keys in the store. If it has, then we know that these keys can never be 'found' again. So we filter them out of the store, which drops the items.
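The two-counter check can be sketched as follows. This is a toy model with hypothetical names (`toy_key_storable`, `toy_keep`, `toy_keep_key`, `toy_drop`), not the actual fz_key_storable definition. The extra `key_refs > 0` test is an assumption added here so that an object with no key references at all does not ask for a reap.

```c
#include <assert.h>

/* Toy model: field and function names are assumptions for illustration. */
typedef struct {
    int refs;      /* all references, including those held by store keys */
    int key_refs;  /* the subset of refs held by keys in the store */
} toy_key_storable;

/* Take an ordinary reference. */
static void toy_keep(toy_key_storable *k)
{
    k->refs++;
}

/* Take a reference on behalf of a store key. */
static void toy_keep_key(toy_key_storable *k)
{
    k->refs++;
    k->key_refs++;
}

/*
 * Drop an ordinary reference. Returns 1 when only store keys still point
 * at the object, meaning entries keyed on it can never be found again and
 * should be filtered out of the store.
 */
static int toy_drop(toy_key_storable *k)
{
    assert(k->refs > k->key_refs);
    k->refs--;
    return k->key_refs > 0 && k->refs == k->key_refs;
}
```

For an image with two ordinary holders and one cached pixmap keyed on it, the first drop returns 0 and the second returns 1, at which point the pixmap entry is unreachable and can be reaped.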
2016-09-14	Redirect fprintf to android logcat in debug builds.	Robin Watts
	This makes debugging much simpler.
2016-09-14	Fix typo in Memento header.	Robin Watts

2016-09-13	Update Memento for Android.	Robin Watts
	Add backtrace abilities, and fix missing return value from android logging.
2016-09-13	Bug 696984: Type 3 fonts bbox fixes.	Robin Watts
	The upshot of debugging this is that:
	1) We can't trust the FontBBox. Certainly it appears that no one else trusts it.
	2) We can't trust the d1 values in all cases, as they can lead to us rendering glyphs far larger than we'd want to.
	So we have the compromise used here:
	1) We never clip to the FontBBox.
	2) If the FontBBox is invalid, then we calculate the bbox from the contents of the data streams.
	3) If the FontBBox is valid, and the d1 rectangle given does not fit inside it, then we calculate the bbox from the contents of the data streams.
	This could theoretically produce problems with glyphs that have much more content than they actually need, and rely on the d1 rect to clip it down to sanity. If the FontBBox is invalid in such fonts, we will go wrong. It's not clear to me that this will actually work in Acrobat/Foxit/gs etc. either, so we defer handling this better until we actually have an example. Tested with bug 694952 and bug 695843, which were the last 2 in this area.
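The compromise above amounts to a small decision function. The sketch below uses hypothetical types and names (`toy_rect`, `toy_choose_glyph_bbox`) and does not reflect MuPDF's font code. Treating "valid" as "non-empty" is an assumption, as is falling back to the d1 rectangle when the FontBBox is valid and d1 fits inside it, which the message implies but does not state.

```c
#include <assert.h>

typedef struct { float x0, y0, x1, y1; } toy_rect;

typedef enum {
    USE_D1_RECT,       /* trust the rectangle given by the d1 operator */
    USE_CONTENT_BBOX   /* measure the glyph's content stream instead */
} toy_bbox_source;

/* Assumption: a bbox is "valid" when it has positive width and height. */
static int toy_rect_is_valid(toy_rect r)
{
    return r.x1 > r.x0 && r.y1 > r.y0;
}

/* True when inner lies entirely within outer. */
static int toy_rect_contains(toy_rect outer, toy_rect inner)
{
    assert(toy_rect_is_valid(outer));
    return inner.x0 >= outer.x0 && inner.y0 >= outer.y0 &&
           inner.x1 <= outer.x1 && inner.y1 <= outer.y1;
}

/* Decide where a Type 3 glyph's bbox should come from. */
static toy_bbox_source toy_choose_glyph_bbox(toy_rect font_bbox, toy_rect d1)
{
    if (!toy_rect_is_valid(font_bbox))
        return USE_CONTENT_BBOX;      /* invalid FontBBox */
    if (!toy_rect_contains(font_bbox, d1))
        return USE_CONTENT_BBOX;      /* d1 does not fit in the FontBBox */
    return USE_D1_RECT;               /* assumed default case */
}
```

Note that in no branch is the FontBBox itself used for clipping, matching the first rule of the compromise.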
2016-09-09	Fix VS2005 build; missing stat definition.	Robin Watts
	Windows requires sys/stat.h to be included.
2016-09-08	Add options to control heuristics in structured text.	Sebastian Rasmussen

2016-09-08	Make fz_option_eq() available outside of pdf-writer.	Sebastian Rasmussen

2016-09-08	Add support for GNU tar archives.	Sebastian Rasmussen

2016-09-08	Make fz_archive a generic archive type.	Sebastian Rasmussen
	Previously it was inherently tied to zip archives and directories. Now these are separated out into distinct subclasses. This prepares for support for further archive formats.
2016-09-01	pdf: Load/open streams by indirect reference object when possible.	Tor Andersson

2016-09-01	Simplify PDF resource caching table handling.	Tor Andersson

2016-08-24	Add pdf_array_find to look up the index of an object in an array.	Tor Andersson

2016-08-22	Document part of fz_stream interface.	Sebastian Rasmussen

2016-08-21	Fix typo in document creation macro.	Sebastian Rasmussen

2016-08-01	Move to bitfields in fz_font rather than chars/ints etc.	Robin Watts

2016-08-01	Bug 696984: Badly rendered characters.	Robin Watts
	The type3 font(s) in the file have an invalid (0 sized) bbox, hence the clipping of the chars goes wrong. We now spot the invalid bbox, and suppress the clipping.
2016-07-15	Add interface indicating if a document is reflowable.	Sebastian Rasmussen

2016-07-13	Bug 696699: Fix Text extraction mediabox information.	Robin Watts
	Since the removal of the begin_page device function, structured text extraction has been unable to correctly establish the mediabox for extracted pages. Update the fz_new_stext_page call to take this mediabox information. This is an API change, but hopefully most people are calling fz_new_stext_page_from_page or fz_new_stext_page_from_display_list which are updated here to cope. Update all the apps/tools to behave properly.
2016-07-13	Fix Memento builds; static references were upsetting refcounting.	Robin Watts

2016-07-12	Fix typo in comment.	Robin Watts

2016-07-09	Add documentation for exposed LAB function.	Sebastian Rasmussen

2016-07-08	Avoid warnings in non-Memento builds.	Robin Watts

2016-07-08	Use fz_keep_imp and fz_drop_imp for all reference counting.	Tor Andersson

2016-07-08	git stripspace	Tor Andersson

2016-07-08	Separate close and drop functionality for devices and writers.	Tor Andersson
	Closing a device or writer may throw exceptions, but many of the foreign language bindings (JNI and JS) depend on drop never throwing an exception (exceptions in finalizers are bad).
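A rough sketch of the split, with exceptions simplified to return codes. All names here (`toy_device`, `toy_close_device`, `toy_drop_device`, `toy_failing_flush`) are hypothetical and do not mirror the real device interface. The close step does the work that can fail; the drop step only releases the reference and cannot fail.

```c
#include <assert.h>

typedef struct toy_device toy_device;
struct toy_device {
    int refs;                       /* reference count */
    int closed;                     /* set once close has been attempted */
    int pending;                    /* buffered output not yet flushed */
    int (*flush)(toy_device *dev);  /* returns 0 on success, -1 on failure */
};

/* A flush that always fails, standing in for an I/O error. */
static int toy_failing_flush(toy_device *dev)
{
    (void)dev;
    return -1;
}

/* May fail: finishes pending work. Returns -1 where real code would throw. */
static int toy_close_device(toy_device *dev)
{
    if (dev->closed)
        return 0;
    dev->closed = 1;
    if (dev->pending && dev->flush(dev) != 0)
        return -1;
    dev->pending = 0;
    return 0;
}

/* Never fails, so it is safe in a finalizer. Returns 1 on the last drop. */
static int toy_drop_device(toy_device *dev)
{
    assert(dev->refs > 0);
    if (--dev->refs > 0)
        return 0;
    dev->pending = 0;   /* unflushed work is discarded, not retried */
    return 1;
}
```

A caller that cares about errors calls close explicitly and handles the failure; a garbage-collected binding can still call drop from its finalizer without risking an exception.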
2016-07-08	Slim pdf_xobject: remove cached colorspace field.	Tor Andersson

2016-07-08	Slim pdf_xobject: remove cached document field.	Tor Andersson

2016-07-08	Slim pdf_xobject: remove cached transparency/isolated/knockout fields.	Tor Andersson

2016-07-08	Slim pdf_xobject struct: remove cached matrix field.	Tor Andersson

2016-07-08	Slim pdf_xobject struct: remove cached bbox field.	Tor Andersson

2016-07-08	Slim pdf_xobject struct: remove cached resources field.	Tor Andersson
	The "contents" field is the same as the "obj" field, so can also be removed.
2016-07-08	Slim pdf_xobject struct: Rename me to obj.	Tor Andersson

2016-07-08	Slim pdf_annot struct: remove cached page_ctm field.	Tor Andersson

2016-07-08	Slim pdf_annot struct: remove cached inv_page_ctm field.	Tor Andersson

2016-07-08	Slim pdf_annot struct: remove cached annot_type and widget_type fields.	Tor Andersson

2016-07-08	Slim pdf_annot struct: remove cached matrix field.	Tor Andersson

2016-07-08	Slim pdf_annot struct: remove cached rect field.	Tor Andersson

2016-07-08	Slim pdf_annot struct: remove cached pagerect field.	Tor Andersson

2016-07-06	Start slimming pdf_page.	Tor Andersson
	We want to turn pdf_page into a thin wrapper around a pdf_obj, so that any updates to the underlying PDF objects will be reflected without having to reload the pdf_page.
2016-07-06	Add fitz to pdf downcasting functions for pages and annotations.	Tor Andersson

2016-07-06	Add annotations to murun.	Tor Andersson

2016-07-06	Fix garbage collection and page grafting for indirect reference chains.	Tor Andersson
	The mark & sweep pass of garbage collection, and the resolving of indirect objects when grafting, were following the full chain of indirect references. In the unusual case where a numbered object is itself only an indirect reference to another object, this intermediate numbered object would be missed both when marking for garbage collection, and when copying objects for grafting. Add a function to resolve only one step for these two uses. The following is an example of a file that would break during garbage collection if we follow full indirect reference chains:
	%PDF-1.3
	1 0 obj <</Type/Catalog /Foo[2 0 R 3 0 R]>> endobj
	2 0 obj 4 0 R endobj
	3 0 obj 5 0 R endobj
	4 0 obj <</Length 1>> stream
	A
	endstream endobj
	5 0 obj <</Length 1>> stream
	B
	endstream endobj
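The difference between the two marking strategies can be shown on a toy cross-reference table. The `toy_xref` layout and all function names are hypothetical; the example data matches the file in the commit message, where object 2 only points at object 4 and object 3 only points at object 5.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_OBJ 8

/*
 * Toy xref: ref_to[n] is the object number that object n merely points at,
 * or 0 if object n holds a real (direct) value. Object 0 is unused.
 */
typedef struct {
    int ref_to[TOY_MAX_OBJ];
    int marked[TOY_MAX_OBJ];
} toy_xref;

/* Follow the whole chain and mark only the final object (the old way). */
static void toy_mark_full_chain(toy_xref *x, int num)
{
    while (x->ref_to[num] != 0)
        num = x->ref_to[num];
    x->marked[num] = 1;
}

/* Resolve one step at a time, marking every numbered object on the way. */
static void toy_mark_one_step(toy_xref *x, int num)
{
    for (;;) {
        assert(num > 0 && num < TOY_MAX_OBJ);
        if (x->marked[num])
            return;               /* already visited; also stops on cycles */
        x->marked[num] = 1;
        if (x->ref_to[num] == 0)
            return;               /* reached a direct object */
        num = x->ref_to[num];
    }
}

/* Build the example from the commit message: 2 -> 4 and 3 -> 5. */
static void toy_init_example(toy_xref *x)
{
    memset(x, 0, sizeof *x);
    x->ref_to[2] = 4;
    x->ref_to[3] = 5;
}
```

With full-chain marking, objects 2 and 3 are never marked and would be swept even though the catalog refers to them. With one-step marking, objects 2, 3, 4 and 5 all survive.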
2016-07-06	pdf: Drop generation number from public interfaces.	Tor Andersson
	The generation number is only needed for decryption, and is assumed to be zero or irrelevant for all other uses. Store the original object number and generation in the xref slot, so that we can decrypt them even when the objects have been renumbered, without needing to pass the original object number around through the stream loading APIs.
2016-07-06	pdf: Flatten inheritable page properties when copying pages.	Tor Andersson
	Affects pdfclean, pdfmerge, and pdfposter.
2016-07-06	Add support for decoding pbm/pgm/ppm/pam images.	Sebastian Rasmussen