path: root/include
Age  Commit message  Author
2016-09-19fz_store: Reap passes.Robin Watts
A few commits back, we introduced the fz_key_storable concept to allow us to cope with objects that were used both as values within the store and as parts of keys within the store. That commit worked, but showed up performance problems: when the store has several million PDF objects in it, bulk changes (such as dropping a display list or document) could trigger many passes across the store. We therefore introduce a mechanism to ameliorate this. These passes, now known as "reap passes", can be batched together using fz_defer_reap_start and fz_defer_reap_end. We trigger this start/end around display list dropping, and around PDF content stream processing. This should be fine, as deferral will be interrupted if we ever run out of memory during allocation.
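The batching idea described above can be sketched as a toy model. This is not MuPDF's implementation: the `toy_` names and the counter-based design are assumptions made for illustration, and the out-of-memory interruption mentioned in the commit is not modelled.

```c
/* Toy model of deferred reap passes. All names are hypothetical. */
typedef struct {
    int defer_depth;   /* nesting level of start/end pairs */
    int reap_pending;  /* a reap was requested while deferred */
    int passes_run;    /* number of full passes made over the store */
} toy_store;

static void toy_reap_pass(toy_store *s)
{
    /* A real pass would walk every item in the store; here we only count it. */
    s->passes_run++;
    s->reap_pending = 0;
}

/* Called whenever a drop leaves some store keys unreachable. */
static void toy_request_reap(toy_store *s)
{
    if (s->defer_depth > 0)
        s->reap_pending = 1;   /* remember the request, run it later */
    else
        toy_reap_pass(s);      /* not deferred: pay for a pass now */
}

static void toy_defer_reap_start(toy_store *s)
{
    s->defer_depth++;
}

static void toy_defer_reap_end(toy_store *s)
{
    /* Only the outermost end runs the batched pass, and only if needed. */
    if (s->defer_depth > 0 && --s->defer_depth == 0 && s->reap_pending)
        toy_reap_pass(s);
}
```

In this sketch, any number of reap requests made between a start and its matching end cost a single pass.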
2016-09-18Make printing empty hash table entries optional.Sebastian Rasmussen
2016-09-16Add some missing #ifdeffery for Memento.Robin Watts
2016-09-16Tweak store handling of PDF document destroy.Robin Watts
When we destroy a PDF document, currently we bin everything from the store. Instead, drop just the objects that are specifically tied to that document. Any object tied to the document has a pdf_obj with the required document pointer in it as the key.
2016-09-16Extend store to cope with references used in keys.Robin Watts
The store is effectively a list of items, where each item is a (key, value) pair. The design is such that we can easily get into the state where the only reference to a value is that held by the store. Subsequent references can then be generated by things being 'found' from within the store. While the only reference to an object is that held by it being a value in the store, the store is free to evict it to save memory.
Images present a complication to this design; images are stored both as values within the store (by the pdf agent, so that we do not regenerate images each time we meet them in the file), and as parts of the keys within the store. For example, once an image is decoded to give a pixmap, the pixmap is cached in the store. The key to look that pixmap up again includes a reference to the image from which the pixmap was generated.
This means that for document handlers such as gproof that do not place images in the store, we can end up with images that are kept around purely by dint of being used as references in store keys. There is no chance of the value (the decoded pixmap) ever being 'found' from the store, as nothing other than the key holds a reference to the image required. Thus the images/pixmaps are never freed until the store is emptied.
This commit offers a fix for this situation. Standard store items are based on an fz_storable type. Here we introduce a new fz_key_storable type derived from that. As well as keeping track of the number of references a given item has to it, it keeps a separate count of the number of references a given item has to it from keys in the store. On dropping a reference, we check to see if the number of references has become the same as the number of references from keys in the store. If it has, then we know that these keys can never be 'found' again. So we filter them out of the store, which drops the items.
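The two-counter check described above can be sketched as follows. This is a simplified model, not the fz_key_storable code itself: the struct layout and the `toy_` function names are hypothetical.

```c
/* Toy model of an item that can be referenced from store keys. */
typedef struct {
    int refs;      /* all references, including those held by store keys */
    int key_refs;  /* the subset of references held by keys in the store */
} toy_key_storable;

/* An ordinary reference, e.g. held by a caller. */
static void toy_keep(toy_key_storable *item)
{
    item->refs++;
}

/* A reference taken because the item is part of a store key. */
static void toy_keep_key(toy_key_storable *item)
{
    item->refs++;
    item->key_refs++;
}

/* Drop an ordinary reference. Returns 1 when only key references remain,
 * meaning the entries keyed on this item can never be found again and
 * should be filtered out of the store. Returns 0 otherwise. */
static int toy_drop(toy_key_storable *item)
{
    if (item->refs <= 0)
        return 0;
    item->refs--;
    return item->key_refs > 0 && item->refs == item->key_refs;
}
```

A return value of 1 is the point at which a reap pass over the store would be requested.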
2016-09-14Redirect fprintf to android logcat in debug builds.Robin Watts
This makes debugging much simpler.
2016-09-14Fix typo in Memento header.Robin Watts
2016-09-13Update Memento for Android.Robin Watts
Add backtrace abilities, and fix missing return value from android logging.
2016-09-13Bug 696984: Type 3 fonts bbox fixes.Robin Watts
The upshot of debugging this is that:
1) We can't trust the FontBBox. Certainly it appears that no one else trusts it.
2) We can't trust the d1 values in all cases, as it can lead to us rendering glyphs far larger than we'd want to.
So we have the compromise used here:
1) We never clip to the FontBBox.
2) If the FontBBox is invalid, then we calculate the bbox from the contents of the data streams.
3) If the FontBBox is valid, and the d1 rectangle given does not fit inside it, then we calculate the bbox from the contents of the data streams.
This could theoretically produce problems with glyphs that have much more content than they actually need, and rely on the d1 rect to clip it down to sanity. If the FontBBox is invalid in such fonts, we will go wrong. It's not clear to me that this will actually work in Acrobat/Foxit/gs etc either, so we defer handling this better until we actually have an example. Tested with bug 694952, and bug 695843 which were the last 2 in this area.
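The compromise above amounts to a small decision function, sketched here under two assumptions that the commit message does not state: an "invalid" FontBBox is taken to mean an empty (zero or negative sized) rectangle, and the d1 rectangle is used when neither fallback rule fires. All names are hypothetical.

```c
typedef struct { float x0, y0, x1, y1; } toy_rect;

typedef enum { TOY_USE_D1, TOY_USE_CONTENTS } toy_bbox_source;

/* Assumption: a rectangle with no area counts as invalid. */
static int toy_rect_is_valid(toy_rect r)
{
    return r.x1 > r.x0 && r.y1 > r.y0;
}

static int toy_rect_contains(toy_rect outer, toy_rect inner)
{
    return inner.x0 >= outer.x0 && inner.y0 >= outer.y0 &&
           inner.x1 <= outer.x1 && inner.y1 <= outer.y1;
}

/* Decide where a Type 3 glyph's bbox comes from. The FontBBox is only
 * used as a sanity check here, never as a clip rectangle. */
static toy_bbox_source toy_choose_glyph_bbox(toy_rect font_bbox, toy_rect d1)
{
    if (!toy_rect_is_valid(font_bbox))
        return TOY_USE_CONTENTS;   /* rule 2: invalid FontBBox */
    if (!toy_rect_contains(font_bbox, d1))
        return TOY_USE_CONTENTS;   /* rule 3: d1 does not fit inside */
    return TOY_USE_D1;             /* assumed default when both checks pass */
}
```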
2016-09-09Fix VS2005 build; missing stat definition.Robin Watts
Windows requires sys/stat.h to be included.
2016-09-08Add options to control heuristics in structured text.Sebastian Rasmussen
2016-09-08Make fz_option_eq() available outside of pdf-writer.Sebastian Rasmussen
2016-09-08Add support for GNU tar archives.Sebastian Rasmussen
2016-09-08Make fz_archive a generic archive type.Sebastian Rasmussen
Previously it was inherently tied to zip archives and directories. Now these are separated out into distinct subclasses. This prepares for support for further archive formats.
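One common way to express a generic base type with per-format subclasses in C is a struct of function pointers embedded as the first member of each subclass. The sketch below illustrates that pattern only; it is not the actual fz_archive layout, and every name in it is hypothetical.

```c
#include <string.h>

/* Generic base: callers only ever see this type. */
typedef struct toy_archive toy_archive;
struct toy_archive {
    const char *format;
    int (*has_entry)(toy_archive *arch, const char *name);
};

/* One possible subclass: an archive backed by a plain list of names. */
typedef struct {
    toy_archive super;    /* must come first so the cast below is valid */
    const char **names;
    int count;
} toy_list_archive;

static int toy_list_has_entry(toy_archive *arch, const char *name)
{
    toy_list_archive *list = (toy_list_archive *)arch;
    int i;
    for (i = 0; i < list->count; i++)
        if (strcmp(list->names[i], name) == 0)
            return 1;
    return 0;
}

/* Format-independent entry point: dispatches to the subclass. */
static int toy_has_archive_entry(toy_archive *arch, const char *name)
{
    return arch->has_entry(arch, name);
}
```

Adding a further format in this scheme means adding another subclass struct and its callbacks, without touching callers of the generic entry point.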
2016-09-01pdf: Load/open streams by indirect reference object when possible.Tor Andersson
2016-09-01Simplify PDF resource caching table handling.Tor Andersson
2016-08-24Add pdf_array_find to look up the index of an object in an array.Tor Andersson
2016-08-22Document part of fz_stream interface.Sebastian Rasmussen
2016-08-21Fix typo in document creation macro.Sebastian Rasmussen
2016-08-01Move to bitfields in fz_font rather than chars/ints etc.Robin Watts
2016-08-01Bug 696984: Badly rendered characters.Robin Watts
The type3 font(s) in the file have an invalid (0 sized) bbox, hence the clipping of the chars goes wrong. We now spot the invalid bbox, and suppress the clipping.
2016-07-15Add interface indicating if a document is reflowable.Sebastian Rasmussen
2016-07-13Bug 696699: Fix Text extraction mediabox information.Robin Watts
Since the removal of the begin_page device function, structured text extraction has been unable to correctly establish the mediabox for extracted pages. Update the fz_new_stext_page call to take this mediabox information. This is an API change, but hopefully most people are calling fz_new_stext_page_from_page or fz_new_stext_page_from_display_list which are updated here to cope. Update all the apps/tools to behave properly.
2016-07-13Fix Memento builds; static references were upsetting refcounting.Robin Watts
2016-07-12Fix typo in comment.Robin Watts
2016-07-09Add documentation for exposed LAB function.Sebastian Rasmussen
2016-07-08Avoid warnings in non-Memento builds.Robin Watts
2016-07-08Use fz_keep_imp and fz_drop_imp for all reference counting.Tor Andersson
2016-07-08git stripspaceTor Andersson
2016-07-08Separate close and drop functionality for devices and writers.Tor Andersson
Closing a device or writer may throw exceptions, but many of the foreign language bindings (JNI and JS) depend on drop never throwing an exception (exceptions in finalizers are bad).
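The split between a fallible close and an infallible drop can be illustrated with a toy writer. MuPDF reports errors through its exception mechanism; this sketch substitutes plain return codes, and all names are hypothetical.

```c
#include <stdlib.h>

typedef struct {
    int closed;
    int fail_on_flush;   /* simulates an I/O error during close */
} toy_writer;

static toy_writer *toy_new_writer(int fail_on_flush)
{
    toy_writer *w = calloc(1, sizeof *w);
    if (w)
        w->fail_on_flush = fail_on_flush;
    return w;
}

/* Close finishes the output and may fail: 0 on success, -1 on error. */
static int toy_close_writer(toy_writer *w)
{
    if (w->closed)
        return 0;
    if (w->fail_on_flush)
        return -1;
    w->closed = 1;
    return 0;
}

/* Drop only releases memory and cannot fail, so it is safe to call from
 * a finalizer. Output that was never closed is simply discarded. */
static void toy_drop_writer(toy_writer *w)
{
    free(w);
}
```

A caller that cares about errors calls close explicitly and checks the result; a finalizer only ever calls drop.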
2016-07-08Slim pdf_xobject: remove cached colorspace field.Tor Andersson
2016-07-08Slim pdf_xobject: remove cached document field.Tor Andersson
2016-07-08Slim pdf_xobject: remove cached transparency/isolated/knockout fields.Tor Andersson
2016-07-08Slim pdf_xobject struct: remove cached matrix field.Tor Andersson
2016-07-08Slim pdf_xobject struct: remove cached bbox field.Tor Andersson
2016-07-08Slim pdf_xobject struct: remove cached resources field.Tor Andersson
The "contents" field is the same as the "obj" field, so can also be removed.
2016-07-08Slim pdf_xobject struct: Rename me to obj.Tor Andersson
2016-07-08Slim pdf_annot struct: remove cached page_ctm field.Tor Andersson
2016-07-08Slim pdf_annot struct: remove cached inv_page_ctm field.Tor Andersson
2016-07-08Slim pdf_annot struct: remove cached annot_type and widget_type fields.Tor Andersson
2016-07-08Slim pdf_annot struct: remove cached matrix field.Tor Andersson
2016-07-08Slim pdf_annot struct: remove cached rect field.Tor Andersson
2016-07-08Slim pdf_annot struct: remove cached pagerect field.Tor Andersson
2016-07-06Start slimming pdf_page.Tor Andersson
We want to turn pdf_page into a thin wrapper around a pdf_obj, so that any updates to the underlying PDF objects will be reflected without having to reload the pdf_page.
2016-07-06Add fitz to pdf downcasting functions for pages and annotations.Tor Andersson
2016-07-06Add annotations to murun.Tor Andersson
2016-07-06Fix garbage collection and page grafting for indirect reference chains.Tor Andersson
The mark & sweep pass of garbage collection, and resolving indirect objects when grafting objects, were following the full chain of indirect references. In the unusual case where a numbered object is itself only an indirect reference to another object, this intermediate numbered object would be missed both when marking for garbage collection, and when copying objects for grafting. Add a function to resolve only one step for these two uses. The following is an example of a file that would break during garbage collection if we follow full indirect reference chains:
    %PDF-1.3
    1 0 obj
    <</Type/Catalog /Foo[2 0 R 3 0 R]>>
    endobj
    2 0 obj
    4 0 R
    endobj
    3 0 obj
    5 0 R
    endobj
    4 0 obj
    <</Length 1>>
    stream
    A
    endstream
    endobj
    5 0 obj
    <</Length 1>>
    stream
    B
    endstream
    endobj
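The difference between following the full chain and resolving one step at a time can be shown on a toy object table shaped like the example file. This is a sketch with hypothetical names, not the pdf garbage collector; it assumes object numbers stay below TOY_MAX.

```c
#define TOY_MAX 8

/* ref_to[n] != 0 means object n consists solely of an indirect
 * reference to object ref_to[n]. */
typedef struct {
    int ref_to[TOY_MAX];
    int marked[TOY_MAX];
} toy_xref;

/* Old behaviour: jump straight to the end of the chain.
 * Assumes the chain has no cycles. */
static int toy_resolve_full(const toy_xref *x, int num)
{
    while (x->ref_to[num] != 0)
        num = x->ref_to[num];
    return num;
}

static void toy_mark_full(toy_xref *x, int num)
{
    /* Only the final object is marked; intermediates are skipped. */
    x->marked[toy_resolve_full(x, num)] = 1;
}

/* New behaviour: mark each object, then step to the next one. The
 * marked check also stops the walk if the chain loops back on itself. */
static void toy_mark_one_step(toy_xref *x, int num)
{
    while (num != 0 && !x->marked[num]) {
        x->marked[num] = 1;
        num = x->ref_to[num];
    }
}
```

With objects 2 and 3 pointing at 4 and 5, the full-chain version leaves 2 and 3 unmarked, so a sweep would discard them even though the catalog still refers to them.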
2016-07-06pdf: Drop generation number from public interfaces.Tor Andersson
The generation number is only needed for decryption, and is assumed to be zero or irrelevant for all other uses. Store the original object number and generation in the xref slot, so that we can decrypt them even when the objects have been renumbered, without needing to pass the original object number around through the stream loading APIs.
2016-07-06pdf: Flatten inheritable page properties when copying pages.Tor Andersson
Affects pdfclean, pdfmerge, and pdfposter.
2016-07-06Add support for decoding pbm/pgm/ppm/pam images.Sebastian Rasmussen