mupdf - MuPDF PDF reader and library

Age	Commit message	Author
2016-09-19	fz_store: Reap passes.	Robin Watts
	A few commits back, we introduced the fz_key_storable concept to allow us to cope with objects that were used both as values within the store and as parts of keys within the store. That commit worked, but showed up performance problems; when the store has several million PDF objects in it, bulk changes (such as dropping a display list or document) could trigger many passes across the store. We therefore introduce a mechanism to ameliorate this. These passes, now known as "reap passes", can be batched together using fz_defer_reap_start and fz_defer_reap_end. We trigger this start/end around display list dropping, and around PDF content stream processing. This should be fine, as deferral will be interrupted if we ever run out of memory during mallocing.
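A minimal sketch of how batching reap passes might work, assuming a toy store with a nesting counter and a pending flag. Only fz_defer_reap_start and fz_defer_reap_end are named in the commit message; every toy_* name below is hypothetical and does not reflect MuPDF's actual internals. The out-of-memory interruption mentioned above is not modelled.

```c
#include <assert.h>

/* Toy model only: all names here are hypothetical, not MuPDF's. */
typedef struct {
    int defer_depth;   /* nesting depth of start/end pairs */
    int reap_pending;  /* set when a reap was requested while deferred */
    int passes_run;    /* number of full passes made over the store */
} toy_store;

/* Stand-in for one full (expensive) pass across the store. */
static void toy_reap_pass(toy_store *s)
{
    s->reap_pending = 0;
    s->passes_run++;
}

/* Called whenever a drop makes a reap pass necessary. */
static void toy_request_reap(toy_store *s)
{
    if (s->defer_depth > 0)
        s->reap_pending = 1;   /* remember it, but do not scan yet */
    else
        toy_reap_pass(s);
}

static void toy_defer_reap_start(toy_store *s)
{
    s->defer_depth++;
}

static void toy_defer_reap_end(toy_store *s)
{
    assert(s->defer_depth > 0);
    /* Only the outermost end triggers the single batched pass. */
    if (--s->defer_depth == 0 && s->reap_pending)
        toy_reap_pass(s);
}
```

In this model, three reap requests made between a start/end pair cost one pass at the matching end rather than three.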
2016-09-16	Extend store to cope with references used in keys.	Robin Watts
	The store is effectively a list of items, where each item is a (key, value) pair. The design is such that we can easily get into the state where the only reference to a value is that held by the store. Subsequent references can then be generated by things being 'found' from within the store. While the only reference to an object is that held by it being a value in the store, the store is free to evict it to save memory. Images present a complication to this design; images are stored both as values within the store (by the pdf agent, so that we do not regenerate images each time we meet them in the file), and as parts of the keys within the store. For example, once an image is decoded to give a pixmap, the pixmap is cached in the store. The key to look that pixmap up again includes a reference to the image from which the pixmap was generated. This means that for document handlers such as gproof that do not place images in the store, we can end up with images that are kept around purely by dint of being used as references in store keys. There is no chance of the value (the decoded pixmap) ever being 'found' from the store as no one other than the key is holding a reference to the image required. Thus the images/pixmaps are never freed until the store is emptied. This commit offers a fix for this situation. Standard store items are based on an fz_storable type. Here we introduce a new fz_key_storable type derived from that. As well as keeping track of the number of references a given item has to it, it keeps a separate count of the number of references a given item has to it from keys in the store. On dropping a reference, we check to see if the number of references has become the same as the number of references from keys in the store. If it has, then we know that these keys can never be 'found' again. So we filter them out of the store, which drops the items.
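The two-counter check described above can be sketched as follows. This is an illustration under assumptions, not the real fz_key_storable: the struct layout and the toy_* names are hypothetical, and the case where both counts reach zero is left to ordinary freeing, which the sketch does not model.

```c
#include <assert.h>

/* Hypothetical stand-in for a key-storable object. */
typedef struct {
    int refs;            /* all references to the item */
    int store_key_refs;  /* the subset held by keys in the store */
} toy_key_storable;

/*
 * Drop one ordinary (non-key) reference. Returns 1 when the only
 * references left are those held by store keys, meaning those keys
 * can never be 'found' again and should be filtered out of the store.
 */
static int toy_drop_key_storable(toy_key_storable *ks)
{
    assert(ks->refs > ks->store_key_refs);
    ks->refs--;
    return ks->refs > 0 && ks->refs == ks->store_key_refs;
}
```

For example, an image with three references, two of them from store keys, needs reaping as soon as its one outside reference is dropped.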
2016-07-12	Fix typo in fz_new_image_from_pixmap.	Tor Andersson

2016-07-08	Separate close and drop functionality for devices and writers.	Tor Andersson
	Closing a device or writer may throw exceptions, but many of the foreign language bindings (JNI and JS) depend on drop never throwing an exception (exceptions in finalizers are bad).
2016-07-06	Add support for decoding pbm/pgm/ppm/pam images.	Sebastian Rasmussen

2016-07-05	Support J2K/JP2 files in CBZ.	Sebastian Rasmussen

2016-06-23	Support TIFF files in CBZ.	Tor Andersson

2016-06-17	Add device space transform state to draw device.	Tor Andersson
	Allows us to remove the out parameter 'transform' from fz_begin_page.
2016-06-17	Use 'size_t' instead of int as appropriate.	Robin Watts
	This silences the many warnings we get when building for x64 on Windows. This does not address any of the warnings we get in third-party libraries - in particular harfbuzz. These look (at a quick glance) harmless though.
2016-06-14	Fix typos in various parts of the code.	Sebastian Rasmussen

2016-06-07	Fix subarea image calculations	Robin Watts
	Calculations that involved non power of 2 bpps were going wrong.
2016-05-29	Tweak plotter code slightly for speed.	Robin Watts
	Use do {} while(--w) rather than while(w--) {} as this saves a test each time around the loop.
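The two loop shapes can be compared side by side. This is a generic illustration, not the plotter code itself; the function names are hypothetical. Note that the do/while form is only valid when w is known to be at least 1 on entry, which the sketch checks with an assert.

```c
#include <assert.h>

/* while (w--) form: tests w before every iteration, including the first. */
static void toy_fill_while(unsigned char *dp, int w, unsigned char v)
{
    while (w--)
        *dp++ = v;
}

/* do {} while (--w) form: no test before the first iteration,
 * so the caller must guarantee w >= 1. */
static void toy_fill_do_while(unsigned char *dp, int w, unsigned char v)
{
    assert(w > 0);
    do
        *dp++ = v;
    while (--w);
}
```

Both functions write exactly w bytes when w is positive; only the first also handles w == 0.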
2016-05-26	Avoid unnecessary alphas when decompressing images from streams.	Robin Watts

2016-05-24	fz_pixmap revamp: add stride and make alpha optional	Robin Watts
	fz_pixmaps now have an explicit stride value. By default no change from before, but all code now copes with extra gaps at the end of the line. The alpha data in fz_pixmaps is no longer compulsory. mudraw: use rgb not rgba (ppmraw), cmyk not cmyka (pkmraw). Update halftone code to not expect alpha plane. Update PNG writing to cope with alpha-less input. Also hide repeated params within the png output context. ARM code needs updating.
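A rough sketch of what an explicit stride means for sample addressing, assuming a simplified pixmap with width, height, components per pixel and a byte stride. The toy_pixmap struct and its field names are hypothetical and are not the real fz_pixmap layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simplified pixmap: one byte per sample. */
typedef struct {
    int w, h;          /* size in pixels */
    int n;             /* samples per pixel (alpha included only if present) */
    ptrdiff_t stride;  /* bytes from the start of one row to the next */
} toy_pixmap;

/* Unused bytes at the end of each row. */
static ptrdiff_t toy_row_gap(const toy_pixmap *p)
{
    assert(p->stride >= (ptrdiff_t)p->w * p->n);
    return p->stride - (ptrdiff_t)p->w * p->n;
}

/* Byte offset of sample k of pixel (x, y): rows advance by stride,
 * not by w * n, so any gap at the end of a row is skipped. */
static ptrdiff_t toy_sample_offset(const toy_pixmap *p, int x, int y, int k)
{
    assert(x >= 0 && x < p->w && y >= 0 && y < p->h && k >= 0 && k < p->n);
    return (ptrdiff_t)y * p->stride + (ptrdiff_t)x * p->n + k;
}
```

A 4-pixel-wide RGB row with no alpha uses 12 bytes; with a stride of 16, each row carries a 4-byte gap that code walking the samples has to step over.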
2016-05-20	Add images based on display lists.	Tor Andersson

2016-05-20	Fix typo.	Robin Watts

2016-05-09	Bug 696759: Fix pdf image subregion decode.	Robin Watts
	When decoding < 8 bpp images, we need to allow for the fact that the data is byte aligned at the end of each row by being careful in our calculation of r_skip.
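The byte alignment at the end of each row can be illustrated with a little arithmetic. This sketch only shows how row length and padding bits follow from width, component count and bits per component; it does not reproduce the actual r_skip calculation, and the function names are hypothetical.

```c
#include <assert.h>

/* Bits of real sample data in one row. */
static int toy_row_bits(int w, int n, int bpc)
{
    assert(w > 0 && n > 0 && bpc > 0);
    return w * n * bpc;
}

/* Rows are padded up to a whole number of bytes. */
static int toy_row_bytes(int w, int n, int bpc)
{
    return (toy_row_bits(w, n, bpc) + 7) / 8;
}

/* Padding bits at the end of each row that carry no sample data. */
static int toy_row_pad_bits(int w, int n, int bpc)
{
    return toy_row_bytes(w, n, bpc) * 8 - toy_row_bits(w, n, bpc);
}
```

A 10-pixel-wide, 1 bpp, single-component row occupies 2 bytes, of which the last 6 bits are padding; a subregion decoder that treats rows as one continuous bit stream would drift by those 6 bits on every row.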
2016-04-28	Fix JPX breakage caused during refactor.	Robin Watts
	I was using fz_compressed_image when I should have been using fz_pixmap_image.
2016-04-28	Refactor fz_image code cases.	Robin Watts
	Split compressed images (images based on a compressed buffer) and pixmap images (images based on a pixmap) out into separate subclasses.
2016-04-28	Tweak fz_image in preparation for things to come.	Robin Watts
	Move from ints to bits where possible.
2016-04-28	Introduce tuning context.	Robin Watts
	For now, just use it for controlling image decoding and image scaling.
2016-04-28	Partial image decode.	Robin Watts
	Update the core fz_get_pixmap_from_image code to allow fetching a subarea of a pixmap. We pass in the required subarea, together with the transformation matrix for the whole image. On return, we have a pixmap at least as big as was requested, and the transformation matrix is updated to map the supplied area to the correct place on the screen. The draw device is updated to use this as required. Everywhere else passes NULLs in, and so gets unchanged behaviour. The standard 'get_pixmap' function has been updated to decode just the required areas of the bitmaps. This means that banded rendering of pages will decode just the image subareas that are required for each band, limiting the memory use. The downside to this is that each band will redecode the image again to extract just the section we want. The image subareas are put into the fz_store in the same way as full images. Currently image areas in the store are only matched when they match exactly; subareas are not identified as being able to use existing images.
2016-03-23	Add support for BMP images.	Sebastian Rasmussen

2016-03-23	Clamp too large image resolution values.	Tor Andersson

2016-03-21	Bug 696668: Update the downscaling logic.	Robin Watts
	An l2factor of 3 is equivalent to downscaling by a factor of 8. We can get an l2factor of 3 downscale out of the jpeglib. We can reasonably downscale by a further l2factor of 3 manually. Any more than that and we start to completely drop pixels without them having any effect. Therefore it's pointless us keeping any tiles around with l2factors > 6. Fix the bug (which was that we were using < instead of <=) and update the value to a more reasonable one anyway.
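The arithmetic in this message can be sketched as follows: an l2factor of k means downscaling by 2^k, up to 3 of that can come from the JPEG decoder, and up to 3 more can be done manually, giving a cap of 6. The constants come from the message; the function and type names are hypothetical, and how the real code splits the work is an assumption.

```c
#include <assert.h>

enum {
    TOY_MAX_DECODER_L2 = 3,  /* what the message says jpeglib can provide */
    TOY_MAX_TOTAL_L2 = 6     /* beyond this, pixels are dropped outright */
};

typedef struct {
    int decoder_l2;  /* log2 downscale done inside the decoder */
    int manual_l2;   /* further log2 downscale done by subsampling */
} toy_l2_split;

/* An l2factor of k is a linear downscale by 2^k. */
static int toy_downscale_factor(int l2factor)
{
    assert(l2factor >= 0 && l2factor <= TOY_MAX_TOTAL_L2);
    return 1 << l2factor;
}

/* Clamp the wanted l2factor to the useful range and split it. */
static toy_l2_split toy_split_l2factor(int wanted)
{
    toy_l2_split s;
    if (wanted < 0)
        wanted = 0;
    if (wanted > TOY_MAX_TOTAL_L2)
        wanted = TOY_MAX_TOTAL_L2;
    s.decoder_l2 = wanted < TOY_MAX_DECODER_L2 ? wanted : TOY_MAX_DECODER_L2;
    s.manual_l2 = wanted - s.decoder_l2;
    return s;
}
```

Under these assumptions a request for an l2factor of 9 is clamped to 6, split as 3 in the decoder and 3 manually, for a total downscale of 64.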
2016-02-24	Add fz_new_image_from_file.	Tor Andersson

2016-02-22	Drop const from fz_image.	Tor Andersson
	Image objects are immutable and opaque once constructed. Therefore there is no need for the const keyword.
2016-01-13	Add lots of consts.	Robin Watts
	In general, we should use 'const fz_blah' in device calls whenever the callee should not alter the fz_blah. Push this through. This shows up various places where we fz_keep and fz_drop these const things. I've updated the fz_keep and fz_drops with appropriate casts to remove the consts. We may need to do the union dance to avoid the consts for some compilers, but will only do that if required. I think this is nicer overall, even allowing for the const<->no const problems.
2015-12-28	Rename fz_image_get_pixmap to fz_get_pixmap_from_image.	Tor Andersson

2015-12-18	Rename fz_image_get_sanitised_res to fz_image_resolution.	Tor Andersson

2015-12-11	Use fz_output instead of FILE* for most of our output needs.	Tor Andersson
	Use fz_output in debug printing functions. Use fz_output in pdfshow. Use fz_output in fz_trace_device instead of stdout. Use fz_output in pdf-write.c. Rename fz_new_output_to_filename to fz_new_output_with_path. Add seek and tell to fz_output. Remove unused functions like fz_fprintf. Fix typo in pdf_print_obj.
2015-09-01	Default to invert_cmyk_jpeg for all formats other than PDF.	Tor Andersson

2015-07-29	Add support for parsing GIF images.	Sebastian Rasmussen

2015-07-20	Enable fz_images to have NULL buffers, and still be decoded.	Robin Watts
	Important for gproof files.
2015-06-29	Further tweaks to fz_image handling.	Robin Watts
	Ensure that subsampling and caching happen in the generic image code, not in the specific. Previously, the subsampling happened only for images that were decoded from streams. Images that were loaded direct were never subsampled and hence were always cached at full size. After this change both classes of image are correctly subsampled, and the subsampled version kept in the cache. This produces various image diffs in the cluster, none of which are noticeable to the naked eye.
2015-06-29	Rejig the internals of fz_image slightly.	Robin Watts
	Previously, we had people calling image->get_pixmap directly. Now we have them all call fz_image_get_pixmap, which will look for a cached version in the store, and only call get_pixmap if required. Previously fz_image_get_pixmap used to look for the cached version in the store, and decode if not - hence the decoding code is now extracted out into standard_image_get_pixmap. This was the original intent of the code, it just somehow didn't end up like that. This nicely sets us up for being able to have fz_images that use a different get_pixmap implementation, such as that which will be required for the gprf code.
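The split between a generic caching entry point and a per-image decode callback might look roughly like this. It is a toy model under assumptions: the real code looks pixmaps up in the fz_store and returns an fz_pixmap, whereas this sketch caches a single int inside the image; all toy_* names are hypothetical.

```c
#include <assert.h>

typedef struct toy_image toy_image;

struct toy_image {
    int (*get_pixmap)(toy_image *img);  /* per-image decode callback */
    int cached_valid;                   /* stand-in for a store lookup hit */
    int cached;                         /* stand-in for the cached pixmap */
    int decode_calls;                   /* counts how often we really decoded */
};

/* Default decoder, analogous in role to standard_image_get_pixmap. */
static int toy_standard_get_pixmap(toy_image *img)
{
    img->decode_calls++;
    return 42;  /* placeholder for decoded pixel data */
}

/* Generic entry point: consult the cache first, decode only on a miss. */
static int toy_image_get_pixmap(toy_image *img)
{
    assert(img->get_pixmap != 0);
    if (!img->cached_valid) {
        img->cached = img->get_pixmap(img);
        img->cached_valid = 1;
    }
    return img->cached;
}
```

Because callers only ever go through the generic function, an image type with a different callback gets the caching behaviour without any extra code.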
2015-02-17	Add helper functions to keep/drop reference counts with locking.	Tor Andersson
	Add locks around fz_path and fz_text reference counting.
2015-02-17	Add ctx parameter and remove embedded contexts for API regularity.	Tor Andersson
	Purge several embedded contexts: Remove embedded context in fz_output. Remove embedded context in fz_stream. Remove embedded context in fz_device. Remove fz_rebind_stream (since it is no longer necessary). Remove embedded context in svg_device. Remove embedded context in XML parser. Add ctx argument to fz_document functions. Remove embedded context in fz_document. Remove embedded context in pdf_document. Remove embedded context in pdf_obj. Make fz_page independent of fz_document in the interface. We shouldn't need to pass the document to all functions handling a page. If a page is tied to the source document, it's redundant; otherwise it's just pointless. Fix reference counting oddity in fz_new_image_from_pixmap.
2015-02-17	Rename fz_close_* and fz_free_* to fz_drop_*.	Tor Andersson
	Rename fz_close to fz_drop_stream. Rename fz_close_archive to fz_drop_archive. Rename fz_close_output to fz_drop_output. Rename fz_free_* to fz_drop_*. Rename pdf_free_* to pdf_drop_*. Rename xps_free_* to xps_drop_*.
2015-02-10	Attempting to render a JPEG with xres and yres set to 1 fails.	Robin Watts
	We end up trying to scale the JPEG up 72 times and fail a malloc. A better plan is to make the image handler disbelieve any xres or yres values less than 72dpi. We take care to still preserve aspect ratios etc.
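One way to disbelieve resolutions below 72dpi while keeping the aspect ratio is to scale both values so that the smaller one becomes 72. The sketch below assumes strictly positive inputs and integer resolutions; the function name and the exact rounding are hypothetical and may differ from what the image handler really does.

```c
#include <assert.h>

enum { TOY_SANE_DPI = 72 };

/* Raise implausibly low resolutions to at least TOY_SANE_DPI,
 * scaling both axes by the same ratio (integer division rounds down). */
static void toy_sanitise_res(int *xres, int *yres)
{
    assert(*xres > 0 && *yres > 0);
    if (*xres >= TOY_SANE_DPI && *yres >= TOY_SANE_DPI)
        return;  /* both believable: leave untouched */
    if (*xres <= *yres) {
        *yres = *yres * TOY_SANE_DPI / *xres;
        *xres = TOY_SANE_DPI;
    } else {
        *xres = *xres * TOY_SANE_DPI / *yres;
        *yres = TOY_SANE_DPI;
    }
}
```

With this rule a JPEG claiming 1x1 dpi is treated as 72x72, so it is no longer scaled up 72 times, and one claiming 1x2 dpi becomes 72x144, keeping its 1:2 ratio.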
2014-07-18	fix off-by-one error in fz_unblend_masked_tile	Simon Bünzli
	fz_image::n is used inconsistently: Sometimes it includes the alpha channel and sometimes it doesn't. At the point where fz_unblend_masked_tile is called, it doesn't.
2014-05-27	Fix 693517: Support /SMask/Matte preblended images.	Tor Andersson

2014-05-12	better buffer underflow protection for ba15a8cd3238a3a3c098ad8b7d96cb0e405fc26f	Simon Bünzli

2014-05-07	Bug 695112: only patch height values in JPEG streams	Simon Bünzli
	If the reported height is 0 or too large, use the image size reported in the PDF itself instead (in the case of height 0, the JPEG library is supposed to read the correct value from the DNL segment, but libjpeg doesn't support that).
2014-05-07	Fix 695112: patch JPEG streams with missing dimensions	Tor Andersson
	If a JPEG stream is missing valid values for width/height (usually -1), Adobe Reader substitutes these using the values read from the PDF object. This can be done by scanning and patching the data before passing it to libjpeg. Thanks to zeniko for the patch.
2014-03-18	Fix operator buffering of inline images.	Robin Watts
	Previously pdf_process buffer did not understand inline images. In order to make this work without needlessly duplicating complex code from within pdf-op-run, the parsing of inline images has been moved to happen in pdf-interpret.c. When the op_table entry for BI is called it now expects the inline image to be in csi->img and the dictionary object to be in csi->obj. To make this work, we have had to improve the handling of inline images in general. While non-inline images have been loaded and held in memory in their compressed form and only decoded when required, until now we have always loaded and decoded inline images immediately. This has been due to the difficulty in knowing how many bytes of data to read from the stream - we know the length of the stream once uncompressed, but relating this to the compressed length is hard. To cure this we introduce a new type of filter stream, a 'leecher'. We insert a leecher stream before we build the filters required to decode the image. We then read and discard the appropriate number of uncompressed bytes from the filters. This pulls the compressed data through the leecher stream, which stores it in an fz_buffer. Thus images are now always held in their compressed forms in memory. The pdf-op-run implementation is now trivial. The only real complexity in the pdf-op-buffer implementation is the need to ensure that the /Filter entry in the dictionary object matches the exact point at which we backstopped the decompression.
2014-03-17	Ensure that small images don't subdivide more than they should.	Robin Watts
	Gridfitting can increase the required width/height of images by up to 2 pixels. This makes images that are rendered very small very sensitive to over quantisation. This can produce 'mushier' images than it should, for instance on tests/Ghent_V3.0/090_Font-Support_x3.pdf (pgmraw, 72dpi)
2014-01-17	Avoid overflows in floating point causing illegal accesses	Robin Watts
	If the scale is too large, the calculation to determine the required size of a pixmap can overflow. This can lead to negative width/heights being passed in, which confuses the subsampling code, leading to SEGVs.
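A defensive pattern for this class of bug is to range-check the floating point result before converting it to an integer, since converting an out-of-range float to int is undefined behaviour in C. The helper below is a hypothetical illustration of that idea, not the fix that was actually committed.

```c
#include <assert.h>

/* Compute extent * scale as a pixel count, clamped to [0, max_size].
 * The comparisons are written so that NaN also falls into the 0 branch. */
static int toy_scaled_size(float extent, float scale, int max_size)
{
    float v = extent * scale;

    assert(max_size > 0);
    if (!(v >= 0.0f))
        return 0;                 /* negative or NaN */
    if (v >= (float)max_size)
        return max_size;          /* too large, including +infinity */
    return (int)v;                /* now known to fit in an int */
}
```

Callers then always receive a non-negative size no larger than the limit they passed in, so downstream code such as subsampling never sees a negative width or height.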
2014-01-16	fix memory leaks in pdf_load_jpx and fz_new_image_from_pixmap	Simon Bünzli
	fz_new_image_from_pixmap expects that the pixmap's colorspace has two references which is contrary to expectations. If it instead addrefs the pixmap's colorspace, the only caller pdf_load_jpx can consistently drop the colorspace after passing it to fz_load_jpx. Also, if the contract is that whatever is passed into fz_new_image_from_pixmap belongs to the new image, then the pixmap also has to be dropped on error so that it isn't leaked.
2014-01-06	add stub files for JPEG-XR support	Simon Bünzli
	See SumatraPDF's repo for a Windows-only implementation using WIC.