mupdf - MuPDF PDF reader and library

Age	Commit message	Author
2013-05-30	Add fz_puts to the fz_output bestiary.	Robin Watts

2013-05-30	Check signatures on clicking the corresponding form field	Paul Gardiner

2013-05-30	Add functions to return digital signature info	Paul Gardiner

2013-05-29	Rename some find/lookup functions to be in line with documentation.	Tor Andersson

2013-05-27	Treat multiple whitespace in search strings as single.	Robin Watts
	Skip over successive whitespace in search string. Make android use text_search.c
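As a rough illustration of the whitespace handling described in this commit, the sketch below collapses each run of whitespace in the search string into a single whitespace match. The helper name `match_collapsing_ws` and its return convention are hypothetical and are not taken from text_search.c.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not MuPDF's code. Returns the number of haystack
 * characters consumed if needle matches at the start of hay, or -1. */
int match_collapsing_ws(const char *hay, const char *needle)
{
	const char *h = hay;

	assert(hay != NULL && needle != NULL);
	while (*needle)
	{
		if (isspace((unsigned char)*needle))
		{
			/* Skip the whole run of whitespace in the search string... */
			while (isspace((unsigned char)*needle))
				needle++;
			/* ...and treat it as one whitespace character in the text. */
			if (!isspace((unsigned char)*h))
				return -1;
			h++;
		}
		else
		{
			if (*h != *needle)
				return -1;
			h++;
			needle++;
		}
	}
	return (int)(h - hay);
}
```

With this sketch, the search strings "foo bar" and "foo   bar" both match the text "foo bar".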
2013-05-22	Fix typo in comment.	Tor Andersson

2013-05-21	Add monochrome PWG output routines.	Robin Watts

2013-05-21	Add PWG options structure for writing PWGs.	Robin Watts
	This should (pretty much) give us enough to write a mupdftoraster equivalent of gstoraster.
2013-05-16	Add colorspace context.	Tor Andersson
	To prepare for color management, we have to make the device colorspaces per-context and able to be overridden by users.
2013-05-16	Add PWG raster output to mudraw.	Robin Watts

2013-05-14	svgwrite: First attempt at an SVG output device.	Robin Watts
	No font support (just font names are sent through). No group support. No shading support. No image mask support. Line art, text position/size, bitmaps, clipping all seem to work though.
2013-05-10	Tweak png outputting functions.	Robin Watts
	Allow us to get an image as a png in a buffer.
2013-05-06	Add simple visual-to-logic RTL reordering as a text extraction pass.	Tor Andersson

2013-05-06	Use linked list for text spans.	Tor Andersson

2013-04-30	Move fz_normalize_vector into base_geometry.c	Tor Andersson

2013-04-26	Rename functions for consistency.	Robin Watts
	Rename fz_new_output_buffer to be fz_new_output_with_buffer. Rename fz_new_output_file to be fz_new_output_with_file. This is more consistent with other functions such as fz_new_pixmap_with_data.
2013-04-26	Squash 2 const warnings.	Robin Watts
	Add some more consts and use void *'s where appropriate.
2013-04-26	Hint enabling/disabling for devices.	Robin Watts
	Add configuration functions to control the hints set on a given device. Use this to set whether image data is captured or not in the text extraction process. Also update the display list device to respect the device hints during playback.
2013-04-25	Generalise fz_write_png to fz_output_pixmap_to_png	Robin Watts
	Extract the core of fz_write_png so that it can work to an fz_output * rather than a FILE *. fz_write_png continues to work as before, but now we can output to a buffer too.
2013-04-25	Add fz_write method for output streams.	Robin Watts

2013-04-25	Tweak fz_text_page to include image records.	Robin Watts
	Extract such records as part of the text device.
2013-04-11	Move pdf_image to fz_image.	Robin Watts
	In order to be able to output images (either in the pdfwrite device or in the html conversion), we need to be able to get to the original compressed data stream (or else we're going to end up recompressing images). To do that, we need to expose all of the contents of pdf_image into fz_image, so it makes sense to just amalgamate the two. This has knock on effects for the creation of indexed colorspaces, requiring some of that logic to be moved. Also, we need to make xps use the same structures; this means pushing PNG and TIFF support into the decoding code. Also we need to be able to load just the headers from PNG/TIFF/JPEGs as xps doesn't include dimension/resolution information. Also, separate out all the fz_image stuff into fitz/res_image.c rather than having it in res_pixmap.
2013-03-29	Avoid uncompressing indexed images at load time.	Robin Watts
	This actually turned out to be far easier than I'd feared; remove the explicit check that stopped this working, and ensure that we pass the correct value in for the 'indexed' param. Add a function to check for colorspaces being indexed. Bit nasty that this requires a strcmp...
2013-03-26	Spot indents.	Robin Watts

2013-03-26	Text region analysis.	Robin Watts
	Update fz_text_analysis function to look for 'regions'; use this to spot columns etc. Spot columns/width/alignment info. "Intelligently" merge lines based on this. Update html output to make use of this extra information.
2013-03-26	Rework text extraction structures.	Robin Watts
	Rework the text extraction structures - the broad strokes are similar but we now hold more information at each stage to enable us to perform more detailed analysis on the structure of the page. We now hold: fz_text_char's (the position, ucs value, and style of each char). fz_text_span's (sets of chars that share the same baseline/transform, with no more than an expected amount of whitespace between each char). fz_text_line's (sets of spans that share the same baseline (more or less, allowing for super/subscript), but possibly with a larger than expected amount of whitespace). fz_text_block's (sets of lines that follow one another). After fz_text_analysis is called, we hope to have fz_text_blocks split such that each block is a paragraph. This new implementation has the same restrictions as the current implementation it replaces, namely that chars are only considered for addition onto the most recent span at the moment, but this revised form is designed to allow easier extension, and for this restriction to be lifted. Also add simple paragraph splitting based on finding the most common 'line distance' in blocks. When we add spans together to collate them into lines, we record the 'horizontal' and 'vertical' spacing between them. (Not actually horizontal or vertical, so much as 'in the direction of writing' and 'perpendicular to the direction of writing'). The 'horizontal' value enables us to more correctly output spaces when converting to (say) html later. The 'vertical' value enables us to spot subscripts and superscripts etc, as well as small changes in the baseline due to style changes. We are careful to base the baseline comparison on the baseline for the line, not the baseline for the previous span, as otherwise superscripts/subscripts on the end of the line affect what we match next. Also, we are less tolerant of vertical shifts after a large gap. This avoids false positives where different columns just happen to almost line up.
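The paragraph splitting based on the most common 'line distance' might look roughly like the sketch below. It assumes integer baseline positions and a break threshold of 1.5 times the most common gap. Both of these, along with the function names, are assumptions made for illustration and are not taken from the commit.

```c
#include <assert.h>

/* Hypothetical: return the most frequent gap between consecutive baselines. */
int most_common_gap(const int *y, int n)
{
	int best = 0, best_count = 0, i, j;

	for (i = 1; i < n; i++)
	{
		int gap = y[i] - y[i - 1];
		int count = 0;
		for (j = 1; j < n; j++)
			if (y[j] - y[j - 1] == gap)
				count++;
		if (count > best_count)
		{
			best = gap;
			best_count = count;
		}
	}
	return best;
}

/* Hypothetical: flag each line that starts a paragraph and return the
 * number of paragraphs. A line starts a paragraph if it is the first
 * line, or if the gap above it exceeds 1.5x the most common gap
 * (the 1.5 factor is an assumed threshold). */
int split_paragraphs(const int *y, int n, int *starts_paragraph)
{
	int mode, paragraphs = 0, i;

	assert(y != 0 && starts_paragraph != 0);
	mode = most_common_gap(y, n);
	for (i = 0; i < n; i++)
	{
		starts_paragraph[i] = (i == 0) || (2 * (y[i] - y[i - 1]) > 3 * mode);
		paragraphs += starts_paragraph[i];
	}
	return paragraphs;
}
```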
2013-03-21	Add 'void' to a function declaration.	Robin Watts

2013-03-20	Add caching of rendered tiles.	Robin Watts
	This requires a slight change to the device interface. Callers that use fz_begin_tile will see no change (and no caching will be done). We add a new fz_begin_tile_id function that takes an extra 'id' parameter, and returns 0 or 1. If the id is 0 then the function behaves exactly as fz_begin_tile does, and always returns 0. The PDF and XPS code continues to call the old (uncached) version. The display list code however generates a unique id for every BEGIN_TILE node, and passes this in. If the id is non zero, then it is taken to be a unique identifier for this tile; the implementer of the fz_begin_tile_id entry point can choose to use this to implement caching. If it chooses to ignore the id (and do no caching), it returns 0. If the device implements caching, then it can check on entry for a previously rendered tile with the appropriate matrix and a matching id. If it finds one, then it returns 1. It is the caller's responsibility to then skip over all the device calls that would usually happen to render the tiles (i.e. to skip forward to the matching 'END_TILE' operation).
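The caller/device contract described here can be sketched as follows. All types and names (`tile_device`, `begin_tile_id`, `run_tile`) are hypothetical stand-ins rather than the MuPDF device interface, and for brevity the cache entry is recorded at the start of the tile, whereas a real device would presumably store the rendered tile once it is complete.

```c
#include <assert.h>

typedef struct { float a, b, c, d, e, f; } matrix;

#define CACHE_SLOTS 8

/* Hypothetical caching device: remembers (id, matrix) pairs already drawn. */
typedef struct
{
	int id[CACHE_SLOTS];
	matrix ctm[CACHE_SLOTS];
	int used;
	int renders; /* how many times tile contents were actually drawn */
} tile_device;

static int same_matrix(const matrix *x, const matrix *y)
{
	return x->a == y->a && x->b == y->b && x->c == y->c &&
		x->d == y->d && x->e == y->e && x->f == y->f;
}

/* Returns 1 if a cached tile can be reused, 0 if it must be rendered. */
int begin_tile_id(tile_device *dev, const matrix *ctm, int id)
{
	int i;

	assert(dev != 0 && ctm != 0);
	if (id == 0)
		return 0; /* id 0 means "no caching", as in the old entry point */
	for (i = 0; i < dev->used; i++)
		if (dev->id[i] == id && same_matrix(&dev->ctm[i], ctm))
			return 1;
	if (dev->used < CACHE_SLOTS)
	{
		dev->id[dev->used] = id;
		dev->ctm[dev->used] = *ctm;
		dev->used++;
	}
	return 0;
}

/* Caller side: on a cache hit, skip everything up to the matching END_TILE. */
void run_tile(tile_device *dev, const matrix *ctm, int id)
{
	if (begin_tile_id(dev, ctm, id))
		return;
	dev->renders++; /* stands in for the device calls that draw the tile */
}
```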
2013-03-20	Add noreturn attribute to throw/rethrow to help improve compiler warnings.	Tor Andersson

2013-03-01	Bug 693624: Ensure that windows copes with utf8 filenames	Robin Watts
	When running under Windows, replace fopen with our own fopen_utf8 that converts from utf8 to unicode before calling the unicode version of fopen.
2013-02-28	Pass bbox to pdf_set_annot_appearance rather than base on display list	Paul Gardiner
	Use of the bbox device to derive the area of the display list can lead to bad results because of heuristics used to handle corners of stroked paths.
2013-02-22	Add fz_get_annot_type	Paul Gardiner

2013-02-20	Bug 693639: bring fitz.h in line with source use of restrict keyword.	Tor Andersson
	Thanks to zeniko.
2013-02-19	Fix whitespace.	Tor Andersson

2013-02-06	Rename bbox to irect.	Tor Andersson

2013-02-06	Add some 'restrict' qualifiers to hopefully speed matrix ops.	Robin Watts
	Also, move fz_is_infinite_rect and fz_is_empty_rect to be a static inline rather than a macro. (Static inlines are preferred over macros by at least one customer). We appear to be calling them with bboxes too, so add fz_is_infinite_bbox and fz_is_empty_bbox to solve this.
2013-02-06	Change to pass structures by reference rather than value.	Robin Watts
	This is faster on ARM in particular. The primary changes involve fz_matrix, fz_rect and fz_bbox. Rather than passing 'fz_rect r' into a function, we now consistently pass 'const fz_rect *r'. Where a rect is passed in and modified, we miss the 'const' off. Where possible, we return the pointer to the modified structure to allow 'chaining' of expressions. The basic upshot of this work is that we do far fewer copies of rectangle/matrix structures, and all the copies we do are explicit. This has opened the way to other optimisations, also performed in this commit. Rather than using expressions like: fz_concat(fz_scale(sx, sy), fz_translate(tx, ty)) we now have fz_pre_{scale,translate,rotate} functions. These can be implemented much more efficiently than doing the fully fledged matrix multiplication that fz_concat requires. We add fz_rect_{min,max} functions to return pointers to the min/max points of a rect. These can be used in transformations to directly manipulate values. With a little casting in the path transformation code we can avoid more needless copying. We rename fz_widget_bbox to the more consistent fz_bound_widget.
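To illustrate why a dedicated pre-multiply can be cheaper than a general concatenation, here is a minimal sketch using a hypothetical `mat` type. The function names and the row-vector convention (apply `one`, then `two`) are assumptions for the example and may not match the fitz signatures exactly.

```c
#include <assert.h>

typedef struct { float a, b, c, d, e, f; } mat;

/* General product: dst = one * two. Safe when dst aliases an input. */
mat *mat_concat(mat *dst, const mat *one, const mat *two)
{
	mat r;

	assert(dst != 0 && one != 0 && two != 0);
	r.a = one->a * two->a + one->b * two->c;
	r.b = one->a * two->b + one->b * two->d;
	r.c = one->c * two->a + one->d * two->c;
	r.d = one->c * two->b + one->d * two->d;
	r.e = one->e * two->a + one->f * two->c + two->e;
	r.f = one->e * two->b + one->f * two->d + two->f;
	*dst = r;
	return dst;
}

/* m = scale(sx, sy) * m, done in place with four multiplies. */
mat *mat_pre_scale(mat *m, float sx, float sy)
{
	m->a *= sx;
	m->b *= sx;
	m->c *= sy;
	m->d *= sy;
	return m;
}

/* m = translate(tx, ty) * m, done in place; only e and f change. */
mat *mat_pre_translate(mat *m, float tx, float ty)
{
	m->e += tx * m->a + ty * m->c;
	m->f += tx * m->b + ty * m->d;
	return m;
}
```

Because each function returns its first argument, calls can be chained, for example `mat_pre_scale(mat_pre_translate(&m, 1, 2), 4, 5)`, without copying any structure.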
2013-02-04	Add fz_output, and make output functions use it.	Robin Watts
	Various functions in the code output to FILE *, when there are times we'd like them to output to other things, such as fz_buffers. Add an fz_output type, together with fz_printf to allow things to output to this.
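The idea of an output abstraction that can target either a FILE * or a memory buffer can be sketched with a function pointer. The `output` and `membuf` types and the `out_printf` helper below are hypothetical illustrations and are not the fz_output or fz_printf definitions.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

typedef struct output output;
struct output
{
	void *opaque; /* a FILE * or a membuf *, depending on the writer */
	int (*write)(output *out, const char *data, size_t len);
};

/* Hypothetical fixed-size buffer, kept NUL-terminated. */
typedef struct { char data[256]; size_t len; } membuf;

int write_to_file(output *out, const char *data, size_t len)
{
	return fwrite(data, 1, len, (FILE *)out->opaque) == len ? 0 : -1;
}

int write_to_membuf(output *out, const char *data, size_t len)
{
	membuf *b = out->opaque;

	if (len > sizeof b->data - 1 - b->len)
		return -1; /* would not fit */
	memcpy(b->data + b->len, data, len);
	b->len += len;
	b->data[b->len] = 0;
	return 0;
}

/* printf-style formatting that goes to whatever the output wraps. */
int out_printf(output *out, const char *fmt, ...)
{
	char tmp[256];
	va_list ap;
	int n;

	assert(out != 0 && fmt != 0);
	va_start(ap, fmt);
	n = vsnprintf(tmp, sizeof tmp, fmt, ap);
	va_end(ap);
	if (n < 0 || (size_t)n >= sizeof tmp)
		return -1;
	return out->write(out, tmp, (size_t)n);
}
```

Code that formats through `out_printf` does not need to know whether it is writing to a file or to memory; only the `write` callback differs.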
2013-01-31	Add support for annotation creation	Paul Gardiner

2013-01-30	Rename fz_irect back to fz_bbox.	Tor Andersson

2013-01-30	Always pass value structs (rect, matrix, etc) as values not by pointer.	Tor Andersson

2013-01-30	Rename fz_rect_covering_rect to fz_irect_from_rect.	Tor Andersson
	It used to be called fz_bbox_covering_rect. It does exact rounding outwards of a rect, so that the resulting irect will always cover the entire area of the input rect. Use fz_round_rect for fuzzy rounding where near-integer values are rounded inwards.
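The difference between exact outward rounding and fuzzy rounding can be sketched as below. The types, function names, and the 0.001 tolerance are assumptions for illustration; the integer floor/ceil helpers assume values that fit in an int.

```c
#include <assert.h>

typedef struct { float x0, y0, x1, y1; } rect;
typedef struct { int x0, y0, x1, y1; } irect;

/* Floor and ceiling without libm; valid only for values within int range. */
static int floor_int(float x)
{
	int i = (int)x;
	return ((float)i > x) ? i - 1 : i;
}

static int ceil_int(float x)
{
	int i = (int)x;
	return ((float)i < x) ? i + 1 : i;
}

/* Exact outward rounding: the result always covers the whole input rect. */
irect irect_covering(const rect *r)
{
	irect b;

	assert(r != 0);
	b.x0 = floor_int(r->x0);
	b.y0 = floor_int(r->y0);
	b.x1 = ceil_int(r->x1);
	b.y1 = ceil_int(r->y1);
	return b;
}

/* Fuzzy rounding: edges within the (assumed) tolerance of an integer
 * snap inwards to that integer instead of growing by a whole pixel. */
irect irect_fuzzy(const rect *r)
{
	irect b;

	assert(r != 0);
	b.x0 = floor_int(r->x0 + 0.001f);
	b.y0 = floor_int(r->y0 + 0.001f);
	b.x1 = ceil_int(r->x1 - 0.001f);
	b.y1 = ceil_int(r->y1 - 0.001f);
	return b;
}
```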
2013-01-30	Introduce fz_irect where the old fz_bbox was useful.	Tor Andersson
	Inside the renderer we often deal with integer sized areas, for pixmaps and scissoring regions. Use a new fz_irect type in these places.
2013-01-30	Eliminate fz_bbox in favor of fz_rect everywhere.	Tor Andersson

2013-01-25	Make strdup take a const char * to silence some warnings.	Tor Andersson

2013-01-11	Bug 693519: Replace char * with const char * in open document.	Robin Watts
	Simple patch to replace char * with const char *. I made the patch myself, but I suspect it's extremely close to the one submitted by Evgeniy A Dushistov, who reported the bug - many thanks!
2012-12-24	Bug 693503: Fix leak while writing a broken file.	Robin Watts
	While investigating samples_mupdf_001/2599.pdf.asan.58.1778, a leak showed up while cleaning the file, due to not dropping an object in an error case. mutool clean -dif samples_mupdf_001/2599.pdf.asan.58.1778 leak.pdf Simple Fix. Also extend PDF writing so that it can cope with skipping errors so we at least get something out at the end. Problem found in a test file supplied by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google Security Team using Address Sanitizer. Many thanks!
2012-12-21	Bug 593603: Fix problems with tiling.	Robin Watts
	Two problems with tiling are fixed here. Firstly, if the tiling bounds are huge, the 'patch' region (the region we are writing into), can overflow, causing a SEGV due to the paint code being very confused by pixmaps that go from just under INT_MAX to just over INT_MIN. Fix this by checking explicitly for overflow in these bounds. Secondly, if the tiles are stupidly huge, but the scissor is small, we can end up looping many more times than we need to. We fix this by mapping the scissor region back through the inverse transform, and intersecting this with the pattern area. Problem found in 4201.pdf.SIGSEGV.622.3560, a test file supplied by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google Security Team using Address Sanitizer. Many thanks!
2012-12-20	Bug 693503: Fix SEGV in glyph painting due to bbox overflow.	Robin Watts
	When calculating the bbox for draw_glyph, if the x and y origins of the glyph are extreme (too large to fit in an int), we get overflows of the bbox; empty bboxes are transformed to large ones. The fix is to introduce an fz_translate_bbox function that checks for such things. Also, we update various bbox/rect functions to check for empty bboxes before they check for infinite ones (as a bbox of x0=0 x1=0 y0=0 y1=-1 will be detected both as infinite and empty). Problem found in 2485.pdf.SIGSEGV.2a.1652, a test file supplied by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google Security Team using Address Sanitizer. Many thanks!
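A translation that cannot wrap around might be sketched like this. The clamping strategy, the type, and the function names are assumptions; the commit only says that fz_translate_bbox checks for overflow. The empty and infinite tests follow the example in the message, where x0=0 x1=0 y0=0 y1=-1 satisfies both.

```c
#include <assert.h>
#include <limits.h>

typedef struct { int x0, y0, x1, y1; } ibox;

/* Add without signed overflow: clamp to INT_MIN/INT_MAX instead of wrapping. */
static int add_saturating(int a, int b)
{
	if (b > 0 && a > INT_MAX - b)
		return INT_MAX;
	if (b < 0 && a < INT_MIN - b)
		return INT_MIN;
	return a + b;
}

int ibox_is_empty(const ibox *b)
{
	return b->x0 == b->x1 || b->y0 == b->y1;
}

int ibox_is_infinite(const ibox *b)
{
	return b->x0 > b->x1 || b->y0 > b->y1;
}

/* Hypothetical translate: test for empty before infinite, then clamp. */
ibox *ibox_translate(ibox *b, int dx, int dy)
{
	assert(b != 0);
	if (ibox_is_empty(b))
		return b; /* empty stays empty */
	if (ibox_is_infinite(b))
		return b; /* infinite stays infinite */
	b->x0 = add_saturating(b->x0, dx);
	b->y0 = add_saturating(b->y0, dy);
	b->x1 = add_saturating(b->x1, dx);
	b->y1 = add_saturating(b->y1, dy);
	return b;
}
```

In this sketch a box pushed past INT_MAX collapses to an empty box rather than wrapping to a huge negative one.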
2012-12-18	Protect against draw device stack confusion due to errors while pushing.	Robin Watts
	Whenever we have an error while pushing a gstate, we run the risk of getting confused over how many pops we need etc. With this commit we introduce some checking at the dev_null level that attempts to make this behaviour consistent. Any caller may now assume that calling an operation that pushes a clip will always succeed. This means the only error cleanup they need to do is to ensure that if they have pushed a clip (or begun a group, or a mask etc) they pop it too. Any callee may now assume that if it throws an error during the call to a device entrypoint that would create a group/clip/mask then no more calls will be forthcoming until after the caller has completely finished with that group. This is achieved by the dev_null layer (the layer that indirects from device calls through the device structure to the function pointers) swallowing errors and regurgitating them later as required. A count is kept of the number of pushes that have happened since an error occurred during a push (including that initial one). When this count reaches zero, the original error is regurgitated. This allows the caller to keep the cookie correctly updated.
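The swallow-and-regurgitate counting can be sketched as below. MuPDF uses its own exception macros; this illustration substitutes plain return codes, and the `wrapper` type, its fields, and the function names are all hypothetical.

```c
#include <assert.h>

typedef struct
{
	int error_depth;    /* pushes since (and including) the failed push */
	int saved_error;    /* the swallowed error, reported later */
	int fail_next_push; /* test hook: make the underlying device fail */
	int device_depth;   /* pushes the underlying device has accepted */
} wrapper;

/* Stand-in for the real device entry point; may fail. */
static int device_push(wrapper *w)
{
	if (w->fail_next_push)
	{
		w->fail_next_push = 0;
		return -1;
	}
	w->device_depth++;
	return 0;
}

/* From the caller's point of view, a push always succeeds. */
void wrapped_push(wrapper *w)
{
	int err;

	assert(w != 0);
	if (w->error_depth > 0)
	{
		w->error_depth++; /* device is not called again until we unwind */
		return;
	}
	err = device_push(w);
	if (err)
	{
		w->error_depth = 1;
		w->saved_error = err;
	}
}

/* Returns 0, or the deferred error once the failed push is unwound. */
int wrapped_pop(wrapper *w)
{
	assert(w != 0);
	if (w->error_depth > 0)
	{
		if (--w->error_depth == 0)
		{
			int err = w->saved_error;
			w->saved_error = 0;
			return err;
		}
		return 0;
	}
	w->device_depth--;
	return 0;
}
```

The caller only has to balance every push with a pop; the original error surfaces on the pop that matches the failed push.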