path: root/pdf
Age | Commit message | Author
2012-04-21 | Correct mistake in fz_meta. | Robin Watts
We were mapping from one enum range to another, and then using the unmapped value.
2012-04-21 | Bug 692996: Eliminate recursion to avoid exception stack overflows. | Robin Watts
Avoid recursion in pdf_load_page_tree_node. Avoid recursion (most of the time) in pdf_read_xref_sections.
2012-04-19 | Bug 692847: Tiling problem fix. | Robin Watts
The test in mupdf for 'is more than one tile needed' is wrong, as it assumes that tile bboxes start at 0. Fix that, and everything else should work OK.
2012-04-17 | Add Meta interface to fz_document. | Robin Watts
Use this to reintroduce "Document Properties..." in mupdf viewer.
2012-04-16 | Reduce the changes that mupdfclean makes to floats. | Robin Watts
Most of the changes mupdfclean makes to a file are purely textual (streams are decompressed etc), but some objects can undergo changes due to being read in, and then written out. Notably in this class are floats. For instance, the mediabox in Bug689189.pdf contains 2125.984, which when written out with the current code gives 2125.98. This is enough of a difference to cause rendering changes. By upping the precision (using %1.9g instead of %g) we get better results; we now output 2125.9839, which is much closer (and in fact has the same float representation when read back in). This drastically reduces the differences between a rendering of Bug689189.pdf and the uncompressed version, but we still have differences - in shadings, it seems.
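The precision issue described in this message can be reproduced with a small, self-contained sketch. The helper names below are hypothetical and are not taken from the MuPDF sources; they only check whether a float written with a given printf format parses back to the same float.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write f with "%g" (6 significant digits),
 * parse the text back, and report whether the value is unchanged. */
static int round_trips_with_g(float f)
{
    char buf[64];
    snprintf(buf, sizeof buf, "%g", f);
    return strtof(buf, NULL) == f;
}

/* Hypothetical helper: same check using "%1.9g" (9 significant digits),
 * which is enough to identify any IEEE single precision value. */
static int round_trips_with_9g(float f)
{
    char buf[64];
    snprintf(buf, sizeof buf, "%1.9g", f);
    return strtof(buf, NULL) == f;
}
```

With the mediabox value from the message, `round_trips_with_g(2125.984f)` reports a changed value while `round_trips_with_9g(2125.984f)` does not.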
2012-04-05 | Fix potential problems on malloc failure. | Robin Watts
Don't reset the size of arrays until we have successfully resized them.
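A minimal sketch of the pattern this message describes, assuming a hypothetical `int_array` type (not a MuPDF structure): the recorded capacity is only updated once the reallocation has succeeded, so a failed allocation leaves the array consistent.

```c
#include <stdlib.h>

/* Hypothetical growable array, for illustration only. */
typedef struct {
    int *items;
    size_t cap;
} int_array;

/* Grow the array to hold at least new_cap items.
 * Returns 1 on success, 0 on failure. On failure, neither
 * a->items nor a->cap is modified. */
static int int_array_reserve(int_array *a, size_t new_cap)
{
    int *p;

    if (new_cap <= a->cap)
        return 1;
    if (new_cap > (size_t)-1 / sizeof *p)
        return 0; /* byte count would overflow size_t */
    p = realloc(a->items, new_cap * sizeof *p);
    if (p == NULL)
        return 0; /* old block is still valid and still a->cap long */
    a->items = p;
    a->cap = new_cap; /* only now is it safe to record the new size */
    return 1;
}
```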
2012-04-05 | Don't unlock a lock we don't own. | Robin Watts
While debugging Bug 692943, I spotted a case where we can attempt to unlock the file while we don't hold the file lock due to an error being thrown while we momentarily drop that lock. Simple solution is to add a new fz_try()/fz_catch() to retake the lock in such an error circumstance.
2012-04-05 | Bug 692141 - Work around bug in VS2005 Team Suite | Robin Watts
Put the logf call in its own statement to fix a stupid header file bug.
2012-03-28 | Whitespace fixes. | Tor Andersson
2012-03-19 | Bug 692669: Snap Rotate values for pages to be a multiple of 90 | Robin Watts
Previously we attempted to honour arbitrary page rotation values, which is technically against the spec (Rotate must be a multiple of 90).
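One way to snap a rotation value is sketched below. The function name `snap_rotate` and the choice to round to the nearest multiple of 90 are assumptions for illustration; the commit message does not say how the real code rounds.

```c
/* Hypothetical helper: map any integer rotation onto 0, 90, 180 or 270. */
static int snap_rotate(int rotate)
{
    /* Normalise into the range 0..359 (C's % can yield negatives). */
    rotate %= 360;
    if (rotate < 0)
        rotate += 360;
    /* Round to the nearest multiple of 90 (assumed policy). */
    rotate = ((rotate + 45) / 90) * 90;
    /* Values of 315 and above round up to 360, which is the same as 0. */
    return rotate % 360;
}
```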
2012-03-19 | Bug 692746; avoid 'double palettes' on jpx images. | Robin Watts
It seems that JPX images can be supplied in indexed format, with both a palette internal to the jpx stream, and a palette in the PDF. Googling seems to suggest that the internal palette should be ignored in this case, and the external palette applied. Fortunately, since OpenJPEG-1.5 there is a flag that can be used to tell OpenJPEG not to decode palettes. We update the code here to spot that there is an external palette, and to set this flag.
2012-03-15 | Bug 692874: Fix Launch annotations. | Robin Watts
Looks like my launch annotation code was incorrect, hence giving the empty string for all FZ_LINK_LAUNCH types; fixed here.
2012-03-14 | Bug 692917: Move to dynamic stroke_states. | Robin Watts
Move fz_stroke_state from being a simple structure whose contents are copied repeatedly to being a dynamically allocated reference counted object so we can cope with large numbers of entries in the dash array.
2012-03-14 | Warn not throw on indirection cycles in pdf_resolve_indirect. | Robin Watts
When we fail to be able to cache an object, we warn and return NULL. An indirection cycle should probably be treated the same way. From SumatraMuPDF.patch - Many thanks.
2012-03-14 | Avoid NULL dereferences in error cases when trying to warn. | Robin Watts
Spotted from SumatraMuPDF.patch. Many thanks.
2012-03-13 | Add ctx argument and rename fz_bound_pixmap to fz_pixmap_bbox. | Tor Andersson
2012-03-13 | Rename some functions and accessors to be more consistent. | Tor Andersson
Debug printing functions: debug -> print.
Accessors: get noun attribute -> noun attribute.
Find -> lookup when the returned value is not reference counted.
pixmap_with_rect -> pixmap_with_bbox.
We are reserving the word "find" to mean lookups that give ownership of objects to the caller. Lookup is used in other places where the ownership is not transferred, or simple values are returned. The rename is done by the sed script in scripts/rename3.sed.
2012-03-12 | Merge branch 'master' into header-split | Robin Watts
2012-03-12 | Squash MSVC warning. | Robin Watts
2012-03-12 | Fix bitshifting by a negative amount in PS functions | Robin Watts
When bitshifting by a negative amount, we should shift right; thanks to Sebras' work in this area, I spotted that we are attempting to shift right by a negative number.
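The intended behaviour can be sketched as follows, assuming a hypothetical `ps_bitshift` helper (not the MuPDF function): a negative shift count becomes a right shift by the negated amount, so the shift operators never see a negative operand. Returning 0 for out-of-range counts and using a logical right shift are illustrative choices.

```c
#include <limits.h>

/* Hypothetical helper: positive shift moves bits left,
 * negative shift moves bits right by -shift. */
static int ps_bitshift(int value, int shift)
{
    unsigned int bits = (unsigned int)value;
    const int width = (int)(sizeof bits * CHAR_BIT);

    /* Shifting by the full width or more is undefined in C;
     * this sketch defines the result as 0 instead. */
    if (shift >= width || shift <= -width)
        return 0;
    if (shift >= 0)
        return (int)(bits << shift);
    /* Negate the count so the right shift amount is positive. */
    return (int)(bits >> -shift);
}
```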
2012-03-12 | Take care of boundary conditions in ps function evaluation. | Sebastian Rasmussen
Floating point numbers are now clamped, division by zero is approximated by minimum or maximum value and NaN results in 1.0.
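A rough sketch of these rules, under stated assumptions: the helper names are hypothetical, and "minimum or maximum value" is taken here to mean -FLT_MAX and FLT_MAX, chosen by the sign of the numerator. The actual limits used by the real code are not given in the message.

```c
#include <float.h>

/* Hypothetical helper: NaN becomes 1.0, infinities are clamped
 * to the largest finite float of the same sign. */
static float ps_clamp(float x)
{
    if (x != x) /* only NaN compares unequal to itself */
        return 1.0f;
    if (x > FLT_MAX)
        return FLT_MAX;
    if (x < -FLT_MAX)
        return -FLT_MAX;
    return x;
}

/* Hypothetical helper: division that never produces inf or NaN. */
static float ps_div(float a, float b)
{
    if (b == 0.0f) {
        if (a > 0.0f)
            return FLT_MAX;  /* stands in for +infinity */
        if (a < 0.0f)
            return -FLT_MAX; /* stands in for -infinity */
        return 1.0f;         /* 0/0 would be NaN, which maps to 1.0 */
    }
    return ps_clamp(a / b);
}
```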
2012-03-07 | More release tidyups. | Robin Watts
Add some function documentation to fitz.h. Add fz_ prefix to runetochar, chartorune, runelen etc. Change fz_runetochar to avoid passing unnecessary pointer.
2012-03-07 | Splitting tweaks. | Tor Andersson
2012-03-06 | Split fitz.h/mupdf.h into internal/external headers. | Robin Watts
Attempt to separate public API from internal functions.
2012-03-06 | Fix ref counting bugs in race condition correction code. | Robin Watts
When we attempt to insert a key/value pair into the store, we have to allow for the possibility that a racing thread may have already inserted an equivalent key/value. We have special code in place to handle this eventuality; if we spot an existing entry, we take the existing one in preference to our new key/value pair. This means that fz_store_item needs to take a new reference to any existing thing it finds before returning it. Currently the only store user that is exposed to this possibility is pdf_image; it spots an existing tile being returned, and was inadvertently double freeing the key.
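The "take the existing one, with a new reference" rule can be illustrated with a deliberately simplified, single-threaded sketch. The `item` and `store` types and `store_insert` are hypothetical stand-ins, not the fz_store API, and no locking is shown.

```c
#include <stddef.h>

/* Hypothetical reference counted item keyed by an integer. */
typedef struct {
    int key;
    int refs;
} item;

/* Hypothetical fixed-size store. */
typedef struct {
    item *slots[16];
    int n;
} store;

/* Insert it into the store. Returns NULL if it was inserted (the
 * store then holds its own reference). If an equivalent key is
 * already present, returns that existing item with a NEW reference
 * taken for the caller, who must drop it when finished. */
static item *store_insert(store *s, item *it)
{
    int i;

    for (i = 0; i < s->n; i++) {
        if (s->slots[i]->key == it->key) {
            s->slots[i]->refs++; /* reference owned by the caller */
            return s->slots[i];
        }
    }
    if (s->n < 16) {
        it->refs++; /* reference owned by the store */
        s->slots[s->n++] = it;
    }
    /* A full store simply declines to cache the item. */
    return NULL;
}
```

A caller that gets a non-NULL result keeps its own item and key alive until it has finished switching over to the existing entry, then drops each exactly once.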
2012-03-06 | Warn instead of throw when permissions are missing in encrypted PDF. | Sebastian Rasmussen
2012-03-06 | Guess encryption revision from the version if missing. | Sebastian Rasmussen
2012-03-01 | Setjmp/longjmp exception tweaks. | Robin Watts
First, fix a couple of the 'alternative formulations' of the try/catch code in the comments. Secondly, work around a Mac OS X compiler bug.
2012-03-01 | Remove mask entry from fz_pixmap as never used any more. | Robin Watts
Also, the attempts to keep it up to date were causing race conditions in multithreading cases.
2012-02-29 | Fix trailing whitespace and mixed tabs/spaces in indentation. | Tor Andersson
2012-02-29 | Fix typo that causes an undefined pointer to be freed. | Robin Watts
When a font is destroyed the t3 resources are freed; due to a typo a random pointer was being freed.
2012-02-26 | Move fz_obj to be pdf_obj. | Robin Watts
Currently, we are in the slightly strange position of having the PDF specific object types as part of fitz. Here we pull them out into the pdf layer instead. This has been made possible by the recent changes to make the store no longer be tied to having fz_obj's as keys. Most of this work is a simple huge rename; to help customers who may have code that uses such functions we have provided a sed script to do the renaming; scripts/rename2.sed. Various other small tweaks are required; the store used to have some debugging code that still required knowledge of fz_obj types - we extract that into a nicer 'type' based function pointer. Also, the type 3 font handling used to have an fz_obj pointer for type 3 resources, and therefore needed to know how to free this; this has become a void * with a function to free it.
2012-02-26 | Continued documentation improvements. | Sebastian Rasmussen
More changes still to come.
2012-02-26 | Document the most commonly used interface functions. | Sebastian Rasmussen
2012-02-25 | Fix assert/SEGVs seen in cluster due to using a color image as a mask. | Robin Watts
When loading a JPX image with no specified colorspace, we were ending up with image->colorspace being set to NULL. This caused us to treat the image as a mask. The correct fix is to inherit the colorspace from the jpx once loaded.
2012-02-25 | Revamp pdf lexing code | Robin Watts
A huge amount (20%+ on some files) of our runtime is spent in fz_atof. A survey of results on the net suggests we will get much better speed by writing our own atof. Part of the job of doing this involves parsing the string to identify the component parts of the number - ludicrously, we are already doing this as part of the lexing process, so it would make sense to do the atoi/atof as part of this process. In order to do this, we need somewhere to store the lexed results; rather than add a float * and an int * to every single pdf_lex call, we generalise the calls to pass a pdf_lexbuf * pointer instead of separate buffer/max/string length pointers. This should help us overall.
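The idea of computing the numeric value while the lexer is already scanning the digits can be sketched as below. `lexed_number` and `lex_number` are hypothetical names, not the pdf_lexbuf interface, and the sketch ignores exponents and very long digit runs.

```c
/* Hypothetical result record: value plus an integer/real flag. */
typedef struct {
    int is_real;  /* 1 if a decimal point was seen */
    double value; /* the number, computed during the scan */
} lexed_number;

/* Scan an optionally signed decimal number at s, accumulating its
 * value digit by digit instead of calling atof afterwards.
 * Returns 1 on success, 0 if no digits were found. */
static int lex_number(const char *s, lexed_number *out)
{
    double v = 0.0, div = 1.0;
    int neg = 0, digits = 0, seen_dot = 0;

    if (*s == '+' || *s == '-')
        neg = (*s++ == '-');
    for (; *s; s++) {
        if (*s >= '0' && *s <= '9') {
            v = v * 10.0 + (*s - '0');
            if (seen_dot)
                div *= 10.0; /* one more fractional digit */
            digits++;
        } else if (*s == '.' && !seen_dot) {
            seen_dot = 1;
        } else {
            break; /* first character that is not part of the number */
        }
    }
    if (digits == 0)
        return 0;
    out->is_real = seen_dot;
    out->value = (neg ? -v : v) / div;
    return 1;
}
```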
2012-02-25 | Add fz_trim_buffer function, and call it. | Robin Watts
Remove stray space at the end of buffers.
2012-02-25 | Rework image handling for on demand decode | Robin Watts
Introduce a new 'fz_image' type; this type contains rudimentary information about images (such as native size, colorspace etc) and a function to call to get a pixmap of that image (with a size hint). Instead of passing pixmaps through the device interface (and holding pixmaps in the display list) we now pass images instead. The rendering routines therefore call fz_image_to_pixmap to get pixmaps to render, and fz_pixmap_drop those afterwards.

The file format handling routines therefore need to produce images rather than pixmaps; xps and cbz currently just wrap pixmaps as images. PDF is more involved. The stream handling routines in PDF have been altered so that they can recognise when the last stream entry in a filter dictionary is an image decoding filter. Rather than applying this filter, they read and store the parameters into a pdf_image_params structure, and stop decoding at that point. This allows us to read the compressed data for an image into memory as a block. We can then restart the image decode process later.

pdf_images therefore consist of the compressed image data for images. When a pixmap is requested for such an image, the code checks to see if we have one (of an appropriate size), and if not, decodes it. The size hint is used to determine whether it is possible to subsample the image; currently this is only supported for JPEGs, but we could add generic subsampling code later.

In order to handle caching the produced images, various changes have been made to the store and the underlying hash table. Previously the store was indexed purely by fz_obj keys; we don't have an fz_obj key any more, so have extended the store by adding a concept of a key 'type'. A key type is a pointer to a set of functions that keep/drop/compare and make a hashable key from a key pointer. We make a pdf_store.c file that contains functions to offer the existing fz_obj based functions, and add a new 'type' for keys (based on the fz_image handle, and the subsample factor) in the pdf_image.c file.

While working on this, a problem became apparent in the existing store code; fz_obj objects had no protection on their reference counts, hence an interpreter thread could try to alter a ref count at the same time as a malloc caused an eviction from the store. This has been solved by using the alloc lock as protection. This in turn requires some tweaks to the code to make sure we don't try and keep/drop fz_obj's from the store code while the alloc lock is held.

A side effect of this work is that when a hash table is created, we inform it what lock should be used to protect its innards (if any). If the alloc lock is used, the insert method knows to drop/retake it to allow it to safely expand the hash table. Callers to the hash functions have the responsibility of taking/dropping the appropriate lock, and ensuring that they cope with the possibility that insert might drop the alloc lock, causing race conditions.
2012-02-15 | Add braces to resolve ambiguity. | Robin Watts
2012-02-15 | Fix typo in comment. | Robin Watts
CLUSTER_UNTESTED.
2012-02-15 | Treat 0000000000 * n xref entries as free ones. | Robin Watts
Quartz generated PDFs (and maybe others too) seem to use 0000000000 65536 n to mean "free object" in defiance of the spec. Add special case code to mupdf to handle this.
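A minimal sketch of such a special case, assuming a hypothetical `xref_entry_type` helper that classifies one textual xref entry (this is not the MuPDF parser): an in-use entry whose offset is zero is reported as free, whatever its generation number.

```c
#include <stdio.h>

/* Hypothetical helper: classify an xref entry of the form
 * "oooooooooo ggggg t". Returns 'n' (in use), 'f' (free),
 * or 0 if the line cannot be parsed. */
static int xref_entry_type(const char *line)
{
    long offset;
    int gen;
    char type;

    if (sscanf(line, "%ld %d %c", &offset, &gen, &type) != 3)
        return 0;
    if (type != 'n' && type != 'f')
        return 0;
    /* Special case: no object can live at offset 0, so an 'n'
     * entry pointing there is treated as a free entry. */
    if (type == 'n' && offset == 0)
        return 'f';
    return type;
}
```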
2012-02-13 | Add locking around freetype calls. | Robin Watts
We only open one instance of freetype per document. We therefore have to ensure that only 1 call to it takes place at a time. We introduce a lock for this purpose (FZ_LOCK_FREETYPE), and arrange to take/release it as required. We also update the font context so it is properly shared.
2012-02-09 | Remove stray lock call. | Robin Watts
I made a last minute change to make pdf_open_filter do the locking, and forgot to remove the call to lock from pdf_open_stream_with_offset. Fixed here. Thanks to Radu Lazar for pointing this out.
2012-02-08 | Lock reworking. | Robin Watts
This is a significant change to the use of locks in MuPDF. Previously, the user had the option of passing us lock/unlock functions for a single mutex as part of the allocation struct. Now we remove these entries from the allocation struct, and make a separate 'locks' struct. This enables people to use fz_alloc_default with locking.

If multithreaded operation is required, then the user is required to create FZ_LOCK_MAX mutexes, which will be locked or unlocked by MuPDF calling the lock/unlock functions within the new fz_locks_context structure passed in at context creation. These mutexes are not required to be recursive (they may be, but MuPDF should never call them in this way). MuPDF avoids deadlocks by imposing a locking ordering on itself; a thread will never take lock n, if it already holds any lock i for which 0 <= i <= n.

Currently, there are 4 locks used within MuPDF.
Lock 0: The alloc lock; taken around all calls to user supplied (or default) allocation functions. Also taken around all accesses to the refs field of storable items.
Lock 1: The store lock; taken whenever the store data structures (specifically the linked list pointers) are accessed.
Lock 2: The file lock; taken whenever a thread is accessing the raw file. We use the debugging macros to insist that this is held whenever we do a file based seek or read. We also insist that this is never held when we resolve an indirect reference, as this can have the effect of moving the file pointer.
Lock 3: The glyphcache lock; taken whenever a thread calls freetype, or accesses the glyphcache data structures. This introduces some complexities w.r.t type3 fonts.

Locking can be hugely problematic, so to ease our minds as to the correctness of this code, we introduce some debugging macros. These compile away to nothing unless FITZ_DEBUG_LOCKING is defined. fz_assert_lock_held(ctx, lock) checks that we hold lock. fz_assert_lock_not_held(ctx, lock) checks that we do not hold lock. In addition fz_lock_debug_lock and fz_lock_debug_unlock are used on every fz_lock/fz_unlock to check the validity of the operation we are performing - in particular it checks that we do/do not already hold the lock we are trying to take/drop, and that by taking this lock we are not violating our defined locking order.

The RESOLVE macro (used throughout the code to check whether we need to resolve an indirect reference) calls fz_assert_lock_not_held to ensure that we aren't about to resolve an indirect reference (and hence move the stream pointer) when the file is locked. In order to implement the file locking properly, pdf_open_stream (and friends) now lock the file as a side effect (because they fz_seek to the start of the stream). The lock is automatically dropped on an fz_close of such streams.

Previously, the glyph cache was created in a context when it was first required; this presents problems as it can be shared between several contexts or not, depending on whether it is created before the contexts are cloned. We now always create it at startup, so it is always shared. This means that we need reference counting for the glyph caches. Added here.

In fz_render_glyph, we take the glyph cache lock, and check to see whether the glyph is in the cache. If it is, we bump the refcount, drop the lock and return the cached character. If it is not, we need to render the character. For freetype based fonts we keep the lock throughout the rendering process, thus ensuring that freetype is only called in a single threaded manner. For type3 fonts, however, we need to invoke the interpreter again to render the glyph streams. This can require reentrance to this routine. We therefore drop the glyph cache lock, call the interpreter to render us our pixmap, and take the lock again.

This dropping and retaking of the lock introduces a possible race condition; 2 threads may try to render the same character at the same time. We therefore modify our hash table insert routines to behave differently if it comes to insert an entry only to find that an entry with the same key is already there. We spot this case; if we have just rendered a type3 glyph and when we try to insert it into the cache discover that someone has beaten us to it, we just discard our entry and use the cached one. Hopefully this will seldom be a problem in practice; to solve it properly would require greater complexity (probably involving spotting that another thread is already working on the desired rendering, and sleeping on a semaphore until it completes).
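The locking order rule from the lock reworking message (never take lock n while holding any lock i with 0 <= i <= n) can be checked with a small bookkeeping sketch. The enum names, `lock_state`, `try_take` and `release` are hypothetical; the real debugging macros assert rather than return a flag, and they track state per thread, which this single-threaded sketch does not attempt.

```c
/* Hypothetical lock indices mirroring the four locks listed above. */
enum {
    LOCK_ALLOC = 0,
    LOCK_STORE = 1,
    LOCK_FILE = 2,
    LOCK_GLYPHCACHE = 3,
    LOCK_MAX = 4
};

/* Hypothetical record of which locks one thread currently holds. */
typedef struct {
    int held[LOCK_MAX];
} lock_state;

/* Record taking lock n if the ordering rule allows it.
 * Returns 1 if allowed, 0 if it would violate the ordering
 * (which also covers taking a lock that is already held). */
static int try_take(lock_state *st, int n)
{
    int i;

    for (i = 0; i <= n; i++)
        if (st->held[i])
            return 0;
    st->held[n] = 1;
    return 1;
}

/* Record dropping lock n. */
static void release(lock_state *st, int n)
{
    st->held[n] = 0;
}
```

Under this rule the alloc lock is always the innermost one: it may be taken while holding any other lock, but no other lock may be taken while it is held.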
2012-02-06 | Pass context to cmap and font descriptor functions. | Tor Andersson
2012-02-03 | Be consistent about passing a fz_context in path/text/shade functions. | Tor Andersson
2012-02-03 | Be consistent about passing a fz_context argument in pixmap functions. | Tor Andersson
2012-02-03 | Reference count fz_link objects. | Tor Andersson
2012-02-03 | Remove extraneous blank lines. | Tor Andersson
2012-02-03 | Add document interface. | Tor Andersson