|
This is an early import of a change from the forms branch to enable
me to tweak the cluster's mupdf build test line to avoid trying to
build the js components for mupdf builds.
|
|
For example "Symbol,Italic" can be handled as an artificially
obliqued "Symbol".
Fixes an issue in test file normal_161.pdf.
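As an illustration of the idea (helper and names hypothetical, not the
actual MuPDF code), the handling amounts to splitting the style suffix
off the requested name and faking the style with a shear:

    #include <string.h>

    /* Hypothetical sketch: split "Symbol,Italic" into base + style and
       fake the italic by obliquing the font matrix. */
    static void load_substitute_font(const char *reqname)
    {
        char base[64];
        const char *style = strchr(reqname, ',');
        size_t n = style ? (size_t)(style - reqname) : strlen(reqname);

        if (n >= sizeof base)
            n = sizeof base - 1;
        memcpy(base, reqname, n);
        base[n] = 0;

        /* load the builtin font for 'base' here... */

        if (style && !strcmp(style + 1, "Italic"))
        {
            /* Artificial oblique: shear by roughly 12 degrees. */
            float shear = 0.21256f; /* tan(12 degrees) */
            (void)shear; /* fold into the font matrix: [1 0 shear 1 0 0] */
        }
    }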
|
|
normal_178.pdf contains a monochrome black-and-white image, encoded
as 16bpc RGB.
|
|
When calculating the displaylist node rectangles, we were failing
to adjust for linewidth/mitre limit etc. This could result in glyphs
being clipped; see normal_130.pdf for example.
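Sketched with hypothetical names - a miter join can overshoot the path
by up to linewidth/2 times the miter limit, so the stored rect must
grow accordingly:

    typedef struct { float x0, y0, x1, y1; } rect;

    /* Sketch: grow a display-list node rect by the worst-case stroke
       expansion so stroked content is not cut off. */
    static rect adjust_rect_for_stroke(rect r, float linewidth, float miterlimit)
    {
        float expand = 0.5f * linewidth;
        if (miterlimit > 1)
            expand *= miterlimit; /* miter joins overshoot the path */
        r.x0 -= expand; r.y0 -= expand;
        r.x1 += expand; r.y1 += expand;
        return r;
    }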
|
|
In PDF the text rendering mode can have bit 2 set to mean "add to clipping
path". Experiments (and in particular normal_130.pdf) show that the text
should be stroked and/or filled BEFORE the path is added to the clipping
path.
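In terms of the Tr values 0-7 (4-7 have bit 2 set), the required
ordering looks like this sketch (helper names assumed):

    static void fill_text_path(void);         /* assumed helpers */
    static void stroke_text_path(void);
    static void add_text_path_to_clip(void);

    /* Sketch: decompose the text rendering mode, doing any stroking
       and filling before the glyphs join the clipping path. */
    static void flush_text(int Tr)
    {
        int dofill   = (Tr == 0 || Tr == 2 || Tr == 4 || Tr == 6);
        int dostroke = (Tr == 1 || Tr == 2 || Tr == 5 || Tr == 6);
        int doclip   = (Tr >= 4); /* bit 2 set: modes 4..7 */

        if (dofill)
            fill_text_path();
        if (dostroke)
            stroke_text_path();
        if (doclip)
            add_text_path_to_clip(); /* only after any fill/stroke */
    }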
|
|
This solves the normal_87.pdf rendering issues.
|
|
|
|
Remove unused variable, silencing compiler warning.
No need to initialize variables twice.
Remove initialization of unread variable.
Remove unnecessary check for NULL.
Close output file upon error in cmapdump.
|
|
This makes it easier to separate building of mupdf itself from
libraries, e.g. when running clang's scan-build.
|
|
Currently pdf_lexbufs use a static scratch buffer for parsing. In
the main case this is 64K in size, but in other cases it can be
just 256 bytes; this causes problems when parsing long strings.
Even the 64K limit is an implementation limit of Acrobat, not an
architectural limit of PDF.
Change here to allow dynamic buffers. This means a slightly more
complex setup and destruction for each buffer, but more importantly
requires correct cleanup on errors. To avoid having to insert
lots more try/catch clauses, this commit includes various changes to
the code so we reuse pdf_lexbufs where possible. This keeps the
speed up.
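A minimal sketch of the scheme (names hypothetical): start on a fixed
scratch area, spill to the heap on demand, and free the heap copy on
every path, including errors:

    #include <stdlib.h>
    #include <string.h>

    #define LEXBUF_SMALL 256

    typedef struct
    {
        char *scratch;            /* current buffer (fixed or heap) */
        size_t size;              /* capacity of scratch */
        size_t len;               /* bytes used */
        char fixed[LEXBUF_SMALL]; /* initial static storage */
    } lexbuf;

    static void lexbuf_init(lexbuf *lb)
    {
        lb->scratch = lb->fixed;
        lb->size = sizeof lb->fixed;
        lb->len = 0;
    }

    static int lexbuf_grow(lexbuf *lb)
    {
        size_t newsize = lb->size * 2;
        char *p;

        if (lb->scratch == lb->fixed)
        {
            p = malloc(newsize);
            if (p)
                memcpy(p, lb->fixed, lb->len);
        }
        else
            p = realloc(lb->scratch, newsize);
        if (!p)
            return -1;
        lb->scratch = p;
        lb->size = newsize;
        return 0;
    }

    /* Must run on error paths too - hence the extra cleanup care. */
    static void lexbuf_fin(lexbuf *lb)
    {
        if (lb->scratch != lb->fixed)
            free(lb->scratch);
    }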
|
|
|
|
|
|
|
|
|
|
These functions currently call pdf_array_put, but this fails to
extend the array. Change to use pdf_array_push instead.
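The distinction, in a generic sketch rather than the real pdf_obj
internals:

    #include <stdlib.h>

    typedef struct { void **items; int len, cap; } array;

    /* put: overwrite an existing slot; never grows the array. */
    static int array_put(array *a, int i, void *obj)
    {
        if (i < 0 || i >= a->len)
            return -1; /* out of range - this is why appends were lost */
        a->items[i] = obj;
        return 0;
    }

    /* push: append, growing as needed (allocation errors elided). */
    static void array_push(array *a, void *obj)
    {
        if (a->len == a->cap)
        {
            a->cap = a->cap ? a->cap * 2 : 8;
            a->items = realloc(a->items, a->cap * sizeof *a->items);
        }
        a->items[a->len++] = obj;
    }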
|
|
|
|
|
|
|
|
|
|
Harmless, since the context wasn't used, but confusing.
|
|
|
|
Make mudraw pass a cookie in to the rendering procedures. If any errors
are reported for any page, remember this, and set the return code to 1
on exit.
|
|
After commit 120dadb, it's far too easy to get into a seemingly infinite
loop while processing a corrupt file.
We fix this by changing the process to abort when we receive an invalid
keyword.
Also, we add another layer of nesting to pdf_run_stream to avoid
pushing/popping an fz_try level on every keyword.
|
|
Previously we had a special-case hack for MacOS. Now we call
sigsetjmp/siglongjmp on all platforms that define __unix
(i.e. pretty much all of them except Windows).
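Roughly, the switch amounts to (macro names illustrative):

    #include <setjmp.h>

    /* On unix, use the signal-aware variants, with savesigs=0 so the
       signal mask is not saved/restored on every setjmp. */
    #ifdef __unix
    #define fz_jmp_buf sigjmp_buf
    #define fz_setjmp(BUF) sigsetjmp(BUF, 0)
    #define fz_longjmp(BUF, VAL) siglongjmp(BUF, VAL)
    #else
    #define fz_jmp_buf jmp_buf
    #define fz_setjmp(BUF) setjmp(BUF)
    #define fz_longjmp(BUF, VAL) longjmp(BUF, VAL)
    #endif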
|
|
When we allocate a pixmap > 2G but < 4G, the index into that
pixmap, when calculated as an int, can be negative. Fix this with
various casts to unsigned int.
If we ever move to support >4G images we'll need to rejig the
casting to cast each part of the element to ptrdiff_t first.
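The failure mode and the fix, as a sketch:

    /* With w*h*n between 2G and 4G, (y * w + x) * n computed in int
       wraps negative. Casting each factor to unsigned int keeps the
       index correct up to 4G; beyond that, each part would need to
       become ptrdiff_t instead. */
    static unsigned char *pixel_ptr(unsigned char *samples,
        int x, int y, int w, int n)
    {
        return samples +
            ((unsigned int)y * (unsigned int)w + (unsigned int)x) *
            (unsigned int)n;
    }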
|
|
The file supplied with the bug contains corrupt jpeg data on page
61. This causes an error to be thrown which results in mudraw
exiting.
Previously, when image decode was done at loading time, the error
would have been thrown under the pdf interpreter rather than under
the display list renderer. This error would have been caught, a
warning given, and the program would have continued. This is not
ideal behaviour, as there is no way for a caller to know that there
was a problem, and that the image is potentially incomplete.
The solution adopted here solves both these problems. The fz_cookie
structure is expanded to include an 'errors' count. Whenever we meet
an error during rendering, we increment the 'errors' count and
continue.
This enables applications to spot the errors count being non-zero on
exit and to display a warning.
mupdf is updated here to pass a cookie in and to check the error count
at the end; if it is found to be non-zero, then a warning is given (just
once per visit to each page) to say that the page may have errors on it.
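The caller side then looks something like this (signatures approximate
for the API of this era):

    /* Render a page, counting soft errors instead of dying on them. */
    static int render_page_checked(fz_context *ctx, fz_document *doc,
        fz_page *page, fz_device *dev, int pagenum)
    {
        fz_cookie cookie = { 0 };

        fz_run_page(doc, page, dev, fz_identity, &cookie);

        if (cookie.errors > 0)
        {
            fz_warn(ctx, "page %d may contain errors: rendering may be incomplete",
                pagenum);
            return 1; /* caller sets the process exit code to 1 */
        }
        return 0;
    }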
|
|
When handling knockout groups, we have to copy the background from the
previous group in so we can 'knockout' properly. If the previous group
is a different colorspace, this gives us problems!
The fix, implemented here, is to update the copy_pixmap_rect function
to know how to copy between pixmaps of different depth.
Gray <-> RGB are the ones we really care about; the generic code will
probably do a horrible job, but shouldn't ever be called at present.
This suffices to stop the crashing - we will probably revisit this
when we revise the blending support.
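For the Gray -> RGB direction the per-row copy is just replication; a
sketch, not the actual copy_pixmap_rect code:

    /* Copy one row from a gray+alpha pixmap (2 bytes/pixel) into an
       rgb+alpha pixmap (4 bytes/pixel), replicating gray into r,g,b. */
    static void copy_row_g2rgb(unsigned char *dst, const unsigned char *src, int w)
    {
        while (w--)
        {
            dst[0] = dst[1] = dst[2] = src[0]; /* gray -> r,g,b */
            dst[3] = src[1];                   /* alpha unchanged */
            dst += 4;
            src += 2;
        }
    }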
|
|
Extend mupdfclean with a new -l flag that writes the file
linearized. This should still be considered experimental.
When writing a pdf file, analyse object use, flatten resource use,
reorder the objects, generate a hintstream and output with linearisation
parameters.
This is enough for Acrobat to accept the file as being optimised
for 'Fast Web View'. We ought to add more tables to the hintstream
in some cases, but I doubt anyone actually uses it, the spec is so
badly written.
Update fz_dict_put to allow for adding a reference to the dictionary
that is already the sole owner of that reference (i.e. don't drop then
keep something that has a reference count of just 1).
Update pdf_load_image_stream to use the stm_buf from the xref if there
is one.
Update pdf_close_document to discard any stm_bufs it may be holding.
Rename fz_dict_put to pdf_dict_put - this was missed in a renaming
ages ago and has been inconsistent since.
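The fz_dict_put point is the classic keep-before-drop ordering; a
generic sketch using the pdf_obj reference-counting calls:

    /* Sketch: when storing 'val' over '*slot', take the new reference
       before dropping the old one. If val == *slot with a refcount of
       1, dropping first would free it before we could keep it. */
    static void dict_put_slot(pdf_obj **slot, pdf_obj *val)
    {
        pdf_keep_obj(val);
        pdf_drop_obj(*slot);
        *slot = val;
    }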
|
|
|
|
|
|
|
|
Needs more work to use the linked list of free xref slots.
|
|
|
|
When including fitz.h from C++ files, we must not alter the definition
of inline, as it may upset code that follows it. We only alter the
definition to enable inline where it's available, and it's always
available in C++ - so simply not changing it in the C++ case gives us
what we want.
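The guard amounts to something like this sketch:

    /* Only redefine 'inline' for C compilers; C++ always has it, and
       redefining it there can break code included after fitz.h. */
    #ifndef __cplusplus
    #ifdef _MSC_VER
    #define inline __inline /* MSVC spells it differently in C */
    #elif !defined(__STDC_VERSION__) || __STDC_VERSION__ < 199901L
    #define inline /* pre-C99 compiler: no inline keyword at all */
    #endif
    #endif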
|
|
Make a separate constructor function that does not link in the
interpreter, so we can save space in the mubusy binary by not
including the font and cmap resources.
|
|
|
|
|
|
mupdfclean (or more correctly, the pdf_write function) currently has
a limitation, in that we cannot renumber objects when encryption is
being used. This is because the object/generation number is pickled
into the stream, and renumbering the object causes it to become
unreadable.
The solution used here is to provide extended functions that take both
the object/generation number and the original object/generation
number. The original object numbers are only used for setting up the
encryption.
pdf_write now keeps track of the original object/generation number
for each object.
This fix is important if we ever want to output linearized pdf, as
that requires us to be able to renumber objects into a very specific
order.
We also make a fix in removeduplicateobjects that should only
matter in the case where we fail to read an object correctly.
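A sketch of the extended interface (names hypothetical): the cipher
key comes from the original numbers, while the object is emitted under
its new number:

    typedef struct pdf_obj pdf_obj;

    void setup_crypt_for_object(int num, int gen);  /* assumed helpers */
    void emit_indirect_object(int num, int gen, pdf_obj *obj);

    /* Write a renumbered object in an encrypted file: the cipher key
       must come from the ORIGINAL num/gen the strings were encrypted
       with, not from the new position in the output file. */
    static void write_object_ex(pdf_obj *obj, int num, int gen,
        int orig_num, int orig_gen)
    {
        setup_crypt_for_object(orig_num, orig_gen);
        emit_indirect_object(num, gen, obj);
    }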
|
|
Also make page specification parsing in all tools look similar.
|
|
Keep texture position calculations in floats as long as possible, as
prematurely dropping back to ints can cause overflows in the
intermediate stages that don't nicely cancel out.
The fix for this makes 2000 or so bitmap differences, most trivial, but
with some progressions.
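A tiny illustration (not the actual draw code) of why the intermediate
int arithmetic hurts:

    /* BAD: converting intermediate terms to fixed-point int can
       overflow even when the final result is small. */
    /* int u = ((int)(u0 * 65536.0f) + (int)(du * 65536.0f) * x) >> 16; */

    /* BETTER: stay in float and truncate only at the end. */
    static int texel_u(float u0, float du, int x)
    {
        return (int)(u0 + du * (float)x);
    }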
|
|
Thanks to stu-mupdf@spacehopper.org
|
|
Currently, if a page stream cannot be read, mupdf gives an alert box
and then exits. This is annoying when reading a large pdf.
Here we change the code to only exit if a page is completely broken;
in the case of missing page contents, or missing links, we give a warning
and just render the best we can.
Also, update a couple of error messages to be less misleading.
|
|
Previously, before interpreting a page's content stream we would
load it entirely into a buffer, and then interpret that buffer.
This has a cost in memory use.
Here, we update the code to read from a stream on the fly.
This has required changes in various parts of the code.
Firstly, we have removed all use of the FILE lock - as stream
reads can now safely be interrupted by resource (or object) reads
from elsewhere in the file, the file lock becomes a very hard
thing to maintain, and doesn't actually benefit us at all. The
choices were to either use a recursive lock, or to remove it
entirely; I opted for the latter.
The file lock enum value remains as a placeholder for future use in
extendable data streams.
Secondly, we add a new 'concat' filter that concatenates a series of
streams together into one, optionally putting whitespace between each
stream (as the pdf parser requires this).
Finally, we change page/xobject/pattern content streams to work
on the fly, but we leave type3 glyphs using buffers (as presumably
these will be run repeatedly).
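The concat filter's read loop is conceptually simple; a sketch with a
hypothetical state struct:

    typedef struct
    {
        fz_stream **streams; /* the sub-streams, in order */
        int count, current;
    } concat_state;

    /* Drain each sub-stream in turn, emitting one whitespace byte
       between them so the pdf lexer never sees tokens fused across a
       stream boundary. */
    static int concat_read(concat_state *st, unsigned char *buf, int len)
    {
        int n = 0;
        while (n < len && st->current < st->count)
        {
            int c = fz_read_byte(st->streams[st->current]);
            if (c < 0) /* EOF: advance to the next sub-stream */
            {
                st->current++;
                if (st->current < st->count)
                    buf[n++] = ' ';
                continue;
            }
            buf[n++] = (unsigned char)c;
        }
        return n;
    }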
|
|
In order to (hopefully) allow page content streams to be interpreted
without having to preload them all into memory before we run them, we
need to make the stream reading code cope with other users moving
the stream pointer.
For example, consider the case where we are midway through
interpreting a contents stream and we hit an operator that
requires something to be read from Resources. This will move the
underlying stream's file pointer, causing the contents stream to
read incorrectly when control returns to the interpreter.
The solution to this seems to be fairly simple: whenever we create
a filter out of the file stream, the existing code puts in a 'null'
filter first to enforce a length limit on the stream. This null
filter already does most of the work we need: because it is there,
the buffering of data happens in the null filter rather than in the
underlying stream layer.
All we need to do is to keep track of where in the underlying stream
the null filter thinks it is, and ensure that it seeks there before
each read (in case anyone else has moved it).
We move the setting of the offset to be explicit in the pdf_open_filter
(and associated) call(s), rather than requiring fz_seeks elsewhere.
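A sketch of the resulting read path (struct and fields hypothetical):

    typedef struct
    {
        fz_stream *file; /* underlying file stream, shared with others */
        int offset;      /* where THIS filter believes the file is */
        int remain;      /* bytes left under the stream's length limit */
    } null_state;

    /* Re-seek before every read: another user (e.g. a resource load)
       may have moved the shared file pointer since our last read. */
    static int null_read(null_state *st, unsigned char *buf, int len)
    {
        int n;

        if (len > st->remain)
            len = st->remain;
        fz_seek(st->file, st->offset, 0 /* SEEK_SET */);
        n = fz_read(st->file, buf, len);
        if (n > 0)
        {
            st->offset += n;
            st->remain -= n;
        }
        return n;
    }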
|
|
The scale_row_from_temp code was broken. Firstly, the rounding was wrong
in the 'bulk' case (not a big deal), but more importantly, on
configurations where unaligned loads were not allowed (such as the nook),
we could still crash due to an incorrect test that was meant to avoid
that code.
Thanks to Kammerer for the report and for testing the fixed version.
|
|
|
|
This means pdfshow can show objects in encrypted PDFs again.
|
|
|
|
Zeniko points out that images that don't decode on demand (i.e. ones
that are held as pixmaps all the time) can never be evicted from the
cache, because they hold a pointer to the pixmap, which holds a pointer
back to the image.
His fix is to only cache images that decode on demand.
The actual patch applied here is slightly tweaked from his version:
firstly, the 'dontcache' logic is reversed to 'cache' to avoid
overloading my poor brain with another negation.
Secondly, one change to a condition is not adopted, as it is (I believe)
unnecessary.
|
|
|