mupdf - MuPDF PDF reader and library

Age	Commit message	Author
2015-06-29	Rejig the internals of fz_image slightly.	Robin Watts
	Previously, we had people calling image->get_pixmap directly. Now we have them all call fz_image_get_pixmap, which will look for a cached version in the store, and only call get_pixmap if required. Previously fz_image_get_pixmap used to look for the cached version in the store, and decode if not found; hence the decoding code is now extracted out into standard_image_get_pixmap. This was the original intent of the code, it just somehow didn't end up like that. This nicely sets us up for being able to have fz_images that use a different get_pixmap implementation, such as that which will be required for the gprf code.
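The cache-then-decode dispatch described above can be sketched as follows. This is a minimal illustration, not the MuPDF implementation: every `sketch_` name is hypothetical, and a single `cached` pointer stands in for the lookup in the store.

```c
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real MuPDF types. */
typedef struct sketch_pixmap { int w, h; } sketch_pixmap;

typedef struct sketch_image sketch_image;
struct sketch_image {
	int w, h;
	/* Decoder hook; a different kind of image can supply its own. */
	sketch_pixmap *(*get_pixmap)(sketch_image *img);
	sketch_pixmap *cached;   /* stands in for a lookup in the store */
	sketch_pixmap storage;   /* backing memory for the decoded result */
	int decode_calls;        /* counts how often the hook really ran */
};

/* The "standard" decoder: only reached when nothing is cached. */
static sketch_pixmap *sketch_standard_get_pixmap(sketch_image *img)
{
	img->decode_calls++;
	img->storage.w = img->w;
	img->storage.h = img->h;
	return &img->storage;
}

/* Public entry point: try the cache first, call the hook only on a miss. */
static sketch_pixmap *sketch_image_get_pixmap(sketch_image *img)
{
	if (img->cached == NULL)
		img->cached = img->get_pixmap(img);
	return img->cached;
}

/* Fetches the same image twice and reports how many decodes happened. */
static int sketch_decode_count_after_two_gets(void)
{
	sketch_image img = { 4, 3, sketch_standard_get_pixmap, NULL, { 0, 0 }, 0 };
	sketch_image_get_pixmap(&img);
	sketch_image_get_pixmap(&img);
	return img.decode_calls;
}
```

Callers only ever go through `sketch_image_get_pixmap`, so swapping the hook changes how pixels are produced without touching the caching logic.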
2015-06-02	Ensure that we can still build mudraw standalone if we want to.	Robin Watts
	MUDRAW_STANDALONE forces mudraw_main to be just main. Set this in the mudraw VS project.
2015-05-25	Merge 'mudraw' into 'mutool' binary.	Tor Andersson
	Use "mutool draw" or symlink mutool to mudraw to use mudraw.
2015-05-19	epub: User stylesheets.	Tor Andersson
	Add -U option to mupdf and mudraw to set a user stylesheet. Uses a context to store the user stylesheet, just like the AA level.
2015-05-15	Support pdf files larger than 2Gig.	Robin Watts
	If FZ_LARGEFILE is defined when building, MuPDF uses 64bit offsets for files; this allows us to open streams larger than 2Gig. The downsides to this are that:
	* The xref entries are larger.
	* All PDF ints are held as 64bit things rather than 32bit things (to cope with /Prev entries, hint stream offsets etc).
	* All file positions are stored as 64bits rather than 32.
	The implementation works by detecting FZ_LARGEFILE. Some #ifdeffery in fitz/system.h sets fz_off_t to either int or int64_t as appropriate, and sets defines for fz_fopen, fz_fseek, fz_ftell etc as required. These call the fseeko64 etc functions on linux (and so define _LARGEFILE64_SOURCE) and the explicit 64bit functions on windows.
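A minimal sketch of the compile-time switch, assuming only what the message states (an offset type that is either int or int64_t). The `sketch_` names are hypothetical and the platform-specific file functions are deliberately left out.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical mirror of the offset typedef; not the real fitz/system.h. */
#ifdef FZ_LARGEFILE
typedef int64_t sketch_off_t;
#define SKETCH_OFF_MAX INT64_MAX
#else
typedef int sketch_off_t;
#define SKETCH_OFF_MAX INT_MAX
#endif

/* Returns 1 if every position in a file of `size` bytes fits in sketch_off_t. */
static int sketch_can_address(int64_t size)
{
	return size >= 0 && size <= (int64_t)SKETCH_OFF_MAX;
}
```

On a platform with a 32-bit int, `sketch_can_address((int64_t)3 << 30)` is expected to be false unless the file is compiled with `-DFZ_LARGEFILE`.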
2015-04-16	mutool clean -z option to compress streams.	Tor Andersson

2015-04-14	Fix 695918: "mudraw -sm" format string on win32.	Tor Andersson

2015-04-09	Add -v option to mutool and mudraw to print MuPDF version number.	Tor Andersson

2015-04-09	Remove the _no_run functions.	Tor Andersson
	The new pdfclean sanitize functionality means that mutool now needs the data files, so maintaining the split that was designed to keep data files out of mutool is no longer viable.
2015-04-07	Fix whitespace.	Tor Andersson

2015-04-07	Add EPUB layout options to mupdf-x11 and mudraw.	Tor Andersson

2015-04-07	Rename mutool show 'pages' to 'pagetree' to reduce possible confusion.	Tor Andersson
	Fixes bug 695909.
2015-04-06	Update mutool subtools to use PDF_NAME_xxx rather than "xxx".	Robin Watts

2015-04-06	Add mutool pages subcommand.	Robin Watts
	Inspired by bug 695823. Mutool can now dump the sizes and orientations for pages within a given file.
2015-04-06	Move the guts of pdfclean into the lib.	Robin Watts
	Michael needs to be able to call pdfclean from gsview. At the moment he's having to do this by including the pdfclean.c file into the lib build, and then calling pdfclean_main with a faked up command line. This isn't nice. pdfclean.c is implemented by pdfclean_main parsing the options/filenames out of argv and then passing the filenames/options on to a pdfclean_clean function. This seems like a much nicer API to offer to the world. We therefore pull the guts of pdfclean.c (pdfclean_clean and its subsidiary structures/functions) into pdf-clean-file.c and include this in the library build. This leaves pdfclean.c just as the command line parsing. This should not affect the size of any of the resulting binaries.
2015-04-06	Fix oddity with pdfinfo	Robin Watts
	If pdfinfo is invoked as: mutool info file.pdf 1,2,3 then it will show the items found on page 1, then the items found on pages 1 and 2, then the items shown on pages 1,2 and 3. Fix this by clearing the data after each show operation.
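The fix amounts to resetting the gathered data after each show operation. A minimal sketch of that pattern, with hypothetical `sketch_` names and a plain array standing in for the real per-page item lists:

```c
#include <stddef.h>

#define SKETCH_MAX_ITEMS 16

/* Hypothetical accumulator for items gathered from pages. */
typedef struct {
	int pages[SKETCH_MAX_ITEMS];
	size_t count;
} sketch_info;

/* Records one item found on `page`. */
static void sketch_gather(sketch_info *info, int page)
{
	if (info->count < SKETCH_MAX_ITEMS)
		info->pages[info->count++] = page;
}

/* Reports how many items would be shown, then clears them so that
 * the next show operation starts from an empty list. */
static size_t sketch_show_and_clear(sketch_info *info)
{
	size_t shown = info->count;
	info->count = 0;
	return shown;
}

/* Gathers and shows each requested page in turn; returns the number
 * of items shown by the final show operation. */
static size_t sketch_items_shown_for_last_page(const int *pages, size_t n)
{
	sketch_info info = { { 0 }, 0 };
	size_t shown = 0, i;
	for (i = 0; i < n; i++) {
		sketch_gather(&info, pages[i]);
		shown = sketch_show_and_clear(&info);
	}
	return shown;
}
```

Without the reset in `sketch_show_and_clear`, the third show for pages 1,2,3 would report three items instead of one.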
2015-04-02	Bug 695900: pdfclean return code is inverted.	Robin Watts
	Silly typo. Thanks to Daniel Bloemer for pointing this out.
2015-04-01	Update manpages.	Tor Andersson

2015-04-01	Fix scan for %d in mudraw.	Tor Andersson

2015-04-01	Clean up mudraw command line syntax.	Tor Andersson
	... and move outline printing to mutool show.
2015-03-24	Rework handling of PDF names for speed and memory.	Robin Watts
	Currently, every PDF name is allocated in a pdf_obj structure, and comparisons are done using strcmp. Given that we can predict most of the PDF names we'll use in a given file, this seems wasteful. The pdf_obj type is opaque outside the pdf-object.c file, so we can abuse it slightly without anyone outside knowing.
	We collect a sorted list of names used in PDF (resources/pdf/names.txt), and we add a utility (namedump) that preprocesses this into 2 header files. The first (include/mupdf/pdf/pdf-names-table.h, included as part of include/mupdf/pdf/object.h), defines a set of "PDF_NAME_xxxx" entries. These are pdf_obj *'s that callers can use to mean "A PDF object that means literal name 'xxxx'". The second (source/pdf/pdf-name-impl.h) is a C array of names.
	We therefore update the code so that rather than passing "xxxx" to functions (such as pdf_dict_gets(...)) we now pass PDF_NAME_xxxx (to pdf_dict_get(...)). This is a fairly natural (if widespread) change. The pdf_dict_getp (and sibling) functions that take a path (e.g. "foo/bar/baz") are therefore supplemented with equivalents that take a list (pdf_dict_getl(... , PDF_NAME_foo, PDF_NAME_bar, PDF_NAME_baz, NULL)).
	The actual implementation of this relies on the fact that small pointer values are never valid values. For a given pdf_obj *p, if NULL < (intptr_t)p < PDF_NAME__LIMIT then p is a literal entry in the name table. This enables us to do fast pointer compares and to skip expensive strcmps.
	Also, bring "null", "true" and "false" into the same style as PDF names. Rather than using full pdf_obj structures for null/true/false, use special pointer values just above the PDF_NAME_ table. This saves memory and makes comparisons easier.
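The small-pointer trick can be sketched as below. This is an illustration under stated assumptions, not the generated MuPDF headers: the three-entry table and all `sketch_`/`SKETCH_` names are hypothetical, and heap-allocated names (which would still need a string compare) are not modelled.

```c
#include <stddef.h>
#include <stdint.h>

/* Opaque object type, as seen by callers. */
typedef struct sketch_obj sketch_obj;

/* Hypothetical name indices; slot 0 is reserved so that NULL is never a name. */
enum {
	SKETCH_NAME_Type = 1,
	SKETCH_NAME_Pages,
	SKETCH_NAME_Kids,
	SKETCH_NAME__LIMIT
};

/* Strings for the table-backed names, indexed by the values above. */
static const char *sketch_name_table[] = { "", "Type", "Pages", "Kids" };

/* Encode a table index as a (never dereferenced) pointer value. */
#define SKETCH_NAME(n) ((sketch_obj *)(intptr_t)(n))

/* True if p is a small pointer value, i.e. an entry in the name table. */
static int sketch_is_builtin_name(const sketch_obj *p)
{
	intptr_t v = (intptr_t)p;
	return v > 0 && v < SKETCH_NAME__LIMIT;
}

/* Returns the string for a table-backed name, or NULL for anything else. */
static const char *sketch_builtin_name_string(const sketch_obj *p)
{
	return sketch_is_builtin_name(p) ? sketch_name_table[(intptr_t)p] : NULL;
}

/* Fast path: two table-backed names are equal iff the pointers are equal. */
static int sketch_name_eq(const sketch_obj *a, const sketch_obj *b)
{
	return sketch_is_builtin_name(a) && sketch_is_builtin_name(b) && a == b;
}
```

A lookup such as `sketch_name_eq(key, SKETCH_NAME(SKETCH_NAME_Kids))` then costs one range check and one pointer compare, with no strcmp.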
2015-03-20	Fix Memtrace for 64bit operation.	Robin Watts

2015-02-25	Bug 695851: Fix SEGV in mutool info.	Robin Watts
	Add missing initialisation of glo.ctx required due to API change.
2015-02-17	Add ctx parameter and remove embedded contexts for API regularity.	Tor Andersson
	Purge several embedded contexts:
	* Remove embedded context in fz_output.
	* Remove embedded context in fz_stream.
	* Remove embedded context in fz_device.
	* Remove fz_rebind_stream (since it is no longer necessary).
	* Remove embedded context in svg_device.
	* Remove embedded context in XML parser.
	* Add ctx argument to fz_document functions.
	* Remove embedded context in fz_document.
	* Remove embedded context in pdf_document.
	* Remove embedded context in pdf_obj.
	Make fz_page independent of fz_document in the interface. We shouldn't need to pass the document to all functions handling a page. If a page is tied to the source document, it's redundant; otherwise it's just pointless.
	Fix reference counting oddity in fz_new_image_from_pixmap.
2015-02-17	Rename fz_close_* and fz_free_* to fz_drop_*.	Tor Andersson
	Rename fz_close to fz_drop_stream. Rename fz_close_archive to fz_drop_archive. Rename fz_close_output to fz_drop_output. Rename fz_free_* to fz_drop_*. Rename pdf_free_* to pdf_drop_*. Rename xps_free_* to xps_drop_*.
2015-02-17	Reference count fz_document.	Tor Andersson

2015-01-20	don't try extracting object number 0	Simon Bünzli
	pdfextract_main by default iterates through all objects from number 0 to the size of the document's xref table. Object number 0 is however always supposed to be free, so pdfextract consistently fails and shows a slightly confusing warning. Object extraction should by default start at object 1 in order to prevent this warning.
2014-09-09	test-device: Abort interpretation when color found.	Robin Watts
	Add a new class of errors and use them to abort interpretation when the test device detects a color page.
2014-08-27	Revise test-device; thresholding and exhaustive checking.	Tor Andersson
	The original version of the test-device could characterise pages as being grayscale/color purely based on the colorspaces used. This could easily be upset by grayscale images or shadings that happened to be specified in non-grayscale colorspaces however. We now look at the actual shading and image color values, and use a threshold value to allow for some measure of rounding errors in color values that are in practice grayscale.
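A minimal sketch of a thresholded grayscale test on a single RGB value. The exact criterion used by the test-device is not given in the message, so the component-difference rule and the `sketch_` names here are assumptions for illustration only.

```c
/* Absolute value without pulling in the math library. */
static float sketch_absf(float x)
{
	return x < 0 ? -x : x;
}

/* Returns 1 if the RGB value is gray to within `threshold`, i.e. no two
 * components differ by more than the threshold. This rule is an assumed
 * criterion, chosen to absorb small rounding errors in color values. */
static int sketch_is_gray_rgb(float r, float g, float b, float threshold)
{
	return sketch_absf(r - g) <= threshold
		&& sketch_absf(g - b) <= threshold
		&& sketch_absf(r - b) <= threshold;
}
```

With a threshold of 0, only exactly equal components count as gray; a small positive threshold lets values such as (0.5, 0.52, 0.5) through.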
2014-08-12	Fix 695411: Catch errors when loading objects in mutool extract.	Tor Andersson

2014-08-12	Change error messages in mutool extract to follow the house style.	Tor Andersson

2014-07-17	Add feature testing device, and call it from mudraw with -T flag.	Tor Andersson
	Currently only tests for the presence of non-grayscale color.
2014-05-05	Fix 695207: printf format bug in mutool info.	Tor Andersson

2014-03-25	Split mjs script generation to separate tool.	Tor Andersson
	It has no real reason to live in mudraw, and it does pull in the javascript dependency via pdf-form.c.
2014-03-20	Respect reverse page ranges in mutool clean.	Tor Andersson

2014-03-19	Add routine to clean pdf content streams for pages.	Robin Watts
	New routine to filter the content streams for pages, xobjects, type3 charprocs, patterns etc. The filtered streams are guaranteed to be properly matched with q/Q's, and to not have changed the top level ctm. Additionally we remove (some) repeated settings of colors etc. This filtering can be extended to be smarter later. The idea of this is to both repair after editing, and to leave the streams in a form that can be easily appended to. This is preparatory to work on Bates numbering and Watermarking. Currently the streams produced are uncompressed.
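To illustrate what "properly matched q/Q's" means, here is a rough balance check over a content stream. It is a sketch only: it splits on whitespace and ignores strings, comments and inline images, which a real content stream filter has to handle, and `sketch_unbalanced_q` is a hypothetical name.

```c
/* Returns the number of 'q' operators left unmatched at the end of the
 * stream, or -1 if a 'Q' appears with no 'q' left to match it.
 * Tokens are assumed to be separated by plain whitespace. */
static int sketch_unbalanced_q(const char *stream)
{
	int depth = 0;
	const char *p = stream;

	while (*p) {
		const char *start;

		/* Skip whitespace between tokens. */
		while (*p == ' ' || *p == '\n' || *p == '\t' || *p == '\r')
			p++;
		start = p;
		/* Advance to the end of the current token. */
		while (*p && *p != ' ' && *p != '\n' && *p != '\t' && *p != '\r')
			p++;

		/* Only single-character tokens can be the q or Q operator. */
		if (p - start == 1) {
			if (*start == 'q') {
				depth++;
			} else if (*start == 'Q') {
				if (depth == 0)
					return -1;
				depth--;
			}
		}
	}
	return depth;
}
```

A stream that returns 0 can have further operators appended without inheriting leftover graphics state; a positive result is the number of closing Q operators that would need to be added.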
2014-03-19	Make mutool clean sanitise the Dests lists when subsetting.	Robin Watts
	When you use mutool clean to subset pages out of a PDF, we already remove the Name tree entries for named locations that aren't in the target file. Until now, however, we have failed to remove references to these removed names. This can cause errors (really warnings) on reading the file back.
2014-03-13	Tweak pdfclean and pdfinfo to be useful as libraries.	Robin Watts
	Firstly, we remove the use of global variables; this is done by introducing a 'globals' structure for each of these files and passing it internally between functions. Next, split the core of pdfclean_main into pdfclean_clean, and the core of pdfinfo_main into pdfinfo_info. The _main functions now do the argv processing. The new functions now run entirely thread safely, so can be called from library functions.
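The split described above (state in a structure, argv handling in a thin wrapper) can be sketched as follows. All names and the toy "page number must be at least 1" rule are hypothetical; only the shape of the refactoring is taken from the message.

```c
#include <stdlib.h>

/* Former global variables, gathered into one structure. */
typedef struct {
	int pages_seen;
	int errors;
} sketch_globals;

/* Internal helper: state is passed in rather than read from globals. */
static void sketch_visit_page(sketch_globals *glo, int page)
{
	glo->pages_seen++;
	if (page < 1)
		glo->errors++;
}

/* Library-style entry point: no argv parsing, no shared state.
 * Returns the number of invalid page numbers encountered. */
static int sketch_clean(const int *pages, int count)
{
	sketch_globals glo = { 0, 0 };
	int i;
	for (i = 0; i < count; i++)
		sketch_visit_page(&glo, pages[i]);
	return glo.errors;
}

/* Command-line wrapper: turns argv into arguments for sketch_clean
 * and maps the result to an exit status. */
static int sketch_clean_main(int argc, char **argv)
{
	int pages[32];
	int count = 0, i;
	for (i = 1; i < argc && count < 32; i++)
		pages[count++] = atoi(argv[i]);
	return sketch_clean(pages, count) ? 1 : 0;
}
```

Because each call to `sketch_clean` owns its own `sketch_globals` on the stack, concurrent calls do not interfere with one another.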
2014-01-09	Add -o option for mutool show.	Tor Andersson
	Windows doesn't like redirecting binary output, so add an explicit filename argument.
2014-01-07	Introduce 'document handlers'.	Robin Watts
	We define a document handler for each file type (2 in the case of PDF, one to handle files with the ability to 'run' them, and one without). We then register these handlers with the context at startup, and then call fz_open_document... as usual. This enables people to select the document types they want at will (and even to extend the library with more document types should they wish).
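A minimal sketch of the register-then-dispatch idea, assuming handlers are selected by a recognition score on the filename. The scoring scheme, the fixed-size handler list and every `sketch_` name are assumptions for illustration; they are not the MuPDF API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical handler: a name plus a hook that scores a filename. */
typedef struct {
	const char *name;
	int (*recognize)(const char *filename); /* > 0 means "I can open this" */
} sketch_handler;

#define SKETCH_MAX_HANDLERS 8

/* Hypothetical context holding the registered handlers. */
typedef struct {
	const sketch_handler *handlers[SKETCH_MAX_HANDLERS];
	int count;
} sketch_context;

static int sketch_has_suffix(const char *s, const char *suffix)
{
	size_t n = strlen(s), m = strlen(suffix);
	return n >= m && strcmp(s + n - m, suffix) == 0;
}

static int sketch_recognize_pdf(const char *f) { return sketch_has_suffix(f, ".pdf") ? 100 : 0; }
static int sketch_recognize_xps(const char *f) { return sketch_has_suffix(f, ".xps") ? 100 : 0; }

static const sketch_handler sketch_pdf_handler = { "pdf", sketch_recognize_pdf };
static const sketch_handler sketch_xps_handler = { "xps", sketch_recognize_xps };

/* Registers a handler; returns 0 on success, -1 if the list is full. */
static int sketch_register_handler(sketch_context *ctx, const sketch_handler *h)
{
	if (ctx->count >= SKETCH_MAX_HANDLERS)
		return -1;
	ctx->handlers[ctx->count++] = h;
	return 0;
}

/* Picks the registered handler with the highest positive score, or NULL. */
static const sketch_handler *sketch_find_handler(const sketch_context *ctx, const char *filename)
{
	const sketch_handler *best = NULL;
	int best_score = 0, i;
	for (i = 0; i < ctx->count; i++) {
		int score = ctx->handlers[i]->recognize(filename);
		if (score > best_score) {
			best_score = score;
			best = ctx->handlers[i];
		}
	}
	return best;
}
```

An application that only registers `sketch_pdf_handler` never references the XPS code, which is the point of letting callers choose document types at startup.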
2014-01-06	fix various MSVC warnings	Simon Bünzli
	Some warnings we'd like to enable for MuPDF and still be able to compile it with warnings as errors using MSVC (2008 to 2013):
	* C4115: 'timeval' : named type definition in parentheses
	* C4204: nonstandard extension used : non-constant aggregate initializer
	* C4295: 'hex' : array is too small to include a terminating null character
	* C4389: '==' : signed/unsigned mismatch
	* C4702: unreachable code
	* C4706: assignment within conditional expression
	Also, globally disable C4701 which is frequently caused by MSVC not being able to correctly figure out fz_try/fz_catch code flow. And don't define isnan for VS2013 and later where that's no longer needed.
2013-10-31	Add CMYK and CMYK Alpha colorspaces to mudraw options.	Tor Andersson

2013-10-31	Add CMYK support to PAM output.	Tor Andersson

2013-09-30	Disable image interpolation with a hint.	Robin Watts
	Set the hint in mudraw when AA bits is set to 0.
2013-09-27	add support for .tga output to mudraw	Simon Bünzli
	SumatraPDF's testsuite uses Targa images as output because they're compressed while still far easier to compare than PNG and have better tool support than PCL/PWG.
2013-09-06	Add '-' as an option for stdout to mudraw	Robin Watts

2013-08-30	Add simple banding to mudraw.	Robin Watts
	The most complex part here is to ensure that we can output various bitmaps in bands.
2013-08-28	Dump glyph cache size as part of mudraw -M	Robin Watts

2013-08-21	Add simple memory use tracking to mudraw	Robin Watts

2013-08-21	Add -F flag to mudraw to allow format selection.	Robin Watts
	This allows us to "mudraw -F ppm -o /dev/null" etc.