mupdf - MuPDF PDF reader and library

Age	Commit message	Author
2017-12-13	Add 'clean' option to pdfclean to clean (but not sanitize) content streams.	Tor Andersson
	This goes well with the 'mutool clean -d' decompression option to debug content streams, without doing the sanitize optimization pass.
2017-04-27	Include required system headers.	Tor Andersson

2016-10-06	Bug 697194: Document -gggg in muclean.	Robin Watts

2016-04-27	Tweak pdf-write option handling.	Tor Andersson
	The handling of not-decompressing images/fonts was geared towards pdfclean usage; but now that we can create new PDF files, it makes more sense to ask for images and fonts to be compressed, rather than asking for them not to be decompressed with quirky interaction with the 'expand' and 'deflate' flags. If -f or -i is set, we will never decompress images, and we will compress them if they are uncompressed. If -d is set, we will first decompress all streams (modulo -f or -i). If -z is set, we will then compress all uncompressed streams.
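The flag interaction described above can be sketched as a small decision function. This is only an illustration under assumptions: the struct, enum and function names are hypothetical and are not MuPDF's write-options API, and -f is assumed to apply to font streams in the same way -i applies to image streams.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags mirroring -d, -z, -f and -i (not MuPDF's real struct). */
struct write_opts {
    bool decompress;      /* -d */
    bool compress;        /* -z */
    bool compress_fonts;  /* -f */
    bool compress_images; /* -i */
};

enum stream_kind { STREAM_OTHER, STREAM_FONT, STREAM_IMAGE };

/* Return whether a stream ends up compressed after the write pass. */
static bool ends_compressed(const struct write_opts *o,
                            enum stream_kind kind, bool compressed)
{
    assert(o != NULL);

    /* -f / -i: never decompress these streams, compress them if needed. */
    bool pinned = (kind == STREAM_FONT && o->compress_fonts) ||
                  (kind == STREAM_IMAGE && o->compress_images);
    if (pinned)
        return true;

    /* -d: first decompress everything that is not pinned. */
    if (o->decompress)
        compressed = false;

    /* -z: then compress whatever is uncompressed at this point. */
    if (o->compress)
        compressed = true;

    return compressed;
}
```

With only -d set, an ordinary stream ends up uncompressed; adding -i keeps image streams compressed regardless of -d.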
2016-03-31	Initialize pdf write options to zero in pdfclean.	Tor Andersson

2016-03-31	Initialize disabled document writing flags to zero	Sebastian Rasmussen
	Also remove redundant assignments. Fixes http://bugs.ghostscript.com/show_bug.cgi?id=695968
2015-12-18	Remove fz_save_document and use pdf_save_document directly instead.	Tor Andersson
	In preparation for adding pdf_write_document that writes a document to a fz_output stream.
2015-12-15	Rename fz_write_x to fz_save_pixmap_as_x or fz_save_bitmap_as_x.	Tor Andersson
	Separate naming of functions that save complete files to disk from functions that write data to streams.
2015-04-16	mutool clean -z option to compress streams.	Tor Andersson

2015-04-06	Move the guts of pdfclean into the lib.	Robin Watts
	Michael needs to be able to call pdfclean from gsview. At the moment he's having to do this by including the pdfclean.c file into the lib build, and then calling pdfclean_main with a faked up command line. This isn't nice. pdfclean.c is implemented by pdfclean_main parsing the options/filenames out of argv and then passing the filenames/options on to a pdfclean_clean function. This seems like a much nicer API to offer to the world. We therefore pull the guts of pdfclean.c (pdfclean_clean and its subsidiary structures/functions) into pdf-clean-file.c and include this in the library build. This leaves pdfclean.c just as the command line parsing. This should not affect the size of any of the resulting binaries.
2015-04-02	Bug 695900: pdfclean return code is inverted.	Robin Watts
	Silly typo. Thanks to Daniel Bloemer for pointing this out.
2015-04-01	Update manpages.	Tor Andersson

2015-03-24	Rework handling of PDF names for speed and memory.	Robin Watts
	Currently, every PDF name is allocated in a pdf_obj structure, and comparisons are done using strcmp. Given that we can predict most of the PDF names we'll use in a given file, this seems wasteful. The pdf_obj type is opaque outside the pdf-object.c file, so we can abuse it slightly without anyone outside knowing. We collect a sorted list of names used in PDF (resources/pdf/names.txt), and we add a utility (namedump) that preprocesses this into 2 header files. The first (include/mupdf/pdf/pdf-names-table.h, included as part of include/mupdf/pdf/object.h) defines a set of "PDF_NAME_xxxx" entries. These are pdf_obj's that callers can use to mean "A PDF object that means literal name 'xxxx'". The second (source/pdf/pdf-name-impl.h) is a C array of names. We therefore update the code so that rather than passing "xxxx" to functions (such as pdf_dict_gets(...)) we now pass PDF_NAME_xxxx (to pdf_dict_get(...)). This is a fairly natural (if widespread) change. The pdf_dict_getp (and sibling) functions that take a path (e.g. "foo/bar/baz") are therefore supplemented with equivalents that take a list (pdf_dict_getl(... , PDF_NAME_foo, PDF_NAME_bar, PDF_NAME_baz, NULL)). The actual implementation of this relies on the fact that small pointer values are never valid values. For a given pdf_obj p, if NULL < (intptr_t)p < PDF_NAME__LIMIT then p is a literal entry in the name table. This enables us to do fast pointer compares and to skip expensive strcmps. Also, bring "null", "true" and "false" into the same style as PDF names. Rather than using full pdf_obj structures for null/true/false, use special pointer values just above the PDF_NAME_ table. This saves memory and makes comparisons easier.
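The small-pointer trick described above can be sketched as follows. This is a minimal illustration, not the code from pdf-object.c: the obj type, the NAME_* enum, the NAME() macro and the helper functions are all hypothetical, and casting small integers to pointers is implementation-defined in C, so the sketch assumes a platform where low addresses are never handed out for real objects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a heap-allocated name object. */
typedef struct obj { const char *heap_name; } obj;

/* Indices into a sorted table of predictable names; 0 is reserved for NULL. */
enum { NAME_NONE = 0, NAME_Kids, NAME_Pages, NAME_Type, NAME_LIMIT };
static const char *name_table[NAME_LIMIT] = { "", "Kids", "Pages", "Type" };

/* Encode a table index as a fake pointer value. */
#define NAME(n) ((obj *)(intptr_t)(n))

/* A pointer strictly between 0 and NAME_LIMIT is a table entry, not memory. */
static int is_table_name(const obj *p)
{
    intptr_t v = (intptr_t)p;
    return v > 0 && v < NAME_LIMIT;
}

static const char *name_text(const obj *p)
{
    assert(p != NULL);
    return is_table_name(p) ? name_table[(intptr_t)p] : p->heap_name;
}

static int name_eq(const obj *a, const obj *b)
{
    /* Fast path: two table entries are equal exactly when the pointers match. */
    if (is_table_name(a) && is_table_name(b))
        return a == b;
    /* Slow path for names that are not in the table. */
    return strcmp(name_text(a), name_text(b)) == 0;
}
```

A caller would write name_eq(key, NAME(NAME_Type)) instead of comparing against the string "Type", so the common case costs one pointer comparison.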
2015-02-17	Add ctx parameter and remove embedded contexts for API regularity.	Tor Andersson
	Purge several embedded contexts: Remove embedded context in fz_output. Remove embedded context in fz_stream. Remove embedded context in fz_device. Remove fz_rebind_stream (since it is no longer necessary). Remove embedded context in svg_device. Remove embedded context in XML parser. Add ctx argument to fz_document functions. Remove embedded context in fz_document. Remove embedded context in pdf_document. Remove embedded context in pdf_obj. Make fz_page independent of fz_document in the interface. We shouldn't need to pass the document to all functions handling a page. If a page is tied to the source document, it's redundant; otherwise it's just pointless. Fix reference counting oddity in fz_new_image_from_pixmap.
2015-02-17	Rename fz_close_* and fz_free_* to fz_drop_*.	Tor Andersson
	Rename fz_close to fz_drop_stream. Rename fz_close_archive to fz_drop_archive. Rename fz_close_output to fz_drop_output. Rename fz_free_* to fz_drop_*. Rename pdf_free_* to pdf_drop_*. Rename xps_free_* to xps_drop_*.
2014-03-20	Respect reverse page ranges in mutool clean.	Tor Andersson

2014-03-19	Add routine to clean pdf content streams for pages.	Robin Watts
	New routine to filter the content streams for pages, xobjects, type3 charprocs, patterns etc. The filtered streams are guaranteed to be properly matched with q/Q's, and to not have changed the top level ctm. Additionally we remove (some) repeated settings of colors etc. This filtering can be extended to be smarter later. The idea of this is to both repair after editing, and to leave the streams in a form that can be easily appended to. This is preparatory to work on Bates numbering and Watermarking. Currently the streams produced are uncompressed.
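One way to picture the "properly matched q/Q" guarantee is a pass that tracks save/restore depth over a list of operators. The sketch below is an assumption about the general idea only, not the filter this commit adds: balance_qQ and its token-array interface are hypothetical, and it ignores operands, the ctm and color deduplication.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy operators from in[] to out[], dropping any Q that has no open q
 * and appending one Q for every q still open at the end.
 * out[] must have room for 2 * n entries. Returns the output count.
 */
static size_t balance_qQ(const char **in, size_t n, const char **out)
{
    size_t depth = 0, m = 0;

    assert(in != NULL && out != NULL);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(in[i], "q") == 0) {
            depth++;
        } else if (strcmp(in[i], "Q") == 0) {
            if (depth == 0)
                continue; /* unmatched restore: drop it */
            depth--;
        }
        out[m++] = in[i];
    }

    /* Close any graphics state saves the stream left open. */
    for (; depth > 0; depth--)
        out[m++] = "Q";

    return m;
}
```

Because the output always ends at depth zero, further content can be appended to the stream without inheriting leftover graphics state.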
2014-03-19	Make mutool clean sanitise the Dests lists when subsetting.	Robin Watts
	When you use mutool clean to subset pages out of a PDF, we already remove the Name tree entries for named locations that aren't in the target file. Until now, though, we have failed to remove references to these removed names. This can cause errors (really warnings) on reading the file back.
2014-03-13	Tweak pdfclean and pdfinfo to be useful as libraries.	Robin Watts
	Firstly, we remove the use of global variables; this is done by introducing a 'globals' structure for each of these files and passing it internally between functions. Next, split the core of pdfclean_main into pdfclean_clean, and the core of pdfinfo_main into pdfinfo_info. The _main functions now do the argv processing. The new functions are now entirely thread-safe, so can be called from library functions.
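The refactoring pattern described above, a per-call 'globals' structure plus a thin argv-parsing wrapper, can be sketched like this. The names demo_clean, demo_main and keep_page and the struct fields are hypothetical; the real pdfclean_clean signature is not given in this log and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* State that used to live in file-scope globals, now owned by each call. */
typedef struct {
    const char *infile;
    const char *outfile;
    int pages_kept;
} globals;

/* Internal helpers receive the state explicitly instead of touching globals. */
static void keep_page(globals *glo, int page)
{
    (void)page;
    glo->pages_kept++;
}

/* Library-style entry point: no argv, no shared mutable state. */
int demo_clean(const char *infile, const char *outfile, int first, int last)
{
    globals glo = { infile, outfile, 0 };

    assert(infile != NULL && outfile != NULL);

    for (int p = first; p <= last; p++)
        keep_page(&glo, p);
    return glo.pages_kept;
}

/* Command-line wrapper: only parses arguments and forwards them. */
int demo_main(int argc, char **argv)
{
    if (argc != 5)
        return 1; /* usage error */
    return demo_clean(argv[1], argv[2], atoi(argv[3]), atoi(argv[4])) > 0 ? 0 : 1;
}
```

Since every call builds its own globals value on the stack, two threads can run demo_clean at the same time without interfering with each other.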
2013-07-11	Implement dynamic page tree lookups.	Tor Andersson
	No more caching a flattened page tree in doc->page_objs/refs. No more flattening of page resources, rotation and boxes. Smart page number lookup by following Parent links. Naive implementation of insert and delete page that doesn't rebalance the trees. Requires existing page tree to hook into, cannot be used to create a page tree from scratch.
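Page number lookup by following Parent links can be sketched on a simplified in-memory tree. The node struct and page_index function below are hypothetical stand-ins, assuming each node stores its parent, its kids and a count of leaf pages beneath it (as the Count entry does in a PDF page tree); the actual MuPDF code works on PDF dictionary objects instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page tree node; a leaf page has nkids == 0 and count == 1. */
typedef struct node {
    struct node *parent;
    struct node **kids;
    size_t nkids;
    int count; /* number of leaf pages under this node */
} node;

/* Walk up from a page, adding the page counts of all earlier siblings. */
static int page_index(const node *page)
{
    int index = 0;

    assert(page != NULL);

    for (const node *n = page; n->parent != NULL; n = n->parent) {
        const node *up = n->parent;
        for (size_t i = 0; i < up->nkids && up->kids[i] != n; i++)
            index += up->kids[i]->count;
    }
    return index;
}
```

The cost is proportional to the tree depth times the fan-out at each level, rather than to the total number of pages, and no flattened cache has to be kept in sync when pages are inserted or deleted.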
2013-07-04	Update pdf_write_document to support incremental update	Paul Gardiner

2013-06-25	Update pdf_obj's to have a pdf_document field.	Robin Watts
	Remove the fz_context field to avoid the structure growing.
2013-06-20	Rearrange source files.	Tor Andersson