mupdf - MuPDF PDF reader and library

Age	Commit message	Author
2016-12-27	Strip extraneous blank lines.	Tor Andersson

2016-10-07	Add ctx to fz_font functions.	Robin Watts

2016-10-06	Hide internals of fz_colorspace	Robin Watts
	The implementation does not need to be in the public API.
2016-10-05	Move fz_font definition to be private.	Robin Watts
	Move the definition of fz_font to be in a private header file rather than in the public API. Add accessors for specific parts of the structure and use them as appropriate. The font flags, and the harfbuzz records remain public. This means that only 3 files now need access to the font implementation (font.c, pdf-font.c and pdf-type3.c). This may be able to be improved further in future.
2016-09-01	Use fz_convert_color().	Sebastian Rasmussen
	In preference to colorspace internal to_rgb() function pointer.
2016-09-01	Simplify PDF resource caching table handling.	Tor Andersson

2016-07-13	Bug 696892: PDF annotation appearance stream synthesis SEGV	Robin Watts
	The code would SEGV if we were trying to synthesise an appearance stream for an annotation, and the document's PDF resource tables had not been initialised. We now initialise the PDF resource tables when we initialise a PDF device. This is the earliest point we know we are going to need them, and covers all cases.
2016-07-12	Fix bugs in pdf_add_image.	Tor Andersson

2016-07-08	Separate close and drop functionality for devices and writers.	Tor Andersson
	Closing a device or writer may throw exceptions, but many of the foreign language bindings (JNI and JS) depend on drop never throwing an exception (exceptions in finalizers are bad).
2016-06-14	Fix typos in various parts of the code.	Sebastian Rasmussen

2016-04-27	Add fz_close_device function.	Tor Andersson
	Garbage collected languages need a way to signal that they are done with a device other than freeing it. Call it implicitly on fz_drop_device; so take care not to call it again in case it has been explicitly called already.
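	A minimal sketch of the close-once pattern described above. The device struct and the new_device, close_device and drop_device names are hypothetical stand-ins for illustration, not MuPDF's actual fz_device code.

```c
#include <stdlib.h>

/* Hypothetical device: not MuPDF's fz_device, just the same idea. */
typedef struct device {
	int closed;      /* set once the close work has run */
	int close_calls; /* counts how often the close work actually happened */
} device;

static device *new_device(void)
{
	return calloc(1, sizeof(device));
}

/* Finish output. Safe to call repeatedly: only the first call does work. */
static void close_device(device *dev)
{
	if (dev == NULL || dev->closed)
		return;
	dev->closed = 1;
	dev->close_calls++;
}

/* Release the device, closing it implicitly if the caller never did.
 * Returns how many times the close work ran (for illustration only). */
static int drop_device(device *dev)
{
	int calls;
	if (dev == NULL)
		return 0;
	close_device(dev);
	calls = dev->close_calls;
	free(dev);
	return calls;
}
```

	Whether the caller closes explicitly or relies on drop, the close work in this sketch runs exactly once.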
2016-04-26	Change order of arguments to pdf_add_page etc.	Tor Andersson
	Resources are defined before they are used; so it's only logical to have the resource dictionary before the content buffer in the argument list.
2016-04-22	pdf: Remember to drop objects inserted into dicts.	Sebastian Rasmussen

2016-04-04	Fix typo in pdfwrite top-level matrix.	Tor Andersson

2016-03-21	Fix 696661: Missing annotations.	Tor Andersson
	Add an explicit 'page setup' matrix to pdf-write device, which is only used when creating top level page content stream and not the child annotation content streams.
2016-03-15	Fix leak in PDF device.	Robin Watts

2016-03-14	Remove begin_page and end_page device calls.	Tor Andersson
	To be moved into a new document writer interface later.
2016-03-01	Don't use pdf_page struct when creating pages.	Tor Andersson

2016-03-01	Rename pdf_new_ref to pdf_add_object.	Tor Andersson

2016-02-29	pdfwrite: Use Tm directly to set matrix.	Tor Andersson
	Don't mess with Td.
2016-02-29	pdfwrite: Handle all fonts as CID fonts.	Tor Andersson

2016-02-29	pdfwrite: Look through own resource list first.	Tor Andersson
	Don't try creating the resource for each fill_text call.
2016-02-29	Rename pdf_add_simple_font_res and friends.	Tor Andersson

2016-02-29	Pass fz_font to pdf_add_xxx_font_res instead of a fz_buffer.	Tor Andersson
	Make sure all fz_fonts have a ft_buffer available.
2016-02-29	Remove pdf_res struct. Use pdf_obj indirect references directly.	Tor Andersson
	Fix refcounting bugs.
2016-02-29	Fix silly typo. Set default output file for pdfwrite device.	Tor Andersson

2016-02-29	Add mutool create tool, and PDF font and image resource creation.	Michael Vrhel
	Initial framework for creating pdfs
		This adds a create option to mutool for us to use in working on the API for creating content as well as adding content to existing documents.
	mutool create: Get page sizes and add them
		Start the parsing of the contents.txt file which may have multiple page information. Add the pages at the proper sizes.
	Further work on mutool create_pdf
		Remove the calls that were being made to the pdf-write device. Clean up several issues with the reading of the page contents. Get the content streams for each page associated with the page->contents.
		Temporarily created a pdf_create_page_contents procedure. I will merge this with pdf_create_page as there is significant overlap. Next is to add in the font and image resources and indirect references.
	Include pdfcreate in build
	Merge pdf_create_page_contents and pdf_create_page
	Add support for images in pdfcreate
		This adds images to the pdf document using a function stolen from pdf-device (send_image). This was renamed pdf_add_image_res and added to pdf-image. Down the road, send-image will be removed. Prior to that, I need to work on making sure that multiple copies of the same image do not end up in the document. Code was also added to create the page resources to point to the proper image in the document.
		Next fonts will be added in a similar manner, then I will work on computing the md5 sums of image and fonts to ensure only one copy ends up in the document. Then pdf-write will be reworked to use the same code as opposed to its current list of md5 sums that are stored in a device structure.
	mutool pdfcreate: support for WinAnsiEncoded fonts
		Added support for very simple fonts (WinAnsiEncoding). Methods added in pdf-font.c. Added first_width and last_width to fz_font_s and stem_v to pdf_font_desc_s.
		Ran code through memento with simple test of 4 page document creation including an image and a font. Fixed several leaks as well as buffer corruption issues (main changes in pdfcreate). Thanks to Robin for the help with Memento in finding leaks.
		Added StemV to pdf names as it was needed for the font descriptor creation.
	Fix for pdf_write_document rename to pdf_save_document
	Add resource_ids to pdf document structure
		The purpose of this structure will be to allow the search and reuse of resources when we attempt to add new ones to the document.
	Fix name changes from recent updates
		pdf_create branch updated to work with recent changes in master
	Initial use of hash table for resources
		To avoid adding in the same resource this adds a resource_tables member to pdf_document. The resource_tables structure consists of multiple fz_hash_table entries, one for each resource type. When an attempt is made to search for an existing resource, the table will be initialized in a brute force search for existing resources. Currently this is only set up for the image resources and accessed through pdf_add_image_res. If a match is found, the reference object is returned. If no match is found NULL is returned and the ref object created in pdf_add_image_res is added into the hash table.
		In this case, a command line such as
			create -o output.pdf -f F0:font.ttf -i Im0:image.jpg -i Im1:image1.jpg -i Im2:image.jpg contents.txt
		will avoid the insertion of two copies of image.jpg into the output PDF document.
	CID Identity-H Font added for handling ttf
		This adds a method for adding a ttf to a PDF as a CID font with Identity-H mapping and a ToUnicode entry that is created using FT_Get_Char_Index. This takes much care in the creation of the ToUnicode CMap to ensure that the minimum number of entries are created in that we try to use beginbfrange as much as possible before using beginbfchar. The code makes sure to limit the number of entries in a group to 100 and to not cross first-byte boundaries for the CID values as described in the Adobe Technical note 5411.
	Add missing file pdf-resources.c
		pdf-resources.c was missing and should have been committed earlier. Added to windows project file. Not sure where else it needs to be added for the other platforms.
	Clean up names and spacing
		Make sure that the visible functions have the proper namespace (e.g. pdf_xxxx). Also make sure we have a blank line prior to comment. Be consistent with static function naming in pdf_resources.c
	pdfwrite make use of image resource fz_hash_table
		The pdfwrite device now shares the structure that stores the resource images for pdfcreate. With this fix, pdfwrite now avoids duplicating the writing of the same images that are shared across multiple pages.
	Add missing file pdf-resources.c
	Initial work toward having pdfwrite use Identity-H Type0 encoding for fonts
	Finish of CID type0 Identity-H font for pdfwrite
		This adds in the proper widths which may have been stored in the source font in the width table (parsed from the W entry in the pdf file) or if the free type structure has its own cmap then we can get the width from free type. Widths are restructured into format described in 5.6.3 of PDF spec.
	Fix issue from conflict merging and multiple define of structure
	Clean up warnings and make mutool create use simple font
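	The ToUnicode grouping rules mentioned in this commit (prefer ranges, at most 100 entries per group, never cross a first-byte boundary) can be sketched roughly as follows. The helper names and the to_unicode array layout are assumptions made for illustration; this is not the code from pdf-font.c.

```c
#include <stddef.h>

/* Hypothetical helper. to_unicode[gid] holds the Unicode value for glyph
 * id gid. Returns the last glyph id of the longest run starting at 'start'
 * that could be written as one bfrange entry: the Unicode values are
 * consecutive and the run stays within one first-byte (high byte) block
 * of glyph ids. A run of length one (result == start) would be emitted
 * as a bfchar entry instead. */
static size_t bfrange_run_end(const unsigned short *to_unicode, size_t count, size_t start)
{
	size_t end = start;
	while (end + 1 < count &&
		((end + 1) >> 8) == (start >> 8) &&
		to_unicode[end + 1] == to_unicode[end] + 1)
		end++;
	return end;
}

/* Number of begin/end groups needed when each group holds at most 100 entries. */
static size_t group_count(size_t entries)
{
	return (entries + 99) / 100;
}
```

	A writer built on this sketch would walk the glyph ids, emit each multi-glyph run as a range and each single glyph as a char entry, and start a new group after every 100 entries.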
2016-02-24	Add optional scissor hint argument to text clipping functions.	Tor Andersson

2016-02-24	Clarify scissor argument to clip device functions.	Tor Andersson
	The scissor argument is an optional (potentially NULL) rectangle that can give hints to devices about the area that can be scissored. This is used by the draw device and display list device to minimize the size of temporary clip mask buffers. The scissor rectangle, if used, must have been transformed by the current transform matrix.
2016-02-22	Fix leaks in pdf-device.c	Michael Vrhel
	The Forms object should have been dropped once the reference was created for it. Also needed to clean up the group and font objects that the device maintains.
2016-02-22	Rename fz_path_processor to fz_path_walker.	Tor Andersson

2016-02-22	Drop const from fz_image.	Tor Andersson
	Image objects are immutable and opaque once constructed. Therefore there is no need for the const keyword.
2016-02-22	Drop const from fz_shade.	Tor Andersson
	Shading objects are immutable and opaque once constructed. Therefore there is no need for the const keyword.
2016-01-21	Drop const from fz_colorspace.	Tor Andersson
	It's an opaque immutable structure, that we don't expect to ever want to change after creation. Therefore the const keyword is not useful, and is only line noise.
2016-01-21	Drop the const on fz_font.	Tor Andersson
	The font is an immutable opaque structure, no need to add the const keyword since users aren't expected or expecting to change it.
2016-01-13	Add lots of consts.	Robin Watts
	In general, we should use 'const fz_blah' in device calls whenever the callee should not alter the fz_blah. Push this through. This shows up various places where we fz_keep and fz_drop these const things. I've updated the fz_keep and fz_drops with appropriate casts to remove the consts. We may need to do the union dance to avoid the consts for some compilers, but will only do that if required. I think this is nicer overall, even allowing for the const<->no const problems.
2015-12-28	Rename fz_image_get_pixmap to fz_get_pixmap_from_image.	Tor Andersson

2015-12-11	Remove text clip accumulation.	Tor Andersson
	We can now group all clipped text into one fz_text object and simplify the device interface.
2015-12-11	Keep spans of multiple fonts and sizes in one fz_text object.	Tor Andersson

2015-09-28	Bug 696170: Fix typo.	Robin Watts
	sizeof(16) is not 16 :) Thanks to David Binderman for pointing this out.
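	For readers unfamiliar with the pitfall: sizeof applied to the constant 16 yields the size of an int, not the value 16. The digest buffer below is a hypothetical example; the log does not say which code was affected.

```c
#include <stddef.h>
#include <string.h>

/* sizeof(16) is the size of the int constant 16, i.e. sizeof(int). */
static size_t size_of_literal(void)
{
	return sizeof(16);
}

/* Hypothetical example: clear a 16-byte digest the right way. */
static void clear_digest(unsigned char digest[16])
{
	/* Pass the byte count itself. sizeof(16) would give sizeof(int), and
	 * sizeof(digest) would give the size of a pointer, because an array
	 * parameter is adjusted to a pointer. */
	memset(digest, 0, 16);
}
```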
2015-06-29	Rejig the internals of fz_image slightly.	Robin Watts
	Previously, we had people calling image->get_pixmap directly. Now we have them all call fz_image_get_pixmap, which will look for a cached version in the store, and only call get_pixmap if required. Previously fz_image_get_pixmap used to look for the cached version in the store, and decode if not - hence the decoding code is now extracted out into standard_image_get_pixmap. This was the original intent of the code, it just somehow didn't end up like that. This nicely queues us up for being able to have fz_images that use a different get_pixel implementation, such as that which will be required for the gprf code.
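	The lookup-then-decode arrangement described here might look roughly like this. It is a simplified sketch: a single cached slot stands in for the store, an int stands in for a pixmap, and every name is hypothetical.

```c
#include <stddef.h>

typedef struct image image;

/* Hypothetical image: not MuPDF's fz_image. */
struct image {
	int cached_valid;              /* is there a cached result? */
	int cached_pixmap;             /* stand-in for a decoded pixmap */
	int decode_calls;              /* how often the decoder actually ran */
	int (*get_pixmap)(image *img); /* per-image decoder, replaceable */
};

/* The default decoder, split out so other images can supply their own. */
static int standard_get_pixmap(image *img)
{
	img->decode_calls++;
	return 42; /* pretend this is the decoded data */
}

/* The single public entry point: consult the cache, decode only on a miss. */
static int image_get_pixmap(image *img)
{
	if (!img->cached_valid) {
		img->cached_pixmap = img->get_pixmap(img);
		img->cached_valid = 1;
	}
	return img->cached_pixmap;
}
```

	Because callers only go through image_get_pixmap in this sketch, an image with a different decoder gets caching for free.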
2015-04-03	Bug 694713: Avoid assert when using pdf_page_write	Robin Watts
	When writing a pdf page, we pass page->contents to pdf_new_pdf_device. This object is assumed to be a dictionary (stream) that can be updated with the Length and stream contents once the page writing process has completed. When we are rewriting a pdf page however, this can go wrong; page->contents can be an array of objects. Not only this, in general it would be possible for several pages to share the same page contents (or maybe some of the elements of a page contents array). Updating one page should not update the others. We therefore update pdf_page_write to always create a new page->contents object and use that. Thanks to Michael Cadilhac for spotting the basic problem here.
2015-03-24	Path rework for improved memory usage.	Robin Watts
	Firstly, we make the definition of the path structures local to path.c. This is achieved by using an fz_path_processor function to step through paths enumerating each section using callback functions. Next, we extend the internal path representation to include other section types, including quads, beziers with common control points, rectangles, horizontal, vertical and degenerate lines. We also roll close path sections up into the previous section's commands. The hairiest part of this is that fz_transform_path has to cope with changing the path commands depending on the matrix. This is a relatively rare operation though.
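	The callback-based enumeration could be sketched as below. The path layout, the command set and all names are assumptions made for illustration; the real structures are private to path.c and richer than this.

```c
#include <stddef.h>

/* Hypothetical, reduced command set. */
typedef enum { CMD_MOVETO, CMD_LINETO, CMD_CLOSE } path_cmd;

/* Hypothetical path: a command list plus a flat coordinate array. */
typedef struct {
	const path_cmd *cmds;
	size_t cmd_count;
	const float *coords;
} path;

/* Callbacks the caller supplies; the path internals stay hidden. */
typedef struct {
	void (*moveto)(void *arg, float x, float y);
	void (*lineto)(void *arg, float x, float y);
	void (*closepath)(void *arg);
} path_walker;

/* Step through the path, handing each section to the matching callback. */
static void walk_path(const path *p, const path_walker *w, void *arg)
{
	size_t i, k = 0;
	for (i = 0; i < p->cmd_count; i++) {
		switch (p->cmds[i]) {
		case CMD_MOVETO:
			w->moveto(arg, p->coords[k], p->coords[k + 1]);
			k += 2;
			break;
		case CMD_LINETO:
			w->lineto(arg, p->coords[k], p->coords[k + 1]);
			k += 2;
			break;
		case CMD_CLOSE:
			w->closepath(arg);
			break;
		}
	}
}

/* Example walker that just counts the sections it is shown. */
typedef struct { int moves, lines, closes; } segment_counts;

static void count_moveto(void *arg, float x, float y)
{
	(void)x; (void)y;
	((segment_counts *)arg)->moves++;
}

static void count_lineto(void *arg, float x, float y)
{
	(void)x; (void)y;
	((segment_counts *)arg)->lines++;
}

static void count_closepath(void *arg)
{
	((segment_counts *)arg)->closes++;
}

static const path_walker counting_walker = { count_moveto, count_lineto, count_closepath };
```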
2015-03-24	Rework handling of PDF names for speed and memory.	Robin Watts
	Currently, every PDF name is allocated in a pdf_obj structure, and comparisons are done using strcmp. Given that we can predict most of the PDF names we'll use in a given file, this seems wasteful. The pdf_obj type is opaque outside the pdf-object.c file, so we can abuse it slightly without anyone outside knowing. We collect a sorted list of names used in PDF (resources/pdf/names.txt), and we add a utility (namedump) that preprocesses this into 2 header files. The first (include/mupdf/pdf/pdf-names-table.h, included as part of include/mupdf/pdf/object.h), defines a set of "PDF_NAME_xxxx" entries. These are pdf_obj *'s that callers can use to mean "A PDF object that means literal name 'xxxx'". The second (source/pdf/pdf-name-impl.h) is a C array of names. We therefore update the code so that rather than passing "xxxx" to functions (such as pdf_dict_gets(...)) we now pass PDF_NAME_xxxx (to pdf_dict_get(...)). This is a fairly natural (if widespread) change. The pdf_dict_getp (and sibling) functions that take a path (e.g. "foo/bar/baz") are therefore supplemented with equivalents that take a list (pdf_dict_getl(... , PDF_NAME_foo, PDF_NAME_bar, PDF_NAME_baz, NULL)). The actual implementation of this relies on the fact that small pointer values are never valid values. For a given pdf_obj *p, if NULL < (intptr_t)p < PDF_NAME__LIMIT then p is a literal entry in the name table. This enables us to do fast pointer compares and to skip expensive strcmps. Also, bring "null", "true" and "false" into the same style as PDF names. Rather than using full pdf_obj structures for null/true/false, use special pointer values just above the PDF_NAME_ table. This saves memory and makes comparisons easier.
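	The small-pointer trick can be illustrated with a reduced sketch. The enum, the three-entry table and the helper names are hypothetical; the real table is generated from names.txt. Converting small integers to pointers is implementation-defined in C, which this sketch relies on just as the commit describes.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

typedef struct obj obj; /* opaque to callers, as pdf_obj is */

/* Hypothetical name ids; 0 is reserved so that NULL is never a name. */
enum { NAME_NONE = 0, NAME_Type, NAME_Length, NAME_Filter, NAME__LIMIT };

static const char *const name_table[NAME__LIMIT] = { "", "Type", "Length", "Filter" };

/* A built-in name "object" is just its table index cast to a pointer. */
#define NAME_OBJ(id) ((obj *)(intptr_t)(id))

/* True if p is a table index in disguise rather than a real allocation. */
static int is_builtin_name(const obj *p)
{
	intptr_t v = (intptr_t)p;
	return v > 0 && v < NAME__LIMIT;
}

/* Recover the text of a built-in name, or NULL for anything else. */
static const char *builtin_name_text(const obj *p)
{
	return is_builtin_name(p) ? name_table[(intptr_t)p] : NULL;
}

/* Two built-in names are equal exactly when the pointers are equal. */
static int builtin_name_eq(const obj *a, const obj *b)
{
	return a == b;
}
```

	Comparing two built-in names in this sketch is a single pointer compare, with no strcmp and no allocation.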
2015-03-20	Automatically update /Length and /Filter in pdf_update_stream.	Tor Andersson

2015-02-25	Allow pdf_device to be created with pre-populated buffer.	Robin Watts
	When watermarking, we may want to use the PDF device on an existing buffer. In this case, we have no 'contents' object.
2015-02-17	Use embedded superclass struct instead of user pointer in devices.	Tor Andersson

2015-02-17	Add ctx parameter and remove embedded contexts for API regularity.	Tor Andersson
	Purge several embedded contexts: Remove embedded context in fz_output. Remove embedded context in fz_stream. Remove embedded context in fz_device. Remove fz_rebind_stream (since it is no longer necessary). Remove embedded context in svg_device. Remove embedded context in XML parser. Add ctx argument to fz_document functions. Remove embedded context in fz_document. Remove embedded context in pdf_document. Remove embedded context in pdf_obj. Make fz_page independent of fz_document in the interface. We shouldn't need to pass the document to all functions handling a page. If a page is tied to the source document, it's redundant; otherwise it's just pointless. Fix reference counting oddity in fz_new_image_from_pixmap.
2015-02-17	Rename fz_close_* and fz_free_* to fz_drop_*.	Tor Andersson
	Rename fz_close to fz_drop_stream. Rename fz_close_archive to fz_drop_archive. Rename fz_close_output to fz_drop_output. Rename fz_free_* to fz_drop_*. Rename pdf_free_* to pdf_drop_*. Rename xps_free_* to xps_drop_*.
2014-05-29	fix memory leaks during PDF document creation	Simon Bünzli
	pdf_create_document leaks the trailer and in pdf-device.c many objects are inserted into dictionaries using pdf_dict_puts and leaked instead of using pdf_dict_puts_drop.
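	The ownership difference behind this leak can be shown with a toy reference-counted object. Everything here (obj, dict, dict_put, dict_put_drop) is a hypothetical stand-in and not the MuPDF implementation; the sketch only assumes that the plain put keeps its own reference while the _drop variant also releases the caller's.

```c
#include <stdlib.h>

/* Hypothetical reference-counted object. */
typedef struct { int refs; } obj;

static obj *new_obj(void)
{
	obj *o = malloc(sizeof *o);
	if (o)
		o->refs = 1;
	return o;
}

static obj *keep_obj(obj *o)
{
	if (o)
		o->refs++;
	return o;
}

static void drop_obj(obj *o)
{
	if (o && --o->refs == 0)
		free(o);
}

/* A one-entry "dictionary", enough to show who owns what. */
typedef struct { obj *slot; } dict;

/* The dict takes its own reference; the caller still owns theirs and
 * leaks the object if they forget to drop it. */
static void dict_put(dict *d, obj *val)
{
	obj *old = d->slot;
	d->slot = keep_obj(val); /* keep first, so put(d, d->slot) is safe */
	drop_obj(old);
}

/* As dict_put, but also releases the caller's reference. */
static void dict_put_drop(dict *d, obj *val)
{
	dict_put(d, val);
	drop_obj(val);
}
```

	With dict_put the caller must follow up with drop_obj on the value; with dict_put_drop the dictionary ends up holding the only reference.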