mupdf - MuPDF PDF reader and library

Age	Commit message	Author
2016-03-28	MSVC: More solution tweaking.	Robin Watts
	Avoid library warnings when including libfonts.
2016-03-24	MSVC: Add in missing font.	Robin Watts
	I missed a font while reworking the generate.bat font generation.
2016-03-23	Fix MSVC builds.	Robin Watts
	Update generate.bat to generate generate/fontname.c files rather than generate/fontname.{ttc,ttf,cff} etc. Add a new libfonts target that builds those, and make libmupdf depend on it. Fix build problem in load-bmp.c - don't declare in the middle of blocks.
2016-03-23	Compile embedded fonts in separate C files.	Tor Andersson
	Also change unsigned char into const char for embedded data.
2016-03-21	Add .ps output to mutool draw.	Robin Watts
	Simple PS wrapped images with flate compression.
2016-03-16	glyph plotter; Use repeated inclusion of header	Robin Watts
	To avoid having to duplicate a fairly large block of code several times, use repeated inclusion of a header with some macros to generate optimised glyph plotters.
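The idea of stamping out several optimised plotters from one body of code can be sketched as follows. This is only an illustration, not the code from the commit: to keep it in a single file, a function-generating macro stands in for the header that the commit includes repeatedly, and the names `DEFINE_PLOTTER`, `plot_gray_solid` and friends, as well as the blending formula, are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical template. In the commit the shared body lives in a header
 * that is #included several times with different macro settings; here a
 * macro that defines a whole function plays the same role.
 *   NAME       name of the generated function
 *   N          number of colour components per pixel
 *   HAS_ALPHA  1 if color[N] holds an extra alpha value, 0 otherwise
 */
#define DEFINE_PLOTTER(NAME, N, HAS_ALPHA) \
void NAME(uint8_t *dst, const uint8_t *mask, const uint8_t *color, size_t w) \
{ \
	size_t x; \
	int k; \
	for (x = 0; x < w; x++) { \
		int m = mask[x]; \
		int a = (HAS_ALPHA) ? (m * color[N]) / 255 : m; \
		for (k = 0; k < (N); k++) \
			dst[x * (N) + k] = (uint8_t) \
				((dst[x * (N) + k] * (255 - a) + color[k] * a) / 255); \
	} \
}

/* Each line below expands to one specialised plotter. */
DEFINE_PLOTTER(plot_gray_solid, 1, 0)
DEFINE_PLOTTER(plot_gray_alpha, 1, 1)
DEFINE_PLOTTER(plot_rgb_solid, 3, 0)
DEFINE_PLOTTER(plot_rgb_alpha, 3, 1)
```

Because `N` and `HAS_ALPHA` are compile-time constants in each expansion, the compiler can unroll the inner loop and drop the unused branch, which is the point of generating the variants rather than writing one generic function.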
2016-03-07	MSVC: Solution tweaks.	Robin Watts
	Add Memento configurations for mupdf-gl and libglfw to solve build warnings.
2016-03-01	MSVC: Add mujs include path to mutool build.	Robin Watts
	Required to find mujs.h
2016-02-29	jni: First attempt at generic JNI bindings.	Robin Watts
	The purpose of JNI bindings is to allow MuPDF to be driven from Java. There are several possible use cases here.
	Firstly, and most simply, a Java application can ask the core of MuPDF to open a document and render it using the existing devices to produce output on a standard Java bitmap.
	Secondly, a Java application might want to drive the device interface itself, making use of the standard MuPDF devices (such as using the rendering engine to render high quality graphics).
	Thirdly, a Java application might want to implement its own device and then call MuPDF to run the document to that device (perhaps to do custom text or image extraction).
	The first of these cases requires a simple reflection of the main document and standard device classes in JNI. The second of these cases requires the actual device interface itself to be made available as a Java interface, together with the ability to construct and manipulate data types like paths, text and fonts so the Java code can build the required objects to pass to implementers of the device interface. The final case requires a reflection layer whereby calls through the device interface in C can be turned into method calls to a Java interface.
	All of this is attempted in this commit. Some highlights:
	* For each type in the C (such as fz_colorspace) we have a corresponding Java class (such as ColorSpace).
	* Where the 'fz_' types are reference counted (such as an fz_colorspace), the Java objects (such as ColorSpace) simply take a reference to a pointer to the underlying fz type. Java accessor methods are then provided to manipulate these types.
	* Where the 'fz_' types are not reference counted (such as an fz_rect), the data is actually contained within the Java object itself (such as Rect, RectI and Transform).
	* We add a VS jni project. This doesn't do anything except make the files accessible for editing in the IDE.
	* As much as possible, the Java layers do nothing other than some programmer friendly type overloading, construction (unavoidable, as it can't be done in JNI) and boiler-plate destruction. All the smartness is done in the C.
	* Due to Java and C's differing approach to constness, we need to be careful that a Java device does not destructively alter objects passed to it. For example, consider running a display list through a device implemented in Java. If the Java device were to change a Font object passed to it, this might affect other objects in the display list that shared the same underlying fz_font. Possibly we can achieve this by having an 'isConst' flag on Java objects that are created from device calls and passed to the Java device (see the Text class, for an attempt at this currently). This could alternatively be achieved by cloning every such piece of data (see the path code for an example of this approach), but this is probably slow. Better to clone 'just in time' as the first write operation is done to the object.
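The 'clone just in time' idea from the JNI commit message can be sketched in plain C. This is a minimal illustration under assumptions, not the binding code: `native_font`, `font_handle` and the functions below are hypothetical stand-ins for a shared, reference-counted fz object and the Java-side wrapper that carries an 'isConst' style flag.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for shared, reference-counted native data. */
typedef struct {
	int refs;
	float size;
} native_font;

/* Hypothetical wrapper, playing the role of the Java-side object. */
typedef struct {
	native_font *font;
	int is_const; /* set when the object was handed over by a device call */
} font_handle;

native_font *native_font_new(float size)
{
	native_font *f = malloc(sizeof *f);
	if (f) {
		f->refs = 1;
		f->size = size;
	}
	return f;
}

void native_font_drop(native_font *f)
{
	if (f && --f->refs == 0)
		free(f);
}

/* Wrap shared data without copying it, and mark the wrapper read-only. */
font_handle font_handle_borrow(native_font *f)
{
	font_handle h;
	assert(f != NULL);
	f->refs++;
	h.font = f;
	h.is_const = 1;
	return h;
}

/* The first write through a const handle clones the data; later writes
 * go straight to the private copy. Returns 0 on success, -1 on failure. */
int font_handle_set_size(font_handle *h, float size)
{
	assert(h != NULL && h->font != NULL);
	if (h->is_const) {
		native_font *copy = native_font_new(h->font->size);
		if (!copy)
			return -1;
		native_font_drop(h->font); /* give back the shared reference */
		h->font = copy;
		h->is_const = 0;
	}
	h->font->size = size;
	return 0;
}
```

A handle that is only ever read never pays for a copy, which is the advantage over cloning every object up front.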
2016-02-29	js: Add "mutool run" tool to run javascript scripts.	Tor Andersson
	Use an API similar to the JNI bindings.
2016-02-29	Add mutool create tool, and PDF font and image resource creation.	Michael Vrhel
	* Initial framework for creating pdfs
	  This adds a create option to mutool for us to use in working on the API for creating content as well as adding content to existing documents.
	* mutool create: Get page sizes and add them
	  Start the parsing of the contents.txt file which may have multiple page information. Add the pages at the proper sizes.
	* Further work on mutool create_pdf
	  Remove the calls that were being made to the pdf-write device. Clean up several issues with the reading of the page contents. Get the content streams for each page associated with the page->contents. Temp. created a pdf_create_page_contents procedure. I will merge this with pdf_create_page as there is significant overlap. Next is to add in the font and image resources and indirect references.
	* Include pdfcreate in build
	* Merge pdf_create_page_contents and pdf_create_page
	* Add support for images in pdfcreate
	  This adds images to the pdf document using a function stolen from pdf-device (send_image). This was renamed pdf_add_image_res and added to pdf-image. Down the road, send-image will be removed. Prior to that, I need to work on making sure that multiple copies of the same image do not end up in the document. Code was also added to create the page resources to point to the proper image in the document. Next fonts will be added in a similar manner, then I will work on computing the md5 sums of image and fonts to ensure only one copy ends up in the document. Then pdf-write will be reworked to use the same code as opposed to its current list of md5 sums that are stored in a device structure.
	* mutool pdfcreate: support for WinAnsiEncoded fonts
	  Added support for very simple fonts (WinAnsiEncoding). Methods added in pdf-font.c. Added first_width and last_width to fz_font_s and stem_v to pdf_font_desc_s. Ran code through memento with simple test of 4 page document creation including an image and a font. Fixed several leaks as well as buffer corruption issues (main changes in pdfcreate). Thanks to Robin for the help with Memento in finding leaks. Added StemV to pdf names as it was needed for the font descriptor creation.
	* Fix for pdf_write_document rename to pdf_save_document
	* Add resource_ids to pdf document structure
	  The purpose of this structure will be to allow the search and reuse of resources when we attempt to add new ones to the document.
	* Fix name changes from recent updates
	* pdf_create branch updated to work with recent changes in master
	* Initial use of hash table for resources
	  To avoid adding in the same resource this adds a resource_tables member to pdf_document. The resource_tables structure consists of multiple fz_hash_table entries, one for each resource type. When an attempt is made to search for an existing resource, the table will be initialized in a brute force search for existing resources. Currently this is only set up for the image resources and accessed through pdf_add_image_res. If a match is found, the reference object is returned. If no match is found NULL is returned and the ref object created in pdf_add_image_res is added into the hash table. In this case, a command line such as
	    create -o output.pdf -f F0:font.ttf -i Im0:image.jpg -i Im1:image1.jpg \
	      -i Im2:image.jpg contents.txt
	  will avoid the insertion of two copies of image.jpg into the output PDF document.
	* CID Identity-H Font added for handling ttf
	  This adds a method for adding a ttf to a PDF as a CID font with Identity-H mapping and a ToUnicode entry that is created using FT_Get_Char_Index. This takes much care in the creation of the ToUnicode CMap to ensure that the minimum number of entries are created in that we try to use beginbfrange as much as possible before using beginbfchar. The code makes sure to limit the number of entries in a group to 100 and to not cross first-byte boundaries for the CID values as described in the Adobe Technical note 5411.
	* Add missing file pdf-resources.c
	  pdf-resources.c was missing and should have been committed earlier. Added to windows project file. Not sure where else it needs to be added for the other platforms.
	* Clean up names and spacing
	  Make sure that the visible functions have the proper namespace (e.g. pdf_xxxx). Also make sure we have a blank line prior to comment. Be consistent with static function naming in pdf_resources.c
	* pdfwrite make use of image resource fz_hash_table
	  The pdfwrite device now shares the structure that stores the resource images for pdfcreate. With this fix, pdfwrite now avoids duplicating the writing of the same images that are shared across multiple pages.
	* Add missing file pdf-resources.c
	* Initial work toward having pdfwrite use Identity-H Type0 encoding for fonts
	* Finish of CID type0 Identity-H font for pdfwrite
	  This adds in the proper widths which may have been stored in the source font in the width table (parsed from the W entry in the pdf file) or if the free type structure has its own cmap then we can get the width from free type. Widths are restructured into format described in 5.6.3 of PDF spec.
	* Fix issue from conflict merging and multiple define of structure
	* Clean up warnings and make mutool create use simple font
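The ToUnicode grouping rules described in this commit message (prefer ranges over single entries, at most 100 entries per group, never let a range cross a first-byte boundary of the CID) can be sketched roughly as below. This is an illustration written from the commit text alone, not the code in pdf-font.c; `cid_map`, `bf_run_length`, `bf_count_entries` and `bf_group_count` are hypothetical names, and the input is assumed to be sorted by CID.

```c
#include <stddef.h>

#define BF_GROUP_MAX 100 /* entries per beginbfrange/beginbfchar group */

/* Hypothetical CID to Unicode pair, sorted by cid in the input array. */
typedef struct {
	unsigned short cid;
	unsigned short uni;
} cid_map;

typedef struct {
	size_t ranges; /* entries that would go into beginbfrange */
	size_t chars;  /* entries that would go into beginbfchar */
} bf_counts;

/* Length of the run starting at index i in which CID and Unicode both
 * step by one and the high byte of the CID stays the same. */
size_t bf_run_length(const cid_map *m, size_t n, size_t i)
{
	size_t len = 1;
	while (i + len < n &&
		(size_t)m[i + len].cid == (size_t)m[i].cid + len &&
		(size_t)m[i + len].uni == (size_t)m[i].uni + len &&
		(m[i + len].cid >> 8) == (m[i].cid >> 8))
		len++;
	return len;
}

/* Walk the table once, turning every run longer than one into a range
 * entry and every isolated mapping into a single-character entry. */
bf_counts bf_count_entries(const cid_map *m, size_t n)
{
	bf_counts c;
	size_t i = 0;
	c.ranges = 0;
	c.chars = 0;
	while (i < n) {
		size_t len = bf_run_length(m, n, i);
		if (len > 1)
			c.ranges++;
		else
			c.chars++;
		i += len;
	}
	return c;
}

/* Number of groups needed when each group holds at most BF_GROUP_MAX. */
size_t bf_group_count(size_t entries)
{
	return (entries + BF_GROUP_MAX - 1) / BF_GROUP_MAX;
}
```

For example, CIDs 0x00FE, 0x00FF, 0x0100, 0x0101 mapped to consecutive code points form two ranges rather than one, because the run is split where the CID's high byte changes from 0x00 to 0x01.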
2016-02-26	Add harfbuzz path to other VS configs (e.g. x64) as well as set Preproc defines.	Michael Vrhel

2016-02-03	Bug 696546: Add fast strtof	Robin Watts
	Take on a (slightly tweaked) version of Simon Reinhardt's patch. The actual logic is left entirely unchanged; minor changes have been made to the names of functions/types to avoid clashing in the cmapdump.c repeated inclusion. Currently this should really only affect xps files, as strtof is only used as fz_atof, and that's (effectively) all xps for now. I will look at updating lex_number to call this in future.
2016-01-29	Force all harfbuzz allocations through our allocators.	Robin Watts
	Because of a shortcoming in harfbuzz, we can't easily force all its allocations through our allocators. We fudge it, with the addition of some macros to change malloc/free/calloc into hb_malloc/hb_free/hb_calloc. To prevent thread safety issues, we use our freetype lock around calls to harfbuzz. We stash the current context in a static var.
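The macro trick can be sketched as follows. Only the names hb_malloc, hb_free and hb_calloc come from the commit message; everything else here is an assumption for illustration: `app_context` is a hypothetical stand-in for the real context, the wrappers merely count allocations instead of calling a real custom allocator, and the locking that the commit performs around harfbuzz calls is omitted.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical application context; here it only tracks live blocks. */
typedef struct {
	size_t live;
} app_context;

/* Stashed context. In the commit this is only meaningful while the lock
 * around the third-party call is held; locking is left out of this sketch. */
static app_context *hb_ctx;

void hb_set_context(app_context *ctx)
{
	hb_ctx = ctx;
}

/* The wrappers are defined before the macros, so they reach the real
 * allocator functions. */
void *hb_malloc(size_t size)
{
	void *p = malloc(size);
	if (p)
		hb_ctx->live++;
	return p;
}

void *hb_calloc(size_t n, size_t size)
{
	void *p = calloc(n, size);
	if (p)
		hb_ctx->live++;
	return p;
}

void hb_free(void *p)
{
	if (p) {
		hb_ctx->live--;
		free(p);
	}
}

/* From here on the code stands in for the third-party sources, which are
 * compiled with the allocator names redirected. */
#define malloc hb_malloc
#define calloc hb_calloc
#define free hb_free

char *thirdparty_copy(const char *s)
{
	size_t n = strlen(s) + 1;
	char *d = malloc(n); /* expands to hb_malloc(n) */
	if (d)
		memcpy(d, s, n);
	return d;
}

void thirdparty_release(char *s)
{
	free(s); /* expands to hb_free(s) */
}

/* Stop redirecting once the third-party section ends. */
#undef malloc
#undef calloc
#undef free
```

The static variable is what makes the lock necessary: two threads inside the redirected code at once would otherwise see each other's context.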
2016-01-28	Add harfbuzz thirdparty submodule.	Tor Andersson

2016-01-28	Add Noto fallback fonts.	Tor Andersson
	Look up fallback fonts by unicode script, with a flag to select the serif or sans-serif font style where such variants exist. Move all builtin fonts into fitz namespace.
2016-01-18	First import of bidi code.	Robin Watts

2016-01-13	VS Solution: Add fz_pool files.	Robin Watts

2015-12-28	Drop 'jsimp' abstraction and use mujs directly.	Tor Andersson

2015-12-22	Update jbig2dec to latest.	Robin Watts
	In particular this takes on the Memento fixes for bug 696183.
2015-12-18	Remove fz_save_document and use pdf_save_document directly instead.	Tor Andersson
	In preparation of adding pdf_write_document that writes a document to a fz_output stream.
2015-12-11	win32: Always build 'generated' in 32-bit mode.	Tor Andersson
	Otherwise we can't run file generation tools with a 64-bit target on a 32-bit host.
2015-11-12	gl: Add x64 target to MSVC project files.	Tor Andersson

2015-10-14	gl: Fix win32 release mode build.	Tor Andersson
	Always build with the 'windows' subsystem and use WinMain. Turn on USE_OUTPUT_DEBUG_STRING to capture fz_warn and fz_throw error messages.
2015-10-06	Update freetype submodule to version 2.6.1.	Tor Andersson

2015-10-06	gl: Windows stuff.	Tor Andersson
	* Add icons to application and window. * Open file dialog if no command line argument. * Install file type associations.
2015-10-06	gl: Split text field handling into separate file and add keyboard focus.	Tor Andersson

2015-10-06	gl: Add an internal header file for GL application.	Tor Andersson

2015-10-06	gl: Fix MSVC warnings.	Tor Andersson

2015-10-06	gl: Use GLFW instead of GLUT.	Tor Andersson
	Add OpenGL text rendering using textured quads, instead of using glut bitmap fonts.
2015-10-06	xps: Add separate link parsing step.	Tor Andersson
	Don't rely on having to run the page once with an identity transform before being able to load the links.
2015-09-28	Workaround for VS2005 linker bug.	Robin Watts
	Disable the use of link time code generation. I don't believe this costs us much (if anything) and it avoids VS2005 dying on release builds due to a bug triggered by the use of 64bit fz_off_t.
2015-09-14	Add utility functions to help reduce device creation boilerplate.	Tor Andersson

2015-08-24	Revert revert of WinMain utf-8 handling and fix the bugs.	Tor Andersson
	Also fix a few ifdefs in time.c so that it builds on MinGW.
2015-08-17	Revert "win32: Convert argv to utf-8 and use regular getopt."	Robin Watts
	Neatness doesn't override actually working. This reverts commit efb5a38ca0bac3537ceaf3383681a518df133143.
2015-07-31	win32: Convert argv to utf-8 and use regular getopt.	Tor Andersson
	Easier than duplicating getopt for wchar_t, since we already have windows specific functions to convert wchar_t strings.
2015-07-30	Add load-gif.c to visual studio project	Michael Vrhel
	Windows build was broken with commit 642a59a4de683a1359733229943be285e3e45c4f
2015-07-20	Code to generate a GProof file from a currently opened document.	Robin Watts
	Given a document, generate a gproof file from it. This encapsulates the name of the file, the desired resolution for proofing, and the page dimensions of all the pages in the file. The idea is that an app will call this when it is asked to go into 'proofing' mode, and will reinvoke itself on this file. This gives the gprf document handler just enough information to fake up a document of n pages of the required sizes. Each page will then be autogenerated on demand.
2015-07-20	First cut at gprf document handler.	Robin Watts
	Doesn't actually trigger generation from ghostscript, or load images from files generated by ghostscript yet.
2015-06-29	Add Separation class to fitz.	Robin Watts
	Simple set of functions for managing sets of separations. Separations have names, equivalent rgb/cmyk colors, and can be enabled/disabled.
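A set of separations with the three properties listed (a name, equivalent rgb/cmyk colours, an enabled/disabled state) might look roughly like this. It is a sketch for orientation only: the `demo_` types and functions, the fixed capacity and the packed colour layout are all assumptions, not the fitz API added by the commit.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_SEPS 8 /* arbitrary capacity for the sketch */

/* Hypothetical record for one separation. */
typedef struct {
	char name[32];
	uint32_t rgb;  /* equivalent colour packed as 0x00RRGGBB */
	uint32_t cmyk; /* equivalent colour packed as 0xCCMMYYKK */
	int disabled;
} demo_separation;

typedef struct {
	int count;
	demo_separation sep[DEMO_MAX_SEPS];
} demo_separations;

void demo_separations_init(demo_separations *s)
{
	memset(s, 0, sizeof *s);
}

/* Append a separation; returns its index, or -1 if it does not fit. */
int demo_add_separation(demo_separations *s, const char *name,
	uint32_t rgb, uint32_t cmyk)
{
	demo_separation *e;
	if (s->count == DEMO_MAX_SEPS || strlen(name) >= sizeof s->sep[0].name)
		return -1;
	e = &s->sep[s->count];
	strcpy(e->name, name);
	e->rgb = rgb;
	e->cmyk = cmyk;
	e->disabled = 0;
	return s->count++;
}

/* Enable or disable one separation; returns -1 for a bad index. */
int demo_set_enabled(demo_separations *s, int i, int enabled)
{
	if (i < 0 || i >= s->count)
		return -1;
	s->sep[i].disabled = !enabled;
	return 0;
}

int demo_count_enabled(const demo_separations *s)
{
	int i, n = 0;
	for (i = 0; i < s->count; i++)
		if (!s->sep[i].disabled)
			n++;
	return n;
}
```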
2015-06-29	Add an fz_tempfile utility.	Robin Watts
	This will be required for the gprf work.
2015-06-26	Bug 696053: Update windows mupdf to respect command line flags.	Robin Watts
	Previously, only the unix executable had been updated to take command line flags; update the windows one in line with it. We have to cope with the argv being in Unicode; add a windows specific version of getoptw for this. Also note that fprintf's in the windows mupdf exe won't work as GUI apps don't have a console window, and can't write to the parent one. Fixing that is a larger project than I have time for right now.
2015-06-03	Enable FZ_LARGEFILE for all windows builds.	Robin Watts
	People worrying about the minimal extra memory this takes can disable it if required.
2015-06-02	Ensure that we can still build mudraw standalone if we want to.	Robin Watts
	MUDRAW_STANDALONE forces mudraw_main to be just main. Set this in the mudraw VS project.
2015-05-25	Update VS solution with mutool changes.	Robin Watts
	mudraw.c must be included into mutool.
2015-05-07	Add some missing headers to MSVC solution.	Robin Watts

2015-04-09	Remove the _no_run functions.	Tor Andersson
	The new pdfclean sanitize functionality means that mutool now needs the data files, so maintaining the split that was designed to keep data files out of mutool is no longer viable.
2015-04-06	Add mutool pages subcommand.	Robin Watts
	Inspired by bug 695823. Mutool can now dump the sizes and orientations for pages within a given file.
2015-04-06	Move the guts of pdfclean into the lib.	Robin Watts
	Michael needs to be able to call pdfclean from gsview. At the moment he's having to do this by including the pdfclean.c file into the lib build, and then calling pdfclean_main with a faked up command line. This isn't nice. pdfclean.c is implemented by pdfclean_main parsing the options/filenames out of argv and then passing the filenames/options on to a pdfclean_clean function. This seems like a much nicer API to offer to the world. We therefore pull the guts of pdfclean.c (pdfclean_clean and its subsidiary structures/functions) into pdf-clean-file.c and include this in the library build. This leaves pdfclean.c just as the command line parsing. This should not affect the size of any of the resulting binaries.
2015-03-24	Rework handling of PDF names for speed and memory.	Robin Watts
	Currently, every PDF name is allocated in a pdf_obj structure, and comparisons are done using strcmp. Given that we can predict most of the PDF names we'll use in a given file, this seems wasteful. The pdf_obj type is opaque outside the pdf-object.c file, so we can abuse it slightly without anyone outside knowing.
	We collect a sorted list of names used in PDF (resources/pdf/names.txt), and we add a utility (namedump) that preprocesses this into 2 header files. The first (include/mupdf/pdf/pdf-names-table.h, included as part of include/mupdf/pdf/object.h), defines a set of "PDF_NAME_xxxx" entries. These are pdf_obj *'s that callers can use to mean "A PDF object that means literal name 'xxxx'". The second (source/pdf/pdf-name-impl.h) is a C array of names.
	We therefore update the code so that rather than passing "xxxx" to functions (such as pdf_dict_gets(...)) we now pass PDF_NAME_xxxx (to pdf_dict_get(...)). This is a fairly natural (if widespread) change. The pdf_dict_getp (and sibling) functions that take a path (e.g. "foo/bar/baz") are therefore supplemented with equivalents that take a list (pdf_dict_getl(... , PDF_NAME_foo, PDF_NAME_bar, PDF_NAME_baz, NULL)).
	The actual implementation of this relies on the fact that small pointer values are never valid values. For a given pdf_obj p, if NULL < (intptr_t)p < PDF_NAME__LIMIT then p is a literal entry in the name table. This enables us to do fast pointer compares and to skip expensive strcmps.
	Also, bring "null", "true" and "false" into the same style as PDF names. Rather than using full pdf_obj structures for null/true/false, use special pointer values just above the PDF_NAME_ table. This saves memory and makes comparisons easier.
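The small-pointer trick can be illustrated with a tiny stand-alone sketch. None of this is the code from pdf-object.c: the `demo_` names, the three-entry table and the lookup function are assumptions for the example, and converting small integers to pointers is implementation-defined behaviour that happens to work on common platforms.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct demo_obj demo_obj; /* opaque, as pdf_obj is to callers */

/* Hypothetical subset of the name table; index 0 is reserved so that
 * NULL never looks like a name. */
enum {
	DEMO_NAME_UNUSED,
	DEMO_NAME_Font,
	DEMO_NAME_Length,
	DEMO_NAME_Type,
	DEMO_NAME_LIMIT
};

/* Must stay sorted so that lookups can use a binary search. */
static const char *demo_name_table[DEMO_NAME_LIMIT] = {
	"", "Font", "Length", "Type"
};

/* A predefined name is just its table index dressed up as a pointer. */
#define DEMO_NAME(n) ((demo_obj *)(intptr_t)(n))

/* True when p is a table index rather than a real allocated object. */
int demo_is_builtin_name(const demo_obj *p)
{
	intptr_t v = (intptr_t)p;
	return v > 0 && v < DEMO_NAME_LIMIT;
}

const char *demo_name_to_str(const demo_obj *p)
{
	return demo_is_builtin_name(p) ? demo_name_table[(intptr_t)p] : NULL;
}

/* Map a string to its predefined name, or NULL if it is not in the
 * table (real code would fall back to allocating an object). */
demo_obj *demo_name_from_str(const char *s)
{
	int lo = 1, hi = DEMO_NAME_LIMIT - 1;
	while (lo <= hi) {
		int mid = (lo + hi) / 2;
		int c = strcmp(s, demo_name_table[mid]);
		if (c == 0)
			return DEMO_NAME(mid);
		if (c < 0)
			hi = mid - 1;
		else
			lo = mid + 1;
	}
	return NULL;
}

/* Two predefined names are equal exactly when the pointers are equal,
 * so no strcmp is needed. */
int demo_name_eq(const demo_obj *a, const demo_obj *b)
{
	return a == b;
}
```

Dictionary lookups keyed on predefined names then reduce to integer comparisons, and no memory is allocated for the key at all.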