Age | Commit message (Collapse) | Author |
|
Michael needs to be able to call pdfclean from gsview. At the moment
he's having to do this by including the pdfclean.c file into the lib
build, and then calling pdfclean_main with a faked up command line.
This isn't nice.
pdfclean.c is implemented by pdfclean_main parsing the options/filenames
out of argv and then passing the filenames/options on to a
pdfclean_clean function.
This seems like a much nicer API to offer to the world.
We therefore pull the guts of pdfclean.c (pdfclean_clean and its
subsidiary structures/functions) into pdf-clean-file.c and include
this in the library build.
This leaves pdfclean.c just as the command line parsing.
This should not affect the size of any of the resulting binaries.
|
|
Currently, every PDF name is allocated in a pdf_obj structure, and
comparisons are done using strcmp. Given that we can predict most
of the PDF names we'll use in a given file, this seems wasteful.
The pdf_obj type is opaque outside the pdf-object.c file, so we can
abuse it slightly without anyone outside knowing.
We collect a sorted list of names used in PDF (resources/pdf/names.txt),
and we add a utility (namedump) that preprocesses this into 2 header
files.
The first (include/mupdf/pdf/pdf-names-table.h, included as part of
include/mupdf/pdf/object.h), defines a set of "PDF_NAME_xxxx"
entries. These are pdf_obj *'s that callers can use to mean "A PDF
object that means literal name 'xxxx'"
The second (source/pdf/pdf-name-impl.h) is a C array of names.
We therefore update the code so that rather than passing "xxxx" to
functions (such as pdf_dict_gets(...)) we now pass PDF_NAME_xxxx (to
pdf_dict_get(...)). This is a fairly natural (if widespread) change.
The pdf_dict_getp (and sibling) functions that take a path (e.g.
"foo/bar/baz") are therefore supplemented with equivalents that
take a list (pdf_dict_getl(... , PDF_NAME_foo, PDF_NAME_bar,
PDF_NAME_baz, NULL)).
The actual implementation of this relies on the fact that small
pointer values are never valid values. For a given pdf_obj *p,
if NULL < (intptr_t)p < PDF_NAME__LIMIT then p is a literal
entry in the name table.
This enables us to do fast pointer compares and to skip expensive
strcmps.
Also, bring "null", "true" and "false" into the same style as PDF names.
Rather than using full pdf_obj structures for null/true/false, use
special pointer values just above the PDF_NAME_ table. This saves
memory and makes comparisons easier.
|
|
Fred had updated a path in an include file. The Makefiles cope with
this, the VS solution does not.
|
|
The lib was being built in an odd place.
|
|
A few casts are required within the code, along with a few #ifdef
changes.
Some tweaks to curl are required too.
|
|
Add the new source files to the solution.
Windows builds whinge about float->double conversions. Fix these
with explicit casts.
Avoid calling strtof and strcasecmp.
|
|
Purge several embedded contexts:
Remove embedded context in fz_output.
Remove embedded context in fz_stream.
Remove embedded context in fz_device.
Remove fz_rebind_stream (since it is no longer necessary).
Remove embedded context in svg_device.
Remove embedded context in XML parser.
Add ctx argument to fz_document functions.
Remove embedded context in fz_document.
Remove embedded context in pdf_document.
Remove embedded context in pdf_obj.
Make fz_page independent of fz_document in the interface.
We shouldn't need to pass the document to all functions handling a page.
If a page is tied to the source document, it's redundant; otherwise it's
just pointless.
Fix reference counting oddity in fz_new_image_from_pixmap.
|
|
|
|
|
|
The dtoa function is for doubles (which is what MuJS uses) but for MuPDF
we only need and want float precision in our output formatting.
|
|
Without this, we fail to find jpeg_cust_mem_init at link time on
Windows.
|
|
|
|
all plus more.
Includes xml, html, text save as, fix issues in window range that determines what
pages are visible at a particular scaling. Fix very tricky to find problem in
interface with gs. Managed code was freeing delegates that were allocated dynamically.
It is not possible to pin these so they are now member variables.
|
|
|
|
We build MuPDF without NDEBUG defined in order to check assertions but
don't want Visual Studio's Output console flooded with warnings for
broken documents. So instead of always defining USE_OUTPUT_DEBUG_STRING
for debug builds under Windows, allow the VS solutions to define it
when desired and to omit it when not.
|
|
It was missing one level of ..\ so failed trying to delete platform/generated
|
|
Currently only tests for the presence of non-grayscale color.
|
|
Replace the DroidSansFallback TTF files with a TTC that has two fonts:
The original and a copy where the OpenType 'vert' substitution
lookup has been pre-applied by copying the uniXXXX.vert glyph data
to uniXXXX.
|
|
Fix broken solution file and add project entries for new files.
|
|
This adds a custom memory management layer between libjpeg and the calling
app - in such a way that the code can be shared between mupdf and
Ghostscript/PDL.
|
|
This enables us to search the source easily, without affecting the
fact that it is compiled using one.c in a single block.
|
|
|
|
|
|
New routine to filter the content streams for pages, xobjects,
type3 charprocs, patterns etc. The filtered streams are guaranteed
to be properly matched with q/Q's, and to not have changed the top
level ctm. Additionally we remove (some) repeated settings of
colors etc. This filtering can be extended to be smarter later.
The idea of this is to both repair after editing, and to leave the
streams in a form that can be easily appended to.
This is preparatory to work on Bates numbering and Watermarking.
Currently the streams produced are uncompressed.
|
|
The primary motivator for this is so that we can print floating point
values and get the full accuracy out, without having to print 1.5 as
1.5000000, and without getting 23e24 etc.
We only support %c, %f, %d, %o, %x and %s currently.
We only support the zero padding qualifier, for integers.
We do support some extensions:
%C turns values >=128 into UTF-8.
%M prints a fz_matrix.
%R prints a fz_rect.
%P prints a fz_point.
We also implement a fprintf variant on top of this to allow for
consistent results when using fz_output.
a
|
|
Previously pdf_process buffer did not understand inline images.
In order to make this work without needlessly duplicating complex code
from within pdf-op-run, the parsing of inline images has been moved to
happen in pdf-interpret.c. When the op_table entry for BI is called
it now expects the inline image to be in csi->img and the dictionary
object to be in csi->obj.
To make this work, we have had to improve the handling of inline images
in general. While non-inline images have been loaded and held in
memory in their compressed form and only decoded when required, until
now we have always loaded and decoded inline images immediately. This
has been due to the difficulty in knowing how many bytes of data to
read from the stream - we know the length of the stream once
uncompressed, but relating this to the compressed length is hard.
To cure this we introduce a new type of filter stream, a 'leecher'.
We insert a leecher stream before we build the filters required to
decode the image. We then read and discard the appropriate number
of uncompressed bytes from the filters. This pulls the compressed
data through the leecher stream, which stores it in an fz_buffer.
Thus images are now always held in their compressed forms in memory.
The pdf-op-run implementation is now trivial. The only real complexity
in the pdf-op-buffer implementation is the need to ensure that the
/Filter entry in the dictionary object matches the exact point at
which we backstopped the decompression.
|
|
Currently this knows about q/Q matching/eliding and avoiding
repeated/unneccesary color/colorspace setting.
It will also collect a dictionary of resources used by a page.
This can be extended to be cleverer in future.
|
|
Using this, we can reconstruct pdf streams out of the process
called. This will enable us to do filtering when used in
combination with future commits.
|
|
Currently the only processing we can do of PDF pages is to run
them through an fz_device. We introduce new "pdf_process"
functionality here to enable us to do more things.
We define a pdf_processor structure with a set of function
pointers in, one per PDF operator, together with functions
for processing xobjects etc. The guts of pdf_run_page_contents
and pdf_run_annot operations are then extracted to give
pdf_process_page_contents and pdf_process_annot, and the
originals implemented in terms of these.
This commit contains just one instance of a pdf_processor, namely
the "run" processor, which contains the original code refactored.
The graphical state (and device pointer) is now part of private data
to the run operator set, rather than being in pdf_csi.
|
|
Patch from Thomas Fach-Pedersen. Many thanks!
Add a new format handler that copes with TIFF files. This replaces
the TIFF functionality within the image format handler, and is
better because this copes with multiple images (as one image per
page).
|
|
Lost as part of the accidental VS2012 change.
|
|
|
|
To share as much code as possible between the Windows 8 app, windows phone app
and Windows desktop app, remove dependencies of Platform and Windows::Foundation
in files that interface to mupdf and replace with C/C++ std methods.
|
|
We define a document handler for each file type (2 in the case of PDF, one
to handle files with the ability to 'run' them, and one without).
We then register these handlers with the context at startup, and then
call fz_open_document... as usual. This enables people to select the
document types they want at will (and even to extend the library with more
document types should they wish).
|
|
See SumatraPDF's repo for a Windows-only implementation using WIC.
|
|
Only -I the config header directory if building the thirdparty library,
not if using the system library.
Fix bug 694808.
|
|
The OpenJPEG in gs is v2, with various patches for fixes. These are in
the process of being passed upstream. We now automatically pull the
openjpeg tree out of GhostPDL and put it in as one particular branch
in the thirdparty/openjpeg.git repo. Change to track this in MuPDF.
This is in keeping with what we have been doing with the jbig2dec
repo for a while now.
|
|
|
|
|
|
Rather than generating fz_pixmaps for glyphs, we generate fz_glyphs.
fz_glyphs can either contain a pixmap, or an RLEd representation
(if it's a mask, and it's smaller).
Should take less memory in the cache, and should be faster to plot.
|
|
|
|
|
|
Windows and X11. Allows files to be fetched and displayed
as they are downloaded both with and without linearization, using
hints if available.
|
|
We are testing this using a new -p flag to mupdf that sets a bitrate at
which data will appear to arrive progressively as time goes on. For
example:
mupdf -p 102400 pdf_reference17.pdf
Details of the scheme used here are presented in docs/progressive.txt
|
|
|
|
|