Age | Commit message (Collapse) | Author |
|
Previously, we had people calling image->get_pixmap directly. Now we
have them all call fz_image_get_pixmap, which will look for a cached
version in the store, and only call get_pixmap if required.
Previously fz_image_get_pixmap used to look for the cached version
in the store, and decode if not - hence the decoding code is now
extracted out into standard_image_get_pixmap.
This was the original intent of the code, it just somehow didn't end
up like that.
This nicely queues us up for being able to have fz_images that use
a different get_pixel implementation, such as that which will be
required for the gprf code.
|
|
MUDRAW_STANDALONE forces mudraw_main to be just main. Set this
in the mudraw VS project.
|
|
Use "mutool draw" or symlink mutool to mudraw to use mudraw.
|
|
Add -U option to mupdf and mudraw to set a user stylesheet.
Uses a context to store user the stylesheet, just like the AA level.
|
|
If FZ_LARGEFILE is defined when building, MuPDF uses 64bit offsets
for files; this allows us to open streams larger than 2Gig.
The downsides to this are that:
* The xref entries are larger.
* All PDF ints are held as 64bit things rather than 32bit things
(to cope with /Prev entries, hint stream offsets etc).
* All file positions are stored as 64bits rather than 32.
The implementation works by detecting FZ_LARGEFILE. Some #ifdeffery
in fitz/system.h sets fz_off_t to either int or int64_t as appropriate,
and sets defines for fz_fopen, fz_fseek, fz_ftell etc as required.
These call the fseeko64 etc functions on linux (and so define
_LARGEFILE64_SOURCE) and the explicit 64bit functions on windows.
|
|
|
|
|
|
|
|
The new pdfclean sanitize functionality mean that mutool now
needs the data files, so maintaining the split that was designed to
keep data files out of mutool is no longer viable.
|
|
|
|
|
|
Fixes bug 695909.
|
|
|
|
Inspired by bug 695823. Mutool can now dump the sizes and
orientations for pages within a given file.
|
|
Michael needs to be able to call pdfclean from gsview. At the moment
he's having to do this by including the pdfclean.c file into the lib
build, and then calling pdfclean_main with a faked up command line.
This isn't nice.
pdfclean.c is implemented by pdfclean_main parsing the options/filenames
out of argv and then passing the filenames/options on to a
pdfclean_clean function.
This seems like a much nicer API to offer to the world.
We therefore pull the guts of pdfclean.c (pdfclean_clean and its
subsidiary structures/functions) into pdf-clean-file.c and include
this in the library build.
This leaves pdfclean.c just as the command line parsing.
This should not affect the size of any of the resulting binaries.
|
|
If pdfinfo is invoked as:
mutool info file.pdf 1,2,3
then it will show the items found on page 1, then the items found on
pages 1 and 2, then the items shown on pages 1,2 and 3.
Fix this by clearing the data after each show operation.
|
|
Silly typo. Thanks to Daniel Bloemer for pointing this out.
|
|
|
|
|
|
... and move outline printing to mutool show.
|
|
Currently, every PDF name is allocated in a pdf_obj structure, and
comparisons are done using strcmp. Given that we can predict most
of the PDF names we'll use in a given file, this seems wasteful.
The pdf_obj type is opaque outside the pdf-object.c file, so we can
abuse it slightly without anyone outside knowing.
We collect a sorted list of names used in PDF (resources/pdf/names.txt),
and we add a utility (namedump) that preprocesses this into 2 header
files.
The first (include/mupdf/pdf/pdf-names-table.h, included as part of
include/mupdf/pdf/object.h), defines a set of "PDF_NAME_xxxx"
entries. These are pdf_obj *'s that callers can use to mean "A PDF
object that means literal name 'xxxx'"
The second (source/pdf/pdf-name-impl.h) is a C array of names.
We therefore update the code so that rather than passing "xxxx" to
functions (such as pdf_dict_gets(...)) we now pass PDF_NAME_xxxx (to
pdf_dict_get(...)). This is a fairly natural (if widespread) change.
The pdf_dict_getp (and sibling) functions that take a path (e.g.
"foo/bar/baz") are therefore supplemented with equivalents that
take a list (pdf_dict_getl(... , PDF_NAME_foo, PDF_NAME_bar,
PDF_NAME_baz, NULL)).
The actual implementation of this relies on the fact that small
pointer values are never valid values. For a given pdf_obj *p,
if NULL < (intptr_t)p < PDF_NAME__LIMIT then p is a literal
entry in the name table.
This enables us to do fast pointer compares and to skip expensive
strcmps.
Also, bring "null", "true" and "false" into the same style as PDF names.
Rather than using full pdf_obj structures for null/true/false, use
special pointer values just above the PDF_NAME_ table. This saves
memory and makes comparisons easier.
|
|
|
|
Add missing initialisation of glo.ctx required due to API
change.
|
|
Purge several embedded contexts:
Remove embedded context in fz_output.
Remove embedded context in fz_stream.
Remove embedded context in fz_device.
Remove fz_rebind_stream (since it is no longer necessary).
Remove embedded context in svg_device.
Remove embedded context in XML parser.
Add ctx argument to fz_document functions.
Remove embedded context in fz_document.
Remove embedded context in pdf_document.
Remove embedded context in pdf_obj.
Make fz_page independent of fz_document in the interface.
We shouldn't need to pass the document to all functions handling a page.
If a page is tied to the source document, it's redundant; otherwise it's
just pointless.
Fix reference counting oddity in fz_new_image_from_pixmap.
|
|
Rename fz_close to fz_drop_stream.
Rename fz_close_archive to fz_drop_archive.
Rename fz_close_output to fz_drop_output.
Rename fz_free_* to fz_drop_*.
Rename pdf_free_* to pdf_drop_*.
Rename xps_free_* to xps_drop_*.
|
|
|
|
pdfextract_main by default iterates through all objects from number 0
to the size of the document's xref table. Object number 0 is however
always supposed to be free, so pdfextract consistently fails and shows
a slightly confusing warning. Object extraction should by default start
at object 1 in order to prevent this warning.
|
|
Add a new class of errors and use them to abort interpretation when
the test device detects a color page.
|
|
The original version of the test-device could characterise pages
as being grayscale/color purely based on the colorspaces used.
This could easily be upset by grayscale images or shadings that
happened to be specified in non-grayscale colorspaces however.
We now look at the actual shading and image color values, and use
a threshold value to allow for some measure of rounding errors in
color values that are in practice grayscale.
|
|
|
|
|
|
Currently only tests for the presence of non-grayscale color.
|
|
|
|
It has no real reason to live in mudraw, and it does pull in the
javascript dependency via pdf-form.c.
|
|
|
|
New routine to filter the content streams for pages, xobjects,
type3 charprocs, patterns etc. The filtered streams are guaranteed
to be properly matched with q/Q's, and to not have changed the top
level ctm. Additionally we remove (some) repeated settings of
colors etc. This filtering can be extended to be smarter later.
The idea of this is to both repair after editing, and to leave the
streams in a form that can be easily appended to.
This is preparatory to work on Bates numbering and Watermarking.
Currently the streams produced are uncompressed.
|
|
When you use mutool clean to subset pages out of a PDF, we already
remove the Name tree entries for named locations that aren't in the
target file. We have henceforth failed to remove references to these
removed names though. This can cause errors (really warnings) on
reading the file back.
|
|
Firstly, we remove the use of global variables; this is done by
introducing a 'globals' structure for each of these files and
passing it internally between functions.
Next, split the core of pdfclean_main into pdfclean_clean, and the
core of pdfinfo_main into pdfinfo_info.
The _main functions now do the argv processing. The new functions now
run entirely thread safely, so can be called from library functions.
|
|
Windows doesn't like redirecting binary output, so add an explicit
filename argument.
|
|
We define a document handler for each file type (2 in the case of PDF, one
to handle files with the ability to 'run' them, and one without).
We then register these handlers with the context at startup, and then
call fz_open_document... as usual. This enables people to select the
document types they want at will (and even to extend the library with more
document types should they wish).
|
|
Some warnings we'd like to enable for MuPDF and still be able to
compile it with warnings as errors using MSVC (2008 to 2013):
* C4115: 'timeval' : named type definition in parentheses
* C4204: nonstandard extension used : non-constant aggregate initializer
* C4295: 'hex' : array is too small to include a terminating null character
* C4389: '==' : signed/unsigned mismatch
* C4702: unreachable code
* C4706: assignment within conditional expression
Also, globally disable C4701 which is frequently caused by MSVC not
being able to correctly figure out fz_try/fz_catch code flow.
And don't define isnan for VS2013 and later where that's no longer needed.
|
|
|
|
|
|
Set the hint in mudraw when AA bits is set to 0.
|
|
SumatraPDF's testsuite uses Targa images as output because they're
compressed while still far easier to compare than PNG and have better
tool support than PCL/PWG.
|
|
|
|
The most complex part here is to ensure that we can output various
bitmaps in bands.
|
|
|
|
|
|
This allows us to "mudraw -F ppm -o /dev/null" etc.
|