Age | Commit message (Collapse) | Author |
|
|
|
|
|
In preparation of adding pdf_write_document that writes a document
to a fz_output stream.
|
|
|
|
|
|
Separate naming of functions that save complete files to disk
from functions that write data to streams.
|
|
Less risk of confusion with the text type used in the device interface.
|
|
Use fz_output in debug printing functions.
Use fz_output in pdfshow.
Use fz_output in fz_trace_device instead of stdout.
Use fz_output in pdf-write.c.
Rename fz_new_output_to_filename to fz_new_output_with_path.
Add seek and tell to fz_output.
Remove unused functions like fz_fprintf.
Fix typo in pdf_print_obj.
|
|
The poster tool worked by rewriting the mediabox, but left the
other boxes (crop, bleed, trim, art) unchanged. This could
cause problems with viewers that didn't intersect such boxes.
This update makes the poster tool write intersected boxes for
all boxes.
|
|
When we are asked to output a text file, the current code would
correctly guess the file format from the name, but would then
output to stdout anyway. Fixed here.
|
|
We were missing a : in the getopt string after the 'p' meaning
that any password supplied was treated as a filename.
|
|
|
|
|
|
Previously, we had people calling image->get_pixmap directly. Now we
have them all call fz_image_get_pixmap, which will look for a cached
version in the store, and only call get_pixmap if required.
Previously fz_image_get_pixmap used to look for the cached version
in the store, and decode if not - hence the decoding code is now
extracted out into standard_image_get_pixmap.
This was the original intent of the code, it just somehow didn't end
up like that.
This nicely queues us up for being able to have fz_images that use
a different get_pixel implementation, such as that which will be
required for the gprf code.
|
|
MUDRAW_STANDALONE forces mudraw_main to be just main. Set this
in the mudraw VS project.
|
|
Use "mutool draw" or symlink mutool to mudraw to use mudraw.
|
|
Add -U option to mupdf and mudraw to set a user stylesheet.
Uses a context to store user the stylesheet, just like the AA level.
|
|
If FZ_LARGEFILE is defined when building, MuPDF uses 64bit offsets
for files; this allows us to open streams larger than 2Gig.
The downsides to this are that:
* The xref entries are larger.
* All PDF ints are held as 64bit things rather than 32bit things
(to cope with /Prev entries, hint stream offsets etc).
* All file positions are stored as 64bits rather than 32.
The implementation works by detecting FZ_LARGEFILE. Some #ifdeffery
in fitz/system.h sets fz_off_t to either int or int64_t as appropriate,
and sets defines for fz_fopen, fz_fseek, fz_ftell etc as required.
These call the fseeko64 etc functions on linux (and so define
_LARGEFILE64_SOURCE) and the explicit 64bit functions on windows.
|
|
|
|
|
|
|
|
The new pdfclean sanitize functionality mean that mutool now
needs the data files, so maintaining the split that was designed to
keep data files out of mutool is no longer viable.
|
|
|
|
|
|
Fixes bug 695909.
|
|
|
|
Inspired by bug 695823. Mutool can now dump the sizes and
orientations for pages within a given file.
|
|
Michael needs to be able to call pdfclean from gsview. At the moment
he's having to do this by including the pdfclean.c file into the lib
build, and then calling pdfclean_main with a faked up command line.
This isn't nice.
pdfclean.c is implemented by pdfclean_main parsing the options/filenames
out of argv and then passing the filenames/options on to a
pdfclean_clean function.
This seems like a much nicer API to offer to the world.
We therefore pull the guts of pdfclean.c (pdfclean_clean and its
subsidiary structures/functions) into pdf-clean-file.c and include
this in the library build.
This leaves pdfclean.c just as the command line parsing.
This should not affect the size of any of the resulting binaries.
|
|
If pdfinfo is invoked as:
mutool info file.pdf 1,2,3
then it will show the items found on page 1, then the items found on
pages 1 and 2, then the items shown on pages 1,2 and 3.
Fix this by clearing the data after each show operation.
|
|
Silly typo. Thanks to Daniel Bloemer for pointing this out.
|
|
|
|
|
|
... and move outline printing to mutool show.
|
|
Currently, every PDF name is allocated in a pdf_obj structure, and
comparisons are done using strcmp. Given that we can predict most
of the PDF names we'll use in a given file, this seems wasteful.
The pdf_obj type is opaque outside the pdf-object.c file, so we can
abuse it slightly without anyone outside knowing.
We collect a sorted list of names used in PDF (resources/pdf/names.txt),
and we add a utility (namedump) that preprocesses this into 2 header
files.
The first (include/mupdf/pdf/pdf-names-table.h, included as part of
include/mupdf/pdf/object.h), defines a set of "PDF_NAME_xxxx"
entries. These are pdf_obj *'s that callers can use to mean "A PDF
object that means literal name 'xxxx'"
The second (source/pdf/pdf-name-impl.h) is a C array of names.
We therefore update the code so that rather than passing "xxxx" to
functions (such as pdf_dict_gets(...)) we now pass PDF_NAME_xxxx (to
pdf_dict_get(...)). This is a fairly natural (if widespread) change.
The pdf_dict_getp (and sibling) functions that take a path (e.g.
"foo/bar/baz") are therefore supplemented with equivalents that
take a list (pdf_dict_getl(... , PDF_NAME_foo, PDF_NAME_bar,
PDF_NAME_baz, NULL)).
The actual implementation of this relies on the fact that small
pointer values are never valid values. For a given pdf_obj *p,
if NULL < (intptr_t)p < PDF_NAME__LIMIT then p is a literal
entry in the name table.
This enables us to do fast pointer compares and to skip expensive
strcmps.
Also, bring "null", "true" and "false" into the same style as PDF names.
Rather than using full pdf_obj structures for null/true/false, use
special pointer values just above the PDF_NAME_ table. This saves
memory and makes comparisons easier.
|
|
|
|
Add missing initialisation of glo.ctx required due to API
change.
|
|
Purge several embedded contexts:
Remove embedded context in fz_output.
Remove embedded context in fz_stream.
Remove embedded context in fz_device.
Remove fz_rebind_stream (since it is no longer necessary).
Remove embedded context in svg_device.
Remove embedded context in XML parser.
Add ctx argument to fz_document functions.
Remove embedded context in fz_document.
Remove embedded context in pdf_document.
Remove embedded context in pdf_obj.
Make fz_page independent of fz_document in the interface.
We shouldn't need to pass the document to all functions handling a page.
If a page is tied to the source document, it's redundant; otherwise it's
just pointless.
Fix reference counting oddity in fz_new_image_from_pixmap.
|
|
Rename fz_close to fz_drop_stream.
Rename fz_close_archive to fz_drop_archive.
Rename fz_close_output to fz_drop_output.
Rename fz_free_* to fz_drop_*.
Rename pdf_free_* to pdf_drop_*.
Rename xps_free_* to xps_drop_*.
|
|
|
|
pdfextract_main by default iterates through all objects from number 0
to the size of the document's xref table. Object number 0 is however
always supposed to be free, so pdfextract consistently fails and shows
a slightly confusing warning. Object extraction should by default start
at object 1 in order to prevent this warning.
|
|
Add a new class of errors and use them to abort interpretation when
the test device detects a color page.
|
|
The original version of the test-device could characterise pages
as being grayscale/color purely based on the colorspaces used.
This could easily be upset by grayscale images or shadings that
happened to be specified in non-grayscale colorspaces however.
We now look at the actual shading and image color values, and use
a threshold value to allow for some measure of rounding errors in
color values that are in practice grayscale.
|
|
|
|
|
|
Currently only tests for the presence of non-grayscale color.
|
|
|
|
It has no real reason to live in mudraw, and it does pull in the
javascript dependency via pdf-form.c.
|
|
|
|
New routine to filter the content streams for pages, xobjects,
type3 charprocs, patterns etc. The filtered streams are guaranteed
to be properly matched with q/Q's, and to not have changed the top
level ctm. Additionally we remove (some) repeated settings of
colors etc. This filtering can be extended to be smarter later.
The idea of this is to both repair after editing, and to leave the
streams in a form that can be easily appended to.
This is preparatory to work on Bates numbering and Watermarking.
Currently the streams produced are uncompressed.
|
|
When you use mutool clean to subset pages out of a PDF, we already
remove the Name tree entries for named locations that aren't in the
target file. We have henceforth failed to remove references to these
removed names though. This can cause errors (really warnings) on
reading the file back.
|