Age | Commit message (Collapse) | Author |
|
Thanks to zeniko.
Also ensure that pdf_free_annot copes with NULL.
|
|
If a colorspace refers to itself as a base, we can get an infinite
recursion and hence stack overflow. Thanks to zeniko for pointing out
that this occurs in embedded CMAPs and stitching functions. Also
solved here.
To avoid having to keep a long list of the objects we've traversed
through, extend the pdf_dict_mark functions to work on all pdf objects,
and hence rename them as pdf_obj_mark etc. Thanks to zeniko again for
feedback on this way of working.
Problem found in a test file, 3882.pdf.SIGSEGV.99.3204 supplied
by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google
Security Team using Address Sanitizer. Many thanks!
|
|
Conflicts:
Makefile
apps/mudraw.c
pdf/pdf_write.c
win32/libmupdf-v8.vcproj
|
|
|
|
Also change first argument from fz_context to pdf_document in each
of pdf_to_utf8, pdf_to_utf8_name, pdf_to_ucs2 and pdf_to_ucs2_name
|
|
|
|
Previously, before interpreting a pages content stream we would
load it entirely into a buffer. Then we would interpret that
buffer. This has a cost in memory use.
Here, we update the code to read from a stream on the fly.
This has required changes in various different parts of the code.
Firstly, we have removed all use of the FILE lock - as stream
reads can now safely be interrupted by resource (or object) reads
from elsewhere in the file, the file lock becomes a very hard
thing to maintain, and doesn't actually benefit us at all. The
choices were to either use a recursive lock, or to remove it
entirely; I opted for the latter.
The file lock enum value remains as a placeholder for future use in
extendable data streams.
Secondly, we add a new 'concat' filter that concatenates a series of
streams together into one, optionally putting whitespace between each
stream (as the pdf parser requires this).
Finally, we change page/xobject/pattern content streams to work
on the fly, but we leave type3 glyphs using buffers (as presumably
these will be run repeatedly).
|
|
Attempt to separate public API from internal functions.
|
|
Currently, we are in the slightly strange position of having
the PDF specific object types as part of fitz. Here we pull
them out into the pdf layer instead. This has been made possible
by the recent changes to make the store no longer be tied to
having fz_obj's as keys.
Most of this work is a simple huge rename; to help customers who
may have code that use such functions we have provided a sed
script to do the renaming; scripts/rename2.sed.
Various other small tweaks are required; the store used to have
some debugging code that still required knowledge of fz_obj
types - we extract that into a nicer 'type' based function
pointer. Also, the type 3 font handling used to have an fz_obj
pointer for type 3 resources, and therefore needed to know how
to free this; this has become a void * with a function to free
it.
|
|
|
|
|
|
In various places in the code, we add markers (".seen") to
dictionaries as we traverse them to ensure that we don't
go into infinite loops.
Adding a dictionary entry is bad as it's a) an expensive
operation, b) a potentially destructive one, and c) produces another
possible point of failure (as mallocs can fail).
Instead, add a flag to each dict to allow them to be marked/unmarked
and use that instead.
Thanks to Zeniko for pointing out various places that could usefully
be protected against infinite recursion.
|
|
When reading an XYZ link destination, I was writing Z over X. While
fixing this, change to cryptic enums rather than unexplained magic
numbers.
When reading the outlines out of a pdf file, better to actually store
them in a linked list rather than just dropping them.
|
|
Move 'kind' into the fz_link_dest structure (as this makes more sense).
Put an fz_link_dest rather than just a page number into the outlines
structure.
Correct parsing of actions and dests from pdf outlines.
|
|
Move to a non-pdf specific type for links. PDF specific parsing is
done in pdf_annots.c as before, but the essential type (and handling
functions for that type) are in a new file fitz/base_link.c.
The new type is more expressive than before; specifically all the
possible PDF modes are expressable in it. Hopefully this should
allow XPS links to be represented too.
|
|
A couple of bits of the code SEGV in the event that values are NULL.
Fixed here by converting a do...while to a while, and adding an extra
guard in the if.
Thanks to Gaetan Bisson for the report, and patch.
|
|
The new fz_malloc_struct(A,B) macro allocates sizeof(B) bytes using
fz_malloc, and then passes the resultant pointer to Memento_label
to label it with "B".
This costs nothing in non-memento builds, but gives much nicer
listings of leaked blocks when memento is enabled.
|
|
Mostly redoing the xps_context to xps_document change and adding
contexts to newly written code.
Conflicts:
apps/pdfapp.c
apps/pdfapp.h
apps/x11_main.c
apps/xpsdraw.c
draw/draw_device.c
draw/draw_scale.c
fitz/base_object.c
fitz/fitz.h
pdf/mupdf.h
pdf/pdf_interpret.c
pdf/pdf_outline.c
pdf/pdf_page.c
xps/muxps.h
xps/xps_doc.c
xps/xps_xml.c
|
|
|
|
|
|
Huge pervasive change to lots of files, adding a context for exception
handling and allocation.
In time we'll move more statics into there.
Also fix some for(i = 0; i < function(...); i++) calls.
|
|
|
|
The run-together words are dead! Long live the underscores!
The postscript inspired naming convention of using all run-together
words has served us well, but it is now time for more readable code.
In this commit I have also added the sed script, rename.sed, that I used
to convert the source. Use it on your patches and application code.
|
|
|