Age | Commit message (Collapse) | Author |
|
|
|
|
|
This is faster on ARM in particular. The primary changes involve
fz_matrix, fz_rect and fz_bbox.
Rather than passing 'fz_rect r' into a function, we now consistently
pass 'const fz_rect *r'. Where a rect is passed in and modified, we
miss the 'const' off. Where possible, we return the pointer to the
modified structure to allow 'chaining' of expressions.
The basic upshot of this work is that we do far fewer copies of
rectangle/matrix structures, and all the copies we do are explicit.
This has opened the way to other optimisations, also performed in
this commit.
Rather than using expressions like:
fz_concat(fz_scale(sx, sy), fz_translate(tx, ty))
we now have fz_pre_{scale,translate,rotate} functions. These
can be implemented much more efficiently than doing the fully
fledged matrix multiplication that fz_concat requires.
We add fz_rect_{min,max} functions to return pointers to the
min/max points of a rect. These can be used to in transformations
to directly manipulate values.
With a little casting in the path transformation code we can avoid
more needless copying.
We rename fz_widget_bbox to the more consistent fz_bound_widget.
|
|
|
|
|
|
If a colorspace refers to itself as a base, we can get an infinite
recursion and hence stack overflow. Thanks to zeniko for pointing out
that this occurs in embedded CMAPs and stitching functions. Also
solved here.
To avoid having to keep a long list of the objects we've traversed
through, extend the pdf_dict_mark functions to work on all pdf objects,
and hence rename them as pdf_obj_mark etc. Thanks to zeniko again for
feedback on this way of working.
Problem found in a test file, 3882.pdf.SIGSEGV.99.3204 supplied
by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google
Security Team using Address Sanitizer. Many thanks!
|
|
Only Fade, Wipe and Blinds supported so far.
Hit 'p' in the viewer to go into 'presentation' mode. Page swaps
then transition from page to page. Pages auto advance until key
or mouse is used.
|
|
Conflicts:
Makefile
apps/mudraw.c
pdf/pdf_write.c
win32/libmupdf-v8.vcproj
|
|
PDF documents that do not have a page tree will have zero pages.
Calling fz_count_pages() twice or more on those documents will have
pdf_load_page_tree() repeatedly trying to load the page tree, each
time leaking the page objects/refs arrays.
Thanks to Zeniko for pointing out this fix.
|
|
|
|
|
|
Conflicts:
pdf/mupdf-internal.h
pdf/pdf_font.c
|
|
Now reusing the internal representation of an annotation for widgets
to avoid two separate lists
|
|
|
|
|
|
Instead of using macros for min/max/abs/clamp, we move to using
inline functions. These are more typesafe, and should produce
equivalent code on compilers that support inline (i.e. pretty much
everything we care about these days).
People can always do their own macro versions if they prefer.
|
|
We now create pdf_annot objects for PDF annotations even if they have no
appearance stream (i.e. even if invisible). That is necessary because
even invisible annotations can be targets of user interaction.
This is at least a partial fix for bug 693131
|
|
|
|
Conflicts:
fitz/doc_document.c
fitz/fitz-internal.h
fitz/fitz.h
fitz/stm_buffer.c
pdf/mupdf-internal.h
pdf/pdf_object.c
pdf/pdf_xobject.c
pdf/pdf_xref.c
win32/mupdf.sln
|
|
Previously, before interpreting a pages content stream we would
load it entirely into a buffer. Then we would interpret that
buffer. This has a cost in memory use.
Here, we update the code to read from a stream on the fly.
This has required changes in various different parts of the code.
Firstly, we have removed all use of the FILE lock - as stream
reads can now safely be interrupted by resource (or object) reads
from elsewhere in the file, the file lock becomes a very hard
thing to maintain, and doesn't actually benefit us at all. The
choices were to either use a recursive lock, or to remove it
entirely; I opted for the latter.
The file lock enum value remains as a placeholder for future use in
extendable data streams.
Secondly, we add a new 'concat' filter that concatenates a series of
streams together into one, optionally putting whitespace between each
stream (as the pdf parser requires this).
Finally, we change page/xobject/pattern content streams to work
on the fly, but we leave type3 glyphs using buffers (as presumably
these will be run repeatedly).
|
|
|
|
Found in a file synthesized by Sebras. Many thanks!
|
|
I forgot to free the stack in the reworked page loading. Fixed
here. Thanks to Zeniko for pointing it out.
|
|
Avoid recursion in pdf_load_page_tree_node.
Avoid recursion (most of the time) in pdf_read_xref_sections.
|
|
Don't reset the size of arrays until we have successfully resized them.
|
|
Previously we attempted to honour page rotation values, which is
technically against the spec.
|
|
Debug printing functions: debug -> print.
Accessors: get noun attribute -> noun attribute.
Find -> lookup when the returned value is not reference counted.
pixmap_with_rect -> pixmap_with_bbox.
We are reserving the word "find" to mean lookups that give ownership
of objects to the caller. Lookup is used in other places where the
ownership is not transferred, or simple values are returned.
The rename is done by the sed script in scripts/rename3.sed
|
|
Attempt to separate public API from internal functions.
|
|
Currently, we are in the slightly strange position of having
the PDF specific object types as part of fitz. Here we pull
them out into the pdf layer instead. This has been made possible
by the recent changes to make the store no longer be tied to
having fz_obj's as keys.
Most of this work is a simple huge rename; to help customers who
may have code that use such functions we have provided a sed
script to do the renaming; scripts/rename2.sed.
Various other small tweaks are required; the store used to have
some debugging code that still required knowledge of fz_obj
types - we extract that into a nicer 'type' based function
pointer. Also, the type 3 font handling used to have an fz_obj
pointer for type 3 resources, and therefore needed to know how
to free this; this has become a void * with a function to free
it.
|
|
Remove stray space at the end of buffers.
|
|
|
|
|
|
|
|
|
|
pdf_resolve_indirect(x) = pdf_resolve_indirect(pdf_resolve_indirect(x))
now - as long as it doesn't throw an exception.
Update the rest of the code to minimise unnecessary function calls.
Previously, we were calling one function to find out if an object was
a dict, only for that to call a function to see if it needed to
resolve the object, then calling another function to actually get the
dict, only to have that call the function to check for the dict
needing resolving again!
|
|
Every xobject keeps a reference to the object from whence
it came. This is marked/unmarked as it is executed.
Thanks to Zeniko for spotting the potential problem.
|
|
Move coordinate space tweaks into pdf_ and xps_run_page, and provide
neutral pdf_ and xps_bound_page functions to return the page size as
a zero-origined bounding box.
|
|
|
|
Add explanations of how to use the macros in fitz.h.
Also included are 2 different formulations, with different strengths/
weaknesses for reference. Will remove these shortly, but I want a
reference to them in git.
Workaround bug in Mac OS Lion gcc (clang works fine).
|
|
In various places in the code, we add markers (".seen") to
dictionaries as we traverse them to ensure that we don't
go into infinite loops.
Adding a dictionary entry is bad as it's a) an expensive
operation, b) a potentially destructive one, and c) produces another
possible point of failure (as mallocs can fail).
Instead, add a flag to each dict to allow them to be marked/unmarked
and use that instead.
Thanks to Zeniko for pointing out various places that could usefully
be protected against infinite recursion.
|
|
Move to a non-pdf specific type for links. PDF specific parsing is
done in pdf_annots.c as before, but the essential type (and handling
functions for that type) are in a new file fitz/base_link.c.
The new type is more expressive than before; specifically all the
possible PDF modes are expressable in it. Hopefully this should
allow XPS links to be represented too.
|
|
The new fz_malloc_struct(A,B) macro allocates sizeof(B) bytes using
fz_malloc, and then passes the resultant pointer to Memento_label
to label it with "B".
This costs nothing in non-memento builds, but gives much nicer
listings of leaked blocks when memento is enabled.
|
|
|
|
|
|
Firstly, we rename pdf_store to fz_store, reflecting the fact that
there are no pdf specific dependencies on it.
Next, we rework it so that all the objects that can be stored in
the store start with an fz_storable structure. This consists of
a reference count, and a function used to free the object when
the reference count reaches zero.
All the keep/drop functions are then reimplemented by calling
fz_keep_sharable/fz_drop_sharable. The 'drop' functions as supplied
by the callers are thus now 'free' functions, only called if
the reference count drops to 0.
The store changes to keep all the items in the store in the linked
list (which becomes a doubly linked one). We still make use of
the hashtable to index into this list quickly, but we now have
the objects in an LRU ordering within the list.
Every object is put into the store, with a size record; this is
an estimate of how much memory would be freed by freeing that
object.
The store is moved into the context and given a maximum size;
when new things are inserted into the store, care is taken to
ensure that we do not expand beyond this size. We evict any
stored items (that are not in use) starting from the least
recently used.
Finding an object in the store now takes a reference to it already.
LOCK and UNLOCK comments are used to indicate where locks need to
be taken and released to ensure thread safety.
|
|
Mostly redoing the xps_context to xps_document change and adding
contexts to newly written code.
Conflicts:
apps/pdfapp.c
apps/pdfapp.h
apps/x11_main.c
apps/xpsdraw.c
draw/draw_device.c
draw/draw_scale.c
fitz/base_object.c
fitz/fitz.h
pdf/mupdf.h
pdf/pdf_interpret.c
pdf/pdf_outline.c
pdf/pdf_page.c
xps/muxps.h
xps/xps_doc.c
xps/xps_xml.c
|
|
|
|
This frees us from passing errors back everywhere, and hence enables us
to pass results back as return values.
Rather than having to explicitly check for errors everywhere and bubble
them, we now allow exception handling to do the work for us; the
downside to this is that we no longer emit as much debugging information
as we did before (though this could be put back in). For now, the
debugging information we have lost has been retained in comments
with 'RJW:' at the start.
This code needs fuller testing, but is being committed as a work in
progress.
|
|
|
|
|