Age | Commit message (Collapse) | Author |
|
|
|
|
|
|
|
New accessor xml_text() will return NULL or text content of a node.
Tag names for text nodes is the empty string "".
Fix bug 692191.
|
|
|
|
|
|
Thanks to Sebras for pointing out our schitzophrenia here.
|
|
Currently, the mupdf code loads shadings at parse time, and
instantly decomposes them into a mesh of triangles. This mesh
of triangles is the transformed and rendered as required.
Unfortunately the storage space for the mesh is typically much
greater than the original representation.
In this commit, we move the shading stream parsing/decomposition
code into a general 'fz_process_mesh' function within res_shade.
We then grab a copy of the buffer at load time, and 'process'
(decompose/paint) at render time.
For the test file on the bug, memory falls from the reported 660Mb
to 30Mb. For another test file (txt9780547775815_ingested.pdf
page 271) it reduces memory use from 750Meg to 33Meg. These figures
could be further reduced by storing the compressed streams from the
pdf file rather than the uncompressed ones.
Incorporating typo fix and unused function removal from Sebras. Thanks.
Remove unused function in shading code
|
|
Conflicts:
pdf/pdf_xref_aux.c
|
|
Use a "magic" string for filetype detection: filename or mime-type.
|
|
Mountian Lion causes various different warnings to be given,
possibly because a change to clang by default. Fix them here.
|
|
Conflicts:
Makefile
apps/mudraw.c
pdf/pdf_write.c
win32/libmupdf-v8.vcproj
|
|
Thanks to Zeniko for pointing out this fix.
|
|
|
|
Conflicts:
pdf/mupdf-internal.h
pdf/pdf_font.c
|
|
Improves text device output when using substitute fonts.
Fixes bug #693019.
|
|
|
|
Shear by 20 degrees for italic. Use 2% wider metrics for bold.
|
|
|
|
Instead of using macros for min/max/abs/clamp, we move to using
inline functions. These are more typesafe, and should produce
equivalent code on compilers that support inline (i.e. pretty much
everything we care about these days).
People can always do their own macro versions if they prefer.
|
|
Remove the shim indirection layer for fz_document. A little less type
safe, but a lot less boiler plate.
|
|
Previously, before interpreting a pages content stream we would
load it entirely into a buffer. Then we would interpret that
buffer. This has a cost in memory use.
Here, we update the code to read from a stream on the fly.
This has required changes in various different parts of the code.
Firstly, we have removed all use of the FILE lock - as stream
reads can now safely be interrupted by resource (or object) reads
from elsewhere in the file, the file lock becomes a very hard
thing to maintain, and doesn't actually benefit us at all. The
choices were to either use a recursive lock, or to remove it
entirely; I opted for the latter.
The file lock enum value remains as a placeholder for future use in
extendable data streams.
Secondly, we add a new 'concat' filter that concatenates a series of
streams together into one, optionally putting whitespace between each
stream (as the pdf parser requires this).
Finally, we change page/xobject/pattern content streams to work
on the fly, but we leave type3 glyphs using buffers (as presumably
these will be run repeatedly).
|
|
Use this to reintroduce "Document Properties..." in mupdf viewer.
|
|
|
|
|
|
Move fz_stroke_state from being a simple structure whose contents
are copied repeatedly to being a dynamically allocated reference
counted object so we can cope with large numbers of entries in
the dash array.
|
|
|
|
If a path is both stroked and filled, we need to treat IsStroking during
parsing differently, so we should create two separate paths for filling
and stroking.
Thanks to SumatraPDF for the patch.
|
|
|
|
Thanks to SumatraPDF for the patch.
|
|
|
|
Thanks to SumatraPDF.
|
|
|
|
Debug printing functions: debug -> print.
Accessors: get noun attribute -> noun attribute.
Find -> lookup when the returned value is not reference counted.
pixmap_with_rect -> pixmap_with_bbox.
We are reserving the word "find" to mean lookups that give ownership
of objects to the caller. Lookup is used in other places where the
ownership is not transferred, or simple values are returned.
The rename is done by the sed script in scripts/rename3.sed
|
|
|
|
C's standard is copy(dst, src), so we move to adopt that here.
Hopefully no one is calling this routine other than us - if they are,
then I apologise! Better to aim for consistency before we freeze
the API at v1.0 than to carry an inconsistent API around ever after.
|
|
Add some function documentation to fitz.h.
Add fz_ prefix to runetochar, chartorune, runelen etc. Change
fz_runetochar to avoid passing unnecessary pointer.
|
|
Attempt to separate public API from internal functions.
|
|
More changes still to come.
|
|
Introduce a new 'fz_image' type; this type contains rudimentary
information about images (such as native, size, colorspace etc)
and a function to call to get a pixmap of that image (with a
size hint).
Instead of passing pixmaps through the device interface (and
holding pixmaps in the display list) we now pass images instead.
The rendering routines therefore call fz_image_to_pixmap to get
pixmaps to render, and fz_pixmap_drop those afterwards.
The file format handling routines therefore need to produce
images rather than pixmaps; xps and cbz currently just wrap
pixmaps as images. PDF is more involved.
The stream handling routines in PDF have been altered so that
they can recognise when the last stream entry in a filter
dictionary is an image decoding filter. Rather than applying
this filter, they read and store the parameters into a
pdf_image_params structure, and stop decoding at that point.
This allows us to read the compressed data for an image into
memory as a block. We can then restart the image decode process
later.
pdf_images therefore consist of the compressed image data for
images. When a pixmap is requested for such an image, the code
checks to see if we have one (of an appropriate size), and if
not, decodes it.
The size hint is used to determine whether it is possible to
subsample the image; currently this is only supported for
JPEGs, but we could add generic subsampling code later.
In order to handle caching the produced images, various changes
have been made to the store and the underlying hash table.
Previously the store was indexed purely by fz_obj keys; we don't
have an fz_obj key any more, so have extended the store by adding
a concept of a key 'type'. A key type is a pointer to a set of
functions that keep/drop/compare and make a hashable key from
a key pointer.
We make a pdf_store.c file that contains functions to offer the
existing fz_obj based functions, and add a new 'type' for keys
(based on the fz_image handle, and the subsample factor) in the
pdf_image.c file.
While working on this, a problem became apparent in the existing
store codel; fz_obj objects had no protection on their reference
counts, hence an interpreter thread could try to alter a ref count
at the same time as a malloc caused an eviction from the store.
This has been solved by using the alloc lock as protection. This in
turn requires some tweaks to the code to make sure we don't try
and keep/drop fz_obj's from the store code while the alloc lock is
held.
A side effect of this work is that when a hash table is created, we
inform it what lock should be used to protect its innards (if any).
If the alloc lock is used, the insert method knows to drop/retake it
to allow it to safely expand the hash table. Callers to the hash
functions have the responsibility of taking/dropping the appropriate
lock, and ensuring that they cope with the possibility that insert
might drop the alloc lock, causing race conditions.
|
|
We only open one instance of freetype per document. We therefore
have to ensure that only 1 call to it takes place at a time. We
introduce a lock for this purpose (FZ_LOCK_FREETYPE), and arrange
to take/release it as required.
We also update the font context so it is properly shared.
|
|
This is a significant change to the use of locks in MuPDF.
Previously, the user had the option of passing us lock/unlock
functions for a single mutex as part of the allocation struct.
Now we remove these entries from the allocation struct, and
make a separate 'locks' struct. This enables people to use
fz_alloc_default with locking.
If multithreaded operation is required, then the user is
required to create FZ_LOCK_MAX mutexes, which will be locked
or unlocked by MuPDF calling the lock/unlock functions within
the new fz_locks_context structure passed in at context creation.
These mutexes are not required to be recursive (they may be, but
MuPDF should never call them in this way). MuPDF avoids deadlocks
by imposing a locking ordering on itself; a thread will never take
lock n, if it already holds any lock i for which 0 <= i <= n.
Currently, there are 4 locks used within MuPDF.
Lock 0: The alloc lock; taken around all calls to user supplied
(or default) allocation functions. Also taken around all accesses
to the refs field of storable items.
Lock 1: The store lock; taken whenever the store data structures
(specifically the linked list pointers) are accessed.
Lock 2: The file lock; taken whenever a thread is accessing the raw
file. We use the debugging macros to insist that this is held
whenever we do a file based seek or read. We also insist that this
is never held when we resolve an indirect reference, as this can
have the effect of moving the file pointer.
Lock 3: The glyphcache lock; taken whenever a thread calls freetype,
or accesses the glyphcache data structures. This introduces some
complexities w.r.t type3 fonts.
Locking can be hugely problematic, so to ease our minds as to
the correctness of this code, we introduce some debugging macros.
These compile away to nothing unless FITZ_DEBUG_LOCKING is defined.
fz_assert_lock_held(ctx, lock) checks that we hold lock.
fz_assert_lock_not_held(ctx, lock) checks that we do not hold lock.
In addition fz_lock_debug_lock and fz_lock_debug_unlock are used
on every fz_lock/fz_unlock to check the validity of the operation
we are performing - in particular it checks that we do/do not already
hold the lock we are trying to take/drop, and that by taking this
lock we are not violating our defined locking order.
The RESOLVE macro (used throughout the code to check whether we need
to resolve an indirect reference) calls fz_assert_lock_not_held to
ensure that we aren't about to resolve an indirect reference (and
hence move the stream pointer) when the file is locked.
In order to implement the file locking properly, pdf_open_stream
(and friends) now lock the file as a side effect (because they
fz_seek to the start of the stream). The lock is automatically
dropped on an fz_close of such streams.
Previously, the glyph cache was created in a context when it was first
required; this presents problems as it can be shared between several
contexts or not, depending on whether it is created before the
contexts are cloned. We now always create it at startup, so it is
always shared.
This means that we need reference counting for the glyph caches.
Added here.
In fz_render_glyph, we take the glyph cache lock, and check to see
whether the glyph is in the cache. If it is, we bump the refcount,
drop the lock and returned the cached character. If it is not, we
need to render the character.
For freetype based fonts we keep the lock throughout the rendering
process, thus ensuring that freetype is only called in a single
threaded manner.
For type3 fonts, however, we need to invoke the interpreter again
to render the glyph streams. This can require reentrance to this
routine. We therefore drop the glyph cache lock, call the
interpreter to render us our pixmap, and take the lock again.
This dropping and retaking of the lock introduces a possible race
condition; 2 threads may try to render the same character at the
same time. We therefore modify our hash table insert routines to
behave differently if it comes to insert an entry only to find
that an entry with the same key is already there.
We spot this case; if we have just rendered a type3 glyph and when
we try to insert it into the cache discover that someone has beaten
us to it, we just discard our entry and use the cached one.
Hopefully this will seldom be a problem in practise; to solve it
properly would require greater complexity (probably involving
spotting that another thread is already working on the desired
rendering, and sleeping on a semaphore until it completes).
|
|
Prevents segfaults when revisiting pages and trying to
access the link object that was freed too early.
|
|
|
|
|
|
|
|
|
|
|
|
Update xps path handling to cope with URLs.
Fix premature freeing of links.
Spot remote URLs and use appropriate link type.
|
|
Currently, this only works with local links.
When running the page, check for NavigateUri entries; if found,
and that page is not already marked as having resolved it's links,
add a new link entry to doc->current_page links. When the page
finishes running, mark the page as having resolved it's links.
This avoids the links being generated multiple times.
Update the mupdf viewer to use these links - but only AFTER the
page has been run.
|