Age | Commit message (Collapse) | Author |
|
When cleaning a file with a corrupt stream in it, historically mupdf
would give up when it encountered such a stream. This is often not
what is desired, as information can be lost.
The changes herein allow us to use our best efforts when reading
a stream, so that broken streams are reproduced in the output
cleaned file.
Problem found in a test file, pdf_001/2599.pdf.asan.58.1778 supplied
by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google
Security Team using Address Sanitizer. Many thanks!
|
|
Instead of using macros for min/max/abs/clamp, we move to using
inline functions. These are more typesafe, and should produce
equivalent code on compilers that support inline (i.e. pretty much
everything we care about these days).
People can always do their own macro versions if they prefer.
|
|
Attempt to separate public API from internal functions.
|
|
|
|
Remove stray space at the end of buffers.
|
|
Introduce a new 'fz_image' type; this type contains rudimentary
information about images (such as native, size, colorspace etc)
and a function to call to get a pixmap of that image (with a
size hint).
Instead of passing pixmaps through the device interface (and
holding pixmaps in the display list) we now pass images instead.
The rendering routines therefore call fz_image_to_pixmap to get
pixmaps to render, and fz_pixmap_drop those afterwards.
The file format handling routines therefore need to produce
images rather than pixmaps; xps and cbz currently just wrap
pixmaps as images. PDF is more involved.
The stream handling routines in PDF have been altered so that
they can recognise when the last stream entry in a filter
dictionary is an image decoding filter. Rather than applying
this filter, they read and store the parameters into a
pdf_image_params structure, and stop decoding at that point.
This allows us to read the compressed data for an image into
memory as a block. We can then restart the image decode process
later.
pdf_images therefore consist of the compressed image data for
images. When a pixmap is requested for such an image, the code
checks to see if we have one (of an appropriate size), and if
not, decodes it.
The size hint is used to determine whether it is possible to
subsample the image; currently this is only supported for
JPEGs, but we could add generic subsampling code later.
In order to handle caching the produced images, various changes
have been made to the store and the underlying hash table.
Previously the store was indexed purely by fz_obj keys; we don't
have an fz_obj key any more, so have extended the store by adding
a concept of a key 'type'. A key type is a pointer to a set of
functions that keep/drop/compare and make a hashable key from
a key pointer.
We make a pdf_store.c file that contains functions to offer the
existing fz_obj based functions, and add a new 'type' for keys
(based on the fz_image handle, and the subsample factor) in the
pdf_image.c file.
While working on this, a problem became apparent in the existing
store codel; fz_obj objects had no protection on their reference
counts, hence an interpreter thread could try to alter a ref count
at the same time as a malloc caused an eviction from the store.
This has been solved by using the alloc lock as protection. This in
turn requires some tweaks to the code to make sure we don't try
and keep/drop fz_obj's from the store code while the alloc lock is
held.
A side effect of this work is that when a hash table is created, we
inform it what lock should be used to protect its innards (if any).
If the alloc lock is used, the insert method knows to drop/retake it
to allow it to safely expand the hash table. Callers to the hash
functions have the responsibility of taking/dropping the appropriate
lock, and ensuring that they cope with the possibility that insert
might drop the alloc lock, causing race conditions.
|
|
|
|
|
|
|
|
Mostly redoing the xps_context to xps_document change and adding
contexts to newly written code.
Conflicts:
apps/pdfapp.c
apps/pdfapp.h
apps/x11_main.c
apps/xpsdraw.c
draw/draw_device.c
draw/draw_scale.c
fitz/base_object.c
fitz/fitz.h
pdf/mupdf.h
pdf/pdf_interpret.c
pdf/pdf_outline.c
pdf/pdf_page.c
xps/muxps.h
xps/xps_doc.c
xps/xps_xml.c
|
|
The pointer arithmetic could over/underflow, causing the range
test to pass even when it shouldn't (due to compiler oddities).
|
|
This frees us from passing errors back everywhere, and hence enables us
to pass results back as return values.
Rather than having to explicitly check for errors everywhere and bubble
them, we now allow exception handling to do the work for us; the
downside to this is that we no longer emit as much debugging information
as we did before (though this could be put back in). For now, the
debugging information we have lost has been retained in comments
with 'RJW:' at the start.
This code needs fuller testing, but is being committed as a work in
progress.
|
|
|
|
Huge pervasive change to lots of files, adding a context for exception
handling and allocation.
In time we'll move more statics into there.
Also fix some for(i = 0; i < function(...); i++) calls.
|
|
Import exception handling code from WSS, modified to fit into the
fitz world.
With this code we have 'real' fz_try/fz_catch/fz_rethrow functions,
handling a fz_except type. We therefore rename the existing fz_throw/
fz_catch/fz_rethrow to be fz_error_make/fz_error_handle/fz_error_note.
We don't actually use fz_try/fz_catch/fz_rethrow yet...
|
|
|
|
The run-together words are dead! Long live the underscores!
The postscript inspired naming convention of using all run-together
words has served us well, but it is now time for more readable code.
In this commit I have also added the sed script, rename.sed, that I used
to convert the source. Use it on your patches and application code.
|
|
instead.
|
|
|
|
|
|
|
|
streams opened with openbuffer.
|
|
|
|
|
|
|
|
number of bytes read into buffer into consideration.
|
|
in pdf_loadstream.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|