Age | Commit message | Author |
|
Simple patch to replace const char * with char *. I made the patch
myself, but I suspect it's extremely close to the one submitted
by Evgeniy A Dushistov, who reported the bug - many thanks!
|
|
Talking to zeniko, he reports that SEGVs still occur in find_changing
within the fax decoder; he doesn't have an example that shows the
problem though (either one he can share, or one he cannot). Presumably
he has some sort of online feedback thing in the event of crashes.
Having stared at the code for a while, I see a potential problem.
I think the code may read too many bytes in the case where we
are entered with x already within the last byte of w. (i.e. where
x >= ((w-1)>>3)<<3). Fixed here.
|
|
If a PDF xref subsection is broken in the wrong place, we can get
NULL back from fz_strsep, which causes a SEGV when fed to atoi.
Add a new fz_atoi that copes with NULL to avoid this.
Problem found in a test file, 3959.pdf.SIGSEGV.ad4.3289 supplied
by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google
Security Team using Address Sanitizer. Many thanks!
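A minimal sketch of a NULL-tolerant integer parse of the kind described above. The name atoi_or_zero is hypothetical and is not the fz_atoi from the patch; returning 0 for NULL is an assumption made for illustration.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a NULL-safe atoi: a NULL token (for example
 * from a tokenizer that ran out of input) parses as 0 instead of
 * crashing inside atoi. */
int atoi_or_zero(const char *s)
{
    if (s == NULL)
        return 0;
    return atoi(s);
}
```

A caller parsing an xref subsection header could pass each token through such a wrapper and then validate the resulting numbers, rather than checking every token for NULL first.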
|
|
When cleaning a file with a corrupt stream in it, historically mupdf
would give up when it encountered such a stream. This is often not
what is desired, as information can be lost.
The changes herein allow us to use our best efforts when reading
a stream, so that broken streams are reproduced in the output
cleaned file.
Problem found in a test file, pdf_001/2599.pdf.asan.58.1778 supplied
by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google
Security Team using Address Sanitizer. Many thanks!
|
|
While investigating samples_mupdf_001/2599.pdf.asan.58.1778, a leak
showed up while cleaning the file, due to not dropping an object in
an error case.
mutool clean -dif samples_mupdf_001/2599.pdf.asan.58.1778 leak.pdf
Simple fix. Also extend PDF writing so that it can cope with skipping
errors so we at least get something out at the end.
Problem found in a test file supplied by Mateusz "j00ru" Jurczyk and
Gynvael Coldwind of the Google Security Team using Address Sanitizer.
Many thanks!
|
|
With added comment to explain the funky boolean logic.
|
|
Two problems with tiling are fixed here.
Firstly, if the tiling bounds are huge, the 'patch' region (the region
we are writing into), can overflow, causing a SEGV due to the paint code
being very confused by pixmaps that go from just under INT_MAX to just
over INT_MIN. Fix this by checking explicitly for overflow in these
bounds.
Secondly, if the tiles are stupidly huge, but the scissor is small, we
can end up looping many more times than we need to. We fix this by
mapping the scissor region back through the inverse transform, and
intersecting this with the pattern area.
Problem found in 4201.pdf.SIGSEGV.622.3560, a test file supplied by
Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google Security
Team using Address Sanitizer. Many thanks!
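The second fix can be illustrated along one axis. This is a sketch only, assuming the scissor has already been mapped back into pattern space (the inverse transform is not shown) and that tile i covers [i*step, (i+1)*step). The names tile_range and visible_tiles are hypothetical.

```c
/* Half-open range of tile indices [first, last). */
typedef struct { long long first, last; } tile_range;

/* Division rounding towards negative infinity; b must be > 0. */
static long long floor_div(long long a, long long b)
{
    long long q = a / b;
    if (a % b != 0 && a < 0)
        q--;
    return q;
}

/* Division rounding towards positive infinity; b must be > 0. */
static long long ceil_div(long long a, long long b)
{
    long long q = a / b;
    if (a % b != 0 && a > 0)
        q++;
    return q;
}

/* Intersect the pattern area [area0, area1) with the clip [clip0, clip1)
 * and return only the tiles that overlap the intersection. An empty
 * intersection (or a non-positive step) yields the empty range {0, 0}. */
tile_range visible_tiles(long long area0, long long area1,
                         long long clip0, long long clip1,
                         long long step)
{
    tile_range r = { 0, 0 };
    long long lo = area0 > clip0 ? area0 : clip0;
    long long hi = area1 < clip1 ? area1 : clip1;

    if (step <= 0 || lo >= hi)
        return r;
    r.first = floor_div(lo, step);
    r.last = ceil_div(hi, step);
    return r;
}
```

With a pattern area a billion units wide and a clip of [100, 250) at step 100, the loop runs over 2 tiles rather than ten million.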
|
|
When calculating the bbox for draw_glyph, if the x and y origins of
the glyph are extreme (too large to fit in an int), we get overflows
of the bbox; empty bboxes are transformed to large ones.
The fix is to introduce an fz_translate_bbox function that checks for
such things.
Also, we update various bbox/rect functions to check for empty bboxes
before they check for infinite ones (as a bbox of x0=0 x1=0 y0=0 y1=-1
will be detected both as infinite and empty).
Problem found in 2485.pdf.SIGSEGV.2a.1652, a test file supplied by
Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google Security
Team using Address Sanitizer. Many thanks!
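A sketch of an overflow-aware bbox translation that tests for empty before infinite. The struct layout, both predicates, and the choice to clamp at INT_MIN/INT_MAX are assumptions for illustration; the actual fz_translate_bbox may handle overflow differently.

```c
#include <limits.h>

typedef struct { int x0, y0, x1, y1; } bbox;

/* Assumed conventions for this sketch only. Note that {0, 0, 0, -1}
 * satisfies both predicates, so the order of the checks matters. */
static int bbox_is_empty(bbox b)    { return b.x0 == b.x1 || b.y0 == b.y1; }
static int bbox_is_infinite(bbox b) { return b.x0 > b.x1 || b.y0 > b.y1; }

/* Add two ints, clamping instead of overflowing. */
static int add_clamped(int a, int b)
{
    if (b > 0 && a > INT_MAX - b)
        return INT_MAX;
    if (b < 0 && a < INT_MIN - b)
        return INT_MIN;
    return a + b;
}

bbox translate_bbox(bbox b, int dx, int dy)
{
    /* Empty is checked first so an empty bbox is never moved or
     * mistaken for an infinite one. */
    if (bbox_is_empty(b))
        return b;
    if (bbox_is_infinite(b))
        return b;
    b.x0 = add_clamped(b.x0, dx);
    b.y0 = add_clamped(b.y0, dy);
    b.x1 = add_clamped(b.x1, dx);
    b.y1 = add_clamped(b.y1, dy);
    return b;
}
```

With clamping, an extreme origin collapses the box to a degenerate (empty) one at the edge of the int range rather than wrapping round to a huge one.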
|
|
It is perfectly allowable to have type3 glyphs that refer to
other type3 glyphs in the same font (and in theory it's probably
even possible to have type3 glyphs that refer back and forth
between 2 or more type3 fonts).
The old code used to cope with this just fine, but with the change
to 'early loading' of the glyphs to display lists at interpret time
a problem has crept in. When we load the type 3 font, we load
each glyph in turn. If glyph 1 tries to use glyph 2, then we look
up the font, only to find that the font has not been installed
yet, so we reload the entire font. This gets us into an infinite
loop.
As a fix for this, we split the loading of the type3 font into 2; we
load the font as normal, then allow the font to be inserted into
the list of current fonts. Then we run through the glyphs in the
font 'preparing' them (turning them into display lists).
This solves the infinite loop issue, but causes another problem;
recursive references (such as a font holding a display list that
contains a text node that contains a reference to the original font)
result in us never being able to free the structures.
To avoid this, we insist on never allowing type3 glyphs to be referenced
within a type3 display list. The display lists for all type3 glyphs
are therefore 'flat'. We achieve this by adding a 'nested' flag to
the pdf command stream interpreter structure, and setting this in the
case where we are running a glyph stream. We check for that flag in the
type3 glyph render function, and if present, we force the 'render_direct'
path to be used.
Finally, we ensure that fz_text groups are not needlessly created with
no contents.
Problem found in 2923.pdf.asan.22.2139, a test file supplied by
Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google Security
Team using Address Sanitizer. Many thanks!
|
|
Leaking long linked lists leads to stack overflows during the
Memento debug output. Avoid this by iterating rather than recursing
where possible.
Also, for sanity's sake, where we indent more than 40 spaces, use a
single '*' instead. This keeps logfiles sane.
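Both ideas can be sketched in a few lines. The node type and the function names are hypothetical and are not Memento's own; only the 40-space limit comes from the message above.

```c
#include <stddef.h>
#include <string.h>

typedef struct node { struct node *next; } node;

/* Walk a singly linked list with a loop: stack use stays constant no
 * matter how long the (leaked) list is. */
size_t list_length(const node *n)
{
    size_t len = 0;
    for (; n != NULL; n = n->next)
        len++;
    return len;
}

#define MAX_INDENT 40

/* Write the indent for 'depth' into buf as a NUL-terminated string and
 * return the number of characters written. Depths beyond MAX_INDENT
 * collapse to a single '*'. */
size_t format_indent(char *buf, size_t size, int depth)
{
    size_t n;

    if (size == 0)
        return 0;
    if (depth < 0)
        depth = 0;
    if (depth > MAX_INDENT) {
        n = size > 1 ? 1 : 0;
        if (n)
            buf[0] = '*';
    } else {
        n = (size_t)depth < size - 1 ? (size_t)depth : size - 1;
        memset(buf, ' ', n);
    }
    buf[n] = '\0';
    return n;
}
```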
|
|
Whenever we have an error while pushing a gstate, we run the risk of
getting confused over how many pops we need etc.
With this commit we introduce some checking at the dev_null level that
attempts to make this behaviour consistent.
Any caller may now assume that calling an operation that pushes a clip
will always succeed. This means the only error cleanup they need to
do is to ensure that if they have pushed a clip (or begun a group, or
a mask etc) they pop it too.
Any callee may now assume that if it throws an error during the call
to a device entrypoint that would create a group/clip/mask then no more
calls will be forthcoming until after the caller has completely finished
with that group.
This is achieved by the dev_null layer (the layer that indirects from
device calls through the device structure to the function pointers)
swallowing errors and regurgitating them later as required. A count is
kept of the number of pushes that have happened since an error
occurred during a push (including that initial one). When this count
reaches zero, the original error is regurgitated. This allows the
caller to keep the cookie correctly updated.
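The counting scheme can be sketched as follows. This is an illustration of the idea only, using integer error codes in place of exceptions; the device struct, device_push, device_pop and the fake backend are all hypothetical names, not the dev_null code itself.

```c
typedef struct {
    int (*push)(void *user);   /* returns 0 on success, else an error code */
    void (*pop)(void *user);
    void *user;
    int error_depth;           /* pushes since, and including, the failed one */
    int saved_error;           /* error swallowed at the failed push */
} device;

/* From the caller's point of view a push always succeeds. */
void device_push(device *dev)
{
    int err;

    if (dev->error_depth > 0) {
        /* Already inside a failed push: just count, do not call down. */
        dev->error_depth++;
        return;
    }
    err = dev->push(dev->user);
    if (err) {
        dev->error_depth = 1;
        dev->saved_error = err;
    }
}

/* Returns 0 normally, or the swallowed error when the pop that matches
 * the failed push is reached. */
int device_pop(device *dev)
{
    if (dev->error_depth > 0) {
        if (--dev->error_depth == 0) {
            int err = dev->saved_error;
            dev->saved_error = 0;
            return err;
        }
        return 0;
    }
    dev->pop(dev->user);
    return 0;
}

/* Hypothetical backend used only to exercise the sketch: its push
 * fails on call number 'fail_at'. */
typedef struct { int pushes, pops, fail_at; } fake_backend;

int fake_push(void *u)
{
    fake_backend *b = u;
    return ++b->pushes == b->fail_at ? -1 : 0;
}

void fake_pop(void *u)
{
    fake_backend *b = u;
    b->pops++;
}
```

In this sketch the backend never sees a pop for a push that failed, nor any push or pop nested inside it, which matches the guarantee given to callees above.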
|
|
With illegal fax streams we could access beyond the right hand edge
of the allocated line. Fix this by adding some simple checks.
Issue found by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the
Google Security Team using Address Sanitizer. Many thanks!
|
|
If an illegal keysize is passed into the AES crypt filter, we
currently exit without setting up the AES context. This causes
us to fail in all manner of ways later on.
We now return failure and callers throw an exception.
This appears to solve all the SEGVs and memory exceptions found in
crypt_aes by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the
Google Security Team using Address Sanitizer. Many thanks!
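A sketch of the "return failure on a bad key size" pattern. The context struct and toy_aes_setkey are hypothetical and perform no real key expansion; only the valid AES key sizes (128, 192 and 256 bits) and their round counts are standard facts.

```c
#include <string.h>

typedef struct {
    int rounds;
    unsigned char key[32];
} toy_aes_context;

/* Returns 0 on success, nonzero if keybits is not a valid AES key size.
 * On failure the context is left untouched, so the caller must check
 * the result and raise its own error rather than carry on. */
int toy_aes_setkey(toy_aes_context *ctx, const unsigned char *key, int keybits)
{
    switch (keybits) {
    case 128: ctx->rounds = 10; break;
    case 192: ctx->rounds = 12; break;
    case 256: ctx->rounds = 14; break;
    default:  return 1;
    }
    memcpy(ctx->key, key, (size_t)keybits / 8);
    return 0;
}
```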
|
|
The issues fixed here were found by zeniko - many thanks.
The patch here is our own work - larger change, avoiding casts
for a (hopefully) neater result.
|
|
Thanks to zeniko for pointing out that the recent changes to
the fz_try/fz_catch macros to allow for throws in the fz_always
block had broken the exception stack overflow case.
Thanks also for the example file (nesting stack overflow.pdf),
which has now been added to the regression suite.
|
|
Add a mechanism for getting a color converter function. Implement
the 'convert a single color' call in terms of that. 'Bulk' users
can then repeatedly call the single function.
|
|
A NULL pointer dereference could be caused in error cases due
to me failing to apply zeniko's patch correctly.
|
|
Throwing from within the always block is bad practice, but attempt
to cope with it gracefully.
|
|
Turns out that jpeg_finish_decompress can throw errors, hence
can cause an infinite loop. This is fixed here by changing the
jpeg error code to be fz_throw based.
Thanks to zeniko for this patch.
This highlights something that I hadn't fully appreciated before;
anything that throws in a fz_always region will reenter that region.
I think I have a way to fix this so that any throws in the
fz_always region go immediately to the fz_catch.
|
|
Thanks to zeniko for finding various problems and submitting a
patch that fixes them. This commit covers the simpler issues from
his patch; other commits will follow shortly.
* Out of range LZW codes.
* Buffer overflows and error handling in image_jpeg.c
* Buffer overflows in tiff handling
* Buffer overflows in cmap parsing.
* Potential double free in font handling.
* Buffer overflow in pdf_form.c
* Use of uninitialised value in error case in pdf_image.c
* NULL pointer dereference in xps_outline.c
|
|
Thanks to zeniko for these.
Use otf as extension for opentype fonts.
fz_clampi should take ints, not floats!
Fix typo in prototype.
Squash unwanted warning.
Remove magic number in favour of #define.
Reset generation numbers when renumbering.
|
|
All these leaks were spotted by zeniko, so credit/thanks to him.
|
|
|
|
|
|
Use just 1 loop rather than 2, and count downwards as this is faster
on most architectures.
For the 'hash tabled memoized' general case, the time taken to form the
hashes is significant. Add some code to check that the pixel isn't the
same as the one we just did and bypass the hash.
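The "same pixel as last time" bypass can be sketched like this. expensive_convert is a hypothetical stand-in for the hashed, memoized conversion; the counter exists only so the number of slow lookups can be observed.

```c
#include <stddef.h>

unsigned slow_calls;   /* how many times the slow path was taken */

/* Hypothetical stand-in for the hash lookup plus conversion. */
unsigned expensive_convert(unsigned px)
{
    slow_calls++;
    return ~px;
}

/* Single loop counting downwards, with a one-entry cache in front of
 * the slow path: runs of identical pixels cost one lookup. */
void convert_pixels(const unsigned *src, unsigned *dst, size_t n)
{
    unsigned last_in = 0, last_out = 0;
    int have_last = 0;

    while (n-- > 0) {
        unsigned px = *src++;
        if (!have_last || px != last_in) {
            last_in = px;
            last_out = expensive_convert(px);
            have_last = 1;
        }
        *dst++ = last_out;
    }
}
```

For the input 1, 1, 1, 2, 2, 1 the slow path is taken three times rather than six.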
|
|
The BOM was erroneously being emitted as a text node.
|
|
|
|
We still need to have the callback for type 3 fonts that are uncacheable.
With this change the callback is only ever called directly from the
interpreter in fz_prepare_t3_glyph and fz_render_t3_glyph_direct.
|
|
This means that repeated scaling of the same pixmap (or scales of
'stacked' pixmaps) will do less needless recalculation.
|
|
Once again, thanks to zeniko for pointing this out. With non-monochrome
scales, the 'stray' cases at the end of the line will loop 0 times on x,
resulting in a skewed result.
|
|
Thanks to zeniko for pointing this out. Non-monochrome subsamples
would have gone wrong in the last line.
|
|
Thanks to zeniko for pointing out these places that I'd missed updating
the old code.
|
|
Move the assembly macros into fitz-internal.h.
|
|
|
|
Silly slip in my optimised code that results in failing to find
differences at the ends of lines.
|
|
|
|
to avoid clashes, especially on systems where "tolower" is declared as a
macro, for example Cygwin.
|
|
When drawing images, if they are much bigger than we need, quickly
subsample them. Makes images much more cacheable, reduces time spent
in expensive smooth scaler.
|
|
When calculating the factor to use for image downscales, calculate it
as a shift rather than a divisor.
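One way to pick such a shift, as a sketch: choose the largest power-of-two reduction that still leaves the image at least as large as the area it will be drawn into, so the smooth scaler only ever scales down. The function name and the exact stopping rule are assumptions, not the code from the commit.

```c
/* Return the largest f such that (w >> f) >= dw and (h >> f) >= dh,
 * or 0 if no reduction is possible. w and h are the source image size,
 * dw and dh the size needed on the destination. */
int subsample_shift(int w, int h, int dw, int dh)
{
    int f = 0;

    if (w <= 0 || h <= 0 || dw <= 0 || dh <= 0)
        return 0;
    while ((w >> (f + 1)) >= dw && (h >> (f + 1)) >= dh)
        f++;
    return f;
}
```

A 4000x3000 image drawn into 500x400 gets a shift of 2 (a quick 4x reduction to 1000x750), leaving the final reduction to the smooth scaler.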
|
|
Requires android-ndk-profiler to be copied into android and android/jni.
Also requires r8c of the NDK.
|
|
A huge number of calls are made to getbit from find_changing in fax
decompression. On Android profiling shows that this accounts for 25%
of time in handling page 2 of IA3Z0845.pdf.
Rewrite code to deal with bytes at a time for speed. Profiling
now shows 5% in this function.
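A simplified sketch of the byte-at-a-time idea, not the actual find_changing: it finds the next set bit in an MSB-first line of w pixels, skipping whole zero bytes, and never reads a byte that lies at or beyond bit w. The function name is hypothetical.

```c
/* Return the position of the first set bit at or after x in an
 * MSB-first bit line of w bits, or w if there is none. */
int find_next_set_bit(const unsigned char *line, int x, int w)
{
    if (x < 0)
        x = 0;
    while (x < w) {
        /* On a byte boundary with a full byte still inside the line:
         * test 8 bits at once. */
        if ((x & 7) == 0 && x + 8 <= w && line[x >> 3] == 0) {
            x += 8;
            continue;
        }
        /* Otherwise fall back to a single-bit test; x < w guarantees
         * that byte x >> 3 belongs to the line. */
        if (line[x >> 3] & (0x80 >> (x & 7)))
            return x;
        x++;
    }
    return w;
}
```

The fast path is only taken when the whole byte is inside the line, so entering with x already within the last byte of w falls through to the bitwise path.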
|
|
Same algorithm, just implemented in fixed point with a 1 place cache
and checks for trivial black/white rather than floating point.
|
|
Avoid repeated muls by reusing intermediates. Speed generation
of those intermediates by using adds/subs rather than muls.
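The general technique of trading multiplies for adds can be sketched on a linear sequence. Which intermediates the commit actually reuses is not stated here, so fill_linear is only a generic, hypothetical illustration.

```c
/* Fill out[i] = base + i * step for i in [0, n) without multiplying:
 * each value is derived from the previous one by a single add. */
void fill_linear(int *out, int n, int base, int step)
{
    int v = base;
    int i;

    for (i = 0; i < n; i++) {
        out[i] = v;
        v += step;
    }
}
```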
|
|
|
|
|
|
|
|
|
|
|
|
Regenerate dirty appearance streams and report changed annotations since
last call.
Also include a partial revert of changes in 96f335bc, that turn out not
to be necessary.
fz_update_page must now be called between each document-changing event and
the next render. pdfapp.c and the android app have been updated to do so,
but do not yet take advantage of the possibility to render only the updated
areas of the screen.
|
|
|