Age | Commit message | Author |
|
Talking to zeniko, he reports that SEGVs still occur in find_changing
within the fax decoder; he doesn't have an example that shows the
problem though (either one he can share, or one he cannot). Presumably
he has some sort of online feedback thing in the event of crashes.
Having stared at the code for a while, I see a potential problem.
I think the code may read too many bytes in the case where we
are entered with x already within the last byte of w. (i.e. where
x >= ((w-1)>>3)<<3). Fixed here.
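As a point of comparison, here is a minimal bit-at-a-time sketch of the search, which cannot read past byte (w-1)>>3 because every access is guarded by x < w. This is not the optimised decoder code; the names `getbit` and `find_changing_ref`, and the MSB-first bit order, are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Bit x of the line, assuming MSB-first packing. */
int getbit(const unsigned char *line, int x)
{
	return (line[x >> 3] >> (7 - (x & 7))) & 1;
}

/* Return the first position after x whose bit differs from bit x,
 * or w if the rest of the line is the same colour. */
int find_changing_ref(const unsigned char *line, int x, int w)
{
	int a;

	assert(line != NULL && x >= 0 && x < w);
	a = getbit(line, x);
	for (x++; x < w; x++)
		if (getbit(line, x) != a)
			break;
	return x;
}
```

A byte-at-a-time version could be checked against this one, especially for x within the last byte of the line.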
|
|
If a PDF xref subsection is broken in the wrong place, we can get
NULL back from fz_strsep, which causes a SEGV when fed to atoi.
Add a new fz_atoi that copes with NULL to avoid this.
Problem found in a test file, 3959.pdf.SIGSEGV.ad4.3289 supplied
by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google
Security Team using Address Sanitizer. Many thanks!
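A minimal sketch of a NULL-tolerant wrapper of this kind. The name `atoi_or_zero` is hypothetical, and returning 0 for NULL is an assumption about the desired behaviour rather than a statement of what fz_atoi does.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a NULL-safe atoi: treat a missing
 * string as 0 instead of handing NULL to atoi. */
int atoi_or_zero(const char *s)
{
	if (s == NULL)
		return 0;
	return atoi(s);
}
```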
|
|
Paul had written code before to detect clicks on hyperlinks, but
we hadn't actually done anything with these clicks once detected.
Andre Ferreira supplied a couple of lines of Android magic to form
the Intent from the URL and execute it. Incorporate that here.
(Andre should have é but this upsets git/my editor, sorry)
As part of this patch, we now respond to links at a higher priority
than the left/right clicks to flip pages (but only if link following
mode is enabled).
|
|
Thanks to Andre Ferreira for bringing this up and submitting a
patch. (Andre should have é, but this upsets git/my editor, sorry!)
Change BitmapHolder handling so that we explicitly recycle bitmaps.
Old versions of Android need this to avoid bitmaps 'sticking' in
memory, and it doesn't hurt on new versions.
Also, explicitly empty the bitmap holder before creating a new
bitmap. This avoids us holding more than one copy of the (potentially
large) bitmaps.
|
|
|
|
|
|
Also invalidate search view on every select box change and
avoid creating multiple get-text tasks
|
|
although not actually do anything with the selection yet
|
|
|
|
|
|
If a colorspace refers to itself as a base, we can get an infinite
recursion and hence stack overflow. Thanks to zeniko for pointing out
that this occurs in embedded CMAPs and stitching functions. Also
solved here.
To avoid having to keep a long list of the objects we've traversed
through, extend the pdf_dict_mark functions to work on all pdf objects,
and hence rename them as pdf_obj_mark etc. Thanks to zeniko again for
feedback on this way of working.
Problem found in a test file, 3882.pdf.SIGSEGV.99.3204 supplied
by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google
Security Team using Address Sanitizer. Many thanks!
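The marking idea can be sketched as follows. The `struct obj` layout and the `chain_depth` function are hypothetical, not the pdf_obj_mark API; the point is only that a per-object flag replaces a list of visited objects.

```c
#include <stddef.h>

/* Hypothetical object that may name another object as its base. */
struct obj {
	struct obj *base; /* NULL when there is no base */
	int marked;       /* set while the object is being traversed */
};

/* Return the length of the base chain, or -1 if the chain loops
 * back onto an object that is still being traversed. */
int chain_depth(struct obj *o)
{
	int depth;

	if (o == NULL)
		return 0;
	if (o->marked)
		return -1; /* already on the current path: a cycle */

	o->marked = 1;
	depth = chain_depth(o->base);
	o->marked = 0; /* always unmark on the way out */

	return depth < 0 ? -1 : depth + 1;
}
```

Unmarking on every exit path matters; a mark left behind after an error would make later, legitimate traversals look like cycles.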
|
|
When parsing a (broken) PDF stream, we can forget an existing
parsed object when we parse another one. Check for us having
one and free it if we do.
Problem found in a test file, 3289.pdf.asan.77.2545 supplied
by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google
Security Team using Address Sanitizer. Many thanks!
|
|
If the Function entry does not point to either a dictionary or an
array, we should give up, otherwise we dereference a NULL pointer.
Problem found in a test file, 1013.pdf.SIGSEGV.8a7.18 supplied
by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google
Security Team using Address Sanitizer. Many thanks!
|
|
|
|
When cleaning a file with a corrupt stream in it, historically mupdf
would give up when it encountered such a stream. This is often not
what is desired, as information can be lost.
The changes herein allow us to use our best efforts when reading
a stream, so that broken streams are reproduced in the output
cleaned file.
Problem found in a test file, pdf_001/2599.pdf.asan.58.1778 supplied
by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google
Security Team using Address Sanitizer. Many thanks!
|
|
When reading a CMAP with values out of range, we can go into a
very long loop emitting the same pair of warnings.
Spot the error case earlier and thus give a nicer report.
Problem found in a test file, 3192.pdf.SIGSEGV.b0.2438 supplied
by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google
Security Team using Address Sanitizer. Many thanks!
|
|
An unused dictionary reference could be left dangling. Simple fix
is to drop the reference after use.
Problem found in a test file, 2785.pdf.asan.6d.1985 supplied
by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google
Security Team using Address Sanitizer. Many thanks!
|
|
Problem found in a test file, 4174.pdf.SIGSEGV.50c.3529 supplied
by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google
Security Team using Address Sanitizer. Many thanks!
Fonts extracted from the file supplied to Werner Lemberg who fixed
the problem in freetype. Again, many thanks.
|
|
When running a softmask, we remove the softmask from the gstate,
then run the group contents, then put the softmask back.
If the gstate stack is moved in the meantime (due to it being
realloced for extension), we can end up writing through a stale pointer.
We therefore must recalculate gstate before writing again.
Problem found in a test file, pdf_001/2599.pdf.asan.58.1778 supplied
by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google
Security Team using Address Sanitizer. Many thanks!
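A sketch of the general pattern, under the assumption that the stack is a growable array: remember the index of the entry rather than a pointer to it, and re-derive the address after anything that may realloc. The `struct stack`, `stack_push` and `run_without_softmask` names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct gstate { int softmask; };

struct stack {
	struct gstate *items;
	int len, cap;
};

/* Push a copy of the top entry. May move items via realloc.
 * Returns 0 on success, -1 on allocation failure. */
int stack_push(struct stack *s)
{
	assert(s->len >= 1);
	if (s->len == s->cap) {
		int ncap = s->cap * 2;
		struct gstate *n = realloc(s->items, (size_t)ncap * sizeof *n);
		if (n == NULL)
			return -1;
		s->items = n;
		s->cap = ncap;
	}
	s->items[s->len] = s->items[s->len - 1];
	s->len++;
	return 0;
}

/* Clear the softmask, run nested work that may grow the stack,
 * then restore the softmask in the (possibly moved) entry. */
int run_without_softmask(struct stack *s, int nested_pushes)
{
	int top = s->len - 1; /* remember the index, not the pointer */
	int saved = s->items[top].softmask;
	int i, rc = 0;

	s->items[top].softmask = 0;
	for (i = 0; i < nested_pushes && rc == 0; i++)
		rc = stack_push(s); /* may realloc s->items */
	s->len = top + 1; /* pop whatever the nested work pushed */

	s->items[top].softmask = saved; /* address recalculated here */
	return rc;
}
```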
|
|
While investigating samples_mupdf_001/2599.pdf.asan.58.1778, a leak
showed up while cleaning the file, due to not dropping an object in
an error case.
mutool clean -dif samples_mupdf_001/2599.pdf.asan.58.1778 leak.pdf
Simple fix. Also extend PDF writing so that it can cope with skipping
errors so we at least get something out at the end.
Problem found in a test file supplied by Mateusz "j00ru" Jurczyk and
Gynvael Coldwind of the Google Security Team using Address Sanitizer.
Many thanks!
|
|
If an OCG refers to itself, we end up recursing forever and
eventually stack overflow. Fix with the pdf_dict_mark stuff.
Problem found in 1551.pdf.SIGSEGV.7fd.615, a test file supplied by
Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google Security
Team using Address Sanitizer. Many thanks!
|
|
With added comment to explain the funky boolean logic.
|
|
The pdf function code only expects a maximum of FZ_MAX_COLORS
component functions in a sampling function; more functions than
this causes a buffer overflow. Add some checks to avoid this.
Problem found in 1219.pdf.SIGSEGV.fc0.246, a test file supplied by
Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google Security
Team using Address Sanitizer. Many thanks!
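The kind of check involved might look like this. `MAX_COMPONENTS` stands in for FZ_MAX_COLORS and its value of 32 is an assumption; `struct sample_func` and `sample_func_set_outputs` are hypothetical names.

```c
#define MAX_COMPONENTS 32 /* assumed value, standing in for FZ_MAX_COLORS */

struct sample_func {
	int n;                     /* number of output components */
	float out[MAX_COMPONENTS]; /* fixed-size per-component storage */
};

/* Record n only if it fits the fixed-size array.
 * Returns 0 on success, -1 if n is out of range. */
int sample_func_set_outputs(struct sample_func *f, int n)
{
	if (n < 1 || n > MAX_COMPONENTS)
		return -1;
	f->n = n;
	return 0;
}
```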
|
|
Two problems with tiling are fixed here.
Firstly, if the tiling bounds are huge, the 'patch' region (the region
we are writing into), can overflow, causing a SEGV due to the paint code
being very confused by pixmaps that go from just under INT_MAX to just
over INT_MIN. Fix this by checking explicitly for overflow in these
bounds.
Secondly, if the tiles are stupidly huge, but the scissor is small, we
can end up looping many more times than we need to. We fix this by
mapping the scissor region back through the inverse transform, and
intersecting this with the pattern area.
Problem found in 4201.pdf.SIGSEGV.622.3560, a test file supplied by
Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google Security
Team using Address Sanitizer. Many thanks!
|
|
If the key length is specified too long (0x120 for example), we can
overrun the key buffer (32 bytes). Fix this with some explicit
checks.
Problem found in 2513.pdf.asan.73.1684, a test file supplied by
Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google Security
Team using Address Sanitizer. Many thanks!
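A sketch of such a check, assuming the declared length is in bits (so 0x120 is 288 bits, or 36 bytes, which does not fit a 32 byte buffer). The function name and the choice to reject rather than clamp are assumptions.

```c
#define KEY_BUF_BYTES 32

/* Convert a declared key length in bits to bytes.
 * Returns the byte count, or -1 if the length is not a positive
 * multiple of 8 or would overrun the key buffer. */
int checked_key_bytes(int length_bits)
{
	if (length_bits < 8 || length_bits % 8 != 0)
		return -1;
	if (length_bits / 8 > KEY_BUF_BYTES)
		return -1;
	return length_bits / 8;
}
```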
|
|
|
|
When calculating the bbox for draw_glyph, if the x and y origins of
the glyph are extreme (too large to fit in an int), we get overflows
of the bbox; empty bboxes are transformed to large ones.
The fix is to introduce an fz_translate_bbox function that checks for
such things.
Also, we update various bbox/rect functions to check for empty bboxes
before they check for infinite ones (as a bbox of x0=0 x1=0 y0=0 y1=-1
will be detected both as infinite and empty).
Problem found in 2485.pdf.SIGSEGV.2a.1652, a test file supplied by
Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google Security
Team using Address Sanitizer. Many thanks!
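A sketch of an overflow-aware translate with the empty test placed before the infinite test. The definitions of "empty" (zero extent on either axis) and "infinite" (inverted on either axis) are assumptions chosen to agree with the example bbox above; the names and the choice to clamp are hypothetical, not the fz_translate_bbox implementation.

```c
#include <limits.h>

struct bbox { int x0, y0, x1, y1; };

/* Add b to a, clamping to the int range instead of wrapping. */
int add_clamped(int a, int b)
{
	if (b > 0 && a > INT_MAX - b)
		return INT_MAX;
	if (b < 0 && a < INT_MIN - b)
		return INT_MIN;
	return a + b;
}

int bbox_is_empty(struct bbox r)
{
	return r.x0 == r.x1 || r.y0 == r.y1;
}

int bbox_is_infinite(struct bbox r)
{
	return r.x0 > r.x1 || r.y0 > r.y1;
}

/* Translate r by (dx, dy). Empty and infinite boxes are returned
 * unchanged, and emptiness is tested first. */
struct bbox bbox_translate(struct bbox r, int dx, int dy)
{
	if (bbox_is_empty(r))
		return r;
	if (bbox_is_infinite(r))
		return r;
	r.x0 = add_clamped(r.x0, dx);
	r.y0 = add_clamped(r.y0, dy);
	r.x1 = add_clamped(r.x1, dx);
	r.y1 = add_clamped(r.y1, dy);
	return r;
}
```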
|
|
This makes searching for things much easier.
|
|
Also removed the split between onCreate and onResume. No idea why I
introduced that in the first place.
|
|
It is perfectly allowable to have type3 glyphs that refer to
other type3 glyphs in the same font (and in theory it's probably
even possible to have type3 glyphs that refer back and forth
between 2 or more type3 fonts).
The old code used to cope with this just fine, but with the change
to 'early loading' of the glyphs to display lists at interpret time
a problem has crept in. When we load the type 3 font, we load
each glyph in turn. If glyph 1 tries to use glyph 2, then we look
up the font, only to find that the font has not been installed
yet, so we reload the entire font. This gets us into an infinite
loop.
As a fix for this, we split the loading of the type3 font into 2; we
load the font as normal, then allow the font to be inserted into
the list of current fonts. Then we run through the glyphs in the
font 'preparing' them (turning them into display lists).
This solves the infinite loop issue, but causes another problem;
recursive references (such as a font holding a display list that
contains a text node that contains a reference to the original font)
result in us never being able to free the structures.
To avoid this, we insist on never allowing type3 glyphs to be referenced
within a type3 display list. The display lists for all type3 glyphs
are therefore 'flat'. We achieve this by adding a 'nested' flag to
the pdf command stream interpreter structure, and setting this in the
case where we are running a glyph stream. We check for that flag in the
type3 glyph render function, and if present, we force the 'render_direct'
path to be used.
Finally, we ensure that fz_text groups are not needlessly created with
no contents.
Problem found in 2923.pdf.asan.22.2139, a test file supplied by
Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google Security
Team using Address Sanitizer. Many thanks!
|
|
With a small dst_w (e.g. 1e-23) the floating point maths governing
scales can go wrong in the weight calculations. MSVC in particular
seems to return 1<<31 for the result of the max_len calculation.
It makes no real sense to scale bitmaps to < 1 pixel, so simply clamp
width and height as required.
Problem found in 2923.pdf.asan.22.2139, a test file supplied by
Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google Security
Team using Address Sanitizer. Many thanks!
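The clamp itself is tiny. This sketch assumes the extent has already been made non-negative (any flip handled elsewhere); `clamp_extent` is a hypothetical name.

```c
/* Never scale to less than one pixel. Written as !(v >= 1) so
 * that a NaN extent also ends up as 1 rather than propagating. */
float clamp_extent(float v)
{
	return (v >= 1.0f) ? v : 1.0f;
}
```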
|
|
Leaking long linked lists leads to stack overflows during the
Memento debug output. Avoid this by iterating rather than recursing
where possible.
Also, for sanity's sake, where we indent more than 40 spaces, use a
single '*' instead. This keeps logfiles sane.
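Both ideas in a small sketch: walk the chain with a loop instead of recursion, and cap the indent. The `struct block` layout, the function names and the output format are hypothetical, not Memento's.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_INDENT 40

/* Write the indent for depth into buf (at least MAX_INDENT + 1
 * bytes). Beyond MAX_INDENT spaces, use a single '*' instead. */
void format_indent(char *buf, int depth)
{
	assert(depth >= 0);
	if (depth > MAX_INDENT) {
		buf[0] = '*';
		buf[1] = '\0';
		return;
	}
	memset(buf, ' ', (size_t)depth);
	buf[depth] = '\0';
}

struct block { struct block *next; };

/* Print one line per block, iterating so that a very long chain
 * cannot exhaust the stack. Returns the number of blocks seen. */
int print_chain(FILE *out, const struct block *b)
{
	char indent[MAX_INDENT + 1];
	int depth = 0;

	for (; b != NULL; b = b->next, depth++) {
		format_indent(indent, depth);
		fprintf(out, "%sblock %d\n", indent, depth);
	}
	return depth;
}
```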
|
|
When extreme ranges (+/- MAX_INT) are passed into the scaler,
signed wraparound gives us problems when calculating the patch.
Simply ignore such cases.
Problem found in 1792.pdf.SIGSEGV.387.883, a test file supplied by
Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google Security
Team using Address Sanitizer. Many thanks!
|
|
Whenever we have an error while pushing a gstate, we run the risk of
getting confused over how many pops we need etc.
With this commit we introduce some checking at the dev_null level that
attempts to make this behaviour consistent.
Any caller may now assume that calling an operation that pushes a clip
will always succeed. This means the only error cleanup they need to
do is to ensure that if they have pushed a clip (or begun a group, or
a mask etc) they pop it too.
Any callee may now assume that if it throws an error during the call
to a device entrypoint that would create a group/clip/mask then no more
calls will be forthcoming until after the caller has completely finished
with that group.
This is achieved by the dev_null layer (the layer that indirects from
device calls through the device structure to the function pointers)
swallowing errors and regurgitating them later as required. A count is
kept of the number of pushes that have happened since an error
occurred during a push (including that initial one). When this count
reaches zero, the original error is regurgitated. This allows the
caller to keep the cookie correctly updated.
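The counting scheme can be sketched in isolation. The `struct push_tracker` type and its functions are hypothetical and leave out the actual device calls; `rc` is assumed to be the result of the underlying push, with 0 meaning success.

```c
struct push_tracker {
	int depth; /* pushes since (and including) the failed one */
	int error; /* the swallowed error code, 0 if none */
};

/* Non-zero while device calls should still be forwarded. */
int tracker_active(const struct push_tracker *t)
{
	return t->depth == 0;
}

/* Record a push. Inside a failed group we only count; otherwise
 * a failing push is swallowed and starts the count at 1. */
void tracker_push(struct push_tracker *t, int rc)
{
	if (t->depth > 0) {
		t->depth++;
	} else if (rc != 0) {
		t->error = rc;
		t->depth = 1;
	}
}

/* Record a pop. Returns the swallowed error when the pop that
 * matches the failed push arrives, and 0 otherwise. */
int tracker_pop(struct push_tracker *t)
{
	int rc;

	if (t->depth == 0)
		return 0;
	if (--t->depth > 0)
		return 0;
	rc = t->error;
	t->error = 0;
	return rc;
}
```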
|
|
|
|
|
|
With illegal fax streams we could access beyond the right hand edge
of the allocated line. Fix this by adding some simple checks.
Issue found by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the
Google Security Team using Address Sanitizer. Many thanks!
|
|
We failed to detect a PDF sample function with a size of 0 as being
illegal. This led us to continue through the code, and then access
out of bounds.
Issue found by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the
Google Security Team using Address Sanitizer. Many thanks!
|
|
If an illegal keysize is passed into the AES crypt filter, we
currently exit without setting up the AES context. This causes
us to fail in all manner of ways later on.
We now return failure and callers throw an exception.
This appears to solve all the SEGVs and memory exceptions found in
crypt_aes by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the
Google Security Team using Address Sanitizer. Many thanks!
|
|
Another patch from zeniko; if we read an unknown cmd while parsing
a path string, ensure that we skip over any subsequent numbers to
avoid going into an infinite loop.
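A sketch of the skipping step, assuming a path string of single-letter commands followed by numbers separated by spaces or commas. The functions `skip_numbers` and `count_known_commands`, and the choice of 'M' and 'L' as the only known commands, are for illustration only.

```c
#include <ctype.h>
#include <stdlib.h>

/* Does s begin with something that looks like a number? */
int starts_number(const char *s)
{
	if (*s == '+' || *s == '-')
		s++;
	return isdigit((unsigned char)*s) || *s == '.';
}

/* Skip separators and any run of numbers; return the first
 * character that is neither. */
const char *skip_numbers(const char *s)
{
	for (;;) {
		char *end;

		while (*s == ' ' || *s == ',')
			s++;
		if (!starts_number(s))
			return s;
		strtod(s, &end);
		if (end == s)
			return s; /* e.g. a lone '.', not a number after all */
		s = end;
	}
}

/* Count the known commands in a path string. Known or not, the
 * numbers after each command are consumed so parsing advances. */
int count_known_commands(const char *s)
{
	int known = 0;

	while (*s != '\0') {
		char cmd = *s++;

		if (cmd == 'M' || cmd == 'L')
			known++;
		s = skip_numbers(s);
	}
	return known;
}
```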
|
|
A user (stu-mupdf) points out that if winopen fails, we throw
an error, which crashes due to the exception stack not having
been set up yet. The solution is simply to move pdfapp_init a
little earlier.
|
|
A user (av1474) points out that pthread error codes are non-zero,
not negative; hence fix the example code to test for these
correctly.
|
|
Move the TR2 handling code. Thanks to zeniko for this.
|
|
Another fix from zeniko. Thanks again.
|
|
The issues fixed here were found by zeniko - many thanks.
The patch here is our own work - larger change, avoiding casts
for a (hopefully) neater result.
|
|
Thanks to zeniko for pointing this out. If we encounter a new definition
for a given object (presumably due to a repair operation), we used to
throw the old one away, and keep the new one. This could cause any
current holders of the object to be left with a stale pointer.
Now we throw the new one away and keep the old one - with a warning
if they are different.
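The "first definition wins" rule, sketched with strings standing in for parsed objects. The `struct slot` type, the `slot_define` function and the result codes are hypothetical.

```c
#include <stddef.h>
#include <string.h>

struct slot { const char *def; }; /* NULL until first defined */

enum { STORED, KEPT_SAME, KEPT_DIFFERENT };

/* Keep the existing definition if there is one, so that current
 * holders never see it replaced. KEPT_DIFFERENT is the case that
 * deserves a warning. */
int slot_define(struct slot *s, const char *def)
{
	if (s->def == NULL) {
		s->def = def;
		return STORED;
	}
	return strcmp(s->def, def) == 0 ? KEPT_SAME : KEPT_DIFFERENT;
}
```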
|
|
Thanks to zeniko for these.
|
|
|
|
The way the forms to be reset are specified is also used in form
submission. This commit pulls out that selection method as a
separate function that returns the set of affected forms as a
pdf array object.
|
|
Following on from the blend.ai.pdf disappearing text fix that went in
the other day, zeniko has pointed out that we should be using the
device space on entry to pdf_show_pattern too. Fixed here.
Many thanks.
|