Age | Commit message | Author |
|
This feature is being implemented mostly for the purpose of permitting
the addition to a page of invisible signatures.
Also change pdf_create_annot to make freshly created annotations
printable by default.
|
|
|
|
Use a fixed number for Math.random().
Return a fixed date for Date.now() and Date.UTC().
|
|
|
|
|
|
Make the scoping clearer, since JavaScript doesn't have block scoping.
|
|
|
|
|
|
|
|
See https://code.google.com/p/sumatrapdf/issues/detail?id=2517 for a
document which is so broken that it fails to load via repair, but
loads successfully if object 0 is implicitly defined.
|
|
|
|
Patch from Thomas Fach-Pedersen to fix the operation of pdf_insert_page
when called with an empty page tree. Many thanks! As noted in the code
with a FIXME, this currently throws an error.
Also, treat a request to add a page "at" INT_MAX as a request to add
it at the end of the document.
Possibly this code should cope with a Root without a Pages entry, or
a Pages without a Kids too, but we can fix this in future if it ever
becomes a problem.
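
A minimal sketch of the "INT_MAX means append" rule, assuming a flat page
list. clamp_insert_index is a hypothetical helper and not part of MuPDF;
clamping negative positions to 0 is likewise an assumption made here.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper, not MuPDF's API: map a requested insertion
 * position onto a valid slot in a list of page_count pages. */
static int clamp_insert_index(int at, int page_count)
{
    assert(page_count >= 0);
    if (at < 0)
        return 0;               /* assumption: negative means "at the front" */
    if (at > page_count)
        return page_count;      /* covers INT_MAX: "append at the end" */
    return at;
}
```

With an empty page tree (page_count of 0) every request, including
INT_MAX, resolves to slot 0.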
|
|
This makes every pdf_run_XX operator function have the same function
type. This paves the way for future changes in this area.
|
|
Acrobat honours Tc and Tw operators found while parsing TJ arrays.
We update the code here to cope. To match Acrobat completely we should
possibly honour other operators too, but this will do for now.
This maintains the behaviour of
tests_private/pdf/sumatra/916_-_invalid_argument_to_TJ.pdf 916.pdf
and improves the behaviour in general.
|
|
Useful utility missing from our arsenal.
|
|
Reuses the same internals as pdf_fprintf_obj etc.
|
|
String.prototype.substr() is deprecated.
RegExp.prototype.compile() has never been part of the ECMA standard,
and has been deprecated in Mozilla's JavaScript since 1.5 (at least).
|
|
|
|
Arrays are intended for numeric arrays, since they have the magic
updating of their "length" property which regular objects lack.
|
|
The test file on this bug:
de53b4bd41191f02d01a3c39b4880fa8_asan_heap-oob_caba3c_9561_7427.pdf
includes a corrupt CMAP. When this is read into memory it produces
a CMAP where the table gets too large. This produces lots of warnings
from 'add_table', but the calls to add_table all assume that the
process completed fine, resulting in range entries being added
that point to nonexistent values.
The fix is to make add_table return a bool to indicate success or
failure, and to only add range entries if the add_table succeeds.
Thanks to Mateusz Jurczyk and Gynvael Coldwind of the Google Security
Team for providing the example files.
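
The shape of this fix can be sketched as follows. This is a toy model
rather than the real CMAP code: toy_cmap, toy_add_table and toy_map_range
are hypothetical names, the table size is deliberately tiny, and rolling
back a partial write is an extra choice made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TABLE_MAX 8 /* deliberately tiny, for illustration only */

struct toy_cmap {
    int table[TABLE_MAX];
    size_t table_len;
    size_t range_count;
};

/* Report failure to the caller instead of only warning when full. */
static bool toy_add_table(struct toy_cmap *cmap, int value)
{
    assert(cmap != NULL);
    if (cmap->table_len >= TABLE_MAX)
        return false;
    cmap->table[cmap->table_len++] = value;
    return true;
}

/* Add a range entry only if every table value was actually stored. */
static bool toy_map_range(struct toy_cmap *cmap, const int *values, size_t n)
{
    size_t start = cmap->table_len;
    for (size_t i = 0; i < n; i++) {
        if (!toy_add_table(cmap, values[i])) {
            cmap->table_len = start; /* roll back the partial write */
            return false;
        }
    }
    cmap->range_count++;
    return true;
}
```

A caller that ignores the return value would be back to the original bug,
so the range count is only touched on the success path.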
|
|
When we call pdf_begin_group, this can go away and do lots of
drawing. This can result in the gstate stack growing, which can
involve a realloc. Any gstate pointer we are holding must therefore
be recalculated after such a call.
The neatest way to do this is to get pdf_begin_group to return
the gstate pointer, thus making it hard to forget to do.
This solves:
e2a1dda5393f4cb8a446fd8edd9d94f9_asan_heap-uaf_b938cf_2075_2393.pdf
Thanks to Mateusz Jurczyk and Gynvael Coldwind of the Google Security
Team for providing the example files.
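
To illustrate why the pointer has to be refreshed, here is a simplified
sketch with a growable stack. All toy_* names are hypothetical, and
toy_begin_group only stands in for pdf_begin_group by pushing enough
states to force a realloc.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_gstate { int alpha; };

struct toy_stack {
    struct toy_gstate *items;
    size_t len, cap;
};

/* Push a copy of the top state. The array may move, so any pointer
 * taken before this call is stale afterwards. Returns the new top,
 * or NULL on allocation failure. */
static struct toy_gstate *toy_push(struct toy_stack *s)
{
    if (s->len == s->cap) {
        size_t cap = s->cap ? s->cap * 2 : 1;
        struct toy_gstate *p = realloc(s->items, cap * sizeof *p);
        if (p == NULL)
            return NULL;
        s->items = p;
        s->cap = cap;
    }
    if (s->len > 0)
        s->items[s->len] = s->items[s->len - 1];
    else
        s->items[s->len].alpha = 0;
    return &s->items[s->len++];
}

/* Stand-in for pdf_begin_group: grows the stack, then hands back the
 * current top so callers are nudged to replace their old pointer. */
static struct toy_gstate *toy_begin_group(struct toy_stack *s)
{
    assert(s != NULL);
    for (int i = 0; i < 16; i++)
        if (toy_push(s) == NULL)
            return NULL;
    return &s->items[s->len - 1];
}
```

The intended calling pattern is `gstate = toy_begin_group(&stack);`, which
overwrites the old pointer in the same statement that may invalidate it.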
|
|
When we find certain classes of flaw in the file while attempting to
read an object, we trigger an automatic repair of the file. This
leaves almost all objects unchanged; the sole exception is that of
the trailer object (and its sub objects) which can get dropped and
recreated.
To avoid leaving people holding handles to objects within the trailer
dict high and dry, we introduce a 'pre_repair_trailer' object to
each xref entry. On a repair, we copy the existing trailer object to
this. As we only ever repair once, this is safe.
The only known place where this is a problem is when setting up the
pdf_crypt for a document; we adapt the code here to allow for
potential problems.
The example file that shows this up is:
048d14d2f5f0ae31e9a2cde0be66f16a_asan_heap-uaf_86d4ed_3961_3661.pdf
Thanks to Mateusz Jurczyk and Gynvael Coldwind of the Google Security
Team for providing the fuzzing files.
|
|
If the /Version is a single character string (say "s") then the
current code for converting this in pdf_init_document reads off
the end of the string.
The simple fix is to use fz_atof instead.
Same fix for reading the PDF version normally.
This solves:
53b830f849d028fb2d528520716e157a_asan_heap-oob_478692_5259_4534.pdf
Thanks to Mateusz Jurczyk and Gynvael Coldwind of the Google Security
Team for providing the example files.
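
A rough sketch of the idea, using the standard strtod in place of fz_atof.
parse_pdf_version is a hypothetical helper, and both the "times ten"
encoding of the result and the range check are assumptions for the
example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: turn "1.7" into 17 without indexing into the
 * string by hand. strtod stops at the first character it cannot parse,
 * so a one-character string such as "s" simply yields 0. */
static int parse_pdf_version(const char *s)
{
    assert(s != NULL);
    double v = strtod(s, NULL);
    if (!(v >= 0 && v < 100))   /* also rejects NaN and infinities */
        return 0;
    return (int)(v * 10 + 0.5); /* round to the nearest tenth */
}
```

Because the conversion routine does its own scanning, no code here assumes
the string is at least three characters long.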
|
|
fz_new_image_from_pixmap expects that the pixmap's colorspace has two
references which is contrary to expectations. If it instead addrefs the
pixmap's colorspace, the only caller pdf_load_jpx can consistently
drop the colorspace after passing it to fz_load_jpx.
Also, if the contract is that whatever is passed into
fz_new_image_from_pixmap belongs to the new image, then the pixmap also
has to be dropped on error so that it isn't leaked.
|
|
When we call to execute a pattern, we clear out the pdf_csi (the
interpreter state). This involves clearing the stack and throwing
away the record of the object we have just parsed.
Unfortunately, when filling glyphs with a pattern, that object is
still in use. We therefore amend the pdf_run_contents_stream to
safely stash the object away and restore it afterwards.
This solves this problem, and protects us against any other similar
problems that might also arise.
This solves:
b8e2b57991896bf8120215cfbf7b54bb_asan_heap-uaf_86064f_2362_2587.pdf
Thanks to Mateusz Jurczyk and Gynvael Coldwind of the Google Security
Team for providing the example files.
|
|
At http://code.google.com/p/sumatrapdf/issues/detail?id=2477 , there's
a document which has an indexed colorspace whose lookup string contains
a trailing character. That character can be safely ignored without
rejecting everything depending on such a colorspace.
|
|
For SumatraPDF, the following changes are required:
* fz_load_system_font is called from pdf_load_builtin_font as well so
that Arial, Courier New, etc. can be loaded from the system instead
of their Nimbus replacements. In order to distinguish between calls
from pdf_load_builtin_font and pdf_load_substitute_font, an
is_substitute argument is added.
* fz_load_system_cjk_font is added and called from
pdf_load_substitute_cjk_font so that a better replacement font can
be loaded instead of DroidSansFallback.
* Both fz_load_system_font and fz_load_system_cjk_font return fz_font*
instead of fz_buffer* so that implementers aren't required to load
fonts into memory (SumatraPDF uses fz_new_font_from_file for system
fonts).
In addition to that, fz_load_system_font_func is renamed to
fz_load_system_font_funcs since it now accepts two functions, and the
PDF_ROS_* constants are renamed to FZ_ADOBE_* (collection names aren't
passed as const char* so that implementers know which collections to
expect). For convenience, fz_load_*_font also never throws since
currently all callers have further fallbacks available.
|
|
Avoid negative indirections. Don't make indirections to objects that
aren't going to be used.
Also improve pdf-write.c so that it doesn't call renumberobj on objs
that are going to be dropped.
|
|
If indexed spaces are empty (or truncated) we use garbage values when
they are read. Spot this and pad with 0s to at least be consistent.
Fixes:
013b2dcbd0207501e922910ac335eb59_asan_heap-oob_a59696_5952_500.pdf
5440f8bc8af12e5f7050e59b7ee008cd_asan_heap-oob_a59dd9_5952_500.pdf
fa8c712b03a7b02d6a12856ce042a44e_signal_sigsegv_a59b06_5847_493.pdf
Thanks to Mateusz Jurczyk and Gynvael Coldwind of the Google Security
Team for providing the fuzzing files.
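
The padding step might look like the sketch below. load_lookup is a
hypothetical helper, not the function touched by this commit; it copies
what the file supplies and zero-fills the remainder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy up to `need` bytes of lookup data and zero-fill whatever the
 * source did not supply. Returns the number of padded bytes. */
static size_t load_lookup(unsigned char *dst, size_t need,
                          const unsigned char *src, size_t have)
{
    assert(dst != NULL);
    size_t n = have < need ? have : need;
    if (n > 0)
        memcpy(dst, src, n);
    memset(dst + n, 0, need - n);
    return need - n;
}
```

A non-zero return value is a natural place to emit a warning about the
truncated table.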
|
|
While attempting to debug a valgrind issue with:
013b2dcbd0207501e922910ac335eb59_asan_heap-oob_a59696_5952_500.pdf
I found that mutool -difggg on it failed with a SEGV. This is due to
us parsing an array with a large invalid indirection in it (e.g.
[123456789 0 R]) and then the renumbering code assuming this is valid
and accessing off the end of an array.
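
The missing check amounts to a bounds test before the table lookup. A
minimal sketch, assuming a plain int array as the renumbering map;
renumber is a hypothetical name, and mapping bad references to object 0
is an assumption for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical renumbering lookup: return the new number for object
 * `num`, or 0 when `num` lies outside the map. */
static int renumber(const int *map, size_t map_len, int num)
{
    assert(map != NULL);
    if (num < 0 || (size_t)num >= map_len)
        return 0;
    return map[num];
}
```

With a four-entry map, a reference such as 123456789 0 R resolves to 0
instead of reading past the end of the array.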
|
|
The ifelse and if operators require special parsing when we convert
PS function streams to bytecode. If a malformed stream presents
if or ifelse without the appropriate preceding { ... } blocks,
we now throw an error.
This avoids us potentially calling ps_run recursively in an infinite
loop as happens with the test file in this bug.
5f091df77f6600d0927dc36777db2b93_signal_sigabrt_7ffff6d59425_6762_5545.pdf
Thanks to Mateusz Jurczyk and Gynvael Coldwind of the Google Security
Team for providing the fuzzing files.
|
|
In the existing code, if build_filter fails, chain will be freed. If
pdf_array_get fails however, it will leak.
Rectify this. No specific bug or example file, just observation arising
from discussions about the previous commit.
|
|
pdf_open_raw_renumbered_stream and pdf_open_image_stream both have the
same issue that 98a111c8e49916f8f5ac21d11f4627540f9ddd49 fixes.
|
|
When constructing a filter chain, we pass ownership of 'chain' inwards.
This means we need to be careful not to double close chain.
This fixes:
5df97f8539d31745f1c45cc9e1468825_asan_heap-oob_a59afe_1862_225.pdf
a736faf6f4a34b7ad8eff207ba52aa57_asan_heap-oob_a59dd9_5744_4860.pdf
Thanks to Mateusz Jurczyk and Gynvael Coldwind of the Google Security
Team for providing the fuzzing files.
|
|
Bad annotation appearance streams can cause font_recs to contain invalid
values. Avoid this partly by hardening the code against bad values,
and partly by setting sane defaults before parsing.
This can be seen in:
33bfbe117bfef7fafc3f927acf50a2e7_signal_sigsegv_81dd96_6257_5205.pdf
Thanks to Mateusz Jurczyk and Gynvael Coldwind of the Google Security
Team for providing the example files.
|
|
pdf_load_obj_stm may resize the xref if it finds further objects in the
stream. That, however, may invalidate any pdf_xref_entry pointer held
elsewhere, such as the one in pdf_cache_object. This can be seen e.g. with
7ac3ad9ddad98d10b947a43cf640062f_asan_heap-uaf_930b78_1007_1675.pdf
Thanks to Mateusz Jurczyk and Gynvael Coldwind of the Google Security
Team for providing the example files.
|
|
(Second part of Simon's patch - apologies for missing this the first time).
This correctly enables the sanitization of the key length needed for
90db34f64037e2a8a5c3b6a518ba4153_asan_heap-oob_9b117e_1197_1802.pdf
Thanks to Mateusz Jurczyk and Gynvael Coldwind of the Google Security
Team for providing the example files.
|
|
This correctly enables the sanitization of the key length needed for
90db34f64037e2a8a5c3b6a518ba4153_asan_heap-oob_9b117e_1197_1802.pdf
|
|
We define a document handler for each file type (two in the case of PDF:
one to handle files with the ability to 'run' them, and one without).
We then register these handlers with the context at startup, and then
call fz_open_document... as usual. This enables people to select the
document types they want at will (and even to extend the library with more
document types should they wish).
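
The registration pattern can be sketched roughly as below. Every toy_*
name is hypothetical and the real MuPDF handler structure differs; the
point is only that the context holds a list of handlers and asks each one
whether it recognises a file. Scoring by file suffix is an assumption for
the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_HANDLERS 8

struct toy_handler {
    const char *name;
    int (*recognize)(const char *filename); /* 0 = no, higher = better */
};

struct toy_context {
    const struct toy_handler *handlers[MAX_HANDLERS];
    size_t count;
};

/* Register a handler with the context; returns 0 if the list is full. */
static int toy_register(struct toy_context *ctx, const struct toy_handler *h)
{
    assert(ctx != NULL && h != NULL);
    if (ctx->count >= MAX_HANDLERS)
        return 0;
    ctx->handlers[ctx->count++] = h;
    return 1;
}

/* Pick the registered handler with the highest non-zero score. */
static const struct toy_handler *toy_find(const struct toy_context *ctx,
                                          const char *filename)
{
    const struct toy_handler *best = NULL;
    int best_score = 0;
    for (size_t i = 0; i < ctx->count; i++) {
        int score = ctx->handlers[i]->recognize(filename);
        if (score > best_score) {
            best_score = score;
            best = ctx->handlers[i];
        }
    }
    return best;
}

static int has_suffix(const char *s, const char *suffix)
{
    size_t n = strlen(s), m = strlen(suffix);
    return n >= m && strcmp(s + n - m, suffix) == 0;
}

static int recognize_pdf(const char *f) { return has_suffix(f, ".pdf") ? 100 : 0; }
static int recognize_xps(const char *f) { return has_suffix(f, ".xps") ? 100 : 0; }

static const struct toy_handler toy_pdf_handler = { "pdf", recognize_pdf };
static const struct toy_handler toy_xps_handler = { "xps", recognize_xps };
```

An application that only wants PDF support would register toy_pdf_handler
alone; files of other types then find no handler and can be rejected.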
|
|
Certain optimized documents use a rather large common symbol dictionary
for all JBIG2 images. Caching these JBIG2Globals speeds up loading and
rendering of such documents.
|
|
At https://code.google.com/p/sumatrapdf/issues/detail?id=2460 , there's
a file with missing /Type keys in the page tree nodes. In that case,
leaf nodes and intermediary nodes have to be distinguished in a
different way.
|
|
These warnings are caused by casting function pointers to void*
instead of proper function types.
|
|
Some warnings we'd like to enable for MuPDF and still be able to
compile it with warnings as errors using MSVC (2008 to 2013):
* C4115: 'timeval' : named type definition in parentheses
* C4204: nonstandard extension used : non-constant aggregate initializer
* C4295: 'hex' : array is too small to include a terminating null character
* C4389: '==' : signed/unsigned mismatch
* C4702: unreachable code
* C4706: assignment within conditional expression
Also, globally disable C4701 which is frequently caused by MSVC not
being able to correctly figure out fz_try/fz_catch code flow.
And don't define isnan for VS2013 and later where that's no longer needed.
|
|
The SVG device needs rebinding as it holds a file. The PDF device needs
to rebind the underlying pdf document.
All documents need to rebind their underlying streams.
|
|
When we meet a broken PDF file, we attempt to repair it. We do this by
reading tokens from the file and attempting to interpret them as a
normal PDF stream.
Unfortunately, if the file is corrupt enough so that we start to read
from the middle of a stream, and we happen to hit an '(' character,
we can go into string reading mode. We can then end up skipping over
vast swathes of file that we could otherwise repair.
We fix this here by using a new version of the pdf_lex function that
refuses to ever return a string. This means we may take more time
over skipping things than we did before, but are less likely to
skip stuff.
We also tweak other parts of the pdf repair logic here. If we hit a
badly formed piece of data, clear the num/gen we have stored so that
the next plausible piece we get does not get assigned to a random
object number.
|
|
Remove code that's not used any more as a result of the previous
fix, plus some code that was unused anyway.
|
|
The 0 null object is leaked if a document refers to 0 0 obj before
requiring a delayed repair (seen e.g. with 3324.pdf.asan.3.2585).
|
|
Thanks to Simon for spotting the original problem. This is a slight
tweak on the patch he supplied.
|
|
Replace an explicit i = i with a comment in a for loop where i is
already at the correct starting value.
|
|
Use round caps and joins so as to better match the result of drawing, and also
so that single dots display. Thanks to Michael Cadilhac for the suggestion.
|