Age | Commit message (Collapse) | Author |
|
The generation number is only needed for decryption, and is assumed
to be zero or irrelevant for all other uses.
Store the original object number and generation in the xref slot, so
that we can decrypt them even when the objects have been renumbered,
without needing to pass the original object number around through
the stream loading APIs.
|
|
|
|
Extraneous explicit type casts can mask errors, especially if a
function prototype or return value changes in the future.
|
|
Use fz_output in debug printing functions.
Use fz_output in pdfshow.
Use fz_output in fz_trace_device instead of stdout.
Use fz_output in pdf-write.c.
Rename fz_new_output_to_filename to fz_new_output_with_path.
Add seek and tell to fz_output.
Remove unused functions like fz_fprintf.
Fix typo in pdf_print_obj.
|
|
|
|
Currently, every PDF name is allocated in a pdf_obj structure, and
comparisons are done using strcmp. Given that we can predict most
of the PDF names we'll use in a given file, this seems wasteful.
The pdf_obj type is opaque outside the pdf-object.c file, so we can
abuse it slightly without anyone outside knowing.
We collect a sorted list of names used in PDF (resources/pdf/names.txt),
and we add a utility (namedump) that preprocesses this into 2 header
files.
The first (include/mupdf/pdf/pdf-names-table.h, included as part of
include/mupdf/pdf/object.h), defines a set of "PDF_NAME_xxxx"
entries. These are pdf_obj *'s that callers can use to mean "A PDF
object that means literal name 'xxxx'"
The second (source/pdf/pdf-name-impl.h) is a C array of names.
We therefore update the code so that rather than passing "xxxx" to
functions (such as pdf_dict_gets(...)) we now pass PDF_NAME_xxxx (to
pdf_dict_get(...)). This is a fairly natural (if widespread) change.
The pdf_dict_getp (and sibling) functions that take a path (e.g.
"foo/bar/baz") are therefore supplemented with equivalents that
take a list (pdf_dict_getl(... , PDF_NAME_foo, PDF_NAME_bar,
PDF_NAME_baz, NULL)).
The actual implementation of this relies on the fact that small
pointer values are never valid values. For a given pdf_obj *p,
if NULL < (intptr_t)p < PDF_NAME__LIMIT then p is a literal
entry in the name table.
This enables us to do fast pointer compares and to skip expensive
strcmps.
Also, bring "null", "true" and "false" into the same style as PDF names.
Rather than using full pdf_obj structures for null/true/false, use
special pointer values just above the PDF_NAME_ table. This saves
memory and makes comparisons easier.
|
|
Purge several embedded contexts:
Remove embedded context in fz_output.
Remove embedded context in fz_stream.
Remove embedded context in fz_device.
Remove fz_rebind_stream (since it is no longer necessary).
Remove embedded context in svg_device.
Remove embedded context in XML parser.
Add ctx argument to fz_document functions.
Remove embedded context in fz_document.
Remove embedded context in pdf_document.
Remove embedded context in pdf_obj.
Make fz_page independent of fz_document in the interface.
We shouldn't need to pass the document to all functions handling a page.
If a page is tied to the source document, it's redundant; otherwise it's
just pointless.
Fix reference counting oddity in fz_new_image_from_pixmap.
|
|
Rename fz_close to fz_drop_stream.
Rename fz_close_archive to fz_drop_archive.
Rename fz_close_output to fz_drop_output.
Rename fz_free_* to fz_drop_*.
Rename pdf_free_* to pdf_drop_*.
Rename xps_free_* to xps_drop_*.
|
|
|
|
In load_sample_func, the stream is not closed and thus leaked if one of
the fz_read_byte or fz_read_bits calls throws (which might happen e.g.
on a Deflate data error).
In pdf_load_compressed_inline_image, the allocated buffer is not freed
if one of the stream initializers or the tile creation throws
(fz_open_leecher does not take ownership of the stream).
|
|
|
|
The ifelse and if operators require special parsing where we convert
ps function streams to bytecode. If a malformed stream presents
if or ifelse without being preceded by the appropriate { ...} blocks
then throw an error.
This avoids us potentially calling ps_run recursively in an infinite
loop as happens with the test file in this bug.
5f091df77f6600d0927dc36777db2b93_signal_sigabrt_7ffff6d59425_6762_5545.pdf
Thanks to Mateusz Jurczyk and Gynvael Coldwind of the Google Security
Team for providing the fuzzing files.
|
|
Some warnings we'd like to enable for MuPDF and still be able to
compile it with warnings as errors using MSVC (2008 to 2013):
* C4115: 'timeval' : named type definition in parentheses
* C4204: nonstandard extension used : non-constant aggregate initializer
* C4295: 'hex' : array is too small to include a terminating null character
* C4389: '==' : signed/unsigned mismatch
* C4702: unreachable code
* C4706: assignment within conditional expression
Also, globally disable C4701 which is frequently caused by MSVC not
being able to correctly figure out fz_try/fz_catch code flow.
And don't define isnan for VS2013 and later where that's no longer needed.
|
|
We were miscalculating the offsets into a sampled functions table,
causing us to overrun the end. Fixed here.
|
|
In case of an unknown function type, we free 'func'. Then we later
read func->type out of the block, and drop the block.
Simple solution is not to free the block initially and to let the
drop of the block do it for us.
|
|
Correct the naming scheme for pdf_obj_xxx functions.
|
|
For historical reasons lots of the code uses "xref" when talking about
a pdf document. Now pdf_xref is a separate type this has become
confusing, so replace 'xref' with 'doc' for clarity.
|
|
|