summaryrefslogtreecommitdiff
path: root/pdf/pdf_cmap_parse.c
AgeCommit message (Collapse)Author
2013-06-20Rearrange source files.Tor Andersson
2013-06-19Exception handling changesRobin Watts
In preparation for work on progressive loading, update the exception handling scheme slightly. Until now, exceptions (as thrown with fz_throw, and caught with fz_try/fz_catch) have merely had an informative string. They have never had anything that can be compared to see if an error is of a particular type. We now introduce error codes; when we fz_throw, we now always give an error code, and can optionally (using fz_throw_message) give both an error code and an informative string. When we fz_rethrow from within a fz_catch, both the error code and the error message is maintained. Using fz_rethrow_message we can 'improve' the error message, but the code is maintained. The error message can be read out using fz_caught_message() and the error code can be read as fz_caught(). Currently we only define a 'generic' error. This will expand in future versions to include other error types that may be tested for.
2013-06-18Merge common and internal headers into one.Tor Andersson
2013-06-18Move header files into separate include directory.Tor Andersson
2013-05-29Killed pdf_cmap_token enum.Tor Andersson
2013-01-04Make token enum a type to ease debuggingSebastian Rasmussen
2013-01-02Bug 693503: Fix overlong (seemingly infinite) loop of warnings.Robin Watts
When reading a CMAP with values out of range, we can go into a very long loop emitting the same pair of warnings. Spot the error case earlier and this give a nicer report. Problem found in a test file, 3192.pdf.SIGSEGV.b0.2438 supplied by Mateusz "j00ru" Jurczyk and Gynvael Coldwind of the Google Security Team using Address Sanitizer. Many thanks!
2012-11-30Bug 693290: Various fixes found from fuzzing.Robin Watts
Thanks to zeniko for finding various problems and submitting a patch that fixes them. This commit covers the simpler issues from his patch; other commits will follow shortly. * Out of range LZW codes. * Buffer overflows and error handling in image_jpeg.c * Buffer overflows in tiff handling * buffer overflows in cmap parsing. * Potential double free in font handling. * Buffer overflow in pdf_form.c * use of uninitialised value in error case in pdf_image.c * NULL pointer dereference in xps_outline.c
2012-08-06Remove old error mesages turned into comments when adding exceptionsSebastian Rasmussen
2012-06-22Rework pdf_lexbuf to allow for dynamic parsing buffers.Robin Watts
Currently pdf_lexbufs use a static scratch buffer for parsing. In the main case this is 64K in size, but in other cases it can be just 256 bytes; this causes problems when parsing long strings. Even the 64K limit is an implementation limit of Acrobat, not an architectural limit of PDF. Change here to allow dynamic buffers. This means a slightly more complex setup and destruction for each buffer, but more importantly requires correct cleanup on errors. To avoid having to insert lots more try/catch clauses this commit includes various changes to the code so we reuse pdf_lexbufs where possible. This keeps the speed up.
2012-03-13Rename some functions and accessors to be more consistent.Tor Andersson
Debug printing functions: debug -> print. Accessors: get noun attribute -> noun attribute. Find -> lookup when the returned value is not reference counted. pixmap_with_rect -> pixmap_with_bbox. We are reserving the word "find" to mean lookups that give ownership of objects to the caller. Lookup is used in other places where the ownership is not transferred, or simple values are returned. The rename is done by the sed script in scripts/rename3.sed
2012-03-06Split fitz.h/mupdf.h into internal/external headers.Robin Watts
Attempt to separate public API from internal functions.
2012-02-29Fix trailing whitespace and mixed tabs/spaces in indentation.Tor Andersson
2012-02-25Revamp pdf lexing codeRobin Watts
A huge amount (20%+ on some files) of our runtime is spent in fz_atof. A survey of results on the net suggests we will get much better speed by writing our own atof. Part of the job of doing this involves parsing the string to identify the component parts of the number - ludicrously, we are already doing this as part of the lexing process, so it would make sense to do the atoi/atof as part of this process. In order to do this, we need somewhere to store the lexed results; rather than add a float * and an int * to every single pdf_lex call, we generalise the calls to pass a pdf_lexbuf * pointer instead of separate buffer/max/string length pointers. This should help us overall.
2012-02-06Pass context to cmap and font descriptor functions.Tor Andersson
2011-12-28Silence warning in pdf_cmap_parse.cRobin Watts
2011-10-04Move to exception handling rather than error passing throughout.Robin Watts
This frees us from passing errors back everywhere, and hence enables us to pass results back as return values. Rather than having to explicitly check for errors everywhere and bubble them, we now allow exception handling to do the work for us; the downside to this is that we no longer emit as much debugging information as we did before (though this could be put back in). For now, the debugging information we have lost has been retained in comments with 'RJW:' at the start. This code needs fuller testing, but is being committed as a work in progress.
2011-09-21Add warning context.Tor Andersson
2011-09-15Add context to mupdf.Robin Watts
Huge pervasive change to lots of files, adding a context for exception handling and allocation. In time we'll move more statics into there. Also fix some for(i = 0; i < function(...); i++) calls.
2011-09-14Initial import of exception handling codeRobin Watts
Import exception handling code from WSS, modified to fit into the fitz world. With this code we have 'real' fz_try/fz_catch/fz_rethrow functions, handling a fz_except type. We therefore rename the existing fz_throw/ fz_catch/fz_rethrow to be fz_error_make/fz_error_handle/fz_error_note. We don't actually use fz_try/fz_catch/fz_rethrow yet...
2011-04-04Le Roi est mort, vive le Roi!Tor Andersson
The run-together words are dead! Long live the underscores! The postscript inspired naming convention of using all run-together words has served us well, but it is now time for more readable code. In this commit I have also added the sed script, rename.sed, that I used to convert the source. Use it on your patches and application code.
2011-04-04pdf: Rename mupdf directory.Tor Andersson