mupdf - MuPDF PDF reader and library

diff options

author	Robin Watts <robin.watts@artifex.com>	2013-12-30 17:59:13 +0000
committer	Robin Watts <robin.watts@artifex.com>	2014-01-02 20:04:38 +0000
commit	bd7393d1be2e3905a6c3f1bb722198217c6195dc (patch)
tree	558533819a1412513e8a7c97c9e08c2a2870dded /source/pdf/js
parent	cf9a5e5e7af55d15a83f542041fc63c73ba57425 (diff)
download	mupdf-bd7393d1be2e3905a6c3f1bb722198217c6195dc.tar.xz

Improve PDF repair logic.

When we meet a broken PDF file, we attempt to repair it. We do this by reading tokens from the file and attempting to interpret them as a normal PDF stream. Unfortunately, if the file is corrupt enough so that we start to read from the middle of a stream, and we happen to hit an '(' character, we can go into string reading mode. We can then end up skipping over vast swathes of file that we could otherwise repair. We fix this here by using a new version of the pdf_lex function that refuses to ever return a string. This means we may take more time over skipping things than we did before, but are less likely to skip stuff. We also tweak other parts of the pdf repair logic here. If we hit a badly formed piece of data, clear the num/gen we have stored so that the next plausible piece we get does not get assigned to a random object number.

Diffstat (limited to 'source/pdf/js')

0 files changed, 0 insertions, 0 deletions


context:
space:
mode: