path: root/source/pdf/pdf-lex.c
Age | Commit message | Author
2014-01-02 | Improve PDF repair logic. | Robin Watts
When we meet a broken PDF file, we attempt to repair it by reading tokens from the file and interpreting them as a normal PDF stream. Unfortunately, if the file is corrupt enough that we start reading from the middle of a stream and happen to hit a '(' character, we go into string reading mode and can end up skipping over vast swathes of the file that we could otherwise repair. We fix this by using a new version of the pdf_lex function that refuses to ever return a string. This means we may take more time skipping over things than we did before, but we are less likely to skip repairable data.

We also tweak other parts of the PDF repair logic. If we hit a badly formed piece of data, we clear the stored num/gen so that the next plausible piece we read is not assigned to an unrelated object number.
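The effect of a lexer that refuses to return strings can be illustrated with a minimal, self-contained sketch. This is not the MuPDF code: the token kinds, `struct sketch_lexer`, `sketch_lex`, and `count_words` are all hypothetical names invented for this example, and the tokenizer is far simpler than a real PDF lexer. It only shows why a stray '(' hides everything after it in string mode, and why treating '(' as junk keeps later tokens visible to a repair pass.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical token kinds for this sketch; not MuPDF's real token enum. */
enum sketch_tok { SK_EOF, SK_STRING, SK_JUNK, SK_WORD };

/* Hypothetical in-memory lexer state. */
struct sketch_lexer { const char *p; const char *end; };

static int sk_is_space(int c)
{
    return c == ' ' || c == '\n' || c == '\r' || c == '\t' || c == '\f';
}

/* Return the next token. If no_string is nonzero, never enter string mode. */
enum sketch_tok sketch_lex(struct sketch_lexer *lx, int no_string)
{
    assert(lx != NULL && lx->p != NULL && lx->end != NULL);

    while (lx->p < lx->end && sk_is_space((unsigned char)*lx->p))
        lx->p++;
    if (lx->p >= lx->end)
        return SK_EOF;

    if (*lx->p == '(') {
        lx->p++;
        if (no_string)
            return SK_JUNK; /* treat '(' as one junk byte and carry on */
        /* String mode: swallow everything up to the matching ')' or EOF. */
        int depth = 1;
        while (lx->p < lx->end && depth > 0) {
            char d = *lx->p++;
            if (d == '\\' && lx->p < lx->end)
                lx->p++; /* skip the escaped character */
            else if (d == '(')
                depth++;
            else if (d == ')')
                depth--;
        }
        return SK_STRING;
    }
    if (*lx->p == ')') {
        lx->p++;
        return SK_JUNK;
    }

    /* A word is a run of characters up to whitespace or a parenthesis. */
    while (lx->p < lx->end && !sk_is_space((unsigned char)*lx->p)
           && *lx->p != '(' && *lx->p != ')')
        lx->p++;
    return SK_WORD;
}

/* Count the word tokens a scan would see in the given text. */
int count_words(const char *text, int no_string)
{
    struct sketch_lexer lx = { text, text + strlen(text) };
    enum sketch_tok t;
    int n = 0;

    while ((t = sketch_lex(&lx, no_string)) != SK_EOF)
        if (t == SK_WORD)
            n++;
    return n;
}
```

With the input `"x( 1 0 obj endobj"`, the string-aware scan sees only the word `x`, because the unmatched '(' consumes the rest of the buffer. The no-string scan sees `x`, `1`, `0`, `obj`, and `endobj`, at the cost of examining every byte.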
2013-09-24 | Bug 694557: Fix infinite loop in pdf_lex. | Robin Watts
When we read a '>' during lexing, we try to read another char to see if it is another '>'. If it is not, we warn that it is unexpected, put the char back, and retry. Putting the char back fails if the '>' was the last char in the stream, because we will then have read EOF. We then loop and reread the '>', resulting in an infinite loop. The simple fix is to check for EOF.
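The control flow described above can be sketched as follows. This is an illustration under stated assumptions, not the actual pdf_lex source: `struct mem_stream`, `read_byte`, `unread_byte`, `sketch_lex_gt`, and the `SK_*` token values are hypothetical, and `unread_byte` is assumed to step the read pointer back unconditionally, which is what makes an unread after EOF land on the '>' again.

```c
#include <assert.h>
#include <stdio.h>  /* EOF */
#include <string.h>

/* Hypothetical in-memory stream: rp is the read pointer, wp the end of data. */
struct mem_stream { const unsigned char *rp; const unsigned char *wp; };

static int read_byte(struct mem_stream *s)
{
    return s->rp < s->wp ? *s->rp++ : EOF;
}

/* Assumed behaviour: steps back one byte even if the last read hit EOF. */
static void unread_byte(struct mem_stream *s)
{
    s->rp--;
}

/* Hypothetical token values for this sketch. */
enum { SK_TOK_EOF, SK_TOK_CLOSE_DICT, SK_TOK_OTHER };

int sketch_lex_gt(struct mem_stream *s)
{
    assert(s != NULL);
    for (;;) {
        int c = read_byte(s);
        switch (c) {
        case EOF:
            return SK_TOK_EOF;
        case '>':
            c = read_byte(s);
            if (c == '>')
                return SK_TOK_CLOSE_DICT;
            /* A lone '>' is unexpected; a real lexer would warn here. */
            if (c == EOF)
                return SK_TOK_EOF; /* the fix: nothing was read, so nothing to put back */
            unread_byte(s);        /* put the lookahead back and retry */
            continue;
        default:
            return SK_TOK_OTHER;
        }
    }
}

/* Convenience wrapper: lex the first token of a C string. */
int lex_first(const char *text)
{
    struct mem_stream s;
    s.rp = (const unsigned char *)text;
    s.wp = s.rp + strlen(text);
    return sketch_lex_gt(&s);
}
```

Without the `c == EOF` check, lexing the one-byte input `">"` would call `unread_byte` after reading EOF, move the read pointer back onto the '>', and repeat forever. With the check, the same input returns the EOF token.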
2013-06-20 | Rearrange source files. | Tor Andersson