path: root/source/pdf
Age | Commit message | Author
2018-01-22 | Bug 698889: Handle unterminated PDF arrays gracefully. | Sebastian Rasmussen
Thanks to oss-fuzz for reporting this.
2018-01-19 | Improve signature check failure reporting | Paul Gardiner
Because of the structure of openssl's signature checking, we temporarily permit certain errors in the certificate trust stage, so that openssl will continue on to the digest check. That way we can detect special error cases such as the only failure being that a self-signed certificate is present. This commit handles one of the cases we'd missed.
2018-01-19 | Fix potential infinite loop when verifying signatures | Paul Gardiner
2018-01-19 | Perform signature verification via fz_stream | Paul Gardiner
Previously, signature verification worked only for file-based documents and the file path had to be passed into the verification function.
2018-01-19 | Perform document signing via fz_stream and fz_output | Paul Gardiner
This change achieves two goals. It allows signing to be performed even when the document is obtained other than from a disk file. It also restores signing of file-based documents to a working state; that feature was broken because complete_signatures was called after certain tables, available via the output options object, had been destroyed.
2018-01-19 | Fix reading of pfx files | Paul Gardiner
We'd neglected to specify binary mode when opening the file. Possibly this affected only running under Windows.
2018-01-19 | Further changes to signature support related to changes in openssl | Paul Gardiner
Reinstate the separate consideration of errors relating to the certificate trust checking phase. Remove the key-usage records from the certificate before signature verification. This is done so that openssl will recognise self-signed certificates. openssl doesn't consider them as such when the key usage doesn't include certificate signing.
2018-01-19 | Update use of openssl for signature support, from 1.0.1e to 1.1.0g | Sebastian Rasmussen
The openssl-version check within Makerules has been updated to ensure we include signature support only when 1.0.1u or later is available. 1.0.1u is the version at which the API changes which necessitated this commit were introduced.
2018-01-10 | Add colorspace type enum and use it instead of hardcoded checks on N. | Tor Andersson
2018-01-05 | Enable saving of encrypted PDF files. | Robin Watts
We need both RC4 and AES encryption. RC4 is a straight reversible stream, and our AES library knows how to encrypt as well as decrypt, so it's "just" a matter of calling them correctly. We therefore expose a generic "encrypt this data" routine (and a matching "how long will the data be once encrypted" routine) within pdf-crypt.c. We then extend our PDF object output routines to call these. This is enough to get encrypted data preserved over calls to mutool clean. Unfortunately the created files aren't readable, due to 2 further problems, also fixed here. Firstly, mutool clean does not preserve the Encrypt entry in the trailer. This is a simple fix. Secondly, we are required NOT to encrypt the Encrypt entry. This requires us to spot the crypt entry and to special case it.
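As a rough illustration of the two points above (RC4 being its own inverse, and the need for a "length once encrypted" routine), here is a minimal sketch. All names (sketch_rc4, sketch_encrypted_len, the enum) are hypothetical and are not the pdf-crypt.c API. The AES length assumes the layout described in the PDF specification: a 16-byte initialisation vector followed by CBC data padded to a multiple of 16 bytes, where padding always adds at least one byte.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration only; not the pdf-crypt.c API. */
enum sketch_crypt_method { SKETCH_CRYPT_NONE, SKETCH_CRYPT_RC4, SKETCH_CRYPT_AES };

/* RC4 in place. Applying it twice with the same key restores the input,
 * which is why the same routine serves for encryption and decryption. */
static void sketch_rc4(const unsigned char *key, size_t keylen,
	unsigned char *data, size_t len)
{
	unsigned char S[256], t;
	unsigned int i, j;
	size_t n;

	assert(keylen > 0);

	/* Key scheduling. */
	for (i = 0; i < 256; i++)
		S[i] = (unsigned char)i;
	for (i = 0, j = 0; i < 256; i++)
	{
		j = (j + S[i] + key[i % keylen]) & 255;
		t = S[i]; S[i] = S[j]; S[j] = t;
	}

	/* Keystream generation, XORed into the data. */
	i = j = 0;
	for (n = 0; n < len; n++)
	{
		i = (i + 1) & 255;
		j = (j + S[i]) & 255;
		t = S[i]; S[i] = S[j]; S[j] = t;
		data[n] ^= S[(S[i] + S[j]) & 255];
	}
}

/* Size in bytes of `len` plaintext bytes once encrypted. */
static size_t sketch_encrypted_len(enum sketch_crypt_method method, size_t len)
{
	if (method == SKETCH_CRYPT_AES)
		/* 16-byte IV, then data padded up to the next multiple of 16
		 * (a full extra block when len is already a multiple of 16). */
		return 16 + (len / 16 + 1) * 16;
	/* RC4 is a stream cipher, so the length is unchanged. */
	return len;
}
```

Knowing the encrypted length up front matters because a stream's /Length entry is written before the stream data itself.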
2018-01-05 | Fix "being able to search for redacted text" bug. | Robin Watts
A customer reports that even after text has been redacted, we can still search for the redacted text. The example file supplied had many instances of the word 'words', and 4 instances of 'apple'. The 'apple' instances were redacted, and the document saved out. 2 such instances were on the first page; when we searched for 'apple', Acrobat would find the word after the first removed instance of apple, then find the word 2 after the second removed instance of apple. After much head scratching and cutting down of the file, it appears that the information genuinely isn't in the file. Acrobat is somehow remembering it. It appears to be doing this using the 'ID' entries in the trailer dict. My suspicion is that Acrobat has cached the text extraction from the original document, and is using this on all files that match the IDs. Change the IDs (or remove them) and the problem goes away. The spec says that the ID should be 2 bytestrings in an array. The first is supposed to stay the same in all versions of a file (i.e. it shows the *original* version of the file, and it is the one that is used by encrypt). The second bytestring is supposed to change more often, so here we simply return a new random string on each writing.
2017-12-20 | Bug 698826: Plug leak of font names when parsing appearance string. | Sebastian Rasmussen
Previously if a variable text annotation with a default appearance string had multiple 'Tf' operators all but the last font name would leak.
2017-12-19 | Bug 698825: Do not drop borrowed colorspaces. | Sebastian Rasmussen
Previously the borrowed colorspace was dropped when updating annotation appearances, leading to use after free warnings from valgrind/ASAN.
2017-12-13 | Initialize generation numbers when saving a new pdf. | Tor Andersson
2017-12-13 | Validate that /Size in trailer is in range. | Sebastian Rasmussen
2017-12-13 | PDF object numbers need not be int64_t, int is sufficient. | Sebastian Rasmussen
This is true because they are now limited below PDF_MAX_OBJECT_NUMBER.
2017-12-13 | Define constant INT64_MAX where int64_t is declared. | Sebastian Rasmussen
2017-12-13 | Move xref section recursion check, simplifying code. | Sebastian Rasmussen
2017-12-13 | Rephrase messages, clarify variable names and remove unused code. | Sebastian Rasmussen
2017-12-13 | Never write negative xref offsets when saving to PDF. | Sebastian Rasmussen
2017-12-13 | Bugs 698804/698810/698811: Keep PDF object numbers below limit. | Sebastian Rasmussen
This ensures that:
* xref tables with object pointers do not grow out of bounds.
* other readers, e.g. Adobe Acrobat, can parse PDFs written by mupdf.
2017-12-13 | Fix 698785: Catch malformed numbers in PDF lexical scanner. | Tor Andersson
Return error tokens when parsing numbers with trailing garbage rather than ignoring the extra characters. Also handle error tokens more gracefully in array and dictionary parsing. Treat error tokens as the 'null' keyword and continue parsing.
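The behaviour described above can be sketched roughly as follows. This is an illustrative stand-alone scanner, not the MuPDF lexer: the names sketch_lex_number, sketch_is_boundary and the token enum are hypothetical, and the input is assumed to be a NUL-terminated string. A number is accepted only if it is followed by PDF whitespace, a PDF delimiter, or the end of input; anything else yields an error token.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token kinds for illustration; not MuPDF's pdf_token values. */
enum sketch_tok { SKETCH_TOK_INT, SKETCH_TOK_REAL, SKETCH_TOK_ERROR };

/* True for end of input, PDF whitespace and PDF delimiter characters. */
static int sketch_is_boundary(int c)
{
	switch (c)
	{
	case '\0': case '\t': case '\n': case '\f': case '\r': case ' ':
	case '(': case ')': case '<': case '>': case '[': case ']':
	case '{': case '}': case '/': case '%':
		return 1;
	}
	return 0;
}

/* Scan a number at the start of s. Trailing garbage such as "12abc"
 * produces SKETCH_TOK_ERROR instead of silently yielding 12. */
static enum sketch_tok sketch_lex_number(const char *s, double *value)
{
	double sign = 1, v = 0, scale = 0.1;
	int digits = 0, is_real = 0;

	assert(s != NULL && value != NULL);

	if (*s == '+' || *s == '-')
		sign = (*s++ == '-') ? -1 : 1;
	while (*s >= '0' && *s <= '9')
	{
		v = v * 10 + (*s++ - '0');
		digits++;
	}
	if (*s == '.')
	{
		is_real = 1;
		s++;
		while (*s >= '0' && *s <= '9')
		{
			v += (*s++ - '0') * scale;
			scale /= 10;
			digits++;
		}
	}

	/* Reject lone signs or dots, and numbers followed by garbage. */
	if (digits == 0 || !sketch_is_boundary(*s))
		return SKETCH_TOK_ERROR;

	*value = sign * v;
	return is_real ? SKETCH_TOK_REAL : SKETCH_TOK_INT;
}
```

An array or dictionary parser built on such a scanner could then substitute a null object whenever it sees the error token and carry on with the next token, as the commit message describes.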
2017-12-13 | Add 'clean' option to pdfclean to clean (but not sanitize) content streams. | Tor Andersson
This goes well with the 'mutool clean -d' decompression option to debug content streams, without doing the sanitize optimization pass.
2017-12-08 | Fix SEGV in redaction code due to TJ with no chars. | Robin Watts
If the first TJ we meet in a file has an adjustment, but no chars, then we end up calling 'adjustment' without ever having set fontdesc. This causes a crash. Fix it here.
2017-11-23 | Workaround freetype synthesizing unicode cmaps. | Tor Andersson
2017-11-23 | Make time stamps 64-bit integers. | Tor Andersson
Future proof the API for the Year 2038 problem.
2017-11-22 | Remove unused annotation function. | Tor Andersson
2017-11-22 | jni/js: Add support for annotation modification dates. | Sebastian Rasmussen
2017-11-22 | jni/js: Use correct text encoding in annotation author and contents. | Fred Ross-Perry
Also clarify that a copy of author/contents is returned, and that the caller must free them.
2017-11-22 | Add pdf_new_text_string utility function. | Tor Andersson
Create a PDF 'text string' type string from a UTF-8 input string. If the input is plain ASCII, keep it as is, otherwise re-encode it as UTF-16BE.
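The encoding rule described above can be sketched as follows. This is not the body of pdf_new_text_string; sketch_text_string and sketch_utf8_decode are hypothetical helpers that write into a caller-supplied buffer. Two assumptions are made: the input is valid, NUL-terminated UTF-8, and the UTF-16BE form starts with the FE FF byte order mark that the PDF specification uses to distinguish Unicode text strings.

```c
#include <assert.h>
#include <stddef.h>

/* Decode one code point from valid UTF-8; returns the bytes consumed.
 * Assumption: no validation is done, the input must be well formed. */
static size_t sketch_utf8_decode(const unsigned char *s, unsigned long *cp)
{
	if (s[0] < 0x80)
	{
		*cp = s[0];
		return 1;
	}
	if ((s[0] & 0xE0) == 0xC0)
	{
		*cp = ((s[0] & 0x1FUL) << 6) | (s[1] & 0x3FUL);
		return 2;
	}
	if ((s[0] & 0xF0) == 0xE0)
	{
		*cp = ((s[0] & 0x0FUL) << 12) | ((s[1] & 0x3FUL) << 6) | (s[2] & 0x3FUL);
		return 3;
	}
	*cp = ((s[0] & 0x07UL) << 18) | ((s[1] & 0x3FUL) << 12) |
		((s[2] & 0x3FUL) << 6) | (s[3] & 0x3FUL);
	return 4;
}

/* Write the text string bytes for `in` into `out` (capacity `cap`).
 * Returns the byte count, or 0 if `out` is too small (or `in` is empty). */
static size_t sketch_text_string(const char *in, unsigned char *out, size_t cap)
{
	const unsigned char *s = (const unsigned char *)in;
	size_t i, n = 0;
	int ascii = 1;

	assert(in != NULL && out != NULL);

	for (i = 0; s[i]; i++)
		if (s[i] >= 0x80)
			ascii = 0;

	/* Plain ASCII is kept as is. */
	if (ascii)
	{
		if (i > cap)
			return 0;
		for (n = 0; n < i; n++)
			out[n] = s[n];
		return n;
	}

	/* Otherwise: byte order mark, then big-endian UTF-16 code units. */
	if (cap < 2)
		return 0;
	out[n++] = 0xFE;
	out[n++] = 0xFF;
	while (*s)
	{
		unsigned long cp;
		s += sketch_utf8_decode(s, &cp);
		if (cp >= 0x10000)
		{
			/* Outside the BMP: emit a surrogate pair. */
			unsigned long hi, lo;
			cp -= 0x10000;
			hi = 0xD800 + (cp >> 10);
			lo = 0xDC00 + (cp & 0x3FF);
			if (n + 4 > cap)
				return 0;
			out[n++] = (unsigned char)(hi >> 8);
			out[n++] = (unsigned char)(hi & 0xFF);
			out[n++] = (unsigned char)(lo >> 8);
			out[n++] = (unsigned char)(lo & 0xFF);
		}
		else
		{
			if (n + 2 > cap)
				return 0;
			out[n++] = (unsigned char)(cp >> 8);
			out[n++] = (unsigned char)(cp & 0xFF);
		}
	}
	return n;
}
```

With this sketch, "abc" stays as three bytes, while a string containing U+00E9 becomes FE FF 00 E9.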
2017-11-22 | jni: Make sure to dirty annotation whenever it changes. | Fred Ross-Perry
2017-11-22 | jni: Return correct quadpoints coordinates. | Sebastian Rasmussen
2017-11-22 | jni: Return correct inklist coordinates. | Sebastian Rasmussen
2017-11-22 | Add usage for missing options to pdf-write. | Sebastian Rasmussen
2017-11-22 | Skip unnecessary newline when writing ASCII streams. | Tor Andersson
2017-11-15 | Bug 698740: Avoid NULL fz_default_colorspaces structures. | Robin Watts
Remove code left over from development that is now wrong. We should have default colorspaces in the system, even in NON_ICC builds.
2017-11-14 | Ensure that after_text functions get ctm. | Robin Watts
Also wrap their contents in q/Q, so they can't screw up the rest of the stream.
2017-11-14 | Ensure filter inits the Trm values on a BT. | Robin Watts
Otherwise we can get empty bbox values.
2017-11-14 | Rejig filter internals slightly. | Robin Watts
Hold 2 instances of a structure, rather than a structure with 2 of each field in it. Also, correct the logic for when we send color changes.
2017-11-13 | Never draw Popup annotations. | Tor Andersson
Popup annotations should never have an appearance stream, but in case they do we shouldn't draw them. The Open property is only to toggle whether the GUI should be showing the Content text editing in a Popup (or Text) annotation, and should not affect drawing the page.
2017-11-09 | Bug 698353: Avoid having our API depend on DEBUG/NDEBUG. | Robin Watts
Currently, our API uses static inlines for fz_lock and fz_unlock, the definitions for which depend on whether we build NDEBUG or not. This isn't ideal as it causes problems when people link a release binary with a debug lib (or vice versa). We really want to continue to use static inlines for the locking functions as used from MuPDF, as we hit them hard in the keep/drop functions. We therefore remove fz_lock/fz_unlock from the public API entirely. Accordingly, we move the fz_lock/fz_unlock static inlines into fitz-imp.h (an internal header), together with the fz_keep_.../fz_drop_... functions. We then have public fz_lock/fz_unlock functions for any external callers to use that are free of complications. At the same time, to avoid another indirection, we change from holding the locking functions as a pointer to a struct to a struct itself.
2017-11-08 | Silence warning. | Tor Andersson
2017-11-08 | Bug 689699: Avoid buffer overrun. | Robin Watts
When cleaning a pdf file, various lists (of pdf_xref_len length) are defined early on. If we trigger a repair during the clean, this can cause pdf_xref_len to increase causing an overrun. Fix this by watching for changes in the length, and checking accesses to the list for validity. This also appears to fix bugs 698700-698703.
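The general pattern described above (a list sized from an early length that may later grow, with every access checked against the list's current size) can be sketched as follows. The struct and function names are hypothetical and are not taken from the pdf-write code; this only illustrates the "resize when the length changes, and validate each access" idea.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-object list, sized from the xref length seen at setup. */
struct sketch_use_list
{
	int *data;
	int len;
};

/* Grow the list to at least new_len entries, zero-filling the new slots.
 * Called whenever the xref length is observed to have increased.
 * Returns 1 on success, 0 on allocation failure (list left unchanged). */
static int sketch_expand(struct sketch_use_list *l, int new_len)
{
	int *p;

	assert(l != NULL);
	if (new_len <= l->len)
		return 1;
	p = realloc(l->data, (size_t)new_len * sizeof *p);
	if (p == NULL)
		return 0;
	memset(p + l->len, 0, (size_t)(new_len - l->len) * sizeof *p);
	l->data = p;
	l->len = new_len;
	return 1;
}

/* Bounds-checked write: returns 0 and does nothing when num is outside
 * the list, instead of writing past the end of the allocation. */
static int sketch_mark(struct sketch_use_list *l, int num)
{
	assert(l != NULL);
	if (num < 0 || num >= l->len)
		return 0;
	l->data[num] = 1;
	return 1;
}
```

A caller would compare the current xref length with `len` after any step that might trigger a repair, call sketch_expand if it grew, and use sketch_mark rather than indexing the array directly.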
2017-11-08 | Bug 698704: Fix for overflow check failing due to 'clever' compiler. | Robin Watts
Adopt Joseph's suggested fix for arithmetic overflow. Thanks to Kan-Ru Chen for spotting the problem.
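The commit message does not show the fix itself, so the following is only a general illustration of the class of problem, not the change that was adopted. In C, signed integer overflow is undefined behaviour, so a compiler may remove a check written after the fact, such as `if (a + b < a)`. A check that cannot be optimised away tests the operands before doing the arithmetic. The function name is hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Returns 1 and stores a + b in *sum when the result fits in an int;
 * returns 0 and leaves *sum untouched when it would overflow.
 * The comparison happens before the addition, so no overflow occurs. */
static int sketch_checked_add(int a, int b, int *sum)
{
	assert(sum != NULL);
	if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
		return 0;
	*sum = a + b;
	return 1;
}
```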
2017-11-08 | Bug 698689: Don't create a hint stream for a file with 0 pages. | Robin Watts
2017-11-06 | Expose text filtering through pdf_clean interface. | Robin Watts
2017-11-06 | Use text state handling in pdf_filter_processor to filter text. | Robin Watts
2017-11-06 | Extract text state handling from run pdf_processor. | Robin Watts
So it can be used in the filter pdf_processor too.
2017-11-02 | Fixes for win32 build. | Tor Andersson
2017-11-01 | Add separate fz_close_output step. | Tor Andersson
Closing flushes output and may throw exceptions. Dropping frees the state and never throws exceptions.