path: root/source/tools
Age | Commit message | Author
2015-05-15 | Support pdf files larger than 2Gig. | Robin Watts
If FZ_LARGEFILE is defined when building, MuPDF uses 64bit offsets for files; this allows us to open streams larger than 2Gig. The downsides to this are that:
* The xref entries are larger.
* All PDF ints are held as 64bit things rather than 32bit things (to cope with /Prev entries, hint stream offsets etc).
* All file positions are stored as 64bits rather than 32.
The implementation works by detecting FZ_LARGEFILE. Some #ifdeffery in fitz/system.h sets fz_off_t to either int or int64_t as appropriate, and sets defines for fz_fopen, fz_fseek, fz_ftell etc as required. These call the fseeko64 etc functions on linux (and so define _LARGEFILE64_SOURCE) and the explicit 64bit functions on windows.
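A minimal sketch of the offset-type switch described above, not the actual contents of fitz/system.h. The names MY_LARGEFILE, my_off_t and my_offset_fits are hypothetical stand-ins; the platform-specific wiring to fseeko64 and the Windows 64bit functions is deliberately left out.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for the FZ_LARGEFILE / fz_off_t switch. */
#ifdef MY_LARGEFILE
typedef int64_t my_off_t;   /* 64bit file positions */
#else
typedef int my_off_t;       /* default: plain int file positions */
#endif

/* Report whether a 64bit file position can be stored in my_off_t. */
int my_offset_fits(int64_t pos)
{
#ifdef MY_LARGEFILE
	(void)pos;
	return 1;
#else
	return pos >= INT_MIN && pos <= INT_MAX;
#endif
}
```

Built without MY_LARGEFILE, any position beyond INT_MAX is rejected; built with it, every int64_t position is representable, at the cost of wider offsets everywhere they are stored.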
2015-04-16 | mutool clean -z option to compress streams. | Tor Andersson
2015-04-14 | Fix 695918: "mudraw -sm" format string on win32. | Tor Andersson
2015-04-09 | Add -v option to mutool and mudraw to print MuPDF version number. | Tor Andersson
2015-04-09 | Remove the _no_run functions. | Tor Andersson
The new pdfclean sanitize functionality means that mutool now needs the data files, so maintaining the split that was designed to keep data files out of mutool is no longer viable.
2015-04-07 | Fix whitespace. | Tor Andersson
2015-04-07 | Add EPUB layout options to mupdf-x11 and mudraw. | Tor Andersson
2015-04-07 | Rename mutool show 'pages' to 'pagetree' to reduce possible confusion. | Tor Andersson
Fixes bug 695909.
2015-04-06 | Update mutool subtools to use PDF_NAME_xxx rather than "xxx". | Robin Watts
2015-04-06 | Add mutool pages subcommand. | Robin Watts
Inspired by bug 695823. Mutool can now dump the sizes and orientations for pages within a given file.
2015-04-06 | Move the guts of pdfclean into the lib. | Robin Watts
Michael needs to be able to call pdfclean from gsview. At the moment he's having to do this by including the pdfclean.c file into the lib build, and then calling pdfclean_main with a faked up command line. This isn't nice. pdfclean.c is implemented by pdfclean_main parsing the options/filenames out of argv and then passing the filenames/options on to a pdfclean_clean function. This seems like a much nicer API to offer to the world. We therefore pull the guts of pdfclean.c (pdfclean_clean and its subsidiary structures/functions) into pdf-clean-file.c and include this in the library build. This leaves pdfclean.c just as the command line parsing. This should not affect the size of any of the resulting binaries.
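The split described above can be sketched roughly as follows. This is an illustration only: clean_opts, clean_file and clean_main are hypothetical names, not the real pdfclean_clean / pdfclean_main signatures, and the single -z flag merely mirrors the option mentioned elsewhere in this log.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical option block handed from the command-line layer to the core. */
typedef struct { int compress; } clean_opts;

/* Library-side entry point: no argv in sight, so other programs can call it. */
int clean_file(const char *infile, const char *outfile, const clean_opts *opts)
{
	if (!infile || !outfile || !opts)
		return 1;
	/* The real cleaning work would happen here. */
	return 0;
}

/* Tool-side entry point: parse argv only, then delegate to the core. */
int clean_main(int argc, char **argv)
{
	clean_opts opts = { 0 };
	int i = 1;

	while (i < argc && argv[i][0] == '-' && argv[i][1] != '\0')
	{
		if (strcmp(argv[i], "-z") == 0)
			opts.compress = 1;
		else
			return 1; /* unknown option */
		i++;
	}
	if (argc - i < 2)
		return 1; /* need an input and an output filename */
	return clean_file(argv[i], argv[i + 1], &opts);
}
```

A caller such as a viewer can then invoke clean_file directly instead of faking up a command line for clean_main.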
2015-04-06 | Fix oddity with pdfinfo | Robin Watts
If pdfinfo is invoked as: mutool info file.pdf 1,2,3 then it will show the items found on page 1, then the items found on pages 1 and 2, then the items found on pages 1, 2 and 3. Fix this by clearing the data after each show operation.
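The shape of this fix can be illustrated with a toy accumulator. The info_state structure and both functions are hypothetical, not taken from pdfinfo.c; the point is only that the count is reset once the gathered items have been shown.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ITEMS 64

/* Hypothetical per-run state: items gathered since the last show. */
struct info_state
{
	int pages[MAX_ITEMS];
	size_t count;
};

/* Record that an item was found on the given page. */
void gather_item(struct info_state *st, int page)
{
	assert(st->count < MAX_ITEMS);
	st->pages[st->count++] = page;
}

/* Report the gathered items, then clear them so the next page range
 * starts from an empty list. Returns how many items were shown. */
size_t show_and_clear(struct info_state *st)
{
	size_t shown = st->count;
	/* Printing of st->pages[0..shown-1] would go here. */
	st->count = 0;
	return shown;
}
```

Without the reset, each call would report everything gathered so far, which is the cumulative output described above.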
2015-04-02 | Bug 695900: pdfclean return code is inverted. | Robin Watts
Silly typo. Thanks to Daniel Bloemer for pointing this out.
2015-04-01 | Update manpages. | Tor Andersson
2015-04-01 | Fix scan for %d in mudraw. | Tor Andersson
2015-04-01 | Clean up mudraw command line syntax. | Tor Andersson
... and move outline printing to mutool show.
2015-03-24 | Rework handling of PDF names for speed and memory. | Robin Watts
Currently, every PDF name is allocated in a pdf_obj structure, and comparisons are done using strcmp. Given that we can predict most of the PDF names we'll use in a given file, this seems wasteful. The pdf_obj type is opaque outside the pdf-object.c file, so we can abuse it slightly without anyone outside knowing.
We collect a sorted list of names used in PDF (resources/pdf/names.txt), and we add a utility (namedump) that preprocesses this into 2 header files. The first (include/mupdf/pdf/pdf-names-table.h, included as part of include/mupdf/pdf/object.h), defines a set of "PDF_NAME_xxxx" entries. These are pdf_obj *'s that callers can use to mean "A PDF object that means literal name 'xxxx'". The second (source/pdf/pdf-name-impl.h) is a C array of names.
We therefore update the code so that rather than passing "xxxx" to functions (such as pdf_dict_gets(...)) we now pass PDF_NAME_xxxx (to pdf_dict_get(...)). This is a fairly natural (if widespread) change. The pdf_dict_getp (and sibling) functions that take a path (e.g. "foo/bar/baz") are therefore supplemented with equivalents that take a list (pdf_dict_getl(... , PDF_NAME_foo, PDF_NAME_bar, PDF_NAME_baz, NULL)).
The actual implementation of this relies on the fact that small pointer values are never valid values. For a given pdf_obj *p, if NULL < (intptr_t)p < PDF_NAME__LIMIT then p is a literal entry in the name table. This enables us to do fast pointer compares and to skip expensive strcmps.
Also, bring "null", "true" and "false" into the same style as PDF names. Rather than using full pdf_obj structures for null/true/false, use special pointer values just above the PDF_NAME_ table. This saves memory and makes comparisons easier.
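The small-pointer trick can be sketched in isolation like this. It is an illustration under stated assumptions: the obj type, the three-entry table and every function name here are hypothetical, and the real generated tables and pdf_obj internals are not reproduced.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct obj obj; /* opaque to callers, as in the description above */

/* Hypothetical, tiny, sorted name table; index 0 is reserved so that
 * NULL never looks like a name. */
enum { NAME_DUMMY = 0, NAME_Kids, NAME_Pages, NAME_Type, NAME_LIMIT };
static const char *name_table[] = { "", "Kids", "Pages", "Type" };

/* A "name object" is just the table index smuggled into a pointer. */
#define NAME(x) ((obj *)(intptr_t)NAME_##x)

/* Small pointer values are never valid allocations, so they identify
 * entries in the name table. */
int is_builtin_name(const obj *p)
{
	intptr_t v = (intptr_t)p;
	return v > 0 && v < NAME_LIMIT;
}

/* Recover the spelling of a built-in name. */
const char *builtin_name_string(const obj *p)
{
	assert(is_builtin_name(p));
	return name_table[(intptr_t)p];
}

/* Two built-in names are equal exactly when the pointers are equal:
 * no strcmp needed. */
int builtin_names_equal(const obj *a, const obj *b)
{
	return a == b;
}
```

Converting small integers to pointers is implementation-defined in C, which is why this only works while the type stays opaque and nobody dereferences such values.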
2015-03-20 | Fix Memtrace for 64bit operation. | Robin Watts
2015-02-25 | Bug 695851: Fix SEGV in mutool info. | Robin Watts
Add missing initialisation of glo.ctx required due to API change.
2015-02-17 | Add ctx parameter and remove embedded contexts for API regularity. | Tor Andersson
Purge several embedded contexts:
* Remove embedded context in fz_output.
* Remove embedded context in fz_stream.
* Remove embedded context in fz_device.
* Remove fz_rebind_stream (since it is no longer necessary).
* Remove embedded context in svg_device.
* Remove embedded context in XML parser.
* Add ctx argument to fz_document functions.
* Remove embedded context in fz_document.
* Remove embedded context in pdf_document.
* Remove embedded context in pdf_obj.
Make fz_page independent of fz_document in the interface. We shouldn't need to pass the document to all functions handling a page. If a page is tied to the source document, it's redundant; otherwise it's just pointless.
Fix reference counting oddity in fz_new_image_from_pixmap.
2015-02-17 | Rename fz_close_* and fz_free_* to fz_drop_*. | Tor Andersson
* Rename fz_close to fz_drop_stream.
* Rename fz_close_archive to fz_drop_archive.
* Rename fz_close_output to fz_drop_output.
* Rename fz_free_* to fz_drop_*.
* Rename pdf_free_* to pdf_drop_*.
* Rename xps_free_* to xps_drop_*.
2015-02-17 | Reference count fz_document. | Tor Andersson
2015-01-20 | don't try extracting object number 0 | Simon Bünzli
pdfextract_main by default iterates through all objects from number 0 to the size of the document's xref table. Object number 0 is however always supposed to be free, so pdfextract consistently fails and shows a slightly confusing warning. Object extraction should by default start at object 1 in order to prevent this warning.
2014-09-09 | test-device: Abort interpretation when color found. | Robin Watts
Add a new class of errors and use them to abort interpretation when the test device detects a color page.
2014-08-27 | Revise test-device; thresholding and exhaustive checking. | Tor Andersson
The original version of the test-device could characterise pages as being grayscale/color purely based on the colorspaces used. This could easily be upset by grayscale images or shadings that happened to be specified in non-grayscale colorspaces however. We now look at the actual shading and image color values, and use a threshold value to allow for some measure of rounding errors in color values that are in practice grayscale.
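A minimal sketch of the thresholding idea, assuming RGB components in the range 0..1. The function name and the max-minus-min measure are illustrative assumptions; the actual comparison used by the test-device is not given in this log.

```c
#include <assert.h>

/* Treat an RGB sample as gray if its components differ by no more than
 * the threshold, which absorbs rounding errors in values that are
 * grayscale in practice. Hypothetical helper, not the MuPDF routine. */
int is_effectively_gray(float r, float g, float b, float threshold)
{
	float lo = r, hi = r;

	assert(threshold >= 0);
	if (g < lo) lo = g;
	if (b < lo) lo = b;
	if (g > hi) hi = g;
	if (b > hi) hi = b;
	return (hi - lo) <= threshold;
}
```

With a threshold of 0.02, a sample such as (0.50, 0.51, 0.50) still counts as gray, while pure red does not.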
2014-08-12 | Fix 695411: Catch errors when loading objects in mutool extract. | Tor Andersson
2014-08-12 | Change error messages in mutool extract to follow the house style. | Tor Andersson
2014-07-17 | Add feature testing device, and call it from mudraw with -T flag. | Tor Andersson
Currently only tests for the presence of non-grayscale color.
2014-05-05 | Fix 695207: printf format bug in mutool info. | Tor Andersson
2014-03-25 | Split mjs script generation to separate tool. | Tor Andersson
It has no real reason to live in mudraw, and it does pull in the javascript dependency via pdf-form.c.
2014-03-20 | Respect reverse page ranges in mutool clean. | Tor Andersson
2014-03-19 | Add routine to clean pdf content streams for pages. | Robin Watts
New routine to filter the content streams for pages, xobjects, type3 charprocs, patterns etc. The filtered streams are guaranteed to be properly matched with q/Q's, and to not have changed the top level ctm. Additionally we remove (some) repeated settings of colors etc. This filtering can be extended to be smarter later. The idea of this is to both repair after editing, and to leave the streams in a form that can be easily appended to. This is preparatory to work on Bates numbering and Watermarking. Currently the streams produced are uncompressed.
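One way to picture the q/Q guarantee is the toy balancer below. It is only a sketch under heavy assumptions: operators are reduced to single characters, balance_save_restore is a hypothetical name, and the real filter works on parsed content streams rather than strings.

```c
#include <assert.h>
#include <stddef.h>

/* Copy a string of one-character operators to out, dropping any 'Q'
 * that has no matching 'q' and appending a 'Q' for every 'q' left open.
 * Returns 0 on success, -1 if out is too small. */
int balance_save_restore(const char *ops, char *out, size_t out_size)
{
	size_t n = 0;
	int depth = 0;

	assert(out_size > 0);
	for (; *ops; ops++)
	{
		if (*ops == 'Q')
		{
			if (depth == 0)
				continue; /* unmatched restore: drop it */
			depth--;
		}
		else if (*ops == 'q')
			depth++;
		if (n + 1 >= out_size)
			return -1;
		out[n++] = *ops;
	}
	while (depth-- > 0)
	{
		if (n + 1 >= out_size)
			return -1;
		out[n++] = 'Q'; /* close anything still open */
	}
	out[n] = '\0';
	return 0;
}
```

A stream balanced this way ends with the graphics state stack at the depth it started with, which is what makes it safe to append further content.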
2014-03-19 | Make mutool clean sanitise the Dests lists when subsetting. | Robin Watts
When you use mutool clean to subset pages out of a PDF, we already remove the Name tree entries for named locations that aren't in the target file. We have hitherto failed to remove references to these removed names, though. This can cause errors (really warnings) on reading the file back.
2014-03-13 | Tweak pdfclean and pdfinfo to be useful as libraries. | Robin Watts
Firstly, we remove the use of global variables; this is done by introducing a 'globals' structure for each of these files and passing it internally between functions. Next, split the core of pdfclean_main into pdfclean_clean, and the core of pdfinfo_main into pdfinfo_info. The _main functions now do the argv processing. The new functions are now entirely thread safe, so they can be called from library functions.
2014-01-09 | Add -o option for mutool show. | Tor Andersson
Windows doesn't like redirecting binary output, so add an explicit filename argument.
2014-01-07 | Introduce 'document handlers'. | Robin Watts
We define a document handler for each file type (2 in the case of PDF, one to handle files with the ability to 'run' them, and one without). We then register these handlers with the context at startup, and then call fz_open_document... as usual. This enables people to select the document types they want at will (and even to extend the library with more document types should they wish).
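The registration scheme can be sketched as a small table of handlers held by the context. Everything here (doc_handler, doc_context, the recognize callbacks and the suffix-based scoring) is a hypothetical illustration, not the fz_document_handler API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_HANDLERS 8

/* Hypothetical handler: a name plus a callback that scores how well
 * the handler matches a filename (0 means "not mine"). */
typedef struct
{
	const char *name;
	int (*recognize)(const char *filename);
} doc_handler;

/* Hypothetical context holding whichever handlers the caller registered. */
typedef struct
{
	const doc_handler *handlers[MAX_HANDLERS];
	int count;
} doc_context;

int register_handler(doc_context *ctx, const doc_handler *h)
{
	if (ctx->count >= MAX_HANDLERS)
		return -1;
	ctx->handlers[ctx->count++] = h;
	return 0;
}

/* Pick the registered handler with the highest score, or NULL. */
const doc_handler *find_handler(const doc_context *ctx, const char *filename)
{
	const doc_handler *best = NULL;
	int best_score = 0;
	int i;

	for (i = 0; i < ctx->count; i++)
	{
		int score = ctx->handlers[i]->recognize(filename);
		if (score > best_score)
		{
			best_score = score;
			best = ctx->handlers[i];
		}
	}
	return best;
}

static int has_suffix(const char *s, const char *suffix)
{
	size_t n = strlen(s), m = strlen(suffix);
	return n >= m && strcmp(s + n - m, suffix) == 0;
}

int recognize_pdf(const char *f) { return has_suffix(f, ".pdf") ? 100 : 0; }
int recognize_xps(const char *f) { return has_suffix(f, ".xps") ? 100 : 0; }

const doc_handler pdf_handler = { "pdf", recognize_pdf };
const doc_handler xps_handler = { "xps", recognize_xps };
```

A build that registers only pdf_handler simply never opens XPS files, and a third-party handler can be added without touching the lookup code.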
2014-01-06 | fix various MSVC warnings | Simon Bünzli
Some warnings we'd like to enable for MuPDF and still be able to compile it with warnings as errors using MSVC (2008 to 2013):
* C4115: 'timeval' : named type definition in parentheses
* C4204: nonstandard extension used : non-constant aggregate initializer
* C4295: 'hex' : array is too small to include a terminating null character
* C4389: '==' : signed/unsigned mismatch
* C4702: unreachable code
* C4706: assignment within conditional expression
Also, globally disable C4701 which is frequently caused by MSVC not being able to correctly figure out fz_try/fz_catch code flow. And don't define isnan for VS2013 and later where that's no longer needed.
2013-10-31 | Add CMYK and CMYK Alpha colorspaces to mudraw options. | Tor Andersson
2013-10-31 | Add CMYK support to PAM output. | Tor Andersson
2013-09-30 | Disable image interpolation with a hint. | Robin Watts
Set the hint in mudraw when AA bits is set to 0.
2013-09-27 | add support for .tga output to mudraw | Simon Bünzli
SumatraPDF's testsuite uses Targa images as output because they're compressed while still far easier to compare than PNG and have better tool support than PCL/PWG.
2013-09-06 | Add '-' as an option for stdout to mudraw | Robin Watts
2013-08-30 | Add simple banding to mudraw. | Robin Watts
The most complex part here is to ensure that we can output various bitmaps in bands.
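The band arithmetic itself is simple and can be sketched as below. The function names and the fixed band height are assumptions for illustration; the bitmap output side, which the note above calls the complex part, is not shown.

```c
#include <assert.h>

/* Number of bands needed to cover page_height rows when each band
 * holds at most band_height rows. */
int band_count(int page_height, int band_height)
{
	assert(page_height >= 0 && band_height > 0);
	return (page_height + band_height - 1) / band_height;
}

/* Rows in a given band: band_height for all but possibly the last. */
int band_rows(int page_height, int band_height, int band)
{
	int remaining;

	assert(band >= 0 && band < band_count(page_height, band_height));
	remaining = page_height - band * band_height;
	return remaining < band_height ? remaining : band_height;
}
```

Rendering then loops over the bands, drawing and writing band_rows rows at a time, so only one band's worth of pixels needs to be held in memory.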
2013-08-28 | Dump glyph cache size as part of mudraw -M | Robin Watts
2013-08-21 | Add simple memory use tracking to mudraw | Robin Watts
2013-08-21 | Add -F flag to mudraw to allow format selection. | Robin Watts
This allows us to "mudraw -F ppm -o /dev/null" etc.
2013-07-26 | Reword mutool usage text. | Tor Andersson
2013-07-25 | Fix mutool poster operation. | Robin Watts
2013-07-24 | Bug 694429: Fix potential overflows in sprintf in pdfextract | Robin Watts
Thanks to Pengsu Cheng for pointing out the problem.
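The log does not say how the overflow was fixed. A common remedy, shown here purely as an assumption, is to format with snprintf and check for truncation; make_output_name and its filename pattern are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Build "<prefix>-<num>.<ext>" in buf without ever writing past size
 * bytes. Returns 0 on success, -1 if the name did not fit. */
int make_output_name(char *buf, size_t size, const char *prefix, int num, const char *ext)
{
	int n = snprintf(buf, size, "%s-%04d.%s", prefix, num, ext);
	return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

Unlike sprintf, snprintf is bounded by the buffer size, and its return value reveals when the output would have been longer than the space available.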
2013-07-11 | Implement dynamic page tree lookups. | Tor Andersson
No more caching a flattened page tree in doc->page_objs/refs. No more flattening of page resources, rotation and boxes. Smart page number lookup by following Parent links. Naive implementation of insert and delete page that doesn't rebalance the trees. Requires existing page tree to hook into, cannot be used to create a page tree from scratch.
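Page number lookup by following Parent links can be sketched on a simplified tree. The node structure below is a hypothetical stand-in for PDF page tree dictionaries (Parent, Kids, Count), not the MuPDF data model.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified page tree node: leaves are pages with count 1 and no kids;
 * interior nodes store the number of pages beneath them. */
typedef struct node
{
	struct node *parent;
	struct node **kids;
	int nkids;
	int count;
} node;

/* Walk up from a page to the root, adding the page counts of all
 * siblings that precede the current node at each level.
 * Returns the zero-based page number, or -1 if the tree is broken. */
int lookup_page_number(const node *page)
{
	int total = 0;
	const node *child = page;
	const node *parent;

	for (parent = page->parent; parent; child = parent, parent = parent->parent)
	{
		int i;
		for (i = 0; i < parent->nkids && parent->kids[i] != child; i++)
			total += parent->kids[i]->count;
		if (i == parent->nkids)
			return -1; /* child not listed among its parent's kids */
	}
	return total;
}
```

The cost is proportional to the depth of the tree times the fan-out, with no flattened page array to build or keep in sync.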