summaryrefslogtreecommitdiff
path: root/source/tools/pdfextract.c
AgeCommit message (Collapse)Author
2017-04-27Include required system headers.Tor Andersson
2017-03-22Rename fz_putc/puts/printf to fz_write_*.Tor Andersson
Rename fz_write to fz_write_data. Rename fz_write_buffer_* and fz_buffer_printf to fz_append_*. Be consistent in naming: fz_write_* calls write to fz_output. fz_append_* calls append to fz_buffer. Update documentation.
2016-12-27Strip extraneous blank lines.Tor Andersson
2016-12-14Let pixmap colorspace conversion create new pixmap.Sebastian Rasmussen
This moves dropping the converted pixmap into fz_convert_pixmap(), which relieves every caller from doing so. Moreover resolution, position and interpolation are kept.
2016-12-14Plug pixmap leak when fz_convert_pixmap() throws.Sebastian Rasmussen
2016-11-14Make fz_buffer structure private to fitz.Robin Watts
Move the definition of the structure contents into new fitz-imp.h file. Make all code outside of fitz access the buffer through the defined API. Add a convenience API for people that want to get buffers as null terminated C strings.
2016-10-10Bug 697050: Allow mutool extract to keep JPEGs as JPEGs.Robin Watts
If an image is a JPEG (without a mask, or a colorkey, or using decode), then extract it as such.
2016-09-01pdf: Load/open streams by indirect reference object when possible.Tor Andersson
2016-08-30Fix pdfextract for optional pixmap alpha changes.Tor Andersson
2016-07-22Fix pixmap and reference leak in pdfextract.Sebastian Rasmussen
2016-07-06pdf: Drop generation number from public interfaces.Tor Andersson
The generation number is only needed for decryption, and is assumed to be zero or irrelevant for all other uses. Store the original object number and generation in the xref slot, so that we can decrypt them even when the objects have been renumbered, without needing to pass the original object number around through the stream loading APIs.
2016-06-17Use 'size_t' instead of int as appropriate.Robin Watts
This silences the many warnings we get when building for x64 in windows. This does not address any of the warnings we get in thirdparty libraries - in particular harfbuzz. These look (at a quick glance) harmless though.
2016-06-16Drop save_alpha argument from image writing functions.Tor Andersson
2016-05-24fz_pixmap revamp: add stride and make alpha optionalRobin Watts
fz_pixmaps now have an explicit stride value. By default no change from before, but code all copes with extra gaps at the end of the line. The alpha data in fz_pixmaps is no longer compulsory. mudraw: use rgb not rgba (ppmraw), cmyk not cmyka (pkmraw). Update halftone code to not expect alpha plane. Update PNG writing to cope with alpha less input. Also hide repeated params within the png output context. ARM code needs updating.
2016-04-28Partial image decode.Robin Watts
Update the core fz_get_pixmap_from_image code to allow fetching a subarea of a pixmap. We pass in the required subarea, together with the transformation matrix for the whole image. On return, we have a pixmap at least as big as was requested, and the transformation matrix is updated to map the supplied area to the correct place on the screen. The draw device is updated to use this as required. Everywhere else passes NULLs in, and so gets unchanged behaviour. The standard 'get_pixmap' function has been updated to decode just the required areas of the bitmaps. This means that banded rendering of pages will decode just the image subareas that are required for each band, limiting the memory use. The downside to this is that each band will redecode the image again to extract just the section we want. The image subareas are put into the fz_store in the same way as full images. Currently image areas in the store are only matched when they match exactly; subareas are not identified as being able to use existing images.
2016-03-01Rename pdf_close_document to pdf_drop_document.Tor Andersson
2015-12-28Rename fz_image_get_pixmap to fz_get_pixmap_from_image.Tor Andersson
2015-12-15Rename fz_write_x to fz_save_pixmap_as_x or fz_save_bitmap_as_x.Tor Andersson
Separate naming of functions that save complete files to disk from functions that write data to streams.
2015-12-11Use fz_output instead of FILE* for most of our output needs.Tor Andersson
Use fz_output in debug printing functions. Use fz_output in pdfshow. Use fz_output in fz_trace_device instead of stdout. Use fz_output in pdf-write.c. Rename fz_new_output_to_filename to fz_new_output_with_path. Add seek and tell to fz_output. Remove unused functions like fz_fprintf. Fix typo in pdf_print_obj.
2015-06-29Rejig the internals of fz_image slightly.Robin Watts
Previously, we had people calling image->get_pixmap directly. Now we have them all call fz_image_get_pixmap, which will look for a cached version in the store, and only call get_pixmap if required. Previously fz_image_get_pixmap used to look for the cached version in the store, and decode if not - hence the decoding code is now extracted out into standard_image_get_pixmap. This was the original intent of the code, it just somehow didn't end up like that. This nicely queues us up for being able to have fz_images that use a different get_pixel implementation, such as that which will be required for the gprf code.
2015-05-15Support pdf files larger than 2Gig.Robin Watts
If FZ_LARGEFILE is defined when building, MuPDF uses 64bit offsets for files; this allows us to open streams larger than 2Gig. The downsides to this are that: * The xref entries are larger. * All PDF ints are held as 64bit things rather than 32bit things (to cope with /Prev entries, hint stream offsets etc). * All file positions are stored as 64bits rather than 32. The implementation works by detecting FZ_LARGEFILE. Some #ifdeffery in fitz/system.h sets fz_off_t to either int or int64_t as appropriate, and sets defines for fz_fopen, fz_fseek, fz_ftell etc as required. These call the fseeko64 etc functions on linux (and so define _LARGEFILE64_SOURCE) and the explicit 64bit functions on windows.
2015-04-09Remove the _no_run functions.Tor Andersson
The new pdfclean sanitize functionality mean that mutool now needs the data files, so maintaining the split that was designed to keep data files out of mutool is no longer viable.
2015-04-06Update mutool subtools to use PDF_NAME_xxx rather than "xxx".Robin Watts
2015-02-17Add ctx parameter and remove embedded contexts for API regularity.Tor Andersson
Purge several embedded contexts: Remove embedded context in fz_output. Remove embedded context in fz_stream. Remove embedded context in fz_device. Remove fz_rebind_stream (since it is no longer necessary). Remove embedded context in svg_device. Remove embedded context in XML parser. Add ctx argument to fz_document functions. Remove embedded context in fz_document. Remove embedded context in pdf_document. Remove embedded context in pdf_obj. Make fz_page independent of fz_document in the interface. We shouldn't need to pass the document to all functions handling a page. If a page is tied to the source document, it's redundant; otherwise it's just pointless. Fix reference counting oddity in fz_new_image_from_pixmap.
2015-02-17Rename fz_close_* and fz_free_* to fz_drop_*.Tor Andersson
Rename fz_close to fz_drop_stream. Rename fz_close_archive to fz_drop_archive. Rename fz_close_output to fz_drop_output. Rename fz_free_* to fz_drop_*. Rename pdf_free_* to pdf_drop_*. Rename xps_free_* to xps_drop_*.
2015-01-20don't try extracting object number 0Simon Bünzli
pdfextract_main by default iterates through all objects from number 0 to the size of the document's xref table. Object number 0 is however always supposed to be free, so pdfextract consistently fails and shows a slightly confusing warning. Object extraction should by default start at object 1 in order to prevent this warning.
2014-08-12Fix 695411: Catch errors when loading objects in mutool extract.Tor Andersson
2014-08-12Change error messages in mutool extract to follow the house style.Tor Andersson
2013-07-24Bug 694429: Fix potential overflows in sprintf in pdfextractRobin Watts
Thanks to Pengsu Cheng for pointing out the problem.
2013-06-25Update pdf_obj's to have a pdf_document field.Robin Watts
Remove the fz_context field to avoid the structure growing.
2013-06-20Rename fz_image_to_pixmap to fz_new_pixmap_from_image.Tor Andersson
Match our naming conventions.
2013-06-20Rearrange source files.Tor Andersson