path: root/source/fitz/stext-device.c
Age  Commit message  Author
2016-04-27  Add fz_close_device function.  Tor Andersson
Garbage-collected languages need a way to signal that they are done with a device other than freeing it. It is called implicitly by fz_drop_device, which takes care not to call it again if it has already been called explicitly.
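The close-then-drop behaviour described in this commit can be sketched as follows. This is a minimal illustration only: the `sketch_device` type and the `sketch_*` functions are hypothetical names invented here and do not reflect MuPDF's actual `fz_device` layout or signatures.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical device type (an assumption, not MuPDF's fz_device). */
typedef struct sketch_device {
    int closed;       /* nonzero once the close step has run */
    int close_calls;  /* how many times the close step really ran */
} sketch_device;

sketch_device *sketch_new_device(void)
{
    return calloc(1, sizeof(sketch_device));
}

/* Explicit close: the close step runs at most once per device. */
void sketch_close_device(sketch_device *dev)
{
    assert(dev != NULL);
    if (dev->closed)
        return;          /* already closed explicitly: do nothing */
    dev->closed = 1;
    dev->close_calls++;
}

/* Drop: closes implicitly if needed, then frees the device.
 * Returns how many times the close step ran, for demonstration. */
int sketch_drop_device(sketch_device *dev)
{
    int calls;
    if (dev == NULL)
        return 0;
    sketch_close_device(dev);  /* no-op if already closed */
    calls = dev->close_calls;
    free(dev);
    return calls;
}
```

In this sketch a caller from a garbage-collected binding could call `sketch_close_device` as soon as it has finished drawing and leave the eventual `sketch_drop_device` to a finalizer; either way the close step runs exactly once.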
2016-04-06  Split encoded ligatures (as from PDF) properly in text extraction.  Tor Andersson
2016-04-05  Handle many-to-one and many-to-many clusters in structured text extraction.  Tor Andersson
2016-03-14  Remove begin_page and end_page device calls.  Tor Andersson
To be moved into a new document writer interface later.
2016-02-24  Add fz_show_string function and move wmode argument to end.  Tor Andersson
2016-02-24  Add optional scissor hint argument to text clipping functions.  Tor Andersson
2016-02-22  Drop const from fz_image.  Tor Andersson
Image objects are immutable and opaque once constructed. Therefore there is no need for the const keyword.
2016-01-21  Drop const from fz_colorspace.  Tor Andersson
It's an opaque immutable structure that we don't expect to ever want to change after creation. Therefore the const keyword is not useful, and is only line noise.
2016-01-20  Tidy bidirectional source.  Robin Watts
Make the import follow mupdf style (better, if not perfect). Use ucdn where possible to avoid duplicating tables. Shrink the types, make them explicit (e.g. use fz_bidi_level rather than int) and make tables const. Use 32-bit integers for text.
2016-01-13  Add lots of consts.  Robin Watts
In general, we should use 'const fz_blah' in device calls whenever the callee should not alter the fz_blah. Push this through. This shows up various places where we fz_keep and fz_drop these const things. I've updated the fz_keep and fz_drops with appropriate casts to remove the consts. We may need to do the union dance to avoid the consts for some compilers, but will only do that if required. I think this is nicer overall, even allowing for the const<->no const problems.
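The keep/drop casts mentioned in this commit can be illustrated with a small reference-counting sketch. The `sketch_path` type and its functions are hypothetical stand-ins for the `fz_blah` objects mentioned above, not MuPDF's real definitions; the point is only that a callee holding a `const` pointer must cast the `const` away to touch the reference count.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object (an assumption for illustration). */
typedef struct sketch_path {
    int refs;
    int npoints;
} sketch_path;

sketch_path *sketch_new_path(int npoints)
{
    sketch_path *path;
    assert(npoints >= 0);
    path = malloc(sizeof(*path));
    if (path == NULL)
        return NULL;
    path->refs = 1;
    path->npoints = npoints;
    return path;
}

/* Keep: the object's contents stay untouched, but the reference
 * count changes, so the const qualifier has to be cast away. */
const sketch_path *sketch_keep_path(const sketch_path *path)
{
    if (path != NULL)
        ((sketch_path *)path)->refs++;
    return path;
}

/* Drop: returns 1 if this call freed the object, 0 otherwise. */
int sketch_drop_path(const sketch_path *path)
{
    sketch_path *p = (sketch_path *)path;
    if (p == NULL)
        return 0;
    if (--p->refs == 0) {
        free(p);
        return 1;
    }
    return 0;
}
```

The casts are well defined here because every `sketch_path` is created non-const by `sketch_new_path`; the `const` only documents that the callee will not alter the object's contents.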
2015-12-11  Remove text clip accumulation.  Tor Andersson
We can now group all clipped text into one fz_text object and simplify the device interface.
2015-12-11  Keep spans of multiple fonts and sizes in one fz_text object.  Tor Andersson
2015-12-11  Rename structured text structs and functions to 'stext'.  Tor Andersson
Less risk of confusion with the text type used in the device interface.
2015-08-24  Move ucdn.h into public headers.  Tor Andersson
2015-07-20  Fix leak during text extraction.  Robin Watts
MuPDF (the win32/linux viewer) leaks a span_soup each time it is run, even if (seemingly to the user) no text extraction operations are done. This is because the viewer silently does a text extraction pass, during which 'begin_page' is called for both page contents and annotation contents. This causes a leak of a span_soup. Change the implementation to allocate the span_soup just in time instead.
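The just-in-time allocation described in this fix can be sketched as below. All names (`sketch_soup`, `sketch_text_device`, and the `sketch_*` functions) are hypothetical and the structure is deliberately simplified; it is not the actual stext-device.c code. The idea shown is that `begin_page` allocates nothing, so calling it twice (once for page contents, once for annotations) cannot orphan an earlier allocation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the span_soup. */
typedef struct sketch_soup {
    int spans;
} sketch_soup;

/* Hypothetical text device state. */
typedef struct sketch_text_device {
    sketch_soup *soup;  /* NULL until the first span arrives */
    int allocations;    /* counts soup allocations, for demonstration */
} sketch_text_device;

/* Allocate the soup only when it is first needed. */
sketch_soup *sketch_get_soup(sketch_text_device *dev)
{
    assert(dev != NULL);
    if (dev->soup == NULL) {
        dev->soup = calloc(1, sizeof(sketch_soup));
        if (dev->soup != NULL)
            dev->allocations++;
    }
    return dev->soup;
}

/* begin_page no longer allocates, so repeated calls cannot leak. */
void sketch_begin_page(sketch_text_device *dev)
{
    (void)dev;
}

/* Adding a span triggers the allocation on demand.
 * Returns the new span count, or -1 if allocation failed. */
int sketch_add_span(sketch_text_device *dev)
{
    sketch_soup *soup = sketch_get_soup(dev);
    if (soup == NULL)
        return -1;
    return ++soup->spans;
}

/* Release the soup, if one was ever created. */
void sketch_finish(sketch_text_device *dev)
{
    free(dev->soup);
    dev->soup = NULL;
}
```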
2015-04-07  Fix some warnings.  Tor Andersson
2015-04-07  Fix structured text extraction in vertical mode.  Robin Watts
When advancing a glyph in vertical mode, it should advance down the page. The origin of the glyph as supplied is bottom left, not top right - allow for this in calculations. Previously glyphs were not being collated into spans because of this.
2015-04-07  Structured text extraction; improve glyph bounding box calculations.  Robin Watts
In vertical motion mode, when calculating bboxes we should use horizontal rather than vertical displacements from the 'axis of movement'. In horizontal mode, we displace by 'ascender' and 'descender'. Those concepts don't rotate with the motion mode, so repurpose those fields to hold bbox.x0 and bbox.x1 in vertical mode.
2015-04-07  Use fz_advance_glyph rather than direct FT calls during PDF layout.  Robin Watts
2015-02-25  Text device; collect matrix and bbox for images too.  Robin Watts
We were not filling in the matrix and bbox fields for images collected as part of the text extraction device. Fixed here.
2015-02-20  Do not crash on text extraction on pages with no text.  Robin Watts
Thanks to malc for pointing out the problem.
2015-02-17  Use embedded superclass struct instead of user pointer in devices.  Tor Andersson
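The "embedded superclass struct" idiom named in this commit title is a common C pattern, sketched here under assumptions: the base `sketch_device`, the derived `sketch_stext_device`, and the single `fill_text` callback are all hypothetical and far simpler than the real device interface. The derived struct places the base struct as its first member, so a pointer to the base can be cast back to the derived type instead of being looked up through a separate user pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical base device with one callback. */
typedef struct sketch_device sketch_device;
struct sketch_device {
    void (*fill_text)(sketch_device *dev, int glyph_count);
};

/* Hypothetical derived device: the base struct is embedded first. */
typedef struct sketch_stext_device {
    sketch_device super;  /* must remain the first member */
    int glyphs_seen;      /* device-specific state */
} sketch_stext_device;

static void sketch_stext_fill_text(sketch_device *dev, int glyph_count)
{
    sketch_stext_device *tdev;
    assert(dev != NULL);
    /* Valid because 'super' is the first member of the derived struct. */
    tdev = (sketch_stext_device *)dev;
    tdev->glyphs_seen += glyph_count;
}

/* Allocate the derived struct and hand it out as a base pointer. */
sketch_device *sketch_new_stext_device(void)
{
    sketch_stext_device *tdev = calloc(1, sizeof(*tdev));
    if (tdev == NULL)
        return NULL;
    tdev->super.fill_text = sketch_stext_fill_text;
    return &tdev->super;
}
```

Because the base and derived state live in one allocation, a single `free` on the base pointer releases both in this sketch.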
2015-02-17  Add ctx parameter and remove embedded contexts for API regularity.  Tor Andersson
Purge several embedded contexts:
Remove embedded context in fz_output.
Remove embedded context in fz_stream.
Remove embedded context in fz_device.
Remove fz_rebind_stream (since it is no longer necessary).
Remove embedded context in svg_device.
Remove embedded context in XML parser.
Add ctx argument to fz_document functions.
Remove embedded context in fz_document.
Remove embedded context in pdf_document.
Remove embedded context in pdf_obj.
Make fz_page independent of fz_document in the interface. We shouldn't need to pass the document to all functions handling a page. If a page is tied to the source document, it's redundant; otherwise it's just pointless.
Fix reference counting oddity in fz_new_image_from_pixmap.
2015-02-17  Rename fz_close_* and fz_free_* to fz_drop_*.  Tor Andersson
Rename fz_close to fz_drop_stream. Rename fz_close_archive to fz_drop_archive. Rename fz_close_output to fz_drop_output. Rename fz_free_* to fz_drop_*. Rename pdf_free_* to pdf_drop_*. Rename xps_free_* to xps_drop_*.
2013-09-13  Fix various compile warnings spotted by the cluster.  Robin Watts
2013-06-20  Rearrange source files.  Tor Andersson