path: root/apps/pdfapp.c
Age | Commit message | Author
2013-03-26 | Rework text extraction structures. | Robin Watts
Rework the text extraction structures. The broad strokes are similar, but we now hold more information at each stage to enable us to perform more detailed analysis on the structure of the page.

We now hold:

fz_text_char's (the position, ucs value, and style of each char).
fz_text_span's (sets of chars that share the same baseline/transform, with no more than an expected amount of whitespace between each char).
fz_text_line's (sets of spans that share the same baseline, more or less, allowing for super/subscript, but possibly with a larger than expected amount of whitespace).
fz_text_block's (sets of lines that follow one another).

After fz_text_analysis is called, we hope to have fz_text_blocks split such that each block is a paragraph.

This new implementation has the same restriction as the implementation it replaces, namely that chars are only considered for addition onto the most recent span at the moment, but this revised form is designed to allow easier extension, and for this restriction to be lifted.

Also add simple paragraph splitting based on finding the most common 'line distance' in blocks.

When we add spans together to collate them into lines, we record the 'horizontal' and 'vertical' spacing between them. (Not actually horizontal or vertical, so much as 'in the direction of writing' and 'perpendicular to the direction of writing'.)

The 'horizontal' value enables us to more correctly output spaces when converting to (say) html later.

The 'vertical' value enables us to spot subscripts and superscripts etc, as well as small changes in the baseline due to style changes. We are careful to base the baseline comparison on the baseline for the line, not the baseline for the previous span, as otherwise superscripts/subscripts on the end of the line affect what we match next.

Also, we are less tolerant of vertical shifts after a large gap. This avoids false positives where different columns just happen to almost line up.
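The 'most common line distance' idea can be sketched roughly as follows. This is not the code from the commit: the sk_ names, the integer baseline units, the fixed line limit and the 1.5x threshold are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SK_MAX_LINES 64 /* assumed upper bound, to avoid allocation in this sketch */

/* Return the most frequent value in dist[0..n-1]; on a tie the first seen wins. */
static int sk_most_common(const int *dist, size_t n)
{
	size_t i, j, best_count = 0;
	int best = 0;

	for (i = 0; i < n; i++)
	{
		size_t count = 0;
		for (j = 0; j < n; j++)
			if (dist[j] == dist[i])
				count++;
		if (count > best_count)
		{
			best_count = count;
			best = dist[i];
		}
	}
	return best;
}

/* baseline[] holds the baseline position of each line, increasing down the
 * block. breaks[i] is set to 1 when a paragraph break is assumed between
 * line i and line i+1. Returns the number of breaks found. */
static size_t sk_split_paragraphs(const int *baseline, size_t nlines, int *breaks)
{
	int dist[SK_MAX_LINES];
	size_t i, nbreaks = 0;
	int common;

	assert(baseline != NULL && breaks != NULL);
	assert(nlines >= 2 && nlines <= SK_MAX_LINES);

	for (i = 0; i + 1 < nlines; i++)
		dist[i] = baseline[i + 1] - baseline[i];
	common = sk_most_common(dist, nlines - 1);

	for (i = 0; i + 1 < nlines; i++)
	{
		/* Assumed rule: a gap more than 1.5x the usual distance splits. */
		breaks[i] = (2 * dist[i] > 3 * common);
		nbreaks += (size_t)breaks[i];
	}
	return nbreaks;
}
```

A real implementation would presumably compare distances with some tolerance rather than exact equality, since baselines are floating point values.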
2013-03-22 | Squash some warnings. | Robin Watts
Some -Wshadow ones, plus some 'set but not used' ones.
2013-02-19 | Bug 693639: Use strlcpy instead of strncpy! | Tor Andersson
strncpy is *not* the correct function to use. It does not null terminate, and it needlessly zeroes past the end. It was designed for fixed length database records, not strings. Use fz_strlcpy and strlcat instead.
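To make the difference concrete, here is a minimal sketch of a strlcpy-style copy. sk_strlcpy is a hypothetical stand-in written for this example; it is not fz_strlcpy, whose exact signature is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst (a buffer of 'size' bytes), truncating if needed and
 * always null terminating when size > 0. Returns strlen(src), so a result
 * >= size tells the caller that truncation happened. */
static size_t sk_strlcpy(char *dst, const char *src, size_t size)
{
	size_t len;

	assert(dst != NULL && src != NULL);
	len = strlen(src);
	if (size > 0)
	{
		size_t n = len < size - 1 ? len : size - 1;
		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}
```

By contrast, strncpy(dst, "abcdef", 4) fills all four bytes with "abcd" and writes no terminator, while copying a short string into a large buffer zero-fills the entire remainder.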
2013-02-19 | Fix whitespace. | Tor Andersson
2013-02-13 | Bump version number strings and dates for 1.2 release. | Tor Andersson
2013-02-06 | Rename bbox to irect. | Tor Andersson
2013-02-06 | Solve potential problems with partial update. | Robin Watts
When we run a display list we pass in an area for fast eliding of objects. Ensure that the area we pass in is always at least as big as the bbox with which we create display devices.
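One way to guarantee that property is to grow the area to cover the bbox before running the list. The following is only a sketch: sk_irect and sk_cover are hypothetical names, and the commit may achieve the same result differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical integer rectangle; the field names are assumptions. */
typedef struct { int x0, y0, x1, y1; } sk_irect;

/* Grow 'area' in place so that it fully covers 'bbox'; returns 'area'. */
static sk_irect *sk_cover(sk_irect *area, const sk_irect *bbox)
{
	assert(area != NULL && bbox != NULL);
	if (bbox->x0 < area->x0) area->x0 = bbox->x0;
	if (bbox->y0 < area->y0) area->y0 = bbox->y0;
	if (bbox->x1 > area->x1) area->x1 = bbox->x1;
	if (bbox->y1 > area->y1) area->y1 = bbox->y1;
	return area;
}
```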
2013-02-06 | Change to pass structures by reference rather than value. | Robin Watts
This is faster on ARM in particular. The primary changes involve fz_matrix, fz_rect and fz_bbox.

Rather than passing 'fz_rect r' into a function, we now consistently pass 'const fz_rect *r'. Where a rect is passed in and modified, we omit the 'const'. Where possible, we return the pointer to the modified structure to allow 'chaining' of expressions.

The basic upshot of this work is that we do far fewer copies of rectangle/matrix structures, and all the copies we do are explicit.

This has opened the way to other optimisations, also performed in this commit. Rather than using expressions like:

fz_concat(fz_scale(sx, sy), fz_translate(tx, ty))

we now have fz_pre_{scale,translate,rotate} functions. These can be implemented much more efficiently than doing the fully fledged matrix multiplication that fz_concat requires.

We add fz_rect_{min,max} functions to return pointers to the min/max points of a rect. These can be used in transformations to directly manipulate values.

With a little casting in the path transformation code we can avoid more needless copying.

We rename fz_widget_bbox to the more consistent fz_bound_widget.
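The pointer-passing and chaining style described here might look like the sketch below. The sk_ types and functions are hypothetical, and the row-vector matrix layout (x' = a*x + c*y + e, y' = b*x + d*y + f) is an assumption of this example, not something the log states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical affine matrix, assumed layout: x' = a*x + c*y + e, y' = b*x + d*y + f. */
typedef struct { float a, b, c, d, e, f; } sk_matrix;

static const sk_matrix sk_identity = { 1, 0, 0, 1, 0, 0 };

/* Prepend a scale to *m in place: four multiplies instead of a full
 * matrix product. Returns m so that calls can be chained. */
static sk_matrix *sk_pre_scale(sk_matrix *m, float sx, float sy)
{
	assert(m != NULL);
	m->a *= sx; m->b *= sx;
	m->c *= sy; m->d *= sy;
	return m;
}

/* Prepend a translation to *m in place; only e and f change. */
static sk_matrix *sk_pre_translate(sk_matrix *m, float tx, float ty)
{
	assert(m != NULL);
	m->e += tx * m->a + ty * m->c;
	m->f += tx * m->b + ty * m->d;
	return m;
}

/* Read-only access takes a const pointer, so no struct copy is made. */
static float sk_transform_x(const sk_matrix *m, float x, float y)
{
	assert(m != NULL);
	return x * m->a + y * m->c + m->e;
}
```

With these, a caller can write sk_pre_translate(sk_pre_scale(&m, 2, 3), 10, 20) on a matrix it owns, and the only struct copy is the explicit initialisation from sk_identity.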
2013-01-30 | Rename fz_irect back to fz_bbox. | Tor Andersson
2013-01-30 | Introduce fz_irect where the old fz_bbox was useful. | Tor Andersson
Inside the renderer we often deal with integer sized areas, for pixmaps and scissoring regions. Use a new fz_irect type in these places.
2013-01-30 | Eliminate fz_bbox in favor of fz_rect everywhere. | Tor Andersson
2012-11-28 | Bug 693452: Memory leak with transitions disabled. | Robin Watts
Since adding transition support any page turn has leaked a bitmap image. Don't save the old image unless we are really in transition mode.
2012-11-27 | Forms: avoid directly saving to the original file | Paul Gardiner
MuPDF needs access to the original file when saving, and in any case directly overwriting the original file has much more potential for data loss than use of a temporary file.
2012-11-01 | Forms: extend setFillColor implementation to include text widgets | Paul Gardiner
Also update pdf_dict_puts so that passing NULL to val deletes the terminal key.

Update pdfapp.c to update the screen between passing a mouse event and invoking a dialog box for value entry.

Extend javascript wrapper to handle all color spaces.
2012-10-29 | Support partial update in pdfapp.c | Paul Gardiner
2012-10-29 | Add fz_update_page | Paul Gardiner
Regenerate dirty appearance streams and report changed annotations since last call. Also include a partial revert of changes in 96f335bc, that turn out not to be necessary. fz_update_page must now be called between each document-changing event and the next render. pdfapp.c and the android app have been updated to do so, but do not yet take advantage of the possibility to render only the updated areas of the screen.
2012-10-25 | Update pdfapp to keep a separate display list for annotations | Paul Gardiner
2012-10-19 | Fix double free of old image in page transitions | Sebastian Rasmussen
2012-10-17 | First steps towards supporting transitions. | Robin Watts
Only Fade, Wipe and Blinds supported so far. Hit 'p' in the viewer to go into 'presentation' mode. Page swaps then transition from page to page. Pages auto advance until key or mouse is used.
2012-10-16 | Forms: avoid the need to reload the page on every change | Paul Gardiner
Add pdf_update_annot, which is called before rendering an annotation, and checks that the annotation structure has correct information. There are three reasons the information can be out of date.

Attributes of a field may have been changed such that its appearance stream needs updating. In this case the field will have "Dirty" added to its dictionary.

The mouse may have changed state over the field, and a different appearance stream needs selecting. The annotation structure now records the mouse states for which the current appearance stream is acceptable.

The field may have changed state as recorded by its "AS" value, and a different appearance stream needs selecting.
2012-09-25 | Forms: support doc.mailDoc. | Paul Gardiner
2012-09-25 | Forms: handle app.launchUrl, currently by displaying a warning | Paul Gardiner
2012-09-25 | Forms: show warning for use of app.execDialog | Paul Gardiner
app.execDialog looks very difficult to support. Hopefully we won't have to.
2012-09-25 | Forms: handle app.execMenuItem (presently just as a not-supported warning) | Paul Gardiner
The name of the menu item is passed, so presumably the app could respond to some of the possibilities.
2012-09-25 | Avoid possible buffer overflow in pdfapp_warn | Paul Gardiner
2012-09-19 | Forms: handle print request, both from javascript and from named action | Paul Gardiner
Currently the Windows app responds with a message box explaining that the MuPDF library passes print requests to the app, but the app does not implement printing.
2012-09-18 | Forms: add event handling api and specifically support for javascript alert | Paul Gardiner
2012-09-04 | Forms: mass renaming for the sake of consistency | Paul Gardiner
2012-08-29 | Merge branch 'master' into forms | Paul Gardiner
Conflicts: cbz/mucbz.c pdf/pdf_parse.c pdf/pdf_form.c xps/xps_zip.c
2012-08-16 | Bump version numbers to 1.1 | Tor Andersson
2012-08-16 | Instead of giving error, throw exception when password is invalid | Sebastian Rasmussen
Previously this triggered an assertion in the cleanup code when freeing the partially opened document.
2012-08-16 | Forms: respond to failed validation in windows app | Paul Gardiner
2012-08-08 | Merge branch 'master' into forms | Paul Gardiner
Conflicts: Makefile apps/mudraw.c pdf/pdf_write.c win32/libmupdf-v8.vcproj
2012-08-06 | Check for a display list before trying to render it in pdfapp | Sebastian Rasmussen
Previously fix 13943b92f10796efb175e769afe5b0aea85d879a introduced continued rendering of further pages for documents where one page failed to load. However, if the entire page tree was missing from a PDF document then no display list would be obtained, yet MuPDF tried to render the display list causing a null pointer dereference. Now, check for a valid display list before trying to render it.
2012-08-03 | Forms: add basic support for choice widgets to the Windows app | Paul Gardiner
2012-08-02 | Forms: implement saving on 'S' key | Paul Gardiner
2012-08-02 | Forms: add support for save on exit to the windows app | Paul Gardiner
2012-07-17 | Forms: remove unhelpful type distinction | Paul Gardiner
2012-07-09 | Forms: add widget enumeration, and text-widget content type | Paul Gardiner
Now reusing the internal representation of an annotation for widgets to avoid two separate lists
2012-07-05 | Merge branch 'master' into forms | Robin Watts
2012-07-05 | Move to static inline functions from macros. | Robin Watts
Instead of using macros for min/max/abs/clamp, we move to using inline functions. These are more typesafe, and should produce equivalent code on compilers that support inline (i.e. pretty much everything we care about these days). People can always do their own macro versions if they prefer.
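As an illustration of why the inline versions are safer, consider the sketch below. The sk_ names are hypothetical and are not the names used in the source tree.

```c
#include <assert.h>

/* A macro such as
 *     #define SK_MIN(a, b) ((a) < (b) ? (a) : (b))
 * evaluates one of its arguments twice, so SK_MIN(++i, 5) can increment i
 * twice, and it accepts any mix of operand types without complaint. */

/* The inline equivalents evaluate each argument exactly once and have
 * fixed parameter types. */
static inline int sk_mini(int a, int b)
{
	return a < b ? a : b;
}

static inline int sk_maxi(int a, int b)
{
	return a > b ? a : b;
}

static inline int sk_clampi(int x, int lo, int hi)
{
	assert(lo <= hi);
	return x < lo ? lo : (x > hi ? hi : x);
}
```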
2012-06-20 | Add better mechanism for enumerating annotation rectangles. | Robin Watts
Rather than having a dedicated call to enumerate the rectangles for the annotations on a page, add an interface for enumerating annotations with accessor functions. Currently the only accessor function is the one to get the annotation rectangle. Use this new scheme in place of fz_bound_annots within mudraw. Also use this scheme to set the caret cursor in the viewer when over a data field.
2012-06-15 | Add colorspace entry to pdfapp_s to set default colorspace. | Robin Watts
This is initialised to rgb or bgr according to whether _WIN32 is set as has always been the case. This allows apps that want to override it (such as mujstest) to do so though.
2012-06-15 | Fix mouse click glitches with forms enabled pdfapp. | Robin Watts
When clicking on a form field, especially with breakpoints in play, it is possible for the up/down click logic to get confused. Improve it by using a flag to avoid processing a down click twice.
2012-06-14 | First version of mujstest-v8.pdf and associated solution changes. | Robin Watts
Simple command line tool made from cutting all the windows specifics out of win_main.c and adding a simple script handler in. Read lines from the script, and feed those events to pdfapp. Screenshot pages as required.
2012-06-14 | After handling a mouseclick on a form field, stop processing. | Robin Watts
This is important, otherwise we can get into an unexpected state and subsequent mouse moves appear as pans.
2012-06-13 | Make backspace go to the previous page in x11 viewer. | Sebastian Rasmussen
2012-06-13 | Merge branch 'master' into forms | Paul Gardiner
Conflicts: fitz/fitz-internal.h fitz/stm_buffer.c pdf/mupdf-internal.h
2012-06-11 | Fix Bug 693099: Render failure due to corrupt jpeg data. | Robin Watts
The file supplied with the bug contains corrupt jpeg data on page 61. This causes an error to be thrown which results in mudraw exiting.

Previously, when image decode was done at loading time, the error would have been thrown under the pdf interpreter rather than under the display list renderer. This error would have been caught, a warning given, and the program would have continued. This is not ideal behaviour, as there is no way for a caller to know that there was a problem, and that the image is potentially incomplete.

The solution adopted here solves both these problems. The fz_cookie structure is expanded to include an 'errors' count. Whenever we meet an error during rendering, we increment the 'errors' count, and continue. This enables applications to spot the errors count being non-zero on exit and to display a warning.

mupdf is updated here to pass a cookie in and to check the error count at the end; if it is found to be non-zero, then a warning is given (just once per visit to each page) to say that the page may have errors on it.
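The count-and-continue pattern can be sketched as follows. Only the 'errors' field comes from the commit message; sk_cookie, sk_render_all and the return-code error reporting are assumptions made for this example, and the real renderer may signal errors differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cookie holding just the error counter. */
typedef struct { int errors; } sk_cookie;

/* Hypothetical per-object render step: returns 0 on success, non-zero on failure. */
typedef int (*sk_render_fn)(int index);

/* Render 'count' objects. A failing object is counted in the cookie (if one
 * was supplied) and rendering carries on with the next object. */
static void sk_render_all(sk_render_fn render, int count, sk_cookie *cookie)
{
	int i;

	assert(render != NULL);
	for (i = 0; i < count; i++)
	{
		if (render(i) != 0 && cookie != NULL)
			cookie->errors++;
	}
}

/* Example render step that fails on every third object. */
static int sk_fail_every_third(int index)
{
	return (index % 3 == 2) ? -1 : 0;
}
```

A caller zeroes the cookie before rendering and afterwards checks whether cookie.errors is non-zero to decide whether to show the 'page may have errors' warning.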
2012-06-01 | Merge branch 'master' into forms | Paul Gardiner
Conflicts: fitz/doc_document.c fitz/fitz-internal.h fitz/fitz.h fitz/stm_buffer.c pdf/mupdf-internal.h pdf/pdf_object.c pdf/pdf_xobject.c pdf/pdf_xref.c win32/mupdf.sln