Age | Commit message (Collapse) | Author |
|
The store is effectively a list of items, where each item is a
(key, value) pair. The design is such that we can easily get
into the state where the only reference to a value is that held
by the store. Subsequent references can then be generated by
things being 'found' from within the store.
While the only reference to an object is that held by it being
a value in the store, the store is free to evict it to save
memory.
Images present a complication to this design; images are stored
both as values within the store (by the pdf agent, so that we
do not regenerate images each time we meet them in the file),
and as parts of the keys within the store.
For example, once an image is decoded to give a pixmap, the
pixmap is cached in the store. The key to look that pixmap up
again includes a reference to the image from which the pixmap
was generated.
This means, that for document handlers such as gproof that do
not place images in the store, we can end up with images that
are kept around purely by dint of being used as references in
store keys. There is no chance of the value (the decoded pixmap)
ever being 'found' from the store as no one other than the
key is holding a reference to the image required. Thus the
images/pixmaps are never freed until the store is emptied.
This commit offers a fix for this situation.
Standard store items are based on an fz_storable type. Here we
introduce a new fz_key_storable type derived from that. As well
as keeping track of the number of references a given item has
to it, it keeps a separate count of the number of references a
given item has to it from keys in the store.
On dropping a reference, we check to see if the number of
references has become the same as the number of references from
keys in the store. If it has, then we know that these keys can
never be 'found' again. So we filter them out of the store,
which drops the items.
|
|
This makes debugging much simpler.
|
|
|
|
Add backtrace abilities, and fix missing return value from
android logging.
|
|
The upshot of debugging this is that:
1) We can't trust the FontBBox. Certainly it appears that no
one else trusts it.
2) We can't trust the d1 values in all cases, as it can lead to
use rendering glyphs far larger than we'd want to.
So we have the compromise used here.
1) We never clip to the FontBBox.
2) If the FontBBox is invalid, then we calculate the bbox
from the contents of the data streams.
3) If the FontBBox is valid, and the d1 rectangle given
does not fit inside it, then we calculate the bbox from
the contents of the data streams.
This could theoretically produce problems with glyphs that have
much more content than they actually need, and rely on the d1
rect to clip it down to sanity. If the FontBBox is invalid in
such fonts, we will go wrong.
It's not clear to me that this will actually work in Acrobat/
Foxit/gs etc either, so we defer handling this better until we
actually have an example.
Tested with bug 694952, and bug 695843 which were the last 2 in
this area.
|
|
Windows requires sys/stat.h to be included.
|
|
|
|
|
|
|
|
Previously it was inherently tied to zip archives and directories.
Now these are separated out into distinct subclasses. This prepares
for support for further archive formats.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
The type3 font(s) in the file have an invalid (0 sized) bbox, hence
the clipping of the chars goes wrong.
We now spot the invalid bbox, and suppress the clipping.
|
|
|
|
Since the removal of the begin_page device function, structured
text extraction has been unable to correctly establish the
mediabox for extracted pages.
Update the fz_new_stext_page call to take this mediabox
information. This is an API change, but hopefully most people
are calling fz_new_stext_page_from_page or
fz_new_stext_page_from_display_list which are updated here to
cope.
Update all the apps/tools to behave properly.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Closing a device or writer may throw exceptions, but much of the
foreign language bindings (JNI and JS) depend on drop to never throw
an exception (exceptions in finalizers are bad).
|
|
|
|
|
|
|
|
|
|
|
|
The "contents" field is the same as the "obj" field, so can also be
removed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
We want to turn pdf_page into a thin wrapper around a pdf_obj, so that
any updates to the underlying PDF objects will be reflected without
having to reload the pdf_page.
|
|
|
|
|
|
The mark & sweep pass of garbage collection, and resolving indirect objects when grafting objects
was following the full chain of indirect references. In the unusual case where a numbered object
is itself only an indirect reference to another object, this intermediate numbered object would
be missed both when marking for garbage collection, and when copying objects for grafting.
Add a function to resolve only one step for these two uses.
The following is an example of a file that would break during garbage collection if we
follow full indirect reference chains:
%PDF-1.3
1 0 obj
<</Type/Catalog /Foo[2 0 R 3 0 R]>>
endobj
2 0 obj
4 0 R
endobj
3 0 obj
5 0 R
endobj
4 0 obj
<</Length 1>>
stream
A
endstream
endobj
5 0 obj
<</Length 1>>
stream
B
endstream
endobj
|
|
The generation number is only needed for decryption, and is assumed
to be zero or irrelevant for all other uses.
Store the original object number and generation in the xref slot, so
that we can decrypt them even when the objects have been renumbered,
without needing to pass the original object number around through
the stream loading APIs.
|
|
Affects pdfclean, pdfmerge, and pdfposter.
|
|
|
|
|
|
|
|
|
|
|