summaryrefslogtreecommitdiff
path: root/core/fpdfapi/parser/cpdf_document.h
AgeCommit message (Collapse)Author
2018-07-02Virtualize Observable<T>::ObservedPtr::OnDestroy() for CPDF_Avail cleanupTom Sepez
This enables more complicated cleanup when an observed object is destroyed. Use it to make documents observable and to allow the CPDF_Avail to cleanup without the need for intermediate class. Change-Id: I3a8e758b7ff542e0a58710eff1ac8017205cbd45 Reviewed-on: https://pdfium-review.googlesource.com/36373 Reviewed-by: Lei Zhang <thestig@chromium.org> Commit-Queue: Tom Sepez <tsepez@chromium.org>
2018-06-27Remove useless code.Artem Strygin
No longer needed after commit 20eca1e3 Change-Id: Ica4f67d2a2df760ebf9fd507283791271ad407cd Reviewed-on: https://pdfium-review.googlesource.com/36351 Commit-Queue: Art Snake <art-snake@yandex-team.ru> Reviewed-by: Lei Zhang <thestig@chromium.org>
2018-06-27Rework of loading of CPDF_Document.Artem Strygin
Improve CPDF_Document interface. Fix relationship between CPDF_Document and CPDF_Parser. This CL changes CPDF_Document to internally create the CPDF_Parser and removes the need for the CPDF_Parser to know about the CPDF_Document. Change-Id: Iec7aef19575c90f30b9a6c919dfd4f4417e4caf2 Reviewed-on: https://pdfium-review.googlesource.com/35630 Commit-Queue: Art Snake <art-snake@yandex-team.ru> Reviewed-by: dsinclair <dsinclair@chromium.org>
2018-06-26Unify CPDF_Document loading methods.Artem Strygin
Change-Id: Ibf7aee942027adace7ec0831aefe0fe8c28e41cc Reviewed-on: https://pdfium-review.googlesource.com/35610 Commit-Queue: Art Snake <art-snake@yandex-team.ru> Reviewed-by: dsinclair <dsinclair@chromium.org>
2018-06-26Make CPDF_Document::m_pRootDict an UnownedPtr<>.Tom Sepez
In turn, this requires making some of the tests use an indirect root dictionary so as to satisfy the lifetime requirements. Change-Id: Ibdbe294a76200d4486134e5848c169a6c2d802bf Reviewed-on: https://pdfium-review.googlesource.com/36110 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: dsinclair <dsinclair@chromium.org>
2018-06-22Rework of Fixing metadata not read from linearized file.Artem Strygin
Move receiving "Info" dictionary form API implementation to CPDF_Document. Also added test. Bug: pdfium:664 Change-Id: I273980750fbdd4d20711f651245780fc9ba02789 Reviewed-on: https://pdfium-review.googlesource.com/35490 Commit-Queue: Art Snake <art-snake@yandex-team.ru> Reviewed-by: Henrique Nakashima <hnakashima@chromium.org>
2018-06-11Replace FPDF_PAGE_MAX_NUM with class scoped constant.Tom Sepez
Also avoids confusion with unrelated FPDF_PAGE API type. Bug: pdfium:1085 Change-Id: I36569573f020f0b87f13630bbab91caf351e4994 Reviewed-on: https://pdfium-review.googlesource.com/34830 Commit-Queue: Tom Sepez <tsepez@chromium.org> Reviewed-by: Lei Zhang <thestig@chromium.org>
2018-06-05Add test for FPDF_GetPageSizeByIndex()Tom Sepez
Ensure that FPDF_GetPageSizeByIndex() doesn't do a full page parse. Issue was noticed on CL https://pdfium-review.googlesource.com/32830 Change-Id: I51966e0b91e1a002d33ee51f00c0428fa1cda04d Reviewed-on: https://pdfium-review.googlesource.com/33792 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: dsinclair <dsinclair@chromium.org>
2018-06-04Revert "Make CPDF_Document cache CPDF_Pages"Tom Sepez
This reverts commit f0d9d28a034fe3650c3c2d662090c1e8687ddb16. Reason for revert: avoid parsing page. Change-Id: Id3478f7e38f1cbe95d098e00158b1d7d9dc6f76e Reviewed-on: https://pdfium-review.googlesource.com/33750 Reviewed-by: Tom Sepez <tsepez@chromium.org> Commit-Queue: Tom Sepez <tsepez@chromium.org>
2018-05-31Make CPDF_Document own its Extension.Tom Sepez
Inverting the ownership from the current situation makes cleanup much more intuitive. Change-Id: Iad9a7ca70c0746170ba753297732e3e34f96c5ba Reviewed-on: https://pdfium-review.googlesource.com/33190 Commit-Queue: Tom Sepez <tsepez@chromium.org> Reviewed-by: dsinclair <dsinclair@chromium.org> Reviewed-by: Nicolás Peña Moreno <npm@chromium.org>
2018-05-30Make CPDF_Document cache CPDF_PagesTom Sepez
We cache pages not by page number, which can bounce around as pages are inserted or removed, but by page dictionary's object number. Since the page may be created under one function and used under another, we can't take the shortcut of not instantiating a render cache nor not parsing the page. Change-Id: I9a325cda8b3141153544ac53e78a51a44e6b411a Reviewed-on: https://pdfium-review.googlesource.com/32830 Commit-Queue: Tom Sepez <tsepez@chromium.org> Reviewed-by: dsinclair <dsinclair@chromium.org>
2018-05-25Replace some #ifdef PDF_ENABLE_XFA with runtime checks.Tom Sepez
Abstract GetUserPermissions() differences via new virtual method. Abstract GetPageCount() differences via existing virtual method. Remove unused ReadHeader() form for non-xfa. Remove unused FindSubstFontByUnicode() for xfa. Remove unused FXFONT_EXACTMATCH Change-Id: I0a3de01a9841db86fcbc96991d3fa2682393b9ad Reviewed-on: https://pdfium-review.googlesource.com/32831 Commit-Queue: Tom Sepez <tsepez@chromium.org> Reviewed-by: dsinclair <dsinclair@chromium.org> Reviewed-by: Lei Zhang <thestig@chromium.org>
2018-05-25Make CPDF_Page retainable.Tom Sepez
Small step to reducing the differences between XFA and non-XFA. We still use the RetainPtr pretty much as if it were an unique_ptr, in that we're not yet caching pages and handing out multiple pointers to the same page in the non-XFA case. The one change is in page view cleanup, where we no longer need a boolean and can take (sufficient) page ownership with a RetainPtr. Tidy up some document.h -> page.h -> document.h circular inclusion while we're at it. NOTE: Wait for imminent branch to pass before landing. We'll want this to bake a while. Change-Id: I64a2f12ac3424ece1063d40583995b834117cf34 Reviewed-on: https://pdfium-review.googlesource.com/32790 Reviewed-by: dsinclair <dsinclair@chromium.org> Reviewed-by: Lei Zhang <thestig@chromium.org> Commit-Queue: Tom Sepez <tsepez@chromium.org>
2018-05-25Add proper const/non-const versions of CPDF_Dictionary::GetDictFor().Lei Zhang
BUG=pdfium:234 Change-Id: I6fde00c976ad4bb9cab632f465cf292f5b1da3d2 Reviewed-on: https://pdfium-review.googlesource.com/32914 Commit-Queue: Lei Zhang <thestig@chromium.org> Reviewed-by: dsinclair <dsinclair@chromium.org>
2018-05-25Mark more CPDF_Objects as const in font code.Lei Zhang
Change-Id: Id37333ba61ad0d395055acffd75d4d8be5eb2b3e Reviewed-on: https://pdfium-review.googlesource.com/32911 Commit-Queue: Lei Zhang <thestig@chromium.org> Reviewed-by: dsinclair <dsinclair@chromium.org>
2018-05-09Add proper const/non-const versions of CPDF_Array methods.Lei Zhang
Instead of having const methods that return non-const pointers. BUG=pdfium:234 Change-Id: I61495543f67229500dfcf2248e93468e9a9b23cf Reviewed-on: https://pdfium-review.googlesource.com/32183 Commit-Queue: Lei Zhang <thestig@chromium.org> Reviewed-by: Henrique Nakashima <hnakashima@chromium.org>
2018-05-09Mark numerious pointers as const.Lei Zhang
They are mostly CPDF_Object* and derived classes, but others that should be are marked const as well. Change-Id: Ib3344d7d8db90940df8edc97c0dd6c59da080541 Reviewed-on: https://pdfium-review.googlesource.com/32180 Commit-Queue: Lei Zhang <thestig@chromium.org> Reviewed-by: Henrique Nakashima <hnakashima@chromium.org>
2018-05-08Rename CPDF_Document::GetPage() to GetPageDictionary().Tom Sepez
Avoids a conflict should we wish to have the document actually track pages, with a GetPage() that returns CPDF_Page. Do the same thing to CPDF_DataAvail along the way. Add some missing consts as well. Change-Id: I2cb2213cc4c0649662fceab80407ee4a3f4cf30e Reviewed-on: https://pdfium-review.googlesource.com/32158 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: dsinclair <dsinclair@chromium.org>
2018-05-03Add CPDF_Page::Extension::GetDocExtension()Tom Sepez
In turn, add CPDF_Document::Extension::GetPDFDoc() so that we can use the abstract return type in more places. Mark an internal-only cpdfxfa_context method as private while we're at it. Change-Id: I08e64f4b9438bf2f731c3a37cf2a41152bbbd8fa Reviewed-on: https://pdfium-review.googlesource.com/31916 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: dsinclair <dsinclair@chromium.org>
2018-05-02Add CPDF_Document::Extension::GetPageCount()Tom Sepez
Another virtual API at the CPDF layer, to avoid a compile time ifdef XFA. Change-Id: Ia95c4d3b3d3b773aaf45c49ebcadff6b16ca18c6 Reviewed-on: https://pdfium-review.googlesource.com/31910 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: dsinclair <dsinclair@chromium.org>
2018-05-01Add CPDF_Document::Extension::DeletePage()Tom Sepez
Replaces one compile-time #ifdef XFA with a dynamic check and a call through a virtual API that prevents the CPDF code from knowing anything about the XFA code. Change-Id: If0ff9b6918b908b3eac824fe1d525c6d4f7316e7 Reviewed-on: https://pdfium-review.googlesource.com/31890 Reviewed-by: dsinclair <dsinclair@chromium.org> Commit-Queue: Tom Sepez <tsepez@chromium.org>
2018-05-01Make FPDF_Document always be CPDF_Document.Tom Sepez
Greatly minimize the impact between going back and forth from XFA being on/off, so that XFA case is just an extension beyond the non-XFA data structures we've shipped for years, instead of being a complete replacement of them. Change-Id: I6c98206e0ec99ea443547a4931eba912b1764d54 Reviewed-on: https://pdfium-review.googlesource.com/31690 Reviewed-by: dsinclair <dsinclair@chromium.org> Commit-Queue: Tom Sepez <tsepez@chromium.org>
2018-04-27Avoid potential duplicate unique_ptr to CPDF_Document from CPDFXA_Context.Tom Sepez
Should FPDFDocumentFromCPDFDocument() be called several times under XFA, we may WrapUnique() the document several times. So cache a pointer back from CPDF_Document to CPDFXFA_Context to avoid this. Because of layering, introduce a placeholder type CPDF_Document::Extension. This is actually the first step in some larger XFA ownership cleanup, but makes a nice standalone CL around the one particular issue. Change-Id: I4be326ddb1a5fae7339e6ed6745dd551b1374b53 Reviewed-on: https://pdfium-review.googlesource.com/31570 Reviewed-by: Lei Zhang <thestig@chromium.org> Commit-Queue: Tom Sepez <tsepez@chromium.org>
2017-10-02Removing unused definesDan Sinclair
Remove unused defines. Change-Id: Ibf10d8470f19cbf4528fe1342398a39ef15c1d12 Reviewed-on: https://pdfium-review.googlesource.com/15110 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: Ryan Harrison <rharrison@chromium.org>
2017-09-27Cleanup FX macrosDan Sinclair
This CL renames the FX_OS defines to have _OS_ in their names and drops the _DESKTOP suffix. The FXM defines have been changed to just FX. Change-Id: Iab172fba541713b5f6d14fb8098baf68e3364c74 Reviewed-on: https://pdfium-review.googlesource.com/14833 Reviewed-by: Tom Sepez <tsepez@chromium.org> Commit-Queue: dsinclair <dsinclair@chromium.org>
2017-09-21Move CFX_UnownedPtr to UnownedPtrDan Sinclair
This CL moves CFX_UnownedPtr to UnownedPtr and places in the fxcrt namespace. Bug: pdfium:898 Change-Id: I6d1fa463f365e5cb3aafa8c8a7a5f7eff62ed8e0 Reviewed-on: https://pdfium-review.googlesource.com/14620 Reviewed-by: Tom Sepez <tsepez@chromium.org> Commit-Queue: dsinclair <dsinclair@chromium.org>
2017-09-21Rename CFX_RetainPtr to RetainPtrDan Sinclair
This CL renames CFX_RetainPtr to RetainPtr and places in the fxcrt namespace. Bug: pdfium:898 Change-Id: I8798a9f79cb0840d3f037e8d04937cedd742914e Reviewed-on: https://pdfium-review.googlesource.com/14616 Reviewed-by: Tom Sepez <tsepez@chromium.org> Commit-Queue: dsinclair <dsinclair@chromium.org>
2017-09-18Convert string class namesRyan Harrison
Automated using git grep & sed. Replace StringC classes with StringView classes. Remove the CFX_ prefix and put string classes in fxcrt namespace. Change AsStringC() to AsStringView(). Rename tests from TEST(fxcrt, *String*Foo) to TEST(*String*, Foo). Couple of tests needed to have their names regularlized. BUG=pdfium:894 Change-Id: I7ca038685c8d803795f3ed02545124f7a224c83d Reviewed-on: https://pdfium-review.googlesource.com/14151 Reviewed-by: Tom Sepez <tsepez@chromium.org> Commit-Queue: Ryan Harrison <rharrison@chromium.org>
2017-08-31Remove fx_basic.hDan Sinclair
This CL removes the fx_basic.h header and fixes up includes as needed. Change-Id: I49af32a8327bdbcda40c50a61ffbd75d06609040 Reviewed-on: https://pdfium-review.googlesource.com/12670 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org>
2017-08-30Add truly const versions of CPDF_Document getters.Lei Zhang
Instead of only having CPDF_Dictionary* GetRoot() const, provide const CPDF_Dictionary* GetRoot() const and CPDF_Dictionary* GetRoot(). Do the same for GetInfo(). Change-Id: I6eae1208d38327fcdc7d0cd75069a01c95f4a92a Reviewed-on: https://pdfium-review.googlesource.com/11671 Commit-Queue: Lei Zhang <thestig@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org> Reviewed-by: dsinclair <dsinclair@chromium.org>
2017-06-19Fixing metadata not read from linearized file.chromium/3136Henrique Nakashima
This still won't work if the info dict is not on the first page without first calling FPDFAvail_IsFormAvail or FPDFAvail_IsPageAvail, as these are the methods that trigger parsing the rest of the data. Bug: pdfium:664 Change-Id: I0b0193e415a1153dcfb8bfba0e0482da6b6ba53c Reviewed-on: https://pdfium-review.googlesource.com/6610 Commit-Queue: Henrique Nakashima <hnakashima@chromium.org> Reviewed-by: dsinclair <dsinclair@chromium.org> Reviewed-by: Nicolás Peña <npm@chromium.org>
2017-05-24Convert to CFX_UnownedPtr, part 3.Tom Sepez
Remove an explicit clear to re-order the member destruction order. Change-Id: I33da3f3de4b8e8e0cfbdceaf5140e98f5d6f904a Reviewed-on: https://pdfium-review.googlesource.com/5791 Reviewed-by: Lei Zhang <thestig@chromium.org> Commit-Queue: Tom Sepez <tsepez@chromium.org>
2017-05-04CPDF_Document::GetPageData() normally does not return NULL.Lei Zhang
Add a comment to clarify and remove some unneeded checks. Change-Id: I8b0492548b245abc45e161085047c9f36d6c8e2b Reviewed-on: https://pdfium-review.googlesource.com/4871 Commit-Queue: Lei Zhang <thestig@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org>
2017-04-26Get rid of a few |new|s in CPDF_Document.Tom Sepez
The chain of destructors may attempt to use m_pDocPage after it has been set to null by the unique_ptr destructor. Verify it is still present before using it from any code that may be called from some other CPDF_ destructor. Change-Id: I007160231d73feed85d90efc687d6da993653f96 Reviewed-on: https://pdfium-review.googlesource.com/4499 Reviewed-by: Lei Zhang <thestig@chromium.org> Commit-Queue: Tom Sepez <tsepez@chromium.org>
2017-04-04RefCount CPDF_StreamAcc all the time.Tom Sepez
Pass stream argument to constructor; it feels like a stream accessor should always be made from a stream rather than passing one in after the fact. Change-Id: Iaa46cb37677b81f0170f5d39bab76ad38ea4af44 Reviewed-on: https://pdfium-review.googlesource.com/3620 Commit-Queue: Tom Sepez <tsepez@chromium.org> Reviewed-by: Lei Zhang <thestig@chromium.org>
2017-04-04RefCount CPDF_IccProfile all the timeTom Sepez
Make the IccProfile track its stream so that it has a proper key with which to purge the docpagedata map. Change-Id: Ib05ebc1afb828f1f5e5df62a1a33a1bfdecf507d Reviewed-on: https://pdfium-review.googlesource.com/3619 Commit-Queue: Tom Sepez <tsepez@chromium.org> Reviewed-by: Lei Zhang <thestig@chromium.org>
2017-04-01Refcount CPDF_Image all the time.chromium/3061chromium/3060Tom Sepez
Remove the old externally-counted CPDF_CountedImage type. Change-Id: Ia0b288586272da3f2daf7dfc153f08e62794321a Reviewed-on: https://pdfium-review.googlesource.com/3553 Reviewed-by: Lei Zhang <thestig@chromium.org> Commit-Queue: Lei Zhang <thestig@chromium.org>
2017-03-14Replace FX_CHAR and FX_WCHAR with underlying types.Dan Sinclair
Change-Id: I96e0a20d66b9184d22f64d8e4ce0dadd5a78c1e8 Reviewed-on: https://pdfium-review.googlesource.com/2967 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org>
2017-01-18Bad indexing in CPDF_Document::FindPageIndex when page tree corrupt.tsepez
Moving to std::vector from the more forgiving CFX_ArrayTemplate revealed the dubious page tree traversal, which depends on the correctness of the /Count entries to properly summarize the total descendants under a given node. The only "correct" thing to do is to throw away these counts as parsed, and re-compute them, perhaps in CountPages(). But I'm not willing to do that since it may break unknown documents in the wild. Pass out-params as pointers while we're at it. BUG=680376 Review-Url: https://codereview.chromium.org/2636403003
2017-01-09Remove CFX_ArrayTemplate from fpdfapitsepez
Review-Url: https://codereview.chromium.org/2611413002
2017-01-03Force stop of page tree traversal when max level reachedNicolas Pena
The previous implementation, FindPDFPage, was already doing this since the recursive call was always with return. Currently, we were trying to keep going even after reaching max level. The problem is that if the page tree is not a tree, we might loop forever. This could also be solved by keeping track of the dictionaries that have been visited, but this solution takes much less space. BUG=672172 Change-Id: Ia37aea58e92b6068de69f26736c612aa6a0ff4b3 Reviewed-on: https://pdfium-review.googlesource.com/2138 Commit-Queue: Nicolás Peña <npm@chromium.org> Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org> Reviewed-by: dsinclair <dsinclair@chromium.org>
2016-11-21Fixup lint flags.Dan Sinclair
The -build/include setting was masking out build/include_what_you_use. This CL restores them, fixes any build errors, and adds NOLINT as needed. As well, the runtime/explicit and runtime/printf flags are aslo enabled and NOLINT'd. lint cleanups Change-Id: Ib013b3eb29c8d0e48cad74c5df9028684130719f Reviewed-on: https://pdfium-review.googlesource.com/2030 Reviewed-by: Tom Sepez <tsepez@chromium.org>
2016-11-16Move ByteStringPool from document to indirect object holder.tsepez
Since the indirect object holder is now in the object creation business, this will allow it to intern strings in a subsequent CL. Review-Url: https://codereview.chromium.org/2509773003
2016-11-14Make CPDF_PageContentGenerator methods take object numberstsepez
This patch fixes a possibility that an owned CPDF_Stream is handed to the indirect object holder inside RealizeResource(). Its arguments are changed to take an object number, as is done elsewhere in the code, to suggest that only indirect objects are acceptable. BUG=660756 Review-Url: https://codereview.chromium.org/2489423002
2016-11-07Use unique_ptr return from CPDF_Parser::ParseIndirectObject()tsepez
In turn, propgate to callers. This introduces a few release() calls that will go away as more code is converted. It also removes a couple of WrapUnique calls that are no longer needed as ownership of the object flows along. Review-Url: https://codereview.chromium.org/2479303002
2016-11-07Rename CPDF_Linearized to CPDF_LinearizedHeadertsepez
My OCD insists that classes be named after nouns, and "linearized" feels like an adjective. Remove a redundant "if" while at it. Review-Url: https://codereview.chromium.org/2482973002
2016-11-07Reland of Unify some codeart-snake
Unify some code Move parsing of linearized header into separate CPDF_Linearized class. Original review: https://codereview.chromium.org/2466023002/ Revert review: https://codereview.chromium.org/2474283005/ Revert reason was: Breaking the chrome roll. See https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_rel_ng/builds/331856 ___ Added Fix for fuzzers. Review-Url: https://codereview.chromium.org/2477213003
2016-11-04Revert of Unify some code (patchset #14 id:260001 of ↵chromium/2912chromium/2911dsinclair
https://codereview.chromium.org/2466023002/ ) Reason for revert: Breaking the chrome roll. See https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_rel_ng/builds/331856 Original issue's description: > Unify some code > > Move parsing of linearized header into separate CPDF_Linearized class. > > Committed: https://pdfium.googlesource.com/pdfium/+/71333dc57ac7e4cf7963c83333730b3882ab371f TBR=thestig@chromium.org,brucedawson@chromium.org,art-snake@yandex-team.ru # Skipping CQ checks because original CL landed less than 1 days ago. NOPRESUBMIT=true NOTREECHECKS=true NOTRY=true Review-Url: https://codereview.chromium.org/2474283005
2016-11-04Unify some codeart-snake
Move parsing of linearized header into separate CPDF_Linearized class. Review-Url: https://codereview.chromium.org/2466023002
2016-11-04Traverse PDF page tree only once in CPDF_Document Try 3npm
Now, we do not start traversal from where we were at, but from the top. This makes the code less prone to bugs, as now there is no need to call methods to recursively fix things. This will save a lot of time when the trees are rather flat, as in the PDF file in the bug. It can still be slow, for instance if we have a chain of page nodes, and the last in the chain contains all of the pages (this is artificial). Try 2 at https://codereview.chromium.org/2442403002/ Also added test where Try 2 would have failed. Tested the pdf from the bug on my Mac: With this CL: load in 21 seconds Without this CL: did not load in 4 minutes, got tired of waiting BUG=chromium:638513 Review-Url: https://codereview.chromium.org/2470803003