path: root/core/fpdfapi/parser/cpdf_document.h
Age | Commit message | Author
2017-01-03 | Force stop of page tree traversal when max level reached | Nicolas Pena
The previous implementation, FindPDFPage, was already doing this since the recursive call was always with return. Currently, we were trying to keep going even after reaching max level. The problem is that if the page tree is not a tree, we might loop forever. This could also be solved by keeping track of the dictionaries that have been visited, but this solution takes much less space.
BUG=672172
Change-Id: Ia37aea58e92b6068de69f26736c612aa6a0ff4b3
Reviewed-on: https://pdfium-review.googlesource.com/2138
Commit-Queue: Nicolás Peña <npm@chromium.org>
Commit-Queue: dsinclair <dsinclair@chromium.org>
Reviewed-by: Tom Sepez <tsepez@chromium.org>
Reviewed-by: dsinclair <dsinclair@chromium.org>
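The forced stop described in this message can be sketched as follows. This is a minimal illustration, not PDFium's code: `Node`, `CountPages`, and the value of `kMaxLevel` are hypothetical stand-ins. The point is that hitting the depth limit returns `false`, and every caller propagates that result instead of moving on to the next sibling.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a page-tree dictionary (not CPDF_Dictionary).
struct Node {
  bool is_page = false;
  std::vector<Node*> kids;
};

// Assumed limit for illustration; the commit message does not state the
// value PDFium uses.
constexpr int kMaxLevel = 32;

// Counts the pages reachable from |node|. Returns false as soon as the depth
// limit is reached so that all callers up the stack stop as well.
bool CountPages(const Node* node, int level, std::size_t* count) {
  assert(node && count);
  if (node->is_page) {
    ++*count;
    return true;
  }
  if (level >= kMaxLevel)
    return false;
  for (const Node* kid : node->kids) {
    if (!CountPages(kid, level + 1, count))
      return false;  // Propagate the forced stop; do not try other kids.
  }
  return true;
}
```

With a node that lists itself twice as a kid, this returns after one descent to the limit. Without the propagated `false`, each level would go on to retry its remaining kids, so the work would grow exponentially with the depth limit. No visited set is needed, which matches the space argument in the message.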
2016-11-21 | Fixup lint flags. | Dan Sinclair
The -build/include setting was masking out build/include_what_you_use. This CL restores them, fixes any build errors, and adds NOLINT as needed. The runtime/explicit and runtime/printf flags are also enabled and NOLINT'd.
lint cleanups
Change-Id: Ib013b3eb29c8d0e48cad74c5df9028684130719f
Reviewed-on: https://pdfium-review.googlesource.com/2030
Reviewed-by: Tom Sepez <tsepez@chromium.org>
2016-11-16 | Move ByteStringPool from document to indirect object holder. | tsepez
Since the indirect object holder is now in the object creation business, this will allow it to intern strings in a subsequent CL. Review-Url: https://codereview.chromium.org/2509773003
2016-11-14 | Make CPDF_PageContentGenerator methods take object numbers | tsepez
This patch fixes a possibility that an owned CPDF_Stream is handed to the indirect object holder inside RealizeResource(). Its arguments are changed to take an object number, as is done elsewhere in the code, to suggest that only indirect objects are acceptable. BUG=660756 Review-Url: https://codereview.chromium.org/2489423002
2016-11-07 | Use unique_ptr return from CPDF_Parser::ParseIndirectObject() | tsepez
In turn, propagate to callers. This introduces a few release() calls that will go away as more code is converted. It also removes a couple of WrapUnique calls that are no longer needed as ownership of the object flows along.
Review-Url: https://codereview.chromium.org/2479303002
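The ownership pattern this message describes can be sketched as below. All names here (`Object`, `ParseObject`, `LegacyHolder`) are hypothetical and do not reflect PDFium's actual signatures. The sketch only shows why a `release()` call appears at a boundary where the callee still takes an owning raw pointer.

```cpp
#include <cassert>
#include <memory>

// Hypothetical parsed object (not CPDF_Object).
struct Object {
  int obj_num = 0;
};

// Converted API: ownership is explicit in the return type.
std::unique_ptr<Object> ParseObject(int obj_num) {
  auto obj = std::make_unique<Object>();
  obj->obj_num = obj_num;
  return obj;
}

// Unconverted callee: still takes a raw pointer and assumes ownership of it.
class LegacyHolder {
 public:
  LegacyHolder() = default;
  LegacyHolder(const LegacyHolder&) = delete;
  LegacyHolder& operator=(const LegacyHolder&) = delete;
  ~LegacyHolder() { delete obj_; }

  void Adopt(Object* obj) {
    assert(obj);
    delete obj_;
    obj_ = obj;
  }
  const Object* get() const { return obj_; }

 private:
  Object* obj_ = nullptr;
};
```

A caller would write `holder.Adopt(ParseObject(7).release());`. Once `Adopt` is changed to accept `std::unique_ptr<Object>`, the `release()` becomes unnecessary and the call passes the returned pointer straight through.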
2016-11-07 | Rename CPDF_Linearized to CPDF_LinearizedHeader | tsepez
My OCD insists that classes be named after nouns, and "linearized" feels like an adjective. Remove a redundant "if" while at it. Review-Url: https://codereview.chromium.org/2482973002
2016-11-07 | Reland of Unify some code | art-snake
Unify some code
Move parsing of linearized header into separate CPDF_Linearized class.
Original review: https://codereview.chromium.org/2466023002/
Revert review: https://codereview.chromium.org/2474283005/
Revert reason was: Breaking the chrome roll. See https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_rel_ng/builds/331856
Added fix for fuzzers.
Review-Url: https://codereview.chromium.org/2477213003
2016-11-04 | Revert of Unify some code (patchset #14 id:260001 of https://codereview.chromium.org/2466023002/ ) [chromium/2912] [chromium/2911] | dsinclair
Reason for revert: Breaking the chrome roll. See https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_rel_ng/builds/331856
Original issue's description:
> Unify some code
>
> Move parsing of linearized header into separate CPDF_Linearized class.
>
> Committed: https://pdfium.googlesource.com/pdfium/+/71333dc57ac7e4cf7963c83333730b3882ab371f
TBR=thestig@chromium.org,brucedawson@chromium.org,art-snake@yandex-team.ru
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Review-Url: https://codereview.chromium.org/2474283005
2016-11-04 | Unify some code | art-snake
Move parsing of linearized header into separate CPDF_Linearized class. Review-Url: https://codereview.chromium.org/2466023002
2016-11-04 | Traverse PDF page tree only once in CPDF_Document Try 3 | npm
Now, we do not start traversal from where we were at, but from the top. This makes the code less prone to bugs, as now there is no need to call methods to recursively fix things. This will save a lot of time when the trees are rather flat, as in the PDF file in the bug. It can still be slow, for instance if we have a chain of page nodes, and the last in the chain contains all of the pages (this is artificial).
Try 2 at https://codereview.chromium.org/2442403002/
Also added test where Try 2 would have failed.
Tested the pdf from the bug on my Mac:
With this CL: load in 21 seconds
Without this CL: did not load in 4 minutes, got tired of waiting
BUG=chromium:638513
Review-Url: https://codereview.chromium.org/2470803003
2016-11-03 | Move CPDF_Document insert methods from namespace | npm
Making the insert methods private allows us to use private members, as I will need on https://codereview.chromium.org/2470803003/ Review-Url: https://codereview.chromium.org/2472473005
2016-11-02 | Remove FX_BOOL from core | tsepez
Review-Url: https://codereview.chromium.org/2477443002
2016-10-28 | Revert of Traverse PDF page tree only once in CPDF_Document Try 2 (patchset #3 id:40001 of https://codereview.chromium.org/2442403002/ ) | npm
Reason for revert: Not quite right yet.
Original issue's description:
> Traverse PDF page tree only once in CPDF_Document
>
> Try 2: main fix was recursively popping elements from the stack. Since
> the Traverse method can be called on non-root nodes from GetPage(), we
> have to make sure to properly update the parents.
>
> Try 1 at https://codereview.chromium.org/2414423002/
>
> In our current implementation of CPDF_Document::GetPage, we traverse
> the PDF page tree until we find the index we are looking for. This is
> slow when we do calls GetPage(0), GetPage(1), ... since in this case
> the page tree will be traversed n times if there are n pages. This CL
> makes sure the page tree is only traversed once.
>
> Time to load the PDF from the bug below in chrome official build:
> Before this CL: around 1 minute 25 seconds
> After this CL: around 4 seconds
>
> BUG=chromium:638513
>
> Committed: https://pdfium.googlesource.com/pdfium/+/d3a2009d75eac3cda442f545ef0865afae7b35cf
TBR=tsepez@chromium.org,weili@chromium.org,thestig@chromium.org
# Not skipping CQ checks because original CL landed more than 1 days ago.
BUG=chromium:638513
Review-Url: https://codereview.chromium.org/2461063003
2016-10-26 | Traverse PDF page tree only once in CPDF_Document | npm
Try 2: main fix was recursively popping elements from the stack. Since the Traverse method can be called on non-root nodes from GetPage(), we have to make sure to properly update the parents.
Try 1 at https://codereview.chromium.org/2414423002/
In our current implementation of CPDF_Document::GetPage, we traverse the PDF page tree until we find the index we are looking for. This is slow when we do calls GetPage(0), GetPage(1), ... since in this case the page tree will be traversed n times if there are n pages. This CL makes sure the page tree is only traversed once.
Time to load the PDF from the bug below in chrome official build:
Before this CL: around 1 minute 25 seconds
After this CL: around 4 seconds
BUG=chromium:638513
Review-Url: https://codereview.chromium.org/2442403002
2016-10-21 | Add CPDF_Document::GetPage() unittests [chromium/2899] | npm
Added a nontrivial page tree and a test that pages are being fetched properly, both when requested in order and in reverse order. This will help prevent introducing bugs while changing the way the page tree is processed. BUG=chromium:638513 Review-Url: https://chromiumcodereview.appspot.com/2435783006
2016-10-20 | Revert of Traverse PDF page tree only once in CPDF_Document (patchset #4 id:60001 of https://codereview.chromium.org/2414423002/ ) | dsinclair
Reason for revert: Possible cause of crbug.com/657897 reverting to find out.
BUG=657897
Original issue's description:
> Traverse PDF page tree only once in CPDF_Document
>
> In our current implementation of CPDF_Document::GetPage, we traverse
> the PDF page tree until we find the index we are looking for. This is
> slow when we do calls GetPage(0), GetPage(1), ... since in this case
> the page tree will be traversed n times if there are n pages. This CL
> makes sure the page tree is only traversed once.
>
> Time to load the PDF from the bug below in chrome official build:
> Before this CL: 1 minute 40 seconds
> After this CL: 5 seconds
>
> BUG=chromium:638513
>
> Committed: https://pdfium.googlesource.com/pdfium/+/7c29e27dae139a205755c1a29b7f3ac8b36ec0da
TBR=thestig@chromium.org,tsepez@chromium.org,npm@chromium.org
# Not skipping CQ checks because original CL landed more than 1 days ago.
BUG=chromium:638513
Review-Url: https://chromiumcodereview.appspot.com/2430313006
2016-10-18 | Traverse PDF page tree only once in CPDF_Document [chromium/2895] | npm
In our current implementation of CPDF_Document::GetPage, we traverse the PDF page tree until we find the index we are looking for. This is slow when we do calls GetPage(0), GetPage(1), ... since in this case the page tree will be traversed n times if there are n pages. This CL makes sure the page tree is only traversed once.
Time to load the PDF from the bug below in chrome official build:
Before this CL: 1 minute 40 seconds
After this CL: 5 seconds
BUG=chromium:638513
Review-Url: https://codereview.chromium.org/2414423002
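The cost difference described in this message can be illustrated with a simplified sketch. It is not the CL's implementation: `PageNode`, `FindPage`, and `PageCache` are hypothetical, and the cached variant simply flattens the tree on first use, whereas the commit messages in this log describe a stack-based traversal. It also assumes a well-formed tree with no cycles.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical page-tree node (not a PDFium type).
struct PageNode {
  bool is_page = false;
  std::vector<PageNode*> kids;
};

// Naive lookup: restarts from the root on every call. |skip| is the number
// of pages still to pass over; |visits| counts nodes touched.
const PageNode* FindPage(const PageNode* node, std::size_t* skip,
                         std::size_t* visits) {
  assert(node && skip && visits);
  ++*visits;
  if (node->is_page) {
    if (*skip == 0)
      return node;
    --*skip;
    return nullptr;
  }
  for (const PageNode* kid : node->kids) {
    if (const PageNode* found = FindPage(kid, skip, visits))
      return found;
  }
  return nullptr;
}

// Cached lookup: walks the tree once, then answers by index.
class PageCache {
 public:
  explicit PageCache(const PageNode* root) : root_(root) {}

  const PageNode* GetPage(std::size_t index) {
    if (!built_) {
      Flatten(root_);
      built_ = true;
    }
    return index < pages_.size() ? pages_[index] : nullptr;
  }
  std::size_t visits() const { return visits_; }

 private:
  void Flatten(const PageNode* node) {
    ++visits_;
    if (node->is_page) {
      pages_.push_back(node);
      return;
    }
    for (const PageNode* kid : node->kids)
      Flatten(kid);
  }

  const PageNode* root_;
  bool built_ = false;
  std::size_t visits_ = 0;
  std::vector<const PageNode*> pages_;
};
```

For a flat tree with three pages, requesting pages 0, 1, and 2 through `FindPage` touches 2 + 3 + 4 = 9 nodes, while `PageCache` touches 4 nodes in total. With n pages the naive total grows quadratically, which is consistent with the load times reported above.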
2016-10-05 | Removed unused stuff, some FX_BOOL, and cleanup pageint.h a bit | npm
- Remove some unused stuff from pageint.h.
- Replace some FX_BOOL with bool in pageint.h, and related.
- Replace some "protected" with "private" in pageint.h.
- Move 2 methods into namespace in fpdf_page_parser_old.cpp.
Review-Url: https://codereview.chromium.org/2399573002
2016-10-04 | Move core/fpdfapi/fpdf_parser to core/fpdfapi/parser | dsinclair
BUG=pdfium:603 Review-Url: https://codereview.chromium.org/2392603004