Traverse PDF page tree only once in CPDF_Document Try 3

Now, we do not start traversal from where we were at, but from the top. This makes the code less prone to bugs, as now there is no need to call methods to recursively fix things. This will save a lot of time when the trees are rather flat, as in the PDF file in the bug. It can still be slow, for instance if we have a chain of page nodes, and the last in the chain contains all of the pages (this is artificial).

Try 2 at https://codereview.chromium.org/2442403002/

Also added test where Try 2 would have failed.

Tested the pdf from the bug on my Mac:
With this CL: load in 21 seconds
Without this CL: did not load in 4 minutes, got tired of waiting

BUG=chromium:638513
Review-Url: https://codereview.chromium.org/2470803003
author: npm <npm@chromium.org> 2016-11-04 12:54:51 -0700
committer: Commit bot <commit-bot@chromium.org> 2016-11-04 12:54:51 -0700
commit: ec64cee9acccd0d1e574bbbd8aa91b08c1cf254f (patch)
tree: 1fe1ed80241bffe81cd34786785ef7db5a8494c6 /core/fpdfapi/parser/cpdf_document.h
parent: 33fdebc3da676bff84d0fd0f69b9087c0c12dfeb (diff)
download: pdfium-ec64cee9acccd0d1e574bbbd8aa91b08c1cf254f.tar.xz
1 files changed, 10 insertions, 4 deletions
diff --git a/core/fpdfapi/parser/cpdf_document.h b/core/fpdfapi/parser/cpdf_document.h
index e1135260ee..0a99e42c3f 100644
--- a/core/fpdfapi/parser/cpdf_document.h
+++ b/core/fpdfapi/parser/cpdf_document.h
@@ -105,10 +105,8 @@ class CPDF_Document : public CPDF_IndirectObjectHolder {
  protected:
   // Retrieve page count information by getting count value from the tree nodes
   int RetrievePageCount() const;
-  CPDF_Dictionary* FindPDFPage(CPDF_Dictionary* pPages,
-                               int iPage,
-                               int nPagesToGo,
-                               int level);
+  // When this method is called, m_pTreeTraversal[level] exists.
+  CPDF_Dictionary* TraversePDFPages(int iPage, int* nPagesToGo, size_t level);
   int FindPageIndex(CPDF_Dictionary* pNode,
                     uint32_t& skip_count,
                     uint32_t objnum,
@@ -130,10 +128,18 @@ class CPDF_Document : public CPDF_IndirectObjectHolder {
                            bool bInsert,
                            std::set<CPDF_Dictionary*>* pVisited);
   bool InsertNewPage(int iPage, CPDF_Dictionary* pPageDict);
+  void ResetTraversal();
 
   std::unique_ptr<CPDF_Parser> m_pParser;
   CPDF_Dictionary* m_pRootDict;
   CPDF_Dictionary* m_pInfoDict;
+  // Vector of pairs to know current position in the page tree. The index in the
+  // vector corresponds to the level being described. The pair contains a
+  // pointer to the dictionary being processed at the level, and the index of
+  // the child being processed within the dictionary's /Kids array.
+  std::vector<std::pair<CPDF_Dictionary*, size_t>> m_pTreeTraversal;
+  // Index of the next page that will be traversed from the page tree.
+  int m_iNextPageToTraverse;
   bool m_bLinearized;
   int m_iFirstPageNo;
   uint32_t m_dwFirstPageObjNum;
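
The comments on m_pTreeTraversal and m_iNextPageToTraverse describe a traversal position kept as one (node, child index) pair per tree level. The following is a minimal, standalone sketch of that idea and not the PDFium implementation: Node, PageCursor, GetPage, Advance and visits are hypothetical names, the tree is a plain in-memory structure rather than CPDF_Dictionary objects with /Kids arrays, and cycle detection and the reset step that ResetTraversal() suggests are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a page tree node. Leaves are pages; interior
// nodes only hold children (the analogue of a /Kids array).
struct Node {
  bool is_page;
  std::vector<const Node*> kids;
};

// Hands out pages in document order while remembering where the walk
// stopped, so that each tree node is visited at most once overall.
class PageCursor {
 public:
  // Assumption for this sketch: the root is an interior node, not a page.
  explicit PageCursor(const Node* root) {
    assert(root && !root->is_page);
    stack_.emplace_back(root, 0);
  }

  // Returns page number `index` (0-based), or nullptr if it does not exist.
  // Pages that were already reached come from the cache; otherwise the walk
  // resumes from the saved position.
  const Node* GetPage(size_t index) {
    while (pages_.size() <= index) {
      const Node* next = Advance();
      if (!next)
        return nullptr;
      pages_.push_back(next);
    }
    return pages_[index];
  }

  // Number of child nodes inspected so far.
  size_t visits() const { return visits_; }

 private:
  // Moves to the next page in document order. stack_[level] holds the node
  // being processed at that level and the index of its next unvisited child.
  const Node* Advance() {
    while (!stack_.empty()) {
      const Node* node = stack_.back().first;
      size_t& next_kid = stack_.back().second;
      if (next_kid == node->kids.size()) {
        stack_.pop_back();  // This level is exhausted; go back up.
        continue;
      }
      const Node* child = node->kids[next_kid++];
      ++visits_;
      if (child->is_page)
        return child;
      stack_.emplace_back(child, 0);  // Descend one level.
    }
    return nullptr;
  }

  std::vector<std::pair<const Node*, size_t>> stack_;
  std::vector<const Node*> pages_;  // Pages found so far, in order.
  size_t visits_ = 0;
};
```

In this sketch, asking for a later page only continues the walk from the saved position, and asking for an earlier page is a cache lookup, so repeated lookups do not restart the walk at the root.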