path: root/core/fpdfapi/parser
Age  Commit message  Author
2018-08-06  Avoid invalid object numbers in CPDF_Parser::LoadCrossRefV5().  [chromium/3515]  Lei Zhang
BUG=chromium:865272 Change-Id: I4606bdfd78ebd6553c36b985b4f49d07b579ac40 Reviewed-on: https://pdfium-review.googlesource.com/39438 Commit-Queue: Lei Zhang <thestig@chromium.org> Reviewed-by: Art Snake <art-snake@yandex-team.ru>
2018-08-06  Check for null object type in CPDF_Parser::LoadCrossRefV5().  Lei Zhang
BUG=chromium:871042 Change-Id: Id4566b29270ab738c69d46cb96fc134485d6ee2f Reviewed-on: https://pdfium-review.googlesource.com/39510 Reviewed-by: Art Snake <art-snake@yandex-team.ru> Commit-Queue: Lei Zhang <thestig@chromium.org>
2018-08-06  Do more CPDF_Parser::LoadCrossRefV5() cleanup.  Lei Zhang
- Use range for-loop to avoid needing "i" and "j". - Avoid repeatedly calculating "startnum + j". - Reduce levels of nested ifs. - Remove variable that is only used once. Change-Id: I9d08cef1082812fcfaa2699f65720165c52ebcff Reviewed-on: https://pdfium-review.googlesource.com/39437 Reviewed-by: Art Snake <art-snake@yandex-team.ru> Commit-Queue: Lei Zhang <thestig@chromium.org>
2018-08-04  Clarify integer types in CPDF_Parser::LoadCrossRefV5().  Lei Zhang
GetVarInt() returns uint32_t. So assign the results to variables of type uint32_t. Then make sure those results get passed on as uint32_t, or use pdfium::base::IsValueInRangeForNumericType<T>() to make sure they can be converted to type T safely. Change-Id: I4556f0b89b4e5cdb99ab530119c8051ec8a9411d Reviewed-on: https://pdfium-review.googlesource.com/39436 Reviewed-by: Art Snake <art-snake@yandex-team.ru> Commit-Queue: Lei Zhang <thestig@chromium.org>
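The range check described in this commit can be sketched as follows. This is a minimal illustration, not PDFium's code: `IsInRangeFor` and `CheckedConvert` are hypothetical stand-ins for `pdfium::base::IsValueInRangeForNumericType<T>()`, restricted here to a `uint32_t` source.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical stand-in for pdfium::base::IsValueInRangeForNumericType<T>(),
// limited to an unsigned 32-bit source value.
template <typename T>
bool IsInRangeFor(uint32_t value) {
  static_assert(std::is_integral<T>::value, "T must be an integer type");
  // Widen both sides to uint64_t so the comparison itself cannot overflow.
  // The source is unsigned, so only the upper bound needs checking.
  return static_cast<uint64_t>(value) <=
         static_cast<uint64_t>(std::numeric_limits<T>::max());
}

// Converts only when the value fits in T; returns false otherwise and
// leaves |*out| untouched.
template <typename T>
bool CheckedConvert(uint32_t value, T* out) {
  assert(out);
  if (!IsInRangeFor<T>(value))
    return false;
  *out = static_cast<T>(value);
  return true;
}
```

A caller would keep the parsed value as `uint32_t` for as long as possible and call `CheckedConvert<int32_t>(value, &result)` only at the point where a narrower or signed type is required.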
2018-08-02  Rework of CPDF_DataAvail::CheckHintTables.  Artem Strygin
Move HintTables parsing logic into CPDF_HintTables. Change-Id: I9748179fe9fc3ac44f88c19c347e30c0e7e3ac67 Reviewed-on: https://pdfium-review.googlesource.com/38771 Reviewed-by: Lei Zhang <thestig@chromium.org> Commit-Queue: Art Snake <art-snake@yandex-team.ru>
2018-08-02  Remove some checks in IsLinearizedHeaderValid().  Lei Zhang
One check can never fail. The other check can be done earlier, before creating the CPDF_LinearizedHeader. Change-Id: I0bccb2a9e19e0d5517daf96684adba6bb3a203bf Reviewed-on: https://pdfium-review.googlesource.com/39412 Reviewed-by: Art Snake <art-snake@yandex-team.ru> Commit-Queue: Lei Zhang <thestig@chromium.org>
2018-08-02  Rework of CPDF_Parser::GetLastObjNum.  Artem Strygin
Change-Id: I0481774858a9d9823580e1207807e35be8a9eea9 Reviewed-on: https://pdfium-review.googlesource.com/36270 Reviewed-by: Lei Zhang <thestig@chromium.org> Commit-Queue: Art Snake <art-snake@yandex-team.ru>
2018-07-30  Check maximum bit count of shared group object numbers.  Artem Strygin
Bug: chromium:868477 Change-Id: I5957c5ef051bc4fa8eb51efa6a7fc142996742c5 Reviewed-on: https://pdfium-review.googlesource.com/39130 Commit-Queue: Art Snake <art-snake@yandex-team.ru> Reviewed-by: Henrique Nakashima <hnakashima@chromium.org>
2018-07-27  Make pdfium_embeddertests pass on Windows 10.  Lei Zhang
BUG=chromium:828177 NOTRY=true Change-Id: I30123087bbe11aaaa6175b5f729b7ab55107a975 Reviewed-on: https://pdfium-review.googlesource.com/38902 Commit-Queue: Lei Zhang <thestig@chromium.org> Reviewed-by: Henrique Nakashima <hnakashima@chromium.org>
2018-07-27  Parse obj nums range within Hint tables for shared groups.  Artem Strygin
Change-Id: Ib22db6c57d2066ef70c0ef12e44d1e5eee6611a5 Reviewed-on: https://pdfium-review.googlesource.com/36410 Commit-Queue: Art Snake <art-snake@yandex-team.ru> Reviewed-by: Lei Zhang <thestig@chromium.org>
2018-07-25  Change GetHeaderOffset() to return Optional<FX_FILESIZE>.  Lei Zhang
Remove |kInvalidHeaderOffset|. Change-Id: I5978e745e97aa4e13299dd21028721725ac0c996 Reviewed-on: https://pdfium-review.googlesource.com/38853 Commit-Queue: Lei Zhang <thestig@chromium.org> Reviewed-by: Art Snake <art-snake@yandex-team.ru>
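Replacing a sentinel such as |kInvalidHeaderOffset| with an optional return value might look like the sketch below. It uses `std::optional` (C++17) in place of PDFium's own `Optional`, and `FindHeaderOffset` is a hypothetical helper, not the real GetHeaderOffset().

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical helper: returns the byte offset of the "%PDF" marker in
// |data|, or std::nullopt when the marker is absent. No magic "invalid"
// constant is needed to signal failure.
std::optional<std::size_t> FindHeaderOffset(const std::string& data) {
  const std::size_t pos = data.find("%PDF");
  if (pos == std::string::npos)
    return std::nullopt;
  return pos;
}
```

Callers then test `has_value()` before reading the offset, so a missing header cannot be mistaken for a legitimate offset value.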
2018-07-25  Remove CFX_MemoryStream uses in tests.  Lei Zhang
Replace with CFX_BufferSeekableReadStream, which allows for spans and const inputs. Change CXFA_DocumentParser to take IFX_SeekableReadStream instead of IFX_SeekableStream in the process. Change-Id: I0168451350c9fc250231f0414c38738a4d86ca42 Reviewed-on: https://pdfium-review.googlesource.com/38852 Commit-Queue: Lei Zhang <thestig@chromium.org> Reviewed-by: Ryan Harrison <rharrison@chromium.org>
2018-07-25  Change CFX_BufferSeekableReadStream to take a span.  Lei Zhang
Change-Id: Ib9e20fdfc637b2ba0358586e23ad72454b0b8ad1 Reviewed-on: https://pdfium-review.googlesource.com/38851 Commit-Queue: Lei Zhang <thestig@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org>
2018-07-25  Move CPDF_SyntaxParser init methods into ctor.  Lei Zhang
- CPDF_SyntaxParser can no longer be initialized multiple times. - Make the file length and header offset const. - Make the header offset type FX_FILESIZE consistently. - Simplify for the common case where the header offset is 0. Change-Id: I7138db1fbcec3b7578b0239b92fc1154fa4dc4ce Reviewed-on: https://pdfium-review.googlesource.com/38850 Reviewed-by: Art Snake <art-snake@yandex-team.ru> Commit-Queue: Lei Zhang <thestig@chromium.org>
2018-07-25  Fix hint tables parsing.  Artem Strygin
Sample PDF: https://yadi.sk/d/oWLtAEfy3YbEb3 For offsets equal to the hint stream offset, add the hint stream length to determine the actual offset, because linearization inserted the hint stream at the original location of the object. Also, the number of bits needed to represent the numerator of the fractional position for each shared object reference may be zero if each shared group contains only one object, with its object number incremented by 1. Change-Id: I4754d603f388354821e8d0cac97ad99a7578fe4b Reviewed-on: https://pdfium-review.googlesource.com/36610 Commit-Queue: Art Snake <art-snake@yandex-team.ru> Reviewed-by: Lei Zhang <thestig@chromium.org>
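The offset adjustment this commit describes can be illustrated roughly as below. The function name and parameters are hypothetical, and the sketch assumes that every offset at or beyond the hint stream's position is shifted by the hint stream length; the commit message itself only spells out the equality case.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper. |offset| is a location recorded in the hint tables,
// which does not account for the hint stream itself. Assumption: offsets at
// or after |hint_stream_offset| must be moved past the inserted hint stream.
int64_t AdjustHintTableOffset(int64_t offset,
                              int64_t hint_stream_offset,
                              int64_t hint_stream_length) {
  assert(hint_stream_length >= 0);
  if (offset >= hint_stream_offset)
    return offset + hint_stream_length;
  return offset;
}
```

With a comparison of `>` instead of `>=`, an object that originally sat exactly where the hint stream was inserted would be reported at the hint stream's own position.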
2018-07-25  Use document size instead of file size while parsing.  Artem Strygin
We should use the document size instead of the file size, because all offsets and sizes read from the document should take the header offset into account. Added some tests for parsing documents with a header offset. Also drop the friendship of CPDF_SyntaxParser with CPDF_Parser. Change-Id: Iebec75ab2ee07fb644a6c653b4ef5c2e09af09fe Reviewed-on: https://pdfium-review.googlesource.com/35830 Commit-Queue: Art Snake <art-snake@yandex-team.ru> Reviewed-by: Lei Zhang <thestig@chromium.org>
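The distinction between file size and document size can be sketched like this. `DocumentBounds` and its members are hypothetical names chosen for illustration; the sketch assumes the document starts at the header offset and runs to the end of the file.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a PDF embedded in a file after some leading bytes.
struct DocumentBounds {
  int64_t file_size;      // Total bytes in the underlying file.
  int64_t header_offset;  // Position of the PDF header within the file.

  // Bytes that belong to the document itself.
  int64_t DocumentSize() const { return file_size - header_offset; }

  // Offsets stored inside the PDF are relative to the header, so they are
  // validated against the document size rather than the file size.
  bool IsValidDocumentOffset(int64_t offset) const {
    return offset >= 0 && offset < DocumentSize();
  }

  // Translates a document-relative offset into a file position.
  int64_t ToFilePosition(int64_t offset) const {
    assert(IsValidDocumentOffset(offset));
    return header_offset + offset;
  }
};
```

For a 1000-byte file with a 100-byte prefix, a stored offset of 950 passes a naive file-size check but lies outside the 900-byte document.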
2018-07-24  Fix encryption dictionary owning.  Artem Strygin
Return encryption dictionary as const reference from CPDF_Parser. Create a copy in CPDF_Creator if needed. Change-Id: I270f71d307d818fba7f65ebe379f5942ae816934 Reviewed-on: https://pdfium-review.googlesource.com/38390 Reviewed-by: Lei Zhang <thestig@chromium.org> Commit-Queue: Art Snake <art-snake@yandex-team.ru>
2018-07-23  Rework of CPDF_Object writing.  Artem Strygin
Move writing logic into implementation of related classes. Change-Id: If70dc418b352b562ee681ea34fa6595d6f52eee3 Reviewed-on: https://pdfium-review.googlesource.com/36350 Commit-Queue: Art Snake <art-snake@yandex-team.ru> Reviewed-by: Tom Sepez <tsepez@chromium.org>
2018-07-23  Add support of rebuilding crossrefs with compressed objects.  Artem Strygin
Change-Id: I0743c34f0206f85828570430edb9f62b6b0cdbb5 Reviewed-on: https://pdfium-review.googlesource.com/37315 Reviewed-by: Lei Zhang <thestig@chromium.org> Commit-Queue: Art Snake <art-snake@yandex-team.ru>
2018-07-20  Rework of CPDF_Parser::RebuildCrossRef.  [chromium/3498]  Artem Strygin
Use CPDF_SyntaxParser logic to rebuild crossref. Change-Id: I394f64e76294b97c6a7c2b8984a880712fd193a7 Reviewed-on: https://pdfium-review.googlesource.com/37314 Reviewed-by: Lei Zhang <thestig@chromium.org> Commit-Queue: Art Snake <art-snake@yandex-team.ru>
2018-07-18  Add pdfium::span::as_bytes() and as_writable_bytes().  Tom Sepez
Picks up some enhancements from base/span.h. In turn, also adds the size_bytes() helper. Differs from base version in that it works around C++14 enable_if_t<>, and avoids the dynamic_extent template specialization tricks. Use it in a few places where appropriate. Change-Id: I86f72cf0023f2d4317a7afa351fddee601c8f86c Reviewed-on: https://pdfium-review.googlesource.com/38251 Reviewed-by: Daniel Cheng <dcheng@chromium.org> Commit-Queue: Tom Sepez <tsepez@chromium.org>
2018-07-18  Do not add invalid objects to the cross reference table.  [chromium/3496]  Lei Zhang
BUG=chromium:851994 Change-Id: I2e14401271c70afa204221e0f3d469f0b82ce8cf Reviewed-on: https://pdfium-review.googlesource.com/37871 Commit-Queue: Lei Zhang <thestig@chromium.org> Reviewed-by: Art Snake <art-snake@yandex-team.ru>
2018-07-18  Avoid writing const/non-const versions of the same function.  Lei Zhang
Use const_cast for the non-const version to call the const version. Change-Id: Ibdf5fe53255ee6e983555080336f5d63e683afd1 Reviewed-on: https://pdfium-review.googlesource.com/37490 Reviewed-by: Henrique Nakashima <hnakashima@chromium.org> Commit-Queue: Lei Zhang <thestig@chromium.org>
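The idiom named in this commit, where the non-const overload forwards to the const one, is sketched below on a hypothetical `ObjectTable` class that is not part of PDFium.

```cpp
#include <utility>
#include <vector>

// Hypothetical container used only to illustrate the forwarding idiom.
class ObjectTable {
 public:
  explicit ObjectTable(std::vector<int> values) : values_(std::move(values)) {}

  // The lookup logic lives only in the const version.
  const int* Find(int value) const {
    for (const int& v : values_) {
      if (v == value)
        return &v;
    }
    return nullptr;
  }

  // The non-const version calls the const one and casts away constness on
  // the result. This is safe because |this| is known to be non-const here.
  int* Find(int value) {
    return const_cast<int*>(
        static_cast<const ObjectTable*>(this)->Find(value));
  }

 private:
  std::vector<int> values_;
};
```

The cast direction matters: forwarding from the const overload to the non-const one would risk modifying an object that really is const.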
2018-07-18  Use CPDF_CrossRefTable within CPDF_Parser  Artem Strygin
Change-Id: I354e8bed12606abdc67427bbc7928e3b1f11e243 Reviewed-on: https://pdfium-review.googlesource.com/35433 Commit-Queue: Art Snake <art-snake@yandex-team.ru> Reviewed-by: Lei Zhang <thestig@chromium.org>
2018-07-18  Make CPDF_Parser::GetTrailer const method.  Artem Strygin
Use own copy of encryption dictionary within CPDF_Parser, to prevent modification of original trailer. Change-Id: I6246b872d431b94411fcec694c5176f8d85dfe26 Reviewed-on: https://pdfium-review.googlesource.com/35450 Commit-Queue: Art Snake <art-snake@yandex-team.ru> Reviewed-by: Lei Zhang <thestig@chromium.org>
2018-07-16  Fix some nits in CPDF_Document.  Lei Zhang
Change-Id: I57f89b9f2a8ef3f351e7574a76d6064ffde150d3 Reviewed-on: https://pdfium-review.googlesource.com/37870 Reviewed-by: Henrique Nakashima <hnakashima@chromium.org> Commit-Queue: Lei Zhang <thestig@chromium.org>
2018-07-16  Remove unused member from CPDF_DataAvail.  Tom Sepez
Change-Id: I3686bd3d28a84aae39c750a371902e1e5d62b365 Reviewed-on: https://pdfium-review.googlesource.com/37050 Reviewed-by: dsinclair <dsinclair@chromium.org> Commit-Queue: Tom Sepez <tsepez@chromium.org>
2018-07-16  Get rid of some loose allocs/free in CPDF_Document.  [chromium/3494]  Tom Sepez
Use std::vector<> as a manager for contiguous buffers. Change-Id: Icaacbd4b7010b928237aa71485411ade7539412a Reviewed-on: https://pdfium-review.googlesource.com/37012 Reviewed-by: dsinclair <dsinclair@chromium.org> Commit-Queue: Tom Sepez <tsepez@chromium.org>
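Using `std::vector<>` to own a contiguous buffer, rather than pairing a manual allocation with a manual free, might look like the following sketch. `FillPattern` and `MakePatternBuffer` are hypothetical names; the first stands in for any C-style routine that writes into a caller-provided buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical C-style routine that fills a caller-provided buffer.
void FillPattern(uint8_t* dest, std::size_t size) {
  for (std::size_t i = 0; i < size; ++i)
    dest[i] = static_cast<uint8_t>(i & 0xFF);
}

// The vector owns the storage: it is zero-initialised on construction and
// released automatically on every return path, including early exits.
std::vector<uint8_t> MakePatternBuffer(std::size_t size) {
  std::vector<uint8_t> buffer(size);
  FillPattern(buffer.data(), buffer.size());
  return buffer;
}
```

Because `data()` exposes the underlying contiguous storage, the vector can be handed to pointer-and-length interfaces without changing them.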
2018-07-12  Remove CPDF_HintTables::GetItemLength()  Artem Strygin
Commit {Insert later} removed the last caller to this method. Change-Id: I1689b33486396cc3a41139f984f819b39ab02b2a Reviewed-on: https://pdfium-review.googlesource.com/35130 Commit-Queue: Art Snake <art-snake@yandex-team.ru> Reviewed-by: dsinclair <dsinclair@chromium.org>
2018-07-12  Implement CPDF_HintsTable::SharedObjGroupInfo.  Artem Strygin
Merge shared objects related data into CPDF_HintsTable::SharedObjGroupInfo. Change-Id: I53bb7fc42ea6bcd26b3ebf91b8c6aa402108d086 Reviewed-on: https://pdfium-review.googlesource.com/15830 Reviewed-by: Lei Zhang <thestig@chromium.org> Commit-Queue: Art Snake <art-snake@yandex-team.ru>
2018-07-12  Reland "Avoid duplicate data buffering in CPDF_SyntaxParser::ReadStream()."  Artem Strygin
This is a reland of 77f15f7883638a4ced131d74c053af10a5970ce9 Original change's description: > Avoid duplicate data buffering in CPDF_SyntaxParser::ReadStream(). > > Allow sub-streams created from an IFX_SeekableReadStream to provide > stream data without copying memory. > The data will only reside in the top-level stream. > > For example: > For file > http://www.major-landrover.ru/upload/attachments/f/9/f96aab07dab04ae89c8a509ec1ef2b31.pdf > (18 Mb) > > The memory usage is reduced by ~13 Mb. > > Change-Id: I2595c014d0fbe1fdd181cc04965cfd7d901c2d88 > Reviewed-on: https://pdfium-review.googlesource.com/35930 > Commit-Queue: Art Snake <art-snake@yandex-team.ru> > Reviewed-by: dsinclair <dsinclair@chromium.org> Change-Id: I4c4d5dcf42ff44784468ac7a7c302df509fc804d Reviewed-on: https://pdfium-review.googlesource.com/37313 Reviewed-by: Henrique Nakashima <hnakashima@chromium.org> Commit-Queue: Art Snake <art-snake@yandex-team.ru>
2018-07-11  Fix crash and memory leak.  Artem Strygin
Do not return the size within CPDF_StreamAcc when reading data fails. Also free the buffers in this case. Bug: chromium:860210 Change-Id: Ifb2a061d7c8427409b68c33f213c5c55343fb946 Reviewed-on: https://pdfium-review.googlesource.com/37310 Reviewed-by: Henrique Nakashima <hnakashima@chromium.org> Commit-Queue: Art Snake <art-snake@yandex-team.ru>
2018-07-11  Do not store cross ref v5 obj within document.  Artem Strygin
It is currently not necessary to store the cross ref v5 obj within CPDF_IndirectObjectHolder (CPDF_Document), because all necessary data from the cross ref is parsed or cloned, and owned by CPDF_Parser separately from CPDF_IndirectObjectHolder. This also fixes a regression from commit 4ea4459e. BUG=chromium:810768 Change-Id: I0d1a11ff027210f4f15804a69d12416838ec9815 Reviewed-on: https://pdfium-review.googlesource.com/37110 Reviewed-by: Lei Zhang <thestig@chromium.org> Commit-Queue: Art Snake <art-snake@yandex-team.ru>
2018-07-06  Revert "Avoid duplicate data buffering in CPDF_SyntaxParser::ReadStream()."  Henrique Nakashima
This reverts commit 77f15f7883638a4ced131d74c053af10a5970ce9. Reason for revert: Causes crbug.com/860210 Bug: chromium:860210 Original change's description: > Avoid duplicate data buffering in CPDF_SyntaxParser::ReadStream(). > > Allow sub-streams created from an IFX_SeekableReadStream to provide > stream data without copying memory. > The data will only reside in the top-level stream. > > For example: > For file > http://www.major-landrover.ru/upload/attachments/f/9/f96aab07dab04ae89c8a509ec1ef2b31.pdf > (18 Mb) > > The memory usage is reduced by ~13 Mb. > > Change-Id: I2595c014d0fbe1fdd181cc04965cfd7d901c2d88 > Reviewed-on: https://pdfium-review.googlesource.com/35930 > Commit-Queue: Art Snake <art-snake@yandex-team.ru> > Reviewed-by: dsinclair <dsinclair@chromium.org> TBR=tsepez@chromium.org,dsinclair@chromium.org,art-snake@yandex-team.ru # Not skipping CQ checks because original CL landed > 1 day ago. Change-Id: I947fca17052765935a952a4f25ca48f6599c4af9 Reviewed-on: https://pdfium-review.googlesource.com/37210 Reviewed-by: Henrique Nakashima <hnakashima@chromium.org> Commit-Queue: Henrique Nakashima <hnakashima@chromium.org>
2018-07-03  Remove a parameter from CPDF_SyntaxParser::FindTag().  Lei Zhang
The limit parameter is always set to 0. Change-Id: Idf7f44e1c5a895e05ad474932d3e9df85f435e3f Reviewed-on: https://pdfium-review.googlesource.com/36990 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: dsinclair <dsinclair@chromium.org>
2018-07-03  Avoid duplicate data buffering in CPDF_SyntaxParser::ReadStream().  Artem Strygin
Allow sub-streams created from an IFX_SeekableReadStream to provide stream data without copying memory. The data will only reside in the top-level stream. For example: For file http://www.major-landrover.ru/upload/attachments/f/9/f96aab07dab04ae89c8a509ec1ef2b31.pdf (18 Mb) The memory usage is reduced by ~13 Mb. Change-Id: I2595c014d0fbe1fdd181cc04965cfd7d901c2d88 Reviewed-on: https://pdfium-review.googlesource.com/35930 Commit-Queue: Art Snake <art-snake@yandex-team.ru> Reviewed-by: dsinclair <dsinclair@chromium.org>
2018-07-03  Do data request for CPDF_Stream more smoothly.  Artem Strygin
DocumentLoader has to reconnect to skip non-requested blocks each time the requested offset jumps. To reduce reconnections, read the stream data first, then do all checks. That way DocumentLoader continues loading data without reconnecting. Change-Id: I344d045e59c5de9e1a4aed0002ea122caa92f240 Reviewed-on: https://pdfium-review.googlesource.com/13450 Commit-Queue: Art Snake <art-snake@yandex-team.ru> Reviewed-by: Lei Zhang <thestig@chromium.org>
2018-07-03  Use unowned ptr in cpdf_stream_acc  Tom Sepez
Change-Id: Ide081e462c83a5a209a2e6462dd12b298993f36f Reviewed-on: https://pdfium-review.googlesource.com/36850 Commit-Queue: Tom Sepez <tsepez@chromium.org> Reviewed-by: dsinclair <dsinclair@chromium.org>
2018-07-03  Forward declare CPDF_SyntaxParser when possible.  Lei Zhang
Change-Id: Ib9b30f7f4e1c41b0e2e2c1757252736599055c96 Reviewed-on: https://pdfium-review.googlesource.com/36870 Reviewed-by: Henrique Nakashima <hnakashima@chromium.org> Commit-Queue: Lei Zhang <thestig@chromium.org>
2018-07-03  Use GetPos() and SetPos() in CPDF_SyntaxParser::ReadStream().  Lei Zhang
Change-Id: I711508cdffd9756837657390d73b88c2d8c62db5 Reviewed-on: https://pdfium-review.googlesource.com/36891 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: Art Snake <art-snake@yandex-team.ru> Reviewed-by: dsinclair <dsinclair@chromium.org>
2018-07-03  Fix indentation in CPDF_SyntaxParser.  Lei Zhang
Change-Id: I293730ea5981ba8e2dca7a522b9036fa62d7cd35 Reviewed-on: https://pdfium-review.googlesource.com/36890 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: Art Snake <art-snake@yandex-team.ru> Reviewed-by: dsinclair <dsinclair@chromium.org>
2018-07-02  Virtualize Observable<T>::ObservedPtr::OnDestroy() for CPDF_Avail cleanup  Tom Sepez
This enables more complicated cleanup when an observed object is destroyed. Use it to make documents observable and to allow the CPDF_Avail to cleanup without the need for intermediate class. Change-Id: I3a8e758b7ff542e0a58710eff1ac8017205cbd45 Reviewed-on: https://pdfium-review.googlesource.com/36373 Reviewed-by: Lei Zhang <thestig@chromium.org> Commit-Queue: Tom Sepez <tsepez@chromium.org>
2018-06-29  Implement CPDF_HintsTable::PageInfo.  Artem Strygin
Merge page info data from Hints Table into CPDF_HintsTable::PageInfo class. Change-Id: I468996346ee153e3fa8ada6a83770614362d1b92 Reviewed-on: https://pdfium-review.googlesource.com/15813 Commit-Queue: Art Snake <art-snake@yandex-team.ru> Reviewed-by: Lei Zhang <thestig@chromium.org>
2018-06-28  Replace DCHECKs with ASSERTs.  Lei Zhang
Change-Id: I0f2bf1cb44b4cba872a719f0a75d8776f413812c Reviewed-on: https://pdfium-review.googlesource.com/36250 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: dsinclair <dsinclair@chromium.org>
2018-06-28  Use UnownedPtr for document within CPDF_XXXAvail.  Artem Strygin
Change-Id: I9ded1664564c330132f43047293e18696d77fc7d Reviewed-on: https://pdfium-review.googlesource.com/36310 Commit-Queue: Art Snake <art-snake@yandex-team.ru> Reviewed-by: Tom Sepez <tsepez@chromium.org>
2018-06-27  Remove useless code.  Artem Strygin
No longer needed after commit 20eca1e3 Change-Id: Ica4f67d2a2df760ebf9fd507283791271ad407cd Reviewed-on: https://pdfium-review.googlesource.com/36351 Commit-Queue: Art Snake <art-snake@yandex-team.ru> Reviewed-by: Lei Zhang <thestig@chromium.org>
2018-06-27  Remove CPDF_Parser::ParseIndirectObjectAtByStrict().  Lei Zhang
Commit 0145b89a removed the only caller. Change-Id: Ib3b7eaa0bc8be986f7d290f1efaea519d68daf6b Reviewed-on: https://pdfium-review.googlesource.com/36251 Reviewed-by: Art Snake <art-snake@yandex-team.ru> Commit-Queue: Lei Zhang <thestig@chromium.org>
2018-06-27  Add fxcrt::AutoRestorer<T>::AbandonRestoration().  [chromium/3475]  Tom Sepez
Kinda like reaching a commit point, makes going forward more useful. Change-Id: I7695b6e627d4cd8ed2bccb667d0cabd7f42c7b1c Reviewed-on: https://pdfium-review.googlesource.com/35970 Reviewed-by: Lei Zhang <thestig@chromium.org> Commit-Queue: Tom Sepez <tsepez@chromium.org>
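A scope guard with an "abandon" escape hatch, in the spirit of this commit, could be sketched as below. `ScopedRestorer` is a hypothetical class written for illustration and is not the actual `fxcrt::AutoRestorer<T>` implementation.

```cpp
// Hypothetical scope guard: saves a value on construction and restores it
// on destruction unless restoration has been abandoned.
template <typename T>
class ScopedRestorer {
 public:
  explicit ScopedRestorer(T* location)
      : location_(location), old_value_(*location) {}

  ~ScopedRestorer() {
    if (location_)
      *location_ = old_value_;
  }

  ScopedRestorer(const ScopedRestorer&) = delete;
  ScopedRestorer& operator=(const ScopedRestorer&) = delete;

  // Call once the new value should be kept, i.e. after the point where the
  // operation has succeeded and rolling back is no longer wanted.
  void AbandonRestoration() { location_ = nullptr; }

 private:
  T* location_;
  const T old_value_;
};
```

Every early return before `AbandonRestoration()` rolls the variable back, while the success path keeps the updated value, which matches the "commit point" analogy in the message.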
2018-06-27  Rework of loading of CPDF_Document.  Artem Strygin
Improve CPDF_Document interface. Fix relationship between CPDF_Document and CPDF_Parser. This CL changes CPDF_Document to internally create the CPDF_Parser and removes the need for the CPDF_Parser to know about the CPDF_Document. Change-Id: Iec7aef19575c90f30b9a6c919dfd4f4417e4caf2 Reviewed-on: https://pdfium-review.googlesource.com/35630 Commit-Queue: Art Snake <art-snake@yandex-team.ru> Reviewed-by: dsinclair <dsinclair@chromium.org>
2018-06-27  Implement CPDF_CrossRefTable  Artem Strygin
Change-Id: I5ac61ab323adb5eec2de8660064fff95ee877b5e Reviewed-on: https://pdfium-review.googlesource.com/35432 Reviewed-by: dsinclair <dsinclair@chromium.org> Commit-Queue: Art Snake <art-snake@yandex-team.ru>