summaryrefslogtreecommitdiff
path: root/core/fpdftext
AgeCommit message (Collapse)Author
2017-05-20Better identify web links by trimming irrelevant charschromium/3107Wei Li
Sometimes, web links are written with other text such as punctuations which makes the extracted web links invalid. We improve this by trimming invalid chars at the end of host name only URLs. For example, host names never ends with ';' or ','. BUG=chromium:720578 Change-Id: Id619025b2153531376d268a69a3a89c3d49fce08 Reviewed-on: https://pdfium-review.googlesource.com/5692 Commit-Queue: Wei Li <weili@chromium.org> Reviewed-by: Lei Zhang <thestig@chromium.org>
2017-05-10Replace operator bool with HasRef() in classes with a CFX_SharedCopyOnWrite ↵Lei Zhang
member. Change-Id: I51e30d298e87b9ae0d5aca83b2f1d6787efce70a Reviewed-on: https://pdfium-review.googlesource.com/5290 Commit-Queue: Lei Zhang <thestig@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org> Reviewed-by: Nicolás Peña <npm@chromium.org>
2017-04-25Use fx_extension.h utilities in more places.Lei Zhang
Change-Id: Iba1aa793567e69acc3cc1acbd5b9a9f531c80b7a Reviewed-on: https://pdfium-review.googlesource.com/4453 Reviewed-by: Tom Sepez <tsepez@chromium.org> Commit-Queue: Lei Zhang <thestig@chromium.org>
2017-04-21Replace FXSYS_iswdigit with std::iswdigit.Lei Zhang
Replace other one-off implementations as well. Change-Id: I2878f3fae479c12b7de5234ee3a26477d602d14d Reviewed-on: https://pdfium-review.googlesource.com/4398 Commit-Queue: Lei Zhang <thestig@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org>
2017-04-20Cleanup the fx_extension code.Dan Sinclair
This CL cleans up the fx_extension file. The stream code was moved to fx_stream. IFX_FileAccess was removed and CFX_CRTFileAccess split to its own file. Code shuffled from header to cpp file. Change-Id: I700fdfcc9797cf4e8050cd9ba010ad8854feefbf Reviewed-on: https://pdfium-review.googlesource.com/4371 Reviewed-by: Nicolás Peña <npm@chromium.org> Commit-Queue: dsinclair <dsinclair@chromium.org>
2017-04-07Tweak CFDF_Font::AppendChar()Tom Sepez
Pass in/out argument as a pointer. Avoid pointless malloc just to copy in multibyte case. Then we can avoid special-casing the single-byte case. Change-Id: I3dd2d57e08ef6ad7b78ea38398b228fa41a9b3e6 Reviewed-on: https://pdfium-review.googlesource.com/3950 Reviewed-by: Nicolás Peña <npm@chromium.org> Commit-Queue: Tom Sepez <tsepez@chromium.org>
2017-04-03Drop FXSYS_ from mem methodsDan Sinclair
This Cl drops the FXSYS_ from mem methods which are the same on all platforms. Bug: pdfium:694 Change-Id: I9d5ae905997dbaaec5aa0b2ae4c07358ed9c6236 Reviewed-on: https://pdfium-review.googlesource.com/3613 Reviewed-by: Tom Sepez <tsepez@chromium.org> Commit-Queue: dsinclair <dsinclair@chromium.org>
2017-04-03Drop FXSYS_ from math methodsDan Sinclair
This Cl drops the FXSYS_ from math methods which are the same on all platforms. Bug: pdfium:694 Change-Id: I85c9ff841fd9095b1434f67319847ba0cd9df7ac Reviewed-on: https://pdfium-review.googlesource.com/3598 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org>
2017-03-17Handle web links across lineschromium/3045Wei Li
When a web link has a hyphen at the end of line, we consider it to be continued to the next line. For example, "http://www.abc.com/my-\r\ntest" should be extracted as "http://www.abc.com/my-test". BUG=pdfium:650 Change-Id: I64a93d9c66faf2be0abdaf8cfe8ee496c435d0ca Reviewed-on: https://pdfium-review.googlesource.com/3092 Commit-Queue: Wei Li <weili@chromium.org> Reviewed-by: Lei Zhang <thestig@chromium.org>
2017-03-15Add IndexInBounds() convenience routine.Tom Sepez
Avoid writing |Type| in CollectionSize<Type>() so that index type can change without rewriting conditions. Change-Id: I40c94ca39148b379908760ba9b861114b88af7bb Reviewed-on: https://pdfium-review.googlesource.com/3056 Reviewed-by: Lei Zhang <thestig@chromium.org> Commit-Queue: Tom Sepez <tsepez@chromium.org>
2017-03-14Replace FX_FLOAT with underlying float type.Dan Sinclair
Change-Id: I158b7d80b0ec28b742a9f2d5a96f3dde7fb3ab56 Reviewed-on: https://pdfium-review.googlesource.com/3031 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org> Reviewed-by: Nicolás Peña <npm@chromium.org>
2017-03-14Replace FX_CHAR and FX_WCHAR with underlying types.Dan Sinclair
Change-Id: I96e0a20d66b9184d22f64d8e4ce0dadd5a78c1e8 Reviewed-on: https://pdfium-review.googlesource.com/2967 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org>
2017-02-23Convert TransformPoint calls to Transform callsDan Sinclair
This Cl converts remaining calls to TransformPoint to use Transform instead. Change-Id: I7a2c000492da5dda3975b4449812f281816fdab6 Reviewed-on: https://pdfium-review.googlesource.com/2822 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org>
2017-02-21Remove non CFX_PointF GetIndexAtPosDan Sinclair
This Cl removes the overload that takes a x,y position and uses the CFX_PointF version in all cases. Change-Id: Iebd048d91a1e1ed1c90ec0019e19750afe06e745 Reviewed-on: https://pdfium-review.googlesource.com/2812 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org> Reviewed-by: Nicolás Peña <npm@chromium.org>
2017-02-16Convert CPDF_TextPage classes to CFX_PointFchromium/3015Dan Sinclair
This CL converts the FX_FLOAT Origin values to CFX_PointF. Change-Id: Ic47d2bae1dd185fbc4ae88ac15fdd0fdf293c7f9 Reviewed-on: https://pdfium-review.googlesource.com/2716 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: Nicolás Peña <npm@chromium.org>
2017-02-16Convert CPDF_TextObject to CFX_PointFDan Sinclair
This CL converts the OriginX and OriginY from CPDF_TextObjectItem to use a CFX_PointF instead of two floats. Change-Id: Id39de18d3d908a86f925bec68d0680392f70906e Reviewed-on: https://pdfium-review.googlesource.com/2715 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org> Reviewed-by: Nicolás Peña <npm@chromium.org>
2017-02-14Reland "Convert CFX_FloatPoint to CFX_PointF"Dan Sinclair
This CL updates the CFX_FloatPoint Cl to accommodate for the Origin CL being reverted. Change-Id: I345fe1117938a49ad9ee5f310fe7b5e21d9f1948 Reviewed-on: https://pdfium-review.googlesource.com/2697 Reviewed-by: Nicolás Peña <npm@chromium.org> Commit-Queue: dsinclair <dsinclair@chromium.org>
2017-02-14Revert "Convert Origins to points"Dan Sinclair
This reverts commit da83d3a5cc09c4056310b3cf299dbbccd5c70d11. Reason for revert: Reverting chain to see if fixes Chrome roll. Original change's description: > Convert Origins to points > > This CL converts various OriginX, OriginY pairs into CFX_PointF objects. > > Change-Id: I9141f7fc713c710b2014d4fdcdec7dc93501f844 > Reviewed-on: https://pdfium-review.googlesource.com/2575 > Commit-Queue: dsinclair <dsinclair@chromium.org> > Reviewed-by: Nicolás Peña <npm@chromium.org> > TBR=tsepez@chromium.org,dsinclair@chromium.org,npm@chromium.org,pdfium-reviews@googlegroups.com NOPRESUBMIT=true NOTREECHECKS=true NOTRY=true BUG= Change-Id: I949fb4ec712e2587e7d0ef0191c34db198b61dcc Reviewed-on: https://pdfium-review.googlesource.com/2696 Reviewed-by: dsinclair <dsinclair@chromium.org> Commit-Queue: dsinclair <dsinclair@chromium.org>
2017-02-14Revert "Convert CFX_FloatPoint to CFX_PointF"dsinclair
This reverts commit 4797c4240cb9e2d8cd36c583d46cd52ff94af95d. Reason for revert: Reverting chain to see if fixes Chrome roll. Original change's description: > Convert CFX_FloatPoint to CFX_PointF > > The two classes store the same information, remove the CFX_FloatPoint variant. > > Change-Id: Ie598c2ba5af04fb2bb3347dd48c30fd5e4845e62 > Reviewed-on: https://pdfium-review.googlesource.com/2612 > Commit-Queue: dsinclair <dsinclair@chromium.org> > Reviewed-by: Tom Sepez <tsepez@chromium.org> > TBR=tsepez@chromium.org,dsinclair@chromium.org,pdfium-reviews@googlegroups.com NOPRESUBMIT=true NOTREECHECKS=true NOTRY=true Change-Id: Ia42074e706983c62d2e57497c3079b3c338343a3 Reviewed-on: https://pdfium-review.googlesource.com/2694 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: dsinclair <dsinclair@chromium.org>
2017-02-13Convert CFX_FloatPoint to CFX_PointFDan Sinclair
The two classes store the same information, remove the CFX_FloatPoint variant. Change-Id: Ie598c2ba5af04fb2bb3347dd48c30fd5e4845e62 Reviewed-on: https://pdfium-review.googlesource.com/2612 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org>
2017-02-13Convert Origins to pointsDan Sinclair
This CL converts various OriginX, OriginY pairs into CFX_PointF objects. Change-Id: I9141f7fc713c710b2014d4fdcdec7dc93501f844 Reviewed-on: https://pdfium-review.googlesource.com/2575 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: Nicolás Peña <npm@chromium.org>
2017-02-09Remove Transform in favour of TransformPointDan Sinclair
This CL removes the two Transform() overrides from CFX_Matrix and calls the TransformPoint methods directly. In the case of the 4 param version the values were assigned to the out values before calling. Change-Id: Id633826caec75b848774dcda6cfdcef2dbf5a7db Reviewed-on: https://pdfium-review.googlesource.com/2573 Reviewed-by: Nicolás Peña <npm@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org> Commit-Queue: dsinclair <dsinclair@chromium.org>
2017-02-09Convert Get methods to return instead of using out params.Dan Sinclair
This Cl changes several Get methods to return their values instead of using out parameters. Change-Id: Ie9a930a5c2d0e809f2d7181ca033d801945c1cf9 Reviewed-on: https://pdfium-review.googlesource.com/2556 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: Nicolás Peña <npm@chromium.org>
2017-02-08Update to use CFX_Rect{F} and CFX_Matrix constructors.Dan Sinclair
This Cl updates the code to use the constructors instead of creating an empty object and calling Set(). It also removes the various memsets of the CFX_Rect{F} classes. Change-Id: I6e20cec00866a38372858dcba5a30d31103172e4 Reviewed-on: https://pdfium-review.googlesource.com/2550 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: Nicolás Peña <npm@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org>
2017-01-09Remove CFX_ArrayTemplate from fpdftext and fxcodec.tsepez
Remove unused m_Segments. Review-Url: https://codereview.chromium.org/2618863004
2016-11-02Remove FX_BOOL from coretsepez
Review-Url: https://codereview.chromium.org/2477443002
2016-10-26Fix some bool/int mismatches.tsepez
Found by winxfa bot when fx_bool defined to bool. Review-Url: https://codereview.chromium.org/2449293002
2016-10-04Move core/fpdfapi/fpdf_parser to core/fpdfapi/parserdsinclair
BUG=pdfium:603 Review-Url: https://codereview.chromium.org/2392603004
2016-10-04Move core/fpdfapi/fpdf_page to core/fpdfapi/pagedsinclair
BUG=pdfium:603 Review-Url: https://codereview.chromium.org/2386423004
2016-10-04Move core/fpdfapi/fpdf_font to core/fpdfapi/fontdsinclair
BUG=pdfium:603 Review-Url: https://codereview.chromium.org/2392773003
2016-09-29Move core/fxcrt/include to core/fxcrtdsinclair
BUG=pdfium:611 Review-Url: https://codereview.chromium.org/2382723003
2016-09-29Move core/fpdftext/include to core/fpdftextdsinclair
BUG=pdfium:611 Review-Url: https://codereview.chromium.org/2383563002
2016-09-29Move core/fpdfapi/fpdf_parser/include to core/fpdfapi/fpdf_parserdsinclair
BUG=pdfium:611 Review-Url: https://codereview.chromium.org/2383543002
2016-09-29Move core/fpdfapi/fpdf_page/include to core/fpdfapi/fpdf_pagedsinclair
BUG=pdfium:611 Review-Url: https://codereview.chromium.org/2379033002
2016-09-29Move core/fpdfapi/fpdf_font/include to core/fpdfapi/fpdf_fontdsinclair
BUG=pdfium:611 Review-Url: https://codereview.chromium.org/2375283003
2016-09-29Check for negative page size in FindTextlineFlowOrientation()thestig
BUG=pdfium:606 Review-Url: https://codereview.chromium.org/2378373002
2016-09-21Make ownership explicit in CPDF_ContentMarkItem.tsepez
The old SetParam() method had "maybe take ownership" semanitcs based upon the type argument. Make GetParam() handle the None case and simplify callers based upon that behaviour. Review-Url: https://codereview.chromium.org/2358043003
2016-09-15Use ToUnicode mapping even when unicode is 0.npm
CPDF_Font::UnicodeFromCharcode returns 0 only if ToUnicode map maps the charcode to 0. CPDF_SimpleFont::UnicodeFromCharcode and CPDF_CID_Font:: UnicodeFromCharCode return 0 only if the call to CPDF_Font returns 0. In other cases, these methods return an empty string. So when processing text, a 0 return from the method should not be replaced with the charcode. BUG=pdfium:583 Review-Url: https://codereview.chromium.org/2342073002
2016-09-15Rename dictionary set and get methodsdsinclair
This Cl makes the Get and Set methods consistenly use {G|S}et<Type>For. BUG=pdfium:596 Review-Url: https://codereview.chromium.org/2334323005
2016-09-02Remove CFX_Matrix::Copy() in favor of assignmenttsepez
The default assignment operator will suffice and allows us to write matrix1 = matrix2; Review-Url: https://codereview.chromium.org/2307953003
2016-09-01Make CPDF_ContentMark have a CPDF_ContentMarkData.tsepez
This one doesn't require an explict Emplace(), as the object seems to get constructed only as a side-effect of making a private copy. Review-Url: https://codereview.chromium.org/2298953002
2016-08-30Make CPDF_TextState have a CPDF_TextStateData rather than inheriting one.tsepez
Review-Url: https://codereview.chromium.org/2287313004
2016-08-29Revert "Add -> operators to CFX_CountRef."tsepez
This reverts commit c10c23a2b1999b1cb0354fd4db9837dc63a3d833. TBR=dsinclair@chromium.org Review-Url: https://codereview.chromium.org/2285283003
2016-08-29Revert "Replace wrapper methods in CPDF_Path with -> operator."tsepez
This reverts commit d09a09751f724ecdb1a0bc307447a3d0c212ebff. TBR=dsinclair@chromium.org Review-Url: https://codereview.chromium.org/2291833002
2016-08-29Replace wrapper methods in CPDF_Path with -> operator.tsepez
These just invoked exaclty the same methodes in the underlying xxxData class, which we can now do with just a ->() Move some methods to the xxxData class, where they belong. In doing so, put MakePrivateCopy() calls at each callsite for those methods that made a copy. Review-Url: https://codereview.chromium.org/2286983002
2016-08-26Add -> operators to CFX_CountRef.chromium/2842tsepez
Allows CFX_CountRefs to behave more like pointers. Rename SetNull() to Clear() for consistency with other ptrs. Change GetPrivateCopy() into MakePrivateCopy() with no return, since the -> operators are clearer than getting an object pointer. Review-Url: https://codereview.chromium.org/2283113002
2016-08-26Move the classes in fpdf_text_int.cpp into their own filesnpm
- CPDF_TextPageFind and CPDF_LinkExtract moved to new file - fpdf_text_int.cpp renamed to be CPDF_TextPage definition Review-Url: https://codereview.chromium.org/2286723003
2016-08-25Remove unused methods in CPDF_TextPage and nitsnpm
fpdf_text_int.cpp should be split up into classes in a later CL Review-Url: https://codereview.chromium.org/2271973004
2016-06-16Simplify CPDF_TextPage::FindTextlineFlowOrientation().chromium/2773chromium/2772chromium/2771thestig
Review-Url: https://codereview.chromium.org/2066043002
2016-06-14Make code compile with clang_use_chrome_plugin (part II)weili
This change contains files in core directory which were not covered in part I. This is part of the efforts to make PDFium code compilable by Clang chromium style plugins. The changes are mainly the following: -- move inline constructor/destructor of complex class/struct out-of-line; -- add constructor/destructor of complex class/struct if not explicitly defined; -- add explicit out-of-line copy constructor when needed; -- move inline virtual functions out-of-line; -- Properly mark virtual functions with 'override'; -- some minor cleanups; BUG=pdfium:469 Review-Url: https://codereview.chromium.org/2060913003