pdfium.git

Age	Commit message	Author
2017-11-30	Rewrite lower level details of extracting text from page	Ryan Harrison
	The current implementation of text extraction was difficult to understand, duplicated logic that existed in other methods, and wasn't clear about the units the inputs were in. It also didn't handle control characters correctly. The new implementation leans on the methods for converting indices between the text buffer index and character list index spaces to avoid duplication of code. It also makes it clear to the reader that inputs are in the character list index space. Finally, it fixes issues being seen in Chrome with respect to ranges being slightly off. This CL also adds a test for extracting text that has control characters. BUG=pdfium:942,chromium:654578 Change-Id: Id9d1f360c2d7492c7b5a48d6c9ae29f530892742 Reviewed-on: https://pdfium-review.googlesource.com/20014 Commit-Queue: Ryan Harrison <rharrison@chromium.org> Reviewed-by: dsinclair <dsinclair@chromium.org> Reviewed-by: Henrique Nakashima <hnakashima@chromium.org>
2017-11-29	Fix some nits in FPDFText_GetText().	Lei Zhang
	Use more variables to avoid redundant calculations. Add one more edge test case. Change-Id: I6c8a0aca9de3bdd1a394c39304fd9a75009f9489 Reviewed-on: https://pdfium-review.googlesource.com/19690 Commit-Queue: Ryan Harrison <rharrison@chromium.org> Reviewed-by: Ryan Harrison <rharrison@chromium.org>
2017-11-27	Change FPDF_GetText to return "" when asked to get 0 characters	Ryan Harrison
	BUG=chromium:788103 Change-Id: I8ebdbc78eb14c358d7ac019b96de4828e6071b79 Reviewed-on: https://pdfium-review.googlesource.com/19350 Commit-Queue: Ryan Harrison <rharrison@chromium.org> Reviewed-by: dsinclair <dsinclair@chromium.org>
2017-11-20	Add regression tests for issues with correctly removing hyphens	Ryan Harrison
	There was a regression due to a refactor, where the public API was no longer removing soft hyphens for line broken words. This was causing issues with find and copy/paste operations that depend on selecting a region of text. This change is covered by FPDFTextEmbeddertest.GetTextWithHyphen. FPDFTextEmbeddertest.bug_782596 is a regression test for a bug that was introduced by the original fix. It only fails when running the test under ASAN. BUG=pdfium:935 Change-Id: I26096583c35f9246a3662e702f89b742f1146780 Reviewed-on: https://pdfium-review.googlesource.com/18610 Reviewed-by: Lei Zhang <thestig@chromium.org> Reviewed-by: dsinclair <dsinclair@chromium.org> Commit-Queue: Ryan Harrison <rharrison@chromium.org>
2017-10-24	Add more tests for FPDFText methods.	Lei Zhang
	BUG=pdfium:921 Change-Id: I6973359e6ac112c56843f66eb0b70462f42f9cae Reviewed-on: https://pdfium-review.googlesource.com/16630 Reviewed-by: Ryan Harrison <rharrison@chromium.org> Commit-Queue: Lei Zhang <thestig@chromium.org>
2017-09-05	Leave space for null characters when getting text	Ryan Harrison
	The conversion from WideString to ByteString adds in null characters at the end, so we need to account for these when selecting the range of text to initially extract. BUG=chromium:761770,chromium:761626 Change-Id: Ib8f863e997ebccaaf882e0beb29733f27a18826d Reviewed-on: https://pdfium-review.googlesource.com/13110 Commit-Queue: Ryan Harrison <rharrison@chromium.org> Reviewed-by: dsinclair <dsinclair@chromium.org>
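The null-terminator accounting described above can be illustrated with a minimal Python sketch. It is not PDFium's C++ code: the function name `copy_text_utf16le` and its parameters are hypothetical, the capacity is assumed to be measured in 16-bit units, and the text is assumed to contain only Basic Multilingual Plane characters (one 16-bit unit each).

```python
def copy_text_utf16le(page_text: str, start: int, capacity_units: int) -> bytes:
    """Return UTF-16LE bytes, terminator included, that fit in the capacity.

    Hypothetical helper: `capacity_units` is the caller's buffer size in
    16-bit units. One unit is reserved for the trailing null so the
    encoded result never exceeds the buffer (BMP-only text assumed).
    """
    if capacity_units < 1:
        return b""  # no room even for the terminator
    # Select one character fewer than the capacity to leave space for
    # the null that the wide-to-byte conversion appends.
    chunk = page_text[start:start + capacity_units - 1]
    return chunk.encode("utf-16-le") + b"\x00\x00"
```

With a capacity of 3 units, only two characters are selected, so the encoded result occupies exactly 6 bytes including the terminator.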
2017-08-31	Make FPDF_GetText stricter on inputs	Ryan Harrison
	The current implementation of this function is problematic. It will attempt to memcpy to NULL. It will accept obviously wrong inputs like a negative start index. It will also accept -1 for the count, which in theory is the amount of space the buffer has allocated to it, so doesn't make sense, but instead an internal call will calculate the number of characters to get if the count is -1. This will then lead to the function attempting to call Left(-1) on a string, which is invalid. The documentation for this function mentions none of this behaviour, so I am removing it, since it is inconsistent/bad. The implementation should now more strictly meet the defined API. BUG=pdfium:828 Change-Id: I18afdb33e12d77c10d856b4bacd615481979c484 Reviewed-on: https://pdfium-review.googlesource.com/12733 Commit-Queue: Ryan Harrison <rharrison@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org>
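The stricter input checks described above might look roughly like the following Python sketch. It is an illustration only, not the actual C++ implementation: `get_text_strict`, its parameters, and the choice to return 0 for every rejected input are assumptions.

```python
from typing import List, Optional


def get_text_strict(page_text: str, start_index: int, count: int,
                    result: Optional[List[str]]) -> int:
    """Copy up to `count` characters into `result`; return how many were copied.

    Hypothetical stand-in for a strict text getter. Rejected inputs leave
    `result` untouched and return 0 (an assumed convention).
    """
    # A missing destination buffer is rejected instead of being written to.
    if result is None:
        return 0
    # A negative start index is never valid, and a negative count
    # (including the old -1 sentinel) no longer means "everything".
    if start_index < 0 or count < 0:
        return 0
    # Starting past the end of the text yields nothing.
    if start_index >= len(page_text):
        return 0
    extracted = page_text[start_index:start_index + count]
    result[:] = list(extracted)
    return len(extracted)
```

Validating up front keeps every later slicing operation within bounds, so no internal call ever sees a negative length.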
2017-08-31	Remove fx_basic.h	Dan Sinclair
	This CL removes the fx_basic.h header and fixes up includes as needed. Change-Id: I49af32a8327bdbcda40c50a61ffbd75d06609040 Reviewed-on: https://pdfium-review.googlesource.com/12670 Commit-Queue: dsinclair <dsinclair@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org>
2017-08-12	Add a new public method to get the origin of a character.	Andrew Weintraub
	Bug: Change-Id: I376f4af26791cd4ed04049ab179c2b39dd262725 Reviewed-on: https://pdfium-review.googlesource.com/10690 Reviewed-by: Lei Zhang <thestig@chromium.org> Commit-Queue: Lei Zhang <thestig@chromium.org>
2017-05-20	Better identify web links by trimming irrelevant chars (chromium/3107)	Wei Li
	Sometimes web links are written alongside other text, such as punctuation, which makes the extracted web links invalid. We improve this by trimming invalid chars at the end of host-name-only URLs. For example, host names never end with ';' or ','. BUG=chromium:720578 Change-Id: Id619025b2153531376d268a69a3a89c3d49fce08 Reviewed-on: https://pdfium-review.googlesource.com/5692 Commit-Queue: Wei Li <weili@chromium.org> Reviewed-by: Lei Zhang <thestig@chromium.org>
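As a rough illustration of the trimming idea, here is a minimal Python sketch. It does not reproduce PDFium's link extractor: `trim_host_only_link` is a hypothetical name, a link is treated as host-name-only when nothing after the scheme contains a '/', and a host name is assumed to end in a letter or digit.

```python
def trim_host_only_link(link: str) -> str:
    """Strip trailing characters that cannot end a host name.

    Hypothetical helper: links that carry a path are returned unchanged,
    since trailing punctuation there may be part of the URL.
    """
    scheme_sep = link.find("://")
    host_start = scheme_sep + 3 if scheme_sep != -1 else 0
    if "/" in link[host_start:]:
        return link  # not a host-name-only URL
    end = len(link)
    # Walk backwards past characters such as ';' or ',' (assumed rule:
    # a host name ends with a letter or digit).
    while end > host_start and not link[end - 1].isalnum():
        end -= 1
    return link[:end]
```

For example, `trim_host_only_link("http://www.abc.com;")` drops the trailing semicolon, while `"http://www.abc.com/a;"` is left alone because it has a path.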
2017-03-17	Handle web links across lines (chromium/3045)	Wei Li
	When a web link has a hyphen at the end of line, we consider it to be continued to the next line. For example, "http://www.abc.com/my-\r\ntest" should be extracted as "http://www.abc.com/my-test". BUG=pdfium:650 Change-Id: I64a93d9c66faf2be0abdaf8cfe8ee496c435d0ca Reviewed-on: https://pdfium-review.googlesource.com/3092 Commit-Queue: Wei Li <weili@chromium.org> Reviewed-by: Lei Zhang <thestig@chromium.org>
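The line-continuation rule in the example above can be sketched in Python as follows. This is a simplified illustration, not PDFium's implementation: `extract_links` is a hypothetical function, it recognises only `http://` and `https://` links, and it joins every hyphen-plus-line-break in the text rather than only those inside a link.

```python
import re

# Simplified link pattern (assumption): a scheme followed by non-space chars.
_LINK_RE = re.compile(r"https?://\S+")


def extract_links(text: str) -> list:
    """Return links found in `text`, treating a hyphen at the end of a
    line as a continuation onto the next line."""
    # Drop the line break after a trailing hyphen, keeping the hyphen.
    joined = re.sub(r"-\r?\n(?=\S)", "-", text)
    return _LINK_RE.findall(joined)
```

Applied to `"http://www.abc.com/my-\r\ntest"`, the sketch yields `["http://www.abc.com/my-test"]`; a link that ends a line without a hyphen is not joined with the following line.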
2016-11-21	Fixup lint flags.	Dan Sinclair
	The -build/include setting was masking out build/include_what_you_use. This CL restores them, fixes any build errors, and adds NOLINT as needed. As well, the runtime/explicit and runtime/printf flags are also enabled and NOLINT'd. lint cleanups Change-Id: Ib013b3eb29c8d0e48cad74c5df9028684130719f Reviewed-on: https://pdfium-review.googlesource.com/2030 Reviewed-by: Tom Sepez <tsepez@chromium.org>
2016-09-29	Move core/fxcrt/include to core/fxcrt	dsinclair
	BUG=pdfium:611 Review-Url: https://codereview.chromium.org/2382723003
2016-09-15	Use ToUnicode mapping even when unicode is 0.	npm
	CPDF_Font::UnicodeFromCharcode returns 0 only if the ToUnicode map maps the charcode to 0. CPDF_SimpleFont::UnicodeFromCharcode and CPDF_CID_Font::UnicodeFromCharCode return 0 only if the call to CPDF_Font returns 0. In other cases, these methods return an empty string. So when processing text, a 0 return from the method should not be replaced with the charcode. BUG=pdfium:583 Review-Url: https://codereview.chromium.org/2342073002
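The distinction between "mapped to 0" and "no mapping" can be shown with a small Python sketch. It is an illustration under assumptions, not the PDFium code: the ToUnicode map is modelled as a plain dict, `resolve_unicode` is a hypothetical name, and falling back to the charcode when no mapping exists is assumed from the description above.

```python
from typing import Dict


def resolve_unicode(to_unicode: Dict[int, int], charcode: int) -> int:
    """Return the Unicode value to use for `charcode`.

    A mapping to 0 is a real result and is kept; only a missing mapping
    (the "empty string" case in the description) falls back to the charcode.
    """
    mapped = to_unicode.get(charcode)
    if mapped is None:
        return charcode  # no ToUnicode entry: assumed fallback
    return mapped  # may legitimately be 0


# A truthiness test such as `mapped or charcode` would reproduce the old
# behaviour, because 0 is falsy and would be replaced by the charcode.
```

Checking explicitly for a missing entry, rather than for a zero value, is what keeps an intentional mapping to 0 intact.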
2016-06-07	Get rid of NULLs in core/	thestig
	Review-Url: https://codereview.chromium.org/2032613003
2016-03-29	Code change to avoid signed/unsigned mismatch warnings.	Wei Li
	This makes pdfium code on Linux and Mac sign-compare warning free. The warning flag will be re-enabled after checking on windows clang build. BUG=pdfium:29 R=tsepez@chromium.org Review URL: https://codereview.chromium.org/1841643002 .
2016-03-23	Move core/include/fxcrt to core/fxcrt/include.	Dan Sinclair
	This CL moves the fxcrt code into the core/fxcrt directory. The only exception was fx_bidi.h which was moved into core/fxcrt as it is not used outside of core/. R=tsepez@chromium.org Review URL: https://codereview.chromium.org/1825953002 .
2016-03-14	Move fpdfsdk/src up to fpdfsdk/.	Dan Sinclair
	This CL moves the files in fpdfsdk/src/ up one level to fpdfsdk/ and fixes up the include paths, include guards and build files. R=tsepez@chromium.org Review URL: https://codereview.chromium.org/1799773002 .