summaryrefslogtreecommitdiff
path: root/testing/resources/bug_182.pdf
AgeCommit message (Collapse)Author
2018-08-22Properly handle language markers in decoded textRyan Harrison
In text like document title 0x001B is used as a marker for the beginning/end of a language metadata section. Currently PDFium does nothing with this data, but when returning the 'decoded' text it needs to be stripped out. The existing code assumed that the two bytes following a marker would be the data to be removed and did nothing to track if it was in/out of one of these regions. This led to a situation where it would always strip the two bytes following the region, since it assumed the end marker was the beginning of a new region. This CL corrects the detection and handling of these regions, and adds a regression test for the reported bug. BUG=pdfium:182 Change-Id: I92ddba5666274a8986fed03f502a0331f150f7ac Reviewed-on: https://pdfium-review.googlesource.com/41070 Reviewed-by: Henrique Nakashima <hnakashima@chromium.org> Commit-Queue: Ryan Harrison <rharrison@chromium.org>