Use CFX_WideString in CPDF_NameTree functions to strip BOM (chromium/3162)

PDFium doesn't strip BOMs during parsing, but we should strip BOMs when retrieving parsed strings in CPDF_NameTree to ensure consistency and appropriate function behavior. See the bug for more info.

As outlined in Bug=pdfium:593, the solution is to call GetUnicodeText() instead of GetString(). I added a GetUnicodeTextAt() function in CPDF_Array, which is symmetrical to GetUnicodeTextFor() in CPDF_Dictionary. I then changed the input variable types to CPDF_NameTree functions to be CFX_WideString instead of CFX_ByteString, and modified all the calls to them.

I also added a unit test for nametree, which would fail prior to this change. Nametrees with non-Unicode names are already tested by embedder tests.

Bug=pdfium:820
Change-Id: Id69d7343632f83d1f5180348c0eea290f478183f
Reviewed-on: https://pdfium-review.googlesource.com/8091
Reviewed-by: dsinclair <dsinclair@chromium.org>
Commit-Queue: Jane Liu <janeliulwq@google.com>
author: Jane Liu <janeliulwq@google.com> 2017-07-19 13:10:50 -0400
committer: Chromium commit bot <commit-bot@chromium.org> 2017-07-19 19:09:39 +0000
commit: 67ccef73bf664b7cdb4c6eed7acbaa4163c22a80 (patch)
tree: 718061bc21fd52eab1bc70a8b9be97585f1d79f8 /core/fpdfdoc/cpdf_nametree_unittest.cpp
parent: eed247e9cb3b0e9ce5dcb8bf6ee7673c9dd3e544 (diff)
download: pdfium-67ccef73bf664b7cdb4c6eed7acbaa4163c22a80.tar.xz
1 files changed, 35 insertions, 0 deletions
diff --git a/core/fpdfdoc/cpdf_nametree_unittest.cpp b/core/fpdfdoc/cpdf_nametree_unittest.cpp
new file mode 100644
index 0000000000..28af9e078d
--- /dev/null
+++ b/core/fpdfdoc/cpdf_nametree_unittest.cpp
@@ -0,0 +1,35 @@
+// Copyright 2017 PDFium Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style license that can be
+// found in the LICENSE file.
+
+#include "core/fpdfdoc/cpdf_nametree.h"
+#include "core/fpdfapi/parser/cpdf_array.h"
+#include "core/fpdfapi/parser/cpdf_dictionary.h"
+#include "core/fpdfapi/parser/cpdf_number.h"
+#include "core/fpdfapi/parser/cpdf_string.h"
+#include "testing/gtest/include/gtest/gtest.h"
+
+TEST(cpdf_nametree, GetUnicodeNameWithBOM) {
+  // Set up the root dictionary with a Names array.
+  auto pRootDict = pdfium::MakeUnique<CPDF_Dictionary>();
+  CPDF_Array* pNames = pRootDict->SetNewFor<CPDF_Array>("Names");
+
+  // Add the key "1" (with BOM) and value 100 into the array.
+  std::ostringstream buf;
+  buf << static_cast<unsigned char>(254) << static_cast<unsigned char>(255)
+      << static_cast<unsigned char>(0) << static_cast<unsigned char>(49);
+  pNames->AddNew<CPDF_String>(CFX_ByteString(buf), true);
+  pNames->AddNew<CPDF_Number>(100);
+
+  // Check that the key is as expected.
+  CPDF_NameTree nameTree(pRootDict.get());
+  CFX_WideString storedName;
+  nameTree.LookupValueAndName(0, &storedName);
+  EXPECT_STREQ(L"1", storedName.c_str());
+
+  // Check that the correct value object can be obtained by looking up "1".
+  CFX_WideString matchName = L"1";
+  CPDF_Object* pObj = nameTree.LookupValue(matchName);
+  ASSERT_TRUE(pObj->IsNumber());
+  EXPECT_EQ(100, pObj->AsNumber()->GetInteger());
+}
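
The key built in the test above is the byte sequence 254, 255, 0, 49: a UTF-16BE byte order mark (FE FF) followed by the code unit 0x0031, the character "1". The following is a minimal standalone sketch of the kind of decoding the commit message describes, where the BOM is dropped when the stored string is read back as Unicode text. It is not PDFium code: DecodeUtf16BeStrippingBom is a hypothetical helper introduced only for illustration, and it does not reproduce how GetUnicodeText() is actually implemented.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of PDFium. Decodes a byte string holding
// UTF-16BE text into UTF-16 code units, skipping a leading FE FF byte order
// mark if one is present. No surrogate validation is done, and a trailing
// odd byte is ignored.
std::u16string DecodeUtf16BeStrippingBom(const std::string& bytes) {
  std::size_t pos = 0;
  if (bytes.size() >= 2 &&
      static_cast<unsigned char>(bytes[0]) == 0xFE &&
      static_cast<unsigned char>(bytes[1]) == 0xFF) {
    pos = 2;  // Skip the BOM so it does not become part of the name.
  }

  std::u16string out;
  for (; pos + 1 < bytes.size(); pos += 2) {
    // Combine each big-endian byte pair into one 16-bit code unit.
    const unsigned hi = static_cast<unsigned char>(bytes[pos]);
    const unsigned lo = static_cast<unsigned char>(bytes[pos + 1]);
    out.push_back(static_cast<char16_t>((hi << 8) | lo));
  }
  return out;
}
```

Under this sketch, the four bytes from the test decode to the one-character string "1", which is why a lookup by the wide string L"1" can match a key that was stored with a BOM.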