Chromium Code Reviews

Unified Diff: components/autofill/core/browser/autofill_data_util_unittest.cc

Issue 2132103002: Split CJK full names into name parts correctly. (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master
Patch Set: Improve precision for splitting Korean names. Created 4 years, 5 months ago
Index: components/autofill/core/browser/autofill_data_util_unittest.cc
diff --git a/components/autofill/core/browser/autofill_data_util_unittest.cc b/components/autofill/core/browser/autofill_data_util_unittest.cc
index 707ac4925f1c69b15012ab4c4aa6e54398fb8212..f472a00c883ded8b049f2da2a22bf7ea764ba04e 100644
--- a/components/autofill/core/browser/autofill_data_util_unittest.cc
+++ b/components/autofill/core/browser/autofill_data_util_unittest.cc
@@ -32,7 +32,33 @@ TEST(AutofillDataUtilTest, SplitName) {
// Exception to the name suffix removal.
{"John Ma", "John", "", "Ma"},
// Common family name prefixes not considered a middle name.
- {"Milhouse Van Houten", "Milhouse", "", "Van Houten"}};
+ {"Milhouse Van Houten", "Milhouse", "", "Van Houten"},
+
+ // CJK names have reverse order (surname goes first, given name goes
+ // second).
+ {"홍 길동", "길동", "", "홍"}, // Korean name, Hangul
+ {"孫 德明", "德明", "", "孫"}, // Chinese name, Unihan
+ {"山田 貴洋", "貴洋", "", "山田"}, // Japanese name, Unihan
+
+ // CJK names don't usually have a space in the middle, but most of the
+ // time, the surname is only one character (in Chinese & Korean).
+ {"최성훈", "성훈", "", "최"}, // Korean name, Hangul
+ {"강전희", "전희", "", "강"}, // Korean name, Hangul
+ {"刘翔", "翔", "", "刘"}, // (Traditional) Chinese name, Unihan
+ {"劉翔", "翔", "", "劉"}, // (Simplified) Chinese name, Unihan
gogerald1 2016/07/14 18:28:52 nit: I believe this ("劉") is traditional and above
nicolaso 2016/07/14 20:32:17 Fixed.
+
+ // There are a few exceptions. Occasionally, the surname has two
+ // characters.
+ {"남궁도", "도", "", "남궁"}, // Korean name, Hangul
+ {"황보혜정", "혜정", "", "황보"}, // Korean name, Hangul
+ {"歐陽靖", "靖", "", "歐陽"}, // (Traditional) Chinese name, Unihan
+
+ // Korean full names always have at least 3 characters, so a 2-character
Jinsuk Kim 2016/07/13 21:51:09 Please update the comment here.
nicolaso 2016/07/14 18:12:43 Done.
+ // name can only be a given name. Chinese full names, however, might be
+ // only 2 characters.
+ {"제동", "제동", "", ""}, // Korean name, Hangul
+ {"孫文", "文", "", "孫"} // Chinese name, Unihan
+ };
for (TestCase test_case : test_cases) {
NameParts name_parts = SplitName(base::UTF8ToUTF16(test_case.full_name));
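
The comments in the new test cases describe a heuristic: the surname comes first, it is usually one character, a few surnames have two characters, and a two-syllable Hangul string is treated as a given name only. The following is a minimal sketch of that heuristic, not the code under review. `CjkNameParts`, `SplitCjkNameSketch`, and the three-entry surname list are hypothetical names and data chosen for illustration, and the sketch works on UTF-32 strings rather than the `base::string16` used in the test. The handling of single-character input is an assumption; the test cases do not cover it.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch (not Chromium's NameParts).
struct CjkNameParts {
  std::u32string given;
  std::u32string family;
};

// True for code points in the Unicode Hangul Syllables block.
bool IsHangulSyllable(char32_t c) {
  return c >= 0xAC00 && c <= 0xD7A3;
}

bool IsAllHangul(const std::u32string& s) {
  return !s.empty() && std::all_of(s.begin(), s.end(), IsHangulSyllable);
}

// Illustrative list only: the two-character surnames that appear in the test
// cases. A real list would be much longer.
bool StartsWithTwoCharSurname(const std::u32string& name) {
  static const std::array<std::u32string, 3> kSurnames = {
      U"남궁", U"황보", U"歐陽"};
  for (const std::u32string& surname : kSurnames) {
    if (name.compare(0, 2, surname) == 0)
      return true;
  }
  return false;
}

CjkNameParts SplitCjkNameSketch(const std::u32string& name) {
  // With an explicit space, the part before it is the surname.
  const std::size_t space = name.find(U' ');
  if (space != std::u32string::npos)
    return {name.substr(space + 1), name.substr(0, space)};

  // Assumption: a single character (or empty input) is kept as a given name.
  if (name.size() < 2)
    return {name, U""};

  // A two-syllable Hangul string is treated as a given name only.
  if (IsAllHangul(name) && name.size() < 3)
    return {name, U""};

  // Otherwise the surname is one character, unless the name is longer than
  // two characters and starts with a known two-character surname.
  const std::size_t surname_len =
      (name.size() > 2 && StartsWithTwoCharSurname(name)) ? 2 : 1;
  return {name.substr(surname_len), name.substr(0, surname_len)};
}
```

Under these assumptions, `SplitCjkNameSketch(U"남궁도")` yields the family name `남궁` and the given name `도`, while `SplitCjkNameSketch(U"제동")` yields an empty family name. An unspaced Japanese name such as 山田貴洋 would be split after the first character by this sketch; the test cases above only cover Japanese names written with a space.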
