Chromium Code Reviews

Side by Side Diff: source/test/testdata/rbbitst.txt

Issue 1621843002: ICU 56 update step 1 (Closed) Base URL: https://chromium.googlesource.com/chromium/deps/icu.git@561
Patch Set: Created 4 years, 11 months ago
OLD | NEW
1 # Copyright (c) 2001-2014 International Business Machines 1 # Copyright (c) 2001-2015 International Business Machines
2 # Corporation and others. All Rights Reserved. 2 # Corporation and others. All Rights Reserved.
3 # 3 #
4 # RBBI Test Data 4 # RBBI Test Data
5 # 5 #
6 # File: rbbitst.txt 6 # File: rbbitst.txt
7 # 7 #
8 # The format of this file looks vaguely like some kind of xml-ish markup, 8 # The format of this file looks vaguely like some kind of xml-ish markup,
9 # but it is NOT. The syntax is this.. 9 # but it is NOT. The syntax is this..
10 # 10 #
11 # <word> any following data is for word break testing 11 # <word> any following data is for word break testing
(...skipping 13 matching lines...)
25 # 25 #
26 # There are two copies of this file in the source repository, 26 # There are two copies of this file in the source repository,
27 # [ICU4C] source/test/testdata/rbbitst.txt 27 # [ICU4C] source/test/testdata/rbbitst.txt
28 # [ICU4J] main/tests/core/src/com/ibm/icu/dev/test/rbbi/rbbitst.txt 28 # [ICU4J] main/tests/core/src/com/ibm/icu/dev/test/rbbi/rbbitst.txt
29 # 29 #
30 # ICU4C's copy is the master. If any changes are made to ICU4J's copy, make sure they 30 # ICU4C's copy is the master. If any changes are made to ICU4J's copy, make sure they
31 # are merged back into ICU4C's copy of the file, lest they get overwritten later. 31 # are merged back into ICU4C's copy of the file, lest they get overwritten later.
32 # TODO: figure out how to have a single copy of the file for use by both C and Java. 32 # TODO: figure out how to have a single copy of the file for use by both C and Java.
33 33
34 34
35 ## FILTERED BREAK TESTS
36
37 # (William Bradford, public domain. http://catalog.hathitrust.org/Record/008651224 ) - edited.
38 <locale en>
39 <sent>
40 <data>\
41 •In the meantime Mr. •Weston arrived with his small ship, which he had now recovered. •Capt. •Gorges, who informed the Sgt. here that one purpose of his going east was to meet with Mr. •Weston, took this opportunity to call him to account for some abuses he had to lay to his charge.•</data>
42
43 <locale en@ss=standard>
44 <sent>
45 <data>\
46 •In the meantime Mr. Weston arrived with his small ship, which he had now recovered. •Capt. Gorges, who informed the Sgt. here that one purpose of his going east was to meet with Mr. Weston, took this opportunity to call him to account for some abuses he had to lay to his charge.•</data>
47
48 ## END FILTERED BREAK TESTS
49
50 <locale>
51
35 # Temp debugging tests 52 # Temp debugging tests
36 <sent> 53 <sent>
37 <data>•\u00c0.•</data> 54 <data>•\u00c0.•</data>
38 55
39 #<data>•\u5487\u67ff\ue591\u5017\u61b3\u60a1\u9510\u8165:"JAVA\u821c\u8165\u7fc8\u51ce\u306d,\u2494\u56d8\u4ec0\u60b1\u8560\u51ba\u611d\u57b6\u2510\u5d46".\u2029•</data> 56 #<data>•\u5487\u67ff\ue591\u5017\u61b3\u60a1\u9510\u8165:"JAVA\u821c\u8165\u7fc8\u51ce\u306d,\u2494\u56d8\u4ec0\u60b1\u8560\u51ba\u611d\u57b6\u2510\u5d46".\u2029•</data>
40 ######################################################################################## 57 ########################################################################################
41 # 58 #
42 # 59 #
43 # G r a p h e m e C l u s t e r T e s t s 60 # G r a p h e m e C l u s t e r T e s t s
44 # 61 #
(...skipping 111 matching lines...)
156 #Hindi Numbers 173 #Hindi Numbers
157 <data>• •\u0968\u0966.\u0969\u096f<100> •\u0967\u0966\u0966.\u0966\u0966<100> •\N{RUPEE SIGN}•\u0967,\u0967\u0966\u0966.\u0966\u0966<100> • •\u0905\u092e\u091c<200>\n•</data> 174 <data>• •\u0968\u0966.\u0969\u096f<100> •\u0967\u0966\u0966.\u0966\u0966<100> •\N{RUPEE SIGN}•\u0967,\u0967\u0966\u0966.\u0966\u0966<100> • •\u0905\u092e\u091c<200>\n•</data>
158 175
159 <data>•\u0938\u094d\u200d\u0935\u0924\u0902deadTA\u0930<200>\r•It's<200> •$•30.10<100> •12,34<100>¢•£•¤•¥•alpha\u05f3beta\u05f4gamma<200> •</data> 176 <data>•\u0938\u094d\u200d\u0935\u0924\u0902deadTA\u0930<200>\r•It's<200> •$•30.10<100> •12,34<100>¢•£•¤•¥•alpha\u05f3beta\u05f4gamma<200> •</data>
160 177
161 <data>•Badges<200>?• •BADGES<200>!•?•!• •We<200> •don't<200> •need<200> •no<200> •STINKING<200> •BADGES<200>!•!•1000,233,456.000<100> •1,23.322<100>%•123.1222<100>$•123,000.20<100> •179.01<100>%•X<200> •Now<200>\r•is<200>\n•the<200>\r\n•time<200> •</data> 178 <data>•Badges<200>?• •BADGES<200>!•?•!• •We<200> •don't<200> •need<200> •no<200> •STINKING<200> •BADGES<200>!•!•1000,233,456.000<100> •1,23.322<100>%•123.1222<100>$•123,000.20<100> •179.01<100>%•X<200> •Now<200>\r•is<200>\n•the<200>\r\n•time<200> •</data>
162 179
163 #Hangul 180 #Hangul
164 <data>•\uc5f0\ud569<200> •\uc7a5\ub85c\uad50\ud68c<200> •\u1109\u1161\u11bc\u1112\u1161\u11bc<200> •\u1112\u1161\u11ab\u110b\u1175\u11ab<200> •Hello<200>,• •how<200> •are<200> •you<200> •</data> 181 <data>•\uc5f0\ud569<200> •\uc7a5\ub85c\uad50\ud68c<200> •\u1109\u1161\u11bc\u1112\u1161\u11bc<200> •\u1112\u1161\u11ab\u110b\u1175\u11ab<200> •Hello<200>,• •how<200> •are<200> •you<200> •</data>
165 182
183 <data>•Hello<200>,• •how<200> •are<200> •you<200> •\uc5f0\ud569<200> •\uc7a5\ub85c\uad50\ud68c<200> •\u1109\u1161\u11bc\u1112\u1161\u11bc<200> •\u1112\u1161\u11ab\u110b\u1175\u11ab<200> •</data>
166 184
167 # Words containing non-BMP letters 185 # Words containing non-BMP letters
168 <data>•abc\U00010300<200> •abc\N{DESERET SMALL LETTER ENG}<200> •abc\N{MATHEMATICAL BOLD SMALL Z}<200> •abc\N{MATHEMATICAL SANS-SERIF BOLD ITALIC PI SYMBOL}<200> •</data> 186 <data>•abc\U00010300<200> •abc\N{DESERET SMALL LETTER ENG}<200> •abc\N{MATHEMATICAL BOLD SMALL Z}<200> •abc\N{MATHEMATICAL SANS-SERIF BOLD ITALIC PI SYMBOL}<200> •</data>
169 187
170 # Unassigned code points 188 # Unassigned code points
171 <data>•abc<200>\U0001D800•def<200>\U0001D3FF• •</data> 189 <data>•abc<200>\U0001D800•def<200>\U0001D3FF• •</data>
172 190
173 # Hiragana & Katakana stay together, but separates from each other and Latin. 191 # Hiragana & Katakana stay together, but separates from each other and Latin.
174 # *** what to do about theoretical combos of chars? i.e. hiragana + accent 192 # *** what to do about theoretical combos of chars? i.e. hiragana + accent
175 #<data>•abc<200>\N{HIRAGANA LETTER SMALL A}<400>\N{HIRAGANA LETTER VU}\N{COMBINING ACUTE ACCENT}<400>\N{HIRAGANA ITERATION MARK}<400>\N{KATAKANA LETTER SMALL A}\N{KATAKANA ITERATION MARK}\N{HALFWIDTH KATAKANA LETTER WO}\N{HALFWIDTH KATAKANA LETTER N}<400>def<200>#•</data> 193 #<data>•abc<200>\N{HIRAGANA LETTER SMALL A}<400>\N{HIRAGANA LETTER VU}\N{COMBINING ACUTE ACCENT}<400>\N{HIRAGANA ITERATION MARK}<400>\N{KATAKANA LETTER SMALL A}\N{KATAKANA ITERATION MARK}\N{HALFWIDTH KATAKANA LETTER WO}\N{HALFWIDTH KATAKANA LETTER N}<400>def<200>#•</data>
(...skipping 69 matching lines...)
245 <data>•\u8527<400>\u02ba<200>\u0027\u0d42•\u00b7•\u09ea<100></data> 263 <data>•\u8527<400>\u02ba<200>\u0027\u0d42•\u00b7•\u09ea<100></data>
246 264
247 # 265 #
248 # Jitterbug 5276 - treat Japanese half width voicing marks as Grapheme Extend 266 # Jitterbug 5276 - treat Japanese half width voicing marks as Grapheme Extend
249 # 267 #
250 <data>•A\uff9e\uff9fBC<200> •1\uff9e\uff9f23<100></data> 268 <data>•A\uff9e\uff9fBC<200> •1\uff9e\uff9f23<100></data>
251 269
252 # User guide example: 270 # User guide example:
253 <data>•Parlez<200>-•vous<200> •français<200> •?•</data> 271 <data>•Parlez<200>-•vous<200> •français<200> •?•</data>
254 272
273 # Test for #11673
274 <word>
275 <data>•ジョージア<400> •</data>
276
255 ######################################################################################## 277 ########################################################################################
256 # 278 #
257 # 279 #
258 # S e n t e n c e B o u n d a r y T e s t s 280 # S e n t e n c e B o u n d a r y T e s t s
259 # 281 #
260 # 282 #
261 ########################################################################################## 283 ##########################################################################################
262 284
263 285
264 # 286 #
(...skipping 432 matching lines...)
697 <data>•เล่น•ผ่าน•ทาง•บลูทูธ•บน•อุปกรณ์•</data> 719 <data>•เล่น•ผ่าน•ทาง•บลูทูธ•บน•อุปกรณ์•</data>
698 720
699 # Test for city names #10691 721 # Test for city names #10691
700 <line> 722 <line>
701 <data>•ไป•ที่•ซานฟรานซิสโก•</data> 723 <data>•ไป•ที่•ซานฟรานซิสโก•</data>
702 724
703 # Test for #10630, #10631 725 # Test for #10630, #10631
704 <line> 726 <line>
705 <data>•แท็ก•แอปพลิเคชัน•เป็น•พิเศษ•</data> 727 <data>•แท็ก•แอปพลิเคชัน•เป็น•พิเศษ•</data>
706 728
729 # Test for #11019
730 <line>
731 <data>•เบ•เบราว์เซอร์•โพ•โพสต์•โพสท์•</data>
732
733 # Test for #11688
734 <line>
735 <data>•อัปเดต•อีเวนต์•</data>
736
707 ########################################################################################## 737 ##########################################################################################
708 # 738 #
709 # Lao Tests 739 # Lao Tests
710 # 740 #
711 ########################################################################################## 741 ##########################################################################################
712 <locale en> 742 <locale en>
713 # Basic check for #7647 743 # Basic check for #7647
714 <line> 744 <line>
715 <data>•ສະບາຍດີ•</data> 745 <data>•ສະບາຍດີ•</data>
716 <data>•ດີ•ຂອບໃຈ•</data> 746 <data>•ດີ•ຂອບໃຈ•</data>
(...skipping 174 matching lines...)
891 921
892 <data>•abc •- •def •abc •-def •abc- •def •</data> # With ASCII hyphen 922 <data>•abc •- •def •abc •-def •abc- •def •</data> # With ASCII hyphen
893 <data>•abc •‐ •def •abc •‐def •abc‐ •def •</data> # With Unicode u2010 hyphen 923 <data>•abc •‐ •def •abc •‐def •abc‐ •def •</data> # With Unicode u2010 hyphen
894 924
895 # Test for #10176 (in fi) 925 # Test for #10176 (in fi)
896 <line> 926 <line>
897 <data>•abc/•s •def•</data> 927 <data>•abc/•s •def•</data>
898 <data>•abc/\u05D9 •def•</data> 928 <data>•abc/\u05D9 •def•</data>
899 <data>•\u05E7\u05D7/\u05D9 •\u05DE\u05E2\u05D9\u05DC•</data> 929 <data>•\u05E7\u05D7/\u05D9 •\u05DE\u05E2\u05D9\u05DC•</data>
900 <data>•\u05D3\u05E8\u05D5\u05E9\u05D9\u05DD •\u05E9\u05D7\u05E7\u05E0\u05D9\u05DD/\u05D9\u05D5\u05EA•</data> 930 <data>•\u05D3\u05E8\u05D5\u05E9\u05D9\u05DD •\u05E9\u05D7\u05E7\u05E0\u05D9\u05DD/\u05D9\u05D5\u05EA•</data>
931
932 ####################################################################################
933 #
934 # Test CSS line break variants: strict, normal, loose
935 #
936 ####################################################################################
937
938 <locale ja@lb=strict>
939 <line>
940 # •no brk before 3063 •no brk before 301C•no brk btw 2026 •no brk before FF01•
941 <data>•\u3084\u3063•\u3071•\u308A\u0020•\u0031\u301C\u0020•\u2026\u2026\u0020•\u30A2\uFF01\u0020•</data>
942
943 <locale ja@lb=normal>
944 <line>
945 # •brk OK before 3063 •brk OK before 301C •no brk btw 2026 •no brk before FF01•
946 <data>•\u3084•\u3063•\u3071•\u308A\u0020•\u0031•\u301C\u0020•\u2026\u2026\u0020•\u30A2\uFF01\u0020•</data>
947
948 <locale ja@lb=loose>
949 <line>
950 # •brk OK before 3063 •brk OK before 301C •brk OK btw 2026 •brk OK before FF01•
951 <data>•\u3084•\u3063•\u3071•\u308A\u0020•\u0031•\u301C\u0020•\u2026•\u2026\u0020•u30A2•\uFF01\u0020•</data>
952
953 <locale en@lb=strict>
954 <line>
955 # •no brk before 3063 •no brk before 301C•no brk btw 2026 •no brk before FF01•
956 <data>•\u3084\u3063•\u3071•\u308A\u0020•\u0031\u301C\u0020•\u2026\u2026\u0020•\u30A2\uFF01\u0020•</data>
957
958 <locale en@lb=normal>
959 <line>
960 # •brk OK before 3063 •no brk before 301C •no brk btw 2026 •no brk before FF01•
961 <data>•\u3084•\u3063•\u3071•\u308A\u0020•\u0031\u301C\u0020•\u2026\u2026\u0020•\u30A2\uFF01\u0020•</data>
962
963 <locale en@lb=loose>
964 <line>
965 # •brk OK before 3063 •no brk before 301C •brk OK btw 2026 •no brk before FF01•
966 <data>•\u3084•\u3063•\u3071•\u308A\u0020•\u0031\u301C\u0020•\u2026•\u2026\u0020•u30A2\uFF01\u0020•</data>
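
Note on reading the <data> lines: each one encodes both the test text and the boundaries expected in it. The Python sketch below is a rough illustration of how such a line could be decoded, not ICU's own test harness. It rests on assumptions drawn only from the data lines visible in this diff, since the header lines that define the syntax are among the skipped ones: that `•` marks an expected break, that `<NNN>` marks an expected break carrying a numeric status value, and that `\uXXXX`, `\UXXXXXXXX`, `\N{NAME}`, `\r` and `\n` are character escapes. The names `TOKEN` and `parse_data` are made up for this example.

```python
import re
import unicodedata

# One alternative per token kind; the name of the matching group tells us
# which kind was seen. The token set is an assumption based on this diff.
TOKEN = re.compile(
    r"(?P<dot>•)"                      # bare break marker
    r"|<(?P<status>\d+)>"              # break marker with a status value
    r"|\\u(?P<u4>[0-9A-Fa-f]{4})"      # \uXXXX escape
    r"|\\U(?P<u8>[0-9A-Fa-f]{8})"      # \UXXXXXXXX escape
    r"|\\N\{(?P<name>[^}]+)\}"         # \N{CHARACTER NAME} escape
    r"|\\(?P<ctl>[rn])"                # \r and \n
    r"|(?P<lit>.)",                    # anything else is literal text
    re.DOTALL,
)

CONTROL = {"r": "\r", "n": "\n"}


def parse_data(payload):
    """Split the body of a <data> element into plain text and break positions.

    Returns (text, breaks). Each entry of breaks is (offset, status), where
    offset is a code point index into text and status is None for a bare
    break marker.
    """
    chars = []
    breaks = []
    for match in TOKEN.finditer(payload):
        kind = match.lastgroup
        value = match.group(kind)
        if kind == "dot":
            breaks.append((len(chars), None))
        elif kind == "status":
            breaks.append((len(chars), int(value)))
        elif kind in ("u4", "u8"):
            chars.append(chr(int(value, 16)))
        elif kind == "name":
            chars.append(unicodedata.lookup(value))
        elif kind == "ctl":
            chars.append(CONTROL[value])
        else:
            chars.append(value)
    return "".join(chars), breaks


if __name__ == "__main__":
    # Payload of the Jitterbug 5276 test line shown above.
    text, breaks = parse_data(r"•A\uff9e\uff9fBC<200> •1\uff9e\uff9f23<100>")
    print(repr(text), breaks)
```

Offsets are counted in code points rather than UTF-16 units, so they would need converting before being compared with positions reported by an ICU break iterator for text outside the BMP.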