Chromium Code Reviews

Side by Side Diff: source/data/unidata/changes.txt

Issue 845603002: Update ICU to 54.1 step 1 (Closed) Base URL: https://chromium.googlesource.com/chromium/deps/icu.git@master
Patch Set: remove unused directories Created 5 years, 11 months ago
1 * Copyright (C) 2004-2013, International Business Machines 1 * Copyright (C) 2004-2014, International Business Machines
2 * Corporation and others. All Rights Reserved. 2 * Corporation and others. All Rights Reserved.
3 * 3 *
4 * file name: changes.txt 4 * file name: changes.txt
5 * encoding: US-ASCII 5 * encoding: US-ASCII
6 * tab size: 8 (not used) 6 * tab size: 8 (not used)
7 * indentation:4 7 * indentation:4
8 * 8 *
9 * created on: 2004may06 9 * created on: 2004may06
10 * created by: Markus W. Scherer 10 * created by: Markus W. Scherer
11 * 11 *
12 * change log for Unicode updates 12 * change log for Unicode updates
13 13
14 ---------------------------------------------------------------------------- *** 14 ---------------------------------------------------------------------------- ***
15 15
16 Unicode 8.0 update for ICU ??
17
18 * UCA issue from 7.0
19
20 - U+1DE9 COMBINING LATIN SMALL LETTER BETA
21 sorts with Greek Beta, should sort with Latin B?
22 + Ken says:
23 No, it was deliberate:
24
25 03B2;GREEK SMALL LETTER BETA;Ll;;;;0392;;0392
26 1D5D;MODIFIER LETTER SMALL BETA;Lm;<super> 03B2;;;;;
27 1DE9;COMBINING LATIN SMALL LETTER BETA;Mn;<sort> 03B2;;;;;
28 1D66;GREEK SUBSCRIPT SMALL LETTER BETA;Ll;<sub> 03B2;;;;;
29
30 Note the relationship to U+1D5D.
31
32 When the disunified *Latin* beta base letter shows up in Unicode 8.0:
33
34 U+A7B4 LATIN CAPITAL LETTER BETA
35 U+A7B5 LATIN SMALL LETTER BETA
36
37 we could re-evaluate what U+1DE9 equates to, for collation,
38 but currently there isn’t any Latin beta to serve that function
39 in Unicode 7.0.
40
41 - ICU_ROOT=~/svn.icu/trunk
42 - ICU_SRC_DIR=$ICU_ROOT/src
43 - ~/svn.icutools/trunk/dbg/unicode/c$ genuca/genuca --hanOrder implicit $ICU_SRC_DIR
44 - ~/svn.icutools/trunk/dbg/unicode/c$ genuca/genuca --hanOrder radical-stroke $ICU_SRC_DIR
45
46
47 ---------------------------------------------------------------------------- ***
48
49 Unicode 7.0 update for ICU 54
50
51 http://www.unicode.org/review/pri271/ -- beta review
52 http://www.unicode.org/reports/uax-proposed-updates.html
53 http://www.unicode.org/versions/beta-7.0.0.html#notable_issues
54 http://www.unicode.org/reports/tr44/tr44-13.html
55
56 *** ICU Trac
57
58 - ticket 10821: Unicode 7.0, UCA 7.0
59 - C++ branches/markus/uni70 at r35584 from trunk at r35580
60 - Java branches/markus/uni70 at r35587 from trunk at r35545
61
62 *** CLDR Trac
63
64 - ticket 7195: UCA 7.0 CLDR root collation
65 - branches/markus/uni70 at r10062 from trunk at r10061
66
67 - ticket 6762: script metadata for Unicode 7.0 new scripts
68
69 *** Unicode version numbers
70 - makedata.mak
71 - uchar.h
72 - com.ibm.icu.util.VersionInfo
73 - com.ibm.icu.dev.test.lang.UCharacterTest.VERSION_
74
75 - Run ICU4C "configure" _after_ updating the Unicode version number in uchar.h
76 so that the makefiles see the new version number.
77
78 *** data files & enums & parser code
79
80 * file preparation
81
82 - download UCD & IDNA files
83 - make sure that the Unicode data folder passed into preparseucd.py
84 includes a copy of the latest IdnaMappingTable.txt (can be in some subfolder)
85 - only for manual diffs: remove version suffixes from the file names
86 ~/unidata/uni70/20140403$ ../../desuffixucd.py .
87 (see https://sites.google.com/site/unicodetools/inputdata)
88 - only for manual diffs: extract Unihan.zip to "here" (.../ucd/Unihan/*.txt), delete Unihan.zip
89 - ~/svn.icutools/trunk/src/unicode$ py/preparseucd.py ~/unidata/uni70/20140403 $ICU_SRC_DIR ~/svn.icutools/trunk/src
90 - This writes files (especially ppucd.txt) to the ICU4C unidata and testdata subfolders.
91 - Restore TODO diffs in source/data/unidata/UCARules.txt
92 cd $ICU_SRC_DIR
93 meld ../../trunk/src/source/data/unidata/UCARules.txt source/data/unidata/UCARules.txt
94 - Restore ICU patches for ticket #10176 in source/test/testdata/LineBreakTest.txt
95
96 - also: from http://unicode.org/Public/security/7.0.0/ download new
97 confusables.txt & confusablesWholeScript.txt
98 and copy to $ICU_ROOT/src/source/data/unidata/
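
The "remove version suffixes" step in the file preparation list above could look roughly like the following Python sketch. It is not the code of desuffixucd.py; it assumes suffixes of the form -7.0.0 or -7.0.0d18 directly before the file extension, and the function name is hypothetical.

```python
import re

# Assumed suffix shape: "-<major>.<minor>.<update>" with an optional
# draft marker such as "d18", directly before the final extension.
_VERSION_SUFFIX = re.compile(r"-\d+\.\d+\.\d+(?:d\d+)?(?=\.[A-Za-z]+$)")


def desuffix(file_name):
    """Return file_name without a trailing version suffix, if it has one."""
    return _VERSION_SUFFIX.sub("", file_name)
```

Names without such a suffix (for example Unihan.zip) pass through unchanged, so the function can be applied to every file in a folder.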
99
100 * initial preparseucd.py changes
101 - remove new Unicode scripts from the
102 only-in-ISO-15924 list according to the error message:
103 ValueError: remove ['Hmng', 'Lina', 'Perm', 'Mani', 'Phlp', 'Bass',
104 'Dupl', 'Elba', 'Gran', 'Mend', 'Narb', 'Nbat', 'Palm',
105 'Sind', 'Wara', 'Mroo', 'Khoj', 'Tirh', 'Aghb', 'Mahj']
106 from _scripts_only_in_iso15924
107 -> fix expectedLong names in cucdapi.c/TestUScriptCodeAPI()
108 and in com.ibm.icu.dev.test.lang.TestUScript.java
109 - NamesList.txt now has a heading with a non-ASCII character
110 + keep ppucd.txt in platform charset, rather than changing tool/test parsers
111 + escape non-ASCII characters in heading comments
112 - gets Unicode copyright line from PropertyAliases.txt which is currently still at 2013
113 + get the copyright from the first file whose copyright line contains the current year
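
A minimal Python sketch of the two preparseucd.py fixes noted above (copyright line selection, escaping of non-ASCII heading text). This is an illustration only, not the tool's actual code; the function names and the input shape are hypothetical.

```python
import datetime
import re


def pick_copyright_line(first_lines_by_file, year=None):
    """Return the first copyright line that mentions the given year.

    first_lines_by_file: iterable of (file_name, line) pairs in the order
    in which the files are read (hypothetical input shape).
    Returns None if no line qualifies.
    """
    if year is None:
        year = datetime.date.today().year
    year_pattern = re.compile(r"\b%d\b" % year)
    for _file_name, line in first_lines_by_file:
        if "Copyright" in line and year_pattern.search(line):
            return line
    return None


def escape_non_ascii(text):
    """Replace each non-ASCII character with a backslash escape
    (\\xNN, \\uNNNN or \\UNNNNNNNN) so that the result is pure ASCII."""
    return text.encode("ascii", "backslashreplace").decode("ascii")
```

Picking by year rather than by file avoids depending on any single file (such as PropertyAliases.txt) having been updated.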
114
115 * PropertyValueAliases.txt changes
116 - 32 new Block (blk) values:
117 blk; Bassa_Vah ; Bassa_Vah
118 blk; Caucasian_Albanian ; Caucasian_Albanian
119 blk; Coptic_Epact_Numbers ; Coptic_Epact_Numbers
120 blk; Diacriticals_Ext ; Combining_Diacritical_Marks_Extended
121 blk; Duployan ; Duployan
122 blk; Elbasan ; Elbasan
123 blk; Geometric_Shapes_Ext ; Geometric_Shapes_Extended
124 blk; Grantha ; Grantha
125 blk; Khojki ; Khojki
126 blk; Khudawadi ; Khudawadi
127 blk; Latin_Ext_E ; Latin_Extended_E
128 blk; Linear_A ; Linear_A
129 blk; Mahajani ; Mahajani
130 blk; Manichaean ; Manichaean
131 blk; Mende_Kikakui ; Mende_Kikakui
132 blk; Modi ; Modi
133 blk; Mro ; Mro
134 blk; Myanmar_Ext_B ; Myanmar_Extended_B
135 blk; Nabataean ; Nabataean
136 blk; Old_North_Arabian ; Old_North_Arabian
137 blk; Old_Permic ; Old_Permic
138 blk; Ornamental_Dingbats ; Ornamental_Dingbats
139 blk; Pahawh_Hmong ; Pahawh_Hmong
140 blk; Palmyrene ; Palmyrene
141 blk; Pau_Cin_Hau ; Pau_Cin_Hau
142 blk; Psalter_Pahlavi ; Psalter_Pahlavi
143 blk; Shorthand_Format_Controls ; Shorthand_Format_Controls
144 blk; Siddham ; Siddham
145 blk; Sinhala_Archaic_Numbers ; Sinhala_Archaic_Numbers
146 blk; Sup_Arrows_C ; Supplemental_Arrows_C
147 blk; Tirhuta ; Tirhuta
148 blk; Warang_Citi ; Warang_Citi
149 -> add to uchar.h
150 use long property names for enum constants
151 -> add to UCharacter.UnicodeBlock IDs
152 Eclipse find UBLOCK_([^ ]+) = ([0-9]+), (/.+)
153 replace public static final int \1_ID = \2; \3
154 -> add to UCharacter.UnicodeBlock objects
155 Eclipse find UBLOCK_([^ ]+) = [0-9]+, (/.+)
156 replace public static final UnicodeBlock \1 = new UnicodeBlock("\1", \1_ID); \2
157 - 28 new Joining_Group (jg) values:
158 jg ; Manichaean_Aleph ; Manichaean_Aleph
159 jg ; Manichaean_Ayin ; Manichaean_Ayin
160 jg ; Manichaean_Beth ; Manichaean_Beth
161 jg ; Manichaean_Daleth ; Manichaean_Daleth
162 jg ; Manichaean_Dhamedh ; Manichaean_Dhamedh
163 jg ; Manichaean_Five ; Manichaean_Five
164 jg ; Manichaean_Gimel ; Manichaean_Gimel
165 jg ; Manichaean_Heth ; Manichaean_Heth
166 jg ; Manichaean_Hundred ; Manichaean_Hundred
167 jg ; Manichaean_Kaph ; Manichaean_Kaph
168 jg ; Manichaean_Lamedh ; Manichaean_Lamedh
169 jg ; Manichaean_Mem ; Manichaean_Mem
170 jg ; Manichaean_Nun ; Manichaean_Nun
171 jg ; Manichaean_One ; Manichaean_One
172 jg ; Manichaean_Pe ; Manichaean_Pe
173 jg ; Manichaean_Qoph ; Manichaean_Qoph
174 jg ; Manichaean_Resh ; Manichaean_Resh
175 jg ; Manichaean_Sadhe ; Manichaean_Sadhe
176 jg ; Manichaean_Samekh ; Manichaean_Samekh
177 jg ; Manichaean_Taw ; Manichaean_Taw
178 jg ; Manichaean_Ten ; Manichaean_Ten
179 jg ; Manichaean_Teth ; Manichaean_Teth
180 jg ; Manichaean_Thamedh ; Manichaean_Thamedh
181 jg ; Manichaean_Twenty ; Manichaean_Twenty
182 jg ; Manichaean_Waw ; Manichaean_Waw
183 jg ; Manichaean_Yodh ; Manichaean_Yodh
184 jg ; Manichaean_Zayin ; Manichaean_Zayin
185 jg ; Straight_Waw ; Straight_Waw
186 -> uchar.h & UCharacter.JoiningGroup
187 - 23 new Script (sc) values:
188 sc ; Aghb ; Caucasian_Albanian
189 sc ; Bass ; Bassa_Vah
190 sc ; Dupl ; Duployan
191 sc ; Elba ; Elbasan
192 sc ; Gran ; Grantha
193 sc ; Hmng ; Pahawh_Hmong
194 sc ; Khoj ; Khojki
195 sc ; Lina ; Linear_A
196 sc ; Mahj ; Mahajani
197 sc ; Mani ; Manichaean
198 sc ; Mend ; Mende_Kikakui
199 sc ; Modi ; Modi
200 sc ; Mroo ; Mro
201 sc ; Narb ; Old_North_Arabian
202 sc ; Nbat ; Nabataean
203 sc ; Palm ; Palmyrene
204 sc ; Pauc ; Pau_Cin_Hau
205 sc ; Perm ; Old_Permic
206 sc ; Phlp ; Psalter_Pahlavi
207 sc ; Sidd ; Siddham
208 sc ; Sind ; Khudawadi
209 sc ; Tirh ; Tirhuta
210 sc ; Wara ; Warang_Citi
211 -> uscript.h (many were added before)
212 comment "Mende Kikakui" for USCRIPT_MENDE
213 add USCRIPT_KHUDAWADI, make USCRIPT_SINDHI an alias
214 -> com.ibm.icu.lang.UScript
215 find USCRIPT_([^ ]+) *= ([0-9]+),(.+)
216 replace public static final int \1 = \2; \3
217 - 6 new script codes from ISO 15924 http://www.unicode.org/iso15924/codechanges.html
218 (added 2012-11-01)
219 Ahom 338 Ahom
220 Hatr 127 Hatran
221 Mult 323 Multani
222 (added 2013-10-12)
223 Modi 324 Modi
224 Pauc 263 Pau Cin Hau
225 Sidd 302 Siddham
226 -> uscript.h (some overlap with additions from Unicode)
227 -> com.ibm.icu.lang.UScript
228 find USCRIPT_([^ ]+) *= ([0-9]+),(.+)
229 replace public static final int \1 = \2; \3
230 -> add Ahom, Hatr, Mult to preparseucd.py _scripts_only_in_iso15924
231 -> add to expectedLong and expectedShort names in cintltst/cucdapi.c/TestUScriptCodeAPI()
232 and in com.ibm.icu.dev.test.lang.TestUScript.java
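
The Eclipse find/replace patterns listed in this section for turning C enum lines into Java constants can be tried outside Eclipse. The Python sketch below transcribes the three patterns as given; the helper name and the sample enum lines (including their numeric values) are made up for illustration.

```python
import re

# (find, replace) pairs transcribed from the notes above.
BLOCK_ID = (r"UBLOCK_([^ ]+) = ([0-9]+), (/.+)",
            r"public static final int \1_ID = \2; \3")
BLOCK_OBJECT = (r"UBLOCK_([^ ]+) = [0-9]+, (/.+)",
                r'public static final UnicodeBlock \1 = new UnicodeBlock("\1", \1_ID); \2')
SCRIPT = (r"USCRIPT_([^ ]+) *= ([0-9]+),(.+)",
          r"public static final int \1 = \2; \3")


def convert(line, rule):
    """Apply one (find, replace) rule to a single source line."""
    pattern, replacement = rule
    return re.sub(pattern, replacement, line)


if __name__ == "__main__":
    # Sample input with an illustrative value, not copied from uchar.h.
    print(convert("UBLOCK_BASSA_VAH = 221, /*[16AD0]*/", BLOCK_ID))
```

Lines that do not match a pattern are returned unchanged, so the rules can be run over a whole header excerpt.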
233
234 * update Script metadata: SCRIPT_PROPS[] in uscript_props.cpp & UScript.ScriptMetadata
235 (not strictly necessary for NOT_ENCODED scripts)
236 ~/svn.icutools/trunk/src/unicode$ py/parsescriptmetadata.py $ICU_SRC_DIR/source/common/unicode/uscript.h ~/svn.cldr/trunk/common/properties/scriptMetadata.txt
237
238 * generate normalization data files
239 - cd $ICU_ROOT/dbg
240 - export LD_LIBRARY_PATH=$ICU_ROOT/dbg/lib
241 - SRC_DATA_IN=$ICU_SRC_DIR/source/data/in
242 - UNIDATA=$ICU_SRC_DIR/source/data/unidata
243 - bin/gennorm2 -o $ICU_SRC_DIR/source/common/norm2_nfc_data.h -s $UNIDATA/norm2 nfc.txt --csource
244 - bin/gennorm2 -o $SRC_DATA_IN/nfc.nrm -s $UNIDATA/norm2 nfc.txt
245 - bin/gennorm2 -o $SRC_DATA_IN/nfkc.nrm -s $UNIDATA/norm2 nfc.txt nfkc.txt
246 - bin/gennorm2 -o $SRC_DATA_IN/nfkc_cf.nrm -s $UNIDATA/norm2 nfc.txt nfkc.txt nfkc_cf.txt
247 - bin/gennorm2 -o $SRC_DATA_IN/uts46.nrm -s $UNIDATA/norm2 nfc.txt uts46.txt
248
249 * build ICU (make install)
250 so that the tools build can pick up the new definitions from the installed header files.
251
252 ~/svn.icu/uni70/dbg$ echo;echo;make -j5 install > out.txt 2>&1 ; tail -n 20 out.txt
253
254 * build Unicode tools using CMake+make
255
256 ~/svn.icutools/trunk/src/unicode/c/icudefs.txt:
257
258 # Location (--prefix) of where ICU was installed.
259 set(ICU_INST_DIR /home/mscherer/svn.icu/uni70/inst)
260 # Location of the ICU source tree.
261 set(ICU_SRC_DIR /home/mscherer/svn.icu/uni70/src)
262
263 ~/svn.icutools/trunk/dbg/unicode/c$ cmake ../../../src/unicode/c
264 ~/svn.icutools/trunk/dbg/unicode/c$ make
265
266 * genprops work
267 - new code point range for Joining_Group values: 10AC0..10AFF Manichaean
268 + add second array of Joining_Group values for at most 10800..10FFF
269 icutools: unicode/c/genprops/bidipropsbuilder.cpp
270 icu: source/common/ubidi_props.h/.c/_data.h
271 icu4j: main/classes/core/src/com/ibm/icu/impl/UBiDiProps.java
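
The "second array" idea above can be pictured as a lookup that tries two separate code point ranges. The following Python sketch is only an illustration of that structure, not the ubidi_props code; all names are hypothetical, the bounds of the first range are assumed, and the second range is narrowed to the 10AC0..10AFF Manichaean range mentioned above.

```python
# Assumed bounds of the original (first) array; not taken from the source.
JG_START, JG_LIMIT = 0x0620, 0x08A0
# Second array: the new Manichaean range 10AC0..10AFF (limit is exclusive).
JG_START2, JG_LIMIT2 = 0x10AC0, 0x10B00

NO_JOINING_GROUP = 0


def joining_group(c, jg_array, jg_array2):
    """Look up the Joining_Group value of code point c.

    jg_array covers [JG_START, JG_LIMIT), jg_array2 covers
    [JG_START2, JG_LIMIT2); everything else has no joining group.
    """
    if JG_START <= c < JG_LIMIT:
        return jg_array[c - JG_START]
    if JG_START2 <= c < JG_LIMIT2:
        return jg_array2[c - JG_START2]
    return NO_JOINING_GROUP
```

Two short arrays avoid one long, mostly empty array spanning everything between the two ranges.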
272
273 * generate core properties data files
274 - ~/svn.icutools/trunk/dbg/unicode/c$ genprops/genprops $ICU_SRC_DIR
275 - ~/svn.icutools/trunk/dbg/unicode/c$ genuca/genuca $ICU_SRC_DIR
276 - rebuild ICU (make install) & tools
277 - run genuca again (see step above) so that it picks up the new nfc.nrm
278 - rebuild ICU (make install) & tools
279
280 * update uts46test.cpp and UTS46Test.java if there are new characters that are equivalent to
281 sequences with non-LDH ASCII (that is, their decompositions contain '=' or similar)
282 - grep IdnaMappingTable.txt or uts46.txt for "disallowed_STD3_valid" on non-ASCII characters
283 - Unicode 6.0..7.0: U+2260, U+226E, U+226F
284 - nothing new in 7.0, no test file to update
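
The grep step above can also be done with a small script. This Python sketch assumes the usual IdnaMappingTable.txt layout (semicolon-separated fields, "#" comments, code points or ranges in the first field); the function name is hypothetical, and the script is a convenience, not part of the ICU tools.

```python
def non_ascii_std3_valid(lines):
    """Yield (start, end) code point ranges whose status is
    disallowed_STD3_valid and that reach beyond ASCII."""
    for line in lines:
        # Drop the trailing comment, then skip empty lines.
        data = line.split("#", 1)[0].strip()
        if not data:
            continue
        fields = [field.strip() for field in data.split(";")]
        if len(fields) < 2 or fields[1] != "disallowed_STD3_valid":
            continue
        first, _, last = fields[0].partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start
        if end > 0x7F:
            yield (start, end)
```

Typical use would be to pass an open file object for IdnaMappingTable.txt and compare the result with the list of characters already covered by the tests.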
285
286 * run & fix ICU4C tests
287
288 * update Java data files
289 - refresh just the UCD-related files, just to be safe
290 - see (ICU4C)/source/data/icu4j-readme.txt
291 - mkdir /tmp/icu4j
292 - ~/svn.icu/uni70/dbg$ make ICU4J_ROOT=/tmp/icu4j icu4j-data-install
293 output:
294 ...
295 Unicode .icu files built to ./out/build/icudt53l
296 echo timestamp > uni-core-data
297 mkdir -p ./out/icu4j/com/ibm/icu/impl/data/icudt53b
298 mkdir -p ./out/icu4j/tzdata/com/ibm/icu/impl/data/icudt53b
299 echo pnames.icu ubidi.icu ucase.icu uprops.icu > ./out/icu4j/add.txt
300 LD_LIBRARY_PATH=../lib:../stubdata:../tools/ctestfw:$LD_LIBRARY_PATH ../bin/icupkg ./out/tmp/icudt53l.dat ./out/icu4j/icudt53b.dat -a ./out/icu4j/add.txt -s ./out/build/icudt53l -x '*' -tb -d ./out/icu4j/com/ibm/icu/impl/data/icudt53b
301 mv ./out/icu4j/"com/ibm/icu/impl/data/icudt53b/zoneinfo64.res" ./out/icu4j/"com/ibm/icu/impl/data/icudt53b/metaZones.res" ./out/icu4j/"com/ibm/icu/impl/data/icudt53b/timezoneTypes.res" ./out/icu4j/"com/ibm/icu/impl/data/icudt53b/windowsZones.res" "./out/icu4j/tzdata/com/ibm/icu/impl/data/icudt53b"
302 jar cf ./out/icu4j/icudata.jar -C ./out/icu4j com/ibm/icu/impl/data/icudt53b/
303 mkdir -p /tmp/icu4j/main/shared/data
304 cp ./out/icu4j/icudata.jar /tmp/icu4j/main/shared/data
305 jar cf ./out/icu4j/icutzdata.jar -C ./out/icu4j/tzdata com/ibm/icu/impl/data/icudt53b/
306 mkdir -p /tmp/icu4j/main/shared/data
307 cp ./out/icu4j/icutzdata.jar /tmp/icu4j/main/shared/data
308 make[1]: Leaving directory `/home/mscherer/svn.icu/uni70/dbg/data'
309 - copy the big-endian Unicode data files to another location,
310 separate from the other data files
311 ICUDT=icudt54b
312 mkdir -p /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/coll
313 mkdir -p /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/brkitr
314 cd ~/svn.icu/uni70/dbg/data/out/icu4j
315 cp com/ibm/icu/impl/data/$ICUDT/confusables.cfu /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT
316 cp com/ibm/icu/impl/data/$ICUDT/*.icu /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT
317 rm /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/cnvalias.icu
318 cp com/ibm/icu/impl/data/$ICUDT/*.nrm /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT
319 cp com/ibm/icu/impl/data/$ICUDT/coll/*.icu /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/coll
320 cp com/ibm/icu/impl/data/$ICUDT/brkitr/* /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/brkitr
321 - refresh ICU4J
322 ~/svn.icu/uni70/dbg/data/out/icu4j$ jar uf ~/svn.icu4j/trunk/src/main/shared/data/icudata.jar -C /tmp/icu4j com/ibm/icu/impl/data/$ICUDT
323
324 * update CollationFCD.java
325 + copy & paste the initializers of lcccIndex[] etc. from
326 ICU4C/source/i18n/collationfcd.cpp to
327 ICU4J/main/classes/collate/src/com/ibm/icu/impl/coll/CollationFCD.java
328
329 * refresh Java test .txt files
330 - copy new .txt files into ICU4J's main/tests/core/src/com/ibm/icu/dev/data/unicode
331 cd $ICU_SRC_DIR/source/data/unidata
332 cp confusables.txt confusablesWholeScript.txt NormalizationCorrections.txt NormalizationTest.txt SpecialCasing.txt UnicodeData.txt ~/svn.icu4j/trunk/src/main/tests/core/src/com/ibm/icu/dev/data/unicode
333 cd ../../test/testdata
334 cp BidiCharacterTest.txt BidiTest.txt ~/svn.icu4j/trunk/src/main/tests/core/src/com/ibm/icu/dev/data/unicode
335 cp ~/unidata/uni70/20140409/ucd/CompositionExclusions.txt ~/svn.icu4j/trunk/src/main/tests/core/src/com/ibm/icu/dev/data/unicode
336
337 * UCA
338
339 - download UCA files (mostly allkeys.txt) from http://www.unicode.org/Public/UCA/<beta version>/
340 - run desuffixucd.py (see https://sites.google.com/site/unicodetools/inputdata)
341 - update the input files for Mark's UCA tools, in ~/svn.unitools/trunk/data/uca/7.0.0/
342 - run Mark's UCA Main: https://sites.google.com/site/unicodetools/home#TOC-UCA
343 - output files are in ~/svn.unitools/Generated/uca/7.0.0/
344 - review data; compare files, use blankweights.sed or similar
345 ~/svn.unitools$ sed -r -f blankweights.sed Generated/uca/7.0.0/CollationAuxiliary/FractionalUCA.txt > frac-7.0.txt
346 - cd ~/svn.unitools/Generated/uca/7.0.0/
347 - update source/data/unidata/FractionalUCA.txt with FractionalUCA_SHORT.txt
348 cp CollationAuxiliary/FractionalUCA_SHORT.txt $ICU_SRC_DIR/source/data/unidata/FractionalUCA.txt
349 - update source/data/unidata/UCARules.txt with UCA_Rules_SHORT.txt
350 (note removing the underscore before "Rules")
351 cp CollationAuxiliary/UCA_Rules_SHORT.txt $ICU_SRC_DIR/source/data/unidata/UCARules.txt
352 - update (ICU4C)/source/test/testdata/CollationTest_*.txt
353 and (ICU4J)/main/tests/collate/src/com/ibm/icu/dev/data/CollationTest_*.txt
354 with output from Mark's Unicode tools (..._CLDR_..._SHORT.txt)
355 cp CollationAuxiliary/CollationTest_CLDR_NON_IGNORABLE_SHORT.txt $ICU_SRC_DIR/source/test/testdata/CollationTest_NON_IGNORABLE_SHORT.txt
356 cp CollationAuxiliary/CollationTest_CLDR_SHIFTED_SHORT.txt $ICU_SRC_DIR/source/test/testdata/CollationTest_SHIFTED_SHORT.txt
357 cp $ICU_SRC_DIR/source/test/testdata/CollationTest_*.txt ~/svn.icu4j/trunk/src/main/tests/collate/src/com/ibm/icu/dev/data
358 - run genuca, see command line above
359 - rebuild ICU4C
360 - refresh ICU4J collation data:
361 (subset of instructions above for properties data refresh, except copies all coll/*)
362 ICUDT=icudt54b
363 ~/svn.icu/uni70/dbg$ make ICU4J_ROOT=/tmp/icu4j icu4j-data-install
364 ~/svn.icu/uni70/dbg$ mkdir -p /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/coll
365 ~/svn.icu/uni70/dbg/data/out/icu4j$ cp com/ibm/icu/impl/data/$ICUDT/coll/* /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/coll
366 ~/svn.icu/uni70/dbg/data/out/icu4j$ jar uf ~/svn.icu4j/trunk/src/main/shared/data/icudata.jar -C /tmp/icu4j com/ibm/icu/impl/data/$ICUDT
367 - run all tests with the *_SHORT.txt or the full files (the full ones have comments, useful for debugging)
368 - note on intltest: if collate/UCAConformanceTest fails, then
369 utility/MultithreadTest/TestCollators will fail as well;
370 fix the conformance test before looking into the multi-thread test
371 - copy all output from Mark's UCA tool to unicode.org for review & staging by Ken & editors
372 - copy most of ~/svn.unitools/Generated/uca/7.0.0/CollationAuxiliary/* to CLDR branch
373 ~/svn.unitools$ cp Generated/uca/7.0.0/CollationAuxiliary/* ~/svn.cldr/trunk/common/uca/
374
375 * When refreshing all of ICU4J data from ICU4C
376 - ~/svn.icu/uni70/dbg$ make ICU4J_ROOT=/tmp/icu4j icu4j-data-install
377 - cp /tmp/icu4j/main/shared/data/icudata.jar ~/svn.icu4j/trunk/src/main/shared/data
378 or
379 - ~/svn.icu/uni70/dbg$ make ICU4J_ROOT=~/svn.icu4j/trunk/src icu4j-data-install
380
381 * run & fix ICU4J tests
382
383 *** LayoutEngine script information
384
385 (For details see the Unicode 5.2 change log below.)
386
387 * Run icu4j-tools: com.ibm.icu.dev.tool.layout.ScriptNameBuilder.
388 This generates LEScripts.h, LELanguages.h, ScriptAndLanguageTags.h and ScriptAndLanguageTags.cpp
389 in the working directory.
390 (It also generates ScriptRunData.cpp, which is no longer needed.)
391
392 The generated files have a current copyright date and "@stable" statement.
393 ICU 54: Fixed tools/misc/src/com/ibm/icu/dev/tool/layout/ScriptIDModuleWriter.java
394 for "born stable" Unicode API constants, and to stop parsing ICU version numbers
395 which may not contain dots any more.
396
397 - diff current <icu>/source/layout files vs. generated ones
398 ~/svn.icu4j/trunk/src$ meld $ICU_SRC_DIR/source/layout tools/misc/src/com/ibm/icu/dev/tool/layout
399 review and manually merge desired changes;
400 fix gratuitous changes, incorrect @draft/@stable and missing aliases;
401 Unicode-derived script codes should be "born stable" like constants in uchar.h, uscript.h etc.
402 - if you just copy the above files, then
403 fix mixed line endings, review the diffs as above and restore changes to API tags etc.;
404 manually re-add the "Indic script xyz v.2" tags in ScriptAndLanguageTags.h
405
406 *** API additions
407 - send notice to icu-design about new born-@stable API (enum constants etc.)
408
409 *** merge the Unicode update branches back onto the trunk
410 - do not merge the icudata.jar and testdata.jar,
411 instead rebuild them from merged & tested ICU4C
412
413 ---------------------------------------------------------------------------- ***
414
16 Unicode 6.3 update 415 Unicode 6.3 update
17 416
18 http://www.unicode.org/review/pri249/ -- beta review 417 http://www.unicode.org/review/pri249/ -- beta review
19 http://www.unicode.org/reports/uax-proposed-updates.html 418 http://www.unicode.org/reports/uax-proposed-updates.html
20 http://www.unicode.org/versions/beta-6.3.0.html#notable_issues 419 http://www.unicode.org/versions/beta-6.3.0.html#notable_issues
21 http://www.unicode.org/reports/tr44/tr44-11.html 420 http://www.unicode.org/reports/tr44/tr44-11.html
22 421
23 *** ICU Trac 422 *** ICU Trac
24 423
25 - ticket 10128: update ICU to Unicode 6.3 beta 424 - ticket 10128: update ICU to Unicode 6.3 beta
(...skipping 1658 matching lines...)
1684 2083
1685 * name matching 2084 * name matching
1686 - read UCD.html 2085 - read UCD.html
1687 2086
1688 * scripts 2087 * scripts
1689 - use new Hrkt=Katakana_Or_Hiragana 2088 - use new Hrkt=Katakana_Or_Hiragana
1690 2089
1691 * ZWJ & ZWNJ 2090 * ZWJ & ZWNJ
1692 - are now part of combining character sequences 2091 - are now part of combining character sequences
1693 - break iteration used to assume that LB classes did not overlap; now they do for ZWJ & ZWNJ 2092 - break iteration used to assume that LB classes did not overlap; now they do for ZWJ & ZWNJ