source/data/unidata/changes.txt - Issue 845603002: Update ICU to 54.1 step 1

Side by Side Diff: source/data/unidata/changes.txt

Issue 845603002: Update ICU to 54.1 step 1 (Closed) Base URL: https://chromium.googlesource.com/chromium/deps/icu.git@master

Patch Set: remove unused directories Created 5 years, 11 months ago


OLD	NEW
1 * Copyright (C) 2004-2013, International Business Machines	1 * Copyright (C) 2004-2014, International Business Machines

2 * Corporation and others. All Rights Reserved.	2 * Corporation and others. All Rights Reserved.

3 *	3 *

4 * file name: changes.txt	4 * file name: changes.txt

5 * encoding: US-ASCII	5 * encoding: US-ASCII

6 * tab size: 8 (not used)	6 * tab size: 8 (not used)

7 * indentation:4	7 * indentation:4

8 *	8 *

9 * created on: 2004may06	9 * created on: 2004may06

10 * created by: Markus W. Scherer	10 * created by: Markus W. Scherer

11 *	11 *

12 * change log for Unicode updates	12 * change log for Unicode updates

13	13

14 ---------------------------------------------------------------------------- ***	14 ---------------------------------------------------------------------------- ***

15	15

	16 Unicode 8.0 update for ICU ??

	17

	18 * UCA issue from 7.0

	19

	20 - U+1DE9 COMBINING LATIN SMALL LETTER BETA

	21 sorts with Greek Beta, should sort with Latin B?

	22 + Ken says:

	23 No, it was deliberate:

	24

	25 03B2;GREEK SMALL LETTER BETA;Ll;;;;0392;;0392

	26 1D5D;MODIFIER LETTER SMALL BETA;Lm;<super> 03B2;;;;;

	27 1DE9;COMBINING LATIN SMALL LETTER BETA;Mn;<sort> 03B2;;;;;

	28 1D66;GREEK SUBSCRIPT SMALL LETTER BETA;Ll;<sub> 03B2;;;;;

	29

	30 Note the relationship to U+1D5D.

	31

	32 When the disunified Latin beta base letter shows up in Unicode 8.0:

	33

	34 U+A7B4 LATIN CAPITAL LETTER BETA

	35 U+A7B5 LATIN SMALL LETTER BETA

	36

	37 we could re-evaluate what U+1DE9 equates to, for collation,

	38 but currently there isn’t any Latin beta to serve that function

	39 in Unicode 7.0.
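The relationship Ken points out can be checked mechanically. The following is a minimal sketch, not part of the ICU tools: it parses the four abbreviated records quoted above (field positions follow those quoted lines, not the full UnicodeData.txt layout) and lists every character whose mapping points at U+03B2. The names `Record` and `parse_record` are hypothetical.

```python
from typing import List, NamedTuple, Optional

# The four records quoted in the note above, copied verbatim.
RECORDS = """\
03B2;GREEK SMALL LETTER BETA;Ll;;;;0392;;0392
1D5D;MODIFIER LETTER SMALL BETA;Lm;<super> 03B2;;;;;
1DE9;COMBINING LATIN SMALL LETTER BETA;Mn;<sort> 03B2;;;;;
1D66;GREEK SUBSCRIPT SMALL LETTER BETA;Ll;<sub> 03B2;;;;;
"""


class Record(NamedTuple):
    code_point: int
    name: str
    general_category: str
    mapping_tag: Optional[str]   # e.g. "super", "sort", "sub"
    mapping_targets: List[int]   # code points the mapping refers to


def parse_record(line: str) -> Record:
    """Parse one semicolon-separated record in the abbreviated form above."""
    fields = line.split(";")
    tag = None
    targets = []
    # Assumption: the fourth field holds an optional <tag> plus hex code points.
    for token in fields[3].split():
        if token.startswith("<") and token.endswith(">"):
            tag = token[1:-1]
        else:
            targets.append(int(token, 16))
    return Record(int(fields[0], 16), fields[1], fields[2], tag, targets)


records = [parse_record(line) for line in RECORDS.splitlines() if line]
maps_to_greek_beta = [r for r in records if 0x03B2 in r.mapping_targets]

for r in maps_to_greek_beta:
    print(f"U+{r.code_point:04X} <{r.mapping_tag}> {r.name}")
```

All three mapped characters, including U+1DE9, resolve to Greek beta, which is why U+1DE9 currently sorts with it.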

	40

	41 - ICU_ROOT=~/svn.icu/trunk

	42 - ICU_SRC_DIR=$ICU_ROOT/src

	43 - ~/svn.icutools/trunk/dbg/unicode/c$ genuca/genuca --hanOrder implicit $ICU_SRC_DIR

	44 - ~/svn.icutools/trunk/dbg/unicode/c$ genuca/genuca --hanOrder radical-stroke $ICU_SRC_DIR

	45

	46

	47 ---------------------------------------------------------------------------- ***

	48

	49 Unicode 7.0 update for ICU 54

	50

	51 http://www.unicode.org/review/pri271/ -- beta review

	52 http://www.unicode.org/reports/uax-proposed-updates.html

	53 http://www.unicode.org/versions/beta-7.0.0.html#notable_issues

	54 http://www.unicode.org/reports/tr44/tr44-13.html

	55

	56 *** ICU Trac

	57

	58 - ticket 10821: Unicode 7.0, UCA 7.0

	59 - C++ branches/markus/uni70 at r35584 from trunk at r35580

	60 - Java branches/markus/uni70 at r35587 from trunk at r35545

	61

	62 *** CLDR Trac

	63

	64 - ticket 7195: UCA 7.0 CLDR root collation

	65 - branches/markus/uni70 at r10062 from trunk at r10061

	66

	67 - ticket 6762: script metadata for Unicode 7.0 new scripts

	68

	69 *** Unicode version numbers

	70 - makedata.mak

	71 - uchar.h

	72 - com.ibm.icu.util.VersionInfo

	73 - com.ibm.icu.dev.test.lang.UCharacterTest.VERSION_

	74

	75 - Run ICU4C "configure" _after_ updating the Unicode version number in uchar.h

	76 so that the makefiles see the new version number.

	77

	78 *** data files & enums & parser code

	79

	80 * file preparation

	81

	82 - download UCD & IDNA files

	83 - make sure that the Unicode data folder passed into preparseucd.py

	84 includes a copy of the latest IdnaMappingTable.txt (can be in some subfolder)

	85 - only for manual diffs: remove version suffixes from the file names

	86 ~/unidata/uni70/20140403$ ../../desuffixucd.py .

	87 (see https://sites.google.com/site/unicodetools/inputdata)

	88 - only for manual diffs: extract Unihan.zip to "here" (.../ucd/Unihan/*.txt), delete Unihan.zip

	89 - ~/svn.icutools/trunk/src/unicode$ py/preparseucd.py ~/unidata/uni70/20140403 $ICU_SRC_DIR ~/svn.icutools/trunk/src

	90 - This writes files (especially ppucd.txt) to the ICU4C unidata and testdata subfolders.

	91 - Restore TODO diffs in source/data/unidata/UCARules.txt

	92 cd $ICU_SRC_DIR

	93 meld ../../trunk/src/source/data/unidata/UCARules.txt source/data/unidata/UCARules.txt

	94 - Restore ICU patches for ticket #10176 in source/test/testdata/LineBreakTest.txt

	95

	96 - also: from http://unicode.org/Public/security/7.0.0/ download new

	97 confusables.txt & confusablesWholeScript.txt

	98 and copy to $ICU_ROOT/src/source/data/unidata/
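The "remove version suffixes from the file names" step above could look roughly like the sketch below. This is not the actual desuffixucd.py. It assumes versioned names of the form `Name-7.0.0d1.txt` (the suffix pattern is an assumption) and only computes the new name, without touching the file system. The function name `desuffix` is hypothetical.

```python
import re

# Assumed suffix shape: "-<major>.<minor>.<update>" plus an optional draft
# marker such as "d1", directly before the file extension.
_VERSIONED = re.compile(
    r"(?P<stem>.+?)-\d+(?:\.\d+)*(?:d\d+)?(?P<ext>\.[A-Za-z0-9]+)"
)


def desuffix(file_name: str) -> str:
    """Return file_name without a version suffix; unversioned names pass through."""
    match = _VERSIONED.fullmatch(file_name)
    if match is None:
        return file_name
    return match.group("stem") + match.group("ext")


for name in ["UnicodeData-7.0.0d1.txt", "emoji-data-7.0.0.txt", "ReadMe.txt"]:
    print(name, "->", desuffix(name))
```

Keeping the renaming logic a pure function makes it easy to check the mapping before any files are actually renamed.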

	99

	100 * initial preparseucd.py changes

	101 - remove new Unicode scripts from the

	102 only-in-ISO-15924 list according to the error message:

	103 ValueError: remove ['Hmng', 'Lina', 'Perm', 'Mani', 'Phlp', 'Bass',

	104 'Dupl', 'Elba', 'Gran', 'Mend', 'Narb', 'Nbat', 'Palm',

	105 'Sind', 'Wara', 'Mroo', 'Khoj', 'Tirh', 'Aghb', 'Mahj']

	106 from _scripts_only_in_iso15924

	107 -> fix expectedLong names in cucdapi.c/TestUScriptCodeAPI()

	108 and in com.ibm.icu.dev.test.lang.TestUScript.java

	109 - NamesList.txt now has a heading with a non-ASCII character

	110 + keep ppucd.txt in platform charset, rather than changing tool/test parsers

	111 + escape non-ASCII characters in heading comments

	112 - gets Unicode copyright line from PropertyAliases.txt which is currently still at 2013

	113 + get the copyright from the first file whose copyright line contains the current year
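The copyright fix described above (use the first file whose copyright line mentions the current year) can be sketched as follows. This is an illustration under assumptions, not the preparseucd.py code: the helper names and the sample file contents are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def find_copyright_line(text: str) -> Optional[str]:
    """Return the first line of text that mentions 'Copyright', if any."""
    for line in text.splitlines():
        if "Copyright" in line:
            return line.strip()
    return None


def pick_copyright(files: Iterable[Tuple[str, str]], year: int) -> Optional[Tuple[str, str]]:
    """Return (file name, copyright line) for the first file whose
    copyright line contains the given year, or None if no file qualifies."""
    for name, text in files:
        line = find_copyright_line(text)
        if line is not None and str(year) in line:
            return name, line
    return None


# Hypothetical file contents for illustration only.
sample_files = [
    ("PropertyAliases.txt", "# Copyright (c) 1991-2013 Example\n# data...\n"),
    ("Blocks.txt", "# Copyright (c) 1991-2014 Example\n# data...\n"),
]
chosen = pick_copyright(sample_files, 2014)
print(chosen)
```

With this rule a stale header in one file (here the 2013 line) no longer determines the copyright year written to the output.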

	114

	115 * PropertyValueAliases.txt changes

	116 - 32 new Block (blk) values:

	117 blk; Bassa_Vah ; Bassa_Vah

	118 blk; Caucasian_Albanian ; Caucasian_Albanian

	119 blk; Coptic_Epact_Numbers ; Coptic_Epact_Numbers

	120 blk; Diacriticals_Ext ; Combining_Diacritical_Marks_Extended

	121 blk; Duployan ; Duployan

	122 blk; Elbasan ; Elbasan

	123 blk; Geometric_Shapes_Ext ; Geometric_Shapes_Extended

	124 blk; Grantha ; Grantha

	125 blk; Khojki ; Khojki

	126 blk; Khudawadi ; Khudawadi

	127 blk; Latin_Ext_E ; Latin_Extended_E

	128 blk; Linear_A ; Linear_A

	129 blk; Mahajani ; Mahajani

	130 blk; Manichaean ; Manichaean

	131 blk; Mende_Kikakui ; Mende_Kikakui

	132 blk; Modi ; Modi

	133 blk; Mro ; Mro

	134 blk; Myanmar_Ext_B ; Myanmar_Extended_B

	135 blk; Nabataean ; Nabataean

	136 blk; Old_North_Arabian ; Old_North_Arabian

	137 blk; Old_Permic ; Old_Permic

	138 blk; Ornamental_Dingbats ; Ornamental_Dingbats

	139 blk; Pahawh_Hmong ; Pahawh_Hmong

	140 blk; Palmyrene ; Palmyrene

	141 blk; Pau_Cin_Hau ; Pau_Cin_Hau

	142 blk; Psalter_Pahlavi ; Psalter_Pahlavi

	143 blk; Shorthand_Format_Controls ; Shorthand_Format_Controls

	144 blk; Siddham ; Siddham

	145 blk; Sinhala_Archaic_Numbers ; Sinhala_Archaic_Numbers

	146 blk; Sup_Arrows_C ; Supplemental_Arrows_C

	147 blk; Tirhuta ; Tirhuta

	148 blk; Warang_Citi ; Warang_Citi

	149 -> add to uchar.h

	150 use long property names for enum constants

	151 -> add to UCharacter.UnicodeBlock IDs

	152 Eclipse find UBLOCK_([^ ]+) = ([0-9]+), (/.+)

	153 replace public static final int \1_ID = \2; \3

	154 -> add to UCharacter.UnicodeBlock objects

	155 Eclipse find UBLOCK_([^ ]+) = [0-9]+, (/.+)

	156 replace public static final UnicodeBlock \1 = new UnicodeBlock("\1", \1_ID); \2

	157 - 28 new Joining_Group (jg) values:

	158 jg ; Manichaean_Aleph ; Manichaean_Aleph

	159 jg ; Manichaean_Ayin ; Manichaean_Ayin

	160 jg ; Manichaean_Beth ; Manichaean_Beth

	161 jg ; Manichaean_Daleth ; Manichaean_Daleth

	162 jg ; Manichaean_Dhamedh ; Manichaean_Dhamedh

	163 jg ; Manichaean_Five ; Manichaean_Five

	164 jg ; Manichaean_Gimel ; Manichaean_Gimel

	165 jg ; Manichaean_Heth ; Manichaean_Heth

	166 jg ; Manichaean_Hundred ; Manichaean_Hundred

	167 jg ; Manichaean_Kaph ; Manichaean_Kaph

	168 jg ; Manichaean_Lamedh ; Manichaean_Lamedh

	169 jg ; Manichaean_Mem ; Manichaean_Mem

	170 jg ; Manichaean_Nun ; Manichaean_Nun

	171 jg ; Manichaean_One ; Manichaean_One

	172 jg ; Manichaean_Pe ; Manichaean_Pe

	173 jg ; Manichaean_Qoph ; Manichaean_Qoph

	174 jg ; Manichaean_Resh ; Manichaean_Resh

	175 jg ; Manichaean_Sadhe ; Manichaean_Sadhe

	176 jg ; Manichaean_Samekh ; Manichaean_Samekh

	177 jg ; Manichaean_Taw ; Manichaean_Taw

	178 jg ; Manichaean_Ten ; Manichaean_Ten

	179 jg ; Manichaean_Teth ; Manichaean_Teth

	180 jg ; Manichaean_Thamedh ; Manichaean_Thamedh

	181 jg ; Manichaean_Twenty ; Manichaean_Twenty

	182 jg ; Manichaean_Waw ; Manichaean_Waw

	183 jg ; Manichaean_Yodh ; Manichaean_Yodh

	184 jg ; Manichaean_Zayin ; Manichaean_Zayin

	185 jg ; Straight_Waw ; Straight_Waw

	186 -> uchar.h & UCharacter.JoiningGroup

	187 - 23 new Script (sc) values:

	188 sc ; Aghb ; Caucasian_Albanian

	189 sc ; Bass ; Bassa_Vah

	190 sc ; Dupl ; Duployan

	191 sc ; Elba ; Elbasan

	192 sc ; Gran ; Grantha

	193 sc ; Hmng ; Pahawh_Hmong

	194 sc ; Khoj ; Khojki

	195 sc ; Lina ; Linear_A

	196 sc ; Mahj ; Mahajani

	197 sc ; Mani ; Manichaean

	198 sc ; Mend ; Mende_Kikakui

	199 sc ; Modi ; Modi

	200 sc ; Mroo ; Mro

	201 sc ; Narb ; Old_North_Arabian

	202 sc ; Nbat ; Nabataean

	203 sc ; Palm ; Palmyrene

	204 sc ; Pauc ; Pau_Cin_Hau

	205 sc ; Perm ; Old_Permic

	206 sc ; Phlp ; Psalter_Pahlavi

	207 sc ; Sidd ; Siddham

	208 sc ; Sind ; Khudawadi

	209 sc ; Tirh ; Tirhuta

	210 sc ; Wara ; Warang_Citi

	211 -> uscript.h (many were added before)

	212 comment "Mende Kikakui" for USCRIPT_MENDE

	213 add USCRIPT_KHUDAWADI, make USCRIPT_SINDHI an alias

	214 -> com.ibm.icu.lang.UScript

	215 find USCRIPT_([^ ]+) *= ([0-9]+),(.+)

	216 replace public static final int \1 = \2; \3

	217 - 6 new script codes from ISO 15924 http://www.unicode.org/iso15924/codechanges.html

	218 (added 2012-11-01)

	219 Ahom 338 Ahom

	220 Hatr 127 Hatran

	221 Mult 323 Multani

	222 (added 2013-10-12)

	223 Modi 324 Modi

	224 Pauc 263 Pau Cin Hau

	225 Sidd 302 Siddham

	226 -> uscript.h (some overlap with additions from Unicode)

	227 -> com.ibm.icu.lang.UScript

	228 find USCRIPT_([^ ]+) *= ([0-9]+),(.+)

	229 replace public static final int \1 = \2; \3

	230 -> add Ahom, Hatr, Mult to preparseucd.py _scripts_only_in_iso15924

	231 -> add to expectedLong and expectedShort names in cintltst/cucdapi.c/TestUScriptCodeAPI()

	232 and in com.ibm.icu.dev.test.lang.TestUScript.java
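The Eclipse find/replace patterns listed in this section can be tried outside the IDE. The sketch below applies the same patterns with Python's `re.sub`. The input lines are made-up placeholders (the constant names and numeric values are hypothetical, not taken from uchar.h or uscript.h).

```python
import re

# Patterns and replacement templates as given in the notes above.
BLOCK_ID = (r"UBLOCK_([^ ]+) = ([0-9]+), (/.+)",
            r"public static final int \1_ID = \2; \3")
BLOCK_OBJECT = (r"UBLOCK_([^ ]+) = [0-9]+, (/.+)",
                r'public static final UnicodeBlock \1 = new UnicodeBlock("\1", \1_ID); \2')
SCRIPT = (r"USCRIPT_([^ ]+) *= ([0-9]+),(.+)",
          r"public static final int \1 = \2; \3")


def convert(rule, c_line: str) -> str:
    """Apply one (pattern, replacement) rule to a single C enum line."""
    pattern, replacement = rule
    return re.sub(pattern, replacement, c_line)


# Hypothetical C enum lines, for illustration only.
block_line = "UBLOCK_EXAMPLE_BLOCK = 999, /*[0000]*/"
script_line = "USCRIPT_EXAMPLE = 999,/* Exmp */"

block_id = convert(BLOCK_ID, block_line)
block_object = convert(BLOCK_OBJECT, block_line)
script_const = convert(SCRIPT, script_line)

print(block_id)
print(block_object)
print(script_const)
```

Running the patterns on a couple of sample lines first is a cheap way to catch a mistyped group reference before doing the bulk replacement.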

	233

	234 * update Script metadata: SCRIPT_PROPS[] in uscript_props.cpp & UScript.ScriptMetadata

	235 (not strictly necessary for NOT_ENCODED scripts)

	236 ~/svn.icutools/trunk/src/unicode$ py/parsescriptmetadata.py $ICU_SRC_DIR/source/common/unicode/uscript.h ~/svn.cldr/trunk/common/properties/scriptMetadata.txt

	237

	238 * generate normalization data files

	239 - cd $ICU_ROOT/dbg

	240 - export LD_LIBRARY_PATH=$ICU_ROOT/dbg/lib

	241 - SRC_DATA_IN=$ICU_SRC_DIR/source/data/in

	242 - UNIDATA=$ICU_SRC_DIR/source/data/unidata

	243 - bin/gennorm2 -o $ICU_SRC_DIR/source/common/norm2_nfc_data.h -s $UNIDATA/norm2 nfc.txt --csource

	244 - bin/gennorm2 -o $SRC_DATA_IN/nfc.nrm -s $UNIDATA/norm2 nfc.txt

	245 - bin/gennorm2 -o $SRC_DATA_IN/nfkc.nrm -s $UNIDATA/norm2 nfc.txt nfkc.txt

	246 - bin/gennorm2 -o $SRC_DATA_IN/nfkc_cf.nrm -s $UNIDATA/norm2 nfc.txt nfkc.txt nfkc_cf.txt

	247 - bin/gennorm2 -o $SRC_DATA_IN/uts46.nrm -s $UNIDATA/norm2 nfc.txt uts46.txt
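The five gennorm2 invocations above follow one pattern: an output file, the shared `-s` source directory, and a growing list of input files. A minimal sketch that only builds the command lines (it does not run them) is shown below. The function name and the placeholder paths are hypothetical, and only the options that appear in the notes (`-o`, `-s`, `--csource`) are used.

```python
import shlex
from typing import List


def gennorm2_commands(icu_src_dir: str) -> List[List[str]]:
    """Build the gennorm2 argument lists listed in the notes above."""
    data_in = f"{icu_src_dir}/source/data/in"       # SRC_DATA_IN
    unidata = f"{icu_src_dir}/source/data/unidata"  # UNIDATA
    jobs = [
        # (output file, input files, emit C source?)
        (f"{icu_src_dir}/source/common/norm2_nfc_data.h", ["nfc.txt"], True),
        (f"{data_in}/nfc.nrm", ["nfc.txt"], False),
        (f"{data_in}/nfkc.nrm", ["nfc.txt", "nfkc.txt"], False),
        (f"{data_in}/nfkc_cf.nrm", ["nfc.txt", "nfkc.txt", "nfkc_cf.txt"], False),
        (f"{data_in}/uts46.nrm", ["nfc.txt", "uts46.txt"], False),
    ]
    commands = []
    for output, inputs, csource in jobs:
        command = ["bin/gennorm2", "-o", output, "-s", f"{unidata}/norm2", *inputs]
        if csource:
            command.append("--csource")
        commands.append(command)
    return commands


# "/path/to/icu/src" is a placeholder, not a real location.
commands = gennorm2_commands("/path/to/icu/src")
for command in commands:
    print(shlex.join(command))
```

Note that every data file builds on nfc.txt, so the NFC input has to be current before any of the other files are regenerated.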

	248

	249 * build ICU (make install)

	250 so that the tools build can pick up the new definitions from the installed header files.

	251

	252 ~/svn.icu/uni70/dbg$ echo;echo;make -j5 install > out.txt 2>&1 ; tail -n 20 out.txt

	253

	254 * build Unicode tools using CMake+make

	255

	256 ~/svn.icutools/trunk/src/unicode/c/icudefs.txt:

	257

	258 # Location (--prefix) of where ICU was installed.

	259 set(ICU_INST_DIR /home/mscherer/svn.icu/uni70/inst)

	260 # Location of the ICU source tree.

	261 set(ICU_SRC_DIR /home/mscherer/svn.icu/uni70/src)

	262

	263 ~/svn.icutools/trunk/dbg/unicode/c$ cmake ../../../src/unicode/c

	264 ~/svn.icutools/trunk/dbg/unicode/c$ make

	265

	266 * genprops work

	267 - new code point range for Joining_Group values: 10AC0..10AFF Manichaean

	268 + add second array of Joining_Group values for at most 10800..10FFF

	269 icutools: unicode/c/genprops/bidipropsbuilder.cpp

	270 icu: source/common/ubidi_props.h/.c/_data.h

	271 icu4j: main/classes/core/src/com/ibm/icu/impl/UBiDiProps.java
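The "second array" idea can be illustrated with a small lookup over two compact ranges instead of one table spanning everything in between. This is a sketch under stated assumptions, not the ubidi_props code: the bounds of the first range and all stored values are hypothetical, and only the Manichaean range 10AC0..10AFF comes from the note above.

```python
NO_JOINING_GROUP = 0

# First range: hypothetical bounds for the original (BMP) Joining_Group array.
JG_START, JG_LIMIT = 0x0620, 0x08C0
# Second range: the Manichaean block named in the note (limit is exclusive).
JG_START2, JG_LIMIT2 = 0x10AC0, 0x10B00

# One byte per code point in each range; values here are placeholders.
jg_array = bytearray(JG_LIMIT - JG_START)
jg_array2 = bytearray(JG_LIMIT2 - JG_START2)
jg_array[0x0628 - JG_START] = 2      # hypothetical group value
jg_array2[0x10AC0 - JG_START2] = 86  # hypothetical group value


def joining_group(code_point: int) -> int:
    """Look up a Joining_Group value using two separate compact ranges."""
    if JG_START <= code_point < JG_LIMIT:
        return jg_array[code_point - JG_START]
    if JG_START2 <= code_point < JG_LIMIT2:
        return jg_array2[code_point - JG_START2]
    return NO_JOINING_GROUP


print(joining_group(0x0628), joining_group(0x10AC0), joining_group(0x0041))
```

Two small arrays cost a second range check per lookup but avoid storing entries for the large gap between the two ranges.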

	272

	273 * generate core properties data files

	274 - ~/svn.icutools/trunk/dbg/unicode/c$ genprops/genprops $ICU_SRC_DIR

	275 - ~/svn.icutools/trunk/dbg/unicode/c$ genuca/genuca $ICU_SRC_DIR

	276 - rebuild ICU (make install) & tools

	277 - run genuca again (see step above) so that it picks up the new nfc.nrm

	278 - rebuild ICU (make install) & tools

	279

	280 * update uts46test.cpp and UTS46Test.java if there are new characters that are equivalent to

	281 sequences with non-LDH ASCII (that is, their decompositions contain '=' or similar)

	282 - grep IdnaMappingTable.txt or uts46.txt for "disallowed_STD3_valid" on non-ASCII characters

	283 - Unicode 6.0..7.0: U+2260, U+226E, U+226F

	284 - nothing new in 7.0, no test file to update
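The grep step above can also be done with a short script that expands ranges and filters out ASCII. The sketch below assumes the usual `code point or range ; status ; ... # comment` line layout of IdnaMappingTable.txt. The sample lines are illustrative stand-ins, not copied from the real file, and the function name is hypothetical.

```python
from typing import Iterable, List


def non_ascii_std3_valid(lines: Iterable[str]) -> List[int]:
    """Return non-ASCII code points whose status is disallowed_STD3_valid."""
    found = []
    for line in lines:
        data = line.split("#", 1)[0].strip()  # drop trailing comment
        if not data:
            continue
        fields = [field.strip() for field in data.split(";")]
        if len(fields) < 2 or fields[1] != "disallowed_STD3_valid":
            continue
        start, _, end = fields[0].partition("..")
        first = int(start, 16)
        last = int(end, 16) if end else first
        found.extend(c for c in range(first, last + 1) if c > 0x7F)
    return found


# Illustrative sample lines in the assumed layout.
SAMPLE = """\
# comment-only line
003D          ; disallowed_STD3_valid   # sample: ASCII, filtered out
00C0          ; mapped ; 00E0           # sample: different status
2260          ; disallowed_STD3_valid   # sample
226E..226F    ; disallowed_STD3_valid   # sample range
"""

hits = non_ascii_std3_valid(SAMPLE.splitlines())
print(" ".join(f"U+{c:04X}" for c in hits))
```

Comparing the result for a new Unicode version against the previous list (U+2260, U+226E, U+226F) shows directly whether the test files need new cases.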

	285

	286 * run & fix ICU4C tests

	287

	288 * update Java data files

	289 - refresh just the UCD-related files, just to be safe

	290 - see (ICU4C)/source/data/icu4j-readme.txt

	291 - mkdir /tmp/icu4j

	292 - ~/svn.icu/uni70/dbg$ make ICU4J_ROOT=/tmp/icu4j icu4j-data-install

	293 output:

	294 ...

	295 Unicode .icu files built to ./out/build/icudt53l

	296 echo timestamp > uni-core-data

	297 mkdir -p ./out/icu4j/com/ibm/icu/impl/data/icudt53b

	298 mkdir -p ./out/icu4j/tzdata/com/ibm/icu/impl/data/icudt53b

	299 echo pnames.icu ubidi.icu ucase.icu uprops.icu > ./out/icu4j/add.txt

	300 LD_LIBRARY_PATH=../lib:../stubdata:../tools/ctestfw:$LD_LIBRARY_PATH ../bin/icupkg ./out/tmp/icudt53l.dat ./out/icu4j/icudt53b.dat -a ./out/icu4j/add.txt -s ./out/build/icudt53l -x '*' -tb -d ./out/icu4j/com/ibm/icu/impl/data/icudt53b

	301 mv ./out/icu4j/"com/ibm/icu/impl/data/icudt53b/zoneinfo64.res" ./out/icu4j/"com/ibm/icu/impl/data/icudt53b/metaZones.res" ./out/icu4j/"com/ibm/icu/impl/data/icudt53b/timezoneTypes.res" ./out/icu4j/"com/ibm/icu/impl/data/icudt53b/windowsZones.res" "./out/icu4j/tzdata/com/ibm/icu/impl/data/icudt53b"

	302 jar cf ./out/icu4j/icudata.jar -C ./out/icu4j com/ibm/icu/impl/data/icudt53b/

	303 mkdir -p /tmp/icu4j/main/shared/data

	304 cp ./out/icu4j/icudata.jar /tmp/icu4j/main/shared/data

	305 jar cf ./out/icu4j/icutzdata.jar -C ./out/icu4j/tzdata com/ibm/icu/impl/data/icudt53b/

	306 mkdir -p /tmp/icu4j/main/shared/data

	307 cp ./out/icu4j/icutzdata.jar /tmp/icu4j/main/shared/data

	308 make[1]: Leaving directory `/home/mscherer/svn.icu/uni70/dbg/data'

	309 - copy the big-endian Unicode data files to another location,

	310 separate from the other data files

	311 ICUDT=icudt54b

	312 mkdir -p /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/coll

	313 mkdir -p /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/brkitr

	314 cd ~/svn.icu/uni70/dbg/data/out/icu4j

	315 cp com/ibm/icu/impl/data/$ICUDT/confusables.cfu /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT

	316 cp com/ibm/icu/impl/data/$ICUDT/*.icu /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT

	317 rm /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/cnvalias.icu

	318 cp com/ibm/icu/impl/data/$ICUDT/*.nrm /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT

	319 cp com/ibm/icu/impl/data/$ICUDT/coll/*.icu /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/coll

	320 cp com/ibm/icu/impl/data/$ICUDT/brkitr/* /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/brkitr

	321 - refresh ICU4J

	322 ~/svn.icu/uni70/dbg/data/out/icu4j$ jar uf ~/svn.icu4j/trunk/src/main/shared/data/icudata.jar -C /tmp/icu4j com/ibm/icu/impl/data/$ICUDT

	323

	324 * update CollationFCD.java

	325 + copy & paste the initializers of lcccIndex[] etc. from

	326 ICU4C/source/i18n/collationfcd.cpp to

	327 ICU4J/main/classes/collate/src/com/ibm/icu/impl/coll/CollationFCD.java

	328

	329 * refresh Java test .txt files

	330 - copy new .txt files into ICU4J's main/tests/core/src/com/ibm/icu/dev/data/unicode

	331 cd $ICU_SRC_DIR/source/data/unidata

	332 cp confusables.txt confusablesWholeScript.txt NormalizationCorrections.txt NormalizationTest.txt SpecialCasing.txt UnicodeData.txt ~/svn.icu4j/trunk/src/main/tests/core/src/com/ibm/icu/dev/data/unicode

	333 cd ../../test/testdata

	334 cp BidiCharacterTest.txt BidiTest.txt ~/svn.icu4j/trunk/src/main/tests/core/src/com/ibm/icu/dev/data/unicode

	335 cp ~/unidata/uni70/20140409/ucd/CompositionExclusions.txt ~/svn.icu4j/trunk/src/main/tests/core/src/com/ibm/icu/dev/data/unicode

	336

	337 * UCA

	338

	339 - download UCA files (mostly allkeys.txt) from http://www.unicode.org/Public/UCA/<beta version>/

	340 - run desuffixucd.py (see https://sites.google.com/site/unicodetools/inputdata)

	341 - update the input files for Mark's UCA tools, in ~/svn.unitools/trunk/data/uca/7.0.0/

	342 - run Mark's UCA Main: https://sites.google.com/site/unicodetools/home#TOC-UCA

	343 - output files are in ~/svn.unitools/Generated/uca/7.0.0/

	344 - review data; compare files, use blankweights.sed or similar

	345 ~/svn.unitools$ sed -r -f blankweights.sed Generated/uca/7.0.0/CollationAuxiliary/FractionalUCA.txt > frac-7.0.txt

	346 - cd ~/svn.unitools/Generated/uca/7.0.0/

	347 - update source/data/unidata/FractionalUCA.txt with FractionalUCA_SHORT.txt

	348 cp CollationAuxiliary/FractionalUCA_SHORT.txt $ICU_SRC_DIR/source/data/unidata/FractionalUCA.txt

	349 - update source/data/unidata/UCARules.txt with UCA_Rules_SHORT.txt

	350 (note removing the underscore before "Rules")

	351 cp CollationAuxiliary/UCA_Rules_SHORT.txt $ICU_SRC_DIR/source/data/unidata/UCARules.txt

	352 - update (ICU4C)/source/test/testdata/CollationTest_*.txt

	353 and (ICU4J)/main/tests/collate/src/com/ibm/icu/dev/data/CollationTest_*.txt

	354 with output from Mark's Unicode tools (..._CLDR_..._SHORT.txt)

	355 cp CollationAuxiliary/CollationTest_CLDR_NON_IGNORABLE_SHORT.txt $ICU_SRC_DIR/source/test/testdata/CollationTest_NON_IGNORABLE_SHORT.txt

	356 cp CollationAuxiliary/CollationTest_CLDR_SHIFTED_SHORT.txt $ICU_SRC_DIR/source/test/testdata/CollationTest_SHIFTED_SHORT.txt

	357 cp $ICU_SRC_DIR/source/test/testdata/CollationTest_*.txt ~/svn.icu4j/trunk/src/main/tests/collate/src/com/ibm/icu/dev/data

	358 - run genuca, see command line above

	359 - rebuild ICU4C

	360 - refresh ICU4J collation data:

	361 (subset of instructions above for properties data refresh, except copies all coll/*)

	362 ICUDT=icudt54b

	363 ~/svn.icu/uni70/dbg$ make ICU4J_ROOT=/tmp/icu4j icu4j-data-install

	364 ~/svn.icu/uni70/dbg$ mkdir -p /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/coll

	365 ~/svn.icu/uni70/dbg/data/out/icu4j$ cp com/ibm/icu/impl/data/$ICUDT/coll/* /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/coll

	366 ~/svn.icu/uni70/dbg/data/out/icu4j$ jar uf ~/svn.icu4j/trunk/src/main/shared/data/icudata.jar -C /tmp/icu4j com/ibm/icu/impl/data/$ICUDT

	367 - run all tests with the *_SHORT.txt or the full files (the full ones have comments, useful for debugging)

	368 - note on intltest: if collate/UCAConformanceTest fails, then

	369 utility/MultithreadTest/TestCollators will fail as well;

	370 fix the conformance test before looking into the multi-thread test

	371 - copy all output from Mark's UCA tool to unicode.org for review & staging by Ken & editors

	372 - copy most of ~/svn.unitools/Generated/uca/7.0.0/CollationAuxiliary/* to CLDR branch

	373 ~/svn.unitools$ cp Generated/uca/7.0.0/CollationAuxiliary/* ~/svn.cldr/trunk/common/uca/

	374

	375 * When refreshing all of ICU4J data from ICU4C

	376 - ~/svn.icu/uni70/dbg$ make ICU4J_ROOT=/tmp/icu4j icu4j-data-install

	377 - cp /tmp/icu4j/main/shared/data/icudata.jar ~/svn.icu4j/trunk/src/main/shared/data

	378 or

	379 - ~/svn.icu/uni70/dbg$ make ICU4J_ROOT=~/svn.icu4j/trunk/src icu4j-data-install

	380

	381 * run & fix ICU4J tests

	382

	383 *** LayoutEngine script information

	384

	385 (For details see the Unicode 5.2 change log below.)

	386

	387 * Run icu4j-tools: com.ibm.icu.dev.tool.layout.ScriptNameBuilder.

	388 This generates LEScripts.h, LELanguages.h, ScriptAndLanguageTags.h and ScriptAndLanguageTags.cpp

	389 in the working directory.

	390 (It also generates ScriptRunData.cpp, which is no longer needed.)

	391

	392 The generated files have a current copyright date and "@stable" statement.

	393 ICU 54: Fixed tools/misc/src/com/ibm/icu/dev/tool/layout/ScriptIDModuleWriter.java

	394 for "born stable" Unicode API constants, and to stop parsing ICU version numbers

	395 which may not contain dots any more.

	396

	397 - diff current <icu>/source/layout files vs. generated ones

	398 ~/svn.icu4j/trunk/src$ meld $ICU_SRC_DIR/source/layout tools/misc/src/com/ibm/icu/dev/tool/layout

	399 review and manually merge desired changes;

	400 fix gratuitous changes, incorrect @draft/@stable and missing aliases;

	401 Unicode-derived script codes should be "born stable" like constants in uchar.h, uscript.h etc.

	402 - if you just copy the above files, then

	403 fix mixed line endings, review the diffs as above and restore changes to API tags etc.;

	404 manually re-add the "Indic script xyz v.2" tags in ScriptAndLanguageTags.h

	405

	406 *** API additions

	407 - send notice to icu-design about new born-@stable API (enum constants etc.)

	408

	409 *** merge the Unicode update branches back onto the trunk

	410 - do not merge the icudata.jar and testdata.jar,

	411 instead rebuild them from merged & tested ICU4C

	412

	413 ---------------------------------------------------------------------------- ***

	414

16 Unicode 6.3 update	415 Unicode 6.3 update

17	416

18 http://www.unicode.org/review/pri249/ -- beta review	417 http://www.unicode.org/review/pri249/ -- beta review

19 http://www.unicode.org/reports/uax-proposed-updates.html	418 http://www.unicode.org/reports/uax-proposed-updates.html

20 http://www.unicode.org/versions/beta-6.3.0.html#notable_issues	419 http://www.unicode.org/versions/beta-6.3.0.html#notable_issues

21 http://www.unicode.org/reports/tr44/tr44-11.html	420 http://www.unicode.org/reports/tr44/tr44-11.html

22	421

23 *** ICU Trac	422 *** ICU Trac

24	423

25 - ticket 10128: update ICU to Unicode 6.3 beta	424 - ticket 10128: update ICU to Unicode 6.3 beta

(...skipping 1658 matching lines...)
1684	2083

1685 * name matching	2084 * name matching

1686 - read UCD.html	2085 - read UCD.html

1687	2086

1688 * scripts	2087 * scripts

1689 - use new Hrkt=Katakana_Or_Hiragana	2088 - use new Hrkt=Katakana_Or_Hiragana

1690	2089

1691 * ZWJ & ZWNJ	2090 * ZWJ & ZWNJ

1692 - are now part of combining character sequences	2091 - are now part of combining character sequences

1693 - break iteration used to assume that LB classes did not overlap; now they do for ZWJ & ZWNJ	2092 - break iteration used to assume that LB classes did not overlap; now they do for ZWJ & ZWNJ
