icu46/source/data/unidata/changes.txt - Issue 5516007: Check in the pristine copy of ICU 4.6...

Side by Side Diff: icu46/source/data/unidata/changes.txt

Issue 5516007: Check in the pristine copy of ICU 4.6... (Closed) Base URL: svn://chrome-svn/chrome/trunk/deps/third_party/

Patch Set: Created 10 years ago


Property Changes:

Added: svn:eol-style
+ LF

OLD	NEW
(Empty)
	1 * Copyright (C) 2004-2010, International Business Machines

	2 * Corporation and others. All Rights Reserved.

	3 *

	4 * file name: changes.txt

	5 * encoding: US-ASCII

	6 * tab size: 8 (not used)

	7 * indentation:4

	8 *

	9 * created on: 2004may06

	10 * created by: Markus W. Scherer

	11 *

	12 * change log for Unicode updates

	13

	14 ---------------------------------------------------------------------------- ***

	15

	16 Unicode 6.0 update

	17

	18 *** related ICU Trac tickets

	19

	20 7264 Unicode 6.0 Update

	21

	22 *** Unicode version numbers

	23 - makedata.mak

	24 - uchar.h

	25 (configure.in & configure: have been modified to extract the version from uchar.h)

	26 - com.ibm.icu.util.VersionInfo

	27

	28 *** data files & enums & parser code

	29

	30 * file preparation

	31

	32 ~/svn.icu/tools/trunk/src/unicode/c/genprops/misc$ ./ucdcopy.py ~/uni60/20100720/ucd ~/uni60/processed

	33 - This now prepares both unidata and testdata files in respective output subfolders.

	34

	35 * PropertyAliases.txt changes

	36 - new Script_Extensions property defined in the new ScriptExtensions.txt file

	37 but not listed in PropertyAliases.txt; reported to unicode.org;

	38 -> added to tools/trunk/src/unicode/c/genpname/SyntheticPropertyAliases.txt

	39 scx; Script_Extensions

	40 -> uchar.h with new UProperty section

	41 -> com.ibm.icu.lang.UProperty, parallel with uchar.h

	42

	43 * PropertyValueAliases.txt changes

	44 - 12 new block names:

	45 Alchemical_Symbols

	46 Bamum_Supplement

	47 Batak

	48 Brahmi

	49 CJK_Unified_Ideographs_Extension_D

	50 Emoticons

	51 Ethiopic_Extended_A

	52 Kana_Supplement

	53 Mandaic

	54 Miscellaneous_Symbols_And_Pictographs

	55 Playing_Cards

	56 Transport_And_Map_Symbols

	57 -> add to uchar.h

	58 -> add to UCharacter.UnicodeBlock

	59 Eclipse find UBLOCK_([^ ]+) = [0-9]+, (/.+)

	60 replace public static final UnicodeBlock \1 = new UnicodeBlock("\1", \1_ID); \2

	61 - Joining_Group (jg) values:

	62 Teh_Marbuta_Goal becomes the new canonical value for the old Hamza_On_Heh_Goal which becomes an alias

	63 -> uchar.h & UCharacter.JoiningGroup

	64 - 3 new scripts:

	65 sc ; Batk ; Batak

	66 sc ; Brah ; Brahmi

	67 sc ; Mand ; Mandaic

	68 -> remove these from SyntheticPropertyValueAliases.txt

	69 -> add alias USCRIPT_MANDAIC to USCRIPT_MANDAEAN

	70 -> fix expectedLong names in cucdapi.c/TestUScriptCodeAPI()

	71 and in com.ibm.icu.dev.test.lang.TestUScript.java

	72 - 13 new script codes from ISO 15924 http://www.unicode.org/iso15924/codechanges.html

	73 (added 2009-11-11..2010-07-18)

	74 Bass 259 Bassa Vah

	75 Dupl 755 Duployan shorthand

	76 Elba 226 Elbasan

	77 Gran 343 Grantha

	78 Kpel 436 Kpelle

	79 Loma 437 Loma

	80 Mend 438 Mende

	81 Merc 101 Meroitic Cursive

	82 Narb 106 Old North Arabian

	83 Nbat 159 Nabataean

	84 Palm 126 Palmyrene

	85 Sind 318 Sindhi

	86 Wara 262 Warang Citi

	87 -> uscript.h

	88 -> com.ibm.icu.lang.UScript

	89 find USCRIPT_([^ ]+) *= ([0-9]+),(.+)

	90 replace public static final int \1 = \2;\3

	91 -> SyntheticPropertyValueAliases.txt

	92 -> add to expectedLong and expectedShort names in cintltst/cucdapi.c/TestUScriptCodeAPI()

	93 and in com.ibm.icu.dev.test.lang.TestUScript.java

	94 - ISO 15924 name change

	95 Mero 100 Meroitic Hieroglyphs (was Meroitic)

	96 -> add new alias USCRIPT_MEROITIC_HIEROGLYPHS to USCRIPT_MEROITIC

	97 - property value alias added for Cham, was already moved out of SyntheticPropertyValueAliases.txt

	98
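The two find/replace recipes above (UBLOCK_ and USCRIPT_) can also be sketched outside an IDE. This is only an illustration in Python: the regular expressions are the ones from the notes, while the function names and the sample input lines are made up for the example and are not taken from uchar.h or uscript.h.

```python
import re

# Patterns copied from the notes above (Eclipse-style groups work as-is in Python).
UBLOCK_RE = re.compile(r"UBLOCK_([^ ]+) = [0-9]+, (/.+)")
USCRIPT_RE = re.compile(r"USCRIPT_([^ ]+) *= ([0-9]+),(.+)")


def block_to_java(line):
    """Turn a C UBLOCK_ enum line into a Java UnicodeBlock constant."""
    return UBLOCK_RE.sub(
        r'public static final UnicodeBlock \1 = new UnicodeBlock("\1", \1_ID); \2',
        line,
    )


def script_to_java(line):
    """Turn a C USCRIPT_ enum line into a Java int constant."""
    return USCRIPT_RE.sub(r"public static final int \1 = \2;\3", line)


# Hypothetical input lines, for illustration only.
c_block = "    UBLOCK_BATAK = 199, /*[1BC0]*/"
c_script = "      USCRIPT_BASSA_VAH = 134,/* Bass */"

print(block_to_java(c_block))
print(script_to_java(c_script))
```

The output still needs the manual review described in the notes (for example the _ID constants and aliases); the sketch only covers the mechanical text substitution.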

	99 * UnicodeData.txt changes

	100 - new CJK block:

	101 2B740;<CJK Ideograph Extension D, First>;Lo;0;L;;;;;N;;;;;

	102 2B81D;<CJK Ideograph Extension D, Last>;Lo;0;L;;;;;N;;;;;

	103 -> add to tools/trunk/src/unicode/c/gennames/gennames.c, with new ucdVersion

	104
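For reference, a minimal sketch of how the First/Last line pairs in UnicodeData.txt describe an algorithmic range. It is written in Python for illustration and is not the gennames.c code; the function name parse_ranges is hypothetical, and the two sample lines are the ones quoted above.

```python
def parse_ranges(lines):
    """Collect (label, start, end) for each <..., First>/<..., Last> pair."""
    ranges = []
    first = None
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) < 2:
            continue  # not a UnicodeData.txt record
        code, name = int(fields[0], 16), fields[1]
        if name.endswith(", First>"):
            # Remember the start and the label without "<" and ", First>".
            first = (code, name[1:-len(", First>")])
        elif name.endswith(", Last>") and first is not None:
            start, label = first
            ranges.append((label, start, code))
            first = None
    return ranges


sample = [
    "2B740;<CJK Ideograph Extension D, First>;Lo;0;L;;;;;N;;;;;",
    "2B81D;<CJK Ideograph Extension D, Last>;Lo;0;L;;;;;N;;;;;",
]

for label, start, end in parse_ranges(sample):
    print("%s: U+%04X..U+%04X (%d code points)" % (label, start, end, end - start + 1))
```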

	105 * build Unicode tools using CMake+make

	106

	107 * run genpname/preparse.pl (on Linux)

	108 + cd ~/svn.icu/tools/trunk/src/unicode/c/genpname

	109 + make sure that data.h is writable

	110 + perl preparse.pl ~/svn.icu/trunk/src > out.txt

	111 + preparse.pl shows no errors, out.txt Info and Warning lines look ok

	112

	113 * rebuild Unicode tools (at least genpname) using make

	114 - You might first need to "make install" ICU so that the tools build can pick

	115 up the new definitions from the installed header files.

	116

	117 * run genpname

	118 - ~/svn.icu/tools/trunk/bld/unicode$ c/genpname/genpname -v -d ~/svn.icu/trunk/src/source/data/in

	119 - rebuild ICU & tools

	120

	121 * update source/data/unidata/norm2/nfkc_cf.txt

	122 - follow the instructions in nfkc_cf.txt for updating it from DerivedNormalizationProps.txt

	123

	124 * update source/data/unidata/norm2/uts46.txt

	125 - download http://www.unicode.org/Public/idna/6.0.0/IdnaMappingTable.txt

	126 to ~/svn.icu/tools/trunk/src/unicode/py

	127 - adjust idna2nrm.py to handle new disallowed_STD3_valid and disallowed_STD3_mapped values

	128 - ~/svn.icu/tools/trunk/src/unicode/py$ ./idna2nrm.py

	129 - ~/svn.icu/tools/trunk/src/unicode/py$ cp uts46.txt ~/svn.icu/trunk/src/source/data/unidata/norm2

	130

	131 * update uts46test.cpp and UTS46Test.java if there are new characters that are equivalent to

	132 sequences with non-LDH ASCII (that is, their decompositions contain '=' or similar)

	133 - grep IdnaMappingTable.txt or uts46.txt for "disallowed_STD3_valid" on non-ASCII characters

	134 - Unicode 6.0: U+2260, U+226E, U+226F

	135
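The grep step above can be sketched as a small Python filter. This is an illustration, not part of idna2nrm.py: the function name is hypothetical, and the sample lines only imitate the "code point(s) ; status ; ... # comment" layout of IdnaMappingTable.txt rather than quoting the file.

```python
def non_ascii_std3_valid(lines):
    """Return (start, end) ranges above U+007F with status disallowed_STD3_valid."""
    hits = []
    for line in lines:
        data = line.split("#", 1)[0].strip()  # drop trailing comment
        if not data:
            continue
        fields = [f.strip() for f in data.split(";")]
        if len(fields) < 2 or fields[1] != "disallowed_STD3_valid":
            continue
        start, _, end = fields[0].partition("..")
        lo = int(start, 16)
        hi = int(end, 16) if end else lo
        if hi > 0x7F:
            hits.append((max(lo, 0x80), hi))
    return hits


# Illustrative lines in the assumed file layout.
sample = [
    "0000..002C    ; disallowed_STD3_valid   # <control-0000>..COMMA",
    "00A0          ; disallowed_STD3_mapped  ; 0020 # NO-BREAK SPACE",
    "2260          ; disallowed_STD3_valid   # NOT EQUAL TO",
    "226E..226F    ; disallowed_STD3_valid   # NOT LESS-THAN..NOT GREATER-THAN",
]

for lo, hi in non_ascii_std3_valid(sample):
    print("U+%04X..U+%04X" % (lo, hi))
```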

	136 * generate core properties data files

	137 - ~/svn.icu/tools/trunk/src/unicode$ ./makeprops.sh ~/svn.icu/trunk/src ~/svn.icu/trunk/bld

	138 - rebuild ICU & tools

	139 - run makeuca.sh so that genuca picks up the new nfc.nrm:

	140 ~/svn.icu/tools/trunk/src/unicode$ ./makeuca.sh ~/svn.icu/trunk/src ~/svn.icu/trunk/bld

	141 - rebuild ICU & tools

	142

	143 * implement new Script_Extensions property (provisional)

	144 - parser & generator: genprops & uprops.icu

	145 - uscript.h, uprops.h, uchar.c, uniset_props.cpp and others, plus cintltst/cucdapi.c & intltest/usettest.cpp

	146 - UScript.java, UCharacterProperty.java, UnicodeSet.java, TestUScript.java, UnicodeSetTest.java

	147

	148 * switch ubidi.icu, ucase.icu and uprops.icu from UTrie to UTrie2

	149 - (one-time change)

	150 - genbidi/gencase/genprops tools changes

	151 - re-run makeprops.sh (see above)

	152 - UCharacterProperty.java, UCharacterTypeIterator.java,

	153 UBiDiProps.java, UCaseProps.java, and several others with minor changes;

	154 UCharacterPropertyReader.java deleted and its code folded into UCharacterProperty.java

	155

	156 * update Java data files

	157 - refresh just the UCD-related files, just to be safe

	158 - see (ICU4C)/source/data/icu4j-readme.txt

	159 - mkdir /tmp/icu4j

	160 - ~/svn.icu/trunk/bld$ make ICU4J_ROOT=/tmp/icu4j icu4j-data-install

	161 output:

	162 ...

	163 Unicode .icu files built to ./out/build/icudt45l

	164 mkdir -p ./out/icu4j/com/ibm/icu/impl/data/icudt45b

	165 echo ubidi.icu ucase.icu uprops.icu > ./out/icu4j/add.txt

	166 LD_LIBRARY_PATH=../lib:../stubdata:../tools/ctestfw:$LD_LIBRARY_PATH ../bin/icupkg ./out/tmp/icudt45l.dat ./out/icu4j/icudt45b.dat -a ./out/icu4j/add.txt -s ./out/build/icudt45l -x '*' -tb -d ./out/icu4j/com/ibm/icu/impl/data/icudt45b

	167 jar cf ./out/icu4j/icudata.jar -C ./out/icu4j com/ibm/icu/impl/data/icudt45b

	168 mkdir -p /tmp/icu4j/main/shared/data

	169 cp ./out/icu4j/icudata.jar /tmp/icu4j/main/shared/data

	170 - copy the big-endian Unicode data files to another location,

	171 separate from the other data files

	172 mkdir -p /tmp/icu4j/com/ibm/icu/impl/data/icudt45b/coll

	173 mkdir -p /tmp/icu4j/com/ibm/icu/impl/data/icudt45b/brkitr

	174 ~/svn.icu/trunk/bld/data/out/icu4j$ cp com/ibm/icu/impl/data/icudt45b/*.icu /tmp/icu4j/com/ibm/icu/impl/data/icudt45b

	175 ~/svn.icu/trunk/bld/data/out/icu4j$ rm /tmp/icu4j/com/ibm/icu/impl/data/icudt45b/cnvalias.icu

	176 ~/svn.icu/trunk/bld/data/out/icu4j$ cp com/ibm/icu/impl/data/icudt45b/*.nrm /tmp/icu4j/com/ibm/icu/impl/data/icudt45b

	177 ~/svn.icu/trunk/bld/data/out/icu4j$ cp com/ibm/icu/impl/data/icudt45b/coll/*.icu /tmp/icu4j/com/ibm/icu/impl/data/icudt45b/coll

	178 ~/svn.icu/trunk/bld/data/out/icu4j$ cp com/ibm/icu/impl/data/icudt45b/brkitr/* /tmp/icu4j/com/ibm/icu/impl/data/icudt45b/brkitr

	179 - refresh ICU4J

	180 ~/svn.icu/trunk/bld/data/out/icu4j$ jar uf ~/svn.icu4j/trunk/src/main/shared/data/icudata.jar -C /tmp/icu4j com/ibm/icu/impl/data/icudt45b

	181

	182 * refresh Java test .txt files

	183 - copy new .txt files into ICU4J's main/tests/core/src/com/ibm/icu/dev/data/unicode

	184

	185 * un-hardcode normalization skippable (NF*_Inert) test data

	186 - removes one manual step from the Unicode upgrade, and removes dependency on one of Mark's tools

	187

	188 * copy updated break iterator test files

	189 - now handled by early ucdcopy.py and

	190 copying the uni60/processed/testdata files to ~/svn.icu/trunk/src/source/test/testdata

	191 (old instructions:

	192 copy from (Unicode 6.0)/ucd/auxiliary/*BreakTest-6....txt

	193 to ~/svn.icu/trunk/src/source/test/testdata)

	194 - they are not used in ICU4J

	195

	196 * UCA

	197

	198 - get output from Mark's tools; look in

	199 http://www.unicode.org/~book/incoming/mark/uca6.0.0/

	200 http://www.macchiato.com/unicode/utc/additional-uca-files

	201 http://www.unicode.org/Public/UCA/6.0.0/

	202 http://www.unicode.org/~mdavis/uca/

	203 - update source/data/unidata/FractionalUCA.txt with FractionalUCA_SHORT.txt

	204 - update source/data/unidata/UCARules.txt with UCA_Rules_SHORT.txt

	205 - update Han-implicit ranges for new CJK extensions:

	206 swapCJK() in ucol.cpp & ImplicitCEGenerator.java

	207 - genuca: allow bytes 02 for U+FFFE, new merge-sort character;

	208 do not add it into invuca so that tailoring primary-after an ignorable works

	209 - genuca: permit space between [variable top] bytes

	210 - ucol.cpp: treat noncharacters like unassigned rather than ignorable

	211 - run makeuca.sh:

	212 ~/svn.icu/tools/trunk/src/unicode$ ./makeuca.sh ~/svn.icu/trunk/src ~/svn.icu/trunk/bld

	213 - rebuild ICU4C

	214 - refresh ICU4J collation data:

	215 (subset of instructions above for properties data refresh, except copies all coll/*)

	216 ~/svn.icu/trunk/bld$ make ICU4J_ROOT=/tmp/icu4j icu4j-data-install

	217 mkdir -p /tmp/icu4j/com/ibm/icu/impl/data/icudt45b/coll

	218 ~/svn.icu/trunk/bld/data/out/icu4j$ cp com/ibm/icu/impl/data/icudt45b/coll/* /tmp/icu4j/com/ibm/icu/impl/data/icudt45b/coll

	219 ~/svn.icu/trunk/bld/data/out/icu4j$ jar uf ~/svn.icu4j/trunk/src/main/shared/data/icudata.jar -C /tmp/icu4j com/ibm/icu/impl/data/icudt45b

	220 - update (ICU)/source/test/testdata/CollationTest_*.txt

	221 and (ICU4J)/main/tests/collate/src/com/ibm/icu/dev/data/CollationTest_*.txt

	222 with output from Mark's Unicode tools

	223 - run all tests with the *_SHORT.txt or the full files (the full ones have comments)

	224 - note on intltest: if collate/UCAConformanceTest fails, then

	225 utility/MultithreadTest/TestCollators will fail as well;

	226 fix the conformance test before looking into the multi-thread test

	227

	228 * When refreshing all of ICU4J data from ICU4C

	229 - ~/svn.icu/trunk/bld$ make ICU4J_ROOT=/tmp/icu4j icu4j-data-install

	230 - cp /tmp/icu4j/main/shared/data/icudata.jar ~/svn.icu4j/trunk/src/main/shared/data

	231 or

	232 - ~/svn.icu/trunk/bld$ make ICU4J_ROOT=~/svn.icu4j/trunk/src icu4j-data-install

	233

	234 *** LayoutEngine script information

	235

	236 (For details see the Unicode 5.2 change log below.)

	237

	238 * Run ICU4J com.ibm.icu.dev.tool.layout.ScriptNameBuilder. This generates LEScripts.h, LELanguages.h,

	239 ScriptAndLanguageTags.h and ScriptAndLanguageTags.cpp in the working directory. (It also generates

	240 ScriptRunData.cpp, which is no longer needed.)

	241

	242 The generated files have a current copyright date and "@draft" statement.

	243

	244 * copy the above files into <icu>/source/layout, replacing the old files.

	245 * fix mixed line endings

	246 * review the diffs and fix incorrect @draft and missing aliases;

	247 Unicode-derived script codes should be "born stable" like constants in uchar.h, uscript.h etc.

	248 * manually re-add the "Indic script xyz v.2" tags in ScriptAndLanguageTags.h

	249

	250 ---------------------------------------------------------------------------- ***

	251

	252 Unicode 5.2 update

	253

	254 *** related ICU Trac tickets

	255

	256 7084 Unicode 5.2

	257

	258 7167 verify collation bytes

	259 7235 Java test NAME_ALIAS

	260 7236 Java DerivedCoreProperties.txt test

	261 7237 Java BidiTest.txt

	262 7238 UTrie2 in core unidata

	263 7239 test for tailoring gaps

	264 7240 Java fix CollationMiscTest

	265 7243 update layout engine for Unicode 5.2

	266

	267 *** Unicode version numbers

	268 - makedata.mak

	269 - uchar.h

	270 - configure.in & configure

	271 - update ucdVersion in gennames.c if an algorithmic range changes

	272

	273 *** data files & enums & parser code

	274

	275 * file preparation

	276

	277 python source\tools\genprops\misc\ucdcopy.py "C:\Documents and Settings\mscherer\My Documents\unicode\ucd\5.2.0" C:\svn\icuproj\icu\trunk\source\data\unidata

	278 - includes finding files regardless of version numbers,

	279 copying them, and performing the equivalent processing of the

	280 ucdstrip and ucdmerge tools on the desired set of files

	281

	282 * notes on changes

	283 - PropertyAliases.txt

	284 moved from numeric to enumerated:

	285 ccc ; Canonical_Combining_Class

	286 new string properties:

	287 NFKC_CF ; NFKC_Casefold

	288 Name_Alias; Name_Alias

	289 new binary properties:

	290 Cased ; Cased

	291 CI ; Case_Ignorable

	292 CWCF ; Changes_When_Casefolded

	293 CWCM ; Changes_When_Casemapped

	294 CWKCF ; Changes_When_NFKC_Casefolded

	295 CWL ; Changes_When_Lowercased

	296 CWT ; Changes_When_Titlecased

	297 CWU ; Changes_When_Uppercased

	298 new CJK Unihan properties (not supported by ICU)

	299 - PropertyValueAliases.txt

	300 new block names

	301 new scripts

	302 one script code change:

	303 sc ; Qaai ; Inherited

	304 ->

	305 sc ; Zinh ; Inherited ; Qaai

	306 new Line_Break (lb) value:

	307 lb ; CP ; Close_Parenthesis

	308 new Joining_Group (jg) values: Farsi_Yeh, Nya

	309 other new values:

	310 ccc; 214; ATA ; Attached_Above

	311 - DerivedBidiClass.txt

	312 new default-R range: U+1E800 - U+1EFFF

	313 - UnicodeData.txt

	314 all of the ISO comments are gone

	315 new CJK block end:

	316 9FC3;<CJK Ideograph, Last> -> 9FCB;<CJK Ideograph, Last>

	317 new CJK block:

	318 2A700;<CJK Ideograph Extension C, First>;Lo;0;L;;;;;N;;;;;

	319 2B734;<CJK Ideograph Extension C, Last>;Lo;0;L;;;;;N;;;;;

	320

	321 * genpname

	322 - run preparse.pl

	323 + cd \svn\icuproj\icu\trunk\source\tools\genpname

	324 + make sure that data.h is writable

	325 + perl preparse.pl \svn\icuproj\icu\trunk > out.txt

	326 + preparse.pl complains with errors like the following:

	327 Error: sc:Egyp already set to Egyptian_Hieroglyphs, cannot set to Egyp at preparse.pl line 1322, <GEN6> line 34.

	328 This is because ICU 4.0 had scripts from ISO 15924 which are now

	329 added to Unicode 5.2, and the Perl script shows a conflict between SyntheticPropertyValueAliases.txt

	330 and PropertyValueAliases.txt.

	331 -> Removed duplicate script entries from SyntheticPropertyValueAliases.txt:

	332 Egyp, Java, Lana, Mtei, Orkh, Armi, Avst, Kthi, Phli, Prti, Samr, Tavt

	333 + preparse.pl complains with errors about block names missing from uchar.h; add them

	334
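The conflict reported by preparse.pl comes from script codes that appear in both alias files. A minimal Python sketch of finding such duplicates before running the Perl script follows. It is not part of preparse.pl; the function names and the sample entries are hypothetical, and the "sc ; short ; long" layout is assumed from the entries quoted elsewhere in this log.

```python
def script_codes(lines):
    """Collect the short script codes from 'sc ; Xxxx ; Long_Name' lines."""
    codes = set()
    for line in lines:
        data = line.split("#", 1)[0]  # ignore comments
        fields = [f.strip() for f in data.split(";")]
        if len(fields) >= 3 and fields[0] == "sc":
            codes.add(fields[1])
    return codes


def duplicates(official_lines, synthetic_lines):
    """Script codes defined in both files; these must leave the synthetic file."""
    return sorted(script_codes(official_lines) & script_codes(synthetic_lines))


# Hypothetical file contents, for illustration only.
official = [
    "sc ; Egyp ; Egyptian_Hieroglyphs",
    "sc ; Java ; Javanese",
]
synthetic = [
    "sc ; Egyp ; Egyp",
    "sc ; Blis ; Blis",
]

print(duplicates(official, synthetic))
```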

	335 * uchar.h & uscript.h & uprops.h & uprops.c & genprops

	336 - new block & script values

	337 + 26 new blocks

	338 copy new blocks from Blocks.txt

	339 MS VC++ 2008 regular expression:

	340 find "^{[0-9A-F]+}\.\.{[0-9A-F]+}; {[A-Z].+}$"

	341 replace with " UBLOCK_\3 = 172, /[\1]/"

	342 + several new script values already added in ICU 4.0 for ISO 15924 coverage

	343 (removed from SyntheticPropertyValueAliases.txt, see genpname notes above)

	344 + 3 new script values added for ISO 15924 and Unicode 5.2 coverage

	345 + 1 new script value added for ISO 15924 coverage (not in Unicode 5.2)

	346 (added to SyntheticPropertyValueAliases.txt)

	347 - new Joining Group (JG) values: Farsi_Yeh, Nya

	348 - new Line_Break (lb) value:

	349 lb ; CP ; Close_Parenthesis

	350

	351 * hardcoded Unihan range end/limit

	352 - Unihan range end moves from 9FC3 to 9FCB

	353 search for both 9FC3 (end) and 9FC4 (limit) (regex 9FC[34], case-insensitive)

	354 + do change gennames.c

	355
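The search for the hardcoded range end and limit can be sketched in Python with the same regex as in the notes. The function name and the sample source text are hypothetical; in practice the search runs over the ICU source tree.

```python
import re

# Regex from the notes: matches both 9FC3 (end) and 9FC4 (limit), any case.
UNIHAN_END_RE = re.compile(r"9FC[34]", re.IGNORECASE)


def find_hits(text):
    """Return (line number, line) for every line mentioning the old end or limit."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), 1)
        if UNIHAN_END_RE.search(line)
    ]


# Hypothetical source snippet, for illustration only.
sample_source = """\
#define CJK_START 0x4e00
#define CJK_END   0x9fc3
if (c < 0x9FC4) {
"""

for number, line in find_hits(sample_source):
    print(number, line.strip())
```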

	356 * Compare definitions of new binary properties with what we used to use

	357 in algorithms, to see if the definitions changed.

	358 - Verified that definitions for Cased and Case_Ignorable are unchanged.

	359 The gencase tool now parses the newly public Case_Ignorable values

	360 in case the definition changes in the future.

	361

	362 * uchar.c & uprops.h & uprops.c & genprops

	363 - new numeric values that didn't exist in Unicode data before:

	364 1/7, 1/9, 1/10, 3/10, 1/16, 3/16

	365 the ones with denominators >9 cannot be supported by uprops.icu formatVersion 5,

	366 therefore redesign the encoding of numeric types and values for formatVersion 6;

	367 design for simple numbers up to at least 144 ("one gross"),

	368 large values up to at least 10^20,

	369 and fractions with numerators -1..17 and denominators 1..16

	370 to cover current and expected future values

	371 (e.g., more Han numeric values, Meroitic twelfths)

	372
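As a worked check of the ranges named above, the following Python sketch tests which of the new fractions exceed the old denominator limit and whether all of them fit the new design range. It models only the stated limits (denominators up to 9 for formatVersion 5, numerators -1..17 and denominators 1..16 for formatVersion 6), not the actual bit encoding in uprops.icu; the function names are hypothetical.

```python
from fractions import Fraction

# New numeric values listed in the notes above.
new_values = [
    Fraction(1, 7), Fraction(1, 9), Fraction(1, 10),
    Fraction(3, 10), Fraction(1, 16), Fraction(3, 16),
]


def fits_v5(value):
    """Assumed formatVersion 5 limit: denominators above 9 are not supported."""
    return value.denominator <= 9


def fits_v6(value):
    """Design range stated for formatVersion 6."""
    return -1 <= value.numerator <= 17 and 1 <= value.denominator <= 16


unsupported_v5 = [str(v) for v in new_values if not fits_v5(v)]
print("not representable in formatVersion 5:", unsupported_v5)
print("all fit formatVersion 6:", all(fits_v6(v) for v in new_values))
```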

	373 * reimplement Hangul_Syllable_Type for new Jamo characters

	374 - the old code assumed that all Jamo characters are in the 11xx block

	375 - Unicode 5.2 fills holes there and adds new Jamo characters in

	376 A960..A97F; Hangul Jamo Extended-A

	377 and in

	378 D7B0..D7FF; Hangul Jamo Extended-B

	379 - Hangul_Syllable_Type can be trivially derived from a subset of

	380 Grapheme_Cluster_Break values

	381
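The derivation mentioned above can be sketched as a lookup: the Grapheme_Cluster_Break values L, V, T, LV and LVT correspond to the Hangul_Syllable_Type values of the same names, and every other value maps to NA (Not_Applicable). The Python below is an illustration using short alias strings, not the ICU C implementation; the function name is hypothetical.

```python
# GCB values that carry Hangul syllable information map one-to-one to hst values.
GCB_TO_HST = {"L": "L", "V": "V", "T": "T", "LV": "LV", "LVT": "LVT"}


def hangul_syllable_type(gcb_value):
    """Derive the Hangul_Syllable_Type short name from a GCB short name."""
    return GCB_TO_HST.get(gcb_value, "NA")


for gcb in ("L", "LVT", "Extend", "CR"):
    print(gcb, "->", hangul_syllable_type(gcb))
```

With this approach the property no longer depends on Jamo characters being confined to one block, since the answer comes from data rather than from code point ranges.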

	382 * build Unicode data source code for hardcoding core data

	383 C:\svn\icuproj\icu\trunk\source\data>NMAKE /f makedata.mak ICUMAKE=\svn\icuproj\icu\trunk\source\data\ CFG=x86\release uni-core-data

	384

	385 ICU data make path is \svn\icuproj\icu\trunk\source\data\

	386 ICU root path is \svn\icuproj\icu\trunk

	387 Information: cannot find "ucmlocal.mk". Not building user-additional converter files.

	388 Information: cannot find "brklocal.mk". Not building user-additional break iterator files.

	389 Information: cannot find "reslocal.mk". Not building user-additional resource bundle files.

	390 Information: cannot find "collocal.mk". Not building user-additional resource bundle files.

	391 Information: cannot find "rbnflocal.mk". Not building user-additional resource bundle files.

	392 Information: cannot find "trnslocal.mk". Not building user-additional transliterator files.

	393 Information: cannot find "misclocal.mk". Not building user-additional miscellaenous files.

	394 Information: cannot find "spreplocal.mk". Not building user-additional stringprep files.

	395 Creating data file for Unicode Property Names

	396 Creating data file for Unicode Character Properties

	397 Creating data file for Unicode Case Mapping Properties

	398 Creating data file for Unicode BiDi/Shaping Properties

	399 Creating data file for Unicode Normalization

	400 Unicode .icu files built to "\svn\icuproj\icu\trunk\source\data\out\build\icudt43l"

	401 Unicode .c source files built to "\svn\icuproj\icu\trunk\source\data\out\tmp"

	402

	403 - copy the .c source files to C:\svn\icuproj\icu\trunk\source\common

	404 and rebuild the common library

	405

	406 *** UCA

	407

	408 - update FractionalUCA.txt with new canonical closure (output from Mark's Unicode tools)

	409 - update source/data/unidata/UCARules.txt with UCA_Rules_SHORT.txt from Mark's Unicode tools

	410 - update source/test/testdata/CollationTest_*.txt with output from Mark's Unicode tools

	411 [ Begin obsolete instructions:

	412 Starting with UCA 5.2, we use the CollationTest_*_SHORT.txt files not the *_STUB.txt files.

	413 - generate the source/test/testdata/CollationTest_*_STUB.txt files via source/tools/genuca/genteststub.py

	414 on Windows:

	415 python C:\svn\icuproj\icu\trunk\source\tools\genuca\genteststub.py CollationTest_NON_IGNORABLE_SHORT.txt CollationTest_NON_IGNORABLE_STUB.txt

	416 python C:\svn\icuproj\icu\trunk\source\tools\genuca\genteststub.py CollationTest_SHIFTED_SHORT.txt CollationTest_SHIFTED_STUB.txt

	417 End obsolete instructions]

	418 - run all tests with the *_SHORT.txt or the full files (the full ones have comments)

	419 not just the *_STUB.txt files

	420 - note on intltest: if collate/UCAConformanceTest fails, then

	421 utility/MultithreadTest/TestCollators will fail as well;

	422 fix the conformance test before looking into the multi-thread test

	423

	424 *** Implement Cased & Case_Ignorable properties

	425 - via UProperty; call ucase.h functions ucase_getType() and ucase_getTypeOrIgnorable()

	426 - Problem: These properties should be disjoint, but aren't

	427 - UTC 2009nov decision: skip all Case_Ignorable regardless of whether they are Cased or not

	428 - change ucase.icu to be able to store any combination of Cased and Case_Ignorable

	429

	430 *** Implement Changes_When_Xyz properties

	431 - without stored data

	432

	433 *** Implement Name_Alias property

	434 - add it as another name field in unames.icu

	435 - make it available via u_charName() and UCharNameChoice and

	436 - consider it in u_charFromName()

	437

	438 *** Break iterators

	439

	440 * Update break iterator rules to new UAX versions and new property values

	441 * Update source/test/testdata/<boundary>Test.txt files from <unicode.org ucd>/ucd/auxiliary

	442

	443 *** new BidiTest file

	444 - review format and data

	445 - copy BidiTest.txt to source/test/testdata

	446 - write test code using this data

	447 - fix ICU code where it fails the conformance test

	448

	449 *** Java

	450 - generally, find and update code corresponding to C/C++

	451 - UCharacter.UnicodeBlock constants:

	452 a) add an _ID integer per new block, update COUNT

	453 b) add a class instance per new block

	454 Visual Studio regex:

	455 find UBLOCK_{[^ ]+} = [0-9]+, {/.+}

	456 replace with public static final UnicodeBlock \1 = new UnicodeBlock("\1", \1_ID); \2

	457 - CHAR_NAME_ALIAS -> UCharacter.getNameAlias() and getCharFromNameAlias()

	458

	459 - port test changes to Java

	460

	461 *** LayoutEngine script information

	462

	463 (For comparison, see the Unicode 5.1 update: http://bugs.icu-project.org/trac/changeset/23833)

	464

	465 * Run ICU4J com.ibm.icu.dev.tool.layout.ScriptNameBuilder. This generates LEScripts.h, LELanguages.h,

	466 ScriptAndLanguageTags.h and ScriptAndLanguageTags.cpp in the working directory. (It also generates

	467 ScriptRunData.cpp, which is no longer needed.)

	468

	469 The generated files have a current copyright date and "@draft" statement.

	470

	471 -> Eric Mader wrote in email on 20090930:

	472 "I think the tool has been modified to update @draft to @stable for

	473 older scripts and to add @draft for new scripts.

	474 (I worked with an intern on this last year.)

	475 You should check the output after you run it."

	476

	477 * copy the above files into <icu>/source/layout, replacing the old files.

	478 * fix mixed line endings

	479 * review the diffs and fix incorrect @draft and missing aliases

	480 * manually re-add the "Indic script xyz v.2" tags in ScriptAndLanguageTags.h

	481

	482 Add new default entries to the indicClassTables array in <icu>/source/layout/IndicClassTables.cpp

	483 and the complexTable array in <icu>/source/layoutex/ParagraphLayout.cpp. (This step should be automated...)

	484

	485 -> Eric Mader wrote in email on 20090930:

	486 "This is just a matter of making sure that all the per-script tables have

	487 entries for any new scripts that were added.

	488 If any new Indic characters were added, then the class tables in

	489 IndicClassTables.cpp should be updated to reflect this.

	490 John Emmons should know how to do this if it's required."

	491

	492 * rebuild the layout and layoutex libraries.

	493

	494 *** Documentation

	495 - Update User Guide

	496 + Jamo_Short_Name, sfc->scf, binary property value aliases

	497

	498 ---------------------------------------------------------------------------- ***

	499

	500 Unicode 5.1 update

	501

	502 *** related ICU Trac tickets

	503

	504 5696 Update to Unicode 5.1

	505

	506 *** Unicode version numbers

	507 - makedata.mak

	508 - uchar.h

	509 - configure.in & configure

	510 - update ucdVersion in gennames.c if an algorithmic range changes

	511

	512 *** data files & enums & parser code

	513

	514 * file preparation

	515 - ucdstrip:

	516 DerivedCoreProperties.txt

	517 DerivedNormalizationProps.txt

	518 NormalizationTest.txt

	519 PropList.txt

	520 Scripts.txt

	521 GraphemeBreakProperty.txt

	522 SentenceBreakProperty.txt

	523 WordBreakProperty.txt

	524 - ucdstrip and ucdmerge:

	525 EastAsianWidth.txt

	526 LineBreak.txt

	527

	528 * my ucd2unidata.bat (needs to be updated each time with UCD and file version numbers)

	529 copy 5.1.0\ucd\BidiMirroring.txt ..\unidata\

	530 copy 5.1.0\ucd\Blocks.txt ..\unidata\

	531 copy 5.1.0\ucd\CaseFolding.txt ..\unidata\

	532 copy 5.1.0\ucd\DerivedAge.txt ..\unidata\

	533 copy 5.1.0\ucd\extracted\DerivedBidiClass.txt ..\unidata\

	534 copy 5.1.0\ucd\extracted\DerivedJoiningGroup.txt ..\unidata\

	535 copy 5.1.0\ucd\extracted\DerivedJoiningType.txt ..\unidata\

	536 copy 5.1.0\ucd\extracted\DerivedNumericValues.txt ..\unidata\

	537 copy 5.1.0\ucd\NormalizationCorrections.txt ..\unidata\

	538 copy 5.1.0\ucd\PropertyAliases.txt ..\unidata\

	539 copy 5.1.0\ucd\PropertyValueAliases.txt ..\unidata\

	540 copy 5.1.0\ucd\SpecialCasing.txt ..\unidata\

	541 copy 5.1.0\ucd\UnicodeData.txt ..\unidata\

	542

	543 ucdstrip < 5.1.0\ucd\DerivedCoreProperties.txt > ..\unidata\DerivedCoreProperties.txt

	544 ucdstrip < 5.1.0\ucd\DerivedNormalizationProps.txt > ..\unidata\DerivedNormalizationProps.txt

	545 ucdstrip < 5.1.0\ucd\NormalizationTest.txt > ..\unidata\NormalizationTest.txt

	546 ucdstrip < 5.1.0\ucd\PropList.txt > ..\unidata\PropList.txt

	547 ucdstrip < 5.1.0\ucd\Scripts.txt > ..\unidata\Scripts.txt

	548 ucdstrip < 5.1.0\ucd\auxiliary\GraphemeBreakProperty.txt > ..\unidata\GraphemeBreakProperty.txt

	549 ucdstrip < 5.1.0\ucd\auxiliary\SentenceBreakProperty.txt > ..\unidata\SentenceBreakProperty.txt

	550 ucdstrip < 5.1.0\ucd\auxiliary\WordBreakProperty.txt > ..\unidata\WordBreakProperty.txt

	551 ucdstrip < 5.1.0\ucd\EastAsianWidth.txt | ucdmerge > ..\unidata\EastAsianWidth.txt

	552 ucdstrip < 5.1.0\ucd\LineBreak.txt | ucdmerge > ..\unidata\LineBreak.txt

	553

	554 * genpname

	555 - run preparse.pl

	556 + cd \svn\icuproj\icu\uni51\source\tools\genpname

	557 + make sure that data.h is writable

	558 + perl preparse.pl \svn\icuproj\icu\uni51 > out.txt

	559 + preparse.pl complains with errors like the following:

	560 Error: sc:Cari already set to Carian, cannot set to Cari at preparse.pl line 1308, <GEN6> line 30.

	561 This is because ICU 3.8 had scripts from ISO 15924 which are now

	562 added to Unicode 5.1, and the script shows a conflict between SyntheticPropertyValueAliases.txt

	563 and PropertyValueAliases.txt.

	564 -> Removed duplicate script entries from SyntheticPropertyValueAliases.txt:

	565 Cari, Cham, Kali, Lepc, Lyci, Lydi, Olck, Rjng, Saur, Sund, Vaii

	566 + PropertyValueAliases.txt now explicitly contains values for boolean properties:

	567 N/Y, No/Yes, F/T, False/True

	568 -> Added N/No and Y/Yes to preparse.pl function read_PropertyValueAliases.

	569 It will use further values from the file if present.

	570

	571 * uchar.h & uscript.h & uprops.h & uprops.c & genprops

	572 - new block & script values

	573 + 17 new blocks

	574 + 11 new script values already added in ICU 3.8 for ISO 15924 coverage

	575 (removed from SyntheticPropertyValueAliases.txt)

	576 + 14 new script values added for ISO 15924 coverage (not in Unicode 5.1)

	577 (added to SyntheticPropertyValueAliases.txt)

	578 - uprops.icu (uprops.h) only provides 7 bits for script codes.

	579 In ICU 4.0 there are USCRIPT_CODE_LIMIT=130 script codes now.

	580 There is none above 127 yet which is the script code for an

	581 assigned Unicode character, so ICU 4.0 uprops.icu does not store any

	582 script code values greater than 127.

	583 However, it does need to store the maximum script value=USCRIPT_CODE_LIMIT-1=129

	584 in a parallel bit field, and that overflows now.

	585 Also, future values >=128 would be incompatible anyway.

	586 uprops.h is modified to move around several of the bit fields

	587 in the properties vector words, and now uses 8 bits for the script code.

	588 Two other bit fields also grow to accommodate future growth:

	589 Block (current count: 172) grows from 8 to 9 bits,

	590 and Word_Break grows from 4 to 5 bits.

	591 - renamed property Simple_Case_Folding (sfc->scf)

	592 + nothing to be done: handled as normal alias

	593 - new property JSN Jamo_Short_Name

	594 + no new API: only contributes to the Name property

	595 - new Grapheme_Cluster_Break (GCB) value: SM=SpacingMark

	596 - new Joining Group (JG) value: Burushashki_Yeh_Barree

	597 - new Sentence_Break (SB) values:

	598 SB ; CR ; CR

	599 SB ; EX ; Extend

	600 SB ; LF ; LF

	601 SB ; SC ; SContinue

	602 - new Word_Break (WB) values:

	603 WB ; CR ; CR

	604 WB ; Extend ; Extend

	605 WB ; LF ; LF

	606 WB ; MB ; MidNumLet

	607
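The bit-field widths in the uprops.h note above can be checked with simple arithmetic. A minimal Python sketch follows; USCRIPT_CODE_LIMIT=130 and the block count 172 are the figures from the notes, the function name is hypothetical, and block values are assumed to run from 0 to count-1.

```python
def bits_needed(max_value):
    """Number of bits needed to store every integer in 0..max_value."""
    return max(1, max_value.bit_length())


USCRIPT_CODE_LIMIT = 130  # from the notes above (ICU 4.0)
BLOCK_COUNT = 172         # from the notes above

max_script = USCRIPT_CODE_LIMIT - 1          # 129, the maximum script value
script_bits = bits_needed(max_script)        # exceeds the old 7-bit field
block_bits = bits_needed(BLOCK_COUNT - 1)    # still fits the old 8-bit field

print("largest value in 7 bits:", (1 << 7) - 1)
print("bits needed for script value %d: %d" % (max_script, script_bits))
print("bits needed for %d blocks: %d" % (BLOCK_COUNT, block_bits))
```

Under these assumptions the script field is the only one that actually overflows; the Block and Word_Break fields grow for headroom, as the notes say.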

	608 * Further changes in the 2008-02-29 update:

	609 - Default_Ignorable_Code_Point: The new file removes Cc, Cs, noncharacters from DICP

	610 because they should not normally be invisible.

	611 - new Joining Group (JG) value Burushashki_Yeh_Barree was renamed to Burushaski_Yeh_Barree (one 'h' removed)

	612 - new Grapheme_Cluster_Break (GCB) value: PP=Prepend

	613 - new Word_Break (WB) value: NL=Newline

	614

	615 * hardcoded Unihan range end/limit (see Unicode 4.1 update for comparison)

	616 - Unihan range end moves from 9FBB to 9FC3

	617 search for both 9FBB (end) and 9FBC (limit) (regex 9FB[BC], case-insensitive)

	618 + do change gennames.c

	619

	620 * build Unicode data source code for hardcoding core data

	621 C:\svn\icuproj\icu\uni51\source\data>NMAKE /f makedata.mak ICUMAKE=\svn\icuproj\icu\uni51\source\data\ CFG=debug uni-core-data

	622

	623 ICU data make path is \svn\icuproj\icu\uni51\source\data\

	624 ICU root path is \svn\icuproj\icu\uni51

	625 Information: cannot find "ucmlocal.mk". Not building user-additional converter files.

	626 Information: cannot find "brklocal.mk". Not building user-additional break iterator files.

	627 Information: cannot find "reslocal.mk". Not building user-additional resource bundle files.

	628 Information: cannot find "collocal.mk". Not building user-additional resource bundle files.

	629 Information: cannot find "rbnflocal.mk". Not building user-additional resource bundle files.

	630 Information: cannot find "trnslocal.mk". Not building user-additional transliterator files.

	631 Information: cannot find "misclocal.mk". Not building user-additional miscellaenous files.

	632 Creating data file for Unicode Character Properties

	633 Creating data file for Unicode Case Mapping Properties

	634 Creating data file for Unicode BiDi/Shaping Properties

	635 Creating data file for Unicode Normalization

	636 Unicode .icu files built to "\svn\icuproj\icu\uni51\source\data\out\build\icudt39l"

	637 Unicode .c source files built to "\svn\icuproj\icu\uni51\source\data\out\tmp"

	638

	639 - copy the .c source files to C:\svn\icuproj\icu\uni51\source\common

	640 and rebuild the common library

	641

	642 *** Break iterators

	643

	644 * Update break iterator rules to new UAX versions and new property values

	645

	646 *** UCA

	647

	648 * update FractionalUCA.txt and UCARules.txt with new canonical closure

	649

	650 *** Test suites

	651 - Test that APIs using Unicode property value aliases (like UnicodeSet)

	652 support all of the boolean values N/Y, No/Yes, F/T, False/True

	653 -> TestBinaryValues() tests in both cintltst and intltest
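What such a test checks can be sketched as follows. This is not the cintltst/intltest code: the names `parse_binary_value` and `_normalize` are hypothetical, and the loose matching (ignoring case, spaces, hyphens and underscores) is an assumption about how aliases are compared.

```python
# The four spellings of each boolean value named in the note above.
BINARY_VALUE_ALIASES = {
    "N": False, "No": False, "F": False, "False": False,
    "Y": True, "Yes": True, "T": True, "True": True,
}

def _normalize(name):
    # Assumed loose matching: ignore case, spaces, hyphens and underscores.
    return "".join(ch for ch in name.lower() if ch not in " -_")

_LOOKUP = {_normalize(k): v for k, v in BINARY_VALUE_ALIASES.items()}

def parse_binary_value(alias):
    """Map a boolean property value alias to True or False."""
    try:
        return _LOOKUP[_normalize(alias)]
    except KeyError:
        raise ValueError(f"not a binary property value alias: {alias!r}") from None
```

A test in this style loops over all eight aliases and checks that each one is accepted and maps to the expected value.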

	654

	655 *** LayoutEngine script information

	656 * Run ICU4J com.ibm.icu.dev.tool.layout.ScriptNameBuilder. This generates LEScripts.h, LELanguage.h,

	657 ScriptAndLanguageTags.h and ScriptAndLanguageTags.cpp in the working directory. (it also generates

	658 ScriptRunData.cpp, which is no longer needed.)

	659

	660 The generated files have a current copyright date and "@draft" statement.

	661

	662 * copy the above files into <icu>/source/layout, replacing the old files.

	663

	664 Add new default entries to the indicClassTables array in <icu>/source/layout/IndicClassTables.cpp

	665 and the complexTable array in <icu>/source/layoutex/ParagraphLayout.cpp. (This step should be automated...)

	666

	667 * rebuild the layout and layoutex libraries.

	668

	669 *** Documentation

	670 - Update User Guide

	671 + Jamo_Short_Name, sfc->scf, binary property value aliases

	672

	673 ---------------------------------------------------------------------------- ***

	674

	675 Unicode 5.0 update

	676

	677 *** related Jitterbugs

	678

	679 5084 RFE: Update to Unicode 5.0

	680

	681 *** data files & enums & parser code

	682

	683 * file preparation

	684 - ucdstrip:

	685 DerivedCoreProperties.txt

	686 DerivedNormalizationProps.txt

	687 NormalizationTest.txt

	688 PropList.txt

	689 Scripts.txt

	690 GraphemeBreakProperty.txt

	691 SentenceBreakProperty.txt

	692 WordBreakProperty.txt

	693 - ucdstrip and ucdmerge:

	694 EastAsianWidth.txt

	695 LineBreak.txt

	696

	697 * my ucd2unidata.bat (needs to be updated each time with UCD and file version numbers)

	698 copy 5.0.0\ucd\BidiMirroring.txt ..\unidata\

	699 copy 5.0.0\ucd\Blocks.txt ..\unidata\

	700 copy 5.0.0\ucd\CaseFolding.txt ..\unidata\

	701 copy 5.0.0\ucd\DerivedAge.txt ..\unidata\

	702 copy 5.0.0\ucd\extracted\DerivedBidiClass.txt ..\unidata\

	703 copy 5.0.0\ucd\extracted\DerivedJoiningGroup.txt ..\unidata\

	704 copy 5.0.0\ucd\extracted\DerivedJoiningType.txt ..\unidata\

	705 copy 5.0.0\ucd\extracted\DerivedNumericValues.txt ..\unidata\

	706 copy 5.0.0\ucd\NormalizationCorrections.txt ..\unidata\

	707 copy 5.0.0\ucd\PropertyAliases.txt ..\unidata\

	708 copy 5.0.0\ucd\PropertyValueAliases.txt ..\unidata\

	709 copy 5.0.0\ucd\SpecialCasing.txt ..\unidata\

	710 copy 5.0.0\ucd\UnicodeData.txt ..\unidata\

	711

	712 ucdstrip < 5.0.0\ucd\DerivedCoreProperties.txt > ..\unidata\DerivedCoreProperties.txt

	713 ucdstrip < 5.0.0\ucd\DerivedNormalizationProps.txt > ..\unidata\DerivedNormalizationProps.txt

	714 ucdstrip < 5.0.0\ucd\NormalizationTest.txt > ..\unidata\NormalizationTest.txt

	715 ucdstrip < 5.0.0\ucd\PropList.txt > ..\unidata\PropList.txt

	716 ucdstrip < 5.0.0\ucd\Scripts.txt > ..\unidata\Scripts.txt

	717 ucdstrip < 5.0.0\ucd\auxiliary\GraphemeBreakProperty.txt > ..\unidata\GraphemeBreakProperty.txt

	718 ucdstrip < 5.0.0\ucd\auxiliary\SentenceBreakProperty.txt > ..\unidata\SentenceBreakProperty.txt

	719 ucdstrip < 5.0.0\ucd\auxiliary\WordBreakProperty.txt > ..\unidata\WordBreakProperty.txt

	720 ucdstrip < 5.0.0\ucd\EastAsianWidth.txt | ucdmerge > ..\unidata\EastAsianWidth.txt

	721 ucdstrip < 5.0.0\ucd\LineBreak.txt | ucdmerge > ..\unidata\LineBreak.txt
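The strip-then-merge pipeline above can be illustrated with a rough sketch. It assumes that stripping means dropping `#` comments and empty lines, and that merging means combining adjacent code point ranges that carry the same value. The real ucdstrip and ucdmerge tools may differ in details, and the function names and sample data here are made up.

```python
def strip_comments(lines):
    """Drop '#' comments, trailing whitespace and empty lines."""
    out = []
    for line in lines:
        data = line.split("#", 1)[0].rstrip()
        if data:
            out.append(data)
    return out

def merge_ranges(lines):
    """Merge adjacent 'start[..end];value' lines that share the same value."""
    merged = []  # entries are [start, end, value]
    for line in lines:
        rng, value = (field.strip() for field in line.split(";", 1))
        start, _, end = rng.partition("..")
        start = int(start, 16)
        end = int(end, 16) if end else start
        if merged and merged[-1][2] == value and merged[-1][1] + 1 == start:
            merged[-1][1] = end          # extend the previous range
        else:
            merged.append([start, end, value])
    return [
        f"{s:04X};{v}" if s == e else f"{s:04X}..{e:04X};{v}"
        for s, e, v in merged
    ]

# Sample in the style of EastAsianWidth.txt, for illustration only.
sample = [
    "# sample header comment",
    "0020;Na # SPACE",
    "0021..0023;Na # EXCLAMATION MARK..NUMBER SIGN",
    "",
    "00A1;A # INVERTED EXCLAMATION MARK",
]
result = merge_ranges(strip_comments(sample))
```

Merging only pays off for files such as EastAsianWidth.txt and LineBreak.txt, where many neighboring lines share a value.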

	722

	723 * update FractionalUCA.txt and UCARules.txt with new canonical closure

	724

	725 * genpname

	726 - run preparse.pl

	727 + make sure that data.h is writable

	728 + perl preparse.pl \cvs\oss\icu > out.txt

	729

	730 * uchar.h & uscript.h & uprops.h & uprops.c & genprops

	731 - new block & script values

	732 + script values already added in ICU 3.6 because all of ISO 15924 is now covered

	733

	734 * build Unicode data source code for hardcoding core data

	735 C:\cvs\oss\icu\source\data>NMAKE /f makedata.mak ICUMAKE=\cvs\oss\icu\source\data\ CFG=debug uni-core-data

	736

	737 ICU data make path is \cvs\oss\icu\source\data\

	738 ICU root path is \cvs\oss\icu

	739 Information: cannot find "ucmlocal.mk". Not building user-additional converter files.

	740 [etc.]

	741 Creating data file for Unicode Character Properties

	742 Creating data file for Unicode Case Mapping Properties

	743 Creating data file for Unicode BiDi/Shaping Properties

	744 Creating data file for Unicode Normalization

	745 Unicode .icu files built to "\cvs\oss\icu\source\data\out\build\icudt35l"

	746 Unicode .c source files built to "\cvs\oss\icu\source\data\out\tmp"

	747

	748 - copy the .c source files to C:\cvs\oss\icu\source\common

	749 and rebuild the common library

	750

	751 *** Unicode version numbers

	752 - makedata.mak

	753 - uchar.h

	754 - configure.in

	755

	756 *** LayoutEngine script information

	757 * Run ICU4J com.ibm.icu.dev.tool.layout.ScriptNameBuilder. This generates LEScripts.h, LELanguage.h,

	758 ScriptAndLanguageTags.h and ScriptAndLanguageTags.cpp in the working directory. (it also generates

	759 ScriptRunData.cpp, which is no longer needed.)

	760

	761 The generated files have a current copyright date and "@draft" statement.

	762

	763 * copy the above files into <icu>/source/layout, replacing the old files.

	764

	765 Add new default entries to the indicClassTables array in <icu>/source/layout/IndicClassTables.cpp

	766 and the complexTable array in <icu>/source/layoutex/ParagraphLayout.cpp. (This step should be automated...)

	767

	768 * rebuild the layout and layoutex libraries.

	769

	770 ---------------------------------------------------------------------------- ***

	771

	772 Unicode 4.1 update

	773

	774 *** related Jitterbugs

	775

	776 4332 RFE: Update to Unicode 4.1

	777 4157 RBBI, TR29 4.1 updates

	778

	779 *** data files & enums & parser code

	780

	781 * file preparation

	782 - ucdstrip:

	783 DerivedCoreProperties.txt

	784 DerivedNormalizationProps.txt

	785 NormalizationTest.txt

	786 GraphemeBreakProperty.txt

	787 SentenceBreakProperty.txt

	788 WordBreakProperty.txt

	789 - ucdstrip and ucdmerge:

	790 EastAsianWidth.txt

	791 LineBreak.txt

	792

	793 * add new files to the repository

	794 GraphemeBreakProperty.txt

	795 SentenceBreakProperty.txt

	796 WordBreakProperty.txt

	797

	798 * update FractionalUCA.txt and UCARules.txt with new canonical closure

	799

	800 * genpname

	801 - handle new enumerated properties in sub read_uchar

	802 - run preparse.pl

	803

	804 * uchar.h & uscript.h & uprops.h & uprops.c & genprops

	805 - new binary properties

	806 + Pattern_Syntax

	807 + Pattern_White_Space

	808 - new enumerated properties

	809 + Grapheme_Cluster_Break

	810 + Sentence_Break

	811 + Word_Break

	812 - new block & script & line break values

	813

	814 * gencase

	815 - case-ignorable changes

	816 see http://www.unicode.org/versions/Unicode4.1.0/#CaseMods

	817 now: (D47a) Word_Break=MidLetter or Mn, Me, Cf, Lm, Sk
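The D47a definition quoted above can be written down as a predicate. This sketch is not the gencase implementation. Python's `unicodedata` module does not expose Word_Break, so the MidLetter set is passed in by the caller, and the default `SAMPLE_MIDLETTER` is a small assumed subset for illustration only.

```python
import unicodedata

# General categories named in the note: Mn, Me, Cf, Lm, Sk.
CASE_IGNORABLE_CATEGORIES = frozenset({"Mn", "Me", "Cf", "Lm", "Sk"})

# Illustrative subset of Word_Break=MidLetter (assumption, not the full set):
# U+00B7 MIDDLE DOT and U+2027 HYPHENATION POINT.
SAMPLE_MIDLETTER = frozenset("\u00B7\u2027")

def is_case_ignorable(ch, midletter=SAMPLE_MIDLETTER):
    """True if ch is Word_Break=MidLetter or has one of the listed categories."""
    return ch in midletter or unicodedata.category(ch) in CASE_IGNORABLE_CATEGORIES
```

For example, U+0301 (Mn) and U+00AD (Cf) qualify through their general category, while U+00B7 qualifies only through the MidLetter set.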

	818

	819 *** Unicode version numbers

	820 - makedata.mak

	821 - uchar.h

	822 - configure.in

	823

	824 *** tests

	825 - verify that u_charMirror() round-trips

	826 - test all new properties and some new values of old properties
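The round-trip check for mirroring can be sketched like this. The mapping below is a tiny hand-written subset in the style of BidiMirroring.txt, and `char_mirror` is a hypothetical stand-in for u_charMirror() that returns the character itself when it has no mirror.

```python
# A few mirrored pairs (illustrative subset only).
MIRROR = {"(": ")", ")": "(", "<": ">", ">": "<", "[": "]", "]": "["}

def char_mirror(ch, table=MIRROR):
    """Return the mirrored character, or ch itself if it has none."""
    return table.get(ch, ch)

def round_trip_failures(table=MIRROR):
    """List characters for which mirroring twice does not return the original."""
    return [ch for ch in table
            if char_mirror(char_mirror(ch, table), table) != ch]
```

An empty result means that every mapping in the table has a matching reverse entry.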

	827

	828 *** other code

	829

	830 * hardcoded Unihan range end/limit

	831 - Unihan range end moves from 9FA5 to 9FBB

	832 search for both 9FA5 (end) and 9FA6 (limit) (regex 9FA[56], case-insensitive)

	833 + do not modify BOCU/BOCSU code because that would change the encoding

	834 and break binary compatibility!

	835 + similarly, do not change the GB 18030 range data (ucnvmbcs.c),

	836 NamePrepProfile.txt

	837 + ignore trietest.c: test data is arbitrary

	838 + ignore tstnorm.cpp: test optimization, not important

	839 + ignore collation: 9FA[56] only appears in comments; swapCJK() uses the whole block up to 9FFF

	840 + do change line_th.txt and word_th.txt

	841 by replacing hardcoded ranges with the new property values

	842 + do change gennames.c

	843

	844 source\data\brkitr\line_th.txt(229): \u33E0-\u33FE \u3400-\u4DB5 \u4E00-\u9FA5 \uA000-\uA48C \uA490-\uA4C6

	845 source\data\brkitr\word_th.txt(23): \u33E0-\u33FE \u3400-\u4DB5 \u4E00-\u9FA5 \uA000-\uA48C \uA490-\uA4C6

	846 source\tools\gennames\gennames.c(971): 0x4e00, 0x9fa5,

	847

	848 * case mappings

	849 - compare new special casing context conditions with previous ones

	850 see http://www.unicode.org/versions/Unicode4.1.0/#CaseMods

	851

	852 * genpname

	853 - consider storing only the short name if it is the same as the long name

	854

	855 *** other reviews

	856 - UAX #29 changes (grapheme/word/sentence breaks)

	857 - UAX #14 changes (line breaks)

	858 - Pattern_Syntax & Pattern_White_Space

	859

	860 ---------------------------------------------------------------------------- ***

	861

	862 Unicode 4.0.1 update

	863

	864 *** related Jitterbugs

	865

	866 3170 RFE: Update to Unicode 4.0.1

	867 3171 Add new Unicode 4.0.1 properties

	868 3520 use Unicode 4.0.1 updates for break iteration

	869

	870 *** data files & enums & parser code

	871

	872 * file preparation

	873 - ucdstrip: DerivedNormalizationProps.txt, NormalizationTest.txt, DerivedCoreProperties.txt

	874 - ucdstrip and ucdmerge: EastAsianWidth.txt, LineBreak.txt

	875

	876 * file fixes

	877 - fix UnicodeData.txt general categories of Ethiopic digits Nd->No

	878 according to PRI #26

	879 http://www.unicode.org/review/resolved-pri.html#pri26

	880 - undone again because no corrigendum in sight;

	881 instead modified tests to not check consistency on this for Unicode 4.0.1

	882

	883 * ucdterms.txt

	884 - update from http://www.unicode.org/copyright.html

	885 formatted for plain text

	886

	887 * uchar.h & uprops.h & uprops.c & genprops

	888 - add UBLOCK_CYRILLIC_SUPPLEMENT because the block is renamed

	889 - add U_LB_INSEPARABLE due to a spelling fix

	890 + put short name comment only on line with new constant

	891 for genpname perl script parser

	892 - new binary properties

	893 + STerm

	894 + Variation_Selector

	895

	896 * genpname

	897 - fix genpname perl script so that it doesn't choke on more than 2 names per property value

	898 - perl script: correctly calculate the maximum number of fields per row

	899

	900 * uscript.h

	901 - new script code Hrkt=Katakana_Or_Hiragana

	902

	903 * gennorm.c track changes in DerivedNormalizationProps.txt

	904 - "FNC" -> "FC_NFKC"

	905 - single field "NFD_NO" -> two fields "NFD_QC; N" etc.
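The quick-check format change can be illustrated with a parser that accepts both layouts. This is a sketch, not the gennorm.c code: `parse_quick_check` is a hypothetical name, and the handling of `_MAYBE` names is an assumption made by analogy with the `NFD_NO` example in the note.

```python
import re

# Old single-field names such as "NFD_NO"; "_MAYBE" is assumed by analogy.
_OLD_QC = re.compile(r"^(NFK?[CD])_(NO|MAYBE)$")
_OLD_VALUE = {"NO": "N", "MAYBE": "M"}

def parse_quick_check(fields):
    """Return (property, value) from the property fields of one data line,
    or None if the fields do not describe a quick-check property."""
    fields = [f.strip() for f in fields]
    if len(fields) >= 2 and fields[0].endswith("_QC"):
        return fields[0], fields[1]                   # new layout: "NFD_QC; N"
    m = _OLD_QC.match(fields[0])
    if m:
        return m.group(1) + "_QC", _OLD_VALUE[m.group(2)]  # old layout: "NFD_NO"
    return None
```

Both `["NFD_NO"]` and `["NFD_QC", "N"]` yield the same pair, so later code only has to handle one representation.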

	906

	907 * genprops/props2.c track changes in DerivedNumericValues.txt

	908 - changed from 3 columns to 2, dropping the numeric type

	909 + assume that the type is always numeric for Han characters,

	910 and that only those are added in addition to what UnicodeData.txt lists

	911

	912 *** Unicode version numbers

	913 - makedata.mak

	914 - uchar.h

	915 - configure.in

	916

	917 *** tests

	918 - update test of default bidi classes according to PRI #28

	919 /tsutil/cucdtst/TestUnicodeData

	920 http://www.unicode.org/review/resolved-pri.html#pri28

	921 - bidi tests: change exemplar character for ES depending on Unicode version

	922 - change hardcoded expected property values where they change

	923

	924 *** other code

	925

	926 * name matching

	927 - read UCD.html

	928

	929 * scripts

	930 - use new Hrkt=Katakana_Or_Hiragana

	931

	932 * ZWJ & ZWNJ

	933 - are now part of combining character sequences

	934 - break iteration used to assume that LB classes did not overlap; now they do for ZWJ & ZWNJ
