| Index: source/data/unidata/changes.txt
|
| diff --git a/source/data/unidata/changes.txt b/source/data/unidata/changes.txt
|
| index b61fc54b16701ce4f894f365dc193b9b0667d11b..23f29bf2e3b88993a78621a62378fdf613a087c6 100644
|
| --- a/source/data/unidata/changes.txt
|
| +++ b/source/data/unidata/changes.txt
|
| @@ -1,4 +1,4 @@
|
| -* Copyright (C) 2004-2013, International Business Machines
|
| +* Copyright (C) 2004-2014, International Business Machines
|
| * Corporation and others. All Rights Reserved.
|
| *
|
| * file name: changes.txt
|
| @@ -13,6 +13,405 @@
|
|
|
| ---------------------------------------------------------------------------- ***
|
|
|
| +Unicode 8.0 update for ICU ??
|
| +
|
| +* UCA issue from 7.0
|
| +
|
| +- U+1DE9 COMBINING LATIN SMALL LETTER BETA
|
| + sorts with Greek Beta, should sort with Latin B?
|
| + + Ken says:
|
| + No, it was deliberate:
|
| +
|
| + 03B2;GREEK SMALL LETTER BETA;Ll;;;;0392;;0392
|
| + 1D5D;MODIFIER LETTER SMALL BETA;Lm;<super> 03B2;;;;;
|
| + 1DE9;COMBINING LATIN SMALL LETTER BETA;Mn;<sort> 03B2;;;;;
|
| + 1D66;GREEK SUBSCRIPT SMALL LETTER BETA;Ll;<sub> 03B2;;;;;
|
| +
|
| + Note the relationship to U+1D5D.
|
| +
|
| + When the disunified *Latin* beta base letter shows up in Unicode 8.0:
|
| +
|
| + U+A7B4 LATIN CAPITAL LETTER BETA
|
| + U+A7B5 LATIN SMALL LETTER BETA
|
| +
|
| + we could re-evaluate what U+1DE9 equates to, for collation,
|
| + but currently there isn’t any Latin beta to serve that function
|
| + in Unicode 7.0.
|
| +
|
| +- ICU_ROOT=~/svn.icu/trunk
|
| +- ICU_SRC_DIR=$ICU_ROOT/src
|
| +- ~/svn.icutools/trunk/dbg/unicode/c$ genuca/genuca --hanOrder implicit $ICU_SRC_DIR
|
| +- ~/svn.icutools/trunk/dbg/unicode/c$ genuca/genuca --hanOrder radical-stroke $ICU_SRC_DIR
|
| +
|
| +
|
| +---------------------------------------------------------------------------- ***
|
| +
|
| +Unicode 7.0 update for ICU 54
|
| +
|
| +http://www.unicode.org/review/pri271/ -- beta review
|
| +http://www.unicode.org/reports/uax-proposed-updates.html
|
| +http://www.unicode.org/versions/beta-7.0.0.html#notable_issues
|
| +http://www.unicode.org/reports/tr44/tr44-13.html
|
| +
|
| +*** ICU Trac
|
| +
|
| +- ticket 10821: Unicode 7.0, UCA 7.0
|
| +- C++ branches/markus/uni70 at r35584 from trunk at r35580
|
| +- Java branches/markus/uni70 at r35587 from trunk at r35545
|
| +
|
| +*** CLDR Trac
|
| +
|
| +- ticket 7195: UCA 7.0 CLDR root collation
|
| +- branches/markus/uni70 at r10062 from trunk at r10061
|
| +
|
| +- ticket 6762: script metadata for Unicode 7.0 new scripts
|
| +
|
| +*** Unicode version numbers
|
| +- makedata.mak
|
| +- uchar.h
|
| +- com.ibm.icu.util.VersionInfo
|
| +- com.ibm.icu.dev.test.lang.UCharacterTest.VERSION_
|
| +
|
| +- Run ICU4C "configure" _after_ updating the Unicode version number in uchar.h
|
| + so that the makefiles see the new version number.
|
| +
|
| +*** data files & enums & parser code
|
| +
|
| +* file preparation
|
| +
|
| +- download UCD & IDNA files
|
| +- make sure that the Unicode data folder passed into preparseucd.py
|
| + includes a copy of the latest IdnaMappingTable.txt (can be in some subfolder)
|
| +- only for manual diffs: remove version suffixes from the file names
|
| + ~/unidata/uni70/20140403$ ../../desuffixucd.py .
|
| + (see https://sites.google.com/site/unicodetools/inputdata)
|
| +- only for manual diffs: extract Unihan.zip to "here" (.../ucd/Unihan/*.txt), delete Unihan.zip
|
| +- ~/svn.icutools/trunk/src/unicode$ py/preparseucd.py ~/unidata/uni70/20140403 $ICU_SRC_DIR ~/svn.icutools/trunk/src
|
| +- This writes files (especially ppucd.txt) to the ICU4C unidata and testdata subfolders.
|
| +- Restore TODO diffs in source/data/unidata/UCARules.txt
|
| + cd $ICU_SRC_DIR
|
| + meld ../../trunk/src/source/data/unidata/UCARules.txt source/data/unidata/UCARules.txt
|
| +- Restore ICU patches for ticket #10176 in source/test/testdata/LineBreakTest.txt
|
| +
|
| +- also: from http://unicode.org/Public/security/7.0.0/ download new
|
| + confusables.txt & confusablesWholeScript.txt
|
| + and copy to $ICU_ROOT/src/source/data/unidata/
|
| +
|
| +* initial preparseucd.py changes
|
| +- remove new Unicode scripts from the
|
| + only-in-ISO-15924 list according to the error message:
|
| + ValueError: remove ['Hmng', 'Lina', 'Perm', 'Mani', 'Phlp', 'Bass',
|
| + 'Dupl', 'Elba', 'Gran', 'Mend', 'Narb', 'Nbat', 'Palm',
|
| + 'Sind', 'Wara', 'Mroo', 'Khoj', 'Tirh', 'Aghb', 'Mahj']
|
| + from _scripts_only_in_iso15924
|
| + -> fix expectedLong names in cucdapi.c/TestUScriptCodeAPI()
|
| + and in com.ibm.icu.dev.test.lang.TestUScript.java
|
| +- NamesList.txt now has a heading with a non-ASCII character
|
| + + keep ppucd.txt in platform charset, rather than changing tool/test parsers
|
| + + escape non-ASCII characters in heading comments
|
| +- gets Unicode copyright line from PropertyAliases.txt, which is currently still at 2013
|
| + + get the copyright from the first file whose copyright line contains the current year
|
| +
|
| +* PropertyValueAliases.txt changes
|
| +- 32 new Block (blk) values:
|
| + blk; Bassa_Vah ; Bassa_Vah
|
| + blk; Caucasian_Albanian ; Caucasian_Albanian
|
| + blk; Coptic_Epact_Numbers ; Coptic_Epact_Numbers
|
| + blk; Diacriticals_Ext ; Combining_Diacritical_Marks_Extended
|
| + blk; Duployan ; Duployan
|
| + blk; Elbasan ; Elbasan
|
| + blk; Geometric_Shapes_Ext ; Geometric_Shapes_Extended
|
| + blk; Grantha ; Grantha
|
| + blk; Khojki ; Khojki
|
| + blk; Khudawadi ; Khudawadi
|
| + blk; Latin_Ext_E ; Latin_Extended_E
|
| + blk; Linear_A ; Linear_A
|
| + blk; Mahajani ; Mahajani
|
| + blk; Manichaean ; Manichaean
|
| + blk; Mende_Kikakui ; Mende_Kikakui
|
| + blk; Modi ; Modi
|
| + blk; Mro ; Mro
|
| + blk; Myanmar_Ext_B ; Myanmar_Extended_B
|
| + blk; Nabataean ; Nabataean
|
| + blk; Old_North_Arabian ; Old_North_Arabian
|
| + blk; Old_Permic ; Old_Permic
|
| + blk; Ornamental_Dingbats ; Ornamental_Dingbats
|
| + blk; Pahawh_Hmong ; Pahawh_Hmong
|
| + blk; Palmyrene ; Palmyrene
|
| + blk; Pau_Cin_Hau ; Pau_Cin_Hau
|
| + blk; Psalter_Pahlavi ; Psalter_Pahlavi
|
| + blk; Shorthand_Format_Controls ; Shorthand_Format_Controls
|
| + blk; Siddham ; Siddham
|
| + blk; Sinhala_Archaic_Numbers ; Sinhala_Archaic_Numbers
|
| + blk; Sup_Arrows_C ; Supplemental_Arrows_C
|
| + blk; Tirhuta ; Tirhuta
|
| + blk; Warang_Citi ; Warang_Citi
|
| + -> add to uchar.h
|
| + use long property names for enum constants
|
| + -> add to UCharacter.UnicodeBlock IDs
|
| + Eclipse find UBLOCK_([^ ]+) = ([0-9]+), (/.+)
|
| + replace public static final int \1_ID = \2; \3
|
| + -> add to UCharacter.UnicodeBlock objects
|
| + Eclipse find UBLOCK_([^ ]+) = [0-9]+, (/.+)
|
| + replace public static final UnicodeBlock \1 = new UnicodeBlock("\1", \1_ID); \2
|
| +- 28 new Joining_Group (jg) values:
|
| + jg ; Manichaean_Aleph ; Manichaean_Aleph
|
| + jg ; Manichaean_Ayin ; Manichaean_Ayin
|
| + jg ; Manichaean_Beth ; Manichaean_Beth
|
| + jg ; Manichaean_Daleth ; Manichaean_Daleth
|
| + jg ; Manichaean_Dhamedh ; Manichaean_Dhamedh
|
| + jg ; Manichaean_Five ; Manichaean_Five
|
| + jg ; Manichaean_Gimel ; Manichaean_Gimel
|
| + jg ; Manichaean_Heth ; Manichaean_Heth
|
| + jg ; Manichaean_Hundred ; Manichaean_Hundred
|
| + jg ; Manichaean_Kaph ; Manichaean_Kaph
|
| + jg ; Manichaean_Lamedh ; Manichaean_Lamedh
|
| + jg ; Manichaean_Mem ; Manichaean_Mem
|
| + jg ; Manichaean_Nun ; Manichaean_Nun
|
| + jg ; Manichaean_One ; Manichaean_One
|
| + jg ; Manichaean_Pe ; Manichaean_Pe
|
| + jg ; Manichaean_Qoph ; Manichaean_Qoph
|
| + jg ; Manichaean_Resh ; Manichaean_Resh
|
| + jg ; Manichaean_Sadhe ; Manichaean_Sadhe
|
| + jg ; Manichaean_Samekh ; Manichaean_Samekh
|
| + jg ; Manichaean_Taw ; Manichaean_Taw
|
| + jg ; Manichaean_Ten ; Manichaean_Ten
|
| + jg ; Manichaean_Teth ; Manichaean_Teth
|
| + jg ; Manichaean_Thamedh ; Manichaean_Thamedh
|
| + jg ; Manichaean_Twenty ; Manichaean_Twenty
|
| + jg ; Manichaean_Waw ; Manichaean_Waw
|
| + jg ; Manichaean_Yodh ; Manichaean_Yodh
|
| + jg ; Manichaean_Zayin ; Manichaean_Zayin
|
| + jg ; Straight_Waw ; Straight_Waw
|
| + -> uchar.h & UCharacter.JoiningGroup
|
| +- 23 new Script (sc) values:
|
| + sc ; Aghb ; Caucasian_Albanian
|
| + sc ; Bass ; Bassa_Vah
|
| + sc ; Dupl ; Duployan
|
| + sc ; Elba ; Elbasan
|
| + sc ; Gran ; Grantha
|
| + sc ; Hmng ; Pahawh_Hmong
|
| + sc ; Khoj ; Khojki
|
| + sc ; Lina ; Linear_A
|
| + sc ; Mahj ; Mahajani
|
| + sc ; Mani ; Manichaean
|
| + sc ; Mend ; Mende_Kikakui
|
| + sc ; Modi ; Modi
|
| + sc ; Mroo ; Mro
|
| + sc ; Narb ; Old_North_Arabian
|
| + sc ; Nbat ; Nabataean
|
| + sc ; Palm ; Palmyrene
|
| + sc ; Pauc ; Pau_Cin_Hau
|
| + sc ; Perm ; Old_Permic
|
| + sc ; Phlp ; Psalter_Pahlavi
|
| + sc ; Sidd ; Siddham
|
| + sc ; Sind ; Khudawadi
|
| + sc ; Tirh ; Tirhuta
|
| + sc ; Wara ; Warang_Citi
|
| + -> uscript.h (many were added before)
|
| + comment "Mende Kikakui" for USCRIPT_MENDE
|
| + add USCRIPT_KHUDAWADI, make USCRIPT_SINDHI an alias
|
| + -> com.ibm.icu.lang.UScript
|
| + find USCRIPT_([^ ]+) *= ([0-9]+),(.+)
|
| + replace public static final int \1 = \2; \3
|
| +- 6 new script codes from ISO 15924 http://www.unicode.org/iso15924/codechanges.html
|
| + (added 2012-11-01)
|
| + Ahom 338 Ahom
|
| + Hatr 127 Hatran
|
| + Mult 323 Multani
|
| + (added 2013-10-12)
|
| + Modi 324 Modi
|
| + Pauc 263 Pau Cin Hau
|
| + Sidd 302 Siddham
|
| + -> uscript.h (some overlap with additions from Unicode)
|
| + -> com.ibm.icu.lang.UScript
|
| + find USCRIPT_([^ ]+) *= ([0-9]+),(.+)
|
| + replace public static final int \1 = \2; \3
|
| + -> add Ahom, Hatr, Mult to preparseucd.py _scripts_only_in_iso15924
|
| + -> add to expectedLong and expectedShort names in cintltst/cucdapi.c/TestUScriptCodeAPI()
|
| + and in com.ibm.icu.dev.test.lang.TestUScript.java
|
| +
|
| +* update Script metadata: SCRIPT_PROPS[] in uscript_props.cpp & UScript.ScriptMetadata
|
| + (not strictly necessary for NOT_ENCODED scripts)
|
| + ~/svn.icutools/trunk/src/unicode$ py/parsescriptmetadata.py $ICU_SRC_DIR/source/common/unicode/uscript.h ~/svn.cldr/trunk/common/properties/scriptMetadata.txt
|
| +
|
| +* generate normalization data files
|
| +- cd $ICU_ROOT/dbg
|
| +- export LD_LIBRARY_PATH=$ICU_ROOT/dbg/lib
|
| +- SRC_DATA_IN=$ICU_SRC_DIR/source/data/in
|
| +- UNIDATA=$ICU_SRC_DIR/source/data/unidata
|
| +- bin/gennorm2 -o $ICU_SRC_DIR/source/common/norm2_nfc_data.h -s $UNIDATA/norm2 nfc.txt --csource
|
| +- bin/gennorm2 -o $SRC_DATA_IN/nfc.nrm -s $UNIDATA/norm2 nfc.txt
|
| +- bin/gennorm2 -o $SRC_DATA_IN/nfkc.nrm -s $UNIDATA/norm2 nfc.txt nfkc.txt
|
| +- bin/gennorm2 -o $SRC_DATA_IN/nfkc_cf.nrm -s $UNIDATA/norm2 nfc.txt nfkc.txt nfkc_cf.txt
|
| +- bin/gennorm2 -o $SRC_DATA_IN/uts46.nrm -s $UNIDATA/norm2 nfc.txt uts46.txt
|
| +
|
| +* build ICU (make install)
|
| + so that the tools build can pick up the new definitions from the installed header files.
|
| +
|
| +~/svn.icu/uni70/dbg$ echo;echo;make -j5 install > out.txt 2>&1 ; tail -n 20 out.txt
|
| +
|
| +* build Unicode tools using CMake+make
|
| +
|
| +~/svn.icutools/trunk/src/unicode/c/icudefs.txt:
|
| +
|
| +# Location (--prefix) of where ICU was installed.
|
| +set(ICU_INST_DIR /home/mscherer/svn.icu/uni70/inst)
|
| +# Location of the ICU source tree.
|
| +set(ICU_SRC_DIR /home/mscherer/svn.icu/uni70/src)
|
| +
|
| +~/svn.icutools/trunk/dbg/unicode/c$ cmake ../../../src/unicode/c
|
| +~/svn.icutools/trunk/dbg/unicode/c$ make
|
| +
|
| +* genprops work
|
| +- new code point range for Joining_Group values: 10AC0..10AFF Manichaean
|
| + + add second array of Joining_Group values for at most 10800..10FFF
|
| + icutools: unicode/c/genprops/bidipropsbuilder.cpp
|
| + icu: source/common/ubidi_props.h/.c/_data.h
|
| + icu4j: main/classes/core/src/com/ibm/icu/impl/UBiDiProps.java
|
| +
|
| +* generate core properties data files
|
| +- ~/svn.icutools/trunk/dbg/unicode/c$ genprops/genprops $ICU_SRC_DIR
|
| +- ~/svn.icutools/trunk/dbg/unicode/c$ genuca/genuca $ICU_SRC_DIR
|
| +- rebuild ICU (make install) & tools
|
| +- run genuca again (see step above) so that it picks up the new nfc.nrm
|
| +- rebuild ICU (make install) & tools
|
| +
|
| +* update uts46test.cpp and UTS46Test.java if there are new characters that are equivalent to
|
| + sequences with non-LDH ASCII (that is, their decompositions contain '=' or similar)
|
| +- grep IdnaMappingTable.txt or uts46.txt for "disallowed_STD3_valid" on non-ASCII characters
|
| +- Unicode 6.0..7.0: U+2260, U+226E, U+226F
|
| +- nothing new in 7.0, no test file to update
|
| +
|
| +* run & fix ICU4C tests
|
| +
|
| +* update Java data files
|
| +- refresh just the UCD-related files, just to be safe
|
| +- see (ICU4C)/source/data/icu4j-readme.txt
|
| +- mkdir /tmp/icu4j
|
| +- ~/svn.icu/uni70/dbg$ make ICU4J_ROOT=/tmp/icu4j icu4j-data-install
|
| + output:
|
| + ...
|
| + Unicode .icu files built to ./out/build/icudt53l
|
| + echo timestamp > uni-core-data
|
| + mkdir -p ./out/icu4j/com/ibm/icu/impl/data/icudt53b
|
| + mkdir -p ./out/icu4j/tzdata/com/ibm/icu/impl/data/icudt53b
|
| + echo pnames.icu ubidi.icu ucase.icu uprops.icu > ./out/icu4j/add.txt
|
| + LD_LIBRARY_PATH=../lib:../stubdata:../tools/ctestfw:$LD_LIBRARY_PATH ../bin/icupkg ./out/tmp/icudt53l.dat ./out/icu4j/icudt53b.dat -a ./out/icu4j/add.txt -s ./out/build/icudt53l -x '*' -tb -d ./out/icu4j/com/ibm/icu/impl/data/icudt53b
|
| + mv ./out/icu4j/"com/ibm/icu/impl/data/icudt53b/zoneinfo64.res" ./out/icu4j/"com/ibm/icu/impl/data/icudt53b/metaZones.res" ./out/icu4j/"com/ibm/icu/impl/data/icudt53b/timezoneTypes.res" ./out/icu4j/"com/ibm/icu/impl/data/icudt53b/windowsZones.res" "./out/icu4j/tzdata/com/ibm/icu/impl/data/icudt53b"
|
| + jar cf ./out/icu4j/icudata.jar -C ./out/icu4j com/ibm/icu/impl/data/icudt53b/
|
| + mkdir -p /tmp/icu4j/main/shared/data
|
| + cp ./out/icu4j/icudata.jar /tmp/icu4j/main/shared/data
|
| + jar cf ./out/icu4j/icutzdata.jar -C ./out/icu4j/tzdata com/ibm/icu/impl/data/icudt53b/
|
| + mkdir -p /tmp/icu4j/main/shared/data
|
| + cp ./out/icu4j/icutzdata.jar /tmp/icu4j/main/shared/data
|
| + make[1]: Leaving directory `/home/mscherer/svn.icu/uni70/dbg/data'
|
| +- copy the big-endian Unicode data files to another location,
|
| + separate from the other data files
|
| + ICUDT=icudt54b
|
| + mkdir -p /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/coll
|
| + mkdir -p /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/brkitr
|
| + cd ~/svn.icu/uni70/dbg/data/out/icu4j
|
| + cp com/ibm/icu/impl/data/$ICUDT/confusables.cfu /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT
|
| + cp com/ibm/icu/impl/data/$ICUDT/*.icu /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT
|
| + rm /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/cnvalias.icu
|
| + cp com/ibm/icu/impl/data/$ICUDT/*.nrm /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT
|
| + cp com/ibm/icu/impl/data/$ICUDT/coll/*.icu /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/coll
|
| + cp com/ibm/icu/impl/data/$ICUDT/brkitr/* /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/brkitr
|
| +- refresh ICU4J
|
| + ~/svn.icu/uni70/dbg/data/out/icu4j$ jar uf ~/svn.icu4j/trunk/src/main/shared/data/icudata.jar -C /tmp/icu4j com/ibm/icu/impl/data/$ICUDT
|
| +
|
| +* update CollationFCD.java
|
| + + copy & paste the initializers of lcccIndex[] etc. from
|
| + ICU4C/source/i18n/collationfcd.cpp to
|
| + ICU4J/main/classes/collate/src/com/ibm/icu/impl/coll/CollationFCD.java
|
| +
|
| +* refresh Java test .txt files
|
| +- copy new .txt files into ICU4J's main/tests/core/src/com/ibm/icu/dev/data/unicode
|
| + cd $ICU_SRC_DIR/source/data/unidata
|
| + cp confusables.txt confusablesWholeScript.txt NormalizationCorrections.txt NormalizationTest.txt SpecialCasing.txt UnicodeData.txt ~/svn.icu4j/trunk/src/main/tests/core/src/com/ibm/icu/dev/data/unicode
|
| + cd ../../test/testdata
|
| + cp BidiCharacterTest.txt BidiTest.txt ~/svn.icu4j/trunk/src/main/tests/core/src/com/ibm/icu/dev/data/unicode
|
| + cp ~/unidata/uni70/20140409/ucd/CompositionExclusions.txt ~/svn.icu4j/trunk/src/main/tests/core/src/com/ibm/icu/dev/data/unicode
|
| +
|
| +* UCA
|
| +
|
| +- download UCA files (mostly allkeys.txt) from http://www.unicode.org/Public/UCA/<beta version>/
|
| +- run desuffixucd.py (see https://sites.google.com/site/unicodetools/inputdata)
|
| +- update the input files for Mark's UCA tools, in ~/svn.unitools/trunk/data/uca/7.0.0/
|
| +- run Mark's UCA Main: https://sites.google.com/site/unicodetools/home#TOC-UCA
|
| +- output files are in ~/svn.unitools/Generated/uca/7.0.0/
|
| +- review data; compare files, use blankweights.sed or similar
|
| + ~/svn.unitools$ sed -r -f blankweights.sed Generated/uca/7.0.0/CollationAuxiliary/FractionalUCA.txt > frac-7.0.txt
|
| +- cd ~/svn.unitools/Generated/uca/7.0.0/
|
| +- update source/data/unidata/FractionalUCA.txt with FractionalUCA_SHORT.txt
|
| + cp CollationAuxiliary/FractionalUCA_SHORT.txt $ICU_SRC_DIR/source/data/unidata/FractionalUCA.txt
|
| +- update source/data/unidata/UCARules.txt with UCA_Rules_SHORT.txt
|
| + (note the removal of the underscore before "Rules")
|
| + cp CollationAuxiliary/UCA_Rules_SHORT.txt $ICU_SRC_DIR/source/data/unidata/UCARules.txt
|
| +- update (ICU4C)/source/test/testdata/CollationTest_*.txt
|
| + and (ICU4J)/main/tests/collate/src/com/ibm/icu/dev/data/CollationTest_*.txt
|
| + with output from Mark's Unicode tools (..._CLDR_..._SHORT.txt)
|
| + cp CollationAuxiliary/CollationTest_CLDR_NON_IGNORABLE_SHORT.txt $ICU_SRC_DIR/source/test/testdata/CollationTest_NON_IGNORABLE_SHORT.txt
|
| + cp CollationAuxiliary/CollationTest_CLDR_SHIFTED_SHORT.txt $ICU_SRC_DIR/source/test/testdata/CollationTest_SHIFTED_SHORT.txt
|
| + cp $ICU_SRC_DIR/source/test/testdata/CollationTest_*.txt ~/svn.icu4j/trunk/src/main/tests/collate/src/com/ibm/icu/dev/data
|
| +- run genuca, see command line above
|
| +- rebuild ICU4C
|
| +- refresh ICU4J collation data:
|
| + (subset of the instructions above for the properties data refresh, except that this copies all of coll/*)
|
| + ICUDT=icudt54b
|
| + ~/svn.icu/uni70/dbg$ make ICU4J_ROOT=/tmp/icu4j icu4j-data-install
|
| + ~/svn.icu/uni70/dbg$ mkdir -p /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/coll
|
| + ~/svn.icu/uni70/dbg/data/out/icu4j$ cp com/ibm/icu/impl/data/$ICUDT/coll/* /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/coll
|
| + ~/svn.icu/uni70/dbg/data/out/icu4j$ jar uf ~/svn.icu4j/trunk/src/main/shared/data/icudata.jar -C /tmp/icu4j com/ibm/icu/impl/data/$ICUDT
|
| +- run all tests with the *_SHORT.txt or the full files (the full ones have comments, useful for debugging)
|
| +- note on intltest: if collate/UCAConformanceTest fails, then
|
| + utility/MultithreadTest/TestCollators will fail as well;
|
| + fix the conformance test before looking into the multi-thread test
|
| +- copy all output from Mark's UCA tool to unicode.org for review & staging by Ken & editors
|
| +- copy most of ~/svn.unitools/Generated/uca/7.0.0/CollationAuxiliary/* to CLDR branch
|
| + ~/svn.unitools$ cp Generated/uca/7.0.0/CollationAuxiliary/* ~/svn.cldr/trunk/common/uca/
|
| +
|
| +* When refreshing all of ICU4J data from ICU4C
|
| +- ~/svn.icu/uni70/dbg$ make ICU4J_ROOT=/tmp/icu4j icu4j-data-install
|
| +- cp /tmp/icu4j/main/shared/data/icudata.jar ~/svn.icu4j/trunk/src/main/shared/data
|
| +or
|
| +- ~/svn.icu/uni70/dbg$ make ICU4J_ROOT=~/svn.icu4j/trunk/src icu4j-data-install
|
| +
|
| +* run & fix ICU4J tests
|
| +
|
| +*** LayoutEngine script information
|
| +
|
| +(For details see the Unicode 5.2 change log below.)
|
| +
|
| +* Run icu4j-tools: com.ibm.icu.dev.tool.layout.ScriptNameBuilder.
|
| + This generates LEScripts.h, LELanguages.h, ScriptAndLanguageTags.h and ScriptAndLanguageTags.cpp
|
| + in the working directory.
|
| + (It also generates ScriptRunData.cpp, which is no longer needed.)
|
| +
|
| + The generated files have a current copyright date and "@stable" statement.
|
| + ICU 54: Fixed tools/misc/src/com/ibm/icu/dev/tool/layout/ScriptIDModuleWriter.java
|
| + for "born stable" Unicode API constants, and to stop parsing ICU version numbers
|
| + which may not contain dots any more.
|
| +
|
| +- diff current <icu>/source/layout files vs. generated ones
|
| + ~/svn.icu4j/trunk/src$ meld $ICU_SRC_DIR/source/layout tools/misc/src/com/ibm/icu/dev/tool/layout
|
| + review and manually merge desired changes;
|
| + fix gratuitous changes, incorrect @draft/@stable and missing aliases;
|
| + Unicode-derived script codes should be "born stable" like constants in uchar.h, uscript.h etc.
|
| +- if you just copy the above files, then
|
| + fix mixed line endings, review the diffs as above and restore changes to API tags etc.;
|
| + manually re-add the "Indic script xyz v.2" tags in ScriptAndLanguageTags.h
|
| +
|
| +*** API additions
|
| +- send notice to icu-design about new born-@stable API (enum constants etc.)
|
| +
|
| +*** merge the Unicode update branches back onto the trunk
|
| +- do not merge the icudata.jar and testdata.jar,
|
| + instead rebuild them from merged & tested ICU4C
|
| +
|
| +---------------------------------------------------------------------------- ***
|
| +
|
| Unicode 6.3 update
|
|
|
| http://www.unicode.org/review/pri249/ -- beta review
|
|
|