Chromium Code Reviews
Side by Side Diff: source/data/unidata/changes.txt

Issue 1621843002: ICU 56 update step 1 (Closed) Base URL: https://chromium.googlesource.com/chromium/deps/icu.git@561
Patch Set: Created 4 years, 11 months ago
OLD NEW
1 * Copyright (C) 2004-2014, International Business Machines 1 * Copyright (C) 2004-2015, International Business Machines
2 * Corporation and others. All Rights Reserved. 2 * Corporation and others. All Rights Reserved.
3 * 3 *
4 * file name: changes.txt 4 * file name: changes.txt
5 * encoding: US-ASCII 5 * encoding: US-ASCII
6 * tab size: 8 (not used) 6 * tab size: 8 (not used)
7 * indentation:4 7 * indentation:4
8 * 8 *
9 * created on: 2004may06 9 * created on: 2004may06
10 * created by: Markus W. Scherer 10 * created by: Markus W. Scherer
11 * 11 *
12 * change log for Unicode updates 12 * change log for Unicode updates
13 13
14 ---------------------------------------------------------------------------- *** 14 ---------------------------------------------------------------------------- ***
15 15
16 Unicode 8.0 update for ICU ?? 16 * New ISO 15924 script codes
17 17
18 * UCA issue from 7.0 18 Starting with ICU 55, we do not add UScriptCode constants any more until their scripts
19 19 are encoded in Unicode, or can be assumed to be encoded in the next Unicode version.
20 - U+1DE9 COMBINING LATIN SMALL LETTER BETA 20 Script enum constant names want to follow the Unicode script property value aliases,
21 sorts with Greek Beta, should sort with Latin B? 21 which are assigned only when the scripts are encoded.
22 + Ken says: 22 When we encode scripts early and guess wrong, then we have confusing enum constants
23 No, it was deliberate: 23 and have sometimes added aliases.
24 24
25 03B2;GREEK SMALL LETTER BETA;Ll;;;;0392;;0392 25 Exception: Script codes like Latf and Aran that are not subject to separate encoding
26 1D5D;MODIFIER LETTER SMALL BETA;Lm;<super> 03B2;;;;; 26 can be added at any time.
27 1DE9;COMBINING LATIN SMALL LETTER BETA;Mn;<sort> 03B2;;;;; 27
28 1D66;GREEK SUBSCRIPT SMALL LETTER BETA;Ll;<sub> 03B2;;;;; 28 Script codes not yet in ICU: http://www.unicode.org/iso15924/codechanges.html
29 29
30 Note the relationship to U+1D5D. 30 Added 2014-11-15, see http://bugs.icu-project.org/trac/ticket/11561
31 31 - Adlm 166 Adlam
32 When the disunified *Latin* beta base letter shows up in Unicode 8.0: 32 - Aran 161 Arabic (Nastaliq variant)
33 33 - Kitl 505 Khitan large script
34 U+A7B4 LATIN CAPITAL LETTER BETA 34 - Kits 288 Khitan small script
35 U+A7B5 LATIN SMALL LETTER BETA 35 - Marc 332 Marchen
36 36 - Osge 219 Osage
37 we could re-evaluate what U+1DE9 equates to, for collation, 37
38 but currently there isn’t any Latin beta to serve that function 38 Aran can be added as USCRIPT_ARABIC_NASTALIQ at any time.
39 in Unicode 7.0. 39
40 40 Adlam, Marchen, and Osage are expected to go into Unicode 9;
41 - ICU_ROOT=~/svn.icu/trunk 41 we should assign Unicode script property value aliases for them
42 - ICU_SRC_DIR=$ICU_ROOT/src 42 soon after Unicode 8 is released, and add them in ICU 56.
43
44 Khitan scripts will be encoded later.
45
46 ---------------------------------------------------------------------------- ***
47
48 Unicode 8.0 update for ICU 56
49
50 * Command-line environment setup
51
52 ICU_ROOT=~/svn.icu/trunk
53 ICU_SRC_DIR=$ICU_ROOT/src
54 ICUDT=icudt56b
55 export LD_LIBRARY_PATH=$ICU_ROOT/dbg/lib
56 SRC_DATA_IN=$ICU_SRC_DIR/source/data/in
57 UNIDATA=$ICU_SRC_DIR/source/data/unidata
58
59 http://www.unicode.org/review/pri297/ -- beta review
60 http://www.unicode.org/reports/uax-proposed-updates.html
61 http://unicode.org/versions/beta-8.0.0.html
62 http://www.unicode.org/versions/Unicode8.0.0/
63 http://www.unicode.org/reports/tr44/tr44-15.html
64
65 *** ICU Trac
66
67 - ticket:11574: Unicode 8
68 - C++ branches/markus/uni80 at r37351 from trunk at r37343
69 - Java branches/markus/uni80 at r37352 from trunk at r37338
70
71 *** CLDR Trac
72
73 - cldrbug 8311: UCA 8
74 - branches/markus/uni80 at r11518 from trunk at r11517
75
76 - cldrbug 8109: Unicode 8.0 script metadata
77 - cldrbug 8418: Updated segmentation for Unicode 8.0
78
79 *** Unicode version numbers
80 - makedata.mak
81 - uchar.h
82 - com.ibm.icu.util.VersionInfo
83 - com.ibm.icu.dev.test.lang.UCharacterTest.VERSION_
84
85 - Run ICU4C "configure" _after_ updating the Unicode version number in uchar.h
86 so that the makefiles see the new version number.
87
88 *** data files & enums & parser code
89
90 * file preparation
91
92 - download UCD & IDNA files
93 - make sure that the Unicode data folder passed into preparseucd.py
94 includes a copy of the latest IdnaMappingTable.txt (can be in some subfolder)
95 - only for manual diffs: remove version suffixes from the file names
96 ~/unidata/uni70/20140403$ ../../desuffixucd.py .
97 (see https://sites.google.com/site/unicodetools/inputdata)
98 - only for manual diffs: extract Unihan.zip to "here" (.../ucd/Unihan/*.txt), delete Unihan.zip
99 - ~/svn.icutools/trunk/src/unicode$ py/preparseucd.py ~/unidata/uni80/20150415 $ICU_SRC_DIR ~/svn.icutools/trunk/src
100 - This writes files (especially ppucd.txt) to the ICU4C unidata and testdata subfolders.
101
102 - also: from http://unicode.org/Public/security/8.0.0/ download new
103 confusables.txt & confusablesWholeScript.txt
104 and copy to $UNIDATA
105 ~/unidata$ cp uni80/20150415/security/confusables.txt $UNIDATA
106 ~/unidata$ cp uni80/20150415/security/confusablesWholeScript.txt $UNIDATA
107
108 * initial preparseucd.py changes
109 - remove new Unicode scripts from the
110 only-in-ISO-15924 list according to the error message:
111 ValueError: remove ['Ahom', 'Hatr', 'Hluw', 'Hung', 'Mult', 'Sgnw']
112 from _scripts_only_in_iso15924
113 -> fix expectedLong names in cucdapi.c/TestUScriptCodeAPI()
114 and in com.ibm.icu.dev.test.lang.TestUScript.java
115 - property and file name change:
116 IndicMatraCategory -> IndicPositionalCategory
117 - UnicodeData.txt unusual numeric values (improper fractions)
118 109F6;MEROITIC CURSIVE FRACTION ONE TWELFTH;No;0;R;;;;1/12;N;;;;;
119 109F7;MEROITIC CURSIVE FRACTION TWO TWELFTHS;No;0;R;;;;2/12;N;;;;;
120 109F8;MEROITIC CURSIVE FRACTION THREE TWELFTHS;No;0;R;;;;3/12;N;;;;;
121 109F9;MEROITIC CURSIVE FRACTION FOUR TWELFTHS;No;0;R;;;;4/12;N;;;;;
122 109FA;MEROITIC CURSIVE FRACTION FIVE TWELFTHS;No;0;R;;;;5/12;N;;;;;
123 109FB;MEROITIC CURSIVE FRACTION SIX TWELFTHS;No;0;R;;;;6/12;N;;;;;
124 109FC;MEROITIC CURSIVE FRACTION SEVEN TWELFTHS;No;0;R;;;;7/12;N;;;;;
125 109FD;MEROITIC CURSIVE FRACTION EIGHT TWELFTHS;No;0;R;;;;8/12;N;;;;;
126 109FE;MEROITIC CURSIVE FRACTION NINE TWELFTHS;No;0;R;;;;9/12;N;;;;;
127 109FF;MEROITIC CURSIVE FRACTION TEN TWELFTHS;No;0;R;;;;10/12;N;;;;;
128 -> change preparseucd.py to map them to proper fractions (e.g., 1/6)
129 which are listed in DerivedNumericValues.txt;
130 keeps storage in data file simple
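
The mapping described above, from the unreduced twelfths in UnicodeData.txt to the lowest-terms values listed in DerivedNumericValues.txt, could look roughly like this. It is a minimal sketch and not the actual preparseucd.py change; the helper name `reduce_numeric_value` is hypothetical.

```python
from fractions import Fraction


def reduce_numeric_value(nv):
    """Return a UnicodeData.txt numeric value string in lowest terms.

    Hypothetical helper, not taken from preparseucd.py.
    Values without a '/' (plain integers) are returned unchanged.
    """
    if "/" not in nv:
        return nv
    numerator, denominator = nv.split("/")
    f = Fraction(int(numerator), int(denominator))
    if f.denominator == 1:
        return str(f.numerator)
    return "%d/%d" % (f.numerator, f.denominator)


# Numeric value fields of some MEROITIC CURSIVE FRACTION lines quoted above.
for nv in ("1/12", "2/12", "6/12", "10/12"):
    print(nv, "->", reduce_numeric_value(nv))
```

Reducing at parse time means the data builder only ever sees one spelling per value, which is the "keeps storage in data file simple" point made above.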
131
132 * PropertyValueAliases.txt changes
133 - 10 new Block (blk) values:
134 blk; Ahom ; Ahom
135 blk; Anatolian_Hieroglyphs ; Anatolian_Hieroglyphs
136 blk; Cherokee_Sup ; Cherokee_Supplement
137 blk; CJK_Ext_E ; CJK_Unified_Ideographs_Extension_E
138 blk; Early_Dynastic_Cuneiform ; Early_Dynastic_Cuneiform
139 blk; Hatran ; Hatran
140 blk; Multani ; Multani
141 blk; Old_Hungarian ; Old_Hungarian
142 blk; Sup_Symbols_And_Pictographs ; Supplemental_Symbols_And_Pictographs
143 blk; Sutton_SignWriting ; Sutton_SignWriting
144 -> add to uchar.h
145 use long property names for enum constants
146 -> add to UCharacter.UnicodeBlock IDs
147 Eclipse find UBLOCK_([^ ]+) = ([0-9]+), (/.+)
148 replace public static final int \1_ID = \2; \3
149 -> add to UCharacter.UnicodeBlock objects
150 Eclipse find UBLOCK_([^ ]+) = [0-9]+, (/.+)
151 replace public static final UnicodeBlock \1 = new UnicodeBlock("\1", \1_ID); \2
152 - 6 new Script (sc) values:
153 sc ; Ahom ; Ahom
154 sc ; Hatr ; Hatran
155 sc ; Hluw ; Anatolian_Hieroglyphs
156 sc ; Hung ; Old_Hungarian
157 sc ; Mult ; Multani
158 sc ; Sgnw ; SignWriting
159 -> all of them had been added already to uscript.h & com.ibm.icu.lang.UScript
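
For readers without Eclipse, the two find/replace steps for the UnicodeBlock constants can be approximated with Python's `re` module. This is only a sketch: the patterns are the ones quoted above, but the function names and the sample input line are hypothetical, and the real uchar.h lines may differ in detail.

```python
import re

# The two Eclipse find patterns quoted above, as Python regular expressions.
ID_PATTERN = re.compile(r"UBLOCK_([^ ]+) = ([0-9]+), (/.+)")
OBJECT_PATTERN = re.compile(r"UBLOCK_([^ ]+) = [0-9]+, (/.+)")


def to_id_line(line):
    """Rewrite a uchar.h enum line as a UCharacter.UnicodeBlock ID constant."""
    return ID_PATTERN.sub(
        r"public static final int \g<1>_ID = \g<2>; \g<3>", line)


def to_object_line(line):
    """Rewrite a uchar.h enum line as a UCharacter.UnicodeBlock object."""
    return OBJECT_PATTERN.sub(
        r'public static final UnicodeBlock \g<1> = '
        r'new UnicodeBlock("\g<1>", \g<1>_ID); \g<2>',
        line)


# Hypothetical input line; the block name, number, and comment are made up.
sample = "UBLOCK_EXAMPLE_BLOCK = 999, /*[12345]*/"
print(to_id_line(sample))
print(to_object_line(sample))
```

The `\g<1>` form is used instead of `\1` so that the group reference is unambiguous when it is followed directly by `_ID`.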
160
161 * update Script metadata: SCRIPT_PROPS[] in uscript_props.cpp & UScript.ScriptMetadata
162 (not strictly necessary for NOT_ENCODED scripts)
163 ~/svn.icutools/trunk/src/unicode$ py/parsescriptmetadata.py $ICU_SRC_DIR/source/common/unicode/uscript.h ~/svn.cldr/trunk/common/properties/scriptMetadata.txt
164
165 * generate normalization data files
166 cd $ICU_ROOT/dbg
167 bin/gennorm2 -o $ICU_SRC_DIR/source/common/norm2_nfc_data.h -s $UNIDATA/norm2 nfc.txt --csource
168 bin/gennorm2 -o $SRC_DATA_IN/nfc.nrm -s $UNIDATA/norm2 nfc.txt
169 bin/gennorm2 -o $SRC_DATA_IN/nfkc.nrm -s $UNIDATA/norm2 nfc.txt nfkc.txt
170 bin/gennorm2 -o $SRC_DATA_IN/nfkc_cf.nrm -s $UNIDATA/norm2 nfc.txt nfkc.txt nfkc_cf.txt
171 bin/gennorm2 -o $SRC_DATA_IN/uts46.nrm -s $UNIDATA/norm2 nfc.txt uts46.txt
172
173 * build ICU (make install)
174 so that the tools build can pick up the new definitions from the installed header files.
175
176 $ICU_ROOT/dbg$ echo;echo;make -j5 install > out.txt 2>&1 ; tail -n 20 out.txt
177
178 * build Unicode tools using CMake+make
179
180 ~/svn.icutools/trunk/src/unicode/c/icudefs.txt:
181
182 # Location (--prefix) of where ICU was installed.
183 set(ICU_INST_DIR /home/mscherer/svn.icu/trunk/inst)
184 # Location of the ICU source tree.
185 set(ICU_SRC_DIR /home/mscherer/svn.icu/trunk/src)
186
187 ~/svn.icutools/trunk/dbg/unicode/c$ cmake ../../../src/unicode/c
188 ~/svn.icutools/trunk/dbg/unicode/c$ make
189
190 * generate core properties data files
191 - ~/svn.icutools/trunk/dbg/unicode/c$ genprops/genprops $ICU_SRC_DIR
43 - ~/svn.icutools/trunk/dbg/unicode/c$ genuca/genuca --hanOrder implicit $ICU_SRC_DIR 192 - ~/svn.icutools/trunk/dbg/unicode/c$ genuca/genuca --hanOrder implicit $ICU_SRC_DIR
44 - ~/svn.icutools/trunk/dbg/unicode/c$ genuca/genuca --hanOrder radical-stroke $ICU_SRC_DIR 193 - ~/svn.icutools/trunk/dbg/unicode/c$ genuca/genuca --hanOrder radical-stroke $ICU_SRC_DIR
45 194 - rebuild ICU (make install) & tools
195 - run genuca again (see step above) so that it picks up the new nfc.nrm
196 - rebuild ICU (make install) & tools
197
198 * update uts46test.cpp and UTS46Test.java if there are new characters that are equivalent to
199 sequences with non-LDH ASCII (that is, their decompositions contain '=' or similar)
200 - grep IdnaMappingTable.txt or uts46.txt for "disallowed_STD3_valid" on non-ASCII characters
201 - Unicode 6.0..8.0: U+2260, U+226E, U+226F
202 - nothing new in 8.0, no test file to update
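
The grep step above can also be scripted. The sketch below assumes the usual UCD-style layout (fields separated by ';', comments starting with '#', code points written as XXXX or XXXX..YYYY); the function name and the sample lines are hypothetical and not copied from IdnaMappingTable.txt or uts46.txt.

```python
def find_non_ascii_with_status(lines, status="disallowed_STD3_valid"):
    """Yield (start, end) code point ranges above U+007F with the given status.

    Hypothetical helper. Ranges that begin in ASCII are clipped to U+0080.
    """
    for line in lines:
        data = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue
        fields = [f.strip() for f in data.split(";")]
        if len(fields) < 2 or fields[1] != status:
            continue
        start, _, end = fields[0].partition("..")
        start = int(start, 16)
        end = int(end, 16) if end else start
        if end > 0x7F:
            yield (max(start, 0x80), end)


# Hypothetical sample lines in that layout.
sample = """\
0000..002C ; disallowed_STD3_valid # hypothetical ASCII range
00E0 ; valid # hypothetical
E000..E001 ; disallowed_STD3_valid # hypothetical non-ASCII range
"""
for start, end in find_non_ascii_with_status(sample.splitlines()):
    print("U+%04X..U+%04X" % (start, end))
```

Comparing the output for the old and new data files would show whether any characters were added that need new test cases.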
203
204 * run & fix ICU4C tests
205 - bad Cherokee case folding due to difference in fallbacks:
206 UCD case folding falls back to no mapping,
207 ICU runtime case folding falls back to lowercasing;
208 fixed casepropsbuilder.cpp to generate scf mappings to self
209 when there is an slc mapping but no scf
210 - Andy handles RBBI & spoof check test failures
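
The Cherokee fix described above comes down to one rule: where a code point has a simple lowercase mapping (slc) but no simple case folding (scf), store an explicit folding to itself. The sketch below shows that rule in Python; it is not the casepropsbuilder.cpp code, the function name is hypothetical, and the two sample mappings assume the Unicode 8 Cherokee data (U+13A0 lowercases to U+AB70, and U+AB70 folds to U+13A0).

```python
def add_scf_to_self(slc, scf):
    """Return a copy of scf with explicit self-mappings added.

    slc and scf map code points to their simple lowercase and simple
    case folding targets. A code point with an slc mapping but no scf
    mapping gets scf[c] = c, so a runtime that falls back from
    "no folding" to lowercasing leaves it unchanged when folding.
    """
    result = dict(scf)
    for c in slc:
        if c not in result:
            result[c] = c
    return result


# Illustrative data only (assumed Unicode 8 Cherokee mappings).
slc = {0x13A0: 0xAB70}
scf = {0xAB70: 0x13A0}
fixed = add_scf_to_self(slc, scf)
print({"%04X" % k: "%04X" % v for k, v in sorted(fixed.items())})
```

Without the added self-mapping, a runtime that falls back to lowercasing would fold U+13A0 to U+AB70, while the UCD fallback (no mapping) leaves it at U+13A0.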
211
212 * collation: CLDR collation root, UCA DUCET
213
214 - UCA DUCET goes into Mark's Unicode tools, see
215 https://sites.google.com/site/unicodetools/home#TOC-UCA
216 - CLDR root data files are checked into (CLDR UCA branch)/common/uca/
217 - cd (CLDR UCA branch)/common/uca/
218 - update source/data/unidata/FractionalUCA.txt with FractionalUCA_SHORT.txt
219 cp FractionalUCA_SHORT.txt $ICU_SRC_DIR/source/data/unidata/FractionalUCA.txt
220 - update source/data/unidata/UCARules.txt with UCA_Rules_SHORT.txt
221 cp $ICU_SRC_DIR/source/data/unidata/UCARules.txt /tmp/UCARules-old.txt
222 (note removing the underscore before "Rules")
223 cp UCA_Rules_SHORT.txt $ICU_SRC_DIR/source/data/unidata/UCARules.txt
224 - restore TODO diffs in UCARules.txt
225 meld /tmp/UCARules-old.txt $ICU_SRC_DIR/source/data/unidata/UCARules.txt
226 - update (ICU4C)/source/test/testdata/CollationTest_*.txt
227 and (ICU4J)/main/tests/collate/src/com/ibm/icu/dev/data/CollationTest_*.txt
228 from the CLDR root files (..._CLDR_..._SHORT.txt)
229 cp CollationTest_CLDR_NON_IGNORABLE_SHORT.txt $ICU_SRC_DIR/source/test/testdata/CollationTest_NON_IGNORABLE_SHORT.txt
230 cp CollationTest_CLDR_SHIFTED_SHORT.txt $ICU_SRC_DIR/source/test/testdata/CollationTest_SHIFTED_SHORT.txt
231 cp $ICU_SRC_DIR/source/test/testdata/CollationTest_*.txt ~/svn.icu4j/trunk/src/main/tests/collate/src/com/ibm/icu/dev/data
232 - if CLDR common/uca/unihan-index.txt changes, then update
233 CLDR common/collation/root.xml <collation type="private-unihan">
234 and regenerate (or update in parallel) $ICU_SRC_DIR/source/data/coll/root.txt
235 - run genuca, see command line above;
236 deal with
237 Error: Unknown script for first-primary sample character U+07d8 on line 23005 of /home/mscherer/svn.icu/trunk/src/source/data/unidata/FractionalUCA.txt
238 (add the character to genuca.cpp sampleCharsToScripts[])
239 + look up the script for the new sample characters
240 (e.g., in FractionalUCA.txt)
241 + *add* mappings to sampleCharsToScripts[], do not replace them
242 (in case the script sample characters flip-flop)
243 + insert new scripts in DUCET script order, see the top_byte table
244 at the beginning of FractionalUCA.txt
245 - rebuild ICU4C
246
247 * run & fix ICU4C tests, now with new CLDR collation root data
248 - run all tests with the collation test data *_SHORT.txt or the full files
249 (the full ones have comments, useful for debugging)
250 - note on intltest: if collate/UCAConformanceTest fails, then
251 utility/MultithreadTest/TestCollators will fail as well;
252 fix the conformance test before looking into the multi-thread test
253 - fixed bug in CollationWeights::getWeightRanges()
254 exposed by new data and CollationTest::TestRootElements
255
256 * update Java data files
257 - refresh just the UCD/UCA-related/derived files, just to be safe
258 - see (ICU4C)/source/data/icu4j-readme.txt
259 - mkdir /tmp/icu4j
260 - ~/svn.icu/trunk/dbg$ make ICU4J_ROOT=/tmp/icu4j icu4j-data-install
261 output:
262 ...
263 Unicode .icu files built to ./out/build/icudt56l
264 echo timestamp > uni-core-data
265 mkdir -p ./out/icu4j/com/ibm/icu/impl/data/icudt56b
266 mkdir -p ./out/icu4j/tzdata/com/ibm/icu/impl/data/icudt56b
267 echo pnames.icu uprops.icu ucase.icu ubidi.icu nfc.nrm > ./out/icu4j/add.txt
268 LD_LIBRARY_PATH=../lib:../stubdata:../tools/ctestfw:$LD_LIBRARY_PATH ../bin/icupkg ./out/tmp/icudt56l.dat ./out/icu4j/icudt56b.dat -a ./out/icu4j/add.txt -s ./out/build/icudt56l -x '*' -tb -d ./out/icu4j/com/ibm/icu/impl/data/icudt56b
269 mv ./out/icu4j/"com/ibm/icu/impl/data/icudt56b/zoneinfo64.res" ./out/icu4j/"com/ibm/icu/impl/data/icudt56b/metaZones.res" ./out/icu4j/"com/ibm/icu/impl/data/icudt56b/timezoneTypes.res" ./out/icu4j/"com/ibm/icu/impl/data/icudt56b/windowsZones.res" "./out/icu4j/tzdata/com/ibm/icu/impl/data/icudt56b"
270 jar cf ./out/icu4j/icudata.jar -C ./out/icu4j com/ibm/icu/impl/data/icudt56b/
271 mkdir -p /tmp/icu4j/main/shared/data
272 cp ./out/icu4j/icudata.jar /tmp/icu4j/main/shared/data
273 jar cf ./out/icu4j/icutzdata.jar -C ./out/icu4j/tzdata com/ibm/icu/impl/data/icudt56b/
274 mkdir -p /tmp/icu4j/main/shared/data
275 cp ./out/icu4j/icutzdata.jar /tmp/icu4j/main/shared/data
276 make[1]: Leaving directory `/home/mscherer/svn.icu/trunk/dbg/data'
277 - copy the big-endian Unicode data files to another location,
278 separate from the other data files,
279 and then refresh ICU4J
280 cd ~/svn.icu/trunk/dbg/data/out/icu4j
281 mkdir -p /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/coll
282 mkdir -p /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/brkitr
283 cp com/ibm/icu/impl/data/$ICUDT/confusables.cfu /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT
284 cp com/ibm/icu/impl/data/$ICUDT/*.icu /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT
285 rm /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/cnvalias.icu
286 cp com/ibm/icu/impl/data/$ICUDT/*.nrm /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT
287 cp com/ibm/icu/impl/data/$ICUDT/coll/* /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/coll
288 cp com/ibm/icu/impl/data/$ICUDT/brkitr/* /tmp/icu4j/com/ibm/icu/impl/data/$ICUDT/brkitr
289 jar uf ~/svn.icu4j/trunk/src/main/shared/data/icudata.jar -C /tmp/icu4j com/ibm/icu/impl/data/$ICUDT
290
291 * When refreshing all of ICU4J data from ICU4C
292 - ~/svn.icu/trunk/dbg$ make ICU4J_ROOT=/tmp/icu4j icu4j-data-install
293 - cp /tmp/icu4j/main/shared/data/icudata.jar ~/svn.icu4j/trunk/src/main/shared/data
294 or
295 - ~/svn.icu/trunk/dbg$ make ICU4J_ROOT=~/svn.icu4j/trunk/src icu4j-data-install
296
297 * update CollationFCD.java
298 + copy & paste the initializers of lcccIndex[] etc. from
299 ICU4C/source/i18n/collationfcd.cpp to
300 ICU4J/main/classes/collate/src/com/ibm/icu/impl/coll/CollationFCD.java
301
302 * refresh Java test .txt files
303 - copy new .txt files into ICU4J's main/tests/core/src/com/ibm/icu/dev/data/unicode
304 cd $ICU_SRC_DIR/source/data/unidata
305 cp confusables.txt confusablesWholeScript.txt NormalizationCorrections.txt NormalizationTest.txt SpecialCasing.txt UnicodeData.txt ~/svn.icu4j/trunk/src/main/tests/core/src/com/ibm/icu/dev/data/unicode
306 cd ../../test/testdata
307 cp BidiCharacterTest.txt BidiTest.txt ~/svn.icu4j/trunk/src/main/tests/core/src/com/ibm/icu/dev/data/unicode
308 cp ~/unidata/uni80/20150415/ucd/CompositionExclusions.txt ~/svn.icu4j/trunk/src/main/tests/core/src/com/ibm/icu/dev/data/unicode
309
310 * run & fix ICU4J tests
311
312 *** LayoutEngine script information
313
314 * ICU 56: Modify ScriptIDModuleWriter.java to not output @stable tags any more,
315 because the layout engine was deprecated in ICU 54.
316 Modify ScriptIDModuleWriter.java and ScriptTagModuleWriter.java
317 to write lines that we used to add manually.
318
319 * Run icu4j-tools: com.ibm.icu.dev.tool.layout.ScriptNameBuilder.
320 This generates LEScripts.h, LELanguages.h, ScriptAndLanguageTags.h and ScriptAndLanguageTags.cpp
321 in the working directory.
322
323 (It also generates ScriptRunData.cpp, which is no longer needed.)
324
325 It also reads and regenerates tools/misc/src/com/ibm/icu/dev/tool/layout/ScriptAndLanguages
326 (a plain text file)
327 which maps ICU versions to the numbers of script/language constants
328 that were added then.
329 (This mapping is probably obsolete since we do not print "@stable ICU xy" any more.)
330
331 The generated files have a current copyright date and "@deprecated" statement.
332
333 * Review changes, fix Java tool if necessary, and copy to ICU4C
334 cd ~/svn.icu4j/trunk/src
335 meld $ICU_SRC_DIR/source/layout tools/misc/src/com/ibm/icu/dev/tool/layout
336 cp tools/misc/src/com/ibm/icu/dev/tool/layout/*.h $ICU_SRC_DIR/source/layout
337 cp tools/misc/src/com/ibm/icu/dev/tool/layout/ScriptAndLanguageTags.cpp $ICU_SRC_DIR/source/layout
338
339 *** API additions
340 - send notice to icu-design about new born-@stable API (enum constants etc.)
341
342 *** merge the Unicode update branches back onto the trunk
343 - do not merge the icudata.jar and testdata.jar,
344 instead rebuild them from merged & tested ICU4C
345 - make sure that changes to Unicode tools & ICU tools are checked in
346 http://www.unicode.org/utility/trac/log/trunk/unicodetools
347 http://bugs.icu-project.org/trac/log/tools/trunk
46 348
47 ---------------------------------------------------------------------------- *** 349 ---------------------------------------------------------------------------- ***
48 350
49 Unicode 7.0 update for ICU 54 351 Unicode 7.0 update for ICU 54
50 352
51 http://www.unicode.org/review/pri271/ -- beta review 353 http://www.unicode.org/review/pri271/ -- beta review
52 http://www.unicode.org/reports/uax-proposed-updates.html 354 http://www.unicode.org/reports/uax-proposed-updates.html
53 http://www.unicode.org/versions/beta-7.0.0.html#notable_issues 355 http://www.unicode.org/versions/beta-7.0.0.html#notable_issues
54 http://www.unicode.org/reports/tr44/tr44-13.html 356 http://www.unicode.org/reports/tr44/tr44-13.html
55 357
(...skipping 2027 matching lines...)
2083 2385
2084 * name matching 2386 * name matching
2085 - read UCD.html 2387 - read UCD.html
2086 2388
2087 * scripts 2389 * scripts
2088 - use new Hrkt=Katakana_Or_Hiragana 2390 - use new Hrkt=Katakana_Or_Hiragana
2089 2391
2090 * ZWJ & ZWNJ 2392 * ZWJ & ZWNJ
2091 - are now part of combining character sequences 2393 - are now part of combining character sequences
2092 - break iteration used to assume that LB classes did not overlap; now they do for ZWJ & ZWNJ 2394 - break iteration used to assume that LB classes did not overlap; now they do for ZWJ & ZWNJ