OLD | NEW |
1 Name: icu | 1 Name: icu |
2 URL: http://site.icu-project.org/ | 2 URL: http://site.icu-project.org/ |
3 Version: 54.1 | 3 Version: 54.1 |
4 License: MIT | 4 License: MIT |
5 Security Critical: yes | 5 Security Critical: yes |
6 | 6 |
7 Description: | 7 Description: |
8 This directory contains the source code of ICU 54.1 for C/C++. | 8 This directory contains the source code of ICU 54.1 for C/C++. |
9 | 9 |
10 | 10 |
(...skipping 45 matching lines...) |
56 | 56 |
57 - Apply patches/khmer-dictbe.patch and put in a smaller Khmer dictionary | 57 - Apply patches/khmer-dictbe.patch and put in a smaller Khmer dictionary |
58 (source/data/brkitr/khmerdict.txt) obtained from | 58 (source/data/brkitr/khmerdict.txt) obtained from |
59 http://bugs.icu-project.org/trac/ticket/9451 | 59 http://bugs.icu-project.org/trac/ticket/9451 |
60 | 60 |
61 - Add several common Chinese words that were dropped previously to | 61 - Add several common Chinese words that were dropped previously to |
62 source/data/cjdict/brkitr/cjdict.txt | 62 source/data/cjdict/brkitr/cjdict.txt |
63 patch: patches/cjdict.patch | 63 patch: patches/cjdict.patch |
64 upstream bug: http://bugs.icu-project.org/trac/ticket/10888 | 64 upstream bug: http://bugs.icu-project.org/trac/ticket/10888 |
65 | 65 |
66 | |
67 - android/brkitr.patch (to be applied for Android build only) : | |
68 Do not use the C+J dictionary for Chinese/Japanese segmentation | |
69 to reduce the data size. Adjust word.txt and a few other files. | |
70 | |
71 - source/data/brkitr/word_ja.txt (used only on Android) | |
72 Added for Japanese-specific word-breaking without the C+J dictionary. | |
73 | |
74 4. Converter changes : | 66 4. Converter changes : |
75 | 67 |
76 - convrtrs.txt : Replaced the original by our own that only lists encodings | 68 - convrtrs.txt : Replaced the original by our own that only lists encodings |
77 and aliases required by the WHATWG Encoding spec plus a few extra (see | 69 and aliases required by the WHATWG Encoding spec plus a few extra (see |
78 the file as to why). | 70 the file as to why). |
79 | 71 |
80 - Add source/data/mappings/ucmlocal.txt : to list only converters we need. | 72 - Add source/data/mappings/ucmlocal.txt : to list only converters we need. |
81 | 73 |
82 - Add new tables per the WHATWG encoding standards for EUC-JP, | 74 - Add new tables per the WHATWG encoding standards for EUC-JP, |
83 Shift_JIS, Big5 (Big5+Big5HKSCS), EUC-KR and all the single byte encodings. | 75 Shift_JIS, Big5 (Big5+Big5HKSCS), EUC-KR and all the single byte encodings. |
(...skipping 37 matching lines...) |
121 ExemplarCharacters, LocaleScript, layout, and the name of the | 113 ExemplarCharacters, LocaleScript, layout, and the name of the |
122 language for a locale in its native language. | 114 language for a locale in its native language. |
123 c. Remove the legacy Chinese character set-based collation | 115 c. Remove the legacy Chinese character set-based collation |
124 (big5han/gb2312han) that don't make any sense and nobody uses. | 116 (big5han/gb2312han) that don't make any sense and nobody uses. |
125 | 117 |
126 - Add tg.txt, ckb.txt, and ku.txt to source/data/{locale,lang} | 118 - Add tg.txt, ckb.txt, and ku.txt to source/data/{locale,lang} |
127 with the minimal locale data necessary for spellchecker and | 119 with the minimal locale data necessary for spellchecker and |
128 language menus. Also change the English display name | 120 language menus. Also change the English display name |
129 for ckb to 'Kurdish (Arabic)'. | 121 for ckb to 'Kurdish (Arabic)'. |
130 | 122 |
131 - android/patch_locale.sh (to be run for Android build only): | |
132 a. Make changes to source/data/{region,lang} to exclude these data | |
133 except the language and script names of zh_Hans and zh_Hant. | |
134 b. Remove exemplar cities in timezone data (data/zone). | |
135 c. Keep only the minimal calendar data in data/locales. | |
136 d. Include currency display names for a smaller subset of currencies. | |
137 e. Minimize the locale data for 9 locales to which Chrome on Android | |
138 is not localized. | |
139 | |
140 6. Timezone data update | 123 6. Timezone data update |
141 - Grab the latest version of the following timezone data files and | 124 - Grab the latest version of the following timezone data files and |
142 put them in source/data/misc. | 125 put them in source/data/misc. |
143 | 126 |
144 metaZones.txt | 127 metaZones.txt |
145 timezoneTypes.txt | 128 timezoneTypes.txt |
146 windowsZones.txt | 129 windowsZones.txt |
147 zoneinfo64.txt | 130 zoneinfo64.txt |
148 | 131 |
149 As of January 2015, the latest version is 2015a and the above files | 132 As of January 2015, the latest version is 2015a and the above files |
(...skipping 26 matching lines...) |
176 Remove unnecessary resources : unames, collator rule source | 159 Remove unnecessary resources : unames, collator rule source |
177 - patches/pkg_gen.patch : | 160 - patches/pkg_gen.patch : |
178 upstream bug (fixed in the upstream RC 55) | 161 upstream bug (fixed in the upstream RC 55) |
179 http://bugs.icu-project.org/trac/ticket/10572 | 162 http://bugs.icu-project.org/trac/ticket/10572 |
180 - patches/data.build.win.patch : | 163 - patches/data.build.win.patch : |
181 Windows-only data build patch. | 164 Windows-only data build patch. |
182 - patches/data_symb.patch : | 165 - patches/data_symb.patch : |
183 Put ICU_DATA_ENTRY_POINT(icudtXX_dat) in common when we use | 166 Put ICU_DATA_ENTRY_POINT(icudtXX_dat) in common when we use |
184 the icu data file or icudt.dll | 167 the icu data file or icudt.dll |
185 | 168 |
186 9. Pre-built data files are checked in with the following steps on Linux: | 169 #9. Pre-built data files are checked in with the following steps on Linux: |
187 | 170 # |
188 a. Make an ICU data build directory outside the Chromium source tree | 171 # a. Make an ICU data build directory outside the Chromium source tree |
189 and cd to that directory. | 172 # and cd to that directory. |
190 b. Run | 173 # b. Run |
191 | 174 # |
192 ${CHROME_ICU_TREE_TOP}/source/runConfigureICU Linux --disable-layout | 175 # ${CHROME_ICU_TREE_TOP}/source/runConfigureICU Linux --disable-layout |
193 | 176 # |
194 c. Run 'make' | 177 # c. Run 'make' |
195 d. 'make' will fail when pkgdata looks for css3transform.res. Edit | 178 # d. 'make' will fail when pkgdata looks for css3transform.res. Edit |
196 data/out/tmp/icudata.lst to replace 'css3transform.res' with 'root.res'. | 179 # data/out/tmp/icudata.lst to replace 'css3transform.res' with 'root.res'. |
197 (see http://bugs.icu-project.org/trac/ticket/10570 ) and run 'make' again. | 180 # (see http://bugs.icu-project.org/trac/ticket/10570 ) and run 'make' again. |
198 | 181 # |
199 | 182 # |
200 - source/data/in/icudtl.dat : Built on Linux with all the patches | 183 # - source/data/in/icudtl.dat : Built on Linux with all the patches |
201 above applied. icudt54l.dat is generated in | 184 # above applied. icudt54l.dat is generated in |
202 {BUILD_DIR_ROOT}/data/out/tmp and copied to the above location with a | 185 # {BUILD_DIR_ROOT}/data/out/tmp and copied to the above location with a |
203 version number (54) dropped. | 186 # version number (54) dropped. |
204 | 187 # |
205 | 188 # |
206 - {mac,linux}/icudtl_dat.S : Built on Linux with all the | 189 # - {mac,linux}/icudtl_dat.S : Built on Linux with all the |
207 patches above (except android/brkitr.patch) applied and checked in. | 190 # patches above (except android/brkitr.patch) applied and checked in. |
208 This file will be generated in {BUILD_DIR_ROOT}/data/out/tmp as | 191 # This file will be generated in {BUILD_DIR_ROOT}/data/out/tmp as |
209 icudt54l_dat.S, but '54' is dropped while copying. | 192 # icudt54l_dat.S, but '54' is dropped while copying. |
210 | 193 # |
211 mac/icudtl_dat.S is identical to linux/icudtl_dat.S except for | 194 # mac/icudtl_dat.S is identical to linux/icudtl_dat.S except for |
212 the header portion. With "linux/icudtl_dat.S" in its place, | 195 # the header portion. With "linux/icudtl_dat.S" in its place, |
213 run scripts/make_mac_assembly.sh to generate it. | 196 # run scripts/make_mac_assembly.sh to generate it. |
214 | 197 # |
215 - android/icudtl_dat.S : Built on Linux with all the patches above and | 198 # - android/icudtl_dat.S : Built on Linux with all the patches above and |
216 android/brkitr.patch applied and android/patch_locale.sh executed. | 199 # android/brkitr.patch applied and android/patch_locale.sh executed. |
217 '54' is dropped from the name generated in the build tree. | 200 # '54' is dropped from the name generated in the build tree. |
218 | 201 # |
219 - android/icudtl.dat : Generated as icudt54l.dat in | 202 # - android/icudtl.dat : Generated as icudt54l.dat in |
220 {BUILD_DIR_ROOT}/data/out/tmp along with icudt54l_dat.S and | 203 # {BUILD_DIR_ROOT}/data/out/tmp along with icudt54l_dat.S and |
221 copied to the above location with '54' dropped in its name. | 204 # copied to the above location with '54' dropped in its name. |
222 | 205 # |
223 - windows/icudt.dll (by default, we set icu_use_icu_data_flag to 1 | 206 # - windows/icudt.dll (by default, we set icu_use_icu_data_flag to 1 |
224 and don't use this file.) | 207 # and don't use this file.) |
225 | 208 # |
226 a. check out a clean copy of icu54 from the upstream on Windows | 209 # a. check out a clean copy of icu54 from the upstream on Windows |
227 outside the Chrome tree. | 210 # outside the Chrome tree. |
228 | 211 # |
229 $ svn export --native-eol LF http://source.icu-project.org/repos/icu/icu/tags/release-54-1 ${SEPARATE_ICU_ROOT}/icu54 | 212 # $ svn export --native-eol LF http://source.icu-project.org/repos/icu/icu/tags/release-54-1 ${SEPARATE_ICU_ROOT}/icu54 |
230 | 213 # |
231 b. copy ${CHROME_ICU_ROOT}/source/data/in/icudtl.dat to | 214 # b. copy ${CHROME_ICU_ROOT}/source/data/in/icudtl.dat to |
232 ${SEPARATE_ICU_ROOT}/source/data/in/icudt54l.dat | 215 # ${SEPARATE_ICU_ROOT}/source/data/in/icudt54l.dat |
233 c. copy ${CHROME_ICU_ROOT}/source/data/makedata.mak to | 216 # c. copy ${CHROME_ICU_ROOT}/source/data/makedata.mak to |
234 ${SEPARATE_ICU_ROOT}/source/data/makedata.mak | 217 # ${SEPARATE_ICU_ROOT}/source/data/makedata.mak |
235 c. In Visual Studio, open source/allinone/allinone.sln solution | 218 # c. In Visual Studio, open source/allinone/allinone.sln solution |
236 in ${SEPARATE_ICU_ROOT} | 219 # in ${SEPARATE_ICU_ROOT} |
237 d. Build 'makedata' target | 220 # d. Build 'makedata' target |
238 e. icudt54.dll will be generated in ${SEPARATE_ICU_ROOT}/bin | 221 # e. icudt54.dll will be generated in ${SEPARATE_ICU_ROOT}/bin |
239 f. Copy that icudt54.dll to ${CHROME_ICU_ROOT}/windows/icudt.dll | 222 # f. Copy that icudt54.dll to ${CHROME_ICU_ROOT}/windows/icudt.dll |
240 and check that in. | 223 # and check that in. |
241 | 224 |
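The renaming used in step 9 (dropping the version number when copying a generated data file into the tree) can be sketched as follows. This is only an illustration: `drop_icu_version` is a hypothetical helper, not something in the tree, and it assumes the version always appears as digits directly after the `icudt` prefix.

```python
import re


def drop_icu_version(filename: str) -> str:
    """Return the checked-in name for a generated ICU data file.

    Hypothetical helper: strips the digits that follow the 'icudt' prefix,
    e.g. the '54' in 'icudt54l.dat'. Other names are returned unchanged.
    """
    return re.sub(r"^icudt\d+", "icudt", filename)


if __name__ == "__main__":
    for generated in ("icudt54l.dat", "icudt54l_dat.S", "icudt54.dll"):
        print(generated, "->", drop_icu_version(generated))
```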
242 10. Apply the following patches for regex | 225 10. Apply the following patches for regex |
243 - patches/regex.patch (a combined patch of 3 revisions below) | 226 - patches/regex.patch (a combined patch of 3 revisions below) |
244 - upstream bugs (fixed in the upstream 55 RC) | 227 - upstream bugs (fixed in the upstream 55 RC) |
245 http://bugs.icu-project.org/trac/ticket/11370 (r36723:36724) | 228 http://bugs.icu-project.org/trac/ticket/11370 (r36723:36724) |
246 http://bugs.icu-project.org/trac/ticket/11369 (r36726:36727) | 229 http://bugs.icu-project.org/trac/ticket/11369 (r36726:36727) |
247 http://bugs.icu-project.org/trac/ticket/11371 (r36800:36801) | 230 http://bugs.icu-project.org/trac/ticket/11371 (r36800:36801) |
248 | 231 |
249 11. Fix bugs in locid (getBaseName / thread safety). | 232 11. Fix bugs in locid (getBaseName / thread safety). |
250 - patches/locid.patch | 233 - patches/locid.patch |
251 - upstream bugs (fixed in the upstream 55 RC) | 234 - upstream bugs (fixed in the upstream 55 RC) |
252 http://bugs.icu-project.org/trac/ticket/11421 | 235 http://bugs.icu-project.org/trac/ticket/11421 |
253 http://bugs.icu-project.org/trac/ticket/11547 | 236 http://bugs.icu-project.org/trac/ticket/11547 |
254 | 237 |
255 12. Fix bugs in BiDi | 238 12. Fix bugs in BiDi |
256 - patches/bidi.patch | 239 - patches/bidi.patch |
257 - upstream bugs (fixed in the upstream 55 RC) | 240 - upstream bugs (fixed in the upstream 55 RC) |
258 http://bugs.icu-project.org/trac/ticket/11177 | 241 http://bugs.icu-project.org/trac/ticket/11177 |
259 http://bugs.icu-project.org/trac/ticket/11451 | 242 http://bugs.icu-project.org/trac/ticket/11451 |
260 | 243 |
261 13. Fix a data race in cmemory | 244 13. Fix a data race in cmemory |
262 - patches/cmemory.patch | 245 - patches/cmemory.patch |
263 - upstream bug (fixed in the upstream 55 RC) | 246 - upstream bug (fixed in the upstream 55 RC) |
264 http://www.icu-project.org/trac/ticket/11538 | 247 http://www.icu-project.org/trac/ticket/11538 |
| 248 |
| 249 |
| 250 General information for maintenance |
| 251 ----------------------------------- |
| 252 |
| 253 * Build system in gyp/gn + ninja to generate the data packages: |
| 254 We build the necessary tools (genrb, makeconv, icupkg, ...) and then |
| 255 run those tools on all data files we want to generate. |
| 256 |
| 257 This process is slightly complicated because we want to modify (shrink)
| 258 some data files before we build them. This is implemented with a filter
| 259 program that takes the files we want to modify and puts them, in modified
| 260 form, in a special directory.
| 261 |
| 262 The whole chain is (dir names can vary): |
| 263 source/data/* <- raw/pure data |
| 264 out/gen/tmp_icudt54l/* <- filtered/stripped data |
| 265 out/gen/icudt54l/* <- compiled data |
| 266 out/icudt54l.dat <- packaged data |
| 267 |
| 268 This multi-step chain means that files are named two or three times in
| 269 the build system with different paths (this might change if we can drop
| 270 gyp), so to add or remove a data chunk, it has to be added to or removed
| 271 from these lists in icu_data.gypi:
| 272 *_raw_source |
| 273 *_filtered_source |
| 274 *_generated |
| 275 where * is something like icu_curr_res or icu_lang_res. |
| 276 |
| 277 Some special files are also handled specifically in the filtering system |
| 278 and in the build system. |
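To make the chain above concrete, here is a minimal sketch of how one raw data file maps to its filtered, compiled and packaged forms, and which list names a data chunk touches. It assumes that sub-directory layout is preserved between stages and that a .txt source compiles to a .res file of the same base name. The helper names `chain_for` and `list_names` are hypothetical, and the directory names are the example ones from the text (they can vary).

```python
from pathlib import PurePosixPath

# Example directory names taken from the chain described above.
RAW_ROOT = PurePosixPath("source/data")
FILTERED_ROOT = PurePosixPath("out/gen/tmp_icudt54l")
COMPILED_ROOT = PurePosixPath("out/gen/icudt54l")
PACKAGE = PurePosixPath("out/icudt54l.dat")


def chain_for(raw_path: str) -> dict:
    """Map one raw data file to the path it has at each stage.

    Assumption: the relative path is kept at every stage and only the
    suffix changes (.txt -> .res) when the file is compiled.
    """
    rel = PurePosixPath(raw_path).relative_to(RAW_ROOT)
    return {
        "raw": str(RAW_ROOT / rel),
        "filtered": str(FILTERED_ROOT / rel),
        "compiled": str((COMPILED_ROOT / rel).with_suffix(".res")),
        "packaged": str(PACKAGE),
    }


def list_names(prefix: str) -> list:
    """Names of the icu_data.gypi lists that mention a chunk, per the text."""
    return [prefix + suffix
            for suffix in ("_raw_source", "_filtered_source", "_generated")]


if __name__ == "__main__":
    print(chain_for("source/data/curr/en.txt"))
    print(list_names("icu_curr_res"))
```

Adding or removing a file such as source/data/curr/en.txt would then mean touching the corresponding entry in each of the three lists returned by `list_names("icu_curr_res")`.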
| 279 |
| 280 * Build system for tools: makeconv, pkgdata, icupkg, genrb, gendict, |
| 281 genbrk, ... |
| 282 |
| 283 These are pretty straightforward. ICU is built as a static lib and used by
| 284 the tools that often are just a single file implementing main() and some |
| 285 utility code. Whatever works for building the rest of ICU typically works |
| 286 for building the data tools. |
| 287 |
| 288 * Filtering tool (filter_data_for_size.py): |
| 289 Based on: |
| 290 - android/patch_locale.sh (to be run for Android build only): |
| 291 a. Make changes to source/data/{region,lang} to exclude these data |
| 292 except the language and script names of zh_Hans and zh_Hant. |
| 293 b. Remove exemplar cities in timezone data (data/zone). |
| 294 c. Keep only the minimal calendar data in data/locales. |
| 295 d. Include currency display names for a smaller subset of currencies. |
| 296 e. Minimize the locale data for 9 locales to which Chrome on Android |
| 297 is not localized. |
| 298 |
| 299 - android/brkitr.patch (to be applied for Android build only) : |
| 300 Do not use the C+J dictionary for Chinese/Japanese segmentation |
| 301 to reduce the data size. Adjust word.txt and a few other files. |
| 302 |
| 303 - source/data/brkitr/word_ja.txt (used only on Android) |
| 304 Added for Japanese-specific word-breaking without the C+J dictionary. |
| 305 |
| 306 * The build system is not warning clean. In particular: |
| 307 - ICU tools warn about empty input files. - Check --quiet flags? |
| 308 - The gyp Makefile generator generates duplicate rules (affects |
| 309 v8-standalone) - Reported as gyp issue 484. |
| 310 |
| 311 * Updating to a newer ICU: Diff all Makefile.* and *.mk files before |
| 312 and after and mimic necessary changes (if any) in BUILD.gn, icu_data.gyp |
| 313 and icu_data.gypi. |
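The update step above (diffing all Makefile.* and *.mk files before and after) could be approximated with a small script like the one below. It is a sketch under assumptions: `old_root` and `new_root` are hypothetical paths to the two ICU source trees, and the script only reports which build files were added, removed or modified. Deciding what to mirror in BUILD.gn and the gyp files remains manual.

```python
import filecmp
import fnmatch
import os

# File name patterns named in the maintenance note above.
PATTERNS = ("Makefile.*", "*.mk")


def build_files(root):
    """Return the relative paths of all build files under root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if any(fnmatch.fnmatch(name, pattern) for pattern in PATTERNS):
                full = os.path.join(dirpath, name)
                found.add(os.path.relpath(full, root))
    return found


def changed_build_files(old_root, new_root):
    """Compare build files between two ICU trees by content."""
    old, new = build_files(old_root), build_files(new_root)
    modified = [
        rel for rel in sorted(old & new)
        if not filecmp.cmp(os.path.join(old_root, rel),
                           os.path.join(new_root, rel), shallow=False)
    ]
    return {
        "added": sorted(new - old),
        "removed": sorted(old - new),
        "modified": modified,
    }
```

Calling `changed_build_files("icu54/source", "icu55/source")` (both paths are placeholders) returns a dictionary of three sorted lists that can be reviewed file by file.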