Chromium Code Reviews

Side by Side Diff: icu52/README.chromium

Issue 224943002: icu local change part1 (Closed) Base URL: svn://svn.chromium.org/chrome/trunk/deps/third_party/
Patch Set: function indentation changed Created 6 years, 8 months ago
OLD NEW
1 Name: icu 1 Name: icu
2 URL: http://site.icu-project.org/ 2 URL: http://site.icu-project.org/
3 Version: 52.1 3 Version: 52.1
4 License: MIT 4 License: MIT
5 Security Critical: yes 5 Security Critical: yes
6 6
7 Description: 7 Description:
8 This directory contains the source code of ICU 52.1 for C/C++ 8 This directory contains the source code of ICU 52.1 for C/C++
9 9
10 1. It was obtained with the following: 10 1. It was obtained with the following:
11 11
12 $ svn export --native-eol LF http://source.icu-project.org/repos/icu/icu/tags/release-52-1 icu52 12 $ svn export --native-eol LF http://source.icu-project.org/repos/icu/icu/tags/release-52-1 icu52
13 13
14 The following directories we don't use are removed: 14 The following directories we don't use are removed:
15 15
16 - as_is 16 - as_is
17 - packaging 17 - packaging
18 - source/layout 18 - source/layout
19 - source/layoutex 19 - source/layoutex
20 20
21 2. Platform header files for Linux, FreeBSD, OpenBSD, Android, Mac OS X, and QNX : 21 patches/configure.patch is applied to get runConfigureICU work in the
22 icudata generation step without layout and layoutex directory by removing the
23 corresponding Makefile's from ac_config variable.
22 24
23 - Apply platform.patch in patches directory. : It applies the upstream 25 2. Apply the following patch for platform related headers (putilimpl.h and
24 patch to platform.h.in (see http://bugs.icu-project.org/trac/ticket/8248) 26 others).
25 and change source/common/unicode/ptypes.h to refer to plinux.h and
26 pmac.h generated below.
27 27
28 - 'runConfigureICU Linux', 'runConfigureICU FreeBSD', and 28 - patches/putil.patch for Android and QNX
29 'runConfigureICU MacOSX' are run to generate 29 Upstream bug for Android : http://bugs.icu-project.org/trac/ticket/10478
30 source/common/unicode/platform.h. 30 Upstream bug for QNX : http://bugs.icu-project.org/trac/ticket/10811
31
32 - On OpenBSD, source/common/unicode/platform.h is being generated
33 by the icu4c port in the ports directory and not by runConfigureICU.
34 In case the file has to be updated you can do:
35 cd /home/ports/textproc/icu4c && make configure
36
37 - Rename it to 'plinux.h', 'pfreebsd.h', 'popenbsd.h' and 'pmac.h'
38
39 - Apply patches/pmach.h.patch on Mac to pmac.h
40
41 - On Android, the pandroid.h was generated by copying plinux.h to
42 pandroid.h and applying the patches/pandroid.h.patch.
43
44 - For QNX, the pqnx.h was generated by copying plinux.h to
45 pqnx.h and applying the patches/platform.qnx.patch.
46
47 - For NaCl (icu_nacl.gypi), the pnacl.h was generated by copying plinux.h to
48 pnacl.h and applying the patches/pnacl.h.patch.
49
50 - Apply the CL at https://codereview.chromium.org/15973007/ to plinux.h
51
52 3. The following directories were removed because they're not used by Chromium
53 at the moment:
54 as_is
55 packaging
56 source/extra
57 source/sample
58 source/layout
59 source/layoutex
60 31
61 32
62 4. The word breaking for Chinese and Japanese were modified to use a word 33 3. Breakiterator patches
63 frequency list with the following patch and cjdict.txt.
64 34
65 - patches/segmentation.patch : 35 - Apply patches/brkitr.patch
66 Adds a dictionary (word-frequency)-based word breaking for CJK 36 * word.txt
67 (Korean is supported in the code, but it does not do anything 37 a. Move full stops (U+002E, U+FF0E) from MidNumLet to MidNum so that
68 because we don't have a Korean word-list.) 38 FQDN labels can be split at '.'
39 b. Move fullwidth digits (U+FF10 - U+FF19) from Ideographic to Numeric.
40 See http://unicode.org/cldr/trac/ticket/6555
41 * line.txt
42 a. Use Japanese rules for all locales because Japanese tailoring only
43 affects Japanese specific characters.
44 See http://unicode.org/cldr/trac/ticket/3974
45 b. Minor changes in CL, OP and IS definitions to handle 'comma-variants'
46 more consistenly.
47 See http://unicode.org/cldr/trac/ticket/6557
48 c. Fix line breaking for Chinese characters and quotation marks
49 See http://unicode.org/cldr/trac/ticket/4200 and
50 http://crbug.com/39779
51
69 52
70 - source/data/brkitr/cjdict.txt : 53 - Add a new file brklocal.mk (copied from brkfiles.mk) with line_ja.txt
71 Chinese and Japanese word frequency list. 54 and word_POSIX.txt dropped from the build list.
72 See the file for license/copyright notice
73 55
74 - source/data/brkitr/cc_edict.txt : 56 - Apply patches/khmer-dictbe.patch and put in a smaller Khmer dictionary
75 the list of words derived from CC-Edict.) 57 (source/data/brkitr/khmerdict.txt) obtained from
76 58 http://bugs.icu-project.org/trac/ticket/9451
77 - patches/brkitr.patch
78 * word.txt : Chinese/Japanese segmentation rules, Hebrew-script-specific
79 handling of U+0022, and splitting of FQDN into labels at '.'.
80 For Hebrew, see http://unicode.org/cldr/track/ticket/3120
81 * line.txt : Incorporated line_he and minor changes in CL, OP and ID
82 definitions.
83 For Hebrew, see http://unicode.org/cldr/track/ticket/4004
84 For others, see http://unicode.org/cldr/track/ticket/3974
85 http://unicode.org/cldr/track/ticket/4200
86 http://unicode.org/cldr/track/ticket/
87 * brklocal.mk : build file changes to drop unnecessary brkitr rule
88 files (e.g. word_ja.txt, line_he.txt)
89 59
90 - android/brkitr.patch (to be applied for Android build only) : 60 - android/brkitr.patch (to be applied for Android build only) :
91 Reverts some changes about Chinese/Japanese segmentation rules in 61 Reverts some changes about Chinese/Japanese segmentation rules in
92 patches/brkitr.patch to reduce binary size for Android. 62 patches/brkitr.patch to reduce binary size for Android.
93 63
94 If you want to run ICU tests, you have to copy source/data/brkitr/cjdict.txt 64 4. Converter changes :
95 to source/test/testdata/cjdict-truncated.txt to pass TestTrieWithValue test.
96 65
97 5. Converter changes : converters.patch 66 - converters.patch :
98 - Include what we really need. See source/data/mappings/ucmlocal.txt 67 a. revises existing mapping tables
99 - Alias and mapping changes : source/data/mappings/convrtrs.txt 68 b. Remove a lot of unused aliases in the converter alias table
100 - Changes several tables and add six new tables, three of which 69 (source/data/mappings/convrtrs.txt ) leading to 40kB size reduction.
101 are 'fake' tables for ISO-2022-CN(-Ext).
102 - ucnv2022.c is modified to use 3 'fake' tables added above for
103 ISO-2022-CN(-Ext).
104 70
105 6. Locale changes 71 - Add source/data/mappings/ucmlocal.txt : to list only converters we need.
72 - Add two new tables per WHATWG encoding standards for EUC-JP and CP866.
73 They're generated with scripts/{eucjp, ibm866}_gen.sh.
74 - Add three 'fake' tables for ISO-2022-CN(-Ext) : noop-*.ucm.
75
76 - uconv.patch
77 a. ucnv2022 uses 3 fake tables for ISO-2022-CN(-Ext) instead of two
78 huge tables.
79 b. ISO-2022-JP-[1-4] is dropped.
80 c. SCSU, BOCU, ISCII, UTF-7 conversion is diabled leading to
81 the 47kB reduction in the code size.
82
83 5. Locale changes
106 - patches/locale1.patch : 84 - patches/locale1.patch :
107 Filipino, Amharic, and Swahili locales 85 Filipino, Amharic, and Swahili locales
108 exemplar character set changes for CJK + 9 Indian locales 86 exemplar character set changes for CJK + 9 Indian locales
109 Minor fixes for Danish, , Turkish, and Korean. 87 Minor fixes for Danish, , Turkish, and Korean.
110 88
111 - patches/locale2.patch : 89 - patches/locale2.patch :
112 The minimum locale data Chrome needs for 47 languages Chrome is 90 The minimum locale data Chrome needs for 47 languages Chrome is
113 not localized to. Each locale data file has ExemplarCharacters, 91 not localized to. Each locale data file has ExemplarCharacters,
114 LocaleScript, layout, and the name of the language for a locale 92 LocaleScript, layout, and the name of the language for a locale
115 in its native language. 93 in its native language.
116 94
117 - patches/locale3.patch : Locale build configuration files. They 95 - patches/locale3.patch : Locale build configuration files. They
118 add reslocal.mk or {trns,sprep,rbnf,coll}local.mk files to 96 add reslocal.mk or {trns,sprep,rbnf,coll}local.mk files to
119 source/data/{coll,curr,lang.locale,curr,region,translit,zone,rbnf,sprep}. 97 source/data/{coll,curr,lang.locale,curr,region,translit,zone,rbnf,sprep}.
120 98
121 - In source/data/region, run the following command to get rid of numeric region 99 - In source/data/region, run the following command to get rid of numeric region
122 display names we don't use (everything other than 419). 100 display names we don't use (everything other than 419).
123 $ sed -i '/[0-35-9][0-9][0-9]{/ d' *.txt 101 $ sed -i '/[0-35-9][0-9][0-9]{/ d' *.txt
124 102
125 - android/patch_locale.sh (to be run for Android build only): 103 - android/patch_locale.sh (to be run for Android build only):
126 Makes changes to source/data/{curr,region,lang} to exclude these data 104 Makes changes to source/data/{curr,region,lang} to exclude these data
127 except the language and script names of zh_Hans and zh_Hant. 105 except the language and script names of zh_Hans and zh_Hant.
128 106
129 - Add tg.txt to source/data/locale source/data/lang to add the minimal locale 107 - Add tg.txt to source/data/locale source/data/lang to add the minimal locale
130 data necessary for the spellchecker. In both directories, add tg.txt to 108 data necessary for the spellchecker. In both directories, add tg.txt to
131 reslocal.mk 109 reslocal.mk
132 110
133 7. Removal of unihan collation tables from data/coll/{zh,ja,ko}.txt 111 6. Removal of unihan collation tables from data/coll/{zh,ja,ko}.txt
134 112
135 - patches/unihan.patch: 113 - run scripts/remove_unihan.sh
136 unihan collation tables are never used in Chrome/Webkit, but it takes 114 unihan collation tables are never used in Chrome/Blink, but it takes
137 about 1MB in the uncompressed ICU data file in ICU 4.2.1. 115 about 1MB in the uncompressed ICU data file in ICU.
138 116
139 8. Timezone data update 117 7. Timezone data update
140 - Grab the latest version of the following timezone data files and 118 - Grab the latest version of the following timezone data files and
141 put them in source/data/misc. 119 put them in source/data/misc.
142 120
143 metaZones.txt 121 metaZones.txt
144 timezoneTypes.txt 122 timezoneTypes.txt
145 windowsZones.txt 123 windowsZones.txt
146 zoneinfo64.txt 124 zoneinfo64.txt
147 125
148 As of Mar 2014, the latest version is 2014a and the above files 126 As of April 2014, the latest version is 2014b and the above files
149 are available at 127 are available at
150 http://source.icu-project.org/repos/icu/data/trunk/tzdata/icunew/2014a/44/ 128 http://source.icu-project.org/repos/icu/data/trunk/tzdata/icunew/2014b/44/
151 129
152 9. Transliterator customization 130 8. Transliterator customization
153 131
154 - Add el_Upper.txt taken from ICU 52 to source/data/trnslit 132 - Also add css3transform.txt to source/data/trnslit.
155
156 - Also add css3transform.txt to the same directory
157 - Put the following line in trnslocal.mk 133 - Put the following line in trnslocal.mk
158 134
159 TRANSLIT_SOURCE=css3transform.txt 135 TRANSLIT_SOURCE=css3transform.txt
160 136
161 10. Build-related changes 137 9. Build-related changes
162 138
163 - patches/wpo.patch 139 - patches/wpo.patch
164 - patches/vscomp.patch 140 Upstream bugs : http://bugs.icu-project.org/trac/ticket/8043
165 (see http://bugs.icu-project.org/trac/ticket/8355 and 141 http://bugs.icu-project.org/trac/ticket/5701
166 http://bugs.icu-project.org/trac/ticket/8356 ) 142 - patches/vscomp.patch for building with Visual Studio on Windows.
167 - patches/rtti.patch : Make RTTI work without exception handling on Windows 143 a. do not use WINDOWS_LOCALE_API in locmap.c
168 (see http://bugs.icu-project.org/trac/ticket/8343) 144 b. do not redefine stringpiece::npos
169 - patches/data.build.patch : 145 - patches/data.build.patch :
170 To remove some data files we don't use and cut down the data size. 146 Remove unnecessary resources : invuca, unames, collator source, stringprep
171 - patches/data.build.win.patch : 147 - patches/data.build.win.patch :
172 Windows-only data build patch. Add a new target DATALIB to makedata.mak 148 Windows-only data build patch. Add a new target DATALIB to makedata.mak
173 - patches/clang.patch: To build with Clang.
174 (see http://bugs.icu-project.org/trac/ticket/8954 Two other chunks in
175 the patch have already been fixed in the ICU trunk.)
176 - add an empty file (stubdatabuilt.txt) to source/stubdata 149 - add an empty file (stubdatabuilt.txt) to source/stubdata
177 150
178 11. Pre-built data libraries are checked in. 151 10. Pre-built data files are checked in with the following steps on Linux:
179 152
180 Before building data file on Linux, re-run 'runConfigureICU Linux' again 153 a. Make a icu data build directory outside the Chromium source tree.
181 if it's run without data.build.patch in #10 above. 154 b. Run 'runConfigureICU Linux' outside the source tree.
155 c. Run 'make'
156 d. 'make' will fail in the 1st pass. Copy source/data/in/coll/invuca.icu
157 to {BUILD_DIR_ROOT}/data/out/build/icudt52l/coll and re-run 'make'
158 in {BUILD_DIR_ROOT}/data.
182 159
183 Because we removed layout and layoutex directories in step 3, 160 e. 'make' will fail again when pkgdata looks for css3transform.res. Edit
184 'runConfigureICU Linux' will fail even with '--disable-layout'. A
185 work-around is to have a copy of our icu tree in a separate build directory
186 and add back directories we removed in step 3 before
187 running 'runConfigure'.
188
189 'make' will fail in the 1st pass. Copy source/data/in/coll/invuca.icu
190 to {BUILD_DIR_ROOT}/data/out/build/icudt46l/coll and re-run 'make'
191 in {BUILD_DIR_ROOT}/data.
192
193 'make' will fail again when pkgdata looks for css3transform.res. Edit
194 data/out/tmp/icudata.lst to replace 'css3transform.res' with 'root.res'. 161 data/out/tmp/icudata.lst to replace 'css3transform.res' with 'root.res'.
195 (see http://bugs.icu-project.org/trac/ticket/10570 ) and run 'make' again. 162 (see http://bugs.icu-project.org/trac/ticket/10570 ) and run 'make' again.
196 163
197 164
198 - source/data/in/icudtl.dat : Built on Linux with all the patches 165 - source/data/in/icudtl.dat : Built on Linux with all the patches
199 above applied. icudt46l.dat is generated in 166 above applied. icudt52l.dat is generated in
200 {BUILD_DIR_ROOT}/data/out/tmp and copied to the above location with a 167 {BUILD_DIR_ROOT}/data/out/tmp and copied to the above location with a
201 version number (46) dropped. 168 version number (52) dropped.
202 169
203 - windows/icudt.dll : With icudt46l.dat in place, all the patches applied 170 - windows/icudt.dll : With icudt52l.dat in place, all the patches applied
204 and header files moved (#11 below), generated by building icudt_build 171 and header files moved (#11 below), generated by building icudt_build
205 project of build/icudt_build.sln on Windows. icudt46.dll is 172 project of build/icudt_build.sln on Windows. icudt52.dll is
206 generated in bin/{Release,Debug} and copied to windows/icudt.dll 173 generated in bin/{Release,Debug} and copied to windows/icudt.dll
207 and checked in. Note that we drop the version number ('46') from the 174 and checked in. Note that we drop the version number ('52') from the
208 dll name to avoind having to update our build scripts/configuration 175 dll name to avoind having to update our build scripts/configuration
209 files everytime ICU is upgraded to a new version. 176 files everytime ICU is upgraded to a new version.
210 177
211 - {mac,linux}/icudt46l_dat.S : Built on Linux with all the 178 - {mac,linux}/icudt52l_dat.S : Built on Linux with all the
212 patches above (except android/brkitr.patch) applied and checked in. 179 patches above (except android/brkitr.patch) applied and checked in.
213 This file will be generated in {BUILD_DIR_ROOT}/data/out/tmp. 180 This file will be generated in {BUILD_DIR_ROOT}/data/out/tmp.
214 181
215 mac/icudt46l_dat.S is identical to linux/icudt46l_dat.S. It's made 182 mac/icudt52l_dat.S is identical to linux/icudt52l_dat.S. It's made
216 by changing the header portion of the Linux version to read as following 183 by changing the header portion of the Linux version to read as following
217 (no leading whitespace) : 184 (no leading whitespace) :
218 185
219 .globl _icudt46_dat 186 .globl _icudt52_dat
220 #ifdef U_HIDE_DATA_SYMBOL 187 #ifdef U_HIDE_DATA_SYMBOL
221 .private_extern _icudt46_dat 188 .private_extern _icudt52_dat
222 #endif 189 #endif
223 .data 190 .data
224 .const 191 .const
225 .align 4 192 .align 4
226 _icudt46_dat: 193 _icudt52_dat:
227 194
228 195
229 - android/icudt46l_dat.S : Built on Linux with all the patches above and 196 - android/icudt52l_dat.S : Built on Linux with all the patches above and
230 android/brkitr.patch applied and android/patch_locale.sh executed, and 197 android/brkitr.patch applied and android/patch_locale.sh executed, and
231 checked in. 198 checked in.
232 - android/icudtl.dat : Generated as icudt46l.dat in 199 - android/icudtl.dat : Generated as icudt52l.dat in
233 {BUILD_DIR_ROOT}/data/out/tmp along with icudt46l_dat.S and 200 {BUILD_DIR_ROOT}/data/out/tmp along with icudt52l_dat.S and
234 copied to the above location with '46' dropped in its name. 201 copied to the above location with '52' dropped in its name.
235 202
236 203
237 12. Apply the fix found with static analysis tools such as PSV and coverity 204 11. Change export of U_ICUDATA_ENTRY_POINT from U_IMPORT to U_EXPORT.
238
239 - patches/static.analysis.patch
240 - upstream trunk/4.8 do not have this code any more.
241
242 13. Fix for msvs2010 applied:
243 --- D:/src/ent/src/third_party/icu/source/common/stringpiece.cpp
244 (revision 78292)
245 +++ D:/src/ent/src/third_party/icu/source/common/stringpiece.cpp
246 (working copy)
247 @@ -75,7 +75,7 @@
248 * Visual Studios 9.0.
249 * Cygwin with MSVC 9.0 also complains here about redefinition.
250 */
251 -#if (!defined(_MSC_VER) || (_MSC_VER > 1500)) && !defined(CYGWINMSVC)
252 +#if (!defined(_MSC_VER) || (_MSC_VER > 1600)) && !defined(CYGWINMSVC)
253 const int32_t StringPiece::npos;
254 #endif
255
256 14. Fix for locales that don't use '.' as decimal separator: patches/nan.patch
257 - upstream bug: http://bugs.icu-project.org/trac/ticket/8561
258 - Handle other chars besides the dot. This is required because decNumber's
259 parser expects the dot as a decimal separator.
260 - Locales that don't use dot were producing "NaN" values.
261
262 15. Fix a bug in the regex engine.
263 - patches/regex.patch
264 - upstream bug: http://bugs.icu-project.org/trac/ticket/8666 (fixed in the upstream)
265
266 16. Apply the upstream patch for Korean search collator support (ICU 4.6.1).
267 - patches/search_collation.patch
268 - upstream bug: http://bugs.icu-project.org/trac/ticket/8290
269
270 17. Fix a use of uninitialized memory bug in regular expression matching
271 - patches/rematch.patch
272 - upstream bug: http://bugs.icu-project.org/trac/ticket/8824
273
274 18. Make it compile with -Werror on gcc 4.6
275 - patches/gcc46.patch (ToT upstream does not have this code any more).
276
277 19. Fix four out of bounds memory access error in common/uloc.c
278 and common/uresbund.c
279 - patches/uloc.patch
280 - upstream bug:
281 1. http://bugs.icu-project.org/trac/ticket/8984 (_canonicalize)
282 2. http://bugs.icu-project.org/trac/ticket/9114 (_getKeywords)
283 3. http://bugs.icu-project.org/trac/ticket/8812 (uresbund)
284 http://bugs.icu-project.org/trac/ticket/8813 (uresbund)
285 4. http://bugs.icu-project.org/trac/ticket/10250 (_getKeywords)
286
287 20. Fix a null pointer error in ubrk_setText in ubrk.cpp.
288 - patches/ubrk.patch
289 - upstream bug : http://bugs.icu-project.org/trac/ticket/9115
290
291 21. Fix a clang warning in rbbi.cpp by merging in an upstream change.
292 - patches/changeset_30255.patch
293 - upstream change : http://bugs.icu-project.org/trac/changeset/30255
294
295 22. Fix time zone handling and compilation on iOS.
296 - patches/ios_timezone.patch
297 - upstream bugs : http://bugs.icu-project.org/trac/ticket/9051
298 http://bugs.icu-project.org/trac/ticket/8661
299
300 23. Fix a buffer overflow in utext
301 - patches/utext.patch
302 - upstream change : http://bugs.icu-project.org/trac/changeset/29356
303
304 24. Fix compilation errors on VS2012 and above.
305 - patches/vs2012.patch
306
307 25. Fix a buffer overflow in UTF-16/32 detection.
308 - patches/csetdet.patch
309 - upstream bug: http://bugs.icu-project.org/trac/ticket/10318
310
311 26. Add BreakIterator::getRuleStatus
312 - patches/breakiterator.patch
313 - Copy and paste BreakIterator::getRuleStatus API from ICU 52
314
315 27. Change export of U_ICUDATA_ENTRY_POINT from U_IMPORT to U_EXPORT.
316 - patches/declspec.patch 205 - patches/declspec.patch
317
318 28. Add support for QNX Neutrino.
319 - patches/platform.qnx.patch:
320 See #2 about the platform header generation.
321 - patches/si_value.undef.patch:
322 Work around an all-lowercase macro defined in <signal.h>.
323 Upstream took a different approach:
324 http://bugs.icu-project.org/trac/ticket/9935
325 - patches/xopen_source.patch:
326 Set _XOPEN_SOURCE to 600 as in the upstream changeset:
327 http://bugs.icu-project.org/trac/changeset/30418
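
Note on the sed command in the Locale changes section ($ sed -i '/[0-35-9][0-9][0-9]{/ d' *.txt): the pattern deletes every line containing a three-digit code followed by '{' unless the first digit is 4. The README describes the intent as keeping only 419, so the command relies on 419 being the only 4xx entry present. The sketch below mirrors that filter in Python so the behaviour can be checked on a few lines. The sample lines and the helper name drop_numeric_regions are made up for illustration and are not taken from the ICU data files.

```python
import re

# Same match as the sed address /[0-35-9][0-9][0-9]{/ :
# three digits followed by a literal '{', where the first digit is not 4.
NUMERIC_REGION = re.compile(r"[0-35-9][0-9][0-9]\{")


def drop_numeric_regions(lines):
    """Return only the lines that the sed 'd' command would leave in place."""
    return [line for line in lines if not NUMERIC_REGION.search(line)]


# Hypothetical sample entries in the style of a region resource file.
sample = [
    '        001{"World"}',
    '        150{"Europe"}',
    '        419{"Latin America"}',
    '        US{"United States"}',
]

kept = drop_numeric_regions(sample)
for line in kept:
    print(line)
```

Only the 419 entry and the alphabetic region entry survive in this sample. Any other code starting with 4 would also survive, which is worth keeping in mind if the region data ever gains one.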