|
|
Created:
3 years, 9 months ago by Dan Ehrenberg Modified:
3 years, 8 months ago CC:
v8-reviews_googlegroups.com, Michael Hablich, Yang, Benedikt Meurer, jgruber Target Ref:
refs/heads/master Project:
v8 Visibility:
Public. |
Description[date] Add ICU backend for timezone info behind a flag
This patch implements a timezone backend which is based on ICU, rather
than operating system calls. It can be turned on by passing the
--icu-timezone-data flag. The goal here is to take advantage of ICU's
data, which is more complete than the data that some system calls expose.
For example, without any special code, this patch fixes the time zone
of Lord Howe Island to have a correct 30 minute DST offset, rather than
60 minutes as the OS backends assume it to have.
Unfortunately, the parenthized timezone name in Date.prototype.toString()
differs across platforms. This patch chooses the long timezone name,
which matches Windows behavior and might be the most intelligible, but
the web compatibility impact is unclear.
BUG=v8:6031, v8:2137, v8:6076
Review-Url: https://codereview.chromium.org/2724373002
Cr-Commit-Position: refs/heads/master@{#44562}
Committed: https://chromium.googlesource.com/v8/v8/+/b213f2399038a615cdfbfa0201cddc113d304018
Patch Set 1 #Patch Set 2 : Make cast explicit #Patch Set 3 : Make casts even better #Patch Set 4 : More cast tweaks #Patch Set 5 : Fix a negation #Patch Set 6 : Fall back to OS time zone name #Patch Set 7 : Don't leak the icu::TimeZone* #
Total comments: 16
Patch Set 8 : Changes from review #Patch Set 9 : ulan's formatting suggestion #
Total comments: 5
Patch Set 10 : Make it a non-Harmony flag and remove OS fallback #Patch Set 11 : Convert to UTF8 rather than check-failing #
Total comments: 2
Patch Set 12 : Check-fail on overflow #Patch Set 13 : Rebase and add a couple tests #Patch Set 14 : add test file #Patch Set 15 : Use distinctive prefix #
Total comments: 13
Patch Set 16 : Fix up test infrastructure #Patch Set 17 : or {} #
Messages
Total messages: 101 (74 generated)
The CQ bit was checked by littledan@chromium.org
The CQ bit was unchecked by littledan@chromium.org
The CQ bit was checked by littledan@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: v8_linux64_gyp_rel_ng on master.tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux64_gyp_rel_ng/build...)
The CQ bit was checked by littledan@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: v8_linux64_gyp_rel_ng on master.tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux64_gyp_rel_ng/build...)
The CQ bit was checked by littledan@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: v8_linux64_gyp_rel_ng on master.tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux64_gyp_rel_ng/build...)
The CQ bit was checked by littledan@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was checked by littledan@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Description was changed from ========== [date] Add ICU backend for timezone info behind a flag BUG=v8:6031 ========== to ========== This patch implements a timezone backend which is based on ICU, rather than operating system calls. It can be turned on by passing the --icu-timezone-data flag. The goal here is to take advantage of ICU's data, which is more complete than the data that some system calls expose. For example, without any special code, this patch fixes the time zone of Lord Howe Island to have a correct 30 minute DST offset. CLDR only includes a limited number short timezone names which pertain to a particular locale falling back to "GMT+n" otherwise. This applies surprisingly often; for example, CET becomes GMT+1 in en_US. To avoid changing currently correct short acronyms, this patch falls back to the OS's timezone name if ICU returns something beginning with "GMT". All put together, here's a simple interactive test showing off some newly fixed data, together with the timezone names not regressing. The following was run on a machine with the local time zone set to Lord Howe Island: $ d8 -e --icu-timezone-data "print(new Date(2017, 7)); print(new Date(2017, 0));" Tue Aug 01 2017 00:00:00 GMT+1030 (LHST) Sun Jan 01 2017 00:00:00 GMT+1100 (LHDT) BUG=v8:6031,v8:2137 ==========
littledan@chromium.org changed reviewers: + jshin@chromium.org, ulan@chromium.org
Description was changed from ========== This patch implements a timezone backend which is based on ICU, rather than operating system calls. It can be turned on by passing the --icu-timezone-data flag. The goal here is to take advantage of ICU's data, which is more complete than the data that some system calls expose. For example, without any special code, this patch fixes the time zone of Lord Howe Island to have a correct 30 minute DST offset. CLDR only includes a limited number short timezone names which pertain to a particular locale falling back to "GMT+n" otherwise. This applies surprisingly often; for example, CET becomes GMT+1 in en_US. To avoid changing currently correct short acronyms, this patch falls back to the OS's timezone name if ICU returns something beginning with "GMT". All put together, here's a simple interactive test showing off some newly fixed data, together with the timezone names not regressing. The following was run on a machine with the local time zone set to Lord Howe Island: $ d8 -e --icu-timezone-data "print(new Date(2017, 7)); print(new Date(2017, 0));" Tue Aug 01 2017 00:00:00 GMT+1030 (LHST) Sun Jan 01 2017 00:00:00 GMT+1100 (LHDT) BUG=v8:6031,v8:2137 ========== to ========== This patch implements a timezone backend which is based on ICU, rather than operating system calls. It can be turned on by passing the --icu-timezone-data flag. The goal here is to take advantage of ICU's data, which is more complete than the data that some system calls expose. For example, without any special code, this patch fixes the time zone of Lord Howe Island to have a correct 30 minute DST offset, rather than 60 minutes as the OS backends assume it to have. CLDR only includes a limited number short timezone names which pertain to a particular locale falling back to "GMT+n" otherwise. This applies surprisingly often; for example, CET becomes GMT+1 in en_US. To avoid changing currently correct short acronyms, this patch falls back to the OS's timezone name if ICU returns something beginning with "GMT". All put together, here's a simple interactive test showing off some newly fixed data, together with the timezone names not regressing. The following was run on a machine with the local time zone set to Lord Howe Island: $ d8 -e --icu-timezone-data "print(new Date(2017, 7)); print(new Date(2017, 0));" Tue Aug 01 2017 00:00:00 GMT+1030 (LHST) Sun Jan 01 2017 00:00:00 GMT+1100 (LHDT) BUG=v8:6031,v8:2137 ==========
The CQ bit was checked by littledan@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was checked by littledan@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
https://codereview.chromium.org/2724373002/diff/120001/src/i18n.cc File src/i18n.cc (right): https://codereview.chromium.org/2724373002/diff/120001/src/i18n.cc#newcode902 src/i18n.cc:902: ICUTimezoneCache::ICUTimezoneCache() { Clear(); } I'm surprised this doesn't fail. Clear() deletes timezone_, which is uninitialized at that point. Pointer class members aren't auto-initialized by the compiler AFAIK, so it can contain random junk. https://codereview.chromium.org/2724373002/diff/120001/src/i18n.cc#newcode908 src/i18n.cc:908: char* name = is_dst ? dst_timezone_name_ : timezone_name_; The cache is never filled, not sure if that's on purpose. https://codereview.chromium.org/2724373002/diff/120001/src/i18n.cc#newcode938 src/i18n.cc:938: if (!U_SUCCESS(status)) return false; Nit: return U_SUCCESS(status) https://codereview.chromium.org/2724373002/diff/120001/src/i18n.cc#newcode970 src/i18n.cc:970: if (result[0] == 'G' && result[1] == 'M' && result[2] == 'T') { Nit: Another option would be to use strncmp.
https://codereview.chromium.org/2724373002/diff/120001/src/date.cc File src/date.cc (right): https://codereview.chromium.org/2724373002/diff/120001/src/date.cc#newcode32 src/date.cc:32: FLAG_icu_timezone_data ? new ICUOSTimezoneCache() : nit: let's keep the conditionals properly nested. If would be easier to read and check even though it is more code. #ifdef V8_I18N_SUPPORT FLAG_icu_timezone_data ? new ICUOSTimezoneCache() : base::OS::CreateTimezoneCache() #else base::OS::CreateTimezoneCache() #endif https://codereview.chromium.org/2724373002/diff/120001/src/flag-definitions.h File src/flag-definitions.h (right): https://codereview.chromium.org/2724373002/diff/120001/src/flag-definitions.h... src/flag-definitions.h:214: V(icu_timezone_data, "get information about timezones from ICU") Is it actually a harmony feature? https://codereview.chromium.org/2724373002/diff/120001/src/i18n.cc File src/i18n.cc (right): https://codereview.chromium.org/2724373002/diff/120001/src/i18n.cc#newcode969 src/i18n.cc:969: // starts with GMT, call out to the OS to get the 'real' name. This seems hacky as it breaks the "cache" semantics of the class. Can we fix it one the ICU side and keep this class in sync with ICU? https://codereview.chromium.org/2724373002/diff/120001/src/i18n.h File src/i18n.h (right): https://codereview.chromium.org/2724373002/diff/120001/src/i18n.h#newcode156 src/i18n.h:156: static const int32_t kMaxTimezoneChars = 20; Can we use std::string here?
https://codereview.chromium.org/2724373002/diff/120001/src/i18n.cc File src/i18n.cc (right): https://codereview.chromium.org/2724373002/diff/120001/src/i18n.cc#newcode902 src/i18n.cc:902: ICUTimezoneCache::ICUTimezoneCache() { Clear(); } On 2017/03/03 08:35:29, jgruber wrote: > I'm surprised this doesn't fail. Clear() deletes timezone_, which is > uninitialized at that point. Pointer class members aren't auto-initialized by > the compiler AFAIK, so it can contain random junk. Wow, not sure how I got this so wrong; I guess I started with initializing them more eagerly, and forgot to insert the default initialization, and then was lucky in tests.. Fixed. https://codereview.chromium.org/2724373002/diff/120001/src/i18n.cc#newcode908 src/i18n.cc:908: char* name = is_dst ? dst_timezone_name_ : timezone_name_; On 2017/03/03 08:35:29, jgruber wrote: > The cache is never filled, not sure if that's on purpose. What do you mean? Why is it never filled? I thought it normally would be. I locally put in some print statements and verified that the 'cached' case is hit. When used in conjunction with https://codereview.chromium.org/2726253002/ , the caching part is not so important. The really important part is the memory ownership--the protocol returns a const char* which is supposed to just stick around (as OSes provide), so the solution here is to put it in the object. Might as well keep it around once it's there, but I'm also fine with removing it. https://codereview.chromium.org/2724373002/diff/120001/src/i18n.cc#newcode938 src/i18n.cc:938: if (!U_SUCCESS(status)) return false; On 2017/03/03 08:35:29, jgruber wrote: > Nit: return U_SUCCESS(status) Done, but I would be interested in more feedback here--what's the correct behavior for an ICU failure like this--silently slog on with UTC values, crash, throw an "illegal access" error, increment a UseCounter, something else? https://codereview.chromium.org/2724373002/diff/120001/src/i18n.cc#newcode970 src/i18n.cc:970: if (result[0] == 'G' && result[1] == 'M' && result[2] == 'T') { On 2017/03/03 08:35:29, jgruber wrote: > Nit: Another option would be to use strncmp. Done.
The CQ bit was checked by littledan@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
jgruber@chromium.org changed reviewers: + jgruber@chromium.org
https://codereview.chromium.org/2724373002/diff/120001/src/i18n.cc File src/i18n.cc (right): https://codereview.chromium.org/2724373002/diff/120001/src/i18n.cc#newcode908 src/i18n.cc:908: char* name = is_dst ? dst_timezone_name_ : timezone_name_; On 2017/03/03 13:20:27, Dan Ehrenberg wrote: > On 2017/03/03 08:35:29, jgruber wrote: > > The cache is never filled, not sure if that's on purpose. > > What do you mean? Why is it never filled? I thought it normally would be. I > locally put in some print statements and verified that the 'cached' case is hit. Yup you're right. Sorry, my mistake.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Description was changed from ========== This patch implements a timezone backend which is based on ICU, rather than operating system calls. It can be turned on by passing the --icu-timezone-data flag. The goal here is to take advantage of ICU's data, which is more complete than the data that some system calls expose. For example, without any special code, this patch fixes the time zone of Lord Howe Island to have a correct 30 minute DST offset, rather than 60 minutes as the OS backends assume it to have. CLDR only includes a limited number short timezone names which pertain to a particular locale falling back to "GMT+n" otherwise. This applies surprisingly often; for example, CET becomes GMT+1 in en_US. To avoid changing currently correct short acronyms, this patch falls back to the OS's timezone name if ICU returns something beginning with "GMT". All put together, here's a simple interactive test showing off some newly fixed data, together with the timezone names not regressing. The following was run on a machine with the local time zone set to Lord Howe Island: $ d8 -e --icu-timezone-data "print(new Date(2017, 7)); print(new Date(2017, 0));" Tue Aug 01 2017 00:00:00 GMT+1030 (LHST) Sun Jan 01 2017 00:00:00 GMT+1100 (LHDT) BUG=v8:6031,v8:2137 ========== to ========== [date] Add ICU backend for timezone info behind a flag This patch implements a timezone backend which is based on ICU, rather than operating system calls. It can be turned on by passing the --icu-timezone-data flag. The goal here is to take advantage of ICU's data, which is more complete than the data that some system calls expose. For example, without any special code, this patch fixes the time zone of Lord Howe Island to have a correct 30 minute DST offset, rather than 60 minutes as the OS backends assume it to have. CLDR only includes a limited number short timezone names which pertain to a particular locale falling back to "GMT+n" otherwise. This applies surprisingly often; for example, CET becomes GMT+1 in en_US. To avoid changing currently correct short acronyms, this patch falls back to the OS's timezone name if ICU returns something beginning with "GMT". All put together, here's a simple interactive test showing off some newly fixed data, together with the timezone names not regressing. The following was run on a machine with the local time zone set to Lord Howe Island: $ d8 -e --icu-timezone-data "print(new Date(2017, 7)); print(new Date(2017, 0));" Tue Aug 01 2017 00:00:00 GMT+1030 (LHST) Sun Jan 01 2017 00:00:00 GMT+1100 (LHDT) BUG=v8:6031,v8:2137 ==========
The CQ bit was checked by littledan@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Seems like the big unknown is what the right way to get the short timezone name is. Since this is behind a flag, should I skip the ICUOSTimezoneCache and just go for a patch with ICUTimezoneCache for now, while attempting to solve the issue on the ICU side? I'm not sure whether it will be possible to solve on that side, though. https://codereview.chromium.org/2724373002/diff/120001/src/date.cc File src/date.cc (right): https://codereview.chromium.org/2724373002/diff/120001/src/date.cc#newcode32 src/date.cc:32: FLAG_icu_timezone_data ? new ICUOSTimezoneCache() : On 2017/03/03 13:20:27, ulan wrote: > nit: let's keep the conditionals properly nested. If would be easier to read and > check even though it is more code. > > #ifdef V8_I18N_SUPPORT > FLAG_icu_timezone_data ? new ICUOSTimezoneCache() : > base::OS::CreateTimezoneCache() > #else > base::OS::CreateTimezoneCache() > #endif Done. https://codereview.chromium.org/2724373002/diff/120001/src/flag-definitions.h File src/flag-definitions.h (right): https://codereview.chromium.org/2724373002/diff/120001/src/flag-definitions.h... src/flag-definitions.h:214: V(icu_timezone_data, "get information about timezones from ICU") On 2017/03/03 13:20:27, ulan wrote: > Is it actually a harmony feature? No, I just put it here so it could get testing in HARMONY_STAGED before actually shipping it. Do you think it should be a separate, non-harmony flag instead? https://codereview.chromium.org/2724373002/diff/120001/src/i18n.cc File src/i18n.cc (right): https://codereview.chromium.org/2724373002/diff/120001/src/i18n.cc#newcode969 src/i18n.cc:969: // starts with GMT, call out to the OS to get the 'real' name. On 2017/03/03 13:20:27, ulan wrote: > This seems hacky as it breaks the "cache" semantics of the class. Can we fix it > one the ICU side and keep this class in sync with ICU? > I agree that it's hacky, but I'm not sure we'll get a fix into ICU any time soon. I'm not sure the caching is so important. It turns out that the redundant calculation of whether we're in DST or not was expensive enough that the cache in https://codereview.chromium.org/2726253002/ was necessary in addition anyway to not leave Date with a performance regression when the flag is flipped on. I took a look at the underlying data files. the tz database (https://www.iana.org/time-zones) has all the relevant names. These files are translated into timezone information files in /usr/share/zoneinfo , which has a representation of all names from the original database. By contrast, ICU seems to be getting its information from CLDR. CLDR doesn't represent all of these different time zones, just the ones that are relevant for the locale (and doesn't have historical name information either). One reason that these are taken from CLDR rather than just the tz database is because the three-letter acronyms actually are locale-dependent. For example, CLDR represents "CET" as either CET, MET or SEČ depending on the locale (http://www.unicode.org/cldr/charts/latest/by_type/timezones.europe.html#47cbf...). Here, we only care about the "root locale" however, and aren't trying to localize the output. Speculating, another reason to limit the number of short acronyms could be that ICU implements date parsing, and the acronyms for timezone names are ambiguous (reused with multiple meanings). If a more limited set of timezone names are considered relevant for the locale, then parsing is more possible. Jungshik, do you have any more thoughts, or a deeper understanding of why these short timezone names aren't more available through CLDR? What I don't understand is why the tz database names are not taken into the root locale. Or, absent that, maybe ICU could have a mode fall back to these names in the logic in TimeZone::getDisplayName that currently falls back to "localized" GMT +- the current offset. https://codereview.chromium.org/2724373002/diff/120001/src/i18n.h File src/i18n.h (right): https://codereview.chromium.org/2724373002/diff/120001/src/i18n.h#newcode156 src/i18n.h:156: static const int32_t kMaxTimezoneChars = 20; On 2017/03/03 13:20:27, ulan wrote: > Can we use std::string here? Do you mean that we'd make LocalTimezone() return an std::string, or store the timezone locally as a string? The first seems like a reasonable idea; for the second, I don't know how we'd do the memory management.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
> for example, CET becomes GMT+1 in en_US. To avoid changing currently correct short acronyms, this patch falls back to the OS's timezone name if ICU returns something beginning with "GMT". CLDR does that on purpose because timezone abbreviations are extremely ambiguous outside regions where people are familiar with them. In European language locales, CET *is* used while it's not used en-US, ja, ko, hi, etc. In what sense do you think 'CET' is correct? I'd rather stick to the CLDR/ICU for them.
After writing the previous comment and an in-line comment below, I realized that there's one argument for what you did; to keep the current behavior. However, that behavior is different on different platforms (because it depends on the OS timezone name) and is not spec-required. https://codereview.chromium.org/2724373002/diff/160001/src/i18n.cc File src/i18n.cc (right): https://codereview.chromium.org/2724373002/diff/160001/src/i18n.cc#newcode968 src/i18n.cc:968: // starts with GMT, call out to the OS to get the 'real' name. IMHO, if-clause had better be dropped for the reason I gave in the previous comment. Also, note that checking for GMT does not work in all locales. For instance, fr uses UTC instead of GMT.
https://codereview.chromium.org/2724373002/diff/160001/src/i18n.cc File src/i18n.cc (right): https://codereview.chromium.org/2724373002/diff/160001/src/i18n.cc#newcode911 src/i18n.cc:911: GetTimeZone()->getDisplayName(is_dst, icu::TimeZone::SHORT, result); By using SHORT, you're matching v8 on Linux, but v8 on Windows would be better matched by 'LONG'. Given this platform disparity, I'm really tempted to use ISO 8601 for Date.toString(). https://codereview.chromium.org/2724373002/diff/160001/src/i18n.cc#newcode968 src/i18n.cc:968: // starts with GMT, call out to the OS to get the 'real' name. On 2017/03/08 06:44:41, jungshik at Google wrote: > IMHO, if-clause had better be dropped for the reason I gave in the previous > comment. > > Also, note that checking for GMT does not work in all locales. For instance, fr > uses UTC instead of GMT. "GMT+1030 (LHST)" : Who'd know what LHST stands for (among en-US locale users)? On Windows, v8 has a more descriptive long name inside parens while on Linux, it has abbreviations 'cryptic' to most users. Because the current bahavior does not have a platform parity anyway, I was planning to use a long descriptive name obtained from ICU everywhere inside parens. That is, instead of falling back to the OS names, you can get a descriptive timezone name from ICU. See http://icu-project.org/apiref/icu4c/classicu_1_1TimeZone.html#a7330eac5e0d8ab... . EDisplayType can be set to "LONG" to get 'Eastern Daylight Saving time' (or depending on locale, 'North American Eastern Daylight Saving Time' in that language). In most locales and frequently used timezones, this should not fallback to UTC offset formats. Wait.. for Date.toString(), do we want to show a localized format? If it's ok to break from the current behavior (which varies a lot from platform to platform and from implementation to implementation), I'm tempted to use ISO 8601 for Date.toString()
On 2017/03/08 06:33:01, jungshik at Google wrote: > > for example, CET becomes GMT+1 in en_US. To avoid > changing currently correct short acronyms, this patch falls back to the > OS's timezone name if ICU returns something beginning with "GMT". > > CLDR does that on purpose because timezone abbreviations are extremely ambiguous > outside regions where people are familiar with them. > > In European language locales, CET *is* used while it's not used en-US, ja, ko, > hi, etc. > > In what sense do you think 'CET' is correct? I'd rather stick to the CLDR/ICU > for them. SorryI haven't read Dan's comment about this. I wrote my comments (a few of them) based on the CL description and his code. Note that IANA tz DB has been moving away from these 3 or 4 letter abbreviations in general. See http://mm.icann.org/pipermail/tz-announce/2017-February/000045.html for instance (the section under the heading "Changes to past and future time zone abbreviations").
https://codereview.chromium.org/2724373002/diff/160001/src/i18n.cc File src/i18n.cc (right): https://codereview.chromium.org/2724373002/diff/160001/src/i18n.cc#newcode911 src/i18n.cc:911: GetTimeZone()->getDisplayName(is_dst, icu::TimeZone::SHORT, result); On 2017/03/08 07:15:20, jungshik at Google wrote: > By using SHORT, you're matching v8 on Linux, but v8 on Windows would be better > matched by 'LONG'. > > Given this platform disparity, I'm really tempted to use ISO 8601 for > Date.toString(). Dan, you talked about root locale in your comment, but here you're not using ROOT locale perhaps because SHORT + ROOT locale does not give you what you want. UnicodeString & getDisplayName (UBool daylight, EDisplayType style, const Locale &locale, UnicodeString &result) const. Again, I think "LONG" is better (matching v8's behavior on Windows) or just using ISO 8601 (breaking from the current behavior) can be considered. Another thing about v8 on Windows. Currently, localized timezone name is used inside parens for Date.prototype.toString(). We don't have to preserve this, do we?
https://codereview.chromium.org/2724373002/diff/120001/src/flag-definitions.h File src/flag-definitions.h (right): https://codereview.chromium.org/2724373002/diff/120001/src/flag-definitions.h... src/flag-definitions.h:214: V(icu_timezone_data, "get information about timezones from ICU") On 2017/03/03 15:06:54, Dan Ehrenberg wrote: > On 2017/03/03 13:20:27, ulan wrote: > > Is it actually a harmony feature? > > No, I just put it here so it could get testing in HARMONY_STAGED before actually > shipping it. Do you think it should be a separate, non-harmony flag instead? If it is not related to harmony, a separate flag would be better. https://codereview.chromium.org/2724373002/diff/120001/src/i18n.h File src/i18n.h (right): https://codereview.chromium.org/2724373002/diff/120001/src/i18n.h#newcode156 src/i18n.h:156: static const int32_t kMaxTimezoneChars = 20; On 2017/03/03 15:06:54, Dan Ehrenberg wrote: > On 2017/03/03 13:20:27, ulan wrote: > > Can we use std::string here? > > Do you mean that we'd make LocalTimezone() return an std::string, or store the > timezone locally as a string? The first seems like a reasonable idea; for the > second, I don't know how we'd do the memory management. Ah, sorry, nevermind. I meant the second, but I see the problem now.
Description was changed from ========== [date] Add ICU backend for timezone info behind a flag This patch implements a timezone backend which is based on ICU, rather than operating system calls. It can be turned on by passing the --icu-timezone-data flag. The goal here is to take advantage of ICU's data, which is more complete than the data that some system calls expose. For example, without any special code, this patch fixes the time zone of Lord Howe Island to have a correct 30 minute DST offset, rather than 60 minutes as the OS backends assume it to have. CLDR only includes a limited number short timezone names which pertain to a particular locale falling back to "GMT+n" otherwise. This applies surprisingly often; for example, CET becomes GMT+1 in en_US. To avoid changing currently correct short acronyms, this patch falls back to the OS's timezone name if ICU returns something beginning with "GMT". All put together, here's a simple interactive test showing off some newly fixed data, together with the timezone names not regressing. The following was run on a machine with the local time zone set to Lord Howe Island: $ d8 -e --icu-timezone-data "print(new Date(2017, 7)); print(new Date(2017, 0));" Tue Aug 01 2017 00:00:00 GMT+1030 (LHST) Sun Jan 01 2017 00:00:00 GMT+1100 (LHDT) BUG=v8:6031,v8:2137 ========== to ========== [date] Add ICU backend for timezone info behind a flag This patch implements a timezone backend which is based on ICU, rather than operating system calls. It can be turned on by passing the --icu-timezone-data flag. The goal here is to take advantage of ICU's data, which is more complete than the data that some system calls expose. For example, without any special code, this patch fixes the time zone of Lord Howe Island to have a correct 30 minute DST offset, rather than 60 minutes as the OS backends assume it to have. Unfortunately, the parenthized timezone name in Date.prototype.toString() differs across platforms. This patch chooses the long timezone name, which matches Windows behavior and might be the most intelligible, but the web compatibility impact is unclear. BUG=v8:6031,v8:2137,v8:6076 ==========
https://codereview.chromium.org/2724373002/diff/120001/src/flag-definitions.h File src/flag-definitions.h (right): https://codereview.chromium.org/2724373002/diff/120001/src/flag-definitions.h... src/flag-definitions.h:214: V(icu_timezone_data, "get information about timezones from ICU") On 2017/03/10 17:55:38, ulan wrote: > On 2017/03/03 15:06:54, Dan Ehrenberg wrote: > > On 2017/03/03 13:20:27, ulan wrote: > > > Is it actually a harmony feature? > > > > No, I just put it here so it could get testing in HARMONY_STAGED before > actually > > shipping it. Do you think it should be a separate, non-harmony flag instead? > > If it is not related to harmony, a separate flag would be better. OK, I made this a separate flag. https://codereview.chromium.org/2724373002/diff/160001/src/i18n.cc File src/i18n.cc (right): https://codereview.chromium.org/2724373002/diff/160001/src/i18n.cc#newcode911 src/i18n.cc:911: GetTimeZone()->getDisplayName(is_dst, icu::TimeZone::SHORT, result); On 2017/03/08 07:27:43, jungshik at Google wrote: > On 2017/03/08 07:15:20, jungshik at Google wrote: > > By using SHORT, you're matching v8 on Linux, but v8 on Windows would be better > > matched by 'LONG'. > > > > Given this platform disparity, I'm really tempted to use ISO 8601 for > > Date.toString(). > > Dan, you talked about root locale in your comment, but here you're not using > ROOT locale perhaps because SHORT + ROOT locale does not give you what you want. > > > UnicodeString & getDisplayName (UBool daylight, EDisplayType style, const > Locale &locale, UnicodeString &result) const. > > Again, I think "LONG" is better (matching v8's behavior on Windows) or just > using ISO 8601 (breaking from the current behavior) can be considered. > > Another thing about v8 on Windows. Currently, localized timezone name is used > inside parens for Date.prototype.toString(). We don't have to preserve this, do > we? > > OK, I switched to LONG, but I am not sure whether this is web-compatible.
The new version switches to a LONG time format, and it also fixes an issue present in a previous version which would reject non-ASCII timezone names. As discussed elsewhere, I'm not sure if LONG is web-compatible. However, interactively, I see a strange omission: some timezones are still rendered in a less accurate way. For example, on my machine, across multiple locales, the central European timezone is rendered as GMT+02:00 .
The CQ bit was checked by littledan@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
From implementation point of view the CL looks good to me. I don't know how to resolve the web-compatibility issue of the time format. Any suggestions? https://codereview.chromium.org/2724373002/diff/200001/src/i18n.cc File src/i18n.cc (right): https://codereview.chromium.org/2724373002/diff/200001/src/i18n.cc#newcode925 src/i18n.cc:925: icu::CheckedArrayByteSink byte_sink(name, kMaxTimezoneChars); CHECK(!byte_sink.Overflowed());
Since this patch is behind a flag, what if we land this version for now and continue investigate the compatibility side in preparation for a possible launch? As a first step, I'm preparing a specification for Date.prototype.toString and related functions (which are currently unspecified) at https://github.com/tc39/ecma262/pull/848 . From there, I was wondering if we could try omitting the timezone string--apparently it is sometimes omitted in some browsers, so maybe the web could take it if we got rid of it consistently. Maybe Chrome could do the experiment about whether shipping this is web-compatible, or maybe another browser is. Note that Firefox is also looking into this change to use ICU data, see https://bugzilla.mozilla.org/show_bug.cgi?id=1346211 https://codereview.chromium.org/2724373002/diff/200001/src/i18n.cc File src/i18n.cc (right): https://codereview.chromium.org/2724373002/diff/200001/src/i18n.cc#newcode925 src/i18n.cc:925: icu::CheckedArrayByteSink byte_sink(name, kMaxTimezoneChars); On 2017/03/17 11:00:05, ulan wrote: > CHECK(!byte_sink.Overflowed()); Check has been added.
The CQ bit was checked by littledan@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: v8_presubmit on master.tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_presubmit/builds/38633)
The CQ bit was checked by littledan@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
littledan@chromium.org changed reviewers: + machenbach@google.com
littledan@chromium.org changed reviewers: + machenbach@chromium.org - machenbach@google.com
I added some tests which change the default timezone and locale. I'm not sure how to do this on Windows, but elsewhere, it can be done with environment variables. machenbach, do you think you could take a look at the test infrastructure change?
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: v8_linux64_verify_csa_rel_ng on master.tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux64_verify_csa_rel_n...) v8_linux64_verify_csa_rel_ng_triggered on master.tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux64_verify_csa_rel_n...)
The CQ bit was checked by littledan@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
https://codereview.chromium.org/2724373002/diff/280001/tools/testrunner/local... File tools/testrunner/local/commands.py (left): https://codereview.chromium.org/2724373002/diff/280001/tools/testrunner/local... tools/testrunner/local/commands.py:53: def RunProcess(verbose, timeout, args, **rest): Why not keep **rest? Might be that it's used from somewhere. https://codereview.chromium.org/2724373002/diff/280001/tools/testrunner/local... File tools/testrunner/local/commands.py (right): https://codereview.chromium.org/2724373002/diff/280001/tools/testrunner/local... tools/testrunner/local/commands.py:53: def RunProcess(verbose, timeout, args, additionalEnv): nit: additional_env https://codereview.chromium.org/2724373002/diff/280001/tools/testrunner/local... tools/testrunner/local/commands.py:129: def Execute(args, verbose=False, timeout=None, env={}): env=None and then pass env or {} https://codereview.chromium.org/2724373002/diff/280001/tools/testrunner/objec... File tools/testrunner/objects/testcase.py (right): https://codereview.chromium.org/2724373002/diff/280001/tools/testrunner/objec... tools/testrunner/objects/testcase.py:49: copy.outcomes = self.outcomes Please add copying env here, just like outcomes. https://codereview.chromium.org/2724373002/diff/280001/tools/testrunner/objec... tools/testrunner/objects/testcase.py:52: def PackTask(self): Though we neither really support nor test it, maybe add this field to the pack/unpack methods. It's for the network execution of the test runner.
The CQ bit was checked by littledan@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: v8_win64_rel_ng on master.tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_win64_rel_ng/builds/25703) v8_win_compile_dbg on master.tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_win_compile_dbg/builds/3...) v8_win_rel_ng on master.tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_win_rel_ng/builds/25804)
Made changes based on machenbach's comments. I can't tell if the CQ failures have to do with the patch--it looks unrelated and the tree is closed, so I'll try again in a bit. https://codereview.chromium.org/2724373002/diff/280001/tools/testrunner/local... File tools/testrunner/local/commands.py (left): https://codereview.chromium.org/2724373002/diff/280001/tools/testrunner/local... tools/testrunner/local/commands.py:53: def RunProcess(verbose, timeout, args, **rest): On 2017/04/07 13:02:10, Michael Achenbach wrote: > Why not keep **rest? Might be that it's used from somewhere. I didn't see any usages in the tree, but anyway, added it back in. https://codereview.chromium.org/2724373002/diff/280001/tools/testrunner/local... File tools/testrunner/local/commands.py (right): https://codereview.chromium.org/2724373002/diff/280001/tools/testrunner/local... tools/testrunner/local/commands.py:53: def RunProcess(verbose, timeout, args, additionalEnv): On 2017/04/07 13:02:10, Michael Achenbach wrote: > nit: additional_env Done. https://codereview.chromium.org/2724373002/diff/280001/tools/testrunner/local... tools/testrunner/local/commands.py:129: def Execute(args, verbose=False, timeout=None, env={}): On 2017/04/07 13:02:10, Michael Achenbach wrote: > env=None > and then pass > env or {} Done. https://codereview.chromium.org/2724373002/diff/280001/tools/testrunner/objec... File tools/testrunner/objects/testcase.py (right): https://codereview.chromium.org/2724373002/diff/280001/tools/testrunner/objec... tools/testrunner/objects/testcase.py:49: copy.outcomes = self.outcomes On 2017/04/07 13:02:10, Michael Achenbach wrote: > Please add copying env here, just like outcomes. Done. What uses this, so I can test it? I ran all the unit tests and they passed without the line I added, and the precq passed without it as well. https://codereview.chromium.org/2724373002/diff/280001/tools/testrunner/objec... tools/testrunner/objects/testcase.py:52: def PackTask(self): On 2017/04/07 13:02:10, Michael Achenbach wrote: > Though we neither really support nor test it, maybe add this field to the > pack/unpack methods. It's for the network execution of the test runner. Done.
Made changes based on machenbach's comments. I can't tell if the CQ failures have to do with the patch--it looks unrelated and the tree is closed, so I'll try again in a bit.
lgtm from my side.
The CQ bit was checked by littledan@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: v8_linux64_avx2_rel_ng on master.tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux64_avx2_rel_ng/buil...) v8_linux64_avx2_rel_ng_triggered on master.tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux64_avx2_rel_ng_trig...) v8_linux64_rel_ng on master.tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux64_rel_ng/builds/24230) v8_linux64_rel_ng_triggered on master.tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux64_rel_ng_triggered...)
lgtm https://codereview.chromium.org/2724373002/diff/280001/tools/testrunner/local... File tools/testrunner/local/commands.py (right): https://codereview.chromium.org/2724373002/diff/280001/tools/testrunner/local... tools/testrunner/local/commands.py:129: def Execute(args, verbose=False, timeout=None, env={}): On 2017/04/07 14:06:09, Dan Ehrenberg wrote: > On 2017/04/07 13:02:10, Michael Achenbach wrote: > > env=None > > and then pass > > env or {} > > Done. You didn't do the second part, hence the test errors in buildbot. The line in the end should look like: return RunProcess(verbose, timeout, args, env or {}) often folks also put this into an extra line like: env = env or {} but I'm usually happy with one-liners. Note that in your first version, {} is evaluated at parse-time and all functions would share the same dict object. In case it's mutated, it results in hard-to-find errors. https://codereview.chromium.org/2724373002/diff/280001/tools/testrunner/objec... File tools/testrunner/objects/testcase.py (right): https://codereview.chromium.org/2724373002/diff/280001/tools/testrunner/objec... tools/testrunner/objects/testcase.py:49: copy.outcomes = self.outcomes On 2017/04/07 14:06:09, Dan Ehrenberg wrote: > On 2017/04/07 13:02:10, Michael Achenbach wrote: > > Please add copying env here, just like outcomes. > > Done. What uses this, so I can test it? I ran all the unit tests and they passed > without the line I added, and the precq passed without it as well. Hard to tell, maybe it's a noop. The copy method is called to multiply a test for several testing variants. It'll called only for those suites that use variants. Can also be that it plays an extra roll in the non-maintained network execution.
The CQ bit was checked by littledan@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by littledan@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from ulan@chromium.org, machenbach@chromium.org Link to the patchset: https://codereview.chromium.org/2724373002/#ps320001 (title: "or {}")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
https://codereview.chromium.org/2724373002/diff/280001/tools/testrunner/local... File tools/testrunner/local/commands.py (right): https://codereview.chromium.org/2724373002/diff/280001/tools/testrunner/local... tools/testrunner/local/commands.py:129: def Execute(args, verbose=False, timeout=None, env={}): On 2017/04/11 08:38:36, Michael Achenbach wrote: > On 2017/04/07 14:06:09, Dan Ehrenberg wrote: > > On 2017/04/07 13:02:10, Michael Achenbach wrote: > > > env=None > > > and then pass > > > env or {} > > > > Done. > > You didn't do the second part, hence the test errors in buildbot. The line in > the end should look like: > > return RunProcess(verbose, timeout, args, env or {}) > > often folks also put this into an extra line like: > env = env or {} > but I'm usually happy with one-liners. > > Note that in your first version, {} is evaluated at parse-time and all functions > would share the same dict object. In case it's mutated, it results in > hard-to-find errors. This was a result of an overconfident search of some of the test code for callers without this arg... anyway, that did the trick. Thanks.
CQ is committing da patch. Bot data: {"patchset_id": 320001, "attempt_start_ts": 1491910227140500, "parent_rev": "9ff68b0c47798ebbafcae6c48014f5e8cf489606", "commit_rev": "b213f2399038a615cdfbfa0201cddc113d304018"}
Message was sent while issue was closed.
Description was changed from ========== [date] Add ICU backend for timezone info behind a flag This patch implements a timezone backend which is based on ICU, rather than operating system calls. It can be turned on by passing the --icu-timezone-data flag. The goal here is to take advantage of ICU's data, which is more complete than the data that some system calls expose. For example, without any special code, this patch fixes the time zone of Lord Howe Island to have a correct 30 minute DST offset, rather than 60 minutes as the OS backends assume it to have. Unfortunately, the parenthized timezone name in Date.prototype.toString() differs across platforms. This patch chooses the long timezone name, which matches Windows behavior and might be the most intelligible, but the web compatibility impact is unclear. BUG=v8:6031,v8:2137,v8:6076 ========== to ========== [date] Add ICU backend for timezone info behind a flag This patch implements a timezone backend which is based on ICU, rather than operating system calls. It can be turned on by passing the --icu-timezone-data flag. The goal here is to take advantage of ICU's data, which is more complete than the data that some system calls expose. For example, without any special code, this patch fixes the time zone of Lord Howe Island to have a correct 30 minute DST offset, rather than 60 minutes as the OS backends assume it to have. Unfortunately, the parenthized timezone name in Date.prototype.toString() differs across platforms. This patch chooses the long timezone name, which matches Windows behavior and might be the most intelligible, but the web compatibility impact is unclear. BUG=v8:6031,v8:2137,v8:6076 Review-Url: https://codereview.chromium.org/2724373002 Cr-Commit-Position: refs/heads/master@{#44562} Committed: https://chromium.googlesource.com/v8/v8/+/b213f2399038a615cdfbfa0201cddc113d3... ==========
Message was sent while issue was closed.
Committed patchset #17 (id:320001) as https://chromium.googlesource.com/v8/v8/+/b213f2399038a615cdfbfa0201cddc113d3...
Message was sent while issue was closed.
A revert of this CL (patchset #17 id:320001) has been created in https://codereview.chromium.org/2811103002/ by machenbach@chromium.org. The reason for reverting is: Breaks noi18n: https://build.chromium.org/p/client.v8/builders/V8%20Linux%20-%20noi18n%20-%2.... |