Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(30)

Issue 1544023002: Call out to ICU for case conversion

Created:
5 years ago by Dan Ehrenberg
Modified:
4 years, 11 months ago
Reviewers:
yangguo, Yang
CC:
v8-reviews_googlegroups.com
Base URL:
https://chromium.googlesource.com/v8/v8.git@master
Target Ref:
refs/pending/heads/master
Project:
v8
Visibility:
Public.

Description

Call out to ICU for case conversion under a new flag V8 has its own implementation of case conversion, in unibrow, which has a couple bugs. This patch calls out to Unicode case conversion from String.prototype.toLowerCase() and String.prototype.toUpperCase() if the --icu-case-mapping flag is passed. On no-intl builds, the flag is disabled. A fast-path for some latin1 strings is taken from Blink in an attempt to avoid performance regressions, and microbenchmarks show that it is competitive with the old implementation. This new behavior fixes a number of test262 tests, as well as a Kangax test. R=yangguo BUG=v8:4476 LOG=Y

Patch Set 1 #

Patch Set 2 : Fix memory allocation, type checking, webkit test passes now #

Patch Set 3 : Fix Windows typecheck issue #

Patch Set 4 : Starting to bring in Blink optimization (doesn't work yet) #

Patch Set 5 : Add ICU casing conditionally behind a flag, with all optimizations #

Patch Set 6 : Fix cast that Windows doesn't like #

Patch Set 7 : Fix lint issue #

Patch Set 8 : Add a test, fix cast (again) #

Patch Set 9 : Hopefully fix lint #

Patch Set 10 : Alphabetize includes #

Patch Set 11 : Additional test case #

Total comments: 12
Unified diffs Side-by-side diffs Delta from patch set Stats (+265 lines, -5 lines) Patch
M src/flag-definitions.h View 1 2 3 4 1 chunk +5 lines, -0 lines 0 comments Download
M src/runtime/runtime-strings.cc View 1 2 3 4 5 6 7 8 9 2 chunks +232 lines, -4 lines 12 comments Download
A test/intl/general/case-mapping.js View 1 2 3 4 5 6 7 8 9 10 1 chunk +26 lines, -0 lines 0 comments Download
M test/intl/testcfg.py View 1 2 3 4 5 6 7 1 chunk +2 lines, -1 line 0 comments Download

Messages

Total messages: 44 (22 generated)
commit-bot: I haz the power
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1544023002/1 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1544023002/1
5 years ago (2015-12-23 03:00:02 UTC) #2
commit-bot: I haz the power
Dry run: Try jobs failed on following builders: v8_presubmit on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_presubmit/builds/9226)
5 years ago (2015-12-23 03:02:38 UTC) #4
commit-bot: I haz the power
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1544023002/20001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1544023002/20001
5 years ago (2015-12-23 07:09:42 UTC) #6
commit-bot: I haz the power
Dry run: Try jobs failed on following builders: v8_win_nosnap_shared_compile_rel on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_win_nosnap_shared_compile_rel/builds/8502)
5 years ago (2015-12-23 07:14:17 UTC) #8
commit-bot: I haz the power
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1544023002/40001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1544023002/40001
5 years ago (2015-12-23 07:32:16 UTC) #10
commit-bot: I haz the power
Dry run: This issue passed the CQ dry run.
5 years ago (2015-12-23 07:54:27 UTC) #12
commit-bot: I haz the power
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1544023002/80001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1544023002/80001
4 years, 12 months ago (2015-12-25 05:48:04 UTC) #14
commit-bot: I haz the power
Dry run: Try jobs failed on following builders: v8_presubmit on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_presubmit/builds/9275)
4 years, 12 months ago (2015-12-25 05:50:55 UTC) #16
commit-bot: I haz the power
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1544023002/100001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1544023002/100001
4 years, 12 months ago (2015-12-25 05:57:15 UTC) #19
commit-bot: I haz the power
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1544023002/120001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1544023002/120001
4 years, 12 months ago (2015-12-25 05:59:33 UTC) #21
commit-bot: I haz the power
Dry run: Try jobs failed on following builders: v8_linux_arm64_rel on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux_arm64_rel/builds/13451) v8_mac_rel on ...
4 years, 12 months ago (2015-12-25 06:02:06 UTC) #23
commit-bot: I haz the power
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1544023002/140001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1544023002/140001
4 years, 12 months ago (2015-12-25 07:02:54 UTC) #25
commit-bot: I haz the power
Dry run: Try jobs failed on following builders: v8_presubmit on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_presubmit/builds/9279)
4 years, 12 months ago (2015-12-25 07:05:17 UTC) #27
commit-bot: I haz the power
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1544023002/160001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1544023002/160001
4 years, 12 months ago (2015-12-25 07:55:43 UTC) #29
commit-bot: I haz the power
Dry run: Try jobs failed on following builders: v8_presubmit on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_presubmit/builds/9282)
4 years, 12 months ago (2015-12-25 07:57:51 UTC) #31
commit-bot: I haz the power
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1544023002/180001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1544023002/180001
4 years, 12 months ago (2015-12-25 07:59:23 UTC) #33
Dan Ehrenberg
This patch introduces the option of using ICU for case conversion, as suggested by https://docs.google.com/document/d/1ueSYBXXAf0cI3H1gyjwLdyTNhOE7WLeW2Yhu1cBRg0s/edit#heading=h.xl712vm6vxn3 ...
4 years, 12 months ago (2015-12-25 08:05:50 UTC) #34
commit-bot: I haz the power
Dry run: This issue passed the CQ dry run.
4 years, 12 months ago (2015-12-25 08:30:28 UTC) #36
Dan Ehrenberg
On 2015/12/25 at 08:05:50, Dan Ehrenberg wrote: > This patch introduces the option of using ...
4 years, 12 months ago (2015-12-25 09:17:25 UTC) #37
commit-bot: I haz the power
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1544023002/200001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1544023002/200001
4 years, 12 months ago (2015-12-25 09:30:10 UTC) #39
commit-bot: I haz the power
Dry run: This issue passed the CQ dry run.
4 years, 12 months ago (2015-12-25 10:01:13 UTC) #41
Yang
4 years, 11 months ago (2016-01-07 09:47:38 UTC) #44
Sorry it took this long. Vacation... :)

https://codereview.chromium.org/1544023002/diff/200001/src/runtime/runtime-st...
File src/runtime/runtime-strings.cc (right):

https://codereview.chromium.org/1544023002/diff/200001/src/runtime/runtime-st...
src/runtime/runtime-strings.cc:1078: namespace {
Can we move all of that into its own file?

https://codereview.chromium.org/1544023002/diff/200001/src/runtime/runtime-st...
src/runtime/runtime-strings.cc:1097: reinterpret_cast<const
UChar*>(SeqTwoByteString::cast(*s)->GetChars());
Please use String::FlatContent to access the flat content.

https://codereview.chromium.org/1544023002/diff/200001/src/runtime/runtime-st...
src/runtime/runtime-strings.cc:1119: int32_t real_length_again =
fn(reinterpret_cast<UChar*>(result->GetChars()),
Instead of running the whole thing again, you could simply create a new string
with the excess length, convert the remaining characters, and create a cons
string for both parts.

https://codereview.chromium.org/1544023002/diff/200001/src/runtime/runtime-st...
src/runtime/runtime-strings.cc:1120: real_length, src, length, "", &error);
src can have moved already after the allocation of the two byte string. Please
use a new instance of String::FlatContent to refetch the source.

https://codereview.chromium.org/1544023002/diff/200001/src/runtime/runtime-st...
src/runtime/runtime-strings.cc:1155: uint16_t ch = s->Get(index);
This can get pretty expensive if the string is a deeply nested ConsString.
Please flatten it and use String::FlatContent instead.

https://codereview.chromium.org/1544023002/diff/200001/src/runtime/runtime-st...
src/runtime/runtime-strings.cc:1176: if (StringShape(*s).IsSequentialOneByte())
{
Let's not use StringShape. Just check for s->IsSeqOneByteString().

https://codereview.chromium.org/1544023002/diff/200001/src/runtime/runtime-st...
src/runtime/runtime-strings.cc:1180: memcpy(result->GetChars(),
source->GetChars(), first_index_to_lower);
We should just use CopyChars for uniformity here. See below.

https://codereview.chromium.org/1544023002/diff/200001/src/runtime/runtime-st...
src/runtime/runtime-strings.cc:1193: // as Blink does, but there's also the
ConsString case.
As explained above, we should flatten upfront, so no ConsStrings here. We could
then do CopyChars up to first_index_to_lower, which handles the two-byte to
one-byte copying.

https://codereview.chromium.org/1544023002/diff/200001/src/runtime/runtime-st...
src/runtime/runtime-strings.cc:1215: const uint16_t sharp_s = L'\u00DF';
0xDF should do the trick to, right?

https://codereview.chromium.org/1544023002/diff/200001/src/runtime/runtime-st...
src/runtime/runtime-strings.cc:1232: uint16_t ch = s->Get(index);
Again, please flatten upfront and use String::FlatContent.

https://codereview.chromium.org/1544023002/diff/200001/src/runtime/runtime-st...
src/runtime/runtime-strings.cc:1253: goto upconvert;
Can't we simply "return ConvertCaseICU(s, isolate, u_strToUpper);"? I'd prefer
this small code duplication way more than goto.

https://codereview.chromium.org/1544023002/diff/200001/src/runtime/runtime-st...
src/runtime/runtime-strings.cc:1293: if (FLAG_icu_case_mapping)
the convention with multiline conditionals is to use brackets.

Powered by Google App Engine
This is Rietveld 408576698