Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(224)

Issue 11368138: Add some support for the code-point code-unit distinction. (Closed)

Created:
8 years, 1 month ago by erikcorry
Modified:
8 years, 1 month ago
CC:
reviews_dartlang.org, ahe
Visibility:
Public.

Description

Add some support for the code-point code-unit distinction. Adds String.fromCodeUnits and String.codeUnitAt, and modifies String.fromCodePoints and String.codePointAt to actually use code points. Fixes String.charCodes to use code points and adds String.codeUnits. Reenables some tests, and adds new ones for non-BMP characters. Fixes issues 6418, 6501 and 1357. R=floitsch@google.com BUG= Committed: https://code.google.com/p/dart/source/detail?r=15150

Patch Set 1 #

Total comments: 35

Patch Set 2 : New version integrates feedback, adds less to standard String class. #

Total comments: 16

Patch Set 3 : Implemented feedback from patch set 2. #

Total comments: 42

Patch Set 4 : Implemented feedback from patch set 3 #

Total comments: 2
Unified diffs Side-by-side diffs Delta from patch set Stats (+1044 lines, -632 lines) Patch
M runtime/lib/string.cc View 1 2 3 5 chunks +18 lines, -17 lines 0 comments Download
M runtime/lib/string_base.dart View 1 2 3 8 chunks +113 lines, -78 lines 0 comments Download
M runtime/lib/string_patch.dart View 1 2 1 chunk +2 lines, -2 lines 0 comments Download
M runtime/vm/bootstrap_natives.h View 1 2 1 chunk +2 lines, -2 lines 0 comments Download
M runtime/vm/dart_api_impl.cc View 1 2 3 1 chunk +3 lines, -1 line 0 comments Download
M runtime/vm/dart_api_message.cc View 1 2 3 1 chunk +1 line, -1 line 0 comments Download
M runtime/vm/object.h View 1 2 3 8 chunks +14 lines, -6 lines 0 comments Download
M runtime/vm/object.cc View 1 2 3 13 chunks +66 lines, -44 lines 0 comments Download
M runtime/vm/object_test.cc View 1 2 3 21 chunks +109 lines, -59 lines 0 comments Download
M runtime/vm/scanner.h View 1 chunk +1 line, -1 line 0 comments Download
M runtime/vm/scanner.cc View 1 2 3 2 chunks +10 lines, -2 lines 1 comment Download
M runtime/vm/symbols.cc View 1 2 3 3 chunks +1 line, -7 lines 0 comments Download
M runtime/vm/unicode.h View 1 2 3 3 chunks +56 lines, -9 lines 1 comment Download
M runtime/vm/unicode.cc View 1 2 3 14 chunks +41 lines, -39 lines 0 comments Download
M runtime/vm/unicode_test.cc View 1 2 85 chunks +172 lines, -172 lines 0 comments Download
M sdk/lib/_internal/compiler/implementation/js_backend/constant_emitter.dart View 1 1 chunk +4 lines, -15 lines 0 comments Download
M sdk/lib/_internal/compiler/implementation/lib/core_patch.dart View 1 2 3 1 chunk +6 lines, -6 lines 0 comments Download
M sdk/lib/_internal/compiler/implementation/lib/js_helper.dart View 1 2 3 1 chunk +4 lines, -3 lines 0 comments Download
M sdk/lib/_internal/compiler/implementation/lib/js_string.dart View 1 2 2 chunks +25 lines, -3 lines 0 comments Download
M sdk/lib/_internal/compiler/implementation/scanner/token.dart View 1 2 3 2 chunks +9 lines, -1 line 0 comments Download
M sdk/lib/_internal/compiler/implementation/source_map_builder.dart View 1 1 chunk +3 lines, -3 lines 0 comments Download
M sdk/lib/_internal/compiler/implementation/string_validator.dart View 1 1 chunk +2 lines, -0 lines 0 comments Download
M sdk/lib/_internal/compiler/implementation/util/util.dart View 1 1 chunk +54 lines, -40 lines 0 comments Download
M sdk/lib/core/string.dart View 1 2 3 3 chunks +97 lines, -18 lines 0 comments Download
M sdk/lib/io/string_stream.dart View 1 2 chunks +2 lines, -0 lines 0 comments Download
M sdk/lib/uri/encode_decode.dart View 1 chunk +2 lines, -11 lines 0 comments Download
M sdk/lib/utf/utf32.dart View 2 chunks +2 lines, -2 lines 0 comments Download
M sdk/lib/utf/utf8.dart View 2 chunks +2 lines, -2 lines 0 comments Download
M sdk/lib/utf/utf_core.dart View 1 chunk +0 lines, -42 lines 0 comments Download
M tests/co19/co19-dart2dart.status View 1 1 chunk +5 lines, -3 lines 0 comments Download
M tests/co19/co19-dart2js.status View 1 2 2 chunks +6 lines, -5 lines 0 comments Download
M tests/co19/co19-runtime.status View 1 2 chunks +6 lines, -4 lines 0 comments Download
M tests/corelib/corelib.status View 1 3 chunks +7 lines, -1 line 0 comments Download
A tests/corelib/reg_exp_unicode_2_test.dart View 1 chunk +18 lines, -0 lines 0 comments Download
A tests/corelib/reg_exp_unicode_test.dart View 1 chunk +22 lines, -0 lines 0 comments Download
M tests/corelib/string_from_list_test.dart View 1 chunk +25 lines, -4 lines 0 comments Download
A tests/corelib/string_trim_unicode_test.dart View 1 1 chunk +47 lines, -0 lines 0 comments Download
A tests/corelib/surrogate_pair_toUpper_test.dart View 1 1 chunk +64 lines, -0 lines 0 comments Download
M tests/language/language_dart2js.status View 1 2 1 chunk +0 lines, -1 line 0 comments Download
M tests/standalone/io/string_stream_test.dart View 1 1 chunk +2 lines, -2 lines 0 comments Download
M tests/utils/utf8_test.dart View 1 2 chunks +6 lines, -10 lines 0 comments Download
M tests/utils/utf_test.dart View 1 chunk +5 lines, -5 lines 0 comments Download
M tests/utils/utils.status View 1 1 chunk +0 lines, -1 line 0 comments Download
M utils/tests/string_encoding/unicode_test.dart View 1 1 chunk +10 lines, -10 lines 0 comments Download

Messages

Total messages: 21 (0 generated)
erikcorry
8 years, 1 month ago (2012-11-08 09:55:04 UTC) #1
erikcorry
See also http://code.google.com/p/co19/issues/detail?id=284
8 years, 1 month ago (2012-11-08 10:38:55 UTC) #2
floitsch
LGTM. Please wait for the VM guys to comment on the signed-to-unsigned change. Added Peter ...
8 years, 1 month ago (2012-11-08 15:28:21 UTC) #3
Ivan Posva
Do not commit this before you get buy-in from the VM team. Thanks, -Ivan
8 years, 1 month ago (2012-11-08 18:20:38 UTC) #4
siva
https://codereview.chromium.org/11368138/diff/1/runtime/vm/object.h File runtime/vm/object.h (right): https://codereview.chromium.org/11368138/diff/1/runtime/vm/object.h#newcode3968 runtime/vm/object.h:3968: } I don't mind changing the name 'CharAt' to ...
8 years, 1 month ago (2012-11-08 19:06:47 UTC) #5
erikcorry
https://codereview.chromium.org/11368138/diff/1/runtime/vm/object.h File runtime/vm/object.h (right): https://codereview.chromium.org/11368138/diff/1/runtime/vm/object.h#newcode3968 runtime/vm/object.h:3968: } On 2012/11/08 19:06:47, siva wrote: > I don't ...
8 years, 1 month ago (2012-11-08 22:09:34 UTC) #6
siva
https://codereview.chromium.org/11368138/diff/1/runtime/vm/object.h File runtime/vm/object.h (right): https://codereview.chromium.org/11368138/diff/1/runtime/vm/object.h#newcode3968 runtime/vm/object.h:3968: } We could fix String::Transform/TwoByteString::Transform to locally compute the ...
8 years, 1 month ago (2012-11-09 02:40:42 UTC) #7
cshapiro
https://codereview.chromium.org/11368138/diff/1/runtime/lib/string.cc File runtime/lib/string.cc (right): https://codereview.chromium.org/11368138/diff/1/runtime/lib/string.cc#newcode162 runtime/lib/string.cc:162: return Symbols::New(&value, 1); This is generally not a good ...
8 years, 1 month ago (2012-11-09 04:43:57 UTC) #8
erikcorry
https://codereview.chromium.org/11368138/diff/1/runtime/lib/string.cc File runtime/lib/string.cc (right): https://codereview.chromium.org/11368138/diff/1/runtime/lib/string.cc#newcode162 runtime/lib/string.cc:162: return Symbols::New(&value, 1); On 2012/11/09 04:43:57, cshapiro wrote: > ...
8 years, 1 month ago (2012-11-12 20:22:17 UTC) #9
cshapiro
https://codereview.chromium.org/11368138/diff/1/runtime/lib/string.cc File runtime/lib/string.cc (right): https://codereview.chromium.org/11368138/diff/1/runtime/lib/string.cc#newcode162 runtime/lib/string.cc:162: return Symbols::New(&value, 1); Yes, what you say is all ...
8 years, 1 month ago (2012-11-12 21:51:13 UTC) #10
ngeoffray
FYI https://codereview.chromium.org/11368138/diff/1/sdk/lib/_internal/compiler/implementation/lib/js_helper.dart File sdk/lib/_internal/compiler/implementation/lib/js_helper.dart (right): https://codereview.chromium.org/11368138/diff/1/sdk/lib/_internal/compiler/implementation/lib/js_helper.dart#newcode546 sdk/lib/_internal/compiler/implementation/lib/js_helper.dart:546: static String stringFromCharCodes(charCodes) { On 2012/11/08 15:28:21, floitsch ...
8 years, 1 month ago (2012-11-15 13:24:20 UTC) #11
erikcorry
https://codereview.chromium.org/11368138/diff/1/runtime/lib/string.cc File runtime/lib/string.cc (right): https://codereview.chromium.org/11368138/diff/1/runtime/lib/string.cc#newcode32 runtime/lib/string.cc:32: uint32_t value = Smi::Cast(index_object).Value(); On 2012/11/08 15:28:21, floitsch wrote: ...
8 years, 1 month ago (2012-11-15 13:28:24 UTC) #12
erikcorry
I uploaded a new version. On the VM side it is less invasive. The CodePointAt ...
8 years, 1 month ago (2012-11-15 14:55:41 UTC) #13
cshapiro
https://codereview.chromium.org/11368138/diff/12001/runtime/lib/regexp_jsc.cc File runtime/lib/regexp_jsc.cc (right): https://codereview.chromium.org/11368138/diff/12001/runtime/lib/regexp_jsc.cc#newcode22 runtime/lib/regexp_jsc.cc:22: two_byte_str[i] = str.CodeUnitAt(i); Please do not change the name ...
8 years, 1 month ago (2012-11-15 20:14:51 UTC) #14
erikcorry
New version uploaded https://codereview.chromium.org/11368138/diff/12001/runtime/lib/regexp_jsc.cc File runtime/lib/regexp_jsc.cc (right): https://codereview.chromium.org/11368138/diff/12001/runtime/lib/regexp_jsc.cc#newcode22 runtime/lib/regexp_jsc.cc:22: two_byte_str[i] = str.CodeUnitAt(i); On 2012/11/15 20:14:51, ...
8 years, 1 month ago (2012-11-15 23:47:05 UTC) #15
siva
It would have been nice if we implemented the Codepoint iterator class first in Dart ...
8 years, 1 month ago (2012-11-16 22:32:04 UTC) #16
cshapiro
http://codereview.chromium.org/11368138/diff/11002/runtime/vm/object.cc File runtime/vm/object.cc (right): http://codereview.chromium.org/11368138/diff/11002/runtime/vm/object.cc#newcode9895 runtime/vm/object.cc:9895: return OneByteString::CharAt(*this, index); Why was this changed? I think ...
8 years, 1 month ago (2012-11-17 02:25:27 UTC) #17
erikcorry
https://codereview.chromium.org/11368138/diff/11002/runtime/lib/string_base.dart File runtime/lib/string_base.dart (right): https://codereview.chromium.org/11368138/diff/11002/runtime/lib/string_base.dart#newcode18 runtime/lib/string_base.dart:18: static String createFromUtf16(List<int> codeUnits) { On 2012/11/16 22:32:04, siva ...
8 years, 1 month ago (2012-11-19 12:40:41 UTC) #18
Søren Gjesse
dbc http://codereview.chromium.org/11368138/diff/16046/runtime/vm/unicode.h File runtime/vm/unicode.h (right): http://codereview.chromium.org/11368138/diff/16046/runtime/vm/unicode.h#newcode32 runtime/vm/unicode.h:32: static int32_t CodePointFromCodeUnits(int32_t lead, int32_t trail) { Assert ...
8 years, 1 month ago (2012-11-19 14:18:36 UTC) #19
cshapiro
Hi Erik, I have added a code point iterator to the VM that supports its ...
8 years, 1 month ago (2012-11-20 00:22:02 UTC) #20
siva
8 years, 1 month ago (2012-11-20 00:48:00 UTC) #21
When you sync up to TOT you may find that a number of the
VM changes are now fixed with the Iterator change that
Carl submitted.

Your changes LGTM once you have resolved to TOT.

https://codereview.chromium.org/11368138/diff/16046/runtime/vm/scanner.cc
File runtime/vm/scanner.cc (right):

https://codereview.chromium.org/11368138/diff/16046/runtime/vm/scanner.cc#new...
runtime/vm/scanner.cc:563: Symbols::New(string_chars.data(),
string_chars.length()));
Is the change above necessary considering that this
method calls the uint32 version of String::New which accounts
for supplementary characters.

Powered by Google App Engine
This is Rietveld 408576698