Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(166)

Issue 11365243: Revert OneByteString back to ISO Latin-1 instead of ASCII (Closed)

Created:
8 years, 1 month ago by siva
Modified:
8 years, 1 month ago
Reviewers:
Søren Gjesse, cshapiro
CC:
reviews_dartlang.org, Anton Muhin
Visibility:
Public.

Description

Revert OneByteString back to ISO Latin-1 instead of ASCII as webkit supports ISO Latin-1 and UTF-16 encodings for strings. Committed: https://code.google.com/p/dart/source/detail?r=14989

Patch Set 1 #

Total comments: 2

Patch Set 2 : #

Patch Set 3 : #

Patch Set 4 : #

Total comments: 8

Patch Set 5 : #

Unified diffs Side-by-side diffs Delta from patch set Stats (+81 lines, -69 lines) Patch
M runtime/include/dart_api.h View 1 2 3 4 1 chunk +2 lines, -2 lines 0 comments Download
M runtime/vm/dart_api_impl.cc View 1 2 3 4 1 chunk +1 line, -1 line 0 comments Download
M runtime/vm/dart_api_impl_test.cc View 1 2 3 4 7 chunks +10 lines, -10 lines 0 comments Download
M runtime/vm/dart_api_message.cc View 1 2 3 4 3 chunks +21 lines, -8 lines 0 comments Download
M runtime/vm/object.cc View 1 2 3 4 10 chunks +14 lines, -22 lines 0 comments Download
M runtime/vm/object_test.cc View 1 2 3 4 1 chunk +3 lines, -3 lines 0 comments Download
M runtime/vm/symbols.cc View 1 2 3 4 1 chunk +2 lines, -2 lines 0 comments Download
M runtime/vm/unicode.h View 1 2 3 4 2 chunks +5 lines, -5 lines 0 comments Download
M runtime/vm/unicode.cc View 1 2 3 4 3 chunks +23 lines, -16 lines 0 comments Download

Messages

Total messages: 6 (0 generated)
siva
8 years, 1 month ago (2012-11-14 00:54:04 UTC) #1
cshapiro
lgtm you can probably shorten "ISOLatin1" to just "Latin1" as the ISO part is pretty ...
8 years, 1 month ago (2012-11-14 00:57:23 UTC) #2
siva
Dropped ISO from the name. https://codereview.chromium.org/11365243/diff/1/runtime/include/dart_api.h File runtime/include/dart_api.h (right): https://codereview.chromium.org/11365243/diff/1/runtime/include/dart_api.h#newcode1329 runtime/include/dart_api.h:1329: DART_EXPORT bool Dart_IsISOLatin1String(Dart_Handle object); ...
8 years, 1 month ago (2012-11-14 01:48:54 UTC) #3
siva
The recent changes to support non ASCII characters in messages to native port needed some ...
8 years, 1 month ago (2012-11-14 22:48:54 UTC) #4
Søren Gjesse
lgtm https://codereview.chromium.org/11365243/diff/5004/runtime/include/dart_api.h File runtime/include/dart_api.h (right): https://codereview.chromium.org/11365243/diff/5004/runtime/include/dart_api.h#newcode1329 runtime/include/dart_api.h:1329: * Is this object a Latin-1 String? Maybe ...
8 years, 1 month ago (2012-11-15 08:11:30 UTC) #5
siva
8 years, 1 month ago (2012-11-15 22:42:59 UTC) #6
https://codereview.chromium.org/11365243/diff/5004/runtime/include/dart_api.h
File runtime/include/dart_api.h (right):

https://codereview.chromium.org/11365243/diff/5004/runtime/include/dart_api.h...
runtime/include/dart_api.h:1329: * Is this object a Latin-1 String?
On 2012/11/15 08:11:30, Søren Gjesse wrote:
> Maybe mention ISO 8859-1 in the comment.

Done.

https://codereview.chromium.org/11365243/diff/5004/runtime/vm/dart_api_messag...
File runtime/vm/dart_api_message.cc (right):

https://codereview.chromium.org/11365243/diff/5004/runtime/vm/dart_api_messag...
runtime/vm/dart_api_message.cc:351: reinterpret_cast<uint8_t*>(::malloc(len *
sizeof(uint8_t)));
I agree that we can do some optimization here for pure
ASCII strings in a new CL.

On 2012/11/15 08:11:30, Søren Gjesse wrote:
> It would be nice if we could avoid the additional allocation here when dealing
> with ASCII strings. How about optimistically allocating a CObject string for
> ASCII assuming that it would most likely be ASCII. If we encounter a non-ASCII
> character we can then start what the code currently does.
> 
> We could also add ReallocateDartCObjectString to extend the CObject string.
This
> is currently zone allocated, and will be the last object allocated in the
zone.
> The zone Realloc method could be optimized for expanding the last allocated
> object.
> 
> Maybe that gets to complicated though - we can keep it as is for now.

https://codereview.chromium.org/11365243/diff/5004/runtime/vm/dart_api_messag...
runtime/vm/dart_api_message.cc:790: if (type == Utf8::kLatin1) {
Yes, this was an optimization I was planning on doing
might help regular VM code too. I would prefer doing it in
a new CL.

On 2012/11/15 08:11:30, Søren Gjesse wrote:
> It would be nice if Utf8::CodePointCount could also still return kAscii if it
> was all ASCII. Then we don't need to convert at all.

https://codereview.chromium.org/11365243/diff/5004/runtime/vm/dart_api_messag...
runtime/vm/dart_api_message.cc:793: Utf8::DecodeToLatin1(utf8_str, utf8_len,
latin1_str, len);
Good point, would like to do this in a new CL in order to
avoid too many API changes in this CL.

On 2012/11/15 08:11:30, Søren Gjesse wrote:
> If the Utf8 class had a function to convert a single code point we could avoid
> allocating here, e.g.:
> 
> int i = 0;
> while(i < len) {
>   uint32_t code_point;
>   i += Utf8::Decode(utf8_str + i, &code_point);
>   ASSERT(code_point <= 255);
>   Write<uint8_t>(code_point);
> }
> 
> Utf8::DecodeCodePointToLatin1 returns the number of UTF-8 bytes consumed.
> Alternatively the code point could be returned.
> 
> int i = 0;
> while(i < len) {
>   int consumed;
>   Write<uint8_t>(Utf8::Decode(utf8_str + i, &consumed));
>   ASSERT(consumed <= 2);
>   i += consumed;
> }

Powered by Google App Engine
This is Rietveld 408576698