|
|
Created:
7 years, 3 months ago by davidben Modified:
5 years, 10 months ago Reviewers:
abarth-chromium CC:
blink-reviews, jamesr, eae+blinkwatch, tommyw+watchlist_chromium.org, dglazkov+blink, jeez Base URL:
https://chromium.googlesource.com/chromium/blink.git@master Visibility:
Public. |
DescriptionExpose latin1 methods in WebString
Needed to plumb header values correctly between Chromium and Blink.
Also temporarily mark a test as flaky in advance of breaking it for a
WebURLLoaderImpl change. When that change lands, the test will be fixed and
switched back.
BUG=276769
Committed: https://src.chromium.org/viewvc/blink?view=rev&revision=157238
Patch Set 1 #
Total comments: 6
Patch Set 2 : #
Total comments: 4
Patch Set 3 : #Patch Set 4 : rebase #
Messages
Total messages: 13 (0 generated)
Plan for coordinating repositories: 1. Land this change. 2. When Blink roll happens, land Chromium change to make WebURLLoaderImpl use latin1 methods, thus breaking layout test. 3. When that lands, land Blink change to fix and re-enable the test and add layout tests to verify this is exposed correctly through XHR. Also modify isValidHTTPHeaderValue to check containsOnlyLatin1 so we throw an exception instead of silently converting non-Latin-1 characters to ?. ----- Results of tests for other browser behavior: - Chrome 30.0.1599.14 beta: - decodes response header value as UTF-8 - decodes statusText as UTF-8 - non-ASCII in request header name not allowed - encodes request header value as UTF-8 - Safari 6.0.5 (8536.30.1) - decodes response header value as Latin-1 - decodes statusText as Latin-1 - non-ASCII in request header name not allowed - encodes request header value as Latin-1 - silently truncates request header value at first non-Latin-1 char - Firefox 23.0.1 - decodes response header value as Latin-1 - decodes statusText as UTF-8 - non-ASCII in request header name not allowed - encodes request header value as UTF-8 - Firefox Nightly 26.0a1 (2013-08-30) - decodes response header value as Latin-1 - decodes statusText as Latin-1 - non-ASCII in request header name not allowed - encodes request header value as Latin-1 - throws TypeError if request header value has non-Latin-1 chars - IE 10.0.9200.166660 - decodes response header value as Latin-1 - decodes statusText as Latin-1 - non-ASCII in request header name ALLOWED - encoded as UTF-8 it seems - encodes request header value as UTF-8 - Opera 12.16 - decodes response header value as Latin-1 - decodes statusText as Latin-1 - non-ASCII in request header name not allowed - encodes request header value as Latin-1 - throws DOMException/SYNTAX_ERR if request header value has non-Latin-1 chars I believe per-spec, firefox-nightly's behavior is most correct. We're the only ones that try to decode response header values as UTF-8. We're not the only ones to encode request header values as UTF-8, but Firefox is changing there, and not being able to round-trip headers correctly seems pretty silly.
https://codereview.chromium.org/23456013/diff/1/Source/core/platform/chromium... File Source/core/platform/chromium/support/WebString.cpp (right): https://codereview.chromium.org/23456013/diff/1/Source/core/platform/chromium... Source/core/platform/chromium/support/WebString.cpp:102: return std::string(latin1.data(), latin1.length()); WTF:: <--- This prefix isn't needed. Do you actually need to be able to convert a WebString to Latin-1? If so, we can make a more efficient implementation. If not, we shouldn't add this function. We shouldn't need to make a CString copy. We should be able to copy directly from the String to the std::string. See <https://docs.google.com/document/d/1kOCUlJdh2WJMJGDf-WoEQhmnjKLaOYRbiHz5TiGJl...> for more information about how String represents its data. https://codereview.chromium.org/23456013/diff/1/Source/core/platform/chromium... Source/core/platform/chromium/support/WebString.cpp:107: return WTF::String(data, length); WTF:: <---- No need for this prefix. https://codereview.chromium.org/23456013/diff/1/Source/core/platform/chromium... Source/core/platform/chromium/support/WebString.cpp:113: } Do you actually need this function? If not, we shouldn't added it.
https://codereview.chromium.org/23456013/diff/1/Source/core/platform/chromium... File Source/core/platform/chromium/support/WebString.cpp (right): https://codereview.chromium.org/23456013/diff/1/Source/core/platform/chromium... Source/core/platform/chromium/support/WebString.cpp:102: return std::string(latin1.data(), latin1.length()); On 2013/08/31 18:04:18, abarth wrote: > WTF:: <--- This prefix isn't needed. Removed. I was just going by the UTF-8 ones which have the prefix. Should I remove those while I'm here? > Do you actually need to be able to convert a WebString to Latin-1? If so, we > can make a more efficient implementation. If not, we shouldn't add this > function. For getting XMLHttpRequest#setRequestHeader right I need to be able to go the other direction too. That one we're not the only browser to encode as UTF-8 right now, unlike going the other direction (see pile of test results in first comment), but a future Firefox will do Latin-1 and Safari already does. Also it seems really weird to interpret response headers as Latin-1 but encode request headers as UTF-8. > We shouldn't need to make a CString copy. We should be able to copy directly > from the String to the std::string. See > <https://docs.google.com/document/d/1kOCUlJdh2WJMJGDf-WoEQhmnjKLaOYRbiHz5TiGJl...> > for more information about how String represents its data. Made a new version. It still does a CString copy if we have a UChar instead of LChar string, but otherwise it doesn't. StringUTF8Adaptor is similar. (Or would you prefer reimplementing String::latin1 altogether? Hopefully that's by far the less common case since characters outside of latin1 can't go in the result anyway.) https://codereview.chromium.org/23456013/diff/1/Source/core/platform/chromium... Source/core/platform/chromium/support/WebString.cpp:107: return WTF::String(data, length); On 2013/08/31 18:04:18, abarth wrote: > WTF:: <---- No need for this prefix. Done. https://codereview.chromium.org/23456013/diff/1/Source/core/platform/chromium... Source/core/platform/chromium/support/WebString.cpp:113: } On 2013/08/31 18:04:18, abarth wrote: > Do you actually need this function? If not, we shouldn't added it. Removed.
On 2013/09/03 15:04:56, David Benjamin wrote: > https://codereview.chromium.org/23456013/diff/1/Source/core/platform/chromium... > File Source/core/platform/chromium/support/WebString.cpp (right): > > https://codereview.chromium.org/23456013/diff/1/Source/core/platform/chromium... > Source/core/platform/chromium/support/WebString.cpp:102: return > std::string(latin1.data(), latin1.length()); > On 2013/08/31 18:04:18, abarth wrote: > > WTF:: <--- This prefix isn't needed. > > Removed. I was just going by the UTF-8 ones which have the prefix. Should I > remove those while I'm here? Sure. > > Do you actually need to be able to convert a WebString to Latin-1? If so, we > > can make a more efficient implementation. If not, we shouldn't add this > > function. > > For getting XMLHttpRequest#setRequestHeader right I need to be able to go the > other direction too. That one we're not the only browser to encode as UTF-8 > right now, unlike going the other direction (see pile of test results in first > comment), but a future Firefox will do Latin-1 and Safari already does. Also it > seems really weird to interpret response headers as Latin-1 but encode request > headers as UTF-8. What happens when the String contains characters that cannot be encoded in Latin-1? > > We shouldn't need to make a CString copy. We should be able to copy directly > > from the String to the std::string. See > > > <https://docs.google.com/document/d/1kOCUlJdh2WJMJGDf-WoEQhmnjKLaOYRbiHz5TiGJl...> > > for more information about how String represents its data. > > Made a new version. It still does a CString copy if we have a UChar instead of > LChar string, but otherwise it doesn't. StringUTF8Adaptor is similar. (Or would > you prefer reimplementing String::latin1 altogether? Hopefully that's by far the > less common case since characters outside of latin1 can't go in the result > anyway.) You can avoid the copy even when String is backed by UChars, but it's probably not worth the complexity. We can optimize it if it shows up on profiles.
LGTM https://codereview.chromium.org/23456013/diff/7001/Source/core/platform/chrom... File Source/core/platform/chromium/support/WebString.cpp (right): https://codereview.chromium.org/23456013/diff/7001/Source/core/platform/chrom... Source/core/platform/chromium/support/WebString.cpp:103: if (string.is8Bit()) You need to handle the isEmpty() case as well. You can't call is8Bit on an empty string. if (string.isEmpty()) return std::string(); should be sufficient. https://codereview.chromium.org/23456013/diff/7001/public/platform/WebString.h File public/platform/WebString.h (right): https://codereview.chromium.org/23456013/diff/7001/public/platform/WebString.... public/platform/WebString.h:104: BLINK_COMMON_EXPORT static WebString fromLatin1(const char* data, size_t length); I wonder if we should use WebLChar here instead of char... That will make the call sites uglier, but maybe that's a good idea because its rare that people want to call this function.
On 2013/09/03 15:43:25, abarth wrote: > What happens when the String contains characters that cannot be encoded in > Latin-1? They currently turn into ? since that's what String::latin1 does, but that's pretty silly. My plan was to change isValidHTTPHeaderValue to require name.containsOnlyLatin1() in the third leg (after WebURLLoaderImpl is changed to use Latin1 rather than UTF-8). That will make setRequestHeader throw an exception on header values containing non-latin1 characters. That matches the spec (per ByteString in WebIDL), Opera 12.16, and Firefox nightly. It doesn't match Safari which instead silently truncates the string at the first non-latin1 character.
Also removed some WTF:: prefixes. https://codereview.chromium.org/23456013/diff/7001/Source/core/platform/chrom... File Source/core/platform/chromium/support/WebString.cpp (right): https://codereview.chromium.org/23456013/diff/7001/Source/core/platform/chrom... Source/core/platform/chromium/support/WebString.cpp:103: if (string.is8Bit()) On 2013/09/03 15:46:40, abarth wrote: > You need to handle the isEmpty() case as well. You can't call is8Bit on an > empty string. > > if (string.isEmpty()) > return std::string(); > > should be sufficient. Done. https://codereview.chromium.org/23456013/diff/7001/public/platform/WebString.h File public/platform/WebString.h (right): https://codereview.chromium.org/23456013/diff/7001/public/platform/WebString.... public/platform/WebString.h:104: BLINK_COMMON_EXPORT static WebString fromLatin1(const char* data, size_t length); On 2013/09/03 15:46:40, abarth wrote: > I wonder if we should use WebLChar here instead of char... That will make the > call sites uglier, but maybe that's a good idea because its rare that people > want to call this function. Done. Though it doesn't actually end up doing much since there's the std::string version. Just the implementation of fromLatin1(const std::string& s) gets ugly.
On 2013/09/03 17:58:25, David Benjamin wrote: > Also removed some WTF:: prefixes. > > https://codereview.chromium.org/23456013/diff/7001/Source/core/platform/chrom... > File Source/core/platform/chromium/support/WebString.cpp (right): > > https://codereview.chromium.org/23456013/diff/7001/Source/core/platform/chrom... > Source/core/platform/chromium/support/WebString.cpp:103: if (string.is8Bit()) > On 2013/09/03 15:46:40, abarth wrote: > > You need to handle the isEmpty() case as well. You can't call is8Bit on an > > empty string. > > > > if (string.isEmpty()) > > return std::string(); > > > > should be sufficient. > > Done. > > https://codereview.chromium.org/23456013/diff/7001/public/platform/WebString.h > File public/platform/WebString.h (right): > > https://codereview.chromium.org/23456013/diff/7001/public/platform/WebString.... > public/platform/WebString.h:104: BLINK_COMMON_EXPORT static WebString > fromLatin1(const char* data, size_t length); > On 2013/09/03 15:46:40, abarth wrote: > > I wonder if we should use WebLChar here instead of char... That will make the > > call sites uglier, but maybe that's a good idea because its rare that people > > want to call this function. > > Done. Though it doesn't actually end up doing much since there's the std::string > version. Just the implementation of fromLatin1(const std::string& s) gets ugly. (Oh, sorry I'm always unsure what to do with an LGTM + comments. :-) Should I submit this now?)
lgtm
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/davidben@chromium.org/23456013/15001
Failed to apply patch for LayoutTests/TestExpectations: While running patch -p1 --forward --force --no-backup-if-mismatch; patching file LayoutTests/TestExpectations Hunk #1 FAILED at 1292. 1 out of 1 hunk FAILED -- saving rejects to file LayoutTests/TestExpectations.rej Patch: LayoutTests/TestExpectations Index: LayoutTests/TestExpectations diff --git a/LayoutTests/TestExpectations b/LayoutTests/TestExpectations index 7e8da0478d3445e755f2fa63e3ff50c15e15930d..5830e35b01a68ed4bb6aa60bc4f521f33a7a9b95 100644 --- a/LayoutTests/TestExpectations +++ b/LayoutTests/TestExpectations @@ -1292,3 +1292,6 @@ Bug(jsbell) fast/encoding/utf-16-odd-byte.html [ Skip ] crbug.com/282620 [ XP Win7 ] fast/mediastream/RTCPeerConnection-stats.html [ Failure Pass ] crbug.com/282634 [ SnowLeopard Debug ] svg/as-object/nested-embedded-svg-size-changes-no-layout-triggers-2.html [ Pass ImageOnlyFailure ] + +# Temporarily mark as flaky in advance of Chromium change. +crbug.com/276769 http/tests/security/contentSecurityPolicy/source-list-parsing-nonascii.html [ Failure Pass ]
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/davidben@chromium.org/23456013/22001
Message was sent while issue was closed.
Change committed as 157238 |