Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(479)

Issue 100823007: Stop doing unnecessary UTF-8 to UTF-16 conversions in JSONWriter. (Closed)

Created:
7 years ago by Robert Sesek
Modified:
7 years ago
CC:
chromium-reviews, skanuj+watch_chromium.org, dhollowa+watch_chromium.org, dougw+watch_chromium.org, Ilya Sherman, tim+watch_chromium.org, extensions-reviews_chromium.org, cbentzel+watch_chromium.org, melevin+watch_chromium.org, jar (doing other things), kkania, dominich, groby+spellwatch_chromium.org, chromium-apps-reviews_chromium.org, erikwright+watch_chromium.org, rpetterson, haitaol+watch_chromium.org, rouslan+spellwatch_chromium.org, dsinclair+watch_chromium.org, asvitkine+watch_chromium.org, Jered, donnd+watch_chromium.org, frankf, jfweitz+watch_chromium.org, David Black, samarth+watch_chromium.org, kmadhusu+watch_chromium.org, rsimha+watch_chromium.org, jshin+watch_chromium.org, inferno
Visibility:
Public.

Description

Stop doing unnecessary UTF-8 to UTF-16 conversions in JSONWriter. The JSONReader only accepts UTF-8 input strings and converts \uXXXX sequences back into UTF-8. However, the JSONWriter converts all non-ASCII characters to UTF-16 escape sequences. This round-tripping is sub-optimal, as noted in a TODO from r54359. One reason for this may be that JsonDoubleQuote(), used by JSONWriter, does not handle UTF-8 bytes correctly, interpreting them as code points and writing them out as \u00XX sequences. If this were read back through a RFC-compliant JSON parser, the result would be an invalid encoding error. JsonDoubleQuote() does handle UTF-16 correctly, though. This rewrites the base/json/string_escape.h API and fixes the above UTF-8 issue by dividing callers up into three groups: 1. Those that pass valid UTF-8 to be escaped. Prior to this change, very few callers used this variant. Those that did were likely using ASCII, otherwise the output would be mangled due to the above issue. Now, valid UTF-8 will be passed through to the output unescaped. Invalid UTF-8 sequences are replaced with U+FFFD. 2. Those that pass valid UTF-16 to be escaped. This function now validates that the input is valid UTF-16, and then converts it to unescaped UTF-8 sequences for the output. 3. Those that pass arbitrary byte arrays as std::string and expect a non-RFC- compliant encoding of the binary data using \uXXXX escapes. This behavior is now in the EscapeBytesAsInvalidJSONString() function. It is only used by callers who want a "debug string" but do not expect to actually parse the output as valid JSON, since it is not. Additionally, this removes the JSONWriter::OPTIONS_DO_NOT_ESCAPE flag, since the writer can now handle UTF-8 appropriately. BUG=15466 Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=239800 Reverted: https://src.chromium.org/viewvc/chrome?view=rev&revision=240082 R=asanka@chromium.org, bauerb@chromium.org, mark@chromium.org, thakis@chromium.org, zea@chromium.org Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=240190

Patch Set 1 #

Patch Set 2 : '' #

Total comments: 42

Patch Set 3 : Address review comments #

Total comments: 6

Patch Set 4 : Nits #

Total comments: 5

Patch Set 5 : Self-nit #

Patch Set 6 : NetUtilTest.GetDirectoryListingEntry #

Patch Set 7 : Fix ChromeOS page encodings #

Unified diffs Side-by-side diffs Delta from patch set Stats (+346 lines, -233 lines) Patch
M base/debug/trace_event_impl.cc View 1 2 3 4 5 6 1 chunk +1 line, -1 line 0 comments Download
M base/json/json_value_serializer_unittest.cc View 1 2 2 chunks +18 lines, -17 lines 0 comments Download
M base/json/json_writer.h View 1 2 3 3 chunks +4 lines, -17 lines 0 comments Download
M base/json/json_writer.cc View 6 chunks +9 lines, -23 lines 0 comments Download
M base/json/string_escape.h View 1 2 3 2 chunks +42 lines, -20 lines 0 comments Download
M base/json/string_escape.cc View 1 2 3 4 2 chunks +98 lines, -49 lines 0 comments Download
M base/json/string_escape_unittest.cc View 1 2 1 chunk +131 lines, -53 lines 0 comments Download
M base/metrics/statistics_recorder.cc View 1 chunk +1 line, -1 line 0 comments Download
M chrome/browser/chromeos/policy/configuration_policy_handler_chromeos.cc View 1 2 3 4 5 6 1 chunk +1 line, -2 lines 0 comments Download
M chrome/browser/extensions/activity_log/activity_log_unittest.cc View 1 2 3 chunks +3 lines, -3 lines 0 comments Download
M chrome/browser/extensions/api/messaging/native_messaging_host_manifest_unittest.cc View 1 2 1 chunk +1 line, -3 lines 0 comments Download
M chrome/browser/resources/chromeos/login/login.html View 1 2 3 4 5 6 1 chunk +2 lines, -1 line 0 comments Download
M chrome/browser/resources/chromeos/login/oobe.html View 1 2 3 4 5 6 1 chunk +2 lines, -1 line 0 comments Download
M chrome/browser/search/iframe_source.cc View 1 2 2 chunks +0 lines, -3 lines 0 comments Download
M chrome/browser/spellchecker/spelling_service_client.cc View 1 2 3 4 5 6 1 chunk +3 lines, -4 lines 0 comments Download
M chrome/browser/ui/webui/about_ui.cc View 1 2 3 4 5 6 1 chunk +1 line, -2 lines 0 comments Download
M chrome/browser/ui/webui/policy_ui.cc View 1 2 3 4 5 6 1 chunk +1 line, -2 lines 0 comments Download
M chrome/renderer/searchbox/searchbox_extension.cc View 1 2 1 chunk +1 line, -2 lines 0 comments Download
M chrome/test/chromedriver/capabilities.cc View 1 2 3 4 5 6 1 chunk +1 line, -1 line 0 comments Download
M chrome/test/chromedriver/chrome/adb_impl.cc View 1 chunk +1 line, -1 line 0 comments Download
M components/plugins/renderer/plugin_placeholder.cc View 1 2 3 4 5 6 1 chunk +1 line, -1 line 0 comments Download
M jingle/notifier/listener/notification_defines.cc View 1 chunk +6 lines, -4 lines 0 comments Download
M net/base/net_util.cc View 4 chunks +6 lines, -7 lines 0 comments Download
M net/base/net_util_unittest.cc View 1 2 3 4 5 2 chunks +3 lines, -3 lines 0 comments Download
M sync/internal_api/public/base/ordinal.h View 1 chunk +2 lines, -2 lines 0 comments Download
M sync/internal_api/public/base/progress_marker_map.cc View 1 2 1 chunk +2 lines, -3 lines 0 comments Download
M sync/syncable/entry.cc View 1 chunk +2 lines, -4 lines 0 comments Download
M tools/gn/trace.cc View 2 chunks +3 lines, -3 lines 0 comments Download

Messages

Total messages: 26 (0 generated)
Robert Sesek
7 years ago (2013-12-05 23:53:48 UTC) #1
Mark Mentovai
https://codereview.chromium.org/100823007/diff/20001/base/debug/trace_event_impl.cc File base/debug/trace_event_impl.cc (right): https://codereview.chromium.org/100823007/diff/20001/base/debug/trace_event_impl.cc#newcode1 base/debug/trace_event_impl.cc:1: // Copyright (c) 2012 The Chromium Authors. All rights ...
7 years ago (2013-12-06 15:35:11 UTC) #2
Avi (use Gerrit)
https://codereview.chromium.org/100823007/diff/20001/base/json/string_escape.h File base/json/string_escape.h (right): https://codereview.chromium.org/100823007/diff/20001/base/json/string_escape.h#newcode19 base/json/string_escape.h:19: // replaced with the U+FFFD replacement character. On return, ...
7 years ago (2013-12-06 15:45:24 UTC) #3
Mark Mentovai
U+FFFD exists to provide a glyph to use to represent an invalid character. It is ...
7 years ago (2013-12-06 15:58:23 UTC) #4
jungshik at Google
https://codereview.chromium.org/100823007/diff/20001/base/json/json_value_serializer_unittest.cc File base/json/json_value_serializer_unittest.cc (right): https://codereview.chromium.org/100823007/diff/20001/base/json/json_value_serializer_unittest.cc#newcode274 base/json/json_value_serializer_unittest.cc:274: string16 test(WideToUTF16(L"\x7F51\x9875")); While you're at it, you may want ...
7 years ago (2013-12-07 13:16:58 UTC) #5
Mark Mentovai
https://codereview.chromium.org/100823007/diff/20001/base/json/string_escape.h File base/json/string_escape.h (right): https://codereview.chromium.org/100823007/diff/20001/base/json/string_escape.h#newcode19 base/json/string_escape.h:19: // replaced with the U+FFFD replacement character. On return, ...
7 years ago (2013-12-07 22:22:48 UTC) #6
Robert Sesek
https://codereview.chromium.org/100823007/diff/20001/base/debug/trace_event_impl.cc File base/debug/trace_event_impl.cc (right): https://codereview.chromium.org/100823007/diff/20001/base/debug/trace_event_impl.cc#newcode1 base/debug/trace_event_impl.cc:1: // Copyright (c) 2012 The Chromium Authors. All rights ...
7 years ago (2013-12-09 19:52:08 UTC) #7
Mark Mentovai
OK, LGTM. Nice work. https://codereview.chromium.org/100823007/diff/60001/base/json/json_writer.h File base/json/json_writer.h (right): https://codereview.chromium.org/100823007/diff/60001/base/json/json_writer.h#newcode22 base/json/json_writer.h:22: OPTIONS_OMIT_BINARY_VALUES = 1 << 1, ...
7 years ago (2013-12-09 20:21:58 UTC) #8
Robert Sesek
Thanks for the review. Adding OWNERS: thakis: chrome/ asanka: net/ zea: jingle/ and sync/ -- ...
7 years ago (2013-12-09 21:14:37 UTC) #9
Nico
https://codereview.chromium.org/100823007/diff/80001/chrome/browser/extensions/activity_log/activity_log_unittest.cc File chrome/browser/extensions/activity_log/activity_log_unittest.cc (right): https://codereview.chromium.org/100823007/diff/80001/chrome/browser/extensions/activity_log/activity_log_unittest.cc#newcode120 chrome/browser/extensions/activity_log/activity_log_unittest.cc:120: ASSERT_EQ("[\"POST\",\"\\u003Carg_url>\"]", Huh, why does < stay escaped but > ...
7 years ago (2013-12-09 21:20:49 UTC) #10
Robert Sesek
https://codereview.chromium.org/100823007/diff/80001/chrome/browser/extensions/activity_log/activity_log_unittest.cc File chrome/browser/extensions/activity_log/activity_log_unittest.cc (right): https://codereview.chromium.org/100823007/diff/80001/chrome/browser/extensions/activity_log/activity_log_unittest.cc#newcode120 chrome/browser/extensions/activity_log/activity_log_unittest.cc:120: ASSERT_EQ("[\"POST\",\"\\u003Carg_url>\"]", On 2013/12/09 21:20:50, Nico wrote: > Huh, why ...
7 years ago (2013-12-09 21:21:43 UTC) #11
Nico
On Mon, Dec 9, 2013 at 1:21 PM, <rsesek@chromium.org> wrote: > > https://codereview.chromium.org/100823007/diff/80001/ > chrome/browser/extensions/activity_log/activity_log_unittest.cc ...
7 years ago (2013-12-09 21:23:59 UTC) #12
Robert Sesek
On 2013/12/09 21:23:59, Nico wrote: > On Mon, Dec 9, 2013 at 1:21 PM, <mailto:rsesek@chromium.org> ...
7 years ago (2013-12-09 21:29:57 UTC) #13
Nico
chrome lgtm +inferno fyi who added <> escaping, to shout if not escaping '>' in ...
7 years ago (2013-12-09 21:42:22 UTC) #14
asanka
/net/ lgtm
7 years ago (2013-12-09 21:51:47 UTC) #15
Nicolas Zea
jingle/sync lgtm
7 years ago (2013-12-09 22:47:47 UTC) #16
Robert Sesek
+bauerb for components/plugins/OWNERS https://codereview.chromium.org/100823007/diff/80001/base/json/string_escape.h File base/json/string_escape.h (right): https://codereview.chromium.org/100823007/diff/80001/base/json/string_escape.h#newcode29 base/json/string_escape.h:29: bool put_in_quotes, On 2013/12/09 21:42:23, Nico ...
7 years ago (2013-12-09 22:50:56 UTC) #17
Nico
https://codereview.chromium.org/100823007/diff/80001/base/json/string_escape.h File base/json/string_escape.h (right): https://codereview.chromium.org/100823007/diff/80001/base/json/string_escape.h#newcode29 base/json/string_escape.h:29: bool put_in_quotes, On 2013/12/09 22:50:57, rsesek wrote: > On ...
7 years ago (2013-12-10 00:13:32 UTC) #18
Bernhard Bauer
components/plugins LGTM
7 years ago (2013-12-10 09:09:01 UTC) #19
commit-bot: I haz the power
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/rsesek@chromium.org/100823007/100001
7 years ago (2013-12-10 12:31:35 UTC) #20
commit-bot: I haz the power
Retried try job too often on win_rel for step(s) net_unittests http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=win_rel&number=233756
7 years ago (2013-12-10 14:22:50 UTC) #21
commit-bot: I haz the power
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/rsesek@chromium.org/100823007/120001
7 years ago (2013-12-10 15:10:14 UTC) #22
commit-bot: I haz the power
Change committed as 239800
7 years ago (2013-12-10 17:40:38 UTC) #23
Denis Kuznetsov (DE-MUC)
A revert of this CL has been created in https://codereview.chromium.org/106793004/ by antrim@chromium.org. The reason for ...
7 years ago (2013-12-11 11:52:00 UTC) #24
Mark Mentovai
How is it broken? What should have been displayed? What was displayed instead? You know, ...
7 years ago (2013-12-11 14:28:53 UTC) #25
Robert Sesek
7 years ago (2013-12-11 22:10:55 UTC) #26
Message was sent while issue was closed.
Committed patchset #7 manually as r240190 (presubmit successful).

Powered by Google App Engine
This is Rietveld 408576698