Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(475)

Issue 1169393003: Add new SplitString backend. (Closed)

Created:
5 years, 6 months ago by brettw
Modified:
5 years, 6 months ago
Reviewers:
danakj
CC:
chromium-reviews, tracing+reviews_chromium.org, wfh+watch_chromium.org, erikwright+watch_chromium.org, jshin+watch_chromium.org
Base URL:
https://chromium.googlesource.com/chromium/src.git@master
Target Ref:
refs/pending/heads/master
Project:
chromium
Visibility:
Public.

Description

Add new SplitString backend. Create a unified SplitString backend that covers SplitString, SplitStringDontTrim, and Tokenize. Implement those existing functions in terms of the new ones. Theoretically, there is no behavior change. Later, I want to delete the other variants. Some aspects of the existing SplitString/Tokenize variants are confusing or surprising, and not all variants are supported. The new functions are designed to be obvious in all respects from reading the call site. Previously, there were 7 implementation functions: 4 for SplitString (string+trim, string+donttrim, string16+trim, string16+donttrim) for both string types and both trim modes, plus 3 Tokenize functions (string, string16, StringPiece). Now there are 8 variants but all implemented by the same template. The trim mode and the token coalescing that Tokenize did are now arguments. These mean there are two additional branches for every split performed, but since there are typically only a few splits per string this should be a good trade-off. The new APIs give StringPiece variants so that calling code can split in any mode without allocating strings (only one such variant existed before for Tokenize). There are also optimizations for single character split sets which is the common case. This means there is not additional per-input-character overhead as compared to the old split implementations (where only one input character was allowed), and the Tokenize calls that take an input set have one split character (most of them) should be faster. The new implementation also avoids a string copy when trimming whitespace by using StringPieces for the intermediate steps. To implement this, additional Trim functions were added that operate on StringPieces. This also makes minor improvements to TrimString to avoid allocating a separate string for the trim character set (almost always a constant) for each call. These new variants return vectors rather than have out parameters. Return value optimization should handle most of the cases, and the few remaining ones should be handled by the C++11 move operations for operator=. This should give cleaner call sites which will be especially nice given the additional long parameters being added. A simple hardcoded test of splitting a string shows that the new code is about 2x the speed of the old implementation using the legacy SplitString API. The change in autofill_agent.cc is necessary because I changed the inputs to the split function from a string16 to a StringPiece16. Normally the implicit constructor does this automatically, but here it's using the implicit type conversion operator on WebString which can't do string16->StringPiece implicitly. Committed: https://crrev.com/977caaa1d84e9059719ee0f70b0e1e5863d896b2 Cr-Commit-Position: refs/heads/master@{#334226}

Patch Set 1 #

Patch Set 2 : #

Patch Set 3 : more #

Patch Set 4 : #

Patch Set 5 : #

Patch Set 6 : #

Patch Set 7 : #

Total comments: 14

Patch Set 8 : #

Total comments: 5

Patch Set 9 : #

Unified diffs Side-by-side diffs Delta from patch set Stats (+450 lines, -166 lines) Patch
M base/strings/string_piece.h View 1 chunk +0 lines, -6 lines 0 comments Download
M base/strings/string_split.h View 1 2 3 4 5 6 7 8 3 chunks +103 lines, -27 lines 0 comments Download
M base/strings/string_split.cc View 1 2 3 4 5 6 7 4 chunks +166 lines, -70 lines 0 comments Download
M base/strings/string_split_unittest.cc View 1 2 3 4 5 6 7 2 chunks +82 lines, -1 line 0 comments Download
M base/strings/string_util.h View 1 2 3 4 5 6 7 4 chunks +30 lines, -13 lines 0 comments Download
M base/strings/string_util.cc View 1 2 3 4 5 6 5 chunks +57 lines, -48 lines 0 comments Download
M base/strings/string_util_constants.cc View 1 2 3 1 chunk +10 lines, -0 lines 0 comments Download
M components/autofill/content/renderer/autofill_agent.cc View 1 2 3 4 1 chunk +2 lines, -1 line 0 comments Download

Messages

Total messages: 30 (10 generated)
commit-bot: I haz the power
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1169393003/100001
5 years, 6 months ago (2015-06-10 22:36:32 UTC) #2
commit-bot: I haz the power
Dry run: Try jobs failed on following builders: mac_chromium_rel_ng on tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_ng/builds/75594)
5 years, 6 months ago (2015-06-10 23:17:47 UTC) #4
commit-bot: I haz the power
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1169393003/120001
5 years, 6 months ago (2015-06-11 02:55:41 UTC) #6
brettw
5 years, 6 months ago (2015-06-11 04:47:34 UTC) #8
brettw
I had considered naming the new function SplitStringOnAnyOf to emphasize that it splits on any ...
5 years, 6 months ago (2015-06-11 05:49:11 UTC) #9
commit-bot: I haz the power
Dry run: Try jobs failed on following builders: win_chromium_rel_ng on tryserver.chromium.win (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_rel_ng/builds/68490)
5 years, 6 months ago (2015-06-11 06:07:03 UTC) #11
commit-bot: I haz the power
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1169393003/120001
5 years, 6 months ago (2015-06-11 06:17:14 UTC) #13
commit-bot: I haz the power
Dry run: This issue passed the CQ dry run.
5 years, 6 months ago (2015-06-11 07:04:04 UTC) #15
danakj
There are some claims about performance "shoulds" in the patch description. Have you considered making ...
5 years, 6 months ago (2015-06-11 21:16:54 UTC) #16
brettw
This change is for cleaning up string splitting (I started off wanting to move tokenize ...
5 years, 6 months ago (2015-06-11 21:56:48 UTC) #17
danakj
On Thu, Jun 11, 2015 at 2:56 PM, <brettw@chromium.org> wrote: > This change is for ...
5 years, 6 months ago (2015-06-11 22:17:03 UTC) #18
brettw
I don't think implementing a real perf test for this would be useful. We don't ...
5 years, 6 months ago (2015-06-11 23:46:33 UTC) #19
danakj
That's awesome, thanks for getting numbers! https://codereview.chromium.org/1169393003/diff/120001/base/strings/string_piece.h File base/strings/string_piece.h (left): https://codereview.chromium.org/1169393003/diff/120001/base/strings/string_piece.h#oldcode22 base/strings/string_piece.h:22: // StringPiece16 is ...
5 years, 6 months ago (2015-06-11 23:58:35 UTC) #20
brettw
New snap up! https://codereview.chromium.org/1169393003/diff/120001/base/strings/string_piece.h File base/strings/string_piece.h (left): https://codereview.chromium.org/1169393003/diff/120001/base/strings/string_piece.h#oldcode22 base/strings/string_piece.h:22: // StringPiece16 is similar to StringPiece ...
5 years, 6 months ago (2015-06-12 17:37:08 UTC) #21
commit-bot: I haz the power
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1169393003/140001
5 years, 6 months ago (2015-06-12 17:37:36 UTC) #23
danakj
LGTM https://codereview.chromium.org/1169393003/diff/140001/base/strings/string_split.h File base/strings/string_split.h (right): https://codereview.chromium.org/1169393003/diff/140001/base/strings/string_split.h#newcode26 base/strings/string_split.h:26: // If the input is ",," and the ...
5 years, 6 months ago (2015-06-12 17:58:12 UTC) #24
brettw
https://codereview.chromium.org/1169393003/diff/140001/base/strings/string_split.h File base/strings/string_split.h (right): https://codereview.chromium.org/1169393003/diff/140001/base/strings/string_split.h#newcode66 base/strings/string_split.h:66: // base::KEEP_WHITESPACE, On 2015/06/12 17:58:12, danakj wrote: > Just ...
5 years, 6 months ago (2015-06-12 18:09:45 UTC) #25
commit-bot: I haz the power
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1169393003/160001
5 years, 6 months ago (2015-06-12 18:10:54 UTC) #28
commit-bot: I haz the power
Committed patchset #9 (id:160001)
5 years, 6 months ago (2015-06-12 19:57:59 UTC) #29
commit-bot: I haz the power
5 years, 6 months ago (2015-06-12 19:58:59 UTC) #30
Message was sent while issue was closed.
Patchset 9 (id:??) landed as
https://crrev.com/977caaa1d84e9059719ee0f70b0e1e5863d896b2
Cr-Commit-Position: refs/heads/master@{#334226}

Powered by Google App Engine
This is Rietveld 408576698