Description
Add new SplitString backend.
Create a unified SplitString backend that covers SplitString, SplitStringDontTrim, and Tokenize. Implement those existing functions in terms of the new ones. Theoretically, there is no behavior change.
Later, I want to delete the other variants. Some aspects of the existing SplitString/Tokenize variants are confusing or surprising, and not all variants are supported. The new functions are designed to be obvious in all respects from reading the call site.
Previously, there were 7 implementation functions: 4 for SplitString, covering both string types and both trim modes (string+trim, string+donttrim, string16+trim, string16+donttrim), plus 3 Tokenize functions (string, string16, StringPiece).
Now there are 8 variants, all implemented by the same template. The trim mode and the token coalescing that Tokenize did are now arguments. This means there are two additional branches for every split performed, but since there are typically only a few splits per string, this should be a good trade-off.
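As a rough illustration of the "one template, two arguments" shape described above, here is a minimal sketch built on std::string_view rather than StringPiece. The names SplitSketch, Whitespace, Result and TrimAscii are hypothetical and exist only for this example; it handles char input only and is not the code in this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical option names, chosen for this sketch only.
enum class Whitespace { kKeep, kTrim };
enum class Result { kWantAll, kWantNonEmpty };

// Strips leading and trailing ASCII whitespace without copying.
inline std::string_view TrimAscii(std::string_view s) {
  constexpr std::string_view kSpace = " \t\n\r\f\v";
  const std::size_t first = s.find_first_not_of(kSpace);
  if (first == std::string_view::npos) return {};
  const std::size_t last = s.find_last_not_of(kSpace);
  return s.substr(first, last - first + 1);
}

// One template serves every output string type. Both policies are ordinary
// run-time arguments, which costs two branches per piece produced.
template <typename OutputString>
std::vector<OutputString> SplitSketch(std::string_view input,
                                      std::string_view separators,
                                      Whitespace whitespace,
                                      Result result) {
  assert(!separators.empty());
  std::vector<OutputString> pieces;
  std::size_t start = 0;
  while (true) {
    const std::size_t end = input.find_first_of(separators, start);
    std::string_view piece = input.substr(
        start, end == std::string_view::npos ? end : end - start);
    if (whitespace == Whitespace::kTrim) piece = TrimAscii(piece);  // branch 1
    if (result == Result::kWantAll || !piece.empty())               // branch 2
      pieces.emplace_back(piece);
    if (end == std::string_view::npos) break;
    start = end + 1;
  }
  return pieces;
}
```

With this shape, a call such as SplitSketch<std::string>(line, ",", Whitespace::kTrim, Result::kWantNonEmpty) states both policies at the call site, which is the readability goal the description mentions.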
The new APIs provide StringPiece variants so that calling code can split in any mode without allocating strings (previously only one such variant existed, for Tokenize). There are also optimizations for single-character split sets, which is the common case. This means there is no additional per-input-character overhead compared to the old split implementations (where only one split character was allowed), and the Tokenize calls whose split set contains a single character (most of them) should be faster.
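One way a single-character fast path could look is sketched below. FindNextSeparator is a hypothetical helper, and the check is made at run time here purely to keep the example short; the patch may select the fast path differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Returns the index of the next separator at or after `pos`, or npos if none
// remains. Hypothetical helper for illustration only.
inline std::size_t FindNextSeparator(std::string_view input,
                                     std::string_view separators,
                                     std::size_t pos) {
  assert(!separators.empty());
  if (separators.size() == 1) {
    // Common case: a plain single-character search, so each input character
    // is compared against one value only.
    return input.find(separators[0], pos);
  }
  // General case: each input character is checked against the whole set.
  return input.find_first_of(separators, pos);
}
```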
The new implementation also avoids a string copy when trimming whitespace by using StringPieces for the intermediate steps. To implement this, additional Trim functions were added that operate on StringPieces. This also makes minor improvements to TrimString to avoid allocating a separate string for the trim character set (almost always a constant) for each call.
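The copy-free trimming idea can be sketched with std::string_view standing in for StringPiece. TrimPiece and the TrimEnds flags are names invented for this example, not the functions added by the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical flags selecting which ends to trim.
enum TrimEnds { kTrimLeading = 1, kTrimTrailing = 2, kTrimBoth = 3 };

// Returns a view into `input` with characters from `trim_chars` removed from
// the selected ends. Nothing is allocated; the result aliases `input`, so it
// is only valid while the underlying buffer is alive.
inline std::string_view TrimPiece(std::string_view input,
                                  std::string_view trim_chars,
                                  TrimEnds ends) {
  std::size_t begin = 0;
  std::size_t end = input.size();
  if (ends & kTrimLeading) {
    begin = input.find_first_not_of(trim_chars);
    if (begin == std::string_view::npos) return input.substr(0, 0);
  }
  if (ends & kTrimTrailing) {
    const std::size_t last = input.find_last_not_of(trim_chars);
    end = (last == std::string_view::npos) ? begin : last + 1;
  }
  assert(end >= begin);
  return input.substr(begin, end - begin);
}
```

Because the trim set is itself passed as a view, a constant such as " \t" can be used on every call without constructing a temporary string for it.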
These new variants return vectors rather than using out parameters. Return value optimization should handle most of the cases, and the few remaining ones should be handled by the C++11 move operations for operator=. This should give cleaner call sites, which will be especially nice given the additional long parameters being added.
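To make the call-site difference concrete, the sketch below contrasts an out-parameter function with a return-by-value wrapper. SplitOnCommaOut and SplitOnComma are hypothetical names used only here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Out-parameter style: the caller must declare the vector first and pass a
// pointer to it.
inline void SplitOnCommaOut(std::string_view input,
                            std::vector<std::string>* out) {
  assert(out != nullptr);
  out->clear();
  std::size_t start = 0;
  while (true) {
    const std::size_t end = input.find(',', start);
    out->emplace_back(input.substr(
        start, end == std::string_view::npos ? end : end - start));
    if (end == std::string_view::npos) break;
    start = end + 1;
  }
}

// Return-by-value style: the local vector is returned directly. The copy is
// either elided or the vector is moved, never deep-copied.
inline std::vector<std::string> SplitOnComma(std::string_view input) {
  std::vector<std::string> result;
  SplitOnCommaOut(input, &result);
  return result;
}
```

A caller can then write `for (const auto& part : SplitOnComma(text))` directly, and assigning the result to an existing vector uses move assignment rather than an element-by-element copy.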
A simple hardcoded test of splitting a string shows that the new code is about 2x the speed of the old implementation using the legacy SplitString API.
The change in autofill_agent.cc is necessary because I changed the inputs to the split function from a string16 to a StringPiece16. Normally the implicit constructor does this automatically, but here it's using the implicit type conversion operator on WebString which can't do string16->StringPiece implicitly.
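The conversion problem can be reproduced with standard types: an implicit conversion sequence may contain at most one user-defined conversion. In the sketch below, WebStringLike is a hypothetical stand-in for WebString, and std::u16string / std::u16string_view stand in for string16 / StringPiece16. The actual change made in autofill_agent.cc may differ from the workaround shown.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <type_traits>

// Hypothetical stand-in for a type that converts implicitly to an owning
// UTF-16 string.
struct WebStringLike {
  std::u16string value;
  operator std::u16string() const { return value; }
};

inline std::size_t LengthOf(std::u16string_view s) { return s.size(); }

// A single user-defined conversion is allowed implicitly.
static_assert(std::is_convertible_v<std::u16string, std::u16string_view>);
// Two in a row (WebStringLike -> u16string -> u16string_view) are not.
static_assert(!std::is_convertible_v<WebStringLike, std::u16string_view>);

inline std::size_t LengthOfWebString(const WebStringLike& w) {
  // LengthOf(w) would not compile, so each conversion step is named.
  const std::u16string owned = w;          // first conversion
  const std::u16string_view view = owned;  // second conversion
  assert(view.data() == owned.data());     // the view aliases `owned`
  return LengthOf(view);
}
```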
Committed: https://crrev.com/977caaa1d84e9059719ee0f70b0e1e5863d896b2
Cr-Commit-Position: refs/heads/master@{#334226}
Patch Set 1
Patch Set 2
Patch Set 3 : more
Patch Set 4
Patch Set 5
Patch Set 6
Patch Set 7
Total comments: 14
Patch Set 8
Total comments: 5
Patch Set 9
Messages
Total messages: 30 (10 generated)