|
|
Chromium Code Reviews
DescriptionOptimize algorithm for String.prototype.repeat(), fix value caching in others
1. Optimize algorithm for String.prototype.repeat() to O(log N).
2. Improve variable caching usage for the following String instance methods:
- String.prototype.startsWith()
- String.prototype.endsWith()
- String.prototype.contains()
- String.prototype.substr()
The second part made minor improvements across the board, although only
usually by a couple milliseconds per million reps with normal cases.
I benchmarked these improvements (and checked for improvement) with roughly
this general piece of code, running under both the main version and my edits.
var MAX_ITERATIONS = 1000000
var string = 'something';
var start = +(new Date());
for (var i = 0; i < MAX_ITERATIONS; i++) {
string.someMethod(args);
}
var end = new Date().getTime();
print('Time taken: ' + (end - start) + ' ms');
See V8 bug #3450 for details (https://code.google.com/p/v8/issues/detail?id=3450).
R=mstarzinger@chromium.org
Patch Set 1 #
Total comments: 5
Patch Set 2 : Clarify with comments #
Total comments: 3
Patch Set 3 : Remove some small edits #Patch Set 4 : Make return string less cons-y #Patch Set 5 : Replace bitwise with arithmetic operators #
Total comments: 3
Messages
Total messages: 19 (0 generated)
https://codereview.chromium.org/401783003/diff/1/AUTHORS File AUTHORS (right): https://codereview.chromium.org/401783003/diff/1/AUTHORS#newcode37 AUTHORS:37: Isiah Meadows <impinball@gmail.com> Did you already sign the CLA. I wasn't able to find you on the list. Instructions on how to do so can be found here: https://code.google.com/p/v8/wiki/Contributing https://codereview.chromium.org/401783003/diff/1/src/harmony-string.js File src/harmony-string.js (right): https://codereview.chromium.org/401783003/diff/1/src/harmony-string.js#newcode24 src/harmony-string.js:24: if (n < 1) return ''; This special case is not needed if comment below is addressed. https://codereview.chromium.org/401783003/diff/1/src/harmony-string.js#newcode26 src/harmony-string.js:26: // Optimized O(log N) algorithm, without the added overhead of an extra array nit: Let's drop (at least) the second half of the comment. Not sure it's worth referring to an array that no longer exists. https://codereview.chromium.org/401783003/diff/1/src/harmony-string.js#newcode29 src/harmony-string.js:29: while (n > 1) { If we change the predicate to be "n > 0" most special cases resolve naturally. https://codereview.chromium.org/401783003/diff/1/src/harmony-string.js#newcode34 src/harmony-string.js:34: return res + s; This addition is not needed if comment above is addressed.
On 2014/07/28 11:13:52, Michael Starzinger wrote: > https://codereview.chromium.org/401783003/diff/1/AUTHORS > File AUTHORS (right): > > https://codereview.chromium.org/401783003/diff/1/AUTHORS#newcode37 > AUTHORS:37: Isiah Meadows <mailto:impinball@gmail.com> > Did you already sign the CLA. I wasn't able to find you on the list. > Instructions on how to do so can be found here: > > https://code.google.com/p/v8/wiki/Contributing > > https://codereview.chromium.org/401783003/diff/1/src/harmony-string.js > File src/harmony-string.js (right): > > https://codereview.chromium.org/401783003/diff/1/src/harmony-string.js#newcode24 > src/harmony-string.js:24: if (n < 1) return ''; > This special case is not needed if comment below is addressed. > > https://codereview.chromium.org/401783003/diff/1/src/harmony-string.js#newcode26 > src/harmony-string.js:26: // Optimized O(log N) algorithm, without the added > overhead of an extra array > nit: Let's drop (at least) the second half of the comment. Not sure it's worth > referring to an array that no longer exists. > > https://codereview.chromium.org/401783003/diff/1/src/harmony-string.js#newcode29 > src/harmony-string.js:29: while (n > 1) { > If we change the predicate to be "n > 0" most special cases resolve naturally. > > https://codereview.chromium.org/401783003/diff/1/src/harmony-string.js#newcode34 > src/harmony-string.js:34: return res + s; > This addition is not needed if comment above is addressed. First, I did sign the CLA. Don't know why it hasn't shown up yet. Second, I see what you're talking about. I'll upload an edit.
On 2014/07/30 04:02:42, impinball wrote: > On 2014/07/28 11:13:52, Michael Starzinger wrote: > > https://codereview.chromium.org/401783003/diff/1/AUTHORS > > File AUTHORS (right): > > > > https://codereview.chromium.org/401783003/diff/1/AUTHORS#newcode37 > > AUTHORS:37: Isiah Meadows <mailto:impinball@gmail.com> > > Did you already sign the CLA. I wasn't able to find you on the list. > > Instructions on how to do so can be found here: > > > > https://code.google.com/p/v8/wiki/Contributing > > > > https://codereview.chromium.org/401783003/diff/1/src/harmony-string.js > > File src/harmony-string.js (right): > > > > > https://codereview.chromium.org/401783003/diff/1/src/harmony-string.js#newcode24 > > src/harmony-string.js:24: if (n < 1) return ''; > > This special case is not needed if comment below is addressed. > > > > > https://codereview.chromium.org/401783003/diff/1/src/harmony-string.js#newcode26 > > src/harmony-string.js:26: // Optimized O(log N) algorithm, without the added > > overhead of an extra array > > nit: Let's drop (at least) the second half of the comment. Not sure it's worth > > referring to an array that no longer exists. > > > > > https://codereview.chromium.org/401783003/diff/1/src/harmony-string.js#newcode29 > > src/harmony-string.js:29: while (n > 1) { > > If we change the predicate to be "n > 0" most special cases resolve naturally. > > > > > https://codereview.chromium.org/401783003/diff/1/src/harmony-string.js#newcode34 > > src/harmony-string.js:34: return res + s; > > This addition is not needed if comment above is addressed. > > First, I did sign the CLA. Don't know why it hasn't shown up yet. > > Second, I see what you're talking about. I'll upload an edit. I tried signing the CLA a second time. Hopefully this will work...
On 2014/07/30 04:08:39, impinball wrote: > On 2014/07/30 04:02:42, impinball wrote: > > On 2014/07/28 11:13:52, Michael Starzinger wrote: > > > https://codereview.chromium.org/401783003/diff/1/AUTHORS > > > File AUTHORS (right): > > > > > > https://codereview.chromium.org/401783003/diff/1/AUTHORS#newcode37 > > > AUTHORS:37: Isiah Meadows <mailto:impinball@gmail.com> > > > Did you already sign the CLA. I wasn't able to find you on the list. > > > Instructions on how to do so can be found here: > > > > > > https://code.google.com/p/v8/wiki/Contributing > > > > > > https://codereview.chromium.org/401783003/diff/1/src/harmony-string.js > > > File src/harmony-string.js (right): > > > > > > > > > https://codereview.chromium.org/401783003/diff/1/src/harmony-string.js#newcode24 > > > src/harmony-string.js:24: if (n < 1) return ''; > > > This special case is not needed if comment below is addressed. > > > > > > > > > https://codereview.chromium.org/401783003/diff/1/src/harmony-string.js#newcode26 > > > src/harmony-string.js:26: // Optimized O(log N) algorithm, without the added > > > overhead of an extra array > > > nit: Let's drop (at least) the second half of the comment. Not sure it's > worth > > > referring to an array that no longer exists. > > > > > > > > > https://codereview.chromium.org/401783003/diff/1/src/harmony-string.js#newcode29 > > > src/harmony-string.js:29: while (n > 1) { > > > If we change the predicate to be "n > 0" most special cases resolve > naturally. > > > > > > > > > https://codereview.chromium.org/401783003/diff/1/src/harmony-string.js#newcode34 > > > src/harmony-string.js:34: return res + s; > > > This addition is not needed if comment above is addressed. > > > > First, I did sign the CLA. Don't know why it hasn't shown up yet. > > > > Second, I see what you're talking about. I'll upload an edit. > > I tried signing the CLA a second time. Hopefully this will work... Sorry for so many messages, but the last unroll actually is better for performance. I'm sending a purely comment-oriented patch to explain it somewhat in the code. That special case is necessary to unroll the final iteration, making it faster. The speed gain is best summarized by comparing the two algorithms below: As your suggestion: Assume `s` as the string accumulator, `ret` as the return string, and `n` as the counter. 1. Check if `n` is greater than 0. (always true) 2. If step 2 returns false, break the loop. (never happens) 3. Let `m` be `n` bitwise-and 1. (this is guaranteed to be 1) 4. If `m` is non-zero, append `s` to `ret`. (guaranteed) 5. Shift `n` right by 1 (becomes 0) 6. Append `s` to itself, doubling itself. 7. Check if `n` is greater than 0. (always false) 8. If step 5 returns false, break the loop. (guaranteed) 9. Return `ret`. As the original upload: Assume `s` as the string accumulator, `ret` as the return string, and `n` as the counter. 1. Check if `n` is greater than 1. (always false) 2. If step 2 returns false, break the loop. (guaranteed) 3. Append `s` to `ret`. 4. Return `ret` I know this is a relatively small speed change for small n (similar in time to the other changes), but it helps even more with larger n, removing an unnecessary concatenation doubling length.
> https://codereview.chromium.org/401783003/diff/1/src/harmony-string.js#newcode34 > > > > src/harmony-string.js:34: return res + s; > > > > This addition is not needed if comment above is addressed. > > > > > > First, I did sign the CLA. Don't know why it hasn't shown up yet. > > > > > > Second, I see what you're talking about. I'll upload an edit. > > > > I tried signing the CLA a second time. Hopefully this will work... > > Sorry for so many messages, but the last unroll actually is better for > performance. I'm sending a purely comment-oriented patch to explain it somewhat > in the code. > > That special case is necessary to unroll the final iteration, making it faster. > The speed gain is best summarized by comparing the two algorithms below: > > As your suggestion: > > Assume `s` as the string accumulator, `ret` as the return string, and `n` as the > counter. > 1. Check if `n` is greater than 0. (always true) > 2. If step 2 returns false, break the loop. (never happens) > 3. Let `m` be `n` bitwise-and 1. (this is guaranteed to be 1) > 4. If `m` is non-zero, append `s` to `ret`. (guaranteed) > 5. Shift `n` right by 1 (becomes 0) > 6. Append `s` to itself, doubling itself. > 7. Check if `n` is greater than 0. (always false) > 8. If step 5 returns false, break the loop. (guaranteed) > 9. Return `ret`. > > As the original upload: > > Assume `s` as the string accumulator, `ret` as the return string, and `n` as the > counter. > 1. Check if `n` is greater than 1. (always false) > 2. If step 2 returns false, break the loop. (guaranteed) > 3. Append `s` to `ret`. > 4. Return `ret` > > I know this is a relatively small speed change for small n (similar in time to > the other changes), but it helps even more with larger n, removing an > unnecessary concatenation doubling length. Do you have numbers that support this argumentation? If the difference is not measurable I vote for the clean and readable version. This kind of loop-unrolling looks to me like a micro-optimization that ideally should be done by the compiler instead of manually at the JavaScript level.
https://codereview.chromium.org/401783003/diff/20001/src/harmony-string.js File src/harmony-string.js (right): https://codereview.chromium.org/401783003/diff/20001/src/harmony-string.js#ne... src/harmony-string.js:26: if (n < 1) return ''; nit: use "" https://codereview.chromium.org/401783003/diff/20001/src/harmony-string.js#ne... src/harmony-string.js:30: while (n > 1) { +1 for a simpler version, since string concatenations is far more heavy than a comparison. e.g. var res = ""; while (n) { if (n & 1) { res += s; --n; } else { s += s; n >>= 1; } } return res; https://codereview.chromium.org/401783003/diff/20001/src/harmony-string.js#ne... src/harmony-string.js:62: if (ss.length + start > s_len) { this should be a separate patch, to ease possible rollbacks
On 2014/07/30 08:52:22, Michael Starzinger wrote:
> [removed for brevity]
> Do you have numbers that support this argumentation? If the difference is not
> measurable I vote for the clean and readable version. This kind of
> loop-unrolling looks to me like a micro-optimization that ideally should be
done
> by the compiler instead of manually at the JavaScript level.
@Michael, Yes. I ran a series of tests to verify this. The below data is the
result of
averaging three runs of the test suite. Each time is from 1,000,000 method
calls.
There are two outliers in the final test reflected in parenthetical comments,
specifically:
1. "foo" repeated 10 times resulted in being 30 ms slower unrolled in the final
run while it was 70 and 71 ms faster in the other two.
2. The really long string of gibberish repeated one million times was 208 ms
faster in the final run, while being only 26 ms faster in the other two.
Here's a quick guide to why the strings were chosen (I devised them in my
personal
unit tests before sending it off to CR):
1. "foo" - normal short word
2. "x" - a single character
3. (Math.random()+"").slice(2) - a string of arbitrary length, usually the
length
of one longer word or two shorter words.
4. "skjdvbsa fyqweov09eyvqowb ejrhf qwoi4ue" - a longish string of gibberish
Here's the data:
Total means:
5 times: 45 1/4 ms
10 -> 33 (38 8/11 excl. outlier)
100 -> 20 1/12
10000 -> 27 3/4
10001 -> 20 1/12
1000000 -> 46 7/12 (31 10/11 excl. outlier)
Total Avg: 32 1/8 (29 47/72)
String 1: "foo"
5 times -> 27 1/3 ms
10 -> 37 (70 1/2 excl. outlier)
100 -> 25 1/3
10000 -> 34 2/3
10001 -> 27 1/3
1000000 -> 50 1/3
Avg: 33 2/3 (37 7/17 excl. outlier)
String 2: "x"
5 times -> 91 1/3 ms
10 -> 32 1/3
100 -> 15 1/3
10000 -> 11 1/3
10001 -> 6 2/3
1000000 -> 28 1/3
Avg: 30 5/6
String 3: (Math.random()+"").slice(2)
5 times -> 30 2/3 ms
10 -> 29 2/3
100 -> 22
10000 -> 34 1/3
10001 -> 23 2/3
100000 -> 21
Avg: 26 8/9
String 4: "skjdvbsa fyqweov09eyvqowb ejrhf qwoi4ue"
5 times -> 31 2/3 ms
10 -> 33
100 -> 17 2/3
10000 -> 30 2/3
10001 -> 22 2/3
100000 -> 86 2/3 (26 excl. outlier)
Avg: 37 1/18 (27 excl. outlier)
Should this be sufficient?
On 2014/07/30 09:04:01, aandrey wrote: > https://codereview.chromium.org/401783003/diff/20001/src/harmony-string.js > File src/harmony-string.js (right): > > https://codereview.chromium.org/401783003/diff/20001/src/harmony-string.js#ne... > src/harmony-string.js:26: if (n < 1) return ''; > nit: use "" > > https://codereview.chromium.org/401783003/diff/20001/src/harmony-string.js#ne... > src/harmony-string.js:30: while (n > 1) { > +1 for a simpler version, since string concatenations is far more heavy than a > comparison. e.g. > > var res = ""; > while (n) { > if (n & 1) { > res += s; > --n; > } else { > s += s; > n >>= 1; > } > } > return res; > > https://codereview.chromium.org/401783003/diff/20001/src/harmony-string.js#ne... > src/harmony-string.js:62: if (ss.length + start > s_len) { > this should be a separate patch, to ease possible rollbacks Done. Also removed similar patches to string.js.
DBC on the benchmarking side One important difference is that %StringBuilderConcat builds Sequential string while optimization builds cons string. Hence benchmarks that don't account for this difference are not completely representative: you would pay the cost of flattening at use sites, instead of paying it at definition site. If benchmark has no uses of the String.prototype.repeat result then the most effective implementation of repeat is the one that is capable of DCEing it :)
@Vyacheslav Egorov
I tried using %StringBuilderConcat with this macro:
macro APPEND_INTERNAL_ARRAY(dest, src, array)
array[0] = dest;
array[1] = src;
dest = %StringBuilderConcat(array, 2, "");
endmacro
adding this variable:
var res = "";
var array = new InternalArray(2); // new variable
and replacing all instances of `a += b` with `APPEND_INTERNAL_ARRAY(a, b,
array)`.
The result, returning to an O(n) algorithm less optimal than the initial.
Ironically, cons strings are better for building strings. Flattening it with
`%StringBuilderConcat` at the end has little overhead, making it better for
further manipulation. Working sequentially makes it harder to beat O(n) because
of its nature. Here's the gist of each script I used to test all this (omitted
parts to keep within limit):
Guide:
Self-flattening StringRepeat return:
var array = new InternalArray(1);
array[0] = res + s;
return %StringBuilderConcat(array, 1, "");
Standard return:
return res + s;
Strings used in tests:
'foo'
'x'
(Math.random()+'').slice(2)
'skjdvbsa fyqweov09eyvqowb ejrhf qwoi4ue' // test for long strings
Simple example:
`n`s:
5
10
100
10000
10001
1000000 // long count
iterations: 1 million
Loop contents:
res = s.repeat(n); // all already declared
Diff of simple example (`diff -y`):
# internal flatten # # nothing #
Testing string 0: "foo" Testing string 0: "foo"
Running test #0: 5 iterations Running test #0: 5 iterations
Time taken (ms): 358 | Time taken (ms): 257
Running test #1: 10 iterations Running test #1: 10 iterations
Time taken (ms): 375 | Time taken (ms): 275
Running test #2: 100 iterations Running test #2: 100 iterations
Time taken (ms): 510 | Time taken (ms): 415
Running test #3: 10000 iterations Running test #3: 10000 iterations
Time taken (ms): 834 | Time taken (ms): 733
Running test #4: 10001 iterations Running test #4: 10001 iterations
Time taken (ms): 879 | Time taken (ms): 769
Running test #5: 1000000 iterations Running test #5: 1000000 iterations
Time taken (ms): 1140 | Time taken (ms): 1025
Testing string 1: "x" Testing string 1: "x"
Running test #0: 5 iterations Running test #0: 5 iterations
Time taken (ms): 359 | Time taken (ms): 263
Running test #1: 10 iterations Running test #1: 10 iterations
Time taken (ms): 471 | Time taken (ms): 372
Running test #2: 100 iterations Running test #2: 100 iterations
Time taken (ms): 520 | Time taken (ms): 435
Running test #3: 10000 iterations Running test #3: 10000 iterations
Time taken (ms): 872 | Time taken (ms): 776
Running test #4: 10001 iterations Running test #4: 10001 iterations
Time taken (ms): 916 | Time taken (ms): 813
Running test #5: 1000000 iterations Running test #5: 1000000 iterations
Time taken (ms): 1176 | Time taken (ms): 1063
Testing string 2: "6244036226999015" | Testing string 2:
"5932463612407446"
Running test #0: 5 iterations Running test #0: 5 iterations
Time taken (ms): 230 | Time taken (ms): 139
Running test #1: 10 iterations Running test #1: 10 iterations
Time taken (ms): 259 | Time taken (ms): 170
Running test #2: 100 iterations Running test #2: 100 iterations
Time taken (ms): 407 | Time taken (ms): 309
Running test #3: 10000 iterations Running test #3: 10000 iterations
Time taken (ms): 767 | Time taken (ms): 644
Running test #4: 10001 iterations Running test #4: 10001 iterations
Time taken (ms): 790 | Time taken (ms): 688
Running test #5: 1000000 iterations Running test #5: 1000000 iterations
Time taken (ms): 1041 | Time taken (ms): 941
Testing string 3: "skjdvbsa fyqweov09eyvqowb ejrhf qwoi4ue" Testing string 3:
"skjdvbsa fyqweov09eyvqowb ejrhf qwoi4ue"
Running test #0: 5 iterations Running test #0: 5 iterations
Time taken (ms): 227 | Time taken (ms): 141
Running test #1: 10 iterations Running test #1: 10 iterations
Time taken (ms): 262 | Time taken (ms): 169
Running test #2: 100 iterations Running test #2: 100 iterations
Time taken (ms): 397 | Time taken (ms): 308
Running test #3: 10000 iterations Running test #3: 10000 iterations
Time taken (ms): 772 | Time taken (ms): 650
Running test #4: 10001 iterations Running test #4: 10001 iterations
Time taken (ms): 790 | Time taken (ms): 689
Running test #5: 1000000 iterations Running test #5: 1000000 iterations
Time taken (ms): 1054 | Time taken (ms): 940
Complex example:
`n`s:
5
10
100
500
501
1000 // long count
iterations: 100,000
Loop contents:
res = s.repeat(n); // all already declared
foo = res.replace("a", "z");
foo = foo.repeat(2).substr(20);
The cons string (+ `.concat("")`) is actually counterintuitively slightly faster
in the general case (numbers omitted). The second series of tests I had to
significantly lower the n in .repeat() and the number of iterations in the loop,
but the performance difference still exists for the most part with lower n.
One other note, the common case will likely not be repeating a thousand times.
Those are the only ones where the converted before return actually makes a
difference. Also, converting it after building to a flat string makes it more
optimal for further manipulation. When it returns a cons string, it is far
faster in the simple test, but slower in more real world usage. People also
aren't going to think to append `.concat("")` to speed it up, so this diff below
is why I'm submitting this version. There's little difference with smaller
numbers, and it boosts the larger numbers significantly.
# nothing # # internal
flatten #
Testing string 0: "foo" Testing string 0: "foo"
Running test #0: 5 iterations Running test #0: 5 iterations
Time taken (ms): 113 | Time taken (ms): 116
Running test #1: 10 iterations Running test #1: 10 iterations
Time taken (ms): 112 | Time taken (ms): 117
Running test #2: 100 iterations Running test #2: 100 iterations
Time taken (ms): 705 | Time taken (ms): 714
Running test #3: 500 iterations Running test #3: 500 iterations
Time taken (ms): 3311 | Time taken (ms): 3317
Running test #4: 501 iterations Running test #4: 501 iterations
Time taken (ms): 3338 | Time taken (ms): 3328
Running test #5: 1000 iterations Running test #5: 1000 iterations
Time taken (ms): 6459 | Time taken (ms): 6458
Testing string 1: "x" Testing string 1: "x"
Running test #0: 5 iterations Running test #0: 5 iterations
Time taken (ms): 62 | Time taken (ms): 66
Running test #1: 10 iterations Running test #1: 10 iterations
Time taken (ms): 61 | Time taken (ms): 65
Running test #2: 100 iterations Running test #2: 100 iterations
Time taken (ms): 166 | Time taken (ms): 177
Running test #3: 500 iterations Running test #3: 500 iterations
Time taken (ms): 1380 | Time taken (ms): 1352
Running test #4: 501 iterations Running test #4: 501 iterations
Time taken (ms): 1378 | Time taken (ms): 1360
Running test #5: 1000 iterations Running test #5: 1000 iterations
Time taken (ms): 2585 | Time taken (ms): 2587
Testing string 2: "734548875130713" | Testing string 2:
"7016541191842407"
Running test #0: 5 iterations Running test #0: 5 iterations
Time taken (ms): 117 | Time taken (ms): 113
Running test #1: 10 iterations Running test #1: 10 iterations
Time taken (ms): 175 | Time taken (ms): 150
Running test #2: 100 iterations Running test #2: 100 iterations
Time taken (ms): 2785 | Time taken (ms): 2275
Running test #3: 500 iterations Running test #3: 500 iterations
Time taken (ms): 13769 | Time taken (ms): 10846
Running test #4: 501 iterations Running test #4: 501 iterations
Time taken (ms): 13815 | Time taken (ms): 10919
Running test #5: 1000 iterations Running test #5: 1000 iterations
Time taken (ms): 27351 | Time taken (ms): 21540
Testing string 3: "skjdvbsa fyqweov09eyvqowb ejrhf qwoi4ue" Testing string 3:
"skjdvbsa fyqweov09eyvqowb ejrhf qwoi4ue"
Running test #0: 5 iterations Running test #0: 5 iterations
Time taken (ms): 180 | Time taken (ms): 190
Running test #1: 10 iterations Running test #1: 10 iterations
Time taken (ms): 288 | Time taken (ms): 298
Running test #2: 100 iterations Running test #2: 100 iterations
Time taken (ms): 1860 | Time taken (ms): 1876
Running test #3: 500 iterations Running test #3: 500 iterations
Time taken (ms): 8717 | Time taken (ms): 8736
Running test #4: 501 iterations Running test #4: 501 iterations
Time taken (ms): 8746 | Time taken (ms): 8730
Running test #5: 1000 iterations Running test #5: 1000 iterations
Time taken (ms): 17225 | Time taken (ms): 17240
Given the speed difference in more real-world usage, I'm sending this patch to
make it less cons-y in output.
If `s` is the empty string, iterating 2^31-1 times won't alter the output --- so just return the empty string (this isn't in the spec, but it just seems like an obvious case where we're wasting time iterating). Cutting the dataset in half looks good, it might be worth experimenting to see if it's worth anything to do more than that, but it's probably not majorly worth it for common cases (like repeating whitespace to pad fields).
On 2014/08/13 13:38:45, caitp wrote: > If `s` is the empty string, iterating 2^31-1 times won't alter the output --- so > just return the empty string (this isn't in the spec, but it just seems like an > obvious case where we're wasting time iterating). Actually, ignore the previous comment, this patch behaves a lot better than the current implementation checked into the tree. We're good here.
I am handing off the final decision of whether we want to land this change or not off to Andreas, since he is leading the ES6 effort. To me the cost-benefit ratio of this change is unclear due to lack of a concrete (non-micro) benchmark. I personally am not a huge fan of the amount of micro-optimization that is going on (i.e. manual loop unrolling) and would rather keep it simple, but I can see the benefits of having a O(log n) implementation instead of O(n). <sarcasm>Of course there is a O(1) solution by just introducing a RepString instance type.</sarcasm> Andreas: WDYT?
Whether the perf fixes are wanted or not, this CL does at the very least fix the following bug: https://www.dropbox.com/s/s8go2uaut2arzm6/StringRepeatBug.mp4 (currently, we crash d8 and canary when doing this, but I'm not able to reproduce with this patch applied). It might be worth adding a test case for this one. In the Spidermonkey implementation, they will always throw if S.length*N >= maximum-safe-size (they consider this to be 1<<28 [[bytes, presumably, maybe words?]] --- basically we should throw if we surpass some crazy upper threshold, but `1<<28` will nearly instantly return the empty string, rather than crashing the browser. It's unrelated to the perf fix, but you know, it may not be an exploitable bug, but we probably don't want to let scripts just randomly crash the browser, so it's worth fixing. +1 for the perf fixes and correcting that crash bug.
@All It should now actually follow the standard now that I replaced the bitwise operators with their arithmetic equivalents. @caitp That bug is actually the perf problem I'm patching. The old algorithm not only takes longer to do, it takes up more memory as well. The old algorithm basically did this after the type checking: 1. Dynamically allocate an internal JS array of length N. Each entry isn't yet optimized for a type. 2. Copy the string into each of the N slots for it, one by one. 3. Create a string of variable length and let it be `ret`. 4. For each entry in the string, append it to `ret` by means of standard string concatenation. 5. Convert the string back into a JS string instance. 6. Return it from the internal JS function. This uses cons strings, which are better for string building (although worse for manipulation), and then flattens it at the very end, which is already a well optimized process.
Please add test cases for cases specific to this impelementation, like for n = 2^m-1, etc. Also, please sign the CLA: https://developers.google.com/open-source/cla/individual?csw=1 https://codereview.chromium.org/401783003/diff/80001/src/harmony-string.js File src/harmony-string.js (right): https://codereview.chromium.org/401783003/diff/80001/src/harmony-string.js#ne... src/harmony-string.js:31: if (n % 2) res += s; two-character indent please. https://codereview.chromium.org/401783003/diff/80001/src/harmony-string.js#ne... src/harmony-string.js:36: // unroll last iteration, no need to double the initial string again at the Capitalize "unroll". https://codereview.chromium.org/401783003/diff/80001/src/harmony-string.js#ne... src/harmony-string.js:43: return %StringBuilderConcat(array, 1, ""); Use %FlattenString
On 2014/08/19 12:29:00, Yang wrote: > Please add test cases for cases specific to this impelementation, like for n = > 2^m-1, etc. > > Also, please sign the CLA: > https://developers.google.com/open-source/cla/individual?csw=1 > > https://codereview.chromium.org/401783003/diff/80001/src/harmony-string.js > File src/harmony-string.js (right): > > https://codereview.chromium.org/401783003/diff/80001/src/harmony-string.js#ne... > src/harmony-string.js:31: if (n % 2) res += s; > two-character indent please. > > https://codereview.chromium.org/401783003/diff/80001/src/harmony-string.js#ne... > src/harmony-string.js:36: // unroll last iteration, no need to double the > initial string again at the > Capitalize "unroll". > > https://codereview.chromium.org/401783003/diff/80001/src/harmony-string.js#ne... > src/harmony-string.js:43: return %StringBuilderConcat(array, 1, ""); > Use %FlattenString Just to be clear, I followed up on the CLA thing. Everything is fine in that regard. I'm just waiting for you to address the remaining issues so that I can land this.
On 2014/09/22 15:29:55, Yang wrote: > On 2014/08/19 12:29:00, Yang wrote: > > Please add test cases for cases specific to this impelementation, like for n = > > 2^m-1, etc. > > > > Also, please sign the CLA: > > https://developers.google.com/open-source/cla/individual?csw=1 > > > > https://codereview.chromium.org/401783003/diff/80001/src/harmony-string.js > > File src/harmony-string.js (right): > > > > > https://codereview.chromium.org/401783003/diff/80001/src/harmony-string.js#ne... > > src/harmony-string.js:31: if (n % 2) res += s; > > two-character indent please. > > > > > https://codereview.chromium.org/401783003/diff/80001/src/harmony-string.js#ne... > > src/harmony-string.js:36: // unroll last iteration, no need to double the > > initial string again at the > > Capitalize "unroll". > > > > > https://codereview.chromium.org/401783003/diff/80001/src/harmony-string.js#ne... > > src/harmony-string.js:43: return %StringBuilderConcat(array, 1, ""); > > Use %FlattenString > > Just to be clear, I followed up on the CLA thing. Everything is fine in that > regard. I'm just waiting for you to address the remaining issues so that I can > land this. Closing this since I landed https://codereview.chromium.org/657863002/ |
