|
|
Chromium Code Reviews|
Created:
7 years, 4 months ago by arv (Not doing code reviews) Modified:
7 years, 4 months ago CC:
v8-dev Visibility:
Public. |
DescriptionHack forEach to use HasFastPackedElements
WIP
BUG=
Patch Set 1 : Changed to use benchmarks/base.js #Patch Set 2 : Now with sparse tests #Patch Set 3 : cleanup #Patch Set 4 : check undefined instead #Patch Set 5 : Need to special case proxies #Patch Set 6 : Branch on proxies #
Total comments: 6
Messages
Total messages: 18 (0 generated)
Remember that we talked a bit about how slow forEach is due to the in operator. This hack exposes whether the array is packed, allowing us to skip the in check. The results are promising, it gives a 2x perf gain: $ out/ia32.release/d8 test.js ForEach: 183577 ForEach2: 370329
I like it! I doubt it would slow things down much, but can you add a benchmark
for a non-packed case to see how much the extra branch costs us?
Note that there's a more complex version that works even for slightly-sparse
arrays:
if (%HasFastElements(array)) {
var element = array[i];
if (!IS_UNDEFINED(element) || i in array) {
// do whatever
}
} else if (i in array) {
// normal path
}
This works because fast elements can't have getters.
On 2013/08/08 20:40:32, adamk wrote:
> I like it! I doubt it would slow things down much, but can you add a benchmark
> for a non-packed case to see how much the extra branch costs us?
>
> Note that there's a more complex version that works even for slightly-sparse
> arrays:
>
> if (%HasFastElements(array)) {
> var element = array[i];
> if (!IS_UNDEFINED(element) || i in array) {
> // do whatever
> }
> } else if (i in array) {
> // normal path
> }
>
> This works because fast elements can't have getters.
What if the prototype has a non undefined value?
var a = [0, /* 1 */, 2];
a.__proto__ = {1: 3};
On 2013/08/08 20:57:46, arv wrote:
> On 2013/08/08 20:40:32, adamk wrote:
> > I like it! I doubt it would slow things down much, but can you add a
benchmark
> > for a non-packed case to see how much the extra branch costs us?
> >
> > Note that there's a more complex version that works even for slightly-sparse
> > arrays:
> >
> > if (%HasFastElements(array)) {
> > var element = array[i];
> > if (!IS_UNDEFINED(element) || i in array) {
> > // do whatever
> > }
> > } else if (i in array) {
> > // normal path
> > }
> >
> > This works because fast elements can't have getters.
>
> What if the prototype has a non undefined value?
>
> var a = [0, /* 1 */, 2];
> a.__proto__ = {1: 3};
Ugh, you're right, you need to walk up the proto chain checking
%HasFastElements.
I'm sold that optimizing packed arrays is way easier as a first try.
Added sparse array check: arv@zapp:~/src/v8 (faster-for-each) $ out/ia32.release/d8 test.js ForEach: 179435 ForEach2: 363199 ForEachWithHoles: 117389 ForEach2WithHoles: 109674 arv@zapp:~/src/v8 (faster-for-each) $ out/ia32.release/d8 test.js ForEach: 179573 ForEach2: 368303 ForEachWithHoles: 121373 ForEach2WithHoles: 111130 arv@zapp:~/src/v8 (faster-for-each) $ out/ia32.release/d8 test.js ForEach: 184737 ForEach2: 367396 ForEachWithHoles: 122627 ForEach2WithHoles: 112499 The reason ForEach2WithHoles is slightly faster seems to be that some optimization kicks in. Changing the order of the tests makes ForEachWithHoles slightly faster. $ out/ia32.release/d8 test.js ForEach2: 365734 ForEach: 182602 ForEach2WithHoles: 112983 ForEachWithHoles: 123444
One more idea... taken from reduce Instead of doing the in check we [[Get]] the value and only if the value is undefined do we need to do an [[HasProperty]] check. ForEach: 184870 ForEach2: 367066 ForEach3: 452772 ForEachWithHoles: 123503 ForEach2WithHoles: 114547 ForEach3WithHoles: 110123 This is 3x faster! The downside to both of these ideas is that with proxies it is detectable that the order no longer matches the spec. Maybe these hacks are not worth it and we should just make [[HasProperty]] fast... not that at will ever be as fast as ForEach3 since we always need to do the [[Get]] no matter what.
On 2013/08/09 14:24:06, arv wrote: > One more idea... taken from reduce > > Instead of doing the in check we [[Get]] the value and only if the value is > undefined do we need to do an [[HasProperty]] check. > > ForEach: 184870 > ForEach2: 367066 > ForEach3: 452772 > ForEachWithHoles: 123503 > ForEach2WithHoles: 114547 > ForEach3WithHoles: 110123 > > This is 3x faster! > > The downside to both of these ideas is that with proxies it is detectable that > the order no longer matches the spec. It's not just proxies, it's also observable with getters. If you look at git blame, that's what the code used to do, and I think mstarzinger fixed it to match the spec. > Maybe these hacks are not worth it and we should just make [[HasProperty]] > fast... not that at will ever be as fast as ForEach3 since we always need to do > the [[Get]] no matter what.
Hi Michael, I've been thinking about how to improve the perf of Array.prototype.forEach. I have two hacks that improves its speed by about 2x and 3x respectively, without changing anything observable. Do you think either of these are worth doing? $ out/ia32.release/d8 test.js ForEach: 177128 ForEach2: 345285 ForEach3: 430109 ForEachWithHoles: 121757 ForEach2WithHoles: 110729 ForEach3WithHoles: 110750
https://codereview.chromium.org/22545007/diff/20001/src/array.js File src/array.js (right): https://codereview.chromium.org/22545007/diff/20001/src/array.js#newcode1263 src/array.js:1263: if (!isProxy && %HasFastPackedElements(array) || i in array) { It is _really_ surprising that the runtime call into Runtime_HasFastPackedElements is cheaper than the "in" keyword. This suggests that there is a lot of potential in actually improving "in" in general. This needs investigation of the generated code! https://codereview.chromium.org/22545007/diff/20001/src/array.js#newcode1315 src/array.js:1315: if (!IS_UNDEFINED(element) || i in array) { This is actually wrong as the i-th element might have a getter that removes the element upon first access but still returns undefined. This will cause us to do the "in" check on the property after the element has been removed. We had this "optimization" for a while and it was removed. The test262 suite should actually cover this case.
https://codereview.chromium.org/22545007/diff/20001/src/array.js File src/array.js (right): https://codereview.chromium.org/22545007/diff/20001/src/array.js#newcode1315 src/array.js:1315: if (!IS_UNDEFINED(element) || i in array) { On 2013/08/13 14:10:22, Michael Starzinger wrote: > This is actually wrong as the i-th element might have a getter that removes the > element upon first access but still returns undefined. This will cause us to do > the "in" check on the property after the element has been removed. > > We had this "optimization" for a while and it was removed. The test262 suite > should actually cover this case. This is the issue tracking exactly this: https://code.google.com/p/v8/issues/detail?id=1956
On 2013/08/13 14:10:21, Michael Starzinger wrote: > https://codereview.chromium.org/22545007/diff/20001/src/array.js > File src/array.js (right): > > https://codereview.chromium.org/22545007/diff/20001/src/array.js#newcode1263 > src/array.js:1263: if (!isProxy && %HasFastPackedElements(array) || i in array) > { > It is _really_ surprising that the runtime call into > Runtime_HasFastPackedElements is cheaper than the "in" keyword. This suggests > that there is a lot of potential in actually improving "in" in general. This > needs investigation of the generated code! A few things I noticed as I was investigating this. - `in` calls into runtime code - `in` does have a fast path for packed arrays - but %HasFastPackedElements is faster because we can skip `in` entirely since we know that all indexes are in the array. > https://codereview.chromium.org/22545007/diff/20001/src/array.js#newcode1315 > src/array.js:1315: if (!IS_UNDEFINED(element) || i in array) { > This is actually wrong as the i-th element might have a getter that removes the > element upon first access but still returns undefined. This will cause us to do > the "in" check on the property after the element has been removed. > > We had this "optimization" for a while and it was removed. The test262 suite > should actually cover this case. You are correct. I see it now. I believe we still have this bug in reduce and reduceRight find_initial loop.
https://codereview.chromium.org/22545007/diff/20001/src/array.js File src/array.js (right): https://codereview.chromium.org/22545007/diff/20001/src/array.js#newcode1263 src/array.js:1263: if (!isProxy && %HasFastPackedElements(array) || i in array) { On 2013/08/13 14:10:22, Michael Starzinger wrote: > It is _really_ surprising that the runtime call into > Runtime_HasFastPackedElements is cheaper than the "in" keyword. This suggests > that there is a lot of potential in actually improving "in" in general. This > needs investigation of the generated code! Now that I think about it, I don't actually think that this is a valid optimization either. What happens if the callback actually changes the array in a way so that it remains in fast-packed mode, but "i" still runs out of bounds. Imagine the following callback. var array = [1,2,3,4,5,6]; Object.defineProperty(array, '1', { get: function () { array.length = 3; }, configurable: true }); array.forEach(f);
Dbc https://codereview.chromium.org/22545007/diff/20001/src/array.js File src/array.js (right): https://codereview.chromium.org/22545007/diff/20001/src/array.js#newcode1263 src/array.js:1263: if (!isProxy && %HasFastPackedElements(array) || i in array) { I don't think you can have indexed accessors while being packed. I think we always go dictionary mode when accessors are installed. Holey arrays would have this problem; but they are already excluded explicitly. On 2013/08/13 15:47:16, Michael Starzinger wrote: > On 2013/08/13 14:10:22, Michael Starzinger wrote: > > It is _really_ surprising that the runtime call into > > Runtime_HasFastPackedElements is cheaper than the "in" keyword. This suggests > > that there is a lot of potential in actually improving "in" in general. This > > needs investigation of the generated code! > > Now that I think about it, I don't actually think that this is a valid > optimization either. What happens if the callback actually changes the array in > a way so that it remains in fast-packed mode, but "i" still runs out of bounds. > Imagine the following callback. > > var array = [1,2,3,4,5,6]; > Object.defineProperty(array, '1', { > get: function () { array.length = 3; }, > configurable: true > }); > array.forEach(f);
https://codereview.chromium.org/22545007/diff/20001/src/array.js File src/array.js (right): https://codereview.chromium.org/22545007/diff/20001/src/array.js#newcode1263 src/array.js:1263: if (!isProxy && %HasFastPackedElements(array) || i in array) { On 2013/08/13 20:47:42, Toon Verwaest wrote: > I don't think you can have indexed accessors while being packed. I think we > always go dictionary mode when accessors are installed. Holey arrays would have > this problem; but they are already excluded explicitly. > > On 2013/08/13 15:47:16, Michael Starzinger wrote: > > On 2013/08/13 14:10:22, Michael Starzinger wrote: > > > It is _really_ surprising that the runtime call into > > > Runtime_HasFastPackedElements is cheaper than the "in" keyword. This > suggests > > > that there is a lot of potential in actually improving "in" in general. This > > > needs investigation of the generated code! > > > > Now that I think about it, I don't actually think that this is a valid > > optimization either. What happens if the callback actually changes the array > in > > a way so that it remains in fast-packed mode, but "i" still runs out of > bounds. > > Imagine the following callback. > > > > var array = [1,2,3,4,5,6]; > > Object.defineProperty(array, '1', { > > get: function () { array.length = 3; }, > > configurable: true > > }); > > array.forEach(f); > OK, admittedly, my example does not trigger the problem and is unnecessarily complex. But you don't need accessors to reproduce this problem ... var array = [1,2,3,4,5,6]; function f(value, index, array) { array.length = 3; } array.forEach(f);
On 2013/08/14 07:48:47, Michael Starzinger wrote: > https://codereview.chromium.org/22545007/diff/20001/src/array.js > File src/array.js (right): > > https://codereview.chromium.org/22545007/diff/20001/src/array.js#newcode1263 > src/array.js:1263: if (!isProxy && %HasFastPackedElements(array) || i in array) > { > On 2013/08/13 20:47:42, Toon Verwaest wrote: > > I don't think you can have indexed accessors while being packed. I think we > > always go dictionary mode when accessors are installed. Holey arrays would > have > > this problem; but they are already excluded explicitly. > > > > On 2013/08/13 15:47:16, Michael Starzinger wrote: > > > On 2013/08/13 14:10:22, Michael Starzinger wrote: > > > > It is _really_ surprising that the runtime call into > > > > Runtime_HasFastPackedElements is cheaper than the "in" keyword. This > > suggests > > > > that there is a lot of potential in actually improving "in" in general. > This > > > > needs investigation of the generated code! > > > > > > Now that I think about it, I don't actually think that this is a valid > > > optimization either. What happens if the callback actually changes the array > > in > > > a way so that it remains in fast-packed mode, but "i" still runs out of > > bounds. > > > Imagine the following callback. > > > > > > var array = [1,2,3,4,5,6]; > > > Object.defineProperty(array, '1', { > > > get: function () { array.length = 3; }, > > > configurable: true > > > }); > > > array.forEach(f); > > > > OK, admittedly, my example does not trigger the problem and is unnecessarily > complex. But you don't need accessors to reproduce this problem ... > > var array = [1,2,3,4,5,6]; > function f(value, index, array) { > array.length = 3; > } > array.forEach(f); Then we could compare the length, but only if the length is a data property... it gets ugly fast... but it gives us 2x performance in the common case where an array is used. Another thing we could do is to special case packed JSArray in HasElement. My experiments show that gives a 30% perf improvement for forEach but it should benefit all in operations for arrays. Something like: bool JSReceiver::HasElement(uint32_t index) { if (IsJSArray()) { if (JSObject::cast(this)->HasFastPackedElements()) { int len = Smi::cast(JSArray::cast(this)->length())->value(); return index < static_cast<uint32_t>(len); } } if (IsJSProxy()) { return JSProxy::cast(this)->HasElementWithHandler(index); } return JSObject::cast(this)->GetElementAttributeWithReceiver( this, index, true) != ABSENT; }
You can actually create, using Object.defineProperty, an array that doesn't have foreign accessors as its length anymore. At that point values don't get deleted anymore when you reduce the length. This means you can create a "packed array" which has elements beyond its length (which you can find just by reading them). Also, given that forEach reads the length before starting to loop, it means that the array essentially becomes "holey" at the moment you shorted the length dynamically; in the sense that you have to look for elements beyond the actual length of the array. Those elements can come from the prototype chain; and can obviously be a getter or so.
On 2013/08/14 14:48:55, Toon Verwaest wrote: > You can actually create, using Object.defineProperty, an array that doesn't have > foreign accessors as its length anymore. No, Array instances have a non configurable length data property. > At that point values don't get deleted > anymore when you reduce the length. This means you can create a "packed array" > which has elements beyond its length (which you can find just by reading them). > > Also, given that forEach reads the length before starting to loop, it means that > the array essentially becomes "holey" at the moment you shorted the length > dynamically; in the sense that you have to look for elements beyond the actual > length of the array. Those elements can come from the prototype chain; and can > obviously be a getter or so. Yeah, Michael pointed that out. Both length and packed needs to be checked.
What I mean is that you can't break when you go past the length. You actually have to go to a slow path when you read past the length, that does the entire in-check; and loops until the index surpasses the preloaded length value. And that's independent of whether the receiver is fast-packed or not. In any case, we don't think this is the right way to optimize this function. We rather should first teach crankshaft to inline this function into the caller, and then we should teach crankshaft to do the optimizations you are doing by hand; to ensure that all such JavaScript code benefits from that optimization. Do you have a particular pressing need for this function to become fast? In other words, is there a real-world application that is currently blocked by this, performance-wise? If not, we'd rather focus on doing what we described above, rather than adding more localized (potentially buggy) optimizations that we'll have to throw out again later. |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
