Created: 8 years, 1 month ago by mattrobenolt
Modified: 8 years, 1 month ago
CC: v8-dev
Visibility: Public
Description

Optimizing the loop algorithm for Array.prototype.map.

The current version yields an O(n^2) algorithm because on each iteration of the loop
it also checks "i in array", which itself scans the whole array. This patch substitutes
the "in" check with !IS_UNDEFINED to keep the loop linear.

General test run showing the performance issue: http://jsperf.com/loop-vs-map/3
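For context, a paraphrased sketch of the two loop shapes under discussion. This is plain JavaScript, not the actual src/array.js code; IS_UNDEFINED is a V8 builtins macro approximated here by a strict comparison with undefined:

```javascript
// Spec-conformant shape: check "i in array" before reading each element.
function mapWithInCheck(array, f) {
  const length = array.length >>> 0;
  const result = new Array(length);
  for (let i = 0; i < length; i++) {
    if (i in array) {                 // per-index [[HasProperty]] check
      result[i] = f(array[i], i, array);
    }
  }
  return result;
}

// Shape proposed in this patch: read first, then test for undefined.
function mapWithUndefinedCheck(array, f) {
  const length = array.length >>> 0;
  const result = new Array(length);
  for (let i = 0; i < length; i++) {
    const element = array[i];         // unconditional [[Get]]
    if (element !== undefined) {      // stand-in for !IS_UNDEFINED(element)
      result[i] = f(element, i, array);
    }
  }
  return result;
}
```

On a dense array with no undefined values the two agree; the review below turns on the cases where they do not.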
Patch Set 1
Total comments: 1
Messages
Total messages: 11 (0 generated)
https://codereview.chromium.org/11365189/diff/1/src/array.js
File src/array.js (right):

https://codereview.chromium.org/11365189/diff/1/src/array.js#newcode1243
src/array.js:1243: if (!IS_UNDEFINED(element)) {
Quick drive-by comment: I think that this "optimization" is wrong in the presence of e.g. accessors, because the access of array[i] can be observed.
On 2012/11/11 15:51:41, Sven Panne wrote:
> src/array.js:1243: if (!IS_UNDEFINED(element)) {
> Quick drive-by-comment: I think that this "optimization" is wrong in the
> presence of e.g. accessors, because the access of array[i] can be observed.

The only scenario I could come up with where the behavior changes is when explicitly passing undefined as a value in the original array. So I'm assuming that IS_UNDEFINED has some special meaning other than === undefined? They appear to behave differently.
On 2012/11/11 16:33:22, mattrobenolt wrote:
> The only scenario I could come up with where the behavior changes is when
> explicitly passing undefined as a value in the original array. [...]

Disregard that. This is a problem with the difference between a variable being "undefined" vs. "undeclared". I can't figure out a way to distinguish between the two in V8. It appears that the IS_UNDETECTABLE macro would catch it, but it doesn't. :( Any ideas?
On 2012/11/11 17:08:30, mattrobenolt wrote: > Disregard that. This is a problem with the difference between a variable being "undefined" vs "undeclared". [...] To be exact, the real problem is a different one: The spec is very explicit about Array.prototy.map (section 15.4.4.19, step 8c), and it states that [[Get]] should only be called when [[HasProperty]] is true for the current index. Therefore, any optimization where array[i] is accessed without a prior check is not spec-conformant, at least that is my interpretation of the spec. Looping in our "language lawyer" Andreas... ;-)
On 2012/11/12 08:06:50, Sven Panne wrote: > On 2012/11/11 17:08:30, mattrobenolt wrote: > > Disregard that. This is a problem with the difference between a variable being > "undefined" vs "undeclared". [...] > > To be exact, the real problem is a different one: The spec is very explicit > about Array.prototy.map (section 15.4.4.19, step 8c), and it states that [[Get]] > should only be called when [[HasProperty]] is true for the current index. > Therefore, any optimization where array[i] is accessed without a prior check is > not spec-conformant, at least that is my interpretation of the spec. Looping in > our "language lawyer" Andreas... ;-) Indeed. Checking for undefined is incorrect for multiple reasons, e.g. it cannot distinguish absent properties from ones set to undefined (you need 'in' or 'hasOwnProperty' for that), and it incorrectly invokes accessors or proxy traps. Also, "i in arr" does not "scan the whole array". It should be a constant time lookup, and no slower than arr[i] (it actually is significantly faster on accessor properties). Array map certainly is not O(n^2), otherwise your graphs would look *much* worse. But yes, you might wonder why the red version is faster than the yellow one. My guess is that red enables common subexpression elimination on arr[i], so that it has to perform only one lookup per iteration, whereas the yellow version uses two different operations, which cannot be merged by the compiler. Not sure there is much that can easily be done about that, though.
On 2012/11/12 10:51:03, rossberg wrote:
> But yes, you might wonder why the red version is faster than the yellow one.
> My guess is that red enables common subexpression elimination on arr[i] [...]

Looking a bit closer (with Sven), the root cause for the difference simply is that the 'in' operator always calls into the V8 runtime (which incurs noticeable overhead), while normal property access is (usually) handled in compiled code directly.
On 2012/11/12 12:19:52, rossberg wrote:
> Looking a bit closer (with Sven), the root cause for the difference simply is
> that the 'in' operator always calls into the V8 runtime [...]

Thanks for your feedback. :) Are there any ways that this can be optimized specifically for "dense" arrays? I don't know if there'd be a way to check and assert that, but the way map() is implemented today, since it has to follow the spec verbatim, seems pointless for the common case. Just guessing, but most map() usages are with a "dense" array. I can't think of many times that people are throwing around arrays that have holes in them. I know it can happen, but it's a minority for sure.

Since it does the extra checks on each key, it's quite a bit slower than a normal loop calling a function. In my experience, map() should provide some benefit in terms of speed, not just syntax sugar. In JavaScript today, we're provided a slower method to do something trivial, and it's slow for the sake of following a spec. Not sure if this is really out of line or not, but it's just my opinion. Again, thanks for taking the time to look this over and explain. :)
On 2012/11/12 18:06:03, mattrobenolt wrote:
> Are there any ways that this can be optimized specifically for "dense" arrays?
> [...]

It would be easy to detect cases where the array is dense. The trouble with that, though, is that we have no easy way of knowing whether it _stays_ that way during the loop -- unfortunately, JS arrays are far too mutable, and the callback can usually modify them in any way it likes.

Consequently, the "is dense" check would have to happen on every iteration, which is only mildly cheaper than the more general "in" check. In principle, it would be possible to be more clever for special cases, if we could detect the callback to be pure with respect to the array, but that would require significant extra infrastructure to implement.

JS arrays are a beautiful concept, aren't they? :)
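To see why a one-time density check would not be enough: the callback can punch a hole into the array while map is still iterating, and a conformant map must notice. A small sketch:

```javascript
// The array starts dense, but the callback deletes a later element
// on the first iteration, making the array sparse mid-loop.
const a = [1, 2, 3, 4];
const out = a.map((x, i, arr) => {
  if (i === 0) delete arr[2]; // arr is no longer dense after this
  return x * 10;
});
// Index 2 fails its [[HasProperty]] check, so the callback is
// skipped there and the result has a hole at that position.
console.log(out.length); // 4
console.log(2 in out);   // false
```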
On 2012/11/13 10:25:54, rossberg wrote:
> Consequently, the "is dense" check would have to happen on every iteration,
> which is only mildly cheaper than the more general "in" check. [...]

Wow, that's terrible. So because of all of this, the conclusion is that, as a developer, you should avoid using map() if you care about performance. And that's a shame, because map() is a beautiful syntax for a lot of things.

I almost want to propose a "fastMap()" or something that says, "Yeah, we don't care about the insanity checks, I know what I'm doing!". But that's more than likely not going to work. ;)
On 2012/11/13 15:30:22, mattrobenolt wrote:
> So because of all of this, the conclusion is that as a developer, they should
> avoid using map() if they care about performance.

The "premature optimisation is the root of all evil" mantra still applies, though. In particular, the relevance of micro-benchmarks like those seen on jsperf is often vastly overestimated.

> And that's a shame, because map() is a beautiful syntax for a lot of things.
>
> I almost want a proposed "fastMap()" or something that says, "Yeah, we don't
> care about the insanity checks, I know what I'm doing!"

What they should have done is define map and friends not to filter holes.
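What a hole-ignoring map would look like is easy to sketch. Here is a hypothetical mapNoHoleFilter (the name is invented for illustration) that reads every index unconditionally, so holes simply come through as undefined:

```javascript
// Hypothetical map variant that does not filter holes: every index
// is read unconditionally, so a hole is observed as plain undefined.
function mapNoHoleFilter(arr, fn) {
  const len = arr.length >>> 0;
  const out = new Array(len);
  for (let i = 0; i < len; i++) {
    out[i] = fn(arr[i], i, arr); // no "i in arr" check
  }
  return out;
}

const sparse = [1, , 3]; // hole at index 1

// The built-in map preserves the hole...
const builtin = sparse.map(x => x);
console.log(1 in builtin); // false

// ...while the hole-ignoring variant fills it with undefined.
const variant = mapNoHoleFilter(sparse, x => x);
console.log(1 in variant); // true
console.log(variant[1]);   // undefined
```

This is the trade-off the thread converges on: per-index hole checks cost real time, and a definition that never filters holes would not need them.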