Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(58)

Issue 1157073002: Speed up object-keyed Map and Set by giving out sequential hash codes

Created:
5 years, 7 months ago by Erik Corry Chromium.org
Modified:
5 years, 7 months ago
Reviewers:
adamk, Toon Verwaest
CC:
v8-dev
Base URL:
https://chromium.googlesource.com/v8/v8.git@master
Target Ref:
refs/pending/heads/master
Project:
v8
Visibility:
Public.

Description

Speed up object-keyed Map and Set by giving out sequential hash codes. The previous code generated random hash codes for JS objects. These codes are not visible to JS, but are used by the implementation of Set() and Map(). An open addressed hash table is very sensitive to sequential hash codes, because they "block off" areas of the hash table, and this means that when you get a collision in the table, you get a lot of collisions, as the default simple search for a free slot steps through the blocked off area. However the implementation in Set() and Map() is not sensitive to sequences of sequential hash codes. In fact they work really well and collisions are rare and harmless. In particular, for large tables, the cache behaviour is much better both on insertion and rehashing (pauses). For the example given in https://code.google.com/p/v8/issues/detail?id=4086 V8 runs very badly unless you give it enough memory with a flag. If you give it enough memory then it still isn't very fast. This patch cuts 40% off the run time of the Map() section of the test. On that bug there is a suggestion by me to give newly created objects a hash code immediately, which also cuts the running time of the Map() section of the test by 40%. These are not the same 40%, though, because combining the two changes reduces running time by 77%, ie a 4x speedup. R=adamk@chromium.org, verwaest@chromium.org BUG=v8:4086 LOG=n

Patch Set 1 #

Patch Set 2 : Revert inadvertent change #

Unified diffs Side-by-side diffs Delta from patch set Stats (+7 lines, -16 lines) Patch
M src/collection.js View 2 chunks +7 lines, -7 lines 0 comments Download
M src/math.js View 2 chunks +0 lines, -9 lines 0 comments Download

Messages

Total messages: 4 (0 generated)
Erik Corry Chromium.org
5 years, 7 months ago (2015-05-26 12:49:39 UTC) #1
Erik Corry Chromium.org
For js-perf-test/Collections this is a 3 to 5% improvement for the object keyed strong collections, ...
5 years, 7 months ago (2015-05-26 14:18:34 UTC) #2
adamk
I'm worried that this over-fits our hashing strategy to the implementation details of these particular ...
5 years, 7 months ago (2015-05-26 17:42:00 UTC) #3
Erik Corry Chromium.org
5 years, 7 months ago (2015-05-26 21:30:42 UTC) #4
On 2015/05/26 17:42:00, adamk wrote:
> I'm worried that this over-fits our hashing strategy to the implementation
> details of these particular data structures. If objects added to a Map or Set
> are later added to a WeakMap, they'll end up with exactly the bad behavior you
> describe in the CL description for open-addressed hash tables (though I
suppose
> we could use different private symbols for hashing in an ObjectHashTable vs an
> OrderedHashTable).
> 
> This hashing strategy may also be over-fit to the microbenchmark, since new
> objects are created and inserted once, meaning that the ordering of the hash
> codes always matches the ordering of the container. In more varied workloads
> (such as adding and removing the same objects from a set), I'd this to be a
> wash.
> 
> If you disagree with the above, can you help me overcome these concerns?

I had a different more complicated version that was less likely to cause
problems
for an open addressing hash.  I put it here for reference:
https://codereview.chromium.org/1158083002

It also gives a good speedup on the modified example from bug 4086.  Where this
CL
lowers runtime by 62% the 1158083002 change above gives only gives about 53%
reduction.  I don't think there's anything special about the example from 4086
other
than that it's a big table.  If you look at the spreadsheet I shared you can see
that
benefits are visible even on 10-element collections, and for each order of
magnitude the collection grows, the benefit becomes clearer.  The benefits are
seen even if objects are not inserted in the exact order that they were given
hash
codes: I think that follows from the fact that 1158083002 works too - check out
the codes it gives out.

There are other solutions. We could give out different hash codes for different
purposes,
as you say, or fix weak map to use the same strategy as Map() and Set().

At this point I'm more than running out of time to spend on this, so probably
I'll just leave
these two CLs there as inspiration. The implementation of Map() and Set() is
rather neat
in terms of how it keeps track of insertion order and how it handles collisions,
but it has
some cache locality issues with the current hash code.

Powered by Google App Engine
This is Rietveld 408576698