|
|
Created:
6 years, 1 month ago by caitp (gmail) Modified:
6 years, 1 month ago CC:
v8-dev Base URL:
https://chromium.googlesource.com/v8/v8.git@master Project:
v8 Visibility:
Public. |
DescriptionCache template literal callSiteObj
BUG=v8:3230
LOG=y
Committed: https://chromium.googlesource.com/v8/v8/+/d6e662d11c93a2717c11407212abf05afe6a5675
Patch Set 1 #
Total comments: 13
Patch Set 2 : Rebased #Patch Set 3 : Use existing hash algorithm, simplify cache get/set, more tests #
Total comments: 5
Patch Set 4 : Fix raw strings with wide characters, + some new tests #
Total comments: 6
Patch Set 5 : Remove multi-byte string handling, will do that later #Patch Set 6 : Shorten hash-string #
Total comments: 1
Patch Set 7 : Rename `num_dummy_chars` to `num_hash_chars` to be clearer #Patch Set 8 : Truncate hash_string to ensure hashed the same #Patch Set 9 : Use Smi for 64 bit, Number for other arches #Patch Set 10 : Ensure hash is a valid Smi literal #
Messages
Total messages: 27 (5 generated)
Simple implementation of callSite caching. String hashing is performed during parsing, while object storage and lookup is a runtime operation implemented in JS. If two different templates result in the same hash key, an array of duplicates is searched after key lookup. If possible, lets simplify this algorithm and ensure it works in all expected cases.
https://codereview.chromium.org/742643003/diff/1/src/harmony-templates.js File src/harmony-templates.js (right): https://codereview.chromium.org/742643003/diff/1/src/harmony-templates.js#new... src/harmony-templates.js:28: if ("raw" in obj) { I know this is broken if Array.prototype is messed with. Maybe a private symbol to identify these would make sense? https://codereview.chromium.org/742643003/diff/1/src/harmony-templates.js#new... src/harmony-templates.js:47: } else if ("raw" in obj) { Same here... what about using a Set of an array?
arv@chromium.org changed reviewers: + arv@chromium.org
neat https://codereview.chromium.org/742643003/diff/1/src/harmony-templates.js File src/harmony-templates.js (right): https://codereview.chromium.org/742643003/diff/1/src/harmony-templates.js#new... src/harmony-templates.js:24: var obj = callSiteCache.get(hash); This needs to be be %MapGet(callSiteCache, hash) or someone can hijack this. https://codereview.chromium.org/742643003/diff/1/src/harmony-templates.js#new... src/harmony-templates.js:28: if ("raw" in obj) { On 2014/11/19 03:50:33, caitp wrote: > I know this is broken if Array.prototype is messed with. Maybe a private symbol > to identify these would make sense? Why is this needed. Can we not make sure we only store valid call site objects in the cache? https://codereview.chromium.org/742643003/diff/1/src/harmony-templates.js#new... src/harmony-templates.js:46: callSiteCache.set(hash, siteObj); %MapSet https://codereview.chromium.org/742643003/diff/1/src/harmony-templates.js#new... src/harmony-templates.js:47: } else if ("raw" in obj) { I'm not sure why this is needed? Don't we only get here if we have a real call site object? https://codereview.chromium.org/742643003/diff/1/src/harmony-templates.js#new... src/harmony-templates.js:49: } else if (IS_ARRAY(obj)) { How can obj be anything but an array? https://codereview.chromium.org/742643003/diff/1/src/parser.cc File src/parser.cc (right): https://codereview.chromium.org/742643003/diff/1/src/parser.cc#newcode5333 src/parser.cc:5333: Vector<uint8_t> hash_string = Vector<uint8_t>::New(num_dummy_chars); What about two byte strings? Can we use StringHasher::HashSequentialString here? https://codereview.chromium.org/742643003/diff/1/test/mjsunit/harmony/templat... File test/mjsunit/harmony/templates.js (right): https://codereview.chromium.org/742643003/diff/1/test/mjsunit/harmony/templat... test/mjsunit/harmony/templates.js:367: })(); Maybe a test that has same values in the array but different values in the raw arrays?
https://codereview.chromium.org/742643003/diff/1/src/harmony-templates.js File src/harmony-templates.js (right): https://codereview.chromium.org/742643003/diff/1/src/harmony-templates.js#new... src/harmony-templates.js:24: var obj = callSiteCache.get(hash); On 2014/11/19 06:09:11, arv wrote: > This needs to be be %MapGet(callSiteCache, hash) or someone can hijack this. Done. https://codereview.chromium.org/742643003/diff/1/src/harmony-templates.js#new... src/harmony-templates.js:28: if ("raw" in obj) { On 2014/11/19 06:09:11, arv wrote: > On 2014/11/19 03:50:33, caitp wrote: > > I know this is broken if Array.prototype is messed with. Maybe a private > symbol > > to identify these would make sense? > > Why is this needed. Can we not make sure we only store valid call site objects > in the cache? In an unlikely event where we have hash collisions, the value is a List of some kind, rather than a single callSiteObj. The lone object is a valid call site, it's just to determine if it's an array of callsites, or an array of strings (single callsite). Less-complex alternative: Always put cached values in an array, even when they're alone (wastes a small bit of memory for the common no-collision case). https://codereview.chromium.org/742643003/diff/1/src/harmony-templates.js#new... src/harmony-templates.js:49: } else if (IS_ARRAY(obj)) { On 2014/11/19 06:09:11, arv wrote: > How can obj be anything but an array? Explained above --- it can be a callSiteObj, or an array of callSiteObjs --- callSiteObj is an array, so it's a bit confusing and bad. Alternative was already proposed, if it sounds good I don't see why not. https://codereview.chromium.org/742643003/diff/1/src/parser.cc File src/parser.cc (right): https://codereview.chromium.org/742643003/diff/1/src/parser.cc#newcode5333 src/parser.cc:5333: Vector<uint8_t> hash_string = Vector<uint8_t>::New(num_dummy_chars); On 2014/11/19 06:09:11, arv wrote: > What about two byte strings? > String::ToCString claims to return a UTF8 representation of the string, so whether it's got multi-byte characters or not, I don't think it should matter. > Can we use StringHasher::HashSequentialString here? I didn't know there was a hashing algorithm built in, I'll use that instead of Murmur.
https://codereview.chromium.org/742643003/diff/40001/src/harmony-templates.js File src/harmony-templates.js (right): https://codereview.chromium.org/742643003/diff/40001/src/harmony-templates.js... src/harmony-templates.js:24: var obj = %MapGet(callSiteCache, hash); It might be a good idea to use LRU caching or something instead, to be less leaky
caitpotter88@gmail.com changed reviewers: + dslomov@chromium.org, rossberg@chromium.org
We should probably rename the "call site object" to "template object" or "template pattern object". The OneByte vs TwoByte issues are not related to this CL so if my suspicions are correct we can deal with it in a different CL. https://codereview.chromium.org/742643003/diff/40001/src/harmony-templates.js File src/harmony-templates.js (right): https://codereview.chromium.org/742643003/diff/40001/src/harmony-templates.js... src/harmony-templates.js:45: %MapSet(callSiteCache, hash, %ArrayConcat(args)); The following looks cleaner to me. var array = %MapGet(callSiteCache, hash); if (IS_UNDEFINED(array)) { array = new InternalArray(); %MapSet(callSiteCache, hash, array); } array.push(siteObj); Push is safe on InternalArray. https://codereview.chromium.org/742643003/diff/40001/src/parser.cc File src/parser.cc (right): https://codereview.chromium.org/742643003/diff/40001/src/parser.cc#newcode5287 src/parser.cc:5287: int num_dummy_chars = (total - 1) * 4; Can you add a comment about why this is needed? https://codereview.chromium.org/742643003/diff/40001/src/parser.cc#newcode5309 src/parser.cc:5309: source->ToCString(ALLOW_NULLS, FAST_STRING_TRAVERSAL, span_start, I'm actually starting to wonder if this is correct. Don't we want to this to either be a 8bit or 16bit string? https://codereview.chromium.org/742643003/diff/40001/src/parser.cc#newcode5327 src/parser.cc:5327: OneByteVector(raw_chars.get(), to_index)); I don't think this is correct. The string might need two bytes here.
On 2014/11/19 17:18:46, arv wrote: > We should probably rename the "call site object" to "template object" or > "template pattern object". > > The OneByte vs TwoByte issues are not related to this CL so if my suspicions are > correct we can deal with it in a different CL. > > https://codereview.chromium.org/742643003/diff/40001/src/harmony-templates.js > File src/harmony-templates.js (right): > > https://codereview.chromium.org/742643003/diff/40001/src/harmony-templates.js... > src/harmony-templates.js:45: %MapSet(callSiteCache, hash, %ArrayConcat(args)); > The following looks cleaner to me. > > var array = %MapGet(callSiteCache, hash); > if (IS_UNDEFINED(array)) { > array = new InternalArray(); > %MapSet(callSiteCache, hash, array); > } > array.push(siteObj); > > Push is safe on InternalArray. > > https://codereview.chromium.org/742643003/diff/40001/src/parser.cc > File src/parser.cc (right): > > https://codereview.chromium.org/742643003/diff/40001/src/parser.cc#newcode5287 > src/parser.cc:5287: int num_dummy_chars = (total - 1) * 4; > Can you add a comment about why this is needed? > > https://codereview.chromium.org/742643003/diff/40001/src/parser.cc#newcode5309 > src/parser.cc:5309: source->ToCString(ALLOW_NULLS, FAST_STRING_TRAVERSAL, > span_start, > I'm actually starting to wonder if this is correct. Don't we want to this to > either be a 8bit or 16bit string? > > https://codereview.chromium.org/742643003/diff/40001/src/parser.cc#newcode5327 > src/parser.cc:5327: OneByteVector(raw_chars.get(), to_index)); > I don't think this is correct. The string might need two bytes here. I think your suspicion may have been correct, based on a quick test: d8> function tag(cs) { print(cs.raw[0]); } undefined d8> tag`안녕` ìë undefined That's probably not very good. I'll see if I can make that work better
https://codereview.chromium.org/742643003/diff/60001/src/parser.cc File src/parser.cc (right): https://codereview.chromium.org/742643003/diff/60001/src/parser.cc#newcode5336 src/parser.cc:5336: unibrow::Utf8Decoder<256> decoder(raw_chars.get(), to_index); A better solution for this is probably to make sure one-byte AstRawStrings can decode UTF8 properly. This works for now, though.
https://codereview.chromium.org/742643003/diff/60001/src/parser.cc File src/parser.cc (right): https://codereview.chromium.org/742643003/diff/60001/src/parser.cc#newcode5310 src/parser.cc:5310: hash_string[num_hash_chars++] = '.'; Can we skip the '.' here? https://codereview.chromium.org/742643003/diff/60001/src/parser.cc#newcode5315 src/parser.cc:5315: source->ToCString(ALLOW_NULLS, FAST_STRING_TRAVERSAL, span_start, I still think this is suspicious. The scanner knows how to handle one vs two byte strings. Lets stick with this known faulty implementation and I'll add a blocking bug to get this fixed. https://codereview.chromium.org/742643003/diff/60001/src/parser.cc#newcode5334 src/parser.cc:5334: // Convert to UTF16 --- One-byte AstRawStrings do not seem to be one byte is used for ascii https://codereview.chromium.org/742643003/diff/60001/src/parser.cc#newcode5336 src/parser.cc:5336: unibrow::Utf8Decoder<256> decoder(raw_chars.get(), to_index); On 2014/11/19 20:16:54, caitp wrote: > A better solution for this is probably to make sure one-byte AstRawStrings can > decode UTF8 properly. > > This works for now, though. We have code in the scanner for handling this already. At this point I think we should ignore the one vs two byte string issue in this cl and start a new CL just about that. Lets focus on the caching of the template pattern object.
https://codereview.chromium.org/742643003/diff/60001/src/harmony-templates.js File src/harmony-templates.js (right): https://codereview.chromium.org/742643003/diff/60001/src/harmony-templates.js... src/harmony-templates.js:39: %MapSet(callSiteCache, hash, new InternalArray(siteObj)); Lets not use the strange overloading of params for the array constructor. var array = new InternalArray(1); array[0] = siteObj;
On 2014/11/19 21:38:57, arv wrote: > https://codereview.chromium.org/742643003/diff/60001/src/harmony-templates.js > File src/harmony-templates.js (right): > > https://codereview.chromium.org/742643003/diff/60001/src/harmony-templates.js... > src/harmony-templates.js:39: %MapSet(callSiteCache, hash, new > InternalArray(siteObj)); > Lets not use the strange overloading of params for the array constructor. > > var array = new InternalArray(1); > array[0] = siteObj; Done --- I think the "." in the hash string separators can be removed too, will do that in a sec. There's a TODO in the tests above a failing test, which can be fixed easily once multi-byte handling is fixed.
LGTM Dmitry, care to do an owners review? https://codereview.chromium.org/742643003/diff/100001/src/parser.cc File src/parser.cc (right): https://codereview.chromium.org/742643003/diff/100001/src/parser.cc#newcode5287 src/parser.cc:5287: int num_dummy_chars = (total - 1) * 3; You lost the comment here.
On 2014/11/19 22:12:31, arv wrote: > LGTM > > Dmitry, care to do an owners review? > > https://codereview.chromium.org/742643003/diff/100001/src/parser.cc > File src/parser.cc (right): > > https://codereview.chromium.org/742643003/diff/100001/src/parser.cc#newcode5287 > src/parser.cc:5287: int num_dummy_chars = (total - 1) * 3; > You lost the comment here. I've since re-added another comment, slightly different semantics but it should be okay most of the time. It can be changed in the followup patch for dealing with multibyte characters.
lgtm
On 2014/11/19 22:16:23, Dmitry Lomov (chromium) wrote: > lgtm will CQ and work on the multi-byte characters issue, hopefully will have a CL up before traveling.
The CQ bit was checked by caitpotter88@gmail.com
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/742643003/90005
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: v8_linux_rel on tryserver.v8 (http://build.chromium.org/p/tryserver.v8/builders/v8_linux_rel/builds/1288)
On 2014/11/19 23:20:48, I haz the power (commit-bot) wrote: > Try jobs failed on following builders: > v8_linux_rel on tryserver.v8 > (http://build.chromium.org/p/tryserver.v8/builders/v8_linux_rel/builds/1288) Accidentally removed an important fix when reverting the multi-byte handling stuff... just making sure it works on ia32 and then will re-submit.
Added a fix for the ia32 case, which works for me locally --- but it would be good to get a second look at that before committing.
On 2014/11/20 01:39:27, caitp wrote: > Added a fix for the ia32 case, which works for me locally --- but it would be > good to get a second look at that before committing. (basically, is it worth converting to Smi on platforms where it's possible --- or is it better to always use Number for all platforms, or just use the full string, etc...)
The CQ bit was checked by caitpotter88@gmail.com
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/742643003/170001
Message was sent while issue was closed.
Committed patchset #10 (id:170001) |