|
|
Created:
6 years, 2 months ago by caitp (gmail) Modified:
6 years, 1 month ago CC:
v8-dev Project:
v8 Visibility:
Public. |
DescriptionCheck string literals with escapes in PreParserTraits::GetSymbol()
LOG=Y
BUG=v8:3606
R=arv@chromium.org, marja@chromium.org
Committed: https://code.google.com/p/v8/source/detail?r=24880
Patch Set 1 #
Total comments: 5
Patch Set 2 : Add tests + only check unescaped literals for arguments/eval #
Total comments: 2
Patch Set 3 : Share code between LiteralMatches() and UnescapedLiteralMatches() #
Total comments: 3
Patch Set 4 : Use default parameter + add some extra test cases #Patch Set 5 : Fixup indentation #
Total comments: 1
Patch Set 6 : Escape backslash in string literals to make msvs happy #
Messages
Total messages: 24 (1 generated)
Still trying to figure out how to write a test case for this... Work in progress, shouldn't take long https://codereview.chromium.org/615813004/diff/1/src/preparser.cc File src/preparser.cc (right): https://codereview.chromium.org/615813004/diff/1/src/preparser.cc#newcode75 src/preparser.cc:75: if (scanner->LiteralMatches("eval", 4)) { I'm not sure if this should be used for "eval" and "arguments" https://codereview.chromium.org/615813004/diff/1/src/scanner.h File src/scanner.h (right): https://codereview.chromium.org/615813004/diff/1/src/scanner.h#newcode399 src/scanner.h:399: if (current_token() == Token::STRING && The check for Token::STRING is simply because I don't fully understand the scanner yet, it was a really quick look to see how this could work. It may not be necessary. https://codereview.chromium.org/615813004/diff/1/src/scanner.h#newcode406 src/scanner.h:406: return UnescapedLiteralMatches(data, length); But if it does turn out to be necessary, I want to use the unescapedliteralmatches to verify non-string-literals.
Can you update the test in test-parsing.cc?
This seems worth doing. https://codereview.chromium.org/615813004/diff/1/src/preparser.cc File src/preparser.cc (right): https://codereview.chromium.org/615813004/diff/1/src/preparser.cc#newcode75 src/preparser.cc:75: if (scanner->LiteralMatches("eval", 4)) { On 2014/10/01 20:52:18, caitp wrote: > I'm not sure if this should be used for "eval" and "arguments" eval and arguments should be the unescaped one. https://codereview.chromium.org/615813004/diff/1/src/scanner.h File src/scanner.h (right): https://codereview.chromium.org/615813004/diff/1/src/scanner.h#newcode406 src/scanner.h:406: return UnescapedLiteralMatches(data, length); On 2014/10/01 20:52:18, caitp wrote: > But if it does turn out to be necessary, I want to use the > unescapedliteralmatches to verify non-string-literals. Is the only difference here is the literal_contains_escapes? If so, maybe refactor to share the code?
On 2014/10/01 21:00:35, arv wrote: > This seems worth doing. > > https://codereview.chromium.org/615813004/diff/1/src/preparser.cc > File src/preparser.cc (right): > > https://codereview.chromium.org/615813004/diff/1/src/preparser.cc#newcode75 > src/preparser.cc:75: if (scanner->LiteralMatches("eval", 4)) { > On 2014/10/01 20:52:18, caitp wrote: > > I'm not sure if this should be used for "eval" and "arguments" > > eval and arguments should be the unescaped one. > > https://codereview.chromium.org/615813004/diff/1/src/scanner.h > File src/scanner.h (right): > > https://codereview.chromium.org/615813004/diff/1/src/scanner.h#newcode406 > src/scanner.h:406: return UnescapedLiteralMatches(data, length); > On 2014/10/01 20:52:18, caitp wrote: > > But if it does turn out to be necessary, I want to use the > > unescapedliteralmatches to verify non-string-literals. > > Is the only difference here is the literal_contains_escapes? If so, maybe > refactor to share the code? I think it's easier to understand if they have different names "UnescapedLiteralMatches" vs "LiteralMatches" --- a boolean flag would make the code a lot harder to read. Can do it, but I'm not sure it's that great
On 2014/10/01 21:12:46, caitp wrote: > On 2014/10/01 21:00:35, arv wrote: > > This seems worth doing. > > > > https://codereview.chromium.org/615813004/diff/1/src/preparser.cc > > File src/preparser.cc (right): > > > > https://codereview.chromium.org/615813004/diff/1/src/preparser.cc#newcode75 > > src/preparser.cc:75: if (scanner->LiteralMatches("eval", 4)) { > > On 2014/10/01 20:52:18, caitp wrote: > > > I'm not sure if this should be used for "eval" and "arguments" > > > > eval and arguments should be the unescaped one. > > > > https://codereview.chromium.org/615813004/diff/1/src/scanner.h > > File src/scanner.h (right): > > > > https://codereview.chromium.org/615813004/diff/1/src/scanner.h#newcode406 > > src/scanner.h:406: return UnescapedLiteralMatches(data, length); > > On 2014/10/01 20:52:18, caitp wrote: > > > But if it does turn out to be necessary, I want to use the > > > unescapedliteralmatches to verify non-string-literals. > > > > Is the only difference here is the literal_contains_escapes? If so, maybe > > refactor to share the code? > > I think it's easier to understand if they have different names > "UnescapedLiteralMatches" vs "LiteralMatches" --- a boolean flag would make the > code a lot harder to read. Can do it, but I'm not sure it's that great I agree that the functions should have those names. You can still share the code though.
On 2014/10/01 21:20:06, arv wrote: > On 2014/10/01 21:12:46, caitp wrote: > > On 2014/10/01 21:00:35, arv wrote: > > > This seems worth doing. > > > > > > https://codereview.chromium.org/615813004/diff/1/src/preparser.cc > > > File src/preparser.cc (right): > > > > > > https://codereview.chromium.org/615813004/diff/1/src/preparser.cc#newcode75 > > > src/preparser.cc:75: if (scanner->LiteralMatches("eval", 4)) { > > > On 2014/10/01 20:52:18, caitp wrote: > > > > I'm not sure if this should be used for "eval" and "arguments" > > > > > > eval and arguments should be the unescaped one. > > > > > > https://codereview.chromium.org/615813004/diff/1/src/scanner.h > > > File src/scanner.h (right): > > > > > > https://codereview.chromium.org/615813004/diff/1/src/scanner.h#newcode406 > > > src/scanner.h:406: return UnescapedLiteralMatches(data, length); > > > On 2014/10/01 20:52:18, caitp wrote: > > > > But if it does turn out to be necessary, I want to use the > > > > unescapedliteralmatches to verify non-string-literals. > > > > > > Is the only difference here is the literal_contains_escapes? If so, maybe > > > refactor to share the code? > > > > I think it's easier to understand if they have different names > > "UnescapedLiteralMatches" vs "LiteralMatches" --- a boolean flag would make > the > > code a lot harder to read. Can do it, but I'm not sure it's that great > > I agree that the functions should have those names. You can still share the code > though. fair enough
https://codereview.chromium.org/615813004/diff/20001/test/cctest/test-parsing.cc File test/cctest/test-parsing.cc (right): https://codereview.chromium.org/615813004/diff/20001/test/cctest/test-parsing... test/cctest/test-parsing.cc:3858: "static 'prot\u006ftype'() {}", Can you also add: "static 'prototype'() {}", ... while you are here? https://codereview.chromium.org/615813004/diff/20001/test/cctest/test-parsing... test/cctest/test-parsing.cc:3884: "*'c\u006fnstructor'() {}", same https://codereview.chromium.org/615813004/diff/40001/src/scanner.h File src/scanner.h (right): https://codereview.chromium.org/615813004/diff/40001/src/scanner.h#newcode387 src/scanner.h:387: bool LiteralMatches(const char* data, int length, bool allow_escapes) { You can use default params here: bool allow_escapes = true https://codereview.chromium.org/615813004/diff/40001/src/scanner.h#newcode401 src/scanner.h:401: inline bool LiteralMatches(const char* data, int length) { Then this can be removed
arv@chromium.org changed reviewers: + marja@chromium.org
Adding marja since this would need an owner too.
https://codereview.chromium.org/615813004/diff/40001/test/cctest/test-parsing.cc File test/cctest/test-parsing.cc (right): https://codereview.chromium.org/615813004/diff/40001/test/cctest/test-parsing... test/cctest/test-parsing.cc:3861: "static *'prot\u006ftype'() {}", One more case: "static prot\u006ftype() {}", ...
On 2014/10/01 21:49:56, arv wrote: > https://codereview.chromium.org/615813004/diff/40001/test/cctest/test-parsing.cc > File test/cctest/test-parsing.cc (right): > > https://codereview.chromium.org/615813004/diff/40001/test/cctest/test-parsing... > test/cctest/test-parsing.cc:3861: "static *'prot\u006ftype'() {}", > One more case: > > "static prot\u006ftype() {}", > ... added these btw
LGTM Marja, can you take a look since I am not an owner?
https://codereview.chromium.org/615813004/diff/80001/src/preparser.cc File src/preparser.cc (right): https://codereview.chromium.org/615813004/diff/80001/src/preparser.cc#newcode81 src/preparser.cc:81: if (scanner->LiteralMatches("prototype", 9)) { actually, given https://people.mozilla.org/~jorendorff/es6-draft.html#sec-names-and-keywords, this same behaviour probably should apply to `eval` and `arguments`, too
lgtm (sorry for delay, I'm ooo). arv@ should probably be made an owner :) Depending on the level of paranoia, you might want to run the Windows try bots before landing - I had some problems with them not liking some unicode escapes in strings.
On 2014/10/09 18:30:11, marja wrote: > lgtm (sorry for delay, I'm ooo). > > arv@ should probably be made an owner :) > > Depending on the level of paranoia, you might want to run the Windows try bots > before landing - I had some problems with them not liking some unicode escapes > in strings. maybe it would be a good idea to do that (one day I'll be able to run try bots on my own :3)
On 2014/10/21 04:23:16, caitp wrote: > On 2014/10/09 18:30:11, marja wrote: > > lgtm (sorry for delay, I'm ooo). > > > > arv@ should probably be made an owner :) > > > > Depending on the level of paranoia, you might want to run the Windows try bots > > before landing - I had some problems with them not liking some unicode escapes > > in strings. > > maybe it would be a good idea to do that (one day I'll be able to run try bots > on my own :3) This needs landng, right? Running bots now
On 2014/10/24 13:53:25, Dmitry Lomov (chromium) wrote: > On 2014/10/21 04:23:16, caitp wrote: > > On 2014/10/09 18:30:11, marja wrote: > > > lgtm (sorry for delay, I'm ooo). > > > > > > arv@ should probably be made an owner :) > > > > > > Depending on the level of paranoia, you might want to run the Windows try > bots > > > before landing - I had some problems with them not liking some unicode > escapes > > > in strings. > > > > maybe it would be a good idea to do that (one day I'll be able to run try bots > > on my own :3) > > This needs landng, right? > Running bots now Looks like windows issues with certain unicode escapes like marja@ had said... hmm :(
\\u006f instead?
dslomov@ if you re-run the windows tests with the updated CL, I think that should work now. (same pattern used in other tests)
You can also have a look at how I hacked it in StreamingUtf8Script in test-api.cc. This is a handy web site for converting: http://www.endmemo.com/unicode/unicodeconverter.php
Except that seems like your stuff works better actually :) I should probably add some backslashes too.
windows builders seem happy, will land this
Message was sent while issue was closed.
Committed patchset #6 (id:100001) manually as 24880 (presubmit successful).
Message was sent while issue was closed.
Btw, I was confused. Our use cases for putting unicode into the tests are not the same. You want the test to read \uaaaa (so you need to write "\\uaaaa" into the code), whereas in my tests I want the raw bytes of the unicode character to be there in the test (so I could write \uaaaa if Windows liked it, but as it doesn't, I need to write \xaa\xbb\xcc. Sry for the noise. |