|
|
Created:
6 years, 10 months ago by aperez Modified:
6 years, 5 months ago CC:
v8-dev Visibility:
Public. |
DescriptionImplement handling of arrow functions in the parser
Arrow functions are parsed from ParseAssignmentExpression. Handling the
parameter list is done by letting ParseConditionalExpression() parse
a comma-separated list of identifiers, and it returns a tree of
BinaryOperation nodes with VariableProxy leaves, or a single
VariableProxy if there is only one parameter. When the arrow token "=>"
is found, the VariableProxy nodes are passed to ParseFunctionLiteral(),
which will then skip parsing the paramaeter list. This avoids having
to rewind when the arrow is found and restart parsing the parameter
list. Note that ParseExpression() expects parenthesized expressions
to not be empty, so checking for a closing parenthesis is added in
handling the empty parameter list "()" will accept a right-paren and
return an empty expression, which means that the parameter list is
empty.
Additionally, this adds the following machinery:
- A runtime flag "harmony_arrow_functions" (disabled by default).
Enabling "harmony" will enable it as well.
- An IsArrow bit in SharedFunctionInfo, and accessors for it.
- An IsArrow bit in FunctionLiteral, accessorts for it, and
a constructor parameter to set its value.
- In ParserBase: allow_arrow_functions() and set_allow_arrow_functions()
- A V8 native %FunctionIsArrow(), which is used to skip adding the
"function " prefix when getting the source code for an arrow
function.
R=marja@chromium.org
Committed: https://code.google.com/p/v8/source/detail?r=22265
Patch Set 1 #Patch Set 2 : Rebased against latest changes in parser #Patch Set 3 : Rebased #Patch Set 4 : Implement handling of arrow functions in the parser #
Total comments: 18
Patch Set 5 : Rebased, moved to ParserBase, fixed as per comments #Patch Set 6 : Rebased, made arrow function objects not have .prototype #Patch Set 7 : Rebased after latest changes in runtime.{h,cc} #
Total comments: 13
Patch Set 8 : Rebased and fixed nits #Patch Set 9 : Extra parens in parameter lists now recognized, no longer segfaults on "()" #
Total comments: 9
Patch Set 10 : Version with parsing code only, tests into test-parsing.cc #
Total comments: 8
Patch Set 11 : Made tests more comprehensive #
Total comments: 1
Patch Set 12 : Removed ParamListFinder #
Total comments: 21
Patch Set 13 : Cleanup round #Patch Set 14 : Same, plus a rebase and conflicts resolved #
Total comments: 10
Patch Set 15 : Moar clean ups, return dummy FunctionLiteral for arrow functions #
Total comments: 4
Patch Set 16 : Fixed nits #Patch Set 17 : Make ParameterListFromExpression() return arguments in natural order #Patch Set 18 : Made sure build suceeds with GCC and werror=yes #Patch Set 19 : Fixed cpplint issues #Patch Set 20 : New upload after doing "make dependencies" #Patch Set 21 : What "git cl format" produces #
Messages
Total messages: 84 (0 generated)
I have rebased the patch after the latest changes in the parser (FunctionState et al.). Also, this new version moves the “allow_arrow_functions_” flag to ParserBase.
On 2014/02/12 16:01:05, aperez wrote: > I have rebased the patch after the latest changes in the parser (FunctionState > et al.). Also, this new version moves the “allow_arrow_functions_” flag to > ParserBase. Thank you, however, you'll still have to go through the "new feature process" before as outlined here: https://developers.google.com/v8/launchprocess
On 2014/02/12 16:02:52, jochen wrote: > On 2014/02/12 16:01:05, aperez wrote: > > I have rebased the patch after the latest changes in the parser (FunctionState > > et al.). Also, this new version moves the “allow_arrow_functions_” flag to > > ParserBase. > > Thank you, > > however, you'll still have to go through the "new feature process" before as > outlined here: https://developers.google.com/v8/launchprocess Posted: https://groups.google.com/d/topic/v8-users/5FNvOv-kQY4/discussion
CL updated
Here are some comments (but I didn't yet look at all the code in detail). https://codereview.chromium.org/160073006/diff/380001/src/parser.cc File src/parser.cc (right): https://codereview.chromium.org/160073006/diff/380001/src/parser.cc#newcode3141 src/parser.cc:3141: if (expression == NULL) Normally having expression == NULL would be an error case; pls add a comment here that this can actually happen in non-error cases. (Afaics it happens when the argument list is empty.) https://codereview.chromium.org/160073006/diff/380001/src/parser.cc#newcode3312 src/parser.cc:3312: // FormalParameterList :: This ParseFunctionLiteral is pretty monstrous as is, can it be split first before adding another dimension for array funcs? https://codereview.chromium.org/160073006/diff/380001/src/preparser.cc File src/preparser.cc (right): https://codereview.chromium.org/160073006/diff/380001/src/preparser.cc#newcod... src/preparser.cc:844: PreParser::Expression PreParser::ParseFunctionLiteral( ... this doesn't handle arrow funcs yet? (Like, single statement body and so on.) You should add tests to cctest/test-parsing to check that the parser and preparser accept the same set of arrow functions. https://codereview.chromium.org/160073006/diff/380001/src/preparser.h File src/preparser.h (right): https://codereview.chromium.org/160073006/diff/380001/src/preparser.h#newcode... src/preparser.h:1464: if (allow_arrow_functions() && peek() == Token::RPAREN) { This doesn't sound right. I guess the idea is that we see the (, then parse the expression inside the () with ParseExpression. But what if we see a lonely ) when we expect an expression? That should be an error, but this looks like it'll accept it.
Hmm, one more comment... what am I missing? I thought arrow funcs are just syntactic sugar and they won't have any impact beyond the Parser, i.e., the parser would produce the same ast in these cases: (foo, bar) => { baz } function(foo, bar) { baz } but this CL doesn't seem like it...
On 2014/03/24 09:06:47, marja wrote: > Hmm, one more comment... what am I missing? I thought arrow funcs are just > syntactic sugar and they won't have any impact beyond the Parser, i.e., the > parser would produce the same ast in these cases: > (foo, bar) => { baz } > function(foo, bar) { baz } > > but this CL doesn't seem like it... Arrow functions don't bind 'this' or 'arguments', and have a few other differences.
On 2014/03/24 09:21:06, rossberg wrote: > On 2014/03/24 09:06:47, marja wrote: > > Hmm, one more comment... what am I missing? I thought arrow funcs are just > > syntactic sugar and they won't have any impact beyond the Parser, i.e., the > > parser would produce the same ast in these cases: > > (foo, bar) => { baz } > > function(foo, bar) { baz } > > > > but this CL doesn't seem like it... > > Arrow functions don't bind 'this' or 'arguments', and have a few other > differences. ah, ofc, yes, thanks.
In addition to the comments below, this is lacking tests. You should be able to treat arrow functions like ordinary functions for now, such that you can already write JS tests that check correct parsing. Example tests I'd like to see include () => 0 x => x x => y => x + y (x, y) => (u, v) => x*u + y*v (x, y) => x.a = y var f = x => x, y // y is undefined (x, y) => x, y // fails with y unbound => 0 // fails ) => 0 // fails (()) => 0 // fails ((x)) => x // fails ((x, y)) => x + y // fails ); // fails Ideally, write tests that use eval to programmatically generate and check many different combinations of these, in different syntactic contexts. https://codereview.chromium.org/160073006/diff/380001/src/parser.cc File src/parser.cc (right): https://codereview.chromium.org/160073006/diff/380001/src/parser.cc#newcode1013 src/parser.cc:1013: FunctionParsingMode parsing_mode = shared_info->is_generator() Nit: I'd prefer formatting this like FunctionParsingMode parsing_mode = shared_info->is_generator() ? kGeneratorFunction : shared_info->is_arrow() ? kArrowFunction : kNormalFunction; https://codereview.chromium.org/160073006/diff/380001/src/parser.cc#newcode3139 src/parser.cc:3139: Vector<VariableProxy*> Parser::ParameterListFromExpression( How does this rule out illegal parenthesis? For example, ((x)) or ((x, y)) would seem to (incorrectly) parse as the parameter lists (x) and (x, y), wouldn't it? https://codereview.chromium.org/160073006/diff/380001/src/parser.cc#newcode3151 src/parser.cc:3151: ParserTraits::ReportMessageAt(Scanner::Location(pos, pos), Can we refactor this a little to avoid all the error reporting code duplication? https://codereview.chromium.org/160073006/diff/380001/src/parser.cc#newcode3312 src/parser.cc:3312: // FormalParameterList :: On 2014/03/24 09:04:06, marja wrote: > This ParseFunctionLiteral is pretty monstrous as is, can it be split first > before adding another dimension for array funcs? +1. The arrow function case is parsing quite different syntax, so I'd rather have it as a separate function. You can still factor out commonalities into third functions (e.g. ParseFunctionParameterList, ParseFunctionBody, etc.). https://codereview.chromium.org/160073006/diff/380001/src/parser.cc#newcode3315 src/parser.cc:3315: if (func_parsing_mode == kArrowFunction && peek() == Token::ARROW) { Can it ever happen that the first condition is true but the second isn't? Where do you invoke ParseFunctionLiteral with kArrowFunction but have not parsed the parameters yet? https://codereview.chromium.org/160073006/diff/380001/src/parser.cc#newcode3400 src/parser.cc:3400: Expression* expression = ParseExpression(true, CHECK_OK); Shouldn't this be ParseAssignmentExpression?
https://codereview.chromium.org/160073006/diff/380001/src/parser.cc File src/parser.cc (right): https://codereview.chromium.org/160073006/diff/380001/src/parser.cc#newcode3139 src/parser.cc:3139: Vector<VariableProxy*> Parser::ParameterListFromExpression( On 2014/03/25 12:35:34, rossberg wrote: > How does this rule out illegal parenthesis? For example, ((x)) or ((x, y)) would > seem to (incorrectly) parse as the parameter lists (x) and (x, y), wouldn't it? True, this case is tricky to implement when re-interpreting an already-parsed AST as a parameter list. Probably this is going to need some way of tracking whether a parenthesized expression is inside another. I am going to add a TODO here for the moment. https://codereview.chromium.org/160073006/diff/380001/src/parser.cc#newcode3141 src/parser.cc:3141: if (expression == NULL) On 2014/03/24 09:04:06, marja wrote: > Normally having expression == NULL would be an error case; pls add a comment > here that this can actually happen in non-error cases. (Afaics it happens when > the argument list is empty.) Yes, when the argument list is empty, “expression” is NULL. I will add a comment about NULL being valid in this particular case. https://codereview.chromium.org/160073006/diff/380001/src/parser.cc#newcode3151 src/parser.cc:3151: ParserTraits::ReportMessageAt(Scanner::Location(pos, pos), On 2014/03/25 12:35:34, rossberg wrote: > Can we refactor this a little to avoid all the error reporting code duplication? Sure. https://codereview.chromium.org/160073006/diff/380001/src/parser.cc#newcode3312 src/parser.cc:3312: // FormalParameterList :: On 2014/03/25 12:35:34, rossberg wrote: > On 2014/03/24 09:04:06, marja wrote: > > This ParseFunctionLiteral is pretty monstrous as is, can it be split first > > before adding another dimension for array funcs? > > +1. The arrow function case is parsing quite different syntax, so I'd rather > have it as a separate function. You can still factor out commonalities into > third functions (e.g. ParseFunctionParameterList, ParseFunctionBody, etc.). I am in the process of moving parsing of arrow functions to a different function, and trying to put it in ParserBase, which is making me having to move/add some bits into the common parser stuff — IMHO this better for new code than having to convert things later on. https://codereview.chromium.org/160073006/diff/380001/src/parser.cc#newcode3315 src/parser.cc:3315: if (func_parsing_mode == kArrowFunction && peek() == Token::ARROW) { On 2014/03/25 12:35:34, rossberg wrote: > Can it ever happen that the first condition is true but the second isn't? Where > do you invoke ParseFunctionLiteral with kArrowFunction but have not parsed the > parameters yet? It could happen when the function is parsed lazily: it is known in advance that we would be parsing an arrow function (SharedFunctionInfo->is_arrow() returns true), and we would start parsing at the start of the parameter list (so peek() != Token::ARROW). Anyway, after moving arrow function parsing to its own function this code will not be here anymore. https://codereview.chromium.org/160073006/diff/380001/src/parser.cc#newcode3400 src/parser.cc:3400: Expression* expression = ParseExpression(true, CHECK_OK); On 2014/03/25 12:35:34, rossberg wrote: > Shouldn't this be ParseAssignmentExpression? Indeed. https://codereview.chromium.org/160073006/diff/380001/src/preparser.cc File src/preparser.cc (right): https://codereview.chromium.org/160073006/diff/380001/src/preparser.cc#newcod... src/preparser.cc:844: PreParser::Expression PreParser::ParseFunctionLiteral( On 2014/03/24 09:04:06, marja wrote: > ... this doesn't handle arrow funcs yet? (Like, single statement body and so > on.) > > You should add tests to cctest/test-parsing to check that the parser and > preparser accept the same set of arrow functions. FWIW, I will add the tests, but provided that I am will be adding a new ParseArrowFunctionLiteral() in ParserBase, both parser and preparser will be in sync. https://codereview.chromium.org/160073006/diff/380001/src/preparser.h File src/preparser.h (right): https://codereview.chromium.org/160073006/diff/380001/src/preparser.h#newcode... src/preparser.h:1464: if (allow_arrow_functions() && peek() == Token::RPAREN) { On 2014/03/24 09:04:06, marja wrote: > This doesn't sound right. I guess the idea is that we see the (, then parse the > expression inside the () with ParseExpression. But what if we see a lonely ) > when we expect an expression? That should be an error, but this looks like it'll > accept it. It was causing an unexpected token error most of the time, because of the requirement that parens are balanced, but there was a case (parsing a single expression-statement) which caused infinite recursion. I am updating this to check for: scanner()->current_token() == Token::LPAREN && peek() == Token::RPAREN
Btw, I just refactored ParseFunctionLiteral. Con: you'll need to rebase. Pro: The function is less monstrous now, so people are probably more willing to accept additions. You might still want to refactor parsing the arguments out of it. (Or I might.)
On 2014/04/15 08:32:49, marja wrote: > Btw, I just refactored ParseFunctionLiteral. Con: you'll need to rebase. Pro: > The function is less monstrous now, so people are probably more willing to > accept additions. You might still want to refactor parsing the arguments out of > it. (Or I might.) Precisely I was about to update this CL later today... In the end I have split the code for arrow function parsing in a separate ParserBase::ParseArrowFunctionLiteral() function, so rebasing now has gotten much easier. I see in your commit that you have split out SkipLazyFunctionBody() and ParseEagerFunctionBody(), which can be used for arrow function bodies as well... so I will do one more rebase and use those before updating the CL. Thanks for the heads-up!
On 2014/04/15 08:59:58, aperez wrote: > On 2014/04/15 08:32:49, marja wrote: > > Btw, I just refactored ParseFunctionLiteral. Con: you'll need to rebase. Pro: > > The function is less monstrous now, so people are probably more willing to > > accept additions. You might still want to refactor parsing the arguments out > of > > it. (Or I might.) > > Precisely I was about to update this CL later today... ...and just updated it now.
Thanks for working on this! Some questions which popped in my mind while reading through this (but I didn't do much searching to find the answers): 1) Why do Identifiers now need to know their positions? 2) Why is the added FunctionState constructor needed? https://codereview.chromium.org/160073006/diff/440001/src/parser.cc File src/parser.cc (right): https://codereview.chromium.org/160073006/diff/440001/src/parser.cc#newcode3202 src/parser.cc:3202: Vector<VariableProxy*> ParserTraits::ParameterListFromExpression( It looks like this doesn't check that the expression is parenthesized, so this would accept foo, bar => { baz }. You should add tests for such cases in the style of test-parsing.cc. https://codereview.chromium.org/160073006/diff/440001/src/parser.cc#newcode3220 src/parser.cc:3220: goto invalid_token; Afaics, it would be possible to write this in a goto-less way; just set some error condition and break from the while loop, after the loop, check the error conditions. https://codereview.chromium.org/160073006/diff/440001/src/preparser.h File src/preparser.h (right): https://codereview.chromium.org/160073006/diff/440001/src/preparser.h#newcode736 src/preparser.h:736: class PreParserStatement { I'd be more than willing to take this kind of changes as a separate CL. A bit unfortunate that this will be the first unified statement parsing function... I wonder if this CL would get easier if I unify some easy statement parsing funcs (like ParseIfStatement), so that the machinery for unified statement parsing would be there already... https://codereview.chromium.org/160073006/diff/440001/src/preparser.h#newcode939 src/preparser.h:939: FunctionLiteral::IsGeneratorFlag is_generator, Instead of IsGeneratorFlag and IsArrowFlag you should have only one flag, which takes values "generator", "arrow" and "normal". https://codereview.chromium.org/160073006/diff/440001/src/preparser.h#newcode... src/preparser.h:1188: return Vector<const PreParserIdentifier>::empty(); This makes PreParser accept a different language than Parser (well, it already does, but I don't want to make them diverge any further). PreParserExpression could maintain a "is this a valid parenthesized comma-separated identifier list" property which can be used here for checking the validity. https://codereview.chromium.org/160073006/diff/440001/src/preparser.h#newcode... src/preparser.h:2385: if (peek() == Token::LPAREN) { Hmm, so this function can be called in two ways, first by ParseAssignmentExpression, where the next token is always arrow, and then by Parser::ParseLazy where it's the beginning of the param list... Maybe you could split this up into two different functions. The last part, parsing the body after the =>, can be a third function which the two then call.
On 2014/04/24 08:08:05, marja wrote: > Thanks for working on this! > > Some questions which popped in my mind while reading through this (but I didn't > do much searching to find the answers): > > 1) Why do Identifiers now need to know their positions? It is enough to have dummy implementations of PreParserIdentifier::position(), and PreParserIdentifier::length(). The actual positions are not really needed by the preparser. It seems I thought they would be needed and then forgot to remove that piece of code. > 2) Why is the added FunctionState constructor needed? In ParseArrowFunctionLiteral(), when the new Scope is created, in the case of the PreParser it will be a stack-allocated PreParserScope, while in Parser it will be a heap-allocated v8::internal::Scope, so in the line that creates the FunctionState: FunctionState function_state(&function_state_, &scope_, &scope, zone()); …for the PreParser, “scope” is a PreParserScope, and the constructor then receives “&scope” (a PreParserScope*); on the other hand for the Parser “scope” is a v8::internal::Scope*, so the “&scope” parameter is passing a v8::internal::Scope** to the constructor. And that is why there are two constructors. (FTR, this is the simplest solution I could come up with, so suggestions are accepted).
https://codereview.chromium.org/160073006/diff/440001/src/parser.cc File src/parser.cc (right): https://codereview.chromium.org/160073006/diff/440001/src/parser.cc#newcode3202 src/parser.cc:3202: Vector<VariableProxy*> ParserTraits::ParameterListFromExpression( On 2014/04/24 08:08:06, marja wrote: > It looks like this doesn't check that the expression is parenthesized, so this > would accept foo, bar => { baz }. You should add tests for such cases in the > style of test-parsing.cc. The example you mention would be a comma-expression with two items, and it is a valid expression, equivalent to: foo, (bar => { baz }) (foo, bar => { baz }) (foo, (bar => { baz })) So the case you mention is valid syntax, and it is already handled by the grammar because comma has more precedence than expressions, and an arrow function is an expression (the whole arrrow function, including the parameter list). There is no need to check for those cases here, because the input up to the comma has already been parsed as a comma-expression when the ParseArrowFunction() function is called. https://codereview.chromium.org/160073006/diff/440001/src/parser.cc#newcode3220 src/parser.cc:3220: goto invalid_token; On 2014/04/24 08:08:06, marja wrote: > Afaics, it would be possible to write this in a goto-less way; just set some > error condition and break from the while loop, after the loop, check the error > conditions. Sure thing, I will do that. The function will not be as terse, though. https://codereview.chromium.org/160073006/diff/440001/src/preparser.h File src/preparser.h (right): https://codereview.chromium.org/160073006/diff/440001/src/preparser.h#newcode736 src/preparser.h:736: class PreParserStatement { On 2014/04/24 08:08:06, marja wrote: > I'd be more than willing to take this kind of changes as a separate CL. A bit > unfortunate that this will be the first unified statement parsing function... > > I wonder if this CL would get easier if I unify some easy statement parsing > funcs (like ParseIfStatement), so that the machinery for unified statement > parsing would be there already... Fortunately I have this locally as a separate commit, I will submit a new CL which includes only the bits that move PreParserStatement around. https://codereview.chromium.org/160073006/diff/440001/src/preparser.h#newcode939 src/preparser.h:939: FunctionLiteral::IsGeneratorFlag is_generator, On 2014/04/24 08:08:06, marja wrote: > Instead of IsGeneratorFlag and IsArrowFlag you should have only one flag, which > takes values "generator", "arrow" and "normal". Okay. https://codereview.chromium.org/160073006/diff/440001/src/preparser.h#newcode... src/preparser.h:1188: return Vector<const PreParserIdentifier>::empty(); On 2014/04/24 08:08:06, marja wrote: > This makes PreParser accept a different language than Parser (well, it already > does, but I don't want to make them diverge any further). PreParserExpression > could maintain a "is this a valid parenthesized comma-separated identifier list" > property which can be used here for checking the validity. That is actually a neat idea, I will give it a try. Thank you for the suggestion. https://codereview.chromium.org/160073006/diff/440001/src/preparser.h#newcode... src/preparser.h:2385: if (peek() == Token::LPAREN) { On 2014/04/24 08:08:06, marja wrote: > Hmm, so this function can be called in two ways, first by > ParseAssignmentExpression, where the next token is always arrow, and then by > Parser::ParseLazy where it's the beginning of the param list... Correct. In that case we know exactly where the parameter list starts, so there is no need to deduce the parameter names from the AST. Using the AST to pick the parameter names could be done in both cases, but parsing directly for now provides better detection and reporting of syntax errors. > Maybe you could split this up into two different functions. The last part, > parsing the body after the =>, can be a third function which the two then call. Sure, will do.
On 2014/04/24 12:38:36, aperez wrote: > https://codereview.chromium.org/160073006/diff/440001/src/preparser.h#newcode736 > src/preparser.h:736: class PreParserStatement { > On 2014/04/24 08:08:06, marja wrote: > > I'd be more than willing to take this kind of changes as a separate CL. A bit > > unfortunate that this will be the first unified statement parsing function... > > > > I wonder if this CL would get easier if I unify some easy statement parsing > > funcs (like ParseIfStatement), so that the machinery for unified statement > > parsing would be there already... > > Fortunately I have this locally as a separate commit, I will submit a new CL > which includes only the bits that move PreParserStatement around. Done, CL here: https://codereview.chromium.org/252423007/
https://codereview.chromium.org/160073006/diff/440001/src/parser.cc File src/parser.cc (right): https://codereview.chromium.org/160073006/diff/440001/src/parser.cc#newcode3202 src/parser.cc:3202: Vector<VariableProxy*> ParserTraits::ParameterListFromExpression( Hmm, okay. It would be good to add test cases to make sure these expressions are interpreted correctly.
On 2014/04/29 09:38:43, marja wrote: > https://codereview.chromium.org/160073006/diff/440001/src/parser.cc > File src/parser.cc (right): > > https://codereview.chromium.org/160073006/diff/440001/src/parser.cc#newcode3202 > src/parser.cc:3202: Vector<VariableProxy*> > ParserTraits::ParameterListFromExpression( > Hmm, okay. It would be good to add test cases to make sure these expressions are > interpreted correctly. All the issues are solved now, please take a look at the latest version of the patch, which I believe should be ready for landing now :-)
Thanks for the update! Here's a (non-exhaustive) list of comments. In addition, pls add the following tests: - Afaics there are no tests which make sure that PreParser handles arrow functions properly! (mjsunit doesn't go via PreParser). - Add the "arrow functions allowed" flag to test-parsing, so that all existing parsing tests are ran with arrow functions allowed (to make sure you're not changing the accepted language). - Test cases where you check that preparsing and parsing fail the same way and succeed for the same string (see test_parsing.cc) - Test cases for "foo[some arrow func]" and another one for "foo[()]" (this will take the code path you added in ParseExpression, right?) Will these crash because ParseExpression thinks () is okay and returns NULL? - Do constructs like (()) or (()) => something work as expected? - Arrow functions in strict mode and the related errors (dupl. param names etc.) - Tests I listed in the comments - Lazy arrow functions https://codereview.chromium.org/160073006/diff/480001/src/parser.cc File src/parser.cc (right): https://codereview.chromium.org/160073006/diff/480001/src/parser.cc#newcode1017 src/parser.cc:1017: if (shared_info->is_arrow()) { Probably the same refactoring applies to SharedInfo, instead of "is_generator" and "is_arrow" you could have "type" (normal, generator, arrow). https://codereview.chromium.org/160073006/diff/480001/src/parser.h File src/parser.h (right): https://codereview.chromium.org/160073006/diff/480001/src/parser.h#newcode388 src/parser.h:388: typedef v8::internal::Scope* ScopePtr; Why is this needed? https://codereview.chromium.org/160073006/diff/480001/src/preparser.h File src/preparser.h (right): https://codereview.chromium.org/160073006/diff/480001/src/preparser.h#newcode... src/preparser.h:1607: CHECK_OK); I think there is a bug here. So if we're here... but it's not an arrow function but a stray "()", what will happen? I think you're changing the existing behavior of () here. And that's not intentional, I guess? For example, if the code is foo[()], we currently produce an error "Unexpected token )", but now, if arrow functions are allowed, we'll produce an error "Unexpected token ]", right? In any case, pls add test code which tests that possibility (and even better, a test which would've exposed the bug, if you agree it's a bug... but that might be difficult). https://codereview.chromium.org/160073006/diff/480001/src/preparser.h#newcode... src/preparser.h:1643: if (allow_arrow_functions() && peek() == Token::RPAREN && Pls add a test where this code path is taken but it turns out to not be an arrow function. https://codereview.chromium.org/160073006/diff/480001/src/preparser.h#newcode... src/preparser.h:2314: ParserBase<Traits>::ParseArrowFunctionLiteralBody( A lot of this function (starting from strict mode checking) is duplicate code; pls refactor. I'm happy to accept a separate CL which splits the strict mode checking out of ParseFunctionLiteral where it currently is. https://codereview.chromium.org/160073006/diff/480001/test/mjsunit/harmony/ar... File test/mjsunit/harmony/arrow-functions-parsing.js (right): https://codereview.chromium.org/160073006/diff/480001/test/mjsunit/harmony/ar... test/mjsunit/harmony/arrow-functions-parsing.js:28: var objectLiteralResult = ({ value: 42 }); Pls test that these yield the expected results, too. https://codereview.chromium.org/160073006/diff/480001/test/mjsunit/harmony/ar... test/mjsunit/harmony/arrow-functions-parsing.js:39: assertThrows(");", SyntaxError); You can move these "parsing should fail" cases to test-parsing.cc. https://codereview.chromium.org/160073006/diff/480001/test/mjsunit/harmony/ar... test/mjsunit/harmony/arrow-functions-parsing.js:59: assertThrows("(x, (y)) => 0;", SyntaxError); You should add tests for different (legal and illegal) contexts where arrow functions (or something resembling them) can occur. For example, foo, bar => baz foo ? bar => ... : ... foo[()] foo[bar => baz] In addition, you should test cases like ((foo, bar, baz)) => ... (foo, (bar, baz)) => ... ((foo, bar), baz) => ...
https://codereview.chromium.org/160073006/diff/480001/src/parser.h File src/parser.h (right): https://codereview.chromium.org/160073006/diff/480001/src/parser.h#newcode388 src/parser.h:388: typedef v8::internal::Scope* ScopePtr; On 2014/05/26 12:46:06, marja wrote: > Why is this needed? Ahh, nothing, I see we discussed scopes before. :)
On 2014/05/26 12:46:06, marja wrote: > Thanks for the update! > > Here's a (non-exhaustive) list of comments. I have been going over the items in the list and I have almost all of them sorted out now. (Some comments below, keep reading...) > [...] > > - Test cases where you check that preparsing and parsing fail the same way and > succeed for the same string (see test_parsing.cc) After adding code for testing this, it turned out that the PreParser was not behaving completely the same as the parser, in the case where the AST subtree from the arrow function arguments is passed to ParseArrowFunctionLiteral() and then examined to declare variables and report errors. The problem here was that the PreParser does not build an AST, so I came up with a solution which works by “bubbling up” flags in this way: - Add flags to identify PreParserExpressions which are binary operations, and to know when one of those encloses an identifier like “eval”, “arguments” or any other future reserved identifier. - PreParserExpression::BinaryOperation(left, right) is added, and it returns a PreParserExpression that makes an union of the flags in “left” and “right”. Also, if a flag exists both in “left” and “right” (e.g. both enclose “eval”) then it adds an additional flag to signal duplicates. I have this working for “eval” and “arguments”. As soon as I get it to work also for future reserved keywords, both Parser and PreParser will be generating the same errors. > - Test cases for "foo[some arrow func]" and another one for "foo[()]" (this will > take the code path you added in ParseExpression, right?) Will these crash > because ParseExpression thinks () is okay and returns NULL? No problem with this, it won't segfault anymore after fixing the way “()” is handled. For example parsing “foo[()]” will stop and report that “=>” is the next expected token, because in expressions an empty param list is only valid when followed by an arrow function body. > - Do constructs like (()) or (()) => something work as expected? Yes, thanks to the ParamListFinder helper added to the scanner, which keeps track if most recent parenthesized expression is a valid parameter list. > - Arrow functions in strict mode and the related errors (dupl. param names etc.) Read my comment above, about fixing the different errors reported in Parser and PreParser. Parser was reporting those errors before, soon I will have PreParser working as well. > https://codereview.chromium.org/160073006/diff/480001/src/preparser.h#newcode... > src/preparser.h:1607: CHECK_OK); > I think there is a bug here. > > So if we're here... but it's not an arrow function but a stray "()", what will > happen? I think you're changing the existing behavior of () here. And that's not > intentional, I guess? > > For example, if the code is foo[()], we currently produce an error "Unexpected > token )", but now, if arrow functions are allowed, we'll produce an error > "Unexpected token ]", right? Correct, “]” is unexpected. Current version of the parsing code already reports this error properly. > I'm happy to accept a separate CL which splits the strict mode checking out of > ParseFunctionLiteral where it currently is. Sounds good to me. Let's try to have this one CL landed first, and then refactor the checks, as you suggest.
Cool, I'll have another look when you get PreParser sorted out. On 2014/06/04 11:02:55, aperez wrote: > Sounds good to me. Let's try to have this one CL landed first, and then > refactor the checks, as you suggest. I actually meant the other way around: It would be good if you could land the refactoring first, so that you don't need to land duplicate code. It would make this CL smaller, too.
On 2014/06/04 11:06:41, marja wrote: > Cool, I'll have another look when you get PreParser sorted out. > > On 2014/06/04 11:02:55, aperez wrote: > > > Sounds good to me. Let's try to have this one CL landed first, and then > > refactor the checks, as you suggest. > > I actually meant the other way around: It would be good if you could land the > refactoring first, so that you don't need to land duplicate code. It would make > this CL smaller, too. Oh, somehow my brain parsed it the other way around… There is indeed no problem in making the refactoring first O:-)
On 2014/06/04 15:02:52, aperez wrote: > On 2014/06/04 11:06:41, marja wrote: > > Cool, I'll have another look when you get PreParser sorted out. > > > > On 2014/06/04 11:02:55, aperez wrote: > > > > > Sounds good to me. Let's try to have this one CL landed first, and then > > > refactor the checks, as you suggest. > > > > I actually meant the other way around: It would be good if you could land the > > refactoring first, so that you don't need to land duplicate code. It would > make > > this CL smaller, too. > > Oh, somehow my brain parsed it the other way around… There is indeed no problem > in making the refactoring first O:-) Also, this CL is getting kinda big, and it has both parsing and code generation so I will also split the code generation into another CL to make things easier to review.
On 2014/05/26 12:46:06, marja wrote: > https://codereview.chromium.org/160073006/diff/480001/src/preparser.h#newcode... > src/preparser.h:2314: ParserBase<Traits>::ParseArrowFunctionLiteralBody( > A lot of this function (starting from strict mode checking) is duplicate code; > pls refactor. > > I'm happy to accept a separate CL which splits the strict mode checking out of > ParseFunctionLiteral where it currently is. Done, this is the CL: https://codereview.chromium.org/332053004
Just posted a new update. (BTW, if this CL is getting gets too big, I could just cancel it and start a fresh one.) On 2014/05/26 12:46:06, marja wrote: > In addition, pls add the following tests: > > - Afaics there are no tests which make sure that PreParser handles arrow > functions properly! (mjsunit doesn't go via PreParser). > > - Add the "arrow functions allowed" flag to test-parsing, so that all existing > parsing tests are ran with arrow functions allowed (to make sure you're not > changing the accepted language). The tests are now into “cctest/test-parsing.cc”. I have added some more test cases with different contexts, but I will do one more update after making sure arrow function syntax is enabled for all relevant test cases and maybe adding some more test cases. > - Test cases where you check that preparsing and parsing fail the same way and > succeed for the same string (see test_parsing.cc) Those were now added. > - Test cases for "foo[some arrow func]" and another one for "foo[()]" (this will > take the code path you added in ParseExpression, right?) Will these crash > because ParseExpression thinks () is okay and returns NULL? Those will not crash because in an expression the only thing that can follow “()” is an arrow token, so the parser will report that “=>” was expected but not found. > - Do constructs like (()) or (()) => something work as expected? Yes. This is now handled properly. The relevant code is in the “ParamListFinder” helper class (in scanner.cc): it is a small state machine that tracks the most recent single identifier and the most recent parenthesized parameter list, and is used to determine whether the expression to the left side of “=>” is a valid parameter list. > - Arrow functions in strict mode and the related errors (dupl. param names etc.) All handled. > https://codereview.chromium.org/160073006/diff/480001/src/preparser.h#newcode... > src/preparser.h:1607: CHECK_OK); > I think there is a bug here. > > So if we're here... but it's not an arrow function but a stray "()", what will > happen? I think you're changing the existing behavior of () here. And that's not > intentional, I guess? > > For example, if the code is foo[()], we currently produce an error "Unexpected > token )", but now, if arrow functions are allowed, we'll produce an error > "Unexpected token ]", right? That is correct, because after a “()” in an expression an arrow is expected. In the middle of an expression, an empty “()” can only be followed by “=>”. > In any case, pls add test code which tests that possibility (and even better, a > test which would've exposed the bug, if you agree it's a bug... but that might > be difficult). IMHO reporting that there is an unexpected token in those cases would be the expected behavior. The tests now include some cases to check that “()” not followed by “=>” results in a parsing error. WDYT? > https://codereview.chromium.org/160073006/diff/480001/test/mjsunit/harmony/ar... > test/mjsunit/harmony/arrow-functions-parsing.js:28: var objectLiteralResult = ({ > value: 42 }); > Pls test that these yield the expected results, too. I have splitted the parts of the patch which are not related to parsing, to post them in a separate CL. This new file will be in that new CL.
Thanks for posting a new patch! Here are some comments again. Sorry, they're a bit high-level still... I'm not convinced of this ParamListFinder approach. Overall, the code is quite a bit cleaner now. https://codereview.chromium.org/160073006/diff/500001/src/parser.cc File src/parser.cc (right): https://codereview.chromium.org/160073006/diff/500001/src/parser.cc#newcode3323 src/parser.cc:3323: error_token = ""; In this case, don't we get an incorrect error end position? https://codereview.chromium.org/160073006/diff/500001/src/scanner.cc File src/scanner.cc (right): https://codereview.chromium.org/160073006/diff/500001/src/scanner.cc#newcode935 src/scanner.cc:935: bool Scanner::IdentifierIsFutureStrictReserved(const AstString* string) const { This sounds wrong, since we have a separate token for FUTURE_RESERVED_WORD, so here we have somehow lost that information and we're trying to get it back.. :/ https://codereview.chromium.org/160073006/diff/500001/src/scanner.h File src/scanner.h (right): https://codereview.chromium.org/160073006/diff/500001/src/scanner.h#newcode694 src/scanner.h:694: void ParamListFinder::Update(Scanner* scanner) { So you have this state machine which is updated for each token when scanning, and then there is ParserTraits::ParameterListFromExpression too, for sort of getting the same data "after the fact". This looks pretty duplicated to me. Also, you're adding this to be executed after each token during scanning, and I'm worried about the overhead. Would it be feasible to 1) only do the parameter list dance if we have some indicator that we are inside the parameter list (seen "(", for example, and quit as soon as we realize it's not a valid parameter list) 2) do it either when reading the parameter list, or after, but not both? And I'm not sure if Scanner is the right place for the parameter list logic... What's the reason why the "after the fact" processing is not sufficient? Is it because of PreParser? Did you try out the "PreParserExpression keeps track of the "this is a comma-separated identifier name list" property" approach, and why didn't it fly? Btw, how much does this regress the scanning performance? There's this tool lexer-shell which you can use for finding out.. https://codereview.chromium.org/160073006/diff/500001/test/cctest/test-parsin... File test/cctest/test-parsing.cc (right): https://codereview.chromium.org/160073006/diff/500001/test/cctest/test-parsin... test/cctest/test-parsing.cc:257: NULL What, why? What are the changes in this test? https://codereview.chromium.org/160073006/diff/500001/test/cctest/test-parsin... test/cctest/test-parsing.cc:266: for (int i = 0; good_code[i]; i++) { Looping over an array that has only 1 non-trivial element? https://codereview.chromium.org/160073006/diff/500001/test/cctest/test-parsin... test/cctest/test-parsing.cc:1220: parser->set_allow_arrow_functions(i::FLAG_harmony_arrow_functions); We'd probably like to run each existing ParserSyncTest both with and without arrow functions, so you should handle it like the flags above (add a flag for it into "flags" here). But I realize it's actually not as easy as it should be, because for some tests we want to pin some flags (so that other flags vary but those are always true). I can try to add the machinery for that (an "alwaysTrueFlags" array which overrides the varying flags) and see how it looks. https://codereview.chromium.org/160073006/diff/500001/test/cctest/test-parsin... test/cctest/test-parsing.cc:2063: {"eval => {", "}"}, You could have a separate test case for these, and remove the i::FLAG above (so that the main test case would be both with and without the arrow function flag). https://codereview.chromium.org/160073006/diff/500001/test/cctest/test-parsin... test/cctest/test-parsing.cc:2637: Do any of the cases bump into the "is not a variable proxy" in ParserTraits::ParameterListFromExpression()? I guess not? That would happen when we expect a parameter name, but get some other kind of expression which is not a parameter name, such as "13" or "foo ? bar : baz" or "foo + bar". Could you add test cases for those?
Hi again, I was also thinking, why does this CL get so complicated, that is, what's the most painful factor of this? Is it making PreParser produce the same errors when the parameter list is malformed? Is it constructing the parameter list from Expression after we've seen the =>? Is it something else? Maybe I'm overlooking something? Though, it was sort of expected that this is complicated, since we don't yet know whether we're in an arrow function or not when we see (foo, bar, ... :/
On 2014/06/18 07:21:52, marja wrote: > Though, it was sort of expected that this is complicated, since we don't yet > know whether we're in an arrow function or not when we see (foo, bar, ... :/ Unfortunately, you ain't seen nothing yet! ♫♪♫♪ Once we add support for destructuring (especially the CoverInitializedName) and rest params, we have to parse the paren expression as a CoverParenthesizedExpressionAndArrowParameterList, and then go back and reparse and/or validate that. For example: var value = (a = 1, {x, y: z, w: {v}, [a, b, ...c], q); var arrow = (a = 1, {x, y: z, w: {v}, [a, b, ...c], q) => { ... }; To support rest and CoverInitializedName we also need to support var arrow = ({x = 1}, ...xs) => { ... }; var error = ({x = 1}, ...xs); // SyntaxError: Unexpected token = Anyway, I don't want to derail too much. I'm very excited that you are working on basic arrow function support even though I think the whole parsing part of it will need to be reimplemented when we add support rest and destructuring.
On 2014/06/18 14:04:56, arv wrote: > On 2014/06/18 07:21:52, marja wrote: > > Though, it was sort of expected that this is complicated, since we don't yet > > know whether we're in an arrow function or not when we see (foo, bar, ... :/ > > Unfortunately, you ain't seen nothing yet! ♫♪♫♪ > > Once we add support for destructuring (especially the CoverInitializedName) and > rest params, we have to parse the paren expression as a > CoverParenthesizedExpressionAndArrowParameterList, and then go back and reparse > and/or validate that. > > For example: > > var value = (a = 1, {x, y: z, w: {v}, [a, b, ...c], q); > var arrow = (a = 1, {x, y: z, w: {v}, [a, b, ...c], q) => { ... }; > > To support rest and CoverInitializedName we also need to support > > var arrow = ({x = 1}, ...xs) => { ... }; > var error = ({x = 1}, ...xs); // SyntaxError: Unexpected token = > > Anyway, I don't want to derail too much. I'm very excited that you are working > on basic arrow function support even though I think the whole parsing part of it > will need to be reimplemented when we add support rest and destructuring. We discussed this when Adrian started on this, and concluded that we should do it incrementally. Most of the basic machinery should stay the same. In fact, if done right, I would expect destructuring to be an extension of what the current CL does. We also concluded that reparsing is not an option, due to various constraints in V8. We have to use validation, as the current CL already does. The tricky part, though, is what to do in the preparser, which does not generate an AST to validate. As I see it, we will have to parse the cover grammar, and for each production synthesize success flags for both expression and parameter list (i.e., parse both possibilities simultaneously). I think that is what the current CL also has to do, although it should be easier here (so far, parameter lists are strictly a subgrammar of expressions).
On 2014/06/17 11:47:38, marja wrote: > Thanks for posting a new patch! Here are some comments again. Sorry, they're a > bit high-level still... I'm not convinced of this ParamListFinder approach. > Overall, the code is quite a bit cleaner now. ParamListFinder... I reckon it may not be the nicest thing on Earth. > https://codereview.chromium.org/160073006/diff/500001/src/parser.cc#newcode3323 > src/parser.cc:3323: error_token = ""; > In this case, don't we get an incorrect error end position? ParamListFinder in the current implementation will give this error instead (that code path should not be reached): (d8):1: SyntaxError: Function parameter list is malformed (32, a) => {} ^^^^^^ SyntaxError: Function parameter list is malformed Unfortunately, the error does not point at the exact point where the issue is. > https://codereview.chromium.org/160073006/diff/500001/src/scanner.cc#newcode935 > src/scanner.cc:935: bool Scanner::IdentifierIsFutureStrictReserved(const > AstString* string) const { > This sounds wrong, since we have a separate token for FUTURE_RESERVED_WORD, so > here we have somehow lost that information and we're trying to get it back.. :/ We are losing the information when lifting things up to the AST or to the PreParserIdentifier/PreParserExpression objects. Would it be acceptable to keep in the Scanner the positions of the last seen future reserved word, last “eval” token, etc? That would be enough (I think) for detecting those as part of a parameter list in strict mode. > https://codereview.chromium.org/160073006/diff/500001/src/scanner.h#newcode694 > src/scanner.h:694: void ParamListFinder::Update(Scanner* scanner) { > So you have this state machine which is updated for each token when scanning, > and then there is ParserTraits::ParameterListFromExpression too, for sort of > getting the same data "after the fact". This looks pretty duplicated to me. > > Also, you're adding this to be executed after each token during scanning, and > I'm worried about the overhead. > > Would it be feasible to > 1) only do the parameter list dance if we have some indicator that we are inside > the parameter list (seen "(", for example, and quit as soon as we realize it's > not a valid parameter list) > 2) do it either when reading the parameter list, or after, but not both? > > And I'm not sure if Scanner is the right place for the parameter list logic... > > What's the reason why the "after the fact" processing is not sufficient? Is it > because of PreParser? Did you try out the "PreParserExpression keeps track of > the "this is a comma-separated identifier name list" property" approach, and why > didn't it fly? The problem here is that PreParser (PreParserIdentifier/PreParserExpression) does not tracks enough information: a single PreParserExpression is returned from ParseAssignmentExpression(), while in Parser it returns an AST subtree of binops/variableproxies. So I would need it to actually return a list of identifiers seen (or something equivalent), their positions, parenthesization, and for them to have associated strings with the identifier names — not just an integer value with a set of bit flags. That would make the code of PreParser::ParseAssignmentExpression() —and the methods it uses— more complex, and they would end handling a set of data quite similar to what Parser does... and that made me wonder why to have a PreParser to begin with, and to try to come up with the current solution. > Btw, how much does this regress the scanning performance? There's this tool > lexer-shell which you can use for finding out.. As a quick test, I have run lexer-shell and parser-shell on lua.vm.js (a version of the Lua VM compiled to JS using Emscripten, ~700kB in a single JS file — https://github.com/kripken/lua.vm.js). Those are the results before and after my patch: before after lexer-shell, release 8ms 11ms lexer-shell, debug 40ms 50ms parser-shell, release 37/ 32ms 40/ 34ms parser-shell, debug 267/248ms 281/262ms > https://codereview.chromium.org/160073006/diff/500001/test/cctest/test-parsin... > test/cctest/test-parsing.cc:257: NULL > What, why? What are the changes in this test? Ouch, this slipped in during a rebase. > https://codereview.chromium.org/160073006/diff/500001/test/cctest/test-parsin... > test/cctest/test-parsing.cc:1220: > parser->set_allow_arrow_functions(i::FLAG_harmony_arrow_functions); > We'd probably like to run each existing ParserSyncTest both with and without > arrow functions, so you should handle it like the flags above (add a flag for it > into "flags" here). > > But I realize it's actually not as easy as it should be, because for some tests > we want to pin some flags (so that other flags vary but those are always true). That's why I decided to go for always setting the flag for arrow functions to true. > I can try to add the machinery for that (an "alwaysTrueFlags" array which > overrides the varying flags) and see how it looks. Sure, will do. > https://codereview.chromium.org/160073006/diff/500001/test/cctest/test-parsin... > test/cctest/test-parsing.cc:2063: {"eval => {", "}"}, > You could have a separate test case for these, and remove the i::FLAG above (so > that the main test case would be both with and without the arrow function flag). Okay. > https://codereview.chromium.org/160073006/diff/500001/test/cctest/test-parsin... > test/cctest/test-parsing.cc:2637: > Do any of the cases bump into the "is not a variable proxy" in > ParserTraits::ParameterListFromExpression()? I guess not? That would happen when > we expect a parameter name, but get some other kind of expression which is not a > parameter name, such as "13" or "foo ? bar : baz" or "foo + bar". Could you add > test cases for those? I have already added a bunch of this kind of test cases. So far the errors are reported as expected for non-VariableProxies :)
On 2014/06/18 14:27:23, rossberg wrote: > [...] The tricky part, though, is what to do in the preparser, which > does not generate an AST to validate [...] Making the PreParser work fine is the most difficult piece of the puzzle, indeed.
On 2014/06/18 14:54:14, aperez wrote: > On 2014/06/17 11:47:38, marja wrote: > https://codereview.chromium.org/160073006/diff/500001/src/parser.cc#newcode3323 > > src/parser.cc:3323: error_token = ""; > > In this case, don't we get an incorrect error end position? > > ParamListFinder in the current implementation will give this error instead > (that code path should not be reached): > > (d8):1: SyntaxError: Function parameter list is malformed > (32, a) => {} > ^^^^^^ > SyntaxError: Function parameter list is malformed > > Unfortunately, the error does not point at the exact point where the > issue is. I think that's a good enough error. > https://codereview.chromium.org/160073006/diff/500001/src/scanner.cc#newcode935 > > src/scanner.cc:935: bool Scanner::IdentifierIsFutureStrictReserved(const > > AstString* string) const { > > This sounds wrong, since we have a separate token for FUTURE_RESERVED_WORD, so > > here we have somehow lost that information and we're trying to get it back.. > :/ > > We are losing the information when lifting things up to the AST or > to the PreParserIdentifier/PreParserExpression objects. Would it be > acceptable to keep in the Scanner the positions of the last seen > future reserved word, last “eval” token, etc? That would be enough > (I think) for detecting those as part of a parameter list in strict > mode. I'm not at all buying this "add code in the Scanner to enhance information that PreParser lost" approach; instead, we should not lose the information to begin with. Or if that's not feasible, then PreParser should just not produce errors for those things (but that's not a very popular view either). > The problem here is that PreParser (PreParserIdentifier/PreParserExpression) > does not tracks enough information: a single PreParserExpression is returned > from ParseAssignmentExpression(), while in Parser it returns an AST subtree > of binops/variableproxies. So I would need it to actually return a list of > identifiers seen (or something equivalent), their positions, parenthesization, > and for them to have associated strings with the identifier names — not just > an integer value with a set of bit flags. Wdyt of the idea that PreParserExpression would track the "is comma separated identifier list" property? Like, if we see "foo", that's a PreParserExpression that is an identifier (we can track that), the same for "bar", and when we build the comma expression "foo, bar", we can track that the corresponding comma expression is a comma separated identifier list (and furthermore for "foo, bar, baz" (further comma operation), and the parens, and that should do the trick. But like you said, PreParser will get closer to Parser as we are required to produce more early errors, and at some point, if we really cannot do without AST, we need to kill it :(
Hmm, though, detecting "eval" and such complicates the "is comma separated parameter list" stuff a bit, I guess...
On 2014/06/18 15:08:02, marja wrote: > Hmm, though, detecting "eval" and such complicates the "is comma separated > parameter list" stuff a bit, I guess... Argh, and the fact that the function can declare itself strict complicates this even further. We could maybe go with the approach that just says "invalid strict mode parameter list" if there is eval or such in it, without pointing to the location. rossberg@, would that be acceptable? Like this: (foo, bar, eval) => { "use strict"; } ^--------------^ malformed strict mode parameter list
(Sorry for the 4th message in the row..) If you need to track stuff like "last eval position" in the preparser, that should go in PreParserFactory or PreParserTraits or somewhere there, not in Scanner. Would that help?
On 2014/06/18 15:15:15, marja wrote: > On 2014/06/18 15:08:02, marja wrote: > > Hmm, though, detecting "eval" and such complicates the "is comma separated > > parameter list" stuff a bit, I guess... > > Argh, and the fact that the function can declare itself strict complicates this > even further. > > We could maybe go with the approach that just says "invalid strict mode > parameter list" if there is eval or such in it, without pointing to the > location. rossberg@, would that be acceptable? > > Like this: > > (foo, bar, eval) => { "use strict"; } > ^--------------^ > malformed strict mode parameter list Arrow parameters are always strict: http://people.mozilla.org/~jorendorff/es6-draft.html#sec-arrow-function-defin...
On 2014/06/18 15:43:03, arv wrote: > On 2014/06/18 15:15:15, marja wrote: > > On 2014/06/18 15:08:02, marja wrote: > > > Hmm, though, detecting "eval" and such complicates the "is comma separated > > > parameter list" stuff a bit, I guess... > > > > Argh, and the fact that the function can declare itself strict complicates > this > > even further. > > > > We could maybe go with the approach that just says "invalid strict mode > > parameter list" if there is eval or such in it, without pointing to the > > location. rossberg@, would that be acceptable? > > > > Like this: > > > > (foo, bar, eval) => { "use strict"; } > > ^--------------^ > > malformed strict mode parameter list > > Arrow parameters are always strict: > > http://people.mozilla.org/~jorendorff/es6-draft.html#sec-arrow-function-defin... If this is a reasonable way of reporting errors, I can easily modify the code I currently have to “bubble up” the bit in the code of PreParserExpressions that are binary operations, and add a bit that means “this binary expression is a valid arrow function parameter list”, without needing to tap into the scanner. Finding duplicate parameter names would still be tricky, though.
On 2014/06/18 15:18:13, marja wrote: > (Sorry for the 4th message in the row..) > > If you need to track stuff like "last eval position" in the preparser, that > should go in PreParserFactory or PreParserTraits or somewhere there, not in > Scanner. Would that help? It would help to make the error reporting a bit more intelligent point to a more specific location, instead of just the whole parameter list. Anyway, no worries about writing a lot of messages in a row — they are being good as inspiration, reading those I have had some more ideas about things that can be done to avoid hooking things up to the scanner :D
https://codereview.chromium.org/160073006/diff/520001/src/scanner.h File src/scanner.h (right): https://codereview.chromium.org/160073006/diff/520001/src/scanner.h#newcode767 src/scanner.h:767: } Btw, I don't think this state machine works anyway, for example: (foo, bar, (baz, quux) => { ... }); Could you add tests for such cases, to verify that the stuff inside is correctly identified as an arrow function (e.g., that the appropriate errors are produced).
On 2014/06/18 at 16:21:17, marja wrote: > https://codereview.chromium.org/160073006/diff/520001/src/scanner.h > File src/scanner.h (right): > > https://codereview.chromium.org/160073006/diff/520001/src/scanner.h#newcode767 > src/scanner.h:767: } > Btw, I don't think this state machine works anyway, for example: > > (foo, bar, (baz, quux) => { ... }); > > Could you add tests for such cases, to verify that the stuff inside is correctly identified as an arrow function (e.g., that the appropriate errors are produced). Mmmh, wouldn't this case be parsed the same as this?: (foo, bar, ((baz, quux) => { ... })); If I am understanding correctly, the way the grammar is specified, the expression would be picked as a comma-separated list of expressions, being the last expression an arrow function... So this would be correct input, am I right? Or did you mean that the ParamListFinder would have caused an error for that valid input? (Anyway, I am getting rid of ParamListFinder, CL to be updated soon.)
On 2014/06/24 16:08:39, aperez wrote: > On 2014/06/18 at 16:21:17, marja wrote: > > https://codereview.chromium.org/160073006/diff/520001/src/scanner.h > > File src/scanner.h (right): > > > > https://codereview.chromium.org/160073006/diff/520001/src/scanner.h#newcode767 > > src/scanner.h:767: } > > Btw, I don't think this state machine works anyway, for example: > > > > (foo, bar, (baz, quux) => { ... }); > > > > Could you add tests for such cases, to verify that the stuff inside is > correctly identified as an arrow function (e.g., that the appropriate errors are > produced). > > Mmmh, wouldn't this case be parsed the same as this?: > > (foo, bar, ((baz, quux) => { ... })); > > If I am understanding correctly, the way the grammar is specified, the > expression would be picked as a comma-separated list of expressions, > being the last expression an arrow function... So this would be correct > input, am I right? > > Or did you mean that the ParamListFinder would have caused an error > for that valid input? > > (Anyway, I am getting rid of ParamListFinder, CL to be updated soon.) Yes, I meant that ParamListFinder would have caused error for this valid input (it wouldn't have recognized the parameter list).
On 2014/06/18 at 15:15:15, marja wrote: > On 2014/06/18 15:08:02, marja wrote: > > Hmm, though, detecting "eval" and such complicates the "is comma separated > > parameter list" stuff a bit, I guess... > > Argh, and the fact that the function can declare itself strict complicates this even further. > > We could maybe go with the approach that just says "invalid strict mode parameter list" if there is eval or such in it, without pointing to the location. rossberg@, would that be acceptable? > > Like this: > > (foo, bar, eval) => { "use strict"; } > ^--------------^ > malformed strict mode parameter list This style of error reporting is now implemented in the last version of the patch set, which I have just uploaded. I have been able to remove ParamListFinder —so the Scanner does not get the performance hit— by tracking whether an expression is parenthesized once or more than once. Implementation-wise, this is two flag bits in PreParserExpression (one of the bits is set for parenthesized expressions, another for being parenthesized more than once), and an integer value in Expression for the parenthesization level. The only thing that PreParser cannot do now is detecting duplicated identifiers, which is tricky as PreParserIdentifier does not have the strings with the names of the identifiers (ugh!). For the moment I have decided to add a TODO item for revisiting duplicate identifier detection later on, so we can try to get this CL landed, and keep discussing how to solve the issue with duplicate identifiers.
The approach of not detecting duplicate identifiers when preparsing sounds good enough for the first version. I'll have a closer look at the code in the near future...
Alright, the latest patch set is much better! Yay! Could you do a round of cleanup, that is, make sure you're not adding dead code, duplicate code, unintentional changes etc. Some comments below. I have a list of stuff for myself which I need to check why they are needed, but meanwhile I wanted to unblock you... https://codereview.chromium.org/160073006/diff/540001/src/ast.h File src/ast.h (right): https://codereview.chromium.org/160073006/diff/540001/src/ast.h#newcode350 src/ast.h:350: void set_is_parenthesized(bool value) { Afaics this is never called with false (which would be a weird thing anyway... like, we have increased the parenthization level before and then we suddenly decide it's not parenthesized after all). So you could call this increase_parenthesization_level() instead. https://codereview.chromium.org/160073006/diff/540001/src/messages.js File src/messages.js (right): https://codereview.chromium.org/160073006/diff/540001/src/messages.js#newcode159 src/messages.js:159: strict_parameter_list: ["Malformed strict mode parameter list"], This error message is not about strict mode parameter lists, so you should call it "malformed_arrow_function_parameter_list" and the message should be "Malformed arrow function parameter list". https://codereview.chromium.org/160073006/diff/540001/src/parser.cc File src/parser.cc (right): https://codereview.chromium.org/160073006/diff/540001/src/parser.cc#newcode3310 src/parser.cc:3310: Vector<VariableProxy*> ParserTraits::ParameterListFromExpression( Here you have the same logic in 2 functions; would it be possible to have only one function which both checks if the expression is a parameter list and constructs the vector? https://codereview.chromium.org/160073006/diff/540001/src/parser.h File src/parser.h (right): https://codereview.chromium.org/160073006/diff/540001/src/parser.h#newcode561 src/parser.h:561: V8_INLINE Handle<String> EmptyIdentifierString(); Isn't this dead code? https://codereview.chromium.org/160073006/diff/540001/src/parser.h#newcode573 src/parser.h:573: int pos = RelocInfo::kNoPosition); Why is the position needed? https://codereview.chromium.org/160073006/diff/540001/src/parser.h#newcode595 src/parser.h:595: Vector<VariableProxy*> ParameterListFromExpression( Expression* expression); Nit: extra space. https://codereview.chromium.org/160073006/diff/540001/src/parser.h#newcode596 src/parser.h:596: bool IsValidArrowFunctionParameterList(Expression* expression); This function should go where the other IsThisAndThat functions are. https://codereview.chromium.org/160073006/diff/540001/src/parser.h#newcode597 src/parser.h:597: void InferFunctionName(FunctionLiteral* func_to_infer); Isn't this a stray definition? https://codereview.chromium.org/160073006/diff/540001/src/preparser.cc File src/preparser.cc (right): https://codereview.chromium.org/160073006/diff/540001/src/preparser.cc#newcod... src/preparser.cc:112: NULL, this->ast_value_factory()); this->ast_value_factory is always NULL, right? So why not use a default value for it in the FunctionState ctor? https://codereview.chromium.org/160073006/diff/540001/src/preparser.h File src/preparser.h (right): https://codereview.chromium.org/160073006/diff/540001/src/preparser.h#newcode466 src/preparser.h:466: ExpressionT ParseArrowFunctionLiteral(bool* ok); Isn't this one dead code? (And isn't the comment above wrong...?) https://codereview.chromium.org/160073006/diff/540001/src/preparser.h#newcode610 src/preparser.h:610: int length() const { This is used for duplicate parameter error message positions, right? Is it use elsewhere? Might be okay to just always return 0... the PreParser doesn't check duplicate positions yet anyway. (Or does something observable change if you return 0 here?) https://codereview.chromium.org/160073006/diff/540001/src/preparser.h#newcode... src/preparser.h:2476: ParserBase<Traits>::ParseArrowFunctionLiteralBody( Nit: reorder the functions here to match the order of the definitions. https://codereview.chromium.org/160073006/diff/540001/src/preparser.h#newcode... src/preparser.h:2500: parenthesized_function_ = false; // This Was set for this funciton only. 1) Why do you need to set parenthesized_function_ here? It's not set outside arrow functions anyway, because it's only set when we see the function keyword, right? Afaics, where this code originally was, it used to detect these cases: (function foo() { << this is parenthised, so parenthesized_function_ is set to true << here, parenthesized_function_ is set to false function bar() {}<< so that this function doesn't appear as parenthesized })(); But I don't think we need it here... 2) The same code is in the if branch and else branch 3) the comment is pretty malformed :) https://codereview.chromium.org/160073006/diff/540001/test/cctest/test-parsin... File test/cctest/test-parsing.cc (left): https://codereview.chromium.org/160073006/diff/540001/test/cctest/test-parsin... test/cctest/test-parsing.cc:916: v8::internal::FLAG_harmony_scoping = true; Why this change? https://codereview.chromium.org/160073006/diff/540001/test/cctest/test-parsin... File test/cctest/test-parsing.cc (right): https://codereview.chromium.org/160073006/diff/540001/test/cctest/test-parsin... test/cctest/test-parsing.cc:199: i::FLAG_harmony_arrow_functions = true; I don't think you need to add this to existing tests. When I said that you should make sure that existing tests run with arrow functions enabled, I meant the RunParserSync tests. For them, it should be enough to add this flag to the ParserFlagEnum and default_flags (like you did). https://codereview.chromium.org/160073006/diff/540001/test/cctest/test-parsin... test/cctest/test-parsing.cc:257: const char* good_code[] = { I whined about this change before, I'm still whining about it :) https://codereview.chromium.org/160073006/diff/540001/test/cctest/test-parsin... test/cctest/test-parsing.cc:357: preparser.set_allow_arrow_functions(true); And these shouldn't be needed either. For these tests, we don't run them with all experimental features anyway, so no need to make an exception for arrow functions. https://codereview.chromium.org/160073006/diff/540001/test/cctest/test-parsin... test/cctest/test-parsing.cc:1426: static const ParserFlag always_flags[] = { kAllowArrowFunctions }; Instead of this, you should add the flag to flags1 below, so the test gets run with and without it. https://codereview.chromium.org/160073006/diff/540001/test/cctest/test-parsin... test/cctest/test-parsing.cc:1700: i::FLAG_harmony_arrow_functions = true; Instead of this, you should use always_true_flags... Otherwise this test takes longer than needed, because it's run with both the flag and without the flag (via the default_flags permutation thingy) and then this FLAG just overrides them. https://codereview.chromium.org/160073006/diff/540001/test/cctest/test-parsin... test/cctest/test-parsing.cc:1765: i::FLAG_harmony_arrow_functions = true; Here too... in all places, where you need the flag, you should use always_true_flags. https://codereview.chromium.org/160073006/diff/540001/test/cctest/test-parsin... test/cctest/test-parsing.cc:1824: i::FLAG_harmony_arrow_functions = true; And this is completely unnecessary; the test should be ran both with and without arrow functions (and the default_flags permutation takes care of it if you remove this line). (And this goes on throughout this file; pls see all occurrences of i::FLAG_harmony_arrow_functions.)
On 2014/06/26 at 14:38:14, marja wrote: > Alright, the latest patch set is much better! Yay! Reading this made my day yesterday :-) > Could you do a round of cleanup, that is, make sure you're not adding dead code, duplicate code, unintentional changes etc. Indeed! > https://codereview.chromium.org/160073006/diff/540001/src/ast.h#newcode350 > src/ast.h:350: void set_is_parenthesized(bool value) { > Afaics this is never called with false (which would be a weird thing anyway... like, we have increased the parenthization level before and then we suddenly decide it's not parenthesized after all). > > So you could call this increase_parenthesization_level() instead. Sounds good to me. Also I will remove the logic to set the parenthesization level to zero (I can't think of a case in which we would want to reset the counter). > https://codereview.chromium.org/160073006/diff/540001/src/parser.cc#newcode3310 > src/parser.cc:3310: Vector<VariableProxy*> ParserTraits::ParameterListFromExpression( > Here you have the same logic in 2 functions; would it be possible to have only one function which both checks if the expression is a parameter list and constructs the vector? > https://codereview.chromium.org/160073006/diff/540001/src/parser.h#newcode561 > src/parser.h:561: V8_INLINE Handle<String> EmptyIdentifierString(); > Isn't this dead code? Not really, this is used by the code generation in the follow-up CL that I will be posting after this is landed. Of course it does not really make sense to introduce this while it is not used, so I am moving it to the follow-up CL. > https://codereview.chromium.org/160073006/diff/540001/src/parser.h#newcode573 > src/parser.h:573: int pos = RelocInfo::kNoPosition); > Why is the position needed? It is not needed in the latest version of the CL, I am removing this bit. > https://codereview.chromium.org/160073006/diff/540001/src/parser.h#newcode595 > src/parser.h:595: Vector<VariableProxy*> ParameterListFromExpression( Expression* expression); > Nit: extra space. Oh, somehow the presubmit checks did not catch this. > https://codereview.chromium.org/160073006/diff/540001/src/parser.h#newcode597 > src/parser.h:597: void InferFunctionName(FunctionLiteral* func_to_infer); > Isn't this a stray definition? This is stray, yes. But InferFunctionName() is used in the code generation patch, so I will be moving that to the follow-up CL as well. > https://codereview.chromium.org/160073006/diff/540001/src/preparser.cc#newcod... > src/preparser.cc:112: NULL, this->ast_value_factory()); > this->ast_value_factory is always NULL, right? So why not use a default value for it in the FunctionState ctor? Sure, I can add a NULL default value for the constructor. > https://codereview.chromium.org/160073006/diff/540001/src/preparser.h#newcode466 > src/preparser.h:466: ExpressionT ParseArrowFunctionLiteral(bool* ok); > Isn't this one dead code? (And isn't the comment above wrong...?) This is to be used in the follow-up CL in Parser::ParseLazy(), like this: if (shared_info->is_arrow()) result = reinterpret_cast<FunctionLiteral*>(ParseArrowFunctionLiteral(&ok)); else result = ParseFunctionLiteral( /* ... */ ); I have been thinking a bit about this, and IMHO it would be better to avoid having two code paths. The solution would be to just call ParseAssignmentExpression(), like this: if (shared_info->is_arrow()) result = reinterpret_cast<FunctionLiteral*>(ParseAssignmentExpression(false, &ok)); else result = ParseFunctionLiteral( /* ... */ ); Anyway, this alternate version of ParseArrowFunctionLiteral() is dead code for the purpose of this CL, so I am removing it. > https://codereview.chromium.org/160073006/diff/540001/src/preparser.h#newcode610 > src/preparser.h:610: int length() const { > This is used for duplicate parameter error message positions, right? Is it use elsewhere? Might be okay to just always return 0... the PreParser doesn't check duplicate positions yet anyway. (Or does something observable change if you return 0 here?) The way error reporting is done now does not need the lengths, so you are right and this may as well return 0. > https://codereview.chromium.org/160073006/diff/540001/src/preparser.h#newcode... > src/preparser.h:2500: parenthesized_function_ = false; // This Was set for this funciton only. > 1) Why do you need to set parenthesized_function_ here? It's not set outside arrow functions anyway, because it's only set when we see the function keyword, right? Right, it is only set when “(function ...” is found, so it will never be true for arrow functions. I think at some point I had in mind that it would be nice to detect cases like “(() => {})()” in which the arrow function is called right away, but it is certainly more difficult that for normal functions. Also, in most cases we probably want arrow functions parsed lazily even if they are parenthesized, like in the following example where two arrow functions are parenthesized and it can happen that one of them is never used: var a = [1, 2, 3, 4, 5].map(doSquares ? (item => item * item) : (item => item)); In conclusion: let's leave this out for now and not do premature optimizations. If a similar optimization is needed later on for arrow functions, add it with a benchmark that demonstrates the validity of the optimization. > https://codereview.chromium.org/160073006/diff/540001/test/cctest/test-parsin... > test/cctest/test-parsing.cc:916: v8::internal::FLAG_harmony_scoping = true; > Why this change? That was unintentional. > https://codereview.chromium.org/160073006/diff/540001/test/cctest/test-parsin... > test/cctest/test-parsing.cc:199: i::FLAG_harmony_arrow_functions = true; > I don't think you need to add this to existing tests. > > When I said that you should make sure that existing tests run with arrow functions enabled, I meant the RunParserSync tests. > > For them, it should be enough to add this flag to the ParserFlagEnum and default_flags (like you did). As you can image, I understood it would be good to have arrow functions enabled in all tests. I'll remove the flag for the ones which do not involve arrow functions. > https://codereview.chromium.org/160073006/diff/540001/test/cctest/test-parsin... > test/cctest/test-parsing.cc:257: const char* good_code[] = { > I whined about this change before, I'm still whining about it :) Reverting. In the follow-up CL with the codegen I think this can be added back, to test that the preparse data is used for arrow functions as well. > https://codereview.chromium.org/160073006/diff/540001/test/cctest/test-parsin... > test/cctest/test-parsing.cc:357: preparser.set_allow_arrow_functions(true); > And these shouldn't be needed either. For these tests, we don't run them with all experimental features anyway, so no need to make an exception for arrow functions. Ok. > https://codereview.chromium.org/160073006/diff/540001/test/cctest/test-parsin... > test/cctest/test-parsing.cc:1765: i::FLAG_harmony_arrow_functions = true; > Here too... in all places, where you need the flag, you should use always_true_flags. Indeed.
Alright, thanks, I'll have a look at the latest patch set now.
aperez@, btw, have you ran all the relevant tests; can I assume they pass? I usually run this set: tools/run-tests.py -j28 --arch=x64 --mode=debug --report --no-presubmit mozilla/* test262/* cctest/* mjsunit/* webkit/* test262 and mozilla are not checked out by default, see here how to add them to your checkout: https://code.google.com/p/v8/wiki/Testing
Even better! The comments I have are pretty minor (basically just 2 non-trivial ones), so I hope we get this sorted out soon. Re: discussion I don't think we need to make arrow functions lazy. They're small, so switching to PreParser for the function body is not going to make a difference. Let's keep it simple and not do lazy arrow functions. https://codereview.chromium.org/160073006/diff/580001/src/parser.cc File src/parser.cc (right): https://codereview.chromium.org/160073006/diff/580001/src/parser.cc#newcode3322 src/parser.cc:3322: Vector<VariableProxy*> ParserTraits::ParameterListFromExpression( Comment from the previous round: Here you have the same logic in 2 functions; would it be possible to have only one function which both checks if the expression is a parameter list and constructs the vector? https://codereview.chromium.org/160073006/diff/580001/src/preparser.h File src/preparser.h (right): https://codereview.chromium.org/160073006/diff/580001/src/preparser.h#newcode893 src/preparser.h:893: // TODO(aperezdc): Would allowing lazy compilation in preparser make sense? Doesn't seem very sensemaking to me, when we are in PreParser, we're already parsing a lazy function and cannot get lazier than that :) https://codereview.chromium.org/160073006/diff/580001/src/preparser.h#newcode899 src/preparser.h:899: bool IsDeclared(const PreParserIdentifier& identifier) const { What cases does this IsDeclared actually detect? Pls add a comment. https://codereview.chromium.org/160073006/diff/580001/src/preparser.h#newcode... src/preparser.h:1274: bool IsValidArrowFunctionParameterList(PreParserExpression expression) { Pls reorder these functions so that they're in the same order in ParserTraits and PreParserTraits. (So this Is... function should go where the other IsFoo functions are). https://codereview.chromium.org/160073006/diff/580001/src/preparser.h#newcode... src/preparser.h:1280: void CheckConflictingVarDeclarations( Nit: formatting. You'd probably want to have a look how you can use clang-format in your favourite editor :) https://codereview.chromium.org/160073006/diff/580001/src/preparser.h#newcode... src/preparser.h:2539: this->CheckStrictFunctionNameAndParameters(this->EmptyIdentifier(), This is not inside if (strict_mode() == STRICT) because arrow function parameter lists are always strict, right? Pls add a comment. https://codereview.chromium.org/160073006/diff/580001/src/preparser.h#newcode... src/preparser.h:2557: return this->EmptyExpression(); This will change when you start generating the arrow function AST for real, right? Pls add a comment. https://codereview.chromium.org/160073006/diff/580001/test/cctest/test-parsin... File test/cctest/test-parsing.cc (right): https://codereview.chromium.org/160073006/diff/580001/test/cctest/test-parsin... test/cctest/test-parsing.cc:975: { " start;\n" Why this change? https://codereview.chromium.org/160073006/diff/580001/test/cctest/test-parsin... test/cctest/test-parsing.cc:1457: This file seems to contain some unintentional changes still.. https://codereview.chromium.org/160073006/diff/580001/test/cctest/test-parsin... test/cctest/test-parsing.cc:2817: TEST(ErrorsArrowFunctions) { I like this test! :)
On 2014/07/01 at 07:01:12, marja wrote: > aperez@, btw, have you ran all the relevant tests; can I assume they pass? Yes. > I usually run this set: > > tools/run-tests.py -j28 --arch=x64 --mode=debug --report --no-presubmit mozilla/* test262/* cctest/* mjsunit/* webkit/* Typically while developing I run only cctest, mjsunit and webkit/fast in debug mode, and then before uploading the CLs I add test262 and mozilla to be extra sure nothing gets broken :-)
On 2014/07/01 07:01:12, marja wrote: > I usually run this set: > > tools/run-tests.py -j28 --arch=x64 --mode=debug --report --no-presubmit > mozilla/* test262/* cctest/* mjsunit/* webkit/* DBC: the test driver performs prefix matching anyway, so you can save a bit of typing and simply write "mozilla test262 cctest mjsunit webkit" as the list of tests.
rossberg@, btw, one thing: This CL makes V8 crash if the arrow function allowing flag is passed, since the actual implementation is missing (this will return NULL expressions, and if the upper layer tries to use them, it'll crash). Is this okay, or will we be burned by this because of some fuzzing tests?
On 2014/07/03 07:57:14, marja wrote: > This CL makes V8 crash if the arrow function allowing flag is passed, since the > actual implementation is missing (this will return NULL expressions, and if the > upper layer tries to use them, it'll crash). Is this okay, or will we be burned > by this because of some fuzzing tests? The fuzzer does not turn on the harmony flag, only the es-staging flag (harmony containing half-done stuff is one of the reasons why we introduced the latter). That said, the parser should still spit out something well-formed, and not make later phases crash. So that you can e.g. try it in d8, and write JS test cases for parsing (which would be nice to have, btw). For example, for now just create a dummy FunctionLiteral for all arrow functions.
On 2014/07/01 at 07:23:40, marja wrote: > I don't think we need to make arrow functions lazy. They're small, so switching to PreParser for the function body is not going to make a difference. Let's keep it simple and not do lazy arrow functions. This sounds reasonable at first: for most arrow functions one certainly would expect small bodies. But then it is very common for callback functions in Node.js programs to end up having big bodies, and those functions are candidates to be written as arrow functions... WDYT? Should I go ahead and remove the bits for lazy parsing of arrow functions? For the moment I am slighly in favor of keeping lazy arrow functions. > https://codereview.chromium.org/160073006/diff/580001/src/parser.cc > File src/parser.cc (right): > > https://codereview.chromium.org/160073006/diff/580001/src/parser.cc#newcode3322 > src/parser.cc:3322: Vector<VariableProxy*> ParserTraits::ParameterListFromExpression( > Comment from the previous round: > > Here you have the same logic in 2 functions; would it be possible to have only > one function which both checks if the expression is a parameter list and > constructs the vector? I am going to put both back in the same function. > https://codereview.chromium.org/160073006/diff/580001/src/preparser.h > File src/preparser.h (right): > > https://codereview.chromium.org/160073006/diff/580001/src/preparser.h#newcode893 > src/preparser.h:893: // TODO(aperezdc): Would allowing lazy compilation in preparser make sense? > Doesn't seem very sensemaking to me, when we are in PreParser, we're already parsing a lazy function and cannot get lazier than that :) Haha. That is what I thought it would be, so I had already written “return false” from the first versions of the CL :-) > https://codereview.chromium.org/160073006/diff/580001/src/preparser.h#newcode899 > src/preparser.h:899: bool IsDeclared(const PreParserIdentifier& identifier) const { > What cases does this IsDeclared actually detect? Pls add a comment. This is a leftover from the previous parameter checking approach. Right now the PreParser never calls this because PreParserTraits::ParameterListFromExpression() always returns an empty Vector. I am removing this and making IsDeclared() and DeclareParameter() just dummies. > https://codereview.chromium.org/160073006/diff/580001/src/preparser.h#newcode... > src/preparser.h:1274: bool IsValidArrowFunctionParameterList(PreParserExpression expression) { > Pls reorder these functions so that they're in the same order in ParserTraits and PreParserTraits. (So this Is... function should go where the other IsFoo functions are). > > https://codereview.chromium.org/160073006/diff/580001/src/preparser.h#newcode... > src/preparser.h:1280: void CheckConflictingVarDeclarations( > Nit: formatting. > > You'd probably want to have a look how you can use clang-format in your favourite editor :) It turns out that clang-format comes now with Vim support. Thanks for pointing it out, I have gone without checking that for a long while, it seems. > https://codereview.chromium.org/160073006/diff/580001/src/preparser.h#newcode... > src/preparser.h:2539: this->CheckStrictFunctionNameAndParameters(this->EmptyIdentifier(), > This is not inside if (strict_mode() == STRICT) because arrow function parameter lists are always strict, right? Pls add a comment. Exactly. I will add a comment here. > https://codereview.chromium.org/160073006/diff/580001/src/preparser.h#newcode... > src/preparser.h:2557: return this->EmptyExpression(); > This will change when you start generating the arrow function AST for real, right? Pls add a comment. Exactly. > https://codereview.chromium.org/160073006/diff/580001/test/cctest/test-parsin... > test/cctest/test-parsing.cc:1457: > This file seems to contain some unintentional changes still.. All should be gone in the next update. > https://codereview.chromium.org/160073006/diff/580001/test/cctest/test-parsin... > test/cctest/test-parsing.cc:2817: TEST(ErrorsArrowFunctions) { > I like this test! :) :)
On 2014/07/03 at 09:07:11, rossberg wrote: > On 2014/07/03 07:57:14, marja wrote: > > This CL makes V8 crash if the arrow function allowing flag is passed, since the > > actual implementation is missing (this will return NULL expressions, and if the > > upper layer tries to use them, it'll crash). Is this okay, or will we be burned > > by this because of some fuzzing tests? > > The fuzzer does not turn on the harmony flag, only the es-staging flag (harmony containing half-done stuff is one of the reasons why we introduced the latter). > > That said, the parser should still spit out something well-formed, and not make later phases crash. So that you can e.g. try it in d8, and write JS test cases for parsing (which would be nice to have, btw). For example, for now just create a dummy FunctionLiteral for all arrow functions. Sure, I am adding a dummy FunctionLiteral here so d8 won't crash. That will also allow me to uncomment some test cases which rely on having a non-NULL expression to be able to call increase_parenthesization_level() on them.
LGTM assuming you fix these comments. Thanks for working on this! It got a lot cleaner and more elegant in the last couple of patch sets. https://codereview.chromium.org/160073006/diff/600001/src/preparser.h File src/preparser.h (right): https://codereview.chromium.org/160073006/diff/600001/src/preparser.h#newcode722 src/preparser.h:722: ? kMultiParenthesizedExpression This is not very readable. Pls clang-format. Afaics, it suggests: code |= is_parenthesized() ? kMultiParenthesizedExpression : kParenthesizedExpression; If needed, you can add parens like this: code |= (is_parenthesized() ? kMultiParenthesizedExpression : kParenthesizedExpression); https://codereview.chromium.org/160073006/diff/600001/src/preparser.h#newcode... src/preparser.h:2487: // The vector has the items in reverse order. Is it necessary to have the items in reverse order? It seems you always add the right side to the collector first, but couldn't you as easily add the left side first? https://codereview.chromium.org/160073006/diff/600001/src/preparser.h#newcode... src/preparser.h:2506: CHECK_OK); I don't think it makes sense to put CHECK_OK inside a return statement! (See how the macro is defined.) You might get the right thing by just putting ok here instead of CHECK_OK... right? https://codereview.chromium.org/160073006/diff/600001/test/cctest/test-parsin... File test/cctest/test-parsing.cc (right): https://codereview.chromium.org/160073006/diff/600001/test/cctest/test-parsin... test/cctest/test-parsing.cc:1441: Hey, no adding whitespaces :)
On 2014/07/04 at 08:21:14, marja wrote: > LGTM assuming you fix these comments. Yay! > Thanks for working on this! It got a lot cleaner and more elegant in the last couple of patch sets. Indeed. I agree myself on that. Thanks also to you for all the comments while going back-and-forth until reaching the current state. > https://codereview.chromium.org/160073006/diff/600001/src/preparser.h > File src/preparser.h (right): > > https://codereview.chromium.org/160073006/diff/600001/src/preparser.h#newcode722 > src/preparser.h:722: ? kMultiParenthesizedExpression > This is not very readable. Pls clang-format. > > Afaics, it suggests: > > code |= is_parenthesized() ? kMultiParenthesizedExpression > : kParenthesizedExpression; > > If needed, you can add parens like this: I will update with the clang-format suggestion, I think the extra parenthesis do not contribute extra readability. > https://codereview.chromium.org/160073006/diff/600001/src/preparser.h#newcode... > src/preparser.h:2487: // The vector has the items in reverse order. > Is it necessary to have the items in reverse order? It seems you always add the right side to the collector first, but couldn't you as easily add the left side first? The AST subtree for comma-separated expressions has all leaves on the right side, for example the expression: (a, b, c, d) would be parsed into: , / \ , d / \ , c / \ a b By handling the right leaves first, the depth of the call stack due to the recursive invocation will never be more than one level for binop->right(), and for the last recursive invocation on binop->left() the compiler can make a tail-call optimization. (Actually, at some point I had the algorithm written with a “while” loop, but then I noticed that the compiler would optimize the recursive version as well, and personally I find the recursive version easier to understand). > https://codereview.chromium.org/160073006/diff/600001/src/preparser.h#newcode... > src/preparser.h:2506: CHECK_OK); > I don't think it makes sense to put CHECK_OK inside a return statement! > > (See how the macro is defined.) Indeed. I think it is better to follow the same style as in the rest of the parser code, for coherence. Like this: ExpressionT expression = ParseArrowFunctionLiteralBody(... , CHECK_OK); return expression;
Re: tail recursion, sounds a bit like premature optimization; I'm not sure. How long does the arrow function parameter list need to be before you start seeing a difference? You can test with parser-shell.
Btw, I'm okay with landing this and you can investigate the parameter list un-reversing as a follow-up. Would you like me to land the latest patch set?
On 2014/07/07 at 07:07:49, marja wrote: > Btw, I'm okay with landing this and you can investigate the parameter list un-reversing as a follow-up. Would you like me to land the latest patch set? I have made a quick check running parser-shell on an input file with 50 arrow functions, each one with 1000 parameters: the average parsing times are the about the same in both cases. I am going to upload the version that collects the parameters in normal order in a moment, let's better land that one.
Sounds good. I think it's good to start with being less obscure, until proven that being obscure gives some benefit :)
On 2014/07/07 at 08:27:37, marja wrote: > Sounds good. I think it's good to start with being less obscure, until proven that being obscure gives some benefit :) FWIW, I have already updated the CL earlier in the morning to return the parameter list in natural order, so this should be ready to land now \o/
Umm, this doesn't compile for me, does it compile for you? -Werror=unused-but-set-variable
On 2014/07/07 at 11:49:44, marja wrote: > Umm, this doesn't compile for me, does it compile for you? > > -Werror=unused-but-set-variable My bad, I did not see this one because most of the time I work with Clang and I have to pass “werror=no”. After fixing this one issue I am making sure that building with GCC succeeds, and then re-upload.
On 2014/07/07 at 12:46:28, aperez wrote: > On 2014/07/07 at 11:49:44, marja wrote: > > Umm, this doesn't compile for me, does it compile for you? > > > > -Werror=unused-but-set-variable > > My bad, I did not see this one because most of the time I work with Clang > and I have to pass “werror=no”. After fixing this one issue I am making > sure that building with GCC succeeds, and then re-upload. Building with GCC and “werror=yes” works now.
Unfortunately, "make qc" still fails :( It's a presubmit error this time. Are you uploading with --bypass-hooks or how did it actually let you upload the patch set?!
On 2014/07/07 13:18:45, marja wrote: > Unfortunately, "make qc" still fails :( It's a presubmit error this time. Are > you uploading with --bypass-hooks or how did it actually let you upload the > patch set?! ... also, I *could* just hijack this CL and upload a fixed patch set, I just thought that it would be good to get the workflow ironed out, so that you're using the same compiler and the same checks than we do, so that when I try to commit it actually succeeds...
On 2014/07/07 at 13:21:05, marja wrote: > On 2014/07/07 13:18:45, marja wrote: > > Unfortunately, "make qc" still fails :( It's a presubmit error this time. Are > > you uploading with --bypass-hooks or how did it actually let you upload the > > patch set?! > > ... also, I *could* just hijack this CL and upload a fixed patch set, I just thought that it would be good to get the workflow ironed out, so that you're using the same compiler and the same checks than we do, so that when I try to commit it actually succeeds... It seems that my copy of depot_tools was a bit old, and some things have changed in cpplint. I have updated it, and cpplint picked a few things more, which are now fixed.
Okay, now this gets silly. make qc runs, but I get this error when I'm trying to commit: ** Presubmit Warnings ** Your patch is not formatted, please run git cl format
On 2014/07/07 15:01:18, marja wrote: > Okay, now this gets silly. make qc runs, but I get this error when I'm trying to > commit: > > ** Presubmit Warnings ** > Your patch is not formatted, please run git cl format Hmm, so the thing seems to be that maybe you don't have clang format in your checkout. This whine should occur when you git cl upload. So pls run "make dependencies" to get clang format and try to upload again.. you should see this whine. :)
On 2014/07/07 at 15:08:17, marja wrote: > On 2014/07/07 15:01:18, marja wrote: > > Okay, now this gets silly. make qc runs, but I get this error when I'm trying to > > commit: > > > > ** Presubmit Warnings ** > > Your patch is not formatted, please run git cl format > > Hmm, so the thing seems to be that maybe you don't have clang format in your checkout. This whine should occur when you git cl upload. > > So pls run "make dependencies" to get clang format and try to upload again.. you should see this whine. :) Mmmhh, this does not show up here somehow. Let's see if doing “git clean -xdf && make dependencies” does the trick...
On 2014/07/07 at 15:40:22, aperez wrote: > On 2014/07/07 at 15:08:17, marja wrote: > > On 2014/07/07 15:01:18, marja wrote: > > > Okay, now this gets silly. make qc runs, but I get this error when I'm trying to > > > commit: > > > > > > ** Presubmit Warnings ** > > > Your patch is not formatted, please run git cl format > > > > Hmm, so the thing seems to be that maybe you don't have clang format in your checkout. This whine should occur when you git cl upload. > > > > So pls run "make dependencies" to get clang format and try to upload again.. you should see this whine. :) > > Mmmhh, this does not show up here somehow. Let's see if doing > “git clean -xdf && make dependencies” does the trick... Ok, that worked. Now this issue shows and I can also do “git cl format” and it will work. But clang-format wants to do some odd-looking things to “test/cctest/test-parsing.cc”... Should we trust it?
On 2014/07/07 20:30:05, aperez wrote: > Ok, that worked. Now this issue shows and I can also do “git cl format” > and it will work. But clang-format wants to do some odd-looking things > to “test/cctest/test-parsing.cc”... Should we trust it? What does it want to do? Can you upload a patch set with what it suggests?
On 2014/07/07 at 20:32:52, marja wrote: > On 2014/07/07 20:30:05, aperez wrote: > > > Ok, that worked. Now this issue shows and I can also do “git cl format” > > and it will work. But clang-format wants to do some odd-looking things > > to “test/cctest/test-parsing.cc”... Should we trust it? > > What does it want to do? Can you upload a patch set with what it suggests? The last update has what “git cl format” produces. What it does with “test-parsing.cc” making two columns of the data... I am not completely sure about that.
On 2014/07/08 04:10:47, aperez wrote: > On 2014/07/07 at 20:32:52, marja wrote: > > On 2014/07/07 20:30:05, aperez wrote: > > > > > Ok, that worked. Now this issue shows and I can also do “git cl format” > > > and it will work. But clang-format wants to do some odd-looking things > > > to “test/cctest/test-parsing.cc”... Should we trust it? > > > > What does it want to do? Can you upload a patch set with what it suggests? > > The last update has what “git cl format” produces. What it does with > “test-parsing.cc” making two columns of the data... I am not completely > sure about that. Yes, that is truly horrible. I filed a bug against clang-format. So let's do it like this: I'm going to commit this patch set as is. And if we get a decision that either 1) it's okay to rebel against clang-format or 2) clang-format changes its behavior, we'll just commit another CL which reformats test-parsing.cc back to sanity again.
Message was sent while issue was closed.
Committed patchset #21 manually as r22265 (presubmit successful).
Message was sent while issue was closed.
I reverted this because it fails ASAN: http://build.chromium.org/p/client.v8/builders/V8%20Linux64%20ASAN/builds/155... Please create a fixed CL so that the first patch set is the latest patch set of this, unfixed, and the second patch set is the fix.
Message was sent while issue was closed.
On 2014/07/08 07:49:59, marja wrote: > I reverted this because it fails ASAN: > > http://build.chromium.org/p/client.v8/builders/V8%20Linux64%20ASAN/builds/155... > > Please create a fixed CL so that the first patch set is the latest patch set of > this, unfixed, and the second patch set is the fix. Also, we agreed that it's okay to commit non-clang-formatted code in this case, so your fixed version can have un-formatted test-parsing.cc.
Message was sent while issue was closed.
On 2014/07/08 at 08:18:54, marja wrote: > On 2014/07/08 07:49:59, marja wrote: > > I reverted this because it fails ASAN: > > > > http://build.chromium.org/p/client.v8/builders/V8%20Linux64%20ASAN/builds/155... > > > > Please create a fixed CL so that the first patch set is the latest patch set of > > this, unfixed, and the second patch set is the fix. > > Also, we agreed that it's okay to commit non-clang-formatted code in this case, so your fixed version can have un-formatted test-parsing.cc. I am on this. And thanks for filing a bug for clang-format :)
Message was sent while issue was closed.
On 2014/07/08 at 15:00:02, aperez wrote: > On 2014/07/08 at 08:18:54, marja wrote: > > On 2014/07/08 07:49:59, marja wrote: > > > I reverted this because it fails ASAN: > > > > > > http://build.chromium.org/p/client.v8/builders/V8%20Linux64%20ASAN/builds/155... > > > > > > Please create a fixed CL so that the first patch set is the latest patch set of > > > this, unfixed, and the second patch set is the fix. > > > > Also, we agreed that it's okay to commit non-clang-formatted code in this case, so your fixed version can have un-formatted test-parsing.cc. > > I am on this. And thanks for filing a bug for clang-format :) Mmhh, I cannot get ASAN to fail here, just some timed out tests which pass fine if the timeout time is changed. JFTR, this is what I am doing: % make asan=/usr/lib/ccache/bin/clang++ -j4 x64.release.check (Same result on debug builds)
Message was sent while issue was closed.
Luckily, this one is easy to fix by just looking at the log I linked to. The problem is that Collector::ToVector returns a vector allocated with Vector::New, and you need to call Dispose on it at some point. |