|
|
Description[Interpreter] Reduce temporary register usage in generated bytecode.
This change adds new flavors of Visit() methods for obtaining
expression results:
- VisitForAccumulatorValue() which places result in the accumulator.
- VisitForRegisterValue() which places the result in a register.
- VisitForEffect() which evaluates the expression and discards the result.
The targets of these calls place the expression result with
result_scope()->SetResultInRegister() or
result_scope()->SetResultInAccumulator().
By being smarter about result locations, there's less temporary
register usage. However, we now have a hazard with assignments
in binary expressions that didn't exist before. This change detects and
DCHECK's when a hazard is detected. A follow on CL will address this.
There are consequential changes to test-bytecode-generator.cc and
this change also adds new bytecode macros A(x, n) and THIS(n) for
register file entries for arguments and this.
BUG=v8:4280
LOG=NO
Committed: https://crrev.com/339e0c804ec60347d401a09bb62182dabb7ee54c
Cr-Commit-Position: refs/heads/master@{#31445}
Patch Set 1 #Patch Set 2 : Rebase. #Patch Set 3 : Rebase. #
Total comments: 22
Patch Set 4 : Incorporate majority of the review comments in https://codereview.chromium.org/1392933002/ #Patch Set 5 : VisitForEffect. #Patch Set 6 : Update comment in BytecodeGenerator::VisitCall(). #
Total comments: 2
Patch Set 7 : No default. #Patch Set 8 : nop. #Patch Set 9 : Remove result_register_ from BytecodeGenerator. #Patch Set 10 : Detect and correct for assignments in binary expressions. #Patch Set 11 : Additional comments, unbreak test. #Patch Set 12 : Introduce top-level result scope and simplify allocation/free. #Patch Set 13 : Rebase and switch to detection of bad assignments only. #Patch Set 14 : Remove dead code. #
Total comments: 81
Patch Set 15 : Incorporate review comments. #Patch Set 16 : Missed comments. #
Total comments: 22
Patch Set 17 : Incorporate comments from rmcilroy on patch set 16. #Patch Set 18 : Rebase and additional check in LoadLiteral for effect. #Patch Set 19 : Remove an unnecessary TemporaryRegisterScope. #Patch Set 20 : Fix signed-unsigned comparison. #
Total comments: 8
Patch Set 21 : Another signed-unsigned nit. #Patch Set 22 : Undo some git cl formatting of test-bytecode-generator.cc #Patch Set 23 : Incorporate comments from mstarzinger. #Messages
Total messages: 34 (9 generated)
oth@chromium.org changed reviewers: + mstarzinger@chromium.org, rmcilroy@chromium.org
mstarzinger@chromium.org, : Please review changes in interpreter. rmcilroy@chromium.org: Please review changes in interpreter and test. Thanks!
https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... File src/interpreter/bytecode-generator.cc (right): https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... src/interpreter/bytecode-generator.cc:104: virtual void RegisterSet(Register reg) = 0; Not sure I like these names. How about SetResultInAccumulator() and SetResultInRegister(). https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... src/interpreter/bytecode-generator.cc:125: builder()->LoadAccumulatorWithRegister(reg); As discussed offline, could we add a DCHECK to make sure that 1 and only 1 of these gets called on a given Expression scope. https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... src/interpreter/bytecode-generator.cc:153: As discussed offline, could we also add a NullResultScope which would DCHECK if either AccumulatorSet or RegisterSet are called, and is used when calling a normal "Visit" (we will presumably need to override Visit to do this). https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... src/interpreter/bytecode-generator.cc:551: builder()->LoadNamedProperty(obj, feedback_index(slot), language_mode()); AccumulatorSet() https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... src/interpreter/bytecode-generator.cc:567: builder()->StoreAccumulatorInRegister(destination); I was chatting to Danno about this and he made a very good point where this optimization could potentially cause incorrect semantics. If we have the following code: var a = 1; b = a + (a = 2); The lhs of the expression would evaluate down to a simple Register(a), and the rhs would evaluate down to ( 2 -> Register(a), Register(a)). Since we have updated a to "2" in the rhs, that means the result of the add operation would be wrong (it would be 2 + 2 rather than 1 + 2). I'm not sure how we solve this, we probably need to check the expression stack and only do this optimization if there are no assignment operations in any of expressions. Any thoughts? https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... src/interpreter/bytecode-generator.cc:728: // a param. Hence explicit allocation and store. If there were no other arguments (or all the arguments happened to be sequential params or locals) then it would actually be possible to use a parameter or local registers (and might be a nice optimization for wrapper functions which call another function with the same arguments). Definitely a TODO for future work though :). Could you clarify the comment saying the "Receiver needs to be in a temporary register so that arguments can be in subsequent ordered registers". https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... src/interpreter/bytecode-generator.cc:829: UNREACHABLE(); AcumulatorSet ? https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... File src/interpreter/bytecode-generator.h (right): https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... src/interpreter/bytecode-generator.h:49: void VisitForAccumulatorValue(Expression* expression); Could we have some comments on when these are expected to be used. https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... src/interpreter/bytecode-generator.h:50: void VisitForRegisterValue(Expression* expression, Could VisitForRegisterValue just return the Register (with MUST_USE_RESULT) rather than having a separate GetResultRegister? Also, would it make sense to have the register stored in the RegisterResultScope instead of in the BytecodeGenerator? https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... src/interpreter/bytecode-generator.h:53: void DropResultRegister(); Doesn't look like this is necessary as a seperate function since it's only used by GetResultRegister
https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... File src/interpreter/bytecode-generator.cc (right): https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... src/interpreter/bytecode-generator.cc:567: builder()->StoreAccumulatorInRegister(destination); On 2015/10/08 16:54:25, rmcilroy wrote: > I was chatting to Danno about this and he made a very good point where this > optimization could potentially cause incorrect semantics. If we have the > following code: > > var a = 1; > b = a + (a = 2); > > The lhs of the expression would evaluate down to a simple Register(a), and the > rhs would evaluate down to ( 2 -> Register(a), Register(a)). Since we have > updated a to "2" in the rhs, that means the result of the add operation would be > wrong (it would be 2 + 2 rather than 1 + 2). I'm not sure how we solve this, we > probably need to check the expression stack and only do this optimization if > there are no assignment operations in any of expressions. Any thoughts? Had a further chat with Danno about this, if we keep track of the registers we are using for an expression, then we might be able to keep this as is, but then when we get to an assignment operation, if it is to a local / param we would walk this list of expression registers, and if any of them are the same local / param, then we would emit code to copy the local to a temporary register, and use that register instead. We can chat about this tomorrow in-person.
Thanks, comments incorporated with a couple of exceptions discussed. Will progress the mid-expression register mutation issue before further review. https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... File src/interpreter/bytecode-generator.cc (right): https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... src/interpreter/bytecode-generator.cc:104: virtual void RegisterSet(Register reg) = 0; On 2015/10/08 16:54:26, rmcilroy wrote: > Not sure I like these names. How about SetResultInAccumulator() and > SetResultInRegister(). Done. https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... src/interpreter/bytecode-generator.cc:125: builder()->LoadAccumulatorWithRegister(reg); On 2015/10/08 16:54:26, rmcilroy wrote: > As discussed offline, could we add a DCHECK to make sure that 1 and only 1 of > these gets called on a given Expression scope. Done. https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... src/interpreter/bytecode-generator.cc:153: On 2015/10/08 16:54:26, rmcilroy wrote: > As discussed offline, could we also add a NullResultScope which would DCHECK if > either AccumulatorSet or RegisterSet are called, and is used when calling a > normal "Visit" (we will presumably need to override Visit to do this). If the ast visitor is not in an expression then the result scope is null so SetResultInAccumulator/SetResultInRegister fail. The normal visit is already implemented by DEFINE_AST_VISITOR_SUBCLASS_MEMBERS so if there's additional value on this path we'd need to drop the definition in the macro and push the change elsewhere. https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... src/interpreter/bytecode-generator.cc:551: builder()->LoadNamedProperty(obj, feedback_index(slot), language_mode()); On 2015/10/08 16:54:26, rmcilroy wrote: > AccumulatorSet() Done. https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... src/interpreter/bytecode-generator.cc:567: builder()->StoreAccumulatorInRegister(destination); On 2015/10/08 17:00:09, rmcilroy wrote: > On 2015/10/08 16:54:25, rmcilroy wrote: > > I was chatting to Danno about this and he made a very good point where this > > optimization could potentially cause incorrect semantics. If we have the > > following code: > > > > var a = 1; > > b = a + (a = 2); > > > > The lhs of the expression would evaluate down to a simple Register(a), and the > > rhs would evaluate down to ( 2 -> Register(a), Register(a)). Since we have > > updated a to "2" in the rhs, that means the result of the add operation would > be > > wrong (it would be 2 + 2 rather than 1 + 2). I'm not sure how we solve this, > we > > probably need to check the expression stack and only do this optimization if > > there are no assignment operations in any of expressions. Any thoughts? > > Had a further chat with Danno about this, if we keep track of the registers we > are using for an expression, then we might be able to keep this as is, but then > when we get to an assignment operation, if it is to a local / param we would > walk this list of expression registers, and if any of them are the same local / > param, then we would emit code to copy the local to a temporary register, and > use that register instead. We can chat about this tomorrow in-person. Yes, a bit of a fail not to have recalled this was an issue, but it's definitely come up before. As discussed implementing, this is probably best tackled in two steps since detection and failure is considerably easier than solving the problem. The current register allocation pattern makes life harder than necessary as it does bump allocation in nested scopes, but the code would need the 'temporaries' to live beyond their register scope. We might solve this by having an expression scope that subsumes the temporary register scope functionality. Will update the review when there's a step improvement on one of these paths. https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... src/interpreter/bytecode-generator.cc:728: // a param. Hence explicit allocation and store. On 2015/10/08 16:54:25, rmcilroy wrote: > If there were no other arguments (or all the arguments happened to be sequential > params or locals) then it would actually be possible to use a parameter or local > registers (and might be a nice optimization for wrapper functions which call > another function with the same arguments). Definitely a TODO for future work > though :). > > Could you clarify the comment saying the "Receiver needs to be in a temporary > register so that arguments can be in subsequent ordered registers". Done. https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... src/interpreter/bytecode-generator.cc:829: UNREACHABLE(); On 2015/10/08 16:54:25, rmcilroy wrote: > AcumulatorSet ? Done. https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... File src/interpreter/bytecode-generator.h (right): https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... src/interpreter/bytecode-generator.h:49: void VisitForAccumulatorValue(Expression* expression); On 2015/10/08 16:54:26, rmcilroy wrote: > Could we have some comments on when these are expected to be used. Done. https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... src/interpreter/bytecode-generator.h:50: void VisitForRegisterValue(Expression* expression, On 2015/10/08 16:54:26, rmcilroy wrote: > Could VisitForRegisterValue just return the Register (with MUST_USE_RESULT) > rather than having a separate GetResultRegister? Also, would it make sense to > have the register stored in the RegisterResultScope instead of in the > BytecodeGenerator? Done. https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... src/interpreter/bytecode-generator.h:53: void DropResultRegister(); On 2015/10/08 16:54:26, rmcilroy wrote: > Doesn't look like this is necessary as a seperate function since it's only used > by GetResultRegister Done. Right there was an intermediate step that also used this, but that disappeared before the CL.
https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... File src/interpreter/bytecode-generator.h (right): https://codereview.chromium.org/1392933002/diff/40001/src/interpreter/bytecod... src/interpreter/bytecode-generator.h:50: void VisitForRegisterValue(Expression* expression, On 2015/10/09 12:50:48, oth wrote: > On 2015/10/08 16:54:26, rmcilroy wrote: > > Could VisitForRegisterValue just return the Register (with MUST_USE_RESULT) > > rather than having a separate GetResultRegister? Also, would it make sense to > > have the register stored in the RegisterResultScope instead of in the > > BytecodeGenerator? > > Done. What did you think of storing the register in the RegisterResultScope instead of in the BytecodeGenerator? Doesn't seem to be done. It would allow the removal of GetResultRegister and ResultRegisterIsEmpty functions. https://codereview.chromium.org/1392933002/diff/90001/src/interpreter/bytecod... File src/interpreter/bytecode-generator.cc (right): https://codereview.chromium.org/1392933002/diff/90001/src/interpreter/bytecod... src/interpreter/bytecode-generator.cc:593: default: Is this required? I'd prefer to avoid default: if possible so that it would cause a compile-time error if a new VariableLocation enum value is added.
RegisterResultScope? Yep! It falls into the same bucket we discussed this morning in the guise of ExpressionScope that captures information about register usage, temporaries, and potentially the result. This will be the next phase of this CL. https://codereview.chromium.org/1392933002/diff/90001/src/interpreter/bytecod... File src/interpreter/bytecode-generator.cc (right): https://codereview.chromium.org/1392933002/diff/90001/src/interpreter/bytecod... src/interpreter/bytecode-generator.cc:593: default: On 2015/10/09 13:02:56, rmcilroy wrote: > Is this required? I'd prefer to avoid default: if possible so that it would > cause a compile-time error if a new VariableLocation enum value is added. Acknowledged.
On 2015/10/09 13:27:39, oth wrote: > RegisterResultScope? Yep! It falls into the same bucket we discussed this > morning in the guise of ExpressionScope that captures information about register > usage, temporaries, and potentially the result. This will be the next phase of > this CL. > The CL is refreshed to include detection and correction for assignments during the expression of binary expressions. In the normal case (no conflicting assignments, the code makes a single pass over the binary expression tree. The pass runs to completion even to capture all the local/parameter registers with conflicts. If a collision is detected, the output from the first pass is discard and a second visit is made using temporary registers appropriately to avoid errors from conflicts for the affected local/parameter registers. The code uses temporary registers more aggressively and also allows for unused temporaries to be de-allocated in the scope. This came about because when calling VisitForRegisterValue the code only needs a temporary if the value is not already in a register. However, if the temporary has been allocated and not used it causes the frame size to be larger than necessary. An aspect that has room for improvement is determining the number of temporaries based on what are seen in operands (BytecodeArrayBuilder::Output). This is because there's no mechanism to reserve/commit temporary registers and the old method over counted when combined with the other changes here. A single pass solution is an option for the future, but in the near-term the two-pass should hopefully suffice as the second pass should be rarely hit. The single pass probably requires re-thinking temporary register allocation. There's also additional tests here, additional comments, and extra sanity checking on temporary register scopes.
We're putting this on ice for the time being. Please don't spend time looking at this.
Ross, PTAL. The latest patch switches to simply detecting assignments in expressions rather than correcting them. The allocator here is functional, but will need replacing.
Overall looking much cleaner, I like it! A bunch of comments but the main ones are making the checks not specific for BinaryOperators and simplifying the register allocator. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... File src/interpreter/bytecode-array-builder.cc (right): https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:76: // OTH: TODO review or context? I don't think we need to worry about contexts since these registers are never directly returned as results. Maybe add a DCHECK in RegisterResultScope that the result register is RegisterIsParameterOrLocal() or is a temporary register to enforce this? https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:77: return reg.is_parameter() || reg.index() < local_register_count_; use local_register_count() instead. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:681: for (size_t i = 0; i < free_temporaries_.size(); i++) { Only do this loop if #DEBUG. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:690: for (size_t i = 0; i < free_temporaries_.size(); i++) { ditto https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:697: static void CheckConsecutiveRun(ZoneVector<int>* v, size_t count) { This code should only be run if #DEBUG - just make it IsConsecutiveRun with a bool return value and call DCHECK(IsConsecutiveRun(...)) below. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:700: while (count-- > 1) { for loop instead? https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:719: std::sort(free_temporaries_.begin(), free_temporaries_.end()); How about we just keep free_temporaries sorted rather than doing the sort here? E.g., std::set will keep it sorted (and is also more semantically correct since there shouldn't be duplicate free registers). https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:721: // First check for a run in the existing elements How often do we end up (in the current test code) getting a run of free registers in the middle of the free_temporaries? Would it be simpler to just maintain the free-list sorted using a set (as above) and only try and abutt the consecutive temporaries at the end of the free list, completing with new temporaries as necessary? https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:807: } else if (reg.index() < local_register_count_) { I don't think this check is necessary given the check below. If it is, then use local_register_count() https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:810: local_register_count_ + context_register_count_) { reg.index() < fixed_register_count() https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:811: // context_register_count_ may be -1 if not initialized. This comment should no longer be true - if you spot somewhere that context_register_count_ is -1 let me know. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:1016: void TemporaryRegisterScope::PrepareForConsecutiveAllocations(size_t count) { As discussed offline - we should be moving towards a model where all register allocations have to happen through the ResultScope (and TemporaryRegisterScope becomes private and a friend of ResultScope), then we could have a DCHECK that after PrepareForConsecutiveAllocations, the result scope only calls NewConsecutiveRegister() and fails if something tries to call NewRegister() on that scope (at least until the Consecutive allocations are no longer needed. As I mentioned, I'm fine with this as-is for now, but let's add a TODO. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... File src/interpreter/bytecode-array-builder.h (right): https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.h:214: Register last_temporary_register() const; nit - move getters below TemporaryRegisterIsLive() (with a newline whitespace) https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... File src/interpreter/bytecode-generator.cc (right): https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:169: ExpressionResultScope* parent() const { return parent_; } nit - outer() and outer_ to be consistent with other scopes? https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:194: // Specifically mapping registers across an expresion. expression / expresion (although this class should just be removed for now unless we start using it in this CL). https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:203: virtual ~TopLevelResultScope() { nit newline above https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:221: // TODO(oth): REVIEW. Can we remove the "if (!result_identified())" now? https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:225: virtual void SetResultInRegister(Register reg) { nit - newline above https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:243: }; nit - could you move this above BytecodeGenerator::AccumulatorResultScope https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:294: result_scope_ = result_scope; nit - just inline this in header https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:298: BytecodeGenerator::ExpressionResultScope* BytecodeGenerator::result_scope() ditto https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:305: BytecodeGenerator::parent_result_scope() const { Unused, please remove (callers could just do result_scope()->parent() anyway if needed) https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:485: VisitForAccumulatorValue(stmt->expression()); VisitForEffect here I think? Following statements should never be expecting anything in the accumulator https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:532: TopLevelResultScope scope(this); Given my comment above about removing TopLevelResultScope for now this is not useful in this CL, however, how about StatementResultScope instead (TopLevel is a bit overloaded). https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:759: VisitForAccumulatorValue(property->value()); VisitForEffect here https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:767: // key, value, language are in another. Can we use PrepareForConsecutiveAllocations at the top of VisitObjectLiteral ? https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:934: FeedbackVectorSlot slot = expr->LiteralFeedbackSlot(); nit - (I know this was my code..) could you move the FeedbackSlot up about builder() so the use of the accumulator is more obvious. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1013: StoreForBinaryExpression(destination); I don't like this being called StoreForBinaryExpression given that this is VisitVariableAssignment which doesn't involve binary expressions. How about RecordStoreToLocalRegister or similar? https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1186: hits++; leftover debugging code? https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1240: result_scope()->PrepareForConsecutiveAllocations(args->length()); nit - could we move PrepareForConsecutiveAllocations into VisitArguments. Also, shouldn't VisitCall also be using VisitArguments? https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1244: DCHECK(arg.index() - i == receiver.index() + 1); nit - move Register arg and DCHECK lines above VisitForAccumulator to make use of accumulator more obvious https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1296: // TODO(rmcilroy): support multiple return values. nit - move TODO back up to DCHECK_LE (that is what would fail if we tried to call something with multiple results. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1369: PrepareForBinaryExpression(); As mentioned, this should probably be a scope allocated type, ideally being in a top level ExpressionResultScope as discussed offline rather than being tied to a binary op. I'm fine with us landing this as-is for now with a TODO though to reduce rebasing pain. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1478: Register BytecodeGenerator::LoadForBinaryExpression(Register reg) { I think this should be done in VisitVariableLoad at the same point as we alias the local (e.g., call this LoadLocalRegisterAsAlias to match with RecordStoreToLocalRegister). This way it is being done close to the aliasing so it is more obvious why it is necessary, and we won't forget to add it if there end up being more binary op situations (e.g., CompareOps are not classed as BinaryOps, so would have been easy to miss). https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1509: // expression. This is hopefully rare. Comment is out-of-date. Alsoc could we move this back to it's original location (putting all the Visit operators grouped and moving the PrepareForBinaryExpression..CompleteBinaryExpression and VisitForAccumulatorValue..VisitForRegisterValue below the visit operations. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1554: VisitForAccumulatorValue(left); this can be VisitForEffect https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... File src/interpreter/bytecode-generator.h (right): https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.h:119: TopLevelResultScope* top_level_result_scope_; This doesn't seem to be used? Can we remove it for now (or make use of it instead of having PrepareForBinaryExpression / CompleteBinaryExpression? As an aside for when it is later needed, personally I would prefer the top level result scope not to be separate from the result_scope chain (i.e., it is just stored at the top of the result_scope_ chain, and possibly has a 'short cut' method in ExpressionResultScope which points to the top-level so we don't need to walk the whole chain to get it. https://codereview.chromium.org/1392933002/diff/250001/test/cctest/interprete... File test/cctest/interpreter/test-bytecode-generator.cc (right): https://codereview.chromium.org/1392933002/diff/250001/test/cctest/interprete... test/cctest/interpreter/test-bytecode-generator.cc:846: U8(vector->GetIndex(slot1)), // nit - indent operands (and below) - I can't find a way to trick git cl format into liking this though :( https://codereview.chromium.org/1392933002/diff/250001/test/cctest/interprete... test/cctest/interpreter/test-bytecode-generator.cc:1895: B(TypeOf), // could be culled in bytecode array builder. nit - move comment up one line (on Star and Ldar)
Thanks, most of these are incorporated. There's a couple I'd like to see us leave until the register re-mapping CL. PTAL. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... File src/interpreter/bytecode-array-builder.cc (right): https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:76: // OTH: TODO review or context? On 2015/10/19 12:56:21, rmcilroy wrote: > I don't think we need to worry about contexts since these registers are never > directly returned as results. Maybe add a DCHECK in RegisterResultScope that the > result register is RegisterIsParameterOrLocal() or is a temporary register to > enforce this? Agree. Now have a RegisterIsTemporary() as an additional check. Not sure we need anything extra for contexts for now. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:77: return reg.is_parameter() || reg.index() < local_register_count_; On 2015/10/19 12:56:21, rmcilroy wrote: > use local_register_count() instead. Done. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:681: for (size_t i = 0; i < free_temporaries_.size(); i++) { On 2015/10/19 12:56:21, rmcilroy wrote: > Only do this loop if #DEBUG. Re-implemented with ZoneSet. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:690: for (size_t i = 0; i < free_temporaries_.size(); i++) { On 2015/10/19 12:56:22, rmcilroy wrote: > ditto Re-implemented with ZoneSet. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:697: static void CheckConsecutiveRun(ZoneVector<int>* v, size_t count) { On 2015/10/19 12:56:21, rmcilroy wrote: > This code should only be run if #DEBUG - just make it IsConsecutiveRun with a > bool return value and call DCHECK(IsConsecutiveRun(...)) below. Method removed. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:700: while (count-- > 1) { On 2015/10/19 12:56:21, rmcilroy wrote: > for loop instead? Ack. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:719: std::sort(free_temporaries_.begin(), free_temporaries_.end()); On 2015/10/19 12:56:21, rmcilroy wrote: > How about we just keep free_temporaries sorted rather than doing the sort here? > E.g., std::set will keep it sorted (and is also more semantically correct since > there shouldn't be duplicate free registers). Yes, it's become a bit overwhelming in it's current form. Switched to using a set. The code is much cleaner and produces the same output. As part of this the TemporaryRegisterScope now has a NextConsecutiveRegister method that should be called after PrepareForConsecutiveAllocations. And there's counter and cursor for maintaining the necessary state. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:721: // First check for a run in the existing elements On 2015/10/19 12:56:22, rmcilroy wrote: > How often do we end up (in the current test code) getting a run of free > registers in the middle of the free_temporaries? Would it be simpler to just > maintain the free-list sorted using a set (as above) and only try and abutt the > consecutive temporaries at the end of the free list, completing with new > temporaries as necessary? IIRC it only occurs in VisitObjectLiterals which is quite painful. See what you think of the revised allocator - it still does this, but is much cleaner and easier to follow (hopefully). https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:807: } else if (reg.index() < local_register_count_) { On 2015/10/19 12:56:21, rmcilroy wrote: > I don't think this check is necessary given the check below. If it is, then use > local_register_count() Yes, agree. Removed. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:810: local_register_count_ + context_register_count_) { On 2015/10/19 12:56:22, rmcilroy wrote: > reg.index() < fixed_register_count() Done. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:811: // context_register_count_ may be -1 if not initialized. On 2015/10/19 12:56:21, rmcilroy wrote: > This comment should no longer be true - if you spot somewhere that > context_register_count_ is -1 let me know. Acknowledged. Removed comment. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:1016: void TemporaryRegisterScope::PrepareForConsecutiveAllocations(size_t count) { On 2015/10/19 12:56:21, rmcilroy wrote: > As discussed offline - we should be moving towards a model where all register > allocations have to happen through the ResultScope (and TemporaryRegisterScope > becomes private and a friend of ResultScope), then we could have a DCHECK that > after PrepareForConsecutiveAllocations, the result scope only calls > NewConsecutiveRegister() and fails if something tries to call NewRegister() on > that scope (at least until the Consecutive allocations are no longer needed. As > I mentioned, I'm fine with this as-is for now, but let's add a TODO. Added a TODO in the constructor for TemporaryRegisterScope. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... File src/interpreter/bytecode-array-builder.h (right): https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.h:214: Register last_temporary_register() const; On 2015/10/19 12:56:22, rmcilroy wrote: > nit - move getters below TemporaryRegisterIsLive() (with a newline whitespace) Done. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... File src/interpreter/bytecode-generator.cc (right): https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:169: ExpressionResultScope* parent() const { return parent_; } On 2015/10/19 12:56:22, rmcilroy wrote: > nit - outer() and outer_ to be consistent with other scopes? Done. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:194: // Specifically mapping registers across an expresion. On 2015/10/19 12:56:22, rmcilroy wrote: > expression / expresion (although this class should just be removed for now > unless we start using it in this CL). Done. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:203: virtual ~TopLevelResultScope() { On 2015/10/19 12:56:22, rmcilroy wrote: > nit newline above Class removed. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:221: // TODO(oth): REVIEW. On 2015/10/19 12:56:22, rmcilroy wrote: > Can we remove the "if (!result_identified())" now? Done. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:225: virtual void SetResultInRegister(Register reg) { On 2015/10/19 12:56:23, rmcilroy wrote: > nit - newline above Done. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:243: }; On 2015/10/19 12:56:22, rmcilroy wrote: > nit - could you move this above BytecodeGenerator::AccumulatorResultScope Done. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:294: result_scope_ = result_scope; On 2015/10/19 12:56:22, rmcilroy wrote: > nit - just inline this in header Done. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:298: BytecodeGenerator::ExpressionResultScope* BytecodeGenerator::result_scope() On 2015/10/19 12:56:23, rmcilroy wrote: > ditto Done. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:305: BytecodeGenerator::parent_result_scope() const { On 2015/10/19 12:56:22, rmcilroy wrote: > Unused, please remove (callers could just do result_scope()->parent() anyway if > needed) Done. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:485: VisitForAccumulatorValue(stmt->expression()); On 2015/10/19 12:56:23, rmcilroy wrote: > VisitForEffect here I think? Following statements should never be expecting > anything in the accumulator Done. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:532: TopLevelResultScope scope(this); On 2015/10/19 12:56:22, rmcilroy wrote: > Given my comment above about removing TopLevelResultScope for now this is not > useful in this CL, however, how about StatementResultScope instead (TopLevel is > a bit overloaded). Ack. Removed for now, the newer name is fine. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:759: VisitForAccumulatorValue(property->value()); On 2015/10/19 12:56:22, rmcilroy wrote: > VisitForEffect here Done. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:767: // key, value, language are in another. On 2015/10/19 12:56:22, rmcilroy wrote: > Can we use PrepareForConsecutiveAllocations at the top of VisitObjectLiteral ? Done. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:934: FeedbackVectorSlot slot = expr->LiteralFeedbackSlot(); On 2015/10/19 12:56:22, rmcilroy wrote: > nit - (I know this was my code..) could you move the FeedbackSlot up about > builder() so the use of the accumulator is more obvious. Done. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1013: StoreForBinaryExpression(destination); On 2015/10/19 12:56:22, rmcilroy wrote: > I don't like this being called StoreForBinaryExpression given that this is > VisitVariableAssignment which doesn't involve binary expressions. How about > RecordStoreToLocalRegister or similar? RecordStoreToRegister()? It'll certainly cover parameters and probably temporaries that mappings of earlier mappings. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1186: hits++; On 2015/10/19 12:56:23, rmcilroy wrote: > leftover debugging code? Done. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1240: result_scope()->PrepareForConsecutiveAllocations(args->length()); On 2015/10/19 12:56:22, rmcilroy wrote: > nit - could we move PrepareForConsecutiveAllocations into VisitArguments. Also, > shouldn't VisitCall also be using VisitArguments? Done. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1240: result_scope()->PrepareForConsecutiveAllocations(args->length()); On 2015/10/19 12:56:22, rmcilroy wrote: > nit - could we move PrepareForConsecutiveAllocations into VisitArguments. Also, > shouldn't VisitCall also be using VisitArguments? Done. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1244: DCHECK(arg.index() - i == receiver.index() + 1); On 2015/10/19 12:56:22, rmcilroy wrote: > nit - move Register arg and DCHECK lines above VisitForAccumulator to make use > of accumulator more obvious Done. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1296: // TODO(rmcilroy): support multiple return values. On 2015/10/19 12:56:22, rmcilroy wrote: > nit - move TODO back up to DCHECK_LE (that is what would fail if we tried to > call something with multiple results. Done. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1369: PrepareForBinaryExpression(); On 2015/10/19 12:56:22, rmcilroy wrote: > As mentioned, this should probably be a scope allocated type, ideally being in a > top level ExpressionResultScope as discussed offline rather than being tied to a > binary op. I'm fine with us landing this as-is for now with a TODO though to > reduce rebasing pain. Done. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1478: Register BytecodeGenerator::LoadForBinaryExpression(Register reg) { On 2015/10/19 12:56:22, rmcilroy wrote: > I think this should be done in VisitVariableLoad at the same point as we alias > the local (e.g., call this LoadLocalRegisterAsAlias to match with > RecordStoreToLocalRegister). This way it is being done close to the aliasing so > it is more obvious why it is necessary, and we won't forget to add it if there > end up being more binary op situations (e.g., CompareOps are not classed as > BinaryOps, so would have been easy to miss). The method has been renamed and there's a TODO to make this change. I'd like it to be in the follow CL that does re-mapping. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1509: // expression. This is hopefully rare. On 2015/10/19 12:56:22, rmcilroy wrote: > Comment is out-of-date. Alsoc could we move this back to it's original location > (putting all the Visit operators grouped and moving the > PrepareForBinaryExpression..CompleteBinaryExpression and > VisitForAccumulatorValue..VisitForRegisterValue below the visit operations. Done. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1554: VisitForAccumulatorValue(left); On 2015/10/19 12:56:22, rmcilroy wrote: > this can be VisitForEffect Yes, and interestingly the RHS visitor should just be Visit() not VisitForAccumulatorValue(). Comment in the next patch, but it's a right sided tree and we're already in a result scope when called. If RHS is VisitForAccumulator we still end up emitting code for all following RHS values because need to put value in accumulator. https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... File src/interpreter/bytecode-generator.h (right): https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.h:119: TopLevelResultScope* top_level_result_scope_; On 2015/10/19 12:56:23, rmcilroy wrote: > This doesn't seem to be used? Can we remove it for now (or make use of it > instead of having PrepareForBinaryExpression / CompleteBinaryExpression? > > As an aside for when it is later needed, personally I would prefer the top level > result scope not to be separate from the result_scope chain (i.e., it is just > stored at the top of the result_scope_ chain, and possibly has a 'short cut' > method in ExpressionResultScope which points to the top-level so we don't need > to walk the whole chain to get it. Removed for now. Having a central reference saves (chain_length - 1) words of stack. https://codereview.chromium.org/1392933002/diff/250001/test/cctest/interprete... File test/cctest/interpreter/test-bytecode-generator.cc (right): https://codereview.chromium.org/1392933002/diff/250001/test/cctest/interprete... test/cctest/interpreter/test-bytecode-generator.cc:846: U8(vector->GetIndex(slot1)), // On 2015/10/19 12:56:23, rmcilroy wrote: > nit - indent operands (and below) - I can't find a way to trick git cl format > into liking this though :( Done. https://codereview.chromium.org/1392933002/diff/250001/test/cctest/interprete... test/cctest/interpreter/test-bytecode-generator.cc:1895: B(TypeOf), // could be culled in bytecode array builder. On 2015/10/19 12:56:23, rmcilroy wrote: > nit - move comment up one line (on Star and Ldar) Done.
Please remove the "(NOT FOR REVIEW)" from the title. Also check if Michi wants to have a look, but LGTM from my side once comments are addressed (I'll hold off on landing my CLs until this lands to reduce your rebase pain). https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... File src/interpreter/bytecode-generator.cc (right): https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1013: StoreForBinaryExpression(destination); On 2015/10/20 15:28:52, oth wrote: > On 2015/10/19 12:56:22, rmcilroy wrote: > > I don't like this being called StoreForBinaryExpression given that this is > > VisitVariableAssignment which doesn't involve binary expressions. How about > > RecordStoreToLocalRegister or similar? > > RecordStoreToRegister()? It'll certainly cover parameters and probably > temporaries that mappings of earlier mappings. Good point, SGTM https://codereview.chromium.org/1392933002/diff/250001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1554: VisitForAccumulatorValue(left); On 2015/10/20 15:28:53, oth wrote: > On 2015/10/19 12:56:22, rmcilroy wrote: > > this can be VisitForEffect > > Yes, and interestingly the RHS visitor should just be Visit() not > VisitForAccumulatorValue(). Comment in the next patch, but it's a right sided > tree and we're already in a result scope when called. If RHS is > VisitForAccumulator we still end up emitting code for all following RHS values > because need to put value in accumulator. Nice catch :). https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... File src/interpreter/bytecode-array-builder.cc (right): https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:692: free_temporaries_.erase(reg_index); Could you DCHECK that the value is in free_temporaries_ before it is erased. https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:959: // result scopes as far as possible. nit - move TODO to TemporaryRegisterScope declaration in header. https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... File src/interpreter/bytecode-array-builder.h (right): https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.h:293: nit - remove extra newline https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... File src/interpreter/bytecode-generator.cc (right): https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:725: // key, value, language are in another. Hmm, this is tricky. One option would be to do the PrepareForConsecutiveAllocations() on the outer temporary_register_scope, and declare key, value, language outside the for loop and allocated them if they are invalid here (same for all the other uses of NextConsecutiveRegister in this function). It's a bit messy though. Any other ideas? Fine to leave this as a TODO for now. https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1160: result_scope()->PrepareForConsecutiveAllocations(args->length() + 1); You call PrepareForConsecutiveAllocations(args->length() + 1) and >PrepareForConsecutiveAllocations(args->length()) in VisitArguments below. Not sure if this is a problem, but it is probably worth a comment https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1180: AccumulatorResultScope accumulator_result_scope(this); Not sure why this is needed? Could you add a comment explaining it if it is? https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1325: // TODO(oth): add binary operation scope here to replace As mentioned in the previous review, I don't think this should be a binary operation scope at all, but just a StatementScope. AFAICT you only use the binary_expression_depth_ to work out when to clear the binary_expression_hazard_set_ - if this is the case then if you hold the hazard_set_ in the StatementScope you don't need to track binary expression depth, just clear it when the StatementScope goes out of scope. This might extend the hazard lifetime slightly longer than needed but shouldn't affect normal code and I think it's simpler and less error prone. If you agree, updating the TODO for now is fine, otherwise let's discuss this in-person tomorrow. https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1346: // statement or top-level expression scope. ditto https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1569: // register if not. Comment now out of date (no need to release temporary variable) https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... File src/interpreter/bytecode-generator.h (right): https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... src/interpreter/bytecode-generator.h:76: Register LoadFromAliasedRegister(Register reg); nit - move these into a seperate block from PrepareForBinaryExpression / CompleteBinaryExpression and update comments https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... src/interpreter/bytecode-generator.h:100: ExpressionResultScope* result_scope() const { return result_scope_; } optional nit - should we call this execution_result() to tie in with the scopes above?
Description was changed from ========== [Interpreter] Reduce temporary register usage in generated bytecode. This change adds new Visit() methods for obtaining expression results: VisitForAccumulatorValue and VisitForRegisterValue, and also VisitForEffect. There are a new set of classes ExpressionResultScopes for capturing results. Code is now expected to update these scopes when a result is produced. There are consequential changes to test-bytecode-generator.cc and this change also adds new bytecode macros A(x, n) and THIS(n) for register file entries for arguments and this. BUG=v8:4280 LOG=NO ========== to ========== [Interpreter] Reduce temporary register usage in generated bytecode. This change adds new flavors of Visit() methods for obtaining expression results: - VisitForAccumulatorValue() which places result in the accumulator. - VisitForRegisterValue() which places the result in a register. - VisitForEffect() which evaluates the expression and discards the result. The targets of these calls place the expression result with result_scope()->SetResultInRegister() or result_scope->SetResultInAccumulator(). By being smarter about result locations, there's less temporary register usages. However, we now have a hazard with assignments in binary expressions that didn't exist before. This change detects and DCHECK's when a hazard is detected. A follow on CL will address this. There are consequential changes to test-bytecode-generator.cc and this change also adds new bytecode macros A(x, n) and THIS(n) for register file entries for arguments and this. BUG=v8:4280 LOG=NO ==========
Description was changed from ========== [Interpreter] Reduce temporary register usage in generated bytecode. This change adds new flavors of Visit() methods for obtaining expression results: - VisitForAccumulatorValue() which places result in the accumulator. - VisitForRegisterValue() which places the result in a register. - VisitForEffect() which evaluates the expression and discards the result. The targets of these calls place the expression result with result_scope()->SetResultInRegister() or result_scope->SetResultInAccumulator(). By being smarter about result locations, there's less temporary register usages. However, we now have a hazard with assignments in binary expressions that didn't exist before. This change detects and DCHECK's when a hazard is detected. A follow on CL will address this. There are consequential changes to test-bytecode-generator.cc and this change also adds new bytecode macros A(x, n) and THIS(n) for register file entries for arguments and this. BUG=v8:4280 LOG=NO ========== to ========== [Interpreter] Reduce temporary register usage in generated bytecode. This change adds new flavors of Visit() methods for obtaining expression results: - VisitForAccumulatorValue() which places result in the accumulator. - VisitForRegisterValue() which places the result in a register. - VisitForEffect() which evaluates the expression and discards the result. The targets of these calls place the expression result with result_scope()->SetResultInRegister() or result_scope()->SetResultInAccumulator(). By being smarter about result locations, there's less temporary register usages. However, we now have a hazard with assignments in binary expressions that didn't exist before. This change detects and DCHECK's when a hazard is detected. A follow on CL will address this. There are consequential changes to test-bytecode-generator.cc and this change also adds new bytecode macros A(x, n) and THIS(n) for register file entries for arguments and this. BUG=v8:4280 LOG=NO ==========
Thanks Ross, all incorporated. Michi, do you have time to take look? https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... File src/interpreter/bytecode-array-builder.cc (right): https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:692: free_temporaries_.erase(reg_index); On 2015/10/20 16:39:20, rmcilroy wrote: > Could you DCHECK that the value is in free_temporaries_ before it is erased. Done. https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.cc:959: // result scopes as far as possible. On 2015/10/20 16:39:20, rmcilroy wrote: > nit - move TODO to TemporaryRegisterScope declaration in header. Done. https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... File src/interpreter/bytecode-array-builder.h (right): https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... src/interpreter/bytecode-array-builder.h:293: On 2015/10/20 16:39:20, rmcilroy wrote: > nit - remove extra newline Done. https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... File src/interpreter/bytecode-generator.cc (right): https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:725: // key, value, language are in another. On 2015/10/20 16:39:20, rmcilroy wrote: > Hmm, this is tricky. One option would be to do the > PrepareForConsecutiveAllocations() on the outer temporary_register_scope, and > declare key, value, language outside the for loop and allocated them if they are > invalid here (same for all the other uses of NextConsecutiveRegister in this > function). It's a bit messy though. Any other ideas? Fine to leave this as a > TODO for now. We may have exactly the right mix of allocation scopes for this to work just now. It's a little hairy. Another alternative would be to include an additional allocation that could be guaranteed to be consecutive and move the literal register into this, e.g. Builder()->mov(literal_consecutive, literal). It burns a register, but is safe. https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1160: result_scope()->PrepareForConsecutiveAllocations(args->length() + 1); On 2015/10/20 16:39:20, rmcilroy wrote: > You call PrepareForConsecutiveAllocations(args->length() + 1) and > >PrepareForConsecutiveAllocations(args->length()) in VisitArguments below. Not > sure if this is a problem, but it is probably worth a comment Comment added to VisitArguments. The weirdness is introduced because we don't pass first_arg as an argument in the Call operation, but just pass the receiver. There is a simple statemachine inside the TemporaryRegisterScope that keeps track of the next register to be allocated in the consecutive sequence and the number remaining in the consecutive sequence. A call to PrepareForConsecutiveAllocations has no effect if the number requested is <= the existing number of remaining consecutives allocations. If the request is larger then there's the searching/allocating temporary registers to be consecutive in the BytecodeArrayBuilder. e.g. PFCA(5); NewConsecutiveRegister(); PFCA(4) -> second call is no-op PFCA(4); NewConsecutiveRegister(); PFCA(4) -> second call completely resets internal state. So long as all the allocations are properly scoped then this works fine. We DCHECK if scoping is deemed wrong because consecutive allocations can not be fulfilled. Having pushed on this, I've found the code is brittle with respect to errors here (which is good). https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1180: AccumulatorResultScope accumulator_result_scope(this); On 2015/10/20 16:39:20, rmcilroy wrote: > Not sure why this is needed? Could you add a comment explaining it if it is? Comment added and scope moved immediately before VisitVariableLoad. Essentially we don't want VisitVariableLoad using and not freeing our temporaries nor setting our result (it's that calling SetResultInAccumulator twice issue). https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1325: // TODO(oth): add binary operation scope here to replace On 2015/10/20 16:39:20, rmcilroy wrote: > As mentioned in the previous review, I don't think this should be a binary > operation scope at all, but just a StatementScope. AFAICT you only use the > binary_expression_depth_ to work out when to clear the > binary_expression_hazard_set_ - if this is the case then if you hold the > hazard_set_ in the StatementScope you don't need to track binary expression > depth, just clear it when the StatementScope goes out of scope. This might > extend the hazard lifetime slightly longer than needed but shouldn't affect > normal code and I think it's simpler and less error prone. If you agree, > updating the TODO for now is fine, otherwise let's discuss this in-person > tomorrow. Done - grumble, grumble :-) We can go conservative here. We know we'll revisit if we find need anyway. https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1346: // statement or top-level expression scope. On 2015/10/20 16:39:20, rmcilroy wrote: > ditto Done. https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1569: // register if not. On 2015/10/20 16:39:20, rmcilroy wrote: > Comment now out of date (no need to release temporary variable) Done. https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... File src/interpreter/bytecode-generator.h (right): https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... src/interpreter/bytecode-generator.h:76: Register LoadFromAliasedRegister(Register reg); On 2015/10/20 16:39:20, rmcilroy wrote: > nit - move these into a seperate block from PrepareForBinaryExpression / > CompleteBinaryExpression and update comments Done. https://codereview.chromium.org/1392933002/diff/290001/src/interpreter/byteco... src/interpreter/bytecode-generator.h:100: ExpressionResultScope* result_scope() const { return result_scope_; } On 2015/10/20 16:39:20, rmcilroy wrote: > optional nit - should we call this execution_result() to tie in with the scopes > above? Done.
All still LGTM, thanks.
On 2015/10/21 07:47:29, rmcilroy wrote: > All still LGTM, thanks. Thanks. Will wait on test262 before committing (and Michi if he has time to look). Rebased this morning and added additional code to tell if visiting for effect or value per the ast-graph-builder. This allows us to avoid unnecessary literal loads when visiting for effect.
The CQ bit was checked by oth@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1392933002/350001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1392933002/350001
Description was changed from ========== [Interpreter] Reduce temporary register usage in generated bytecode. This change adds new flavors of Visit() methods for obtaining expression results: - VisitForAccumulatorValue() which places result in the accumulator. - VisitForRegisterValue() which places the result in a register. - VisitForEffect() which evaluates the expression and discards the result. The targets of these calls place the expression result with result_scope()->SetResultInRegister() or result_scope()->SetResultInAccumulator(). By being smarter about result locations, there's less temporary register usages. However, we now have a hazard with assignments in binary expressions that didn't exist before. This change detects and DCHECK's when a hazard is detected. A follow on CL will address this. There are consequential changes to test-bytecode-generator.cc and this change also adds new bytecode macros A(x, n) and THIS(n) for register file entries for arguments and this. BUG=v8:4280 LOG=NO ========== to ========== [Interpreter] Reduce temporary register usage in generated bytecode. This change adds new flavors of Visit() methods for obtaining expression results: - VisitForAccumulatorValue() which places result in the accumulator. - VisitForRegisterValue() which places the result in a register. - VisitForEffect() which evaluates the expression and discards the result. The targets of these calls place the expression result with result_scope()->SetResultInRegister() or result_scope()->SetResultInAccumulator(). By being smarter about result locations, there's less temporary register usage. However, we now have a hazard with assignments in binary expressions that didn't exist before. This change detects and DCHECK's when a hazard is detected. A follow on CL will address this. There are consequential changes to test-bytecode-generator.cc and this change also adds new bytecode macros A(x, n) and THIS(n) for register file entries for arguments and this. BUG=v8:4280 LOG=NO ==========
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: v8_win_rel on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_win_rel/builds/10885)
I don't want to block this, so go ahead and land this to unblock other CLs that build on top of it. But I feel somewhat uncomfortable with the complexity this introduces to BytecodeGenerator. So far I only have one concrete comment to make the complexity manageable, I'll let you know if I have more. https://codereview.chromium.org/1392933002/diff/370001/src/interpreter/byteco... File src/interpreter/bytecode-generator.cc (right): https://codereview.chromium.org/1392933002/diff/370001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:151: explicit ExpressionResultScope(BytecodeGenerator* generator, nit: Doesn't need to be marked explicit. https://codereview.chromium.org/1392933002/diff/370001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:163: DCHECK_EQ(result_identified(), true); nit: DCHECK(result_identified()) https://codereview.chromium.org/1392933002/diff/370001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:244: ~RegisterResultScope() {} nit: Any reason the default destructor cannot be used? https://codereview.chromium.org/1392933002/diff/370001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1081: VisitVariableAssignment(variable, slot); This assumes that the VisitVariableAssignment does not clobber the accumulator. The above call to execution_result()->SetResultInAccumulator() just uses the accumulator as a completion value. Looking through the current implementation, this seems to hold. But this is scarily complex. It would be much clearer if all the calls to execution_result()->SetResultInFoo() were at the very bottom of the expression visitation functions.
LGTM once the one concrete comment is addressed.
Thanks Michael. Comments incorporated. https://codereview.chromium.org/1392933002/diff/370001/src/interpreter/byteco... File src/interpreter/bytecode-generator.cc (right): https://codereview.chromium.org/1392933002/diff/370001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:151: explicit ExpressionResultScope(BytecodeGenerator* generator, On 2015/10/21 14:34:03, Michael Starzinger wrote: > nit: Doesn't need to be marked explicit. Done. https://codereview.chromium.org/1392933002/diff/370001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:163: DCHECK_EQ(result_identified(), true); On 2015/10/21 14:34:03, Michael Starzinger wrote: > nit: DCHECK(result_identified()) Done. https://codereview.chromium.org/1392933002/diff/370001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:244: ~RegisterResultScope() {} On 2015/10/21 14:34:04, Michael Starzinger wrote: > nit: Any reason the default destructor cannot be used? Done. https://codereview.chromium.org/1392933002/diff/370001/src/interpreter/byteco... src/interpreter/bytecode-generator.cc:1081: VisitVariableAssignment(variable, slot); On 2015/10/21 14:34:03, Michael Starzinger wrote: > This assumes that the VisitVariableAssignment does not clobber the accumulator. > The above call to execution_result()->SetResultInAccumulator() just uses the > accumulator as a completion value. Looking through the current implementation, > this seems to hold. But this is scarily complex. It would be much clearer if all > the calls to execution_result()->SetResultInFoo() were at the very bottom of the > expression visitation functions. A comment has been added here as I completely agree with the sentiment. Have made the mechanical changes moving SetResultInFoo() calls to the end of the method bodies. The only exception are where they are in switch statements and have different outputs.
The CQ bit was checked by oth@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from rmcilroy@chromium.org Link to the patchset: https://codereview.chromium.org/1392933002/#ps420001 (title: "Incorporate comments from mstarzinger.")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1392933002/420001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1392933002/420001
Message was sent while issue was closed.
Committed patchset #23 (id:420001)
Message was sent while issue was closed.
Patchset 23 (id:??) landed as https://crrev.com/339e0c804ec60347d401a09bb62182dabb7ee54c Cr-Commit-Position: refs/heads/master@{#31445}
Message was sent while issue was closed.
verwaest@chromium.org changed reviewers: + verwaest@chromium.org
Message was sent while issue was closed.
I just zigzagged through this CL, but one comment. Don't forget to handle the following "hazard": (function(a) { return a + (arguments[0] = 10) })(1)
Message was sent while issue was closed.
On 2015/10/21 19:33:20, Toon Verwaest wrote: > I just zigzagged through this CL, but one comment. Don't forget to handle the > following "hazard": > > (function(a) { return a + (arguments[0] = 10) })(1) I *think* this should be fine because sloppy mode parameters are context allocated when the function loads from the arguments object (so won't be accessed directly from interpreter registers), but this is a very good point and we should make sure to have a test for this when we handle these "hazards" |