|
|
Created:
10 years, 10 months ago by Rico Modified:
9 years, 6 months ago Reviewers:
Lasse Reichstein, Kevin Millikin (Chromium) CC:
v8-dev Visibility:
Public. |
DescriptionInline Math.sqrt().
Also changed name of GeneratePow and the %_ call name to follow convention based on MathSin and MathCos. Moved GeneratePow down to the other methods.
Committed: http://code.google.com/p/v8/source/detail?r=4054
Patch Set 1 #Patch Set 2 : '' #
Total comments: 40
Patch Set 3 : '' #
Total comments: 10
Patch Set 4 : '' #
Total comments: 10
Patch Set 5 : '' #Patch Set 6 : '' #
Total comments: 2
Created: 10 years, 9 months ago
Messages
Total messages: 11 (0 generated)
You need to handle allocation failure. http://codereview.chromium.org/661179/diff/1005/1006 File src/ia32/codegen-ia32.cc (right): http://codereview.chromium.org/661179/diff/1005/1006#newcode5817 src/ia32/codegen-ia32.cc:5817: I assume this is just a move, with no changes in the function. http://codereview.chromium.org/661179/diff/1005/1006#newcode6015 src/ia32/codegen-ia32.cc:6015: Result result = frame_->Pop(); Remember to do result.ToRegister() and result.Spill(). http://codereview.chromium.org/661179/diff/1005/1006#newcode6029 src/ia32/codegen-ia32.cc:6029: __ j(not_equal, &end); Move frame_->Push(&result) after bind(&end) so it becomes obvious that we return result.reg() in either case. (That's what happens now anyway, it's hard to see). http://codereview.chromium.org/661179/diff/1005/1006#newcode6035 src/ia32/codegen-ia32.cc:6035: __ AllocateHeapNumber(result.reg(), scratch.reg(), no_reg, &end); If allocation fails, result.reg() have been changed to point to unallocated memory (very near the end of the newspace semispace). On failure you should go to somewhere that calls runtime and does finishes the job instead of just returning and (eventually) crashing.
A few drive-by comments, this is not a real review. (I did not look at the gory details of Math.pow, for instance.) * Identifier names are too cryptic (powi?), in some cases misleading (x_is_double ==> x_is_not_smi). That makes the code harder to understand than it needs to be. * Control flow with raw labels is only safe if all virtual frames reaching the label are identical. * Even for the inline runtime functions where we control the only call site, the generated code should not rely on the specific JS code at the call site (eg, reevaluating the arguments because you know they are parameters and thus have no side effects). http://codereview.chromium.org/661179/diff/1005/1006 File src/ia32/codegen-ia32.cc (right): http://codereview.chromium.org/661179/diff/1005/1006#newcode5817 src/ia32/codegen-ia32.cc:5817: On 2010/02/26 13:49:33, Lasse Reichstein wrote: > I assume this is just a move, with no changes in the function. I'm still concerned that it hasn't been carefully reviewed. http://codereview.chromium.org/661179/diff/1005/1006#newcode5824 src/ia32/codegen-ia32.cc:5824: Load(args->at(0)); Since the args need to be evaluated in order in all cases, please lift it out of the conditional. http://codereview.chromium.org/661179/diff/1005/1006#newcode5828 src/ia32/codegen-ia32.cc:5828: Result p = allocator()->Allocate(eax); Is there any reason to prefer eax here? I don't see one, but there is a lot of code. The style guide suggests avoiding abbreviations unless they are common in the domain. I'm not sure 'p' counts as a common abbreviation. There must be a better name. http://codereview.chromium.org/661179/diff/1005/1006#newcode5830 src/ia32/codegen-ia32.cc:5830: Result x= frame_->Pop(); Space between 'x' and '='. http://codereview.chromium.org/661179/diff/1005/1006#newcode5831 src/ia32/codegen-ia32.cc:5831: if (p.is_valid() && p.reg().is(eax)) { p is always valid. You've allocated eax with no live values, so it must be. It cannot be other than eax. The register allocator will not give you some other register when you request eax. No need for the if, a simple ASSERT works if you're worried about something. http://codereview.chromium.org/661179/diff/1005/1006#newcode5849 src/ia32/codegen-ia32.cc:5849: __ j(not_zero, &x_is_double); 'x_is_double' seems like a bad name, since we're not sure it's a double. (Plus, we usually use the term "heap number".) http://codereview.chromium.org/661179/diff/1005/1006#newcode5867 src/ia32/codegen-ia32.cc:5867: __ mov(x.reg(), y.reg()); Nothing guarantees that x and y are different registers. http://codereview.chromium.org/661179/diff/1005/1006#newcode5869 src/ia32/codegen-ia32.cc:5869: Extra blank line? http://codereview.chromium.org/661179/diff/1005/1006#newcode5977 src/ia32/codegen-ia32.cc:5977: Load(args->at(0)); This is clearly wrong---the args should not be reevaluated. http://codereview.chromium.org/661179/diff/1005/1006#newcode5979 src/ia32/codegen-ia32.cc:5979: frame_->CallRuntime(Runtime::kMath_pow_cfunction, 2); The call to CallRuntime gave you a result that represents eax. Don't ignore this return value. Result answer = frame()->CallRuntime(...); http://codereview.chromium.org/661179/diff/1005/1006#newcode5983 src/ia32/codegen-ia32.cc:5983: __ bind(&return_preg); There are different live values on the two paths to this label. That's bad. 1. Unuse x and y before jumping here. They are not really live. 2. Use JumpTarget for control flow to allow the register allocator to merge live values in potentially different registers. http://codereview.chromium.org/661179/diff/1005/1006#newcode5984 src/ia32/codegen-ia32.cc:5984: frame_->Push(eax); Avoid pushing registers directly. You can choose to explicitly pass the live value along both branches: JumpTarget done; x.Unuse(); y.Unuse(); done.Jump(&p); ... done.Bind(&answer); frame()->Push(&answer); Or else implicitly as the top frame element: JumpTarget done; x.Unuse(); y.Unuse(); frame()->Push(&p); done.Jump(); ... Result answer = frame()->CallRuntime(...); frame()->Push(&answer); done.Bind(); http://codereview.chromium.org/661179/diff/1005/1006#newcode5986 src/ia32/codegen-ia32.cc:5986: Load(args->at(0)); Indentation is off. http://codereview.chromium.org/661179/diff/1005/1006#newcode6025 src/ia32/codegen-ia32.cc:6025: // We should never have anything but a double here - in case we I don't understand the comment because I don't understand the context. But in general: 1. If this check could be expected to fail, you have to do something safe (even crashing via CHECK) instead of this. As it stands, there are two different frame heights reaching label end, and then who knows what will happen? 2. If this is a precondition that the caller should guarantee, then don't do the comparison in release builds. Instead, generate debug code guarded by if (FLAG_debug_code) to verify the precondition and assert. http://codereview.chromium.org/661179/diff/1005/1006#newcode6028 src/ia32/codegen-ia32.cc:6028: Factory::heap_number_map()); Indentation is off. http://codereview.chromium.org/661179/diff/1005/1006#newcode6034 src/ia32/codegen-ia32.cc:6034: Result scratch = allocator()->Allocate(); Using plain labels instead of JumpTarget for control flow is only safe if there is no frame effect. Allocating a register is a frame effect, so the frames reaching end not only have different heights, but they could differ in an arbitrary element. (They actually won't, because all platforms have more than one register available for allocation, but you should not rely on that.)
Kevin, I created a new version of Math.pow with the suggestions that you gave (here and offline). In addition, I changed Math.sqrt to also use a jumptarget and copy the frame to allow for correct handling of allocation failures. Could you have another look? http://codereview.chromium.org/661179/diff/1005/1006 File src/ia32/codegen-ia32.cc (right): http://codereview.chromium.org/661179/diff/1005/1006#newcode5817 src/ia32/codegen-ia32.cc:5817: On 2010/02/26 13:49:33, Lasse Reichstein wrote: > I assume this is just a move, with no changes in the function. Yes http://codereview.chromium.org/661179/diff/1005/1006#newcode5817 src/ia32/codegen-ia32.cc:5817: On 2010/02/28 04:56:01, Kevin Millikin wrote: > On 2010/02/26 13:49:33, Lasse Reichstein wrote: > > I assume this is just a move, with no changes in the function. > > I'm still concerned that it hasn't been carefully reviewed. I will address all of the below http://codereview.chromium.org/661179/diff/1005/1006#newcode5824 src/ia32/codegen-ia32.cc:5824: Load(args->at(0)); On 2010/02/28 04:56:01, Kevin Millikin wrote: > Since the args need to be evaluated in order in all cases, please lift it out of > the conditional. Done. http://codereview.chromium.org/661179/diff/1005/1006#newcode5828 src/ia32/codegen-ia32.cc:5828: Result p = allocator()->Allocate(eax); On 2010/02/28 04:56:01, Kevin Millikin wrote: > Is there any reason to prefer eax here? I don't see one, but there is a lot of > code. > > The style guide suggests avoiding abbreviations unless they are common in the > domain. I'm not sure 'p' counts as a common abbreviation. There must be a > better name. Done. http://codereview.chromium.org/661179/diff/1005/1006#newcode5830 src/ia32/codegen-ia32.cc:5830: Result x= frame_->Pop(); On 2010/02/28 04:56:01, Kevin Millikin wrote: > Space between 'x' and '='. Done. http://codereview.chromium.org/661179/diff/1005/1006#newcode5831 src/ia32/codegen-ia32.cc:5831: if (p.is_valid() && p.reg().is(eax)) { On 2010/02/28 04:56:01, Kevin Millikin wrote: > p is always valid. You've allocated eax with no live values, so it must be. > > It cannot be other than eax. The register allocator will not give you some > other register when you request eax. > > No need for the if, a simple ASSERT works if you're worried about something. Done. http://codereview.chromium.org/661179/diff/1005/1006#newcode5849 src/ia32/codegen-ia32.cc:5849: __ j(not_zero, &x_is_double); On 2010/02/28 04:56:01, Kevin Millikin wrote: > 'x_is_double' seems like a bad name, since we're not sure it's a double. (Plus, > we usually use the term "heap number".) Changed http://codereview.chromium.org/661179/diff/1005/1006#newcode5867 src/ia32/codegen-ia32.cc:5867: __ mov(x.reg(), y.reg()); On 2010/02/28 04:56:01, Kevin Millikin wrote: > Nothing guarantees that x and y are different registers. As discussed offline - we are actually sure that they are heap numbers and that they are different. http://codereview.chromium.org/661179/diff/1005/1006#newcode5869 src/ia32/codegen-ia32.cc:5869: On 2010/02/28 04:56:01, Kevin Millikin wrote: > Extra blank line? Done. http://codereview.chromium.org/661179/diff/1005/1006#newcode5977 src/ia32/codegen-ia32.cc:5977: Load(args->at(0)); On 2010/02/28 04:56:01, Kevin Millikin wrote: > This is clearly wrong---the args should not be reevaluated. As discussed offline this has now been changed completely. http://codereview.chromium.org/661179/diff/1005/1006#newcode5979 src/ia32/codegen-ia32.cc:5979: frame_->CallRuntime(Runtime::kMath_pow_cfunction, 2); On 2010/02/28 04:56:01, Kevin Millikin wrote: > The call to CallRuntime gave you a result that represents eax. Don't ignore > this return value. > > Result answer = frame()->CallRuntime(...); Done. http://codereview.chromium.org/661179/diff/1005/1006#newcode5983 src/ia32/codegen-ia32.cc:5983: __ bind(&return_preg); On 2010/02/28 04:56:01, Kevin Millikin wrote: > There are different live values on the two paths to this label. That's bad. > > 1. Unuse x and y before jumping here. They are not really live. > > 2. Use JumpTarget for control flow to allow the register allocator to merge live > values in potentially different registers. Done. http://codereview.chromium.org/661179/diff/1005/1006#newcode5984 src/ia32/codegen-ia32.cc:5984: frame_->Push(eax); On 2010/02/28 04:56:01, Kevin Millikin wrote: > Avoid pushing registers directly. You can choose to explicitly pass the live > value along both branches: > > JumpTarget done; > x.Unuse(); > y.Unuse(); > done.Jump(&p); > ... > done.Bind(&answer); > frame()->Push(&answer); > > Or else implicitly as the top frame element: > > JumpTarget done; > x.Unuse(); > y.Unuse(); > frame()->Push(&p); > done.Jump(); > ... > Result answer = frame()->CallRuntime(...); > frame()->Push(&answer); > done.Bind(); Done. http://codereview.chromium.org/661179/diff/1005/1006#newcode5986 src/ia32/codegen-ia32.cc:5986: Load(args->at(0)); On 2010/02/28 04:56:01, Kevin Millikin wrote: > Indentation is off. Done. http://codereview.chromium.org/661179/diff/1005/1006#newcode6015 src/ia32/codegen-ia32.cc:6015: Result result = frame_->Pop(); On 2010/02/26 13:49:33, Lasse Reichstein wrote: > Remember to do result.ToRegister() and result.Spill(). Done. http://codereview.chromium.org/661179/diff/1005/1006#newcode6025 src/ia32/codegen-ia32.cc:6025: // We should never have anything but a double here - in case we On 2010/02/28 04:56:01, Kevin Millikin wrote: > I don't understand the comment because I don't understand the context. But in > general: > > 1. If this check could be expected to fail, you have to do something safe (even > crashing via CHECK) instead of this. As it stands, there are two different > frame heights reaching label end, and then who knows what will happen? > > 2. If this is a precondition that the caller should guarantee, then don't do the > comparison in release builds. Instead, generate debug code guarded by if > (FLAG_debug_code) to verify the precondition and assert. As discussed offline we always have heapnumbers here - I have added a check in case the FLAG_debug_code flag is set. http://codereview.chromium.org/661179/diff/1005/1006#newcode6028 src/ia32/codegen-ia32.cc:6028: Factory::heap_number_map()); On 2010/02/28 04:56:01, Kevin Millikin wrote: > Indentation is off. Done. http://codereview.chromium.org/661179/diff/1005/1006#newcode6029 src/ia32/codegen-ia32.cc:6029: __ j(not_equal, &end); On 2010/02/26 13:49:33, Lasse Reichstein wrote: > Move frame_->Push(&result) after bind(&end) so it becomes obvious that we return > result.reg() in either case. > (That's what happens now anyway, it's hard to see). Done. http://codereview.chromium.org/661179/diff/1005/1006#newcode6034 src/ia32/codegen-ia32.cc:6034: Result scratch = allocator()->Allocate(); On 2010/02/28 04:56:01, Kevin Millikin wrote: > Using plain labels instead of JumpTarget for control flow is only safe if there > is no frame effect. > > Allocating a register is a frame effect, so the frames reaching end not only > have different heights, but they could differ in an arbitrary element. (They > actually won't, because all platforms have more than one register available for > allocation, but you should not rely on that.) Done. http://codereview.chromium.org/661179/diff/1005/1006#newcode6035 src/ia32/codegen-ia32.cc:6035: __ AllocateHeapNumber(result.reg(), scratch.reg(), no_reg, &end); On 2010/02/26 13:49:33, Lasse Reichstein wrote: > If allocation fails, result.reg() have been changed to point to unallocated > memory (very near the end of the newspace semispace). On failure you should go > to somewhere that calls runtime and does finishes the job instead of just > returning and (eventually) crashing. Done.
I only looked at GenerateMathPow, but that LGTM. http://codereview.chromium.org/661179/diff/12/13 File src/ia32/codegen-ia32.cc (right): http://codereview.chromium.org/661179/diff/12/13#newcode5963 src/ia32/codegen-ia32.cc:5963: // allocate_return.Branch(not_equal); Commented code? http://codereview.chromium.org/661179/diff/12/13#newcode6046 src/ia32/codegen-ia32.cc:6046: Result res = frame_->CallRuntime(Runtime::kMath_pow, 2); Indentation is off here. You might consider switching the sense of the conditional (!SSE supported), so the little case is at the top.
I addressed the comments - could one of you have look at sqrt as well? http://codereview.chromium.org/661179/diff/12/13 File src/ia32/codegen-ia32.cc (right): http://codereview.chromium.org/661179/diff/12/13#newcode5963 src/ia32/codegen-ia32.cc:5963: // allocate_return.Branch(not_equal); On 2010/03/03 15:08:09, Kevin Millikin wrote: > Commented code? Done. http://codereview.chromium.org/661179/diff/12/13#newcode6046 src/ia32/codegen-ia32.cc:6046: Result res = frame_->CallRuntime(Runtime::kMath_pow, 2); On 2010/03/03 15:08:09, Kevin Millikin wrote: > Indentation is off here. > > You might consider switching the sense of the conditional (!SSE supported), so > the little case is at the top. Done.
Just looking at squareroot: LGTM. http://codereview.chromium.org/661179/diff/12/13 File src/ia32/codegen-ia32.cc (right): http://codereview.chromium.org/661179/diff/12/13#newcode6076 src/ia32/codegen-ia32.cc:6076: if (CpuFeatures::IsSupported(SSE2)) { You might want a x87-FPU version as well. I'm not sure whether we still create snapshots where SSE2 is not assumed. http://codereview.chromium.org/661179/diff/12/13#newcode6094 src/ia32/codegen-ia32.cc:6094: if (FLAG_debug_code) { You need to check that the argument is a HeapNumber every time, not just in debug mode. Arguments to runtime functions must be assumed to be potentially crafted by a malicious source. http://codereview.chromium.org/661179/diff/12/13#newcode6112 src/ia32/codegen-ia32.cc:6112: end.Jump(&result); Could you move this runtime call somewhere else, so that the fast path is just falling through, without any jumps. E.g., after the unconditional jump in line 6092. I'm not sure how that would interact with the virtual frame, though.
http://codereview.chromium.org/661179/diff/2001/2002 File src/ia32/codegen-ia32.cc (right): http://codereview.chromium.org/661179/diff/2001/2002#newcode6081 src/ia32/codegen-ia32.cc:6081: frame_->Spill(result.reg()); frame() http://codereview.chromium.org/661179/diff/2001/2002#newcode6105 src/ia32/codegen-ia32.cc:6105: Result scratch = allocator()->Allocate(); Allocating a register has a virtual frame effect so the cloned frame and the frame actually reaching the call to AllocateHeapNumber are not identical. The register needs to be allocated before the frame is cloned. http://codereview.chromium.org/661179/diff/2001/2002#newcode6117 src/ia32/codegen-ia32.cc:6117: result = frame_->CallRuntime(Runtime::kMath_sqrt, 1); frame() http://codereview.chromium.org/661179/diff/2001/2002#newcode6120 src/ia32/codegen-ia32.cc:6120: frame_->Push(&result); frame() http://codereview.chromium.org/661179/diff/2001/2002#newcode6122 src/ia32/codegen-ia32.cc:6122: Result result = frame()->CallRuntime(Runtime::kMath_sqrt, 1); As before (though this one probably fits on a page), it might read a bit better if the small branch of the if...else is first.
http://codereview.chromium.org/661179/diff/2001/2002 File src/ia32/codegen-ia32.cc (right): http://codereview.chromium.org/661179/diff/2001/2002#newcode6081 src/ia32/codegen-ia32.cc:6081: frame_->Spill(result.reg()); On 2010/03/08 08:26:07, Kevin Millikin wrote: > frame() Done. http://codereview.chromium.org/661179/diff/2001/2002#newcode6105 src/ia32/codegen-ia32.cc:6105: Result scratch = allocator()->Allocate(); On 2010/03/08 08:26:07, Kevin Millikin wrote: > Allocating a register has a virtual frame effect so the cloned frame and the > frame actually reaching the call to AllocateHeapNumber are not identical. > > The register needs to be allocated before the frame is cloned. Done. http://codereview.chromium.org/661179/diff/2001/2002#newcode6117 src/ia32/codegen-ia32.cc:6117: result = frame_->CallRuntime(Runtime::kMath_sqrt, 1); On 2010/03/08 08:26:07, Kevin Millikin wrote: > frame() Done. http://codereview.chromium.org/661179/diff/2001/2002#newcode6120 src/ia32/codegen-ia32.cc:6120: frame_->Push(&result); On 2010/03/08 08:26:07, Kevin Millikin wrote: > frame() Done. http://codereview.chromium.org/661179/diff/2001/2002#newcode6122 src/ia32/codegen-ia32.cc:6122: Result result = frame()->CallRuntime(Runtime::kMath_sqrt, 1); On 2010/03/08 08:26:07, Kevin Millikin wrote: > As before (though this one probably fits on a page), it might read a bit better > if the small branch of the if...else is first. Done.
I added the map heap-checks in both pow and sqrt. If we later decide that these are (security wise) not necessary we can take them out (here and other places) and take the performance improvement of that. http://codereview.chromium.org/661179/diff/12/13 File src/ia32/codegen-ia32.cc (right): http://codereview.chromium.org/661179/diff/12/13#newcode6076 src/ia32/codegen-ia32.cc:6076: if (CpuFeatures::IsSupported(SSE2)) { On 2010/03/08 08:22:05, Lasse Reichstein wrote: > You might want a x87-FPU version as well. > I'm not sure whether we still create snapshots where SSE2 is not assumed. When I create the snapshot version I do get the improved performance so I guess that we do assume SSE2 when building snapshots? I could still make an optimized version using the FPU though - but in many existing places we only have the SSE2 version anyway. http://codereview.chromium.org/661179/diff/12/13#newcode6094 src/ia32/codegen-ia32.cc:6094: if (FLAG_debug_code) { On 2010/03/08 08:22:05, Lasse Reichstein wrote: > You need to check that the argument is a HeapNumber every time, not just in > debug mode. Arguments to runtime functions must be assumed to be potentially > crafted by a malicious source. Done. http://codereview.chromium.org/661179/diff/12/13#newcode6112 src/ia32/codegen-ia32.cc:6112: end.Jump(&result); On 2010/03/08 08:22:05, Lasse Reichstein wrote: > Could you move this runtime call somewhere else, so that the fast path is just > falling through, without any jumps. E.g., after the unconditional jump in line > 6092. > I'm not sure how that would interact with the virtual frame, though. As discussed offline I will leave this here to avoid complicating the code.
LGTM. http://codereview.chromium.org/661179/diff/3015/2013 File src/ia32/codegen-ia32.cc (right): http://codereview.chromium.org/661179/diff/3015/2013#newcode5918 src/ia32/codegen-ia32.cc:5918: Factory::heap_number_map()); Indentation? Ditto for other two. http://codereview.chromium.org/661179/diff/3015/2013#newcode5919 src/ia32/codegen-ia32.cc:5919: call_runtime.Branch(not_zero); not_zero->not_equal (I know they are identical, but for a comparison, it reads much better) Ditto for other two. |