|
|
Created:
5 years, 3 months ago by fedor.indutny Modified:
5 years ago CC:
danno, v8-reviews_googlegroups.com Base URL:
https://chromium.googlesource.com/v8/v8.git@master Target Ref:
refs/pending/heads/master Project:
v8 Visibility:
Public. |
Descriptionbinary-operator-reducer: reduce mul+div(shift)
Reduction Input:
ChangeInt32ToFloat64=> TruncateFloat64ToInt32
Float64Mul=>
ChangeInt32ToFloat64=> Float64Div=>TruncateFloat64ToInt32
Output:
=> TruncateInt64ToInt32
Int64Mul
=> Int64Shr => TruncateInt64ToInt32
Test code:
function mul(a, b) {
var l = a & 0x3ffffff;
var h = b & 0x3ffffff;
var m = l * h;
var rl = m & 0x3ffffff;
var rh = (m / 0x4000000) | 0;
return rl | rh;
}
mul(1, 2);
var a0 = mul(0x3ffffff, 0x3ffffff);
mul(0x0, 0x0);
%OptimizeFunctionOnNextCall(mul);
var a1 = mul(0x3ffffff, 0x3ffffff);
print(a0 + ' == ' + a1);
BUG=
R=mstarzinger@chromium.org
Committed: https://crrev.com/461e5b49d022335a7fc4e9d172397a4bd48b93d4
Cr-Commit-Position: refs/heads/master@{#31899}
Patch Set 1 #Patch Set 2 : ensure no extra precision #
Total comments: 8
Patch Set 3 : backwards-only matching #Patch Set 4 : TruncateInt64ToFloat64 #
Total comments: 10
Patch Set 5 : remove Revisit #Patch Set 6 : fixes #Patch Set 7 : add test #Patch Set 8 : use matchers #Patch Set 9 : check range both ways #
Total comments: 2
Patch Set 10 : compiler: move round + truncate to machine-reducer #Patch Set 11 : make test conditional #
Messages
Total messages: 61 (9 generated)
Hello! This is my first patch for the Turbofan. I'm sure there are lots of bad things in the code - please bear with this, I'm fairly new to this code! Please let me know - what should be fixed in order to get this in. This is a good improvement for my project bn.js : http://github.com/indutny/bn.js Thank you very much, hopefully I didn't waste much time of yours with this. Cheers, Fedor.
Just in case, generated code looks like this: 0x3ed0cc13aa8a 74 488b5de8 REX.W movq rbx,[rbp-0x18] 0x3ed0cc13aa8e 78 480fafc3 REX.W imulq rax,rbx 0x3ed0cc13aa92 82 488bd8 REX.W movq rbx,rax 0x3ed0cc13aa95 85 48c1eb1a REX.W shrq rbx, 26 0x3ed0cc13aa99 89 8bc0 movl rax,rax 0x3ed0cc13aa9b 91 8bdb movl rbx,rbx
I think there might be a problem with the increased precision in current version of the patch. I'll try to figure out the way to fix it.
Just limited the maximum precision to the 52 bits. Hopefully, the implementation is not that awful :)
danno@chromium.org changed reviewers: + danno@chromium.org
Thanks for the patch. Could you perhaps provide some motivation for this CL, i.e. is this reduction important for something else bigger that you're trying to accomplish?
Of course! This is the very heavily used loop in my bn.js library that I'm trying to optimize: https://github.com/indutny/bn.js/blob/49dd63826efeddaf519f2ba53e40b90694f2426... Currently the loads from `this.words` and `num.words` are converted to float64, multiplied, divided by floating point value, and converted back to integers. Skipping the floating point conversions and division, this code will gain 20% performance improvement.
Daniel, Kindly reminding you about this. Thanks.
On 2015/09/28 17:51:26, fedor.indutny wrote: > Daniel, > > Kindly reminding you about this. > > Thanks. Hi Fedor, I think the right place for this optimization would be in compiler/machine-operator-reducer.cc, which handles reductions for that level of operators. Can you explain a bit more about this optimization and how general you estimate it to be? Also, you will need to write some unittests and ideally some test JS code as well for us to have confidence in it.
titzer@chromium.org changed reviewers: + titzer@chromium.org
https://codereview.chromium.org/1350223006/diff/20001/src/compiler/binary-ope... File src/compiler/binary-operator-reducer.cc (right): https://codereview.chromium.org/1350223006/diff/20001/src/compiler/binary-ope... src/compiler/binary-operator-reducer.cc:41: Reduction BinaryOperatorReducer::ReduceFloat64Mul(Node* node) { If you do this optimization at the site of the division, then you won't have to hunt through uses of this multiply. https://codereview.chromium.org/1350223006/diff/20001/src/compiler/binary-ope... src/compiler/binary-operator-reducer.cc:58: for (Node* use : node->uses()) { Generally reductions shouldn't be iterating over the uses of an instruction. You should match the "root" of the graph of nodes that you want to transform.
https://codereview.chromium.org/1350223006/diff/20001/src/compiler/binary-ope... File src/compiler/binary-operator-reducer.cc (right): https://codereview.chromium.org/1350223006/diff/20001/src/compiler/binary-ope... src/compiler/binary-operator-reducer.cc:119: node->InputAt(1)->InputAt(0)); TurboFan also only supports Int64Mul on 64-bit platforms (for the moment; we will probably fill in 32-bit support, but it won't be as fast as a Float64Mul, likely).
On 2015/09/29 11:13:59, titzer wrote: > On 2015/09/28 17:51:26, fedor.indutny wrote: > > Daniel, > > > > Kindly reminding you about this. > > > > Thanks. > > Hi Fedor, I think the right place for this optimization would be in > compiler/machine-operator-reducer.cc, which handles reductions for that level of > operators. Hello! I'm afraid machine-operator-reducer.cc can't be used for this, as this reduction requires AdvancedReducer and MachineOperatorReducer is a child of Reducer. > > Can you explain a bit more about this optimization and how general you estimate > it to be? > This optimization is very "big number library"-specific. I'm sure that many other libraries may benefit from it, and there is tons of production crypto code that relies on such libraries. So I would say that impact should be quite big. BN.js operates on 26bit limbs, because you may multiply them with a single instruction without precision loss (52bits of precision in JS float type). This is kind of compromise between fast addition of big numbers, and fast multiplication, and the main use case is elliptic curves which often use 256 bit numbers. Using 26bit limbs instead of (let's say) 16bit ones means that BN.js will be using 10 limbs instead of 16, and since multiplication is O(n^2) - it makes a big difference in performance. The only problem with 26bit limb multiplication is that it converts intermediate results to floating point numbers. This CL solves this by detecting the edge case, where floating point multiplication is happening without the loss in precision, and the results are converted to integers. It removes floating points ops altogether, thus speeding up things for BN.js I hope these comments explains it, if not - please let me know! > Also, you will need to write some unittests and ideally some test JS code as > well for us to have confidence in it. Surely will do it, thanks!
Answered your questions. Please let me know how it could be improved, going to start working on a test as soon as we will decide on the main component of this CL. On other hand, I understand that you may not have time to do several reviews of this. If this is the case - please let me know and I will provide everything at once. Thank you very much for your time! https://codereview.chromium.org/1350223006/diff/20001/src/compiler/binary-ope... File src/compiler/binary-operator-reducer.cc (right): https://codereview.chromium.org/1350223006/diff/20001/src/compiler/binary-ope... src/compiler/binary-operator-reducer.cc:41: Reduction BinaryOperatorReducer::ReduceFloat64Mul(Node* node) { On one hand - yes, on other hand - doing it at the multiplication means more flexibility, and more general optimization. While bn.js use case does both division and cast to int (via `num | 0`), some code may be doing just the latter one without dividing it all. I would rather keep it working at multiplication node. https://codereview.chromium.org/1350223006/diff/20001/src/compiler/binary-ope... src/compiler/binary-operator-reducer.cc:58: for (Node* use : node->uses()) { I'm looking both up in the tree and down in the tree. Not sure if it could be done differently. Perhaps you have more exact suggestions for me? https://codereview.chromium.org/1350223006/diff/20001/src/compiler/binary-ope... src/compiler/binary-operator-reducer.cc:119: node->InputAt(1)->InputAt(0)); What would be the best way to make this a platform specific reduction then?
On 2015/10/02 02:49:52, fedor.indutny wrote: > On 2015/09/29 11:13:59, titzer wrote: > > On 2015/09/28 17:51:26, fedor.indutny wrote: > > > Daniel, > > > > > > Kindly reminding you about this. > > > > > > Thanks. > > > > Hi Fedor, I think the right place for this optimization would be in > > compiler/machine-operator-reducer.cc, which handles reductions for that level > of > > operators. > > Hello! > > I'm afraid machine-operator-reducer.cc can't be used for this, as this reduction > requires AdvancedReducer and MachineOperatorReducer is a child of Reducer. > > > > > Can you explain a bit more about this optimization and how general you > estimate > > it to be? > > > > This optimization is very "big number library"-specific. I'm sure that many > other libraries may benefit from it, and there is tons of production crypto code > that relies on such libraries. So I would say that impact should be quite big. > > BN.js operates on 26bit limbs, because you may multiply them with a single > instruction without precision loss (52bits of precision in JS float type). This > is kind of compromise between fast addition of big numbers, and fast > multiplication, and the main use case is elliptic curves which often use 256 bit > numbers. Using 26bit limbs instead of (let's say) 16bit ones means that BN.js > will be using 10 limbs instead of 16, and since multiplication is O(n^2) - it > makes a big difference in performance. > > The only problem with 26bit limb multiplication is that it converts intermediate > results to floating point numbers. This CL solves this by detecting the edge > case, where floating point multiplication is happening without the loss in > precision, and the results are converted to integers. It removes floating points > ops altogether, thus speeding up things for BN.js > > I hope these comments explains it, if not - please let me know! > > > Also, you will need to write some unittests and ideally some test JS code as > > well for us to have confidence in it. > > Surely will do it, thanks! Hi, thanks for the explanation, the context helped a lot. Some suggestions: you should be able to use the range types to get a more precise bound on the inputs to the multiplication. That will save some pattern matching for truncations on the inputs.. In general, you can't ignore uses from FrameStates, since they can observe intermediate values. You really need to compute whether there are any observations of the intermediate values to make this optimization safe.
https://codereview.chromium.org/1350223006/diff/20001/src/compiler/binary-ope... File src/compiler/binary-operator-reducer.cc (right): https://codereview.chromium.org/1350223006/diff/20001/src/compiler/binary-ope... src/compiler/binary-operator-reducer.cc:58: for (Node* use : node->uses()) { On 2015/10/02 02:54:30, fedor.indutny wrote: > I'm looking both up in the tree and down in the tree. Not sure if it could be > done differently. Perhaps you have more exact suggestions for me? Yes. What I meant is that reductions are applied in a specific order to the nodes in the graph, and work best when reducing the "root" of a tree _after_ having first reduced the inputs. The GraphReducer algorithm gives you this order for free. When you are looking at uses of a node, you are generally look at unreduced nodes and may miss an opportunity; or you might see some uses that will later get trimmed away. That's why it's usually better to apply your matching/optimization at the so called "root" of the graph of nodes you want to match; it will interact better with the order of visitation and generally be safer. https://codereview.chromium.org/1350223006/diff/20001/src/compiler/binary-ope... src/compiler/binary-operator-reducer.cc:119: node->InputAt(1)->InputAt(0)); On 2015/10/02 02:54:30, fedor.indutny wrote: > What would be the best way to make this a platform specific reduction then? Just check machine()->Is64()
On 2015/10/06 23:10:39, titzer wrote: > On 2015/10/02 02:49:52, fedor.indutny wrote: > > On 2015/09/29 11:13:59, titzer wrote: > > > On 2015/09/28 17:51:26, fedor.indutny wrote: > > > > Daniel, > > > > > > > > Kindly reminding you about this. > > > > > > > > Thanks. > > > > > > Hi Fedor, I think the right place for this optimization would be in > > > compiler/machine-operator-reducer.cc, which handles reductions for that > level > > of > > > operators. > > > > Hello! > > > > I'm afraid machine-operator-reducer.cc can't be used for this, as this > reduction > > requires AdvancedReducer and MachineOperatorReducer is a child of Reducer. > > > > > > > > Can you explain a bit more about this optimization and how general you > > estimate > > > it to be? > > > > > > > This optimization is very "big number library"-specific. I'm sure that many > > other libraries may benefit from it, and there is tons of production crypto > code > > that relies on such libraries. So I would say that impact should be quite big. > > > > BN.js operates on 26bit limbs, because you may multiply them with a single > > instruction without precision loss (52bits of precision in JS float type). > This > > is kind of compromise between fast addition of big numbers, and fast > > multiplication, and the main use case is elliptic curves which often use 256 > bit > > numbers. Using 26bit limbs instead of (let's say) 16bit ones means that BN.js > > will be using 10 limbs instead of 16, and since multiplication is O(n^2) - it > > makes a big difference in performance. > > > > The only problem with 26bit limb multiplication is that it converts > intermediate > > results to floating point numbers. This CL solves this by detecting the edge > > case, where floating point multiplication is happening without the loss in > > precision, and the results are converted to integers. It removes floating > points > > ops altogether, thus speeding up things for BN.js > > > > I hope these comments explains it, if not - please let me know! > > > > > Also, you will need to write some unittests and ideally some test JS code as > > > well for us to have confidence in it. > > > > Surely will do it, thanks! > > Hi, thanks for the explanation, the context helped a lot. > > Some suggestions: you should be able to use the range types to get a more > precise bound on the inputs to the multiplication. > That will save some pattern matching for truncations on the inputs.. Hm... this is a good idea. However I skip cast to float altogether if the match has succeeded. Therefore I still need to know whether to insert cast to int, or just skip cast to float. > > In general, you can't ignore uses from FrameStates, since they can observe > intermediate values. You really need to compute > whether there are any observations of the intermediate values to make this > optimization safe. May I ask you to enlighten me a bit on FrameStates? I have seen them through the code base, but not exactly sure what they capture, and what is their function?
On 2015/10/06 23:14:11, titzer wrote: > https://codereview.chromium.org/1350223006/diff/20001/src/compiler/binary-ope... > File src/compiler/binary-operator-reducer.cc (right): > > https://codereview.chromium.org/1350223006/diff/20001/src/compiler/binary-ope... > src/compiler/binary-operator-reducer.cc:58: for (Node* use : node->uses()) { > On 2015/10/02 02:54:30, fedor.indutny wrote: > > I'm looking both up in the tree and down in the tree. Not sure if it could be > > done differently. Perhaps you have more exact suggestions for me? > > Yes. What I meant is that reductions are applied in a specific order to the > nodes in the graph, and work best when reducing the "root" of a tree _after_ > having first reduced the inputs. The GraphReducer algorithm gives you this order > for free. When you are looking at uses of a node, you are generally look at > unreduced nodes and may miss an opportunity; or you might see some uses that > will later get trimmed away. That's why it's usually better to apply your > matching/optimization at the so called "root" of the graph of nodes you want to > match; it will interact better with the order of visitation and generally be > safer. OK, I'll try to do it differently! > > https://codereview.chromium.org/1350223006/diff/20001/src/compiler/binary-ope... > src/compiler/binary-operator-reducer.cc:119: node->InputAt(1)->InputAt(0)); > On 2015/10/02 02:54:30, fedor.indutny wrote: > > What would be the best way to make this a platform specific reduction then? > > Just check machine()->Is64() Cool, thanks!
I have one question. If I will do it at TruncateFloatToInt32 and Float64Div instead - what should I do about the multiplication? I can certainly copy it twice, but this will create extra work for GVN. Does this sounds OK?
Updated CL, now it matches from the very bottom of replacement, and iterates only inputs. Hopefully it is better this way! ;)
After some consideration, I'm not sure if the way it is done in the second version of patch is optimal. It will replace Float64Mul only for some of its uses, possibly leaving the original Float64Mul in a graph. I think it should replace all Float64Mul(ChangeInt32(A), ChangeInt32(B)) to TruncateInt64ToFloat64(Int64Mul(A, B)) (which will require introducing TruncateInt64ToFloat64). As a next step it could replace TruncateFloat64ToInt32(TruncateInt64ToFloat64(A)) with TruncateInt64ToInt32(A), DivFloat64(TruncateInt64ToFloat64(A), POWER_OF_TWO) with TruncateInt64ToFloat64(Word64Shr(A, SHIFT)). This reduction appears to be more generic, dependent only on immediate inputs of the node, and should not really degrade regular floating point multiplication. What do you think?
On 2015/10/11 17:07:31, fedor.indutny wrote: > After some consideration, I'm not sure if the way it is done in the second > version of patch is optimal. It will replace Float64Mul only for some of its > uses, possibly leaving the original Float64Mul in a graph. > > I think it should replace all Float64Mul(ChangeInt32(A), ChangeInt32(B)) to > TruncateInt64ToFloat64(Int64Mul(A, B)) (which will require introducing > TruncateInt64ToFloat64). As a next step it could replace > TruncateFloat64ToInt32(TruncateInt64ToFloat64(A)) with TruncateInt64ToInt32(A), > DivFloat64(TruncateInt64ToFloat64(A), POWER_OF_TWO) with > TruncateInt64ToFloat64(Word64Shr(A, SHIFT)). > > This reduction appears to be more generic, dependent only on immediate inputs of > the node, and should not really degrade regular floating point multiplication. > What do you think? I changed this CL to use newly-introduced TruncateInt64ToFloat64.
https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... File src/compiler/binary-operator-reducer.cc (right): https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... src/compiler/binary-operator-reducer.cc:47: if (!machine()->Is64()) return NoChange(); You can use a BinopMatcher here to make this a bit shorter. https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... src/compiler/binary-operator-reducer.cc:50: node->InputAt(1)->opcode() != IrOpcode::kChangeInt32ToFloat64) { My earlier comment about using the range types applied to the inputs. So if you check the input ranges, you won't need to pattern match on the opcodes. Range types are always integers in the type system. https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... src/compiler/binary-operator-reducer.cc:77: Revisit(out); No need for revisits; the graph reducer takes care of that for you. https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... src/compiler/binary-operator-reducer.cc:82: Reduction BinaryOperatorReducer::ReduceFloat52Div(Node* node) { The optimization here with division is only sound if it is fed into a truncation (i.e. conversion to integer), so you'll need to match that at the bottom of the pattern.
https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... File src/compiler/binary-operator-reducer.cc (right): https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... src/compiler/binary-operator-reducer.cc:50: node->InputAt(1)->opcode() != IrOpcode::kChangeInt32ToFloat64) { But the idea here is that the original values were integers, and not floats. Otherwise I will need to cast them to ints to do Int64Mul, right? This cast will make this code slower for the case, where the multiplication is not fed into truncation, or division. https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... src/compiler/binary-operator-reducer.cc:77: Revisit(out); On 2015/10/16 18:44:06, titzer wrote: > No need for revisits; the graph reducer takes care of that for you. Acknowledged. https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... src/compiler/binary-operator-reducer.cc:82: Reduction BinaryOperatorReducer::ReduceFloat52Div(Node* node) { Hm... Not sure I understand why. I have modeled it that way that it should be semantically correct on every step of reduction. Div(Int64ToFloat64(A), POWER_OF_TWO) is replaced with Int64ToFloat(SHR(A), SHIFT). So the result is correct, I think?
https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... File src/compiler/binary-operator-reducer.cc (right): https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... src/compiler/binary-operator-reducer.cc:82: Reduction BinaryOperatorReducer::ReduceFloat52Div(Node* node) { On 2015/10/16 18:52:35, fedor.indutny wrote: > Hm... Not sure I understand why. I have modeled it that way that it should be > semantically correct on every step of reduction. > > Div(Int64ToFloat64(A), POWER_OF_TWO) is replaced with Int64ToFloat(SHR(A), > SHIFT). So the result is correct, I think? Because the fractional bits would be lost. Which is OK if the operation feeds into another truncation, which discards those bits anyway.
https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... File src/compiler/binary-operator-reducer.cc (right): https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... src/compiler/binary-operator-reducer.cc:82: Reduction BinaryOperatorReducer::ReduceFloat52Div(Node* node) { On 2015/10/16 19:51:54, titzer wrote: > On 2015/10/16 18:52:35, fedor.indutny wrote: > > Hm... Not sure I understand why. I have modeled it that way that it should be > > semantically correct on every step of reduction. > > > > Div(Int64ToFloat64(A), POWER_OF_TWO) is replaced with Int64ToFloat(SHR(A), > > SHIFT). So the result is correct, I think? > > Because the fractional bits would be lost. Which is OK if the operation feeds > into another truncation, which discards those bits anyway. Oooh, this makes sense! However this is not going to play very nicely with the reducer, isn't it?
On 2015/10/16 19:54:19, fedor.indutny wrote: > https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... > File src/compiler/binary-operator-reducer.cc (right): > > https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... > src/compiler/binary-operator-reducer.cc:82: Reduction > BinaryOperatorReducer::ReduceFloat52Div(Node* node) { > On 2015/10/16 19:51:54, titzer wrote: > > On 2015/10/16 18:52:35, fedor.indutny wrote: > > > Hm... Not sure I understand why. I have modeled it that way that it should > be > > > semantically correct on every step of reduction. > > > > > > Div(Int64ToFloat64(A), POWER_OF_TWO) is replaced with Int64ToFloat(SHR(A), > > > SHIFT). So the result is correct, I think? > > > > Because the fractional bits would be lost. Which is OK if the operation feeds > > into another truncation, which discards those bits anyway. > > Oooh, this makes sense! However this is not going to play very nicely with the > reducer, isn't it? It'll be fine if you match TruncateFloat64ToInt64(Float64Div(ChangeInt64ToFloat64(A), POWER_OF_TWO)) and replace the whole thing by Word64Shr(A, log(POWER_OF_TWO)). Note that we don't have the full set of Int64 <-> Float64 operators yet, but we will be adding them to support WASM, so you can use them internally.
I think I have addressed everything said, except the ranges, which I am not sure what to do about (I still think that range-matching may result in slower performance of code that doesn't do int to float casting). Is it good time to write tests? https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... File src/compiler/binary-operator-reducer.cc (right): https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... src/compiler/binary-operator-reducer.cc:47: if (!machine()->Is64()) return NoChange(); On 2015/10/16 18:44:06, titzer wrote: > You can use a BinopMatcher here to make this a bit shorter. I wasn't able to use BinopMatcher, but I have declared `left` and `right` variables. To use `BinopMatcher`, I will need to write some kind of custom `NodeMatcher`s, that have `HasValue` (not sure if `ValueMatcher` will work for this purpose, because of the parameter casting). Hope it works.
On 2015/10/16 22:30:38, fedor.indutny wrote: > I think I have addressed everything said, except the ranges, which I am not sure > what to do about (I still think that range-matching may result in slower > performance of code that doesn't do int to float casting). > > Is it good time to write tests? > > https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... > File src/compiler/binary-operator-reducer.cc (right): > > https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... > src/compiler/binary-operator-reducer.cc:47: if (!machine()->Is64()) return > NoChange(); > On 2015/10/16 18:44:06, titzer wrote: > > You can use a BinopMatcher here to make this a bit shorter. > > I wasn't able to use BinopMatcher, but I have declared `left` and `right` > variables. To use `BinopMatcher`, I will need to write some kind of custom > `NodeMatcher`s, that have `HasValue` (not sure if `ValueMatcher` will work for > this purpose, because of the parameter casting). > > Hope it works. Any updates?
On 2015/10/16 22:30:38, fedor.indutny wrote: > I think I have addressed everything said, except the ranges, which I am not sure > what to do about (I still think that range-matching may result in slower > performance of code that doesn't do int to float casting). > > Is it good time to write tests? It's always a good time to write tests :) > > https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... > File src/compiler/binary-operator-reducer.cc (right): > > https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... > src/compiler/binary-operator-reducer.cc:47: if (!machine()->Is64()) return > NoChange(); > On 2015/10/16 18:44:06, titzer wrote: > > You can use a BinopMatcher here to make this a bit shorter. > > I wasn't able to use BinopMatcher, but I have declared `left` and `right` > variables. To use `BinopMatcher`, I will need to write some kind of custom > `NodeMatcher`s, that have `HasValue` (not sure if `ValueMatcher` will work for > this purpose, because of the parameter casting). > > Hope it works. BinopMatcher should do the left+right job for you, so please use it if possible. We'd welcome new NodeMatchers that make pattern matching easier, as well.
On 2015/10/30 19:52:57, titzer wrote: > On 2015/10/16 22:30:38, fedor.indutny wrote: > > I think I have addressed everything said, except the ranges, which I am not > sure > > what to do about (I still think that range-matching may result in slower > > performance of code that doesn't do int to float casting). > > > > Is it good time to write tests? > > It's always a good time to write tests :) > Added test. > > > > > https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... > > File src/compiler/binary-operator-reducer.cc (right): > > > > > https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... > > src/compiler/binary-operator-reducer.cc:47: if (!machine()->Is64()) return > > NoChange(); > > On 2015/10/16 18:44:06, titzer wrote: > > > You can use a BinopMatcher here to make this a bit shorter. > > > > I wasn't able to use BinopMatcher, but I have declared `left` and `right` > > variables. To use `BinopMatcher`, I will need to write some kind of custom > > `NodeMatcher`s, that have `HasValue` (not sure if `ValueMatcher` will work for > > this purpose, because of the parameter casting). > > > > Hope it works. > > BinopMatcher should do the left+right job for you, so please use it if possible. > We'd welcome new NodeMatchers that make pattern matching easier, as well. Working on this now.
On 2015/11/03 22:20:24, fedor.indutny wrote: > On 2015/10/30 19:52:57, titzer wrote: > > On 2015/10/16 22:30:38, fedor.indutny wrote: > > > I think I have addressed everything said, except the ranges, which I am not > > sure > > > what to do about (I still think that range-matching may result in slower > > > performance of code that doesn't do int to float casting). > > > > > > Is it good time to write tests? > > > > It's always a good time to write tests :) > > > > Added test. > > > > > > > > > > https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... > > > File src/compiler/binary-operator-reducer.cc (right): > > > > > > > > > https://codereview.chromium.org/1350223006/diff/60001/src/compiler/binary-ope... > > > src/compiler/binary-operator-reducer.cc:47: if (!machine()->Is64()) return > > > NoChange(); > > > On 2015/10/16 18:44:06, titzer wrote: > > > > You can use a BinopMatcher here to make this a bit shorter. > > > > > > I wasn't able to use BinopMatcher, but I have declared `left` and `right` > > > variables. To use `BinopMatcher`, I will need to write some kind of custom > > > `NodeMatcher`s, that have `HasValue` (not sure if `ValueMatcher` will work > for > > > this purpose, because of the parameter casting). > > > > > > Hope it works. > > > > BinopMatcher should do the left+right job for you, so please use it if > possible. > > We'd welcome new NodeMatchers that make pattern matching easier, as well. > > Working on this now. How about this (patchset #8) ?
Kindly reminding you about this. I really would like to get my first turbofan patch landed! :)
Logic looks all good now. Thanks for writing tests. Are you using any of the extra methods from AdvancedReducer anymore? If not, then we can convert this to a regular reducer and/or merge its logic into the machine-operator-reducer.
On 2015/11/09 04:17:18, titzer wrote: > Logic looks all good now. Thanks for writing tests. Thank you! > > Are you using any of the extra methods from AdvancedReducer anymore? If not, > then we can convert this to a regular reducer and/or merge its logic into the > machine-operator-reducer. Yeah, I'm using `Revisit()`.
On 2015/11/09 04:22:51, fedor.indutny wrote: > On 2015/11/09 04:17:18, titzer wrote: > > Logic looks all good now. Thanks for writing tests. > > Thank you! > > > > > Are you using any of the extra methods from AdvancedReducer anymore? If not, > > then we can convert this to a regular reducer and/or merge its logic into the > > machine-operator-reducer. > > Yeah, I'm using `Revisit()`. LGTM
The CQ bit was checked by fedor@indutny.com
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1350223006/140001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1350223006/140001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: v8_linux_gcc_compile_rel on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux_gcc_compile_rel/bu...)
Titzer, There was a minor warning that I have fixed. While fixing it I have noticed that I didn't check the sign of division. Now the check is there too. Please take another look, sorry for this! Thank you, Fedor.
lgtm other than one comment https://codereview.chromium.org/1350223006/diff/160001/src/compiler/binary-op... File src/compiler/binary-operator-reducer.cc (right): https://codereview.chromium.org/1350223006/diff/160001/src/compiler/binary-op... src/compiler/binary-operator-reducer.cc:120: Reduction BinaryOperatorReducer::ReduceTruncateFloat64ToInt32(Node* node) { Can you move this transformation to machine-operator-reducer?
https://codereview.chromium.org/1350223006/diff/160001/src/compiler/binary-op... File src/compiler/binary-operator-reducer.cc (right): https://codereview.chromium.org/1350223006/diff/160001/src/compiler/binary-op... src/compiler/binary-operator-reducer.cc:120: Reduction BinaryOperatorReducer::ReduceTruncateFloat64ToInt32(Node* node) { On 2015/11/09 18:53:08, titzer wrote: > Can you move this transformation to machine-operator-reducer? Of course, done!
Should I land it?
On 2015/11/09 19:14:07, fedor.indutny wrote: > Should I land it? We use the commit queue. I'll click the button right now.
The CQ bit was checked by titzer@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from titzer@chromium.org Link to the patchset: https://codereview.chromium.org/1350223006/#ps180001 (title: "compiler: move round + truncate to machine-reducer")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1350223006/180001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1350223006/180001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: v8_linux_rel on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux_rel/builds/11543)
Heh, test obviously does not work on 32bit platforms. Added check. Please take a look.
On 2015/11/09 19:44:19, fedor.indutny wrote: > Heh, test obviously does not work on 32bit platforms. Added check. Please take a > look. You don't need to ask for a review of individual fixes like this one to make it through the commit queue. Internally we use try jobs to reduce noise on CLs, so when they go green feel free to click the commit queue button again.
The CQ bit was checked by fedor@indutny.com
The patchset sent to the CQ was uploaded after l-g-t-m from titzer@chromium.org Link to the patchset: https://codereview.chromium.org/1350223006/#ps200001 (title: "make test conditional")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1350223006/200001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1350223006/200001
On 2015/11/09 19:46:40, titzer wrote: > On 2015/11/09 19:44:19, fedor.indutny wrote: > > Heh, test obviously does not work on 32bit platforms. Added check. Please take > a > > look. > > You don't need to ask for a review of individual fixes like this one to make it > through the commit queue. > Internally we use try jobs to reduce noise on CLs, so when they go green feel > free to click the commit queue button again. I totally understand! I just once landed it myself after a fix, and other V8 team member asked me to confirm the changes first before clicking the checkbox. Sorry!
Message was sent while issue was closed.
Committed patchset #11 (id:200001)
Message was sent while issue was closed.
Patchset 11 (id:??) landed as https://crrev.com/461e5b49d022335a7fc4e9d172397a4bd48b93d4 Cr-Commit-Position: refs/heads/master@{#31899}
Message was sent while issue was closed.
On 2015/11/09 20:42:48, commit-bot: I haz the power wrote: > Patchset 11 (id:??) landed as > https://crrev.com/461e5b49d022335a7fc4e9d172397a4bd48b93d4 > Cr-Commit-Position: refs/heads/master@{#31899} Thank you!
Message was sent while issue was closed.
Unfortunately this is also unsound for the reasons outlined in https://codereview.chromium.org/1473073004/ I think most of this could be done during representation selection/change insertion, and it would be even more general then.
Message was sent while issue was closed.
A revert of this CL (patchset #11 id:200001) has been created in https://codereview.chromium.org/1478923002/ by bmeurer@chromium.org. The reason for reverting is: This is also unsound for the reasons outlined in https://codereview.chromium.org/1473073004/ Will help Fedor to implement a solution based on simplified operators.. |