|
|
Created:
4 years, 5 months ago by manasijm Modified:
4 years, 4 months ago CC:
native-client-reviews_googlegroups.com Base URL:
https://chromium.googlesource.com/native_client/pnacl-subzero.git@master Target Ref:
refs/heads/master Visibility:
Public. |
DescriptionAggressive LEA
Convert adds with a constant operand to lea on -aggressive-lea
BUG=none
R=stichnot@chromium.org
Committed: https://gerrit.chromium.org/gerrit/gitweb?p=native_client/pnacl-subzero.git;a=commit;h=5b7e1c06cfda07cb09d65fc526a39433a63f9bd4
Patch Set 1 #
Total comments: 1
Patch Set 2 : Revert lea to add when dest and base is same #
Total comments: 10
Patch Set 3 : Address comments #
Total comments: 13
Patch Set 4 : Address Comments #
Total comments: 4
Patch Set 5 : Put duplicate logic into protected method #
Total comments: 6
Patch Set 6 : Address Comments #
Messages
Total messages: 17 (3 generated)
Description was changed from ========== Aggressive LEA Convert adds with a constant operand to lea on -aggressive-lea BUG=none ========== to ========== Aggressive LEA Convert adds with a constant operand to lea on -aggressive-lea BUG=none ==========
manasijm@google.com changed reviewers: + eholk@chromium.org, jpp@chromium.org, kschimpf@google.com, stichnot@chromium.org
https://codereview.chromium.org/2135403002/diff/1/src/IceTargetLoweringX86Bas... File src/IceTargetLoweringX86BaseImpl.h (right): https://codereview.chromium.org/2135403002/diff/1/src/IceTargetLoweringX86Bas... src/IceTargetLoweringX86BaseImpl.h:2187: case InstArithmetic::Add: { LEA are pretty powerful. You can use it as a general purpose, 4-address instructions in x86. E.g., x = add i32 y, z -> lea x, (y_reg, z_reg, 0) x = add i32 y, const -> lea x, const(y_reg,,0) x = add i32 const, z -> lea x, const(z_reg,,0) x = sub i32 y, const -> lea x, -const(y_reg,,0) If I am not mistaken, LEA can also generate 16 bit addresses with 32-bit address modes. Finally, you can't really optimize for i64 unles Traits::Is64Bit
https://codereview.chromium.org/2135403002/diff/20001/src/IceTargetLoweringX8... File src/IceTargetLoweringX86BaseImpl.h (right): https://codereview.chromium.org/2135403002/diff/20001/src/IceTargetLoweringX8... src/IceTargetLoweringX86BaseImpl.h:2188: if (getFlags().getAggressiveLea() && Func->getOptLevel() == Opt_2 && Do you really need OptLevel=2? or is getAggressiveLea() sufficient? https://codereview.chromium.org/2135403002/diff/20001/src/IceTargetLoweringX8... src/IceTargetLoweringX86BaseImpl.h:2193: auto Var = legalize(Src0, Legal_Reg); auto * Also, in this case you can use legalizeToReg() for a simpler and more consistent style. https://codereview.chromium.org/2135403002/diff/20001/src/IceTargetLoweringX8... src/IceTargetLoweringX86BaseImpl.h:2201: } /*else { // Both are Variables Why is the rest of this commented out? https://codereview.chromium.org/2135403002/diff/20001/src/IceTargetLoweringX8... src/IceTargetLoweringX86BaseImpl.h:2205: Func, IceType_void, llvm::cast<Variable>(Var0), nullptr, If you use legalizeToReg() as above, you won't need the llvm::cast<Variable>() here.
https://codereview.chromium.org/2135403002/diff/20001/src/IceTargetLoweringX8... File src/IceTargetLoweringX86BaseImpl.h (right): https://codereview.chromium.org/2135403002/diff/20001/src/IceTargetLoweringX8... src/IceTargetLoweringX86BaseImpl.h:2201: } /*else { // Both are Variables On 2016/07/19 21:29:09, stichnot wrote: > Why is the rest of this commented out? Enabling that increases code size. And doesn't seem to be helpful in general. The commented out portion would have converted x = y + z to a lea and resulted in more forced register legalizations.
https://codereview.chromium.org/2135403002/diff/20001/src/IceTargetLoweringX8... File src/IceTargetLoweringX86BaseImpl.h (right): https://codereview.chromium.org/2135403002/diff/20001/src/IceTargetLoweringX8... src/IceTargetLoweringX86BaseImpl.h:2201: } /*else { // Both are Variables On 2016/07/19 21:54:59, manasijm wrote: > On 2016/07/19 21:29:09, stichnot wrote: > > Why is the rest of this commented out? > > Enabling that increases code size. > And doesn't seem to be helpful in general. > The commented out portion would have converted x = y + z to a lea and resulted > in more forced register legalizations. OK. Either remove the code, or leave it in but disabled and commented. E.g.: } else if (false) { // TODO: Find a good strategy for enabling this. } That way, the disabled code still gets compiled and is less likely to rot.
https://codereview.chromium.org/2135403002/diff/20001/src/IceTargetLoweringX8... File src/IceTargetLoweringX86BaseImpl.h (right): https://codereview.chromium.org/2135403002/diff/20001/src/IceTargetLoweringX8... src/IceTargetLoweringX86BaseImpl.h:2188: if (getFlags().getAggressiveLea() && Func->getOptLevel() == Opt_2 && On 2016/07/19 21:29:08, stichnot wrote: > Do you really need OptLevel=2? or is getAggressiveLea() sufficient? Done. https://codereview.chromium.org/2135403002/diff/20001/src/IceTargetLoweringX8... src/IceTargetLoweringX86BaseImpl.h:2193: auto Var = legalize(Src0, Legal_Reg); On 2016/07/19 21:29:08, stichnot wrote: > auto * > > Also, in this case you can use legalizeToReg() for a simpler and more consistent > style. Done. https://codereview.chromium.org/2135403002/diff/20001/src/IceTargetLoweringX8... src/IceTargetLoweringX86BaseImpl.h:2201: } /*else { // Both are Variables On 2016/07/19 22:30:29, stichnot wrote: > On 2016/07/19 21:54:59, manasijm wrote: > > On 2016/07/19 21:29:09, stichnot wrote: > > > Why is the rest of this commented out? > > > > Enabling that increases code size. > > And doesn't seem to be helpful in general. > > The commented out portion would have converted x = y + z to a lea and resulted > > in more forced register legalizations. > > OK. Either remove the code, or leave it in but disabled and commented. E.g.: > > } else if (false) { > // TODO: Find a good strategy for enabling this. > } > > That way, the disabled code still gets compiled and is less likely to rot. Done. https://codereview.chromium.org/2135403002/diff/20001/src/IceTargetLoweringX8... src/IceTargetLoweringX86BaseImpl.h:2205: Func, IceType_void, llvm::cast<Variable>(Var0), nullptr, On 2016/07/19 21:29:08, stichnot wrote: > If you use legalizeToReg() as above, you won't need the llvm::cast<Variable>() > here. Done.
https://codereview.chromium.org/2135403002/diff/40001/src/IceInstX86BaseImpl.h File src/IceInstX86BaseImpl.h (right): https://codereview.chromium.org/2135403002/diff/40001/src/IceInstX86BaseImpl.... src/IceInstX86BaseImpl.h:2029: auto *MemOp = llvm::dyn_cast<X86OperandMem>(this->getSrc(0)); This needs a comment to explain what's going on, which is to revert back to an add instruction when the lea is actually a 2-address instruction rather than 3-address. Also, are you missing opportunities? eax = lea [eax+edx] eax = lea [edx+eax] and if the dest register is in the index slot instead of the base slot, though that probably never actually happens Finally, any changes like this in an emit() method should have corresponding changes in the emitIAS() method, otherwise they won't be seen in production. Related to this, you should probably have a lit test for this. The normal lit test would test the emitIAS() path, and the emit() path would get tested when "make presubmit" runs with FORCEASM=1. https://codereview.chromium.org/2135403002/diff/40001/src/IceInstX86BaseImpl.... src/IceInstX86BaseImpl.h:2030: if (MemOp && getFlags().getAggressiveLea() && MemOp != nullptr https://codereview.chromium.org/2135403002/diff/40001/src/IceInstX86BaseImpl.... src/IceInstX86BaseImpl.h:2034: const_cast<Cfg *>(Func), this->getDest(), MemOp->getOffset()); const_cast - yuck! This often indicates a design problem, which I think is true here - all the emit() methods take a const Cfg * arg because we never expected emit() to modify the CFG. Can you try using the same allocator trick as in Variable::asType() ? https://codereview.chromium.org/2135403002/diff/40001/src/IceTargetLoweringX8... File src/IceTargetLoweringX86BaseImpl.h (right): https://codereview.chromium.org/2135403002/diff/40001/src/IceTargetLoweringX8... src/IceTargetLoweringX86BaseImpl.h:2188: if (getFlags().getAggressiveLea() && I wonder if this conditional would be clearer as something like this: if (auto *Const = llvm::dyn_cast<Constant>(Instr->getSrc(1))) { const bool ValidType = (Ty == IceType_i32 || (Ty == IceType_i64 && Traits::Is64Bit)); const bool ValidKind = (llvm::isa<ConstantInteger32>(Const) || llvm::isa<ConstantRelocatable>(Const)); if (getFlags().getAggressiveLea() && ValidType && ValidKind) { ... break; } } https://codereview.chromium.org/2135403002/diff/40001/src/IceTargetLoweringX8... src/IceTargetLoweringX86BaseImpl.h:2190: Constant *Const = llvm::dyn_cast<Constant>(Instr->getSrc(1)); auto *Const https://codereview.chromium.org/2135403002/diff/40001/src/IceTargetLoweringX8... src/IceTargetLoweringX86BaseImpl.h:2199: ConvertedToLea = true; I would just put a "break;" here, and remove ConvertedToLea entirely.
https://codereview.chromium.org/2135403002/diff/40001/src/IceInstX86BaseImpl.h File src/IceInstX86BaseImpl.h (right): https://codereview.chromium.org/2135403002/diff/40001/src/IceInstX86BaseImpl.... src/IceInstX86BaseImpl.h:2030: if (MemOp && getFlags().getAggressiveLea() && On 2016/07/29 14:27:26, stichnot wrote: > MemOp != nullptr Done. https://codereview.chromium.org/2135403002/diff/40001/src/IceInstX86BaseImpl.... src/IceInstX86BaseImpl.h:2034: const_cast<Cfg *>(Func), this->getDest(), MemOp->getOffset()); On 2016/07/29 14:27:26, stichnot wrote: > const_cast - yuck! This often indicates a design problem, which I think is true > here - all the emit() methods take a const Cfg * arg because we never expected > emit() to modify the CFG. > > Can you try using the same allocator trick as in Variable::asTy Using the allocator 'trick' does not seem to help because the constructor for InstX86Lea takes a non-const Cfg * https://codereview.chromium.org/2135403002/diff/40001/src/IceInstX86BaseImpl.... src/IceInstX86BaseImpl.h:2034: const_cast<Cfg *>(Func), this->getDest(), MemOp->getOffset()); On 2016/07/29 14:27:26, stichnot wrote: > const_cast - yuck! This often indicates a design problem, which I think is true > here - all the emit() methods take a const Cfg * arg because we never expected > emit() to modify the CFG. > > Can you try using the same allocator trick as in Variable::asType() ? Done. https://codereview.chromium.org/2135403002/diff/40001/src/IceTargetLoweringX8... File src/IceTargetLoweringX86BaseImpl.h (right): https://codereview.chromium.org/2135403002/diff/40001/src/IceTargetLoweringX8... src/IceTargetLoweringX86BaseImpl.h:2188: if (getFlags().getAggressiveLea() && On 2016/07/29 14:27:26, stichnot wrote: > I wonder if this conditional would be clearer as something like this: > > if (auto *Const = llvm::dyn_cast<Constant>(Instr->getSrc(1))) { > const bool ValidType = (Ty == IceType_i32 || > (Ty == IceType_i64 && Traits::Is64Bit)); > const bool ValidKind = (llvm::isa<ConstantInteger32>(Const) || > llvm::isa<ConstantRelocatable>(Const)); > if (getFlags().getAggressiveLea() && ValidType && ValidKind) { > ... > break; > } > } Done. https://codereview.chromium.org/2135403002/diff/40001/src/IceTargetLoweringX8... src/IceTargetLoweringX86BaseImpl.h:2190: Constant *Const = llvm::dyn_cast<Constant>(Instr->getSrc(1)); On 2016/07/29 14:27:26, stichnot wrote: > auto *Const Done. https://codereview.chromium.org/2135403002/diff/40001/src/IceTargetLoweringX8... src/IceTargetLoweringX86BaseImpl.h:2199: ConvertedToLea = true; On 2016/07/29 14:27:26, stichnot wrote: > I would just put a "break;" here, and remove ConvertedToLea entirely. Done.
https://codereview.chromium.org/2135403002/diff/40001/src/IceInstX86BaseImpl.h File src/IceInstX86BaseImpl.h (right): https://codereview.chromium.org/2135403002/diff/40001/src/IceInstX86BaseImpl.... src/IceInstX86BaseImpl.h:2034: const_cast<Cfg *>(Func), this->getDest(), MemOp->getOffset()); On 2016/08/01 19:36:21, manasijm wrote: > On 2016/07/29 14:27:26, stichnot wrote: > > const_cast - yuck! This often indicates a design problem, which I think is > true > > here - all the emit() methods take a const Cfg * arg because we never expected > > emit() to modify the CFG. > > > > Can you try using the same allocator trick as in Variable::asTy > > Using the allocator 'trick' does not seem to help because the constructor for > InstX86Lea takes a non-const Cfg * well actually, asType() has the same problem with the non-const Cfg, and it gets around the problem by directly calling the private Variable ctor, which it gets away with because asType() is a method of the Variable class. We can't really do the same here with the private ctor, so I guess const_cast needs to stay for now. https://codereview.chromium.org/2135403002/diff/60001/src/IceInstX86Base.h File src/IceInstX86Base.h (right): https://codereview.chromium.org/2135403002/diff/60001/src/IceInstX86Base.h#ne... src/IceInstX86Base.h:625: if (MemOp != nullptr && getFlags().getAggressiveLea() && This would be better as a private method of the class, and used here for emitIAS() and also for emit(). Along those lines, the helper could also do the work of creating the new InstX86Add instruction, which would leave the const_cast in just one place. The helper method name might be "deoptToAddOrNull()" or something similar, and used like this: if (Inst *Add = deoptToAddOrNull()) { Add->emitIAS(Func); return; } https://codereview.chromium.org/2135403002/diff/60001/src/IceInstX86BaseImpl.h File src/IceInstX86BaseImpl.h (right): https://codereview.chromium.org/2135403002/diff/60001/src/IceInstX86BaseImpl.... src/IceInstX86BaseImpl.h:2036: //TODO: Remove const_cast by emitting code for add directly. TODO(owner) also, odd that there's no spaced after "//"...
https://codereview.chromium.org/2135403002/diff/60001/src/IceInstX86Base.h File src/IceInstX86Base.h (right): https://codereview.chromium.org/2135403002/diff/60001/src/IceInstX86Base.h#ne... src/IceInstX86Base.h:625: if (MemOp != nullptr && getFlags().getAggressiveLea() && On 2016/08/01 21:08:41, stichnot wrote: > This would be better as a private method of the class, and used here for > emitIAS() and also for emit(). > > Along those lines, the helper could also do the work of creating the new > InstX86Add instruction, which would leave the const_cast in just one place. > > The helper method name might be "deoptToAddOrNull()" or something similar, and > used like this: > > if (Inst *Add = deoptToAddOrNull()) { > Add->emitIAS(Func); > return; > } Done. https://codereview.chromium.org/2135403002/diff/60001/src/IceInstX86BaseImpl.h File src/IceInstX86BaseImpl.h (right): https://codereview.chromium.org/2135403002/diff/60001/src/IceInstX86BaseImpl.... src/IceInstX86BaseImpl.h:2036: //TODO: Remove const_cast by emitting code for add directly. On 2016/08/01 21:08:41, stichnot wrote: > TODO(owner) > > also, odd that there's no spaced after "//"... Done.
https://codereview.chromium.org/2135403002/diff/80001/src/IceInstX86Base.h File src/IceInstX86Base.h (right): https://codereview.chromium.org/2135403002/diff/80001/src/IceInstX86Base.h#ne... src/IceInstX86Base.h:652: InstX86Add *deoptLeaToAddOrNull(const Cfg *Func) const { I think you could just declare this to return Inst*, then you could get rid of the forward class declaration above. https://codereview.chromium.org/2135403002/diff/80001/src/IceInstX86Base.h#ne... src/IceInstX86Base.h:655: auto *MemOp = llvm::dyn_cast<X86OperandMem>(this->getSrc(0)); if (auto *MemOp = llvm::dyn_cast<X86OperandMem>(this->getSrc(0))) { if (getFlags() ... ) { ... return ... ; } } return nullptr; https://codereview.chromium.org/2135403002/diff/80001/src/IceInstX86BaseImpl.h File src/IceInstX86BaseImpl.h (right): https://codereview.chromium.org/2135403002/diff/80001/src/IceInstX86BaseImpl.... src/IceInstX86BaseImpl.h:2029: InstX86BaseUnaryopGPR<InstX86Base::Lea>::deoptLeaToAddOrNull(Func)) { this->deoptLeaToAddOrNull(Func) Seriously.
https://codereview.chromium.org/2135403002/diff/80001/src/IceInstX86Base.h File src/IceInstX86Base.h (right): https://codereview.chromium.org/2135403002/diff/80001/src/IceInstX86Base.h#ne... src/IceInstX86Base.h:652: InstX86Add *deoptLeaToAddOrNull(const Cfg *Func) const { On 2016/08/02 03:43:52, stichnot wrote: > I think you could just declare this to return Inst*, then you could get rid of > the forward class declaration above. Done. https://codereview.chromium.org/2135403002/diff/80001/src/IceInstX86Base.h#ne... src/IceInstX86Base.h:655: auto *MemOp = llvm::dyn_cast<X86OperandMem>(this->getSrc(0)); On 2016/08/02 03:43:52, stichnot wrote: > if (auto *MemOp = llvm::dyn_cast<X86OperandMem>(this->getSrc(0))) { > if (getFlags() ... ) { > ... > return ... ; > } > } > return nullptr; Done. https://codereview.chromium.org/2135403002/diff/80001/src/IceInstX86BaseImpl.h File src/IceInstX86BaseImpl.h (right): https://codereview.chromium.org/2135403002/diff/80001/src/IceInstX86BaseImpl.... src/IceInstX86BaseImpl.h:2029: InstX86BaseUnaryopGPR<InstX86Base::Lea>::deoptLeaToAddOrNull(Func)) { On 2016/08/02 03:43:53, stichnot wrote: > this->deoptLeaToAddOrNull(Func) > > Seriously. Works! Was 'misguided' by my IDE..
lgtm
Description was changed from ========== Aggressive LEA Convert adds with a constant operand to lea on -aggressive-lea BUG=none ========== to ========== Aggressive LEA Convert adds with a constant operand to lea on -aggressive-lea BUG=none R=stichnot@chromium.org Committed: https://gerrit.chromium.org/gerrit/gitweb?p=native_client/pnacl-subzero.git;a... ==========
Message was sent while issue was closed.
Committed patchset #6 (id:100001) manually as 5b7e1c06cfda07cb09d65fc526a39433a63f9bd4 (presubmit successful). |