|
|
Created:
4 years, 10 months ago by Eric Holk Modified:
4 years, 10 months ago CC:
native-client-reviews_googlegroups.com Base URL:
https://chromium.googlesource.com/native_client/pnacl-subzero.git@master Target Ref:
refs/heads/master Visibility:
Public. |
DescriptionSubzero: ARM32: lowering of vector insert and extract.
BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4076
R=stichnot@chromium.org
Committed: https://gerrit.chromium.org/gerrit/gitweb?p=native_client/pnacl-subzero.git;a=commit;h=658bae2023cace91f8489f597c1c82dfe95aaaa2
Patch Set 1 #
Total comments: 63
Patch Set 2 : Incorporating review feedback" #
Total comments: 7
Patch Set 3 : More code cleanup. #Patch Set 4 : Removing incorrect use of rematerializable check" #
Total comments: 7
Patch Set 5 : Incorporating review feedback. #
Total comments: 12
Patch Set 6 : Code review feedback. #
Total comments: 12
Patch Set 7 : #Patch Set 8 : No more auto. #
Total comments: 12
Patch Set 9 : Fixes #
Messages
Total messages: 20 (2 generated)
eholk@chromium.org changed reviewers: + jpp@chromium.org, kschimpf@google.com, sehr@chromium.org, stichnot@chromium.org
This CL enables lowering of ARM32 vector insert and extract instructions. The integer variants lower to `vmov Dn[x], Rd` and `vmov Rd, Dn[x]` (http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489c/Bcffigdj...) The floating point versions lower to a vmov out of an S register aliased with a Q register. In order to do this reliably, I added a register class for registers Q0-Q7, which are the only ones with S aliases.
https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.cpp File src/IceInstARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.cpp#newcod... src/IceInstARM32.cpp:986: auto SrcReg = Src->getRegNum(); You will get some disagreement within the team, but I would like you to back off on use of auto. My approximate rules for when to use auto: 1. Something like a cast operation (static_cast, llvm::dyn_cast) where the explicit type is completely redundant. const auto *Var = llvm::dyn_cast<Variable>(Opnd); 2. Other operations where the the type is pretty clear and practically redundant from context. auto *Instr = InstFoo::create(); 3. Situations where the actual type is obnoxiously verbose for no good reason. auto *Iter = MyContainer.begin(); auto Func = /* some lambda expression */; Otherwise, code starts to become write-only and it gets really hard to make sense of the actual types without continual scrutinizing of header files. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.cpp#newcod... src/IceInstARM32.cpp:991: // This code assumes the Aliases list goes Q_n, S_2n, S_2n+1. The asserts in I would really like to stay away from any assumptions on ordering of the register enum values or the Aliases list. IIUC, the purpose of mapping from Q register to D registers is for syntactic correctness of the emit() output, but ultimately in the emitIAS() output you will just take the encoded register value of the Q register and apply a simple transformation that has nothing to do with our various orderings. So I'm thinking maybe this method should return the encoded D register value instead of the enum value, and the emit() code could then call another function that returns a register name given the encoded value and register class/type (presumably f64 in this case because we know that's how to get at D registers). https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.cpp#newcod... src/IceInstARM32.cpp:1019: InstARM32ExtractInsert::getSRegister(Variable *Src) const { Same comment as above. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.cpp#newcod... src/IceInstARM32.cpp:1045: auto Src = llvm::dyn_cast<Variable>(getSrc(0)); You can probably just use llvm::cast<> here and remove the assert. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h File src/IceInstARM32.h (right): https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h#newcode1368 src/IceInstARM32.h:1368: : InstARM32Pred(Func, InstARM32::Extract, 2, Dest, Predicate), Since both Insert and Extract have a single source Operand (i.e., a single call to addSource()), I would pass it in to this base class ctor and do the addSource() here. Also, "2" is the max number of addSource() calls, so make it "1" instead. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h#newcode1371 src/IceInstARM32.h:1371: uint32_t Index; const uint32_t Index https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h#newcode1375 src/IceInstARM32.h:1375: RegARM32::AllRegisters getDRegister(Variable *Src, uint32_t VectorSize) const; This isn't absolutely clear-cut, but I think it would be a bit better if these three functions can figure out on their own what Src and VectorSize should be, and not have to pass them as arguments. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h#newcode1381 src/IceInstARM32.h:1381: }; Implement classof() for completeness. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h#newcode1421 src/IceInstARM32.h:1421: InstARM32Insert(Cfg *Func, Variable *Dest, Variable *Src0, uint32_t Index, This parameter is named Src0, the one for Extract is name Src; make consistent. https://codereview.chromium.org/1655313002/diff/1/src/IceRegistersARM32.h File src/IceRegistersARM32.h (right): https://codereview.chromium.org/1655313002/diff/1/src/IceRegistersARM32.h#new... src/IceRegistersARM32.h:222: // Extend enum RegClass with ARM32-specific register classes (if any). I guess the "(if any)" comment can be dropped. https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.cpp File src/IceTargetLoweringARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.... src/IceTargetLoweringARM32.cpp:301: llvm::SmallBitVector QtoSRegisters(RegARM32::Reg_NUM); I would list this before InvalidRegisters. https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.... src/IceTargetLoweringARM32.cpp:3841: Context.insert<InstFakeDef>(Dest); Maybe I haven't thought this through well enough, but this just seems wrong. https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.... src/IceTargetLoweringARM32.cpp:3845: Variable *Src0 = legalizeToReg(Instr->getSrc(0)); Optional: It would be nice if InstExtractElement had suitably-named trivial methods to access its operands, instead of assuming a numbering. See InstStore::get{Addr,Data}() as an example. https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.... src/IceTargetLoweringARM32.cpp:3854: Src0->setRegClass(RegARM32::RCARM32_QtoS); It's possible (though unlikely) that legalizeToReg(Instr->getSrc(0)) just returns its argument without creating a new temporary, if src0 is already known to be a pre-colored or infinite-weight variable. It's also possible that it returns an existing temporary from the previous instruction lowering due to the local availability optimization. In either of these cases, you really don't want to override Src0's register class. Instead, move Src0 into a new temporary with the desired register class. https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.... src/IceTargetLoweringARM32.cpp:3861: assert(false && "extract requires a constant index"); s/extract/extractelement would make a more instantly understandable message. https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.h File src/IceTargetLoweringARM32.h (right): https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.... src/IceTargetLoweringARM32.h:93: assert(RegARM32::RCARM32_QtoS < RegARM32::RCARM32_NUM); I think this assert is pretty much guaranteed to be true and you can remove it. (Another possibility is to use static_assert, but it's trivially true enough that I just wouldn't bother.) https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.... src/IceTargetLoweringARM32.h:424: void _extract(Variable *Dest, Variable *Src, uint32_t Index, I would probably rename Src to Src0 for consistency. https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.... src/IceTargetLoweringARM32.h:431: void _insert(Variable *Dest, Variable *Val, uint32_t Index, I would probably rename Val to Src0 for consistency. https://codereview.chromium.org/1655313002/diff/1/src/IceTypes.cpp File src/IceTypes.cpp (right): https://codereview.chromium.org/1655313002/diff/1/src/IceTypes.cpp#newcode148 src/IceTypes.cpp:148: size_t typeWidthInBits(Type Ty) { return typeWidthInBytes(Ty) * 8; } Probably CHAR_BIT would be more appropriate than 8. https://codereview.chromium.org/1655313002/diff/1/tests_lit/assembler/arm32/i... File tests_lit/assembler/arm32/insert-extract.ll (right): https://codereview.chromium.org/1655313002/diff/1/tests_lit/assembler/arm32/i... tests_lit/assembler/arm32/insert-extract.ll:1: ; Show that we know how ot translate insertelement and extractelement. to
It would be nice if insert/extract also handled non-infinite-weight temporaries, and then just loaded/stored the desired element to memory. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.cpp File src/IceInstARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.cpp#newcod... src/IceInstARM32.cpp:986: auto SrcReg = Src->getRegNum(); const auto https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.cpp#newcod... src/IceInstARM32.cpp:991: // This code assumes the Aliases list goes Q_n, S_2n, S_2n+1. The asserts in On 2016/02/03 15:28:37, stichnot wrote: > I would really like to stay away from any assumptions on ordering of the > register enum values or the Aliases list. > > IIUC, the purpose of mapping from Q register to D registers is for syntactic > correctness of the emit() output, but ultimately in the emitIAS() output you > will just take the encoded register value of the Q register and apply a simple > transformation that has nothing to do with our various orderings. > > So I'm thinking maybe this method should return the encoded D register value > instead of the enum value, and the emit() code could then call another function > that returns a register name given the encoded value and register class/type > (presumably f64 in this case because we know that's how to get at D registers). he will actually need to provide a D register to the underlying assembler. I am concerned about this as well, so I asked him to leave a TODO behind (see below) until we come up with something better that does not require assumptions or clever tricks. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.cpp#newcod... src/IceInstARM32.cpp:993: auto HalfWidth = VectorSize / 2; optional: I am very auto-lenient, but this is probably too much https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.cpp#newcod... src/IceInstARM32.cpp:1023: // For floating point values, we hope we got allocated to Q0 - Q7, so we can I would rephrase this. "hoping" conveys the idea that we cross our fingers that Src was allocated the right register. also, won't this assertion fail for non-fp types if, e.g., q15 is allocated? (I had to search for the uses of getSRegisters() below to figure out that this will not happen... maybe assert(Src->getType() == v4f32)?) https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.cpp#newcod... src/IceInstARM32.cpp:1031: return (RegARM32::AllRegisters)RegARM32::RegTable[SrcReg].Aliases[Index + 3]; I fear what will happen if the alias declaration ever changes. Please leave a TODO(jpp), and describe that we need to make this more resilient to changes in the alias declaration. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.cpp#newcod... src/IceInstARM32.cpp:1050: auto BitSize = typeWidthInBits(DestTy); instead of adding this new method, why don't you just do auto BitSize = typeWidthInBytes() * CHAR_BIT ? https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h File src/IceInstARM32.h (right): https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h#newcode1371 src/IceInstARM32.h:1371: uint32_t Index; const also, why don't you add a const uint32_t NumElements ? https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h#newcode1376 src/IceInstARM32.h:1376: uint32_t getDIndex(uint32_t VectorSize) const; VectorSize to me is always 128-bit. Maybe rename these to NumElements? https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.cpp File src/IceTargetLoweringARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.... src/IceTargetLoweringARM32.cpp:3841: Context.insert<InstFakeDef>(Dest); On 2016/02/03 15:28:37, stichnot wrote: > Maybe I haven't thought this through well enough, but this just seems wrong. Yup, this is wrong. you have to handle these (which are just stack 'alloca'ed arrays.) If I am not mistaken, legalizeToReg() below will emit code to load from memory https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.h File src/IceTargetLoweringARM32.h (right): https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.... src/IceTargetLoweringARM32.h:424: void _extract(Variable *Dest, Variable *Src, uint32_t Index, On 2016/02/03 15:28:38, stichnot wrote: > I would probably rename Src to Src0 for consistency. It's been common practice (at least in the ARM32 backend) to name something Src if it is the only Src. Unfortunately, this is not always followed, but I think it makes moar sense -- if I see Src0 in the middle of a function I am tempted to think there might be a Src1. regardless of parameter naming, I would name this method _extract_element (and _insert_element) simply because _insert and _extract are too generic.
https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.cpp File src/IceInstARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.cpp#newcod... src/IceInstARM32.cpp:986: auto SrcReg = Src->getRegNum(); On 2016/02/03 15:28:37, stichnot wrote: > You will get some disagreement within the team, but I would like you to back off > on use of auto. > > My approximate rules for when to use auto: > > 1. Something like a cast operation (static_cast, llvm::dyn_cast) where the > explicit type is completely redundant. > > const auto *Var = llvm::dyn_cast<Variable>(Opnd); > > 2. Other operations where the the type is pretty clear and practically redundant > from context. > > auto *Instr = InstFoo::create(); > > 3. Situations where the actual type is obnoxiously verbose for no good reason. > > auto *Iter = MyContainer.begin(); > auto Func = /* some lambda expression */; > > Otherwise, code starts to become write-only and it gets really hard to make > sense of the actual types without continual scrutinizing of header files. I meant to remove most of my uses of auto before submitting this CL. I'm generally a fan of auto since I've also spent a lot of time writing in type-inferred languages, but the local style here is definitely to avoid using it. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.cpp#newcod... src/IceInstARM32.cpp:986: auto SrcReg = Src->getRegNum(); On 2016/02/03 16:06:51, John wrote: > const auto Done. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.cpp#newcod... src/IceInstARM32.cpp:991: // This code assumes the Aliases list goes Q_n, S_2n, S_2n+1. The asserts in On 2016/02/03 16:06:51, John wrote: > On 2016/02/03 15:28:37, stichnot wrote: > > I would really like to stay away from any assumptions on ordering of the > > register enum values or the Aliases list. > > > > IIUC, the purpose of mapping from Q register to D registers is for syntactic > > correctness of the emit() output, but ultimately in the emitIAS() output you > > will just take the encoded register value of the Q register and apply a simple > > transformation that has nothing to do with our various orderings. > > > > So I'm thinking maybe this method should return the encoded D register value > > instead of the enum value, and the emit() code could then call another > function > > that returns a register name given the encoded value and register class/type > > (presumably f64 in this case because we know that's how to get at D > registers). > > he will actually need to provide a D register to the underlying assembler. I am > concerned about this as well, so I asked him to leave a TODO behind (see below) > until we come up with something better that does not require assumptions or > clever tricks. I've added the TODO. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.cpp#newcod... src/IceInstARM32.cpp:993: auto HalfWidth = VectorSize / 2; On 2016/02/03 16:06:51, John wrote: > optional: I am very auto-lenient, but this is probably too much I removed it. HalfWidth was only used in one place, so I just replaced it with VectorSize / 2. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.cpp#newcod... src/IceInstARM32.cpp:1019: InstARM32ExtractInsert::getSRegister(Variable *Src) const { On 2016/02/03 15:28:37, stichnot wrote: > Same comment as above. Acknowledged. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.cpp#newcod... src/IceInstARM32.cpp:1023: // For floating point values, we hope we got allocated to Q0 - Q7, so we can On 2016/02/03 16:06:51, John wrote: > I would rephrase this. "hoping" conveys the idea that we cross our fingers that > Src was allocated the right register. > > also, won't this assertion fail for non-fp types if, e.g., q15 is allocated? (I > had to search for the uses of getSRegisters() below to figure out that this will > not happen... maybe assert(Src->getType() == v4f32)?) I fixed the comment. I wrote that before the higher up code was actually guaranteeing that invariant. I strengthened the assert too. This function should only be called on the v4f32 version anyway. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.cpp#newcod... src/IceInstARM32.cpp:1031: return (RegARM32::AllRegisters)RegARM32::RegTable[SrcReg].Aliases[Index + 3]; On 2016/02/03 16:06:51, John wrote: > I fear what will happen if the alias declaration ever changes. Please leave a > TODO(jpp), and describe that we need to make this more resilient to changes in > the alias declaration. Done. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.cpp#newcod... src/IceInstARM32.cpp:1045: auto Src = llvm::dyn_cast<Variable>(getSrc(0)); On 2016/02/03 15:28:37, stichnot wrote: > You can probably just use llvm::cast<> here and remove the assert. Done. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.cpp#newcod... src/IceInstARM32.cpp:1050: auto BitSize = typeWidthInBits(DestTy); On 2016/02/03 16:06:51, John wrote: > instead of adding this new method, why don't you just do > > auto BitSize = typeWidthInBytes() * CHAR_BIT > > ? Done. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h File src/IceInstARM32.h (right): https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h#newcode1368 src/IceInstARM32.h:1368: : InstARM32Pred(Func, InstARM32::Extract, 2, Dest, Predicate), On 2016/02/03 15:28:37, stichnot wrote: > Since both Insert and Extract have a single source Operand (i.e., a single call > to addSource()), I would pass it in to this base class ctor and do the > addSource() here. > > Also, "2" is the max number of addSource() calls, so make it "1" instead. Done. Originally I thought there was going to be more divergence between the two subclasses, but I guess now. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h#newcode1371 src/IceInstARM32.h:1371: uint32_t Index; On 2016/02/03 15:28:37, stichnot wrote: > const uint32_t Index Done. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h#newcode1371 src/IceInstARM32.h:1371: uint32_t Index; On 2016/02/03 16:06:51, John wrote: > const > > also, why don't you add a > > const uint32_t NumElements > > ? It wasn't immediately clear what the right NumElements was at this point, although the higher level constructor knows, so that's not a big problem. The floating point version of this instruction doesn't need NumElements, because it's always 4. It doesn't matter much to me, but it's not hard to compute NumElements and it isn't always needed, so it seems better not to bother storing it here. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h#newcode1375 src/IceInstARM32.h:1375: RegARM32::AllRegisters getDRegister(Variable *Src, uint32_t VectorSize) const; On 2016/02/03 15:28:37, stichnot wrote: > This isn't absolutely clear-cut, but I think it would be a bit better if these > three functions can figure out on their own what Src and VectorSize should be, > and not have to pass them as arguments. I wasn't totally thrilled with the design I used here. The challenge is that the source and vector size are determined by the destination for insertelement, but by the source for extractelement. The common code doesn't know which version it is, so you'd basically have to duplicate this code in each of the subclasses. Perhaps the better design is to make a function that's not part of the class that takes a register, numelements and an index. As it is right now, it's strange that the register and numelements are given explicitly but the index is implicit. It would actually be pretty easy to compute NumElements from the Src though, which would at least make it a little cleaner. I went ahead and made this change, but I'm open to making more changes. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h#newcode1376 src/IceInstARM32.h:1376: uint32_t getDIndex(uint32_t VectorSize) const; On 2016/02/03 16:06:51, John wrote: > VectorSize to me is always 128-bit. Maybe rename these to NumElements? Done. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h#newcode1381 src/IceInstARM32.h:1381: }; On 2016/02/03 15:28:37, stichnot wrote: > Implement classof() for completeness. Does it make sense to have classof here, given that this class should never be instantiated except as a base of InstARM32Extract and InstARM32Insert (which both implement classof)? For example, InstArm32Pred does not include classof, that I can see. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h#newcode1421 src/IceInstARM32.h:1421: InstARM32Insert(Cfg *Func, Variable *Dest, Variable *Src0, uint32_t Index, On 2016/02/03 15:28:37, stichnot wrote: > This parameter is named Src0, the one for Extract is name Src; make consistent. Done. https://codereview.chromium.org/1655313002/diff/1/src/IceRegistersARM32.h File src/IceRegistersARM32.h (right): https://codereview.chromium.org/1655313002/diff/1/src/IceRegistersARM32.h#new... src/IceRegistersARM32.h:222: // Extend enum RegClass with ARM32-specific register classes (if any). On 2016/02/03 15:28:37, stichnot wrote: > I guess the "(if any)" comment can be dropped. Done. https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.cpp File src/IceTargetLoweringARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.... src/IceTargetLoweringARM32.cpp:301: llvm::SmallBitVector QtoSRegisters(RegARM32::Reg_NUM); On 2016/02/03 15:28:37, stichnot wrote: > I would list this before InvalidRegisters. Done. https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.... src/IceTargetLoweringARM32.cpp:3854: Src0->setRegClass(RegARM32::RCARM32_QtoS); On 2016/02/03 15:28:38, stichnot wrote: > It's possible (though unlikely) that legalizeToReg(Instr->getSrc(0)) just > returns its argument without creating a new temporary, if src0 is already known > to be a pre-colored or infinite-weight variable. It's also possible that it > returns an existing temporary from the previous instruction lowering due to the > local availability optimization. > > In either of these cases, you really don't want to override Src0's register > class. > > Instead, move Src0 into a new temporary with the desired register class. Done. https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.... src/IceTargetLoweringARM32.cpp:3861: assert(false && "extract requires a constant index"); On 2016/02/03 15:28:38, stichnot wrote: > s/extract/extractelement > would make a more instantly understandable message. Done. https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.h File src/IceTargetLoweringARM32.h (right): https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.... src/IceTargetLoweringARM32.h:93: assert(RegARM32::RCARM32_QtoS < RegARM32::RCARM32_NUM); On 2016/02/03 15:28:38, stichnot wrote: > I think this assert is pretty much guaranteed to be true and you can remove it. > (Another possibility is to use static_assert, but it's trivially true enough > that I just wouldn't bother.) > Done. https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.... src/IceTargetLoweringARM32.h:424: void _extract(Variable *Dest, Variable *Src, uint32_t Index, On 2016/02/03 15:28:38, stichnot wrote: > I would probably rename Src to Src0 for consistency. Done. https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.... src/IceTargetLoweringARM32.h:424: void _extract(Variable *Dest, Variable *Src, uint32_t Index, On 2016/02/03 16:06:52, John wrote: > On 2016/02/03 15:28:38, stichnot wrote: > > I would probably rename Src to Src0 for consistency. > > It's been common practice (at least in the ARM32 backend) to name something Src > if it is the only Src. Unfortunately, this is not always followed, but I think > it makes moar sense -- if I see Src0 in the middle of a function I am tempted to > think there might be a Src1. > > regardless of parameter naming, I would name this method _extract_element (and > _insert_element) simply because _insert and _extract are too generic. Done. https://codereview.chromium.org/1655313002/diff/1/src/IceTargetLoweringARM32.... src/IceTargetLoweringARM32.h:431: void _insert(Variable *Dest, Variable *Val, uint32_t Index, On 2016/02/03 15:28:38, stichnot wrote: > I would probably rename Val to Src0 for consistency. Done. https://codereview.chromium.org/1655313002/diff/1/src/IceTypes.cpp File src/IceTypes.cpp (right): https://codereview.chromium.org/1655313002/diff/1/src/IceTypes.cpp#newcode148 src/IceTypes.cpp:148: size_t typeWidthInBits(Type Ty) { return typeWidthInBytes(Ty) * 8; } On 2016/02/03 15:28:38, stichnot wrote: > Probably CHAR_BIT would be more appropriate than 8. Agreed. I removed this function and inlined it where needed, following John's suggestion. https://codereview.chromium.org/1655313002/diff/1/tests_lit/assembler/arm32/i... File tests_lit/assembler/arm32/insert-extract.ll (right): https://codereview.chromium.org/1655313002/diff/1/tests_lit/assembler/arm32/i... tests_lit/assembler/arm32/insert-extract.ll:1: ; Show that we know how ot translate insertelement and extractelement. On 2016/02/03 15:28:38, stichnot wrote: > to Done.
https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h File src/IceInstARM32.h (right): https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h#newcode1375 src/IceInstARM32.h:1375: RegARM32::AllRegisters getDRegister(Variable *Src, uint32_t VectorSize) const; On 2016/02/03 21:02:21, Eric Holk wrote: > On 2016/02/03 15:28:37, stichnot wrote: > > This isn't absolutely clear-cut, but I think it would be a bit better if these > > three functions can figure out on their own what Src and VectorSize should be, > > and not have to pass them as arguments. > > I wasn't totally thrilled with the design I used here. > > The challenge is that the source and vector size are determined by the > destination for insertelement, but by the source for extractelement. The common > code doesn't know which version it is, so you'd basically have to duplicate this > code in each of the subclasses. > > Perhaps the better design is to make a function that's not part of the class > that takes a register, numelements and an index. As it is right now, it's > strange that the register and numelements are given explicitly but the index is > implicit. > > It would actually be pretty easy to compute NumElements from the Src though, > which would at least make it a little cleaner. I went ahead and made this > change, but I'm open to making more changes. I was thinking that the common code could determine whether to use Dest or Src based on whether the instruction kind is Insert or Extract. https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h#newcode1381 src/IceInstARM32.h:1381: }; On 2016/02/03 21:02:21, Eric Holk wrote: > On 2016/02/03 15:28:37, stichnot wrote: > > Implement classof() for completeness. > > Does it make sense to have classof here, given that this class should never be > instantiated except as a base of InstARM32Extract and InstARM32Insert (which > both implement classof)? For example, InstArm32Pred does not include classof, > that I can see. If you aren't already aware, have a look at http://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html . In this case, the implementation should look something like: static bool classof(const Inst *Inst) { return isClassof(Inst, Insert) || isClassof(Inst, Extract); } I didn't know InstArm32Pred was incomplete, maybe it should be fixed? :) For example, see InstTarget, Constant, Variable. https://codereview.chromium.org/1655313002/diff/20001/src/IceInstARM32.cpp File src/IceInstARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/20001/src/IceInstARM32.cpp#ne... src/IceInstARM32.cpp:1042: auto &Str = Func->getContext()->getStrEmit(); Ostream &Str https://codereview.chromium.org/1655313002/diff/20001/src/IceInstARM32.cpp#ne... src/IceInstARM32.cpp:1043: auto DestTy = getDest()->getType(); const Type DestTy https://codereview.chromium.org/1655313002/diff/20001/src/IceInstARM32.cpp#ne... src/IceInstARM32.cpp:1045: auto Src = llvm::dyn_cast<Variable>(getSrc(0)); auto * https://codereview.chromium.org/1655313002/diff/20001/src/IceTargetLoweringAR... File src/IceTargetLoweringARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/20001/src/IceTargetLoweringAR... src/IceTargetLoweringARM32.cpp:3851: Variable *TSrc0 = makeReg(Src0->getType()); Bikeshed: the name "TSrc0" strikes me as unlike any other variable name here. I'm too lazy to figure out a better name, any ideas John? https://codereview.chromium.org/1655313002/diff/20001/src/IceTargetLoweringAR... src/IceTargetLoweringARM32.cpp:3855: TSrc0->setRegClass(RegARM32::RCARM32_QtoS); Another possibility is to move more stuff into this conditional: if (isFloatingType(DestTy)) { Variable *TSrc0 = makeReg(Src0->getType()); TSrc0->setRegClass(RegARM32::RCARM32_QtoS); _mov(TSrc0, Src0); Src0 = TSrc0; }
https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h File src/IceInstARM32.h (right): https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h#newcode1375 src/IceInstARM32.h:1375: RegARM32::AllRegisters getDRegister(Variable *Src, uint32_t VectorSize) const; On 2016/02/03 22:32:21, stichnot wrote: > On 2016/02/03 21:02:21, Eric Holk wrote: > > On 2016/02/03 15:28:37, stichnot wrote: > > > This isn't absolutely clear-cut, but I think it would be a bit better if > these > > > three functions can figure out on their own what Src and VectorSize should > be, > > > and not have to pass them as arguments. > > > > I wasn't totally thrilled with the design I used here. > > > > The challenge is that the source and vector size are determined by the > > destination for insertelement, but by the source for extractelement. The > common > > code doesn't know which version it is, so you'd basically have to duplicate > this > > code in each of the subclasses. > > > > Perhaps the better design is to make a function that's not part of the class > > that takes a register, numelements and an index. As it is right now, it's > > strange that the register and numelements are given explicitly but the index > is > > implicit. > > > > It would actually be pretty easy to compute NumElements from the Src though, > > which would at least make it a little cleaner. I went ahead and made this > > change, but I'm open to making more changes. > > I was thinking that the common code could determine whether to use Dest or Src > based on whether the instruction kind is Insert or Extract. what about defining a virtual method in this class: Variable *getVectorOperand() const; and implement getDRegister in terms of that? I believe that's all you'd need. Alternatively, you could have a const Variable *VectorOperand; and let the base classes set it to the right operand. (I'd prefer this approach to avoid the virtual call.) https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h#newcode1375 src/IceInstARM32.h:1375: RegARM32::AllRegisters getDRegister(Variable *Src, uint32_t VectorSize) const; Another issue I'd like to point out here. if this method takes no parameter, then I'd say getDRegister does not convey the idea of what this does. I would provably implement this helper (and all the other helpers in this base class) as global functions in anonymous namespaces in the .cpp file. https://codereview.chromium.org/1655313002/diff/20001/src/IceInstARM32.h File src/IceInstARM32.h (right): https://codereview.chromium.org/1655313002/diff/20001/src/IceInstARM32.h#newc... src/IceInstARM32.h:1380: // For floating point values, we can read directly from an S register. This read/write ... from/to? https://codereview.chromium.org/1655313002/diff/20001/src/IceTargetLoweringAR... File src/IceTargetLoweringARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/20001/src/IceTargetLoweringAR... src/IceTargetLoweringARM32.cpp:3855: TSrc0->setRegClass(RegARM32::RCARM32_QtoS); On 2016/02/03 22:32:21, stichnot wrote: > Another possibility is to move more stuff into this conditional: > if (isFloatingType(DestTy)) { > Variable *TSrc0 = makeReg(Src0->getType()); > TSrc0->setRegClass(RegARM32::RCARM32_QtoS); > _mov(TSrc0, Src0); > Src0 = TSrc0; > } +1
https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h File src/IceInstARM32.h (right): https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h#newcode1375 src/IceInstARM32.h:1375: RegARM32::AllRegisters getDRegister(Variable *Src, uint32_t VectorSize) const; On 2016/02/04 15:23:57, John wrote: > On 2016/02/03 22:32:21, stichnot wrote: > > On 2016/02/03 21:02:21, Eric Holk wrote: > > > On 2016/02/03 15:28:37, stichnot wrote: > > > > This isn't absolutely clear-cut, but I think it would be a bit better if > > these > > > > three functions can figure out on their own what Src and VectorSize should > > be, > > > > and not have to pass them as arguments. > > > > > > I wasn't totally thrilled with the design I used here. > > > > > > The challenge is that the source and vector size are determined by the > > > destination for insertelement, but by the source for extractelement. The > > common > > > code doesn't know which version it is, so you'd basically have to duplicate > > this > > > code in each of the subclasses. > > > > > > Perhaps the better design is to make a function that's not part of the class > > > that takes a register, numelements and an index. As it is right now, it's > > > strange that the register and numelements are given explicitly but the index > > is > > > implicit. > > > > > > It would actually be pretty easy to compute NumElements from the Src though, > > > which would at least make it a little cleaner. I went ahead and made this > > > change, but I'm open to making more changes. > > > > I was thinking that the common code could determine whether to use Dest or Src > > based on whether the instruction kind is Insert or Extract. > > what about defining a virtual method in this class: > > Variable *getVectorOperand() const; > > and implement getDRegister in terms of that? I believe that's all you'd need. I recalled that in general, we had tried to avoid virtual methods in Inst and Operand (particularly Variable), to keep object sizes smaller by avoiding vtables. Unfortunately I see that Inst has a bunch of integral virtual methods, so oh well. Adding a virtual method is probably a fine approach. Also, this would not be a virtual method in the base class, so even if we were generally free and clear of virtual methods, the impact here would be small. > > Alternatively, you could have a > > const Variable *VectorOperand; > > and let the base classes set it to the right operand. (I'd prefer this approach > to avoid the virtual call.) I don't think precisely this is a good idea. There is essentially an assumption that all Variable* are captured by FOREACH_VAR_IN_INST(). Probably nothing would break today by adding a "hidden" Variable*, but... Maybe better would be Variable **VectorOperand, with appropriate const's thrown in?
https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h File src/IceInstARM32.h (right): https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h#newcode1375 src/IceInstARM32.h:1375: RegARM32::AllRegisters getDRegister(Variable *Src, uint32_t VectorSize) const; On 2016/02/04 16:44:12, stichnot wrote: > On 2016/02/04 15:23:57, John wrote: > > On 2016/02/03 22:32:21, stichnot wrote: > > > On 2016/02/03 21:02:21, Eric Holk wrote: > > > > On 2016/02/03 15:28:37, stichnot wrote: > > > > > This isn't absolutely clear-cut, but I think it would be a bit better if > > > these > > > > > three functions can figure out on their own what Src and VectorSize > should > > > be, > > > > > and not have to pass them as arguments. > > > > > > > > I wasn't totally thrilled with the design I used here. > > > > > > > > The challenge is that the source and vector size are determined by the > > > > destination for insertelement, but by the source for extractelement. The > > > common > > > > code doesn't know which version it is, so you'd basically have to > duplicate > > > this > > > > code in each of the subclasses. > > > > > > > > Perhaps the better design is to make a function that's not part of the > class > > > > that takes a register, numelements and an index. As it is right now, it's > > > > strange that the register and numelements are given explicitly but the > index > > > is > > > > implicit. > > > > > > > > It would actually be pretty easy to compute NumElements from the Src > though, > > > > which would at least make it a little cleaner. I went ahead and made this > > > > change, but I'm open to making more changes. > > > > > > I was thinking that the common code could determine whether to use Dest or > Src > > > based on whether the instruction kind is Insert or Extract. > > > > what about defining a virtual method in this class: > > > > Variable *getVectorOperand() const; > > > > and implement getDRegister in terms of that? I believe that's all you'd need. > > I recalled that in general, we had tried to avoid virtual methods in Inst and > Operand (particularly Variable), to keep object sizes smaller by avoiding > vtables. > > Unfortunately I see that Inst has a bunch of integral virtual methods, so oh > well. Adding a virtual method is probably a fine approach. Also, this would > not be a virtual method in the base class, so even if we were generally free and > clear of virtual methods, the impact here would be small. > > > > > Alternatively, you could have a > > > > const Variable *VectorOperand; > > > > and let the base classes set it to the right operand. (I'd prefer this > approach > > to avoid the virtual call.) > > I don't think precisely this is a good idea. There is essentially an assumption > that all Variable* are captured by FOREACH_VAR_IN_INST(). Probably nothing > would break today by adding a "hidden" Variable*, but... Maybe better would be > Variable **VectorOperand, with appropriate const's thrown in? I am **not** suggesting this member in lieu of the canonical Srcs array; See, for example, X86OperandMem: those classes (X8632 and X8664) will cache Base, and Index, but they will **also** add them to Srcs. You don't need a double pointer; a single pointer: Variable *const Operand;
https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h File src/IceInstARM32.h (right): https://codereview.chromium.org/1655313002/diff/1/src/IceInstARM32.h#newcode1381 src/IceInstARM32.h:1381: }; On 2016/02/03 22:32:21, stichnot wrote: > On 2016/02/03 21:02:21, Eric Holk wrote: > > On 2016/02/03 15:28:37, stichnot wrote: > > > Implement classof() for completeness. > > > > Does it make sense to have classof here, given that this class should never be > > instantiated except as a base of InstARM32Extract and InstARM32Insert (which > > both implement classof)? For example, InstArm32Pred does not include classof, > > that I can see. > > If you aren't already aware, have a look at > http://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html . > > In this case, the implementation should look something like: > > static bool classof(const Inst *Inst) { > return isClassof(Inst, Insert) || isClassof(Inst, Extract); > } > > I didn't know InstArm32Pred was incomplete, maybe it should be fixed? :) > > For example, see InstTarget, Constant, Variable. Ah, that makes sense. Thanks.
https://codereview.chromium.org/1655313002/diff/60001/src/IceInstARM32.cpp File src/IceInstARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/60001/src/IceInstARM32.cpp#ne... src/IceInstARM32.cpp:1046: auto Src = llvm::dyn_cast<Variable>(getSrc(0)); auto * https://codereview.chromium.org/1655313002/diff/60001/src/IceInstARM32.cpp#ne... src/IceInstARM32.cpp:1086: const auto Src = llvm::dyn_cast<Variable>(getSrc(0)); auto * https://codereview.chromium.org/1655313002/diff/60001/src/IceTargetLoweringAR... File src/IceTargetLoweringARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/60001/src/IceTargetLoweringAR... src/IceTargetLoweringARM32.cpp:310: QtoSRegisters[i] = Entry.IsVec128 && i < RegARM32::Reg_q8; Either now, or TODO. Don't asssume that registers' enum values are listed in a particular order. Compare their encoded values, or (better yet) add a column to IceRegistersARM32.def via gen_arm32_reg_tables.py. https://codereview.chromium.org/1655313002/diff/60001/src/IceTargetLoweringAR... src/IceTargetLoweringARM32.cpp:3857: } else { Remove the "else". http://llvm.org/docs/CodingStandards.html#don-t-use-else-after-a-return
https://codereview.chromium.org/1655313002/diff/60001/src/IceInstARM32.cpp File src/IceInstARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/60001/src/IceInstARM32.cpp#ne... src/IceInstARM32.cpp:1046: auto Src = llvm::dyn_cast<Variable>(getSrc(0)); On 2016/02/05 13:27:20, stichnot wrote: > auto * Done. https://codereview.chromium.org/1655313002/diff/60001/src/IceInstARM32.cpp#ne... src/IceInstARM32.cpp:1086: const auto Src = llvm::dyn_cast<Variable>(getSrc(0)); On 2016/02/05 13:27:20, stichnot wrote: > auto * Done. https://codereview.chromium.org/1655313002/diff/60001/src/IceTargetLoweringAR... File src/IceTargetLoweringARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/60001/src/IceTargetLoweringAR... src/IceTargetLoweringARM32.cpp:310: QtoSRegisters[i] = Entry.IsVec128 && i < RegARM32::Reg_q8; On 2016/02/05 13:27:20, stichnot wrote: > Either now, or TODO. Don't asssume that registers' enum values are listed in a > particular order. Compare their encoded values, or (better yet) add a column to > IceRegistersARM32.def via gen_arm32_reg_tables.py. I made it compare the encodings and also left a TODO to update the table.
https://codereview.chromium.org/1655313002/diff/80001/src/IceInstARM32.cpp File src/IceInstARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/80001/src/IceInstARM32.cpp#ne... src/IceInstARM32.cpp:1033: // This part assumes the register alias list is goes q0, d0, d1, s0, s1, s2, "is goes" - drop one of those word, I think. https://codereview.chromium.org/1655313002/diff/80001/src/IceInstARM32.cpp#ne... src/IceInstARM32.cpp:1046: auto *Src = llvm::dyn_cast<Variable>(getSrc(0)); llvm::cast<> https://codereview.chromium.org/1655313002/diff/80001/src/IceInstARM32.cpp#ne... src/IceInstARM32.cpp:1051: auto BitSize = typeWidthInBytes(DestTy) * CHAR_BIT; I thought even John had a problem with this use of auto? https://codereview.chromium.org/1655313002/diff/80001/src/IceInstARM32.cpp#ne... src/IceInstARM32.cpp:1086: const auto *Src = llvm::dyn_cast<Variable>(getSrc(0)); Change this to llvm::cast<> and remove the assert above. https://codereview.chromium.org/1655313002/diff/80001/src/IceTargetLoweringAR... File src/IceTargetLoweringARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/80001/src/IceTargetLoweringAR... src/IceTargetLoweringARM32.cpp:311: // TODO: It would be better to store a QtoS flag in the IceRegistersARM32 TODO(person): https://codereview.chromium.org/1655313002/diff/80001/tests_lit/assembler/arm... File tests_lit/assembler/arm32/insert-extract.ll (right): https://codereview.chromium.org/1655313002/diff/80001/tests_lit/assembler/arm... tests_lit/assembler/arm32/insert-extract.ll:23: define internal i32 @extract1_v4i32(<4 x i32> %src) #0 { Can you remove these "#0" strings?
https://codereview.chromium.org/1655313002/diff/80001/src/IceInstARM32.cpp File src/IceInstARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/80001/src/IceInstARM32.cpp#ne... src/IceInstARM32.cpp:1033: // This part assumes the register alias list is goes q0, d0, d1, s0, s1, s2, On 2016/02/08 18:08:29, stichnot wrote: > "is goes" - drop one of those word, I think. Done. https://codereview.chromium.org/1655313002/diff/80001/src/IceInstARM32.cpp#ne... src/IceInstARM32.cpp:1046: auto *Src = llvm::dyn_cast<Variable>(getSrc(0)); On 2016/02/08 18:08:29, stichnot wrote: > llvm::cast<> Done. https://codereview.chromium.org/1655313002/diff/80001/src/IceInstARM32.cpp#ne... src/IceInstARM32.cpp:1051: auto BitSize = typeWidthInBytes(DestTy) * CHAR_BIT; On 2016/02/08 18:08:29, stichnot wrote: > I thought even John had a problem with this use of auto? I'll replace it with `const uint32_t`. https://codereview.chromium.org/1655313002/diff/80001/src/IceInstARM32.cpp#ne... src/IceInstARM32.cpp:1086: const auto *Src = llvm::dyn_cast<Variable>(getSrc(0)); On 2016/02/08 18:08:29, stichnot wrote: > Change this to llvm::cast<> and remove the assert above. Done. https://codereview.chromium.org/1655313002/diff/80001/src/IceTargetLoweringAR... File src/IceTargetLoweringARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/80001/src/IceTargetLoweringAR... src/IceTargetLoweringARM32.cpp:311: // TODO: It would be better to store a QtoS flag in the IceRegistersARM32 On 2016/02/08 18:08:29, stichnot wrote: > TODO(person): Done. https://codereview.chromium.org/1655313002/diff/80001/tests_lit/assembler/arm... File tests_lit/assembler/arm32/insert-extract.ll (right): https://codereview.chromium.org/1655313002/diff/80001/tests_lit/assembler/arm... tests_lit/assembler/arm32/insert-extract.ll:23: define internal i32 @extract1_v4i32(<4 x i32> %src) #0 { On 2016/02/08 18:08:29, stichnot wrote: > Can you remove these "#0" strings? Done.
https://codereview.chromium.org/1655313002/diff/100001/src/IceInstARM32.cpp File src/IceInstARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/100001/src/IceInstARM32.cpp#n... src/IceInstARM32.cpp:1122: auto VectorSize = typeNumElements(Src->getType()); no auto here, I think https://codereview.chromium.org/1655313002/diff/100001/src/IceInstARM32.cpp#n... src/IceInstARM32.cpp:1151: const auto BitSize = typeWidthInBytes(typeElementType(DestTy)) * CHAR_BIT; no auto here, I think https://codereview.chromium.org/1655313002/diff/100001/src/IceInstARM32.cpp#n... src/IceInstARM32.cpp:1154: const auto VectorSize = typeNumElements(DestTy); no auto here, I think https://codereview.chromium.org/1655313002/diff/100001/src/IceInstARM32.cpp#n... src/IceInstARM32.cpp:1156: const auto Index = getDIndex(VectorSize, this->Index); not sure about this auto https://codereview.chromium.org/1655313002/diff/100001/src/IceInstARM32.h File src/IceInstARM32.h (right): https://codereview.chromium.org/1655313002/diff/100001/src/IceInstARM32.h#new... src/IceInstARM32.h:1375: Index(Index) { Could you add an assert to validate that Index is within the proper bounds? That validation is probably already being done a few layers up, but it's hard to look at the .cpp file and reason about whether or not you might be accessing something outside the array of registers. Same for the Insert instruction. https://codereview.chromium.org/1655313002/diff/100001/src/IceTargetLoweringA... File src/IceTargetLoweringARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/100001/src/IceTargetLoweringA... src/IceTargetLoweringARM32.cpp:4267: auto Index = Imm->getValue(); const uint32_t
https://codereview.chromium.org/1655313002/diff/100001/src/IceInstARM32.cpp File src/IceInstARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/100001/src/IceInstARM32.cpp#n... src/IceInstARM32.cpp:1122: auto VectorSize = typeNumElements(Src->getType()); On 2016/02/08 20:13:59, stichnot wrote: > no auto here, I think Done. https://codereview.chromium.org/1655313002/diff/100001/src/IceInstARM32.cpp#n... src/IceInstARM32.cpp:1151: const auto BitSize = typeWidthInBytes(typeElementType(DestTy)) * CHAR_BIT; On 2016/02/08 20:13:59, stichnot wrote: > no auto here, I think Done. https://codereview.chromium.org/1655313002/diff/100001/src/IceInstARM32.cpp#n... src/IceInstARM32.cpp:1154: const auto VectorSize = typeNumElements(DestTy); On 2016/02/08 20:13:59, stichnot wrote: > no auto here, I think Done. https://codereview.chromium.org/1655313002/diff/100001/src/IceInstARM32.cpp#n... src/IceInstARM32.cpp:1156: const auto Index = getDIndex(VectorSize, this->Index); On 2016/02/08 20:13:59, stichnot wrote: > not sure about this auto Done. https://codereview.chromium.org/1655313002/diff/100001/src/IceInstARM32.h File src/IceInstARM32.h (right): https://codereview.chromium.org/1655313002/diff/100001/src/IceInstARM32.h#new... src/IceInstARM32.h:1375: Index(Index) { On 2016/02/08 20:13:59, stichnot wrote: > Could you add an assert to validate that Index is within the proper bounds? > > That validation is probably already being done a few layers up, but it's hard to > look at the .cpp file and reason about whether or not you might be accessing > something outside the array of registers. > > Same for the Insert instruction. Done. https://codereview.chromium.org/1655313002/diff/100001/src/IceTargetLoweringA... File src/IceTargetLoweringARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/100001/src/IceTargetLoweringA... src/IceTargetLoweringARM32.cpp:4267: auto Index = Imm->getValue(); On 2016/02/08 20:13:59, stichnot wrote: > const uint32_t Done.
lgtm https://codereview.chromium.org/1655313002/diff/140001/src/IceInstARM32.cpp File src/IceInstARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/140001/src/IceInstARM32.cpp#n... src/IceInstARM32.cpp:1051: const Register SrcReg = (Register)Src->getRegNum(); Here and the other 4 places, I think it would be better to use static_cast<Register>(...) . https://codereview.chromium.org/1655313002/diff/140001/src/IceInstARM32.cpp#n... src/IceInstARM32.cpp:1093: assert(Src->getType() == IceType_v4f32 && SrcReg < RegARM32::Reg_q8); Instead of this: assert(a && b); Use this: assert(a); assert(b); to get better precision if the assert fails. https://codereview.chromium.org/1655313002/diff/140001/src/IceInstARM32.cpp#n... src/IceInstARM32.cpp:1105: Type DestTy = getDest()->getType(); const Type https://codereview.chromium.org/1655313002/diff/140001/src/IceInstARM32.cpp#n... src/IceInstARM32.cpp:1107: Variable *Src = llvm::cast<Variable>(getSrc(0)); auto *Src or maybe "const auto *src" for consistency with the following method https://codereview.chromium.org/1655313002/diff/140001/src/IceInstARM32.cpp#n... src/IceInstARM32.cpp:1146: const Variable *Src = llvm::cast<Variable>(getSrc(0)); const auto *Src https://codereview.chromium.org/1655313002/diff/140001/src/IceTargetLoweringA... File src/IceTargetLoweringARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/140001/src/IceTargetLoweringA... src/IceTargetLoweringARM32.cpp:3849: if (const ConstantInteger32 *Imm = llvm::dyn_cast<ConstantInteger32>(Src1)) { const auto *Imm https://codereview.chromium.org/1655313002/diff/140001/src/IceTargetLoweringA... src/IceTargetLoweringARM32.cpp:4266: if (const ConstantInteger32 *Imm = llvm::dyn_cast<ConstantInteger32>(Src2)) { const auto *Imm
Thanks! https://codereview.chromium.org/1655313002/diff/140001/src/IceInstARM32.cpp File src/IceInstARM32.cpp (right): https://codereview.chromium.org/1655313002/diff/140001/src/IceInstARM32.cpp#n... src/IceInstARM32.cpp:1051: const Register SrcReg = (Register)Src->getRegNum(); On 2016/02/08 23:08:58, stichnot wrote: > Here and the other 4 places, I think it would be better to use > static_cast<Register>(...) . Done. https://codereview.chromium.org/1655313002/diff/140001/src/IceInstARM32.cpp#n... src/IceInstARM32.cpp:1093: assert(Src->getType() == IceType_v4f32 && SrcReg < RegARM32::Reg_q8); On 2016/02/08 23:08:58, stichnot wrote: > Instead of this: > assert(a && b); > Use this: > assert(a); > assert(b); > to get better precision if the assert fails. Done. https://codereview.chromium.org/1655313002/diff/140001/src/IceInstARM32.cpp#n... src/IceInstARM32.cpp:1105: Type DestTy = getDest()->getType(); On 2016/02/08 23:08:58, stichnot wrote: > const Type Done. https://codereview.chromium.org/1655313002/diff/140001/src/IceInstARM32.cpp#n... src/IceInstARM32.cpp:1107: Variable *Src = llvm::cast<Variable>(getSrc(0)); On 2016/02/08 23:08:58, stichnot wrote: > auto *Src > > or maybe "const auto *src" for consistency with the following method Done. https://codereview.chromium.org/1655313002/diff/140001/src/IceInstARM32.cpp#n... src/IceInstARM32.cpp:1146: const Variable *Src = llvm::cast<Variable>(getSrc(0)); On 2016/02/08 23:08:58, stichnot wrote: > const auto *Src Done.
Description was changed from ========== Subzero: ARM32: lowering of vector insert and extract. BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4076 ========== to ========== Subzero: ARM32: lowering of vector insert and extract. BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4076 R=stichnot@chromium.org Committed: https://gerrit.chromium.org/gerrit/gitweb?p=native_client/pnacl-subzero.git;a... ==========
Message was sent while issue was closed.
Committed patchset #9 (id:160001) manually as 658bae2023cace91f8489f597c1c82dfe95aaaa2 (presubmit successful). |