Chromium Code Reviews

| OLD | NEW |
|---|---|
| (Empty) | |
| 1 Target-specific lowering in ICE | |
| 2 =============================== | |
| 3 | |
| 4 This document discusses several issues around generating target-specific ICE | |
|
jvoung (off chromium)
2014/05/15 23:47:34
Thanks for adding this!
| |
| 5 instructions from high-level ICE instructions. | |
| 6 | |
| 7 Meeting register address mode constraints | |
| 8 ----------------------------------------- | |
| 9 | |
| 10 Target-specific instructions often require specific operands to be in physical | |
| 11 registers. Sometimes one specific register is required, but usually any | |
| 12 register in a particular register class will suffice, and that register class is | |
| 13 defined by the instruction/operand type. | |
| 14 | |
| 15 The challenge is that ``Variable`` represents an operand that is either a stack | |
| 16 location in the current frame, or a physical register. Register allocation | |
| 17 happens after target-specific lowering, so during lowering we generally don't | |
| 18 know whether a ``Variable`` operand will meet a target instruction's physical | |
| 19 register requirement. | |
| 20 | |
| 21 To this end, ICE allows certain hints/directives: | |
| 22 | |
| 23 * ``Variable::setWeightInfinite()`` forces a ``Variable`` to get some | |
| 24 physical register (without specifying which particular one) from a | |
| 25 register class. | |
| 26 | |
| 27 * ``Variable::setRegNum()`` forces a ``Variable`` to be assigned a specific | |
| 28 physical register. | |
| 29 | |
| 30 * ``Variable::setPreferredRegister()`` registers a preference for a physical | |
| 31 register based on another ``Variable``'s physical register assignment. | |
| 32 | |
| 33 These hints/directives are described below in more detail. In most cases, | |
| 34 though, they don't need to be explicitly used, as the routines that create | |
| 35 lowered instructions have reasonable defaults and simple options that control | |
| 36 these hints/directives. | |
| 37 | |
| 38 The recommended ICE lowering strategy is to generate extra assignment | |
| 39 instructions involving extra ``Variable`` temporaries, using the | |
| 40 hints/directives to force suitable register assignments for the temporaries, and | |
| 41 then let the global register allocator clean things up. | |
| 42 | |
| 43 Note: There is a spectrum of *implementation complexity* versus *translation | |
| 44 speed* versus *code quality*. This recommended strategy picks a point on the | |
| 45 spectrum representing very low complexity ("splat-isel"), pretty good code | |
| 46 quality in terms of frame size and register shuffling/spilling, but perhaps not | |
| 47 the fastest translation speed since extra instructions and operands are created | |
| 48 up front and cleaned up at the end. | |
| 49 | |
| 50 Ensuring some physical register | |
| 51 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
| 52 | |
| 53 The x86 instruction:: | |
| 54 | |
| 55 mov dst, src | |
| 56 | |
| 57 needs at least one of its operands in a physical register (ignoring the case | |
| 58 where ``src`` is a constant). This can be done as follows:: | |
| 59 | |
| 60 mov reg, src | |
| 61 mov dst, reg | |
| 62 | |
| 63 so long as ``reg`` is guaranteed to have a physical register assignment. The | |
| 64 low-level lowering code that accomplishes this looks something like:: | |
| 65 | |
| 66 Variable *Reg; | |
| 67 Reg = Func->makeVariable(Dst->getType()); | |
| 68 Reg->setWeightInfinite(); | |
| 69 NewInst = InstX8632Mov::create(Func, Reg, Src); | |
| 70 NewInst = InstX8632Mov::create(Func, Dst, Reg); | |
| 71 | |
| 72 ``Cfg::makeVariable()`` generates a new temporary, and | |
| 73 ``Variable::setWeightInfinite()`` gives it infinite weight for the purpose of | |
| 74 register allocation, thus guaranteeing it a physical register. | |
| 75 | |
| 76 The ``_mov(Dest, Src)`` method in the ``TargetX8632`` class is sufficiently | |
| 77 powerful to handle these details in most situations. Its ``Dest`` argument is | |
| 78 an in/out parameter. If its input value is ``NULL``, then a new temporary | |
| 79 variable is created, its type is set to the same type as the ``Src`` operand, it | |
| 80 is given infinite register weight, and the new ``Variable`` is returned through | |
| 81 the in/out parameter. (This is in addition to the new temporary being the dest | |
| 82 operand of the ``mov`` instruction.) The simpler version of the above example | |
| 83 is:: | |
| 84 | |
| 85 Variable *Reg = NULL; | |
| 86 _mov(Reg, Src); | |
| 87 _mov(Dst, Reg); | |
| 88 | |
| 89 Preferring another ``Variable``'s physical register | |
| 90 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
| 91 | |
| 92 One problem with this example is that the register allocator usually just | |
| 93 assigns the first available register to a live range. If this instruction ends | |
| 94 the live range of ``src``, this may lead to code like the following:: | |
| 95 | |
| 96 mov reg:eax, src:esi | |
| 97 mov dst:edi, reg:eax | |
| 98 | |
| 99 Since the first instruction happens to end the live range of ``src:esi``, it | |
| 100 would be better to assign ``esi`` to ``reg``:: | |
| 101 | |
| 102 mov reg:esi, src:esi | |
| 103 mov dst:edi, reg:esi | |
| 104 | |
| 105 The first instruction, ``mov esi, esi``, is a redundant assignment and will | |
| 106 ultimately be elided, leaving just ``mov edi, esi``. | |
| 107 | |
| 108 We can tell the register allocator to prefer the register assigned to a | |
| 109 different ``Variable``, using ``Variable::setPreferredRegister()``:: | |
| 110 | |
| 111 Variable *Reg; | |
| 112 Reg = Func->makeVariable(Dst->getType()); | |
| 113 Reg->setWeightInfinite(); | |
| 114 Reg->setPreferredRegister(Src); | |
| 115 NewInst = InstX8632Mov::create(Func, Reg, Src); | |
| 116 NewInst = InstX8632Mov::create(Func, Dst, Reg); | |
| 117 | |
| 118 Or more simply:: | |
| 119 | |
| 120 Variable *Reg = NULL; | |
| 121 _mov(Reg, Src); | |
| 122 _mov(Dst, Reg); | |
| 123 Reg->setPreferredRegister(llvm::dyn_cast<Variable>(Src)); | |
| 124 | |
| 125 The usefulness of ``setPreferredRegister()`` is tied into the implementation of | |
| 126 the register allocator. ICE uses linear-scan register allocation, which sorts | |
| 127 live ranges by starting point and assigns registers in that order. Using | |
| 128 ``B->setPreferredRegister(A)`` only helps when ``A`` has already been assigned a | |
| 129 register by the time ``B`` is being considered. For an assignment ``B=A``, this | |
| 130 is usually a safe assumption because ``B``'s live range begins at this | |
| 131 instruction but ``A``'s live range must have started earlier. (There may be | |
| 132 exceptions for variables that are no longer in SSA form.) But | |
| 133 ``A->setPreferredRegister(B)`` is unlikely to help unless ``B`` has been | |
| 134 precolored. In summary, generally the best practice is to use a pattern like:: | |
| 135 | |
| 136 NewInst = InstX8632Mov::create(Func, Dst, Src); | |
| 137 Dst->setPreferredRegister(Src); | |
| 138 //Src->setPreferredRegister(Dst); -- unlikely to have any effect | |
| 139 | |
| 140 Ensuring a specific physical register | |
| 141 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
| 142 | |
| 143 Some instructions require operands in specific physical registers, or produce | |
| 144 results in specific physical registers. For example, the 32-bit ``ret`` | |
| 145 instruction needs its operand in ``eax``. This can be done with | |
| 146 ``Variable::setRegNum()``:: | |
| 147 | |
| 148 Variable *Reg; | |
| 149 Reg = Func->makeVariable(Src->getType()); | |
| 150 Reg->setWeightInfinite(); | |
| 151 Reg->setRegNum(Reg_eax); | |
| 152 NewInst = InstX8632Mov::create(Func, Reg, Src); | |
|
jvoung (off chromium)
2014/05/15 23:47:34
Given the above discussion about "Src->setPreferre
Jim Stichnoth
2014/05/17 14:14:32
When it's time to assign Src (or B) a register, it
| |
| 153 NewInst = InstX8632Ret::create(Func, Reg); | |
| 154 | |
| 155 Precoloring a ``Variable`` with ``Variable::setRegNum()`` effectively gives it | |
| 156 infinite weight for register allocation, so the call to | |
| 157 ``Variable::setWeightInfinite()`` is technically unnecessary, but perhaps | |
| 158 documents the intention a bit more strongly. | |
| 159 | |
| 160 The ``_mov(Dest, Src, RegNum)`` method in the ``TargetX8632`` class has an | |
| 161 optional ``RegNum`` argument to force a specific register assignment when the | |
| 162 input ``Dest`` is ``NULL``. As described above, passing in ``Dest=NULL`` causes | |
| 163 a new temporary variable to be created with infinite register weight, and in | |
| 164 addition the specific register is chosen. The simpler version of the above | |
| 165 example is:: | |
| 166 | |
| 167 Variable *Reg = NULL; | |
| 168 _mov(Reg, Src, Reg_eax); | |
| 169 _ret(Reg); | |
| 170 | |
| 171 Disabling live-range interference | |
| 172 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
| 173 | |
| 174 Another problem with the "``mov reg,src; mov dst,reg``" example happens when | |
| 175 the instructions do *not* end the live range of ``src``. In this case, the live | |
| 176 ranges of ``reg`` and ``src`` interfere, so they can't get the same physical | |
| 177 register despite the explicit preference. However, ``reg`` is meant to be an | |
| 178 alias of ``src`` so they needn't be considered to interfere with each other. | |
| 179 This can be expressed via the second (bool) argument of | |
| 180 ``setPreferredRegister()``:: | |
| 181 | |
| 182 Variable *Reg; | |
| 183 Reg = Func->makeVariable(Dst->getType()); | |
| 184 Reg->setWeightInfinite(); | |
| 185 Reg->setPreferredRegister(Src, true); | |
| 186 NewInst = InstX8632Mov::create(Func, Reg, Src); | |
| 187 NewInst = InstX8632Mov::create(Func, Dst, Reg); | |
| 188 | |
| 189 This should be used with caution and probably only for these short-live-range | |
| 190 temporaries, otherwise the classic "lost copy" or "lost swap" problem may be | |
| 191 encountered. | |
| 192 | |
| 193 Instructions with register side effects | |
| 194 --------------------------------------- | |
| 195 | |
| 196 Some instructions produce unwanted results in other registers, or otherwise kill | |
| 197 preexisting values in other registers. For example, a ``call`` kills the | |
| 198 scratch registers. Also, the x86-32 ``idiv`` instruction produces the quotient | |
| 199 in ``eax`` and the remainder in ``edx``, but generally only one of those is | |
| 200 needed in the lowering. It's important that the register allocator doesn't | |
| 201 allocate any such killed register to a live range that spans the instruction. | |
| 202 | |
| 203 ICE provides the ``InstFakeKill`` pseudo-instruction to mark such register | |
| 204 kills. For each source variable of the ``InstFakeKill``, a fake trivial live | |
| 205 range is created that begins and ends in that instruction. The ``InstFakeKill`` | |
| 206 instruction is inserted after the killing instruction, such as a ``call``:: | |
| 207 | |
| 208 CallInst = InstX8632Call::create(Func, ... ); | |
| 209 VarList KilledRegs; | |
| 210 KilledRegs.push_back(eax); | |
| 211 KilledRegs.push_back(ecx); | |
| 212 KilledRegs.push_back(edx); | |
| 213 NewInst = InstFakeKill::create(Func, KilledRegs, CallInst); | |
| 214 | |
| 215 The last argument to ``InstFakeKill::create()`` links it to the previous | |
| 216 call instruction, such that if its linked instruction is dead-code eliminated, | |
| 217 the ``InstFakeKill`` instruction is eliminated as well. | |
| 218 | |
| 219 The killed register arguments need to be assigned a physical register via | |
| 220 ``Variable::setRegNum()`` for this to be effective. To avoid a massive | |
| 221 proliferation of ``Variable`` temporaries, the ``TargetLowering`` object caches | |
| 222 one precolored ``Variable`` for each physical register:: | |
| 223 | |
| 224 CallInst = InstX8632Call::create(Func, ... ); | |
| 225 VarList KilledRegs; | |
| 226 Variable *eax = Func->getTarget()->getPhysicalRegister(Reg_eax); | |
| 227 Variable *ecx = Func->getTarget()->getPhysicalRegister(Reg_ecx); | |
| 228 Variable *edx = Func->getTarget()->getPhysicalRegister(Reg_edx); | |
| 229 KilledRegs.push_back(eax); | |
| 230 KilledRegs.push_back(ecx); | |
| 231 KilledRegs.push_back(edx); | |
| 232 NewInst = InstFakeKill::create(Func, KilledRegs, CallInst); | |
| 233 | |
| 234 At first glance, it may seem unnecessary to explicitly kill the register that | |
| 235 holds the ``call`` return value. However, if for some reason the ``call`` | |
| 236 result ends up being unused, dead-code elimination could remove dead assignments | |
| 237 and expose the return value register to a register allocation assignment | |
| 238 spanning the call, which would be incorrect. | |
| 239 | |
| 240 Instructions producing multiple values | |
| 241 -------------------------------------- | |
| 242 | |
| 243 ICE instructions allow at most one destination ``Variable``. Some machine | |
| 244 instructions produce more than one usable result. For example, the x86-32 | |
| 245 ``call`` ABI returns a 64-bit integer result in the ``edx:eax`` register pair. | |
| 246 Also, x86-32 has a version of the ``imul`` instruction that produces a 64-bit | |
| 247 result in the ``edx:eax`` register pair. | |
| 248 | |
| 249 To support multi-dest instructions, ICE provides the ``InstFakeDef`` | |
| 250 pseudo-instruction, whose destination can be precolored to the appropriate | |
| 251 physical register. For example, a ``call`` returning a 64-bit result in | |
| 252 ``edx:eax``:: | |
| 253 | |
| 254 CallInst = InstX8632Call::create(Func, RegLow, ... ); | |
| 255 ... | |
| 256 NewInst = InstFakeKill::create(Func, KilledRegs, CallInst); | |
| 257 Variable *RegHigh = Func->makeVariable(IceType_i32); | |
| 258 RegHigh->setRegNum(Reg_edx); | |
| 259 NewInst = InstFakeDef::create(Func, RegHigh); | |
| 260 | |
| 261 ``RegHigh`` is then assigned into the desired ``Variable``. If that assignment | |
| 262 ends up being dead-code eliminated, the ``InstFakeDef`` instruction may be | |
| 263 eliminated as well. | |
| 264 | |
| 265 Preventing dead-code elimination | |
| 266 -------------------------------- | |
| 267 | |
| 268 ICE instructions with a non-NULL ``Dest`` are subject to dead-code elimination. | |
| 269 However, some instructions must not be eliminated in order to preserve side | |
| 270 effects. This applies to most function calls, volatile loads, and loads and | |
| 271 integer divisions where the underlying language and runtime are relying on | |
| 272 hardware exception handling. | |
| 273 | |
| 274 ICE facilitates this with the ``InstFakeUse`` pseudo-instruction. This forces a | |
| 275 use of its source ``Variable`` to keep that variable's definition alive. Since | |
| 276 the ``InstFakeUse`` instruction has no ``Dest``, it will not be eliminated. | |
| 277 | |
| 278 Here is the full example of the x86-32 ``call`` returning a 32-bit integer | |
| 279 result:: | |
| 280 | |
| 281 Variable *Reg = Func->makeVariable(IceType_i32); | |
| 282 Reg->setRegNum(Reg_eax); | |
| 283 CallInst = InstX8632Call::create(Func, Reg, ... ); | |
| 284 VarList KilledRegs; | |
| 285 Variable *eax = Func->getTarget()->getPhysicalRegister(Reg_eax); | |
| 286 Variable *ecx = Func->getTarget()->getPhysicalRegister(Reg_ecx); | |
| 287 Variable *edx = Func->getTarget()->getPhysicalRegister(Reg_edx); | |
| 288 KilledRegs.push_back(eax); | |
| 289 KilledRegs.push_back(ecx); | |
| 290 KilledRegs.push_back(edx); | |
| 291 NewInst = InstFakeKill::create(Func, KilledRegs, CallInst); | |
| 292 NewInst = InstFakeUse::create(Func, Reg); | |
| 293 NewInst = InstX8632Mov::create(Func, Result, Reg); | |
| 294 | |
| 295 Without the ``InstFakeUse``, the entire call sequence could be dead-code | |
| 296 eliminated if its result were unused. | |
| 297 | |
| 298 One more note on this topic. These tools can be used to allow a multi-dest | |
| 299 instruction to be dead-code eliminated only when none of its results is live. | |
| 300 The key is to use the optional source parameter of the ``InstFakeDef`` | |
| 301 instruction. Using pseudocode:: | |
| 302 | |
| 303 t1:eax = call foo(arg1, ...) | |
| 304 InstFakeKill(eax, ecx, edx) | |
| 305 t2:edx = InstFakeDef(t1) | |
| 306 v_result_low = t1 | |
| 307 v_result_high = t2 | |
| 308 | |
| 309 If ``v_result_high`` is live but ``v_result_low`` is dead, adding ``t1`` as an | |
| 310 argument to ``InstFakeDef`` suffices to keep the ``call`` instruction live. | |
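
The liveness reasoning in the final pseudocode example can be checked with a small self-contained model. This is only a sketch under stated assumptions: ``ToyInst``, ``eliminateDeadCode()``, and ``buildCallSequence()`` are hypothetical names invented for illustration, not ICE classes or functions, and the pass is a plain backward liveness sweep over one basic block rather than ICE's actual dead-code elimination.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an ICE instruction: at most one Dest and any
// number of source variables. This is not the real ICE class hierarchy.
struct ToyInst {
  std::string Name;
  std::string Dest;               // empty string means "no Dest"
  std::vector<std::string> Srcs;
  bool Deleted;
  ToyInst(std::string N, std::string D, std::vector<std::string> S)
      : Name(std::move(N)), Dest(std::move(D)), Srcs(std::move(S)),
        Deleted(false) {}
};

// Backward sweep over one basic block. An instruction with a Dest is
// deleted when that Dest is not live after it; an instruction without a
// Dest is always kept. Live starts as the set of variables live at exit.
void eliminateDeadCode(std::vector<ToyInst> &Insts,
                       std::set<std::string> Live) {
  for (auto It = Insts.rbegin(); It != Insts.rend(); ++It) {
    if (!It->Dest.empty()) {
      if (Live.count(It->Dest) == 0) {
        It->Deleted = true;
        continue; // a deleted instruction does not use its sources
      }
      Live.erase(It->Dest);
    }
    for (const std::string &Src : It->Srcs)
      Live.insert(Src);
  }
}

// Builds the five-instruction sequence from the pseudocode.
// FakeDefUsesT1 selects whether the fake def of t2 names t1 as a source.
std::vector<ToyInst> buildCallSequence(bool FakeDefUsesT1) {
  std::vector<std::string> FakeDefSrcs;
  if (FakeDefUsesT1)
    FakeDefSrcs.push_back("t1");
  std::vector<ToyInst> Insts;
  Insts.push_back(ToyInst("call", "t1", {"arg1"}));
  Insts.push_back(ToyInst("fakekill", "", {"eax", "ecx", "edx"}));
  Insts.push_back(ToyInst("fakedef", "t2", FakeDefSrcs));
  Insts.push_back(ToyInst("mov", "v_result_low", {"t1"}));
  Insts.push_back(ToyInst("mov", "v_result_high", {"t2"}));
  return Insts;
}
```

In this model, with only ``v_result_high`` live at block exit, ``buildCallSequence(true)`` keeps the ``call`` and drops only the ``v_result_low`` assignment. ``buildCallSequence(false)`` lets the sweep delete the ``call`` as well, which is the failure that the source argument of ``InstFakeDef`` is meant to prevent.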
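
The ordering argument in the ``setPreferredRegister()`` discussion (a preference only helps when the preferred ``Variable`` already has a register, or is precolored) can also be illustrated with a toy allocator. Everything here is an assumption made for the sketch: ``ToyRange``, ``overlaps()``, and ``linearScan()`` are hypothetical, there is no spilling or weighting, and live ranges that merely touch at one instruction are treated as non-interfering. It is not ICE's register allocator.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a variable's live range; not the ICE Variable.
struct ToyRange {
  int Start;     // instruction number where the range begins
  int End;       // instruction number where the range ends
  int Preferred; // index of the range whose register is preferred, or -1
  int Reg;       // assigned register; -1 if unassigned, >= 0 if precolored
};

// Two ranges interfere when they share more than a single boundary point,
// so "mov b, a" that ends a's range does not make a and b interfere.
bool overlaps(const ToyRange &X, const ToyRange &Y) {
  return X.Start < Y.End && Y.Start < X.End;
}

// Minimal linear scan: visit ranges by increasing Start. Take the preferred
// range's register if that range already has one and it is free; otherwise
// take the lowest-numbered free register. Precolored ranges keep their Reg.
// Assumes every precolored Reg is in [0, NumRegs).
void linearScan(std::vector<ToyRange> &Ranges, int NumRegs) {
  std::vector<std::size_t> Order(Ranges.size());
  for (std::size_t I = 0; I < Order.size(); ++I)
    Order[I] = I;
  std::stable_sort(Order.begin(), Order.end(),
                   [&Ranges](std::size_t L, std::size_t R) {
                     return Ranges[L].Start < Ranges[R].Start;
                   });
  for (std::size_t Cur : Order) {
    if (Ranges[Cur].Reg >= 0)
      continue; // precolored
    std::vector<bool> Free(NumRegs, true);
    for (const ToyRange &Other : Ranges)
      if (Other.Reg >= 0 && overlaps(Other, Ranges[Cur]))
        Free[Other.Reg] = false;
    int Choice = -1;
    int Pref = Ranges[Cur].Preferred;
    if (Pref >= 0 && Ranges[Pref].Reg >= 0 && Free[Ranges[Pref].Reg])
      Choice = Ranges[Pref].Reg;
    for (int R = 0; Choice < 0 && R < NumRegs; ++R)
      if (Free[R])
        Choice = R;
    Ranges[Cur].Reg = Choice; // stays -1 if nothing is free (no spilling here)
  }
}
```

In this model, take three ranges in the order other, ``src``, ``reg``, where ``reg`` begins at the instruction that ends ``src``. Setting ``Preferred`` on ``reg`` to point at ``src`` makes ``reg`` reuse the register of ``src``. Setting ``Preferred`` on ``src`` to point at ``reg`` changes nothing, because ``reg`` is still unassigned when ``src`` is visited.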