| OLD | NEW |
| (Empty) |
| 1 Target-specific lowering in ICE | |
| 2 =============================== | |
| 3 | |
| 4 This document discusses several issues around generating target-specific ICE | |
| 5 instructions from high-level ICE instructions. | |
| 6 | |
| 7 Meeting register address mode constraints | |
| 8 ----------------------------------------- | |
| 9 | |
| 10 Target-specific instructions often require specific operands to be in physical | |
| 11 registers. Sometimes one specific register is required, but usually any | |
| 12 register in a particular register class will suffice, and that register class is | |
| 13 defined by the instruction/operand type. | |
| 14 | |
| 15 The challenge is that ``Variable`` represents an operand that is either a stack | |
| 16 location in the current frame, or a physical register. Register allocation | |
| 17 happens after target-specific lowering, so during lowering we generally don't | |
| 18 know whether a ``Variable`` operand will meet a target instruction's physical | |
| 19 register requirement. | |
| 20 | |
| 21 To address this, ICE provides two directives: | |
| 22 | |
| 23 * ``Variable::setWeightInfinite()`` forces a ``Variable`` to get some | |
| 24 physical register (without specifying which particular one) from a | |
| 25 register class. | |
| 26 | |
| 27 * ``Variable::setRegNum()`` forces a ``Variable`` to be assigned a specific | |
| 28 physical register. | |
| 29 | |
| 30 These directives are described below in more detail. In most cases, though, | |
| 31 they don't need to be explicitly used, as the routines that create lowered | |
| 32 instructions have reasonable defaults and simple options that control these | |
| 33 directives. | |
| 34 | |
| 35 The recommended ICE lowering strategy is to generate extra assignment | |
| 36 instructions involving extra ``Variable`` temporaries, using the directives to | |
| 37 force suitable register assignments for the temporaries, and then let the | |
| 38 register allocator clean things up. | |
| 39 | |
| 40 Note: There is a spectrum of *implementation complexity* versus *translation | |
| 41 speed* versus *code quality*. This recommended strategy picks a point on the | |
| 42 spectrum representing very low complexity ("splat-isel"), pretty good code | |
| 43 quality in terms of frame size and register shuffling/spilling, but perhaps not | |
| 44 the fastest translation speed since extra instructions and operands are created | |
| 45 up front and cleaned up at the end. | |
| 46 | |
| 47 Ensuring a non-specific physical register | |
| 48 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
| 49 | |
| 50 The x86 instruction:: | |
| 51 | |
| 52 mov dst, src | |
| 53 | |
| 54 needs at least one of its operands in a physical register (ignoring the case | |
| 55 where ``src`` is a constant). This can be done as follows:: | |
| 56 | |
| 57 mov reg, src | |
| 58 mov dst, reg | |
| 59 | |
| 60 so long as ``reg`` is guaranteed to have a physical register assignment. The | |
| 61 low-level lowering code that accomplishes this looks something like:: | |
| 62 | |
| 63 Variable *Reg; | |
| 64 Reg = Func->makeVariable(Dst->getType()); | |
| 65 Reg->setWeightInfinite(); | |
| 66 NewInst = InstX8632Mov::create(Func, Reg, Src); | |
| 67 NewInst = InstX8632Mov::create(Func, Dst, Reg); | |
| 68 | |
| 69 ``Cfg::makeVariable()`` generates a new temporary, and | |
| 70 ``Variable::setWeightInfinite()`` gives it infinite weight for the purpose of | |
| 71 register allocation, thus guaranteeing it a physical register (though leaving | |
| 72 the particular physical register to be determined by the register allocator). | |
| 73 | |
| 74 The ``_mov(Dest, Src)`` method in the ``TargetX8632`` class is sufficiently | |
| 75 powerful to handle these details in most situations. Its ``Dest`` argument is | |
| 76 an in/out parameter. If its input value is ``nullptr``, then a new temporary | |
| 77 variable is created, its type is set to the same type as the ``Src`` operand, it | |
| 78 is given infinite register weight, and the new ``Variable`` is returned through | |
| 79 the in/out parameter. (This is in addition to the new temporary being the dest | |
| 80 operand of the ``mov`` instruction.) The simpler version of the above example | |
| 81 is:: | |
| 82 | |
| 83 Variable *Reg = nullptr; | |
| 84 _mov(Reg, Src); | |
| 85 _mov(Dst, Reg); | |
| 86 | |
| 87 Preferring another ``Variable``'s physical register | |
| 88 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
| 89 | |
| 90 (An older version of ICE allowed the lowering code to provide a register | |
| 91 allocation hint: if a physical register is to be assigned to one ``Variable``, | |
| 92 then prefer a particular ``Variable``'s physical register if available. This | |
| 93 hint would be used to try to reduce the amount of register shuffling. | |
| 94 Currently, the register allocator does this automatically through the | |
| 95 ``FindPreference`` logic.) | |
| 96 | |
| 97 Ensuring a specific physical register | |
| 98 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
| 99 | |
| 100 Some instructions require operands in specific physical registers, or produce | |
| 101 results in specific physical registers. For example, the 32-bit ``ret`` | |
| 102 instruction needs its operand in ``eax``. This can be done with | |
| 103 ``Variable::setRegNum()``:: | |
| 104 | |
| 105 Variable *Reg; | |
| 106 Reg = Func->makeVariable(Src->getType()); | |
| 107 Reg->setWeightInfinite(); | |
| 108 Reg->setRegNum(Reg_eax); | |
| 109 NewInst = InstX8632Mov::create(Func, Reg, Src); | |
| 110 NewInst = InstX8632Ret::create(Func, Reg); | |
| 111 | |
| 112 Precoloring a ``Variable`` with ``Variable::setRegNum()`` effectively gives it | |
| 113 infinite weight for register allocation, so the call to | |
| 114 ``Variable::setWeightInfinite()`` is technically unnecessary, but it documents | |
| 115 the intention more explicitly. | |
| 116 | |
| 117 The ``_mov(Dest, Src, RegNum)`` method in the ``TargetX8632`` class has an | |
| 118 optional ``RegNum`` argument to force a specific register assignment when the | |
| 119 input ``Dest`` is ``nullptr``. As described above, passing in ``Dest=nullptr`` | |
| 120 causes a new temporary variable to be created with infinite register weight, and | |
| 121 in addition the specific register is chosen. The simpler version of the above | |
| 122 example is:: | |
| 123 | |
| 124 Variable *Reg = nullptr; | |
| 125 _mov(Reg, Src, Reg_eax); | |
| 126 _ret(Reg); | |
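The in/out ``Dest`` behavior of ``_mov`` described in this subsection and the previous one can be sketched with a self-contained toy model. Everything named ``Toy...`` below, along with ``NoRegister`` and ``mustHaveRegister()``, is a hypothetical stand-in invented for illustration and is not the real ICE API. The real ``TargetX8632::_mov`` may differ in details, for example in how it treats ``RegNum`` when ``Dest`` is not ``nullptr``.

```cpp
#include <cassert>
#include <limits>
#include <memory>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the real ICE classes.
constexpr int NoRegister = -1;
enum class ToyType { I32, I64 };

struct ToyVariable {
  ToyType Type = ToyType::I32;
  int Weight = 1;          // register-allocation priority
  int RegNum = NoRegister; // precolored physical register, if any

  void setWeightInfinite() { Weight = std::numeric_limits<int>::max(); }
  void setRegNum(int Reg) { RegNum = Reg; }
  // Precoloring implies a register, so it counts as infinite weight too.
  bool mustHaveRegister() const {
    return Weight == std::numeric_limits<int>::max() || RegNum != NoRegister;
  }
};

struct ToyMov {
  ToyVariable *Dest;
  ToyVariable *Src;
};

class ToyLowering {
public:
  ToyVariable *makeVariable(ToyType Type) {
    Vars.push_back(std::make_unique<ToyVariable>());
    Vars.back()->Type = Type;
    return Vars.back().get();
  }

  // Dest is an in/out parameter: if it comes in as nullptr, a fresh
  // infinite-weight temporary of Src's type is created and handed back.
  void _mov(ToyVariable *&Dest, ToyVariable *Src, int RegNum = NoRegister) {
    assert(Src != nullptr && "a mov needs a source operand");
    if (Dest == nullptr) {
      Dest = makeVariable(Src->Type);
      Dest->setWeightInfinite();
      if (RegNum != NoRegister)
        Dest->setRegNum(RegNum); // force one specific register
    }
    Insts.push_back(ToyMov{Dest, Src});
  }

  std::vector<ToyMov> Insts; // emitted instructions, in order

private:
  std::vector<std::unique_ptr<ToyVariable>> Vars; // owns all variables
};
```

In this sketch, calling ``_mov(Reg, Src)`` with ``Reg == nullptr`` leaves ``Reg`` pointing at a new temporary that must receive a register, while a second ``_mov(Dst, Reg)`` leaves ``Dst`` untouched apart from recording the instruction.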
| 127 | |
| 128 Disabling live-range interference | |
| 129 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
| 130 | |
| 131 (An older version of ICE allowed an overly strong preference for another | |
| 132 ``Variable``'s physical register even if their live ranges interfered. This was | |
| 133 risky, and currently the register allocator derives this automatically through | |
| 134 the ``AllowOverlap`` logic.) | |
| 135 | |
| 136 Call instructions kill scratch registers | |
| 137 ---------------------------------------- | |
| 138 | |
| 139 A ``call`` instruction kills the values in all scratch registers, so it's | |
| 140 important that the register allocator doesn't allocate a scratch register to a | |
| 141 ``Variable`` whose live range spans the ``call`` instruction. ICE provides the | |
| 142 ``InstFakeKill`` pseudo-instruction to compactly mark such register kills. For | |
| 143 each scratch register, a fake trivial live range is created that begins and ends | |
| 144 in that instruction. The ``InstFakeKill`` instruction is inserted after the | |
| 145 ``call`` instruction. For example:: | |
| 146 | |
| 147 CallInst = InstX8632Call::create(Func, ... ); | |
| 148 NewInst = InstFakeKill::create(Func, CallInst); | |
| 149 | |
| 150 The last argument to the ``InstFakeKill`` constructor links it to the previous | |
| 151 call instruction, such that if its linked instruction is dead-code eliminated, | |
| 152 the ``InstFakeKill`` instruction is eliminated as well. The linked ``call`` | |
| 153 instruction could be to a target known to be free of side effects, and therefore | |
| 154 safe to remove if its result is unused. | |
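The effect of the trivial live ranges can be illustrated with a small self-contained model. This is only a sketch under simplifying assumptions: live ranges are single inclusive intervals of instruction numbers, the scratch registers are taken to be ``eax``, ``ecx`` and ``edx``, and ``ToyReg``, ``LiveRange`` and ``ToyRegUsage`` are hypothetical names that do not exist in ICE. It only models candidates that are not precolored.

```cpp
#include <cassert>
#include <initializer_list>
#include <map>
#include <vector>

// Toy register set for illustration; NOT the real ICE register numbering.
enum ToyReg { Reg_eax, Reg_ecx, Reg_edx, Reg_ebx };

struct LiveRange {
  int Begin; // first instruction number, inclusive
  int End;   // last instruction number, inclusive
};

inline bool overlaps(const LiveRange &A, const LiveRange &B) {
  assert(A.Begin <= A.End && B.Begin <= B.End);
  return A.Begin <= B.End && B.Begin <= A.End;
}

class ToyRegUsage {
public:
  // Models InstFakeKill: each scratch register gets a trivial live range
  // that begins and ends at the kill instruction.
  void addFakeKill(int InstNumber) {
    for (ToyReg Reg : {Reg_eax, Reg_ecx, Reg_edx})
      Busy[Reg].push_back(LiveRange{InstNumber, InstNumber});
  }

  // A register is usable for a candidate live range only if no range
  // already recorded for that register overlaps the candidate.
  bool isAvailable(ToyReg Reg, const LiveRange &Candidate) const {
    auto It = Busy.find(Reg);
    if (It == Busy.end())
      return true;
    for (const LiveRange &Range : It->second)
      if (overlaps(Range, Candidate))
        return false;
    return true;
  }

private:
  std::map<ToyReg, std::vector<LiveRange>> Busy;
};
```

With a kill recorded at instruction 5, a candidate range of 2 to 9 interferes with every scratch register but can still use ``Reg_ebx``, whereas a range of 1 to 4 ends before the kill and can use ``Reg_eax``.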
| 155 | |
| 156 Instructions producing multiple values | |
| 157 -------------------------------------- | |
| 158 | |
| 159 ICE instructions allow at most one destination ``Variable``. Some machine | |
| 160 instructions produce more than one usable result. For example, the x86-32 | |
| 161 ``call`` ABI returns a 64-bit integer result in the ``edx:eax`` register pair. | |
| 162 Also, x86-32 has a version of the ``imul`` instruction that produces a 64-bit | |
| 163 result in the ``edx:eax`` register pair. The x86-32 ``idiv`` instruction | |
| 164 produces the quotient in ``eax`` and the remainder in ``edx``, though generally | |
| 165 only one or the other is needed in the lowering. | |
| 166 | |
| 167 To support multi-dest instructions, ICE provides the ``InstFakeDef`` | |
| 168 pseudo-instruction, whose destination can be precolored to the appropriate | |
| 169 physical register. For example, a ``call`` returning a 64-bit result in | |
| 170 ``edx:eax``:: | |
| 171 | |
| 172 CallInst = InstX8632Call::create(Func, RegLow, ... ); | |
| 173 NewInst = InstFakeKill::create(Func, CallInst); | |
| 174 Variable *RegHigh = Func->makeVariable(IceType_i32); | |
| 175 RegHigh->setRegNum(Reg_edx); | |
| 176 NewInst = InstFakeDef::create(Func, RegHigh); | |
| 177 | |
| 178 ``RegHigh`` is then assigned into the desired ``Variable``. If that assignment | |
| 179 ends up being dead-code eliminated, the ``InstFakeDef`` instruction may be | |
| 180 eliminated as well. | |
| 181 | |
| 182 Managing dead-code elimination | |
| 183 ------------------------------ | |
| 184 | |
| 185 ICE instructions with a non-nullptr ``Dest`` are subject to dead-code | |
| 186 elimination. However, some instructions must not be eliminated in order to | |
| 187 preserve side effects. This applies to most function calls, volatile loads, and | |
| 188 loads and integer divisions where the underlying language and runtime are | |
| 189 relying on hardware exception handling. | |
| 190 | |
| 191 ICE facilitates this with the ``InstFakeUse`` pseudo-instruction. This forces a | |
| 192 use of its source ``Variable`` to keep that variable's definition alive. Since | |
| 193 the ``InstFakeUse`` instruction has no ``Dest``, it will not be eliminated. | |
| 194 | |
| 195 Here is the full example of the x86-32 ``call`` returning a 32-bit integer | |
| 196 result:: | |
| 197 | |
| 198 Variable *Reg = Func->makeVariable(IceType_i32); | |
| 199 Reg->setRegNum(Reg_eax); | |
| 200 CallInst = InstX8632Call::create(Func, Reg, ... ); | |
| 201 NewInst = InstFakeKill::create(Func, CallInst); | |
| 202 NewInst = InstFakeUse::create(Func, Reg); | |
| 203 NewInst = InstX8632Mov::create(Func, Result, Reg); | |
| 204 | |
| 205 Without the ``InstFakeUse``, the entire call sequence could be dead-code | |
| 206 eliminated if its result were unused. | |
| 207 | |
| 208 One more note on this topic. These tools can be used to allow a multi-dest | |
| 209 instruction to be dead-code eliminated only when none of its results is live. | |
| 210 The key is to use the optional source parameter of the ``InstFakeDef`` | |
| 211 instruction. Using pseudocode:: | |
| 212 | |
| 213 t1:eax = call foo(arg1, ...) | |
| 214 InstFakeKill // eax, ecx, edx | |
| 215 t2:edx = InstFakeDef(t1) | |
| 216 v_result_low = t1 | |
| 217 v_result_high = t2 | |
| 218 | |
| 219 If ``v_result_high`` is live but ``v_result_low`` is dead, adding ``t1`` as an | |
| 220 argument to ``InstFakeDef`` suffices to keep the ``call`` instruction live. | |
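The interaction between the pseudo-instructions and dead-code elimination can be checked with a toy backward liveness pass. This is a sketch, not ICE's actual algorithm: ``ToyInst``, ``eliminateDeadCode``, ``keeps`` and ``callSequence`` are hypothetical names, variables are plain strings, and the ``InstFakeKill`` is left out because it has no ``Dest`` and reads no variables in this model.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy instruction for illustration only; NOT the real ICE Inst class.
struct ToyInst {
  std::string Name;
  std::string Dest;              // empty means "no Dest"
  std::vector<std::string> Srcs; // variables read by the instruction
};

// Backward pass: drop an instruction only if it has a Dest that is not live.
// Live starts out as the set of variables live after the last instruction.
std::vector<ToyInst> eliminateDeadCode(const std::vector<ToyInst> &Insts,
                                       std::set<std::string> Live) {
  std::vector<ToyInst> Kept;
  for (auto It = Insts.rbegin(); It != Insts.rend(); ++It) {
    assert(!It->Name.empty() && "every toy instruction is named");
    const bool HasDest = !It->Dest.empty();
    if (HasDest && Live.count(It->Dest) == 0)
      continue; // result is never used: eliminate
    if (HasDest)
      Live.erase(It->Dest);
    Live.insert(It->Srcs.begin(), It->Srcs.end());
    Kept.push_back(*It);
  }
  // Instructions were collected last-to-first; restore program order.
  return std::vector<ToyInst>(Kept.rbegin(), Kept.rend());
}

bool keeps(const std::vector<ToyInst> &Insts, const std::string &Name) {
  for (const ToyInst &Inst : Insts)
    if (Inst.Name == Name)
      return true;
  return false;
}

// The 64-bit call sequence from the pseudocode above.
std::vector<ToyInst> callSequence(bool FakeDefReadsT1) {
  std::vector<std::string> FakeDefSrcs;
  if (FakeDefReadsT1)
    FakeDefSrcs.push_back("t1");
  return {
      {"call", "t1", {}},
      {"fakedef", "t2", FakeDefSrcs},
      {"mov_low", "v_result_low", {"t1"}},
      {"mov_high", "v_result_high", {"t2"}},
  };
}
```

When only ``v_result_high`` is live at the end, ``callSequence(true)`` keeps the ``call`` because the ``fakedef`` reads ``t1``, while ``callSequence(false)`` loses it. The same pass also shows the ``InstFakeUse`` effect: an instruction with an empty ``Dest`` that reads ``t1`` is never dropped, so it keeps the defining ``call`` alive even when nothing else is live.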
| 221 | |
| 222 Instructions modifying source operands | |
| 223 -------------------------------------- | |
| 224 | |
| 225 Some native instructions may modify one or more source operands. For example, | |
| 226 the x86 ``xadd`` and ``xchg`` instructions modify both source operands. Some | |
| 227 analysis needs to identify every place a ``Variable`` is modified, and it uses | |
| 228 the presence of a ``Dest`` variable for this analysis. Since ICE instructions | |
| 229 have at most one ``Dest``, the ``xadd`` and ``xchg`` instructions need special | |
| 230 treatment. | |
| 231 | |
| 232 A ``Variable`` that is not the ``Dest`` can be marked as modified by adding an | |
| 233 ``InstFakeDef``. However, this is not sufficient, as the ``Variable`` may have | |
| 234 no more live uses, which could result in the ``InstFakeDef`` being dead-code | |
| 235 eliminated. The solution is to add an ``InstFakeUse`` as well. | |
| 236 | |
| 237 To summarize, for every source ``Variable`` that the instruction modifies and | |
| 238 that is not the instruction's ``Dest``, append an ``InstFakeDef`` and an | |
| 239 ``InstFakeUse`` instruction to provide the necessary analysis information. | |
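The rule for modified source operands can be written down as a short helper. This is a minimal sketch in which variables are plain strings; ``ToyPseudo`` and ``fakeDefUsePairs`` are hypothetical names, and real lowering code would create ``InstFakeDef`` and ``InstFakeUse`` instructions rather than records like these.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy pseudo-instruction record for illustration; NOT the real ICE classes.
struct ToyPseudo {
  enum Kind { FakeDef, FakeUse } Op;
  std::string Var;
};

// For each modified source that is not Dest, emit a FakeDef (records the
// modification) followed by a FakeUse (keeps that FakeDef from being dead).
std::vector<ToyPseudo>
fakeDefUsePairs(const std::string &Dest,
                const std::vector<std::string> &ModifiedSrcs) {
  assert(!Dest.empty() && "the native instruction is assumed to have a Dest");
  std::vector<ToyPseudo> Result;
  for (const std::string &Src : ModifiedSrcs) {
    if (Src == Dest)
      continue; // already visible to the analysis through Dest
    Result.push_back({ToyPseudo::FakeDef, Src});
    Result.push_back({ToyPseudo::FakeUse, Src});
  }
  return Result;
}
```

For an ``xchg`` modeled with ``Dest`` ``a`` and modified sources ``a`` and ``b``, the helper yields one ``FakeDef`` and one ``FakeUse``, both for ``b``.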