OLD | NEW |
---|---|
(Empty) | |
1 Target-specific lowering in ICE | |
2 =============================== | |
3 | |
4 This document discusses several issues around generating target-specific ICE | |
jvoung (off chromium)
2014/05/15 23:47:34
Thanks for adding this!
| |
5 instructions from high-level ICE instructions. | |
6 | |
7 Meeting register address mode constraints | |
8 ----------------------------------------- | |
9 | |
10 Target-specific instructions often require specific operands to be in physical | |
11 registers. Sometimes one specific register is required, but usually any | |
12 register in a particular register class will suffice, and that register class is | |
13 defined by the instruction/operand type. | |
14 | |
15 The challenge is that ``Variable`` represents an operand that is either a stack | |
16 location in the current frame, or a physical register. Register allocation | |
17 happens after target-specific lowering, so during lowering we generally don't | |
18 know whether an ``Variable`` operand will meet a target instruction's physical | |
19 register requirement. | |
20 | |
21 To this end, ICE allows certain hints/directives: | |
22 | |
23 * ``Variable::setWeightInfinite()`` forces a ``Variable`` to get some | |
24 physical register (without specifying which particular one) from a | |
25 register class. | |
26 | |
27 * ``Variable::setRegNum()`` forces a ``Variable`` to be assigned a specific | |
28 physical register. | |
29 | |
30 * ``Variable::setPreferredRegister()`` registers a preference for a physical | |
31 register based on another ``Variable``'s physical register assignment. | |
32 | |
33 These hints/directives are described below in more detail. In most cases, | |
34 though, they don't need to be explicity used, as the routines that create | |
35 lowered instructions have reasonable defaults and simple options that control | |
36 these hints/directives. | |
37 | |
38 The recommended ICE lowering strategy is to generate extra assignment | |
39 instructions involving extra ``Variable`` temporaries, using the | |
40 hints/directives to force suitable register assignments for the temporaries, and | |
41 then let the global register allocator clean things up. | |
42 | |
43 Note: There is a spectrum of *implementation complexity* versus *translation | |
44 speed* versus *code quality*. This recommended strategy picks a point on the | |
45 spectrum representing very low complexity ("splat-isel"), pretty good code | |
46 quality in terms of frame size and register shuffling/spilling, but perhaps not | |
47 the fastest translation speed since extra instructions and operands are created | |
48 up front and cleaned up at the end. | |
49 | |
50 Ensuring some physical register | |
51 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
52 | |
53 The x86 instruction:: | |
54 | |
55 mov dst, src | |
56 | |
57 needs at least one of its operands in a physical register (ignoring the case | |
58 where ``src`` is a constant). This can be done as follows:: | |
59 | |
60 mov reg, src | |
61 mov dst, reg | |
62 | |
63 so long as ``reg`` is guaranteed to have a physical register assignment. The | |
64 low-level lowering code that accomplishes this looks something like:: | |
65 | |
66 Variable *Reg; | |
67 Reg = Func->makeVariable(Dst->getType()); | |
68 Reg->setWeightInfinite(); | |
69 NewInst = InstX8632Mov::create(Func, Reg, Src); | |
70 NewInst = InstX8632Mov::create(Func, Dst, Reg); | |
71 | |
72 ``Cfg::makeVariable()`` generates a new temporary, and | |
73 ``Variable::setWeightInfinite()`` gives it infinite weight for the purpose of | |
74 register allocation, thus guaranteeing it a physical register. | |
75 | |
76 The ``_mov(Dest, Src)`` method in the ``TargetX8632`` class is sufficiently | |
77 powerful to handle these details in most situations. Its ``Dest`` argument is | |
78 an in/out parameter. If its input value is ``NULL``, then a new temporary | |
79 variable is created, its type is set to the same type as the ``Src`` operand, it | |
80 is given infinite register weight, and the new ``Variable`` is returned through | |
81 the in/out parameter. (This is in addition to the new temporary being the dest | |
82 operand of the ``mov`` instruction.) The simpler version of the above example | |
83 is:: | |
84 | |
85 Variable *Reg = NULL; | |
86 _mov(Reg, Src); | |
87 _mov(Dst, Reg); | |
88 | |
89 Preferring another ``Variable``'s physical register | |
90 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
91 | |
92 One problem with this example is that the register allocator usually just | |
93 assigns the first available register to a live range. If this instruction ends | |
94 the live range of ``src``, this may lead to code like the following:: | |
95 | |
96 mov reg:eax, src:esi | |
97 mov dst:edi, reg:eax | |
98 | |
99 Since the first instruction happens to end the live range of ``src:esi``, it | |
100 would be better to assign ``esi`` to ``reg``:: | |
101 | |
102 mov reg:esi, src:esi | |
103 mov dst:edi, reg:esi | |
104 | |
105 The first instruction, ``mov esi, esi``, is a redundant assignment and will | |
106 ultimately be elided, leaving just ``mov edi, esi``. | |
107 | |
108 We can tell the register allocator to prefer the register assigned to a | |
109 different ``Variable``, using ``Variable::setPreferredRegister()``:: | |
110 | |
111 Variable *Reg; | |
112 Reg = Func->makeVariable(Dst->getType()); | |
113 Reg->setWeightInfinite(); | |
114 Reg->setPreferredRegister(Src); | |
115 NewInst = InstX8632Mov::create(Func, Reg, Src); | |
116 NewInst = InstX8632Mov::create(Func, Dst, Reg); | |
117 | |
118 Or more simply:: | |
119 | |
120 Variable *Reg = NULL; | |
121 _mov(Reg, Src); | |
122 _mov(Dst, Reg); | |
123 Reg->setPreferredRegister(llvm::dyn_cast<Variable>(Src)); | |
124 | |
125 The usefulness of ``setPreferredRegister()`` is tied into the implementation of | |
126 the register allocator. ICE uses linear-scan register allocation, which sorts | |
127 live ranges by starting point and assigns registers in that order. Using | |
128 ``B->setPreferredRegister(A)`` only helps when ``A`` has already been assigned a | |
129 register by the time ``B`` is being considered. For an assignment ``B=A``, this | |
130 is usually a safe assumption because ``B``'s live range begins at this | |
131 instruction but ``A``'s live range must have started earlier. (There may be | |
132 exceptions for variables that are no longer in SSA form.) But | |
133 ``A->setPreferredRegister(B)`` is unlikely to help unless ``B`` has been | |
134 precolored. In summary, generally the best practice is to use a pattern like:: | |
135 | |
136 NewInst = InstX8632Mov::create(Func, Dst, Src); | |
137 Dst->setPreferredRegister(Src); | |
138 //Src->setPreferredRegister(Dst); -- unlikely to have any effect | |
139 | |
140 Ensuring a specific physical register | |
141 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
142 | |
143 Some instructions require operands in specific physical registers, or produce | |
144 results in specific physical registers. For example, the 32-bit ``ret`` | |
145 instruction needs its operand in ``eax``. This can be done with | |
146 ``Variable::setRegNum()``:: | |
147 | |
148 Variable *Reg; | |
149 Reg = Func->makeVariable(Src->getType()); | |
150 Reg->setWeightInfinite(); | |
151 Reg->setRegNum(Reg_eax); | |
152 NewInst = InstX8632Mov::create(Func, Reg, Src); | |
jvoung (off chromium)
2014/05/15 23:47:34
Given the above discussion about "Src->setPreferre
Jim Stichnoth
2014/05/17 14:14:32
When it's time to assign Src (or B) a register, it
| |
153 NewInst = InstX8632Ret::create(Func, Reg); | |
154 | |
155 Precoloring with ``Variable::setRegNum()`` effectively gives it infinite weight | |
156 for register allocation, so the call to ``Variable::setWeightInfinite()`` is | |
157 technically unnecessary, but perhaps documents the intention a bit more | |
158 strongly. | |
159 | |
160 The ``_mov(Dest, Src, RegNum)`` method in the ``TargetX8632`` class has an | |
161 optional ``RegNum`` argument to force a specific register assignment when the | |
162 input ``Dest`` is ``NULL``. As described above, passing in ``Dest=NULL`` causes | |
163 a new temporary variable to be created with infinite register weight, and in | |
164 addition the specific register is chosen. The simpler version of the above | |
165 example is:: | |
166 | |
167 Variable *Reg = NULL; | |
168 _mov(Reg, Src, Reg_eax); | |
169 _ret(Reg); | |
170 | |
171 Disabling live-range interference | |
172 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
173 | |
174 Another problem with the "``mov reg,src; mov dst,reg``" example happens when | |
175 the instructions do *not* end the live range of ``src``. In this case, the live | |
176 ranges of ``reg`` and ``src`` interfere, so they can't get the same physical | |
177 register despite the explicit preference. However, ``reg`` is meant to be an | |
178 alias of ``src`` so they needn't be considered to interfere with each other. | |
179 This can be expressed via the second (bool) argument of | |
180 ``setPreferredRegister()``:: | |
181 | |
182 Variable *Reg; | |
183 Reg = Func->makeVariable(Dst->getType()); | |
184 Reg->setWeightInfinite(); | |
185 Reg->setPreferredRegister(Src, true); | |
186 NewInst = InstX8632Mov::create(Func, Reg, Src); | |
187 NewInst = InstX8632Mov::create(Func, Dst, Reg); | |
188 | |
189 This should be used with caution and probably only for these short-live-range | |
190 temporaries, otherwise the classic "lost copy" or "lost swap" problem may be | |
191 encountered. | |
192 | |
193 Instructions with register side effects | |
194 --------------------------------------- | |
195 | |
196 Some instructions produce unwanted results in other registers, or otherwise kill | |
197 preexisting values in other registers. For example, a ``call`` kills the | |
198 scratch registers. Also, the x86-32 ``idiv`` instruction produces the quotient | |
199 in ``eax`` and the remainder in ``edx``, but generally only one of those is | |
200 needed in the lowering. It's important that the register allocator doesn't | |
201 allocate that register to a live range that spans the instruction. | |
202 | |
203 ICE provides the ``InstFakeKill`` pseudo-instruction to mark such register | |
204 kills. For each of the instruction's source variables, a fake trivial live | |
205 range is created that begins and ends in that instruction. The ``InstFakeKill`` | |
206 instruction is inserted after the ``call`` instruction. For example:: | |
207 | |
208 CallInst = InstX8632Call::create(Func, ... ); | |
209 VarList KilledRegs; | |
210 KilledRegs.push_back(eax); | |
211 KilledRegs.push_back(ecx); | |
212 KilledRegs.push_back(edx); | |
213 NewInst = InstFakeKill::create(Func, KilledRegs, CallInst); | |
214 | |
215 The last argument to the ``InstFakeKill`` constructor links it to the previous | |
216 call instruction, such that if its linked instruction is dead-code eliminated, | |
217 the ``InstFakeKill`` instruction is eliminated as well. | |
218 | |
219 The killed register arguments need to be assigned a physical register via | |
220 ``Variable::setRegNum()`` for this to be effective. To avoid a massive | |
221 proliferation of ``Variable`` temporaries, the ``TargetLowering`` object caches | |
222 one precolored ``Variable`` for each physical register:: | |
223 | |
224 CallInst = InstX8632Call::create(Func, ... ); | |
225 VarList KilledRegs; | |
226 Variable *eax = Func->getTarget()->getPhysicalRegister(Reg_eax); | |
227 Variable *ecx = Func->getTarget()->getPhysicalRegister(Reg_ecx); | |
228 Variable *edx = Func->getTarget()->getPhysicalRegister(Reg_edx); | |
229 KilledRegs.push_back(eax); | |
230 KilledRegs.push_back(ecx); | |
231 KilledRegs.push_back(edx); | |
232 NewInst = InstFakeKill::create(Func, KilledRegs, CallInst); | |
233 | |
234 On first glance, it may seem unnecessary to explicitly kill the register that | |
235 returns the ``call`` return value. However, if for some reason the ``call`` | |
236 result ends up being unused, dead-code elimination could remove dead assignments | |
237 and incorrectly expose the return value register to a register allocation | |
238 assignment spanning the call, which would be incorrect. | |
239 | |
240 Instructions producing multiple values | |
241 -------------------------------------- | |
242 | |
243 ICE instructions allow at most one destination ``Variable``. Some machine | |
244 instructions produce more than one usable result. For example, the x86-32 | |
245 ``call`` ABI returns a 64-bit integer result in the ``edx:eax`` register pair. | |
246 Also, x86-32 has a version of the ``imul`` instruction that produces a 64-bit | |
247 result in the ``edx:eax`` register pair. | |
248 | |
249 To support multi-dest instructions, ICE provides the ``InstFakeDef`` | |
250 pseudo-instruction, whose destination can be precolored to the appropriate | |
251 physical register. For example, a ``call`` returning a 64-bit result in | |
252 ``edx:eax``:: | |
253 | |
254 CallInst = InstX8632Call::create(Func, RegLow, ... ); | |
255 ... | |
256 NewInst = InstFakeKill::create(Func, KilledRegs, CallInst); | |
257 Variable *RegHigh = Func->makeVariable(IceType_i32); | |
258 RegHigh->setRegNum(Reg_edx); | |
259 NewInst = InstFakeDef::create(Func, RegHigh); | |
260 | |
261 ``RegHigh`` is then assigned into the desired ``Variable``. If that assignment | |
262 ends up being dead-code eliminated, the ``InstFakeDef`` instruction may be | |
263 eliminated as well. | |
264 | |
265 Preventing dead-code elimination | |
266 -------------------------------- | |
267 | |
268 ICE instructions with a non-NULL ``Dest`` are subject to dead-code elimination. | |
269 However, some instructions must not be eliminated in order to preserve side | |
270 effects. This applies to most function calls, volatile loads, and loads and | |
271 integer divisions where the underlying language and runtime are relying on | |
272 hardware exception handling. | |
273 | |
274 ICE facilitates this with the ``InstFakeUse`` pseudo-instruction. This forces a | |
275 use of its source ``Variable`` to keep that variable's definition alive. Since | |
276 the ``InstFakeUse`` instruction has no ``Dest``, it will not be eliminated. | |
277 | |
278 Here is the full example of the x86-32 ``call`` returning a 32-bit integer | |
279 result:: | |
280 | |
281 Variable *Reg = Func->makeVariable(IceType_i32); | |
282 Reg->setRegNum(Reg_eax); | |
283 CallInst = InstX8632Call::create(Func, Reg, ... ); | |
284 VarList KilledRegs; | |
285 Variable *eax = Func->getTarget()->getPhysicalRegister(Reg_eax); | |
286 Variable *ecx = Func->getTarget()->getPhysicalRegister(Reg_ecx); | |
287 Variable *edx = Func->getTarget()->getPhysicalRegister(Reg_edx); | |
288 KilledRegs.push_back(eax); | |
289 KilledRegs.push_back(ecx); | |
290 KilledRegs.push_back(edx); | |
291 NewInst = InstFakeKill::create(Func, KilledRegs, CallInst); | |
292 NewInst = InstFakeUse::create(Func, Reg); | |
293 NewInst = InstX8632Mov::create(Func, Result, Reg); | |
294 | |
295 Without the ``InstFakeUse``, the entire call sequence could be dead-code | |
296 eliminated if its result were unused. | |
297 | |
298 One more note on this topic. These tools can be used to allow a multi-dest | |
299 instruction to be dead-code eliminated only when none of its results is live. | |
300 The key is to use the optional source parameter of the ``InstFakeDef`` | |
301 instruction. Using pseudocode:: | |
302 | |
303 t1:eax = call foo(arg1, ...) | |
304 InstFakeKill(eax, ecx, edx) | |
305 t2:edx = InstFakeDef(t1) | |
306 v_result_low = t1 | |
307 v_result_high = t2 | |
308 | |
309 If ``v_result_high`` is live but ``v_result_low`` is dead, adding ``t1`` as an | |
310 argument to ``InstFakeDef`` suffices to keep the ``call`` instruction live. | |
OLD | NEW |