Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(242)

Side by Side Diff: LOWERING.rst

Issue 846763002: Subzero: Update the lowering documentation. (Closed) Base URL: https://chromium.googlesource.com/native_client/pnacl-subzero.git@master
Patch Set: Updates from code review Created 5 years, 11 months ago
Use n/p to move between diff chunks; N/P to move between comments. Draft comments are only viewable by you.
Jump to:
View unified diff | Download patch
« no previous file with comments | « no previous file | no next file » | no next file with comments »
Toggle Intra-line Diffs ('i') | Expand Comments ('e') | Collapse Comments ('c') | Show Comments Hide Comments ('s')
OLDNEW
1 Target-specific lowering in ICE 1 Target-specific lowering in ICE
2 =============================== 2 ===============================
3 3
4 This document discusses several issues around generating target-specific ICE 4 This document discusses several issues around generating target-specific ICE
5 instructions from high-level ICE instructions. 5 instructions from high-level ICE instructions.
6 6
7 Meeting register address mode constraints 7 Meeting register address mode constraints
8 ----------------------------------------- 8 -----------------------------------------
9 9
10 Target-specific instructions often require specific operands to be in physical 10 Target-specific instructions often require specific operands to be in physical
11 registers. Sometimes one specific register is required, but usually any 11 registers. Sometimes one specific register is required, but usually any
12 register in a particular register class will suffice, and that register class is 12 register in a particular register class will suffice, and that register class is
13 defined by the instruction/operand type. 13 defined by the instruction/operand type.
14 14
15 The challenge is that ``Variable`` represents an operand that is either a stack 15 The challenge is that ``Variable`` represents an operand that is either a stack
16 location in the current frame, or a physical register. Register allocation 16 location in the current frame, or a physical register. Register allocation
17 happens after target-specific lowering, so during lowering we generally don't 17 happens after target-specific lowering, so during lowering we generally don't
18 know whether a ``Variable`` operand will meet a target instruction's physical 18 know whether a ``Variable`` operand will meet a target instruction's physical
19 register requirement. 19 register requirement.
20 20
21 To this end, ICE allows certain hints/directives: 21 To this end, ICE allows certain directives:
22 22
23 * ``Variable::setWeightInfinite()`` forces a ``Variable`` to get some 23 * ``Variable::setWeightInfinite()`` forces a ``Variable`` to get some
24 physical register (without specifying which particular one) from a 24 physical register (without specifying which particular one) from a
25 register class. 25 register class.
26 26
27 * ``Variable::setRegNum()`` forces a ``Variable`` to be assigned a specific 27 * ``Variable::setRegNum()`` forces a ``Variable`` to be assigned a specific
28 physical register. 28 physical register.
29 29
30 * ``Variable::setPreferredRegister()`` registers a preference for a physical 30 These directives are described below in more detail. In most cases, though,
31 register based on another ``Variable``'s physical register assignment. 31 they don't need to be explicity used, as the routines that create lowered
32 32 instructions have reasonable defaults and simple options that control these
33 These hints/directives are described below in more detail. In most cases, 33 directives.
34 though, they don't need to be explicity used, as the routines that create
35 lowered instructions have reasonable defaults and simple options that control
36 these hints/directives.
37 34
38 The recommended ICE lowering strategy is to generate extra assignment 35 The recommended ICE lowering strategy is to generate extra assignment
39 instructions involving extra ``Variable`` temporaries, using the 36 instructions involving extra ``Variable`` temporaries, using the directives to
40 hints/directives to force suitable register assignments for the temporaries, and 37 force suitable register assignments for the temporaries, and then let the
41 then let the global register allocator clean things up. 38 register allocator clean things up.
42 39
43 Note: There is a spectrum of *implementation complexity* versus *translation 40 Note: There is a spectrum of *implementation complexity* versus *translation
44 speed* versus *code quality*. This recommended strategy picks a point on the 41 speed* versus *code quality*. This recommended strategy picks a point on the
45 spectrum representing very low complexity ("splat-isel"), pretty good code 42 spectrum representing very low complexity ("splat-isel"), pretty good code
46 quality in terms of frame size and register shuffling/spilling, but perhaps not 43 quality in terms of frame size and register shuffling/spilling, but perhaps not
47 the fastest translation speed since extra instructions and operands are created 44 the fastest translation speed since extra instructions and operands are created
48 up front and cleaned up at the end. 45 up front and cleaned up at the end.
49 46
50 Ensuring some physical register 47 Ensuring a non-specific physical register
51 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 48 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
52 49
53 The x86 instruction:: 50 The x86 instruction::
54 51
55 mov dst, src 52 mov dst, src
56 53
57 needs at least one of its operands in a physical register (ignoring the case 54 needs at least one of its operands in a physical register (ignoring the case
58 where ``src`` is a constant). This can be done as follows:: 55 where ``src`` is a constant). This can be done as follows::
59 56
60 mov reg, src 57 mov reg, src
61 mov dst, reg 58 mov dst, reg
62 59
63 so long as ``reg`` is guaranteed to have a physical register assignment. The 60 so long as ``reg`` is guaranteed to have a physical register assignment. The
64 low-level lowering code that accomplishes this looks something like:: 61 low-level lowering code that accomplishes this looks something like::
65 62
66 Variable *Reg; 63 Variable *Reg;
67 Reg = Func->makeVariable(Dst->getType()); 64 Reg = Func->makeVariable(Dst->getType());
68 Reg->setWeightInfinite(); 65 Reg->setWeightInfinite();
69 NewInst = InstX8632Mov::create(Func, Reg, Src); 66 NewInst = InstX8632Mov::create(Func, Reg, Src);
70 NewInst = InstX8632Mov::create(Func, Dst, Reg); 67 NewInst = InstX8632Mov::create(Func, Dst, Reg);
71 68
72 ``Cfg::makeVariable()`` generates a new temporary, and 69 ``Cfg::makeVariable()`` generates a new temporary, and
73 ``Variable::setWeightInfinite()`` gives it infinite weight for the purpose of 70 ``Variable::setWeightInfinite()`` gives it infinite weight for the purpose of
74 register allocation, thus guaranteeing it a physical register. 71 register allocation, thus guaranteeing it a physical register (though leaving
72 the particular physical register to be determined by the register allocator).
75 73
76 The ``_mov(Dest, Src)`` method in the ``TargetX8632`` class is sufficiently 74 The ``_mov(Dest, Src)`` method in the ``TargetX8632`` class is sufficiently
77 powerful to handle these details in most situations. Its ``Dest`` argument is 75 powerful to handle these details in most situations. Its ``Dest`` argument is
78 an in/out parameter. If its input value is ``NULL``, then a new temporary 76 an in/out parameter. If its input value is ``nullptr``, then a new temporary
79 variable is created, its type is set to the same type as the ``Src`` operand, it 77 variable is created, its type is set to the same type as the ``Src`` operand, it
80 is given infinite register weight, and the new ``Variable`` is returned through 78 is given infinite register weight, and the new ``Variable`` is returned through
81 the in/out parameter. (This is in addition to the new temporary being the dest 79 the in/out parameter. (This is in addition to the new temporary being the dest
82 operand of the ``mov`` instruction.) The simpler version of the above example 80 operand of the ``mov`` instruction.) The simpler version of the above example
83 is:: 81 is::
84 82
85 Variable *Reg = NULL; 83 Variable *Reg = nullptr;
86 _mov(Reg, Src); 84 _mov(Reg, Src);
87 _mov(Dst, Reg); 85 _mov(Dst, Reg);
88 86
89 Preferring another ``Variable``'s physical register 87 Preferring another ``Variable``'s physical register
90 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 88 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
91 89
92 One problem with this example is that the register allocator usually just 90 (An older version of ICE allowed the lowering code to provide a register
93 assigns the first available register to a live range. If this instruction ends 91 allocation hint: if a physical register is to be assigned to one ``Variable``,
94 the live range of ``src``, this may lead to code like the following:: 92 then prefer a particular ``Variable``'s physical register if available. This
95 93 hint would be used to try to reduce the amount of register shuffling.
96 mov reg:eax, src:esi 94 Currently, the register allocator does this automatically through the
97 mov dst:edi, reg:eax 95 ``FindPreference`` logic.)
98
99 Since the first instruction happens to end the live range of ``src:esi``, it
100 would be better to assign ``esi`` to ``reg``::
101
102 mov reg:esi, src:esi
103 mov dst:edi, reg:esi
104
105 The first instruction, ``mov esi, esi``, is a redundant assignment and will
106 ultimately be elided, leaving just ``mov edi, esi``.
107
108 We can tell the register allocator to prefer the register assigned to a
109 different ``Variable``, using ``Variable::setPreferredRegister()``::
110
111 Variable *Reg;
112 Reg = Func->makeVariable(Dst->getType());
113 Reg->setWeightInfinite();
114 Reg->setPreferredRegister(Src);
115 NewInst = InstX8632Mov::create(Func, Reg, Src);
116 NewInst = InstX8632Mov::create(Func, Dst, Reg);
117
118 Or more simply::
119
120 Variable *Reg = NULL;
121 _mov(Reg, Src);
122 _mov(Dst, Reg);
123 Reg->setPreferredRegister(llvm::dyn_cast<Variable>(Src));
124
125 The usefulness of ``setPreferredRegister()`` is tied into the implementation of
126 the register allocator. ICE uses linear-scan register allocation, which sorts
127 live ranges by starting point and assigns registers in that order. Using
128 ``B->setPreferredRegister(A)`` only helps when ``A`` has already been assigned a
129 register by the time ``B`` is being considered. For an assignment ``B=A``, this
130 is usually a safe assumption because ``B``'s live range begins at this
131 instruction but ``A``'s live range must have started earlier. (There may be
132 exceptions for variables that are no longer in SSA form.) But
133 ``A->setPreferredRegister(B)`` is unlikely to help unless ``B`` has been
134 precolored. In summary, generally the best practice is to use a pattern like::
135
136 NewInst = InstX8632Mov::create(Func, Dst, Src);
137 Dst->setPreferredRegister(Src);
138 //Src->setPreferredRegister(Dst); -- unlikely to have any effect
139 96
140 Ensuring a specific physical register 97 Ensuring a specific physical register
141 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 98 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
142 99
143 Some instructions require operands in specific physical registers, or produce 100 Some instructions require operands in specific physical registers, or produce
144 results in specific physical registers. For example, the 32-bit ``ret`` 101 results in specific physical registers. For example, the 32-bit ``ret``
145 instruction needs its operand in ``eax``. This can be done with 102 instruction needs its operand in ``eax``. This can be done with
146 ``Variable::setRegNum()``:: 103 ``Variable::setRegNum()``::
147 104
148 Variable *Reg; 105 Variable *Reg;
149 Reg = Func->makeVariable(Src->getType()); 106 Reg = Func->makeVariable(Src->getType());
150 Reg->setWeightInfinite(); 107 Reg->setWeightInfinite();
151 Reg->setRegNum(Reg_eax); 108 Reg->setRegNum(Reg_eax);
152 NewInst = InstX8632Mov::create(Func, Reg, Src); 109 NewInst = InstX8632Mov::create(Func, Reg, Src);
153 NewInst = InstX8632Ret::create(Func, Reg); 110 NewInst = InstX8632Ret::create(Func, Reg);
154 111
155 Precoloring with ``Variable::setRegNum()`` effectively gives it infinite weight 112 Precoloring with ``Variable::setRegNum()`` effectively gives it infinite weight
156 for register allocation, so the call to ``Variable::setWeightInfinite()`` is 113 for register allocation, so the call to ``Variable::setWeightInfinite()`` is
157 technically unnecessary, but perhaps documents the intention a bit more 114 technically unnecessary, but perhaps documents the intention a bit more
158 strongly. 115 strongly.
159 116
160 The ``_mov(Dest, Src, RegNum)`` method in the ``TargetX8632`` class has an 117 The ``_mov(Dest, Src, RegNum)`` method in the ``TargetX8632`` class has an
161 optional ``RegNum`` argument to force a specific register assignment when the 118 optional ``RegNum`` argument to force a specific register assignment when the
162 input ``Dest`` is ``NULL``. As described above, passing in ``Dest=NULL`` causes 119 input ``Dest`` is ``nullptr``. As described above, passing in ``Dest=nullptr``
163 a new temporary variable to be created with infinite register weight, and in 120 causes a new temporary variable to be created with infinite register weight, and
164 addition the specific register is chosen. The simpler version of the above 121 in addition the specific register is chosen. The simpler version of the above
165 example is:: 122 example is::
166 123
167 Variable *Reg = NULL; 124 Variable *Reg = nullptr;
168 _mov(Reg, Src, Reg_eax); 125 _mov(Reg, Src, Reg_eax);
169 _ret(Reg); 126 _ret(Reg);
170 127
171 Disabling live-range interference 128 Disabling live-range interference
172 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 129 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
173 130
174 Another problem with the "``mov reg,src; mov dst,reg``" example happens when 131 (An older version of ICE allowed an overly strong preference for another
175 the instructions do *not* end the live range of ``src``. In this case, the live 132 ``Variable``'s physical register even if their live ranges interfered. This was
176 ranges of ``reg`` and ``src`` interfere, so they can't get the same physical 133 risky, and currently the register allocator derives this automatically through
177 register despite the explicit preference. However, ``reg`` is meant to be an 134 the ``AllowOverlap`` logic.)
178 alias of ``src`` so they needn't be considered to interfere with each other.
179 This can be expressed via the second (bool) argument of
180 ``setPreferredRegister()``::
181 135
182 Variable *Reg; 136 Call instructions kill scratch registers
183 Reg = Func->makeVariable(Dst->getType()); 137 ----------------------------------------
184 Reg->setWeightInfinite();
185 Reg->setPreferredRegister(Src, true);
186 NewInst = InstX8632Mov::create(Func, Reg, Src);
187 NewInst = InstX8632Mov::create(Func, Dst, Reg);
188 138
189 This should be used with caution and probably only for these short-live-range 139 A ``call`` instruction kills the values in all scratch registers, so it's
190 temporaries, otherwise the classic "lost copy" or "lost swap" problem may be 140 important that the register allocator doesn't allocate a scratch register to a
191 encountered. 141 ``Variable`` whose live range spans the ``call`` instruction. ICE provides the
192 142 ``InstFakeKill`` pseudo-instruction to compactly mark such register kills. For
193 Instructions with register side effects 143 each scratch register, a fake trivial live range is created that begins and ends
194 --------------------------------------- 144 in that instruction. The ``InstFakeKill`` instruction is inserted after the
195 145 ``call`` instruction. For example::
196 Some instructions produce unwanted results in other registers, or otherwise kill
197 preexisting values in other registers. For example, a ``call`` kills the
198 scratch registers. Also, the x86-32 ``idiv`` instruction produces the quotient
199 in ``eax`` and the remainder in ``edx``, but generally only one of those is
200 needed in the lowering. It's important that the register allocator doesn't
201 allocate that register to a live range that spans the instruction.
202
203 ICE provides the ``InstFakeKill`` pseudo-instruction to mark such register
204 kills. For each of the instruction's source variables, a fake trivial live
205 range is created that begins and ends in that instruction. The ``InstFakeKill``
206 instruction is inserted after the ``call`` instruction. For example::
207 146
208 CallInst = InstX8632Call::create(Func, ... ); 147 CallInst = InstX8632Call::create(Func, ... );
209 VarList KilledRegs; 148 NewInst = InstFakeKill::create(Func, CallInst);
210 KilledRegs.push_back(eax);
211 KilledRegs.push_back(ecx);
212 KilledRegs.push_back(edx);
213 NewInst = InstFakeKill::create(Func, KilledRegs, CallInst);
214 149
215 The last argument to the ``InstFakeKill`` constructor links it to the previous 150 The last argument to the ``InstFakeKill`` constructor links it to the previous
216 call instruction, such that if its linked instruction is dead-code eliminated, 151 call instruction, such that if its linked instruction is dead-code eliminated,
217 the ``InstFakeKill`` instruction is eliminated as well. 152 the ``InstFakeKill`` instruction is eliminated as well. The linked ``call``
218 153 instruction could be to a target known to be free of side effects, and therefore
219 The killed register arguments need to be assigned a physical register via 154 safe to remove if its result is unused.
220 ``Variable::setRegNum()`` for this to be effective. To avoid a massive
221 proliferation of ``Variable`` temporaries, the ``TargetLowering`` object caches
222 one precolored ``Variable`` for each physical register::
223
224 CallInst = InstX8632Call::create(Func, ... );
225 VarList KilledRegs;
226 Variable *eax = Func->getTarget()->getPhysicalRegister(Reg_eax);
227 Variable *ecx = Func->getTarget()->getPhysicalRegister(Reg_ecx);
228 Variable *edx = Func->getTarget()->getPhysicalRegister(Reg_edx);
229 KilledRegs.push_back(eax);
230 KilledRegs.push_back(ecx);
231 KilledRegs.push_back(edx);
232 NewInst = InstFakeKill::create(Func, KilledRegs, CallInst);
233
234 On first glance, it may seem unnecessary to explicitly kill the register that
235 returns the ``call`` return value. However, if for some reason the ``call``
236 result ends up being unused, dead-code elimination could remove dead assignments
237 and incorrectly expose the return value register to a register allocation
238 assignment spanning the call, which would be incorrect.
239 155
240 Instructions producing multiple values 156 Instructions producing multiple values
241 -------------------------------------- 157 --------------------------------------
242 158
243 ICE instructions allow at most one destination ``Variable``. Some machine 159 ICE instructions allow at most one destination ``Variable``. Some machine
244 instructions produce more than one usable result. For example, the x86-32 160 instructions produce more than one usable result. For example, the x86-32
245 ``call`` ABI returns a 64-bit integer result in the ``edx:eax`` register pair. 161 ``call`` ABI returns a 64-bit integer result in the ``edx:eax`` register pair.
246 Also, x86-32 has a version of the ``imul`` instruction that produces a 64-bit 162 Also, x86-32 has a version of the ``imul`` instruction that produces a 64-bit
247 result in the ``edx:eax`` register pair. 163 result in the ``edx:eax`` register pair. The x86-32 ``idiv`` instruction
164 produces the quotient in ``eax`` and the remainder in ``edx``, though generally
165 only one or the other is needed in the lowering.
248 166
249 To support multi-dest instructions, ICE provides the ``InstFakeDef`` 167 To support multi-dest instructions, ICE provides the ``InstFakeDef``
250 pseudo-instruction, whose destination can be precolored to the appropriate 168 pseudo-instruction, whose destination can be precolored to the appropriate
251 physical register. For example, a ``call`` returning a 64-bit result in 169 physical register. For example, a ``call`` returning a 64-bit result in
252 ``edx:eax``:: 170 ``edx:eax``::
253 171
254 CallInst = InstX8632Call::create(Func, RegLow, ... ); 172 CallInst = InstX8632Call::create(Func, RegLow, ... );
255 ... 173 NewInst = InstFakeKill::create(Func, CallInst);
256 NewInst = InstFakeKill::create(Func, KilledRegs, CallInst);
257 Variable *RegHigh = Func->makeVariable(IceType_i32); 174 Variable *RegHigh = Func->makeVariable(IceType_i32);
258 RegHigh->setRegNum(Reg_edx); 175 RegHigh->setRegNum(Reg_edx);
259 NewInst = InstFakeDef::create(Func, RegHigh); 176 NewInst = InstFakeDef::create(Func, RegHigh);
260 177
261 ``RegHigh`` is then assigned into the desired ``Variable``. If that assignment 178 ``RegHigh`` is then assigned into the desired ``Variable``. If that assignment
262 ends up being dead-code eliminated, the ``InstFakeDef`` instruction may be 179 ends up being dead-code eliminated, the ``InstFakeDef`` instruction may be
263 eliminated as well. 180 eliminated as well.
264 181
265 Preventing dead-code elimination 182 Managing dead-code elimination
266 -------------------------------- 183 ------------------------------
267 184
268 ICE instructions with a non-NULL ``Dest`` are subject to dead-code elimination. 185 ICE instructions with a non-nullptr ``Dest`` are subject to dead-code
269 However, some instructions must not be eliminated in order to preserve side 186 elimination. However, some instructions must not be eliminated in order to
270 effects. This applies to most function calls, volatile loads, and loads and 187 preserve side effects. This applies to most function calls, volatile loads, and
271 integer divisions where the underlying language and runtime are relying on 188 loads and integer divisions where the underlying language and runtime are
272 hardware exception handling. 189 relying on hardware exception handling.
273 190
274 ICE facilitates this with the ``InstFakeUse`` pseudo-instruction. This forces a 191 ICE facilitates this with the ``InstFakeUse`` pseudo-instruction. This forces a
275 use of its source ``Variable`` to keep that variable's definition alive. Since 192 use of its source ``Variable`` to keep that variable's definition alive. Since
276 the ``InstFakeUse`` instruction has no ``Dest``, it will not be eliminated. 193 the ``InstFakeUse`` instruction has no ``Dest``, it will not be eliminated.
277 194
278 Here is the full example of the x86-32 ``call`` returning a 32-bit integer 195 Here is the full example of the x86-32 ``call`` returning a 32-bit integer
279 result:: 196 result::
280 197
281 Variable *Reg = Func->makeVariable(IceType_i32); 198 Variable *Reg = Func->makeVariable(IceType_i32);
282 Reg->setRegNum(Reg_eax); 199 Reg->setRegNum(Reg_eax);
283 CallInst = InstX8632Call::create(Func, Reg, ... ); 200 CallInst = InstX8632Call::create(Func, Reg, ... );
284 VarList KilledRegs; 201 NewInst = InstFakeKill::create(Func, CallInst);
285 Variable *eax = Func->getTarget()->getPhysicalRegister(Reg_eax);
286 Variable *ecx = Func->getTarget()->getPhysicalRegister(Reg_ecx);
287 Variable *edx = Func->getTarget()->getPhysicalRegister(Reg_edx);
288 KilledRegs.push_back(eax);
289 KilledRegs.push_back(ecx);
290 KilledRegs.push_back(edx);
291 NewInst = InstFakeKill::create(Func, KilledRegs, CallInst);
292 NewInst = InstFakeUse::create(Func, Reg); 202 NewInst = InstFakeUse::create(Func, Reg);
293 NewInst = InstX8632Mov::create(Func, Result, Reg); 203 NewInst = InstX8632Mov::create(Func, Result, Reg);
294 204
295 Without the ``InstFakeUse``, the entire call sequence could be dead-code 205 Without the ``InstFakeUse``, the entire call sequence could be dead-code
296 eliminated if its result were unused. 206 eliminated if its result were unused.
297 207
298 One more note on this topic. These tools can be used to allow a multi-dest 208 One more note on this topic. These tools can be used to allow a multi-dest
299 instruction to be dead-code eliminated only when none of its results is live. 209 instruction to be dead-code eliminated only when none of its results is live.
300 The key is to use the optional source parameter of the ``InstFakeDef`` 210 The key is to use the optional source parameter of the ``InstFakeDef``
301 instruction. Using pseudocode:: 211 instruction. Using pseudocode::
302 212
303 t1:eax = call foo(arg1, ...) 213 t1:eax = call foo(arg1, ...)
304 InstFakeKill(eax, ecx, edx) 214 InstFakeKill // eax, ecx, edx
305 t2:edx = InstFakeDef(t1) 215 t2:edx = InstFakeDef(t1)
306 v_result_low = t1 216 v_result_low = t1
307 v_result_high = t2 217 v_result_high = t2
308 218
309 If ``v_result_high`` is live but ``v_result_low`` is dead, adding ``t1`` as an 219 If ``v_result_high`` is live but ``v_result_low`` is dead, adding ``t1`` as an
310 argument to ``InstFakeDef`` suffices to keep the ``call`` instruction live. 220 argument to ``InstFakeDef`` suffices to keep the ``call`` instruction live.
OLDNEW
« no previous file with comments | « no previous file | no next file » | no next file with comments »

Powered by Google App Engine
This is Rietveld 408576698