Index: LOWERING.rst |
diff --git a/LOWERING.rst b/LOWERING.rst |
index 251e25cefab89692266ba9a4fa3f79a8eabd48d4..d51cd467b901e15e1d1eea836c62ec4ef8ef98dd 100644 |
--- a/LOWERING.rst |
+++ b/LOWERING.rst |
@@ -18,7 +18,7 @@ happens after target-specific lowering, so during lowering we generally don't |
know whether a ``Variable`` operand will meet a target instruction's physical |
register requirement. |
-To this end, ICE allows certain hints/directives: |
+To this end, ICE allows certain directives: |
* ``Variable::setWeightInfinite()`` forces a ``Variable`` to get some |
physical register (without specifying which particular one) from a |
@@ -27,18 +27,15 @@ To this end, ICE allows certain hints/directives: |
* ``Variable::setRegNum()`` forces a ``Variable`` to be assigned a specific |
physical register. |
- * ``Variable::setPreferredRegister()`` registers a preference for a physical |
- register based on another ``Variable``'s physical register assignment. |
- |
-These hints/directives are described below in more detail. In most cases, |
-though, they don't need to be explicity used, as the routines that create |
-lowered instructions have reasonable defaults and simple options that control |
-these hints/directives. |
+These directives are described below in more detail. In most cases, though, |
+they don't need to be explicitly used, as the routines that create lowered |
+instructions have reasonable defaults and simple options that control these |
+directives. |
The recommended ICE lowering strategy is to generate extra assignment |
-instructions involving extra ``Variable`` temporaries, using the |
-hints/directives to force suitable register assignments for the temporaries, and |
-then let the global register allocator clean things up. |
+instructions involving extra ``Variable`` temporaries, using the directives to |
+force suitable register assignments for the temporaries, and then let the |
+register allocator clean things up. |
Note: There is a spectrum of *implementation complexity* versus *translation |
speed* versus *code quality*. This recommended strategy picks a point on the |
@@ -47,8 +44,8 @@ quality in terms of frame size and register shuffling/spilling, but perhaps not |
the fastest translation speed since extra instructions and operands are created |
up front and cleaned up at the end. |
-Ensuring some physical register |
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
+Ensuring a non-specific physical register |
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
The x86 instruction:: |
@@ -71,71 +68,31 @@ low-level lowering code that accomplishes this looks something like:: |
``Cfg::makeVariable()`` generates a new temporary, and |
``Variable::setWeightInfinite()`` gives it infinite weight for the purpose of |
-register allocation, thus guaranteeing it a physical register. |
+register allocation, thus guaranteeing it a physical register (though leaving |
+the particular physical register to be determined by the register allocator). |
The ``_mov(Dest, Src)`` method in the ``TargetX8632`` class is sufficiently |
powerful to handle these details in most situations. Its ``Dest`` argument is |
-an in/out parameter. If its input value is ``NULL``, then a new temporary |
+an in/out parameter. If its input value is ``nullptr``, then a new temporary |
variable is created, its type is set to the same type as the ``Src`` operand, it |
is given infinite register weight, and the new ``Variable`` is returned through |
the in/out parameter. (This is in addition to the new temporary being the dest |
operand of the ``mov`` instruction.) The simpler version of the above example |
is:: |
- Variable *Reg = NULL; |
+ Variable *Reg = nullptr; |
_mov(Reg, Src); |
_mov(Dst, Reg); |
Preferring another ``Variable``'s physical register |
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
-One problem with this example is that the register allocator usually just |
-assigns the first available register to a live range. If this instruction ends |
-the live range of ``src``, this may lead to code like the following:: |
- |
- mov reg:eax, src:esi |
- mov dst:edi, reg:eax |
- |
-Since the first instruction happens to end the live range of ``src:esi``, it |
-would be better to assign ``esi`` to ``reg``:: |
- |
- mov reg:esi, src:esi |
- mov dst:edi, reg:esi |
- |
-The first instruction, ``mov esi, esi``, is a redundant assignment and will |
-ultimately be elided, leaving just ``mov edi, esi``. |
- |
-We can tell the register allocator to prefer the register assigned to a |
-different ``Variable``, using ``Variable::setPreferredRegister()``:: |
- |
- Variable *Reg; |
- Reg = Func->makeVariable(Dst->getType()); |
- Reg->setWeightInfinite(); |
- Reg->setPreferredRegister(Src); |
- NewInst = InstX8632Mov::create(Func, Reg, Src); |
- NewInst = InstX8632Mov::create(Func, Dst, Reg); |
- |
-Or more simply:: |
- |
- Variable *Reg = NULL; |
- _mov(Reg, Src); |
- _mov(Dst, Reg); |
- Reg->setPreferredRegister(llvm::dyn_cast<Variable>(Src)); |
- |
-The usefulness of ``setPreferredRegister()`` is tied into the implementation of |
-the register allocator. ICE uses linear-scan register allocation, which sorts |
-live ranges by starting point and assigns registers in that order. Using |
-``B->setPreferredRegister(A)`` only helps when ``A`` has already been assigned a |
-register by the time ``B`` is being considered. For an assignment ``B=A``, this |
-is usually a safe assumption because ``B``'s live range begins at this |
-instruction but ``A``'s live range must have started earlier. (There may be |
-exceptions for variables that are no longer in SSA form.) But |
-``A->setPreferredRegister(B)`` is unlikely to help unless ``B`` has been |
-precolored. In summary, generally the best practice is to use a pattern like:: |
- |
- NewInst = InstX8632Mov::create(Func, Dst, Src); |
- Dst->setPreferredRegister(Src); |
- //Src->setPreferredRegister(Dst); -- unlikely to have any effect |
+(An older version of ICE allowed the lowering code to provide a register |
+allocation hint: if a physical register is to be assigned to one ``Variable``, |
+then prefer a particular ``Variable``'s physical register if available. This |
+hint would be used to try to reduce the amount of register shuffling. |
+Currently, the register allocator does this automatically through the |
+``FindPreference`` logic.) |
Ensuring a specific physical register |
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
@@ -159,83 +116,42 @@ strongly. |
The ``_mov(Dest, Src, RegNum)`` method in the ``TargetX8632`` class has an |
optional ``RegNum`` argument to force a specific register assignment when the |
-input ``Dest`` is ``NULL``. As described above, passing in ``Dest=NULL`` causes |
-a new temporary variable to be created with infinite register weight, and in |
-addition the specific register is chosen. The simpler version of the above |
+input ``Dest`` is ``nullptr``. As described above, passing in ``Dest=nullptr`` |
+causes a new temporary variable to be created with infinite register weight, and |
+in addition the specific register is chosen. The simpler version of the above |
example is:: |
- Variable *Reg = NULL; |
+ Variable *Reg = nullptr; |
_mov(Reg, Src, Reg_eax); |
_ret(Reg); |
Disabling live-range interference |
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
-Another problem with the "``mov reg,src; mov dst,reg``" example happens when |
-the instructions do *not* end the live range of ``src``. In this case, the live |
-ranges of ``reg`` and ``src`` interfere, so they can't get the same physical |
-register despite the explicit preference. However, ``reg`` is meant to be an |
-alias of ``src`` so they needn't be considered to interfere with each other. |
-This can be expressed via the second (bool) argument of |
-``setPreferredRegister()``:: |
+(An older version of ICE allowed an overly strong preference for another |
+``Variable``'s physical register even if their live ranges interfered. This was |
+risky, and currently the register allocator derives this automatically through |
+the ``AllowOverlap`` logic.) |
- Variable *Reg; |
- Reg = Func->makeVariable(Dst->getType()); |
- Reg->setWeightInfinite(); |
- Reg->setPreferredRegister(Src, true); |
- NewInst = InstX8632Mov::create(Func, Reg, Src); |
- NewInst = InstX8632Mov::create(Func, Dst, Reg); |
+Call instructions kill scratch registers |
+---------------------------------------- |
-This should be used with caution and probably only for these short-live-range |
-temporaries, otherwise the classic "lost copy" or "lost swap" problem may be |
-encountered. |
- |
-Instructions with register side effects |
---------------------------------------- |
- |
-Some instructions produce unwanted results in other registers, or otherwise kill |
-preexisting values in other registers. For example, a ``call`` kills the |
-scratch registers. Also, the x86-32 ``idiv`` instruction produces the quotient |
-in ``eax`` and the remainder in ``edx``, but generally only one of those is |
-needed in the lowering. It's important that the register allocator doesn't |
-allocate that register to a live range that spans the instruction. |
- |
-ICE provides the ``InstFakeKill`` pseudo-instruction to mark such register |
-kills. For each of the instruction's source variables, a fake trivial live |
-range is created that begins and ends in that instruction. The ``InstFakeKill`` |
-instruction is inserted after the ``call`` instruction. For example:: |
+A ``call`` instruction kills the values in all scratch registers, so it's |
+important that the register allocator doesn't allocate a scratch register to a |
+``Variable`` whose live range spans the ``call`` instruction. ICE provides the |
+``InstFakeKill`` pseudo-instruction to compactly mark such register kills. For |
+each scratch register, a fake trivial live range is created that begins and ends |
+in that instruction. The ``InstFakeKill`` instruction is inserted after the |
+``call`` instruction. For example:: |
CallInst = InstX8632Call::create(Func, ... ); |
- VarList KilledRegs; |
- KilledRegs.push_back(eax); |
- KilledRegs.push_back(ecx); |
- KilledRegs.push_back(edx); |
- NewInst = InstFakeKill::create(Func, KilledRegs, CallInst); |
+ NewInst = InstFakeKill::create(Func, CallInst); |
The last argument to the ``InstFakeKill`` constructor links it to the previous |
call instruction, such that if its linked instruction is dead-code eliminated, |
-the ``InstFakeKill`` instruction is eliminated as well. |
- |
-The killed register arguments need to be assigned a physical register via |
-``Variable::setRegNum()`` for this to be effective. To avoid a massive |
-proliferation of ``Variable`` temporaries, the ``TargetLowering`` object caches |
-one precolored ``Variable`` for each physical register:: |
- |
- CallInst = InstX8632Call::create(Func, ... ); |
- VarList KilledRegs; |
- Variable *eax = Func->getTarget()->getPhysicalRegister(Reg_eax); |
- Variable *ecx = Func->getTarget()->getPhysicalRegister(Reg_ecx); |
- Variable *edx = Func->getTarget()->getPhysicalRegister(Reg_edx); |
- KilledRegs.push_back(eax); |
- KilledRegs.push_back(ecx); |
- KilledRegs.push_back(edx); |
- NewInst = InstFakeKill::create(Func, KilledRegs, CallInst); |
- |
-On first glance, it may seem unnecessary to explicitly kill the register that |
-returns the ``call`` return value. However, if for some reason the ``call`` |
-result ends up being unused, dead-code elimination could remove dead assignments |
-and incorrectly expose the return value register to a register allocation |
-assignment spanning the call, which would be incorrect. |
+the ``InstFakeKill`` instruction is eliminated as well. This matters when the |
+linked ``call`` targets a function known to be free of side effects, since such |
+a call is safe to remove if its result is unused. |
Instructions producing multiple values |
-------------------------------------- |
@@ -244,7 +160,9 @@ ICE instructions allow at most one destination ``Variable``. Some machine |
instructions produce more than one usable result. For example, the x86-32 |
``call`` ABI returns a 64-bit integer result in the ``edx:eax`` register pair. |
Also, x86-32 has a version of the ``imul`` instruction that produces a 64-bit |
-result in the ``edx:eax`` register pair. |
+result in the ``edx:eax`` register pair. The x86-32 ``idiv`` instruction |
+produces the quotient in ``eax`` and the remainder in ``edx``, though generally |
+only one or the other is needed in the lowering. |
To support multi-dest instructions, ICE provides the ``InstFakeDef`` |
pseudo-instruction, whose destination can be precolored to the appropriate |
@@ -252,8 +170,7 @@ physical register. For example, a ``call`` returning a 64-bit result in |
``edx:eax``:: |
CallInst = InstX8632Call::create(Func, RegLow, ... ); |
- ... |
- NewInst = InstFakeKill::create(Func, KilledRegs, CallInst); |
+ NewInst = InstFakeKill::create(Func, CallInst); |
Variable *RegHigh = Func->makeVariable(IceType_i32); |
RegHigh->setRegNum(Reg_edx); |
NewInst = InstFakeDef::create(Func, RegHigh); |
@@ -262,14 +179,14 @@ physical register. For example, a ``call`` returning a 64-bit result in |
ends up being dead-code eliminated, the ``InstFakeDef`` instruction may be |
eliminated as well. |
-Preventing dead-code elimination |
--------------------------------- |
+Managing dead-code elimination |
+------------------------------ |
-ICE instructions with a non-NULL ``Dest`` are subject to dead-code elimination. |
-However, some instructions must not be eliminated in order to preserve side |
-effects. This applies to most function calls, volatile loads, and loads and |
-integer divisions where the underlying language and runtime are relying on |
-hardware exception handling. |
+ICE instructions with a non-nullptr ``Dest`` are subject to dead-code |
+elimination. However, some instructions must not be eliminated in order to |
+preserve side effects. This applies to most function calls, volatile loads, and |
+loads and integer divisions where the underlying language and runtime are |
+relying on hardware exception handling. |
ICE facilitates this with the ``InstFakeUse`` pseudo-instruction. This forces a |
use of its source ``Variable`` to keep that variable's definition alive. Since |
@@ -281,14 +198,7 @@ result:: |
Variable *Reg = Func->makeVariable(IceType_i32); |
Reg->setRegNum(Reg_eax); |
CallInst = InstX8632Call::create(Func, Reg, ... ); |
- VarList KilledRegs; |
- Variable *eax = Func->getTarget()->getPhysicalRegister(Reg_eax); |
- Variable *ecx = Func->getTarget()->getPhysicalRegister(Reg_ecx); |
- Variable *edx = Func->getTarget()->getPhysicalRegister(Reg_edx); |
- KilledRegs.push_back(eax); |
- KilledRegs.push_back(ecx); |
- KilledRegs.push_back(edx); |
- NewInst = InstFakeKill::create(Func, KilledRegs, CallInst); |
+ NewInst = InstFakeKill::create(Func, CallInst); |
NewInst = InstFakeUse::create(Func, Reg); |
NewInst = InstX8632Mov::create(Func, Result, Reg); |
@@ -301,7 +211,7 @@ The key is to use the optional source parameter of the ``InstFakeDef`` |
instruction. Using pseudocode:: |
t1:eax = call foo(arg1, ...) |
- InstFakeKill(eax, ecx, edx) |
+ InstFakeKill // eax, ecx, edx |
t2:edx = InstFakeDef(t1) |
v_result_low = t1 |
v_result_high = t2 |