| Index: LOWERING.rst
|
| diff --git a/LOWERING.rst b/LOWERING.rst
|
| index 251e25cefab89692266ba9a4fa3f79a8eabd48d4..d51cd467b901e15e1d1eea836c62ec4ef8ef98dd 100644
|
| --- a/LOWERING.rst
|
| +++ b/LOWERING.rst
|
| @@ -18,7 +18,7 @@ happens after target-specific lowering, so during lowering we generally don't
|
| know whether a ``Variable`` operand will meet a target instruction's physical
|
| register requirement.
|
|
|
| -To this end, ICE allows certain hints/directives:
|
| +To this end, ICE allows certain directives:
|
|
|
| * ``Variable::setWeightInfinite()`` forces a ``Variable`` to get some
|
| physical register (without specifying which particular one) from a
|
| @@ -27,18 +27,15 @@ To this end, ICE allows certain hints/directives:
|
| * ``Variable::setRegNum()`` forces a ``Variable`` to be assigned a specific
|
| physical register.
|
|
|
| - * ``Variable::setPreferredRegister()`` registers a preference for a physical
|
| - register based on another ``Variable``'s physical register assignment.
|
| -
|
| -These hints/directives are described below in more detail. In most cases,
|
| -though, they don't need to be explicity used, as the routines that create
|
| -lowered instructions have reasonable defaults and simple options that control
|
| -these hints/directives.
|
| +These directives are described below in more detail. In most cases, though,
|
| +they don't need to be explicitly used, as the routines that create lowered
|
| +instructions have reasonable defaults and simple options that control these
|
| +directives.
|
|
|
| The recommended ICE lowering strategy is to generate extra assignment
|
| -instructions involving extra ``Variable`` temporaries, using the
|
| -hints/directives to force suitable register assignments for the temporaries, and
|
| -then let the global register allocator clean things up.
|
| +instructions involving extra ``Variable`` temporaries, using the directives to
|
| +force suitable register assignments for the temporaries, and then let the
|
| +register allocator clean things up.
|
|
|
| Note: There is a spectrum of *implementation complexity* versus *translation
|
| speed* versus *code quality*. This recommended strategy picks a point on the
|
| @@ -47,8 +44,8 @@ quality in terms of frame size and register shuffling/spilling, but perhaps not
|
| the fastest translation speed since extra instructions and operands are created
|
| up front and cleaned up at the end.
|
|
|
| -Ensuring some physical register
|
| -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
| +Ensuring a non-specific physical register
|
| +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
| The x86 instruction::
|
|
|
| @@ -71,71 +68,31 @@ low-level lowering code that accomplishes this looks something like::
|
|
|
| ``Cfg::makeVariable()`` generates a new temporary, and
|
| ``Variable::setWeightInfinite()`` gives it infinite weight for the purpose of
|
| -register allocation, thus guaranteeing it a physical register.
|
| +register allocation, thus guaranteeing it a physical register (though leaving
|
| +the particular physical register to be determined by the register allocator).
|
|
|
| The ``_mov(Dest, Src)`` method in the ``TargetX8632`` class is sufficiently
|
| powerful to handle these details in most situations. Its ``Dest`` argument is
|
| -an in/out parameter. If its input value is ``NULL``, then a new temporary
|
| +an in/out parameter. If its input value is ``nullptr``, then a new temporary
|
| variable is created, its type is set to the same type as the ``Src`` operand, it
|
| is given infinite register weight, and the new ``Variable`` is returned through
|
| the in/out parameter. (This is in addition to the new temporary being the dest
|
| operand of the ``mov`` instruction.) The simpler version of the above example
|
| is::
|
|
|
| - Variable *Reg = NULL;
|
| + Variable *Reg = nullptr;
|
| _mov(Reg, Src);
|
| _mov(Dst, Reg);
|
|
|
| Preferring another ``Variable``'s physical register
|
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
| -One problem with this example is that the register allocator usually just
|
| -assigns the first available register to a live range. If this instruction ends
|
| -the live range of ``src``, this may lead to code like the following::
|
| -
|
| - mov reg:eax, src:esi
|
| - mov dst:edi, reg:eax
|
| -
|
| -Since the first instruction happens to end the live range of ``src:esi``, it
|
| -would be better to assign ``esi`` to ``reg``::
|
| -
|
| - mov reg:esi, src:esi
|
| - mov dst:edi, reg:esi
|
| -
|
| -The first instruction, ``mov esi, esi``, is a redundant assignment and will
|
| -ultimately be elided, leaving just ``mov edi, esi``.
|
| -
|
| -We can tell the register allocator to prefer the register assigned to a
|
| -different ``Variable``, using ``Variable::setPreferredRegister()``::
|
| -
|
| - Variable *Reg;
|
| - Reg = Func->makeVariable(Dst->getType());
|
| - Reg->setWeightInfinite();
|
| - Reg->setPreferredRegister(Src);
|
| - NewInst = InstX8632Mov::create(Func, Reg, Src);
|
| - NewInst = InstX8632Mov::create(Func, Dst, Reg);
|
| -
|
| -Or more simply::
|
| -
|
| - Variable *Reg = NULL;
|
| - _mov(Reg, Src);
|
| - _mov(Dst, Reg);
|
| - Reg->setPreferredRegister(llvm::dyn_cast<Variable>(Src));
|
| -
|
| -The usefulness of ``setPreferredRegister()`` is tied into the implementation of
|
| -the register allocator. ICE uses linear-scan register allocation, which sorts
|
| -live ranges by starting point and assigns registers in that order. Using
|
| -``B->setPreferredRegister(A)`` only helps when ``A`` has already been assigned a
|
| -register by the time ``B`` is being considered. For an assignment ``B=A``, this
|
| -is usually a safe assumption because ``B``'s live range begins at this
|
| -instruction but ``A``'s live range must have started earlier. (There may be
|
| -exceptions for variables that are no longer in SSA form.) But
|
| -``A->setPreferredRegister(B)`` is unlikely to help unless ``B`` has been
|
| -precolored. In summary, generally the best practice is to use a pattern like::
|
| -
|
| - NewInst = InstX8632Mov::create(Func, Dst, Src);
|
| - Dst->setPreferredRegister(Src);
|
| - //Src->setPreferredRegister(Dst); -- unlikely to have any effect
|
| +(An older version of ICE allowed the lowering code to provide a register
|
| +allocation hint: if a physical register is to be assigned to one ``Variable``,
|
| +then prefer a particular ``Variable``'s physical register if available. This
|
| +hint would be used to try to reduce the amount of register shuffling.
|
| +Currently, the register allocator does this automatically through the
|
| +``FindPreference`` logic.)
|
|
|
| Ensuring a specific physical register
|
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
| @@ -159,83 +116,42 @@ strongly.
|
|
|
| The ``_mov(Dest, Src, RegNum)`` method in the ``TargetX8632`` class has an
|
| optional ``RegNum`` argument to force a specific register assignment when the
|
| -input ``Dest`` is ``NULL``. As described above, passing in ``Dest=NULL`` causes
|
| -a new temporary variable to be created with infinite register weight, and in
|
| -addition the specific register is chosen. The simpler version of the above
|
| +input ``Dest`` is ``nullptr``. As described above, passing in ``Dest=nullptr``
|
| +causes a new temporary variable to be created with infinite register weight, and
|
| +in addition the specific register is chosen. The simpler version of the above
|
| example is::
|
|
|
| - Variable *Reg = NULL;
|
| + Variable *Reg = nullptr;
|
| _mov(Reg, Src, Reg_eax);
|
| _ret(Reg);
|
|
|
| Disabling live-range interference
|
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
| -Another problem with the "``mov reg,src; mov dst,reg``" example happens when
|
| -the instructions do *not* end the live range of ``src``. In this case, the live
|
| -ranges of ``reg`` and ``src`` interfere, so they can't get the same physical
|
| -register despite the explicit preference. However, ``reg`` is meant to be an
|
| -alias of ``src`` so they needn't be considered to interfere with each other.
|
| -This can be expressed via the second (bool) argument of
|
| -``setPreferredRegister()``::
|
| +(An older version of ICE allowed an overly strong preference for another
|
| +``Variable``'s physical register even if their live ranges interfered. This was
|
| +risky, and currently the register allocator derives this automatically through
|
| +the ``AllowOverlap`` logic.)
|
|
|
| - Variable *Reg;
|
| - Reg = Func->makeVariable(Dst->getType());
|
| - Reg->setWeightInfinite();
|
| - Reg->setPreferredRegister(Src, true);
|
| - NewInst = InstX8632Mov::create(Func, Reg, Src);
|
| - NewInst = InstX8632Mov::create(Func, Dst, Reg);
|
| +Call instructions kill scratch registers
|
| +----------------------------------------
|
|
|
| -This should be used with caution and probably only for these short-live-range
|
| -temporaries, otherwise the classic "lost copy" or "lost swap" problem may be
|
| -encountered.
|
| -
|
| -Instructions with register side effects
|
| ----------------------------------------
|
| -
|
| -Some instructions produce unwanted results in other registers, or otherwise kill
|
| -preexisting values in other registers. For example, a ``call`` kills the
|
| -scratch registers. Also, the x86-32 ``idiv`` instruction produces the quotient
|
| -in ``eax`` and the remainder in ``edx``, but generally only one of those is
|
| -needed in the lowering. It's important that the register allocator doesn't
|
| -allocate that register to a live range that spans the instruction.
|
| -
|
| -ICE provides the ``InstFakeKill`` pseudo-instruction to mark such register
|
| -kills. For each of the instruction's source variables, a fake trivial live
|
| -range is created that begins and ends in that instruction. The ``InstFakeKill``
|
| -instruction is inserted after the ``call`` instruction. For example::
|
| +A ``call`` instruction kills the values in all scratch registers, so it's
|
| +important that the register allocator doesn't allocate a scratch register to a
|
| +``Variable`` whose live range spans the ``call`` instruction. ICE provides the
|
| +``InstFakeKill`` pseudo-instruction to compactly mark such register kills. For
|
| +each scratch register, a fake trivial live range is created that begins and ends
|
| +in that instruction. The ``InstFakeKill`` instruction is inserted after the
|
| +``call`` instruction. For example::
|
|
|
| CallInst = InstX8632Call::create(Func, ... );
|
| - VarList KilledRegs;
|
| - KilledRegs.push_back(eax);
|
| - KilledRegs.push_back(ecx);
|
| - KilledRegs.push_back(edx);
|
| - NewInst = InstFakeKill::create(Func, KilledRegs, CallInst);
|
| + NewInst = InstFakeKill::create(Func, CallInst);
|
|
|
| The last argument to the ``InstFakeKill`` constructor links it to the previous
|
| call instruction, such that if its linked instruction is dead-code eliminated,
|
| -the ``InstFakeKill`` instruction is eliminated as well.
|
| -
|
| -The killed register arguments need to be assigned a physical register via
|
| -``Variable::setRegNum()`` for this to be effective. To avoid a massive
|
| -proliferation of ``Variable`` temporaries, the ``TargetLowering`` object caches
|
| -one precolored ``Variable`` for each physical register::
|
| -
|
| - CallInst = InstX8632Call::create(Func, ... );
|
| - VarList KilledRegs;
|
| - Variable *eax = Func->getTarget()->getPhysicalRegister(Reg_eax);
|
| - Variable *ecx = Func->getTarget()->getPhysicalRegister(Reg_ecx);
|
| - Variable *edx = Func->getTarget()->getPhysicalRegister(Reg_edx);
|
| - KilledRegs.push_back(eax);
|
| - KilledRegs.push_back(ecx);
|
| - KilledRegs.push_back(edx);
|
| - NewInst = InstFakeKill::create(Func, KilledRegs, CallInst);
|
| -
|
| -On first glance, it may seem unnecessary to explicitly kill the register that
|
| -returns the ``call`` return value. However, if for some reason the ``call``
|
| -result ends up being unused, dead-code elimination could remove dead assignments
|
| -and incorrectly expose the return value register to a register allocation
|
| -assignment spanning the call, which would be incorrect.
|
| +the ``InstFakeKill`` instruction is eliminated as well. The linked ``call``
|
| +instruction could be to a target known to be free of side effects, and therefore
|
| +safe to remove if its result is unused.
|
|
|
| Instructions producing multiple values
|
| --------------------------------------
|
| @@ -244,7 +160,9 @@ ICE instructions allow at most one destination ``Variable``. Some machine
|
| instructions produce more than one usable result. For example, the x86-32
|
| ``call`` ABI returns a 64-bit integer result in the ``edx:eax`` register pair.
|
| Also, x86-32 has a version of the ``imul`` instruction that produces a 64-bit
|
| -result in the ``edx:eax`` register pair.
|
| +result in the ``edx:eax`` register pair. The x86-32 ``idiv`` instruction
|
| +produces the quotient in ``eax`` and the remainder in ``edx``, though generally
|
| +only one or the other is needed in the lowering.
|
|
|
| To support multi-dest instructions, ICE provides the ``InstFakeDef``
|
| pseudo-instruction, whose destination can be precolored to the appropriate
|
| @@ -252,8 +170,7 @@ physical register. For example, a ``call`` returning a 64-bit result in
|
| ``edx:eax``::
|
|
|
| CallInst = InstX8632Call::create(Func, RegLow, ... );
|
| - ...
|
| - NewInst = InstFakeKill::create(Func, KilledRegs, CallInst);
|
| + NewInst = InstFakeKill::create(Func, CallInst);
|
| Variable *RegHigh = Func->makeVariable(IceType_i32);
|
| RegHigh->setRegNum(Reg_edx);
|
| NewInst = InstFakeDef::create(Func, RegHigh);
|
| @@ -262,14 +179,14 @@ physical register. For example, a ``call`` returning a 64-bit result in
|
| ends up being dead-code eliminated, the ``InstFakeDef`` instruction may be
|
| eliminated as well.
|
|
|
| -Preventing dead-code elimination
|
| ---------------------------------
|
| +Managing dead-code elimination
|
| +------------------------------
|
|
|
| -ICE instructions with a non-NULL ``Dest`` are subject to dead-code elimination.
|
| -However, some instructions must not be eliminated in order to preserve side
|
| -effects. This applies to most function calls, volatile loads, and loads and
|
| -integer divisions where the underlying language and runtime are relying on
|
| -hardware exception handling.
|
| +ICE instructions with a non-nullptr ``Dest`` are subject to dead-code
|
| +elimination. However, some instructions must not be eliminated in order to
|
| +preserve side effects. This applies to most function calls, volatile loads, and
|
| +loads and integer divisions where the underlying language and runtime are
|
| +relying on hardware exception handling.
|
|
|
| ICE facilitates this with the ``InstFakeUse`` pseudo-instruction. This forces a
|
| use of its source ``Variable`` to keep that variable's definition alive. Since
|
| @@ -281,14 +198,7 @@ result::
|
| Variable *Reg = Func->makeVariable(IceType_i32);
|
| Reg->setRegNum(Reg_eax);
|
| CallInst = InstX8632Call::create(Func, Reg, ... );
|
| - VarList KilledRegs;
|
| - Variable *eax = Func->getTarget()->getPhysicalRegister(Reg_eax);
|
| - Variable *ecx = Func->getTarget()->getPhysicalRegister(Reg_ecx);
|
| - Variable *edx = Func->getTarget()->getPhysicalRegister(Reg_edx);
|
| - KilledRegs.push_back(eax);
|
| - KilledRegs.push_back(ecx);
|
| - KilledRegs.push_back(edx);
|
| - NewInst = InstFakeKill::create(Func, KilledRegs, CallInst);
|
| + NewInst = InstFakeKill::create(Func, CallInst);
|
| NewInst = InstFakeUse::create(Func, Reg);
|
| NewInst = InstX8632Mov::create(Func, Result, Reg);
|
|
|
| @@ -301,7 +211,7 @@ The key is to use the optional source parameter of the ``InstFakeDef``
|
| instruction. Using pseudocode::
|
|
|
| t1:eax = call foo(arg1, ...)
|
| - InstFakeKill(eax, ecx, edx)
|
| + InstFakeKill // eax, ecx, edx
|
| t2:edx = InstFakeDef(t1)
|
| v_result_low = t1
|
| v_result_high = t2
|
|
|