LOWERING.rst - Issue 846763002: Subzero: Update the lowering documentation.

Unified Diff: LOWERING.rst

Issue 846763002: Subzero: Update the lowering documentation. (Closed) Base URL: https://chromium.googlesource.com/native_client/pnacl-subzero.git@master

Patch Set: Updates from code review Created 5 years, 11 months ago

Index: LOWERING.rst

diff --git a/LOWERING.rst b/LOWERING.rst

index 251e25cefab89692266ba9a4fa3f79a8eabd48d4..d51cd467b901e15e1d1eea836c62ec4ef8ef98dd 100644

--- a/LOWERING.rst

+++ b/LOWERING.rst

@@ -18,7 +18,7 @@ happens after target-specific lowering, so during lowering we generally don't

know whether a ``Variable`` operand will meet a target instruction's physical

-To this end, ICE allows certain hints/directives:

+To this end, ICE allows certain directives:

* ``Variable::setWeightInfinite()`` forces a ``Variable`` to get some

physical register (without specifying which particular one) from a

@@ -27,18 +27,15 @@ To this end, ICE allows certain hints/directives:

* ``Variable::setRegNum()`` forces a ``Variable`` to be assigned a specific

physical register.

- * ``Variable::setPreferredRegister()`` registers a preference for a physical

- register based on another ``Variable``'s physical register assignment.

-These hints/directives are described below in more detail. In most cases,

-though, they don't need to be explicity used, as the routines that create

-lowered instructions have reasonable defaults and simple options that control

-these hints/directives.

+These directives are described below in more detail. In most cases, though,

+they don't need to be explicity used, as the routines that create lowered

+instructions have reasonable defaults and simple options that control these

+directives.

The recommended ICE lowering strategy is to generate extra assignment

-instructions involving extra ``Variable`` temporaries, using the

-hints/directives to force suitable register assignments for the temporaries, and

-then let the global register allocator clean things up.

+instructions involving extra ``Variable`` temporaries, using the directives to

+force suitable register assignments for the temporaries, and then let the

+register allocator clean things up.

Note: There is a spectrum of *implementation complexity* versus *translation

speed* versus *code quality*. This recommended strategy picks a point on the

@@ -47,8 +44,8 @@ quality in terms of frame size and register shuffling/spilling, but perhaps not

the fastest translation speed since extra instructions and operands are created

up front and cleaned up at the end.

-Ensuring some physical register

-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

+Ensuring a non-specific physical register

+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The x86 instruction::

@@ -71,71 +68,31 @@ low-level lowering code that accomplishes this looks something like::

``Cfg::makeVariable()`` generates a new temporary, and

``Variable::setWeightInfinite()`` gives it infinite weight for the purpose of

-register allocation, thus guaranteeing it a physical register.

+register allocation, thus guaranteeing it a physical register (though leaving

+the particular physical register to be determined by the register allocator).

The ``_mov(Dest, Src)`` method in the ``TargetX8632`` class is sufficiently

powerful to handle these details in most situations. Its ``Dest`` argument is

-an in/out parameter. If its input value is ``NULL``, then a new temporary

+an in/out parameter. If its input value is ``nullptr``, then a new temporary

variable is created, its type is set to the same type as the ``Src`` operand, it

is given infinite register weight, and the new ``Variable`` is returned through

the in/out parameter. (This is in addition to the new temporary being the dest

operand of the ``mov`` instruction.) The simpler version of the above example

is::

- Variable *Reg = NULL;

+ Variable *Reg = nullptr;

_mov(Reg, Src);

_mov(Dst, Reg);

Preferring another ``Variable``'s physical register

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-One problem with this example is that the register allocator usually just

-assigns the first available register to a live range. If this instruction ends

-the live range of ``src``, this may lead to code like the following::

- mov reg:eax, src:esi

- mov dst:edi, reg:eax

-Since the first instruction happens to end the live range of ``src:esi``, it

-would be better to assign ``esi`` to ``reg``::

- mov reg:esi, src:esi

- mov dst:edi, reg:esi

-The first instruction, ``mov esi, esi``, is a redundant assignment and will

-ultimately be elided, leaving just ``mov edi, esi``.

-We can tell the register allocator to prefer the register assigned to a

-different ``Variable``, using ``Variable::setPreferredRegister()``::

- Variable *Reg;

- Reg = Func->makeVariable(Dst->getType());

- Reg->setWeightInfinite();

- Reg->setPreferredRegister(Src);

- NewInst = InstX8632Mov::create(Func, Reg, Src);

- NewInst = InstX8632Mov::create(Func, Dst, Reg);

-Or more simply::

- Variable *Reg = NULL;

- _mov(Reg, Src);

- _mov(Dst, Reg);

- Reg->setPreferredRegister(llvm::dyn_cast<Variable>(Src));

-The usefulness of ``setPreferredRegister()`` is tied into the implementation of

-the register allocator. ICE uses linear-scan register allocation, which sorts

-live ranges by starting point and assigns registers in that order. Using

-``B->setPreferredRegister(A)`` only helps when ``A`` has already been assigned a

-register by the time ``B`` is being considered. For an assignment ``B=A``, this

-is usually a safe assumption because ``B``'s live range begins at this

-instruction but ``A``'s live range must have started earlier. (There may be

-exceptions for variables that are no longer in SSA form.) But

-``A->setPreferredRegister(B)`` is unlikely to help unless ``B`` has been

-precolored. In summary, generally the best practice is to use a pattern like::

- NewInst = InstX8632Mov::create(Func, Dst, Src);

- Dst->setPreferredRegister(Src);

- //Src->setPreferredRegister(Dst); -- unlikely to have any effect

+(An older version of ICE allowed the lowering code to provide a register

+allocation hint: if a physical register is to be assigned to one ``Variable``,

+then prefer a particular ``Variable``'s physical register if available. This

+hint would be used to try to reduce the amount of register shuffling.

+Currently, the register allocator does this automatically through the

+``FindPreference`` logic.)

Ensuring a specific physical register

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@@ -159,83 +116,42 @@ strongly.

The ``_mov(Dest, Src, RegNum)`` method in the ``TargetX8632`` class has an

optional ``RegNum`` argument to force a specific register assignment when the

-input ``Dest`` is ``NULL``. As described above, passing in ``Dest=NULL`` causes

-a new temporary variable to be created with infinite register weight, and in

-addition the specific register is chosen. The simpler version of the above

+input ``Dest`` is ``nullptr``. As described above, passing in ``Dest=nullptr``

+causes a new temporary variable to be created with infinite register weight, and

+in addition the specific register is chosen. The simpler version of the above

example is::

- Variable *Reg = NULL;

+ Variable *Reg = nullptr;

_mov(Reg, Src, Reg_eax);

_ret(Reg);

Disabling live-range interference

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-Another problem with the "``mov reg,src; mov dst,reg``" example happens when

-the instructions do *not* end the live range of ``src``. In this case, the live

-ranges of ``reg`` and ``src`` interfere, so they can't get the same physical

-register despite the explicit preference. However, ``reg`` is meant to be an

-alias of ``src`` so they needn't be considered to interfere with each other.

-This can be expressed via the second (bool) argument of

-``setPreferredRegister()``::

+(An older version of ICE allowed an overly strong preference for another

+``Variable``'s physical register even if their live ranges interfered. This was

+risky, and currently the register allocator derives this automatically through

+the ``AllowOverlap`` logic.)

- Variable *Reg;

- Reg = Func->makeVariable(Dst->getType());

- Reg->setWeightInfinite();

- Reg->setPreferredRegister(Src, true);

- NewInst = InstX8632Mov::create(Func, Reg, Src);

- NewInst = InstX8632Mov::create(Func, Dst, Reg);

+Call instructions kill scratch registers

+----------------------------------------

-This should be used with caution and probably only for these short-live-range

-temporaries, otherwise the classic "lost copy" or "lost swap" problem may be

-encountered.

-Instructions with register side effects

----------------------------------------

-Some instructions produce unwanted results in other registers, or otherwise kill

-preexisting values in other registers. For example, a ``call`` kills the

-scratch registers. Also, the x86-32 ``idiv`` instruction produces the quotient

-in ``eax`` and the remainder in ``edx``, but generally only one of those is

-needed in the lowering. It's important that the register allocator doesn't

-allocate that register to a live range that spans the instruction.

-ICE provides the ``InstFakeKill`` pseudo-instruction to mark such register

-kills. For each of the instruction's source variables, a fake trivial live

-range is created that begins and ends in that instruction. The ``InstFakeKill``

-instruction is inserted after the ``call`` instruction. For example::

+A ``call`` instruction kills the values in all scratch registers, so it's

+important that the register allocator doesn't allocate a scratch register to a

+``Variable`` whose live range spans the ``call`` instruction. ICE provides the

+``InstFakeKill`` pseudo-instruction to compactly mark such register kills. For

+each scratch register, a fake trivial live range is created that begins and ends

+in that instruction. The ``InstFakeKill`` instruction is inserted after the

+``call`` instruction. For example::

CallInst = InstX8632Call::create(Func, ... );

- VarList KilledRegs;

- KilledRegs.push_back(eax);

- KilledRegs.push_back(ecx);

- KilledRegs.push_back(edx);

- NewInst = InstFakeKill::create(Func, KilledRegs, CallInst);

+ NewInst = InstFakeKill::create(Func, CallInst);

The last argument to the ``InstFakeKill`` constructor links it to the previous

call instruction, such that if its linked instruction is dead-code eliminated,

-the ``InstFakeKill`` instruction is eliminated as well.

-The killed register arguments need to be assigned a physical register via

-``Variable::setRegNum()`` for this to be effective. To avoid a massive

-proliferation of ``Variable`` temporaries, the ``TargetLowering`` object caches

-one precolored ``Variable`` for each physical register::

- CallInst = InstX8632Call::create(Func, ... );

- VarList KilledRegs;

- Variable *eax = Func->getTarget()->getPhysicalRegister(Reg_eax);

- Variable *ecx = Func->getTarget()->getPhysicalRegister(Reg_ecx);

- Variable *edx = Func->getTarget()->getPhysicalRegister(Reg_edx);

- KilledRegs.push_back(eax);

- KilledRegs.push_back(ecx);

- KilledRegs.push_back(edx);

- NewInst = InstFakeKill::create(Func, KilledRegs, CallInst);

-On first glance, it may seem unnecessary to explicitly kill the register that

-returns the ``call`` return value. However, if for some reason the ``call``

-result ends up being unused, dead-code elimination could remove dead assignments

-and incorrectly expose the return value register to a register allocation

-assignment spanning the call, which would be incorrect.

+the ``InstFakeKill`` instruction is eliminated as well. The linked ``call``

+instruction could be to a target known to be free of side effects, and therefore

+safe to remove if its result is unused.

Instructions producing multiple values

--------------------------------------

@@ -244,7 +160,9 @@ ICE instructions allow at most one destination ``Variable``. Some machine

instructions produce more than one usable result. For example, the x86-32

``call`` ABI returns a 64-bit integer result in the ``edx:eax`` register pair.

Also, x86-32 has a version of the ``imul`` instruction that produces a 64-bit

-result in the ``edx:eax`` register pair.

+result in the ``edx:eax`` register pair. The x86-32 ``idiv`` instruction

+produces the quotient in ``eax`` and the remainder in ``edx``, though generally

+only one or the other is needed in the lowering.

To support multi-dest instructions, ICE provides the ``InstFakeDef``

pseudo-instruction, whose destination can be precolored to the appropriate

@@ -252,8 +170,7 @@ physical register. For example, a ``call`` returning a 64-bit result in

``edx:eax``::

CallInst = InstX8632Call::create(Func, RegLow, ... );

- ...

- NewInst = InstFakeKill::create(Func, KilledRegs, CallInst);

+ NewInst = InstFakeKill::create(Func, CallInst);

Variable *RegHigh = Func->makeVariable(IceType_i32);

RegHigh->setRegNum(Reg_edx);

NewInst = InstFakeDef::create(Func, RegHigh);

@@ -262,14 +179,14 @@ physical register. For example, a ``call`` returning a 64-bit result in

ends up being dead-code eliminated, the ``InstFakeDef`` instruction may be

eliminated as well.

-Preventing dead-code elimination

---------------------------------

+Managing dead-code elimination

+------------------------------

-ICE instructions with a non-NULL ``Dest`` are subject to dead-code elimination.

-However, some instructions must not be eliminated in order to preserve side

-effects. This applies to most function calls, volatile loads, and loads and

-integer divisions where the underlying language and runtime are relying on

-hardware exception handling.

+ICE instructions with a non-nullptr ``Dest`` are subject to dead-code

+elimination. However, some instructions must not be eliminated in order to

+preserve side effects. This applies to most function calls, volatile loads, and

+loads and integer divisions where the underlying language and runtime are

+relying on hardware exception handling.

ICE facilitates this with the ``InstFakeUse`` pseudo-instruction. This forces a

use of its source ``Variable`` to keep that variable's definition alive. Since

@@ -281,14 +198,7 @@ result::

Variable *Reg = Func->makeVariable(IceType_i32);

Reg->setRegNum(Reg_eax);

CallInst = InstX8632Call::create(Func, Reg, ... );

- VarList KilledRegs;

- Variable *eax = Func->getTarget()->getPhysicalRegister(Reg_eax);

- Variable *ecx = Func->getTarget()->getPhysicalRegister(Reg_ecx);

- Variable *edx = Func->getTarget()->getPhysicalRegister(Reg_edx);

- KilledRegs.push_back(eax);

- KilledRegs.push_back(ecx);

- KilledRegs.push_back(edx);

- NewInst = InstFakeKill::create(Func, KilledRegs, CallInst);

+ NewInst = InstFakeKill::create(Func, CallInst);

NewInst = InstFakeUse::create(Func, Reg);

NewInst = InstX8632Mov::create(Func, Result, Reg);

@@ -301,7 +211,7 @@ The key is to use the optional source parameter of the ``InstFakeDef``

instruction. Using pseudocode::

t1:eax = call foo(arg1, ...)

- InstFakeKill(eax, ecx, edx)

+ InstFakeKill // eax, ecx, edx

t2:edx = InstFakeDef(t1)

v_result_low = t1

v_result_high = t2
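
The in/out ``Dest`` contract that the patched text describes for ``_mov(Dest, Src, RegNum)`` can be modeled with a small standalone sketch. This is only a toy under stated assumptions: ``ToyVariable``, ``ToyLowering``, ``ToyMov``, and ``NoRegister`` are hypothetical names invented for illustration and are not the Subzero classes, and the real ``TargetX8632::_mov`` may differ in signature and behavior. It shows just the documented rule: a ``nullptr`` ``Dest`` causes a new temporary to be created with the type of ``Src`` and infinite register weight, optionally pinned to a specific register, and returned through the parameter.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical sentinel meaning "no specific physical register requested".
constexpr int NoRegister = -1;

// Toy stand-in for a Variable; NOT the real Ice::Variable.
struct ToyVariable {
  int Type = 0;                // stand-in for the ICE type
  bool WeightInfinite = false; // true: must get some physical register
  int RegNum = NoRegister;     // a forced specific register, if any
};

// Toy stand-in for the lowering class; NOT the real TargetX8632.
class ToyLowering {
public:
  struct ToyMov {
    ToyVariable *Dest;
    ToyVariable *Src;
  };

  // Creates a variable owned by this object; the pointer stays valid
  // for the lifetime of the ToyLowering instance.
  ToyVariable *makeVariable(int Type) {
    Vars.push_back(std::make_unique<ToyVariable>());
    Vars.back()->Type = Type;
    return Vars.back().get();
  }

  // Dest is an in/out parameter. If it is nullptr on entry, a new
  // temporary is created with Src's type and infinite weight (and pinned
  // to RegNum when one is given) and handed back through Dest.
  void mov(ToyVariable *&Dest, ToyVariable *Src, int RegNum = NoRegister) {
    assert(Src != nullptr);
    if (Dest == nullptr) {
      Dest = makeVariable(Src->Type);
      Dest->WeightInfinite = true;
      if (RegNum != NoRegister)
        Dest->RegNum = RegNum;
    }
    Insts.push_back({Dest, Src}); // record the emitted "mov Dest, Src"
  }

  std::vector<ToyMov> Insts;

private:
  std::vector<std::unique_ptr<ToyVariable>> Vars;
};
```

With this model, the "simpler version" pattern from the document (declare ``Reg`` as ``nullptr``, then ``mov(Reg, Src)`` followed by ``mov(Dst, Reg)``) emits two moves and leaves ``Reg`` pointing at a fresh infinite-weight temporary, while an existing ``Dst`` is left unchanged.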
