| Index: src/README.SIMD.rst
|
| diff --git a/src/README.SIMD.rst b/src/README.SIMD.rst
|
| index 58f25d96b1fa4285267267eaa56434df0afe31f0..f8cf08f3c3a107c10b348e14f86dd8bfe4e316ae 100644
|
| --- a/src/README.SIMD.rst
|
| +++ b/src/README.SIMD.rst
|
| @@ -1,13 +1,14 @@
|
| Missing support
|
| ===============
|
|
|
| -* The PNaCl LLVM backend expands shufflevector operations into
|
| - sequences of insertelement and extractelement operations. For
|
| - instance:
|
| +* The PNaCl LLVM backend expands shufflevector operations into sequences of
|
| + insertelement and extractelement operations. For instance:
|
|
|
| define <4 x i32> @shuffle(<4 x i32> %arg1, <4 x i32> %arg2) {
|
| entry:
|
| - %res = shufflevector <4 x i32> %arg1, <4 x i32> %arg2, <4 x i32> <i32 4, i32 5, i32 0, i32 1>
|
| + %res = shufflevector <4 x i32> %arg1,
|
| + <4 x i32> %arg2,
|
| + <4 x i32> <i32 4, i32 5, i32 0, i32 1>
|
| ret <4 x i32> %res
|
| }
|
|
|
| @@ -30,38 +31,34 @@ Missing support
|
| shuffle operations where appropriate.
|
|
|
| * Add support for vector constants in the backend. The current code
|
| - materializes the vector constants it needs (eg. for performing icmp
|
| - on unsigned operands) using register operations, but this should be
|
| - changed to loading them from a constant pool if the register
|
| - initialization is too complicated (such as in
|
| - TargetX8632::makeVectorOfHighOrderBits()).
|
| + materializes the vector constants it needs (e.g. for performing icmp on
|
| + unsigned operands) using register operations, but this should be changed to
|
| + loading them from a constant pool if the register initialization is too
|
| + complicated (such as in TargetX8632::makeVectorOfHighOrderBits()).
|
|
|
| -* [x86 specific] llvm-mc does not allow lea to take a mem128 memory
|
| - operand when assembling x86-32 code. The current
|
| - InstX8632Lea::emit() code uses Variable::asType() to convert any
|
| - mem128 Variables into a compatible memory operand type. However, the
|
| - emit code does not do any conversions of OperandX8632Mem, so if an
|
| - OperandX8632Mem is passed to lea as mem128 the resulting code will
|
| - not assemble. One way to fix this is by implementing
|
| +* [x86 specific] llvm-mc does not allow lea to take a mem128 memory operand
|
| + when assembling x86-32 code. The current InstX8632Lea::emit() code uses
|
| + Variable::asType() to convert any mem128 Variables into a compatible memory
|
| + operand type. However, the emit code does not do any conversions of
|
| + OperandX8632Mem, so if an OperandX8632Mem is passed to lea as mem128 the
|
| + resulting code will not assemble. One way to fix this is by implementing
|
| OperandX8632Mem::asType().
|
|
|
| -* [x86 specific] Lower shl with <4 x i32> using some clever float
|
| - conversion:
|
| +* [x86 specific] Lower shl with <4 x i32> using some clever float conversion:
|
| http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20100726/105087.html
|
|
|
| -* [x86 specific] Add support for using aligned mov operations
|
| - (movaps). This will require passing alignment information to loads
|
| - and stores.
|
| +* [x86 specific] Add support for using aligned mov operations (movaps). This
|
| + will require passing alignment information to loads and stores.
|
|
|
| x86 SIMD Diversification
|
| ========================
|
|
|
| -* Vector "bitwise" operations have several variant instructions: the
|
| - AND operation can be implemented with pand, andpd, or andps. This
|
| - pattern also holds for ANDN, OR, and XOR.
|
| +* Vector "bitwise" operations have several variant instructions: the AND
|
| + operation can be implemented with pand, andpd, or andps. This pattern also
|
| + holds for ANDN, OR, and XOR.
|
|
|
| -* Vector "mov" instructions can be diversified (eg. movdqu instead of
|
| - movups) at the cost of a possible performance penalty.
|
| +* Vector "mov" instructions can be diversified (e.g. movdqu instead of movups)
|
| + at the cost of a possible performance penalty.
|
|
|
| -* Scalar FP arithmetic can be diversified by performing the operations
|
| - with the vector version of the instructions.
|
| +* Scalar FP arithmetic can be diversified by performing the operations with the
|
| + vector version of the instructions.
|
|
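Note on the shufflevector expansion described in the first hunk: the sketch below models, in plain C++, what expanding the example shuffle into extractelement and insertelement steps amounts to. It is only an illustration of the semantics, not the backend's code. The names Vec4, extractElement, insertElement, and shuffleExpanded are hypothetical and do not come from the Subzero or LLVM sources. Mask indices 0-3 select lanes of the first operand and 4-7 select lanes of the second, as in the mask <4, 5, 0, 1> above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an LLVM <4 x i32> value.
using Vec4 = std::array<int32_t, 4>;

// Models extractelement: read one lane of a vector.
int32_t extractElement(const Vec4 &V, std::size_t Index) { return V[Index]; }

// Models insertelement: return a copy of V with one lane replaced.
Vec4 insertElement(Vec4 V, int32_t Value, std::size_t Index) {
  V[Index] = Value;
  return V;
}

// Models a shufflevector expanded into one extract/insert pair per lane.
// Mask entries 0-3 pick from Arg1, entries 4-7 pick from Arg2.
Vec4 shuffleExpanded(const Vec4 &Arg1, const Vec4 &Arg2,
                     const std::array<std::size_t, 4> &Mask) {
  Vec4 Res{}; // zero-initialized here; stands in for an undef vector
  for (std::size_t Lane = 0; Lane < 4; ++Lane) {
    const std::size_t Sel = Mask[Lane];
    assert(Sel < 8 && "mask index out of range");
    const int32_t Elem = Sel < 4 ? extractElement(Arg1, Sel)
                                 : extractElement(Arg2, Sel - 4);
    Res = insertElement(Res, Elem, Lane);
  }
  return Res;
}
```

With Arg1 = {1, 2, 3, 4}, Arg2 = {5, 6, 7, 8}, and the mask {4, 5, 0, 1} from the example, shuffleExpanded returns {5, 6, 1, 2}. The four extract/insert pairs are the cost that a native shuffle lowering would avoid.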
|