OLD | NEW |
(Empty) | |
| 1 Missing support |
| 2 =============== |
| 3 |
| 4 * The PNaCl LLVM backend expands shufflevector operations into |
| 5 sequences of insertelement and extractelement operations. For |
| 6 instance: |
| 7 |
| 8 define <4 x i32> @shuffle(<4 x i32> %arg1, <4 x i32> %arg2) { |
| 9 entry: |
| 10 %res = shufflevector <4 x i32> %arg1, <4 x i32> %arg2, <4 x i32> <i32 4, i32 5, i32 0, i32 1> |
| 11 ret <4 x i32> %res |
| 12 } |
| 13 |
| 14 gets expanded into: |
| 15 |
| 16 define <4 x i32> @shuffle(<4 x i32> %arg1, <4 x i32> %arg2) { |
| 17 entry: |
| 18 %0 = extractelement <4 x i32> %arg2, i32 0 |
| 19 %1 = insertelement <4 x i32> undef, i32 %0, i32 0 |
| 20 %2 = extractelement <4 x i32> %arg2, i32 1 |
| 21 %3 = insertelement <4 x i32> %1, i32 %2, i32 1 |
| 22 %4 = extractelement <4 x i32> %arg1, i32 0 |
| 23 %5 = insertelement <4 x i32> %3, i32 %4, i32 2 |
| 24 %6 = extractelement <4 x i32> %arg1, i32 1 |
| 25 %7 = insertelement <4 x i32> %5, i32 %6, i32 3 |
| 26 ret <4 x i32> %7 |
| 27 } |
| 28 |
| 29 Subzero should recognize these sequences and recombine them into |
| 30 shuffle operations where appropriate. |
| 31 |
| 32 * Add support for vector constants in the backend. The current code |
| 33 materializes the vector constants it needs (e.g. for performing icmp |
| 34 on unsigned operands) using register operations, but this should be |
| 35 changed to loading them from a constant pool if the register |
| 36 initialization is too complicated (such as in |
| 37 TargetX8632::makeVectorOfHighOrderBits()). |
| 38 |
| 39 * [x86 specific] llvm-mc does not allow lea to take a mem128 memory |
| 40 operand when assembling x86-32 code. The current |
| 41 InstX8632Lea::emit() code uses Variable::asType() to convert any |
| 42 mem128 Variables into a compatible memory operand type. However, the |
| 43 emit code does not do any conversions of OperandX8632Mem, so if an |
| 44 OperandX8632Mem is passed to lea as mem128 the resulting code will |
| 45 not assemble. One way to fix this is by implementing |
| 46 OperandX8632Mem::asType(). |
| 47 |
| 48 * [x86 specific] Lower shl with <4 x i32> using some clever float |
| 49 conversion: |
| 50 http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20100726/105087.html |
| 51 |
| 52 * [x86 specific] Add support for using aligned mov operations |
| 53 (movaps). This will require passing alignment information to loads |
| 54 and stores. |
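
For the first item above (recombining expanded shufflevector sequences), the recognition step can be sketched as follows. This is a minimal sketch and not Subzero code: the `LaneCopy` struct and `recoverShuffleMask()` are hypothetical names, and each extractelement/insertelement pair is assumed to have already been matched and reduced to "copy lane SrcLane of operand SrcVec into lane DstLane". Chains that leave a lane undefined are simply rejected here, although a real implementation could map them to an undef mask element.

```cpp
#include <array>
#include <optional>
#include <vector>

constexpr int NumLanes = 4; // <4 x i32>, as in the example above

// Hypothetical, simplified model of one extractelement/insertelement pair.
// SrcVec is 0 for the first shuffle operand (%arg1) and 1 for the second
// (%arg2).
struct LaneCopy {
  int SrcVec;
  int SrcLane;
  int DstLane;
};

// Returns the equivalent shufflevector mask if the chain defines every
// destination lane, and std::nullopt otherwise. Mask indices follow the
// LLVM convention: 0..3 select from the first operand, 4..7 from the second.
std::optional<std::array<int, NumLanes>>
recoverShuffleMask(const std::vector<LaneCopy> &Chain) {
  std::array<int, NumLanes> Mask;
  Mask.fill(-1); // -1 marks a lane that has not been written yet
  for (const LaneCopy &C : Chain) {
    bool InRange = C.SrcVec >= 0 && C.SrcVec <= 1 && C.SrcLane >= 0 &&
                   C.SrcLane < NumLanes && C.DstLane >= 0 &&
                   C.DstLane < NumLanes;
    if (!InRange)
      return std::nullopt;
    // A later insertelement into the same lane overrides an earlier one.
    Mask[C.DstLane] = C.SrcVec * NumLanes + C.SrcLane;
  }
  for (int M : Mask)
    if (M < 0)
      return std::nullopt; // some lane is still undef
  return Mask;
}
```

Feeding it the four pairs from the expanded function above, `{1,0,0}, {1,1,1}, {0,0,2}, {0,1,3}`, yields the mask `4, 5, 0, 1`, which is the mask of the original shufflevector.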
| 55 |
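For the shl item above, the linked message is not reproduced here. The sketch below assumes the commonly used form of the trick: build 2^n as a float by placing the shift amount in the exponent field, convert that float to an integer, and multiply. It models a single lane in scalar C++; `shlViaFloat()` is a hypothetical name, and the instruction names in the comments (pslld, paddd, cvttps2dq, pmulld) are the vector instructions this form would presumably map to, not a statement about what Subzero emits.

```cpp
#include <cstdint>
#include <cstring>

// Scalar model of one <4 x i32> lane: computes X << Amt for Amt in [0, 31]
// without applying a variable shift to X itself.
uint32_t shlViaFloat(uint32_t X, uint32_t Amt) {
  // Place Amt in the exponent field and add the bit pattern of 1.0f. The
  // result is the IEEE-754 single 2^Amt (vector form: pslld 23, then paddd).
  uint32_t Bits = (Amt << 23) + 0x3f800000u;
  float Pow2;
  std::memcpy(&Pow2, &Bits, sizeof(Pow2));
  // Convert the float to an integer (vector form: cvttps2dq). The detour
  // through int64_t keeps the Amt == 31 case well defined in C++.
  uint32_t Factor = static_cast<uint32_t>(static_cast<int64_t>(Pow2));
  // Multiplying by 2^Amt modulo 2^32 is the left shift (vector form: pmulld,
  // which requires SSE4.1).
  return X * Factor;
}
```

For example, `shlViaFloat(3, 4)` multiplies 3 by 16. Shift amounts of 32 or more are not handled, which matches shl in LLVM IR, where such shifts do not produce a defined result.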
| 56 x86 SIMD Diversification |
| 57 ======================== |
| 58 |
| 59 * Vector "bitwise" operations have several variant instructions: the |
| 60 AND operation can be implemented with pand, andpd, or andps. This |
| 61 pattern also holds for ANDN, OR, and XOR. |
| 62 |
| 63 * Vector "mov" instructions can be diversified (e.g. movdqu instead of |
| 64 movups) at the cost of a possible performance penalty. |
| 65 |
| 66 * Scalar FP arithmetic can be diversified by performing the operations |
| 67 with the vector version of the instructions. |
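
The bitwise-operation item above could be driven by a small table of interchangeable mnemonics. This is a hedged sketch only: `BitwiseOp`, `variantsFor()` and `pickMnemonic()` are hypothetical names, and it ignores any domain-crossing performance cost of mixing integer and floating-point forms.

```cpp
#include <array>
#include <random>
#include <string>

enum class BitwiseOp { And, Andn, Or, Xor };

// The three encodings of each vector bitwise operation: the integer form,
// the single-precision form and the double-precision form.
inline const std::array<const char *, 3> &variantsFor(BitwiseOp Op) {
  static const std::array<const char *, 3> Table[] = {
      {{"pand", "andps", "andpd"}},
      {{"pandn", "andnps", "andnpd"}},
      {{"por", "orps", "orpd"}},
      {{"pxor", "xorps", "xorpd"}},
  };
  return Table[static_cast<int>(Op)];
}

// Picks one of the equivalent mnemonics using the caller's random number
// generator, so a fixed seed reproduces the same choices.
template <class Rng> std::string pickMnemonic(BitwiseOp Op, Rng &R) {
  std::uniform_int_distribution<int> Dist(0, 2);
  return variantsFor(Op)[Dist(R)];
}
```

A caller would seed a generator once per compilation, for example `std::mt19937 R(Seed)`, and call `pickMnemonic(BitwiseOp::Xor, R)` each time a vector XOR is emitted.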