|
|
Description[Turbofan] Add ARM NEON instructions for implementing SIMD.
- Adds NEON instructions to assembler, disassembler, simulator.
- Adds ExtractLane, ReplaceLane functions to macro assembler.
LOG=N
BUG=v8:4124
Review-Url: https://codereview.chromium.org/2546933002
Cr-Commit-Position: refs/heads/master@{#41737}
Committed: https://chromium.googlesource.com/v8/v8/+/03f33f2e68641903a982a025b0ba645c8ba9f9b3
Patch Set 1 #Patch Set 2 : Restore deleted whitespace. #Patch Set 3 : Fix disasm test. #Patch Set 4 : Add NeonListOperand ctor overload for q-register. #Patch Set 5 : Don't use temporary FP regs in tests. #
Total comments: 18
Patch Set 6 : Review comments. #
Total comments: 75
Patch Set 7 : Second review comments. #
Total comments: 30
Patch Set 8 : Third review comments. #
Total comments: 4
Patch Set 9 : Fourth review comments. #
Messages
Total messages: 52 (38 generated)
The CQ bit was checked by bbudge@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
bbudge@chromium.org changed reviewers: + jacob.bramley@arm.com, martyn.capewell@arm.com
The CQ bit was checked by bbudge@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by bbudge@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by bbudge@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
ping.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
rodolph.perfetta@arm.com changed reviewers: + rodolph.perfetta@arm.com
partial review. https://codereview.chromium.org/2546933002/diff/80001/src/arm/assembler-arm.cc File src/arm/assembler-arm.cc (right): https://codereview.chromium.org/2546933002/diff/80001/src/arm/assembler-arm.c... src/arm/assembler-arm.cc:3970: void Assembler::vcvt(const QwNeonRegister dst, const QwNeonRegister src, other vcvt instruction indicate the conversion direction and type in their name. So here we would have 4 instructions: vcvt_f32_u32, vcvt_f32_s32, vcvt_u32_f32 and vcvt_s32_f32. You could get rid of NeonOtherDataType this way. By convention on ARM the "to" comes before the "from". https://codereview.chromium.org/2546933002/diff/80001/src/arm/assembler-arm.c... src/arm/assembler-arm.cc:3982: op = ((to & NeonDataTypeUMask) != 0) ? 1 : 0; this is the other way round: when converting to integer op<1> == 1, so it should be 3 and 2 here. https://codereview.chromium.org/2546933002/diff/80001/src/arm/assembler-arm.c... src/arm/assembler-arm.cc:3986: op = ((from & NeonDataTypeUMask) != 0) ? 3 : 2; and 1 and 0 here. https://codereview.chromium.org/2546933002/diff/80001/src/arm/disasm-arm.cc File src/arm/disasm-arm.cc (right): https://codereview.chromium.org/2546933002/diff/80001/src/arm/disasm-arm.cc#n... src/arm/disasm-arm.cc:1843: "%s.F32 q%d, q%d, q%d", op, Vd, Vn, Vm); use lower case for the type. https://codereview.chromium.org/2546933002/diff/80001/src/arm/disasm-arm.cc#n... src/arm/disasm-arm.cc:1870: SNPrintF(out_buffer_ + out_buffer_pos_, "vsub.I%d q%d, q%d, q%d", lower case https://codereview.chromium.org/2546933002/diff/80001/src/arm/disasm-arm.cc#n... src/arm/disasm-arm.cc:1874: SNPrintF(out_buffer_ + out_buffer_pos_, "vceq.I%d q%d, q%d, q%d", lower case. https://codereview.chromium.org/2546933002/diff/80001/src/arm/disasm-arm.cc#n... src/arm/disasm-arm.cc:1929: case 0: same as the assembler this is incorrect, it should be "f32.s32" https://codereview.chromium.org/2546933002/diff/80001/src/arm/simulator-arm.cc File src/arm/simulator-arm.cc (right): https://codereview.chromium.org/2546933002/diff/80001/src/arm/simulator-arm.c... src/arm/simulator-arm.cc:3313: uint16_t* dst = reinterpret_cast<uint16_t*>(q_data); I believe this is only safe for bytes, you may encounter type aliasing issue otherwise. The safe way would be to declare a uint16_t array and memcpy q_data into it. This issue appear a few times in this file (with float, double, etc). https://codereview.chromium.org/2546933002/diff/80001/src/arm/simulator-arm.c... src/arm/simulator-arm.cc:4048: case 0: // F32 -> S32 as for the assembler, op interpretation is incorrect: 0 means to F32 from S32 (vcvt.f32.s32)
The CQ bit was checked by bbudge@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
https://codereview.chromium.org/2546933002/diff/80001/src/arm/assembler-arm.cc File src/arm/assembler-arm.cc (right): https://codereview.chromium.org/2546933002/diff/80001/src/arm/assembler-arm.c... src/arm/assembler-arm.cc:3970: void Assembler::vcvt(const QwNeonRegister dst, const QwNeonRegister src, On 2016/12/07 15:59:00, Rodolph Perfetta (ARM) wrote: > other vcvt instruction indicate the conversion direction and type in their name. > So here we would have 4 instructions: vcvt_f32_u32, vcvt_f32_s32, vcvt_u32_f32 > and vcvt_s32_f32. You could get rid of NeonOtherDataType this way. > > By convention on ARM the "to" comes before the "from". Yes, that's a lot more consistent with the existing code. Done. https://codereview.chromium.org/2546933002/diff/80001/src/arm/assembler-arm.c... src/arm/assembler-arm.cc:3982: op = ((to & NeonDataTypeUMask) != 0) ? 1 : 0; On 2016/12/07 15:59:00, Rodolph Perfetta (ARM) wrote: > this is the other way round: when converting to integer op<1> == 1, so it should > be 3 and 2 here. Done. https://codereview.chromium.org/2546933002/diff/80001/src/arm/assembler-arm.c... src/arm/assembler-arm.cc:3986: op = ((from & NeonDataTypeUMask) != 0) ? 3 : 2; On 2016/12/07 15:59:00, Rodolph Perfetta (ARM) wrote: > and 1 and 0 here. Done. https://codereview.chromium.org/2546933002/diff/80001/src/arm/disasm-arm.cc File src/arm/disasm-arm.cc (right): https://codereview.chromium.org/2546933002/diff/80001/src/arm/disasm-arm.cc#n... src/arm/disasm-arm.cc:1843: "%s.F32 q%d, q%d, q%d", op, Vd, Vn, Vm); On 2016/12/07 15:59:00, Rodolph Perfetta (ARM) wrote: > use lower case for the type. Done. https://codereview.chromium.org/2546933002/diff/80001/src/arm/disasm-arm.cc#n... src/arm/disasm-arm.cc:1870: SNPrintF(out_buffer_ + out_buffer_pos_, "vsub.I%d q%d, q%d, q%d", On 2016/12/07 15:59:00, Rodolph Perfetta (ARM) wrote: > lower case Done. https://codereview.chromium.org/2546933002/diff/80001/src/arm/disasm-arm.cc#n... src/arm/disasm-arm.cc:1874: SNPrintF(out_buffer_ + out_buffer_pos_, "vceq.I%d q%d, q%d, q%d", On 2016/12/07 15:59:00, Rodolph Perfetta (ARM) wrote: > lower case. Done. https://codereview.chromium.org/2546933002/diff/80001/src/arm/disasm-arm.cc#n... src/arm/disasm-arm.cc:1929: case 0: On 2016/12/07 15:59:00, Rodolph Perfetta (ARM) wrote: > same as the assembler this is incorrect, it should be "f32.s32" Done. https://codereview.chromium.org/2546933002/diff/80001/src/arm/simulator-arm.cc File src/arm/simulator-arm.cc (right): https://codereview.chromium.org/2546933002/diff/80001/src/arm/simulator-arm.c... src/arm/simulator-arm.cc:3313: uint16_t* dst = reinterpret_cast<uint16_t*>(q_data); On 2016/12/07 15:59:00, Rodolph Perfetta (ARM) wrote: > I believe this is only safe for bytes, you may encounter type aliasing issue > otherwise. The safe way would be to declare a uint16_t array and memcpy q_data > into it. This issue appear a few times in this file (with float, double, etc). I worked around this by performing operations in pairs for the Neon16 case, so we always operate on either uint8_t's or uint32_t's for integer operations. Undefined casts on floats are replaced with bit_cast's. https://codereview.chromium.org/2546933002/diff/80001/src/arm/simulator-arm.c... src/arm/simulator-arm.cc:4048: case 0: // F32 -> S32 On 2016/12/07 15:59:00, Rodolph Perfetta (ARM) wrote: > as for the assembler, op interpretation is incorrect: 0 means to F32 from S32 > (vcvt.f32.s32) Done.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.cc File src/arm/assembler-arm.cc (right): https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:3902: emit(0x1E7 * B23 | d * B22 | 3 * B20 | vd * B12 | 0x17 * B6 | m * B5 | vm); 0x1E7U according to issue 5725 https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:3966: emit(0x1E7 * B23 | d * B22 | 0x3 * B20 | imm4 * B16 | vd * B12 | 0x18 * B7 | 0x1E7U https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:3984: op = dst_type == U32 ? 3 : 2; DCECHK((dst_type == U32) || (dst_type == S32)); https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:3987: op = src_type == U32 ? 1 : 0; ditto with src_type https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4067: emit(0x1E4 * B23 | d * B22 | vn * B16 | vd * B12 | 0xD * B8 | n * B7 | B6 | 0x1E4U https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4083: emit(0x1E4 * B23 | d * B22 | sz * B20 | vn * B16 | vd * B12 | 0x8 * B8 | ditto https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4098: emit(0x1E4 * B23 | d * B22 | B21 | vn * B16 | vd * B12 | 0xD * B8 | n * B7 | ditto https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4114: emit(0x1E6 * B23 | d * B22 | sz * B20 | vn * B16 | vd * B12 | 0x8 * B8 | ditto https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4130: emit(0x1E4 * B23 | d * B22 | sz * B20 | vn * B16 | vd * B12 | 0x8 * B8 | ditto https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4146: emit(0x1E6 * B23 | d * B22 | sz * B20 | vn * B16 | vd * B12 | 0x8 * B8 | ditto https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4161: int op = 1; // vbsl is that necessary? why not using B20 directly below? https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4162: emit(0x1E6 * B23 | d * B22 | op * B20 | vn * B16 | vd * B12 | 0x1 * B8 | 0x1E6U https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4177: int op = 1; // vbsl unused. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4178: emit(0x1E7 * B23 | d * B22 | 0x3 * B20 | vn * B16 | vd * B12 | 0x2 * B10 | 0x1E7U https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4182: void Assembler::vtbx(const DwVfpRegister dst, const NeonListOperand& list, vtbl and vtbx encoding only differ by the value of bit 6, how about a common helper for both instruction? https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4193: int op = 1; // vbsl unused https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4194: emit(0x1E7 * B23 | d * B22 | 0x3 * B20 | vn * B16 | vd * B12 | 0x2 * B10 | 0x1E7U https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.h File src/arm/assembler-arm.h (right): https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.h:649: int len() const { return register_count_ - 1; } len -> length https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.h:1351: void vdup(const QwNeonRegister dst, const Register src, NeonSize size); in other instructions like vmovl the size qualifier is the first argument to be closer to the assembly syntax https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.h:1352: void vdup(const QwNeonRegister dst, const SwVfpRegister src); this is a special case of vdup.size dst, src[index]. Any reason why we wouldn't want to support for full instruction? That would give us access to higher D registers for example. I am just asking, if this is the only useful case for now that is fine by me. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.h:1364: const QwNeonRegister src2, NeonSize size); how about having the size as the first https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.h:1368: const QwNeonRegister src2, NeonSize size); ditto https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.h:1370: const QwNeonRegister src2, NeonSize size); ditto https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.h:1372: const QwNeonRegister src2, NeonSize size); ditto https://codereview.chromium.org/2546933002/diff/100001/src/arm/disasm-arm.cc File src/arm/disasm-arm.cc (right): https://codereview.chromium.org/2546933002/diff/100001/src/arm/disasm-arm.cc#... src/arm/disasm-arm.cc:1548: int Vd = (instr->Bit(7) << 3) | (instr->VnValue() >> 1); int Vd = instr->VFPNRegValue(kSimd128Precision); https://codereview.chromium.org/2546933002/diff/100001/src/arm/disasm-arm.cc#... src/arm/disasm-arm.cc:1973: instr->Bit(6) == 0 ? "vtbl" : "vtbx", Vd); vtbl.8, vtbx.8 https://codereview.chromium.org/2546933002/diff/100001/src/arm/macro-assemble... File src/arm/macro-assembler-arm.cc (right): https://codereview.chromium.org/2546933002/diff/100001/src/arm/macro-assemble... src/arm/macro-assembler-arm.cc:1147: void MacroAssembler::ExtractLane(Register dst, QwNeonRegister src, There is a Neon instruction for this: vmov.dt rt, Dn[x] where dt is one of {S8, U8, S16, U16, 32} and x the index of your lane. A partial implementation of this function is already available: vmov(const DwVfpRegister dst, const VmovIndex index, const Register src, const Condition cond), it only supports VFP, i.e vmov.32 rt, Dn[0|1] https://codereview.chromium.org/2546933002/diff/100001/src/arm/macro-assemble... src/arm/macro-assembler-arm.cc:1181: int bytes_per_lane = dt & NeonDataTypeSizeMask; if you implement vmov.dt Dn[x], Rt then you can drop the rest of this function: Move(dst, src); vmov(dt, dst, src_lane, lane); the VFP only part of this instruction is already available: void Assembler::vmov(const Register dst, const VmovIndex index, const DwVfpRegister src, const Condition cond). The VFP part support conditional execution, usually Neon deprecates it so it may be worth removing the condition argument for this instruction https://codereview.chromium.org/2546933002/diff/100001/src/arm/simulator-arm.cc File src/arm/simulator-arm.cc (right): https://codereview.chromium.org/2546933002/diff/100001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:3791: #define TST(x, y) (((x) & (y)) == 0 ? 0xFFFFu : 0) != 0 https://codereview.chromium.org/2546933002/diff/100001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:3806: int size = instr->Bits(21, 20); NeonSize size = static_cast<NeonSize>(instr->Bit(21, 20)); this may make the code below easier to read: case Neon8: ... case Neon16: ... https://codereview.chromium.org/2546933002/diff/100001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:4046: float* as_float = reinterpret_cast<float*>(q_data); undefined https://codereview.chromium.org/2546933002/diff/100001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:4048: uint32_t* as_uint = reinterpret_cast<uint32_t*>(q_data); cast not needed. https://codereview.chromium.org/2546933002/diff/100001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:4062: as_uint[i] = static_cast<uint32_t>(as_float[i]); this is undefined if the float value is negative. Currently clang seem to return the int value and reinterpret it as a uint which is the wrong behaviour for ARM which maps negative float to 0. You'll need a dedicated function for the conversion, see DecodeVCVTBetweenFloatingPointAndInteger for example. https://codereview.chromium.org/2546933002/diff/100001/test/cctest/test-assem... File test/cctest/test-assembler-arm.cc (right): https://codereview.chromium.org/2546933002/diff/100001/test/cctest/test-assem... test/cctest/test-assembler-arm.cc:1377: // vsub(integer). It would be nice to have test for each NeonSize for the instruction below. https://codereview.chromium.org/2546933002/diff/100001/test/cctest/test-disas... File test/cctest/test-disasm-arm.cc (right): https://codereview.chromium.org/2546933002/diff/100001/test/cctest/test-disas... test/cctest/test-disasm-arm.cc:1014: "f3b10802 vtbl d0, {d1}, d2"); vtbl.8/vtbx.8 https://codereview.chromium.org/2546933002/diff/100001/test/cctest/test-disas... test/cctest/test-disasm-arm.cc:1024: "f3b1fa45 vtbx d15, {d1, d2, d3}, d5"); it woudl be nice to have a test with a list of 4 registers. https://codereview.chromium.org/2546933002/diff/100001/test/cctest/test-macro... File test/cctest/test-macro-assembler-arm.cc (right): https://codereview.chromium.org/2546933002/diff/100001/test/cctest/test-macro... test/cctest/test-macro-assembler-arm.cc:217: __ bx(lr); nit: this is not necessary, the line above loaded lr directly in the pc.
Implemented the scalar - ARM core register move instructions. Extracted a utility method for simulator to VCVT Neon registers. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.cc File src/arm/assembler-arm.cc (right): https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:3902: emit(0x1E7 * B23 | d * B22 | 3 * B20 | vd * B12 | 0x17 * B6 | m * B5 | vm); On 2016/12/08 18:08:27, Rodolph Perfetta (ARM) wrote: > 0x1E7U according to issue 5725 Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:3966: emit(0x1E7 * B23 | d * B22 | 0x3 * B20 | imm4 * B16 | vd * B12 | 0x18 * B7 | On 2016/12/08 18:08:27, Rodolph Perfetta (ARM) wrote: > 0x1E7U Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:3984: op = dst_type == U32 ? 3 : 2; On 2016/12/08 18:08:27, Rodolph Perfetta (ARM) wrote: > DCECHK((dst_type == U32) || (dst_type == S32)); It's (subtly) implied by the existing DCHECKs above, but your way is clearer. Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:3987: op = src_type == U32 ? 1 : 0; On 2016/12/08 18:08:27, Rodolph Perfetta (ARM) wrote: > ditto with src_type Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4067: emit(0x1E4 * B23 | d * B22 | vn * B16 | vd * B12 | 0xD * B8 | n * B7 | B6 | On 2016/12/08 18:08:27, Rodolph Perfetta (ARM) wrote: > 0x1E4U Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4083: emit(0x1E4 * B23 | d * B22 | sz * B20 | vn * B16 | vd * B12 | 0x8 * B8 | On 2016/12/08 18:08:27, Rodolph Perfetta (ARM) wrote: > ditto Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4098: emit(0x1E4 * B23 | d * B22 | B21 | vn * B16 | vd * B12 | 0xD * B8 | n * B7 | On 2016/12/08 18:08:27, Rodolph Perfetta (ARM) wrote: > ditto Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4114: emit(0x1E6 * B23 | d * B22 | sz * B20 | vn * B16 | vd * B12 | 0x8 * B8 | On 2016/12/08 18:08:27, Rodolph Perfetta (ARM) wrote: > ditto Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4130: emit(0x1E4 * B23 | d * B22 | sz * B20 | vn * B16 | vd * B12 | 0x8 * B8 | On 2016/12/08 18:08:27, Rodolph Perfetta (ARM) wrote: > ditto Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4146: emit(0x1E6 * B23 | d * B22 | sz * B20 | vn * B16 | vd * B12 | 0x8 * B8 | On 2016/12/08 18:08:27, Rodolph Perfetta (ARM) wrote: > ditto Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4161: int op = 1; // vbsl On 2016/12/08 18:08:27, Rodolph Perfetta (ARM) wrote: > is that necessary? why not using B20 directly below? Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4162: emit(0x1E6 * B23 | d * B22 | op * B20 | vn * B16 | vd * B12 | 0x1 * B8 | On 2016/12/08 18:08:27, Rodolph Perfetta (ARM) wrote: > 0x1E6U Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4177: int op = 1; // vbsl On 2016/12/08 18:08:27, Rodolph Perfetta (ARM) wrote: > unused. Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4178: emit(0x1E7 * B23 | d * B22 | 0x3 * B20 | vn * B16 | vd * B12 | 0x2 * B10 | On 2016/12/08 18:08:27, Rodolph Perfetta (ARM) wrote: > 0x1E7U Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4182: void Assembler::vtbx(const DwVfpRegister dst, const NeonListOperand& list, On 2016/12/08 18:08:27, Rodolph Perfetta (ARM) wrote: > vtbl and vtbx encoding only differ by the value of bit 6, how about a common > helper for both instruction? Yes, Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4193: int op = 1; // vbsl On 2016/12/08 18:08:27, Rodolph Perfetta (ARM) wrote: > unused Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.cc:4194: emit(0x1E7 * B23 | d * B22 | 0x3 * B20 | vn * B16 | vd * B12 | 0x2 * B10 | On 2016/12/08 18:08:27, Rodolph Perfetta (ARM) wrote: > 0x1E7U Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.h File src/arm/assembler-arm.h (right): https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.h:649: int len() const { return register_count_ - 1; } On 2016/12/08 18:08:28, Rodolph Perfetta (ARM) wrote: > len -> length Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.h:1351: void vdup(const QwNeonRegister dst, const Register src, NeonSize size); On 2016/12/08 18:08:28, Rodolph Perfetta (ARM) wrote: > in other instructions like vmovl the size qualifier is the first argument to be > closer to the assembly syntax Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.h:1352: void vdup(const QwNeonRegister dst, const SwVfpRegister src); On 2016/12/08 18:08:28, Rodolph Perfetta (ARM) wrote: > this is a special case of vdup.size dst, src[index]. Any reason why we wouldn't > want to support for full instruction? That would give us access to higher D > registers for example. I am just asking, if this is the only useful case for now > that is fine by me. This is enough for Wasm, so I'll leave it as is. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.h:1364: const QwNeonRegister src2, NeonSize size); On 2016/12/08 18:08:28, Rodolph Perfetta (ARM) wrote: > how about having the size as the first Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.h:1368: const QwNeonRegister src2, NeonSize size); On 2016/12/08 18:08:27, Rodolph Perfetta (ARM) wrote: > ditto Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.h:1370: const QwNeonRegister src2, NeonSize size); On 2016/12/08 18:08:28, Rodolph Perfetta (ARM) wrote: > ditto Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/assembler-arm.... src/arm/assembler-arm.h:1372: const QwNeonRegister src2, NeonSize size); On 2016/12/08 18:08:28, Rodolph Perfetta (ARM) wrote: > ditto Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/disasm-arm.cc File src/arm/disasm-arm.cc (right): https://codereview.chromium.org/2546933002/diff/100001/src/arm/disasm-arm.cc#... src/arm/disasm-arm.cc:1548: int Vd = (instr->Bit(7) << 3) | (instr->VnValue() >> 1); On 2016/12/08 18:08:28, Rodolph Perfetta (ARM) wrote: > int Vd = instr->VFPNRegValue(kSimd128Precision); Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/disasm-arm.cc#... src/arm/disasm-arm.cc:1973: instr->Bit(6) == 0 ? "vtbl" : "vtbx", Vd); On 2016/12/08 18:08:28, Rodolph Perfetta (ARM) wrote: > vtbl.8, vtbx.8 Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/macro-assemble... File src/arm/macro-assembler-arm.cc (right): https://codereview.chromium.org/2546933002/diff/100001/src/arm/macro-assemble... src/arm/macro-assembler-arm.cc:1147: void MacroAssembler::ExtractLane(Register dst, QwNeonRegister src, On 2016/12/08 18:08:28, Rodolph Perfetta (ARM) wrote: > There is a Neon instruction for this: vmov.dt rt, Dn[x] where dt is one of {S8, > U8, S16, U16, 32} and x the index of your lane. > > A partial implementation of this function is already available: vmov(const > DwVfpRegister dst, const VmovIndex index, const Register src, const Condition > cond), it only supports VFP, i.e vmov.32 rt, Dn[0|1] Awesome, I've implemented this for the other data types. https://codereview.chromium.org/2546933002/diff/100001/src/arm/macro-assemble... src/arm/macro-assembler-arm.cc:1181: int bytes_per_lane = dt & NeonDataTypeSizeMask; On 2016/12/08 18:08:28, Rodolph Perfetta (ARM) wrote: > if you implement vmov.dt Dn[x], Rt then you can drop the rest of this function: > > Move(dst, src); > vmov(dt, dst, src_lane, lane); > > the VFP only part of this instruction is already available: void > Assembler::vmov(const Register dst, const VmovIndex index, const DwVfpRegister > src, const Condition cond). > > The VFP part support conditional execution, usually Neon deprecates it so it may > be worth removing the condition argument for this instruction I implemented the Neon form of this instruction too. I allow the conditional form since the documentation doesn't seem to forbid it. https://codereview.chromium.org/2546933002/diff/100001/src/arm/simulator-arm.cc File src/arm/simulator-arm.cc (right): https://codereview.chromium.org/2546933002/diff/100001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:3791: #define TST(x, y) (((x) & (y)) == 0 ? 0xFFFFu : 0) On 2016/12/08 18:08:28, Rodolph Perfetta (ARM) wrote: > != 0 Done. I need to figure out why the test didn't catch this. Also, I added an _16 suffix to these to make it a little clearer and reduce probability of name collision. https://codereview.chromium.org/2546933002/diff/100001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:3806: int size = instr->Bits(21, 20); On 2016/12/08 18:08:28, Rodolph Perfetta (ARM) wrote: > NeonSize size = static_cast<NeonSize>(instr->Bit(21, 20)); > > this may make the code below easier to read: > case Neon8: ... > case Neon16: ... Done. Here and 4 other switches. https://codereview.chromium.org/2546933002/diff/100001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:4046: float* as_float = reinterpret_cast<float*>(q_data); On 2016/12/08 18:08:28, Rodolph Perfetta (ARM) wrote: > undefined Removed. https://codereview.chromium.org/2546933002/diff/100001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:4048: uint32_t* as_uint = reinterpret_cast<uint32_t*>(q_data); On 2016/12/08 18:08:28, Rodolph Perfetta (ARM) wrote: > cast not needed. Done. https://codereview.chromium.org/2546933002/diff/100001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:4062: as_uint[i] = static_cast<uint32_t>(as_float[i]); On 2016/12/08 18:08:28, Rodolph Perfetta (ARM) wrote: > this is undefined if the float value is negative. Currently clang seem to return > the int value and reinterpret it as a uint which is the wrong behaviour for ARM > which maps negative float to 0. > > You'll need a dedicated function for the conversion, see > DecodeVCVTBetweenFloatingPointAndInteger for example. I've pulled a helper method out of the existing VCVT code and use it here. https://codereview.chromium.org/2546933002/diff/100001/test/cctest/test-assem... File test/cctest/test-assembler-arm.cc (right): https://codereview.chromium.org/2546933002/diff/100001/test/cctest/test-assem... test/cctest/test-assembler-arm.cc:1377: // vsub(integer). On 2016/12/08 18:08:28, Rodolph Perfetta (ARM) wrote: > It would be nice to have test for each NeonSize for the instruction below. Done. Here and for similar instructions. https://codereview.chromium.org/2546933002/diff/100001/test/cctest/test-disas... File test/cctest/test-disasm-arm.cc (right): https://codereview.chromium.org/2546933002/diff/100001/test/cctest/test-disas... test/cctest/test-disasm-arm.cc:1014: "f3b10802 vtbl d0, {d1}, d2"); On 2016/12/08 18:08:28, Rodolph Perfetta (ARM) wrote: > vtbl.8/vtbx.8 Done. https://codereview.chromium.org/2546933002/diff/100001/test/cctest/test-disas... test/cctest/test-disasm-arm.cc:1024: "f3b1fa45 vtbx d15, {d1, d2, d3}, d5"); On 2016/12/08 18:08:28, Rodolph Perfetta (ARM) wrote: > it woudl be nice to have a test with a list of 4 registers. Done. https://codereview.chromium.org/2546933002/diff/100001/test/cctest/test-macro... File test/cctest/test-macro-assembler-arm.cc (right): https://codereview.chromium.org/2546933002/diff/100001/test/cctest/test-macro... test/cctest/test-macro-assembler-arm.cc:217: __ bx(lr); On 2016/12/08 18:08:28, Rodolph Perfetta (ARM) wrote: > nit: this is not necessary, the line above loaded lr directly in the pc. Done. https://codereview.chromium.org/2546933002/diff/100001/test/cctest/test-macro... test/cctest/test-macro-assembler-arm.cc:347: __ bx(lr); And here. https://codereview.chromium.org/2546933002/diff/120001/src/arm/simulator-arm.cc File src/arm/simulator-arm.cc (right): https://codereview.chromium.org/2546933002/diff/120001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:3621: inv_op_vfp_flag_ = get_inv_op_vfp_flag(mode, val, unsigned_integer); Should Neon alter fp state? https://codereview.chromium.org/2546933002/diff/120001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:3627: inexact_vfp_flag_ = (abs_diff != 0); Ditto.
The CQ bit was checked by bbudge@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
friendly ping
https://codereview.chromium.org/2546933002/diff/120001/src/arm/disasm-arm.cc File src/arm/disasm-arm.cc (right): https://codereview.chromium.org/2546933002/diff/120001/src/arm/disasm-arm.cc#... src/arm/disasm-arm.cc:1553: out_buffer_pos_ += SNPrintF(out_buffer_ + out_buffer_pos_, the assembler allows conditional forms of those instruction so the disassembler need to handle them. https://codereview.chromium.org/2546933002/diff/120001/src/arm/disasm-arm.cc#... src/arm/disasm-arm.cc:1572: out_buffer_pos_ += SNPrintF(out_buffer_ + out_buffer_pos_, ditto. https://codereview.chromium.org/2546933002/diff/120001/src/arm/disasm-arm.cc#... src/arm/disasm-arm.cc:1591: out_buffer_pos_ += SNPrintF(out_buffer_ + out_buffer_pos_, handle conditional. https://codereview.chromium.org/2546933002/diff/120001/src/arm/disasm-arm.cc#... src/arm/disasm-arm.cc:1597: SNPrintF(out_buffer_ + out_buffer_pos_, "vmov.%s16 r%d, d%d[%d]", ditto. https://codereview.chromium.org/2546933002/diff/120001/src/arm/simulator-arm.cc File src/arm/simulator-arm.cc (right): https://codereview.chromium.org/2546933002/diff/120001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:3621: inv_op_vfp_flag_ = get_inv_op_vfp_flag(mode, val, unsigned_integer); A bit surprising but yes. On 2016/12/10 21:33:05, bbudge wrote: > Should Neon alter fp state? https://codereview.chromium.org/2546933002/diff/120001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:3627: inexact_vfp_flag_ = (abs_diff != 0); yes. On 2016/12/10 21:33:05, bbudge wrote: > Ditto. https://codereview.chromium.org/2546933002/diff/120001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:4121: // s32 -> f32, round towards 0. neon uses round toward nearest for integer to floating point conversion. https://codereview.chromium.org/2546933002/diff/120001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:4123: static_cast<float>(static_cast<int32_t>(q_data[i]))); static_cast<int32_t> is undefined if q_data is bigger than INT32_MAX. use bitcast. https://codereview.chromium.org/2546933002/diff/120001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:4126: // u32 -> f32, round towards 0. neon uses round toward nearest for integer to floating point conversion. https://codereview.chromium.org/2546933002/diff/120001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:4130: // f32 -> s32, round to nearest. neon uses round toward 0 for floating point to integer conversion. https://codereview.chromium.org/2546933002/diff/120001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:4135: // f32 -> u32, round to nearest. as above https://codereview.chromium.org/2546933002/diff/120001/test/cctest/test-assem... File test/cctest/test-assembler-arm.cc (right): https://codereview.chromium.org/2546933002/diff/120001/test/cctest/test-assem... test/cctest/test-assembler-arm.cc:1359: __ vmov(d1, d0); // q0 = [-1, 1, -1, 1] Could you add test with large integer to test the round mode. https://codereview.chromium.org/2546933002/diff/120001/test/cctest/test-assem... test/cctest/test-assembler-arm.cc:1563: CHECK_EQ(-2, t.vcvt_s32_f32[0]); it should return -1 https://codereview.chromium.org/2546933002/diff/120001/test/cctest/test-assem... test/cctest/test-assembler-arm.cc:1566: CHECK_EQ(2, t.vcvt_s32_f32[3]); 1
The CQ bit was checked by bbudge@chromium.org to run a CQ dry run
https://codereview.chromium.org/2546933002/diff/120001/src/arm/disasm-arm.cc File src/arm/disasm-arm.cc (right): https://codereview.chromium.org/2546933002/diff/120001/src/arm/disasm-arm.cc#... src/arm/disasm-arm.cc:1553: out_buffer_pos_ += SNPrintF(out_buffer_ + out_buffer_pos_, On 2016/12/14 12:06:30, Rodolph Perfetta (ARM) wrote: > the assembler allows conditional forms of those instruction so the disassembler > need to handle them. Ugh. It looks like a lot of work to allow Format to handle scalar lane indices. I should have followed your original recommendation to make these unconditional. I'll remove the Condition argument from the assembler functions instead. https://codereview.chromium.org/2546933002/diff/120001/src/arm/disasm-arm.cc#... src/arm/disasm-arm.cc:1572: out_buffer_pos_ += SNPrintF(out_buffer_ + out_buffer_pos_, On 2016/12/14 12:06:30, Rodolph Perfetta (ARM) wrote: > ditto. No longer conditional. https://codereview.chromium.org/2546933002/diff/120001/src/arm/disasm-arm.cc#... src/arm/disasm-arm.cc:1591: out_buffer_pos_ += SNPrintF(out_buffer_ + out_buffer_pos_, On 2016/12/14 12:06:30, Rodolph Perfetta (ARM) wrote: > handle conditional. No longer conditional. https://codereview.chromium.org/2546933002/diff/120001/src/arm/disasm-arm.cc#... src/arm/disasm-arm.cc:1597: SNPrintF(out_buffer_ + out_buffer_pos_, "vmov.%s16 r%d, d%d[%d]", On 2016/12/14 12:06:30, Rodolph Perfetta (ARM) wrote: > ditto. No longer conditional. https://codereview.chromium.org/2546933002/diff/120001/src/arm/simulator-arm.cc File src/arm/simulator-arm.cc (right): https://codereview.chromium.org/2546933002/diff/120001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:3621: inv_op_vfp_flag_ = get_inv_op_vfp_flag(mode, val, unsigned_integer); On 2016/12/14 12:06:31, Rodolph Perfetta (ARM) wrote: > A bit surprising but yes. > > On 2016/12/10 21:33:05, bbudge wrote: > > Should Neon alter fp state? > OK https://codereview.chromium.org/2546933002/diff/120001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:3627: inexact_vfp_flag_ = (abs_diff != 0); On 2016/12/14 12:06:31, Rodolph Perfetta (ARM) wrote: > yes. > > On 2016/12/10 21:33:05, bbudge wrote: > > Ditto. > OK https://codereview.chromium.org/2546933002/diff/120001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:4121: // s32 -> f32, round towards 0. On 2016/12/14 12:06:31, Rodolph Perfetta (ARM) wrote: > neon uses round toward nearest for integer to floating point conversion. I got confused by my comments, so I changed the comments around to match the instruction mnemonic. Done. https://codereview.chromium.org/2546933002/diff/120001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:4123: static_cast<float>(static_cast<int32_t>(q_data[i]))); On 2016/12/14 12:06:31, Rodolph Perfetta (ARM) wrote: > static_cast<int32_t> is undefined if q_data is bigger than INT32_MAX. use > bitcast. Good catch. Done. https://codereview.chromium.org/2546933002/diff/120001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:4126: // u32 -> f32, round towards 0. On 2016/12/14 12:06:30, Rodolph Perfetta (ARM) wrote: > neon uses round toward nearest for integer to floating point conversion. Done. https://codereview.chromium.org/2546933002/diff/120001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:4130: // f32 -> s32, round to nearest. On 2016/12/14 12:06:31, Rodolph Perfetta (ARM) wrote: > neon uses round toward 0 for floating point to integer conversion. Done. https://codereview.chromium.org/2546933002/diff/120001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:4135: // f32 -> u32, round to nearest. On 2016/12/14 12:06:31, Rodolph Perfetta (ARM) wrote: > as above Done. https://codereview.chromium.org/2546933002/diff/120001/test/cctest/test-assem... File test/cctest/test-assembler-arm.cc (right): https://codereview.chromium.org/2546933002/diff/120001/test/cctest/test-assem... test/cctest/test-assembler-arm.cc:1359: __ vmov(d1, d0); // q0 = [-1, 1, -1, 1] On 2016/12/14 12:06:31, Rodolph Perfetta (ARM) wrote: > Could you add test with large integer to test the round mode. Done. https://codereview.chromium.org/2546933002/diff/120001/test/cctest/test-assem... test/cctest/test-assembler-arm.cc:1563: CHECK_EQ(-2, t.vcvt_s32_f32[0]); On 2016/12/14 12:06:31, Rodolph Perfetta (ARM) wrote: > it should return -1 Done. https://codereview.chromium.org/2546933002/diff/120001/test/cctest/test-assem... test/cctest/test-assembler-arm.cc:1566: CHECK_EQ(2, t.vcvt_s32_f32[3]); On 2016/12/14 12:06:31, Rodolph Perfetta (ARM) wrote: > 1 Done.
The CQ bit was checked by bbudge@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
lgtm (only minor nits) you will need an owner approval though. https://codereview.chromium.org/2546933002/diff/140001/src/arm/simulator-arm.cc File src/arm/simulator-arm.cc (right): https://codereview.chromium.org/2546933002/diff/140001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:3617: VFPRoundingMode mode, bool neon) { nit: neon is unused. https://codereview.chromium.org/2546933002/diff/140001/test/cctest/test-assem... File test/cctest/test-assembler-arm.cc (right): https://codereview.chromium.org/2546933002/diff/140001/test/cctest/test-assem... test/cctest/test-assembler-arm.cc:1363: __ mov(r4, Operand(kMaxUInt32)); this will have the same "bit" representation a -1 below, how about a constant with more than 23 bits set to extend the coverage?
The CQ bit was checked by bbudge@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
bbudge@chromium.org changed reviewers: + bmeurer@chromium.org
Thanks Rodolph for the patient and thorough reviews. Hopefully future CLs will be simpler (no more FP - integer conversions). +Benedikt for OWNERS https://codereview.chromium.org/2546933002/diff/140001/src/arm/simulator-arm.cc File src/arm/simulator-arm.cc (right): https://codereview.chromium.org/2546933002/diff/140001/src/arm/simulator-arm.... src/arm/simulator-arm.cc:3617: VFPRoundingMode mode, bool neon) { On 2016/12/14 20:27:07, Rodolph Perfetta (ARM) wrote: > nit: neon is unused. Done. https://codereview.chromium.org/2546933002/diff/140001/test/cctest/test-assem... File test/cctest/test-assembler-arm.cc (right): https://codereview.chromium.org/2546933002/diff/140001/test/cctest/test-assem... test/cctest/test-assembler-arm.cc:1363: __ mov(r4, Operand(kMaxUInt32)); On 2016/12/14 20:27:08, Rodolph Perfetta (ARM) wrote: > this will have the same "bit" representation a -1 below, how about a constant > with more than 23 bits set to extend the coverage? I added a negative number that can't be represented in 23 bits, kMinInt + 1.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
I'm going to test my OWNERS bit.
The CQ bit was checked by bbudge@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from rodolph.perfetta@arm.com Link to the patchset: https://codereview.chromium.org/2546933002/#ps160001 (title: "Fourth review comments.")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
CQ is committing da patch. Bot data: {"patchset_id": 160001, "attempt_start_ts": 1481825619044550, "parent_rev": "250e85f84abd24c4ddab2efc6aa0ddddab09c55d", "commit_rev": "03f33f2e68641903a982a025b0ba645c8ba9f9b3"}
Message was sent while issue was closed.
Description was changed from ========== [Turbofan] Add ARM NEON instructions for implementing SIMD. - Adds NEON instructions to assembler, disassembler, simulator. - Adds ExtractLane, ReplaceLane functions to macro assembler. LOG=N BUG=v8:4124 ========== to ========== [Turbofan] Add ARM NEON instructions for implementing SIMD. - Adds NEON instructions to assembler, disassembler, simulator. - Adds ExtractLane, ReplaceLane functions to macro assembler. LOG=N BUG=v8:4124 Review-Url: https://codereview.chromium.org/2546933002 Cr-Commit-Position: refs/heads/master@{#41737} Committed: https://chromium.googlesource.com/v8/v8/+/03f33f2e68641903a982a025b0ba645c8ba... ==========
Message was sent while issue was closed.
Committed patchset #9 (id:160001) as https://chromium.googlesource.com/v8/v8/+/03f33f2e68641903a982a025b0ba645c8ba... |