Description

SkFloatConvert
This is SkFloatToHalf and SkHalfToFloat plus a little bit of state
to know whether we have fast f16<->f32 instructions at runtime.
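As a rough illustration of the two pieces described here (the conversions and the runtime "do we have fast instructions" state), here is a minimal portable sketch. It is not the code in this CL: the names `cpu_has_f16c`, `half_to_float_portable`, and `float_to_half_portable` are hypothetical, and the CPU check only reads the F16C bit from CPUID without confirming OS support for AVX state, which a real check would likely also want.

```cpp
#include <cstdint>
#include <cstring>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <cpuid.h>
#endif

// Hypothetical helper: true if CPUID leaf 1 reports F16C (ECX bit 29).
// Simplification: does not verify that the OS saves AVX register state.
static bool cpu_has_f16c() {
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        return ((ecx >> 29) & 1u) != 0;
    }
#endif
    return false;
}

// Scalar fallback: IEEE 754 binary16 bits -> float.
static float half_to_float_portable(uint16_t h) {
    uint32_t sign = static_cast<uint32_t>(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t man  = h & 0x3FFu;
    uint32_t bits;
    if (exp == 0) {
        if (man == 0) {
            bits = sign;                                  // signed zero
        } else {
            int shift = 0;                                // normalize a subnormal
            while (!(man & 0x400u)) { man <<= 1; ++shift; }
            man &= 0x3FFu;
            bits = sign | (static_cast<uint32_t>(113 - shift) << 23) | (man << 13);
        }
    } else if (exp == 31) {
        bits = sign | 0x7F800000u | (man << 13);          // inf or NaN
    } else {
        bits = sign | ((exp + 112u) << 23) | (man << 13); // rebias 15 -> 127
    }
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Scalar fallback: float -> binary16 bits, round to nearest even
// (the same rounding that vcvtps2ph selects with immediate 0).
static uint16_t float_to_half_portable(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    uint32_t sign = (bits >> 16) & 0x8000u;
    uint32_t mag  = bits & 0x7FFFFFFFu;
    if (mag >= 0x7F800000u) {                             // inf or NaN
        return static_cast<uint16_t>(sign | 0x7C00u | (mag > 0x7F800000u ? 0x200u : 0u));
    }
    if (mag >= 0x477FF000u) {                             // >= 65520 rounds to inf
        return static_cast<uint16_t>(sign | 0x7C00u);
    }
    uint32_t e = mag >> 23;
    if (e < 102) {                                        // below 2^-25: rounds to zero
        return static_cast<uint16_t>(sign);
    }
    uint32_t h, rem, half;
    if (e < 113) {                                        // result is a subnormal half
        uint32_t m = (mag & 0x7FFFFFu) | 0x800000u;
        uint32_t s = 126 - e;                             // shift in [14, 24]
        h    = m >> s;
        rem  = m & ((1u << s) - 1u);
        half = 1u << (s - 1);
    } else {                                              // normal half
        h    = ((e - 112u) << 10) | ((mag & 0x7FFFFFu) >> 13);
        rem  = mag & 0x1FFFu;
        half = 0x1000u;
    }
    if (rem > half || (rem == half && (h & 1u))) ++h;     // ties to even
    return static_cast<uint16_t>(sign | h);
}
```

A caller could evaluate `cpu_has_f16c()` once, cache the result, and use it to choose between these scalar routines and an F16C path.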
On my laptop with f16c instructions, xferu64_srcover_bw_N_alpha_f16 runtime drops from 1.99ms to 1.09ms.
Here's the new disassembly:
+0x90 movups (%rdx), %xmm1
+0x93 movq (%rsi), %xmm2
+0x97 vcvtph2ps %xmm2, %xmm2
+0x9c movaps %xmm1, %xmm3
+0x9f shufps $255, %xmm3, %xmm3
+0xa3 movaps %xmm0, %xmm4
+0xa6 subps %xmm3, %xmm4
+0xa9 mulps %xmm2, %xmm4
+0xac addps %xmm1, %xmm4
+0xaf vcvtps2ph $0, %xmm4, %xmm4
+0xb5 movq %xmm4, (%rsi)
+0xb9 addq $16, %rdx
+0xbd addq $8, %rsi
+0xc1 decl %ecx
+0xc3 jne +0x90
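For readers who prefer source to assembly, the loop above appears to compute a src-over blend of one float pixel onto one f16 pixel per iteration. The scalar sketch below is a reading of the disassembly, not code from this CL. It assumes %xmm0 holds 1.0 in every lane and that the source is premultiplied; the function name and the converter function-pointer types are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical converter signatures; any f16<->f32 pair can be plugged in,
// whether a scalar fallback or a hardware-backed one.
using HalfToFloatFn = float (*)(uint16_t);
using FloatToHalfFn = uint16_t (*)(float);

// Src-over for n pixels: dst = src + dst * (1 - src_alpha).
// src: RGBA as 4 floats per pixel (16 bytes, matching the load from (%rdx)).
// dst: RGBA as 4 halfs per pixel (8 bytes, matching the load/store at (%rsi)).
// Assumes premultiplied source and that the constant in %xmm0 is 1.0.
static void srcover_f32_onto_f16(uint16_t* dst, const float* src, size_t n,
                                 HalfToFloatFn to_float, FloatToHalfFn to_half) {
    for (size_t i = 0; i < n; ++i, src += 4, dst += 4) {
        float inv_alpha = 1.0f - src[3];          // shufps $255 + subps
        for (int c = 0; c < 4; ++c) {
            float d = to_float(dst[c]);           // vcvtph2ps
            float r = d * inv_alpha + src[c];     // mulps + addps
            dst[c] = to_half(r);                  // vcvtps2ph $0
        }
    }
}
```

Passing the converters in as parameters keeps the blend logic independent of whether the fast instructions are available.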
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1884683002