Description
SkPx: new approach to fixed-point SIMD
SkPx is like Sk4px, except each platform implementation of SkPx can declare
a different sweet spot of N pixels, with extra loads and stores to handle the
ragged edge of 0<n<N pixels.
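A minimal sketch of the "sweet spot plus ragged edge" idea, assuming a hypothetical PxSketch type with N = 4. The names PxSketch, Load, store, and apply_sketch are illustrative only and are not taken from the real SkPx headers. The main loop moves N pixels at a time, and a single partial load and store covers the leftover 0<n<N pixels.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for one platform's pixel bundle. Not the real SkPx API.
struct PxSketch {
    static constexpr int N = 4;   // this platform's sweet spot, in pixels
    uint32_t px[N];

    // Full-width load: reads exactly N pixels.
    static PxSketch Load(const uint32_t* src) {
        PxSketch p;
        std::memcpy(p.px, src, sizeof(p.px));
        return p;
    }
    // Ragged-edge load: reads only n pixels (0<n<N), zeroing the unused lanes.
    static PxSketch Load(int n, const uint32_t* src) {
        PxSketch p = {};
        std::memcpy(p.px, src, n * sizeof(uint32_t));
        return p;
    }
    // Full-width store: writes exactly N pixels.
    void store(uint32_t* dst) const { std::memcpy(dst, px, sizeof(px)); }
    // Ragged-edge store: writes only the first n pixels.
    void store(int n, uint32_t* dst) const { std::memcpy(dst, px, n * sizeof(uint32_t)); }
};

// Apply fn to count pixels: N at a time, then one partial step for the tail.
template <typename Fn>
void apply_sketch(uint32_t* dst, const uint32_t* src, size_t count, Fn fn) {
    while (count >= static_cast<size_t>(PxSketch::N)) {
        fn(PxSketch::Load(src)).store(dst);
        src += PxSketch::N;
        dst += PxSketch::N;
        count -= PxSketch::N;
    }
    if (count > 0) {
        const int n = static_cast<int>(count);
        fn(PxSketch::Load(n, src)).store(n, dst);
    }
}
```

Under this shape, a platform with a different sweet spot would only change N and the bodies of the four load/store functions; the calling loop stays the same.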
In this case, _sse's sweet spot remains 4 pixels. _neon jumps up to 8 so
we can now use NEON's transposing loads and stores, and _none is just 1.
This makes operations involving alpha considerably more efficient on NEON,
as alpha is its own distinct 8x8 bit plane that's easy to toss around.
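To illustrate why a transposing load helps with alpha, here is a scalar C++ model of what a de-interleaving 4-channel load and store do for 8 pixels (on NEON, the vld4_u8 and vst4_u8 intrinsics do this in hardware). The names Planes8, load_planar8, store_planar8, mul_div255, and premul_sketch are hypothetical, and the sketch assumes alpha is byte 3 of each pixel, which depends on the pixel format. Once loaded, alpha is a separate 8-byte plane that can be applied to each color plane directly, with no shuffling.

```cpp
#include <cstdint>

// Four planes of 8 bytes each: c[k][i] is channel k of pixel i.
struct Planes8 {
    uint8_t c[4][8];
};

// Scalar model of a transposing load: interleaved bytes -> per-channel planes.
Planes8 load_planar8(const uint8_t* bytes) {
    Planes8 p;
    for (int i = 0; i < 8; i++) {
        for (int k = 0; k < 4; k++) {
            p.c[k][i] = bytes[4 * i + k];
        }
    }
    return p;
}

// Scalar model of a transposing store: per-channel planes -> interleaved bytes.
void store_planar8(const Planes8& p, uint8_t* bytes) {
    for (int i = 0; i < 8; i++) {
        for (int k = 0; k < 4; k++) {
            bytes[4 * i + k] = p.c[k][i];
        }
    }
}

// Rounded x*a/255 using only shifts and adds.
uint8_t mul_div255(uint8_t x, uint8_t a) {
    unsigned v = static_cast<unsigned>(x) * a + 128;
    return static_cast<uint8_t>((v + (v >> 8)) >> 8);
}

// Scale the three color planes by the alpha plane (assumed to be plane 3).
Planes8 premul_sketch(Planes8 p) {
    for (int k = 0; k < 3; k++) {
        for (int i = 0; i < 8; i++) {
            p.c[k][i] = mul_div255(p.c[k][i], p.c[3][i]);
        }
    }
    return p;
}
```

With interleaved 4-pixel registers, the same operation first has to broadcast each pixel's alpha byte across its color lanes; in the planar layout that step disappears.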
This incorporates a few other improvements I've been wanting:
- no requirement that we're dealing with SkPMColor. SkColor works too.
- no anonymous namespace hack to differentiate implementations.
Codegen and perf look good on Clang/x86-64 and GCC/ARMv7.
The NEON code looks very similar to the old NEON code, as intended.
No .skp or GM diffs on my laptop; I don't expect any.
I intend this to replace Sk4px. Plan after landing:
- port SkXfermode_opts.h
- port Color32 in SkBlitRow_D32.cpp (and move to SkBlitRow_opts.h like other
SkOpts code)
- delete all Sk4px-related code
- clean up evolutionary dead ends in SkNx (Sk16b, Sk16h, Sk4i, Sk4d, etc.),
leaving Sk2f, Sk4f (and Sk2s, Sk4s).
- find a machine with AVX2 to work on, write SkPx_avx2.h handling 8 pixels
at a time.
In the end we'll have Sk4f for float pixels, SkPx for fixed-point pixels.
BUG=skia:4117
Committed: https://skia.googlesource.com/skia/+/82c93b45ed6ac0b628adb8375389c202d1f586f9
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;client.skia.compile:Build-Mac10.8-Clang-Arm7-Debug-Android-Trybot
Committed: https://skia.googlesource.com/skia/+/a7627dc5cc2bf5d9a95d883d20c40d477ecadadf
Committed: https://skia.googlesource.com/skia/+/e8e17cf23d2a036f9b3050bedeb9d3a544221f4c
Patch Set 1
Patch Set 2 : enough to run
Patch Set 3 : comments, cleanup
Patch Set 4 : _none
Patch Set 5 : tweak
Patch Set 6 : note
Patch Set 7 : draft NEON
Patch Set 8 : compiling
Patch Set 9 : fixes
Total comments: 2
Patch Set 10 : rebase, apis, shifts
Patch Set 11 : debug/release
Patch Set 12 : <<16 is not allowed
Patch Set 13 : shl,shr