Description
3-15% speedup to HardLight / Overlay xfermodes.
While investigating my bug (skia:4052) I saw this TODO and figured
it'd make me feel better about an otherwise unsuccessful investigation.
This speeds up HardLight and Overlay (same code) by about 15% with SSE, mostly
by rewriting the logic from 1 cheap comparison and 2 expensive div255() calls
to 2 cheap comparisons and 1 expensive div255().
NEON speeds up by a more modest ~3%.
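The rewrite described above can be illustrated with a scalar sketch. This is not the Skia SIMD code from the patch: the function names, the exact rounding in `div255`, and the use of plain `int` arithmetic are assumptions made for illustration, and the blend math is the usual premultiplied hard-light formula. It only shows why selecting before dividing gives the same result as dividing both candidates and selecting afterwards; it does not model how many comparisons the vector version needs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Rounded division by 255, standing in for the expensive div255() step.
// (Assumed rounding; the real SIMD helper may approximate this differently.)
static int div255(int v) { return (v + 127) / 255; }

// Old shape: both candidates are divided, then one is selected.
static uint8_t hardlight_two_divs(int sc, int dc, int sa, int da) {
    int both = sc * (255 - da) + dc * (255 - sa);    // term shared by both cases
    int dark = 2 * sc * dc;                          // used when 2*sc <= sa
    int lite = sa * da - 2 * (da - dc) * (sa - sc);  // used when 2*sc >  sa
    int ifLite = div255(lite + both);                // expensive
    int ifDark = div255(dark + both);                // expensive
    bool isLite = 2 * sc > sa;
    return static_cast<uint8_t>(std::min(255, isLite ? ifLite : ifDark));
}

// New shape: select on the wide pre-division values, then divide once.
static uint8_t hardlight_one_div(int sc, int dc, int sa, int da) {
    int both = sc * (255 - da) + dc * (255 - sa);
    int dark = 2 * sc * dc;
    int lite = sa * da - 2 * (da - dc) * (sa - sc);
    bool isLite = 2 * sc > sa;
    int picked = isLite ? lite : dark;
    return static_cast<uint8_t>(std::min(255, div255(both + picked)));  // one div255
}

// Compares the two shapes over a grid of premultiplied inputs (sc <= sa, dc <= da).
static bool variants_agree() {
    for (int sa = 0; sa <= 255; sa += 15)
        for (int da = 0; da <= 255; da += 15)
            for (int sc = 0; sc <= sa; sc += 3)
                for (int dc = 0; dc <= da; dc += 3)
                    if (hardlight_two_divs(sc, dc, sa, da) !=
                        hardlight_one_div(sc, dc, sa, da))
                        return false;
    return true;
}
```

Calling `assert(variants_agree());` from a test is enough to confirm the two shapes match on this grid. Overlay is the same function with the source and destination arguments swapped, which is consistent with the description's note that the two modes share code.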
BUG=skia:
Committed: https://skia.googlesource.com/skia/+/4be181e304d2b280c6801bd13369cfba236d1a66
Patch Set 1
Patch Set 2 : neon + portable
Patch Set 3 : more readable
Patch Set 4 : LoHi is more consistent with other lo-hi ordering.
Patch Set 5 : Add a test that widenLo() | widenHi() == widenLoHi()
Messages
Total messages: 14 (6 generated)