Description
I have found a more efficient way to detect alpha values of 1 and 0 (fully opaque and fully transparent) in SSE2. In addition, I found that the lea instruction was stalling on an execution unit, and I rearranged the code to avoid that.
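For readers unfamiliar with the idea, here is a minimal sketch of one way to classify the alphas of four pixels at once using only SSE2. It is not the code from this patch. It assumes 32-bit pixels with alpha in the top byte of each lane, and the names `AlphaClass` and `classify_alphas` are hypothetical.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Hypothetical result type for this sketch.
enum class AlphaClass { kAllTransparent, kAllOpaque, kMixed };

// Classifies four 32-bit pixels. Assumption: alpha occupies bits 24..31
// of each pixel (e.g. 8888 formats with alpha in the high byte).
inline AlphaClass classify_alphas(const uint32_t px[4]) {
    const __m128i pixels =
        _mm_loadu_si128(reinterpret_cast<const __m128i*>(px));
    const __m128i alphaMask = _mm_set1_epi32(static_cast<int>(0xFF000000u));

    // Keep only the alpha byte of each lane.
    const __m128i alphas = _mm_and_si128(pixels, alphaMask);

    // A lane compares equal to the mask only when its alpha is 0xFF.
    // movemask gathers the top bit of all 16 bytes, so 0xFFFF means
    // every lane matched.
    const int opaque =
        _mm_movemask_epi8(_mm_cmpeq_epi32(alphas, alphaMask));
    if (opaque == 0xFFFF) {
        return AlphaClass::kAllOpaque;
    }

    // A lane compares equal to zero only when its alpha is 0x00.
    const int transparent =
        _mm_movemask_epi8(_mm_cmpeq_epi32(alphas, _mm_setzero_si128()));
    if (transparent == 0xFFFF) {
        return AlphaClass::kAllTransparent;
    }

    return AlphaClass::kMixed;
}
```

A blend loop could use the result to skip fully transparent groups, copy fully opaque groups, and fall back to the full blend only for mixed groups. SSE4.1 can do a similar test with `_mm_testz_si128` and `_mm_testc_si128`, which is why the SSE2 path needs a different approach.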
Before
1,362.01 LinearSrcOvericonstrip.pngVSkOptsSSE41
2,132.54 LinearSrcOvericonstrip.pngVSkOptsDefault
1,717.77 LinearSrcOvericonstrip.pngVSkOptsNonSimdCore
3,525.14 LinearSrcOvericonstrip.pngVSkOptsTrivial
11,181.78 LinearSrcOvericonstrip.pngVSkOptsBruteForce
644.77 LinearSrcOvermandrill_512.pngVSkOptsSSE41
682.51 LinearSrcOvermandrill_512.pngVSkOptsDefault
1,169.65 LinearSrcOvermandrill_512.pngVSkOptsNonSimdCore
2,486.45 LinearSrcOvermandrill_512.pngVSkOptsTrivial
11,635.94 LinearSrcOvermandrill_512.pngVSkOptsBruteForce
217.76 LinearSrcOverplane.pngVSkOptsSSE41
437.09 LinearSrcOverplane.pngVSkOptsDefault
275.91 LinearSrcOverplane.pngVSkOptsNonSimdCore
481.70 LinearSrcOverplane.pngVSkOptsTrivial
1,504.66 LinearSrcOverplane.pngVSkOptsBruteForce
323.90 LinearSrcOverbaby_tux.pngVSkOptsSSE41
497.49 LinearSrcOverbaby_tux.pngVSkOptsDefault
456.08 LinearSrcOverbaby_tux.pngVSkOptsNonSimdCore
786.46 LinearSrcOverbaby_tux.pngVSkOptsTrivial
2,554.65 LinearSrcOverbaby_tux.pngVSkOptsBruteForce
484.83 LinearSrcOveryellow_rose.pngVSkOptsSSE41
821.86 LinearSrcOveryellow_rose.pngVSkOptsDefault
655.37 LinearSrcOveryellow_rose.pngVSkOptsNonSimdCore
1,323.80 LinearSrcOveryellow_rose.pngVSkOptsTrivial
5,802.61 LinearSrcOveryellow_rose.pngVSkOptsBruteForce
After changes to sse2 and sse4.1
1,343.12 LinearSrcOvericonstrip.pngVSkOptsSSE41
1,441.17 LinearSrcOvericonstrip.pngVSkOptsDefault
1,679.97 LinearSrcOvericonstrip.pngVSkOptsNonSimdCore
3,481.05 LinearSrcOvericonstrip.pngVSkOptsTrivial
10,979.99 LinearSrcOvericonstrip.pngVSkOptsBruteForce
574.17 LinearSrcOvermandrill_512.pngVSkOptsSSE41
641.40 LinearSrcOvermandrill_512.pngVSkOptsDefault
1,169.44 LinearSrcOvermandrill_512.pngVSkOptsNonSimdCore
2,359.84 LinearSrcOvermandrill_512.pngVSkOptsTrivial
12,106.02 LinearSrcOvermandrill_512.pngVSkOptsBruteForce
209.95 LinearSrcOverplane.pngVSkOptsSSE41
249.12 LinearSrcOverplane.pngVSkOptsDefault
270.36 LinearSrcOverplane.pngVSkOptsNonSimdCore
466.30 LinearSrcOverplane.pngVSkOptsTrivial
1,431.14 LinearSrcOverplane.pngVSkOptsBruteForce
309.70 LinearSrcOverbaby_tux.pngVSkOptsSSE41
354.86 LinearSrcOverbaby_tux.pngVSkOptsDefault
442.69 LinearSrcOverbaby_tux.pngVSkOptsNonSimdCore
764.12 LinearSrcOverbaby_tux.pngVSkOptsTrivial
2,756.16 LinearSrcOverbaby_tux.pngVSkOptsBruteForce
457.70 LinearSrcOveryellow_rose.pngVSkOptsSSE41
500.50 LinearSrcOveryellow_rose.pngVSkOptsDefault
677.84 LinearSrcOveryellow_rose.pngVSkOptsNonSimdCore
1,301.50 LinearSrcOveryellow_rose.pngVSkOptsTrivial
5,786.40 LinearSrcOveryellow_rose.pngVSkOptsBruteForce
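To compare the two listings, the sketch below computes the before/after ratio for the SkOptsDefault rows, using the figures copied from above. It assumes the numbers are timings, so lower is better and a ratio above 1 means the change helped. The names `Row`, `kDefaultRows`, `speedup`, and `print_speedups` are hypothetical helpers.

```cpp
#include <cstdio>

// One benchmark image with its SkOptsDefault figure before and after.
struct Row {
    const char* image;
    double before;
    double after;
};

// Values copied from the "Before" and "After" listings for SkOptsDefault.
static const Row kDefaultRows[] = {
    {"iconstrip.png",    2132.54, 1441.17},
    {"mandrill_512.png",  682.51,  641.40},
    {"plane.png",         437.09,  249.12},
    {"baby_tux.png",      497.49,  354.86},
    {"yellow_rose.png",   821.86,  500.50},
};

// Ratio of old figure to new figure; above 1.0 means the new code is
// faster, assuming the figures are timings.
inline double speedup(const Row& r) {
    return r.before / r.after;
}

// Prints one line per image with its ratio.
inline void print_speedups() {
    for (const Row& r : kDefaultRows) {
        std::printf("%-18s %.2fx\n", r.image, speedup(r));
    }
}
```

The same calculation applies to the SkOptsSSE41 rows by substituting their figures.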
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1998373002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Committed: https://skia.googlesource.com/skia/+/074b48ecb5ed8f9b25039477794437ae853d85c4