|
|
Description
SSE 4.1 SrcOver blits: color32, blitmask.
This is mainly warmup for an AVX2 version.
The machine I'm typing this on just doesn't support AVX2.
This strategy should translate easily down to SSSE3 and SSE2.
Xfermode_SrcOver: 2.73ms -> 2.62ms (0.96x) (That's Color32.)
Xfermode_SrcOver_aa: 3.48ms -> 3.09ms (0.89x) (That's BlitMask_D32_A8.)
AA text blits (text_16_AA_{88,FF,WT,BK}) show speedups in the range of 5 to 20%.
Unlike previous versions of this code, all the div255() are exactly (x+127)/255.
This won't fix any major bugs, but it does correct our bias in the middle.
There will be many diffs, all minor.
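
To make the rounding claim concrete, here is a minimal scalar sketch (not code from this patch). `div255_exact` is the formula quoted above; `div255_shift` is a well-known division-free equivalent, included as an assumption about how such a divide can be done cheaply, not as what the SSE 4.1 code does. The function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// The formula from the description: round-to-nearest division by 255.
// 255 is odd, so x/255 never lands exactly on .5 and there are no ties.
inline uint32_t div255_exact(uint32_t x) {
    return (x + 127) / 255;
}

// Division-free form (illustrative assumption, not taken from the patch).
// Only valid for 0 <= x <= 255*255, the range of a product of two bytes.
inline uint32_t div255_shift(uint32_t x) {
    assert(x <= 255u * 255u);
    uint32_t t = x + 128;
    return (t + (t >> 8)) >> 8;
}

// Brute-force check that both forms agree over the whole byte-product range.
inline bool div255_forms_agree() {
    for (uint32_t x = 0; x <= 255u * 255u; x++) {
        if (div255_exact(x) != div255_shift(x)) {
            return false;
        }
    }
    return true;
}
```

With this rounding, 127 maps to 0 and 128 maps to 1, so the midpoint of each step is split evenly rather than always truncated.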
I've punted for now on pmaddubsw for lerping. I do intend to try that,
but I want this (relatively simple) code as my basis for comparison.
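
For readers who don't have the two benchmarked routines in their head, this is a scalar reference sketch of SrcOver on premultiplied 8-bit pixels, first with a constant color (the Color32 case) and then with an A8 coverage mask (the BlitMask_D32_A8 case). `Pixel`, `srcover`, and `srcover_a8` are hypothetical names, the alpha position in `c[3]` is an assumption, and the patch itself works on several pixels at once with SSE 4.1 and may order the operations differently.

```cpp
#include <cassert>
#include <cstdint>

// Rounded division by 255, as described above. Valid for byte products.
inline uint32_t div255(uint32_t x) {
    assert(x <= 255u * 255u);
    return (x + 127) / 255;
}

// Premultiplied 8888 pixel. Alpha is assumed to live in c[3].
struct Pixel {
    uint8_t c[4];
};

// SrcOver: dst' = src + dst * (255 - srcA) / 255, per channel.
// With valid premultiplied input (each channel <= alpha) the sum fits in a byte.
inline Pixel srcover(Pixel src, Pixel dst) {
    Pixel out;
    uint32_t inv = 255u - src.c[3];
    for (int i = 0; i < 4; i++) {
        out.c[i] = static_cast<uint8_t>(src.c[i] + div255(dst.c[i] * inv));
    }
    return out;
}

// A8 mask variant: scale the source by coverage first, then SrcOver.
inline Pixel srcover_a8(Pixel src, Pixel dst, uint8_t coverage) {
    Pixel scaled;
    for (int i = 0; i < 4; i++) {
        scaled.c[i] = static_cast<uint8_t>(div255(src.c[i] * uint32_t(coverage)));
    }
    return srcover(scaled, dst);
}
```

Coverage 0 leaves the destination untouched and coverage 255 reduces to plain `srcover`, which is a quick sanity check for any faster version.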
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1526883004
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Committed: https://skia.googlesource.com/skia/+/78e0aef610d762009ef7f4eb51a83771443be665
Patch Set 1
Patch Set 2 : blitmask_d32_a8
Patch Set 3 : swizzle
Patch Set 4 : more shuffles
Patch Set 5 : comments etc
Patch Set 6 : tweaks
Patch Set 7 : oops
Patch Set 8 : should technically be next2
Patch Set 9 : very minor
Messages
Total messages: 28 (20 generated)
Description was changed: added the GOLD_TRYBOT_URL and CQ_EXTRA_TRYBOTS lines.
Description was changed: Xfermode_SrcOver_aa went from "TBD" to "3.48ms -> 3.23ms (0.93x) (That's BlitMask_D32_A8.)".
Description was changed: added "Opaque AA text blits (text_16_AA_{FF,WT,BK}) also show speedups in the 0.95x (black) to 0.85x (other opaque) range."
Description was changed: added "Unlike previous versions of this code, all the div255() are exactly (x+127)/255. This won't fix any major bugs, but it does correct our bias in the middle."
Description was changed: Xfermode_SrcOver_aa went from "3.48ms -> 3.23ms (0.93x)" to "3.48ms -> 3.13ms (0.90x)", and the text-blit sentence became "Opaque AA text blits (text_16_AA_{FF,WT,BK}) show speedups similar to SrcOver_aa."
The CQ bit was checked by mtklein@google.com to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1526883004/120001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1526883004/120001
Description was changed: Xfermode_SrcOver_aa went from "3.48ms -> 3.13ms (0.90x)" to "3.48ms -> 3.09ms (0.89x)", and the text-blit sentence gained ", in the range of 5 to 20%."
Description was changed: added "There will be many diffs, all minor."
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by mtklein@google.com to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1526883004/160001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1526883004/160001
Description was changed: added "This strategy should translate easily down to SSSE3 and SSE2."
Description was changed: "Opaque AA text blits (text_16_AA_{FF,WT,BK}) show speedups similar to SrcOver_aa, in the range of 5 to 20%." became "AA text blits (text_16_AA_{88,FF,WT,BK}) show speedups in the range of 5 to 20%."
Description was changed: added "COMMIT=false Don't land until 1523363002 does."
Description was changed: added "I've punted for now on pmaddubsw for lerping. I do intend to try that, but I want this (relatively simple) code as my basis for comparison."
Description was changed: no visible change to the text.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
mtklein@chromium.org changed reviewers: + herb@google.com - mtklein@google.com
This is a start at capitalizing on all that hacking we did the last couple weeks.
lgtm
Description was changed: removed "COMMIT=false Don't land until 1523363002 does."
The CQ bit was checked by mtklein@google.com
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1526883004/160001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1526883004/160001
Message was sent while issue was closed.
Description was changed: added "Committed: https://skia.googlesource.com/skia/+/78e0aef610d762009ef7f4eb51a83771443be665".
Message was sent while issue was closed.
Committed patchset #9 (id:160001) as https://skia.googlesource.com/skia/+/78e0aef610d762009ef7f4eb51a83771443be665 |