Issue 1565223002: Clean up SkXfermode_opts.h

mtklein_C

Description was changed from ========== Clean up SkXfermode_opts.h - Inline transfermode functions. This removes the ...

4 years, 11 months ago (2016-01-07 18:25:31 UTC) #1

mtklein_C

Description was changed from ========== Clean up SkXfermode_opts.h - Inline transfermode functions. This removes the ...

4 years, 11 months ago (2016-01-07 18:42:48 UTC) #2

mtklein_C

Description was changed from ========== Clean up SkXfermode_opts.h - Inline transfermode functions. This removes the ...

4 years, 11 months ago (2016-01-07 18:43:19 UTC) #3

mtklein_C

Description was changed from ========== Clean up SkXfermode_opts.h - Inline transfermode functions. This removes the ...

4 years, 11 months ago (2016-01-07 18:49:29 UTC) #4

mtklein_C

Description was changed from ========== Clean up SkXfermode_opts.h - Inline transfermode functions. This removes the ...

4 years, 11 months ago (2016-01-07 18:50:08 UTC) #5

mtklein_C

Description was changed from ========== Clean up SkXfermode_opts.h - Inline transfermode functions. This removes the ...

4 years, 11 months ago (2016-01-07 18:56:56 UTC) #6

mtklein_C

4 years, 11 months ago (2016-01-07 19:58:51 UTC) #7

Description was changed from

==========
Clean up SkXfermode_opts.h

Seems like MSVC + __vectorcall don't play well together,
so back ourselves out into a situation where we don't need it.

   - Inline transfermode functions.  This removes the need for SK_VECTORCALL.
   - Remove 565 destination specializations.
     Blending into 565 is not speed-critical enough to merit the code bloat.
   - Removing 565 specializations means a bunch of Sk4px code is now dead.

8888 xfermodes generally speed up a bit from inlining, smoothly ranging from no
change down to 0.65x for the fastest functions like Plus or Modulate.

565 xfermodes generally slow down because we're doing them one pixel at a time,
smoothly ranging from no change up to 3.8x for the slowest functions like
Multiply and HardLight.

BUG=skia:4765
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

to

==========
Clean up SkXfermode_opts.h

Seems like MSVC + __vectorcall don't play well together,
so back ourselves out into a situation where we don't need it.

   - Inline transfermode functions.  This removes the need for SK_VECTORCALL.
   - Remove 565 destination specializations.
     Blending into 565 is not speed-critical enough to merit the code bloat.
   - Removing 565 specializations means a bunch of Sk4px code is now dead.

8888 xfermodes generally speed up a bit from inlining, smoothly ranging from no
change down to 0.65x for the fastest functions like Plus or Modulate.

565 xfermodes generally slow down because we're doing 565 -> 8888 and 8888->565
conversion serially, smoothly ranging from no change up to 2x slower for the
fastest functions like Plus and Modulate.

BUG=skia:4765
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

mtklein_C

4 years, 11 months ago (2016-01-07 20:00:46 UTC) #8

Description was changed from

==========
Clean up SkXfermode_opts.h

Seems like MSVC + __vectorcall don't play well together,
so back ourselves out into a situation where we don't need it.

   - Inline transfermode functions.  This removes the need for SK_VECTORCALL.
   - Remove 565 destination specializations.
     Blending into 565 is not speed-critical enough to merit the code bloat.
   - Removing 565 specializations means a bunch of Sk4px code is now dead.

8888 xfermodes generally speed up a bit from inlining, smoothly ranging from no
change down to 0.65x for the fastest functions like Plus or Modulate.

565 xfermodes generally slow down because we're doing 565 -> 8888 and 8888->565
conversion serially, smoothly ranging from no change up to 2x slower for the
fastest functions like Plus and Modulate.

BUG=skia:4765
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

to

==========
Clean up SkXfermode_opts.h

Seems like MSVC + __vectorcall don't play well together,
so back ourselves out into a situation where we don't need it.

   - Inline transfermode functions.  This removes the need for SK_VECTORCALL.
   - Remove 565 destination specializations.
     Blending into 565 is not speed-critical enough to merit the code bloat.
   - Removing 565 specializations means a bunch of Sk4px code is now dead.

8888 xfermodes generally speed up a bit from inlining, smoothly ranging from no
change down to 0.65x for the fastest functions like Plus or Modulate.

565 xfermodes generally slow down because we're doing 565 -> 8888 and 8888->565
conversion serially* on the stack, smoothly ranging from no change up to 2x
slower for the fastest functions like Plus and Modulate.

* the 565->8888 conversion is actually being autovectorized

BUG=skia:4765
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

mtklein_C

4 years, 11 months ago (2016-01-07 20:00:57 UTC) #9

Description was changed from

==========
Clean up SkXfermode_opts.h

Seems like MSVC + __vectorcall don't play well together,
so back ourselves out into a situation where we don't need it.

   - Inline transfermode functions.  This removes the need for SK_VECTORCALL.
   - Remove 565 destination specializations.
     Blending into 565 is not speed-critical enough to merit the code bloat.
   - Removing 565 specializations means a bunch of Sk4px code is now dead.

8888 xfermodes generally speed up a bit from inlining, smoothly ranging from no
change down to 0.65x for the fastest functions like Plus or Modulate.

565 xfermodes generally slow down because we're doing 565 -> 8888 and 8888->565
conversion serially* on the stack, smoothly ranging from no change up to 2x
slower for the fastest functions like Plus and Modulate.

* the 565->8888 conversion is actually being autovectorized

BUG=skia:4765
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

to

==========
Clean up SkXfermode_opts.h

Seems like MSVC + __vectorcall don't play well together,
so back ourselves out into a situation where we don't need it.

   - Inline transfermode functions.  This removes the need for SK_VECTORCALL.
   - Remove 565 destination specializations.
     Blending into 565 is not speed-critical enough to merit the code bloat.
   - Removing 565 specializations means a bunch of Sk4px code is now dead.

8888 xfermodes generally speed up a bit from inlining, smoothly ranging from no
change down to 0.65x for the fastest functions like Plus or Modulate.

565 xfermodes generally slow down because we're doing 565 -> 8888 and 8888->565
conversion serially[1] on the stack, smoothly ranging from no change up to 2x
slower for the fastest functions like Plus and Modulate.

[1] the 565->8888 conversion is actually being autovectorized

BUG=skia:4765
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

mtklein_C

4 years, 11 months ago (2016-01-07 20:01:11 UTC) #10

Description was changed from

==========
Clean up SkXfermode_opts.h

Seems like MSVC + __vectorcall don't play well together,
so back ourselves out into a situation where we don't need it.

   - Inline transfermode functions.  This removes the need for SK_VECTORCALL.
   - Remove 565 destination specializations.
     Blending into 565 is not speed-critical enough to merit the code bloat.
   - Removing 565 specializations means a bunch of Sk4px code is now dead.

8888 xfermodes generally speed up a bit from inlining, smoothly ranging from no
change down to 0.65x for the fastest functions like Plus or Modulate.

565 xfermodes generally slow down because we're doing 565 -> 8888 and 8888->565
conversion serially[1] on the stack, smoothly ranging from no change up to 2x
slower for the fastest functions like Plus and Modulate.

[1] the 565->8888 conversion is actually being autovectorized

BUG=skia:4765
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

to

==========
Clean up SkXfermode_opts.h

Seems like MSVC + __vectorcall don't play well together,
so back ourselves out into a situation where we don't need it.

   - Inline transfermode functions.  This removes the need for SK_VECTORCALL.
   - Remove 565 destination specializations.
     Blending into 565 is not speed-critical enough to merit the code bloat.
   - Removing 565 specializations means a bunch of Sk4px code is now dead.

8888 xfermodes generally speed up a bit from inlining, smoothly ranging from no
change down to 0.65x for the fastest functions like Plus or Modulate.

565 xfermodes generally slow down because we're doing 565 -> 8888 and 8888->565
conversion serially[1] and using the stack, smoothly ranging from no change up
to 2x slower for the fastest functions like Plus and Modulate.

[1] the 565->8888 conversion is actually being autovectorized

BUG=skia:4765
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

mtklein_C

4 years, 11 months ago (2016-01-07 20:03:14 UTC) #11

Description was changed from

==========
Clean up SkXfermode_opts.h

Seems like MSVC + __vectorcall don't play well together,
so back ourselves out into a situation where we don't need it.

   - Inline transfermode functions.  This removes the need for SK_VECTORCALL.
   - Remove 565 destination specializations.
     Blending into 565 is not speed-critical enough to merit the code bloat.
   - Removing 565 specializations means a bunch of Sk4px code is now dead.

8888 xfermodes generally speed up a bit from inlining, smoothly ranging from no
change down to 0.65x for the fastest functions like Plus or Modulate.

565 xfermodes generally slow down because we're doing 565 -> 8888 and 8888->565
conversion serially[1] and using the stack, smoothly ranging from no change up
to 2x slower for the fastest functions like Plus and Modulate.

[1] the 565->8888 conversion is actually being autovectorized

BUG=skia:4765
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

to

==========
Clean up SkXfermode_opts.h

It seems that MSVC + __vectorcall don't play well together,
so back ourselves out into a situation where we don't need it.

   - Inline transfermode functions.  This removes the need for SK_VECTORCALL.
   - Remove 565 destination specializations.
     Blending into 565 is not speed-critical enough to merit the code bloat.
   - Removing 565 specializations means a bunch of Sk4px code is now dead.

8888 xfermodes generally speed up a bit from inlining, smoothly ranging from no
change down to 0.65x for the fastest functions like Plus or Modulate.

565 xfermodes generally slow down because we're doing 565 -> 8888 and 8888->565
conversion serially[1] and using the stack, smoothly ranging from no change up
to 2x slower for the fastest functions like Plus and Modulate.

[1] the 565->8888 conversion is actually being autovectorized

BUG=skia:4765
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

mtklein_C

The CQ bit was checked by mtklein@chromium.org to run a CQ dry run

4 years, 11 months ago (2016-01-07 20:09:11 UTC) #12

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1565223002/60001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1565223002/60001

4 years, 11 months ago (2016-01-07 20:09:18 UTC) #13

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

4 years, 11 months ago (2016-01-07 20:11:48 UTC) #14

commit-bot: I haz the power

Dry run: Try jobs failed on following builders: Build-Win-MSVC-x86-Debug-Trybot on client.skia.compile (JOB_FAILED, http://build.chromium.org/p/client.skia.compile/builders/Build-Win-MSVC-x86-Debug-Trybot/builds/5151)

4 years, 11 months ago (2016-01-07 20:11:48 UTC) #15

mtklein_C

The CQ bit was checked by mtklein@chromium.org to run a CQ dry run

4 years, 11 months ago (2016-01-07 20:17:36 UTC) #16

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1565223002/80001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1565223002/80001

4 years, 11 months ago (2016-01-07 20:17:38 UTC) #17

mtklein_C

mtklein@chromium.org changed reviewers: + lsalzman@mozilla.com, reed@google.com

4 years, 11 months ago (2016-01-07 20:28:33 UTC) #18

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

4 years, 11 months ago (2016-01-07 20:40:52 UTC) #20

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

4 years, 11 months ago (2016-01-07 20:40:53 UTC) #21

lsalzman1

Any reason why the lambdas in the Map calls still need to be there? The ...

4 years, 11 months ago (2016-01-07 21:04:56 UTC) #22

mtklein

On 2016/01/07 21:04:56, lsalzman1 wrote: > Any reason why the lambdas in the Map calls ...

4 years, 11 months ago (2016-01-07 21:06:45 UTC) #23

mtklein

4 years, 11 months ago (2016-01-08 18:45:02 UTC) #24

Description was changed from

==========
Clean up SkXfermode_opts.h

It seems that MSVC + __vectorcall don't play well together,
so back ourselves out into a situation where we don't need it.

   - Inline transfermode functions.  This removes the need for SK_VECTORCALL.
   - Remove 565 destination specializations.
     Blending into 565 is not speed-critical enough to merit the code bloat.
   - Removing 565 specializations means a bunch of Sk4px code is now dead.

8888 xfermodes generally speed up a bit from inlining, smoothly ranging from no
change down to 0.65x for the fastest functions like Plus or Modulate.

565 xfermodes generally slow down because we're doing 565 -> 8888 and 8888->565
conversion serially[1] and using the stack, smoothly ranging from no change up
to 2x slower for the fastest functions like Plus and Modulate.

[1] the 565->8888 conversion is actually being autovectorized

BUG=skia:4765
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

to

==========
Clean up SkXfermode_opts.h

It seems that MSVC + __vectorcall don't play well together,
so back ourselves out into a situation where we don't need it.

   - Inline transfermode functions.  This removes the need for SK_VECTORCALL.
   - Remove 565 destination specializations.
     Blending into 565 is not speed-critical enough to merit the code bloat.
   - Removing 565 specializations means a bunch of Sk4px code is now dead.

8888 xfermodes generally speed up a bit from inlining, smoothly ranging from no
change down to 0.65x for the fastest functions like Plus or Modulate.

565 xfermodes generally slow down because we're doing 565 -> 8888 and 8888->565
conversion serially[1] and using the stack, smoothly ranging from no change up
to 2x slower for the fastest functions like Plus and Modulate.

[1] the 565->8888 conversion is actually being autovectorized

BUG=skia:4765,skia:4776
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1565223002/80001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1565223002/80001

4 years, 11 months ago (2016-01-08 19:44:28 UTC) #27

mtklein

4 years, 11 months ago (2016-01-08 19:45:15 UTC) #28

Description was changed from

==========
Clean up SkXfermode_opts.h

It seems that MSVC + __vectorcall don't play well together,
so back ourselves out into a situation where we don't need it.

   - Inline transfermode functions.  This removes the need for SK_VECTORCALL.
   - Remove 565 destination specializations.
     Blending into 565 is not speed-critical enough to merit the code bloat.
   - Removing 565 specializations means a bunch of Sk4px code is now dead.

8888 xfermodes generally speed up a bit from inlining, smoothly ranging from no
change down to 0.65x for the fastest functions like Plus or Modulate.

565 xfermodes generally slow down because we're doing 565 -> 8888 and 8888->565
conversion serially[1] and using the stack, smoothly ranging from no change up
to 2x slower for the fastest functions like Plus and Modulate.

[1] the 565->8888 conversion is actually being autovectorized

BUG=skia:4765,skia:4776
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

to

==========
Clean up SkXfermode_opts.h

It seems that MSVC + __vectorcall don't play well together,
so back ourselves out into a situation where we don't need it.

   - Inline transfermode functions.  This removes the need for SK_VECTORCALL.
   - Remove 565 destination specializations.
     Blending into 565 is not speed-critical enough to merit the code bloat.
   - Removing 565 specializations means a bunch of Sk4px code is now dead.

8888 xfermodes generally speed up a bit from inlining, smoothly ranging from no
change down to 0.65x for the fastest functions like Plus or Modulate.

565 xfermodes generally slow down because we're doing 565 -> 8888 and 8888->565
conversion serially[1] and using the stack, smoothly ranging from no change up
to 2x slower for the fastest functions like Plus and Modulate.

[1] the 565->8888 conversion is actually being autovectorized

BUG=skia:4765,skia:4776
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

No public API changes.
TBR=reed@google.com
==========

commit-bot: I haz the power

4 years, 11 months ago (2016-01-08 19:45:23 UTC) #29

Message was sent while issue was closed.

Description was changed from

==========
Clean up SkXfermode_opts.h

It seems that MSVC + __vectorcall don't play well together,
so back ourselves out into a situation where we don't need it.

   - Inline transfermode functions.  This removes the need for SK_VECTORCALL.
   - Remove 565 destination specializations.
     Blending into 565 is not speed-critical enough to merit the code bloat.
   - Removing 565 specializations means a bunch of Sk4px code is now dead.

8888 xfermodes generally speed up a bit from inlining, smoothly ranging from no
change down to 0.65x for the fastest functions like Plus or Modulate.

565 xfermodes generally slow down because we're doing 565 -> 8888 and 8888->565
conversion serially[1] and using the stack, smoothly ranging from no change up
to 2x slower for the fastest functions like Plus and Modulate.

[1] the 565->8888 conversion is actually being autovectorized

BUG=skia:4765,skia:4776
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

No public API changes.
TBR=reed@google.com
==========

to

==========
Clean up SkXfermode_opts.h

It seems that MSVC + __vectorcall don't play well together,
so back ourselves out into a situation where we don't need it.

   - Inline transfermode functions.  This removes the need for SK_VECTORCALL.
   - Remove 565 destination specializations.
     Blending into 565 is not speed-critical enough to merit the code bloat.
   - Removing 565 specializations means a bunch of Sk4px code is now dead.

8888 xfermodes generally speed up a bit from inlining, smoothly ranging from no
change down to 0.65x for the fastest functions like Plus or Modulate.

565 xfermodes generally slow down because we're doing 565 -> 8888 and 8888->565
conversion serially[1] and using the stack, smoothly ranging from no change up
to 2x slower for the fastest functions like Plus and Modulate.

[1] the 565->8888 conversion is actually being autovectorized

BUG=skia:4765,skia:4776
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

No public API changes.
TBR=reed@google.com

Committed:
https://skia.googlesource.com/skia/+/defa0daa6a0f4e97a3527a522ae602c6771a7c80
==========
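
Illustration (not taken from the patch): the description says blending into 565 now converts 565 -> 8888, blends, and converts 8888 -> 565 one pixel at a time, with the blend function inlined rather than called through SK_VECTORCALL. The sketch below shows that shape under stated assumptions. The names RGBA8, expand565, pack565, blend_plus and blend_into_565 are hypothetical and are not Skia's API, and the bit-replication expansion is one common choice, not necessarily what SkXfermode_opts.h does.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical 8-bit-per-channel pixel; stands in for an 8888 color.
struct RGBA8 { uint8_t r, g, b, a; };

// 565 -> 8888: widen each channel by replicating its high bits into the
// low bits, so 0x1F maps to 0xFF. 565 has no alpha, so treat it as opaque.
inline RGBA8 expand565(uint16_t p) {
    uint8_t r5 = (p >> 11) & 0x1F;
    uint8_t g6 = (p >> 5) & 0x3F;
    uint8_t b5 = p & 0x1F;
    return { uint8_t((r5 << 3) | (r5 >> 2)),
             uint8_t((g6 << 2) | (g6 >> 4)),
             uint8_t((b5 << 3) | (b5 >> 2)),
             0xFF };
}

// 8888 -> 565: keep the top 5/6/5 bits of each channel and drop alpha.
inline uint16_t pack565(RGBA8 c) {
    return uint16_t(((c.r >> 3) << 11) | ((c.g >> 2) << 5) | (c.b >> 3));
}

// A Plus-style blend: per-channel saturating add of source and destination.
inline RGBA8 blend_plus(RGBA8 s, RGBA8 d) {
    auto sat = [](int v) { return uint8_t(std::min(v, 255)); };
    return { sat(s.r + d.r), sat(s.g + d.g), sat(s.b + d.b), sat(s.a + d.a) };
}

// Serial 565 path: expand, blend, and repack one pixel per iteration.
// The blend is a template parameter so the compiler can inline it,
// which is why no special calling convention is needed for it.
template <typename Blend>
void blend_into_565(uint16_t* dst, const RGBA8* src, int n, Blend blend) {
    for (int i = 0; i < n; i++) {
        dst[i] = pack565(blend(src[i], expand565(dst[i])));
    }
}
```

Usage would look like blend_into_565(dst, src, n, blend_plus). Because the loop handles one pixel per iteration, any vectorization is left to the compiler, which is consistent with the footnote that the 565->8888 step was being autovectorized.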

commit-bot: I haz the power

4 years, 11 months ago (2016-01-08 19:45:24 UTC) #30

Message was sent while issue was closed.

Committed patchset #5 (id:80001) as
https://skia.googlesource.com/skia/+/defa0daa6a0f4e97a3527a522ae602c6771a7c80

Message was sent while issue was closed.

Patchset #6 (id:100001) has been deleted

Issue 1565223002: Clean up SkXfermode_opts.h (Closed)

Patch Set 1

Patch Set 2 : names and refactoring

Patch Set 3 : maybe we dont need to throw away all perf

Patch Set 4 : typo

Patch Set 5 : &&&&&&