|
|
Descriptionexperimental speedup some xfermodes with Sk4f
Old:
7M 1 11.1ms 11.3ms 11.3ms 11.6ms 1% ▅▄▂▂▁▁▄▄█▇ 8888 Xfermode_Screen
7M 1 10.7ms 10.9ms 10.9ms 11.1ms 1% ▄▄▄▇▃▁█▄▂▅ 8888 Xfermode_Modulate
7M 1 7.86ms 8.03ms 8ms 8.18ms 1% █▇▅▁▃▃▂▃▆▅ 8888 Xfermode_Plus
7M 1 14.6ms 14.8ms 14.8ms 15.1ms 1% ▄█▆▅▄▁▁▆▄▆ 8888 Xfermode_Xor
7M 1 13ms 13.5ms 13.4ms 13.8ms 2% ▅▃▇▁█▂▃▅▃▅ 8888 Xfermode_DstATop
7M 1 13.1ms 13.4ms 13.3ms 13.6ms 1% ▄▁▁▆▅▄▇▆█▂ 8888 Xfermode_SrcATop
New:
7M 1 6.99ms 7.19ms 7.4ms 8.98ms 8% ▁▂▁▃▂█▁▂▂▂ 8888 Xfermode_Screen
7M 1 5.27ms 5.46ms 5.46ms 5.89ms 3% ▁▁▅▁▂█▄▃▄▃ 8888 Xfermode_Modulate
7M 1 6.8ms 7.04ms 7.27ms 8.53ms 8% ▂▁█▁▁▂▂▂▂▇ 8888 Xfermode_Plus
7M 1 9ms 9.2ms 9.33ms 10.5ms 5% ▁█▃▁▂▁▁▁▅▂ 8888 Xfermode_Xor
7M 1 8.34ms 8.57ms 8.73ms 10.6ms 8% ▁▁▁▂▂▂▂▂▂█ 8888 Xfermode_DstATop
7M 1 8.38ms 8.62ms 8.91ms 10.3ms 8% ▁▃▁▂▇▂▁▂▁█ 8888 Xfermode_SrcATop
Need to define SK_SUPPORT_LEGACY_SCALAR_XFERMODES in chrome to suppress change (see https://codereview.chromium.org/1054083002/)
Committed: https://skia.googlesource.com/skia/+/f92ace90d89cc99b34162dda26be564e34ca80ef
Patch Set 1 #Patch Set 2 : #Patch Set 3 : #
Total comments: 1
Patch Set 4 : remove redundant alpha reset #Patch Set 5 : #
Total comments: 2
Patch Set 6 : need Min checks for Screen and Xor (at least) to catch tiny imprecisions that were triggering inval… #Patch Set 7 : propose landing disabled first, to reset benchmarks #
Total comments: 1
Patch Set 8 : add comment/todo to find short-cuts to avoid excessive calls to Sk4f::Min #Patch Set 9 : remove additional clamps for screen and xor, using new relaxed asserts in SkPMFloat::get() #Patch Set 10 : #
Total comments: 2
Patch Set 11 : fix comment #Messages
Total messages: 46 (16 generated)
reed@chromium.org changed reviewers: + caryclark@google.com, mtklein@google.com
The CQ bit was checked by reed@chromium.org to run a CQ dry run
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1043413002/40001
Woah, you're saying you're beating srcatop_modeproc_SSE2? Any results against srcatop_modeproc_neon? If not I can give that a try. This may be the beginning of unforking a whole lot of code.
The CQ bit was unchecked by commit-bot@chromium.org
This issue passed the CQ dry run.
https://codereview.chromium.org/1043413002/diff/40001/src/core/SkXfermode.cpp File src/core/SkXfermode.cpp (right): https://codereview.chromium.org/1043413002/diff/40001/src/core/SkXfermode.cpp... src/core/SkXfermode.cpp:1196: // kSrcATop_Mode, //!< [Da, Sc * Da + (1 - Sa) * Dc] I suspect that this formulation of src-atop has tricked us into doing more work than we need to. If I'm thinking right, Da == Sa*Da + (1-Sa)*Da. The formula for color components can also be applied to alphas? If that's true, res == res2. It's looking like another ~10% speedup to just do this, with no GM diffs against your version: SkPMFloat res = d4 + (s4 * Sk4f(dst.a()) - d4*Sk4f(src.a())) * inv255; SkASSERT(res.isValid()); return res;
On 2015/04/01 19:33:55, mtklein wrote: > https://codereview.chromium.org/1043413002/diff/40001/src/core/SkXfermode.cpp > File src/core/SkXfermode.cpp (right): > > https://codereview.chromium.org/1043413002/diff/40001/src/core/SkXfermode.cpp... > src/core/SkXfermode.cpp:1196: // kSrcATop_Mode, //!< [Da, Sc * Da + (1 - Sa) * > Dc] > I suspect that this formulation of src-atop has tricked us into doing more work > than we need to. > > If I'm thinking right, Da == Sa*Da + (1-Sa)*Da. The formula for color > components can also be applied to alphas? > > If that's true, res == res2. > > It's looking like another ~10% speedup to just do this, with no GM diffs against > your version: > SkPMFloat res = d4 + (s4 * Sk4f(dst.a()) - d4*Sk4f(src.a())) * inv255; > SkASSERT(res.isValid()); > return res; SWEET
yea, setting a component was very awkward to do (given our api)
On 2015/04/01 20:09:03, reed2 wrote: > yea, setting a component was very awkward to do (given our api) It probably should be awkward. The generated code ends up not _so_ bad, less awkward than the API makes it seem, but the instructions to make it really straightforward and fast are in SSE 4.
The CQ bit was checked by reed@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1043413002/80001
The CQ bit was unchecked by reed@chromium.org
I think https://codereview.chromium.org/1052083004/ will fix the assert (pre-existing bug in gm)
The CQ bit was checked by mtklein@google.com to run a CQ dry run
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1043413002/80001
The CQ bit was unchecked by commit-bot@chromium.org
This issue passed the CQ dry run.
https://codereview.chromium.org/1043413002/diff/80001/src/core/SkXfermode.cpp File src/core/SkXfermode.cpp (right): https://codereview.chromium.org/1043413002/diff/80001/src/core/SkXfermode.cpp... src/core/SkXfermode.cpp:1308: if (ProcType::kFoldCoverageIntoSrcAlpha && false) { ? Something about this still TODO?
https://codereview.chromium.org/1043413002/diff/80001/src/core/SkXfermode.cpp File src/core/SkXfermode.cpp (right): https://codereview.chromium.org/1043413002/diff/80001/src/core/SkXfermode.cpp... src/core/SkXfermode.cpp:1308: if (ProcType::kFoldCoverageIntoSrcAlpha && false) { On 2015/04/02 17:18:00, mtklein wrote: > ? Something about this still TODO? Needed some Min() checks in Screen and Xor... done.
ptal -- changed to land disabled for now, to reset benchmarks (had to scale those up to get more reliable numbers).
The CQ bit was checked by reed@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1043413002/120001
The CQ bit was unchecked by commit-bot@chromium.org
No LGTM from a valid reviewer yet. Only full committers are accepted. Even if an LGTM may have been provided, it was from a non-committer, _not_ a full super star committer. See http://www.chromium.org/getting-involved/become-a-committer Note that this has nothing to do with OWNERS files.
The CQ bit was checked by reed@chromium.org to run a CQ dry run
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1043413002/120001
lgtm https://codereview.chromium.org/1043413002/diff/120001/src/core/SkXfermode.cpp File src/core/SkXfermode.cpp (right): https://codereview.chromium.org/1043413002/diff/120001/src/core/SkXfermode.cp... src/core/SkXfermode.cpp:1209: SkPMFloat pm = Sk4f::Min(value, pinnedAlpha); Guh. I take it the places that are using enforce_as_pmfloat can't also get away with clamp_255? They're going both over 255 and out of premul?
The CQ bit was unchecked by commit-bot@chromium.org
This issue passed the CQ dry run.
On 2015/04/02 17:37:15, mtklein wrote: > lgtm > > https://codereview.chromium.org/1043413002/diff/120001/src/core/SkXfermode.cpp > File src/core/SkXfermode.cpp (right): > > https://codereview.chromium.org/1043413002/diff/120001/src/core/SkXfermode.cp... > src/core/SkXfermode.cpp:1209: SkPMFloat pm = Sk4f::Min(value, pinnedAlpha); > Guh. I take it the places that are using enforce_as_pmfloat can't also get away > with clamp_255? They're going both over 255 and out of premul? Sometimes alpha is 255.0001 :( Sometimes alpha is 100.0001 and red is 100.0002 :( Perhaps we can come up with a variant of "valid" that can accommodate one or more of those, but for now the various clamps were needed.
On 2015/04/02 17:57:55, reed2 wrote: > On 2015/04/02 17:37:15, mtklein wrote: > > lgtm > > > > https://codereview.chromium.org/1043413002/diff/120001/src/core/SkXfermode.cpp > > File src/core/SkXfermode.cpp (right): > > > > > https://codereview.chromium.org/1043413002/diff/120001/src/core/SkXfermode.cp... > > src/core/SkXfermode.cpp:1209: SkPMFloat pm = Sk4f::Min(value, pinnedAlpha); > > Guh. I take it the places that are using enforce_as_pmfloat can't also get > away > > with clamp_255? They're going both over 255 and out of premul? > > Sometimes alpha is 255.0001 :( > > Sometimes alpha is 100.0001 and red is 100.0002 :( > > Perhaps we can come up with a variant of "valid" that can accommodate one or > more of those, but for now the various clamps were needed. What I'm wondering is if we should drop the this->isValid() asserts from our conversion code, and just leave the asserts that the end SkPMColor is valid. That may let us ignore small noise like this that won't make a difference to 8-bit.
On 2015/04/02 18:00:08, mtklein wrote: > On 2015/04/02 17:57:55, reed2 wrote: > > On 2015/04/02 17:37:15, mtklein wrote: > > > lgtm > > > > > > > https://codereview.chromium.org/1043413002/diff/120001/src/core/SkXfermode.cpp > > > File src/core/SkXfermode.cpp (right): > > > > > > > > > https://codereview.chromium.org/1043413002/diff/120001/src/core/SkXfermode.cp... > > > src/core/SkXfermode.cpp:1209: SkPMFloat pm = Sk4f::Min(value, pinnedAlpha); > > > Guh. I take it the places that are using enforce_as_pmfloat can't also get > > away > > > with clamp_255? They're going both over 255 and out of premul? > > > > Sometimes alpha is 255.0001 :( > > > > Sometimes alpha is 100.0001 and red is 100.0002 :( > > > > Perhaps we can come up with a variant of "valid" that can accommodate one or > > more of those, but for now the various clamps were needed. > > What I'm wondering is if we should drop the this->isValid() asserts from our > conversion code, and just leave the asserts that the end SkPMColor is valid. > That may let us ignore small noise like this that won't make a difference to > 8-bit. Could work when we know we're at the end of the pipleine, and will convert to bytes right away. Certainly the 255.0001 case should be fine. Not sure about the other, since we could be... alpha = 199.999 red = 200 which would generate diff bytes (or 149.499 and 149.501 if we're truncing instead of rounding)
On 2015/04/02 18:04:09, reed2 wrote: > On 2015/04/02 18:00:08, mtklein wrote: > > On 2015/04/02 17:57:55, reed2 wrote: > > > On 2015/04/02 17:37:15, mtklein wrote: > > > > lgtm > > > > > > > > > > https://codereview.chromium.org/1043413002/diff/120001/src/core/SkXfermode.cpp > > > > File src/core/SkXfermode.cpp (right): > > > > > > > > > > > > > > https://codereview.chromium.org/1043413002/diff/120001/src/core/SkXfermode.cp... > > > > src/core/SkXfermode.cpp:1209: SkPMFloat pm = Sk4f::Min(value, > pinnedAlpha); > > > > Guh. I take it the places that are using enforce_as_pmfloat can't also > get > > > away > > > > with clamp_255? They're going both over 255 and out of premul? > > > > > > Sometimes alpha is 255.0001 :( > > > > > > Sometimes alpha is 100.0001 and red is 100.0002 :( > > > > > > Perhaps we can come up with a variant of "valid" that can accommodate one or > > > more of those, but for now the various clamps were needed. > > > > What I'm wondering is if we should drop the this->isValid() asserts from our > > conversion code, and just leave the asserts that the end SkPMColor is valid. > > That may let us ignore small noise like this that won't make a difference to > > 8-bit. > > Could work when we know we're at the end of the pipleine, and will convert to > bytes right away. Certainly the 255.0001 case should be fine. Not sure about the > other, since we could be... > > alpha = 199.999 > red = 200 > > which would generate diff bytes (or 149.499 and 149.501 if we're truncing > instead of rounding) E.g., https://codereview.chromium.org/1055093002 This still lets us assert isValid() as we like, but relaxes the isValid() requirements on the converters.
back to fasty wasty, ptal
The CQ bit was checked by reed@chromium.org to run a CQ dry run
The patchset sent to the CQ was uploaded after l-g-t-m from mtklein@google.com Link to the patchset: https://codereview.chromium.org/1043413002/#ps180001 (title: " ")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1043413002/180001
still lgtm https://codereview.chromium.org/1043413002/diff/180001/src/core/SkXfermode.cpp File src/core/SkXfermode.cpp (right): https://codereview.chromium.org/1043413002/diff/180001/src/core/SkXfermode.cp... src/core/SkXfermode.cpp:1201: * Some modes that can, due to very slight numerical error, generate "invalid" pmcolors... no need for "that" https://codereview.chromium.org/1043413002/diff/180001/src/core/SkXfermode.cp... src/core/SkXfermode.cpp:1219: (void)pm.get(); Seems fine, but clearer as SkASSERT(SkPMColorAssert(pm.get()));
The CQ bit was unchecked by commit-bot@chromium.org
This issue passed the CQ dry run.
The CQ bit was checked by reed@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from mtklein@google.com Link to the patchset: https://codereview.chromium.org/1043413002/#ps200001 (title: "fix comment")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1043413002/200001
Message was sent while issue was closed.
Committed patchset #11 (id:200001) as https://skia.googlesource.com/skia/+/f92ace90d89cc99b34162dda26be564e34ca80ef |