|
|
DescriptionBilinear optimization for 1D convolution.
Splits GrGLConvolutionEffect into GrGLBilerpConvolutionEffect and
GrGLBoundedConvolutionEffect. When doing a non-bounded convolution we now
always use the GrGLBilerpConvolutionEffect which uses bilinear filtering to
perform half as many samples in the texture.
BUG=skia:3986
Committed: https://skia.googlesource.com/skia/+/91abe10af417148939548551e210c001022d3bda
Committed: https://skia.googlesource.com/skia/+/0f38612b0facf585854aba4556433b858cbf7da8
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1216623003
Patch Set 1 : #
Total comments: 10
Patch Set 2 : Remove \t from shaders and simplify SkASSERT #Patch Set 3 : small fixes #Patch Set 4 : Rebase #Patch Set 5 : Rebase #Patch Set 6 : rebase #Patch Set 7 : #Messages
Total messages: 43 (23 generated)
ericrk@chromium.org changed reviewers: + bsalomon@chromium.org, senorblanco@chromium.org
ericrk@chromium.org changed reviewers: + bsalomon@google.com - bsalomon@chromium.org
ericrk@chromium.org changed reviewers: - senorblanco@chromium.org
Patchset #2 (id:20001) has been deleted
Patchset #1 (id:1) has been deleted
Patchset #1 (id:40001) has been deleted
Hi Brian, This is my first pass at adding support for a bilinear-filtering optimization to the Gaussian blur effect (and 1D convolutions in general). Tried to factor things in a way that made sense, but this is my first time working in the Skia codebase, so please let me know if there's anything I'm missing! This change passes DM with a few very minor (not noticeable on manual inspection) changes (off by 1 type things). The change yields an average of 10% speedup across all "blur_" nanobench results (including benches with small radii that can't use the optimization). For some benchmarks the improvement was up to 25%. There were up to 4% regressions in 2/35 test which I'm still in the process of investigating (might just be noise, not sure yet).
bsalomon@google.com changed reviewers: + senorblanco@chromium.org
Hey Eric, this looks nicely done! 4% sounds like it is likely within the noise so I'm not too worried about that. I think senorblanco@ should have a look too as this was originally his baby. One idea for shaving off some more time: Right now when we have a virtual signature on SkMaskFilter produces a result texture. If it instead produced a GrFragmentProcessor as the result that FP could do the last step of the blur while we're drawing into the main target, skipping an intermediate draw. https://codereview.chromium.org/1216623003/diff/60001/src/gpu/effects/GrConvo... File src/gpu/effects/GrConvolutionEffect.cpp (right): https://codereview.chromium.org/1216623003/diff/60001/src/gpu/effects/GrConvo... src/gpu/effects/GrConvolutionEffect.cpp:221: fsBuilder->codeAppendf("\t\t%s = vec4(0, 0, 0, 0);\n", outputColor); You don't really need the \t's anymore (we now "pretty" print the shaders). https://codereview.chromium.org/1216623003/diff/60001/src/gpu/effects/GrConvo... src/gpu/effects/GrConvolutionEffect.cpp:255: SkASSERT(conv.useBounds() == false); !conv.useBounds()?
Thanks for the feedback! Will wait for senorblanco to review. https://codereview.chromium.org/1216623003/diff/60001/src/gpu/effects/GrConvo... File src/gpu/effects/GrConvolutionEffect.cpp (right): https://codereview.chromium.org/1216623003/diff/60001/src/gpu/effects/GrConvo... src/gpu/effects/GrConvolutionEffect.cpp:221: fsBuilder->codeAppendf("\t\t%s = vec4(0, 0, 0, 0);\n", outputColor); On 2015/06/26 21:07:33, bsalomon wrote: > You don't really need the \t's anymore (we now "pretty" print the shaders). removed the \t\ts throughout the file https://codereview.chromium.org/1216623003/diff/60001/src/gpu/effects/GrConvo... src/gpu/effects/GrConvolutionEffect.cpp:255: SkASSERT(conv.useBounds() == false); On 2015/06/26 21:07:34, bsalomon wrote: > !conv.useBounds()? yup
You may need to suppress some layout test failures in Blink before landing this, in particular in the css3/filters tests. Or alternatively, place the functionality behind an SK #define that you pre-enable in Chrome before Skia rolls in. https://codereview.chromium.org/1216623003/diff/60001/src/gpu/effects/GrConvo... File src/gpu/effects/GrConvolutionEffect.cpp (right): https://codereview.chromium.org/1216623003/diff/60001/src/gpu/effects/GrConvo... src/gpu/effects/GrConvolutionEffect.cpp:180: class GrGLBilerpConvolutionEffect : public GrGLConvolutionEffect { <bikeshed> is this really a bilinear fetch? Or essentially linear in 1D? (i.e., is the non-incrementing coord the same in both fetches?) If so maybe it should be GrGLLinearConvolutionEffect GrGLLerpConvolutionEffect or something. </bikeshed> https://codereview.chromium.org/1216623003/diff/60001/src/gpu/effects/GrConvo... src/gpu/effects/GrConvolutionEffect.cpp:211: fSampleOffsetUni = How many uniform values does this add up to? You may have to increase the number of downsample passes for this optimization to stay under the PS2.0 limits. Or check that we have PS >= 3.0 before using it. https://codereview.chromium.org/1216623003/diff/60001/src/gpu/effects/GrConvo... src/gpu/effects/GrConvolutionEffect.cpp:277: float totalKernalWeight = Nit: kernal -> kernel?
Thanks for the feedback. I've created a CL for test suppressions for the blink layout tests that this affects: https://codereview.chromium.org/1218623002/ https://codereview.chromium.org/1216623003/diff/60001/src/gpu/effects/GrConvo... File src/gpu/effects/GrConvolutionEffect.cpp (right): https://codereview.chromium.org/1216623003/diff/60001/src/gpu/effects/GrConvo... src/gpu/effects/GrConvolutionEffect.cpp:180: class GrGLBilerpConvolutionEffect : public GrGLConvolutionEffect { On 2015/06/26 21:28:31, Stephen White wrote: > <bikeshed> is this really a bilinear fetch? Or essentially linear in 1D? (i.e., > is the non-incrementing coord the same in both fetches?) If so maybe it should > be GrGLLinearConvolutionEffect GrGLLerpConvolutionEffect or something. > </bikeshed> yup - that's better :D https://codereview.chromium.org/1216623003/diff/60001/src/gpu/effects/GrConvo... src/gpu/effects/GrConvolutionEffect.cpp:211: fSampleOffsetUni = On 2015/06/26 21:28:31, Stephen White wrote: > How many uniform values does this add up to? You may have to increase the number > of downsample passes for this optimization to stay under the PS2.0 limits. Or > check that we have PS >= 3.0 before using it. I think we're safe (assuming we keep the limit of sigma <= 4, which we are for now). Looks like PS2.0 gives 32 uniform slots (up to vec4 each). Max sigma is 4 == kernel radius of 12 == kernel width of 25 for kernel width 25, linear optimization requires ((25+1)/2) = 13 samples. So max sampleCount is 13. We use two uniform arrays of sampleCount size, which is still only 26 uniforms, so I think this is safe. This actually only uses 2 more uniform slots than the previous shader. Let me know if this makes sense.
LGTM https://codereview.chromium.org/1216623003/diff/60001/src/gpu/effects/GrConvo... File src/gpu/effects/GrConvolutionEffect.cpp (right): https://codereview.chromium.org/1216623003/diff/60001/src/gpu/effects/GrConvo... src/gpu/effects/GrConvolutionEffect.cpp:211: fSampleOffsetUni = On 2015/06/29 17:33:21, ericrk wrote: > On 2015/06/26 21:28:31, Stephen White wrote: > > How many uniform values does this add up to? You may have to increase the > number > > of downsample passes for this optimization to stay under the PS2.0 limits. Or > > check that we have PS >= 3.0 before using it. > > I think we're safe (assuming we keep the limit of sigma <= 4, which we are for > now). > > Looks like PS2.0 gives 32 uniform slots (up to vec4 each). > > Max sigma is 4 == kernel radius of 12 == kernel width of 25 > > for kernel width 25, linear optimization requires ((25+1)/2) = 13 samples. > > So max sampleCount is 13. We use two uniform arrays of sampleCount size, which > is still only 26 uniforms, so I think this is safe. > > This actually only uses 2 more uniform slots than the previous shader. > > Let me know if this makes sense. Yep; makes sense. Thanks!
lgtm
The CQ bit was checked by ericrk@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1216623003/100001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Debug-Trybot on client.skia (JOB_FAILED, http://build.chromium.org/p/client.skia/builders/Test-Ubuntu-GCC-GCE-CPU-AVX2...) Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-Shared-Trybot on client.skia (JOB_FAILED, http://build.chromium.org/p/client.skia/builders/Test-Ubuntu-GCC-GCE-CPU-AVX2...) Build-Ubuntu-Clang-x86_64-Debug-Trybot on client.skia.compile (JOB_FAILED, http://build.chromium.org/p/client.skia.compile/builders/Build-Ubuntu-Clang-x...) Build-Ubuntu-GCC-Arm64-Debug-Android-Trybot on client.skia.compile (JOB_FAILED, http://build.chromium.org/p/client.skia.compile/builders/Build-Ubuntu-GCC-Arm...) Build-Ubuntu-GCC-Arm7-Debug-Android-Trybot on client.skia.compile (JOB_FAILED, http://build.chromium.org/p/client.skia.compile/builders/Build-Ubuntu-GCC-Arm...) Build-Ubuntu-GCC-Mips-Debug-Android-Trybot on client.skia.compile (JOB_FAILED, http://build.chromium.org/p/client.skia.compile/builders/Build-Ubuntu-GCC-Mip...) Build-Ubuntu-GCC-x86_64-Release-Trybot on client.skia.compile (JOB_FAILED, http://build.chromium.org/p/client.skia.compile/builders/Build-Ubuntu-GCC-x86...) (exceeded global retry quota)
Patchset #4 (id:120001) has been deleted
Patchset #4 (id:140001) has been deleted
Patchset #4 (id:160001) has been deleted
The CQ bit was checked by ericrk@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from bsalomon@google.com, senorblanco@chromium.org Link to the patchset: https://codereview.chromium.org/1216623003/#ps180001 (title: "Rebase")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1216623003/180001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: Build-Ubuntu-Clang-x86_64-Debug-Trybot on client.skia.compile (JOB_FAILED, http://build.chromium.org/p/client.skia.compile/builders/Build-Ubuntu-Clang-x...) Build-Ubuntu-GCC-Arm64-Debug-Android-Trybot on client.skia.compile (JOB_FAILED, http://build.chromium.org/p/client.skia.compile/builders/Build-Ubuntu-GCC-Arm...) Build-Ubuntu-GCC-Arm7-Debug-Android-Trybot on client.skia.compile (JOB_FAILED, http://build.chromium.org/p/client.skia.compile/builders/Build-Ubuntu-GCC-Arm...) Build-Ubuntu-GCC-Mips-Debug-Android-Trybot on client.skia.compile (JOB_FAILED, http://build.chromium.org/p/client.skia.compile/builders/Build-Ubuntu-GCC-Mip...) Build-Ubuntu-GCC-x86_64-Release-Trybot on client.skia.compile (JOB_FAILED, http://build.chromium.org/p/client.skia.compile/builders/Build-Ubuntu-GCC-x86...) (exceeded global retry quota)
The CQ bit was checked by ericrk@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from bsalomon@google.com, senorblanco@chromium.org Link to the patchset: https://codereview.chromium.org/1216623003/#ps200001 (title: "Rebase")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1216623003/200001
Message was sent while issue was closed.
Committed patchset #5 (id:200001) as https://skia.googlesource.com/skia/+/91abe10af417148939548551e210c001022d3bda
Message was sent while issue was closed.
A revert of this CL (patchset #5 id:200001) has been created in https://codereview.chromium.org/1231383005/ by scroggo@google.com. The reason for reverting is: Breaks MSAA.
Message was sent while issue was closed.
On 2015/07/13 14:06:25, scroggo wrote: > A revert of this CL (patchset #5 id:200001) has been created in > https://codereview.chromium.org/1231383005/ by mailto:scroggo@google.com. > > The reason for reverting is: Breaks MSAA. A little more detail: This change appeared to change many images (there were around 700 new images shown in Gold; I did not do a thorough investigation to determine whether they all came from this CL, but many of them showed this CL as the blame in Gold). Many seemed to be small pixel diffs, as expected, but some appeared to be real bugs, e.g. in the GM named "blurs".
The CQ bit was checked by ericrk@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1216623003/200001
Per discussion over chat - I believe the differences are all expected - re-submitting and I will triage any GM failures as soon as they show up.
Message was sent while issue was closed.
Committed patchset #5 (id:200001) as https://skia.googlesource.com/skia/+/0f38612b0facf585854aba4556433b858cbf7da8
Message was sent while issue was closed.
A revert of this CL (patchset #5 id:200001) has been created in https://codereview.chromium.org/1247063005/ by ericrk@chromium.org. The reason for reverting is: Ok, I am now seeing a couple issues. going to revert and investigate further..
Message was sent while issue was closed.
bsalomon@google.com changed reviewers: + robertphillips@google.com
Message was sent while issue was closed.
Description was changed from ========== Bilinear optimization for 1D convolution. Splits GrGLConvolutionEffect into GrGLBilerpConvolutionEffect and GrGLBoundedConvolutionEffect. When doing a non-bounded convolution we now always use the GrGLBilerpConvolutionEffect which uses bilinear filtering to perform half as many samples in the texture. BUG=skia:3986 Committed: https://skia.googlesource.com/skia/+/91abe10af417148939548551e210c001022d3bda Committed: https://skia.googlesource.com/skia/+/0f38612b0facf585854aba4556433b858cbf7da8 ========== to ========== Bilinear optimization for 1D convolution. Splits GrGLConvolutionEffect into GrGLBilerpConvolutionEffect and GrGLBoundedConvolutionEffect. When doing a non-bounded convolution we now always use the GrGLBilerpConvolutionEffect which uses bilinear filtering to perform half as many samples in the texture. BUG=skia:3986 Committed: https://skia.googlesource.com/skia/+/91abe10af417148939548551e210c001022d3bda Committed: https://skia.googlesource.com/skia/+/0f38612b0facf585854aba4556433b858cbf7da8 GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is... ==========
Message was sent while issue was closed.
Description was changed from ========== Bilinear optimization for 1D convolution. Splits GrGLConvolutionEffect into GrGLBilerpConvolutionEffect and GrGLBoundedConvolutionEffect. When doing a non-bounded convolution we now always use the GrGLBilerpConvolutionEffect which uses bilinear filtering to perform half as many samples in the texture. BUG=skia:3986 Committed: https://skia.googlesource.com/skia/+/91abe10af417148939548551e210c001022d3bda Committed: https://skia.googlesource.com/skia/+/0f38612b0facf585854aba4556433b858cbf7da8 GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is... ========== to ========== Bilinear optimization for 1D convolution. Splits GrGLConvolutionEffect into GrGLBilerpConvolutionEffect and GrGLBoundedConvolutionEffect. When doing a non-bounded convolution we now always use the GrGLBilerpConvolutionEffect which uses bilinear filtering to perform half as many samples in the texture. BUG=skia:3986 Committed: https://skia.googlesource.com/skia/+/91abe10af417148939548551e210c001022d3bda Committed: https://skia.googlesource.com/skia/+/0f38612b0facf585854aba4556433b858cbf7da8 GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is... ==========
The CQ bit was checked by ericrk@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1216623003/220001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1216623003/220001
The CQ bit was unchecked by ericrk@chromium.org |