Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(815)

Issue 99933004: Implement a NEON version of the RGBA gaussian blur. This shows a 9-15% speedup on Nexus-10. (Closed)

Created:
7 years ago by Stephen White
Modified:
7 years ago
Reviewers:
mtklein
CC:
skia-review_googlegroups.com
Visibility:
Public.

Description

Implement a NEON version of the RGBA gaussian blur. This shows a 9-15% speedup on Nexus-10. R=mtklein@google.com, mtklein before: running bench [640 480] blur_image_filter_large_10.00_10.00 8888: cmsecs = 33063.23 running bench [640 480] blur_image_filter_small_10.00_10.00 8888: cmsecs = 32800.25 running bench [640 480] blur_image_filter_large_1.00_1.00 8888: cmsecs = 33017.88 running bench [640 480] blur_image_filter_small_1.00_1.00 8888: cmsecs = 32743.35 running bench [640 480] blur_image_filter_large_0.00_1.00 8888: cmsecs = 21024.04 running bench [640 480] blur_image_filter_large_0.00_10.00 8888: cmsecs = 22904.15 running bench [640 480] blur_image_filter_large_1.00_0.00 8888: cmsecs = 18738.08 running bench [640 480] blur_image_filter_large_10.00_0.00 8888: cmsecs = 18798.98 after: running bench [640 480] blur_image_filter_large_10.00_10.00 8888: cmsecs = 30180.96 running bench [640 480] blur_image_filter_small_10.00_10.00 8888: cmsecs = 29861.90 running bench [640 480] blur_image_filter_large_1.00_1.00 8888: cmsecs = 30178.98 running bench [640 480] blur_image_filter_small_1.00_1.00 8888: cmsecs = 29911.25 running bench [640 480] blur_image_filter_large_0.00_1.00 8888: cmsecs = 19344.35 running bench [640 480] blur_image_filter_large_0.00_10.00 8888: cmsecs = 19957.07 running bench [640 480] blur_image_filter_large_1.00_0.00 8888: cmsecs = 17158.84 running bench [640 480] blur_image_filter_large_10.00_0.00 8888: cmsecs = 17330.73 Committed: https://code.google.com/p/skia/source/detail?r=12486

Patch Set 1 #

Patch Set 2 : Remove commented-out code #

Total comments: 9

Patch Set 3 : Fixes per review comments #

Total comments: 1

Patch Set 4 : Moar vector comments #

Unified diffs Side-by-side diffs Delta from patch set Stats (+105 lines, -1 line) Patch
M gyp/opts.gyp View 1 2 2 chunks +1 line, -1 line 0 comments Download
A src/opts/SkBlurImage_opts_neon.cpp View 1 2 3 1 chunk +104 lines, -0 lines 0 comments Download

Messages

Total messages: 7 (0 generated)
Stephen White
mtklein@: PTAL. Thanks!
7 years ago (2013-12-02 21:36:59 UTC) #1
mtklein
https://codereview.chromium.org/99933004/diff/20001/src/opts/SkBlurImage_opts_neon.cpp File src/opts/SkBlurImage_opts_neon.cpp (right): https://codereview.chromium.org/99933004/diff/20001/src/opts/SkBlurImage_opts_neon.cpp#newcode28 src/opts/SkBlurImage_opts_neon.cpp:28: uint8x8_t v8 = vreinterpret_u8_u32(vdup_n_u32(a)); Do you mind adding in ...
7 years ago (2013-12-03 20:22:13 UTC) #2
Stephen White
Thanks for your review. https://codereview.chromium.org/99933004/diff/20001/src/opts/SkBlurImage_opts_neon.cpp File src/opts/SkBlurImage_opts_neon.cpp (right): https://codereview.chromium.org/99933004/diff/20001/src/opts/SkBlurImage_opts_neon.cpp#newcode28 src/opts/SkBlurImage_opts_neon.cpp:28: uint8x8_t v8 = vreinterpret_u8_u32(vdup_n_u32(a)); On ...
7 years ago (2013-12-04 15:18:41 UTC) #3
mtklein
LGTM with one readability suggestion. See what you think before submitting? https://codereview.chromium.org/99933004/diff/20001/src/opts/SkBlurImage_opts_neon.cpp File src/opts/SkBlurImage_opts_neon.cpp (right): ...
7 years ago (2013-12-04 16:52:23 UTC) #4
Stephen White
On 2013/12/04 16:52:23, mtklein wrote: > LGTM with one readability suggestion. See what you think ...
7 years ago (2013-12-04 18:17:56 UTC) #5
Stephen White
Committed patchset #4 manually as r12486 (presubmit successful).
7 years ago (2013-12-04 18:19:50 UTC) #6
mtklein
7 years ago (2013-12-04 18:20:13 UTC) #7
Message was sent while issue was closed.
On 2013/12/04 18:17:56, Stephen White wrote:
> On 2013/12/04 16:52:23, mtklein wrote:
> > LGTM with one readability suggestion.  See what you think before submitting?
> > 
> >
>
https://codereview.chromium.org/99933004/diff/20001/src/opts/SkBlurImage_opts...
> > File src/opts/SkBlurImage_opts_neon.cpp (right):
> > 
> >
>
https://codereview.chromium.org/99933004/diff/20001/src/opts/SkBlurImage_opts...
> > src/opts/SkBlurImage_opts_neon.cpp:34: void SkBoxBlur_SSE2(const SkPMColor*
> src,
> > int srcStride, SkPMColor* dst, int kernelSize,
> > On 2013/12/04 15:18:41, Stephen White wrote:
> > > On 2013/12/03 20:22:13, mtklein wrote:
> > > > Given that we're clearly copying and pasting here (_SSE2, here and
below)
> > > 
> > > Whoops, busted. :) Fixed.
> > > 
> > > >, want
> > > > to try something crazy?  You feeling ambitious?  Want to try to unify
the
> > > > codebase for the SSE and NEON versions?  The structural code is pretty
> much
> > > > verbatim, and there are some very strong correspondences between the SSE
> and
> > > > NEON we're using (though I admit, if we use a 64-bit NEON register as I
> > hoped
> > > > above, that may disappear).
> > > 
> > > I thought about it, but came to the conclusion that the scaffolding might
> well
> > > result in more lines of code than it would save.
> > > 
> > > One possible refactoring that might work would be to move the outer loop
and
> > > stride calculations out of the implementations, and just make the
functions
> > > operate on a single scan line. This is how most of the other opts are
> written.
> > > I'd prefer to do that as a followup, though.
> > 
> > SGTM
> > 
> >
>
https://codereview.chromium.org/99933004/diff/40001/src/opts/SkBlurImage_opts...
> > File src/opts/SkBlurImage_opts_neon.cpp (right):
> > 
> >
>
https://codereview.chromium.org/99933004/diff/40001/src/opts/SkBlurImage_opts...
> > src/opts/SkBlurImage_opts_neon.cpp:30: // ( A R G B A R G B ) -> (   A   R  
G
>  
> > B   A   R   G   B ) -> (   A   R   G   B )
> > Maybe use _ for 0's leaving spaces for spacing?
> 
> Actually this made me like 0's better. Fixed. :)

:)  LGTM

Powered by Google App Engine
This is Rietveld 408576698