|
|
Created:
5 years, 1 month ago by Stephen White Modified:
5 years, 1 month ago Reviewers:
mtklein CC:
reviews_skia.org Base URL:
https://skia.googlesource.com/skia.git@blur-neon-separate-loops Target Ref:
refs/heads/master Project:
skia Visibility:
Public. |
DescriptionSkBlurImageFilter_opts: optimize NEON box_blur_double in separate loops.
Stop leaning so hard on the branch predictor, and pull the conditionals
out of the loops for box_blur_double() (NEON). This is conceptually
the same change as https://codereview.chromium.org/1426583004/ for
the NEON double-pixel loop.
R=mtklein@google.com
BUG=skia:4526
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Committed: https://skia.googlesource.com/skia/+/d5b9211d8d072672856019d5608fa21d4397ba54
Patch Set 1 #Patch Set 2 : Upload with correct upstream #
Total comments: 2
Depends on Patchset: Messages
Total messages: 20 (7 generated)
mtklein@: PTAL. Thanks!
The CQ bit was checked by senorblanco@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1412793009/20001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1412793009/20001
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Description was changed from ========== SkBlurImageFilter_opts: optimize NEON box_blur_double in separate loops. Stop leaning so hard on the branch predictor, and pull the conditionals out of the loops for box_blur_double() (NEON). This is conceptually the same change as https://codereview.chromium.org/1426583004/ for the NEON double-pixel loop. R=mtklein@google.com BUG=skia:4526 NOTREECHECKS=true NOTRY=true NOPRESUBMIT=true CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot ========== to ========== SkBlurImageFilter_opts: optimize NEON box_blur_double in separate loops. Stop leaning so hard on the branch predictor, and pull the conditionals out of the loops for box_blur_double() (NEON). This is conceptually the same change as https://codereview.chromium.org/1426583004/ for the NEON double-pixel loop. R=mtklein@google.com BUG=skia:4526 CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot ==========
The CQ bit was checked by senorblanco@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1412793009/20001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1412793009/20001
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
lgtm https://codereview.chromium.org/1412793009/diff/20001/src/opts/SkBlurImageFil... File src/opts/SkBlurImageFilter_opts.h (right): https://codereview.chromium.org/1412793009/diff/20001/src/opts/SkBlurImageFil... src/opts/SkBlurImageFilter_opts.h:71: #define STORE_SUMS_DOUBLE \ This just symmetry or are you thinking of expanding this to SSE?
Thanks for your review. https://codereview.chromium.org/1412793009/diff/20001/src/opts/SkBlurImageFil... File src/opts/SkBlurImageFilter_opts.h (right): https://codereview.chromium.org/1412793009/diff/20001/src/opts/SkBlurImageFil... src/opts/SkBlurImageFilter_opts.h:71: #define STORE_SUMS_DOUBLE \ On 2015/10/28 22:31:17, mtklein wrote: > This just symmetry or are you thinking of expanding this to SSE? Well, mostly because it's repeated four times below. But yeah, it'd be interesting to consider SSE for this. The one thing I don't like about it is that it produces different pixels than the generic path. But we're already paying that price on NEON.
Patchset #3 (id:40001) has been deleted
The CQ bit was checked by senorblanco@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1412793009/20001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1412793009/20001
Message was sent while issue was closed.
Committed patchset #2 (id:20001) as https://skia.googlesource.com/skia/+/d5b9211d8d072672856019d5608fa21d4397ba54
Message was sent while issue was closed.
On 2015/10/28 at 22:38:33, senorblanco wrote: > Thanks for your review. > > https://codereview.chromium.org/1412793009/diff/20001/src/opts/SkBlurImageFil... > File src/opts/SkBlurImageFilter_opts.h (right): > > https://codereview.chromium.org/1412793009/diff/20001/src/opts/SkBlurImageFil... > src/opts/SkBlurImageFilter_opts.h:71: #define STORE_SUMS_DOUBLE \ > On 2015/10/28 22:31:17, mtklein wrote: > > This just symmetry or are you thinking of expanding this to SSE? > > Well, mostly because it's repeated four times below. > > But yeah, it'd be interesting to consider SSE for this. The one thing I don't like about it is that it produces different pixels than the generic path. But we're already paying that price on NEON. Wait, does it? I thought we were just taking advantage of having the intermediate sums fit in 15 bits? Is there some tweaking we can do to STORE_SUMS_DOUBLE to make them identical?
Message was sent while issue was closed.
On 2015/10/29 13:55:03, mtklein wrote: > On 2015/10/28 at 22:38:33, senorblanco wrote: > > Thanks for your review. > > > > > https://codereview.chromium.org/1412793009/diff/20001/src/opts/SkBlurImageFil... > > File src/opts/SkBlurImageFilter_opts.h (right): > > > > > https://codereview.chromium.org/1412793009/diff/20001/src/opts/SkBlurImageFil... > > src/opts/SkBlurImageFilter_opts.h:71: #define STORE_SUMS_DOUBLE \ > > On 2015/10/28 22:31:17, mtklein wrote: > > > This just symmetry or are you thinking of expanding this to SSE? > > > > Well, mostly because it's repeated four times below. > > > > But yeah, it'd be interesting to consider SSE for this. The one thing I don't > like about it is that it produces different pixels than the generic path. But > we're already paying that price on NEON. > > Wait, does it? I thought we were just taking advantage of having the > intermediate sums fit in 15 bits? Is there some tweaking we can do to > STORE_SUMS_DOUBLE to make them identical? Yeah, it's the downshift (divide) I think. It's been a while since I looked at it. If you can dig up the original codereview, you'll see some comments from me about it.
Message was sent while issue was closed.
On 2015/10/29 13:56:27, Stephen White wrote: > On 2015/10/29 13:55:03, mtklein wrote: > > On 2015/10/28 at 22:38:33, senorblanco wrote: > > > Thanks for your review. > > > > > > > > > https://codereview.chromium.org/1412793009/diff/20001/src/opts/SkBlurImageFil... > > > File src/opts/SkBlurImageFilter_opts.h (right): > > > > > > > > > https://codereview.chromium.org/1412793009/diff/20001/src/opts/SkBlurImageFil... > > > src/opts/SkBlurImageFilter_opts.h:71: #define STORE_SUMS_DOUBLE \ > > > On 2015/10/28 22:31:17, mtklein wrote: > > > > This just symmetry or are you thinking of expanding this to SSE? > > > > > > Well, mostly because it's repeated four times below. > > > > > > But yeah, it'd be interesting to consider SSE for this. The one thing I > don't > > like about it is that it produces different pixels than the generic path. But > > we're already paying that price on NEON. > > > > Wait, does it? I thought we were just taking advantage of having the > > intermediate sums fit in 15 bits? Is there some tweaking we can do to > > STORE_SUMS_DOUBLE to make them identical? > > Yeah, it's the downshift (divide) I think. It's been a while since I looked at > it. > If you can dig up the original codereview, you'll see some comments from me > about it. Because I'm a git-lovin' fool: https://codereview.chromium.org/105893003
Message was sent while issue was closed.
On 2015/10/29 at 14:03:05, senorblanco wrote: > On 2015/10/29 13:56:27, Stephen White wrote: > > On 2015/10/29 13:55:03, mtklein wrote: > > > On 2015/10/28 at 22:38:33, senorblanco wrote: > > > > Thanks for your review. > > > > > > > > > > > > > https://codereview.chromium.org/1412793009/diff/20001/src/opts/SkBlurImageFil... > > > > File src/opts/SkBlurImageFilter_opts.h (right): > > > > > > > > > > > > > https://codereview.chromium.org/1412793009/diff/20001/src/opts/SkBlurImageFil... > > > > src/opts/SkBlurImageFilter_opts.h:71: #define STORE_SUMS_DOUBLE \ > > > > On 2015/10/28 22:31:17, mtklein wrote: > > > > > This just symmetry or are you thinking of expanding this to SSE? > > > > > > > > Well, mostly because it's repeated four times below. > > > > > > > > But yeah, it'd be interesting to consider SSE for this. The one thing I > > don't > > > like about it is that it produces different pixels than the generic path. But > > > we're already paying that price on NEON. > > > > > > Wait, does it? I thought we were just taking advantage of having the > > > intermediate sums fit in 15 bits? Is there some tweaking we can do to > > > STORE_SUMS_DOUBLE to make them identical? > > > > Yeah, it's the downshift (divide) I think. It's been a while since I looked at > > it. > > If you can dig up the original codereview, you'll see some comments from me > > about it. > > Because I'm a git-lovin' fool: https://codereview.chromium.org/105893003 Ah, duh, the scale factor is also now 15 bit. That makes sense. I suspect we could work around that with SSE by just using _mm_mulhi_epu16 instead of _mm_mulhrs_epi16. |