Issue 1015083004: SkPMFloat: avoid loads and stores where possible.

Issue 1015083004: SkPMFloat: avoid loads and stores where possible. (Closed)

Created:
5 years, 9 months ago by mtklein_C

Modified:
5 years, 9 months ago

Reviewers:
msarett

CC:
reviews_skia.org

Base URL:
https://skia.googlesource.com/skia.git@master

Target Ref:
refs/heads/master

Project:
skia

Visibility:
Public.

More Reviews

Description

SkPMFloat: avoid loads and stores where possible. A store/load pair like this is a redundant no-op: store simd_register_a, memory_address load memory_address, simd_register_a Everyone seems to be good at removing those when using SSE, but GCC and Clang are pretty terrible at this for NEON. We end up issuing both redundant commands, usually to and from the stack. That's slow. Let's not do that. This CL unions in the native SIMD register type into SkPMFloat, so that we can assign to and from it directly, which is generating a lot better NEON code. On my Nexus 5, the benchmarks improve from 36ns to 23ns. SSE is just as fast either way, but I paralleled the NEON code for consistency. It's a little terser. And because it needed the platform headers anyway, I moved all includes into SkPMFloat.h, again only for consistency. I'd union in Sk4f too to make its conversion methods a little clearer, but MSVC won't let me (it has a copy constructor... they're apparently not up to speed with C++11 unrestricted unions). BUG=skia: Committed: https://skia.googlesource.com/skia/+/f94fa7112f67af6fc5db19f86d8397307ba17105

Patch Set 1 #

Created: 5 years, 9 months ago

Download [raw] [tar.bz2]

	Unified diffs	Side-by-side diffs	Stats (+52 lines, -33 lines)			Patch
M	src/core/SkPMFloat.h	View	3 chunks	+20 lines, -10 lines	0 comments	Download
M	src/opts/SkPMFloat_SSE2.h	View	4 chunks	+10 lines, -8 lines	0 comments	Download
M	src/opts/SkPMFloat_SSSE3.h	View	4 chunks	+11 lines, -9 lines	0 comments	Download
M	src/opts/SkPMFloat_neon.h	View	3 chunks	+7 lines, -5 lines	0 comments	Download
M	src/opts/SkPMFloat_none.h	View	1 chunk	+4 lines, -1 line	0 comments	Download

Messages

Total messages: 15 (5 generated)

Expand Messages | Collapse Messages | Show Generated Messages | Hide Generated Messages

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1015083004/20001

5 years, 9 months ago (2015-03-18 16:24:36 UTC) #2

commit-bot: I haz the power

Note for Reviewers: The CQ is waiting for an approval. If you believe that the ...

5 years, 9 months ago (2015-03-18 16:24:37 UTC) #3

commit-bot: I haz the power

Try jobs failed on following builders: Build-Win-VS2013-x86_64-Debug-Trybot on client.skia.compile (JOB_FAILED, http://build.chromium.org/p/client.skia.compile/builders/Build-Win-VS2013-x86_64-Debug-Trybot/builds/256)

5 years, 9 months ago (2015-03-18 16:29:23 UTC) #5

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1015083004/1

5 years, 9 months ago (2015-03-18 16:32:04 UTC) #8

commit-bot: I haz the power

Note for Reviewers: The CQ is waiting for an approval. If you believe that the ...

5 years, 9 months ago (2015-03-18 16:32:05 UTC) #9

mtklein_C

What do you think about landing this first, to get the store/load noise out, then ...

5 years, 9 months ago (2015-03-18 16:36:49 UTC) #11

msarett

lgtm I think this is a good step. The loads and stores in the benchmarks ...

5 years, 9 months ago (2015-03-18 16:50:13 UTC) #12

commit-bot: I haz the power

Committed patchset #1 (id:1) as https://skia.googlesource.com/skia/+/f94fa7112f67af6fc5db19f86d8397307ba17105

5 years, 9 months ago (2015-03-18 16:51:27 UTC) #13

mtklein

> My one thought is that at some point we may care about the performance ...

5 years, 9 months ago (2015-03-18 18:10:06 UTC) #14

msarett

5 years, 9 months ago (2015-03-18 18:14:01 UTC) #15

Message was sent while issue was closed.

Agreed.

Benchmarking the convert within the actual context it is used will probably be
the time and place to measure the impact of loads and stores.

Expand Messages | Collapse Messages | Show Generated Messages | Hide Generated Messages