Chromium Code Reviews

Unified Diff: src/opts/SkSwizzler_opts.h

Issue 1657393002: SSSE3 optimizations for gray -> RGBA (or BGRA) (Closed) Base URL: https://skia.googlesource.com/skia.git@gray
Patch Set: Created 4 years, 11 months ago
Index: src/opts/SkSwizzler_opts.h
diff --git a/src/opts/SkSwizzler_opts.h b/src/opts/SkSwizzler_opts.h
index 24a69a91f78db427c7b62d12eed0dbf433cc01dc..01139096e70fe632dc9f679e90172b4039315614 100644
--- a/src/opts/SkSwizzler_opts.h
+++ b/src/opts/SkSwizzler_opts.h
@@ -452,7 +452,33 @@ static void RGB_to_BGR1(uint32_t dst[], const void* src, int count) {
     insert_alpha_should_swaprb<true>(dst, src, count);
 }
-static void gray_to_RGB1(uint32_t dst[], const void* src, int count) {
+static void gray_to_RGB1(uint32_t dst[], const void* vsrc, int count) {
msarett 2016/02/02 20:22:36 This performs almost identically to an *unpack* ba
mtklein 2016/02/02 20:29:30 Even on mobile x86 (Venue8)?
mtklein 2016/02/02 20:32:06 (If so that makes me very happy. I love,love,love
msarett 2016/02/02 20:41:44 I only compared the two approaches on my desktop i
mtklein 2016/02/02 20:47:18 :( This is one of those times I hate to be right.
msarett 2016/02/02 20:53:59 Thanks for the extra reference! Got an extra 5% o
+    const uint8_t* src = (const uint8_t*) vsrc;
+
+    const __m128i alphaMask = _mm_set1_epi32(0xFF000000);
+    const uint8_t X = 0xFF;  // Used as a placeholder.  The value of X is irrelevant.
+    const __m128i expand0 = _mm_setr_epi8(0,0,0,X, 1,1,1,X, 2,2,2,X, 3,3,3,X);
+    const __m128i expand1 = _mm_setr_epi8(4,4,4,X, 5,5,5,X, 6,6,6,X, 7,7,7,X);
+    const __m128i expand2 = _mm_setr_epi8(8,8,8,X, 9,9,9,X, 10,10,10,X, 11,11,11,X);
+    const __m128i expand3 = _mm_setr_epi8(12,12,12,X, 13,13,13,X, 14,14,14,X, 15,15,15,X);
+    while (count >= 16) {
+        __m128i grays = _mm_loadu_si128((const __m128i*) src);
+
+        __m128i ggga0 = _mm_or_si128(_mm_shuffle_epi8(grays, expand0), alphaMask);
+        __m128i ggga1 = _mm_or_si128(_mm_shuffle_epi8(grays, expand1), alphaMask);
+        __m128i ggga2 = _mm_or_si128(_mm_shuffle_epi8(grays, expand2), alphaMask);
+        __m128i ggga3 = _mm_or_si128(_mm_shuffle_epi8(grays, expand3), alphaMask);
+
+        _mm_storeu_si128((__m128i*) (dst +  0), ggga0);
+        _mm_storeu_si128((__m128i*) (dst +  4), ggga1);
+        _mm_storeu_si128((__m128i*) (dst +  8), ggga2);
+        _mm_storeu_si128((__m128i*) (dst + 12), ggga3);
+
+        src += 16;
+        dst += 16;
+        count -= 16;
+    }
+
     gray_to_RGB1_portable(dst, src, count);
 }
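
Note on the placeholder X in the shuffle masks: the patch says its value is irrelevant, and a scalar model of the byte shuffle may make the reason easier to see. The sketch below is not part of the patch. `shuffle_bytes_model`, `expand_block_model`, and `gray_to_pixel_scalar` are hypothetical names introduced only for illustration, and the model assumes the documented `_mm_shuffle_epi8` behavior (an index byte with its high bit set yields zero, otherwise the low four bits select a source byte) and little-endian pixel layout. Whatever byte lands in the alpha lane, the OR with 0xFF000000 overwrites it with 0xFF.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Bytes16 = std::array<uint8_t, 16>;

// Hypothetical scalar model of a 16-byte shuffle: output byte i is zero when
// the high bit of mask[i] is set, otherwise src[mask[i] & 0x0F].
static Bytes16 shuffle_bytes_model(const Bytes16& src, const Bytes16& mask) {
    Bytes16 out{};
    for (size_t i = 0; i < 16; i++) {
        out[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0F];
    }
    return out;
}

// Expands grays[4*block .. 4*block+3] into four 32-bit pixels, mirroring one
// shuffle + OR pair from the patch. `placeholder` plays the role of X.
static std::array<uint32_t, 4> expand_block_model(const Bytes16& grays,
                                                  int block,
                                                  uint8_t placeholder) {
    Bytes16 mask{};
    for (int p = 0; p < 4; p++) {
        const uint8_t idx = static_cast<uint8_t>(4 * block + p);
        mask[4 * p + 0] = idx;          // first color byte  <- gray
        mask[4 * p + 1] = idx;          // second color byte <- gray
        mask[4 * p + 2] = idx;          // third color byte  <- gray
        mask[4 * p + 3] = placeholder;  // alpha lane, overwritten below
    }
    const Bytes16 shuffled = shuffle_bytes_model(grays, mask);

    std::array<uint32_t, 4> pixels{};
    for (int p = 0; p < 4; p++) {
        // Assemble the four bytes in little-endian order, as an x86 store would.
        const uint32_t v = (uint32_t(shuffled[4 * p + 0]))
                         | (uint32_t(shuffled[4 * p + 1]) << 8)
                         | (uint32_t(shuffled[4 * p + 2]) << 16)
                         | (uint32_t(shuffled[4 * p + 3]) << 24);
        pixels[p] = v | 0xFF000000u;    // same effect as OR-ing with alphaMask
    }
    return pixels;
}

// Scalar reference for one pixel: gray replicated into three color bytes,
// alpha forced to 0xFF.
static uint32_t gray_to_pixel_scalar(uint8_t g) {
    return 0xFF000000u | (uint32_t(g) << 16) | (uint32_t(g) << 8) | uint32_t(g);
}
```

Calling `expand_block_model(grays, b, 0xFF)` and `expand_block_model(grays, b, 0x00)` for the same block should give identical pixels, each matching `gray_to_pixel_scalar` applied to the corresponding input byte.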