Chromium Code Reviews

Issue 2420553002: Add ARGBExtractAlpha_AVX2 function (Closed)

Created:
4 years, 2 months ago by fbarchard1
Modified:
4 years, 2 months ago
CC:
wangcheng
Target Ref:
refs/heads/master
Project:
libyuv
Visibility:
Public.

Description

Add ARGBExtractAlpha_AVX2 function

Port SSE2 version to AVX2.

BUG=libyuv:572
TEST=/usr/local/google/home/fbarchard/intelsde/sde -skx -- out/Release/libyuv_unittest --gtest_filter=*Extract*
R=wangcheng@google.com, magjed@chromium.org

Committed: https://chromium.googlesource.com/libyuv/libyuv/+/a5e93766a20ac5fff9e0ef2f5bc7c4bb1a0fdb8e
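For readers unfamiliar with the operation being ported, here is a minimal scalar sketch of alpha extraction. It is not the code from this patch: the name `extract_alpha_row` is hypothetical, and it assumes libyuv's usual ARGB memory layout of B, G, R, A bytes per pixel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference, not libyuv's actual implementation.
 * Assumes each ARGB pixel is stored in memory as B, G, R, A. */
void extract_alpha_row(const uint8_t* src_argb, uint8_t* dst_a, int width) {
  assert(src_argb != NULL && dst_a != NULL && width >= 0);
  for (int i = 0; i < width; ++i) {
    /* Alpha is the fourth byte of each 4-byte pixel. */
    dst_a[i] = src_argb[4 * i + 3];
  }
}
```

A SIMD version does the same work on many pixels per iteration, typically by shifting each 32-bit pixel right by 24 bits and packing the results down to bytes.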

Patch Set 1 #

Patch Set 2 : disable macro for clangcl 32 bit #

Patch Set 3 : vpermd instead of 2 vpermq #

Patch Set 4 : vpermd instead of 2 vpermq #

Total comments: 2

Patch Set 5 : unroll avx2 to do 32 pixels #

Patch Set 6 : schedule pack before 2nd load #

Patch Set 7 : bump version #

Stats (+64 lines, -2 lines)
M README.chromium             1 chunk   +1 line, -1 line    0 comments
M include/libyuv/row.h        2 chunks  +12 lines, -0 lines 0 comments
M include/libyuv/version.h    1 chunk   +1 line, -1 line    0 comments
M source/planar_functions.cc  1 chunk   +6 lines, -0 lines  0 comments
M source/row_any.cc           1 chunk   +3 lines, -0 lines  0 comments
M source/row_gcc.cc           1 chunk   +41 lines, -0 lines 0 comments

Messages

Total messages: 14 (3 generated)
fbarchard1
/usr/local/google/home/fbarchard/intelsde/sde -skx -- out/Release/libyuv_unittest --gtest_filter=*Extract* Note: Google Test filter = *Extract* [==========] Running 1 test ...
4 years, 2 months ago (2016-10-13 02:08:22 UTC) #2
fbarchard1
try bots pass. This function is used in media canvas capture to convert ARGB to ...
4 years, 2 months ago (2016-10-13 17:31:31 UTC) #3
fbarchard1
Benchmark on MacBook Air 11" (early 2015), Broadwell i7-5650U 2.2GHz: LIBYUV_DISABLE_AVX2=1 out/Release/libyuv_unittest --gtest_filter=*ExtractAlpha* --libyuv_width=640 ...
4 years, 2 months ago (2016-10-13 18:01:08 UTC) #4
fbarchard1
tried unrolling: void ARGBExtractAlphaRow_AVX2(const uint8* src_argb, uint8* dst_a, int width) { asm volatile ( LABELALIGN ...
4 years, 2 months ago (2016-10-13 18:48:29 UTC) #5
fbarchard1
replaced 2 vpermq with 1 vpermd as done in the ARGBToY function using the same ...
4 years, 2 months ago (2016-10-13 19:28:20 UTC) #6
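Why a permute is needed at all can be shown with a scalar model. The sketch below is an illustration under assumptions, not the patch's assembly: it assumes the 32-pixel, four-register form from patch set 5, models the AVX2 pack instructions as operating independently on each 128-bit lane, and uses the index table {0, 4, 1, 5, 2, 6, 3, 7}, which was derived from this model rather than quoted from the patch. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of an in-lane dword-to-word pack (vpackssdw-style on 256 bits).
 * Saturation is omitted because every input here is already in 0..255. */
static void pack_dwords_in_lane(const uint32_t a[8], const uint32_t b[8],
                                uint16_t out[16]) {
  for (int lane = 0; lane < 2; ++lane) {
    for (int j = 0; j < 4; ++j) {
      out[lane * 8 + j] = (uint16_t)a[lane * 4 + j];
      out[lane * 8 + 4 + j] = (uint16_t)b[lane * 4 + j];
    }
  }
}

/* Scalar model of an in-lane word-to-byte pack (vpackuswb-style). */
static void pack_words_in_lane(const uint16_t a[16], const uint16_t b[16],
                               uint8_t out[32]) {
  for (int lane = 0; lane < 2; ++lane) {
    for (int j = 0; j < 8; ++j) {
      out[lane * 16 + j] = (uint8_t)a[lane * 8 + j];
      out[lane * 16 + 8 + j] = (uint8_t)b[lane * 8 + j];
    }
  }
}

/* Scalar model of a cross-lane dword permute: dst dword i = src dword idx[i]. */
static void permute_dwords(const uint8_t src[32], const int idx[8],
                           uint8_t dst[32]) {
  for (int i = 0; i < 8; ++i) {
    memcpy(dst + 4 * i, src + 4 * idx[i], 4);
  }
}

/* Extracts 32 alpha bytes from 32 ARGB pixels (assumed B, G, R, A in memory). */
void extract_alpha_32_model(const uint8_t* src_argb, uint8_t* dst_a) {
  static const int kPermute[8] = {0, 4, 1, 5, 2, 6, 3, 7};  /* assumed table */
  uint32_t reg[4][8];
  uint16_t lo[16], hi[16];
  uint8_t packed[32];
  assert(src_argb != NULL && dst_a != NULL);
  for (int r = 0; r < 4; ++r) {
    for (int i = 0; i < 8; ++i) {
      /* Models a 32-byte load followed by a right shift of each dword by 24. */
      reg[r][i] = src_argb[(r * 8 + i) * 4 + 3];
    }
  }
  pack_dwords_in_lane(reg[0], reg[1], lo);
  pack_dwords_in_lane(reg[2], reg[3], hi);
  pack_words_in_lane(lo, hi, packed);        /* dwords are now out of order */
  permute_dwords(packed, kPermute, dst_a);   /* one permute restores order */
}
```

In this model the two pack stages leave the eight output dwords interleaved across lanes, and a single dword permute puts them back in pixel order, which is the motivation for preferring one vpermd over two vpermq.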
wangcheng
https://codereview.chromium.org/2420553002/diff/60001/source/row_gcc.cc File source/row_gcc.cc (right): https://codereview.chromium.org/2420553002/diff/60001/source/row_gcc.cc#newcode2884 source/row_gcc.cc:2884: "+rm"(width) // %2 Do you need to check "width" ...
4 years, 2 months ago (2016-10-13 19:52:18 UTC) #8
fbarchard1
https://codereview.chromium.org/2420553002/diff/60001/source/row_gcc.cc File source/row_gcc.cc (right): https://codereview.chromium.org/2420553002/diff/60001/source/row_gcc.cc#newcode2884 source/row_gcc.cc:2884: "+rm"(width) // %2 On 2016/10/13 19:52:18, wangcheng wrote: > ...
4 years, 2 months ago (2016-10-13 21:46:27 UTC) #9
fbarchard1
PTAL. Old SSE2: 4618, old AVX2: 4109, new AVX2: 3792, doing 640x360, 100k iterations. 21.7% ...
4 years, 2 months ago (2016-10-13 22:04:47 UTC) #10
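The relative gains behind these figures can be checked with a small helper. This is a sketch under assumptions: the function name `speedup_percent` is hypothetical, the three numbers are assumed to be elapsed times in the same (unstated) unit, and speedup is taken as old/new - 1. Under that reading, the quoted 21.7% appears consistent with the SSE2 and new AVX2 figures.

```c
#include <assert.h>

/* Percent by which new_time improves on old_time, computed as old/new - 1.
 * Assumes both arguments are elapsed times in the same unit. */
double speedup_percent(double old_time, double new_time) {
  assert(old_time > 0.0 && new_time > 0.0);
  return (old_time / new_time - 1.0) * 100.0;
}
```

Calling `speedup_percent(4618.0, 3792.0)` compares the SSE2 version with the new AVX2 version, and `speedup_percent(4109.0, 3792.0)` compares the two AVX2 variants.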
wangcheng
lgtm
4 years, 2 months ago (2016-10-13 22:38:27 UTC) #11
fbarchard1
Note that vpshufb can replace vpsrld, but it appears to be slower: 3872 vs 3818 with ...
4 years, 2 months ago (2016-10-13 22:59:35 UTC) #12
fbarchard1
4 years, 2 months ago (2016-10-13 23:03:47 UTC) #14
Message was sent while issue was closed.
Committed patchset #7 (id:120001) manually as
a5e93766a20ac5fff9e0ef2f5bc7c4bb1a0fdb8e (presubmit successful).
