Description
Optimize YUV alpha blend AVX2 code to do 32 pixels at a time.
out/Release/libyuv_unittest --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=9999 --libyuv_flags=-1 --gtest_filter=*I420Blend_Opt
Was LibYUVPlanarTest.I420Blend_Opt (2335 ms)
Now LibYUVPlanarTest.I420Blend_Opt (1937 ms)
vs SSSE3
LibYUVPlanarTest.I420Blend_Opt (2599 ms)
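For readers comparing the three timings above, the relative gains can be worked out with a few lines of arithmetic. This is only a sketch: the constant and function names are made up for illustration, and the millisecond values are copied from the benchmark lines in this description, not re-measured.

```python
# Timings in milliseconds, copied from the benchmark lines in this description.
AVX2_BEFORE_MS = 2335  # "Was": AVX2 path before this change
AVX2_AFTER_MS = 1937   # "Now": AVX2 path doing 32 pixels at a time
SSSE3_MS = 2599        # SSSE3 path, for comparison


def speedup(baseline_ms, new_ms):
    """Return how many times faster the new run is than the baseline."""
    return baseline_ms / new_ms


def time_saved_pct(baseline_ms, new_ms):
    """Return the percentage of wall time saved relative to the baseline."""
    return 100.0 * (baseline_ms - new_ms) / baseline_ms


if __name__ == "__main__":
    for label, baseline in (("old AVX2", AVX2_BEFORE_MS), ("SSSE3", SSSE3_MS)):
        print("new AVX2 vs %s: %.2fx faster, %.1f%% less time" % (
            label,
            speedup(baseline, AVX2_AFTER_MS),
            time_saved_pct(baseline, AVX2_AFTER_MS),
        ))
```

On these numbers the new AVX2 path takes roughly 17% less time than the old AVX2 path and roughly 25% less than SSSE3, for this one test at 1280x720.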
BUG=libyuv:527
R=dhrosa@google.com
Committed: https://chromium.googlesource.com/libyuv/libyuv/+/dee77a4ebeaebc781cb3acd80aa6627fd1c7c825
Patch Set 1
Patch Set 2 : avx2 does 32 pixels at a time now
Patch Set 3 : gcc port of avx2 that does 32 pixels
Total comments: 6
Patch Set 4 : add xgetbv comment
Patch Set 5 : update formula to match spreadsheet
Patch Set 6 : merge cpuid changes
Created: 5 years ago
Messages
Total messages: 10 (4 generated)