Chromium Code Reviews

Issue 2397693002: Add MSA optimized YUY2ToI422, YUY2ToI420, UYVYToI422, UYVYToI420 functions (Closed)

Created:
4 years, 2 months ago by manojkumar.bhosale
Modified:
4 years, 2 months ago
CC:
petar.jovanovic, gordana.cmiljanovic_imgtec.com, raghu.gandham_imgtec.com, parag.salasakar_imgtec.com, mandar.sahastrabuddhe_imgtec.com, rob.isherwood_imgtec.com
Target Ref:
refs/heads/master
Project:
libyuv
Visibility:
Public.

Description

Add MSA optimized YUY2ToI422, YUY2ToI420, UYVYToI422, UYVYToI420 functions

R=fbarchard@google.com
BUG=libyuv:634

Performance gains as below,

YUY2ToI422, YUY2ToI420 :-

YUY2ToYRow_MSA         : ~10x
YUY2ToUVRow_MSA        : ~11x
YUY2ToUV422Row_MSA     : ~9x
YUY2ToYRow_Any_MSA     : ~6x
YUY2ToUVRow_Any_MSA    : ~5x
YUY2ToUV422Row_Any_MSA : ~4x

UYVYToI422, UYVYToI420 :-

UYVYToYRow_MSA         : ~10x
UYVYToUVRow_MSA        : ~11x
UYVYToUV422Row_MSA     : ~9x
UYVYToYRow_Any_MSA     : ~6x
UYVYToUVRow_Any_MSA    : ~5x
UYVYToUV422Row_Any_MSA : ~4x

Committed: https://chromium.googlesource.com/libyuv/libyuv/+/a2891ec77c183ec265af8278eee821e4d9715c12
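For readers unfamiliar with the packed formats named in the description, the following is a minimal scalar sketch of what a Y row function and a UV422 row function compute. It is not the libyuv source and not the MSA code from this patch; the function names ending in `_sketch` are hypothetical. The only facts relied on are the byte orders of the formats: YUY2 stores each pixel pair as Y0 U Y1 V, and UYVY stores it as U Y0 V Y1. The sketch assumes an even width.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar reference (not libyuv code).
 * YUY2 packs two pixels into four bytes: Y0 U Y1 V. */
void yuy2_to_y_row_sketch(const uint8_t *src_yuy2, uint8_t *dst_y, int width) {
  for (int x = 0; x < width; ++x) {
    dst_y[x] = src_yuy2[2 * x]; /* luma is every even byte */
  }
}

/* 4:2:2 chroma: one U and one V per horizontal pixel pair, from one row. */
void yuy2_to_uv422_row_sketch(const uint8_t *src_yuy2, uint8_t *dst_u,
                              uint8_t *dst_v, int width) {
  assert(width % 2 == 0); /* assumption of this sketch: even width */
  for (int x = 0; x < width; x += 2) {
    dst_u[x / 2] = src_yuy2[2 * x + 1]; /* U follows Y0 */
    dst_v[x / 2] = src_yuy2[2 * x + 3]; /* V follows Y1 */
  }
}

/* UYVY swaps the positions: U Y0 V Y1, so luma is every odd byte. */
void uyvy_to_y_row_sketch(const uint8_t *src_uyvy, uint8_t *dst_y, int width) {
  for (int x = 0; x < width; ++x) {
    dst_y[x] = src_uyvy[2 * x + 1];
  }
}
```

Each function reads one packed row and writes one or two planar rows, which is the "1 read, multiple writes" access pattern discussed later in the thread.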

Patch Set 1

Total comments: 12

Patch Set 2 : Updates as per review comments

Stats (+205 lines, -3 lines)
M docs/getting_started.md      1 chunk   +3 lines, -3 lines    0 comments
M include/libyuv/row.h         5 chunks  +26 lines, -0 lines   0 comments
M source/convert.cc            2 chunks  +20 lines, -0 lines   0 comments
M source/planar_functions.cc   2 chunks  +20 lines, -0 lines   0 comments
M source/row_any.cc            3 chunks  +16 lines, -0 lines   0 comments
M source/row_msa.cc            1 chunk   +120 lines, -0 lines  0 comments

Messages

Total messages: 14 (5 generated)
manojkumar.bhosale
4 years, 2 months ago (2016-10-05 09:10:58 UTC) #1
manojkumar.bhosale
Updated with reviewers, cc list and performance gain numbers
4 years, 2 months ago (2016-10-05 09:19:14 UTC) #4
fbarchard1
Mostly nits... code looks functional. Prefer less unrolling. If you're loading and storing full 16 ...
4 years, 2 months ago (2016-10-05 22:03:59 UTC) #5
manojkumar.bhosale
Incorporated review comments. Enabling auto-vectorization for C code reduced SIMD benefit and new MSA performance ...
4 years, 2 months ago (2016-10-07 10:42:09 UTC) #7
fbarchard1
lgtm
4 years, 2 months ago (2016-10-07 17:32:44 UTC) #8
fbarchard1
On 2016/10/07 17:32:44, fbarchard1 wrote:
> lgtm

note I don't currently have a way to ...
4 years, 2 months ago (2016-10-07 17:37:11 UTC) #10
fbarchard1
Committed patchset #2 (id:20001) manually as a2891ec77c183ec265af8278eee821e4d9715c12 (presubmit successful).
4 years, 2 months ago (2016-10-07 17:37:25 UTC) #12
fbarchard1
I looked into performance of these functions for Intel and Arm. YUY2ToI422 was affected by ...
4 years, 2 months ago (2016-10-10 18:38:06 UTC) #13
fbarchard1
4 years, 2 months ago (2016-10-12 21:01:54 UTC) #14
Message was sent while issue was closed.
Note that I'm seeing ongoing performance issues with these on ARM. Functions
that have 1 read and multiple writes are much slower than the inverse.
The unittest now allocates the U and V planes as a side-by-side buffer, reducing
page misses.
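
One possible reading of "side by side buffer" is a single allocation that holds both chroma planes, with V starting directly after U, so the two write streams land in adjacent memory. The sketch below illustrates that reading only; it is an assumption, not the actual unittest code, and the type and function names (`UvPlanesSketch`, `uv_planes_alloc_sketch`, `uv_planes_free_sketch`) are hypothetical. It sizes the planes for 4:2:2 (half width, full height).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not libyuv code): U and V share one allocation. */
typedef struct {
  uint8_t *base;     /* start of the single allocation */
  uint8_t *u;        /* U plane, at offset 0 */
  uint8_t *v;        /* V plane, immediately after U */
  size_t plane_size; /* bytes per chroma plane */
} UvPlanesSketch;

/* Returns 0 on success, -1 if the allocation fails. */
int uv_planes_alloc_sketch(UvPlanesSketch *p, int width, int height) {
  assert(width > 0 && height > 0);
  size_t half_width = ((size_t)width + 1) / 2; /* 4:2:2 chroma width */
  size_t plane = half_width * (size_t)height;
  p->base = malloc(2 * plane); /* one buffer for both planes */
  if (p->base == NULL) {
    return -1;
  }
  p->u = p->base;
  p->v = p->base + plane;
  p->plane_size = plane;
  return 0;
}

void uv_planes_free_sketch(UvPlanesSketch *p) {
  free(p->base);
  p->base = p->u = p->v = NULL;
}
```

Whether this layout actually reduces page misses depends on the platform's page size and allocator behavior, which the thread does not detail.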
