Chromium Code Reviews

Issue 290533002: Add Memcpy32 bench. (Closed)

Created:
6 years, 7 months ago by mtklein_C
Modified:
6 years, 7 months ago
Reviewers:
mtklein, qiankun, reed1
CC:
skia-review_googlegroups.com
Base URL:
https://skia.googlesource.com/skia.git@master
Visibility:
Public.

Description

Add Memcpy32 bench.

This compares 32-bit copies using memcpy, autovectorization, and when SSE2 is available, aligned and unaligned SSE2.

Running this on my desktop (Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz), I see all four perform essentially the same, except Clang's autovectorization looks a little better than GCC's. memcpy is calling libc 2.19's __memcpy_sse2_unaligned.

BUG=skia:
Committed: http://code.google.com/p/skia/source/detail?r=14799
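For readers without the patch at hand, here is a rough sketch of what the four copy strategies named in the description might look like. This is not the code from bench/MemcpyBench.cpp: the function names (copy32_memcpy, copy32_loop, copy32_sse2_unaligned, copy32_sse2_aligned) are hypothetical, and the choice to align the destination with a scalar prologue in the "aligned" variant is an assumption about one reasonable way to do it.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics
#endif

// Hypothetical variant 1: defer to libc's memcpy.
inline void copy32_memcpy(uint32_t* dst, const uint32_t* src, size_t count) {
    std::memcpy(dst, src, count * sizeof(uint32_t));
}

// Hypothetical variant 2: a plain loop that the compiler is free to autovectorize.
inline void copy32_loop(uint32_t* dst, const uint32_t* src, size_t count) {
    for (size_t i = 0; i < count; ++i) {
        dst[i] = src[i];
    }
}

#if defined(__SSE2__)
// Hypothetical variant 3: unaligned 128-bit loads and stores, four pixels at a time.
inline void copy32_sse2_unaligned(uint32_t* dst, const uint32_t* src, size_t count) {
    while (count >= 4) {
        __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(dst), v);
        src += 4;
        dst += 4;
        count -= 4;
    }
    while (count-- > 0) {
        *dst++ = *src++;  // scalar tail
    }
}

// Hypothetical variant 4: copy scalars until dst is 16-byte aligned (an assumption
// about the strategy), then use aligned stores. Loads stay unaligned because src
// may not share dst's alignment.
inline void copy32_sse2_aligned(uint32_t* dst, const uint32_t* src, size_t count) {
    while (count > 0 && (reinterpret_cast<uintptr_t>(dst) & 0xF) != 0) {
        *dst++ = *src++;
        --count;
    }
    while (count >= 4) {
        __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src));
        _mm_store_si128(reinterpret_cast<__m128i*>(dst), v);
        src += 4;
        dst += 4;
        count -= 4;
    }
    while (count-- > 0) {
        *dst++ = *src++;  // scalar tail
    }
}
#endif  // defined(__SSE2__)
```

All four take the same (dst, src, count) arguments, so a benchmark can swap them behind one function pointer and time each across a range of counts.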

Patch Set 1 #

Patch Set 2 : const loops is dumb #

Patch Set 3 : admit defeat #

Total comments: 2

Patch Set 4 : alpha #

Stats (+155 lines, -0 lines)
A bench/MemcpyBench.cpp: 1 chunk, +154 lines, -0 lines, 0 comments
M gyp/bench.gypi: 1 chunk, +1 line, -0 lines, 0 comments

Messages

Total messages: 6 (0 generated)
mtklein
If we're going to be looking at adding sk_memcpy32, I want to make sure we've ...
6 years, 7 months ago (2014-05-15 15:35:46 UTC) #1
qiankun
lgtm besides a nit. These four functions perform almost the same on my machine. https://codereview.chromium.org/290533002/diff/40001/gyp/bench.gypi File ...
6 years, 7 months ago (2014-05-20 09:12:33 UTC) #2
mtklein
lgtm https://codereview.chromium.org/290533002/diff/40001/gyp/bench.gypi File gyp/bench.gypi (right): https://codereview.chromium.org/290533002/diff/40001/gyp/bench.gypi#newcode57 gyp/bench.gypi:57: '../bench/MemcpyBench.cpp', On 2014/05/20 09:12:33, qiankun wrote: > File ...
6 years, 7 months ago (2014-05-20 14:28:01 UTC) #3
commit-bot: I haz the power
CQ is trying da patch. Follow status at https://skia-tree-status.appspot.com/cq/mtklein@chromium.org/290533002/60001
6 years, 7 months ago (2014-05-20 14:28:14 UTC) #4
commit-bot: I haz the power
Change committed as 14799
6 years, 7 months ago (2014-05-20 14:54:06 UTC) #5
qiankun
6 years, 7 months ago (2014-05-22 08:07:12 UTC) #6
Message was sent while issue was closed.
On 2014/05/20 14:54:06, I haz the power (commit-bot) wrote:
> Change committed as 14799
memcpy32_sse2_align performs best for large data sizes on an ASUS T100, which is
equipped with an Intel(R) Atom(TM) CPU Z3740 @ 1.33GHz.

   memcpy32_sse2_unalign_100000   NONRENDERING:  cmsecs =    105.42
    memcpy32_sse2_unalign_10000   NONRENDERING:  cmsecs =      6.29
     memcpy32_sse2_unalign_1000   NONRENDERING:  cmsecs =      0.52
      memcpy32_sse2_unalign_100   NONRENDERING:  cmsecs =      0.07
       memcpy32_sse2_unalign_10   NONRENDERING:  cmsecs =      0.03
     memcpy32_sse2_align_100000   NONRENDERING:  cmsecs =     85.95
      memcpy32_sse2_align_10000   NONRENDERING:  cmsecs =      6.24
       memcpy32_sse2_align_1000   NONRENDERING:  cmsecs =      0.32
        memcpy32_sse2_align_100   NONRENDERING:  cmsecs =      0.06
         memcpy32_sse2_align_10   NONRENDERING:  cmsecs =      0.03
  memcpy32_autovectorize_100000   NONRENDERING:  cmsecs =    210.41
   memcpy32_autovectorize_10000   NONRENDERING:  cmsecs =     15.44
    memcpy32_autovectorize_1000   NONRENDERING:  cmsecs =      1.45
     memcpy32_autovectorize_100   NONRENDERING:  cmsecs =      0.18
      memcpy32_autovectorize_10   NONRENDERING:  cmsecs =      0.03
         memcpy32_memcpy_100000   NONRENDERING:  cmsecs =    103.59
          memcpy32_memcpy_10000   NONRENDERING:  cmsecs =      4.34
           memcpy32_memcpy_1000   NONRENDERING:  cmsecs =      0.32
            memcpy32_memcpy_100   NONRENDERING:  cmsecs =      0.06
             memcpy32_memcpy_10   NONRENDERING:  cmsecs =      0.03
