Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(356)

Issue 196793026: Add self-scheduling to threaded translation (vs static) (Closed)

Created:
6 years, 9 months ago by jvoung (off chromium)
Modified:
6 years, 9 months ago
Reviewers:
Jim Stichnoth, JF
CC:
native-client-reviews_googlegroups.com
Base URL:
http://git.chromium.org/native_client/pnacl-llvm.git@master
Visibility:
Public.

Description

Add self-scheduling to threaded translation (vs static) For the x86-64 sandboxed translator it makes about a 5% difference over a set of 13 pexes from the web / webstore. For python.pexe this reduces the translation time w/ 4 threads from 43 seconds to 37 seconds. Most of the other web benchmarks only get slight improvements (if any). For the ARM sandboxed translator, it's about 4.5% aggregate difference for on the a15. Again, most of the apps are a wash except for python, gamecake and the gameboy emulator. For the unsandboxed translator, this is often a wash, except on a few benchmarks. Testing on spec2k: Same or better for sandboxed translator. Notably, crafty translation time drops from 1.59 secs to 1.19 secs w/ CPU utilization going from 1.8 to 2.7 w/ 4 threads. With two threads it's 2.1 secs to 1.6 secs and CPU utilization from 1.2 to 1.7. Crafty was one of the benchmarks that regressed after switching from 3.3 to 3.4 and perhaps part of it was bad scheduling. The gcc and eon benchmarks also get a nice boost. Similar on ARM: crafty drops from 5.7 to 4.8 w/ 2 threads. Use a ChunkSize of 8, to reduce the synchronization overhead somewhat, hopefully without hurting streaming translation. This tapers off to 1 when there are few functions remaining to attempt finer grained load balancing near the end. For some idea of how this affects streaming translation the average function size for the benchmarks range from 160 bytes to 1700 bytes. So 8 times that is about 1.3KB to 14KB. Geomean of average func size is 570 bytes. For that geomean, 4 threads times 8 functions would grab ~18KB at a time on geo-average-ish. BUG= http://code.google.com/p/nativeclient/issues/detail?id=3777 R=jfb@chromium.org, stichnot@chromium.org Committed: https://gerrit.chromium.org/gerrit/gitweb?p=native_client/pnacl-llvm.git;a=commit;h=91ec3c7

Patch Set 1 #

Patch Set 2 : cleanup #

Patch Set 3 : lower chunk size just in case #

Total comments: 15

Patch Set 4 : jim review #

Patch Set 5 : two dots #

Patch Set 6 : disallow copy and assign #

Total comments: 21

Patch Set 7 : JFs review #

Patch Set 8 : add assert and comment, etc. #

Patch Set 9 : reinstate check #

Unified diffs Side-by-side diffs Delta from patch set Stats (+185 lines, -13 lines) Patch
A tools/pnacl-llc/ThreadedFunctionQueue.h View 1 2 3 4 5 6 7 1 chunk +109 lines, -0 lines 0 comments Download
M tools/pnacl-llc/pnacl-llc.cpp View 1 2 3 4 5 6 7 8 11 chunks +76 lines, -13 lines 0 comments Download

Messages

Total messages: 12 (0 generated)
jvoung (off chromium)
6 years, 9 months ago (2014-03-18 18:25:56 UTC) #1
Jim Stichnoth
https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/ThreadedFunctionQueue.h File tools/pnacl-llc/ThreadedFunctionQueue.h (right): https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/ThreadedFunctionQueue.h#newcode68 tools/pnacl-llc/ThreadedFunctionQueue.h:68: unsigned RecommendedChunkSize() { declare as const https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/ThreadedFunctionQueue.h#newcode81 tools/pnacl-llc/ThreadedFunctionQueue.h:81: unsigned ...
6 years, 9 months ago (2014-03-18 20:49:49 UTC) #2
jvoung (off chromium)
Thanks! https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/ThreadedFunctionQueue.h File tools/pnacl-llc/ThreadedFunctionQueue.h (right): https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/ThreadedFunctionQueue.h#newcode68 tools/pnacl-llc/ThreadedFunctionQueue.h:68: unsigned RecommendedChunkSize() { On 2014/03/18 20:49:49, stichnot wrote: ...
6 years, 9 months ago (2014-03-18 22:01:17 UTC) #3
Jim Stichnoth
lgtm https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/pnacl-llc.cpp File tools/pnacl-llc/pnacl-llc.cpp (right): https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/pnacl-llc.cpp#newcode135 tools/pnacl-llc/pnacl-llc.cpp:135: NoSplitModuleDynamic("no-split-module-dynamic", On 2014/03/18 22:01:17, jvoung (cr) wrote: > ...
6 years, 9 months ago (2014-03-18 22:23:12 UTC) #4
JF
https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/ThreadedFunctionQueue.h File tools/pnacl-llc/ThreadedFunctionQueue.h (right): https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/ThreadedFunctionQueue.h#newcode23 tools/pnacl-llc/ThreadedFunctionQueue.h:23: : SplitModuleCount(SplitModuleCount), Silly corner case, but checking that SplitModuleCount ...
6 years, 9 months ago (2014-03-19 04:47:04 UTC) #5
jvoung (off chromium)
https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/ThreadedFunctionQueue.h File tools/pnacl-llc/ThreadedFunctionQueue.h (right): https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/ThreadedFunctionQueue.h#newcode23 tools/pnacl-llc/ThreadedFunctionQueue.h:23: : SplitModuleCount(SplitModuleCount), On 2014/03/19 04:47:05, JF wrote: > Silly ...
6 years, 9 months ago (2014-03-19 18:44:36 UTC) #6
JF
https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/ThreadedFunctionQueue.h File tools/pnacl-llc/ThreadedFunctionQueue.h (right): https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/ThreadedFunctionQueue.h#newcode35 tools/pnacl-llc/ThreadedFunctionQueue.h:35: return FuncIndex % SplitModuleCount == ThreadIndex; On 2014/03/19 18:44:37, ...
6 years, 9 months ago (2014-03-19 20:26:07 UTC) #7
jvoung (off chromium)
https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/ThreadedFunctionQueue.h File tools/pnacl-llc/ThreadedFunctionQueue.h (right): https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/ThreadedFunctionQueue.h#newcode35 tools/pnacl-llc/ThreadedFunctionQueue.h:35: return FuncIndex % SplitModuleCount == ThreadIndex; On 2014/03/19 20:26:08, ...
6 years, 9 months ago (2014-03-19 21:45:17 UTC) #8
JF
lgtm
6 years, 9 months ago (2014-03-19 22:02:46 UTC) #9
jvoung (off chromium)
https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/pnacl-llc.cpp File tools/pnacl-llc/pnacl-llc.cpp (right): https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/pnacl-llc.cpp#newcode470 tools/pnacl-llc/pnacl-llc.cpp:470: while (FuncIndex < NextIndex && I != E) { ...
6 years, 9 months ago (2014-03-20 15:50:30 UTC) #10
JF
On 2014/03/20 15:50:30, jvoung (cr) wrote: > https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/pnacl-llc.cpp > File tools/pnacl-llc/pnacl-llc.cpp (right): > > https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/pnacl-llc.cpp#newcode470 ...
6 years, 9 months ago (2014-03-20 16:02:42 UTC) #11
jvoung (off chromium)
6 years, 9 months ago (2014-03-20 23:17:00 UTC) #12
Message was sent while issue was closed.
Committed patchset #9 manually as r91ec3c7 (presubmit successful).

Powered by Google App Engine
This is Rietveld 408576698