Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(162)

Issue 1157663007: Greedy allocator: perf work (Closed)

Created:
5 years, 6 months ago by Mircea Trofin
Modified:
5 years, 4 months ago
CC:
v8-dev, titzer, Benedikt Meurer, JF
Base URL:
https://chromium.googlesource.com/v8/v8.git@master
Target Ref:
refs/pending/heads/master
Project:
v8
Visibility:
Public.

Description

While working on tuning, I realized that the initial implementation was suffering from an inefficiency. Turns out, if a range had multiple conflicts, only the first 2 were considered, after which (regardless of weight) the remaining would win by default. This would lead in some scenarios to inefficient (while correct) codegen. To correct this, and avoid disproportionately large compile time regressions, I implemented an alternate data structure for storing allocations. There were a few more important changes from the initial implementation: - up-front grouping of related live ranges (e.g. phi inputs and output), and a first-attempt allocation of groups on the same register; - conflict resolution: I initially believed that splitting blocked live ranges should rely on conflicts, but after more analysis (including reference implementations and the scarce documentation around the LLVM implementation) I changed that to relying on "what's best for the range itself", letting the weights mechanics of the algorithm converge to the right decision. As a result, a few benchmarks that were before regressed are now improved, and a few with serious regressions (15-20%) are now under the 10% range. I expect these remaining regressions to be easier to understand and address, given the changes I'm introducing here. The change includes a few other opportunistic changes. The optimized code printer was outputting instruction offsets in decimal, changed it to hex, to match what perf outputs them as. Also, when producing printouts from the instruction scheduler, blocks now each indicate which loop (if any) they belong to, which helps with perf analysis. BUG=

Patch Set 1 #

Total comments: 2

Patch Set 2 : signed/unsigned implicit cast fix #

Total comments: 12

Patch Set 3 : addressed feedback #

Total comments: 51

Patch Set 4 : #

Patch Set 5 : #

Total comments: 40

Patch Set 6 : #

Patch Set 7 : Small fix #

Total comments: 39

Patch Set 8 : #

Patch Set 9 : #

Total comments: 36
Unified diffs Side-by-side diffs Delta from patch set Stats (+1168 lines, -403 lines) Patch
M src/compiler/register-allocator.h View 1 2 3 4 5 6 7 8 12 chunks +239 lines, -19 lines 9 comments Download
M src/compiler/register-allocator.cc View 1 2 3 4 5 6 7 26 chunks +923 lines, -379 lines 27 comments Download
M src/disassembler.cc View 1 2 3 4 1 chunk +2 lines, -3 lines 0 comments Download
M src/zone-containers.h View 1 chunk +4 lines, -2 lines 0 comments Download

Messages

Total messages: 20 (4 generated)
Mircea Trofin
https://codereview.chromium.org/1157663007/diff/1/src/compiler/schedule.cc File src/compiler/schedule.cc (right): https://codereview.chromium.org/1157663007/diff/1/src/compiler/schedule.cc#newcode340 src/compiler/schedule.cc:340: os << " (in loop: B" << block->loop_header()->rpo_number() << ...
5 years, 6 months ago (2015-06-03 05:25:51 UTC) #2
Jarin
First batch of comments. https://codereview.chromium.org/1157663007/diff/20001/src/compiler/register-allocator.h File src/compiler/register-allocator.h (right): https://codereview.chromium.org/1157663007/diff/20001/src/compiler/register-allocator.h#newcode485 src/compiler/register-allocator.h:485: float weight_; Could you please ...
5 years, 6 months ago (2015-06-03 15:06:16 UTC) #3
Mircea Trofin
https://codereview.chromium.org/1157663007/diff/20001/src/compiler/register-allocator.h File src/compiler/register-allocator.h (right): https://codereview.chromium.org/1157663007/diff/20001/src/compiler/register-allocator.h#newcode485 src/compiler/register-allocator.h:485: float weight_; On 2015/06/03 15:06:16, jarin wrote: > Could ...
5 years, 6 months ago (2015-06-04 03:38:40 UTC) #4
Jarin
Another batch of comments. https://codereview.chromium.org/1157663007/diff/20001/src/compiler/register-allocator.cc File src/compiler/register-allocator.cc (right): https://codereview.chromium.org/1157663007/diff/20001/src/compiler/register-allocator.cc#newcode3477 src/compiler/register-allocator.cc:3477: static_cast<size_t>(code()->InstructionBlockCount())) On 2015/06/03 05:25:51, Mircea ...
5 years, 6 months ago (2015-06-08 13:32:50 UTC) #5
jvoung (off chromium)
Some very-surface-level comments. https://codereview.chromium.org/1157663007/diff/40001/src/compiler/register-allocator.cc File src/compiler/register-allocator.cc (right): https://codereview.chromium.org/1157663007/diff/40001/src/compiler/register-allocator.cc#newcode2101 src/compiler/register-allocator.cc:2101: range_count++; How is range_count used? Seems ...
5 years, 6 months ago (2015-06-08 18:59:22 UTC) #6
Mircea Trofin
Incorporated feedback, and also retracted the cosmetic formatting in both disassembler.cc and scheduler.cc, which fit ...
5 years, 6 months ago (2015-06-09 02:56:03 UTC) #7
Jarin
Next batch of comments. https://codereview.chromium.org/1157663007/diff/40001/src/compiler/register-allocator.cc File src/compiler/register-allocator.cc (right): https://codereview.chromium.org/1157663007/diff/40001/src/compiler/register-allocator.cc#newcode888 src/compiler/register-allocator.cc:888: DCHECK_NE(0U, size_); On 2015/06/09 02:56:02, ...
5 years, 6 months ago (2015-06-09 15:00:42 UTC) #8
Mircea Trofin
https://codereview.chromium.org/1157663007/diff/40001/src/compiler/register-allocator.cc File src/compiler/register-allocator.cc (right): https://codereview.chromium.org/1157663007/diff/40001/src/compiler/register-allocator.cc#newcode888 src/compiler/register-allocator.cc:888: DCHECK_NE(0U, size_); On 2015/06/09 15:00:41, jarin wrote: > On ...
5 years, 6 months ago (2015-06-10 05:37:11 UTC) #9
Jarin
https://codereview.chromium.org/1157663007/diff/80001/src/compiler/register-allocator.cc File src/compiler/register-allocator.cc (right): https://codereview.chromium.org/1157663007/diff/80001/src/compiler/register-allocator.cc#newcode3227 src/compiler/register-allocator.cc:3227: Next(state)) { On 2015/06/09 15:00:42, jarin wrote: > You ...
5 years, 6 months ago (2015-06-10 05:59:07 UTC) #10
Mircea Trofin
5 years, 6 months ago (2015-06-10 16:54:49 UTC) #11
Benedikt Meurer
First round of comments. https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.cc File src/compiler/register-allocator.cc (right): https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.cc#newcode127 src/compiler/register-allocator.cc:127: auto hint = range->FirstHintPosition(&reg); Don't ...
5 years, 6 months ago (2015-06-11 05:30:30 UTC) #13
Mircea Trofin
Thanks for the feedback - updated. https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.cc File src/compiler/register-allocator.cc (right): https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.cc#newcode354 src/compiler/register-allocator.cc:354: InvalidateWeightAndSize(); On 2015/06/11 ...
5 years, 6 months ago (2015-06-11 07:19:36 UTC) #14
Benedikt Meurer
This is actually a big CL, which I'm not sure we can get into landable ...
5 years, 6 months ago (2015-06-11 08:15:43 UTC) #16
Jarin
On 2015/06/11 08:15:43, Benedikt Meurer wrote: > This is actually a big CL, which I'm ...
5 years, 6 months ago (2015-06-11 08:25:16 UTC) #17
Jarin
Some comments I accumulated yesterday. I realize that a lot of this code will be ...
5 years, 6 months ago (2015-06-12 04:09:12 UTC) #18
Sven Panne
5 years, 6 months ago (2015-06-12 07:36:38 UTC) #20
Quick DBC...

https://codereview.chromium.org/1157663007/diff/160001/src/compiler/register-...
File src/compiler/register-allocator.h (right):

https://codereview.chromium.org/1157663007/diff/160001/src/compiler/register-...
src/compiler/register-allocator.h:969: if (a_start < b_start) return a_end >
b_start;
On 2015/06/12 04:09:12, jarin wrote:
> Out of curiosity, cannot you just leave out the "if (a_start == b_start)
return
> true" line and replace "a_start < b_start" with "a_start <= b_start"?
> 
> This special casing for equality seems a bit funny.
> 
> Perhaps you can even expand the recursive call, so the body would be:
> 
> if (a_start <= b_start) return b_start < a_end;
> return b_start <= a_start && a_start < b_end;

Hmmm, this looks still too complicated, normally just a single line is needed:

   return a_start <= b_end && b_start <= a_end;

I'm not sure about the "open-ness" of our intervals, but I'm quite sure that
this condition can be tweaked to any convention.

Powered by Google App Engine
This is Rietveld 408576698