Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(34)

Issue 860283004: Improve the x64 Load and Store access instruction selection to include immediate offsets.

Created:
5 years, 10 months ago by dougc
Modified:
4 years, 8 months ago
Reviewers:
Jarin, danno
CC:
v8-dev
Base URL:
https://chromium.googlesource.com/v8/v8.git@master
Target Ref:
refs/pending/heads/master
Project:
v8
Visibility:
Public.

Description

Improve the x64 Load and Store access instruction selection to include immediate offsets. Currently GetEffectiveAddressMemoryOperand() uses BaseWithIndexAndDisplacement64Match. However this does not recognize the ChangeUint32ToUint64 operation which is applied to the index. It looks like a lot of work to rework BaseWithIndexAndDisplacement64Match, and BaseWithIndexAndDisplacement32Match seems ok as-is, so it seems more appropriate to add x64 backend specific code to match the patterns which is done inline in GetEffectiveAddressMemoryOperand(). Some of the code in VisitChangeUint32ToUint64() has been refactored into OpZeroExtends() and is needed in the pattern matching. The BaseWithIndexAndDisplacementMatch code has some OwnedBy() tests. It's not immediately clear if these are to maintain some invariant or if they are just a heuristic? It seems generally better to dehoist the index offsets aggressively so the OwnedBy() test is omitted in this patch. This could use some reviewer scrutiny? A TODO note is included to explore the scaled-index patterns matched by BaseWithIndexAndDisplacement64Match. Such patterns are less likely to be proven within bounds and thus not taken by these paths anyway, so they are left for future work. This patch gives a useful x0.92 improvement in run time for an asm.js zlib benchmark when using index masking to avoid bounds checks. BUG=

Patch Set 1 #

Total comments: 3

Patch Set 2 : Address reviewer feedback #

Patch Set 3 : Guard against the int32 addition argument being negative. #

Patch Set 4 : Match scaled indexes #

Patch Set 5 : Rebase #

Unified diffs Side-by-side diffs Delta from patch set Stats (+89 lines, -37 lines) Patch
M src/compiler/x64/instruction-selector-x64.cc View 1 2 3 4 3 chunks +89 lines, -37 lines 0 comments Download

Messages

Total messages: 16 (4 generated)
dougc
5 years, 10 months ago (2015-01-28 05:39:46 UTC) #2
Benedikt Meurer
I think this problem is not limited to x64, but affects all 64bit platforms, so ...
5 years, 10 months ago (2015-01-28 05:55:34 UTC) #4
dougc
Thank you for the feedback. There was a problem with the difference between the int32 ...
5 years, 10 months ago (2015-01-29 08:14:08 UTC) #5
Benedikt Meurer
On 2015/01/29 at 08:14:08, dtc-v8 wrote: > Thank you for the feedback. > > There ...
5 years, 10 months ago (2015-01-29 08:17:28 UTC) #6
dougc
On 2015/01/29 08:17:28, Benedikt Meurer wrote: > On 2015/01/29 at 08:14:08, dtc-v8 wrote: ... > ...
5 years, 10 months ago (2015-01-29 09:23:43 UTC) #7
Benedikt Meurer
I think this is something that can only be solved properly by adding support for ...
5 years, 10 months ago (2015-01-30 04:50:09 UTC) #8
dougc
On 2015/01/30 04:50:09, Benedikt Meurer wrote: > I think this is something that can only ...
5 years, 10 months ago (2015-01-30 12:13:20 UTC) #10
Benedikt Meurer
> The index is constrained to be a 31-bit unsigned integer because: it is an ...
5 years, 10 months ago (2015-01-30 12:15:03 UTC) #11
dougc
On 2015/01/30 12:15:03, Benedikt Meurer wrote: > > The index is constrained to be a ...
5 years, 10 months ago (2015-01-30 12:36:57 UTC) #12
Benedikt Meurer
On 2015/01/30 12:36:57, dougc wrote: > On 2015/01/30 12:15:03, Benedikt Meurer wrote: > > > ...
5 years, 10 months ago (2015-02-04 18:39:34 UTC) #13
dougc
On 2015/02/04 18:39:34, Benedikt Meurer wrote: > On 2015/01/30 12:36:57, dougc wrote: > > On ...
5 years, 10 months ago (2015-02-05 13:36:07 UTC) #14
Benedikt Meurer
5 years, 10 months ago (2015-02-05 13:53:33 UTC) #15
On 2015/02/05 at 13:36:07, dtc-v8 wrote:
> On 2015/02/04 18:39:34, Benedikt Meurer wrote:
> > On 2015/01/30 12:36:57, dougc wrote:
> > > On 2015/01/30 12:15:03, Benedikt Meurer wrote:
> > > > > The index is constrained to be a 31-bit unsigned integer because: it
is an
> > > > array buffer index, and the maximum array buffer size is 2GB, and at
this
> > > point
> > > > in the code the index is known to be within bounds?
> > > > 
> > > > That's only true for asm.js heap access, not for Load/Store in general.
> > > 
> > > Could you help me understand what aspect of asm.js this is limited to?
> > > 
> > > This code path is only taken for typed arrays, when
> > > IsExternalArrayElementsKind() is true? E.g. See the logic in
> > > JSTypedLowering::ReduceJSLoadProperty().
> > 
> > As said, that is the current implementation, which only considers asm.js.
But we
> > will be targeting regular JavaScript soon(ish), and that means you can have
> > abitrary Loads/Stores, i.e. even Loads/Stores for Blink DOM objects or other
> > arrays/objects/whatever. You cannot make any assumptions about the index in
that
> > case.
> 
> It's true that asm.js heap access patterns are a subset of the code this path
supports, but this path is not limited to asm.js, rather it deals with typed
array element access. For example u32[i+K] is supported by this path but is not
asm.js.
> 
> It would appear that LoadElement() and StoreElement() are only used when the
index is within bounds, as they do no bounds checking, so even if these were
used for other objects they would only be used when the index is known to be
within bounds of the object?  Further ReduceJSLoadProperty() would be checking
that the index is within the 2G limit. Thus even in future the index will be an
unsigned 31-bit integer if the LoadElement() or StoreElement() paths are taken?
> 
> Transforming the addition to a 64 bit operation earlier might also frustrate
later optimizations. For example the right and left shift are optimized away via
some patterns etc, and the addition moved over a low bit masking operation. A 64
bit op might require more general pattern matching and might not even capture
the constrains needed to satisfy the transforms. It might be sound to lower to
32 bit ops when possible.
> 
> If the code is in flux then I can wait and rework the patch in future.

Yes, let's first get the missing bits for JavaScript in.

Powered by Google App Engine
This is Rietveld 408576698