Issue 1331993004: [es6] Optimize TypedArray.subarray() (Closed)

Created: 5 years, 3 months ago by skomski

Modified: 5 years, 2 months ago

Reviewers: Dan Ehrenberg, Michael Starzinger, Jakob Kummerow

CC: v8-reviews_googlegroups.com

Base URL: https://chromium.googlesource.com/v8/v8.git@master

Target Ref: refs/pending/heads/master

Project: v8

Visibility: Public.

Description

[es6] Optimize TypedArray.subarray()

````
var array = new Uint8Array(65000);
var startDate = Date.now();
var counter = 0;
while (counter++ < 50000000) {
  array.subarray(start, end);
}
var endDate = Date.now();
print(endDate - startDate);
````

4200 ms -> 3500 ms (16.67%)

BUG=

Committed: https://crrev.com/1e2aecf3635a5fb01607fa65511d67902735d90c
Cr-Commit-Position: refs/heads/master@{#30770}

Patch Set 1 #

Patch Set 2 : macro version #

Patch Set 3 : remove unused references #

Total comments: 1

Created: 5 years, 3 months ago

Stats: +13 lines, -11 lines

M src/macros.py: 1 chunk, +2 lines, -0 lines, 1 comment
M src/typedarray.js: 2 chunks, +11 lines, -11 lines, 0 comments

Messages

Total messages: 17 (3 generated)

Jakob Kummerow

I'm not sure about this one. Replacing MathMax() calls with verbose hand-written comparisons for performance ...

5 years, 3 months ago (2015-09-11 09:52:26 UTC) #3

skomski

On 2015/09/11 09:52:26, Jakob wrote: > I'm not sure about this one. > > Replacing ...

5 years, 3 months ago (2015-09-11 10:04:48 UTC) #4

Dan Ehrenberg

5 years, 3 months ago (2015-09-11 13:53:15 UTC) #5

On 2015/09/11 at 10:04:48, karl wrote:
> On 2015/09/11 09:52:26, Jakob wrote:
> > I'm not sure about this one.
> > 
> > Replacing MathMax() calls with verbose hand-written comparisons for performance
> > reasons is sad. Can't we have a max() function that's fast enough to be usable?
> > 
> > Also, given that .subarray() ends up having to call the runtime anyway, it'd be
> > a prime candidate for moving the entire implementation to C++, which would
> > probably be even faster than micro-optimizing the JS bits.
> > 
> > Dan, would you like to land this as an incremental improvement anyway?
> > 
> > Karl, is there any non-microbenchmark use case where this has an impact, or did
> > you just optimize it "because we can"?
> 
> I optimized it since I want to use subarray in node.js to replace the custom slice implementation.
> https://github.com/nodejs/node/pull/2777
> 
> I can move it to C++.

Agreed that this is sad. Inlining JS natives better might make this unnecessary,
but that's hard.

I'm not really sure what moving it to C++ would get you. The important part
(invoking the constructor) already ends up calling out to C++. To me, this looks
like a case where the Javascript is doing small surface API work and nothing
computationally significant, so if we can work things out cleanly and
efficiently in JS, I think it'd make sense to leave it there long-term.

What if we made a macro which did a simple max calculation, without all the
NaN-related logic and multiple arguments logic, inline? Would that get the same
speedup? This macro could be applied in many places. It could be defined in
macros.py for use in any JS natives file.
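
As a rough illustration of the idea, here is what such a macro would compute, written as an ordinary JavaScript function rather than a macros.py macro. The name `maxSimple` and the sample values are hypothetical, not taken from the patch. The sketch assumes both arguments are plain numbers and deliberately leaves out the NaN and multiple-argument handling that Math.max has.

```javascript
// Hypothetical helper, not part of V8: a two-argument max that is a single
// comparison, with no NaN handling and no support for extra arguments.
function maxSimple(a, b) {
  return a < b ? b : a;
}

// Example use: resolving a negative start index against an array length,
// the kind of clamping subarray() has to do.
const arrayLength = 10;
console.log(maxSimple(arrayLength + -3, 0));   // 7
console.log(maxSimple(arrayLength + -25, 0));  // 0
```

A macro would paste the comparison into each call site instead of calling a function, which is where the speedup over MathMax would be expected to come from.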

I confess, I have already pulled a patch from the Node.js team which included a
similar change whose main benefit was avoiding calling MathMax
https://codereview.chromium.org/1231673008 . If we could fix this problem in
general with a macro, that would be great, and much easier to use everywhere
than rewriting everything in C++ (even if we should eventually do that).

karl, what other TypedArray performance issues are you facing?

Jakob Kummerow

5 years, 3 months ago (2015-09-11 15:03:45 UTC) #6

On 2015/09/11 13:53:15, Dan Ehrenberg wrote:
> Agreed that this is sad. Inlining JS natives better might make this unnecessary,
> but that's hard.

Inlining is not the only problem. As you allude to below, Math.min/max also have
pretty crazy spec-defined semantics, so will never be as fast as a simple
comparison.
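
For readers unfamiliar with those semantics, the following sketch lists a few of the rules Math.max has to honour and contrasts them with a bare comparison. The helper name `simple` is an assumption made for this illustration only.

```javascript
// Rules Math.max must follow that a bare `a < b ? b : a` does not.
console.log(Math.max());                     // -Infinity (no arguments)
console.log(Math.max(1, NaN));               // NaN (any NaN argument wins)
console.log(Object.is(Math.max(-0, 0), 0));  // true (+0 counts as larger than -0)
console.log(Math.max("2", 1));               // 2 (arguments are converted to numbers)
console.log(Math.max(1, 2, 3, 4));           // 4 (any number of arguments)

// Hypothetical plain comparison, for contrast.
const simple = (a, b) => (a < b ? b : a);
console.log(simple(1, NaN));                 // 1, because 1 < NaN is false
console.log(Object.is(simple(-0, 0), -0));   // true, because -0 < 0 is false
```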

> I'm not really sure what moving it to C++ would get you. 

The highlights:
- you don't have to worry about JS-spec-induced speed traps for simple things
like max()
- you don't have to worry about ICs, deopts, inlineability, and crafting
platform-specific machine code for compiler intrinsics either
- cross-process (!) memory sharing as opposed to per-context/per-isolate memory
cost
- runtime call overhead upper-bounded to 1 call
- it's nice to have one consistent, safe, canonical implementation of everything

> The important part
> (invoking the constructor) already ends up calling out to C++. To me, this looks
> like a case where the Javascript is doing small surface API work and nothing
> computationally significant, so if we can work things out cleanly and
> efficiently in JS, I think it'd make sense to leave it there long-term.

The rule of thumb is: if it's 100% pure JS, or if it's most naturally expressed
in JS because it requires JS semantics, implement it in JS; otherwise prefer
C++. Since this uses intrinsics and runtime calls already, it falls into the
latter bucket.

> What if we made a macro which did a simple max calculation, without all the
> NaN-related logic and multiple arguments logic, inline? Would that get the same
> speedup? This macro could be applied in many places. It could be defined in
> macros.py for use in any JS natives file.

That would mitigate some of the immediate pain here and would constitute an
incremental improvement. Doesn't have to happen in this CL, though.

> I confess, I have already pulled a patch from the Node.js team which included a
> similar change whose main benefit was avoiding calling MathMax
> https://codereview.chromium.org/1231673008 . If we could fix this problem in
> general with a macro, that would be great, and much easier to use everywhere
> than rewriting everything in C++ (even if we should eventually do that).
> 
> karl, what other TypedArray performance issues are you facing?

skomski

5 years, 3 months ago (2015-09-14 17:56:29 UTC) #7

On 2015/09/11 15:03:45, Jakob wrote:
> On 2015/09/11 13:53:15, Dan Ehrenberg wrote:
> > Agreed that this is sad. Inlining JS natives better might make this unnecessary,
> > but that's hard.
> 
> Inlining is not the only problem. As you allude to below, Math.min/max also have
> pretty crazy spec-defined semantics, so will never be as fast as a simple
> comparison.
> 
> > I'm not really sure what moving it to C++ would get you. 
> 
> The highlights:
> - you don't have to worry about JS-spec-induced speed traps for simple things
> like max()
> - you don't have to worry about ICs, deopts, inlineability, and crafting
> platform-specific machine code for compiler intrinsics either
> - cross-process (!) memory sharing as opposed to per-context/per-isolate memory
> cost
> - runtime call overhead upper-bounded to 1 call
> - it's nice to have one consistent, safe, canonical implementation of everything
````
RUNTIME_FUNCTION(Runtime_TypedArraySubArrayImpl) {
  HandleScope scope(isolate);
  DCHECK(args.length() == 3);
  if (!args[0]->IsJSTypedArray()) {
    THROW_NEW_ERROR_RETURN_FAILURE(
        isolate, NewTypeError(MessageTemplate::kNotTypedArray));
  }
  Handle<JSTypedArray> buffer(JSTypedArray::cast(args[0]));
  CONVERT_SMI_ARG_CHECKED(begin, 1);
  CONVERT_SMI_ARG_CHECKED(end, 2);

  int buffer_length = buffer->length_value();

  if (begin < 0) {
    begin = Max(0, buffer_length + begin);
  } else {
    begin = Min(buffer_length, begin);
  }

  if (end < 0) {
    end = Max(0, buffer_length + end);
  } else {
    end = Min(buffer_length, end);
  }
  if (end < begin) {
    end = begin;
  }
  int new_length = end - begin;
  int byte_offset = NumberToInt32(buffer->byte_offset());
  size_t begin_byte_offset = byte_offset + begin * buffer->element_size();

  return *isolate->factory()->NewJSTypedArray(
      buffer->type(), buffer->GetBuffer(), begin_byte_offset, new_length);
}
````
I tried a simple C++ implementation, but it's actually 3 seconds (30%) slower,
even after removing the excessive checks in NewJSTypedArray.
I still did the TO_INTEGER checks in JS because I did not find a good
replacement in runtime-utils.h.

Back to this CR: I pushed a simple macro version that is as fast as the verbose
if variant. PTAL
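
The index clamping in the C++ snippet above can also be sketched in plain JavaScript and checked against the built-in. The helper name `clampSubarrayRange` and its return shape are assumptions for this illustration, not code from the patch. It also assumes `begin` and `end` are already integers, so the TO_INTEGER step is skipped.

```javascript
// Hypothetical helper mirroring the clamping in the C++ snippet above.
// Assumes begin and end are already integers.
function clampSubarrayRange(length, begin, end) {
  if (begin < 0) {
    begin = Math.max(0, length + begin);  // negative index counts from the end
  } else {
    begin = Math.min(length, begin);      // cannot start past the end
  }
  if (end < 0) {
    end = Math.max(0, length + end);
  } else {
    end = Math.min(length, end);
  }
  if (end < begin) end = begin;           // empty range instead of a negative length
  return { begin, newLength: end - begin };
}

// Compare against the built-in for one typed array whose byteOffset is 0.
const array = new Uint8Array(16);
const range = clampSubarrayRange(array.length, -4, 100);
const view = array.subarray(-4, 100);
console.log(range.begin * array.BYTES_PER_ELEMENT === view.byteOffset);  // true
console.log(range.newLength === view.length);                            // true
```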

skomski

Since the subarray patch landed in nodejs it would be great to land this, too. ...

5 years, 3 months ago (2015-09-16 15:01:08 UTC) #8

Dan Ehrenberg

On 2015/09/16 at 15:01:08, karl wrote: > Since the subarray patch landed in nodejs it ...

5 years, 3 months ago (2015-09-16 15:44:45 UTC) #9

Dan Ehrenberg

On 2015/09/16 at 15:44:45, Dan Ehrenberg wrote: > On 2015/09/16 at 15:01:08, karl wrote: > ...

5 years, 3 months ago (2015-09-16 15:46:09 UTC) #10

commit-bot: I haz the power

CQ is trying da patch.
Follow status at https://chromium-cq-status.appspot.com/patch-status/1331993004/40001
View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1331993004/40001

5 years, 3 months ago (2015-09-16 15:58:49 UTC) #13

commit-bot: I haz the power

Patchset 3 (id:??) landed as https://crrev.com/1e2aecf3635a5fb01607fa65511d67902735d90c Cr-Commit-Position: refs/heads/master@{#30770}

5 years, 3 months ago (2015-09-16 16:22:07 UTC) #15

Michael Starzinger

5 years, 2 months ago (2015-10-07 18:49:07 UTC) #17

Message was sent while issue was closed.

Hmmm ...

https://codereview.chromium.org/1331993004/diff/40001/src/macros.py
File src/macros.py (right):

https://codereview.chromium.org/1331993004/diff/40001/src/macros.py#newcode160
src/macros.py:160: macro MAX_SIMPLE(argA, argB) = (argA < argB ? argB : argA);
Please be aware that this macro will expand "MAX_SIMPLE(a + b, 0)" to "(a + b < 0
? 0 : a + b)", which will evaluate the addition twice in the baseline compiler
(which is the only compiler where this macro makes any sense in the first
place).

If you wanted to guard against that, then the %IS_VAR marker would be the right
thing. But then you would need to adapt the use-sites accordingly. And I see
that this is now being used left and right in the builtin JavaScript files and
almost all use-sites are affected.
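
The double evaluation can be reproduced outside V8 with a small simulation. The sketch below assumes the macro preprocessor does plain textual substitution of its arguments. `expandMaxSimple` is a hypothetical stand-in for that step, and `next()` is a made-up expression with a visible side effect standing in for "a + b".

```javascript
// Hypothetical stand-in for the preprocessor: paste the argument text into
// the MAX_SIMPLE template exactly as written.
function expandMaxSimple(argA, argB) {
  return `(${argA} < ${argB} ? ${argB} : ${argA})`;
}

const expanded = expandMaxSimple("next()", "0");
console.log(expanded);  // (next() < 0 ? 0 : next())

// Count how often the argument expression runs when the expansion is evaluated.
let calls = 0;
function next() {
  calls++;
  return 5;
}
const result = new Function("next", `return ${expanded};`)(next);
console.log(result, calls);  // 5 2
```

In this simulation the first argument is evaluated a second time whenever it turns out to be the larger value. Passing a plain variable instead of an expression avoids the repeated work, at the cost of changing each use-site.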
