Created: 5 years, 7 months ago by Daniel Bratell
Modified: 5 years, 7 months ago
CC: blink-reviews, oilpan-reviews
Base URL: https://chromium.googlesource.com/chromium/blink.git@master
Target Ref: refs/heads/master
Project: blink
Visibility: Public.
Description
Shrinking EventPath to improve memory usage and performance
In bug 325460 EventPath was changed to have fewer allocations at the cost
of higher memory usage to improve performance. This is a followup that
adds one more allocation for lower memory usage and even higher (3-5%)
Event dispatching performance.
EventPath memory usage becomes an issue because, even though there is most of the
time only one active Event, slowness or garbage collection delays can leave dozens
or hundreds of Event objects in memory. In those cases, spending 1-2 KB per
EventPath ends up consuming 100-200 KB of memory.
The patch changes the EventPath builder to gather the path on the stack and
then build a perfectly sized EventPath. It seems to save about 1200 bytes per
event in my ad hoc testing (a typical event path depth is about 10-12), and
performance tests report:
Events/EventsDispatching 3.2% faster
Events/EventsDispatchingInDeeplyNestedShadowTrees 3.2% faster
Events/EventsDispatchingInShadowTrees 5.5% faster
(Confirmed through multiple runs)
Real world memory usage saving depends on the number of live Event objects. That
will vary.
R=hayato@chromium.org,sigbjornf@opera.com,rbyers@chromium.org,esprehn@chromium.org
BUG=487656, 325460
Committed: https://src.chromium.org/viewvc/blink?view=rev&revision=195651
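The approach described above (gather the path on the stack, then allocate exactly) can be sketched in standard C++. The names here (buildEventPath, and the Node and NodeEventContext stand-ins) are illustrative assumptions; the real patch works on WTF::Vector and Blink's DOM types:

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for Blink's Node and NodeEventContext; the real
// NodeEventContext is three pointers wide, which is what makes a
// 64-element inline buffer expensive.
struct Node { Node* parent = nullptr; };

struct NodeEventContext {
    Node* node;
    Node* target;
    Node* relatedTarget;
};

// Walk up from |start|, collecting the path into a fixed stack buffer,
// then build a heap vector of exactly the right size: no heap growth
// during the walk, and no oversized capacity kept alive afterwards.
std::vector<NodeEventContext> buildEventPath(Node* start) {
    std::array<Node*, 64> nodesInPath;  // stack buffer
    std::size_t depth = 0;
    for (Node* n = start; n && depth < nodesInPath.size(); n = n->parent)
        nodesInPath[depth++] = n;

    std::vector<NodeEventContext> path;
    path.reserve(depth);  // a single, perfectly sized allocation
    for (std::size_t i = 0; i < depth; ++i)
        path.push_back({nodesInPath[i], start, nullptr});
    return path;
}
```

With this shape, a 12-deep path costs one allocation of 12 * 24 = 288 bytes on 64-bit, matching the numbers discussed later in the review.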
Patch Set 1 #
Total comments: 4
Patch Set 2 : Addressed review comments. #
Patch Set 3 : Really addressed the review comments. #
Patch Set 4 : Oilpan performance (Member<> is not free of cost as raw pointers) #
Patch Set 5 : Added comment. #
Messages
Total messages: 33 (7 generated)
bratell@opera.com changed reviewers: + hayato@chromium.org, rbyers@chromium.org
Please take a look.
LGTM https://codereview.chromium.org/1136953011/diff/1/Source/core/events/EventPat... File Source/core/events/EventPath.cpp (right): https://codereview.chromium.org/1136953011/diff/1/Source/core/events/EventPat... Source/core/events/EventPath.cpp:132: ASSERT(m_nodeEventContexts.size() == 0); We don't need this ASSERT since L94 already asserts it.
sigbjornf@opera.com changed reviewers: + sigbjornf@opera.com
https://codereview.chromium.org/1136953011/diff/1/Source/core/events/EventPat... File Source/core/events/EventPath.cpp (right): https://codereview.chromium.org/1136953011/diff/1/Source/core/events/EventPat... Source/core/events/EventPath.cpp:97: Vector<Node*, 256> nodesInPath; Make this WillBeHeapVector<RawPtrWillBeMember<Node>>. And do you really need this big a local allocation -- what's performance like if you keep addNodeEventContext() like before, but do not keep a vector with an inline capacity in m_nodeEventContexts.
On 2015/05/19 07:16:19, sof wrote: > what's performance like if > you keep addNodeEventContext() like before, but do not keep a vector with an > inline capacity in m_nodeEventContexts. It loses 12-20% on the Event dispatching performance tests. > And do you really need this big a local allocation -- Since the idea is to avoid heap allocations I just made the stack buffer large enough. At least in a pre-oilpan world its size really doesn't matter as long as it won't make us run out of stack and it won't. Now I guess it could become ugly with Member<Node*> since that is no longer a raw pointer? What about keeping it a raw pointer since all elements we work on here are reachable from other places so one reference more or less won't add anything?
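The cost being discussed here (dropping the inline buffer and growing the heap vector element by element) can be sketched with a counting allocator. This is an assumption-laden illustration: std::vector stands in for WTF::Vector, whose growth policy differs in detail, but the reallocation pattern is the same.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Counting allocator: tracks how many backing-store blocks a container
// requests from the heap.
static int g_allocs = 0;

template <typename T>
struct CountingAlloc {
    using value_type = T;
    CountingAlloc() = default;
    template <typename U> CountingAlloc(const CountingAlloc<U>&) {}
    T* allocate(std::size_t n) {
        ++g_allocs;
        return static_cast<T*>(std::malloc(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { std::free(p); }
};
template <typename T, typename U>
bool operator==(const CountingAlloc<T>&, const CountingAlloc<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAlloc<T>&, const CountingAlloc<U>&) { return false; }

// Growing element by element reallocates the backing store several times.
int pushWithoutReserve(int n) {
    g_allocs = 0;
    std::vector<int, CountingAlloc<int>> v;
    for (int i = 0; i < n; ++i) v.push_back(i);
    return g_allocs;
}

// One up-front, exactly sized reservation needs a single allocation.
int pushWithReserve(int n) {
    g_allocs = 0;
    std::vector<int, CountingAlloc<int>> v;
    v.reserve(n);
    for (int i = 0; i < n; ++i) v.push_back(i);
    return g_allocs;
}
```

This is the trade-off in the patch: the stack buffer absorbs the growth during the tree walk, and reserve-then-fill keeps the heap cost to one exact allocation.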
https://codereview.chromium.org/1136953011/diff/1/Source/core/events/EventPat... File Source/core/events/EventPath.cpp (right): https://codereview.chromium.org/1136953011/diff/1/Source/core/events/EventPat... Source/core/events/EventPath.cpp:97: Vector<Node*, 256> nodesInPath; On 2015/05/19 07:16:19, sof wrote: > Make this WillBeHeapVector<RawPtrWillBeMember<Node>>. > > And do you really need this big a local allocation -- what's performance like if > you keep addNodeEventContext() like before, but do not keep a vector with an > inline capacity in m_nodeEventContexts. Done. https://codereview.chromium.org/1136953011/diff/1/Source/core/events/EventPat... Source/core/events/EventPath.cpp:132: ASSERT(m_nodeEventContexts.size() == 0); On 2015/05/19 06:45:29, hayato wrote: > We don't need this ASSERT since L94 already asserts it. Done.
Patchset #4 (id:60001) has been deleted
On 2015/05/19 08:11:14, Daniel Bratell wrote: > Now I guess it could become ugly with Member<Node*> since that is no longer a > raw pointer? What about keeping it a raw pointer since all elements we work on > here are reachable from other places so one reference more or less won't add > anything? Yes, Member<Node*> makes the large stack vector slowish (loses 1% of the win). I've changed the Vector stack size from 256 to 64 to not make oilpan hurt but I would still prefer raw pointers. They are cheaper.
On 2015/05/19 08:19:10, Daniel Bratell wrote: > On 2015/05/19 08:11:14, Daniel Bratell wrote: > > > Now I guess it could become ugly with Member<Node*> since that is no longer a > > raw pointer? What about keeping it a raw pointer since all elements we work on > > here are reachable from other places so one reference more or less won't add > > anything? > > Yes, Member<Node*> makes the large stack vector slowish (loses 1% of the win). > I've changed the Vector stack size from 256 to 64 to not make oilpan hurt but I > would still prefer raw pointers. They are cheaper. I'm not sure you understand what Member<> expands to. EventDispatching is considerably faster with Oilpan enabled (~80%), so I wouldn't be too concerned about creating a temporary vector like this.
On 2015/05/19 08:22:37, sof wrote: > On 2015/05/19 08:19:10, Daniel Bratell wrote: > > would still prefer raw pointers. They are cheaper. > > I'm not sure you understand what Member<> expands to. I do. A class with constructors which means that Vector has to use loops that invoke that (inlined) constructor instead of doing nothing or using memset or memcpy. That is why Vector<Member<Foo>> becomes slower than Vector<Foo*>.
On 2015/05/19 08:51:04, Daniel Bratell wrote: > On 2015/05/19 08:22:37, sof wrote: > > On 2015/05/19 08:19:10, Daniel Bratell wrote: > > > would still prefer raw pointers. They are cheaper. > > > > I'm not sure you understand what Member<> expands to. > > I do. A class with constructors which means that Vector has to use loops that > invoke that (inlined) constructor instead of doing nothing or using memset or > memcpy. > > That is why Vector<Member<Foo>> becomes slower than Vector<Foo*>. Good, a shim around a raw pointer it merely is; done so as to keep track of these in a tidier manner. Backing stores aren't constructor initialized on allocation, so what loop would this be? It seems reasonable to assume that current-day compilers should be able to inline away a constructor that merely initialized its single field, but perhaps not?
On 2015/05/19 09:21:25, sof wrote: > On 2015/05/19 08:51:04, Daniel Bratell wrote: > > On 2015/05/19 08:22:37, sof wrote: > > > On 2015/05/19 08:19:10, Daniel Bratell wrote: > > > > would still prefer raw pointers. They are cheaper. > > > > > > I'm not sure you understand what Member<> expands to. > > > > I do. A class with constructors which means that Vector has to use loops that > > invoke that (inlined) constructor instead of doing nothing or using memset or > > memcpy. > > > > That is why Vector<Member<Foo>> becomes slower than Vector<Foo*>. > > Good, a shim around a raw pointer it merely is; done so as to keep track of > these in a tidier manner. Backing stores aren't constructor initialized on > allocation, so what loop would this be? > > It seems reasonable to assume that current-day compilers should be able to > inline away a constructor that merely initialized its single field, but perhaps > not? This is actually done on a higher level. In both STL containers and in WTF containers, the objects are checked with type trait APIs and depending on what the type trait APIs respond different sets of C++ code is chosen. Improving the type trait usage in WTF::Vector (which I did for footprint purposes) gave a 1% overall win in blink performance so that optimization is quite valuable. If you want to read some code, there is Source/wtf/VectorTraits.h which uses code in TypeTraits.h to inspect the types. There are already some hacks to optimize RefPtr and OwnPtr here and I think adding Member to that list might prove beneficial. I've filed http://crbug.com/489625 for it.
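The distinction being debated can be illustrated with standard type traits: wrapping a raw pointer in a class with a user-provided default constructor flips exactly the traits that vector implementations key off to pick their memset/memcpy fast paths. This Member is a hypothetical stand-in (the real Oilpan Member<T> does more, e.g. participates in tracing), and WTF inspects its own VectorTraits rather than <type_traits>, but the mechanism is the same:

```cpp
#include <type_traits>

struct Node;

// Hypothetical stand-in for Oilpan's Member<T>: a raw pointer plus a
// constructor that zero-initializes it.
template <typename T>
class Member {
public:
    Member() : m_ptr(nullptr) {}
    Member(T* p) : m_ptr(p) {}
    T* get() const { return m_ptr; }
private:
    T* m_ptr;
};

// A raw pointer is trivially default-constructible, so a vector of them
// can leave new slots uninitialized (or memset them in one pass).
static_assert(std::is_trivially_default_constructible<Node*>::value,
              "raw pointers hit the fast path");
// The wrapper is not: each new slot must run the constructor in a loop...
static_assert(!std::is_trivially_default_constructible<Member<Node>>::value,
              "the wrapper forces per-element construction");
// ...even though copying it is still a plain memcpy-able operation.
static_assert(std::is_trivially_copyable<Member<Node>>::value,
              "copies can still use memcpy");
```

Teaching the vector's traits machinery that such a wrapper is pointer-like is roughly what the crbug.com/489625 filed above proposes.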
Leaving it as 64 for now, makes sense - parity with how it was represented on EventPath previously. Looks fine wrt Oilpan. ( +Cc: esprehn )
Seems reasonable to me, LGTM Are there memory consumption tests running automatically that lock in the win here?
On 2015/05/19 19:09:32, Rick Byers wrote: > Seems reasonable to me, LGTM > > Are there memory consumption tests running automatically that lock in the win > here? I don't think so. There are generic memory usage tests but they all seem to be reactive, and I think lots of changes end up below the noise level. It's always been difficult to measure and track memory usage which is most likely a large factor in its steady growth. Our very best (most stable and reliable) memory usage results come from using special devices with no other programs and restarting the device after each run. Not scalable and with a long delay from code change to data. So long that mapping results back to code change is quite difficult.
esprehn@chromium.org changed reviewers: + esprehn@chromium.org
I'm not very comfortable landing this change, this seems like a hack around Oilpan's collector needing work. If I came across this code later I'd just undo your change. This Vector should just store OwnPtr's to NodeEventContext, these objects are too big to be storing the Vector directly.
On 2015/05/19 22:00:32, esprehn wrote: > I'm not very comfortable landing this change, this seems like a hack around > Oilpan's collector needing work. If I came across this code later I'd just undo > your change. > > This Vector should just store OwnPtr's to NodeEventContext, these objects are > too big to be storing the Vector directly. I'm a bit confused. Oilpan is not enabled for EventPath on trunk, so the performance improvement or the memory reduction mentioned in the CL description is not related to Oilpan, is it? BTW, by enabling Oilpan, the EventDispatching benchmarks are going to be much, much faster: blink_perf.events EventsDispatchingInDeeplyNestedShadowTrees runs/s GOOD +19.5% blink_perf.events EventsDispatchingInShadowTrees runs/s GOOD +21.6% blink_perf.events EventsDispatching runs/s GOOD +63.0% https://docs.google.com/spreadsheets/d/1R7PmKDvGgMO9rDsBMyCQMPwjroUm5QSWF7dVB...
I'm skeptical about your reasoning about the performance improvement and the memory reduction, but the change itself looks reasonable. LGTM.
Is event dispatch really that much faster with Oilpan? You're just deferring lots of work to a GC later, if you include the time to GC all the garbage is it still faster? I'm very skeptical about all these claims of better perf with Oilpan, how long is the giant GC after the benchmark is done?
On 2015/05/19 23:50:00, esprehn wrote: > Is event dispatch really that much faster with Oilpan? You're just deferring > lots of work to a GC later, if you include the time to GC all the garbage is it > still faster? I'm very skeptical about all these claims of better perf with > Oilpan, how long is the giant GC after the benchmark is done? The results are real -- the performance gain has been stable for a couple of months in the oilpan build. The performance gain mainly comes from the allocation scheme of Vector & HashTable backing storages we implemented in Oilpan's allocator. The allocation scheme is highly optimized so that:
- vector.expand and hashtable.expand don't need to reallocate backing storages in common cases.
- the memory of the backing storages is promptly freed and reused without involving a GC (Oilpan has a mechanism for coalescing the promptly-freed backing storages without involving a GC).
In those EventDispatching benchmarks, almost all backing storages are collected by the coalescing. No heavy GC happens as far as I can observe in about:tracing.
I'm confused:
esprehn: This has nothing to do with oilpan. All testing was done on trunk with oilpan disabled. The memory usage problems are on trunk. Whether there will be more or fewer parallel Event objects with oilpan I do not know, so I'm not sure what impact this has on an oilpan-enabled Blink. It should not make anything worse though. The code I'm fixing could probably look different, but it gave a 20% performance boost and that convinced the reviewer at the time that this was better than what preceded it.
haraken: "I'm skeptical about your reasoning about the performance improvement and the memory reduction..." - What makes you skeptical? The performance improvement numbers come from measurements with the performance benchmarks. Depending on platform, compiler, allocator and other factors the results may of course vary, but on a Release content_shell x86 built for profiling the results are consistently:
Events/EventDispatching Before: 815 ±12 runs/s. After: 846 ±3 runs/s.
Events/EventsDispatchingInDeeplyNestedShadowTrees: Before: 9410 ±10 runs/s. After: 9733 ±15 runs/s.
Events/EventsDispatchingInShadowTrees: Before: 20950 ±40 runs/s. After: 22125 ±40 runs/s.
The improvements in this particular build setup (content_shell x86_64 etc.) are fully repeatable (3.8% faster, 3.4% faster, 5.6% faster). I didn't do this change for the performance improvement, but if things get faster, I'll accept it. Gift horse and all that.
For the memory reduction numbers, the base data is this report from loading http://youtube.com/tv:
==26175== -------------------- 28 of 200 --------------------
==26175== max-live: 144,872 in 91 blocks
==26175== tot-alloc: 265,864 in 167 blocks (avg size 1592.00)
==26175== deaths: 148, at avg age 223,886,445 (0.75% of prog lifetime)
==26175== acc-ratios: 0.26 rd, 0.09 wr (69,672 b-read, 26,520 b-written)
==26175== at 0x4C2D92A: operator new(unsigned long)
==26175== by 0x134EC1D: blink::Event::initEventPath(blink::Node&)
==26175== by 0x134F122: blink::EventDispatcher::dispatchEvent(blink::Node&, WTF::PassRefPtr<blink::EventDispatchMediator>)
==26175== by 0x131B7A8: blink::Node::dispatchEvent(WTF::PassRefPtr<blink::Event>)
The interesting number here is "max-live": for some reason 91 EventPath objects were live at the same time, consuming a total of 144,872 bytes. Of all the data allocated (265,864 bytes), only 26% was read and 9% written, indicating a buffer that is much larger than actually needed. I ran some ad hoc mouse event tests on various pages (youtube, cnn, google.com, a Japanese tourist site) since I believe mouse events are the most common ones with non-trivial event paths. The deepest visible element I could hover was at depth 16, and most were at depth 8-12.
The number per object reported is 1592 bytes, which corresponds to the 64-element vector of objects containing 3 pointers each (64 * 3 * 8 is 1536). With this patch that will be 56 bytes plus the size of the custom allocation, which, assuming a depth of 12, will be 288 bytes plus heap overhead. That lands us at roughly 1200 fewer bytes allocated per object. Since the objects are smaller than one page (4 KB), we can assume that the unused memory is committed as well, so we save real memory and not just virtual memory.
The biggest question is how many Event objects will be live at any one time.
The data above shows that at one point on youtube there were 91 live Event objects, but considering that is a peak, most of the time the count will be lower. On the other hand, it is possible for a document to keep Events alive forever, and I assume that will also keep the EventPaths alive forever, so 91 is in no way an absolute limit. As for oilpan, that might change how many Event objects are alive at peak and on average as well.
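As a sanity check, the figures in this message are internally consistent. A quick verification, assuming 64-bit pointers and that the 56 bytes are the rest of the EventPath object outside the inline buffer:

```cpp
#include <cstddef>

// Checking the arithmetic in the report above (64-bit pointers assumed).
constexpr std::size_t kPtr = 8;
constexpr std::size_t kContext = 3 * kPtr;            // NodeEventContext: 3 pointers
constexpr std::size_t kInlineBacking = 64 * kContext; // Vector<NodeEventContext, 64>
static_assert(kInlineBacking == 1536, "matches '64 * 3 * 8 is 1536'");

// The reported 1592 bytes per block is the inline backing plus the
// remaining 56 bytes of EventPath members.
constexpr std::size_t kReportedBlock = 1592;
static_assert(kReportedBlock - kInlineBacking == 56, "the '56 bytes' figure");

// After the patch: 56 bytes plus an exactly sized backing store.
constexpr std::size_t kPatchedDepth12 = 56 + 12 * kContext;  // 344 bytes
static_assert(kReportedBlock - kPatchedDepth12 == 1248,
              "roughly 1200 fewer bytes per object, before heap overhead");
```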
Can we just use an array of ownptr instead? The bug here is that we create a vector of big objects. In general we don't do that in Blink for this reason and use pointers instead. I'd much rather have the few-line change than this unless you have numbers that show this is much better.
On 2015/05/20 09:07:03, esprehn wrote: > Can we just use an array of ownptr instead? The bug here is that we create > a vector of big objects. In general we don't do that in Blink for this > reason and use pointers instead. I'd much rather the few line change than > this unless you have numbers that shows this is much better. Sounds like you are really arguing to revert the fix for bug 325460 that landed last year? That one claimed this basic structure made event dispatching 20% faster than what preceded it through much fewer allocations and I have had no reason to doubt those claims. As for switching to Vector<OwnPtr<EventContext>> instead of Vector<EventContext>: EventContext is 3 pointers while OwnPtr is 1 pointer so they are not that different. A 12 element path will be 12 * heap overhead + 800 bytes with OwnPtr, 288 bytes with this patch (and 1500+ bytes with the current code). A 30 element path (30 being the deepest seen trees on normal sites) will be 30 * heap overhead + 1232 bytes with OwnPtr, 1080 bytes with this patch and still 1500+ bytes with the current code. All numbers for 64 bit, but since it's all pointers, 32 bit is just half of everything. I think this is a good improvement and that we should land it if we don't know of anything that will be worse with it. Again, oilpan will change things since it will make all these objects garbage collected by the oilpan garbage collector which means they will have different lifespans and I assume would be out. I believe they have to be oilpanified so that the garbage collector can find the leaf Node pointers but please don't quote me on that. Allocating the right size Vector should be equally valuable in oilpan as without oilpan though.
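The size comparison in this message can be checked with the same arithmetic. Assumptions in this sketch: 64-bit pointers, the OwnPtr variant keeps a 64-slot inline buffer of single pointers plus one heap block per element, and per-block heap overhead is left out (the message's 1080-byte figure for a 30-element path evidently includes overhead not modeled here):

```cpp
#include <cstddef>

// EventContext is three pointers; OwnPtr and raw pointers are one.
constexpr std::size_t kPtr = 8;
constexpr std::size_t kContext = 3 * kPtr;

// OwnPtr variant: 64 inline OwnPtr slots plus one heap object per element.
constexpr std::size_t ownPtrBytes(std::size_t depth) {
    return 64 * kPtr + depth * kContext;
}
// This patch: one backing store sized exactly to the path depth.
constexpr std::size_t patchedBytes(std::size_t depth) {
    return depth * kContext;
}

static_assert(ownPtrBytes(12) == 800, "12-element path with OwnPtr");
static_assert(ownPtrBytes(30) == 1232, "30-element path with OwnPtr");
static_assert(patchedBytes(12) == 288, "12-element path with this patch");
```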
esprehn@chromium.org changed reviewers: + haraken@chromium.org
lgtm, please add a comment to the code about why you do this. Someone is going to undo it unless there's an explanation for it.
On 2015/05/20 17:31:00, esprehn wrote: > lgtm, please add a comment to the code about why you do this. Someone is going > to undo it this unless there's an explanation for it. Hmm. I love explaining in the code why it is doing non-obvious things. 99% (possibly 100%) of the time when I do that, reviewers demand that I remove the comments because people can use git history. I absolutely don't mind adding comments, since I believe that too much code is non-obvious unless you wrote it yourself recently, but I wish more people asked me to add explanations instead of removing them.
On 2015/05/20 at 20:36:15, bratell wrote: > On 2015/05/20 17:31:00, esprehn wrote: > > lgtm, please add a comment to the code about why you do this. Someone is going > > to undo it this unless there's an explanation for it. > > Hmm. I love explaining in the code why it is doing non-obvious things. 99% (possibly 100%) of the times I do that reviewers demand that I remove the comments because people can use git history. I absolutely don't mind adding comments since I believe that too much code is non-obvious unless you wrote it yourself recently, but I wish more people asked me to add explanations instead of removing them. Non-obvious comments are fine, just writing what the code does is not. // Reduce memory bloat by sizing the Vector perfectly. OwnPtr in the Vector was a perf regression. is fine.
The CQ bit was checked by bratell@opera.com
The patchset sent to the CQ was uploaded after l-g-t-m from hayato@chromium.org, haraken@chromium.org, esprehn@chromium.org, rbyers@chromium.org Link to the patchset: https://codereview.chromium.org/1136953011/#ps100001 (title: "Added comment.")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1136953011/100001
Message was sent while issue was closed.
Committed patchset #5 (id:100001) as https://src.chromium.org/viewvc/blink?view=rev&revision=195651