Issue 35893002: IPC pickling optimization for render passes. (Closed)

Created:
7 years, 2 months ago by danakj

Modified:
7 years, 1 month ago

Reviewers:
Tom Sepez, jamesr, jar (doing other things), piman

CC:
chromium-reviews, joi+watch-content_chromium.org, darin-cc_chromium.org, cc-bugs_chromium.org, erikwright+watch_chromium.org, jam, rvargas (doing something else)

Base URL:
svn://svn.chromium.org/chrome/trunk/src

Visibility:
Public.

Description

IPC pickling optimization for render passes.

Call Pickle::Reserve() for the size of the data in the shared quad state and quad lists. This prevents the WriteFoo() invocations for all of the quad/shared quad states from causing memory re-allocations and moves.

This is based on https://codereview.chromium.org/34413002/ from piman@.

This is also based after https://codereview.chromium.org/30593005/ and perf numbers (both before and after) include that CL also.

content_perftest results on linux chromeos release official build:

BEFORE
*RESULT mean_frame_serialization_time: DelegatedFrame_ManyQuads_1_4000= 50 us
*RESULT mean_frame_serialization_time: DelegatedFrame_ManyQuads_1_100000= 1888 us
*RESULT mean_frame_serialization_time: DelegatedFrame_ManyQuads_4000_4000= 728 us
*RESULT mean_frame_serialization_time: DelegatedFrame_ManyQuads_100000_100000= 23771 us
*RESULT mean_frame_serialization_time: DelegatedFrame_ManyRenderPasses_10000_100= 24118 us

AFTER
*RESULT mean_frame_serialization_time: DelegatedFrame_ManyQuads_1_4000= 48 us
*RESULT mean_frame_serialization_time: DelegatedFrame_ManyQuads_1_100000= 1626 us
*RESULT mean_frame_serialization_time: DelegatedFrame_ManyQuads_4000_4000= 460 us
*RESULT mean_frame_serialization_time: DelegatedFrame_ManyQuads_100000_100000= 14771 us
*RESULT mean_frame_serialization_time: DelegatedFrame_ManyRenderPasses_10000_100= 15626 us

This gives a further ~1.5x improvement in serialization time for shared quad states and render passes.

R=jar@chromium.org, piman@chromium.org, tsepez@chromium.org, piman
BUG=307480

Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=231656
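To illustrate the idea in the description, here is a minimal sketch of a growable buffer that supports an up-front reserve. ToyPickle, SerializeToyQuads, and the realloc counter are hypothetical names made up for this example; this is not the base::Pickle implementation, and the growth policy (capacity * 2 + needed) is only assumed from the Resize() call quoted later in the review.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a pickle-like buffer; NOT base::Pickle.
class ToyPickle {
 public:
  ToyPickle() = default;
  ~ToyPickle() { std::free(data_); }
  ToyPickle(const ToyPickle&) = delete;
  ToyPickle& operator=(const ToyPickle&) = delete;

  // Hint that at least `additional` more bytes are about to be written.
  void Reserve(size_t additional) {
    if (size_ + additional > capacity_)
      Resize(capacity_ * 2 + additional);
  }

  // Append `len` bytes, growing the buffer first if it is too small.
  void WriteBytes(const void* src, size_t len) {
    if (size_ + len > capacity_)
      Resize(capacity_ * 2 + len);
    assert(size_ + len <= capacity_);
    std::memcpy(data_ + size_, src, len);
    size_ += len;
  }

  void WriteInt(int32_t value) { WriteBytes(&value, sizeof(value)); }

  size_t size() const { return size_; }
  int realloc_count() const { return realloc_count_; }

 private:
  // Reallocates the backing store and counts how often that happens.
  void Resize(size_t new_capacity) {
    void* p = std::realloc(data_, new_capacity);
    if (!p)
      std::abort();
    data_ = static_cast<char*>(p);
    capacity_ = new_capacity;
    ++realloc_count_;
  }

  char* data_ = nullptr;
  size_t size_ = 0;
  size_t capacity_ = 0;
  int realloc_count_ = 0;
};

// Writes `quads` toy quads of `fields` ints each and returns how many times
// the buffer was reallocated along the way.
int SerializeToyQuads(int quads, int fields, bool reserve_first) {
  ToyPickle pickle;
  if (reserve_first)
    pickle.Reserve(static_cast<size_t>(quads) * fields * sizeof(int32_t));
  for (int q = 0; q < quads; ++q)
    for (int f = 0; f < fields; ++f)
      pickle.WriteInt(f);
  return pickle.realloc_count();
}
```

In this toy model, SerializeToyQuads(4000, 10, true) reallocates once, while the same call with reserve_first set to false reallocates repeatedly as the buffer grows; each of those extra reallocations may also copy everything written so far.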

Patch Set 1 : ccmessagesperf-reserve: #

Patch Set 2 : ccmessagesperf-reserve: one reserve #

Patch Set 3 : ccmessagesperf-reserve: noextraquadwalk #

Patch Set 4 : ccmessagesperf-reserve: add algorithm header #

Total comments: 10

Patch Set 5 : ccmessagesperf-reserve: rename #

Patch Set 6 : ccmessagesperf-reserve: move ReserveSize out of header #

Patch Set 7 : ccmessagesperf-reserve: undo-returnvalues #

Patch Set 8 : ccmessagesperf-reserve: fixbuild #

Total comments: 4

Patch Set 9 : ccmessagesperf-reserve: nits #

Created: 7 years, 1 month ago

Stats (+88 lines, -1 line)
M	base/pickle.h	1 chunk	+5 lines, -0 lines	0 comments
M	base/pickle.cc	1 chunk	+11 lines, -1 line	0 comments
M	content/common/cc_messages.cc	2 chunks	+21 lines, -0 lines	0 comments
M	content/common/cc_messages_unittest.cc	2 chunks	+51 lines, -0 lines	0 comments

Messages

Total messages: 41 (0 generated)

danakj

I originally wrote this in the description: BEFORE *RESULT mean_frame_serialization_time: DelegatedFrame_ManyQuads_1_4000= 50 us *RESULT mean_frame_serialization_time: ...

7 years, 2 months ago (2013-10-22 19:56:59 UTC) #2

piman

https://codereview.chromium.org/35893002/diff/70002/content/common/cc_messages.cc File content/common/cc_messages.cc (right): https://codereview.chromium.org/35893002/diff/70002/content/common/cc_messages.cc#newcode388 content/common/cc_messages.cc:388: to_reserve += sizeof(cc::CheckerboardDrawQuad); note: a bunch of the objects ...

7 years, 2 months ago (2013-10-22 20:22:26 UTC) #3

danakj

On 2013/10/22 20:22:26, piman wrote: > https://codereview.chromium.org/35893002/diff/70002/content/common/cc_messages.cc > File content/common/cc_messages.cc (right): > > https://codereview.chromium.org/35893002/diff/70002/content/common/cc_messages.cc#newcode388 > ...

7 years, 2 months ago (2013-10-22 20:37:27 UTC) #4

piman

On 2013/10/22 20:37:27, danakj wrote: > On 2013/10/22 20:22:26, piman wrote: > > > https://codereview.chromium.org/35893002/diff/70002/content/common/cc_messages.cc ...

7 years, 2 months ago (2013-10-22 23:54:38 UTC) #5

tomhudson

On 2013/10/22 23:54:38, piman wrote: > Oh, sorry, I should have been explicit. The main ...

7 years, 2 months ago (2013-10-23 07:43:09 UTC) #6

danakj

So, my thought is that I don't like hardcoding the "largest" quad cuz if that ...

7 years, 2 months ago (2013-10-23 15:19:03 UTC) #7

piman

On Wed, Oct 23, 2013 at 8:19 AM, <danakj@chromium.org> wrote: > So, my thought is ...

7 years, 2 months ago (2013-10-23 23:52:49 UTC) #8

piman

On 2013/10/23 23:52:49, piman wrote: > On Wed, Oct 23, 2013 at 8:19 AM, <mailto:danakj@chromium.org> ...

7 years, 2 months ago (2013-10-24 06:18:18 UTC) #9

danakj

On Thu, Oct 24, 2013 at 2:18 AM, <piman@chromium.org> wrote: > On 2013/10/23 23:52:49, piman ...

7 years, 2 months ago (2013-10-24 14:41:07 UTC) #10

danakj

On 2013/10/24 14:41:07, danakj wrote: > On Thu, Oct 24, 2013 at 2:18 AM, <mailto:piman@chromium.org> ...

7 years, 2 months ago (2013-10-24 15:03:47 UTC) #11

danakj

+tsepez for cc_messages.h (no IPC message change, just reserving memory ahead of time)

7 years, 2 months ago (2013-10-24 15:15:58 UTC) #13

danakj

On 2013/10/24 15:03:47, danakj wrote: > > > Also, I measured the cost of walking ...

7 years, 2 months ago (2013-10-24 15:27:18 UTC) #14

Tom Sepez

https://codereview.chromium.org/35893002/diff/460002/base/pickle.cc File base/pickle.cc (right): https://codereview.chromium.org/35893002/diff/460002/base/pickle.cc#newcode312 base/pickle.cc:312: Resize(capacity_ * 2 + needed_size); Why 2x? Also, this ...

7 years, 2 months ago (2013-10-24 17:48:05 UTC) #15

danakj

7 years, 2 months ago (2013-10-24 17:51:37 UTC) #16

Tom Sepez

> if realloc fails, we will crash inside tcmalloc no? > Ideally, we want this ...

7 years, 2 months ago (2013-10-24 18:00:18 UTC) #17

danakj

On Thu, Oct 24, 2013 at 2:00 PM, <tsepez@chromium.org> wrote: > > if realloc fails, ...

7 years, 2 months ago (2013-10-24 18:02:22 UTC) #18

piman

On Thu, Oct 24, 2013 at 11:00 AM, <tsepez@chromium.org> wrote: > > if realloc fails, ...

7 years, 2 months ago (2013-10-24 19:40:52 UTC) #21

danakj

On Thu, Oct 24, 2013 at 3:40 PM, Antoine Labour <piman@chromium.org> wrote: > > > ...

7 years, 2 months ago (2013-10-24 20:34:11 UTC) #22

jar (doing other things)

Reading these CLs, I'm beginning to wonder about whether there is a better way of ...

7 years, 1 month ago (2013-10-25 01:54:17 UTC) #23

danakj

7 years, 1 month ago (2013-10-25 18:40:46 UTC) #24

I don't think we need to avoid using the ipc pickle for these things if we can
make it fast enough. Doing some shared memory thing should be a last resort IMO
as it's a lot more work+code+surface area for bugs. I'd much prefer to make
pickling fast enough to work for this use case.

Doing a Reserve() up front is minimal and allows us to make use of all the
normal pickle methods that verify each element of the message as usual and
discard invalid messages. This is the value of using pickle, I feel.

https://codereview.chromium.org/35893002/diff/700001/base/pickle.cc
File base/pickle.cc (right):

https://codereview.chromium.org/35893002/diff/700001/base/pickle.cc#newcode308
base/pickle.cc:308: // write at a uint32-aligned offset from the beginning of the header
On 2013/10/25 01:54:18, jar wrote:
> nit: Start with Upper case, end with a period.  (may as well fix line 318 too).

Done.

https://codereview.chromium.org/35893002/diff/700001/base/pickle.cc#newcode314
base/pickle.cc:314: Resize(capacity_ * 2 + needed_size);
On 2013/10/25 01:54:18, jar wrote:
> Why did you double the existing capacity?

For the same reason we double it in BeginWrite(). Pickling N equally sized
things via reserve should not cause N reallocs to take place. This acts like
BeginWrite but is a hint that there are more writes coming after it that will
take up at least this much space. It does not try to be any more special than
that.
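The point about N equally sized reserves can be checked with a little arithmetic. The sketch below is an illustration only: CountGrowths is a hypothetical helper, and the two policies it compares (capacity * 2 + item size versus growing to exactly the needed size) are assumptions chosen to mirror the discussion, not code from this CL.

```cpp
#include <cstddef>

// Counts how often the capacity has to change when a caller issues `n`
// hints of `item_size` bytes each and fills the hinted space before the
// next hint. Pure arithmetic; nothing is actually allocated.
int CountGrowths(int n, size_t item_size, bool doubling) {
  size_t size = 0;
  size_t capacity = 0;
  int growths = 0;
  for (int i = 0; i < n; ++i) {
    if (size + item_size > capacity) {
      // Doubling leaves headroom for later hints; exact fit leaves none.
      capacity = doubling ? capacity * 2 + item_size : size + item_size;
      ++growths;
    }
    size += item_size;
  }
  return growths;
}
```

With exact-fit growth every hint triggers a resize, so the count equals n. With the doubling policy the count grows roughly with log2(n), which is the behaviour the reply above is asking for.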

> Also...
> 
> Are you at all concerned about overflows, especially since this is called as a
> public function, with presumably less care and scrutiny than other callers of
> Resize()?

I'm no more concerned than we are in BeginWrite, and I'm not sure why I would
want to be? If we don't Reserve() then we're going to just call
WriteBytes()->BeginWrite() this many times anyways. I don't see the difference
here, am I overlooking something?
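For readers wondering what an overflow guard on this computation could look like, here is a minimal sketch. CheckedGrowth is a hypothetical helper written for this note; the CL does not add such a check, and whether one is needed is exactly what is being debated here.

```cpp
#include <cstddef>
#include <limits>
#include <optional>

// Returns capacity * 2 + needed, or std::nullopt if that sum would not fit
// in a size_t. Hypothetical helper; not part of base::Pickle.
std::optional<size_t> CheckedGrowth(size_t capacity, size_t needed) {
  constexpr size_t kMax = std::numeric_limits<size_t>::max();
  if (capacity > kMax / 2)
    return std::nullopt;  // Doubling alone would overflow.
  const size_t doubled = capacity * 2;
  if (needed > kMax - doubled)
    return std::nullopt;  // Adding the requested size would overflow.
  return doubled + needed;
}
```

A caller could refuse to grow (or fall back to an exact-size request) when the helper returns std::nullopt.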

> FWIW: Realloc already plays such games (exponential growth, by a factor of 2 as
> I recall) when increasing sizes.  As a result, the number of calls to malloc (or
> certainly TCMalloc) during these Resize() operations is limited to about
> log2(final size).

Right, but if we write 4000 objects with 10 fields each (and previously 2-16
elements in each field), we end up with a lot of calls to realloc anyways. Doing
it once is far superior to log_2(n) times, which was I guess around 14 times.

> Was this really a perf issue??  TCMalloc is fast... and log2 of most things are
> small.

Yeh, you can see the bug for more information if you like. We were spending 3ms
just pickling a frame of data for the spaceport benchmark. This is a
pathological case, but one we care about being fast for.

> It strikes me that perchance the benchmarks are run with a very cold TCMalloc
> cache, which needed to do page allocations afresh, and hence you're measuring
> TCMalloc initialization time.
> 
> When you "skip" those intermediate allocations, you avoid having to warm up the
> cache (via OS calls).
> 
> Can you verify that re-running the tests more than once still shows the same
> perf numbers?  In a live Chromium app, the cache is generally quite well warmed
> (sadly, having allocated lots of memory).

The perf tests are here:
https://code.google.com/p/chromium/codesearch#chromium/src/content/common/cc_...

We write the ipc 10 times as a warm up, then write it 100 times and take the
mean of those 100 times as the result. Is that what you are thinking of?
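The measurement scheme described above (10 warm-up writes, then the mean over 100 timed writes) could be sketched as follows. MeanTimeMicros and BenchResult are hypothetical names for this illustration; the real perf test lives at the link above, and details such as whether each run is timed individually are not stated in this thread, so this version simply times the whole batch.

```cpp
#include <chrono>
#include <functional>

struct BenchResult {
  double mean_us;   // Mean wall-clock time per timed run, in microseconds.
  int total_calls;  // Warm-up runs plus timed runs.
};

// Runs `op` `warmup_runs` times untimed, then `timed_runs` times under a
// steady clock, and reports the mean duration of the timed runs.
BenchResult MeanTimeMicros(const std::function<void()>& op,
                           int warmup_runs = 10,
                           int timed_runs = 100) {
  for (int i = 0; i < warmup_runs; ++i)
    op();

  using Clock = std::chrono::steady_clock;
  const auto start = Clock::now();
  for (int i = 0; i < timed_runs; ++i)
    op();
  const std::chrono::duration<double, std::micro> elapsed =
      Clock::now() - start;

  return {elapsed.count() / timed_runs, warmup_runs + timed_runs};
}
```

Because the warm-up runs happen in the same process before timing starts, any one-time allocator setup cost is paid before the measured runs begin, which is the concern raised in the quoted question.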

Tom Sepez

drive-by: I'd expect that its the repeated memcpy's as part of the reallocs that cost ...

7 years, 1 month ago (2013-10-25 18:52:04 UTC) #25

danakj

On Fri, Oct 25, 2013 at 2:52 PM, <tsepez@chromium.org> wrote: > drive-by: I'd expect that ...

7 years, 1 month ago (2013-10-25 22:21:58 UTC) #26

Tom Sepez

The more I look at it, the more I like the idea of removing the ...

7 years, 1 month ago (2013-10-28 17:00:14 UTC) #27

jamesr

We shouldn't CHECK the return value from allocation functions, that's a waste of cycles and ...

7 years, 1 month ago (2013-10-28 17:33:39 UTC) #28

danakj

On Mon, Oct 28, 2013 at 1:00 PM, <tsepez@chromium.org> wrote: > The more I look ...

7 years, 1 month ago (2013-10-28 17:36:18 UTC) #29

danakj

On Mon, Oct 28, 2013 at 1:35 PM, Dana Jansens <danakj@chromium.org> wrote: > On Mon, ...

7 years, 1 month ago (2013-10-28 17:37:59 UTC) #30

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/danakj@chromium.org/35893002/800001

7 years, 1 month ago (2013-10-29 05:32:48 UTC) #33

commit-bot: I haz the power

Retried try job too often on win7_aura for step(s) browser_tests http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=win7_aura&number=94211

7 years, 1 month ago (2013-10-29 08:46:12 UTC) #34

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/danakj@chromium.org/35893002/800001

7 years, 1 month ago (2013-10-29 14:47:30 UTC) #35

jar (doing other things)

One question I kept meaning to ask.... On what platform did you do the perf ...

7 years, 1 month ago (2013-10-29 15:56:51 UTC) #36

danakj

On Tue, Oct 29, 2013 at 11:56 AM, <jar@chromium.org> wrote: > One question I kept ...

7 years, 1 month ago (2013-10-29 16:01:29 UTC) #37

Tom Hudson

On 2013/10/29 15:56:51, jar wrote: > One question I kept meaning to ask.... > > ...

7 years, 1 month ago (2013-10-29 16:06:07 UTC) #38

danakj

On Tue, Oct 29, 2013 at 12:06 PM, <tomhudson@chromium.org> wrote: > On 2013/10/29 15:56:51, jar ...

7 years, 1 month ago (2013-10-29 16:26:03 UTC) #39

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/danakj@chromium.org/35893002/800001

7 years, 1 month ago (2013-10-29 16:37:48 UTC) #40

danakj

7 years, 1 month ago (2013-10-29 22:23:58 UTC) #41

Message was sent while issue was closed.

Committed patchset #9 manually as r231656 (presubmit successful).
