Chromium Code Reviews
Created: 3 years, 10 months ago by Charlie Harrison
Modified: 3 years, 10 months ago
Reviewers: kinuko
CC: chromium-reviews, loading-reviews@chromium.org, jam, darin-cc@chromium.org, Randy Smith (Not in Mondays), mmenke
Target Ref: refs/pending/heads/master
Project: chromium
Visibility: Public
Description
Get rid of quadratic behavior in ResourceScheduler
The ResourceScheduler often receives bursts of IPCs from the renderer,
which call LoadAnyStartablePendingRequests. This method is O(n) for n
pending requests for that client/tab.
Because the IPCs are bursty, we often have a message queue of m messages,
each one calling into LoadAnyStartablePendingRequests, yielding O(m*n)
behavior.
This patch removes the O(m*n) behavior. As soon as one of these "bursty"
messages is handled, we schedule a new task to call
LoadAnyStartablePendingRequests. If the message queue has m messages
queued up, this will put the call to LoadAnyStartablePendingRequests at
the end. By ensuring we only have one scheduled task to load the startable
requests, we effectively coalesce all of the other calls into this method.
Another technique to remove this inefficiency would be to batch the IPC
messages into one. This is being explored by kinuko@ in issue 672370.
This approach, however, is very simple, and could easily be removed if we
move to a batching system.
BUG=664174
Review-Url: https://codereview.chromium.org/2670843007
Cr-Commit-Position: refs/heads/master@{#448634}
Committed: https://chromium.googlesource.com/chromium/src/+/59501351d872c1348665ad6dfd23b09396821056
Patch Set 1
Patch Set 2: Get rid of O(n^2) behavior in ResourceScheduler
Patch Set 3: properly initialize members, more comment
Total comments: 2
Messages
Total messages: 30 (20 generated)
The CQ bit was checked by csharrison@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_android_rel_ng on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi...)
The CQ bit was checked by csharrison@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders:
  cast_shell_linux on master.tryserver.chromium.linux (JOB_TIMED_OUT, no build URL)
  chromeos_amd64-generic_chromium_compile_only_ng on master.tryserver.chromium.linux (JOB_TIMED_OUT, no build URL)
  chromeos_daisy_chromium_compile_only_ng on master.tryserver.chromium.linux (JOB_TIMED_OUT, no build URL)
  chromium_presubmit on master.tryserver.chromium.linux (JOB_TIMED_OUT, no build URL)
  linux_chromium_asan_rel_ng on master.tryserver.chromium.linux (JOB_TIMED_OUT, no build URL)
  linux_chromium_chromeos_ozone_rel_ng on master.tryserver.chromium.linux (JOB_TIMED_OUT, no build URL)
  linux_chromium_compile_dbg_ng on master.tryserver.chromium.linux (JOB_TIMED_OUT, no build URL)
  linux_chromium_rel_ng on master.tryserver.chromium.linux (JOB_TIMED_OUT, no build URL)
  linux_chromium_tsan_rel_ng on master.tryserver.chromium.linux (JOB_TIMED_OUT, no build URL)
  win_chromium_compile_dbg_ng on master.tryserver.chromium.win (JOB_TIMED_OUT, no build URL)
  win_chromium_rel_ng on master.tryserver.chromium.win (JOB_TIMED_OUT, no build URL)
  win_chromium_x64_rel_ng on master.tryserver.chromium.win (JOB_TIMED_OUT, no build URL)
  win_clang on master.tryserver.chromium.win (JOB_TIMED_OUT, no build URL)
The CQ bit was checked by csharrison@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
csharrison@chromium.org changed reviewers: + kinuko@chromium.org
Hey Kinuko, WDYT about this approach to make the resource scheduler a bit more CPU-friendly as a stopgap until IPC batching is a more mature pattern?
Looks reasonable, but I'm a bit concerned about how this might change the behavior.
https://codereview.chromium.org/2670843007/diff/40001/content/browser/loader/...
File content/browser/loader/resource_scheduler.cc (right):
content/browser/loader/resource_scheduler.cc:738: weak_ptr_factory_.GetWeakPtr(), trigger));
This changes when we actually trigger startable requests, especially if we already have lots of in-flight tasks in the queue (and during contentious loading it looks like that's often the case). Do you think it could affect performance?
content/browser/loader/resource_scheduler.cc:738: weak_ptr_factory_.GetWeakPtr(), trigger));
On 2017/02/06 20:57:14, kinuko wrote:
> Do you think it could affect performance?
This is a valid concern. We currently "schedule" these start tasks on two conditions:
1. Reprioritization (which often comes in batches for image priorities).
2. Request teardown (~ResourceThrottle). I think this will only come from batches from callers of ResourceFetcher::stopFetching (e.g. stopAllLoaders).
In my mind, if we have a bloated queue during these times, it feels better to delay here than to delay the current work on the queue with lots of CPU-bound work. A single extra posted task during request finish / reprioritization seems not too bad to me.
If you'd like, I can add a histogram of the wait time for the scheduled task?
On 2017/02/06 21:24:29, Charlie Harrison wrote:
> If you'd like, I can add a histogram of the wait time for the scheduled task?
Case #1 made me worry a bit, as it looks like it could be called during layout, also before FMP, but if you don't think that'd be an issue I'm fine with this change. Or, for this one, we could possibly also just batch them in the renderer for now.
On 2017/02/07 05:19:15, kinuko wrote:
> Case #1 made me worry a bit, as it looks like it could be called during layout, also before FMP [...]
I think for #1 the only real danger is if N more tasks are posted to the IO thread between the first reprioritization task being sent and it being handled, where N >= M reprioritization messages. If N = M, then we only have to handle the M reprioritization tasks, saving us lots of CPU.
I liked this change because it was very simple, but now I am feeling less sure. Honestly, if the IO thread is super busy servicing requests, then maybe adding some delay here is a good thing?
Note that Randy already implemented batch IPC from the renderer here: https://codereview.chromium.org/2552703003/ but we decided to wait for the more general thing.
LMK what you think. Happy to go with whatever you think is best here.
On 2017/02/07 05:59:10, Charlie Harrison wrote:
> Note that Randy already implemented batch IPC from the renderer here:
> https://codereview.chromium.org/2552703003/ but we decided to wait for the more general thing.
Yes, I remember this; I thought we could plumb this from Blink too.
> LMK what you think. Happy to go with whatever you think is best here.
Ok, we can land this and see how it goes. lgtm
Do you have a set of particular sites + stats we want to watch after this lands?
I am hoping page cyclers will catch regressions or perf improvements. Other than that, I think the main win is image-heavy sites like Google Images, but even then it is more of a death-by-a-thousand-cuts thing.
The CQ bit was checked by csharrison@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
CQ is committing da patch.
Bot data: {"patchset_id": 40001, "attempt_start_ts": 1486479851016940, "parent_rev": "d86a44963ae8521c90be4cc0343ed6bfa8a872a7", "commit_rev": "59501351d872c1348665ad6dfd23b09396821056"}
Message was sent while issue was closed.
Committed patchset #3 (id:40001) as https://chromium.googlesource.com/chromium/src/+/59501351d872c1348665ad6dfd23...
