|
|
DescriptionWorkerThread CPU Load UMA
I feel we might want to separate out these UMAs per worker-types and
for background/foreground cases, but starting with the simplest one.
BUG=692906
Review-Url: https://codereview.chromium.org/2749383003
Cr-Commit-Position: refs/heads/master@{#462006}
Committed: https://chromium.googlesource.com/chromium/src/+/d81b3970f7e6120f5b145af37d330db86b458dba
Patch Set 1 #Patch Set 2 : . #
Total comments: 14
Patch Set 3 : updated intervals #Patch Set 4 : histogram comments #
Total comments: 2
Patch Set 5 : addressed comment #Patch Set 6 : build fix #
Messages
Total messages: 41 (20 generated)
Description was changed from ========== Worker Load UMA - SW is killed in 30 seconds idle - SW is killed if it's in 5 minutes in event - SW is killed if it's executing JS for 30 seconds BUG= ========== to ========== WorkerThread CPU Load UMA I feel we might want to separate out these UMAs per worker-types, but let's start with the simplest one. BUG= ==========
Description was changed from ========== WorkerThread CPU Load UMA I feel we might want to separate out these UMAs per worker-types, but let's start with the simplest one. BUG= ========== to ========== WorkerThread CPU Load UMA I feel we might want to separate out these UMAs per worker-types and for background/foreground cases, but starting with the simplest one. BUG=692906 ==========
kinuko@chromium.org changed reviewers: + altimin@chromium.org, falken@chromium.org, nhiroki@chromium.org
The CQ bit was checked by kinuko@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Patchset #2 (id:20001) has been deleted
The CQ bit was checked by kinuko@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Patchset #2 (id:40001) has been deleted
A bit rough but sending this to get early feedback. Could you? altimin, (and/or sami): overall falken, nhiroki: review from worker pov https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... File third_party/WebKit/Source/platform/scheduler/base/time_converter.h (right): https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... third_party/WebKit/Source/platform/scheduler/base/time_converter.h:18: base::TimeDelta::FromSecondsD(monotonicTimeInSeconds); Ended up factoring out this simple inline method but welcoming feedback https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... File third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.cc (right): https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.cc:133: base::TimeTicks end_time_ticks = MonotonicTimeInSecondsToTimeTicks(end_time); I tried to make the code consistent here (i.e. using TaskTimeObserver and MonotinicTimeInSecondsToTimeTicks conversion to get these) but let me know if that's not the recommended way.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
skyostil@chromium.org changed reviewers: + skyostil@chromium.org
lgtm, I think this is a good first step to get some data here. https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... File third_party/WebKit/Source/platform/scheduler/base/time_converter.h (right): https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... third_party/WebKit/Source/platform/scheduler/base/time_converter.h:18: base::TimeDelta::FromSecondsD(monotonicTimeInSeconds); On 2017/03/22 09:41:10, kinuko wrote: > Ended up factoring out this simple inline method but welcoming feedback Seems like a good idea, thanks!
lgtm % reporting interval selection https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... File third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.cc (right): https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.cc:24: base::TimeDelta::FromSeconds(30); I wonder if we should use lower windows for these intervals. 1 minute was fine for main thread because we could rely on new tasks coming to the main thread. I wonder if we should use something like 1 second here? I also wonder if we need WaitingPeriodBeforeReporting here? (probably not).
lgtm https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... File third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.h (right): https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.h:23: TaskTimeObserver { No public/protected/private. Is this intended? https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.h:64: ThreadLoadTracker load_tracker_; Hm... it looks like this field name doesn't obey the blink's coding standard (this should be m_loadTracker), I'm ok with keeping this as is for consistency with others in this file though.
Thanks! (inline response only) https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... File third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.cc (right): https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.cc:24: base::TimeDelta::FromSeconds(30); On 2017/03/22 22:25:24, altimin wrote: > I wonder if we should use lower windows for these intervals. 1 minute was fine > for main thread because we could rely on new tasks coming to the main thread. I > wonder if we should use something like 1 second here? Matt, Hiroki- any thoughts here? Shorter time makes sense to me in general. > I also wonder if we need WaitingPeriodBeforeReporting here? (probably not). I wondered the same-- Yeah maybe not.
https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... File third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.cc (right): https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.cc:24: base::TimeDelta::FromSeconds(30); On 2017/03/23 05:42:32, kinuko wrote: > On 2017/03/22 22:25:24, altimin wrote: > > I wonder if we should use lower windows for these intervals. 1 minute was fine > > for main thread because we could rely on new tasks coming to the main thread. > I > > wonder if we should use something like 1 second here? > > Matt, Hiroki- any thoughts here? Shorter time makes sense to me in general. At first I was worried that a 1 second reporting interval would mean the worker thread gets woken up every to record something every 1 second, but from code inspection it looks like that is not the case; instead, the sample count just increases when recording and the actual work done is not increased. Do we reasonably expect the recording to happen on worker shutdown? For example, if there is a dedicated worker that is running some heavy task indefinitely, until the user closes the tab: would that be reflected in the histograms as 100% load, or would we never get a chance to record something? > > > I also wonder if we need WaitingPeriodBeforeReporting here? (probably not). > > I wondered the same-- Yeah maybe not. https://codereview.chromium.org/2749383003/diff/60001/tools/metrics/histogram... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/2749383003/diff/60001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:80104: + metric is emitted when the worker thread task is completed. Should we elaborate here about the reporting interval? If I understand correctly, if the worker ran a single task for 10 seconds, and we have a 1 second reporting interval, and the worker lived for 30 seconds, would we have 30 samples: 20 samples of 0% and 10 samples of 100%? And if the worker ran 5 tasks, each 100ms, and we have a 1 second reporting interval we'd see 1 sample for 50% load?
Thanks. We had offline discussion and concluded that: 1) hold off landing this, but add yet another UMA for worker thread lifetime in general, and then 2) decide the ReportingInterval and land this. https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... File third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.cc (right): https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.cc:24: base::TimeDelta::FromSeconds(30); On 2017/03/23 06:48:05, falken wrote: > On 2017/03/23 05:42:32, kinuko wrote: > > On 2017/03/22 22:25:24, altimin wrote: > > > I wonder if we should use lower windows for these intervals. 1 minute was > fine > > > for main thread because we could rely on new tasks coming to the main > thread. > > I > > > wonder if we should use something like 1 second here? > > > > Matt, Hiroki- any thoughts here? Shorter time makes sense to me in general. > > At first I was worried that a 1 second reporting interval would mean the worker > thread gets woken up every to record something every 1 second, but from code > inspection it looks like that is not the case; instead, the sample count just > increases when recording and the actual work done is not increased. > > Do we reasonably expect the recording to happen on worker shutdown? For example, > if there is a dedicated worker that is running some heavy task indefinitely, > until the user closes the tab: would that be reflected in the histograms as 100% > load, or would we never get a chance to record something? > > > > > > I also wonder if we need WaitingPeriodBeforeReporting here? (probably not). > > > > I wondered the same-- Yeah maybe not. > Looking into the code it looks we can correctly log when the worker shuts down (regardless of whether the shutdown was graceful or not), but if the worker thread's life time tends to be shorter than the interval we'll lose a lot of stats.
https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... File third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.h (right): https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.h:64: ThreadLoadTracker load_tracker_; On 2017/03/23 05:20:04, nhiroki wrote: > Hm... it looks like this field name doesn't obey the blink's coding standard > (this should be m_loadTracker), I'm ok with keeping this as is for consistency > with others in this file though. This code originally lived in content/ and then moved into Blink. We didn't change the naming because Blink is planning to switch to Chromium style soon anyway, so let's keep using Chromium style here.
Updated the interval & comments in the histogram. PTAL? https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... File third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.cc (right): https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.cc:24: base::TimeDelta::FromSeconds(30); On 2017/03/23 07:00:20, kinuko wrote: > On 2017/03/23 06:48:05, falken wrote: > > On 2017/03/23 05:42:32, kinuko wrote: > > > On 2017/03/22 22:25:24, altimin wrote: > > > > I wonder if we should use lower windows for these intervals. 1 minute was > > fine > > > > for main thread because we could rely on new tasks coming to the main > > thread. > > > I > > > > wonder if we should use something like 1 second here? > > > > > > Matt, Hiroki- any thoughts here? Shorter time makes sense to me in general. > > > > At first I was worried that a 1 second reporting interval would mean the > worker > > thread gets woken up every to record something every 1 second, but from code > > inspection it looks like that is not the case; instead, the sample count just > > increases when recording and the actual work done is not increased. > > > > Do we reasonably expect the recording to happen on worker shutdown? For > example, > > if there is a dedicated worker that is running some heavy task indefinitely, > > until the user closes the tab: would that be reflected in the histograms as > 100% > > load, or would we never get a chance to record something? > > > > > > > > > I also wonder if we need WaitingPeriodBeforeReporting here? (probably > not). > > > > > > I wondered the same-- Yeah maybe not. > > > > Looking into the code it looks we can correctly log when the worker shuts down > (regardless of whether the shutdown was graceful or not), but if the worker > thread's life time tends to be shorter than the interval we'll lose a lot of > stats. Ok, from the UMA looks like about 85% of workers live ~1 sec (About 6~7% live about 1 min and I assume they're SW cases), so setting this like 1 sec seems like a reasonable decision. I also set BeforeReporting interval to 0. https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... File third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.h (right): https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.h:23: TaskTimeObserver { On 2017/03/23 05:20:04, nhiroki wrote: > No public/protected/private. Is this intended? Not really, made it public.
lgtm. https://codereview.chromium.org/2749383003/diff/100001/third_party/WebKit/Sou... File third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.cc (right): https://codereview.chromium.org/2749383003/diff/100001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.cc:110: DCHECK(initialized_); nit: Please keep the DCHECK first.
kinuko@chromium.org changed reviewers: + isherman@chromium.org
isherman@: could you review histogram changes? https://codereview.chromium.org/2749383003/diff/100001/third_party/WebKit/Sou... File third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.cc (right): https://codereview.chromium.org/2749383003/diff/100001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.cc:110: DCHECK(initialized_); On 2017/04/03 13:56:22, Sami wrote: > nit: Please keep the DCHECK first. Done.
Metrics LGTM
lgtm https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... File third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.h (right): https://codereview.chromium.org/2749383003/diff/60001/third_party/WebKit/Sour... third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.h:64: ThreadLoadTracker load_tracker_; On 2017/03/24 10:48:10, Sami wrote: > On 2017/03/23 05:20:04, nhiroki wrote: > > Hm... it looks like this field name doesn't obey the blink's coding standard > > (this should be m_loadTracker), I'm ok with keeping this as is for consistency > > with others in this file though. > > This code originally lived in content/ and then moved into Blink. We didn't > change the naming because Blink is planning to switch to Chromium style soon > anyway, so let's keep using Chromium style here. I see. Thank you for the clarification
lgtm. It seems that sleeping in the middle of the task can affect this metric (as well as main thread load metric). I'm thinking about a good way to combat this (discarding very long tasks seems like a good idea).
On 2017/04/04 11:20:55, altimin wrote: > lgtm. > > It seems that sleeping in the middle of the task can affect this metric (as well > as main thread load metric). I'm thinking about a good way to combat this > (discarding very long tasks seems like a good idea). Yeah that's a good point. (I assume you're fixing that in your side so this patch can land- thanks!)
falken@: do you want to take one more look or I can just land?
lgtm
The CQ bit was checked by kinuko@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from skyostil@chromium.org Link to the patchset: https://codereview.chromium.org/2749383003/#ps120001 (title: "addressed comment")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: cast_shell_android on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/cast_shell_a...) cast_shell_linux on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/cast_shell_linu...)
The CQ bit was checked by kinuko@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from falken@chromium.org, isherman@chromium.org, altimin@chromium.org, nhiroki@chromium.org, skyostil@chromium.org Link to the patchset: https://codereview.chromium.org/2749383003/#ps140001 (title: "build fix")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
CQ is committing da patch. Bot data: {"patchset_id": 140001, "attempt_start_ts": 1491371994385230, "parent_rev": "85e49faca65081cad27e7c8ed43b02261afcaa96", "commit_rev": "d81b3970f7e6120f5b145af37d330db86b458dba"}
Message was sent while issue was closed.
Description was changed from ========== WorkerThread CPU Load UMA I feel we might want to separate out these UMAs per worker-types and for background/foreground cases, but starting with the simplest one. BUG=692906 ========== to ========== WorkerThread CPU Load UMA I feel we might want to separate out these UMAs per worker-types and for background/foreground cases, but starting with the simplest one. BUG=692906 Review-Url: https://codereview.chromium.org/2749383003 Cr-Commit-Position: refs/heads/master@{#462006} Committed: https://chromium.googlesource.com/chromium/src/+/d81b3970f7e6120f5b145af37d33... ==========
Message was sent while issue was closed.
Committed patchset #6 (id:140001) as https://chromium.googlesource.com/chromium/src/+/d81b3970f7e6120f5b145af37d33... |