Created: 5 years, 7 months ago by jdduke (slow)
Modified: 5 years, 7 months ago
CC: chromium-reviews, telemetry-reviews_chromium.org, Rick Byers
Base URL: https://chromium.googlesource.com/chromium/src.git@master
Target Ref: refs/pending/heads/master
Project: chromium
Visibility: Public
Description: Support total and per-second task/threadtime timeline metrics
The current threadtime timeline measurement only supports per-frame
metrics. Generalize this to support total and per-second metrics for
threadtimes and tasks.
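A rough sketch of the three normalizations the description refers to. The helper names and data layout here are illustrative only, not the actual timeline.py API:

```python
def total_cpu_time(slice_cpu_times):
    """Total CPU time (ms) summed across all recorded slices."""
    return sum(slice_cpu_times)

def cpu_time_per_frame(slice_cpu_times, num_frames):
    """Mean CPU time (ms) spent per rendered frame (the existing metric)."""
    return total_cpu_time(slice_cpu_times) / num_frames if num_frames else 0.0

def cpu_time_per_second(slice_cpu_times, interaction_ms):
    """CPU time (ms) per second of interaction wall time (the new metric)."""
    num_seconds = interaction_ms / 1000.0
    return total_cpu_time(slice_cpu_times) / num_seconds if num_seconds else 0.0
```

For example, 15 ms of thread CPU time over a 3-second interaction yields 5 ms of CPU time per second.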
BUG=466213
CQ_EXTRA_TRYBOTS=tryserver.chromium.perf:mac_perf_bisect;tryserver.chromium.perf:android_nexus5_perf_bisect
Committed: https://crrev.com/ce2699e20c3eae9dc17573fd2eb0885875341f44
Cr-Commit-Position: refs/heads/master@{#329322}
Patch Set 1
Patch Set 2: Cleanup
Total comments: 9
Patch Set 3: Support for per-second threadtimes
Patch Set 4: Cleanup and working tests
Patch Set 5: Filter metrics
Patch Set 6: Cleanup
Total comments: 2
Patch Set 7: Use time from interaction_records
Total comments: 8
Patch Set 8: Code review
Patch Set 9: Cleanup
Total comments: 5
Patch Set 10: Code review (use %)
Total comments: 2
Patch Set 11: Change interval to seconds
Total comments: 2
Patch Set 12: Precomputed expectations
Messages
Total messages: 38 (10 generated)
jdduke@chromium.org changed reviewers: + nednguyen@google.com
As requested, here's the change that implements the measurement/metric changes with tests. https://codereview.chromium.org/1084533005/diff/20001/tools/perf/metrics/time... File tools/perf/metrics/timeline.py (right): https://codereview.chromium.org/1084533005/diff/20001/tools/perf/metrics/time... tools/perf/metrics/timeline.py:130: def ThreadCpuTimeResultName(thread_category, measure_per_frame=True): I'm not married to the |measure_per_frame| name.
nednguyen@google.com changed reviewers: + sullivan@chromium.org
Before even introducing new metrics, we should first start by experimenting with a metric that would catch crbug.com/462752. I want to see an experimental metric that clearly regresses if you revert the fix in 462752. https://codereview.chromium.org/1084533005/diff/20001/tools/perf/metrics/time... File tools/perf/metrics/timeline.py (right): https://codereview.chromium.org/1084533005/diff/20001/tools/perf/metrics/time... tools/perf/metrics/timeline.py:130: def ThreadCpuTimeResultName(thread_category, measure_per_frame=True): On 2015/04/29 18:07:17, jdduke wrote: > I'm not married to the |measure_per_frame| name. I don't think these helper functions add any value in making the code more readable or shorter. We should get rid of them. https://codereview.chromium.org/1084533005/diff/20001/tools/perf/metrics/time... tools/perf/metrics/timeline.py:134: return "thread_" + thread_category + "_cpu_time_total" How stable is this thread_X_cpu_time_total metric? Note that we sometimes have situations where the page waits on some js condition. Since the waiting logic is polling, I suspect that this metric has high variance. https://codereview.chromium.org/1084533005/diff/20001/tools/perf/metrics/time... tools/perf/metrics/timeline.py:139: return "tasks_" + thread_category Same here: I think you should come up with a new ThreadTimeTotal metric that computes something like tasks/second to avoid the timing issue. https://codereview.chromium.org/1084533005/diff/20001/tools/perf/metrics/time... tools/perf/metrics/timeline.py:255: def __init__(self, measure_per_frame=True): Can you refactor this differently so we have two different classes instead of one class with a param like this? The idea is that later we're going to port the thread_times benchmark to use timeline based measurement instead. And in the TBM model, all the timeline_based_metric classes should be directly constructible so we can do things like: for all subclasses S of timeline based metrics: S.AddResults(...)
On 2015/04/30 05:07:45, nednguyen wrote: > Before even introducing new metrics, we should first start with experimenting a > metric that would catch crbug.com/462752. > > I want to see an experimental metric that clearly regresses if you revert the > fix in 462752. Now I'm confused. The metric you speak of IS this metric. All of the threadtimes/tasks counts here regress significantly in all of the "screen off" key_power_cases you were reviewing (https://codereview.chromium.org/1097463004/) without the fix we made to crbug.com/462752. Maybe I wasn't clear, but that patch is indeed a regression test for crbug.com/462752.
https://codereview.chromium.org/1084533005/diff/20001/tools/perf/metrics/time... File tools/perf/metrics/timeline.py (right): https://codereview.chromium.org/1084533005/diff/20001/tools/perf/metrics/time... tools/perf/metrics/timeline.py:134: return "thread_" + thread_category + "_cpu_time_total" On 2015/04/30 05:07:45, nednguyen wrote: > How stable is this thread_X_cpu_time_total metric? Note that we sometime have > situation that page waits on some js condition. Since the waiting logic is > polling, I suspect that this metric has high variance. In local testing, it's quite stable. Part of that may be that, with each page measurement (for key_power_cases) we simply wait N seconds. That is, we do all of the work before the actual measurement is run (load the page, turn off the screen, wait a couple seconds for stability, then actually start the measurement). However, measuring per second will probably scale a lot more nicely in the future (at least for this case... that won't be useful for the case rbyers@ mentioned with testing taps, but at least if we generalize the normalization category we'll be ready to do per-event timings or otherwise). I'll make the change and reupload. https://codereview.chromium.org/1084533005/diff/20001/tools/perf/metrics/time... tools/perf/metrics/timeline.py:139: return "tasks_" + thread_category On 2015/04/30 05:07:45, nednguyen wrote: > Same here, I think you should come up with a new ThreadTimeTotal metric that > computes something like tasks/seconds to avoid the timing issue. Will do. https://codereview.chromium.org/1084533005/diff/20001/tools/perf/metrics/time... tools/perf/metrics/timeline.py:255: def __init__(self, measure_per_frame=True): On 2015/04/30 05:07:45, nednguyen wrote: > Can you refactor this differently so we have two different classes instead of > one class with param like this? > > The idea here is later, we gonna port thread_times benchmark to use timeline > based measurement instead. 
And in TBM model, all the timeline_based_metric > should be directly constructible so we can do things like: > > for all subclass S of timeline based metrics: > S.AddResults(...) Sure, what about a PerFrameThreadTimesTimelineMetric/PerSecondThreadTimesTimelineMetric that essentially just expose the denominator to the base class?
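A minimal sketch of what that split could look like, assuming a hypothetical base class that leaves only the denominator to subclasses (names and structure are illustrative, not the actual telemetry code):

```python
class _ThreadTimesTimelineMetric(object):
    """Illustrative base: normalizes CPU time by a subclass-chosen denominator."""

    def _Denominator(self, stats):
        raise NotImplementedError

    def NormalizedCpuTime(self, stats):
        denom = self._Denominator(stats)
        return stats['cpu_time_ms'] / denom if denom else 0.0

class PerFrameThreadTimesTimelineMetric(_ThreadTimesTimelineMetric):
    def _Denominator(self, stats):
        return stats['num_frames']

class PerSecondThreadTimesTimelineMetric(_ThreadTimesTimelineMetric):
    def _Denominator(self, stats):
        return stats['interaction_ms'] / 1000.0
```

Because each subclass takes no constructor arguments, a TBM-style driver could instantiate and iterate all subclasses uniformly, as nednguyen describes.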
On 2015/04/30 14:21:58, jdduke wrote: > On 2015/04/30 05:07:45, nednguyen wrote: > > Before even introducing new metrics, we should first start with experimenting > a > > metric that would catch crbug.com/462752. > > > > I want to see an experimental metric that clearly regresses if you revert the > > fix in 462752. > > Now I'm confused. The metric you speak of IS this metric. All of the > threadtimes/tasks counts here regress significantly in all of the "screen off" > key_power_cases you were reviewing (https://codereview.chromium.org/1097463004/) > without the fix we made to crbug.com/462752. Maybe I wasn't clear, but that > patch is indeed a regression test for crbug.com/462752. Great. I just want to make sure that you run the experiment locally and verify that these metrics do regress.
Patchset #3 (id:40001) has been deleted
I've made the adjustment to use per-second thread times. Mind taking another look? https://codereview.chromium.org/1084533005/diff/20001/tools/perf/metrics/time... File tools/perf/metrics/timeline.py (right): https://codereview.chromium.org/1084533005/diff/20001/tools/perf/metrics/time... tools/perf/metrics/timeline.py:130: def ThreadCpuTimeResultName(thread_category, measure_per_frame=True): On 2015/04/30 05:07:45, nednguyen wrote: > On 2015/04/29 18:07:17, jdduke wrote: > > I'm not married to the |measure_per_frame| name. > > I don't thing these helpers function add any value in making the code more > readable or shorter. We should get rid of these. Done.
Thanks for your patience, this is my first non-trivial telemetry change =/. Per offline discussion, I've changed the metric to report all interval types, filtering out the ones we don't want.
https://codereview.chromium.org/1084533005/diff/120001/tools/perf/metrics/tim... File tools/perf/metrics/timeline.py (right): https://codereview.chromium.org/1084533005/diff/120001/tools/perf/metrics/tim... tools/perf/metrics/timeline.py:293: num_seconds = all_threads.ComputeSliceTimeRange() / 1000.0 Don't you want to use num_seconds = sum all the time of the interaction records instead of the total time of all thread's slices? If I recall correctly, these metrics are supposed to be for interaction records on each thread, not for the whole threads.
On 2015/05/01 19:15:10, nednguyen wrote: > https://codereview.chromium.org/1084533005/diff/120001/tools/perf/metrics/tim... > File tools/perf/metrics/timeline.py (right): > > https://codereview.chromium.org/1084533005/diff/120001/tools/perf/metrics/tim... > tools/perf/metrics/timeline.py:293: num_seconds = > all_threads.ComputeSliceTimeRange() / 1000.0 > Don't you want to use num_seconds = sum all the time of the interaction records > instead of the total time of all thread's slices? > > If I recall correctly, these metrics are supposed to be for interaction records > on each thread, not for the whole threads. We don't want to sum anything. We want the duration of the interaction, independent of the number of slices or threads. Is there a better way to find that beside finding the first and last event and computing the interval?
On 2015/05/01 19:29:10, jdduke wrote: > On 2015/05/01 19:15:10, nednguyen wrote: > > > https://codereview.chromium.org/1084533005/diff/120001/tools/perf/metrics/tim... > > File tools/perf/metrics/timeline.py (right): > > > > > https://codereview.chromium.org/1084533005/diff/120001/tools/perf/metrics/tim... > > tools/perf/metrics/timeline.py:293: num_seconds = > > all_threads.ComputeSliceTimeRange() / 1000.0 > > Don't you want to use num_seconds = sum all the time of the interaction > records > > instead of the total time of all thread's slices? > > > > If I recall correctly, these metrics are supposed to be for interaction > records > > on each thread, not for the whole threads. > > We don't want to sum anything. We want the duration of the interaction, > independent of the number of slices or threads. > > Is there a better way to find that besides finding the first and last event and > computing the interval? The interaction_record API has GetBounds and GetOverlappedThreadTimeForSlice that you can use.
https://codereview.chromium.org/1084533005/diff/120001/tools/perf/metrics/tim... File tools/perf/metrics/timeline.py (right): https://codereview.chromium.org/1084533005/diff/120001/tools/perf/metrics/tim... tools/perf/metrics/timeline.py:293: num_seconds = all_threads.ComputeSliceTimeRange() / 1000.0 On 2015/05/01 19:15:10, nednguyen wrote: > Don't you want to use num_seconds = sum all the time of the interaction records > instead of the total time of all thread's slices? > > If I recall correctly, these metrics are supposed to be for interaction records > on each thread, not for the whole threads. I see now, so we can just reuse ResultsForThread.all_action_time?
Patchset #7 (id:100002) has been deleted
On 2015/05/01 21:08:47, jdduke wrote: > https://codereview.chromium.org/1084533005/diff/120001/tools/perf/metrics/tim... > File tools/perf/metrics/timeline.py (right): > > https://codereview.chromium.org/1084533005/diff/120001/tools/perf/metrics/tim... > tools/perf/metrics/timeline.py:293: num_seconds = > all_threads.ComputeSliceTimeRange() / 1000.0 > On 2015/05/01 19:15:10, nednguyen wrote: > > Don't you want to use num_seconds = sum all the time of the interaction > records > > instead of the total time of all thread's slices? > > > > If I recall correctly, these metrics are supposed to be for interaction > records > > on each thread, not for the whole threads. > > I see now, so we can just reuse ResultsForThread.all_action_time? That's what I went with, definitely works for our needs.
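For illustration, the two candidate denominators discussed above, using plain (start, end) tuples as a stand-in for telemetry's actual slice and interaction_record objects:

```python
def slice_time_range_ms(slices):
    """First-to-last-event interval across all threads' slices
    (what ComputeSliceTimeRange-style logic would yield)."""
    if not slices:
        return 0.0
    return max(end for _, end in slices) - min(start for start, _ in slices)

def all_action_time_ms(interaction_records):
    """Sum of each interaction record's duration
    (the ResultsForThread.all_action_time approach that was chosen)."""
    return sum(end - start for start, end in interaction_records)
```

The two differ whenever interaction records don't cover the whole trace: the slice range includes idle gaps between interactions, while summing record durations counts only the interactions themselves.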
nednguyen@chromium.org changed reviewers: + nednguyen@chromium.org
For the case it turns out to be ms/ms, we probably want to multiply by 100 and put % as units. https://codereview.chromium.org/1084533005/diff/150001/tools/perf/metrics/tim... File tools/perf/metrics/timeline.py (right): https://codereview.chromium.org/1084533005/diff/150001/tools/perf/metrics/tim... tools/perf/metrics/timeline.py:104: IntervalNames = ["frame", "second"] Instead of seconds, we probably want to use ms for consistency with existing metrics. https://codereview.chromium.org/1084533005/diff/150001/tools/perf/metrics/tim... tools/perf/metrics/timeline.py:146: "" if interval_name == "frame" else "_per_" + interval_name) This is a bit hard to read, can you change this to: if interval_name == "frame": .. elif interval_name == "ms": .. https://codereview.chromium.org/1084533005/diff/150001/tools/perf/metrics/tim... tools/perf/metrics/timeline.py:214: mean_frame_time = Rate(self.all_action_time, num_frames) This computation only applies to the per_frame metrics. https://codereview.chromium.org/1084533005/diff/150001/tools/perf/metrics/tim... tools/perf/metrics/timeline.py:233: 'ms', self_time_result)) With the per-second metrics, the unit 'ms' here isn't correct. Also for consistency, we probably want to use per ms, I think.
https://codereview.chromium.org/1084533005/diff/150001/tools/perf/metrics/tim... File tools/perf/metrics/timeline.py (right): https://codereview.chromium.org/1084533005/diff/150001/tools/perf/metrics/tim... tools/perf/metrics/timeline.py:104: IntervalNames = ["frame", "second"] On 2015/05/01 22:00:45, nednguyen(REVIEW IN OTHER ACC) wrote: > Instead of seconds, we probably want to use ms for consistency with existing > metrics. Done. https://codereview.chromium.org/1084533005/diff/150001/tools/perf/metrics/tim... tools/perf/metrics/timeline.py:146: "" if interval_name == "frame" else "_per_" + interval_name) On 2015/05/01 22:00:45, nednguyen(REVIEW IN OTHER ACC) wrote: > This is a bit hard to read, can you change this to: > if internal_name == "frame": > .. > elif internal_name == "ms": > .. Done. https://codereview.chromium.org/1084533005/diff/150001/tools/perf/metrics/tim... tools/perf/metrics/timeline.py:214: mean_frame_time = Rate(self.all_action_time, num_frames) On 2015/05/01 22:00:45, nednguyen(REVIEW IN OTHER ACC) wrote: > This computation only applies for per_frames metric. Right, I'll move it out. https://codereview.chromium.org/1084533005/diff/150001/tools/perf/metrics/tim... tools/perf/metrics/timeline.py:233: 'ms', self_time_result)) On 2015/05/01 22:00:44, nednguyen(REVIEW IN OTHER ACC) wrote: > With the per second metrics, the unit 'ms' here isn't correct. > Also for consistency, we probably want to use per ms, I think. Seconds as an interval is nice because it's slightly easier to reason about, particularly for tasks, but yeah I can see it being confusing. As we're primarily interested in how this metric changes relatively, rather than the absolute number, it's probably best to be consistent as you say. I'm slightly wary of using a % as we could have >100% cpu time per system time for any multicore device, but if you think that's better I'm OK with it as well. 
Note that the smaller interval means that the baseline threadtime/task counts per ms will be very low, but again I guess it doesn't matter as it's relative.
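A hypothetical sketch of the percent computation under discussion, and why it can exceed 100% on a multicore device (the function name is an assumption, not the patch's actual code):

```python
def cpu_time_percent(thread_cpu_ms, wall_ms):
    """CPU time as a percentage of wall-clock time for the same interval."""
    return 100.0 * thread_cpu_ms / wall_ms if wall_ms else 0.0

# Two threads each busy for 800 ms of a 1000 ms interaction: on a
# multicore device they can run in parallel, so the combined figure
# legitimately exceeds 100%.
combined = cpu_time_percent(800.0 + 800.0, 1000.0)  # 160.0
```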
On 2015/05/01 23:49:02, jdduke wrote: > https://codereview.chromium.org/1084533005/diff/150001/tools/perf/metrics/tim... > File tools/perf/metrics/timeline.py (right): > > https://codereview.chromium.org/1084533005/diff/150001/tools/perf/metrics/tim... > tools/perf/metrics/timeline.py:104: IntervalNames = ["frame", "second"] > On 2015/05/01 22:00:45, nednguyen(REVIEW IN OTHER ACC) wrote: > > Instead of seconds, we probably want to use ms for consistency with existing > > metrics. > > Done. > > https://codereview.chromium.org/1084533005/diff/150001/tools/perf/metrics/tim... > tools/perf/metrics/timeline.py:146: "" if interval_name == "frame" else "_per_" > + interval_name) > On 2015/05/01 22:00:45, nednguyen(REVIEW IN OTHER ACC) wrote: > > This is a bit hard to read, can you change this to: > > if internal_name == "frame": > > .. > > elif internal_name == "ms": > > .. > > Done. > > https://codereview.chromium.org/1084533005/diff/150001/tools/perf/metrics/tim... > tools/perf/metrics/timeline.py:214: mean_frame_time = Rate(self.all_action_time, > num_frames) > On 2015/05/01 22:00:45, nednguyen(REVIEW IN OTHER ACC) wrote: > > This computation only applies for per_frames metric. > > Right, I'll move it out. > > https://codereview.chromium.org/1084533005/diff/150001/tools/perf/metrics/tim... > tools/perf/metrics/timeline.py:233: 'ms', self_time_result)) > On 2015/05/01 22:00:44, nednguyen(REVIEW IN OTHER ACC) wrote: > > With the per second metrics, the unit 'ms' here isn't correct. > > Also for consistency, we probably want to use per ms, I think. > > Seconds as an interval is nice because it's slightly easier to reason about, > particularly for tasks, but yeah I can see it being confusing. As we're > primarily interested in how this metric changes relatively, rather than the > absolute number, it's probably best to be consistent as you say. 
> > I'm slightly wary of using a % as we could have >100% cpu time per system time > for any multicore device, but if you think that's better I'm OK with it as well. I'm ok with this, as long as we have documentation explaining what > 100% means. Value objects have a description field (https://code.google.com/p/chromium/codesearch#chromium/src/tools/telemetry/te...), how about adding descriptions to the values here in a subsequent patch? > Note that the smaller interval means that the baseline threadtime/task counts > per MS will be very low, but again I guess it doesn't matter as it's relative. It may matter depending on the regression algorithm of the dashboard. But we can always expand the interval with telemetry's interaction records to make the range contain more meaningful data if needed.
https://codereview.chromium.org/1084533005/diff/190001/tools/perf/metrics/tim... File tools/perf/metrics/timeline.py (right): https://codereview.chromium.org/1084533005/diff/190001/tools/perf/metrics/tim... tools/perf/metrics/timeline.py:207: 'ms', cpu_per_interval)) Update unit here and other places? https://codereview.chromium.org/1084533005/diff/190001/tools/perf/metrics/tim... File tools/perf/metrics/timeline_unittest.py (right): https://codereview.chromium.org/1084533005/diff/190001/tools/perf/metrics/tim... tools/perf/metrics/timeline_unittest.py:159: (timeline.ThreadDetailResultName(main_thread, 'frame', 'idle'), 'ms', 0) What about the detail results for per ms cases? https://codereview.chromium.org/1084533005/diff/190001/tools/perf/metrics/tim... tools/perf/metrics/timeline_unittest.py:160: ] Also can you add assertion for mean_frame_times to make sure that its value stay the same after your patch.
> I'm ok with this, as long as we have documentation explaining what does > 100% mean. value objects has the description field (https://code.google.com/p/chromium/codesearch#chromium/src/tools/telemetry/te...), how about adding descriptions to the values here in a subsequent patch? Sounds good, I made the switch to % for the "ms thread time per ms system time" metrics. https://codereview.chromium.org/1084533005/diff/190001/tools/perf/metrics/tim... File tools/perf/metrics/timeline_unittest.py (right): https://codereview.chromium.org/1084533005/diff/190001/tools/perf/metrics/tim... tools/perf/metrics/timeline_unittest.py:159: (timeline.ThreadDetailResultName(main_thread, 'frame', 'idle'), 'ms', 0) On 2015/05/04 00:39:46, nednguyen (ooo til 05-07) wrote: > What about the detail results for per ms cases? Done. https://codereview.chromium.org/1084533005/diff/190001/tools/perf/metrics/tim... tools/perf/metrics/timeline_unittest.py:160: ] On 2015/05/04 00:39:46, nednguyen (ooo til 05-07) wrote: > Also can you add assertion for mean_frame_times to make sure that its value stay > the same after your patch. Hmm, we already check timeline.ThreadMeanFrameTimeResultName(cc_thread). What other assertion did you have in mind?
Ned, anything else you'd like to see here? Wasn't sure if "%" or "percent" would be more appropriate.
This is great. I just want to make sure that these metrics also catch the regression you mentioned, since we changed from total to per ms. https://codereview.chromium.org/1084533005/diff/210001/tools/perf/metrics/tim... File tools/perf/metrics/timeline_unittest.py (left): https://codereview.chromium.org/1084533005/diff/210001/tools/perf/metrics/tim... tools/perf/metrics/timeline_unittest.py:148: (timeline.ThreadTasksResultName(main_thread), 'tasks', 0.5), Why do these get removed?
Here are the results without/with the regression: https://drive.google.com/file/d/0ByhHNaAkRQvCSjR2bzBVM21Cekk/view?usp=sharing It definitely catches the regression, but both tasks/ms and threadtime % values are *really* small, and when showing only to the hundredths place we lose the actual value. I suppose we could track threadtime % per second (the magnitude would be the same as now), and tasks per second. The task counts would look a lot more sane, and it's not less consistent than the per ms value. Thoughts? https://codereview.chromium.org/1084533005/diff/210001/tools/perf/metrics/tim... File tools/perf/metrics/timeline_unittest.py (left): https://codereview.chromium.org/1084533005/diff/210001/tools/perf/metrics/tim... tools/perf/metrics/timeline_unittest.py:148: (timeline.ThreadTasksResultName(main_thread), 'tasks', 0.5), On 2015/05/08 21:13:19, nednguyen wrote: > Why do these get removed? Ah, they're still there but I moved them below the "MeanFrameTime" result (as that result only has one unit of measurement so I thought it slightly easier to read).
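A back-of-envelope illustration of the magnitude concern raised here (the function and numbers are illustrative, not measured data):

```python
def tasks_per_interval(num_tasks, interaction_ms, interval_ms):
    """Task count normalized to an interval of interval_ms milliseconds."""
    return num_tasks * interval_ms / interaction_ms

# 12 tasks over a 10-second measurement:
per_ms = tasks_per_interval(12, 10000.0, 1.0)         # 0.0012 -- shows as 0.00 at two decimals
per_second = tasks_per_interval(12, 10000.0, 1000.0)  # 1.2 -- readable
```

Switching the denominator from milliseconds to seconds scales every value by 1000, so relative comparisons are unchanged while the displayed numbers stay above the rounding floor.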
Patchset #11 (id:230001) has been deleted
Per offline discussion, I went ahead and changed the interval to seconds. The % stays the same, but the task metrics are a lot more readable.
https://codereview.chromium.org/1084533005/diff/250001/tools/perf/metrics/tim... File tools/perf/metrics/timeline_unittest.py (right): https://codereview.chromium.org/1084533005/diff/250001/tools/perf/metrics/tim... tools/perf/metrics/timeline_unittest.py:178: ] Nit: I would prefer the style on the left, where the actual values are variable-free. My reasoning here is: assertEqual(Add(1, 2), 3) is a better unittest than: assertEqual(Add(1, 2), 1 + 2)
On 2015/05/11 22:56:02, nednguyen (slow review) wrote: > https://codereview.chromium.org/1084533005/diff/250001/tools/perf/metrics/tim... > File tools/perf/metrics/timeline_unittest.py (right): > > https://codereview.chromium.org/1084533005/diff/250001/tools/perf/metrics/tim... > tools/perf/metrics/timeline_unittest.py:178: ] > Nits: I would prefer the style of the left where the actual value are variables > free. > My reasoning here is: > assertEqual(Add(1, 2), 3) > is a better unittest than: > assertEqual(Add(1, 2), 1 + 2) lgtm btw
Thanks for review! https://codereview.chromium.org/1084533005/diff/250001/tools/perf/metrics/tim... File tools/perf/metrics/timeline_unittest.py (right): https://codereview.chromium.org/1084533005/diff/250001/tools/perf/metrics/tim... tools/perf/metrics/timeline_unittest.py:178: ] On 2015/05/11 22:56:02, nednguyen (slow review) wrote: > Nits: I would prefer the style of the left where the actual value are variables > free. > My reasoning here is: > assertEqual(Add(1, 2), 3) > is a better unittest than: > assertEqual(Add(1, 2), 1 + 2) Done.
The CQ bit was checked by jdduke@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from nednguyen@google.com Link to the patchset: https://codereview.chromium.org/1084533005/#ps270001 (title: "Precomputed expectations")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1084533005/270001
The CQ bit was unchecked by jdduke@chromium.org
The CQ bit was checked by jdduke@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1084533005/270001
Message was sent while issue was closed.
Committed patchset #12 (id:270001)
Message was sent while issue was closed.
Patchset 12 (id:??) landed as https://crrev.com/ce2699e20c3eae9dc17573fd2eb0885875341f44 Cr-Commit-Position: refs/heads/master@{#329322}