Description
Added GPU performance metrics.
Metrics have been added for the GPU using the GPU Tracer. Currently
we record the average, max, and stddev in ms per frame for each of
the following categories:
render_compositor: CPU and GPU time spent in the render compositor.
browser_compositor: CPU and GPU time spent in the browser compositor.
total: CPU and GPU time, plus total frame time, spent in the renderer.
swap: Time between swaps (overall FPS).
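For illustration, the three statistics recorded per category could be computed along these lines (a minimal sketch; the helper name is hypothetical):

import math

def frame_stats(frame_times_ms):
  # Average, max, and standard deviation of one category's per-frame times (ms).
  n = len(frame_times_ms)
  avg = sum(frame_times_ms) / float(n)
  variance = sum((t - avg) ** 2 for t in frame_times_ms) / float(n)
  return {'avg': avg, 'max': max(frame_times_ms),
          'stddev': math.sqrt(variance)}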
R=vmiura@chromium.org
BUG=None
Committed: https://crrev.com/dfb418e345ac741c380a5570b386a5bfad1c6115
Cr-Commit-Position: refs/heads/master@{#313419}
Patch Set 1
Patch Set 2: skip device metrics instead of filling 0 for unsupported devices
Patch Set 3: Added test for filtering out context_ids
Patch Set 4: Removed gpu_device check in gpu_times_unittest
Total comments: 24
Patch Set 5: Modified metric names, utilize thread_duration when available
Patch Set 6: Simplified gpu_timeline using new SwapBuffer trace
Total comments: 1
Patch Set 7: Standardized names, added frame times
Patch Set 8: Replaced gpu_times pagetest with standard timeline_based_measurement
Total comments: 8
Patch Set 9: Fixed help for TBM Options
Total comments: 2
Patch Set 10: Fixed bad merge, moved metrics object creation
Patch Set 11: Get Metrics callback now returns a list of metrics
Total comments: 4
Patch Set 12: Fixed help string, added VerifyNonOverlappedRecords
Total comments: 2
Patch Set 13: Merged with master
Patch Set 14: Fixed issue with metrics callback called multiple times per metric
Total comments: 12
Patch Set 15: Handle empty flags case
Patch Set 16: Fixed flags variable name
Patch Set 17: Fixed typos
Patch Set 18: Added better assert message
Patch Set 19: Reverted accidental file modification
Patch Set 20: Added early assert to check for valid overhead level
Patch Set 21: Fixed unit test
Patch Set 22: Renamed total frame time to swap frame time
Patch Set 23: Fix swap name in unit tests
Total comments: 14
Patch Set 24: Fixed nits
Patch Set 25: CPU and GPU frame times are calculated separately now
Patch Set 26: Updated comments, ensure lack of thread_duration becomes 0

Messages
Total messages: 49 (5 generated)
I've added the telemetry data for the toplevel traces now. The one thing I was not sure of was which pagesets to use so I just used the top 10. PTAL.
vmiura@chromium.org changed reviewers: + epenner@chromium.org
epenner@, mind taking a look? These are similar to thread_times.
Cool! At a high level, it seems like both this and thread_times are solving a very similar problem. I wonder if some of the code could even be reused. This is just some feedback and questions from looking at gpu_timeline.py https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... File tools/perf/metrics/gpu_timeline.py (right): https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:16: FRAME_END_MARKER = ('gpu', 'GLES2DecoderImpl::DoSwapBuffers') Two things on this trace: First, is there no GPU-side trace for this? That would be the best, as there is some latency between this and when this actually occurs on the GPU's timeline.. But over many frames it's fine, so not a big deal. Second, I believe this is never called on some platforms. In thread-times, I tried this, as well as adding "RealSwapBuffers" traces, but even that didn't work on all platforms (now I'm having trouble remembering which, I think it was Mac), so I switched to compositor swaps. You might want to do the same, or dig into the remaining platforms to find an equivalent trace. https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:21: GPU_SERVICE_DEVICE_VARIANCE = 5 See below, I'm not sure this threshold is really needed. Perhaps it might reduce some noise, but I don't believe we should fail if we exceed this. It seems to me it will just cause the test to flake occasionally. https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:61: TRACKED_NAMES_service: Using the TRACKED_NAMES dictionary, we include It took me a minute to parse what these mean. Does total_frame mean the total clock time for a frame? If so, I would call this 'mean_frame_time' for consistency with other metrics. Does 'service' mean the CPU time spent in chrome's GPU service? Or perhaps cpu time in the driver? And does 'device' mean the real GPU time? If so, I would find 'dispatch' or 'cpu-dispatch' or 'service-cpu' more clear. Also, have you considered thread_duration instead of duration for the cpu-time in the service? A clock duration will often lack meaning for anything but mean_frame_time/fps, as several unrelated things contribute to it (descheduling and blocking on the GPU). https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:75: if event.thread_start: This is the only use of thread_start, are you sure this is what you want? https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:91: # on when the tracing stopped the device trace could not have come back yet. Another reason for this could be that the device events are likely delayed by 1-2 frames, so if these events were previously clipped against a time-range, we might lose 1-2 GPU frames at the right side. It's difficult to guarantee that the traces will correlate exactly, so I'm inclined to just accept the noise. https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:98: assert len(service_events) == len(device_events), ( It seems like we should either have a hard '==' in the assertion (without the event_difference threshold), or just accept this as a small source of noise, and add a comment somewhere about this limitation. I'm worried that this would flake depending on the number of gpu events per frame. 
https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:129: for buffer_swap_event in buffer_swap_events: Couple of minor things: Firstly, this looks to be M*N. Perhaps not a big deal, but do we even need to group gpu events into frames? Given the GPU's latency, the frame grouping won't be perfect. Also, it seems like given the metrics you want, you just need something simple like this? time_per_frame = sum(times_events) / count(frame_events) Or am I missing something else that you want to calculate by grouping all the traces into frames?
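As an aside on the thread_duration question above, a minimal sketch of the distinction, assuming telemetry timeline events with thread_duration and duration fields (the helper itself is hypothetical):

def cpu_time_ms(event):
  # thread_duration is CPU time on the issuing thread; duration is wall
  # clock, which also counts descheduling and blocking on the GPU.
  if event.thread_duration is not None:
    return event.thread_duration
  return event.duration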
https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... File tools/perf/metrics/gpu_timeline.py (right): https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:16: FRAME_END_MARKER = ('gpu', 'GLES2DecoderImpl::DoSwapBuffers') We should be able to add a "DoSwapBuffers" gpu trace which drops into gpu.service & gpu.device categories. https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:21: GPU_SERVICE_DEVICE_VARIANCE = 5 Perhaps if we have the "swap" trace on the GPU we don't need to match CPU & GPU tracks, and remove this matching. https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:61: TRACKED_NAMES_service: Using the TRACKED_NAMES dictionary, we include Currently the 'gpu.service' traces are using the "asynchronous traces" (TRACE_EVENT_COPY_ASYNC_BEGIN/TRACE_EVENT_COPY_ASYNC_END) which don't have the 'thread_duration' numbers. We can look into making these normal traces. It may violate trace stacking, but I think Tracing can cope with that.
https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... File tools/perf/metrics/gpu_timeline.py (right): https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:16: FRAME_END_MARKER = ('gpu', 'GLES2DecoderImpl::DoSwapBuffers') On 2015/01/16 01:41:15, vmiura wrote: > We should be able to add a "DoSwapBuffers" gpu trace which drops into > gpu.service & gpu.device categories. Done here: https://codereview.chromium.org/799753009/ Will switch to use that instead once the CL lands. https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:21: GPU_SERVICE_DEVICE_VARIANCE = 5 On 2015/01/16 01:41:14, vmiura wrote: > Perhaps if we have the "swap" trace on the GPU we don't need to match CPU & GPU > tracks, and remove this matching. Yes, once the Swap trace is plumbed through we won't need this anymore. https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:61: TRACKED_NAMES_service: Using the TRACKED_NAMES dictionary, we include On 2015/01/16 01:41:14, vmiura wrote: > Currently the 'gpu.service' traces are using the the "asynchronous traces" > (TRACE_EVENT_COPY_ASYNC_BEGIN/TRACE_EVENT_COPY_ASYNC_END) which don't have the > 'thread_duration' numbers. > > We can look into making these normal traces. It may violate trace stacking, but > I think Tracing can cope with that. I've changed the Traces to use normal traces instead of async traces here: https://codereview.chromium.org/855653003/ After that lands I will convert the values to use thread_duration. I've also changed the names a bit to make them clearer. https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:75: if event.thread_start: On 2015/01/16 01:17:59, epenner wrote: > This is the only use of thread_start, are you sure this is what you want? I was going to look into this later, but the AsyncSlices are inserted twice for some reason. The only difference was that thread_start is set for one of them, and is None for the other. So this was only to filter out one of the extra ones. Now that we are not using Async traces anymore this is irrelevant, but we should still figure out why there are multiple async traces. https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:91: # on when the tracing stopped the device trace could not have come back yet. On 2015/01/16 01:17:59, epenner wrote: > Another reason for this could be that the device events are likely delayed by > 1-2 frames, so if these events were previously clipped against a time-range, we > might lose 1-2 GPU frames at the right side. It's difficult to guarantee that > the traces will correlate exactly, so I'm inclined to just accept the noise. This is no longer relevant and will be removed once the new buffer swap traces land. https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:98: assert len(service_events) == len(device_events), ( On 2015/01/16 01:17:59, epenner wrote: > It seems like we should either have a hard '==' in the assertion (without the > event_difference threshold), or just accept this as a small source of noise, and > add a comment somewhere about this limitation. I'm worried that this would flake > depending on the number of gpu events per frame. 
Before we were using the BufferSwap trace that only happens on the CPU so it was important that the CPU/GPU traces matched up perfectly (if anything for the zip call below). Once the SwapBuffer traces get plumbed through to the device traces, most of this will be unnecessary though. https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:129: for buffer_swap_event in buffer_swap_events: On 2015/01/16 01:17:59, epenner wrote: > Couple minor things: > > Firstly, this looks to be M*N. Perhaps not a big deal, but do we even need to > group gpu events into frames? Given the GPU's latency, the frame grouping don't > be perfect. Also, it seems like given the metrics you want, you just need > something simple like this? > > time_per_frame = sum(times_events) / count(frame_events) > > Or am I missing something else that you want to calculate by grouping all the > traces into frames? Before I only had SwapBuffer traces on the CPU side so it was impossible to figure out the start and end for a frame on the GPU side. The grouping was just to figure out the start and end of a frame on the GPU device side. After the SwapBuffer trace CL lands this will be simplified and I can remove this grouping step.
Utilizing the new SwapBuffer traces, which go through both the gpu service & device, lots of the old code has been deleted and the new code is much simpler. PTAL. https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... File tools/perf/metrics/gpu_timeline.py (right): https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:16: FRAME_END_MARKER = ('gpu', 'GLES2DecoderImpl::DoSwapBuffers') On 2015/01/16 22:58:04, David Yen wrote: > On 2015/01/16 01:41:15, vmiura wrote: > > We should be able to add a "DoSwapBuffers" gpu trace which drops into > > gpu.service & gpu.device categories. > > Done here: https://codereview.chromium.org/799753009/ > > Will switch to use that instead once the CL lands. > Done. https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:21: GPU_SERVICE_DEVICE_VARIANCE = 5 On 2015/01/16 22:58:04, David Yen wrote: > On 2015/01/16 01:41:14, vmiura wrote: > > Perhaps if we have the "swap" trace on the GPU we don't need to match CPU & > GPU > > tracks, and remove this matching. > > Yes, once the Swap trace is plumbed through we won't need this anymore. Done. https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:61: TRACKED_NAMES_service: Using the TRACKED_NAMES dictionary, we include On 2015/01/16 22:58:04, David Yen wrote: > On 2015/01/16 01:41:14, vmiura wrote: > > Currently the 'gpu.service' traces are using the the "asynchronous traces" > > (TRACE_EVENT_COPY_ASYNC_BEGIN/TRACE_EVENT_COPY_ASYNC_END) which don't have the > > 'thread_duration' numbers. > > > > We can look into making these normal traces. It may violate trace stacking, > but > > I think Tracing can cope with that. > > I've changed the Traces to use normal traces instead of async traces here: > https://codereview.chromium.org/855653003/ > > After that lands I will use convert the values to use thread_duration. > > I've also changed the names a bit to make it more clear. Done. https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:91: # on when the tracing stopped the device trace could not have come back yet. On 2015/01/16 22:58:04, David Yen wrote: > On 2015/01/16 01:17:59, epenner wrote: > > Another reason for this could be that the device events are likely delayed by > > 1-2 frames, so if these events were previously clipped against a time-range, > we > > might lose 1-2 GPU frames at the right side. It's difficult to guarantee that > > the traces will correlate exactly, so I'm inclined to just accept the noise. > > This is no longer relevant and will be removed once the new buffer swap traces > lands. Done. https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:98: assert len(service_events) == len(device_events), ( On 2015/01/16 22:58:04, David Yen wrote: > On 2015/01/16 01:17:59, epenner wrote: > > It seems like we should either have a hard '==' in the assertion (without the > > event_difference threshold), or just accept this as a small source of noise, > and > > add a comment somewhere about this limitation. I'm worried that this would > flake > > depending on the number of gpu events per frame. 
Once the SwapBuffer traces gets plumbed through to the device > traces most of this will be unnecessary though. Done. https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:129: for buffer_swap_event in buffer_swap_events: On 2015/01/16 22:58:04, David Yen wrote: > On 2015/01/16 01:17:59, epenner wrote: > > Couple minor things: > > > > Firstly, this looks to be M*N. Perhaps not a big deal, but do we even need to > > group gpu events into frames? Given the GPU's latency, the frame grouping > don't > > be perfect. Also, it seems like given the metrics you want, you just need > > something simple like this? > > > > time_per_frame = sum(times_events) / count(frame_events) > > > > Or am I missing something else that you want to calculate by grouping all the > > traces into frames? > > Before I only had SwapBuffer traces on the CPU side so it was impossible to > figure out the start and end for a frame on the GPU side. The grouping was just > to figure out the start and end of a frame on the GPU device side. After the > SwapBuffer trace CL lands this will be simplified and I can remove this grouping > step. This is all removed now.
On 2015/01/20 19:37:26, David Yen wrote: > Utilizing the new SwapBuffer traces which goes through both the gpu service & > device, lots of the old code has been deleted and the new one is much more > simplified. PTAL. > > https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... > File tools/perf/metrics/gpu_timeline.py (right): > > https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... > tools/perf/metrics/gpu_timeline.py:16: FRAME_END_MARKER = ('gpu', > 'GLES2DecoderImpl::DoSwapBuffers') > On 2015/01/16 22:58:04, David Yen wrote: > > On 2015/01/16 01:41:15, vmiura wrote: > > > We should be able to add a "DoSwapBuffers" gpu trace which drops into > > > gpu.service & gpu.device categories. > > > > Done here: https://codereview.chromium.org/799753009/ > > > > Will switch to use that instead once the CL lands. > > > > Done. > > https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... > tools/perf/metrics/gpu_timeline.py:21: GPU_SERVICE_DEVICE_VARIANCE = 5 > On 2015/01/16 22:58:04, David Yen wrote: > > On 2015/01/16 01:41:14, vmiura wrote: > > > Perhaps if we have the "swap" trace on the GPU we don't need to match CPU & > > GPU > > > tracks, and remove this matching. > > > > Yes, once the Swap trace is plumbed through we won't need this anymore. > > Done. > > https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... > tools/perf/metrics/gpu_timeline.py:61: TRACKED_NAMES_service: Using the > TRACKED_NAMES dictionary, we include > On 2015/01/16 22:58:04, David Yen wrote: > > On 2015/01/16 01:41:14, vmiura wrote: > > > Currently the 'gpu.service' traces are using the the "asynchronous traces" > > > (TRACE_EVENT_COPY_ASYNC_BEGIN/TRACE_EVENT_COPY_ASYNC_END) which don't have > the > > > 'thread_duration' numbers. > > > > > > We can look into making these normal traces. It may violate trace stacking, > > but > > > I think Tracing can cope with that. > > > > I've changed the Traces to use normal traces instead of async traces here: > > https://codereview.chromium.org/855653003/ > > > > After that lands I will use convert the values to use thread_duration. > > > > I've also changed the names a bit to make it more clear. > > Done. > > https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... > tools/perf/metrics/gpu_timeline.py:91: # on when the tracing stopped the device > trace could not have come back yet. > On 2015/01/16 22:58:04, David Yen wrote: > > On 2015/01/16 01:17:59, epenner wrote: > > > Another reason for this could be that the device events are likely delayed > by > > > 1-2 frames, so if these events were previously clipped against a time-range, > > we > > > might lose 1-2 GPU frames at the right side. It's difficult to guarantee > that > > > the traces will correlate exactly, so I'm inclined to just accept the noise. > > > > This is no longer relevant and will be removed once the new buffer swap traces > > lands. > > Done. > > https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... > tools/perf/metrics/gpu_timeline.py:98: assert len(service_events) == > len(device_events), ( > On 2015/01/16 22:58:04, David Yen wrote: > > On 2015/01/16 01:17:59, epenner wrote: > > > It seems like we should either have a hard '==' in the assertion (without > the > > > event_difference threshold), or just accept this as a small source of noise, > > and > > > add a comment somewhere about this limitation. 
I'm worried that this would > > flake > > > depending on the number of gpu events per frame. > > > > Before we were using the BufferSwap trace that only happens on the CPU so it > was > > important that the CPU/GPU traces matched up perfectly (if anything for the > zip > > call below). Once the SwapBuffer traces gets plumbed through to the device > > traces most of this will be unnecessary though. > > Done. > > https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... > tools/perf/metrics/gpu_timeline.py:129: for buffer_swap_event in > buffer_swap_events: > On 2015/01/16 22:58:04, David Yen wrote: > > On 2015/01/16 01:17:59, epenner wrote: > > > Couple minor things: > > > > > > Firstly, this looks to be M*N. Perhaps not a big deal, but do we even need > to > > > group gpu events into frames? Given the GPU's latency, the frame grouping > > don't > > > be perfect. Also, it seems like given the metrics you want, you just need > > > something simple like this? > > > > > > time_per_frame = sum(times_events) / count(frame_events) > > > > > > Or am I missing something else that you want to calculate by grouping all > the > > > traces into frames? > > > > Before I only had SwapBuffer traces on the CPU side so it was impossible to > > figure out the start and end for a frame on the GPU side. The grouping was > just > > to figure out the start and end of a frame on the GPU device side. After the > > SwapBuffer trace CL lands this will be simplified and I can remove this > grouping > > step. > > This is all removed now. ping
I tried out the measurement. I'm not sure if I did something wrong. Some notes below.
===================
View result at file:///usr/local/google/code/content_shell/src/gpu_times_top_25_smooth/results.html
vmiura@vmiura1:/usr/local/google/code/content_shell/src$ ./tools/perf/run_benchmark --browser android-content-shell gpu_times.top_25_smooth --results-label=Software --output-dir ./gpu_times_top_25_smooth --extra-browser-args="--disable-gpu-rasterization --ignore-certificate-errors" --page-filter=Wordpress
WARNING:root:Installing ./out/Release/apks/ContentShell.apk on device if needed.
[ RUN ] Wordpress
77 KB/s (3383 bytes in 0.042s)
[ OK ] Wordpress (22152 ms)
[ PASSED ] 1 test.
Pages: [Wordpress]
*RESULT compositor_device-gpu_avg: compositor_device-gpu_avg= 0 ms
*RESULT compositor_device-gpu_max: compositor_device-gpu_max= 0 ms
*RESULT compositor_device-gpu_stddev: compositor_device-gpu_stddev= 0.0 ms
*RESULT compositor_service-cpu_avg: compositor_service-cpu_avg= 0 ms
*RESULT compositor_service-cpu_max: compositor_service-cpu_max= 0 ms
*RESULT compositor_service-cpu_stddev: compositor_service-cpu_stddev= 0.0 ms
*RESULT mean_frame_avg: mean_frame_avg= 4718.14399993 ms
*RESULT mean_frame_max: mean_frame_max= 4718.14399993 ms
*RESULT mean_frame_stddev: mean_frame_stddev= 0.0 ms
*RESULT mean_gpu_device-gpu_avg: mean_gpu_device-gpu_avg= 186.352001726 ms
*RESULT mean_gpu_device-gpu_max: mean_gpu_device-gpu_max= 186.352001726 ms
*RESULT mean_gpu_device-gpu_stddev: mean_gpu_device-gpu_stddev= 0.0 ms
*RESULT mean_gpu_service-cpu_avg: mean_gpu_service-cpu_avg= 1419.806 ms
*RESULT mean_gpu_service-cpu_max: mean_gpu_service-cpu_max= 1419.806 ms
*RESULT mean_gpu_service-cpu_stddev: mean_gpu_service-cpu_stddev= 0.0 ms
*RESULT render_compositor_device-gpu_avg: render_compositor_device-gpu_avg= 7.70999993133 ms
*RESULT render_compositor_device-gpu_max: render_compositor_device-gpu_max= 7.70999993133 ms
*RESULT render_compositor_device-gpu_stddev: render_compositor_device-gpu_stddev= 0.0 ms
*RESULT render_compositor_service-cpu_avg: render_compositor_service-cpu_avg= 386.682 ms
*RESULT render_compositor_service-cpu_max: render_compositor_service-cpu_max= 386.682 ms
*RESULT render_compositor_service-cpu_stddev: render_compositor_service-cpu_stddev= 0.0 ms
RESULT telemetry_page_measurement_results: num_failed= 0 count
RESULT telemetry_page_measurement_results: num_errored= 0 count
Pages: [Wordpress]
===================
I'm a bit confused by the naming:
- "mean_frame_average" -- average of the mean?
- "compositor_service-cpu_avg" -- we should call it either service, or cpu.
- "compositor_device-gpu_avg" -- we should call it either device, or gpu.
Also the values don't look correct: they're very high, and avg == max.
It would be great to get all the actual cpu & gpu frame times as well (in a ListOfScalarValues result).
nednguyen@google.com changed reviewers: + nednguyen@google.com
https://codereview.chromium.org/854833003/diff/100001/tools/perf/measurements... File tools/perf/measurements/gpu_times.py (right): https://codereview.chromium.org/854833003/diff/100001/tools/perf/measurements... tools/perf/measurements/gpu_times.py:15: class GPUTimes(page_test.PageTest): Why not add these metrics to timeline_based_measurement?
Ahh, ignore the issue about bad values. Seems I didn't have the necessary patch for SwapBuffers so I was getting a frame count of 1. This is what I get on Nexus 4.
Pages: [Wordpress]
*RESULT compositor_device-gpu_avg: compositor_device-gpu_avg= 0 ms
*RESULT compositor_device-gpu_max: compositor_device-gpu_max= 0 ms
*RESULT compositor_device-gpu_stddev: compositor_device-gpu_stddev= 0.0 ms
*RESULT compositor_service-cpu_avg: compositor_service-cpu_avg= 0 ms
*RESULT compositor_service-cpu_max: compositor_service-cpu_max= 0 ms
*RESULT compositor_service-cpu_stddev: compositor_service-cpu_stddev= 0.0 ms
*RESULT mean_frame_avg: mean_frame_avg= 30.6377290326 ms
*RESULT mean_frame_max: mean_frame_max= 1348.90799999 ms
*RESULT mean_frame_stddev: mean_frame_stddev= 112.730757875 ms
*RESULT mean_gpu_device-gpu_avg: mean_gpu_device-gpu_avg= 1.29135482965 ms
*RESULT mean_gpu_device-gpu_max: mean_gpu_device-gpu_max= 5.71199989319 ms
*RESULT mean_gpu_device-gpu_stddev: mean_gpu_device-gpu_stddev= 1.17938375687 ms
*RESULT mean_gpu_service-cpu_avg: mean_gpu_service-cpu_avg= 5.23331612903 ms
*RESULT mean_gpu_service-cpu_max: mean_gpu_service-cpu_max= 77.304 ms
*RESULT mean_gpu_service-cpu_stddev: mean_gpu_service-cpu_stddev= 7.24114888831 ms
*RESULT render_compositor_device-gpu_avg: render_compositor_device-gpu_avg= 0.0475483786798 ms
*RESULT render_compositor_device-gpu_max: render_compositor_device-gpu_max= 1.69999966812 ms
*RESULT render_compositor_device-gpu_stddev: render_compositor_device-gpu_stddev= 0.175933025227 ms
*RESULT render_compositor_service-cpu_avg: render_compositor_service-cpu_avg= 1.22275483871 ms
*RESULT render_compositor_service-cpu_max: render_compositor_service-cpu_max= 42.366 ms
*RESULT render_compositor_service-cpu_stddev: render_compositor_service-cpu_stddev= 4.46883791395 ms
RESULT telemetry_page_measurement_results: num_failed= 0 count
RESULT telemetry_page_measurement_results: num_errored= 0 count
Pages: [Wordpress]
On 2015/01/23 02:45:27, nednguyen wrote: > https://codereview.chromium.org/854833003/diff/100001/tools/perf/measurements... > File tools/perf/measurements/gpu_times.py (right): > > https://codereview.chromium.org/854833003/diff/100001/tools/perf/measurements... > tools/perf/measurements/gpu_times.py:15: class GPUTimes(page_test.PageTest): > Why not adding this metrics to timeline_based_measurement? These metrics are GPU-specific timeline measurements which require certain GPU categories to be turned on. These categories are not exactly lightweight either, so they should only be turned on during the GPU metric tests.
On 2015/01/23 22:20:10, David Yen wrote: > On 2015/01/23 02:45:27, nednguyen wrote: > > > https://codereview.chromium.org/854833003/diff/100001/tools/perf/measurements... > > File tools/perf/measurements/gpu_times.py (right): > > > > > https://codereview.chromium.org/854833003/diff/100001/tools/perf/measurements... > > tools/perf/measurements/gpu_times.py:15: class GPUTimes(page_test.PageTest): > > Why not adding this metrics to timeline_based_measurement? > > These metrics are GPU specific timeline measurements which require certain GPU > categories to be turned on. These categories are not exactly light weight either > so they should only be turned on during the GPU metric tests. That's ok. We can expose the categories for timeline_based_measurement. See https://code.google.com/p/chromium/codesearch#chromium/src/tools/telemetry/te...
On 2015/01/23 01:52:31, vmiura wrote: > It would be great to get all the actual cpu & gpu frame times as well (in a > ListOfScalarValues result). I've changed the names to the following scheme: $(NAME)_$(mean,stddev,max)_$(gpu,cpu)_time Where NAME is one of "render_compositor", "browser_compositor", "total". I also added the values for the actual frame times under the name: $(NAME)_$(gpu,cpu)_frame_times The only special case is the total frame time, which is under the name: total_$(mean,stddev,max)_time PTAL.
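A quick sketch enumerating the names this scheme implies (the NAME, stat, and source lists are taken from the message above):

NAMES = ('render_compositor', 'browser_compositor', 'total')
STATS = ('mean', 'stddev', 'max')
SOURCES = ('gpu', 'cpu')

for name in NAMES:
  for stat in STATS:
    for source in SOURCES:
      # e.g. render_compositor_mean_gpu_time
      print('%s_%s_%s_time' % (name, stat, source))
  for source in SOURCES:
    # raw per-frame values, e.g. render_compositor_gpu_frame_times
    print('%s_%s_frame_times' % (name, source))
# Special case: the total frame time additionally reports as
# total_$(mean,stddev,max)_time.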
On 2015/01/23 22:24:20, David Yen wrote: > On 2015/01/23 01:52:31, vmiura wrote: > > It would be great to get all the actual cpu & gpu frame times as well (in a > > ListOfScalarValues result). > > I've changed the names to the following scheme: > > $(NAME)_$(mean,stddev,max)_$(gpu,cpu)_time > > Where NAME is one of "render_compositor", "browser_compositor", "total". > > I also added the values for the actual frame times under the name: > > $(NAME)_$(gpu,cpu)_frame_times > > The only special case one is the total frame time which is under the name: > total_$(mean,stddev,max)_time > > PTAL. I've now removed measurements/gpu_times.py and replaced it with some changes to timeline_based_measurements. I've modified the options to be able to configure the standard TimelineBasedPageTest to measure using custom categories and custom metrics.
https://codereview.chromium.org/854833003/diff/140001/tools/perf/benchmarks/g... File tools/perf/benchmarks/gpu_times.py (right): https://codereview.chromium.org/854833003/diff/140001/tools/perf/benchmarks/g... tools/perf/benchmarks/gpu_times.py:15: return gpu_timeline.GPUTimelineMetric() This means your benchmark will compute gpu_timeline on all the interaction records in the page. Is this really what you want? https://codereview.chromium.org/854833003/diff/140001/tools/telemetry/telemet... File tools/telemetry/telemetry/web_perf/timeline_based_measurement.py (right): https://codereview.chromium.org/854833003/diff/140001/tools/telemetry/telemet... tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:141: metrics: Metrics class to use when measuring pages. Documentation doesn't match. https://codereview.chromium.org/854833003/diff/140001/tools/telemetry/telemet... tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:216: if tracing_controller.IsDisplayTracingSupported(): Why this change? I think I removed this recently.
nednguyen@google.com changed reviewers: + benjhayden@chromium.org
+Ben: I think this patch will make your work to migrate thread_times to TBM possible.
https://codereview.chromium.org/854833003/diff/160001/tools/telemetry/telemet... File tools/telemetry/telemetry/web_perf/timeline_based_measurement.py (right): https://codereview.chromium.org/854833003/diff/160001/tools/telemetry/telemet... tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:118: metric = self._get_metric_from_metric_type_callback(metric_type) Can you make it so that get_metric_from_metric_type_callback returns a list of metrics instead of 1? I can imagine that people may want to have more than 1 metric for a certain type.
https://codereview.chromium.org/854833003/diff/140001/tools/perf/benchmarks/g... File tools/perf/benchmarks/gpu_times.py (right): https://codereview.chromium.org/854833003/diff/140001/tools/perf/benchmarks/g... tools/perf/benchmarks/gpu_times.py:15: return gpu_timeline.GPUTimelineMetric() On 2015/01/23 23:58:13, nednguyen wrote: > This mean your benchmark will compute gpu_timeline on all the interaction > records in the page. Is this really what you want? Done. I moved the creation to CreateTimelineBasedMeasurementOptions(). https://codereview.chromium.org/854833003/diff/140001/tools/telemetry/telemet... File tools/telemetry/telemetry/web_perf/timeline_based_measurement.py (right): https://codereview.chromium.org/854833003/diff/140001/tools/telemetry/telemet... tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:141: metrics: Metrics class to use when measuring pages. On 2015/01/23 23:58:13, nednguyen wrote: > Documentation doesn't match. Done. https://codereview.chromium.org/854833003/diff/140001/tools/telemetry/telemet... tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:216: if tracing_controller.IsDisplayTracingSupported(): On 2015/01/23 23:58:13, nednguyen wrote: > Why is this change? I think I removed this recently. Looks like bad merge, this is fixed now. https://codereview.chromium.org/854833003/diff/160001/tools/telemetry/telemet... File tools/telemetry/telemetry/web_perf/timeline_based_measurement.py (right): https://codereview.chromium.org/854833003/diff/160001/tools/telemetry/telemet... tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:118: metric = self._get_metric_from_metric_type_callback(metric_type) On 2015/01/24 00:15:46, nednguyen wrote: > Can you make it so that get_metric_from_metric_type_callback return a list of > metrics instead of 1? I can imagine that people may want to have more than 1 > metrics for a certain type. Done.
https://codereview.chromium.org/854833003/diff/200001/tools/perf/metrics/gpu_... File tools/perf/metrics/gpu_timeline.py (right): https://codereview.chromium.org/854833003/diff/200001/tools/perf/metrics/gpu_... tools/perf/metrics/gpu_timeline.py:59: def AddResults(self, model, _, interaction_records, results): Does this implementation support overlapped interaction records? If not, you may want to add a call to self.VerifyNonOverlappedRecords https://codereview.chromium.org/854833003/diff/200001/tools/telemetry/telemet... File tools/telemetry/telemetry/web_perf/timeline_based_measurement.py (right): https://codereview.chromium.org/854833003/diff/200001/tools/telemetry/telemet... tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:141: get_metrics_from_metric_type_callback: Callback function to create a metric "return a list of metrics based on timeline interaction records' flag"
https://codereview.chromium.org/854833003/diff/200001/tools/perf/metrics/gpu_... File tools/perf/metrics/gpu_timeline.py (right): https://codereview.chromium.org/854833003/diff/200001/tools/perf/metrics/gpu_... tools/perf/metrics/gpu_timeline.py:59: def AddResults(self, model, _, interaction_records, results): On 2015/01/24 00:32:22, nednguyen wrote: > Does this implementation support overlapped interaction records? If not, you may > want to make the call self.VerifyNonOverlappedRecords Done. https://codereview.chromium.org/854833003/diff/200001/tools/telemetry/telemet... File tools/telemetry/telemetry/web_perf/timeline_based_measurement.py (right): https://codereview.chromium.org/854833003/diff/200001/tools/telemetry/telemet... tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:141: get_metrics_from_metric_type_callback: Callback function to create a metric On 2015/01/24 00:32:22, nednguyen wrote: > "return a list of metrics based on timeline interaction records' flag" Done.
https://codereview.chromium.org/854833003/diff/140001/tools/perf/benchmarks/g... File tools/perf/benchmarks/gpu_times.py (right): https://codereview.chromium.org/854833003/diff/140001/tools/perf/benchmarks/g... tools/perf/benchmarks/gpu_times.py:15: return gpu_timeline.GPUTimelineMetric() On 2015/01/24 00:22:33, David Yen wrote: > On 2015/01/23 23:58:13, nednguyen wrote: > > This mean your benchmark will compute gpu_timeline on all the interaction > > records in the page. Is this really what you want? > > Done. I moved the creation to CreateTimelineBasedMeasurementOptions(). Sorry, by "compute gpu_timeline on all the interaction", I meant that you will get the cpu metric computed for all the interaction records issued on the page. I was asking whether this is what you want, or do you want to have some filtering here, i.e. only return gpu_timeline.GPUTimelineMetric() if your interaction records have some special flags. Note that no filtering is fine too, but I just want to make sure you are aware of this. https://codereview.chromium.org/854833003/diff/220001/tools/telemetry/telemet... File tools/telemetry/telemetry/web_perf/timeline_based_measurement.py (right): https://codereview.chromium.org/854833003/diff/220001/tools/telemetry/telemet... tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:118: metrics_list = self._get_metrics_from_metric_type_callback(metric_type) I just realized you are duplicating your metrics when the interaction has multiple metric types. Let's make _get_metrics_from_metric_types_callback take a list of the interaction's flags as input. You can define a method for the TimelineInteractionRecords class named GetUserDefinedFlags() that returns all flags (could be empty) in the interaction record except the "repeatable" flag, which is reserved for the records repeat checking logic. Then instead of the for loop in #108, you do: # make sure that all the interaction records have the same flags (see comment in line 109) records_custom_flags = interactions[0].GetUserDefinedFlags() metrics = set(self._get_metrics_from_record_flags_callback(records_custom_flags)) for m in metrics: m.AddResults(...) With this change, the default _GetMetricsFromRecordsFlags would be implemented as: def _GetMetricsFromRecordFlags(records_custom_flags): metrics = [] if 'is_smooth' in records_custom_flags: metrics.append(smoothness.SmoothnessMetric()) if 'is_fast' in records_custom_flags: metrics.append(fast_metric.FastMetric()) if 'is_responsive' in records_custom_flags: metrics.append(responsiveness_metric.ResponsivenessMetric()) return metrics
https://codereview.chromium.org/854833003/diff/140001/tools/perf/benchmarks/g... File tools/perf/benchmarks/gpu_times.py (right): https://codereview.chromium.org/854833003/diff/140001/tools/perf/benchmarks/g... tools/perf/benchmarks/gpu_times.py:15: return gpu_timeline.GPUTimelineMetric() On 2015/01/24 16:43:04, nednguyen wrote: > On 2015/01/24 00:22:33, David Yen wrote: > > On 2015/01/23 23:58:13, nednguyen wrote: > > > This mean your benchmark will compute gpu_timeline on all the interaction > > > records in the page. Is this really what you want? > > > > Done. I moved the creation to CreateTimelineBasedMeasurementOptions(). > > Sorry, by "compute gpu_timeline on all the interaction", I meant that you will > get the cpu > metric computed for all the interaction records issued on the page. I was asking > whether this is what you want, or do you want to have some filtering here, i.e: > only return gpu_timeline.GPUTimelineMetric() if your interaction records have > some special flags. Note that no filtering is fine too, but I just want to make > sure you are aware of this. No filtering sounds like what we want. I will move the _CreateGPUTimelineMetric(_) function back to where it was before. https://codereview.chromium.org/854833003/diff/220001/tools/telemetry/telemet... File tools/telemetry/telemetry/web_perf/timeline_based_measurement.py (right): https://codereview.chromium.org/854833003/diff/220001/tools/telemetry/telemet... tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:118: metrics_list = self._get_metrics_from_metric_type_callback(metric_type) On 2015/01/24 16:43:04, nednguyen wrote: > I just realize you are duplicating your metrics multiple times when the > interaction have multiple metric types. > > Let make _get_metrics_from_metric_types_callback take a list of interaction's > flags as input. You can define a method for TimelineInteractionRecords class > named GetUserDefinedFlags() that return all flags (could be empty) in the > interaction record except the "repeatable" flag which is reserved for records > repeat checking logic. > > Then instead of the for loop in #108, you do: > # make sure that all the interaction records have the same flags (see comment in > line 109) > records_custom_flags = interactions[0].GetUserDefinedFlags() > metrics = > set(self._get_metrics_from_record_flags_callback(records_custom_flags)) > for m in metrics: > m.AddResults(...) > > With this change, the default _GetMetricsFromRecordsFlags would be implemented > as: > def _GetMetricsFromRecordFlags(records_custom_flags): > metrics = [] > if 'is_smooth' in records_custom_flags: > metrics += smoothness.SmoothnessMetric() > if 'is_fast' in records_custom_flags: > metrics += fast_metric.FastMetric() > if 'is_responsive' in records_custom_flags: > metrics += responsiveness_metric.ResponsivenessMetric() > return metrics Really great catch. This is now fixed.
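For reference, a self-contained sketch of the flag-based dispatch suggested above; the metric class is stubbed as a placeholder, since the real metrics live in telemetry:

class SmoothnessMetric(object):
  def AddResults(self, model, renderer_thread, interactions, results):
    pass  # placeholder; the real metric lives in telemetry

def GetMetricsFromRecordFlags(records_custom_flags):
  # Called once per group of interaction records, so a record carrying
  # several flags no longer instantiates the same metric more than once.
  metrics = []
  if 'is_smooth' in records_custom_flags:
    metrics.append(SmoothnessMetric())
  # 'is_fast' and 'is_responsive' would map to their metrics the same way.
  return metrics

# Usage sketch:
# for metric in GetMetricsFromRecordFlags(record.GetUserDefinedFlags()):
#   metric.AddResults(model, renderer_thread, interactions, results)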
The overall shape looks good to me. Can you pull some changes out into another patch as I suggested below? https://codereview.chromium.org/854833003/diff/260001/tools/telemetry/telemet... File tools/telemetry/telemetry/web_perf/timeline_based_measurement.py (right): https://codereview.chromium.org/854833003/diff/260001/tools/telemetry/telemet... tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:34: tir_module.IS_FAST, fast_metric.FastMetric, I think you mean "tir_module.IS_FAST: fast_metric.FastMetric," https://codereview.chromium.org/854833003/diff/260001/tools/telemetry/telemet... tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:52: return [metric() for flag, metric in DEFAULT_METRICS if flag in flags_set] in DEFAULT_METRICS.iteritems() https://codereview.chromium.org/854833003/diff/260001/tools/telemetry/telemet... tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:149: that may be removed. Instead of making overhead_level sometimes an object, sometimes a string, you can just make it a TracingCategoryFilter only. So NO_OVERHEAD_LEVEL, V8_OVERHEAD_LEVEL, MINIMAL_OVERHEAD_LEVEL, DEBUG_OVERHEAD_LEVEL will be default TracingCategoryFilter objects instead of enum values. For this change though, can you please make a different patch and send it to me for ease of reviewing? https://codereview.chromium.org/854833003/diff/260001/tools/telemetry/telemet... tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:159: self._extra_category_filters.extend(filters) What is this method for?
https://codereview.chromium.org/854833003/diff/260001/tools/telemetry/telemet... File tools/telemetry/telemetry/web_perf/timeline_based_measurement.py (right): https://codereview.chromium.org/854833003/diff/260001/tools/telemetry/telemet... tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:34: tir_module.IS_FAST, fast_metric.FastMetric, On 2015/01/26 19:45:48, nednguyen wrote: > I think you mean "tir_module.IS_FAST: fast_metric.FastMetric," Done. https://codereview.chromium.org/854833003/diff/260001/tools/telemetry/telemet... tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:52: return [metric() for flag, metric in DEFAULT_METRICS if flag in flags_set] On 2015/01/26 19:45:48, nednguyen wrote: > in DEFAULT_METRICS.iteritems() Done. https://codereview.chromium.org/854833003/diff/260001/tools/telemetry/telemet... tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:149: that may be removed. On 2015/01/26 19:45:48, nednguyen wrote: > Instead of making overhead_level sometimes object, sometimes string, you can > just make it TracingCategoryFilter type only. > > So NO_OVERHEAD_LEVEL, V8_OVERHEAD_LEVEL, MINIMAL_OVERHEAD_LEVEL, > DEBUG_OVERHEAD_LEVEL will be the 3 default TracingCategoryFilter object instead > of enum. > > For this change though, can you please make a different patch sent to me for the > ease of reviewing? I think these are better as strings because we cannot just use global objects that get reused. The default objects need to be new option objects per TBM instance because those option objects are actually modifiable. The new "extra_category_filters" that was added actually does modify the category_filter object right afterwards. If we were going to replace the strings with types we would probably replace the NO_OVERHEAD_LEVEL...etc. variables with either a function that takes a string, or various functions such as NoOverheadLevelFilterCategory(). I'm not sure if replacing the strings with functions is going to be cleaner in the long run. https://codereview.chromium.org/854833003/diff/260001/tools/telemetry/telemet... tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:159: self._extra_category_filters.extend(filters) On 2015/01/26 19:45:48, nednguyen wrote: > What is this method for? This was pulled in from master; I merely renamed the variable.
lgtm https://codereview.chromium.org/854833003/diff/260001/tools/telemetry/telemet... File tools/telemetry/telemetry/web_perf/timeline_based_measurement.py (right): https://codereview.chromium.org/854833003/diff/260001/tools/telemetry/telemet... tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:149: that may be removed. On 2015/01/26 20:01:48, David Yen wrote: > On 2015/01/26 19:45:48, nednguyen wrote: > > Instead of making overhead_level sometimes object, sometimes string, you can > > just make it TracingCategoryFilter type only. > > > > So NO_OVERHEAD_LEVEL, V8_OVERHEAD_LEVEL, MINIMAL_OVERHEAD_LEVEL, > > DEBUG_OVERHEAD_LEVEL will be the 3 default TracingCategoryFilter object > instead > > of enum. > > > > For this change though, can you please make a different patch sent to me for > the > > ease of reviewing? > > I think these are better as strings because we cannot just use global objects > that get reused. The default objects need to be new option objects per TBM > instance because those option objects are actually modifiable. The new > "extra_category_filters" that was added actually do modify the category_filter > object right afterwards. > > If we were going to replace the strings as types we would probably replace the > NO_OVERHEAD_LEVEL...etc. variables with either a function that takes a string, > or various functions such as NoOverheadLevelFilterCategory(). I'm not sure if > replacing the strings with functions is going to be cleaner in the long run. You are right. I thought TracingCategoryFilter is immutable but it isn't. For now, please add an assertion that overhead_level is either in ALL_OVERHEAD_LEVELS or is of type TracingCategoryFilter.
On 2015/01/26 22:26:51, nednguyen wrote: > lgtm > > https://codereview.chromium.org/854833003/diff/260001/tools/telemetry/telemet... > File tools/telemetry/telemetry/web_perf/timeline_based_measurement.py (right): > > https://codereview.chromium.org/854833003/diff/260001/tools/telemetry/telemet... > tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:149: that may > be removed. > On 2015/01/26 20:01:48, David Yen wrote: > > On 2015/01/26 19:45:48, nednguyen wrote: > > > Instead of making overhead_level sometimes object, sometimes string, you > can > > > just make it TracingCategoryFilter type only. > > > > > > So NO_OVERHEAD_LEVEL, V8_OVERHEAD_LEVEL, MINIMAL_OVERHEAD_LEVEL, > > > DEBUG_OVERHEAD_LEVEL will be the 3 default TracingCategoryFilter object > > instead > > > of enum. > > > > > > For this change though, can you please make a different patch sent to me for > > the > > > ease of reviewing? > > > > I think these are better as strings because we cannot just use global objects > > that get reused. The default objects need to be new option objects per TBM > > instance because those option objects are actually modifiable. The new > > "extra_category_filters" that was added actually do modify the category_filter > > object right afterwards. > > > > If we were going to replace the strings as types we would probably replace the > > NO_OVERHEAD_LEVEL...etc. variables with either a function that takes a string, > > or various functions such as NoOverheadLevelFilterCategory(). I'm not sure if > > replacing the strings with functions is going to be cleaner in the long run. > > You are right. I thought TracingCategoryFilter is immutable but it isn't. For > now, please add assertion that overhead_level either in ALL_OVERHEAD_LEVELS or > it is of type TracingCategoryFilter. Code looks good on my side. Please wait for other reviewers' approval.
https://codereview.chromium.org/854833003/diff/260001/tools/telemetry/telemet... File tools/telemetry/telemetry/web_perf/timeline_based_measurement.py (right): https://codereview.chromium.org/854833003/diff/260001/tools/telemetry/telemet... tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:149: that may be removed. On 2015/01/26 22:26:51, nednguyen wrote: > On 2015/01/26 20:01:48, David Yen wrote: > > On 2015/01/26 19:45:48, nednguyen wrote: > > > Instead of making overhead_level sometimes object, sometimes string, you > can > > > just make it TracingCategoryFilter type only. > > > > > > So NO_OVERHEAD_LEVEL, V8_OVERHEAD_LEVEL, MINIMAL_OVERHEAD_LEVEL, > > > DEBUG_OVERHEAD_LEVEL will be the 3 default TracingCategoryFilter object > > instead > > > of enum. > > > > > > For this change though, can you please make a different patch sent to me for > > the > > > ease of reviewing? > > > > I think these are better as strings because we cannot just use global objects > > that get reused. The default objects need to be new option objects per TBM > > instance because those option objects are actually modifiable. The new > > "extra_category_filters" that was added actually do modify the category_filter > > object right afterwards. > > > > If we were going to replace the strings as types we would probably replace the > > NO_OVERHEAD_LEVEL...etc. variables with either a function that takes a string, > > or various functions such as NoOverheadLevelFilterCategory(). I'm not sure if > > replacing the strings with functions is going to be cleaner in the long run. > > You are right. I thought TracingCategoryFilter is immutable but it isn't. For > now, please add assertion that overhead_level either in ALL_OVERHEAD_LEVELS or > it is of type TracingCategoryFilter. I think the assertion on line 215 handles this case. I will add a readable assertion text.
https://codereview.chromium.org/854833003/diff/260001/tools/telemetry/telemet... File tools/telemetry/telemetry/web_perf/timeline_based_measurement.py (right): https://codereview.chromium.org/854833003/diff/260001/tools/telemetry/telemet... tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:149: that may be removed. On 2015/01/26 22:37:38, David Yen wrote: > On 2015/01/26 22:26:51, nednguyen wrote: > > On 2015/01/26 20:01:48, David Yen wrote: > > > On 2015/01/26 19:45:48, nednguyen wrote: > > > > Instead of making overhead_level sometimes object, sometimes string, you > > can > > > > just make it TracingCategoryFilter type only. > > > > > > > > So NO_OVERHEAD_LEVEL, V8_OVERHEAD_LEVEL, MINIMAL_OVERHEAD_LEVEL, > > > > DEBUG_OVERHEAD_LEVEL will be the 3 default TracingCategoryFilter object > > > instead > > > > of enum. > > > > > > > > For this change though, can you please make a different patch sent to me > for > > > the > > > > ease of reviewing? > > > > > > I think these are better as strings because we cannot just use global > objects > > > that get reused. The default objects need to be new option objects per TBM > > > instance because those option objects are actually modifiable. The new > > > "extra_category_filters" that was added actually do modify the > category_filter > > > object right afterwards. > > > > > > If we were going to replace the strings as types we would probably replace > the > > > NO_OVERHEAD_LEVEL...etc. variables with either a function that takes a > string, > > > or various functions such as NoOverheadLevelFilterCategory(). I'm not sure > if > > > replacing the strings with functions is going to be cleaner in the long run. > > > > You are right. I thought TracingCategoryFilter is immutable but it isn't. For > > now, please add assertion that overhead_level either in ALL_OVERHEAD_LEVELS or > > it is of type TracingCategoryFilter. > > I think the assertion on line 215 handles this case. I will add a readable > assertion text. Let's move it here instead. Raise the error as early as possible.
ping epenner, vmiura for gpu_times.py and gpu_timeline* changes. https://codereview.chromium.org/854833003/diff/260001/tools/telemetry/telemet... File tools/telemetry/telemetry/web_perf/timeline_based_measurement.py (right): https://codereview.chromium.org/854833003/diff/260001/tools/telemetry/telemet... tools/telemetry/telemetry/web_perf/timeline_based_measurement.py:149: that may be removed. On 2015/01/26 22:51:41, nednguyen wrote: > On 2015/01/26 22:37:38, David Yen wrote: > > On 2015/01/26 22:26:51, nednguyen wrote: > > > On 2015/01/26 20:01:48, David Yen wrote: > > > > On 2015/01/26 19:45:48, nednguyen wrote: > > > > > Instead of making overhead_level sometimes object, sometimes string, > you > > > can > > > > > just make it TracingCategoryFilter type only. > > > > > > > > > > So NO_OVERHEAD_LEVEL, V8_OVERHEAD_LEVEL, MINIMAL_OVERHEAD_LEVEL, > > > > > DEBUG_OVERHEAD_LEVEL will be the 3 default TracingCategoryFilter object > > > > instead > > > > > of enum. > > > > > > > > > > For this change though, can you please make a different patch sent to me > > for > > > > the > > > > > ease of reviewing? > > > > > > > > I think these are better as strings because we cannot just use global > > objects > > > > that get reused. The default objects need to be new option objects per TBM > > > > instance because those option objects are actually modifiable. The new > > > > "extra_category_filters" that was added actually do modify the > > category_filter > > > > object right afterwards. > > > > > > > > If we were going to replace the strings as types we would probably replace > > the > > > > NO_OVERHEAD_LEVEL...etc. variables with either a function that takes a > > string, > > > > or various functions such as NoOverheadLevelFilterCategory(). I'm not > sure > > if > > > > replacing the strings with functions is going to be cleaner in the long > run. > > > > > > You are right. I thought TracingCategoryFilter is immutable but it isn't. > For > > > now, please add assertion that overhead_level either in ALL_OVERHEAD_LEVELS > or > > > it is of type TracingCategoryFilter. > > > > I think the assertion on line 215 handles this case. I will add a readable > > assertion text. > > Let move it here instead. Raise error as early as possible. Done.
LGTM with nits. Only one real issue regarding thread_duration and the rest are just nits. https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... File tools/perf/metrics/gpu_timeline.py (right): https://codereview.chromium.org/854833003/diff/60001/tools/perf/metrics/gpu_t... tools/perf/metrics/gpu_timeline.py:129: for buffer_swap_event in buffer_swap_events: I also just noticed that you calculate stats on the frames (min/max/ave/std-dev) above, which is nice, and explains why you need a bit more processing. Cool. https://codereview.chromium.org/854833003/diff/430001/tools/perf/metrics/gpu_... File tools/perf/metrics/gpu_timeline.py (right): https://codereview.chromium.org/854833003/diff/430001/tools/perf/metrics/gpu_... tools/perf/metrics/gpu_timeline.py:20: TRACKED_NAMES = { 'RenderCompositor': 'render_compositor', Optional nit: Could this be something like TRACKED_GL_CONTEXT_NAMES? I was briefly unclear whether we were looking at client-side traces. https://codereview.chromium.org/854833003/diff/430001/tools/perf/metrics/gpu_... tools/perf/metrics/gpu_timeline.py:30: event_times = [(event.thread_duration or event.duration) I'd be tempted to just not report anything if thread_duration isn't supported. If you want to keep clock times, it would be good to: - Report a different name when using clock times - Check if thread_duration is 'None' instead (it's possible for thread_duration to be zero and duration to be non-zero, since the clock granularity could be different). - Might be nice to add a sanity check that 'if any thread_duration is None, all thread_durations are None'. https://codereview.chromium.org/854833003/diff/430001/tools/perf/metrics/gpu_... tools/perf/metrics/gpu_timeline.py:76: if durations: Nit: Looks like we should either add or not add all results here. Can you make it an empty list and still add it? https://codereview.chromium.org/854833003/diff/430001/tools/perf/metrics/gpu_... tools/perf/metrics/gpu_timeline.py:99: mean_frame - Mean time each frame is calculated to be. Nit: I think this name might have changed in the code below. https://codereview.chromium.org/854833003/diff/430001/tools/perf/metrics/gpu_... tools/perf/metrics/gpu_timeline.py:123: current_tracked_service_events = collections.defaultdict(list) Optional Nit: current_frame_service_events, or current_frame_tracked_service_events or just a comment that these track the traces within the current frame. https://codereview.chromium.org/854833003/diff/430001/tools/perf/metrics/gpu_... tools/perf/metrics/gpu_timeline.py:142: Optional nit: The alternating if blocks are almost the same. Could it be less code, like a for loop over two with variables in arrays, or a function? But if you think it would make this less clear you can forget this.
https://codereview.chromium.org/854833003/diff/430001/tools/perf/metrics/gpu_... File tools/perf/metrics/gpu_timeline.py (right):

https://codereview.chromium.org/854833003/diff/430001/tools/perf/metrics/gpu_... tools/perf/metrics/gpu_timeline.py:20: TRACKED_NAMES = { 'RenderCompositor': 'render_compositor',
On 2015/01/27 22:06:38, epenner wrote:
> Optional nit: Could this be something like TRACKED_GL_CONTEXT_NAMES? I was
> briefly unclear whether we were looking at client-side traces.
Done.

https://codereview.chromium.org/854833003/diff/430001/tools/perf/metrics/gpu_... tools/perf/metrics/gpu_timeline.py:30: event_times = [(event.thread_duration or event.duration)
On 2015/01/27 22:06:38, epenner wrote:
> I'd be tempted to just not report anything if thread_duration isn't
> supported. [...]
This was to simplify the calculation of both synchronous and asynchronous slices. Asynchronous slices do not have the thread_duration, only the duration, which is why we need to fall back to the duration when thread_duration does not exist. Currently the CPU-side traces use synchronous markers and the GPU-side traces use asynchronous markers.

https://codereview.chromium.org/854833003/diff/430001/tools/perf/metrics/gpu_... tools/perf/metrics/gpu_timeline.py:76: if durations:
On 2015/01/27 22:06:38, epenner wrote:
> Nit: Looks like we should either add all results here or none. Can you make
> it an empty list and still add it?
list_of_scalar_values requires a non-empty list; I added this after I saw it was throwing an exception.

https://codereview.chromium.org/854833003/diff/430001/tools/perf/metrics/gpu_... tools/perf/metrics/gpu_timeline.py:99: mean_frame - Mean time each frame is calculated to be.
On 2015/01/27 22:06:38, epenner wrote:
> Nit: I think this name might have changed in the code below.
Updated the description to match the new return values.

https://codereview.chromium.org/854833003/diff/430001/tools/perf/metrics/gpu_... tools/perf/metrics/gpu_timeline.py:123: current_tracked_service_events = collections.defaultdict(list)
On 2015/01/27 22:06:38, epenner wrote:
> Optional nit: current_frame_service_events, or
> current_frame_tracked_service_events, or just a comment that these track
> the traces within the current frame.
Done.

https://codereview.chromium.org/854833003/diff/430001/tools/perf/metrics/gpu_... tools/perf/metrics/gpu_timeline.py:142:
On 2015/01/27 22:06:38, epenner wrote:
> Optional nit: The alternating if blocks are almost the same. Could it be
> less code, like a for loop over two with variables in arrays, or a
> function? But if you think it would make this less clear, you can forget
> this.
Yeah, originally I had it trying to share more code, but it got confusing pretty quickly.
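[Editor's note: on the min/max/avg/std-dev point and the empty-list guard, the shape of that computation, sketched as a standalone helper. The name FrameTimeStats is hypothetical; the landed code reports through telemetry's list_of_scalar_values rather than returning a dict.]

    import math

    def FrameTimeStats(durations):
      # Per-category stats reported by the CL: avg, max, and stddev in ms.
      # Return None for an empty list -- list_of_scalar_values raises on
      # empty input, so an empty category is simply not reported.
      if not durations:
        return None
      mean = sum(durations) / float(len(durations))
      variance = sum((d - mean) ** 2 for d in durations) / len(durations)
      return {'avg': mean, 'max': max(durations),
              'stddev': math.sqrt(variance)}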
https://codereview.chromium.org/854833003/diff/430001/tools/perf/metrics/gpu_... File tools/perf/metrics/gpu_timeline.py (right):

https://codereview.chromium.org/854833003/diff/430001/tools/perf/metrics/gpu_... tools/perf/metrics/gpu_timeline.py:30: event_times = [(event.thread_duration or event.duration)
Minimally, any detection of a missing thread_duration should check event.thread_duration is None; I believe the current code treats both None and 0 as missing (and a thread_duration of 0 might come with a non-zero clock duration).

Switching between thread_duration and duration changes the meaning of the result, so IMO it would be better to report a different name, to not report the result at all when there are no thread_durations, or to just use the clock duration all the time. In terms of the current benchmark, aren't all the async slices delivered via the command stream? When would CPU time actually be available?

Last, it seems like we could actually report thread_duration if the async slice begins/ends on the same thread... No need for that now, but would that make this consistently work with thread_durations?
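[Editor's note: epenner's None-vs-0 point is the standard Python truthiness pitfall: `or` cannot distinguish "missing" from "measured as zero". A self-contained illustration:]

    thread_duration = 0   # a real measurement: zero CPU ms at this clock's
                          # granularity, even though wall-clock time elapsed
    duration = 1.5        # wall-clock ms

    wrong = thread_duration or duration   # -> 1.5; silently drops the 0
    right = (thread_duration if thread_duration is not None
             else duration)               # -> 0; only substitutes for None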
> Last, it seems like we could actually report thread_duration if the async
> slice begins/ends on the same thread... No need for that now, but would
> that make this consistently work with thread_durations?

I believe we are using async slices only for GPU-side events, so the CPU thread_duration isn't meaningful in those cases.
On 2015/01/27 23:03:09, vmiura wrote:
> I believe we are using async slices only for GPU-side events, so the CPU
> thread_duration isn't meaningful in those cases.

I see. So thread_duration is only expected to be None for GPU events where we are measuring GPU time? As long as we use None to detect this, it's okay with me.

This is less important, but some platforms don't support thread_duration, so those will also be None for all traces and we'll get clock times on those platforms. We could detect this at a higher level and zero out all our results, but it's not a big deal IMO.
https://codereview.chromium.org/854833003/diff/430001/tools/perf/metrics/gpu_... File tools/perf/metrics/gpu_timeline.py (right):

https://codereview.chromium.org/854833003/diff/430001/tools/perf/metrics/gpu_... tools/perf/metrics/gpu_timeline.py:30: event_times = [(event.thread_duration or event.duration)
On 2015/01/27 22:59:02, epenner wrote:
> Minimally, any detection of a missing thread_duration should check
> event.thread_duration is None; I believe the current code treats both None
> and 0 as missing (and a thread_duration of 0 might come with a non-zero
> clock duration).
>
> Switching between thread_duration and duration changes the meaning of the
> result, so IMO it would be better to report a different name, to not
> report the result at all when there are no thread_durations, or to just
> use the clock duration all the time. In terms of the current benchmark,
> aren't all the async slices delivered via the command stream? When would
> CPU time actually be available?
>
> Last, it seems like we could actually report thread_duration if the async
> slice begins/ends on the same thread... No need for that now, but would
> that make this consistently work with thread_durations?

I've split this up into two functions now, so CPU events only use event.thread_duration and GPU events only use event.duration. Does that sound reasonable?
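[Editor's note: the split described here would look roughly like this; the helper names are hypothetical and the field names mirror telemetry slice events. Handling of a None thread_duration is addressed a few messages below.]

    def GetCpuEventTimes(events):
      # Synchronous CPU-side slices: use thread_duration only, never the
      # wall-clock duration.
      return [event.thread_duration for event in events]

    def GetGpuEventTimes(events):
      # Asynchronous GPU-side slices never carry thread_duration; their
      # duration is the GPU time reported by the tracer.
      return [event.duration for event in events]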
On 2015/01/27 23:20:00, David Yen wrote:
> I've split this up into two functions now, so CPU events only use
> event.thread_duration and GPU events only use event.duration. Does that
> sound reasonable?

SGTM
On 2015/01/27 23:22:03, epennerAtGoogle wrote:
> SGTM

Oh, one last note on that: on some platforms, some operations won't work when thread_duration is None. Just ensure that always gets converted to zero to avoid bot redness.
On 2015/01/27 23:23:57, epennerAtGoogle wrote:
> Oh, one last note on that: on some platforms, some operations won't work
> when thread_duration is None. Just ensure that always gets converted to
> zero to avoid bot redness.

Done, and I added comments explaining this behavior.
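[Editor's note: the coercion from the final patch set ("Updated comments, ensure lack of thread_duration becomes 0"), sketched; the helper name is hypothetical.]

    def ThreadDurationOrZero(event):
      # On platforms without thread_duration support every slice reports
      # None; arithmetic on None would redden the bots, so force it to 0.
      return event.thread_duration if event.thread_duration is not None else 0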
The CQ bit was checked by dyen@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/854833003/490001
Message was sent while issue was closed.
Committed patchset #26 (id:490001)
Message was sent while issue was closed.
Patchset 26 (id:??) landed as https://crrev.com/dfb418e345ac741c380a5570b386a5bfad1c6115 Cr-Commit-Position: refs/heads/master@{#313419}
Message was sent while issue was closed.
nduca@chromium.org changed reviewers: + nduca@chromium.org
Message was sent while issue was closed.
Why is this in tools/perf? All tests should produce GPU times.
Message was sent while issue was closed.
On 2015/02/05 16:41:14, nduca wrote:
> Why is this in tools/perf? All tests should produce GPU times.

Enabling GPU tracing has significant overhead, so we cannot enable it on every test.