Chromium Code Reviews
Created: 4 years, 5 months ago by petrcermak
Modified: 4 years, 5 months ago
CC: catapult-reviews_chromium.org, tracing-review_chromium.org, ulan
Base URL: git@github.com:catapult-project/catapult.git@master
Target Ref: refs/heads/master
Project: catapult
Visibility: Public
Description: [memory-metric] Add support for time ranges to the TBMv2 memory metric
This patch adds support for calculating memory values over memory dumps
that intersect a given time range.
BUG=chromium:625852,catapult:#2421
Committed: https://chromium.googlesource.com/external/github.com/catapult-project/catapult/+/88c60f52ca485cfcc08f4a0bd18695aaba6a40a5
Patch Set 1 #
Total comments: 7
Patch Set 2 : Address comments #
Messages
Total messages: 15 (4 generated)
petrcermak@chromium.org changed reviewers: + benjhayden@chromium.org, primiano@chromium.org
PTAL. Thanks,
Petr

https://codereview.chromium.org/2132683002/diff/1/tracing/tracing/metrics/sys...
File tracing/tracing/metrics/system_health/memory_metric.html (right):

https://codereview.chromium.org/2132683002/diff/1/tracing/tracing/metrics/sys...
tracing/tracing/metrics/system_health/memory_metric.html:118: return dumps.filter(d => opt_range.intersectsExplicitRangeExclusive(
Ben: I used this to be consistent with the remaining metrics, although I think it would make more sense to use intersectsExplicitRangeInclusive so that, if there's only a single process memory dump at timestamp T (in which case the global memory dump has zero duration) and the user requests the timestamp range [T, T+1], the dump at timestamp T is included. Is there a particular reason for your choice? Would you be OK with using inclusive intersection in the memory metric?
https://codereview.chromium.org/2132683002/diff/1/tracing/tracing/metrics/sys...
File tracing/tracing/metrics/system_health/memory_metric.html (right):

https://codereview.chromium.org/2132683002/diff/1/tracing/tracing/metrics/sys...
tracing/tracing/metrics/system_health/memory_metric.html:118: return dumps.filter(d => opt_range.intersectsExplicitRangeExclusive(
On 2016/07/07 at 17:02:23, petrcermak wrote:
> Ben: I used this to be consistent with the remaining metrics although I think
> it would make more sense to use intersectsExplicitRangeInclusive so that, if
> there's only a single process memory dump at timestamp T (in which case the
> global memory dump has zero duration) and the user requests timestamp range
> [T, T+1], the dump at timestamp T is included. Is there a particular reason
> for your choice? Would you be ok with using inclusive intersection in the
> memory metric?
I probably just copy-pasted it from somewhere else. You're right, inclusive makes more sense. Feel free to use it here, and fix the other metrics if you want.
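The exclusive-vs-inclusive distinction discussed above can be sketched with plain interval checks. These helper functions are illustrative stand-ins, not the actual tr.b.Range intersectsExplicitRange* implementations:

```javascript
// Illustrative interval checks (not the real Range API). The only difference
// between the two is whether the endpoint comparisons are strict.
function intersectsExclusive(rangeMin, rangeMax, start, end) {
  // Strict comparison: a zero-length dump exactly at a range boundary fails.
  return start < rangeMax && end > rangeMin;
}

function intersectsInclusive(rangeMin, rangeMax, start, end) {
  // Non-strict comparison: a zero-length dump at a range boundary is kept.
  return start <= rangeMax && end >= rangeMin;
}

// A lone memory dump at timestamp T has zero duration (start === end === T).
// With T = 5 and the requested range [5, 6]:
const keptExclusive = intersectsExclusive(5, 6, 5, 5);  // false: dump dropped
const keptInclusive = intersectsInclusive(5, 6, 5, 5);  // true: dump kept
```

With strict comparisons the zero-duration dump touching the range boundary is discarded, which is exactly the case Petr describes above.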
A couple of questions about the tests, and feel free to use Inclusive, but lgtm.

https://codereview.chromium.org/2132683002/diff/1/tracing/tracing/metrics/sys...
File tracing/tracing/metrics/system_health/memory_metric_test.html (right):

https://codereview.chromium.org/2132683002/diff/1/tracing/tracing/metrics/sys...
tracing/tracing/metrics/system_health/memory_metric_test.html:38: name, modelCallback, opt_rangeOfInterest, expectedNumerics) {
Why not take |options| so that this helper doesn't need to construct it, and callers don't need to comment what the parameter is, and the parameter can grow naturally if/when more types of options are added?

https://codereview.chromium.org/2132683002/diff/1/tracing/tracing/model/memor...
File tracing/tracing/model/memory_dump_test_utils.html (right):

https://codereview.chromium.org/2132683002/diff/1/tracing/tracing/model/memor...
tracing/tracing/model/memory_dump_test_utils.html:48: model, timestamp, opt_levelOfDetail, opt_duration) {
Should this be refactored to take a dict of options like the other test utils?
Just a drive-by comment from my side, deferring the rest to Ben: this is filtering at the *global* dump level, right?
Not all processes will dump at the same time. Say that you have two processes, B(rowser) and R(enderer), and that you create three dumps, D1, D2 and D3, which happen at the following times:

D1: B @ T=1s, R @ T=2s
D2: B @ T=11s, R @ T=10s
D3: B @ T=22s, R @ T=20s

If you now pass a range [10-21], I'd expect only D2 to be taken into account and D3 to be dropped completely, even if the renderer process dumped in time.
Thanks for your comments. PTAL.
On 2016/07/08 08:12:30, Primiano Tucci wrote:
> Just a drive-by comment from my side, deferring the rest to ben:
> is this filtering out at the *global* dump level, right?
> Not all processes will dump at the same time. Say that you have two processes,
> B(rowser) and R(enderer). Say that you Create three dumps: D1, D2, D3, which
> happen at the following times:
>
> D1: B @ T=1s, R @ T=2s
> D2: B @ T=11s, R @ T=10s
> D3: B @ T=22s, R @ T=20s
>
> if you now pass a range [10-21] I'd expect only D2 to be taken into account
> and D3 to be dropped completely, even if the renderer process dumped in time.
Yes, the filtering is done at the global memory dump level. Hence, if you pass a
range [10-21], all of the three process dumps (D1-3) will be taken into account.
I understand that this might be confusing, but I DON'T think that filtering at
the process memory dump level is possible for the following two reasons:
1. Since the values (especially effective size) reported by the metric are
influenced by memory sharing across processes (DAG), it doesn't make sense
to report values for only some processes.
2. If we ignore some process memory dumps, then the totals (e.g.
memory:chrome:ALL_PROCESSES:...) won't make sense anymore.
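The all-or-nothing filtering behind the two points above can be sketched roughly as follows. The dump representation and the helper are invented for illustration; the real metric operates on the model's GlobalMemoryDump objects:

```javascript
// Illustrative all-or-nothing filtering at the global memory dump level.
// Each global dump spans from its earliest to its latest process dump and is
// kept or dropped as a whole, so cross-process sizes and totals stay coherent.
function filterGlobalDumps(globalDumps, rangeMin, rangeMax) {
  return globalDumps.filter(gmd => {
    const start = Math.min(...gmd.processDumpTimestamps);
    const end = Math.max(...gmd.processDumpTimestamps);
    // Inclusive intersection with the range of interest.
    return start <= rangeMax && end >= rangeMin;
  });
}

// Primiano's example: browser (B) and renderer (R) dump at slightly
// different times within each global dump.
const dumps = [
  {name: 'D1', processDumpTimestamps: [1, 2]},    // B @ 1s,  R @ 2s
  {name: 'D2', processDumpTimestamps: [11, 10]},  // B @ 11s, R @ 10s
  {name: 'D3', processDumpTimestamps: [22, 20]},  // B @ 22s, R @ 20s
];

// For the range [10, 21], D2 is kept; because the renderer dumped at 20s,
// D3 intersects the range too, so it is kept as a whole, browser dump and all.
const kept = filterGlobalDumps(dumps, 10, 21).map(d => d.name);
// → ['D2', 'D3']
```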
================================================================================
There's another subtle issue with the range selection though. Consider two
global memory dumps:
GMD1: start=10s, end=15s
GMD2: start=20s, end=25s
If you request the range [16s-19s], the current implementation will return no
values at all (because no GMDs intersect the range). However, the Trace Viewer
UI (namely the "Used memory" track) will display these values as the following
bar chart:
                                  +--------------- . . .
                                  |      GMD2
    +-----------------------------+<-performed-->:
    |            GMD1             |
    |<-performed-->:  :<range->:  |
----+--------------+--+--------+--+--------------+------>
   10             15 16       19 20             25
The key thing to note here is that the UI implies that the GMD1 values are kept
until the beginning of GMD2 (20s). In other words, GMD1 should maybe be taken
into account when the range [16s-19s] is requested for consistency. WDYT?
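Concretely, the gap case under the current implementation reduces to an intersection test that matches neither dump. This is a minimal self-contained sketch, not the metric's actual code:

```javascript
// GMD1 spans [10, 15] and GMD2 spans [20, 25]; the requested range is [16, 19].
const gmds = [
  {name: 'GMD1', start: 10, end: 15},
  {name: 'GMD2', start: 20, end: 25},
];
const rangeMin = 16;
const rangeMax = 19;

// Inclusive intersection: keep dumps with start <= rangeMax && end >= rangeMin.
const selected = gmds.filter(g => g.start <= rangeMax && g.end >= rangeMin);
// → []: no dump intersects [16s, 19s], so the metric reports no values at all.
```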
================================================================================
Side note: The importer currently treats PMDs as instantaneous with their
timestamps taken from the corresponding v-phase events in the trace. The
timestamp range of a GMD is then simply the min and max of the timestamps of the
underlying PMDs. Example:
PMD1A raw event: { 'ph': 'v', ts: 10, id: '0xBEEF', ... }
PMD1B raw event: { 'ph': 'v', ts: 12, id: '0xBEEF', ... }
PMD1C raw event: { 'ph': 'v', ts: 15, id: '0xBEEF', ... }
|
| TraceEventImporter
v
PMD1A: ProcessMemoryDump { start: 10, duration: 0, ... } <--------+
PMD1B: ProcessMemoryDump { start: 12, duration: 0, ... } <------+ |
PMD1C: ProcessMemoryDump { start: 15, duration: 0, ... } <----+ | |
GMD1: GlobalMemoryDump { start: 10, duration: 5, ... } <------+-+-+
Since the trace now also contains events marking the beginning and end of each
dump, we could probably use them instead (to get more precise ranges of the
dump events). However, that's orthogonal to this issue and I'm not sure it's
worth the effort.
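The min/max derivation in the side note could be sketched like this (globalDumpRange is a hypothetical helper for illustration, not TraceEventImporter's actual code):

```javascript
// Sketch: derive a global dump's time range from its instantaneous process
// dumps by taking the min and max of their timestamps.
function globalDumpRange(processDumps) {
  const timestamps = processDumps.map(pmd => pmd.start);
  const start = Math.min(...timestamps);
  return {start: start, duration: Math.max(...timestamps) - start};
}

// The PMD1A-C example above: v-phase events at ts 10, 12 and 15.
const pmds = [
  {start: 10, duration: 0},  // PMD1A
  {start: 12, duration: 0},  // PMD1B
  {start: 15, duration: 0},  // PMD1C
];
const gmd1 = globalDumpRange(pmds);
// → {start: 10, duration: 5}
```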
Petr
https://codereview.chromium.org/2132683002/diff/1/tracing/tracing/metrics/sys...
File tracing/tracing/metrics/system_health/memory_metric.html (right):
https://codereview.chromium.org/2132683002/diff/1/tracing/tracing/metrics/sys...
tracing/tracing/metrics/system_health/memory_metric.html:118: return dumps.filter(d => opt_range.intersectsExplicitRangeExclusive(
On 2016/07/07 17:23:10, benjhayden_chromium wrote:
> On 2016/07/07 at 17:02:23, petrcermak wrote:
> > Ben: I used this to be consistent with the remaining metrics although I
> > think it would make more sense to use intersectsExplicitRangeInclusive so
> > that, if there's only a single process memory dump at timestamp T (in which
> > case the global memory dump has zero duration) and the user requests
> > timestamp range [T, T+1], the dump at timestamp T is included. Is there a
> > particular reason for your choice? Would you be ok with using inclusive
> > intersection in the memory metric?
>
> I probably just copy-pasted it from somewhere else. You're right, inclusive
> makes more sense. Feel free to use it here, and fix the other metrics if you
> want.
OK, I've changed it to "inclusive" for the memory metric. I'll do the same for
the remaining metrics in a separate patch.
https://codereview.chromium.org/2132683002/diff/1/tracing/tracing/metrics/sys...
File tracing/tracing/metrics/system_health/memory_metric_test.html (right):
https://codereview.chromium.org/2132683002/diff/1/tracing/tracing/metrics/sys...
tracing/tracing/metrics/system_health/memory_metric_test.html:38: name, modelCallback, opt_rangeOfInterest, expectedNumerics) {
On 2016/07/07 17:31:32, benjhayden_chromium wrote:
> Why not take |options| so that this helper doesn't need to construct it, and
> callers don't need to comment what the parameter is, and the parameter can
> grow naturally if/when more types of options are added?
Done. Great idea!
https://codereview.chromium.org/2132683002/diff/1/tracing/tracing/model/memor...
File tracing/tracing/model/memory_dump_test_utils.html (right):
https://codereview.chromium.org/2132683002/diff/1/tracing/tracing/model/memor...
tracing/tracing/model/memory_dump_test_utils.html:48: model, timestamp, opt_levelOfDetail, opt_duration) {
On 2016/07/07 17:31:32, benjhayden_chromium wrote:
> Should this be refactored to take a dict of options like the other test utils?
I agree. Are you OK with me doing it in a subsequent patch (so that this patch
doesn't get too big)?
Thanks for the super thorough explanation.
On 2016/07/08 12:14:55, petrcermak wrote:
> Thanks for your comments. PTAL.
>
> On 2016/07/08 08:12:30, Primiano Tucci wrote:
> > Just a drive-by comment from my side, deferring the rest to ben:
> > is this filtering out at the *global* dump level, right?
> > Not all processes will dump at the same time. Say that you have two processes,
> > B(rowser) and R(enderer). Say that you Create three dumps: D1, D2, D3, which
> > happen at the following times:
> >
> > D1: B @ T=1s, R @ T=2s
> > D2: B @ T=11s, R @ T=10s
> > D3: B @ T=22s, R @ T=20s
> >
> > if you now pass a range [10-21] I'd expect only D2 to be taken into account
> > and D3 to be dropped completely, even if the renderer process dumped in time.
>
> Yes, the filtering is done at the global memory dump level. Hence, if you pass
> a range [10-21], all of the three process dumps (D1-3) will be taken into
> account.
> I understand that this might be confusing, but I DON'T think that filtering at
> the process memory dump level is possible.
Ok, perfect. I completely agree with this. I agree it might be counter-intuitive
at first glance, but that is the only safe way of filtering without causing all
sorts of weird side effects.
I'm very happy with this aggressive (i.e. either all processes are in the range
or they are all out) filtering.
>
> 1. Since the values (especially effective size) reported by the metric are
>    influenced by memory sharing across processes (DAG), it doesn't make sense
>    to report values for only some processes.
>
> 2. If we ignore some process memory dumps, then the totals (e.g.
>    memory:chrome:ALL_PROCESSES:...) won't make sense anymore.
Yup, super agree.
>
> ================================================================================
>
> There's another subtle issue with the range selection though. Consider two
> global memory dumps:
>
> GMD1: start=10s, end=15s
> GMD2: start=20s, end=25s
>
> If you request the range [16s-19s], the current implementation will return no
> values at all (because no GMDs intersect the range).
Makes sense to me.
> However, the Trace Viewer
> UI (namely the "Used memory" track) will display these values as the following
> bar chart:
>
>                                   +--------------- . . .
>                                   |      GMD2
>     +-----------------------------+<-performed-->:
>     |            GMD1             |
>     |<-performed-->:  :<range->:  |
> ----+--------------+--+--------+--+--------------+------>
>    10             15 16       19 20             25
>
> The key thing to note here is that the UI implies that the GMD1 values are kept
> until the beginning of GMD2 (20s). In other words, GMD1 should maybe be taken
> into account when the range [16s-19s] is requested for consistency. WDYT?
I think it's okay as it is. IMHO the confusing part here is that the UI shows
the dump as a single point, while in reality it is a long-ish thing which spans
an interval.
But I wouldn't bother to change the UI (and if somebody really wants to debug it
we have the async events below in the MemoryInfra track).
If you do the same thing looking at the full length of a GMD, it becomes evident
why filtering 16-19 returns an empty selection (and yeah, I understand that from
a UX viewpoint, if you make a similar selection using the mouse, you get a
different result, but I don't think we really care about matching the mouse
behaviour with the metric semantics).
>
> ================================================================================
>
> Side note: The importer currently treats PMDs as instantaneous with their
> timestamps taken from the corresponding v-phase events in the trace. The
> timestamp range of a GMD is then simply the min and max of the timestamps of
> the underlying PMDs. Example:
> underlying PMDs. Example:
>
> PMD1A raw event: { 'ph': 'v', ts: 10, id: '0xBEEF', ... }
> PMD1B raw event: { 'ph': 'v', ts: 12, id: '0xBEEF', ... }
> PMD1C raw event: { 'ph': 'v', ts: 15, id: '0xBEEF', ... }
>
> |
> | TraceEventImporter
> v
>
> PMD1A: ProcessMemoryDump { start: 10, duration: 0, ... } <--------+
> PMD1B: ProcessMemoryDump { start: 12, duration: 0, ... } <------+ |
> PMD1C: ProcessMemoryDump { start: 15, duration: 0, ... } <----+ | |
> GMD1: GlobalMemoryDump { start: 10, duration: 5, ... } <------+-+-+
>
> Since the trace now also contains events marking the beginning and end of each
> dump, we could probably use them instead (to get more precise ranges of the
> dump events). However, that's orthogonal to this issue and I'm not sure it's
> worth the effort.
Agree. It's something worth considering... but not now :)
On 2016/07/08 12:49:21, Primiano Tucci wrote:
> Thanks for the super thorough explanation.
:-)
Can I haz LGTM?
On 2016/07/08 12:52:19, petrcermak wrote:
> On 2016/07/08 12:49:21, Primiano Tucci wrote:
> > Thanks for the super thorough explanation.
>
> :-)
>
> Can I haz LGTM?
I've just noticed that Ben has already lgtm'ed the patch, so it's fine.
Petr
The CQ bit was checked by petrcermak@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from benjhayden@chromium.org.
Link to the patchset: https://codereview.chromium.org/2132683002/#ps20001 (title: "Address comments")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Description was changed from
==========
[memory-metric] Add support for time ranges to the TBMv2 memory metric
This patch adds support for calculating memory values over memory dumps
that intersect a given time range.
BUG=chromium:625852,catapult:#2421
==========
to
==========
[memory-metric] Add support for time ranges to the TBMv2 memory metric
This patch adds support for calculating memory values over memory dumps
that intersect a given time range.
BUG=chromium:625852,catapult:#2421
Committed: https://chromium.googlesource.com/external/github.com/catapult-project/catapu...
==========
Message was sent while issue was closed.
Committed patchset #2 (id:20001) as https://chromium.googlesource.com/external/github.com/catapult-project/catapu...
Message was sent while issue was closed.
CQ bit was unchecked.