Created: 5 years, 5 months ago by petrcermak
Modified: 5 years, 2 months ago
Reviewers: pfeldman
CC: blink-reviews, caseq+blink_chromium.org, yurys+blink_chromium.org, lushnikov+blink_chromium.org, pfeldman+blink_chromium.org, apavlov+blink_chromium.org, devtools-reviews_chromium.org, sergeyv+blink_chromium.org, kozyatinskiy+blink_chromium.org
Base URL: https://chromium.googlesource.com/chromium/blink.git@master
Target Ref: refs/heads/master
Project: blink
Visibility: Public.
Description
Add Tracing.requestMemoryDump method to DevTools API (protocol)
This method will allow memory profiling tools such as Telemetry to
request memory dumps.
Proposal:
https://docs.google.com/document/d/190-URzSjfsiXNeyke86z85-5ctuqsC_RoX1J1uJTwIQ/edit?usp=sharing
Tracing handler change (Chromium side):
https://codereview.chromium.org/1232313004/
BUG=505826
Committed: https://crrev.com/a99accc93d66ea89d963739bfdf976a9ee6ee47d
git-svn-id: svn://svn.chromium.org/blink/trunk@199571 bbb929c8-8fbe-4397-9dbb-9b2b20218538
Patch Set 1 #
Total comments: 2
Messages
Total messages: 16 (2 generated)
pfeldman@chromium.org changed reviewers: + pfeldman@chromium.org
https://codereview.chromium.org/1235393005/diff/1/Source/devtools/protocol.json
File Source/devtools/protocol.json (right):

https://codereview.chromium.org/1235393005/diff/1/Source/devtools/protocol.js...
Source/devtools/protocol.json:4720: "description": "Request a global memory dump.",
Does tracing need to be started to use this one?
https://codereview.chromium.org/1235393005/diff/1/Source/devtools/protocol.json
File Source/devtools/protocol.json (right):

https://codereview.chromium.org/1235393005/diff/1/Source/devtools/protocol.js...
Source/devtools/protocol.json:4720: "description": "Request a global memory dump.",
On 2015/07/20 18:44:07, pfeldman wrote:
> Does tracing need to be started to use this one?

Yes, it does. The request will fail if tracing hasn't been started. If tracing has been started but the memory-infra category is not enabled, the response will immediately return success=false.
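A minimal sketch of the behaviour described in this reply, assuming only what the thread states: the request fails when tracing is not started, and reports success=false when tracing runs without the memory-infra category. The handler function is hypothetical, not the real Blink tracing handler.

```python
def handle_request_memory_dump(tracing_started, memory_infra_enabled):
    """Model the documented responses of Tracing.requestMemoryDump."""
    if not tracing_started:
        # The request fails outright if tracing hasn't been started.
        return {"error": {"message": "Tracing is not started"}}
    if not memory_infra_enabled:
        # Tracing is on but the memory-infra category is missing:
        # the response immediately reports success=false.
        return {"result": {"success": False}}
    # Otherwise a global memory dump is triggered and injected into the trace.
    return {"result": {"success": True}}
```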
I guess it justifies being a part of the tracing domain then. Having said that, it goes a little against the tracing nature of capturing things into a trace. I'd like to see memory info in devtools and in the task manager, at least with some granularity. Does this mean that tracing should always be started in Chrome?

Sorry, I missed the whole initiative and its design details. Could you point me to where it is made clear that we need to initiate memory profiling and can't make it a poll-triggered visitor?
On 2015/07/21 13:50:29, pfeldman wrote:
> I guess it justifies being a part of tracing domain then. Having said that, it
> goes a little against the tracing nature of capturing things into a trace. I'd
> like to see memory info in devtools and in the task manager, at least with some
> granularity. Does this mean that tracing should be always started in Chrome?

I'm not sure if this addresses your concern, but you can already enable the memory tracing category using the DevTools API by sending a Tracing.start request with 'disabled-by-default-memory-infra' in the categories parameter. The memory dump data can then be collected from Tracing.dataCollected notifications once Tracing.end is called. I don't think tracing can always be running in Chrome, because the underlying buffers tend to take up quite a lot of memory.

> Sorry I missed the whole initiative and its design details. Could you point me
> to where it makes it clear that we need to initiate memory profiling and can't
> make it a poll-triggered visitor?

You can find all the information about the initiative, as well as all design documents, at go/memory-infra. Tracing needs to be enabled because the memory dump manager adds the resulting memory dumps to the trace (which is not present if tracing is disabled). I suppose it would be possible to return the memory dumps directly by value in the Tracing.requestMemoryDump response, but it seems like a lot of extra work that we do not need at the moment. Primiano might be able to give a more detailed explanation on some of this.
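A sketch of the flow petrcermak describes, assuming a generic DevTools protocol client: start tracing with the disabled-by-default-memory-infra category, end tracing, and pick the dump data out of Tracing.dataCollected notifications. The message ids and the collect_memory_dumps helper are illustrative, not part of any real client library.

```python
import json

def tracing_session_messages():
    """Build the two requests the reply above describes, as JSON strings."""
    start = {"id": 1, "method": "Tracing.start",
             "params": {"categories": "disabled-by-default-memory-infra"}}
    end = {"id": 2, "method": "Tracing.end"}
    return [json.dumps(start), json.dumps(end)]

def collect_memory_dumps(notifications):
    """Gather trace events carried by Tracing.dataCollected notifications."""
    events = []
    for note in notifications:
        if note.get("method") == "Tracing.dataCollected":
            events.extend(note["params"]["value"])
    return events
```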
As far as this CL is concerned, yes, this is purely an add-on to the tracing subsystem and falls entirely into that domain. This API is telling memory-infra: please grab a dump and inject it into the trace. This is the way to get reliable memory measurements in telemetry, and is an important precondition for us to automate and improve most of our (today super-noisy) memory metrics in telemetry.

In essence, the memory-infra code (concretely base/trace_event/*memory*) works like this:
- Something (right now only tracing) asks it to grab a memory dump across all chrome subsystems.
- The memory-infra code collects all the data (e.g., queries PartitionAlloc stats, Oilpan stats, malloc stats, v8 stats, etc.)... IPC / PostTask happen yada yada.
- All that information is dumped into the trace.

The best way to get a quick understanding of the core architecture without being overwhelmed by dozens of design docs is this slide deck [1], slides 27-33. Or even better, we can have a VC.
[1] https://docs.google.com/presentation/d/1hDV_8h41-e9dTo3PHZOaotel1cSDM0QwNAMt_...

> I'd like to see memory info in devtools and in the task manager, at least with
> some granularity.

I see where you are aiming and it makes sense. However, my feeling is that right now memory-infra is still in its exploratory phase. We are using tracing as a way to explore the areas we need to cover (concretely: which parts of the codebase to instrument and how they relate to each other), and telemetry is the best way for us to figure out the quality of this data. Exposing some API from memory-infra to (devtools, chrome memory internals), in the form of on-demand collection of memory stats, is perfectly reasonable; it is something that I have in mind and it definitely matches the architecture memory-infra has been designed around. Also, I have some bugs already assigned to that (e.g., crbug.com/25454). We can't get it right now (some internal optimizations dump data straight into TraceValue), but the outer interfaces are unaware of this and are open to these other use cases.

However, at the current stage:
- Before committing to a public API, I'd like to have a clear idea of the data we want to expose. As a concrete example, we're collaborating with GPU folks at the time of writing to rationalize GPU memory across all the platforms.
- Before exposing that information to devtools, I think we should sit down and rationalize what subset of our data makes sense to expose to web developers, and which relates more to chrome internals (I had some chat in the past with isherman@ about that).

> Does this mean that tracing should be always started in Chrome?

At the current state, yes. But I am very keen to consider a way to get on-demand memory stats out of band, without tracing. I just think that we should do that when our dataset is stable. All the technology is there; it is simply too premature right now to commit to making that interface stable and exposing it to the rest of the codebase. I am very happy to discuss more over VC if you want.
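A toy model of the pipeline primiano sketches above: something asks the dump manager for a global dump, each registered subsystem reports its stats, and everything lands in the trace. All class, method, and provider names here are illustrative, not the real base/trace_event/*memory* code.

```python
class ToyMemoryDumpManager:
    def __init__(self):
        self.providers = {}  # subsystem name -> zero-arg stats callable
        self.trace = []      # stands in for the trace buffer

    def register_provider(self, name, provider):
        self.providers[name] = provider

    def request_global_dump(self, tracing_enabled):
        # Today the dump only exists as part of a trace, so the request
        # is refused when tracing is off.
        if not tracing_enabled:
            return False
        dump = {name: provider() for name, provider in self.providers.items()}
        self.trace.append({"memory_dump": dump})
        return True

# Usage: register a couple of fake subsystems and grab a dump.
manager = ToyMemoryDumpManager()
manager.register_provider("malloc", lambda: {"allocated_bytes": 4096})
manager.register_provider("v8", lambda: {"heap_bytes": 8192})
```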
On Tue, Jul 21, 2015 at 8:22 AM, <primiano@chromium.org> wrote:
> As far as this CL is concerned, yes, this is purely an add-on to the tracing
> subsystem and falls entirely into that domain.
> This API is telling to memory-infra: please grab a dump and inject it into the
> trace. This is the way to get reliable memory measurements in telemetry, and is
> an important precondition for us to automate and improve most of our (today
> super-noisy) memory metrics in telemetry.

We might be lost in the terminology, since Tracing means so many things these days. When I say "Tracing", I mean its controller/config, categories, and start and stop capabilities. In terms of remote debugging and telemetry, it is exposed under the "Tracing" domain with the Tracing.* methods and events. Telemetry is not tracing; it is something bigger. It uses the remote debugging protocol to communicate with Chrome and is capable of starting tracing, among other things. It talks to multiple domains such as "Page", "Runtime", "Input", etc.

Now my perception of memory sampling is that it has a poll / sampling nature, while tracing has more of a continuous / instrumentation / flow nature. To get a simple memory sample / snapshot, I don't want to start Tracing, request a memory dump, and stop Tracing. This looks like a tracing misuse. I would rather see a Memory.* domain next to Tracing which is available to Telemetry just as much as tracing is. And every time I need a snapshot, I do "Memory.requestMemoryDump".

> In the ultra essence, the memory-infra code (concretely
> base/trace_event/*memory*) works this:
> - Somethings (right now only tracing) asks it to grab a memory dump across all
> chrome subsystem.
> - The memory-infra code collects all the data (e.g., queries PartitionAlloc
> stats, Oilpan stats, malloc stats, v8 stats, etc etc)... IPC / PostTask happen
> yada yada.
> - All that information is dumped in the trace.

Ok, so pardon my ignorance due to not reading the docs just yet, but I don't see how you need tracing here. You should implement your own 100-line Memory protocol handler and make your information available outside of the tracing infrastructure. Telemetry would have it.

> The best way to get a quick understanding of the core architecture without
> being overwhelmed by dozens of design docs is this slide deck [1], slides 27-33.
> Or even better, we can have a VC.
> [1]
> https://docs.google.com/presentation/d/1hDV_8h41-e9dTo3PHZOaotel1cSDM0QwNAMt_...

> I see where you are aiming at and it makes sense. However, my feeling is that
> right now memory-infra is still in its exploratory phase. We are using tracing
> as a way to explore the areas we need to cover (concretely: which parts of
> codebase to instrument and how do they relate each other) and telemetry is the
> best way for us to figure out the quality of this data.

So, in the terms defined above, do you need tracing or telemetry?

> Exposing some API from memory-infra to (devtools, chrome memory internals), in
> the form of on-demand collection of memory stats is perfectly reasonable, is
> something that I have in mind and definitely matches the architecture
> memory-infra has been designed around. Also, I have some bugs already assigned
> to that (e.g., crbug.com/25454)
> We can't get it right now (some internal optimizations that dumps data straight
> into TraceValue) but the outer interfaces are unaware of this are open to these
> other use cases.

Let's discuss it in greater detail; at least as far as remote debugging / telemetry is concerned, driving things, requesting snapshots, serializing results, etc. is trivial.

> However, at current stage:
> - Before committing to a public API, I'd like to have a clear idea of the data
> we want to expose. As a concrete example, we're collaborating with GPU folks at
> the time of writing to rationalize GPU memory across all the platforms.
> - Before exposing that information to devtools, I think we should sit down and
> rationalize what subset of our data makes sense to expose to web developers,
> and which relates more to chrome internals (I had some chat in the past with
> isherman@ about that)

The remote debugging protocol is not entirely public. We have numerous hidden methods (80% of the protocol), so we can experiment with the data we send out. It is as much subject to change as tracing categories.

> > Does this mean that tracing should be always started in Chrome?
> At current state, yes. But I am very keen to consider a way to get on demand
> memory stats out of band, without tracing. I just think that we should do that
> when our dataset is stable.
> All the technology is there. I just think that right now it is too premature to
> commit to making that interface stable and exposing it to the rest of the
> codebase.

I don't think we need any commitment here; all I am advocating for is a sample-friendly controller API that does not necessarily need to be a part of tracing and inherit its start/stop instrumentation character. Other than that, it seems your code is structured in a way that supports and appreciates this approach.

> I am very happy to discuss more over VC if you want.

Let's definitely do that; put something on my calendar and I'll be happy to help you navigate the remote debugging protocol infrastructure.
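The Memory.* domain pfeldman proposes here is only a sketch; this dict shows what a hidden protocol.json entry for it could look like, mirroring the general shape of existing DevTools domain definitions. Every field value is an assumption, not anything present in this CL.

```python
# Hypothetical protocol.json fragment for the proposed Memory domain,
# expressed as a Python dict for easy inspection.
proposed_memory_domain = {
    "domain": "Memory",
    "hidden": True,  # hidden methods let the team experiment freely
    "commands": [{
        "name": "requestMemoryDump",
        "description": "Request a one-off global memory dump.",
        "returns": [{"name": "success", "type": "boolean"}],
    }],
}
```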
> Now my perception of the memory sampling is that it has a poll / sampling
> nature, while tracing has more of a continuous / instrumentation / flow
> nature.

Correct as it is right now, but the plan for the future would be to also have some TRACE_EVENT_WITH_MEMORY; TL;DR, the codebase triggering memory samples (post v8 GC, post oilpan GC, etc.) which end up in the trace (when enabled).

> To get a simple memory sample / snapshot, I don't want to start
> Tracing, request memory dump and stop Tracing. This looks like a tracing
> misuse. I rather see a Memory.* domain next to Tracing which is available
> to Telemetry just as much as tracing is. And every time I need a snapshot,
> I do "Memory.requestMemoryDump".

All of memory-infra has been designed to dump inside the tracing ecosystem. The reasons for this choice are:

1. Correlation (between trace events and memory dumps). It is very important to understand causation effects (did memory increase / decrease when there was a V8/Oilpan GC event? Or after a raster? Or while parsing HTML?). Having that integrated with the tracing timeline is fundamental to understanding the memory dynamics of chrome internals.

2. Data analysis. Right now the data dumped inside tracing is pretty raw and requires a lot of offline massaging to be rationalized. Today this massaging is done inside the TraceViewer importer. You might argue at this point: so, how do you intend to make use of this data in telemetry, if you don't have trace-viewer there? The answer is:
- (Very) short term: we cherry-pick the few metrics that require very little massaging (and, yes, duplicate that code).
- Medium term: there is an ongoing project to merge the trace-viewer importers into telemetry (timeline based measurement: run trace-viewer importers in telemetry via D8 to process the data and use that in the telemetry benchmarks). References and design docs: https://github.com/google/trace-viewer/issues/1076 and https://github.com/google/trace-viewer/issues/984

In both cases (short and long term) we need an API to inject the trace points into telemetry and make sure that they are time-aligned and correlated with the other trace events.

> Ok, so pardon my ignorance due to not reading the docs just yet, but I
> don't see how you need tracing here. You should implement your own 100 line
> Memory protocol handler and make your information available outside of the
> tracing infrastructure. Telemetry would have it.

The problem is not the 100-line protocol handler; it is the ~10k LOC (it's not a random number) which interprets the memory dump data (and lives in trace-viewer). There is a lot of graph-based math involved there (and more to come). So, to answer your question, the reason why we need tracing is:
- Today tracing is the main (actually the driving) use case for memory-infra: we need to inspect chrome internals to see what happens both in the time dimension (trace events) and the memory dimension.
- Trace-viewer has all the technology to understand the memory dumps. Very soon that technology will be available in telemetry for free (the aforementioned D8 project).

> So in the terms defined above, do you need tracing or telemetry?

Both. We need telemetry to:
- Trigger memory dumps at precise points.
- Turn the post-processed data into benchmark results.

We need those points to end up in tracing because they are time-aligned with trace events AND because the code in tracing knows how to parse and massage the data. In the short term we are going to massage 5% of them via home-brewed (python) telemetry code. But that is only temporary, to address urgent short-term needs (internal stuff, let's discuss offline), and we already have plans to switch to the new timeline based measurement (D8) as soon as it becomes available in telemetry.

> Remote debugging protocol is not entirely public. We have numerous hidden
> methods (80% of the protocol), so we can experiment with the data we send
> out. It is as much a subject to change as tracing categories.

Ok, let's discuss this offline. Probably we want this to be a hidden method, at least for the moment.

> I don't think we need any commitment here, all I am advocating for is a
> sample-friendly controller API that does not necessarily need to be a part
> of tracing and inherit its start/stop instrumentation character. Other than
> that, it seems your code is structured in a way that supports and
> appreciates this approach.

This is correct. The only problem is that, at the moment, the data that we get as output is very complex and hard to digest. Trace-viewer allows us to iterate on it quickly and out-of-band of the chromium codebase (if we make mistakes / improvements, we just need to change the data massaging and can retroactively apply it to existing traces). I will be totally happy to have some digested output available within the chromium codebase itself. But before that I want to understand and rationalize what that output should be, and we are not there yet.

In essence, today the mental model is:

  MemoryInfra::RequestMemoryDump() -> Dumps ***loads*** of data into the trace

It feels like the mental model that we want tomorrow is:

                                         / Loads of data injected into the trace
  MemoryInfra::RequestMemoryDump(depth)
                                         \ Lightweight processed data returned to the caller

Which I'm totally happy to support once we figure out what that "lightweight processed data" should look like. Looking forward to discussing more offline.
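A hypothetical sketch of the "tomorrow" mental model in the message above: one RequestMemoryDump(depth) entry point that both injects the full data into the trace and hands a lightweight digest back to the caller. Every name, the depth parameter, and the stats shapes here are illustrative; nothing like this exists in the CL under review.

```python
TRACE = []  # stands in for the trace buffer

def collect_all_stats():
    # Placeholder for querying every dump provider (malloc, v8, Oilpan, ...).
    return {"malloc": {"total": 1024, "by_bucket": {"a": 512, "b": 512}},
            "v8": {"total": 2048, "by_bucket": {"heap": 2048}}}

def request_memory_dump(depth):
    full = collect_all_stats()
    TRACE.append(full)  # loads of data injected into the trace
    if depth == "light":
        # lightweight processed data returned to the caller
        return {name: stats["total"] for name, stats in full.items()}
    return full
```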
Message was sent while issue was closed.
This patch was superseded by https://codereview.chromium.org/1245233004/. We decided to extend the API exposed by the existing memory benchmarking extension instead of augmenting the DevTools API.
pfeldman: It turns out that I'll have to land the Blink patch (this one) before the Chromium patch. PTAL.
Thanks,
Petr
lgtm, but I think chromium needs to land first.
The CQ bit was checked by petrcermak@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1235393005/1 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1235393005/1
Message was sent while issue was closed.
Committed patchset #1 (id:1) as https://src.chromium.org/viewvc/blink?view=rev&revision=199571
Message was sent while issue was closed.
Patchset 1 (id:??) landed as https://crrev.com/a99accc93d66ea89d963739bfdf976a9ee6ee47d