|
|
Chromium Code Reviews|
Created:
3 years, 8 months ago by DmitrySkiba Modified:
3 years, 7 months ago CC:
catapult-reviews_chromium.org, tracing-review_chromium.org Target Ref:
refs/heads/master Project:
catapult Visibility:
Public. |
Descriptionsymbolize_trace: support new heap dump format.
This CL adds support for the new heap dump format (see the bug for details
about the format).
BUG=chromium:708930
Review-Url: https://codereview.chromium.org/2810523002
Committed: https://chromium.googlesource.com/external/github.com/catapult-project/catapult/+/18b10cbe616e114d9eae4e7a0dd07c8d81a7ef39
Patch Set 1 #Patch Set 2 : Remove everything except symbolization #
Total comments: 15
Patch Set 3 : Add comments; delete more unused code #Patch Set 4 : ParseMore -> ParseNext #
Total comments: 23
Patch Set 5 : We need to go deeper #
Total comments: 18
Patch Set 6 : Address comments #Messages
Total messages: 34 (10 generated)
Description was changed from ========== Symbolize traces with new heap format. This CL adds support for the new heap format (see the bug for details about the format). Additionally this CL adds several new abilities: === Recategorization === Chrome's heap profiler categorizes allocations mostly based on the file path that started the currently running task. I.e. if DoStuff() task was posted by code in "app/main/main.cc", all allocations made by any code called from DoStuff() are attributed with "app/main" category. That gives you some idea of where memory came from, but is not accurate. Recategorization uses bottommost non-trivial frame of an allocation backtrace to determine accurate allocation category. For example, in the following backtrace: main() (app/main/main.cc) DoStuff() (browser/stuff/stuff.cc) CacheStuff() (content/cache/cache.cc) AllocateMemory() (base/allocator/allocator.cc) malloc() malloc() doesn't have a path (i.e. skipped), and "base/allocator" path is trivial by default rules (also skipped), so CacheStuff() frame is picked, and all allocations with this backtrace are categorized as "content/cache". Recategorization is an experimental feature, and is only supported on Android and Linux (where you need to build with symbol_level=2). Relevant script arguments: --categorize --trivial-path-list --trivial-path --category-slice === Treemap generation === Since categories are hierarchical (e.g. "foo/bar/baz"), we can build treemap from them (wikipedia.org/wiki/Treemapping). With treemap it's easy to identify largest top-level contributors, and to drill down into them. Treemaps are generated as HTML files that are placed next to the trace file. Relevant script arguments: --treemap === Collapsing of small stack frame branches === Typical trace from native heap profiler contains ~25K PCs that need to be symbolized. Symbolizing them all gives you the fullest information, but can be slow. If you are not interested in all the details, you can ask the script to collapse small stack frame branches before symbolization. This reduces number of frames, and hence number of PCs that need to be symbolized. But since collapsing happens before symbolization, some collapsed branches will end up being larger (sometimes 10x) than the specified collapse threshold. Relevant script arguments: --collapse-threshold BUG=chromium:708930 ========== to ========== symbolize_trace: support new heap dump format. This CL adds support for the new heap dump format (see the bug for details about the format). Additionally this CL adds several new abilities: === Recategorization === Chrome's heap profiler categorizes allocations mostly based on the file path that started the currently running task. I.e. if DoStuff() task was posted by code in "app/main/main.cc", all allocations made by any code called from DoStuff() are attributed with "app/main" category. That gives you some idea of where memory came from, but is not accurate. Recategorization uses bottommost non-trivial frame of an allocation backtrace to determine accurate allocation category. For example, in the following backtrace: main() (app/main/main.cc) DoStuff() (browser/stuff/stuff.cc) CacheStuff() (content/cache/cache.cc) AllocateMemory() (base/allocator/allocator.cc) malloc() malloc() doesn't have a path (i.e. skipped), and "base/allocator" path is trivial by default rules (also skipped), so CacheStuff() frame is picked, and all allocations with this backtrace are categorized as "content/cache". Recategorization is an experimental feature, and is only supported on Android and Linux (where you need to build with symbol_level=2). Relevant script arguments: --categorize --trivial-path-list --trivial-path --category-slice === Treemap generation === Since categories are hierarchical (e.g. "foo/bar/baz"), we can build treemap from them (wikipedia.org/wiki/Treemapping). With treemap it's easy to identify largest top-level contributors, and to drill down into them. Treemaps are generated as HTML files that are placed next to the trace file. Relevant script arguments: --treemap === Collapsing of small stack frame branches === Typical trace from native heap profiler contains ~25K PCs that need to be symbolized. Symbolizing them all gives you the fullest information, but can be slow. If you are not interested in all the details, you can ask the script to collapse small stack frame branches before symbolization. This reduces number of frames, and hence number of PCs that need to be symbolized. But since collapsing happens before symbolization, some collapsed branches will end up being larger (sometimes 10x) than the specified collapse threshold. Relevant script arguments: --collapse-threshold BUG=chromium:708930 ==========
dskiba@chromium.org changed reviewers: + hjd@chromium.org, primiano@chromium.org
dskiba@chromium.org changed reviewers: + fmeawad@chromium.org
Guys PTAL.
primiano@chromium.org changed reviewers: + ajwong@chromium.org, wez@chromium.org
^__^ Honestly this is way more than what I was expecting here :/ Is there any chance we can at least split this into a CL per feature? You are essentially just rewriting all this and adding extra feature that I am not even sure belong here (some comments later). My long term plan is to not require this file at all in the future (see crbug.com/709526) but now you are adding all sort of features and smartness into something that was a meant to be symbolizer. However, given the time that we spent on this, honestly as long as ajwong and wez are fine with this I will be fine as well (Q: does this still work also on Windows and OSX?) . > === Recategorization === To be honest, here my view would be that we should push the full paths together with symbol names in the symbolized trace, and let the UI deal with the categorization logic. > === Treemap generation === This really sounds like a UI feature. Why are we doing here (With html code inline in the same python source?) > === Collapsing of small stack frame branches === Ditto, this seems to be importer business. Or does this make the difference from a non-importable trace (> 128 MB) to an importable one?)
On 2017/04/12 16:19:02, Primiano Tucci wrote: > ^__^ > Honestly this is way more than what I was expecting here :/ Is there any chance > we can at least split this into a CL per feature? > > You are essentially just rewriting all this and adding extra feature that I am > not even sure belong here (some comments later). My long term plan is to not > require this file at all in the future (see crbug.com/709526) but now you are > adding all sort of features and smartness into something that was a meant to be > symbolizer. > However, given the time that we spent on this, honestly as long as ajwong and > wez are fine with this I will be fine as well (Q: does this still work also on > Windows and OSX?) . Yup, works on macOS. I vote to just land it as is, because all these new things are optional, so by default you'll just get symbolization. > > > === Recategorization === > To be honest, here my view would be that we should push the full paths together > with symbol names in the symbolized trace, and let the UI deal with the > categorization logic. Note that this process also changes heap entries, so it's not that simple. But it can be done in UI too, the only thing is that UI will have to do it every time you open a trace, while symbolize_trace does it only once. > > > === Treemap generation === > This really sounds like a UI feature. Why are we doing here (With html code > inline in the same python source?) Treemaps are very useful in getting top-level picture (see go/zzanv for an example). I agree that treemaps are better done inside UI (where we can also navigate between treemap and stack frames), and I'm happy to deprecate this once UI is done. > > > === Collapsing of small stack frame branches === > Ditto, this seems to be importer business. Or does this make the difference from > a non-importable trace (> 128 MB) to an importable one?) No, this can't be done in the UI, because the very purpose of collapsing is to limit number of PCs that need to be symbolized. Less PCs to symbolize -> faster symbolization + smaller traces. This option is for platforms where symbolization is slow (like macOS). The proper way is to speed up the symbolization, but in the meanwhile you can use this option.
Heya... jumping in here. Seems like you've put a lot of thought and work into this. At this complexity, I think this CL might be too large for an easy review. Also, the code has enough brains in it now that think we should consider unittesting it. Previously, when the script was doing one simple thing (scarping out program counters and feeding them to a binary), I think it was okay to avoid unittesting since verifying whether or not it worked was trivial...even if there was some moderately complex logic around handling ASLR and pipelining multiple addr2line invocations. However, if we're gonna start putting in analysis, then I think we should have tests since otherwise it's too easy to get lost. How hard would it be to break up this CL and then add some basic testing of at least the new code? On 2017/04/12 16:49:19, DmitrySkiba wrote: > On 2017/04/12 16:19:02, Primiano Tucci wrote: > > ^__^ > > Honestly this is way more than what I was expecting here :/ Is there any > chance > > we can at least split this into a CL per feature? > > > > You are essentially just rewriting all this and adding extra feature that I am > > not even sure belong here (some comments later). My long term plan is to not > > require this file at all in the future (see crbug.com/709526) but now you are > > adding all sort of features and smartness into something that was a meant to > be > > symbolizer. > > However, given the time that we spent on this, honestly as long as ajwong and > > wez are fine with this I will be fine as well (Q: does this still work also on > > Windows and OSX?) . > > Yup, works on macOS. I vote to just land it as is, because all these new things > are optional, so by default you'll just get symbolization. > > > > > > === Recategorization === > > To be honest, here my view would be that we should push the full paths > together > > with symbol names in the symbolized trace, and let the UI deal with the > > categorization logic. > > Note that this process also changes heap entries, so it's not that simple. But > it can be done in UI too, the only thing is that UI will have to do it every > time you open a trace, while symbolize_trace does it only once. > > > > > > === Treemap generation === > > This really sounds like a UI feature. Why are we doing here (With html code > > inline in the same python source?) > > Treemaps are very useful in getting top-level picture (see go/zzanv for an > example). I agree that treemaps are better done inside UI (where we can also > navigate between treemap and stack frames), and I'm happy to deprecate this once > UI is done. > > > > > > === Collapsing of small stack frame branches === > > Ditto, this seems to be importer business. Or does this make the difference > from > > a non-importable trace (> 128 MB) to an importable one?) > > No, this can't be done in the UI, because the very purpose of collapsing is to > limit number of PCs that need to be symbolized. Less PCs to symbolize -> faster > symbolization + smaller traces. This option is for platforms where symbolization > is slow (like macOS). The proper way is to speed up the symbolization, but in > the meanwhile you can use this option.
ping... curious on thoughts? On 2017/04/12 23:31:32, awong wrote: > Heya... jumping in here. > > Seems like you've put a lot of thought and work into this. At this complexity, I > think this CL might be too large for an easy review. Also, the code has enough > brains in it now that think we should consider unittesting it. > > Previously, when the script was doing one simple thing (scarping out program > counters and feeding them to a binary), I think it was okay to avoid unittesting > since verifying whether or not it worked was trivial...even if there was some > moderately complex logic around handling ASLR and pipelining multiple addr2line > invocations. > > However, if we're gonna start putting in analysis, then I think we should have > tests since otherwise it's too easy to get lost. > > How hard would it be to break up this CL and then add some basic testing of at > least the new code? > > > On 2017/04/12 16:49:19, DmitrySkiba wrote: > > On 2017/04/12 16:19:02, Primiano Tucci wrote: > > > ^__^ > > > Honestly this is way more than what I was expecting here :/ Is there any > > chance > > > we can at least split this into a CL per feature? > > > > > > You are essentially just rewriting all this and adding extra feature that I > am > > > not even sure belong here (some comments later). My long term plan is to not > > > require this file at all in the future (see crbug.com/709526) but now you > are > > > adding all sort of features and smartness into something that was a meant to > > be > > > symbolizer. > > > However, given the time that we spent on this, honestly as long as ajwong > and > > > wez are fine with this I will be fine as well (Q: does this still work also > on > > > Windows and OSX?) . > > > > Yup, works on macOS. I vote to just land it as is, because all these new > things > > are optional, so by default you'll just get symbolization. > > > > > > > > > === Recategorization === > > > To be honest, here my view would be that we should push the full paths > > together > > > with symbol names in the symbolized trace, and let the UI deal with the > > > categorization logic. > > > > Note that this process also changes heap entries, so it's not that simple. But > > it can be done in UI too, the only thing is that UI will have to do it every > > time you open a trace, while symbolize_trace does it only once. > > > > > > > > > === Treemap generation === > > > This really sounds like a UI feature. Why are we doing here (With html code > > > inline in the same python source?) > > > > Treemaps are very useful in getting top-level picture (see go/zzanv for an > > example). I agree that treemaps are better done inside UI (where we can also > > navigate between treemap and stack frames), and I'm happy to deprecate this > once > > UI is done. > > > > > > > > > === Collapsing of small stack frame branches === > > > Ditto, this seems to be importer business. Or does this make the difference > > from > > > a non-importable trace (> 128 MB) to an importable one?) > > > > No, this can't be done in the UI, because the very purpose of collapsing is to > > limit number of PCs that need to be symbolized. Less PCs to symbolize -> > faster > > symbolization + smaller traces. This option is for platforms where > symbolization > > is slow (like macOS). The proper way is to speed up the symbolization, but in > > the meanwhile you can use this option.
On 2017/04/17 19:53:03, awong wrote: > ping... curious on thoughts? I think you're right, and we need unittests. I'll move everything not related to symbolization to follow-up CLs. > > On 2017/04/12 23:31:32, awong wrote: > > Heya... jumping in here. > > > > Seems like you've put a lot of thought and work into this. At this complexity, > I > > think this CL might be too large for an easy review. Also, the code has enough > > brains in it now that think we should consider unittesting it. > > > > Previously, when the script was doing one simple thing (scarping out program > > counters and feeding them to a binary), I think it was okay to avoid > unittesting > > since verifying whether or not it worked was trivial...even if there was some > > moderately complex logic around handling ASLR and pipelining multiple > addr2line > > invocations. > > > > However, if we're gonna start putting in analysis, then I think we should have > > tests since otherwise it's too easy to get lost. > > > > How hard would it be to break up this CL and then add some basic testing of at > > least the new code? > > > > > > On 2017/04/12 16:49:19, DmitrySkiba wrote: > > > On 2017/04/12 16:19:02, Primiano Tucci wrote: > > > > ^__^ > > > > Honestly this is way more than what I was expecting here :/ Is there any > > > chance > > > > we can at least split this into a CL per feature? > > > > > > > > You are essentially just rewriting all this and adding extra feature that > I > > am > > > > not even sure belong here (some comments later). My long term plan is to > not > > > > require this file at all in the future (see crbug.com/709526) but now you > > are > > > > adding all sort of features and smartness into something that was a meant > to > > > be > > > > symbolizer. > > > > However, given the time that we spent on this, honestly as long as ajwong > > and > > > > wez are fine with this I will be fine as well (Q: does this still work > also > > on > > > > Windows and OSX?) . > > > > > > Yup, works on macOS. I vote to just land it as is, because all these new > > things > > > are optional, so by default you'll just get symbolization. > > > > > > > > > > > > === Recategorization === > > > > To be honest, here my view would be that we should push the full paths > > > together > > > > with symbol names in the symbolized trace, and let the UI deal with the > > > > categorization logic. > > > > > > Note that this process also changes heap entries, so it's not that simple. > But > > > it can be done in UI too, the only thing is that UI will have to do it every > > > time you open a trace, while symbolize_trace does it only once. > > > > > > > > > > > > === Treemap generation === > > > > This really sounds like a UI feature. Why are we doing here (With html > code > > > > inline in the same python source?) > > > > > > Treemaps are very useful in getting top-level picture (see go/zzanv for an > > > example). I agree that treemaps are better done inside UI (where we can also > > > navigate between treemap and stack frames), and I'm happy to deprecate this > > once > > > UI is done. > > > > > > > > > > > > === Collapsing of small stack frame branches === > > > > Ditto, this seems to be importer business. Or does this make the > difference > > > from > > > > a non-importable trace (> 128 MB) to an importable one?) > > > > > > No, this can't be done in the UI, because the very purpose of collapsing is > to > > > limit number of PCs that need to be symbolized. Less PCs to symbolize -> > > faster > > > symbolization + smaller traces. This option is for platforms where > > symbolization > > > is slow (like macOS). The proper way is to speed up the symbolization, but > in > > > the meanwhile you can use this option.
Description was changed from ========== symbolize_trace: support new heap dump format. This CL adds support for the new heap dump format (see the bug for details about the format). Additionally this CL adds several new abilities: === Recategorization === Chrome's heap profiler categorizes allocations mostly based on the file path that started the currently running task. I.e. if DoStuff() task was posted by code in "app/main/main.cc", all allocations made by any code called from DoStuff() are attributed with "app/main" category. That gives you some idea of where memory came from, but is not accurate. Recategorization uses bottommost non-trivial frame of an allocation backtrace to determine accurate allocation category. For example, in the following backtrace: main() (app/main/main.cc) DoStuff() (browser/stuff/stuff.cc) CacheStuff() (content/cache/cache.cc) AllocateMemory() (base/allocator/allocator.cc) malloc() malloc() doesn't have a path (i.e. skipped), and "base/allocator" path is trivial by default rules (also skipped), so CacheStuff() frame is picked, and all allocations with this backtrace are categorized as "content/cache". Recategorization is an experimental feature, and is only supported on Android and Linux (where you need to build with symbol_level=2). Relevant script arguments: --categorize --trivial-path-list --trivial-path --category-slice === Treemap generation === Since categories are hierarchical (e.g. "foo/bar/baz"), we can build treemap from them (wikipedia.org/wiki/Treemapping). With treemap it's easy to identify largest top-level contributors, and to drill down into them. Treemaps are generated as HTML files that are placed next to the trace file. Relevant script arguments: --treemap === Collapsing of small stack frame branches === Typical trace from native heap profiler contains ~25K PCs that need to be symbolized. Symbolizing them all gives you the fullest information, but can be slow. If you are not interested in all the details, you can ask the script to collapse small stack frame branches before symbolization. This reduces number of frames, and hence number of PCs that need to be symbolized. But since collapsing happens before symbolization, some collapsed branches will end up being larger (sometimes 10x) than the specified collapse threshold. Relevant script arguments: --collapse-threshold BUG=chromium:708930 ========== to ========== symbolize_trace: support new heap dump format. This CL adds support for the new heap dump format (see the bug for details about the format). BUG=chromium:708930 ==========
Guys PTAL - I've removed everything except symbolization. I'll re-add features in follow-up CLs, along with unittests.
fmeawad@chromium.org changed reviewers: + lpy@chromium.org
+lpy
https://codereview.chromium.org/2810523002/diff/20001/tracing/bin/symbolize_t... File tracing/bin/symbolize_trace (right): https://codereview.chromium.org/2810523002/diff/20001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:109: class StringMap(object): These classes should have doc strings explaining the basic usage. The name is also pretty generic. When I see StringMap, I expect just a python dictionary. This is clearly different. So what exactly is it doing? What's _modified for, etc?
https://codereview.chromium.org/2810523002/diff/20001/tracing/bin/symbolize_t... File tracing/bin/symbolize_trace (right): https://codereview.chromium.org/2810523002/diff/20001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:135: self._modified = True Is Clear() not reset? This looks *almost* like __init__, but not. I'm confused at the usage and naming... https://codereview.chromium.org/2810523002/diff/20001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:149: def ApplyModifications(self): What are such modifications? Can we use a less generic name here? https://codereview.chromium.org/2810523002/diff/20001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:155: # Serialize into first JSON node, and clear all others. Can we get a "why" in this comment? As a reader, it's hard for me to verify if this makes sense because I don't know why you are doing it. :) https://codereview.chromium.org/2810523002/diff/20001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:158: string_json[:] = [] string_json.clear()? https://codereview.chromium.org/2810523002/diff/20001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:163: self._modified = False This is confusing. Shouldn't it be true? https://codereview.chromium.org/2810523002/diff/20001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:213: types_json[:] = [] types_json.clear()? https://codereview.chromium.org/2810523002/diff/20001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:220: self._modified = False Should this be true? https://codereview.chromium.org/2810523002/diff/20001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:266: if not name.startswith(self._PC_TAG): How about invert the logic to remove the not? https://codereview.chromium.org/2810523002/diff/20001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:273: def __init__(self): Group the __init__?
Okay, last of the trickling comments. So, I think the biggest issue for me is this code is missing overview documentation and comments such that for future readers of the code, it's had to know what assumptions you are making. Adding a file level comment explaining the major classes, and the major API expectations plus usage would go a long way to helping. https://codereview.chromium.org/2810523002/diff/20001/tracing/bin/symbolize_t... File tracing/bin/symbolize_trace (right): https://codereview.chromium.org/2810523002/diff/20001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:490: def name(self): For these properties, having a docstring that explains their usage and contents is useful. If the concept is general, then making a file-level comment would be good. Right now, I'm unsure the distinction between name, and unique_name. Also, what is in a stack_frame_map? Or a type_name_map? https://codereview.chromium.org/2810523002/diff/20001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:503: def memory_dumps(self): Why is this one plural? https://codereview.chromium.org/2810523002/diff/20001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:671: if not process.memory_map: Comment explaining when this can occur? https://codereview.chromium.org/2810523002/diff/20001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:701: # Encapsulates platform-specific symbolization logic. Turn into docstring. https://codereview.chromium.org/2810523002/diff/20001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:870: ANDROID_PATH_MATCHER = re.compile( This is hardish to read and matching paths with regexps always scares me. What about using os.path.split? Or will that be worse to read?
Okay, last of the trickling comments. So, I think the biggest issue for me is this code is missing overview documentation and comments such that for future readers of the code, it's had to know what assumptions you are making. Adding a file level comment explaining the major classes, and the major API expectations plus usage would go a long way to helping.
I've added comments, PTAL.
*** Check Out This Weird Trick To Get Your CL Reviewed ***
Some quick comments on the file-level documentation (which is overall a nice summary :) https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... File tracing/bin/symbolize_trace (right): https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:15: sampled during tracing period (and 'pid' is an example of a predefined key). nit: "(and 'pid' is an example..." reads oddly here - it's not clear why you are mentioning it at this point. Is it just so you can use it in examples, or is it intended to give an idea of what the dictionaries contain? If the latter then I'd suggest moving the example to come after "have some predefined keys". https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:17: This script cares only about memory dump events generated by memory-infra nit: Suggest "...dump events in trace files generated with the memory-infra component enabled." https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:26: Chrome: malloc, blink_gc, and partition_alloc. nit: If these are examples, not an exhaustive list then precede with "e.g." https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:29: above. nit: If we failed to trace all the way back to main() then might it technically be multiple trees? https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:37: 6. Rewrites stack frame names (this updates parts of memory dump events). nit: You're missing #5 ;) It's also not clear what the difference is between symbolizing and rewriting the stack frame name - is the distinction meaningful here? One detail that isn't obvious [to me] which may be worth mentioning: do we do anything clever if two PCs symbolize to the same thing? https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:47: a single metadata event to get full stack frame tree for a process. IIUC the point here is that every "event" in a legacy dump is a complete dump of everything live at that point - I'm not sure that the memory dump vs metadata distinction is worth mentioning here? https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:51: the stack frame tree that occurred since the previous memory dump event. You might express this as each memory-infra event being incremental, and then separately talk about the type name and string mappings being handled the same way, for clarity. https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:54: they put everything in the first node, and clear all others. Not sure what you mean about moving everything into the first node.
Added more comments - PTAL. https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... File tracing/bin/symbolize_trace (right): https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:15: sampled during tracing period (and 'pid' is an example of a predefined key). On 2017/04/29 00:41:21, Wez wrote: > nit: "(and 'pid' is an example..." reads oddly here - it's not clear why you are > mentioning it at this point. Is it just so you can use it in examples, or is it > intended to give an idea of what the dictionaries contain? If the latter then > I'd suggest moving the example to come after "have some predefined keys". Done. https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:17: This script cares only about memory dump events generated by memory-infra On 2017/04/29 00:41:21, Wez wrote: > nit: Suggest "...dump events in trace files generated with the memory-infra > component enabled." Done. https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:26: Chrome: malloc, blink_gc, and partition_alloc. On 2017/04/29 00:41:21, Wez wrote: > nit: If these are examples, not an exhaustive list then precede with "e.g." This is actually an exhaustive list. https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:29: above. On 2017/04/29 00:41:21, Wez wrote: > nit: If we failed to trace all the way back to main() then might it technically > be multiple trees? It's still a single tree, just with an implicit root node. https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:37: 6. Rewrites stack frame names (this updates parts of memory dump events). On 2017/04/29 00:41:21, Wez wrote: > nit: You're missing #5 ;) > > It's also not clear what the difference is between symbolizing and rewriting the > stack frame name - is the distinction meaningful here? > > One detail that isn't obvious [to me] which may be worth mentioning: do we do > anything clever if two PCs symbolize to the same thing? Done. Added note about script not coalescing such entries. https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:47: a single metadata event to get full stack frame tree for a process. On 2017/04/29 00:41:21, Wez wrote: > IIUC the point here is that every "event" in a legacy dump is a complete dump of > everything live at that point - I'm not sure that the memory dump vs metadata > distinction is worth mentioning here? Both formats dump live objects per allocator in each memory dump event. The difference is that in modern format no information escapes memory dump event, hence the stack frame tree (which is constantly updated) is dumped incrementally. In the legacy format the information escaped in form of special metadata events, which were dumped once. I've added an extensive explanation. https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:51: the stack frame tree that occurred since the previous memory dump event. On 2017/04/29 00:41:21, Wez wrote: > You might express this as each memory-infra event being incremental, and then > separately talk about the type name and string mappings being handled the same > way, for clarity. Done. https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:54: they put everything in the first node, and clear all others. On 2017/04/29 00:41:21, Wez wrote: > Not sure what you mean about moving everything into the first node. Explained more.
This looks great; left one last set of suggestions. :) https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... File tracing/bin/symbolize_trace (right): https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:26: Chrome: malloc, blink_gc, and partition_alloc. On 2017/05/02 06:19:59, DmitrySkiba wrote: > On 2017/04/29 00:41:21, Wez wrote: > > nit: If these are examples, not an exhaustive list then precede with "e.g." > > This is actually an exhaustive list. OK; in that case I would say "There are three allocators in Chrome (malloc, ...)" - being specific about the number helps the reader understand that it's intended to be exhaustive. (And be aware that shortly after you commit this CL, someone will add the N+1th allocator ;) https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:29: above. On 2017/05/02 06:19:59, DmitrySkiba wrote: > On 2017/04/29 00:41:21, Wez wrote: > > nit: If we failed to trace all the way back to main() then might it > technically > > be multiple trees? > > It's still a single tree, just with an implicit root node. OK; you could add a brief note that effect here, for clarity, but up to you. https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:37: 6. Rewrites stack frame names (this updates parts of memory dump events). On 2017/05/02 06:19:59, DmitrySkiba wrote: > On 2017/04/29 00:41:21, Wez wrote: > > nit: You're missing #5 ;) > > > > It's also not clear what the difference is between symbolizing and rewriting > the > > stack frame name - is the distinction meaningful here? > > > > One detail that isn't obvious [to me] which may be worth mentioning: do we do > > anything clever if two PCs symbolize to the same thing? > > Done. Added note about script not coalescing such entries. Acknowledged. https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:47: a single metadata event to get full stack frame tree for a process. On 2017/05/02 06:19:59, DmitrySkiba wrote: > On 2017/04/29 00:41:21, Wez wrote: > > IIUC the point here is that every "event" in a legacy dump is a complete dump > of > > everything live at that point - I'm not sure that the memory dump vs metadata > > distinction is worth mentioning here? > > Both formats dump live objects per allocator in each memory dump event. The > difference is that in modern format no information escapes memory dump event, > hence the stack frame tree (which is constantly updated) is dumped > incrementally. In the legacy format the information escaped in form of special > metadata events, which were dumped once. I've added an extensive explanation. Thanks for adding this detail, however it seems a lot of documentation for what could be expressed more concisely, if I understand correctly, e.g: -- In both the old and new formats each memory dump includes a complete list of all active allocations, each including references to metadata describing the allocation stack, object type, etc. In the old format the metadata is accumulated over the course of trace recording, and appended to it as a single metadata event, separate from the individual memory dumps. To process the dumps in a trace the metadata must first be processed, in its entirety. In the new format each memory dump includes any new metadata introduced by that dump, to add to any metadata accumulated from preceding dumps in the same recording. Each dump in the recording can therefore be processed as soon as it is encountered. For simplicity this script updates the metadata and writes it out as part of the first memory dump entry in the symbolized recording. -- IIUC the key thing to get across here is that this makes parsing for memory-dumps a single-pass operating where previously it was effectively two-pass (one to find the metadata, one to parse the dumps). https://codereview.chromium.org/2810523002/diff/80001/tracing/bin/symbolize_t... File tracing/bin/symbolize_trace (right): https://codereview.chromium.org/2810523002/diff/80001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:44: * In the modern format stack frame tree, type name mapping, and string nit: " ... modern format the stack frame ..." Otherwise it reads as though you mean "In the (modern format stack frame tree).."
Folks, I will be really really honest here. This CL has involved 4 people for more than 3 weeks and I start wondering whether all this is really bringing a correspondingly large advantage to all of us. My humble, pragmatic, and perhaps a bit too brutal opinion is that If this script works and doesn't break the existing cases and allows us to make progress I think we should move on, as we all have a huge queue of problems to solve. I think that a 3 weeks bike-shedding on a python script which is not shipped into chrome isn't on the top of anybody's list. I am not an OWNER (which is a bit of the problem here, we have 4 people here on this and nobody is an owner) but unless there are major concerns left this LGTM with outstanding comments addressed. https://codereview.chromium.org/2810523002/diff/80001/tracing/bin/symbolize_t... File tracing/bin/symbolize_trace (right): https://codereview.chromium.org/2810523002/diff/80001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:269: # def __init__(self, node): Are these commented lines intentional ? I think the pattern is to define them and raise NotImplementedError, and mark them as @override below https://codereview.chromium.org/2810523002/diff/80001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:373: class UnsupportedHeapDumpVersionError(Exception): No need to change it now, but for the future I honestly think that all this specialized exception doesn't add anything more than a assert(...), "Unsupported version") down below where you use this. https://codereview.chromium.org/2810523002/diff/80001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:414: if heap_dump_version != Trace.HEAP_DUMP_VERSION_1: Here I would have just done assert(heap_dump_version != Trace.HEAP_DUMP_VERSION_1) https://codereview.chromium.org/2810523002/diff/80001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:427: self._Insert(0, '[null]') is it intentional that clear does this _Insert and __init__ does not? should you maybe call clear() from __init__ ? https://codereview.chromium.org/2810523002/diff/80001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:448: for strings_node in self._strings_nodes: maybe when you do this add a comment explaining that you want to actually empty the content of the node but keep the existing references valid. otherwise this looks a bit mysterious. https://codereview.chromium.org/2810523002/diff/80001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:1163: # Android trace files don't have any indication they are from Android. As per discussion offline, maybe specify: traces captured on Android via chrome://inspect?tracing don't seem to have all the metadata necessary to figure out the platform.... https://codereview.chromium.org/2810523002/diff/80001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:1185: class MultilineHelpFormatter(argparse.HelpFormatter): For a one file python script, having a custom formatter for argparse feels imho a bit overshooting . Is it really worth the extra complexity here? https://codereview.chromium.org/2810523002/diff/80001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:1227: if os.path.exists(backup_file_path): isn't this a bit too much and really worth the complexity? I mean if the backup file exists already, shame, will be overwritten. I don't think we care about this special case so much to justify this extra complexity
LGTM Comments: - Lets make sure that users know that there is no guarantee if the version is not matching - This works on all Platforms, including Android in the build path - It would be nice to add how to use this script section in the top comment. If possible, since there is many top level functions that can be unittested, can be added in later CLs https://codereview.chromium.org/2810523002/diff/80001/tracing/bin/symbolize_t... File tracing/bin/symbolize_trace (right): https://codereview.chromium.org/2810523002/diff/80001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:1163: # Android trace files don't have any indication they are from Android. On 2017/05/03 17:25:04, Primiano Tucci wrote: > As per discussion offline, maybe specify: traces captured on Android via > chrome://inspect?tracing don't seem to have all the metadata necessary to figure > out the platform.... look for os-name in the metadata
https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... File tracing/bin/symbolize_trace (right): https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:26: Chrome: malloc, blink_gc, and partition_alloc. On 2017/05/03 00:17:09, Wez wrote: > On 2017/05/02 06:19:59, DmitrySkiba wrote: > > On 2017/04/29 00:41:21, Wez wrote: > > > nit: If these are examples, not an exhaustive list then precede with "e.g." > > > > This is actually an exhaustive list. > > OK; in that case I would say "There are three allocators in Chrome (malloc, > ...)" - being specific about the number helps the reader understand that it's > intended to be exhaustive. > > (And be aware that shortly after you commit this CL, someone will add the N+1th > allocator ;) Acknowledged. https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:29: above. On 2017/05/03 00:17:10, Wez wrote: > On 2017/05/02 06:19:59, DmitrySkiba wrote: > > On 2017/04/29 00:41:21, Wez wrote: > > > nit: If we failed to trace all the way back to main() then might it > > technically > > > be multiple trees? > > > > It's still a single tree, just with an implicit root node. > > OK; you could add a brief note that effect here, for clarity, but up to you. Acknowledged. https://codereview.chromium.org/2810523002/diff/60001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:47: a single metadata event to get full stack frame tree for a process. On 2017/05/03 00:17:10, Wez wrote: > On 2017/05/02 06:19:59, DmitrySkiba wrote: > > On 2017/04/29 00:41:21, Wez wrote: > > > IIUC the point here is that every "event" in a legacy dump is a complete > dump > > of > > > everything live at that point - I'm not sure that the memory dump vs > metadata > > > distinction is worth mentioning here? > > > > Both formats dump live objects per allocator in each memory dump event. The > > difference is that in modern format no information escapes memory dump event, > > hence the stack frame tree (which is constantly updated) is dumped > > incrementally. In the legacy format the information escaped in form of special > > metadata events, which were dumped once. I've added an extensive explanation. > > Thanks for adding this detail, however it seems a lot of documentation for what > could be expressed more concisely, if I understand correctly, e.g: > > -- > In both the old and new formats each memory dump includes a complete list of all > active allocations, each including references to metadata describing the > allocation stack, object type, etc. > > In the old format the metadata is accumulated over the course of trace > recording, and appended to it as a single metadata event, separate from the > individual memory dumps. To process the dumps in a trace the metadata must first > be processed, in its entirety. > > In the new format each memory dump includes any new metadata introduced by that > dump, to add to any metadata accumulated from preceding dumps in the same > recording. Each dump in the recording can therefore be processed as soon as it > is encountered. > > For simplicity this script updates the metadata and writes it out as part of the > first memory dump entry in the symbolized recording. > -- > > IIUC the key thing to get across here is that this makes parsing for > memory-dumps a single-pass operating where previously it was effectively > two-pass (one to find the metadata, one to parse the dumps). Well, the section is named "Details", and details it provides :) Your understanding is close to what happens, modulo naming. So I guess my explanation worked! https://codereview.chromium.org/2810523002/diff/80001/tracing/bin/symbolize_t... File tracing/bin/symbolize_trace (right): https://codereview.chromium.org/2810523002/diff/80001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:44: * In the modern format stack frame tree, type name mapping, and string On 2017/05/03 00:17:10, Wez wrote: > nit: " ... modern format the stack frame ..." > > Otherwise it reads as though you mean "In the (modern format stack frame > tree).." Done. https://codereview.chromium.org/2810523002/diff/80001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:269: # def __init__(self, node): On 2017/05/03 17:25:05, Primiano Tucci wrote: > Are these commented lines intentional ? I think the pattern is to define them > and raise NotImplementedError, and mark them as @override below The thing is that their exact shape is not determined with respect to arguments. I guess I'll just remove them. https://codereview.chromium.org/2810523002/diff/80001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:373: class UnsupportedHeapDumpVersionError(Exception): On 2017/05/03 17:25:05, Primiano Tucci wrote: > No need to change it now, but for the future I honestly think that all this > specialized exception doesn't add anything more than a assert(...), "Unsupported > version") down below where you use this. I wanted to surface the version that caused the error - so either I copy-paste same assert in several places, or I incapsulate the assert in a method, or I create a specific exception class that does the formatting. https://codereview.chromium.org/2810523002/diff/80001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:414: if heap_dump_version != Trace.HEAP_DUMP_VERSION_1: On 2017/05/03 17:25:05, Primiano Tucci wrote: > Here I would have just done > assert(heap_dump_version != Trace.HEAP_DUMP_VERSION_1) Acknowledged. https://codereview.chromium.org/2810523002/diff/80001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:427: self._Insert(0, '[null]') On 2017/05/03 17:25:05, Primiano Tucci wrote: > is it intentional that clear does this _Insert and __init__ does not? > should you maybe call clear() from __init__ ? __init__() (or rather ParseNext) wraps existing nodes, which always have ID #0. When we're clearing all mappings we need to carry string #0 over. I added a comment. https://codereview.chromium.org/2810523002/diff/80001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:448: for strings_node in self._strings_nodes: On 2017/05/03 17:25:05, Primiano Tucci wrote: > maybe when you do this add a comment explaining that you want to actually empty > the content of the node but keep the existing references valid. otherwise this > looks a bit mysterious. See comments at the top of the file. "Details" explains how we're updating incremental nodes, NodeWrapper explains what "wrapper" is (reference to a node in a larger JSON). https://codereview.chromium.org/2810523002/diff/80001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:1185: class MultilineHelpFormatter(argparse.HelpFormatter): On 2017/05/03 17:25:05, Primiano Tucci wrote: > For a one file python script, having a custom formatter for argparse feels imho > a bit overshooting . Is it really worth the extra complexity here? Hmm, actually this is a leftover from a version that had categorization. Removed. https://codereview.chromium.org/2810523002/diff/80001/tracing/bin/symbolize_t... tracing/bin/symbolize_trace:1227: if os.path.exists(backup_file_path): On 2017/05/03 17:25:05, Primiano Tucci wrote: > isn't this a bit too much and really worth the complexity? > I mean if the backup file exists already, shame, will be overwritten. I don't > think we care about this special case so much to justify this extra complexity Also a leftover from a previous versions. Removed.
The CQ bit was checked by dskiba@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from primiano@chromium.org, fmeawad@chromium.org Link to the patchset: https://codereview.chromium.org/2810523002/#ps100001 (title: "Address comments")
CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
CQ is committing da patch.
Bot data: {"patchset_id": 100001, "attempt_start_ts": 1493873420363430,
"parent_rev": "0d00147b4f72f81b0e71e7accf70090c3aceec16", "commit_rev":
"18b10cbe616e114d9eae4e7a0dd07c8d81a7ef39"}
Message was sent while issue was closed.
Description was changed from ========== symbolize_trace: support new heap dump format. This CL adds support for the new heap dump format (see the bug for details about the format). BUG=chromium:708930 ========== to ========== symbolize_trace: support new heap dump format. This CL adds support for the new heap dump format (see the bug for details about the format). BUG=chromium:708930 Review-Url: https://codereview.chromium.org/2810523002 Committed: https://chromium.googlesource.com/external/github.com/catapult-project/catapu... ==========
Message was sent while issue was closed.
Committed patchset #6 (id:100001) as https://chromium.googlesource.com/external/github.com/catapult-project/catapu... |
