|
|
Created:
6 years, 6 months ago by Primiano Tucci (use gerrit) Modified:
6 years, 5 months ago CC:
chromium-reviews, klundberg+watch_chromium.org, bulach+watch_chromium.org, yfriedman+watch_chromium.org, ilevy-cc_chromium.org, bulach, Philippe Base URL:
https://chromium.googlesource.com/chromium/src.git@master Visibility:
Public. |
Description[Android] Introduce libheap_profiler for memory profiling.
libheap_profiler is a standalone, dependency free and efficient library
aimed to profile memory allocations. In a nutshell it is a replacement
for Android's libc.debug.malloc (which is terribly broken, see the BUG).
This change introduces three new targets:
- libheap_profiler.so: the snoopy library which tracks allocation info.
- heap_dump: executable to dump the allocation info for a given process.
- heap_profiler_unittests: because everybody loves unittests.
Goals:
- Be fast and efficient: the lib intended to be preloaded in the Zygote
and trace bulky processes like Chrome.
- Hook calls to mmap and malloc and grab stack traces associated to them.
- Don't tie particularly to Android, so it can be reused later for Linux.
- Be extensible, in the sense that allocations made by custom allocators,
can be individually visible (instead of just all their mmaped arenas).
- JSON output, as it is meant to be consumed by other tools, not humans.
At the moment, this CL introduces support just for Android.
BUG=382489
NOTRY=true
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=281542
Patch Set 1 #
Total comments: 8
Patch Set 2 : s/hdump/heap_dump/, simplify JSON, refactor heap_dump main #
Total comments: 20
Patch Set 3 : Remove EXPORT from API decl, not needed #
Total comments: 3
Patch Set 4 : Add integration tests + pasko comments #
Total comments: 8
Patch Set 5 : Address pasko@#13 + rebase against move trees in 323873004 #
Total comments: 24
Patch Set 6 : Nits #
Total comments: 21
Patch Set 7 : address bulach@ review #
Total comments: 8
Patch Set 8 : Test refactor (bulach@ + pasko@) #Patch Set 9 : Fix gyp files #Patch Set 10 : git cl format FTW #Patch Set 11 : Fix gyp #
Total comments: 4
Patch Set 12 : typedef for bulach@ #
Total comments: 2
Patch Set 13 : Re-refactor test + git cl format #
Total comments: 7
Patch Set 14 : s/vma/alloc/ #Messages
Total messages: 45 (0 generated)
This is the most scary piece of black magic I ever wrote Egor, could you PTAL to tools/android/heap_profiler/heap_profiler.c and hdump.c ? I can find somebody else to split the load for the linker black magic in heap_profiler_hooks_android.c. Many thanks!
just random drive-by, this looks really promising!!! https://codereview.chromium.org/323893002/diff/1/tools/android/heap_profiler/... File tools/android/heap_profiler/hdump.c (right): https://codereview.chromium.org/323893002/diff/1/tools/android/heap_profiler/... tools/android/heap_profiler/hdump.c:27: // 3: {"total": 17, "frames": [1074792772, 1100849864, 1100850688, ...]}, I suppose this is running on the device, right? transmitting over adb can be slow, so I'd suggest minifying this.. say, "t" for "total", "f" for frames... (similarly above..). also, remove all the neat indentation :) it's much easier to prettify in the hsot side. https://codereview.chromium.org/323893002/diff/1/tools/android/heap_profiler/... tools/android/heap_profiler/hdump.c:58: int main(int argc, char** argv) { I'd suggest moving the main to the bottom, and the other static functions above it.. also, how about "heapdump.c" ? https://codereview.chromium.org/323893002/diff/1/tools/android/heap_profiler/... tools/android/heap_profiler/hdump.c:88: signal(SIGINT, exit_handler); could split this into a few smaller functions, say |bool stop_process()| here.. https://codereview.chromium.org/323893002/diff/1/tools/android/heap_profiler/... tools/android/heap_profiler/hdump.c:110: } similarly, something like |get_mem_fd()|, and |get_maps()| https://codereview.chromium.org/323893002/diff/1/tools/android/heap_profiler/... File tools/android/heap_profiler/third_party/bsd-trees/README.chromium (right): https://codereview.chromium.org/323893002/diff/1/tools/android/heap_profiler/... tools/android/heap_profiler/third_party/bsd-trees/README.chromium:15: None. this needs to be separate, in a patch of its own, following http://www.chromium.org/developers/adding-3rd-party-libraries
Thanks Marcus! CC: +pliard, just FYI, you might need it in your "future life". Also, after all the memory inspector reviews, I felt I should share this little lovely piece of black magic. :) https://codereview.chromium.org/323893002/diff/1/tools/android/heap_profiler/... File tools/android/heap_profiler/hdump.c (right): https://codereview.chromium.org/323893002/diff/1/tools/android/heap_profiler/... tools/android/heap_profiler/hdump.c:27: // 3: {"total": 17, "frames": [1074792772, 1100849864, 1100850688, ...]}, On 2014/06/09 17:23:28, bulach wrote: > I suppose this is running on the device, right? > transmitting over adb can be slow, so I'd suggest minifying this.. > say, "t" for "total", "f" for frames... (similarly above..). > also, remove all the neat indentation :) > it's much easier to prettify in the hsot side. Oh, nice point. It's a lot of data actually (perhaps I should consider catting through gzip on the device) https://codereview.chromium.org/323893002/diff/1/tools/android/heap_profiler/... tools/android/heap_profiler/hdump.c:58: int main(int argc, char** argv) { On 2014/06/09 17:23:28, bulach wrote: > I'd suggest moving the main to the bottom, and the other static functions above > it.. > also, how about "heapdump.c" ? Done. To be honest in the long term I consdider merging this into memdump. It's nicer to have one device-side tool for all the memory operations. But it takes some extra steps, let's do it later. https://codereview.chromium.org/323893002/diff/1/tools/android/heap_profiler/... tools/android/heap_profiler/hdump.c:110: } On 2014/06/09 17:23:28, bulach wrote: > similarly, something like |get_mem_fd()|, and |get_maps()| Done.
+torne@ for the linker magic in heap_profiler_hooks_android.c
Dynamic linking trickery looks okay to me :)
I like the approach, especially if used with release binaries!! I just took the first scan over the heap_profiler.c, will continue reading tomorrow. https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... File tools/android/heap_profiler/heap_profiler.gyp (right): https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... tools/android/heap_profiler/heap_profiler.gyp:8: # libheap_profiler is the library which will be preloaded in the Zygote / nit: you could mention LD_PRELOAD here to be more specific about the type of preloading Probably makes sense to mention that it's Android Zygote, not the Chrome Zygote (even though on Android it should be obvious for most readers). https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... tools/android/heap_profiler/heap_profiler.gyp:30: 'target_name': 'heap_dump', nit: change mentions of "hdump" in the commit message to "heap_dump" https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... File tools/android/heap_profiler/heap_profiler.h (right): https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... tools/android/heap_profiler/heap_profiler.h:37: // sum of VMA instances' length which .bt == this. nit: second sentence in this comment just repeats the first https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... File tools/android/heap_profiler/heap_profiler_hooks_android.c (right): https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... tools/android/heap_profiler/heap_profiler_hooks_android.c:30: // And their actual definitions. s/definitions/declarations/ https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... tools/android/heap_profiler/heap_profiler_hooks_android.c:67: abort(); // This world has gone wrong. Good night Vienna. please do perror("open") before aborting to print the error message. https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... tools/android/heap_profiler/heap_profiler_hooks_android.c:74: const size_t depth = get_backtrace(frames, HEAP_PROFILER_MAX_DEPTH); this function returns int, then converts to size_t, then uint32_t. Please unify. https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... tools/android/heap_profiler/heap_profiler_hooks_android.c:80: static uint32_t get_flags_for_mmap(int fd) { it is worth a comment: 1. what flags==0 means 2. why ignoring file mapping 3. how some shared memory mappings fly below the radar and why this does not affect Android (I am not totally sure btw) https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... tools/android/heap_profiler/heap_profiler_hooks_android.c:91: unwind_and_record_alloc(ret, size, get_flags_for_mmap(fd)); I did not check properly, but it seems this function resets errno, please make sure to set errno properly. Having a test for that would be nice too. https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... tools/android/heap_profiler/heap_profiler_hooks_android.c:179: state->have_skipped_self = true; Is this to skip the get_backtrace() from the trace or _Unwind_Backtrace() or up to the overridden function? How about marking appropriate amount of functions noinline or alwaysinline to make sure the final output would not change with a different compiler? https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... tools/android/heap_profiler/heap_profiler_hooks_android.c:189: stack_crawl_state_t state = {.frames = frames, .max_depth = max_depth}; I ❤ C
not lgtm Let's continue the review after we decide what to do about the comment below, it would change the design a bit. https://codereview.chromium.org/323893002/diff/120001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler_hooks_android.c (right): https://codereview.chromium.org/323893002/diff/120001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_hooks_android.c:133: void* ret = real_malloc(byte_count); The real_malloc can do mmap for a bigger region, hence we would be doing double-accounting of the malloc-ed memory. The question is, does our particular libc call mmap via PLT, internal function or a syscall. I think the most robust solution to count this properly would be to replace the RTLD_NEXT malloc with our own allocator that calls mmap. Other suggestions are welcome :)
https://codereview.chromium.org/323893002/diff/120001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler_hooks_android.c (right): https://codereview.chromium.org/323893002/diff/120001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_hooks_android.c:133: void* ret = real_malloc(byte_count); On 2014/06/11 12:39:24, pasko wrote: > The real_malloc can do mmap for a bigger region, hence we would be doing > double-accounting of the malloc-ed memory. The question is, does our particular > libc call mmap via PLT, internal function or a syscall. Nope, in this case we'd see the malloc but not the mmap. LD_PRELOAD will have no effect for mmap calls made inside bionic itself. The only case to make it happen, probably, is by defining mmap inside bionic as a weak symbol, which is not the case. However, even if such a thing would happen, the library is smart enough and will not double count. If you look closely to heap_profiler.c:352 you'll find out why. The nice point of having a map of the process VA is that I'm able to don't count twice. There is even a unittest test case which covers that (Realloc) > I think the most robust solution to count this properly would be to replace the > RTLD_NEXT malloc with our own allocator that calls mmap. Other suggestions are > welcome :) This means that every 4 byte allocation will end up in a 4k page... let's not do that. Also that would hide the overhead coming from whatever allocator is used, which I don't want definitely to hide.
https://codereview.chromium.org/323893002/diff/120001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler_hooks_android.c (right): https://codereview.chromium.org/323893002/diff/120001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_hooks_android.c:133: void* ret = real_malloc(byte_count); By the way, memory can also be consumed by brk/sbrk. The question I am not sure how to answer: should we override brk/sbrk via LD_PRELOAD as well? On 2014/06/11 13:44:45, Primiano Tucci wrote: > On 2014/06/11 12:39:24, pasko wrote: > > The real_malloc can do mmap for a bigger region, hence we would be doing > > double-accounting of the malloc-ed memory. The question is, does our > particular > > libc call mmap via PLT, internal function or a syscall. > > Nope, in this case we'd see the malloc but not the mmap. > LD_PRELOAD will have no effect for mmap calls made inside bionic itself. My point is rather about intercepting allocators in general, since even if it works for bionic today, it may stop working later (unlikely due to performance reasons though, see below), also some configuation change of the allocator can make the approach stop working. Since I think the tool is valuable as purely binary 'interception' tool, we cannot really rely on the current libc implementation/configuration properties that are not mandated by the standard. malloc is allowed to call mmap by the standard. Also, are we interested in using this approach on Linux some time later? === So how do allocators work? Typically allocators reserve large arenas via mmap/brk/sbrk then return small chunks from these large areas. An allocator does not always fallback to mmap/brk/sbrk, since it can reuse old "allocated" regions, but all the memory it gets is coming from one of these 2 sources (brk and sbrk are really one source). If an allocator performs a call to one of these functions to acquire an arena via the dynamic symbol (i.e. via PLT, i.e. the one we redefined), then it's a problem: the function would be overriden by LD_PRELOAD. A C library designed like that is probably a bad C library, so maybe indeed we should forget about this case. Reading code of allocators is not trivial: 1. glibc calls the internal function, which the PLT entry is an alias to (i.e. we are safe) 2. partalloc calls mmap via PLT (partalloc does not expose malloc/free/etc from any library, but who knows, maybe sometimes it will, say, for a component build?) 3. tcmalloc: dunno, seems to fire syscalls directly 4. dlmalloc: depends on how it's configured, in bionic it seems to call sbrk, but from the code I am not sure whether it's a relative offset or a path through GOT&PLT > The only case to make it happen, probably, is by defining mmap inside bionic as > a weak symbol, which is not the case. Do you mean LD_PRELOAD would not override a strong symbol? Or do you mean the library call to "itself" would not look through GOT? That's not true. > However, even if such a thing would happen, the library is smart enough and will > not double count. > If you look closely to heap_profiler.c:352 you'll find out why. Can you describe the design? Does it detect doubly-allocated stuff and automatically frees it? What if malloc got covered by biger mmap range, evicting the small range from known ranges, then we call free() on the original malloc, so now is there an unallocated hole in the giant mmap or there isn't? > The nice point of having a map of the process VA is that I'm able to don't count > twice. > There is even a unittest test case which covers that (Realloc) > > > > I think the most robust solution to count this properly would be to replace > the > > RTLD_NEXT malloc with our own allocator that calls mmap. Other suggestions are > > welcome :) > This means that every 4 byte allocation will end up in a 4k page... let's not do > that. I meant rather pulling some allocator just like partalloc, call mmap from there directly and have control over what map does there. Every 4 byte allocation will end up in *some* 4k page, but there will be many 4 byte allocations on the same 4k page. > Also that would hide the overhead coming from whatever allocator is used, which > I don't want definitely to hide. That's an argument I can understand. Do you think heap profiles would be much different if we replace the allocator with a faster one? Or do you plan to run precise perf measurements with heap profiler enabled?
> By the way, memory can also be consumed by brk/sbrk. The question I am not sure > how to answer: should we override brk/sbrk via LD_PRELOAD as well? I am not aware, at least in chrome, about stuff using (s)brk, but yes in principle thing we should probably hook that as well. (I can do that in a subsequent cl and add a TODO here). > My point is rather about intercepting allocators in general, since even if it > works for bionic today, it may stop working later (unlikely due to performance > reasons though, see below), also some configuation change of the allocator can > make the approach stop working. Can it? How? > malloc is allowed to call mmap by the standard. I'll tell you more, 99% of the malloc implementations end up calling mmap. How do you get pages from the kenel in the first place (if you exclude sbrk)? :) > Also, are we interested in using this approach on Linux some time later? Likely yes > If an allocator performs a call to one of these functions to acquire an arena > via the dynamic symbol (i.e. via PLT, i.e. the one we redefined), then it's a > problem: the function would be overriden by LD_PRELOAD. A C library designed > like that is probably a bad C library, so maybe indeed we should forget about > this case. Right. It is unlikely to happen. But again even if that would happen is not a problem, see below. > 2. partalloc calls mmap via PLT (partalloc does not expose malloc/free/etc from > any library, but who knows, maybe sometimes it will, say, for a component > build?) Yes, and: Option a: partalloc does not expose malloc to Blink (i.e. it gets called via the WtfFastMalloc interface). So we'll see just the mmaps for the moment. Is not highly detailed, but is still correct. Option b: partalloc exposes malloc and lives in the same .so of its callers (Blink). In this case we'd still see only mmap because the call to malloc inside blink will likely not go through the PLT. Option c: partalloc exposes and exports malloc and something outside of the .so calls it. In this case we'd see both malloc and mmap, but we're still not double counting (see below). > 3. tcmalloc: dunno, seems to fire syscalls directly Really? Where? Technically it is possible but sounds an insane idea using syscall to call mmap, especially if you know you can rely on the libc wrapping and exporting mmap for you. > 4. dlmalloc: depends on how it's configured, in bionic it seems to call sbrk, I assume in any reasonable modern piece of software dlmalloc is configured to use mmap. For the only reason that mmap allows to release holes of memory (in pages granularity) later. sbrk not (unless you release the pages on the boundary) > but from the code I am not sure whether it's a relative offset or a path through > GOT&PLT How do you expect to see that form the code? That is a choice of the linker (unless the code calls a private internal symbol). If you have two symbols (mmap and malloc) defined and exported publicly in the same .so (or exe), in the general case the call will be direct (i.e. PC-relative if the code is P.I., and not using PLT). Eventually you can force the linker to jump through PLT (e.g. as a side effect of setting mmap as a weak symbol, or using some magic __attribute__). > > The only case to make it happen, probably, is by defining mmap inside bionic > as > > a weak symbol, which is not the case. Right. > Do you mean LD_PRELOAD would not override a strong symbol? > Or do you mean the library call to "itself" would not look through GOT? Precisely > That's not true. Are you sure? Anyways, doesn't really matter (but I'm curious about the answer here). > Can you describe the design? Does it detect doubly-allocated stuff and > automatically frees it? It always frees by design the area when you allocate something, simple no? :) > What if malloc got covered by biger mmap range, evicting the small range from > known ranges, then we call free() on the original malloc, so now is there an > unallocated hole in the giant mmap or there isn't? Right. But remember that the purpose of this library is "attribuition", not counting memory. Memory can be (and is) counted looking at the true mmaps. The point here, instead, is figure out which call site allocated what (and how that grew over time). Also, keep in mind that mmap allocations are "flagged", so the client can distinguish VMAs coming from malloc and from mmap. So the idea is that the client of this library (the memory_inspector) looks at the mmaps allocations only if can't figure out any other thing in that range. More in details, the memory inspector (the outer tool) looks at the true mmaps first, in order to get the first "coarse" dataset. Then, in the context of the known mmaps seen, will look into the finer grained information coming from this heap profiler. If the single allocs are available, we're happy and we just look at that (the allocator overhead can be inferred substracting the true mmaps - the seen allocations) If the single allocs are not visible, the closest approximation we get are the mmaps. So in conclusion: if we hook mmap malloc and free, the mmap leftover pieces are irrelevant, because the MI will ignore them. Instead if we see only the mmap, it will associate all the mmaped pages of the allocator (with no holes, since we don't see the freee) to the call site on which the mmap happened. Which, I know, is not ideal, but is better than nothing. > That's an argument I can understand. Do you think heap profiles would be much > different if we replace the allocator with a faster one? The core principle of all this library is that I don't want (in the case of Chrome) / cannot (in the case of WebView) recompile / change the allocator code. It is too much trouble. The main point is: Either we hook PartitionAlloc somehow (in future I'll look into that), so we can see individual alloc/frees. Or at the moment we'll get the "gray blob" of its partition pages mmaped. > Or do you plan to run > precise perf measurements with heap profiler enabled? Perf and memory measurement at the same time seems a very unrealistic goal. The point here is the memory attribution to call sites and eventually overhead of 3rd allocators, not performances. Thanks for all the discussion by the way. Regardless what will be implemented, it's an interesting conversation. :)
Egor, can you take another look ? I addressed the comments (you made some very good catches inside!) I added the integration test as you requested (still a bit skeptical to be honest, feels like we're testing the compiler/liner behavior here rather than the library). https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... File tools/android/heap_profiler/heap_profiler.gyp (right): https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... tools/android/heap_profiler/heap_profiler.gyp:8: # libheap_profiler is the library which will be preloaded in the Zygote / On 2014/06/10 16:59:52, pasko wrote: > nit: > > you could mention LD_PRELOAD here to be more specific about the type of > preloading > > Probably makes sense to mention that it's Android Zygote, not the Chrome Zygote > (even though on Android it should be obvious for most readers). Done. https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... tools/android/heap_profiler/heap_profiler.gyp:30: 'target_name': 'heap_dump', On 2014/06/10 16:59:52, pasko wrote: > nit: > change mentions of "hdump" in the commit message to "heap_dump" Done. https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... File tools/android/heap_profiler/heap_profiler_hooks_android.c (right): https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... tools/android/heap_profiler/heap_profiler_hooks_android.c:30: // And their actual definitions. On 2014/06/10 16:59:52, pasko wrote: > s/definitions/declarations/ Well, technically speaking this is a variable definition, as it is allocating storage for the variable itself. A declaration would be something like "extern int foo;" https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... tools/android/heap_profiler/heap_profiler_hooks_android.c:67: abort(); // This world has gone wrong. Good night Vienna. On 2014/06/10 16:59:52, pasko wrote: > please do perror("open") before aborting to print the error message. Done. https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... tools/android/heap_profiler/heap_profiler_hooks_android.c:74: const size_t depth = get_backtrace(frames, HEAP_PROFILER_MAX_DEPTH); On 2014/06/10 16:59:52, pasko wrote: > this function returns int, then converts to size_t, then uint32_t. Please unify. Done. https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... tools/android/heap_profiler/heap_profiler_hooks_android.c:80: static uint32_t get_flags_for_mmap(int fd) { On 2014/06/10 16:59:52, pasko wrote: > it is worth a comment: > 1. what flags==0 means Added a comment to explain what flags are. > 2. why ignoring file mapping This is not responsbility of this library at all. This library just makes the information available and doesn't ignore anything. Eventually the client will ignore it. Hence, I think that talking about ignoring here is pointless. > 3. how some shared memory mappings fly below the radar and why this does not > affect Android (I am not totally sure btw) I think that this is something I'll document in the overall library docs (when I'll have an html page explaining all the profiler tricks) https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... tools/android/heap_profiler/heap_profiler_hooks_android.c:91: unwind_and_record_alloc(ret, size, get_flags_for_mmap(fd)); On 2014/06/10 16:59:52, pasko wrote: > I did not check properly, but it seems this function resets errno, please make > sure to set errno properly. Having a test for that would be nice too. This is a very good catch. I didn't think about errno. Thanks https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... tools/android/heap_profiler/heap_profiler_hooks_android.c:179: state->have_skipped_self = true; On 2014/06/10 16:59:52, pasko wrote: > Is this to skip the get_backtrace() from the trace or _Unwind_Backtrace() or up > to the overridden function? The latter. > How about marking appropriate amount of functions noinline or alwaysinline to > make sure the final output would not change with a different compiler? Right. Added always_inline to get_backtrace and unwind_and_record_alloc.
thanks! Sounds promising, I'll take another dive tomorrow.
one more pass through hooks heap_profiler.c: pending https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... File tools/android/heap_profiler/heap_profiler.gyp (right): https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... tools/android/heap_profiler/heap_profiler.gyp:30: 'target_name': 'heap_dump', On 2014/06/19 12:27:17, Primiano Tucci wrote: > On 2014/06/10 16:59:52, pasko wrote: > > nit: > > change mentions of "hdump" in the commit message to "heap_dump" > > Done. really done? I meant the commit message too :) https://codereview.chromium.org/323893002/diff/200001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler_hooks_android.c (right): https://codereview.chromium.org/323893002/diff/200001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_hooks_android.c:64: // When available, it tels whether we're in the zygote (=0) or forked (=1) s/tels/tells/ https://codereview.chromium.org/323893002/diff/200001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_hooks_android.c:89: flags |= HEAP_PROFILER_FLAGS_IN_ZYGOTE; maybe _not_ worth a comment, but please just educate me: how is this bit of information going to be used? is it to say that non-zygote means what was "consumed by the application"? https://codereview.chromium.org/323893002/diff/200001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_hooks_android.c:130: if (ret != NULL) { mremap returns MAP_FAILED on error (i.e. -1) also my headers say that flags is of type int, an you name the variable differently from the other 'flags' below? https://codereview.chromium.org/323893002/diff/200001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_hooks_android.c:164: heap_profiler_free(ptr, 0, &flags); this function can also modify errno
Thanks https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... File tools/android/heap_profiler/heap_profiler.gyp (right): https://codereview.chromium.org/323893002/diff/80001/tools/android/heap_profi... tools/android/heap_profiler/heap_profiler.gyp:30: 'target_name': 'heap_dump', > really done? I meant the commit message too :) OMG, sorry, I renamed everything but the commit message. Now is really done :) https://codereview.chromium.org/323893002/diff/200001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler_hooks_android.c (right): https://codereview.chromium.org/323893002/diff/200001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_hooks_android.c:64: // When available, it tels whether we're in the zygote (=0) or forked (=1) On 2014/06/20 22:07:18, pasko wrote: > s/tels/tells/ Done. https://codereview.chromium.org/323893002/diff/200001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_hooks_android.c:89: flags |= HEAP_PROFILER_FLAGS_IN_ZYGOTE; On 2014/06/20 22:07:18, pasko wrote: > maybe _not_ worth a comment, but please just educate me: > > how is this bit of information going to be used? is it to say that non-zygote > means what was "consumed by the application"? Kind of it. Means that the allocation was made by the zygote (i.e. before forking and becoming the actual app's process). Essentially means "don't really bother getting crazy with these allocations. Even if they're big, there is nothing you have the power to change because they're made by android code, not by your app's code". https://codereview.chromium.org/323893002/diff/200001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_hooks_android.c:130: if (ret != NULL) { On 2014/06/20 22:07:18, pasko wrote: > mremap returns MAP_FAILED on error (i.e. -1) Good catch, thanks. Actually all the mmap* functions return MAP_FAILED (which is -1 -> 0xffffffff), != NULL. > also my headers say that flags is of type int, an you name the variable > differently from the other 'flags' below? I wish I had an explanation, but on Android, bionic defines the flags in mremap as long (conversely to the others, which are int). I'm just following that :-/ See: https://android.googlesource.com/platform/bionic/+/master/libc/include/sys/mm... https://codereview.chromium.org/323893002/diff/200001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_hooks_android.c:164: heap_profiler_free(ptr, 0, &flags); On 2014/06/20 22:07:18, pasko wrote: > this function can also modify errno Right. Introduced wrapper also for heap_profiler_free.
Looked through heap_profiler.c, I am actually quite impressed how elegant, logical and compact it is. Great stuff! Some thoughts/questions below. And I did not look at the tests yet. Who is reviewing tests? https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler.c (right): https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler.c:7: // (read: munmap). Also, it is very fast and its presence in the system its nit (sorry, I'm picky in inappropriate places quite often): s/very fast/low overhead/ ? https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler.c:19: // 1) A RB-Tree of non overlapping virtual memory regions (VMAs) sorted by their nit: s/non overlapping/non-overlapping/ https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler.c:102: // Is none is found, create a new one from the stack_traces array and add it None is found https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler.c:247: // starts exactly @ |del_start| if any (for dealing with free(ptr)). nit: s/@/at/ please https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler.c:309: insert_vma(del_end + 1, old_end, st, vma->flags); we may OOM here, the worst consequence seems to be hitting the assert when attempting to free this region, which is probably fine, but inconsistent with the heap_profiler_alloc() path where we would just avoid accounting the memory for the chunk. Also we ignore memory consumption if we are OOM on stack traces. https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler.c:351: delete_vmas_in_range(addr, size); ugh, having to remove ranges is counter-intuitive to me. Does it mean we missed the deallocation? Do we trust our metrics after that? How often does it happen that this call actually frees some VMAs? Can we put that to stats? https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler.c:370: memset(stats, 0, sizeof(HeapStats)); isn't this memory mapped from /dev/zero? https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler.h (right): https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler.h:11: #define HEAP_PROFILER_MAGIC_MARKER 0x42beef42L I know this is perfectly random, but can you make it randomer by actually reading from /dev/{u,}random? https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler_hooks_android.c (right): https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_hooks_android.c:1: // Copyright 2014 The Chromium Authors. All rights reserved. why does the file name contain "android"? wouldn't most of it work on Linux? https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_hooks_android.c:149: discard_alloc(ptr, size, /*old_flags=*/NULL); nit: a single space character after the comment would look 0.01% nicer
Thanks for the review! Don't worry about the tests, as soon as bulach@ comes in I'll bribe him with a Cask ALE (That's how we rock in London @ 9.30 AM :P) Also, I contacted http://www.dranneperry.co.uk/bg.asp and if we go there toghether we can get a group discount. :-P https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler.c (right): https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler.c:7: // (read: munmap). Also, it is very fast and its presence in the system its On 2014/06/23 20:09:23, pasko wrote: > nit (sorry, I'm picky in inappropriate places quite often): > > s/very fast/low overhead/ ? Done. https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler.c:19: // 1) A RB-Tree of non overlapping virtual memory regions (VMAs) sorted by their On 2014/06/23 20:09:23, pasko wrote: > nit: s/non overlapping/non-overlapping/ Done. https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler.c:102: // Is none is found, create a new one from the stack_traces array and add it On 2014/06/23 20:09:23, pasko wrote: > None is found Actually that was an "If" https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler.c:247: // starts exactly @ |del_start| if any (for dealing with free(ptr)). On 2014/06/23 20:09:23, pasko wrote: > nit: > s/@/at/ please Done. https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler.c:309: insert_vma(del_end + 1, old_end, st, vma->flags); On 2014/06/23 20:09:23, pasko wrote: > we may OOM here, the worst consequence seems to be hitting the assert when > attempting to free this region Hmm which assert? This shouldn't hit asserts in case of OOM. >, which is probably fine, but inconsistent with > the heap_profiler_alloc() path where we would just avoid accounting the memory > for the chunk. Yeah, but on the other side I don't have much clues on how to deal with OOM during a free. The best I can do is guaranteeing that the data structures are consistent and don't count the 2nd piece. We can't do much more than lying if we OOM. > Also we ignore memory consumption if we are OOM on stack traces. Correct. Doesn't make a lot of sense keeping track of allocations if we can't tell where they were allocated from (it's the all point of this library). Note that the current sizing is ratio 4:1 (max allocs vs max stack traces) which far beyond reality (in most of the cases the same paths keep allocating over and over and over). If we start hitting that limit we can revisit these values. However, for the moment, I feel these numbers are gigantic even for chrome :) https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler.c:351: delete_vmas_in_range(addr, size); On 2014/06/23 20:09:23, pasko wrote: > ugh, having to remove ranges is counter-intuitive to me. Does it mean we missed > the deallocation? Do we trust our metrics after that? How often does it happen > that this call actually frees some VMAs? Can we put that to stats? This is to be paranoid in the case we miss some free. This is to guarantee that all the ranges in the tree are non-overlapping. If this assumption doesn't hold, all the logic in the code above might fail terribly (probably crash? I don't even want to bother thinking that). > How often does it happen that this call actually frees some VMAs? The only case I can think about is some GL library which free's up some mmaps in kernel space, thus being completely transparent in user space. I don't know if this is the case today (just a technically possible hypothesis), but if that should happen, I much rather prefer losing some information than crashing. > Can we put that to stats? I feel this would be a noisy metric which will tell us very little. Even if it's the case, there is very little we can do. Personally I wouldn't bother adding the extra code to track something that is unlikely to happen (and in the case it will happen, it will be unactionable) https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler.c:370: memset(stats, 0, sizeof(HeapStats)); On 2014/06/23 20:09:23, pasko wrote: > isn't this memory mapped from /dev/zero? This is essentially for the tests, which init/cleanup more than once during the lifetime of the process (without remapping /dev/zero) https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler.h (right): https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler.h:11: #define HEAP_PROFILER_MAGIC_MARKER 0x42beef42L On 2014/06/23 20:09:23, pasko wrote: > I know this is perfectly random, but can you make it randomer by actually > reading from /dev/{u,}random? ^__^ You won't believe it: primiano@primiano:~$ python -c "print open('/dev/urandom').read(4).encode('hex')" 42beef42 ;-) Joking aside, I choose that because just because: - it's easier to spot in debug / gdb if something goes wrong - it's uniquely exclusive: try to search for 0x42beef42 on google or code.ohloh.net and you'll be amazed. (From now on, 0x42beef42 is trademarked as "the Primiano's constant") https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler_hooks_android.c (right): https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_hooks_android.c:1: // Copyright 2014 The Chromium Authors. All rights reserved. On 2014/06/23 20:09:23, pasko wrote: > why does the file name contain "android"? wouldn't most of it work on Linux? In theory most of it should work on Linux. In practice I haven't tested it and there are some small nitty gritty header diffs w.r.t. Android as the one you found in mremap. This is the long term plan to be honest. But until I haven't tested this on Linux, I don't want to lie and cause headaches to somebody else who might get intrigued by the name. Forthe future I was thinking in renaming this into heap_profiler_hooks_posix.cc and specializing the diff functions in ..._android and ..._linux. https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_hooks_android.c:149: discard_alloc(ptr, size, /*old_flags=*/NULL); On 2014/06/23 20:09:23, pasko wrote: > nit: a single space character after the comment would look 0.01% nicer I can't find the reference in the code style anymore (we might have removed it), but last time I checked I remember that was telling to not put a space there. A search in codereview seems to agree: \/\*\w+=\*\/\w -> 32 results \/\*\w+=\*\/\s+\w -> 7 results
lgtm for the tests, thanks! some nits and suggestions below: https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler_integrationtest.cc (right): https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:21: __attribute__ ((noinline)) void* MallocStep2(size_t size) { nit: maybe format like: __attribute__ ((noinline))\n void* MallocStep2(size_t size) { also...StepN doesn't seem too informative.. how about MallocInner / MallocOuter, or something? https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:40: __attribute__ ((noinline)) void* MmapStep2(size_t size) { ditto for the N, maybe MmapInner, MmapOuter https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:57: const size_t expected_fn_length = 16; nit: kExpectedFnLength ...no idea how easy would it be, but maybe have some inlined asm with N noops? https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:59: for(size_t i = 0; i < HEAP_PROFILER_MAX_DEPTH; ++i) { nit: space after for https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:79: void* m1 = MallocStep1(1000000); nit: maybe these numbers could be kSizeSmall / Medium / Large? https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:91: for(size_t i = 0; i < stats->max_stack_traces; ++i) { nit: space after for https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:121: EXPECT_EQ(3000000, total_alloc_start - total_alloc_end); nit: perhaps kSizeLarge - kSizeSmall.. also, for completeness, perhaps ensure that stats->stack_traces is empty or that none of the steps above are not found? https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:124: TEST(HeapProfilerIntegrationTest, TestMmapStackTraces) { this and the previous test is almost the same... perhaps could have a helper function that takes three function pointers for the allocators, three sizes and one for the free/munmap? https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:129: static const size_t m3_size = 509 * PAGE_SIZE; nit: kMMapSizeN https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:143: for(size_t i = 0; i < stats->max_stack_traces; ++i) { nit: space after for https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:152: else if (st->alloc_bytes == m2_size) { nit: previous line https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:159: else if (st->alloc_bytes == m3_size) { nit: previous line
https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler.c (right): https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler.c:309: insert_vma(del_end + 1, old_end, st, vma->flags); On 2014/06/24 08:43:33, Primiano Tucci wrote: > On 2014/06/23 20:09:23, pasko wrote: > > we may OOM here, the worst consequence seems to be hitting the assert when > > attempting to free this region > Hmm which assert? This shouldn't hit asserts in case of OOM. I thought it would assert on the line 257 when we try to free the region we lost track of. Would it? > >, which is probably fine, but inconsistent with > > the heap_profiler_alloc() path where we would just avoid accounting the memory > > for the chunk. > > Yeah, but on the other side I don't have much clues on how to deal with OOM > during a free. > The best I can do is guaranteeing that the data structures are consistent and > don't count the 2nd piece. That's my preference. Thanks for doing it. > We can't do much more than lying if we OOM. > > > Also we ignore memory consumption if we are OOM on stack traces. > Correct. Doesn't make a lot of sense keeping track of allocations if we can't > tell where they were allocated from (it's the all point of this library). > Note that the current sizing is ratio 4:1 (max allocs vs max stack traces) which > far beyond reality (in most of the cases the same paths keep allocating over and > over and over). > If we start hitting that limit we can revisit these values. However, for the > moment, I feel these numbers are gigantic even for chrome :) How do we determine that we are out of limits? Would you please add stats for it? https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler.c:351: delete_vmas_in_range(addr, size); On 2014/06/24 08:43:33, Primiano Tucci wrote: > > Can we put that to stats? > I feel this would be a noisy metric which will tell us very little. Even if it's > the case, there is very little we can do. Personally I wouldn't bother adding > the extra code to track something that is unlikely to happen (and in the case it > will happen, it will be unactionable) I would argue that it can be actionable. If the metric increases out of the blue, there are options: 1. don't use the specific device where it happens 2. bisect commits to find what seemingly irrelevant changes could lead to this behavior 3. observe the magnitude of the effect (tells us that memory consumption below threshold X are not worth investigating because they could have been caused by this effect that we cannot really explain) More data is good! And we get it with small overhead and small increase in complexity here. Treat is as an assert, it checks our assumptions. Even if we don't know why the assumptions are not met, we can get creative at figuring out the causes.
Thanks both https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler.c (right): https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler.c:309: insert_vma(del_end + 1, old_end, st, vma->flags); > I thought it would assert on the line 257 when we try to free the region we lost > track of. Would it? Hmm no, as a general principle we never *barf* if we're asked to free an empty region. Line 257 barfs only if the tree is inconsistent (keys are not monotonic, or regions overlap). > How do we determine that we are out of limits? Would you please add stats for > it? Playing a bit with chrome: Browser process: Allocs: 49756, Stacks: 4897 Renderer process: Allocs: 43858 Stacks: 5599 https://codereview.chromium.org/323893002/diff/220001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler.c:351: delete_vmas_in_range(addr, size); > I would argue that it can be actionable. If the metric increases out of the > blue, there are options: > 1. don't use the specific device where it happens Really? You stop using a device just because you know that the GL driver causes a +X MB offset? > 2. bisect commits to find what seemingly irrelevant changes could lead to this > behavior Hmm. What would you bisect in such case? Chrome? Android? the Kernel? Any combination of these can lead into that scenario. In my opinion we have two possible scenarios: - Something (e.g., GL driver init) "escapes" one-off the analysis and creates a constant offset: we won't really see it in the final analysis, as at the end we'll look into deltas per categories. So, probably we won't even notice - Something escapes constantly the analysis (e.g., we lose a mmap for each GL dma buffer transfer). In that case, we'll see something monstruous coming out from the charts. We'll easily spot it and put into a "don't care" category or blacklist in the profiler. Anyways, can we keep it aside as a future improvement? Doesn't looks a top priority at the moment and I'd much rather see this plugged with the memory inspector and in action :) > Treat is as an assert, it checks our assumptions. Even if we don't know why the > assumptions are not met, we can get creative at figuring out the causes. Hmm believe me, if we cause a crash in the zygote / app_process debugging that one (before the device completes the bootstrap) is not really pleasant :) Anyways, can we keep this for later? https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler_integrationtest.cc (right): https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:21: __attribute__ ((noinline)) void* MallocStep2(size_t size) { On 2014/06/24 10:45:14, bulach wrote: > nit: maybe format like: > __attribute__ ((noinline))\n > void* MallocStep2(size_t size) { > > > also...StepN doesn't seem too informative.. how about MallocInner / MallocOuter, > or something? Like it. Done https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:40: __attribute__ ((noinline)) void* MmapStep2(size_t size) { On 2014/06/24 10:45:14, bulach wrote: > ditto for the N, maybe MmapInner, MmapOuter Done. https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:57: const size_t expected_fn_length = 16; On 2014/06/24 10:45:13, bulach wrote: > nit: kExpectedFnLength > ...no idea how easy would it be, but maybe have some inlined asm with N noops? Yeah, I thought about that, but unfortunately couldn't come with a portable way of adding nops :/ (also inline asm would make this entire thing even more gross) And unfortunately the GCC aligned attribute doesn't apply to functions. https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:59: for(size_t i = 0; i < HEAP_PROFILER_MAX_DEPTH; ++i) { On 2014/06/24 10:45:14, bulach wrote: > nit: space after for Done. https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:79: void* m1 = MallocStep1(1000000); On 2014/06/24 10:45:14, bulach wrote: > nit: maybe these numbers could be kSizeSmall / Medium / Large? Going for kSize1/2/3: their sense is not really be small/large, just be different. https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:91: for(size_t i = 0; i < stats->max_stack_traces; ++i) { On 2014/06/24 10:45:14, bulach wrote: > nit: space after for Done. https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:121: EXPECT_EQ(3000000, total_alloc_start - total_alloc_end); On 2014/06/24 10:45:14, bulach wrote: > nit: perhaps kSizeLarge - kSizeSmall.. > > also, for completeness, perhaps ensure that stats->stack_traces is empty Well, I can check for the delta, they will never be empty. The test harness itself creates some allocations :) > or that none of the steps above are not found? Done https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:124: TEST(HeapProfilerIntegrationTest, TestMmapStackTraces) { On 2014/06/24 10:45:13, bulach wrote: > this and the previous test is almost the same... > > perhaps could have a helper function that takes three function pointers for the > allocators, three sizes and one for the free/munmap? Hmm I did some cleanup + factored the LookupStackTraceBySize function. Extreme factoring of the entire test function seems a bit odd (also I'd split that in two parts at least, one for before-free, one for the after free). Is it OK now? https://codereview.chromium.org/323893002/diff/240001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:129: static const size_t m3_size = 509 * PAGE_SIZE; On 2014/06/24 10:45:13, bulach wrote: > nit: kMMapSizeN Done.
still lgtm for tests, but some clarifications inline: https://codereview.chromium.org/323893002/diff/260001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler_integrationtest.cc (right): https://codereview.chromium.org/323893002/diff/260001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:135: TEST(HeapProfilerIntegrationTest, TestMmapStackTraces) { nit: I still think it'd be clearer to have one inner function taking the sizes / allocators / free'ers as params, the whole body is *exactly* the same except for those params.. https://codereview.chromium.org/323893002/diff/260001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler_unittest.cc (right): https://codereview.chromium.org/323893002/diff/260001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_unittest.cc:462: int main(int argc, char** argv) { nit: the main() for _unittest is normally on a separate file, heap_profiler_gtest_main.cc. since this is straightforward, it may be possible to remove it from here and just include this on the gyp target: https://code.google.com/p/chromium/codesearch#chromium/src/testing/gtest/src/... this would allow eventually other _unittest.cc to be bundled in the same executable. note: for the integration test I think it's fine to keep as is..
LGTM for: heap_profiler.{c,h} heap_profiler_hooks_android.c heap_profiler_integrationtest.cc Thanks and have fun profiling! https://codereview.chromium.org/323893002/diff/260001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler_integrationtest.cc (right): https://codereview.chromium.org/323893002/diff/260001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:25: // that this function will be part of the stack trace. this trick is not done for DoMmap, is that for some particular reason or just forgotten? https://codereview.chromium.org/323893002/diff/260001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:82: const StacktraceEntry* LookupStackTraceBySize(size_t size) { maybe search by size _and_ a function to filter out spurious matches, possible causes of flake?
https://codereview.chromium.org/323893002/diff/260001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler_integrationtest.cc (right): https://codereview.chromium.org/323893002/diff/260001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:25: // that this function will be part of the stack trace. On 2014/06/25 16:39:50, pasko wrote: > this trick is not done for DoMmap, is that for some particular reason or just > forgotten? the *trick* is essentially a way to avoid that the function loses the stack frame, which I definitely don't want for the outer/inner functions. I don't care, instead, if DoMmap loses the stack frame due to tail call optim (I never look for the presence of DoMmap in the stack frames). At this point you might wonder: why did you introduce DoMmap at all so? It's simple: if I do the mmap() directly inside MallocInner, the generated code is large (because of all the arguments passed to mmap) and the 16-byte offset assumpion will not hold on all platforms. ;-) https://codereview.chromium.org/323893002/diff/260001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:82: const StacktraceEntry* LookupStackTraceBySize(size_t size) { On 2014/06/25 16:39:50, pasko wrote: > maybe search by size _and_ a function to filter out spurious matches, possible > causes of flake? Done. https://codereview.chromium.org/323893002/diff/260001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:135: TEST(HeapProfilerIntegrationTest, TestMmapStackTraces) { On 2014/06/25 11:19:33, bulach wrote: > nit: I still think it'd be clearer to have one inner function taking the sizes / > allocators / free'ers as params, the whole body is *exactly* the same except for > those params.. Ok, what about now? https://codereview.chromium.org/323893002/diff/260001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler_unittest.cc (right): https://codereview.chromium.org/323893002/diff/260001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_unittest.cc:462: int main(int argc, char** argv) { On 2014/06/25 11:19:33, bulach wrote: > nit: the main() for _unittest is normally on a separate file, > heap_profiler_gtest_main.cc. > > since this is straightforward, it may be possible to remove it from here and > just include this on the gyp target: > https://code.google.com/p/chromium/codesearch#chromium/src/testing/gtest/src/... > > this would allow eventually other _unittest.cc to be bundled in the same > executable. Oh nice, I didn't know about the existence of that file. Removed and fixed the gyp. > > note: for the integration test I think it's fine to keep as is.. Yeah, integration test has its own black magic.
+brettw for DEPS Note: bsdtrees is going to be introduced by https://codereview.chromium.org/323873004/ (which got the LG, I'm trying to figure out whether that is sufficient or I have to wait for some other legal reviewer)
lgtm, thanks! looks nicer. small suggestions, feel free to go ahead.. https://codereview.chromium.org/323893002/diff/340001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler_integrationtest.cc (right): https://codereview.chromium.org/323893002/diff/340001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:64: void Initialize(void* (*outer_fn)(size_t), void* (*inner_fn)(size_t)) { nit: could do with a typedef void* (*AllocatorFn)(size_t), then here it'd be AllocatorFn outer_fn, AllocatorFn inner_fn.. https://codereview.chromium.org/323893002/diff/340001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:139: void* m3 = MallocInner(kSize3); nit: could move these three and 151-153 to Initialize?
https://codereview.chromium.org/323893002/diff/340001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler_integrationtest.cc (right): https://codereview.chromium.org/323893002/diff/340001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:64: void Initialize(void* (*outer_fn)(size_t), void* (*inner_fn)(size_t)) { On 2014/06/26 12:49:38, bulach wrote: > nit: could do with a typedef void* (*AllocatorFn)(size_t), then here it'd be > AllocatorFn outer_fn, AllocatorFn inner_fn.. Done. https://codereview.chromium.org/323893002/diff/340001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:139: void* m3 = MallocInner(kSize3); On 2014/06/26 12:49:38, bulach wrote: > nit: could move these three and 151-153 to Initialize? Hmm I tried but it looks odd. I can move the allocation inside Init using the outer/inner fn pointer. However, I cannot move the free (because munmap takes a 2nd param). The overall result feels "unbalanced" and hard to follow. You have to realize that the actual allocation is made inside Init following the function pointer assignment. I'd keep it more readable. Also, the overall lines count wouldn't decrease, as I have to add three extra fields in the class to allocate them in a single place, and we're back to 6. :) This is how it would look like. void Initialize(AllocatorFn outer_fn, AllocatorFn inner_fn) { outer_fn_ = reinterpret_cast<uintptr_t>(outer_fn); inner_fn_ = reinterpret_cast<uintptr_t>(inner_fn); m1_ = outer_fn(kSize1) m2_ = outer_fn(kSize2) m3_ = outer_fn(kSize3) } ... TEST_F(HeapProfilerIntegrationTest, TestMmapStackTraces) { Initialize(&MmapOuter, &MmapInner); munmap(m3_, kSize3); TestAfterAllAllocatedAnd3Freed(); ... protected: void* m1_; void* m2_; void* m3_;
still lgtm, sorry I wasn't clear from the beginning.. https://codereview.chromium.org/323893002/diff/360001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler_integrationtest.cc (right): https://codereview.chromium.org/323893002/diff/360001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:146: free(m2); you could create a dummy wrapper around free that takes a second unused param :) then this and the following test would just delegate passing: &MallocOuter, &MallocInner, &Free and &MmapOuter, &MmapInner, &munmap which would also make the "Initialize" into a single helper function, and won't need extra members, right?
https://codereview.chromium.org/323893002/diff/360001/tools/android/heap_prof... File tools/android/heap_profiler/heap_profiler_integrationtest.cc (right): https://codereview.chromium.org/323893002/diff/360001/tools/android/heap_prof... tools/android/heap_profiler/heap_profiler_integrationtest.cc:146: free(m2); On 2014/06/26 14:42:54, bulach wrote: > which would also make the "Initialize" into a single helper function, and won't > need extra members, right? Done, now is more "zipped" than ever :) Thanks for the thorough review.
lgtm, thanks!
Hi, I have some questions about naming and dumped results. https://codereview.chromium.org/323893002/diff/400001/tools/android/heap_prof... File tools/android/heap_profiler/heap_dump.c (right): https://codereview.chromium.org/323893002/diff/400001/tools/android/heap_prof... tools/android/heap_profiler/heap_dump.c:12: // "num_allocs": 37542, # Number of virtual memory areas (VMAs) What do you mean by "alloc" here? To me, "alloc" sounds like malloc-like things, but it looks like page-level memory management like mmap. I'm not very sure, but are malloc and mmap mixed? https://codereview.chromium.org/323893002/diff/400001/tools/android/heap_prof... tools/android/heap_profiler/heap_dump.c:23: // +-----------------------------> Start address of the VMA (hex). Where does this start address point? In my trial, it was "421b5020" for example. What virtual address does this mean? IIUC, a start address of a "VMA" must be "-----000" in case the pagesize is 4096.
https://codereview.chromium.org/323893002/diff/400001/tools/android/heap_prof... File tools/android/heap_profiler/heap_dump.c (right): https://codereview.chromium.org/323893002/diff/400001/tools/android/heap_prof... tools/android/heap_profiler/heap_dump.c:12: // "num_allocs": 37542, # Number of virtual memory areas (VMAs) On 2014/07/02 04:49:52, Dai Mikurube wrote: > What do you mean by "alloc" here? To me, "alloc" sounds like malloc-like things, > but it looks like page-level memory management like mmap. Right, let me clarify. The short version is: an allocation is a contiguous memory region with arbitrary granularity (i.e. NOT necessarily page aligned) which is the result of any of the syscall hooked through this library (for the moment malloc/mmap and their variants). Ideally, whenever you intercept a malloc or mmap, the lib creates or deletes an "allocation" (referred in the code as a vma), corresponding to the allocated memory region. Over simplifying, there is one alloc per outstanding mmap/malloc. It gets a bit more complicated, however, in the case of partial frees (as supported by mmap semantics). Immagine the following scenario: mmap(0, 32k) -> You get 1 entry in "allocs" [0, 32k[. munmap(4k, 8k) -> This hole now ends up with two allocs entries: [0, 4k[ and [8, 32k[ So, it is not technically true that there is one "allocs" entry per outstanding mmap/malloc, since holes can create these fragments. I hope it's more clear now. I should review the comment here, thanks for pointing out. > I'm not very sure, but are malloc and mmap mixed? Yes, in the sense of counting, but you can still distinguish them by looking at the flags ("f") of each allocation entry. https://codereview.chromium.org/323893002/diff/400001/tools/android/heap_prof... tools/android/heap_profiler/heap_dump.c:23: // +-----------------------------> Start address of the VMA (hex). On 2014/07/02 04:49:52, Dai Mikurube wrote: > Where does this start address point? > > In my trial, it was "421b5020" for example. What virtual address does this mean? I think it should be clear now. Very likely, that was the address returned by malloc (since it's not page aligned, it cannot be the result of mmap, but flags will tell you). > IIUC, a start address of a "VMA" must be "-----000" in case the pagesize is > 4096. Right, as you might have understood at this point, VMA is not as in the kernel (page table) sense. It's a generic Virtual Memory Area.
https://codereview.chromium.org/323893002/diff/400001/tools/android/heap_prof... File tools/android/heap_profiler/heap_dump.c (right): https://codereview.chromium.org/323893002/diff/400001/tools/android/heap_prof... tools/android/heap_profiler/heap_dump.c:12: // "num_allocs": 37542, # Number of virtual memory areas (VMAs) Hmm, okay, but I think you shouldn't use the word "VMA" here and heap_profiler.cc because its an existing Linux-specific technical term. It confuses readers. Also, the flags should be described somewhere. As you said, mmap'ed regions can overlap each other and partially freed. In addition, a mmap'ed region can contain malloc'ed blocks (many malloc libraries use mmap as arena while some use (s)brk) though malloc'ed blocks don't overlap each other. We may want their difference is described well if mmap and malloc are handled in a mixed way (while we can handle it in a post-mortem script). jfyi, malloc is usually not a syscall, just a library.
https://codereview.chromium.org/323893002/diff/400001/tools/android/heap_prof... File tools/android/heap_profiler/heap_dump.c (right): https://codereview.chromium.org/323893002/diff/400001/tools/android/heap_prof... tools/android/heap_profiler/heap_dump.c:12: // "num_allocs": 37542, # Number of virtual memory areas (VMAs) On 2014/07/02 09:39:32, Dai Mikurube wrote: > Hmm, okay, but I think you shouldn't use the word "VMA" here and > heap_profiler.cc because its an existing Linux-specific technical term. It > confuses readers. Also, the flags should be described somewhere. Hmm agree. I'll reword it. > As you said, mmap'ed regions can overlap each other Well, not really overlap. In the worst case (MAP_FIXED) a new map can override a previous one. > and partially freed. In > addition, a mmap'ed region can contain malloc'ed blocks (many malloc libraries > use mmap as arena while some use (s)brk) though malloc'ed blocks don't overlap > each other. Correct > We may want their difference is described well if mmap and malloc > are handled in a mixed way (while we can handle it in a post-mortem script). See the previous discussion on this point (in this codereview) with pasko@. In essence the idea is that this library exposes as much detail it can gather (that's why the flags) and then the consumer script (memory inspector in my plans) filters/postprocesses this information. > jfyi, malloc is usually not a syscall, just a library. Yeah I know, it was a shortcut in the explanation. I really meant "dynamic symbols". Thanks for pointing out :)
https://codereview.chromium.org/323893002/diff/400001/tools/android/heap_prof... File tools/android/heap_profiler/heap_dump.c (right): https://codereview.chromium.org/323893002/diff/400001/tools/android/heap_prof... tools/android/heap_profiler/heap_dump.c:12: // "num_allocs": 37542, # Number of virtual memory areas (VMAs) On 2014/07/02 09:49:53, Primiano Tucci wrote: > On 2014/07/02 09:39:32, Dai Mikurube wrote: > > Hmm, okay, but I think you shouldn't use the word "VMA" here and > > heap_profiler.cc because its an existing Linux-specific technical term. It > > confuses readers. Also, the flags should be described somewhere. > Hmm agree. I'll reword it. > > > > As you said, mmap'ed regions can overlap each other > Well, not really overlap. In the worst case (MAP_FIXED) a new map can override a > previous one. > > > and partially freed. In > > addition, a mmap'ed region can contain malloc'ed blocks (many malloc libraries > > use mmap as arena while some use (s)brk) though malloc'ed blocks don't overlap > > each other. > Correct > > > We may want their difference is described well if mmap and malloc > > are handled in a mixed way (while we can handle it in a post-mortem script). > See the previous discussion on this point (in this codereview) with pasko@. > In essence the idea is that this library exposes as much detail it can gather > (that's why the flags) and then the consumer script (memory inspector in my > plans) filters/postprocesses this information. Sure. I agree with the basic idea. My suggestion is just to describe the difference somewhere. Except for the comments, others look good to me. :) > > > jfyi, malloc is usually not a syscall, just a library. > Yeah I know, it was a shortcut in the explanation. I really meant "dynamic > symbols". > Thanks for pointing out :)
Ah, one additional request. Currently, we cannot be aware of the flag (indicating the "allocator" - mmap or malloc) only from the "stacks" attribute in the result JSON. We need the "allocs" attribute to know that. Can we have the "flags" in the "stacks" attribute? "allocs" is usually so large.
> Can we have the "flags" in the "stacks" attribute? "allocs" is usually so large. Hmm, we need to think a bit more about this. The problem is that, conceptually, there is a M-to-1 mapping between allocations and stacks. In other word a "stack" (i.e. a code path) could make different types (i.e. with different flags) of allocations . In practice this is very unlikely to happen, because the very last stack frames should diverge in the cases of mmap vs malloc. However, I am not entirely sure about the same holds for other flags, like the zygote one. In essence the risk is that if we add flags also to stacks, it is hard to guarantee, conceptually at least, that they will be consistent. On a practical viewpoint, instead, I agree that it is a reasonable thing to do. What if we keep this change for the next CL, so if we realize that results are too inconsistent we can revert and go back to this original behavior?
On 2014/07/03 10:11:22, Primiano Tucci wrote: > > Can we have the "flags" in the "stacks" attribute? "allocs" is usually so > large. > Hmm, we need to think a bit more about this. > The problem is that, conceptually, there is a M-to-1 mapping between allocations > and stacks. In other word a "stack" (i.e. a code path) could make different > types (i.e. with different flags) of allocations . > In practice this is very unlikely to happen, because the very last stack frames > should diverge in the cases of mmap vs malloc. > However, I am not entirely sure about the same holds for other flags, like the > zygote one. > > In essence the risk is that if we add flags also to stacks, it is hard to > guarantee, conceptually at least, that they will be consistent. > On a practical viewpoint, instead, I agree that it is a reasonable thing to do. > > What if we keep this change for the next CL, so if we realize that results are > too inconsistent we can revert and go back to this original behavior? I agree with going ahead with this CL now and trying that in the next CL. Ideally, I think they should be accounted as different "stacks" if they have different flags (or, at least, allocated by different allocation methods). dmprof does that for malloc and mmap -- it means that there can be two identical stack traces which is by malloc and mmap. I'm not very sure whether we should do that here, though. It depends on implementation and runtime costs.
The CQ bit was checked by primiano@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/primiano@chromium.org/323893002/440001
FYI, CQ is re-trying this CL (attempt #1). The failing builders are: chromium_presubmit on tryserver.chromium (http://build.chromium.org/p/tryserver.chromium/builders/chromium_presubmit/bu...) win_chromium_rel on tryserver.chromium (http://build.chromium.org/p/tryserver.chromium/builders/win_chromium_rel/buil...)
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: chromium_presubmit on tryserver.chromium (http://build.chromium.org/p/tryserver.chromium/builders/chromium_presubmit/bu...)
LGTM, Good job Primiano! http://goo.gl/i1oOf1 (No, I'm not insane, it's just to make the presubmit bot happy)
The CQ bit was checked by primiano@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/primiano@chromium.org/323893002/440001
Message was sent while issue was closed.
Change committed as 281542 |