| OLD | NEW |
| 1 # Heap Profiler Internals | 1 This document has moved to [//docs/memory-infra/heap_profiler_internals.md](/docs/memory-infra/heap_profiler_internals.md). |
| 2 | 2 |
| 3 This document describes how the heap profiler works and how to add heap | |
| 4 profiling support to your allocator. If you just want to know how to use it, | |
| 5 see [Heap Profiling with MemoryInfra](heap_profiler.md). | |
| 6 | |
| 7 [TOC] | |
| 8 | |
| 9 ## Overview | |
| 10 | |
| 11 The heap profiler consists of three main components: | |
| 12 | |
| 13 * **The Context Tracker**: Responsible for providing context (pseudo stack | |
| 14 backtrace) when an allocation occurs. | |
| 15 * **The Allocation Register**: A specialized hash table that stores allocation | |
| 16 details by address. | |
| 17 * **The Heap Dump Writer**: Extracts the most important information from a set | |
| 18 of recorded allocations and converts it into a format that can be dumped into | |
| 19 the trace log. | |
| 20 | |
| 21 These components are designed to work well together, but to be usable | |
| 22 independently as well. | |
| 23 | |
| 24 When there is a way to get notified of all allocations and frees, this is the | |
| 25 normal flow: | |
| 26 | |
| 27 1. When an allocation occurs, call | |
| 28 [`AllocationContextTracker::GetInstanceForCurrentThread()->GetContextSnapshot()`][context-tracker] | |
| 29 to get an [`AllocationContext`][alloc-context]. | |
| 30 2. Insert that context together with the address and size into an | |
| 31 [`AllocationRegister`][alloc-register] by calling `Insert()`. | |
| 32 3. When memory is freed, remove it from the register with `Remove()`. | |
| 33 4. On memory dump, collect the allocations from the register, call | |
| 34 [`ExportHeapDump()`][export-heap-dump], and add the generated heap dump to | |
| 35 the memory dump. | |
| 36 | |
| 37 [context-tracker]: https://chromium.googlesource.com/chromium/src/+/master/base/trace_event/heap_profiler_allocation_context_tracker.h | |
| 38 [alloc-context]: https://chromium.googlesource.com/chromium/src/+/master/base/trace_event/heap_profiler_allocation_context.h | |
| 39 [alloc-register]: https://chromium.googlesource.com/chromium/src/+/master/base/trace_event/heap_profiler_allocation_register.h | |
| 40 [export-heap-dump]: https://chromium.googlesource.com/chromium/src/+/master/base/trace_event/heap_profiler_heap_dump_writer.h | |
| 41 | |
| 42 *** aside | |
| 43 An allocator can skip steps 2 and 3 if it is able to store the context itself, | |
| 44 and if it is able to enumerate all allocations for step 4. | |
| 45 *** | |
| 46 | |
| 47 When heap profiling is enabled (the `--enable-heap-profiling` flag is passed), | |
| 48 the memory dump manager calls `OnHeapProfilingEnabled()` on every | |
| 49 `MemoryDumpProvider` as early as possible, so allocators can start recording | |
| 50 allocations. This should be done even when tracing has not been started, | |
| 51 because these allocations might still be around when a heap dump happens during | |
| 52 tracing. | |
| 53 | |
| 54 ## Context Tracker | |
| 55 | |
| 56 The [`AllocationContextTracker`][context-tracker] is a thread-local object. Its | |
| 57 main purpose is to keep track of a pseudo stack of trace events. Chrome has | |
| 58 been instrumented with lots of `TRACE_EVENT` macros. These trace events push | |
| 59 their name to a thread-local stack when they go into scope, and pop when they | |
| 60 go out of scope, if all of the following conditions have been met: | |
| 61 | |
| 62 * A trace is being recorded. | |
| 63 * The category of the event is enabled in the trace config. | |
| 64 * Heap profiling is enabled (with the `--enable-heap-profiling` flag). | |
| 65 | |
| 66 This means that allocations that occur before tracing is started will not have | |
| 67 backtrace information in their context. | |
| 68 | |
| 69 A thread-local instance of the context tracker is initialized lazily when it is | |
| 70 first accessed. This might be because a trace event was pushed or popped, or because | |
| 71 `GetContextSnapshot()` was called when an allocation occurred. | |
| 72 | |
| 73 [`AllocationContext`][alloc-context] is what is used to group and break down | |
| 74 allocations. Currently `AllocationContext` has the following fields: | |
| 75 | |
| 76 * Backtrace: filled by the context tracker, obtained from the thread-local | |
| 77 pseudo stack. | |
| 78 * Type name: to be filled in at a point where the type of a pointer is known, | |
| 79 set to _[unknown]_ by default. | |
| 80 | |
| 81 It is possible to modify this context after insertion into the register, for | |
| 82 instance to set the type name if it was not known at the time of allocation. | |
| 83 | |
| 84 ## Allocation Register | |
| 85 | |
| 86 The [`AllocationRegister`][alloc-register] is a hash table specialized for | |
| 87 storing `(size, AllocationContext)` pairs by address. It has been optimized for | |
| 88 Chrome's typical number of unfreed allocations, and it is backed by `mmap` | |
| 89 memory directly so there are no reentrancy issues when using it to record | |
| 90 `malloc` allocations. | |
| 91 | |
| 92 The allocation register is threading-agnostic: it does no locking of its own, | |
| 93 so callers must synchronise access to it, for example with a lock. | |
| 94 | |
| 95 ## Heap Dump Writer | |
| 96 | |
| 97 Dumping every single allocation in the allocation register straight into the | |
| 98 trace log is not an option due to the sheer volume (~300k unfreed allocations). | |
| 99 The role of the [`ExportHeapDump()`][export-heap-dump] function is to group | |
| 100 allocations, striking a balance between trace log size and detail. | |
| 101 | |
| 102 See the [Heap Dump Format][heap-dump-format] document for more details about the | |
| 103 structure of the heap dump in the trace log. | |
| 104 | |
| 105 [heap-dump-format]: https://docs.google.com/document/d/1NqBg1MzVnuMsnvV1AKLdKaPSPGpd81NaMPVk5stYanQ | |
| 106 | |
| 107 ## Instrumenting an Allocator | |
| 108 | |
| 109 Below is an example of adding heap profiling support to an allocator that has | |
| 110 an existing memory dump provider. | |
| 111 | |
| 112 ```cpp | |
| 113 class FooDumpProvider : public MemoryDumpProvider { | |
| 114 public: | |
| 115 // Kept as pointer because |AllocationRegister| allocates a lot of virtual | |
| 116 // address space when constructed, so only construct it when heap profiling is | |
| 117 // enabled. | |
| 118 scoped_ptr<AllocationRegister> allocation_register_; | |
| 119 Lock allocation_register_lock_; | |
| 120 | |
| 121 static FooDumpProvider* GetInstance(); | |
| 122 | |
| 123 void InsertAllocation(void* address, size_t size) { | |
| 124 AllocationContext context = AllocationContextTracker::GetInstanceForCurrentThread()->GetContextSnapshot(); | |
| 125 AutoLock lock(allocation_register_lock_); | |
| 126 allocation_register_->Insert(address, size, context); | |
| 127 } | |
| 128 | |
| 129 void RemoveAllocation(void* address) { | |
| 130 AutoLock lock(allocation_register_lock_); | |
| 131 allocation_register_->Remove(address); | |
| 132 } | |
| 133 | |
| 134 // Will be called as early as possible by the memory dump manager. | |
| 135 void OnHeapProfilingEnabled(bool enabled) override { | |
| 136 AutoLock lock(allocation_register_lock_); | |
| 137 allocation_register_.reset(new AllocationRegister()); | |
| 138 | |
| 139 // At this point, make sure that from now on, for every allocation and | |
| 140 // free, |FooDumpProvider::GetInstance()->InsertAllocation()| and | |
| 141 // |RemoveAllocation| are called. | |
| 142 } | |
| 143 | |
| 144 bool OnMemoryDump(const MemoryDumpArgs& args, | |
| 145 ProcessMemoryDump* pmd) override { | |
| 146 // Do regular dumping here. | |
| 147 | |
| 148 // Dump the heap only for detailed dumps. | |
| 149 if (args.level_of_detail == MemoryDumpLevelOfDetail::DETAILED) { | |
| 150 TraceEventMemoryOverhead overhead; | |
| 151 hash_map<AllocationContext, size_t> bytes_by_context; | |
| 152 | |
| 153 { | |
| 154 AutoLock lock(allocation_register_lock_); | |
| 155 if (allocation_register_) { | |
| 156 // Group allocations in the register into |bytes_by_context|, but do | |
| 157 // no additional processing inside the lock. | |
| 158 for (const auto& alloc_size : *allocation_register_) | |
| 159 bytes_by_context[alloc_size.context] += alloc_size.size; | |
| 160 | |
| 161 allocation_register_->EstimateTraceMemoryOverhead(&overhead); | |
| 162 } | |
| 163 } | |
| 164 | |
| 165 if (!bytes_by_context.empty()) { | |
| 166 scoped_refptr<TracedValue> heap_dump = ExportHeapDump( | |
| 167 bytes_by_context, | |
| 168 pmd->session_state()->stack_frame_deduplicator(), | |
| 169 pmd->session_state()->type_name_deduplicator()); | |
| 170 pmd->AddHeapDump("foo_allocator", heap_dump); | |
| 171 overhead.DumpInto("tracing/heap_profiler", pmd); | |
| 172 } | |
| 173 } | |
| 174 | |
| 175 return true; | |
| 176 } | |
| 177 }; | |
| 178 | |
| 179 ``` | |
| 180 | |
| 181 *** aside | |
| 182 The implementation for `malloc` is more complicated because it needs to deal | |
| 183 with reentrancy. | |
| 184 *** | |
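The insert, remove, and group-by-context flow described in the removed document can be tried outside Chromium with a minimal sketch. Everything named here (`SketchContext`, `SketchRegister`, `ContextKey`, `BytesByContext()`) is a hypothetical stand-in, not the real `AllocationContext` or `AllocationRegister` API. The sketch is built on `std::unordered_map`, which itself allocates with `malloc`, so unlike the `mmap`-backed register it could not be used to instrument `malloc`.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for AllocationContext: a flattened pseudo stack plus a
// type name. Callers pass "[unknown]" when the type is not known.
struct SketchContext {
  std::string backtrace;
  std::string type_name;
};

// Grouping key: (backtrace, type name).
using ContextKey = std::pair<std::string, std::string>;

// Hypothetical stand-in for AllocationRegister. It only illustrates the
// bookkeeping; it is not reentrancy-safe for recording malloc allocations.
class SketchRegister {
  struct Entry {
    std::size_t size;
    SketchContext context;
  };

 public:
  // Records a live allocation by address (step 2 of the flow).
  void Insert(const void* address, std::size_t size, SketchContext context) {
    assert(address != nullptr);
    std::lock_guard<std::mutex> lock(lock_);
    allocations_[address] = Entry{size, std::move(context)};
  }

  // Forgets an allocation when it is freed (step 3 of the flow).
  void Remove(const void* address) {
    std::lock_guard<std::mutex> lock(lock_);
    allocations_.erase(address);
  }

  // Sums the sizes of live allocations per context, which is the input a heap
  // dump writer would group further (step 4 of the flow).
  std::map<ContextKey, std::size_t> BytesByContext() const {
    std::lock_guard<std::mutex> lock(lock_);
    std::map<ContextKey, std::size_t> bytes;
    for (const auto& item : allocations_) {
      const Entry& entry = item.second;
      bytes[ContextKey(entry.context.backtrace, entry.context.type_name)] +=
          entry.size;
    }
    return bytes;
  }

 private:
  mutable std::mutex lock_;
  std::unordered_map<const void*, Entry> allocations_;
};
```

In this sketch the lock lives inside the register for brevity, whereas the document's example keeps the lock in the dump provider because the real register is threading-agnostic. A caller would invoke `Insert()` from its allocation hook, `Remove()` from its free hook, and `BytesByContext()` when a dump is requested.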