OLD | NEW |
1 # Heap Profiler Internals | 1 This document has moved to [//docs/memory-infra/heap_profiler_internals.md](/docs/memory-infra/heap_profiler_internals.md). |
2 | 2 |
3 This document describes how the heap profiler works and how to add heap | |
4 profiling support to your allocator. If you just want to know how to use it, | |
5 see [Heap Profiling with MemoryInfra](heap_profiler.md). | |
6 | |
7 [TOC] | |
8 | |
9 ## Overview | |
10 | |
11 The heap profiler consists of three main components: | |
12 | |
13 * **The Context Tracker**: Responsible for providing context (pseudo stack | |
14 backtrace) when an allocation occurs. | |
15 * **The Allocation Register**: A specialized hash table that stores allocation | |
16 details by address. | |
17 * **The Heap Dump Writer**: Extracts the most important information from a set | |
18 of recorded allocations and converts it into a format that can be dumped into | |
19 the trace log. | |
20 | |
21 These components are designed to work well together, but to be usable | |
22 independently as well. | |
23 | |
24 When there is a way to get notified of all allocations and frees, this is the | |
25 normal flow: | |
26 | |
27 1. When an allocation occurs, call | |
28 [`AllocationContextTracker::GetInstanceForCurrentThread()->GetContextSnapshot()`][context-tracker] | |
29 to get an [`AllocationContext`][alloc-context]. | |
30 2. Insert that context together with the address and size into an | |
31 [`AllocationRegister`][alloc-register] by calling `Insert()`. | |
32 3. When memory is freed, remove it from the register with `Remove()`. | |
33 4. On memory dump, collect the allocations from the register, call | |
34 [`ExportHeapDump()`][export-heap-dump], and add the generated heap dump to | |
35 the memory dump. | |
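
The four steps above can be sketched in a self-contained way. This is only an illustration, not Chromium's implementation: `SketchRegister` and `SketchContext` are hypothetical stand-ins for `AllocationRegister` and `AllocationContext`, a `std::unordered_map` replaces the specialized `mmap`-backed hash table, and the context is assumed to be a plain string.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for AllocationContext: a flattened pseudo stack.
using SketchContext = std::string;

// Hypothetical stand-in for AllocationRegister combined with its lock.
class SketchRegister {
 public:
  // Steps 1 and 2: record the allocation together with its context.
  void Insert(const void* address, std::size_t size,
              const SketchContext& context) {
    std::lock_guard<std::mutex> guard(lock_);
    allocations_[address] = Entry{size, context};
  }

  // Step 3: forget the allocation once the memory is freed.
  void Remove(const void* address) {
    std::lock_guard<std::mutex> guard(lock_);
    allocations_.erase(address);
  }

  // Step 4: on memory dump, sum the live bytes per context. The result is
  // what would be handed to the heap dump writer.
  std::unordered_map<SketchContext, std::size_t> BytesByContext() const {
    std::lock_guard<std::mutex> guard(lock_);
    std::unordered_map<SketchContext, std::size_t> result;
    for (const auto& item : allocations_)
      result[item.second.context] += item.second.size;
    return result;
  }

 private:
  struct Entry {
    std::size_t size;
    SketchContext context;
  };

  mutable std::mutex lock_;
  std::unordered_map<const void*, Entry> allocations_;
};
```

An allocator hook would call `Insert()` and `Remove()` from its allocation and free paths, and the dump provider would call `BytesByContext()` when a dump is requested. A `std::unordered_map` itself allocates through `malloc`, so this sketch would not be suitable for recording `malloc` allocations.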
36 | |
37 [context-tracker]: https://chromium.googlesource.com/chromium/src/+/master/base/trace_event/heap_profiler_allocation_context_tracker.h | |
38 [alloc-context]: https://chromium.googlesource.com/chromium/src/+/master/base/trace_event/heap_profiler_allocation_context.h | |
39 [alloc-register]: https://chromium.googlesource.com/chromium/src/+/master/base/trace_event/heap_profiler_allocation_register.h | |
40 [export-heap-dump]: https://chromium.googlesource.com/chromium/src/+/master/base/trace_event/heap_profiler_heap_dump_writer.h | |
41 | |
42 *** aside | |
43 An allocator can skip steps 2 and 3 if it is able to store the context itself, | |
44 and if it is able to enumerate all allocations for step 4. | |
45 *** | |
46 | |
47 When heap profiling is enabled (the `--enable-heap-profiling` flag is passed), | |
48 the memory dump manager calls `OnHeapProfilingEnabled()` on every | |
49 `MemoryDumpProvider` as early as possible, so allocators can start recording | |
50 allocations. This should be done even when tracing has not been started, | |
51 because these allocations might still be around when a heap dump happens during | |
52 tracing. | |
53 | |
54 ## Context Tracker | |
55 | |
56 The [`AllocationContextTracker`][context-tracker] is a thread-local object. Its | |
57 main purpose is to keep track of a pseudo stack of trace events. Chrome has | |
58 been instrumented with lots of `TRACE_EVENT` macros. These trace events push | |
59 their name to a thread-local stack when they go into scope, and pop when they | |
60 go out of scope, if all of the following conditions have been met: | |
61 | |
62 * A trace is being recorded. | |
63 * The category of the event is enabled in the trace config. | |
64 * Heap profiling is enabled (with the `--enable-heap-profiling` flag). | |
65 | |
66 This means that allocations that occur before tracing is started will not have | |
67 backtrace information in their context. | |
68 | |
69 A thread-local instance of the context tracker is initialized lazily when it is | |
70 first accessed. This might be because a trace event pushed or popped, or because | |
71 `GetContextSnapshot()` was called when an allocation occurred. | |
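
The pseudo stack described above can be illustrated with a small sketch. It is not the real tracker: `SketchTracker` and `ScopedSketchEvent` are hypothetical names, and the sketch assumes all three conditions (trace recording, category enabled, heap profiling enabled) already hold, so every scoped event pushes and pops unconditionally.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for AllocationContextTracker.
class SketchTracker {
 public:
  // The per-thread instance is constructed lazily, on first access from
  // each thread.
  static SketchTracker& GetInstanceForCurrentThread() {
    thread_local SketchTracker instance;
    return instance;
  }

  void Push(const char* name) { stack_.push_back(name); }

  void Pop() {
    if (!stack_.empty())
      stack_.pop_back();
  }

  // Copies the current pseudo stack, outermost event first.
  std::vector<std::string> GetContextSnapshot() const {
    return std::vector<std::string>(stack_.begin(), stack_.end());
  }

 private:
  std::vector<const char*> stack_;
};

// Hypothetical stand-in for the scoped object behind a TRACE_EVENT macro:
// pushes its name on construction and pops it on destruction.
class ScopedSketchEvent {
 public:
  explicit ScopedSketchEvent(const char* name) {
    SketchTracker::GetInstanceForCurrentThread().Push(name);
  }
  ~ScopedSketchEvent() { SketchTracker::GetInstanceForCurrentThread().Pop(); }

  ScopedSketchEvent(const ScopedSketchEvent&) = delete;
  ScopedSketchEvent& operator=(const ScopedSketchEvent&) = delete;
};
```

An allocation made while two such events are in scope would get a two-frame backtrace. An allocation made outside any event would get an empty one, which corresponds to the missing backtrace information for allocations that occur before tracing starts.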
72 | |
73 [`AllocationContext`][alloc-context] is what is used to group and break down | |
74 allocations. Currently `AllocationContext` has the following fields: | |
75 | |
76 * Backtrace: filled by the context tracker, obtained from the thread-local | |
77 pseudo stack. | |
78 * Type name: to be filled in at a point where the type of a pointer is known, | |
79 set to _[unknown]_ by default. | |
80 | |
81 It is possible to modify this context after insertion into the register, for | |
82 instance to set the type name if it was not known at the time of allocation. | |
83 | |
84 ## Allocation Register | |
85 | |
86 The [`AllocationRegister`][alloc-register] is a hash table specialized for | |
87 storing `(size, AllocationContext)` pairs by address. It has been optimized for | |
88 Chrome's typical number of unfreed allocations, and it is backed by `mmap` | |
89 memory directly so there are no reentrancy issues when using it to record | |
90 `malloc` allocations. | |
91 | |
92 The allocation register does no locking of its own. Callers must synchronise | |
93 all access to it, for example with a lock. | |
94 | |
95 ## Heap Dump Writer | |
96 | |
97 Dumping every single allocation in the allocation register straight into the | |
98 trace log is not an option due to the sheer volume (~300k unfreed allocations). | |
99 The role of the [`ExportHeapDump()`][export-heap-dump] function is to group | |
100 allocations, striking a balance between trace log size and detail. | |
101 | |
102 See the [Heap Dump Format][heap-dump-format] document for more details about the | |
103 structure of the heap dump in the trace log. | |
104 | |
105 [heap-dump-format]: https://docs.google.com/document/d/1NqBg1MzVnuMsnvV1AKLdKaPSPGpd81NaMPVk5stYanQ | |
106 | |
107 ## Instrumenting an Allocator | |
108 | |
109 Below is an example of adding heap profiling support to an allocator that has | |
110 an existing memory dump provider. | |
111 | |
112 ```cpp | |
113 class FooDumpProvider : public MemoryDumpProvider { | |
114 | |
115 // Kept as pointer because |AllocationRegister| allocates a lot of virtual | |
116 // address space when constructed, so only construct it when heap profiling is | |
117 // enabled. | |
118 scoped_ptr<AllocationRegister> allocation_register_; | |
119 Lock allocation_register_lock_; | |
120 | |
121 static FooDumpProvider* GetInstance(); | |
122 | |
123 void InsertAllocation(void* address, size_t size) { | |
124 AllocationContext context = AllocationContextTracker::GetInstanceForCurrentThread()->GetContextSnapshot(); | |
125 AutoLock lock(allocation_register_lock_); | |
126 allocation_register_->Insert(address, size, context); | |
127 } | |
128 | |
129 void RemoveAllocation(void* address) { | |
130 AutoLock lock(allocation_register_lock_); | |
131 allocation_register_->Remove(address); | |
132 } | |
133 | |
134 // Will be called as early as possible by the memory dump manager. | |
135 void OnHeapProfilingEnabled(bool enabled) override { | |
136 AutoLock lock(allocation_register_lock_); | |
137 allocation_register_.reset(new AllocationRegister()); | |
138 | |
139 // At this point, make sure that from now on, for every allocation and | |
140 // free, |FooDumpProvider::GetInstance()->InsertAllocation()| and | |
141 // |RemoveAllocation| are called. | |
142 } | |
143 | |
144 bool OnMemoryDump(const MemoryDumpArgs& args, | |
145 ProcessMemoryDump* pmd) override { | |
146 // Do regular dumping here. | |
147 | |
148 // Dump the heap only for detailed dumps. | |
149 if (args.level_of_detail == MemoryDumpLevelOfDetail::DETAILED) { | |
150 TraceEventMemoryOverhead overhead; | |
151 hash_map<AllocationContext, size_t> bytes_by_context; | |
152 | |
153 { | |
154 AutoLock lock(allocation_register_lock_); | |
155 if (allocation_register_) { | |
156 // Group allocations in the register into |bytes_by_context|, but do | |
157 // no additional processing inside the lock. | |
158 for (const auto& alloc_size : *allocation_register_) | |
159 bytes_by_context[alloc_size.context] += alloc_size.size; | |
160 | |
161 allocation_register_->EstimateTraceMemoryOverhead(&overhead); | |
162 } | |
163 } | |
164 | |
165 if (!bytes_by_context.empty()) { | |
166 scoped_refptr<TracedValue> heap_dump = ExportHeapDump( | |
167 bytes_by_context, | |
168 pmd->session_state()->stack_frame_deduplicator(), | |
169 pmd->session_state()->type_name_deduplicator()); | |
170 pmd->AddHeapDump("foo_allocator", heap_dump); | |
171 overhead.DumpInto("tracing/heap_profiler", pmd); | |
172 } | |
173 } | |
174 | |
175 return true; | |
176 } | |
177 }; | |
178 | |
179 ``` | |
180 | |
181 *** aside | |
182 The implementation for `malloc` is more complicated because it needs to deal | |
183 with reentrancy. | |
184 *** | |