Chromium Code Reviews

Side by Side Diff: base/trace_event/heap_profiler_heap_dump_writer.cc

Issue 1494293005: [Tracing] Enable heap dumps with both type info and backtraces (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@sfdedup-iter
Patch Set: Avoid reuse and generic code, opt for less helper structs instead Created 5 years ago
1 // Copyright 2015 The Chromium Authors. All rights reserved. 1 // Copyright 2015 The Chromium Authors. All rights reserved.
2 // Use of this source code is governed by a BSD-style license that can be 2 // Use of this source code is governed by a BSD-style license that can be
3 // found in the LICENSE file. 3 // found in the LICENSE file.
4 4
5 #include "base/trace_event/heap_profiler_heap_dump_writer.h" 5 #include "base/trace_event/heap_profiler_heap_dump_writer.h"
6 6
7 #include <algorithm> 7 #include <algorithm>
8 #include <iterator> 8 #include <iterator>
9 #include <utility> 9 #include <tuple>
10 #include <vector> 10 #include <vector>
11 11
12 #include "base/containers/hash_tables.h"
12 #include "base/format_macros.h" 13 #include "base/format_macros.h"
14 #include "base/logging.h"
15 #include "base/macros.h"
13 #include "base/strings/stringprintf.h" 16 #include "base/strings/stringprintf.h"
17 #include "base/trace_event/heap_profiler_allocation_context.h"
14 #include "base/trace_event/heap_profiler_stack_frame_deduplicator.h" 18 #include "base/trace_event/heap_profiler_stack_frame_deduplicator.h"
15 #include "base/trace_event/heap_profiler_type_name_deduplicator.h" 19 #include "base/trace_event/heap_profiler_type_name_deduplicator.h"
16 #include "base/trace_event/trace_event_argument.h" 20 #include "base/trace_event/trace_event_argument.h"
17 21
22 // Most of what the |HeapDumpWriter| does is aggregating detailed information
23 // about the heap and deciding what to dump. Input to this process is a list of
24 // |AllocationContext|s and sizes. (Together they form a |Part|.)
25 //
26 // The parts are grouped into |Bucket|s. A bucket is a group of parts whose
27 // properties share a prefix. (Type name is considered a list of length one
28 // here.) First all parts are put into one bucket that represents the entire
29 // heap. Then this bucket is recursively broken down into smaller buckets. Each
30 // bucket keeps track of whether further breakdown is possible.
31
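As an illustration of the breakdown described above (not part of the patch; the frame and type names below are made up), consider a heap with three parts:

//   ([BrowserMain, MessageLoop::Run], "int")    -> 10 bytes
//   ([BrowserMain, MessageLoop::Run], "double") ->  4 bytes
//   ([BrowserMain, Channel::Send],    "int")    ->  6 bytes
//
// The root bucket holds all three parts (size 20, backtrace_cursor 0).
// Breaking it down by backtrace groups everything under BrowserMain into one
// bucket (size 20, cursor 1); breaking that bucket down again splits it into
// MessageLoop::Run (size 14, cursor 2) and Channel::Send (size 6, cursor 2).
// Breaking a bucket down by type name instead groups its parts by type, e.g.
// the root becomes "int" (size 16) and "double" (size 4).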
18 namespace base { 32 namespace base {
19 namespace trace_event { 33 namespace trace_event {
20 34
35 using Entry = HeapDumpWriter::Entry;
36
21 namespace { 37 namespace {
22 38
23 template <typename T> 39 // Denotes a property of |AllocationContext|.
24 bool PairSizeGt(const std::pair<T, size_t>& lhs, 40 enum class Property { kBacktrace, kTypeName };
Primiano Tucci (use gerrit) 2015/12/08 18:05:30 nit: s/Property/BreakdownMode/ or GroupingMode (as
Ruud van Asseldonk 2015/12/09 14:07:53 I renamed |GroupBy| to |BreakDownBy|, because if t
25 const std::pair<T, size_t>& rhs) { 41
26 return lhs.second > rhs.second; 42 // A part of the heap: a context, and the number of bytes allocated for that
43 // context. Parts are mutually exclusive (i.e. every byte of the heap belongs
44 // to exactly one part).
45 using Part = std::pair<const AllocationContext*, size_t>;
Primiano Tucci (use gerrit) 2015/12/08 18:05:30 As discussed offline, the name is confusing. Don't
Ruud van Asseldonk 2015/12/09 14:07:53 Done.
46
47 struct Bucket {
48 Bucket() : size(0), backtrace_cursor(0), type_name_broken_down(false) {}
49
50 std::vector<Part> parts;
Primiano Tucci (use gerrit) 2015/12/08 18:05:30 this becomes: // This is allocations grouped by s
Ruud van Asseldonk 2015/12/09 14:07:53 Done. (const AllocationContext* though.)
51
52 // The sum of the sizes of the parts.
53 size_t size;
54
55 // The index of the first stack frame that has not yet been broken down by.
56 // For all parts in this bucket, the stack frames 0 up to (but not including)
57 // the cursor must be equal.
58 size_t backtrace_cursor;
59
60 // When true, the type name for all parts in this bucket must be equal.
61 bool type_name_broken_down;
Primiano Tucci (use gerrit) 2015/12/08 18:05:30 is_broken_down_by_type
Ruud van Asseldonk 2015/12/09 14:07:53 Done.
62 };
63
64 bool operator<(const Bucket& lhs, const Bucket& rhs) {
Primiano Tucci (use gerrit) 2015/12/08 18:05:30 add a comment explaining what this is this used fo
Ruud van Asseldonk 2015/12/09 14:07:53 Done.
65 return lhs.size < rhs.size;
27 } 66 }
28 67
29 // Converts a |hash_map<T, size_t>| into a vector of (T, size_t) pairs that is 68 // Groups the heap parts in the bucket by |groupBy|. The buckets in the returned
30 // ordered from high |size_t| to low |size_t|. 69 // list will have |backtrace_cursor| advanced or |type_name_broken_down| set
31 template <typename T> 70 // depending on the property to group by.
32 std::vector<std::pair<T, size_t>> SortBySizeDescending( 71 std::vector<Bucket> GroupBy(const Bucket& bucket, Property groupBy) {
33 const hash_map<T, size_t>& grouped) { 72 base::hash_map<const char*, Bucket> grouped;
Primiano Tucci (use gerrit) 2015/12/08 18:05:30 s/grouped/breakdown/
Ruud van Asseldonk 2015/12/09 14:07:53 Done.
34 std::vector<std::pair<T, size_t>> sorted; 73
35 sorted.reserve(grouped.size()); 74 if (groupBy == Property::kBacktrace) {
36 std::copy(grouped.begin(), grouped.end(), std::back_inserter(sorted)); 75 for (const Part& part : bucket.parts) {
Primiano Tucci (use gerrit) 2015/12/08 18:05:30 yeah this is way more readable if you mention Allo
Primiano Tucci (use gerrit) 2015/12/08 18:05:30 maybe it's cleaner if this for is outside the ifs?
Ruud van Asseldonk 2015/12/09 14:07:53 Iterating (AllocationContext, size_t) pairs, you m
Ruud van Asseldonk 2015/12/09 14:07:53 Probably but then we can’t early out on type name
37 std::sort(sorted.begin(), sorted.end(), PairSizeGt<T>); 76 const char* const* begin = std::begin(part.first->backtrace.frames);
38 return sorted; 77 const char* const* end = std::end(part.first->backtrace.frames);
78 const char* const* cursor = begin + bucket.backtrace_cursor;
79
80 // The backtrace in the context is padded with null pointers, but these
81 // should not be considered for breakdown. Adjust end to point past the
82 // last non-null frame.
83 while (begin != end && *(end - 1) == nullptr)
84 end--;
85
86 DCHECK_LE(cursor, end);
87
88 if (cursor != end) {
89 Bucket& subbucket = grouped[*cursor];
90 subbucket.size += part.second;
91 subbucket.parts.push_back(part);
92 subbucket.backtrace_cursor = bucket.backtrace_cursor + 1;
93 subbucket.type_name_broken_down = bucket.type_name_broken_down;
94 DCHECK_GT(subbucket.size, 0u);
95 }
96 }
97 } else if (groupBy == Property::kTypeName) {
98 if (!bucket.type_name_broken_down) {
Primiano Tucci (use gerrit) 2015/12/08 18:05:30 should this be a DCHECK + an if somewhere else (w
Ruud van Asseldonk 2015/12/09 14:07:53 It could but that would entangle the two functions
99 for (const Part& part : bucket.parts) {
100 Bucket& subbucket = grouped[part.first->type_name];
101 subbucket.size += part.second;
102 subbucket.parts.push_back(part);
103 subbucket.backtrace_cursor = bucket.backtrace_cursor;
104 subbucket.type_name_broken_down = true;
105 DCHECK_GT(subbucket.size, 0u);
106 }
107 }
108 }
109
110 std::vector<Bucket> buckets;
111 buckets.reserve(grouped.size());
112 for (auto key_bucket : grouped)
113 buckets.push_back(key_bucket.second);
114
115 return buckets;
116 }
117
118 // Breaks down the bucket by |property|. Returns only buckets that contribute
119 // significantly to the total size. The long tail is omitted.
120 std::vector<Bucket> BreakDownBy(const Bucket& bucket, Property property) {
121 std::vector<Bucket> buckets = GroupBy(bucket, property);
122
123 // Ensure that |buckets| is a max-heap (the data structure, not memory heap),
124 // so its front contains the largest bucket. Buckets should be iterated
125 // ordered by size, but sorting the vector is overkill because the long tail
126 // of small buckets will be discarded. By using a max-heap, the optimal case
127 // where all but the first bucket are discarded is O(n). The worst case where
128 // no bucket is discarded is doing a heap sort, which is O(n log n).
129 std::make_heap(buckets.begin(), buckets.end());
130
131 // Keep including buckets until adding one would increase the number of
132 // bytes accounted for by less than one percent. The large buckets end up in
133 // [it, end()); [begin(), it) is the range that holds the max-heap of small
134 // buckets. TODO(ruuda): tweak the heuristic.
135 size_t accounted_for = 0;
136 std::vector<Bucket>::iterator it;
137 for (it = buckets.end(); it != buckets.begin(); --it) {
138 // Compute contribution to number of bytes accounted for in percent, rounded
139 // down due to integer division. Buckets are iterated by descending size,
140 // so later buckets cannot have a larger contribution than this one.
141 accounted_for += buckets.front().size;
142 size_t contribution = buckets.front().size * 100 / accounted_for;
143 if (contribution == 0)
144 break;
145
146 // Put the largest bucket in [begin, it) at |it - 1| and max-heapify
147 // [begin, it - 1). This puts the next largest bucket at |buckets.front()|.
148 std::pop_heap(buckets.begin(), it);
149 }
150
151 // Discard the long tail of buckets that contribute less than a percent.
Primiano Tucci (use gerrit) 2015/12/08 18:05:30 Ok please add a comment which explains graphically
Ruud van Asseldonk 2015/12/09 14:07:53 Done.
152 buckets.erase(buckets.begin(), it);
153
154 return buckets;
39 } 155 }
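A worked example of the cutoff heuristic above, with made-up bucket sizes:

//   Buckets by descending size: 700, 250, 40, 5, 3, 2   (1000 bytes total)
//   700: accounted_for = 700, contribution = 100            -> kept
//   250: accounted_for = 950, contribution = 26             -> kept
//    40: accounted_for = 990, contribution = 4              -> kept
//     5: accounted_for = 995, contribution = 500 / 995 = 0  -> loop breaks
//   The buckets of size 5, 3 and 2 are erased as the long tail.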
40 156
41 } // namespace 157 } // namespace
42 158
43 HeapDumpWriter::HeapDumpWriter(StackFrameDeduplicator* stack_frame_deduplicator, 159 HeapDumpWriter::HeapDumpWriter(StackFrameDeduplicator* stack_frame_deduplicator,
44 TypeNameDeduplicator* type_name_deduplicator) 160 TypeNameDeduplicator* type_name_deduplicator)
45 : traced_value_(new TracedValue()), 161 : stack_frame_deduplicator_(stack_frame_deduplicator),
46 stack_frame_deduplicator_(stack_frame_deduplicator),
47 type_name_deduplicator_(type_name_deduplicator) {} 162 type_name_deduplicator_(type_name_deduplicator) {}
48 163
49 HeapDumpWriter::~HeapDumpWriter() {} 164 HeapDumpWriter::~HeapDumpWriter() {}
50 165
51 void HeapDumpWriter::InsertAllocation(const AllocationContext& context, 166 void HeapDumpWriter::InsertAllocation(const AllocationContext& context,
52 size_t size) { 167 size_t size) {
53 bytes_by_context_[context] += size; 168 bytes_by_context_[context] += size;
54 } 169 }
55 170
56 scoped_refptr<TracedValue> HeapDumpWriter::WriteHeapDump() { 171 bool HeapDumpWriter::InsertEntry(const Bucket& bucket) {
Primiano Tucci (use gerrit) 2015/12/08 18:05:30 This needs a better name. At this point I'm confus
Ruud van Asseldonk 2015/12/09 14:07:53 Of course you never really _need_ classes or metho
57 // Group by backtrace and by type ID, and compute the total heap size while 172 // The parts in the bucket are all different, but for all properties, the
58 // iterating anyway. 173 // [begin, cursor) range is equal for all parts in the bucket.
59 size_t total_size = 0; 174 DCHECK(!bucket.parts.empty());
60 hash_map<Backtrace, size_t> bytes_by_backtrace;
61 hash_map<const char*, size_t> bytes_by_type;
62 175
63 for (auto context_size : bytes_by_context_) { 176 const AllocationContext& context = *bucket.parts.front().first;
64 total_size += context_size.second; 177
65 bytes_by_backtrace[context_size.first.backtrace] += context_size.second; 178 const char* const* backtrace_begin = std::begin(context.backtrace.frames);
66 bytes_by_type[context_size.first.type_name] += context_size.second; 179 const char* const* backtrace_end = backtrace_begin + bucket.backtrace_cursor;
180 DCHECK_LE(bucket.backtrace_cursor, arraysize(context.backtrace.frames));
181
182 Entry entry;
183 entry.stack_frame_id =
184 stack_frame_deduplicator_->Insert(backtrace_begin, backtrace_end);
185
186 // Deduplicate the type name, or use ID -1 if the bucket has not been broken down by type name.
187 entry.type_id = bucket.type_name_broken_down
188 ? type_name_deduplicator_->Insert(context.type_name)
189 : -1;
190
191 entry.size = bucket.size;
192
193 auto position_and_inserted = entries_.insert(entry);
194 return position_and_inserted.second;
195 }
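Continuing the made-up example from the comment at the top of the file, one bucket maps to one entry roughly like this:

//   bucket: common frames [BrowserMain, MessageLoop::Run] (cursor 2), broken
//           down by type "int", 10 bytes
//   entry:  stack_frame_id = deduplicator ID of the MessageLoop::Run node,
//           type_id = deduplicator ID of "int", size = 10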
196
197 void HeapDumpWriter::BreakDown(const Bucket& bucket) {
198 std::vector<Bucket> by_backtrace = BreakDownBy(bucket, Property::kBacktrace);
199 std::vector<Bucket> by_type_name = BreakDownBy(bucket, Property::kTypeName);
Primiano Tucci (use gerrit) 2015/12/08 18:05:30 this should be conditioned by if (bucket.is_broken
Ruud van Asseldonk 2015/12/09 14:07:53 Yes, you are right. I wanted to avoid further asym
200
201 // Insert entries for the buckets. If a bucket's entry was not present before,
202 // the bucket has not been broken down yet, so recursively break it down in that
203 // case. There might be multiple routes to the same entry (first break down
204 // by type name, then by backtrace, or first by backtrace and then by type),
205 // so a set is used to avoid dumping and breaking down entries more than once.
206
207 for (const Bucket& subbucket : by_backtrace)
208 if (InsertEntry(subbucket))
209 BreakDown(subbucket);
210
211 for (const Bucket& subbucket : by_type_name)
212 if (InsertEntry(subbucket))
213 BreakDown(subbucket);
214 }
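For example (with a hypothetical leaf frame F and type name T), the same entry can be reached along two routes:

//   root --by backtrace--> bucket(F) --by type name--> bucket(F, T)
//   root --by type name--> bucket(T) --by backtrace--> bucket(F, T)
//
// Both routes produce an Entry with the same stack_frame_id and type_id, so
// the second InsertEntry() returns false and that branch of the recursion
// stops there.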
215
216 const std::set<Entry>& HeapDumpWriter::Dump() {
217 // Start with one bucket that represents the entire heap. Iterate by
218 // reference, because the heap parts are going to point into allocation
219 // contexts stored in |bytes_by_context_|.
220 Bucket root_bucket;
221 for (const auto& context_and_size : bytes_by_context_) {
222 const AllocationContext* context = &context_and_size.first;
223 const size_t size = context_and_size.second;
224 root_bucket.parts.push_back(Part(context, size));
225 root_bucket.size += size;
67 } 226 }
68 227
69 auto sorted_bytes_by_backtrace = SortBySizeDescending(bytes_by_backtrace); 228 InsertEntry(root_bucket);
70 auto sorted_bytes_by_type = SortBySizeDescending(bytes_by_type);
71 229
72 traced_value_->BeginArray("entries"); 230 // Recursively break down the heap and fill |entries_| with entries to dump.
231 BreakDown(root_bucket);
73 232
74 // The global size, no column specified. 233 return entries_;
75 { 234 }
76 traced_value_->BeginDictionary(); 235
77 WriteSize(total_size); 236 // static
78 traced_value_->EndDictionary(); 237 scoped_refptr<TracedValue> HeapDumpWriter::Write(
238 const std::set<Entry>& entries) {
239 std::string buffer;
240 scoped_refptr<TracedValue> traced_value = new TracedValue;
241
242 traced_value->BeginArray("entries");
243
244 for (const Entry& entry : entries) {
245 traced_value->BeginDictionary();
246
247 // Format size as hexadecimal string into |buffer|.
248 SStringPrintf(&buffer, "%" PRIx64, static_cast<uint64_t>(entry.size));
249 traced_value->SetString("size", buffer);
250
251 if (entry.stack_frame_id == -1) {
252 // An empty backtrace (which will have ID -1) is represented by the empty
253 // string, because there is no leaf frame to reference in |stackFrames|.
254 traced_value->SetString("bt", "");
255 } else {
256 // Format index of the leaf frame as a string, because |stackFrames| is a
257 // dictionary, not an array.
258 SStringPrintf(&buffer, "%i", entry.stack_frame_id);
259 traced_value->SetString("bt", buffer);
260 }
261
262 // Type ID -1 (cumulative size for all types) is represented by the absence
263 // of the "type" key in the dictionary.
264 if (entry.type_id != -1) {
265 // Format the type ID as a string.
266 SStringPrintf(&buffer, "%i", entry.type_id);
267 traced_value->SetString("type", buffer);
268 }
269
270 traced_value->EndDictionary();
79 } 271 }
80 272
81 // Entries with the size per backtrace. 273 traced_value->EndArray(); // "entries"
82 for (const auto& entry : sorted_bytes_by_backtrace) { 274 return traced_value;
83 traced_value_->BeginDictionary();
84 // Insert a forward reference to the backtrace that will be written to the
85 // |stackFrames| dictionary later on.
86 int idx = stack_frame_deduplicator_->Insert(std::begin(entry.first.frames),
87 std::end(entry.first.frames));
88 WriteStackFrameIndex(idx);
89 WriteSize(entry.second);
90 traced_value_->EndDictionary();
91 }
92
93 // Entries with the size per type.
94 for (const auto& entry : sorted_bytes_by_type) {
95 traced_value_->BeginDictionary();
96 // Insert a forward reference to the type name that will be written to the
97 // trace when it is flushed.
98 WriteTypeId(type_name_deduplicator_->Insert(entry.first));
99 WriteSize(entry.second);
100 traced_value_->EndDictionary();
101 }
102
103 traced_value_->EndArray(); // "entries"
104
105 return traced_value_;
106 } 275 }
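When serialized into the trace, each entry from the set becomes a small dictionary in the "entries" array; roughly, with made-up IDs and sizes:

//   { "size": "6400", "bt": "" }               <- whole heap: no backtrace, all types
//   { "size": "a0",   "bt": "7" }              <- bytes for the backtrace whose leaf frame has ID 7
//   { "size": "30",   "bt": "7", "type": "3" } <- of those, bytes with type name ID 3
//
// "size" is hexadecimal; "bt" and "type" are forward references into the
// |stackFrames| and type name dictionaries written elsewhere in the trace.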
107 276
108 void HeapDumpWriter::WriteStackFrameIndex(int index) { 277 bool operator<(Entry lhs, Entry rhs) {
109 if (index == -1) { 278 // There is no need to compare |size|. If the backtrace and type name are
110 // An empty backtrace (which will have index -1) is represented by the empty 279 // equal then the sizes must be equal as well.
111 // string, because there is no leaf frame to reference in |stackFrames|. 280 return std::tie(lhs.stack_frame_id, lhs.type_id) <
112 traced_value_->SetString("bt", ""); 281 std::tie(rhs.stack_frame_id, rhs.type_id);
113 } else {
114 // Format index of the leaf frame as a string, because |stackFrames| is a
115 // dictionary, not an array.
116 SStringPrintf(&buffer_, "%i", index);
117 traced_value_->SetString("bt", buffer_);
118 }
119 }
120
121 void HeapDumpWriter::WriteTypeId(int type_id) {
122 // Format the type ID as a string.
123 SStringPrintf(&buffer_, "%i", type_id);
124 traced_value_->SetString("type", buffer_);
125 }
126
127 void HeapDumpWriter::WriteSize(size_t size) {
128 // Format size as hexadecimal string into |buffer_|.
129 SStringPrintf(&buffer_, "%" PRIx64, static_cast<uint64_t>(size));
130 traced_value_->SetString("size", buffer_);
131 } 282 }
132 283
133 } // namespace trace_event 284 } // namespace trace_event
134 } // namespace base 285 } // namespace base