base/trace_event/heap_profiler_heap_dump_writer.cc - Issue 1911643002: Add configurable limit to allocations in heap profiler.

Side by Side Diff: base/trace_event/heap_profiler_heap_dump_writer.cc

Issue 1911643002: Add configurable limit to allocations in heap profiler. (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master

Patch Set: Addressing comments Created 4 years, 8 months ago

Use n/p to move between diff chunks; N/P to move between comments. Draft comments are only viewable by you.

Jump to:

View unified diff | Download patch

OLD	NEW
1 // Copyright 2015 The Chromium Authors. All rights reserved.	1 // Copyright 2015 The Chromium Authors. All rights reserved.

2 // Use of this source code is governed by a BSD-style license that can be	2 // Use of this source code is governed by a BSD-style license that can be

3 // found in the LICENSE file.	3 // found in the LICENSE file.

4	4

5 #include "base/trace_event/heap_profiler_heap_dump_writer.h"	5 #include "base/trace_event/heap_profiler_heap_dump_writer.h"

6	6

7 #include <stdint.h>	7 #include <stdint.h>

8	8

9 #include <algorithm>	9 #include <algorithm>

10 #include <iterator>	10 #include <iterator>

11 #include <tuple>	11 #include <tuple>

12 #include <utility>	12 #include <utility>

13 #include <vector>	13 #include <vector>

14	14

15 #include "base/format_macros.h"	15 #include "base/format_macros.h"

16 #include "base/logging.h"	16 #include "base/logging.h"

17 #include "base/macros.h"	17 #include "base/macros.h"

18 #include "base/strings/stringprintf.h"	18 #include "base/strings/stringprintf.h"

19 #include "base/trace_event/heap_profiler_stack_frame_deduplicator.h"	19 #include "base/trace_event/heap_profiler_stack_frame_deduplicator.h"

20 #include "base/trace_event/heap_profiler_type_name_deduplicator.h"	20 #include "base/trace_event/heap_profiler_type_name_deduplicator.h"

	21 #include "base/trace_event/trace_config.h"

21 #include "base/trace_event/trace_event_argument.h"	22 #include "base/trace_event/trace_event_argument.h"

	23 #include "base/trace_event/trace_log.h"

22	24

23 // Most of what the \|HeapDumpWriter\| does is aggregating detailed information	25 // Most of what the \|HeapDumpWriter\| does is aggregating detailed information

24 // about the heap and deciding what to dump. The Input to this process is a list	26 // about the heap and deciding what to dump. The Input to this process is a list

25 // of \|AllocationContext\|s and size pairs.	27 // of \|AllocationContext\|s and size pairs.

26 //	28 //

27 // The pairs are grouped into \|Bucket\|s. A bucket is a group of (context, size)	29 // The pairs are grouped into \|Bucket\|s. A bucket is a group of (context, size)

28 // pairs where the properties of the contexts share a prefix. (Type name is	30 // pairs where the properties of the contexts share a prefix. (Type name is

29 // considered a list of length one here.) First all pairs are put into one	31 // considered a list of length one here.) First all pairs are put into one

30 // bucket that represents the entire heap. Then this bucket is recursively	32 // bucket that represents the entire heap. Then this bucket is recursively

31 // broken down into smaller buckets. Each bucket keeps track of whether further	33 // broken down into smaller buckets. Each bucket keeps track of whether further

(...skipping 89 matching lines...) Expand 10 before \| Expand all \| Expand 10 after Loading...
121	123

122 std::vector<Bucket> buckets;	124 std::vector<Bucket> buckets;

123 buckets.reserve(breakdown.size());	125 buckets.reserve(breakdown.size());

124 for (auto key_bucket : breakdown)	126 for (auto key_bucket : breakdown)

125 buckets.push_back(key_bucket.second);	127 buckets.push_back(key_bucket.second);

126	128

127 return buckets;	129 return buckets;

128 }	130 }

129	131

130 // Breaks down the bucket by \|breakBy\|. Returns only buckets that contribute	132 // Breaks down the bucket by \|breakBy\|. Returns only buckets that contribute

131 // significantly to the total size. The long tail is omitted.	133 // more than \|minSizeBytes\| to the total size. The long tail is omitted.

132 std::vector<Bucket> BreakDownBy(const Bucket& bucket, BreakDownMode breakBy) {	134 std::vector<Bucket> BreakDownBy(const Bucket& bucket,

	135 BreakDownMode breakBy,

	136 size_t min_size_bytes) {

133 std::vector<Bucket> buckets = GetSubbuckets(bucket, breakBy);	137 std::vector<Bucket> buckets = GetSubbuckets(bucket, breakBy);

134	138

135 // Ensure that \|buckets\| is a max-heap (the data structure, not memory heap),	139 // Ensure that \|buckets\| is a max-heap (the data structure, not memory heap),

136 // so its front contains the largest bucket. Buckets should be iterated	140 // so its front contains the largest bucket. Buckets should be iterated

137 // ordered by size, but sorting the vector is overkill because the long tail	141 // ordered by size, but sorting the vector is overkill because the long tail

138 // of small buckets will be discarded. By using a max-heap, the optimal case	142 // of small buckets will be discarded. By using a max-heap, the optimal case

139 // where all but the first bucket are discarded is O(n). The worst case where	143 // where all but the first bucket are discarded is O(n). The worst case where

140 // no bucket is discarded is doing a heap sort, which is O(n log n).	144 // no bucket is discarded is doing a heap sort, which is O(n log n).

141 std::make_heap(buckets.begin(), buckets.end());	145 std::make_heap(buckets.begin(), buckets.end());

142	146

143 // Keep including buckets until adding one would increase the number of	147 // Keep including buckets until adding one would increase the number of

144 // bytes accounted for by less than 0.8% (1/125). This simple heuristic works	148 // bytes accounted for by \|min_size_bytes\|. The large buckets end up in

145 // quite well. The large buckets end up in [it, end()), [begin(), it) is the	149 // [it, end()), [begin(), it) is the part that contains the max-heap

146 // part that contains the max-heap of small buckets.	150 // of small buckets.

147 size_t accounted_for = 0;

148 std::vector<Bucket>::iterator it;	151 std::vector<Bucket>::iterator it;

149 for (it = buckets.end(); it != buckets.begin(); --it) {	152 for (it = buckets.end(); it != buckets.begin(); --it) {

150 accounted_for += buckets.front().size;	153 if (buckets.front().size < min_size_bytes)

151 if (buckets.front().size < (accounted_for / 125))

152 break;	154 break;

153	155

154 // Put the largest bucket in [begin, it) at \|it - 1\| and max-heapify	156 // Put the largest bucket in [begin, it) at \|it - 1\| and max-heapify

155 // [begin, it - 1). This puts the next largest bucket at \|buckets.front()\|.	157 // [begin, it - 1). This puts the next largest bucket at \|buckets.front()\|.

156 std::pop_heap(buckets.begin(), it);	158 std::pop_heap(buckets.begin(), it);

157 }	159 }

158	160

159 // At this point, \|buckets\| looks like this (numbers are bucket sizes):	161 // At this point, \|buckets\| looks like this (numbers are bucket sizes):

160 //	162 //

161 // <-- max-heap of small buckets --->	163 // <-- max-heap of small buckets --->

(...skipping 14 matching lines...) Expand all Loading...
176 bool operator<(Entry lhs, Entry rhs) {	178 bool operator<(Entry lhs, Entry rhs) {

177 // There is no need to compare \|size\|. If the backtrace and type name are	179 // There is no need to compare \|size\|. If the backtrace and type name are

178 // equal then the sizes must be equal as well.	180 // equal then the sizes must be equal as well.

179 return std::tie(lhs.stack_frame_id, lhs.type_id) <	181 return std::tie(lhs.stack_frame_id, lhs.type_id) <

180 std::tie(rhs.stack_frame_id, rhs.type_id);	182 std::tie(rhs.stack_frame_id, rhs.type_id);

181 }	183 }

182	184

183 HeapDumpWriter::HeapDumpWriter(StackFrameDeduplicator* stack_frame_deduplicator,	185 HeapDumpWriter::HeapDumpWriter(StackFrameDeduplicator* stack_frame_deduplicator,

184 TypeNameDeduplicator* type_name_deduplicator)	186 TypeNameDeduplicator* type_name_deduplicator)

185 : stack_frame_deduplicator_(stack_frame_deduplicator),	187 : stack_frame_deduplicator_(stack_frame_deduplicator),

186 type_name_deduplicator_(type_name_deduplicator) {}	188 type_name_deduplicator_(type_name_deduplicator) {

	189 min_allocation_size_bytes_ = TraceLog::GetInstance()->GetCurrentTraceConfig()
	Primiano Tucci (use gerrit) 2016/04/21 16:53:21 Hmm this is now introducing the assumption that He Hmm this is now introducing the assumption that HeapDumpWriter is alive only when tracing is enabled, which might cause bugs. Can we instead pass a HeapProfilerConfig& (see comment below) here? What I would do here is: It is safe to get the TraceLog in the OnMemoryDump of the MemoryDumpProviders, because at that point tracing is guaranteed to be active. This will mean having to do GetCurrentTraceConfig() in 3 places (malloc, partition alloc and blinkgc heap profiler), which is bad in terms of copy/paste but better in terms of lifetime. A better alternative could be the following: in MemoryDumpManager::OnTraceLogEnabled(), we already read the MemoryDumpConfig from the TraceLog. Let's copy that into the SessionState (which is accessible to all MemoryDumpProviders via pmd->session_state()). If you do that, at this point the only arg you need to pass to HeapDumpWriter is a const MemoryDumpSessionState&, which would make everything cleaner. In other words today in malloc_dump_provider you have: std::unique_ptr<TracedValue> heap_dump = ExportHeapDump( metrics_by_context, pmd->session_state()->stack_frame_deduplicator(), pmd->session_state()->type_name_deduplicator()); with this change you will just std::unique_ptr<TracedValue> heap_dump = ExportHeapDump( metrics_by_context, pmd->session_state()) This can be done in a separate CL, before or after this (i.e. you could first refactor the existing code to move make this take the SessionState which is going to be a non-funcitonal refactor easier to review, and later add the extra threshold to the trace config)
	190 .GetMinAllocationSizeBytes();

	191 }

187	192

188 HeapDumpWriter::~HeapDumpWriter() {}	193 HeapDumpWriter::~HeapDumpWriter() {}

189	194

190 bool HeapDumpWriter::AddEntryForBucket(const Bucket& bucket) {	195 bool HeapDumpWriter::AddEntryForBucket(const Bucket& bucket) {

191 // The contexts in the bucket are all different, but the [begin, cursor) range	196 // The contexts in the bucket are all different, but the [begin, cursor) range

192 // is equal for all contexts in the bucket, and the type names are the same if	197 // is equal for all contexts in the bucket, and the type names are the same if

193 // \|is_broken_down_by_type_name\| is set.	198 // \|is_broken_down_by_type_name\| is set.

194 DCHECK(!bucket.metrics_by_context.empty());	199 DCHECK(!bucket.metrics_by_context.empty());

195	200

196 const AllocationContext* context = bucket.metrics_by_context.front().first;	201 const AllocationContext* context = bucket.metrics_by_context.front().first;

(...skipping 12 matching lines...) Expand all Loading...
209 : -1;	214 : -1;

210	215

211 entry.size = bucket.size;	216 entry.size = bucket.size;

212 entry.count = bucket.count;	217 entry.count = bucket.count;

213	218

214 auto position_and_inserted = entries_.insert(entry);	219 auto position_and_inserted = entries_.insert(entry);

215 return position_and_inserted.second;	220 return position_and_inserted.second;

216 }	221 }

217	222

218 void HeapDumpWriter::BreakDown(const Bucket& bucket) {	223 void HeapDumpWriter::BreakDown(const Bucket& bucket) {

219 auto by_backtrace = BreakDownBy(bucket, BreakDownMode::kByBacktrace);	224 auto by_backtrace = BreakDownBy(bucket,

220 auto by_type_name = BreakDownBy(bucket, BreakDownMode::kByTypeName);	225 BreakDownMode::kByBacktrace,

	226 min_allocation_size_bytes_);

	227 auto by_type_name = BreakDownBy(bucket,

	228 BreakDownMode::kByTypeName,

	229 min_allocation_size_bytes_);

221	230

222 // Insert entries for the buckets. If a bucket was not present before, it has	231 // Insert entries for the buckets. If a bucket was not present before, it has

223 // not been broken down before, so recursively continue breaking down in that	232 // not been broken down before, so recursively continue breaking down in that

224 // case. There might be multiple routes to the same entry (first break down	233 // case. There might be multiple routes to the same entry (first break down

225 // by type name, then by backtrace, or first by backtrace and then by type),	234 // by type name, then by backtrace, or first by backtrace and then by type),

226 // so a set is used to avoid dumping and breaking down entries more than once.	235 // so a set is used to avoid dumping and breaking down entries more than once.

227	236

228 for (const Bucket& subbucket : by_backtrace)	237 for (const Bucket& subbucket : by_backtrace)

229 if (AddEntryForBucket(subbucket))	238 if (AddEntryForBucket(subbucket))

230 BreakDown(subbucket);	239 BreakDown(subbucket);

(...skipping 75 matching lines...) Expand 10 before \| Expand all \| Expand 10 after Loading...
306 const hash_map<AllocationContext, AllocationMetrics>& metrics_by_context,	315 const hash_map<AllocationContext, AllocationMetrics>& metrics_by_context,

307 StackFrameDeduplicator* stack_frame_deduplicator,	316 StackFrameDeduplicator* stack_frame_deduplicator,

308 TypeNameDeduplicator* type_name_deduplicator) {	317 TypeNameDeduplicator* type_name_deduplicator) {

309 internal::HeapDumpWriter writer(stack_frame_deduplicator,	318 internal::HeapDumpWriter writer(stack_frame_deduplicator,

310 type_name_deduplicator);	319 type_name_deduplicator);

311 return Serialize(writer.Summarize(metrics_by_context));	320 return Serialize(writer.Summarize(metrics_by_context));

312 }	321 }

313	322

314 } // namespace trace_event	323 } // namespace trace_event

315 } // namespace base	324 } // namespace base

OLD	NEW