tools/convert_perf_script_to_tracing_json.py - Issue 226933002: Convert 'perf script' output to about:tracing json.

Side by Side Diff: tools/convert_perf_script_to_tracing_json.py

Issue 226933002: Convert 'perf script' output to about:tracing json. (Closed) Base URL: svn://svn.chromium.org/chrome/trunk/src

Patch Set: Created 6 years, 8 months ago


OLD	NEW
(Empty)
	1 #!/usr/bin/python

	2 # Copyright (c) 2012 The Chromium Authors. All rights reserved.

	3 # Use of this source code is governed by a BSD-style license that can be

	4 # found in the LICENSE file.

	5

	6 import json

	7 import re

	8 import sys

	9

	10 """Converts the output of "perf script" into json for about:tracing.

	11

	12 Usage: perf script | convert_perf_script_to_tracing_json.py > trace.json

	13 """

	14

	15 # FIXME: Signal to the traceviewer that this is a CPU profile so it shows

	16 # samples instead of ms as the units.
	eseidel 2014/04/05 00:03:13
	The samples themselves can sorta represent 0.1ms or whatever the sampling frequency is.

	ojan 2014/04/05 00:13:50
	The perf script output doesn't say what the sampling frequency is. Maybe there's some other way to get at that?
	Personally, I prefer knowing the number of samples in my profiles to the number of ms, but I think it'd be good to give control over that if we have the information.

	vmiura 2014/04/08 19:33:50
	In my experience the perf sampling interval is quite variable and hard to convert to time. Perf has a "period" field in every sample that has to be used as a weight. This is how "perf" calculates percentages. perf script doesn't support the "period" field, but I added it locally.
	17

	18 def strip_to_last_paren(line):
	eseidel 2014/04/05 00:03:13
	I'm surprised you didn't just use a regexp to match the whole line?
	19 last_paren_index = line.rfind('(')

	20 if last_paren_index == -1:

	21 return line

	22 return line[:last_paren_index]

	23

	24 def extract_function_name(line):

	25 # This information from the stack doesn't seem terribly useful.

	26 line = line.replace('(anonymous namespace)::', '')

	27 line = line.replace('non-virtual thunk to ', '')

	28

	29 # Strip executable name.

	30 line = strip_to_last_paren(line)

	31 # Strip function arguments.
	eseidel 2014/04/05 00:03:13
	Yeah, I think a regexp would be more readable, maybe?

	ojan 2014/04/05 00:13:50
	I had a lot of trouble coming up with a regexp that did what I wanted, so I gave up and wrote this. Open to suggestions.
	32 line = strip_to_last_paren(line)

	33

	34 line = line.strip()

	35 line = re.sub('\s+', ' ', line)

	36

	37 first_space_index = line.find(' ')

	38 if first_space_index == -1:

	39 # Unsymbolized addresses.

	40 return line

	41 return line[first_space_index + 1:]
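
On the regexp question raised in the comments on strip_to_last_paren and extract_function_name: a minimal sketch of how a single pattern might do the same job. It is not part of the patch. The name extract_function_name_re, the pattern, and the assumed frame layout (hex address, symbol, then the executable in parentheses) are illustrative assumptions, and it is written for Python 3 rather than the Python 2 of the patch. It only strips a plain trailing argument list, so nested parentheses (function-pointer parameters, for example) are left untouched.

```python
import re

# Assumed frame layout: "<hex address> <symbol> (<executable>)".
_FRAME_RE = re.compile(
    r'^\s*(?P<addr>[0-9a-fA-F]+)\s+'  # sample address
    r'(?P<symbol>.*?)\s*'             # symbol, possibly with arguments
    r'\((?P<dso>[^()]*)\)\s*$')       # trailing "(executable)"


def extract_function_name_re(line):
    """Hypothetical regexp-based variant of extract_function_name."""
    # Same noise removal as the patch.
    line = line.replace('(anonymous namespace)::', '')
    line = line.replace('non-virtual thunk to ', '')

    match = _FRAME_RE.match(line)
    if not match:
        # Not a frame line in the assumed layout; return it unchanged.
        return line.strip()

    # Drop one trailing, non-nested argument list such as "(int, bool)".
    symbol = re.sub(r'\s*\([^()]*\)$', '', match.group('symbol'))
    symbol = re.sub(r'\s+', ' ', symbol).strip()

    # Unsymbolized frames have no symbol, so fall back to the address.
    return symbol or match.group('addr')
```

Whether this reads better than two strip_to_last_paren calls is a judgment call; the pattern makes the expected line shape explicit, at the cost of rejecting lines that do not fit it.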

	42

	43 def collapse_perf_script_output(lines):

	44 collapsed_lines = {}

	45 stack_so_far = []

	46 thread_id = ''

	47

	48 for line in lines:

	49 line = line.strip()

	50

	51 if not line:

	52 if stack_so_far:

	53 stack = ';'.join(stack_so_far)

	54 collapsed_lines[thread_id][stack] = (

	55 collapsed_lines[thread_id].setdefault(stack, 0) + 1)

	56 stack_so_far = []

	57 continue

	58

	59 if line[0] == '#':

	60 continue

	61

	62 match_header = re.match('\w+\ (\w+)\ cycles:\s*$', line)

	63 if match_header:

	64 thread_id = match_header.group(1)

	65 if not thread_id in collapsed_lines:

	66 collapsed_lines[thread_id] = {}

	67 continue

	68

	69 stack_so_far.insert(0, extract_function_name(line))

	70

	71 return collapsed_lines
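
vmiura's comment on the FIXME suggests weighting each sample by its "period" instead of counting it as 1. A minimal sketch of that idea follows, assuming the samples have already been parsed into (thread_id, frames, period) tuples. The input handled by the patch carries no period, so the tuple shape and the name collapse_weighted_samples are hypothetical.

```python
def collapse_weighted_samples(samples):
    """Accumulate a weight per (thread, stack) instead of a sample count.

    samples: iterable of (thread_id, frames, period) tuples, where frames
    is a list of function names ordered from outermost to innermost.
    This input shape is an assumption, not something the patch produces.
    """
    collapsed = {}
    for thread_id, frames, period in samples:
        stack = ';'.join(frames)
        per_thread = collapsed.setdefault(thread_id, {})
        # The patch adds 1 here; weighting adds the sample's period.
        per_thread[stack] = per_thread.get(stack, 0) + period
    return collapsed
```

The result has the same {thread_id: {stack: weight}} shape as collapsed_lines, so compute_tree could consume it unchanged, with "samples" then meaning accumulated period.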

	72

	73 def add_sample(root, thread_id, current_stack, start_time, end_time,

	74 original_stack):

	75 node_so_far = root

	76 for function in current_stack:

	77 # Can get the same stack on different threads, so identify samples by

	78 # combination of function name and thread_id.

	79 key = function + ';' + thread_id

	80 if key in node_so_far:

	81 node_so_far[key]['end_time'] = end_time

	82 else:

	83 node_so_far[key] = {

	84 'children': {},

	85 'start_time': start_time,

	86 'end_time': end_time,

	87 }

	88 node_so_far = node_so_far[key]['children']

	89

	90 def compute_tree(collapsed_lines):

	91 total_samples_per_thread = {}

	92 total_samples_for_all_threads = 0

	93 tree = {}

	94

	95 for thread_id in collapsed_lines:

	96 for stack in sorted(collapsed_lines[thread_id].iterkeys()):

	97 samples = collapsed_lines[thread_id][stack]

	98 total_samples_for_thread = total_samples_per_thread.setdefault(

	99 thread_id, 0)

	100 add_sample(tree, thread_id, stack.split(';'),

	101 total_samples_for_thread, total_samples_for_thread + samples,

	102 stack)

	103 total_samples_per_thread[thread_id] += samples

	104 total_samples_for_all_threads += samples

	105

	106 return tree, total_samples_for_all_threads, total_samples_per_thread
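
To make the timestamp scheme in compute_tree easier to follow, here is a small sketch of the layout step in isolation for a single thread: stacks are visited in sorted order and each one occupies a slot as wide as its sample count on a running counter. The helper name layout_stacks is hypothetical and the code is Python 3.

```python
def layout_stacks(stack_counts):
    """Return (stack, start, end) triples for one thread.

    stack_counts maps a ';'-joined stack to its sample count, like one
    inner dictionary of collapsed_lines.
    """
    intervals = []
    cursor = 0  # Running total of samples laid out so far.
    for stack in sorted(stack_counts):
        samples = stack_counts[stack]
        intervals.append((stack, cursor, cursor + samples))
        cursor += samples
    return intervals
```

For example, layout_stacks({'main;b': 2, 'main;a': 3}) places 'main;a' at 0 to 3 and 'main;b' at 3 to 5, which is why the "ts" and "dur" values in the output are measured in samples rather than time.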

	107

	108 def json_for_subtree(node, trace_data, total_samples_for_all_threads,

	109 total_samples_per_thread):

	110 for key in sorted(node.iterkeys()):

	111 function, thread_id = key.split(';')

	112 start_time = int(node[key]['start_time'])

	113 end_time = int(node[key]['end_time'])

	114 duration = end_time - start_time

	115 process_percent = '%2.2f%%' % (

	116 100 * float(duration) / total_samples_for_all_threads)

	117 thread_percent = '%2.2f%%' % (

	118 100 * float(duration) / total_samples_per_thread[thread_id])

	119

	120 # FIXME: extract out process IDs.

	121 children = node[key]['children']

	122

	123 # If there are no children, we can use a Complete event instead of two

	124 # Duration events.

	125 if not children:

	126 trace_data.append({
	eseidel 2014/04/05 00:05:09
	Crazy. When I started down this path I just used immediate events and didn't try to build a tree in the tracefile corresponding to the stack traces. I guess that's kinda neat though, but I would expect these stacks to be very deep.

	ojan 2014/04/05 00:13:50
	What are immediate events?

	dsinclair 2014/04/08 02:28:36
	Immediate events are drawn as a line, instead of a block in the UI. There are 3 types: global, process and thread. The type, basically, describes how tall the line is. A global line is drawn the height of the UI, the process lines are a full process block and a thread line will be one slice high.
	https://docs.google.com/a/chromium.org/document/d/1CvAClvFfyA5R-PhYUmn5OOQtYM...

	ojan 2014/04/08 19:38:40
	The heading fragment doesn't seem to work. Searching the doc for "immediate" doesn't find any hits.
	In either case, from your description of how they're drawn, I'm not sure why we'd want to use immediate events here. IMO, the whole point of this is to get the tree view of the samples.

	dsinclair 2014/04/08 19:42:44
	Sorry, I should use the right words: we call them instant events on the trace-viewer side.
	127 "pid": 1,

	128 "tid": thread_id,

	129 "name": function,

	130 "ts": start_time,

	131 "dur": duration,

	132 "ph": "X",

	133 "args": {

	134 "process percent": process_percent,

	135 "thread percent": thread_percent

	136 },

	137 })

	138 continue

	139

	140 trace_data.append({
	dsinclair 2014/04/08 02:28:36
	I think, although I may be mistaken, you could use a complete event here as well. As long as the timestamps are in the right order I think it will do the UI correctly. (Although you'd need to test it to be sure.)

	ojan 2014/04/08 19:38:40
	I'm not sure what you're suggesting. How would the children be nested beneath this function if using a complete event? Or are you just suggesting changing the phase to "X"?

	dsinclair 2014/04/08 19:42:44
	We create the nested slices based on timestamps, not based on nesting. I believe you could just output a "ph": "X" here and drop the "ph": "E" event below. (It doesn't really matter; all X events do is decrease the number of events we have to emit. One X event is functionally equivalent to a B/E pair.)
	141 "pid": 1,

	142 "tid": thread_id,

	143 "name": function,

	144 "ts": start_time,

	145 "ph": "B",

	146 })

	147

	148 json_for_subtree(children, trace_data, total_samples_for_all_threads,

	149 total_samples_per_thread)

	150

	151 trace_data.append({

	152 "pid": 1,

	153 "tid": thread_id,

	154 "name": function,

	155 "ts": end_time,

	156 "ph": "E",

	157 "args": {

	158 "process percent": process_percent,

	159 "thread percent": thread_percent

	160 },

	161 })
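
dsinclair's suggestion above, emitting a Complete ("X") event for every node and relying on the viewer to nest slices by timestamp, might look roughly like the sketch below. It assumes the tree shape built by add_sample, omits the percentage args for brevity, and uses the hypothetical name complete_events_for_subtree. As noted in the thread, it would still need to be checked in the trace viewer; the code is Python 3.

```python
def complete_events_for_subtree(node, trace_data):
    """Emit one "X" event per tree node, parents before children."""
    for key in sorted(node):
        function, thread_id = key.split(';')
        start_time = int(node[key]['start_time'])
        end_time = int(node[key]['end_time'])
        trace_data.append({
            "pid": 1,
            "tid": thread_id,
            "name": function,
            "ts": start_time,
            "dur": end_time - start_time,
            "ph": "X",
        })
        # No matching "E" event is needed; recurse straight into children.
        complete_events_for_subtree(node[key]['children'], trace_data)
```

This removes the leaf special case and emits one event per node instead of two for interior nodes.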

	162

	163 def stringified_json_output(lines):

	164 tree, total_samples_for_all_threads, total_samples_per_thread = (

	165 compute_tree(collapse_perf_script_output(lines)))

	166 trace_data = []

	167 json_for_subtree(tree, trace_data, total_samples_for_all_threads,

	168 total_samples_per_thread)

	169 return json.dumps(trace_data, separators=(',',':'))

	170

	171 if __name__ == "__main__":

	172 print stringified_json_output(sys.stdin.readlines())
