tools/findit/blame.py - Issue 421223003: [Findit] Plain objects to represent the returned result from running the algorithm,

Side by Side Diff: tools/findit/blame.py

Issue 421223003: [Findit] Plain objects to represent the returned result from running the algorithm, (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master

Patch Set: addressed codereview. Created 6 years, 4 months ago


OLD	NEW
(Empty)
	1 # Copyright (c) 2014 The Chromium Authors. All rights reserved.

	2 # Use of this source code is governed by a BSD-style license that can be

	3 # found in the LICENSE file.

	4

	5 from threading import Lock, Thread

	6

	7 import crash_utils

	8 import utils

	9

	10

	11 class Blame(object):

	12 """Represents a blame object.

	13

	14 The object contains blame information for one line of stack, and this

	15 information is shown when there are no CLs that change the crashing files.

	16 Attributes:

	17 content: The content of the line to find the blame for.
	aarya 2014/08/14 20:22:20 s/content/line_content
	jeun 2014/08/14 22:16:48 Done.
	18 component_name: The name of the component this line is in.
	aarya 2014/08/14 21:14:39 s/this line is in/for this line.
	jeun 2014/08/14 22:16:48 Done.
	19 stack_frame_index: The stack frame index of this file.

	20 file_name: The name of the file.

	21 line_number: The line that caused a crash.

	22 author: The author of this line on the latest revision.

	23 crash_revision: The revision that caused the crash.

	24 revision: The latest revision of this line before the crash revision.

	25 url: The url of the change for the revision.

	26 range_start: The starting range of the regression for this component.

	27 range_end: The ending range of the regression.

	28

	29 """

	30

	31 def __init__(self, content, component_name, stack_frame_index, file_name,

	32 line_number, author, crash_revision, revision, url,

	33 range_start, range_end):

	34 # Set all the variables from the arguments.

	35 self.content = content

	36 self.component_name = component_name

	37 self.stack_frame_index = stack_frame_index

	38 self.file = file_name

	39 self.line_number = line_number

	40 self.author = author

	41 self.revision = revision

	42 self.url = url

	43 self.distance = crash_utils.INFINITY

	44 self.range_start = range_start

	45 self.range_end = range_end

	46 revision = int(revision)

	47

	48 # Calculate the distance, where it measures how far the last revision is
	aarya 2014/08/14 20:22:20 Can you stop by and help me understand what you mean by distance and how you are using it all over the place?
	jeun 2014/08/14 22:16:48 Done.
	49 # from the regression.

	50 if range_start and range_end:

	51 self.distance = min(abs(revision - range_start),

	52 abs(revision - range_end))

	53

	54 # If the regression is in SVN but it does not have regression info, check

	55 # how far the last revision is from crash revision.

	56 elif not utils.IsGitHash(crash_revision):

	57 self.distance = abs(int(crash_revision) - revision)

	58

	59
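For readers following the thread on code line 48, the distance logic in Blame.__init__ can be restated as a standalone function. This is a minimal sketch, not the patch's code. INFINITY and is_git_hash are hypothetical stand-ins for crash_utils.INFINITY and utils.IsGitHash, whose implementations are not shown on this page. The sketch also converts the revision to an integer only inside the SVN branches, which is an assumption about what Git revisions look like.

```python
# Hypothetical stand-in for crash_utils.INFINITY (the real value is not shown here).
INFINITY = float('inf')


def is_git_hash(revision):
    """Hypothetical stand-in for utils.IsGitHash.

    Assumption: SVN revisions are purely numeric, anything else is a Git hash.
    """
    return not str(revision).isdigit()


def blame_distance(revision, crash_revision, range_start=None, range_end=None):
    """Returns how far 'revision' is from the regression range or the crash."""
    # With a regression range, measure to the nearer end of the range.
    if range_start and range_end:
        revision = int(revision)
        return min(abs(revision - range_start), abs(revision - range_end))

    # SVN without a regression range: measure to the crash revision.
    if not is_git_hash(crash_revision):
        return abs(int(crash_revision) - int(revision))

    # Git hashes are not ordered, so no distance can be computed.
    return INFINITY
```

Under this formula a revision inside the range still gets a nonzero distance, because the measurement is to the nearer end of the range rather than to the range as a whole.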

	60 class BlameList(object):

	61 """Represents a list of blame objects.

	62

	63 Thread-safe.

	64 """

	65

	66 def __init__(self):

	67 self.blame_list = []

	68 self.blame_list_lock = Lock()

	69

	70 def __getitem__(self, index):

	71 return self.blame_list[index]

	72

	73 def FindBlame(self, callstack, crash_revision_dict, regression_dict, parsers,

	74 top_n_frames=10):

	75 """Given a stack within a stacktrace, retrieves blame information.

	76

	77 Results are returned for only the first 'top_n_frames' frames, or for the

	78 whole stack if it is shorter. The default value of 'top_n_frames' is 10.

	79

	80 Args:

	81 callstack: The list of stack frames.

	82 crash_revision_dict: A dictionary that maps component to its crash

	83 revision.

	84 regression_dict: A dictionary that maps component to its revision

	85 range.

	86 parsers: A list of two parsers: svn_parser at index 0, git_parser at 1.

	87 top_n_frames: A number of stack frames to show the blame result for.

	88 """

	89 # Only return blame information for first 'top_n_frames' frames.

	90 stack_frames = callstack.GetTopNFrames(top_n_frames)

	91

	92 threads = []

	93 # Iterate through frames in stack.

	94 for stack_frame in stack_frames:

	95 # If the component this line is from does not have a crash revision,

	96 # It is not possible to get blame information so ignore this line.
	aarya 2014/08/14 20:22:20 s/It/it; s/so/, so
	jeun 2014/08/14 22:16:47 Done.
	97 component_path = stack_frame.component_path

	98 if component_path not in crash_revision_dict:

	99 continue

	100

	101 crash_revision = crash_revision_dict[component_path]['revision']

	102 range_start = None

	103 range_end = None

	104 is_git = utils.IsGitHash(crash_revision)

	105

	106 repository_parser = parsers[1 if is_git else 0]
	aarya 2014/08/14 20:22:20 parsers should be a dict: parsers['git'] and parsers['svn']. Don't use 0 and 1.
	jeun 2014/08/14 22:16:47 Done.
	107

	108 # If the revision is in SVN, and if regression information is available,

	109 # get it. Not for Git because we cannot calculate the distance.

	110 if not is_git:

	111 if regression_dict and component_path in regression_dict:

	112 component_object = regression_dict[component_path]

	113 range_start = int(component_object['old_revision'])

	114 range_end = int(component_object['new_revision'])

	115

	116 # Generate blame entry, one thread for one entry.

	117 blame_thread = Thread(

	118 target=self.__GenerateBlameEntry,

	119 args=[repository_parser, stack_frame, crash_revision,

	120 range_start, range_end])

	121 threads.append(blame_thread)

	122 blame_thread.start()

	123

	124 # Join the results before returning.

	125 for blame_thread in threads:

	126 blame_thread.join()

	127

	128 def __GenerateBlameEntry(self, repository_parser, stack_frame,
	aarya 2014/08/14 20:22:21 Why does the name start with __?
	jeun 2014/08/14 22:16:48 It is because the function is used only in this class.
	129 crash_revision, range_start, range_end):

	130 """Generates blame list from the arguments."""

	131 stack_frame_index = stack_frame.index

	132 component_path = stack_frame.component_path

	133 component_name = stack_frame.component_name

	134 file_name = stack_frame.file_name

	135 file_path = stack_frame.file_path

	136 line = stack_frame.crashed_line_number
	aarya 2014/08/14 20:22:20 s/crashed_line_number/crash_line_number; s/line/crash_line_number. Better to use the same naming convention.
	jeun 2014/08/14 22:16:48 Done.
	137

	138 # Parse blame information.

	139 parsed_blame_info = repository_parser.ParseBlameInfo(

	140 component_path, file_path, line, crash_revision)

	141

	142 # If it fails to retrieve information, do not do anything.

	143 if not parsed_blame_info:
	aarya 2014/08/14 20:22:21 Can you check the list length so that we don't error on the next line?
	jeun 2014/08/14 22:16:48 Done.
	144 return

	145

	146 # Create blame object from the parsed info and add it to the list.

	147 (content, revision, author, url) = parsed_blame_info

	148 blame = Blame(content, component_name, stack_frame_index, file_name, line,

	149 author, crash_revision, revision, url,

	150 range_start, range_end)

	151

	152 with self.blame_list_lock:

	153 self.blame_list.append(blame)

	154
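On the question of the leading double underscore raised at code line 128: in Python, a method name that starts with two underscores and does not end with two is name-mangled inside the class body, whereas a single leading underscore is only a naming convention. The sketch below uses a hypothetical class, unrelated to the classes in this patch, to show the difference.

```python
class ListSketch(object):
    """Hypothetical class used only to show name mangling."""

    def __Mangled(self):
        return 'mangled'

    def _Conventional(self):
        return 'conventional'

    def Call(self):
        # Inside the class body, self.__Mangled resolves to the mangled name.
        return self.__Mangled()


sketch = ListSketch()
inside = sketch.Call()
# From outside the class, only the mangled attribute name exists.
has_plain_name = hasattr(sketch, '__Mangled')
has_mangled_name = hasattr(sketch, '_ListSketch__Mangled')
```

A single underscore is usually enough to signal "used only in this class"; the double underscore mainly helps avoid name clashes with subclasses.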

	155 def FilterAndSortBlameList(self):

	156 """Filters and sorts the blame list."""

	157 # Sort the blame list by its distance, and its position in stack.

	158 self.blame_list.sort(key=lambda blame: (blame.distance,

	159 blame.stack_frame_index))

	160

	161 filtered_blame_list = []

	162

	163 for blame in self.blame_list:

	164 # If regression information is available, check if it needs to be

	165 # filtered.

	166 if blame.range_start and blame.range_end:

	167

	168 # Discards results that are too far from the regression.

	169 # For example, if regression is 10000:11000, it is very

	170 # unlikely that a commit from revision 1000 would have caused a crash.

	171 if (blame.distance > blame.range_start / 4) and (
	aarya 2014/08/14 20:22:21 This rule makes no sense. Why are you adding it, and why is it needed? How has it helped the results?
	jeun 2014/08/14 22:16:48 Done.
	172 blame.distance > blame.range_end / 4):

	173 continue

	174

	175 filtered_blame_list.append(blame)

	176 self.blame_list = filtered_blame_list
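To make the rule questioned at code line 171 concrete, here is a small sketch of the sort and filter steps applied to the example from the code comment (regression range 10000:11000). BlameSketch is a hypothetical record holding only the fields the filter reads, not the patch's Blame class. The sketch uses floor division on the assumption that the patch runs under Python 2, where / on integers floors.

```python
import collections

# Hypothetical minimal record; the real Blame class carries more fields.
BlameSketch = collections.namedtuple(
    'BlameSketch', ['distance', 'stack_frame_index', 'range_start', 'range_end'])


def is_too_far(blame):
    """Mirrors the check on code lines 171-172, using floor division."""
    return (blame.distance > blame.range_start // 4 and
            blame.distance > blame.range_end // 4)


def filter_and_sort(blames):
    """Sorts by (distance, stack_frame_index) and drops far-away entries."""
    ordered = sorted(blames, key=lambda b: (b.distance, b.stack_frame_index))
    return [b for b in ordered
            if not (b.range_start and b.range_end and is_too_far(b))]


blames = [
    BlameSketch(9000, 0, 10000, 11000),   # revision 1000: dropped
    BlameSketch(200, 2, 10000, 11000),    # revision 10200: kept
    BlameSketch(200, 1, 10000, 11000),    # same distance, earlier frame first
]
result = filter_and_sort(blames)
```

If range_start is never greater than range_end, the second comparison implies the first, so the rule reduces to distance > range_end // 4.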

OLD	NEW
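The suggestion at code line 106 to key the parsers by name could look roughly like the sketch below. The parser values and looks_like_git are hypothetical stand-ins, since the real parser objects and utils.IsGitHash are not shown on this page, and select_parser is an illustrative helper rather than a function from the patch.

```python
def select_parser(parsers, crash_revision, is_git_hash):
    """Picks the repository parser by name instead of by list position."""
    return parsers['git' if is_git_hash(crash_revision) else 'svn']


# Hypothetical stand-ins: real parser objects and utils.IsGitHash are not shown.
parsers = {'svn': 'svn_parser', 'git': 'git_parser'}
looks_like_git = lambda revision: not str(revision).isdigit()

chosen = select_parser(parsers, '289000', looks_like_git)
```

Named keys remove the need to remember which index holds which parser, which is the readability concern behind the review comment.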
