Chromium Code Reviews

Side by Side Diff: tools/auto_bisect/bisect_results.py

Issue 850013004: Obtain confidence score based off last known good and first known bad revision results. (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master
Patch Set: Created 5 years, 11 months ago
OLD NEW
1 # Copyright 2014 The Chromium Authors. All rights reserved. 1 # Copyright 2014 The Chromium Authors. All rights reserved.
2 # Use of this source code is governed by a BSD-style license that can be 2 # Use of this source code is governed by a BSD-style license that can be
3 # found in the LICENSE file. 3 # found in the LICENSE file.
4 4
5 import math 5 import math
6 import os 6 import os
7 7
8 import bisect_utils 8 import bisect_utils
9 import math_utils 9 import math_utils
10 import source_control 10 import source_control
(...skipping 211 matching lines...)
222 bad_greater_than_good else not prev_greater_than_current) 222 bad_greater_than_good else not prev_greater_than_current)
223 223
224 # Only report potential regressions with high confidence. 224 # Only report potential regressions with high confidence.
225 if is_same_direction and confidence > 50: 225 if is_same_direction and confidence > 50:
226 other_regressions.append([revision_state, prev_state, confidence]) 226 other_regressions.append([revision_state, prev_state, confidence])
227 previous_values.append(current_values) 227 previous_values.append(current_values)
228 prev_state = revision_state 228 prev_state = revision_state
229 return other_regressions 229 return other_regressions
230 230
231 @staticmethod 231 @staticmethod
232 def FindBreakingRevRange(revision_states): 232 def FindBreakingRevRange(revision_states):
qyearsley 2015/01/16 18:37:03 I think it's probably worth it to add a test for t
RobertoCN 2015/01/30 21:23:32 Done.
233 """Finds the last known good and first known bad revisions.
234
235 Note that since revision_states is expected to be in reverse cronological
RobertoCN 2015/01/14 18:50:47 chronological not cronological.
RobertoCN 2015/01/30 21:23:32 Done.
236 order, the last known good revision is the first revision in the list that
237 has the passed property set to 1, therefore the name
238 `first_working_revision`. The inverse applies to `last_broken_revision`.\
RobertoCN 2015/01/14 18:50:47 Unnecessary backslash at the end of the line (typo).
RobertoCN 2015/01/30 21:23:32 Done.
239 """
qyearsley 2015/01/16 18:37:03 [Optional] You could add Args and Returns sections
RobertoCN 2015/01/30 21:23:32 Done.
233 first_working_revision = None 240 first_working_revision = None
234 last_broken_revision = None 241 last_broken_revision = None
qyearsley 2015/01/16 18:37:03 [Optional] It might be less confusing to reverse t
RobertoCN 2015/01/30 21:23:32 Acknowledged.
235 242
236 for revision_state in revision_states: 243 for revision_state in revision_states:
237 if revision_state.passed == 1 and not first_working_revision: 244 if revision_state.passed == 1 and not first_working_revision:
238 first_working_revision = revision_state 245 first_working_revision = revision_state
239 246
240 if not revision_state.passed: 247 if not revision_state.passed:
241 last_broken_revision = revision_state 248 last_broken_revision = revision_state
242 249
243 return first_working_revision, last_broken_revision 250 return first_working_revision, last_broken_revision
244 251
(...skipping 35 matching lines...)
280 287
281 regression_size = 100 * math_utils.RelativeChange(mean_of_good_runs, 288 regression_size = 100 * math_utils.RelativeChange(mean_of_good_runs,
282 mean_of_bad_runs) 289 mean_of_bad_runs)
283 if math.isnan(regression_size): 290 if math.isnan(regression_size):
284 regression_size = 'zero-to-nonzero' 291 regression_size = 'zero-to-nonzero'
285 292
286 regression_std_err = math.fabs(math_utils.PooledStandardError( 293 regression_std_err = math.fabs(math_utils.PooledStandardError(
287 [working_mean, broken_mean]) / 294 [working_mean, broken_mean]) /
288 max(0.0001, min(mean_of_good_runs, mean_of_bad_runs))) * 100.0 295 max(0.0001, min(mean_of_good_runs, mean_of_bad_runs))) * 100.0
289 296
290 # Give a "confidence" in the bisect. At the moment we use how distinct the 297 # Give a "confidence" in the bisect. Currently, we consider the values of
291 # values are before and after the last broken revision, and how noisy the 298 # only the revisions at the breaking range (last known good and first known
292 # overall graph is. 299 # bad) see the note in the docstring for FindBreakingRange.
293 confidence_params = (sum(working_means, []), sum(broken_means, [])) 300 confidence_params = (
301 sum([first_working_rev.value['values']], []),
302 sum([last_broken_rev.value['values']], [])
303 )
294 confidence = cls.ConfidenceScore(*confidence_params) 304 confidence = cls.ConfidenceScore(*confidence_params)
295 305
296 bad_greater_than_good = mean_of_bad_runs > mean_of_good_runs 306 bad_greater_than_good = mean_of_bad_runs > mean_of_good_runs
297 307
298 return {'regression_size': regression_size, 308 return {'regression_size': regression_size,
299 'regression_std_err': regression_std_err, 309 'regression_std_err': regression_std_err,
300 'confidence': confidence, 310 'confidence': confidence,
301 'bad_greater_than_good': bad_greater_than_good} 311 'bad_greater_than_good': bad_greater_than_good}
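To make the change to `confidence_params` concrete, here is a minimal sketch with made-up sample values. `ConfidenceScore` is not shown in this diff, so the sketch stops at building its two arguments; the lists `good_runs` and `bad_runs` and their contents are assumptions for illustration only, ordered newest first as in `revision_states`.

```python
# Made-up per-revision sample values, newest revision first.
bad_runs = [[12.1, 11.9], [12.0, 12.3]]               # last entry: first known bad
good_runs = [[10.0, 10.1], [10.1, 9.9], [10.0, 10.2]]  # first entry: last known good

# Old behavior: pool the values of every good run and every bad run.
old_params = (sum(good_runs, []), sum(bad_runs, []))

# New behavior: use only the two revisions at the breaking range.
last_known_good = good_runs[0]
first_known_bad = bad_runs[-1]
new_params = (
    sum([last_known_good], []),
    sum([first_known_bad], []),
)

print(old_params)
print(new_params)
```

Note that `sum([x], [])` over a single list just returns a new list equal to `x`, so in the new code the wrapping mainly preserves the shape of the old expression.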
OLDNEW