tools/auto_bisect/bisect_results.py - Issue 850013004: Obtain confidence score based off last known good and first known bad revision results.

Side by Side Diff: tools/auto_bisect/bisect_results.py

Issue 850013004: Obtain confidence score based off last known good and first known bad revision results. (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master

Patch Set: Created 5 years, 11 months ago


OLD	NEW
1 # Copyright 2014 The Chromium Authors. All rights reserved.	1 # Copyright 2014 The Chromium Authors. All rights reserved.

2 # Use of this source code is governed by a BSD-style license that can be	2 # Use of this source code is governed by a BSD-style license that can be

3 # found in the LICENSE file.	3 # found in the LICENSE file.

4	4

5 import math	5 import math

6 import os	6 import os

7	7

8 import bisect_utils	8 import bisect_utils

9 import math_utils	9 import math_utils

10 import source_control	10 import source_control

(...skipping 211 matching lines...)
222 bad_greater_than_good else not prev_greater_than_current)	222 bad_greater_than_good else not prev_greater_than_current)

223	223

224 # Only report potential regressions with high confidence.	224 # Only report potential regressions with high confidence.

225 if is_same_direction and confidence > 50:	225 if is_same_direction and confidence > 50:

226 other_regressions.append([revision_state, prev_state, confidence])	226 other_regressions.append([revision_state, prev_state, confidence])

227 previous_values.append(current_values)	227 previous_values.append(current_values)

228 prev_state = revision_state	228 prev_state = revision_state

229 return other_regressions	229 return other_regressions

230	230

231 @staticmethod	231 @staticmethod

232 def FindBreakingRevRange(revision_states):	232 def FindBreakingRevRange(revision_states):
	qyearsley 2015/01/16 18:37:03 I think it's probably worth it to add a test for this function, even though it's probably at least covered by the other test methods.
	RobertoCN 2015/01/30 21:23:32 Done.
	233 """Finds the last known good and first known bad revisions.

	234

	235 Note that since revision_states is expected to be in reverse cronological
	RobertoCN 2015/01/14 18:50:47 chronological not cronological.
	RobertoCN 2015/01/30 21:23:32 Done.
	236 order, the last known good revision is the first revision in the list that

	237 has the passed property set to 1, therefore the name

	238 `first_working_revision`. The inverse applies to `last_broken_revision`.\
	RobertoCN 2015/01/14 18:50:47 Unnecessary backslash at the end of the line (typo).
	RobertoCN 2015/01/30 21:23:32 Done.
	239 """
	qyearsley 2015/01/16 18:37:03 [Optional] You could add Args and Returns sections to this docstring which explicitly say that the input is a list of RevisionState and the output is a pair of RevisionState.
	RobertoCN 2015/01/30 21:23:32 Done.
233 first_working_revision = None	240 first_working_revision = None

234 last_broken_revision = None	241 last_broken_revision = None
	qyearsley 2015/01/16 18:37:03 [Optional] It might be less confusing to reverse the list so that it's in chronological order, and call these last_good_rev and first_bad_rev. This could be done in a separate CL, changing the other usages as well, and making the revision states list chronological order.
	RobertoCN 2015/01/30 21:23:32 Acknowledged.
235	242

236 for revision_state in revision_states:	243 for revision_state in revision_states:

237 if revision_state.passed == 1 and not first_working_revision:	244 if revision_state.passed == 1 and not first_working_revision:

238 first_working_revision = revision_state	245 first_working_revision = revision_state

239	246

240 if not revision_state.passed:	247 if not revision_state.passed:

241 last_broken_revision = revision_state	248 last_broken_revision = revision_state

242	249

243 return first_working_revision, last_broken_revision	250 return first_working_revision, last_broken_revision

244	251
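The review thread on FindBreakingRevRange asks for a dedicated test. A minimal sketch of such a check follows. It assumes a hypothetical `RevisionState` stand-in that models only a `revision` label and the `passed` flag (the real class in the bisect tool carries more state), and the function body is copied from the NEW column of the diff rather than imported from the module.

```python
import collections

# Hypothetical stand-in for the real RevisionState class; only the fields
# that the function reads are modelled here.
RevisionState = collections.namedtuple('RevisionState', ['revision', 'passed'])


def find_breaking_rev_range(revision_states):
  """Mirrors FindBreakingRevRange from the NEW column of the diff.

  Args:
    revision_states: RevisionState objects in reverse chronological order.

  Returns:
    A (first_working_revision, last_broken_revision) pair.
  """
  first_working_revision = None
  last_broken_revision = None

  for revision_state in revision_states:
    if revision_state.passed == 1 and not first_working_revision:
      first_working_revision = revision_state

    if not revision_state.passed:
      last_broken_revision = revision_state

  return first_working_revision, last_broken_revision


# Newest revision first: r5 and r4 are bad, r3 through r1 are good.
states = [
    RevisionState('r5', 0),
    RevisionState('r4', 0),
    RevisionState('r3', 1),
    RevisionState('r2', 1),
    RevisionState('r1', 1),
]
good, bad = find_breaking_rev_range(states)
print(good.revision, bad.revision)  # r3 r4
```

With the list in reverse chronological order, the first passing entry encountered (r3) is the last known good revision, and the last failing entry encountered (r4) is the first known bad one, which is the naming quirk the docstring describes.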

(...skipping 35 matching lines...)
280	287

281 regression_size = 100 * math_utils.RelativeChange(mean_of_good_runs,	288 regression_size = 100 * math_utils.RelativeChange(mean_of_good_runs,

282 mean_of_bad_runs)	289 mean_of_bad_runs)

283 if math.isnan(regression_size):	290 if math.isnan(regression_size):

284 regression_size = 'zero-to-nonzero'	291 regression_size = 'zero-to-nonzero'

285	292

286 regression_std_err = math.fabs(math_utils.PooledStandardError(	293 regression_std_err = math.fabs(math_utils.PooledStandardError(

287 [working_mean, broken_mean]) /	294 [working_mean, broken_mean]) /

288 max(0.0001, min(mean_of_good_runs, mean_of_bad_runs))) * 100.0	295 max(0.0001, min(mean_of_good_runs, mean_of_bad_runs))) * 100.0

289	296
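The `'zero-to-nonzero'` label above applies when the relative change is not a number. A minimal sketch of that path follows. `relative_change` is a hypothetical stand-in for `math_utils.RelativeChange`, whose implementation is not shown in this diff; the behaviour assumed here is that it returns NaN when the baseline mean is zero.

```python
import math


def relative_change(before, after):
  """Hypothetical stand-in for math_utils.RelativeChange (not in the diff).

  Assumed behaviour: 0 for equal inputs, NaN when the baseline is zero,
  otherwise the absolute fractional change.
  """
  if before == after:
    return 0.0
  if before == 0:
    return float('nan')
  return math.fabs((after - before) / before)


def regression_size(mean_of_good_runs, mean_of_bad_runs):
  """Returns the regression size in percent, or a label when undefined."""
  size = 100 * relative_change(mean_of_good_runs, mean_of_bad_runs)
  if math.isnan(size):
    return 'zero-to-nonzero'
  return size


print(regression_size(10.0, 12.5))  # 25.0
print(regression_size(0.0, 3.0))    # zero-to-nonzero
```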

290 # Give a "confidence" in the bisect. At the moment we use how distinct the	297 # Give a "confidence" in the bisect. Currently, we consider the values of

291 # values are before and after the last broken revision, and how noisy the	298 # only the revisions at the breaking range (last known good and first known

292 # overall graph is.	299 # bad) see the note in the docstring for FindBreakingRange.

293 confidence_params = (sum(working_means, []), sum(broken_means, []))	300 confidence_params = (

	301 sum([first_working_rev.value['values']], []),

	302 sum([last_broken_rev.value['values']], [])

	303 )

294 confidence = cls.ConfidenceScore(*confidence_params)	304 confidence = cls.ConfidenceScore(*confidence_params)

295	305

296 bad_greater_than_good = mean_of_bad_runs > mean_of_good_runs	306 bad_greater_than_good = mean_of_bad_runs > mean_of_good_runs

297	307

298 return {'regression_size': regression_size,	308 return {'regression_size': regression_size,

299 'regression_std_err': regression_std_err,	309 'regression_std_err': regression_std_err,

300 'confidence': confidence,	310 'confidence': confidence,

301 'bad_greater_than_good': bad_greater_than_good}	311 'bad_greater_than_good': bad_greater_than_good}
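The change to `confidence_params` in this patch swaps the samples passed to ConfidenceScore: the OLD column flattens the values of every good and every bad revision, while the NEW column passes only the values of the two revisions at the breaking range. The sketch below uses made-up numbers to show what each `sum(..., [])` expression produces. It does not reproduce ConfidenceScore itself, which is outside the visible diff.

```python
# Made-up sample values, for illustration only.
working_means = [[10.0, 10.2], [9.9, 10.1], [10.0, 10.3]]  # all good revisions
broken_means = [[12.1, 12.4], [12.0, 12.2]]                # all bad revisions

# OLD: sum(list_of_lists, []) concatenates the inner lists, so every run of
# every good revision and every bad revision ends up in the two samples.
old_params = (sum(working_means, []), sum(broken_means, []))

# NEW: only the values at the breaking range are used. Wrapping a single
# list and summing it onto [] yields a new list with the same elements.
first_working_values = [10.0, 10.3]  # values of the last known good revision
last_broken_values = [12.1, 12.4]    # values of the first known bad revision
new_params = (
    sum([first_working_values], []),
    sum([last_broken_values], []),
)

print(old_params[0])  # [10.0, 10.2, 9.9, 10.1, 10.0, 10.3]
print(new_params[0])  # [10.0, 10.3]
```

Because `sum([x], [])` on a single wrapped list only copies `x`, the NEW expression is equivalent to passing `list(x)`. The wrapper appears to be kept for symmetry with the OLD form.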
