Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(98)

Side by Side Diff: tools/auto_bisect/bisect_results.py

Issue 850013004: Obtain confidence score based off last known good and first known bad revision results. (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master
Patch Set: Addressing feedback Created 5 years, 10 months ago
Use n/p to move between diff chunks; N/P to move between comments. Draft comments are only viewable by you.
Jump to:
View unified diff | Download patch
« no previous file with comments | « tools/auto_bisect/bisect_perf_regression.py ('k') | tools/auto_bisect/bisect_results_test.py » ('j') | no next file with comments »
Toggle Intra-line Diffs ('i') | Expand Comments ('e') | Collapse Comments ('c') | Show Comments Hide Comments ('s')
OLDNEW
1 # Copyright 2014 The Chromium Authors. All rights reserved. 1 # Copyright 2014 The Chromium Authors. All rights reserved.
2 # Use of this source code is governed by a BSD-style license that can be 2 # Use of this source code is governed by a BSD-style license that can be
3 # found in the LICENSE file. 3 # found in the LICENSE file.
4 4
5 import math 5 import math
6 import os 6 import os
7 7
8 import bisect_utils 8 import bisect_utils
9 import math_utils 9 import math_utils
10 import source_control 10 import source_control
(...skipping 212 matching lines...) Expand 10 before | Expand all | Expand 10 after
223 223
224 # Only report potential regressions with high confidence. 224 # Only report potential regressions with high confidence.
225 if is_same_direction and confidence > 50: 225 if is_same_direction and confidence > 50:
226 other_regressions.append([revision_state, prev_state, confidence]) 226 other_regressions.append([revision_state, prev_state, confidence])
227 previous_values.append(current_values) 227 previous_values.append(current_values)
228 prev_state = revision_state 228 prev_state = revision_state
229 return other_regressions 229 return other_regressions
230 230
231 @staticmethod 231 @staticmethod
232 def FindBreakingRevRange(revision_states): 232 def FindBreakingRevRange(revision_states):
233 """Finds the last known good and first known bad revisions.
234
235 Note that since revision_states is expected to be in reverse chronological
236 order, the last known good revision is the first revision in the list that
237 has the `passed` property set to 1; hence the name
238 `first_working_revision`. The inverse applies to `last_broken_revision`.
239
240 Args:
241 revision_states: A list of RevisionState instances.
242
243 Returns:
244 A tuple containing the two revision states at the border. (Last
245 known good and first known bad.)
246 """
233 first_working_revision = None 247 first_working_revision = None
234 last_broken_revision = None 248 last_broken_revision = None
235 249
236 for revision_state in revision_states: 250 for revision_state in revision_states:
237 if revision_state.passed == 1 and not first_working_revision: 251 if revision_state.passed == 1 and not first_working_revision:
238 first_working_revision = revision_state 252 first_working_revision = revision_state
239 253
240 if not revision_state.passed: 254 if not revision_state.passed:
241 last_broken_revision = revision_state 255 last_broken_revision = revision_state
242 256
(...skipping 37 matching lines...) Expand 10 before | Expand all | Expand 10 after
280 294
281 regression_size = 100 * math_utils.RelativeChange(mean_of_good_runs, 295 regression_size = 100 * math_utils.RelativeChange(mean_of_good_runs,
282 mean_of_bad_runs) 296 mean_of_bad_runs)
283 if math.isnan(regression_size): 297 if math.isnan(regression_size):
284 regression_size = 'zero-to-nonzero' 298 regression_size = 'zero-to-nonzero'
285 299
286 regression_std_err = math.fabs(math_utils.PooledStandardError( 300 regression_std_err = math.fabs(math_utils.PooledStandardError(
287 [working_mean, broken_mean]) / 301 [working_mean, broken_mean]) /
288 max(0.0001, min(mean_of_good_runs, mean_of_bad_runs))) * 100.0 302 max(0.0001, min(mean_of_good_runs, mean_of_bad_runs))) * 100.0
289 303
290 # Give a "confidence" in the bisect. At the moment we use how distinct the 304 # Give a "confidence" in the bisect. Currently, we consider the values of
291 # values are before and after the last broken revision, and how noisy the 305 # only the revisions at the breaking range (last known good and first known
292 # overall graph is. 306 # bad); see the note in the docstring for FindBreakingRevRange.
293 confidence_params = (sum(working_means, []), sum(broken_means, [])) 307 confidence_params = (
308 sum([first_working_rev.value['values']], []),
309 sum([last_broken_rev.value['values']], [])
310 )
294 confidence = cls.ConfidenceScore(*confidence_params) 311 confidence = cls.ConfidenceScore(*confidence_params)
295 312
296 bad_greater_than_good = mean_of_bad_runs > mean_of_good_runs 313 bad_greater_than_good = mean_of_bad_runs > mean_of_good_runs
297 314
298 return {'regression_size': regression_size, 315 return {'regression_size': regression_size,
299 'regression_std_err': regression_std_err, 316 'regression_std_err': regression_std_err,
300 'confidence': confidence, 317 'confidence': confidence,
301 'bad_greater_than_good': bad_greater_than_good} 318 'bad_greater_than_good': bad_greater_than_good}
OLDNEW
« no previous file with comments | « tools/auto_bisect/bisect_perf_regression.py ('k') | tools/auto_bisect/bisect_results_test.py » ('j') | no next file with comments »

Powered by Google App Engine
This is Rietveld 408576698