scripts/slave/recipe_modules/auto_bisect/revision_state.py - Issue 1825993003: Making check_initial_confidence verify return_code bisects.

Side by Side Diff: scripts/slave/recipe_modules/auto_bisect/revision_state.py

Issue 1825993003: Making check_initial_confidence verify return_code bisects. (Closed) Base URL: https://chromium.googlesource.com/chromium/tools/build.git@master

Patch Set: Adressing feedback. Created 4 years, 9 months ago


OLD	NEW
1 # Copyright 2015 The Chromium Authors. All rights reserved.	1 # Copyright 2015 The Chromium Authors. All rights reserved.

2 # Use of this source code is governed by a BSD-style license that can be	2 # Use of this source code is governed by a BSD-style license that can be

3 # found in the LICENSE file.	3 # found in the LICENSE file.

4	4

5 """An interface for holding state and result of revisions in a bisect job.	5 """An interface for holding state and result of revisions in a bisect job.

6	6

7 When implementing support for tests other than perf, one should extend this	7 When implementing support for tests other than perf, one should extend this

8 class so that the bisect module and recipe can use it.	8 class so that the bisect module and recipe can use it.

9	9

10 See perf_revision_state for an example.	10 See perf_revision_state for an example.

(...skipping 72 matching lines...)
83 self.depot, self.commit_hash)	83 self.depot, self.commit_hash)

84 self.deps_sha = hashlib.sha1(self.deps_patch).hexdigest()	84 self.deps_sha = hashlib.sha1(self.deps_patch).hexdigest()

85 self.deps_sha_patch = self.bisector.make_deps_sha_file(self.deps_sha)	85 self.deps_sha_patch = self.bisector.make_deps_sha_file(self.deps_sha)

86 self.deps = dict(base_revision.deps)	86 self.deps = dict(base_revision.deps)

87 self.deps[self.depot_name] = self.commit_hash	87 self.deps[self.depot_name] = self.commit_hash

88 else:	88 else:

89 self.needs_patch = False	89 self.needs_patch = False

90 self.build_url = self.bisector.get_platform_gs_prefix() + self._gs_suffix()	90 self.build_url = self.bisector.get_platform_gs_prefix() + self._gs_suffix()

91 self.values = []	91 self.values = []

92 self.mean_value = None	92 self.mean_value = None

	93 self.overall_return_code = None

93 self.std_dev = None	94 self.std_dev = None

94 self.repeat_count = MINIMUM_SAMPLE_SIZE	95 self.repeat_count = MINIMUM_SAMPLE_SIZE

95 self._test_config = None	96 self._test_config = None

96 self.build_number = None	97 self.build_number = None

97	98

98 @property	99 @property

99 def tested(self):	100 def tested(self):

100 return self.status in (RevisionState.TESTED,)	101 return self.status in (RevisionState.TESTED,)

101	102

102 @property	103 @property

(...skipping 220 matching lines...)
323 # the 'error' key explaining the type of error.	324 # the 'error' key explaining the type of error.

324 results = test_results['results']	325 results = test_results['results']

325 if results.get('errors'):	326 if results.get('errors'):

326 self.status = RevisionState.FAILED	327 self.status = RevisionState.FAILED

327 if 'MISSING_METRIC' in results.get('errors'): # pragma: no cover	328 if 'MISSING_METRIC' in results.get('errors'): # pragma: no cover

328 self.bisector.surface_result('MISSING_METRIC')	329 self.bisector.surface_result('MISSING_METRIC')

329 return	330 return

330 self.values += results['values']	331 self.values += results['values']

331 if self.bisector.is_return_code_mode():	332 if self.bisector.is_return_code_mode():

332 retcodes = test_results['retcodes']	333 retcodes = test_results['retcodes']

333 overall_return_code = 0 if all(v == 0 for v in retcodes) else 1	334 self.overall_return_code = 0 if all(v == 0 for v in retcodes) else 1

334 self.mean_value = overall_return_code	335 # Keeping mean_value for compatibility with dashboard.

	336 # TODO(robertocn): refactor mean_value, specially when uploading results

	337 # to dashboard.
	qyearsley 2016/03/23 22:09:36
	What do you think we should do specifically when uploading to the perf dashboard? I see two main choices:
	1. In Bisector._revision_data (https://code.google.com/p/chromium/codesearch#chromium/build/scripts/slave/re...), send a field "value" or "overall_return_code" instead of "mean_value", and also update the perf dashboard to deal with that. (I like this option a bit better.)
	2. In Bisector._revision_data, send either overall_return_code or mean_value, but keep using the key "mean_value" so that the perf dashboard doesn't have to be changed.
	Do you agree?
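The two choices in the comment above can be sketched roughly as follows. This is a minimal illustration, not the actual Bisector._revision_data: the function names, the dict-based revision, and the 'revision' and 'commit_hash' keys are assumptions made for the example. Only mean_value, overall_return_code, and the all-zero aggregation rule come from the patch.

```python
def aggregate_return_code(retcodes):
    """Collapse per-run return codes to 0 (every run returned 0) or 1."""
    # Note: all() of an empty list is True, so an empty retcodes list yields 0.
    return 0 if all(code == 0 for code in retcodes) else 1


def revision_data_dedicated_key(revision, return_code_mode):
    """Option 1 (hypothetical): send the return code under its own key."""
    data = {'revision': revision['commit_hash']}
    if return_code_mode:
        data['overall_return_code'] = revision['overall_return_code']
    else:
        data['mean_value'] = revision['mean_value']
    return data


def revision_data_legacy_key(revision, return_code_mode):
    """Option 2 (hypothetical): always reuse the 'mean_value' key."""
    if return_code_mode:
        value = revision['overall_return_code']
    else:
        value = revision['mean_value']
    return {'revision': revision['commit_hash'], 'mean_value': value}


# Hypothetical revision record; one of three runs failed.
rev = {
    'commit_hash': 'abc123',
    'overall_return_code': aggregate_return_code([0, 0, 1]),
    'mean_value': None,
}
option_1 = revision_data_dedicated_key(rev, return_code_mode=True)
option_2 = revision_data_legacy_key(rev, return_code_mode=True)
```

With option 1 the consumer has to learn a new key; with option 2 the payload shape is unchanged, but "mean_value" no longer always holds a mean.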
	338 self.mean_value = self.overall_return_code

335 elif self.values:	339 elif self.values:

336 api = self.bisector.api	340 api = self.bisector.api

337 self.mean_value = api.m.math_utils.mean(self.values)	341 self.mean_value = api.m.math_utils.mean(self.values)

338 self.std_dev = api.m.math_utils.standard_deviation(self.values)	342 self.std_dev = api.m.math_utils.standard_deviation(self.values)

339 # Values were not found, but the test did not otherwise fail.	343 # Values were not found, but the test did not otherwise fail.

340 else:	344 else:

341 self.status = RevisionState.FAILED	345 self.status = RevisionState.FAILED

342 self.bisector.surface_result('MISSING_METRIC')	346 self.bisector.surface_result('MISSING_METRIC')

343 return	347 return

344 # If we have already decided on the goodness of this revision, we shouldn't	348 # If we have already decided on the goodness of this revision, we shouldn't

(...skipping 156 matching lines...)
501 significantly different as a precondition.	505 significantly different as a precondition.

502	506

503 Returns:	507 Returns:

504 True if the results of testing this revision are significantly different	508 True if the results of testing this revision are significantly different

505 from those of testing the earliest known bad revision.	509 from those of testing the earliest known bad revision.

506 False if they are instead significantly different from those of testing	510 False if they are instead significantly different from those of testing

507 the latest known good revision.	511 the latest known good revision.

508 """	512 """

509	513

510 if self.bisector.is_return_code_mode():	514 if self.bisector.is_return_code_mode():

511 return self.mean_value == self.bisector.lkgr.mean_value	515 return self.overall_return_code == self.bisector.lkgr.overall_return_code
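A small sketch of the return-code branch above, assuming overall_return_code is always 0 or 1 (as produced by the aggregation in this patch). The Revision tuple and the matches_lkgr helper are hypothetical stand-ins for RevisionState and the check in this method.

```python
import collections

# Hypothetical stand-in for RevisionState; only the field used here is modeled.
Revision = collections.namedtuple('Revision', ['name', 'overall_return_code'])


def matches_lkgr(revision, lkgr):
    """A revision counts as good when its return code equals the LKGR's."""
    return revision.overall_return_code == lkgr.overall_return_code


lkgr = Revision('last_known_good', 0)
fkbr = Revision('first_known_bad', 1)
passing = Revision('candidate_a', 0)
failing = Revision('candidate_b', 1)

# The comparison is relative to the LKGR, so it also works when the
# "good" end of the range is the failing one (e.g. bisecting a fix).
lkgr_failing = Revision('last_known_good', 1)
```

Because the code is binary, a revision that does not match the LKGR necessarily matches the FKBR whenever the two ends of the range differ.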

512	516

513 while True:	517 while True:

514 diff_from_good = self.bisector.significantly_different(	518 diff_from_good = self.bisector.significantly_different(

515 self.bisector.lkgr.values, self.values)	519 self.bisector.lkgr.values, self.values)

516 diff_from_bad = self.bisector.significantly_different(	520 diff_from_bad = self.bisector.significantly_different(

517 self.bisector.fkbr.values, self.values)	521 self.bisector.fkbr.values, self.values)

518	522

519 if diff_from_good and diff_from_bad:	523 if diff_from_good and diff_from_bad:

520 # Multiple regressions.	524 # Multiple regressions.

521 # For now, proceed bisecting the biggest difference of the means.	525 # For now, proceed bisecting the biggest difference of the means.

(...skipping 38 matching lines...)
560 """	564 """

561 next_revision_to_test = min(self.bisector.lkgr, self, self.bisector.fkbr,	565 next_revision_to_test = min(self.bisector.lkgr, self, self.bisector.fkbr,

562 key=lambda x: len(x.values))	566 key=lambda x: len(x.values))

563 if (len(self.bisector.last_tested_revision.values) ==	567 if (len(self.bisector.last_tested_revision.values) ==

564 next_revision_to_test.values):	568 next_revision_to_test.values):

565 self.bisector.last_tested_revision.retest()	569 self.bisector.last_tested_revision.retest()

566 else:	570 else:

567 next_revision_to_test.retest()	571 next_revision_to_test.retest()
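One thing worth checking in the retest logic above: the condition compares len(last_tested_revision.values), an integer, with next_revision_to_test.values, which is initialized as a list (self.values = [] earlier in this file). Assuming values stays a list, that comparison is always False in Python, so the else branch always runs. The sketch below demonstrates this with a hypothetical stand-in class; comparing lengths on both sides is shown as one possible intent, not as the author's fix.

```python
class FakeRevision(object):
    """Hypothetical stand-in for RevisionState; only `values` matters here."""

    def __init__(self, name, values):
        self.name = name
        self.values = list(values)


def pick_as_written(last_tested, candidates):
    """Mirrors the condition as it appears in the diff."""
    next_rev = min(candidates, key=lambda r: len(r.values))
    if len(last_tested.values) == next_rev.values:  # int == list: never equal
        return last_tested
    return next_rev


def pick_comparing_lengths(last_tested, candidates):
    """Possible intent (an assumption): compare sample counts on both sides."""
    next_rev = min(candidates, key=lambda r: len(r.values))
    if len(last_tested.values) == len(next_rev.values):
        return last_tested
    return next_rev


lkgr = FakeRevision('lkgr', [1, 2])
current = FakeRevision('current', [3, 4])
fkbr = FakeRevision('fkbr', [5, 6])

# All three have two samples; min() returns the first minimal item (lkgr).
as_written = pick_as_written(current, [lkgr, current, fkbr])
by_length = pick_comparing_lengths(current, [lkgr, current, fkbr])
```

In this example the code as written retests lkgr, while the length-based variant would retest the last tested revision.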

568	572

569 def __repr__(self):	573 def __repr__(self):

	574 if self.overall_return_code is not None:

	575 return ('RevisionState(rev=%s, values=%r, overall_return_code=%r, '

	576 'std_dev=%r)') % (self.revision_string(), self.values,

	577 self.overall_return_code, self.std_dev)

570 return ('RevisionState(rev=%s, values=%r, mean_value=%r, std_dev=%r)' % (	578 return ('RevisionState(rev=%s, values=%r, mean_value=%r, std_dev=%r)' % (

571 self.revision_string(), self.values, self.mean_value, self.std_dev))	579 self.revision_string(), self.values, self.mean_value, self.std_dev))
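The new __repr__ branch can be exercised in isolation with a reduced stand-in class. RevisionStateSketch and its constructor arguments are assumptions for illustration; the format strings are copied from the diff. The `is not None` test matters because a passing overall_return_code of 0 is falsy.

```python
class RevisionStateSketch(object):
    """Hypothetical, reduced stand-in for RevisionState (repr logic only)."""

    def __init__(self, rev, values, mean_value=None, std_dev=None,
                 overall_return_code=None):
        self.rev = rev
        self.values = values
        self.mean_value = mean_value
        self.std_dev = std_dev
        self.overall_return_code = overall_return_code

    def __repr__(self):
        # `is not None` keeps a return code of 0 on the return-code branch.
        if self.overall_return_code is not None:
            return ('RevisionState(rev=%s, values=%r, overall_return_code=%r, '
                    'std_dev=%r)') % (self.rev, self.values,
                                      self.overall_return_code, self.std_dev)
        return ('RevisionState(rev=%s, values=%r, mean_value=%r, std_dev=%r)' % (
            self.rev, self.values, self.mean_value, self.std_dev))


return_code_repr = repr(RevisionStateSketch('abc', [], overall_return_code=0))
perf_repr = repr(RevisionStateSketch('abc', [1.0, 3.0], mean_value=2.0,
                                     std_dev=1.0))
```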
