Chromium Code Reviews

Side by Side Diff: tools/bisect-perf-regression.py

Issue 388623002: Reformat bisect results output. (Closed) Base URL: svn://svn.chromium.org/chrome/trunk/src
Patch Set: Created 6 years, 5 months ago
1 #!/usr/bin/env python 1 #!/usr/bin/env python
2 # Copyright (c) 2013 The Chromium Authors. All rights reserved. 2 # Copyright (c) 2013 The Chromium Authors. All rights reserved.
3 # Use of this source code is governed by a BSD-style license that can be 3 # Use of this source code is governed by a BSD-style license that can be
4 # found in the LICENSE file. 4 # found in the LICENSE file.
5 5
6 """Performance Test Bisect Tool 6 """Performance Test Bisect Tool
7 7
8 This script bisects a series of changelists using binary search. It starts at 8 This script bisects a series of changelists using binary search. It starts at
9 a bad revision where a performance metric has regressed, and asks for a last 9 a bad revision where a performance metric has regressed, and asks for a last
10 known-good revision. It will then binary search across this revision range by 10 known-good revision. It will then binary search across this revision range by
(...skipping 165 matching lines...)
176 @@ -0,0 +1 @@ 176 @@ -0,0 +1 @@
177 +%(deps_sha)s 177 +%(deps_sha)s
178 """ 178 """
179 179
180 # The possible values of the --bisect_mode flag, which determines what to 180 # The possible values of the --bisect_mode flag, which determines what to
181 # use when classifying a revision as "good" or "bad". 181 # use when classifying a revision as "good" or "bad".
182 BISECT_MODE_MEAN = 'mean' 182 BISECT_MODE_MEAN = 'mean'
183 BISECT_MODE_STD_DEV = 'std_dev' 183 BISECT_MODE_STD_DEV = 'std_dev'
184 BISECT_MODE_RETURN_CODE = 'return_code' 184 BISECT_MODE_RETURN_CODE = 'return_code'
185 185
186 # The perf dashboard specifically looks for the string
187 # "Estimated Confidence: 95%" to decide whether or not
188 # to cc the author(s). If you change this, please update the perf
189 # dashboard as well.
190 RESULTS_BANNER = """
191 ===== BISECT JOB RESULTS =====
192 Status: %(status)s
193
194 Test Command: %(command)s
195 Test Metric: %(metrics)s
196 Relative Change: %(change)s
197 Estimated Confidence: %(confidence)d%%"""
198
199 # The perf dashboard specifically looks for the string
200 # "Author : " to parse out who to cc on a bug. If you change the
201 # formatting here, please update the perf dashboard as well.
202 RESULTS_REVISION_INFO = """
203 ===== SUSPECTED CL(s) =====
204 Subject : %(subject)s
205 Author : %(author)s%(email_info)s%(commit_info)s
206 Date : %(cl_date)s"""
207
208 REPRO_STEPS_LOCAL = """
209 ==== INSTRUCTIONS TO REPRODUCE ====
210 To run locally:
211 $%(command)s"""
212
213 REPRO_STEPS_TRYJOB = """
214 To reproduce on Performance trybot:
215 1. Create new git branch or check out existing branch.
216 2. Edit tools/run-perf-test.cfg (instructions in file) or \
217 third_party/WebKit/Tools/run-perf-test.cfg.
218 a) Take care to strip any src/ directories from the head of \
219 relative path names.
220 b) On desktop, only --browser=release is supported, on android \
221 --browser=android-chromium-testshell.
222 c) Test command to use: %(command)s
223 3. Upload your patch. --bypass-hooks is necessary to upload the changes you \
224 committed locally to run-perf-test.cfg.
225 Note: *DO NOT* commit run-perf-test.cfg changes to the project repository.
226 $ git cl upload --bypass-hooks
227 4. Send your try job to the tryserver. \
228 [Please make sure to use appropriate bot to reproduce]
229 $ git cl try -m tryserver.chromium.perf -b <bot>
230
231 For more details please visit \nhttps://sites.google.com/a/chromium.org/dev/\
232 developers/performance-try-bots"""
233
234 RESULTS_THANKYOU = """
235 ===== THANK YOU FOR CHOOSING BISECT AIRLINES =====
Andrew Hayden (chromium.org) 2014/07/15 10:24:21 For posterity, this made my perf-sheriffing day.
236 Visit http://www.chromium.org/developers/core-principles for Chrome's policy
237 on perf regressions.
238 Contact chrome-perf-dashboard-team with any questions or suggestions about
239 bisecting.
240 .------.
241 .---. \ \==)
242 |PERF\ \ \\
243 | ---------'-------'-----------.
244 . 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 `-.
245 \______________.-------._______________)
246 / /
247 / /
248 / /==)
249 ._______."""
250
186 251
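The result templates added above are plain '%'-dict format strings; _PrintBanner() and _PrintRevisionInfo() later in this patch fill them in. A minimal sketch of how RESULTS_BANNER expands, using made-up sample values (the command, metric, and numbers below are illustrative only):

# Sketch only: the sample values are invented; the real dict is built in
# _PrintBanner() further down in this patch.
sample = {
    'status': 'Successful with high confidence.',
    'command': 'tools/perf/run_benchmark --browser=release sunspider',
    'metrics': 'Total/Total',
    'change': '5.42% (+/-1.37%)',
    'confidence': 99,
}
print(RESULTS_BANNER % sample)
# (leading blank line omitted)
# ===== BISECT JOB RESULTS =====
# Status: Successful with high confidence.
#
# Test Command: tools/perf/run_benchmark --browser=release sunspider
# Test Metric: Total/Total
# Relative Change: 5.42% (+/-1.37%)
# Estimated Confidence: 99%
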
187 def _AddAdditionalDepotInfo(depot_info): 252 def _AddAdditionalDepotInfo(depot_info):
188 """Adds additional depot info to the global depot variables.""" 253 """Adds additional depot info to the global depot variables."""
189 global DEPOT_DEPS_NAME 254 global DEPOT_DEPS_NAME
190 global DEPOT_NAMES 255 global DEPOT_NAMES
191 DEPOT_DEPS_NAME = dict(DEPOT_DEPS_NAME.items() + 256 DEPOT_DEPS_NAME = dict(DEPOT_DEPS_NAME.items() +
192 depot_info.items()) 257 depot_info.items())
193 DEPOT_NAMES = DEPOT_DEPS_NAME.keys() 258 DEPOT_NAMES = DEPOT_DEPS_NAME.keys()
194 259
195 260
(...skipping 3005 matching lines...)
3201 results_dict['last_broken_revision'], 3266 results_dict['last_broken_revision'],
3202 100, final_step=False) 3267 100, final_step=False)
3203 3268
3204 def _PrintConfidence(self, results_dict): 3269 def _PrintConfidence(self, results_dict):
3205 # The perf dashboard specifically looks for the string 3270 # The perf dashboard specifically looks for the string
3206 # "Confidence in Bisection Results: 100%" to decide whether or not 3271 # "Confidence in Bisection Results: 100%" to decide whether or not
3207 # to cc the author(s). If you change this, please update the perf 3272 # to cc the author(s). If you change this, please update the perf
3208 # dashboard as well. 3273 # dashboard as well.
3209 print 'Confidence in Bisection Results: %d%%' % results_dict['confidence'] 3274 print 'Confidence in Bisection Results: %d%%' % results_dict['confidence']
3210 3275
3276 def _ConfidenceLevelStatus(self, results_dict):
3277 if not results_dict['confidence']:
3278 return None
3279 confidence_status = 'Successful with %(level)s confidence%(warning)s.'
3280 if results_dict['confidence'] >= 95:
3281 level = 'high'
3282 else:
3283 level = 'low'
3284 warning = ' and warnings'
3285 if not self.warnings:
3286 warning = ''
3287 return confidence_status % {'level': level, 'warning': warning}
3288
3289 def _PrintThankYou(self):
3290 print RESULTS_THANKYOU
3291
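For reference, the two status strings that the new _ConfidenceLevelStatus() above can produce (the 95% threshold and the warning suffix come straight from the method); a standalone sketch:

# Standalone rendering of the template used by _ConfidenceLevelStatus();
# the real method also checks self.warnings before adding the suffix.
confidence_status = 'Successful with %(level)s confidence%(warning)s.'
print(confidence_status % {'level': 'high', 'warning': ''})
# Successful with high confidence.
print(confidence_status % {'level': 'low', 'warning': ' and warnings'})
# Successful with low confidence and warnings.
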
3211 def _PrintBanner(self, results_dict): 3292 def _PrintBanner(self, results_dict):
3212 print
3213 print " __o_\___ Aw Snap! We hit a speed bump!"
3214 print "=-O----O-'__.~.___________________________________"
3215 print
3216 if self._IsBisectModeReturnCode(): 3293 if self._IsBisectModeReturnCode():
3217 print ('Bisect reproduced a change in return codes while running the ' 3294 metrics = 'N/A'
3218 'performance test.') 3295 change = 'Yes'
3219 else: 3296 else:
3220 print ('Bisect reproduced a %.02f%% (+-%.02f%%) change in the ' 3297 metrics = '/'.join(self.opts.metric)
3221 '%s metric.' % (results_dict['regression_size'], 3298 change = '%.02f%% (+/-%.02f%%)' % (
3222 results_dict['regression_std_err'], '/'.join(self.opts.metric))) 3299 results_dict['regression_size'], results_dict['regression_std_err'])
3223 self._PrintConfidence(results_dict) 3300
3301 if results_dict['culprit_revisions'] and results_dict['confidence']:
3302 status = self._ConfidenceLevelStatus(results_dict)
3303 else:
3304 status = 'Failure, could not reproduce.'
3305 change = 'Bisect could not reproduce a change.'
3306
3307 print RESULTS_BANNER % {
3308 'status': status,
3309 'command': self.opts.command,
3310 'metrics': metrics,
3311 'change': change,
3312 'confidence': results_dict['confidence'],
3313 }
3314
3224 3315
3225 def _PrintFailedBanner(self, results_dict): 3316 def _PrintFailedBanner(self, results_dict):
3226 print 3317 print
3227 if self._IsBisectModeReturnCode(): 3318 if self._IsBisectModeReturnCode():
3228 print 'Bisect could not reproduce a change in the return code.' 3319 print 'Bisect could not reproduce a change in the return code.'
3229 else: 3320 else:
3230 print ('Bisect could not reproduce a change in the ' 3321 print ('Bisect could not reproduce a change in the '
3231 '%s metric.' % '/'.join(self.opts.metric)) 3322 '%s metric.' % '/'.join(self.opts.metric))
3232 print 3323 print
3233 3324
3234 def _GetViewVCLinkFromDepotAndHash(self, cl, depot): 3325 def _GetViewVCLinkFromDepotAndHash(self, cl, depot):
3235 info = self.source_control.QueryRevisionInfo(cl, 3326 info = self.source_control.QueryRevisionInfo(cl,
3236 self._GetDepotDirectory(depot)) 3327 self._GetDepotDirectory(depot))
3237 if depot and DEPOT_DEPS_NAME[depot].has_key('viewvc'): 3328 if depot and DEPOT_DEPS_NAME[depot].has_key('viewvc'):
3238 try: 3329 try:
3239 # Format is "git-svn-id: svn://....@123456 <other data>" 3330 # Format is "git-svn-id: svn://....@123456 <other data>"
3240 svn_line = [i for i in info['body'].splitlines() if 'git-svn-id:' in i] 3331 svn_line = [i for i in info['body'].splitlines() if 'git-svn-id:' in i]
3241 svn_revision = svn_line[0].split('@') 3332 svn_revision = svn_line[0].split('@')
3242 svn_revision = svn_revision[1].split(' ')[0] 3333 svn_revision = svn_revision[1].split(' ')[0]
3243 return DEPOT_DEPS_NAME[depot]['viewvc'] + svn_revision 3334 return DEPOT_DEPS_NAME[depot]['viewvc'] + svn_revision
3244 except IndexError: 3335 except IndexError:
3245 return '' 3336 return ''
3246 return '' 3337 return ''
3247 3338
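A worked example of the svn-revision parsing in _GetViewVCLinkFromDepotAndHash() above, using a made-up commit body (only the "git-svn-id:" line format, described in the code comment, matters):

# The commit body below is invented; the parsing mirrors the code above.
body = ('Fix scrolling jank.\n'
        '\n'
        'git-svn-id: svn://svn.chromium.org/chrome/trunk/src@123456 0039d316')
svn_line = [i for i in body.splitlines() if 'git-svn-id:' in i]
svn_revision = svn_line[0].split('@')[1].split(' ')[0]
print(svn_revision)  # 123456
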
3248 def _PrintRevisionInfo(self, cl, info, depot=None): 3339 def _PrintRevisionInfo(self, cl, info, depot=None):
3249 # The perf dashboard specifically looks for the string 3340 email_info = ''
3250 # "Author : " to parse out who to cc on a bug. If you change the
3251 # formatting here, please update the perf dashboard as well.
3252 print
3253 print 'Subject : %s' % info['subject']
3254 print 'Author : %s' % info['author']
3255 if not info['email'].startswith(info['author']): 3341 if not info['email'].startswith(info['author']):
3256 print 'Email : %s' % info['email'] 3342 email_info = '\nEmail : %s' % info['email']
3257 commit_link = self._GetViewVCLinkFromDepotAndHash(cl, depot) 3343 commit_link = self._GetViewVCLinkFromDepotAndHash(cl, depot)
3258 if commit_link: 3344 if commit_link:
3259 print 'Link : %s' % commit_link 3345 commit_info = '\nLink : %s' % commit_link
3260 else: 3346 else:
3261 print 3347 commit_info = ('\nFailed to parse svn revision from body:\n%s' %
3262 print 'Failed to parse svn revision from body:' 3348 info['body'])
3263 print 3349 print RESULTS_REVISION_INFO % {
3264 print info['body'] 3350 'subject': info['subject'],
3265 print 3351 'author': info['author'],
3266 print 'Commit : %s' % cl 3352 'email_info': email_info,
3267 print 'Date : %s' % info['date'] 3353 'commit_info': commit_info,
3354 'cl_date': info['date']
3355 }
3268 3356
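The refactored _PrintRevisionInfo() above folds the optional Email and Link lines into the template as pre-formatted fragments. A sketch with made-up revision metadata (author, email, link, and date are placeholders):

# All values below are placeholders; in the script they come from
# source_control.QueryRevisionInfo() and _GetViewVCLinkFromDepotAndHash().
print(RESULTS_REVISION_INFO % {
    'subject': 'Reformat bisect results output.',
    'author': 'Jane Doe',
    'email_info': '\nEmail : jane.doe@chromium.org',
    'commit_info': '\nLink : https://chromium.googlesource.com/chromium/src/+/abc123',
    'cl_date': 'Tue Jul 15 10:24:21 2014',
})
# ===== SUSPECTED CL(s) =====
# Subject : Reformat bisect results output.
# Author : Jane Doe
# Email : jane.doe@chromium.org
# Link : https://chromium.googlesource.com/chromium/src/+/abc123
# Date : Tue Jul 15 10:24:21 2014
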
3269 def _PrintTableRow(self, column_widths, row_data): 3357 def _PrintTableRow(self, column_widths, row_data):
3270 assert len(column_widths) == len(row_data) 3358 assert len(column_widths) == len(row_data)
3271 3359
3272 text = '' 3360 text = ''
3273 for i in xrange(len(column_widths)): 3361 for i in xrange(len(column_widths)):
3274 current_row_data = row_data[i].center(column_widths[i], ' ') 3362 current_row_data = row_data[i].center(column_widths[i], ' ')
3275 text += ('%%%ds' % column_widths[i]) % current_row_data 3363 text += ('%%%ds' % column_widths[i]) % current_row_data
3276 print text 3364 print text
3277 3365
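_PrintTableRow() centers each cell in its column and then pads it into a fixed-width '%Ns' field. A minimal sketch of the same formatting with invented row data (the widths match the caller further below):

# Same formatting as _PrintTableRow(), written standalone; row values invented.
column_widths = [20, 70, 14, 13]
row_data = ['chromium', 'abcdef0123', '1200', 'BAD']
text = ''
for i in range(len(column_widths)):
    cell = row_data[i].center(column_widths[i], ' ')
    text += ('%%%ds' % column_widths[i]) % cell
print(text)
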
(...skipping 33 matching lines...)
3311 mean = '%d' % current_data['value']['mean'] 3399 mean = '%d' % current_data['value']['mean']
3312 self._PrintTableRow( 3400 self._PrintTableRow(
3313 [20, 70, 14, 13], 3401 [20, 70, 14, 13],
3314 [current_data['depot'], cl_link, mean, state_str]) 3402 [current_data['depot'], cl_link, mean, state_str])
3315 3403
3316 def _PrintTestedCommitsTable(self, revision_data_sorted, 3404 def _PrintTestedCommitsTable(self, revision_data_sorted,
3317 first_working_revision, last_broken_revision, confidence, 3405 first_working_revision, last_broken_revision, confidence,
3318 final_step=True): 3406 final_step=True):
3319 print 3407 print
3320 if final_step: 3408 if final_step:
3321 print 'Tested commits:' 3409 print '===== TESTED COMMITS ====='
3322 else: 3410 else:
3323 print 'Partial results:' 3411 print '===== PARTIAL RESULTS ====='
3324 self._PrintTestedCommitsHeader() 3412 self._PrintTestedCommitsHeader()
3325 state = 0 3413 state = 0
3326 for current_id, current_data in revision_data_sorted: 3414 for current_id, current_data in revision_data_sorted:
3327 if current_data['value']: 3415 if current_data['value']:
3328 if (current_id == last_broken_revision or 3416 if (current_id == last_broken_revision or
3329 current_id == first_working_revision): 3417 current_id == first_working_revision):
3330 # If confidence is too low, don't add this empty line since it's 3418 # If confidence is too low, don't add this empty line since it's
3331 # used to put focus on a suspected CL. 3419 # used to put focus on a suspected CL.
3332 if confidence and final_step: 3420 if confidence and final_step:
3333 print 3421 print
(...skipping 13 matching lines...)
3347 state_str = '' 3435 state_str = ''
3348 state_str = state_str.center(13, ' ') 3436 state_str = state_str.center(13, ' ')
3349 3437
3350 cl_link = self._GetViewVCLinkFromDepotAndHash(current_id, 3438 cl_link = self._GetViewVCLinkFromDepotAndHash(current_id,
3351 current_data['depot']) 3439 current_data['depot'])
3352 if not cl_link: 3440 if not cl_link:
3353 cl_link = current_id 3441 cl_link = current_id
3354 self._PrintTestedCommitsEntry(current_data, cl_link, state_str) 3442 self._PrintTestedCommitsEntry(current_data, cl_link, state_str)
3355 3443
3356 def _PrintReproSteps(self): 3444 def _PrintReproSteps(self):
3357 print 3445 command = '$ ' + self.opts.command
3358 print 'To reproduce locally:'
3359 print '$ ' + self.opts.command
3360 if bisect_utils.IsTelemetryCommand(self.opts.command): 3446 if bisect_utils.IsTelemetryCommand(self.opts.command):
3361 print 3447 command += ('\nAlso consider passing --profiler=list to see available '
3362 print 'Also consider passing --profiler=list to see available profilers.' 3448 'profilers.')
3449 print REPRO_STEPS_LOCAL % {'command': command}
3450 print REPRO_STEPS_TRYJOB % {'command': command}
3363 3451
3364 def _PrintOtherRegressions(self, other_regressions, revision_data): 3452 def _PrintOtherRegressions(self, other_regressions, revision_data):
3365 print 3453 print
3366 print 'Other regressions may have occurred:' 3454 print 'Other regressions may have occurred:'
3367 print ' %8s %70s %10s' % ('Depot'.center(8, ' '), 3455 print ' %8s %70s %10s' % ('Depot'.center(8, ' '),
3368 'Range'.center(70, ' '), 'Confidence'.center(10, ' ')) 3456 'Range'.center(70, ' '), 'Confidence'.center(10, ' '))
3369 for regression in other_regressions: 3457 for regression in other_regressions:
3370 current_id, previous_id, confidence = regression 3458 current_id, previous_id, confidence = regression
3371 current_data = revision_data[current_id] 3459 current_data = revision_data[current_id]
3372 previous_data = revision_data[previous_id] 3460 previous_data = revision_data[previous_id]
(...skipping 33 matching lines...)
3406 seconds=int(step_build_time_avg)) 3494 seconds=int(step_build_time_avg))
3407 print 'Average test time : %s' % datetime.timedelta( 3495 print 'Average test time : %s' % datetime.timedelta(
3408 seconds=int(step_perf_time_avg)) 3496 seconds=int(step_perf_time_avg))
3409 3497
3410 def _PrintWarnings(self): 3498 def _PrintWarnings(self):
3411 if not self.warnings: 3499 if not self.warnings:
3412 return 3500 return
3413 print 3501 print
3414 print 'WARNINGS:' 3502 print 'WARNINGS:'
3415 for w in set(self.warnings): 3503 for w in set(self.warnings):
3416 print ' !!! %s' % w 3504 print ' ! %s' % w
3417 3505
3418 def _FindOtherRegressions(self, revision_data_sorted, bad_greater_than_good): 3506 def _FindOtherRegressions(self, revision_data_sorted, bad_greater_than_good):
3419 other_regressions = [] 3507 other_regressions = []
3420 previous_values = [] 3508 previous_values = []
3421 previous_id = None 3509 previous_id = None
3422 for current_id, current_data in revision_data_sorted: 3510 for current_id, current_data in revision_data_sorted:
3423 current_values = current_data['value'] 3511 current_values = current_data['value']
3424 if current_values: 3512 if current_values:
3425 current_values = current_values['values'] 3513 current_values = current_values['values']
3426 if previous_values: 3514 if previous_values:
(...skipping 180 matching lines...)
3607 print ' %20s %40s %s' % (current_data['depot'], 3695 print ' %20s %40s %s' % (current_data['depot'],
3608 current_id, build_status) 3696 current_id, build_status)
3609 print 3697 print
3610 3698
3611 if self.opts.output_buildbot_annotations: 3699 if self.opts.output_buildbot_annotations:
3612 bisect_utils.OutputAnnotationStepClosed() 3700 bisect_utils.OutputAnnotationStepClosed()
3613 # The perf dashboard scrapes the "results" step in order to comment on 3701 # The perf dashboard scrapes the "results" step in order to comment on
3614 # bugs. If you change this, please update the perf dashboard as well. 3702 # bugs. If you change this, please update the perf dashboard as well.
3615 bisect_utils.OutputAnnotationStepStart('Results') 3703 bisect_utils.OutputAnnotationStepStart('Results')
3616 3704
3705 self._PrintBanner(results_dict)
3706 self._PrintWarnings()
3707
3617 if results_dict['culprit_revisions'] and results_dict['confidence']: 3708 if results_dict['culprit_revisions'] and results_dict['confidence']:
3618 self._PrintBanner(results_dict)
3619 for culprit in results_dict['culprit_revisions']: 3709 for culprit in results_dict['culprit_revisions']:
3620 cl, info, depot = culprit 3710 cl, info, depot = culprit
3621 self._PrintRevisionInfo(cl, info, depot) 3711 self._PrintRevisionInfo(cl, info, depot)
3622 self._PrintReproSteps()
3623 if results_dict['other_regressions']: 3712 if results_dict['other_regressions']:
3624 self._PrintOtherRegressions(results_dict['other_regressions'], 3713 self._PrintOtherRegressions(results_dict['other_regressions'],
3625 revision_data) 3714 revision_data)
3626 else:
3627 self._PrintFailedBanner(results_dict)
3628 self._PrintReproSteps()
3629
3630 self._PrintTestedCommitsTable(revision_data_sorted, 3715 self._PrintTestedCommitsTable(revision_data_sorted,
3631 results_dict['first_working_revision'], 3716 results_dict['first_working_revision'],
3632 results_dict['last_broken_revision'], 3717 results_dict['last_broken_revision'],
3633 results_dict['confidence']) 3718 results_dict['confidence'])
3634 self._PrintStepTime(revision_data_sorted) 3719 self._PrintStepTime(revision_data_sorted)
3635 self._PrintWarnings() 3720 self._PrintReproSteps()
3636 3721 self._PrintThankYou()
3637 if self.opts.output_buildbot_annotations: 3722 if self.opts.output_buildbot_annotations:
3638 bisect_utils.OutputAnnotationStepClosed() 3723 bisect_utils.OutputAnnotationStepClosed()
3639 3724
3640 3725
3641 def DetermineAndCreateSourceControl(opts): 3726 def DetermineAndCreateSourceControl(opts):
3642 """Attempts to determine the underlying source control workflow and returns 3727 """Attempts to determine the underlying source control workflow and returns
3643 a SourceControl object. 3728 a SourceControl object.
3644 3729
3645 Returns: 3730 Returns:
3646 An instance of a SourceControl object, or None if the current workflow 3731 An instance of a SourceControl object, or None if the current workflow
(...skipping 389 matching lines...)
4036 # The perf dashboard scrapes the "results" step in order to comment on 4121 # The perf dashboard scrapes the "results" step in order to comment on
4037 # bugs. If you change this, please update the perf dashboard as well. 4122 # bugs. If you change this, please update the perf dashboard as well.
4038 bisect_utils.OutputAnnotationStepStart('Results') 4123 bisect_utils.OutputAnnotationStepStart('Results')
4039 print 'Error: %s' % e.message 4124 print 'Error: %s' % e.message
4040 if opts.output_buildbot_annotations: 4125 if opts.output_buildbot_annotations:
4041 bisect_utils.OutputAnnotationStepClosed() 4126 bisect_utils.OutputAnnotationStepClosed()
4042 return 1 4127 return 1
4043 4128
4044 if __name__ == '__main__': 4129 if __name__ == '__main__':
4045 sys.exit(main()) 4130 sys.exit(main())