tools/flakiness/is_flaky_test.py - Issue 563243002: Implemented a flaky test runner for auto-bisect bot.

Unified Diff: tools/flakiness/is_flaky_test.py

Issue 563243002: Implemented a flaky test runner for auto-bisect bot. (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master

Patch Set: Created 6 years, 3 months ago



Index: tools/flakiness/is_flaky_test.py

diff --git a/tools/flakiness/is_flaky_test.py b/tools/flakiness/is_flaky_test.py

new file mode 100755

index 0000000000000000000000000000000000000000..454bb4e7c61fa6cc93f3e3cc67b1d2e2e4efa7ea

--- /dev/null

+++ b/tools/flakiness/is_flaky_test.py

@@ -0,0 +1,72 @@

+#!/usr/bin/env python

Sergiy Byelozyorov 2014/09/15 08:12:14 Done.

+# Use of this source code is governed by a BSD-style license that can be

+# found in the LICENSE file.

+"""Runs a test repeatedly to measure its flakiness. The return code is non-zero

+if the flakiness is higher than the specified threshold."""

qyearsley 2014/09/12 16:42:59 Maybe "failure rate" seems like a better-defined t

Sergiy Byelozyorov 2014/09/15 08:12:14 Done.

+import argparse

+import subprocess

+import sys

+import time

+def load_options():

+  parser = argparse.ArgumentParser(description=sys.modules['__main__'].__doc__)

qyearsley 2014/09/12 16:42:59 What's the difference between `sys.modules['__main

Sergiy Byelozyorov 2014/09/15 08:12:14 Done.

+  parser.add_argument('--retries', default=1000, type=int,

+                      help='Number of test retries to measure flakiness.')

+  parser.add_argument('--threshold', default=0.05, type=float,

+                      help='Minimum flakiness level at which test is '

+                           'considered flaky.')

+  parser.add_argument('--jobs', '-j', type=int, default=1,

+                      help='Number of parallel jobs to run tests.')

+  parser.add_argument('command', nargs='+', help='Command to run test.')

+  return parser.parse_args()

+def process_finished(running, num_passed, num_failed):

+  finished = [p for p in running if p.poll() is not None]

+  running[:] = [p for p in running if p.poll() is None]

+  num_passed += len([p for p in finished if p.returncode == 0])

+  num_failed += len([p for p in finished if p.returncode != 0])

+  print '%d processed finished. Total passed: %d. Total failed: %d' % (

+      len(finished), num_passed, num_failed)

+  return num_passed, num_failed
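
One detail in `process_finished` may be worth a sketch: each process is polled twice, once to build `finished` and once to rebuild `running`. A process that exits between the two `poll()` calls would land in neither list and go uncounted. A minimal sketch of a single-poll partition follows; the helper name `partition_finished` is hypothetical and not part of the patch.

```python
def partition_finished(running):
    """Split processes into (finished, still_running), polling each once.

    Assumes every item exposes poll() like subprocess.Popen: None while the
    process is running, the exit code once it has finished.
    """
    finished, still_running = [], []
    for process in running:
        if process.poll() is None:
            still_running.append(process)
        else:
            finished.append(process)
    return finished, still_running
```

Because every process is classified from a single `poll()` result, it always ends up in exactly one of the two lists.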

+def main():

+  options = load_options()

+  num_passed = num_failed = 0

+  running = []

+  # Start all retries, while limiting total number of running processes.

+  for attempt in range(options.retries):

+    print 'Starting retry %d out of %d\n' % (attempt + 1, options.retries)

+    running.append(subprocess.Popen(options.command, stdout=subprocess.PIPE,

+                                    stderr=subprocess.STDOUT))

+    while len(running) >= options.jobs:

+      print 'Waiting for previous retries to finish before starting new ones...'

+      time.sleep(0.1)

+      num_passed, num_failed = process_finished(running, num_passed, num_failed)

+  # Wait for the remaining retries to finish.

+  print 'Waiting for the remaining retries to finish...'

+  for process in running:

+    process.wait()

+  num_passed, num_failed = process_finished(running, num_passed, num_failed)

+  if num_passed == 0 or num_failed == 0:

+    flakiness = 0

qyearsley 2014/09/12 16:42:59 So 100% failure is defined as having a flakiness o

qyearsley 2014/09/12 16:44:26 I meant to write "999 failures out of 1000 has a f

Sergiy Byelozyorov 2014/09/15 08:12:14 Yes, if a test failing all the time then it is a f

+  else:

+    flakiness = num_failed / options.retries

qyearsley 2014/09/12 16:42:59 num_failed and options.retries are both ints here,

Sergiy Byelozyorov 2014/09/15 08:12:14 Nice catch. Done.
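
To illustrate the division issue: under Python 2, `/` between two ints floors the result, so `num_failed / options.retries` evaluates to 0 whenever `num_failed` is smaller than `options.retries`, and the threshold check would then never trigger. The sketch below shows one possible fix, not necessarily the one applied in the next patch set. The function name `compute_flakiness` is hypothetical; it keeps the patch's rule that an all-pass or all-fail run counts as zero flakiness.

```python
def compute_flakiness(num_passed, num_failed, retries):
    """Return the failure rate as a float, mirroring the patch's rule.

    A test that always passes or always fails is treated as not flaky (0.0),
    following the convention in the patch under review.
    """
    if num_passed == 0 or num_failed == 0:
        return 0.0
    # float() forces true division even under Python 2, where int / int floors.
    return float(num_failed) / retries
```

With 50 failures out of 1000 retries this gives a rate of about 0.05 instead of 0.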

+  print 'Flakiness is %.2f' % flakiness

+  if flakiness > options.threshold:

+    return 1

+  else:

+    return 0

+if __name__ == '__main__':

+  sys.exit(main())
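
For comparison, here is a rough Python 3 sketch of the same idea, running a command a fixed number of times with bounded parallelism. It swaps the patch's polling loop for a thread pool from `concurrent.futures`, so it is an alternative technique rather than the patch's method. It also discards child output instead of using `subprocess.PIPE`: the patch never reads the pipe, and a child that writes enough output can block on a full pipe. The names `run_once` and `count_results` are hypothetical.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_once(command):
    """Run the command once, discarding its output, and return the exit code."""
    return subprocess.call(command, stdout=subprocess.DEVNULL,
                           stderr=subprocess.STDOUT)


def count_results(command, retries, jobs):
    """Run `command` `retries` times, at most `jobs` at a time.

    Returns a (num_passed, num_failed) tuple, where a pass is exit code 0.
    """
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        codes = list(pool.map(run_once, [command] * retries))
    num_failed = sum(1 for code in codes if code != 0)
    return len(codes) - num_failed, num_failed


if __name__ == '__main__':
    # Illustrative command: a child interpreter that exits successfully.
    passed, failed = count_results([sys.executable, '-c', 'pass'], 3, 2)
    print('passed=%d failed=%d' % (passed, failed))
```

Each worker thread blocks on one child process at a time, so `jobs` caps the number of concurrently running tests without a sleep-and-poll loop.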
