Created: 6 years, 4 months ago by Dirk Pranke
Modified: 6 years, 4 months ago
CC: chromium-reviews, telemetry+watch_chromium.org, ghost stip (do not use), dtu, Paweł Hajdan Jr., jam
Base URL: svn://svn.chromium.org/chrome/trunk/src
Project: chromium
Visibility: Public.
Description

Implement auto-retry of failed tests for telemetry{_perf}_unittests.

This implements the same basic logic we use in GTest-based tests to
retry tests that fail during a run. By default, nothing is retried
(this matches what gtest-based tests do when not run in bot mode).
If --retry-limit is passed on the command line, each failing test is
retried up to that number of times. So, to enable retries by default,
the bots will have to pass this flag.
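The retry flow described above can be sketched roughly like this. This is a minimal illustration, not the actual telemetry code; `run_with_retries`, `run_one`, and the fake runner are hypothetical names I've introduced for the example:

```python
# Hypothetical sketch of the retry loop described in the CL description:
# run everything once, then re-run only the still-failing tests, up to
# retry_limit extra passes (the default of 0 means no retries at all).
def run_with_retries(tests, run_one, retry_limit=0):
    """Run all tests once, then retry each failure up to retry_limit times.

    run_one(name) returns True if the test passed. Returns the list of
    tests that never passed.
    """
    failed = [t for t in tests if not run_one(t)]
    attempts = 0
    while failed and attempts < retry_limit:
        attempts += 1
        # Only the still-failing tests are re-run on each pass.
        failed = [t for t in failed if not run_one(t)]
    return failed


# Example with a fake runner that fails 'flaky' twice, then passes.
calls = {}
def fake_run(name):
    calls[name] = calls.get(name, 0) + 1
    return name != 'flaky' or calls[name] > 2

print(run_with_retries(['ok', 'flaky'], fake_run, retry_limit=0))  # ['flaky']
calls.clear()
print(run_with_retries(['ok', 'flaky'], fake_run, retry_limit=3))  # []
```

With `retry_limit=0` the flaky test is reported as failed after one run; with `retry_limit=3` it passes on its second retry and the run is green.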
R=dtu@chromium.org, tonyg@chromium.org
BUG=398027
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=288265
Patch Set 1: initial patch for review
Total comments: 11
Patch Set 2: merge forward to r287841
Patch Set 3: update w/ review feedback

Messages
Total messages: 24 (0 generated)
Paweł, John, Ojan, can you review this and confirm that the logic looks right? Dave and/or Tony, please review to make sure you understand how telemetry_unittests will change with this.

I don't do anything special to indicate that we are retrying previously failed tests. I am assuming that the gtest log parser won't be confused by this; if this is wrong, please let me know what I would need to change to make it happy. (I'm not actually sure what all needs to parse the log output at this point, given that we're mostly converted over to recipes, so I'm not sure how to verify my assumptions.)

This change will *still* not upload the results to the flakiness dashboard; that change will follow this one.

This change depends on https://codereview.chromium.org/429013004/ .
lgtm, but please wait for Dave to take a pass too.

https://codereview.chromium.org/426123002/diff/20001/tools/telemetry/telemetr...
File tools/telemetry/telemetry/unittest/json_results.py (right):

https://codereview.chromium.org/426123002/diff/20001/tools/telemetry/telemetr...
tools/telemetry/telemetry/unittest/json_results.py:93: assert(actuals)
Nit: assert actuals, 'Useful message to a developer that hits this assert'

https://codereview.chromium.org/426123002/diff/20001/tools/telemetry/telemetr...
tools/telemetry/telemetry/unittest/json_results.py:98: return 1 if full_results['num_failures_by_type']['FAIL'] else 0
return min(255, len(full_results['num_failures_by_type']['FAIL']))?
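The nit above is about the messageless assert form. A small illustration of the difference (the `actuals` value here is placeholder data I've made up, not the real variable's contents):

```python
# Placeholder data standing in for the real 'actuals' results; the point
# is only the assert style suggested in the review nit.
actuals = {'TestFoo.test_bar': ['PASS']}

# Bare form: on failure the developer just sees "AssertionError",
# with no hint about what went wrong.
assert actuals

# Suggested form: the failure message explains the problem directly.
assert actuals, 'no test results were produced; did any tests actually run?'

print('asserts passed')
```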
https://codereview.chromium.org/426123002/diff/20001/tools/telemetry/telemetr...
File tools/telemetry/telemetry/unittest/json_results.py (right):

https://codereview.chromium.org/426123002/diff/20001/tools/telemetry/telemetr...
tools/telemetry/telemetry/unittest/json_results.py:93: assert(actuals)
On 2014/07/30 00:08:30, tonyg wrote:
> Nit: assert actuals, 'Useful message to a developer that hits this assert'

Sure, will fix.

https://codereview.chromium.org/426123002/diff/20001/tools/telemetry/telemetr...
tools/telemetry/telemetry/unittest/json_results.py:98: return 1 if full_results['num_failures_by_type']['FAIL'] else 0
On 2014/07/30 00:08:30, tonyg wrote:
> return min(255, len(full_results['num_failures_by_type']['FAIL']))?

Experience w/ run-webkit-tests returning the actual number of failures has taught me that returning anything other than a strictly limited set of error codes ends up causing a bunch of infra woes.

I think we should only return 0 (success), 1 (at least one test failed), or 2 (something weird happened, like a bad command line arg).
On 2014/07/30 00:13:29, Dirk Pranke wrote:
> I think we should only return 0 (success), 1 (at least one test failed), or 2
> (something weird happened, like a bad command line arg).

sgtm. There are some other places where we do that in Telemetry and might want to make them consistent in a follow-up.
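The exit-code policy agreed above can be sketched as follows. This is a hypothetical illustration, not the actual json_results.py code; the constant names are mine, and I'm assuming `num_failures_by_type` maps result types to integer counts (as in the Chromium JSON test results format):

```python
# Sketch of the agreed policy: return only a strictly limited set of
# exit codes, never the raw failure count (raw counts cause infra woes,
# e.g. they collide with shell/signal exit-code conventions and wrap
# modulo 256).
EXIT_SUCCESS = 0       # all tests passed
EXIT_TESTS_FAILED = 1  # at least one test failed
EXIT_USAGE_ERROR = 2   # something weird happened, like a bad command line arg

def exit_code_from_results(full_results):
    # Assumption: num_failures_by_type maps a result type like 'FAIL'
    # to an integer count of tests with that result.
    num_failures = full_results['num_failures_by_type'].get('FAIL', 0)
    return EXIT_TESTS_FAILED if num_failures else EXIT_SUCCESS

print(exit_code_from_results({'num_failures_by_type': {'FAIL': 0}}))  # 0
print(exit_code_from_results({'num_failures_by_type': {'FAIL': 7}}))  # 1
```

Note the contrast with the rejected `min(255, ...)` suggestion: here seven failures and one failure both map to exit code 1.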
https://chromiumcodereview.appspot.com/426123002/diff/20001/tools/telemetry/t...
File tools/telemetry/telemetry/unittest/run_tests.py (right):

https://chromiumcodereview.appspot.com/426123002/diff/20001/tools/telemetry/t...
tools/telemetry/telemetry/unittest/run_tests.py:158: failed_tests = json_results.FailedTestNames(result)
None of these calls are json-specific; can they be moved?

https://chromiumcodereview.appspot.com/426123002/diff/20001/tools/telemetry/t...
tools/telemetry/telemetry/unittest/run_tests.py:163: args.positional_args = original_positional_args
We essentially take the union of the args. Doesn't copying the original args mean we run all of the original tests?
https://chromiumcodereview.appspot.com/426123002/diff/20001/tools/telemetry/t...
File tools/telemetry/telemetry/unittest/run_tests.py (right):

https://chromiumcodereview.appspot.com/426123002/diff/20001/tools/telemetry/t...
tools/telemetry/telemetry/unittest/run_tests.py:158: failed_tests = json_results.FailedTestNames(result)
On 2014/07/30 01:31:19, dtu wrote:
> None of these calls are json-specific, can they be moved?

I'm not sure what you mean by "None of these"; are you referring to some other method besides FailedTestNames? We need FailedTestNames inside json_results, so I'm not sure where else I could move it without creating a dependency that I don't want json_results to have. I could duplicate the code, I suppose, but that doesn't seem like a big win.

https://chromiumcodereview.appspot.com/426123002/diff/20001/tools/telemetry/t...
tools/telemetry/telemetry/unittest/run_tests.py:163: args.positional_args = original_positional_args
On 2014/07/30 01:31:19, dtu wrote:
> We essentially take the union of the args. Doesn't copying the original args
> mean we run all of the original tests?

Hm, yeah, if you originally specified some test names, they would get re-run. I originally wrote it this way because I wasn't sure whether positional_args might contain things that aren't test names that we would need to preserve. I will look into that.
I defer to the other reviewers; I'm not familiar with this code.
just one comment about the default retries. i'm not qualified to review this, as I'm not familiar with this test harness. The basic algorithm used by the gtest launcher is: after running the tests, run each failed test serially, at most 3 times.

https://codereview.chromium.org/426123002/diff/20001/tools/telemetry/telemetr...
File tools/telemetry/telemetry/unittest/run_tests.py (right):

https://codereview.chromium.org/426123002/diff/20001/tools/telemetry/telemetr...
tools/telemetry/telemetry/unittest/run_tests.py:129: parser.add_option('--retry-limit', type='int', default=0,
why not make this 3 to match our other gtest-based test harnesses? i.e. then we don't need to specify extra flags in recipes to enable test retries.
https://codereview.chromium.org/426123002/diff/20001/tools/telemetry/telemetr...
File tools/telemetry/telemetry/unittest/run_tests.py (right):

https://codereview.chromium.org/426123002/diff/20001/tools/telemetry/telemetr...
tools/telemetry/telemetry/unittest/run_tests.py:129: parser.add_option('--retry-limit', type='int', default=0,
On 2014/07/30 20:44:54, jam wrote:
> why not make this 3 to match our other gtest-based test harnesses? i.e. we
> don't need to specify extra flags in recipes to enable test retries.

Good question. As I understand it, the gtest-based tests don't actually retry tests by default; you have to specify the --test-launcher-bot-mode flag to get the retries (or use --test-launcher-retry-limit to specify a number explicitly).
https://codereview.chromium.org/426123002/diff/20001/tools/telemetry/telemetr...
File tools/telemetry/telemetry/unittest/run_tests.py (right):

https://codereview.chromium.org/426123002/diff/20001/tools/telemetry/telemetr...
tools/telemetry/telemetry/unittest/run_tests.py:129: parser.add_option('--retry-limit', type='int', default=0,
On 2014/07/30 21:40:33, Dirk Pranke wrote:
> Good question. As I understand it, the gtest-based tests don't actually retry
> tests by default; you have to specify the --test-launcher-bot-mode flag to get
> the retries (or use --test-launcher-retry-limit to specify a number explicitly).

ah, I was not aware of that. nvm then
On 2014/07/30 at 21:42:29, jam wrote:
> > Good question. As I understand it, the gtest-based tests don't actually retry
> > tests by default; you have to specify the --test-launcher-bot-mode flag to get
> > the retries (or use --test-launcher-retry-limit to specify a number explicitly).
>
> ah, I was not aware of that. nvm then

Are you saying tests don't retry when people run them locally?
On 2014/07/30 23:14:33, ojan-only-code-yellow-reviews wrote:
> Are you saying tests don't retry when people run them locally?

Correct, not without a flag telling them to do so.
On 2014/07/30 23:31:08, Dirk Pranke wrote:
> On 2014/07/30 23:14:33, ojan-only-code-yellow-reviews wrote:
> > Are you saying tests don't retry when people run them locally?
>
> Correct, not without a flag telling them to do so.

Ojan is correct to point out that this is a bit confusing. Why don't we make this the default for telemetry, and we can switch gtest to also do the same?

I think the reason we didn't do that in the gtest launcher initially was to make debugging easier. But now the test launcher detects when it's in a debugger and runs in-process anyway, so this isn't needed anymore.
On 2014/07/31 16:55:09, jam wrote:
> Ojan is correct to point out that this is a bit confusing. Why don't we make
> this the default for telemetry, and we can switch gtest to also do the same?
>
> I think the reason we didn't do that in the gtest launcher initially is to make
> debugging easier. But now the test launcher detects when it's in a debugger and
> runs in-process anyways, so this isn't needed anymore.

run-webkit-tests actually retries by default unless you specify tests to run explicitly; the idea is that if you are focused on a particular test, you'd probably rather see the failure fast than wait for retries.
That said, I have no strong opinions on this one way or another. How about we leave this as-is for now, and change both in a separate patch?
Oh, I would note that we should probably get the input of the people that run these tests interactively (tony, dave, the other telemetry folks) to see if they care :).
On 2014/07/31 17:07:35, Dirk Pranke wrote: > Oh, I would note that we should probably get the input of the people that run > these tests interactively (tony, dave, the other telemetry folks) to see if they > care :). My preference is for as much consistency among test suites as possible. Telemetry unittests wants to be the same as any other test we run.
On 2014/07/31 17:13:05, tonyg wrote:
> My preference is for as much consistency among test suites as possible.
> Telemetry unittests wants to be the same as any other test we run.

it's on my plate to make the gtest-based launcher have the same behavior by default on developers' machines as on the bots, to make this easier (specifically the retries).

i hope my comments aren't holding this up? :) Please commit whatever will make you guys happy at this point, and we can discuss the automatic retry behavior afterwards.
On 2014/08/06 21:17:53, jam wrote: > i hope my comments aren't holding this up? :) Please commit, whatever you guys > will make you happy at this point, and we can discuss the automatic retry > behavior afterwards. Nope, not blocked on you, just waiting for other things to land and settle. I'll probably land this shortly.
Okay, I've updated w/ the review feedback. Dave, can you take another look?
On 2014/08/07 01:30:54, Dirk Pranke wrote: > Okay, I've updated w/ the review feedback. Dave, can you take another look? (now with actual patches uploaded :).
lgtm
The CQ bit was checked by dpranke@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/dpranke@chromium.org/426123002/60001
Message was sent while issue was closed.
Change committed as 288265