|
|
DescriptionSupport specifying a seed for pseudo-random test order.
This is a continuation of http://crrev.com/2082653004 with the following
differences:
- --order=random-seeded is removed
- The seed is given in a separate flag.
Once we find out which tests are order-dependent and will be flaky with random order,
then this can be changed so that "random" is the default order instead of "natural".
BUG=601332
Committed: https://crrev.com/fa5c50adb9b096701b9c0e1e9eee458665484bc5
Cr-Commit-Position: refs/heads/master@{#417968}
Patch Set 1 #
Total comments: 13
Patch Set 2 : Use fixed seed by default; move logging of random seed to Printer; remove writing of file. #Patch Set 3 : Remove incomplete docstring change #
Total comments: 1
Patch Set 4 : Add random order seed to results file. #
Messages
Total messages: 30 (13 generated)
The CQ bit was checked by qyearsley@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
qyearsley@chromium.org changed reviewers: + dpranke@chromium.org, tansell@chromium.org
qyearsley@chromium.org changed reviewers: + ojan@chromium.org
https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... File third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py (right): https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py:281: self._filesystem.write_text_file(path, contents) We could also potentially: - Not write the seed to the results directory; to reproduce a test order would then involve getting the seed that's logged when the test was run. - Also write the test order. This might sometimes be helpful, but may be unnecessary.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... File third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py (right): https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py:281: self._filesystem.write_text_file(path, contents) On 2016/09/05 00:15:28, qyearsley wrote: > We could also potentially: > - Not write the seed to the results directory; to reproduce a test order would > then involve getting the seed that's logged when the test was run. > - Also write the test order. This might sometimes be helpful, but may be > unnecessary. I would include the seed as a field in the results.json file, along with all of the other records of the test run. I don't think we should write a separate file just for this. I would probably also log the seed that was used at the beginning of the run, as part of the printer.print_config() function called at run_webkit_tests.py:544, rather than logging it here.
Thanks for picking up this CL Quinten. Definitely happy for someone else to get it in sooner than I can. I was meaning to get back to it, but it just hadn't reached the top of my TODO list yet. Tim 'mithro' Ansell https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... File third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py (right): https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py:281: self._filesystem.write_text_file(path, contents) On 2016/09/06 01:19:57, Dirk Pranke wrote: > On 2016/09/05 00:15:28, qyearsley wrote: > > We could also potentially: > > - Not write the seed to the results directory; to reproduce a test order > would > > then involve getting the seed that's logged when the test was run. > > - Also write the test order. This might sometimes be helpful, but may be > > unnecessary. > > I would include the seed as a field in the results.json file, along with all of > the other records of the test run. I don't think we should write a separate file > just for this. > > I would probably also log the seed that was used at the beginning of the run, as > part of the printer.print_config() function called at run_webkit_tests.py:544, > rather than logging it here. I'm happy with Dirk's suggestions. The aim is to be able to find the random seed uses for a given run so it can be reproduced. This includes both when run on the bots and when run locally. https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... File third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/run_webkit_tests.py (right): https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/run_webkit_tests.py:388: help="Seed to use for random test order. Only applicable in combination with --order=random."), For the help you should add what the default value? There are two reasonable options - current time or a fix value.
https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... File third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py (right): https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py:262: self._random_seed = int(time.time()) I think we should do a fixed value. Using current time is almost always a mistake because it's then hard to reproduce flakiness you encounter.
https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... File third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py (right): https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py:262: self._random_seed = int(time.time()) On 2016/09/06 at 16:53:29, ojan wrote: > I think we should do a fixed value. Using current time is almost always a mistake because it's then hard to reproduce flakiness you encounter. If the seed is output with the results, it should then be possible to reproduce flakiness that is encountered by passing --order=random --seed=<seed from prior run>. One possible advantage of a random seed based on current time by default is that more time-dependent flakiness would be revealed over the course of multiple test runs. Although, the order should also change over time as the number of tests changes, so an order-dependent flaky test shouldn't be able to "hide" for too long. Given that, does it still seem like a fixed seed would be a good idea? https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py:281: self._filesystem.write_text_file(path, contents) On 2016/09/06 at 02:02:25, mithro wrote: > On 2016/09/06 01:19:57, Dirk Pranke wrote: > > On 2016/09/05 00:15:28, qyearsley wrote: > > > We could also potentially: > > > - Not write the seed to the results directory; to reproduce a test order > > would > > > then involve getting the seed that's logged when the test was run. > > > - Also write the test order. This might sometimes be helpful, but may be > > > unnecessary. > > > > I would include the seed as a field in the results.json file, along with all of > > the other records of the test run. I don't think we should write a separate file > > just for this. > > > > I would probably also log the seed that was used at the beginning of the run, as > > part of the printer.print_config() function called at run_webkit_tests.py:544, > > rather than logging it here. > > I'm happy with Dirk's suggestions. > > The aim is to be able to find the random seed uses for a given run so it can be reproduced. This includes both when run on the bots and when run locally. Aye, SGTM https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... File third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/run_webkit_tests.py (right): https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/run_webkit_tests.py:388: help="Seed to use for random test order. Only applicable in combination with --order=random."), On 2016/09/06 at 02:02:25, mithro wrote: > For the help you should add what the default value? There are two reasonable options - current time or a fix value. Agreed - will add this.
jeffcarp@chromium.org changed reviewers: + jeffcarp@chromium.org
https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... File third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py (right): https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py:262: self._random_seed = int(time.time()) On 2016/09/06 at 16:53:29, ojan wrote: > I think we should do a fixed value. Using current time is almost always a mistake because it's then hard to reproduce flakiness you encounter. Is that still true if we log the seed in results.json?
https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... File third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py (right): https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py:262: self._random_seed = int(time.time()) On 2016/09/06 at 17:22:48, qyearsley wrote: > On 2016/09/06 at 16:53:29, ojan wrote: > > I think we should do a fixed value. Using current time is almost always a mistake because it's then hard to reproduce flakiness you encounter. > > If the seed is output with the results, it should then be possible to reproduce flakiness that is encountered by passing --order=random --seed=<seed from prior run>. > > One possible advantage of a random seed based on current time by default is that more time-dependent flakiness would be revealed over the course of multiple test runs. Although, the order should also change over time as the number of tests changes, so an order-dependent flaky test shouldn't be able to "hide" for too long. > > Given that, does it still seem like a fixed seed would be a good idea? Think of this from the perspective of a sheriff trying to figure out why a failure happened. It will be hard to reason about current time. Admittedly, a fixed seed is hard to reason about as well, but at least that only changes when the list of test to run changes.
https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... File third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py (right): https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py:262: self._random_seed = int(time.time()) On 2016/09/06 at 21:27:11, ojan wrote: > On 2016/09/06 at 17:22:48, qyearsley wrote: > > On 2016/09/06 at 16:53:29, ojan wrote: > > > I think we should do a fixed value. Using current time is almost always a mistake because it's then hard to reproduce flakiness you encounter. > > > > If the seed is output with the results, it should then be possible to reproduce flakiness that is encountered by passing --order=random --seed=<seed from prior run>. > > > > One possible advantage of a random seed based on current time by default is that more time-dependent flakiness would be revealed over the course of multiple test runs. Although, the order should also change over time as the number of tests changes, so an order-dependent flaky test shouldn't be able to "hide" for too long. > > > > Given that, does it still seem like a fixed seed would be a good idea? > > Think of this from the perspective of a sheriff trying to figure out why a failure happened. It will be hard to reason about current time. Admittedly, a fixed seed is hard to reason about as well, but at least that only changes when the list of test to run changes. If this would make it hard to tell whether a failing test is flaky or order-dependent, should we investigate other methods to make sure tests are order-independent? Or what if we started with a small set of tests that are known to be order-independent, run those in random order, and slowly move more tests into that bucket?
https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... File third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py (right): https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py:262: self._random_seed = int(time.time()) On 2016/09/06 at 22:57:30, jeffcarp wrote: > On 2016/09/06 at 21:27:11, ojan wrote: > > On 2016/09/06 at 17:22:48, qyearsley wrote: > > > On 2016/09/06 at 16:53:29, ojan wrote: > > > > I think we should do a fixed value. Using current time is almost always a mistake because it's then hard to reproduce flakiness you encounter. > > > > > > If the seed is output with the results, it should then be possible to reproduce flakiness that is encountered by passing --order=random --seed=<seed from prior run>. > > > > > > One possible advantage of a random seed based on current time by default is that more time-dependent flakiness would be revealed over the course of multiple test runs. Although, the order should also change over time as the number of tests changes, so an order-dependent flaky test shouldn't be able to "hide" for too long. > > > > > > Given that, does it still seem like a fixed seed would be a good idea? > > > > Think of this from the perspective of a sheriff trying to figure out why a failure happened. It will be hard to reason about current time. Admittedly, a fixed seed is hard to reason about as well, but at least that only changes when the list of test to run changes. > > If this would make it hard to tell whether a failing test is flaky or order-dependent, should we investigate other methods to make sure tests are order-independent? Or what if we started with a small set of tests that are known to be order-independent, run those in random order, and slowly move more tests into that bucket? I think this is a good idea. You probably want it to be a blacklist so that when you're done moving things you can get rid of the concept of order-dependent tests. Concretely, we could have an OrderDependentTests file that lets you list directories or individual tests and the order they should run in. Then you can remove one directory at a time from that list. Theoretically, we could even run the order-independent tests on swarming while we're working on emptying the OrderDependentTest file. I don't know how complicated that would be to do. If it's easy, I think it'd be a good idea. If complicated, it might not be worth it.
In the latest patch now, I've now changed it so that: - The seed is 4 by default (but can still be specified) - The seed is printed in printer.print_config. - The seed is not printed anywhere in the results directory as of the latest patch. Adding the seed used to full-results.json and/or failing-results.json would involve modifying either Manager._write_json_results or TestRunResults.summarize_results, which increases complexity and those functions are already pretty complex. I'm not quite sure whether it's justified, since that random seed used isn't strictly really part of the results; it's part of the config. If a fixed seed is used by default, then when sheriffs try to reproduce failures by re-running the tests locally, they will be using the same seed as on the waterfall by default. So, finding and specifying the random seed would be unnecessary when reproducing failures. The main purpose of this CL is to make it easier to discover order-dependent flaky tests by repeatedly running the tests in random order with different seeds; I think for now, writing out the random seed to a file may not be necessary? https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... File third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/run_webkit_tests.py (right): https://codereview.chromium.org/2308283002/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/run_webkit_tests.py:388: help="Seed to use for random test order. Only applicable in combination with --order=random."), On 2016/09/06 at 17:22:48, qyearsley wrote: > On 2016/09/06 at 02:02:25, mithro wrote: > > For the help you should add what the default value? There are two reasonable options - current time or a fix value. > > Agreed - will add this. Now done
The CQ bit was checked by qyearsley@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Ping - does the latest version of this CL look OK to commit?
lgtm, if you also change it so that we write the seed into the results.json file(s) so that we can programmatically retrieve it and don't have to parse the logs. https://codereview.chromium.org/2308283002/diff/40001/third_party/WebKit/Tool... File third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py (right): https://codereview.chromium.org/2308283002/diff/40001/third_party/WebKit/Tool... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py:258: rand.seed(self._options.seed) Nit: this can be rand = random.Random(self._options.seed) I think.
On 2016/09/09 at 19:50:23, dpranke wrote: > lgtm, if you also change it so that we write the seed into the results.json file(s) so that we can programmatically retrieve it and don't have to parse the logs. Done -- could you quickly review test_run_results.py to see if that looks good to commit? > https://codereview.chromium.org/2308283002/diff/40001/third_party/WebKit/Tool... > File third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py (right): > > https://codereview.chromium.org/2308283002/diff/40001/third_party/WebKit/Tool... > third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py:258: rand.seed(self._options.seed) > Nit: this can be > > rand = random.Random(self._options.seed) > > I think. Yep, it looks like the Random constructor can take a seed. Now updated; also in the latest patchset I changed that block to remove the `rand` temporary variable.
lgtm
The CQ bit was checked by qyearsley@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Committed patchset #4 (id:60001)
Message was sent while issue was closed.
Description was changed from ========== Support specifying a seed for pseudo-random test order. This is a continuation of http://crrev.com/2082653004 with the following differences: - --order=random-seeded is removed - The seed is given in a separate flag. Once we find out which tests are order-dependent and will be flaky with random order, then this can be changed so that "random" is the default order instead of "natural". BUG=601332 ========== to ========== Support specifying a seed for pseudo-random test order. This is a continuation of http://crrev.com/2082653004 with the following differences: - --order=random-seeded is removed - The seed is given in a separate flag. Once we find out which tests are order-dependent and will be flaky with random order, then this can be changed so that "random" is the default order instead of "natural". BUG=601332 Committed: https://crrev.com/fa5c50adb9b096701b9c0e1e9eee458665484bc5 Cr-Commit-Position: refs/heads/master@{#417968} ==========
Message was sent while issue was closed.
Patchset 4 (id:??) landed as https://crrev.com/fa5c50adb9b096701b9c0e1e9eee458665484bc5 Cr-Commit-Position: refs/heads/master@{#417968} |