Created: 4 years, 2 months ago by nednguyen
Modified: 4 years, 2 months ago
Reviewers: Ken Russell (switch to Gerrit), Vadim Sh., Dirk Pranke, iannucci, M-A Ruel, estaab, eyaich1, Sergiy Byelozyorov
CC: chromium-reviews, infra-reviews+build_chromium.org, kjellander-cc_chromium.org, Dirk Pranke, martiniss
Target Ref: refs/heads/master
Project: build
Visibility: Public.

Description: Add json test results format support for SwarmingIsolatedScriptTest
BUG=649762
Committed: https://chromium.googlesource.com/chromium/tools/build/+/7baad31fc4293485a824d4ef6ef7361adc2979a1
Patch Set 1 #
Total comments: 22
Patch Set 2 : Move merging logic to a separate file for unittesting #
Patch Set 3 : Address review comments #
Total comments: 1
Patch Set 4 : Fix typos #
Total comments: 9
Patch Set 5 : Add expectation test coverage #
Patch Set 6 : Undo collect_gtest_task_test change #
Patch Set 7 : Update SwarmingIsolatedScriptTest #
Total comments: 2
Patch Set 8 : Add missing comments #
Patch Set 9 : Rebase #
Total comments: 24
Patch Set 10 : Address Sergiy's comments #
Patch Set 11 : Rebase #
Patch Set 12 : Address Dirk's comment #
Total comments: 2
Patch Set 13 : Address Sergiy's comments #
Total comments: 8
Patch Set 14 : Update failed test generation #
Patch Set 15 : Use repr(e) to make the invalid_results_exc clearer #
Patch Set 16 : Move results_merger out of resources/ & remove resources/__init__.py #
Patch Set 17 : Add no cover #

Messages
Total messages: 81 (36 generated)
The CQ bit was checked by nednguyen@google.com to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: No L-G-T-M from a valid reviewer yet. CQ run can only be started by full committers or once the patch has received an L-G-T-M from a full committer. Even if an L-G-T-M may have been provided, it was from a non-committer, _not_ a full super star committer. Committers are members of the group "project-infra-committers". Note that this has nothing to do with OWNERS files.
Patchset #1 (id:1) has been deleted
Patchset #1 (id:20001) has been deleted
Description was changed from ========== Support json test results format BUG= ========== to ========== Support json test results format BUG=649762 ==========
nednguyen@google.com changed reviewers: + estaab@chromium.org, kbr@chromium.org, sergiyb@chromium.org
This is not ready for full review yet, but I just want to get some feedback on the high-level goal: this CL allows merging shard results that are in the JSON test results format.

To clarify: when we say making test steps upload to test-results.appspot.com, do we mean uploading the merged JSON or each individual shard's JSON?
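For context, a rough sketch of the shard merging this CL describes (field names come from the JSON Test Results Format spec; the helper name and the flat `tests` update are simplifications for illustration, not the CL's actual code — the real format nests `tests` in a trie that a merger must walk):

```python
def merge_shard_results(shards):
    """Hypothetical merge of per-shard JSON test results dicts."""
    merged = {
        'tests': {},
        'num_failures_by_type': {},
        'seconds_since_epoch': None,
    }
    for shard in shards:
        # Union the per-test entries (flat here; the spec nests them).
        merged['tests'].update(shard.get('tests', {}))
        # Sum failure counts per result type across shards.
        for result_type, count in shard.get('num_failures_by_type', {}).items():
            merged['num_failures_by_type'].setdefault(result_type, 0)
            merged['num_failures_by_type'][result_type] += count
        # The earliest shard start time best represents the suite start.
        start = shard.get('seconds_since_epoch')
        if start is not None:
            current = merged['seconds_since_epoch']
            merged['seconds_since_epoch'] = (
                start if current is None else min(current, start))
    return merged
```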
eyaich@chromium.org changed reviewers: + eyaich@chromium.org
You are also going to have to update this contract in other places: 1) chromium_tests/steps.py, which relies on these results in the recipe (lines 815 and 1086), and 2) the scripts that run on the swarming bots that produce the results (src/testing/scripts/*).

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/swarming/api.py (right):

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:867: if not shard_results_list:
No need for this check; you create it above.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:869: elif 'seconds_since_epoch' in shard_results_list[0]:
Is this the distinguishing factor between the results from unittests vs perf tests? Is that the correct variant to be checking?
On 2016/09/29 at 14:18:42, nednguyen wrote:
> This is not ready for full review yet, but I just want to get some feedback on the high-level goal:
>
> This CL allows merging shard results that are in the JSON test results format.
>
> To clarify: when we say making test steps upload to test-results.appspot.com, do we mean uploading the merged JSON or each individual shard's JSON?

The merged results will need to be uploaded since they are keyed on (master, builder, test name, build number). The server doesn't have support for merging shards. Hope that helps. :)
sergiyb@chromium.org changed reviewers: + maruel@chromium.org
Yay! Thank you for this CL! Left a few comments.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/swarming/api.py (right):

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:809: raise recipe_api.InfraFailure('Invalid json test results')
I would double-check whether everybody implements the full spec... I remember seeing some fields missing in some test results, so I treated them all as optional.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:815: while nodes_queue:
I don't see nodes_queue defined anywhere... does this have tests?

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:839: results_json['seconds_since_epoch'])
Any particular reason to pick the smaller of the two values?

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:841: if not result_type in merged_results['num_failures_by_type']:
IMHO this is more Pythonic: merged_results['num_failures_by_type'].setdefault(result_type, 0)

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:867: if not shard_results_list:
On 2016/09/29 14:55:23, eyaich1 wrote:
> no need for this check; you create it above
But it can be empty, can't it?
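A toy illustration of the `setdefault` suggestion above (values are made up; `setdefault` inserts the key only if it is missing, replacing the explicit membership check):

```python
counts = {'PASS': 2}

# Equivalent to: if 'FAIL' not in counts: counts['FAIL'] = 0
counts.setdefault('FAIL', 0)
counts['FAIL'] += 1

# Existing keys are left untouched.
counts.setdefault('PASS', 0)

print(counts)  # {'PASS': 2, 'FAIL': 1}
```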
Thanks for generalizing this code. A couple of comments so far.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/swarming/api.py (right):

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:855: # These are the only keys we pay attention to in the output JSON.
Please move this comment to the top of _merge_simplified_test_result_format and adjust as necessary.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:868: raise Exception('No shard results is created')
Grammar: remove "is".

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/swarming/tests/collect_gtest_task_test.py (right):

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/tests/collect_gtest_task_test.py:232: }
Sorry, I don't understand: why is this empty?
https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/swarming/api.py (right):

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:809: raise recipe_api.InfraFailure('Invalid json test results')
On 2016/09/29 21:36:22, Sergiy Byelozyorov wrote:
> I would double-check if everybody implements the full spec... I remember seeing some fields missing in some test results, so I treated them all as optional.
Hmm, the spec says these fields are required: http://www.chromium.org/developers/the-json-test-results-format
Currently, this is not used by anyone who uses the script_isolated_test recipe, so I think it's better to follow the spec here.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:815: while nodes_queue:
On 2016/09/29 21:36:22, Sergiy Byelozyorov wrote:
> I don't see nodes_queue defined anywhere... does this have tests?
Yeah, I wasn't sure how to test these. Done.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:839: results_json['seconds_since_epoch'])
On 2016/09/29 21:36:22, Sergiy Byelozyorov wrote:
> Any particular reason to pick the smaller of the two values?
I'm honestly not sure what the reason for this field is, and the spec says it is required. Do you have any suggestion?

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:841: if not result_type in merged_results['num_failures_by_type']:
On 2016/09/29 21:36:22, Sergiy Byelozyorov wrote:
> IMHO this is more Pythonic:
> merged_results['num_failures_by_type'].setdefault(result_type, 0)
Done.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:855: # These are the only keys we pay attention to in the output JSON.
On 2016/09/30 23:37:53, Ken Russell wrote:
> Please move this comment to the top of _merge_simplified_test_result_format and adjust as necessary.
Done.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:868: raise Exception('No shard results is created')
On 2016/09/30 23:37:53, Ken Russell wrote:
> Grammar: remove "is".
Done.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:869: elif 'seconds_since_epoch' in shard_results_list[0]:
On 2016/09/29 14:55:23, eyaich1 wrote:
> is this the distinguishing factor between the results from unittests vs perf tests? Is that the correct variant to be checking?
This is the distinguishing factor between the simplified json result vs the chromium json test result.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/swarming/tests/collect_gtest_task_test.py (right):

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/tests/collect_gtest_task_test.py:232: }
On 2016/09/30 23:37:53, Ken Russell wrote:
> Sorry, I don't understand: why is this empty?
Sorry, I wasn't quite sure how to test this. Please see the updated patch.
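The format detection discussed in this reply can be sketched as follows (the function names and which branch maps to which merger are hypothetical; the thread only states that the presence of 'seconds_since_epoch' separates the two formats):

```python
def is_full_json_test_results(shard_result):
    # Per the discussion: 'seconds_since_epoch' is required by the
    # Chromium JSON Test Results Format, so its presence distinguishes
    # full-format results from the simplified format.
    return 'seconds_since_epoch' in shard_result


def pick_merger(shard_results_list):
    # Hypothetical dispatch mirroring the api.py branch under review.
    if not shard_results_list:
        raise ValueError('No shard results created')
    if is_full_json_test_results(shard_results_list[0]):
        return 'merge_json_test_results'
    return 'merge_simplified_results'
```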
https://codereview.chromium.org/2375663003/diff/80001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/swarming/__init__.py (right):

https://codereview.chromium.org/2375663003/diff/80001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/__init__.py:18: try:
Reviewers: I need to guard this so that I can run scripts/slave/recipe_modules/swarming/results_merger_unittest.py. Otherwise, the import fails because the recipe_engine module is not found. Please let me know if there is a better way to deal with this.
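The guard being described is roughly this pattern (a sketch under the assumption that the unittest runs outside the recipe engine, where recipe_engine is not importable; the fallback value is illustrative):

```python
# Allow this module to be imported both inside the recipe engine and
# from a plain unittest process where recipe_engine is not on sys.path.
try:
    from recipe_engine import recipe_api  # available inside recipe runs
except ImportError:
    recipe_api = None  # running under results_merger_unittest.py
```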
kbr@chromium.org changed reviewers: + vadimsh@chromium.org
CC'ing Vadim in particular regarding the unittest and the try/catch.
vadimsh@chromium.org changed reviewers: + iannucci@chromium.org
+ also Robbie, who's working on improving recipes unit tests in general

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
File scripts/slave/recipe_modules/swarming/api.py (right):

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
scripts/slave/recipe_modules/swarming/api.py:787: return results_merger.merge_test_results(shard_results_list)
I'd move this into a separate external script, similar to how collect_gtest_task.py is organized, e.g. instead of loading all JSONs into recipe engine memory and doing merges right in the recipe engine process, invoke an external "results_merger.py <path0> <path1> ...". Then you can avoid the 'try' in __init__.py, and the code coverage and unit testing story will be more consistent.

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
File scripts/slave/recipe_modules/swarming/results_merger.py (right):

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
scripts/slave/recipe_modules/swarming/results_merger.py:1: # Copyright 2016 The Chromium Authors. All rights reserved.
I suspect you will have a hard time convincing recipe tests that this code is 100% covered, because it will probably ignore code coverage provided by results_merger_unittest.py.
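The external-script idea suggested here could look roughly like this (a sketch; the real script would call results_merger.merge_test_results, for which a trivial stand-in merge is inlined below):

```python
#!/usr/bin/env python
# Hypothetical standalone merger invoked as:
#   results_merger.py <path0> <path1> ...
# Loads each shard's JSON file and prints a merged result, so the
# recipe engine never holds all shard JSONs in memory itself.
import json
import sys


def merge(shard_jsons):
    # Stand-in for results_merger.merge_test_results: just sum the
    # per-type failure counts across shards.
    merged = {'num_failures_by_type': {}}
    for shard in shard_jsons:
        for rtype, count in shard.get('num_failures_by_type', {}).items():
            merged['num_failures_by_type'][rtype] = (
                merged['num_failures_by_type'].get(rtype, 0) + count)
    return merged


def main(argv):
    shard_jsons = []
    for path in argv[1:]:
        with open(path) as f:
            shard_jsons.append(json.load(f))
    json.dump(merge(shard_jsons), sys.stdout)
    return 0


if __name__ == '__main__':
    sys.exit(main(sys.argv))
```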
https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/api.py (right): https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/api.py:787: return results_merger.merge_test_results(shard_results_list) On 2016/10/04 01:05:29, Vadim Sh. wrote: > I'd move this into a separate external script, similar to how > collect_gtest_task.py is organized, e.g. instead of loading all JSONs into > recipe engine memory and doing merges right in the receipt engine process, > invoke an external "results_merger.py <path0> <path1> ...". > > Then you can avoid 'try' in __init__.py, and code coverage and unit testing > story will be more consistent. Though it will introduce a separate step :-/ Also it will probably be slightly slower (for extra-large files) due to additional JSON serialization roundtrip. Don't know.
Can use leak_to feature of json.output to avoid round trip.

On Mon, Oct 3, 2016, 18:11 <vadimsh@chromium.org> wrote:
> https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
> File scripts/slave/recipe_modules/swarming/api.py (right):
>
> https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
> scripts/slave/recipe_modules/swarming/api.py:787: return results_merger.merge_test_results(shard_results_list)
> On 2016/10/04 01:05:29, Vadim Sh. wrote:
> > I'd move this into a separate external script, similar to how collect_gtest_task.py is organized, e.g. instead of loading all JSONs into recipe engine memory and doing merges right in the recipe engine process, invoke an external "results_merger.py <path0> <path1> ...".
> >
> > Then you can avoid 'try' in __init__.py, and code coverage and unit testing story will be more consistent.
>
> Though it will introduce a separate step :-/ Also it will probably be slightly slower (for extra-large files) due to additional JSON serialization roundtrip. Don't know.
>
> https://codereview.chromium.org/2375663003/
It should certainly be made possible to unit test the merger functions, but I'm not convinced that hacking in the current unit test is a good idea. None of the recipe infrastructure will actually run that unit test on the commit queue, so it defeats the purpose of adding it. iannucci@ was talking about adding a post_process step to the recipe_simulation_tests, which could allow an assertion to be made about the results of an earlier step. See this thread (Google internal only, sorry): https://groups.google.com/a/google.com/forum/#!topic/chrome-recipes/GeI-7BZX4dk Even if that's not possible I would advocate adding a simulation test in scripts/slave/recipes/chromium.py, since that's the current regression testing framework. Please update the CL when you decide on how to test this. Thanks.
Yes and post process assertions should land this week. I'll take a look at the CL when off mobile :)

On Mon, Oct 3, 2016, 18:14 <kbr@chromium.org> wrote:
> It should certainly be made possible to unit test the merger functions, but I'm not convinced that hacking in the current unit test is a good idea. None of the recipe infrastructure will actually run that unit test on the commit queue, so it defeats the purpose of adding it.
>
> iannucci@ was talking about adding a post_process step to the recipe_simulation_tests, which could allow an assertion to be made about the results of an earlier step. See this thread (Google internal only, sorry):
>
> https://groups.google.com/a/google.com/forum/#!topic/chrome-recipes/GeI-7BZX4dk
>
> Even if that's not possible I would advocate adding a simulation test in scripts/slave/recipes/chromium.py, since that's the current regression testing framework.
>
> Please update the CL when you decide on how to test this. Thanks.
>
> https://codereview.chromium.org/2375663003/
On 2016/10/04 01:14:29, Ken Russell wrote:
> It should certainly be made possible to unit test the merger functions, but I'm not convinced that hacking in the current unit test is a good idea. None of the recipe infrastructure will actually run that unit test on the commit queue, so it defeats the purpose of adding it.

build/PRESUBMIT actually picks up the unittest files if I put them in the tests/ folder: https://cs.chromium.org/chromium/build/PRESUBMIT.py?rcl=0&l=121

> iannucci@ was talking about adding a post_process step to the recipe_simulation_tests, which could allow an assertion to be made about the results of an earlier step. See this thread (Google internal only, sorry):
>
> https://groups.google.com/a/google.com/forum/#!topic/chrome-recipes/GeI-7BZX4dk
>
> Even if that's not possible I would advocate adding a simulation test in scripts/slave/recipes/chromium.py, since that's the current regression testing framework.

I think we need to do both the expectation test and the unittest.

The expectation test is for code coverage. The unittest is also needed because:
1) It's much easier to add more test cases using the unittest framework.
2) It's much faster to run scripts/slave/recipe_modules/swarming/results_merger_unittest.py.
3) When a test fails, it's easier to figure out what went wrong & fix it with the unittest.

> Please update the CL when you decide on how to test this. Thanks.
On 2016/10/04 01:23:27, nednguyen wrote:
> On 2016/10/04 01:14:29, Ken Russell wrote:
> > It should certainly be made possible to unit test the merger functions, but I'm not convinced that hacking in the current unit test is a good idea. None of the recipe infrastructure will actually run that unit test on the commit queue, so it defeats the purpose of adding it.
>
> build/PRESUBMIT actually picks up the unittest files if I put them in the tests/ folder: https://cs.chromium.org/chromium/build/PRESUBMIT.py?rcl=0&l=121

Thanks, I didn't know that.

> I think we need to do both the expectation test and the unittest.
>
> The expectation test is for code coverage. The unittest is also needed because:
> 1) It's much easier to add more test cases using the unittest framework.
> 2) It's much faster to run scripts/slave/recipe_modules/swarming/results_merger_unittest.py.
> 3) When a test fails, it's easier to figure out what went wrong & fix it with the unittest.

I agree it's easier to write unit tests than recipe simulation tests. If you'd like to move forward with your unit test then please move it to the correct directory to be picked up by the presubmit checks. Do the presubmit checks pass at this point? I'm assuming they don't.
https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
File scripts/slave/recipe_modules/swarming/__init__.py (right):

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
scripts/slave/recipe_modules/swarming/__init__.py:18: try:
Add a comment that this try/catch enables the results_merger_unittest to run.

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
File scripts/slave/recipe_modules/swarming/tests/collect_gtest_task_test.py (right):

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
scripts/slave/recipe_modules/swarming/tests/collect_gtest_task_test.py:359: self.assertEqual(GOOD_JSON_TEST_RESULT_MERGED, merged)
This test doesn't pass, does it? I see Vadim's point about moving the merger into its own separate script, since these unit tests are pretty simple to understand.
https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
File scripts/slave/recipe_modules/swarming/__init__.py (right):

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
scripts/slave/recipe_modules/swarming/__init__.py:18: try:
On 2016/10/04 01:43:43, Ken Russell wrote:
> Add a comment that this try/catch enables the results_merger_unittest to run.
This may not be needed if I migrate results_merger to a script.

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
File scripts/slave/recipe_modules/swarming/results_merger.py (right):

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
scripts/slave/recipe_modules/swarming/results_merger.py:1: # Copyright 2016 The Chromium Authors. All rights reserved.
On 2016/10/04 01:05:29, Vadim Sh. wrote:
> I suspect you will have a hard time convincing recipe tests that this code is 100% covered, because it will probably ignore code coverage provided by results_merger_unittest.py.
I plan to add both an expectation test & a unittest. However, if I change this to be a runnable script, does the expectation test's coverage include the script's code?

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
File scripts/slave/recipe_modules/swarming/tests/collect_gtest_task_test.py (right):

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
scripts/slave/recipe_modules/swarming/tests/collect_gtest_task_test.py:359: self.assertEqual(GOOD_JSON_TEST_RESULT_MERGED, merged)
On 2016/10/04 01:43:43, Ken Russell wrote:
> This test doesn't pass, does it? I see Vadim's point about moving the merger into its own separate script, since these unit tests are pretty simple to understand.
Ah, this change should be undone. My bad.
https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/swarming/api.py (right):

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:809: raise recipe_api.InfraFailure('Invalid json test results')
On 2016/10/04 00:10:11, nednguyen wrote:
> Hmm, the spec says these fields are required: http://www.chromium.org/developers/the-json-test-results-format
>
> Currently, this is not used by anyone who uses the script_isolated_test recipe, so I think it's better to follow the spec here.
Sure. I just meant that you should make sure it doesn't break with real-world results. If you find any launchers that do not follow the spec, please file bugs into the Infra>Flakiness>Dashboard component for them.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:839: results_json['seconds_since_epoch'])
On 2016/10/04 00:10:11, nednguyen wrote:
> On 2016/09/29 21:36:22, Sergiy Byelozyorov wrote:
> > Any particular reason to pick the smaller of the two values?
>
> I'm honestly not sure what the reason for this field is, and the spec says it is required. Do you have any suggestion?
AFAIK, this is when the tests were run... probably some random time measured by the test launcher. Minimum of the two values is perfectly fine with me. I was just wondering if this was an arbitrary choice or whether there was some logic behind it.

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
File scripts/slave/recipe_modules/swarming/results_merger.py (right):

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
scripts/slave/recipe_modules/swarming/results_merger.py:1: # Copyright 2016 The Chromium Authors. All rights reserved.
On 2016/10/04 01:49:52, nednguyen wrote:
> On 2016/10/04 01:05:29, Vadim Sh. wrote:
> > I suspect you will have a hard time convincing recipe tests that this code is 100% covered, because it will probably ignore code coverage provided by results_merger_unittest.py.
>
> I plan to add both an expectation test & a unittest. However, if I change this to be a runnable script, does the expectation test's coverage include the script's code?
+1 for unit tests, because IMHO we have too many expectation tests (essentially, integration tests) covering just a few lines of code. But like Vadim, I am not sure if our coverage system takes unittests into account when computing coverage. You can try running the CQ in dry run on this CL to check that.
https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/swarming/api.py (right):

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:839: results_json['seconds_since_epoch'])
On 2016/10/04 08:54:52, Sergiy Byelozyorov wrote:
> AFAIK, this is when the tests were run... probably some random time measured by the test launcher. Minimum of the two values is perfectly fine with me. I was just wondering if this was an arbitrary choice or whether there was some logic behind it.

Ah, I just think that the earliest seconds_since_epoch time of all the shards best represents the seconds_since_epoch time of when the suite actually starts.
On 2016/10/04 13:30:29, nednguyen wrote:
> On 2016/10/04 08:54:52, Sergiy Byelozyorov wrote:
> > AFAIK, this is when the tests were run... probably some random time measured by the test launcher. Minimum of the two values is perfectly fine with me. I was just wondering if this was an arbitrary choice or whether there was some logic behind it.
>
> Ah, I just think that the earliest seconds_since_epoch time of all the shards best represents the seconds_since_epoch time of when the suite actually starts.

I looked into how collect_gtest_task.py is implemented; there is a lot of boilerplate code just to share the logic by turning it into a runnable script. I will try my best to get full coverage of the results_merger module by changing some existing test steps with the current implementation, and not go down the path of scriptifying.
On 2016/10/04 13:34:14, nednguyen wrote:
> I looked into how collect_gtest_task.py is implemented; there is a lot of boilerplate code just to share the logic by turning it into a runnable script. I will try my best to get full coverage of the results_merger module by changing some existing test steps with the current implementation, and not go down the path of scriptifying.

This CL now achieves full expectation test coverage & the unittest is passing. PTAL again.
The CQ bit was checked by nednguyen@google.com to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: No L-G-T-M from a valid reviewer yet. CQ run can only be started by full committers or once the patch has received an L-G-T-M from a full committer. Even if an L-G-T-M may have been provided, it was from a non-committer, _not_ a full super star committer. Committers are members of the group "project-infra-committers". Note that this has nothing to do with OWNERS files.
On 2016/10/04 17:46:17, nednguyen wrote:
> This CL now achieves full expectation test coverage & the unittest is passing. PTAL again.

Actually I forgot to update the swarmed_isolated_script_test recipe to deal with the json result format, so this CL is not ready to land. Will be working on this next.
dpranke@chromium.org changed reviewers: + dpranke@chromium.org
https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo... File scripts/slave/recipe_modules/swarming/api.py (right): https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo... scripts/slave/recipe_modules/swarming/api.py:809: raise recipe_api.InfraFailure('Invalid json test results') On 2016/10/04 08:54:52, Sergiy Byelozyorov wrote: > On 2016/10/04 00:10:11, nednguyen wrote: > > On 2016/09/29 21:36:22, Sergiy Byelozyorov wrote: > > > I would double-check if everybody implements the full spec... I remember > > seeing > > > some fields missing in some test results, so I treated them all as optional. > > > > Hmhh, the spec say these fields are required? > > http://www.chromium.org/developers/the-json-test-results-format > > > > Currently, this is not used by anyone who use the script_isolated_test recipe, > > so I think it's better to follow the spec here. > > Sure. I just meant that you should make sure it doesn't break with real-world > results. If you find any launchers that do not follow the spec, please file bugs > into Infra>Flakiness>Dashboard component for them. We should try to follow the spec and update the spec if need be if we can't follow it, but we shouldn't arbitrarily diverge from the spec. (This is probably obvious).
On 2016/10/04 20:04:03, Dirk Pranke (slow) wrote: > https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo... > File scripts/slave/recipe_modules/swarming/api.py (right): > > https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo... > scripts/slave/recipe_modules/swarming/api.py:809: raise > recipe_api.InfraFailure('Invalid json test results') > On 2016/10/04 08:54:52, Sergiy Byelozyorov wrote: > > On 2016/10/04 00:10:11, nednguyen wrote: > > > On 2016/09/29 21:36:22, Sergiy Byelozyorov wrote: > > > > I would double-check if everybody implements the full spec... I remember > > > seeing > > > > some fields missing in some test results, so I treated them all as > optional. > > > > > > Hmhh, the spec say these fields are required? > > > http://www.chromium.org/developers/the-json-test-results-format > > > > > > Currently, this is not used by anyone who use the script_isolated_test > recipe, > > > so I think it's better to follow the spec here. > > > > Sure. I just meant that you should make sure it doesn't break with real-world > > results. If you find any launchers that do not follow the spec, please file > bugs > > into Infra>Flakiness>Dashboard component for them. > > We should try to follow the spec and update the spec if need be if we can't > follow it, but we shouldn't arbitrarily diverge from the spec. > > (This is probably obvious). SwarmingIsolatedScriptTest is also updated to support json result format, PTAL
Description was changed from ========== Support json test results format BUG=649762 ========== to ========== Add json test results format support for SwarmingIsolatedScriptTest BUG=649762 ==========
Looks excellent to me. LGTM https://codereview.chromium.org/2375663003/diff/160001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/results_merger.py (right): https://codereview.chromium.org/2375663003/diff/160001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/results_merger.py:18: a Please finish this comment.
https://codereview.chromium.org/2375663003/diff/160001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/results_merger.py (right): https://codereview.chromium.org/2375663003/diff/160001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/results_merger.py:18: a On 2016/10/05 00:56:28, Ken Russell wrote: > Please finish this comment. Done. I was O_o
On 2016/10/05 13:14:37, nednguyen wrote: > https://codereview.chromium.org/2375663003/diff/160001/scripts/slave/recipe_m... > File scripts/slave/recipe_modules/swarming/results_merger.py (right): > > https://codereview.chromium.org/2375663003/diff/160001/scripts/slave/recipe_m... > scripts/slave/recipe_modules/swarming/results_merger.py:18: a > On 2016/10/05 00:56:28, Ken Russell wrote: > > Please finish this comment. > > Done. I was O_o I just had a rebase conflict. ~_~ Ping infra folks!
https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/chromium_tests/steps.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/chromium_tests/steps.py:1085: valid = results['valid'] Please remove two lines above... they are not needed anymore. https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/chromium_tests/steps.py:1092: tests[t]['expected'] != tests[t]['actual']) AFAIK, 'expected' and 'actual' are lists of results separated by space,e.g. expected = 'TIMEOUT IMAGE' and actual = 'FAIL FAIL IMAGE'. The example above means that test has been run 3 tries - failed twice and produced an image once. The expected list means that either timeout or image results are ok. https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/chromium_tests/steps.py:1093: valid = results['num_failures_by_type'].get('FAIL', 0) == len(failures) FAIL is not the only type of failure. https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/__init__.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/__init__.py:27: # let results_merger_unittest runnable. Why does results_merger_unittest need to import this? If the import fails then it does not actually get anything useful from this module. https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/results_merger.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... 
scripts/slave/recipe_modules/swarming/results_merger.py:44: merged_results[key] = merged_results[key] and result_json[key] I'd make this code less generic and easier to read: for result_json in shard_results_list: successes = result_json.get('successes', []) failures = result_json.get('failures', []) valid = result_json.get('valid', True) if (not isinstance(successes, list) or not isinstance(failures, list) or not isinstance(valid, bool)): raise Exception('Unexpected value type in %s' % result_json) merged_results['successes'].extend(successes) merged_results['failures'].extend(failures) merged_results['valid'] = merged_results['valid'] and valid https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py:14: 0, os.path.abspath(os.path.join(THIS_DIR, '..', '..', '..', 'unittests'))) Please move close to actual 'import test_env' line below or move import line here. https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py:16: sys.path.insert(0, os.path.join(THIS_DIR, '..', '..')) What is this one for? https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py:212: class MergingTest(unittest.TestCase): # pragma: no cover Why is this marked to be without coverage? Aren't we running these tests? https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipes/... File scripts/slave/recipes/chromium.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipes/... scripts/slave/recipes/chromium.py:340: nit: please remove this line... no need for double-empty-line between tests
https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/chromium_tests/steps.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/chromium_tests/steps.py:1085: valid = results['valid'] On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > Please remove two lines above... they are not needed anymore. Done. https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/chromium_tests/steps.py:1092: tests[t]['expected'] != tests[t]['actual']) On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > AFAIK, 'expected' and 'actual' are lists of results separated by space,e.g. > expected = 'TIMEOUT IMAGE' and actual = 'FAIL FAIL IMAGE'. The example above > means that test has been run 3 tries - failed twice and produced an image once. > The expected list means that either timeout or image results are ok. I see. The spec says: "actual" is an ordered space-separated list of the results the test actually produced. "expected" is an unordered space-separated list of the result types expected for the test. So for this semantic, I compare the set(test results) instead. Wdyt? https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/chromium_tests/steps.py:1093: valid = results['num_failures_by_type'].get('FAIL', 0) == len(failures) On 2016/10/05 15:36:05, Sergiy Byelozyorov wrote: > FAIL is not the only type of failure. But shouldn't it be the only one we use for checking the validity of the result here? With CRASH & TIMEOUT, the number of test failures may not equal the number of "failures" computed above? https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/__init__.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m...
scripts/slave/recipe_modules/swarming/__init__.py:27: # let results_merger_unittest runnable. On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > Why does results_merger_unittest need to import this? If the import fails then > it does not actually get anything useful from this module. Done. Move results_merger to resources/ help solved this. https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/results_merger.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/results_merger.py:44: merged_results[key] = merged_results[key] and result_json[key] On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > I'd make this code less generic and easier to read: > > for result_json in shard_results_list: > successes = result_json.get('successes', []) > failures = result_json.get('failures', []) > valid = result_json.get('valid', True) > > if (not isinstance(successes, list) or not isinstance(failures, list) or > not isinstance(valid, bool)) > raise Exception('Unexpected value type in %s' % result_json) > > merged_results['successes'].extend(successes) > merged_results['failures'].extend(failures) > merged_results['valid'] = merged_results['valid'] and valid Done. https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py:14: 0, os.path.abspath(os.path.join(THIS_DIR, '..', '..', '..', 'unittests'))) On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > Please move close to actual 'import test_env' line below or move import line > here. Done. https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... 
scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py:16: sys.path.insert(0, os.path.join(THIS_DIR, '..', '..')) On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > What is this one for? For importing results_merger. PTAL again https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py:212: class MergingTest(unittest.TestCase): # pragma: no cover On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > Why is this marked to be without coverage? Aren't we running these tests? We aren't running these tests in the expectation framework, but are running them in PRESUBMIT: https://cs.chromium.org/chromium/build/PRESUBMIT.py?rcl=0&l=121 https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipes/... File scripts/slave/recipes/chromium.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipes/... scripts/slave/recipes/chromium.py:340: On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > nit: please remove this line... no need for double-empty-line between tests Done.
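The merge shape suggested in the review above — concatenate `successes` and `failures` across shards and AND together the `valid` flags — can be written as a runnable sketch. `merge_shard_results` is a hypothetical name for illustration, not the landed `results_merger` API:

```python
# Runnable sketch of the merging logic discussed in review: extend the
# 'successes' and 'failures' lists across shards and keep 'valid' True
# only if every shard reported valid results. Hypothetical helper.
def merge_shard_results(shard_results_list):
    merged = {'successes': [], 'failures': [], 'valid': True}
    for result_json in shard_results_list:
        successes = result_json.get('successes', [])
        failures = result_json.get('failures', [])
        valid = result_json.get('valid', True)
        if (not isinstance(successes, list) or
                not isinstance(failures, list) or
                not isinstance(valid, bool)):
            raise ValueError('Unexpected value type in %s' % result_json)
        merged['successes'].extend(successes)
        merged['failures'].extend(failures)
        merged['valid'] = merged['valid'] and valid
    return merged

print(merge_shard_results([
    {'successes': ['a.Test'], 'failures': [], 'valid': True},
    {'successes': [], 'failures': ['b.Test'], 'valid': False},
]))  # -> {'successes': ['a.Test'], 'failures': ['b.Test'], 'valid': False}
```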
https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/chromium_tests/steps.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/chromium_tests/steps.py:1093: valid = results['num_failures_by_type'].get('FAIL', 0) == len(failures) On 2016/10/05 18:50:53, nednguyen wrote: > On 2016/10/05 15:36:05, Sergiy Byelozyorov wrote: > > FAIL is not the only type of failure. > > But is should be the only one we use for checking the validity of the result > here? With CRASH & TIMEOUT, the number of test failure may not be equaled the > number of "failures" computed above? "valid" is used to detect whether something went horribly wrong when trying to run the tests (i.e., exceptional circumstances), not whether any individual tests passed or failed. So it should be true if you got any well-formed results back at all, even if they were all failures and crashes, etc.
https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/chromium_tests/steps.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/chromium_tests/steps.py:1093: valid = results['num_failures_by_type'].get('FAIL', 0) == len(failures) On 2016/10/05 19:39:46, Dirk Pranke (slow) wrote: > On 2016/10/05 18:50:53, nednguyen wrote: > > On 2016/10/05 15:36:05, Sergiy Byelozyorov wrote: > > > FAIL is not the only type of failure. > > > > But is should be the only one we use for checking the validity of the result > > here? With CRASH & TIMEOUT, the number of test failure may not be equaled the > > number of "failures" computed above? > > "valid" is used to detect whether something went horribly wrong when trying to > run the tests (i.e., exceptional circumstances), not whether any individual > tests passed or failed. So it should be true if you got any well-formed results > back at all, even if they were all failures and crashes, etc. Done. We have a try catch block at 1105 to check valid.
On 2016/10/05 20:14:29, nednguyen wrote: > https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... > File scripts/slave/recipe_modules/chromium_tests/steps.py (right): > > https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... > scripts/slave/recipe_modules/chromium_tests/steps.py:1093: valid = > results['num_failures_by_type'].get('FAIL', 0) == len(failures) > On 2016/10/05 19:39:46, Dirk Pranke (slow) wrote: > > On 2016/10/05 18:50:53, nednguyen wrote: > > > On 2016/10/05 15:36:05, Sergiy Byelozyorov wrote: > > > > FAIL is not the only type of failure. > > > > > > But is should be the only one we use for checking the validity of the result > > > here? With CRASH & TIMEOUT, the number of test failure may not be equaled > the > > > number of "failures" computed above? > > > > "valid" is used to detect whether something went horribly wrong when trying to > > run the tests (i.e., exceptional circumstances), not whether any individual > > tests passed or failed. So it should be true if you got any well-formed > results > > back at all, even if they were all failures and crashes, etc. > > Done. We have a try catch block at 1105 to check valid. PTAL
lgtm w/ comments https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/chromium_tests/steps.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/chromium_tests/steps.py:1092: tests[t]['expected'] != tests[t]['actual']) On 2016/10/05 18:50:53, nednguyen wrote: > On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > > AFAIK, 'expected' and 'actual' are lists of results separated by space,e.g. > > expected = 'TIMEOUT IMAGE' and actual = 'FAIL FAIL IMAGE'. The example above > > means that test has been run 3 tries - failed twice and produced an image > once. > > The expected list means that either timeout or image results are ok. > > I see. The spec says: > "actual" is an ordered space-separated list of the results the test actually > produced. > > "expected" is an unordered space-separated list of the result types expected for > the test. > > So for this semantic, I compare the set(test results) instead. Wdyt? I think, to get a list of failures, you'd need to do the following: failures = list( t for t in tests if all(res not in tests[t]['expected'].split() for res in tests[t]['actual'].split())) https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py:212: class MergingTest(unittest.TestCase): # pragma: no cover On 2016/10/05 18:50:53, nednguyen wrote: > On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > > Why is this marked to be without coverage? Aren't we running these tests? 
> > We aren't running these test in expectation framework, but are running them in > PRESUBMIT: > > https://cs.chromium.org/chromium/build/PRESUBMIT.py?rcl=0&l=121 This is sad that we have to misuse pragma-no-cover here, but for lack of a better alternative, I guess this will do. Can you please add a comment explaining that? https://codereview.chromium.org/2375663003/diff/260001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/test_utils/test_api.py (right): https://codereview.chromium.org/2375663003/diff/260001/scripts/slave/recipe_m... scripts/slave/recipe_modules/test_utils/test_api.py:154: tests_run = { the actual/expected will also need to be updated here. actual - ordered list of actual test results expected - set of results that are considered as passing for example: actual: FAIL FAIL PASS expected: PASS is for a flaky test that only passed on a third try. actual: TIMEOUT expected: PASS TIMEOUT means that the test passed, because TIMEOUT is one of the expected results.
https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/chromium_tests/steps.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/chromium_tests/steps.py:1092: tests[t]['expected'] != tests[t]['actual']) On 2016/10/09 14:43:10, Sergiy Byelozyorov wrote: > On 2016/10/05 18:50:53, nednguyen wrote: > > On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > > > AFAIK, 'expected' and 'actual' are lists of results separated by space,e.g. > > > expected = 'TIMEOUT IMAGE' and actual = 'FAIL FAIL IMAGE'. The example above > > > means that test has been run 3 tries - failed twice and produced an image > > once. > > > The expected list means that either timeout or image results are ok. > > > > I see. The spec says: > > "actual" is an ordered space-separated list of the results the test actually > > produced. > > > > "expected" is an unordered space-separated list of the result types expected > for > > the test. > > > > So for this semantic, I compare the set(test results) instead. Wdyt? > > I think, to get a list of failures, you'd need to do the following: > > failures = list( > t for t in tests > if all(res not in tests[t]['expected'].split() > for res in tests[t]['actual'].split()) Done. https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py:212: class MergingTest(unittest.TestCase): # pragma: no cover On 2016/10/09 14:43:10, Sergiy Byelozyorov wrote: > On 2016/10/05 18:50:53, nednguyen wrote: > > On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > > > Why is this marked to be without coverage? Aren't we running these tests? 
> > > > We aren't running these test in expectation framework, but are running them in > > PRESUBMIT: > > > > https://cs.chromium.org/chromium/build/PRESUBMIT.py?rcl=0&l=121 > > This is sad that we have to misuse pragma-no-cover here, but for lack of a > better alternative, I guess this will do. Can you please add a comment > explaining that? Done. https://codereview.chromium.org/2375663003/diff/260001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/test_utils/test_api.py (right): https://codereview.chromium.org/2375663003/diff/260001/scripts/slave/recipe_m... scripts/slave/recipe_modules/test_utils/test_api.py:154: tests_run = { On 2016/10/09 14:43:10, Sergiy Byelozyorov wrote: > the actual/expected will also need to be updated here. > > actual - ordered list of actual test results > expected - set of results that are considered as passing > > for example: > > actual: FAIL FAIL PASS > expected: PASS > > is for a flaky test that only passed on a third try. > > actual: TIMEOUT > expected: PASS TIMEOUT > > means that the test passed, because TIMEOUT is one of the expected results. Done.
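The 'actual'/'expected' semantics discussed above — a test counts as a failure only if none of the results it actually produced is in its expected set — can be sketched as follows. `compute_failures` and the sample test names are hypothetical, not the actual steps.py code:

```python
# Sketch of the failure rule from the review: 'actual' is an ordered
# space-separated list of produced results, 'expected' is an unordered
# set of acceptable results; a test fails only if no actual result is
# in the expected set. Hypothetical helper name.
def compute_failures(tests):
    return [name for name, result in sorted(tests.items())
            if all(res not in result['expected'].split()
                   for res in result['actual'].split())]

tests = {
    'flaky.Test':   {'actual': 'FAIL FAIL PASS', 'expected': 'PASS'},
    'timeout.Test': {'actual': 'TIMEOUT',        'expected': 'PASS TIMEOUT'},
    'broken.Test':  {'actual': 'FAIL FAIL',      'expected': 'PASS'},
}
# flaky.Test passed on the third try; timeout.Test produced an expected
# TIMEOUT; only broken.Test never produced an expected result.
print(compute_failures(tests))  # -> ['broken.Test']
```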
https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/api.py (right): https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/api.py:11: from resources import results_merger Are you sure this is possible? Similar to my swarming api crrev.com/2394093002 it seems like you have to kick off a subprocess to execute a script in another directory. Maybe I am wrong if it is a subdirectory of your current directory vs a parent or a sibling. https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... File scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_failed_isolated_script_test.json (left): https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_failed_isolated_script_test.json:425: "@@@STEP_TEXT@<br/>failures:<br/>test1.Test1<br/>test2.Test2<br/>test3.Test3<br/>test4.Test4<br/>@@@", It does appear that this line should still be here even though the output is in the new format. https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... File scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_invalid_format_isolated_script_test.json (right): https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_invalid_format_isolated_script_test.json:452: "status_code": 0 If this is invalid format should it be returning 0 or 1? https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... File scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_isolated_script_test_missing_shard.json (right): https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... 
scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_isolated_script_test_missing_shard.json:432: "@@@STEP_LOG_LINE@invalid_results_exc@'valid'@@@", It seems to swap out 'valid' for 'failures' in every one of these expectations that fails on invalid results. Might be worth investigating why.
https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/api.py (right): https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/api.py:11: from resources import results_merger On 2016/10/10 13:51:28, eyaich1 wrote: > Are you sure this is possible? Similar to my swarming api crrev.com/2394093002 > it seems like you have to kick off a subprocess to execute a script in another > directory. Maybe I am wrong if it is a subdirectory of your current directory > vs a parent or a sibling. Yes, this is possible because I added resources/__init__.py to turn it to a package https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... File scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_failed_isolated_script_test.json (left): https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_failed_isolated_script_test.json:425: "@@@STEP_TEXT@<br/>failures:<br/>test1.Test1<br/>test2.Test2<br/>test3.Test3<br/>test4.Test4<br/>@@@", On 2016/10/10 13:51:28, eyaich1 wrote: > It does appear that this line should still be here even though the output is in > the new format. This is done in newer patchset. https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... File scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_isolated_script_test_missing_shard.json (right): https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_isolated_script_test_missing_shard.json:432: "@@@STEP_LOG_LINE@invalid_results_exc@'valid'@@@", On 2016/10/10 13:51:28, eyaich1 wrote: > It seems to swap out 'valid' for 'failures' in every one of these expectations > that fails on invalid results. Might be worth investigating why. 
Oh, that's because we used to access the "failures" key first, then the "valid" key. The error message here was too cryptic, so I changed it to use repr(), which returns KeyError('valid',), making it much clearer
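The cryptic 'valid' log line and the repr() fix above can be illustrated with a minimal sketch (the empty dict here stands in for a malformed shard result; the trailing-comma form KeyError('valid',) is what Python 2, used by the recipes at the time, prints):

```python
# Why repr() is clearer than str() for logging a KeyError: str() prints
# only the missing key, while repr() also names the exception type.
try:
    results = {}       # stand-in for a malformed shard result
    results['valid']   # the key access that used to raise
except KeyError as e:
    print(str(e))      # just: 'valid' -- the cryptic log line seen above
    print(repr(e))     # KeyError('valid') -- immediately identifiable
```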
PTAL again Emily https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... File scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_invalid_format_isolated_script_test.json (right): https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_invalid_format_isolated_script_test.json:452: "status_code": 0 On 2016/10/10 13:51:28, eyaich1 wrote: > If this is invalid format should it be returning 0 or 1? Oh, this file is newly created, so the diff is for comparing with some existing expected test file. Though I think we should never override status_code of the test. If the pipeline fails & status_code of the test is still zero, the system should report that.
On 2016/10/10 14:42:29, nednguyen wrote: > PTAL again Emily > > https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... > File > scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_invalid_format_isolated_script_test.json > (right): > > https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... > scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_invalid_format_isolated_script_test.json:452: > "status_code": 0 > On 2016/10/10 13:51:28, eyaich1 wrote: > > If this is invalid format should it be returning 0 or 1? > > Oh, this file is newly created, so the diff is for comparing with some existing > expected test file. > > Though I think we should never override status_code of the test. If the pipeline > fails & status_code of the test is still zero, the system should report that. lgtm
The CQ bit was checked by nednguyen@google.com
The patchset sent to the CQ was uploaded after l-g-t-m from kbr@chromium.org, sergiyb@chromium.org Link to the patchset: https://codereview.chromium.org/2375663003/#ps320001 (title: "Use repr(e) to make the invalid_results_exc clearer")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: Build Presubmit on luci.infra.try (JOB_FAILED, https://luci-milo.appspot.com/swarming/task/31c7e5c41b570610)
On 2016/10/10 14:56:04, commit-bot: I haz the power wrote: > Try jobs failed on following builders: > Build Presubmit on luci.infra.try (JOB_FAILED, > https://luci-milo.appspot.com/swarming/task/31c7e5c41b570610) To avoid scripts/slave/unittests/recipe_lint_test.py, I moved results_merger back to scripts/slave/recipe_modules/swarming/.. Luckily, I found a way to import only that module, and no longer need to guard the import in scripts/slave/recipe_modules/swarming/__init__.py
The CQ bit was checked by nednguyen@google.com
The patchset sent to the CQ was uploaded after l-g-t-m from kbr@chromium.org, eyaich@chromium.org, sergiyb@chromium.org Link to the patchset: https://codereview.chromium.org/2375663003/#ps340001 (title: "Move results_merger out of resources/ & remove resources/__init__.py")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: Build Presubmit on luci.infra.try (JOB_FAILED, https://luci-milo.appspot.com/swarming/task/31c7fafd6da82f10)
The CQ bit was checked by nednguyen@google.com
The patchset sent to the CQ was uploaded after l-g-t-m from kbr@chromium.org, eyaich@chromium.org, sergiyb@chromium.org Link to the patchset: https://codereview.chromium.org/2375663003/#ps360001 (title: "Add no cover")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Description was changed from ========== Add json test results format support for SwarmingIsolatedScriptTest BUG=649762 ========== to ========== Add json test results format support for SwarmingIsolatedScriptTest BUG=649762 Committed: https://chromium.googlesource.com/chromium/tools/build/+/7baad31fc4293485a824... ==========
Message was sent while issue was closed.
Committed patchset #17 (id:360001) as https://chromium.googlesource.com/chromium/tools/build/+/7baad31fc4293485a824...
Message was sent while issue was closed.
Patchset #18 (id:380001) has been deleted