Created: 4 years, 2 months ago by nednguyen
Modified: 4 years, 2 months ago
Reviewers: Ken Russell (switch to Gerrit), Vadim Sh., Dirk Pranke, iannucci, M-A Ruel, estaab, eyaich1, Sergiy Byelozyorov
CC: chromium-reviews, infra-reviews+build_chromium.org, kjellander-cc_chromium.org, Dirk Pranke, martiniss
Target Ref: refs/heads/master
Project: build
Visibility: Public.

Description: Add json test results format support for SwarmingIsolatedScriptTest
BUG=649762
Committed: https://chromium.googlesource.com/chromium/tools/build/+/7baad31fc4293485a824d4ef6ef7361adc2979a1
Patch Set 1 #
Total comments: 22
Patch Set 2 : Move merging logic to a separate file for unittesting #
Patch Set 3 : Address review comments #
Total comments: 1
Patch Set 4 : Fix typos #
Total comments: 9
Patch Set 5 : Add expectation test coverage #
Patch Set 6 : Undo collect_gtest_task_test change #
Patch Set 7 : Update SwarmingIsolatedScriptTest #
Total comments: 2
Patch Set 8 : Add missing comments #
Patch Set 9 : Rebase #
Total comments: 24
Patch Set 10 : Address Sergiy's comments #
Patch Set 11 : Rebase #
Patch Set 12 : Address Dirk's comment #
Total comments: 2
Patch Set 13 : Address Sergiy's comments #
Total comments: 8
Patch Set 14 : Update failed test generation #
Patch Set 15 : Use repr(e) to make the invalid_results_exc clearer #
Patch Set 16 : Move results_merger out of resources/ & remove resources/__init__.py #
Patch Set 17 : Add no cover #

Messages
Total messages: 81 (36 generated)
The CQ bit was checked by nednguyen@google.com to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: No L-G-T-M from a valid reviewer yet. CQ run can only be started by full committers or once the patch has received an L-G-T-M from a full committer. Even if an L-G-T-M may have been provided, it was from a non-committer, _not_ a full super star committer. Committers are members of the group "project-infra-committers". Note that this has nothing to do with OWNERS files.
Patchset #1 (id:1) has been deleted
Patchset #1 (id:20001) has been deleted
Description was changed from ========== Support json test results format BUG= ========== to ========== Support json test results format BUG=649762 ==========
nednguyen@google.com changed reviewers: + estaab@chromium.org, kbr@chromium.org, sergiyb@chromium.org
This is not ready for full review yet, but I just want to get some feedback on the high-level goal: this CL allows merging shard results that are in the JSON test results format.

To clarify: when we say making test steps upload to test-results.appspot.com, do we mean uploading the merged JSON or each individual shard's JSON?
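For context, a rough sketch of the shard merging this CL describes (field names come from the JSON Test Results Format spec; the helper name and the flat `tests` update are simplifications for illustration, not the CL's actual code — the real format nests `tests` in a trie that a merger must walk):

```python
def merge_shard_results(shards):
    """Hypothetical merge of per-shard JSON test results dicts."""
    merged = {
        'tests': {},
        'num_failures_by_type': {},
        'seconds_since_epoch': None,
    }
    for shard in shards:
        # Union the per-test entries (flat here; the spec nests them).
        merged['tests'].update(shard.get('tests', {}))
        # Sum failure counts per result type across shards.
        for result_type, count in shard.get('num_failures_by_type', {}).items():
            merged['num_failures_by_type'].setdefault(result_type, 0)
            merged['num_failures_by_type'][result_type] += count
        # The earliest shard start time best represents the suite start.
        start = shard.get('seconds_since_epoch')
        if start is not None:
            current = merged['seconds_since_epoch']
            merged['seconds_since_epoch'] = (
                start if current is None else min(current, start))
    return merged
```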
eyaich@chromium.org changed reviewers: + eyaich@chromium.org
You are also going to have to update this contract in other places: 1) chromium_tests/steps.py, which relies on these results in the recipe (lines 815 and 1086), and 2) the scripts that run on the swarming bots that produce the results (src/testing/scripts/*).

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/swarming/api.py (right):

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:867: if not shard_results_list:
No need for this check; you create it above.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:869: elif 'seconds_since_epoch' in shard_results_list[0]:
Is this the distinguishing factor between the results from unittests vs perf tests? Is that the correct variant to be checking?
On 2016/09/29 at 14:18:42, nednguyen wrote:
> This is not ready for full review yet, but I just want to get some feedback on the high-level goal:
>
> This CL allows merging shard results that are in the JSON test results format.
>
> To clarify: when we say making test steps upload to test-results.appspot.com, do we mean uploading the merged JSON or each individual shard's JSON?

The merged results will need to be uploaded since they are keyed on (master, builder, test name, build number). The server doesn't have support for merging shards. Hope that helps. :)
sergiyb@chromium.org changed reviewers: + maruel@chromium.org
Yay! Thank you for this CL! Left a few comments.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/swarming/api.py (right):

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:809: raise recipe_api.InfraFailure('Invalid json test results')
I would double-check whether everybody implements the full spec... I remember seeing some fields missing in some test results, so I treated them all as optional.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:815: while nodes_queue:
I don't see nodes_queue defined anywhere... does this have tests?

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:839: results_json['seconds_since_epoch'])
Any particular reason to pick the smaller of the two values?

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:841: if not result_type in merged_results['num_failures_by_type']:
IMHO this is more Pythonic: merged_results['num_failures_by_type'].setdefault(result_type, 0)

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:867: if not shard_results_list:
On 2016/09/29 14:55:23, eyaich1 wrote:
> no need for this check; you create it above
But it can be empty, can't it?
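A toy illustration of the `setdefault` suggestion above (values are made up; `setdefault` inserts the key only if it is missing, replacing the explicit membership check):

```python
counts = {'PASS': 2}

# Equivalent to: if 'FAIL' not in counts: counts['FAIL'] = 0
counts.setdefault('FAIL', 0)
counts['FAIL'] += 1

# Existing keys are left untouched.
counts.setdefault('PASS', 0)

print(counts)  # {'PASS': 2, 'FAIL': 1}
```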
Thanks for generalizing this code. A couple of comments so far.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/swarming/api.py (right):

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:855: # These are the only keys we pay attention to in the output JSON.
Please move this comment to the top of _merge_simplified_test_result_format and adjust as necessary.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:868: raise Exception('No shard results is created')
Grammar: remove "is".

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/swarming/tests/collect_gtest_task_test.py (right):

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/tests/collect_gtest_task_test.py:232: }
Sorry, I don't understand: why is this empty?
https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/swarming/api.py (right):

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:809: raise recipe_api.InfraFailure('Invalid json test results')
On 2016/09/29 21:36:22, Sergiy Byelozyorov wrote:
> I would double-check if everybody implements the full spec... I remember seeing some fields missing in some test results, so I treated them all as optional.
Hmm, the spec says these fields are required: http://www.chromium.org/developers/the-json-test-results-format
Currently, this is not used by anyone who uses the script_isolated_test recipe, so I think it's better to follow the spec here.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:815: while nodes_queue:
On 2016/09/29 21:36:22, Sergiy Byelozyorov wrote:
> I don't see nodes_queue defined anywhere... does this have tests?
Yeah, I wasn't sure how to test these. Done.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:839: results_json['seconds_since_epoch'])
On 2016/09/29 21:36:22, Sergiy Byelozyorov wrote:
> Any particular reason to pick the smaller of the two values?
I'm honestly not sure what the reason for this field is, and the spec says it is required. Do you have any suggestion?

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:841: if not result_type in merged_results['num_failures_by_type']:
On 2016/09/29 21:36:22, Sergiy Byelozyorov wrote:
> IMHO this is more Pythonic:
> merged_results['num_failures_by_type'].setdefault(result_type, 0)
Done.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:855: # These are the only keys we pay attention to in the output JSON.
On 2016/09/30 23:37:53, Ken Russell wrote:
> Please move this comment to the top of _merge_simplified_test_result_format and adjust as necessary.
Done.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:868: raise Exception('No shard results is created')
On 2016/09/30 23:37:53, Ken Russell wrote:
> Grammar: remove "is".
Done.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:869: elif 'seconds_since_epoch' in shard_results_list[0]:
On 2016/09/29 14:55:23, eyaich1 wrote:
> is this the distinguishing factor between the results from unittests vs perf tests? Is that the correct variant to be checking?
This is the distinguishing factor between the simplified json result vs the chromium json test result.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/swarming/tests/collect_gtest_task_test.py (right):

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/tests/collect_gtest_task_test.py:232: }
On 2016/09/30 23:37:53, Ken Russell wrote:
> Sorry, I don't understand: why is this empty?
Sorry, I wasn't quite sure how to test this. Please see the updated patch.
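The format detection discussed in this reply can be sketched as follows (the function names and which branch maps to which merger are hypothetical; the thread only states that the presence of 'seconds_since_epoch' separates the two formats):

```python
def is_full_json_test_results(shard_result):
    # Per the discussion: 'seconds_since_epoch' is required by the
    # Chromium JSON Test Results Format, so its presence distinguishes
    # full-format results from the simplified format.
    return 'seconds_since_epoch' in shard_result


def pick_merger(shard_results_list):
    # Hypothetical dispatch mirroring the api.py branch under review.
    if not shard_results_list:
        raise ValueError('No shard results created')
    if is_full_json_test_results(shard_results_list[0]):
        return 'merge_json_test_results'
    return 'merge_simplified_results'
```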
https://codereview.chromium.org/2375663003/diff/80001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/swarming/__init__.py (right):

https://codereview.chromium.org/2375663003/diff/80001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/__init__.py:18: try:
Reviewers: I need to guard this so that I can run scripts/slave/recipe_modules/swarming/results_merger_unittest.py. Otherwise, the import fails because the recipe_engine module is not found. Please let me know if there is a better way to deal with this.
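The guard being described is roughly this pattern (a sketch under the assumption that the unittest runs outside the recipe engine, where recipe_engine is not importable; the fallback value is illustrative):

```python
# Allow this module to be imported both inside the recipe engine and
# from a plain unittest process where recipe_engine is not on sys.path.
try:
    from recipe_engine import recipe_api  # available inside recipe runs
except ImportError:
    recipe_api = None  # running under results_merger_unittest.py
```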
kbr@chromium.org changed reviewers: + vadimsh@chromium.org
CC'ing Vadim in particular regarding the unittest and the try/catch.
vadimsh@chromium.org changed reviewers: + iannucci@chromium.org
+ also Robbie, who's working on improving recipes unit tests in general

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
File scripts/slave/recipe_modules/swarming/api.py (right):

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
scripts/slave/recipe_modules/swarming/api.py:787: return results_merger.merge_test_results(shard_results_list)
I'd move this into a separate external script, similar to how collect_gtest_task.py is organized, e.g. instead of loading all JSONs into recipe engine memory and doing merges right in the recipe engine process, invoke an external "results_merger.py <path0> <path1> ...". Then you can avoid the 'try' in __init__.py, and the code coverage and unit testing story will be more consistent.

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
File scripts/slave/recipe_modules/swarming/results_merger.py (right):

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
scripts/slave/recipe_modules/swarming/results_merger.py:1: # Copyright 2016 The Chromium Authors. All rights reserved.
I suspect you will have a hard time convincing recipe tests that this code is 100% covered, because it will probably ignore code coverage provided by results_merger_unittest.py.
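The external-script idea suggested here could look roughly like this (a sketch; the real script would call results_merger.merge_test_results, for which a trivial stand-in merge is inlined below):

```python
#!/usr/bin/env python
# Hypothetical standalone merger invoked as:
#   results_merger.py <path0> <path1> ...
# Loads each shard's JSON file and prints a merged result, so the
# recipe engine never holds all shard JSONs in memory itself.
import json
import sys


def merge(shard_jsons):
    # Stand-in for results_merger.merge_test_results: just sum the
    # per-type failure counts across shards.
    merged = {'num_failures_by_type': {}}
    for shard in shard_jsons:
        for rtype, count in shard.get('num_failures_by_type', {}).items():
            merged['num_failures_by_type'][rtype] = (
                merged['num_failures_by_type'].get(rtype, 0) + count)
    return merged


def main(argv):
    shard_jsons = []
    for path in argv[1:]:
        with open(path) as f:
            shard_jsons.append(json.load(f))
    json.dump(merge(shard_jsons), sys.stdout)
    return 0


if __name__ == '__main__':
    sys.exit(main(sys.argv))
```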
https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/api.py (right): https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/api.py:787: return results_merger.merge_test_results(shard_results_list) On 2016/10/04 01:05:29, Vadim Sh. wrote: > I'd move this into a separate external script, similar to how > collect_gtest_task.py is organized, e.g. instead of loading all JSONs into > recipe engine memory and doing merges right in the receipt engine process, > invoke an external "results_merger.py <path0> <path1> ...". > > Then you can avoid 'try' in __init__.py, and code coverage and unit testing > story will be more consistent. Though it will introduce a separate step :-/ Also it will probably be slightly slower (for extra-large files) due to additional JSON serialization roundtrip. Don't know.
Can use leak_to feature of json.output to avoid round trip.

On Mon, Oct 3, 2016, 18:11 <vadimsh@chromium.org> wrote:
> https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
> File scripts/slave/recipe_modules/swarming/api.py (right):
>
> https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
> scripts/slave/recipe_modules/swarming/api.py:787: return results_merger.merge_test_results(shard_results_list)
> On 2016/10/04 01:05:29, Vadim Sh. wrote:
> > I'd move this into a separate external script, similar to how collect_gtest_task.py is organized, e.g. instead of loading all JSONs into recipe engine memory and doing merges right in the recipe engine process, invoke an external "results_merger.py <path0> <path1> ...".
> >
> > Then you can avoid 'try' in __init__.py, and code coverage and unit testing story will be more consistent.
>
> Though it will introduce a separate step :-/ Also it will probably be slightly slower (for extra-large files) due to additional JSON serialization roundtrip. Don't know.
>
> https://codereview.chromium.org/2375663003/
It should certainly be made possible to unit test the merger functions, but I'm not convinced that hacking in the current unit test is a good idea. None of the recipe infrastructure will actually run that unit test on the commit queue, so it defeats the purpose of adding it. iannucci@ was talking about adding a post_process step to the recipe_simulation_tests, which could allow an assertion to be made about the results of an earlier step. See this thread (Google internal only, sorry): https://groups.google.com/a/google.com/forum/#!topic/chrome-recipes/GeI-7BZX4dk Even if that's not possible I would advocate adding a simulation test in scripts/slave/recipes/chromium.py, since that's the current regression testing framework. Please update the CL when you decide on how to test this. Thanks.
Yes and post process assertions should land this week. I'll take a look at the CL when off mobile :)

On Mon, Oct 3, 2016, 18:14 <kbr@chromium.org> wrote:
> It should certainly be made possible to unit test the merger functions, but I'm not convinced that hacking in the current unit test is a good idea. None of the recipe infrastructure will actually run that unit test on the commit queue, so it defeats the purpose of adding it.
>
> iannucci@ was talking about adding a post_process step to the recipe_simulation_tests, which could allow an assertion to be made about the results of an earlier step. See this thread (Google internal only, sorry):
>
> https://groups.google.com/a/google.com/forum/#!topic/chrome-recipes/GeI-7BZX4dk
>
> Even if that's not possible I would advocate adding a simulation test in scripts/slave/recipes/chromium.py, since that's the current regression testing framework.
>
> Please update the CL when you decide on how to test this. Thanks.
>
> https://codereview.chromium.org/2375663003/
On 2016/10/04 01:14:29, Ken Russell wrote:
> It should certainly be made possible to unit test the merger functions, but I'm not convinced that hacking in the current unit test is a good idea. None of the recipe infrastructure will actually run that unit test on the commit queue, so it defeats the purpose of adding it.

build/PRESUBMIT actually picks up the unittest files if I put them in the tests/ folder: https://cs.chromium.org/chromium/build/PRESUBMIT.py?rcl=0&l=121

> iannucci@ was talking about adding a post_process step to the recipe_simulation_tests, which could allow an assertion to be made about the results of an earlier step. See this thread (Google internal only, sorry):
>
> https://groups.google.com/a/google.com/forum/#!topic/chrome-recipes/GeI-7BZX4dk
>
> Even if that's not possible I would advocate adding a simulation test in scripts/slave/recipes/chromium.py, since that's the current regression testing framework.

I think we need to do both the expectation test and the unittest.

The expectation test is for code coverage. The unittest is also needed because:
1) It's much easier to add more test cases using the unittest framework.
2) It's much faster to run scripts/slave/recipe_modules/swarming/results_merger_unittest.py.
3) When a test fails, it's easier to figure out what went wrong & fix it with the unittest.

> Please update the CL when you decide on how to test this. Thanks.
On 2016/10/04 01:23:27, nednguyen wrote:
> On 2016/10/04 01:14:29, Ken Russell wrote:
> > It should certainly be made possible to unit test the merger functions, but I'm not convinced that hacking in the current unit test is a good idea. None of the recipe infrastructure will actually run that unit test on the commit queue, so it defeats the purpose of adding it.
>
> build/PRESUBMIT actually picks up the unittest files if I put them in the tests/ folder: https://cs.chromium.org/chromium/build/PRESUBMIT.py?rcl=0&l=121

Thanks, I didn't know that.

> I think we need to do both the expectation test and the unittest.
>
> The expectation test is for code coverage. The unittest is also needed because:
> 1) It's much easier to add more test cases using the unittest framework.
> 2) It's much faster to run scripts/slave/recipe_modules/swarming/results_merger_unittest.py.
> 3) When a test fails, it's easier to figure out what went wrong & fix it with the unittest.

I agree it's easier to write unit tests than recipe simulation tests. If you'd like to move forward with your unit test then please move it to the correct directory to be picked up by the presubmit checks. Do the presubmit checks pass at this point? I'm assuming they don't.
https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
File scripts/slave/recipe_modules/swarming/__init__.py (right):

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
scripts/slave/recipe_modules/swarming/__init__.py:18: try:
Add a comment that this try/catch enables the results_merger_unittest to run.

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
File scripts/slave/recipe_modules/swarming/tests/collect_gtest_task_test.py (right):

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
scripts/slave/recipe_modules/swarming/tests/collect_gtest_task_test.py:359: self.assertEqual(GOOD_JSON_TEST_RESULT_MERGED, merged)
This test doesn't pass, does it? I see Vadim's point about moving the merger into its own separate script, since these unit tests are pretty simple to understand.
https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
File scripts/slave/recipe_modules/swarming/__init__.py (right):

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
scripts/slave/recipe_modules/swarming/__init__.py:18: try:
On 2016/10/04 01:43:43, Ken Russell wrote:
> Add a comment that this try/catch enables the results_merger_unittest to run.
This may not be needed if I migrate results_merger to a script.

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
File scripts/slave/recipe_modules/swarming/results_merger.py (right):

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
scripts/slave/recipe_modules/swarming/results_merger.py:1: # Copyright 2016 The Chromium Authors. All rights reserved.
On 2016/10/04 01:05:29, Vadim Sh. wrote:
> I suspect you will have a hard time convincing recipe tests that this code is 100% covered, because it will probably ignore code coverage provided by results_merger_unittest.py.
I plan to add both an expectation test & a unittest. However, if I change this to be a runnable script, does the expectation test's coverage include the script's code?

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
File scripts/slave/recipe_modules/swarming/tests/collect_gtest_task_test.py (right):

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
scripts/slave/recipe_modules/swarming/tests/collect_gtest_task_test.py:359: self.assertEqual(GOOD_JSON_TEST_RESULT_MERGED, merged)
On 2016/10/04 01:43:43, Ken Russell wrote:
> This test doesn't pass, does it? I see Vadim's point about moving the merger into its own separate script, since these unit tests are pretty simple to understand.
Ah, this change should be undone. My bad.
https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/swarming/api.py (right):

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:809: raise recipe_api.InfraFailure('Invalid json test results')
On 2016/10/04 00:10:11, nednguyen wrote:
> Hmm, the spec says these fields are required: http://www.chromium.org/developers/the-json-test-results-format
>
> Currently, this is not used by anyone who uses the script_isolated_test recipe, so I think it's better to follow the spec here.
Sure. I just meant that you should make sure it doesn't break with real-world results. If you find any launchers that do not follow the spec, please file bugs into the Infra>Flakiness>Dashboard component for them.

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:839: results_json['seconds_since_epoch'])
On 2016/10/04 00:10:11, nednguyen wrote:
> On 2016/09/29 21:36:22, Sergiy Byelozyorov wrote:
> > Any particular reason to pick the smaller of the two values?
>
> I'm honestly not sure what the reason for this field is, and the spec says it is required. Do you have any suggestion?
AFAIK, this is when the tests were run... probably some random time measured by the test launcher. Minimum of the two values is perfectly fine with me. I was just wondering if this was an arbitrary choice or whether there was some logic behind it.

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
File scripts/slave/recipe_modules/swarming/results_merger.py (right):

https://codereview.chromium.org/2375663003/diff/100001/scripts/slave/recipe_m...
scripts/slave/recipe_modules/swarming/results_merger.py:1: # Copyright 2016 The Chromium Authors. All rights reserved.
On 2016/10/04 01:49:52, nednguyen wrote:
> On 2016/10/04 01:05:29, Vadim Sh. wrote:
> > I suspect you will have a hard time convincing recipe tests that this code is 100% covered, because it will probably ignore code coverage provided by results_merger_unittest.py.
>
> I plan to add both an expectation test & a unittest. However, if I change this to be a runnable script, does the expectation test's coverage include the script's code?
+1 for unit tests, because IMHO we have too many expectation tests (essentially, integration tests) covering just a few lines of code. But like Vadim, I am not sure if our coverage system takes unittests into account when computing coverage. You can try running the CQ in dry run on this CL to check that.
https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/swarming/api.py (right):

https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/swarming/api.py:839: results_json['seconds_since_epoch'])
On 2016/10/04 08:54:52, Sergiy Byelozyorov wrote:
> AFAIK, this is when the tests were run... probably some random time measured by the test launcher. Minimum of the two values is perfectly fine with me. I was just wondering if this was an arbitrary choice or whether there was some logic behind it.

Ah, I just think that the earliest seconds_since_epoch time of all the shards best represents the seconds_since_epoch time of when the suite actually starts.
On 2016/10/04 13:30:29, nednguyen wrote:
> On 2016/10/04 08:54:52, Sergiy Byelozyorov wrote:
> > AFAIK, this is when the tests were run... probably some random time measured by the test launcher. Minimum of the two values is perfectly fine with me. I was just wondering if this was an arbitrary choice or whether there was some logic behind it.
>
> Ah, I just think that the earliest seconds_since_epoch time of all the shards best represents the seconds_since_epoch time of when the suite actually starts.

I looked into how collect_gtest_task.py is implemented; there is a lot of boilerplate code just to share the logic by turning it into a runnable script. I will try my best to get full coverage of the results_merger module by changing some existing test steps with the current implementation, and not go down the path of scriptifying.
On 2016/10/04 13:34:14, nednguyen wrote:
> I looked into how collect_gtest_task.py is implemented; there is a lot of boilerplate code just to share the logic by turning it into a runnable script. I will try my best to get full coverage of the results_merger module by changing some existing test steps with the current implementation, and not go down the path of scriptifying.

This CL now achieves full expectation test coverage & the unittest is passing. PTAL again.
The CQ bit was checked by nednguyen@google.com to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: No L-G-T-M from a valid reviewer yet. CQ run can only be started by full committers or once the patch has received an L-G-T-M from a full committer. Even if an L-G-T-M may have been provided, it was from a non-committer, _not_ a full super star committer. Committers are members of the group "project-infra-committers". Note that this has nothing to do with OWNERS files.
On 2016/10/04 17:46:17, nednguyen wrote:
> This CL now achieves full expectation test coverage & the unittest is passing. PTAL again.

Actually I forgot to update the swarmed_isolated_script_test recipe to deal with the json result format, so this CL is not ready to land. Will be working on this next.
dpranke@chromium.org changed reviewers: + dpranke@chromium.org
https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo... File scripts/slave/recipe_modules/swarming/api.py (right): https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo... scripts/slave/recipe_modules/swarming/api.py:809: raise recipe_api.InfraFailure('Invalid json test results') On 2016/10/04 08:54:52, Sergiy Byelozyorov wrote: > On 2016/10/04 00:10:11, nednguyen wrote: > > On 2016/09/29 21:36:22, Sergiy Byelozyorov wrote: > > > I would double-check if everybody implements the full spec... I remember > > seeing > > > some fields missing in some test results, so I treated them all as optional. > > > > Hmhh, the spec say these fields are required? > > http://www.chromium.org/developers/the-json-test-results-format > > > > Currently, this is not used by anyone who use the script_isolated_test recipe, > > so I think it's better to follow the spec here. > > Sure. I just meant that you should make sure it doesn't break with real-world > results. If you find any launchers that do not follow the spec, please file bugs > into Infra>Flakiness>Dashboard component for them. We should try to follow the spec and update the spec if need be if we can't follow it, but we shouldn't arbitrarily diverge from the spec. (This is probably obvious).
On 2016/10/04 20:04:03, Dirk Pranke (slow) wrote: > https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo... > File scripts/slave/recipe_modules/swarming/api.py (right): > > https://codereview.chromium.org/2375663003/diff/40001/scripts/slave/recipe_mo... > scripts/slave/recipe_modules/swarming/api.py:809: raise > recipe_api.InfraFailure('Invalid json test results') > On 2016/10/04 08:54:52, Sergiy Byelozyorov wrote: > > On 2016/10/04 00:10:11, nednguyen wrote: > > > On 2016/09/29 21:36:22, Sergiy Byelozyorov wrote: > > > > I would double-check if everybody implements the full spec... I remember > > > seeing > > > > some fields missing in some test results, so I treated them all as > optional. > > > > > > Hmhh, the spec say these fields are required? > > > http://www.chromium.org/developers/the-json-test-results-format > > > > > > Currently, this is not used by anyone who use the script_isolated_test > recipe, > > > so I think it's better to follow the spec here. > > > > Sure. I just meant that you should make sure it doesn't break with real-world > > results. If you find any launchers that do not follow the spec, please file > bugs > > into Infra>Flakiness>Dashboard component for them. > > We should try to follow the spec and update the spec if need be if we can't > follow it, but we shouldn't arbitrarily diverge from the spec. > > (This is probably obvious). SwarmingIsolatedScriptTest is also updated to support json result format, PTAL
Description was changed from ========== Support json test results format BUG=649762 ========== to ========== Add json test results format support for SwarmingIsolatedScriptTest BUG=649762 ==========
Looks excellent to me. LGTM https://codereview.chromium.org/2375663003/diff/160001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/results_merger.py (right): https://codereview.chromium.org/2375663003/diff/160001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/results_merger.py:18: a Please finish this comment.
https://codereview.chromium.org/2375663003/diff/160001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/results_merger.py (right): https://codereview.chromium.org/2375663003/diff/160001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/results_merger.py:18: a On 2016/10/05 00:56:28, Ken Russell wrote: > Please finish this comment. Done. I was O_o
On 2016/10/05 13:14:37, nednguyen wrote: > https://codereview.chromium.org/2375663003/diff/160001/scripts/slave/recipe_m... > File scripts/slave/recipe_modules/swarming/results_merger.py (right): > > https://codereview.chromium.org/2375663003/diff/160001/scripts/slave/recipe_m... > scripts/slave/recipe_modules/swarming/results_merger.py:18: a > On 2016/10/05 00:56:28, Ken Russell wrote: > > Please finish this comment. > > Done. I was O_o I just had a rebase conflict. ~_~ Ping infra folks!
https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/chromium_tests/steps.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/chromium_tests/steps.py:1085: valid = results['valid'] Please remove two lines above... they are not needed anymore. https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/chromium_tests/steps.py:1092: tests[t]['expected'] != tests[t]['actual']) AFAIK, 'expected' and 'actual' are lists of results separated by space,e.g. expected = 'TIMEOUT IMAGE' and actual = 'FAIL FAIL IMAGE'. The example above means that test has been run 3 tries - failed twice and produced an image once. The expected list means that either timeout or image results are ok. https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/chromium_tests/steps.py:1093: valid = results['num_failures_by_type'].get('FAIL', 0) == len(failures) FAIL is not the only type of failure. https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/__init__.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/__init__.py:27: # let results_merger_unittest runnable. Why does results_merger_unittest need to import this? If the import fails then it does not actually get anything useful from this module. https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/results_merger.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... 
scripts/slave/recipe_modules/swarming/results_merger.py:44: merged_results[key] = merged_results[key] and result_json[key] I'd make this code less generic and easier to read: for result_json in shard_results_list: successes = result_json.get('successes', []) failures = result_json.get('failures', []) valid = result_json.get('valid', True) if (not isinstance(successes, list) or not isinstance(failures, list) or not isinstance(valid, bool)): raise Exception('Unexpected value type in %s' % result_json) merged_results['successes'].extend(successes) merged_results['failures'].extend(failures) merged_results['valid'] = merged_results['valid'] and valid https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py:14: 0, os.path.abspath(os.path.join(THIS_DIR, '..', '..', '..', 'unittests'))) Please move close to actual 'import test_env' line below or move import line here. https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py:16: sys.path.insert(0, os.path.join(THIS_DIR, '..', '..')) What is this one for? https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py:212: class MergingTest(unittest.TestCase): # pragma: no cover Why is this marked to be without coverage? Aren't we running these tests? https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipes/... File scripts/slave/recipes/chromium.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipes/... scripts/slave/recipes/chromium.py:340: nit: please remove this line... no need for double-empty-line between tests
https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/chromium_tests/steps.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/chromium_tests/steps.py:1085: valid = results['valid'] On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > Please remove two lines above... they are not needed anymore. Done. https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/chromium_tests/steps.py:1092: tests[t]['expected'] != tests[t]['actual']) On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > AFAIK, 'expected' and 'actual' are lists of results separated by space,e.g. > expected = 'TIMEOUT IMAGE' and actual = 'FAIL FAIL IMAGE'. The example above > means that test has been run 3 tries - failed twice and produced an image once. > The expected list means that either timeout or image results are ok. I see. The spec says: "actual" is an ordered space-separated list of the results the test actually produced. "expected" is an unordered space-separated list of the result types expected for the test. So for this semantic, I compare the set(test results) instead. Wdyt? https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/chromium_tests/steps.py:1093: valid = results['num_failures_by_type'].get('FAIL', 0) == len(failures) On 2016/10/05 15:36:05, Sergiy Byelozyorov wrote: > FAIL is not the only type of failure. But shouldn't it be the only one we use for checking the validity of the result here? With CRASH & TIMEOUT, the number of test failures may not equal the number of "failures" computed above? https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/__init__.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m...
scripts/slave/recipe_modules/swarming/__init__.py:27: # let results_merger_unittest runnable. On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > Why does results_merger_unittest need to import this? If the import fails then > it does not actually get anything useful from this module. Done. Move results_merger to resources/ help solved this. https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/results_merger.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/results_merger.py:44: merged_results[key] = merged_results[key] and result_json[key] On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > I'd make this code less generic and easier to read: > > for result_json in shard_results_list: > successes = result_json.get('successes', []) > failures = result_json.get('failures', []) > valid = result_json.get('valid', True) > > if (not isinstance(successes, list) or not isinstance(failures, list) or > not isinstance(valid, bool)) > raise Exception('Unexpected value type in %s' % result_json) > > merged_results['successes'].extend(successes) > merged_results['failures'].extend(failures) > merged_results['valid'] = merged_results['valid'] and valid Done. https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py:14: 0, os.path.abspath(os.path.join(THIS_DIR, '..', '..', '..', 'unittests'))) On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > Please move close to actual 'import test_env' line below or move import line > here. Done. https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... 
scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py:16: sys.path.insert(0, os.path.join(THIS_DIR, '..', '..')) On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > What is this one for? For importing results_merger. PTAL again https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py:212: class MergingTest(unittest.TestCase): # pragma: no cover On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > Why is this marked to be without coverage? Aren't we running these tests? We aren't running these tests in the expectation framework, but are running them in PRESUBMIT: https://cs.chromium.org/chromium/build/PRESUBMIT.py?rcl=0&l=121 https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipes/... File scripts/slave/recipes/chromium.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipes/... scripts/slave/recipes/chromium.py:340: On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > nit: please remove this line... no need for double-empty-line between tests Done.
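The merge shape suggested in the review above — concatenate `successes` and `failures` across shards and AND together the `valid` flags — can be written as a runnable sketch. `merge_shard_results` is a hypothetical name for illustration, not the landed `results_merger` API:

```python
# Runnable sketch of the merging logic discussed in review: extend the
# 'successes' and 'failures' lists across shards and keep 'valid' True
# only if every shard reported valid results. Hypothetical helper.
def merge_shard_results(shard_results_list):
    merged = {'successes': [], 'failures': [], 'valid': True}
    for result_json in shard_results_list:
        successes = result_json.get('successes', [])
        failures = result_json.get('failures', [])
        valid = result_json.get('valid', True)
        if (not isinstance(successes, list) or
                not isinstance(failures, list) or
                not isinstance(valid, bool)):
            raise ValueError('Unexpected value type in %s' % result_json)
        merged['successes'].extend(successes)
        merged['failures'].extend(failures)
        merged['valid'] = merged['valid'] and valid
    return merged

print(merge_shard_results([
    {'successes': ['a.Test'], 'failures': [], 'valid': True},
    {'successes': [], 'failures': ['b.Test'], 'valid': False},
]))  # -> {'successes': ['a.Test'], 'failures': ['b.Test'], 'valid': False}
```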
https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/chromium_tests/steps.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/chromium_tests/steps.py:1093: valid = results['num_failures_by_type'].get('FAIL', 0) == len(failures) On 2016/10/05 18:50:53, nednguyen wrote: > On 2016/10/05 15:36:05, Sergiy Byelozyorov wrote: > > FAIL is not the only type of failure. > > But is should be the only one we use for checking the validity of the result > here? With CRASH & TIMEOUT, the number of test failure may not be equaled the > number of "failures" computed above? "valid" is used to detect whether something went horribly wrong when trying to run the tests (i.e., exceptional circumstances), not whether any individual tests passed or failed. So it should be true if you got any well-formed results back at all, even if they were all failures and crashes, etc.
https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/chromium_tests/steps.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/chromium_tests/steps.py:1093: valid = results['num_failures_by_type'].get('FAIL', 0) == len(failures) On 2016/10/05 19:39:46, Dirk Pranke (slow) wrote: > On 2016/10/05 18:50:53, nednguyen wrote: > > On 2016/10/05 15:36:05, Sergiy Byelozyorov wrote: > > > FAIL is not the only type of failure. > > > > But is should be the only one we use for checking the validity of the result > > here? With CRASH & TIMEOUT, the number of test failure may not be equaled the > > number of "failures" computed above? > > "valid" is used to detect whether something went horribly wrong when trying to > run the tests (i.e., exceptional circumstances), not whether any individual > tests passed or failed. So it should be true if you got any well-formed results > back at all, even if they were all failures and crashes, etc. Done. We have a try catch block at 1105 to check valid.
On 2016/10/05 20:14:29, nednguyen wrote: > https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... > File scripts/slave/recipe_modules/chromium_tests/steps.py (right): > > https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... > scripts/slave/recipe_modules/chromium_tests/steps.py:1093: valid = > results['num_failures_by_type'].get('FAIL', 0) == len(failures) > On 2016/10/05 19:39:46, Dirk Pranke (slow) wrote: > > On 2016/10/05 18:50:53, nednguyen wrote: > > > On 2016/10/05 15:36:05, Sergiy Byelozyorov wrote: > > > > FAIL is not the only type of failure. > > > > > > But is should be the only one we use for checking the validity of the result > > > here? With CRASH & TIMEOUT, the number of test failure may not be equaled > the > > > number of "failures" computed above? > > > > "valid" is used to detect whether something went horribly wrong when trying to > > run the tests (i.e., exceptional circumstances), not whether any individual > > tests passed or failed. So it should be true if you got any well-formed > results > > back at all, even if they were all failures and crashes, etc. > > Done. We have a try catch block at 1105 to check valid. PTAL
lgtm w/ comments https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/chromium_tests/steps.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/chromium_tests/steps.py:1092: tests[t]['expected'] != tests[t]['actual']) On 2016/10/05 18:50:53, nednguyen wrote: > On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > > AFAIK, 'expected' and 'actual' are lists of results separated by space,e.g. > > expected = 'TIMEOUT IMAGE' and actual = 'FAIL FAIL IMAGE'. The example above > > means that test has been run 3 tries - failed twice and produced an image > once. > > The expected list means that either timeout or image results are ok. > > I see. The spec says: > "actual" is an ordered space-separated list of the results the test actually > produced. > > "expected" is an unordered space-separated list of the result types expected for > the test. > > So for this semantic, I compare the set(test results) instead. Wdyt? I think, to get a list of failures, you'd need to do the following: failures = list( t for t in tests if all(res not in tests[t]['expected'].split() for res in tests[t]['actual'].split())) https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py:212: class MergingTest(unittest.TestCase): # pragma: no cover On 2016/10/05 18:50:53, nednguyen wrote: > On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > > Why is this marked to be without coverage? Aren't we running these tests? 
> > We aren't running these test in expectation framework, but are running them in > PRESUBMIT: > > https://cs.chromium.org/chromium/build/PRESUBMIT.py?rcl=0&l=121 This is sad that we have to misuse pragma-no-cover here, but for lack of a better alternative, I guess this will do. Can you please add a comment explaining that? https://codereview.chromium.org/2375663003/diff/260001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/test_utils/test_api.py (right): https://codereview.chromium.org/2375663003/diff/260001/scripts/slave/recipe_m... scripts/slave/recipe_modules/test_utils/test_api.py:154: tests_run = { the actual/expected will also need to be updated here. actual - ordered list of actual test results expected - set of results that are considered as passing for example: actual: FAIL FAIL PASS expected: PASS is for a flaky test that only passed on a third try. actual: TIMEOUT expected: PASS TIMEOUT means that the test passed, because TIMEOUT is one of the expected results.
https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/chromium_tests/steps.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/chromium_tests/steps.py:1092: tests[t]['expected'] != tests[t]['actual']) On 2016/10/09 14:43:10, Sergiy Byelozyorov wrote: > On 2016/10/05 18:50:53, nednguyen wrote: > > On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > > > AFAIK, 'expected' and 'actual' are lists of results separated by space,e.g. > > > expected = 'TIMEOUT IMAGE' and actual = 'FAIL FAIL IMAGE'. The example above > > > means that test has been run 3 tries - failed twice and produced an image > > once. > > > The expected list means that either timeout or image results are ok. > > > > I see. The spec says: > > "actual" is an ordered space-separated list of the results the test actually > > produced. > > > > "expected" is an unordered space-separated list of the result types expected > for > > the test. > > > > So for this semantic, I compare the set(test results) instead. Wdyt? > > I think, to get a list of failures, you'd need to do the following: > > failures = list( > t for t in tests > if all(res not in tests[t]['expected'].split() > for res in tests[t]['actual'].split()) Done. https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py (right): https://codereview.chromium.org/2375663003/diff/200001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/tests/results_merger_unittest.py:212: class MergingTest(unittest.TestCase): # pragma: no cover On 2016/10/09 14:43:10, Sergiy Byelozyorov wrote: > On 2016/10/05 18:50:53, nednguyen wrote: > > On 2016/10/05 15:36:06, Sergiy Byelozyorov wrote: > > > Why is this marked to be without coverage? Aren't we running these tests? 
> > > > We aren't running these test in expectation framework, but are running them in > > PRESUBMIT: > > > > https://cs.chromium.org/chromium/build/PRESUBMIT.py?rcl=0&l=121 > > This is sad that we have to misuse pragma-no-cover here, but for lack of a > better alternative, I guess this will do. Can you please add a comment > explaining that? Done. https://codereview.chromium.org/2375663003/diff/260001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/test_utils/test_api.py (right): https://codereview.chromium.org/2375663003/diff/260001/scripts/slave/recipe_m... scripts/slave/recipe_modules/test_utils/test_api.py:154: tests_run = { On 2016/10/09 14:43:10, Sergiy Byelozyorov wrote: > the actual/expected will also need to be updated here. > > actual - ordered list of actual test results > expected - set of results that are considered as passing > > for example: > > actual: FAIL FAIL PASS > expected: PASS > > is for a flaky test that only passed on a third try. > > actual: TIMEOUT > expected: PASS TIMEOUT > > means that the test passed, because TIMEOUT is one of the expected results. Done.
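The 'actual'/'expected' semantics discussed above — a test counts as a failure only if none of the results it actually produced is in its expected set — can be sketched as follows. `compute_failures` and the sample test names are hypothetical, not the actual steps.py code:

```python
# Sketch of the failure rule from the review: 'actual' is an ordered
# space-separated list of produced results, 'expected' is an unordered
# set of acceptable results; a test fails only if no actual result is
# in the expected set. Hypothetical helper name.
def compute_failures(tests):
    return [name for name, result in sorted(tests.items())
            if all(res not in result['expected'].split()
                   for res in result['actual'].split())]

tests = {
    'flaky.Test':   {'actual': 'FAIL FAIL PASS', 'expected': 'PASS'},
    'timeout.Test': {'actual': 'TIMEOUT',        'expected': 'PASS TIMEOUT'},
    'broken.Test':  {'actual': 'FAIL FAIL',      'expected': 'PASS'},
}
# flaky.Test passed on the third try; timeout.Test produced an expected
# TIMEOUT; only broken.Test never produced an expected result.
print(compute_failures(tests))  # -> ['broken.Test']
```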
https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/api.py (right): https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/api.py:11: from resources import results_merger Are you sure this is possible? Similar to my swarming api crrev.com/2394093002 it seems like you have to kick off a subprocess to execute a script in another directory. Maybe I am wrong if it is a subdirectory of your current directory vs a parent or a sibling. https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... File scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_failed_isolated_script_test.json (left): https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_failed_isolated_script_test.json:425: "@@@STEP_TEXT@<br/>failures:<br/>test1.Test1<br/>test2.Test2<br/>test3.Test3<br/>test4.Test4<br/>@@@", It does appear that this line should still be here even though the output is in the new format. https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... File scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_invalid_format_isolated_script_test.json (right): https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_invalid_format_isolated_script_test.json:452: "status_code": 0 If this is invalid format should it be returning 0 or 1? https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... File scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_isolated_script_test_missing_shard.json (right): https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... 
scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_isolated_script_test_missing_shard.json:432: "@@@STEP_LOG_LINE@invalid_results_exc@'valid'@@@", It seems to swap out 'valid' for 'failures' in every one of these expectations that fails on invalid results. Might be worth investigating why.
https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipe_m... File scripts/slave/recipe_modules/swarming/api.py (right): https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipe_m... scripts/slave/recipe_modules/swarming/api.py:11: from resources import results_merger On 2016/10/10 13:51:28, eyaich1 wrote: > Are you sure this is possible? Similar to my swarming api crrev.com/2394093002 > it seems like you have to kick off a subprocess to execute a script in another > directory. Maybe I am wrong if it is a subdirectory of your current directory > vs a parent or a sibling. Yes, this is possible because I added resources/__init__.py to turn it to a package https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... File scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_failed_isolated_script_test.json (left): https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_failed_isolated_script_test.json:425: "@@@STEP_TEXT@<br/>failures:<br/>test1.Test1<br/>test2.Test2<br/>test3.Test3<br/>test4.Test4<br/>@@@", On 2016/10/10 13:51:28, eyaich1 wrote: > It does appear that this line should still be here even though the output is in > the new format. This is done in newer patchset. https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... File scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_isolated_script_test_missing_shard.json (right): https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_isolated_script_test_missing_shard.json:432: "@@@STEP_LOG_LINE@invalid_results_exc@'valid'@@@", On 2016/10/10 13:51:28, eyaich1 wrote: > It seems to swap out 'valid' for 'failures' in every one of these expectations > that fails on invalid results. Might be worth investigating why. 
Oh, that's because we used to access the "failures" key first, then the "valid" key. The error message here was too cryptic, so I changed it to use repr(), which returns KeyError('valid',), making it much clearer
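The cryptic 'valid' log line and the repr() fix above can be illustrated with a minimal sketch (the empty dict here stands in for a malformed shard result; the trailing-comma form KeyError('valid',) is what Python 2, used by the recipes at the time, prints):

```python
# Why repr() is clearer than str() for logging a KeyError: str() prints
# only the missing key, while repr() also names the exception type.
try:
    results = {}       # stand-in for a malformed shard result
    results['valid']   # the key access that used to raise
except KeyError as e:
    print(str(e))      # just: 'valid' -- the cryptic log line seen above
    print(repr(e))     # KeyError('valid') -- immediately identifiable
```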
PTAL again Emily https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... File scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_invalid_format_isolated_script_test.json (right): https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_invalid_format_isolated_script_test.json:452: "status_code": 0 On 2016/10/10 13:51:28, eyaich1 wrote: > If this is invalid format should it be returning 0 or 1? Oh, this file is newly created, so the diff is for comparing with some existing expected test file. Though I think we should never override status_code of the test. If the pipeline fails & status_code of the test is still zero, the system should report that.
On 2016/10/10 14:42:29, nednguyen wrote: > PTAL again Emily > > https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... > File > scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_invalid_format_isolated_script_test.json > (right): > > https://codereview.chromium.org/2375663003/diff/280001/scripts/slave/recipes/... > scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_invalid_format_isolated_script_test.json:452: > "status_code": 0 > On 2016/10/10 13:51:28, eyaich1 wrote: > > If this is invalid format should it be returning 0 or 1? > > Oh, this file is newly created, so the diff is for comparing with some existing > expected test file. > > Though I think we should never override status_code of the test. If the pipeline > fails & status_code of the test is still zero, the system should report that. lgtm
The CQ bit was checked by nednguyen@google.com
The patchset sent to the CQ was uploaded after l-g-t-m from kbr@chromium.org, sergiyb@chromium.org Link to the patchset: https://codereview.chromium.org/2375663003/#ps320001 (title: "Use repr(e) to make the invalid_results_exc clearer")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: Build Presubmit on luci.infra.try (JOB_FAILED, https://luci-milo.appspot.com/swarming/task/31c7e5c41b570610)
On 2016/10/10 14:56:04, commit-bot: I haz the power wrote: > Try jobs failed on following builders: > Build Presubmit on luci.infra.try (JOB_FAILED, > https://luci-milo.appspot.com/swarming/task/31c7e5c41b570610) To avoid scripts/slave/unittests/recipe_lint_test.py, I moved results_merger back to scripts/slave/recipe_modules/swarming/.. Luckily, I found a way to import only that module, and no longer need to guard the import in scripts/slave/recipe_modules/swarming/__init__.py
The CQ bit was checked by nednguyen@google.com
The patchset sent to the CQ was uploaded after l-g-t-m from kbr@chromium.org, eyaich@chromium.org, sergiyb@chromium.org Link to the patchset: https://codereview.chromium.org/2375663003/#ps340001 (title: "Move results_merger out of resources/ & remove resources/__init__.py")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: Build Presubmit on luci.infra.try (JOB_FAILED, https://luci-milo.appspot.com/swarming/task/31c7fafd6da82f10)
The CQ bit was checked by nednguyen@google.com
The patchset sent to the CQ was uploaded after l-g-t-m from kbr@chromium.org, eyaich@chromium.org, sergiyb@chromium.org Link to the patchset: https://codereview.chromium.org/2375663003/#ps360001 (title: "Add no cover")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Description was changed from ========== Add json test results format support for SwarmingIsolatedScriptTest BUG=649762 ========== to ========== Add json test results format support for SwarmingIsolatedScriptTest BUG=649762 Committed: https://chromium.googlesource.com/chromium/tools/build/+/7baad31fc4293485a824... ==========
Message was sent while issue was closed.
Committed patchset #17 (id:360001) as https://chromium.googlesource.com/chromium/tools/build/+/7baad31fc4293485a824...
Message was sent while issue was closed.
Patchset #18 (id:380001) has been deleted