Issue 1416763007: Add a recipe to identify culprits for chromium compile failures.

stgao

Description was changed from ========== Add findit recipe for compile failures and corresponding builders. BUG= ...

5 years, 1 month ago (2015-11-11 01:02:39 UTC) #1

stgao

Description was changed from ========== Add findit recipe to identify culprits for chromium compile failures ...

5 years, 1 month ago (2015-11-13 22:33:58 UTC) #4

stgao

Description was changed from ========== Add findit recipe to identify culprits for chromium compile failures ...

5 years, 1 month ago (2015-11-13 22:43:51 UTC) #5

stgao

stgao@chromium.org changed reviewers: + dpranke@chromium.org, iannucci@chromium.org

5 years, 1 month ago (2015-11-13 22:47:36 UTC) #6

stgao

Hi Robbie and Dirk, Would you mind a review of the recipe in this CL ...

5 years, 1 month ago (2015-11-13 22:47:36 UTC) #7

stgao

Description was changed from ========== Add a recipe to identify culprits for chromium compile failures ...

5 years, 1 month ago (2015-11-13 23:43:15 UTC) #8

stgao

Ping :) dpranke@, would you mind reviewing these two files? scripts/slave/recipe_modules/chromium_tests/api.py scripts/slave/recipe_modules/filter/api.py iannucci@, would you ...

5 years, 1 month ago (2015-11-17 18:40:25 UTC) #9

Dirk Pranke

Sorry for the delay, I've been swamped with branch-related stuff. I should be able to ...

5 years, 1 month ago (2015-11-17 18:58:41 UTC) #10

Dirk Pranke

https://codereview.chromium.org/1416763007/diff/40001/scripts/slave/recipe_modules/chromium_tests/api.py File scripts/slave/recipe_modules/chromium_tests/api.py (right): https://codereview.chromium.org/1416763007/diff/40001/scripts/slave/recipe_modules/chromium_tests/api.py#newcode481 scripts/slave/recipe_modules/chromium_tests/api.py:481: self.m.properties['buildername'] nit: wrap the expression in parentheses instead of ...

5 years, 1 month ago (2015-11-17 22:38:51 UTC) #11

iannucci

This lgtm :) https://codereview.chromium.org/1416763007/diff/40001/scripts/slave/recipe_modules/findit/api.py File scripts/slave/recipe_modules/findit/api.py (right): https://codereview.chromium.org/1416763007/diff/40001/scripts/slave/recipe_modules/findit/api.py#newcode20 scripts/slave/recipe_modules/findit/api.py:20: cwd = self.m.path['checkout'].join(repo_dir) is this if/else ...

5 years, 1 month ago (2015-11-18 03:57:54 UTC) #12

stgao

Description was changed from ========== Add a recipe to identify culprits for chromium compile failures ...

5 years, 1 month ago (2015-11-18 23:04:52 UTC) #13

Description was changed from

==========
Add a recipe to identify culprits for chromium compile failures on the chromium
main waterfall.

Input to the recipe:
 * (target_mastername, target_buildername): the waterfall builder where the
compile failure occurred.
 * (target_solution, target_solution_revisions): is the solution name of the
project ("src" for chromium) and the revisions in the regression range.
 * compile_targets: the failed compile targets that needs to recompile, like
browser_tests, a/b/c.o, etc.

Main flow in the recipe:
1. Configure to match (target_mastername, target_buildername)
2. For each revision in the regression range:
   * Checkout code at the revision
   * If compile_targets is not provided, use analyze to figure the targets that
are affected by the revision
   * Run compile if needed
3. Report results in Json output.

Example runs:
1. compile_targets is specified in build request: http://shortn/_JHfefo7eYy
2. compile_targets is Not specified in build request: http://shortn/_FM1douL9UG

BUG=545299
==========

to

==========
Add a recipe to identify culprits for chromium compile failures on the chromium
main waterfall.

Input to the recipe:
 * (target_mastername, target_buildername): the waterfall builder where the
compile failure occurred.
 * (target_solution, target_solution_revisions): is the solution name of the
project ("src" for chromium) and the revisions in the regression range.
 * compile_targets: the failed compile targets that needs to recompile, like
browser_tests, a/b/c.o, etc.

Main flow in the recipe:
1. Configure to match (target_mastername, target_buildername)
2. For each revision in the regression range:
   * Checkout code at the revision
   * If compile_targets is not provided, use analyze to figure the targets that
are affected by the revision
   * Run compile if needed
3. Report results in Json output.

Example runs:
1. compile_targets is specified in build request: http://shortn/_h77B3kMe6w
2. compile_targets is Not specified in build request: http://shortn/_FM1douL9UG

BUG=545299
==========

stgao

Description was changed from ========== Add a recipe to identify culprits for chromium compile failures ...

5 years, 1 month ago (2015-11-19 20:10:22 UTC) #14

Description was changed from

==========
Add a recipe to identify culprits for chromium compile failures on the chromium
main waterfall.

Input to the recipe:
 * (target_mastername, target_buildername): the waterfall builder where the
compile failure occurred.
 * (target_solution, target_solution_revisions): is the solution name of the
project ("src" for chromium) and the revisions in the regression range.
 * compile_targets: the failed compile targets that needs to recompile, like
browser_tests, a/b/c.o, etc.

Main flow in the recipe:
1. Configure to match (target_mastername, target_buildername)
2. For each revision in the regression range:
   * Checkout code at the revision
   * If compile_targets is not provided, use analyze to figure the targets that
are affected by the revision
   * Run compile if needed
3. Report results in Json output.

Example runs:
1. compile_targets is specified in build request: http://shortn/_h77B3kMe6w
2. compile_targets is Not specified in build request: http://shortn/_FM1douL9UG

BUG=545299
==========

to

==========
Add a recipe to identify culprits for chromium compile failures on the chromium
main waterfall.

Input to the recipe:
 * (target_mastername, target_buildername): the waterfall builder where the
compile failure occurred.
 * (target_solution, target_solution_revisions): is the solution name of the
project ("src" for chromium) and the revisions in the regression range.
 * compile_targets: the failed compile targets that needs to recompile, like
browser_tests, a/b/c.o, etc.

Main flow in the recipe:
1. Configure to match (target_mastername, target_buildername)
2. For each revision in the regression range:
   * Checkout code at the revision
   * If compile_targets is not provided, use analyze to figure the targets that
are affected by the revision
   * Run compile if needed
3. Report results in Json output.

Example runs:
1. compile_targets is specified in build request: http://shortn/_lKI7l8Y5VT
2. compile_targets is Not specified in build request: http://shortn/_xl3pUitjEx

BUG=545299
==========

stgao

Description was changed from ========== Add a recipe to identify culprits for chromium compile failures ...

5 years, 1 month ago (2015-11-19 20:10:40 UTC) #15

Description was changed from

==========
Add a recipe to identify culprits for chromium compile failures on the chromium
main waterfall.

Input to the recipe:
 * (target_mastername, target_buildername): the waterfall builder where the
compile failure occurred.
 * (target_solution, target_solution_revisions): is the solution name of the
project ("src" for chromium) and the revisions in the regression range.
 * compile_targets: the failed compile targets that needs to recompile, like
browser_tests, a/b/c.o, etc.

Main flow in the recipe:
1. Configure to match (target_mastername, target_buildername)
2. For each revision in the regression range:
   * Checkout code at the revision
   * If compile_targets is not provided, use analyze to figure the targets that
are affected by the revision
   * Run compile if needed
3. Report results in Json output.

Example runs:
1. compile_targets is specified in build request: http://shortn/_lKI7l8Y5VT
2. compile_targets is Not specified in build request: http://shortn/_xl3pUitjEx

BUG=545299
==========

to

==========
Add a recipe to identify culprits for chromium compile failures on the chromium
main waterfall.

Input to the recipe:
 * (target_mastername, target_buildername): the waterfall builder where the
compile failure occurred.
 * (target_solution, target_solution_revisions): is the solution name of the
project ("src" for chromium) and the revisions in the regression range.
 * compile_targets: the failed compile targets that needs to recompile, like
browser_tests, etc.

Main flow in the recipe:
1. Configure to match (target_mastername, target_buildername)
2. For each revision in the regression range:
   * Checkout code at the revision
   * If compile_targets is not provided, use analyze to figure the targets that
are affected by the revision
   * Run compile if needed
3. Report results in Json output.

Example runs:
1. compile_targets is specified in build request: http://shortn/_lKI7l8Y5VT
2. compile_targets is Not specified in build request: http://shortn/_xl3pUitjEx

BUG=545299
==========

stgao

Sorry for being a bit late on addressing comments. After some hacking to make old-version ...

5 years, 1 month ago (2015-11-19 22:06:34 UTC) #16

Sorry for being a bit late on addressing comments.

After some hacking to make old-version src-side MB to work with latest change to
build-side MB or analyze, this CL is tested out with an actual compile failure.
1. compile_targets is specified in build request: http://shortn/_lKI7l8Y5VT
2. compile_targets is Not specified in build request: http://shortn/_xl3pUitjEx

https://codereview.chromium.org/1416763007/diff/40001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/chromium_tests/api.py (right):

https://codereview.chromium.org/1416763007/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/chromium_tests/api.py:481:
self.m.properties['buildername']
On 2015/11/17 22:38:51, Dirk Pranke wrote:
> nit: wrap the expression in parentheses instead of using line continuations,
> e.g.:
> 
> target_mastername = (self.m... or
>                      self.m...)

Done.

https://codereview.chromium.org/1416763007/diff/40001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/findit/api.py (right):

https://codereview.chromium.org/1416763007/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/findit/api.py:20: cwd =
self.m.path['checkout'].join(repo_dir)
On 2015/11/18 03:57:53, iannucci wrote:
> is this if/else backwards?
> 
> Does it make sense to just get rid of the default argument for repo_dir?

Changed input parameter to "solution_name" and derive the repo dir from it.

https://codereview.chromium.org/1416763007/diff/40001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/findit/api.py:23: step_result = self.m.git('diff',
previous_revision, revision, '--name-only',
On 2015/11/18 03:57:54, iannucci wrote:
> I think 
> 
>   revision+"~1"
> 
> is possibly clearer and shorter than previous_revision... but maybe that's
just
> me :)

Done.

https://codereview.chromium.org/1416763007/diff/40001/scripts/slave/recipes/f...
File scripts/slave/recipes/findit/chromium/compile.py (right):

https://codereview.chromium.org/1416763007/diff/40001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:36: kind=List(basestring),
default=None, param_name='compile_targets',
On 2015/11/18 03:57:54, iannucci wrote:
> yay! real types!
> 
> Please note that in order for this to work with `run_recipe.py` or buildbucket
> that this will need to be an actual list of strings. So:
> 
>   ["list", "of", "strings"]
> 
> and not
> 
>   list,of,strings
> 
> But other than that this is awesome :)

Yes, it will be a list of strings in the build request through Buildbucket.

Another note here is that List(str) won't work, because a string in a list is
turned into a unicode string somehow...Not sure why and exactly where...

https://codereview.chromium.org/1416763007/diff/40001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:37: help='The failed compile
targets, eg: browser_tests, a/b.o'),
On 2015/11/17 22:38:51, Dirk Pranke wrote:
> nit: it's not clear what "a/b.o" is here; is that supposed to be an arbitrary
> object file? 
> 
> Are you planning to support bisecting into compile failures for single object
> files? that seems like it
> will be fragile.

Yes, it's supposed to be an arbitrary object file.
I removed it for now.

The idea is: if a source file failed to compile on the waterfall, recompiling it
only will speed up the culprit finding because much less compile units have to
be run.

However, I still have to confirm one thing: target "a" depends on target "b",
the source file a.cc included in target "a" depends on a header file b.h
generated in target "b", not sure whether compiling a.o for a.cc only will make
the whole target "b" to be compiled first and b.h is generated as expected.

Another possible issue is that: even if the source file a.cc passes compile, the
target "a" might still fail to compile, or other targets impacted by the commit
might also fail to compile. However, with the assumption that a compile failure
is caused by a single commit which is the most common case except conflicting
commits, this approach could still identify the culprit.
Is this the case you thought fragile?

https://codereview.chromium.org/1416763007/diff/40001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:57: })
On 2015/11/18 03:57:54, iannucci wrote:
> This is mutating `api.gclient.c`.
> 
> I think a more concise way to express this section would be:
> 
>   api.gclient.c.revisions[target_solution] = current_revision
> 
> which should work even if cfg.revisions is empty (if not, let me know)

Yes, it works!

https://codereview.chromium.org/1416763007/diff/40001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:80: requires_compile,
matching_exes, compile_targets = \
On 2015/11/17 22:38:51, Dirk Pranke wrote:
> You should update this logic to the latest version of chromium_tests.analyze:
> you can
> (and should) just look at the returned compile_targets list, and not need any
of
> lines 88-95.

Done.

Also refactored chromium_tests.get_compile_targets_and_tests into a new function
chromuim_tests.get_test_targets_compile_targets_and_tests to return test_targets
and compile_targets.

https://codereview.chromium.org/1416763007/diff/40001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:100: continue
On 2015/11/18 03:57:54, iannucci wrote:
> I would maybe pull this loop body into a function of current_revision, and
have
> it return the result. Then you can do the results.append in the loop, and the
> control flow will be a bit clearer (e.g. that results is only ever appended to
> once per loop).

Done.

https://codereview.chromium.org/1416763007/diff/40001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:125: json.dumps(results,
indent=2))
On 2015/11/18 03:57:54, iannucci wrote:
> I would consider summarizing the culprit (if we found one) on the step_text
too.

Good idea.
Done.

Dirk Pranke

dpranke@chromium.org changed reviewers: + phajdan.jr@chromium.org

5 years, 1 month ago (2015-11-24 00:17:40 UTC) #17

Dirk Pranke

I'd like Paweł to take a look at this as well since he's the best ...

5 years, 1 month ago (2015-11-24 00:17:41 UTC) #18

I'd like Paweł to take a look at this as well since he's the best OWNER for
chromium_tests/api.py.

I think the changes specific to analyze look fine. I have concerns about the
changes to get_test_targets_and_compile_targets(), as noted below.

On 2015/11/19 22:06:34, Shuotao(sick-slow) wrote:
> On 2015/11/17 22:38:51, Dirk Pranke wrote:
> > nit: it's not clear what "a/b.o" is here; is that supposed to be an
arbitrary
> > object file? 
> > 
> > Are you planning to support bisecting into compile failures for single
object
> > files? that seems like it
> > will be fragile.
> 
> Yes, it's supposed to be an arbitrary object file.
> I removed it for now.
> 
> The idea is: if a source file failed to compile on the waterfall, recompiling
it
> only will speed up the culprit finding because much less compile units have to
> be run.

Yup, understood.

> However, I still have to confirm one thing: target "a" depends on target "b",
> the source file a.cc included in target "a" depends on a header file b.h
> generated in target "b", not sure whether compiling a.o for a.cc only will
make
> the whole target "b" to be compiled first and b.h is generated as expected.

If you compile target "a", it will compile target "b" and generate the headers
as
needed. If you compile "a.o" it depends on the state of the ninja cache whether
it will work or not. I think if you have a clean build it won't work, but I'm
not 100% 
positive.

> 
> Another possible issue is that: even if the source file a.cc passes compile,
the
> target "a" might still fail to compile, or other targets impacted by the
commit
> might also fail to compile. However, with the assumption that a compile
failure
> is caused by a single commit which is the most common case except conflicting
> commits, this approach could still identify the culprit.

I do agree that for the most common case building individual files will probably
work,
but you may find cases where files get added or renamed or problematic changes
move from file to file where things will break.

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/findit/api.py (right):

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/findit/api.py:20: """Returns the files changed by
the given revision.
I would probably call this routine "files_changed_by_revision" (or maybe
files_changed_in_revision).

We don't generally want to call things "get_X" in chromium style, just "X", and
"changed" is probably 
a bit clearer than "affected" in this context.

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
File scripts/slave/recipes/findit/chromium/compile.py (right):

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:69:
api.chromium_tests.get_test_targets_compile_targets_and_tests(
why are you splitting test_targets and compile_targets apart, 
only to pass them back in to analyze?

in theory the build targets listed as 'additional_compile_targets'
are treated differently from the build targets listed as 
'test_targets' in the call to analyze, but in practice for your 
purposes, they'll mostly be the same.

So I'm not sure why you couldn't use the older version of 
get_compile_targets_and_tests() as-is?

If we really do need to change things, I think that whole 
method is getting crufty and confusing and we probably 
need to scrap it completely; it's trying to do too many 
things. I would be more inclined to have it just return 
the compile targets for the tests, and move the 
'compile_targets' logic out of the function.

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:89: assert compile_targets,
'compile targets should not be empty!'
nit: I'm not convinced this assert is really useful.

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:98: [], # Not run any test.
nit: I'd just say tests_including_triggered=[] instead

stgao

Many thanks Dirk for the review! I have a quick question as below, do you ...

5 years, 1 month ago (2015-11-24 06:15:29 UTC) #19

Many thanks Dirk for the review!
I have a quick question as below, do you mind a check?

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
File scripts/slave/recipes/findit/chromium/compile.py (right):

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:69:
api.chromium_tests.get_test_targets_compile_targets_and_tests(
On 2015/11/24 00:17:41, Dirk Pranke wrote:
> why are you splitting test_targets and compile_targets apart, 
> only to pass them back in to analyze?

Yes, to match the changed API in chromium_tests.analyze.

> 
> in theory the build targets listed as 'additional_compile_targets'
> are treated differently from the build targets listed as 
> 'test_targets' in the call to analyze, but in practice for your 
> purposes, they'll mostly be the same.
> 
> So I'm not sure why you couldn't use the older version of 
> get_compile_targets_and_tests() as-is?

I made the change, because I thought the following code doesn't match the api
description in filter.does_patch_require_compile.
IIUC, this is actually what you suggested, so I will go with it.

Before I change it back, I'd like to double check: to get the impacted compile
targets by the affected files, the code in this comment and the code in this
patchset are equivalent?

# `all_compile_targets` includes those specified as
# "additional_compile_targets" in the test spec.
all_compile_targets, _ = api.chromium_tests.get_compile_targets_and_tests(...) 
_, compile_targets = api.chromium_tests.analyze(
     affected_files,
     test_targets=[ ],
     additional_compile_targets=all_compile_targets,
     config_file_name='trybot_analyze_config.json')

> 
> If we really do need to change things, I think that whole 
> method is getting crufty and confusing and we probably 
> need to scrap it completely; it's trying to do too many 
> things. I would be more inclined to have it just return 
> the compile targets for the tests, and move the 
> 'compile_targets' logic out of the function.

Paweł Hajdan Jr.

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipe_modules/chromium_tests/api.py File scripts/slave/recipe_modules/chromium_tests/api.py (right): https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipe_modules/chromium_tests/api.py#newcode341 scripts/slave/recipe_modules/chromium_tests/api.py:341: return (sorted(test_compile_targets), sorted(compile_targets), Why do we need this change? ...

5 years ago (2015-11-24 13:20:41 UTC) #20

Dirk Pranke

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipe_modules/chromium_tests/api.py File scripts/slave/recipe_modules/chromium_tests/api.py (right): https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipe_modules/chromium_tests/api.py#newcode497 scripts/slave/recipe_modules/chromium_tests/api.py:497: target_mastername = (self.m.properties.get('target_mastername') or On 2015/11/24 13:20:41, Paweł Hajdan ...

5 years ago (2015-11-24 16:35:00 UTC) #21

stgao

Comments are addressed. Do you mind another review? Thanks :) https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipe_modules/chromium_tests/api.py File scripts/slave/recipe_modules/chromium_tests/api.py (right): https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipe_modules/chromium_tests/api.py#newcode341 ...

5 years ago (2015-11-24 23:19:20 UTC) #22

Comments are addressed. Do you mind another review?

Thanks :)

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/chromium_tests/api.py (right):

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/chromium_tests/api.py:341: return
(sorted(test_compile_targets), sorted(compile_targets),
On 2015/11/24 13:20:41, Paweł Hajdan Jr. wrote:
> Why do we need this change?

Was trying to match to new api filter.does_patch_require_compile (build targets
are split into test_targets and additional_compile_targets).

But as Dirk suggested in the recipe below, I reverted this change.

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/chromium_tests/api.py:497: target_mastername =
(self.m.properties.get('target_mastername') or
On 2015/11/24 16:35:00, Dirk Pranke wrote:
> On 2015/11/24 13:20:41, Paweł Hajdan Jr. wrote:
> > Let's make this explicit then, i.e. have run_mb_and_compile take mastername
> and
> > buildername parameters.
> 
> I actually told stgao to do things this way; because these are passed in as 
> build properties to arbitrary recipes, I don't think passing them in as 
> parameters actually helps anything, and I'd rather have the logic in one
place.
> 
> However, reading it again, I think I would add a comment that find it may
> need to override the defaults in order to be able to exactly match a given
bot.

Done.

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/filter/api.py (right):

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/filter/api.py:163: # We need to use the actual
mastername and buildername we're running on,
On 2015/11/24 13:20:41, Paweł Hajdan Jr. wrote:
> Could you extract this to a separate CL? Such changes are quite risky to land,
> so I'd like to minimize that risk and simplify analyzing the blamelist.

Done.

https://codereview.chromium.org/1474473004/

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipe_mo...
File scripts/slave/recipe_modules/findit/api.py (right):

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipe_mo...
scripts/slave/recipe_modules/findit/api.py:20: """Returns the files changed by
the given revision.
On 2015/11/24 00:17:41, Dirk Pranke wrote:
> I would probably call this routine "files_changed_by_revision" (or maybe
> files_changed_in_revision).
> 
> We don't generally want to call things "get_X" in chromium style, just "X",
and
> "changed" is probably 
> a bit clearer than "affected" in this context.

Done.

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
File scripts/slave/recipes/findit/chromium/compile.py (right):

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:29: 'target_solution':
Property(
On 2015/11/24 13:20:41, Paweł Hajdan Jr. wrote:
> Why is this needed?
> 
> The mastername and buildername should identify the solution, right?

This is very helpful, and it could save me one parameter.

But how could I identify the solution for waterfall continuous builder specified
by the given target_mastername and target_buildername? Eg: how to derive "src"
from ("chromium.linux", "Linux Builder")?

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:59: # If no compile targets are
provided, use analyze to filter out the
On 2015/11/24 13:20:41, Paweł Hajdan Jr. wrote:
> Why do you want to support the case where no compile targets are provided?

For compile failures on the waterfall, now there is no reliable way for a
program to tell which targets failed to compile. Thus, no compile targets is the
common case.

For compile and link failures, we are now trying to use regex to extract the
failed targets. However, this approach is impossible for failures of ACTIONs.

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:69:
api.chromium_tests.get_test_targets_compile_targets_and_tests(
On 2015/11/24 00:17:41, Dirk Pranke wrote:
> why are you splitting test_targets and compile_targets apart, 
> only to pass them back in to analyze?
> 
> in theory the build targets listed as 'additional_compile_targets'
> are treated differently from the build targets listed as 
> 'test_targets' in the call to analyze, but in practice for your 
> purposes, they'll mostly be the same.
> 
> So I'm not sure why you couldn't use the older version of 
> get_compile_targets_and_tests() as-is?
> 
> If we really do need to change things, I think that whole 
> method is getting crufty and confusing and we probably 
> need to scrap it completely; it's trying to do too many 
> things. I would be more inclined to have it just return 
> the compile targets for the tests, and move the 
> 'compile_targets' logic out of the function.

Done.

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:79: _, compile_targets =
api.chromium_tests.analyze(
On 2015/11/24 13:20:41, Paweł Hajdan Jr. wrote:
> Is analyze _really_ needed here? Why can't we rely on ninja to recompile
targets
> that need it?

We could do that, but it will be slow, because more targets have to be compiled.
With analyze, we could even skip compile at some commits.

The goal here is to speed up compile so that we could reduce the latency of
culprit finding.

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:89: assert compile_targets,
'compile targets should not be empty!'
On 2015/11/24 00:17:41, Dirk Pranke wrote:
> nit: I'm not convinced this assert is really useful.

Done.

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:98: [], # Not run any test.
On 2015/11/24 00:17:41, Dirk Pranke wrote:
> nit: I'd just say tests_including_triggered=[] instead

Done.

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:99:
override_bot_type='builder_tester')
On 2015/11/24 13:20:41, Paweł Hajdan Jr. wrote:
> Why is this needed?

Just want to make it explicit. I could remove it if that is the preferred way.
But it seems chromium_trybot.py does the same like this.

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:111: target_mastername,
target_buildername, override_bot_type='builder_tester')
On 2015/11/24 13:20:41, Paweł Hajdan Jr. wrote:
> Why is override_bot_type needed here?

Just want to make it explicit. I could remove it if that is the preferred way.
But it seems chromium_trybot.py does the same like this.

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:121: # Retry once only on infra
failure.
On 2015/11/24 13:20:41, Paweł Hajdan Jr. wrote:
> Is this retry really needed/useful here?

For compile failures on the waterfall, I saw a lot of flaky failures in August
and September
(https://docs.google.com/a/google.com/spreadsheets/d/1MxOZtL8JAQo2KW6kAouxvcd-...).
Thus, I added this retry to make the culprit-finding process more reliable.

However, I don't have data for trybots. But as the under-lying infra is the
same, I think the situation might be similar.

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:128: # TODO: if compile targets
are specified, compile may fail because those
On 2015/11/24 13:20:41, Paweł Hajdan Jr. wrote:
> Please follow the style guide's TODO(...) format.

Done.

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:133: step_result =
api.python.inline(
On 2015/11/24 13:20:41, Paweł Hajdan Jr. wrote:
> Why not use api.python.result_step (or succeeding/failing_step)?

I'm afraid I can't do that.

Because they don't support adding both step text and step log at the same time.
And as they don't return the step_result either, I can't set either step text or
step log like the code here.

stgao

Hi Paweł, Do you mind helping review both this CL and https://codereview.chromium.org/1474473004/? Tryjob-based culprit finding ...

5 years ago (2015-11-30 01:26:33 UTC) #23

Paweł Hajdan Jr.

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/findit/chromium/compile.py File scripts/slave/recipes/findit/chromium/compile.py (right): https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/findit/chromium/compile.py#newcode79 scripts/slave/recipes/findit/chromium/compile.py:79: _, compile_targets = api.chromium_tests.analyze( On 2015/11/24 at 23:19:20, Shuotao ...

5 years ago (2015-11-30 16:19:05 UTC) #24

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
File scripts/slave/recipes/findit/chromium/compile.py (right):

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:79: _, compile_targets =
api.chromium_tests.analyze(
On 2015/11/24 at 23:19:20, Shuotao wrote:
> On 2015/11/24 13:20:41, Paweł Hajdan Jr. wrote:
> > Is analyze _really_ needed here? Why can't we rely on ninja to recompile
targets
> > that need it?
> 
> We could do that, but it will be slow, because more targets have to be
compiled. With analyze, we could even skip compile at some commits.
> 
> The goal here is to speed up compile so that we could reduce the latency of
culprit finding.

What's the reasoning behind this? I'm not convinced. What is the scenario where
a target is not fresh according to ninja, but is not affected by given revision?

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:99:
override_bot_type='builder_tester')
On 2015/11/24 at 23:19:20, Shuotao wrote:
> On 2015/11/24 13:20:41, Paweł Hajdan Jr. wrote:
> > Why is this needed?
> 
> Just want to make it explicit. I could remove it if that is the preferred way.
> But it seems chromium_trybot.py does the same like this.

That's actually a behavior change. chromium_trybot does this because it merges
builder+tester pairs into single trybots.

What this recipe is doing is just compiling things. Is this because of avoiding
to archive the build artifacts?

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:121: # Retry once only on infra
failure.
On 2015/11/24 at 23:19:19, Shuotao wrote:
> On 2015/11/24 13:20:41, Paweł Hajdan Jr. wrote:
> > Is this retry really needed/useful here?
> 
> For compile failures on the waterfall, I saw a lot of flaky failures in August
and September
(https://docs.google.com/a/google.com/spreadsheets/d/1MxOZtL8JAQo2KW6kAouxvcd-...).
Thus, I added this retry to make the culprit-finding process more reliable.
> 
> However, I don't have data for trybots. But as the under-lying infra is the
same, I think the situation might be similar.

Can we start with a version of this recipe which doesn't have this logic, and
possibly add it later if needed?

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:133: step_result =
api.python.inline(
On 2015/11/24 at 23:19:20, Shuotao wrote:
> On 2015/11/24 13:20:41, Paweł Hajdan Jr. wrote:
> > Why not use api.python.result_step (or succeeding/failing_step)?
> 
> I'm afraid I can't do that.
> 
> Because they don't support adding both step text and step log at the same
time. And as they don't return the step_result either, I can't set either step
text or step log like the code here.

Could you at the very least add a TODO, and eventually make the python recipe
module support that?

stgao

Hi Paweł, I've addressed your comments based on our discussion offline. Thanks, Shuotao Gao https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/findit/chromium/compile.py ...

5 years ago (2015-12-01 16:30:01 UTC) #29

Hi Paweł,

I've addressed your comments based on our discussion offline.

Thanks,
Shuotao Gao

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
File scripts/slave/recipes/findit/chromium/compile.py (right):

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:79: _, compile_targets =
api.chromium_tests.analyze(
On 2015/11/30 16:19:05, Paweł Hajdan Jr. wrote:
> On 2015/11/24 at 23:19:20, Shuotao wrote:
> > On 2015/11/24 13:20:41, Paweł Hajdan Jr. wrote:
> > > Is analyze _really_ needed here? Why can't we rely on ninja to recompile
> targets
> > > that need it?
> > 
> > We could do that, but it will be slow, because more targets have to be
> compiled. With analyze, we could even skip compile at some commits.
> > 
> > The goal here is to speed up compile so that we could reduce the latency of
> culprit finding.
> 
> What's the reasoning behind this? I'm not convinced. What is the scenario
where
> a target is not fresh according to ninja, but is not affected by given
revision?

In this smoke test, build in http://shortn/_d3dv1OGpgs after a clobber with "rm
-rf src/out/Debug", only a few targets were affected by the first tested
revision "6beae8c75e3a0300ced4c95cb0463cfe2f6b7e6e".

As discussed offline, I will move this out to a separate CL. "analyze" is needed
to reduce the number of targets to be compile, before ninja could support
outputting the failed compile targets to a json file or the like yet.

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:99:
override_bot_type='builder_tester')
On 2015/11/30 16:19:05, Paweł Hajdan Jr. wrote:
> On 2015/11/24 at 23:19:20, Shuotao wrote:
> > On 2015/11/24 13:20:41, Paweł Hajdan Jr. wrote:
> > > Why is this needed?
> > 
> > Just want to make it explicit. I could remove it if that is the preferred
way.
> > But it seems chromium_trybot.py does the same like this.
> 
> That's actually a behavior change. chromium_trybot does this because it merges
> builder+tester pairs into single trybots.

Oh, I see. I thought it was the same as a "builder" bot type.
Removed all override.

> 
> What this recipe is doing is just compiling things. Is this because of
avoiding
> to archive the build artifacts?
Put up https://codereview.chromium.org/1484293002/ for this purpose instead.

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:121: # Retry once only on infra
failure.
On 2015/11/30 16:19:05, Paweł Hajdan Jr. wrote:
> On 2015/11/24 at 23:19:19, Shuotao wrote:
> > On 2015/11/24 13:20:41, Paweł Hajdan Jr. wrote:
> > > Is this retry really needed/useful here?
> > 
> > For compile failures on the waterfall, I saw a lot of flaky failures in
August
> and September
>
(https://docs.google.com/a/google.com/spreadsheets/d/1MxOZtL8JAQo2KW6kAouxvcd-...).
> Thus, I added this retry to make the culprit-finding process more reliable.
> > 
> > However, I don't have data for trybots. But as the under-lying infra is the
> same, I think the situation might be similar.
> 
> Can we start with a version of this recipe which doesn't have this logic, and
> possibly add it later if needed?

OK. Removed.

https://codereview.chromium.org/1416763007/diff/60001/scripts/slave/recipes/f...
scripts/slave/recipes/findit/chromium/compile.py:133: step_result =
api.python.inline(
On 2015/11/30 16:19:05, Paweł Hajdan Jr. wrote:
> On 2015/11/24 at 23:19:20, Shuotao wrote:
> > On 2015/11/24 13:20:41, Paweł Hajdan Jr. wrote:
> > > Why not use api.python.result_step (or succeeding/failing_step)?
> > 
> > I'm afraid I can't do that.
> > 
> > Because they don't support adding both step text and step log at the same
> time. And as they don't return the step_result either, I can't set either step
> text or step log like the code here.
> 
> Could you at the very least add a TODO, and eventually make the python recipe
> module support that?

Done.

stgao

Hi Paweł, I've uploaded a new patch per our discussion in the meeting. Do you ...

5 years ago (2015-12-01 17:15:02 UTC) #32

stgao

Description was changed from ========== Add a recipe to identify culprits for chromium compile failures ...

5 years ago (2015-12-01 18:45:41 UTC) #34

stgao

The patchset sent to the CQ was uploaded after l-g-t-m from iannucci@chromium.org, phajdan.jr@chromium.org Link to ...

5 years ago (2015-12-01 21:24:11 UTC) #36

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1416763007/260001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1416763007/260001

5 years ago (2015-12-01 21:24:17 UTC) #37

commit-bot: I haz the power

Description was changed from ========== Add a recipe to identify culprits for chromium compile failures ...

5 years ago (2015-12-01 21:29:43 UTC) #38

commit-bot: I haz the power

5 years ago (2015-12-01 21:29:44 UTC) #39

Message was sent while issue was closed.

Committed patchset #6 (id:260001) as
http://src.chromium.org/viewvc/chrome?view=rev&revision=297767

Issue 1416763007: Add a recipe to identify culprits for chromium compile failures. (Closed)

Description

Patch Set 1 #

Patch Set 2 : Rebase and Address comments. #

Patch Set 3 : Just rebase #

Patch Set 4 : Address comments. #

Patch Set 5 : Split "analyze" out. #

Patch Set 6 : Rebase, and add an owner file. #

Messages