|
|
Chromium Code Reviews|
Created:
3 years, 9 months ago by nodir Modified:
3 years, 9 months ago CC:
chromium-reviews, infra-reviews+recipes-py_chromium.org Target Ref:
refs/heads/master Project:
recipe_engine Visibility:
Public. |
DescriptionFix GitBackend.checkout
GitBackend.checkout is problematic:
- if allow_fetch=False and revision is not a commit, e.g. a branch, it
checkouts whatever happens to be named locally without fetching, possibly stale.
This is inconsistent with GitilesBackend which refuses to work if fetching
isn't allowed.
RemoteRunFactory worked only because it used a new workdir each time.
- it does git-reset, potentionally moving a branch to a non-TOT commit, so next
attempt to checkout a branch with allow_fetch=False will checkout that commit,
not local TOT.
- if a git repository already exists, does not check whether origin.url
matches the requested one. This works only if revision is a commit.
- does not fail on revision "origin/master", which is an invalid remote
revision because origin is a locally configured remote.
As a result other parts of the system pass this value already.
Fix all those problems.
R=iannucci@chromium.org, phajdan.jr@chromium.org
BUG=696100
Patch Set 1 #Patch Set 2 : Fix GitBackend.checkout #
Total comments: 6
Patch Set 3 : refs/heads/master #Patch Set 4 : bug #
Total comments: 4
Patch Set 5 : bug and assert #Patch Set 6 : tests #
Total comments: 3
Messages
Total messages: 26 (9 generated)
PTAL
Description was changed from ========== Fix GitBackend.checkout GitBackend.checkout is problematic: - if allow_fetch=False and revision is not a commit, e.g. a branch, it checkouts whatever happens to be named locally, possibly stale. This is inconsistent with GitilesBackend which refuses to work if fetching isn't allowed. - it does git-reset, potentionally moving a branch to a non-TOT commit, so next attempt to checkout a branch with allow_fetch=False will checkout that commit, not local TOT. - if a git repository already exists, does not check whether origin.url matches the requested one. This works only if revision is a commit. - does not fail on revision "origin/master", which is an invalid remote revision because origin is a locally configured remote. As a result other parts of the system pass this value already. Fix all those problems. R=iannucci@chromium.org, phajdan.jr@chromium.org BUG=696100 ========== to ========== Fix GitBackend.checkout GitBackend.checkout is problematic: - if allow_fetch=False and revision is not a commit, e.g. a branch, it checkouts whatever happens to be named locally without fetching, possibly stale. This is inconsistent with GitilesBackend which refuses to work if fetching isn't allowed. RemoteRunFactory worked only because it used a new workdir each time. - it does git-reset, potentionally moving a branch to a non-TOT commit, so next attempt to checkout a branch with allow_fetch=False will checkout that commit, not local TOT. - if a git repository already exists, does not check whether origin.url matches the requested one. This works only if revision is a commit. - does not fail on revision "origin/master", which is an invalid remote revision because origin is a locally configured remote. As a result other parts of the system pass this value already. Fix all those problems. R=iannucci@chromium.org, phajdan.jr@chromium.org BUG=696100 ==========
nodir@chromium.org changed reviewers: + dnj@chromium.org
+dnj
looks better https://codereview.chromium.org/2720853002/diff/20001/recipe_engine/fetch.py File recipe_engine/fetch.py (right): https://codereview.chromium.org/2720853002/diff/20001/recipe_engine/fetch.py#... recipe_engine/fetch.py:138: # TODO(nodir): raise an exception this is how gclient and bot_update happened. Could we instead: * change all remote_run's to pass the correct ref (and while we're at it, shouldn't is actually be refs/heads/master?) * change this to be an exception https://codereview.chromium.org/2720853002/diff/20001/recipe_engine/fetch.py#... recipe_engine/fetch.py:141: is_commit = re.match('^[a-z0-9]{40}$', revision) this is why git changing off SHA1 will be hard :) https://codereview.chromium.org/2720853002/diff/20001/recipe_engine/fetch.py#... recipe_engine/fetch.py:180: git('fetch', repo_url, 'master' if is_commit else revision) refs/heads/master
https://codereview.chromium.org/2720853002/diff/20001/recipe_engine/fetch.py File recipe_engine/fetch.py (right): https://codereview.chromium.org/2720853002/diff/20001/recipe_engine/fetch.py#... recipe_engine/fetch.py:138: # TODO(nodir): raise an exception On 2017/02/27 20:23:44, iannucci wrote: > this is how gclient and bot_update happened. Could we instead: > > * change all remote_run's to pass the correct ref (and while we're at it, > shouldn't is actually be refs/heads/master?) This is https://chromium-review.googlesource.com/c/446895/ it requires master restarts I don't want this CL to be blocked on restarts of all masters that have RemoteRunFactory I've added a bug Yes, the caller is better to pass "refs/heads/master", although both work. > * change this to be an exception https://codereview.chromium.org/2720853002/diff/20001/recipe_engine/fetch.py#... recipe_engine/fetch.py:141: is_commit = re.match('^[a-z0-9]{40}$', revision) On 2017/02/27 20:23:44, iannucci wrote: > this is why git changing off SHA1 will be hard :) yeah, i thought of this too :) https://codereview.chromium.org/2720853002/diff/20001/recipe_engine/fetch.py#... recipe_engine/fetch.py:180: git('fetch', repo_url, 'master' if is_commit else revision) On 2017/02/27 20:23:44, iannucci wrote: > refs/heads/master Done.
lgtm, though I would strongly consider just doing the restarts and then making this an error tomorrow. You're going to have to do the restarts and fix that bug anyway, might as well be today. https://codereview.chromium.org/2720853002/diff/60001/recipe_engine/fetch.py File recipe_engine/fetch.py (right): https://codereview.chromium.org/2720853002/diff/60001/recipe_engine/fetch.py#... recipe_engine/fetch.py:138: # TODO(nodir): raise an exception http://crbug.com/696704 Let's use syntax `BUG(http://crbug.com/696704): ...` and also point at this file in that bug too. We really need a way to keep track of those cleanup bugs :(. I feel like they just all get lost now. Honestly, restarting all the masters using remote_run isn't that bad and could probably be done tonight. I would also make sure, when making this an exception, to assert that revision starts with "refs/". These are the only refs guaranteed by git to be absolute (basically "refs/" is the same thing as starting a path with "/" in posix). All other refs try to pick the 'longest match' suffix out of the refs that the git server knows about. It's stupid. https://codereview.chromium.org/2720853002/diff/60001/recipe_engine/fetch.py#... recipe_engine/fetch.py:158: logging.warning('Revision %s is not available: %s', revision, e) Revision %s needs to be fetched: %s So that way the logs aren't confusing (it IS available... just not locally).
talked offline, nodir pinkie-swears[1] that he'll clean up the hack in the next week or so. [1]: https://en.wikipedia.org/wiki/Pinky_swear
https://codereview.chromium.org/2720853002/diff/60001/recipe_engine/fetch.py File recipe_engine/fetch.py (right): https://codereview.chromium.org/2720853002/diff/60001/recipe_engine/fetch.py#... recipe_engine/fetch.py:138: # TODO(nodir): raise an exception http://crbug.com/696704 On 2017/02/27 20:42:03, iannucci wrote: > Let's use syntax `BUG(http://crbug.com/696704): ...` and also point at this file > in that bug too. Done > > We really need a way to keep track of those cleanup bugs :(. I feel like they > just all get lost now. > Honestly, restarting all the masters using remote_run isn't that bad and could > probably be done tonight. I need to get other work done today and my work is blocked on this. Sorry, I cannot wait for master restart. > > I would also make sure, when making this an exception, to assert that revision > starts with "refs/". These are the only refs guaranteed by git to be absolute > (basically "refs/" is the same thing as starting a path with "/" in posix). All > other refs try to pick the 'longest match' suffix out of the refs that the git > server knows about. It's stupid. Done https://codereview.chromium.org/2720853002/diff/60001/recipe_engine/fetch.py#... recipe_engine/fetch.py:158: logging.warning('Revision %s is not available: %s', revision, e) On 2017/02/27 20:42:03, iannucci wrote: > Revision %s needs to be fetched: %s > > So that way the logs aren't confusing (it IS available... just not locally). done
The CQ bit was checked by nodir@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from iannucci@chromium.org Link to the patchset: https://codereview.chromium.org/2720853002/#ps80001 (title: "bug and assert")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: Recipes-py Presubmit on luci.infra.try (JOB_FAILED, https://luci-milo.appspot.com/swarming/task/349a477a0f447e10)
PTAL again These changes are scary because this stuff looks fragile.
The CQ bit was checked by nodir@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: Recipes-py Presubmit on luci.infra.try (JOB_FAILED, https://luci-milo.appspot.com/swarming/task/349ab923541cab10)
On 2017/02/27 23:27:26, nodir wrote: > PTAL again > > These changes are scary because this stuff looks fragile. It looks like presubmit is still grumpy?
On 2017/02/28 00:07:16, iannucci wrote: > On 2017/02/27 23:27:26, nodir wrote: > > PTAL again > > > > These changes are scary because this stuff looks fragile. > > It looks like presubmit is still grumpy? (but new changes lgtm)
Thanks for tackling this. Looks like a good robustness improvement. I've noticed your comment about how these changes are scary and code fragile - could you elaborate what could we do to improve that? FWIW the changes look fine to me, but the trybots are still failing. If you're in a rush, feel free to fix and land. Also feel free to ping me on IM for re-review. https://codereview.chromium.org/2720853002/diff/90001/recipe_engine/fetch.py File recipe_engine/fetch.py (right): https://codereview.chromium.org/2720853002/diff/90001/recipe_engine/fetch.py#... recipe_engine/fetch.py:136: revision = 'refs/heads/master' Consider a logging.warning call in this case. https://codereview.chromium.org/2720853002/diff/90001/recipe_engine/fetch.py#... recipe_engine/fetch.py:172: # Typically we cannot fetch a commit, so we assume that remote master branch recipes.cfg has a "branch" field, so that assumption might not be true. How about we fetch everything, or at the very least add a TODO for even more precise handling of branches? https://codereview.chromium.org/2720853002/diff/90001/recipe_engine/package.py File recipe_engine/package.py (right): https://codereview.chromium.org/2720853002/diff/90001/recipe_engine/package.p... recipe_engine/package.py:265: self.run_git(context, 'fetch', self.repo, 'refs/heads/master') Similarly I'm worried about assumptions r1 and r2 are on the master branch. Should there be at least a TODO to remove that assumption?
Message was sent while issue was closed.
On 2017/02/28 08:44:33, Paweł Hajdan Jr. wrote: > Thanks for tackling this. Looks like a good robustness improvement. > > I've noticed your comment about how these changes are scary and code fragile - > could you elaborate what could we do to improve that? I use the word fragile because I change one part of the codebase and another part starts failing, although there is no obvious reason why it should be - there are implicit assumptions all over the code base. Other parts of the codebase assume: - that origin is configured to a certain URL, because there is `git fetch` call - that local branches exist and are tracking remote branches, because there is `git fetch` call - that origin/master has a certain state, because they do aHash..origin/master What we can could do to improve that? Write functions so that do not have implicit assumptions of undocumented behavior of other functions. Treat functions as separate entities that must have a documented protocol between them. A function must not rely on implementation details of another function: origin and local branches are implementation details of "make it so that the directory contains files of a certain revision of a certain repo". I've spent a day looking at this code, trying to find all such links between the code I'm changing; trying to find what a function meant to do (as opposed to what it does according to code), and discovering things I am find wrong. I still didn't came up with a diff that I am confident in. This takes too much of my energy and I feel that I get zero my actual work done. I am not willing to take the responsibility for failing builders because this code is fragile. I am abandoning this CL and will fix my problem (the reason I started to modify this code base) in another way, a way that I am much more confident in. Feel free to take ownership of these fixes.
Message was sent while issue was closed.
On 2017/02/28 15:13:26, nodir wrote: > On 2017/02/28 08:44:33, Paweł Hajdan Jr. wrote: > > Thanks for tackling this. Looks like a good robustness improvement. > > > > I've noticed your comment about how these changes are scary and code fragile - > > could you elaborate what could we do to improve that? > > I use the word fragile because I change one part of the codebase and another > part starts failing, although there is no obvious reason why it should be - > there are implicit assumptions all over the code base. Other parts of the > codebase assume: > - that origin is configured to a certain URL, because there is `git fetch` call > - that local branches exist and are tracking remote branches, because there is > `git fetch` call > - that origin/master has a certain state, because they do aHash..origin/master > > What we can could do to improve that? Write functions so that do not have > implicit assumptions of undocumented behavior of other functions. Treat > functions as separate entities that must have a documented protocol between > them. A function must not rely on implementation details of another function: > origin and local branches are implementation details of "make it so that the > directory contains files of a certain revision of a certain repo". > > I've spent a day looking at this code, trying to find all such links between the > code I'm changing; trying to find what a function meant to do (as opposed to > what it does according to code), and discovering things I am find wrong. I still > didn't came up with a diff that I am confident in. This takes too much of my > energy and I feel that I get zero my actual work done. I am not willing to take > the responsibility for failing builders because this code is fragile. > > I am abandoning this CL and will fix my problem (the reason I started to modify > this code base) in another way, a way that I am much more confident in. Feel > free to take ownership of these fixes. to be clear, there are cascading changes in other parts of the codebase that are needed from my point of view (to remove assumptions) - and that's the work I decided to stop doing.
Message was sent while issue was closed.
Wait, I'm not sure why this is being abandoned? You already put work in to make the code work better... is this not landable after fixing the tests?
Message was sent while issue was closed.
On 2017/02/28 18:04:06, iannucci wrote: > Wait, I'm not sure why this is being abandoned? You already put work in to make > the code work better... is this > not landable after fixing the tests? tests fail because of more assumptions that I've discovered that I don't have cycles to fix |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
