|
|
Created:
6 years, 10 months ago by agable Modified:
5 years, 2 months ago CC:
chromium-reviews, Dirk Pranke, cmp-cc_chromium.org, M-A Ruel, iannucci+depot_tools_chromium.org, ilevy-cc_chromium.org Visibility:
Public. |
DescriptionCreate "git cache" command.
The git cache command is a central place to manage a machine's git cache.
It provides two subcommands:
* populate -- creates or updates the cache of a given repository
* exists -- prints the path to the cache of a repo, if it exists
Gclient, deps2git, bot_update, and any other tools that touch the cache will
be able to use this command to make sure that everyone is interacting with
the cache in the same way.
R=hinoka@chromium.org, iannucci@chromium.org
BUG=339168
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=253007
Patch Set 1 #
Total comments: 44
Patch Set 2 : Comments from hinoka@ #Patch Set 3 : Comments from iannucci@ #
Total comments: 1
Patch Set 4 : Manually write config file #Patch Set 5 : Reupload #
Total comments: 21
Patch Set 6 : Shallow fetches and other improvements #
Total comments: 20
Patch Set 7 : More comments. #
Total comments: 2
Patch Set 8 : Final comment #
Messages
Total messages: 18 (0 generated)
Some comments. Robbie: I don't really know the behavior of CMDpopulate as well so you should look into that. Don't worry as much about the lock file. We've bikeshedded on that already and determined that the race case (two gclient threads cloning from the same cache) might not actually be a real issue, and if it is it'll turn up red on the waterfall anyways and we can bikeshed about that later. https://codereview.chromium.org/164823002/diff/1/git_cache.py File git_cache.py (right): https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode26 git_cache.py:26: idx = url.find('://') szager mentioned this in a different CL, but lets use urlparse for url parsing/segmentation. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode36 git_cache.py:36: kwargs.setdefault('cwd', os.getcwd) os.getcwd() https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode46 git_cache.py:46: stdout.write( print >>stdout, "..." ? \n usage might be flaky on win https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode76 git_cache.py:76: f = os.fdopen(fd, 'w') with os.fdopen(fd, 'w') as f: ? https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode98 git_cache.py:98: raise LockError("failed to create %s" % self.path) s/failed/Failed/ Also might as well include e.errno in the message. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode108 git_cache.py:108: def steal(self): This isn't used anywhere? Need a CMDwipeoutallthelocks() for bothammer to call https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode140 git_cache.py:140: """Checks to see if there already is a cache of the given repo.""" Whats the purpose of this? It looks like it returns: 0 (OK): Path exists 1 (FAIL): Path does not exist, or you passed in the wrong arguments. If this is meant to be called externally, i'd prefer if it returns 0 for path exists/not exists, and vary stdout accordingly, rather than exit with a failure for not existing, otherwise its easy to conflate a "Path does not exist' signal with a "The script failed unexpectedly" signal. Just my opinion though, I realize most bash commands do the same thing. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode143 git_cache.py:143: DieWithError('git cache exists only takes exactly one repo url.') lets use parser.error() instead https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode157 git_cache.py:157: help='local repository to initialize from') nit: align at paren. Only align at 4 spaces if first line is empty, eg parse.add_option( '--local', ....) https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode160 git_cache.py:160: DieWithError('git cache populate only takes exactly one repo url.') lets use parser.error() instead https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode183 git_cache.py:183: with Lockfile(repo_dir): "with Lock" generally means "wait for this lock", but the code suggests that it fails if the lock is present. I think this should be (a): if Lockfile.is_locked(repo_dir): DieWithError('%s is locked by someone else' % repo_dir) (b): Have "with Lockfile" be blocking. (c): Add a comment saying its a non-blocking fail-fast lock and be done with it. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode226 git_cache.py:226: if options.cache_dir: The if/else doesn't need to be in the try block. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode232 git_cache.py:232: DieWithError('No cache directory specified on command line ' lets use parser.error() instead
cloning into the same cache*
https://codereview.chromium.org/164823002/diff/1/git_cache.py File git_cache.py (right): https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode26 git_cache.py:26: idx = url.find('://') On 2014/02/13 23:01:29, hinoka wrote: > szager mentioned this in a different CL, but lets use urlparse for url > parsing/segmentation. Done. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode36 git_cache.py:36: kwargs.setdefault('cwd', os.getcwd) On 2014/02/13 23:01:29, hinoka wrote: > os.getcwd() Done. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode76 git_cache.py:76: f = os.fdopen(fd, 'w') On 2014/02/13 23:01:29, hinoka wrote: > with os.fdopen(fd, 'w') as f: ? AFAIK only builtin open() is a context manager, os.fdopen isn't. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode98 git_cache.py:98: raise LockError("failed to create %s" % self.path) On 2014/02/13 23:01:29, hinoka wrote: > s/failed/Failed/ > Also might as well include e.errno in the message. Done. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode108 git_cache.py:108: def steal(self): On 2014/02/13 23:01:29, hinoka wrote: > This isn't used anywhere? Need a CMDwipeoutallthelocks() for bothammer to call Heh. Yeah, adding CMDmineallmine() in this patchset. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode140 git_cache.py:140: """Checks to see if there already is a cache of the given repo.""" On 2014/02/13 23:01:29, hinoka wrote: > Whats the purpose of this? It looks like it returns: > > 0 (OK): Path exists > 1 (FAIL): Path does not exist, or you passed in the wrong arguments. > > If this is meant to be called externally, i'd prefer if it returns 0 for path > exists/not exists, and vary stdout accordingly, rather than exit with a failure > for not existing, otherwise its easy to conflate a "Path does not exist' signal > with a "The script failed unexpectedly" signal. > > Just my opinion though, I realize most bash commands do the same thing. Its purpose is to test for existence, and print the path to the cache if it exists (so that other things can use this to look up cache locations and e.g. perform clones from there). Good point about the return code, it will always return 0 on success, even if that success is "doesn't exist". https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode143 git_cache.py:143: DieWithError('git cache exists only takes exactly one repo url.') On 2014/02/13 23:01:29, hinoka wrote: > lets use parser.error() instead Done. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode157 git_cache.py:157: help='local repository to initialize from') On 2014/02/13 23:01:29, hinoka wrote: > nit: align at paren. Only align at 4 spaces if first line is empty, eg > parse.add_option( > '--local', ....) Done. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode160 git_cache.py:160: DieWithError('git cache populate only takes exactly one repo url.') On 2014/02/13 23:01:29, hinoka wrote: > lets use parser.error() instead Done. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode183 git_cache.py:183: with Lockfile(repo_dir): On 2014/02/13 23:01:29, hinoka wrote: > "with Lock" generally means "wait for this lock", but the code suggests that it > fails if the lock is present. > > I think this should be > (a): > if Lockfile.is_locked(repo_dir): > DieWithError('%s is locked by someone else' % repo_dir) > > (b): > Have "with Lockfile" be blocking. > > (c): Add a comment saying its a non-blocking fail-fast lock and be done with it. Yeah, we already talked about this, but replying here for posterity: Agreed about the semantics. Would be better to have it nicely wait until the lock becomes available. But definitely don't want to have it always wait arbitrarily long, since sometimes there mere existence of a previous lock is an indicator of error. And finding a good default timeout for waiting for the lock is hard, since some repos take 2 seconds to clone and some take 2 hours. So I'm going to leave this as-is until Robbie pitches in and graces us with his wisdom. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode226 git_cache.py:226: if options.cache_dir: On 2014/02/13 23:01:29, hinoka wrote: > The if/else doesn't need to be in the try block. Well, it kinda does. I can move it out but then I have to predeclare global_cache_dir as None and add "and if global_cache_dir' to the if-clause. I think this is more logical: It's a 2x2 grid of global exists/not vs options exists/not. The global exists case is in the try, the global doesn't exist case is in the except. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode232 git_cache.py:232: DieWithError('No cache directory specified on command line ' On 2014/02/13 23:01:29, hinoka wrote: > lets use parser.error() instead Done.
https://codereview.chromium.org/164823002/diff/1/git_cache.py File git_cache.py (right): https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode62 git_cache.py:62: def _read_pid(self): This method only really makes sense when stealing. Otherwise: * You don't have the lock and you can't trust the value you get from this function * You DO have the lock, and so you know pid == self.pid https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode74 git_cache.py:74: open_mode = 0o644 Not sure what the value in having this be a extra variable is. It's pretty obviously an open mode. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode76 git_cache.py:76: f = os.fdopen(fd, 'w') On 2014/02/13 23:01:29, hinoka wrote: > with os.fdopen(fd, 'w') as f: ? and also print >> f, self.pid :) https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode123 git_cache.py:123: """Test if the file is locked by anyone.""" note, this is racy... P1: lock P2: is_locked? P1: stuff P1: unlock P2: do stuff assuming it's locked # <- :( The only safe way to tell if it's locked is if YOU have the lock (i.e. the only answers are "yes" and "maybe") https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode128 git_cache.py:128: return self.is_locked() and self.pid == self._read_pid() This should just be True iff make_lockfile didn't throw an exception. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode140 git_cache.py:140: """Checks to see if there already is a cache of the given repo.""" On 2014/02/13 23:01:29, hinoka wrote: > Whats the purpose of this? It looks like it returns: > > 0 (OK): Path exists > 1 (FAIL): Path does not exist, or you passed in the wrong arguments. > > If this is meant to be called externally, i'd prefer if it returns 0 for path > exists/not exists, and vary stdout accordingly, rather than exit with a failure > for not existing, otherwise its easy to conflate a "Path does not exist' signal > with a "The script failed unexpectedly" signal. > > Just my opinion though, I realize most bash commands do the same thing. I disagree a bit. I think: 0 exists, 1 DNE, -1 bad args https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode157 git_cache.py:157: help='local repository to initialize from') On 2014/02/13 23:01:29, hinoka wrote: > nit: align at paren. Only align at 4 spaces if first line is empty, eg > parse.add_option( > '--local', ....) should we auto-default to the current repo, if there is one? https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode183 git_cache.py:183: with Lockfile(repo_dir): On 2014/02/13 23:01:29, hinoka wrote: > "with Lock" generally means "wait for this lock", but the code suggests that it > fails if the lock is present. > > I think this should be > (a): > if Lockfile.is_locked(repo_dir): > DieWithError('%s is locked by someone else' % repo_dir) Racy > > (b): > Have "with Lockfile" be blocking. Maybe, but you'd need a very different style of file lock for this to work (http://linux.die.net/man/2/flock) > > (c): Add a comment saying its a non-blocking fail-fast lock and be done with it. Probably the best option for now. Eventually we should aim to move to B, because we can't guarantee that DEPS authors won't do silly things (unless we're relying on in-process locks to prevent this i.e. in gclient/deps2git/botupdate) https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode205 git_cache.py:205: # with repack and remove the linkage. may consider an optimization: * if the local repo only contains refs which are <= the mirror refs * add alternates file, nuke local objects/packfiles repack is a pretty pricey operation.
https://codereview.chromium.org/164823002/diff/1/git_cache.py File git_cache.py (right): https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode46 git_cache.py:46: stdout.write( On 2014/02/13 23:01:29, hinoka wrote: > print >>stdout, "..." ? > \n usage might be flaky on win Done. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode62 git_cache.py:62: def _read_pid(self): On 2014/02/14 01:43:08, iannucci wrote: > This method only really makes sense when stealing. Otherwise: > * You don't have the lock and you can't trust the value you get from this > function > * You DO have the lock, and so you know pid == self.pid I seriously considered having _read_pid() and steal() lock the lockfile. But decided that was a bit too meta and we didn't care enough. This method isn't meant to be 100% correct. It's used for informational purposes when stealing to print the pid of the old locking process (or eventually check if that process is still running, if windows cooperates). Leaving as is, with updated docstring. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode74 git_cache.py:74: open_mode = 0o644 On 2014/02/14 01:43:08, iannucci wrote: > Not sure what the value in having this be a extra variable is. It's pretty > obviously an open mode. Done. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode76 git_cache.py:76: f = os.fdopen(fd, 'w') On 2014/02/14 01:43:08, iannucci wrote: > On 2014/02/13 23:01:29, hinoka wrote: > > with os.fdopen(fd, 'w') as f: ? > > and also > print >> f, self.pid > > :) Done. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode123 git_cache.py:123: """Test if the file is locked by anyone.""" On 2014/02/14 01:43:08, iannucci wrote: > note, this is racy... > > P1: lock > P2: is_locked? > P1: stuff > P1: unlock > P2: do stuff assuming it's locked # <- :( > > The only safe way to tell if it's locked is if YOU have the lock (i.e. the only > answers are "yes" and "maybe") I think you mean "yes" and "ask me again later" :P But anyway, not really another way to do it. Noted in docstring. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode128 git_cache.py:128: return self.is_locked() and self.pid == self._read_pid() On 2014/02/14 01:43:08, iannucci wrote: > This should just be True iff make_lockfile didn't throw an exception. ...and no one else has stolen the lock. Checking the pid is important. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode140 git_cache.py:140: """Checks to see if there already is a cache of the given repo.""" On 2014/02/14 01:43:08, iannucci wrote: > On 2014/02/13 23:01:29, hinoka wrote: > > Whats the purpose of this? It looks like it returns: > > > > 0 (OK): Path exists > > 1 (FAIL): Path does not exist, or you passed in the wrong arguments. > > > > If this is meant to be called externally, i'd prefer if it returns 0 for path > > exists/not exists, and vary stdout accordingly, rather than exit with a > failure > > for not existing, otherwise its easy to conflate a "Path does not exist' > signal > > with a "The script failed unexpectedly" signal. > > > > Just my opinion though, I realize most bash commands do the same thing. > > I disagree a bit. I think: 0 exists, 1 DNE, -1 bad args parser.error() automatically generates retcode 2 for bad args (not -1). Upon further consideration, I'd like to keep non-existence as a non-zero return code so that client code can try: cache_path = subprocess.check_output(['git', 'cache', 'exists', 'foo']) except subprocess.CalledProcessError: # oh snap son, you done goofed https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode157 git_cache.py:157: help='local repository to initialize from') On 2014/02/14 01:43:08, iannucci wrote: > On 2014/02/13 23:01:29, hinoka wrote: > > nit: align at paren. Only align at 4 spaces if first line is empty, eg > > parse.add_option( > > '--local', ....) > > > should we auto-default to the current repo, if there is one? What does git clone do if --reference points to an object store that is completely irrelevant? I'd prefer to keep it explicit -- none of the 'git cache' commands so far have any dependency on being run from inside a git repo, so I'm not really in favor of adding a secret default. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode183 git_cache.py:183: with Lockfile(repo_dir): On 2014/02/14 01:43:08, iannucci wrote: > On 2014/02/13 23:01:29, hinoka wrote: > > "with Lock" generally means "wait for this lock", but the code suggests that > it > > fails if the lock is present. > > > > I think this should be > > (a): > > if Lockfile.is_locked(repo_dir): > > DieWithError('%s is locked by someone else' % repo_dir) > > Racy Yep. > > > > > (b): > > Have "with Lockfile" be blocking. > > Maybe, but you'd need a very different style of file lock for this to work > (http://linux.die.net/man/2/flock) > Yeah, I spent a while trying to do this in a way that would leverage flock on windows and some other api on windows. Turns out, a lot of other people have done that already. And they're all pretty disappointing implementations, and they all rely on win32 apis that we don't have installed on any of our windows systems. So that was a show stopper. > > > > (c): Add a comment saying its a non-blocking fail-fast lock and be done with > it. > > Probably the best option for now. Eventually we should aim to move to B, because > we can't guarantee that DEPS authors won't do silly things (unless we're relying > on in-process locks to prevent this i.e. in gclient/deps2git/botupdate) Doing this. https://codereview.chromium.org/164823002/diff/1/git_cache.py#newcode205 git_cache.py:205: # with repack and remove the linkage. On 2014/02/14 01:43:08, iannucci wrote: > may consider an optimization: > * if the local repo only contains refs which are <= the mirror refs > * add alternates file, nuke local objects/packfiles > > repack is a pretty pricey operation. I think not worth it. In addition, repack -ad is pretty much *the* canonical way to detach a repo from anywhere it has alternates pointing, so I'd rather stick to that.
https://codereview.chromium.org/164823002/diff/150001/git_cache.py File git_cache.py (right): https://codereview.chromium.org/164823002/diff/150001/git_cache.py#newcode183 git_cache.py:183: cmd = ['clone'] + v + ['-c', 'core.deltaBaseCacheLimit=2g', We talked about this yesterday, but we want to make sure the refspec is set correctly
Robbie, Ryan, Just uploaded a new patchset. Two relevant changes: a) We now manually init --bare and config remotes.origin the cache repo. This means that we now always fetch, and never clone. This means that the "update" and "we're hosed, start over" logic is much more similar. Yay. b) Because we now only fetch, not clone, we can't use --reference. I've removed all the logic relating to initializing the cache from an already-extant repository. It's a minor loss, but not one I'm worried about.
https://codereview.chromium.org/164823002/diff/260001/git_cache.py File git_cache.py (right): https://codereview.chromium.org/164823002/diff/260001/git_cache.py#newcode51 git_cache.py:51: gclient_utils.CheckCallAndFilter(['git'] + cmd, **kwargs) git.bat for windows, right? https://codereview.chromium.org/164823002/diff/260001/git_cache.py#newcode94 git_cache.py:94: Do. Or do not. There is no try. http://www.maniacworld.com/dog-yoda-halloween-costume.jpg https://codereview.chromium.org/164823002/diff/260001/git_cache.py#newcode161 git_cache.py:161: options, args = parser.parse_args(args) Add a todo about a shallow clone option? https://codereview.chromium.org/164823002/diff/260001/git_cache.py#newcode177 git_cache.py:177: if not os.path.exists(os.path.join(repo_dir, 'config')): Does this imply that all repos that have a "config" file is good and uncorrupted? https://codereview.chromium.org/164823002/diff/260001/git_cache.py#newcode185 git_cache.py:185: '+refs/heads/*:refs/heads/*'], Lets also ensure that these lines are in the config, so that we can recover from cache_dirs who have configs but don't have a correct refspec. Would be useful for when we want to add a new refspec https://codereview.chromium.org/164823002/diff/260001/git_cache.py#newcode200 git_cache.py:200: parser.add_option('--force', '-f', action='store_true', nit: Short options before long options https://codereview.chromium.org/164823002/diff/260001/git_cache.py#newcode201 git_cache.py:201: help='actually perform the action') nit: Capitalize first word, add period. https://codereview.chromium.org/164823002/diff/260001/git_cache.py#newcode220 git_cache.py:220: '%s' % lockfiles) ', '.join(lockfiles) would look nicer https://codereview.chromium.org/164823002/diff/260001/git_cache.py#newcode221 git_cache.py:221: return parser.error()? I can't think of any situations where if we run this as a script, we wouldn't want to just fail right here. https://codereview.chromium.org/164823002/diff/260001/git_cache.py#newcode223 git_cache.py:223: unlocked = untouched = [] I don't think this is what you want :P >>> a = b = [] >>> a.append('asdf') >>> a ['asdf'] >>> b ['asdf'] https://codereview.chromium.org/164823002/diff/260001/git_cache.py#newcode251 git_cache.py:251: ['git', 'config', '--global', 'cache.cachepath']).strip() git.bat for windows.
https://codereview.chromium.org/164823002/diff/310001/git-cache File git-cache (right): https://codereview.chromium.org/164823002/diff/310001/git-cache#newcode7 git-cache:7: # number of a commit. this probably should be updated/deleted https://codereview.chromium.org/164823002/diff/310001/git_cache.py File git_cache.py (right): https://codereview.chromium.org/164823002/diff/310001/git_cache.py#newcode36 git_cache.py:36: return norm_url.replace('-', '--').replace('/', '-') do either of these need to do case normalization? https://codereview.chromium.org/164823002/diff/310001/git_cache.py#newcode47 git_cache.py:47: env.setdefault('SSH_ASKPASS', 'true') hmm... copypasta magic? Should this go in gclient_utils? https://codereview.chromium.org/164823002/diff/310001/git_cache.py#newcode165 git_cache.py:165: help='only cache 10000 commits of history') Not sure I see the need for both --shallow and --depth. https://codereview.chromium.org/164823002/diff/310001/git_cache.py#newcode174 git_cache.py:174: parser.error('--depth only takes integer arguments') http://docs.python.org/2/library/optparse.html#optparse.Option.type ? https://codereview.chromium.org/164823002/diff/310001/git_cache.py#newcode183 git_cache.py:183: filter_fn = lambda l: '[up to date]' not in l does filter_fn return the line to print, or does it return a boolean to indicate if it's supposed to print? Also shouldn't this go in gclient_utils? https://codereview.chromium.org/164823002/diff/310001/git_cache.py#newcode214 git_cache.py:214: fetch_cmd = ['fetch'] + v + d + ['--update-shallow', '--tags', 'origin'] update-shallow ? Where are the docs on that for git 1.8? https://codereview.chromium.org/164823002/diff/310001/git_cache.py#newcode220 git_cache.py:220: logging.warn('Shallow fetch requested, but repo cache already exists.') We should only print this if the existing repo isn't shallow https://codereview.chromium.org/164823002/diff/310001/git_cache.py#newcode254 git_cache.py:254: if lf.break_lock(): should print error if the lock exists and is owned by the wrong pid
Responded to all comments. Also pep257'd all docstrings. https://codereview.chromium.org/164823002/diff/260001/git_cache.py File git_cache.py (right): https://codereview.chromium.org/164823002/diff/260001/git_cache.py#newcode51 git_cache.py:51: gclient_utils.CheckCallAndFilter(['git'] + cmd, **kwargs) On 2014/02/20 20:30:56, Ryan T. wrote: > git.bat for windows, right? Done. https://codereview.chromium.org/164823002/diff/260001/git_cache.py#newcode161 git_cache.py:161: options, args = parser.parse_args(args) On 2014/02/20 20:30:56, Ryan T. wrote: > Add a todo about a shallow clone option? Shallow clones are done in the next patchset. https://codereview.chromium.org/164823002/diff/260001/git_cache.py#newcode177 git_cache.py:177: if not os.path.exists(os.path.join(repo_dir, 'config')): On 2014/02/20 20:30:56, Ryan T. wrote: > Does this imply that all repos that have a "config" file is good and > uncorrupted? It implies that all repos that have a "config" file are in a recoverable state. https://codereview.chromium.org/164823002/diff/260001/git_cache.py#newcode185 git_cache.py:185: '+refs/heads/*:refs/heads/*'], On 2014/02/20 20:30:56, Ryan T. wrote: > Lets also ensure that these lines are in the config, so that we can recover from > cache_dirs who have configs but don't have a correct refspec. Would be useful > for when we want to add a new refspec Also done in the next patchset -- we rewrite the config no matter what. https://codereview.chromium.org/164823002/diff/260001/git_cache.py#newcode200 git_cache.py:200: parser.add_option('--force', '-f', action='store_true', On 2014/02/20 20:30:56, Ryan T. wrote: > nit: Short options before long options But the options.foo attribute is named after the long option... https://codereview.chromium.org/164823002/diff/260001/git_cache.py#newcode201 git_cache.py:201: help='actually perform the action') On 2014/02/20 20:30:56, Ryan T. wrote: > nit: Capitalize first word, add period. Following the example of 'git cl' and git itself here. Settled on capitals but no periods. https://codereview.chromium.org/164823002/diff/260001/git_cache.py#newcode220 git_cache.py:220: '%s' % lockfiles) On 2014/02/20 20:30:56, Ryan T. wrote: > ', '.join(lockfiles) would look nicer Done. https://codereview.chromium.org/164823002/diff/260001/git_cache.py#newcode221 git_cache.py:221: return On 2014/02/20 20:30:56, Ryan T. wrote: > parser.error()? I can't think of any situations where if we run this as a > script, we wouldn't want to just fail right here. Good point, "git clean" without -f returns 128. https://codereview.chromium.org/164823002/diff/260001/git_cache.py#newcode223 git_cache.py:223: unlocked = untouched = [] On 2014/02/20 20:30:56, Ryan T. wrote: > I don't think this is what you want :P > > >>> a = b = [] > >>> a.append('asdf') > >>> a > ['asdf'] > >>> b > ['asdf'] Done. https://codereview.chromium.org/164823002/diff/260001/git_cache.py#newcode251 git_cache.py:251: ['git', 'config', '--global', 'cache.cachepath']).strip() On 2014/02/20 20:30:56, Ryan T. wrote: > git.bat for windows. Done. https://codereview.chromium.org/164823002/diff/310001/git-cache File git-cache (right): https://codereview.chromium.org/164823002/diff/310001/git-cache#newcode7 git-cache:7: # number of a commit. On 2014/02/21 11:32:07, iannucci wrote: > this probably should be updated/deleted Whoops. Done. https://codereview.chromium.org/164823002/diff/310001/git_cache.py File git_cache.py (right): https://codereview.chromium.org/164823002/diff/310001/git_cache.py#newcode36 git_cache.py:36: return norm_url.replace('-', '--').replace('/', '-') On 2014/02/21 11:32:07, iannucci wrote: > do either of these need to do case normalization? Url no, just in case someone is dumb; cache_dir, yes. https://codereview.chromium.org/164823002/diff/310001/git_cache.py#newcode47 git_cache.py:47: env.setdefault('SSH_ASKPASS', 'true') On 2014/02/21 11:32:07, iannucci wrote: > hmm... copypasta magic? Should this go in gclient_utils? It's a tiny bit incompatible with the version of RunGit in gclient_scm, so it wouldn't actually be used in multiple places. I'd rather keep it here for now. https://codereview.chromium.org/164823002/diff/310001/git_cache.py#newcode165 git_cache.py:165: help='only cache 10000 commits of history') On 2014/02/21 11:32:07, iannucci wrote: > Not sure I see the need for both --shallow and --depth. I thought it would be nice to have a simple default, and -d with no argument doesn't make sense. Don't really care either way though. Can remove if you like. https://codereview.chromium.org/164823002/diff/310001/git_cache.py#newcode174 git_cache.py:174: parser.error('--depth only takes integer arguments') On 2014/02/21 11:32:07, iannucci wrote: > http://docs.python.org/2/library/optparse.html#optparse.Option.type > > ? Damn, I knew that existed, but couldn't find it. Thanks. https://codereview.chromium.org/164823002/diff/310001/git_cache.py#newcode183 git_cache.py:183: filter_fn = lambda l: '[up to date]' not in l On 2014/02/21 11:32:07, iannucci wrote: > does filter_fn return the line to print, or does it return a boolean to indicate > if it's supposed to print? > > Also shouldn't this go in gclient_utils? This miniature filter function (which just does true/false) gets morphed in to the gclient_utils.GitFilter object which turns it into a fully-fledged filter function that either prints or doesn't print each line. I think this should remain here -- ideally the similar version in gclient_scm will go away when it switches to using this implementation instead of doing everything itself. https://codereview.chromium.org/164823002/diff/310001/git_cache.py#newcode214 git_cache.py:214: fetch_cmd = ['fetch'] + v + d + ['--update-shallow', '--tags', 'origin'] On 2014/02/21 11:32:07, iannucci wrote: > update-shallow ? Where are the docs on that for git 1.8? Damn, looks like I was reading 1.9 docs. And also running 1.9 on my machine. Removed. https://codereview.chromium.org/164823002/diff/310001/git_cache.py#newcode220 git_cache.py:220: logging.warn('Shallow fetch requested, but repo cache already exists.') On 2014/02/21 11:32:07, iannucci wrote: > We should only print this if the existing repo isn't shallow Done. https://codereview.chromium.org/164823002/diff/310001/git_cache.py#newcode254 git_cache.py:254: if lf.break_lock(): On 2014/02/21 11:32:07, iannucci wrote: > should print error if the lock exists and is owned by the wrong pid Define 'wrong pid'. Every time 'git cache X' runs, it will have a different pid. It is impossible for a call to 'git cache unlock' to break locks that were created by itself.
Ping. This really needs to go in.
lgtm
lgtm https://codereview.chromium.org/164823002/diff/310001/git_cache.py File git_cache.py (right): https://codereview.chromium.org/164823002/diff/310001/git_cache.py#newcode254 git_cache.py:254: if lf.break_lock(): On 2014/02/21 19:33:44, agable wrote: > On 2014/02/21 11:32:07, iannucci wrote: > > should print error if the lock exists and is owned by the wrong pid > > Define 'wrong pid'. Every time 'git cache X' runs, it will have a different pid. > It is impossible for a call to 'git cache unlock' to break locks that were > created by itself. I guess I meant, can we print a warning/error if the pid in the file corresponds to a process which is still running? https://codereview.chromium.org/164823002/diff/360001/git_cache.py File git_cache.py (right): https://codereview.chromium.org/164823002/diff/360001/git_cache.py#newcode165 git_cache.py:165: """Bare clone or update a repository in the cache.""" I actually think this reveals too much implementation. "Ensures that the cache has all up-to-date objects for the given repo" ?
The CQ bit was checked by agable@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/agable@chromium.org/164823002/440001
https://codereview.chromium.org/164823002/diff/310001/git_cache.py File git_cache.py (right): https://codereview.chromium.org/164823002/diff/310001/git_cache.py#newcode254 git_cache.py:254: if lf.break_lock(): > I guess I meant, can we print a warning/error if the pid in the file corresponds > to a process which is still running? Ah. No. Not on windows, especially. https://codereview.chromium.org/164823002/diff/360001/git_cache.py File git_cache.py (right): https://codereview.chromium.org/164823002/diff/360001/git_cache.py#newcode165 git_cache.py:165: """Bare clone or update a repository in the cache.""" On 2014/02/24 19:43:03, iannucci wrote: > I actually think this reveals too much implementation. > > "Ensures that the cache has all up-to-date objects for the given repo" > > ? Done.
Message was sent while issue was closed.
Change committed as 253007 |