|
|
Created:
7 years, 2 months ago by iannucci Modified:
7 years ago CC:
chromium-reviews, Dirk Pranke, cmp-cc_chromium.org, iannucci+depot_tools_chromium.org, ilevy-cc_chromium.org Visibility:
Public. |
DescriptionAdd git-number script to calculate generation numbers for commits.
Compatible with any git topology (multiple roots, weird branching/merging, etc.)
I can't get it to be any faster (in python). Suggestions welcome :).
On z600/linux, this takes 5.1s to calculate the initial count for 2e3de954ef0a
(HEAD on src.git at the time of writing). Subsequent lookups take ~0.06s. For
reference, this machine takes 3s to just list the revisions in sorted order
without any additional processing (using rev-list).
All calculations are stored in a git-notes-style ref with the exception that the
leaf 'tree' object which would normally be stored in a git-notes world is
replaced with a packed binary file which consists of records [hash int]. Each
run of this script will create only 1 commit object on this internal ref which
will have as its parents:
* The previous git number commit
* All of the target commits we calculated numbers for.
This ref is then excluded on subsequent invocations of rev-list, which means that
git-number will only ever process commit objects which it hasn't already
calculated a value for. It also prevents you from attempting to number this
special ref :).
This implementation only has a 1-byte fanout which seems to be the best
performance for the repos we're dealing with (i.e. on the order of 500k commit
objects). Bumping this up to a 2-byte fanout became extremely slow (I suspect
the internal caching structures I'm using are not efficient in this mode and
could be improved). Using no fanout is slower than the 1 byte fanout for lookups
by about 30%.
R=agable@chromium.org, stip@chromium.org, szager@chromium.org
BUG=280154, 309692, skia:1639
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=236035
Patch Set 1 #Patch Set 2 : Parse all refs in one command. Reduces overhead when attempting to obtain numbers for multiple refs. #Patch Set 3 : Make it work on windows. 9s for initial calculation/.3s for lookup on z620. #Patch Set 4 : Remove silly author line :) #
Total comments: 81
Patch Set 5 : Address comments #Patch Set 6 : Address comments (reupload!) #
Total comments: 10
Patch Set 7 : Address more formatting comments #Patch Set 8 : Now with tests! #
Total comments: 38
Patch Set 9 : address comments #
Total comments: 21
Patch Set 10 : Address final comments? #
Total comments: 3
Patch Set 11 : ws #
Total comments: 21
Patch Set 12 : #
Total comments: 14
Patch Set 13 : fixes #
Total comments: 6
Patch Set 14 : more fixins #
Total comments: 26
Patch Set 15 #
Total comments: 16
Patch Set 16 : woot! working! #Patch Set 17 : Add version checking for coverage module #
Messages
Total messages: 47 (0 generated)
Presumably, I will also want to eventually add the equivalent of `git describe` and `git tag --contains` which will be much faster because won't have to recurse the repo history.
Also, here's a relevant SO question: http://stackoverflow.com/questions/6702821/git-commit-generation-numbers/6708...
I was thinking about adding some 'root' tracking logic to this, but it only solves a fraction of the conflicting cases. Consider: R1 < Z < A < B < C \ Q < D / A and D both have a # of '2'. They are not comparable in a meaningful way, because neither is an ancestor of the other. Basically the problem boils down to the fact that git has no fixed 'forward ref' ordering which is immutable. In order to give every node in the commit graph an absolute 'position', you would have to know that A 'was to the left' of D (or vice versa). Back refs have this property because the order of parents of a given commit is fixed at the point where this commit is created. However, at any time, I could chose to create a new branch from R1 (or anything), which the positioning system would then have to create a backwards compatible 'numbering' for this position. Now if we were willing to pick a ref to number from (say one that includes C), we could then number as: * C: 4 * B: 3 * A: 2 * D: 2.2 * Q: 2.1 * Z: 1 * R1: 0 This means that in order to assign counts to commits, we need to pick a reference that we want to number from. Because we 'backfill' numbering on the branch from the 'future' (i.e. D is numbered relative to B, not R1), we have all the data and ordering information that we care about. Then if we compare D and A, we know immediately that D is on a parallel line of history with A, and thus not directly comparable. However, we still know that D comes before B. If there are subbranches off of Q or D, they would have another dot component, and so on. Amusingly, the 'left most' line of history which only has a single component in this scheme, would be compatible with the plain generation numbers. This would probably fall apart on the git-svn branch however, since it is, in fact, absurdly branchy. Anyhow, I say we just stick to the basic number for now, and we strive to only make sane comparisons. I suspect that the plain number will be sufficient in all the cases where we were comparing SVN revisions previously. We've made it this far without comparing tools/build SVN revisions to src SVN revisions, so I think we should be OK.
Can has review? Anyone? :P
On 2013/10/21 17:36:38, iannucci wrote: > Can has review? Anyone? :P On it.
Are the tests coming in a follow up CL? Otherwise why so many trivial functions in git_common? https://codereview.chromium.org/26109002/diff/11001/git-number File git-number (right): https://codereview.chromium.org/26109002/diff/11001/git-number#newcode24 git-number:24: PYTHONDONTWRITEBYTECODE=1 exec "$base_dir/python_bin/python.exe" "$base_dir"/git_number.py "$@" Note: python_bin hardcoding will have to change for python 2.7.5 transition. https://codereview.chromium.org/26109002/diff/11001/git_common.py File git_common.py (right): https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode8 git_common.py:8: from multiprocessing.pool import IMapIterator Can't this be done later? https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode23 git_common.py:23: import binascii Order https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode25 git_common.py:25: hexlify = binascii.hexlify It's not used once. Why? https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode26 git_common.py:26: unhexlify = binascii.unhexlify It's used once, why? https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode30 git_common.py:30: 2 lines https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode46 git_common.py:46: """ """Memoizes a single-argument pure function. https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode172 git_common.py:172: GIT_EXE = 'git.bat' if sys.platform.startswith('win') else 'git' Constants should be at the top. https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode187 git_common.py:187: def git_intern_f(f, kind='blob'): This function is worth a docstring. https://codereview.chromium.org/26109002/diff/11001/git_number.py File git_number.py (right): https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode14 git_number.py:14: from git_common import git_hash, run_git, git_intern_f, git_tree Many of them are used once or twice. This doesn't pass the bar of direct importing. https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode26 git_number.py:26: # need to reimplement cache data structures to be a bit more sophistocated than sophisticated https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode45 git_number.py:45: raw = buffer(p.stdout.read()) Why not use .communicate()? You risk a pipe buffer hang. https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode54 git_number.py:54: def get_num(ref): A ref in git parlance is specifically not a hash. Change the variable name to be self-descriptive. https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode61 git_number.py:61: """Transform a number tree (in the form returned by |get_number_tree|) into a Transforms https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode83 git_number.py:83: """After calculating the generation number for |targets|, call finalize to The "After .." part should be in the second line docstring. Explains what it does first. https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode90 git_number.py:90: one empty line max. https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode194 git_number.py:194: if opts.verbose: Why do it conditionally? https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode198 git_number.py:198: run_git('update-ref', '-d', REF) I'd prefer to return 0 right after instead of using an else: https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode205 git_number.py:205: finalize(targets) return 0 after. https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode210 git_number.py:210: main() sys.exit(main()) https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode211 git_number.py:211: except KeyboardInterrupt: I'd prefer this to be handled inside the main().
On 2013/10/21 17:56:43, M-A Ruel wrote: > Are the tests coming in a follow up CL? Otherwise why so many trivial functions > in git_common? Ah. Right. This was airlifted from my personal vimfiles repo (https://github.com/riannucci/vimified/tree/master/gitscripts). The common functions are used by many scripts that I would like to (later) add to depot_tools, but aren't ready for prime time (for example, the reup script knows how to rebase all your branches at once in the chromium workflow, even with deeply nested local branches. This includes deleting fully integrated branches and fixing up local branch metadata). I could merge the two files for now, but I would rather add tests for all of it. That said, all of this code I've been using on my various workstations (mac, linux and windows) for months now, so I have high confidence that it does work as advertised :) I'll respond to the rest of your comments in a bit. > > https://codereview.chromium.org/26109002/diff/11001/git-number > File git-number (right): > > https://codereview.chromium.org/26109002/diff/11001/git-number#newcode24 > git-number:24: PYTHONDONTWRITEBYTECODE=1 exec "$base_dir/python_bin/python.exe" > "$base_dir"/git_number.py "$@" > Note: python_bin hardcoding will have to change for python 2.7.5 transition. > > https://codereview.chromium.org/26109002/diff/11001/git_common.py > File git_common.py (right): > > https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode8 > git_common.py:8: from multiprocessing.pool import IMapIterator > Can't this be done later? > > https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode23 > git_common.py:23: import binascii > Order > > https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode25 > git_common.py:25: hexlify = binascii.hexlify > It's not used once. Why? > > https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode26 > git_common.py:26: unhexlify = binascii.unhexlify > It's used once, why? > > https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode30 > git_common.py:30: > 2 lines > > https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode46 > git_common.py:46: """ > """Memoizes a single-argument pure function. > > https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode172 > git_common.py:172: GIT_EXE = 'git.bat' if sys.platform.startswith('win') else > 'git' > Constants should be at the top. > > https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode187 > git_common.py:187: def git_intern_f(f, kind='blob'): > This function is worth a docstring. > > https://codereview.chromium.org/26109002/diff/11001/git_number.py > File git_number.py (right): > > https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode14 > git_number.py:14: from git_common import git_hash, run_git, git_intern_f, > git_tree > Many of them are used once or twice. This doesn't pass the bar of direct > importing. > > https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode26 > git_number.py:26: # need to reimplement cache data structures to be a bit more > sophistocated than > sophisticated > > https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode45 > git_number.py:45: raw = buffer(p.stdout.read()) > Why not use .communicate()? You risk a pipe buffer hang. > > https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode54 > git_number.py:54: def get_num(ref): > A ref in git parlance is specifically not a hash. Change the variable name to be > self-descriptive. > > https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode61 > git_number.py:61: """Transform a number tree (in the form returned by > |get_number_tree|) into a > Transforms > > https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode83 > git_number.py:83: """After calculating the generation number for |targets|, call > finalize to > The "After .." part should be in the second line docstring. Explains what it > does first. > > https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode90 > git_number.py:90: > one empty line max. > > https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode194 > git_number.py:194: if opts.verbose: > Why do it conditionally? > > https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode198 > git_number.py:198: run_git('update-ref', '-d', REF) > I'd prefer to return 0 right after instead of using an else: > > https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode205 > git_number.py:205: finalize(targets) > return 0 after. > > https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode210 > git_number.py:210: main() > sys.exit(main()) > > https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode211 > git_number.py:211: except KeyboardInterrupt: > I'd prefer this to be handled inside the main().
https://codereview.chromium.org/26109002/diff/11001/git_common.py File git_common.py (right): https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode37 git_common.py:37: self.out_err = out_err output and out_err are never used. https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode45 git_common.py:45: def memoize_one(f): Is there anywhere else in depot_tools this could live, that's even more general than git_common? https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode70 git_common.py:70: def initer(orig, orig_args): Define this inside ScopedPool? https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode96 git_common.py:96: class StatusPrinter(object): This is more like a ProgressPrinter, since it increments and prints an integer count. https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode100 git_common.py:100: Create a StatusPrinter. Bump up a line. https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode102 git_common.py:102: Call .start() to get it going, or use it as a context manager which produces Doesn't actually have a .start() method. https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode148 git_common.py:148: def parse_committish(*committish): Docstring. Unclear why a committish would have multiple lines (would expect it to be one hexadecimal hash). Also, ditto comment above about just defining unhexlify here. https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode155 git_common.py:155: def check_output(*popenargs, **kwargs): Docstriiing. https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode162 git_common.py:162: output, out_err = process.communicate(indata) I thought communicate misbehaved if you didn't have stdin=PIPE. Also, stdout and stderr aren't reserved works, so you could just use those instead of output and out_err. https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode164 git_common.py:164: if retcode: You need to handle the case where poll() returns None, saying the process hasn't terminated. https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode174 git_common.py:174: def run_git(*cmd, **kwargs): Doooooooc strriiiiiiing. https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode183 git_common.py:183: def git_hash(*reflike): Like seriously. All functions are worth good docstrings. https://codereview.chromium.org/26109002/diff/11001/git_number.py File git_number.py (right): https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode96 git_number.py:96: with StatusPrinter('Finalizing: (%%d/%d)' % len(DIRTY_TREES)) as inc: This half-formatted string is unfortunate. You could have StatusPrinter do self._fmt % {'count': self._count} and have %%(count)d here to be a bit more explicit about what's going on. https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode114 git_number.py:114: commit_cmd.append(run_git('write-tree', env=env)) This is confusing, especially since you're calling git plumbing commands here. Please run_git and capture its output, then append it to the commit_cmd. https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode160 git_number.py:160: toks = map(unhexlify, line.split()) tokens https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode187 git_number.py:187: parser.add_option('-v', '--verbose', action='count', default=0
No tests yet, but comments addressed. https://chromiumcodereview.appspot.com/26109002/diff/11001/git-number File git-number (right): https://chromiumcodereview.appspot.com/26109002/diff/11001/git-number#newcode24 git-number:24: PYTHONDONTWRITEBYTECODE=1 exec "$base_dir/python_bin/python.exe" "$base_dir"/git_number.py "$@" On 2013/10/21 17:56:44, M-A Ruel wrote: > Note: python_bin hardcoding will have to change for python 2.7.5 transition. :(.. Is there a more generic approach here? https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py File git_common.py (right): https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:8: from multiprocessing.pool import IMapIterator On 2013/10/21 17:56:44, M-A Ruel wrote: > Can't this be done later? 'this' ? Parallelizing, or monkey-patching IMapIterator? I would rather do parallelization now because: 1) It works 2) It's fast (I can make numbers if you like, but it was quite substantial last I checked (though previously it was doing many more tasks in parallel)) 3) It's (IMO) pretty low readability cost The monkey patch is so that Ctrl-C works, because python can't be bothered to have a working multiprocessing library :D https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:23: import binascii On 2013/10/21 17:56:44, M-A Ruel wrote: > Order Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:25: hexlify = binascii.hexlify On 2013/10/21 17:56:44, M-A Ruel wrote: > It's not used once. Why? It's imported (in git_number). Used for dealing with git hashes. Same for unhexlify. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:30: On 2013/10/21 17:56:44, M-A Ruel wrote: > 2 lines Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:37: self.out_err = out_err On 2013/10/21 20:16:42, Aaron Gable wrote: > output and out_err are never used. True. Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:45: def memoize_one(f): On 2013/10/21 20:16:42, Aaron Gable wrote: > Is there anywhere else in depot_tools this could live, that's even more general > than git_common? Not sure... https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:46: """ On 2013/10/21 17:56:44, M-A Ruel wrote: > """Memoizes a single-argument pure function. Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:70: def initer(orig, orig_args): On 2013/10/21 20:16:42, Aaron Gable wrote: > Define this inside ScopedPool? Can't. Multprocessing sux and this must be importable. I'll add a docstring tho. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:96: class StatusPrinter(object): On 2013/10/21 20:16:42, Aaron Gable wrote: > This is more like a ProgressPrinter, since it increments and prints an integer > count. Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:100: Create a StatusPrinter. On 2013/10/21 20:16:42, Aaron Gable wrote: > Bump up a line. Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:102: Call .start() to get it going, or use it as a context manager which produces On 2013/10/21 20:16:42, Aaron Gable wrote: > Doesn't actually have a .start() method. well how about that... Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:148: def parse_committish(*committish): On 2013/10/21 20:16:42, Aaron Gable wrote: > Docstring. Unclear why a committish would have multiple lines (would expect it > to be one hexadecimal hash). Also, ditto comment above about just defining > unhexlify here. Added comments to *lify, added docstring and renamed this function to make more sense. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:155: def check_output(*popenargs, **kwargs): On 2013/10/21 20:16:42, Aaron Gable wrote: > Docstriiing. Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:162: output, out_err = process.communicate(indata) On 2013/10/21 20:16:42, Aaron Gable wrote: > I thought communicate misbehaved if you didn't have stdin=PIPE. We do this if indata is not-None. > Also, stdout and > stderr aren't reserved works, so you could just use those instead of output and > out_err. I prefer 'output' since 'stdout' is a stream, whereas 'output' is a string. out_err is gone now. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:164: if retcode: On 2013/10/21 20:16:42, Aaron Gable wrote: > You need to handle the case where poll() returns None, saying the process hasn't > terminated. communicate() guarantees that the process is ended. I switched poll() to returncode to be more explicit. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:172: GIT_EXE = 'git.bat' if sys.platform.startswith('win') else 'git' On 2013/10/21 17:56:44, M-A Ruel wrote: > Constants should be at the top. Yep. Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:174: def run_git(*cmd, **kwargs): On 2013/10/21 20:16:42, Aaron Gable wrote: > Doooooooc strriiiiiiing. Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:183: def git_hash(*reflike): On 2013/10/21 20:16:42, Aaron Gable wrote: > Like seriously. All functions are worth good docstrings. really? this is a one-line function... """Returns the git hash after rev-parsing |reflike|""" No. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:187: def git_intern_f(f, kind='blob'): On 2013/10/21 17:56:44, M-A Ruel wrote: > This function is worth a docstring. Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_number.py File git_number.py (right): https://chromiumcodereview.appspot.com/26109002/diff/11001/git_number.py#newc... git_number.py:14: from git_common import git_hash, run_git, git_intern_f, git_tree On 2013/10/21 17:56:44, M-A Ruel wrote: > Many of them are used once or twice. This doesn't pass the bar of direct > importing. Donez'd https://chromiumcodereview.appspot.com/26109002/diff/11001/git_number.py#newc... git_number.py:26: # need to reimplement cache data structures to be a bit more sophistocated than On 2013/10/21 17:56:44, M-A Ruel wrote: > sophisticated oops. Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_number.py#newc... git_number.py:45: raw = buffer(p.stdout.read()) On 2013/10/21 17:56:44, M-A Ruel wrote: > Why not use .communicate()? You risk a pipe buffer hang. Yeah, this used to make more sense. Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_number.py#newc... git_number.py:54: def get_num(ref): On 2013/10/21 17:56:44, M-A Ruel wrote: > A ref in git parlance is specifically not a hash. Change the variable name to be > self-descriptive. You're right. hash is a built-in though... maybe commit_id? LMK what you think. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_number.py#newc... git_number.py:61: """Transform a number tree (in the form returned by |get_number_tree|) into a On 2013/10/21 17:56:44, M-A Ruel wrote: > Transforms Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_number.py#newc... git_number.py:83: """After calculating the generation number for |targets|, call finalize to On 2013/10/21 17:56:44, M-A Ruel wrote: > The "After .." part should be in the second line docstring. Explains what it > does first. Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_number.py#newc... git_number.py:90: On 2013/10/21 17:56:44, M-A Ruel wrote: > one empty line max. Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_number.py#newc... git_number.py:96: with StatusPrinter('Finalizing: (%%d/%d)' % len(DIRTY_TREES)) as inc: On 2013/10/21 20:16:42, Aaron Gable wrote: > This half-formatted string is unfortunate. You could have StatusPrinter do > self._fmt % {'count': self._count} and have %%(count)d here to be a bit more > explicit about what's going on. k. Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_number.py#newc... git_number.py:114: commit_cmd.append(run_git('write-tree', env=env)) On 2013/10/21 20:16:42, Aaron Gable wrote: > This is confusing, especially since you're calling git plumbing commands here. > Please run_git and capture its output, then append it to the commit_cmd. Hm... not sure I agree, but sure. Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_number.py#newc... git_number.py:160: toks = map(unhexlify, line.split()) On 2013/10/21 20:16:42, Aaron Gable wrote: > tokens Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_number.py#newc... git_number.py:187: parser.add_option('-v', '--verbose', action='count', On 2013/10/21 20:16:42, Aaron Gable wrote: > default=0 Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_number.py#newc... git_number.py:194: if opts.verbose: On 2013/10/21 17:56:44, M-A Ruel wrote: > Why do it conditionally? Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_number.py#newc... git_number.py:198: run_git('update-ref', '-d', REF) On 2013/10/21 17:56:44, M-A Ruel wrote: > I'd prefer to return 0 right after instead of using an else: Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_number.py#newc... git_number.py:205: finalize(targets) On 2013/10/21 17:56:44, M-A Ruel wrote: > return 0 after. Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_number.py#newc... git_number.py:210: main() On 2013/10/21 17:56:44, M-A Ruel wrote: > sys.exit(main()) Done. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_number.py#newc... git_number.py:211: except KeyboardInterrupt: On 2013/10/21 17:56:44, M-A Ruel wrote: > I'd prefer this to be handled inside the main(). Done.
https://chromiumcodereview.appspot.com/26109002/diff/11001/git-number File git-number (right): https://chromiumcodereview.appspot.com/26109002/diff/11001/git-number#newcode24 git-number:24: PYTHONDONTWRITEBYTECODE=1 exec "$base_dir/python_bin/python.exe" "$base_dir"/git_number.py "$@" On 2013/10/22 07:28:22, iannucci wrote: > On 2013/10/21 17:56:44, M-A Ruel wrote: > > Note: python_bin hardcoding will have to change for python 2.7.5 transition. > > :(.. Is there a more generic approach here? Don't bother here for now, I'll create a separate CL for the other scripts and will update this one at the same time. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py File git_common.py (right): https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:8: from multiprocessing.pool import IMapIterator On 2013/10/22 07:28:22, iannucci wrote: > On 2013/10/21 17:56:44, M-A Ruel wrote: > > Can't this be done later? > > 'this' ? Parallelizing, or monkey-patching IMapIterator? Monkey patching, but I don't mind. Leave it there. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:25: hexlify = binascii.hexlify On 2013/10/22 07:28:22, iannucci wrote: > On 2013/10/21 17:56:44, M-A Ruel wrote: > > It's not used once. Why? > > It's imported (in git_number). Used for dealing with git hashes. Same for > unhexlify. Then import it where it's used. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:174: def run_git(*cmd, **kwargs): On 2013/10/22 07:28:22, iannucci wrote: > On 2013/10/21 20:16:42, Aaron Gable wrote: > > Doooooooc strriiiiiiing. > > Done. I don't think this particular function needed a docstring. It's more trivial and transparent even if longer than let's say parse_committish(), which is more opage. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_common.py#newc... git_common.py:183: def git_hash(*reflike): On 2013/10/22 07:28:22, iannucci wrote: > On 2013/10/21 20:16:42, Aaron Gable wrote: > > Like seriously. All functions are worth good docstrings. > > really? this is a one-line function... > > """Returns the git hash after rev-parsing |reflike|""" > > No. I think it's more about *what* this call does. It converts a reference into a hash. I really don't mind either way, I just think the function is ill named. First it's a git_common module, naming the functions with a git_ prefix is unneeded, I'd prefer the function to be named reference_to_commit_hash() and then it's self-descriptive enough *at call site* that it wouldn't even need a docstring. But I don't mind. https://chromiumcodereview.appspot.com/26109002/diff/11001/git_number.py File git_number.py (right): https://chromiumcodereview.appspot.com/26109002/diff/11001/git_number.py#newc... git_number.py:54: def get_num(ref): On 2013/10/22 07:28:22, iannucci wrote: > On 2013/10/21 17:56:44, M-A Ruel wrote: > > A ref in git parlance is specifically not a hash. Change the variable name to > be > > self-descriptive. > > You're right. hash is a built-in though... maybe commit_id? LMK what you think. commit_hash https://chromiumcodereview.appspot.com/26109002/diff/97001/git_common.py File git_common.py (right): https://chromiumcodereview.appspot.com/26109002/diff/97001/git_common.py#newc... git_common.py:133: if VERBOSE_LEVEL > 0: Use an instance member instead to set it. https://chromiumcodereview.appspot.com/26109002/diff/97001/git_common.py#newc... git_common.py:211: if VERBOSE_LEVEL > 1: Why not using logging.info() ? Or at worst, use logging.getLogger().isEnabledFor(logging.INFO) ? https://chromiumcodereview.appspot.com/26109002/diff/97001/git_number.py File git_number.py (right): https://chromiumcodereview.appspot.com/26109002/diff/97001/git_number.py#newc... git_number.py:137: """Load/calculate the generation numbers for targets. Loads and then calculates https://chromiumcodereview.appspot.com/26109002/diff/97001/git_number.py#newc... git_number.py:201: if opts.reset: Personally, I'd start the try: here, or preferably create a function here where you call the worker function. I don't mind much. https://chromiumcodereview.appspot.com/26109002/diff/97001/git_number.py#newc... git_number.py:215: 2 lines
Working on tests now. Formatting comments addressed (hopefully). https://codereview.chromium.org/26109002/diff/11001/git_common.py File git_common.py (right): https://codereview.chromium.org/26109002/diff/11001/git_common.py#newcode183 git_common.py:183: def git_hash(*reflike): On 2013/10/24 13:23:03, M-A Ruel wrote: > On 2013/10/22 07:28:22, iannucci wrote: > > On 2013/10/21 20:16:42, Aaron Gable wrote: > > > Like seriously. All functions are worth good docstrings. > > > > really? this is a one-line function... > > > > """Returns the git hash after rev-parsing |reflike|""" > > > > No. > > I think it's more about *what* this call does. It converts a reference into a > hash. I really don't mind either way, I just think the function is ill named. > Hm... at the callsite: hashes = git_hash('HEAD', 'HEAD~100', "deadbeef") I read it as: 'hashes is the git_hash of "HEAD", "HEAD~100", and "deadbeef"'. Maybe that's just me? > First it's a git_common module, naming the functions with a git_ prefix is > unneeded, I'd prefer the function to be named reference_to_commit_hash() and > then it's self-descriptive enough *at call site* that it wouldn't even need a > docstring. > > But I don't mind. +1 on taking the git prefix off of everything, but reference_to_commit_hash seems too verbose to me. New naming scheme is go-style-ish (minus the Capital functions). https://codereview.chromium.org/26109002/diff/11001/git_number.py File git_number.py (right): https://codereview.chromium.org/26109002/diff/11001/git_number.py#newcode54 git_number.py:54: def get_num(ref): On 2013/10/24 13:23:03, M-A Ruel wrote: > On 2013/10/22 07:28:22, iannucci wrote: > > On 2013/10/21 17:56:44, M-A Ruel wrote: > > > A ref in git parlance is specifically not a hash. Change the variable name > to > > be > > > self-descriptive. > > > > You're right. hash is a built-in though... maybe commit_id? LMK what you > think. > > commit_hash Done. https://codereview.chromium.org/26109002/diff/97001/git_common.py File git_common.py (right): https://codereview.chromium.org/26109002/diff/97001/git_common.py#newcode133 git_common.py:133: if VERBOSE_LEVEL > 0: On 2013/10/24 13:23:03, M-A Ruel wrote: > Use an instance member instead to set it. Done. https://codereview.chromium.org/26109002/diff/97001/git_common.py#newcode211 git_common.py:211: if VERBOSE_LEVEL > 1: On 2013/10/24 13:23:03, M-A Ruel wrote: > Why not using logging.info() ? > Or at worst, use logging.getLogger().isEnabledFor(logging.INFO) ? because I'm lazy and the logging module annoys me? :) I'll fix it. Done. https://codereview.chromium.org/26109002/diff/97001/git_number.py File git_number.py (right): https://codereview.chromium.org/26109002/diff/97001/git_number.py#newcode137 git_number.py:137: """Load/calculate the generation numbers for targets. On 2013/10/24 13:23:03, M-A Ruel wrote: > Loads and then calculates Clarified this docstring. https://codereview.chromium.org/26109002/diff/97001/git_number.py#newcode201 git_number.py:201: if opts.reset: On 2013/10/24 13:23:03, M-A Ruel wrote: > Personally, I'd start the try: here, or preferably create a function here where > you call the worker function. I don't mind much. Cleaned this up as you suggest. Much better now :) https://codereview.chromium.org/26109002/diff/97001/git_number.py#newcode215 git_number.py:215: On 2013/10/24 13:23:03, M-A Ruel wrote: > 2 lines Done.
OK! PTAL, now with 100% code coverage
https://chromiumcodereview.appspot.com/26109002/diff/197001/git-number File git-number (right): https://chromiumcodereview.appspot.com/26109002/diff/197001/git-number#newcode23 git-number:23: if [ -d "$base_dir/python_bin" -a $MINGW = 0 ]; then Use the new way that works with 2.7.5. https://chromiumcodereview.appspot.com/26109002/diff/197001/git_common.py File git_common.py (right): https://chromiumcodereview.appspot.com/26109002/diff/197001/git_common.py#new... git_common.py:30: pathlify = lambda s: '/'.join('%02x' % ord(b) for b in s) This seems to create very deep directory. What is the intent? https://chromiumcodereview.appspot.com/26109002/diff/197001/git_common.py#new... git_common.py:54: * cache (dict) - Maps arg to f(arg) I'd be good to document how one is expected to disable the caching in unit tests. https://chromiumcodereview.appspot.com/26109002/diff/197001/git_number.py File git_number.py (right): https://chromiumcodereview.appspot.com/26109002/diff/197001/git_number.py#new... git_number.py:5: I like a short module docstring that tells what the script does and why, then use OptionParser(..., description=sys.modules[__name__].__doc__) https://chromiumcodereview.appspot.com/26109002/diff/197001/git_number.py#new... git_number.py:31: one more line https://chromiumcodereview.appspot.com/26109002/diff/197001/git_number.py#new... git_number.py:34: """Return a dictionary of the blob contents specified by |prefix_bytes|. Returns https://chromiumcodereview.appspot.com/26109002/diff/197001/git_number.py#new... git_number.py:58: space more https://chromiumcodereview.appspot.com/26109002/diff/197001/git_number.py#new... git_number.py:60: get_number_tree.cache.clear() In general I'm not a fan of global caches, for example this code inhibits processing multiple git repositories inside a single process. For example let's say someone in a few years wants to process all the DEPS as pseudo revs and use this script as a library. I'm not requesting you to change it but I think it is a significant downside of the technique used here. https://chromiumcodereview.appspot.com/26109002/diff/197001/git_number.py#new... git_number.py:80: def leaf_map_fn((pre, tree)): Is (()) intended? https://chromiumcodereview.appspot.com/26109002/diff/197001/testing_support/g... File testing_support/git_test_utils.py (right): https://chromiumcodereview.appspot.com/26109002/diff/197001/testing_support/g... testing_support/git_test_utils.py:59: def __iter__(self): Group the __functions__ together, sort both subsets. https://chromiumcodereview.appspot.com/26109002/diff/197001/testing_support/g... testing_support/git_test_utils.py:196: class GitRepo(object): This could likely be used to remove git support in testing_support/fake_repos.py. https://chromiumcodereview.appspot.com/26109002/diff/197001/testing_support/g... testing_support/git_test_utils.py:418: def covered_main(includes): This is not git specific. I'd recommend a separate file https://chromiumcodereview.appspot.com/26109002/diff/197001/tests/git_common_... File tests/git_common_test.py (right): https://chromiumcodereview.appspot.com/26109002/diff/197001/tests/git_common_... tests/git_common_test.py:16: sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) Make this a constant, and use another one for the __main__ section below. https://chromiumcodereview.appspot.com/26109002/diff/197001/tests/git_common_... tests/git_common_test.py:65: import time You want to reimport in fear of time.sleep() mocking? https://chromiumcodereview.appspot.com/26109002/diff/197001/tests/git_common_... tests/git_common_test.py:253: REPO = "" REPO = '' https://chromiumcodereview.appspot.com/26109002/diff/197001/tests/git_number_... File tests/git_number_test.py (right): https://chromiumcodereview.appspot.com/26109002/diff/197001/tests/git_number_... tests/git_number_test.py:2: # Copyright (c) 2012 The Chromium Authors. All rights reserved. # Copyright 2013 The Chromium Authors. All rights reserved. https://chromiumcodereview.appspot.com/26109002/diff/197001/tests/git_number_... tests/git_number_test.py:28: cls.gn.POOL_KIND = 'threads' Why not do it in setUp()? https://chromiumcodereview.appspot.com/26109002/diff/197001/tests/git_number_... tests/git_number_test.py:32: super(Basic, cls).tearDownClass() You wan to call the super class after the function did its own stuff. https://chromiumcodereview.appspot.com/26109002/diff/197001/tests/git_number_... tests/git_number_test.py:36: super(Basic, self).tearDown() same
Thanks for the fast feedback. PTAL https://chromiumcodereview.appspot.com/26109002/diff/197001/git-number File git-number (right): https://chromiumcodereview.appspot.com/26109002/diff/197001/git-number#newcode23 git-number:23: if [ -d "$base_dir/python_bin" -a $MINGW = 0 ]; then On 2013/11/07 20:59:11, M-A Ruel wrote: > Use the new way that works with 2.7.5. Done!! https://chromiumcodereview.appspot.com/26109002/diff/197001/git_common.py File git_common.py (right): https://chromiumcodereview.appspot.com/26109002/diff/197001/git_common.py#new... git_common.py:30: pathlify = lambda s: '/'.join('%02x' % ord(b) for b in s) On 2013/11/07 20:59:11, M-A Ruel wrote: > This seems to create very deep directory. What is the intent? Like the comment above says, it' converts a prefix of a hash into a path. It's used for the git number caching mechanism. I moved it to git_number.py https://chromiumcodereview.appspot.com/26109002/diff/197001/git_common.py#new... git_common.py:54: * cache (dict) - Maps arg to f(arg) On 2013/11/07 20:59:11, M-A Ruel wrote: > I'd be good to document how one is expected to disable the caching in unit > tests. Done. https://chromiumcodereview.appspot.com/26109002/diff/197001/git_number.py File git_number.py (right): https://chromiumcodereview.appspot.com/26109002/diff/197001/git_number.py#new... git_number.py:5: On 2013/11/07 20:59:11, M-A Ruel wrote: > I like a short module docstring that tells what the script does and why, then > use > OptionParser(..., description=sys.modules[__name__].__doc__) Done. (though I used usage= because description was reflowing the text poorly). https://chromiumcodereview.appspot.com/26109002/diff/197001/git_number.py#new... git_number.py:31: On 2013/11/07 20:59:11, M-A Ruel wrote: > one more line Done. https://chromiumcodereview.appspot.com/26109002/diff/197001/git_number.py#new... git_number.py:34: """Return a dictionary of the blob contents specified by |prefix_bytes|. On 2013/11/07 20:59:11, M-A Ruel wrote: > Returns Done. https://chromiumcodereview.appspot.com/26109002/diff/197001/git_number.py#new... git_number.py:58: On 2013/11/07 20:59:11, M-A Ruel wrote: > space more Done. https://chromiumcodereview.appspot.com/26109002/diff/197001/git_number.py#new... git_number.py:60: get_number_tree.cache.clear() On 2013/11/07 20:59:11, M-A Ruel wrote: > In general I'm not a fan of global caches, for example this code inhibits > processing multiple git repositories inside a single process. For example let's > say someone in a few years wants to process all the DEPS as pseudo revs and use > this script as a library. > > I'm not requesting you to change it but I think it is a significant downside of > the technique used here. Fair point. It would make sense to refactor this into a class which takes a repo as it's argument. Maybe for v2? https://chromiumcodereview.appspot.com/26109002/diff/197001/git_number.py#new... git_number.py:80: def leaf_map_fn((pre, tree)): On 2013/11/07 20:59:11, M-A Ruel wrote: > Is (()) intended? Yeah, this function is invoked with a tuple (by multiprocessing). The extra () unpacks it automatically. https://chromiumcodereview.appspot.com/26109002/diff/197001/testing_support/g... File testing_support/git_test_utils.py (right): https://chromiumcodereview.appspot.com/26109002/diff/197001/testing_support/g... testing_support/git_test_utils.py:59: def __iter__(self): On 2013/11/07 20:59:11, M-A Ruel wrote: > Group the __functions__ together, sort both subsets. Done. I put __init__ at the top though. https://chromiumcodereview.appspot.com/26109002/diff/197001/testing_support/g... testing_support/git_test_utils.py:196: class GitRepo(object): On 2013/11/07 20:59:11, M-A Ruel wrote: > This could likely be used to remove git support in > testing_support/fake_repos.py. Yeah agreed https://chromiumcodereview.appspot.com/26109002/diff/197001/testing_support/g... testing_support/git_test_utils.py:418: def covered_main(includes): On 2013/11/07 20:59:11, M-A Ruel wrote: > This is not git specific. I'd recommend a separate file Done. Moved to coverage_uitls.py (even though there's only one think in there at the moment). https://chromiumcodereview.appspot.com/26109002/diff/197001/tests/git_common_... File tests/git_common_test.py (right): https://chromiumcodereview.appspot.com/26109002/diff/197001/tests/git_common_... tests/git_common_test.py:16: sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) On 2013/11/07 20:59:11, M-A Ruel wrote: > Make this a constant, and use another one for the __main__ section below. Done. https://chromiumcodereview.appspot.com/26109002/diff/197001/tests/git_common_... tests/git_common_test.py:65: import time On 2013/11/07 20:59:11, M-A Ruel wrote: > You want to reimport in fear of time.sleep() mocking? No, actually I'm not sure why I did this. Fixed. https://chromiumcodereview.appspot.com/26109002/diff/197001/tests/git_common_... tests/git_common_test.py:253: REPO = "" On 2013/11/07 20:59:11, M-A Ruel wrote: > REPO = '' Done. https://chromiumcodereview.appspot.com/26109002/diff/197001/tests/git_number_... File tests/git_number_test.py (right): https://chromiumcodereview.appspot.com/26109002/diff/197001/tests/git_number_... tests/git_number_test.py:2: # Copyright (c) 2012 The Chromium Authors. All rights reserved. On 2013/11/07 20:59:11, M-A Ruel wrote: > # Copyright 2013 The Chromium Authors. All rights reserved. Oops. Done. https://chromiumcodereview.appspot.com/26109002/diff/197001/tests/git_number_... tests/git_number_test.py:28: cls.gn.POOL_KIND = 'threads' On 2013/11/07 20:59:11, M-A Ruel wrote: > Why not do it in setUp()? Doesn't need to be done for every test https://chromiumcodereview.appspot.com/26109002/diff/197001/tests/git_number_... tests/git_number_test.py:32: super(Basic, cls).tearDownClass() On 2013/11/07 20:59:11, M-A Ruel wrote: > You wan to call the super class after the function did its own stuff. Good point. Fixed everywhere. https://chromiumcodereview.appspot.com/26109002/diff/197001/tests/git_number_... tests/git_number_test.py:36: super(Basic, self).tearDown() On 2013/11/07 20:59:11, M-A Ruel wrote: > same Done.
Sorry for the delay, I wasn't fully awake yesterday. https://chromiumcodereview.appspot.com/26109002/diff/347001/git_common.py File git_common.py (right): https://chromiumcodereview.appspot.com/26109002/diff/347001/git_common.py#new... git_common.py:54: cache = {} I'd prefer if this decorator was thread safe. This is the kind of thing where someone adds a worker thread in 2 years and then this starts blowing up randomly. https://chromiumcodereview.appspot.com/26109002/diff/347001/git_common.py#new... git_common.py:162: """This takes one or more committishes, and returns the binary-encoded git """Returns binary-encoded git ref (hash) for one or more refname (commitref). Nomenclature is everything. http://git-scm.com/docs/gitglossary#def_ref Note that the git documentation is not consistent with regard to how to name a reference to a commit. Most commands man page will just state <commit> where in practice it should probably be IMHO "commitref". Both http://git-scm.com/docs/git-rev-parse and http://git-scm.com/docs/git-check-ref-format uses "refname" so I prefer this name. But I don't mind much, you are free to chose what you prefer. https://chromiumcodereview.appspot.com/26109002/diff/347001/git_common.py#new... git_common.py:180: """Run a Popen command, and return the stdout as a string. Runs and what about check_output? But it doesn't seem like it properly support stdin, oh well. https://chromiumcodereview.appspot.com/26109002/diff/347001/git_number.py File git_number.py (right): https://chromiumcodereview.appspot.com/26109002/diff/347001/git_number.py#new... git_number.py:52: """Convert a binary git hash prefix into a posix path, one folder per byte. Converts S'es everywhere. https://chromiumcodereview.appspot.com/26109002/diff/347001/git_number.py#new... git_number.py:71: p = subprocess.Popen(['git', 'cat-file', 'blob', ref], Why not use git.run()? https://chromiumcodereview.appspot.com/26109002/diff/347001/testing_support/c... File testing_support/coverage_utils.py (right): https://chromiumcodereview.appspot.com/26109002/diff/347001/testing_support/c... testing_support/coverage_utils.py:8: 2 lines https://chromiumcodereview.appspot.com/26109002/diff/347001/testing_support/g... File testing_support/git_test_utils.py (right): https://chromiumcodereview.appspot.com/26109002/diff/347001/testing_support/g... testing_support/git_test_utils.py:160: def __init__(self, repo_schema="", '' https://chromiumcodereview.appspot.com/26109002/diff/347001/testing_support/g... testing_support/git_test_utils.py:259: """Allows you to get the hash of a commit by it's schema name. Gets the hash of a ... https://chromiumcodereview.appspot.com/26109002/diff/347001/tests/git_number_... File tests/git_number_test.py (right): https://chromiumcodereview.appspot.com/26109002/diff/347001/tests/git_number_... tests/git_number_test.py:17: 2 lines https://chromiumcodereview.appspot.com/26109002/diff/347001/tests/git_number_... tests/git_number_test.py:34: cls.gn.POOL_KIND = 'procs' I'd prefer you to cache the old value before line 30 and put it back here. https://chromiumcodereview.appspot.com/26109002/diff/347001/tests/git_number_... tests/git_number_test.py:75: self.repo.run(self.gn.get_num, binascii.unhexlify(self.repo['A'])), None I personally prefer putting the expected value first but I don't mind much.
https://codereview.chromium.org/26109002/diff/347001/git_common.py File git_common.py (right): https://codereview.chromium.org/26109002/diff/347001/git_common.py#newcode54 git_common.py:54: cache = {} On 2013/11/08 19:32:23, M-A Ruel wrote: > I'd prefer if this decorator was thread safe. This is the kind of thing where > someone adds a worker thread in 2 years and then this starts blowing up > randomly. I made a threadsafe=bool option mandatory for this decorator, so users of the function will always have to pick. https://codereview.chromium.org/26109002/diff/347001/git_common.py#newcode162 git_common.py:162: """This takes one or more committishes, and returns the binary-encoded git On 2013/11/08 19:32:23, M-A Ruel wrote: > """Returns binary-encoded git ref (hash) for one or more refname (commitref). > > Nomenclature is everything. > http://git-scm.com/docs/gitglossary#def_ref > > Note that the git documentation is not consistent with regard to how to name a > reference to a commit. Most commands man page will just state <commit> where in > practice it should probably be IMHO "commitref". > > Both http://git-scm.com/docs/git-rev-parse and > http://git-scm.com/docs/git-check-ref-format uses "refname" so I prefer this > name. > > But I don't mind much, you are free to chose what you prefer. Yeah I agree, the terminology in the git docs is NOT consistent. I think I'll go with: ref -> any way to refer to an object (anything parsable by rev-parse) *ref -> a ref to a particular object type (commitref, treeref, tagref, blobref) object hash -> the sha1 to identify an object * hash -> the sha1 to identify a particular object type (tree name, blob name, ...) WDYT? https://codereview.chromium.org/26109002/diff/347001/git_common.py#newcode180 git_common.py:180: """Run a Popen command, and return the stdout as a string. On 2013/11/08 19:32:23, M-A Ruel wrote: > Runs > Done > and what about check_output? But it doesn't seem like it properly support stdin, > oh well. yep. I'm telling you... subprocess sucks bigtime. https://codereview.chromium.org/26109002/diff/347001/git_number.py File git_number.py (right): https://codereview.chromium.org/26109002/diff/347001/git_number.py#newcode52 git_number.py:52: """Convert a binary git hash prefix into a posix path, one folder per byte. On 2013/11/08 19:32:23, M-A Ruel wrote: > Converts > > S'es everywhere. DE-IMPERITIZE ALL THE VERBS! https://codereview.chromium.org/26109002/diff/347001/git_number.py#newcode71 git_number.py:71: p = subprocess.Popen(['git', 'cat-file', 'blob', ref], On 2013/11/08 19:32:23, M-A Ruel wrote: > Why not use git.run()? Good question... I'm pretty sure there was a reason though. I'll try to use git run and if it doesn't work, I'll add a comment for why not :) edit Looks like it was because I had run() always call strip() on the output. Added a flag and converted to run(). Also tidied this method up a bit. https://codereview.chromium.org/26109002/diff/347001/testing_support/coverage... File testing_support/coverage_utils.py (right): https://codereview.chromium.org/26109002/diff/347001/testing_support/coverage... testing_support/coverage_utils.py:8: On 2013/11/08 19:32:23, M-A Ruel wrote: > 2 lines Done. https://codereview.chromium.org/26109002/diff/347001/testing_support/git_test... File testing_support/git_test_utils.py (right): https://codereview.chromium.org/26109002/diff/347001/testing_support/git_test... testing_support/git_test_utils.py:160: def __init__(self, repo_schema="", On 2013/11/08 19:32:23, M-A Ruel wrote: > '' Done. https://codereview.chromium.org/26109002/diff/347001/testing_support/git_test... testing_support/git_test_utils.py:259: """Allows you to get the hash of a commit by it's schema name. On 2013/11/08 19:32:23, M-A Ruel wrote: > Gets the hash of a ... Done. Fixed the "it's" too :) https://codereview.chromium.org/26109002/diff/347001/tests/git_number_test.py File tests/git_number_test.py (right): https://codereview.chromium.org/26109002/diff/347001/tests/git_number_test.py... tests/git_number_test.py:17: On 2013/11/08 19:32:23, M-A Ruel wrote: > 2 lines Done. https://codereview.chromium.org/26109002/diff/347001/tests/git_number_test.py... tests/git_number_test.py:34: cls.gn.POOL_KIND = 'procs' On 2013/11/08 19:32:23, M-A Ruel wrote: > I'd prefer you to cache the old value before line 30 and put it back here. Done.
(PTAL; I think this is ready to go, unless anyone has more comments? :)
I'm not warm to 'git-number' name, I'd prefer git-pseudorev or something like that. But I don't mind that much. https://chromiumcodereview.appspot.com/26109002/diff/467001/git_common.py File git_common.py (right): https://chromiumcodereview.appspot.com/26109002/diff/467001/git_common.py#new... git_common.py:62: if threadsafe: Why make it optional? You think this has a significant performance impact? I doubt so. I'd prefer for safer by default. https://chromiumcodereview.appspot.com/26109002/diff/467001/git_common.py#new... git_common.py:115: def ScopedPool(*args, **kwargs): Why not use kind=None here? https://chromiumcodereview.appspot.com/26109002/diff/467001/tests/git_common_... File tests/git_common_test.py (right): https://chromiumcodereview.appspot.com/26109002/diff/467001/tests/git_common_... tests/git_common_test.py:68: 2 lines
On 2013/11/12 21:13:40, M-A Ruel wrote: > I'm not warm to 'git-number' name, I'd prefer git-pseudorev or something like > that. But I don't mind that much. Hm. I think pseudorev will imply that it behaves like svn's revision, which it doesn't. What about 'git-gen-num' aka `git gen-num HEAD~100` ? > > https://chromiumcodereview.appspot.com/26109002/diff/467001/git_common.py > File git_common.py (right): > > https://chromiumcodereview.appspot.com/26109002/diff/467001/git_common.py#new... > git_common.py:62: if threadsafe: > Why make it optional? You think this has a significant performance impact? I > doubt so. I'd prefer for safer by default. The thing is that the GIL already locks the cache object accesses anyway in CPython, so the only thing the lock gets you is the same behavior on non-CPython. And if you want to hold the lock for the duration of the function invocation to prevent multiple threads from calculating the same value, then that will definitely slow things down. > > https://chromiumcodereview.appspot.com/26109002/diff/467001/git_common.py#new... > git_common.py:115: def ScopedPool(*args, **kwargs): > Why not use kind=None here? > Because Python 2.7 is dumb and I can't do def foo(*args, thing=None, **kwargs) And I didn't want to statically replicate multiprocessing.Pool's __init__ signature. > https://chromiumcodereview.appspot.com/26109002/diff/467001/tests/git_common_... > File tests/git_common_test.py (right): > > https://chromiumcodereview.appspot.com/26109002/diff/467001/tests/git_common_... > tests/git_common_test.py:68: > 2 lines Done.
Can I land this yet? :(
Sorry for the delays, I'm slow at larger code dumps (it's over 1400 lines). On 2013/11/12 21:28:49, iannucci wrote: > On 2013/11/12 21:13:40, M-A Ruel wrote: > > I'm not warm to 'git-number' name, I'd prefer git-pseudorev or something like > > that. But I don't mind that much. > > Hm. I think pseudorev will imply that it behaves like svn's revision, which it > doesn't. What about 'git-gen-num' aka `git gen-num HEAD~100` ? Fine with git-number. Not a fan but I'll survive. > The thing is that the GIL already locks the cache object accesses anyway in > CPython, so the only thing the lock gets you is the same behavior on > non-CPython. > > And if you want to hold the lock for the duration of the function invocation to > prevent multiple threads from calculating the same value, then that will > definitely slow things down. No, if your function is idempotent, and it has to, to have its return value cached, then it's fine to call the function concurrently. Only updating the dict should be locked.
On 2013/11/13 18:10:50, M-A Ruel wrote: > Sorry for the delays, I'm slow at larger code dumps (it's over 1400 lines). Yeah me to, that's kinda why I want to land it already so subsequent changes can be smaller :D > > On 2013/11/12 21:28:49, iannucci wrote: > > On 2013/11/12 21:13:40, M-A Ruel wrote: > > > I'm not warm to 'git-number' name, I'd prefer git-pseudorev or something > like > > > that. But I don't mind that much. > > > > Hm. I think pseudorev will imply that it behaves like svn's revision, which it > > doesn't. What about 'git-gen-num' aka `git gen-num HEAD~100` ? > > Fine with git-number. Not a fan but I'll survive. > > > > The thing is that the GIL already locks the cache object accesses anyway in > > CPython, so the only thing the lock gets you is the same behavior on > > non-CPython. > > > > And if you want to hold the lock for the duration of the function invocation > to > > prevent multiple threads from calculating the same value, then that will > > definitely slow things down. > > No, if your function is idempotent, and it has to, to have its return value > cached, then it's fine to call the function concurrently. Only updating the dict > should be locked. Ok, fine, I'll take a benchmark and if it doesn't slow things down then I'll just make it threadsafe :)
On 2013/11/13 18:19:19, iannucci wrote: > On 2013/11/13 18:10:50, M-A Ruel wrote: > > Sorry for the delays, I'm slow at larger code dumps (it's over 1400 lines). > > Yeah me to, that's kinda why I want to land it already so subsequent changes can > be smaller :D > > > > > On 2013/11/12 21:28:49, iannucci wrote: > > > On 2013/11/12 21:13:40, M-A Ruel wrote: > > > > I'm not warm to 'git-number' name, I'd prefer git-pseudorev or something > > like > > > > that. But I don't mind that much. > > > > > > Hm. I think pseudorev will imply that it behaves like svn's revision, which > it > > > doesn't. What about 'git-gen-num' aka `git gen-num HEAD~100` ? > > > > Fine with git-number. Not a fan but I'll survive. > > > > > > > The thing is that the GIL already locks the cache object accesses anyway in > > > CPython, so the only thing the lock gets you is the same behavior on > > > non-CPython. > > > > > > And if you want to hold the lock for the duration of the function invocation > > to > > > prevent multiple threads from calculating the same value, then that will > > > definitely slow things down. > > > > No, if your function is idempotent, and it has to, to have its return value > > cached, then it's fine to call the function concurrently. Only updating the > dict > > should be locked. > > Ok, fine, I'll take a benchmark and if it doesn't slow things down then I'll > just make it threadsafe :) So, it looks like it adds about 2.5 seconds (from 5.5 to 8) for the initial calculation. I can make the default to be True, and then set it false for this application, but I would rather have the choice be explicit and deliberate either way. Besides, default arguments on decorators are tricky because then @deco # seems like it should be valid. @deco() # but actually, it should be this.
https://codereview.chromium.org/26109002/diff/547001/git_common.py File git_common.py (right): https://codereview.chromium.org/26109002/diff/547001/git_common.py#newcode143 git_common.py:143: class ProgressPrinter(object): FTR, I prefer https://code.google.com/p/swarming/source/browse/utils/threading_utils.py?rep... which is very versatile but it's fine here. https://codereview.chromium.org/26109002/diff/547001/git_common.py#newcode180 git_common.py:180: self._stream.write('\r'+s) self._stream.write('\r' + s) https://codereview.chromium.org/26109002/diff/547001/git_common.py#newcode217 git_common.py:217: str(commitrefs)) str() is not needed by definition. https://codereview.chromium.org/26109002/diff/547001/git_common.py#newcode220 git_common.py:220: def _check_output(*popenargs, **kwargs): Why not subprocess2.check_output() ? It works on 2.6. https://codereview.chromium.org/26109002/diff/547001/git_number.py File git_number.py (right): https://codereview.chromium.org/26109002/diff/547001/git_number.py#newcode148 git_number.py:148: Don't you care about updater.returncode? https://codereview.chromium.org/26109002/diff/547001/git_number.py#newcode169 git_number.py:169: yield x+r x+ r https://codereview.chromium.org/26109002/diff/547001/git_number.py#newcode175 git_number.py:175: """Load the generation numbers for targets. Calculates missing numbers if """Loads the generation numbers for targets. Calculates missing numbers if one or more of the targets is past the cached calculations. """ But I don't think it's really "load", it's more "generate". https://codereview.chromium.org/26109002/diff/547001/git_number.py#newcode221 git_number.py:221: def git_number(do_reset, do_cache, target_refs): I'd prefer a oneliner docstring https://codereview.chromium.org/26109002/diff/547001/git_number.py#newcode235 git_number.py:235: def main(): # pragma: no cover You could mock git_number and test it! :P I'm kidding btw. https://codereview.chromium.org/26109002/diff/547001/git_number.py#newcode255 git_number.py:255: pass return 1
https://codereview.chromium.org/26109002/diff/547001/git_common.py File git_common.py (right): https://codereview.chromium.org/26109002/diff/547001/git_common.py#newcode143 git_common.py:143: class ProgressPrinter(object): On 2013/11/13 19:55:00, M-A Ruel wrote: > FTR, I prefer > https://code.google.com/p/swarming/source/browse/utils/threading_utils.py?rep... > which is very versatile but it's fine here. Nice :D. Yeah we should have a standard library of this stuff... testing harnesses and subprocess42-type utilities are other good candidates IMO. Right now we have like... 5 separate standard libraries :( https://codereview.chromium.org/26109002/diff/547001/git_common.py#newcode180 git_common.py:180: self._stream.write('\r'+s) On 2013/11/13 19:55:00, M-A Ruel wrote: > self._stream.write('\r' + s) d'oh. Why doesn't our presubmit barf on this? https://codereview.chromium.org/26109002/diff/547001/git_common.py#newcode217 git_common.py:217: str(commitrefs)) On 2013/11/13 19:55:00, M-A Ruel wrote: > str() is not needed by definition. Well... it is because commitrefs is a tuple. I don't like the trimodal % typing characteristics :/ https://codereview.chromium.org/26109002/diff/547001/git_common.py#newcode220 git_common.py:220: def _check_output(*popenargs, **kwargs): On 2013/11/13 19:55:00, M-A Ruel wrote: > Why not subprocess2.check_output() ? It works on 2.6. Good question. I'll use that. https://codereview.chromium.org/26109002/diff/547001/git_number.py File git_number.py (right): https://codereview.chromium.org/26109002/diff/547001/git_number.py#newcode148 git_number.py:148: On 2013/11/13 19:55:00, M-A Ruel wrote: > Don't you care about updater.returncode? added assert https://codereview.chromium.org/26109002/diff/547001/git_number.py#newcode169 git_number.py:169: yield x+r On 2013/11/13 19:55:00, M-A Ruel wrote: > x+ r Done. https://codereview.chromium.org/26109002/diff/547001/git_number.py#newcode175 git_number.py:175: """Load the generation numbers for targets. Calculates missing numbers if On 2013/11/13 19:55:00, M-A Ruel wrote: > """Loads the generation numbers for targets. > > Calculates missing numbers if one or more of the targets is past the cached > calculations. > """ > > But I don't think it's really "load", it's more "generate". PTAL https://codereview.chromium.org/26109002/diff/547001/git_number.py#newcode221 git_number.py:221: def git_number(do_reset, do_cache, target_refs): On 2013/11/13 19:55:00, M-A Ruel wrote: > I'd prefer a oneliner docstring I nuked the function because it was too multipurpose and hard do describe. https://codereview.chromium.org/26109002/diff/547001/git_number.py#newcode235 git_number.py:235: def main(): # pragma: no cover On 2013/11/13 19:55:00, M-A Ruel wrote: > You could mock git_number and test it! :P > > I'm kidding btw. :D https://codereview.chromium.org/26109002/diff/547001/git_number.py#newcode255 git_number.py:255: pass On 2013/11/13 19:55:00, M-A Ruel wrote: > return 1 Done.
https://codereview.chromium.org/26109002/diff/547001/git_common.py File git_common.py (right): https://codereview.chromium.org/26109002/diff/547001/git_common.py#newcode180 git_common.py:180: self._stream.write('\r'+s) On 2013/11/14 07:06:29, iannucci wrote: > On 2013/11/13 19:55:00, M-A Ruel wrote: > > self._stream.write('\r' + s) > > d'oh. Why doesn't our presubmit barf on this? No idea. https://codereview.chromium.org/26109002/diff/637001/git_common.py File git_common.py (right): https://codereview.chromium.org/26109002/diff/637001/git_common.py#newcode22 git_common.py:22: import subprocess2 put it in a separate block since it is not stdlib. https://codereview.chromium.org/26109002/diff/637001/git_number.py File git_number.py (right): https://codereview.chromium.org/26109002/diff/637001/git_number.py#newcode110 git_number.py:110: return git.intern_f(f) I don't think it'll work on Windows. This is because TemporaryFile opens the file in non-shareable mode (!) on Windows. Please confirm. https://codereview.chromium.org/26109002/diff/637001/git_number.py#newcode187 git_number.py:187: if all(get_num(t) is not None for t in targets): This is somewhat dangerous. If someone uses a generator as an import to targets, it'll be exhausted here and line 207 will not be executed, or at least the first None item will be silently skipped. https://codereview.chromium.org/26109002/diff/637001/testing_support/coverage... File testing_support/coverage_utils.py (right): https://codereview.chromium.org/26109002/diff/637001/testing_support/coverage... testing_support/coverage_utils.py:23: os.path.dirname(os.path.dirname(__file__)), 'third_party'))) I still prefer this to be a global variable. https://codereview.chromium.org/26109002/diff/637001/testing_support/git_test... File testing_support/git_test_utils.py (right): https://codereview.chromium.org/26109002/diff/637001/testing_support/git_test... testing_support/git_test_utils.py:22: assert typ == 'blob', "Only support blobs for now" Use single quotes. https://codereview.chromium.org/26109002/diff/637001/testing_support/git_test... testing_support/git_test_utils.py:160: def __init__(self, repo_schema='', I prefer __init__ to always be the first method. https://codereview.chromium.org/26109002/diff/637001/tests/git_common_test.py File tests/git_common_test.py (right): https://codereview.chromium.org/26109002/diff/637001/tests/git_common_test.py... tests/git_common_test.py:152: class GitReadOnlyFunctionsTest(git_test_utils.GitRepoReadOnlyTestBase, You are inheriting from unittest.TestCase twice here. You likely want to use a MixIn class instead.
https://codereview.chromium.org/26109002/diff/637001/git_common.py File git_common.py (right): https://codereview.chromium.org/26109002/diff/637001/git_common.py#newcode22 git_common.py:22: import subprocess2 On 2013/11/14 17:27:24, M-A Ruel wrote: > put it in a separate block since it is not stdlib. Oh, right. https://codereview.chromium.org/26109002/diff/637001/git_number.py File git_number.py (right): https://codereview.chromium.org/26109002/diff/637001/git_number.py#newcode110 git_number.py:110: return git.intern_f(f) On 2013/11/14 17:27:24, M-A Ruel wrote: > I don't think it'll work on Windows. This is because TemporaryFile opens the > file in non-shareable mode (!) on Windows. Please confirm. Totally works :) https://codereview.chromium.org/26109002/diff/637001/git_number.py#newcode187 git_number.py:187: if all(get_num(t) is not None for t in targets): On 2013/11/14 17:27:24, M-A Ruel wrote: > This is somewhat dangerous. If someone uses a generator as an import to targets, > it'll be exhausted here and line 207 will not be executed, or at least the first > None item will be silently skipped. hm, good point. I'll list-ify it. https://codereview.chromium.org/26109002/diff/637001/testing_support/coverage... File testing_support/coverage_utils.py (right): https://codereview.chromium.org/26109002/diff/637001/testing_support/coverage... testing_support/coverage_utils.py:23: os.path.dirname(os.path.dirname(__file__)), 'third_party'))) On 2013/11/14 17:27:24, M-A Ruel wrote: > I still prefer this to be a global variable. Aw, nuts, I thought I did that. https://codereview.chromium.org/26109002/diff/637001/testing_support/git_test... File testing_support/git_test_utils.py (right): https://codereview.chromium.org/26109002/diff/637001/testing_support/git_test... testing_support/git_test_utils.py:22: assert typ == 'blob', "Only support blobs for now" On 2013/11/14 17:27:24, M-A Ruel wrote: > Use single quotes. Done. https://codereview.chromium.org/26109002/diff/637001/testing_support/git_test... testing_support/git_test_utils.py:160: def __init__(self, repo_schema='', On 2013/11/14 17:27:24, M-A Ruel wrote: > I prefer __init__ to always be the first method. Done. https://codereview.chromium.org/26109002/diff/637001/tests/git_common_test.py File tests/git_common_test.py (right): https://codereview.chromium.org/26109002/diff/637001/tests/git_common_test.py... tests/git_common_test.py:152: class GitReadOnlyFunctionsTest(git_test_utils.GitRepoReadOnlyTestBase, On 2013/11/14 17:27:24, M-A Ruel wrote: > You are inheriting from unittest.TestCase twice here. You likely want to use a > MixIn class instead. Hm... setUpClass chains on super.setUpClass though. I think the diamond inheritance works here. Unless you have a simple pattern that would work?
https://codereview.chromium.org/26109002/diff/757001/git_common.py File git_common.py (right): https://codereview.chromium.org/26109002/diff/757001/git_common.py#newcode205 git_common.py:205: raise Exception('one of %s does not seem to be a valid commitref.' % Why not use a cute exception? Exception is hard to trap. You could return None. Really optional. https://codereview.chromium.org/26109002/diff/757001/git_number.py File git_number.py (right): https://codereview.chromium.org/26109002/diff/757001/git_number.py#newcode241 git_number.py:241: logging.getLogger().setLevel(logging.INFO) why not use: levels = [logging.ERROR, logging.INFO, logging.DEBUG] logging.basicConfig(level=levels[max(opts.verbose, len(levels)-1]) https://codereview.chromium.org/26109002/diff/757001/git_number.py#newcode250 git_number.py:250: targets = git.parse_commitrefs(*(args or ['HEAD'])) You could do management here instead; if not targets: parser.error('Yo dawg, wrong commitref')
https://codereview.chromium.org/26109002/diff/757001/git_common.py File git_common.py (right): https://codereview.chromium.org/26109002/diff/757001/git_common.py#newcode205 git_common.py:205: raise Exception('one of %s does not seem to be a valid commitref.' % On 2013/11/15 02:51:29, M-A Ruel wrote: > Why not use a cute exception? Exception is hard to trap. You could return None. > Really optional. Done. https://codereview.chromium.org/26109002/diff/757001/git_number.py File git_number.py (right): https://codereview.chromium.org/26109002/diff/757001/git_number.py#newcode241 git_number.py:241: logging.getLogger().setLevel(logging.INFO) On 2013/11/15 02:51:29, M-A Ruel wrote: > why not use: > levels = [logging.ERROR, logging.INFO, logging.DEBUG] > logging.basicConfig(level=levels[max(opts.verbose, len(levels)-1]) Done (except it should be min() :)) https://codereview.chromium.org/26109002/diff/757001/git_number.py#newcode250 git_number.py:250: targets = git.parse_commitrefs(*(args or ['HEAD'])) On 2013/11/15 02:51:29, M-A Ruel wrote: > You could do management here instead; > if not targets: > parser.error('Yo dawg, wrong commitref') Did both
What do I need to do to land this? I'm getting antsy :)
Getting near. Your code will be perfect. :) https://codereview.chromium.org/26109002/diff/797002/git_common.py File git_common.py (right): https://codereview.chromium.org/26109002/diff/797002/git_common.py#newcode277 git_common.py:277: opts += ['-r'] I prefer append() Especially that you use append on the next line, inconsistency. https://codereview.chromium.org/26109002/diff/797002/git_number.py File git_number.py (right): https://codereview.chromium.org/26109002/diff/797002/git_number.py#newcode155 git_number.py:155: commit_cmd += ['-p', binascii.hexlify(t)] append especially that the next line is using append https://codereview.chromium.org/26109002/diff/797002/git_number.py#newcode196 git_number.py:196: empty) wrong alignment https://codereview.chromium.org/26109002/diff/797002/git_number.py#newcode209 git_number.py:209: 'rev-list', '--topo-order', '--parents', '--reverse', '^' + REF Always keep a comma at the end when it's not terminated with ] https://codereview.chromium.org/26109002/diff/797002/git_number.py#newcode257 git_number.py:257: print '\n'.join(str(get_num(x)) for x in targets) Sometimes you do map, sometimes you use a generator. I'd prefer to be consistent. For example here it could be: print '\n'.join(map(str(map(get_num, targets))) https://codereview.chromium.org/26109002/diff/797002/testing_support/git_test... File testing_support/git_test_utils.py (right): https://codereview.chromium.org/26109002/diff/797002/testing_support/git_test... testing_support/git_test_utils.py:38: def __eq__(self, other): It's generally dangerous to implement __eq__ without __ne__. Maybe MutableSet does it for you, I'm not familiar with it. https://codereview.chromium.org/26109002/diff/797002/testing_support/git_test... testing_support/git_test_utils.py:146: if empty_keys: I'd prefer: assert empty_keys, 'Cycle detected! %s' % par_map and remove the condition https://codereview.chromium.org/26109002/diff/797002/testing_support/git_test... testing_support/git_test_utils.py:174: if parent and self.master is None: self.master == '' is a valid value? https://codereview.chromium.org/26109002/diff/797002/testing_support/git_test... testing_support/git_test_utils.py:184: """Method to obtain data for a commit. Obtains the data for a commit. https://codereview.chromium.org/26109002/diff/797002/tests/git_common_test.py File tests/git_common_test.py (right): https://codereview.chromium.org/26109002/diff/797002/tests/git_common_test.py... tests/git_common_test.py:39: else: lines 39 and 40 are not strictly needed, I'd drop. https://codereview.chromium.org/26109002/diff/797002/tests/git_common_test.py... tests/git_common_test.py:45: self.assertEqual(4, double_if_even(2)) For these kind of tests, I like: data = [ (expected, input), ... ] and then iterate over. But I don't much. In particular here I don't think it'd work. https://codereview.chromium.org/26109002/diff/797002/tests/git_common_test.py... tests/git_common_test.py:134: # This test is probably racy, but I don't have a better alternative. Put comment inside the function. https://codereview.chromium.org/26109002/diff/797002/tests/git_common_test.py... tests/git_common_test.py:149: self.assertGreaterEqual(stream.count, 10) Glad I'm upgrading to 2.7? :)
https://chromiumcodereview.appspot.com/26109002/diff/797002/git_common.py File git_common.py (right): https://chromiumcodereview.appspot.com/26109002/diff/797002/git_common.py#new... git_common.py:277: opts += ['-r'] On 2013/11/15 22:37:06, M-A Ruel wrote: > I prefer append() > Especially that you use append on the next line, inconsistency. Done. https://chromiumcodereview.appspot.com/26109002/diff/797002/git_number.py File git_number.py (right): https://chromiumcodereview.appspot.com/26109002/diff/797002/git_number.py#new... git_number.py:155: commit_cmd += ['-p', binascii.hexlify(t)] On 2013/11/15 22:37:06, M-A Ruel wrote: > append > especially that the next line is using append extend :) done. https://chromiumcodereview.appspot.com/26109002/diff/797002/git_number.py#new... git_number.py:196: empty) On 2013/11/15 22:37:06, M-A Ruel wrote: > wrong alignment Done. https://chromiumcodereview.appspot.com/26109002/diff/797002/git_number.py#new... git_number.py:209: 'rev-list', '--topo-order', '--parents', '--reverse', '^' + REF On 2013/11/15 22:37:06, M-A Ruel wrote: > Always keep a comma at the end when it's not terminated with ] Done. https://chromiumcodereview.appspot.com/26109002/diff/797002/git_number.py#new... git_number.py:257: print '\n'.join(str(get_num(x)) for x in targets) On 2013/11/15 22:37:06, M-A Ruel wrote: > Sometimes you do map, sometimes you use a generator. I'd prefer to be > consistent. For example here it could be: > print '\n'.join(map(str(map(get_num, targets))) Yeah... this was an obvious one. Done. https://chromiumcodereview.appspot.com/26109002/diff/797002/testing_support/g... File testing_support/git_test_utils.py (right): https://chromiumcodereview.appspot.com/26109002/diff/797002/testing_support/g... testing_support/git_test_utils.py:38: def __eq__(self, other): On 2013/11/15 22:37:06, M-A Ruel wrote: > It's generally dangerous to implement __eq__ without __ne__. Maybe MutableSet > does it for you, I'm not familiar with it. Budunno, I airlifted this from the activestate recipe. I'll fix it. https://chromiumcodereview.appspot.com/26109002/diff/797002/testing_support/g... testing_support/git_test_utils.py:146: if empty_keys: On 2013/11/15 22:37:06, M-A Ruel wrote: > I'd prefer: > assert empty_keys, 'Cycle detected! %s' % par_map > and remove the condition Oh, yeah, good call. That's one of my peeves :( Almost as bad as if True: return 1 else: return 0 https://chromiumcodereview.appspot.com/26109002/diff/797002/testing_support/g... testing_support/git_test_utils.py:174: if parent and self.master is None: On 2013/11/15 22:37:06, M-A Ruel wrote: > self.master == '' is a valid value? No, it's the initial state. It's also not possible to assign it ''. Fixed anyway for brevity :) https://chromiumcodereview.appspot.com/26109002/diff/797002/testing_support/g... testing_support/git_test_utils.py:184: """Method to obtain data for a commit. On 2013/11/15 22:37:06, M-A Ruel wrote: > Obtains the data for a commit. Done. https://chromiumcodereview.appspot.com/26109002/diff/797002/tests/git_common_... File tests/git_common_test.py (right): https://chromiumcodereview.appspot.com/26109002/diff/797002/tests/git_common_... tests/git_common_test.py:39: else: On 2013/11/15 22:37:06, M-A Ruel wrote: > lines 39 and 40 are not strictly needed, I'd drop. fair enough :). I like being explicit though, so made this an inline if. https://chromiumcodereview.appspot.com/26109002/diff/797002/tests/git_common_... tests/git_common_test.py:45: self.assertEqual(4, double_if_even(2)) On 2013/11/15 22:37:06, M-A Ruel wrote: > For these kind of tests, I like: > data = [ > (expected, input), > ... > ] > > and then iterate over. But I don't much. In particular here I don't think it'd > work. Yeah I think it would be awkward here. Plus, the stacktrace is helpful on breakage and gets weird when looping over data :( https://chromiumcodereview.appspot.com/26109002/diff/797002/tests/git_common_... tests/git_common_test.py:134: # This test is probably racy, but I don't have a better alternative. On 2013/11/15 22:37:06, M-A Ruel wrote: > Put comment inside the function. What?! Crazy... :) Done. https://chromiumcodereview.appspot.com/26109002/diff/797002/tests/git_common_... tests/git_common_test.py:149: self.assertGreaterEqual(stream.count, 10) On 2013/11/15 22:37:06, M-A Ruel wrote: > Glad I'm upgrading to 2.7? :) Very!!! :D
Last series of issues. lgtm after that. https://codereview.chromium.org/26109002/diff/897001/git_common.py File git_common.py (right): https://codereview.chromium.org/26109002/diff/897001/git_common.py#newcode226 git_common.py:226: logging.debug('running: %s', ' '.join(repr(tok) for tok in cmd)) map(repr, cmd) ? (kidding) 'Running ...' though. https://codereview.chromium.org/26109002/diff/897001/git_number.py File git_number.py (right): https://codereview.chromium.org/26109002/diff/897001/git_number.py#newcode84 git_number.py:84: the commit hash doesn't have a computed value yet.""" """Returns the generation number for a commit. Returns None if the commit hash doesn't have a computed value yet. """ much clearer very explicit so docstring https://codereview.chromium.org/26109002/diff/897001/tests/git_common_test.py File tests/git_common_test.py (right): https://codereview.chromium.org/26109002/diff/897001/tests/git_common_test.py... tests/git_common_test.py:20: from testing_support import git_test_utils sort https://codereview.chromium.org/26109002/diff/897001/tests/git_common_test.py... tests/git_common_test.py:80: CTRL_C = signal.SIGINT CTRL_C = signal.CTRL_C_EVENT if sys.platform == 'win32' else signal.SIGINT 'win64' is never going to be a platform. I can't find the relevant bug report on bugs.python.org though. https://codereview.chromium.org/26109002/diff/897001/tests/git_common_test.py... tests/git_common_test.py:85: for i in pool.imap(slow_square, xrange(10)): result = list(pool.imap(slow_square, xrange(10))) https://codereview.chromium.org/26109002/diff/897001/tests/git_common_test.py... tests/git_common_test.py:94: for i in pool.imap(slow_square, xrange(1000000)): One thing I prefer is to use a list of threading.Event, then signal half og the items in the list. This is faster and more deterministic than using sleep. I'd highly prefer if this whole test executable didn't use time.sleep() at all. https://codereview.chromium.org/26109002/diff/897001/tests/git_common_test.py... tests/git_common_test.py:103: for i in pool.imap(slow_square, xrange(10)): same https://codereview.chromium.org/26109002/diff/897001/tests/git_number_test.py File tests/git_number_test.py (right): https://codereview.chromium.org/26109002/diff/897001/tests/git_number_test.py... tests/git_number_test.py:59: ) Not my favorite layout, in general I prefer: self.assertEqual( None, self.repo.run(self.gn.get_num, binascii.unhexlify(self.repo['A']))) In particular for an unknown reason, you are aligning at +2.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/iannucci@chromium.org/26109002/957001
https://chromiumcodereview.appspot.com/26109002/diff/897001/git_common.py File git_common.py (right): https://chromiumcodereview.appspot.com/26109002/diff/897001/git_common.py#new... git_common.py:226: logging.debug('running: %s', ' '.join(repr(tok) for tok in cmd)) On 2013/11/17 19:54:58, M-A Ruel wrote: > map(repr, cmd) ? > > (kidding) > > 'Running ...' though. Ha. Yeah, I've been warned away from map(), but I tend to think functionally... /me should be doing more haskell. Fixed string tho https://chromiumcodereview.appspot.com/26109002/diff/897001/git_number.py File git_number.py (right): https://chromiumcodereview.appspot.com/26109002/diff/897001/git_number.py#new... git_number.py:84: the commit hash doesn't have a computed value yet.""" On 2013/11/17 19:54:58, M-A Ruel wrote: > """Returns the generation number for a commit. > > Returns None if the commit hash doesn't have a computed value yet. > """ > > much clearer very explicit so docstring great comment so wow much clarity https://chromiumcodereview.appspot.com/26109002/diff/897001/tests/git_common_... File tests/git_common_test.py (right): https://chromiumcodereview.appspot.com/26109002/diff/897001/tests/git_common_... tests/git_common_test.py:20: from testing_support import git_test_utils On 2013/11/17 19:54:58, M-A Ruel wrote: > sort Dangit! Why don't we have pylint do this.... https://chromiumcodereview.appspot.com/26109002/diff/897001/tests/git_common_... tests/git_common_test.py:80: CTRL_C = signal.SIGINT On 2013/11/17 19:54:58, M-A Ruel wrote: > CTRL_C = signal.CTRL_C_EVENT if sys.platform == 'win32' else signal.SIGINT > > 'win64' is never going to be a platform. I can't find the relevant bug report on > http://bugs.python.org though. Yeah, I know... it's weird. Done. https://chromiumcodereview.appspot.com/26109002/diff/897001/tests/git_common_... tests/git_common_test.py:85: for i in pool.imap(slow_square, xrange(10)): On 2013/11/17 19:54:58, M-A Ruel wrote: > result = list(pool.imap(slow_square, xrange(10))) Lol, done. https://chromiumcodereview.appspot.com/26109002/diff/897001/tests/git_common_... tests/git_common_test.py:94: for i in pool.imap(slow_square, xrange(1000000)): On 2013/11/17 19:54:58, M-A Ruel wrote: > One thing I prefer is to use a list of threading.Event, then signal half og the > items in the list. This is faster and more deterministic than using sleep. I'd > highly prefer if this whole test executable didn't use time.sleep() at all. Tried it. Surprise! Python signal-handling is broken: http://bytes.com/topic/python/answers/802572-waiting-event-blocks-all-signals https://chromiumcodereview.appspot.com/26109002/diff/897001/tests/git_common_... tests/git_common_test.py:103: for i in pool.imap(slow_square, xrange(10)): On 2013/11/17 19:54:58, M-A Ruel wrote: > same Done. https://chromiumcodereview.appspot.com/26109002/diff/897001/tests/git_number_... File tests/git_number_test.py (right): https://chromiumcodereview.appspot.com/26109002/diff/897001/tests/git_number_... tests/git_number_test.py:59: ) On 2013/11/17 19:54:58, M-A Ruel wrote: > Not my favorite layout, in general I prefer: > self.assertEqual( > None, > self.repo.run(self.gn.get_num, binascii.unhexlify(self.repo['A']))) > > In particular for an unknown reason, you are aligning at +2. Done. TBH, I'm unclear on the +4 rule. That's any time you're doing any sort of () or {} or []? Also, why +4 when the indent size is +2? Google style guide is +4, but their normal indent is also +4, I assumed the two were linked?
Presubmit check for 26109002-957001 failed and returned exit status 1. Running presubmit commit checks ... Checking out rietveld... Running save-description-on-failure.sh Running push-basic.sh Running upstream.sh Running submit-from-new-dir.sh Running abandon.sh Running submodule-merge-test.sh Running upload-local-tracking-branch.sh Running hooks.sh Running post-dcommit-hook-test.sh Running upload-stale.sh Running patch.sh Running basic.sh ** Presubmit ERRORS ** tests/git_number_test.py failed Exception in thread Thread-18: Traceback (most recent call last): File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner self.run() File "/usr/lib/python2.7/threading.py", line 504, in run self.__target(*self.__args, **self.__kwargs) File "/usr/lib/python2.7/multiprocessing/pool.py", line 345, in _handle_tasks debug('task handler exiting') File "/b/commit-queue/workdir/tools/depot_tools/third_party/coverage/collector.py", line 113, in _trace self.cur_file_data, self.last_line = self.data_stack.pop() IndexError: pop from empty list ... ---------------------------------------------------------------------- Ran 3 tests in 7.516s OK Name Stmts Miss Cover Missing ------------------------------------------ git_number 99 1 99% 177 FATAL: not at 100% coverage. Presubmit checks took 87.8s to calculate. Was the presubmit check useful? Please send feedback & hate mail to maruel@chromium.org!
Aaand... that's a race in the coverage module (local repro, at least). Doesn't seem to happen with the C version of the module.
On 2013/11/18 05:50:29, iannucci wrote: > Aaand... that's a race in the coverage module (local repro, at least). Doesn't > seem to happen with the C version of the module. WDYT about scrapping the pure-python coverage module, keeping the coverage testing support as-is, but just print a warning if they don't have the C version of the module installed? This should keep us honest (local presubmits will use the system version), but won't make the CQ run w/ coverage. The C version seems to be better maintained, probably because it's just absurdly faster so no one wants the pure-python one. Alternately I could install the C module on the CQ...
Ok, added version checking for the coverage module, so we can optionally have tests require a the native tracer with at least a given version. I'm going to CQ it to make sure the CQ can't run it yet (haven't installed it on cq), then I'll install and cq it to get this thing landed already.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/iannucci@chromium.org/26109002/1037001
Presubmit check for 26109002-1037001 failed and returned exit status 1. Running presubmit commit checks ... Checking out rietveld... Running save-description-on-failure.sh Running push-basic.sh Running upstream.sh Running submit-from-new-dir.sh Running abandon.sh Running submodule-merge-test.sh Running upload-local-tracking-branch.sh Running hooks.sh Running post-dcommit-hook-test.sh Running upload-stale.sh Running patch.sh Running basic.sh ** Presubmit ERRORS ** tests/git_number_test.py failed ERROR: python-coverage (3.7) is required to be installed on your PYTHONPATH to run this test. Presubmit checks took 89.3s to calculate.
> > ** Presubmit ERRORS ** > tests/git_number_test.py failed > ERROR: python-coverage (3.7) is required to be installed on your PYTHONPATH to > run this test. > > > Presubmit checks took 89.3s to calculate. Booyeah :)
On 2013/11/18 05:54:37, iannucci wrote: > On 2013/11/18 05:50:29, iannucci wrote: > > Aaand... that's a race in the coverage module (local repro, at least). Doesn't > > seem to happen with the C version of the module. > > WDYT about scrapping the pure-python coverage module, keeping the coverage > testing support as-is, but just print a warning if they don't have the C version > of the module installed? This should keep us honest (local presubmits will use > the system version), but won't make the CQ run w/ coverage. The C version seems > to be better maintained, probably because it's just absurdly faster so no one > wants the pure-python one. > > Alternately I could install the C module on the CQ... In certain cases, we want the test to pass on Windows, like with depot_tools.
On 2013/11/19 14:06:01, M-A Ruel wrote: > On 2013/11/18 05:54:37, iannucci wrote: > > On 2013/11/18 05:50:29, iannucci wrote: > > > Aaand... that's a race in the coverage module (local repro, at least). > Doesn't > > > seem to happen with the C version of the module. > > > > WDYT about scrapping the pure-python coverage module, keeping the coverage > > testing support as-is, but just print a warning if they don't have the C > version > > of the module installed? This should keep us honest (local presubmits will use > > the system version), but won't make the CQ run w/ coverage. The C version > seems > > to be better maintained, probably because it's just absurdly faster so no one > > wants the pure-python one. > > > > Alternately I could install the C module on the CQ... > > In certain cases, we want the test to pass on Windows, like with depot_tools. Well, it's certainly possible to install on windows. You just need MSVS and pip.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/iannucci@chromium.org/26109002/1037001
Message was sent while issue was closed.
Change committed as 236035 |