|
|
Created:
5 years, 3 months ago by dnj Modified:
5 years, 3 months ago CC:
chromium-reviews, dpranke+depot_tools_chromium.org, iannucci+depot_tools_chromium.org Target Ref:
refs/heads/master Project:
depot_tools Visibility:
Public. |
Descriptiongsutil: Parallel-safe, specify target, add clean.
- Update "gsutil.py" to be cooperatively safe when invoked
multiple times simultaneously.
- Allow the cache directory to be overridden by the
DEPOT_TOOLS_GSUTIL_BIN_DIR environment variable.
- Add a "--clean" flag to force "gsutil.py" to do a clean download.
BUG=chromium:452497
TEST=local
- for i in `seq 1 50`; do ./gsutil.py --clean -- version&; done
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=296772
Patch Set 1 #
Total comments: 8
Patch Set 2 : Use proper tempfile, comments. #Patch Set 3 : Update variable name. #
Total comments: 6
Patch Set 4 : os.rename, temp directory on same FS. #Patch Set 5 : Cleaner w/ contextmanager. #
Total comments: 2
Patch Set 6 : Catch IOError. #Patch Set 7 : Explicitly create base directory. #Messages
Total messages: 32 (8 generated)
dnj@chromium.org changed reviewers: + andreyu@google.com, davidjames@chromium.org, hinoka@chromium.org, vapier@chromium.org
PTAL
hinoka@google.com changed reviewers: + hinoka@google.com
https://codereview.chromium.org/1346213003/diff/1/gsutil.py File gsutil.py (right): https://codereview.chromium.org/1346213003/diff/1/gsutil.py#newcode93 gsutil.py:93: cache_dir = os.path.join(target, '.cache_dir.%s' % (suffix,)) this would be more aptly called "tempdir" now? maybe also clean up lingering cache_dirs older than 1 hour. https://codereview.chromium.org/1346213003/diff/1/gsutil.py#newcode123 gsutil.py:123: bin_dir = os.environ.get('DEPOT_TOOLS_GSUTIL_BIN_DIR') get(key, default) https://codereview.chromium.org/1346213003/diff/1/gsutil.py#newcode129 gsutil.py:129: parser.add_argument('--clean', action='store_true', What do you see the use case for this being?
https://codereview.chromium.org/1346213003/diff/1/gsutil.py File gsutil.py (right): https://codereview.chromium.org/1346213003/diff/1/gsutil.py#newcode84 gsutil.py:84: suffix = '%d_%d' % (time.time(), os.getpid()) this isn't parallel safe. you cannot rely on PID's being unique (pid namespaces for example). if you need a tempdir, then use a proper library like tempfile.NamedTemporaryFile().
On 2015/09/17 19:31:00, vapier wrote: > https://codereview.chromium.org/1346213003/diff/1/gsutil.py > File gsutil.py (right): > > https://codereview.chromium.org/1346213003/diff/1/gsutil.py#newcode84 > gsutil.py:84: suffix = '%d_%d' % (time.time(), os.getpid()) > this isn't parallel safe. you cannot rely on PID's being unique (pid namespaces > for example). if you need a tempdir, then use a proper library like > tempfile.NamedTemporaryFile(). (note that tempfile.NamedTemporaryFile() doesn't quite work right on windows)
https://codereview.chromium.org/1346213003/diff/1/gsutil.py File gsutil.py (right): https://codereview.chromium.org/1346213003/diff/1/gsutil.py#newcode84 gsutil.py:84: suffix = '%d_%d' % (time.time(), os.getpid()) On 2015/09/17 19:31:00, vapier wrote: > this isn't parallel safe. you cannot rely on PID's being unique (pid namespaces > for example). if you need a tempdir, then use a proper library like > tempfile.NamedTemporaryFile(). Using "tempfile.mkdtemp" now. https://codereview.chromium.org/1346213003/diff/1/gsutil.py#newcode93 gsutil.py:93: cache_dir = os.path.join(target, '.cache_dir.%s' % (suffix,)) On 2015/09/17 19:26:45, Ryan Tseng wrote: > this would be more aptly called "tempdir" now? > > maybe also clean up lingering cache_dirs older than 1 hour. Acknowledged. https://codereview.chromium.org/1346213003/diff/1/gsutil.py#newcode123 gsutil.py:123: bin_dir = os.environ.get('DEPOT_TOOLS_GSUTIL_BIN_DIR') On 2015/09/17 19:26:45, Ryan Tseng wrote: > get(key, default) Done. https://codereview.chromium.org/1346213003/diff/1/gsutil.py#newcode129 gsutil.py:129: parser.add_argument('--clean', action='store_true', On 2015/09/17 19:26:45, Ryan Tseng wrote: > What do you see the use case for this being? Useful for testing massively in parallel.
Patchset #2 (id:20001) has been deleted
davidjames@google.com changed reviewers: + davidjames@google.com
https://codereview.chromium.org/1346213003/diff/60001/gsutil.py File gsutil.py (right): https://codereview.chromium.org/1346213003/diff/60001/gsutil.py#newcode89 gsutil.py:89: shutil.rmtree(cleanup_path) did you intend to catch errors from rmtree? https://codereview.chromium.org/1346213003/diff/60001/gsutil.py#newcode90 gsutil.py:90: except IOError: These can return OSError too I believe. >>> shutil.rmtree('f/f') Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/usr/lib/python2.7/shutil.py", line 239, in rmtree onerror(os.listdir, path, sys.exc_info()) File "/usr/lib/python2.7/shutil.py", line 237, in rmtree names = os.listdir(path) OSError: [Errno 13] Permission denied: 'f/f' Consider the case when multiple users (who have different permissions) are all writing to the same dir in parallel. https://codereview.chromium.org/1346213003/diff/60001/gsutil.py#newcode102 gsutil.py:102: except IOError: shutil.Error: >>> shutil.move('g/f', 'e') Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/usr/lib/python2.7/shutil.py", line 292, in move raise Error, "Destination path '%s' already exists" % real_dst shutil.Error: Destination path 'e/f' already exists OSError as well? >>> shutil.move('g/f', 'e') Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/usr/lib/python2.7/shutil.py", line 300, in move rmtree(src) File "/usr/lib/python2.7/shutil.py", line 256, in rmtree onerror(os.rmdir, path, sys.exc_info()) File "/usr/lib/python2.7/shutil.py", line 254, in rmtree os.rmdir(path) OSError: [Errno 13] Permission denied: 'g/f' You can avoid shutil.Error by using os.rename directly. os.rename also provides better guarantees for parallel-safeness. (Unfortunately, shutil.move has a crappy implementation and uses copy/delete in the case of an OSError which then makes it not as parallel-safe.)
dnj@google.com changed reviewers: + dnj@google.com
Updated, fixed issue w/ rename where "/tmp" may not share filesystem by dropping tempdir underneath the gsutil dir. https://codereview.chromium.org/1346213003/diff/60001/gsutil.py File gsutil.py (right): https://codereview.chromium.org/1346213003/diff/60001/gsutil.py#newcode89 gsutil.py:89: shutil.rmtree(cleanup_path) On 2015/09/17 20:44:31, David James wrote: > did you intend to catch errors from rmtree? Hmm, probably shouldn't. https://codereview.chromium.org/1346213003/diff/60001/gsutil.py#newcode90 gsutil.py:90: except IOError: On 2015/09/17 20:44:31, David James wrote: > These can return OSError too I believe. > > >>> shutil.rmtree('f/f') > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > File "/usr/lib/python2.7/shutil.py", line 239, in rmtree > onerror(os.listdir, path, sys.exc_info()) > File "/usr/lib/python2.7/shutil.py", line 237, in rmtree > names = os.listdir(path) > OSError: [Errno 13] Permission denied: 'f/f' > > Consider the case when multiple users (who have different permissions) are all > writing to the same dir in parallel. Multi-user, gross. I don't think that's an intended use case. Good point though; I'll go ahead and move the rmtree out of the try/except. https://codereview.chromium.org/1346213003/diff/60001/gsutil.py#newcode102 gsutil.py:102: except IOError: On 2015/09/17 20:44:31, David James wrote: > shutil.Error: > >>> shutil.move('g/f', 'e') > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > File "/usr/lib/python2.7/shutil.py", line 292, in move > raise Error, "Destination path '%s' already exists" % real_dst > shutil.Error: Destination path 'e/f' already exists > > > OSError as well? > > >>> shutil.move('g/f', 'e') > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > File "/usr/lib/python2.7/shutil.py", line 300, in move > rmtree(src) > File "/usr/lib/python2.7/shutil.py", line 256, in rmtree > onerror(os.rmdir, path, sys.exc_info()) > File "/usr/lib/python2.7/shutil.py", line 254, in rmtree > os.rmdir(path) > OSError: [Errno 13] Permission denied: 'g/f' > > > You can avoid shutil.Error by using os.rename directly. os.rename also provides > better guarantees for parallel-safeness. (Unfortunately, shutil.move has a > crappy implementation and uses copy/delete in the case of an OSError which then > makes it not as parallel-safe.) Hah did that before I read your comment. Done.
https://codereview.chromium.org/1346213003/diff/100001/gsutil.py File gsutil.py (right): https://codereview.chromium.org/1346213003/diff/100001/gsutil.py#newcode99 gsutil.py:99: except OSError: Per https://www.python.org/dev/peps/pep-3151/#confusing-set-of-os-related-exceptions , you should catch IOError anywhere you catch OSError. EnvironmentError does the trick, or (IOError, OSError). Quoting the PEP: "In fact, it is hard to think of any situation where OSError should be caught but not IOError, or the reverse."
lgtm otherwise
WDYT hinoka@, iannucci@? https://codereview.chromium.org/1346213003/diff/100001/gsutil.py File gsutil.py (right): https://codereview.chromium.org/1346213003/diff/100001/gsutil.py#newcode99 gsutil.py:99: except OSError: On 2015/09/17 21:56:50, David James wrote: > Per > https://www.python.org/dev/peps/pep-3151/#confusing-set-of-os-related-exceptions > , you should catch IOError anywhere you catch OSError. EnvironmentError does the > trick, or (IOError, OSError). > > Quoting the PEP: "In fact, it is hard to think of any situation where OSError > should be caught but not IOError, or the reverse." Done.
iannucci@chromium.org changed reviewers: + iannucci@chromium.org
I don't have much context on this script (other than the very high-level context), but it lgtm fwiw.
lgtm
On 2015/09/18 00:28:55, Ryan Tseng wrote: > lgtm Alrighty, I'll wait until tomorrow to deploy since this can affect ... a lot :)
What exactly can it affect? There's nothing changing in script's semantics unless DEPOT_TOOLS_GSUTIL_BIN_DIR is set, right? On Thu, Sep 17, 2015 at 12:04 PM, <dnj@chromium.org> wrote: > Reviewers: hinoka, davidjames, vapier, andreyu, > > Message: > PTAL > > Description: > gsutil: Parallel-safe, specify target, add clean. > > - Update "gsutil.py" to be cooperatively safe when invoked > multiple times simultaneously. > - Allow the cache directory to be overridden by the > DEPOT_TOOLS_GSUTIL_BIN_DIR environment variable. > - Add a "--clean" flag to force "gsutil.py" to do a clean download. > > BUG=chromium:452497 > TEST=local > - for i in `seq 1 50`; do ./gsutil.py --clean -- version&; done > > Please review this at https://codereview.chromium.org/1346213003/ > > Base URL: svn://svn.chromium.org/chrome/trunk/tools/depot_tools > > Affected files (+39, -17 lines): > M gsutil.py > M tests/gsutil_test.py > > > Index: gsutil.py > diff --git a/gsutil.py b/gsutil.py > index > 53589a27bc48e97261d47f3d22a18b9ea1a4d88c..8a11c050651fe15dbc1ae9771357c5fc9573519f > 100755 > --- a/gsutil.py > +++ b/gsutil.py > @@ -14,6 +14,7 @@ import os > import shutil > import subprocess > import sys > +import time > import urllib2 > import zipfile > > @@ -26,7 +27,6 @@ DEFAULT_BIN_DIR = os.path.join(THIS_DIR, 'external_bin', > 'gsutil') > DEFAULT_FALLBACK_GSUTIL = os.path.join( > THIS_DIR, 'third_party', 'gsutil', 'gsutil') > > - > class InvalidGsutilError(Exception): > pass > > @@ -73,22 +73,35 @@ def check_gsutil(gsutil_bin): > [sys.executable, gsutil_bin, 'version'], > stdout=subprocess.PIPE, stderr=subprocess.STDOUT) == 0 > > -def ensure_gsutil(version, target): > +def ensure_gsutil(version, target, clean): > bin_dir = os.path.join(target, 'gsutil_%s' % version) > gsutil_bin = os.path.join(bin_dir, 'gsutil', 'gsutil') > - if os.path.isfile(gsutil_bin) and check_gsutil(gsutil_bin): > + if not clean and os.path.isfile(gsutil_bin) and > check_gsutil(gsutil_bin): > # Everything is awesome! we're all done here. > return gsutil_bin > > - if os.path.isdir(bin_dir): > - # Clean up if we're redownloading a corrupted gsutil. > - shutil.rmtree(bin_dir) > - cache_dir = os.path.join(target, '.cache_dir') > + # Clean up if we're redownloading a corrupted gsutil. > + suffix = '%d_%d' % (time.time(), os.getpid()) > + try: > + cleanup_path = '%s.%s' % (bin_dir, suffix) > + shutil.move(bin_dir, cleanup_path) > + shutil.rmtree(cleanup_path) > + except IOError: > + # Directory was not present. > + pass > + > + cache_dir = os.path.join(target, '.cache_dir.%s' % (suffix,)) > if not os.path.isdir(cache_dir): > os.makedirs(cache_dir) > target_zip_filename = download_gsutil(version, cache_dir) > with zipfile.ZipFile(target_zip_filename, 'r') as target_zip: > - target_zip.extractall(bin_dir) > + target_zip.extractall(cache_dir) > + > + try: > + shutil.move(cache_dir, bin_dir) > + except IOError: > + # Something else did this in parallel. > + pass > > # Final check that the gsutil bin is okay. This should never fail. > if not check_gsutil(gsutil_bin): > @@ -97,9 +110,9 @@ def ensure_gsutil(version, target): > return gsutil_bin > > > -def run_gsutil(force_version, fallback, target, args): > +def run_gsutil(force_version, fallback, target, args, clean=False): > if force_version: > - gsutil_bin = ensure_gsutil(force_version, target) > + gsutil_bin = ensure_gsutil(force_version, target, clean) > else: > gsutil_bin = fallback > cmd = [sys.executable, gsutil_bin] + args > @@ -107,10 +120,18 @@ def run_gsutil(force_version, fallback, target, > args): > > > def parse_args(): > + bin_dir = os.environ.get('DEPOT_TOOLS_GSUTIL_BIN_DIR') > + if not bin_dir: > + bin_dir = DEFAULT_BIN_DIR > + > parser = argparse.ArgumentParser() > parser.add_argument('--force-version', default='4.13') > + parser.add_argument('--clean', action='store_true', > + help='Clear any existing gsutil package, forcing a new download.') > parser.add_argument('--fallback', default=DEFAULT_FALLBACK_GSUTIL) > - parser.add_argument('--target', default=DEFAULT_BIN_DIR) > + parser.add_argument('--target', default=bin_dir, > + help='The target directory to download/store a gsutil version in. ' > + '(default is %(default)s).') > parser.add_argument('args', nargs=argparse.REMAINDER) > > args, extras = parser.parse_known_args() > @@ -118,12 +139,13 @@ def parse_args(): > args.args.pop(0) > if extras: > args.args = extras + args.args > - return args.force_version, args.fallback, args.target, args.args > + return args > > > def main(): > - force_version, fallback, target, args = parse_args() > - return run_gsutil(force_version, fallback, target, args) > + args = parse_args() > + return run_gsutil(args.force_version, args.fallback, args.target, > args.args, > + clean=args.clean) > > if __name__ == '__main__': > sys.exit(main()) > Index: tests/gsutil_test.py > diff --git a/tests/gsutil_test.py b/tests/gsutil_test.py > index > 76570dd31034727b4c38346ea6e84fe383f9b234..d34eebfa52e76affc1df6b6a52c0c4801d150523 > 100755 > --- a/tests/gsutil_test.py > +++ b/tests/gsutil_test.py > @@ -144,8 +144,8 @@ class GsutilUnitTests(unittest.TestCase): > > # This should delete the old bin and rewrite it with 'Fake gsutil' > self.assertRaises( > - gsutil.InvalidGsutilError, gsutil.ensure_gsutil, version, > self.tempdir) > - self.assertTrue(os.path.isdir(os.path.join(self.tempdir, > '.cache_dir'))) > + gsutil.InvalidGsutilError, gsutil.ensure_gsutil, version, > self.tempdir, > + False) > self.assertTrue(os.path.exists(gsutil_bin)) > with open(gsutil_bin, 'r') as f: > self.assertEquals(f.read(), fake_gsutil) > @@ -165,7 +165,7 @@ class GsutilUnitTests(unittest.TestCase): > with open(gsutil_bin, 'w') as f: > f.write('Foobar') > self.assertEquals( > - gsutil.ensure_gsutil(version, self.tempdir), gsutil_bin) > + gsutil.ensure_gsutil(version, self.tempdir, False), gsutil_bin) > > if __name__ == '__main__': > unittest.main() > > > To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
Yep On Thu, Sep 17, 2015 at 5:36 PM, Andrey Ulanov <andreyu@google.com> wrote: > What exactly can it affect? There's nothing changing in script's semantics > unless DEPOT_TOOLS_GSUTIL_BIN_DIR is set, right? > > On Thu, Sep 17, 2015 at 12:04 PM, <dnj@chromium.org> wrote: > >> Reviewers: hinoka, davidjames, vapier, andreyu, >> >> Message: >> PTAL >> >> Description: >> gsutil: Parallel-safe, specify target, add clean. >> >> - Update "gsutil.py" to be cooperatively safe when invoked >> multiple times simultaneously. >> - Allow the cache directory to be overridden by the >> DEPOT_TOOLS_GSUTIL_BIN_DIR environment variable. >> - Add a "--clean" flag to force "gsutil.py" to do a clean download. >> >> BUG=chromium:452497 >> TEST=local >> - for i in `seq 1 50`; do ./gsutil.py --clean -- version&; done >> >> Please review this at https://codereview.chromium.org/1346213003/ >> >> Base URL: svn://svn.chromium.org/chrome/trunk/tools/depot_tools >> >> Affected files (+39, -17 lines): >> M gsutil.py >> M tests/gsutil_test.py >> >> >> Index: gsutil.py >> diff --git a/gsutil.py b/gsutil.py >> index >> 53589a27bc48e97261d47f3d22a18b9ea1a4d88c..8a11c050651fe15dbc1ae9771357c5fc9573519f >> 100755 >> --- a/gsutil.py >> +++ b/gsutil.py >> @@ -14,6 +14,7 @@ import os >> import shutil >> import subprocess >> import sys >> +import time >> import urllib2 >> import zipfile >> >> @@ -26,7 +27,6 @@ DEFAULT_BIN_DIR = os.path.join(THIS_DIR, >> 'external_bin', 'gsutil') >> DEFAULT_FALLBACK_GSUTIL = os.path.join( >> THIS_DIR, 'third_party', 'gsutil', 'gsutil') >> >> - >> class InvalidGsutilError(Exception): >> pass >> >> @@ -73,22 +73,35 @@ def check_gsutil(gsutil_bin): >> [sys.executable, gsutil_bin, 'version'], >> stdout=subprocess.PIPE, stderr=subprocess.STDOUT) == 0 >> >> -def ensure_gsutil(version, target): >> +def ensure_gsutil(version, target, clean): >> bin_dir = os.path.join(target, 'gsutil_%s' % version) >> gsutil_bin = os.path.join(bin_dir, 'gsutil', 'gsutil') >> - if os.path.isfile(gsutil_bin) and check_gsutil(gsutil_bin): >> + if not clean and os.path.isfile(gsutil_bin) and >> check_gsutil(gsutil_bin): >> # Everything is awesome! we're all done here. >> return gsutil_bin >> >> - if os.path.isdir(bin_dir): >> - # Clean up if we're redownloading a corrupted gsutil. >> - shutil.rmtree(bin_dir) >> - cache_dir = os.path.join(target, '.cache_dir') >> + # Clean up if we're redownloading a corrupted gsutil. >> + suffix = '%d_%d' % (time.time(), os.getpid()) >> + try: >> + cleanup_path = '%s.%s' % (bin_dir, suffix) >> + shutil.move(bin_dir, cleanup_path) >> + shutil.rmtree(cleanup_path) >> + except IOError: >> + # Directory was not present. >> + pass >> + >> + cache_dir = os.path.join(target, '.cache_dir.%s' % (suffix,)) >> if not os.path.isdir(cache_dir): >> os.makedirs(cache_dir) >> target_zip_filename = download_gsutil(version, cache_dir) >> with zipfile.ZipFile(target_zip_filename, 'r') as target_zip: >> - target_zip.extractall(bin_dir) >> + target_zip.extractall(cache_dir) >> + >> + try: >> + shutil.move(cache_dir, bin_dir) >> + except IOError: >> + # Something else did this in parallel. >> + pass >> >> # Final check that the gsutil bin is okay. This should never fail. >> if not check_gsutil(gsutil_bin): >> @@ -97,9 +110,9 @@ def ensure_gsutil(version, target): >> return gsutil_bin >> >> >> -def run_gsutil(force_version, fallback, target, args): >> +def run_gsutil(force_version, fallback, target, args, clean=False): >> if force_version: >> - gsutil_bin = ensure_gsutil(force_version, target) >> + gsutil_bin = ensure_gsutil(force_version, target, clean) >> else: >> gsutil_bin = fallback >> cmd = [sys.executable, gsutil_bin] + args >> @@ -107,10 +120,18 @@ def run_gsutil(force_version, fallback, target, >> args): >> >> >> def parse_args(): >> + bin_dir = os.environ.get('DEPOT_TOOLS_GSUTIL_BIN_DIR') >> + if not bin_dir: >> + bin_dir = DEFAULT_BIN_DIR >> + >> parser = argparse.ArgumentParser() >> parser.add_argument('--force-version', default='4.13') >> + parser.add_argument('--clean', action='store_true', >> + help='Clear any existing gsutil package, forcing a new download.') >> parser.add_argument('--fallback', default=DEFAULT_FALLBACK_GSUTIL) >> - parser.add_argument('--target', default=DEFAULT_BIN_DIR) >> + parser.add_argument('--target', default=bin_dir, >> + help='The target directory to download/store a gsutil version in. ' >> + '(default is %(default)s).') >> parser.add_argument('args', nargs=argparse.REMAINDER) >> >> args, extras = parser.parse_known_args() >> @@ -118,12 +139,13 @@ def parse_args(): >> args.args.pop(0) >> if extras: >> args.args = extras + args.args >> - return args.force_version, args.fallback, args.target, args.args >> + return args >> >> >> def main(): >> - force_version, fallback, target, args = parse_args() >> - return run_gsutil(force_version, fallback, target, args) >> + args = parse_args() >> + return run_gsutil(args.force_version, args.fallback, args.target, >> args.args, >> + clean=args.clean) >> >> if __name__ == '__main__': >> sys.exit(main()) >> Index: tests/gsutil_test.py >> diff --git a/tests/gsutil_test.py b/tests/gsutil_test.py >> index >> 76570dd31034727b4c38346ea6e84fe383f9b234..d34eebfa52e76affc1df6b6a52c0c4801d150523 >> 100755 >> --- a/tests/gsutil_test.py >> +++ b/tests/gsutil_test.py >> @@ -144,8 +144,8 @@ class GsutilUnitTests(unittest.TestCase): >> >> # This should delete the old bin and rewrite it with 'Fake gsutil' >> self.assertRaises( >> - gsutil.InvalidGsutilError, gsutil.ensure_gsutil, version, >> self.tempdir) >> - self.assertTrue(os.path.isdir(os.path.join(self.tempdir, >> '.cache_dir'))) >> + gsutil.InvalidGsutilError, gsutil.ensure_gsutil, version, >> self.tempdir, >> + False) >> self.assertTrue(os.path.exists(gsutil_bin)) >> with open(gsutil_bin, 'r') as f: >> self.assertEquals(f.read(), fake_gsutil) >> @@ -165,7 +165,7 @@ class GsutilUnitTests(unittest.TestCase): >> with open(gsutil_bin, 'w') as f: >> f.write('Foobar') >> self.assertEquals( >> - gsutil.ensure_gsutil(version, self.tempdir), gsutil_bin) >> + gsutil.ensure_gsutil(version, self.tempdir, False), gsutil_bin) >> >> if __name__ == '__main__': >> unittest.main() >> >> >> > To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
> What exactly can it affect? There's nothing changing in script's semantics > unless DEPOT_TOOLS_GSUTIL_BIN_DIR is set, right? Semantically this is true. However, over half of the inner mechanism for downloading "gsutil" has been rewritten and new system features are being exercised. If something goes wrong in there (remember, this runs on Windows, Linux, and Mac), we could tank the entire Chrome build fleet, which is sufficient to prompt me to wait until tomorrow when we will have all hands available.
First of all, I appreciate you taking time to deal with this problem. If you're interested, below I also offer my frank opinion about this change. This is an outsider opinion, I'm missing the full context, and I'm not even chromium committer, so take it with a grain of salt. <rant> To complete a very simple task of downloading a file from the network we have a special script (download_from_google_storage) which to do its job invokes another script (gsutil.py), which then downloads the tool that finally can do the download. To make it more interesting just in case the script maintains several versions of the tool. Ever though above is already more complicated than it needs to be, this change is making downloading files from internet even more challenging, the user of that script will now have to provide a temporary cache directory where the mentioned tool will be downloaded. They will have to worry about keeping that directory between multiple download_from_google_storage invocations, but then also removing it when it's no longer necessary. All of that while we already have all the tools necessary installed in the system. Very nice! If the following picture does not describe this situation precisely I don't know what does: </rant> On Thu, Sep 17, 2015 at 5:32 PM, <dnj@google.com> wrote: > On 2015/09/18 00:28:55, Ryan Tseng wrote: > >> lgtm >> > > Alrighty, I'll wait until tomorrow to deploy since this can affect ... a > lot :) > > https://codereview.chromium.org/1346213003/ > To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
> To complete a very simple task of downloading a file from the network Some simple problems explode pretty quickly when you apply them to thousands of developers, TPMs, and bots running countless permutations of repository versions on dozens of operating systems and architectures, This script was born from crippling developer and build system breakages that occurred when we have unilaterally upgraded "gsutil" in the past, and it has done a pretty good job allowing us to canary "gsutil" upgrades and their impact. Unfortunately, "gsutil" is not perfectly backwards-compatible, and some scripts rely on old "gsutil" behaviors. The answer seems simple: update the scripts! But they are scattered across hundreds of repositories, each of which have different owners and may be pinned to versions years old. For example, some parts of our build system have been running for 5+ years reliably and their owners are not willing to accept the potentially-expensive risk of updating them. If this were truly a simple problem, we'd be using a simple solution. I think we'd all like that, and we're all working through some pretty thick layers of technical debt to get there.
fwiw, the GS team doesn't support older versions of gsutil, and CrOS has had a pretty good record with its lib/gs.py module. everything writes to the gs.py API, and gs.py itself hides all of the gsutil details.
Thanks for explanation Daniel. I do understand that for a project that has been under active development for any long amount of time there will accumulate some technical debt, which occasionally everybody has to deal with. In this particular case it seems to me you're dealing with it in a sub-optimal way, actually adding more hacks to already complex system, digging a deeper hole that will be harder to climb out of. I'll try to explain why I think so and what can be done instead. So some users of gsutil are compatible with some versions, but not with others. So far the only reason for that I heard was that they can't parse some output of gsutil. (is that correct or are they other reasons things got broken by updates in the past?) When that happened, did the gsutil contract change? Or did the client make assumptions about gsutil that are not a part of the contract? Without knowing more details I'm going to assume that it's that software that tries to parse gsutil output is broken, rather than gsutil is updated in a backwards-incompatible way. Now let's say there are indeed things that can't be easily fixed and that require older versions. But then why do you expect that everybody will want some specific version? Good citizens will work with any version. Why does gsutil.py has --force-version flag, but not, say, --gimme-some-fresh-version or --use-whatever-i-have-in-usr-bin? By not having such option you're endorsing the client's reliance on any particular version. What I'd suggest is this: 1. By default gsutil.py should only guarantee that version it runs is some fresh one (i.e. newer than some particular release) but not give guarantees what exactly it is. That could run a version from $PATH as long as it's not too old or download something if there is nothing in $PATH. (that's what I wanted to do in my change, though not exactly like I described here). Note that in the current version there's already no guarantees: default value of force_version is not fixed in stone and you update it occasionally, e.g. http://crrev.com/1237013003 2. Fix any code that uses --force-version flag, by making sure it does not rely on anything that's not a part of gsutil contract. 3. Get rid of force-version flag. On Thu, Sep 17, 2015 at 6:32 PM, <dnj@google.com> wrote: > To complete a very simple task of downloading a file from the network >> > > Some simple problems explode pretty quickly when you apply them to > thousands of > developers, TPMs, and bots running countless permutations of repository > versions > on dozens of operating systems and architectures, This script was born from > crippling developer and build system breakages that occurred when we have > unilaterally upgraded "gsutil" in the past, and it has done a pretty good > job > allowing us to canary "gsutil" upgrades and their impact. > > Unfortunately, "gsutil" is not perfectly backwards-compatible, and some > scripts > rely on old "gsutil" behaviors. The answer seems simple: update the > scripts! But > they are scattered across hundreds of repositories, each of which have > different > owners and may be pinned to versions years old. For example, some parts of > our > build system have been running for 5+ years reliably and their owners are > not > willing to accept the potentially-expensive risk of updating them. > > If this were truly a simple problem, we'd be using a simple solution. I > think > we'd all like that, and we're all working through some pretty thick layers > of > technical debt to get there. > > https://codereview.chromium.org/1346213003/ > To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
it is not possible for people who use gsutil directly to be "good citizens". the GS team provides no guarantees of stability at all. for example: - the output of commands change (both successes and errors) - the exit codes changes (sometimes doing exit(0) when there's an error) - errors sometimes go to stdout and not stderr - sub-commands change, both in name (getacl->acl get, etc...), and in arguments - inputs to the commands change (e.g. acl templates from XML to JSON between 3.x and 4.x) - workaround buggy behavior in gsutil itself (e.g. misbehavior w/resuming of files, proxy settings, retries, etc...) this is still going on too: the behavior of `gsutil stat` changed in an incompatible way between 4.12, 4.13, and 4.14 (all released in the last few months). for even more examples, browse our history: https://chromium.googlesource.com/chromiumos/chromite/+log/master/lib/gs.py this is exactly why CrOS has written a gs.py library to wrap gsutil and provide a stable python API to its users. only one place needs to be concerned with differences in gsutil behavior, and we have unittests to catch when gsutil changes. thus we can quickly upgrade between versions and all downstream consumers continue working for free. as soon as you expose `gsutil` to people, you've lost. game over man, game over.
> this is exactly why CrOS has written a gs.py library to wrap gsutil and provide > a stable python API to its users. +1 Chromite is pretty great :) > as soon as you expose `gsutil` to people, you've lost. game over man, game > over. We lost that game a few years back :(
The CQ bit was checked by dnj@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from davidjames@google.com, hinoka@google.com, iannucci@chromium.org Link to the patchset: https://codereview.chromium.org/1346213003/#ps140001 (title: "Explicitly create base directory.")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1346213003/140001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1346213003/140001
Message was sent while issue was closed.
Committed patchset #7 (id:140001) as http://src.chromium.org/viewvc/chrome?view=rev&revision=296772 |