|
|
Chromium Code Reviews
Descriptionluci-py/isolateserver.py: Add archive support when downloading.
Add support for extracting archives which contain many small files in one group together. This drastically increases the speed of dealing with a large number of small files.
* https://github.com/luci/luci-go/issues/9
* https://crbug.com/598990
BUG=598990
DEPS=2049523004
Committed: https://github.com/luci/luci-py/commit/6a5bfa1af2f8cd27fb5a43e7f95b969b01404807
Patch Set 1 : Uploading for a quick look. #Patch Set 2 : Tests now pass. #
Total comments: 1
Patch Set 3 : Adding a test. #
Total comments: 23
Patch Set 4 : Small fixes. #
Total comments: 25
Patch Set 5 : Fixing for review. #Patch Set 6 : Adding a file outside the archive to test. #Patch Set 7 : Rebase #
Total comments: 14
Patch Set 8 : Fixes. #Patch Set 9 : Fixes. #
Total comments: 2
Patch Set 10 : Adding a few more tests. #Patch Set 11 : Removing unneeded output. #Patch Set 12 : Changing smallar to ar and version bump (maruel requested). #Patch Set 13 : Fixing the bot_archive.py #Patch Set 14 : Another attempt at fixing bot_archive.py #Patch Set 15 : Adding a smoke test (and fixing a bug it found). #Patch Set 16 : Moving the fileobj stuff to https://codereview.chromium.org/2186263002/ #Patch Set 17 : Small fixes. #Patch Set 18 : Update hashes as version has changed. #
Total comments: 1
Depends on Patchset: Messages
Total messages: 43 (23 generated)
Patchset #2 (id:20001) has been deleted
Patchset #1 (id:1) has been deleted
Patchset #1 (id:40001) has been deleted
tansell@chromium.org changed reviewers: + maruel@chromium.org
Hi Marc, This patch doesn't work yet but is what I was thinking during our discussion. One a related note, using the generator yield things rather than file objects is really weird. I worked a bit on a patch which uses proper file like objects and it has a number of advantages; * Allows "on the fly" hash verification and calculation * Allows transparent on the fly compression * Resolves the streaming support FIXMEs * Fixes the retry logic * Means more of the operations happen in C land and seems more performant * Decreases the complexity of the fetch/push methods I'll try and catch you tomorrow to discuss it more. Tim 'mithro' Ansell
maruel@chromium.org changed reviewers: + nodir@chromium.org, vadim@chromium.org
+Nodir because we discussed hardlink() before. +Vadim as he wrote the read() with the generator idea. I'm generally fine with exposing a file like object.
On 2016/06/15 17:42:14, M-A Ruel wrote: > +Nodir because we discussed hardlink() before. > +Vadim as he wrote the read() with the generator idea. No, I didn't :P The generator-based flow was added by you: https://github.com/luci/luci-py/commit/b7e79a21bea459f781907db2d7e07bec9090f3... (It was file-based before, if I understand the diff correctly) It seemed like a good idea at the time, so I probably added more stuff on top later (don't remember). > > I'm generally fine with exposing a file like object. +1. Also I hope we'll reimplement this code in Go eventually. Supporting two code bases isn't fun.
On 2016/06/15 19:03:30, Vadim Sh. wrote: > It seemed like a good idea at the time, so I probably added more stuff on top > later (don't remember). I had a lot of bad good ideas. :/
On 2016/06/15 19:18:16, M-A Ruel wrote: > On 2016/06/15 19:03:30, Vadim Sh. wrote: > > It seemed like a good idea at the time, so I probably added more stuff on top > > later (don't remember). > > I had a lot of bad good ideas. :/ I can see how you ended going down this path with the weird "r.iter_content()" method that requests seem to provide. The docs are pretty sheepish about mentioning that you can access the content as a fileobject via the `raw` object and it looks like it is only available when stream=True too.
Hey Marc, I've just been working on this patch, getting close to all the existing tests passing now. However, I'm wondering what the best way to test the new functionality is? There are quite a few different styles of tests under the tests/ directory and I can see a number of different ways to do it. Appreciate advice on how you would like to see it done. On a related note, do you think it is worth adding support for generating the archives on upload in the Python code too? Most of the people doing uploading should be using the new go version right? Tim 'mithro' Ansell
Description was changed from ========== WIP: Archive support when downloading. * https://github.com/luci/luci-go/issues/9 * https://crbug.com/598990 BUG=598990 ========== to ========== Add archive support when downloading. Add support for extracting archives which contain many small files in one group together. This drastically increases the speed of dealing with a large number of small files. * https://github.com/luci/luci-go/issues/9 * https://crbug.com/598990 BUG=598990 DEPS=2049523004 ==========
Hi everyone, This CL is now ready to be looked at. I've only added one extra test which checks this functionality until Marc gives advice on the best way to properly test it. This CL depends on the archive CL (https://codereview.chromium.org/2049523004/) which I believe is ready to land. Tim 'mithro' Ansell https://codereview.chromium.org/2060983006/diff/80001/client/run_isolated.py File client/run_isolated.py (right): https://codereview.chromium.org/2060983006/diff/80001/client/run_isolated.py#... client/run_isolated.py:272: 'items_cold': base64.b64encode(large.pack(sorted(cache.added))), Are these wrong? Looking at the MemoryCache code, the added/used are lists which contain the number of bytes?
https://codereview.chromium.org/2060983006/diff/100001/client/cipd.py File client/cipd.py (right): https://codereview.chromium.org/2060983006/diff/100001/client/cipd.py#newcode359 client/cipd.py:359: instance_id = version_cache.getfileobj(version_digest).read() DiskCache.getfileobject returns a file and it is responsibility of the caller to close it this and some other call sites do not close the file https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.py File client/isolateserver.py (right): https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.p... client/isolateserver.py:157: name = getattr(fileobj, "name", None) s/"/'/ https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.p... client/isolateserver.py:178: else: unnecessary else https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.p... client/isolateserver.py:181: written += readsize you might have to `written = destfileobj.tell()` because readsize is not necessarily equal to # of bytes actually read from srcfileobject, which is not necessarily equal to # of bytes actually written to descobjfile https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.p... client/isolateserver.py:184: def putfile(srcfileobj, dstpath, file_mode=None, limit=-1): should self._used be updated in this method? https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.p... client/isolateserver.py:1341: """ document that it is the responsibility of the caller to close the returned file https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.p... client/isolateserver.py:2017: if isarchive == 'ar': perhaps inverse the condition and remove `extractor = None` https://codereview.chromium.org/2060983006/diff/100001/client/run_isolated.py File client/run_isolated.py (right): https://codereview.chromium.org/2060983006/diff/100001/client/run_isolated.py... client/run_isolated.py:94: package.add_directory(os.path.join(BASE_DIR, 'libs')) why do we need this?
https://codereview.chromium.org/2060983006/diff/100001/client/cipd.py File client/cipd.py (right): https://codereview.chromium.org/2060983006/diff/100001/client/cipd.py#newcode359 client/cipd.py:359: instance_id = version_cache.getfileobj(version_digest).read() On 2016/06/20 15:54:34, nodir wrote: > DiskCache.getfileobject returns a file and it is responsibility of the caller to > close it > this and some other call sites do not close the file I've changed this one to a "with xxxx". See my comment about close later... https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.py File client/isolateserver.py (right): https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.p... client/isolateserver.py:157: name = getattr(fileobj, "name", None) On 2016/06/20 15:54:35, nodir wrote: > s/"/'/ Done. https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.p... client/isolateserver.py:178: else: On 2016/06/20 15:54:35, nodir wrote: > unnecessary else Done. https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.p... client/isolateserver.py:181: written += readsize On 2016/06/20 15:54:35, nodir wrote: > you might have to `written = destfileobj.tell()` because readsize is not > necessarily equal to # of bytes actually read from srcfileobject, which is not > necessarily equal to # of bytes actually written to descobjfile len(data) should be enough. If write doesn't write the full amount, it raises an IOError. https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.p... client/isolateserver.py:184: def putfile(srcfileobj, dstpath, file_mode=None, limit=-1): On 2016/06/20 15:54:35, nodir wrote: > should self._used be updated in this method? This function doesn't have access to self._used and I don't think should be independent of that class too. https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.p... client/isolateserver.py:1341: """ On 2016/06/20 15:54:34, nodir wrote: > document that it is the responsibility of the caller to close the returned file I updated the docs to include that it is a *readable* file object. In Python the file will be closed when the object is garbage collected. There are only really two reasons you want to explicitly close file objects in python; * You are opening / closing a large number of files and are running out of file descriptors before garbage collection is occurring. * You are writing to a file and have finished with it. https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.p... client/isolateserver.py:2017: if isarchive == 'ar': On 2016/06/20 15:54:35, nodir wrote: > perhaps inverse the condition and remove `extractor = None` extractor = None is actually not needed. Removed. https://codereview.chromium.org/2060983006/diff/100001/client/run_isolated.py File client/run_isolated.py (right): https://codereview.chromium.org/2060983006/diff/100001/client/run_isolated.py... client/run_isolated.py:94: package.add_directory(os.path.join(BASE_DIR, 'libs')) On 2016/06/20 15:54:35, nodir wrote: > why do we need this? The ar files are under libs/
On 2016/06/18 06:43:10, mithro wrote: > However, I'm wondering what the best way to test the new functionality is? There > are quite a few different styles of tests under the tests/ directory and I can > see a number of different ways to do it. Appreciate advice on how you would like > to see it done. You can either do a unit test and/or a smoke test. Unit tests are preferred but not always possible when testing OS functionality. In your case I don't think it's necessary. > On a related note, do you think it is worth adding support for generating the > archives on upload in the Python code too? No. > Most of the people doing uploading > should be using the new go version right? Exact. https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.py File client/isolateserver.py (right): https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.p... client/isolateserver.py:1341: """ On 2016/06/21 06:42:33, mithro wrote: > On 2016/06/20 15:54:34, nodir wrote: > > document that it is the responsibility of the caller to close the returned > file > > I updated the docs to include that it is a *readable* file object. > > In Python the file will be closed when the object is garbage collected. > > There are only really two reasons you want to explicitly close file objects in > python; > * You are opening / closing a large number of files and are running out of file > descriptors before garbage collection is occurring. We are potentially hitting this case. > * You are writing to a file and have finished with it. https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.py File client/isolateserver.py (right): https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.p... client/isolateserver.py:156: def fileobj_path(fileobj): add docstring to these 3 functions https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.p... client/isolateserver.py:161: if not isinstance(name, unicode): Please return unicode paths instead of encoded str instances. Only linux/bsd gets this wrong, other sane OSes (Windows, OSX) use unicode internally. Basically, none of these scripts work with paths that are not representable as unicode characters. https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.p... client/isolateserver.py:164: if fs.exists(name): why check for existence? https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.p... client/isolateserver.py:168: def fileobj_copy(dstfileobj, srcfileobj, amount=-1): s/amount/size/ ? https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.p... client/isolateserver.py:199: srcfileobj.close() that shouldn't be done here. https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.p... client/isolateserver.py:1387: with self._lock: Nesting is not optimal; with self._lock: try: d = self._contents[digest] except KeyError: raise CacheMiss(digest) self._used.append(len(d)) return StringIO.StringIO(d) it's not going to affect anything performance wise, it's more to document the scoping. https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.p... client/isolateserver.py:1800: item.load(fetch_queue.cache.getfileobj(item_hash).read()) with fetch_queue.cache.getfileobj(item_hash) as f: ? and/or change load() to accept a file object. https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.p... client/isolateserver.py:2011: srcfileobj = cache.getfileobj(digest) with cache.getfileobj(digest) as f: https://codereview.chromium.org/2060983006/diff/120001/client/tests/isolatese... File client/tests/isolateserver_test.py (right): https://codereview.chromium.org/2060983006/diff/120001/client/tests/isolatese... client/tests/isolateserver_test.py:727: [cache.getfileobj(i.digest).read() for i in items]) please close everywhere
https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.py File client/isolateserver.py (right): https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.p... client/isolateserver.py:181: written += readsize On 2016/06/21 06:42:33, mithro wrote: > On 2016/06/20 15:54:35, nodir wrote: > > you might have to `written = destfileobj.tell()` because readsize is not > > necessarily equal to # of bytes actually read from srcfileobject, which is not > > necessarily equal to # of bytes actually written to descobjfile > > len(data) should be enough. > > If write doesn't write the full amount, it raises an IOError. Acknowledged. https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.p... client/isolateserver.py:184: def putfile(srcfileobj, dstpath, file_mode=None, limit=-1): On 2016/06/21 06:42:33, mithro wrote: > On 2016/06/20 15:54:35, nodir wrote: > > should self._used be updated in this method? > > This function doesn't have access to self._used and I don't think should be > independent of that class too. for some reason I thought it is a method, not a global func https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.p... client/isolateserver.py:1341: """ On 2016/06/21 13:31:08, M-A Ruel wrote: > On 2016/06/21 06:42:33, mithro wrote: > > On 2016/06/20 15:54:34, nodir wrote: > > > document that it is the responsibility of the caller to close the returned > > file > > > > I updated the docs to include that it is a *readable* file object. > > > > In Python the file will be closed when the object is garbage collected. > > > > There are only really two reasons you want to explicitly close file objects in > > python; > > * You are opening / closing a large number of files and are running out of > file > > descriptors before garbage collection is occurring. > > We are potentially hitting this case. > > > * You are writing to a file and have finished with it. > IIRC windows requires a file handle to be closed for another process to access the file, so we should close all file handles before we spawn the task command. the general recommendation, on windows at least, i think, is to close a file handle as soon as possible. it is orthogonal to the language https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.p... client/isolateserver.py:2017: if isarchive == 'ar': On 2016/06/21 06:42:33, mithro wrote: > On 2016/06/20 15:54:35, nodir wrote: > > perhaps inverse the condition and remove `extractor = None` > > extractor = None is actually not needed. Removed. the goal was to reduce nesting. With condition inversed the nesting is less https://codereview.chromium.org/2060983006/diff/120001/client/cipd.py File client/cipd.py (right): https://codereview.chromium.org/2060983006/diff/120001/client/cipd.py#newcode386 client/cipd.py:386: instance_cache.getfileobj(instance_id), here too https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.py File client/isolateserver.py (right): https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.p... client/isolateserver.py:199: srcfileobj.close() On 2016/06/21 13:31:08, M-A Ruel wrote: > that shouldn't be done here. +1
Hi! I believe I have made all the requested changes, if I have missed something, please comment on it again. I made a couple of changes while fixing the review comments too; * I changed from having a "archive" tag to having a "filetype" tag which can either be "basic" (the default if not give) or "smallfiles-archive". I think this reduces the overloading of the archive name a bit and allows more future types. * I moved to use io.BytesIO over StringIO object because StringIO doesn't implement the "with" API. Tim 'mithro' Ansell https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.py File client/isolateserver.py (right): https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.p... client/isolateserver.py:1341: """ On 2016/06/21 22:20:04, nodir wrote: > On 2016/06/21 13:31:08, M-A Ruel wrote: > > On 2016/06/21 06:42:33, mithro wrote: > > > On 2016/06/20 15:54:34, nodir wrote: > > > > document that it is the responsibility of the caller to close the returned > > > file > > > > > > I updated the docs to include that it is a *readable* file object. > > > > > > In Python the file will be closed when the object is garbage collected. > > > > > > There are only really two reasons you want to explicitly close file objects > in > > > python; > > > * You are opening / closing a large number of files and are running out of > > file > > > descriptors before garbage collection is occurring. > > > > We are potentially hitting this case. > > > > > * You are writing to a file and have finished with it. > > > > IIRC windows requires a file handle to be closed for another process to access > the file, so we should close all file handles before we spawn the task command. I believe this is correct if the file is opened in write or exclusive read mode but not shared read mode (which is the default). > the general recommendation, on windows at least, i think, is to close a file > handle as soon as possible. it is orthogonal to the language I'm doing this now. https://codereview.chromium.org/2060983006/diff/100001/client/isolateserver.p... client/isolateserver.py:2017: if isarchive == 'ar': On 2016/06/21 22:20:04, nodir wrote: > On 2016/06/21 06:42:33, mithro wrote: > > On 2016/06/20 15:54:35, nodir wrote: > > > perhaps inverse the condition and remove `extractor = None` > > > > extractor = None is actually not needed. Removed. > > the goal was to reduce nesting. With condition inversed the nesting is less I refactored this code a little. https://codereview.chromium.org/2060983006/diff/120001/client/cipd.py File client/cipd.py (right): https://codereview.chromium.org/2060983006/diff/120001/client/cipd.py#newcode386 client/cipd.py:386: instance_cache.getfileobj(instance_id), On 2016/06/21 22:20:05, nodir wrote: > here too Done. https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.py File client/isolateserver.py (right): https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.p... client/isolateserver.py:156: def fileobj_path(fileobj): On 2016/06/21 13:31:09, M-A Ruel wrote: > add docstring to these 3 functions Done. https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.p... client/isolateserver.py:161: if not isinstance(name, unicode): On 2016/06/21 13:31:08, M-A Ruel wrote: > Please return unicode paths instead of encoded str instances. Only linux/bsd > gets this wrong, other sane OSes (Windows, OSX) use unicode internally. > Basically, none of these scripts work with paths that are not representable as > unicode characters. This is *creating* a unicode value from a byte (str) representation. This is needed if someone did open('blah.txt') - such as the standard library when creating a NamedTemporaryFile. I've added a comment explain this. https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.p... client/isolateserver.py:164: if fs.exists(name): On 2016/06/21 13:31:09, M-A Ruel wrote: > why check for existence? There are many reasons why a fileobj has a name which isn't available on the file system any more. I want the result of this function to be passable to open. One example is that it is reasonably common to open a file and then delete the filename (preventing other people from opening it). In this case the fileobj will then have a path which is no longer present on the file system. https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.p... client/isolateserver.py:168: def fileobj_copy(dstfileobj, srcfileobj, amount=-1): On 2016/06/21 13:31:09, M-A Ruel wrote: > s/amount/size/ ? Done. https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.p... client/isolateserver.py:199: srcfileobj.close() On 2016/06/21 22:20:05, nodir wrote: > On 2016/06/21 13:31:08, M-A Ruel wrote: > > that shouldn't be done here. > > +1 Done. https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.p... client/isolateserver.py:199: srcfileobj.close() On 2016/06/21 13:31:08, M-A Ruel wrote: > that shouldn't be done here. Done. https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.p... client/isolateserver.py:1387: with self._lock: On 2016/06/21 13:31:08, M-A Ruel wrote: > Nesting is not optimal; > > with self._lock: > try: > d = self._contents[digest] > except KeyError: > raise CacheMiss(digest) > self._used.append(len(d)) > return StringIO.StringIO(d) > > it's not going to affect anything performance wise, it's more to document the > scoping. Done. https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.p... client/isolateserver.py:1800: item.load(fetch_queue.cache.getfileobj(item_hash).read()) On 2016/06/21 13:31:09, M-A Ruel wrote: > with fetch_queue.cache.getfileobj(item_hash) as f: > ? Done > and/or change load() to accept a file object. I'll leave that for the CL which changes everything to use file like objects. https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.p... client/isolateserver.py:2011: srcfileobj = cache.getfileobj(digest) On 2016/06/21 13:31:08, M-A Ruel wrote: > with cache.getfileobj(digest) as f: Done. https://codereview.chromium.org/2060983006/diff/120001/client/tests/isolatese... File client/tests/isolateserver_test.py (right): https://codereview.chromium.org/2060983006/diff/120001/client/tests/isolatese... client/tests/isolateserver_test.py:727: [cache.getfileobj(i.digest).read() for i in items]) On 2016/06/21 13:31:09, M-A Ruel wrote: > please close everywhere Done.
lgtm https://codereview.chromium.org/2060983006/diff/180001/client/isolateserver.py File client/isolateserver.py (right): https://codereview.chromium.org/2060983006/diff/180001/client/isolateserver.p... client/isolateserver.py:156: """Return file system path for file like object. or None https://codereview.chromium.org/2060983006/diff/180001/client/utils/fs.py File client/utils/fs.py (right): https://codereview.chromium.org/2060983006/diff/180001/client/utils/fs.py#new... client/utils/fs.py:91: -2 blank lines please, since you are at this file https://codereview.chromium.org/2060983006/diff/180001/client/utils/fs.py#new... client/utils/fs.py:107: +1 blank line
https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.py File client/isolateserver.py (right): https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.p... client/isolateserver.py:161: if not isinstance(name, unicode): On 2016/06/22 12:03:10, mithro wrote: > On 2016/06/21 13:31:08, M-A Ruel wrote: > > Please return unicode paths instead of encoded str instances. Only linux/bsd > > gets this wrong, other sane OSes (Windows, OSX) use unicode internally. > > > Basically, none of these scripts work with paths that are not representable as > > unicode characters. > > This is *creating* a unicode value from a byte (str) representation. This is > needed if someone did open('blah.txt') - such as the standard library when > creating a NamedTemporaryFile. > > I've added a comment explain this. Err, I had misread, sorry. https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.p... client/isolateserver.py:164: if fs.exists(name): On 2016/06/22 12:03:10, mithro wrote: > On 2016/06/21 13:31:09, M-A Ruel wrote: > > why check for existence? > > There are many reasons why a fileobj has a name which isn't available on the > file system any more. I want the result of this function to be passable to open. > > One example is that it is reasonably common to open a file and then delete the > filename (preventing other people from opening it). In this case the fileobj > will then have a path which is no longer present on the file system. > Oh I agree in the general case, but this is not general code here and this is a hot code path. I don't think this should be supported. https://codereview.chromium.org/2060983006/diff/180001/appengine/isolate/doc/... File appengine/isolate/doc/Design.md (right): https://codereview.chromium.org/2060983006/diff/180001/appengine/isolate/doc/... appengine/isolate/doc/Design.md:120: - `t`: type of the file, defaults to `basic` if not given. I'd make the statement stronger, it should not be given if the default. https://codereview.chromium.org/2060983006/diff/180001/appengine/isolate/doc/... appengine/isolate/doc/Design.md:138: - `smallfiles-archive`: An [ar](https://en.wikipedia.org/wiki/Ar_(Unix)) this is overly verbose. :/ I think 'ar' is sufficient. https://codereview.chromium.org/2060983006/diff/180001/client/isolateserver.py File client/isolateserver.py (right): https://codereview.chromium.org/2060983006/diff/180001/client/isolateserver.p... client/isolateserver.py:176: # FIXME: Replace fileobj_copy with shutil.copyfileobj once proper file wrappers TODO(tansell)
https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.py File client/isolateserver.py (right): https://codereview.chromium.org/2060983006/diff/120001/client/isolateserver.p... client/isolateserver.py:161: if not isinstance(name, unicode): On 2016/06/22 17:17:49, M-A Ruel wrote: > On 2016/06/22 12:03:10, mithro wrote: > > On 2016/06/21 13:31:08, M-A Ruel wrote: > > > Please return unicode paths instead of encoded str instances. Only linux/bsd > > > gets this wrong, other sane OSes (Windows, OSX) use unicode internally. > > > > > Basically, none of these scripts work with paths that are not representable > as > > > unicode characters. > > > > This is *creating* a unicode value from a byte (str) representation. This is > > needed if someone did open('blah.txt') - such as the standard library when > > creating a NamedTemporaryFile. > > > > I've added a comment explain this. > > Err, I had misread, sorry. Acknowledged. https://codereview.chromium.org/2060983006/diff/180001/appengine/isolate/doc/... File appengine/isolate/doc/Design.md (right): https://codereview.chromium.org/2060983006/diff/180001/appengine/isolate/doc/... appengine/isolate/doc/Design.md:120: - `t`: type of the file, defaults to `basic` if not given. On 2016/06/22 17:17:49, M-A Ruel wrote: > I'd make the statement stronger, it should not be given if the default. Done. https://codereview.chromium.org/2060983006/diff/180001/appengine/isolate/doc/... appengine/isolate/doc/Design.md:138: - `smallfiles-archive`: An [ar](https://en.wikipedia.org/wiki/Ar_(Unix)) On 2016/06/22 17:17:49, M-A Ruel wrote: > this is overly verbose. :/ I think 'ar' is sufficient. How about "smallar"? If we want to use ar files in the future for different reason it would be good to have a way to differentiate. https://codereview.chromium.org/2060983006/diff/180001/client/isolateserver.py File client/isolateserver.py (right): https://codereview.chromium.org/2060983006/diff/180001/client/isolateserver.p... client/isolateserver.py:156: """Return file system path for file like object. On 2016/06/22 16:34:06, nodir wrote: > or None Done. https://codereview.chromium.org/2060983006/diff/180001/client/isolateserver.p... client/isolateserver.py:176: # FIXME: Replace fileobj_copy with shutil.copyfileobj once proper file wrappers On 2016/06/22 17:17:49, M-A Ruel wrote: > TODO(tansell) Done. https://codereview.chromium.org/2060983006/diff/180001/client/utils/fs.py File client/utils/fs.py (right): https://codereview.chromium.org/2060983006/diff/180001/client/utils/fs.py#new... client/utils/fs.py:91: On 2016/06/22 16:34:06, nodir wrote: > -2 blank lines > please, since you are at this file Done. https://codereview.chromium.org/2060983006/diff/180001/client/utils/fs.py#new... client/utils/fs.py:107: On 2016/06/22 16:34:06, nodir wrote: > +1 blank line Done.
lgtm https://codereview.chromium.org/2060983006/diff/180001/appengine/isolate/doc/... File appengine/isolate/doc/Design.md (right): https://codereview.chromium.org/2060983006/diff/180001/appengine/isolate/doc/... appengine/isolate/doc/Design.md:138: - `smallfiles-archive`: An [ar](https://en.wikipedia.org/wiki/Ar_(Unix)) On 2016/06/23 07:17:21, mithro wrote: > On 2016/06/22 17:17:49, M-A Ruel wrote: > > this is overly verbose. :/ I think 'ar' is sufficient. > > How about "smallar"? If we want to use ar files in the future for different > reason it would be good to have a way to differentiate. I don't understand, it's an 'ar' file. We use 'ar' files here. If we use ar files for another reason later, we'll use another value but for now, the file type is 'ar'. https://codereview.chromium.org/2060983006/diff/220001/client/isolateserver.py File client/isolateserver.py (right): https://codereview.chromium.org/2060983006/diff/220001/client/isolateserver.p... client/isolateserver.py:8: __version__ = '0.4.9' bump to 0.5 https://codereview.chromium.org/2060983006/diff/220001/client/isolateserver.p... client/isolateserver.py:200: def putfile(srcfileobj, dstpath, file_mode=None, size=-1): remove default argument to file_mode.
The CQ bit was checked by tansell@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: Luci-py Presubmit on luci.infra.try (JOB_FAILED, https://luci-milo.appspot.com/swarming/task/304a816fc7e7bc10)
The CQ bit was checked by tansell@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: Luci-py Presubmit on luci.infra.try (JOB_FAILED, https://luci-milo.appspot.com/swarming/task/304ae6d80c1b7910)
Patchset #17 (id:380001) has been deleted
The CQ bit was checked by tansell@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Description was changed from ========== Add archive support when downloading. Add support for extracting archives which contain many small files in one group together. This drastically increases the speed of dealing with a large number of small files. * https://github.com/luci/luci-go/issues/9 * https://crbug.com/598990 BUG=598990 DEPS=2049523004 ========== to ========== luci-py/isolateserver.py: Add archive support when downloading. Add support for extracting archives which contain many small files in one group together. This drastically increases the speed of dealing with a large number of small files. * https://github.com/luci/luci-go/issues/9 * https://crbug.com/598990 BUG=598990 DEPS=2049523004 ==========
https://codereview.chromium.org/2060983006/diff/180001/appengine/isolate/doc/... File appengine/isolate/doc/Design.md (right): https://codereview.chromium.org/2060983006/diff/180001/appengine/isolate/doc/... appengine/isolate/doc/Design.md:138: - `smallfiles-archive`: An [ar](https://en.wikipedia.org/wiki/Ar_(Unix)) On 2016/06/23 13:29:01, M-A Ruel wrote: > On 2016/06/23 07:17:21, mithro wrote: > > On 2016/06/22 17:17:49, M-A Ruel wrote: > > > this is overly verbose. :/ I think 'ar' is sufficient. > > > > How about "smallar"? If we want to use ar files in the future for different > > reason it would be good to have a way to differentiate. > > I don't understand, it's an 'ar' file. We use 'ar' files here. If we use ar > files for another reason later, we'll use another value but for now, the file > type is 'ar'. Done.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by tansell@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from nodir@chromium.org, maruel@chromium.org Link to the patchset: https://codereview.chromium.org/2060983006/#ps420001 (title: "Update hashes as version has changed.")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Description was changed from ========== luci-py/isolateserver.py: Add archive support when downloading. Add support for extracting archives which contain many small files in one group together. This drastically increases the speed of dealing with a large number of small files. * https://github.com/luci/luci-go/issues/9 * https://crbug.com/598990 BUG=598990 DEPS=2049523004 ========== to ========== luci-py/isolateserver.py: Add archive support when downloading. Add support for extracting archives which contain many small files in one group together. This drastically increases the speed of dealing with a large number of small files. * https://github.com/luci/luci-go/issues/9 * https://crbug.com/598990 BUG=598990 DEPS=2049523004 Committed: https://github.com/luci/luci-py/commit/6a5bfa1af2f8cd27fb5a43e7f95b969b01404807 ==========
Message was sent while issue was closed.
Committed patchset #18 (id:420001) as https://github.com/luci/luci-py/commit/6a5bfa1af2f8cd27fb5a43e7f95b969b01404807
Message was sent while issue was closed.
https://codereview.chromium.org/2060983006/diff/420001/client/tests/isolatese... File client/tests/isolateserver_test.py (right): https://codereview.chromium.org/2060983006/diff/420001/client/tests/isolatese... client/tests/isolateserver_test.py:1134: you added one line too much. |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
