|
|
Descriptionluci-py: Tools for working with BSD style ar archives.
ar archives are the simplest format that satisfies the following criteria;
* Is an existing standard which has tools which ship on standard Linux
systems.
* Requires no escaping / processing of file contents.
* Header can be written without needing to understand the whole file.
* Extremely fast to process / generate.
Other formats which were consider before selecting the ar format;
* tar
* zip
* cpio
* rar
* Something protobuf based.
There are a couple of drawbacks;
* The ar format doesn't support special files or symlinks.
* The Linux ar tool don't extract with directory paths (listing is fine).
The go version is in https://codereview.chromium.org/2043623002
* https://github.com/luci/luci-go/issues/9
* https://crbug.com/598990
BUG=598990
Committed: https://github.com/luci/luci-py/commit/6952ff7c2a890381a9745558f1aeb5fa499e727b
Patch Set 1 #Patch Set 2 : Removing the hashbang line. #
Total comments: 29
Patch Set 3 : Fixing for Maruel's review. #
Total comments: 29
Patch Set 4 : Reworking API to better match zipfile/tarfile API. #
Total comments: 29
Patch Set 5 : Fixing for Marc's review. #
Total comments: 9
Patch Set 6 : Fixing nits. #Patch Set 7 : Rebase onto master. #Patch Set 8 : Small fixes. #Patch Set 9 : Make unicode stuff work. #Patch Set 10 : Adding missing unicode tests. #
Total comments: 34
Patch Set 11 : Fixing for review. #
Total comments: 1
Patch Set 12 : Rebase onto master #Patch Set 13 : Fixing Marc's last comment. #Patch Set 14 : Skip the unicode path test on broken systems. #Patch Set 15 : Small unicode fix. #Patch Set 16 : Small rebase. #Patch Set 17 : Actually fix the unicode problem properly. #
Messages
Total messages: 58 (36 generated)
tansell@chromium.org changed reviewers: + maruel@chromium.org, tandrii@chromium.org
Hi Marc, Here is the Python version of the ar archive code (this is a cleaned up version of what I originally started using for LayoutTests). It actually performs within bounds of measurement error of the go version. :) This code is missing tests which I'm still in the process of finishing. Tim 'mithro' Ansell
https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/cli.py File client/libs/ar/cli.py (right): https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:32: print("Took %f for %i files == %f files/second" % ( use single quotes https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:36: def __del__(self): Use __exit__ instead, don't overide __del__ we already have a class like that; utils.Profiler https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:42: timing = Timing(timing) e.g.: with Timing() as t: https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:52: print(fp, file=sys.stderr) logging.info() https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:54: with open(fp, 'r') as f: 'rb' is important https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:106: description='Command line tool for creating and extracting ar files.') One thing I like is to put this as the module docstring then use here: sys.modules[__name__].__doc__ https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:123: type=argparse.FileType('w'), default=sys.stdout, 'wb' and 'rb' below https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:162: mode = eval("%s_cmd" % args.mode) mode = getattr(sys.modules[__name__], args.mode + '_cmd') https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/reader.py File client/libs/ar/reader.py (right): https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/reader.p... client/libs/ar/reader.py:38: check = self.check why? https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/reader.p... client/libs/ar/reader.py:47: if header == '': if not header: https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/reader.p... client/libs/ar/reader.py:53: assert name.startswith("#1/"), name use a real check + raise ValueError() https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/reader.p... client/libs/ar/reader.py:56: assert_eq(dmodtime, modtime) same https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/writer.py File client/libs/ar/writer.py (right): https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/writer.p... client/libs/ar/writer.py:28: self.obuf.write("#1/%-13s" % str(len(fp))) single quotes everywhere https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/writer.p... client/libs/ar/writer.py:74: except AttributeError: ?
Description was changed from ========== Tools for working with BSD style ar archives. ar archives are the simplest format that stratifies the following criteria; * Is an existing standard which has tools which ship on standard Linux systems. * Requires no escaping / processing of file contents. * Header can be written without needing to understand the whole file. * Extremely fast to process / generate. Other formats which were consider before selecting the ar format; * tar * zip * cpio * rar * Something protobuf based. There are a couple of drawbacks; * The ar format doesn't support special files or symlinks. * The Linux ar tool don't extract with directory paths (list fine). The go version is in https://codereview.chromium.org/2043623002 * https://github.com/luci/luci-go/issues/9 * https://crbug.com/598990 BUG=598990 ========== to ========== Tools for working with BSD style ar archives. ar archives are the simplest format that stratifies the following criteria; * Is an existing standard which has tools which ship on standard Linux systems. * Requires no escaping / processing of file contents. * Header can be written without needing to understand the whole file. * Extremely fast to process / generate. Other formats which were consider before selecting the ar format; * tar * zip * cpio * rar * Something protobuf based. There are a couple of drawbacks; * The ar format doesn't support special files or symlinks. * The Linux ar tool don't extract with directory paths (listing is fine). The go version is in https://codereview.chromium.org/2043623002 * https://github.com/luci/luci-go/issues/9 * https://crbug.com/598990 BUG=598990 ==========
Hi Marc, Updated, PTAL. Tim 'mithro' Ansell https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/cli.py File client/libs/ar/cli.py (right): https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:32: print("Took %f for %i files == %f files/second" % ( On 2016/06/09 21:50:43, M-A Ruel wrote: > use single quotes Done. https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:36: def __del__(self): On 2016/06/09 21:50:43, M-A Ruel wrote: > Use __exit__ instead, don't overide __del__ > > we already have a class like that; utils.Profiler They are doing slightly different things, the Timing class exists so you get frequent "files per second" reports rather than trying to record proper statistics. I renamed it to be "ProgressReporter" to hopefully be clearer. https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:42: timing = Timing(timing) On 2016/06/09 21:50:43, M-A Ruel wrote: > e.g.: > with Timing() as t: Obsolete? https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:52: print(fp, file=sys.stderr) On 2016/06/09 21:50:43, M-A Ruel wrote: > logging.info() Logging adds a bunch of extra stuff. We want just the file names listed here. https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:54: with open(fp, 'r') as f: On 2016/06/09 21:50:43, M-A Ruel wrote: > 'rb' is important Done. https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:106: description='Command line tool for creating and extracting ar files.') On 2016/06/09 21:50:43, M-A Ruel wrote: > One thing I like is to put this as the module docstring then use here: > > sys.modules[__name__].__doc__ Done. https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:123: type=argparse.FileType('w'), default=sys.stdout, On 2016/06/09 21:50:43, M-A Ruel wrote: > 'wb' and 'rb' below Done. https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:162: mode = eval("%s_cmd" % args.mode) On 2016/06/09 21:50:43, M-A Ruel wrote: > mode = getattr(sys.modules[__name__], args.mode + '_cmd') Done. https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/reader.py File client/libs/ar/reader.py (right): https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/reader.p... client/libs/ar/reader.py:38: check = self.check On 2016/06/09 21:50:44, M-A Ruel wrote: > why? Speed. Looking up check in self repeatedly is quite slow. https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/reader.p... client/libs/ar/reader.py:47: if header == '': On 2016/06/09 21:50:44, M-A Ruel wrote: > if not header: Done. https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/reader.p... client/libs/ar/reader.py:53: assert name.startswith("#1/"), name On 2016/06/09 21:50:44, M-A Ruel wrote: > use a real check + raise ValueError() Python -O causes these asserts to totally disappear, which gives roughly a 10-15% speed up of this code. I could catch the assertion error and convert it to ValueError? https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/reader.p... client/libs/ar/reader.py:56: assert_eq(dmodtime, modtime) On 2016/06/09 21:50:44, M-A Ruel wrote: > same See above. https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/writer.py File client/libs/ar/writer.py (right): https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/writer.p... client/libs/ar/writer.py:28: self.obuf.write("#1/%-13s" % str(len(fp))) On 2016/06/09 21:50:44, M-A Ruel wrote: > single quotes everywhere Done. https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/writer.p... client/libs/ar/writer.py:74: except AttributeError: On 2016/06/09 21:50:44, M-A Ruel wrote: > ? Rather than trying to check if ibuf_or_str will work with shutil.copyfileobj, just try it (if it acts enough like a duck for copyfileobj, then we treat it like a duck). If it fails fall back to using the argument as a string.
https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/reader.py File client/libs/ar/reader.py (right): https://codereview.chromium.org/2049523004/diff/20001/client/libs/ar/reader.p... client/libs/ar/reader.py:38: check = self.check On 2016/06/14 12:15:55, mithro wrote: > On 2016/06/09 21:50:44, M-A Ruel wrote: > > why? > > Speed. Looking up check in self repeatedly is quite slow. We do not use python -O in practice so I don't know if it is relevant. I guess we could envision switching if you want to drive this effort. https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/__init__.py File client/libs/ar/__init__.py (right): https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/__init__... client/libs/ar/__init__.py:22: __all__ = ['ArDefaultReader', 'ArWriter', 'ArDefaultWriter'] Ar prefix is not really needed. I don't think we need to export ArWriter. https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/cli.py File client/libs/ar/cli.py (right): https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:41: def create_cmd(filename, dirs, progress, read_ahead=0, verbose=True): remove verbose flag. do not put a default value to read_ahead. https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:52: print(fp, file=sys.stderr) Please use logging https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:71: def list_cmd(filename, progress, check=False): remove default value https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:73: for fp, _, _ in arfile: for fp, _, _ in ArDefaultReader(filename, check): https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:79: filename, progress, verbose=False, blocksize=1024*64, check=False): remove the verbose flag, use logging remove default values https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:87: os.makedirs(os.path.dirname(fp)) you should keep track of the directories that were created in a list and skip these. Then directory creation failure can be a hard fail and you can call os.mkdir(), which is likely (?) faster. https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/cli.py#n... client/libs/ar/cli.py:143: p.add_argument( add this flag as a global flag and call logging.basicConfig(level=) on it. We have wrapping code too in utils/ https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/reader.py File client/libs/ar/reader.py (right): https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/reader.p... client/libs/ar/reader.py:31: self.check = check I'd prefer self._check and self._ibuf https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/reader.p... client/libs/ar/reader.py:63: assert_eq(dmodtime, modtime) +2 https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/reader.p... client/libs/ar/reader.py:79: if read == 0: if not read: https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/reader.p... client/libs/ar/reader.py:84: if data_size % 2 != 0: if data_size % 2: https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/writer.py File client/libs/ar/writer.py (right): https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/writer.p... client/libs/ar/writer.py:12: MODE_HEADER = [] = object() https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/writer.p... client/libs/ar/writer.py:16: self.obuf = obuf I'd prefer self._obuf and _ prefix for others too https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/writer.p... client/libs/ar/writer.py:28: self.obuf.write('#1/%-13s' % str(len(fp))) Can you use struct.pack() like you did with unpack in reader? https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/writer.p... client/libs/ar/writer.py:51: def _write(self, ibuf=None, s=None): I think I'd prefer two functions. This is confusing. https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/writer.p... client/libs/ar/writer.py:65: if self.bytesrequired == 0: if not self._bytesrequired: https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/writer.p... client/libs/ar/writer.py:70: def write(self, ibuf_or_str): I'd prefer a writestr() function and not try to guess, it would be cleaner, more readable and slightly faster. https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/writer.p... client/libs/ar/writer.py:97: modtime=None, ownerid=None, groupid=None, filemod=None): I'd prefer to not expose these arguments at all. Add a # pylint: disable=foo-bar
Description was changed from ========== Tools for working with BSD style ar archives. ar archives are the simplest format that stratifies the following criteria; * Is an existing standard which has tools which ship on standard Linux systems. * Requires no escaping / processing of file contents. * Header can be written without needing to understand the whole file. * Extremely fast to process / generate. Other formats which were consider before selecting the ar format; * tar * zip * cpio * rar * Something protobuf based. There are a couple of drawbacks; * The ar format doesn't support special files or symlinks. * The Linux ar tool don't extract with directory paths (listing is fine). The go version is in https://codereview.chromium.org/2043623002 * https://github.com/luci/luci-go/issues/9 * https://crbug.com/598990 BUG=598990 ========== to ========== luci-py: Tools for working with BSD style ar archives. ar archives are the simplest format that satisfies the following criteria; * Is an existing standard which has tools which ship on standard Linux systems. * Requires no escaping / processing of file contents. * Header can be written without needing to understand the whole file. * Extremely fast to process / generate. Other formats which were consider before selecting the ar format; * tar * zip * cpio * rar * Something protobuf based. There are a couple of drawbacks; * The ar format doesn't support special files or symlinks. * The Linux ar tool don't extract with directory paths (listing is fine). The go version is in https://codereview.chromium.org/2043623002 * https://github.com/luci/luci-go/issues/9 * https://crbug.com/598990 BUG=598990 ==========
Hi Marc, I've spent a bunch of time today reworking the API of this code to closer match that of the tarfile/zipfile API. I've also added a whole bunch of tests for the code. I still have some of your other comments to look into, but wanted to send this out before going home. Tim 'mithro' Ansell https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/__init__.py File client/libs/ar/__init__.py (right): https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/__init__... client/libs/ar/__init__.py:22: __all__ = ['ArDefaultReader', 'ArWriter', 'ArDefaultWriter'] On 2016/06/14 13:26:16, M-A Ruel wrote: > Ar prefix is not really needed. I don't think we need to export ArWriter. This duplication is pretty normal in Python (IE tarfile.TarInfo) as you often do a "from ar import ArWriter" https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/reader.py File client/libs/ar/reader.py (right): https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/reader.p... client/libs/ar/reader.py:31: self.check = check On 2016/06/14 13:26:17, M-A Ruel wrote: > I'd prefer self._check and self._ibuf Obsolete. https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/reader.p... client/libs/ar/reader.py:63: assert_eq(dmodtime, modtime) On 2016/06/14 13:26:16, M-A Ruel wrote: > +2 Obsolete. https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/writer.py File client/libs/ar/writer.py (right): https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/writer.p... client/libs/ar/writer.py:12: MODE_HEADER = [] On 2016/06/14 13:26:17, M-A Ruel wrote: > = object() Obsolete. https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/writer.p... client/libs/ar/writer.py:16: self.obuf = obuf On 2016/06/14 13:26:17, M-A Ruel wrote: > I'd prefer self._obuf and _ prefix for others too Obsolete. https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/writer.p... client/libs/ar/writer.py:28: self.obuf.write('#1/%-13s' % str(len(fp))) On 2016/06/14 13:26:17, M-A Ruel wrote: > Can you use struct.pack() like you did with unpack in reader? No stuct.pack doesn't do this style of output (left aligned integers). https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/writer.p... client/libs/ar/writer.py:51: def _write(self, ibuf=None, s=None): On 2016/06/14 13:26:17, M-A Ruel wrote: > I think I'd prefer two functions. This is confusing. Obsolete. https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/writer.p... client/libs/ar/writer.py:65: if self.bytesrequired == 0: On 2016/06/14 13:26:17, M-A Ruel wrote: > if not self._bytesrequired: Obsolete. https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/writer.p... client/libs/ar/writer.py:70: def write(self, ibuf_or_str): On 2016/06/14 13:26:17, M-A Ruel wrote: > I'd prefer a writestr() function and not try to guess, it would be cleaner, more > readable and slightly faster. Obsolete. https://codereview.chromium.org/2049523004/diff/40001/client/libs/ar/writer.p... client/libs/ar/writer.py:97: modtime=None, ownerid=None, groupid=None, filemod=None): On 2016/06/14 13:26:17, M-A Ruel wrote: > I'd prefer to not expose these arguments at all. Add a # pylint: > disable=foo-bar Obsolete.
looks good, mostly styling nits, generally fine with the design. https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/__in... File client/libs/arfile/__init__.py (right): https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/__in... client/libs/arfile/__init__.py:9: 'open', 'is_arfile', sort https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/__in... client/libs/arfile/__init__.py:12: ] align at +0 https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/arfi... File client/libs/arfile/arfile.py (right): https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/arfi... client/libs/arfile/arfile.py:11: from collections import namedtuple we don't do that according to the style guide https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/arfi... client/libs/arfile/arfile.py:31: 'format', 'name', 'size', 'mtime', 'uid', 'gid', 'mode']) use +4 https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/arfi... client/libs/arfile/arfile.py:120: else: no need for else: https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/arfi... client/libs/arfile/arfile.py:210: cls._format(path, arformat), path, size, align arguments at +4 everywhere https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/arfi... client/libs/arfile/arfile.py:319: return file(name, 'rb').read(len(AR_MAGIC_START)) == AR_MAGIC_START with a with statement https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/arfi... client/libs/arfile/arfile.py:339: else: no need for else https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/cli.py File client/libs/arfile/cli.py (right): https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/cli.... client/libs/arfile/cli.py:43: def create_cmd(filename, dirs, progress, read_ahead=0, verbose=True): remove all default argument, they are not needed https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/cli.... client/libs/arfile/cli.py:50: for fn in sorted(filenames): filenames.sort() would be faster? https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/cli.... client/libs/arfile/cli.py:71: afw.close() try / finally? https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/cli.... client/libs/arfile/cli.py:139: for p in parser_create, parser_extract: Please at it at global level instead, and use module logging instead of passing it it. https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/cli.... client/libs/arfile/cli.py:155: mode(**args.__dict__) return mode(**args.__dict__) https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/cli.... client/libs/arfile/cli.py:159: main("artool", sys.argv[1:]) sys.exit(main("artool", sys.argv[1:]))
Hi Marc, I've fixed everything except your requested change around verbose. The artool command is suppose to behave closer to the tar command rather than other Google tools. Maybe the flag name should be changed to "--list" or something to better indicate what it is used for? Tim 'mithro' Ansell https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/__in... File client/libs/arfile/__init__.py (right): https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/__in... client/libs/arfile/__init__.py:9: 'open', 'is_arfile', On 2016/06/16 14:57:05, M-A Ruel wrote: > sort Done. https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/__in... client/libs/arfile/__init__.py:12: ] On 2016/06/16 14:57:05, M-A Ruel wrote: > align at +0 Done. https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/arfi... File client/libs/arfile/arfile.py (right): https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/arfi... client/libs/arfile/arfile.py:11: from collections import namedtuple On 2016/06/16 14:57:05, M-A Ruel wrote: > we don't do that according to the style guide Don't do what? I checked the CODING_STYLE.md and think you are referring to "import modules not symbols"? Turns out StringIO wasn't used in this file, so I removed it. I also fixed the import order. https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/arfi... client/libs/arfile/arfile.py:31: 'format', 'name', 'size', 'mtime', 'uid', 'gid', 'mode']) On 2016/06/16 14:57:05, M-A Ruel wrote: > use +4 Done. https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/arfi... client/libs/arfile/arfile.py:120: else: On 2016/06/16 14:57:05, M-A Ruel wrote: > no need for else: Done. https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/arfi... client/libs/arfile/arfile.py:210: cls._format(path, arformat), path, size, On 2016/06/16 14:57:05, M-A Ruel wrote: > align arguments at +4 everywhere I think I've gotten them all. Any idea why your pylint configuration isn't picking these up? https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/arfi... client/libs/arfile/arfile.py:319: return file(name, 'rb').read(len(AR_MAGIC_START)) == AR_MAGIC_START On 2016/06/16 14:57:05, M-A Ruel wrote: > with a with statement Done. https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/arfi... client/libs/arfile/arfile.py:339: else: On 2016/06/16 14:57:05, M-A Ruel wrote: > no need for else Done. https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/cli.py File client/libs/arfile/cli.py (right): https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/cli.... client/libs/arfile/cli.py:43: def create_cmd(filename, dirs, progress, read_ahead=0, verbose=True): On 2016/06/16 14:57:05, M-A Ruel wrote: > remove all default argument, they are not needed Done. https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/cli.... client/libs/arfile/cli.py:50: for fn in sorted(filenames): On 2016/06/16 14:57:05, M-A Ruel wrote: > filenames.sort() would be faster? Done. https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/cli.... client/libs/arfile/cli.py:71: afw.close() On 2016/06/16 14:57:05, M-A Ruel wrote: > try / finally? Done. https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/cli.... client/libs/arfile/cli.py:139: for p in parser_create, parser_extract: On 2016/06/16 14:57:05, M-A Ruel wrote: > Please at it at global level instead, and use module logging instead of passing > it it. The idea behind verbose mode for create and extract is that they provide something you can diff against the list output. artool list --file abc.ar > output1 artool extract --file abc.ar --verbose > output2 diff output1 output2 It should also be useful to get the number of files that were extracted / added; ar create --file abc.ar --verbose 2>&1 | wc -l ar extract --file abc.ar --verbose 2>&1 | wc -l Both these are common patterns when dealing with tar archives. https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/cli.... client/libs/arfile/cli.py:155: mode(**args.__dict__) On 2016/06/16 14:57:05, M-A Ruel wrote: > return mode(**args.__dict__) Done. https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/cli.... client/libs/arfile/cli.py:159: main("artool", sys.argv[1:]) On 2016/06/16 14:57:05, M-A Ruel wrote: > sys.exit(main("artool", sys.argv[1:])) Done.
lgtm with nits https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/cli.py File client/libs/arfile/cli.py (right): https://codereview.chromium.org/2049523004/diff/60001/client/libs/arfile/cli.... client/libs/arfile/cli.py:139: for p in parser_create, parser_extract: On 2016/06/17 06:44:50, mithro wrote: > On 2016/06/16 14:57:05, M-A Ruel wrote: > > Please at it at global level instead, and use module logging instead of > passing > > it it. > > The idea behind verbose mode for create and extract is that they provide > something you can diff against the list output. > > artool list --file abc.ar > output1 > artool extract --file abc.ar --verbose > output2 > diff output1 output2 > > It should also be useful to get the number of files that were extracted / added; > > ar create --file abc.ar --verbose 2>&1 | wc -l > ar extract --file abc.ar --verbose 2>&1 | wc -l > > Both these are common patterns when dealing with tar archives. Ok I didn't realize you tried to mimic an existing tool. https://codereview.chromium.org/2049523004/diff/80001/client/libs/arfile/cli.py File client/libs/arfile/cli.py (right): https://codereview.chromium.org/2049523004/diff/80001/client/libs/arfile/cli.... client/libs/arfile/cli.py:36: t, self.filecount, self.filecount/t), file=sys.stderr) +4 https://codereview.chromium.org/2049523004/diff/80001/client/libs/arfile/test... File client/libs/arfile/tests.py (right): https://codereview.chromium.org/2049523004/diff/80001/client/libs/arfile/test... client/libs/arfile/tests.py:265: consistent spacing https://codereview.chromium.org/2049523004/diff/80001/client/libs/arfile/test... client/libs/arfile/tests.py:457: self.assertCLI( you know this fits one line, right? :) https://codereview.chromium.org/2049523004/diff/80001/client/libs/arfile/test... client/libs/arfile/tests.py:460: 2 empty lines consistently https://codereview.chromium.org/2049523004/diff/80001/client/libs/arfile/test... client/libs/arfile/tests.py:463: subprocess.call('python arfile.py', shell=True) I don't understand why this is not a test case
The CQ bit was checked by tansell@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2049523004/140001
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: Luci-py Presubmit on master.tryserver.infra (JOB_FAILED, https://build.chromium.org/p/tryserver.infra/builders/Luci-py%20Presubmit/bui...)
I've also; * Added a bunch of unicode unittests * Changed to using io module rather than StringIO. * Made sure the tests are run as part of the presubmit. https://codereview.chromium.org/2049523004/diff/80001/client/libs/arfile/cli.py File client/libs/arfile/cli.py (right): https://codereview.chromium.org/2049523004/diff/80001/client/libs/arfile/cli.... client/libs/arfile/cli.py:36: t, self.filecount, self.filecount/t), file=sys.stderr) On 2016/06/17 13:26:49, M-A Ruel wrote: > +4 Done. https://codereview.chromium.org/2049523004/diff/80001/client/libs/arfile/test... File client/libs/arfile/tests.py (right): https://codereview.chromium.org/2049523004/diff/80001/client/libs/arfile/test... client/libs/arfile/tests.py:457: self.assertCLI( On 2016/06/17 13:26:49, M-A Ruel wrote: > you know this fits one line, right? :) Just making all the functions have the same formatting. https://codereview.chromium.org/2049523004/diff/80001/client/libs/arfile/test... client/libs/arfile/tests.py:460: On 2016/06/17 13:26:49, M-A Ruel wrote: > 2 empty lines consistently Done. https://codereview.chromium.org/2049523004/diff/80001/client/libs/arfile/test... client/libs/arfile/tests.py:463: subprocess.call('python arfile.py', shell=True) On 2016/06/17 13:26:49, M-A Ruel wrote: > I don't understand why this is not a test case This is running the doctests in the arfile.py
The CQ bit was checked by tansell@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2049523004/180001
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
https://codereview.chromium.org/2049523004/diff/180001/client/PRESUBMIT.py File client/PRESUBMIT.py (right): https://codereview.chromium.org/2049523004/diff/180001/client/PRESUBMIT.py#ne... client/PRESUBMIT.py:49: r'.*tests\.py$', Please rename test instead of creating a one-off https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... File client/libs/arfile/arfile.py (right): https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:26: _ArFormatSimpleForbidden = " " use single quote thoroughly. I don't think it's worth using a named variable to disallow whitespaces. This just makes line 73 harder to read than needed. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:29: 'format', 'name', 'size', 'mtime', 'uid', 'gid', 'mode']) please add empty lines between. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:37: def _format(path, arformat=None): why use a default argument? All call sites specify it. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:62: if arformat is None: replace all with if not arformat: https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:118: if self.format is AR_FORMAT_SIMPLE: s/is/==/ where relevant, otherwise you would have bad surprises https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:125: def fromfileobj(cls, fileobj, fullparse=True): no need for =True, all call sites specify fullparse. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:181: def frompath(cls, path, arformat=None): This function is never called, remove unless you plan to use it? Then unit test but I prefer removal. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:216: """Create a string buffer from a TarInfo object.""" s/string/str/ https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:263: if ai is None: if not ai: https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:272: if read == 0: if not read: https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:289: Will close the output buffer.""" trailing """ on multiline docstring on its own line. But I'd merge the two lines; """Closes the archive and input buffer.""" because it's not output. Personally I don't like that this class takes ownership of the lifetime of the input file object. I don't like it but I see that's the normal paradigm in python, unlike (generally) in Go. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:301: if not fileobj and arinfo.size != 0: if not fileobj and arinfo.size: https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/cli.py File client/libs/arfile/cli.py (right): https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/cli... client/libs/arfile/cli.py:119: help="Amount of data to read-ahead before doing a stat.") use single quotes thoroughly. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/tes... File client/libs/arfile/tests.py (right): https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/tes... client/libs/arfile/tests.py:1: #!/usr/bin/env python please rename it to ar_test.py https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/tes... client/libs/arfile/tests.py:555: self.assertCLI( please move these that fits 80 cols as one line, not 3. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/tes... client/libs/arfile/tests.py:591: import doctest please import at file level at the top
https://codereview.chromium.org/2049523004/diff/180001/client/PRESUBMIT.py File client/PRESUBMIT.py (right): https://codereview.chromium.org/2049523004/diff/180001/client/PRESUBMIT.py#ne... client/PRESUBMIT.py:49: r'.*tests\.py$', On 2016/06/22 14:21:31, M-A Ruel wrote: > Please rename test instead of creating a one-off Done. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... File client/libs/arfile/arfile.py (right): https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:26: _ArFormatSimpleForbidden = " " On 2016/06/22 14:21:31, M-A Ruel wrote: > use single quote thoroughly. Done. > I don't think it's worth using a named variable to disallow whitespaces. This > just makes line 73 harder to read than needed. Removed. Originally there were more forbidden characters. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:29: 'format', 'name', 'size', 'mtime', 'uid', 'gid', 'mode']) On 2016/06/22 14:21:32, M-A Ruel wrote: > please add empty lines between. Done. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:37: def _format(path, arformat=None): On 2016/06/22 14:21:31, M-A Ruel wrote: > why use a default argument? All call sites specify it. Done. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:62: if arformat is None: On 2016/06/22 14:21:32, M-A Ruel wrote: > replace all with > if not arformat: Done. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:118: if self.format is AR_FORMAT_SIMPLE: On 2016/06/22 14:21:31, M-A Ruel wrote: > s/is/==/ > where relevant, otherwise you would have bad surprises I actually want an *is* here. I changed the types of AR_FORMAT_XXXX objects to make that clearer. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:125: def fromfileobj(cls, fileobj, fullparse=True): On 2016/06/22 14:21:31, M-A Ruel wrote: > no need for =True, all call sites specify fullparse. I'd prefer to keep it here as it is a public interface. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:181: def frompath(cls, path, arformat=None): On 2016/06/22 14:21:32, M-A Ruel wrote: > This function is never called, remove unless you plan to use it? Then unit test > but I prefer removal. I'd like to keep it, so I added it to cli and a unittest. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:216: """Create a string buffer from a TarInfo object.""" On 2016/06/22 14:21:31, M-A Ruel wrote: > s/string/str/ Actually the whole comment is wrong. Replaced with; """Write an ArInfo object to file like object.""" https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:263: if ai is None: On 2016/06/22 14:21:32, M-A Ruel wrote: > if not ai: Done. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:272: if read == 0: On 2016/06/22 14:21:31, M-A Ruel wrote: > if not read: Done. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:289: Will close the output buffer.""" On 2016/06/22 14:21:32, M-A Ruel wrote: > trailing """ on multiline docstring on its own line. But I'd merge the two > lines; > > """Closes the archive and input buffer.""" > > because it's not output. > > Personally I don't like that this class takes ownership of the lifetime of the > input file object. I don't like it but I see that's the normal paradigm in > python, unlike (generally) in Go. Done. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/arf... client/libs/arfile/arfile.py:301: if not fileobj and arinfo.size != 0: On 2016/06/22 14:21:32, M-A Ruel wrote: > if not fileobj and arinfo.size: Done. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/cli.py File client/libs/arfile/cli.py (right): https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/cli... client/libs/arfile/cli.py:119: help="Amount of data to read-ahead before doing a stat.") On 2016/06/22 14:21:32, M-A Ruel wrote: > use single quotes thoroughly. Done. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/tes... File client/libs/arfile/tests.py (right): https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/tes... client/libs/arfile/tests.py:1: #!/usr/bin/env python On 2016/06/22 14:21:32, M-A Ruel wrote: > please rename it to ar_test.py Done. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/tes... client/libs/arfile/tests.py:555: self.assertCLI( On 2016/06/22 14:21:32, M-A Ruel wrote: > please move these that fits 80 cols as one line, not 3. Done. https://codereview.chromium.org/2049523004/diff/180001/client/libs/arfile/tes... client/libs/arfile/tests.py:591: import doctest On 2016/06/22 14:21:32, M-A Ruel wrote: > please import at file level at the top Done.
Patchset #11 (id:200001) has been deleted
Patchset #11 (id:220001) has been deleted
lgtm with one issue https://codereview.chromium.org/2049523004/diff/240001/client/libs/arfile/arf... File client/libs/arfile/arfile.py (right): https://codereview.chromium.org/2049523004/diff/240001/client/libs/arfile/arf... client/libs/arfile/arfile.py:16: AR_FORMAT_SIMPLE = ['Simple Format'] IMHO it's worse because now it is mutable. Please just use string.
The CQ bit was checked by tansell@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: Luci-py Presubmit on luci.infra.try (JOB_FAILED, https://luci-milo.appspot.com/swarming/task/304a94e4c44f1810)
The CQ bit was checked by tansell@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by tansell@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by tansell@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from maruel@chromium.org Link to the patchset: https://codereview.chromium.org/2049523004/#ps320001 (title: "Small unicode fix.")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by tansell@chromium.org
The CQ bit was checked by tansell@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: Luci-py Presubmit on luci.infra.try (JOB_FAILED, https://luci-milo.appspot.com/swarming/task/304ac589adec0e10)
The CQ bit was checked by tansell@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from maruel@chromium.org Link to the patchset: https://codereview.chromium.org/2049523004/#ps340001 (title: "Small rebase.")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by tansell@chromium.org
The CQ bit was checked by tansell@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by tansell@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from maruel@chromium.org Link to the patchset: https://codereview.chromium.org/2049523004/#ps360001 (title: "Actually fix the unicode problem properly.")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Description was changed from ========== luci-py: Tools for working with BSD style ar archives. ar archives are the simplest format that satisfies the following criteria; * Is an existing standard which has tools which ship on standard Linux systems. * Requires no escaping / processing of file contents. * Header can be written without needing to understand the whole file. * Extremely fast to process / generate. Other formats which were consider before selecting the ar format; * tar * zip * cpio * rar * Something protobuf based. There are a couple of drawbacks; * The ar format doesn't support special files or symlinks. * The Linux ar tool don't extract with directory paths (listing is fine). The go version is in https://codereview.chromium.org/2043623002 * https://github.com/luci/luci-go/issues/9 * https://crbug.com/598990 BUG=598990 ========== to ========== luci-py: Tools for working with BSD style ar archives. ar archives are the simplest format that satisfies the following criteria; * Is an existing standard which has tools which ship on standard Linux systems. * Requires no escaping / processing of file contents. * Header can be written without needing to understand the whole file. * Extremely fast to process / generate. Other formats which were consider before selecting the ar format; * tar * zip * cpio * rar * Something protobuf based. There are a couple of drawbacks; * The ar format doesn't support special files or symlinks. * The Linux ar tool don't extract with directory paths (listing is fine). The go version is in https://codereview.chromium.org/2043623002 * https://github.com/luci/luci-go/issues/9 * https://crbug.com/598990 BUG=598990 Committed: https://github.com/luci/luci-py/commit/6952ff7c2a890381a9745558f1aeb5fa499e727b ==========
Message was sent while issue was closed.
Committed patchset #17 (id:360001) as https://github.com/luci/luci-py/commit/6952ff7c2a890381a9745558f1aeb5fa499e727b |