|
|
Descriptionluci-go: Tools for working with BSD style ar archives.
ar archives are the simplest format that satisfies the following criteria;
* Is an existing standard which has tools which ship on standard Linux
systems.
* Requires no escaping / processing of file contents.
* Header can be written without needing to understand the whole file.
* Extremely fast to process / generate.
Other formats which were consider before selecting the ar format;
* tar
* zip
* cpio
* rar
* Something protobuf based.
On a z620, these ar versions are ~5 times faster than the archive/tar
equivalent which ships with go.
There are a couple of drawbacks;
* The ar format doesn't support special files or symlinks.
* The Linux ar tool don't extract with directory paths (list fine).
The Python version of this code is at https://codereview.chromium.org/2049523004
This was split out of https://codereview.chromium.org/2014243002
* https://github.com/luci/luci-go/issues/9
* https://crbug.com/598990
BUG=598990
Patch Set 1 #
Total comments: 24
Patch Set 2 : Fixing maruel's comments. #Patch Set 3 : Fixes. #
Total comments: 42
Patch Set 4 : Fixing for maruel's review. #
Total comments: 74
Patch Set 5 : Fixing for maruel's review + reader rewrite. #Patch Set 6 : Latest fixes. #
Total comments: 17
Patch Set 7 : Lots of fixes. #Patch Set 8 : Small update to a comment. #
Total comments: 11
Messages
Total messages: 25 (9 generated)
tansell@chromium.org changed reviewers: + maruel@google.com
Hi Marc, This is the first CL of three CLs being split out of the WIP change at https://codereview.chromium.org/2014243002 As I'm new to go programming, I'd appreciate make sure that I'm following normal go practices. I've based a lot of the structure on the archive/tar code in the go standard library. Are there any other people who should be added as reviewers? Tim 'mithro' Ansell
tandrii@google.com changed reviewers: + tandrii@google.com
You can add me(tandrii@) for these reviews as well. Before I go for full review here, can you answer general question on your WIP CIL https://codereview.chromium.org/2014243002 ?
On 2016/06/06 at 08:38:01, tandrii wrote: > You can add me(tandrii@) for these reviews as well. > Before I go for full review here, can you answer general question on your WIP CIL https://codereview.chromium.org/2014243002 ? Replied to your comments on the WIP patch and added some more context. Will upload the other two parts tomorrow (as it's almost 3am here). Tim 'mithro' Ansell
maruel@chromium.org changed reviewers: + maruel@chromium.org
Just general styling comment, will do a more careful review a bit later. https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/ar_test.go File common/archive/ar/ar_test.go (right): https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/ar_test.g... common/archive/ar/ar_test.go:46: b := bytes.NewBufferString("") b := &bytes.Buffer{} https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/ar_test.g... common/archive/ar/ar_test.go:50: err := ar.Add("filename1", data) if err := ar.Add("filename1", data); err != nil { https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/ar_test.g... common/archive/ar/ar_test.go:55: if err != nil { if err := ar.Close(); err != nil { https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/ar_test.g... common/archive/ar/ar_test.go:76: if g, e := h.Name(), "filename1"; g != e { FTR, you can use: ut.AssertEqual(t, "filename1", h.Name()) https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/ar_test.g... common/archive/ar/ar_test.go:105: _, err := exec.LookPath("ar") you can merge here as one line too and a few other below. https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/reader.go File common/archive/ar/reader.go (right): https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/reader.go... common/archive/ar/reader.go:6: * Read an ar archive file. Do not create 2 package comments https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/reader.go... common/archive/ar/reader.go:31: func (fi arFileInfo) Name() string { return fi.name } use pointer methods otherwise this copies the struct each time. https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/reader.go... common/archive/ar/reader.go:50: READ_BODY = iota just iota for the first https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/writer.go File common/archive/ar/writer.go (right): https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/writer.go... common/archive/ar/writer.go:6: * Write an ar archive file. If you want to have a comment for the file, use the following form: // Write an ar achive file. package ar import ( ... the empty line between the commend and the "package ar" line is important.
Patchset #2 (id:20001) has been deleted
Hi Maruel, I've made the changes you requested. Tim 'mithro' Ansell https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/ar_test.go File common/archive/ar/ar_test.go (right): https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/ar_test.g... common/archive/ar/ar_test.go:46: b := bytes.NewBufferString("") On 2016/06/06 at 23:01:05, M-A Ruel wrote: > b := &bytes.Buffer{} Done. https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/ar_test.g... common/archive/ar/ar_test.go:50: err := ar.Add("filename1", data) On 2016/06/06 at 23:01:05, M-A Ruel wrote: > if err := ar.Add("filename1", data); err != nil { Done. https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/ar_test.g... common/archive/ar/ar_test.go:55: if err != nil { On 2016/06/06 at 23:01:05, M-A Ruel wrote: > if err := ar.Close(); err != nil { Done. https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/ar_test.g... common/archive/ar/ar_test.go:76: if g, e := h.Name(), "filename1"; g != e { On 2016/06/06 at 23:01:05, M-A Ruel wrote: > FTR, you can use: > > ut.AssertEqual(t, "filename1", h.Name()) It seems that ut is your module? github.com/maruel/ut? https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/ar_test.g... common/archive/ar/ar_test.go:105: _, err := exec.LookPath("ar") On 2016/06/06 at 23:01:05, M-A Ruel wrote: > you can merge here as one line too and a few other below. I've merged the ones which just return err, but it seems you can't do this if you need to use the other argument? https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/reader.go File common/archive/ar/reader.go (right): https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/reader.go... common/archive/ar/reader.go:6: * Read an ar archive file. On 2016/06/06 at 23:01:05, M-A Ruel wrote: > Do not create 2 package comments So this comment is somehow associated with the package? https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/reader.go... common/archive/ar/reader.go:31: func (fi arFileInfo) Name() string { return fi.name } On 2016/06/06 at 23:01:05, M-A Ruel wrote: > use pointer methods otherwise this copies the struct each time. Done. https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/reader.go... common/archive/ar/reader.go:50: READ_BODY = iota On 2016/06/06 at 23:01:05, M-A Ruel wrote: > just iota for the first Any idea why gofmt doesn't pick that one up? https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/writer.go File common/archive/ar/writer.go (right): https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/writer.go... common/archive/ar/writer.go:6: * Write an ar archive file. On 2016/06/06 at 23:01:05, M-A Ruel wrote: > If you want to have a comment for the file, use the following form: > > // Write an ar achive file. > > package ar > > import ( > ... > > the empty line between the commend and the "package ar" line is important. Done.
Description was changed from ========== Tools for working with BSD style ar archives. ar archives are the simplest format that stratifies the following criteria; * Is an existing standard which has tools which ship on standard Linux systems. * Requires no escaping / processing of file contents. * Header can be written without needing to understand the whole file. * Extremely fast to process / generate. Other formats which were consider before selecting the ar format; * tar * zip * cpio * rar * Something protobuf based. On a z620, these ar versions are ~5 times faster than the archive/tar equivalent which ships with go. There are a couple of drawbacks; * The ar format doesn't support special files or symlinks. * The Linux ar tool don't extract with directory paths (list fine). This was split out of https://codereview.chromium.org/2014243002 * https://github.com/luci/luci-go/issues/9 * https://crbug.com/598990 BUG=598990 ========== to ========== Tools for working with BSD style ar archives. ar archives are the simplest format that stratifies the following criteria; * Is an existing standard which has tools which ship on standard Linux systems. * Requires no escaping / processing of file contents. * Header can be written without needing to understand the whole file. * Extremely fast to process / generate. Other formats which were consider before selecting the ar format; * tar * zip * cpio * rar * Something protobuf based. On a z620, these ar versions are ~5 times faster than the archive/tar equivalent which ships with go. There are a couple of drawbacks; * The ar format doesn't support special files or symlinks. * The Linux ar tool don't extract with directory paths (list fine). The Python version of this code is at https://codereview.chromium.org/2049523004 This was split out of https://codereview.chromium.org/2014243002 * https://github.com/luci/luci-go/issues/9 * https://crbug.com/598990 BUG=598990 ==========
https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/ar_test.go File common/archive/ar/ar_test.go (right): https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/ar_test.g... common/archive/ar/ar_test.go:76: if g, e := h.Name(), "filename1"; g != e { On 2016/06/07 12:28:52, mithro wrote: > On 2016/06/06 at 23:01:05, M-A Ruel wrote: > > FTR, you can use: > > > > ut.AssertEqual(t, "filename1", h.Name()) > > It seems that ut is your module? github.com/maruel/ut? Yes, goimports will find it automatically, it's really just up to you. lots of devs uses goconvey but I'm not personally a fan. https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/reader.go File common/archive/ar/reader.go (right): https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/reader.go... common/archive/ar/reader.go:50: READ_BODY = iota On 2016/06/07 12:28:52, mithro wrote: > On 2016/06/06 at 23:01:05, M-A Ruel wrote: > > just iota for the first > > Any idea why gofmt doesn't pick that one up? I mean you can remove the complete "= iota". It's not gofmt's job to understand that. https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... File common/archive/ar/reader.go (right): https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:45: type ReaderStage uint this one doesn't look like it needs to be exported, same for lines 48-50 https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:61: reader := Reader{r: r, bytesrequired: 0, needspadding: false} reader := &Reader{r: r} https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:65: } else { remove the else, just close the bracket. https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:70: func (ar *Reader) checkBytes(name string, str []byte) error { the name: prefix can be added at call site. https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:78: if count != len(buffer) { it's a guarantee of ReadFull() https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:100: //ar.r.Close() remove, generally it's up to the caller to do it. Still, ar.r = nil https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:105: func (ar *Reader) readBytes(numbytes int64) error { I don't understand this function https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:127: // Check you can write bytes to the ar at this moment. s/write/read/ https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:131: return errors.New("Usage error, need to read header first.") all errors should start with a lower case https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:151: err := ar.checkRead() inline the function since it's only used once. https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:272: func (ar *Reader) Next() (*arFileInfo, error) { don't return a private type on a public function; you can create a FileInfo interface if you want; type FileInfo interface { os.FileInfo .. extra functions } https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/write... File common/archive/ar/writer.go (right): https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/write... common/archive/ar/writer.go:21: type WriterStage uint doesn't need to be exported https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/write... common/archive/ar/writer.go:26: WRITE_CLOSED There's no need for _CLOSED, it's closed when wr.w is nil. Same for reader https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/write... common/archive/ar/writer.go:39: return &Writer{w: w, stage: WRITE_HEADER, bytesrequired: 0, needspadding: false} return &Writer{w: w, state: writeHeader} https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/write... common/archive/ar/writer.go:51: panic(fmt.Sprintf("Unknown writer mode: %d", aw.stage)) don't panic here https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/write... common/archive/ar/writer.go:53: //aw.w.Close() set it to nil instead https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/write... common/archive/ar/writer.go:60: panic(fmt.Sprintf("To much data written! Needed %d, got %d", aw.bytesrequired, numbytes)) return an error use lower case text https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/write... common/archive/ar/writer.go:81: return errors.New("Usage error, need to write header first.") if there's no file in written to it, it should just close the archive to be valid. https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/write... common/archive/ar/writer.go:123: panic(fmt.Sprintf("err while copying (%s), archive is probably corrupted!", err)) don't panic (everywhere) https://codereview.chromium.org/2043623002/diff/60001/common/archive/doc.go File common/archive/doc.go (right): https://codereview.chromium.org/2043623002/diff/60001/common/archive/doc.go#n... common/archive/doc.go:5: // Package archive is for readers/writers of archive file formats like ar, tar, I don't think it's worth creating this layer as I don't think there will be support for other formats.
Hi Maruel, I think I got all your comments. I'm going to wait to submit this change until the things which actually use it are also ready. Tim 'mithro' Ansell https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/ar_test.go File common/archive/ar/ar_test.go (right): https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/ar_test.g... common/archive/ar/ar_test.go:76: if g, e := h.Name(), "filename1"; g != e { On 2016/06/08 at 20:25:31, M-A Ruel wrote: > On 2016/06/07 12:28:52, mithro wrote: > > On 2016/06/06 at 23:01:05, M-A Ruel wrote: > > > FTR, you can use: > > > > > > ut.AssertEqual(t, "filename1", h.Name()) > > > > It seems that ut is your module? github.com/maruel/ut? I was surprised there is no "unittest" like library which ships with go... > Yes, goimports will find it automatically, it's really just up to you. lots of devs uses goconvey but I'm not personally a fan. I'm not sure what you mean by "find it automatically" - I've not used goimports before? At first it seems I ended up with the "Google3" version which rewrote all my import statements incorrectly. Eventually was able to install it via `go get -u golang.org/x/tools/cmd/goimports/...` https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... File common/archive/ar/reader.go (right): https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:45: type ReaderStage uint On 2016/06/08 at 20:25:31, M-A Ruel wrote: > this one doesn't look like it needs to be exported, same for lines 48-50 Done. https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:61: reader := Reader{r: r, bytesrequired: 0, needspadding: false} On 2016/06/08 at 20:25:31, M-A Ruel wrote: > reader := &Reader{r: r} Done. https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:65: } else { On 2016/06/08 at 20:25:31, M-A Ruel wrote: > remove the else, just close the bracket. Done. https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:70: func (ar *Reader) checkBytes(name string, str []byte) error { On 2016/06/08 at 20:25:31, M-A Ruel wrote: > the name: prefix can be added at call site. I originally had it that way, but it seems to be cleaner/shorter this way. https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:78: if count != len(buffer) { On 2016/06/08 at 20:25:31, M-A Ruel wrote: > it's a guarantee of ReadFull() Looks like you are right, I thought that if io.ReadFull is asked for 20 bytes but there is only one byte left in the io.Reader then count would be 1. Removed. https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:100: //ar.r.Close() On 2016/06/08 at 20:25:31, M-A Ruel wrote: > remove, generally it's up to the caller to do it. Still, > > ar.r = nil Done. https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:105: func (ar *Reader) readBytes(numbytes int64) error { On 2016/06/08 at 20:25:31, M-A Ruel wrote: > I don't understand this function This function is called after reading a number of bytes and does all the cleanup like updating the bytes required, reading any extra padding and going back to the header mode. At the moment we only have a `func (ar *Reader) Read(b []byte) (n int, err error) {` but it likely in the future we might want a version which Reads into an io.Writer object which will also need to call this function. https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:127: // Check you can write bytes to the ar at this moment. On 2016/06/08 at 20:25:31, M-A Ruel wrote: > s/write/read/ Done. (Plus changed the name slightly.) https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:131: return errors.New("Usage error, need to read header first.") On 2016/06/08 at 20:25:31, M-A Ruel wrote: > all errors should start with a lower case Done. https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:151: err := ar.checkRead() On 2016/06/08 at 20:25:31, M-A Ruel wrote: > inline the function since it's only used once. Done. https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:272: func (ar *Reader) Next() (*arFileInfo, error) { On 2016/06/08 at 20:25:31, M-A Ruel wrote: > don't return a private type on a public function; you can create a FileInfo interface if you want; > > type FileInfo interface { > os.FileInfo > .. extra functions > } Done. https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/write... File common/archive/ar/writer.go (right): https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/write... common/archive/ar/writer.go:21: type WriterStage uint On 2016/06/08 at 20:25:31, M-A Ruel wrote: > doesn't need to be exported Done. https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/write... common/archive/ar/writer.go:26: WRITE_CLOSED On 2016/06/08 at 20:25:32, M-A Ruel wrote: > There's no need for _CLOSED, it's closed when wr.w is nil. Same for reader I removed the WRITE_CLOSED but found that it made the state checking switch statement ugly, it goes from ``` switch aw.stage { case WRITE_HEADER: // Good case WRITE_BODY: return errors.New("Usage error, writing a file.") case WRITE_CLOSED: return errors.New("Usage error, archive already closed.") default: panic(fmt.Sprintf("Unknown writer mode: %d", aw.stage)) } ``` to ``` switch aw.stage { case WRITE_HEADER: // Good case WRITE_BODY: return errors.New("Usage error, writing a file.") default: panic(fmt.Sprintf("Unknown writer mode: %d", aw.stage)) } if aw.r == nil { return errors.New("Usage error, archive already closed.") } ``` https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/write... common/archive/ar/writer.go:39: return &Writer{w: w, stage: WRITE_HEADER, bytesrequired: 0, needspadding: false} On 2016/06/08 at 20:25:32, M-A Ruel wrote: > return &Writer{w: w, state: writeHeader} Done. https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/write... common/archive/ar/writer.go:51: panic(fmt.Sprintf("Unknown writer mode: %d", aw.stage)) On 2016/06/08 at 20:25:31, M-A Ruel wrote: > don't panic here Go doesn't seem to guarantee safety here? I did change it to a log.Fatalf... https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/write... common/archive/ar/writer.go:53: //aw.w.Close() On 2016/06/08 at 20:25:32, M-A Ruel wrote: > set it to nil instead I'm setting to nil, but I've kept the aw.stage bit. https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/write... common/archive/ar/writer.go:60: panic(fmt.Sprintf("To much data written! Needed %d, got %d", aw.bytesrequired, numbytes)) On 2016/06/08 at 20:25:32, M-A Ruel wrote: > return an error At this point if this has occurred the archive is corrupted and there is nothing we can do? > use lower case text Done. https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/write... common/archive/ar/writer.go:81: return errors.New("Usage error, need to write header first.") On 2016/06/08 at 20:25:32, M-A Ruel wrote: > if there's no file in written to it, it should just close the archive to be valid. I don't understand? https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/write... common/archive/ar/writer.go:123: panic(fmt.Sprintf("err while copying (%s), archive is probably corrupted!", err)) On 2016/06/08 at 20:25:32, M-A Ruel wrote: > don't panic (everywhere) If this copy failed, there isn't much we can do? Isn't that what panic is for, exceptional errors? https://codereview.chromium.org/2043623002/diff/60001/common/archive/doc.go File common/archive/doc.go (right): https://codereview.chromium.org/2043623002/diff/60001/common/archive/doc.go#n... common/archive/doc.go:5: // Package archive is for readers/writers of archive file formats like ar, tar, On 2016/06/08 at 20:25:32, M-A Ruel wrote: > I don't think it's worth creating this layer as I don't think there will be support for other formats. I wanted it to match the "archive/tar" type naming?
https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/ar_test.go File common/archive/ar/ar_test.go (right): https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/ar_test.g... common/archive/ar/ar_test.go:76: if g, e := h.Name(), "filename1"; g != e { On 2016/06/10 10:09:52, mithro wrote: > On 2016/06/08 at 20:25:31, M-A Ruel wrote: > > Yes, goimports will find it automatically, it's really just up to you. lots of > devs uses goconvey but I'm not personally a fan. > > I'm not sure what you mean by "find it automatically" - I've not used goimports > before? > > At first it seems I ended up with the "Google3" version which rewrote all my > import statements incorrectly. Eventually was able to install it via `go get -u > golang.org/x/tools/cmd/goimports/...` You need to setup $GOPATH locally, get the non google goimports as you did, then when you run goimports on save (which I recommend), then imports are added automatically based on what is available. https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/ar_test.g... common/archive/ar/ar_test.go:105: _, err := exec.LookPath("ar") On 2016/06/07 12:28:52, mithro wrote: > On 2016/06/06 at 23:01:05, M-A Ruel wrote: > > you can merge here as one line too and a few other below. > > I've merged the ones which just return err, but it seems you can't do this if > you need to use the other argument? Exact https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... File common/archive/ar/reader.go (right): https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/reade... common/archive/ar/reader.go:105: func (ar *Reader) readBytes(numbytes int64) error { On 2016/06/10 10:09:52, mithro wrote: > On 2016/06/08 at 20:25:31, M-A Ruel wrote: > > I don't understand this function > > This function is called after reading a number of bytes and does all the cleanup > like updating the bytes required, reading any extra padding and going back to > the header mode. > > At the moment we only have a `func (ar *Reader) Read(b []byte) (n int, err > error) {` but it likely in the future we might want a version which Reads into > an io.Writer object which will also need to call this function. That's https://golang.org/pkg/io/#WriterTo https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/write... File common/archive/ar/writer.go (right): https://codereview.chromium.org/2043623002/diff/60001/common/archive/ar/write... common/archive/ar/writer.go:26: WRITE_CLOSED On 2016/06/10 10:09:53, mithro wrote: > On 2016/06/08 at 20:25:32, M-A Ruel wrote: > > There's no need for _CLOSED, it's closed when wr.w is nil. Same for reader > > I removed the WRITE_CLOSED but found that it made the state checking switch > statement ugly, it goes from > ``` > switch aw.stage { > case WRITE_HEADER: > // Good > case WRITE_BODY: > return errors.New("Usage error, writing a file.") > case WRITE_CLOSED: > return errors.New("Usage error, archive already closed.") > default: > panic(fmt.Sprintf("Unknown writer mode: %d", aw.stage)) > } > ``` > to > ``` > switch aw.stage { > case WRITE_HEADER: > // Good > case WRITE_BODY: > return errors.New("Usage error, writing a file.") > default: > panic(fmt.Sprintf("Unknown writer mode: %d", aw.stage)) > } > if aw.r == nil { > return errors.New("Usage error, archive already closed.") > } > ``` That's because you assume you need to enforce that double-Close are detected and must be relayed to the user. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... File common/archive/ar/reader.go (right): https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:51: READ_HEADER readerStage = iota don't export the const either https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:53: READ_CLOSED Not needed, can detect with r == nil https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:59: bytesrequired int64 streamSizeRemaining would probably be more understandable https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:66: if err != nil { merge with line 65 https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:75: _, err := io.ReadFull(ar.r, buffer) merge with line 76 https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:82: } else { no need for else, I'd recommend to to reverse the condition for consistency with the line above https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:92: return errors.New("usage error, reading a file.") Why is it a problem? https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:94: return errors.New("usage error, archive already closed.") I don't think it's worth reporting to the user. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:98: ar.stage = READ_CLOSED When ar.r is nil, it's closed. So stage becomes a bool https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:103: func (ar *Reader) completeReadBytes(numbytes int64) error { So this function is actually readPadding(), right? https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:109: if ar.bytesrequired != 0 { when can this happen? https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:115: err := ar.checkBytes("padding", []byte("\n")) merge with next line []byte{'\n'} https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:125: // Check we have finished writing bytes writing? https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:137: return errors.New("usage error, need to read header first") Keep in mind it's an internal function, it should just work. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:146: datalen := int64(len(data)) inline datalen https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:159: func (ar *Reader) readHeaderBytes(name string, bytes int, formatstr string) (int64, error) { readHeaderInt64() ? https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:166: var output int64 = 0 =0 is not needed https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:167: _, err = fmt.Sscanf(string(data), formatstr, &output) merge with next line https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:178: func (ar *Reader) readHeader() (*arFileInfoData, error) { Inline into Next() https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:234: err = ar.checkBytes("filemagic", []byte("\x60\n")) create a byte array directly merge with next line https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:245: err = ar.readPartial("filename", filename) merge https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:249: fi.name = string(filename) encoding is assumed to be utf8? https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:255: err = ar.readPartial("data", b) I'd prefer to not use named return values merge https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... File common/archive/ar/writer.go (right): https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:25: WRITE_HEADER writerStage = iota Use golang style; https://github.com/golang/go/wiki/CodeReviewComments#mixed-caps No need to export https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:32: stage writerStage make it a bool, unless you want to delay the "!<arch>\n" write but I don't think it's a good idea to delay. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:39: io.WriteString(w, "!<arch>\n") do not ignore the error here or lazy write the header but I don't think it's worth https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:44: switch aw.stage { same comment as reader, just flush the file correctly if possible, ignore double close https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:78: // Check you can write bytes to the ar at this moment. // canWriteContent returns nil if the stream is in a position to write a stream content. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:80: switch aw.stage { if a.r == nil { return errors.New(... } if a.inHeader { return errors.New(... } return nil https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:97: log.Fatalf("didn't write enough bytes %d still needed, archive corrupted", aw.bytesrequired) return errors.New() then have the call sites return this call. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:102: err := aw.checkWrite() similar to reader, inline here and at other places where possible so the the number of LoC is reduced https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:117: func (aw *Writer) WriteReader(data io.Reader) error { https://golang.org/pkg/io/#ReaderFrom would probably be more appropriate. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:133: func (aw *Writer) WriteBytes(data []byte) error { https://golang.org/pkg/io/#ReaderFrom would probably be more appropriate. Then it's a io.Copy() call. add doc https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:144: aw.writePartial(data) return value is ignored https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:149: func (aw *Writer) writeHeaderBytes(name string, size int64, modtime uint64, ownerid uint, groupid uint, filemod uint) error { writeHeaderInt or something like that? https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:162: fmt.Fprintf(aw.w, "#1/%-13d", len(name)) do not ignore the return value https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:212: // FIXME: Where do we get user/group from - they don't appear to be in Go's Mode() object? Don't bother, I'd prefer you to not claim to support that
vadimsh@chromium.org changed reviewers: + vadimsh@chromium.org
https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/ar_test.go File common/archive/ar/ar_test.go (right): https://codereview.chromium.org/2043623002/diff/1/common/archive/ar/ar_test.g... common/archive/ar/ar_test.go:76: if g, e := h.Name(), "filename1"; g != e { On 2016/06/10 17:51:09, M-A Ruel wrote: > On 2016/06/10 10:09:52, mithro wrote: > > On 2016/06/08 at 20:25:31, M-A Ruel wrote: > > > Yes, goimports will find it automatically, it's really just up to you. lots > of > > devs uses goconvey but I'm not personally a fan. > > > > I'm not sure what you mean by "find it automatically" - I've not used > goimports > > before? > > > > At first it seems I ended up with the "Google3" version which rewrote all my > > import statements incorrectly. Eventually was able to install it via `go get > -u > > golang.org/x/tools/cmd/goimports/...` > > > You need to setup $GOPATH locally, get the non google goimports as you did, then > when you run goimports on save (which I recommend), then imports are added > automatically based on what is available. > Non-hardcode gophers on chrome-infra are using this: https://chromium.googlesource.com/infra/infra/+/master/go/#Bootstrap It installs all needed stuff and configures environ automatically. The downside it that you need to "enter" this environment each time using eval `./env.py` (though one can add corresponding env vars to bash_profile too).
Hi Marc, Lots of changes in this update. I asked around about the correct way to do the errors and the suggestion was to define my own error which had a "Fatal()" function on it. While doing so I have ended up refactoring the error creation significantly. BTW Why doesn't something in "git cl upload" check for when "err := a\nif err != nil {" should be replaced with "if err := a; err != nil {"? I also substantially change the way the reader works. I wrote the reader after the writer, so it inherited my thinking style from that which wasn't the right solution. Unlike writing, when reading it is perfectly fine to skip over sections of the archive without reading it. It is also very handy to be able to get an io.Reader object which contains the file content. From what I can see, Go doesn't have a nice way of making it so if the underlying io.Reader is really an io.ReadSeeker, I can also return an io.ReadSeeker? It also doesn't seem to come with an io.LimitedReadSeeker either... Can you please take another look. Tim 'mithro' Ansell https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... File common/archive/ar/reader.go (right): https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:51: READ_HEADER readerStage = iota On 2016/06/10 17:51:10, M-A Ruel wrote: > don't export the const either Done and now obsolete. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:53: READ_CLOSED On 2016/06/10 17:51:11, M-A Ruel wrote: > Not needed, can detect with r == nil Done and now obsolete. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:59: bytesrequired int64 On 2016/06/10 17:51:10, M-A Ruel wrote: > streamSizeRemaining would probably be more understandable Done and now obsolete. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:66: if err != nil { On 2016/06/10 17:51:10, M-A Ruel wrote: > merge with line 65 Done. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:75: _, err := io.ReadFull(ar.r, buffer) On 2016/06/10 17:51:10, M-A Ruel wrote: > merge with line 76 Done. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:82: } else { On 2016/06/10 17:51:10, M-A Ruel wrote: > no need for else, I'd recommend to to reverse the condition for consistency with > the line above Done. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:92: return errors.New("usage error, reading a file.") On 2016/06/10 17:51:10, M-A Ruel wrote: > Why is it a problem? I guess in the reading case it isn't. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:94: return errors.New("usage error, archive already closed.") On 2016/06/10 17:51:11, M-A Ruel wrote: > I don't think it's worth reporting to the user. This is how other things in Go seem to behave. They return an error when closing something which is already closed. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:98: ar.stage = READ_CLOSED On 2016/06/10 17:51:10, M-A Ruel wrote: > When ar.r is nil, it's closed. > So stage becomes a bool Done and now obsolete. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:103: func (ar *Reader) completeReadBytes(numbytes int64) error { On 2016/06/10 17:51:10, M-A Ruel wrote: > So this function is actually readPadding(), right? Now obsolete. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:109: if ar.bytesrequired != 0 { On 2016/06/10 17:51:10, M-A Ruel wrote: > when can this happen? When a partial read is used. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:115: err := ar.checkBytes("padding", []byte("\n")) On 2016/06/10 17:51:11, M-A Ruel wrote: > merge with next line > []byte{'\n'} Done. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:125: // Check we have finished writing bytes On 2016/06/10 17:51:10, M-A Ruel wrote: > writing? Done. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:137: return errors.New("usage error, need to read header first") On 2016/06/10 17:51:10, M-A Ruel wrote: > Keep in mind it's an internal function, it should just work. Now obsolete. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:146: datalen := int64(len(data)) On 2016/06/10 17:51:10, M-A Ruel wrote: > inline datalen Done. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:159: func (ar *Reader) readHeaderBytes(name string, bytes int, formatstr string) (int64, error) { On 2016/06/10 17:51:10, M-A Ruel wrote: > readHeaderInt64() ? Done. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:166: var output int64 = 0 On 2016/06/10 17:51:10, M-A Ruel wrote: > =0 is not needed Done. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:167: _, err = fmt.Sscanf(string(data), formatstr, &output) On 2016/06/10 17:51:10, M-A Ruel wrote: > merge with next line Done. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:178: func (ar *Reader) readHeader() (*arFileInfoData, error) { On 2016/06/10 17:51:10, M-A Ruel wrote: > Inline into Next() Done. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:234: err = ar.checkBytes("filemagic", []byte("\x60\n")) On 2016/06/10 17:51:11, M-A Ruel wrote: > create a byte array directly > merge with next line Done. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:245: err = ar.readPartial("filename", filename) On 2016/06/10 17:51:10, M-A Ruel wrote: > merge Done. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:249: fi.name = string(filename) On 2016/06/10 17:51:10, M-A Ruel wrote: > encoding is assumed to be utf8? The "spec" seems to indicate it should be ASCII but everything seems to work fine with utf-8. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/reade... common/archive/ar/reader.go:255: err = ar.readPartial("data", b) On 2016/06/10 17:51:11, M-A Ruel wrote: > I'd prefer to not use named return values > merge I actually didn't even know that was possible! Fixed. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... File common/archive/ar/writer.go (right): https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:25: WRITE_HEADER writerStage = iota On 2016/06/10 17:51:11, M-A Ruel wrote: > Use golang style; > https://github.com/golang/go/wiki/CodeReviewComments#mixed-caps > No need to export Done. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:32: stage writerStage On 2016/06/10 17:51:11, M-A Ruel wrote: > make it a bool, unless you want to delay the "!<arch>\n" write but I don't think > it's a good idea to delay. Done. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:39: io.WriteString(w, "!<arch>\n") On 2016/06/10 17:51:11, M-A Ruel wrote: > do not ignore the error here > or lazy write the header but I don't think it's worth Done. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:44: switch aw.stage { On 2016/06/10 17:51:11, M-A Ruel wrote: > same comment as reader, just flush the file correctly if possible, ignore double > close Most things report an error on a double Close() (as it is normally an indicator of a more serious problem). tarfile does and so does File() see https://play.golang.org/p/_OBK2KeAk1 https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:78: // Check you can write bytes to the ar at this moment. On 2016/06/10 17:51:11, M-A Ruel wrote: > // canWriteContent returns nil if the stream is in a position to write a stream > content. Done. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:80: switch aw.stage { On 2016/06/10 17:51:11, M-A Ruel wrote: > if a.r == nil { > return errors.New(... > } > if a.inHeader { > return errors.New(... > } > return nil I still don't like this change, I personally think that having a single variable which tracks the state is less error prone because you always check the whole enum. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:97: log.Fatalf("didn't write enough bytes %d still needed, archive corrupted", aw.bytesrequired) On 2016/06/10 17:51:11, M-A Ruel wrote: > return errors.New() > > then have the call sites return this call. Kinda done - see my comment previously. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:102: err := aw.checkWrite() On 2016/06/10 17:51:11, M-A Ruel wrote: > similar to reader, inline here and at other places where possible so the the > number of LoC is reduced Done. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:117: func (aw *Writer) WriteReader(data io.Reader) error { On 2016/06/10 17:51:11, M-A Ruel wrote: > https://golang.org/pkg/io/#ReaderFrom would probably be more appropriate. I agree that interface looks like what this should implement. It does kind of break the naming though :( https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:133: func (aw *Writer) WriteBytes(data []byte) error { On 2016/06/10 17:51:11, M-A Ruel wrote: > https://golang.org/pkg/io/#ReaderFrom would probably be more appropriate. Then > it's a io.Copy() call. io.Copy() seems to use this function under the hood if it is available but is more generic it seems? > add doc Done. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:144: aw.writePartial(data) On 2016/06/10 17:51:11, M-A Ruel wrote: > return value is ignored Fixed. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:149: func (aw *Writer) writeHeaderBytes(name string, size int64, modtime uint64, ownerid uint, groupid uint, filemod uint) error { On 2016/06/10 17:51:11, M-A Ruel wrote: > writeHeaderInt > or something like that? Went with writeHeaderInternal https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:162: fmt.Fprintf(aw.w, "#1/%-13d", len(name)) On 2016/06/10 17:51:11, M-A Ruel wrote: > do not ignore the return value Done. https://codereview.chromium.org/2043623002/diff/80001/common/archive/ar/write... common/archive/ar/writer.go:212: // FIXME: Where do we get user/group from - they don't appear to be in Go's Mode() object? On 2016/06/10 17:51:11, M-A Ruel wrote: > Don't bother, I'd prefer you to not claim to support that Done.
Change looks pretty good. Small things. On 2016/06/14 10:57:53, mithro wrote: > I asked around about the correct way to do the errors and the suggestion was to > define my own error which had a "Fatal()" function on it. While doing so I have > ended up refactoring the error creation significantly. There is one problem though, as returning an error to Error can wrap a nil error into a non-nil Error. This is super confusing behavior. > BTW Why doesn't something in "git cl upload" check for when "err := a\nif err != > nil {" should be replaced with "if err := a; err != nil {"? It's a bit hard to automate due to variable aliasing but it's likely doable, maybe it should be embedded in errcheck? https://codereview.chromium.org/2043623002/diff/120001/common/archive/ar/erro... File common/archive/ar/errors.go (right): https://codereview.chromium.org/2043623002/diff/120001/common/archive/ar/erro... common/archive/ar/errors.go:19: type UsageError struct { same comments as in writer.go https://codereview.chromium.org/2043623002/diff/120001/common/archive/ar/read... File common/archive/ar/reader.go (right): https://codereview.chromium.org/2043623002/diff/120001/common/archive/ar/read... common/archive/ar/reader.go:20: ErrReadAfterClose = UsageError{msg: "read after file closed"} same comments as in writer.go https://codereview.chromium.org/2043623002/diff/120001/common/archive/ar/read... common/archive/ar/reader.go:23: type ReadDataIOError struct { same comments as in writer.go https://codereview.chromium.org/2043623002/diff/120001/common/archive/ar/read... common/archive/ar/reader.go:124: // Read any remains of the previous file On 2016/06/14 10:57:53, mithro wrote: > From what I can see, Go doesn't have a nice way of making it so if the > underlying io.Reader is really an io.ReadSeeker, I can also return an > io.ReadSeeker? It also doesn't seem to come with an io.LimitedReadSeeker > either... You can do dynamic cast; https://golang.org/pkg/io/#Seeker The following should work fine, you can ignore the LimitedReader when seeking; if s, ok := ar.r.(io.Seeker); ok { if _, err := s.Seek(ar.bodyReader.N, 1); err != nil { return nil, err } } else { if _, err := io.Copy(ioutil.Discard, &ar.bodyReader); err != nil { return nil, err } } https://codereview.chromium.org/2043623002/diff/120001/common/archive/ar/writ... File common/archive/ar/writer.go (right): https://codereview.chromium.org/2043623002/diff/120001/common/archive/ar/writ... common/archive/ar/writer.go:17: var ( since there's only one, do not use () I don't think you should export this variable. https://codereview.chromium.org/2043623002/diff/120001/common/archive/ar/writ... common/archive/ar/writer.go:21: // WriteTooLongError indicates trying to write the wrong amount of data into the archive. wrap at 80 cols https://codereview.chromium.org/2043623002/diff/120001/common/archive/ar/writ... common/archive/ar/writer.go:23: type WriteTooLongError struct { I don't think these error types should be exported. https://codereview.chromium.org/2043623002/diff/120001/common/archive/ar/writ... common/archive/ar/writer.go:188: // Calling with wrong size data will return WriteTooLongError but the archive will 80 cols for comments is helpful https://codereview.chromium.org/2043623002/diff/120001/common/archive/ar/writ... common/archive/ar/writer.go:203: if err := aw.checkFinished(); err != nil { tail call optimisation: return aw.checkFinished() https://codereview.chromium.org/2043623002/diff/120001/common/archive/ar/writ... common/archive/ar/writer.go:288: /* FIXME: Should we also exclude other "special" files? TODO(mithro): https://codereview.chromium.org/2043623002/diff/120001/common/archive/ar/writ... common/archive/ar/writer.go:299: func (aw *Writer) Add(filepath string, data []byte) Error { AddRaw(), AddContent() ? I would have liked WriteFOO but it would probably be more confusing.
Description was changed from ========== Tools for working with BSD style ar archives. ar archives are the simplest format that stratifies the following criteria; * Is an existing standard which has tools which ship on standard Linux systems. * Requires no escaping / processing of file contents. * Header can be written without needing to understand the whole file. * Extremely fast to process / generate. Other formats which were consider before selecting the ar format; * tar * zip * cpio * rar * Something protobuf based. On a z620, these ar versions are ~5 times faster than the archive/tar equivalent which ships with go. There are a couple of drawbacks; * The ar format doesn't support special files or symlinks. * The Linux ar tool don't extract with directory paths (list fine). The Python version of this code is at https://codereview.chromium.org/2049523004 This was split out of https://codereview.chromium.org/2014243002 * https://github.com/luci/luci-go/issues/9 * https://crbug.com/598990 BUG=598990 ========== to ========== luci-go: Tools for working with BSD style ar archives. ar archives are the simplest format that satisfies the following criteria; * Is an existing standard which has tools which ship on standard Linux systems. * Requires no escaping / processing of file contents. * Header can be written without needing to understand the whole file. * Extremely fast to process / generate. Other formats which were consider before selecting the ar format; * tar * zip * cpio * rar * Something protobuf based. On a z620, these ar versions are ~5 times faster than the archive/tar equivalent which ships with go. There are a couple of drawbacks; * The ar format doesn't support special files or symlinks. * The Linux ar tool don't extract with directory paths (list fine). The Python version of this code is at https://codereview.chromium.org/2049523004 This was split out of https://codereview.chromium.org/2014243002 * https://github.com/luci/luci-go/issues/9 * https://crbug.com/598990 BUG=598990 ==========
tansell@chromium.org changed reviewers: + djd@chromium.org, mcgreevy@golang.org - maruel@google.com
Hi! Adding Michael and Dave to review this go code and advice on the API. The code should also now be GoLint clean. Tim 'mithro' Ansell https://codereview.chromium.org/2043623002/diff/120001/common/archive/ar/read... File common/archive/ar/reader.go (right): https://codereview.chromium.org/2043623002/diff/120001/common/archive/ar/read... common/archive/ar/reader.go:124: // Read any remains of the previous file On 2016/06/14 14:30:43, M-A Ruel wrote: > On 2016/06/14 10:57:53, mithro wrote: > > From what I can see, Go doesn't have a nice way of making it so if the > > underlying io.Reader is really an io.ReadSeeker, I can also return an > > io.ReadSeeker? It also doesn't seem to come with an io.LimitedReadSeeker > > either... > > You can do dynamic cast; https://golang.org/pkg/io/#Seeker > The following should work fine, you can ignore the LimitedReader when seeking; > > if s, ok := ar.r.(io.Seeker); ok { > if _, err := s.Seek(ar.bodyReader.N, 1); err != nil { > return nil, err > } > } else { > if _, err := io.Copy(ioutil.Discard, &ar.bodyReader); err != nil { > return nil, err > } > } Done. https://codereview.chromium.org/2043623002/diff/120001/common/archive/ar/writ... File common/archive/ar/writer.go (right): https://codereview.chromium.org/2043623002/diff/120001/common/archive/ar/writ... common/archive/ar/writer.go:21: // WriteTooLongError indicates trying to write the wrong amount of data into the archive. On 2016/06/14 14:30:43, M-A Ruel wrote: > wrap at 80 cols Done. https://codereview.chromium.org/2043623002/diff/120001/common/archive/ar/writ... common/archive/ar/writer.go:188: // Calling with wrong size data will return WriteTooLongError but the archive will On 2016/06/14 14:30:43, M-A Ruel wrote: > 80 cols for comments is helpful Done. https://codereview.chromium.org/2043623002/diff/120001/common/archive/ar/writ... common/archive/ar/writer.go:203: if err := aw.checkFinished(); err != nil { On 2016/06/14 14:30:43, M-A Ruel wrote: > tail call optimisation: > return aw.checkFinished() Done. https://codereview.chromium.org/2043623002/diff/120001/common/archive/ar/writ... common/archive/ar/writer.go:288: /* FIXME: Should we also exclude other "special" files? On 2016/06/14 14:30:43, M-A Ruel wrote: > TODO(mithro): Done. https://codereview.chromium.org/2043623002/diff/120001/common/archive/ar/writ... common/archive/ar/writer.go:299: func (aw *Writer) Add(filepath string, data []byte) Error { On 2016/06/14 14:30:43, M-A Ruel wrote: > AddRaw(), AddContent() ? I would have liked WriteFOO but it would probably be > more confusing. Used AddWithContent().
mcgreevy@chromium.org changed reviewers: + mcgreevy@chromium.org
I've mostly looked at the ar package API so far, rather than the implementation details. These are my initial thoughts. https://codereview.chromium.org/2043623002/diff/160001/common/archive/ar/erro... File common/archive/ar/errors.go (right): https://codereview.chromium.org/2043623002/diff/160001/common/archive/ar/erro... common/archive/ar/errors.go:14: type Error interface { There are a bunch of different error types in this package, some of which serve to store extra information about the cause of the error, and some which serve to provide different implementations of this Error interface (e.g. WriteTooLongError, which always returns false from its Fatal method, and WriteTooLongFatalError which always returns true). I'm not sure that all of these different types are necessary, and even if they are, whether they need to be exported. For example, will users of this package ever do type assertions to check whether an error is a WriteTooLongFatalError vs a WriteTooLongError? If not, that's a signal that they don't need to be exported. The same effect could be achieved with fewer types, e.g.: // note: these need not be exported. type fatalError string func (err fatalError) Fatal() {return true} type nonFatalError string func (err nonFatalError) Fatal() {return false} func WriteTooLongError(got, want int64) Error { return nonFatalError(fmt.Sprintf("invalid data length (got %d, want %d)", got, want)) } func WriteTooLongFatalError(got, want int64) Error { return fatalError(fmt.Sprintf("archive corrupted: invalid data written (got %d, want %d)", got, want)) } I suspect that even the distinction between fatal vs non-fatal errors can be conveyed via some other mechanism, which would mean that this Error interface could go away, and a standard error type used instead, which would be nice. This will probably become clearer as the number of these error types is reduced. https://codereview.chromium.org/2043623002/diff/160001/common/archive/ar/erro... common/archive/ar/errors.go:20: type UsageError struct { This doesn't need to be a struct. type UsageError string would suffice. You can than create one like so: error := UsageError("some message") You can still define methods on this type, like: func (e UsageError) Error() string {...} https://codereview.chromium.org/2043623002/diff/160001/common/archive/ar/read... File common/archive/ar/reader.go (right): https://codereview.chromium.org/2043623002/diff/160001/common/archive/ar/read... common/archive/ar/reader.go:31: return fmt.Sprintf("%s (wanted '%s', got '%s') -- *archive corrupted*", e.IOError.Error(), e.wanted, e.got) FYI, got idiomatically precedes want in Go (you will always see "got: %v; want %v" in tests, never the other way around). https://codereview.chromium.org/2043623002/diff/160001/common/archive/ar/read... common/archive/ar/reader.go:217: func (ar *Reader) Body() io.Reader { Is there a reason why the body can be returned from Reader.Next? i.e. func (ar *Reader) Next() (FileInfo, io.Reader, Error) {...} https://codereview.chromium.org/2043623002/diff/160001/common/archive/ar/writ... File common/archive/ar/writer.go (right): https://codereview.chromium.org/2043623002/diff/160001/common/archive/ar/writ... common/archive/ar/writer.go:86: func NewWriter(w io.Writer) (*Writer, Error) { It's good to see NewReader and NewWriter taking io.Reader and io.Writer, respectively. https://codereview.chromium.org/2043623002/diff/160001/common/archive/ar/writ... common/archive/ar/writer.go:182: func (aw *Writer) ReaderFrom(r io.Reader) (int64, Error) { This method name sounds like it returns a Reader. What about just calling it "Write", and merging it with WriteBytes, which does basically the same thing? A caller can convert a []byte into a reader before passing it in by calling bytes.NewReader from the standard library. I've expanded on this thought below. https://codereview.chromium.org/2043623002/diff/160001/common/archive/ar/writ... common/archive/ar/writer.go:290: // WriteBytes or ReaderFrom should be called after writing the header. // WriteBytes or ReaderFrom should be called after writing the header. // Calling at the wrong time will return a UsageError. It feels like these kind of directions about the order in which methods must be called could be avoided by simply having a single method which takes a header and the contents to be written (something like the existing AddWithContent, but with the ability to use non-defaults). This would also make the API less error prone and perhaps remove the need for the existence of UsageError. What about something like this: type Header interface { os.FileInfo } type defaultHeader struct { filepath string size int64 } func BasicHeader(filepath string, size int64) Header { return &deafultHeader{filepath, size} } func (bh *defaultHeader) Size() int64 { return bh.size } func (bh *defaultHeader) Mode() os.FileMode { return DefaultFileMode } // TODO: implement the rest of os.FileInfo for deafultHeader. func (aw *Writer) WriteFile(data io.Reader, header Header) error { ... } So, the equivalent of doing aw.WriteHeaderDefault("path/to/file", 1024) aw.WriteBytes(buf) would be: aw.WriteFile(bytes.NewReader(buf), ar.BasicHeader("path/to/file", 1024)). In this way, all of {AddWithContent,ReaderFrom,WriteBytes,WriteHeader,WriteHeaderDefault} can be replaced with a single method, WriteFile, and a helper that can be used if needed (BasicHeader). As well as reducing the number of methods, a bunch of incorrect usages (e.g. calling WriteHeader and WriteBytes out of order) are now impossible. Am I missing a use case? https://codereview.chromium.org/2043623002/diff/160001/common/archive/doc.go File common/archive/doc.go (right): https://codereview.chromium.org/2043623002/diff/160001/common/archive/doc.go#... common/archive/doc.go:5: // Package archive is for readers/writers of archive file formats like ar, tar, This comment should say what that package is/does rather than who uses it. So, I suggest replacing "is for readers/writers of" with "provides read/write support for"
https://codereview.chromium.org/2043623002/diff/160001/common/archive/ar/writ... File common/archive/ar/writer.go (right): https://codereview.chromium.org/2043623002/diff/160001/common/archive/ar/writ... common/archive/ar/writer.go:182: func (aw *Writer) ReaderFrom(r io.Reader) (int64, Error) { On 2016/11/07 00:37:22, mcgreevy2 wrote: > This method name sounds like it returns a Reader. What about just calling it > "Write", and merging it with WriteBytes, which does basically the same thing? A > caller can convert a []byte into a reader before passing it in by calling > bytes.NewReader from the standard library. I've expanded on this thought below. So, I see now that this is intended to implement io.ReaderFrom. It doesn't, though, because the signature is wrong. It should be: ReadFrom(r Reader) (n int64, err error) Note: "ReadFrom", not "ReaderFrom", and "error" not "Error". https://codereview.chromium.org/2043623002/diff/160001/common/archive/ar/writ... common/archive/ar/writer.go:208: func (aw *Writer) WriteBytes(data []byte) Error { Similarly, this should be Write(data []byte) (int, error) to implement io.Writer. https://codereview.chromium.org/2043623002/diff/160001/common/archive/ar/writ... common/archive/ar/writer.go:290: // WriteBytes or ReaderFrom should be called after writing the header. I can see the benefit, though, in having *ar.Writer implement io.Writer and io.ReaderFrom: implementing io.Writer allows you to use *ar.Writer as an argument to io.Copy (and implementing io.ReaderFrom makes the copy more efficient). I think we can still reduce the number of methods for writing the header, but I'll take another look in the morning.
Re: your commit message: "these ar versions are ~5 times faster than the archive/tar equivalent which ships with go." Do you mean that the code in this CL is ~5 times faster than archive/tar? How did you measure that? How significant is this time with respect to the time it takes to upload/download the results? It would be great if we could get away with using archive/tar instead of maintaining extra code.
On 2016/11/08 00:29:16, mcgreevy2 wrote: > Re: your commit message: > > "these ar versions are ~5 times faster than the archive/tar > equivalent which ships with go." > > Do you mean that the code in this CL is ~5 times faster than archive/tar? How > did you measure that? > > How significant is this time with respect to the time it takes to > upload/download the results? It would be great if we could get away with using > archive/tar instead of maintaining extra code. You can get this via a profile: $ isolate archive -help $ isolate archive -isolate-server=fake -trace foo.json ... then you can load foo.json into chrome://tracing and visualize the relative times. |