Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(417)

Issue 2054763004: luci-go/common/dirwalk: Code for walking a directory tree efficiently

Created:
4 years, 6 months ago by mithro
Modified:
3 years, 5 months ago
CC:
andrew.wang, chromium-reviews, infra-reviews+luci-go_chromium.org, tandrii+luci-go_chromium.org, todd
Base URL:
https://github.com/luci/luci-go@master
Target Ref:
refs/heads/master
Project:
luci-go
Visibility:
Public.

Description

luci-go/common/dirtools: Tools for walking a directory tree efficiently This CL adds code to walk directory trees efficiently. It adds three methods of walking which have the same interface; * `dirwalk.WalkBasic` - Uses go's builtin walk method * `dirwalk.WalkParallel` - Uses a worker pool to do the walk on many threads. * `dirwalk.WalkNoStat` - Uses a custom method which doesn't call stat. It also adds a bunch of tools for examining the performance of these methods; * `gendir` - Takes a json description and generates a directory. * `walkdir` - Walks a directory doing tests. After comparison on Linux the following results were found; * dirwalk.WalkNoStat is generally fastest for large number of small files (such as test1, test4, test5 & test6) and not measurable slower in other cases. * dirwalk.WalkParallel is *much* slower in almost all cases (even than the built in walk) - In some cases it can deliver the same wall clock time for large increase in CPU seconds used. * dirwalk.WalkBasic has very strong performance for the lines of code needed. * Doing a parallel hash increases performance sometimes and decreases it in others. The code is all under a single directory to allow it to be moved into its own go package in the future. Bugs: * https://github.com/luci/luci-go/issues/9 * https://crbug.com/598990 BUG=598990

Patch Set 1 #

Patch Set 2 : Rebase onto master. #

Patch Set 3 : s/dirtools/dirwalk/gc #

Patch Set 4 : Moving the test tool locations. #

Patch Set 5 : Small updates. #

Total comments: 73

Patch Set 6 : Fixing a bunch of stuff for review. #

Patch Set 7 : Rebase onto master. #

Total comments: 1

Patch Set 8 : Refactoring. #

Patch Set 9 : Refactoring. #

Patch Set 10 : s/NullWalker/BaseWalker/gc #

Patch Set 11 : Other changes. #

Patch Set 12 : Fixes. #

Total comments: 4

Patch Set 13 : Massive rework #

Patch Set 14 : Major rewrite of the code. #

Total comments: 5
Unified diffs Side-by-side diffs Delta from patch set Stats (+1593 lines, -7 lines) Patch
A common/dirwalk/callback.go View 1 2 3 4 5 6 7 8 9 10 11 12 1 chunk +16 lines, -0 lines 0 comments Download
A + common/dirwalk/doc.go View 1 2 1 chunk +2 lines, -2 lines 0 comments Download
A common/dirwalk/test.sh View 1 2 3 4 5 6 7 8 9 10 11 12 13 1 chunk +69 lines, -0 lines 0 comments Download
A common/dirwalk/tests/test1.json View 1 2 1 chunk +18 lines, -0 lines 0 comments Download
A common/dirwalk/tests/test2.json View 1 2 1 chunk +18 lines, -0 lines 0 comments Download
A common/dirwalk/tests/test3.json View 1 2 1 chunk +24 lines, -0 lines 0 comments Download
A common/dirwalk/tests/test4.json View 1 2 1 chunk +18 lines, -0 lines 0 comments Download
A common/dirwalk/tests/test5.json View 1 2 1 chunk +24 lines, -0 lines 0 comments Download
A common/dirwalk/tests/test6.json View 1 2 1 chunk +18 lines, -0 lines 0 comments Download
A common/dirwalk/tests/tools/gendir/content.go View 1 2 3 4 5 6 7 8 9 10 11 12 13 1 chunk +95 lines, -0 lines 1 comment Download
A + common/dirwalk/tests/tools/gendir/doc.go View 1 2 3 4 5 6 7 1 chunk +1 line, -2 lines 0 comments Download
A common/dirwalk/tests/tools/gendir/generate.go View 1 2 3 4 5 6 7 8 9 10 11 12 13 1 chunk +179 lines, -0 lines 2 comments Download
A common/dirwalk/tests/tools/gendir/lorem_data.go View 1 2 3 4 5 6 7 1 chunk +209 lines, -0 lines 0 comments Download
A common/dirwalk/tests/tools/gendir/main.go View 1 2 3 4 5 6 7 8 9 10 11 12 13 1 chunk +81 lines, -0 lines 0 comments Download
A common/dirwalk/tests/tools/gendir/random_utils.go View 1 2 3 4 5 6 7 8 9 10 11 12 13 1 chunk +37 lines, -0 lines 0 comments Download
A + common/dirwalk/tests/tools/gendir/util.go View 1 2 3 4 5 6 7 1 chunk +12 lines, -1 line 0 comments Download
A + common/dirwalk/tests/tools/walkdir/doc.go View 1 2 3 4 5 6 7 1 chunk +1 line, -2 lines 0 comments Download
A common/dirwalk/tests/tools/walkdir/fileprocessor.go View 1 2 3 4 5 6 7 8 9 10 11 12 13 1 chunk +63 lines, -0 lines 0 comments Download
A common/dirwalk/tests/tools/walkdir/fileprocessor_hash.go View 1 2 3 4 5 6 7 8 9 10 11 12 1 chunk +36 lines, -0 lines 0 comments Download
A common/dirwalk/tests/tools/walkdir/fileprocessor_phash.go View 1 2 3 4 5 6 7 8 9 10 11 12 13 1 chunk +98 lines, -0 lines 0 comments Download
A common/dirwalk/tests/tools/walkdir/fileprocessor_print.go View 1 2 3 4 5 6 7 8 9 10 11 12 13 1 chunk +31 lines, -0 lines 0 comments Download
A common/dirwalk/tests/tools/walkdir/fileprocessor_read.go View 1 2 3 4 5 6 7 8 9 10 11 12 1 chunk +32 lines, -0 lines 0 comments Download
A common/dirwalk/tests/tools/walkdir/fileprocessor_size.go View 1 2 3 4 5 6 7 8 9 10 11 12 1 chunk +42 lines, -0 lines 0 comments Download
A common/dirwalk/tests/tools/walkdir/fileprocessor_verify.go View 1 2 3 4 5 6 7 8 9 10 11 12 13 1 chunk +67 lines, -0 lines 0 comments Download
A common/dirwalk/tests/tools/walkdir/main.go View 1 2 3 4 5 6 7 8 9 10 11 12 13 1 chunk +116 lines, -0 lines 0 comments Download
A common/dirwalk/tests/tools/walkdir/util.go View 1 2 3 4 5 6 7 8 9 10 11 12 1 chunk +20 lines, -0 lines 0 comments Download
A common/dirwalk/util.go View 1 2 3 4 5 6 7 8 9 10 11 12 13 1 chunk +51 lines, -0 lines 0 comments Download
A common/dirwalk/walk_basic.go View 1 2 3 4 5 6 7 8 9 10 11 12 13 1 chunk +49 lines, -0 lines 0 comments Download
A common/dirwalk/walk_nostat.go View 1 2 3 4 5 6 7 8 9 10 11 12 13 1 chunk +69 lines, -0 lines 2 comments Download
A common/dirwalk/walk_parallel.go View 1 2 3 4 5 6 7 8 9 10 11 12 13 1 chunk +97 lines, -0 lines 0 comments Download

Messages

Total messages: 15 (6 generated)
mithro
Hi Maruel, This is the dirtools code split out into a separate CL with a ...
4 years, 6 months ago (2016-06-09 16:47:02 UTC) #2
M-A Ruel
On 2016/06/09 16:47:02, mithro wrote: > Hi Maruel, > > This is the dirtools code ...
4 years, 6 months ago (2016-06-10 02:20:43 UTC) #3
M-A Ruel
https://codereview.chromium.org/2054763004/diff/80001/common/dirwalk/observer.go File common/dirwalk/observer.go (right): https://codereview.chromium.org/2054763004/diff/80001/common/dirwalk/observer.go#newcode7 common/dirwalk/observer.go:7: /** Always use // comments for consistency. I do ...
4 years, 3 months ago (2016-09-15 14:31:03 UTC) #6
mithro
Hi Marc, I had a bunch of questions regarding your comments. I will upload a ...
4 years, 3 months ago (2016-09-20 12:41:45 UTC) #7
M-A Ruel
https://codereview.chromium.org/2054763004/diff/80001/common/dirwalk/observer.go File common/dirwalk/observer.go (right): https://codereview.chromium.org/2054763004/diff/80001/common/dirwalk/observer.go#newcode16 common/dirwalk/observer.go:16: LargeFile(filename string) On 2016/09/20 12:41:43, mithro wrote: > On ...
4 years, 3 months ago (2016-09-20 16:37:27 UTC) #8
mithro
Hi Maruel, I've refactored this CL a bunch to match the io.Reader style for the ...
4 years, 3 months ago (2016-09-22 11:19:59 UTC) #10
M-A Ruel
https://codereview.chromium.org/2054763004/diff/220001/common/dirwalk/tests/tools/walkdir/walkers_phash.go File common/dirwalk/tests/tools/walkdir/walkers_phash.go (right): https://codereview.chromium.org/2054763004/diff/220001/common/dirwalk/tests/tools/walkdir/walkers_phash.go#newcode38 common/dirwalk/tests/tools/walkdir/walkers_phash.go:38: var filecount uint64 = 0 =0 not needed https://codereview.chromium.org/2054763004/diff/220001/common/dirwalk/tests/tools/walkdir/walkers_phash.go#newcode41 ...
4 years, 3 months ago (2016-09-23 01:48:17 UTC) #11
mithro
Hi everyone, Based on the feedback from Dave Day (djd@) and Marc, I have done ...
4 years, 1 month ago (2016-11-08 03:48:39 UTC) #13
mcgreevy_g
3 years, 5 months ago (2017-06-27 03:29:17 UTC) #15
https://codereview.chromium.org/2054763004/diff/260001/common/dirwalk/tests/t...
File common/dirwalk/tests/tools/gendir/content.go (right):

https://codereview.chromium.org/2054763004/diff/260001/common/dirwalk/tests/t...
common/dirwalk/tests/tools/gendir/content.go:43: reader :=
textRandomGenerator{r: r}
return &textRandomGenerator{r: r}

https://codereview.chromium.org/2054763004/diff/260001/common/dirwalk/tests/t...
File common/dirwalk/tests/tools/gendir/generate.go (right):

https://codereview.chromium.org/2054763004/diff/260001/common/dirwalk/tests/t...
common/dirwalk/tests/tools/gendir/generate.go:30: for written < fileSize {
I'm pretty sure that you can achieve this by just wrapping fileContent in an
io.LimitedReader, and then using io.Copy to f from the wrapped reader.

https://codereview.chromium.org/2054763004/diff/260001/common/dirwalk/tests/t...
common/dirwalk/tests/tools/gendir/generate.go:83: for true {
This can be expressed as:

for {

Note: this pattern also occurs elsewhere in this file.

https://codereview.chromium.org/2054763004/diff/260001/common/dirwalk/walk_no...
File common/dirwalk/walk_nostat.go (right):

https://codereview.chromium.org/2054763004/diff/260001/common/dirwalk/walk_no...
common/dirwalk/walk_nostat.go:27: if err != io.EOF && err != nil {
Note: it's valid for a Reader to return a nil error and then return io.EOF with
0 bytes read on the next call to Read.

https://codereview.chromium.org/2054763004/diff/260001/common/dirwalk/walk_no...
common/dirwalk/walk_nostat.go:41: // This file was bigger than the block size
Or, perhaps it was equal to the block size and io.EOF will be returned on the
next call to Read.

Powered by Google App Engine
This is Rietveld 408576698