|
|
Created:
5 years, 6 months ago by iannucci Modified:
5 years, 6 months ago CC:
chromium-reviews, dpranke+depot_tools_chromium.org, iannucci+depot_tools_chromium.org Target Ref:
refs/heads/master Project:
depot_tools Visibility:
Public. |
DescriptionParallelize pylint PRESUBMIT checks.
R=maruel@chromium.org
BUG=479837, 499650
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=295664
Patch Set 1 #
Total comments: 1
Patch Set 2 : More better #Patch Set 3 : fix test #
Messages
Total messages: 31 (4 generated)
on a z600: * Pylint on build repo went from 90s to 14s (will post time for chromium/src shortly... after webkit finishes syncing...)
tools/telemetry (on z600 linux) before: 67s after: 8s Note: I had to change from the old RunPylint() method to GetPylint + RunTests to get this result.
iannucci@chromium.org changed reviewers: + dtu@chromium.org, nednguyen@google.com
Ping?
It's tricky, because it won't yield to the same results depending on how the sharding it done. The reason it was done this way is that pylint keeps an internal state, when when processing a fair number of python files, it can do cross references across modules. By not presenting all the code at once, you lose this. On the other time, its running time is exponential, which is very annoying. So I don't know, maybe it should be run shard in the presubmit then run with all files in the CI?
On 2015/06/12 18:03:32, M-A Ruel wrote: > It's tricky, because it won't yield to the same results depending on how the > sharding it done. > > The reason it was done this way is that pylint keeps an internal state, when > when processing a fair number of python files, it can do cross references across > modules. > > By not presenting all the code at once, you lose this. > > On the other time, its running time is exponential, which is very annoying. So I > don't know, maybe it should be run shard in the presubmit then run with all > files in the CI? Hm, ok... are you sure this behavior hasn't changed since pylint was upgraded? Is there a test case I can run to repro this?
On 2015/06/12 18:04:46, iannucci wrote: > Hm, ok... are you sure this behavior hasn't changed since pylint was upgraded? > Is there a test case I can run to repro this? file a.py: import b print b.c() file b.py def C(): return 1 If you parse a.py alone, it won't trigger. We may deemed it's worth the trade off, especially w.r.t. exponential runtime but I did find this useful.
https://chromiumcodereview.appspot.com/1181103002/diff/1/presubmit_canned_che... File presubmit_canned_checks.py (right): https://chromiumcodereview.appspot.com/1181103002/diff/1/presubmit_canned_che... presubmit_canned_checks.py:804: files = files[limit:] If we shard the files arbitrarily, how does this affects the effectiveness of the linting jobs and some lint check like "import-error" requires global state of all files?
On 2015/06/12 18:06:14, nednguyen wrote: > https://chromiumcodereview.appspot.com/1181103002/diff/1/presubmit_canned_che... > File presubmit_canned_checks.py (right): > > https://chromiumcodereview.appspot.com/1181103002/diff/1/presubmit_canned_che... > presubmit_canned_checks.py:804: files = files[limit:] > If we shard the files arbitrarily, how does this affects the effectiveness of > the linting jobs and some lint check like "import-error" requires global state > of all files? One possible direction is running lintAllFiles() (take long time but more correct) in CQ PRESUBMIT, and lintParallelized() upon "git cl upload".
On 2015/06/12 18:06:14, nednguyen wrote: > https://chromiumcodereview.appspot.com/1181103002/diff/1/presubmit_canned_che... > File presubmit_canned_checks.py (right): > > https://chromiumcodereview.appspot.com/1181103002/diff/1/presubmit_canned_che... > presubmit_canned_checks.py:804: files = files[limit:] > If we shard the files arbitrarily, how does this affects the effectiveness of > the linting jobs and some lint check like "import-error" requires global state > of all files? IIUC, the current version of pylint looks this information up from sys.path (the old one didn't). If sharding broke this assumption, I would assume that the presubmit would break (but it doesn't). Looking at the strace output of pylint seems to imply that it's searching PYTHONPATH for imports (e.g. the way that python does it).
On 2015/06/12 18:10:36, iannucci wrote: > IIUC, the current version of pylint looks this information up from sys.path (the > old one didn't). If sharding broke this assumption, I would assume that the > presubmit would break (but it doesn't). Looking at the strace output of pylint > seems to imply that it's searching PYTHONPATH for imports (e.g. the way that > python does it). and it does load the files? that's my question.
So I just did an experiment; I moved tools/telemetry/telemetry/core/backends to tools/telemetry/telemetry/core/wat , and ran presubmit. Basically every sharded pylint test failed with sensible import errors (both inside of tools/telemetry/telemetry/core/backends and outside of it).
On 2015/06/12 18:18:07, iannucci wrote: > So I just did an experiment; I moved tools/telemetry/telemetry/core/backends to > tools/telemetry/telemetry/core/wat , and ran presubmit. Basically every sharded > pylint test failed with sensible import errors (both inside of > tools/telemetry/telemetry/core/backends and outside of it). (for those playing along, that's 73 files, and on the z600 the pylint shard size for this folder is 19)
On 2015/06/12 18:19:21, iannucci wrote: > On 2015/06/12 18:18:07, iannucci wrote: > > So I just did an experiment; I moved tools/telemetry/telemetry/core/backends > to > > tools/telemetry/telemetry/core/wat , and ran presubmit. Basically every > sharded > > pylint test failed with sensible import errors (both inside of > > tools/telemetry/telemetry/core/backends and outside of it). > > (for those playing along, that's 73 files, and on the z600 the pylint shard size > for this folder is 19) Thanks for double checking, lgtm
Glad to see this :) lgtm
On 2015/06/12 18:19:21, iannucci wrote: > On 2015/06/12 18:18:07, iannucci wrote: > > So I just did an experiment; I moved tools/telemetry/telemetry/core/backends > to > > tools/telemetry/telemetry/core/wat , and ran presubmit. Basically every > sharded > > pylint test failed with sensible import errors (both inside of > > tools/telemetry/telemetry/core/backends and outside of it). > > (for those playing along, that's 73 files, and on the z600 the pylint shard size > for this folder is 19) If your experiment shows that parallelization does not reduce lint coverage, this is lgtm. Awesome work!
Ok, so it looks like import-error's aren't effected, but cyclic imports are :/ I'll see if I can figure out something for that.
On 2015/06/12 18:31:35, iannucci wrote: > Ok, so it looks like import-error's aren't effected, but cyclic imports are :/ > > I'll see if I can figure out something for that. As mentioned before, maybe we can run the non parallelized one on CQ?
On 2015/06/12 18:32:55, nednguyen wrote: > On 2015/06/12 18:31:35, iannucci wrote: > > Ok, so it looks like import-error's aren't effected, but cyclic imports are :/ > > > > I'll see if I can figure out something for that. > > As mentioned before, maybe we can run the non parallelized one on CQ? yeah, maybe that's the way to go. Side note, pylint has a --jobs option... which also misses the cyclic import >_<
PTAL: obvious solution is obvious Not as fast as the pure-sharded one, but still much faster (telemetry is 12s instead of 90s). Plus, it gives us an avenue for adding other broken checks in case we discover other incorrectness.
On 2015/06/12 19:42:00, iannucci wrote: > PTAL: obvious solution is obvious > > Not as fast as the pure-sharded one, but still much faster (telemetry is 12s > instead of 90s). > > Plus, it gives us an avenue for adding other broken checks in case we discover > other incorrectness. er 12s instead of 67s (up from 8 in the pure-sharded solution)
lgtm
On 2015/06/12 19:45:46, M-A Ruel wrote: > lgtm Will commit after lunch. Thanks for the reviews :)
On 2015/06/12 19:48:31, iannucci wrote: > On 2015/06/12 19:45:46, M-A Ruel wrote: > > lgtm > > Will commit after lunch. Thanks for the reviews :) lgtm
interesting. I didn't know about any of these flags. Nice improvements!
The CQ bit was checked by iannucci@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from dtu@chromium.org, maruel@chromium.org Link to the patchset: https://chromiumcodereview.appspot.com/1181103002/#ps40001 (title: "fix test")
The CQ bit was checked by iannucci@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1181103002/40001
Message was sent while issue was closed.
Committed patchset #3 (id:40001) as http://src.chromium.org/viewvc/chrome?view=rev&revision=295664 |