|
|
Chromium Code Reviews
DescriptionRefactor how we fetch Telemetry deps conditionally
Before this, we conditionally fetch Telemetry deps through checking GYP_DEFINES
variable in https://github.com/catapult-project/catapult/blob/master/telemetry/bin/fetch_telemetry_binary_dependencies#L18
This refactors it by adding tools/perf/conditionally_execute which allows conditioning
executing any script on GYP_DEFINES flag.
BUG=725516
Review-Url: https://codereview.chromium.org/2921353002
Cr-Commit-Position: refs/heads/master@{#477339}
Committed: https://chromium.googlesource.com/chromium/src/+/534722a4d48581f5216bdf98d98d283ae478849a
Patch Set 1 #
Total comments: 1
Patch Set 2 : Fix typos #Patch Set 3 : Update path ref #
Total comments: 4
Patch Set 4 : update #Patch Set 5 : Address Kai's reviews #Patch Set 6 : Only fetch data of official perf benchmarks #
Total comments: 1
Patch Set 7 : Keep only refatoring #Messages
Total messages: 51 (30 generated)
The CQ bit was checked by nednguyen@google.com to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
nednguyen@google.com changed reviewers: + dpranke@chromium.org, jbudorick@chromium.org, kainino@chromium.org
https://codereview.chromium.org/2921353002/diff/1/DEPS File DEPS (right): https://codereview.chromium.org/2921353002/diff/1/DEPS#newcode894 DEPS:894: '--gyp-condition', 'fetch_telemetry_dependencies=1', With this, we can remove the conditioning code in https://github.com/catapult-project/catapult/blob/master/telemetry/bin/fetch_...
Description was changed from ========== Add script to conditionally fetch WPR archives of tools/perf/ BUG=725516 ========== to ========== Add script to conditionally fetch WPR archives of tools/perf/ during "gclient runhooks" BUG=725516 ==========
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: android_arm64_dbg_recipe on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_arm6...) android_clang_dbg_recipe on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_clan...) android_cronet on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_cron...) cast_shell_android on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/cast_shell_a...) linux_chromium_chromeos_ozone_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...) linux_chromium_compile_dbg_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...) linux_chromium_headless_rel on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...) mac_chromium_rel_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
The CQ bit was checked by nednguyen@google.com to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_compile_dbg_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...) linux_chromium_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
The CQ bit was checked by nednguyen@google.com to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
On 2017/06/05 21:31:47, nednguyen wrote: > https://codereview.chromium.org/2921353002/diff/1/DEPS > File DEPS (right): > > https://codereview.chromium.org/2921353002/diff/1/DEPS#newcode894 > DEPS:894: '--gyp-condition', 'fetch_telemetry_dependencies=1', > With this, we can remove the conditioning code in > https://github.com/catapult-project/catapult/blob/master/telemetry/bin/fetch_... Non-owner lgtm. My understanding is this script will be obsoleted soon with other work to add conditional dependency support to gclient.
Forgot to send comments with the last message https://codereview.chromium.org/2921353002/diff/40001/tools/perf/conditionall... File tools/perf/conditionally_execute (right): https://codereview.chromium.org/2921353002/diff/40001/tools/perf/conditionall... tools/perf/conditionally_execute:15: help=('Name of environment variable that acts as the switch. If the ' nit: it's not really an environment variable that acts as a switch, but a string that must appear in the environment variable GYP_DEFINES https://codereview.chromium.org/2921353002/diff/40001/tools/perf/conditionall... tools/perf/conditionally_execute:25: if options.gyp_condition in gyp_defines: This may not matter since this should get replaced soon, but maybe it would be good to guard against, e.g., 'FOOfetch_telemetry_dependencies=1BAR'. Most of the existing code that checks GYP_DEFINES doesn't do this currently, though.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_compile_dbg_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...) linux_chromium_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
The CQ bit was checked by nednguyen@google.com to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
https://codereview.chromium.org/2921353002/diff/40001/tools/perf/conditionall... File tools/perf/conditionally_execute (right): https://codereview.chromium.org/2921353002/diff/40001/tools/perf/conditionall... tools/perf/conditionally_execute:15: help=('Name of environment variable that acts as the switch. If the ' On 2017/06/05 21:58:51, Kai Ninomiya wrote: > nit: it's not really an environment variable that acts as a switch, but a string > that must appear in the environment variable GYP_DEFINES Done. https://codereview.chromium.org/2921353002/diff/40001/tools/perf/conditionall... tools/perf/conditionally_execute:25: if options.gyp_condition in gyp_defines: On 2017/06/05 21:58:51, Kai Ninomiya wrote: > This may not matter since this should get replaced soon, but maybe it would be > good to guard against, e.g., 'FOOfetch_telemetry_dependencies=1BAR'. Most of the > existing code that checks GYP_DEFINES doesn't do this currently, though. Done.
The CQ bit was checked by nednguyen@google.com to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
lgtm
On 2017/06/05 22:29:36, Kai Ninomiya wrote: > lgtm Dirk & John: I am submitting this CL to stop the flakiness for now. Feel free to add your comments & I will address them later.
The CQ bit was unchecked by nednguyen@google.com
The CQ bit was checked by nednguyen@google.com
CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by nednguyen@google.com
On 2017/06/05 22:39:01, commit-bot: I haz the power wrote: > CQ is trying da patch. > > Follow status at: > https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or... Wait. This is increasing the duration of "gclient runhooks" from 7s (https://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel...) to 7 minutes (https://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel...). If the bots cache the data, then should be just one time thing. Anyway, I think I should wait for your opinion on the runhooks time increase before landing
Description was changed from ========== Add script to conditionally fetch WPR archives of tools/perf/ during "gclient runhooks" BUG=725516 ========== to ========== Add script to conditionally fetch WPR archives of tools/perf/ during "gclient runhooks" This increases the perf isolate (used by telemetry_perf_unittest) by about 2.6 Gb. BUG=725516 ==========
On 2017/06/05 22:41:30, nednguyen wrote: > On 2017/06/05 22:39:01, commit-bot: I haz the power wrote: > > CQ is trying da patch. > > > > Follow status at: > > > https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or... > > Wait. > > This is increasing the duration of "gclient runhooks" from 7s > (https://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel...) > to 7 minutes > (https://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel...). > If the bots cache the data, then should be just one time thing. > > Anyway, I think I should wait for your opinion on the runhooks time increase > before landing fetch_perf_wpr_archives should run once per build-bot, right? So if you don't get the same build-bot again, it'll do the full download of WPR archives?
On 2017/06/05 22:41:30, nednguyen wrote: > On 2017/06/05 22:39:01, commit-bot: I haz the power wrote: > > CQ is trying da patch. > > > > Follow status at: > > > https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or... > > Wait. > > This is increasing the duration of "gclient runhooks" from 7s > (https://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel...) > to 7 minutes > (https://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel...). > If the bots cache the data, then should be just one time thing. > > Anyway, I think I should wait for your opinion on the runhooks time increase > before landing ! How large are these downloads...?
We shouldn't add 7m of download time to every bot in the CQ lightly, even if it is a 1-time hit (only on a new checkout). We should figure out if we can either reduce the # of files you need or compress them, or zip them up so you have fewer files to download, or fetch them in parallel. All of those things will likely speed this up significantly.
Ned and I chatted about this a bit over IM. It looks like there's ~2.6GB of data spread across ~500 files. This is right about the size where it's not clear if the normal isolate file approach will work fast enough, or whether the "exparchive" approach we use for the layout tests is working. Assuming we did keep the basic approach and downloaded the files during runhooks, then isolated them like any other test file, things should work pretty well. It's worth trying to fetch files in parallel and/or fetch all of them at once as an archive and then downloading them. However, from talking to jbudorick@, the flakiness made the linux_android_rel_ng almost unusable today. Do we know why that is? Have the archives gotten significantly bigger recently? Are we running these tests more often under swarming on android? If we don't know the cause of the increased flakiness, it seems reasonable to disable the test step in order to save the rest of the CQ if we don't have a fix. I'm not sure if it's better to do that than it is to land this CL as-is or try to land an improved version w/ parallel fetches, etc. Thoughts?
On 2017/06/06 00:25:32, Dirk Pranke (slow) wrote: > Ned and I chatted about this a bit over IM. > > It looks like there's ~2.6GB of data spread across ~500 files. This is right > about the size where it's not clear if the normal isolate file approach will > work fast enough, or whether the "exparchive" approach we use for the layout > tests is working. > > Assuming we did keep the basic approach and downloaded the files during > runhooks, then isolated them like any other test file, things should work pretty > well. It's worth trying to fetch files in parallel and/or fetch all of them at > once as an archive and then downloading them. > > However, from talking to jbudorick@, the flakiness made the linux_android_rel_ng > almost unusable today. Do we know why that is? Have the archives gotten > significantly bigger recently? Are we running these tests more often under > swarming on android? Having looked at our monitoring again since our discussion, this was also the case on Thursday and Friday last week. > > If we don't know the cause of the increased flakiness, it seems reasonable to > disable the test step in order to save the rest of the CQ if we don't have a > fix. I'm not sure if it's better to do that than it is to land this CL as-is or > try to land an improved version w/ parallel fetches, etc. > > Thoughts?
On 2017/06/06 00:29:45, jbudorick wrote: > On 2017/06/06 00:25:32, Dirk Pranke (slow) wrote: > > Ned and I chatted about this a bit over IM. > > > > It looks like there's ~2.6GB of data spread across ~500 files. This is right > > about the size where it's not clear if the normal isolate file approach will > > work fast enough, or whether the "exparchive" approach we use for the layout > > tests is working. > > > > Assuming we did keep the basic approach and downloaded the files during > > runhooks, then isolated them like any other test file, things should work > pretty > > well. It's worth trying to fetch files in parallel and/or fetch all of them at > > once as an archive and then downloading them. > > > > However, from talking to jbudorick@, the flakiness made the > linux_android_rel_ng > > almost unusable today. Do we know why that is? Have the archives gotten > > significantly bigger recently? Are we running these tests more often under > > swarming on android? > > Having looked at our monitoring again since our discussion, this was also the > case on Thursday and Friday last week. > > > > > If we don't know the cause of the increased flakiness, it seems reasonable to > > disable the test step in order to save the rest of the CQ if we don't have a > > fix. I'm not sure if it's better to do that than it is to land this CL as-is > or > > try to land an improved version w/ parallel fetches, etc. > > > > Thoughts? The one actually running most of the heavy tests is android_n5x_rel_ng bot, which runs the system health smoke test. The linux_android_rel_ng doesn't run system health smoke test & is meant to take pretty fast to run. I am fine with disabling the whole telemetry_perf_unittest on linux_android_rel_ng now since the situation is pretty bad there.
If we're getting the coverage on another config in the CQ already, then I'm totally fine w/ removing it from LARNG.
On 2017/06/06 01:02:57, Dirk Pranke (slow) wrote: > If we're getting the coverage on another config in the CQ already, then I'm > totally > fine w/ removing it from LARNG. Ok, I keep this CL about refactoring how we conditionally fetch the current Telemetry binary deps.
Description was changed from ========== Add script to conditionally fetch WPR archives of tools/perf/ during "gclient runhooks" This increases the perf isolate (used by telemetry_perf_unittest) by about 2.6 Gb. BUG=725516 ========== to ========== Refactor how we fetch Telemetry deps conditionally Before this, we conditionally fetch Telemetry deps through checking GYP_DEFINES variable in https://github.com/catapult-project/catapult/blob/master/telemetry/bin/fetch_... This refactors it by adding tools/perf/conditionally_execute which allows conditioning executing any script on GYP_DEFINES flag. BUG=725516 ==========
On 2017/06/06 16:18:48, nednguyen wrote: > On 2017/06/06 01:02:57, Dirk Pranke (slow) wrote: > > If we're getting the coverage on another config in the CQ already, then I'm > > totally > > fine w/ removing it from LARNG. > > Ok, I keep this CL about refactoring how we conditionally fetch the current > Telemetry binary deps. PTAL again
The CQ bit was checked by nednguyen@google.com to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
lgtm https://codereview.chromium.org/2921353002/diff/100001/tools/perf/fetch_bench... File tools/perf/fetch_benchmark_deps.py (right): https://codereview.chromium.org/2921353002/diff/100001/tools/perf/fetch_bench... tools/perf/fetch_benchmark_deps.py:110: % b.Name()) This fix should still land somewhere, whether in this CL or not.
On 2017/06/06 17:08:15, jbudorick wrote: > lgtm > > https://codereview.chromium.org/2921353002/diff/100001/tools/perf/fetch_bench... > File tools/perf/fetch_benchmark_deps.py (right): > > https://codereview.chromium.org/2921353002/diff/100001/tools/perf/fetch_bench... > tools/perf/fetch_benchmark_deps.py:110: % b.Name()) > This fix should still land somewhere, whether in this CL or not. Yup. I plan to fix it in the CL that improving the downloading speed.
The CQ bit was unchecked by nednguyen@google.com
The CQ bit was checked by nednguyen@google.com
The patchset sent to the CQ was uploaded after l-g-t-m from kainino@chromium.org Link to the patchset: https://codereview.chromium.org/2921353002/#ps120001 (title: "Keep only refatoring")
CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
CQ is committing da patch.
Bot data: {"patchset_id": 120001, "attempt_start_ts": 1496768945252820,
"parent_rev": "1ef74e6c994ef0e20184f79fdf0372ff756e13fd", "commit_rev":
"534722a4d48581f5216bdf98d98d283ae478849a"}
Message was sent while issue was closed.
Description was changed from ========== Refactor how we fetch Telemetry deps conditionally Before this, we conditionally fetch Telemetry deps through checking GYP_DEFINES variable in https://github.com/catapult-project/catapult/blob/master/telemetry/bin/fetch_... This refactors it by adding tools/perf/conditionally_execute which allows conditioning executing any script on GYP_DEFINES flag. BUG=725516 ========== to ========== Refactor how we fetch Telemetry deps conditionally Before this, we conditionally fetch Telemetry deps through checking GYP_DEFINES variable in https://github.com/catapult-project/catapult/blob/master/telemetry/bin/fetch_... This refactors it by adding tools/perf/conditionally_execute which allows conditioning executing any script on GYP_DEFINES flag. BUG=725516 Review-Url: https://codereview.chromium.org/2921353002 Cr-Commit-Position: refs/heads/master@{#477339} Committed: https://chromium.googlesource.com/chromium/src/+/534722a4d48581f5216bdf98d98d... ==========
Message was sent while issue was closed.
Committed patchset #7 (id:120001) as https://chromium.googlesource.com/chromium/src/+/534722a4d48581f5216bdf98d98d... |
