Chromium Code Reviews
Created: 4 years, 4 months ago by dtu
Modified: 4 years, 3 months ago
CC: chromium-reviews, infra-reviews+build_chromium.org, kjellander-cc_chromium.org, sullivan
Base URL: https://chromium.googlesource.com/chromium/tools/build.git@master
Target Ref: refs/heads/master
Project: build
Visibility: Public.
Description
Increase timeout for determining initial confidence from 2 hours to 20 hours.
Some benchmarks take up to 40 minutes to run. 2 hours only gets you 4
iterations. That's not enough to establish confidence for many regressions. If
you want to get up to 15 iterations of both the first and last commits, that's
20 hours. A build times out after 24 hours, so 20 seems like a good number with
some headroom.
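The arithmetic behind the 20-hour figure can be checked with a quick sketch (the constants are the ones quoted in the paragraph above; this is illustrative only, not part of the recipe code):

```python
# Back-of-envelope check of the timeout math in the description above.
RUN_MINUTES = 40          # stated worst-case length of one benchmark run
ITERATIONS = 15           # desired iterations per commit
COMMITS = 2               # first and last commit of the regression range
BUILD_TIMEOUT_HOURS = 24  # hard per-build timeout

def hours_needed(iterations, commits, run_minutes):
    """Total hours to run `iterations` of a benchmark on each commit."""
    return iterations * commits * run_minutes / 60.0

needed = hours_needed(ITERATIONS, COMMITS, RUN_MINUTES)  # 20.0 hours
headroom = BUILD_TIMEOUT_HOURS - needed                  # 4.0 hours of slack
```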
Unfortunately, if it takes that long to establish confidence, the bisect run
will probably time out after 24 hours, anyway. You could argue that we're
wasting bisect resources doing this, but having a timeout would be clearer to
the user than having a no-confidence result.
Long term fix would be to implement --story-filter for benchmarks that we can
divide up, to reduce the run time:
https://github.com/catapult-project/catapult/issues/1811
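For context, a --story-filter flag would let the bisect run only the regressing story instead of the whole story set. A minimal sketch of that kind of regex filtering (story names are made up for illustration; this is not Telemetry's actual implementation):

```python
import re

def filter_stories(story_names, story_filter=None):
    """Return only the stories whose names match the regex, mimicking the
    effect a --story-filter flag would have on a benchmark's story set.
    (Sketch only; not the Telemetry implementation.)"""
    if not story_filter:
        return list(story_names)
    pattern = re.compile(story_filter)
    return [name for name in story_names if pattern.search(name)]

# Hypothetical story names for illustration.
all_stories = ['load:news:cnn', 'load:news:bbc', 'browse:social:twitter']
subset = filter_stories(all_stories, story_filter=r'load:news:cnn')
# Running only `subset` takes a fraction of the full benchmark's run time.
```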
Also swarming will remove that 24-hour timeout restriction, I believe.
BUG=640509
Committed: https://chromium.googlesource.com/chromium/tools/build/+/6a0ae9a267cf531c5730e35900d3ec7cd1f4f20e
Patch Set 1 #
Messages
Total messages: 20 (11 generated)
dtu@chromium.org changed reviewers: + prasadv@chromium.org, robertocn@chromium.org
sullivan@chromium.org changed reviewers: + sullivan@chromium.org
lgtm
The CQ bit was checked by dtu@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: Build Presubmit on luci.infra.try (JOB_FAILED, https://luci-milo.appspot.com/swarming/task/30d815a70a3c8810)
perezju@chromium.org changed reviewers: + perezju@chromium.org
Ouch, this is bad. Basically means that bisecting for system health (or memory.top_10_mobile) is nearly useless; this will just make the thing run for 24 hours before telling us it couldn't do anything useful anyway.

Fixing issue https://github.com/catapult-project/catapult/issues/1811 is then a major blocker to really solve this.

Also we'll have to think a bit about long-ish running tests (like memory.top_10_mobile_stress) where we do want to run for a while without closing/restarting the browser, and would not want to use --story-filter.

Question: for a very large regression, is it possible for the bisect algorithm to achieve "confidence" after a couple of runs? Or does it really need a minimum of 10 or something?
On 2016/08/25 09:11:49, perezju wrote:
> Ouch, this is bad. Basically means that bisecting for system health (or
> memory.top_10_mobile) is nearly useless; and this will just make the thing run
> for 24 hours before telling us it couldn't do anything useful anyway.
>
> Fixing issue https://github.com/catapult-project/catapult/issues/1811 is then a
> major blocker to really solve this.

This will only fix the problem for the System Health benchmark. memory.top_10_mobile[_stress] doesn't tear down the browser after every single story, so we still have to run the full story set to reproduce the regression (actually we can stop running it as soon as we've finished running the bisected story).

> Also we'll have to think a bit about long-ish running tests (like
> memory.top_10_mobile_stress) where we do want to run for a while without
> closing/restarting the browser, and would not want to use --story-filter.
>
> Question, for a very large regression, is it possible for the bisect algorithm
> achieve "confidence" after a couple of runs? Or does it really need a minimum of
> 10 or something?
On 2016/08/25 09:17:04, petrcermak wrote:
> On 2016/08/25 09:11:49, perezju wrote:
> > Ouch, this is bad. Basically means that bisecting for system health (or
> > memory.top_10_mobile) is nearly useless; and this will just make the thing run
> > for 24 hours before telling us it couldn't do anything useful anyway.
> >
> > Fixing issue https://github.com/catapult-project/catapult/issues/1811 is then a
> > major blocker to really solve this.
>
> This will only fix the problem for the System Health benchmark.
> memory.top_10_mobile[_stress] doesn't tear down the browser after every single
> story, so we still have to run the full story set to reproduce the regression
> (actually we can stop running it as soon as we've finished running the bisected
> story).

I can change that on memory.top_10_mobile (after branch point). We'll need to think harder about memory.top_10_mobile_stress; maybe the answer is that it is not really bisectable in an effective manner. And that could be OK too, as long as we do have good solutions for the other cases.
The CQ bit was checked by dtu@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Committed patchset #1 (id:1) as https://chromium.googlesource.com/chromium/tools/build/+/6a0ae9a267cf531c5730... |
