|
|
Created:
6 years, 6 months ago by epenner Modified:
6 years, 6 months ago CC:
chromium-reviews, klundberg+watch_chromium.org, bulach+watch_chromium.org, yfriedman+watch_chromium.org, ilevy-cc_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Project:
chromium Visibility:
Public. |
DescriptionTelemetry: Set scaling governor on all cpus (on all devices)
We try to set the scaling governor for all CPU cores, but
this doesn't work SOCs that dynamically disable cores
entirely (ie. Qualcomm). Since this only happens on Qcomm,
this patch stops the 'mpdecision' process on Qualcomm, which
makes it behave like other devices.
PERF-SHERIFFS: This may improve metrics for the better given
all cores are in 'performance' mode now. It should hopefully
reduce noise, since a different number of cores might have
had 'performance' set on each run, and any code that runs on
a non-performance core will suffer from noisy timings/rates.
Example: Tough-compositor-cases std-dev went from 0.25ms to
0.04ms with this change.
BUG=383566
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=278837
Patch Set 1 #Patch Set 2 : Test. #Patch Set 3 : #
Total comments: 26
Patch Set 4 : Rebase. #Patch Set 5 : Address feedback. #
Total comments: 15
Patch Set 6 : Address Feedback. #Patch Set 7 : Disable pylint warning. #
Messages
Total messages: 43 (0 generated)
Ptal
Thanks Eric, really interested in seeing how this affects the results. Would you mind updating tools/telemetry/telemetry/core/platform/profiler/perf_profiler.py too? https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... File build/android/pylib/perf/perf_control.py (right): https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:86: if force_online: nit: This could be moved to line 83 to early out before getting the list of cores. https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... File build/android/pylib/perf/perf_control_unittest.py (right): https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control_unittest.py:37: 'ls -d /sys/devices/system/cpu/cpu[0-9]*/cpufreq/cpuinfo_cur_freq') Should we be checking scaling_governor instead of cpuinfo_cur_freq?
I think it's a good idea to have all cores on in HighPerf mode (hey, I even had a CL to do this: https://codereview.chromium.org/73173005/) https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... File build/android/pylib/perf/perf_control.py (right): https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:54: self._ForceAllCpusOnline(False) All this function does fithe False is 'start mpdecision'. Maybe factor this functionality out to make it easier to read? For non-qualcomm SoCs this does not bring down the CPUs that were not up originally. I am not sure how important that is, and how robust it would be to try doing the proper thing here (given the script gets interrupted by the user sometimes), but it seems we may change the device behavior to something far from the initial. This may cause unpleasant surprises (like half of our fleet performs differently from another half, go figure). https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:68: Some vendors (or only Qualcomm?) hot-plug their CPUs, which can add noise It seems Tegra and OMAP4 support hotplug, they just have a special scaling governor value for that. Qualcomm is special in that regard: hotplugging via userspace mpdecision even with governor set to 'performance', and not the sysfs setting. Since we have a notion of ProductModel, we can maintain a list of all the Qualcomm product models here and avoid unnecessary adb shell commands for others. WDYT about this (premature?) optimization? Caveat: 'Nexus 7v1' is a tegra and 'Nexus 7v2' is a QC, having the same product model :/ https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:82: self._device.old_interface.RunShellCommandWithSU(script) SU is not available on some devices. Here we can assume running as root, so using RunShellCommand() would work on those (userdebug?) builds. RunShellCommand/RunShellCommandWithSU is broken, instead it should be RunShellCommand/RunPrivilegedShellCommand, and the RunPrivilegedShellCommand should discover the best way to run (via SU or make sure it runs as root, or fails). Let's fix it in another change. https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:85: 'ls -d /sys/devices/system/cpu/cpu[0-9]*/online') Please use a constant like _CPU_DIR_FMT? there is also self._kernel_max, which may soon exceed 10, no need to run extra shell commands. https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:88: script = 'echo 1 > {0};'.format(online_path) There is android_commands.SetProtectedFileContents()
2nd thought: > Example: Tough-compositor-cases std-dev went from 0.25 to 0.04 with this change. Was it Nexus 5?
https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... File build/android/pylib/perf/perf_control.py (right): https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:54: self._ForceAllCpusOnline(False) On 2014/06/17 11:48:30, pasko wrote: > All this function does fithe False is 'start mpdecision'. Maybe factor this > functionality out to make it easier to read? > > For non-qualcomm SoCs this does not bring down the CPUs that were not up > originally. I am not sure how important that is, and how robust it would be to > try doing the proper thing here (given the script gets interrupted by the user > sometimes), but it seems we may change the device behavior to something far from > the initial. This may cause unpleasant surprises (like half of our fleet > performs differently from another half, go figure). We could always reboot the devices after running the tests if this turns out to be a problem. https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:68: Some vendors (or only Qualcomm?) hot-plug their CPUs, which can add noise On 2014/06/17 11:48:30, pasko wrote: > It seems Tegra and OMAP4 support hotplug, they just have a special scaling > governor value for that. Qualcomm is special in that regard: hotplugging via > userspace mpdecision even with governor set to 'performance', and not the sysfs > setting. > > Since we have a notion of ProductModel, we can maintain a list of all the > Qualcomm product models here and avoid unnecessary adb shell commands for > others. WDYT about this (premature?) optimization? Caveat: 'Nexus 7v1' is a > tegra and 'Nexus 7v2' is a QC, having the same product model :/ I think such a list would quickly become unmaintanable -- we'd probably forget to update it when adding new devices. "start" and "stop" should already be no-ops on non-QC-devices, but we could also check whether /system/sbin/mpdecision (IIRC) exists before running them. https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:82: self._device.old_interface.RunShellCommandWithSU(script) On 2014/06/17 11:48:30, pasko wrote: > SU is not available on some devices. Here we can assume running as root, so > using RunShellCommand() would work on those (userdebug?) builds. Can we really assume that for all callers? At least telemetry doesn't require root. Agreed that the interfaces for running-commands-as-root are somewhat messed up but let's fix that separately. https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:88: script = 'echo 1 > {0};'.format(online_path) On 2014/06/17 11:48:30, pasko wrote: > There is android_commands.SetProtectedFileContents() Right, we should use that. FWIW it wasn't used here before because it adds quite a bit of latency. Now that we stop mpdecision that's no longer an issue.
https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... File build/android/pylib/perf/perf_control.py (right): https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:54: self._ForceAllCpusOnline(False) On 2014/06/17 13:04:07, Sami wrote: > On 2014/06/17 11:48:30, pasko wrote: > > All this function does fithe False is 'start mpdecision'. Maybe factor this > > functionality out to make it easier to read? > > > > For non-qualcomm SoCs this does not bring down the CPUs that were not up > > originally. I am not sure how important that is, and how robust it would be to > > try doing the proper thing here (given the script gets interrupted by the user > > sometimes), but it seems we may change the device behavior to something far > from > > the initial. This may cause unpleasant surprises (like half of our fleet > > performs differently from another half, go figure). > > We could always reboot the devices after running the tests if this turns out to > be a problem. Maybe we should just remove the RestoreOriginalPerfMode()? Set the mode before running tests, don't care about the behavior after running the tests => simpler. https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:68: Some vendors (or only Qualcomm?) hot-plug their CPUs, which can add noise On 2014/06/17 13:04:07, Sami wrote: > On 2014/06/17 11:48:30, pasko wrote: > > It seems Tegra and OMAP4 support hotplug, they just have a special scaling > > governor value for that. Qualcomm is special in that regard: hotplugging via > > userspace mpdecision even with governor set to 'performance', and not the > sysfs > > setting. > > > > Since we have a notion of ProductModel, we can maintain a list of all the > > Qualcomm product models here and avoid unnecessary adb shell commands for > > others. WDYT about this (premature?) optimization? Caveat: 'Nexus 7v1' is a > > tegra and 'Nexus 7v2' is a QC, having the same product model :/ > > I think such a list would quickly become unmaintanable -- we'd probably forget > to update it when adding new devices. "start" and "stop" should already be > no-ops on non-QC-devices, but we could also check whether > /system/sbin/mpdecision (IIRC) exists before running them. Yes, I would prefer a real no-op :) If we check for mpdecision existence once and reuse this knowledge for unaffected devices, that's better. Currently we are running these same shell commands between test runs, that slows things down for no reason. https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:82: self._device.old_interface.RunShellCommandWithSU(script) On 2014/06/17 13:04:07, Sami wrote: > On 2014/06/17 11:48:30, pasko wrote: > > SU is not available on some devices. Here we can assume running as root, so > > using RunShellCommand() would work on those (userdebug?) builds. > > Can we really assume that for all callers? At least telemetry doesn't require > root. Agreed that the interfaces for running-commands-as-root are somewhat > messed up but let's fix that separately. Yes, let's do it separately. I am absolutely sure that my test device will not work with this code the next day the CL is committed, and I am about 99% sure nothing on the perf bots will break if we assume root here today.
https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... File build/android/pylib/perf/perf_control.py (right): https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:54: self._ForceAllCpusOnline(False) On 2014/06/17 15:12:37, pasko wrote: > On 2014/06/17 13:04:07, Sami wrote: > > On 2014/06/17 11:48:30, pasko wrote: > > > All this function does fithe False is 'start mpdecision'. Maybe factor this > > > functionality out to make it easier to read? > > > > > > For non-qualcomm SoCs this does not bring down the CPUs that were not up > > > originally. I am not sure how important that is, and how robust it would be > to > > > try doing the proper thing here (given the script gets interrupted by the > user > > > sometimes), but it seems we may change the device behavior to something far > > from > > > the initial. This may cause unpleasant surprises (like half of our fleet > > > performs differently from another half, go figure). > > > > We could always reboot the devices after running the tests if this turns out > to > > be a problem. > > Maybe we should just remove the RestoreOriginalPerfMode()? Set the mode before > running tests, don't care about the behavior after running the tests => simpler. Hmm, this is called from various front-ends -- not just the bot test runner script -- and that wouldn't be appropriate in all cases. https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:82: self._device.old_interface.RunShellCommandWithSU(script) On 2014/06/17 15:12:37, pasko wrote: > On 2014/06/17 13:04:07, Sami wrote: > > On 2014/06/17 11:48:30, pasko wrote: > > > SU is not available on some devices. Here we can assume running as root, so > > > using RunShellCommand() would work on those (userdebug?) builds. > > > > Can we really assume that for all callers? At least telemetry doesn't require > > root. Agreed that the interfaces for running-commands-as-root are somewhat > > messed up but let's fix that separately. > > Yes, let's do it separately. > > I am absolutely sure that my test device will not work with this code the next > day the CL is committed, and I am about 99% sure nothing on the perf bots will > break if we assume root here today. Right, again, this affects more than just the perf bots :(
https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... File build/android/pylib/perf/perf_control.py (right): https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:54: self._ForceAllCpusOnline(False) On 2014/06/17 15:46:35, Sami wrote: > On 2014/06/17 15:12:37, pasko wrote: > > On 2014/06/17 13:04:07, Sami wrote: > > > On 2014/06/17 11:48:30, pasko wrote: > > > > All this function does fithe False is 'start mpdecision'. Maybe factor > this > > > > functionality out to make it easier to read? > > > > > > > > For non-qualcomm SoCs this does not bring down the CPUs that were not up > > > > originally. I am not sure how important that is, and how robust it would > be > > to > > > > try doing the proper thing here (given the script gets interrupted by the > > user > > > > sometimes), but it seems we may change the device behavior to something > far > > > from > > > > the initial. This may cause unpleasant surprises (like half of our fleet > > > > performs differently from another half, go figure). > > > > > > We could always reboot the devices after running the tests if this turns out > > to > > > be a problem. > > > > Maybe we should just remove the RestoreOriginalPerfMode()? Set the mode before > > running tests, don't care about the behavior after running the tests => > simpler. > > Hmm, this is called from various front-ends -- not just the bot test runner > script -- and that wouldn't be appropriate in all cases. OK, I just don't understand the contract we are maintaining. Why leaving leftover cores online is acceptable, but the leftover governor is not? What are these various frontends?
On 2014/06/17 17:02:23, pasko wrote: > OK, I just don't understand the contract we are maintaining. Why leaving > leftover cores online is acceptable, but the leftover governor is not? What are > these various frontends? Ok, now I'm confused :) We're restoring the governor here too, no? There are three front-ends: perf bot scripts, telemetry with --profiler=perf and adb_profile_chrome with --perf. Related question: should we check/reset the max CPU frequencies too? mpdecision may be leaving them to random values.
On 2014/06/17 17:13:33, Sami wrote: > On 2014/06/17 17:02:23, pasko wrote: > > OK, I just don't understand the contract we are maintaining. Why leaving > > leftover cores online is acceptable, but the leftover governor is not? What > are > > these various frontends? > > Ok, now I'm confused :) We're restoring the governor here too, no? Yes, we are restoring the governor here. However, on non-qualcomm devices we are not bringing down any cores. Is there a guarantee that returning governors to 'original' would bring the cores up/down exactly as it was before the test? I think there is no such guarantee. So the situation is: we try to get back to the previous state only partially (i.e. for governors, but not for cores). And my question would be "is this on purpose?" > There are three front-ends: perf bot scripts, telemetry with --profiler=perf and > adb_profile_chrome with --perf. These others absolutely must run as adb root, if they are not, let's break here and see the breakages instead of looking at improperly collected data. > Related question: should we check/reset the max CPU frequencies too? mpdecision > may be leaving them to random values. really? Then we should check and reset. Hm, that may actually bring the noise even lower. My understanding is that we should take available freqs from ...cpuX/cpufreq/scaling_available_frequencies and make sure cpuX/cpufreq/cpuinfo_{min,max}_freq is set to a desired value before running tests.
On 2014/06/17 17:44:55, pasko wrote: > On 2014/06/17 17:13:33, Sami wrote: > > On 2014/06/17 17:02:23, pasko wrote: > > > OK, I just don't understand the contract we are maintaining. Why leaving > > > leftover cores online is acceptable, but the leftover governor is not? What > > are > > > these various frontends? > > > > Ok, now I'm confused :) We're restoring the governor here too, no? > > Yes, we are restoring the governor here. However, on non-qualcomm devices we are > not bringing down any cores. Is there a guarantee that returning governors to > 'original' would bring the cores up/down exactly as it was before the test? I > think there is no such guarantee. So the situation is: we try to get back to the > previous state only partially (i.e. for governors, but not for cores). And my > question would be "is this on purpose?" Yes, see here: go/aeluz > > There are three front-ends: perf bot scripts, telemetry with --profiler=perf > and > > adb_profile_chrome with --perf. > > These others absolutely must run as adb root, if they are not, let's break here > and see the breakages instead of looking at improperly collected data. Ah, I think you're right. Maybe we can just assert that we have root here. > really? Then we should check and reset. Hm, that may actually bring the noise > even lower. My understanding is that we should take available freqs from > ...cpuX/cpufreq/scaling_available_frequencies and make sure > cpuX/cpufreq/cpuinfo_{min,max}_freq is set to a desired value before running > tests. I didn't actually verify this -- let's check before making this more complex than it needs to be.
https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... File build/android/pylib/perf/perf_control.py (right): https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:54: self._ForceAllCpusOnline(False) My thinking was that this only only different if mpdecision exists (other devices were already like this, right?). And if mpdecision exists, it will restore the cores to whatever it decides is 'correct' after we start it again. If any cores are left enabled though, then we need to restore their governors (especially for CPU0). Since all tests currently do this, I don't think there's currently an issue with half the fleet having different behavior, but also if this doesn't work then I don't see any fix short of rebooting the device. Ideally though mpdecision isn't stateful and the device will behave the same any time it is enabled. https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:68: Some vendors (or only Qualcomm?) hot-plug their CPUs, which can add noise Okay I buy it's good to check for mpdecision first, but are we sure it's worth caching it? There is several shell commands here and all the shell commands are pretty tiny compared to the tests themselves. https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:82: self._device.old_interface.RunShellCommandWithSU(script) Can I take away this okay for now? Will these commands can just fail silently for non root devices? That should be fine? https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:86: if force_online: On 2014/06/17 04:16:43, Sami wrote: > nit: This could be moved to line 83 to early out before getting the list of > cores. Nice. Done. https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:88: script = 'echo 1 > {0};'.format(online_path) On 2014/06/17 13:04:07, Sami wrote: > On 2014/06/17 11:48:30, pasko wrote: > > There is android_commands.SetProtectedFileContents() > > Right, we should use that. FWIW it wasn't used here before because it adds quite > a bit of latency. Now that we stop mpdecision that's no longer an issue. Ahh good point, yeah it needed the && before to win the race mpdecision. I'll change it. https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... File build/android/pylib/perf/perf_control_unittest.py (right): https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control_unittest.py:37: 'ls -d /sys/devices/system/cpu/cpu[0-9]*/cpufreq/cpuinfo_cur_freq') On 2014/06/17 04:16:44, Sami wrote: > Should we be checking scaling_governor instead of cpuinfo_cur_freq? Thanks! As you can see I haven't run this yet. I was having trouble finding the test suite's name. Do you happen to have that handy?
Okay, since there was a bit of back and forth comments, I've made all the changes recommended so far. Let me know if I'm missed or misinterpreted something. We can also continue to discuss the thermal issue on the bug as I test other devices.
https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... File build/android/pylib/perf/perf_control.py (right): https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:54: self._ForceAllCpusOnline(False) On 2014/06/17 22:10:55, epenner wrote: > My thinking was that this only only different if mpdecision exists (other > devices were already like this, right?). And if mpdecision exists, it will > restore the cores to whatever it decides is 'correct' after we start it again. > > If any cores are left enabled though, then we need to restore their governors > (especially for CPU0). > > Since all tests currently do this, I don't think there's currently an issue with > half the fleet having different behavior, The issue is that we cannot really divide our current fleet by half :) But I attempt to be optimistic: we will have more devices. > but also if this doesn't work then I > don't see any fix short of rebooting the device. Ideally though mpdecision isn't > stateful and the device will behave the same any time it is enabled. Spinning off the discussion here: https://code.google.com/p/chromium/issues/detail?id=386082 https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:68: Some vendors (or only Qualcomm?) hot-plug their CPUs, which can add noise On 2014/06/17 22:10:56, epenner wrote: > Okay I buy it's good to check for mpdecision first, but are we sure it's worth > caching it? There is several shell commands here and all the shell commands are > pretty tiny compared to the tests themselves. You are right, we are deciding whether to trade smallish perf benefit for smallish complexity. A few more arguments in favor of more complexity: * these shell commands take a few seconds total between test runs * more adb commands pollute the verbose output on the bots, have to look through more irrelevancy in the logs * Seeing failures in the logs can be quite misleading, these quickly become red herringz when something else happens on bots => harder for inexperienced sheriffs to investigate On the other hand, the core idea of this CL is much more important than this minor comment, so I'm stopping my side of the bikeshed now :) https://codereview.chromium.org/338233003/diff/20002/build/android/pylib/perf... build/android/pylib/perf/perf_control.py:82: self._device.old_interface.RunShellCommandWithSU(script) On 2014/06/17 22:10:56, epenner wrote: > Can I take away this okay for now? Will these commands can just fail silently > for non root devices? That should be fine? Yeah, I think it will fail silently if adb is not run as root.
Ptal for latest. - I cache have_mpdecision_ now and only start/stop if it exists. - This patch only enables cores for N4. To support perf profiler, I've added ForceHighPerfMode which does it for all devices. - I've removed RestoreOriginal and replaced it with SetDefault in the once place it was used. I could remove SetDefault as well, but I think it's still slightly better than doing nothing. We can discuss further on the separate bug though.
I think this is looking pretty good now and can almost land before we sort out the deterministic vs. realistic mode. I won't object if other reviewers express that we need to wait. LGTM https://codereview.chromium.org/338233003/diff/110001/build/android/pylib/per... File build/android/pylib/perf/perf_control.py (right): https://codereview.chromium.org/338233003/diff/110001/build/android/pylib/per... build/android/pylib/perf/perf_control.py:48: def ForceHighPerfMode(self): I think SetProfilingMode() would be more readable and won't require any self-documentation. In the future it may diverge from the HighPerfMode, so it's probably here to stay for longer. https://codereview.chromium.org/338233003/diff/110001/build/android/pylib/per... build/android/pylib/perf/perf_control.py:101: logging.warn('Unexpected cpu hot plugging detected.') loggin.warn is deprecated, please use logging.warning https://codereview.chromium.org/338233003/diff/110001/build/android/pylib/per... File build/android/pylib/perf/perf_control_unittest.py (right): https://codereview.chromium.org/338233003/diff/110001/build/android/pylib/per... build/android/pylib/perf/perf_control_unittest.py:28: perf.ForceHighPerfMode() It was like that before, but this does not really look like a unittest if it changes the state of the whole system. ideally we would use something like telemetry.unittest.simple_mock to mock out self._device.old_interface, but it can take some time. Up to you.
I added a few nits but overall lgtm. https://codereview.chromium.org/338233003/diff/110001/build/android/pylib/per... File build/android/pylib/perf/perf_control.py (right): https://codereview.chromium.org/338233003/diff/110001/build/android/pylib/per... build/android/pylib/perf/perf_control.py:32: def NumCpuCores(self): nit: This could be a @property (and perhaps private too). https://codereview.chromium.org/338233003/diff/110001/build/android/pylib/per... build/android/pylib/perf/perf_control.py:36: return self._CPU_ONLINE_FMT These two functions are only used in test code. I think it'd be cleaner just to have the test reach in an get these constants from here or duplicate them instead of making them part of the public interface. No-one else should need these. https://codereview.chromium.org/338233003/diff/110001/build/android/pylib/per... build/android/pylib/perf/perf_control.py:109: online_path, "1") nit: '1' instead of "1" https://codereview.chromium.org/338233003/diff/110001/build/android/pylib/per... File build/android/pylib/perf/perf_control_unittest.py (right): https://codereview.chromium.org/338233003/diff/110001/build/android/pylib/per... build/android/pylib/perf/perf_control_unittest.py:28: perf.ForceHighPerfMode() On 2014/06/19 17:48:32, pasko wrote: > It was like that before, but this does not really look like a unittest if it > changes the state of the whole system. > > ideally we would use something like telemetry.unittest.simple_mock to mock out > self._device.old_interface, but it can take some time. > > Up to you. Cue age old debate about integration vs. unit tests :) My opinion is that in this case we actually want to frob the state of the device and make sure it actually responded. This way we can catch problems with new devices that work differently. (BTW, AFAICT these tests aren't actually running on any bots -- probably should fix that too :) https://codereview.chromium.org/338233003/diff/110001/tools/android/adb_profi... File tools/android/adb_profile_chrome/perf_controller.py (right): https://codereview.chromium.org/338233003/diff/110001/tools/android/adb_profi... tools/android/adb_profile_chrome/perf_controller.py:61: self._perf_control.ForceHighPerfMode(True) True should be removed, right? BTW, tools/android/adb_profile_chrome/run_tests should catch this -- also not on any bots atm. :(
Thanks! Addressed all feedback. I'm going to land unless someone objects. Also feel free to file bugs against me for realistic/real modes as we determine exactly what to do there. https://codereview.chromium.org/338233003/diff/110001/build/android/pylib/per... File build/android/pylib/perf/perf_control.py (right): https://codereview.chromium.org/338233003/diff/110001/build/android/pylib/per... build/android/pylib/perf/perf_control.py:32: def NumCpuCores(self): On 2014/06/19 18:12:42, Sami wrote: > nit: This could be a @property (and perhaps private too). Done. https://codereview.chromium.org/338233003/diff/110001/build/android/pylib/per... build/android/pylib/perf/perf_control.py:36: return self._CPU_ONLINE_FMT On 2014/06/19 18:12:43, Sami wrote: > These two functions are only used in test code. I think it'd be cleaner just to > have the test reach in an get these constants from here or duplicate them > instead of making them part of the public interface. No-one else should need > these. Yeah this was just for the test since accessing privates is a warning. Latest patch duplicates the constants and just keeps NumCpuCores public. If you want that one private I can inherit a special class in the unit test to get at it (or let me know if there is a better way). https://codereview.chromium.org/338233003/diff/110001/build/android/pylib/per... build/android/pylib/perf/perf_control.py:48: def ForceHighPerfMode(self): On 2014/06/19 17:48:31, pasko wrote: > I think SetProfilingMode() would be more readable and won't require any > self-documentation. In the future it may diverge from the HighPerfMode, so it's > probably here to stay for longer. Changed to SetPerfProfilingMode. https://codereview.chromium.org/338233003/diff/110001/build/android/pylib/per... build/android/pylib/perf/perf_control.py:101: logging.warn('Unexpected cpu hot plugging detected.') On 2014/06/19 17:48:31, pasko wrote: > loggin.warn is deprecated, please use logging.warning Done. https://codereview.chromium.org/338233003/diff/110001/build/android/pylib/per... build/android/pylib/perf/perf_control.py:109: online_path, "1") On 2014/06/19 18:12:42, Sami wrote: > nit: '1' instead of "1" Done. https://codereview.chromium.org/338233003/diff/110001/tools/android/adb_profi... File tools/android/adb_profile_chrome/perf_controller.py (right): https://codereview.chromium.org/338233003/diff/110001/tools/android/adb_profi... tools/android/adb_profile_chrome/perf_controller.py:61: self._perf_control.ForceHighPerfMode(True) On 2014/06/19 18:12:43, Sami wrote: > True should be removed, right? > > BTW, tools/android/adb_profile_chrome/run_tests should catch this -- also not on > any bots atm. :( Thanks. I ran the above and the other test manually for the latest patch.
The CQ bit was checked by epenner@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/epenner@chromium.org/338233003/130001
On 2014/06/19 23:58:50, epenner wrote: > Yeah this was just for the test since accessing privates is a warning. Latest > patch duplicates the constants and just keeps NumCpuCores public. If you want > that one private I can inherit a special class in the unit test to get at it (or > let me know if there is a better way). Well, "better", but a lot of the tests just disable the pylint warning about accessing private fields.
The CQ bit was unchecked by epenner@chromium.org
The CQ bit was checked by epenner@chromium.org
On 2014/06/20 00:02:08, Sami wrote: > On 2014/06/19 23:58:50, epenner wrote: > > Yeah this was just for the test since accessing privates is a warning. Latest > > patch duplicates the constants and just keeps NumCpuCores public. If you want > > that one private I can inherit a special class in the unit test to get at it > (or > > let me know if there is a better way). > > Well, "better", but a lot of the tests just disable the pylint warning about > accessing private fields. Ahh. yeah I was considering but didn't know if that was common. For a test it seems reasonable so I went ahead and did that and made everything private.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/epenner@chromium.org/338233003/150001
Oops still need tools/telemetry LGTM. Also, after reading the other but, let me know if we should hold off for deterministic mode first.
Actually, read the other bug that it's okay to land this before 'realistic' mode lands. So just need your LGTM Tony, for tools/telemetry OWNERS
FYI, CQ is re-trying this CL (attempt #1). The failing builders are: chromium_presubmit on tryserver.chromium (http://build.chromium.org/p/tryserver.chromium/builders/chromium_presubmit/bu...) win_chromium_x64_rel on tryserver.chromium (http://build.chromium.org/p/tryserver.chromium/builders/win_chromium_x64_rel/...)
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: chromium_presubmit on tryserver.chromium (http://build.chromium.org/p/tryserver.chromium/builders/chromium_presubmit/bu...)
presubmit says someone from tools/telemetry/OWNERS has to approve... https://codereview.chromium.org/338233003/diff/110001/build/android/pylib/per... File build/android/pylib/perf/perf_control.py (right): https://codereview.chromium.org/338233003/diff/110001/build/android/pylib/per... build/android/pylib/perf/perf_control.py:36: return self._CPU_ONLINE_FMT On 2014/06/19 23:58:50, epenner wrote: > On 2014/06/19 18:12:43, Sami wrote: > > These two functions are only used in test code. I think it'd be cleaner just > to > > have the test reach in an get these constants from here or duplicate them > > instead of making them part of the public interface. No-one else should need > > these. > > Yeah this was just for the test since accessing privates is a warning. Latest > patch duplicates the constants and just keeps NumCpuCores public. If you want > that one private I can inherit a special class in the unit test to get at it (or > let me know if there is a better way). One other option: CpuOnlineFormatForTesting()
On 2014/06/20 13:48:37, pasko wrote: > presubmit says someone from tools/telemetry/OWNERS has to approve... > +Dtu for tools/telemetry/OWNERS since Tony is out. Tony gave the go ahead on the bug.
lgtm
The CQ bit was checked by epenner@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/epenner@chromium.org/338233003/150001
Message was sent while issue was closed.
Change committed as 278837
Message was sent while issue was closed.
The new exception is thrown for me when I run `tools/perf/run_benchmark -v --browser=android-chrome-shell thread_times.key_mobile_sites --pageset-repeat=10`: Traceback (most recent call last): <module> at tools/perf/run_benchmark:25 sys.exit(test_runner.main()) main at tools/telemetry/telemetry/test_runner.py:300 return command().Run(options) Run at tools/telemetry/telemetry/test_runner.py:177 return min(255, self._test().Run(args)) Run at tools/telemetry/telemetry/test.py:84 results = page_runner.Run(test, ps, expectations, args) Run at tools/telemetry/telemetry/page/page_runner.py:411 page, credentials_path, possible_browser, results, state) _PrepareAndRunPage at tools/telemetry/telemetry/page/page_runner.py:253 finder_options) StartBrowserIfNeeded at tools/telemetry/telemetry/page/page_runner.py:49 self.browser = possible_browser.Create() Create at tools/telemetry/telemetry/core/backends/chrome/android_browser_finder.py:111 b = browser.Browser(backend, self._platform_backend) __init__ at tools/telemetry/telemetry/core/browser.py:43 self._platform.SetFullPerformanceModeEnabled(True) SetFullPerformanceModeEnabled at tools/telemetry/telemetry/core/platform/__init__.py:64 return self._platform_backend.SetFullPerformanceModeEnabled(enabled) SetFullPerformanceModeEnabled at tools/telemetry/telemetry/core/platform/android_platform_backend.py:97 self._perf_tests_setup.SetHighPerfMode() SetHighPerfMode at build/android/pylib/perf/perf_control.py:39 self._ForceAllCpusOnline(True) _ForceAllCpusOnline at build/android/pylib/perf/perf_control.py:108 raise RuntimeError('Failed to force CPUs online') RuntimeError: Failed to force CPUs online This is with a nexus 4; probably because it's not rooted. http://www.chromium.org/developers/telemetry/run_locally doesn't mention that that's necessary though, and maybe it wasn't necessary before this change?
Message was sent while issue was closed.
> This is with a nexus 4; probably because it's not rooted. > http://www.chromium.org/developers/telemetry/run_locally doesn't mention that > that's necessary though, and maybe it wasn't necessary before this change? Darn. I added this to double check the behavior when rooted. When not rooted we only want to fail for the perf profiler and possibly just a log otherwise. I'll have a patch up shortly.
Message was sent while issue was closed.
On 2014/06/23 18:11:21, Nico (away) wrote: > The new exception is thrown for me when I run `tools/perf/run_benchmark -v > --browser=android-chrome-shell thread_times.key_mobile_sites > --pageset-repeat=10`: > > Traceback (most recent call last): > <module> at tools/perf/run_benchmark:25 > sys.exit(test_runner.main()) > main at tools/telemetry/telemetry/test_runner.py:300 > return command().Run(options) > Run at tools/telemetry/telemetry/test_runner.py:177 > return min(255, self._test().Run(args)) > Run at tools/telemetry/telemetry/test.py:84 > results = page_runner.Run(test, ps, expectations, args) > Run at tools/telemetry/telemetry/page/page_runner.py:411 > page, credentials_path, possible_browser, results, state) > _PrepareAndRunPage at tools/telemetry/telemetry/page/page_runner.py:253 > finder_options) > StartBrowserIfNeeded at tools/telemetry/telemetry/page/page_runner.py:49 > self.browser = possible_browser.Create() > Create at > tools/telemetry/telemetry/core/backends/chrome/android_browser_finder.py:111 > b = browser.Browser(backend, self._platform_backend) > __init__ at tools/telemetry/telemetry/core/browser.py:43 > self._platform.SetFullPerformanceModeEnabled(True) > SetFullPerformanceModeEnabled at > tools/telemetry/telemetry/core/platform/__init__.py:64 > return self._platform_backend.SetFullPerformanceModeEnabled(enabled) > SetFullPerformanceModeEnabled at > tools/telemetry/telemetry/core/platform/android_platform_backend.py:97 > self._perf_tests_setup.SetHighPerfMode() > SetHighPerfMode at build/android/pylib/perf/perf_control.py:39 > self._ForceAllCpusOnline(True) > _ForceAllCpusOnline at build/android/pylib/perf/perf_control.py:108 > raise RuntimeError('Failed to force CPUs online') > RuntimeError: Failed to force CPUs online > > This is with a nexus 4; probably because it's not rooted. > http://www.chromium.org/developers/telemetry/run_locally doesn't mention that > that's necessary though, and maybe it wasn't necessary before this change? Thanks Nico. I think it was not necessary before to run as root, we just ignored all the issues we have with that mode. We should better always run benchmarks with root capabilities for simplicity (some benchmarks require bringing CPU into certain mode to make measurements less noisy, some others need to drop OS disk page cache, some yet another type of benchmark obtains detailed memory dumps from processes running under different UIDs, etc.) I just updated the documentation to say that we need a userdebug build and adb daemon being run as root, or having the su command installed (su is the only option on non-nexus devices, where we cannot obtain userdebug builds). I think that's a minor hassle compared to what we would alternatively go through after misinterpreting results of perf measurements.
Message was sent while issue was closed.
On 2014/06/23 18:59:02, epenner wrote: > > This is with a nexus 4; probably because it's not rooted. > > http://www.chromium.org/developers/telemetry/run_locally doesn't mention that > > that's necessary though, and maybe it wasn't necessary before this change? > > Darn. I added this to double check the behavior when rooted. When not rooted we > only want to fail for the perf profiler and possibly just a log otherwise. I would actually prefer failing, nobody notices log messages like these. Would this message say "you may disregard all results now" in the middle of the log?
On my device, thakis@yearofthelinuxdesktop:~/src/chrome/src$ third_party/android_tools/sdk/platform-tools/adb root adbd cannot run as root in production builds So for me requiring this is a somewhat large nuisance. I'd prefer noisy results over no results, I think. (result.html could say in red "this is noisy") On Mon, Jun 23, 2014 at 12:06 PM, <pasko@chromium.org> wrote: > On 2014/06/23 18:59:02, epenner wrote: > >> > This is with a nexus 4; probably because it's not rooted. >> > http://www.chromium.org/developers/telemetry/run_locally doesn't >> mention >> > that > >> > that's necessary though, and maybe it wasn't necessary before this >> change? >> > > Darn. I added this to double check the behavior when rooted. When not >> rooted >> > we > >> only want to fail for the perf profiler and possibly just a log otherwise. >> > > I would actually prefer failing, nobody notices log messages like these. > Would > this message say "you may disregard all results now" in the middle of the > log? > > > > https://codereview.chromium.org/338233003/ > To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
Message was sent while issue was closed.
I can see both sides of the argument, but I'm also leaning towards some things working without root if it's not a major burden. One use case: I test a lot with random off-shelf devices (and page-repeat works okay as mitigation). Precedent: I doubt we'll disable tracing, and the timings there will also be more noisy without root. On 2014/06/23 19:10:44, Nico (away) wrote: > On my device, > > thakis@yearofthelinuxdesktop:~/src/chrome/src$ > third_party/android_tools/sdk/platform-tools/adb root > adbd cannot run as root in production builds > > So for me requiring this is a somewhat large nuisance. I'd prefer noisy > results over no results, I think. (result.html could say in red "this is > noisy") >
Message was sent while issue was closed.
On 2014/06/23 19:44:03, epennerAtGoogle wrote: > I can see both sides of the argument, but I'm also leaning towards some things > working without root if it's not a major burden. > > One use case: I test a lot with random off-shelf devices (and page-repeat works > okay as mitigation). > Precedent: I doubt we'll disable tracing, and the timings there will also be > more noisy without root. My concerns are: 1. supporting extra mode (non-root) requires extra effort 2. non-root in general means not only extra noise, but also lack of results or skewed results, cases when "it's non-root, but still okay" won't be documented, and will get easily subtle. That said, I think we can still support non-root benchmarking, if the log message about skewed results appears in the end of the run, that'd make me more comfortable. |