Description

Telemetry: Don't fail on setting performance mode without root.
Writing the scaling_governor file requires root, and attempting
to do so without root now fails. This patch early outs with a
warning when we don't have root.
BUG=391526
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=286441
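The early-out described in the CL can be sketched roughly as follows. Note this is an illustrative sketch only: `have_root`, `write_governor`, and `num_cpus` are hypothetical stand-ins for the device plumbing in perf_control.py, not the real API.

```python
import logging

def set_high_perf_mode(have_root, write_governor, num_cpus=4):
    """Set the 'performance' scaling governor on each CPU, or bail early.

    `have_root` is a bool; `write_governor(cpu, governor)` stands in for
    writing /sys/devices/system/cpu/cpu<N>/cpufreq/scaling_governor.
    """
    if not have_root:
        # Writing scaling_governor requires root; warn instead of failing.
        logging.warning('Need root for performance mode. Results may be NOISY!!')
        return False  # early out: nothing was changed
    for cpu in range(num_cpus):
        write_governor(cpu, 'performance')
    return True
```

Without root the function returns before touching any sysfs files, so an unrooted device runs the benchmark with its default governor and only pays the cost of noisier results.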
Patch Set 1 #
Total comments: 2
Patch Set 2 : Also early out for default. #
Patch Set 3 : All caps "NOISY" #
Patch Set 4 : Add failure with message. #
Patch Set 5 : Use AtExit warning. #

Messages
Total messages: 17 (0 generated)
Ptal. I think failing to set a protected file without root is a new behavior. That makes sense though, so this just adds an early out so we don't try it.
https://codereview.chromium.org/373223002/diff/1/build/android/pylib/perf/per... File build/android/pylib/perf/perf_control.py (right): https://codereview.chromium.org/373223002/diff/1/build/android/pylib/perf/per... build/android/pylib/perf/perf_control.py:33: logging.warning('Need root for performance mode. Results may be noisy!') I agree that running on non-rooted devices is useful. How can we make the warning more visible? Can you find a way for tools/perf/run_benchmark to print the warning right after printing the results? Yes, this would introduce extra complexity, but I think that we agreed that this use-case (i.e. running perf measurements on non-rooted devices) is meaningful enough to pay for this extra complexity. Also, maybe the wording should include 'WARNING WARNING WARNING'? :)
https://codereview.chromium.org/373223002/diff/1/build/android/pylib/perf/per... File build/android/pylib/perf/perf_control.py (right): https://codereview.chromium.org/373223002/diff/1/build/android/pylib/perf/per... build/android/pylib/perf/perf_control.py:33: logging.warning('Need root for performance mode. Results may be noisy!') On 2014/07/09 09:20:30, pasko wrote: > I agree that running on non-rooted devices is useful. > > How can we make the warning more visible? Can you find a way for > tools/perf/run_benchmark to print the warning right after printing the results? > Yes, this would introduce extra complexity, but I think that we agreed that this > use-case (i.e. running perf measurements on non-rooted devices) is meaningful > enough to pay for this extra complexity. > > Also, maybe the wording should include 'WARNING WARNING WARNING'? :) I started looking into this, but the more I do, the more I'm thinking this is subjective and not worth the complexity. Firstly, I just don't see this being that large of an issue. Any developer doing anything with Chrome should have a rooted Android device (we all got an N7 and an N4 recently). If they don't have root, it's because they are using an off-the-shelf device for gathering info. If they are doing A/B tests locally, they will realize very quickly, since these warnings sit there for up to several minutes while the test is running: WARNING:root:Can't enable root in production builds with type user WARNING:root:Need root for performance mode. Results may be NOISY!! They can also exit the test at this point, and save several minutes of their time, if they used a non-rooted device by accident. Regarding the logistics of moving warnings, what set of tests should this happen for? I think this will require unique warning lists etc. to be passed around depending on the type of test that might be affected, and its method of presenting results.
Regarding all caps, it's an arms race trying to use more exclamation marks and more capital letters to get the reader's attention. I think the consensus is that this isn't good UX, and just makes text harder to read. I did all-caps the word NOISY and added one extra exclamation mark, and am hoping that's a good enough compromise.
I read your idea on the bug and that sounds reasonable. I've done it so perf_control just has warnings, but higher-level code can get rid of the warning if appropriate for the use-case. In this case now there is an explicit failure with a message telling the user to add the '--no-performance-mode' flag.
On 2014/07/09 19:22:54, epennerAtGoogle wrote: > https://codereview.chromium.org/373223002/diff/1/build/android/pylib/perf/per... > File build/android/pylib/perf/perf_control.py (right): > > https://codereview.chromium.org/373223002/diff/1/build/android/pylib/perf/per... > build/android/pylib/perf/perf_control.py:33: logging.warning('Need root for > performance mode. Results may be noisy!') > On 2014/07/09 09:20:30, pasko wrote: > > I agree that running on non-rooted devices is useful. > > > > How can we make the warning more visible? Can you find a way for > > tools/perf/run_benchmark to print the warning right after printing the > results? > > Yes, this would introduce extra complexity, but I think that we agreed that > this > > use-case (i.e. running perf measurements on non-rooted devices) is meaningful > > enough to pay for this extra complexity. > > > > Also, maybe the wording should include 'WARNING WARNING WARNING'? :) > > I started looking into this, but the more I do, the more I'm thinking this is > subjective and not worth the complexity. Firstly, I just don't see this being > that large of an issue. Any developer doing anything with Chrome should have a > rooted Android device (we all got an N7 and an N4 recently). If they don't have > root, it's because they are using an off-the-shelf device for gathering info. If > they are doing A/B tests locally, they will realize very quickly, since these > warnings sit there for up to several minutes while the test is running: > > WARNING:root:Can't enable root in production builds with type user > WARNING:root:Need root for performance mode. Results may be NOISY!! > > They can also exit the test at this point, and save several minutes of their > time, if they used a non-rooted device by accident. > > Regarding the logistics of moving warnings, what set of tests should this happen > for? I think this will require unique warning lists etc. to be passed around > depending on the type of test that might be affected, and its method of > presenting results. > > Regarding all caps, it's an arms race trying to use more exclamation marks and > more capital letters to get the reader's attention. I think the consensus is > that this isn't good UX, and just makes text harder to read. > > I did all-caps the word NOISY and added one extra exclamation mark, and am > hoping that's a good enough compromise.

We are running performance tests on non-GED devices (i.e. user builds and non-rooted). For example, we have a bot running on S4 downstream. And we have seen issues with specific devices (in some cases, things that we can do nothing about due to the hardware of the device). But as we are doing more and more to get more device coverage, we will also be running performance tests on more non-rooted, non-GED devices. So please don't assume that all our Telemetry tests will run on N4 and N7s.
> We are running performance tests on non-GED devices (i.e. user builds and > non-rooted). For example, we have a bot running on S4 downstream. > And we have seen issues with specific devices (in some cases, things that we can > do nothing about due to the hardware of the device). > > But as we are doing more and more to get more device coverage, we will also be > running performance tests on more non-rooted, non-GED devices. > So please don't assume that all our Telemetry tests will run on N4 and N7s. That was actually the goal of this patch, we just need to agree on a way of not failing the test. With the latest patch, you would need to pass a special flag --no-performance-mode on those bots... I'm guessing that's not ideal (requires customizing certain bots)?
On 2014/07/10 01:36:05, epenner wrote: > > We are running performance tests on non-GED devices (i.e. user builds and > > non-rooted). For example, we have a bot running on S4 downstream. > > And we have seen issues with specific devices (in some cases, things that we > can > > do nothing about due to the hardware of the device). > > > > But as we are doing more and more to get more device coverage, we will also be > > running performance tests on more non-rooted, non-GED devices. > > So please don't assume that all our Telemetry tests will run on N4 and N7s. > > That was actually the goal of this patch, we just need to agree on a way of not > failing the test. With the latest patch, you would need to pass a special flag > --no-performance-mode on those bots... I'm guessing that's not ideal (requires > customizing certain bots)? Yes, we are already doing too much customization for specific bots and are/should be trying to avoid it as it makes the buildbot code and bots more difficult to maintain.
On 2014/07/10 03:36:43, klundberg wrote: > On 2014/07/10 01:36:05, epenner wrote: > > > We are running performance tests on non-GED devices (i.e. user builds and > > > non-rooted). For example, we have a bot running on S4 downstream. > > > And we have seen issues with specific devices (in some cases, things that we > > can > > > do nothing about due to the hardware of the device). > > > > > > But as we are doing more and more to get more device coverage, we will also > be > > > running performance tests on more non-rooted, non-GED devices. > > > So please don't assume that all our Telemetry tests will run on N4 and N7s. > > > > That was actually the goal of this patch, we just need to agree on a way of > not > > failing the test. With the latest patch, you would need to pass a special flag > > --no-performance-mode on those bots... I'm guessing that's not ideal (requires > > customizing certain bots)? > > Yes, we are already doing too much customization for specific bots and > are/should be trying to avoid it as it makes the buildbot code and bots more > difficult to maintain. OK, in other words, we are again hit by the problems of bots being hard to manage (i.e. code in other repos, unfamiliar to chrome engs, no docs, a flag is difficult to add). If that's our main concern, I indeed have to prefer a way that in telemetry only spits a warning and dynamically adapts to the devices it observes. :/
On 2014/07/10 09:42:33, pasko wrote: > On 2014/07/10 03:36:43, klundberg wrote: > > On 2014/07/10 01:36:05, epenner wrote: > > > > We are running performance tests on non-GED devices (i.e. user builds and > > > > non-rooted). For example, we have a bot running on S4 downstream. > > > > And we have seen issues with specific devices (in some cases, things that > we > > > can > > > > do nothing about due to the hardware of the device). > > > > > > > > But as we are doing more and more to get more device coverage, we will > also > > be > > > > running performance tests on more non-rooted, non-GED devices. > > > > So please don't assume that all our Telemetry tests will run on N4 and > N7s. > > > > > > That was actually the goal of this patch, we just need to agree on a way of > > not > > > failing the test. With the latest patch, you would need to pass a special > flag > > > --no-performance-mode on those bots... I'm guessing that's not ideal > (requires > > > customizing certain bots)? > > > > Yes, we are already doing too much customization for specific bots and > > are/should be trying to avoid it as it makes the buildbot code and bots more > > difficult to maintain. > > OK, in other words, we are again hit by the problems of bots being hard to manage > (i.e. code in other repos, unfamiliar to chrome engs, no docs, a flag is > difficult to add). > If that's our main concern, I indeed have to prefer a way > that in telemetry only spits a warning and dynamically adapts to the devices it > observes. :/

I'm not concerned about code in other repos or unfamiliarity, as I happen to have made several changes to the bot code and can help with changes for this as well :-) The problem is having too many special cases, which are always hard to maintain, especially at the moment where we have recipes in some places but not in all. The infrastructure team are looking at consolidating how we run perf tests across platforms and upstream/downstream, and as part of this it should hopefully be easier to add flags. But my suggestion would be to adapt to devices for now and maybe have a bug for adding a flag when we have a better idea of the infrastructure changes that will happen hopefully in Q3.
Coming back to this patch. So if we want to adapt, pasko@, are you okay with the warning I had? That was the remaining concern. I looked briefly at the code and didn't see an easy way to propagate a warning around the code in all instances where this might occur, but I'm having another look now. If you have an idea in mind for how this could be implemented cleanly, please let me know.
On 2014/07/29 20:03:44, epennerAtGoogle wrote: > Coming back to this patch. So if we want to adapt, pasko@ are you okay with the > warning I had? That was the remaining concern. > > I looked briefly at the code and didn't see an easy way to propagate a warning > around the code in all instances where this might occur, but I'm having another > look now. If you have an idea in mind for how this could be implemented cleanly, > please let me know. +Dtu for P1 review. Given that we have always run tests on non-rooted devices in the past, I'm not sure why we started failing intentionally only after we improved SetHighPerfMode. So I'm hoping we can fix the failure first and then follow up with any UI improvements? However, I did find 'atexit' which allows us to generically print an extra error without passing it around in the code. I think having a more visible error was the last remaining objection. Ptal.
On 2014/07/30 01:33:57, epenner wrote: > On 2014/07/29 20:03:44, epennerAtGoogle wrote: > > Coming back to this patch. So if we want to adapt, pasko@ are you okay with > the > > warning I had? That was the remaining concern. > > > > I looked briefly at the code and didn't see an easy way to propagate a warning > > around the code in all instances where this might occur, but I'm having > another > > look now. If you have an idea in mind for how this could be implemented > cleanly, > > please let me know. > > +Dtu for P1 review. > > Given that we have always run tests on non-rooted devices in the past, I'm not > sure why we started failing intentionally only after we improved > SetHighPerfMode. So I'm hoping we can fix the failure first and then follow up > with any UI improvements? > > However, I did find 'atexit' which allows us to generically print an extra error > without passing it around in the code. I think having a more visible error was > the last remaining objection. > > Ptal. lgtm telemetry_unittests is not running on the waterfall, so we wouldn't have caught a regression here. I was looking at enabling them when I ran into this :)
Landing. Pasko@, let's talk on the bug if there are any follow-ups required.
The CQ bit was checked by epenner@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/epenner@chromium.org/373223002/120001
Message was sent while issue was closed.
Change committed as 286441
Message was sent while issue was closed.
On 2014/07/30 01:42:23, epenner wrote: > Landing. Pasko@, let's talk on the bug if there are any follow-ups required. LGTM, atexit is a great idea.