|
|
Chromium Code Reviews|
Created:
4 years, 1 month ago by rnephew (Reviews Here) Modified:
4 years, 1 month ago CC:
catapult-reviews_chromium.org, telemetry-reviews_chromium.org Target Ref:
refs/heads/master Project:
catapult Visibility:
Public. |
Description[Telemetry] Decrease end of test run logging.
Some logging output was set to critical/info when debug may be more
appropriate.
BUG=chromium:664505
Committed: https://chromium.googlesource.com/external/github.com/catapult-project/catapult/+/023a2b33afdcc27dcb0e7671337eb2eb5b07fdd2
Patch Set 1 #
Total comments: 8
Patch Set 2 : [Telemetry] Decrease of of test run logging. #
Messages
Total messages: 16 (4 generated)
rnephew@chromium.org changed reviewers: + nednguyen@google.com, perezju@chromium.org
Description was changed from ========== [Telemetry] Decrease of of test run logging. Some logging output was set to critical/info when debug may be more appropriate. BUG=chromium:664505 ========== to ========== [Telemetry] Decrease end of test run logging. Some logging output was set to critical/info when debug may be more appropriate. BUG=chromium:664505 ==========
Thanks for looking into this and help us reduce the tons of log spam that has crept into Telemetry recently! I have some comments on the implementation, though. https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... File telemetry/telemetry/internal/util/atexit_with_log.py (right): https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... telemetry/telemetry/internal/util/atexit_with_log.py:12: logging.debug('Did run %s', repr(function)) I'm not sure about this. The whole point of this module is to "atexit_with_log", and this seems to defeat that purpose. I think the problem is on whoever is calling it. Do they really need to log? (In many cases probably not.) https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... File telemetry/telemetry/internal/util/wpr_server.py (right): https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... telemetry/telemetry/internal/util/wpr_server.py:239: logging.debug('Attempting to stop WPR server that is not running.') imho we should just drop this line completely. As things are implemented, it's expected to call this method even if the WPR is already stopped (and that's why we get tons of log span). I don't think there is a need to log anything here.
https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... File telemetry/telemetry/internal/util/atexit_with_log.py (right): https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... telemetry/telemetry/internal/util/atexit_with_log.py:12: logging.debug('Did run %s', repr(function)) On 2016/11/15 16:40:34, perezju wrote: > I'm not sure about this. > > The whole point of this module is to "atexit_with_log", and this seems to defeat > that purpose. > > I think the problem is on whoever is calling it. Do they really need to log? (In > many cases probably not.) Or, alternatively, I think a lot of the log spam comes from bad use of atexit, for example ChromeTracingAgent._RemoveTraceConfigFile is called loads of times on exit. Do we really need to call this method more than once?
https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... File telemetry/telemetry/internal/util/atexit_with_log.py (right): https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... telemetry/telemetry/internal/util/atexit_with_log.py:12: logging.debug('Did run %s', repr(function)) On 2016/11/15 16:40:34, perezju wrote: > I'm not sure about this. > > The whole point of this module is to "atexit_with_log", and this seems to defeat > that purpose. > > I think the problem is on whoever is calling it. Do they really need to log? (In > many cases probably not.) It still logs, just at the debug level. So it would still be helpful for local runs where we can run it at higher verbosity and if we need to flip it on to debug stuff on the bots we can always switch it back temporarily. https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... File telemetry/telemetry/internal/util/wpr_server.py (right): https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... telemetry/telemetry/internal/util/wpr_server.py:239: logging.debug('Attempting to stop WPR server that is not running.') On 2016/11/15 16:40:34, perezju wrote: > imho we should just drop this line completely. As things are implemented, it's > expected to call this method even if the WPR is already stopped (and that's why > we get tons of log span). > > I don't think there is a need to log anything here. I added this when trying to debug forwarder issues about a year ago, I agree its probably unneeded. Lets let Ned put is opinion in and if we all agree, I'll go ahead and get rid of it completely.
https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... File telemetry/telemetry/internal/util/atexit_with_log.py (right): https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... telemetry/telemetry/internal/util/atexit_with_log.py:12: logging.debug('Did run %s', repr(function)) On 2016/11/15 16:45:16, perezju wrote: > On 2016/11/15 16:40:34, perezju wrote: > > I'm not sure about this. > > > > The whole point of this module is to "atexit_with_log", and this seems to > defeat > > that purpose. > > > > I think the problem is on whoever is calling it. Do they really need to log? > (In > > many cases probably not.) > > Or, alternatively, I think a lot of the log spam comes from bad use of atexit, > for example ChromeTracingAgent._RemoveTraceConfigFile is called loads of times > on exit. Do we really need to call this method more than once? I think thats an artifact of how telemetry is designed. It sets up the TraceConfigFile before each time the browser is loaded, and sets up the atexit call whenever it is set up. The system health pages launch a fresh chrome for each test run IIRC.
On 2016/11/15 16:50:31, rnephew (Reviews Here) wrote: > https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... > File telemetry/telemetry/internal/util/atexit_with_log.py (right): > > https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... > telemetry/telemetry/internal/util/atexit_with_log.py:12: logging.debug('Did run > %s', repr(function)) > On 2016/11/15 16:45:16, perezju wrote: > > On 2016/11/15 16:40:34, perezju wrote: > > > I'm not sure about this. > > > > > > The whole point of this module is to "atexit_with_log", and this seems to > > defeat > > > that purpose. > > > > > > I think the problem is on whoever is calling it. Do they really need to log? > > (In > > > many cases probably not.) > > > > Or, alternatively, I think a lot of the log spam comes from bad use of atexit, > > for example ChromeTracingAgent._RemoveTraceConfigFile is called loads of times > > on exit. Do we really need to call this method more than once? > > I think thats an artifact of how telemetry is designed. It sets up the > TraceConfigFile before each time the browser is loaded, and sets up the atexit > call whenever it is set up. The system health pages launch a fresh chrome for > each test run IIRC. Exactly. I think we should remove the trace config on browser.Close(); and maybe add a single extra try to remove the trace config on exit just in case we crash or something unexpected happens while running.
On 2016/11/15 16:56:26, perezju wrote: > On 2016/11/15 16:50:31, rnephew (Reviews Here) wrote: > > > https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... > > File telemetry/telemetry/internal/util/atexit_with_log.py (right): > > > > > https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... > > telemetry/telemetry/internal/util/atexit_with_log.py:12: logging.debug('Did > run > > %s', repr(function)) > > On 2016/11/15 16:45:16, perezju wrote: > > > On 2016/11/15 16:40:34, perezju wrote: > > > > I'm not sure about this. > > > > > > > > The whole point of this module is to "atexit_with_log", and this seems to > > > defeat > > > > that purpose. > > > > > > > > I think the problem is on whoever is calling it. Do they really need to > log? > > > (In > > > > many cases probably not.) > > > > > > Or, alternatively, I think a lot of the log spam comes from bad use of > atexit, > > > for example ChromeTracingAgent._RemoveTraceConfigFile is called loads of > times > > > on exit. Do we really need to call this method more than once? > > > > I think thats an artifact of how telemetry is designed. It sets up the > > TraceConfigFile before each time the browser is loaded, and sets up the atexit > > call whenever it is set up. The system health pages launch a fresh chrome for > > each test run IIRC. > > Exactly. I think we should remove the trace config on browser.Close(); and maybe > add a single extra try to remove the trace config on exit just in case we crash > or something unexpected happens while running. We can't just move the WPR calls though. I think it might also be better if we could set the debug level for each individual call to at_exit_with_logging and have it default to one if none is set.
lgtm https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... File telemetry/telemetry/internal/util/atexit_with_log.py (right): https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... telemetry/telemetry/internal/util/atexit_with_log.py:12: logging.debug('Did run %s', repr(function)) On 2016/11/15 16:50:31, rnephew (Reviews Here) wrote: > On 2016/11/15 16:45:16, perezju wrote: > > On 2016/11/15 16:40:34, perezju wrote: > > > I'm not sure about this. > > > > > > The whole point of this module is to "atexit_with_log", and this seems to > > defeat > > > that purpose. > > > > > > I think the problem is on whoever is calling it. Do they really need to log? > > (In > > > many cases probably not.) > > > > Or, alternatively, I think a lot of the log spam comes from bad use of atexit, > > for example ChromeTracingAgent._RemoveTraceConfigFile is called loads of times > > on exit. Do we really need to call this method more than once? > > I think thats an artifact of how telemetry is designed. It sets up the > TraceConfigFile before each time the browser is loaded, and sets up the atexit > call whenever it is set up. The system health pages launch a fresh chrome for > each test run IIRC. I created when I need to debug the issue of hang telemetry test. Fine to make this debug level. https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... File telemetry/telemetry/internal/util/wpr_server.py (right): https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... telemetry/telemetry/internal/util/wpr_server.py:239: logging.debug('Attempting to stop WPR server that is not running.') On 2016/11/15 16:47:02, rnephew (Reviews Here) wrote: > On 2016/11/15 16:40:34, perezju wrote: > > imho we should just drop this line completely. As things are implemented, it's > > expected to call this method even if the WPR is already stopped (and that's > why > > we get tons of log span). > > > > I don't think there is a need to log anything here. > > I added this when trying to debug forwarder issues about a year ago, I agree its > probably unneeded. Lets let Ned put is opinion in and if we all agree, I'll go > ahead and get rid of it completely. It's fine to either make this debug or remove this.
On 2016/11/15 17:14:41, nednguyen wrote: > lgtm > > https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... > File telemetry/telemetry/internal/util/atexit_with_log.py (right): > > https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... > telemetry/telemetry/internal/util/atexit_with_log.py:12: logging.debug('Did run > %s', repr(function)) > On 2016/11/15 16:50:31, rnephew (Reviews Here) wrote: > > On 2016/11/15 16:45:16, perezju wrote: > > > On 2016/11/15 16:40:34, perezju wrote: > > > > I'm not sure about this. > > > > > > > > The whole point of this module is to "atexit_with_log", and this seems to > > > defeat > > > > that purpose. > > > > > > > > I think the problem is on whoever is calling it. Do they really need to > log? > > > (In > > > > many cases probably not.) > > > > > > Or, alternatively, I think a lot of the log spam comes from bad use of > atexit, > > > for example ChromeTracingAgent._RemoveTraceConfigFile is called loads of > times > > > on exit. Do we really need to call this method more than once? > > > > I think thats an artifact of how telemetry is designed. It sets up the > > TraceConfigFile before each time the browser is loaded, and sets up the atexit > > call whenever it is set up. The system health pages launch a fresh chrome for > > each test run IIRC. > > I created when I need to debug the issue of hang telemetry test. Fine to make > this debug level. > > https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... > File telemetry/telemetry/internal/util/wpr_server.py (right): > > https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... > telemetry/telemetry/internal/util/wpr_server.py:239: logging.debug('Attempting > to stop WPR server that is not running.') > On 2016/11/15 16:47:02, rnephew (Reviews Here) wrote: > > On 2016/11/15 16:40:34, perezju wrote: > > > imho we should just drop this line completely. As things are implemented, > it's > > > expected to call this method even if the WPR is already stopped (and that's > > why > > > we get tons of log span). > > > > > > I don't think there is a need to log anything here. > > > > I added this when trying to debug forwarder issues about a year ago, I agree > its > > probably unneeded. Lets let Ned put is opinion in and if we all agree, I'll go > > ahead and get rid of it completely. > > It's fine to either make this debug or remove this. I got rid of the wpr logging message all together. I also changed a tsproxy message to logging that was very similar to the wpr log message. What are your thoughts on that Ned?
On 2016/11/15 17:18:42, rnephew (Reviews Here) wrote: > On 2016/11/15 17:14:41, nednguyen wrote: > > lgtm > > > > > https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... > > File telemetry/telemetry/internal/util/atexit_with_log.py (right): > > > > > https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... > > telemetry/telemetry/internal/util/atexit_with_log.py:12: logging.debug('Did > run > > %s', repr(function)) > > On 2016/11/15 16:50:31, rnephew (Reviews Here) wrote: > > > On 2016/11/15 16:45:16, perezju wrote: > > > > On 2016/11/15 16:40:34, perezju wrote: > > > > > I'm not sure about this. > > > > > > > > > > The whole point of this module is to "atexit_with_log", and this seems > to > > > > defeat > > > > > that purpose. > > > > > > > > > > I think the problem is on whoever is calling it. Do they really need to > > log? > > > > (In > > > > > many cases probably not.) > > > > > > > > Or, alternatively, I think a lot of the log spam comes from bad use of > > atexit, > > > > for example ChromeTracingAgent._RemoveTraceConfigFile is called loads of > > times > > > > on exit. Do we really need to call this method more than once? > > > > > > I think thats an artifact of how telemetry is designed. It sets up the > > > TraceConfigFile before each time the browser is loaded, and sets up the > atexit > > > call whenever it is set up. The system health pages launch a fresh chrome > for > > > each test run IIRC. > > > > I created when I need to debug the issue of hang telemetry test. Fine to make > > this debug level. > > > > > https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... > > File telemetry/telemetry/internal/util/wpr_server.py (right): > > > > > https://codereview.chromium.org/2504653002/diff/1/telemetry/telemetry/interna... > > telemetry/telemetry/internal/util/wpr_server.py:239: logging.debug('Attempting > > to stop WPR server that is not running.') > > On 2016/11/15 16:47:02, rnephew (Reviews Here) wrote: > > > On 2016/11/15 16:40:34, perezju wrote: > > > > imho we should just drop this line completely. As things are implemented, > > it's > > > > expected to call this method even if the WPR is already stopped (and > that's > > > why > > > > we get tons of log span). > > > > > > > > I don't think there is a need to log anything here. > > > > > > I added this when trying to debug forwarder issues about a year ago, I agree > > its > > > probably unneeded. Lets let Ned put is opinion in and if we all agree, I'll > go > > > ahead and get rid of it completely. > > > > It's fine to either make this debug or remove this. > > I got rid of the wpr logging message all together. I also changed a tsproxy > message to logging that was very similar to the wpr log message. What are your > thoughts on that Ned? lgtm In general, I think which logging to enable is an art more than science. The game is to try the hit the sweet spot of providing enough data so when things go wrong, you have the right info to look at as soon as possible, but not too much that it's makes your daily debugging terrible. In more advanced infrastructure, ideally the problem of too much log could be solved with better UI. Oh well..
The CQ bit was checked by rnephew@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Description was changed from ========== [Telemetry] Decrease end of test run logging. Some logging output was set to critical/info when debug may be more appropriate. BUG=chromium:664505 ========== to ========== [Telemetry] Decrease end of test run logging. Some logging output was set to critical/info when debug may be more appropriate. BUG=chromium:664505 Committed: https://chromium.googlesource.com/external/github.com/catapult-project/catapu... ==========
Message was sent while issue was closed.
Committed patchset #2 (id:20001) as https://chromium.googlesource.com/external/github.com/catapult-project/catapu... |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
