|
|
Created:
5 years, 4 months ago by Georges Khalil Modified:
5 years, 4 months ago CC:
chromium-reviews, tracing+reviews_chromium.org, wfh+watch_chromium.org, jam, asvitkine+watch_chromium.org, darin-cc_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
Description[ETW Export] Add polling for ETW keyword.
This adds a background low-priority thread for each process that monitors the keyword and adjusts the categories accordingly. This allows users to start profiling (using UIForETW or Xpef directly) and specify which categories to enable, without having to restart Chrome.
Notes:
- TraceEventETWExport is a friend of TraceLog (see comment in code).
- ETW exporting still needs to be enabled using the command line or in about:flags.
BUG=491909
Committed: https://crrev.com/3bd527974c0cea7b1e0e9965fa9e13823861ebb6
Cr-Commit-Position: refs/heads/master@{#343393}
Patch Set 1 : #
Total comments: 6
Patch Set 2 : brucedawson@ comments. #
Total comments: 15
Patch Set 3 : dsinclair@ comments. #
Total comments: 1
Patch Set 4 : Fix nit. #
Messages
Total messages: 22 (8 generated)
Patchset #1 (id:1) has been deleted
georgesak@chromium.org changed reviewers: + brucedawson@chromium.org
Patchset #1 (id:20001) has been deleted
Bruce, PTAL. This is another approach to the same issue as in https://codereview.chromium.org/1213523003/
Looks good. Since the thread only gets started when requested so there is no cost to most users, and a trivial cost to those who opt-in. Having Chrome detect the change in ETW keywords as initiated by UIforETW is quite magical. I tested that UIforETW can now trigger emitting of ETW events from all processes, after Chrome startup, as long as the flag is specified. https://codereview.chromium.org/1279353002/diff/40001/base/trace_event/trace_... File base/trace_event/trace_event_etw_export_win.cc (right): https://codereview.chromium.org/1279353002/diff/40001/base/trace_event/trace_... base/trace_event/trace_event_etw_export_win.cc:187: if (GetInstance() && !GetInstance()->etw_export_enabled_) { The repeated calls to GetInstance() seem excessive to me. It feels like it would be better to call it once and stash it in a pointer: auto* instance = GetInstance(); if (instance && instant->etw_export_enabled_) This might make the code slightly smaller and it feels like the right thing to do. https://codereview.chromium.org/1279353002/diff/40001/base/trace_event/trace_... base/trace_event/trace_event_etw_export_win.cc:418: TraceLog::GetInstance()->UpdateCategoryGroupEnabledFlags(); I think we need a lock around the data structures that are shared between the background thread and the rest of Chrome? Also some comments saying which functions are called from which thread(s). https://codereview.chromium.org/1279353002/diff/40001/base/trace_event/trace_... File base/trace_event/trace_event_etw_export_win.h (right): https://codereview.chromium.org/1279353002/diff/40001/base/trace_event/trace_... base/trace_event/trace_event_etw_export_win.h:87: static void UpdateKeyword(); Perhaps name it UpdateETWKeyword() to avoid name collisions - just to make code search work better.
georgesak@chromium.org changed reviewers: + dsinclair@chromium.org
Thanks Bruce. Dan, PTAL. https://codereview.chromium.org/1279353002/diff/40001/base/trace_event/trace_... File base/trace_event/trace_event_etw_export_win.cc (right): https://codereview.chromium.org/1279353002/diff/40001/base/trace_event/trace_... base/trace_event/trace_event_etw_export_win.cc:187: if (GetInstance() && !GetInstance()->etw_export_enabled_) { On 2015/08/10 23:47:15, brucedawson wrote: > The repeated calls to GetInstance() seem excessive to me. It feels like it would > be better to call it once and stash it in a pointer: > > auto* instance = GetInstance(); > if (instance && instant->etw_export_enabled_) > > This might make the code slightly smaller and it feels like the right thing to > do. Agreed, done. I did it everywhere for consistency. https://codereview.chromium.org/1279353002/diff/40001/base/trace_event/trace_... base/trace_event/trace_event_etw_export_win.cc:418: TraceLog::GetInstance()->UpdateCategoryGroupEnabledFlags(); On 2015/08/10 23:47:15, brucedawson wrote: > I think we need a lock around the data structures that are shared between the > background thread and the rest of Chrome? Also some comments saying which > functions are called from which thread(s). Technically, yes, there's a very small chance of race here. But locking UpdateCategoryGroupEnabledFlags would do more harm than good. Worst case is that the flags don't get enabled correctly. https://codereview.chromium.org/1279353002/diff/40001/base/trace_event/trace_... File base/trace_event/trace_event_etw_export_win.h (right): https://codereview.chromium.org/1279353002/diff/40001/base/trace_event/trace_... base/trace_event/trace_event_etw_export_win.h:87: static void UpdateKeyword(); On 2015/08/10 23:47:15, brucedawson wrote: > Perhaps name it UpdateETWKeyword() to avoid name collisions - just to make code > search work better. Done.
lgtm
https://codereview.chromium.org/1279353002/diff/60001/base/trace_event/trace_... File base/trace_event/trace_event_etw_export_win.cc (right): https://codereview.chromium.org/1279353002/diff/60001/base/trace_event/trace_... base/trace_event/trace_event_etw_export_win.cc:96: // thread that will monitor the ETW keyword for any changes.. nit: double .'s at end. https://codereview.chromium.org/1279353002/diff/60001/base/trace_event/trace_... base/trace_event/trace_event_etw_export_win.cc:107: base::TimeDelta::FromMilliseconds(kUpdateTimerDelayMs)); Why do the milliseconds conversion every iteration? Why not store it outside the while loop? https://codereview.chromium.org/1279353002/diff/60001/base/trace_event/trace_... base/trace_event/trace_event_etw_export_win.cc:195: 0, instance ->keyword_update_thread_.get(), nit: Remove space between instance and -> https://codereview.chromium.org/1279353002/diff/60001/base/trace_event/trace_... base/trace_event/trace_event_etw_export_win.cc:207: } Why don't we shut down the thread at the point we stop tracing? Seems wasteful to leave it running and not doing anything? https://codereview.chromium.org/1279353002/diff/60001/base/trace_event/trace_... base/trace_event/trace_event_etw_export_win.cc:422: if (IsETWExportEnabled()) { if (!IsETWExportEnabled()) return; if (!instance->UpdateEnabledCategories()) return; https://codereview.chromium.org/1279353002/diff/60001/base/trace_event/trace_... File base/trace_event/trace_log.h (right): https://codereview.chromium.org/1279353002/diff/60001/base/trace_event/trace_... base/trace_event/trace_log.h:331: friend class TraceEventETWExport; This is kinda gross as it gives access to everything. Should the mentioned method be public? Or is there a better api to call to do the updating?
https://codereview.chromium.org/1279353002/diff/60001/base/trace_event/trace_... File base/trace_event/trace_event_etw_export_win.cc (right): https://codereview.chromium.org/1279353002/diff/60001/base/trace_event/trace_... base/trace_event/trace_event_etw_export_win.cc:207: } Currently this function is never called - there is no mechanism to disable tracing. What does happen is that the ETW keyword goes to zero (because nobody is listening) so exporting of ETW events stops. At that point the only overhead is the thread continuing to poll once a second, watching for the ETW keyword to change to non-zero again. There are really three states: disabled (etw_export_enable == false), enabled but inactive (etw_export_enable == true, etw keyword == 0), and active (etw_export_enable == true, etw keyword != 0).
Patchset #3 (id:80001) has been deleted
Patchset #3 (id:100001) has been deleted
Dan, PTAnL. Biggest change is the new API in TraceLog that gets called by TraceEventETWExport to update the categories. https://codereview.chromium.org/1279353002/diff/60001/base/trace_event/trace_... File base/trace_event/trace_event_etw_export_win.cc (right): https://codereview.chromium.org/1279353002/diff/60001/base/trace_event/trace_... base/trace_event/trace_event_etw_export_win.cc:96: // thread that will monitor the ETW keyword for any changes.. On 2015/08/11 19:33:14, dsinclair wrote: > nit: double .'s at end. Done. https://codereview.chromium.org/1279353002/diff/60001/base/trace_event/trace_... base/trace_event/trace_event_etw_export_win.cc:107: base::TimeDelta::FromMilliseconds(kUpdateTimerDelayMs)); On 2015/08/11 19:33:14, dsinclair wrote: > Why do the milliseconds conversion every iteration? Why not store it outside the > while loop? Done. https://codereview.chromium.org/1279353002/diff/60001/base/trace_event/trace_... base/trace_event/trace_event_etw_export_win.cc:171: UpdateEnabledCategories(); I removed this from here and moved it to EnableETWExport. It makes more sense that way. https://codereview.chromium.org/1279353002/diff/60001/base/trace_event/trace_... base/trace_event/trace_event_etw_export_win.cc:195: 0, instance ->keyword_update_thread_.get(), On 2015/08/11 19:33:14, dsinclair wrote: > nit: Remove space between instance and -> Done. https://codereview.chromium.org/1279353002/diff/60001/base/trace_event/trace_... base/trace_event/trace_event_etw_export_win.cc:207: } On 2015/08/11 20:24:02, brucedawson wrote: > Currently this function is never called - there is no mechanism to disable > tracing. What does happen is that the ETW keyword goes to zero (because nobody > is listening) so exporting of ETW events stops. At that point the only overhead > is the thread continuing to poll once a second, watching for the ETW keyword to > change to non-zero again. > > There are really three states: disabled (etw_export_enable == false), enabled > but inactive (etw_export_enable == true, etw keyword == 0), and active > (etw_export_enable == true, etw keyword != 0). Exactly. This might change down the road when we can enable it dynamically. Should I just remove this function for now and we add it back when needed? https://codereview.chromium.org/1279353002/diff/60001/base/trace_event/trace_... base/trace_event/trace_event_etw_export_win.cc:422: if (IsETWExportEnabled()) { On 2015/08/11 19:33:14, dsinclair wrote: > if (!IsETWExportEnabled()) > return; > if (!instance->UpdateEnabledCategories()) > return; I changed this. Now UpdateEnabledCategories calls directly the TraceLog API. https://codereview.chromium.org/1279353002/diff/60001/base/trace_event/trace_... File base/trace_event/trace_log.h (right): https://codereview.chromium.org/1279353002/diff/60001/base/trace_event/trace_... base/trace_event/trace_log.h:331: friend class TraceEventETWExport; On 2015/08/11 19:33:14, dsinclair wrote: > This is kinda gross as it gives access to everything. Should the mentioned > method be public? Or is there a better api to call to do the updating? I added an API in tracelog that allows us to set/clear just the ETW bit in the categories. I also now acquire the internal lock to avoid any potential race condition.
I'm still hesitant about adding a thread that we never shutdown that will do nothing most of the time. I think we need to have this shutdown the thread when we stop tracing. https://codereview.chromium.org/1279353002/diff/60001/base/trace_event/trace_... File base/trace_event/trace_event_etw_export_win.cc (right): https://codereview.chromium.org/1279353002/diff/60001/base/trace_event/trace_... base/trace_event/trace_event_etw_export_win.cc:207: } On 2015/08/12 17:24:25, Georges Khalil wrote: > On 2015/08/11 20:24:02, brucedawson wrote: > > Currently this function is never called - there is no mechanism to disable > > tracing. What does happen is that the ETW keyword goes to zero (because nobody > > is listening) so exporting of ETW events stops. At that point the only > overhead > > is the thread continuing to poll once a second, watching for the ETW keyword > to > > change to non-zero again. > > > > There are really three states: disabled (etw_export_enable == false), enabled > > but inactive (etw_export_enable == true, etw keyword == 0), and active > > (etw_export_enable == true, etw keyword != 0). > > Exactly. This might change down the road when we can enable it dynamically. > > Should I just remove this function for now and we add it back when needed? If it's unused, we should probably use it as it's confusing to see it and not have it used. https://codereview.chromium.org/1279353002/diff/120001/base/trace_event/trace... File base/trace_event/trace_event_etw_export_win.cc (right): https://codereview.chromium.org/1279353002/diff/120001/base/trace_event/trace... base/trace_event/trace_event_etw_export_win.cc:190: // that checks the keyword. Then create a thead that will call that same s/thead/thread/
On 2015/08/13 19:03:17, dsinclair wrote: > I'm still hesitant about adding a thread that we never shutdown that will do > nothing most of the time. That's why this thread only runs when requested through the flag. The thread will only run on dozens or hundreds of machines worldwide, for users who are investigating problems. > I think we need to have this shutdown the thread when we stop tracing. We can't do that because it doesn't make sense. The purpose of this thread is to tell us when to stop and start tracing. If we stop the thread then we can't trace again until Chrome restarts. I agree that polling is inelegant, but practically speaking it has zero overhead (a flag check once per second on at most a few hundred machines), and the alternatives involve significant engineering complexity. I'm all in favor of getting rid of the polling at some point, but until then we need a way to let motivated users record high-quality ETW traces without going through crazy hoops. This change allows that. Ping me if this doesn't make sense.
lgtm w/ previous nit. Sounds fair enough to me.
On 2015/08/14 13:43:45, dsinclair wrote: > lgtm w/ previous nit. > > Sounds fair enough to me. Thanks, nit fixed. Sending this to CQ.
The CQ bit was checked by georgesak@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from brucedawson@chromium.org, dsinclair@chromium.org Link to the patchset: https://codereview.chromium.org/1279353002/#ps140001 (title: "Fix nit.")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1279353002/140001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1279353002/140001
Message was sent while issue was closed.
Committed patchset #4 (id:140001)
Message was sent while issue was closed.
Patchset 4 (id:??) landed as https://crrev.com/3bd527974c0cea7b1e0e9965fa9e13823861ebb6 Cr-Commit-Position: refs/heads/master@{#343393} |