|
|
Created:
4 years, 8 months ago by manzagop (departed) Modified:
4 years, 8 months ago Reviewers:
jochen (gone - plz use gerrit), Sigurður Ásgeirsson, Patrick Monette, scottmg, Alexei Svitkine (slow), rkaplow, grt (UTC plus 2) CC:
Alexei Svitkine (slow), chromium-reviews Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionCapture a report on failed browser rendez-vous.
Overview:
Browser process hangs are believed to be responsible for a large fraction of browser unclean shutdowns. This patch is to get visibility into the problem on windows using the Kasko reporter.
Note:
This functionality is hidden behind buildflags, and only triggers for users with stats collection consent on canary/unspecified channels. At this time, it is only considered for one-off releases.
Details:
- Use of the Kasko reporter is required for its ability to capture reports from other processes.
- Pulls out Kasko reporting code to kasko_utils
- Invoke Kasko on failed rendez-vous.
- Hidden behind build flags (kasko and kasko-rdv-capture).
BUG=478209
Committed: https://crrev.com/f071e88974318cec29c02bd5b92fe61868da8569
Cr-Commit-Position: refs/heads/master@{#386983}
Patch Set 1 : #
Total comments: 2
Patch Set 2 : Check buildflag/channel, set crashkey, address race #
Total comments: 6
Patch Set 3 : Executable name check #
Total comments: 54
Patch Set 4 : Address comments, check stats collection consent #
Total comments: 10
Patch Set 5 : Address grt's second round #
Total comments: 4
Patch Set 6 : Address grt's 3rd round #
Total comments: 8
Patch Set 7 : Check stats consent inside SendFailedRdvReport #Patch Set 8 : Revise stats consent validation #
Total comments: 7
Patch Set 9 : Fix consent validation at rdv #
Total comments: 4
Patch Set 10 : Merge #Patch Set 11 : Remove extra header, format gn #Patch Set 12 : Fix gn variables. #
Total comments: 4
Patch Set 13 : Build file changes #Patch Set 14 : Ensure kasko variable defined in gyp #
Total comments: 2
Patch Set 15 : GYP fixup #
Messages
Total messages: 66 (19 generated)
Patchset #1 (id:1) has been deleted
Patchset #1 (id:20001) has been deleted
Patchset #1 (id:40001) has been deleted
Description was changed from ========== WIP - kasko_utils BUG= ========== to ========== BUG= ==========
manzagop@chromium.org changed reviewers: + pmonette@chromium.org, siggi@chromium.org
Sending out for early feedback. Still a few things to tweak.
nice! https://codereview.chromium.org/1844023002/diff/60001/chrome/browser/process_... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/60001/chrome/browser/process_... chrome/browser/process_singleton_win.cc:265: #if BUILDFLAG(ENABLE_KASKO) As we discussed, there is a race on the PID here, as it may have been reused for "excel.exe" or some other process. There are many ways to close the race, for example by redoing the window lookup and the ::GetWindowThreadProcessId here. As-is, the race already exists, but at worst it'll kill a random process on the user's system :/
Please have another look! https://codereview.chromium.org/1844023002/diff/60001/chrome/browser/process_... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/60001/chrome/browser/process_... chrome/browser/process_singleton_win.cc:265: #if BUILDFLAG(ENABLE_KASKO) On 2016/03/31 19:28:49, Sigurður Ásgeirsson wrote: > As we discussed, there is a race on the PID here, as it may have been reused for > "excel.exe" or some other process. > There are many ways to close the race, for example by redoing the window lookup > and the ::GetWindowThreadProcessId here. > > As-is, the race already exists, but at worst it'll kill a random process on the > user's system :/ Done.
nice! https://codereview.chromium.org/1844023002/diff/80001/chrome/browser/process_... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/80001/chrome/browser/process_... chrome/browser/process_singleton_win.cc:270: thread_id = ::GetWindowThreadProcessId(remote_window_, &process_id); this is still racy, as you have the same problem with the window handle. Essentially the question you need to answer is whether the window handle you have is really still the singleton for the profile dir, and then whether it's owned by the process you opened. https://codereview.chromium.org/1844023002/diff/80001/chrome/browser/process_... chrome/browser/process_singleton_win.cc:270: thread_id = ::GetWindowThreadProcessId(remote_window_, &process_id); It's also worth noting here that we nominate the owning thread of this window as the root of the search for the culprit thread? https://codereview.chromium.org/1844023002/diff/80001/chrome/browser/process_... chrome/browser/process_singleton_win.cc:298: bool launched_kasko = it's worth adding some running commentary here, will e.g. ShutdownKaskoReporter block for upload, or do you rely on another process or another instance of kasko to do taht?
Please have another look! https://codereview.chromium.org/1844023002/diff/80001/chrome/browser/process_... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/80001/chrome/browser/process_... chrome/browser/process_singleton_win.cc:270: thread_id = ::GetWindowThreadProcessId(remote_window_, &process_id); On 2016/04/01 17:27:17, Sigurður Ásgeirsson wrote: > It's also worth noting here that we nominate the owning thread of this window as > the root of the search for the culprit thread? Done. https://codereview.chromium.org/1844023002/diff/80001/chrome/browser/process_... chrome/browser/process_singleton_win.cc:270: thread_id = ::GetWindowThreadProcessId(remote_window_, &process_id); On 2016/04/01 17:27:17, Sigurður Ásgeirsson wrote: > this is still racy, as you have the same problem with the window handle. > > Essentially the question you need to answer is whether the window handle you > have is really still the singleton for the profile dir, and then whether it's > owned by the process you opened. As discussed, it's hard to avoid the issues with HWND and/or PID recycling. Instead, we compare the executable names, which is a sane defense against false positives. https://codereview.chromium.org/1844023002/diff/80001/chrome/browser/process_... chrome/browser/process_singleton_win.cc:298: bool launched_kasko = On 2016/04/01 17:27:17, Sigurður Ásgeirsson wrote: > it's worth adding some running commentary here, will e.g. ShutdownKaskoReporter > block for upload, or do you rely on another process or another instance of kasko > to do taht? Done.
https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:206: if (GetModuleFileName(nullptr, exe_name_self, MAX_PATH) == 0) You can use PathService::Get(FILE_EXE) instead; https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:217: if (base::FilePath(exe_name_self) != base::FilePath(exe_name_other)) Can you move the exe name comparison to a function? I'll need to reuse it in the case the Wait Chain Traversal API gives me another process.
siggi@chromium.org changed reviewers: + grt@chromium.org
nice, lgtm with some nits. I'd ask grt for advice on main executable path comparison - I just know this has been problematic in the past, no details. https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:217: if (base::FilePath(exe_name_self) != base::FilePath(exe_name_other)) @grt may be familiar with cases where this comparison could fail. I wonder whether it might be better/sufficient to compare the directory name. Also, I'm not sure whether this does right by casing differences and such? https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... File chrome/chrome_watcher/kasko_util.cc (right): https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:203: // TODO(erikwright): Make the dump-type channel-dependent. erikwright won't likely be addressing this :) https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... File chrome/chrome_watcher/kasko_util.h (right): https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.h:26: const base::char16* key, const base::Process& process); the "key" parameter needs a little bit of 'splaining?
Description was changed from ========== BUG= ========== to ========== Overview: Browser process hangs are believed to be responsible for a large fraction of browser unclean shutdowns. This patch is to get visibility into the problem. Note: This functionality is hidden behind buildflags, and only triggers on canary/unspecified channels. At this time, it is only considered for one-off releases. Details: - Use of the Kasko reporter is required for its ability to capture reports from other processes. - Pulls out Kasko reporting code to kasko_utils - Invoke Kasko on failed rendez-vous. - Hidden behind build flags (kasko and kasko-rdv-capture). BUG=600725 ==========
Description was changed from ========== Overview: Browser process hangs are believed to be responsible for a large fraction of browser unclean shutdowns. This patch is to get visibility into the problem. Note: This functionality is hidden behind buildflags, and only triggers on canary/unspecified channels. At this time, it is only considered for one-off releases. Details: - Use of the Kasko reporter is required for its ability to capture reports from other processes. - Pulls out Kasko reporting code to kasko_utils - Invoke Kasko on failed rendez-vous. - Hidden behind build flags (kasko and kasko-rdv-capture). BUG=600725 ========== to ========== Overview: Browser process hangs are believed to be responsible for a large fraction of browser unclean shutdowns. This patch is to get visibility into the problem. Note: This functionality is hidden behind buildflags, and only triggers on canary/unspecified channels. At this time, it is only considered for one-off releases. Details: - Use of the Kasko reporter is required for its ability to capture reports from other processes. - Pulls out Kasko reporting code to kasko_utils - Invoke Kasko on failed rendez-vous. - Hidden behind build flags (kasko and kasko-rdv-capture). BUG=478209 ==========
https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:9: #include <winbase.h> change this to #include <windows.h> and move it above the other two includes. https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:11: #include <cwchar> why? wchar_t is a keyword. is there something else from this header you need? https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:194: // the exception context to live either in the dumper of the dumpee. This of -> or? https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:197: void SendFailedRdvReport(const base::Process& process, const DWORD thread_id) { remove "const" https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:201: // Ensure the target process shares the current process' executable name. This nit: process's https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:212: DWORD error = GetLastError(); remove this https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:213: DPLOG(INFO) << "Failed to get executable name for other process: " << error; The *P* class of log functions automatically add the friendly text for the last error code, so change this to just: DPLOG(ERROR) << "Failed to get executable name for other process"; https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:217: if (base::FilePath(exe_name_self) != base::FilePath(exe_name_other)) On 2016/04/05 13:52:29, Sigurður Ásgeirsson wrote: > @grt may be familiar with cases where this comparison could fail. I wonder > whether it might be better/sufficient to compare the directory name. > Also, I'm not sure whether this does right by casing differences and such? We generally use base::FilePath::CompareEqualIgnoreCase if we think the paths are going to be the same names for the file, or if false negatives aren't a concern. If you think they may be different names for the same underlying file, then I don't think you can do better than getting handles to them and comparing their volume serial number and file index. https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:221: const version_info::Channel channel = chrome::GetChannel(); consider doing this higher up in the function so that you don't incur the other costs only to discard them due to the channel. https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:240: // completion of ongoing background tasks (eg upload). If the report is nit: "e.g., upload" https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:319: // The window is hung. https://youtu.be/O4gqsuww6lw https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... File chrome/chrome_watcher/kasko_util.cc (right): https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:33: void GetKaskoCrashServerUrl(base::string16* crash_server) { return the string16 rather than taking it as an out param. https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:34: const char kKaskoCrashServerUrl[] = "KASKO_CRASH_SERVER_URL"; static const char https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:38: auto env = base::Environment::Create(); #include <memory> std::unique_ptr<base::Environment> env(base::Environment::Create()); https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:41: base::UTF8ToWide(env_var.c_str(), env_var.size(), crash_server); do you really mean UTF8ToUTF16 here (the destination type is a string16, not a wstring)? https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:51: base::FilePath* base_dir) { return a base::FilePath rather than taking it as an out arg. FilePaths are treated like value types. https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:52: const char kKaskoCrashReportBaseDir[] = "KASKO_CRASH_REPORTS_BASE_DIR"; static https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:53: auto env = base::Environment::Create(); unique_ptr here, too https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:80: HANDLE event_source_handle = ::RegisterEventSource(NULL, L"Chrome"); NULL -> nullptr everywhere https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:80: HANDLE event_source_handle = ::RegisterEventSource(NULL, L"Chrome"); i somewhat prefer using std::unique_ptr with a custom deleter over ScopedClosureRunner. it's more efficient, less generated code, and provides a stronger lifetime guarantee. https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:96: DCHECK(sid); do you also want this check if Get UserSidString fails? does it make sense to get rid of the log above this and change this to DPCHECK(sid)? does this make sense: if (base::win::GetUserSidString(&sid_string) && !sid_string.empty()) ::ConvertStringSidToSid(sid_string.c_str(), &sid); DPCHECK(sid); https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:116: kCrashUploadEventId, sid, what if |sid| is null here?
Description was changed from ========== Overview: Browser process hangs are believed to be responsible for a large fraction of browser unclean shutdowns. This patch is to get visibility into the problem. Note: This functionality is hidden behind buildflags, and only triggers on canary/unspecified channels. At this time, it is only considered for one-off releases. Details: - Use of the Kasko reporter is required for its ability to capture reports from other processes. - Pulls out Kasko reporting code to kasko_utils - Invoke Kasko on failed rendez-vous. - Hidden behind build flags (kasko and kasko-rdv-capture). BUG=478209 ========== to ========== Overview: Browser process hangs are believed to be responsible for a large fraction of browser unclean shutdowns. This patch is to get visibility into the problem. Note: This functionality is hidden behind buildflags, and only triggers for users with stats collection consent on canary/unspecified channels. At this time, it is only considered for one-off releases. Details: - Use of the Kasko reporter is required for its ability to capture reports from other processes. - Pulls out Kasko reporting code to kasko_utils - Invoke Kasko on failed rendez-vous. - Hidden behind build flags (kasko and kasko-rdv-capture). BUG=478209 ==========
Addressed comments, and also added checking for stats collection. Please have another look. (Note I mistakenly merged with master without uploading the diff separately, so the .gyp and .gypi diffs are a bit messed up). https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:9: #include <winbase.h> On 2016/04/05 17:22:08, grt (very slow) wrote: > change this to #include <windows.h> and move it above the other two includes. Done. https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:11: #include <cwchar> On 2016/04/05 17:22:07, grt (very slow) wrote: > why? wchar_t is a keyword. is there something else from this header you need? Done. Dead code (was using wcscmp at one point). https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:194: // the exception context to live either in the dumper of the dumpee. This On 2016/04/05 17:22:07, grt (very slow) wrote: > of -> or? Done. https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:197: void SendFailedRdvReport(const base::Process& process, const DWORD thread_id) { On 2016/04/05 17:22:07, grt (very slow) wrote: > remove "const" Done. https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:201: // Ensure the target process shares the current process' executable name. This On 2016/04/05 17:22:08, grt (very slow) wrote: > nit: process's Done. https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:206: if (GetModuleFileName(nullptr, exe_name_self, MAX_PATH) == 0) On 2016/04/04 23:33:40, Patrick Monette wrote: > You can use PathService::Get(FILE_EXE) instead; Done. https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:212: DWORD error = GetLastError(); On 2016/04/05 17:22:07, grt (very slow) wrote: > remove this Done. https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:213: DPLOG(INFO) << "Failed to get executable name for other process: " << error; On 2016/04/05 17:22:07, grt (very slow) wrote: > The *P* class of log functions automatically add the friendly text for the last > error code, so change this to just: > DPLOG(ERROR) << "Failed to get executable name for other process"; Done. https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:217: if (base::FilePath(exe_name_self) != base::FilePath(exe_name_other)) On 2016/04/05 13:52:29, Sigurður Ásgeirsson wrote: > @grt may be familiar with cases where this comparison could fail. I wonder > whether it might be better/sufficient to compare the directory name. > Also, I'm not sure whether this does right by casing differences and such? Acknowledged. https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:217: if (base::FilePath(exe_name_self) != base::FilePath(exe_name_other)) On 2016/04/05 17:22:08, grt (very slow) wrote: > On 2016/04/05 13:52:29, Sigurður Ásgeirsson wrote: > > @grt may be familiar with cases where this comparison could fail. I wonder > > whether it might be better/sufficient to compare the directory name. > > Also, I'm not sure whether this does right by casing differences and such? > > We generally use base::FilePath::CompareEqualIgnoreCase if we think the paths > are going to be the same names for the file, or if false negatives aren't a > concern. If you think they may be different names for the same underlying file, > then I don't think you can do better than getting handles to them and comparing > their volume serial number and file index. False negatives are not an issue (though I have a TODO to add a metric to know how prevalent they are). Will go with CompareEqualIgnoreCase. Done. https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:217: if (base::FilePath(exe_name_self) != base::FilePath(exe_name_other)) On 2016/04/04 23:33:40, Patrick Monette wrote: > Can you move the exe name comparison to a function? I'll need to reuse it in the > case the Wait Chain Traversal API gives me another process. Done. https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:221: const version_info::Channel channel = chrome::GetChannel(); On 2016/04/05 17:22:08, grt (very slow) wrote: > consider doing this higher up in the function so that you don't incur the other > costs only to discard them due to the channel. Done. https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:240: // completion of ongoing background tasks (eg upload). If the report is On 2016/04/05 17:22:08, grt (very slow) wrote: > nit: "e.g., upload" Done. https://codereview.chromium.org/1844023002/diff/100001/chrome/browser/process... chrome/browser/process_singleton_win.cc:319: // The window is hung. On 2016/04/05 17:22:07, grt (very slow) wrote: > https://youtu.be/O4gqsuww6lw Acknowledged?! ;) https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... File chrome/chrome_watcher/kasko_util.cc (right): https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:33: void GetKaskoCrashServerUrl(base::string16* crash_server) { On 2016/04/05 17:22:08, grt (very slow) wrote: > return the string16 rather than taking it as an out param. Done. https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:34: const char kKaskoCrashServerUrl[] = "KASKO_CRASH_SERVER_URL"; On 2016/04/05 17:22:08, grt (very slow) wrote: > static const char Done. https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:38: auto env = base::Environment::Create(); On 2016/04/05 17:22:08, grt (very slow) wrote: > #include <memory> > > std::unique_ptr<base::Environment> env(base::Environment::Create()); Ouch. Done. https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:41: base::UTF8ToWide(env_var.c_str(), env_var.size(), crash_server); On 2016/04/05 17:22:08, grt (very slow) wrote: > do you really mean UTF8ToUTF16 here (the destination type is a string16, not a > wstring)? Done. https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:51: base::FilePath* base_dir) { On 2016/04/05 17:22:08, grt (very slow) wrote: > return a base::FilePath rather than taking it as an out arg. FilePaths are > treated like value types. Done. https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:52: const char kKaskoCrashReportBaseDir[] = "KASKO_CRASH_REPORTS_BASE_DIR"; On 2016/04/05 17:22:08, grt (very slow) wrote: > static Done. https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:53: auto env = base::Environment::Create(); On 2016/04/05 17:22:08, grt (very slow) wrote: > unique_ptr here, too Done. https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:80: HANDLE event_source_handle = ::RegisterEventSource(NULL, L"Chrome"); On 2016/04/05 17:22:08, grt (very slow) wrote: > NULL -> nullptr everywhere Done in this file. Did not change "untouched lines" in other files, eg process_singleton_win. https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:80: HANDLE event_source_handle = ::RegisterEventSource(NULL, L"Chrome"); On 2016/04/05 17:22:08, grt (very slow) wrote: > i somewhat prefer using std::unique_ptr with a custom deleter over > ScopedClosureRunner. it's more efficient, less generated code, and provides a > stronger lifetime guarantee. Done. Also applied to sid below, but it's not as clean. https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:96: DCHECK(sid); On 2016/04/05 17:22:08, grt (very slow) wrote: > do you also want this check if Get UserSidString fails? does it make sense to > get rid of the log above this and change this to DPCHECK(sid)? does this make > sense: > > if (base::win::GetUserSidString(&sid_string) && !sid_string.empty()) > ::ConvertStringSidToSid(sid_string.c_str(), &sid); > DPCHECK(sid); It seems ok for ReportEvent (below) to get a null sid. In terms of this function's expectations, I'm not sure. In doubt and without knowing much about this, I'd tend to leave as is: apply the dcheck only on non empty string. I've also left the DPLOG / DCHECK separate because I wasn't sure about using a 'P' logging function when GetUserSidString might not generate a last error. Please push back if you feel strongly about this. Note that kasko is going away as soon as CP is able to get reports for other processes https://bugs.chromium.org/p/crashpad/issues/detail?id=98 https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:116: kCrashUploadEventId, sid, On 2016/04/05 17:22:08, grt (very slow) wrote: > what if |sid| is null here? Looks like it's optional. If it's missing the user's name won't be added to the log, but other things will. https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:203: // TODO(erikwright): Make the dump-type channel-dependent. On 2016/04/05 13:52:29, Sigurður Ásgeirsson wrote: > erikwright won't likely be addressing this :) Done. https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... File chrome/chrome_watcher/kasko_util.h (right): https://codereview.chromium.org/1844023002/diff/100001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.h:26: const base::char16* key, const base::Process& process); On 2016/04/05 13:52:29, Sigurður Ásgeirsson wrote: > the "key" parameter needs a little bit of 'splaining? Done.
Description was changed from ========== Overview: Browser process hangs are believed to be responsible for a large fraction of browser unclean shutdowns. This patch is to get visibility into the problem. Note: This functionality is hidden behind buildflags, and only triggers for users with stats collection consent on canary/unspecified channels. At this time, it is only considered for one-off releases. Details: - Use of the Kasko reporter is required for its ability to capture reports from other processes. - Pulls out Kasko reporting code to kasko_utils - Invoke Kasko on failed rendez-vous. - Hidden behind build flags (kasko and kasko-rdv-capture). BUG=478209 ========== to ========== Overview: Browser process hangs are believed to be responsible for a large fraction of browser unclean shutdowns. This patch is to get visibility into the problem on windows using the Kasko reporter. Note: This functionality is hidden behind buildflags, and only triggers for users with stats collection consent on canary/unspecified channels. At this time, it is only considered for one-off releases. Details: - Use of the Kasko reporter is required for its ability to capture reports from other processes. - Pulls out Kasko reporting code to kasko_utils - Invoke Kasko on failed rendez-vous. - Hidden behind build flags (kasko and kasko-rdv-capture). BUG=478209 ==========
https://codereview.chromium.org/1844023002/diff/120001/chrome/browser/process... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/120001/chrome/browser/process... chrome/browser/process_singleton_win.cc:318: GoogleUpdateSettings::GetCollectStatsConsent(); i think this is the wrong thing to check. metrics::MetricsServiceAccessor::IsMetricsReportingEnabled may be the right thing; please check with a metrics OWNER. https://codereview.chromium.org/1844023002/diff/120001/chrome/chrome_watcher/... File chrome/chrome_watcher/kasko_util.cc (right): https://codereview.chromium.org/1844023002/diff/120001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:65: void operator()(HANDLE event_source_handle) { const method https://codereview.chromium.org/1844023002/diff/120001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:75: void operator()(PSID sid) { const method https://codereview.chromium.org/1844023002/diff/120001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:172: &exe_name_other_len) == 0) { do you need to check exe_name_other_len after the call to be sure that exe_name_other was big enough?
https://codereview.chromium.org/1844023002/diff/120001/chrome/browser/process... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/120001/chrome/browser/process... chrome/browser/process_singleton_win.cc:318: GoogleUpdateSettings::GetCollectStatsConsent(); On 2016/04/06 19:50:35, grt (very slow) wrote: > i think this is the wrong thing to check. > metrics::MetricsServiceAccessor::IsMetricsReportingEnabled may be the right > thing; please check with a metrics OWNER. Acknowledged. https://codereview.chromium.org/1844023002/diff/120001/chrome/chrome_watcher/... File chrome/chrome_watcher/kasko_util.cc (right): https://codereview.chromium.org/1844023002/diff/120001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:65: void operator()(HANDLE event_source_handle) { On 2016/04/06 19:50:36, grt (very slow) wrote: > const method Done. https://codereview.chromium.org/1844023002/diff/120001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:75: void operator()(PSID sid) { On 2016/04/06 19:50:35, grt (very slow) wrote: > const method Done. https://codereview.chromium.org/1844023002/diff/120001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:172: &exe_name_other_len) == 0) { On 2016/04/06 19:50:36, grt (very slow) wrote: > do you need to check exe_name_other_len after the call to be sure that > exe_name_other was big enough? I'm not sure. I see that paths can sometimes be longer than MAX_PATH. The doc for QueryFullProcessImageName doesn't say anything about a buffer that is too small. I do know that GetModuleFileNameEx can succeed when the buffer is too small, and return a truncated result. Let's be conservative and return false if the whole buffer is used.
manzagop@chromium.org changed reviewers: + rkaplow@chromium.org
Hi rkaplow! Can you advise on the correct approach to determine stats collection consent? See https://codereview.chromium.org/1844023002/diff/120001/chrome/browser/process... Thanks!
https://codereview.chromium.org/1844023002/diff/140001/chrome/chrome_watcher/... File chrome/chrome_watcher/kasko_util.cc (right): https://codereview.chromium.org/1844023002/diff/140001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:173: if (QueryFullProcessImageName(process.Handle(), 0, exe_name_other, Uber nit: other spots in this file use :: in front of Win32 API calls. Please do the same here for consistency. Thanks. https://codereview.chromium.org/1844023002/diff/140001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:185: return false; The most conservative thing to do is to only do the comparison if exe_name_other_len is greater than zero and less than arraysize(exe_name_other). Please do that, and replace the MAX_PATH on line 171 with arraysize so that it remains correct if someone makes the buffer bigger in the future.
Description was changed from ========== Overview: Browser process hangs are believed to be responsible for a large fraction of browser unclean shutdowns. This patch is to get visibility into the problem on windows using the Kasko reporter. Note: This functionality is hidden behind buildflags, and only triggers for users with stats collection consent on canary/unspecified channels. At this time, it is only considered for one-off releases. Details: - Use of the Kasko reporter is required for its ability to capture reports from other processes. - Pulls out Kasko reporting code to kasko_utils - Invoke Kasko on failed rendez-vous. - Hidden behind build flags (kasko and kasko-rdv-capture). BUG=478209 ========== to ========== Capture a report on failed browser rendez-vous. Overview: Browser process hangs are believed to be responsible for a large fraction of browser unclean shutdowns. This patch is to get visibility into the problem on windows using the Kasko reporter. Note: This functionality is hidden behind buildflags, and only triggers for users with stats collection consent on canary/unspecified channels. At this time, it is only considered for one-off releases. Details: - Use of the Kasko reporter is required for its ability to capture reports from other processes. - Pulls out Kasko reporting code to kasko_utils - Invoke Kasko on failed rendez-vous. - Hidden behind build flags (kasko and kasko-rdv-capture). BUG=478209 ==========
Thanks! Another look? https://codereview.chromium.org/1844023002/diff/140001/chrome/chrome_watcher/... File chrome/chrome_watcher/kasko_util.cc (right): https://codereview.chromium.org/1844023002/diff/140001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:173: if (QueryFullProcessImageName(process.Handle(), 0, exe_name_other, On 2016/04/07 13:41:49, grt (very slow) wrote: > Uber nit: other spots in this file use :: in front of Win32 API calls. Please do > the same here for consistency. Thanks. Done. https://codereview.chromium.org/1844023002/diff/140001/chrome/chrome_watcher/... chrome/chrome_watcher/kasko_util.cc:185: return false; On 2016/04/07 13:41:49, grt (very slow) wrote: > The most conservative thing to do is to only do the comparison if > exe_name_other_len is greater than zero and less than arraysize(exe_name_other). > Please do that, and replace the MAX_PATH on line 171 with arraysize so that it > remains correct if someone makes the buffer bigger in the future. Done.
https://codereview.chromium.org/1844023002/diff/120001/chrome/browser/process... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/120001/chrome/browser/process... chrome/browser/process_singleton_win.cc:318: GoogleUpdateSettings::GetCollectStatsConsent(); On 2016/04/06 20:44:53, manzagop wrote: > On 2016/04/06 19:50:35, grt (very slow) wrote: > > i think this is the wrong thing to check. > > metrics::MetricsServiceAccessor::IsMetricsReportingEnabled may be the right > > thing; please check with a metrics OWNER. > > Acknowledged. I've only seen the use of the method greg pointed to. The docs for the GoogleUpdateSettings do claim to be checking the right bit, but I'm really familiar with it. https://code.google.com/p/chromium/codesearch#chromium/src/chrome/browser/goo... So I would agree to switch it, but adding alexei if he knows about GoogleUpdateSettings::GetCollectStatsConsent() and if that ends up checking the same thing or not.
https://codereview.chromium.org/1844023002/diff/120001/chrome/browser/process... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/120001/chrome/browser/process... chrome/browser/process_singleton_win.cc:318: GoogleUpdateSettings::GetCollectStatsConsent(); On 2016/04/07 17:57:00, rkaplow wrote: > On 2016/04/06 20:44:53, manzagop wrote: > > On 2016/04/06 19:50:35, grt (very slow) wrote: > > > i think this is the wrong thing to check. > > > metrics::MetricsServiceAccessor::IsMetricsReportingEnabled may be the right > > > thing; please check with a metrics OWNER. > > > > Acknowledged. > > I've only seen the use of the method greg pointed to. > > The docs for the GoogleUpdateSettings do claim to be checking the right bit, but > I'm really familiar with it. > https://code.google.com/p/chromium/codesearch#chromium/src/chrome/browser/goo... > > So I would agree to switch it, but adding alexei if he knows about > GoogleUpdateSettings::GetCollectStatsConsent() and if that ends up checking the > same thing or not. For context, this is what is used to decide whether to launch the watcher. https://code.google.com/p/chromium/codesearch#chromium/src/chrome/app/main_dl...
On 2016/04/07 18:09:20, manzagop wrote: > https://codereview.chromium.org/1844023002/diff/120001/chrome/browser/process... > File chrome/browser/process_singleton_win.cc (right): > > https://codereview.chromium.org/1844023002/diff/120001/chrome/browser/process... > chrome/browser/process_singleton_win.cc:318: > GoogleUpdateSettings::GetCollectStatsConsent(); > On 2016/04/07 17:57:00, rkaplow wrote: > > On 2016/04/06 20:44:53, manzagop wrote: > > > On 2016/04/06 19:50:35, grt (very slow) wrote: > > > > i think this is the wrong thing to check. > > > > metrics::MetricsServiceAccessor::IsMetricsReportingEnabled may be the > right > > > > thing; please check with a metrics OWNER. > > > > > > Acknowledged. > > > > I've only seen the use of the method greg pointed to. > > > > The docs for the GoogleUpdateSettings do claim to be checking the right bit, > but > > I'm really familiar with it. > > > https://code.google.com/p/chromium/codesearch#chromium/src/chrome/browser/goo... > > > > So I would agree to switch it, but adding alexei if he knows about > > GoogleUpdateSettings::GetCollectStatsConsent() and if that ends up checking > the > > same thing or not. > > For context, this is what is used to decide whether to launch the watcher. > https://code.google.com/p/chromium/codesearch#chromium/src/chrome/app/main_dl... Then that is wrong. It doesn't take into account Group Policy settings. I think the actual metrics code does, but Alexei should confirm.
On 2016/04/07 20:21:56, grt (very slow) wrote: > On 2016/04/07 18:09:20, manzagop wrote: > > > https://codereview.chromium.org/1844023002/diff/120001/chrome/browser/process... > > File chrome/browser/process_singleton_win.cc (right): > > > > > https://codereview.chromium.org/1844023002/diff/120001/chrome/browser/process... > > chrome/browser/process_singleton_win.cc:318: > > GoogleUpdateSettings::GetCollectStatsConsent(); > > On 2016/04/07 17:57:00, rkaplow wrote: > > > On 2016/04/06 20:44:53, manzagop wrote: > > > > On 2016/04/06 19:50:35, grt (very slow) wrote: > > > > > i think this is the wrong thing to check. > > > > > metrics::MetricsServiceAccessor::IsMetricsReportingEnabled may be the > > right > > > > > thing; please check with a metrics OWNER. > > > > > > > > Acknowledged. > > > > > > I've only seen the use of the method greg pointed to. > > > > > > The docs for the GoogleUpdateSettings do claim to be checking the right bit, > > but > > > I'm really familiar with it. > > > > > > https://code.google.com/p/chromium/codesearch#chromium/src/chrome/browser/goo... > > > > > > So I would agree to switch it, but adding alexei if he knows about > > > GoogleUpdateSettings::GetCollectStatsConsent() and if that ends up checking > > the > > > same thing or not. > > > > For context, this is what is used to decide whether to launch the watcher. > > > https://code.google.com/p/chromium/codesearch#chromium/src/chrome/app/main_dl... > > Then that is wrong. It doesn't take into account Group Policy settings. I think > the actual metrics code does, but Alexei should confirm. By the way, I meant to write I'm *not* really familiar with it (the GoogleUpdateSettings method). Slightly changed the intent of my email :\
from my pov, code lgtm, but please wait for asvitkine to comment on consent.
asvitkine@chromium.org changed reviewers: + asvitkine@chromium.org
https://codereview.chromium.org/1844023002/diff/160001/chrome/browser/process... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/160001/chrome/browser/process... chrome/browser/process_singleton_win.cc:191: #if BUILDFLAG(ENABLE_KASKO_FAILED_RDV_REPORTS) Why do you need a whole build flag for this? The amount of code this is enabling is negligible (on top of Kasko already being on). Why not just do a runtime check instead? https://codereview.chromium.org/1844023002/diff/160001/chrome/browser/process... chrome/browser/process_singleton_win.cc:318: GoogleUpdateSettings::GetCollectStatsConsent(); The correct way to get this is via ChromeMetricsServiceAccessor. Also, can this logic be checked inside SendFailedRdvReport()? That way, it would be impossible to use that function without the user providing consent.
FYI: We have a bug on file about the other API: https://bugs.chromium.org/p/chromium/issues/detail?id=570414
https://codereview.chromium.org/1844023002/diff/160001/chrome/browser/process... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/160001/chrome/browser/process... chrome/browser/process_singleton_win.cc:318: GoogleUpdateSettings::GetCollectStatsConsent(); On 2016/04/07 20:30:19, Alexei Svitkine wrote: > The correct way to get this is via ChromeMetricsServiceAccessor. > > Also, can this logic be checked inside SendFailedRdvReport()? That way, it would > be impossible to use that function without the user providing consent. Note that this is happening in a browser that has not loaded the user profile and will never load the user profile (it's in rendezvous). I don't think this instance will reasonably ever initialize metrics. It's however reporting on a browser instance that should have gone through all of that. Does ChromeMetricsServiceAccessor require that the metrics service has been initialized?
https://codereview.chromium.org/1844023002/diff/160001/chrome/browser/process... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/160001/chrome/browser/process... chrome/browser/process_singleton_win.cc:318: GoogleUpdateSettings::GetCollectStatsConsent(); On 2016/04/07 20:35:30, Sigurður Ásgeirsson wrote: > On 2016/04/07 20:30:19, Alexei Svitkine wrote: > > The correct way to get this is via ChromeMetricsServiceAccessor. > > > > Also, can this logic be checked inside SendFailedRdvReport()? That way, it > would > > be impossible to use that function without the user providing consent. > > Note that this is happening in a browser that has not loaded the user profile > and will never load the user profile (it's in rendezvous). I don't think this > instance will reasonably ever initialize metrics. > It's however reporting on a browser instance that should have gone through all > of that. > Does ChromeMetricsServiceAccessor require that the metrics service has been > initialized? That's a good point, the ChromeMetricsServiceAccessor API requires metrics service to be initialized and so can't be used in this context, as you say.
I've kept the existing check, but moved it inside the function. https://codereview.chromium.org/1844023002/diff/160001/chrome/browser/process... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/160001/chrome/browser/process... chrome/browser/process_singleton_win.cc:191: #if BUILDFLAG(ENABLE_KASKO_FAILED_RDV_REPORTS) On 2016/04/07 20:30:19, Alexei Svitkine wrote: > Why do you need a whole build flag for this? > > The amount of code this is enabling is negligible (on top of Kasko already being > on). Why not just do a runtime check instead? This is the approach used to enable the other (existing) hang report capture: https://code.google.com/p/chromium/codesearch#chromium/src/chrome/chrome_watc... Can you say more about what you mean by runtime check (use finch?) and the benefits (dynamic control)? https://codereview.chromium.org/1844023002/diff/160001/chrome/browser/process... chrome/browser/process_singleton_win.cc:318: GoogleUpdateSettings::GetCollectStatsConsent(); On 2016/04/07 20:37:04, Alexei Svitkine wrote: > On 2016/04/07 20:35:30, Sigurður Ásgeirsson wrote: > > On 2016/04/07 20:30:19, Alexei Svitkine wrote: > > > The correct way to get this is via ChromeMetricsServiceAccessor. > > > > > > Also, can this logic be checked inside SendFailedRdvReport()? That way, it > > would > > > be impossible to use that function without the user providing consent. > > > > Note that this is happening in a browser that has not loaded the user profile > > and will never load the user profile (it's in rendezvous). I don't think this > > instance will reasonably ever initialize metrics. > > It's however reporting on a browser instance that should have gone through all > > of that. > > Does ChromeMetricsServiceAccessor require that the metrics service has been > > initialized? > > That's a good point, the ChromeMetricsServiceAccessor API requires metrics > service to be initialized and so can't be used in this context, as you say. Moved the check inside the function.
https://codereview.chromium.org/1844023002/diff/160001/chrome/browser/process... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/160001/chrome/browser/process... chrome/browser/process_singleton_win.cc:191: #if BUILDFLAG(ENABLE_KASKO_FAILED_RDV_REPORTS) On 2016/04/07 21:21:25, manzagop wrote: > On 2016/04/07 20:30:19, Alexei Svitkine wrote: > > Why do you need a whole build flag for this? > > > > The amount of code this is enabling is negligible (on top of Kasko already > being > > on). Why not just do a runtime check instead? > > This is the approach used to enable the other (existing) hang report capture: > https://code.google.com/p/chromium/codesearch#chromium/src/chrome/chrome_watc... > > Can you say more about what you mean by runtime check (use finch?) and the > benefits (dynamic control)? Yeah, runtime check via something like a base::Feature which can be controlled by a server-side study. I think besides dynamic control, a big benefit is reduced complexity - if you look at the number of places you're changing in your CL for this build-time check, it's a lot of extra boilerplate. Reducing that makes things much cleaner.
https://codereview.chromium.org/1844023002/diff/160001/chrome/browser/process... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/160001/chrome/browser/process... chrome/browser/process_singleton_win.cc:191: #if BUILDFLAG(ENABLE_KASKO_FAILED_RDV_REPORTS) On 2016/04/07 21:26:35, Alexei Svitkine wrote: > On 2016/04/07 21:21:25, manzagop wrote: > > On 2016/04/07 20:30:19, Alexei Svitkine wrote: > > > Why do you need a whole build flag for this? > > > > > > The amount of code this is enabling is negligible (on top of Kasko already > > being > > > on). Why not just do a runtime check instead? > > > > This is the approach used to enable the other (existing) hang report capture: > > > https://code.google.com/p/chromium/codesearch#chromium/src/chrome/chrome_watc... > > > > Can you say more about what you mean by runtime check (use finch?) and the > > benefits (dynamic control)? > > Yeah, runtime check via something like a base::Feature which can be controlled > by a server-side study. > > I think besides dynamic control, a big benefit is reduced complexity - if you > look at the number of places you're changing in your CL for this build-time > check, it's a lot of extra boilerplate. Reducing that makes things much cleaner. Got it. I agree that would be better but unless you're opposed, I'd leave as is for now.
manzagop@chromium.org changed reviewers: + jochen@chromium.org
Hi jochen! I'm looking for owners' review for: build\common.gypi chrome\browser\BUILD.gn chrome\browser\process_singleton_win.cc Could you have a look? Thanks! Pierre
On 2016/04/07 20:37:04, Alexei Svitkine wrote: > https://codereview.chromium.org/1844023002/diff/160001/chrome/browser/process... > File chrome/browser/process_singleton_win.cc (right): > > https://codereview.chromium.org/1844023002/diff/160001/chrome/browser/process... > chrome/browser/process_singleton_win.cc:318: > GoogleUpdateSettings::GetCollectStatsConsent(); > On 2016/04/07 20:35:30, Sigurður Ásgeirsson wrote: > > On 2016/04/07 20:30:19, Alexei Svitkine wrote: > > > The correct way to get this is via ChromeMetricsServiceAccessor. > > > > > > Also, can this logic be checked inside SendFailedRdvReport()? That way, it > > would > > > be impossible to use that function without the user providing consent. > > > > Note that this is happening in a browser that has not loaded the user profile > > and will never load the user profile (it's in rendezvous). I don't think this > > instance will reasonably ever initialize metrics. > > It's however reporting on a browser instance that should have gone through all > > of that. > > Does ChromeMetricsServiceAccessor require that the metrics service has been > > initialized? > > That's a good point, the ChromeMetricsServiceAccessor API requires metrics > service to be initialized and so can't be used in this context, as you say. Then do we not need a better way to check for consent? The pure-registry approach skips GP. Will a report be uploaded by this browser process, or just put on disk for the real browser process's crashpad to upload if reporting really is enabled?
On 2016/04/07 at 23:08:21, grt wrote: > On 2016/04/07 20:37:04, Alexei Svitkine wrote: > > https://codereview.chromium.org/1844023002/diff/160001/chrome/browser/process... > > File chrome/browser/process_singleton_win.cc (right): > > > > https://codereview.chromium.org/1844023002/diff/160001/chrome/browser/process... > > chrome/browser/process_singleton_win.cc:318: > > GoogleUpdateSettings::GetCollectStatsConsent(); > > On 2016/04/07 20:35:30, Sigurður Ásgeirsson wrote: > > > On 2016/04/07 20:30:19, Alexei Svitkine wrote: > > > > The correct way to get this is via ChromeMetricsServiceAccessor. > > > > > > > > Also, can this logic be checked inside SendFailedRdvReport()? That way, it > > > would > > > > be impossible to use that function without the user providing consent. > > > > > > Note that this is happening in a browser that has not loaded the user profile > > > and will never load the user profile (it's in rendezvous). I don't think this > > > instance will reasonably ever initialize metrics. > > > It's however reporting on a browser instance that should have gone through all > > > of that. > > > Does ChromeMetricsServiceAccessor require that the metrics service has been > > > initialized? > > > > That's a good point, the ChromeMetricsServiceAccessor API requires metrics > > service to be initialized and so can't be used in this context, as you say. > > Then do we not need a better way to check for consent? The pure-registry approach skips GP. Will a report be uploaded by this browser process, or just put on disk for the real browser process's crashpad to upload if reporting really is enabled? crashpad uses this check: https://code.google.com/p/chromium/codesearch#chromium/src/chrome/app/chrome_... that should work w/o the metrics service, no?
lgtm for the build/ files etc..
manzagop@chromium.org changed reviewers: + scottmg@chromium.org
Thanks everyone for the reviews so far. Re: stats collection consent, CrashPad looks at policy in determining whether uploads are enabled, then stores the result. https://code.google.com/p/chromium/codesearch#chromium/src/components/crash/c... I've therefore replaced both stat collection consent checks (rdv report capture and watcher launch) to crash_reporter::GetUploadsEnabled. scottmg: can you validate the use of crash_reporter::GetUploadsEnabled is correct (diff #7)? asvitkine,grt: is this ok with you? Thanks!
https://codereview.chromium.org/1844023002/diff/200001/chrome/app/main_dll_lo... File chrome/app/main_dll_loader_win.cc (right): https://codereview.chromium.org/1844023002/diff/200001/chrome/app/main_dll_lo... chrome/app/main_dll_loader_win.cc:227: const bool stats_collection_consent = crash_reporter::GetUploadsEnabled(); Yes, this lgtm. You could collapse out the #if defined(GOOGLE_CHROME_BUILD) to match what Crashpad does.
scottmg and I discussed the fact the second call always returns false (because it's in the dll, not the exe). I'll see about fixing it.
https://codereview.chromium.org/1844023002/diff/200001/chrome/browser/process... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/200001/chrome/browser/process... chrome/browser/process_singleton_win.cc:198: if (!crash_reporter::GetUploadsEnabled()) Sorry, I only looked at the other file. As this is in the dll, the globals won't be initialized here. The upload list thunks back to the exe for this reason https://code.google.com/p/chromium/codesearch#chromium/src/chrome/browser/cra... .
Please have another look! https://codereview.chromium.org/1844023002/diff/200001/chrome/app/main_dll_lo... File chrome/app/main_dll_loader_win.cc (right): https://codereview.chromium.org/1844023002/diff/200001/chrome/app/main_dll_lo... chrome/app/main_dll_loader_win.cc:227: const bool stats_collection_consent = crash_reporter::GetUploadsEnabled(); On 2016/04/08 15:52:25, scottmg wrote: > Yes, this lgtm. You could collapse out the #if defined(GOOGLE_CHROME_BUILD) to > match what Crashpad does. I think I understood what you meant, but please validate. https://codereview.chromium.org/1844023002/diff/200001/chrome/browser/process... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/200001/chrome/browser/process... chrome/browser/process_singleton_win.cc:198: if (!crash_reporter::GetUploadsEnabled()) On 2016/04/08 16:04:34, scottmg wrote: > Sorry, I only looked at the other file. As this is in the dll, the globals won't > be initialized here. The upload list thunks back to the exe for this reason > https://code.google.com/p/chromium/codesearch#chromium/src/chrome/browser/cra... > . I went with instantiating a ChromeCrashReporterClient instead, but it ended up requiring changes to the gyp. Tell me what you think!
https://codereview.chromium.org/1844023002/diff/200001/chrome/app/main_dll_lo... File chrome/app/main_dll_loader_win.cc (right): https://codereview.chromium.org/1844023002/diff/200001/chrome/app/main_dll_lo... chrome/app/main_dll_loader_win.cc:227: const bool stats_collection_consent = crash_reporter::GetUploadsEnabled(); On 2016/04/08 21:15:56, manzagop wrote: > On 2016/04/08 15:52:25, scottmg wrote: > > Yes, this lgtm. You could collapse out the #if defined(GOOGLE_CHROME_BUILD) to > > match what Crashpad does. > > I think I understood what you meant, but please validate. Yeah, that's what I meant, sorry for the terseness. https://codereview.chromium.org/1844023002/diff/200001/chrome/browser/process... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/200001/chrome/browser/process... chrome/browser/process_singleton_win.cc:198: if (!crash_reporter::GetUploadsEnabled()) On 2016/04/08 21:15:56, manzagop wrote: > On 2016/04/08 16:04:34, scottmg wrote: > > Sorry, I only looked at the other file. As this is in the dll, the globals > won't > > be initialized here. The upload list thunks back to the exe for this reason > > > https://code.google.com/p/chromium/codesearch#chromium/src/chrome/browser/cra... > > . > > I went with instantiating a ChromeCrashReporterClient instead, but it ended up > requiring changes to the gyp. Tell me what you think! Adding the dependency and duplicating that code seems a bit unfortunate but I guess since it's only temporary it's OK. I guess the end-game is that we have watcher, rendezvous reporter, and crashpad all in the same process? https://codereview.chromium.org/1844023002/diff/220001/chrome/app/main_dll_lo... File chrome/app/main_dll_loader_win.cc (right): https://codereview.chromium.org/1844023002/diff/220001/chrome/app/main_dll_lo... chrome/app/main_dll_loader_win.cc:226: if (crash_reporter::GetUploadsEnabled()) { Yeah, that's what I meant. Sorry for the terseness.
Thanks! Did anyone want a final look? https://codereview.chromium.org/1844023002/diff/200001/chrome/browser/process... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/200001/chrome/browser/process... chrome/browser/process_singleton_win.cc:198: if (!crash_reporter::GetUploadsEnabled()) On 2016/04/08 22:20:55, scottmg wrote: > On 2016/04/08 21:15:56, manzagop wrote: > > On 2016/04/08 16:04:34, scottmg wrote: > > > Sorry, I only looked at the other file. As this is in the dll, the globals > > won't > > > be initialized here. The upload list thunks back to the exe for this reason > > > > > > https://code.google.com/p/chromium/codesearch#chromium/src/chrome/browser/cra... > > > . > > > > I went with instantiating a ChromeCrashReporterClient instead, but it ended up > > requiring changes to the gyp. Tell me what you think! > > Adding the dependency and duplicating that code seems a bit unfortunate but I > guess since it's only temporary it's OK. > > I guess the end-game is that we have watcher, rendezvous reporter, and crashpad > all in the same process? Agreed there's no need to have this reporter. As for the watcher, I'm not familiar about the considerations there. https://codereview.chromium.org/1844023002/diff/220001/chrome/app/main_dll_lo... File chrome/app/main_dll_loader_win.cc (right): https://codereview.chromium.org/1844023002/diff/220001/chrome/app/main_dll_lo... chrome/app/main_dll_loader_win.cc:226: if (crash_reporter::GetUploadsEnabled()) { On 2016/04/08 22:20:55, scottmg wrote: > Yeah, that's what I meant. Sorry for the terseness. Acknowledged.
lgtm w/ a nit https://codereview.chromium.org/1844023002/diff/220001/chrome/browser/process... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/220001/chrome/browser/process... chrome/browser/process_singleton_win.cc:42: #include "components/crash/content/app/crashpad.h" unused?
Thanks! Going ahead with submission. https://codereview.chromium.org/1844023002/diff/220001/chrome/browser/process... File chrome/browser/process_singleton_win.cc (right): https://codereview.chromium.org/1844023002/diff/220001/chrome/browser/process... chrome/browser/process_singleton_win.cc:42: #include "components/crash/content/app/crashpad.h" On 2016/04/12 14:35:22, grt (very slow) wrote: > unused? Yes! Done.
TIL about GN. GN variables now fixed. pmonette: could you have a look since you wrote the kasko gn files?
lgtm with nits https://codereview.chromium.org/1844023002/diff/280001/chrome/browser/BUILD.gn File chrome/browser/BUILD.gn (right): https://codereview.chromium.org/1844023002/diff/280001/chrome/browser/BUILD.g... chrome/browser/BUILD.gn:849: "//chrome/chrome_watcher:kasko_util", Move this to live 667? It's only used if enable_kasko_failed_rdv_reports is true. https://codereview.chromium.org/1844023002/diff/280001/chrome/chrome_browser.... File chrome/chrome_browser.gypi (right): https://codereview.chromium.org/1844023002/diff/280001/chrome/chrome_browser.... chrome/chrome_browser.gypi:3837: 'kasko_util', Same here.
Patrick, This is the refactor of the build files we agreed upon. The kasko_util target's definition is now dependent on some variables. Please have another look. https://codereview.chromium.org/1844023002/diff/280001/chrome/browser/BUILD.gn File chrome/browser/BUILD.gn (right): https://codereview.chromium.org/1844023002/diff/280001/chrome/browser/BUILD.g... chrome/browser/BUILD.gn:849: "//chrome/chrome_watcher:kasko_util", On 2016/04/12 18:48:42, Patrick Monette wrote: > Move this to live 667? It's only used if enable_kasko_failed_rdv_reports is > true. Stayed with unconditional dependency, but the dependency's definition is now dependent on some variables. https://codereview.chromium.org/1844023002/diff/280001/chrome/chrome_browser.... File chrome/chrome_browser.gypi (right): https://codereview.chromium.org/1844023002/diff/280001/chrome/chrome_browser.... chrome/chrome_browser.gypi:3837: 'kasko_util', On 2016/04/12 18:48:42, Patrick Monette wrote: > Same here. See other comment.
https://codereview.chromium.org/1844023002/diff/320001/build/common.gypi File build/common.gypi (left): https://codereview.chromium.org/1844023002/diff/320001/build/common.gypi#oldc... build/common.gypi:1229: 'kasko%': '<(kasko)', Finally this is required and causing problems on try bots.
Fixed the GYP issue. Going ahead with submission. Thanks! https://codereview.chromium.org/1844023002/diff/320001/build/common.gypi File build/common.gypi (left): https://codereview.chromium.org/1844023002/diff/320001/build/common.gypi#oldc... build/common.gypi:1229: 'kasko%': '<(kasko)', On 2016/04/13 02:48:38, Patrick Monette wrote: > Finally this is required and causing problems on try bots. Done.
The CQ bit was checked by manzagop@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from siggi@chromium.org, jochen@chromium.org, scottmg@chromium.org, grt@chromium.org, pmonette@chromium.org Link to the patchset: https://codereview.chromium.org/1844023002/#ps340001 (title: "GYP fixup")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1844023002/340001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1844023002/340001
Message was sent while issue was closed.
Description was changed from ========== Capture a report on failed browser rendez-vous. Overview: Browser process hangs are believed to be responsible for a large fraction of browser unclean shutdowns. This patch is to get visibility into the problem on windows using the Kasko reporter. Note: This functionality is hidden behind buildflags, and only triggers for users with stats collection consent on canary/unspecified channels. At this time, it is only considered for one-off releases. Details: - Use of the Kasko reporter is required for its ability to capture reports from other processes. - Pulls out Kasko reporting code to kasko_utils - Invoke Kasko on failed rendez-vous. - Hidden behind build flags (kasko and kasko-rdv-capture). BUG=478209 ========== to ========== Capture a report on failed browser rendez-vous. Overview: Browser process hangs are believed to be responsible for a large fraction of browser unclean shutdowns. This patch is to get visibility into the problem on windows using the Kasko reporter. Note: This functionality is hidden behind buildflags, and only triggers for users with stats collection consent on canary/unspecified channels. At this time, it is only considered for one-off releases. Details: - Use of the Kasko reporter is required for its ability to capture reports from other processes. - Pulls out Kasko reporting code to kasko_utils - Invoke Kasko on failed rendez-vous. - Hidden behind build flags (kasko and kasko-rdv-capture). BUG=478209 ==========
Message was sent while issue was closed.
Committed patchset #15 (id:340001)
Message was sent while issue was closed.
Description was changed from ========== Capture a report on failed browser rendez-vous. Overview: Browser process hangs are believed to be responsible for a large fraction of browser unclean shutdowns. This patch is to get visibility into the problem on windows using the Kasko reporter. Note: This functionality is hidden behind buildflags, and only triggers for users with stats collection consent on canary/unspecified channels. At this time, it is only considered for one-off releases. Details: - Use of the Kasko reporter is required for its ability to capture reports from other processes. - Pulls out Kasko reporting code to kasko_utils - Invoke Kasko on failed rendez-vous. - Hidden behind build flags (kasko and kasko-rdv-capture). BUG=478209 ========== to ========== Capture a report on failed browser rendez-vous. Overview: Browser process hangs are believed to be responsible for a large fraction of browser unclean shutdowns. This patch is to get visibility into the problem on windows using the Kasko reporter. Note: This functionality is hidden behind buildflags, and only triggers for users with stats collection consent on canary/unspecified channels. At this time, it is only considered for one-off releases. Details: - Use of the Kasko reporter is required for its ability to capture reports from other processes. - Pulls out Kasko reporting code to kasko_utils - Invoke Kasko on failed rendez-vous. - Hidden behind build flags (kasko and kasko-rdv-capture). BUG=478209 Committed: https://crrev.com/f071e88974318cec29c02bd5b92fe61868da8569 Cr-Commit-Position: refs/heads/master@{#386983} ==========
Message was sent while issue was closed.
Patchset 15 (id:??) landed as https://crrev.com/f071e88974318cec29c02bd5b92fe61868da8569 Cr-Commit-Position: refs/heads/master@{#386983} |