Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(350)

Issue 2396093003: Revert of Verify if GPU message pump is signaled when it hangs in WaitForWork (Closed)

Created:
4 years, 2 months ago by stanisc
Modified:
4 years, 2 months ago
CC:
chromium-reviews, jam, sadrul, piman+watch_chromium.org, darin-cc_chromium.org, vmiura
Base URL:
https://chromium.googlesource.com/chromium/src.git@master
Target Ref:
refs/pending/heads/master
Project:
chromium
Visibility:
Public.

Description

Revert of Verify if GPU message pump is signaled when it hangs in WaitForWork (patchset #3 id:40001 of https://codereview.chromium.org/2077613002/ ) Reason for revert: This check isn't needed anymore and in practice it is negative in 100% of crash dumps. After some additional research I realized that that was a false negative and that this check doesn't work as expected with auto-reset events. I confirmed that an auto-reset event gets promptly reset back to non-signaled when it gets signaled as long as there is at least one thread already waiting on it. That is the case even when when the target thread is never scheduled to run. The check would work with a manual-reset event but apparently it is useless in the case of an auto-reset event. Original issue's description: > Verify if GPU message pump is signaled when it hangs in WaitForWork > > This is a diagnostic change. The code introduced in this > change runs only when GPU process is about to terminate > with a deliberate crash. > > We a getting a number of crashes triggered by GPU hang in > MessagePumpForGpu::WaitForWork. There is already some > instrumentation that indicates that: > a) MessagePumpForGpu::WaitForWork is sitting in > MsgWaitForMultipleObjectsEx for longer than 15 seconds > b) MessagePumpForGpu::ScheduleWork is called after > WaitForWork enters the wait, sometimes several seconds > after, and SetEvent must be called (at least we grab the > timestamp right before calling SetEvent method) > c) The event is set but it doesn't wake up the wait > > While it is possible for a thread that is awaken to not > be immediately scheduled by the OS, it is hard to imagine > that going up for 15+ seconds. So the theory is that the > event handle might be recycled and there might be some > other code that has closed but hasn't nulled out the > old handle. Since the event is auto-reset, when there are > multiple waiters only one of them would be awaken and > reset the event, and the other one would > just continue waiting. So if the other code is somehow > still waiting on its closed and now recycled handle, that > would explain the hang. > > This code would allow GPU watchdog to check whether > the event was set at the time of the crash. This would > give us a clue of whether the situation described above > is actually happening. > > Extra bonus: the investigation would also explain if > Renderer hangs in MessagePumpDefault::Run are caused by > the same issue. > > BUG=620904 > > Committed: https://crrev.com/702c0f481843035dd46c6a6a256cbe65dda8629c > Cr-Commit-Position: refs/heads/master@{#400871} TBR=kbr@chromium.org,dcheng@chromium.org,brucedawson@chromium.org,jbauman@chromium.org # Not skipping CQ checks because original CL landed more than 1 days ago. BUG=620904

Patch Set 1 #

Unified diffs Side-by-side diffs Delta from patch set Stats (+0 lines, -47 lines) Patch
M base/message_loop/message_loop.h View 1 chunk +0 lines, -9 lines 0 comments Download
M base/message_loop/message_loop.cc View 1 chunk +0 lines, -6 lines 0 comments Download
M base/message_loop/message_pump.h View 1 chunk +0 lines, -9 lines 0 comments Download
M base/message_loop/message_pump.cc View 1 chunk +0 lines, -7 lines 0 comments Download
M base/message_loop/message_pump_win.h View 1 chunk +0 lines, -7 lines 0 comments Download
M base/message_loop/message_pump_win.cc View 1 chunk +0 lines, -5 lines 0 comments Download
M content/gpu/gpu_watchdog_thread.cc View 1 chunk +0 lines, -4 lines 0 comments Download

Messages

Total messages: 9 (3 generated)
stanisc
Created Revert of Verify if GPU message pump is signaled when it hangs in WaitForWork
4 years, 2 months ago (2016-10-06 20:37:06 UTC) #1
commit-bot: I haz the power
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2396093003/1
4 years, 2 months ago (2016-10-06 20:42:14 UTC) #3
commit-bot: I haz the power
Try jobs failed on following builders: android_clang_dbg_recipe on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_clang_dbg_recipe/builds/141689) android_compile_dbg on master.tryserver.chromium.android (JOB_FAILED, ...
4 years, 2 months ago (2016-10-06 20:46:36 UTC) #5
Ken Russell (switch to Gerrit)
+vmiura Are you planning to rebase this and land it, or do this under a ...
4 years, 2 months ago (2016-10-07 02:58:06 UTC) #6
stanisc
On 2016/10/07 02:58:06, Ken Russell wrote: > +vmiura > > Are you planning to rebase ...
4 years, 2 months ago (2016-10-07 04:06:20 UTC) #7
Ken Russell (switch to Gerrit)
4 years, 2 months ago (2016-10-07 04:06:55 UTC) #8
On 2016/10/07 04:06:20, stanisc wrote:
> On 2016/10/07 02:58:06, Ken Russell wrote:
> > +vmiura
> > 
> > Are you planning to rebase this and land it, or do this under a different
CL?
> > 
> > vmiura's been investigating GPU process hangs under http://crbug.com/609252
;
> > the current hypothesis is that a memory stomp is corrupting the observer
list,
> > causing a key notification to the watchdog to be dropped.
> 
> I created a new CL - https://codereview.chromium.org/2393333002/. I'll submit
it
> tomorrow.

OK. I'll close this.

Powered by Google App Engine
This is Rietveld 408576698