Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(984)

Issue 2381893002: base: Take care of unexpected wakeups in condition_variable_win.cc. (Closed)

Created:
4 years, 2 months ago by prashant.n
Modified:
4 years, 1 month ago
Reviewers:
danakj, robliao, reveman, vmpstr
CC:
chromium-reviews
Target Ref:
refs/pending/heads/master
Project:
chromium
Visibility:
Public.

Description

base: Take care of unexpected wakeups in condition_variable_win.cc. As per MSDN documentation (https://goo.gl/tPHPkY), condition variables are subject to spurious wakeups (those not associated with an explicit wake) and stolen wakeups (another thread manages to run before the woken thread). These unexpected wakeups may cause bizarre behaviours in functionalities where condition variables are used. This patch adds a function which checks whether it's a real timeout and if not it retries until some defined number of times after which it crashes. BUG=640577, 617010

Patch Set 1 #

Patch Set 2 : build break #

Patch Set 3 : nits #

Patch Set 4 : build break #

Total comments: 3
Unified diffs Side-by-side diffs Delta from patch set Stats (+64 lines, -10 lines) Patch
M base/synchronization/condition_variable_win.cc View 1 2 3 2 chunks +64 lines, -10 lines 3 comments Download

Depends on Patchset:

Dependent Patchsets:

Messages

Total messages: 45 (11 generated)
prashant.n
For bug "Chrome: Crash Report - [Renderer hang] cc::TaskGraphWorkQueue::HasReadyToRunTasksInNamespace" suspect spurious wakeups of SleepConditionVariableSRW() as ...
4 years, 2 months ago (2016-09-29 11:42:55 UTC) #5
prashant.n
For bug "Chrome: Crash Report - [Renderer hang] cc::TaskGraphWorkQueue::HasReadyToRunTasksInNamespace" suspect spurious wakeups of SleepConditionVariableSRW() as ...
4 years, 2 months ago (2016-09-29 11:42:55 UTC) #6
danakj
+robliao
4 years, 2 months ago (2016-09-29 19:22:33 UTC) #13
robliao
On 2016/09/29 19:22:33, danakj wrote: > +robliao Two questions: 1) Like most condition variables, the ...
4 years, 2 months ago (2016-09-29 19:25:38 UTC) #14
prashant.n
On 2016/09/29 19:25:38, robliao wrote: > On 2016/09/29 19:22:33, danakj wrote: > > +robliao > ...
4 years, 2 months ago (2016-09-30 00:35:18 UTC) #15
robliao
On 2016/09/30 00:35:18, prashant.n wrote: > On 2016/09/29 19:25:38, robliao wrote: > > On 2016/09/29 ...
4 years, 2 months ago (2016-09-30 00:44:08 UTC) #16
prashant.n
> Also, by indirecting the CV one level, you potentially introduce some races > where ...
4 years, 2 months ago (2016-09-30 02:16:39 UTC) #17
prashant.n
> In such case, if we check whether lock is acquired or not before retrying ...
4 years, 2 months ago (2016-09-30 09:09:53 UTC) #18
prashant.n
IMO, the patch should work provided that SleepConditionVariableSRW() fails on spurious wakeups. If the function ...
4 years, 2 months ago (2016-09-30 11:34:37 UTC) #19
robliao
On 2016/09/30 11:34:37, prashant.n wrote: > IMO, the patch should work provided that SleepConditionVariableSRW() fails ...
4 years, 2 months ago (2016-09-30 17:18:42 UTC) #20
danakj
There is nothing windows-specific about spuriously waking up from wait either. pthreads says this: "When ...
4 years, 2 months ago (2016-09-30 18:12:42 UTC) #21
prashant.n
> There are two main scenarios to consider when dealing with the CV: > 1) ...
4 years, 2 months ago (2016-09-30 18:12:55 UTC) #22
danakj
On Fri, Sep 30, 2016 at 11:12 AM, <danakj@chromium.org> wrote: > There is nothing windows-specific ...
4 years, 2 months ago (2016-09-30 18:25:19 UTC) #23
robliao
> In case of spurious wakeup, is acquiring lock after signal is received > guaranteed? ...
4 years, 2 months ago (2016-09-30 18:35:27 UTC) #24
prashant.n
> Yes. It is an error not to reacquire the lock because the thread presumes ...
4 years, 2 months ago (2016-10-01 00:12:55 UTC) #25
prashant.n
+reveman@ and vmpstr@
4 years, 2 months ago (2016-10-01 00:15:00 UTC) #27
prashant.n
My conclusion on this is - We cannot have spurious wakeups (as we don't see ...
4 years, 2 months ago (2016-10-03 05:22:40 UTC) #28
reveman
A while loop for this on windows seems like the appropriate fix based on linked ...
4 years, 2 months ago (2016-10-03 10:58:07 UTC) #29
prashant.n
> Is there a way we can test this in a unit test? I'm not ...
4 years, 2 months ago (2016-10-03 11:39:50 UTC) #30
prashant.n
> Is there a way we can test this in a unit test? I'm not ...
4 years, 2 months ago (2016-10-03 11:39:50 UTC) #31
prashant.n
https://codereview.chromium.org/2381893002/diff/60001/base/synchronization/condition_variable_win.cc File base/synchronization/condition_variable_win.cc (right): https://codereview.chromium.org/2381893002/diff/60001/base/synchronization/condition_variable_win.cc#newcode20 base/synchronization/condition_variable_win.cc:20: static const int kMaxNumberOfTriesAfterFailure = 10; On 2016/10/03 10:58:07, ...
4 years, 2 months ago (2016-10-03 11:40:14 UTC) #32
prashant.n
On 2016/10/03 11:39:50, prashant.n wrote: > > Is there a way we can test this ...
4 years, 2 months ago (2016-10-03 13:11:27 UTC) #33
reveman
https://codereview.chromium.org/2381893002/diff/60001/base/synchronization/condition_variable_win.cc File base/synchronization/condition_variable_win.cc (right): https://codereview.chromium.org/2381893002/diff/60001/base/synchronization/condition_variable_win.cc#newcode20 base/synchronization/condition_variable_win.cc:20: static const int kMaxNumberOfTriesAfterFailure = 10; On 2016/10/03 at ...
4 years, 2 months ago (2016-10-03 13:23:40 UTC) #34
prashant.n
On 2016/10/03 13:23:40, reveman wrote: > > The situation is very similar to how we ...
4 years, 2 months ago (2016-10-03 17:04:54 UTC) #35
robliao
On 2016/10/03 13:23:40, reveman wrote: > https://codereview.chromium.org/2381893002/diff/60001/base/synchronization/condition_variable_win.cc > File base/synchronization/condition_variable_win.cc (right): > > https://codereview.chromium.org/2381893002/diff/60001/base/synchronization/condition_variable_win.cc#newcode20 > ...
4 years, 2 months ago (2016-10-03 17:05:08 UTC) #36
prashant.n
On 2016/10/03 17:05:08, robliao wrote: > > As indicated before, clients will naturally have a ...
4 years, 2 months ago (2016-10-03 17:23:58 UTC) #37
reveman
On 2016/10/03 at 17:05:08, robliao wrote: > On 2016/10/03 13:23:40, reveman wrote: > > https://codereview.chromium.org/2381893002/diff/60001/base/synchronization/condition_variable_win.cc ...
4 years, 2 months ago (2016-10-03 19:46:17 UTC) #38
prashant.n
On 2016/10/03 19:46:17, reveman wrote: > Ah, it's documented here: > https://cs.chromium.org/chromium/src/base/synchronization/condition_variable.h?rcl=0&l=18 > > I ...
4 years, 2 months ago (2016-10-04 01:38:41 UTC) #39
prashant.n
On 2016/10/04 01:38:41, prashant.n wrote: > Hmm. Then should we patch up at call site ...
4 years, 2 months ago (2016-10-04 07:26:23 UTC) #40
prashant.n
On 2016/10/04 07:26:23, prashant.n wrote: > On 2016/10/04 01:38:41, prashant.n wrote: > > Hmm. Then ...
4 years, 2 months ago (2016-10-04 13:42:05 UTC) #41
robliao
On 2016/10/04 13:42:05, prashant.n wrote: > On 2016/10/04 07:26:23, prashant.n wrote: > > On 2016/10/04 ...
4 years, 2 months ago (2016-10-05 00:21:18 UTC) #42
prashant.n
On 2016/10/05 00:21:18, robliao wrote: > CWP needs to be robust to spurious wakeups. If ...
4 years, 2 months ago (2016-10-05 07:01:27 UTC) #43
robliao
On 2016/10/05 07:01:27, prashant.n wrote: > On 2016/10/05 00:21:18, robliao wrote: > > CWP needs ...
4 years, 2 months ago (2016-10-07 22:33:53 UTC) #44
robliao
4 years, 1 month ago (2016-10-27 23:23:12 UTC) #45
Message was sent while issue was closed.
On 2016/10/07 22:33:53, robliao wrote:
> On 2016/10/05 07:01:27, prashant.n wrote:
> > On 2016/10/05 00:21:18, robliao wrote:
> > > CWP needs to be robust to spurious wakeups. If CWP can't make forward
> progress
> > > due to deadlocking, that suggests abug.
> > 
> > The CWP handles this situation and will not lead to deadlocks assuming
> > SleepConditionVariableSRW() provides atomic operations.
> > 
> > On waiting,
> > SleepConditionVariableSRW() = Unlock + Start Waiting
> > 
> > On signalling,
> > it should function as = Stop Waiting + Lock
> > 
> > 
> > Suppose there are two threads t1 and t2. t1 is about to wait on cv and t2
has
> > already been waiting on same cv and due to spurious/stolen wake up, both get
> > woken up from cv, then only one can succeed to acquire the lock, other will
> > continue to wait to acquire the lock.
> 
> Are we all agreed that we do not need this change?

Closing this patch for now. Feel free to reopen if the need presents itself.

Powered by Google App Engine
This is Rietveld 408576698