|
|
Chromium Code Reviews
DescriptionAddRef() / Release() while URLRequest is alive in OCSPRequestSession
To make sure OCSPRequestSession is avlie while URLRequest is running,
AddRef() in StartURLRequest() and Release() in OnReadCompleted().
BUG=28526, 28769
TEST=none
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=33950
Patch Set 1 #Patch Set 2 : Relese() after request_ is deleted #
Total comments: 7
Patch Set 3 : refactor #
Total comments: 9
Patch Set 4 : fix deadlock in Cancel call, notify worker in WillDestroyCurrentMessageLoop #
Total comments: 8
Patch Set 5 : fix wtc's comment #
Total comments: 3
Messages
Total messages: 21 (0 generated)
Thank you for the patch! http://codereview.chromium.org/449009/diff/1001/2001 File net/ocsp/nss_ocsp.cc (right): http://codereview.chromium.org/449009/diff/1001/2001#newcode532 net/ocsp/nss_ocsp.cc:532: req->Cancel(); I still don't know where the bug is. This req->Cancel() call should add a reference to OCSPRequestSession until the CancelURLRequest task has run. (This is done by the RunnableMethod class's constructor and destructor.) So that reference should ensure that OCSPRequestSession lives longer than URLRequest.
I'm also wondering if URLFetcher has the same bug. The URLFetcher::Core destructor doesn't have a DCHECK(!request_), so we can't easily tell if request_ is NULL.
LGTM. I spent several hours looking into this crash. I still haven't figured out why this is necessary, but this fix will definitely ensure that OCSPRequestSession lives longer than URLRequest, and pmarks verified that this fix works, so please check this in. http://codereview.chromium.org/449009/diff/1001/2001 File net/ocsp/nss_ocsp.cc (right): http://codereview.chromium.org/449009/diff/1001/2001#newcode231 net/ocsp/nss_ocsp.cc:231: Release(); // Balanced with StartURLRequest(). Please put this Release() call immediately after the request_ = NULL; statement.
On 2009/12/03 04:49:38, wtc wrote: > > I still haven't figured out why this is necessary, Shortly after I wrote this comment, I figured it out. I have a CL at http://codereview.chromium.org/460040 . Fumitoshi, do we have to call io_loop_->RemoveDestructionObserver(this) on the IO thread?
On 2009/12/03 05:37:33, wtc wrote: > On 2009/12/03 04:49:38, wtc wrote: > > > > I still haven't figured out why this is necessary, > > Shortly after I wrote this comment, I figured it out. > I have a CL at http://codereview.chromium.org/460040 . > > Fumitoshi, do we have to call > io_loop_->RemoveDestructionObserver(this) on the > IO thread? I think so. RemoveDestructionObserver must be called on the same message loop.
On 2009/12/03 06:04:17, ukai wrote: > > I think so. RemoveDestructionObserver must be called on the same message loop. In that case, we need to make sure OCSPRequestSession is destroyed in the IO thread because the OCSPRequestSession destructor may call io_loop_->RemoveDestructionObserver(this). I think your CL will ensure that OCSPRequestSession is destroyed on the IO thread?
On 2009/12/03 06:12:08, wtc wrote: > On 2009/12/03 06:04:17, ukai wrote: > > > > I think so. RemoveDestructionObserver must be called on the same message loop. > > In that case, we need to make sure OCSPRequestSession is destroyed > in the IO thread because the OCSPRequestSession destructor may call > io_loop_->RemoveDestructionObserver(this). I think your CL will > ensure that OCSPRequestSession is destroyed on the IO thread? Oh, we shouldn't call RemoveDestructionObserver() in OCSPRequestSession destructor. But, if request_ is NULL, then io_loop_ should be NULL (or it didn't call AddDestructionObserver()). So, it might be fine just putting DCHECK(!io_loop) in OCSPRequestSession destructor?
On 2009/12/03 06:17:46, ukai wrote: > > Oh, we shouldn't call RemoveDestructionObserver() in OCSPRequestSession > destructor. But, if request_ is NULL, then io_loop_ should be NULL (or it > didn't call AddDestructionObserver()). So, it might be fine just putting > DCHECK(!io_loop) in OCSPRequestSession destructor? I agree. But DCHECK(!io_loop_) may be too strong because io_loop_ may be non-NULL if we didn't call Start(). My head is not clear now, so I'll think about this again tomorrow when my head is fresh :-)
On 2009/12/03 07:45:35, wtc wrote: > On 2009/12/03 06:17:46, ukai wrote: > > > > Oh, we shouldn't call RemoveDestructionObserver() in OCSPRequestSession > > destructor. But, if request_ is NULL, then io_loop_ should be NULL (or it > > didn't call AddDestructionObserver()). So, it might be fine just putting > > DCHECK(!io_loop) in OCSPRequestSession destructor? > > I agree. But DCHECK(!io_loop_) may be too strong because > io_loop_ may be non-NULL if we didn't call Start(). Ah, yes. that's right. Maybe, we can set io_loop_ in Start()? > My head is not clear now, so I'll think about this > again tomorrow when my head is fresh :-) Thanks!
I think we should also check in your CL even if pmarks verifies my CL can fix the crash. I believe I fully understand what causes the crash now. We need to make sure OCSPRequestSession stays alive while URLRequest is running a URLRequest::Delegate callback. We can accomplish that by moving the cv_.Signal() call to the end of the OnReadCompleted() method, but it's better to guarantee that by acquiring a reference. Here are some more comments. http://codereview.chromium.org/449009/diff/1001/2001 File net/ocsp/nss_ocsp.cc (right): http://codereview.chromium.org/449009/diff/1001/2001#newcode157 net/ocsp/nss_ocsp.cc:157: Cancel(); This Cancel() call will result in a deadlock since it also tries to acquire lock_. http://codereview.chromium.org/449009/diff/1001/2001#newcode231 net/ocsp/nss_ocsp.cc:231: Release(); // Balanced with StartURLRequest(). On 2009/12/03 04:49:38, wtc wrote: > Please put this Release() call immediately after the > request_ = NULL; > statement. Please ignore this suggestion. After the Release() call we cannot access any members, so it's best to put Release() at the end of the method. http://codereview.chromium.org/449009/diff/1001/2001#newcode242 net/ocsp/nss_ocsp.cc:242: request_->Cancel(); We may need to set finished_ to true and call cv_.Signal() here to wake up the worker thread. (We'd need to set response_code_ to -1 to indicate an error._) I'm not sure it's necessary to wake up worker threads when we're shutting down the browser though.
On 2009/12/03 07:48:17, ukai wrote: > > Ah, yes. that's right. > Maybe, we can set io_loop_ in Start()? No. What I meant is that if we didn't call Start(), io_loop_ will still be non-NULL when we call the OCSPRequestSession destructor. io_loop_ is initialized to non-NULL by the constructor, so we don't need to set io_loop_ in Start() again.
On 2009/12/03 07:59:22, wtc wrote: > On 2009/12/03 07:48:17, ukai wrote: > > > > Ah, yes. that's right. > > Maybe, we can set io_loop_ in Start()? > > No. > > What I meant is that if we didn't call Start(), io_loop_ will still be > non-NULL when we call the OCSPRequestSession destructor. > io_loop_ is initialized to non-NULL by the constructor, so we > don't need to set io_loop_ in Start() again. I mean io_loop_ is NULL in constructor, and set io_loop_ in Start().
On 2009/12/03 08:01:00, ukai wrote: > > I mean io_loop_ is NULL in constructor, and set io_loop_ in Start(). Ah, I see. Yes, that should work. Ideally we should set io_loop_ in StartURLRequest(), when we set request_ = new URLRequest(url_, this); so that it's clear that io_loop_ and request_ are set and cleared together. Alternatively, we can add a bool member 'destruction_observer_added_' to remember this.
I fixed to io_loop_ management in OCSPInitSingleton. Could you take a look again, please? Thanks. http://codereview.chromium.org/449009/diff/1001/2001 File net/ocsp/nss_ocsp.cc (right): http://codereview.chromium.org/449009/diff/1001/2001#newcode157 net/ocsp/nss_ocsp.cc:157: Cancel(); On 2009/12/03 07:54:15, wtc wrote: > This Cancel() call will result in a deadlock since it also > tries to acquire lock_. Done. http://codereview.chromium.org/449009/diff/1001/2001#newcode242 net/ocsp/nss_ocsp.cc:242: request_->Cancel(); On 2009/12/03 07:54:15, wtc wrote: > We may need to set finished_ to true and call cv_.Signal() > here to wake up the worker thread. (We'd need to set > response_code_ to -1 to indicate an error._) > I'm not sure it's necessary to wake up worker threads > when we're shutting down the browser though. Done.
LGTM, but I have some questions about the correctness of the locking OCSPInitSingleton. I suggest that we check in your Patch Set 2 first, and then keep working on the refactoring. http://codereview.chromium.org/449009/diff/6005/10002 File net/ocsp/nss_ocsp.cc (right): http://codereview.chromium.org/449009/diff/6005/10002#newcode44 net/ocsp/nss_ocsp.cc:44: MessageLoop* io_thread() const { Please document that the io_thread() method can only be called by the IO thread. Hmm... given that, maybe OCSPRequestSession::StartURLRequest should just call MessageLoopForIO::current() to get io_loop_, and we can remove this io_thread() method. http://codereview.chromium.org/449009/diff/6005/10002#newcode48 net/ocsp/nss_ocsp.cc:48: void PostTask(const tracked_objects::Location& from_here, Task* task) { Please name this method PostTaskToIOLoop. http://codereview.chromium.org/449009/diff/6005/10002#newcode67 net/ocsp/nss_ocsp.cc:67: PostTask(FROM_HERE, Note: on second thought, I'm not sure if my following comment is correct, but I still put it here for your reference: We can use io_loop_ without locking in the OCSPInitSingleton destructor. When we're destroying an object, no other thread should have access to it. The reason I think this comment may be wrong is that the IO thread may call WillDestroyCurrentMessageLoop(). Are singletons destroyed on the main thread, or the same thread the singletons were created on? This is making my head hurt... http://codereview.chromium.org/449009/diff/6005/10002#newcode77 net/ocsp/nss_ocsp.cc:77: io_loop->RemoveDestructionObserver(ocsp_init_singleton); There is a race here: the address of the destroyed OCSPInitSingleton object may have been reused for some other object that has added its address as a destruction observer. Perhaps that new object's AddDestructionObserver call will fail because there is an existing destructor observer with the same address? http://codereview.chromium.org/449009/diff/6005/10002#newcode581 net/ocsp/nss_ocsp.cc:581: Singleton<OCSPInitSingleton>::get(); Please add a comment here that says this is where the OCSPInitSingleton object should be created, and this method should be called on the IO thread.
These are addressed in codereview.chromium.org/460067. http://codereview.chromium.org/449009/diff/6005/10002 File net/ocsp/nss_ocsp.cc (right): http://codereview.chromium.org/449009/diff/6005/10002#newcode44 net/ocsp/nss_ocsp.cc:44: MessageLoop* io_thread() const { On 2009/12/03 20:05:27, wtc wrote: > Please document that the io_thread() method can only be > called by the IO thread. > > Hmm... given that, maybe OCSPRequestSession::StartURLRequest > should just call MessageLoopForIO::current() to get io_loop_, > and we can remove this io_thread() method. Done. http://codereview.chromium.org/449009/diff/6005/10002#newcode48 net/ocsp/nss_ocsp.cc:48: void PostTask(const tracked_objects::Location& from_here, Task* task) { On 2009/12/03 20:05:27, wtc wrote: > Please name this method PostTaskToIOLoop. Done. http://codereview.chromium.org/449009/diff/6005/10002#newcode67 net/ocsp/nss_ocsp.cc:67: PostTask(FROM_HERE, On 2009/12/03 20:05:27, wtc wrote: > Note: on second thought, I'm not sure if my following comment > is correct, but I still put it here for your reference: > > We can use io_loop_ without locking in the > OCSPInitSingleton destructor. When we're > destroying an object, no other thread should > have access to it. > > The reason I think this comment may be wrong is that the > IO thread may call WillDestroyCurrentMessageLoop(). > > Are singletons destroyed on the main thread, or the same > thread the singletons were created on? > > This is making my head hurt... It seems singleton is deleted on a thread of AtExitManager runs? http://codereview.chromium.org/449009/diff/6005/10002#newcode77 net/ocsp/nss_ocsp.cc:77: io_loop->RemoveDestructionObserver(ocsp_init_singleton); On 2009/12/03 20:05:27, wtc wrote: > There is a race here: the address of the destroyed > OCSPInitSingleton object may have been reused for some > other object that has added its address as a destruction > observer. Perhaps that new object's AddDestructionObserver > call will fail because there is an existing destructor > observer with the same address? Ah, that would be race. Hmm, we need to RefCountedThreadSafe object to hold io_loop_ ?
LGTM. Thanks! I suggested some changes below. Please let me know if you have any questions because this code is quite subtle. http://codereview.chromium.org/449009/diff/9003/9004 File net/ocsp/nss_ocsp.cc (right): http://codereview.chromium.org/449009/diff/9003/9004#newcode155 net/ocsp/nss_ocsp.cc:155: io_loop_->PostTask( It would be a good idea to add a new CancelInternal() or CancelLocked() method that does this, so that we don't need to duplicate these three lines (155-157) here and in the Cancel() method. http://codereview.chromium.org/449009/diff/9003/9004#newcode242 net/ocsp/nss_ocsp.cc:242: request_->Cancel(); The code here (lines 242 - 251) is CancelURLRequest(). We should move the new AutoLock and cv_.Signal() to CancelURLRequest() and just call CancelURLRequest() here. http://codereview.chromium.org/449009/diff/9003/9004#newcode249 net/ocsp/nss_ocsp.cc:249: io_loop_ = NULL; Please remove response_code_ = -1; // Indicate an error. I found that response_code_ is initialized to -1 in the constructor, so this is not necessary. Remove io_loop_ = NULL; because we already set io_loop_ to NULL at line 239 above. http://codereview.chromium.org/449009/diff/9003/9004#newcode294 net/ocsp/nss_ocsp.cc:294: AddRef(); // Release in OnReadCompleted() or after |request_| deleted. This comment should just say: Release after |request_| is deleted. because in OnReadCompleted() we also call Release() after request_ is deleted.
http://codereview.chromium.org/449009/diff/9003/9004 File net/ocsp/nss_ocsp.cc (right): http://codereview.chromium.org/449009/diff/9003/9004#newcode155 net/ocsp/nss_ocsp.cc:155: io_loop_->PostTask( On 2009/12/04 22:01:05, wtc wrote: > It would be a good idea to add a new CancelInternal() or > CancelLocked() method that does this, so that we don't need > to duplicate these three lines (155-157) here and in the > Cancel() method. Done. http://codereview.chromium.org/449009/diff/9003/9004#newcode242 net/ocsp/nss_ocsp.cc:242: request_->Cancel(); On 2009/12/04 22:01:05, wtc wrote: > The code here (lines 242 - 251) is CancelURLRequest(). > We should move the new AutoLock and cv_.Signal() to > CancelURLRequest() and just call CancelURLRequest() here. Done. http://codereview.chromium.org/449009/diff/9003/9004#newcode249 net/ocsp/nss_ocsp.cc:249: io_loop_ = NULL; On 2009/12/04 22:01:05, wtc wrote: > Please remove > response_code_ = -1; // Indicate an error. > > I found that response_code_ is initialized to -1 in the > constructor, so this is not necessary. > > Remove > io_loop_ = NULL; > because we already set io_loop_ to NULL at > line 239 above. Done. http://codereview.chromium.org/449009/diff/9003/9004#newcode294 net/ocsp/nss_ocsp.cc:294: AddRef(); // Release in OnReadCompleted() or after |request_| deleted. On 2009/12/04 22:01:05, wtc wrote: > This comment should just say: > Release after |request_| is deleted. > > because in OnReadCompleted() we also call Release() after > request_ is deleted. Done.
http://codereview.chromium.org/449009/diff/9005/9006 File net/ocsp/nss_ocsp.cc (right): http://codereview.chromium.org/449009/diff/9005/9006#newcode228 net/ocsp/nss_ocsp.cc:228: virtual void WillDestroyCurrentMessageLoop() { It's a shame that the WillDestroyCurrentMessageLoop method may lock lock_ and set io_loop_ to NULL twice, once here, and once (if request_ is not NULL) inside CancelURLRequest. But I can't come up with a way to avoid this. http://codereview.chromium.org/449009/diff/9005/9006#newcode246 net/ocsp/nss_ocsp.cc:246: void CancelLocked() { Please add a comment to document that we must call this method while holding lock_. You can add this comment to your other CL at http://codereview.chromium.org/460067 Thanks!
http://codereview.chromium.org/449009/diff/9005/9006 File net/ocsp/nss_ocsp.cc (right): http://codereview.chromium.org/449009/diff/9005/9006#newcode246 net/ocsp/nss_ocsp.cc:246: void CancelLocked() { On 2009/12/08 01:44:01, wtc wrote: > Please add a comment to document that we must call this method > while holding lock_. Sure. Added lock_.AssertAcquired() too. > > You can add this comment to your other CL at > http://codereview.chromium.org/460067 > > Thanks! |
