Issue 364943002: Makes waiting SSLConnectJobs use the message loops to resume their connection.

mshelley

6 years, 5 months ago (2014-07-08 01:22:55 UTC) #1

wtc

Patch set 1 LGTM. Please check all the changes to make sure it is still ...

6 years, 5 months ago (2014-07-08 17:37:03 UTC) #2

Ryan Sleevi

https://codereview.chromium.org/364943002/diff/20001/net/socket/ssl_client_socket_pool.cc File net/socket/ssl_client_socket_pool.cc (right): https://codereview.chromium.org/364943002/diff/20001/net/socket/ssl_client_socket_pool.cc#newcode138 net/socket/ssl_client_socket_pool.cc:138: base::MessageLoopForIO::current()->PostTask(FROM_HERE, cb); On 2014/07/08 17:37:03, wtc wrote: > > ...

6 years, 5 months ago (2014-07-08 19:36:53 UTC) #3

https://codereview.chromium.org/364943002/diff/20001/net/socket/ssl_client_so...
File net/socket/ssl_client_socket_pool.cc (right):

https://codereview.chromium.org/364943002/diff/20001/net/socket/ssl_client_so...
net/socket/ssl_client_socket_pool.cc:138:
base::MessageLoopForIO::current()->PostTask(FROM_HERE, cb);
On 2014/07/08 17:37:03, wtc wrote:
> 
> IMPORTANT: please make sure the task won't run if |ssl_socket| is destroyed
> after the task is posted but before the message loop is ready to run the task.

Note: You should not use base::MessageLoopForIO directly.

The "old" path was using base::MessageLoop::current() [which always returns the
MessageLoop of the current thread, and you're always guaranteed that this thread
will be the IO thread by virtue of these classes being single threaded]

The "new" path is to use ThreadTaskRunnerHandle::Get()->PostTask() instead.

However, as Wan-Teh mentioned, we need to ensure that this callback is safe to
run even when the socket is deleted. We should be able to ensure this by
ensuring the callback, when registered (and later returned by
GetFirstCallback()) uses a WeakPtr.

Another thing to consider is to actually schedule a callback to grab the first
callback, rather than grabing it on line 131.

imagine you have sockets 1, 2, and 3 - with 1 being the leader.

If 1 connects, and you decide to start connecting 2, you could PostTask, but
then 2 could be deleted.
1) If 2 is deleted, you need to make sure the unregistration code can handle
when 2 is no longer in pending_sockets_and_callbacks (eg: all the state in line
113-136)
2) If 2 is deleted, what you want is to allow 3 to resume, not 2.

For this reason, I'd suggest you actually do a PostTask that runs a method on
the SSLCJM, which then does what you do on lines 129-138, and use an (internal
only) weakptr to the SSLCJM.

This will ensure that if 2 is deleted, by the time the CB runs, 3 is what is
treated as the leader (line 134) and the CB that is run.

Ryan Sleevi

Sorry, to be explicit: Not LGTM for now, because I think the scenario I described ...

6 years, 5 months ago (2014-07-08 19:37:24 UTC) #4

mshelley

https://codereview.chromium.org/364943002/diff/20001/net/socket/ssl_client_socket_openssl.cc File net/socket/ssl_client_socket_openssl.cc (right): https://codereview.chromium.org/364943002/diff/20001/net/socket/ssl_client_socket_openssl.cc#newcode396 net/socket/ssl_client_socket_openssl.cc:396: base::MessageLoopForIO::current()->PostTask(FROM_HERE, error_callback_); On 2014/07/08 17:37:03, wtc wrote: > > ...

6 years, 5 months ago (2014-07-09 21:53:07 UTC) #5

https://codereview.chromium.org/364943002/diff/20001/net/socket/ssl_client_so...
File net/socket/ssl_client_socket_openssl.cc (right):

https://codereview.chromium.org/364943002/diff/20001/net/socket/ssl_client_so...
net/socket/ssl_client_socket_openssl.cc:396:
base::MessageLoopForIO::current()->PostTask(FROM_HERE, error_callback_);
On 2014/07/08 17:37:03, wtc wrote:
> 
> IMPORTANT: please verify that this task should still be run if this object is
> deleted.

By "this object" do you mean the SSLClientSocketOpenSSL that this method is
called on, or the SSLConnectJobMessenger that is bound to the callback? 

If you're referring to the socket, then yes the task should still be run even if
the socket is deleted. The callback is runng OnJobFailed, which doesn't actually
use the socket of the failed job. It just removes this socket from
connecting_sockets_, and then tells the next socket in
pending_sockets_and_callbacks_ to connect. 

If you're referring to the SSLConnectJobMessenger, as of now that won't be
deleted until the SSLConnectJobFactory is deleted. My understanding was that the
SSLConnectJobFactory should outlive all sockets because it's owned by socket
pool, so I don't think that the messenger being deleted should ever be a
problem?

Note that I do plan to change the handling of SSLConnectJobMessengers so that
they are deleted after all of their pending sockets have been told to connect.

https://codereview.chromium.org/364943002/diff/20001/net/socket/ssl_client_so...
File net/socket/ssl_client_socket_pool.cc (right):

https://codereview.chromium.org/364943002/diff/20001/net/socket/ssl_client_so...
net/socket/ssl_client_socket_pool.cc:138:
base::MessageLoopForIO::current()->PostTask(FROM_HERE, cb);
On 2014/07/08 19:36:53, Ryan Sleevi wrote:
> On 2014/07/08 17:37:03, wtc wrote:
> > 
> > IMPORTANT: please make sure the task won't run if |ssl_socket| is destroyed
> > after the task is posted but before the message loop is ready to run the
task.
> 
> Note: You should not use base::MessageLoopForIO directly.
> 
> The "old" path was using base::MessageLoop::current() [which always returns
the
> MessageLoop of the current thread, and you're always guaranteed that this
thread
> will be the IO thread by virtue of these classes being single threaded]
> 
> The "new" path is to use ThreadTaskRunnerHandle::Get()->PostTask() instead.
> 
> However, as Wan-Teh mentioned, we need to ensure that this callback is safe to
> run even when the socket is deleted. We should be able to ensure this by
> ensuring the callback, when registered (and later returned by
> GetFirstCallback()) uses a WeakPtr.
> 
> Another thing to consider is to actually schedule a callback to grab the first
> callback, rather than grabing it on line 131.
> 
> imagine you have sockets 1, 2, and 3 - with 1 being the leader.
> 
> If 1 connects, and you decide to start connecting 2, you could PostTask, but
> then 2 could be deleted.
> 1) If 2 is deleted, you need to make sure the unregistration code can handle
> when 2 is no longer in pending_sockets_and_callbacks (eg: all the state in
line
> 113-136)
> 2) If 2 is deleted, what you want is to allow 3 to resume, not 2.
> 
> For this reason, I'd suggest you actually do a PostTask that runs a method on
> the SSLCJM, which then does what you do on lines 129-138, and use an (internal
> only) weakptr to the SSLCJM.
> 
> This will ensure that if 2 is deleted, by the time the CB runs, 3 is what is
> treated as the leader (line 134) and the CB that is run.

So I think the issue that you're referring to could happen if socket 1 fails to
connect -- if socket 1 connects, then all of the pending socket's callbacks will
be run regardless of whether one of them fails (i.e. there is no leader). 

To make sure that I understand what you're suggesting: I could have OnJobsFailed
post a task with callback to another SSLCJM method, which would essentially do
the same thing as OnJobsFailed, except it would look for the first socket in
pending_sockets_and_callbacks that had not been deleted (not just the first
socket), and then treat that socket as the leader and directly run its callback.

https://codereview.chromium.org/364943002/diff/20001/net/socket/ssl_client_so...
File net/socket/ssl_client_socket_pool.h (right):

https://codereview.chromium.org/364943002/diff/20001/net/socket/ssl_client_so...
net/socket/ssl_client_socket_pool.h:174:
base::MessageLoopForIO::current()->PostTask(FROM_HERE, *it);
On 2014/07/08 17:37:03, wtc wrote:
> 
> 1. IMPORTANT: this will make cancelling the callback of a deleted
> SSLClientSocket more difficult. Please make sure that is handled properly, for
> example, by the use of WeakPtrs.
> 
> 2. Nit: the indentation is off by one. Can "git cl format" fix this?

The callbacks that are being posted here are bound to SSLConnectJob WeakPtrs, so
I believe they should be handled correctly should the job be cancelled. My
understanding was that the socket should always outlive the job; if that is the
case, then the callback for a deleted socket simply wouldn't be run, which I
believe is what we would want in this case.

Ryan Sleevi

https://codereview.chromium.org/364943002/diff/20001/net/socket/ssl_client_socket_openssl.cc File net/socket/ssl_client_socket_openssl.cc (right): https://codereview.chromium.org/364943002/diff/20001/net/socket/ssl_client_socket_openssl.cc#newcode396 net/socket/ssl_client_socket_openssl.cc:396: base::MessageLoopForIO::current()->PostTask(FROM_HERE, error_callback_); On 2014/07/09 21:53:06, mshelley wrote: > On ...

6 years, 5 months ago (2014-07-09 22:40:38 UTC) #6

https://codereview.chromium.org/364943002/diff/20001/net/socket/ssl_client_so...
File net/socket/ssl_client_socket_openssl.cc (right):

https://codereview.chromium.org/364943002/diff/20001/net/socket/ssl_client_so...
net/socket/ssl_client_socket_openssl.cc:396:
base::MessageLoopForIO::current()->PostTask(FROM_HERE, error_callback_);
On 2014/07/09 21:53:06, mshelley wrote:
> On 2014/07/08 17:37:03, wtc wrote:
> > 
> > IMPORTANT: please verify that this task should still be run if this object
is
> > deleted.
> 
> By "this object" do you mean the SSLClientSocketOpenSSL that this method is
> called on, or the SSLConnectJobMessenger that is bound to the callback? 
> 
> If you're referring to the socket, then yes the task should still be run even
if
> the socket is deleted. The callback is runng OnJobFailed, which doesn't
actually
> use the socket of the failed job. It just removes this socket from
> connecting_sockets_, and then tells the next socket in
> pending_sockets_and_callbacks_ to connect. 
> 
> If you're referring to the SSLConnectJobMessenger, as of now that won't be
> deleted until the SSLConnectJobFactory is deleted. My understanding was that
the
> SSLConnectJobFactory should outlive all sockets because it's owned by socket
> pool, so I don't think that the messenger being deleted should ever be a
> problem?
> 
> Note that I do plan to change the handling of SSLConnectJobMessengers so that
> they are deleted after all of their pending sockets have been told to connect.

To be clear: The situation concerned is

Precondition:
Contents of MessageLoop
  Task1 - Run something that will eventually call OnSocketFailure
  Task2 - Shut things down

At T0, you run Task1, which schedules a new Task, Task3, to run error_callback_
At T1, Task2 runs, deleting everything
At T2, error_callback_ is now run, except the thing that created the
ErrorCallback may have gone away.

Wan-Teh's fundamental question was that if, as part of T1/Task2, this object is
deleted, is it still correct to call error_callback_?

The current design of the net/socket interfaces is that once an object is
destroyed, *no* callbacks for pending operations will be called. Which means
calling T2/error_callback_ would be a violation of that principle, because the
associated socket will have gone away.

mshelley

On 2014/07/09 22:40:38, Ryan Sleevi wrote: > https://codereview.chromium.org/364943002/diff/20001/net/socket/ssl_client_socket_openssl.cc > File net/socket/ssl_client_socket_openssl.cc (right): > > https://codereview.chromium.org/364943002/diff/20001/net/socket/ssl_client_socket_openssl.cc#newcode396 ...

6 years, 5 months ago (2014-07-09 23:34:54 UTC) #7

On 2014/07/09 22:40:38, Ryan Sleevi wrote:
>
https://codereview.chromium.org/364943002/diff/20001/net/socket/ssl_client_so...
> File net/socket/ssl_client_socket_openssl.cc (right):
> 
>
https://codereview.chromium.org/364943002/diff/20001/net/socket/ssl_client_so...
> net/socket/ssl_client_socket_openssl.cc:396:
> base::MessageLoopForIO::current()->PostTask(FROM_HERE, error_callback_);
> On 2014/07/09 21:53:06, mshelley wrote:
> > On 2014/07/08 17:37:03, wtc wrote:
> > > 
> > > IMPORTANT: please verify that this task should still be run if this object
> is
> > > deleted.
> > 
> > By "this object" do you mean the SSLClientSocketOpenSSL that this method is
> > called on, or the SSLConnectJobMessenger that is bound to the callback? 
> > 
> > If you're referring to the socket, then yes the task should still be run
even
> if
> > the socket is deleted. The callback is runng OnJobFailed, which doesn't
> actually
> > use the socket of the failed job. It just removes this socket from
> > connecting_sockets_, and then tells the next socket in
> > pending_sockets_and_callbacks_ to connect. 
> > 
> > If you're referring to the SSLConnectJobMessenger, as of now that won't be
> > deleted until the SSLConnectJobFactory is deleted. My understanding was that
> the
> > SSLConnectJobFactory should outlive all sockets because it's owned by socket
> > pool, so I don't think that the messenger being deleted should ever be a
> > problem?
> > 
> > Note that I do plan to change the handling of SSLConnectJobMessengers so
that
> > they are deleted after all of their pending sockets have been told to
connect.
>  
> 
> To be clear: The situation concerned is
> 
> Precondition:
> Contents of MessageLoop
>   Task1 - Run something that will eventually call OnSocketFailure
>   Task2 - Shut things down
> 
> At T0, you run Task1, which schedules a new Task, Task3, to run
error_callback_
> At T1, Task2 runs, deleting everything
> At T2, error_callback_ is now run, except the thing that created the
> ErrorCallback may have gone away.
> 
> Wan-Teh's fundamental question was that if, as part of T1/Task2, this object
is
> deleted, is it still correct to call error_callback_?
> 
> The current design of the net/socket interfaces is that once an object is
> destroyed, *no* callbacks for pending operations will be called. Which means
> calling T2/error_callback_ would be a violation of that principle, because the
> associated socket will have gone away.

Oh ok I see. So no, I would not want to call the error_callback_ if everything
(i.e. the SSLConnectJobMessenger) is deleted. I think an appropriate solution
then would be to use a weak_ptr for the SSLConnectJobMessenger, so that if the
messenger is deleted before the error_callback_ task is run, the task will not
be run. Does that sound correct?

Ryan Sleevi

On 2014/07/09 23:34:54, mshelley wrote: > Oh ok I see. So no, I would not ...

6 years, 5 months ago (2014-07-09 23:47:10 UTC) #8

Ryan Sleevi

https://codereview.chromium.org/364943002/diff/40001/net/socket/ssl_client_socket_pool.cc File net/socket/ssl_client_socket_pool.cc (right): https://codereview.chromium.org/364943002/diff/40001/net/socket/ssl_client_socket_pool.cc#newcode116 net/socket/ssl_client_socket_pool.cc:116: &SSLConnectJobMessenger::OnJobSucceeded, weak_factory_.GetWeakPtr())); It would be good to document why ...

6 years, 5 months ago (2014-07-10 00:03:22 UTC) #11

mshelley

On 2014/07/09 23:47:10, Ryan Sleevi wrote: > On 2014/07/09 23:34:54, mshelley wrote: > > Oh ...

6 years, 5 months ago (2014-07-10 00:18:12 UTC) #12

Ryan Sleevi

On 2014/07/10 00:18:12, mshelley wrote: > Ok sorry I think I'm unintentionally running in circles ...

6 years, 5 months ago (2014-07-10 00:24:22 UTC) #13

mshelley

On 2014/07/10 00:24:22, Ryan Sleevi wrote: > On 2014/07/10 00:18:12, mshelley wrote: > > Ok ...

6 years, 5 months ago (2014-07-10 00:33:53 UTC) #14

mshelley

https://codereview.chromium.org/364943002/diff/40001/net/socket/ssl_client_socket_pool.cc File net/socket/ssl_client_socket_pool.cc (right): https://codereview.chromium.org/364943002/diff/40001/net/socket/ssl_client_socket_pool.cc#newcode116 net/socket/ssl_client_socket_pool.cc:116: &SSLConnectJobMessenger::OnJobSucceeded, weak_factory_.GetWeakPtr())); On 2014/07/10 00:03:22, Ryan Sleevi wrote: > ...

6 years, 5 months ago (2014-07-10 00:44:20 UTC) #15

https://codereview.chromium.org/364943002/diff/40001/net/socket/ssl_client_so...
File net/socket/ssl_client_socket_pool.cc (right):

https://codereview.chromium.org/364943002/diff/40001/net/socket/ssl_client_so...
net/socket/ssl_client_socket_pool.cc:116:
&SSLConnectJobMessenger::OnJobSucceeded, weak_factory_.GetWeakPtr()));
On 2014/07/10 00:03:22, Ryan Sleevi wrote:
> It would be good to document why you're using WeakPtr here, in that it's
> non-obvious (though it's what we discussed)

Done.

https://codereview.chromium.org/364943002/diff/40001/net/socket/ssl_client_so...
net/socket/ssl_client_socket_pool.cc:141:
std::vector<SocketAndCallback>::iterator it;
On 2014/07/10 00:03:22, Ryan Sleevi wrote:
> anal retentive: Would nice to have declaration order match assignment order
(it
> -> socket -> callback)

Done.

https://codereview.chromium.org/364943002/diff/40001/net/socket/ssl_client_so...
net/socket/ssl_client_socket_pool.cc:174:
base::ThreadTaskRunnerHandle::Get()->PostTask(FROM_HERE, it->callback);
On 2014/07/10 00:03:22, Ryan Sleevi wrote:
> Lifetime question: This seems another situation where we might have a callback
> de-register itself in between when the PostTask happened and it's actually
run.
> 
> Is that possible? Expected? While the callback-binder can always use WeakPtr
to
> handle an unexpected callback, it's "nicer" to provide clear lifetime
semantics
> about when/how things will be called (even if it is much harder)

I believe the only way that one of these callbacks could normally be
de-registered is if its SSLConnectJob was deleted. That shouldn't happen until
the SSLConnectJob has finished its connection. Since by definition these connect
jobs cannot finish their connections until their callback has been run, the
callbacks should not be de-registered. 

However, if for some reason the SSLConnectJobFactory was deleted before the jobs
finished connecting I guess that would be possible?

So I think my answer is it's possible but not expected.

https://codereview.chromium.org/364943002/diff/40001/net/socket/ssl_client_so...
File net/socket/ssl_client_socket_pool.h (right):

https://codereview.chromium.org/364943002/diff/40001/net/socket/ssl_client_so...
net/socket/ssl_client_socket_pool.h:15: #include
"base/thread_task_runner_handle.h"
On 2014/07/10 00:03:22, Ryan Sleevi wrote:
> This should be in the .cc, not the .h
> 
> In .h files, we want to keep the number of headers included as small as
> possible.

Done.

Ryan Sleevi

https://codereview.chromium.org/364943002/diff/60001/net/socket/ssl_client_socket_openssl.cc File net/socket/ssl_client_socket_openssl.cc (right): https://codereview.chromium.org/364943002/diff/60001/net/socket/ssl_client_socket_openssl.cc#newcode398 net/socket/ssl_client_socket_openssl.cc:398: // SSLClientSocketOpenSSL at the callback's runtime. Note: The request ...

6 years, 5 months ago (2014-07-10 02:34:25 UTC) #16

mshelley

https://codereview.chromium.org/364943002/diff/60001/net/socket/ssl_client_socket_openssl.cc File net/socket/ssl_client_socket_openssl.cc (right): https://codereview.chromium.org/364943002/diff/60001/net/socket/ssl_client_socket_openssl.cc#newcode398 net/socket/ssl_client_socket_openssl.cc:398: // SSLClientSocketOpenSSL at the callback's runtime. On 2014/07/10 02:34:25, ...

6 years, 5 months ago (2014-07-10 16:35:32 UTC) #17

Ryan Sleevi

Several comments unrelated to this CL, but I only noticed them in particular when reviewing ...

6 years, 5 months ago (2014-07-10 20:30:45 UTC) #18

Several comments unrelated to this CL, but I only noticed them in particular
when reviewing this. Please feel free to migrate into the appropriate CL and
rebase this, or create a separate CL for them.

https://codereview.chromium.org/364943002/diff/100001/net/socket/ssl_client_s...
File net/socket/ssl_client_socket_openssl.h (right):

https://codereview.chromium.org/364943002/diff/100001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_openssl.h:226: // will still be valid at its
runtime.
Sorry that it seems like I still wasn't clear.

The caller/user of the SSLClientSocket should never have to read to the data
members - that's documentation for the implementer/coder

What matters is the documentation on the public methods; in particular,
SetSocketFailureCallback (on the base class)

https://codereview.chromium.org/364943002/diff/100001/net/socket/ssl_client_s...
File net/socket/ssl_client_socket_pool.h (right):

https://codereview.chromium.org/364943002/diff/100001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.h:122: // SSL connection.
unrelated to this CL: (not sure which CL is the master for this, or if this has
landed, with all the CLs flying about)

// Returns true if |ssl_socket|'s Connect() method should be called.

https://codereview.chromium.org/364943002/diff/100001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.h:126: // upon the completion of
|ssl_socket|'s connection.
unrelated to this CL:

// Configure the SSLConnectJobMessenger to begin monitoring |ssl_socket|'s
// connection status. After a successful connection, or an error,
// the messenger will determine which sockets that have been added
// via AddPendingSocket() to allow to proceed, notifying the associated
// callback asynchronously, so that the caller can call Connect() on them.

https://codereview.chromium.org/364943002/diff/100001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.h:131: void AddPendingSocket(SSLClientSocket*
socket, const base::Closure& callback);
// Adds |socket| to the list of sockets waiting to Connect(). When
// the messenger has determined it's an appropriate time for |socket|
// to connect, it will asynchronously invoke |callback|. Callers should
// then proceed to call Connect() and establish the socket.
//
// Note: It is an error to call AddPendingSocket() without having first
// called MonitorConnectionResult() and configuring a socket that WILL
// have Connect() called on it.

https://codereview.chromium.org/364943002/diff/100001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.h:139: void OnJobFailed();
note: OnJobSucceeded and OnJobFailed both seem to only be used in base::Bind()
calls.

In that case, these methods should be private.

The same seems to be true for ConnectNewLeader (which is called by OnJobFailed)

mshelley

Note: I fixed the comments that were unrelated to this CL in the branch that ...

6 years, 5 months ago (2014-07-10 22:49:59 UTC) #19

Note: I fixed the comments that were unrelated to this CL in the branch that
this branch is dependent upon, because that seemed like the most appropriate
place to update them.

https://codereview.chromium.org/364943002/diff/100001/net/socket/ssl_client_s...
File net/socket/ssl_client_socket_openssl.h (right):

https://codereview.chromium.org/364943002/diff/100001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_openssl.h:226: // will still be valid at its
runtime.
On 2014/07/10 20:30:45, Ryan Sleevi wrote:
> Sorry that it seems like I still wasn't clear.
> 
> The caller/user of the SSLClientSocket should never have to read to the data
> members - that's documentation for the implementer/coder
> 
> What matters is the documentation on the public methods; in particular,
> SetSocketFailureCallback (on the base class)

Done.

https://codereview.chromium.org/364943002/diff/100001/net/socket/ssl_client_s...
File net/socket/ssl_client_socket_pool.h (right):

https://codereview.chromium.org/364943002/diff/100001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.h:122: // SSL connection.
On 2014/07/10 20:30:45, Ryan Sleevi wrote:
> unrelated to this CL: (not sure which CL is the master for this, or if this
has
> landed, with all the CLs flying about)
> 
> // Returns true if |ssl_socket|'s Connect() method should be called.

Done.

https://codereview.chromium.org/364943002/diff/100001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.h:126: // upon the completion of
|ssl_socket|'s connection.
On 2014/07/10 20:30:45, Ryan Sleevi wrote:
> unrelated to this CL:
> 
> // Configure the SSLConnectJobMessenger to begin monitoring |ssl_socket|'s
> // connection status. After a successful connection, or an error,
> // the messenger will determine which sockets that have been added
> // via AddPendingSocket() to allow to proceed, notifying the associated
> // callback asynchronously, so that the caller can call Connect() on them.

Done.

https://codereview.chromium.org/364943002/diff/100001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.h:131: void AddPendingSocket(SSLClientSocket*
socket, const base::Closure& callback);
On 2014/07/10 20:30:45, Ryan Sleevi wrote:
> // Adds |socket| to the list of sockets waiting to Connect(). When
> // the messenger has determined it's an appropriate time for |socket|
> // to connect, it will asynchronously invoke |callback|. Callers should
> // then proceed to call Connect() and establish the socket.
> //
> // Note: It is an error to call AddPendingSocket() without having first
> // called MonitorConnectionResult() and configuring a socket that WILL
> // have Connect() called on it.

Done.

https://codereview.chromium.org/364943002/diff/100001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.h:139: void OnJobFailed();
On 2014/07/10 20:30:45, Ryan Sleevi wrote:
> note: OnJobSucceeded and OnJobFailed both seem to only be used in base::Bind()
> calls.
> 
> In that case, these methods should be private.
> 
> The same seems to be true for ConnectNewLeader (which is called by
OnJobFailed)

OnJobFailed is called once directly in DoSSLConenctComplete, but I'll make the
others private.

Ryan Sleevi

Ok, I'm going to re-examine the dependent CLs and then get back to this, but ...

6 years, 5 months ago (2014-07-11 01:25:59 UTC) #20

wtc

Review comments on patch set 5: I suggest some changes to reduce the amount of ...

6 years, 5 months ago (2014-07-11 18:55:51 UTC) #21

Review comments on patch set 5:

I suggest some changes to reduce the amount of code we defer to posted tasks.

https://codereview.chromium.org/364943002/diff/120001/net/socket/ssl_client_s...
File net/socket/ssl_client_socket_openssl.cc (right):

https://codereview.chromium.org/364943002/diff/120001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_openssl.cc:399:
base::ThreadTaskRunnerHandle::Get()->PostTask(FROM_HERE, error_callback_);

IMPORTANT: Based on my understanding of the code, I suggest running this error
callback directly. This will allow us to remove the comment for
SetSocketFailureCallback, which would introduce a behavior uncommon in the
various Socket interfaces of our network stack:

 // NOTE: This callback may still be executed even if this socket has
 // been disconnected or deleted.

The error callback (SSLConnectJobMessenger::OnJobFailed) already posts a task.
So posting a task here would be redundant.

https://codereview.chromium.org/364943002/diff/120001/net/socket/ssl_client_s...
File net/socket/ssl_client_socket_pool.cc (right):

https://codereview.chromium.org/364943002/diff/120001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.cc:114: // SSLConnectJobMessenger weak_ptrs
are used here to ensure that

weak_ptrs => "weak pointers" or "WeakPtrs"

https://codereview.chromium.org/364943002/diff/120001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.cc:151: if (it->socket != NULL) {

IMPORTANT: this null check for |it->socket| doesn't work because |it->socket| is
not a weak pointer. |it->socket| comes from the messenger_->AddPendingSocket()
on line 438. It won't magically become null.

You'll need to write the code to remove a socket from
pending_sockets_and_callbacks_ before the socket is deleted.

https://codereview.chromium.org/364943002/diff/120001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.cc:169: callback.Run();

IMPORTANT: I have to admit I don't understand why ConnectNewLeader should be run
as a task. If it is necessary to do so, it seems that only this very last line
(to run the resumption callback of the new leader) needs to be run as a task.

Contrast this with OnJobSucceeded. OnJobSucceeded does everything directly
except the last line (RunAllJobs), which posts the resumption callbacks of all
pending sockets as tasks. So it seems that OnJobFailed should also be able to do
everything directly except for the resumption callback of the new leader.

https://codereview.chromium.org/364943002/diff/120001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.cc:180: task_runner->PostTask(FROM_HERE,
it->callback);

Please add curly braces because the loop constraint part of the for loop spans
multiple lines. This is a local style preference of the network stack code.

mshelley

https://codereview.chromium.org/364943002/diff/120001/net/socket/ssl_client_socket.h File net/socket/ssl_client_socket.h (right): https://codereview.chromium.org/364943002/diff/120001/net/socket/ssl_client_socket.h#newcode103 net/socket/ssl_client_socket.h:103: // This callback should be run regardless of whether ...

6 years, 5 months ago (2014-07-14 20:30:07 UTC) #22

https://codereview.chromium.org/364943002/diff/120001/net/socket/ssl_client_s...
File net/socket/ssl_client_socket.h (right):

https://codereview.chromium.org/364943002/diff/120001/net/socket/ssl_client_s...
net/socket/ssl_client_socket.h:103: // This callback should be run regardless of
whether or not this socket
On 2014/07/11 01:25:59, Ryan Sleevi wrote:
> s/should be/will be/
> 
> Reworded though:
> 
> // NOTE: This callback may still be executed even if this socket has
> // been disconnected or deleted.

Done.

https://codereview.chromium.org/364943002/diff/120001/net/socket/ssl_client_s...
File net/socket/ssl_client_socket_openssl.cc (right):

https://codereview.chromium.org/364943002/diff/120001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_openssl.cc:399:
base::ThreadTaskRunnerHandle::Get()->PostTask(FROM_HERE, error_callback_);
On 2014/07/11 18:55:51, wtc wrote:
> 
> IMPORTANT: Based on my understanding of the code, I suggest running this error
> callback directly. This will allow us to remove the comment for
> SetSocketFailureCallback, which would introduce a behavior uncommon in the
> various Socket interfaces of our network stack:
> 
>  // NOTE: This callback may still be executed even if this socket has
>  // been disconnected or deleted.
> 
> The error callback (SSLConnectJobMessenger::OnJobFailed) already posts a task.
> So posting a task here would be redundant.

Done.

https://codereview.chromium.org/364943002/diff/120001/net/socket/ssl_client_s...
File net/socket/ssl_client_socket_openssl.h (right):

https://codereview.chromium.org/364943002/diff/120001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_openssl.h:226: // will still be valid at its
runtime.
On 2014/07/11 01:25:59, Ryan Sleevi wrote:
> drop this comment (225-226)

Done.

https://codereview.chromium.org/364943002/diff/120001/net/socket/ssl_client_s...
File net/socket/ssl_client_socket_pool.cc (right):

https://codereview.chromium.org/364943002/diff/120001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.cc:114: // SSLConnectJobMessenger weak_ptrs
are used here to ensure that
On 2014/07/11 18:55:51, wtc wrote:
> 
> weak_ptrs => "weak pointers" or "WeakPtrs"

Done.

https://codereview.chromium.org/364943002/diff/120001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.cc:151: if (it->socket != NULL) {
On 2014/07/11 18:55:51, wtc wrote:
> 
> IMPORTANT: this null check for |it->socket| doesn't work because |it->socket|
is
> not a weak pointer. |it->socket| comes from the messenger_->AddPendingSocket()
> on line 438. It won't magically become null.
> 
> You'll need to write the code to remove a socket from
> pending_sockets_and_callbacks_ before the socket is deleted.

This has been corrected now in https://codereview.chromium.org/353713005/.

https://codereview.chromium.org/364943002/diff/120001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.cc:164: // be connected next.
On 2014/07/11 01:25:59, Ryan Sleevi wrote:
> comment formating

Done.

https://codereview.chromium.org/364943002/diff/120001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.cc:169: callback.Run();
On 2014/07/11 18:55:51, wtc wrote:
> 
> IMPORTANT: I have to admit I don't understand why ConnectNewLeader should be
run
> as a task. If it is necessary to do so, it seems that only this very last line
> (to run the resumption callback of the new leader) needs to be run as a task.
> 
> Contrast this with OnJobSucceeded. OnJobSucceeded does everything directly
> except the last line (RunAllJobs), which posts the resumption callbacks of all
> pending sockets as tasks. So it seems that OnJobFailed should also be able to
do
> everything directly except for the resumption callback of the new leader.

The original concern was this: 

Say I have pending sockets a, b, and c. I post a task connecting a. By the time
the task runs, a has been deleted for some reason. This results in the task not
running, and sockets b and c never resuming their connections. So basically I
don't want there to be any delay between accessing a socket from
pending_sockets_and_callbacks_, and running that socket's callback.

With OnJobSucceeded this isn't a concern, because by definition there are not
pending sockets dependent upon the connection of the sockets being connected
with post task in OnJobSucceeded.

https://codereview.chromium.org/364943002/diff/120001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.cc:180: task_runner->PostTask(FROM_HERE,
it->callback);
On 2014/07/11 18:55:51, wtc wrote:
> 
> Please add curly braces because the loop constraint part of the for loop spans
> multiple lines. This is a local style preference of the network stack code.

Done.

wtc

Review comments on patch set 6: https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_socket.h File net/socket/ssl_client_socket.h (right): https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_socket.h#newcode104 net/socket/ssl_client_socket.h:104: // been disconnected ...

6 years, 5 months ago (2014-07-14 22:39:19 UTC) #23

wtc

Additional comments on patch set 6: https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_socket_pool.cc File net/socket/ssl_client_socket_pool.cc (right): https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_socket_pool.cc#newcode156 net/socket/ssl_client_socket_pool.cc:156: callback.Run(); I believe ...

6 years, 5 months ago (2014-07-14 23:57:31 UTC) #24

Ryan Sleevi

https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_socket_pool.cc File net/socket/ssl_client_socket_pool.cc (right): https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_socket_pool.cc#newcode138 net/socket/ssl_client_socket_pool.cc:138: base::ThreadTaskRunnerHandle::Get()->PostTask(FROM_HERE, callback); On 2014/07/14 22:39:18, wtc wrote: > > ...

6 years, 5 months ago (2014-07-15 00:48:11 UTC) #25

wtc

https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_socket_pool.cc File net/socket/ssl_client_socket_pool.cc (right): https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_socket_pool.cc#newcode138 net/socket/ssl_client_socket_pool.cc:138: base::ThreadTaskRunnerHandle::Get()->PostTask(FROM_HERE, callback); Ryan and I discussed this in person. ...

6 years, 5 months ago (2014-07-15 21:27:09 UTC) #26

wtc

https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_socket_pool.cc File net/socket/ssl_client_socket_pool.cc (right): https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_socket_pool.cc#newcode138 net/socket/ssl_client_socket_pool.cc:138: base::ThreadTaskRunnerHandle::Get()->PostTask(FROM_HERE, callback); The purpose of the PostTask should be ...

6 years, 5 months ago (2014-07-16 21:48:13 UTC) #27

Ryan Sleevi

https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_socket_pool.cc File net/socket/ssl_client_socket_pool.cc (right): https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_socket_pool.cc#newcode162 net/socket/ssl_client_socket_pool.cc:162: base::ThreadTaskRunnerHandle::Get(); Use a SequencedTaskRunner here instead. Or just TaskRunner. ...

6 years, 5 months ago (2014-07-16 21:54:43 UTC) #28

mshelley

https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_socket.h File net/socket/ssl_client_socket.h (right): https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_socket.h#newcode104 net/socket/ssl_client_socket.h:104: // been disconnected or deleted. On 2014/07/14 22:39:18, wtc ...

6 years, 5 months ago (2014-07-17 16:28:04 UTC) #29

https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_s...
File net/socket/ssl_client_socket.h (right):

https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_s...
net/socket/ssl_client_socket.h:104: // been disconnected or deleted.
On 2014/07/14 22:39:18, wtc wrote:
> 
> Delete this comment.

Done.

https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_s...
File net/socket/ssl_client_socket_openssl.cc (right):

https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_openssl.cc:19: #include
"base/thread_task_runner_handle.h"
On 2014/07/14 22:39:18, wtc wrote:
> 
> Delete this line.

Done.

https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_s...
File net/socket/ssl_client_socket_pool.cc (right):

https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.cc:138:
base::ThreadTaskRunnerHandle::Get()->PostTask(FROM_HERE, callback);
On 2014/07/14 22:39:18, wtc wrote:
> 
> I still think we can run ConnectNewLeader directly, and ConnectNewLeader just
> needs to post the callback of the new leader as a task.

Done.

https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.cc:151: return;
On 2014/07/14 22:39:18, wtc wrote:
> 
> BUG: Check |it| first, before dereferencing |it|:
> 
>   std::vector<SocketAndCallback>::iterator it =
>       pending_sockets_and_callbacks_.begin();
> 
>   // If there were no valid pending sockets, return.
>   if (it == pending_sockets_and_callbacks_.end())
>     return;
> 
>   SSLClientSocket* ssl_socket = it->socket;
>   base::Closure& callback = it->callback;
> 
>   pending_sockets_and_callbacks_.erase(it);
> 
>   MonitorConnectionResult(ssl_socket);
>   callback.Run();
> 
> Alternatively you can use the empty() method:
> 
>   // If there were no valid pending sockets, return.
>   if (pending_sockets_and_callbacks_.empty())
>     return;
> 
>   std::vector<SocketAndCallback>::iterator it =
>       pending_sockets_and_callbacks_.begin();
>   SSLClientSocket* ssl_socket = it->socket;
>   base::Closure& callback = it->callback;
>   pending_sockets_and_callbacks_.erase(it);
> 
>   MonitorConnectionResult(ssl_socket);
>   callback.Run();

Done.

https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.cc:156: callback.Run();
On 2014/07/14 23:57:31, wtc wrote:
> 
> I believe that if the SSLClientSocket destructor (or rather,
> SSLClientSocket::Disconnect) calls the SocketFailure callback, then it is safe
> to post |callback| as a task here.

Done.

https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.cc:162: base::ThreadTaskRunnerHandle::Get();
On 2014/07/16 21:54:43, Ryan Sleevi wrote:
> Use a SequencedTaskRunner here instead. Or just TaskRunner. There's no need to
> "couple" to the STTR here, even if that's what TTRH::Get returns
> 
> (Think of the type expression as a set of "Here are my requirements". Your
only
> requirement here is that tasks run)

Done.

https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.cc:167: task_runner->PostTask(FROM_HERE,
it->callback);
On 2014/07/16 21:54:43, Ryan Sleevi wrote:
> On 2014/07/16 21:48:12, wtc wrote:
> > 
> > Based on my understanding of the problem (the somewhat recursive call
> sequence),
> > I think it is fine for RunAllJobs to run the callbacks directly. The reason
is
> > that the sockets in question don't have any SocketSuccess or SocketFailure
> > callbacks set, so they won't call back to the Messenger.
> 
> I think this requires some clarifications, and as I mentioned to wtc, I do
find
> this slightly easier to reason about with threading, even though it's slightly
> less performant.
> 
> You can end up deleting the Messenger, by virtue of a socket operation then
> causing the pool to be deleted, which then deletes the messenger.
> 
> If some code in Messenger was responsible for calling RunAllTasks (meaning
it's
> not the top of the MessageLoop task execution), then that code might access
> member variables, and thus crash (because the Messenger was since deleted).
> 
> This is not to say it's unsolvable, but it does mean that one has a harder
time
> reasoning about the safety of this implementation, in that you have to then go
> validate these invariants.
> 
> > 
> > Since this is the common code path (called by OnJobSucceeded), running the
> > callbacks directly will improve performance.
> 

So if I understand correctly, the reason to continue using PostTask here is as
follows: 

Say I have 2 pending sockets. I directly run the callback of the first socket.
Somehow running that callback deletes the messenger. I try to return to my
for-loop after running that callback so that I can run the next callback, but
the messenger has already been deleted and things crash. 

If I use post task, this won't be an issue because I won't actually be executing
the callbacks inside of the for loop, so they won't be able to delete the
messenger until after the messenger has finished doing everything that it needs
to do.

Based upon that understanding, it makes more sense to me to continue using
PostTask despite the performance drawbacks. Thoughts?

https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.cc:445: messenger_->OnJobFailed();
On 2014/07/14 23:57:31, wtc wrote:
> 
> Delete these two lines. Then OnJobFailed can be private.

Done.

https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_s...
File net/socket/ssl_client_socket_pool.h (right):

https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.h:144: void OnJobFailed();
On 2014/07/14 22:39:18, wtc wrote:
> 
> Make the OnJobFailed method be private. Declare it after OnJobSucceeded.

Done.

wtc

https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_socket_pool.cc File net/socket/ssl_client_socket_pool.cc (right): https://codereview.chromium.org/364943002/diff/140001/net/socket/ssl_client_socket_pool.cc#newcode167 net/socket/ssl_client_socket_pool.cc:167: task_runner->PostTask(FROM_HERE, it->callback); On 2014/07/17 16:28:03, mshelley wrote: > > ...

6 years, 5 months ago (2014-07-18 01:31:57 UTC) #30

Ryan Sleevi

https://codereview.chromium.org/364943002/diff/160001/net/socket/ssl_client_socket_pool.cc File net/socket/ssl_client_socket_pool.cc (right): https://codereview.chromium.org/364943002/diff/160001/net/socket/ssl_client_socket_pool.cc#newcode134 net/socket/ssl_client_socket_pool.cc:134: // SSLConnectJobMessenger should become invalid before they're run. Probably ...

6 years, 5 months ago (2014-07-18 23:08:35 UTC) #31

mshelley

https://codereview.chromium.org/364943002/diff/160001/net/socket/ssl_client_socket_pool.cc File net/socket/ssl_client_socket_pool.cc (right): https://codereview.chromium.org/364943002/diff/160001/net/socket/ssl_client_socket_pool.cc#newcode134 net/socket/ssl_client_socket_pool.cc:134: // SSLConnectJobMessenger should become invalid before they're run. On ...

6 years, 5 months ago (2014-07-21 23:08:29 UTC) #32

mmenke

https://codereview.chromium.org/364943002/diff/180001/net/socket/ssl_client_socket.h File net/socket/ssl_client_socket.h (right): https://codereview.chromium.org/364943002/diff/180001/net/socket/ssl_client_socket.h#newcode108 net/socket/ssl_client_socket.h:108: virtual void OnSocketFailure() = 0; Don't believe this belongs ...

6 years, 5 months ago (2014-07-22 18:06:32 UTC) #33

mshelley

https://codereview.chromium.org/364943002/diff/180001/net/socket/ssl_client_socket.h File net/socket/ssl_client_socket.h (right): https://codereview.chromium.org/364943002/diff/180001/net/socket/ssl_client_socket.h#newcode108 net/socket/ssl_client_socket.h:108: virtual void OnSocketFailure() = 0; On 2014/07/22 18:06:32, mmenke ...

6 years, 5 months ago (2014-07-23 22:09:08 UTC) #34

Ryan Sleevi

I think this LGTM, but see the comments on the CL this is based on ...

6 years, 5 months ago (2014-07-25 21:31:44 UTC) #35

wtc

Review comments on patch set 10: Since the CL has gone through several revisions, please ...

6 years, 4 months ago (2014-08-05 00:44:57 UTC) #36

Ryan Sleevi

6 years, 4 months ago (2014-08-06 02:15:50 UTC) #37

https://codereview.chromium.org/364943002/diff/220001/net/socket/ssl_client_s...
File net/socket/ssl_client_socket_pool.cc (right):

https://codereview.chromium.org/364943002/diff/220001/net/socket/ssl_client_s...
net/socket/ssl_client_socket_pool.cc:151: task_runner->PostTask(FROM_HERE,
it->callback);
On 2014/08/05 00:44:57, wtc wrote:
> 
> 1. I think running the callbacks directly is safe and improves performance.
But
> I won't block this change.
> 
> We don't call SetHandshakeCompletionCallback on any of these pending sockets,
so
> the callbacks won't call the SSLConnectJobMessenger. Therefore, the "somewhat
> recursive call sequence" you mentioned in the CL's description should not
occur.

I don't think this is entirely true. Any one of these callbacks could create a
new SSLSocket, which would potentially fall back into this messenger. Any number
of side-effects could occur.

Now, whether or not they do in this CL is irrelevant, I think - it's more about
making sure the API contract is consistent.

> 
> 2. Please add a comment to explain why it is necessary or useful to post tasks
> to run the callbacks later.

Issue 364943002: Makes waiting SSLConnectJobs use the message loops to resume their connection. (Closed)

Description

Patch Set 1 : #

Patch Set 2 : Changed messenger's handling of failed jobs, and switched to use ThreadTaskRunnerHandle #

Patch Set 3 : Added documentation and fixed assignment order. #

Patch Set 4 : Moved documentation and switched to use a temp var of the task_runner. #

Patch Set 5 : Moved documentation and made some methods private. #

Patch Set 6 : Changed OnSocketFailure callback to run directly & fixed comments. #

Patch Set 7 : Changed handling of on JobFailed. #

Patch Set 8 : Fixed use after free bug & other cl comments. #

Patch Set 9 : Deleted unnessecary headers and removed unneeded declaration. #

Patch Set 10 : Rebase #

Messages