Issue 2888623011: Fix for HTTP2 request hanging bug.

mmenke

The CQ bit was checked by mmenke@chromium.org to run a CQ dry run

3 years, 7 months ago (2017-05-18 18:31:04 UTC) #1

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2888623011/1

3 years, 7 months ago (2017-05-18 18:33:14 UTC) #2

mmenke

Description was changed from ========== Fix for HTTP2 request hanging bug. If, when a socket ...

3 years, 7 months ago (2017-05-18 18:56:20 UTC) #3

mmenke

mmenke@chromium.org changed reviewers: + davidben@chromium.org

3 years, 7 months ago (2017-05-18 20:00:19 UTC) #4

mmenke

[davidben]: PTAL. Since things are notified of request completion asynchronously, having a loop here shouldn't ...

3 years, 7 months ago (2017-05-18 20:00:20 UTC) #5

davidben

Okay, I think I understand what's going on? Two questions. https://codereview.chromium.org/2888623011/diff/1/net/socket/client_socket_pool_base.cc File net/socket/client_socket_pool_base.cc (left): https://codereview.chromium.org/2888623011/diff/1/net/socket/client_socket_pool_base.cc#oldcode934 ...

3 years, 7 months ago (2017-05-18 20:22:51 UTC) #6

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 7 months ago (2017-05-18 20:25:04 UTC) #7

commit-bot: I haz the power

Dry run: Try jobs failed on following builders: linux_chromium_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_rel_ng/builds/459157)

3 years, 7 months ago (2017-05-18 20:25:05 UTC) #8

mmenke

https://codereview.chromium.org/2888623011/diff/1/net/socket/client_socket_pool_base.cc File net/socket/client_socket_pool_base.cc (left): https://codereview.chromium.org/2888623011/diff/1/net/socket/client_socket_pool_base.cc#oldcode934 net/socket/client_socket_pool_base.cc:934: // the looping we leave it at this. On ...

3 years, 7 months ago (2017-05-18 20:44:57 UTC) #9

https://codereview.chromium.org/2888623011/diff/1/net/socket/client_socket_po...
File net/socket/client_socket_pool_base.cc (left):

https://codereview.chromium.org/2888623011/diff/1/net/socket/client_socket_po...
net/socket/client_socket_pool_base.cc:934: //        the looping we leave it at this.
On 2017/05/18 20:22:51, davidben wrote:
> Previously we tried to avoid infinite looping here. Should that be a concern? I
> guess the assumption is that, since we invoke all user callbacks here on a
> PostTask, we can only drain through our existing set of requests and not grow
> new ones in processing?

Correct.  Because of the PostTask, I can't see how we'd get into an infinite
loop or have re-entrancy issues here.  Also note that we're already potentially
creating two ConnectJobs in a row, since we always call this method after
another OnAvailableSocketSlot call.

I'm also not seeing how we'd end up looping more than once, in general, other
than on sync failure/success.
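
A rough sketch of why the PostTask keeps the loop bounded. FakeTaskQueue and
FakePool are hypothetical stand-ins, not the real net/ classes; the only point
is that a callback which queues a new request cannot run until the drain loop
has already finished.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for a message loop. Not Chromium's task runner.
class FakeTaskQueue {
 public:
  void PostTask(std::function<void()> task) {
    tasks_.push_back(std::move(task));
  }

  // Runs the oldest queued task. Returns false if the queue was empty.
  bool RunOne() {
    if (tasks_.empty())
      return false;
    std::function<void()> task = std::move(tasks_.front());
    tasks_.pop_front();
    task();
    return true;
  }

 private:
  std::deque<std::function<void()>> tasks_;
};

// Hypothetical pool that completes requests by posting their callbacks
// instead of invoking them synchronously.
class FakePool {
 public:
  explicit FakePool(FakeTaskQueue* queue) : queue_(queue) {
    assert(queue_ != nullptr);
  }

  void AddRequest(std::function<void()> on_complete) {
    pending_.push_back(std::move(on_complete));
  }

  // Drains the pending requests and returns how many were completed.
  // No callback runs inside this loop, so nothing can add a request
  // while it is iterating.
  int ProcessPendingRequests() {
    int completed = 0;
    while (!pending_.empty()) {
      std::function<void()> callback = std::move(pending_.front());
      pending_.pop_front();
      queue_->PostTask(std::move(callback));
      ++completed;
    }
    return completed;
  }

  std::size_t pending_count() const { return pending_.size(); }

 private:
  FakeTaskQueue* queue_;
  std::deque<std::function<void()>> pending_;
};
```

If the callbacks were invoked directly inside the while loop instead, a
callback that calls AddRequest() would feed the loop it is running in, which is
the case the old comment was guarding against.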

https://codereview.chromium.org/2888623011/diff/1/net/socket/client_socket_po...
File net/socket/client_socket_pool_base.cc (right):

https://codereview.chromium.org/2888623011/diff/1/net/socket/client_socket_po...
net/socket/client_socket_pool_base.cc:916: break;
On 2017/05/18 20:22:51, davidben wrote:
> What's causing the socket pools to react to the free socket slot in this code?

Destroying the socket will do that.

So suppose this is the SSL pool, and the client socket pool is full. 
SSLClientSockets own a ClientSocketHandle, which owns a TCPClientSocket.  So
deleting the idle SSLClientSocket will delete the ClientSocketHandle, which will
free up a slot in the lower layer pool.  Which will then call its own
ProcessPendingRequest and CheckForStalledSocketGroups functions.
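
To make the ownership chain concrete, here is a simplified sketch.  All of the
Fake* types and MakeSslSocket are hypothetical and only mirror the shape
described above (SSL socket owns handle, handle owns transport socket, handle
destruction returns the slot to the lower pool); the real classes do a lot
more.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical lower layer pool with a fixed number of socket slots.
class FakeLowerPool {
 public:
  explicit FakeLowerPool(int max_sockets) : max_sockets_(max_sockets) {}

  // Takes a slot if one is free; otherwise records a stalled request.
  bool TryAcquireSlot() {
    if (in_use_ >= max_sockets_) {
      ++stalled_requests_;
      return false;
    }
    ++in_use_;
    return true;
  }

  // Called when a handle is destroyed. Frees the slot, then hands it to one
  // stalled request if there is one (standing in for the pool processing
  // its pending requests).
  void ReleaseSlot() {
    assert(in_use_ > 0);
    --in_use_;
    if (stalled_requests_ > 0) {
      --stalled_requests_;
      ++in_use_;
    }
  }

  int in_use() const { return in_use_; }
  int stalled_requests() const { return stalled_requests_; }

 private:
  const int max_sockets_;
  int in_use_ = 0;
  int stalled_requests_ = 0;
};

// Hypothetical transport socket; carries no state in this sketch.
class FakeTransportSocket {};

// Hypothetical handle: owns the transport socket and returns its slot to
// the lower pool on destruction.
class FakeHandle {
 public:
  explicit FakeHandle(FakeLowerPool* pool)
      : pool_(pool), socket_(new FakeTransportSocket()) {}
  ~FakeHandle() { pool_->ReleaseSlot(); }

 private:
  FakeLowerPool* pool_;
  std::unique_ptr<FakeTransportSocket> socket_;
};

// Hypothetical SSL socket: owns the handle, so deleting it deletes the
// handle and everything underneath.
class FakeSslSocket {
 public:
  explicit FakeSslSocket(std::unique_ptr<FakeHandle> handle)
      : handle_(std::move(handle)) {}

 private:
  std::unique_ptr<FakeHandle> handle_;
};

// Returns nullptr (and leaves a stalled request behind) if the lower pool
// has no free slot.
std::unique_ptr<FakeSslSocket> MakeSslSocket(FakeLowerPool* pool) {
  if (!pool->TryAcquireSlot())
    return nullptr;
  std::unique_ptr<FakeHandle> handle(new FakeHandle(pool));
  return std::unique_ptr<FakeSslSocket>(new FakeSslSocket(std::move(handle)));
}
```

Resetting the only live FakeSslSocket on a one-slot pool that has a stalled
request ends with the slot handed to that request, with no explicit
notification from the higher layer.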

Hrm...there's another question:  Should we also do a loop for checking for
stalled lower layer pools?  Think it's beyond the scope of this CL, but worth
figuring out.

davidben

lgtm https://codereview.chromium.org/2888623011/diff/1/net/socket/client_socket_pool_base.cc File net/socket/client_socket_pool_base.cc (right): https://codereview.chromium.org/2888623011/diff/1/net/socket/client_socket_pool_base.cc#newcode916 net/socket/client_socket_pool_base.cc:916: break; On 2017/05/18 20:44:57, mmenke wrote: > On ...

3 years, 7 months ago (2017-05-18 20:52:31 UTC) #10

mmenke

https://codereview.chromium.org/2888623011/diff/1/net/socket/client_socket_pool_base.cc File net/socket/client_socket_pool_base.cc (right): https://codereview.chromium.org/2888623011/diff/1/net/socket/client_socket_pool_base.cc#newcode916 net/socket/client_socket_pool_base.cc:916: break; On 2017/05/18 20:52:31, davidben wrote: > On 2017/05/18 ...

3 years, 7 months ago (2017-05-18 21:00:43 UTC) #11

https://codereview.chromium.org/2888623011/diff/1/net/socket/client_socket_po...
File net/socket/client_socket_pool_base.cc (right):

https://codereview.chromium.org/2888623011/diff/1/net/socket/client_socket_po...
net/socket/client_socket_pool_base.cc:916: break;
On 2017/05/18 20:52:31, davidben wrote:
> On 2017/05/18 20:44:57, mmenke wrote:
> > On 2017/05/18 20:22:51, davidben wrote:
> > > What's causing the socket pools to react to the free socket slot in this
> > > code?
> >
> > Destroying the socket will do that.
> >
> > So suppose this is the SSL pool, and the client socket pool is full.
> > SSLClientSockets own a ClientSocketHandle, which owns a TCPClientSocket.  So
> > deleting the idle SSLClientSocket will delete the ClientSocketHandle, which
> > will free up a slot in the lower layer pool.  Which will then call its own
> > ProcessPendingRequest and CheckForStalledSocketGroups functions.
> >
> > Hrm...there's another question:  Should we also do a loop for checking for
> > stalled lower layer pools?  Think it's beyond the scope of this CL, but
> > worth figuring out.
> 
> I guess the vague analogy to our sync-return || ERR_IO_PENDING calling
> convention would be that all of these functions should return a signal whether
> they synchronously did something and expect you to loop or whether they'll
> handle all their obligations asynchronously.

Ideally, we really should only have at most one (new) idle socket when there's a
higher layer stalled group, so only one call is needed.  I'm skeptical that's
currently the case, though, so yeah, that's one approach.

We could also just look at idle_socket_count(), like we do below, and then do:

if (idle_socket_count() == 0)
  return;
for (std::set<LowerLayeredPool*>::iterator it = lower_pools_.begin();
     it != lower_pools_.end(); ++it) {
  while ((*it)->IsStalled()) {
    CloseOneIdleSocket();
    if (idle_socket_count() == 0)
      return;
  }
}
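
The loop above can be exercised against stand-in types to check that it
terminates.  This is only a sketch under assumptions: FakeLowerPool,
FakeHigherPool, OnSlotFreed and CloseIdleSocketsForStalledLowerPools are
hypothetical names, and each idle socket is assumed to free exactly one slot in
the lower pool backing it.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical lower layer pool that only tracks its stalled requests.
class FakeLowerPool {
 public:
  explicit FakeLowerPool(int stalled_requests)
      : stalled_requests_(stalled_requests) {}

  bool IsStalled() const { return stalled_requests_ > 0; }

  // A freed slot lets one stalled request proceed.
  void OnSlotFreed() {
    if (stalled_requests_ > 0)
      --stalled_requests_;
  }

  int stalled_requests() const { return stalled_requests_; }

 private:
  int stalled_requests_;
};

// Hypothetical higher layer pool holding idle sockets, each recorded by the
// lower pool whose slot it occupies.
class FakeHigherPool {
 public:
  void AddLowerPool(FakeLowerPool* pool) { lower_pools_.insert(pool); }
  void AddIdleSocket(FakeLowerPool* backing) {
    idle_sockets_.push_back(backing);
  }

  int idle_socket_count() const {
    return static_cast<int>(idle_sockets_.size());
  }

  // Closes one idle socket and frees its slot. Returns false if none exist.
  bool CloseOneIdleSocket() {
    if (idle_sockets_.empty())
      return false;
    FakeLowerPool* backing = idle_sockets_.back();
    idle_sockets_.pop_back();
    backing->OnSlotFreed();
    return true;
  }

  // Same shape as the proposed loop. Every pass through the inner loop
  // closes an idle socket, so idle_socket_count() strictly decreases and
  // the loop ends either when no lower pool is stalled or when no idle
  // sockets remain.
  void CloseIdleSocketsForStalledLowerPools() {
    if (idle_socket_count() == 0)
      return;
    for (FakeLowerPool* pool : lower_pools_) {
      while (pool->IsStalled()) {
        CloseOneIdleSocket();
        if (idle_socket_count() == 0)
          return;
      }
    }
  }

 private:
  std::set<FakeLowerPool*> lower_pools_;
  std::vector<FakeLowerPool*> idle_sockets_;
};
```

With more than one lower pool, the socket that gets closed may belong to a pool
other than the stalled one, in which case the loop just keeps closing idle
sockets until the stall clears or none are left.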

commit-bot: I haz the power

CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2888623011/1

3 years, 7 months ago (2017-05-18 21:05:31 UTC) #13

commit-bot: I haz the power

CQ is committing da patch. Bot data: {"patchset_id": 1, "attempt_start_ts": 1495141384588740, "parent_rev": "40ab4cb23bce1e6502719e82c3b058021cb149dc", "commit_rev": "9d72fe409a214856ec4d9700105b4664bb4c0861"}

3 years, 7 months ago (2017-05-18 22:36:20 UTC) #14

commit-bot: I haz the power

Description was changed from ========== Fix for HTTP2 request hanging bug. If, when a socket ...

3 years, 7 months ago (2017-05-18 22:36:31 UTC) #15

commit-bot: I haz the power

3 years, 7 months ago (2017-05-18 22:36:32 UTC) #16

Message was sent while issue was closed.

Committed patchset #1 (id:1) as
https://chromium.googlesource.com/chromium/src/+/9d72fe409a214856ec4d9700105b...
