|
|
Chromium Code Reviews|
Created:
3 years, 9 months ago by Zhongyi Shi Modified:
3 years, 9 months ago CC:
chromium-reviews, cbentzel+watch_chromium.org, net-reviews_chromium.org, erikchen, xunjieli Target Ref:
refs/heads/master Project:
chromium Visibility:
Public. |
DescriptionQUIC: mark QUIC handshake failed if connection is closed after CryptoConnect
so that QuicStreamFactory::Job will not hang.
Currently, if QuicConnection is closed during crypto_stream_->CryptConnect(),
it will silently post a task to notify QuicStreamFactory to of the session
close without notifying the QuicStreamFactory::Job. This causes
QuicStreamFactory::Job to hang. Subsequent alt jobs will be bound to the
hanging QUIC job, and will never win race over TCP and get cleaned up.
BUG=700617
Review-Url: https://codereview.chromium.org/2766603004
Cr-Commit-Position: refs/heads/master@{#459271}
Committed: https://chromium.googlesource.com/chromium/src/+/363c91c57a5ae99e63a138555afa83dc02262325
Patch Set 1 #
Total comments: 9
Patch Set 2 : address rch's comments, add tests retry Request after failed crypto connect #Patch Set 3 : address rch's comments #
Total comments: 6
Patch Set 4 : Test retrying requests won't hang in QuicStreamFactoryTest #
Total comments: 4
Patch Set 5 : nits #Patch Set 6 : disable QuicUploadToAlternativeProxyServer #
Total comments: 7
Patch Set 7 : address xunjieli's comments #
Total comments: 2
Patch Set 8 : nits #Patch Set 9 : add a TODO #
Messages
Total messages: 35 (17 generated)
zhongyi@chromium.org changed reviewers: + rch@chromium.org
Ryan, ptal! I will add one more regression test in QuicNetworkTransactionTests if this patch looks good in general. Thanks!
Nice! https://codereview.chromium.org/2766603004/diff/1/net/quic/chromium/mock_cryp... File net/quic/chromium/mock_crypto_client_stream_factory.h (right): https://codereview.chromium.org/2766603004/diff/1/net/quic/chromium/mock_cryp... net/quic/chromium/mock_crypto_client_stream_factory.h:56: // MockCryptoClientStream in tests. nit: Instead of adding a new member for this which, if true, causes handshake_mode_ to be ignored, let's just add a new handshake mode for this purpose. https://codereview.chromium.org/2766603004/diff/1/net/quic/chromium/quic_chro... File net/quic/chromium/quic_chromium_client_session.cc (right): https://codereview.chromium.org/2766603004/diff/1/net/quic/chromium/quic_chro... net/quic/chromium/quic_chromium_client_session.cc:684: return ERR_QUIC_HANDSHAKE_FAILED; Please move this to immediately after the call to CryptoConnect(). https://codereview.chromium.org/2766603004/diff/1/net/quic/chromium/quic_stre... File net/quic/chromium/quic_stream_factory_test.cc (right): https://codereview.chromium.org/2766603004/diff/1/net/quic/chromium/quic_stre... net/quic/chromium/quic_stream_factory_test.cc:1826: EXPECT_EQ(ERR_IO_PENDING, I'm a bit surprised that this returns ERR_IO_PENDING instead of QUIC_HANDSHAKE_FAILED. Is that because DNS resolution (in this test) in async? If so, can you make another version of this test which does host_resolver_.set_synchronous_mode(true); (look for other uses in this file) and verify that it fails immediately? https://codereview.chromium.org/2766603004/diff/1/net/quic/chromium/quic_stre... net/quic/chromium/quic_stream_factory_test.cc:1832: EXPECT_FALSE(HasActiveJob(host_port_pair_, privacy_mode_)); If I understand the bug correctly, in the old code, if we had tried to make a new request at this point, the request would have hung. So instead of checking with the peer that there is no active job can you make another request, and this time let it succeed?
Thanks Ryan for the quick review. Add one more tests to do synchronous DNS resolution and retry the QUIC request after the first one failed with QUIC_HANDSHAKE_FAILED. PTAL! https://codereview.chromium.org/2766603004/diff/1/net/quic/chromium/quic_chro... File net/quic/chromium/quic_chromium_client_session.cc (right): https://codereview.chromium.org/2766603004/diff/1/net/quic/chromium/quic_chro... net/quic/chromium/quic_chromium_client_session.cc:684: return ERR_QUIC_HANDSHAKE_FAILED; On 2017/03/22 02:38:48, Ryan Hamilton wrote: > Please move this to immediately after the call to CryptoConnect(). Done. https://codereview.chromium.org/2766603004/diff/1/net/quic/chromium/quic_stre... File net/quic/chromium/quic_stream_factory_test.cc (right): https://codereview.chromium.org/2766603004/diff/1/net/quic/chromium/quic_stre... net/quic/chromium/quic_stream_factory_test.cc:1826: EXPECT_EQ(ERR_IO_PENDING, On 2017/03/22 02:38:48, Ryan Hamilton wrote: > I'm a bit surprised that this returns ERR_IO_PENDING instead of > QUIC_HANDSHAKE_FAILED. Is that because DNS resolution (in this test) in async? > If so, can you make another version of this test which does > host_resolver_.set_synchronous_mode(true); (look for other uses in this file) > and verify that it fails immediately? Done. https://codereview.chromium.org/2766603004/diff/1/net/quic/chromium/quic_stre... net/quic/chromium/quic_stream_factory_test.cc:1832: EXPECT_FALSE(HasActiveJob(host_port_pair_, privacy_mode_)); On 2017/03/22 02:38:48, Ryan Hamilton wrote: > If I understand the bug correctly, in the old code, if we had tried to make a > new request at this point, the request would have hung. So instead of checking > with the peer that there is no active job can you make another request, and this > time let it succeed? Done. Good point! I have checked those regression test will fail w/o the fix.
https://codereview.chromium.org/2766603004/diff/1/net/quic/chromium/quic_stre... File net/quic/chromium/quic_stream_factory_test.cc (right): https://codereview.chromium.org/2766603004/diff/1/net/quic/chromium/quic_stre... net/quic/chromium/quic_stream_factory_test.cc:1832: EXPECT_FALSE(HasActiveJob(host_port_pair_, privacy_mode_)); On 2017/03/22 07:00:11, Zhongyi Shi wrote: > On 2017/03/22 02:38:48, Ryan Hamilton wrote: > > If I understand the bug correctly, in the old code, if we had tried to make a > > new request at this point, the request would have hung. So instead of checking > > with the peer that there is no active job can you make another request, and > this > > time let it succeed? > > Done. Good point! It looks like this is still testing for an active not, instead of testing that a new QuicStreamRequest will not hang. > I have checked those regression test will fail w/o the fix. Great! https://codereview.chromium.org/2766603004/diff/40001/net/quic/chromium/mock_... File net/quic/chromium/mock_crypto_client_stream_factory.cc (right): https://codereview.chromium.org/2766603004/diff/40001/net/quic/chromium/mock_... net/quic/chromium/mock_crypto_client_stream_factory.cc:41: } nit: Sorry, I should have noticed this last time. Since the real crypto stream does not use the proof verify details, lets move this block up to be the first line in the method. https://codereview.chromium.org/2766603004/diff/40001/net/quic/chromium/quic_... File net/quic/chromium/quic_stream_factory_test.cc (right): https://codereview.chromium.org/2766603004/diff/40001/net/quic/chromium/quic_... net/quic/chromium/quic_stream_factory_test.cc:1851: // Create request, should fail immediately. It would be nice if the comments in this test and the previous test matched. https://codereview.chromium.org/2766603004/diff/40001/net/quic/test_tools/moc... File net/quic/test_tools/mock_crypto_client_stream.h (right): https://codereview.chromium.org/2766603004/diff/40001/net/quic/test_tools/moc... net/quic/test_tools/mock_crypto_client_stream.h:40: // fallback to use normal QuicCryptoClientStream to do CryptoConnect. nit: // USE_DEFAULT_CRYPTO indicates that MockCryptoClientStreamFactory will // create a QuicCryptoClientStream instead of a MockCryptoClientStream.
Patchset #4 (id:60001) has been deleted
Patchset #4 (id:80001) has been deleted
Thanks for the review, comments addressed. PTAL https://codereview.chromium.org/2766603004/diff/1/net/quic/chromium/quic_stre... File net/quic/chromium/quic_stream_factory_test.cc (right): https://codereview.chromium.org/2766603004/diff/1/net/quic/chromium/quic_stre... net/quic/chromium/quic_stream_factory_test.cc:1832: EXPECT_FALSE(HasActiveJob(host_port_pair_, privacy_mode_)); On 2017/03/22 14:13:01, Ryan Hamilton wrote: > On 2017/03/22 07:00:11, Zhongyi Shi wrote: > > On 2017/03/22 02:38:48, Ryan Hamilton wrote: > > > If I understand the bug correctly, in the old code, if we had tried to make > a > > > new request at this point, the request would have hung. So instead of > checking > > > with the peer that there is no active job can you make another request, and > > this > > > time let it succeed? > > > > Done. Good point! > > It looks like this is still testing for an active not, instead of testing that a > new QuicStreamRequest will not hang. > > > I have checked those regression test will fail w/o the fix. > > Great! Done. https://codereview.chromium.org/2766603004/diff/40001/net/quic/chromium/mock_... File net/quic/chromium/mock_crypto_client_stream_factory.cc (right): https://codereview.chromium.org/2766603004/diff/40001/net/quic/chromium/mock_... net/quic/chromium/mock_crypto_client_stream_factory.cc:41: } On 2017/03/22 14:13:01, Ryan Hamilton wrote: > nit: Sorry, I should have noticed this last time. Since the real crypto stream > does not use the proof verify details, lets move this block up to be the first > line in the method. Done. https://codereview.chromium.org/2766603004/diff/40001/net/quic/chromium/quic_... File net/quic/chromium/quic_stream_factory_test.cc (right): https://codereview.chromium.org/2766603004/diff/40001/net/quic/chromium/quic_... net/quic/chromium/quic_stream_factory_test.cc:1851: // Create request, should fail immediately. On 2017/03/22 14:13:01, Ryan Hamilton wrote: > It would be nice if the comments in this test and the previous test matched. Done. https://codereview.chromium.org/2766603004/diff/40001/net/quic/test_tools/moc... File net/quic/test_tools/mock_crypto_client_stream.h (right): https://codereview.chromium.org/2766603004/diff/40001/net/quic/test_tools/moc... net/quic/test_tools/mock_crypto_client_stream.h:40: // fallback to use normal QuicCryptoClientStream to do CryptoConnect. On 2017/03/22 14:13:01, Ryan Hamilton wrote: > nit: > > // USE_DEFAULT_CRYPTO indicates that MockCryptoClientStreamFactory will > // create a QuicCryptoClientStream instead of a MockCryptoClientStream. Done.
lgtm mod two nits https://codereview.chromium.org/2766603004/diff/100001/net/quic/chromium/quic... File net/quic/chromium/quic_stream_factory_test.cc (right): https://codereview.chromium.org/2766603004/diff/100001/net/quic/chromium/quic... net/quic/chromium/quic_stream_factory_test.cc:1828: // Create request, should fail after synchronous host resolution completes. Host resolution is async in this test, right? Perhaps it should say // Should fail after the write of the CHLO fails or some such? https://codereview.chromium.org/2766603004/diff/100001/net/quic/chromium/quic... net/quic/chromium/quic_stream_factory_test.cc:1877: TEST_P(QuicStreamFactoryTest, RetryAfterWriteErrorInCryptoConnect) { Is the only delta between this test and the previous one sync vs async DNS? If so I'd name this: WriteErrorInCryptoConnectWithSyncHostResolution (or some such)
Thanks for the quick review, landing now! https://codereview.chromium.org/2766603004/diff/100001/net/quic/chromium/quic... File net/quic/chromium/quic_stream_factory_test.cc (right): https://codereview.chromium.org/2766603004/diff/100001/net/quic/chromium/quic... net/quic/chromium/quic_stream_factory_test.cc:1828: // Create request, should fail after synchronous host resolution completes. On 2017/03/22 19:24:38, Ryan Hamilton wrote: > Host resolution is async in this test, right? Perhaps it should say > > // Should fail after the write of the CHLO fails > > or some such? Good catch! I was copying the comment from the next test and didn't notice this. https://codereview.chromium.org/2766603004/diff/100001/net/quic/chromium/quic... net/quic/chromium/quic_stream_factory_test.cc:1877: TEST_P(QuicStreamFactoryTest, RetryAfterWriteErrorInCryptoConnect) { On 2017/03/22 19:24:38, Ryan Hamilton wrote: > Is the only delta between this test and the previous one sync vs async DNS? If > so I'd name this: > > WriteErrorInCryptoConnectWithSyncHostResolution > > (or some such) Done. Yeah, the only difference is mode of DNS resolution.
The CQ bit was checked by zhongyi@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from rch@chromium.org Link to the patchset: https://codereview.chromium.org/2766603004/#ps120001 (title: "nits")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: mac_chromium_rel_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
Patchset #6 (id:140001) has been deleted
The CQ bit was checked by zhongyi@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Description was changed from ========== QUIC: mark QUIC handshake failed if connection is closed after CryptoConnect so that QuicStreamFactory::Job will not hang. Currently, if QuicConnection is closed during crypto_stream_->CryptConnect(), it will silently post a task to notify QuicStreamFactory to of the session close without notifying the QuicStreamFactory::Job. This causes QuicStreamFactory::Job to hang. Subsequent alt jobs will be bound to the hanging QUIC job, and will never win race over TCP and get cleaned up. BUG=700617 ========== to ========== QUIC: mark QUIC handshake failed if connection is closed after CryptoConnect so that QuicStreamFactory::Job will not hang. Currently, if QuicConnection is closed during crypto_stream_->CryptConnect(), it will silently post a task to notify QuicStreamFactory to of the session close without notifying the QuicStreamFactory::Job. This causes QuicStreamFactory::Job to hang. Subsequent alt jobs will be bound to the hanging QUIC job, and will never win race over TCP and get cleaned up. BUG=700617 ==========
xunjieli@chromium.org changed reviewers: + xunjieli@chromium.org
https://codereview.chromium.org/2766603004/diff/160001/net/quic/chromium/quic... File net/quic/chromium/quic_chromium_client_session.cc (right): https://codereview.chromium.org/2766603004/diff/160001/net/quic/chromium/quic... net/quic/chromium/quic_chromium_client_session.cc:673: if (!connection()->connected()) nit: Maybe add a comment here explaining why we need to check handshake state after CryptoConnect(). Maybe a TODO here with a bug number to investigate whether crypto_stream_->CryptoConnect() could take a callback as a parameter? It will make some of the code easier to write and less error-prone. https://codereview.chromium.org/2766603004/diff/160001/net/quic/chromium/quic... File net/quic/chromium/quic_network_transaction_unittest.cc (right): https://codereview.chromium.org/2766603004/diff/160001/net/quic/chromium/quic... net/quic/chromium/quic_network_transaction_unittest.cc:2674: // code path. Need fix. Need a bug number here. https://codereview.chromium.org/2766603004/diff/160001/net/quic/test_tools/mo... File net/quic/test_tools/mock_crypto_client_stream.h (right): https://codereview.chromium.org/2766603004/diff/160001/net/quic/test_tools/mo... net/quic/test_tools/mock_crypto_client_stream.h:39: // USE_DEFAULT_CRYPTO_STREAM indicates that MockCryptoClientStreamFactory Hmm.. even though this is test code, there is layering violation here. Can you move the enum to the Factory class?
Helen, thanks for reviewing, ptal! https://codereview.chromium.org/2766603004/diff/160001/net/quic/chromium/quic... File net/quic/chromium/quic_chromium_client_session.cc (right): https://codereview.chromium.org/2766603004/diff/160001/net/quic/chromium/quic... net/quic/chromium/quic_chromium_client_session.cc:673: if (!connection()->connected()) On 2017/03/23 16:18:02, xunjieli wrote: > nit: Maybe add a comment here explaining why we need to check handshake state > after CryptoConnect(). > > Maybe a TODO here with a bug number to investigate whether > crypto_stream_->CryptoConnect() could take a callback as a parameter? It will > make some of the code easier to write and less error-prone. Done. https://codereview.chromium.org/2766603004/diff/160001/net/quic/chromium/quic... File net/quic/chromium/quic_network_transaction_unittest.cc (right): https://codereview.chromium.org/2766603004/diff/160001/net/quic/chromium/quic... net/quic/chromium/quic_network_transaction_unittest.cc:2674: // code path. Need fix. On 2017/03/23 16:18:02, xunjieli wrote: > Need a bug number here. Done. https://codereview.chromium.org/2766603004/diff/160001/net/quic/test_tools/mo... File net/quic/test_tools/mock_crypto_client_stream.h (right): https://codereview.chromium.org/2766603004/diff/160001/net/quic/test_tools/mo... net/quic/test_tools/mock_crypto_client_stream.h:39: // USE_DEFAULT_CRYPTO_STREAM indicates that MockCryptoClientStreamFactory On 2017/03/23 16:18:02, xunjieli wrote: > Hmm.. even though this is test code, there is layering violation here. Can you > move the enum to the Factory class? I prefer to move this to another CL, since if we move the enum to the factory in this CL, it will introduce a lot of test changes. Considering this CL will be merged to other milestones, shall we keep it simple as is here?
https://codereview.chromium.org/2766603004/diff/160001/net/quic/test_tools/mo... File net/quic/test_tools/mock_crypto_client_stream.h (right): https://codereview.chromium.org/2766603004/diff/160001/net/quic/test_tools/mo... net/quic/test_tools/mock_crypto_client_stream.h:39: // USE_DEFAULT_CRYPTO_STREAM indicates that MockCryptoClientStreamFactory On 2017/03/23 17:28:50, Zhongyi Shi wrote: > On 2017/03/23 16:18:02, xunjieli wrote: > > Hmm.. even though this is test code, there is layering violation here. Can you > > move the enum to the Factory class? > > I prefer to move this to another CL, since if we move the enum to the factory in > this CL, it will introduce a lot of test changes. Considering this CL will be > merged to other milestones, shall we keep it simple as is here? SG. Can we add a crbug to track it? https://codereview.chromium.org/2766603004/diff/180001/net/quic/chromium/quic... File net/quic/chromium/quic_chromium_client_session.cc (right): https://codereview.chromium.org/2766603004/diff/180001/net/quic/chromium/quic... net/quic/chromium/quic_chromium_client_session.cc:675: // synchronously with ERR_QUIC_HANDSHAKE_FAILED so that QuicStreamFactory::Job mentioning QuicStreamFacory::Job here is a layering violation. I think what happens here is crypto_stream_->CryptoConnect()'s method signature doesn't indicate that it can return error synchronously. The calling code (QuicChromiumClientSession) failed to handle the sync error. If you agree, I think this API (QuicCryptoClientStream::CryptoConnect()) needs to be changed to make the API contract clearer.
https://codereview.chromium.org/2766603004/diff/180001/net/quic/chromium/quic... File net/quic/chromium/quic_chromium_client_session.cc (right): https://codereview.chromium.org/2766603004/diff/180001/net/quic/chromium/quic... net/quic/chromium/quic_chromium_client_session.cc:675: // synchronously with ERR_QUIC_HANDSHAKE_FAILED so that QuicStreamFactory::Job On 2017/03/23 17:43:37, xunjieli wrote: > > mentioning QuicStreamFacory::Job here is a layering violation. > > I think what happens here is crypto_stream_->CryptoConnect()'s method signature > doesn't indicate that it can return error synchronously. The calling code > (QuicChromiumClientSession) failed to handle the sync error. > > If you agree, I think this API (QuicCryptoClientStream::CryptoConnect()) needs > to be changed to make the API contract clearer. QuicCryptoClientStream::CryptoConnect() is shared with internal code, which doesn't support completion callback IIUC. It might be a good idea to change the API to return a status code for the CryptoConnect here, which will introduce changes in both internal and external code base. But let's keep it simple here and have the leak fixed first. Helen, could we move the discussion to crbug.com/704649?
Deferring this to Ryan. I didn't look at the tests.
On 2017/03/23 21:43:41, xunjieli wrote: > Deferring this to Ryan. I didn't look at the tests. I'm happy for this to land as-is and to do the cleanups you suggest in followups. (They both seem reasonable) Or if Cherie wants to move the test-only enum as part of this CL, I'm totally fine with that too.
On 2017/03/23 21:55:12, Ryan Hamilton wrote: > On 2017/03/23 21:43:41, xunjieli wrote: > > Deferring this to Ryan. I didn't look at the tests. > > I'm happy for this to land as-is and to do the cleanups you suggest in > followups. (They both seem reasonable) > > Or if Cherie wants to move the test-only enum as part of this CL, I'm totally > fine with that too. Thanks all for helping the review. I will land as is since the test-only enum will introduce a lot of changes in unrelated files as well. Landing now.
The CQ bit was checked by zhongyi@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from rch@chromium.org Link to the patchset: https://codereview.chromium.org/2766603004/#ps220001 (title: "add a TODO")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
CQ is committing da patch.
Bot data: {"patchset_id": 220001, "attempt_start_ts": 1490306603863530,
"parent_rev": "5e1457c6a247571aa25c76364693863b7bdd4b7f", "commit_rev":
"363c91c57a5ae99e63a138555afa83dc02262325"}
Message was sent while issue was closed.
Description was changed from ========== QUIC: mark QUIC handshake failed if connection is closed after CryptoConnect so that QuicStreamFactory::Job will not hang. Currently, if QuicConnection is closed during crypto_stream_->CryptConnect(), it will silently post a task to notify QuicStreamFactory to of the session close without notifying the QuicStreamFactory::Job. This causes QuicStreamFactory::Job to hang. Subsequent alt jobs will be bound to the hanging QUIC job, and will never win race over TCP and get cleaned up. BUG=700617 ========== to ========== QUIC: mark QUIC handshake failed if connection is closed after CryptoConnect so that QuicStreamFactory::Job will not hang. Currently, if QuicConnection is closed during crypto_stream_->CryptConnect(), it will silently post a task to notify QuicStreamFactory to of the session close without notifying the QuicStreamFactory::Job. This causes QuicStreamFactory::Job to hang. Subsequent alt jobs will be bound to the hanging QUIC job, and will never win race over TCP and get cleaned up. BUG=700617 Review-Url: https://codereview.chromium.org/2766603004 Cr-Commit-Position: refs/heads/master@{#459271} Committed: https://chromium.googlesource.com/chromium/src/+/363c91c57a5ae99e63a138555afa... ==========
Message was sent while issue was closed.
Committed patchset #9 (id:220001) as https://chromium.googlesource.com/chromium/src/+/363c91c57a5ae99e63a138555afa... |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
