|
|
Chromium Code Reviews
DescriptionAllow the initial request to take over failed parallel requests.
When a parallel request fails, DownloadFileImpl will send an error and cause download to be interrupted.
This could cause a potential problem:
If a server always fail addtional requests, these requests will cause the download to interrupt.
As a result, the initial request and the download can go nowhere.
This CL addresses the issue by allowing the initial request to continue,
if it is still alive and can cover the slice to be downloaded by the failed request.
If the initial request cannot download the slice left by the parallel request, an error will be sent.
Will add a test once https://codereview.chromium.org/2742093002/ lands
BUG=644352
Review-Url: https://codereview.chromium.org/2748103014
Cr-Commit-Position: refs/heads/master@{#459005}
Committed: https://chromium.googlesource.com/chromium/src/+/6dbfc80be8169ac5e0ffc7f051432ab8681de7bb
Patch Set 1 #
Total comments: 25
Patch Set 2 : address comments #Patch Set 3 : rebase and fix a case if file contains holes #Patch Set 4 : rebase again #
Messages
Total messages: 28 (17 generated)
qinmin@chromium.org changed reviewers: + dtrainor@chromium.org, xingliu@chromium.org
PTAL
https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... File content/browser/download/download_file_impl.cc (right): https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:517: auto initial_source_stream = source_streams_.find(save_info_->offset); What if the error is triggered by the initial stream? can_recover_from_error is probably set to true. And the file may never be completed. https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:528: // The initial stream will download all data downloading from its offset If there is a new slice added, does the initial slice will have a length cap? We probably run into this block only when the initial stream fails. https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:533: DCHECK_EQ(stream.second->bytes_written(), 0); Is it safe to assume streams in the middle never write anything?
https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... File content/browser/download/download_file_impl.cc (right): https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:517: auto initial_source_stream = source_streams_.find(save_info_->offset); On 2017/03/18 22:27:34, xingliu wrote: > What if the error is triggered by the initial stream? can_recover_from_error is > probably set to true. And the file may never be completed. If the error is trigger by initial stream, set_finished(true) should be called on line 512. So it will skip the if block below, and can_recover_from_error should be false. https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:528: // The initial stream will download all data downloading from its offset On 2017/03/18 22:27:34, xingliu wrote: > If there is a new slice added, does the initial slice will have a length cap? > > We probably run into this block only when the initial stream fails. No. For example, there are stream A, B, C ordered by their offset and A is the initial stream. If C fails to write any data, A will check if can download all the data C should download here. If B already starts writing data, then A's length will be truncated. So can_recover_from_error will return false. Otherwise, A's length is still kContentlength, so it can help handle C's portion https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:533: DCHECK_EQ(stream.second->bytes_written(), 0); On 2017/03/18 22:27:34, xingliu wrote: > Is it safe to assume streams in the middle never write anything? Yes, if a stream in the middle writes something, it will truncate the length of all streams prior to it. So those streams will never surpass its starting offset. As a result, the initial request will fail one of the if conditions above.
lgtm with nit%. Not in this CL, but maybe something to consider. We probably can add a function to verify the states of source streams. To ensure our assumption of truncating length is consistence in code. https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... File content/browser/download/download_file_impl.cc (right): https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:517: auto initial_source_stream = source_streams_.find(save_info_->offset); On 2017/03/20 20:59:21, qinmin wrote: > On 2017/03/18 22:27:34, xingliu wrote: > > What if the error is triggered by the initial stream? can_recover_from_error > is > > probably set to true. And the file may never be completed. > > If the error is trigger by initial stream, set_finished(true) should be called > on line 512. So it will skip the if block below, and can_recover_from_error > should be false. I see. https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:528: // The initial stream will download all data downloading from its offset On 2017/03/20 20:59:21, qinmin wrote: > On 2017/03/18 22:27:34, xingliu wrote: > > If there is a new slice added, does the initial slice will have a length cap? > > > > We probably run into this block only when the initial stream fails. > > No. For example, there are stream A, B, C ordered by their offset and A is the > initial stream. > If C fails to write any data, A will check if can download all the data C should > download here. > If B already starts writing data, then A's length will be truncated. So > can_recover_from_error will return false. Otherwise, A's length is still > kContentlength, so it can help handle C's portion sgtm. https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:533: DCHECK_EQ(stream.second->bytes_written(), 0); On 2017/03/20 20:59:21, qinmin wrote: > On 2017/03/18 22:27:34, xingliu wrote: > > Is it safe to assume streams in the middle never write anything? > > Yes, if a stream in the middle writes something, it will truncate the length of > all streams prior to it. So those streams will never surpass its starting > offset. > As a result, the initial request will fail one of the if conditions above. I see. https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:553: reason, TotalBytesReceived(), base::Passed(&hash_state))); nit%: Not in this CL, since we allow duplicate IO now, TotalBytesReceived may be larger than file size. We probably want to
Sorry for a half done comment. https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... File content/browser/download/download_file_impl.cc (right): https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:553: reason, TotalBytesReceived(), base::Passed(&hash_state))); On 2017/03/20 22:33:33, xingliu wrote: > nit%: Not in this CL, since we allow duplicate IO now, TotalBytesReceived may be > larger than file size. We probably want to make sure it's ok for user to see larger bytes number than the file size.
https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... File content/browser/download/download_file_impl.cc (right): https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:517: auto initial_source_stream = source_streams_.find(save_info_->offset); Should we just call this stream_iterator or something? initial_source_stream and initial_stream are really close in name. https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:520: // If the initial request is alive, check if it can help download all the Can we add a TODO to figure out allowing adjacent streams to always recover their neighbors? https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:528: // The initial stream will download all data downloading from its offset On 2017/03/20 22:33:33, xingliu wrote: > On 2017/03/20 20:59:21, qinmin wrote: > > On 2017/03/18 22:27:34, xingliu wrote: > > > If there is a new slice added, does the initial slice will have a length > cap? > > > > > > We probably run into this block only when the initial stream fails. > > > > No. For example, there are stream A, B, C ordered by their offset and A is the > > initial stream. > > If C fails to write any data, A will check if can download all the data C > should > > download here. > > If B already starts writing data, then A's length will be truncated. So > > can_recover_from_error will return false. Otherwise, A's length is still > > kContentlength, so it can help handle C's portion > > sgtm. Hmm that seems like it might be a really common occurrence. If we start the requests like: 1. Start A 2. Response from A 3. Start B, then start C 4. Response from B 5. Failure for C. We just drop the whole request? Some alternative thoughts: 1. We wait for *all* stream responses to start before truncating the initial request. If any fail we fall back to that initial request for the whole download? 2. We only fail if we've written a large amount of data to B (either > some byte threshold or > some % of the file). https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:533: DCHECK_EQ(stream.second->bytes_written(), 0); On 2017/03/20 22:33:33, xingliu wrote: > On 2017/03/20 20:59:21, qinmin wrote: > > On 2017/03/18 22:27:34, xingliu wrote: > > > Is it safe to assume streams in the middle never write anything? > > > > Yes, if a stream in the middle writes something, it will truncate the length > of > > all streams prior to it. So those streams will never surpass its starting > > offset. > > As a result, the initial request will fail one of the if conditions above. > > I see. Who kills the outstanding request if the middle streams haven't responded yet? Does anyone clean up these killed SourceStream objects in this list? https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:548: weak_factory_.InvalidateWeakPtrs(); Should we pull out the shutdown code to a common helper? Do we have these few lines duplicated elsewhere?
Some discussion. https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... File content/browser/download/download_file_impl.cc (right): https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:533: DCHECK_EQ(stream.second->bytes_written(), 0); On 2017/03/21 17:00:16, David Trainor-ping if over 24h wrote: > On 2017/03/20 22:33:33, xingliu wrote: > > On 2017/03/20 20:59:21, qinmin wrote: > > > On 2017/03/18 22:27:34, xingliu wrote: > > > > Is it safe to assume streams in the middle never write anything? > > > > > > Yes, if a stream in the middle writes something, it will truncate the length > > of > > > all streams prior to it. So those streams will never surpass its starting > > > offset. > > > As a result, the initial request will fail one of the if conditions above. > > > > I see. > > Who kills the outstanding request if the middle streams haven't responded yet? > Does anyone clean up these killed SourceStream objects in this list? DownloadFileImpl and all objects under it will get killed after DestinationComplete/DestinationError is called, that the UI thread kill the DownloadFileImpl with ReleaseDownloadFile or something. In my understanding, removing SourceStream won't remove the underlying pipe(byte stream writer+reader). The pipe will be destory on Job::Cancel, called on interrupt, or the pipe may be depleted after request is done, and UrlDownloader will remove itself, and also remove the underlying url request. If we don't get response for a while, DownloadFileImpl won't create the SourceStream for it. The url request may still open, not sure if somewhere has time out logic. That's probably something worthwhile to figure out.
https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... File content/browser/download/download_file_impl.cc (right): https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:517: auto initial_source_stream = source_streams_.find(save_info_->offset); On 2017/03/21 17:00:16, David Trainor-ping if over 24h wrote: > Should we just call this stream_iterator or something? initial_source_stream > and initial_stream are really close in name. Done. https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:520: // If the initial request is alive, check if it can help download all the On 2017/03/21 17:00:16, David Trainor-ping if over 24h wrote: > Can we add a TODO to figure out allowing adjacent streams to always recover > their neighbors? Done. https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:528: // The initial stream will download all data downloading from its offset On 2017/03/21 17:00:16, David Trainor-ping if over 24h wrote: > On 2017/03/20 22:33:33, xingliu wrote: > > On 2017/03/20 20:59:21, qinmin wrote: > > > On 2017/03/18 22:27:34, xingliu wrote: > > > > If there is a new slice added, does the initial slice will have a length > > cap? > > > > > > > > We probably run into this block only when the initial stream fails. > > > > > > No. For example, there are stream A, B, C ordered by their offset and A is > the > > > initial stream. > > > If C fails to write any data, A will check if can download all the data C > > should > > > download here. > > > If B already starts writing data, then A's length will be truncated. So > > > can_recover_from_error will return false. Otherwise, A's length is still > > > kContentlength, so it can help handle C's portion > > > > sgtm. > > Hmm that seems like it might be a really common occurrence. If we start the > requests like: > > 1. Start A > 2. Response from A > 3. Start B, then start C > 4. Response from B > 5. Failure for C. > > We just drop the whole request? > > Some alternative thoughts: > 1. We wait for *all* stream responses to start before truncating the initial > request. If any fail we fall back to that initial request for the whole > download? > 2. We only fail if we've written a large amount of data to B (either > some byte > threshold or > some % of the file). This will add more complexity to the current logic. I would prefer not doing this for the initial version. And later if keep B half open, we don't need to fail the whole request. DownloadItemImpl will auto resume the download if an interruption occurs here, so no user action is needed unless the download fails 5 times. https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:548: weak_factory_.InvalidateWeakPtrs(); On 2017/03/21 17:00:16, David Trainor-ping if over 24h wrote: > Should we pull out the shutdown code to a common helper? Do we have these few > lines duplicated elsewhere? No, we don't have the lines duplicated. It is similar to the DestinationComplete call, but this one has an interruption reason. So If we want to move it to a common helper, we would have something like FinishDownload(bool is_download_complete, int interrupt_reason), which doesn't feel very straightforward. Or we can simplify it to FinishDownload(int interrupt_reason). Still, it is not straightforward to figure out what the function actually does. https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:553: reason, TotalBytesReceived(), base::Passed(&hash_state))); On 2017/03/20 22:33:33, xingliu wrote: > nit%: Not in this CL, since we allow duplicate IO now, TotalBytesReceived may be > larger than file size. We probably want to This won't happen. TotalBytesReceived is calculated from basefile, and the DownloadFileImpl::CalculateBytesToWrite() prohibits duplicate write to a particular file offset. So TotalBytesReceived() should reflect the actual file size, though it could be different from content-length header. we didn't actually record the bytes received from the network in this class. We can get that in StreamActive() call, but the issue is that they are not stored in db and not reported to DownloadItemImpl. So only physical bytes written to disk is at hand.
Have some replies inline, but lgtm. https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... File content/browser/download/download_file_impl.cc (right): https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:528: // The initial stream will download all data downloading from its offset On 2017/03/21 19:29:17, qinmin wrote: > On 2017/03/21 17:00:16, David Trainor-ping if over 24h wrote: > > On 2017/03/20 22:33:33, xingliu wrote: > > > On 2017/03/20 20:59:21, qinmin wrote: > > > > On 2017/03/18 22:27:34, xingliu wrote: > > > > > If there is a new slice added, does the initial slice will have a length > > > cap? > > > > > > > > > > We probably run into this block only when the initial stream fails. > > > > > > > > No. For example, there are stream A, B, C ordered by their offset and A is > > the > > > > initial stream. > > > > If C fails to write any data, A will check if can download all the data C > > > should > > > > download here. > > > > If B already starts writing data, then A's length will be truncated. So > > > > can_recover_from_error will return false. Otherwise, A's length is still > > > > kContentlength, so it can help handle C's portion > > > > > > sgtm. > > > > Hmm that seems like it might be a really common occurrence. If we start the > > requests like: > > > > 1. Start A > > 2. Response from A > > 3. Start B, then start C > > 4. Response from B > > 5. Failure for C. > > > > We just drop the whole request? > > > > Some alternative thoughts: > > 1. We wait for *all* stream responses to start before truncating the initial > > request. If any fail we fall back to that initial request for the whole > > download? > > 2. We only fail if we've written a large amount of data to B (either > some > byte > > threshold or > some % of the file). > > This will add more complexity to the current logic. I would prefer not doing > this for the initial version. > And later if keep B half open, we don't need to fail the whole request. > DownloadItemImpl will auto resume the download if an interruption occurs here, > so no user action is needed unless the download fails 5 times. Okay I guess it's fine to postpone this for now. I just want to make sure we understand what the typical failure cases are. Even if we retry automatically, if the server starts blocking connections the retry could just fail (depending on how fast we retry) which would be even worse than just having a single connection to start. https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:533: DCHECK_EQ(stream.second->bytes_written(), 0); On 2017/03/21 17:48:04, xingliu wrote: > On 2017/03/21 17:00:16, David Trainor-ping if over 24h wrote: > > On 2017/03/20 22:33:33, xingliu wrote: > > > On 2017/03/20 20:59:21, qinmin wrote: > > > > On 2017/03/18 22:27:34, xingliu wrote: > > > > > Is it safe to assume streams in the middle never write anything? > > > > > > > > Yes, if a stream in the middle writes something, it will truncate the > length > > > of > > > > all streams prior to it. So those streams will never surpass its starting > > > > offset. > > > > As a result, the initial request will fail one of the if conditions above. > > > > > > I see. > > > > Who kills the outstanding request if the middle streams haven't responded yet? > > > Does anyone clean up these killed SourceStream objects in this list? > > DownloadFileImpl and all objects under it will get killed after > DestinationComplete/DestinationError is called, that the UI thread kill the > DownloadFileImpl with ReleaseDownloadFile or something. > > In my understanding, removing SourceStream won't remove the underlying pipe(byte > stream writer+reader). > > The pipe will be destory on Job::Cancel, called on interrupt, or the pipe may be > depleted after request is done, and UrlDownloader will remove itself, and also > remove the underlying url request. > > If we don't get response for a while, DownloadFileImpl won't create the > SourceStream for it. The url request may still open, not sure if somewhere has > time out logic. That's probably something worthwhile to figure out. Thanks! https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do... content/browser/download/download_file_impl.cc:548: weak_factory_.InvalidateWeakPtrs(); On 2017/03/21 19:29:17, qinmin wrote: > On 2017/03/21 17:00:16, David Trainor-ping if over 24h wrote: > > Should we pull out the shutdown code to a common helper? Do we have these few > > lines duplicated elsewhere? > > No, we don't have the lines duplicated. It is similar to the DestinationComplete > call, but this one has an interruption reason. > So If we want to move it to a common helper, we would have something like > FinishDownload(bool is_download_complete, int interrupt_reason), which doesn't > feel very straightforward. Or we can simplify it to FinishDownload(int > interrupt_reason). Still, it is not straightforward to figure out what the > function actually does. Yeah makes sense. Thanks
The CQ bit was checked by qinmin@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: ios-device on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-device/builds...) ios-device-xcode-clang on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-device-xcode-...) ios-simulator on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-simulator/bui...) ios-simulator-xcode-clang on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-simulator-xco...) mac_chromium_compile_dbg_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_comp...)
The CQ bit was checked by qinmin@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: android_compile_dbg on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_comp...) ios-device-xcode-clang on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-device-xcode-...) mac_chromium_rel_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
The CQ bit was checked by qinmin@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by qinmin@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from xingliu@chromium.org, dtrainor@chromium.org Link to the patchset: https://codereview.chromium.org/2748103014/#ps60001 (title: "rebase again")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
CQ is committing da patch.
Bot data: {"patchset_id": 60001, "attempt_start_ts": 1490243808506260,
"parent_rev": "b89c4320456a8a5b52c5b4fb4cee7421081f4616", "commit_rev":
"6dbfc80be8169ac5e0ffc7f051432ab8681de7bb"}
Message was sent while issue was closed.
Description was changed from ========== Allow the initial request to take over failed parallel requests. When a parallel request fails, DownloadFileImpl will send an error and cause download to be interrupted. This could cause a potential problem: If a server always fail addtional requests, these requests will cause the download to interrupt. As a result, the initial request and the download can go nowhere. This CL addresses the issue by allowing the initial request to continue, if it is still alive and can cover the slice to be downloaded by the failed request. If the initial request cannot download the slice left by the parallel request, an error will be sent. Will add a test once https://codereview.chromium.org/2742093002/ lands BUG=644352 ========== to ========== Allow the initial request to take over failed parallel requests. When a parallel request fails, DownloadFileImpl will send an error and cause download to be interrupted. This could cause a potential problem: If a server always fail addtional requests, these requests will cause the download to interrupt. As a result, the initial request and the download can go nowhere. This CL addresses the issue by allowing the initial request to continue, if it is still alive and can cover the slice to be downloaded by the failed request. If the initial request cannot download the slice left by the parallel request, an error will be sent. Will add a test once https://codereview.chromium.org/2742093002/ lands BUG=644352 Review-Url: https://codereview.chromium.org/2748103014 Cr-Commit-Position: refs/heads/master@{#459005} Committed: https://chromium.googlesource.com/chromium/src/+/6dbfc80be8169ac5e0ffc7f05143... ==========
Message was sent while issue was closed.
Committed patchset #4 (id:60001) as https://chromium.googlesource.com/chromium/src/+/6dbfc80be8169ac5e0ffc7f05143... |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
