Issue 2748103014: Allow the initial request to take over failed parallel requests.

qinmin

qinmin@chromium.org changed reviewers: + dtrainor@chromium.org, xingliu@chromium.org

3 years, 9 months ago (2017-03-17 22:28:01 UTC) #1

xingliu

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/download_file_impl.cc File content/browser/download/download_file_impl.cc (right): https://codereview.chromium.org/2748103014/diff/1/content/browser/download/download_file_impl.cc#newcode517 content/browser/download/download_file_impl.cc:517: auto initial_source_stream = source_streams_.find(save_info_->offset); What if the error is ...

3 years, 9 months ago (2017-03-18 22:27:34 UTC) #3

qinmin


3 years, 9 months ago (2017-03-20 20:59:21 UTC) #4

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
File content/browser/download/download_file_impl.cc (right):

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
content/browser/download/download_file_impl.cc:517: auto initial_source_stream =
source_streams_.find(save_info_->offset);
On 2017/03/18 22:27:34, xingliu wrote:
> What if the error is triggered by the initial stream? can_recover_from_error
> is probably set to true. And the file may never be completed.

If the error is triggered by the initial stream, set_finished(true) should be
called on line 512. So it will skip the if block below, and
can_recover_from_error should be false.

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
content/browser/download/download_file_impl.cc:528: // The initial stream will
download all data downloading from its offset
On 2017/03/18 22:27:34, xingliu wrote:
> If there is a new slice added, will the initial slice have a length cap?
> 
> We probably run into this block only when the initial stream fails.

No. For example, there are streams A, B, and C, ordered by their offset, and A
is the initial stream.
If C fails to write any data, A will check here whether it can download all the
data C should download.
If B has already started writing data, then A's length will be truncated, so
can_recover_from_error will be false. Otherwise, A's length is still
kContentlength, so it can help handle C's portion.
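
To illustrate the A/B/C case above, the check could be sketched roughly as
follows. This is only a minimal sketch, not the code under review: Stream,
kNoLengthCap, and CanInitialStreamRecover are hypothetical names, and it assumes
the initial stream is the one with the lowest offset.

```cpp
#include <cstdint>
#include <map>

// Hypothetical sentinel meaning "no length cap, download to the end".
constexpr int64_t kNoLengthCap = -1;

// Hypothetical stand-in for a source stream; not the real SourceStream class.
struct Stream {
  int64_t length;         // Cap on the bytes this stream may write.
  int64_t bytes_written;  // Bytes already written to the file.
  bool finished;          // Set once the stream completes or fails.
};

// Returns true if the initial stream can take over the range of the failed
// stream that starts at failed_offset. Streams are keyed by file offset.
bool CanInitialStreamRecover(const std::map<int64_t, Stream>& streams,
                             int64_t failed_offset) {
  if (streams.empty())
    return false;
  const auto initial = streams.begin();  // Assumed: lowest offset is initial.
  const auto failed = streams.find(failed_offset);
  if (failed == streams.end() || failed == initial)
    return false;
  // A finished or length-capped initial stream cannot cover the failed range.
  if (initial->second.finished || initial->second.length != kNoLengthCap)
    return false;
  // Only take over a stream that has not written any data yet.
  return failed->second.bytes_written == 0;
}
```

With A uncapped and C failing before it writes anything, the sketch returns
true; once B has written data and A has been capped, it returns false.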

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
content/browser/download/download_file_impl.cc:533:
DCHECK_EQ(stream.second->bytes_written(), 0);
On 2017/03/18 22:27:34, xingliu wrote:
> Is it safe to assume streams in the middle never write anything?

Yes, if a stream in the middle writes something, it will truncate the length of
all streams prior to it. So those streams will never surpass its starting
offset. 
As a result, the initial request will fail one of the if conditions above.
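
The truncation rule described here can be sketched as below. This is a
simplified model under stated assumptions, not the actual DownloadFileImpl code:
Stream, kNoLengthCap, and TruncatePrecedingStreams are hypothetical names, and
streams are assumed to be keyed by their starting offset.

```cpp
#include <cstdint>
#include <map>

// Hypothetical sentinel meaning "no length cap, download to the end".
constexpr int64_t kNoLengthCap = -1;

// Hypothetical stand-in for a source stream.
struct Stream {
  int64_t length;         // Cap on the bytes this stream may write.
  int64_t bytes_written;  // Bytes already written to the file.
};

// Called when the stream at writer_offset writes its first bytes: caps every
// earlier stream so that it stops at or before writer_offset.
void TruncatePrecedingStreams(std::map<int64_t, Stream>& streams,
                              int64_t writer_offset) {
  for (auto& entry : streams) {
    if (entry.first >= writer_offset)
      break;  // std::map iterates in ascending offset order.
    const int64_t cap = writer_offset - entry.first;
    if (entry.second.length == kNoLengthCap || entry.second.length > cap)
      entry.second.length = cap;
  }
}
```

After this runs, no earlier stream can write past writer_offset, which is why a
length-capped initial stream fails the recovery check.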

xingliu


3 years, 9 months ago (2017-03-20 22:33:33 UTC) #5

lgtm with nit%.

Not in this CL, but maybe something to consider.

We can probably add a function to verify the states of the source streams, to
ensure that our assumption about truncated lengths is consistent in the code.
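
A minimal sketch of what such a check might look like, assuming streams are
keyed by offset and that any stream that has written data must have capped every
stream before it. Stream, kNoLengthCap, and SourceStreamsAreConsistent are
hypothetical names, not existing code.

```cpp
#include <cstdint>
#include <map>

// Hypothetical sentinel meaning "no length cap, download to the end".
constexpr int64_t kNoLengthCap = -1;

// Hypothetical stand-in for a source stream.
struct Stream {
  int64_t length;         // Cap on the bytes this stream may write.
  int64_t bytes_written;  // Bytes already written to the file.
};

// Returns true if every stream that has written data is preceded only by
// streams whose capped range ends at or before its starting offset.
bool SourceStreamsAreConsistent(const std::map<int64_t, Stream>& streams) {
  for (auto it = streams.begin(); it != streams.end(); ++it) {
    if (it->second.bytes_written == 0)
      continue;  // A stream that wrote nothing imposes no cap on others.
    for (auto prev = streams.begin(); prev != it; ++prev) {
      const int64_t length = prev->second.length;
      if (length == kNoLengthCap || prev->first + length > it->first)
        return false;
    }
  }
  return true;
}
```

Something like this could be called from a DCHECK wherever the stream map
changes.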

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
File content/browser/download/download_file_impl.cc (right):

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
content/browser/download/download_file_impl.cc:517: auto initial_source_stream =
source_streams_.find(save_info_->offset);
On 2017/03/20 20:59:21, qinmin wrote:
> On 2017/03/18 22:27:34, xingliu wrote:
> > What if the error is triggered by the initial stream? can_recover_from_error
> > is probably set to true. And the file may never be completed.
> 
> If the error is triggered by the initial stream, set_finished(true) should be
> called on line 512. So it will skip the if block below, and
> can_recover_from_error should be false.

I see.

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
content/browser/download/download_file_impl.cc:528: // The initial stream will
download all data downloading from its offset
On 2017/03/20 20:59:21, qinmin wrote:
> On 2017/03/18 22:27:34, xingliu wrote:
> > If there is a new slice added, will the initial slice have a length cap?
> > 
> > We probably run into this block only when the initial stream fails.
> 
> No. For example, there are streams A, B, and C, ordered by their offset, and A
> is the initial stream.
> If C fails to write any data, A will check here whether it can download all
> the data C should download.
> If B has already started writing data, then A's length will be truncated, so
> can_recover_from_error will be false. Otherwise, A's length is still
> kContentlength, so it can help handle C's portion.

sgtm.

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
content/browser/download/download_file_impl.cc:533:
DCHECK_EQ(stream.second->bytes_written(), 0);
On 2017/03/20 20:59:21, qinmin wrote:
> On 2017/03/18 22:27:34, xingliu wrote:
> > Is it safe to assume streams in the middle never write anything?
> 
> Yes, if a stream in the middle writes something, it will truncate the length
> of all streams prior to it. So those streams will never surpass its starting
> offset.
> As a result, the initial request will fail one of the if conditions above.

I see.

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
content/browser/download/download_file_impl.cc:553: reason,
TotalBytesReceived(), base::Passed(&hash_state)));
nit%: Not in this CL, since we allow duplicate IO now, TotalBytesReceived may be
larger than file size. We probably want to

xingliu

Sorry for a half done comment. https://codereview.chromium.org/2748103014/diff/1/content/browser/download/download_file_impl.cc File content/browser/download/download_file_impl.cc (right): https://codereview.chromium.org/2748103014/diff/1/content/browser/download/download_file_impl.cc#newcode553 content/browser/download/download_file_impl.cc:553: reason, TotalBytesReceived(), base::Passed(&hash_state))); ...

3 years, 9 months ago (2017-03-20 22:35:10 UTC) #6

David Trainor- moved to gerrit


3 years, 9 months ago (2017-03-21 17:00:17 UTC) #7

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
File content/browser/download/download_file_impl.cc (right):

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
content/browser/download/download_file_impl.cc:517: auto initial_source_stream =
source_streams_.find(save_info_->offset);
Should we just call this stream_iterator or something?  initial_source_stream
and initial_stream are really close in name.

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
content/browser/download/download_file_impl.cc:520: // If the initial request is
alive, check if it can help download all the
Can we add a TODO to figure out allowing adjacent streams to always recover
their neighbors?

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
content/browser/download/download_file_impl.cc:528: // The initial stream will
download all data downloading from its offset
On 2017/03/20 22:33:33, xingliu wrote:
> On 2017/03/20 20:59:21, qinmin wrote:
> > On 2017/03/18 22:27:34, xingliu wrote:
> > > If there is a new slice added, will the initial slice have a length cap?
> > > 
> > > We probably run into this block only when the initial stream fails.
> > 
> > No. For example, there are streams A, B, and C, ordered by their offset, and
> > A is the initial stream.
> > If C fails to write any data, A will check here whether it can download all
> > the data C should download.
> > If B has already started writing data, then A's length will be truncated, so
> > can_recover_from_error will be false. Otherwise, A's length is still
> > kContentlength, so it can help handle C's portion.
> 
> sgtm.

Hmm that seems like it might be a really common occurrence.  If we start the
requests like:

1. Start A
2. Response from A
3. Start B, then start C
4. Response from B
5. Failure for C.

We just drop the whole request?

Some alternative thoughts:
1. We wait for *all* stream responses to start before truncating the initial
request.  If any fail we fall back to that initial request for the whole
download?
2. We only fail if we've written a large amount of data to B (either > some byte
threshold or > some % of the file).
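
Alternative 2 could be expressed as a small policy function. This is only a
sketch of the idea: ShouldFailWholeDownload and both threshold parameters are
hypothetical, and no particular threshold values are being proposed.

```cpp
#include <cstdint>

// Returns true if the stream that blocks recovery has already written enough
// data that falling back to the initial request is not worthwhile.
// byte_threshold and fraction_threshold are hypothetical tuning knobs.
bool ShouldFailWholeDownload(int64_t bytes_written,
                             int64_t file_size,
                             int64_t byte_threshold,
                             double fraction_threshold) {
  if (bytes_written > byte_threshold)
    return true;
  // Only apply the percentage check when the file size is known.
  return file_size > 0 &&
         static_cast<double>(bytes_written) >
             fraction_threshold * static_cast<double>(file_size);
}
```

Below both thresholds, the data written by B would be discarded or overwritten
and the initial request would carry the rest of the download.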

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
content/browser/download/download_file_impl.cc:533:
DCHECK_EQ(stream.second->bytes_written(), 0);
On 2017/03/20 22:33:33, xingliu wrote:
> On 2017/03/20 20:59:21, qinmin wrote:
> > On 2017/03/18 22:27:34, xingliu wrote:
> > > Is it safe to assume streams in the middle never write anything?
> > 
> > Yes, if a stream in the middle writes something, it will truncate the length
> > of all streams prior to it. So those streams will never surpass its starting
> > offset.
> > As a result, the initial request will fail one of the if conditions above.
> 
> I see.

Who kills the outstanding request if the middle streams haven't responded yet? 
Does anyone clean up these killed SourceStream objects in this list?

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
content/browser/download/download_file_impl.cc:548:
weak_factory_.InvalidateWeakPtrs();
Should we pull out the shutdown code to a common helper?  Do we have these few
lines duplicated elsewhere?

xingliu

Some discussion. https://codereview.chromium.org/2748103014/diff/1/content/browser/download/download_file_impl.cc File content/browser/download/download_file_impl.cc (right): https://codereview.chromium.org/2748103014/diff/1/content/browser/download/download_file_impl.cc#newcode533 content/browser/download/download_file_impl.cc:533: DCHECK_EQ(stream.second->bytes_written(), 0); On 2017/03/21 17:00:16, David Trainor-ping ...

3 years, 9 months ago (2017-03-21 17:48:04 UTC) #8

qinmin


3 years, 9 months ago (2017-03-21 19:29:17 UTC) #9

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
File content/browser/download/download_file_impl.cc (right):

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
content/browser/download/download_file_impl.cc:517: auto initial_source_stream =
source_streams_.find(save_info_->offset);
On 2017/03/21 17:00:16, David Trainor-ping if over 24h wrote:
> Should we just call this stream_iterator or something?  initial_source_stream
> and initial_stream are really close in name.

Done.

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
content/browser/download/download_file_impl.cc:520: // If the initial request is
alive, check if it can help download all the
On 2017/03/21 17:00:16, David Trainor-ping if over 24h wrote:
> Can we add a TODO to figure out allowing adjacent streams to always recover
> their neighbors?

Done.

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
content/browser/download/download_file_impl.cc:528: // The initial stream will
download all data downloading from its offset
On 2017/03/21 17:00:16, David Trainor-ping if over 24h wrote:
> On 2017/03/20 22:33:33, xingliu wrote:
> > On 2017/03/20 20:59:21, qinmin wrote:
> > > On 2017/03/18 22:27:34, xingliu wrote:
> > > > If there is a new slice added, will the initial slice have a length cap?
> > > > 
> > > > We probably run into this block only when the initial stream fails.
> > > 
> > > No. For example, there are streams A, B, and C, ordered by their offset,
> > > and A is the initial stream.
> > > If C fails to write any data, A will check here whether it can download all
> > > the data C should download.
> > > If B has already started writing data, then A's length will be truncated,
> > > so can_recover_from_error will be false. Otherwise, A's length is still
> > > kContentlength, so it can help handle C's portion.
> > 
> > sgtm.
> 
> Hmm that seems like it might be a really common occurrence.  If we start the
> requests like:
> 
> 1. Start A
> 2. Response from A
> 3. Start B, then start C
> 4. Response from B
> 5. Failure for C.
> 
> We just drop the whole request?
> 
> Some alternative thoughts:
> 1. We wait for *all* stream responses to start before truncating the initial
> request.  If any fail we fall back to that initial request for the whole
> download?
> 2. We only fail if we've written a large amount of data to B (either > some
> byte threshold or > some % of the file).

This will add more complexity to the current logic. I would prefer not doing
this for the initial version.
And later, if we keep B half open, we don't need to fail the whole request.
DownloadItemImpl will auto resume the download if an interruption occurs here,
so no user action is needed unless the download fails 5 times.
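
The auto-resume behavior mentioned here amounts to a bounded retry counter,
roughly like the sketch below. The names are illustrative only and are not
copied from DownloadItemImpl; the limit of 5 is taken from the sentence above.

```cpp
// Assumed limit, taken from the "fails 5 times" remark in this thread.
constexpr int kMaxAutoResumeAttempts = 5;

// Hypothetical per-download bookkeeping.
struct ResumeState {
  int auto_resume_count;  // Number of automatic resumptions so far.
};

// Returns true if the download may be resumed automatically, and records the
// attempt. Returns false once the limit is reached, at which point the user
// would have to resume manually.
bool TryAutoResume(ResumeState& state) {
  if (state.auto_resume_count >= kMaxAutoResumeAttempts)
    return false;
  ++state.auto_resume_count;
  return true;
}
```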

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
content/browser/download/download_file_impl.cc:548:
weak_factory_.InvalidateWeakPtrs();
On 2017/03/21 17:00:16, David Trainor-ping if over 24h wrote:
> Should we pull out the shutdown code to a common helper?  Do we have these few
> lines duplicated elsewhere?

No, we don't have the lines duplicated. It is similar to the DestinationComplete
call, but this one has an interruption reason.
So if we want to move it to a common helper, we would have something like
FinishDownload(bool is_download_complete, int interrupt_reason), which doesn't
feel very straightforward. Or we can simplify it to FinishDownload(int
interrupt_reason). Still, it is not straightforward to figure out what the
function actually does.

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
content/browser/download/download_file_impl.cc:553: reason,
TotalBytesReceived(), base::Passed(&hash_state)));
On 2017/03/20 22:33:33, xingliu wrote:
> nit%: Not in this CL, since we allow duplicate IO now, TotalBytesReceived may
> be larger than file size. We probably want to

This won't happen. TotalBytesReceived is calculated from basefile, and
DownloadFileImpl::CalculateBytesToWrite() prohibits duplicate writes to a
particular file offset. So TotalBytesReceived() should reflect the actual file
size, though it could be different from the content-length header. We don't
actually record the bytes received from the network in this class. We can get
that in the StreamActive() call, but the issue is that they are not stored in
the db and not reported to DownloadItemImpl. So only the physical bytes written
to disk are at hand.
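
The idea of prohibiting duplicate writes can be sketched as clamping each write
to what remains of the stream's capped range. This is a simplified illustration
under that assumption, not the actual CalculateBytesToWrite() implementation;
ClampBytesToWrite and kNoLengthCap are hypothetical names.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical sentinel meaning "no length cap, download to the end".
constexpr int64_t kNoLengthCap = -1;

// Returns how many of incoming_bytes a stream may write without running into
// the range owned by the next stream. stream_length is the stream's cap and
// bytes_written is what it has already written.
int64_t ClampBytesToWrite(int64_t stream_length,
                          int64_t bytes_written,
                          int64_t incoming_bytes) {
  if (stream_length == kNoLengthCap)
    return incoming_bytes;  // Uncapped stream: write everything.
  const int64_t remaining = stream_length - bytes_written;
  return std::max<int64_t>(0, std::min(incoming_bytes, remaining));
}
```

Bytes received beyond the cap are dropped, so the bytes on disk never exceed the
file size even if the network delivered more.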

David Trainor- moved to gerrit


3 years, 9 months ago (2017-03-22 03:10:55 UTC) #10

Have some replies inline, but lgtm.

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
File content/browser/download/download_file_impl.cc (right):

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
content/browser/download/download_file_impl.cc:528: // The initial stream will
download all data downloading from its offset
On 2017/03/21 19:29:17, qinmin wrote:
> On 2017/03/21 17:00:16, David Trainor-ping if over 24h wrote:
> > On 2017/03/20 22:33:33, xingliu wrote:
> > > On 2017/03/20 20:59:21, qinmin wrote:
> > > > On 2017/03/18 22:27:34, xingliu wrote:
> > > > > If there is a new slice added, will the initial slice have a length
> > > > > cap?
> > > > > 
> > > > > We probably run into this block only when the initial stream fails.
> > > > 
> > > > No. For example, there are streams A, B, and C, ordered by their offset,
> > > > and A is the initial stream.
> > > > If C fails to write any data, A will check here whether it can download
> > > > all the data C should download.
> > > > If B has already started writing data, then A's length will be truncated,
> > > > so can_recover_from_error will be false. Otherwise, A's length is still
> > > > kContentlength, so it can help handle C's portion.
> > > 
> > > sgtm.
> > 
> > Hmm that seems like it might be a really common occurrence.  If we start the
> > requests like:
> > 
> > 1. Start A
> > 2. Response from A
> > 3. Start B, then start C
> > 4. Response from B
> > 5. Failure for C.
> > 
> > We just drop the whole request?
> > 
> > Some alternative thoughts:
> > 1. We wait for *all* stream responses to start before truncating the initial
> > request.  If any fail we fall back to that initial request for the whole
> > download?
> > 2. We only fail if we've written a large amount of data to B (either > some
> > byte threshold or > some % of the file).
> 
> This will add more complexity to the current logic. I would prefer not doing
> this for the initial version.
> And later, if we keep B half open, we don't need to fail the whole request.
> DownloadItemImpl will auto resume the download if an interruption occurs here,
> so no user action is needed unless the download fails 5 times.

Okay I guess it's fine to postpone this for now.  I just want to make sure we
understand what the typical failure cases are.  Even if we retry automatically,
if the server starts blocking connections the retry could just fail (depending
on how fast we retry) which would be even worse than just having a single
connection to start.

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
content/browser/download/download_file_impl.cc:533:
DCHECK_EQ(stream.second->bytes_written(), 0);
On 2017/03/21 17:48:04, xingliu wrote:
> On 2017/03/21 17:00:16, David Trainor-ping if over 24h wrote:
> > On 2017/03/20 22:33:33, xingliu wrote:
> > > On 2017/03/20 20:59:21, qinmin wrote:
> > > > On 2017/03/18 22:27:34, xingliu wrote:
> > > > > Is it safe to assume streams in the middle never write anything?
> > > > 
> > > > Yes, if a stream in the middle writes something, it will truncate the
> > > > length of all streams prior to it. So those streams will never surpass
> > > > its starting offset.
> > > > As a result, the initial request will fail one of the if conditions
> > > > above.
> > > 
> > > I see.
> > 
> > Who kills the outstanding request if the middle streams haven't responded
> > yet?
> > Does anyone clean up these killed SourceStream objects in this list?
> 
> DownloadFileImpl and all objects under it will get killed after
> DestinationComplete/DestinationError is called, at which point the UI thread
> kills the DownloadFileImpl with ReleaseDownloadFile or something.
> 
> In my understanding, removing SourceStream won't remove the underlying
> pipe (byte stream writer + reader).
> 
> The pipe will be destroyed on Job::Cancel, called on interrupt, or the pipe
> may be depleted after the request is done, and UrlDownloader will remove
> itself, and also remove the underlying url request.
> 
> If we don't get a response for a while, DownloadFileImpl won't create the
> SourceStream for it. The url request may still be open, not sure if somewhere
> has timeout logic. That's probably something worthwhile to figure out.

Thanks!

https://codereview.chromium.org/2748103014/diff/1/content/browser/download/do...
content/browser/download/download_file_impl.cc:548:
weak_factory_.InvalidateWeakPtrs();
On 2017/03/21 19:29:17, qinmin wrote:
> On 2017/03/21 17:00:16, David Trainor-ping if over 24h wrote:
> > Should we pull out the shutdown code to a common helper?  Do we have these
> > few lines duplicated elsewhere?
> 
> No, we don't have the lines duplicated. It is similar to the
> DestinationComplete call, but this one has an interruption reason.
> So if we want to move it to a common helper, we would have something like
> FinishDownload(bool is_download_complete, int interrupt_reason), which doesn't
> feel very straightforward. Or we can simplify it to FinishDownload(int
> interrupt_reason). Still, it is not straightforward to figure out what the
> function actually does.

Yeah makes sense.  Thanks

qinmin

The CQ bit was checked by qinmin@chromium.org to run a CQ dry run

3 years, 9 months ago (2017-03-22 20:41:03 UTC) #11