Description
Update the received slices vector when stream is written to disk
Whenever new data is written to disk, we should update the received slices vector.
This could impact existing SourceStream objects as their length could get truncated.
This CL introduces the logic to update both the received slices vector and existing SourceStream objects.
Because the update is very frequent, each SourceStream now holds an index into the vector so that it can update its slice efficiently.
BUG=644352
Review-Url: https://codereview.chromium.org/2737033002
Cr-Commit-Position: refs/heads/master@{#456796}
Committed: https://chromium.googlesource.com/chromium/src/+/cde3c3a4b12244b382390ee59b5484074e364b6e
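A minimal sketch of the indexing scheme described above; the type names, members, and the OnDataWritten helper are illustrative stand-ins for this sketch, not the actual Chromium classes:

```cpp
// Sketch only: names are illustrative, not the real Chromium types.
#include <cstddef>
#include <cstdint>
#include <vector>

struct ReceivedSlice {
  int64_t offset = 0;          // Where this slice starts in the file.
  int64_t received_bytes = 0;  // Contiguous bytes written so far.
};

class SourceStream {
 public:
  explicit SourceStream(size_t index) : index_(index) {}
  size_t index() const { return index_; }
  void set_index(size_t index) { index_ = index; }

 private:
  // Index of this stream's slice in the shared received-slices vector,
  // adjusted whenever a slice is inserted before it.
  size_t index_;
};

// Write path: an O(1) update instead of a search through the vector.
void OnDataWritten(std::vector<ReceivedSlice>& received_slices,
                   const SourceStream& stream,
                   int64_t bytes_written) {
  received_slices[stream.index()].received_bytes += bytes_written;
}
```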
Patch Set 1 #
Total comments: 9
Patch Set 2 : nits #
Patch Set 3 : merge a new slice with existing ones #
Total comments: 3
Patch Set 4 : fix unit test #
Total comments: 17
Patch Set 5 : addressing comments #
Patch Set 6 : nit #
Messages
Total messages: 42 (22 generated)
qinmin@chromium.org changed reviewers: + dtrainor@chromium.org, xingliu@chromium.org
PTAL
Patchset #1 (id:1) has been deleted
https://codereview.chromium.org/2737033002/diff/20001/content/browser/downloa...
File content/browser/download/download_file_impl.cc (right):

https://codereview.chromium.org/2737033002/diff/20001/content/browser/downloa...
content/browser/download/download_file_impl.cc:451: void DownloadFileImpl::AddNewSlice(int64_t offset, int64_t length) {
This in general sgtm.

Question: after adding a new slice to ReceivedSlice, when we write back to the db, does it mean we will have more db records?

e.g., we resume a parallel download with 2 received slices, we find slices to download, and finally we have 4 received slices.

I think we should avoid creating more and more db records.

https://codereview.chromium.org/2737033002/diff/20001/content/browser/downloa...
File content/browser/download/download_file_impl.h (right):

https://codereview.chromium.org/2737033002/diff/20001/content/browser/downloa...
content/browser/download/download_file_impl.h:136: // created a slice.
nit: Can we explicitly say that written data will result in creating a slice? It's a critical implementation detail. And maybe consider using size_t.

https://codereview.chromium.org/2737033002/diff/20001/content/browser/downloa...
File content/browser/download/parallel_download_utils_unittest.cc (right):

https://codereview.chromium.org/2737033002/diff/20001/content/browser/downloa...
content/browser/download/parallel_download_utils_unittest.cc:60: EXPECT_EQ(slice2, slices[2]);
nit: Can we add another case here that inserts something in the middle?
https://codereview.chromium.org/2737033002/diff/20001/content/browser/downloa...
File content/browser/download/download_file_impl.cc (right):

https://codereview.chromium.org/2737033002/diff/20001/content/browser/downloa...
content/browser/download/download_file_impl.cc:451: void DownloadFileImpl::AddNewSlice(int64_t offset, int64_t length) {
On 2017/03/07 22:18:13, xingliu wrote:
> This in general sgtm.
>
> Question: after adding a new slice to ReceivedSlice, when we write back to
> the db, does it mean we will have more db records?
>
> e.g., we resume a parallel download with 2 received slices, we find slices
> to download, and finally we have 4 received slices.
>
> I think we should avoid creating more and more db records.

Yes, we will. We can merge some of the slices together when the db initializes, but that's not a top priority item.
If we merged new slices with existing slices, we would have to set bytes_received to a non-zero value for each stream, which is very weird.
The slice info will be cleared when the download finishes, so hopefully it won't be there forever.

https://codereview.chromium.org/2737033002/diff/20001/content/browser/downloa...
File content/browser/download/download_file_impl.h (right):

https://codereview.chromium.org/2737033002/diff/20001/content/browser/downloa...
content/browser/download/download_file_impl.h:136: // created a slice.
On 2017/03/07 22:18:13, xingliu wrote:
> nit: Can we explicitly say that written data will result in creating a
> slice? It's a critical implementation detail. And maybe consider using
> size_t.

Done.

https://codereview.chromium.org/2737033002/diff/20001/content/browser/downloa...
File content/browser/download/parallel_download_utils_unittest.cc (right):

https://codereview.chromium.org/2737033002/diff/20001/content/browser/downloa...
content/browser/download/parallel_download_utils_unittest.cc:60: EXPECT_EQ(slice2, slices[2]);
On 2017/03/07 22:18:13, xingliu wrote:
> nit: Can we add another case here that inserts something in the middle?

Done.
lgtm with nits.

https://codereview.chromium.org/2737033002/diff/20001/content/browser/downloa...
File content/browser/download/download_file_impl.cc (right):

https://codereview.chromium.org/2737033002/diff/20001/content/browser/downloa...
content/browser/download/download_file_impl.cc:451: void DownloadFileImpl::AddNewSlice(int64_t offset, int64_t length) {
On 2017/03/08 18:46:33, qinmin wrote:
> On 2017/03/07 22:18:13, xingliu wrote:
> > This in general sgtm.
> >
> > Question: after adding a new slice to ReceivedSlice, when we write back
> > to the db, does it mean we will have more db records?
> >
> > e.g., we resume a parallel download with 2 received slices, we find
> > slices to download, and finally we have 4 received slices.
> >
> > I think we should avoid creating more and more db records.
>
> Yes, we will. We can merge some of the slices together when the db
> initializes, but that's not a top priority item.
> If we merged new slices with existing slices, we would have to set
> bytes_received to a non-zero value for each stream, which is very weird.
> The slice info will be cleared when the download finishes, so hopefully it
> won't be there forever.

Maybe we should test this under very poor network conditions to make sure it won't spawn too many db records.

https://codereview.chromium.org/2737033002/diff/20001/content/browser/downloa...
content/browser/download/download_file_impl.cc:454: int index = AddReceivedSliceToSortedArray(
nit: Just something to consider here: is it possible to add a new slice with an existing offset? That would be dangerous since it would corrupt source stream state.
no lgtm, since downloads are crashing in the Linux build.
Patchset #3 (id:60001) has been deleted
On 2017/03/08 19:51:44, xingliu wrote:
> no lgtm, since downloads are crashing in the Linux build.

Skipped the logic for updating received_slices_ when is_sparse_file_ is false; this should fix the crash.
The new CL now merges a new slice with existing ones. It is more straightforward than I previously thought. PTAL again.
https://codereview.chromium.org/2737033002/diff/20001/content/browser/downloa...
File content/browser/download/download_file_impl.cc (right):

https://codereview.chromium.org/2737033002/diff/20001/content/browser/downloa...
content/browser/download/download_file_impl.cc:454: int index = AddReceivedSliceToSortedArray(
On 2017/03/08 19:45:18, xingliu wrote:
> nit: Just something to consider here: is it possible to add a new slice
> with an existing offset? That would be dangerous since it would corrupt
> source stream state.

This shouldn't happen. A slice will not be added to the db unless data is downloaded, so a new request will always start at offset + data_received.
If the server doesn't support range requests, then we should not use parallel downloading.
The CQ bit was checked by qinmin@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from xingliu@chromium.org
Link to the patchset: https://codereview.chromium.org/2737033002/#ps80001 (title: "merge a new slice with existing ones")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
No L-G-T-M from a valid reviewer yet. CQ run can only be started by full committers or once the patch has received an L-G-T-M from a full committer. Even if an L-G-T-M may have been provided, it was from a non-committer, _not_ a full super star committer. Committers are members of the group "project-chromium-committers". Note that this has nothing to do with OWNERS files.
The CQ bit was checked by qinmin@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: mac_chromium_rel_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
https://codereview.chromium.org/2737033002/diff/80001/content/browser/downloa...
File content/browser/download/parallel_download_utils.cc (right):

https://codereview.chromium.org/2737033002/diff/80001/content/browser/downloa...
content/browser/download/parallel_download_utils.cc:77: if (prev->offset + prev->received_bytes == new_slice.offset) {
Do we merge when (prev->offset + prev->received_bytes > new_slice.offset)?

If we have 3 slices, the first two are finished and then get merged, so we eventually have 2 slices. Does that mean the db will now have 2 records?
We currently use REPLACE, so we probably need to delete a single record?
https://codereview.chromium.org/2737033002/diff/80001/content/browser/downloa...
File content/browser/download/parallel_download_utils.cc (right):

https://codereview.chromium.org/2737033002/diff/80001/content/browser/downloa...
content/browser/download/parallel_download_utils.cc:77: if (prev->offset + prev->received_bytes == new_slice.offset) {
On 2017/03/09 01:33:53, xingliu wrote:
> Do we merge when (prev->offset + prev->received_bytes > new_slice.offset)?
>
> If we have 3 slices, the first two are finished and then get merged, so we
> eventually have 2 slices. Does that mean the db will now have 2 records?
> We currently use REPLACE, so we probably need to delete a single record?

No, that shouldn't happen. If it does, there is a logic error somewhere in our code when calculating how much data we can write.

2 finished slices won't get merged, because that would impact the number of db entries.
Here we only merge a new (unfinished) slice with an existing finished slice, so we won't increase the number of db entries if possible.
For example, if (0, 500) and (1000, 1500) are 2 finished slices, a new slice (500, 600) will be merged with (0, 500). When that new slice finally reaches 1000, we have (0, 1000) and (1000, 1500); those 2 finished slices will not be merged.
https://codereview.chromium.org/2737033002/diff/80001/content/browser/downloa...
File content/browser/download/parallel_download_utils.cc (right):

https://codereview.chromium.org/2737033002/diff/80001/content/browser/downloa...
content/browser/download/parallel_download_utils.cc:77: if (prev->offset + prev->received_bytes == new_slice.offset) {
On 2017/03/09 05:43:53, qinmin wrote:
> On 2017/03/09 01:33:53, xingliu wrote:
> > Do we merge when (prev->offset + prev->received_bytes >
> > new_slice.offset)?
> >
> > If we have 3 slices, the first two are finished and then get merged, so
> > we eventually have 2 slices. Does that mean the db will now have 2
> > records? We currently use REPLACE, so we probably need to delete a
> > single record?
>
> No, that shouldn't happen. If it does, there is a logic error somewhere in
> our code when calculating how much data we can write.
>
> 2 finished slices won't get merged, because that would impact the number
> of db entries.
> Here we only merge a new (unfinished) slice with an existing finished
> slice, so we won't increase the number of db entries if possible.
> For example, if (0, 500) and (1000, 1500) are 2 finished slices, a new
> slice (500, 600) will be merged with (0, 500). When that new slice finally
> reaches 1000, we have (0, 1000) and (1000, 1500); those 2 finished slices
> will not be merged.

sgtm.
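A sketch of the merge rule settled on above, using the (offset, received_bytes) representation; the function name and signature are hypothetical for this sketch, not the actual parallel_download_utils API:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct ReceivedSlice {
  int64_t offset;
  int64_t received_bytes;
};

// Hypothetical helper: insert |new_slice| into |slices|, which is kept
// sorted by offset. If the slice immediately earlier in the file ends
// exactly where |new_slice| begins, fold the new bytes into that slice
// instead of adding a record. Returns the index of the slice that now
// accounts for the new bytes.
size_t AddOrMergeReceivedSlice(std::vector<ReceivedSlice>& slices,
                               const ReceivedSlice& new_slice) {
  // Find the first slice whose offset is >= new_slice.offset.
  size_t i = 0;
  while (i < slices.size() && slices[i].offset < new_slice.offset)
    ++i;

  // Merge only with the preceding slice; stored offsets are never changed,
  // so two already-finished slices are never coalesced here.
  if (i > 0) {
    ReceivedSlice& prev = slices[i - 1];
    if (prev.offset + prev.received_bytes == new_slice.offset) {
      prev.received_bytes += new_slice.received_bytes;  // No new record.
      return i - 1;
    }
  }
  slices.insert(slices.begin() + i, new_slice);
  return i;
}
```

For example, given a finished slice {0, 500}, a new slice starting at offset 500 folds into it (becoming {0, 600} for 100 new bytes) and the record count stays flat, whereas a new slice at offset 600 would get its own entry.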
The CQ bit was checked by qinmin@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Unit tests are now fixed. PTAL again.
https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
File content/browser/download/download_file_impl.cc (right):

https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
content/browser/download/download_file_impl.cc:323: // If the write operation creates a new slice, add it to the
Move this comment after the if (!is_sparse_file_)?

https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
content/browser/download/download_file_impl.cc:326: if (!is_sparse_file_)
Starting to wonder if in the future we should consider treating !is_sparse_file_ as is_sparse_file_ with a single slice and removing the boolean.

https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
content/browser/download/download_file_impl.cc:331: received_slices_[source_stream->index()].received_bytes +=
Should the stream just hold a reference to the associated slice directly? Feels like that would clean some of this up.

https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
File content/browser/download/parallel_download_utils.cc (right):

https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
content/browser/download/parallel_download_utils.cc:31: bool compareReceivedSlices(const DownloadItem::ReceivedSlice& lhs,
Comparator operators on DownloadItem::ReceivedSlice would work as well, but either way is ok with me.

https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
File content/browser/download/parallel_download_utils_unittest.cc (right):

https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
content/browser/download/parallel_download_utils_unittest.cc:69: EXPECT_EQ(500, slices[0].offset);
EXPECT_EQ(slice1, slices[0])?

https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
content/browser/download/parallel_download_utils_unittest.cc:75: EXPECT_EQ(slice3, slices[0]);
EXPECT_EQ(slice1, slices[1])?

https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
content/browser/download/parallel_download_utils_unittest.cc:83: // A new slice can only merge with a slice in front of it, not behind it.
Is the comment flipped?
https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
File content/browser/download/download_file_impl.cc (right):

https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
content/browser/download/download_file_impl.cc:323: // If the write operation creates a new slice, add it to the
On 2017/03/09 17:44:02, David Trainor-ping if over 24h wrote:
> Move this comment after the if (!is_sparse_file_)?

Done.

https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
content/browser/download/download_file_impl.cc:326: if (!is_sparse_file_)
On 2017/03/09 17:44:02, David Trainor-ping if over 24h wrote:
> Starting to wonder if in the future we should consider treating
> !is_sparse_file_ as is_sparse_file_ with a single slice and removing the
> boolean.

Updated the TODO in the header file. There could still be some issues in calculating the hash in BaseFile; we need to figure that out first.

https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
content/browser/download/download_file_impl.cc:331: received_slices_[source_stream->index()].received_bytes +=
On 2017/03/09 17:44:02, David Trainor-ping if over 24h wrote:
> Should the stream just hold a reference to the associated slice directly?
> Feels like that would clean some of this up.

Holding a pointer/reference is not safe when the |received_slices_| vector changes. An index is much easier to update when the vector changes: just +/-1, or no change at all. With pointers/references, we would need to recalculate the slice pointer for each source stream.

https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
File content/browser/download/parallel_download_utils.cc (right):

https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
content/browser/download/parallel_download_utils.cc:31: bool compareReceivedSlices(const DownloadItem::ReceivedSlice& lhs,
On 2017/03/09 17:44:02, David Trainor-ping if over 24h wrote:
> Comparator operators on DownloadItem::ReceivedSlice would work as well,
> but either way is ok with me.

Added a TODO here.

https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
File content/browser/download/parallel_download_utils_unittest.cc (right):

https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
content/browser/download/parallel_download_utils_unittest.cc:69: EXPECT_EQ(500, slices[0].offset);
On 2017/03/09 17:44:03, David Trainor-ping if over 24h wrote:
> EXPECT_EQ(slice1, slices[0])?

No, they are not equal. slice1 is (500, 1000); slices[0] is (500, 1400).

https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
content/browser/download/parallel_download_utils_unittest.cc:75: EXPECT_EQ(slice3, slices[0]);
On 2017/03/09 17:44:02, David Trainor-ping if over 24h wrote:
> EXPECT_EQ(slice1, slices[1])?

Same as above.

https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
content/browser/download/parallel_download_utils_unittest.cc:83: // A new slice can only merge with a slice in front of it, not behind it.
On 2017/03/09 17:44:02, David Trainor-ping if over 24h wrote:
> Is the comment flipped?

No, this is correct. A new slice (50, 100) can only merge with (0, 50), not with (100, 150). We do this because we don't want to change the offset of an existing stored slice.
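A small hypothetical helper to illustrate the index-adjustment argument above (the name and signature are made up for this sketch): after a slice is inserted into the vector, each stream's index either shifts by one or stays put, whereas a raw pointer into the vector could have been invalidated by the insertion.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical: |stream_indices| holds each stream's index into the
// received-slices vector. After inserting a slice at |insert_index|,
// every index at or past that position moves one slot to the right.
void AdjustStreamIndicesAfterInsert(std::vector<size_t>& stream_indices,
                                    size_t insert_index) {
  for (size_t& index : stream_indices) {
    if (index >= insert_index)
      ++index;  // This stream's slice shifted right by one.
  }
}
```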
The CQ bit was checked by xingliu@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
ping, dtrainor@, would you please take another look?
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
lgtm % nit.

https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
File content/browser/download/parallel_download_utils_unittest.cc (right):

https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
content/browser/download/parallel_download_utils_unittest.cc:69: EXPECT_EQ(500, slices[0].offset);
On 2017/03/09 21:44:58, qinmin wrote:
> On 2017/03/09 17:44:03, David Trainor-ping if over 24h wrote:
> > EXPECT_EQ(slice1, slices[0])?
>
> No, they are not equal. slice1 is (500, 1000); slices[0] is (500, 1400).

Ah yeah, thanks! I forgot the merge somehow :D.

https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
content/browser/download/parallel_download_utils_unittest.cc:83: // A new slice can only merge with a slice in front of it, not behind it.
On 2017/03/09 21:44:58, qinmin wrote:
> On 2017/03/09 17:44:02, David Trainor-ping if over 24h wrote:
> > Is the comment flipped?
>
> No, this is correct. A new slice (50, 100) can only merge with (0, 50),
> not with (100, 150). We do this because we don't want to change the offset
> of an existing stored slice.

I think the wording regarding "in front of" and "behind" is confusing. "Earlier in the file" or "later in the file"?
https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
File content/browser/download/parallel_download_utils_unittest.cc (right):

https://codereview.chromium.org/2737033002/diff/100001/content/browser/downlo...
content/browser/download/parallel_download_utils_unittest.cc:83: // A new slice can only merge with a slice in front of it, not behind it.
On 2017/03/14 15:07:31, David Trainor-ping if over 24h wrote:
> On 2017/03/09 21:44:58, qinmin wrote:
> > On 2017/03/09 17:44:02, David Trainor-ping if over 24h wrote:
> > > Is the comment flipped?
> >
> > No, this is correct. A new slice (50, 100) can only merge with (0, 50),
> > not with (100, 150). We do this because we don't want to change the
> > offset of an existing stored slice.
>
> I think the wording regarding "in front of" and "behind" is confusing.
> "Earlier in the file" or "later in the file"?

Done.
The CQ bit was checked by qinmin@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from xingliu@chromium.org, dtrainor@chromium.org
Link to the patchset: https://codereview.chromium.org/2737033002/#ps140001 (title: "nit")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
CQ is committing da patch. Bot data: {"patchset_id": 140001, "attempt_start_ts": 1489516368106580, "parent_rev": "aa3e611c9d8b00201fab5d36af2758b618e94b05", "commit_rev": "cde3c3a4b12244b382390ee59b5484074e364b6e"}
Message was sent while issue was closed.
Description was changed from
==========
Update the received slices vector when stream is written to disk

Whenever new data is written to disk, we should update the received slices vector. This could impact existing SourceStream objects as their length could get truncated. This CL introduces the logic to update both the received slices vector and existing SourceStream objects. Because the update is very frequent, each SourceStream now holds an index into the vector so that it can update its slice efficiently.

BUG=644352
==========
to
==========
Update the received slices vector when stream is written to disk

Whenever new data is written to disk, we should update the received slices vector. This could impact existing SourceStream objects as their length could get truncated. This CL introduces the logic to update both the received slices vector and existing SourceStream objects. Because the update is very frequent, each SourceStream now holds an index into the vector so that it can update its slice efficiently.

BUG=644352
Review-Url: https://codereview.chromium.org/2737033002
Cr-Commit-Position: refs/heads/master@{#456796}
Committed: https://chromium.googlesource.com/chromium/src/+/cde3c3a4b12244b382390ee59b54...
==========
Message was sent while issue was closed.
Committed patchset #6 (id:140001) as https://chromium.googlesource.com/chromium/src/+/cde3c3a4b12244b382390ee59b54...