Created: 3 years, 10 months ago by qinmin
Modified: 3 years, 10 months ago
Reviewers: asanka, David Trainor- moved to gerrit, sky, Steven Holte, xingliu, Scott Hess - ex-Googler
CC: chromium-reviews, grt+watch_chromium.org
Target Ref: refs/pending/heads/master
Project: chromium
Visibility: Public.
Description: add a download slices table into history download db
To support parallel downloads, a download item may have multiple jobs.
Each job writes to the target file at a different offset. We need to store the information for each slice so that download resumption can work.
This CL adds a new table to store the slice information.
This change also increases the history db version by 1.
BUG=644352
Review-Url: https://codereview.chromium.org/2665243003
Cr-Commit-Position: refs/heads/master@{#450606}
Committed: https://chromium.googlesource.com/chromium/src/+/b2925e8fc7cacf09215b6c9c12b975fb7a9bdee5
Patch Set 1 #
Total comments: 2
Patch Set 2 : increase history db version #
Total comments: 18
Patch Set 3 : addressing comments #
Total comments: 26
Patch Set 4 : addressing comments #
Total comments: 4
Patch Set 5 : use Select changes() instead of select count(*) #
Total comments: 22
Patch Set 6 : addressing comments #
Total comments: 29
Patch Set 7 : addressing comments #
Patch Set 8 : Simplify DownloadJobInfo #
Total comments: 1
Patch Set 9 : rename download_job_info to download_slice_info #
Total comments: 14
Patch Set 10 : nits #
Total comments: 2
Patch Set 11 : nit #
Patch Set 12 : add test file to BUILD.gn #
Messages
Total messages: 83 (32 generated)
qinmin@chromium.org changed reviewers: + asanka@chromium.org, dtrainor@chromium.org, sdefresne@chromium.org, xingliu@chromium.org
PTAL
Patchset #1 (id:1) has been deleted
The CQ bit was checked by qinmin@chromium.org to run a CQ dry run
Dry run: CQ is trying the patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Is it possible to come up with a scheme where the shard state can be discovered by what's on the disk? It seems reliance on this table will make parallel downloads very fragile.

While we keep the primary download information in the history DB, I'm a bit apprehensive about storing shard info there due to how fickle I think shard states are going to be. We are already dealing with issues related to on-disk state diverging from what's kept in downloads history. Adding shard state would exacerbate this. Ultimately, regardless of what the history DB has to say about the individual download jobs, we'd need to:

* Examine the disk to locate the download job artifacts for the purpose of cleanup.
* Examine the on-disk artifacts to know how much progress was made by each job. We are unlikely to be able to update the job state for all shards on browser kill / shutdown.

So for example, if all the download jobs for the download with GUID xxx-yyy-zzz were named:

xxx-yyy-zzz.<hhh>.crdownload

where <hhh> is the hex encoded start offset, then by listing and stat()ing such files, we'd know the offset and length of each job without needing to worry about whether the history DB is up-to-date. Also, the intended size of the job file is irrelevant. The only thing that matters is how far along the job got before it was killed/interrupted.

We don't need to store the state/interrupt reason for these jobs. If the download is not continuable, we'd need to summarily remove all the job artifacts and the primary partial file. The specific reason doesn't affect how we handle the download jobs.

https://codereview.chromium.org/2665243003/diff/20001/components/history/core...
File components/history/core/browser/download_database.cc (right):

https://codereview.chromium.org/2665243003/diff/20001/components/history/core...
components/history/core/browser/download_database.cc:487: if (IsParallelDownloadingEnabled()) {
The DB schema shouldn't depend on a feature state.
It's okay to rev up the schema version. Doing so lets us know when we introduced this table and will be important if we want to modify/remove it later. Also, it doesn't matter that the feature isn't available by default on the stable channel.
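As a rough illustration of the strawman above, discovering shard state purely from the disk could look like the following. This is an illustrative Python sketch, not Chromium code; the `<guid>.<hex offset>.crdownload` naming is the hypothetical scheme from the comment, and `discover_shards` is an invented helper name.

```python
import os
import re

# Hypothetical naming scheme from the strawman above (not Chromium's actual
# implementation): <guid>.<hex start offset>.crdownload
SHARD_RE = re.compile(
    r"^(?P<guid>[0-9a-zA-Z-]+)\.(?P<offset>[0-9a-fA-F]+)\.crdownload$")

def discover_shards(directory):
    """Recover (guid, start offset, bytes written) for each shard on disk."""
    shards = []
    for name in os.listdir(directory):
        m = SHARD_RE.match(name)
        if not m:
            continue
        # stat() alone tells us how far the job got; no history DB needed.
        size = os.stat(os.path.join(directory, name)).st_size
        shards.append((m.group("guid"), int(m.group("offset"), 16), size))
    return sorted(shards)
```

The point of the scheme is visible here: the filename encodes which download and which offset a shard belongs to, and the file size is the progress, so the on-disk state is self-describing even if the DB is stale.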
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: ios-device on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-device/builds...)
Our current plan is to use a single file, not stitching shards together, for parallel download. So instead of introducing lots of crdownload files, we just create a big file (UC browser creates a file with 60% of its original size), and ask each download job to write to the target offset in that file. This is also the behavior for UC browser: https://docs.google.com/document/d/18KAE4KACcXrnGD860w1Rsvjqk1527YOXAl0HmNyc4...

Using small shards can probably save some space, but it will introduce name conflicts and disk failure issues when stitching shards together. And a large number of downloads don't have strong validators, so having the GUID as the intermediate file name seems very weird.

Using a single file can also be problematic if Chrome crashes. We have to roll back the received bytes for each job, but that should be OK, since most downloads won't see a lot of browser crashes, and losing 10 seconds of data might be acceptable. (I am working on UMA to check how much extra data we download.)

On 2017/02/01 22:34:10, asanka wrote:
> Is it possible to come up with a scheme where the shard state can be discovered
> by what's on the disk? It seems reliance on this table will make parallel
> downloads very fragile.
>
> While we keep the primary download information in the history DB, I'm a bit
> apprehensive about storing shard info there due to how fickle I think shard
> states are going to be. We are already dealing with issues related to on-disk
> state diverging from what's kept in downloads history. Adding shard state would
> exacerbate this. Ultimately regardless of what the history DB has to say about
> the individual download jobs, we'd need to:
>
> * Examine the disk to locate the download job artifacts for the purpose of
> cleanup.
> * Examine the on disk artifacts to know how much progress was made by each job.
> We are unlikely to be able to update the job state for all shards on browser
> kill / shutdown.
> So for example, if all the download jobs for download with GUID xxx-yyy-zzz was
> named:
>
> xxx-yyy-zzz.<hhh>.crdownload
>
> where <hhh> is the hex encoded start offset, then by listing and stat()ing such
> files, we'd know the offset and length of each job without needing to worry
> about whether the history DB is up-to-date. Also, the intended size of the job
> file is irrelevant. The only thing that matters is how far along the job got
> before it was killed/interrupted.
>
> We don't need to store the state/interrupt reason for these jobs. If the
> download is not continuable, we'd need to summarily remove all the job artifacts
> and the primary partial file. The specific reason doesn't affect how we handle
> reason WRT download jobs.
>
> https://codereview.chromium.org/2665243003/diff/20001/components/history/core...
> File components/history/core/browser/download_database.cc (right):
>
> https://codereview.chromium.org/2665243003/diff/20001/components/history/core...
> components/history/core/browser/download_database.cc:487: if
> (IsParallelDownloadingEnabled()) {
> The DB schema shouldn't depend on a feature state.
>
> It's okay to rev up the schema version. Doing so lets us know when we introduced
> this table and will be important if we want to modify/remove it later.
>
> Also, it doesn't matter that the feature isn't available by default on the
> stable channel.
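The single-file approach described above can be sketched in outline. This is illustrative Python under POSIX assumptions, not the Chromium implementation; `preallocate` and `job_write` are invented helper names.

```python
import os

def preallocate(path, size):
    # Create the big target file up front so every job has a place to write.
    with open(path, "wb") as f:
        f.truncate(size)

def job_write(path, offset, data):
    # Each download job writes its received bytes at its own target offset.
    # os.pwrite (POSIX) does not move a shared file position, so concurrent
    # jobs don't race on seek+write.
    fd = os.open(path, os.O_WRONLY)
    try:
        os.pwrite(fd, data, offset)
    finally:
        os.close(fd)
```

This is also where the crash-recovery concern comes from: the file size no longer tells you how much each job received, so per-job received-byte counts have to be persisted (and possibly rolled back) elsewhere, which is what the new table is for.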
https://codereview.chromium.org/2665243003/diff/20001/components/history/core...
File components/history/core/browser/download_database.cc (right):

https://codereview.chromium.org/2665243003/diff/20001/components/history/core...
components/history/core/browser/download_database.cc:487: if (IsParallelDownloadingEnabled()) {
On 2017/02/01 22:34:10, asanka wrote:
> The DB schema shouldn't depend on a feature state.
>
> It's okay to rev up the schema version. Doing so lets us know when we introduced
> this table and will be important if we want to modify/remove it later.
>
> Also, it doesn't matter that the feature isn't available by default on the
> stable channel.
Removed all feature-state checks in this file, so the downloads_jobs table will now always be created. However, the table and DownloadRow.download_job_info will remain empty until we launch this feature.
Description was changed from
==========
add an optional download jobs table into history download db

To support parallel downloads, a download item may have multiple jobs. For each job, we need to store their information so download resumption can work. This CL adds a new table to store the job information.

The feature is controlled by a flag and is still WIP. Since the table is optional, no database version change for now. Will increase the history db version once this feature goes to public.

BUG=644352
==========
to
==========
add an optional download jobs table into history download db

To support parallel downloads, a download item may have multiple jobs. For each job, we need to store their information so download resumption can work. This CL adds a new table to store the job information.

This change also increases the history db version by 1

BUG=644352
==========
asanka@chromium.org changed reviewers: + sky@chromium.org
https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
File components/history/core/browser/download_constants.h (right):

https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
components/history/core/browser/download_constants.h:8: #include "base/feature_list.h"
Not needed.

https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
File components/history/core/browser/download_database.cc (right):

https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
components/history/core/browser/download_database.cc:323: !GetDB().DoesTableExist("downloads_jobs") &&
This condition isn't relevant here, since neither kSchema nor kUrlChainSchema assumes downloads_jobs doesn't exist.

https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
components/history/core/browser/download_database.cc:326: ret = ret && EnsureDownloadJobsTableExists();
Roll EnsureDownloadJobsTableExists in here for consistency. Or pull out downloads_url_chain initialization into a separate function. Either is fine.

https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
components/history/core/browser/download_database.cc:507: if (!base::ContainsKey(info_map, id))
Might as well front-load this check. There's no point populating the rest of DownloadJobInfo if this condition is false. Could you also add metrics to measure how often this happens?

https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
components/history/core/browser/download_database.cc:758: "id INTEGER NOT NULL," // downloads.id.
Probably call this |download_id| since it doesn't uniquely identify a record in this table.

https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
components/history/core/browser/download_database.cc:766: "interrupt_reason INTEGER NOT NULL, " // DownloadInterruptReason
How do you plan to use |state| and |interrupt_reason|?

https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
components/history/core/browser/download_database.cc:785: job_exists = statement_query.ColumnInt(0) > 0;
Separate UpdateDownloadJob and CreateDownloadJob. An overwhelming majority of the time we are just trying to update an existing download job (possibly tens of times a second). It's much more efficient to deal with missing |downloads_jobs| entries by handling the failure of the "UPDATE" query instead of doing a "SELECT count(*)" before every update.

Also, can't new download jobs be created at any point in the lifetime of a download? The |is_new_download| condition doesn't really capture the set of cases where we don't expect the job to already exist in the DB.

https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
components/history/core/browser/download_database.cc:786: // There should not be any jobs in downloads_jobs for this id. If there
This doesn't really follow from the (!is_new_download && job_exists) case. A download job created at any point of the download's lifetime should assert that there isn't a conflicting download job in the table.

https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
File components/history/core/browser/download_database.h (right):

https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
components/history/core/browser/download_database.h:32: static bool IsParallelDownloadingEnabled();
Not needed.
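For illustration, a downloads_jobs schema along the lines being discussed (a |download_id| column that does not uniquely identify a row, with a job keyed by its offset) might look like the following. This is a sketch in Python/SQLite; the exact column set is an assumption, not the Chromium schema.

```python
import sqlite3

# Illustrative schema along the lines discussed: the column is named
# download_id because it does not uniquely identify a row; a job is
# identified by (download_id, offset). The column set is an assumption,
# not the exact Chromium schema.
JOBS_SCHEMA = """
CREATE TABLE downloads_jobs (
  download_id INTEGER NOT NULL,     -- downloads.id
  offset INTEGER NOT NULL,          -- starting byte offset of this job
  received_bytes INTEGER NOT NULL,  -- bytes written so far
  PRIMARY KEY (download_id, offset)
)"""

db = sqlite3.connect(":memory:")
db.execute(JOBS_SCHEMA)
# One download (id 1) with two parallel jobs at different offsets.
db.executemany("INSERT INTO downloads_jobs VALUES (?, ?, ?)",
               [(1, 0, 4096), (1, 1 << 20, 1024)])
```

Making (download_id, offset) the identity of a job is what allows several rows per download while still letting an UPDATE or a conflict check target one specific job.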
On 2017/02/01 at 23:07:10, qinmin wrote:
> Our current plan is to use a single file, not stitching shards together for parallel download.
> So instead of introducing lots of crdownload files, we just create a big file (UC browser creates a file with 60% of its original size), and ask each download job to write to the target offset in that file.
> This is also the behavior for UC browser: https://docs.google.com/document/d/18KAE4KACcXrnGD860w1Rsvjqk1527YOXAl0HmNyc4...

This isn't a feature where compatibility is an issue, so let's evaluate technical decisions on their own merits.

Can you double check the frequency of database updates that this change is going to introduce? It's probably fine, but I've grown a bit apprehensive about relying on DB operations being fast. The 10 second delay in transaction commit may not be your problem if DB updates are scheduled with an increasingly longer delay as we go. We don't want to overwhelm the DB thread if the DB is running slow, so it's worth checking with folks about whether a sustained stream of updates at a high frequency is OK.

> Using small shards can probably save some space, but it will introduce name conflict and disk failure issues when stitching shards together. And a large number of downloads don't have strong validators, so having GUID as intermediate file name seems very weird.

If GUIDs start colliding, we are going to have other problems :) My comment was merely a strawman, but the use of GUID in the name was so that the filename + stat() would immediately tell us exactly which download a shard file corresponds to, its starting offset, and its length (so far). Its intended length is irrelevant to intermediate state.

I understand that some things are easier said than done, so ultimately ease of implementation is something I can't comment on. So I'm okay with your decision either way.

> Using a single file can also be problematic if chrome crashes.
We have to rollback the received bytes for each job, but that should be OK since most download won't have a lot of browser crashes and losing 10 seconds of data might be ok. (I am working on UMA to check how much extra data we download). > > > > > On 2017/02/01 22:34:10, asanka wrote: > > Is it possible to come up with a scheme where the shard state can be discovered > > by what's on the disk? It seems reliance on this table will make parallel > > downloads very fragile. > > > > While we keep the primary download information in the history DB, I'm a bit > > apprehensive about storing shard info there due to how fickle I think shard > > states are going to be. We are already dealing with issues related to on-disk > > state diverging from what's kept in downloads history. Adding shard state would > > exacerbate this. Ultimately regardless of what the history DB has to say about > > the individual download jobs, we'd need to: > > > > * Examine the disk to locate the download job artifacts for the purpose of > > cleanup. > > * Examine the on disk artifacts to know how much progress was made by each job. > > We are unlikely to be able to update the job state for all shards on browser > > kill / shutdown. > > > > So for example, if all the download jobs for download with GUID xxx-yyy-zzz was > > named: > > > > xxx-yyy-zzz.<hhh>.crdownload > > > > where <hhh> is the hex encoded start offset, then by listing and stat()ing such > > files, we'd know the offset and length of each job without needing to worry > > about whether the history DB is up-to-date. Also, the intended size of the job > > file is irrelevant. The only thing that matters is how far along the job got > > before it was killed/interrupted. > > > > We don't need to store the state/interrupt reason for these jobs. If the > > download is not continuable, we'd need to summarily remove all the job artifacts > > and the primary partial file. 
The specific reason doesn't affect how we handle > > reason WRT download jobs. > > > > https://codereview.chromium.org/2665243003/diff/20001/components/history/core... > > File components/history/core/browser/download_database.cc (right): > > > > https://codereview.chromium.org/2665243003/diff/20001/components/history/core... > > components/history/core/browser/download_database.cc:487: if > > (IsParallelDownloadingEnabled()) { > > The DB schema shouldn't depend on a feature state. > > > > It's okay to rev up the schema version. Doing so lets us know when we introduced > > this table and will be important if we want to modify/remove it later. > > > > Also, it doesn't matter that the feature isn't available by default on the > > stable channel.
+sky: for /components/history/
https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
File components/history/core/browser/download_constants.h (right):

https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
components/history/core/browser/download_constants.h:8: #include "base/feature_list.h"
On 2017/02/02 19:06:27, asanka wrote:
> Not needed.
Done. Removed.

https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
File components/history/core/browser/download_database.cc (right):

https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
components/history/core/browser/download_database.cc:323: !GetDB().DoesTableExist("downloads_jobs") &&
On 2017/02/02 19:06:27, asanka wrote:
> This condition isn't relevant here since neither kSchema nor kUrlChainSchema
> assume downloads_jobs doesn't exist.
Done. Removed; it just feels weird that the downloads_jobs table exists even if both downloads and downloads_url_chain don't exist.

https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
components/history/core/browser/download_database.cc:326: ret = ret && EnsureDownloadJobsTableExists();
On 2017/02/02 19:06:27, asanka wrote:
> Roll EnsureDownloadJobsTableExists here for consistency. Or pull out
> downloads_url_chain initialization into a separate function. Either is fine.
Done. Moved all the code here.

https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
components/history/core/browser/download_database.cc:507: if (!base::ContainsKey(info_map, id))
On 2017/02/02 19:06:27, asanka wrote:
> Might as well front load this check. There's no point populating the rest of
> DownloadJobInfo if this condition is false.
>
> Could you also add metrics to measure how often this happens?
Done. Added a new UMA entry.

https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
components/history/core/browser/download_database.cc:758: "id INTEGER NOT NULL," // downloads.id.
On 2017/02/02 19:06:27, asanka wrote:
> Probably call this |download_id| since it doesn't uniquely identify a record in
> this table.
Done.

https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
components/history/core/browser/download_database.cc:766: "interrupt_reason INTEGER NOT NULL, " // DownloadInterruptReason
On 2017/02/02 19:06:27, asanka wrote:
> How do you plan to use |state| and |interrupt_reason| ?
I was thinking about whether we should keep the state for later resumption. For example, if a download has 2 jobs, and the 2nd job is interrupted with a SERVER_FORBIDDEN reason and the received bytes are 0, we probably don't want to create the 2nd job when resuming the download later on. Of course we can always create all the jobs on resumption and let them fail, but I am wondering whether we should avoid doing that. Having that information around could also help us change the logic later on, and add some UMAs to study whether creating multiple jobs is desired for certain interrupt reasons.

https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
components/history/core/browser/download_database.cc:785: job_exists = statement_query.ColumnInt(0) > 0;
On 2017/02/02 19:06:28, asanka wrote:
> Separate UpdateDownloadJob and CreateDownloadJob.
>
> An overwhelming majority of the time we are just trying to update an existing
> download job (possibly tens of times a second). It's much more efficient to deal
> with missing |downloads_jobs| entries by handling the failure of the "UPDATE"
> query instead of doing a "SELECT count(*)" before every update.
>
> Also can't new download jobs be created at any point in the lifetime of a
> download? The |is_new_download| condition doesn't really capture the set of
> cases where we don't expect the job to already exist in the DB.
Done.
Separated UpdateDownloadJob from CreateDownloadJob. If the SQL in UpdateDownloadJob fails, it will call CreateDownloadJob to create a new one.

https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
components/history/core/browser/download_database.cc:786: // There should not be any jobs in downloads_jobs for this id. If there
On 2017/02/02 19:06:27, asanka wrote:
> This doesn't really follow from the (!is_new_download && job_exists) case. A
> download job created at any point of the download's lifetime should assert that
> there isn't a conflicting download job in the table.
Done.

https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
File components/history/core/browser/download_database.h (right):

https://codereview.chromium.org/2665243003/diff/40001/components/history/core...
components/history/core/browser/download_database.h:32: static bool IsParallelDownloadingEnabled();
On 2017/02/02 19:06:28, asanka wrote:
> Not needed.
Done. Removed.
On 2017/02/02 19:29:08, asanka wrote:
> On 2017/02/01 at 23:07:10, qinmin wrote:
> > Our current plan is to use a single file, not stitching shards together for
> parallel download.
> > So instead of introducing lots of crdownload files, we just create a big
> file(UC browser create a file with 60% of its original size), and ask each
> download job to write to the target offset in that file.
> > This is also the behavior for UC browser:
> https://docs.google.com/document/d/18KAE4KACcXrnGD860w1Rsvjqk1527YOXAl0HmNyc4...
>
> This isn't a feature where compatibility is an issue, so let's evaluate
> technical decisions on their own merits.
>
> Can you double check the frequency of database updates that this change is going
> to introduce? It's probably fine, but I've grown a bit apprehensive about
> relying on DB operations being fast. The 10 second delay in transaction commit
> may not be your problem if DB updates are scheduled with an increasingly longer
> delay as we go. We don't want to overwhelm the DB thread if the DB is running
> slow, so it's worth checking with folks about whether a sustained stream of
> updates at a high frequency is OK.

I captured some traces of this on an Android One device. Since we don't have the parallel downloading implementation yet, I created 4 separate downloads to simulate 4 download jobs running at the same time. The result looks OK. We have about 5 db updates each second, and each update takes an average of 0.3 ms; the history thread is mostly idle. So I think we will not be impacted by this change later on when we start supporting parallel download.

> > Using small shards can probably save some space, but it will introduce name
> conflict and disk failure issues when stitching shards together. And a large
> number of downloads don't have strong validators, so having GUID as intermediate
> file name seems very weird.
> > If GUIDs start colliding we are going to have other problems :) My comment was > merely a strawman, but the use of GUID in the name was so that the filename + > stat() will immediately tell us exactly which download a shard file corresponds > to, its starting offset, and its length (so far). Its intended length is > irrelevant to intermediate state. > > I understand that some things are easier said than done, so ultimately ease of > implementation is something I can't comment on. So I'm okay with your decision > either way. > > > Using a single file can also be problematic if chrome crashes. We have to > rollback the received bytes for each job, but that should be OK since most > download won't have a lot of browser crashes and losing 10 seconds of data might > be ok. (I am working on UMA to check how much extra data we download). > > > > > > > > > > On 2017/02/01 22:34:10, asanka wrote: > > > Is it possible to come up with a scheme where the shard state can be > discovered > > > by what's on the disk? It seems reliance on this table will make parallel > > > downloads very fragile. > > > > > > While we keep the primary download information in the history DB, I'm a bit > > > apprehensive about storing shard info there due to how fickle I think shard > > > states are going to be. We are already dealing with issues related to > on-disk > > > state diverging from what's kept in downloads history. Adding shard state > would > > > exacerbate this. Ultimately regardless of what the history DB has to say > about > > > the individual download jobs, we'd need to: > > > > > > * Examine the disk to locate the download job artifacts for the purpose of > > > cleanup. > > > * Examine the on disk artifacts to know how much progress was made by each > job. > > > We are unlikely to be able to update the job state for all shards on browser > > > kill / shutdown. 
> > > > > > So for example, if all the download jobs for download with GUID xxx-yyy-zzz > was > > > named: > > > > > > xxx-yyy-zzz.<hhh>.crdownload > > > > > > where <hhh> is the hex encoded start offset, then by listing and stat()ing > such > > > files, we'd know the offset and length of each job without needing to worry > > > about whether the history DB is up-to-date. Also, the intended size of the > job > > > file is irrelevant. The only thing that matters is how far along the job got > > > before it was killed/interrupted. > > > > > > We don't need to store the state/interrupt reason for these jobs. If the > > > download is not continuable, we'd need to summarily remove all the job > artifacts > > > and the primary partial file. The specific reason doesn't affect how we > handle > > > reason WRT download jobs. > > > > > > > https://codereview.chromium.org/2665243003/diff/20001/components/history/core... > > > File components/history/core/browser/download_database.cc (right): > > > > > > > https://codereview.chromium.org/2665243003/diff/20001/components/history/core... > > > components/history/core/browser/download_database.cc:487: if > > > (IsParallelDownloadingEnabled()) { > > > The DB schema shouldn't depend on a feature state. > > > > > > It's okay to rev up the schema version. Doing so lets us know when we > introduced > > > this table and will be important if we want to modify/remove it later. > > > > > > Also, it doesn't matter that the feature isn't available by default on the > > > stable channel.
https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
File components/history/core/browser/download_database.cc (right):

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
components/history/core/browser/download_database.cc:324: "state INTEGER NOT NULL," // To be determined.
Is this comment right? It seems like you're using it.

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
components/history/core/browser/download_database.cc:493: sql::Statement statement_download_job(GetDB().GetCachedStatement(
DownloadDatabase::QueryDownload is incredibly long. Can you move querying the jobs into its own function? QueryDownload can still call it, but it's easier to reason about smaller functions.

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
components/history/core/browser/download_database.cc:504: continue;
Wouldn't this indicate corruption? Should we log it?

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
components/history/core/browser/download_database.cc:596: return false;
Is it possible to get test coverage of this code path?

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
components/history/core/browser/download_database.cc:616: sql::Statement statement_download_job(GetDB().GetCachedStatement(
Is there test coverage of this code path?

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
components/history/core/browser/download_database.cc:751: RemoveDownloadJobs(id);
Is there test coverage of this?

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
File components/history/core/browser/download_database.h (right):

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
components/history/core/browser/download_database.h:17: #include "components/history/core/browser/download_job_info.h"
Forward declare DownloadJobInfo?

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
File components/history/core/browser/download_job_info.cc (right):

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
components/history/core/browser/download_job_info.cc:1: // Copyright (c) 2017 The Chromium Authors. All rights reserved.
no (c)

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
components/history/core/browser/download_job_info.cc:12: : download_id(0),
kInvalidDownloadId?

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
components/history/core/browser/download_job_info.cc:16: state(DownloadState::IN_PROGRESS),
You sure you don't want interrupted?

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
File components/history/core/browser/download_job_info.h (right):

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
components/history/core/browser/download_job_info.h:17: // multiple download job associated with it, with each job has its own
'download job' -> 'download jobs'
'job has' -> 'job having'

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
components/history/core/browser/download_job_info.h:31: bool operator==(const DownloadJobInfo&) const;
Do you have test coverage of this?

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
File components/history/core/browser/history_backend_db_unittest.cc (right):

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
components/history/core/browser/history_backend_db_unittest.cc:1019: id, 0, 0, 1000, received, DownloadState::INTERRUPTED,
Can you change the two zeros to non-zero values that aren't identical? Better to verify the two aren't transposed in code and/or using a default of 0.
https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
File components/history/core/browser/download_database.cc (right):

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
components/history/core/browser/download_database.cc:324: "state INTEGER NOT NULL," // To be determined.
On 2017/02/03 16:02:33, sky wrote:
> Is this comment right? It seems like you're using it.
Done.

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
components/history/core/browser/download_database.cc:493: sql::Statement statement_download_job(GetDB().GetCachedStatement(
On 2017/02/03 16:02:33, sky wrote:
> DownloadDatabase::QueryDownload is incredibly long. Can you move querying the
> jobs into its own function? QueryDownload can still call it, but it's easier to
> reason about smaller functions.
Done.

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
components/history/core/browser/download_database.cc:504: continue;
On 2017/02/03 16:02:33, sky wrote:
> Wouldn't this indicate corruption? Should we log it?
Done.

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
components/history/core/browser/download_database.cc:596: return false;
On 2017/02/03 16:02:33, sky wrote:
> Is it possible to get test coverage of this code path?
Done. Added the HistoryBackendDBTest.UpdateDownloadWithNewJob() test.

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
components/history/core/browser/download_database.cc:616: sql::Statement statement_download_job(GetDB().GetCachedStatement(
On 2017/02/03 16:02:33, sky wrote:
> Is there test coverage of this code path?
Done. HistoryBackendDBTest.ConfirmDownloadInProgressCleanup() now tests this.

https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
components/history/core/browser/download_database.cc:751: RemoveDownloadJobs(id); On 2017/02/03 16:02:33, sky wrote: > Is there test coverage of this? Done. HistoryBackendDBTest.ConfirmDownloadRowCreateAndDelete() now tests this https://codereview.chromium.org/2665243003/diff/60001/components/history/core... File components/history/core/browser/download_database.h (right): https://codereview.chromium.org/2665243003/diff/60001/components/history/core... components/history/core/browser/download_database.h:17: #include "components/history/core/browser/download_job_info.h" On 2017/02/03 16:02:33, sky wrote: > Forward declare DownloadJobInfo? Done. https://codereview.chromium.org/2665243003/diff/60001/components/history/core... File components/history/core/browser/download_job_info.cc (right): https://codereview.chromium.org/2665243003/diff/60001/components/history/core... components/history/core/browser/download_job_info.cc:1: // Copyright (c) 2017 The Chromium Authors. All rights reserved. On 2017/02/03 16:02:34, sky wrote: > no (c) Done. https://codereview.chromium.org/2665243003/diff/60001/components/history/core... components/history/core/browser/download_job_info.cc:12: : download_id(0), On 2017/02/03 16:02:34, sky wrote: > kInvalidDownloadId? Done. https://codereview.chromium.org/2665243003/diff/60001/components/history/core... components/history/core/browser/download_job_info.cc:16: state(DownloadState::IN_PROGRESS), On 2017/02/03 16:02:34, sky wrote: > You sure you don't want interrupted? When a job is created, it should always be in the state of IN_PROGRESS. So using IN_PROGRESS is justified. https://codereview.chromium.org/2665243003/diff/60001/components/history/core... File components/history/core/browser/download_job_info.h (right): https://codereview.chromium.org/2665243003/diff/60001/components/history/core...
components/history/core/browser/download_job_info.h:17: // multiple download job associated with it, with each job has its own On 2017/02/03 16:02:34, sky wrote: > 'download job' -> 'download jobs' > 'job has' -> ;job having' Done. https://codereview.chromium.org/2665243003/diff/60001/components/history/core... components/history/core/browser/download_job_info.h:31: bool operator==(const DownloadJobInfo&) const; On 2017/02/03 16:02:34, sky wrote: > Do you have test coverage of this? added download_job_info_unittest.cc to test this https://codereview.chromium.org/2665243003/diff/60001/components/history/core... File components/history/core/browser/history_backend_db_unittest.cc (right): https://codereview.chromium.org/2665243003/diff/60001/components/history/core... components/history/core/browser/history_backend_db_unittest.cc:1019: id, 0, 0, 1000, received, DownloadState::INTERRUPTED, On 2017/02/03 16:02:34, sky wrote: > Can you change the two zeros to non-zero values that aren't identical? Better to > verify the two aren't transposed in code and/or using a default of 0. Done. https://codereview.chromium.org/2665243003/diff/80001/components/history/core... File components/history/core/browser/download_database.cc (right): https://codereview.chromium.org/2665243003/diff/80001/components/history/core... components/history/core/browser/download_database.cc:760: sql::Statement statement_query(GetDB().GetCachedStatement( Unfortunately, update statement always return true. So I have to use a separate select count(*) statement here. Given that we don't have too much db updates per second (only 5 updates if 4 jobs are created), this shouldn't cause too much issue. Or a more efficient approach would be ask HistoryService to specifically call CreateDowloadJob() whenever a download job is added. So later calls to UpdateDownload() will no longer need to check if the job exists or not, wdyt?
sky@chromium.org changed reviewers: + shess@chromium.org
+shess for following question https://codereview.chromium.org/2665243003/diff/80001/components/history/core... File components/history/core/browser/download_database.cc (right): https://codereview.chromium.org/2665243003/diff/80001/components/history/core... components/history/core/browser/download_database.cc:760: sql::Statement statement_query(GetDB().GetCachedStatement( On 2017/02/04 00:06:57, qinmin wrote: > Unfortunately, update statement always return true. So I have to use a separate > select count(*) statement here. Given that we don't have too much db updates per > second (only 5 updates if 4 jobs are created), this shouldn't cause too much > issue. > Or a more efficient approach would be ask HistoryService to specifically call > CreateDowloadJob() whenever a download job is added. So later calls to > UpdateDownload() will no longer need to check if the job exists or not, wdyt? http://stackoverflow.com/questions/15277373/sqlite-upsert-update-or-insert mentions INSERT OR IGNORE. But I'm not sure what is best. Adding shess for his thoughts.
https://codereview.chromium.org/2665243003/diff/80001/components/history/core... File components/history/core/browser/download_database.cc (right): https://codereview.chromium.org/2665243003/diff/80001/components/history/core... components/history/core/browser/download_database.cc:760: sql::Statement statement_query(GetDB().GetCachedStatement( On 2017/02/06 16:29:04, sky wrote: > On 2017/02/04 00:06:57, qinmin wrote: > > Unfortunately, update statement always return true. So I have to use a > separate > > select count(*) statement here. Given that we don't have too much db updates > per > > second (only 5 updates if 4 jobs are created), this shouldn't cause too much > > issue. > > Or a more efficient approach would be ask HistoryService to specifically call > > CreateDowloadJob() whenever a download job is added. So later calls to > > UpdateDownload() will no longer need to check if the job exists or not, wdyt? > > http://stackoverflow.com/questions/15277373/sqlite-upsert-update-or-insert > mentions INSERT OR IGNORE. But I'm not sure what is best. Adding shess for his > thoughts. INSERT OR IGNORE doesn't improve the performance of update because we will have to call INSERT OR IGNORE first, then call UPDATE. So each UPDATE now becomes 2 operations. What we really need is an INSERT OR UPDATE statement, but that is missing in sqlite. Another alternative is to call "SELECT changes()" after each UPDATE statement, which will return the number of rows that were updated by the last update call. This should be a very efficient call since it doesn't require any query into the database and all the history db operations are on the same thread.
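The UPDATE-then-check pattern described above can be sketched in plain SQLite. This is an illustration only, using Python's sqlite3 module and a hypothetical jobs table; the real Chromium code goes through sql::Statement in C++ and the actual schema differs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (download_id INTEGER, job_id INTEGER, "
             "received_bytes INTEGER, PRIMARY KEY (download_id, job_id))")

def update_or_insert_job(download_id, job_id, received_bytes):
    # Try the UPDATE first. "SELECT changes()" reports how many rows the
    # most recent INSERT/UPDATE/DELETE on this connection modified, without
    # re-querying the table (unlike a SELECT count(*)).
    conn.execute(
        "UPDATE jobs SET received_bytes=? WHERE download_id=? AND job_id=?",
        (received_bytes, download_id, job_id))
    (changed,) = conn.execute("SELECT changes()").fetchone()
    if changed == 0:
        # No existing row for this (download_id, job_id): fall back to INSERT.
        conn.execute(
            "INSERT INTO jobs (download_id, job_id, received_bytes) "
            "VALUES (?, ?, ?)", (download_id, job_id, received_bytes))

update_or_insert_job(1, 0, 100)   # first call inserts the row
update_or_insert_job(1, 0, 500)   # second call updates it in place
```

Since changes() only inspects connection-local state, the second round trip is cheap; this matches the tracing numbers quoted below, where "select changes()" was measurably faster than "select count(*)".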
https://codereview.chromium.org/2665243003/diff/80001/components/history/core... File components/history/core/browser/download_database.cc (right): https://codereview.chromium.org/2665243003/diff/80001/components/history/core... components/history/core/browser/download_database.cc:760: sql::Statement statement_query(GetDB().GetCachedStatement( On 2017/02/06 16:41:53, qinmin wrote: > On 2017/02/06 16:29:04, sky wrote: > > On 2017/02/04 00:06:57, qinmin wrote: > > > Unfortunately, update statement always return true. So I have to use a > > separate > > > select count(*) statement here. Given that we don't have too much db updates > > per > > > second (only 5 updates if 4 jobs are created), this shouldn't cause too much > > > issue. > > > Or a more efficient approach would be ask HistoryService to specifically > call > > > CreateDowloadJob() whenever a download job is added. So later calls to > > > UpdateDownload() will no longer need to check if the job exists or not, > wdyt? > > > > http://stackoverflow.com/questions/15277373/sqlite-upsert-update-or-insert > > mentions INSERT OR IGNORE. But I'm not sure what is best. Adding shess for his > > thoughts. > > INSERT OR IGNORE doesn't improve the performance of update because we will have > to call INSERT OR IGNORE first, then call UPDATE. So each UPDATE now becomes 2 > operation. What we really need is INSERT OR UPDATE statement, but that is > missing in sqlite. > Another alternative is to call "SELECT changes()" after each UPDATE statement, > that will return the number of rows that was updated during the last update > call. This should be a very efficient call since it doesn't require any query > into the database and all the history db operations are on the same thread. Compared the performance of "SELECT changes" and "SELECT count(*)" through tracing. "SELECT Count(*)" takes about 0.15 ms, while "select changes()" takes an average of 0.05 ms. So select changes() is much more efficient. 
The UPDATE statement takes about 0.3ms. Uploaded a new patchset to use "select changes()".
The CQ bit was checked by qinmin@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: android_cronet on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_cron...) ios-device on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-device/builds...) ios-simulator-xcode-clang on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-simulator-xco...)
https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... File components/history/core/browser/download_database.cc (right): https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... components/history/core/browser/download_database.cc:341: GetDB().Execute(kJobsSchema)); IMHO this is getting convoluted. Someone coming along later is going to have to decode what this all logically means before they can make their change. I think it would be worthwhile to invest the time now to straighten out the logic of what gets created when. To be honest, I'm not entirely clear the previous code is even correct. The if clause seems to allow the downloads table to exist without requiring a downloads_url_chains table, while the else clause seems to imply that they both should exist. I don't know what to make of that at all, and the comment doesn't help me understand. https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... components/history/core/browser/download_database.cc:716: RemoveDownloadJobs(id); Are these statements all in a containing transaction? Note that sql::Transaction is basically free if nested. So there's no downside to having extra transactions around. But this code doesn't seem to use them, so I figure there's already a transaction out there, so just wanted to verify. DCHECK(db().transaction_nesting()) would require a transaction... https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... components/history/core/browser/download_database.cc:792: std::map<uint32_t, DownloadRow*>* download_row_map) { If you keep the out pointer, you should DCHECK it, because you won't touch it (and generate a NULL deref) if the table contains no data. https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... components/history/core/browser/download_database.cc:801: // See the comment above about SQLITE lacking unsigned integers. 
404 - Comment Not Found That said - why does it need to be unsigned? I would expect that DownloadId would be literally the id in the downloads table, in which case it should mostly be an opaque thing generated by SQLite which you carry from one SQLite query to another SQLite query to reference that row. The higher-level code shouldn't really be making assumptions about structure, only uniqueness. [Though now I see that someone higher-level wants to create the ids ahead of time. Oh, well, hopefully both sources of truth stay in sync.] https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... components/history/core/browser/download_database.cc:808: int download_id = IntToDownloadId(signed_id); This should either be DownloadId to match the function you're calling, or uint32_t to match the map you're keying. https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... components/history/core/browser/download_database.cc:810: // record. The record is still in the database, right? Is there code elsewhere which brings this back in line, or does the record just live unloved forever? I guess the same applies to the invalid download id case above. https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... File components/history/core/browser/download_database.h (right): https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... components/history/core/browser/download_database.h:126: void RemoveDownloadJobs(uint32_t id); Seems like this should be DownloadId. https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... components/history/core/browser/download_database.h:130: void QueryDownloadJobs(std::map<uint32_t, DownloadRow*>* download_row_map); This too. Also, you should be able to return a map<>, right? Though I haven't dug around related code to see if that's the style in these parts. 
https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... File components/history/core/browser/download_job_info.h (right): https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... components/history/core/browser/download_job_info.h:17: // multiple download jobis associated with it, with each job having its own probably s/jobis/job ids/ or something like. https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... File components/history/core/browser/history_database.cc (right): https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... components/history/core/browser/history_database.cc:543: } maintain the whitespace line between clauses.
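The signed/unsigned concern raised above comes from SQLite storing every integer as a signed 64-bit value. A uint32 download id therefore round-trips without wrapping, but a value read back must still be validated before narrowing. A minimal sketch (Python sqlite3, hypothetical table; the validation mirrors but is not the actual IntToDownloadId logic):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE downloads (id INTEGER PRIMARY KEY)")

# SQLite integers are signed 64-bit, so a value near the top of the uint32
# range (up to 2**32 - 1) is stored exactly, not truncated to signed 32-bit.
big_uint32 = 2**32 - 5
conn.execute("INSERT INTO downloads (id) VALUES (?)", (big_uint32,))
(signed_id,) = conn.execute("SELECT id FROM downloads").fetchone()

def to_uint32_id(value):
    # Validate before narrowing: values outside the uint32 range, or equal
    # to a hypothetical invalid-id sentinel of 0, indicate a corrupt row.
    if value <= 0 or value > 2**32 - 1:
        return None
    return value
```

The higher-level point stands regardless: the id should mostly be treated as an opaque key carried between queries, with range checks only at the DB boundary.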
https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... File components/history/core/browser/download_database.cc (right): https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... components/history/core/browser/download_database.cc:560: !CreateDownloadJob(data.download_job_info[i])) { Oh, as I noted elsewhere, since update is touching all columns (either overwriting them or using them in WHERE clause), you could perhaps use INSERT OR REPLACE and consolidate update/create in one function.
On 2017/02/06 19:25:02, qinmin wrote: > https://codereview.chromium.org/2665243003/diff/80001/components/history/core... > File components/history/core/browser/download_database.cc (right): > > https://codereview.chromium.org/2665243003/diff/80001/components/history/core... > components/history/core/browser/download_database.cc:760: sql::Statement > statement_query(GetDB().GetCachedStatement( > On 2017/02/06 16:41:53, qinmin wrote: > > On 2017/02/06 16:29:04, sky wrote: > > > On 2017/02/04 00:06:57, qinmin wrote: > > > > Unfortunately, update statement always return true. So I have to use a > > > separate > > > > select count(*) statement here. Given that we don't have too much db > updates > > > per > > > > second (only 5 updates if 4 jobs are created), this shouldn't cause too > much > > > > issue. > > > > Or a more efficient approach would be ask HistoryService to specifically > > call > > > > CreateDowloadJob() whenever a download job is added. So later calls to > > > > UpdateDownload() will no longer need to check if the job exists or not, > > wdyt? > > > > > > http://stackoverflow.com/questions/15277373/sqlite-upsert-update-or-insert > > > mentions INSERT OR IGNORE. But I'm not sure what is best. Adding shess for > his > > > thoughts. > > > > INSERT OR IGNORE doesn't improve the performance of update because we will > have > > to call INSERT OR IGNORE first, then call UPDATE. So each UPDATE now becomes 2 > > operation. What we really need is INSERT OR UPDATE statement, but that is > > missing in sqlite. > > Another alternative is to call "SELECT changes()" after each UPDATE statement, > > that will return the number of rows that was updated during the last update > > call. This should be a very efficient call since it doesn't require any query > > into the database and all the history db operations are on the same thread. > > Compared the performance of "SELECT changes" and "SELECT count(*)" through > tracing. 
"SELECT Count(*)" takes about 0.15 ms, while "select changes()" takes > an average of 0.05 ms. So select changes() is much more efficient. The UPDATE > statement takes about 0.3ms. Uploaded a new patchset to use "select changes()". I wouldn't let micro-benchmarks drive this. The biggest consideration is disk I/O, and whether you write things as "select then insert or update" or "insert then update if fail" or "update then insert if fail" doesn't really matter. Each variant is going to touch the same db pages, so as long as you're in a transaction the cache will keep them all approximately the same cost. I think your best options are: - INSERT OR REPLACE if you're always updating the entire row. You won't find out if the row was pre-existing, but in most cases you don't really need to know that. - INSERT, on fail UPDATE if you expect not to have an existing row. - UPDATE, on fail INSERT if you expect to have an existing row. But, like I said, I don't think there's a strong argument versus SELECT then INSERT or UPDATE, especially if they make things harder to read. WRT the insert stream, I don't think the idleness of the history thread is the best thing to consider. Generating frequent COMMIT calls is not great simply because it can have poor interactions with the overall system, both at the filesystem level, and down in the flash controller (cheap flash controllers can cause horrible results with lots of file I/O and frequent fsync calls). Is this code in the long-running history transaction? If so, it doesn't much matter. If it's not in the long-running history transaction, then I think you shouldn't update more frequently than once every 5s or 10s (or even every 30s or 60s). Obviously there's a tradeoff in terms of duplicate downloading, here, but it seems likely to me that the majority bins are "Downloads happen fast and you hardly need to even track the jobs" and "Downloads absolutely suck and are slow with lots of retries". 
In the first case, checkpointing frequency hardly matters, in the second case checkpoints hardly demonstrate progress.
On 2017/02/06 21:40:48, Scott Hess wrote: > On 2017/02/06 19:25:02, qinmin wrote: > > > https://codereview.chromium.org/2665243003/diff/80001/components/history/core... > > File components/history/core/browser/download_database.cc (right): > > > > > https://codereview.chromium.org/2665243003/diff/80001/components/history/core... > > components/history/core/browser/download_database.cc:760: sql::Statement > > statement_query(GetDB().GetCachedStatement( > > On 2017/02/06 16:41:53, qinmin wrote: > > > On 2017/02/06 16:29:04, sky wrote: > > > > On 2017/02/04 00:06:57, qinmin wrote: > > > > > Unfortunately, update statement always return true. So I have to use a > > > > separate > > > > > select count(*) statement here. Given that we don't have too much db > > updates > > > > per > > > > > second (only 5 updates if 4 jobs are created), this shouldn't cause too > > much > > > > > issue. > > > > > Or a more efficient approach would be ask HistoryService to specifically > > > call > > > > > CreateDowloadJob() whenever a download job is added. So later calls to > > > > > UpdateDownload() will no longer need to check if the job exists or not, > > > wdyt? > > > > > > > > http://stackoverflow.com/questions/15277373/sqlite-upsert-update-or-insert > > > > mentions INSERT OR IGNORE. But I'm not sure what is best. Adding shess for > > his > > > > thoughts. > > > > > > INSERT OR IGNORE doesn't improve the performance of update because we will > > have > > > to call INSERT OR IGNORE first, then call UPDATE. So each UPDATE now becomes > 2 > > > operation. What we really need is INSERT OR UPDATE statement, but that is > > > missing in sqlite. > > > Another alternative is to call "SELECT changes()" after each UPDATE > statement, > > > that will return the number of rows that was updated during the last update > > > call. This should be a very efficient call since it doesn't require any > query > > > into the database and all the history db operations are on the same thread. 
> > > > Compared the performance of "SELECT changes" and "SELECT count(*)" through > > tracing. "SELECT Count(*)" takes about 0.15 ms, while "select changes()" takes > > an average of 0.05 ms. So select changes() is much more efficient. The UPDATE > > statement takes about 0.3ms. Uploaded a new patchset to use "select > changes()". > > I wouldn't let micro-benchmarks drive this. The biggest consideration is disk > I/O, and whether you write things as "select then insert or update" or "insert > then update if fail" or "update then insert if fail" doesn't really matter. > Each variant is going to touch the same db pages, so as long as you're in a > transaction the cache will keep them all approximately the same cost. > > I think your best options are: > > - INSERT OR REPLACE if you're always updating the entire row. You won't find > out if the row was pre-existing, but in most cases you don't really need to know > that. > - INSERT, on fail UPDATE if you expect not to have an existing row. > - UPDATE, on fail INSERT if you expect to have an existing row. > > But, like I said, I don't think there's a strong argument versus SELECT then > INSERT or UPDATE, especially if they make things harder to read. > > WRT the insert stream, I don't think the idleness of the history thread is the > best thing to consider. Generating frequent COMMIT calls is not great simply > because it can have poor interactions with the overall system, both at the > filesystem level, and down in the flash controller (cheap flash controllers can > cause horrible results with lots of file I/O and frequent fsync calls). Is this > code in the long-running history transaction? If so, it doesn't much matter. > If it's not in the long-running history transaction, then I think you shouldn't > update more frequently than once every 5s or 10s (or even every 30s or 60s). 
> Obviously there's a tradeoff in terms of duplicate downloading, here, but it > seems likely to me that the majority bins are "Downloads happen fast and you > hardly need to even track the jobs" and "Downloads absolutely suck and are slow > with lots of retries". In the first case, checkpointing frequency hardly > matters, in the second case checkpoints hardly demonstrate progress. Thanks Scott for the information. Yes, the new code will be part of the long-running history transaction. History db has a commit schedule of every 10 seconds, so this is unlikely to cause a lot of I/O interactions.
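Scott's first option, a single INSERT OR REPLACE over the full row, works whenever every column is either part of the uniqueness constraint or being overwritten. A sketch under the same hypothetical schema as before (Python sqlite3; not the actual Chromium statement):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (download_id INTEGER, job_id INTEGER, "
             "received_bytes INTEGER, PRIMARY KEY (download_id, job_id))")

def create_or_update_job(download_id, job_id, received_bytes):
    # Every column is either in the primary key or supplied here, so one
    # INSERT OR REPLACE serves as both create and update: on a primary-key
    # conflict the old row is deleted and the new one inserted.
    conn.execute(
        "INSERT OR REPLACE INTO jobs (download_id, job_id, received_bytes) "
        "VALUES (?, ?, ?)", (download_id, job_id, received_bytes))

create_or_update_job(7, 0, 0)      # first write creates the row
create_or_update_job(7, 0, 4096)   # second write replaces it
```

One caveat to keep in mind: because OR REPLACE deletes the conflicting row and inserts a fresh one, any column not supplied falls back to its default instead of keeping its old value, and delete triggers (if any) fire. Neither matters when the caller always writes the whole row, which is the case Scott describes.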
https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... File components/history/core/browser/download_database.cc (right): https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... components/history/core/browser/download_database.cc:341: GetDB().Execute(kJobsSchema)); On 2017/02/06 21:22:16, Scott Hess wrote: > IMHO this is getting convoluted. Someone coming along later is going to have to > decode what this all logically means before they can make their change. I think > it would be worthwhile to invest the time now to straighten out the logic of > what gets created when. > > To be honest, I'm not entirely clear the previous code is even correct. The if > clause seems to allow the downloads table to exist without requiring a > downloads_url_chains table, while the else clause seems to imply that they both > should exist. I don't know what to make of that at all, and the comment doesn't > help me understand. The issue is related to database versions. The downloads_url_chains table was introduced in version 24, so it is possible that the downloads table exists while downloads_url_chains doesn't. There are migration functions to reintroduce this table when users upgrade from v23 to something >24. I added comments here to describe the logic about downloads_url_chains and downloads_jobs. https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... components/history/core/browser/download_database.cc:560: !CreateDownloadJob(data.download_job_info[i])) { On 2017/02/06 21:23:51, Scott Hess wrote: > Oh, as I noted elsewhere, since update is touching all columns (either > overwriting them or using them in WHERE clause), you could perhaps use INSERT OR > REPLACE and consolidate update/create in one function. Done. Switched to use REPLACE (short form of INSERT OR REPLACE) and changed the function name to CreateOrUpdateDownloadJob(). https://codereview.chromium.org/2665243003/diff/100001/components/history/cor...
components/history/core/browser/download_database.cc:716: RemoveDownloadJobs(id); On 2017/02/06 21:22:16, Scott Hess wrote: > Are these statements all in a containing transaction? > > Note that sql::Transaction is basically free if nested. So there's no downside > to having extra transactions around. But this code doesn't seem to use them, so > I figure there's already a transaction out there, so just wanted to verify. > > DCHECK(db().transaction_nesting()) would require a transaction... Yes, all these statements are in a long-running transaction. https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... components/history/core/browser/download_database.cc:792: std::map<uint32_t, DownloadRow*>* download_row_map) { On 2017/02/06 21:22:16, Scott Hess wrote: > If you keep the out pointer, you should DCHECK it, because you won't touch it > (and generate a NULL deref) if the table contains no data. Done. https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... components/history/core/browser/download_database.cc:801: // See the comment above about SQLITE lacking unsigned integers. Changed the comment to make more sense. The comment just refers to some sentences above saying that sqlite ints are signed, so we need to convert it to unsigned DownloadId. On 2017/02/06 21:22:16, Scott Hess wrote: > 404 - Comment Not Found > > That said - why does it need to be unsigned? I would expect that DownloadId > would be literally the id in the downloads table, in which case it should mostly > be an opaque thing generated by SQLite which you carry from one SQLite query to > another SQLite query to reference that row. The higher-level code shouldn't > really be making assumptions about structure, only uniqueness. [Though now I > see that someone higher-level wants to create the ids ahead of time. Oh, well, > hopefully both sources of truth stay in sync.] https://codereview.chromium.org/2665243003/diff/100001/components/history/cor...
components/history/core/browser/download_database.cc:808: int download_id = IntToDownloadId(signed_id); On 2017/02/06 21:22:16, Scott Hess wrote: > This should either be DownloadId to match the function you're calling, or > uint32_t to match the map you're keying. Done. https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... components/history/core/browser/download_database.cc:810: // record. On 2017/02/06 21:22:16, Scott Hess wrote: > The record is still in the database, right? Is there code elsewhere which > brings this back in line, or does the record just live unloved forever? > > I guess the same applies to the invalid download id case above. Call RemoveDownloadJobs() to remove the job from db. For the negative download id above, added a TODO. Fixing that will require changing the RemoveDownloadJobs() definition to support negative ids. https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... File components/history/core/browser/download_database.h (right): https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... components/history/core/browser/download_database.h:126: void RemoveDownloadJobs(uint32_t id); On 2017/02/06 21:22:16, Scott Hess wrote: > Seems like this should be DownloadId. Done. https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... components/history/core/browser/download_database.h:130: void QueryDownloadJobs(std::map<uint32_t, DownloadRow*>* download_row_map); On 2017/02/06 21:22:16, Scott Hess wrote: > This too. > > Also, you should be able to return a map<>, right? Though I haven't dug around > related code to see if that's the style in these parts. Done changing it to DownloadId. For the return values, we have a "void QueryDownloads(std::vector<DownloadRow>* results)" at the top of this file, so I think it is ok to use the input param for output. This is also more efficient than making a new map as output.
https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... File components/history/core/browser/download_job_info.h (right): https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... components/history/core/browser/download_job_info.h:17: // multiple download jobis associated with it, with each job having its own On 2017/02/06 21:22:16, Scott Hess wrote: > probably s/jobis/job ids/ or something like. Done. https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... File components/history/core/browser/history_database.cc (right): https://codereview.chromium.org/2665243003/diff/100001/components/history/cor... components/history/core/browser/history_database.cc:543: } On 2017/02/06 21:22:16, Scott Hess wrote: > maintain the whitespace line between clauses. Done.
The CQ bit was checked by qinmin@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
reviewers, PTAL again. @asanka, I added some comments in DownloadDatabase::InitDownloadTable(), please check if they are correct.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
qinmin@chromium.org changed reviewers: - sdefresne@chromium.org
LGTM % comments https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... File components/history/core/browser/download_database.cc (right): https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/download_database.cc:331: // If the "downloads" table exist, "download_url_chain" might not be there nit: s/exist/exists/ https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/download_database.cc:781: if (signed_id <= static_cast<int64_t>(kInvalidDownloadId)) { Nit: Introduce IsValidDownloadId() or somesuch and move this check there. download_types.h implies that kInvalidDownloadId should be treated as opaque. https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/download_database.cc:794: DCHECK(found); Nit: remove DCHECK. This is an inconsistency in the DB, not one in the browser. And we've already allowed for the possibility of negative download IDs. https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... File components/history/core/browser/download_job_info.cc (right): https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/download_job_info.cc:12: : download_id(kInvalidDownloadId), initialize job_id https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... File components/history/core/browser/history_backend_db_unittest.cc (right): https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/history_backend_db_unittest.cc:1049: // Create the DB. nit: remove. here and below. https://codereview.chromium.org/2665243003/diff/120001/tools/metrics/histogra... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/2665243003/diff/120001/tools/metrics/histogra... 
tools/metrics/histograms/histograms.xml:12502: + Whether a download can be found for a download job in the DB. Should mention that the histogram is recorded for each download job row each time the database is loaded.
lgtm % nits https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... File components/history/core/browser/download_database.cc (right): https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/download_database.cc:565: if (!CreateOrUpdateDownloadJob(data.download_job_info[i])) Do we have to clean up any other state we persisted if this fails? https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/download_database.cc:758: return statement_replace.Run();; Remove extra ;
lgtm % nits
https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... File components/history/core/browser/download_database.cc (right): https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/download_database.cc:333: // create it later. Sigh. This is why we can't have nice things. Having this code probe the schema while other code blindly accepts the version isn't great. Does this code have unit tests to verify the schema transitions, so at least we know when a change breaks the world? Also, I don't see any tests specifically targeting this migration. Those tests need to be around so that future changes don't break you. I don't know what the standards are in this code (*), but you should definitely have a test which verifies that the v32->v33 transition works. (*) IMHO the best case is having golden files so that you can reasonably test migration chains and have high fidelity with actual user schema. Synthetic schema are attractive to developers, but if the synthetic schema departs from the actual schema embedded in user profiles, it makes the tests pointless. https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/download_database.cc:565: if (!CreateOrUpdateDownloadJob(data.download_job_info[i])) On 2017/02/08 07:54:05, David Trainor wrote: > Do we have to clean up any other state we persisted if this fails? Unfortunately, since this is in the long-running transaction, I'm not sure that there is anything in place to do that. But that's intrinsic to the design of this module, I think. https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/download_database.cc:781: if (signed_id <= static_cast<int64_t>(kInvalidDownloadId)) { On 2017/02/08 03:50:50, asanka wrote: > Nit: Introduce IsValidDownloadId() or somesuch and move this check there. 
> download_types.h implies that kInvalidDownloadId should be treated as opaque. It might even make sense to have the signature be: bool ConvertIntToDownloadId(int64_t int_version, DownloadId* out) WARN_UNUSED_RESULT; Then callers can't simply assume it all works. That said ... I'm not entirely clear what the use case is, here. Presumably this code is not inserting invalid ids in the first place, so this should never happen. If it does happen ... what's special about this case which deserves special handling, in a module where nothing else does much in the way of sanitization? Incomplete ad-hoc error handling is almost worse than nothing at all, sometimes. https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/download_database.cc:791: bool found = base::ContainsKey(*download_row_map, download_id); Reading this and the (*download_row_map)[download_id]-> at the bottom makes me wonder if it would make sense to rearrange this code a bit. One option would be to use find() here and save the iterator for there, rather than having separate calls. Another option would be to optimistically create the DownloadJobInfo, the put all of the test-and-set code in one place. Actually, this check also can subsume the invalid-id check (which I assume you expect to never happen and are only coding out of an abundance of caution). Then you could rearrange this block to look like: - Build |info| - find() the row iter - if end(), error handling - else add |info| to the row. https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... File components/history/core/browser/download_database.h (right): https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... 
components/history/core/browser/download_database.h:127: void QueryDownloadJobs(std::map<uint32_t, DownloadRow*>* download_row_map); I still want this key to be a DownloadId, but I can see that the overall file isn't consistent in use of DownloadId versus uint32_t, so I'll just encourage a someday-if-you-have-time cleanup. https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... File components/history/core/browser/download_job_info.h (right): https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/download_job_info.h:33: // The id of the download in the database. Not changed by UpdateDownload(). "Not changed by UpdateDownload()" seem irrelevant. It's passed as a const&, so UpdateDownload() can't change it. https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/download_job_info.h:37: int job_id; AFAICT, this CL isn't wired in to anything, so I can't really see how you plan to use it, but based in the tuple (start_position, length, received_bytes), I would guess you're going to do something like: - Setup a bunch of sub-jobs for the overall download. - Make progress, updating received_bytes and state. - Finish and remove the sub-jobs. In that case, start_position and length should uniquely identify the job, and, indeed, start_position alone would too. In fact, start_position probably must be unique. So do you need a job_id, or is start_position sufficient? [I don't feel super strongly about this, it's a trivial change to the database space. I'm actually more concerned about whether the data model fits usage well.]
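[Editor's sketch] The `ConvertIntToDownloadId()` signature shess proposes above could look roughly like this. Everything here is a stand-in for illustration: `DownloadId` is assumed to be a `uint32_t` typedef with `kInvalidDownloadId` as its zero value (mirroring what the check `signed_id <= static_cast<int64_t>(kInvalidDownloadId)` implies), and `[[nodiscard]]` stands in for Chromium's `WARN_UNUSED_RESULT`:

```cpp
#include <cassert>
#include <cstdint>

// Assumed stand-ins for the history component's types.
using DownloadId = uint32_t;
constexpr DownloadId kInvalidDownloadId = 0;

// Returns false (leaving |out| at kInvalidDownloadId) when the stored 64-bit
// value cannot be a valid id, so callers are forced to check the result
// instead of assuming the conversion always works.
[[nodiscard]] bool ConvertIntToDownloadId(int64_t int_version,
                                          DownloadId* out) {
  *out = kInvalidDownloadId;
  if (int_version <= static_cast<int64_t>(kInvalidDownloadId) ||
      int_version > static_cast<int64_t>(UINT32_MAX)) {
    return false;  // Negative, zero, or out-of-range for uint32_t.
  }
  *out = static_cast<DownloadId>(int_version);
  return true;
}
```

A failed conversion leaves the id at `kInvalidDownloadId`, so a subsequent map lookup keyed on valid ids naturally misses, which matches the error-handling flow discussed below.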
https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... File components/history/core/browser/download_database.cc (right): https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/download_database.cc:331: // If the "downloads" table exist, "download_url_chain" might not be there On 2017/02/08 03:50:50, asanka wrote: > nit: s/exist/exists/ Done. https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/download_database.cc:333: // create it later. On 2017/02/08 22:01:39, Scott Hess wrote: > Sigh. This is why we can't have nice things. Having this code probe the schema > while other code blindly accepts the version isn't great. Does this code have > unit tests to verify the schema transitions, so at least we know when a change > breaks the world? > > Also, I don't see any tests specifically targeting this migration. Those tests > need to be around so that future changes don't break you. I don't know what the > standards are in this code (*), but you should definitely have a test which > verifies that the v32->v33 transition works. > > (*) IMHO the best case is having golden files so that you can reasonably test > migration chains and have high fidelity with actual user schema. Synthetic > schema are attractive to developers, but if the synthetic schema departs from > the actual schema embedded in user profiles, it makes the tests pointless. I have added HistoryBackendDBTest.MigrateDownloadsJobsTable() to test the migration from V32->V33. There are also some other tests in that file to test earlier migrations. For example, the introduction of the downloads_url_chain table was tested in HistoryBackendDBTest.MigrateDownloadsReasonPathsAndDangerType. https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... 
components/history/core/browser/download_database.cc:565: if (!CreateOrUpdateDownloadJob(data.download_job_info[i])) On 2017/02/08 22:01:39, Scott Hess wrote: > On 2017/02/08 07:54:05, David Trainor wrote: > > Do we have to clean up any other state we persisted if this fails? > > Unfortunately, since this is in the long-running transaction, I'm not sure that > there is anything in place to do that. But that's intrinsic to the design of > this module, I think. If some of the jobs fail to update/create, the worst case is that we have to discard those jobs or get out-of-date information when querying the downloads table. In both cases, the impact is limited (the user will probably have to redownload some of the content when resuming the download). https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/download_database.cc:758: return statement_replace.Run();; On 2017/02/08 07:54:05, David Trainor wrote: > Remove extra ; Done. https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/download_database.cc:781: if (signed_id <= static_cast<int64_t>(kInvalidDownloadId)) { On 2017/02/08 22:01:39, Scott Hess wrote: > On 2017/02/08 03:50:50, asanka wrote: > > Nit: Introduce IsValidDownloadId() or somesuch and move this check there. > > download_types.h implies that kInvalidDownloadId should be treated as opaque. > > It might even make sense to have the signature be: > > bool ConvertIntToDownloadId(int64_t int_version, DownloadId* out) > WARN_UNUSED_RESULT; > > Then callers can't simply assume it all works. > > That said ... I'm not entirely clear what the use case is, here. Presumably > this code is not inserting invalid ids in the first place, so this should never > happen. If it does happen ... what's special about this case which deserves > special handling, in a module where nothing else does much in the way of > sanitization?
Incomplete ad-hoc error handling is almost worse than nothing at > all, sometimes. Followed Scott's suggestion: changed IntToDownloadId() to bool ConvertIntToDownloadId(int64_t, DownloadId*). If that function returns false here, the download id will be equal to kInvalidDownloadId, which will cause the subsequent map.find() to fail. https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/download_database.cc:791: bool found = base::ContainsKey(*download_row_map, download_id); On 2017/02/08 22:01:39, Scott Hess wrote: > Reading this and the (*download_row_map)[download_id]-> at the bottom makes me > wonder if it would make sense to rearrange this code a bit. One option would be > to use find() here and save the iterator for there, rather than having separate > calls. Another option would be to optimistically create the DownloadJobInfo, > then put all of the test-and-set code in one place. > > Actually, this check also can subsume the invalid-id check (which I assume you > expect to never happen and are only coding out of an abundance of caution). > Then you could rearrange this block to look like: > - Build |info| > - find() the row iter > - if end(), error handling > - else add |info| to the row. Done. https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/download_database.cc:794: DCHECK(found); On 2017/02/08 03:50:50, asanka wrote: > Nit: remove DCHECK. This is an inconsistency in the DB, not one in the browser. > And we've already allowed for the possibility of negative download IDs. Done. https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... File components/history/core/browser/download_database.h (right): https://codereview.chromium.org/2665243003/diff/120001/components/history/cor...
components/history/core/browser/download_database.h:127: void QueryDownloadJobs(std::map<uint32_t, DownloadRow*>* download_row_map); On 2017/02/08 22:01:39, Scott Hess wrote: > I still want this key to be a DownloadId, but I can see that the overall file > isn't consistent in use of DownloadId versus uint32_t, so I'll just encourage a > someday-if-you-have-time cleanup. Done. Changed all member functions that were using uint32_t to use DownloadId. https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... File components/history/core/browser/download_job_info.cc (right): https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/download_job_info.cc:12: : download_id(kInvalidDownloadId), On 2017/02/08 03:50:50, asanka wrote: > initialize job_id Done. https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... File components/history/core/browser/download_job_info.h (right): https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/download_job_info.h:33: // The id of the download in the database. Not changed by UpdateDownload(). On 2017/02/08 22:01:40, Scott Hess wrote: > "Not changed by UpdateDownload()" seems irrelevant. It's passed as a const&, so > UpdateDownload() can't change it. Removed. https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/download_job_info.h:37: int job_id; On 2017/02/08 22:01:40, Scott Hess wrote: > AFAICT, this CL isn't wired in to anything, so I can't really see how you plan > to use it, but based on the tuple (start_position, length, received_bytes), I > would guess you're going to do something like: > - Setup a bunch of sub-jobs for the overall download. > - Make progress, updating received_bytes and state. > - Finish and remove the sub-jobs.
> > In that case, start_position and length should uniquely identify the job, and, > indeed, start_position alone would too. In fact, start_position probably must > be unique. So do you need a job_id, or is start_position sufficient? > > [I don't feel super strongly about this, it's a trivial change to the database > space. I'm actually more concerned about whether the data model fits usage > well.] Using the start position might not be enough. For example, suppose Chrome always creates 4 jobs to split the data download evenly, the first job finishes 50% of its share while the other 3 jobs make no progress, and the browser crashes at that moment. When resuming the download later, we need to update the starting position for all jobs so that the remaining 7/8 of the data is split evenly between the 4 jobs. That will change the start_position of each job, so using an id to identify the jobs makes such tasks easier. https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... File components/history/core/browser/history_backend_db_unittest.cc (right): https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/history_backend_db_unittest.cc:1049: // Create the DB. On 2017/02/08 03:50:50, asanka wrote: > nit: remove. here and below. Done. https://codereview.chromium.org/2665243003/diff/120001/tools/metrics/histogra... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/2665243003/diff/120001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:12502: + Whether a download can be found for a download job in the DB. On 2017/02/08 03:50:50, asanka wrote: > Should mention that the histogram is recorded for each download job row each > time the database is loaded. Done.
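[Editor's sketch] shess's suggested rearrangement (build |info|, find() the row iterator once, handle the missing-row case in one place) might look roughly like the following. DownloadRow and DownloadJobInfo here are simplified stand-ins for the real history types, which carry more fields:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Simplified stand-ins for the real history types.
struct DownloadJobInfo {
  uint32_t download_id;
  int64_t received_bytes;
};
struct DownloadRow {
  std::vector<DownloadJobInfo> download_job_info;
};

// One find() replaces the ContainsKey() check plus operator[] lookup, and a
// dangling job row (no matching download) is handled in a single branch,
// where the caller can record a histogram and drop the row.
bool AttachJobToRow(std::map<uint32_t, DownloadRow*>* download_row_map,
                    uint32_t download_id,
                    int64_t received_bytes) {
  DownloadJobInfo info{download_id, received_bytes};

  auto it = download_row_map->find(download_id);
  if (it == download_row_map->end())
    return false;  // No download with this id; discard the job row.
  it->second->download_job_info.push_back(info);
  return true;
}
```

The invalid-id case falls out for free: an id that was never inserted simply misses the map.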
qinmin@chromium.org changed reviewers: + holte@chromium.org
+holte for tools/metrics OWNER
histograms lgtm
https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... File components/history/core/browser/download_job_info.h (right): https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... components/history/core/browser/download_job_info.h:37: int job_id; On 2017/02/09 at 01:01:42, qinmin wrote: > On 2017/02/08 22:01:40, Scott Hess wrote: > > AFAICT, this CL isn't wired in to anything, so I can't really see how you plan > > to use it, but based in the tuple (start_position, length, received_bytes), I > > would guess you're going to do something like: > > - Setup a bunch of sub-jobs for the overall download. > > - Make progress, updating received_bytes and state. > > - Finish and remove the sub-jobs. > > > > In that case, start_position and length should uniquely identify the job, and, > > indeed, start_position alone would too. In fact, start_position probably must > > be unique. So do you need a job_id, or is start_position sufficient? > > > > [I don't feel super strongly about this, it's a trivial change to the database > > space. I'm actually more concerned about whether the data model fits usage > > well.] > > Using start position might not be enough. For example, if Chrome always create 4 jobs to evenly handle the data downloading, and the first job finishes 50% while the other 3 jobs didn't make any progress. Suppose browser crashes at that moment, when resuming the download later, we need to update the starting position for all jobs so that the remaining 7/8 data is splitted evenly between the 4 jobs. That will cause us to change the start_position for each job. Using an id to identify the jobs make it easier to achieve such tasks. I think it's worth fleshing out how you are going to use it. All you really need is a list of slices of the file that have been successfully downloaded. Those slices can be uniquely identified by download_id and offset as shess mentioned. 
The additional bit you need is the count of bytes downloaded from that offset. The list of download jobs is an ephemeral thing that each download session needs to figure out based on the distribution of the holes in the file and network conditions. I don't think persisting the download jobs to disk is going to help.
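[Editor's sketch] asanka's model (persist only the slices; derive the jobs per session) can be illustrated by computing the unfetched holes from the stored (offset, received_bytes) pairs. This is not Chromium code; it assumes the slices are sorted by offset and non-overlapping:

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// A persisted slice: a contiguous range of the file already written to disk.
struct Slice {
  int64_t offset;
  int64_t received_bytes;
};

// Given the persisted slices (sorted by offset, non-overlapping) and the
// total file size, return the (offset, length) holes a new download session
// would need to fetch. The "jobs" are recomputed from these holes each
// session rather than being stored themselves.
std::vector<std::pair<int64_t, int64_t>> FindHoles(
    const std::vector<Slice>& slices, int64_t total_bytes) {
  std::vector<std::pair<int64_t, int64_t>> holes;
  int64_t pos = 0;
  for (const Slice& s : slices) {
    if (s.offset > pos)
      holes.emplace_back(pos, s.offset - pos);  // Gap before this slice.
    pos = s.offset + s.received_bytes;
  }
  if (pos < total_bytes)
    holes.emplace_back(pos, total_bytes - pos);  // Tail not yet downloaded.
  return holes;
}
```

Under this model a crash mid-session loses nothing but the in-memory job bookkeeping; the next session recomputes its jobs from whatever slices made it to disk.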
On 2017/02/09 02:19:47, asanka wrote: > https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... > File components/history/core/browser/download_job_info.h (right): > > https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... > components/history/core/browser/download_job_info.h:37: int job_id; > On 2017/02/09 at 01:01:42, qinmin wrote: > > On 2017/02/08 22:01:40, Scott Hess wrote: > > > AFAICT, this CL isn't wired in to anything, so I can't really see how you > plan > > > to use it, but based in the tuple (start_position, length, received_bytes), > I > > > would guess you're going to do something like: > > > - Setup a bunch of sub-jobs for the overall download. > > > - Make progress, updating received_bytes and state. > > > - Finish and remove the sub-jobs. > > > > > > In that case, start_position and length should uniquely identify the job, > and, > > > indeed, start_position alone would too. In fact, start_position probably > must > > > be unique. So do you need a job_id, or is start_position sufficient? > > > > > > [I don't feel super strongly about this, it's a trivial change to the > database > > > space. I'm actually more concerned about whether the data model fits usage > > > well.] > > > > Using start position might not be enough. For example, if Chrome always create > 4 jobs to evenly handle the data downloading, and the first job finishes 50% > while the other 3 jobs didn't make any progress. Suppose browser crashes at > that moment, when resuming the download later, we need to update the starting > position for all jobs so that the remaining 7/8 data is splitted evenly between > the 4 jobs. That will cause us to change the start_position for each job. Using > an id to identify the jobs make it easier to achieve such tasks. > > > I think it's worth fleshing out how you are going to use it. > > All you really need is a list of slices of the file that have been successfully > downloaded. 
Those slices can be uniquely identified by download_id and offset as > shess mentioned. The additional bit you need is the count of bytes downloaded > from that offset. > > The list of download jobs is an ephemeral thing that each download session needs > to figure out based on the distribution of the holes in the file and network > conditions. I don't think persisting the download jobs to disk is going to help. Yes, each download session is ephemeral, but that doesn't mean download_id and offset(bytes downloaded) can uniquely identify such a session. The problem is that if a download session is created with offset x, and length y, the session can immediately got killed due to a browser crash. When user resumes that download, the information(x, y) of that session is loaded into memory. Unfortunately, Chrome might recreate another download session with same offset x and length y based on the current downloaded pieces. So we need either another id to uniquely identify the 2 different sessions, or reuse the previous session. That's why a job_id might be helpful. Anyway, I think querying the download jobs table by ordering on job_id can give us a lot of information on how each piece is created over the time. This can help us debug potential incorrect logics in session creation later on.
On 2017/02/09 at 06:58:11, qinmin wrote: > On 2017/02/09 02:19:47, asanka wrote: > > https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... > > File components/history/core/browser/download_job_info.h (right): > > > > https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... > > components/history/core/browser/download_job_info.h:37: int job_id; > > On 2017/02/09 at 01:01:42, qinmin wrote: > > > On 2017/02/08 22:01:40, Scott Hess wrote: > > > > AFAICT, this CL isn't wired in to anything, so I can't really see how you > > plan > > > > to use it, but based in the tuple (start_position, length, received_bytes), > > I > > > > would guess you're going to do something like: > > > > - Setup a bunch of sub-jobs for the overall download. > > > > - Make progress, updating received_bytes and state. > > > > - Finish and remove the sub-jobs. > > > > > > > > In that case, start_position and length should uniquely identify the job, > > and, > > > > indeed, start_position alone would too. In fact, start_position probably > > must > > > > be unique. So do you need a job_id, or is start_position sufficient? > > > > > > > > [I don't feel super strongly about this, it's a trivial change to the > > database > > > > space. I'm actually more concerned about whether the data model fits usage > > > > well.] > > > > > > Using start position might not be enough. For example, if Chrome always create > > 4 jobs to evenly handle the data downloading, and the first job finishes 50% > > while the other 3 jobs didn't make any progress. Suppose browser crashes at > > that moment, when resuming the download later, we need to update the starting > > position for all jobs so that the remaining 7/8 data is splitted evenly between > > the 4 jobs. That will cause us to change the start_position for each job. Using > > an id to identify the jobs make it easier to achieve such tasks. > > > > > > I think it's worth fleshing out how you are going to use it. 
> > > > All you really need is a list of slices of the file that have been successfully > > downloaded. Those slices can be uniquely identified by download_id and offset as > > shess mentioned. The additional bit you need is the count of bytes downloaded > > from that offset. > > > > The list of download jobs is an ephemeral thing that each download session needs > > to figure out based on the distribution of the holes in the file and network > > conditions. I don't think persisting the download jobs to disk is going to help. > > Yes, each download session is ephemeral, but that doesn't mean download_id and offset(bytes downloaded) can uniquely identify such a session. > The problem is that if a download session is created with offset x, and length y, the session can immediately got killed due to a browser crash. > When user resumes that download, the information(x, y) of that session is loaded into memory. > Unfortunately, Chrome might recreate another download session with same offset x and length y based on the current downloaded pieces. > So we need either another id to uniquely identify the 2 different sessions, or reuse the previous session. That's why a job_id might be helpful. Those are problems that we'll have if we store download jobs. My point is that you don't need to store download jobs. :-) The slices -- which you do need to store -- only record what's written to disk, not how they got there. > Anyway, I think querying the download jobs table by ordering on job_id can give us a lot of information on how each piece is created over the time. > This can help us debug potential incorrect logics in session creation later on. But do you need to store it in history for all users? Debugging logic like this is what we use netlogs for since the behavior depends on lots of factors that aren't captured here. Either way, I think you have an implementation plan in mind. Do you have a doc we can look at?
On 2017/02/09 21:17:21, asanka wrote: > On 2017/02/09 at 06:58:11, qinmin wrote: > > On 2017/02/09 02:19:47, asanka wrote: > > > > https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... > > > File components/history/core/browser/download_job_info.h (right): > > > > > > > https://codereview.chromium.org/2665243003/diff/120001/components/history/cor... > > > components/history/core/browser/download_job_info.h:37: int job_id; > > > On 2017/02/09 at 01:01:42, qinmin wrote: > > > > On 2017/02/08 22:01:40, Scott Hess wrote: > > > > > AFAICT, this CL isn't wired in to anything, so I can't really see how > you > > > plan > > > > > to use it, but based in the tuple (start_position, length, > received_bytes), > > > I > > > > > would guess you're going to do something like: > > > > > - Setup a bunch of sub-jobs for the overall download. > > > > > - Make progress, updating received_bytes and state. > > > > > - Finish and remove the sub-jobs. > > > > > > > > > > In that case, start_position and length should uniquely identify the > job, > > > and, > > > > > indeed, start_position alone would too. In fact, start_position > probably > > > must > > > > > be unique. So do you need a job_id, or is start_position sufficient? > > > > > > > > > > [I don't feel super strongly about this, it's a trivial change to the > > > database > > > > > space. I'm actually more concerned about whether the data model fits > usage > > > > > well.] > > > > > > > > Using start position might not be enough. For example, if Chrome always > create > > > 4 jobs to evenly handle the data downloading, and the first job finishes 50% > > > while the other 3 jobs didn't make any progress. Suppose browser crashes at > > > that moment, when resuming the download later, we need to update the > starting > > > position for all jobs so that the remaining 7/8 data is splitted evenly > between > > > the 4 jobs. That will cause us to change the start_position for each job. 
> Using > > > an id to identify the jobs make it easier to achieve such tasks. > > > > > > > > > I think it's worth fleshing out how you are going to use it. > > > > > > All you really need is a list of slices of the file that have been > successfully > > > downloaded. Those slices can be uniquely identified by download_id and > offset as > > > shess mentioned. The additional bit you need is the count of bytes > downloaded > > > from that offset. > > > > > > The list of download jobs is an ephemeral thing that each download session > needs > > > to figure out based on the distribution of the holes in the file and network > > > conditions. I don't think persisting the download jobs to disk is going to > help. > > > > Yes, each download session is ephemeral, but that doesn't mean download_id > and offset(bytes downloaded) can uniquely identify such a session. > > The problem is that if a download session is created with offset x, and length > y, the session can immediately got killed due to a browser crash. > > When user resumes that download, the information(x, y) of that session is > loaded into memory. > > Unfortunately, Chrome might recreate another download session with same offset > x and length y based on the current downloaded pieces. > > So we need either another id to uniquely identify the 2 different sessions, or > reuse the previous session. That's why a job_id might be helpful. > > Those are problems that we'll have if we store download jobs. My point is that > you don't need to store download jobs. :-) The slices -- which you do need to > store -- only record what's written to disk, not how they got there. > > > Anyway, I think querying the download jobs table by ordering on job_id can > give us a lot of information on how each piece is created over the time. > > This can help us debug potential incorrect logics in session creation later > on. > > But do you need to store it in history for all users? 
Debugging logic like this > is what we use netlogs for since the behavior depends on lots of factors that > aren't captured here. > > Either way, I think you have an implementation plan in mind. Do you have a doc > we can look at? Only storing the slice info sounds like a reasonable idea. In that case, we only need 3 fields for DownloadJobInfo (id, start_position, bytes_received). I have made the change in the new patch set and updated the WIP doc here: https://docs.google.com/document/d/1w8jgAs77VqR1QH1LzD-LbExJzSBE6xhtEa_9drdBn... @shess, @sky, PTAL again, thanks!
nit: job => slice https://codereview.chromium.org/2665243003/diff/160001/components/history/cor... File components/history/core/browser/download_database.cc (right): https://codereview.chromium.org/2665243003/diff/160001/components/history/cor... components/history/core/browser/download_database.cc:559: for (size_t i = 0; i < data.download_job_info.size(); ++i) { Either with jobs or slices, we'd need to deal with the cases where we discard the intermediate file. This could happen, for example, if the server responds with a precondition failure indicating that the resource has changed. If that happens we'd need to discard the file and start from scratch. The code as written would leave the old slices around. If the browser is restarted, in the new session it would consider parts of the new file as being valid leading to a corrupt download file. It's worth considering that the slices can only mutate in three ways: 1. All slices disappear. 2. New slices are introduced. 3. Existing slices update with a new value for 'received_bytes' 2 and 3 are handled here. We just need to worry about 1. I.e. if there are no slices in data.download_job_info (or download_slice_info), then just nuke the slices matching the download id. Also, consider not writing slices to the DB unless they have a non-zero 'received_bytes'. Those records are just clutter if they turn out to be DOA and can complicate assumption 1 above.
Oh and thank you for dealing with all these comments!
Description was changed from ========== add an optional download jobs table into history download db To support parallel downloads, a download item may have multiple jobs For each job, we need to store their information so download resumption can work. This CL adds a new table to store the job information. This change also increases the history db version by 1 BUG=644352 ========== to ========== add an download slices table into history download db To support parallel downloads, a download item may have multiple jobs. For each job, it will write to the target file with a different offset. we need to store the information for each piece of slice so download resumption can work. This CL adds a new table to store the slice information. This change also increases the history db version by 1 BUG=644352 ==========
On 2017/02/10 16:30:30, asanka wrote: > nit: job => slice > > https://codereview.chromium.org/2665243003/diff/160001/components/history/cor... > File components/history/core/browser/download_database.cc (right): > > https://codereview.chromium.org/2665243003/diff/160001/components/history/cor... > components/history/core/browser/download_database.cc:559: for (size_t i = 0; i < > data.download_job_info.size(); ++i) { Either with jobs or slices, we'd need to deal with the cases where we discard > the intermediate file. This could happen, for example, if the server responds > with a precondition failure indicating that the resource has changed. If that > happens we'd need to discard the file and start from scratch. > > The code as written would leave the old slices around. If the browser is > restarted, in the new session it would consider parts of the new file as being > valid leading to a corrupt download file. > > It's worth considering that the slices can only mutate in three ways: > > 1. All slices disappear. > 2. New slices are introduced. > 3. Existing slices update with a new value for 'received_bytes' > > 2 and 3 are handled here. We just need to worry about 1. I.e. if there are no > slices in data.download_job_info (or download_slice_info), then just nuke the > slices matching the download id. > > Also, consider not writing slices to the DB unless they have a non-zero > 'received_bytes'. Those records are just clutter if they turn out to be DOA and > can complicate assumption 1 above. For 1, we can also pass a download_slice_info vector in which all the received_bytes are set to 0. In that case, the database should also delete those entries. In the new PS, I have added the logic to delete all the slice information whenever the download_slice_info vector is empty. The new PS will also skip inserting a slice into the db if received_bytes is 0.
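Patch set 5 is titled "use Select changes() instead of select count(*)": after a DELETE, SQLite's changes() reports how many rows the last statement touched, so verifying the removal needs no separate count. A hedged sqlite3 sketch (schema assumed for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE downloads_slices ("
             "download_id INTEGER NOT NULL, offset INTEGER NOT NULL, "
             "received_bytes INTEGER NOT NULL, "
             "PRIMARY KEY (download_id, offset))")
conn.executemany("INSERT INTO downloads_slices VALUES (?, ?, ?)",
                 [(1, 0, 10), (1, 4096, 20), (2, 0, 30)])

conn.execute("DELETE FROM downloads_slices WHERE download_id=?", (1,))
# changes() reports the rows touched by the most recent statement on
# this connection, avoiding the extra scan a SELECT COUNT(*) would do.
deleted = conn.execute("SELECT changes()").fetchone()[0]
print(deleted)  # 2
```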
Description was changed from ========== add an download slices table into history download db To support parallel downloads, a download item may have multiple jobs. For each job, it will write to the target file with a different offset. we need to store the information for each piece of slice so download resumption can work. This CL adds a new table to store the slice information. This change also increases the history db version by 1 BUG=644352 ========== to ========== add a download slices table into history download db To support parallel downloads, a download item may have multiple jobs. For each job, it will write to the target file with a different offset. we need to store the information for each piece of slice so download resumption can work. This CL adds a new table to store the slice information. This change also increases the history db version by 1 BUG=644352 ==========
PTAL again. Renamed DownloadJobInfo to DownloadSliceInfo, and changed the CL description. The job status is no longer needed; only the slice information needs to be stored. Added a new test for the behavior of deleting all slices when the input is empty. On 2017/02/10 18:29:42, qinmin wrote: > On 2017/02/10 16:30:30, asanka wrote: > > nit: job => slice > > > > > https://codereview.chromium.org/2665243003/diff/160001/components/history/cor... > > File components/history/core/browser/download_database.cc (right): > > > > > https://codereview.chromium.org/2665243003/diff/160001/components/history/cor... > > components/history/core/browser/download_database.cc:559: for (size_t i = 0; i > < > > data.download_job_info.size(); ++i) { > > Either with jobs or slices, we'd need to deal with the cases where we discard > > the intermediate file. This could happen, for example, if the server responds > > with a precondition failure indicating that the resource has changed. If that > > happens we'd need to discard the file and start from scratch. > > > > The code as written would leave the old slices around. If the browser is > > restarted, in the new session it would consider parts of the new file as being > > valid leading to a corrupt download file. > > > > It's worth considering that the slices can only mutate in three ways: > > > > 1. All slices disappear. > > 2. New slices are introduced. > > 3. Existing slices update with a new value for 'received_bytes' > > > > 2 and 3 are handled here. We just need to worry about 1. I.e. if there are no > > slices in data.download_job_info (or download_slice_info), then just nuke the > > slices matching the download id. > > > > Also, consider not writing slices to the DB unless they have a non-zero > > 'received_bytes'. Those records are just clutter if they turn out to be DOA > and > > can complicate assumption 1 above. > > For 1, we can also pass a download_slice_info vector that all the received_bytes > are set 0. 
> In that case, the database should also delete those entries. > > In the new PS, I have added the logic to delete all the slice information > whenever download_slice_info vector is empty. > The new PS will also skip inserting a slice into the db if received_bytes is 0.
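After the rename, each slice only needs to record its download, its starting offset, and how many bytes were written. A plausible shape for the downloads_slices table, sketched in Python/sqlite3 (the real column list lives in download_database.cc; this one is an assumption for illustration), along with reading the rows back grouped per download:

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE downloads_slices (
  download_id INTEGER NOT NULL,
  offset INTEGER NOT NULL,
  received_bytes INTEGER NOT NULL,
  PRIMARY KEY (download_id, offset));
INSERT INTO downloads_slices VALUES (1, 0, 4096);
INSERT INTO downloads_slices VALUES (1, 8192, 100);
INSERT INTO downloads_slices VALUES (2, 0, 55);
""")

# Group slice rows per download, the way a query path could populate an
# in-memory map keyed by download id before attaching slices to items.
slices = defaultdict(list)
for did, off, rb in conn.execute(
        "SELECT download_id, offset, received_bytes "
        "FROM downloads_slices ORDER BY download_id, offset"):
    slices[did].append((off, rb))
print(dict(slices))  # {1: [(0, 4096), (8192, 100)], 2: [(0, 55)]}
```

A composite primary key on (download_id, offset) both prevents duplicate slices and gives the per-download lookup an index for free.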
I think my suggestions are all in the "Obvious suggestion, you got this" or "Suggestion, you can ignore" bucket, so LGTM. BUT, and I feel really bad for bringing this up so late in the game, but I don't see anyone in the prior comments asking why we are laying down persistence data for a performance feature whose effect hasn't really been quantified yet? I'm not asking this because I have any opinion on the actual feature, but rather because once the data is persisted on disk, we have no control over cleanup, so we have to consider it forever (or at least for 2-3 years). So if this is just groundwork to do an experiment, I wonder if there's any way to get some/most of the data we need without persisting state information to disk? I'm fine with "Yes, we considered that, but couldn't think of a way around it." I don't really need to know the reasons, I just want to make sure you're appropriately reluctant to commit to long-term ownership of this schema change! https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... File components/history/core/browser/download_database.cc (right): https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... components/history/core/browser/download_database.cc:26: #include "download_slice_info.h" This needs the entire path. https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... components/history/core/browser/download_database.cc:497: it != info_map.end(); ++it) { This line is showing a whitespace-only change. https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... components/history/core/browser/download_database.cc:560: RemoveDownloadSlices(data.id); I would rather see this as an if/else where the else contains the for loop, or always call the Remove* call (so it's always remove-and-rewrite rather than overwrite-in-place). 
I'm not so much worried about efficiency as about having multiple phrasings of the condition being brittle to maintenance. [I guess I would lean towards if/else, because update-in-place is slightly more efficient WRT the index.] https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... components/history/core/browser/download_database.cc:772: info.received_bytes = statement_query.ColumnInt64(column++); Of course, now that this has half as many fields, it hardly matters what order things are in! https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... File components/history/core/browser/history_backend_db_unittest.cc (right): https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... components/history/core/browser/history_backend_db_unittest.cc:756: } I see that this follows the style in this file, so it's fine, but just FYI, both of these cases mimic the implementation of: ASSERT_TRUE(db.Execute(sql)); Full statements are really only needed for bind parameters or reading off results (other than success). https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... components/history/core/browser/history_backend_db_unittest.cc:785: ASSERT_TRUE(s2.Run()); This could also be Execute(), unless the style of this file suggests otherwise. https://codereview.chromium.org/2665243003/diff/180001/components/test/data/h... File components/test/data/history/history.32.sql (right): https://codereview.chromium.org/2665243003/diff/180001/components/test/data/h... components/test/data/history/history.32.sql:11: CREATE TABLE downloads_url_chains (id INTEGER NOT NULL,chain_index INTEGER NOT NULL,url LONGVARCHAR NOT NULL, PRIMARY KEY (id, chain_index) ); Perhaps I am missing something, but why does the migration test use manually-created schema and data, rather than just leveraging this golden file?
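Scott's note that full statement objects are only needed for bind parameters or for reading results carries over to any SQLite wrapper. A small sqlite3 sketch of the distinction (table shape assumed for illustration; executescript stands in for a bare db.Execute(sql)):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Literal SQL with no bind parameters and no result rows to read:
# the moral equivalent of a plain Execute(sql) call.
conn.executescript("""
CREATE TABLE downloads_slices (
  download_id INTEGER NOT NULL,
  offset INTEGER NOT NULL,
  received_bytes INTEGER NOT NULL,
  PRIMARY KEY (download_id, offset));
INSERT INTO downloads_slices VALUES (1, 0, 512);
""")
# A full prepared statement earns its keep once there are bind
# parameters or results to read off.
row = conn.execute(
    "SELECT received_bytes FROM downloads_slices "
    "WHERE download_id=? AND offset=?", (1, 0)).fetchone()
print(row)  # (512,)
```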
Thanks Scott for your thoughts. Based on our experiment with parallel downloading, the results are quite promising in many situations. Yes, we have considered not introducing any schema changes for end users. But based on the user feedback and the results we have, we believe this is something we should pursue in the near future. If things really go against us (very unlikely, but there is always the possibility), we can drop this table with a schema change and it won't have much impact on users since downloads can always restart from the beginning. On 2017/02/10 22:41:12, Scott Hess wrote: > I think my suggestions are all in the "Obvious suggestion, you got this" or > "Suggestion, you can ignore" bucket, so LGTM. > > BUT, and I feel really bad for bringing this up so late in the game, but I don't > see anyone in the prior comments asking about why are we laying down persistence > data for a performance feature who's effect hasn't really been quantified, yet? > I'm not asking this because I have any opinion on the actual feature, but rather > because once the data is persisted on disk, we have no control over cleanup, so > we have to consider it forever (or at least for 2-3 years). So if this is just > groundwork to do an experiment, I wonder if there's any way to get some/most of > the data we need without persisting state information to disk? > > I'm fine with "Yes, we considered that, but couldn't think of a way around it." > I don't really need to know the reasons, I just want to make sure you're > appropriately reluctant to commit to long-term ownership of this schema change! > > > https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... > File components/history/core/browser/download_database.cc (right): > > > https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... > components/history/core/browser/download_database.cc:26: #include > "download_slice_info.h" > This needs the entire path. 
> > https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... > components/history/core/browser/download_database.cc:497: it != info_map.end(); > ++it) { > This line is showing a whitespace-only change. > > https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... > components/history/core/browser/download_database.cc:560: > RemoveDownloadSlices(data.id); > I would rather see this as an if/else where the else contains the for loop, or > always call the Remove* call (so it's always remove-and-rewrite rather than > overwrite-in-place). I'm not so much worried about efficiency as about having > multiple phrasings of the condition being brittle to maintenance. > > [I guess I would lean towards if/else, because update-in-place is slightly more > efficient WRT the index.] > > https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... > components/history/core/browser/download_database.cc:772: info.received_bytes = > statement_query.ColumnInt64(column++); > Of course, now that this has half as many fields, it hardly matters what order > things are in! > > https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... > File components/history/core/browser/history_backend_db_unittest.cc (right): > > https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... > components/history/core/browser/history_backend_db_unittest.cc:756: } > I see that this follows the style in this file, so it's fine, but just FYI, both > of these cases mimic the implementation of: > ASSERT_TRUE(db.Execute(sql)); > Full statements are really only needed for bind parameters or reading off > results (other than success). > > https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... > components/history/core/browser/history_backend_db_unittest.cc:785: > ASSERT_TRUE(s2.Run()); > This could also be Execute(), unless the style of this file suggests otherwise. 
> > https://codereview.chromium.org/2665243003/diff/180001/components/test/data/h... > File components/test/data/history/history.32.sql (right): > > https://codereview.chromium.org/2665243003/diff/180001/components/test/data/h... > components/test/data/history/history.32.sql:11: CREATE TABLE > downloads_url_chains (id INTEGER NOT NULL,chain_index INTEGER NOT NULL,url > LONGVARCHAR NOT NULL, PRIMARY KEY (id, chain_index) ); > Perhaps I am missing something, but why does the migration test use > manually-created schema and data, rather than just leveraging this golden file?
https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... File components/history/core/browser/download_database.cc (right): https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... components/history/core/browser/download_database.cc:26: #include "download_slice_info.h" On 2017/02/10 22:41:11, Scott Hess wrote: > This needs the entire path. Fixed. Strange, the path is there in the previous PS; I must have accidentally hit the backspace key while working on this. https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... components/history/core/browser/download_database.cc:497: it != info_map.end(); ++it) { On 2017/02/10 22:41:11, Scott Hess wrote: > This line is showing a whitespace-only change. Done. https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... components/history/core/browser/download_database.cc:560: RemoveDownloadSlices(data.id); On 2017/02/10 22:41:12, Scott Hess wrote: > I would rather see this as an if/else where the else contains the for loop, or > always call the Remove* call (so it's always remove-and-rewrite rather than > overwrite-in-place). I'm not so much worried about efficiency as about having > multiple phrasings of the condition being brittle to maintenance. > > [I guess I would lean towards if/else, because update-in-place is slightly more > efficient WRT the index.] Done. https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... File components/history/core/browser/history_backend_db_unittest.cc (right): https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... 
components/history/core/browser/history_backend_db_unittest.cc:756: } On 2017/02/10 22:41:12, Scott Hess wrote: > I see that this follows the style in this file, so it's fine, but just FYI, both > of these cases mimic the implementation of: > ASSERT_TRUE(db.Execute(sql)); > Full statements are really only needed for bind parameters or reading off > results (other than success). Switched to use execute. https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... components/history/core/browser/history_backend_db_unittest.cc:785: ASSERT_TRUE(s2.Run()); On 2017/02/10 22:41:12, Scott Hess wrote: > This could also be Execute(), unless the style of this file suggests otherwise. Done. https://codereview.chromium.org/2665243003/diff/180001/components/test/data/h... File components/test/data/history/history.32.sql (right): https://codereview.chromium.org/2665243003/diff/180001/components/test/data/h... components/test/data/history/history.32.sql:11: CREATE TABLE downloads_url_chains (id INTEGER NOT NULL,chain_index INTEGER NOT NULL,url LONGVARCHAR NOT NULL, PRIMARY KEY (id, chain_index) ); On 2017/02/10 22:41:12, Scott Hess wrote: > Perhaps I am missing something, but why does the migration test use > manually-created schema and data, rather than just leveraging this golden file? Done. Moved the insert statements here.
Changes LGTM, with a very minor nit that you can either ignore or just change and land it before I look at things again! https://codereview.chromium.org/2665243003/diff/180001/components/test/data/h... File components/test/data/history/history.32.sql (right): https://codereview.chromium.org/2665243003/diff/180001/components/test/data/h... components/test/data/history/history.32.sql:11: CREATE TABLE downloads_url_chains (id INTEGER NOT NULL,chain_index INTEGER NOT NULL,url LONGVARCHAR NOT NULL, PRIMARY KEY (id, chain_index) ); On 2017/02/11 00:23:54, qinmin wrote: > On 2017/02/10 22:41:12, Scott Hess wrote: > > Perhaps I am missing something, but why does the migration test use > > manually-created schema and data, rather than just leveraging this golden > file? > > Done. Moved the insert statements here. OK, that's great. I was torn about suggesting converting to a golden file, since some of the other code is already using inline insert statements and I dislike inconsistency. But I do think golden files provide a stronger signal for "If you're editing this, you're probably doing it wrong." https://codereview.chromium.org/2665243003/diff/200001/components/history/cor... File components/history/core/browser/history_backend_db_unittest.cc (right): https://codereview.chromium.org/2665243003/diff/200001/components/history/cor... components/history/core/browser/history_backend_db_unittest.cc:763: ASSERT_TRUE(db.Execute(insert_statement.c_str())); Just use a const char[] rather than round-tripping to a std::string.
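The golden-file pattern Scott endorses is to snapshot the pre-migration schema in a history.NN.sql file, load it verbatim, run the migration, and assert on the result. A self-contained Python/sqlite3 sketch of that pattern (the schema fragment and version bookkeeping here are simplified assumptions, not the actual history DB contents):

```python
import sqlite3

# Stand-in for a golden file such as history.31.sql: the frozen
# pre-migration state (hypothetical fragment, not the real contents).
GOLDEN_V31 = """
CREATE TABLE meta (key LONGVARCHAR NOT NULL UNIQUE PRIMARY KEY,
                   value LONGVARCHAR);
INSERT INTO meta VALUES ('version', '31');
"""

def migrate_to_32(conn):
    # The migration under test: add the new table and bump the version.
    conn.executescript("""
    CREATE TABLE downloads_slices (
      download_id INTEGER NOT NULL,
      offset INTEGER NOT NULL,
      received_bytes INTEGER NOT NULL,
      PRIMARY KEY (download_id, offset));
    UPDATE meta SET value='32' WHERE key='version';
    """)

conn = sqlite3.connect(":memory:")
conn.executescript(GOLDEN_V31)  # the real test would read the file
migrate_to_32(conn)
version = conn.execute(
    "SELECT value FROM meta WHERE key='version'").fetchone()[0]
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")]
print(version, "downloads_slices" in tables)  # 32 True
```

Keeping the old state in a golden file rather than inline statements gives a stronger "if you're editing this, you're probably doing it wrong" signal, as noted above.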
On 2017/02/11 00:21:45, qinmin wrote: > Thanks Scott for your thoughts. Based on our experiment with parallel > downloading, the results > are quite promising in many situations. Yes, we have considered not introducing > any > schema changes for end users. But based on the user feedbacksand the results we > have, > we believe this is something we should pursue in the near future. If things > really go against us ( > very unlikely but there is always the possibility), we can drop this table with > a schema change > and it won't have much impact on users since download can always restart from > the beginning. > > > On 2017/02/10 22:41:12, Scott Hess wrote: > > I think my suggestions are all in the "Obvious suggestion, you got this" or > > "Suggestion, you can ignore" bucket, so LGTM. > > > > BUT, and I feel really bad for bringing this up so late in the game, but I > don't > > see anyone in the prior comments asking about why are we laying down > persistence > > data for a performance feature who's effect hasn't really been quantified, > yet? > > I'm not asking this because I have any opinion on the actual feature, but > rather > > because once the data is persisted on disk, we have no control over cleanup, > so > > we have to consider it forever (or at least for 2-3 years). So if this is > just > > groundwork to do an experiment, I wonder if there's any way to get some/most > of > > the data we need without persisting state information to disk? > > > > I'm fine with "Yes, we considered that, but couldn't think of a way around > it." > > I don't really need to know the reasons, I just want to make sure you're > > appropriately reluctant to commit to long-term ownership of this schema > change! > > > > > https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... > > File components/history/core/browser/download_database.cc (right): > > > > > https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... 
> > components/history/core/browser/download_database.cc:26: #include > > "download_slice_info.h" > > This needs the entire path. > > > > > https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... > > components/history/core/browser/download_database.cc:497: it != > info_map.end(); > > ++it) { > > This line is showing a whitespace-only change. > > > > > https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... > > components/history/core/browser/download_database.cc:560: > > RemoveDownloadSlices(data.id); > > I would rather see this as an if/else where the else contains the for loop, or > > always call the Remove* call (so it's always remove-and-rewrite rather than > > overwrite-in-place). I'm not so much worried about efficiency as about having > > multiple phrasings of the condition being brittle to maintenance. > > > > [I guess I would lean towards if/else, because update-in-place is slightly > more > > efficient WRT the index.] > > > > > https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... > > components/history/core/browser/download_database.cc:772: info.received_bytes > = > > statement_query.ColumnInt64(column++); > > Of course, now that this has half as many fields, it hardly matters what order > > things are in! > > > > > https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... > > File components/history/core/browser/history_backend_db_unittest.cc (right): > > > > > https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... > > components/history/core/browser/history_backend_db_unittest.cc:756: } > > I see that this follows the style in this file, so it's fine, but just FYI, > both > > of these cases mimic the implementation of: > > ASSERT_TRUE(db.Execute(sql)); > > Full statements are really only needed for bind parameters or reading off > > results (other than success). 
> > > > > https://codereview.chromium.org/2665243003/diff/180001/components/history/cor... > > components/history/core/browser/history_backend_db_unittest.cc:785: > > ASSERT_TRUE(s2.Run()); > > This could also be Execute(), unless the style of this file suggests > otherwise. > > > > > https://codereview.chromium.org/2665243003/diff/180001/components/test/data/h... > > File components/test/data/history/history.32.sql (right): > > > > > https://codereview.chromium.org/2665243003/diff/180001/components/test/data/h... > > components/test/data/history/history.32.sql:11: CREATE TABLE > > downloads_url_chains (id INTEGER NOT NULL,chain_index INTEGER NOT NULL,url > > LONGVARCHAR NOT NULL, PRIMARY KEY (id, chain_index) ); > > Perhaps I am missing something, but why does the migration test use > > manually-created schema and data, rather than just leveraging this golden > file? OK, thanks for the explainer. Apologies if I seem over-sensitive. Usually the pattern is someone lands something, then goes on to work on something else, then something something and three years later I'm trying to figure things out. So I try to act as a bit of friction up front if I can.
LGTM
The CQ bit was checked by qinmin@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_tsan_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...) mac_chromium_compile_dbg_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_comp...) mac_chromium_rel_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
https://codereview.chromium.org/2665243003/diff/200001/components/history/cor... File components/history/core/browser/history_backend_db_unittest.cc (right): https://codereview.chromium.org/2665243003/diff/200001/components/history/cor... components/history/core/browser/history_backend_db_unittest.cc:763: ASSERT_TRUE(db.Execute(insert_statement.c_str())); On 2017/02/11 03:31:12, Scott Hess wrote: > Just use a const char[] rather than round-tripping to a std::string. Done.
The CQ bit was checked by qinmin@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from asanka@chromium.org, dtrainor@chromium.org, holte@chromium.org, sky@chromium.org, shess@chromium.org Link to the patchset: https://codereview.chromium.org/2665243003/#ps220001 (title: "nit")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: ios-simulator on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-simulator/bui...)
On 2017/02/14 18:13:51, commit-bot: I haz the power wrote: > Try jobs failed on following builders: > ios-simulator on master.tryserver.chromium.mac (JOB_FAILED, > http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-simulator/bui...) Doh! I seem to recall that iOS has an issue with test data files. I suspect there's either a BUILD.gn stanza which lists the data files, or there's special-purpose code in the tests around iOS (perhaps disabling certain tests on iOS).
On 2017/02/14 at 18:16:43, shess wrote: > On 2017/02/14 18:13:51, commit-bot: I haz the power wrote: > > Try jobs failed on following builders: > > ios-simulator on master.tryserver.chromium.mac (JOB_FAILED, > > http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-simulator/bui...) > > Doh! > > I seem to recall that iOS has an issue with test data files. I suspect there's either a BUILD.gn stanza which lists the data files, or there's special-purpose code in the tests around iOS (perhaps disabling certain tests on iOS). https://cs.chromium.org/chromium/src/components/history/core/browser/BUILD.gn...
On 2017/02/14 21:31:28, asanka wrote: > On 2017/02/14 at 18:16:43, shess wrote: > > On 2017/02/14 18:13:51, commit-bot: I haz the power wrote: > > > Try jobs failed on following builders: > > > ios-simulator on master.tryserver.chromium.mac (JOB_FAILED, > > > > http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-simulator/bui...) > > > > Doh! > > > > I seem to recall that iOS has an issue with test data files. I suspect > there's either a BUILD.gn stanza which lists the data files, or there's > special-purpose code in the tests around iOS (perhaps disabling certain tests on > iOS). > > https://cs.chromium.org/chromium/src/components/history/core/browser/BUILD.gn... added the data file to BUILD.gn
lgtm
The CQ bit was checked by qinmin@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from asanka@chromium.org, dtrainor@chromium.org, sky@chromium.org, holte@chromium.org Link to the patchset: https://codereview.chromium.org/2665243003/#ps240001 (title: "add test file to BUILD.gn")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
CQ is committing da patch. Bot data: {"patchset_id": 240001, "attempt_start_ts": 1487117228048560, "parent_rev": "d182a1a7bfdd22c1d166df4ecbda86af18ca68e5", "commit_rev": "b2925e8fc7cacf09215b6c9c12b975fb7a9bdee5"}
Message was sent while issue was closed.
Description was changed from ========== add a download slices table into history download db To support parallel downloads, a download item may have multiple jobs. For each job, it will write to the target file with a different offset. we need to store the information for each piece of slice so download resumption can work. This CL adds a new table to store the slice information. This change also increases the history db version by 1 BUG=644352 ========== to ========== add a download slices table into history download db To support parallel downloads, a download item may have multiple jobs. For each job, it will write to the target file with a different offset. we need to store the information for each piece of slice so download resumption can work. This CL adds a new table to store the slice information. This change also increases the history db version by 1 BUG=644352 Review-Url: https://codereview.chromium.org/2665243003 Cr-Commit-Position: refs/heads/master@{#450606} Committed: https://chromium.googlesource.com/chromium/src/+/b2925e8fc7cacf09215b6c9c12b9... ==========
Message was sent while issue was closed.
Committed patchset #12 (id:240001) as https://chromium.googlesource.com/chromium/src/+/b2925e8fc7cacf09215b6c9c12b9... |