|
|
DescriptionAdded HistoryService::GetHistoryCount() and helper functions in components/history
GetHistoryCount counts history the same way they are listed chrome://history
i.e. A visit to the URL is counted once if it is visited multiple times in the same day using local time, but counted multiple times if they are visited in different days.
BUG=501916
(This CL is uploaded as a response to comment #55-57 in the bug.)
Committed: https://crrev.com/4cbca7b75d45f19948cf89c8b5ed6934bd62af4a
Cr-Commit-Position: refs/heads/master@{#349184}
Patch Set 1 #
Total comments: 32
Patch Set 2 : Respond to msramek's comment + fixed wrong localtime calculation #
Total comments: 2
Patch Set 3 : Respond to msramek's comment #
Messages
Total messages: 24 (7 generated)
lwchkg@gmail.com changed reviewers: + msramek@chromium.org, sdefresne@chromium.org, sky@chromium.org
Dear msramek, Please have a look. Thanks. Regards, WC Leung.
Thanks for extracting this! I left some comments. Btw, this feels like it needs tests, but as I said, I'm soon going to build atop of this, so I can also add them later myself. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... File components/history/core/browser/history_backend.h (right): https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/history_backend.h:279: // Count the number of URLs in the browsing history. Please update the comment to explain that we're counting distinct URLs per day. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... File components/history/core/browser/history_service.cc (right): https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/history_service.cc:670: // Statistics ------------------------------------------------------------------ nit: Empty line after "---" comment. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... File components/history/core/browser/history_service.h (right): https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/history_service.h:347: // Returns the count of URLs in the url database. The argument is the count. I'd suggest omitting this comment. The fact that this is used in |GetHistoryCount| gives a clear picture what this callback is for. Furthermore, count = 0 does not indicate database failure, but success = false does. You could add that to the |GetHistoryCount| comment. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/history_service.h:349: typedef base::Callback<void(HistoryCountResult)> GetHistoryCountCallback; This shouldn't be called "Get", as it's not a getter. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/history_service.h:351: // Returns the count of URLs in the url database. Please update the comment to explain that we're counting distinct URLs per day. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... File components/history/core/browser/history_types.h (right): https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/history_types.h:458: // HistoryBackend::GetHistoryCount nit: period at the end of sentence https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/history_types.h:461: // successfull or not. If false, then |count| is undefined. nit: successful https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... File components/history/core/browser/visit_database.cc (right): https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/visit_database.cc:539: int VisitDatabase::GetHistoryCount() { Please follow the convention that methods return boolean indicating if the SQL statement was successful. Return the result through an out-parameter. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/visit_database.cc:541: "SELECT COUNT(*) FROM (" Why not SELECT COUNT(DISTINCT ...) FROM ...? https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/visit_database.cc:544: // Windows Epoch to the number of seconds from Unix Epoch. Shouldn't this be DATE(timestamp, 'unixepoch', 'localtime')? First, we interpret the timestamp as Unix Epoch, and then convert it to local time? https://www.sqlite.org/lang_datefunc.html Also note that base::Time() timestamps are from "1601-01-01", not from "1970-01-01", i.e. not Unix Epoch. But since AFAIK neither of them handles leap seconds, it should be fine. This should be explained in a comment. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/visit_database.cc:545: "DATE(visit_time/1000000-11644473600, 'localtime', 'unixepoch') " Please use a constant, such as Time::kTimeTToMicrosecondsOffset or similar.
The CQ bit was checked by mlerman@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1345473003/1 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1345473003/1
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Yes building a test is needed. (And I have no experience of writing tests at all, so it is better for write them later in another CL.) I don't mind to write the test myself (which is a good learning opportunity to me), but if you need the test in 1-2 days then I'm not likely to write on time. BTW, any test ideas? (1. localtime handling, 2. any others?) https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... File components/history/core/browser/history_backend.h (right): https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/history_backend.h:279: // Count the number of URLs in the browsing history. On 2015/09/14 17:28:16, msramek wrote: > Please update the comment to explain that we're counting distinct URLs per day. Acknowledged. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... File components/history/core/browser/history_service.cc (right): https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/history_service.cc:670: // Statistics ------------------------------------------------------------------ On 2015/09/14 17:28:16, msramek wrote: > nit: Empty line after "---" comment. Acknowledged. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... File components/history/core/browser/history_service.h (right): https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/history_service.h:347: // Returns the count of URLs in the url database. The argument is the count. On 2015/09/14 17:28:16, msramek wrote: > I'd suggest omitting this comment. The fact that this is used in > |GetHistoryCount| gives a clear picture what this callback is for. > > Furthermore, count = 0 does not indicate database failure, but success = false > does. You could add that to the |GetHistoryCount| comment. Thanks. Just forgot to update the comment here. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/history_service.h:349: typedef base::Callback<void(HistoryCountResult)> GetHistoryCountCallback; On 2015/09/14 17:28:16, msramek wrote: > This shouldn't be called "Get", as it's not a getter. Puzzled here. This is the callback function in GetHistoryCount, so it's called GetHistoryCountCallback according the the naming conventions in this file. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/history_service.h:351: // Returns the count of URLs in the url database. On 2015/09/14 17:28:16, msramek wrote: > Please update the comment to explain that we're counting distinct URLs per day. Acknowledged. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... File components/history/core/browser/history_types.h (right): https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/history_types.h:458: // HistoryBackend::GetHistoryCount On 2015/09/14 17:28:16, msramek wrote: > nit: period at the end of sentence Acknowledged. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/history_types.h:461: // successfull or not. If false, then |count| is undefined. On 2015/09/14 17:28:16, msramek wrote: > nit: successful Acknowledged. Embarassing spelling mistake. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... File components/history/core/browser/visit_database.cc (right): https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/visit_database.cc:539: int VisitDatabase::GetHistoryCount() { On 2015/09/14 17:28:16, msramek wrote: > Please follow the convention that methods return boolean indicating if the SQL > statement was successful. Return the result through an out-parameter. Sounds good. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/visit_database.cc:541: "SELECT COUNT(*) FROM (" On 2015/09/14 17:28:16, msramek wrote: > Why not SELECT COUNT(DISTINCT ...) FROM ...? I was using COUNT(*) where DISTINCT is not available. I'll change to SELECT COUNT(DISTINCT ...) as it looks like more efficient. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/visit_database.cc:544: // Windows Epoch to the number of seconds from Unix Epoch. On 2015/09/14 17:28:16, msramek wrote: > Shouldn't this be DATE(timestamp, 'unixepoch', 'localtime')? First, we interpret > the timestamp as Unix Epoch, and then convert it to local time? > > https://www.sqlite.org/lang_datefunc.html You're right. Confirmed that my wrong version calculates times incorrectly. This is very embarrassing. > > Also note that base::Time() timestamps are from "1601-01-01", not from > "1970-01-01", i.e. not Unix Epoch. But since AFAIK neither of them handles leap > seconds, it should be fine. This should be explained in a comment. Thanks for the information and I'll include in the comments. MS filetime seems to be completely ignorant about the concepts of leap seconds. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/visit_database.cc:545: "DATE(visit_time/1000000-11644473600, 'localtime', 'unixepoch') " Time::kTimeTToMicrosecondsOffset sounds good. Anyway the comment "... platform-dependent epoch" (line 408) in base/time/time.h is scary.
Had been so dumb that I replied one of the comments wrongly. Here's the correction. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... File components/history/core/browser/visit_database.cc (right): https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/visit_database.cc:541: "SELECT COUNT(*) FROM (" On 2015/09/15 14:27:08, lwchkg wrote: > On 2015/09/14 17:28:16, msramek wrote: > > Why not SELECT COUNT(DISTINCT ...) FROM ...? > > I was using COUNT(*) where DISTINCT is not available. I'll change to SELECT > COUNT(DISTINCT ...) as it looks like more efficient. Sorry for wrong reply. SELECT COUNT(DISTINCT ...) FROM ... doesn't work because we need to count distinct (url, date) pairs.
patch #2 submitted. PTAL.
The CQ bit was checked by msramek@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1345473003/20001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1345473003/20001
LGTM sky@, sdefresne@, can you please have a look as owners? https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... File components/history/core/browser/history_service.h (right): https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/history_service.h:349: typedef base::Callback<void(HistoryCountResult)> GetHistoryCountCallback; On 2015/09/15 14:27:08, lwchkg wrote: > On 2015/09/14 17:28:16, msramek wrote: > > This shouldn't be called "Get", as it's not a getter. > > Puzzled here. This is the callback function in GetHistoryCount, so it's called > GetHistoryCountCallback according the the naming conventions in this file. Oh... of course :) Sorry for that. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... File components/history/core/browser/visit_database.cc (right): https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/visit_database.cc:541: "SELECT COUNT(*) FROM (" On 2015/09/15 18:03:07, lwchkg wrote: > On 2015/09/15 14:27:08, lwchkg wrote: > > On 2015/09/14 17:28:16, msramek wrote: > > > Why not SELECT COUNT(DISTINCT ...) FROM ...? > > > > I was using COUNT(*) where DISTINCT is not available. I'll change to SELECT > > COUNT(DISTINCT ...) as it looks like more efficient. > > Sorry for wrong reply. SELECT COUNT(DISTINCT ...) FROM ... doesn't work because > we need to count distinct (url, date) pairs. Just to be sure we're talking about the same thing: SELECT COUNT(DISTINCT url, DATE(...)) FROM visits doesn't work? (SQLite has limited syntax, so I wouldn't be surprised, I'm just wondering). https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/visit_database.cc:544: // Windows Epoch to the number of seconds from Unix Epoch. On 2015/09/15 14:27:08, lwchkg wrote: > On 2015/09/14 17:28:16, msramek wrote: > > Shouldn't this be DATE(timestamp, 'unixepoch', 'localtime')? First, we > interpret > > the timestamp as Unix Epoch, and then convert it to local time? > > > > https://www.sqlite.org/lang_datefunc.html > > You're right. Confirmed that my wrong version calculates times incorrectly. This > is very embarrassing. > > > > > Also note that base::Time() timestamps are from "1601-01-01", not from > > "1970-01-01", i.e. not Unix Epoch. But since AFAIK neither of them handles > leap > > seconds, it should be fine. This should be explained in a comment. > > Thanks for the information and I'll include in the comments. MS filetime seems > to be completely ignorant about the concepts of leap seconds. Hm, I didn't finish my thought properly. Since both epochs start at midnight of their respective time, and they don't use leap seconds, it holds that DATE(x, 'unixepoch', 'localtime') == DATE(y, 'unixepoch', 'localtime') if and only if DATE(x-11644473600, 'unixepoch', 'localtime') == DATE(y-11644473600, 'unixepoch', 'localtime') So the conversion with kTimeTToMicrosecondsOffset is not even needed, because we don't need to get the exact date, just to round to the current date. However, what you currently have is better for readability and probably not observably slower. Let's keep this for now, and see if other reviewers have different opinions. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/visit_database.cc:545: "DATE(visit_time/1000000-11644473600, 'localtime', 'unixepoch') " On 2015/09/15 14:27:08, lwchkg wrote: > Time::kTimeTToMicrosecondsOffset sounds good. Anyway the comment "... > platform-dependent epoch" (line 408) in base/time/time.h is scary. It does. I looked around and it seems to be used everywhere for these conversions, see e.g.: https://code.google.com/p/chromium/codesearch#chromium/src/base/time/time.cc&... and it actually has the same value defined for posix, Win, and Mac, so I assume that this is correct. https://codereview.chromium.org/1345473003/diff/20001/components/history/core... File components/history/core/browser/history_backend.cc (right): https://codereview.chromium.org/1345473003/diff/20001/components/history/core... components/history/core/browser/history_backend.cc:1063: result.success = false; Nit: result.success = db_ && db_->GetHistoryCount()
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
lgtm
Here's the response to some of msramek's comment. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... File components/history/core/browser/visit_database.cc (right): https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/visit_database.cc:541: "SELECT COUNT(*) FROM (" On 2015/09/16 11:16:30, msramek wrote: > On 2015/09/15 18:03:07, lwchkg wrote: > > On 2015/09/15 14:27:08, lwchkg wrote: > > > On 2015/09/14 17:28:16, msramek wrote: > > > > Why not SELECT COUNT(DISTINCT ...) FROM ...? > > > > > > I was using COUNT(*) where DISTINCT is not available. I'll change to SELECT > > > COUNT(DISTINCT ...) as it looks like more efficient. > > > > Sorry for wrong reply. SELECT COUNT(DISTINCT ...) FROM ... doesn't work > because > > we need to count distinct (url, date) pairs. > > Just to be sure we're talking about the same thing: > > SELECT COUNT(DISTINCT url, DATE(...)) FROM visits > > doesn't work? (SQLite has limited syntax, so I wouldn't be surprised, I'm just > wondering). No, it doesn't work in SQLite. Error message: wrong number of arguments to function COUNT() Anyway, just checked the syntax of a few SQL engines. MySQL has the syntax you mentioned, but Oracle has not. see https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_count and http://docs.oracle.com/database/121/nav/portal_5.htm Is that an non-standard feature of MySQL? https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/visit_database.cc:544: // Windows Epoch to the number of seconds from Unix Epoch. > > Hm, I didn't finish my thought properly. Since both epochs start at midnight of > their respective time, and they don't use leap seconds, it holds that > > DATE(x, 'unixepoch', 'localtime') == DATE(y, 'unixepoch', 'localtime') > > if and only if > > DATE(x-11644473600, 'unixepoch', 'localtime') == DATE(y-11644473600, > 'unixepoch', 'localtime') > > So the conversion with kTimeTToMicrosecondsOffset is not even needed, because we > don't need to get the exact date, just to round to the current date. Not correct. The leap years are different before and after conversion. So the dates will be different. This has a effect in Daylight Saving Time because timestamps are corrected with one more hour of offset. > > However, what you currently have is better for readability and probably not > observably slower. > > Let's keep this for now, and see if other reviewers have different opinions. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/visit_database.cc:545: "DATE(visit_time/1000000-11644473600, 'localtime', 'unixepoch') " On 2015/09/16 11:16:30, msramek wrote: > On 2015/09/15 14:27:08, lwchkg wrote: > > Time::kTimeTToMicrosecondsOffset sounds good. Anyway the comment "... > > platform-dependent epoch" (line 408) in base/time/time.h is scary. > > It does. I looked around and it seems to be used everywhere for these > conversions, see e.g.: > > https://code.google.com/p/chromium/codesearch#chromium/src/base/time/time.cc&... > > and it actually has the same value defined for posix, Win, and Mac, so I assume > that this is correct. Agree... looks like the comment is incorrect. Should we file a bug? https://codereview.chromium.org/1345473003/diff/20001/components/history/core... File components/history/core/browser/history_backend.cc (right): https://codereview.chromium.org/1345473003/diff/20001/components/history/core... components/history/core/browser/history_backend.cc:1063: result.success = false; On 2015/09/16 11:16:30, msramek wrote: > Nit: result.success = db_ && db_->GetHistoryCount() Sounds good.
The CQ bit was checked by lwchkg@gmail.com
The patchset sent to the CQ was uploaded after l-g-t-m from msramek@chromium.org, sdefresne@chromium.org Link to the patchset: https://codereview.chromium.org/1345473003/#ps40001 (title: "Respond to msramek's comment")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1345473003/40001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1345473003/40001
Message was sent while issue was closed.
Committed patchset #3 (id:40001)
Message was sent while issue was closed.
Patchset 3 (id:??) landed as https://crrev.com/4cbca7b75d45f19948cf89c8b5ed6934bd62af4a Cr-Commit-Position: refs/heads/master@{#349184}
Message was sent while issue was closed.
https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... File components/history/core/browser/visit_database.cc (right): https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/visit_database.cc:544: // Windows Epoch to the number of seconds from Unix Epoch. On 2015/09/16 17:32:34, lwchkg wrote: > > > > Hm, I didn't finish my thought properly. Since both epochs start at midnight > of > > their respective time, and they don't use leap seconds, it holds that > > > > DATE(x, 'unixepoch', 'localtime') == DATE(y, 'unixepoch', 'localtime') > > > > if and only if > > > > DATE(x-11644473600, 'unixepoch', 'localtime') == DATE(y-11644473600, > > 'unixepoch', 'localtime') > > > > So the conversion with kTimeTToMicrosecondsOffset is not even needed, because > we > > don't need to get the exact date, just to round to the current date. > > Not correct. The leap years are different before and after conversion. So the > dates will be different. This has a effect in Daylight Saving Time because > timestamps are corrected with one more hour of offset. > > > > > However, what you currently have is better for readability and probably not > > observably slower. > > > > Let's keep this for now, and see if other reviewers have different opinions. > Ah, of course, DST. Thanks for catching this, and sorry for misguiding you. https://codereview.chromium.org/1345473003/diff/1/components/history/core/bro... components/history/core/browser/visit_database.cc:545: "DATE(visit_time/1000000-11644473600, 'localtime', 'unixepoch') " On 2015/09/16 17:32:34, lwchkg wrote: > On 2015/09/16 11:16:30, msramek wrote: > > On 2015/09/15 14:27:08, lwchkg wrote: > > > Time::kTimeTToMicrosecondsOffset sounds good. Anyway the comment "... > > > platform-dependent epoch" (line 408) in base/time/time.h is scary. > > > > It does. I looked around and it seems to be used everywhere for these > > conversions, see e.g.: > > > > > https://code.google.com/p/chromium/codesearch#chromium/src/base/time/time.cc&... > > > > and it actually has the same value defined for posix, Win, and Mac, so I > assume > > that this is correct. > > Agree... looks like the comment is incorrect. Should we file a bug? I investigated a bit more - the constant is used specifically as the offset between base::Time and time_t. base::Time is guaranteed to start on 1601-01-01, but time_t is *not* guaranteed to start on 1971-01-01 (although in practice it does). That's why the constant is platform-dependent. Now in this case, we're transforming base::Time to unixepoch, not to time_t, so the constant is platform-independent. So while in practice kTimeTToMicrosecondsOffset is the correct constant to use, in theory it's not. I'll fix this shortly.
Message was sent while issue was closed.
Patchset 3 (id:??) landed as https://crrev.com/4cbca7b75d45f19948cf89c8b5ed6934bd62af4a Cr-Commit-Position: refs/heads/master@{#349184} |