|
|
Chromium Code Reviews|
Created:
7 years, 2 months ago by sclittle Modified:
7 years ago Reviewers:
cbentzel, jar (doing other things), Jói, davidben, bengr, Alexei Svitkine (slow), mmenke, Scott Hess - ex-Googler CC:
chromium-reviews, cbentzel+watch_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@precache Visibility:
Public. |
DescriptionMetrics database for tracking precache statistics.
BUG=306185, 309216
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=238824
Patch Set 1 #Patch Set 2 : General fixes, added tests, and moved PrecacheManager into component #
Total comments: 60
Patch Set 3 : Addressed comments #
Total comments: 4
Patch Set 4 : Moved MostVisitedURLsProvider into chrome/browser/precache #
Total comments: 41
Patch Set 5 : Addressed comments #
Total comments: 17
Patch Set 6 : Addressed comments #Patch Set 7 : Added field trials #
Total comments: 31
Patch Set 8 : Reduced CL down to just the PrecacheDatabase #
Total comments: 19
Patch Set 9 : Addressed comments #
Total comments: 6
Patch Set 10 : Addressed comments #
Total comments: 38
Patch Set 11 : Addressed comments #
Total comments: 2
Patch Set 12 : Addressed comments #
Total comments: 12
Patch Set 13 : Changed to report metrics per-resource, and addressed comments #
Total comments: 8
Patch Set 14 : Addressed comments #
Total comments: 2
Patch Set 15 : Addressed comments #Patch Set 16 : Updated histogram comments to indicate that metrics are logged in bytes, not KB #Patch Set 17 : Changed components_tests.gyp to depend on precache_core target instead of precache target #Messages
Total messages: 54 (0 generated)
I've written the code for the precache database. It still needs more unit tests, and I haven't rigorously tested it manually yet, but the interfaces are all here and everything's implemented. I'll patch in the additional unit tests once I've written them.
On 2013/10/16 07:41:56, sclittle wrote: > I've written the code for the precache database. > > It still needs more unit tests, and I haven't rigorously tested it manually yet, > but the interfaces are all here and everything's implemented. > > I'll patch in the additional unit tests once I've written them. I'll review once the unit tests are in.
I've added tests and made a lot of general fixes. It's still a little rough, but PTAL.
https://codereview.chromium.org/27047003/diff/24028/chrome/browser/DEPS File chrome/browser/DEPS (right): https://codereview.chromium.org/27047003/diff/24028/chrome/browser/DEPS#newco... chrome/browser/DEPS:16: "+components/precache", alphabetize. https://codereview.chromium.org/27047003/diff/24028/chrome/browser/history/mo... File chrome/browser/history/most_visited_urls_provider.cc (right): https://codereview.chromium.org/27047003/diff/24028/chrome/browser/history/mo... chrome/browser/history/most_visited_urls_provider.cc:1: // Copyright (c) 2012 The Chromium Authors. All rights reserved. 2012 -> 2013. Remove the '(c)' too. https://codereview.chromium.org/27047003/diff/24028/chrome/browser/history/mo... chrome/browser/history/most_visited_urls_provider.cc:12: MostVisitedURLsProvider::MostVisitedURLsProvider(TopSites* top_sites) Can this be a const reference? https://codereview.chromium.org/27047003/diff/24028/chrome/browser/history/mo... chrome/browser/history/most_visited_urls_provider.cc:24: base::Callback<void(const std::list<GURL>&)> callback, typedef this in the interface. https://codereview.chromium.org/27047003/diff/24028/chrome/browser/history/mo... File chrome/browser/history/most_visited_urls_provider.h (right): https://codereview.chromium.org/27047003/diff/24028/chrome/browser/history/mo... chrome/browser/history/most_visited_urls_provider.h:1: // Copyright (c) 2012 The Chromium Authors. All rights reserved. 2012 -> 2013. Remove the '(c)' too. https://codereview.chromium.org/27047003/diff/24028/chrome/browser/history/mo... chrome/browser/history/most_visited_urls_provider.h:23: // An InterestingURLsProvider that provides a list of the user's most visited How about "RepresentativeURLsProvider"? I'm not sure this is any better than "Interesting". https://codereview.chromium.org/27047003/diff/24028/chrome/browser/net/chrome... File chrome/browser/net/chrome_network_delegate.cc (right): https://codereview.chromium.org/27047003/diff/24028/chrome/browser/net/chrome... chrome/browser/net/chrome_network_delegate.cc:305: void RecordPrecacheStatsOnUIThread(void* profile_id, const GURL& url, Why is this a void*? Can it be a const Profile*? https://codereview.chromium.org/27047003/diff/24028/chrome/browser/net/chrome... chrome/browser/net/chrome_network_delegate.cc:305: void RecordPrecacheStatsOnUIThread(void* profile_id, const GURL& url, To be explicit, I'd wrap this function in #if defined(OS_ANDROID), in addition to wrapping the caller. https://codereview.chromium.org/27047003/diff/24028/chrome/browser/net/chrome... chrome/browser/net/chrome_network_delegate.cc:316: if (!precache_manager) Can this ever happen? How? https://codereview.chromium.org/27047003/diff/24028/chrome/browser/net/chrome... chrome/browser/net/chrome_network_delegate.cc:320: if (!precache_database) Likewise. https://codereview.chromium.org/27047003/diff/24028/chrome/browser/net/chrome... chrome/browser/net/chrome_network_delegate.cc:323: bool is_cellular = net::NetworkChangeNotifier::IsConnectionCellular( I wonder what fraction is labeled as CONNECTION_UNKNOWN. https://codereview.chromium.org/27047003/diff/24028/chrome/browser/net/chrome... chrome/browser/net/chrome_network_delegate.cc:556: received_content_length, request->was_cached())); In what cases does was_cached() return true? Is it true if revalidated and not changed? Etc. https://codereview.chromium.org/27047003/diff/24028/chrome/common/chrome_swit... File chrome/common/chrome_switches.cc (left): https://codereview.chromium.org/27047003/diff/24028/chrome/common/chrome_swit... chrome/common/chrome_switches.cc:702: const char kEnablePrecache[] = "enable-precache"; The semantics should be that precache is enabled if it is in the enabled group of a precache field trial, or this switch is used. If you don't already have a function that wraps both, please add one, i.e., bool IsPrecacheEnabled(); https://codereview.chromium.org/27047003/diff/24028/components/precache.gypi File components/precache.gypi (right): https://codereview.chromium.org/27047003/diff/24028/components/precache.gypi#... components/precache.gypi:37: 'precache/core/precache_statistics_table.cc', alphabetize. https://codereview.chromium.org/27047003/diff/24028/components/precache/conte... File components/precache/content/precache_manager.cc (right): https://codereview.chromium.org/27047003/diff/24028/components/precache/conte... components/precache/content/precache_manager.cc:22: namespace { Move this anonymous namespace about the precache namespace. https://codereview.chromium.org/27047003/diff/24028/components/precache/conte... components/precache/content/precache_manager.cc:56: return CommandLine::ForCurrentProcess()->HasSwitch(switches::kEnablePrecache); How does precache know which servers to query? Are these other flags? We will want to add these via gyp as well. https://codereview.chromium.org/27047003/diff/24028/components/precache/conte... components/precache/content/precache_manager.cc:94: base::Time::Now().LocalMidnight() - base::TimeDelta::FromHours(1))); Why 11pm? I don't follow. https://codereview.chromium.org/27047003/diff/24028/components/precache/conte... File components/precache/content/precache_manager.h (right): https://codereview.chromium.org/27047003/diff/24028/components/precache/conte... components/precache/content/precache_manager.h:14: #include "components/precache/core/interesting_urls_provider.h" actually, could this just be called a urls provider? https://codereview.chromium.org/27047003/diff/24028/components/precache/conte... components/precache/content/precache_manager.h:39: static bool IsPrecachingEnabled(); Ah, here it is. Great! https://codereview.chromium.org/27047003/diff/24028/components/precache/conte... components/precache/content/precache_manager.h:57: bool is_precaching(); I prefer IsPrecaching() because it is answering a question in addition to returning a member. https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... File components/precache/core/precache_database.cc (right): https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... components/precache/core/precache_database.cc:18: namespace { Again, any reason your anonymous namespace is declared inside precache? https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... components/precache/core/precache_database.cc:24: int GetPercentageForUMA(int64 numerator, int64 denomenator) { Does a function like this really not exist in chromium? I would rename as GetInt64Percentage. https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... components/precache/core/precache_database.cc:33: UMA_HISTOGRAM_COUNTS("Precache.DailyPrecachedKB", You also need to describe these UMA in histograms.xml https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... components/precache/core/precache_database.cc:34: stats.precached_bytes / 1000); A KB is 1024 B. Why not report bytes instead? Also, double change the ranges for UMA_HISTOGRAM_COUNTS https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... components/precache/core/precache_database.cc:65: // is save to delete it, regardless of what thread we are on. is safe https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... components/precache/core/precache_database.cc:98: RecordSingleDayPrecacheUMA(it->second); I don't think we should record UMA for days where nothing was fetched/prefetched. https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... File components/precache/core/precache_database.h (right): https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... components/precache/core/precache_database.h:25: // constructed or destroyed on any threads, but all other methods must be called threads -> thread on a single thread -> on the same thread https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... components/precache/core/precache_database.h:41: void FlushOldStats(base::Time flush_end_date); ReportPreviousDaysStats() ? https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... File components/precache/core/precache_database_unittest.cc (right): https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... components/precache/core/precache_database_unittest.cc:280: } // namespace In addition, I'd like a holistic test that verifies that everything works correctly if used the way ChromeNetworkDelegate uses it. This would run through a list of resource receptions. https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... File components/precache/core/precache_switches.cc (right): https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... components/precache/core/precache_switches.cc:8: namespace switches { Is this how components manage their own switches?
https://codereview.chromium.org/27047003/diff/24028/chrome/browser/DEPS File chrome/browser/DEPS (right): https://codereview.chromium.org/27047003/diff/24028/chrome/browser/DEPS#newco... chrome/browser/DEPS:16: "+components/precache", On 2013/10/23 19:03:36, bengr1 wrote: > alphabetize. Done. https://codereview.chromium.org/27047003/diff/24028/chrome/browser/history/mo... File chrome/browser/history/most_visited_urls_provider.cc (right): https://codereview.chromium.org/27047003/diff/24028/chrome/browser/history/mo... chrome/browser/history/most_visited_urls_provider.cc:1: // Copyright (c) 2012 The Chromium Authors. All rights reserved. On 2013/10/23 19:03:36, bengr1 wrote: > 2012 -> 2013. Remove the '(c)' too. Done. https://codereview.chromium.org/27047003/diff/24028/chrome/browser/history/mo... chrome/browser/history/most_visited_urls_provider.cc:12: MostVisitedURLsProvider::MostVisitedURLsProvider(TopSites* top_sites) On 2013/10/23 19:03:36, bengr1 wrote: > Can this be a const reference? Unfortunately, no. We need a non-const TopSites in order to call top_sites->GetMostVisitedURLs, which is a non-const method. https://codereview.chromium.org/27047003/diff/24028/chrome/browser/history/mo... chrome/browser/history/most_visited_urls_provider.cc:24: base::Callback<void(const std::list<GURL>&)> callback, On 2013/10/23 19:03:36, bengr1 wrote: > typedef this in the interface. Done. https://codereview.chromium.org/27047003/diff/24028/chrome/browser/history/mo... File chrome/browser/history/most_visited_urls_provider.h (right): https://codereview.chromium.org/27047003/diff/24028/chrome/browser/history/mo... chrome/browser/history/most_visited_urls_provider.h:1: // Copyright (c) 2012 The Chromium Authors. All rights reserved. On 2013/10/23 19:03:36, bengr1 wrote: > 2012 -> 2013. Remove the '(c)' too. Done. https://codereview.chromium.org/27047003/diff/24028/chrome/browser/history/mo... chrome/browser/history/most_visited_urls_provider.h:23: // An InterestingURLsProvider that provides a list of the user's most visited On 2013/10/23 19:03:36, bengr1 wrote: > How about "RepresentativeURLsProvider"? I'm not sure this is any better than > "Interesting". Renamed to URLListProvider. https://codereview.chromium.org/27047003/diff/24028/chrome/browser/net/chrome... File chrome/browser/net/chrome_network_delegate.cc (right): https://codereview.chromium.org/27047003/diff/24028/chrome/browser/net/chrome... chrome/browser/net/chrome_network_delegate.cc:305: void RecordPrecacheStatsOnUIThread(void* profile_id, const GURL& url, On 2013/10/23 19:03:36, bengr1 wrote: > Why is this a void*? Can it be a const Profile*? It's a void* because that's what the ChromeNetworkDelegate has. Profiles should only be used on the UI thread, so we wait until we're on the UI thread before we cast it into a Profile*. The function NotifyEPMRequestStatus in this file does it the same way. We can't make it a const pointer since BrowserContext::GetRequestContext is a non-const method, and the PrecacheManager needs to call that on the profile when it constructs the PrecacheFetcher. https://codereview.chromium.org/27047003/diff/24028/chrome/browser/net/chrome... chrome/browser/net/chrome_network_delegate.cc:305: void RecordPrecacheStatsOnUIThread(void* profile_id, const GURL& url, On 2013/10/23 19:03:36, bengr1 wrote: > To be explicit, I'd wrap this function in #if defined(OS_ANDROID), in addition > to wrapping the caller. Done. https://codereview.chromium.org/27047003/diff/24028/chrome/browser/net/chrome... chrome/browser/net/chrome_network_delegate.cc:316: if (!precache_manager) On 2013/10/23 19:03:36, bengr1 wrote: > Can this ever happen? How? This will happen if the profile is incognito. https://codereview.chromium.org/27047003/diff/24028/chrome/browser/net/chrome... chrome/browser/net/chrome_network_delegate.cc:320: if (!precache_database) On 2013/10/23 19:03:36, bengr1 wrote: > Likewise. This can never happen. Removed if statement. https://codereview.chromium.org/27047003/diff/24028/chrome/browser/net/chrome... chrome/browser/net/chrome_network_delegate.cc:323: bool is_cellular = net::NetworkChangeNotifier::IsConnectionCellular( On 2013/10/23 19:03:36, bengr1 wrote: > I wonder what fraction is labeled as CONNECTION_UNKNOWN. Check out the NCN histograms, including NCN.CM.TimeOnUnknown, NCN.CM.TimeOn2G, NCN.CM.TimeOn3G, and NCN.CM.TimeOn4G. https://codereview.chromium.org/27047003/diff/24028/chrome/browser/net/chrome... chrome/browser/net/chrome_network_delegate.cc:556: received_content_length, request->was_cached())); On 2013/10/23 19:03:36, bengr1 wrote: > In what cases does was_cached() return true? Is it true if revalidated and not > changed? Etc. was_cached() returns true if the response was served from the cache, including 304s. I've verified this manually. https://codereview.chromium.org/27047003/diff/24028/chrome/common/chrome_swit... File chrome/common/chrome_switches.cc (left): https://codereview.chromium.org/27047003/diff/24028/chrome/common/chrome_swit... chrome/common/chrome_switches.cc:702: const char kEnablePrecache[] = "enable-precache"; On 2013/10/23 19:03:36, bengr1 wrote: > The semantics should be that precache is enabled if it is in the enabled group > of a precache field trial, or this switch is used. If you don't already have a > function that wraps both, please add one, i.e., > > bool IsPrecacheEnabled(); see PrecacheManager::IsPrecachingEnabled() https://codereview.chromium.org/27047003/diff/24028/components/precache.gypi File components/precache.gypi (right): https://codereview.chromium.org/27047003/diff/24028/components/precache.gypi#... components/precache.gypi:37: 'precache/core/precache_statistics_table.cc', On 2013/10/23 19:03:36, bengr1 wrote: > alphabetize. Done. https://codereview.chromium.org/27047003/diff/24028/components/precache/conte... File components/precache/content/precache_manager.cc (right): https://codereview.chromium.org/27047003/diff/24028/components/precache/conte... components/precache/content/precache_manager.cc:22: namespace { On 2013/10/23 19:03:36, bengr1 wrote: > Move this anonymous namespace about the precache namespace. Done. https://codereview.chromium.org/27047003/diff/24028/components/precache/conte... components/precache/content/precache_manager.cc:56: return CommandLine::ForCurrentProcess()->HasSwitch(switches::kEnablePrecache); On 2013/10/23 19:03:36, bengr1 wrote: > How does precache know which servers to query? Are these other flags? We will > want to add these via gyp as well. Those are set by flags already; see components/precache/core/precache_switches.cc, and if the flags aren't set, the values defined in precache_defines.gypi are used. The existence of these flags is documented in the PrecacheFetcher class comment and in the code where they are used, should they be documented in PrecacheManager as well? https://codereview.chromium.org/27047003/diff/24028/components/precache/conte... components/precache/content/precache_manager.cc:94: base::Time::Now().LocalMidnight() - base::TimeDelta::FromHours(1))); On 2013/10/23 19:03:36, bengr1 wrote: > Why 11pm? I don't follow. It just has to be a time on the day before Now(). The 11PM weirdness is to account for daylight savings. Pulled it out and added comment. https://codereview.chromium.org/27047003/diff/24028/components/precache/conte... File components/precache/content/precache_manager.h (right): https://codereview.chromium.org/27047003/diff/24028/components/precache/conte... components/precache/content/precache_manager.h:14: #include "components/precache/core/interesting_urls_provider.h" On 2013/10/23 19:03:36, bengr1 wrote: > actually, could this just be called a urls provider? Changed to URLListProvider https://codereview.chromium.org/27047003/diff/24028/components/precache/conte... components/precache/content/precache_manager.h:39: static bool IsPrecachingEnabled(); On 2013/10/23 19:03:36, bengr1 wrote: > Ah, here it is. Great! You're welcome :) https://codereview.chromium.org/27047003/diff/24028/components/precache/conte... components/precache/content/precache_manager.h:57: bool is_precaching(); On 2013/10/23 19:03:36, bengr1 wrote: > I prefer IsPrecaching() because it is answering a question in addition to > returning a member. Done. https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... File components/precache/core/precache_database.cc (right): https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... components/precache/core/precache_database.cc:18: namespace { On 2013/10/23 19:03:36, bengr1 wrote: > Again, any reason your anonymous namespace is declared inside precache? Moved anon namespace out of precache namespace. https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... components/precache/core/precache_database.cc:24: int GetPercentageForUMA(int64 numerator, int64 denomenator) { On 2013/10/23 19:03:36, bengr1 wrote: > Does a function like this really not exist in chromium? I would rename as > GetInt64Percentage. On second thought, it doesn't really make sense to report a percentage at all if the denomenator is zero, so I've removed this function and inlined the calculation. https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... components/precache/core/precache_database.cc:33: UMA_HISTOGRAM_COUNTS("Precache.DailyPrecachedKB", On 2013/10/23 19:03:36, bengr1 wrote: > You also need to describe these UMA in histograms.xml Done. https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... components/precache/core/precache_database.cc:34: stats.precached_bytes / 1000); On 2013/10/23 19:03:36, bengr1 wrote: > A KB is 1024 B. Why not report bytes instead? Also, double change the ranges for > UMA_HISTOGRAM_COUNTS Changed to report bytes with max_range at 1 billion. https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... components/precache/core/precache_database.cc:65: // is save to delete it, regardless of what thread we are on. On 2013/10/23 19:03:36, bengr1 wrote: > is safe Done. https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... components/precache/core/precache_database.cc:98: RecordSingleDayPrecacheUMA(it->second); On 2013/10/23 19:03:36, bengr1 wrote: > I don't think we should record UMA for days where nothing was > fetched/prefetched. There won't be any rows in the statistics table for days when nothing was fetched or precached. Added comment. https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... File components/precache/core/precache_database.h (right): https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... components/precache/core/precache_database.h:25: // constructed or destroyed on any threads, but all other methods must be called On 2013/10/23 19:03:36, bengr1 wrote: > threads -> thread > > on a single thread -> on the same thread Done. https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... components/precache/core/precache_database.h:41: void FlushOldStats(base::Time flush_end_date); On 2013/10/23 19:03:36, bengr1 wrote: > ReportPreviousDaysStats() ? Changed to ReportAndDeleteOldStats. Passing in a Time parameter makes testing nicer since we can use a constant, as opposed to relying on "Now()" in tests. https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... File components/precache/core/precache_database_unittest.cc (right): https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... components/precache/core/precache_database_unittest.cc:280: } // namespace On 2013/10/23 19:03:36, bengr1 wrote: > In addition, I'd like a holistic test that verifies that everything works > correctly if used the way ChromeNetworkDelegate uses it. This would run through > a list of resource receptions. Done. https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... File components/precache/core/precache_switches.cc (right): https://codereview.chromium.org/27047003/diff/24028/components/precache/core/... components/precache/core/precache_switches.cc:8: namespace switches { On 2013/10/23 19:03:36, bengr1 wrote: > Is this how components manage their own switches? Yes, see components/autofill/core/common/autofill_switches.*
mmenke: components/precache/*, chrome_network_delegate joi: components_tests.gyp sky: chrome/browser/history/* jar: histograms.xml Thanks in advance!
Alexei for histograms.xml review
components_tests.gypi LGTM
On 2013/10/25 10:53:09, Jói wrote: > components_tests.gypi LGTM I'm pretty loaded with reviews today, not sure if I'll get to one this large until Monday.
What do you need me to review?
On Mon, Oct 28, 2013 at 7:13 AM, <sky@chromium.org> wrote: > What do you need me to review? > @sky: chrome/browser/history/* Thanks! > > https://codereview.chromium.**org/27047003/<https://codereview.chromium.org/2... > To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
Is there a design doc or something you can point me at?
On 2013/10/28 19:56:03, sky wrote: > Is there a design doc or something you can point me at? crbug.com/309216 Design doc is here: https://docs.google.com/a/google.com/document/d/1k_UnHahv8xpFFaBCg8N4cncNQpwn...
Thanks for the pointer. One more question, has this gone through eng review? It is certainly big enough that it seems as though it should. -Scott On Mon, Oct 28, 2013 at 1:13 PM, <sclittle@chromium.org> wrote: > On 2013/10/28 19:56:03, sky wrote: >> >> Is there a design doc or something you can point me at? > > > crbug.com/309216 > > Design doc is here: > https://docs.google.com/a/google.com/document/d/1k_UnHahv8xpFFaBCg8N4cncNQpwn... > > https://codereview.chromium.org/27047003/ To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
https://codereview.chromium.org/27047003/diff/174001/chrome/browser/history/m... File chrome/browser/history/most_visited_urls_provider.h (right): https://codereview.chromium.org/27047003/diff/174001/chrome/browser/history/m... chrome/browser/history/most_visited_urls_provider.h:18: class MostVisitedURLsProvider : public precache::URLListProvider { Is there a particular reason to put this in chrome/browser/history rather than some where in the precache code? https://codereview.chromium.org/27047003/diff/174001/components/precache/core... File components/precache/core/url_list_provider.h (right): https://codereview.chromium.org/27047003/diff/174001/components/precache/core... components/precache/core/url_list_provider.h:21: virtual void GetURLs(const GetURLsCallback& callback) = 0; This should document callback may be run before the call returns (synchronous).
On 2013/10/28 23:11:59, sky wrote: > Thanks for the pointer. > > One more question, has this gone through eng review? It is certainly > big enough that it seems as though it should. This is an experimental feature for now; we'll definitely go through a full eng review once we have some results.
https://codereview.chromium.org/27047003/diff/174001/chrome/browser/history/m... File chrome/browser/history/most_visited_urls_provider.h (right): https://codereview.chromium.org/27047003/diff/174001/chrome/browser/history/m... chrome/browser/history/most_visited_urls_provider.h:18: class MostVisitedURLsProvider : public precache::URLListProvider { On 2013/10/28 23:14:16, sky wrote: > Is there a particular reason to put this in chrome/browser/history rather than > some where in the precache code? Good point, it doesn't really have any purpose outside of the precache code. It can't be in the component, since layered components aren't allowed to depend on chrome/, and this uses TopSites, so I've moved it to a new directory chrome/browser/precache. https://codereview.chromium.org/27047003/diff/174001/components/precache/core... File components/precache/core/url_list_provider.h (right): https://codereview.chromium.org/27047003/diff/174001/components/precache/core... components/precache/core/url_list_provider.h:21: virtual void GetURLs(const GetURLsCallback& callback) = 0; On 2013/10/28 23:14:16, sky wrote: > This should document callback may be run before the call returns (synchronous). Done.
You've removed the changes from chrome/browser/history, so you no longer need me:) -sky As to the end review. Even if experimental it should still go through eng review.
https://codereview.chromium.org/27047003/diff/444001/chrome/browser/net/chrom... File chrome/browser/net/chrome_network_delegate.cc (right): https://codereview.chromium.org/27047003/diff/444001/chrome/browser/net/chrom... chrome/browser/net/chrome_network_delegate.cc:43: #include "components/precache/core/precache_database.h" Here and for the other new code, is there no if-def for whether or not we're being compiled with a particular component? https://codereview.chromium.org/27047003/diff/444001/components/precache/core... File components/precache/core/precache_database.cc (right): https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:26: UMA_HISTOGRAM_CUSTOM_COUNTS(name, sample, 1, 1000000000, 50); Samples are integers, aren't they? If that's the case, and we actually can hit 1 GB, it's not outside the realm of possibility we can hit 2 GB...At which point we overflow. If it's outside the realm of possibility, we should be using a smaller number. My recommended solution: Record all values in kilobytes. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:26: UMA_HISTOGRAM_CUSTOM_COUNTS(name, sample, 1, 1000000000, 50); I think we really should change the name. One could well imaging a UMA_HISTOGRAM_BYTES being added as a standard code, which could well cause confusion. Maybe just PRECACHE_UMA_HISTOGRAM_BYTES or PRECACHE_UMA_BYTES. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:52: #undef UMA_HISTOGRAM_BYTES Don't think this is needed, and #undef seems pretty uncommon in Chrome. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:62: DetachFromThread(); Does this work? Looks like the parent's destructor will call CalledOnValidThread(), which will fail when the thread ID is not set, which is what DetachFromThread() does. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:99: base::Time::FromInternalValue(0), Can't this just be base::Time()? https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:102: std::map<base::Time, PrecacheStatisticsTable::PrecacheStatistics> stats_map; This is really crying out for a public typedef in PrecacheStatisticsTable. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:104: base::Time::FromInternalValue(0), end_date, &stats_map); base::Time(0)? https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:122: bool is_precaching, bool is_cellular) { Should we separate out those resources we had to revalidate from those we didn't? Successful precaching is great in either case, but makes a bigger difference in the no-revalidation case. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:126: if (CantAccessDatabase()) { Should we just DCHECK on this instead, or is it an expected state in some cases? https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:134: // resource recently, then the precache fetch did nothing, so ignore it. I'm not following...The table should have the URL if the page was fetched in the last 60 days, which isn't exactly recent, and doesn't mean we didn't update it in the cache...does it? https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:141: precache_statistics_table_->IncreaseStatsForFetch(fetch_time, stats); Why are we only setting one of the fields? I'm confused. Also, why only when it wasn't cached? https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:169: precache_statistics_table_->IncreaseStatsForFetch(fetch_time, stats); When not precaching, we record stats if it wasn't cached, or it was cached and it's in the precache_url_table_...Why the difference from the precaching case? https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:175: bool PrecacheDatabase::CantAccessDatabase() { nit: I think it's easy to miss the "T" when reading the name. I suggest either "IsDatabaseInaccessible()" or "IsDatabaseAccessible()" and negating the result (I suggest "is" over "can" to make it even clearer it returns a book, but I'm fine with "CanAccessDatabase" as well). Open to other ideas. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:175: bool PrecacheDatabase::CantAccessDatabase() { Can this be const? https://codereview.chromium.org/27047003/diff/444001/components/precache/core... File components/precache/core/precache_database.h (right): https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.h:36: void Init(sql::Connection* db); May want to pass in the db as a scoped_ptr<sql::Connection> to make ownership semantics clear at callsites. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.h:54: ~PrecacheDatabase(); nit: Think there should be a blank line between friend and method declarations. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... File components/precache/core/precache_statistics_table.cc (right): https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_statistics_table.cc:46: && saved_bytes_cellular == other.saved_bytes_cellular; nit: && goes on previous line. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... File components/precache/core/precache_statistics_table.h (right): https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_statistics_table.h:48: // Updates daily statistics after a fetch. If a row for the day of "Fetch" is ambiguous. Particularly given that this isn't called for all cached responses, but it is called for *some* cached responses. I'm confused.
Thanks for the review, replies are below. https://codereview.chromium.org/27047003/diff/444001/chrome/browser/net/chrom... File chrome/browser/net/chrome_network_delegate.cc (right): https://codereview.chromium.org/27047003/diff/444001/chrome/browser/net/chrom... chrome/browser/net/chrome_network_delegate.cc:43: #include "components/precache/core/precache_database.h" On 2013/10/29 18:23:34, mmenke wrote: > Here and for the other new code, is there no if-def for whether or not we're > being compiled with a particular component? There isn't any macro that is specifically defined when we're being compiled with precaching; at the moment we use OS_ANDROID for this. Should I add a new macro for this that is only defined when OS_ANDROID is defined? https://codereview.chromium.org/27047003/diff/444001/components/precache/core... File components/precache/core/precache_database.cc (right): https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:26: UMA_HISTOGRAM_CUSTOM_COUNTS(name, sample, 1, 1000000000, 50); On 2013/10/29 18:23:34, mmenke wrote: > Samples are integers, aren't they? If that's the case, and we actually can hit > 1 GB, it's not outside the realm of possibility we can hit 2 GB...At which point > we overflow. If it's outside the realm of possibility, we should be using a > smaller number. > > My recommended solution: Record all values in kilobytes. Done. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:26: UMA_HISTOGRAM_CUSTOM_COUNTS(name, sample, 1, 1000000000, 50); On 2013/10/29 18:23:34, mmenke wrote: > I think we really should change the name. One could well imaging a > UMA_HISTOGRAM_BYTES being added as a standard code, which could well cause > confusion. Maybe just PRECACHE_UMA_HISTOGRAM_BYTES or PRECACHE_UMA_BYTES. Changed to report values in KB using UMA_HISTOGRAM_COUNTS. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:52: #undef UMA_HISTOGRAM_BYTES On 2013/10/29 18:23:34, mmenke wrote: > Don't think this is needed, and #undef seems pretty uncommon in Chrome. Removed. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:62: DetachFromThread(); On 2013/10/29 18:23:34, mmenke wrote: > Does this work? Looks like the parent's destructor will call > CalledOnValidThread(), which will fail when the thread ID is not set, which is > what DetachFromThread() does. If the thread ID is not set, CalledOnValidThread() will set the thread ID to that of the current thread. https://code.google.com/p/chromium/codesearch#chromium/src/base/threading/thr... ThreadCheckerImpl::EnsureThreadIdAssigned() sets the thread ID to the current thread if it is unset. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:99: base::Time::FromInternalValue(0), On 2013/10/29 18:23:34, mmenke wrote: > Can't this just be base::Time()? Done. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:102: std::map<base::Time, PrecacheStatisticsTable::PrecacheStatistics> stats_map; On 2013/10/29 18:23:34, mmenke wrote: > This is really crying out for a public typedef in PrecacheStatisticsTable. Done. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:104: base::Time::FromInternalValue(0), end_date, &stats_map); On 2013/10/29 18:23:34, mmenke wrote: > base::Time(0)? Changed to base::Time(). https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:122: bool is_precaching, bool is_cellular) { On 2013/10/29 18:23:34, mmenke wrote: > Should we separate out those resources we had to revalidate from those we > didn't? Successful precaching is great in either case, but makes a bigger > difference in the no-revalidation case. That's true, but for individual fetches we're just tracking the effects on bandwidth usage, and the main difference between a fresh cache hit and a 304 is the latency. Latency metrics are outside the scope of this CL. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:126: if (CantAccessDatabase()) { On 2013/10/29 18:23:34, mmenke wrote: > Should we just DCHECK on this instead, or is it an expected state in some cases? If the user ran out of disk space or something and we can't access the database, we don't want to crash Chrome. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:134: // resource recently, then the precache fetch did nothing, so ignore it. On 2013/10/29 18:23:34, mmenke wrote: > I'm not following...The table should have the URL if the page was fetched in the > last 60 days, which isn't exactly recent, and doesn't mean we didn't update it > in the cache...does it? The URL table only keeps track of URLs that are in the cache because of precaching, and would not be in the cache if not for precaching. I've updated the comment to be clearer, and I've also updated the class comment for PrecacheURLTable to be clearer about this. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:141: precache_statistics_table_->IncreaseStatsForFetch(fetch_time, stats); On 2013/10/29 18:23:34, mmenke wrote: > Why are we only setting one of the fields? I'm confused. Also, why only when > it wasn't cached? All fields in the PrecacheStatistics default to 0. Added comment. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:169: precache_statistics_table_->IncreaseStatsForFetch(fetch_time, stats); On 2013/10/29 18:23:34, mmenke wrote: > When not precaching, we record stats if it wasn't cached, or it was cached and > it's in the precache_url_table_...Why the difference from the precaching case? Added comment. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:175: bool PrecacheDatabase::CantAccessDatabase() { On 2013/10/29 18:23:34, mmenke wrote: > nit: I think it's easy to miss the "T" when reading the name. I suggest either > "IsDatabaseInaccessible()" or "IsDatabaseAccessible()" and negating the result > (I suggest "is" over "can" to make it even clearer it returns a book, but I'm > fine with "CanAccessDatabase" as well). Open to other ideas. Done. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:175: bool PrecacheDatabase::CantAccessDatabase() { On 2013/10/29 18:23:34, mmenke wrote: > Can this be const? Done. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... File components/precache/core/precache_database.h (right): https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.h:36: void Init(sql::Connection* db); On 2013/10/29 18:23:34, mmenke wrote: > May want to pass in the db as a scoped_ptr<sql::Connection> to make ownership > semantics clear at callsites. Done. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.h:54: ~PrecacheDatabase(); On 2013/10/29 18:23:34, mmenke wrote: > nit: Think there should be a blank line between friend and method declarations. Done. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... File components/precache/core/precache_statistics_table.cc (right): https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_statistics_table.cc:46: && saved_bytes_cellular == other.saved_bytes_cellular; On 2013/10/29 18:23:34, mmenke wrote: > nit: && goes on previous line. Done. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... File components/precache/core/precache_statistics_table.h (right): https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_statistics_table.h:48: // Updates daily statistics after a fetch. If a row for the day of On 2013/10/29 18:23:34, mmenke wrote: > "Fetch" is ambiguous. Particularly given that this isn't called for all cached > responses, but it is called for *some* cached responses. I'm confused. Removed the concept of "fetch" from this method.
https://codereview.chromium.org/27047003/diff/444001/chrome/browser/net/chrom... File chrome/browser/net/chrome_network_delegate.cc (right): https://codereview.chromium.org/27047003/diff/444001/chrome/browser/net/chrom... chrome/browser/net/chrome_network_delegate.cc:43: #include "components/precache/core/precache_database.h" On 2013/10/29 23:40:45, sclittle wrote: > On 2013/10/29 18:23:34, mmenke wrote: > > Here and for the other new code, is there no if-def for whether or not we're > > being compiled with a particular component? > > There isn't any macro that is specifically defined when we're being compiled > with precaching; at the moment we use OS_ANDROID for this. > > Should I add a new macro for this that is only defined when OS_ANDROID is > defined? No, it's fine - I had just thought/hoped that there might be a pre-processor macro system associated with which components we were being compiled with. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... File components/precache/core/precache_database.cc (right): https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:134: // resource recently, then the precache fetch did nothing, so ignore it. On 2013/10/29 23:40:45, sclittle wrote: > On 2013/10/29 18:23:34, mmenke wrote: > > I'm not following...The table should have the URL if the page was fetched in > the > > last 60 days, which isn't exactly recent, and doesn't mean we didn't update it > > in the cache...does it? > > The URL table only keeps track of URLs that are in the cache because of > precaching, and would not be in the cache if not for precaching. > > I've updated the comment to be clearer, and I've also updated the class comment > for PrecacheURLTable to be clearer about this. You seem to add everything we download to the table, as long as precaching is enabled via a command line flag, no? I see nothing to make is_pracaching true only in the particular case that a request was initiated by the precache code. https://codereview.chromium.org/27047003/diff/554001/components/precache/core... File components/precache/core/precache_database.cc (right): https://codereview.chromium.org/27047003/diff/554001/components/precache/core... components/precache/core/precache_database.cc:8: #include "base/time/time.h" No need to include stuff here that's in the header. https://codereview.chromium.org/27047003/diff/554001/components/precache/core... components/precache/core/precache_database.cc:105: } suggestion: Just report stats for the previous day, if we have them. Downside: We lose stats. There will be a bias against light users in our numbers. Upside: We don't combine numbers from different days, or from different algorithms as we try new things. Otherwise, whenever you play with things, you'll continue to get old stats for days. https://codereview.chromium.org/27047003/diff/554001/components/precache/core... components/precache/core/precache_database.cc:134: stats.precached_bytes = size; I'm still not following....These bytes were not in fact precached, they were downloaded, possibly over a cellular network, possibly not. https://codereview.chromium.org/27047003/diff/554001/components/precache/core... File components/precache/core/precache_database.h (right): https://codereview.chromium.org/27047003/diff/554001/components/precache/core... components/precache/core/precache_database.h:22: class PrecacheStatisticsTable; nit: Alphabetize. https://codereview.chromium.org/27047003/diff/554001/components/precache/core... components/precache/core/precache_database.h:42: // by const reference so that this method can be posted on a thread. I don't believe this is true - when you pass by reference, the data is actually still copied to the callback. struct CallbackParamTraits { typedef const T& ForwardType; // <-- type for the function is a reference typedef T StorageType; // <-- but it's copied for storage. }; Time is often passed by value rather than reference, so think the code is fine, should just remove that comment. https://codereview.chromium.org/27047003/diff/554001/components/precache/core... components/precache/core/precache_database.h:47: // method can be posted on a thread. Again, this is not true. Should pass URL by reference. By value is fine for everything else. Can then also remove gurl.h from the includes. https://codereview.chromium.org/27047003/diff/554001/components/precache/core... components/precache/core/precache_database.h:49: bool was_cached, bool is_precaching, bool is_cellular); You definitely need to define is_precaching here. I had assumed it meant precaching was enabled, and it seems to depend on command line flags, rather than what initiated the request. https://codereview.chromium.org/27047003/diff/554001/components/precache/core... components/precache/core/precache_database.h:49: bool was_cached, bool is_precaching, bool is_cellular); nit: |is_cellular| could be clearer - I suggest is_connection_cellular.
Thanks for the comments, replies are below. https://codereview.chromium.org/27047003/diff/444001/components/precache/core... File components/precache/core/precache_database.cc (right): https://codereview.chromium.org/27047003/diff/444001/components/precache/core... components/precache/core/precache_database.cc:134: // resource recently, then the precache fetch did nothing, so ignore it. On 2013/10/30 18:29:08, mmenke wrote: > On 2013/10/29 23:40:45, sclittle wrote: > > On 2013/10/29 18:23:34, mmenke wrote: > > > I'm not following...The table should have the URL if the page was fetched in > > the > > > last 60 days, which isn't exactly recent, and doesn't mean we didn't update > it > > > in the cache...does it? > > > > The URL table only keeps track of URLs that are in the cache because of > > precaching, and would not be in the cache if not for precaching. > > > > I've updated the comment to be clearer, and I've also updated the class > comment > > for PrecacheURLTable to be clearer about this. > > You seem to add everything we download to the table, as long as precaching is > enabled via a command line flag, no? I see nothing to make is_pracaching true > only in the particular case that a request was initiated by the precache code. The purpose of the URL table is to allow us to identify when a fetch comes from the cache thanks to precaching. Hopefully renaming the parameter from "is_precaching" to "is_precache" can clear that up. I've also added a comment to the precache_url_table_ member to hopefully help clarify this. https://codereview.chromium.org/27047003/diff/554001/components/precache/core... File components/precache/core/precache_database.cc (right): https://codereview.chromium.org/27047003/diff/554001/components/precache/core... components/precache/core/precache_database.cc:8: #include "base/time/time.h" On 2013/10/30 18:29:08, mmenke wrote: > No need to include stuff here that's in the header. Removed it from the header. https://codereview.chromium.org/27047003/diff/554001/components/precache/core... components/precache/core/precache_database.cc:105: } On 2013/10/30 18:29:08, mmenke wrote: > suggestion: Just report stats for the previous day, if we have them. > > Downside: We lose stats. There will be a bias against light users in our > numbers. Upside: We don't combine numbers from different days, or from > different algorithms as we try new things. Otherwise, whenever you play with > things, you'll continue to get old stats for days. You have a good point, but we'd continue to get old stats for days even if we do only report stats for the previous day. This is because a resource could be precached days before it eventually gets used, so if we play with different algorithms or other things on the server side, we would have to wait for the clients' caches to catch up anyway. Since we'll be getting data related to old precaches regardless, we're probably better off reporting all old stats so that we don't lose any stats. https://codereview.chromium.org/27047003/diff/554001/components/precache/core... components/precache/core/precache_database.cc:134: stats.precached_bytes = size; On 2013/10/30 18:29:08, mmenke wrote: > I'm still not following....These bytes were not in fact precached, they were > downloaded, possibly over a cellular network, possibly not. Renamed "is_precaching" to "is_precache" to hopefully clarify this. https://codereview.chromium.org/27047003/diff/554001/components/precache/core... File components/precache/core/precache_database.h (right): https://codereview.chromium.org/27047003/diff/554001/components/precache/core... components/precache/core/precache_database.h:22: class PrecacheStatisticsTable; On 2013/10/30 18:29:08, mmenke wrote: > nit: Alphabetize. Done. https://codereview.chromium.org/27047003/diff/554001/components/precache/core... components/precache/core/precache_database.h:42: // by const reference so that this method can be posted on a thread. On 2013/10/30 18:29:08, mmenke wrote: > I don't believe this is true - when you pass by reference, the data is actually > still copied to the callback. > > struct CallbackParamTraits { > typedef const T& ForwardType; // <-- type for the function is a reference > typedef T StorageType; // <-- but it's copied for storage. > }; > > Time is often passed by value rather than reference, so think the code is fine, > should just remove that comment. Done. I've changed it to pass Time by reference as well and forward declared Time. https://codereview.chromium.org/27047003/diff/554001/components/precache/core... components/precache/core/precache_database.h:47: // method can be posted on a thread. On 2013/10/30 18:29:08, mmenke wrote: > Again, this is not true. Should pass URL by reference. By value is fine for > everything else. Can then also remove gurl.h from the includes. Done. https://codereview.chromium.org/27047003/diff/554001/components/precache/core... components/precache/core/precache_database.h:49: bool was_cached, bool is_precaching, bool is_cellular); On 2013/10/30 18:29:08, mmenke wrote: > nit: |is_cellular| could be clearer - I suggest is_connection_cellular. Done. https://codereview.chromium.org/27047003/diff/554001/components/precache/core... components/precache/core/precache_database.h:49: bool was_cached, bool is_precaching, bool is_cellular); On 2013/10/30 18:29:08, mmenke wrote: > You definitely need to define is_precaching here. I had assumed it meant > precaching was enabled, and it seems to depend on command line flags, rather > than what initiated the request. It actually meant whether or not we are currently in the middle of precaching, i.e. that the fetch is a precache. Renamed to |is_precache| to be clearer, and added comment.
I'll have more comments later today - sorry on the slow start, I was really confused about the meaning of is_precache and the stat fields, and wanted to clear up what was going on before digging deeper. https://codereview.chromium.org/27047003/diff/554001/components/precache/core... File components/precache/core/precache_database.cc (right): https://codereview.chromium.org/27047003/diff/554001/components/precache/core... components/precache/core/precache_database.cc:105: } On 2013/10/30 20:29:48, sclittle wrote: > On 2013/10/30 18:29:08, mmenke wrote: > > suggestion: Just report stats for the previous day, if we have them. > > > > Downside: We lose stats. There will be a bias against light users in our > > numbers. Upside: We don't combine numbers from different days, or from > > different algorithms as we try new things. Otherwise, whenever you play with > > things, you'll continue to get old stats for days. > > You have a good point, but we'd continue to get old stats for days even if we do > only report stats for the previous day. > > This is because a resource could be precached days before it eventually gets > used, so if we play with different algorithms or other things on the server > side, we would have to wait for the clients' caches to catch up anyway. > > Since we'll be getting data related to old precaches regardless, we're probably > better off reporting all old stats so that we don't lose any stats. I'm not so sure, though you are certainly correct that my proposal isn't a perfect fix. https://codereview.chromium.org/27047003/diff/704001/chrome/browser/net/chrom... File chrome/browser/net/chrome_network_delegate.cc (right): https://codereview.chromium.org/27047003/diff/704001/chrome/browser/net/chrom... chrome/browser/net/chrome_network_delegate.cc:317: if (!received_content_length || url.is_empty()) Should only do this for HTTP and HTTPS requests, right? https://codereview.chromium.org/27047003/diff/704001/chrome/browser/net/chrom... chrome/browser/net/chrome_network_delegate.cc:338: precache_manager->IsPrecaching(), is_cellular)); Do we really need to get the time from the URL request? Not strongly opposed, just not sure passing it around really gets us much over using the time we record the stat instead. https://codereview.chromium.org/27047003/diff/704001/chrome/browser/net/chrom... chrome/browser/net/chrome_network_delegate.cc:338: precache_manager->IsPrecaching(), is_cellular)); I don't think the ChromeNetworkDelegate should know or care about the precache_database - I think we should just pass the URL, time, content length, and was_cached flag to the PrecacheManager and have it pass it on to the DB, so this file doesn't depend on it. Can also then get rid of the database accessor. Also should move logic about not recording 0-length requests / empty URLs / etc there. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... File components/precache/core/precache_database.h (right): https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_database.h:49: // |is_precache| indicates whether or not the fetch is a precache, and nit: "is a precache" -> "is a precache request" https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_database.h:54: bool is_connection_cellular); We're assuming all requests made while the PrecacheManager is precaching files is a precache attempt, which is not necessarily true. We should comment on that assumption where we call this function. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... File components/precache/core/precache_statistics_table.h (right): https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_statistics_table.h:33: int64 precached_bytes; I think these names are unclear. I'm a fan of clear names, even if they are painfully long, since they make code easier to follow. I think we should also add comments. I suggest calling this downloaded_precache_motivated_bytes or downloaded_precache_initiated_bytes, here and everywhere else it appears. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_statistics_table.h:34: int64 downloaded_bytes; I suggest calling this downloaded_non_precache_bytes. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_statistics_table.h:35: int64 downloaded_bytes_cellular; And this downloaded_non_precache_bytes_cellular. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... File components/precache/core/precache_statistics_table_unittest.cc (right): https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_statistics_table_unittest.cc:9: #include "base/callback.h" Do we need bind or callback? https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_statistics_table_unittest.cc:13: #include "sql/statement.h" Do we need sql/statement? https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_statistics_table_unittest.cc:29: virtual void SetUp() { OVERRIDE (And include compiler_specific for it) https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_statistics_table_unittest.cc:55: base::Time(), base::Time::Max(), &actual_stats_map); You never check that GetAllStatsBetween doesn't just always return all stats in the map. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_statistics_table_unittest.cc:61: const base::Time kBeforeTime = base::Time::UnixEpoch(); Using real times always makes me nervous. Suggest just using a constant time. Why? Technically, there are days on which midnight + 24 hours is the same day, due to leap seconds. Yes, the time code won't actually correctly handle that case, so this test will pass, but I really don't want to think about that case at all. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... File components/precache/core/precache_url_table.cc (right): https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_url_table.cc:83: statement.BindInt64(1, delete_end.ToInternalValue()); Are internal values guaranteed to be consistent between chrome versions (On a single platform, of course)? I don't see it changing, just want to make sure we shouldn't be using something else guaranteed to be consistent. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_url_table.cc:88: void PrecacheURLTable::GetAllDataForTesting(std::map<GURL, base::Time>* map) { I suggest clearing the map first, so we have a well defined behavior. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_url_table.cc:104: } This doesn't need to be a member function. Suggest just moving it up to the anonymous namespace. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... File components/precache/core/precache_url_table.h (right): https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_url_table.h:46: const base::Time& delete_end); Do we really need to have these between functions? Seems like we could just take an end time.
Let's do a bit more of an eng review as sky@ suggested before tunneling in on the code details. It doesn't make sense to get code details hammered out if the entire approach needs to be rethought. On Thu, Oct 31, 2013 at 12:44 PM, <mmenke@chromium.org> wrote: > I'll have more comments later today - sorry on the slow start, I was really > confused about the meaning of is_precache and the stat fields, and wanted to > clear up what was going on before digging deeper. > > > > https://codereview.chromium.org/27047003/diff/554001/components/precache/core... > File components/precache/core/precache_database.cc (right): > > https://codereview.chromium.org/27047003/diff/554001/components/precache/core... > components/precache/core/precache_database.cc:105: } > > On 2013/10/30 20:29:48, sclittle wrote: >> >> On 2013/10/30 18:29:08, mmenke wrote: >> > suggestion: Just report stats for the previous day, if we have > > them. >> >> > >> > Downside: We lose stats. There will be a bias against light users > > in our >> >> > numbers. Upside: We don't combine numbers from different days, or > > from >> >> > different algorithms as we try new things. Otherwise, whenever you > > play with >> >> > things, you'll continue to get old stats for days. > > >> You have a good point, but we'd continue to get old stats for days > > even if we do >> >> only report stats for the previous day. > > >> This is because a resource could be precached days before it > > eventually gets >> >> used, so if we play with different algorithms or other things on the > > server >> >> side, we would have to wait for the clients' caches to catch up > > anyway. > >> Since we'll be getting data related to old precaches regardless, we're > > probably >> >> better off reporting all old stats so that we don't lose any stats. > > > I'm not so sure, though you are certainly correct that my proposal isn't > a perfect fix. > > https://codereview.chromium.org/27047003/diff/704001/chrome/browser/net/chrom... > File chrome/browser/net/chrome_network_delegate.cc (right): > > https://codereview.chromium.org/27047003/diff/704001/chrome/browser/net/chrom... > chrome/browser/net/chrome_network_delegate.cc:317: if > (!received_content_length || url.is_empty()) > Should only do this for HTTP and HTTPS requests, right? > > https://codereview.chromium.org/27047003/diff/704001/chrome/browser/net/chrom... > chrome/browser/net/chrome_network_delegate.cc:338: > precache_manager->IsPrecaching(), is_cellular)); > Do we really need to get the time from the URL request? Not strongly > opposed, just not sure passing it around really gets us much over using > the time we record the stat instead. > > https://codereview.chromium.org/27047003/diff/704001/chrome/browser/net/chrom... > chrome/browser/net/chrome_network_delegate.cc:338: > precache_manager->IsPrecaching(), is_cellular)); > I don't think the ChromeNetworkDelegate should know or care about the > precache_database - I think we should just pass the URL, time, content > length, and was_cached flag to the PrecacheManager and have it pass it > on to the DB, so this file doesn't depend on it. Can also then get rid > of the database accessor. > > Also should move logic about not recording 0-length requests / empty > URLs / etc there. > > https://codereview.chromium.org/27047003/diff/704001/components/precache/core... > File components/precache/core/precache_database.h (right): > > https://codereview.chromium.org/27047003/diff/704001/components/precache/core... > components/precache/core/precache_database.h:49: // |is_precache| > indicates whether or not the fetch is a precache, and > nit: "is a precache" -> "is a precache request" > > https://codereview.chromium.org/27047003/diff/704001/components/precache/core... > components/precache/core/precache_database.h:54: bool > is_connection_cellular); > We're assuming all requests made while the PrecacheManager is precaching > files is a precache attempt, which is not necessarily true. We should > comment on that assumption where we call this function. > > https://codereview.chromium.org/27047003/diff/704001/components/precache/core... > File components/precache/core/precache_statistics_table.h (right): > > https://codereview.chromium.org/27047003/diff/704001/components/precache/core... > components/precache/core/precache_statistics_table.h:33: int64 > precached_bytes; > I think these names are unclear. I'm a fan of clear names, even if they > are painfully long, since they make code easier to follow. I think we > should also add comments. > > I suggest calling this downloaded_precache_motivated_bytes or > downloaded_precache_initiated_bytes, here and everywhere else it > appears. > > https://codereview.chromium.org/27047003/diff/704001/components/precache/core... > components/precache/core/precache_statistics_table.h:34: int64 > downloaded_bytes; > I suggest calling this downloaded_non_precache_bytes. > > https://codereview.chromium.org/27047003/diff/704001/components/precache/core... > components/precache/core/precache_statistics_table.h:35: int64 > downloaded_bytes_cellular; > And this downloaded_non_precache_bytes_cellular. > > https://codereview.chromium.org/27047003/diff/704001/components/precache/core... > File components/precache/core/precache_statistics_table_unittest.cc > (right): > > https://codereview.chromium.org/27047003/diff/704001/components/precache/core... > components/precache/core/precache_statistics_table_unittest.cc:9: > #include "base/callback.h" > Do we need bind or callback? > > https://codereview.chromium.org/27047003/diff/704001/components/precache/core... > components/precache/core/precache_statistics_table_unittest.cc:13: > #include "sql/statement.h" > Do we need sql/statement? > > https://codereview.chromium.org/27047003/diff/704001/components/precache/core... > components/precache/core/precache_statistics_table_unittest.cc:29: > virtual void SetUp() { > OVERRIDE (And include compiler_specific for it) > > https://codereview.chromium.org/27047003/diff/704001/components/precache/core... > components/precache/core/precache_statistics_table_unittest.cc:55: > base::Time(), base::Time::Max(), &actual_stats_map); > You never check that GetAllStatsBetween doesn't just always return all > stats in the map. > > https://codereview.chromium.org/27047003/diff/704001/components/precache/core... > components/precache/core/precache_statistics_table_unittest.cc:61: const > base::Time kBeforeTime = base::Time::UnixEpoch(); > Using real times always makes me nervous. Suggest just using a constant > time. Why? Technically, there are days on which midnight + 24 hours is > the same day, due to leap seconds. Yes, the time code won't actually > correctly handle that case, so this test will pass, but I really don't > want to think about that case at all. > > https://codereview.chromium.org/27047003/diff/704001/components/precache/core... > File components/precache/core/precache_url_table.cc (right): > > https://codereview.chromium.org/27047003/diff/704001/components/precache/core... > components/precache/core/precache_url_table.cc:83: > statement.BindInt64(1, delete_end.ToInternalValue()); > Are internal values guaranteed to be consistent between chrome versions > (On a single platform, of course)? I don't see it changing, just want > to make sure we shouldn't be using something else guaranteed to be > consistent. > > https://codereview.chromium.org/27047003/diff/704001/components/precache/core... > components/precache/core/precache_url_table.cc:88: void > PrecacheURLTable::GetAllDataForTesting(std::map<GURL, base::Time>* map) > { > I suggest clearing the map first, so we have a well defined behavior. > > https://codereview.chromium.org/27047003/diff/704001/components/precache/core... > components/precache/core/precache_url_table.cc:104: } > This doesn't need to be a member function. Suggest just moving it up to > the anonymous namespace. > > https://codereview.chromium.org/27047003/diff/704001/components/precache/core... > File components/precache/core/precache_url_table.h (right): > > https://codereview.chromium.org/27047003/diff/704001/components/precache/core... > components/precache/core/precache_url_table.h:46: const base::Time& > delete_end); > Do we really need to have these between functions? Seems like we could > just take an end time. > > https://codereview.chromium.org/27047003/ To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
I'm splitting this CL into multiple smaller CLs; this CL now only contains the metrics database for tracking precache statistics, and the changes to histograms.xml. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... File components/precache/core/precache_database.h (right): https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_database.h:49: // |is_precache| indicates whether or not the fetch is a precache, and On 2013/10/31 16:45:00, mmenke wrote: > nit: "is a precache" -> "is a precache request" Done. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_database.h:54: bool is_connection_cellular); On 2013/10/31 16:45:00, mmenke wrote: > We're assuming all requests made while the PrecacheManager is precaching files > is a precache attempt, which is not necessarily true. We should comment on that > assumption where we call this function. Sure. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... File components/precache/core/precache_statistics_table.h (right): https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_statistics_table.h:33: int64 precached_bytes; On 2013/10/31 16:45:00, mmenke wrote: > I think these names are unclear. I'm a fan of clear names, even if they are > painfully long, since they make code easier to follow. I think we should also > add comments. > > I suggest calling this downloaded_precache_motivated_bytes or > downloaded_precache_initiated_bytes, here and everywhere else it appears. Done. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_statistics_table.h:34: int64 downloaded_bytes; On 2013/10/31 16:45:00, mmenke wrote: > I suggest calling this downloaded_non_precache_bytes. Done. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_statistics_table.h:35: int64 downloaded_bytes_cellular; On 2013/10/31 16:45:00, mmenke wrote: > And this downloaded_non_precache_bytes_cellular. Done. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... File components/precache/core/precache_statistics_table_unittest.cc (right): https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_statistics_table_unittest.cc:9: #include "base/callback.h" On 2013/10/31 16:45:00, mmenke wrote: > Do we need bind or callback? Removed. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_statistics_table_unittest.cc:13: #include "sql/statement.h" On 2013/10/31 16:45:00, mmenke wrote: > Do we need sql/statement? Removed. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_statistics_table_unittest.cc:29: virtual void SetUp() { On 2013/10/31 16:45:00, mmenke wrote: > OVERRIDE (And include compiler_specific for it) Done. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_statistics_table_unittest.cc:55: base::Time(), base::Time::Max(), &actual_stats_map); On 2013/10/31 16:45:00, mmenke wrote: > You never check that GetAllStatsBetween doesn't just always return all stats in > the map. Added test. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_statistics_table_unittest.cc:61: const base::Time kBeforeTime = base::Time::UnixEpoch(); On 2013/10/31 16:45:00, mmenke wrote: > Using real times always makes me nervous. Suggest just using a constant time. > Why? Technically, there are days on which midnight + 24 hours is the same day, > due to leap seconds. Yes, the time code won't actually correctly handle that > case, so this test will pass, but I really don't want to think about that case > at all. Done. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... File components/precache/core/precache_url_table.cc (right): https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_url_table.cc:83: statement.BindInt64(1, delete_end.ToInternalValue()); On 2013/10/31 16:45:00, mmenke wrote: > Are internal values guaranteed to be consistent between chrome versions (On a > single platform, of course)? I don't see it changing, just want to make sure we > shouldn't be using something else guaranteed to be consistent. There aren't any consistency guarantees on internal values. I don't see any methods on Time that are more standard and guaranteed to be consistent, though, and other code seems to use the internal value as well (https://code.google.com/p/chromium/codesearch#chromium/src/chrome/browser/pre...). We could do something more complicated like (delete_begin - base::Time()).InMicroseconds(), which is guaranteed to be consistent, but it's probably not worth the additional complexity. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_url_table.cc:88: void PrecacheURLTable::GetAllDataForTesting(std::map<GURL, base::Time>* map) { On 2013/10/31 16:45:00, mmenke wrote: > I suggest clearing the map first, so we have a well defined behavior. Done. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_url_table.cc:104: } On 2013/10/31 16:45:00, mmenke wrote: > This doesn't need to be a member function. Suggest just moving it up to the > anonymous namespace. Done. https://codereview.chromium.org/27047003/diff/704001/components/precache/core... File components/precache/core/precache_url_table.h (right): https://codereview.chromium.org/27047003/diff/704001/components/precache/core... components/precache/core/precache_url_table.h:46: const base::Time& delete_end); On 2013/10/31 16:45:00, mmenke wrote: > Do we really need to have these between functions? Seems like we could just > take an end time. Done.
[+davidben], who has agreed to take over the review from me.
I'm actually not familiar with either our metrics or database code. But here're some stylistic things and a time zone question. https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... File components/precache/core/precache_database.cc (right): https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... components/precache/core/precache_database.cc:72: db_.reset(db.release()); Nit: You can write "db_ = db.Pass();" which is a slightly safer pattern. (Emulation of move semantics from C++11.) https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... components/precache/core/precache_database.cc:161: precache_url_table_->DeleteURL(url); I think there are cases where requests also just bypass the cache altogether and not touch it, such requests with an external If-Unmodified-Since. I imagine they're not common to care much about, and I don't know if the network stack even exposes enough information for you there, but just FYI. https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... components/precache/core/precache_database.cc:180: precache_url_table_->DeleteURL(url); You could probably move the IncreaseDailyStats and DeleteURL calls out of the two conditions and make it a little easier to follow: PrecacheStatisticsTable::PrecacheStatistics stats; if (...) { ... blah, compute appropriate update to stats. } else if (...) { ... } precache_statistics_table_->IncreaseDailyStats(fetch_time, stats); precache_url_table_->DeleteURL(url); https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... File components/precache/core/precache_database.h (right): https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... components/precache/core/precache_database.h:54: bool is_connection_cellular); What if you split this into RecordURLFetched and RecordURLPrecached or something like that? It looks like the handling for those two cases is completely different anyway. It'd let you cut down a bit on the pile of booleans here. Dunno what your intended caller looks like, but this seems to also match how your unit tests uses this anyway. https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... File components/precache/core/precache_database_unittest.cc (right): https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... components/precache/core/precache_database_unittest.cc:186: false /* is_connection_cellular */); Any reason not to use your various helper functions here too? https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... components/precache/core/precache_database_unittest.cc:366: RecordPrecacheFromNetwork(kURL1, fetch_time += kFetchPeriod, kSize1); Nit: I don't think we tend put += in the middle of an expression like that, do we? I'd suggest just pulling that out into lines above. https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... File components/precache/core/precache_statistics_table.cc (right): https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... components/precache/core/precache_statistics_table.cc:21: return time_on_day.LocalMidnight().ToInternalValue(); What happens if the user changes time zones? Seems you could end up with some strange results if you key dates by their local time midnight time. Not sure what would be a better mechanism. Could you store hourly in UTC and then aggregate after the statistics have been uploaded? I guess that wouldn't give you grouping by local days either. It does seem a little odd to me how you have data aggregated by day, but then UMA itself also can aggregate histograms and such. Particularly with the old days' data thing mmenke mentioned. But I'm not actually very familiar with our metrics. https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... File components/precache/core/precache_statistics_table.h (right): https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... components/precache/core/precache_statistics_table.h:54: int64 saved_bytes_cellular; Nit: I'd probably put some newlines after each of these fields, just so it's easier to see which comment goes with what.
Thanks for the comments, replies below. https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... File components/precache/core/precache_database.cc (right): https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... components/precache/core/precache_database.cc:72: db_.reset(db.release()); On 2013/11/06 23:50:53, David Benjamin wrote: > Nit: You can write "db_ = db.Pass();" which is a slightly safer pattern. > (Emulation of move semantics from C++11.) Done. https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... components/precache/core/precache_database.cc:161: precache_url_table_->DeleteURL(url); On 2013/11/06 23:50:53, David Benjamin wrote: > I think there are cases where requests also just bypass the cache altogether and > not touch it, such requests with an external If-Unmodified-Since. I imagine > they're not common to care much about, and I don't know if the network stack > even exposes enough information for you there, but just FYI. I'm not familiar with these cases; what does it mean for a request to have an "external" If-Unmodified-Since? From what I can see in the HTTP 1.1 spec, it looks like a 2xx response from these kinds of requests would still be stored in the cache, so I think this code would handle those kinds of cases correctly. We wouldn't call RecordURLFetched for unsuccessful responses. https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... components/precache/core/precache_database.cc:180: precache_url_table_->DeleteURL(url); On 2013/11/06 23:50:53, David Benjamin wrote: > You could probably move the IncreaseDailyStats and DeleteURL calls out of the > two conditions and make it a little easier to follow: > > PrecacheStatisticsTable::PrecacheStatistics stats; > if (...) { > ... blah, compute appropriate update to stats. > } else if (...) { > ... > } > precache_statistics_table_->IncreaseDailyStats(fetch_time, stats); > > precache_url_table_->DeleteURL(url); Done. https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... File components/precache/core/precache_database.h (right): https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... components/precache/core/precache_database.h:54: bool is_connection_cellular); On 2013/11/06 23:50:53, David Benjamin wrote: > What if you split this into RecordURLFetched and RecordURLPrecached or something > like that? It looks like the handling for those two cases is completely > different anyway. It'd let you cut down a bit on the pile of booleans here. > Dunno what your intended caller looks like, but this seems to also match how > your unit tests uses this anyway. Done. https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... File components/precache/core/precache_database_unittest.cc (right): https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... components/precache/core/precache_database_unittest.cc:186: false /* is_connection_cellular */); On 2013/11/06 23:50:53, David Benjamin wrote: > Any reason not to use your various helper functions here too? Changed to use helper functions. https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... components/precache/core/precache_database_unittest.cc:366: RecordPrecacheFromNetwork(kURL1, fetch_time += kFetchPeriod, kSize1); On 2013/11/06 23:50:53, David Benjamin wrote: > Nit: I don't think we tend put += in the middle of an expression like that, do > we? I'd suggest just pulling that out into lines above. Done. https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... File components/precache/core/precache_statistics_table.cc (right): https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... components/precache/core/precache_statistics_table.cc:21: return time_on_day.LocalMidnight().ToInternalValue(); On 2013/11/06 23:50:53, David Benjamin wrote: > What happens if the user changes time zones? Seems you could end up with some > strange results if you key dates by their local time midnight time. Not sure > what would be a better mechanism. Could you store hourly in UTC and then > aggregate after the statistics have been uploaded? I guess that wouldn't give > you grouping by local days either. > > It does seem a little odd to me how you have data aggregated by day, but then > UMA itself also can aggregate histograms and such. Particularly with the old > days' data thing mmenke mentioned. But I'm not actually very familiar with our > metrics. Good point about time zones. If we switch time zones, the keys for the day would change, and weird stuff would happen. I've changed it to not care about local times at all, and just track 24 hour intervals. The reason that we track daily stats, as opposed to just reporting UMA per request, is because we want to be able to measure the usefulness and overhead of precaching for the typical user. Reporting UMA per-request would show us the usefulness and overhead for the typical request, which we don't care about. Using the overall sums of these per-request stats wouldn't tell us anything about typical users either, since it would be biased toward outliers who made a lot of big requests. https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... File components/precache/core/precache_statistics_table.h (right): https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... components/precache/core/precache_statistics_table.h:54: int64 saved_bytes_cellular; On 2013/11/06 23:50:53, David Benjamin wrote: > Nit: I'd probably put some newlines after each of these fields, just so it's > easier to see which comment goes with what. Done.
There's one test that's still using local time (via base::Time::FromString). You probably want to change that now that you're bucketing by UTC. Otherwise, this LGTM. But, like I said, I don't actually know much about histograms, so you should probably also have jar@ or someone else take a look. https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... File components/precache/core/precache_database.cc (right): https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... components/precache/core/precache_database.cc:161: precache_url_table_->DeleteURL(url); On 2013/11/09 01:07:58, sclittle wrote: > On 2013/11/06 23:50:53, David Benjamin wrote: > > I think there are cases where requests also just bypass the cache altogether > and > > not touch it, such requests with an external If-Unmodified-Since. I imagine > > they're not common to care much about, and I don't know if the network stack > > even exposes enough information for you there, but just FYI. > > I'm not familiar with these cases; what does it mean for a request to have an > "external" If-Unmodified-Since? > > From what I can see in the HTTP 1.1 spec, it looks like a 2xx response from > these kinds of requests would still be stored in the cache, so I think this code > would handle those kinds of cases correctly. We wouldn't call RecordURLFetched > for unsuccessful responses. By external I mean the headers originated from outside the network stack somewhere. An XHR might have put it in there, or some plugin. The cache has several modes when that happens. Sometimes it will still interpret the results and write to cache but never read from it. Sometimes it will just bypass it altogether because it doesn't know how to handle that. I believe If-Unmodified-Since is one of those cases. So it's possible to not use the cache on a request and still have it available for later runs. (I don't think it's actually a big deal, and I don't know how you'd get at that information outside the network stack. I just thought I'd mention it.) https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... File components/precache/core/precache_statistics_table.cc (right): https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... components/precache/core/precache_statistics_table.cc:21: return time_on_day.LocalMidnight().ToInternalValue(); On 2013/11/09 01:07:58, sclittle wrote: > On 2013/11/06 23:50:53, David Benjamin wrote: > > What happens if the user changes time zones? Seems you could end up with some > > strange results if you key dates by their local time midnight time. Not sure > > what would be a better mechanism. Could you store hourly in UTC and then > > aggregate after the statistics have been uploaded? I guess that wouldn't give > > you grouping by local days either. > > > > It does seem a little odd to me how you have data aggregated by day, but then > > UMA itself also can aggregate histograms and such. Particularly with the old > > days' data thing mmenke mentioned. But I'm not actually very familiar with our > > metrics. > > Good point about time zones. If we switch time zones, the keys for the day would > change, and weird stuff would happen. I've changed it to not care about local > times at all, and just track 24 hour intervals. > > The reason that we track daily stats, as opposed to just reporting UMA per > request, is because we want to be able to measure the usefulness and overhead of > precaching for the typical user. Reporting UMA per-request would show us the > usefulness and overhead for the typical request, which we don't care about. > Using the overall sums of these per-request stats wouldn't tell us anything > about typical users either, since it would be biased toward outliers who made a > lot of big requests. Ah, so if you rely on UMA's aggregation, you'll weight a user's contributions by the number of requests they make. Makes sense. (I'm not very familiar with you can/can't do with UMA.) https://codereview.chromium.org/27047003/diff/1234001/components/precache/cor... File components/precache/core/precache_url_table_unittest.cc (right): https://codereview.chromium.org/27047003/diff/1234001/components/precache/cor... components/precache/core/precache_url_table_unittest.cc:21: EXPECT_TRUE(base::Time::FromString(time_string, &time)); Now that time zones are no longer relevant for keying time, you probably want to be using UTC instead of the local time.
https://codereview.chromium.org/27047003/diff/1234001/tools/metrics/histogram... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/27047003/diff/1234001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:13471: + The daily number of kilobytes that were downloaded over any network for Clarify what "daily" means in the description. When is it actually logged? (e.g. when is the check whether this should be logged performed - and does it cover multiple sessions). https://codereview.chromium.org/27047003/diff/1234001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:13476: +<histogram name="Precache.DailyDownloadedNonPrecacheKB.Cellular" units="KB"> You can make a new fieldtrial entry to handle all the ".Cellular" versions of the histograms (then you don't need to have entries for the .Cellular version of the histograms): <fieldtrial name="PrecacheCellular" separator="."> <group name="Cellular" label="covers only data downloaded over cellular networks"/> <affected-histogram name="Precache.DailyDownloadedNonPrecacheKB"/> <affected-histogram name="Precache.DailySavedKB"/> <affected-histogram name="Precache.DailySavingsPercentage"/> </fieldtrial> (The fieldtrial tag name is a misnomer, since field trials no longer use this mechanism, but it lets you created suffixed versions of histograms.)
https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... File components/precache/core/precache_database.cc (right): https://codereview.chromium.org/27047003/diff/1084001/components/precache/cor... components/precache/core/precache_database.cc:161: precache_url_table_->DeleteURL(url); On 2013/11/11 19:06:26, David Benjamin wrote: > On 2013/11/09 01:07:58, sclittle wrote: > > On 2013/11/06 23:50:53, David Benjamin wrote: > > > I think there are cases where requests also just bypass the cache altogether > > and > > > not touch it, such requests with an external If-Unmodified-Since. I imagine > > > they're not common to care much about, and I don't know if the network stack > > > even exposes enough information for you there, but just FYI. > > > > I'm not familiar with these cases; what does it mean for a request to have an > > "external" If-Unmodified-Since? > > > > From what I can see in the HTTP 1.1 spec, it looks like a 2xx response from > > these kinds of requests would still be stored in the cache, so I think this > code > > would handle those kinds of cases correctly. We wouldn't call RecordURLFetched > > for unsuccessful responses. > > By external I mean the headers originated from outside the network stack > somewhere. An XHR might have put it in there, or some plugin. The cache has > several modes when that happens. Sometimes it will still interpret the results > and write to cache but never read from it. Sometimes it will just bypass it > altogether because it doesn't know how to handle that. I believe > If-Unmodified-Since is one of those cases. So it's possible to not use the cache > on a request and still have it available for later runs. > > (I don't think it's actually a big deal, and I don't know how you'd get at that > information outside the network stack. I just thought I'd mention it.) Interesting, thanks for the info. https://codereview.chromium.org/27047003/diff/1234001/components/precache/cor... File components/precache/core/precache_url_table_unittest.cc (right): https://codereview.chromium.org/27047003/diff/1234001/components/precache/cor... components/precache/core/precache_url_table_unittest.cc:21: EXPECT_TRUE(base::Time::FromString(time_string, &time)); On 2013/11/11 19:06:26, David Benjamin wrote: > Now that time zones are no longer relevant for keying time, you probably want to > be using UTC instead of the local time. Done. https://codereview.chromium.org/27047003/diff/1234001/tools/metrics/histogram... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/27047003/diff/1234001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:13471: + The daily number of kilobytes that were downloaded over any network for On 2013/11/11 20:04:11, Alexei Svitkine wrote: > Clarify what "daily" means in the description. When is it actually logged? (e.g. > when is the check whether this should be logged performed - and does it cover > multiple sessions). Done. https://codereview.chromium.org/27047003/diff/1234001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:13476: +<histogram name="Precache.DailyDownloadedNonPrecacheKB.Cellular" units="KB"> On 2013/11/11 20:04:11, Alexei Svitkine wrote: > You can make a new fieldtrial entry to handle all the ".Cellular" versions of > the histograms (then you don't need to have entries for the .Cellular version > of the histograms): > > <fieldtrial name="PrecacheCellular" separator="."> > <group name="Cellular" label="covers only data downloaded over cellular > networks"/> > <affected-histogram name="Precache.DailyDownloadedNonPrecacheKB"/> > <affected-histogram name="Precache.DailySavedKB"/> > <affected-histogram name="Precache.DailySavingsPercentage"/> > </fieldtrial> > > (The fieldtrial tag name is a misnomer, since field trials no longer use this > mechanism, but it lets you created suffixed versions of histograms.) Done.
+shess: components/precache/core/DEPS, added "+sql" Thanks!
As a general comment, as I've crawled through Chromium code fixing SQLite mis-uses, I have come to associate the pattern of separate database and table objects with code smell. If you ever modify the schema, you end up with all sorts of accessors to let the different objects poke each other, and sometimes transactional integrity is lost. I haven't modelled your code well enough to say for sure, but it seems possible that half of the footprint of the table code would simply disappear if it were integrated into the database object. WRT my transaction comments, there are two general rationales for using transactions. First is the obvious assertion that "These operations should be atomic". Second is a performance issue. If you do not use an explicit transaction, then every standalone SQLite statement becomes a transaction on its own. For update statements, this means an fsync(). As a rule of thumb, I would just put a sql::Transaction around every bundle of code which results in more than one SQLite statement where at least one of them is an update of some sort. ["update" in this case includes anything which updates the database contents, not just literal "UPDATE" statements.] https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... File components/precache/core/precache_database.cc (right): https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_database.cc:67: void PrecacheDatabase::Init(scoped_ptr<sql::Connection> db) { I'm not seeing where this database is coming from originally? https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_database.cc:71: sql::Connection has a set_histogram_tag() which interfaces with "SqliteDatabases" fieldtrial in histograms.xml to provide error and size tracking. This would be a reasonable place to set that up. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_database.cc:80: precache_statistics_table_->Init(db_.get()); These should probably be in a transaction. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_database.cc:85: DCHECK(db_); These are implicit in IsDatabaseAccessible(). https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_database.cc:91: You should probably start a transaction here to encompass the following changes. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_database.cc:116: DCHECK(db_); These are implicit in IsDatabaseAccessible(). https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_database.cc:130: Probably should start a transaction here. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_database.cc:150: DCHECK(db_); These are implicit in IsDatabaseAccessible(). https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_database.cc:156: Probably should open a transaction here. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... File components/precache/core/precache_database.h (right): https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_database.h:32: base::NonThreadSafe { Why are you subclassing base::NonThreadSafe rather than using a base::ThreadChecker member? The documentation seems to prefer the ThreadChecker member unless there's a reason for subclassing. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... File components/precache/core/precache_statistics_table.cc (right): https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_statistics_table.cc:16: const char kStatisticsTableName[] = "precache_statistics"; I generally discourage this kind of coding. It sometimes gradually turns into a dynamic table-swizzling system which is impossible to follow when you need to figure out why a broken schema is breaking the world. The only advantage I can see is that it makes the table name a constant - but you're already dependent on all of the column names being right, so it really doesn't gain anything (do not suggest building a general mapping layer, please!). I think you're better off inlining the table names and making sure you have good test coverage. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_statistics_table.cc:157: if (db_->DoesTableExist(kStatisticsTableName)) If you're committed to this approach, use "CREATE TABLE IF NOT EXISTS", rather than probing and then creating as separate items. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_statistics_table.cc:167: "PRIMARY KEY(date))", prefer "date INTEGER PRIMARY KEY" rather than specifying the primary key separately. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... File components/precache/core/precache_url_table.cc (right): https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_url_table.cc:18: const char kUrlTableName[] = "precache_urls"; Same comment as for stats table. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_url_table.cc:42: DCHECK(db_); I think all of these DCHECK(db_) can be gotten rid of. Every method immediately dereferences db_, so it will only provide a trivial improvement to diagnostics. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_url_table.cc:105: if (db_->DoesTableExist(kUrlTableName)) Same comment as for stats table. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_url_table.cc:109: "CREATE TABLE %s (url TEXT, time INTEGER, PRIMARY KEY(url))", "url TEXT PRIMARY KEY" should work fine here. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_url_table.cc:111: DLOG(WARNING) << "Could not create precache URL table in database."; Note that the sql/ library is going to complain loudly if this fails in debug mode.
Also - are you planning for this to stick around for the long term? It seems like something which might only exist for a milestone or two. Did you have a specific plan for removing the stale databases when done?
Thanks for the comments, replies are below. @shess: Regarding your comment about combining the database and table objects: I could be missing something, but I'm not convinced that combining the classes would be an improvement. Currently, the tables don't touch each other at all, store very conceptually different data, and all interaction with them is handled by the database object. They could easily be in different databases, if not for the additional overhead of doing that. If the schema of the tables change to the point where the different tables need to poke each other, then a more significant refactoring is likely necessary. Also, combining the classes basically just involves moving all the methods from the table classes and putting them in the database class, and maybe inlining a few of the queries, so I don't think the code footprint would be significantly smaller. Regarding the lifetime of the database, if we decided in a later milestone that we don't want to track stats in the database anymore, then we could just delete the database file if it exists when Chrome starts precaching resources. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... File components/precache/core/precache_database.cc (right): https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_database.cc:67: void PrecacheDatabase::Init(scoped_ptr<sql::Connection> db) { On 2013/11/19 18:47:48, shess wrote: > I'm not seeing where this database is coming from originally? There will be a PrecacheManager BrowserContextKeyedService that will pass in the database as a scoped pointer. The reason we don't create the database here is because we want the database file to be stored on the BrowserContext's path, which only the PrecacheManager will have access to. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_database.cc:71: On 2013/11/19 18:47:48, shess wrote: > sql::Connection has a set_histogram_tag() which interfaces with > "SqliteDatabases" fieldtrial in histograms.xml to provide error and size > tracking. This would be a reasonable place to set that up. Done. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_database.cc:80: precache_statistics_table_->Init(db_.get()); On 2013/11/19 18:47:48, shess wrote: > These should probably be in a transaction. Done. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_database.cc:85: DCHECK(db_); On 2013/11/19 18:47:48, shess wrote: > These are implicit in IsDatabaseAccessible(). Done. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_database.cc:91: On 2013/11/19 18:47:48, shess wrote: > You should probably start a transaction here to encompass the following changes. Done. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_database.cc:116: DCHECK(db_); On 2013/11/19 18:47:48, shess wrote: > These are implicit in IsDatabaseAccessible(). Done. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_database.cc:130: On 2013/11/19 18:47:48, shess wrote: > Probably should start a transaction here. Done. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_database.cc:150: DCHECK(db_); On 2013/11/19 18:47:48, shess wrote: > These are implicit in IsDatabaseAccessible(). Done. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_database.cc:156: On 2013/11/19 18:47:48, shess wrote: > Probably should open a transaction here. Done. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... File components/precache/core/precache_database.h (right): https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_database.h:32: base::NonThreadSafe { On 2013/11/19 18:47:48, shess wrote: > Why are you subclassing base::NonThreadSafe rather than using a > base::ThreadChecker member? The documentation seems to prefer the ThreadChecker > member unless there's a reason for subclassing. Changed to ThreadChecker. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... File components/precache/core/precache_statistics_table.cc (right): https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_statistics_table.cc:16: const char kStatisticsTableName[] = "precache_statistics"; On 2013/11/19 18:47:48, shess wrote: > I generally discourage this kind of coding. It sometimes gradually turns into a > dynamic table-swizzling system which is impossible to follow when you need to > figure out why a broken schema is breaking the world. The only advantage I can > see is that it makes the table name a constant - but you're already dependent on > all of the column names being right, so it really doesn't gain anything (do not > suggest building a general mapping layer, please!). I think you're better off > inlining the table names and making sure you have good test coverage. Done. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_statistics_table.cc:157: if (db_->DoesTableExist(kStatisticsTableName)) On 2013/11/19 18:47:48, shess wrote: > If you're committed to this approach, use "CREATE TABLE IF NOT EXISTS", rather > than probing and then creating as separate items. Done. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_statistics_table.cc:167: "PRIMARY KEY(date))", On 2013/11/19 18:47:48, shess wrote: > prefer "date INTEGER PRIMARY KEY" rather than specifying the primary key > separately. Done. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... File components/precache/core/precache_url_table.cc (right): https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_url_table.cc:18: const char kUrlTableName[] = "precache_urls"; On 2013/11/19 18:47:48, shess wrote: > Same comment as for stats table. Done. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_url_table.cc:42: DCHECK(db_); On 2013/11/19 18:47:48, shess wrote: > I think all of these DCHECK(db_) can be gotten rid of. Every method immediately > dereferences db_, so it will only provide a trivial improvement to diagnostics. Done. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_url_table.cc:105: if (db_->DoesTableExist(kUrlTableName)) On 2013/11/19 18:47:48, shess wrote: > Same comment as for stats table. Done. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_url_table.cc:109: "CREATE TABLE %s (url TEXT, time INTEGER, PRIMARY KEY(url))", On 2013/11/19 18:47:48, shess wrote: > "url TEXT PRIMARY KEY" should work fine here. Done. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_url_table.cc:111: DLOG(WARNING) << "Could not create precache URL table in database."; On 2013/11/19 18:47:48, shess wrote: > Note that the sql/ library is going to complain loudly if this fails in debug > mode. Good point; if this fails in debug mode, Execute() would DLOG(FATAL), so this warning would never be printed.
Ping? My internship ends in a few weeks, and I have a bunch of CLs to get in that depend on this one.
IMO, the histograms that you're gathering (at least based on the description) will be hard to make sense of. See comments below. https://codereview.chromium.org/27047003/diff/1504001/tools/metrics/histogram... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/27047003/diff/1504001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:13721: + over 24 hour intervals, and can cover multiple sessions. Values of this FYI: This aggregation over 24 hours will likely conflate the length of time that a user runs (without restart), with the data I think you really wanted. I suspect you wanted to measure the size of each resource separately, and then aggregate the histograms. That would remove the conflation I mentioned above.
https://codereview.chromium.org/27047003/diff/1504001/tools/metrics/histogram... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/27047003/diff/1504001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:13721: + over 24 hour intervals, and can cover multiple sessions. Values of this On 2013/11/26 20:52:53, jar wrote: > FYI: This aggregation over 24 hours will likely conflate the length of time that > a user runs (without restart), with the data I think you really wanted. > > I suspect you wanted to measure the size of each resource separately, and then > aggregate the histograms. That would remove the conflation I mentioned above. I've updated the descriptions to hopefully help clarify that this is per-profile. We want to measure the usefulness and overhead of precaching for the typical user. If we measure the savings and overhead of each resource separately, this would show us the savings and overhead for the typical resource, which we don't care about. Aggregating per-resource histograms also wouldn't tell us much about typical users either, since those sums would be biased toward outliers who fetched a lot of big resources. By aggregating metrics over 24 hours for each profile, we can measure things like the median user's savings percentage.
On 2013/11/26 22:22:28, sclittle wrote: > https://codereview.chromium.org/27047003/diff/1504001/tools/metrics/histogram... > File tools/metrics/histograms/histograms.xml (right): > > https://codereview.chromium.org/27047003/diff/1504001/tools/metrics/histogram... > tools/metrics/histograms/histograms.xml:13721: + over 24 hour intervals, and > can cover multiple sessions. Values of this > On 2013/11/26 20:52:53, jar wrote: > > FYI: This aggregation over 24 hours will likely conflate the length of time > that > > a user runs (without restart), with the data I think you really wanted. > > > > I suspect you wanted to measure the size of each resource separately, and then > > aggregate the histograms. That would remove the conflation I mentioned above. > > I've updated the descriptions to hopefully help clarify that this is > per-profile. > > We want to measure the usefulness and overhead of precaching for the typical > user. If we measure the savings and overhead of each resource separately, this > would show us the savings and overhead for the typical resource, which we don't > care about. Aggregating per-resource histograms also wouldn't tell us much about > typical users either, since those sums would be biased toward outliers who > fetched a lot of big resources. By aggregating metrics over 24 hours for each > profile, we can measure things like the median user's savings percentage. Consider this sample set: User A uses chrome for 24 hours straight every Saturday. User B uses Chrome every day of the week for 7 days, running for 5 minutes per day. What will your histogram show when you sample (for 24 hour periods) how much savings there were? It would appear (based on comments) that your histogram will have 8 distinct samples, if I understand your prose. IMO, the histogram will show 1 sample, with something interesting for user A, and will have 7 tiny samples for the 7 different 24 hour intervals that B ran. How will you deduce anything interesting from that about "median user's savings?" IMO, your histogram will tell you more about surfing usage in 24 hour periods, than about precaching use.
Sorry about the long delay. Feel free to ping earlier, either I'll be like "Oh, crap, I should do that review" or I'll explicitly ignore it, but your ping won't be the email that sends me over the edge. On 2013/11/19 23:27:09, sclittle wrote: > Thanks for the comments, replies are below. > > @shess: Regarding your comment about combining the database and table objects: I > could be missing something, but I'm not convinced that combining the classes > would be an improvement. Currently, the tables don't touch each other at all, > store very conceptually different data, and all interaction with them is handled > by the database object. They could easily be in different databases, if not for > the additional overhead of doing that. If the schema of the tables change to the > point where the different tables need to poke each other, then a more > significant refactoring is likely necessary. Also, combining the classes > basically just involves moving all the methods from the table classes and > putting them in the database class, and maybe inlining a few of the queries, so > I don't think the code footprint would be significantly smaller. Yeah, I haven't been able to develop a principled way to phrase things, and I don't really have any formal database background. Mostly what I've found is that splitting the schema and statements across many files makes it harder to reason about the state of the system, and things which are stored persistently on disk tend to require reasoning of that sort more often than average code (average code you rewrite it and everything changes, but when dealing with persistent data that has evolved over time, you have to integrate history while reviewing). But if you'd rather have it this way, that's fine. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... File components/precache/core/precache_database.cc (right): https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_database.cc:67: void PrecacheDatabase::Init(scoped_ptr<sql::Connection> db) { On 2013/11/19 23:27:10, sclittle wrote: > On 2013/11/19 18:47:48, shess wrote: > > I'm not seeing where this database is coming from originally? > > There will be a PrecacheManager BrowserContextKeyedService that will pass in the > database as a scoped pointer. The reason we don't create the database here is > because we want the database file to be stored on the BrowserContext's path, > which only the PrecacheManager will have access to. But why does the caller create the database, as opposed to passing in the path? https://codereview.chromium.org/27047003/diff/1544001/components/precache/cor... File components/precache/core/precache_database.cc (right): https://codereview.chromium.org/27047003/diff/1544001/components/precache/cor... components/precache/core/precache_database.cc:19: const int64 kPrecacheHistoryExpiryPeriodDays = 60; Just |int| unless being 64-bit is somehow relevant. https://codereview.chromium.org/27047003/diff/1544001/components/precache/cor... components/precache/core/precache_database.cc:65: thread_checker_.DetachFromThread(); Note that sql::Connection doesn't make thread-safety guarantees. That does not mean that everything will bust if you destruct db_ on a different thread than it was created on, just that nobody has your back if you do so. I don't think it will cause problems in here, though. https://codereview.chromium.org/27047003/diff/1544001/components/precache/cor... components/precache/core/precache_database.cc:95: db_->RazeAndClose(); Note that RazeAndClose() is a bit of a misnomer - it was originally intended to be used from a nested context where sql::Connection code was on the stack, possibly transactions open, so neither Raze() nor Close() were appropriate. After calling this, db_ is still a sql::Connection*, but all calls to it will fail. I don't think it's necessarily inappropriate, is_open() should return false after this, so I think it works. Just wanted to make sure the subtleties are understood. https://codereview.chromium.org/27047003/diff/1544001/components/precache/cor... File components/precache/core/precache_database.h (right): https://codereview.chromium.org/27047003/diff/1544001/components/precache/cor... components/precache/core/precache_database.h:57: bool is_connection_cellular); While the |size| storage is 64-bit signed integer for each of these methods, is that the appropriate API to expose? size_t might be better. I don't feel strongly about this. https://codereview.chromium.org/27047003/diff/1544001/components/precache/cor... File components/precache/core/precache_statistics_table.cc (right): https://codereview.chromium.org/27047003/diff/1544001/components/precache/cor... components/precache/core/precache_statistics_table.cc:149: "saved_bytes_cellular INTEGER DEFAULT 0)")); If this fails, the caller should really fail Init(). https://codereview.chromium.org/27047003/diff/1544001/components/precache/cor... File components/precache/core/precache_url_table.cc (right): https://codereview.chromium.org/27047003/diff/1544001/components/precache/cor... components/precache/core/precache_url_table.cc:88: "INTEGER)")); If this fails, the caller should really fail init.
On 2013/11/26 23:40:54, jar wrote: > On 2013/11/26 22:22:28, sclittle wrote: > > > https://codereview.chromium.org/27047003/diff/1504001/tools/metrics/histogram... > > File tools/metrics/histograms/histograms.xml (right): > > > > > https://codereview.chromium.org/27047003/diff/1504001/tools/metrics/histogram... > > tools/metrics/histograms/histograms.xml:13721: + over 24 hour intervals, > and > > can cover multiple sessions. Values of this > > On 2013/11/26 20:52:53, jar wrote: > > > FYI: This aggregation over 24 hours will likely conflate the length of time > > that > > > a user runs (without restart), with the data I think you really wanted. > > > > > > I suspect you wanted to measure the size of each resource separately, and > then > > > aggregate the histograms. That would remove the conflation I mentioned > above. > > > > I've updated the descriptions to hopefully help clarify that this is > > per-profile. > > > > We want to measure the usefulness and overhead of precaching for the typical > > user. If we measure the savings and overhead of each resource separately, this > > would show us the savings and overhead for the typical resource, which we > don't > > care about. Aggregating per-resource histograms also wouldn't tell us much > about > > typical users either, since those sums would be biased toward outliers who > > fetched a lot of big resources. By aggregating metrics over 24 hours for each > > profile, we can measure things like the median user's savings percentage. > > Consider this sample set: > > User A uses chrome for 24 hours straight every Saturday. > > User B uses Chrome every day of the week for 7 days, running for 5 minutes per > day. > > What will your histogram show when you sample (for 24 hour periods) how much > savings there were? > > It would appear (based on comments) that your histogram will have 8 distinct > samples, if I understand your prose. > > IMO, the histogram will show 1 sample, with something interesting for user A, > and will have 7 tiny samples for the 7 different 24 hour intervals that B ran. > > How will you deduce anything interesting from that about "median user's > savings?" > > IMO, your histogram will tell you more about surfing usage in 24 hour periods, > than about precaching use. @jar: You're right, sorry. The histogram would tell us the savings for users for a typical day, not for a typical user. We could deduce the median user's *daily* savings by counting the total number of users who use precaching, and adding extra 0's where appropriate for days where we have less samples, but it's not ideal. Another alternative would be to also report metrics for days when nothing happened, instead of skipping them, but that is still not ideal, since all those zeros for the empty days would be reported at the same time once Chrome is used again. Do you know if there's a way to get the median user's savings from measuring the size of each resource separately?
> Do you know if there's a way to get the median user's savings from measuring the > size of each resource separately? The "median savings per user" is also a pretty odd metric. My guess is that on any given day, less than half of our users use chrome... so the median user would have "0 savings." I really think you want to ask about savings per resource fetched, and not attempt to normalize based on users (let alone looking at a median user). You then might also ask how many resources each user fetches (on average?) in an hour of surfing, or such, but I doubt that any median-in-a-day metric will give you data that is significant to your quest.
On 2013/11/28 01:00:45, jar wrote: > > Do you know if there's a way to get the median user's savings from measuring > the > > size of each resource separately? > > The "median savings per user" is also a pretty odd metric. My guess is that on > any given day, less than half of our users use chrome... so the median user > would have "0 savings." > > I really think you want to ask about savings per resource fetched, and not > attempt to normalize based on users (let alone looking at a median user). You > then might also ask how many resources each user fetches (on average?) in an > hour of surfing, or such, but I doubt that any median-in-a-day metric will give > you data that is significant to your quest. Good point, I'll modify it to track stats per resource fetched.
Thanks for the comments, replies are below. I've changed it to report metrics on a per-resource basis. https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... File components/precache/core/precache_database.cc (right): https://codereview.chromium.org/27047003/diff/1404001/components/precache/cor... components/precache/core/precache_database.cc:67: void PrecacheDatabase::Init(scoped_ptr<sql::Connection> db) { On 2013/11/27 01:32:33, shess wrote: > On 2013/11/19 23:27:10, sclittle wrote: > > On 2013/11/19 18:47:48, shess wrote: > > > I'm not seeing where this database is coming from originally? > > > > There will be a PrecacheManager BrowserContextKeyedService that will pass in > the > > database as a scoped pointer. The reason we don't create the database here is > > because we want the database file to be stored on the BrowserContext's path, > > which only the PrecacheManager will have access to. > > But why does the caller create the database, as opposed to passing in the path? Good point. Changed this to take in a path instead. https://codereview.chromium.org/27047003/diff/1544001/components/precache/cor... File components/precache/core/precache_database.cc (right): https://codereview.chromium.org/27047003/diff/1544001/components/precache/cor... components/precache/core/precache_database.cc:19: const int64 kPrecacheHistoryExpiryPeriodDays = 60; On 2013/11/27 01:32:33, shess wrote: > Just |int| unless being 64-bit is somehow relevant. Done. https://codereview.chromium.org/27047003/diff/1544001/components/precache/cor... components/precache/core/precache_database.cc:65: thread_checker_.DetachFromThread(); On 2013/11/27 01:32:33, shess wrote: > Note that sql::Connection doesn't make thread-safety guarantees. That does not > mean that everything will bust if you destruct db_ on a different thread than it > was created on, just that nobody has your back if you do so. I don't think it > will cause problems in here, though. Ok. https://codereview.chromium.org/27047003/diff/1544001/components/precache/cor... components/precache/core/precache_database.cc:95: db_->RazeAndClose(); On 2013/11/27 01:32:33, shess wrote: > Note that RazeAndClose() is a bit of a misnomer - it was originally intended to > be used from a nested context where sql::Connection code was on the stack, > possibly transactions open, so neither Raze() nor Close() were appropriate. > After calling this, db_ is still a sql::Connection*, but all calls to it will > fail. > > I don't think it's necessarily inappropriate, is_open() should return false > after this, so I think it works. Just wanted to make sure the subtleties are > understood. Ok https://codereview.chromium.org/27047003/diff/1544001/components/precache/cor... File components/precache/core/precache_database.h (right): https://codereview.chromium.org/27047003/diff/1544001/components/precache/cor... components/precache/core/precache_database.h:57: bool is_connection_cellular); On 2013/11/27 01:32:33, shess wrote: > While the |size| storage is 64-bit signed integer for each of these methods, is > that the appropriate API to expose? size_t might be better. I don't feel > strongly about this. The value for |size| would originally come from URLRequest.received_response_content_length(), which is an int64. https://codereview.chromium.org/27047003/diff/1544001/components/precache/cor... File components/precache/core/precache_statistics_table.cc (right): https://codereview.chromium.org/27047003/diff/1544001/components/precache/cor... components/precache/core/precache_statistics_table.cc:149: "saved_bytes_cellular INTEGER DEFAULT 0)")); On 2013/11/27 01:32:33, shess wrote: > If this fails, the caller should really fail Init(). Removed the statistics table. https://codereview.chromium.org/27047003/diff/1544001/components/precache/cor... File components/precache/core/precache_url_table.cc (right): https://codereview.chromium.org/27047003/diff/1544001/components/precache/cor... components/precache/core/precache_url_table.cc:88: "INTEGER)")); On 2013/11/27 01:32:33, shess wrote: > If this fails, the caller should really fail init. Done.
LGTM, the only real change I want is stringprintf.h. The others are just suggestions. If you move forward with the suggestions, feel free to ping me again, I'll try to turn things around quickly. https://codereview.chromium.org/27047003/diff/1564001/components/precache/cor... File components/precache/core/precache_database.cc (right): https://codereview.chromium.org/27047003/diff/1564001/components/precache/cor... components/precache/core/precache_database.cc:54: transaction.Commit(); This doesn't make quite as much sense when you only have a single table involved, since now there aren't multiple automatic transactions. If it feels clunky, feel free to just call Init() and let it handle any necessary transaction internally. https://codereview.chromium.org/27047003/diff/1564001/components/precache/cor... File components/precache/core/precache_database_unittest.cc (right): https://codereview.chromium.org/27047003/diff/1564001/components/precache/cor... components/precache/core/precache_database_unittest.cc:90: precache_database_->precache_url_table_->GetAllDataForTesting( precache_url_table()-> seems reasonable for this. https://codereview.chromium.org/27047003/diff/1564001/components/precache/cor... components/precache/core/precache_database_unittest.cc:179: false /* is_connection_cellular */); I was going to comment on this, but then decided not to, but here you're deciding me ... is it possible that you want enums for these, so that you can say WAS_CACHED and IS_CONNECTION_CELLULAR rather than bools which are so easy to confuse that you find yourself commenting them? It would prevent accidental mis-ordering between those two bools, and also against |size|. https://codereview.chromium.org/27047003/diff/1564001/components/precache/cor... File components/precache/core/precache_url_table.cc (right): https://codereview.chromium.org/27047003/diff/1564001/components/precache/cor... components/precache/core/precache_url_table.cc:10: #include "base/strings/stringprintf.h" I don't think this is needed any longer.
jar or asvitkine, PTAL https://codereview.chromium.org/27047003/diff/1564001/components/precache/cor... File components/precache/core/precache_database.cc (right): https://codereview.chromium.org/27047003/diff/1564001/components/precache/cor... components/precache/core/precache_database.cc:54: transaction.Commit(); On 2013/12/03 20:55:38, shess wrote: > This doesn't make quite as much sense when you only have a single table > involved, since now there aren't multiple automatic transactions. If it feels > clunky, feel free to just call Init() and let it handle any necessary > transaction internally. Done. https://codereview.chromium.org/27047003/diff/1564001/components/precache/cor... File components/precache/core/precache_database_unittest.cc (right): https://codereview.chromium.org/27047003/diff/1564001/components/precache/cor... components/precache/core/precache_database_unittest.cc:90: precache_database_->precache_url_table_->GetAllDataForTesting( On 2013/12/03 20:55:38, shess wrote: > precache_url_table()-> seems reasonable for this. Done. https://codereview.chromium.org/27047003/diff/1564001/components/precache/cor... components/precache/core/precache_database_unittest.cc:179: false /* is_connection_cellular */); On 2013/12/03 20:55:38, shess wrote: > I was going to comment on this, but then decided not to, but here you're > deciding me ... is it possible that you want enums for these, so that you can > say WAS_CACHED and IS_CONNECTION_CELLULAR rather than bools which are so easy to > confuse that you find yourself commenting them? It would prevent accidental > mis-ordering between those two bools, and also against |size|. You have a point, but I think I'll just keep them as bools for now. https://codereview.chromium.org/27047003/diff/1564001/components/precache/cor... File components/precache/core/precache_url_table.cc (right): https://codereview.chromium.org/27047003/diff/1564001/components/precache/cor... components/precache/core/precache_url_table.cc:10: #include "base/strings/stringprintf.h" On 2013/12/03 20:55:38, shess wrote: > I don't think this is needed any longer. Done.
changes lgtm
histograms lgtm % comment https://codereview.chromium.org/27047003/diff/1584001/tools/metrics/histogram... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/27047003/diff/1584001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:13880: + fetches that were not motivated by precaching. Nit: Mention when this is logged. It's per-request, right? Please mention that. Same for the others.
https://codereview.chromium.org/27047003/diff/1584001/tools/metrics/histogram... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/27047003/diff/1584001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:13880: + fetches that were not motivated by precaching. On 2013/12/04 19:11:50, Alexei Svitkine wrote: > Nit: Mention when this is logged. It's per-request, right? Please mention that. > Same for the others. Done.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/sclittle@chromium.org/27047003/1604001
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/sclittle@chromium.org/27047003/1624001
Retried try job too often on ios_dbg_simulator for step(s) base_unittests, compile, content_unittests, crypto_unittests, net_unittests, sql_unittests, ui_unittests, url_unittests http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=ios_dbg_si...
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/sclittle@chromium.org/27047003/1644001
Message was sent while issue was closed.
Change committed as 238824 |
