|
|
DescriptionAdd History.TopHostsVisitsByRank UMA metric.
Created in the style of TopSitesVisitsByRank
(https://codereview.chromium.org/16517002/), but instead of populating
the top hosts map at Init(), where it is potentially user-impacting
(blocking other history-needing features, such as autocomplete, from
running), it is lazily populated by TopHosts(). Callers of that method
should not expect any additional delay due to this addition.
Practically speaking, this metric will only report for members of the
Precache field trial, as that is the only client of TopHosts, currently.
This will only provide information on Android devices, but otherwise is not
expected to bias the metric significantly.
BUG=499532
Committed: https://crrev.com/cffae42a2d58588ca6cca5f40351b168fcdc10b0
Cr-Commit-Position: refs/heads/master@{#337185}
Patch Set 1 : Add History.TopHostsVisitsByRank and extract HostForTopHosts util function. #
Total comments: 6
Patch Set 2 : Edits per mpearson. #
Total comments: 24
Patch Set 3 : Fix RecordTopHostsMetrics performance problem and add small test. #
Total comments: 2
Patch Set 4 : Add unittest and fix off-by-one error introduced in patch set 3. #Patch Set 5 : Use HistogramTester to eliminate test brittleness. #Patch Set 6 : Make History.TopHostsVisitsByRank 0-based. #Patch Set 7 : Make History.TopHostsVisitsByRank 1-based again. #
Total comments: 8
Patch Set 8 : Case folding and doc tweaks per sky. #
Total comments: 1
Patch Set 9 : Rebase. #Patch Set 10 : Fix unittest to be 1-based again. #Messages
Total messages: 46 (14 generated)
twifkak@chromium.org changed reviewers: + bengr@chromium.org, jwd@chromium.org, mpearson@chromium.org, sdefresne@chromium.org, sky@chromium.org
mpearson@/jwd@ for tools/metrics/ sky@/sdefresne@ for components/history/
twifkak@chromium.org changed reviewers: + brettw@chromium.org
Adding brettw for components/history/.
In the future, I suggest picking reviewers, not picking multiple people to review the same code (unless you want two sets of eyes--do you?). --mark https://codereview.chromium.org/1179953005/diff/1/tools/metrics/histograms/hi... File tools/metrics/histograms/histograms.xml (left): https://codereview.chromium.org/1179953005/diff/1/tools/metrics/histograms/hi... tools/metrics/histograms/histograms.xml:12926: -<histogram name="History.TopSitesVisitsByRank" units="rank"> Do not delete old histograms. https://codereview.chromium.org/1179953005/diff/1/tools/metrics/histograms/hi... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/1179953005/diff/1/tools/metrics/histograms/hi... tools/metrics/histograms/histograms.xml:12901: + Page visits to each of a user's top 50 hosts. Visits to all other hosts go What happens if hosts get reordered? I visit host #5 several times and then it becomes host #4? https://codereview.chromium.org/1179953005/diff/1/tools/metrics/histograms/hi... tools/metrics/histograms/histograms.xml:12903: + browsing and only count it once when cycling through a redirect chain. "if it came from user browsing" as opposed to what?
Also, from the changelist description: "Practically speaking, this metric will only report for members of the Precache field trial, as that is the only client of TopHosts, currently. This is not expected to bias the metric significantly." I assume you're saying this can bias the metric because the Precache field trial only runs in certain conditions (certain platforms for instance)? Why do you think the platform might not have an influence on the browsing behavior of the user? I know I visit a much smaller set of sites on my mobile phone than my desktop. I'm trying to decide if you should mention any of this issue (field trial, platform, etc.) restrictions in the histogram description. I'm not sure. --mark
twifkak@chromium.org changed reviewers: - brettw@chromium.org, jwd@chromium.org, sky@chromium.org
On 2015/06/12 20:09:00, Mark P wrote: > In the future, I suggest picking reviewers, not picking multiple people to > review the same code (unless you want two sets of eyes--do you?). Sorry; didn't know the etiquette (and don't know which OWNERS make for good reviewers). I've trimmed the list of reviewers.
On 2015/06/12 20:12:30, Mark P wrote: > Also, from the changelist description: > "Practically speaking, this metric will only report for members of the > Precache field trial, as that is the only client of TopHosts, currently. > This is not expected to bias the metric significantly." > > I assume you're saying this can bias the metric because the Precache field > trial only runs in certain conditions (certain platforms for instance)? > Why do you think the platform might not have an influence on the browsing > behavior of the user? I know I visit a much smaller set of sites on my > mobile phone than my desktop. > > I'm trying to decide if you should mention any of this issue (field trial, > platform, etc.) restrictions in the histogram description. I'm not > sure. > > --mark Okay, good point, some of this is worth mentioning. The fact that it's mobile-only (actually Android-only) will likely affect things quite a bit. Edited the CL description and histogram description to mention this. The bias I was thinking about was that the precaching will potentially affect page load speed & data usage, which may in turn unconsciously affect user behavior. I'm guessing that will be negligible.
https://codereview.chromium.org/1179953005/diff/1/tools/metrics/histograms/hi... File tools/metrics/histograms/histograms.xml (left): https://codereview.chromium.org/1179953005/diff/1/tools/metrics/histograms/hi... tools/metrics/histograms/histograms.xml:12926: -<histogram name="History.TopSitesVisitsByRank" units="rank"> On 2015/06/12 20:08:59, Mark P wrote: > Do not delete old histograms. Done. https://codereview.chromium.org/1179953005/diff/1/tools/metrics/histograms/hi... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/1179953005/diff/1/tools/metrics/histograms/hi... tools/metrics/histograms/histograms.xml:12901: + Page visits to each of a user's top 50 hosts. Visits to all other hosts go On 2015/06/12 20:08:59, Mark P wrote: > What happens if hosts get reordered? I visit host #5 several times and then it > becomes host #4? Done. https://codereview.chromium.org/1179953005/diff/1/tools/metrics/histograms/hi... tools/metrics/histograms/histograms.xml:12903: + browsing and only count it once when cycling through a redirect chain. On 2015/06/12 20:08:59, Mark P wrote: > "if it came from user browsing" > as opposed to what? > I'm not sure to what it's referring exactly; I just copied it from the other histogram description (since it's a mirror of that CL). I'd guess it's referring to this: "visit_source == SOURCE_BROWSED && (transition & ui::PAGE_TRANSITION_CHAIN_END)", in which case, it would be as opposed to the other things on those enums: https://code.google.com/p/chromium/codesearch#chromium/src/components/history... https://code.google.com/p/chromium/codesearch#chromium/src/ui/base/page_trans... e.g. redirects, automated sync, extensions, etc.\
https://codereview.chromium.org/1179953005/diff/20001/tools/metrics/histogram... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/1179953005/diff/20001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:12926: + approximately nightly. Only count the page visit if it came from user after "nightly.", you may want to add another sentence saying that this means that the same bucket can represent multiple hosts if the top hosts section was recalculated in between emissions to this histogram. This also means that if you look at the UMA dashboard (which summarizes counts over a day) that likely a day's count covers a recalc. Something about this should be alluded to in the description. https://codereview.chromium.org/1179953005/diff/20001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:12927: + browsing and only count it once when cycling through a redirect chain. I suggest removing "came from user browsing". Look at the alternatives (sync, extensions, etc.), no one would expect this to count visits caused by those. In other words, I think that phrase adds more confusing than it clarifies. Also, you mention this code fragment: "visit_source == SOURCE_BROWSED && (transition & ui::PAGE_TRANSITION_CHAIN_END)" which implies that you only count it _at most_ once when cycling through a redirect chain, possibly not at all if it's not at the end of the chain. Are you sure your statement "count it once" is accurate?
lgtm with nits. https://codereview.chromium.org/1179953005/diff/20001/components/history/core... File components/history/core/browser/history_backend.cc (right): https://codereview.chromium.org/1179953005/diff/20001/components/history/core... components/history/core/browser/history_backend.cc:93: const int kMaxTopHostsForMetrics = 50; Is that big enough? https://codereview.chromium.org/1179953005/diff/20001/components/history/core... components/history/core/browser/history_backend.cc:426: for (const auto& host_count : top_hosts) { Not required, but you could make this slightly more compact: int i = 0; for (const auto& host_count : top_hosts) host_ranks[host_count.first] = ++i;
sky@chromium.org changed reviewers: + sky@chromium.org
https://codereview.chromium.org/1179953005/diff/20001/components/history/core... File components/history/core/browser/history_backend.cc (right): https://codereview.chromium.org/1179953005/diff/20001/components/history/core... components/history/core/browser/history_backend.cc:740: int host_rank = host_ranks_[HostForTopHosts(url)]; This always results in inserting into host_ranks_, which I don't think you want. Use find to determine if the value is present. https://codereview.chromium.org/1179953005/diff/20001/components/history/core... components/history/core/browser/history_backend.cc:744: UMA_HISTOGRAM_ENUMERATION("History.TopHostsVisitsByRank", host_rank, Not being familiar with TopHosts, I'm not sure how often it is called. If I leave the browser open for a long time won't host_ranks_ get stale? Seems to me it would be better to periodically query for the data you want and not internally cache host_ranks_. https://codereview.chromium.org/1179953005/diff/20001/components/history/core... File components/history/core/browser/history_backend.h (right): https://codereview.chromium.org/1179953005/diff/20001/components/history/core... components/history/core/browser/history_backend.h:842: mutable base::hash_map<std::string, int> host_ranks_; Why the 1s based value? https://codereview.chromium.org/1179953005/diff/20001/components/history/core... File components/history/core/browser/url_utils.cc (right): https://codereview.chromium.org/1179953005/diff/20001/components/history/core... components/history/core/browser/url_utils.cc:92: host.assign(host, 4, std::string::npos); nit: no need to reassign to host, just return the substring.
https://codereview.chromium.org/1179953005/diff/20001/components/history/core... File components/history/core/browser/history_backend.cc (right): https://codereview.chromium.org/1179953005/diff/20001/components/history/core... components/history/core/browser/history_backend.cc:93: const int kMaxTopHostsForMetrics = 50; On 2015/06/13 00:21:14, bengr wrote: > Is that big enough? Bigger would be better, since the median of History.MonthlyHostCount is 75ish and the mode is 120ish. However, the default is 50 buckets, and I didn't want to rock any boats. What are the trade-offs involved in picking more buckets? Alternatively, I could choose a coarser granularity. Thoughts? https://codereview.chromium.org/1179953005/diff/20001/components/history/core... components/history/core/browser/history_backend.cc:426: for (const auto& host_count : top_hosts) { On 2015/06/13 00:21:14, bengr wrote: > Not required, but you could make this slightly more compact: > > int i = 0; > for (const auto& host_count : top_hosts) > host_ranks[host_count.first] = ++i; Done. I chose the other way because strahinja doesn't like the use of ++i as an expression, since it requires mentally tracking some state when reading. But I have no strong opinion. https://codereview.chromium.org/1179953005/diff/20001/components/history/core... components/history/core/browser/history_backend.cc:740: int host_rank = host_ranks_[HostForTopHosts(url)]; On 2015/06/15 17:28:17, sky wrote: > This always results in inserting into host_ranks_, which I don't think you want. > Use find to determine if the value is present. Oof, fail. Done. https://codereview.chromium.org/1179953005/diff/20001/components/history/core... components/history/core/browser/history_backend.cc:744: UMA_HISTOGRAM_ENUMERATION("History.TopHostsVisitsByRank", host_rank, On 2015/06/15 17:28:17, sky wrote: > Not being familiar with TopHosts, I'm not sure how often it is called. If I > leave the browser open for a long time won't host_ranks_ get stale? Seems to me > it would be better to periodically query for the data you want and not > internally cache host_ranks_. TopHosts will be computed approximately once or twice per day for users in the Precache experiment. I wanted to write a metric that didn't have a significant user impact, since that was a prominent concern on the CL that I reference in the description. If I periodically compute host_ranks_, I will block other history-needing features, such as autocomplete, I will have an impact on IO usage, which may affect other tasks, and I will impact battery usage. PrecacheLauncher solves this by querying only when the device is on-power and not presently interactive -- something which can not (and probably should not) be done from within HistoryBackend, AFAIK. https://codereview.chromium.org/1179953005/diff/20001/components/history/core... File components/history/core/browser/history_backend.h (right): https://codereview.chromium.org/1179953005/diff/20001/components/history/core... components/history/core/browser/history_backend.h:842: mutable base::hash_map<std::string, int> host_ranks_; On 2015/06/15 17:28:17, sky wrote: > Why the 1s based value? It seems more intuitive. The statement we want to make is "X% of page visits come from the top Y sites," to which this maps cleanly. I can change it if necessary. https://codereview.chromium.org/1179953005/diff/20001/components/history/core... File components/history/core/browser/url_utils.cc (right): https://codereview.chromium.org/1179953005/diff/20001/components/history/core... components/history/core/browser/url_utils.cc:92: host.assign(host, 4, std::string::npos); On 2015/06/15 17:28:17, sky wrote: > nit: no need to reassign to host, just return the substring. Was trying to make use of NRVO. Returning just the substring here would break NRVO, IIUC, and create an additional copy in the case where the host doesn't start with "www.". I mean, I don't have a benchmark for this, so I obviously don't have a strong performance need. Also, it's entirely likely I'm wrong about the above. I'm happy to change if you feel strongly. https://codereview.chromium.org/1179953005/diff/20001/tools/metrics/histogram... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/1179953005/diff/20001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:12926: + approximately nightly. Only count the page visit if it came from user On 2015/06/12 21:09:54, Mark P wrote: > after "nightly.", you may want to add another sentence saying that this means > that the same bucket can represent multiple hosts if the top hosts section was > recalculated in between emissions to this histogram. This also means that if > you look at the UMA dashboard (which summarizes counts over a day) that likely a > day's count covers a recalc. Something about this should be alluded to in the > description. Done. https://codereview.chromium.org/1179953005/diff/20001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:12927: + browsing and only count it once when cycling through a redirect chain. On 2015/06/12 21:09:54, Mark P wrote: > I suggest removing "came from user browsing". Look at the alternatives (sync, > extensions, etc.), no one would expect this to count visits caused by those. In > other words, I think that phrase adds more confusing than it clarifies. Done. > Also, you mention this code fragment: > "visit_source == SOURCE_BROWSED && (transition & ui::PAGE_TRANSITION_CHAIN_END)" > which implies that you only count it _at most_ once when cycling through a > redirect chain, > possibly not at all if it's not at the end of the chain. > Are you sure your statement "count it once" is accurate? Are you implying that some redirect chains don't end in a PAGE_TRANSITION_CHAIN_END? Under what circumstances?
histograms.xml lgtm see minor comments below --mark https://codereview.chromium.org/1179953005/diff/20001/components/history/core... File components/history/core/browser/history_backend.cc (right): https://codereview.chromium.org/1179953005/diff/20001/components/history/core... components/history/core/browser/history_backend.cc:740: int host_rank = host_ranks_[HostForTopHosts(url)]; On 2015/06/16 20:25:23, twifkak wrote: > On 2015/06/15 17:28:17, sky wrote: > > This always results in inserting into host_ranks_, which I don't think you > want. > > Use find to determine if the value is present. > > Oof, fail. Done. This makes me think you should either have unit tests or at least test this using about:histograms. https://codereview.chromium.org/1179953005/diff/20001/tools/metrics/histogram... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/1179953005/diff/20001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:12927: + browsing and only count it once when cycling through a redirect chain. On 2015/06/16 20:25:24, twifkak wrote: > On 2015/06/12 21:09:54, Mark P wrote: > > I suggest removing "came from user browsing". Look at the alternatives (sync, > > extensions, etc.), no one would expect this to count visits caused by those. > In > > other words, I think that phrase adds more confusing than it clarifies. > > Done. > > > Also, you mention this code fragment: > > "visit_source == SOURCE_BROWSED && (transition & > ui::PAGE_TRANSITION_CHAIN_END)" > > which implies that you only count it _at most_ once when cycling through a > > redirect chain, > > possibly not at all if it's not at the end of the chain. > > Are you sure your statement "count it once" is accurate? > > Are you implying that some redirect chains don't end in a > PAGE_TRANSITION_CHAIN_END? Under what circumstances? No. I was thinking from the previous text that you were counting a visit to a host every time it was visited (even on a redirect chain), just making sure that if a host was hit multiple times on a chain you'd only count it once. With the new text, what you're actually doing is clear. (It wasn't what I thought before.) https://codereview.chromium.org/1179953005/diff/40001/tools/metrics/histogram... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/1179953005/diff/40001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:12925: + into the 51st bucket. Android-only. Only count the last URL in a redirect nit: count->counts
https://codereview.chromium.org/1179953005/diff/20001/components/history/core... File components/history/core/browser/history_backend.cc (right): https://codereview.chromium.org/1179953005/diff/20001/components/history/core... components/history/core/browser/history_backend.cc:740: int host_rank = host_ranks_[HostForTopHosts(url)]; On 2015/06/16 20:54:31, Mark P wrote: > On 2015/06/16 20:25:23, twifkak wrote: > > On 2015/06/15 17:28:17, sky wrote: > > > This always results in inserting into host_ranks_, which I don't think you > > want. > > > Use find to determine if the value is present. > > > > Oof, fail. Done. > > This makes me think you should either have unit tests or at least test this > using about:histograms. A unit test (or about:histograms) wouldn't have caught this particular bug, since 0 was the correct default value. The only effect would have been unbounded memory growth. That said, a unit test is a good idea. Added. Found a bug in this latest find-based version and fixed it. https://codereview.chromium.org/1179953005/diff/20001/tools/metrics/histogram... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/1179953005/diff/20001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:12927: + browsing and only count it once when cycling through a redirect chain. On 2015/06/16 20:54:31, Mark P wrote: > On 2015/06/16 20:25:24, twifkak wrote: > > On 2015/06/12 21:09:54, Mark P wrote: > > > I suggest removing "came from user browsing". Look at the alternatives > (sync, > > > extensions, etc.), no one would expect this to count visits caused by those. > > > In > > > other words, I think that phrase adds more confusing than it clarifies. > > > > Done. > > > > > Also, you mention this code fragment: > > > "visit_source == SOURCE_BROWSED && (transition & > > ui::PAGE_TRANSITION_CHAIN_END)" > > > which implies that you only count it _at most_ once when cycling through a > > > redirect chain, > > > possibly not at all if it's not at the end of the chain. > > > Are you sure your statement "count it once" is accurate? > > > > Are you implying that some redirect chains don't end in a > > PAGE_TRANSITION_CHAIN_END? Under what circumstances? > > No. I was thinking from the previous text that you were counting a visit to a > host every time it was visited (even on a redirect chain), just making sure that > if a host was hit multiple times on a chain you'd only count it once. > With the new text, what you're actually doing is clear. (It wasn't what I > thought before.) Acknowledged. https://codereview.chromium.org/1179953005/diff/40001/tools/metrics/histogram... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/1179953005/diff/40001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:12925: + into the 51st bucket. Android-only. Only count the last URL in a redirect On 2015/06/16 20:54:31, Mark P wrote: > nit: count->counts Done.
sky and sdefresne: Ping.
twifkak@chromium.org changed reviewers: - sdefresne@chromium.org
https://codereview.chromium.org/1179953005/diff/20001/components/history/core... File components/history/core/browser/history_backend.cc (right): https://codereview.chromium.org/1179953005/diff/20001/components/history/core... components/history/core/browser/history_backend.cc:744: UMA_HISTOGRAM_ENUMERATION("History.TopHostsVisitsByRank", host_rank, On 2015/06/16 20:25:23, twifkak wrote: > On 2015/06/15 17:28:17, sky wrote: > > Not being familiar with TopHosts, I'm not sure how often it is called. If I > > leave the browser open for a long time won't host_ranks_ get stale? Seems to > me > > it would be better to periodically query for the data you want and not > > internally cache host_ranks_. > > TopHosts will be computed approximately once or twice per day for users in the > Precache experiment. Sure, but then how realistic is your metric if it gets stale? > > I wanted to write a metric that didn't have a significant user impact, since > that was a prominent concern on the CL that I reference in the description. If I > periodically compute host_ranks_, I will block other history-needing features, > such as autocomplete, I will have an impact on IO usage, which may affect other > tasks, and I will impact battery usage. PrecacheLauncher solves this by querying > only when the device is on-power and not presently interactive -- something > which can not (and probably should not) be done from within HistoryBackend, > AFAIK. https://codereview.chromium.org/1179953005/diff/20001/components/history/core... File components/history/core/browser/history_backend.h (right): https://codereview.chromium.org/1179953005/diff/20001/components/history/core... components/history/core/browser/history_backend.h:842: mutable base::hash_map<std::string, int> host_ranks_; On 2015/06/16 20:25:24, twifkak wrote: > On 2015/06/15 17:28:17, sky wrote: > > Why the 1s based value? > > It seems more intuitive. The statement we want to make is "X% of page visits > come from the top Y sites," to which this maps cleanly. I can change it if > necessary. I think it would be better to have 0 based (like nearly everything else in chrome), but if you want the reporting to be 1 based increment there.
A rebase got mixed in there. My diff lines are: history_backend.h: L842 / R846 history_backend.cc: L739-743 / R747-750 history_backend_unittest.cc: L3033 / R3089 https://codereview.chromium.org/1179953005/diff/20001/components/history/core... File components/history/core/browser/history_backend.cc (right): https://codereview.chromium.org/1179953005/diff/20001/components/history/core... components/history/core/browser/history_backend.cc:744: UMA_HISTOGRAM_ENUMERATION("History.TopHostsVisitsByRank", host_rank, On 2015/06/30 19:17:26, sky wrote: > On 2015/06/16 20:25:23, twifkak wrote: > > On 2015/06/15 17:28:17, sky wrote: > > > Not being familiar with TopHosts, I'm not sure how often it is called. If I > > > leave the browser open for a long time won't host_ranks_ get stale? Seems to > > me > > > it would be better to periodically query for the data you want and not > > > internally cache host_ranks_. > > > > TopHosts will be computed approximately once or twice per day for users in the > > Precache experiment. > > Sure, but then how realistic is your metric if it gets stale? My intended use for this metric is to gather data with which to tweak the parameters for the Precache experiment. The Precache experiment prefetches static resources for the top N hosts, as determined by calling TopHosts(). As such, the metric is actually the most realistic reflection of what I want to determine: an estimate of the experiment's impact for a given N. As far as how realistic it is for uses other than my intended use, I guess I'd need some ideas of what those uses might be in order to determine the level of precision they require. https://codereview.chromium.org/1179953005/diff/20001/components/history/core... File components/history/core/browser/history_backend.h (right): https://codereview.chromium.org/1179953005/diff/20001/components/history/core... components/history/core/browser/history_backend.h:842: mutable base::hash_map<std::string, int> host_ranks_; On 2015/06/30 19:17:26, sky wrote: > On 2015/06/16 20:25:24, twifkak wrote: > > On 2015/06/15 17:28:17, sky wrote: > > > Why the 1s based value? > > > > It seems more intuitive. The statement we want to make is "X% of page visits > > come from the top Y sites," to which this maps cleanly. I can change it if > > necessary. > > I think it would be better to have 0 based (like nearly everything else in > chrome), but if you want the reporting to be 1 based increment there. Done. When you say "increment there", where do you mean?
On Tue, Jun 30, 2015 at 2:14 PM, <twifkak@chromium.org> wrote: > A rebase got mixed in there. My diff lines are: > history_backend.h: L842 / R846 > history_backend.cc: L739-743 / R747-750 > history_backend_unittest.cc: L3033 / R3089 > > > https://codereview.chromium.org/1179953005/diff/20001/components/history/core... > File components/history/core/browser/history_backend.cc (right): > > https://codereview.chromium.org/1179953005/diff/20001/components/history/core... > components/history/core/browser/history_backend.cc:744: > UMA_HISTOGRAM_ENUMERATION("History.TopHostsVisitsByRank", host_rank, > On 2015/06/30 19:17:26, sky wrote: >> >> On 2015/06/16 20:25:23, twifkak wrote: >> > On 2015/06/15 17:28:17, sky wrote: >> > > Not being familiar with TopHosts, I'm not sure how often it is > > called. If I >> >> > > leave the browser open for a long time won't host_ranks_ get > > stale? Seems to >> >> > me >> > > it would be better to periodically query for the data you want and > > not >> >> > > internally cache host_ranks_. >> > >> > TopHosts will be computed approximately once or twice per day for > > users in the >> >> > Precache experiment. > > >> Sure, but then how realistic is your metric if it gets stale? > > > My intended use for this metric is to gather data with which to tweak > the parameters for the Precache experiment. The Precache experiment > prefetches static resources for the top N hosts, as determined by > calling TopHosts(). As such, the metric is actually the most realistic > reflection of what I want to determine: an estimate of the experiment's > impact for a given N. > > As far as how realistic it is for uses other than my intended use, I > guess I'd need some ideas of what those uses might be in order to > determine the level of precision they require. > > https://codereview.chromium.org/1179953005/diff/20001/components/history/core... > File components/history/core/browser/history_backend.h (right): > > https://codereview.chromium.org/1179953005/diff/20001/components/history/core... > components/history/core/browser/history_backend.h:842: mutable > base::hash_map<std::string, int> host_ranks_; > On 2015/06/30 19:17:26, sky wrote: >> >> On 2015/06/16 20:25:24, twifkak wrote: >> > On 2015/06/15 17:28:17, sky wrote: >> > > Why the 1s based value? >> > >> > It seems more intuitive. The statement we want to make is "X% of > > page visits >> >> > come from the top Y sites," to which this maps cleanly. I can change > > it if >> >> > necessary. > > >> I think it would be better to have 0 based (like nearly everything > > else in >> >> chrome), but if you want the reporting to be 1 based increment there. > > > Done. When you say "increment there", where do you mean? Sorry for not being clear. I meant keep the map 0 based, but in your uma stats add 1 (if you really want to). -Scott > > https://codereview.chromium.org/1179953005/ To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
On 2015/06/30 22:19:44, sky wrote: > On Tue, Jun 30, 2015 at 2:14 PM, <mailto:twifkak@chromium.org> wrote: > > A rebase got mixed in there. My diff lines are: > > history_backend.h: L842 / R846 > > history_backend.cc: L739-743 / R747-750 > > history_backend_unittest.cc: L3033 / R3089 > > > > > > > https://codereview.chromium.org/1179953005/diff/20001/components/history/core... > > File components/history/core/browser/history_backend.cc (right): > > > > > https://codereview.chromium.org/1179953005/diff/20001/components/history/core... > > components/history/core/browser/history_backend.cc:744: > > UMA_HISTOGRAM_ENUMERATION("History.TopHostsVisitsByRank", host_rank, > > On 2015/06/30 19:17:26, sky wrote: > >> > >> On 2015/06/16 20:25:23, twifkak wrote: > >> > On 2015/06/15 17:28:17, sky wrote: > >> > > Not being familiar with TopHosts, I'm not sure how often it is > > > > called. If I > >> > >> > > leave the browser open for a long time won't host_ranks_ get > > > > stale? Seems to > >> > >> > me > >> > > it would be better to periodically query for the data you want and > > > > not > >> > >> > > internally cache host_ranks_. > >> > > >> > TopHosts will be computed approximately once or twice per day for > > > > users in the > >> > >> > Precache experiment. > > > > > >> Sure, but then how realistic is your metric if it gets stale? > > > > > > My intended use for this metric is to gather data with which to tweak > > the parameters for the Precache experiment. The Precache experiment > > prefetches static resources for the top N hosts, as determined by > > calling TopHosts(). As such, the metric is actually the most realistic > > reflection of what I want to determine: an estimate of the experiment's > > impact for a given N. > > > > As far as how realistic it is for uses other than my intended use, I > > guess I'd need some ideas of what those uses might be in order to > > determine the level of precision they require. > > > > > https://codereview.chromium.org/1179953005/diff/20001/components/history/core... > > File components/history/core/browser/history_backend.h (right): > > > > > https://codereview.chromium.org/1179953005/diff/20001/components/history/core... > > components/history/core/browser/history_backend.h:842: mutable > > base::hash_map<std::string, int> host_ranks_; > > On 2015/06/30 19:17:26, sky wrote: > >> > >> On 2015/06/16 20:25:24, twifkak wrote: > >> > On 2015/06/15 17:28:17, sky wrote: > >> > > Why the 1s based value? > >> > > >> > It seems more intuitive. The statement we want to make is "X% of > > > > page visits > >> > >> > come from the top Y sites," to which this maps cleanly. I can change > > > > it if > >> > >> > necessary. > > > > > >> I think it would be better to have 0 based (like nearly everything > > > > else in > >> > >> chrome), but if you want the reporting to be 1 based increment there. > > > > > > Done. When you say "increment there", where do you mean? > > Sorry for not being clear. I meant keep the map 0 based, but in your > uma stats add 1 (if you really want to). Ah, okay, done. > -Scott > > > > https://codereview.chromium.org/1179953005/ > > To unsubscribe from this group and stop receiving emails from it, send an email > to mailto:chromium-reviews+unsubscribe@chromium.org.
https://codereview.chromium.org/1179953005/diff/120001/components/history/cor... File components/history/core/browser/history_backend.h (right): https://codereview.chromium.org/1179953005/diff/120001/components/history/cor... components/history/core/browser/history_backend.h:846: // Map from host to index in the TopHosts list. Add a comment that this map is not necessarily up to date. https://codereview.chromium.org/1179953005/diff/120001/components/history/cor... File components/history/core/browser/url_utils.cc (right): https://codereview.chromium.org/1179953005/diff/120001/components/history/cor... components/history/core/browser/url_utils.cc:91: if (base::StartsWithASCII(host, "www.", true)) StartsWithASCII is deprecated. Use STartsWith(..., CompareCase::INSENSITIVE_ASCII). https://codereview.chromium.org/1179953005/diff/120001/components/history/cor... File components/history/core/browser/url_utils_unittest.cc (right): https://codereview.chromium.org/1179953005/diff/120001/components/history/cor... components/history/core/browser/url_utils_unittest.cc:132: EXPECT_EQ("foo.com", HostForTopHosts(GURL("http://www.foo.com/bar"))); You should add a variant that has mixed case, eg http://WWw.bar.com/ https://codereview.chromium.org/1179953005/diff/120001/tools/metrics/histogra... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/1179953005/diff/120001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:13086: + into the 51st bucket. Android-only. Only counts the last URL in a redirect What makes this android only? Also, document that it's 1s based.
https://codereview.chromium.org/1179953005/diff/120001/components/history/cor... File components/history/core/browser/history_backend.h (right): https://codereview.chromium.org/1179953005/diff/120001/components/history/cor... components/history/core/browser/history_backend.h:846: // Map from host to index in the TopHosts list. On 2015/07/01 16:53:32, sky wrote: > Add a comment that this map is not necessarily up to date. Done. https://codereview.chromium.org/1179953005/diff/120001/components/history/cor... File components/history/core/browser/url_utils.cc (right): https://codereview.chromium.org/1179953005/diff/120001/components/history/cor... components/history/core/browser/url_utils.cc:91: if (base::StartsWithASCII(host, "www.", true)) On 2015/07/01 16:53:33, sky wrote: > StartsWithASCII is deprecated. Use STartsWith(..., > CompareCase::INSENSITIVE_ASCII). Actually, went ahead and case-folded the host. This appears safe to do: https://tools.ietf.org/html/rfc4343#section-3 https://codereview.chromium.org/1179953005/diff/120001/components/history/cor... File components/history/core/browser/url_utils_unittest.cc (right): https://codereview.chromium.org/1179953005/diff/120001/components/history/cor... components/history/core/browser/url_utils_unittest.cc:132: EXPECT_EQ("foo.com", HostForTopHosts(GURL("http://www.foo.com/bar"))); On 2015/07/01 16:53:33, sky wrote: > You should add a variant that has mixed case, eg http://WWw.bar.com/ Done. https://codereview.chromium.org/1179953005/diff/120001/tools/metrics/histogra... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/1179953005/diff/120001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:13086: + into the 51st bucket. Android-only. Only counts the last URL in a redirect On 2015/07/01 16:53:33, sky wrote: > What makes this android only? The host_ranks_ map is updated only by TopHosts(), which is called only by precache::PrecacheManager::StartPrecaching(), which is called only by PrecacheLauncher::Start(), which is a JNI class. > Also, document that it's 1s based. Done. https://codereview.chromium.org/1179953005/diff/130001/tools/metrics/histogra... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/1179953005/diff/130001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:13084: + <owner>bengr@chromium.org</owner> Note: This addition is per https://codereview.chromium.org/1212773002/diff/40001/tools/metrics/histogram....
LGTM
On 2015/07/01 20:45:23, sky wrote: > LGTM Thank you for the review!
The CQ bit was checked by twifkak@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from bengr@chromium.org, mpearson@chromium.org Link to the patchset: https://codereview.chromium.org/1179953005/#ps130001 (title: "Case folding and doc tweaks per sky.")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1179953005/130001
The CQ bit was unchecked by commit-bot@chromium.org
Exceeded global retry quota
The CQ bit was checked by twifkak@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from bengr@chromium.org, mpearson@chromium.org, sky@chromium.org Link to the patchset: https://codereview.chromium.org/1179953005/#ps170001 (title: "Fix unittest to be 1-based again.")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1179953005/170001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: linux_android_rel_ng on tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_android_r...)
The CQ bit was checked by twifkak@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1179953005/170001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: linux_android_rel_ng on tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_android_r...)
On 2015/07/02 02:59:08, commit-bot: I haz the power wrote: > Try jobs failed on following builders: > linux_android_rel_ng on tryserver.chromium.linux (JOB_FAILED, > http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_android_r...) OK, so it's not a flake, but it looks it was fixed by https://chromium.googlesource.com/chromium/src/+/8df2841213f300c7d72c9f20eb91..., so I'll try again.
The CQ bit was checked by twifkak@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1179953005/170001
Message was sent while issue was closed.
Committed patchset #10 (id:170001)
Message was sent while issue was closed.
Patchset 10 (id:??) landed as https://crrev.com/cffae42a2d58588ca6cca5f40351b168fcdc10b0 Cr-Commit-Position: refs/heads/master@{#337185} |