|
|
Created:
7 years, 6 months ago by bengr Modified:
7 years, 6 months ago CC:
chromium-reviews, browser-components-watch_chromium.org, Alexei Svitkine (slow), jwd Base URL:
svn://svn.chromium.org/chrome/trunk/src Visibility:
Public. |
DescriptionTrack fraction of visits to top URLs.
Record UMA to count the number of page visits to each of a user's top k URLs.
BUG=247216
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=207957
Patch Set 1 #Patch Set 2 : Added histogram description #
Total comments: 13
Patch Set 3 : Addressed reviewer comments #Patch Set 4 : Added more detail to histogram description. #
Total comments: 15
Patch Set 5 : Addressed comments from jar #
Total comments: 16
Patch Set 6 : Added map and addressed other comments #
Total comments: 6
Patch Set 7 : Made UMA-related functions into members #
Messages
Total messages: 30 (0 generated)
Brett, PTAL. Thanks.
You definitely don't want to do it this way. The most visited computation is very slow, so doing it on every page will be VERY bad.
There's not that much info on the bug for me to suggest an alternative. You can hook into top_sites which maintains a cache of the top ~20 "top sites".
I see your point. My first inclination was to use top_sites, but I wanted to track more than the top 20 and I didn't have a good idea where I'd be able to see every page visit and have access to top_sites. Do you have any suggestions for the latter? I can live with 20. Thanks. On Mon, Jun 10, 2013 at 1:55 PM, <brettw@chromium.org> wrote: > There's not that much info on the bug for me to suggest an alternative. > You can > hook into top_sites which maintains a cache of the top ~20 "top sites". > > https://codereview.chromium.**org/16517002/<https://codereview.chromium.org/1... >
No, I can't think of a good place. You may need to add a new component. I don't know how much you need these metrics but it sounds like the cost may outweigh the benefit. Brett On Mon, Jun 10, 2013 at 3:00 PM, Ben Greenstein <bengr@google.com> wrote: > I see your point. My first inclination was to use top_sites, but I wanted to > track more than the top 20 and I didn't have a good idea where I'd be able > to see every page visit and have access to top_sites. Do you have any > suggestions for the latter? I can live with 20. Thanks. > > > On Mon, Jun 10, 2013 at 1:55 PM, <brettw@chromium.org> wrote: >> >> There's not that much info on the bug for me to suggest an alternative. >> You can >> hook into top_sites which maintains a cache of the top ~20 "top sites". >> >> https://codereview.chromium.org/16517002/ > >
On 2013/06/10 22:04:55, brettw wrote: > No, I can't think of a good place. You may need to add a new > component. I don't know how much you need these metrics but it sounds > like the cost may outweigh the benefit. Hmm. It is very important to us. I just looked at the code again and I only query the db once when the HistoryBackend is initialized. Is that too much? On Android this works because sessions are typically short lived. (I can add a comment that this approach isn't strictly accurate, because the top sites list might change over the course of a session.) I then loop through all entries in the top urls and their redirect chains. That seems a little heavy. If necessary, I could create a hash map to avoid the looping. What do you think? > Brett > > On Mon, Jun 10, 2013 at 3:00 PM, Ben Greenstein <mailto:bengr@google.com> wrote: > > I see your point. My first inclination was to use top_sites, but I wanted to > > track more than the top 20 and I didn't have a good idea where I'd be able > > to see every page visit and have access to top_sites. Do you have any > > suggestions for the latter? I can live with 20. Thanks. > > > > > > On Mon, Jun 10, 2013 at 1:55 PM, <mailto:brettw@chromium.org> wrote: > >> > >> There's not that much info on the bug for me to suggest an alternative. > >> You can > >> hook into top_sites which maintains a cache of the top ~20 "top sites". > >> > >> https://codereview.chromium.org/16517002/ > > > >
jar: histograms.xml
https://codereview.chromium.org/16517002/diff/9001/tools/metrics/histograms/h... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/16517002/diff/9001/tools/metrics/histograms/h... tools/metrics/histograms/histograms.xml:2610: <summary>Page visits to each of a user's top k sites.</summary> This seems like a strange metric. Are you saying that the kth bucket will contain a count of the number of times a user visited the kth ranked site? Are you trying to estimate what the usage curve looks like? If so, you certainly need to have the normalization which shows how many visits were "in the unsung long tail." Can you verbalize what you're hoping to learn from this??
On 2013/06/11 00:47:34, jar wrote: > https://codereview.chromium.org/16517002/diff/9001/tools/metrics/histograms/h... > File tools/metrics/histograms/histograms.xml (right): > > https://codereview.chromium.org/16517002/diff/9001/tools/metrics/histograms/h... > tools/metrics/histograms/histograms.xml:2610: <summary>Page visits to each of a > user's top k sites.</summary> > This seems like a strange metric. Thank you. ;) > > Are you saying that the kth bucket will contain a count of the number of times a > user visited the kth ranked site? Yes. > > Are you trying to estimate what the usage curve looks like? Yes. > > If so, you certainly need to have the normalization which shows how many visits > were "in the unsung long tail." I track for the top 50 sites for each user. Ranks 51-infinity go into the 51st bucket. > > Can you verbalize what you're hoping to learn from this?? I can try. :) We want to understand (1) whether a user's top sites are a good predictor of the pages he will visit, and (2) if so, at what point (presuming <50) do we see diminishing returns. E.g., if 50% of all page visits are to the top 3 sites and 51% are to the top 4, we learn that top sites are a good predictor, but only because the first 3 (or fewer) sites are great predictors, while the rest are weak.
I see, I misread the logic, lemme look at it in more detail
History initialization is kind of "hot" in that you can't get any autocomplete suggestions until it's complete, typically that's the first thing you want when you start the browser, and top sites computation isn't super fast. But it's not clearly horrible, however. If you guys think this is worthwhile, I think it's fine, but I think there should be a well-defined end date in 1-2 releases, with comments in the code mentioning when it can be removed, and a bug filed to track this. https://codereview.chromium.org/16517002/diff/9001/chrome/browser/history/his... File chrome/browser/history/history_backend.cc (right): https://codereview.chromium.org/16517002/diff/9001/chrome/browser/history/his... chrome/browser/history/history_backend.cc:59: // Return the rank of the url if it appears in |top_urls| (including within Style: blank line before here, don't indent namespace contents. https://codereview.chromium.org/16517002/diff/9001/chrome/browser/history/his... chrome/browser/history/history_backend.cc:84: } // namespace Blank line before here. https://codereview.chromium.org/16517002/diff/9001/chrome/browser/history/his... chrome/browser/history/history_backend.cc:137: static const size_t kMaxTopSites = 50; Can this have a name more specific to your metrics? Otherwise this looks like a general "how many top sites we have" thing. https://codereview.chromium.org/16517002/diff/9001/chrome/browser/history/his... chrome/browser/history/history_backend.cc:889: (transition & content::PAGE_TRANSITION_CHAIN_END) != 0) If you want only the last one, then this code should go in AddPage which calls this function once for each redirect in the chain. Then you can remove the END check. https://codereview.chromium.org/16517002/diff/9001/chrome/browser/history/his... chrome/browser/history/history_backend.cc:891: // TODO(bengr): check referrer as well. This TODO doesn't make any sense to me. Either remove it or expand on it for somebody else reading can understand what you meant. https://codereview.chromium.org/16517002/diff/9001/chrome/browser/history/his... File chrome/browser/history/history_backend.h (right): https://codereview.chromium.org/16517002/diff/9001/chrome/browser/history/his... chrome/browser/history/history_backend.h:908: MostVisitedURLList most_visited_urls_; This should probably have a comment about why we store it here
https://codereview.chromium.org/16517002/diff/9001/chrome/browser/history/his... File chrome/browser/history/history_backend.cc (right): https://codereview.chromium.org/16517002/diff/9001/chrome/browser/history/his... chrome/browser/history/history_backend.cc:59: // Return the rank of the url if it appears in |top_urls| (including within On 2013/06/11 18:28:16, brettw wrote: > Style: blank line before here, don't indent namespace contents. Done. https://codereview.chromium.org/16517002/diff/9001/chrome/browser/history/his... chrome/browser/history/history_backend.cc:84: } // namespace On 2013/06/11 18:28:16, brettw wrote: > Blank line before here. Done. https://codereview.chromium.org/16517002/diff/9001/chrome/browser/history/his... chrome/browser/history/history_backend.cc:137: static const size_t kMaxTopSites = 50; On 2013/06/11 18:28:16, brettw wrote: > Can this have a name more specific to your metrics? Otherwise this looks like a > general "how many top sites we have" thing. Done. https://codereview.chromium.org/16517002/diff/9001/chrome/browser/history/his... chrome/browser/history/history_backend.cc:889: (transition & content::PAGE_TRANSITION_CHAIN_END) != 0) If I understand the code, AddPage calls this AddPageVisit in two locations, one for the no redirect case and on for the redirect case. If I move this code there, I'd have to duplicate it. What do you prefer? On 2013/06/11 18:28:16, brettw wrote: > If you want only the last one, then this code should go in AddPage which calls > this function once for each redirect in the chain. Then you can remove the END > check. https://codereview.chromium.org/16517002/diff/9001/chrome/browser/history/his... chrome/browser/history/history_backend.cc:891: // TODO(bengr): check referrer as well. On 2013/06/11 18:28:16, brettw wrote: > This TODO doesn't make any sense to me. Either remove it or expand on it for > somebody else reading can understand what you meant. Done. https://codereview.chromium.org/16517002/diff/9001/chrome/browser/history/his... File chrome/browser/history/history_backend.h (right): https://codereview.chromium.org/16517002/diff/9001/chrome/browser/history/his... chrome/browser/history/history_backend.h:908: MostVisitedURLList most_visited_urls_; On 2013/06/11 18:28:16, brettw wrote: > This should probably have a comment about why we store it here Done.
On 2013/06/11 02:21:55, bengr1 wrote: > On 2013/06/11 00:47:34, jar wrote: > > > https://codereview.chromium.org/16517002/diff/9001/tools/metrics/histograms/h... > > File tools/metrics/histograms/histograms.xml (right): > > > > > https://codereview.chromium.org/16517002/diff/9001/tools/metrics/histograms/h... > > tools/metrics/histograms/histograms.xml:2610: <summary>Page visits to each of > a > > user's top k sites.</summary> > > This seems like a strange metric. > > Thank you. ;) > > > > > Are you saying that the kth bucket will contain a count of the number of times > a > > user visited the kth ranked site? > > Yes. > > > > > Are you trying to estimate what the usage curve looks like? > > Yes. > > > > > If so, you certainly need to have the normalization which shows how many > visits > > were "in the unsung long tail." > > I track for the top 50 sites for each user. Ranks 51-infinity go into the 51st > bucket. > > > > > Can you verbalize what you're hoping to learn from this?? > > I can try. :) We want to understand (1) whether a user's top sites are a good > predictor of the pages he will visit, and (2) if so, at what point (presuming > <50) do we see diminishing returns. E.g., if 50% of all page visits are to the > top 3 sites and 51% are to the top 4, we learn that top sites are a good > predictor, but only because the first 3 (or fewer) sites are great predictors, > while the rest are weak. Please say some of this in the description of the histogram. There was (for example) no mention of the fact that the 51st bucket contained the visitation count for "all other" sites (beyond the "known" 50 top sites. Please try to explain what the histogram will contain in the xml file. I'm also curious how sharded sites will play in your attempted metrics. For example, suppose you wanted to look at youtube. I bet you'd find a ton of distinct hosts, all ending with the same TLD (*.youtube.com), but each having a pseudo-random host prefix. These would then presumably be numerous, but none of these would be in the "top 50." <side comment> It is also interesting that you seem to be looking for a "one size fits all" cutoff. It seems very plausible that some users will have a low cutoff (50% of navigations are among the their top 3 favorite sites), and some users will be diverse surfers (10% of navigations will be among the top 50% of sites). You might want to classify users, and try to track how deeply you need to save info in order to best support the user... </side comment> Which leads to the question: What will you do with this data if you got it? How will you act?? What cache size will change?? What activity will be different (are you going to maintain "hot DNS resolutions" for the most likely 50% of user sites??). Perhaps if I (or the uma team) understood the desired use, we could comment better on how to gather a most useful metric.
+ilya, b/c jar@ is OOO.
On 2013/06/17 20:44:28, bengr1 wrote: > +ilya, b/c jar@ is OOO. He is? I don't see any messages that Jim hasn't replied to on this code review, nor do I see anything on his calendar.
On 2013/06/17 21:43:50, Ilya Sherman wrote: > On 2013/06/17 20:44:28, bengr1 wrote: > > +ilya, b/c jar@ is OOO. > > He is? I don't see any messages that Jim hasn't replied to on this code review, > nor do I see anything on his calendar. I think I've addressed all of jar@'s comments, and I believe he is OOO. jar@, are you back? Ilya, if no response, would you mind taking over?
https://codereview.chromium.org/16517002/diff/23001/chrome/browser/history/hi... File chrome/browser/history/history_backend.cc (right): https://codereview.chromium.org/16517002/diff/23001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:64: for (size_t i = 0; i < top_urls.size(); ++i) { nit: How about at least not searching beyond max_top_url_count? ...or DCHECK that size() is always under it?? https://codereview.chromium.org/16517002/diff/23001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:68: if (url == top_urls[i].redirects[j]) I'm not sure how long these chains are.... but are you concerned about writing code that does these linear searches? This seems like it would be done a lot... Could you mind histogramming how many URL compares we do here? ... and do we do this for every navigation to every URL?? https://codereview.chromium.org/16517002/diff/23001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:80: const GURL& url, int max_top_url_count) { nit: one arg per line in declaration and definitions. https://codereview.chromium.org/16517002/diff/23001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:84: UMA_HISTOGRAM_COUNTS("History.TopSitesVisitsByRank", rank); Better would be to use: UMA_HISTOGRAM_ENUMERATION("...", rank, max_top_url_count + 1); https://codereview.chromium.org/16517002/diff/23001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:141: static const size_t kPageVisitStatsMaxTopSites = 50; nit: (even though I like statics at file scope...) Anonymous namespaces are preferred. This list of constants should probably be pushed higher in the file, (or your new code later in the file), so that the new code can refer to these constants. https://codereview.chromium.org/16517002/diff/23001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:894: RecordTopPageVisitStats(most_visited_urls_, url, nit: use curlies around any if action that runs onto more than a line. https://codereview.chromium.org/16517002/diff/23001/chrome/browser/history/hi... File chrome/browser/history/history_backend.h (right): https://codereview.chromium.org/16517002/diff/23001/chrome/browser/history/hi... chrome/browser/history/history_backend.h:912: MostVisitedURLList most_visited_urls_; nit: I went to see what type this was... and noticed it was defined twice in chrome/browser/history/history_types.h. Perhaps you could also delete the redundant declaration.
https://codereview.chromium.org/16517002/diff/23001/chrome/browser/history/hi... File chrome/browser/history/history_backend.cc (right): https://codereview.chromium.org/16517002/diff/23001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:64: for (size_t i = 0; i < top_urls.size(); ++i) { On 2013/06/17 22:58:14, jar wrote: > nit: How about at least not searching beyond max_top_url_count? > > ...or DCHECK that size() is always under it?? Done. https://codereview.chromium.org/16517002/diff/23001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:68: if (url == top_urls[i].redirects[j]) From a random sampling of top sites, I'm seeing chains of size 1-4. This occurs once per page load, but on the db thread. I considered using a hash, but this seemed to be overkill for a metric that will be removed in M31. If you feel otherwise, I could build a hash table. Also, do you really want me to histogram the operation of code that supports a histogram? That seems like overkill to me, but if you really want it, I'm ok with adding it. On 2013/06/17 22:58:14, jar wrote: > I'm not sure how long these chains are.... but are you concerned about writing > code that does these linear searches? > > This seems like it would be done a lot... > > Could you mind histogramming how many URL compares we do here? ... and do we do > this for every navigation to every URL?? https://codereview.chromium.org/16517002/diff/23001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:80: const GURL& url, int max_top_url_count) { On 2013/06/17 22:58:14, jar wrote: > nit: one arg per line in declaration and definitions. Done. https://codereview.chromium.org/16517002/diff/23001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:84: UMA_HISTOGRAM_COUNTS("History.TopSitesVisitsByRank", rank); On 2013/06/17 22:58:14, jar wrote: > Better would be to use: > UMA_HISTOGRAM_ENUMERATION("...", rank, max_top_url_count + 1); Done. https://codereview.chromium.org/16517002/diff/23001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:141: static const size_t kPageVisitStatsMaxTopSites = 50; I moved just my constant to the top. On 2013/06/17 22:58:14, jar wrote: > nit: (even though I like statics at file scope...) Anonymous namespaces are > preferred. > > This list of constants should probably be pushed higher in the file, (or your > new code later in the file), so that the new code can refer to these constants. https://codereview.chromium.org/16517002/diff/23001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:894: RecordTopPageVisitStats(most_visited_urls_, url, On 2013/06/17 22:58:14, jar wrote: > nit: use curlies around any if action that runs onto more than a line. Done. https://codereview.chromium.org/16517002/diff/23001/chrome/browser/history/hi... File chrome/browser/history/history_backend.h (right): https://codereview.chromium.org/16517002/diff/23001/chrome/browser/history/hi... chrome/browser/history/history_backend.h:912: MostVisitedURLList most_visited_urls_; On 2013/06/17 22:58:14, jar wrote: > nit: I went to see what type this was... and noticed it was defined twice in > chrome/browser/history/history_types.h. Perhaps you could also delete the > redundant declaration. Done.
https://codereview.chromium.org/16517002/diff/23001/chrome/browser/history/hi... File chrome/browser/history/history_backend.cc (right): https://codereview.chromium.org/16517002/diff/23001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:68: if (url == top_urls[i].redirects[j]) On 2013/06/17 23:46:59, bengr1 wrote: > From a random sampling of top sites, I'm seeing chains of size 1-4. This occurs > once per page load, but on the db thread. I considered using a hash, but this > seemed to be overkill for a metric that will be removed in M31. If you feel > otherwise, I could build a hash table. Also, do you really want me to histogram > the operation of code that supports a histogram? That seems like overkill to me, > but if you really want it, I'm ok with adding it. > > On 2013/06/17 22:58:14, jar wrote: > > I'm not sure how long these chains are.... but are you concerned about writing > > code that does these linear searches? > > > > This seems like it would be done a lot... > > > > Could you mind histogramming how many URL compares we do here? ... and do we > do > > this for every navigation to every URL?? > This histogram will be hardly anything in performance compared with the scanning you're doing.... assuming you call it only with the sum of the counts traversed in a failure. I just want to be sure we're not blowing performance out big time... and we have a way to see what is happening. If you change to using a map (as per later comments), I wouldn't be concerned about perf here. https://codereview.chromium.org/16517002/diff/30001/chrome/browser/history/hi... File chrome/browser/history/history_backend.cc (right): https://codereview.chromium.org/16517002/diff/30001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:62: static const size_t kPageVisitStatsMaxTopSites = 50; nit: no need to for static, now that you're in the anonymous namespace. https://codereview.chromium.org/16517002/diff/30001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:87: int max_top_url_count) { Why is this now passed as an argument, rather than consistently using kPageVisitStatsMaxTopSites? Using the constant ensure that we are safe in our use of the histogram macro (args are constants, other than sample). https://codereview.chromium.org/16517002/diff/30001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:147: nit: How about coming these constants up to the top of the file in an anonymous namespace? You can then have your #ifdef android within that space, and the constants would be nicely gathered together. https://codereview.chromium.org/16517002/diff/30001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:303: &most_visited_urls_); Is this done exactly once per process run? If so, it sure seems like it would be nicer to put this into a map, and then have a quick single scan on each navigation. If this is updated constantly, as the "top URLs" change.... then it would not be such a clear win. https://codereview.chromium.org/16517002/diff/30001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:850: // Only count the page visit if it came from user browsing and only count it Please indicate this in your histograms.xml prose. https://codereview.chromium.org/16517002/diff/30001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:853: (transition & content::PAGE_TRANSITION_CHAIN_END) != 0) { It is interesting that you *only* check if we're at the end of a redirect chain... but when scanning the URL list, you appear to (wastefully??) compare with all entries in the redirect list. Why is that? Why not (for instance) only search if PAGE_TRANSITION_CHAIN_START, and then only look at the first URL (and not any redirections??
https://codereview.chromium.org/16517002/diff/30001/chrome/browser/history/hi... File chrome/browser/history/history_backend.cc (right): https://codereview.chromium.org/16517002/diff/30001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:62: static const size_t kPageVisitStatsMaxTopSites = 50; On 2013/06/18 00:51:34, jar wrote: > nit: no need to for static, now that you're in the anonymous namespace. Done. https://codereview.chromium.org/16517002/diff/30001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:87: int max_top_url_count) { On 2013/06/18 00:51:34, jar wrote: > Why is this now passed as an argument, rather than consistently using > kPageVisitStatsMaxTopSites? Using the constant ensure that we are safe in our > use of the histogram macro (args are constants, other than sample). Done. https://codereview.chromium.org/16517002/diff/30001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:147: The histogram code I've added is temporary and once it's gone, all the remaining constants will again be cleanly at the top of the file. I'd prefer not to move the previously existing constants into the anonymous namespace that I added. On 2013/06/18 00:51:34, jar wrote: > nit: How about coming these constants up to the top of the file in an anonymous > namespace? You can then have your #ifdef android within that space, and the > constants would be nicely gathered together. https://codereview.chromium.org/16517002/diff/30001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:303: &most_visited_urls_); On 2013/06/18 00:51:34, jar wrote: > Is this done exactly once per process run? If so, it sure seems like it would > be nicer to put this into a map, and then have a quick single scan on each > navigation. > > If this is updated constantly, as the "top URLs" change.... then it would not be > such a clear win. Done. https://codereview.chromium.org/16517002/diff/30001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:850: // Only count the page visit if it came from user browsing and only count it On 2013/06/18 00:51:34, jar wrote: > Please indicate this in your histograms.xml prose. > Done. https://codereview.chromium.org/16517002/diff/30001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:853: (transition & content::PAGE_TRANSITION_CHAIN_END) != 0) { Let's say the user commonly follows a redirect chain: a->b->c->d, but today follows x->a->b->c->d. If I look for x in the top sites redirect chains, I won't find it, even though x really means a visit to d, which is a top site. Therefore I use the end of the chain for comparison. And why not just look at the last entry in each top site's redirect chain? Because afaik, the top sites chains do not guarantee an order afaik. On 2013/06/18 00:51:34, jar wrote: > It is interesting that you *only* check if we're at the end of a redirect > chain... but when scanning the URL list, you appear to (wastefully??) compare > with all entries in the redirect list. Why is that? > > Why not (for instance) only search if PAGE_TRANSITION_CHAIN_START, and then only > look at the first URL (and not any redirections??
https://codereview.chromium.org/16517002/diff/30001/chrome/browser/history/hi... File chrome/browser/history/history_backend.cc (right): https://codereview.chromium.org/16517002/diff/30001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:147: On 2013/06/18 18:43:37, bengr1 wrote: > The histogram code I've added is temporary and once it's gone, all the remaining > constants will again be cleanly at the top of the file. I'd prefer not to move > the previously existing constants into the anonymous namespace that I added. > > On 2013/06/18 00:51:34, jar wrote: > > nit: How about coming these constants up to the top of the file in an > anonymous > > namespace? You can then have your #ifdef android within that space, and the > > constants would be nicely gathered together. > When editing files, it is always good to clean (mildly... *NOT* major risky refactors, etc., or drastic style changes) while changing what should be adjacent code. If you landed local functions, your code should actually be landed between lines 121 and 123 of this file, rather than above the file level comments. If you land as methods, they should be aligned with their header declarations. It is a bad idea to drop code into the "wrong place" and plan to "remove it RSN," as too often, it stays much longer than thought :-/. https://codereview.chromium.org/16517002/diff/30001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:853: (transition & content::PAGE_TRANSITION_CHAIN_END) != 0) { On 2013/06/18 18:43:37, bengr1 wrote: > Let's say the user commonly follows a redirect chain: a->b->c->d, but today > follows x->a->b->c->d. If I look for x in the top sites redirect chains, I won't > find it, even though x really means a visit to d, which is a top site. Therefore > I use the end of the chain for comparison. > > And why not just look at the last entry in each top site's redirect chain? > Because afaik, the top sites chains do not guarantee an order afaik. If there is a redirect chain... I sure hope it is consistent... and all roads lead to the final link (with no redirects). Are you suggesting that sometimes there is an intentional redirect loop that hopes across a bunch of sites, and then settles, after visiting (fetching) several/all intermediaries? That seems strange... but I'll leave it to you to judge. Thanks for your explanation. https://codereview.chromium.org/16517002/diff/37001/chrome/browser/history/hi... File chrome/browser/history/history_backend.cc (right): https://codereview.chromium.org/16517002/diff/37001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:67: std::map<GURL, int>& top_urls_map) { nit: Pass in the pointer to modify the map, rather than a mutable reference. ... or even better... as mentioned... use methods, so you don't need to pass the slot around to functions. https://codereview.chromium.org/16517002/diff/37001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:86: LOG(WARNING) << "XXX rank " << rank << " url: " << url.spec(); Generally we use DLOG, so that we don't swell up the binary with a pile of text segments and logging that is never used by folks in the field. https://codereview.chromium.org/16517002/diff/37001/chrome/browser/history/hi... File chrome/browser/history/history_backend.h (right): https://codereview.chromium.org/16517002/diff/37001/chrome/browser/history/hi... chrome/browser/history/history_backend.h:917: std::map<GURL, int> most_visited_urls_map_; nit: Making this a member is good/fine. You should make the functions that manipulate this into methods, and then you won't need to pass it in as an argument.
https://codereview.chromium.org/16517002/diff/37001/chrome/browser/history/hi... File chrome/browser/history/history_backend.cc (right): https://codereview.chromium.org/16517002/diff/37001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:67: std::map<GURL, int>& top_urls_map) { On 2013/06/19 00:49:58, jar wrote: > nit: Pass in the pointer to modify the map, rather than a mutable reference. > > ... or even better... as mentioned... use methods, so you don't need to pass the > slot around to functions. Done. https://codereview.chromium.org/16517002/diff/37001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:86: LOG(WARNING) << "XXX rank " << rank << " url: " << url.spec(); Thanks for catching this. It was some logging leftover from debugging. Removed. On 2013/06/19 00:49:58, jar wrote: > Generally we use DLOG, so that we don't swell up the binary with a pile of text > segments and logging that is never used by folks in the field. https://codereview.chromium.org/16517002/diff/37001/chrome/browser/history/hi... File chrome/browser/history/history_backend.h (right): https://codereview.chromium.org/16517002/diff/37001/chrome/browser/history/hi... chrome/browser/history/history_backend.h:917: std::map<GURL, int> most_visited_urls_map_; On 2013/06/19 00:49:58, jar wrote: > nit: Making this a member is good/fine. You should make the functions that > manipulate this into methods, and then you won't need to pass it in as an > argument. Done.
https://codereview.chromium.org/16517002/diff/30001/chrome/browser/history/hi... File chrome/browser/history/history_backend.cc (right): https://codereview.chromium.org/16517002/diff/30001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:853: (transition & content::PAGE_TRANSITION_CHAIN_END) != 0) { Redirect chains should be ordered by time, so if it's in a vector, you should be fine.
https://codereview.chromium.org/16517002/diff/30001/chrome/browser/history/hi... File chrome/browser/history/history_backend.cc (right): https://codereview.chromium.org/16517002/diff/30001/chrome/browser/history/hi... chrome/browser/history/history_backend.cc:853: (transition & content::PAGE_TRANSITION_CHAIN_END) != 0) { Good to know. I've switched over to using a map, so there aren't any wasteful comparisons. On 2013/06/19 19:10:27, brettw wrote: > Redirect chains should be ordered by time, so if it's in a vector, you should be > fine.
lgtm
brettw@, are you ok with the change?
lgtm
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/bengr@chromium.org/16517002/42001
Message was sent while issue was closed.
Change committed as 207957 |