|
|
Created:
7 years, 5 months ago by bbudge Modified:
7 years, 4 months ago CC:
chromium-reviews, Ilya Sherman, jar (doing other things), asvitkine+watch_chromium.org, erikkay Base URL:
svn://svn.chromium.org/chrome/trunk/src Visibility:
Public. |
DescriptionAdd UMA histograms for home page, start page, and default search engine
hosts, as they are set in the Preferences file.
This adds histogram enum values for non-prepopulated DSE hosts. The expanded enum is used to log Preferences settings for home page, first start page, and DSE hosts.
TBR=battre@chromium.org, meacer@chromium.org, mpearson@chromium.org, rockot@chromium.org
BUG=263234
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=214498
Patch Set 1 #Patch Set 2 : Change approach to use PrefMetricService. #Patch Set 3 : Reuse existing histogram enum for DSE hosts. #
Total comments: 1
Patch Set 4 : Address Ken's comment, guard startup page UMA on restore setting #
Total comments: 4
Patch Set 5 : Canonicalize hosts, enumerate all start pages. #
Total comments: 26
Patch Set 6 : Address Mark's comments. #Patch Set 7 : Eliminate magic number. #
Total comments: 2
Patch Set 8 : Eliminate duplicate constant. #
Total comments: 5
Patch Set 9 : Remove erroneous domain. #
Total comments: 12
Patch Set 10 : Address Mark's comments. #
Total comments: 2
Patch Set 11 : Tweak comments. #Patch Set 12 : Rebase to after Mustafa's CL. #Patch Set 13 : Redo the patch. #Patch Set 14 : Redo the patch again. #
Messages
Total messages: 25 (0 generated)
This builds on Ken's CL to collect prefs settings. https://codereview.chromium.org/19776015/ I'll rebase once that lands.
You will need chrome-privacy approval to explicitly log whether users have a relationship to items on this list of hostnames. Please CC me on the thread. thanks, mark On Tue, Jul 23, 2013 at 4:19 PM, <bbudge@chromium.org> wrote: > Reviewers: Mark P, Mustafa Emre Acer, rockot, jochen, battre, > > Message: > This builds on Ken's CL to collect prefs settings. > > https://codereview.chromium.**org/19776015/<https://codereview.chromium.org/1... > > I'll rebase once that lands. > > > > Description: > Add UMA logging for custom search providers. > When the default search engine is not in the default provider list, > log some common hosts for custom search. > > BUG=263234 > > Please review this at https://codereview.chromium.**org/20012003/<https://codereview.chromium.org/2... > > SVN Base: svn://svn.chromium.org/chrome/**trunk/src<http://svn.chromium.org/chrome/trunk/src> > > Affected files: > M chrome/browser/chrome_browser_**main.cc > A chrome/browser/prefs/pref_**metrics_service.h > A chrome/browser/prefs/pref_**metrics_service.cc > M chrome/chrome_browser.gypi > M tools/metrics/histograms/**histograms.xml > > >
This is ready for review. The effect of this is to expand the search engine sites whose hosts are logged when Chrome starts up. I left the existing DSE histogram alone, since it is collected at a different point in the startup sequence. Moving it to the PrefMetricsService would cause a discontinuity in that histogram.
LGTM minus factory dependency update https://codereview.chromium.org/20012003/diff/10001/chrome/browser/prefs/pref... File chrome/browser/prefs/pref_metrics_service.cc (right): https://codereview.chromium.org/20012003/diff/10001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:149: BrowserContextDependencyManager::GetInstance()) { Because PrefMetricsService now depends on TemplateURLService it should guarantee safe destruction order. It doesn't yet matter, but it may eventually. All this means is the factory constructor should call: DependsOn(TemplateURLServiceFactory::GetInstance());
+sky for OWNERS review of prepopulated_engines.json.
LGTM
Privacy review passed. I still need OWNERS review from battre and mpearson. PTAL
This does not work, yet. https://codereview.chromium.org/20012003/diff/21001/chrome/browser/prefs/pref... File chrome/browser/prefs/pref_metrics_service.cc (right): https://codereview.chromium.org/20012003/diff/21001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:113: int host_id = MapHostToId(host_id_map, homepage_url.host()); This does not work. You can have www.foo.com, www2.foo.com and foo.com but only foo.com is stored in the host_id_map. You can either store more URLs in the hostmap or normalize the URLs with GetDomainAndRegistry (check in line 113 but also in line 44). https://codereview.chromium.org/20012003/diff/21001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:124: if (url_list->GetSize() > 0) { I think this should be a for loop iterating over the entire list.
https://codereview.chromium.org/20012003/diff/21001/chrome/browser/prefs/pref... File chrome/browser/prefs/pref_metrics_service.cc (right): https://codereview.chromium.org/20012003/diff/21001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:113: int host_id = MapHostToId(host_id_map, homepage_url.host()); On 2013/07/26 19:27:51, battre wrote: > This does not work. You can have http://www.foo.com, http://www2.foo.com and http://foo.com but only > http://foo.com is stored in the host_id_map. > > You can either store more URLs in the hostmap or normalize the URLs with > GetDomainAndRegistry (check in line 113 but also in line 44). Changed to canonicalize hosts using GetDomainAndRegistry, both on adding to the map and on looking them up. Thanks, that's a useful function. https://codereview.chromium.org/20012003/diff/21001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:124: if (url_list->GetSize() > 0) { On 2013/07/26 19:27:51, battre wrote: > I think this should be a for loop iterating over the entire list. There was some discussion about multiple start pages. I think you're right, we want to know even if the page isn't topmost at startup. Done.
I have lots of comments. I also have a big request. How are you testing this? I'd encourage testing with about:histograms. You can muck with your preferences (via text edit or the settings page) and see what comes out. I'd especially encourage you to build a proper release build with this, get your browser hijacked, and make sure these histograms get recorded as you'd expect. If we're going to make interferences about the scope of hijacking for various domains based on this data, I'd like to have confidence that we're accurately measuring hijacking for at least some of these possible hijackers. --mark https://codereview.chromium.org/20012003/diff/26001/chrome/browser/prefs/pref... File chrome/browser/prefs/pref_metrics_service.cc (right): https://codereview.chromium.org/20012003/diff/26001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:30: std::string CanonicalizeHost(const std::string& host) { If this effectively returns the domain name, please say so and give the function a better name. Also give the typedef and its instances a better name. [note to self: If this doesn't return the domain name, I have other concerns.] https://codereview.chromium.org/20012003/diff/26001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:37: void AddHostToMap(HostIdMap* host_id_map, nit: style has input parameters before modified and output parameters. https://codereview.chromium.org/20012003/diff/26001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:43: // Builds a map that associated host name strings with histogram enum values, nit: associated -> associates https://codereview.chromium.org/20012003/diff/26001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:53: prepopulated_urls[i]->url_ref().GetHost(), Why don't you canonicalize these too? https://codereview.chromium.org/20012003/diff/26001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:70: AddHostToMap(host_id_map, "imesh.net", 114); Please do something, whether here or elsewhere, to ensure that people don't add more elements here without increasing kMaxHostHistogramValue. https://codereview.chromium.org/20012003/diff/26001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:74: AddHostToMap(host_id_map, "avg.com", 50); Introduce the 36 and 50 to the histograms.xml enum. https://codereview.chromium.org/20012003/diff/26001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:100: bool home_page_is_ntp = prefs->GetBoolean(prefs::kHomePageIsNewTabPage); nit: make both of these const https://codereview.chromium.org/20012003/diff/26001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:113: template_url_service->GetDefaultSearchProvider(); Please check the return value. Occasionally we've seen users in the wild with no default search provider. https://codereview.chromium.org/20012003/diff/26001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:127: // If page restore on startup is set (value == 4), record all startup pages. nit: "page restore on startup" almost sounds like "restore the pages I was viewing last" can you rephrase to make it more obvious that it's open these pages at startup? https://codereview.chromium.org/20012003/diff/26001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:137: UMA_HISTOGRAM_ENUMERATION("Settings.FirstStartPageURL", "FirstStartPageURL" is a wrong name for this. https://codereview.chromium.org/20012003/diff/26001/tools/metrics/histograms/... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/20012003/diff/26001/tools/metrics/histograms/... tools/metrics/histograms/histograms.xml:3881: <histogram name="HomePage.IsHomeButtonEnabled" enum="BooleanEnabled"> You do not use either this histogram or the one below in your code. Please remove them. https://codereview.chromium.org/20012003/diff/26001/tools/metrics/histograms/... tools/metrics/histograms/histograms.xml:12749: The id of the default search engine URL host that is specified in Please describe when this is emitted. (ditto other histograms below) https://codereview.chromium.org/20012003/diff/26001/tools/metrics/histograms/... tools/metrics/histograms/histograms.xml:12756: The id of the first start page URL host that is specified in Preferences. Please revise this comment; it no longer matches the code.
As for testing, I've been changing my settings via Chrome's settings page, then printf-debugging. Thanks for the tip about about:histograms. I'll try to actually get a hijacked Chrome set up on a Windows machine and make sure it's working correctly. https://codereview.chromium.org/20012003/diff/26001/chrome/browser/prefs/pref... File chrome/browser/prefs/pref_metrics_service.cc (right): https://codereview.chromium.org/20012003/diff/26001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:30: std::string CanonicalizeHost(const std::string& host) { On 2013/07/26 23:30:23, Mark P wrote: > If this effectively returns the domain name, please say so and give the function > a better name. Also give the typedef and its instances a better name. > > [note to self: If this doesn't return the domain name, I have other concerns.] Done. HostIdMap -> DomainIdMap CanonicalizeHost -> GetDomain AddHostToMap -> AddDomain plus assorted local variable renamings to be consistent https://codereview.chromium.org/20012003/diff/26001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:37: void AddHostToMap(HostIdMap* host_id_map, On 2013/07/26 23:30:23, Mark P wrote: > nit: style has input parameters before modified and output parameters. Done. https://codereview.chromium.org/20012003/diff/26001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:43: // Builds a map that associated host name strings with histogram enum values, On 2013/07/26 23:30:23, Mark P wrote: > nit: associated -> associates Done. https://codereview.chromium.org/20012003/diff/26001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:53: prepopulated_urls[i]->url_ref().GetHost(), On 2013/07/26 23:30:23, Mark P wrote: > Why don't you canonicalize these too? The AddHostToMap did the conversion. It is now done here and with the name changes it should be clearer. https://codereview.chromium.org/20012003/diff/26001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:70: AddHostToMap(host_id_map, "imesh.net", 114); On 2013/07/26 23:30:23, Mark P wrote: > Please do something, whether here or elsewhere, to ensure that people don't add > more elements here without increasing kMaxHostHistogramValue. Done. Added a comment. https://codereview.chromium.org/20012003/diff/26001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:74: AddHostToMap(host_id_map, "avg.com", 50); On 2013/07/26 23:30:23, Mark P wrote: > Introduce the 36 and 50 to the histograms.xml enum. Good catch. Done. https://codereview.chromium.org/20012003/diff/26001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:100: bool home_page_is_ntp = prefs->GetBoolean(prefs::kHomePageIsNewTabPage); On 2013/07/26 23:30:23, Mark P wrote: > nit: make both of these const Done. https://codereview.chromium.org/20012003/diff/26001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:113: template_url_service->GetDefaultSearchProvider(); On 2013/07/26 23:30:23, Mark P wrote: > Please check the return value. Occasionally we've seen users in the wild with > no default search provider. Done. https://codereview.chromium.org/20012003/diff/26001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:127: // If page restore on startup is set (value == 4), record all startup pages. On 2013/07/26 23:30:23, Mark P wrote: > nit: "page restore on startup" almost sounds like "restore the pages I was > viewing last" > can you rephrase to make it more obvious that it's open these pages at startup? Done. https://codereview.chromium.org/20012003/diff/26001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:137: UMA_HISTOGRAM_ENUMERATION("Settings.FirstStartPageURL", On 2013/07/26 23:30:23, Mark P wrote: > "FirstStartPageURL" is a wrong name for this. Done. https://codereview.chromium.org/20012003/diff/26001/tools/metrics/histograms/... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/20012003/diff/26001/tools/metrics/histograms/... tools/metrics/histograms/histograms.xml:3881: <histogram name="HomePage.IsHomeButtonEnabled" enum="BooleanEnabled"> Leftover from a previous version where I based off another patch. Done. https://codereview.chromium.org/20012003/diff/26001/tools/metrics/histograms/... tools/metrics/histograms/histograms.xml:12749: The id of the default search engine URL host that is specified in On 2013/07/26 23:30:23, Mark P wrote: > Please describe when this is emitted. > > (ditto other histograms below) Done. https://codereview.chromium.org/20012003/diff/26001/tools/metrics/histograms/... tools/metrics/histograms/histograms.xml:12756: The id of the first start page URL host that is specified in Preferences. On 2013/07/26 23:30:23, Mark P wrote: > Please revise this comment; it no longer matches the code. Done.
https://codereview.chromium.org/20012003/diff/40001/chrome/browser/prefs/pref... File chrome/browser/prefs/pref_metrics_service.cc (right): https://codereview.chromium.org/20012003/diff/40001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:27: static const int kMaxHostHistogramValue = 114; What happens if people add a new ID in the prepopulated_engines.json file? Do you want to add a COMPILE_ASSERT((TemplateURLPrepopulateData::kMaxPrepopulatedEngineID == kMaxHostHistogramValue, kMaxPrepopulatedEngineID_must_equal_kMaxHostHistogramValue)?
https://codereview.chromium.org/20012003/diff/40001/chrome/browser/prefs/pref... File chrome/browser/prefs/pref_metrics_service.cc (right): https://codereview.chromium.org/20012003/diff/40001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:27: static const int kMaxHostHistogramValue = 114; On 2013/07/28 07:57:13, battre wrote: > What happens if people add a new ID in the prepopulated_engines.json file? Do > you want to add a > COMPILE_ASSERT((TemplateURLPrepopulateData::kMaxPrepopulatedEngineID == > kMaxHostHistogramValue, > kMaxPrepopulatedEngineID_must_equal_kMaxHostHistogramValue)? Good point. Instead, I just use this constant and update my comments.
https://codereview.chromium.org/20012003/diff/44001/chrome/browser/prefs/pref... File chrome/browser/prefs/pref_metrics_service.cc (right): https://codereview.chromium.org/20012003/diff/44001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:62: AddDomain("deltasearch.com", 104, domain_id_map); whoops, need to remove this.
lgtm
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/bbudge@chromium.org/20012003/51001
Mark, if this looks OK, I need an OWNERS review for histograms.xml and prepopulated_engines.json (I'll likely TBR that one.)
https://codereview.chromium.org/20012003/diff/44001/chrome/browser/prefs/pref... File chrome/browser/prefs/pref_metrics_service.cc (right): https://codereview.chromium.org/20012003/diff/44001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:50: std::string domain(GetDomain(prepopulated_urls[i]->url_ref().GetHost())); nit: This intermediate variable doesn't buy you anything useful. https://codereview.chromium.org/20012003/diff/44001/tools/metrics/histograms/... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/20012003/diff/44001/tools/metrics/histograms/... tools/metrics/histograms/histograms.xml:12735: The id of the default search engine provider that is specified in user This is not the id of the default search engine provider. It's the id of the domain of the default search engine provider. There's a substantial difference (some percent of our users, for instance, have a google domain as their default search provider but not the pre-populated google ID search engine). https://codereview.chromium.org/20012003/diff/51001/chrome/browser/prefs/pref... File chrome/browser/prefs/pref_metrics_service.cc (right): https://codereview.chromium.org/20012003/diff/51001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:25: typedef std::map<std::string, int> DomainIdMap; nit: please comment me; what does the int mean / where do the ids come from? https://codereview.chromium.org/20012003/diff/51001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:29: return std::string( Why do you need this std::string()? https://codereview.chromium.org/20012003/diff/51001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:46: size_t default_search_index; nit: // unused https://codereview.chromium.org/20012003/diff/51001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:58: // the list of available ids in the prepopulated_engines.json file. This last sentence is bad advice. We shouldn't restore enums to the range--that would imply reuse of the particular enums, which can be misleading about what domain name a particular bucket in an upload means. https://codereview.chromium.org/20012003/diff/51001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:86: DomainIdMap::const_iterator it = domain_id_map.find(domain); nit: just call GetDomain directly. https://codereview.chromium.org/20012003/diff/51001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:120: int domain_id = const or just put it on the UMA_HISTOGRAM line directly
Thanks for all the reviewing. PTAL https://codereview.chromium.org/20012003/diff/44001/chrome/browser/prefs/pref... File chrome/browser/prefs/pref_metrics_service.cc (right): https://codereview.chromium.org/20012003/diff/44001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:50: std::string domain(GetDomain(prepopulated_urls[i]->url_ref().GetHost())); On 2013/07/30 15:32:56, Mark P wrote: > nit: This intermediate variable doesn't buy you anything useful. Done. https://codereview.chromium.org/20012003/diff/44001/tools/metrics/histograms/... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/20012003/diff/44001/tools/metrics/histograms/... tools/metrics/histograms/histograms.xml:12735: The id of the default search engine provider that is specified in user On 2013/07/30 15:32:56, Mark P wrote: > This is not the id of the default search engine provider. It's the id of the > domain of the default search engine provider. There's a substantial difference > (some percent of our users, for instance, have a google domain as their default > search provider but not the pre-populated google ID search engine). Fixed the comment. https://codereview.chromium.org/20012003/diff/51001/chrome/browser/prefs/pref... File chrome/browser/prefs/pref_metrics_service.cc (right): https://codereview.chromium.org/20012003/diff/51001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:25: typedef std::map<std::string, int> DomainIdMap; On 2013/07/30 15:32:56, Mark P wrote: > nit: please comment me; what does the int mean / where do the ids come from? Done. https://codereview.chromium.org/20012003/diff/51001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:29: return std::string( On 2013/07/30 15:32:56, Mark P wrote: > Why do you need this std::string()? Right, I don't need it. Done. https://codereview.chromium.org/20012003/diff/51001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:46: size_t default_search_index; On 2013/07/30 15:32:56, Mark P wrote: > nit: // unused Done. https://codereview.chromium.org/20012003/diff/51001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:58: // the list of available ids in the prepopulated_engines.json file. On 2013/07/30 15:32:56, Mark P wrote: > This last sentence is bad advice. We shouldn't restore enums to the range--that > would imply reuse of the particular enums, which can be misleading about what > domain name a particular bucket in an upload means. Yes, I suppose ids are forever. Removed comment. https://codereview.chromium.org/20012003/diff/51001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:86: DomainIdMap::const_iterator it = domain_id_map.find(domain); On 2013/07/30 15:32:56, Mark P wrote: > nit: just call GetDomain directly. Done. https://codereview.chromium.org/20012003/diff/51001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:120: int domain_id = On 2013/07/30 15:32:56, Mark P wrote: > const > or > just put it on the UMA_HISTOGRAM line directly Made these const. Done.
lgtm obviously you may need to serialize this change and the other so they patch/apply properly. --mark
if you haven't submitted this yet, you might as well make this below change in any case, still lgtm --mark https://codereview.chromium.org/20012003/diff/67001/chrome/browser/prefs/pref... File chrome/browser/prefs/pref_metrics_service.cc (right): https://codereview.chromium.org/20012003/diff/67001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:26: // in tools/metrics/histograms/histograms.xml. and (often) also prepopulated_...
Tweaked a few comments. Mustafa will land his CL first, then I'll do the merge. https://codereview.chromium.org/20012003/diff/67001/chrome/browser/prefs/pref... File chrome/browser/prefs/pref_metrics_service.cc (right): https://codereview.chromium.org/20012003/diff/67001/chrome/browser/prefs/pref... chrome/browser/prefs/pref_metrics_service.cc:26: // in tools/metrics/histograms/histograms.xml. On 2013/07/30 19:02:38, Mark P wrote: > and (often) also prepopulated_... Done.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/bbudge@chromium.org/20012003/100001
Message was sent while issue was closed.
Committed patchset #14 manually as r214498 (presubmit successful).
Message was sent while issue was closed.
Failed to trigger a try job on mac_rel HTTP Error 400: Bad Request |