|
|
Created:
4 years, 11 months ago by msramek Modified:
4 years, 10 months ago CC:
asanka, cbentzel+watch_chromium.org, chromium-reviews, darin-cc_chromium.org, jam, markusheintz_ Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionAdd an OriginFilterBuilder class for [white|black]listing origins.
We already have at least two different usecases for origin-based browsing data deletion - by whitelist (delete the data of one or more particular sites) or blacklist (clear browsing data, but protect a certain set of sites that are deemed valuable). Furthermore, the deletion might be done on a particular origin, or including subdomains (could be decided on a per-backend basis, settable by an advanced user, or a product decision).
To avoid plumbing this functionality separately into all backends, we instead pass a GURL->bool predicate to individual backends that they can use as a blackbox to evaluate whether a certain URL should be deleted. We introduce an OriginFilterBuilder class that constructs such predicates.
To illustrate the usage of the new API, we also modify the origin-based deletion in DownloadManager.
See https://docs.google.com/document/d/12UeAMGOk9MDFwrT9kfNldL4qnUIfnIdVbQSk-hb-mWg/ for more details.
BUG=578162
Committed: https://crrev.com/d9c162ad0a240aa967c247f5be258056d6a4efbf
Cr-Commit-Position: refs/heads/master@{#376967}
Patch Set 1 : #
Total comments: 6
Patch Set 2 : Move to browsing_data/, pass around callback #Patch Set 3 : Rebase. #Patch Set 4 : Fixed content_unittests. #
Total comments: 26
Patch Set 5 : Builder pattern #
Total comments: 6
Patch Set 6 : Empty filter. #
Total comments: 21
Patch Set 7 : Lifetime management. #
Total comments: 4
Patch Set 8 : DCHECK(not many origins) #
Total comments: 3
Patch Set 9 : History TODO. #Patch Set 10 : Rebase over 1246583002 #Messages
Total messages: 49 (20 generated)
Description was changed from ========== Add an OriginFilter class for [white|black]listing origins. We already have at least two different usecases for origin-based browsing data deletion - by whitelist (delete the data of one or more particular sites) or blacklist (clear browsing data, but protect a certain set of sites that are deemed valuable). Furthermore, the deletion might be done on a particular origin, or including subdomains (could be decided on a per-backend basis, settable by an advanced user, or a product decision). To avoid plumbing this functionality separately into all backends, we create this filter class that encapsulates it, and can then be passed as a blackbox to individual backends. To illustrate the usage of the new API, we also modify the origin-based deletion in DownloadManager. Regarding the placement of the class in the code: It would make sense to place this class in url/, as it can be viewed as a convenience container of url::Origin. However, to understand what is a subdomain, we need to use net/, hence the placement in net/. As the data storage backends used by BrowsingDataRemover are spread in different parts of code, it is impractical to use higher layers, such as components/. BUG=578162 ========== to ========== Add an OriginFilter class for [white|black]listing origins. We already have at least two different usecases for origin-based browsing data deletion - by whitelist (delete the data of one or more particular sites) or blacklist (clear browsing data, but protect a certain set of sites that are deemed valuable). Furthermore, the deletion might be done on a particular origin, or including subdomains (could be decided on a per-backend basis, settable by an advanced user, or a product decision). To avoid plumbing this functionality separately into all backends, we create this filter class that encapsulates it, and can then be passed as a blackbox to individual backends. To illustrate the usage of the new API, we also modify the origin-based deletion in DownloadManager. BUG=578162 ==========
Patchset #1 (id:1) has been deleted
Patchset #1 (id:20001) has been deleted
Patchset #1 (id:40001) has been deleted
Description was changed from ========== Add an OriginFilter class for [white|black]listing origins. We already have at least two different usecases for origin-based browsing data deletion - by whitelist (delete the data of one or more particular sites) or blacklist (clear browsing data, but protect a certain set of sites that are deemed valuable). Furthermore, the deletion might be done on a particular origin, or including subdomains (could be decided on a per-backend basis, settable by an advanced user, or a product decision). To avoid plumbing this functionality separately into all backends, we create this filter class that encapsulates it, and can then be passed as a blackbox to individual backends. To illustrate the usage of the new API, we also modify the origin-based deletion in DownloadManager. BUG=578162 ========== to ========== Add an OriginFilter class for [white|black]listing origins. We already have at least two different usecases for origin-based browsing data deletion - by whitelist (delete the data of one or more particular sites) or blacklist (clear browsing data, but protect a certain set of sites that are deemed valuable). Furthermore, the deletion might be done on a particular origin, or including subdomains (could be decided on a per-backend basis, settable by an advanced user, or a product decision). To avoid plumbing this functionality separately into all backends, we create this filter class that encapsulates it, and can then be passed as a blackbox to individual backends. To illustrate the usage of the new API, we also modify the origin-based deletion in DownloadManager. A note to the placement of the class: As BrowsingDataRemover has ~17 different backends, this class is going to be reused all over Chrome. Notably the need to use it in net/ (for cache deletion) and content/browser (for download manager) necessitate placement in the lowest layers of code, such as url/. This makes sense if we regard OriginFilter as a "smart container" of Origin, which also lives in url/. See https://docs.google.com/document/d/12UeAMGOk9MDFwrT9kfNldL4qnUIfnIdVbQSk-hb-mWg/ for more details. BUG=578162 ==========
Patchset #1 (id:60001) has been deleted
Patchset #1 (id:80001) has been deleted
Patchset #1 (id:100001) has been deleted
Patchset #1 (id:120001) has been deleted
Patchset #1 (id:140001) has been deleted
Patchset #1 (id:160001) has been deleted
Patchset #1 (id:180001) has been deleted
msramek@chromium.org changed reviewers: + brettw@chromium.org, dmurph@chromium.org, mkwst@chromium.org
Hello guys, can you have a look? Mike and Daniel - this is the implementation of the concept I described in the doc that I shared with you (linked in the CL description). Brett, please review as an url/ owner (and since the rest is mostly syntactical changes, perhaps also as a / owner). Thanks, Martin
On 2016/02/01 at 16:37:26, msramek wrote: > Hello guys, > > can you have a look? > > Mike and Daniel - this is the implementation of the concept I described in the doc that I shared with you (linked in the CL description). > > Brett, please review as an url/ owner (and since the rest is mostly syntactical changes, perhaps also as a / owner). > > Thanks, > Martin Question: Does this mean that we still need to implement the filter in the rest of the backends? For example, we would want the filter to apply here: https://code.google.com/p/chromium/codesearch#chromium/src/chrome/browser/bro...
I left a comment in the doc. It seems better to me if we can make this a filter function that is run on the backends rather than a new class with a specific implementation in one of the most low-level directories. It would seem preferable for the data clearing service be able to define its own matching. https://codereview.chromium.org/1603903002/diff/200001/url/origin_filter.cc File url/origin_filter.cc (right): https://codereview.chromium.org/1603903002/diff/200001/url/origin_filter.cc#n... url/origin_filter.cc:39: bool OriginFilter::MatchesURLWithSubdomains(const GURL& url) const { I don't understand this function. Why would the caller be responsible for determining URL vs. subdomain rather than the filter creator? Also, I think if you want to expose a "delete this domain" you want to use the effective TLD service. Probably that should replace the code you have here. https://codereview.chromium.org/1603903002/diff/200001/url/origin_filter.cc#n... url/origin_filter.cc:83: for (std::set<Origin>::const_iterator our_origin = origin_list_.begin(); for (const Origin& : origin_list) ... https://codereview.chromium.org/1603903002/diff/200001/url/origin_filter.h File url/origin_filter.h (right): https://codereview.chromium.org/1603903002/diff/200001/url/origin_filter.h#ne... url/origin_filter.h:43: bool operator==(const OriginFilter& other) const; Why do we need to do this comparison?
I didn't have time to get to this CL today, so just to respond. Brett - thanks, that's a good suggestion. I'll move the logic elsewhere (probably just to browsing_data/) and then pass a callback function to the backends. I was trying to avoid putting this into low layers of the code, but this didn't come to my mind. Since this is no longer about changes to url/, feel free to remove yourself as a reviewer (but feel free to stay as well :) ). Daniel - yes. I chose DownloadManager as an example to illustrate how OriginFilter would work, because there the change was simple - DownloadManager is one of the few backends that currently have logic for origin-based deletion. We will have to implement it for the others.
+ttr314@googlemail.com I'm CC-ing you so that we can coordinate with your work on crbug.com/113968. I'm afraid that I cannot give you a direct access to the doc, as it also contains some internal discussions. I should have separated that :/ Nevertheless, the doc and this CL are about the design from the side of BrowsingDataRemover. From the perspective of the cookies backend, as a client, we only have to make sure to agree on an interface. brettw@ suggested that instead of the OriginFilter class, as you can see here, a predicate of the form "base::Callback<bool(const GURL&)>" will be passed around. I'll move with that proposition today, so that I don't block you. So, please use "predicate(url)" to test whether a url should be deleted (instead of passing GURL or url::Origin directly). Assume that the predicate might be true for any set of URLs (not just one, not just finite).
On 2016/02/05 09:42:34, msramek wrote: > mailto:+ttr314@googlemail.com > > I'm CC-ing you so that we can coordinate with your work on crbug.com/113968. > > I'm afraid that I cannot give you a direct access to the doc, as it also > contains some internal discussions. I should have separated that :/ > > Nevertheless, the doc and this CL are about the design from the side of > BrowsingDataRemover. From the perspective of the cookies backend, as a client, > we only have to make sure to agree on an interface. > > brettw@ suggested that instead of the OriginFilter class, as you can see here, a > predicate of the form "base::Callback<bool(const GURL&)>" will be passed around. > I'll move with that proposition today, so that I don't block you. > > So, please use "predicate(url)" to test whether a url should be deleted (instead > of passing GURL or url::Origin directly). Assume that the predicate might be > true for any set of URLs (not just one, not just finite). msramek@, thanks for putting me in the loop. I'm happy to see you working on this feature as it will be very useful for crbug.com/78093 which I intend to implement at some point in the future. Will get in touch with you via mail later for some additional questions beyond the technical details. Note that I just landed https://codereview.chromium.org/1651913003/ this morning, so you'll probably have to rebase and resolve some minor conflicts -- sorry about that. Shouldn't be too tricky though.
Please have another look! I'm now passing around const base::Callback<bool(const GURL&)>&, as per Brett's suggestion. This approach really is more flexible - from the perspective of data storage backends, this is now a general url->boolean predicate that could theoretically not be an equivalence class on origins, and thus I renamed the method parameters to url_filter instead of origin_filter. To clean up the nomenclature, I renamed OriginFilter to OriginFilterBuilder, since it builds these |url_filter|s on the basis of origin. Timo - thanks for the heads-up. The rebase was not painful, we just ran into a little duplication in the BrowsingDataRemover unittest, as you've just added origin-based deletion to passwords :) https://codereview.chromium.org/1603903002/diff/200001/url/origin_filter.cc File url/origin_filter.cc (right): https://codereview.chromium.org/1603903002/diff/200001/url/origin_filter.cc#n... url/origin_filter.cc:39: bool OriginFilter::MatchesURLWithSubdomains(const GURL& url) const { On 2016/02/01 22:48:25, brettw wrote: > I don't understand this function. Why would the caller be responsible for > determining URL vs. subdomain rather than the filter creator? > > Also, I think if you want to expose a "delete this domain" you want to use the > effective TLD service. Probably that should replace the code you have here. If the decision between exact origin / subdomains is determined on a per-backend basis, then it's the same thing. If it's not, we can have a wrapper that decides between the two methods e.g. according to user preference (I do plan to extend this class, as I currently don't have insight to special cases awaiting in various backends, nor product decisions of possible clients). But since we now only pass around callback and this class is only used within BrowsingDataRemover, it now is the creator who determines that. I thought about using eTLD+1 for this case, but I don't think it's the right choice. I think it's reasonable to want to delete 'plus.google.com' data without deleting 'google.com'. https://codereview.chromium.org/1603903002/diff/200001/url/origin_filter.cc#n... url/origin_filter.cc:83: for (std::set<Origin>::const_iterator our_origin = origin_list_.begin(); On 2016/02/01 22:48:25, brettw wrote: > for (const Origin& : origin_list) ... I removed operator==, as it's not needed anymore. https://codereview.chromium.org/1603903002/diff/200001/url/origin_filter.h File url/origin_filter.h (right): https://codereview.chromium.org/1603903002/diff/200001/url/origin_filter.h#ne... url/origin_filter.h:43: bool operator==(const OriginFilter& other) const; On 2016/02/01 22:48:25, brettw wrote: > Why do we need to do this comparison? It was used in a gmock matcher. The test that used it now changed, so I removed it.
ttr314@googlemail.com changed reviewers: + ttr314@googlemail.com
Just my 2 cents. :-) https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... File chrome/browser/browsing_data/browsing_data_remover_unittest.cc (right): https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... chrome/browser/browsing_data/browsing_data_remover_unittest.cc:267: // TODO(msramek): This is only used for backends that currently response. nit-pick: Sentence incomplete/grammar? I'm not entirely sure what you mean by "that currently response". https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... chrome/browser/browsing_data/browsing_data_remover_unittest.cc:301: // its own OriginFilterBuilder. If an instance of OriginFilter was passed We don't have OriginFilter as a class anymore, do we? https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... chrome/browser/browsing_data/browsing_data_remover_unittest.cc:302: // to BrowsingDataRemover (what should evenetually be the case), we can Nit-pick: Typo (evenetually -> eventually). https://codereview.chromium.org/1603903002/diff/260001/content/browser/downlo... File content/browser/download/download_manager_impl.h (right): https://codereview.chromium.org/1603903002/diff/260001/content/browser/downlo... content/browser/download/download_manager_impl.h:77: int RemoveDownloadsByOriginAndTime( I think we should consider renaming this method and related test names now that we are passing a generic URL filter as we're not limited to just origins anymore. Not sure what would be a good name though. RemoveDownloadsByPredicateAndTime is probably correct but not perfectly concise.
https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... File chrome/browser/browsing_data/origin_filter_builder.cc (right): https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:52: bool OriginFilterBuilder::MatchesURLWithSubdomains(const GURL& url) const { Since you have an "Empty" pattern that matches everything, you should probably optimize this and return false if there are no origins so you don't do a lot of work taking apart every URL for doing just time-based deletions. Same with MatchesURL above (although the problem is less there). https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:59: std::string host = url.host(); Instead here: base::StringPiece host_piece = url.host_piece(); for (size_t i = 0; i... https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:66: url.scheme(), url.scheme_piece() https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:68: url.EffectiveIntPort()); Can you remove extracting the port from the loop so we don't call it every time? https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:94: LOG(WARNING) << "Invalid origin passed into OriginFilter."; Can this be a NOTREACHED instead? THat will be debug only. I don't think having this in release build gives much value, and it adds >100 bytes. https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:105: matches_url_callback_ = base::Bind( I think it's sufficient (and less code) to just make these callbacks on demand from the getter and return them directly rather than keeping them on the class. https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... File chrome/browser/browsing_data/origin_filter_builder.h (right): https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.h:21: static scoped_ptr<OriginFilterBuilder> AsWhitelist( This API seems unusual to me. It forces heap allocation which makes everything less convenient. I would have probably done an enum for the constructor which lets you creat on the stack but makes things explicit: OriginFilterBuilder filter(OriginFilterBuilder::WHITELIST); If the caller wants a heap allocation, then can always do it themselves. Also, with the "builder" pattern, I might have expected the ability to add things to the "builder" to make the pattern. Otherwise, this is just a "filter". Depending on how you expect this to be used, it could be more convenient either way (if you don't already have the URLs in a vector, making a vector or origins to give to the "builder" seems redundant). Actually, I now see the code in the browsing data remover and what I'm calling the "builder" mode looks like it would work better. https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.h:34: const base::Callback<bool(const GURL& url)>& GetSubdomainFilter(); Did you pick "dubdomain" over "domain" for a reason? Just "domain" is clearer to me, as the matcher might match a top level domain and there's nothing "sub" about it. https://codereview.chromium.org/1603903002/diff/260001/content/browser/downlo... File content/browser/download/download_manager_impl_unittest.cc (right): https://codereview.chromium.org/1603903002/diff/260001/content/browser/downlo... content/browser/download/download_manager_impl_unittest.cc:714: } Blank line before here (and after the namespace opening), and a " // namespace" annotation at the end.
Patchset #5 (id:280001) has been deleted
https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... File chrome/browser/browsing_data/browsing_data_remover_unittest.cc (right): https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... chrome/browser/browsing_data/browsing_data_remover_unittest.cc:267: // TODO(msramek): This is only used for backends that currently response. On 2016/02/05 19:01:24, Timo Reimann wrote: > nit-pick: Sentence incomplete/grammar? I'm not entirely sure what you mean by > "that currently response". Done. Sorry, I must have been interrupted while writing this comment. https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... chrome/browser/browsing_data/browsing_data_remover_unittest.cc:301: // its own OriginFilterBuilder. If an instance of OriginFilter was passed On 2016/02/05 19:01:24, Timo Reimann wrote: > We don't have OriginFilter as a class anymore, do we? Done. Nope, this is supposed to refer to url filters now. https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... chrome/browser/browsing_data/browsing_data_remover_unittest.cc:302: // to BrowsingDataRemover (what should evenetually be the case), we can On 2016/02/05 19:01:24, Timo Reimann wrote: > Nit-pick: Typo (evenetually -> eventually). Done. https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... File chrome/browser/browsing_data/origin_filter_builder.cc (right): https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:52: bool OriginFilterBuilder::MatchesURLWithSubdomains(const GURL& url) const { On 2016/02/05 21:26:04, brettw wrote: > Since you have an "Empty" pattern that matches everything, you should probably > optimize this and return false if there are no origins so you don't do a lot of > work taking apart every URL for doing just time-based deletions. Same with > MatchesURL above (although the problem is less there). Done. Good point, especially since we expect most calls to be just plain old unfiltered data deletions. https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:59: std::string host = url.host(); On 2016/02/05 21:26:04, brettw wrote: > Instead here: > base::StringPiece host_piece = url.host_piece(); > for (size_t i = 0; i... Done. https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:66: url.scheme(), On 2016/02/05 21:26:04, brettw wrote: > url.scheme_piece() Done. https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:68: url.EffectiveIntPort()); On 2016/02/05 21:26:04, brettw wrote: > Can you remove extracting the port from the loop so we don't call it every time? Done. https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:94: LOG(WARNING) << "Invalid origin passed into OriginFilter."; On 2016/02/05 21:26:04, brettw wrote: > Can this be a NOTREACHED instead? THat will be debug only. I don't think having > this in release build gives much value, and it adds >100 bytes. Done. One reason why this wasn't a NOTREACHED originally was that I used some unique origins in the operator== test, but that one doesn't exist anymore. https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:105: matches_url_callback_ = base::Bind( On 2016/02/05 21:26:04, brettw wrote: > I think it's sufficient (and less code) to just make these callbacks on demand > from the getter and return them directly rather than keeping them on the class. Done. I think my intention was to make it possible to compare filters in tests (see my comments in BrowsingDataRemoverTest - if the test and the class itself use the same instance of OriginFilterBuilder, and they reference the very same callback, then we're able to just compare the callbacks with operator==. Otherwise, it's difficult to test whether they're equivalent. But the former assumption is currently not true, so it doesn't matter for now. https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... File chrome/browser/browsing_data/origin_filter_builder.h (right): https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.h:21: static scoped_ptr<OriginFilterBuilder> AsWhitelist( On 2016/02/05 21:26:04, brettw wrote: > This API seems unusual to me. It forces heap allocation which makes everything > less convenient. I would have probably done an enum for the constructor which > lets you creat on the stack but makes things explicit: > > OriginFilterBuilder filter(OriginFilterBuilder::WHITELIST); > > If the caller wants a heap allocation, then can always do it themselves. > > Also, with the "builder" pattern, I might have expected the ability to add > things to the "builder" to make the pattern. Otherwise, this is just a "filter". > Depending on how you expect this to be used, it could be more convenient either > way (if you don't already have the URLs in a vector, making a vector or origins > to give to the "builder" seems redundant). > > Actually, I now see the code in the browsing data remover and what I'm calling > the "builder" mode looks like it would work better. Done. Agreed, the builder model is quite convenient here. https://codereview.chromium.org/1603903002/diff/260001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.h:34: const base::Callback<bool(const GURL& url)>& GetSubdomainFilter(); On 2016/02/05 21:26:04, brettw wrote: > Did you pick "dubdomain" over "domain" for a reason? Just "domain" is clearer to > me, as the matcher might match a top level domain and there's nothing "sub" > about it. Done. Ok, let's try using "domain"; I hope it won't be confusing in the other direction (that it does not match subdomains). My interpretation was the typical mathematical way - the relations of subset, substring etc. are reflexive. According to the spec, "A domain is a subdomain of another domain if it is contained within that domain. This relationship can be tested by seeing if the subdomain's name ends with the containing domain's name." it sounds to me that every domain is also a subdomain of itself. https://codereview.chromium.org/1603903002/diff/260001/content/browser/downlo... File content/browser/download/download_manager_impl.h (right): https://codereview.chromium.org/1603903002/diff/260001/content/browser/downlo... content/browser/download/download_manager_impl.h:77: int RemoveDownloadsByOriginAndTime( On 2016/02/05 19:01:24, Timo Reimann wrote: > I think we should consider renaming this method and related test names now that > we are passing a generic URL filter as we're not limited to just origins > anymore. > > Not sure what would be a good name though. RemoveDownloadsByPredicateAndTime is > probably correct but not perfectly concise. Done. I replaced Origin with URL, as the filter technically doesn't have to be scoped to origins. I think we don't need to spell out that the URLs are passed in in a predicate, as that's visible from the method parameters. https://codereview.chromium.org/1603903002/diff/260001/content/browser/downlo... File content/browser/download/download_manager_impl_unittest.cc (right): https://codereview.chromium.org/1603903002/diff/260001/content/browser/downlo... content/browser/download/download_manager_impl_unittest.cc:714: } On 2016/02/05 21:26:04, brettw wrote: > Blank line before here (and after the namespace opening), and a " // namespace" > annotation at the end. Done. Also, renamed this and the test to mention URL instead of origin, since the interface now changed, and download manager is able to delete by URL.
Description was changed from ========== Add an OriginFilter class for [white|black]listing origins. We already have at least two different usecases for origin-based browsing data deletion - by whitelist (delete the data of one or more particular sites) or blacklist (clear browsing data, but protect a certain set of sites that are deemed valuable). Furthermore, the deletion might be done on a particular origin, or including subdomains (could be decided on a per-backend basis, settable by an advanced user, or a product decision). To avoid plumbing this functionality separately into all backends, we create this filter class that encapsulates it, and can then be passed as a blackbox to individual backends. To illustrate the usage of the new API, we also modify the origin-based deletion in DownloadManager. A note to the placement of the class: As BrowsingDataRemover has ~17 different backends, this class is going to be reused all over Chrome. Notably the need to use it in net/ (for cache deletion) and content/browser (for download manager) necessitate placement in the lowest layers of code, such as url/. This makes sense if we regard OriginFilter as a "smart container" of Origin, which also lives in url/. See https://docs.google.com/document/d/12UeAMGOk9MDFwrT9kfNldL4qnUIfnIdVbQSk-hb-mWg/ for more details. BUG=578162 ========== to ========== Add an OriginFilterBuilder class for [white|black]listing origins. We already have at least two different usecases for origin-based browsing data deletion - by whitelist (delete the data of one or more particular sites) or blacklist (clear browsing data, but protect a certain set of sites that are deemed valuable). Furthermore, the deletion might be done on a particular origin, or including subdomains (could be decided on a per-backend basis, settable by an advanced user, or a product decision). To avoid plumbing this functionality separately into all backends, we instead pass a GURL->bool predicate to individual backends that they can use as a blackbox to evaluate whether a certain URL should be deleted. We introduce an OriginFilterBuilder class that constructs such predicates. To illustrate the usage of the new API, we also modify the origin-based deletion in DownloadManager. See https://docs.google.com/document/d/12UeAMGOk9MDFwrT9kfNldL4qnUIfnIdVbQSk-hb-mWg/ for more details. BUG=578162 ==========
Sorry for the late second round -- I noticed another two smallish things only after rebasing my branch on top of this issue and working more closely with the code. Hope that's okay. https://codereview.chromium.org/1603903002/diff/300001/chrome/browser/browsin... File chrome/browser/browsing_data/browsing_data_remover_unittest.cc (right): https://codereview.chromium.org/1603903002/diff/300001/chrome/browser/browsin... chrome/browser/browsing_data/browsing_data_remover_unittest.cc:2227: base::Callback<bool(const GURL&)> filter = builder.BuildSameOriginFilter(); It takes a bit of cognitive effort to understand that we're using an empty, black-listed origin filter in order to match URLs unconditionally. We could be more clear about our intention by passing a simple always-true filter instead, just like the EmptyFilter defined in download_manager_impl.cc. It might even be worth to move that into OriginFilterBuilder so that it can be reused. https://codereview.chromium.org/1603903002/diff/300001/chrome/browser/browsin... File chrome/browser/browsing_data/origin_filter_builder.cc (right): https://codereview.chromium.org/1603903002/diff/300001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:24: return this; This and other usages of "if (condition) { NOTREACHED(); return this; }" below could be replaced by DCHECK(!condition) for increased conciseness.
https://codereview.chromium.org/1603903002/diff/300001/chrome/browser/browsin... File chrome/browser/browsing_data/browsing_data_remover_unittest.cc (right): https://codereview.chromium.org/1603903002/diff/300001/chrome/browser/browsin... chrome/browser/browsing_data/browsing_data_remover_unittest.cc:2227: base::Callback<bool(const GURL&)> filter = builder.BuildSameOriginFilter(); On 2016/02/11 22:46:43, Timo Reimann wrote: > It takes a bit of cognitive effort to understand that we're using an empty, > black-listed origin filter in order to match URLs unconditionally. We could be > more clear about our intention by passing a simple always-true filter instead, > just like the EmptyFilter defined in download_manager_impl.cc. > > It might even be worth to move that into OriginFilterBuilder so that it can be > reused. I was originally thinking about having a tri-state mode enum { EMPTY, WHITELIST, BLACKLIST } to avoid this cognitive effort, but then decided against it, as it would be even more confusing if you do: OriginFilterBuilder builder(EMPTY); builder.AddOrigin(...); // <- the mode is "EMPTY", but the filter // is in fact not empty Should AddOrigin(...) fail, or should the mode automatically flip to BLACKLIST? It was starting to feel a bit overengineered to me. It is also awkward that when constructing the filter in BrowsingDataRemover, we actually have to start with blacklist and then flip to whitelist when one origin is added. The builder pattern is intended for incremental construction of objects, which this case really isn't, and if it wasn't the primary usecase, I would be against including the ChangeMode() method. But at least for convenience in tests, I guess a static method to generate an empty filter is useful. https://codereview.chromium.org/1603903002/diff/300001/chrome/browser/browsin... File chrome/browser/browsing_data/origin_filter_builder.cc (right): https://codereview.chromium.org/1603903002/diff/300001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:24: return this; On 2016/02/11 22:46:43, Timo Reimann wrote: > This and other usages of "if (condition) { NOTREACHED(); return this; }" below > could be replaced by DCHECK(!condition) for increased conciseness. Done.
https://codereview.chromium.org/1603903002/diff/300001/chrome/browser/browsin... File chrome/browser/browsing_data/browsing_data_remover_unittest.cc (right): https://codereview.chromium.org/1603903002/diff/300001/chrome/browser/browsin... chrome/browser/browsing_data/browsing_data_remover_unittest.cc:2227: base::Callback<bool(const GURL&)> filter = builder.BuildSameOriginFilter(); On 2016/02/12 14:24:50, msramek wrote: > On 2016/02/11 22:46:43, Timo Reimann wrote: > > It takes a bit of cognitive effort to understand that we're using an empty, > > black-listed origin filter in order to match URLs unconditionally. We could be > > more clear about our intention by passing a simple always-true filter instead, > > just like the EmptyFilter defined in download_manager_impl.cc. > > > > It might even be worth to move that into OriginFilterBuilder so that it can be > > reused. > > I was originally thinking about having a tri-state mode enum { EMPTY, WHITELIST, > BLACKLIST } to avoid this cognitive effort, but then decided against it, as it > would be even more confusing if you do: > > OriginFilterBuilder builder(EMPTY); > builder.AddOrigin(...); // <- the mode is "EMPTY", but the filter > // is in fact not empty > > Should AddOrigin(...) fail, or should the mode automatically flip to BLACKLIST? > It was starting to feel a bit overengineered to me. > > It is also awkward that when constructing the filter in BrowsingDataRemover, we > actually have to start with blacklist and then flip to whitelist when one origin > is added. The builder pattern is intended for incremental construction of > objects, which this case really isn't, and if it wasn't the primary usecase, I > would be against including the ChangeMode() method. > > But at least for convenience in tests, I guess a static method to generate an > empty filter is useful. If flipping from "no filter" (i.e., an empty black list) to a non-empty white list (or even black list, for that matter) is really the primary use case, how about just requiring clients of the API to start over with a new OriginFilterBuilder once the first origin comes in? It'd be just as easy syntactically and allow to remove the ChangeMode method. The one downside I see is that you cannot constify the OriginFilterBuilder instance anymore. However, to me that's a more explicit way of saying that the instance may change than to mutate its internal state.
Mike, friendly ping! Can you please review as the owner of browsing_data/ ?
https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... File chrome/browser/browsing_data/origin_filter_builder.cc (right): https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:59: return base::Bind(&OriginFilterBuilder::MatchesURL, base::Unretained(this)); As discussed with Martin, the current approach requires callers to manage the lifetime of the builder instance, particularly due to the contained |origin_list_|. It would be nice if ownership could be bound to the callback instead.
The changes to c/b/browsing_data/browsing_data_remover* look good to me. I have a few questions about the OriginFilterBuilder bits inline. Thanks! https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... File chrome/browser/browsing_data/origin_filter_builder.cc (right): https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:66: base::Unretained(this)); Why is Unretained safe here? https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:70: base::Callback<bool(const GURL&)> OriginFilterBuilder::BuildEmptyFilter() { Perhaps "NoopFilter"? An empty BLACKLIST has a different behavior than an empty WHITELIST, so "empty" isn't as unambiguous as I'd like. :) https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... File chrome/browser/browsing_data/origin_filter_builder.h (right): https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.h:35: OriginFilterBuilder* ChangeMode(Mode mode); Nit: s/ChangeMode/SetMode/ https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.h:45: // possible to change the builder parameters (origin list and mode). Can you add some comments about lifetimes here and `BuildSameOriginFilter()`? As ttr314@ noted, that's actually a bit complicated. https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.h:46: base::Callback<bool(const GURL&)> BuildDomainFilter(); HSTS calls these "congruent" and "superdomain" matches. Perhaps reusing that terminology here could be useful: `BuildCongruentFilter`/`BuildSuperdomainFilter`? https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... File chrome/browser/browsing_data/origin_filter_builder_unittest.cc (right): https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder_unittest.cc:101: EXPECT_FALSE(filter.Run(GURL("https://www.youtube.com"))); Nit: You could reduce the boilerplate in all of these tests by creating a struct of test cases and looping through them.
https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... File chrome/browser/browsing_data/origin_filter_builder.h (right): https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.h:32: OriginFilterBuilder* AddOrigin(const url::Origin& origin); Thanks for the update on the builder pattern, I like this better than the old way. https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.h:35: OriginFilterBuilder* ChangeMode(Mode mode); On 2016/02/17 15:22:10, Mike West wrote: > Nit: s/ChangeMode/SetMode/ This, plus void return value. I notice you do this to let callers do builder.SetMode(...)->Bar(). In some cases this pattern can be useful. But this is pretty unusual in Chrome, has potentially confusing memory management semantics, and the way you use this in the patch, it's the same number of lines to do builder.SetMode(); builder.AddOrigin(); on separate lines. https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.h:46: base::Callback<bool(const GURL&)> BuildDomainFilter(); On 2016/02/17 15:22:10, Mike West wrote: > HSTS calls these "congruent" and "superdomain" matches. Perhaps reusing that > terminology here could be useful: > `BuildCongruentFilter`/`BuildSuperdomainFilter`? I've never heard of those, and I personally prefer the patches current naming. https://codereview.chromium.org/1603903002/diff/320001/content/browser/downlo... File content/browser/download/download_manager_impl_unittest.cc (right): https://codereview.chromium.org/1603903002/diff/320001/content/browser/downlo... content/browser/download/download_manager_impl_unittest.cc:713: return base::Bind(&GURL::operator==, base::Unretained(url)); Can we remove the unretained and make a copy here? Then we can change the url parameter to this function to be a "const GURL&". Unless I'm missing something we don't actually need the pointer here, and the memory management seems scary.
https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... File chrome/browser/browsing_data/origin_filter_builder.h (right): https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.h:46: base::Callback<bool(const GURL&)> BuildDomainFilter(); On 2016/02/17 at 20:34:24, brettw wrote: > On 2016/02/17 15:22:10, Mike West wrote: > > HSTS calls these "congruent" and "superdomain" matches. Perhaps reusing that > > terminology here could be useful: > > `BuildCongruentFilter`/`BuildSuperdomainFilter`? > > I've never heard of those, and I personally prefer the patches current naming. I don't feel strongly about it: if y'all are happier with the current naming, then keep it.
I fixed the problems with origin_list_ lifetime, and I think the entire concept is much better now! Timo, this means that you have to update your follow-up CL, but according to our yesterday's discussion, I think you were happy with this approach :) PTAL! https://codereview.chromium.org/1603903002/diff/300001/chrome/browser/browsin... File chrome/browser/browsing_data/browsing_data_remover_unittest.cc (right): https://codereview.chromium.org/1603903002/diff/300001/chrome/browser/browsin... chrome/browser/browsing_data/browsing_data_remover_unittest.cc:2227: base::Callback<bool(const GURL&)> filter = builder.BuildSameOriginFilter(); On 2016/02/15 07:39:15, Timo Reimann wrote: > On 2016/02/12 14:24:50, msramek wrote: > > On 2016/02/11 22:46:43, Timo Reimann wrote: > > > It takes a bit of cognitive effort to understand that we're using an empty, > > > black-listed origin filter in order to match URLs unconditionally. We could > be > > > more clear about our intention by passing a simple always-true filter > instead, > > > just like the EmptyFilter defined in download_manager_impl.cc. > > > > > > It might even be worth to move that into OriginFilterBuilder so that it can > be > > > reused. > > > > I was originally thinking about having a tri-state mode enum { EMPTY, > WHITELIST, > > BLACKLIST } to avoid this cognitive effort, but then decided against it, as it > > would be even more confusing if you do: > > > > OriginFilterBuilder builder(EMPTY); > > builder.AddOrigin(...); // <- the mode is "EMPTY", but the filter > > // is in fact not empty > > > > Should AddOrigin(...) fail, or should the mode automatically flip to > BLACKLIST? > > It was starting to feel a bit overengineered to me. > > > > It is also awkward that when constructing the filter in BrowsingDataRemover, > we > > actually have to start with blacklist and then flip to whitelist when one > origin > > is added. The builder pattern is intended for incremental construction of > > objects, which this case really isn't, and if it wasn't the primary usecase, I > > would be against including the ChangeMode() method. > > > > But at least for convenience in tests, I guess a static method to generate an > > empty filter is useful. > > If flipping from "no filter" (i.e., an empty black list) to a non-empty white > list (or even black list, for that matter) is really the primary use case, how > about just requiring clients of the API to start over with a new > OriginFilterBuilder once the first origin comes in? It'd be just as easy > syntactically and allow to remove the ChangeMode method. > > The one downside I see is that you cannot constify the OriginFilterBuilder > instance anymore. However, to me that's a more explicit way of saying that the > instance may change than to mutate its internal state. The builder looks much better now, and you're able to build more than one filter. In that world, it's quite reasonable to add a few origins, build a whitelist, then add a few more, then build a blacklist for the same set, then add a few more and build another whitelist etc. It currently doesn't seem to me that SetMode() doesn't fit into the model, so I kept it. https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... File chrome/browser/browsing_data/origin_filter_builder.cc (right): https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:59: return base::Bind(&OriginFilterBuilder::MatchesURL, base::Unretained(this)); On 2016/02/17 14:59:23, Timo Reimann wrote: > As discussed with Martin, the current approach requires callers to manage the > lifetime of the builder instance, particularly due to the contained > |origin_list_|. It would be nice if ownership could be bound to the callback > instead. Done. Changed this to base::Owned, the callback now has its own copy of the data, and the matcher methods themselves can now be static (actually should, so that I don't accidentally use mode_ instead of mode or origin_list_ instead of origins). https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:66: base::Unretained(this)); On 2016/02/17 15:22:10, Mike West wrote: > Why is Unretained safe here? It's not :) See my other comments. https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:70: base::Callback<bool(const GURL&)> OriginFilterBuilder::BuildEmptyFilter() { On 2016/02/17 15:22:10, Mike West wrote: > Perhaps "NoopFilter"? An empty BLACKLIST has a different behavior than an empty > WHITELIST, so "empty" isn't as unambiguous as I'd like. :) Done. Yes, I agree. https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... File chrome/browser/browsing_data/origin_filter_builder.h (right): https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.h:32: OriginFilterBuilder* AddOrigin(const url::Origin& origin); On 2016/02/17 20:34:24, brettw wrote: > Thanks for the update on the builder pattern, I like this better than the old > way. Acknowledged. https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.h:35: OriginFilterBuilder* ChangeMode(Mode mode); On 2016/02/17 20:34:24, brettw wrote: > On 2016/02/17 15:22:10, Mike West wrote: > > Nit: s/ChangeMode/SetMode/ > > This, plus void return value. I notice you do this to let callers do > builder.SetMode(...)->Bar(). In some cases this pattern can be useful. But this > is pretty unusual in Chrome, has potentially confusing memory management > semantics, and the way you use this in the patch, it's the same number of lines > to do > builder.SetMode(); > builder.AddOrigin(); > on separate lines. Done. Didn't seem unusual to me, as as I have recently worked with ContentSettingsPattern::Builder, which does that :) https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.h:45: // possible to change the builder parameters (origin list and mode). On 2016/02/17 15:22:10, Mike West wrote: > Can you add some comments about lifetimes here and `BuildSameOriginFilter()`? As > ttr314@ noted, that's actually a bit complicated. My original thought here was that BrowsingDataRemover blocks until the deletion finishes anyway, so there won't be a problem with lifetimes. However, in Timo's CL for password manager, this caused some troubles in tests. We discussed this in a separate thread, and my proposal (which I implemented in this last patchset) is to make a copy of origin_list_ and pass it as Owned() instead of Unretained(). This means that the callback will keep its list of URLs as long as they're necessary, and there will be no surprises w.r.t. lifetime. Since we established that the number of origins typically shouldn't be large (which is why we can get away with set instead of e.g. trie), we can also reckon that this approach will still be fast and not take too much extra memory. It also allows me to remove the DCHECKS saying "you cannot modify the builder after building once", because now each built filter will make its own copy of the data. And finally, it's weird for a builder to have to stay alive after building. I think the current state is much better. https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.h:46: base::Callback<bool(const GURL&)> BuildDomainFilter(); On 2016/02/18 10:50:31, Mike West wrote: > On 2016/02/17 at 20:34:24, brettw wrote: > > On 2016/02/17 15:22:10, Mike West wrote: > > > HSTS calls these "congruent" and "superdomain" matches. Perhaps reusing that > > > terminology here could be useful: > > > `BuildCongruentFilter`/`BuildSuperdomainFilter`? > > > > I've never heard of those, and I personally prefer the patches current naming. > > I don't feel strongly about it: if y'all are happier with the current naming, > then keep it. "superdomain" could be confusing, as in my mind this matches subdomains, it's just that we implement it reversely. But I also don't have a strong opinion. If Brett feels strongly, I guess I'll keep the naming for now :) https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... File chrome/browser/browsing_data/origin_filter_builder_unittest.cc (right): https://codereview.chromium.org/1603903002/diff/320001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder_unittest.cc:101: EXPECT_FALSE(filter.Run(GURL("https://www.youtube.com"))); On 2016/02/17 15:22:10, Mike West wrote: > Nit: You could reduce the boilerplate in all of these tests by creating a struct > of test cases and looping through them. Done. https://codereview.chromium.org/1603903002/diff/320001/content/browser/downlo... File content/browser/download/download_manager_impl_unittest.cc (right): https://codereview.chromium.org/1603903002/diff/320001/content/browser/downlo... content/browser/download/download_manager_impl_unittest.cc:713: return base::Bind(&GURL::operator==, base::Unretained(url)); On 2016/02/17 20:34:24, brettw wrote: > Can we remove the unretained and make a copy here? Then we can change the url > parameter to this function to be a "const GURL&". Unless I'm missing something > we don't actually need the pointer here, and the memory management seems scary. Done.
https://codereview.chromium.org/1603903002/diff/340001/chrome/browser/browsin... File chrome/browser/browsing_data/origin_filter_builder.cc (right): https://codereview.chromium.org/1603903002/diff/340001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:57: base::Owned(origins), mode_); Copying the list makes sense if we're planning on generating multiple filters from a single builder. Is that actually the case? It seems like the plan is to generate a single matcher from a single builder, and use that matcher in a variety of browsing-data-clearing backends. Do we need the ability to create subsequent matchers? This is probably the same question Brett asked, just phrased a little differently. If we need the flexibility, fine. I just don't see why we would. https://codereview.chromium.org/1603903002/diff/340001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:83: // If they are, replace std::set with a trie for faster searching. How will you know whether the filters become large? Do we need a histogram here?
https://codereview.chromium.org/1603903002/diff/340001/chrome/browser/browsin... File chrome/browser/browsing_data/origin_filter_builder.cc (right): https://codereview.chromium.org/1603903002/diff/340001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:57: base::Owned(origins), mode_); On 2016/02/19 07:07:32, Mike West wrote: > Copying the list makes sense if we're planning on generating multiple filters > from a single builder. Is that actually the case? It seems like the plan is to > generate a single matcher from a single builder, and use that matcher in a > variety of browsing-data-clearing backends. Do we need the ability to create > subsequent matchers? > > This is probably the same question Brett asked, just phrased a little > differently. If we need the flexibility, fine. I just don't see why we would. I would put it the other way around :) We can of course allocate origins_list_ on heap as a scoped_ptr and then pass it to this callback. That way, there will be only one copy, but we will also only be able to build once. This means adding more code (DCHECK(!built) et al.) to several methods, documenting that this is the behavior (because I don't think it's typical for the builder pattern to stop working after one build) and lowering flexibility for callers (in the future, we will have callers build the filter and pass it to BrowsingDataRemover, because if BrowsingDataRemover only allows one url to be passed, that defeats the purpose of these filters). The added complexity is IMHO not worth saving the small amount of duplicated memory for a short time, but if you think otherwise, I can change that. https://codereview.chromium.org/1603903002/diff/340001/chrome/browser/browsin... chrome/browser/browsing_data/origin_filter_builder.cc:83: // If they are, replace std::set with a trie for faster searching. On 2016/02/19 07:07:32, Mike West wrote: > How will you know whether the filters become large? Do we need a histogram here? The "Clear site data" usecase will, to my understanding, only use a 1-element whitelist. Daniel wants to have 5-10 element blacklists. Those are all usecases we currently know about. It sounds more like I should add DCHECK_LE(origins.size(), 10) << "Talk to msramek@ before you do this.";
On 2016/02/19 at 12:02:11, msramek wrote: > https://codereview.chromium.org/1603903002/diff/340001/chrome/browser/browsin... > File chrome/browser/browsing_data/origin_filter_builder.cc (right): > > https://codereview.chromium.org/1603903002/diff/340001/chrome/browser/browsin... > chrome/browser/browsing_data/origin_filter_builder.cc:57: base::Owned(origins), mode_); > On 2016/02/19 07:07:32, Mike West wrote: > > Copying the list makes sense if we're planning on generating multiple filters > > from a single builder. Is that actually the case? It seems like the plan is to > > generate a single matcher from a single builder, and use that matcher in a > > variety of browsing-data-clearing backends. Do we need the ability to create > > subsequent matchers? > > > > This is probably the same question Brett asked, just phrased a little > > differently. If we need the flexibility, fine. I just don't see why we would. > > I would put it the other way around :) We can of course allocate origins_list_ on heap as a scoped_ptr and then pass it to this callback. That way, there will be only one copy, but we will also only be able to build once. This means adding more code (DCHECK(!built) et al.) to several methods, documenting that this is the behavior (because I don't think it's typical for the builder pattern to stop working after one build) and lowering flexibility for callers (in the future, we will have callers build the filter and pass it to BrowsingDataRemover, because if BrowsingDataRemover only allows one url to be passed, that defeats the purpose of these filters). > > The added complexity is IMHO not worth saving the small amount of duplicated memory for a short time, but if you think otherwise, I can change that. It feels like you're building flexibility you're not going to use. I'm skeptical that it's necessary, but I don't think the cost of copying is worth arguing about too much here (since it will all delete itself when the deletion operation is complete). Assuming that Brett is happy, LGTM. > https://codereview.chromium.org/1603903002/diff/340001/chrome/browser/browsin... > chrome/browser/browsing_data/origin_filter_builder.cc:83: // If they are, replace std::set with a trie for faster searching. > On 2016/02/19 07:07:32, Mike West wrote: > > How will you know whether the filters become large? Do we need a histogram here? > > The "Clear site data" usecase will, to my understanding, only use a 1-element whitelist. Daniel wants to have 5-10 element blacklists. Those are all usecases we currently know about. It sounds more like I should add > > DCHECK_LE(origins.size(), 10) << "Talk to msramek@ before you do this."; That's probably fine. That said, it might be useful to add some sort of metrics anyway. We track the number of calls to the BDR, it would be interesting to see how many of those are origin-scoped, for instance. Doesn't have to be in this CL, but I think it's probably worth doing.
Thanks, Mike. I'll leave the metrics to a follow-up CL, since I'm apparently blocking people with this one already :( Brett, can you have a look?
lgtm https://codereview.chromium.org/1603903002/diff/360001/chrome/browser/browsin... File chrome/browser/browsing_data/browsing_data_remover.cc (right): https://codereview.chromium.org/1603903002/diff/360001/chrome/browser/browsin... chrome/browser/browsing_data/browsing_data_remover.cc:399: // TODO(msramek): Investigate whether history deletion, especially in the History does an exact match.
https://codereview.chromium.org/1603903002/diff/360001/chrome/browser/browsin... File chrome/browser/browsing_data/browsing_data_remover.cc (right): https://codereview.chromium.org/1603903002/diff/360001/chrome/browser/browsin... chrome/browser/browsing_data/browsing_data_remover.cc:399: // TODO(msramek): Investigate whether history deletion, especially in the On 2016/02/22 23:23:58, brettw wrote: > History does an exact match. Thanks. I updated the TODO. I'll have to talk to serverside folks to get a new API for per-origin deletions.
The CQ bit was checked by msramek@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from brettw@chromium.org, mkwst@chromium.org Link to the patchset: https://codereview.chromium.org/1603903002/#ps400001 (title: "Rebase over 1246583002")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1603903002/400001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1603903002/400001
Message was sent while issue was closed.
Description was changed from ========== Add an OriginFilterBuilder class for [white|black]listing origins. We already have at least two different usecases for origin-based browsing data deletion - by whitelist (delete the data of one or more particular sites) or blacklist (clear browsing data, but protect a certain set of sites that are deemed valuable). Furthermore, the deletion might be done on a particular origin, or including subdomains (could be decided on a per-backend basis, settable by an advanced user, or a product decision). To avoid plumbing this functionality separately into all backends, we instead pass a GURL->bool predicate to individual backends that they can use as a blackbox to evaluate whether a certain URL should be deleted. We introduce an OriginFilterBuilder class that constructs such predicates. To illustrate the usage of the new API, we also modify the origin-based deletion in DownloadManager. See https://docs.google.com/document/d/12UeAMGOk9MDFwrT9kfNldL4qnUIfnIdVbQSk-hb-mWg/ for more details. BUG=578162 ========== to ========== Add an OriginFilterBuilder class for [white|black]listing origins. We already have at least two different usecases for origin-based browsing data deletion - by whitelist (delete the data of one or more particular sites) or blacklist (clear browsing data, but protect a certain set of sites that are deemed valuable). Furthermore, the deletion might be done on a particular origin, or including subdomains (could be decided on a per-backend basis, settable by an advanced user, or a product decision). To avoid plumbing this functionality separately into all backends, we instead pass a GURL->bool predicate to individual backends that they can use as a blackbox to evaluate whether a certain URL should be deleted. We introduce an OriginFilterBuilder class that constructs such predicates. To illustrate the usage of the new API, we also modify the origin-based deletion in DownloadManager. See https://docs.google.com/document/d/12UeAMGOk9MDFwrT9kfNldL4qnUIfnIdVbQSk-hb-mWg/ for more details. BUG=578162 ==========
Message was sent while issue was closed.
Committed patchset #10 (id:400001)
Message was sent while issue was closed.
Description was changed from ========== Add an OriginFilterBuilder class for [white|black]listing origins. We already have at least two different usecases for origin-based browsing data deletion - by whitelist (delete the data of one or more particular sites) or blacklist (clear browsing data, but protect a certain set of sites that are deemed valuable). Furthermore, the deletion might be done on a particular origin, or including subdomains (could be decided on a per-backend basis, settable by an advanced user, or a product decision). To avoid plumbing this functionality separately into all backends, we instead pass a GURL->bool predicate to individual backends that they can use as a blackbox to evaluate whether a certain URL should be deleted. We introduce an OriginFilterBuilder class that constructs such predicates. To illustrate the usage of the new API, we also modify the origin-based deletion in DownloadManager. See https://docs.google.com/document/d/12UeAMGOk9MDFwrT9kfNldL4qnUIfnIdVbQSk-hb-mWg/ for more details. BUG=578162 ========== to ========== Add an OriginFilterBuilder class for [white|black]listing origins. We already have at least two different usecases for origin-based browsing data deletion - by whitelist (delete the data of one or more particular sites) or blacklist (clear browsing data, but protect a certain set of sites that are deemed valuable). Furthermore, the deletion might be done on a particular origin, or including subdomains (could be decided on a per-backend basis, settable by an advanced user, or a product decision). To avoid plumbing this functionality separately into all backends, we instead pass a GURL->bool predicate to individual backends that they can use as a blackbox to evaluate whether a certain URL should be deleted. We introduce an OriginFilterBuilder class that constructs such predicates. To illustrate the usage of the new API, we also modify the origin-based deletion in DownloadManager. See https://docs.google.com/document/d/12UeAMGOk9MDFwrT9kfNldL4qnUIfnIdVbQSk-hb-mWg/ for more details. BUG=578162 Committed: https://crrev.com/d9c162ad0a240aa967c247f5be258056d6a4efbf Cr-Commit-Position: refs/heads/master@{#376967} ==========
Message was sent while issue was closed.
Patchset 10 (id:??) landed as https://crrev.com/d9c162ad0a240aa967c247f5be258056d6a4efbf Cr-Commit-Position: refs/heads/master@{#376967}
Message was sent while issue was closed.
https://codereview.chromium.org/1603903002/diff/360001/chrome/browser/browsin... File chrome/browser/browsing_data/browsing_data_remover.cc (right): https://codereview.chromium.org/1603903002/diff/360001/chrome/browser/browsin... chrome/browser/browsing_data/browsing_data_remover.cc:399: // TODO(msramek): Investigate whether history deletion, especially in the I'm not sure why anybody serverside would be necessary. The browser history service can just delete matching URLs and those deletions will by synced around. |