|
|
Created:
9 years, 10 months ago by gcasto (DO NOT USE) Modified:
9 years, 10 months ago CC:
chromium-reviews, Paweł Hajdan Jr., chrome-anti-phishing_googlegroups.com Visibility:
Public. |
DescriptionAdd caching to phishing client side detection.
BUG=None
TEST=Ran client_side_detection_service_unittest
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=74874
Patch Set 1 #Patch Set 2 : Updated header comments #
Total comments: 18
Patch Set 3 : Address comments #Patch Set 4 : Add testing for cache refresh. Also make interval that we keep track of requests configurable. #
Total comments: 5
Patch Set 5 : Change kMaxReports to kMaxReportsPerInterval and change a few comments. #Patch Set 6 : Add UMA stat for not being able to serialize request. #
Messages
Total messages: 11 (0 generated)
codereview.chromium.org doesn't seem to accept comments at the moment. client_side_detection_service.h: - add a comment that describes the new struct and the member variables? - I think the cache should have a maximum size. In 30 days this cache can get huge. Consider using an LRU cache or something of the sort. - comment for cache_: is cache_ really used to limit the total number of requests per day? client_side_detection_service.cc: - consider putting UpdateCache() into GetCachedResult(...). Didn't look at the test yet. Thanks, noe. On Mon, Feb 7, 2011 at 1:41 PM, <gcasto@google.com> wrote: > Reviewers: Brian Ryner, noelutz, > > Description: > Add caching to phishing client side detection. > > BUG=None > TEST=Ran client_side_detection_service_unittest > > Please review this at http://codereview.chromium.org/6374017/ > > SVN Base: http://git.chromium.org/git/chromium.git@trunk > > Affected files: > M chrome/browser/safe_browsing/client_side_detection_service.h > M chrome/browser/safe_browsing/client_side_detection_service.cc > M chrome/browser/safe_browsing/client_side_detection_service_unittest.cc > > > -- -|||--
On Mon, Feb 7, 2011 at 5:50 PM, Noé Lutz <noelutz@google.com> wrote: > codereview.chromium.org doesn't seem to accept comments at the moment. > > client_side_detection_service.h: > - add a comment that describes the new struct and the member variables? > I'm not sure what exactly your looking for, but I fleshed out the comments some. > - I think the cache should have a maximum size. In 30 days this > cache can get huge. Consider using an LRU cache or something of the > sort. > We update the cache on every possible ping, clearing out entries that are no longer relevant. This plus the fact that we limit the number of pings per day means that there can only ever be 3 elements in the cache, so I think that using LRU is probably overkill. > - comment for cache_: is cache_ really used to limit the total number > of requests per day? > > Ah, no. I was thinking about using it for that purpose as well, but I decided it was easier to reason about this way. Removed that comment. > client_side_detection_service.cc: > - consider putting UpdateCache() into GetCachedResult(...). > > What do you mean by this exactly? UpdateCache is already called in GetCachedResult(). Do you mean not make it a separate function call? > Didn't look at the test yet. > > Thanks, > noe. > > On Mon, Feb 7, 2011 at 1:41 PM, <gcasto@google.com> wrote: > > Reviewers: Brian Ryner, noelutz, > > > > Description: > > Add caching to phishing client side detection. > > > > BUG=None > > TEST=Ran client_side_detection_service_unittest > > > > Please review this at http://codereview.chromium.org/6374017/ > > > > SVN Base: http://git.chromium.org/git/chromium.git@trunk > > > > Affected files: > > M chrome/browser/safe_browsing/client_side_detection_service.h > > M chrome/browser/safe_browsing/client_side_detection_service.cc > > M chrome/browser/safe_browsing/client_side_detection_service_unittest.cc > > > > > > > > > > -- > -|||-- >
LGTM. On Tue, Feb 8, 2011 at 11:01 AM, Garrett Casto <gcasto@google.com> wrote: > > > On Mon, Feb 7, 2011 at 5:50 PM, Noé Lutz <noelutz@google.com> wrote: >> >> codereview.chromium.org doesn't seem to accept comments at the moment. >> >> client_side_detection_service.h: >> - add a comment that describes the new struct and the member variables? > > I'm not sure what exactly your looking for, but I fleshed out the comments > some. > >> >> - I think the cache should have a maximum size. In 30 days this >> cache can get huge. Consider using an LRU cache or something of the >> sort. > > We update the cache on every possible ping, clearing out entries that are no > longer relevant. This plus the fact that we limit the number of pings per > day means that there can only ever be 3 elements in the cache, so I think > that using LRU is probably overkill. Oops! You're totally right, LRU cache would be overkill. Maybe you could mention somewhere that the cache will never contain more than kMaxReportsPerDay * kNegativeCacheInterval (in days) entries? >> - comment for cache_: is cache_ really used to limit the total number >> of requests per day? >> > > Ah, no. I was thinking about using it for that purpose as well, but I > decided it was easier to reason about this way. Removed that comment. > >> >> client_side_detection_service.cc: >> - consider putting UpdateCache() into GetCachedResult(...). >> > > What do you mean by this exactly? UpdateCache is already called in > GetCachedResult(). Do you mean not make it a separate function call? Sorry for the confusion. I was wondering why UpdateCache was its own function if GetCachedResult is the only place you call it. > >> >> Didn't look at the test yet. >> >> Thanks, >> noe. >> >> On Mon, Feb 7, 2011 at 1:41 PM, <gcasto@google.com> wrote: >> > Reviewers: Brian Ryner, noelutz, >> > >> > Description: >> > Add caching to phishing client side detection. >> > >> > BUG=None >> > TEST=Ran client_side_detection_service_unittest >> > >> > Please review this at http://codereview.chromium.org/6374017/ >> > >> > SVN Base: http://git.chromium.org/git/chromium.git@trunk >> > >> > Affected files: >> > M chrome/browser/safe_browsing/client_side_detection_service.h >> > M chrome/browser/safe_browsing/client_side_detection_service.cc >> > M >> > chrome/browser/safe_browsing/client_side_detection_service_unittest.cc >> > >> > >> > >> >> >> >> -- >> -|||-- > > -- -|||--
http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... File chrome/browser/safe_browsing/client_side_detection_service.cc (right): http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... chrome/browser/safe_browsing/client_side_detection_service.cc:236: LOG(INFO) << "Satisfying request for " << phishing_url << " from cache"; This is probably not something that should always log. Can you make it a VLOG? http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... chrome/browser/safe_browsing/client_side_detection_service.cc:248: LOG(INFO) << "Refreshing cache for " << phishing_url; Ditto here. http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... chrome/browser/safe_browsing/client_side_detection_service.cc:251: LOG(WARNING) << "Too many report phishing requests sent in the last day, " It's not directly part of this change, but this is also something that we may not want to log in release builds. http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... chrome/browser/safe_browsing/client_side_detection_service.cc:332: // Cache response, possibly flushing an old one. I'm not totally sure I understand the "possibly flushing an old one" part of this comment. If we're not using a cached result for this URL (because it's too old), won't we have already removed that entry in UpdateCache()? If so, this map insertion shouldn't actually cause anything to be removed from the cache. http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... chrome/browser/safe_browsing/client_side_detection_service.cc:362: if ((it->second->is_phishing && This might be more readable if you use a temporary reference variable for it->second. Up to you. http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... File chrome/browser/safe_browsing/client_side_detection_service.h (right): http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... chrome/browser/safe_browsing/client_side_detection_service.h:111: timestamp(time) {} The style guidelines discourage inlining constructors unless it's necessary: http://www.chromium.org/developers/coding-style#TOC-Inline-functions http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... chrome/browser/safe_browsing/client_side_detection_service.h:184: bool GetCachedResult(GURL url, bool* is_phishing); Have this take a const GURL&, so that it doesn't need to make an extra copy of the URL? http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... File chrome/browser/safe_browsing/client_side_detection_service_unittest.cc (right): http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... chrome/browser/safe_browsing/client_side_detection_service_unittest.cc:102: base::TimeDelta::FromMinutes(2); Did you mean to use the "now" variable here instead of calling base::Time::Now() again? Also, it's unlikely, but if the test is running really slowly, 2 minutes could possibly elapse between this call and when UpdateCache is called, which would cause this to be evicted. It might be a little safer to increase this to at least 5 minutes. http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... chrome/browser/safe_browsing/client_side_detection_service_unittest.cc:123: LOG(INFO) << "Size:" << cache.size(); Did you mean to leave this in?
http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... File chrome/browser/safe_browsing/client_side_detection_service.cc (right): http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... chrome/browser/safe_browsing/client_side_detection_service.cc:236: LOG(INFO) << "Satisfying request for " << phishing_url << " from cache"; On 2011/02/08 20:16:48, Brian Ryner wrote: > This is probably not something that should always log. Can you make it a VLOG? Done. http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... chrome/browser/safe_browsing/client_side_detection_service.cc:248: LOG(INFO) << "Refreshing cache for " << phishing_url; On 2011/02/08 20:16:48, Brian Ryner wrote: > Ditto here. Done. http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... chrome/browser/safe_browsing/client_side_detection_service.cc:251: LOG(WARNING) << "Too many report phishing requests sent in the last day, " On 2011/02/08 20:16:48, Brian Ryner wrote: > It's not directly part of this change, but this is also something that we may > not want to log in release builds. Done. http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... chrome/browser/safe_browsing/client_side_detection_service.cc:332: // Cache response, possibly flushing an old one. On 2011/02/08 20:16:48, Brian Ryner wrote: > I'm not totally sure I understand the "possibly flushing an old one" part of > this comment. If we're not using a cached result for this URL (because it's too > old), won't we have already removed that entry in UpdateCache()? If so, this > map insertion shouldn't actually cause anything to be removed from the cache. Your correct that this comment is currently wrong, though it's a problem with the code not with the comment. UpdateCache really shouldn't get rid of an element if it's not a day old regardless of if it can be used to satisfy a request as we want to allow cache refreshes to happen. I think that I changed the code to actually do this now. http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... chrome/browser/safe_browsing/client_side_detection_service.cc:362: if ((it->second->is_phishing && On 2011/02/08 20:16:48, Brian Ryner wrote: > This might be more readable if you use a temporary reference variable for > it->second. Up to you. Done. http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... File chrome/browser/safe_browsing/client_side_detection_service.h (right): http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... chrome/browser/safe_browsing/client_side_detection_service.h:111: timestamp(time) {} On 2011/02/08 20:16:48, Brian Ryner wrote: > The style guidelines discourage inlining constructors unless it's necessary: > http://www.chromium.org/developers/coding-style#TOC-Inline-functions Moved. http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... chrome/browser/safe_browsing/client_side_detection_service.h:184: bool GetCachedResult(GURL url, bool* is_phishing); On 2011/02/08 20:16:48, Brian Ryner wrote: > Have this take a const GURL&, so that it doesn't need to make an extra copy of > the URL? Done. http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... File chrome/browser/safe_browsing/client_side_detection_service_unittest.cc (right): http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... chrome/browser/safe_browsing/client_side_detection_service_unittest.cc:102: base::TimeDelta::FromMinutes(2); On 2011/02/08 20:16:48, Brian Ryner wrote: > Did you mean to use the "now" variable here instead of calling base::Time::Now() > again? > > Also, it's unlikely, but if the test is running really slowly, 2 minutes could > possibly elapse between this call and when UpdateCache is called, which would > cause this to be evicted. It might be a little safer to increase this to at > least 5 minutes. Done. http://codereview.chromium.org/6374017/diff/4001/chrome/browser/safe_browsing... chrome/browser/safe_browsing/client_side_detection_service_unittest.cc:123: LOG(INFO) << "Size:" << cache.size(); On 2011/02/08 20:16:48, Brian Ryner wrote: > Did you mean to leave this in? Removed.
I just uploaded a new patch which adds testing for cache refresh and makes the interval that we keep track of requests configurable (kReportsInterval). I talked a little with Brian offline about merging the report time calculations with the cache, but after playing around with it some I think that it makes things messier. Specifically you have to add a way to specify that this entry didn't get a reply, and the added code to deal with this seems to outweigh the cost of keeping 2 datastructures. Please take another look.
LGTM Just a couple of small things. http://codereview.chromium.org/6374017/diff/11001/chrome/browser/safe_browsin... File chrome/browser/safe_browsing/client_side_detection_service.cc (right): http://codereview.chromium.org/6374017/diff/11001/chrome/browser/safe_browsin... chrome/browser/safe_browsing/client_side_detection_service.cc:354: UpdateCache(); If the cache was going to be very large, the linear traversal for UpdateCache could become a performance problem. But with our current limits, it should be fine. http://codereview.chromium.org/6374017/diff/11001/chrome/browser/safe_browsin... File chrome/browser/safe_browsing/client_side_detection_service.h (right): http://codereview.chromium.org/6374017/diff/11001/chrome/browser/safe_browsin... chrome/browser/safe_browsing/client_side_detection_service.h:115: static const int kMaxReports; Maybe call this "kMaxReportPerInterval" to make it clearer that it's associated with the constant below? http://codereview.chromium.org/6374017/diff/11001/chrome/browser/safe_browsin... chrome/browser/safe_browsing/client_side_detection_service.h:208: // Currently this value is 3. It might be better to avoid mentioning the specific value in the header, so it can't get out of sync with the constants in the .cc.
http://codereview.chromium.org/6374017/diff/11001/chrome/browser/safe_browsin... File chrome/browser/safe_browsing/client_side_detection_service.h (right): http://codereview.chromium.org/6374017/diff/11001/chrome/browser/safe_browsin... chrome/browser/safe_browsing/client_side_detection_service.h:115: static const int kMaxReports; On 2011/02/09 23:46:42, Brian Ryner wrote: > Maybe call this "kMaxReportPerInterval" to make it clearer that it's associated > with the constant below? Done. http://codereview.chromium.org/6374017/diff/11001/chrome/browser/safe_browsin... chrome/browser/safe_browsing/client_side_detection_service.h:208: // Currently this value is 3. On 2011/02/09 23:46:42, Brian Ryner wrote: > It might be better to avoid mentioning the specific value in the header, so it > can't get out of sync with the constants in the .cc. Done.
Added an UMA stat for issues with serializing the request, as Noe requested out of band.
LGTM |