Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(538)

Issue 2878046: Add an extractor for DOM features to be used for client side phishing detection. (Closed)

Created:
10 years, 5 months ago by Brian Ryner
Modified:
9 years, 7 months ago
CC:
chromium-reviews, darin-cc_chromium.org, brettw-cc_chromium.org, chrome-anti-phishing_googlegroups.com, Paweł Hajdan Jr.
Visibility:
Public.

Description

Add an extractor for DOM features to be used for client side phishing detection. PhishingDOMFeatureExtractor iterates over the page elements and computes a number of features. To avoid blocking the renderer for too long, the extractor may run in several chunks of works, posting a task to continue processing if necessary. This CL only includes the feature extraction itself. I will add the logic to cap the time per iteration in a follow-up CL. BUG=none TEST=PhishingDOMFeatureExtractorTest Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=54082

Patch Set 1 #

Total comments: 21

Patch Set 2 : addressed comments from jhawkins, phajdan.jr, and brettw #

Total comments: 6

Patch Set 3 : added a check for unexpected feature values #

Patch Set 4 : clarified comment in features.h #

Total comments: 4

Patch Set 5 : address marria's comments #

Unified diffs Side-by-side diffs Delta from patch set Stats (+1063 lines, -1 line) Patch
M chrome/chrome_renderer.gypi View 1 2 3 4 1 chunk +2 lines, -0 lines 0 comments Download
M chrome/chrome_tests.gypi View 1 2 3 4 1 chunk +1 line, -0 lines 0 comments Download
M chrome/renderer/safe_browsing/features.h View 1 2 3 4 2 chunks +55 lines, -0 lines 0 comments Download
M chrome/renderer/safe_browsing/features.cc View 1 2 3 chunks +34 lines, -1 line 0 comments Download
M chrome/renderer/safe_browsing/features_unittest.cc View 2 chunks +17 lines, -0 lines 0 comments Download
A chrome/renderer/safe_browsing/phishing_dom_feature_extractor.h View 1 2 3 4 1 chunk +128 lines, -0 lines 0 comments Download
A chrome/renderer/safe_browsing/phishing_dom_feature_extractor.cc View 1 2 3 4 1 chunk +416 lines, -0 lines 0 comments Download
A chrome/renderer/safe_browsing/phishing_dom_feature_extractor_unittest.cc View 1 2 3 4 1 chunk +410 lines, -0 lines 0 comments Download

Messages

Total messages: 15 (0 generated)
Brian Ryner
Brett, someone mentioned that you'd be a good person to review the way I'm setting ...
10 years, 5 months ago (2010-07-21 22:55:18 UTC) #1
James Hawkins
http://codereview.chromium.org/2878046/diff/1/6 File chrome/renderer/safe_browsing/phishing_dom_feature_extractor.cc (right): http://codereview.chromium.org/2878046/diff/1/6#newcode26 chrome/renderer/safe_browsing/phishing_dom_feature_extractor.cc:26: // Link related features Two space indent for these. ...
10 years, 5 months ago (2010-07-21 23:12:19 UTC) #2
Paweł Hajdan Jr.
Drive-by with a test comment. Please let me take another look before committing. http://codereview.chromium.org/2878046/diff/1/8 File ...
10 years, 5 months ago (2010-07-21 23:28:30 UTC) #3
brettw
I don't see an obvious reason why the test setup is wrong, but I don't ...
10 years, 5 months ago (2010-07-22 21:04:10 UTC) #4
Brian Ryner
Have another look. http://codereview.chromium.org/2878046/diff/1/6 File chrome/renderer/safe_browsing/phishing_dom_feature_extractor.cc (right): http://codereview.chromium.org/2878046/diff/1/6#newcode26 chrome/renderer/safe_browsing/phishing_dom_feature_extractor.cc:26: // Link related features On 2010/07/21 ...
10 years, 5 months ago (2010-07-23 21:21:08 UTC) #5
James Hawkins
LGTM http://codereview.chromium.org/2878046/diff/9001/10006 File chrome/renderer/safe_browsing/phishing_dom_feature_extractor.h (right): http://codereview.chromium.org/2878046/diff/9001/10006#newcode105 chrome/renderer/safe_browsing/phishing_dom_feature_extractor.h:105: FeatureMap* features_; // the caller keeps ownership of ...
10 years, 5 months ago (2010-07-23 21:27:39 UTC) #6
Paweł Hajdan Jr.
Code I commented in the drive-by LGTM.
10 years, 5 months ago (2010-07-23 22:30:51 UTC) #7
lzheng
http://codereview.chromium.org/2878046/diff/9001/10003 File chrome/renderer/safe_browsing/features.cc (right): http://codereview.chromium.org/2878046/diff/9001/10003#newcode21 chrome/renderer/safe_browsing/features.cc:21: bool FeatureMap::AddRealFeature(const std::string& name, double value) { Since only ...
10 years, 5 months ago (2010-07-26 17:51:28 UTC) #8
Brian Ryner
Have another look. Thanks, Brian http://codereview.chromium.org/2878046/diff/9001/10003 File chrome/renderer/safe_browsing/features.cc (right): http://codereview.chromium.org/2878046/diff/9001/10003#newcode21 chrome/renderer/safe_browsing/features.cc:21: bool FeatureMap::AddRealFeature(const std::string& name, ...
10 years, 5 months ago (2010-07-27 00:20:05 UTC) #9
lzheng
LGTM. Please make sure other reviewers are fine. Thanks!
10 years, 5 months ago (2010-07-27 00:41:13 UTC) #10
Brian Ryner
On 2010/07/27 00:41:13, lzheng wrote: > LGTM. > > Please make sure other reviewers are ...
10 years, 5 months ago (2010-07-27 04:47:29 UTC) #11
marria
lgtm http://codereview.chromium.org/2878046/diff/22001/14007 File chrome/renderer/safe_browsing/phishing_dom_feature_extractor.cc (right): http://codereview.chromium.org/2878046/diff/22001/14007#newcode49 chrome/renderer/safe_browsing/phishing_dom_feature_extractor.cc:49: : external_links(0), You could make this external_links and ...
10 years, 5 months ago (2010-07-27 21:31:51 UTC) #12
Brian Ryner
Have one more look. I also added a missing #include to the unittest, which showed ...
10 years, 4 months ago (2010-07-27 23:25:46 UTC) #13
marria
lgtm
10 years, 4 months ago (2010-07-28 00:48:09 UTC) #14
lzheng1
10 years, 4 months ago (2010-07-28 09:17:41 UTC) #15
LGTM

On Tue, Jul 27, 2010 at 5:48 PM, <marria@google.com> wrote:

> lgtm
>
>
> http://codereview.chromium.org/2878046/show
>

Powered by Google App Engine
This is Rietveld 408576698