|
|
Chromium Code Reviews|
Created:
4 years, 3 months ago by Jialiu Lin Modified:
4 years, 1 month ago CC:
chromium-reviews, grt+watch_chromium.org, nasko Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionAdd SafeBrowsingNavigationObserver to listen to navigation events
happen on all frames.
These navigation events can help safe browsing service identify
suspicious downloads that intentionally hide their referrers
and/or landing pages, and report these pieces of info to safe
browsing backend.
This CL only covers the observing part. More management code
(e.g. periodically cleanup, actual reporting, etc) will come up
shortly.
BUG=639467
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_site_isolation
Committed: https://crrev.com/602009dfabda593406bfd7f5171e8c4183335bf5
Committed: https://crrev.com/1782326f8a2c81a58136a0475d83f6c1c11db38d
Committed: https://crrev.com/53acce88363fb6a9ed54e991d17b203cfdf6326d
Cr-Original-Original-Commit-Position: refs/heads/master@{#428865}
Cr-Original-Commit-Position: refs/heads/master@{#429071}
Cr-Commit-Position: refs/heads/master@{#429976}
Patch Set 1 #Patch Set 2 : clean up test cases #Patch Set 3 : nit in test #Patch Set 4 : disable DownloadViaHTML5FileApi due to flackiness #
Total comments: 40
Patch Set 5 : address nparker@'s comment and rebase-update #Patch Set 6 : nit in test #Patch Set 7 : nit #Patch Set 8 : milli seconds to seconds #Patch Set 9 : nit #
Total comments: 54
Patch Set 10 : address creis@'s comments #Patch Set 11 : nit #
Total comments: 47
Patch Set 12 : address creis's comments #
Total comments: 2
Patch Set 13 : add TODO #Patch Set 14 : manually merge in 2336833002, and use the new source_render_process_id in RetargetingDetails #Patch Set 15 : nit #
Total comments: 28
Patch Set 16 : address nparker's comments #Patch Set 17 : fix compilation #
Total comments: 50
Patch Set 18 : address creis@ comments #
Total comments: 20
Patch Set 19 : address comments from creis@ and nparker@ #Patch Set 20 : make all bots happy #
Total comments: 81
Patch Set 21 : address comments #Patch Set 22 : nit #
Total comments: 10
Patch Set 23 : add server redirect browser test #
Total comments: 8
Patch Set 24 : nit #Patch Set 25 : rebase, and fix flaky browser tests #
Total comments: 2
Patch Set 26 : 2nd attempt for flaky browser tests #Patch Set 27 : refine comments and format #
Total comments: 18
Patch Set 28 : using StartWatchingNewWebContents() #
Total comments: 2
Patch Set 29 : resolve final nit #Messages
Total messages: 162 (103 generated)
The CQ bit was checked by jialiul@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Patchset #2 (id:20001) has been deleted
jialiul@chromium.org changed reviewers: + nparker@chromium.org
Hi nparker@, Finally, I got most of browser_test working. I wanna run it by you first before sending it to creis@. This CL only handles observing part. You might find some comments very lengthy, let me know if it confuses you. Note that, some windows bots are not happy with my SBNavigationObserverBrowserTest.DownloadViaHTML5FileApi test case. So I temporarily commented out some lines. Will un-comment them once I figure out what went wrong. Also did some refactoring on test cases.
The CQ bit was checked by jialiul@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
I haven't looked at tests yet, but overall structure seems reasonable to me. I cannot reason about when things get observed... I'll leave that to Charlie. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc (right): https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:74: SafeBrowsingNavigationObserver::NavigationEvent::NavigationEvent( Does the default copy constructor work here? I think there's a "= default" syntax that invokes that. I haven't used it., https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:108: return &navigation_map_; nit: trivial accessor could go in the .h https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:123: WebContentsObserver::Observe(source_contents); What does this do? Does it invoke it on *this? https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:142: GURL target_url = retargeting_detail->target_url; nit: could be const GURL& https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:159: navigation_map_[target_url].push_back(std::move(nav_event)); I'm trying to remember the move semantics with std::vectors... Does it destroy the nav_event that's on the stack? https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:186: // some outdated entries. Is it useful to record this event? https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:195: void SafeBrowsingNavigationObserver::DidStartNavigation( For those of us not familiar with NavigationObserver, it would be useful to add a brief comment on when/where/why this gets called. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:202: // Construct a NavigationEvent based on avialble information in sp: available https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:208: if (host && host->GetLastCommittedURL() != GURL()) Would GetLasCommittedURL().is_valid() work? Avoids creating a temporary. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:210: else What's the difference between the source_url in these two cases? https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:219: if (navigation_handle->IsInMainFrame()) { Do you want to record in the nav_event whether or not it's in main frame? https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:222: nav_event.main_frame_url = navigation_handle->GetWebContents()->GetURL(); nit: this uses braces for single-line if/else, but above does not. Just be consistent. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:271: navigation_map_[key_url] = std::vector<NavigationEvent>(); nit: Can you insert a pair of key_url and std::vector<NavigationEvent>({*nav_event})? I don't think I have the syntax quite right, but it'd be a bit more efficient. If it's ugly, never mind. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.h (right): https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:30: NavigationEvent(const GURL& source_url, Is there away to avoid the many-args-of-similar-type issue here? They're easy to mix up w/o the compiler noticing. Not sure if it'd make things super wordy, but you could just remove this constructor and require callers set each field. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:54: bool is_server_redirect; It's is possible for is_user_initiated and is_server_redirect both to be true? (and if not, would an enum work better?) https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:61: typedef std::map<content::NavigationHandle*, NavigationEvent> Who owns these pointers? Add a comment, or down below where they're used. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:79: void RecordRetargeting(const content::NotificationDetails& details); What does retargeting mean? https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:96: NavigationHandleMap navigation_handle_map_; Add some comments on what these are used for, or how the lifetime of entries relate to each other. https://codereview.chromium.org/2302913003/diff/80001/chrome/test/data/safe_b... File chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html (right): https://codereview.chromium.org/2302913003/diff/80001/chrome/test/data/safe_b... chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html:45: window.webkitRequestFileSystem( I still find this one hard to grok. https://codereview.chromium.org/2302913003/diff/80001/chrome/test/data/safe_b... chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html:80: <a id="single_meta_refresh_redirect" href="redirect.html" >Redirect download via meta refresh</a><br> nit: space after quote. You could break+indent these after the >'s to make them easier to read.
The CQ bit was checked by jialiul@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
jialiul@chromium.org changed reviewers: + creis@chromium.org
Thanks, nparker@! All your comments are addressed now. +creis@, Sorry for this belated CL. It took me a while to figure out an easier way to write browser test for this. This CL only handles the observing and recording part. The management part (a.k.a when and how to clean up these navigation events), and hooking it into existing safe browsing service will be the next. Please let me know if this navigation observe makes sense to you and whether I missed some important corner cases. (p.s. I guess certain corner cases that happens only on [tiny percentage of ] individual users can be tolerated, since SB backend operated on aggregated data.) https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc (right): https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:74: SafeBrowsingNavigationObserver::NavigationEvent::NavigationEvent( On 2016/09/03 at 00:21:35, Nathan Parker wrote: > Does the default copy constructor work here? I think there's a "= default" syntax that invokes that. I haven't used it., Actually I need a move constructor here instead of a copy constructor. Fixed. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:108: return &navigation_map_; On 2016/09/03 at 00:21:35, Nathan Parker wrote: > nit: trivial accessor could go in the .h Done. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:123: WebContentsObserver::Observe(source_contents); On 2016/09/03 at 00:21:35, Nathan Parker wrote: > What does this do? Does it invoke it on *this? Yes, since SBNavigationObserver inherits both WebContentsObserver and NotificationObserver, this is to explicitly say which Observe() function to call. Only after a webcontents being observed, NavigationObserver can receive DidStartNavigation/ DidFinishNavigaiton and other callbacks for this specific webcontents. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:142: GURL target_url = retargeting_detail->target_url; On 2016/09/03 at 00:21:35, Nathan Parker wrote: > nit: could be const GURL& Indeed. done. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:159: navigation_map_[target_url].push_back(std::move(nav_event)); On 2016/09/03 at 00:21:35, Nathan Parker wrote: > I'm trying to remember the move semantics with std::vectors... Does it destroy the nav_event that's on the stack? It basically pushes the underlying nav_event object into the vector without making a copy. Yes, nav_event will be invalid after std::move. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:186: // some outdated entries. On 2016/09/03 at 00:21:35, Nathan Parker wrote: > Is it useful to record this event? Maybe. Let me handle or delete this in my next CL. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:195: void SafeBrowsingNavigationObserver::DidStartNavigation( On 2016/09/03 at 00:21:35, Nathan Parker wrote: > For those of us not familiar with NavigationObserver, it would be useful to add a brief comment on when/where/why this gets called. Comments added. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:202: // Construct a NavigationEvent based on avialble information in On 2016/09/03 at 00:21:35, Nathan Parker wrote: > sp: available done https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:208: if (host && host->GetLastCommittedURL() != GURL()) On 2016/09/03 at 00:21:35, Nathan Parker wrote: > Would GetLasCommittedURL().is_valid() work? Avoids creating a temporary. Done. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:210: else On 2016/09/03 at 00:21:34, Nathan Parker wrote: > What's the difference between the source_url in these two cases? The differences are whether this is the first URL to be commit in this frame or not. Comments added to make it clearer. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:219: if (navigation_handle->IsInMainFrame()) { On 2016/09/03 at 00:21:35, Nathan Parker wrote: > Do you want to record in the nav_event whether or not it's in main frame? frame_id and main_frame_url are recorded, which are enough to tell whether it is a mainframe navigation or not. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:222: nav_event.main_frame_url = navigation_handle->GetWebContents()->GetURL(); On 2016/09/03 at 00:21:35, Nathan Parker wrote: > nit: this uses braces for single-line if/else, but above does not. Just be consistent. Thanks for catching this. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:271: navigation_map_[key_url] = std::vector<NavigationEvent>(); On 2016/09/03 at 00:21:35, Nathan Parker wrote: > nit: Can you insert a pair of key_url and std::vector<NavigationEvent>({*nav_event})? > I don't think I have the syntax quite right, but it'd be a bit more efficient. If it's ugly, never mind. Yep, insert is safer than [] operator. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.h (right): https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:30: NavigationEvent(const GURL& source_url, On 2016/09/03 at 00:21:35, Nathan Parker wrote: > Is there away to avoid the many-args-of-similar-type issue here? They're easy to mix up w/o the compiler noticing. > > Not sure if it'd make things super wordy, but you could just remove this constructor and require callers set each field. Agree. Removed this constructor. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:54: bool is_server_redirect; On 2016/09/03 at 00:21:35, Nathan Parker wrote: > It's is possible for is_user_initiated and is_server_redirect both to be true? (and if not, would an enum work better?) Yes, it's possible. User clicks on a link (url_a), and this link resolves to a server redirection (url_b). In that case, both bools are true. And this counts as one navigation event, since they shares a single NavigationHandle. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:61: typedef std::map<content::NavigationHandle*, NavigationEvent> On 2016/09/03 at 00:21:35, Nathan Parker wrote: > Who owns these pointers? Add a comment, or down below where they're used. done. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:79: void RecordRetargeting(const content::NotificationDetails& details); On 2016/09/03 at 00:21:35, Nathan Parker wrote: > What does retargeting mean? Retargeting means navigation triggered in one tab and finished in another, e.g. open link in a new tab/window either by user or by code. https://codereview.chromium.org/2302913003/diff/80001/chrome/browser/safe_bro... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:96: NavigationHandleMap navigation_handle_map_; On 2016/09/03 at 00:21:35, Nathan Parker wrote: > Add some comments on what these are used for, or how the lifetime of entries relate to each other. done. https://codereview.chromium.org/2302913003/diff/80001/chrome/test/data/safe_b... File chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html (right): https://codereview.chromium.org/2302913003/diff/80001/chrome/test/data/safe_b... chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html:45: window.webkitRequestFileSystem( On 2016/09/03 at 00:21:35, Nathan Parker wrote: > I still find this one hard to grok. It took me some time to understand this code too... It basically create a data blob and save it to html filesystem temporary space. And then use filesystem:http://embedded_test_server_address/temporary/filename to download it. https://codereview.chromium.org/2302913003/diff/80001/chrome/test/data/safe_b... chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html:80: <a id="single_meta_refresh_redirect" href="redirect.html" >Redirect download via meta refresh</a><br> On 2016/09/03 at 00:21:35, Nathan Parker wrote: > nit: space after quote. You could break+indent these after the >'s to make them easier to read. done.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Thanks! FYI, I just got hit with a critical security bug (and two other large reviews) today, so I might be a little delayed replying. I'll try to review as soon as I can.
On 2016/09/07 at 23:28:22, creis wrote: > Thanks! FYI, I just got hit with a critical security bug (and two other large reviews) today, so I might be a little delayed replying. I'll try to review as soon as I can. No problem. Take your time. :-)
Hi creis@, Sorry for bothering you again. I know everybody is crasily busy with Perf this week. Could you maybe take a look at this CL after Perf? Maybe just focus on whether my observer is listening to the right things. nparker@ will help me with the readibility bits. And if you have more time, https://crrev.com/2336833002 is a small delta based on this one. Really appreciate if you can take a look as well.
On 2016/09/15 21:58:52, Jialiu Lin wrote: > Hi creis@, > Sorry for bothering you again. I know everybody is crasily busy with Perf this > week. Could you maybe take a look at this CL after Perf? Maybe just focus on > whether my observer is listening to the right things. nparker@ will help me with > the readibility bits. > > And if you have more time, https://crrev.com/2336833002 is a small delta based > on this one. Really appreciate if you can take a look as well. Yes, very sorry for the delay! I'm scheduling time on my calendar tomorrow to review it, now that Perf is done.
I've mainly reviewed safe_browsing_navigation_observer.{h,cc} so far. Thanks
for digging in and getting started.
Looks like there's a few high level things we'll need to fix, like having
separate WebContentsObservers per WebContents and tracking the source URL
correctly. (Did the tests pass so far? I think the source_url is getting
assigned incorrectly on navigations that target different frames/tabs.)
I'm still concerned about introducing memory regressions from tracking all of
this, but maybe we can find ways to optimize it after we get it functional.
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
File chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc (right):
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:4: #include
"chrome/browser/safe_browsing/safe_browsing_navigation_observer.h"
nit: Add blank line before.
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:29: double
SecondsFromTime(const base::Time& time) {
Why do we need this?
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:87:
registrar_.Add(this, content::NOTIFICATION_NAV_ENTRY_PENDING,
Why do you need this one? The fewer notifications we can rely on, the better.
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:166:
WebContentsObserver::Observe(dest_content);
This will stop observing the previous WebContents, so it won't work. Nick
points out that TabHelpers::AttachTabHelpers is the right pattern to use for
listening to all WebContents (where we create one observer per WebContents).
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:203: // the
source url.
This doesn't sound right to me. If a navigation in one frame/tab targets
another frame/tab, the DidStartNavigation will pass a handle for the new
frame/tab. That means you wouldn't get the URL of the frame that initiated the
navigation.
I thought this was the reason we wanted to use Jochen's extension API
suggestion? (I'm having trouble remembering the details.)
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:219:
nav_event.main_frame_url = navigation_handle->GetWebContents()->GetURL();
GetURL is deprecated because it's ambiguous whether you want the visible or last
committed one. I think you want GetLastCommittedURL.
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:233:
nav_event->server_redirect_url = navigation_handle->GetURL();
If you only want to track the last redirect and not the chain, we should
document that in the declaration.
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:244: // In
NavigationHandle's definition, render initiated navigation includes:
nit: s/render/renderer/
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:246: // meta
refresh tag and using window.history.pushState.
Be careful not to imply that this list exhaustive. It's a very small list of
examples that try to give a sense for the variety of ways the renderer process
can navigate.
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:251: // if
this info is available.
Yes, we generally consider "user initiated" to mean browser_initiated ||
has_user_gesture. You can probably shorten the description a bit accordingly.
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:267: :
nav_event->target_url;
I think we might need to be more careful about the URL than this. It's not
clear to me whether navigation_handle->GetURL() will match either of these at
the time it finishes (e.g., if it ends up being an error page, blocked by CSP,
etc). Maybe that gets handled as a new DidStart/DidFinish? (Nasko is looking
into this at the moment, I believe.)
It would also help to have more documentation on what you expect target_url and
server_redirect_url to be. Is target_url supposed to match the committed URL if
there's a redirect, or is it really more of an original URL for the request?
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:271:
navigation_map_.at(key_url).push_back(std::move(*nav_event));
Why do this separately (requiring a second lookup)?
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
File chrome/browser/safe_browsing/safe_browsing_navigation_observer.h (right):
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:13: struct
RetargetingDetails;
nit: This appears unused in the .h file.
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:25: class
SafeBrowsingNavigationObserver : public content::NotificationObserver,
It looks like this is trying to have one instance that observes all WebContents,
which isn't supported. (See the WebContentsObserver::Observe documentation.)
Also, is this meant to be limited to just WebContents in normal tabs, or other
non-traditional WebContents like background pages, etc? (Hopefully just normal
tabs.)
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:28: struct
NavigationEvent {
Please have a comment with some description of how this will be used (as well as
details like whether it will track subframe navigations as well).
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:34: int
source_tab_id;
Please document these fields. For example, WebContents doesn't have an ID, so
it's not immediately clear what ID this is tracking.
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:44: bool
is_user_initiated;
Mention how you track this. Does it boil down to is_browser_initiated ||
has_user_gesture?
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:47: bool
is_finished;
What event does this refer to? (There are many possible ways to declare
something finished.)
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:48: GURL
server_redirect_url;
Normally we keep track of a "redirect chain" of URLs. Which one is this? (And
maybe it should be listed closer to is_server_redirect?)
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:60:
NavigationHandleMap;
nit: Switch order to be consistent with the declarations below?
Also, do these typedefs need to be public, or can they be declared private?
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:75: // Record
navigation event when observes a chrome::NOTIFICATION_RETARGETING
nit: s/observes/observing/
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:76: //
notification. Retargeting means navigation triggered by one webcontents
nit: WebContents
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:79: void
RecordRetargeting(const content::NotificationDetails& details);
Huh. I wasn't familiar with NOTIFICATION_RETARGETING. NotificationObserver is
deprecated, so I wonder if there's another way we should be listening for this.
(Not sure, though, so maybe we can leave it for now.)
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:89:
nit: No blank lines between method override declarations.
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:103: // events.
NavigationHandle pointers are owned by Navigator class. Since a
I think NavigationHandle is owned by RenderFrameHost, rather than Navigator.
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:113: //
navigation_map_ will be removed if they are older than t minutes since
Please say what t is.
I still have to wonder if we're recording too much here. Even with a small time
limit, a large number of navigations will require a fair amount more memory than
before (and any time limit could let an attacker just wait long enough for the
events to expire before proceeding with the download).
I keep wondering if there's a point at which we can forget certain navigation
events and still have what you need, but I suppose that's hard. An attacker
could always close the original tab (or navigate it to a different page) after
opening a new tab, and we'd lose info about the original page.
https://codereview.chromium.org/2302913003/diff/180001/chrome/test/data/safe_...
File
chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html
(right):
https://codereview.chromium.org/2302913003/diff/180001/chrome/test/data/safe_...
chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html:84:
<a id="direct_download_noreferrer" href="../signed.exe" rel="noreferrer">
For all the noreferrer cases, you might want to also test rel=noreferrer and
target=_blank. That opens a new window and does a process swap, which
might be another way to bypass any tracking.
On 2016/09/16 23:35:06, Charlie Reis (slow) wrote:
> I've mainly reviewed safe_browsing_navigation_observer.{h,cc} so far. Thanks
> for digging in and getting started.
>
> Looks like there's a few high level things we'll need to fix, like having
> separate WebContentsObservers per WebContents and tracking the source URL
> correctly. (Did the tests pass so far? I think the source_url is getting
> assigned incorrectly on navigations that target different frames/tabs.)
>
> I'm still concerned about introducing memory regressions from tracking all of
> this, but maybe we can find ways to optimize it after we get it functional.
>
>
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
> File chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc
(right):
>
>
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
> chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:4: #include
> "chrome/browser/safe_browsing/safe_browsing_navigation_observer.h"
> nit: Add blank line before.
>
>
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
> chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:29: double
> SecondsFromTime(const base::Time& time) {
> Why do we need this?
>
Thanks for your feedback! This is really helpful. I'll address them one by one
next week.
Just to make sure I understand your hight level points. It seems the biggest
misunderstanding I had is what WebContentsObserver::Observe() implies. I didn't
realize it will stop the observation of the previous WebContents.
So to recap what you said, I need (at least) two classes to track navigation
events. One is a observer, that listens to a single WebContents; and the other
is a kind of observer manager, which in charge of aggregating navigation details
from individual observers and other management tasks such as clean up.
TabHelpers::AttachTabHelpers is useful for creating and associating each
navigation observer to a specific WebContents.
In case of opening link in a new tab event, observing
chrome::NOTIFICATION_RETARGETING notification so far is the only way I know to
record the linkage between two WebContents. Let me know if you have better
ideas.
Also in terms of recording too much data, I agree with you, although I don't
have a good solution here. Let me get the functionality right first, and we can
do finch tests to see how it impacts performance and memory uses.
Thank you very much!
On 2016/09/17 00:35:10, Jialiu Lin wrote:
> On 2016/09/16 23:35:06, Charlie Reis (slow) wrote:
> > I've mainly reviewed safe_browsing_navigation_observer.{h,cc} so far.
Thanks
> > for digging in and getting started.
> >
> > Looks like there's a few high level things we'll need to fix, like having
> > separate WebContentsObservers per WebContents and tracking the source URL
> > correctly. (Did the tests pass so far? I think the source_url is getting
> > assigned incorrectly on navigations that target different frames/tabs.)
> >
> > I'm still concerned about introducing memory regressions from tracking all
of
> > this, but maybe we can find ways to optimize it after we get it functional.
> >
> >
>
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
> > File chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc
> (right):
> >
> >
>
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
> > chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:4:
#include
> > "chrome/browser/safe_browsing/safe_browsing_navigation_observer.h"
> > nit: Add blank line before.
> >
> >
>
https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br...
> > chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:29: double
> > SecondsFromTime(const base::Time& time) {
> > Why do we need this?
> >
>
>
> Thanks for your feedback! This is really helpful. I'll address them one by one
> next week.
> Just to make sure I understand your hight level points. It seems the biggest
> misunderstanding I had is what WebContentsObserver::Observe() implies. I
didn't
> realize it will stop the observation of the previous WebContents.
> So to recap what you said, I need (at least) two classes to track navigation
> events. One is a observer, that listens to a single WebContents; and the other
> is a kind of observer manager, which in charge of aggregating navigation
details
> from individual observers and other management tasks such as clean up.
Yes, that sounds right.
> TabHelpers::AttachTabHelpers is useful for creating and associating each
> navigation observer to a specific WebContents.
Yep. I haven't worked as much directly with that, but Nick pointed me to it as
the way to do this. (We can check in with him if questions come up.)
> In case of opening link in a new tab event, observing
> chrome::NOTIFICATION_RETARGETING notification so far is the only way I know to
> record the linkage between two WebContents. Let me know if you have better
> ideas.
I'll ask around. Nick, Nasko, or Avi might know.
> Also in terms of recording too much data, I agree with you, although I don't
> have a good solution here. Let me get the functionality right first, and we
can
> do finch tests to see how it impacts performance and memory uses.
>
> Thank you very much!
Sounds good to me. Thanks!
The CQ bit was checked by jialiul@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Hi creis@, Thanks for your comments! Here are some major changes since last patch: (1) make SafeBrowsingNavigationObserver per WebContents (2) refine logic of getting user gesture by overriding WebContentsObserver::DidGetUserInteraction(..) (3) a new class SBNavigationObserverManager to do management tasks and handle retargeting cases. (4) Since WebContentsObserver::DidOpenRequestedURL does not work as we expected, I'll stick to using NOTIFICATION_RETARGETING for now. (5) add a couple of more test cases for target=_blank Please feel free to add more comments. Thanks! https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc (right): https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:4: #include "chrome/browser/safe_browsing/safe_browsing_navigation_observer.h" On 2016/09/16 23:35:04, Charlie Reis (slow) wrote: > nit: Add blank line before. Done. https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:29: double SecondsFromTime(const base::Time& time) { On 2016/09/16 at 23:35:04, Charlie Reis (slow) wrote: > Why do we need this? removed. https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:87: registrar_.Add(this, content::NOTIFICATION_NAV_ENTRY_PENDING, On 2016/09/16 at 23:35:05, Charlie Reis (slow) wrote: > Why do you need this one? The fewer notifications we can rely on, the better. Removed, only keep NOTIFICATION_RETARGETING https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:166: WebContentsObserver::Observe(dest_content); On 2016/09/16 at 23:35:05, Charlie Reis (slow) wrote: > This will stop observing the previous WebContents, so it won't work. Nick points out that TabHelpers::AttachTabHelpers is the right pattern to use for listening to all WebContents (where we create one observer per WebContents). Done. https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:203: // the source url. On 2016/09/16 23:35:05, Charlie Reis (slow) wrote: > This doesn't sound right to me. If a navigation in one frame/tab targets > another frame/tab, the DidStartNavigation will pass a handle for the new > frame/tab. That means you wouldn't get the URL of the frame that initiated the > navigation. > > I thought this was the reason we wanted to use Jochen's extension API > suggestion? (I'm having trouble remembering the details.) retargeting is handled by listening to chrome::NOTIFICATION_RETARGETING notification in SBNavigationObserverManager. This is similar to what jochen's extension API did, though unfortunately I cannot use his classes since they are defined in chrome/browser/extensions/api https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:219: nav_event.main_frame_url = navigation_handle->GetWebContents()->GetURL(); On 2016/09/16 23:35:04, Charlie Reis (slow) wrote: > GetURL is deprecated because it's ambiguous whether you want the visible or last > committed one. I think you want GetLastCommittedURL. Done. https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:233: nav_event->server_redirect_url = navigation_handle->GetURL(); On 2016/09/16 23:35:04, Charlie Reis (slow) wrote: > If you only want to track the last redirect and not the chain, we should > document that in the declaration. Done. https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:244: // In NavigationHandle's definition, render initiated navigation includes: On 2016/09/16 23:35:05, Charlie Reis (slow) wrote: > nit: s/render/renderer/ Done. https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:246: // meta refresh tag and using window.history.pushState. On 2016/09/16 23:35:04, Charlie Reis (slow) wrote: > Be careful not to imply that this list exhaustive. It's a very small list of > examples that try to give a sense for the variety of ways the renderer process > can navigate. Fixed comments https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:251: // if this info is available. On 2016/09/16 23:35:04, Charlie Reis (slow) wrote: > Yes, we generally consider "user initiated" to mean browser_initiated || > has_user_gesture. You can probably shorten the description a bit accordingly. Done. https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:267: : nav_event->target_url; On 2016/09/16 23:35:04, Charlie Reis (slow) wrote: > I think we might need to be more careful about the URL than this. It's not > clear to me whether navigation_handle->GetURL() will match either of these at > the time it finishes (e.g., if it ends up being an error page, blocked by CSP, > etc). Maybe that gets handled as a new DidStart/DidFinish? (Nasko is looking > into this at the moment, I believe.) > > It would also help to have more documentation on what you expect target_url and > server_redirect_url to be. Is target_url supposed to match the committed URL if > there's a redirect, or is it really more of an original URL for the request? You're right. I should use the committed URL instead. In case of an error page, I can probably ignore this navigation, since it is unlikely to trigger a download. https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:271: navigation_map_.at(key_url).push_back(std::move(*nav_event)); On 2016/09/16 23:35:05, Charlie Reis (slow) wrote: > Why do this separately (requiring a second lookup)? Done. https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.h (right): https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:13: struct RetargetingDetails; On 2016/09/16 23:35:05, Charlie Reis (slow) wrote: > nit: This appears unused in the .h file. Done. https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:25: class SafeBrowsingNavigationObserver : public content::NotificationObserver, On 2016/09/16 23:35:05, Charlie Reis (slow) wrote: > It looks like this is trying to have one instance that observes all WebContents, > which isn't supported. (See the WebContentsObserver::Observe documentation.) > > Also, is this meant to be limited to just WebContents in normal tabs, or other > non-traditional WebContents like background pages, etc? (Hopefully just normal > tabs.) Change to one observer per WebContents instead. Thanks for pointing this out. https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:28: struct NavigationEvent { On 2016/09/16 23:35:06, Charlie Reis (slow) wrote: > Please have a comment with some description of how this will be used (as well as > details like whether it will track subframe navigations as well). Done. https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:34: int source_tab_id; On 2016/09/16 23:35:05, Charlie Reis (slow) wrote: > Please document these fields. For example, WebContents doesn't have an ID, so > it's not immediately clear what ID this is tracking. Done. https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:44: bool is_user_initiated; On 2016/09/16 at 23:35:05, Charlie Reis (slow) wrote: > Mention how you track this. Does it boil down to is_browser_initiated || has_user_gesture? Done. https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:47: bool is_finished; On 2016/09/16 23:35:05, Charlie Reis (slow) wrote: > What event does this refer to? (There are many possible ways to declare > something finished.) This field is no longer needed. Removed. https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:48: GURL server_redirect_url; On 2016/09/16 23:35:05, Charlie Reis (slow) wrote: > Normally we keep track of a "redirect chain" of URLs. Which one is this? (And > maybe it should be listed closer to is_server_redirect?) If there are more than one server redirect, this is the final redirect it resolves to. https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:60: NavigationHandleMap; On 2016/09/16 23:35:05, Charlie Reis (slow) wrote: > nit: Switch order to be consistent with the declarations below? > > Also, do these typedefs need to be public, or can they be declared private? NavigationMap is used in both unit tests and browser tests, so make it public is easier. https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:75: // Record navigation event when observes a chrome::NOTIFICATION_RETARGETING On 2016/09/16 23:35:05, Charlie Reis (slow) wrote: > nit: s/observes/observing/ Done. https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:76: // notification. Retargeting means navigation triggered by one webcontents On 2016/09/16 23:35:05, Charlie Reis (slow) wrote: > nit: WebContents Done. https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:79: void RecordRetargeting(const content::NotificationDetails& details); On 2016/09/16 23:35:05, Charlie Reis (slow) wrote: > Huh. I wasn't familiar with NOTIFICATION_RETARGETING. NotificationObserver is > deprecated, so I wonder if there's another way we should be listening for this. > (Not sure, though, so maybe we can leave it for now.) I tried WebContentsObserver::DidOpenRequestedURL as suggested by Nick and Avi. Unfortunately it is not working as expected. Let's keep it for now until I figure out an alternative way. https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:89: On 2016/09/16 23:35:05, Charlie Reis (slow) wrote: > nit: No blank lines between method override declarations. Done. https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:103: // events. NavigationHandle pointers are owned by Navigator class. Since a On 2016/09/16 23:35:06, Charlie Reis (slow) wrote: > I think NavigationHandle is owned by RenderFrameHost, rather than Navigator. Done. https://codereview.chromium.org/2302913003/diff/180001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:113: // navigation_map_ will be removed if they are older than t minutes since On 2016/09/16 23:35:05, Charlie Reis (slow) wrote: > Please say what t is. > > I still have to wonder if we're recording too much here. Even with a small time > limit, a large number of navigations will require a fair amount more memory than > before (and any time limit could let an attacker just wait long enough for the > events to expire before proceeding with the download). > > I keep wondering if there's a point at which we can forget certain navigation > events and still have what you need, but I suppose that's hard. An attacker > could always close the original tab (or navigate it to a different page) after > opening a new tab, and we'd lose info about the original page. Ah, sorry, I should mention that t is a place holder here. I was planning to create a constant in safe_browsing_service.h and refer to that constant in this comment. Anyway, let me put "2 minutes" for now. https://codereview.chromium.org/2302913003/diff/180001/chrome/test/data/safe_... File chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html (right): https://codereview.chromium.org/2302913003/diff/180001/chrome/test/data/safe_... chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html:84: <a id="direct_download_noreferrer" href="../signed.exe" rel="noreferrer"> On 2016/09/16 23:35:06, Charlie Reis (slow) wrote: > For all the noreferrer cases, you might want to also test rel=noreferrer and > target=_blank. That opens a new window and does a process swap, which > might be another way to bypass any tracking. Add a couple of more cases for target=_blank
Thanks for the updates! I'm part of the way through another pass, but I think enough questions have come up that I should post them now and we can discuss how to resolve them. I'm CC'ing Nasko, since some of these choices might not work in PlzNavigate. (Nasko, no need for a full review at the moment, but I wanted this to be on your radar.) Let's also expand the CL description a bit, including why this info is being collected (e.g., future reporting to track down suspicious downloads). https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc (right): https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:27: bool IsExpired(const base::Time& timestamp) { nit: IsUserGestureExpired? https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:90: std::make_pair(nav_event_key, std::vector<NavigationEvent>())); Maybe I'm misreading this, but isn't this only needed if there's no vector already? If there's a pre-existing vector for this key, we presumably just want to push the event into it and not insert a new empty vector. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:143: source_contents->GetRenderProcessHost()->GetID(), This would break with OOPIFs, since WebContents::GetRenderProcessHost() is for the main frame and might not match the RFH's process ID. I filed https://crbug.com/649855, since RetargetingDetails apparently lacks the frame's process ID to be able to get this right. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:146: replacements.ClearRef(); Can you help me understand why we're clearing the ref everywhere? I would suggest a helper function for it, but I'm not sure if we want to do it at all. For example, if it's so that in-page navigations look the same as the original page load, that will only help with fragment navigations and not pushState or replaceState (which can change the URL to any path within the same origin). https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:221: navigation_handle->GetFrameTreeNodeId()); I'm concerned about this line, since the navigation could be starting in a pending RenderFrameHost, not the current RenderFrameHost. Maybe that doesn't matter if we're just trying to get the source URL, but I think there's a problem with that as well (below). https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:231: host->GetLastCommittedURL().ReplaceComponents(replacements); This looks like it won't handle different frames of the same page targeting each other (which I'm guessing don't go through NOTIFICATION_RETARGETING). If the main frame at evil.com creates an iframe at innocent.com and then navigates it to download.com, source_url will look like innocent.com instead of evil.com. It's not clear to me whether we have something that tracks the URL of the page that caused the navigation yet, and it might be something our team will need to add. Can you test and verify whether that's correct? https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.h (right): https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:17: class NotificationRegistrar; nit: You're including NavigationHandle and NotificationRegistrar above already, so you don't need to also forward declare them here. (Looking more closely, maybe it makes sense to forward declare NavigationHandle and not include it, but vice versa for NotificationRegistrar?) https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:24: // Struct to record the details of a navigation event for all frames. nit: for any frame. (A single struct won't record for all frames.) https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:32: GURL source_url; nit: Add blank line before, and add a comment saying what this refers to. ("Source URL" and "Source WebContents" aren't well defined concepts, and could be confused with opener. I think you're referring to the URL and WebContents that caused this navigation to occur. This could be the previous page in the same frame, a different frame of the same page, or a different tab entirely.) https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:37: GURL target_url; Let's be clear about whether this is the original request URL (pre-redirects) or the committed URL. I'm not sure which is better for looking things up in the map. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:48: GURL server_redirect_url; // Last server redirect url in redirect chain. Please clarify in the comment whether this is always present (i.e., if it's equal to target_url if there are no redirects), or if it's only non-empty when there's at least one server redirect. (Redirect chains are typically handled more like the former, but I'm guessing you mean the latter.) https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:51: // Manager class for SafeBrowsingNavigationObserver. Can you say more about what this class manages? This is mostly a recap of the name. :) https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:54: class SBNavigationObserverManager Is declaring a top-level class in the same file ok within chrome/? I don't think we usually allow it in content/, unless it's within an anonymous namespace and not exposed to other code. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:65: SBNavigationObserverManager(); nit: Blank line before (after typedefs). https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:97: // keyed on target url. Since the same url can be requested multiple times Is it correct to key this on the pre-redirect URL? (I haven't verified yet that there's a bug here-- it depends on how we look up the key later.) https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:101: // their corresponding navigations finish. Is this timer implemented yet? I'm curious how it will work. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:105: // in each WebContents. This is used to determin if any retargeting navigation nit: determine This seems like it won't be accurate. If there's a user gesture in one tab, and later the tab automatically opens a popup with no user gesture, it will look like there was one. If you're ok with this for your use case, please document it so that future code doesn't make the wrong assumption about this data. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:129: static const char kWebContentsUserDataKey[]; nit: Let's just declare this in an anonymous namespace in the .cc file if possible. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:130: void RecordNavigationPendingEntry( I don't see this implemented anywhere. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:150: NavigationHandleMap; nit: Typedefs should be declared at the beginning of the section, before methods. (https://google.github.io/styleguide/cppguide.html#Declaration_Order) https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:154: bool has_user_gesture_; I don't think we can track this on a per-tab basis. Several frames in the tab can have navigations going at the same time, and each one might or might not have a user gesture. Why is it needed here, vs in NavigationEvent?
Description was changed from ========== Add SafeBrowsingNavigationObserver to listen to navigation events happen on all frames. BUG=639467 ========== to ========== Add SafeBrowsingNavigationObserver to listen to navigation events happen on all frames. These navigation events can help safe browsing service identify suspicious downloads that intentionally hide their referrers and/or landing pages, and report these pieces of info to safe browsing backend. This CL only covers the observing part. More management code (e.g. navigation event periodically cleanup, actual reporting, etc) will come up shortly. BUG=639467 ==========
Description was changed from ========== Add SafeBrowsingNavigationObserver to listen to navigation events happen on all frames. These navigation events can help safe browsing service identify suspicious downloads that intentionally hide their referrers and/or landing pages, and report these pieces of info to safe browsing backend. This CL only covers the observing part. More management code (e.g. navigation event periodically cleanup, actual reporting, etc) will come up shortly. BUG=639467 ========== to ========== Add SafeBrowsingNavigationObserver to listen to navigation events happen on all frames. These navigation events can help safe browsing service identify suspicious downloads that intentionally hide their referrers and/or landing pages, and report these pieces of info to safe browsing backend. This CL only covers the observing part. More management code (e.g. periodically cleanup, actual reporting, etc) will come up shortly. BUG=639467 ==========
Thanks, creis@! Please see my replies inline. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc (right): https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:27: bool IsExpired(const base::Time& timestamp) { On 2016/09/23 at 23:14:13, Charlie Reis (slow) wrote: > nit: IsUserGestureExpired? Done. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:90: std::make_pair(nav_event_key, std::vector<NavigationEvent>())); On 2016/09/23 at 23:14:14, Charlie Reis (slow) wrote: > Maybe I'm misreading this, but isn't this only needed if there's no vector already? If there's a pre-existing vector for this key, we presumably just want to push the event into it and not insert a new empty vector. http://en.cppreference.com/w/cpp/container/unordered_map/insert If the key is already there, insertion will not happen. std::unordered_map::insert returns a pair<iterator, bool>, this bool indicates if insertion actually happened, the iterator point to the latest inserted entry or the one prevent the insertion. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:143: source_contents->GetRenderProcessHost()->GetID(), On 2016/09/23 at 23:14:13, Charlie Reis (slow) wrote: > This would break with OOPIFs, since WebContents::GetRenderProcessHost() is for the main frame and might not match the RFH's process ID. I filed https://crbug.com/649855, since RetargetingDetails apparently lacks the frame's process ID to be able to get this right. Thanks, I will add a comments here. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:146: replacements.ClearRef(); On 2016/09/23 at 23:14:13, Charlie Reis (slow) wrote: > Can you help me understand why we're clearing the ref everywhere? I would suggest a helper function for it, but I'm not sure if we want to do it at all. > > For example, if it's so that in-page navigations look the same as the original page load, that will only help with fragment navigations and not pushState or replaceState (which can change the URL to any path within the same origin). yes, this is to make "http://foo.com#bar" treated the same as "http://foo.com". I added a helper function ClearURLRef(..) for it. I don't quite understand the pushState or replaceState part. Can you maybe give me an example? https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:221: navigation_handle->GetFrameTreeNodeId()); On 2016/09/23 at 23:14:13, Charlie Reis (slow) wrote: > I'm concerned about this line, since the navigation could be starting in a pending RenderFrameHost, not the current RenderFrameHost. Maybe that doesn't matter if we're just trying to get the source URL, but I think there's a problem with that as well (below). Agree these are tricky cases. See my proposals below. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:231: host->GetLastCommittedURL().ReplaceComponents(replacements); On 2016/09/23 at 23:14:13, Charlie Reis (slow) wrote: > This looks like it won't handle different frames of the same page targeting each other (which I'm guessing don't go through NOTIFICATION_RETARGETING). If the main frame at evil.com creates an iframe at innocent.com and then navigates it to download.com, source_url will look like innocent.com instead of evil.com. > > It's not clear to me whether we have something that tracks the URL of the page that caused the navigation yet, and it might be something our team will need to add. Can you test and verify whether that's correct? There are two cases keep me awake at night. (1) the first one is just like what you described, i.e. creating a scapegoat iframe within the same page. For this case, at least main frame "evil.com" is captured in NavigationEvent.main_frame_url, so evil.com will eventually be blacklisted. (2) evil.com navigates to innocent.com in a new named tab (main frame), then evil.com tries to download a malicious file in the tab currently displaying innocent.com. This can be easily done by using target="innocent". The second navigation will not trigger NOTIFICATION_RETARGETING, so it looks just like innocent.com is landing a bad download. And to confirm, there is no existing method (I've known of) can track this case correctly. Though, there are 2 hacky ways we can try: (a) In SafeBrosingNaivgationObserverManager, create another map that keeps track of who creates each WebContents (both the creator webcontents and creator URL). We'll report this creator URL chain (if they are not stale) in download ping. (b) NavigationHandle::GetReferrer() seems pointing to the correct referrer if no ReferrerPolicy is enforced. I wonder if it is possible to always keep an unsanitized referrer, it might involve some upstream changes in navigator and other classes as well. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.h (right): https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:17: class NotificationRegistrar; On 2016/09/23 at 23:14:14, Charlie Reis (slow) wrote: > nit: You're including NavigationHandle and NotificationRegistrar above already, so you don't need to also forward declare them here. (Looking more closely, maybe it makes sense to forward declare NavigationHandle and not include it, but vice versa for NotificationRegistrar?) Done. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:24: // Struct to record the details of a navigation event for all frames. On 2016/09/23 at 23:14:14, Charlie Reis (slow) wrote: > nit: for any frame. > (A single struct won't record for all frames.) Done. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:32: GURL source_url; On 2016/09/23 at 23:14:14, Charlie Reis (slow) wrote: > nit: Add blank line before, and add a comment saying what this refers to. ("Source URL" and "Source WebContents" aren't well defined concepts, and could be confused with opener. I think you're referring to the URL and WebContents that caused this navigation to occur. This could be the previous page in the same frame, a different frame of the same page, or a different tab entirely.) Done. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:37: GURL target_url; On 2016/09/23 at 23:14:14, Charlie Reis (slow) wrote: > Let's be clear about whether this is the original request URL (pre-redirects) or the committed URL. I'm not sure which is better for looking things up in the map. This is the original request URL. But NavigationMap is keyed on the committed URL. I'll add a comment near NavigationMap to make this point clearer. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:48: GURL server_redirect_url; // Last server redirect url in redirect chain. On 2016/09/23 at 23:14:14, Charlie Reis (slow) wrote: > Please clarify in the comment whether this is always present (i.e., if it's equal to target_url if there are no redirects), or if it's only non-empty when there's at least one server redirect. (Redirect chains are typically handled more like the former, but I'm guessing you mean the latter.) Yes, in case no server redirect, this field will be empty. Comment clarified. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:51: // Manager class for SafeBrowsingNavigationObserver. On 2016/09/23 at 23:14:14, Charlie Reis (slow) wrote: > Can you say more about what this class manages? This is mostly a recap of the name. :) Done. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:54: class SBNavigationObserverManager On 2016/09/23 at 23:14:14, Charlie Reis (slow) wrote: > Is declaring a top-level class in the same file ok within chrome/? I don't think we usually allow it in content/, unless it's within an anonymous namespace and not exposed to other code. Moved this manager class into a separate file. Done. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:65: SBNavigationObserverManager(); On 2016/09/23 at 23:14:14, Charlie Reis (slow) wrote: > nit: Blank line before (after typedefs). Done. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:97: // keyed on target url. Since the same url can be requested multiple times On 2016/09/23 at 23:14:14, Charlie Reis (slow) wrote: > Is it correct to key this on the pre-redirect URL? (I haven't verified yet that there's a bug here-- it depends on how we look up the key later.) Fixed this comments. Map actually keyed on post-redirect final resolved URL. Thanks for catching this. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:101: // their corresponding navigations finish. On 2016/09/23 at 23:14:14, Charlie Reis (slow) wrote: > Is this timer implemented yet? I'm curious how it will work. Not yet. It will be in my next CL. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:105: // in each WebContents. This is used to determin if any retargeting navigation On 2016/09/23 at 23:14:14, Charlie Reis (slow) wrote: > nit: determine > > This seems like it won't be accurate. If there's a user gesture in one tab, and later the tab automatically opens a popup with no user gesture, it will look like there was one. If you're ok with this for your use case, please document it so that future code doesn't make the wrong assumption about this data. Safe browsing backend will aggregate download pings first, so we don't need to get 100% correct for individual cases. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:129: static const char kWebContentsUserDataKey[]; On 2016/09/23 at 23:14:14, Charlie Reis (slow) wrote: > nit: Let's just declare this in an anonymous namespace in the .cc file if possible. Done. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:130: void RecordNavigationPendingEntry( On 2016/09/23 at 23:14:14, Charlie Reis (slow) wrote: > I don't see this implemented anywhere. Oops, forgot to delete this one. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:150: NavigationHandleMap; On 2016/09/23 at 23:14:14, Charlie Reis (slow) wrote: > nit: Typedefs should be declared at the beginning of the section, before methods. > (https://google.github.io/styleguide/cppguide.html#Declaration_Order) Done https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:154: bool has_user_gesture_; On 2016/09/23 at 23:14:14, Charlie Reis (slow) wrote: > I don't think we can track this on a per-tab basis. Several frames in the tab can have navigations going at the same time, and each one might or might not have a user gesture. > > Why is it needed here, vs in NavigationEvent? Oh, sorry for the misleading comment. This field is set by WebContentsObservers::DidGetUserInteraction(), and WebContentsObservers::DidGetUserInteraction() is actually called before DidStartNavigation(). That is to say, we need a field to keep track of this boolean before either NavigationHandle is available or NavigationEvent is created. And I assume that the navigation closely following DidGetUserInteraction() is the one triggered by user gesture. (and similar to previous comment, this might not be 100% true for individual cases, but should be right after aggregation.)
Thanks. Some thoughts on in-page navigations and tracking source URLs below. It sounds like we may want to chat some more (perhaps also with nasko@ and dcheng@) about the right way to track which document caused a navigation to occur. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc (right): https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:90: std::make_pair(nav_event_key, std::vector<NavigationEvent>())); On 2016/09/27 17:50:59, Jialiu Lin wrote: > On 2016/09/23 at 23:14:14, Charlie Reis (slow) wrote: > > Maybe I'm misreading this, but isn't this only needed if there's no vector > already? If there's a pre-existing vector for this key, we presumably just want > to push the event into it and not insert a new empty vector. > > http://en.cppreference.com/w/cpp/container/unordered_map/insert > If the key is already there, insertion will not happen. > std::unordered_map::insert returns a pair<iterator, bool>, this bool indicates > if insertion actually happened, the iterator point to the latest inserted entry > or the one prevent the insertion. Acknowledged. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:146: replacements.ClearRef(); On 2016/09/27 17:50:59, Jialiu Lin wrote: > On 2016/09/23 at 23:14:13, Charlie Reis (slow) wrote: > > Can you help me understand why we're clearing the ref everywhere? I would > suggest a helper function for it, but I'm not sure if we want to do it at all. > > > > For example, if it's so that in-page navigations look the same as the original > page load, that will only help with fragment navigations and not pushState or > replaceState (which can change the URL to any path within the same origin). > > yes, this is to make "http://foo.com#bar" treated the same as "http://foo.com". > I added a helper function ClearURLRef(..) for it. > > I don't quite understand the pushState or replaceState part. Can you maybe give > me an example? Sure. For both #bar and pushState/replaceState, we're talking about in-page navigations that don't have a new document and keep all their existing iframes and other state. Clearing the ref may give you a kind of canonical URL for the fragment case, but it won't help with pushState. Suppose we're on http://foo.com/pageone. That page can do history.pushState({}, "anything", "anything"), at which point the URL becomes http://foo.com/anything. There's no way to tell just by looking at the URL that "pageone" and "anything" are the same document. This means that clearing the ref is not an effective way to tell if we're still on the same page or not. As a result, I'd recommend not doing it, and we can look for other ways to achieve the goal you're after. What's the problem you were trying to solve? (In other words, why did you need to treat foo.com and foo.com#bar the same?) https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:231: host->GetLastCommittedURL().ReplaceComponents(replacements); On 2016/09/27 17:50:59, Jialiu Lin wrote: > On 2016/09/23 at 23:14:13, Charlie Reis (slow) wrote: > > This looks like it won't handle different frames of the same page targeting > each other (which I'm guessing don't go through NOTIFICATION_RETARGETING). If > the main frame at http://evil.com creates an iframe at http://innocent.com and then navigates > it to http://download.com, source_url will look like http://innocent.com instead of http://evil.com. > > > > It's not clear to me whether we have something that tracks the URL of the page > that caused the navigation yet, and it might be something our team will need to > add. Can you test and verify whether that's correct? > > There are two cases keep me awake at night. > (1) the first one is just like what you described, i.e. creating a scapegoat > iframe within the same page. > For this case, at least main frame "evil.com" is captured in > NavigationEvent.main_frame_url, so http://evil.com will eventually be blacklisted. Note that my example could have put evil.com in iframe 1 and innocent.com/download.com in iframe 2. We would never see evil.com reported, though you're right that we would at least have whatever main_frame_url is, which is somewhat indirectly responsible (e.g., it might be a news web site that loaded a bad advertisement). > > > (2) http://evil.com navigates to http://innocent.com in a new named tab (main frame), then > http://evil.com tries to download a malicious file in the tab currently displaying > http://innocent.com. This can be easily done by using target="innocent". The second > navigation will not trigger NOTIFICATION_RETARGETING, so it looks just like > http://innocent.com is landing a bad download. And to confirm, there is no existing > method (I've known of) can track this case correctly. Agreed. I think that case would be problematic as well. > Though, there are 2 hacky ways we can try: > (a) In SafeBrosingNaivgationObserverManager, create another map that keeps > track of who creates each WebContents (both the creator webcontents and creator > URL). We'll report this creator URL chain (if they are not stale) in download > ping. I'm a little hesitant on this one, again because main frames tend not to be special. I worry about cases that the bad URLs are in iframes and we never end up seeing them. There's also the risk that the creator of a window is not to blame, but some other window is able to target it by name. > (b) NavigationHandle::GetReferrer() seems pointing to the correct referrer if > no ReferrerPolicy is enforced. I wonder if it is possible to always keep an > unsanitized referrer, it might involve some upstream changes in navigator and > other classes as well. I'm curious about this approach. As a strawman proposal, suppose we added a unsanitized_referrer alongside referrer, such that it's populated even when referrer has been omitted. This might let us track what we need in your map. The downside is that referrer shows up *everywhere*, so it's potentially a lot of plumbing. Otherwise, we may need to bake something into the core navigation logic to track this kind of thing. The good news is, our team does see some benefits in going down that path-- it would help with process model decisions (e.g., which process should this data URL go into?) would have helped with some recent security bug fixes. So maybe it's worth finding a way to track that. (As a starting point, there's ScheduledNavigation::m_originDocument in Blink, but Daniel points out that isn't trustworthy or correct in many cases.) https://codereview.chromium.org/2302913003/diff/240001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h (right): https://codereview.chromium.org/2302913003/diff/240001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:87: // their corresponding navigations finish. Please make this last sentence a TODO, since it's not implemented yet.
Thanks creis@! I'll send out a calendar invite to you, nasko@ and dcheng@ shortly, so we can chat about these problematic cases. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc (right): https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:146: replacements.ClearRef(); On 2016/09/28 at 21:08:13, Charlie Reis wrote: > On 2016/09/27 17:50:59, Jialiu Lin wrote: > > On 2016/09/23 at 23:14:13, Charlie Reis (slow) wrote: > > > Can you help me understand why we're clearing the ref everywhere? I would > > suggest a helper function for it, but I'm not sure if we want to do it at all. > > > > > > For example, if it's so that in-page navigations look the same as the original > > page load, that will only help with fragment navigations and not pushState or > > replaceState (which can change the URL to any path within the same origin). > > > > yes, this is to make "http://foo.com#bar" treated the same as "http://foo.com". > > I added a helper function ClearURLRef(..) for it. > > > > I don't quite understand the pushState or replaceState part. Can you maybe give > > me an example? > > Sure. For both #bar and pushState/replaceState, we're talking about in-page navigations that don't have a new document and keep all their existing iframes and other state. > > Clearing the ref may give you a kind of canonical URL for the fragment case, but it won't help with pushState. Suppose we're on http://foo.com/pageone. That page can do history.pushState({}, "anything", "anything"), at which point the URL becomes http://foo.com/anything. There's no way to tell just by looking at the URL that "pageone" and "anything" are the same document. > > This means that clearing the ref is not an effective way to tell if we're still on the same page or not. As a result, I'd recommend not doing it, and we can look for other ways to achieve the goal you're after. > > What's the problem you were trying to solve? (In other words, why did you need to treat foo.com and foo.com#bar the same?) I see. How about I just remove empty ref (a.k.a "http://foo.com#"->"http://foo.com")? Since a lot of javascript triggered navigation does <a href="#" onclick="somefunction()">. Having two keys "http://foo.com#" and "http://foo.com" in NavigationMap might be a little bit confusing when I actually need to trace back downloads. https://codereview.chromium.org/2302913003/diff/220001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:231: host->GetLastCommittedURL().ReplaceComponents(replacements); On 2016/09/28 at 21:08:13, Charlie Reis wrote: > On 2016/09/27 17:50:59, Jialiu Lin wrote: > > On 2016/09/23 at 23:14:13, Charlie Reis (slow) wrote: > > > This looks like it won't handle different frames of the same page targeting > > each other (which I'm guessing don't go through NOTIFICATION_RETARGETING). If > > the main frame at http://evil.com creates an iframe at http://innocent.com and then navigates > > it to http://download.com, source_url will look like http://innocent.com instead of http://evil.com. > > > > > > It's not clear to me whether we have something that tracks the URL of the page > > that caused the navigation yet, and it might be something our team will need to > > add. Can you test and verify whether that's correct? > > > > There are two cases keep me awake at night. > > (1) the first one is just like what you described, i.e. creating a scapegoat > > iframe within the same page. > > For this case, at least main frame "evil.com" is captured in > > NavigationEvent.main_frame_url, so http://evil.com will eventually be blacklisted. > > Note that my example could have put evil.com in iframe 1 and innocent.com/download.com in iframe 2. We would never see evil.com reported, though you're right that we would at least have whatever main_frame_url is, which is somewhat indirectly responsible (e.g., it might be a news web site that loaded a bad advertisement). You're right. it is quite indirect. > > > > > > > > (2) http://evil.com navigates to http://innocent.com in a new named tab (main frame), then > > http://evil.com tries to download a malicious file in the tab currently displaying > > http://innocent.com. This can be easily done by using target="innocent". The second > > navigation will not trigger NOTIFICATION_RETARGETING, so it looks just like > > http://innocent.com is landing a bad download. And to confirm, there is no existing > > method (I've known of) can track this case correctly. > > Agreed. I think that case would be problematic as well. > > > Though, there are 2 hacky ways we can try: > > (a) In SafeBrosingNaivgationObserverManager, create another map that keeps > > track of who creates each WebContents (both the creator webcontents and creator > > URL). We'll report this creator URL chain (if they are not stale) in download > > ping. > > I'm a little hesitant on this one, again because main frames tend not to be special. I worry about cases that the bad URLs are in iframes and we never end up seeing them. There's also the risk that the creator of a window is not to blame, but some other window is able to target it by name. > > > (b) NavigationHandle::GetReferrer() seems pointing to the correct referrer if > > no ReferrerPolicy is enforced. I wonder if it is possible to always keep an > > unsanitized referrer, it might involve some upstream changes in navigator and > > other classes as well. > > I'm curious about this approach. As a strawman proposal, suppose we added a unsanitized_referrer alongside referrer, such that it's populated even when referrer has been omitted. This might let us track what we need in your map. The downside is that referrer shows up *everywhere*, so it's potentially a lot of plumbing. > > Otherwise, we may need to bake something into the core navigation logic to track this kind of thing. The good news is, our team does see some benefits in going down that path-- it would help with process model decisions (e.g., which process should this data URL go into?) would have helped with some recent security bug fixes. So maybe it's worth finding a way to track that. > > (As a starting point, there's ScheduledNavigation::m_originDocument in Blink, but Daniel points out that isn't trustworthy or correct in many cases.) Let talk more about these cases. I'll sent out a calendar invite to you, nasko@, and dcheng@ shortly. https://codereview.chromium.org/2302913003/diff/240001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h (right): https://codereview.chromium.org/2302913003/diff/240001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:87: // their corresponding navigations finish. On 2016/09/28 at 21:08:13, Charlie Reis wrote: > Please make this last sentence a TODO, since it's not implemented yet. Done.
The CQ bit was checked by jialiul@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Exceeded global retry quota
The CQ bit was checked by jialiul@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
Patchset #15 (id:300001) has been deleted
The CQ bit was checked by jialiul@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Patchset #15 (id:320001) has been deleted
Patchset #15 (id:340001) has been deleted
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: chromium_presubmit on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/chromium_presub...)
Patchset #16 (id:380001) has been deleted
Patchset #15 (id:360001) has been deleted
Hi creis@ and nparker@, Since RetargetingDetails now has the correct process source_render_process_id (crbug.com/649855), I wonder if we can move forward with this CL, such that making incremental changes would be easier. (I'm totally aware that NOTIFICATION_RETARGETING will be deprecated very soon, and we are also waiting for crbug.com/651895 to tackle frame targeting each other cases. ) Some changes to this CL since last time you reviewed: (1) incorporate the new source_render_process_id in RetargetingDetails (thanks to nasko@) (2) crrev.com/2336833002 merged into this CL That small CL was intended to get IP addresses of host from all frames through DidGetResourceResponseStart(..). Since it was based on the very first version of this CL, it becomes somewhat painful to rebase, I decided to merge it into this CL. nparker@ has already taken a pass on that CL and LGTMed. (3) Some minor changes, such as changing the helper function to just clear empty url ref instead of all ref, adjusting some comments etc. Let me know if you have further comments.
Yes, I agree, I think we should work on landing this CL so you can make further progress. Charlie -- I'll rely on you to check the navigation logic. https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc (right): https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:40: NavigationEvent::NavigationEvent(NavigationEvent&& nav_event) What is constructor used for? https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:95: last_user_gesture_timestamp_(base::Time::Now()) {} Does the Now() imply that a new observer was always associated with a gesture? https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:100: // paramter is unique to this navigation, which will appear in the following nit: parameter https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc (right): https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:23: static const double kUserGestureTTLInSecond = 0.5; Add a comment to explain what this does, or how it was chosen. https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:31: ? now - timestamp_in_double > kUserGestureTTLInSecond Should be safe to just return now - timestamp_in_double > kUserGestureTTLInSecond https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:56: insertion_result.first->second.push_back(std::move(*nav_event)); Does the std::move destroy the nav_event? If so, maybe add that to this function's comments in the .h https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:64: if (!insertion_result.second) nit: add comment, like // Update the timestamp if entry already exists. (I find "thing.first.second," etc, challenging to follow since the types are non obvious, so comments are useful) https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:75: user_gesture_map_.erase(web_contents); Some comments here would help, since I'm not clear on why it's being erased. Also, I think you can save another lookup by doing erase(it) https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:131: SafeBrowsingNavigationObserverManager::ClearEmptyURLRef( Add a comment as to why ClearEmptyURLRef() is this necessary. https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:135: std::make_pair(target_url, std::vector<NavigationEvent>())); If you save the iterator this returns, you can do the push_back() later w/o re-searching the map. Actually, this should probably be moved down there for clarity. https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:138: nav_event.source_url = since source_url is already empty, you could do if (rfh) { nav_event.source_url = .. } and skip the temporary GURL() creation. https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:149: nav_event.timestamp = base::Time::Now(); This is already set in the nav_event ctor. You could remove it from there. https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h (right): https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:24: // Manager class for SafeBrowsingNavigationObserver, which in charge of cleaning nit: which is in charge https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:35: struct GurlHash { This could be private
The CQ bit was checked by jialiul@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Thanks, nparker! Comments addressed. https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc (right): https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:40: NavigationEvent::NavigationEvent(NavigationEvent&& nav_event) On 2016/10/14 23:52:49, Nathan Parker wrote: > What is constructor used for? This move constructor is required by std::move(). https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:95: last_user_gesture_timestamp_(base::Time::Now()) {} On 2016/10/14 23:52:49, Nathan Parker wrote: > Does the Now() imply that a new observer was always associated with a gesture? not necessary. whether there is a user gesture is indicated by has_user_gesture boolean. Let me use base::Time::Time() instead to make it less confusing. https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:100: // paramter is unique to this navigation, which will appear in the following On 2016/10/14 23:52:49, Nathan Parker wrote: > nit: parameter Done. https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc (right): https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:23: static const double kUserGestureTTLInSecond = 0.5; On 2016/10/14 23:52:49, Nathan Parker wrote: > Add a comment to explain what this does, or how it was chosen. Done. https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:31: ? now - timestamp_in_double > kUserGestureTTLInSecond On 2016/10/14 23:52:49, Nathan Parker wrote: > Should be safe to just > > return now - timestamp_in_double > kUserGestureTTLInSecond ACK. I would like to return expired (true) if timestamp is in the future ( larger than now, which suggests invalid). but now - timestamp_in_double > kUserGestureTTLInSecond would return otherwise. https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:56: insertion_result.first->second.push_back(std::move(*nav_event)); On 2016/10/14 23:52:49, Nathan Parker wrote: > Does the std::move destroy the nav_event? If so, maybe add that to this > function's comments in the .h Done. https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:64: if (!insertion_result.second) On 2016/10/14 23:52:49, Nathan Parker wrote: > nit: add comment, like // Update the timestamp if entry already exists. > > (I find "thing.first.second," etc, challenging to follow since the types are non > obvious, so comments are useful) Agree. comment added. https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:75: user_gesture_map_.erase(web_contents); On 2016/10/14 23:52:49, Nathan Parker wrote: > Some comments here would help, since I'm not clear on why it's being erased. > > Also, I think you can save another lookup by doing erase(it) Done. https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:131: SafeBrowsingNavigationObserverManager::ClearEmptyURLRef( On 2016/10/14 23:52:49, Nathan Parker wrote: > Add a comment as to why ClearEmptyURLRef() is this necessary. Done. https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:135: std::make_pair(target_url, std::vector<NavigationEvent>())); On 2016/10/14 23:52:49, Nathan Parker wrote: > If you save the iterator this returns, you can do the push_back() later w/o > re-searching the map. Actually, this should probably be moved down there for > clarity. Thanks for catching these. https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:138: nav_event.source_url = On 2016/10/14 23:52:49, Nathan Parker wrote: > since source_url is already empty, you could do > if (rfh) { > nav_event.source_url = .. > } > > and skip the temporary GURL() creation. Done. https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:149: nav_event.timestamp = base::Time::Now(); On 2016/10/14 23:52:49, Nathan Parker wrote: > This is already set in the nav_event ctor. You could remove it from there. Yes, this is redundant. Fixed. https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h (right): https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:24: // Manager class for SafeBrowsingNavigationObserver, which in charge of cleaning On 2016/10/14 23:52:49, Nathan Parker wrote: > nit: which is in charge Done. https://codereview.chromium.org/2302913003/diff/400001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:35: struct GurlHash { On 2016/10/14 23:52:49, Nathan Parker wrote: > This could be private Yes. Moved to private.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: android_compile_dbg on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_comp...)
The CQ bit was checked by jialiul@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Thanks for moving forward on this. I'm a bit unclear on whether you're intending to land this before we are able to track the initiating URL, or if we're just making progress on this CL's drafts until then. Either way, we should probably be clearer in the code about which parts we expect to be buggy for the time being. :) https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc (right): https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:28: : source_url(), nit: These empty initializations aren't needed, right? Most code I've seen omits them. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:104: // If we already seen this navigation_handle before, no need to do anything. I'm not sure whether it's a good idea to ignore it. It might be possible to get two DidStarts if navigation does a cross-process transfer, for example, in which case you might want to update the data structure. (Even if none of the current properties can change during a transfer, I'd worry about bugs if we added a property later, like which RFH it's in.) https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:114: navigation_handle->GetFrameTreeNodeId()); This doesn't look safe, at least for general uses of |host| as the RFH being navigated. The navigation could be started from a pending RenderFrameHost, not the current RenderFrameHost for the node. (If you explicitly want the |current_frame_host|, then this would be ok.) https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:120: if (host && host->GetLastCommittedURL().is_valid()) { The last committed URL in the frame is not the source URL for a navigation. Another frame could have targeted this frame after something was already loaded in it. (That's why we want to add an initiating URL/origin to the NavigationHandle, right? If this is just a stopgap until that's added, we should add a big scary TODO that the information in source_url is wrong for the time being.) :) https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:163: return; Maybe add a NOTREACHED to aid with debugging and tests? https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:177: return; NOTREACHED? https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:191: GURL nav_event_key = I'm not sure I understand this complex clause. Would navigation_handle->GetURL() give us the same thing (possibly after passing through ClearEmptyURLRef)? If I'm missing something, then this could use a comment explaining it. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.h (right): https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:20: // Struct to record the details of a navigation event for any frames. nit: s/frames/frame/ https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:30: int source_tab_id; // Which tab the above source_url is in. Tab ID is Assuming that source_url might be a subframe, I think it might help to clarify the comment here to say something like "Which tab contains the frame with source_url." https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:40: GURL main_frame_url; // Main frame url of the source_url. Can be the same This should probably be called source_main_frame_url, since it sounds more like the destination URL. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:44: base::Time timestamp; // Timestamp of last time this object is updated. last_updated, perhaps? https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:49: GURL server_redirect_url; // Last server redirect url in server redirect This sounds a bit awkward to use. In most places (e.g., NavigationEntry's URL), we store the URL that ended up committing after all the redirects, and you can call GetOriginalRequestURL if you care about what it was before redirects. With the current struct, there's no simple way to find what URL committed. You'd have to check server_redirect_url, and if that's empty, fall back to target_url. Would it be simpler to use if we change server_redirect_url to committed_url and have it present even there's no redirects? https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:54: // Structure to keep tracks of resolved IP address of a host. nit: track https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:98: // and added to navigation_map_ at the end of DidFinishNavigation(...). Please clarify where navigation_map_ lives. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:100: scoped_refptr<SafeBrowsingNavigationObserverManager> manager_; nit: I'd find this section more readable with blank lines between members, given the comments. Just my 2 cents, though. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc (right): https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:26: static const double kUserGestureTTLInSecond = 0.5; That seems pretty low to me-- is that a value used elsewhere? (On a machine where things are running slowly, it could easily take more than 0.5 seconds to get from the click to the DidStartNavigation, for example.) https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h (right): https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:39: static GURL ClearEmptyURLRef(const GURL& url); nit: ClearEmptyRef https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:108: // in URLChainEntry in ClientDownlodRequest. nit: ClientDownloadRequest https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:109: HostToIpMap host_to_ip_map_; It seems a bit complex/unfortunate to have to track this given how the IP address can change over time, but I see that you've accounted for it. What's the reason it's necessary? https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:111: content::NotificationRegistrar registrar_; nti: Add blank line after.
Hi creis@, Thanks for your comments! Added TODOs in code to indicate source_url might not be accurate if frames are targeting each other. And yes, I do want to land this CL before the initiating url/origin is resolved (since we don't have an ETA on that). It will also make it easier to land incremental changes afterward. Thanks! https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc (right): https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:28: : source_url(), On 2016/10/17 23:54:03, Charlie Reis wrote: > nit: These empty initializations aren't needed, right? Most code I've seen > omits them. Acknowledged. Compiler complains if I omit this default constructor. It is used by safe_browsing_navigation_observer_manager.cc, and in tests. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:104: // If we already seen this navigation_handle before, no need to do anything. On 2016/10/17 23:54:03, Charlie Reis wrote: > I'm not sure whether it's a good idea to ignore it. It might be possible to get > two DidStarts if navigation does a cross-process transfer, for example, in which > case you might want to update the data structure. (Even if none of the current > properties can change during a transfer, I'd worry about bugs if we added a > property later, like which RFH it's in.) Acknowledged. I added a TODO here, since this 2 DidStarts scenario may also requires changes in the why how I handle user gestures. If navifation does a cross-process transfer, will it generates 2 DidStarts all the time or only occasionally? Since Safe Browsing backend will aggregate these referrer chains, we don't need to get every NavigationEvent correct in all corner cases. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:114: navigation_handle->GetFrameTreeNodeId()); On 2016/10/17 23:54:03, Charlie Reis wrote: > This doesn't look safe, at least for general uses of |host| as the RFH being > navigated. The navigation could be started from a pending RenderFrameHost, not > the current RenderFrameHost for the node. (If you explicitly want the > |current_frame_host|, then this would be ok.) Let's keep it for now. Since |host| here is only used to figure out the source_url. After we can get the true initiating url from NavigationHandle, we don't need this RHF anymore. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:120: if (host && host->GetLastCommittedURL().is_valid()) { On 2016/10/17 23:54:03, Charlie Reis wrote: > The last committed URL in the frame is not the source URL for a navigation. > Another frame could have targeted this frame after something was already loaded > in it. > > (That's why we want to add an initiating URL/origin to the NavigationHandle, > right? If this is just a stopgap until that's added, we should add a big scary > TODO that the information in source_url is wrong for the time being.) :) Oh, yes, forgot to add this TODO. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:163: return; On 2016/10/17 23:54:03, Charlie Reis wrote: > Maybe add a NOTREACHED to aid with debugging and tests? Good idea. Done. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:177: return; On 2016/10/17 23:54:03, Charlie Reis wrote: > NOTREACHED? Done. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:191: GURL nav_event_key = On 2016/10/17 23:54:03, Charlie Reis wrote: > I'm not sure I understand this complex clause. Would > navigation_handle->GetURL() give us the same thing (possibly after passing > through ClearEmptyURLRef)? If I'm missing something, then this could use a > comment explaining it. Ah, don't need this complex thing anymore. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.h (right): https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:20: // Struct to record the details of a navigation event for any frames. On 2016/10/17 23:54:03, Charlie Reis wrote: > nit: s/frames/frame/ Done. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:30: int source_tab_id; // Which tab the above source_url is in. Tab ID is On 2016/10/17 23:54:04, Charlie Reis wrote: > Assuming that source_url might be a subframe, I think it might help to clarify > the comment here to say something like "Which tab contains the frame with > source_url." Ah, that's much clearer. Thanks! https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:40: GURL main_frame_url; // Main frame url of the source_url. Can be the same On 2016/10/17 23:54:04, Charlie Reis wrote: > This should probably be called source_main_frame_url, since it sounds more like > the destination URL. make sense. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:44: base::Time timestamp; // Timestamp of last time this object is updated. On 2016/10/17 23:54:04, Charlie Reis wrote: > last_updated, perhaps? Done. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:49: GURL server_redirect_url; // Last server redirect url in server redirect On 2016/10/17 23:54:04, Charlie Reis wrote: > This sounds a bit awkward to use. In most places (e.g., NavigationEntry's URL), > we store the URL that ended up committing after all the redirects, and you can > call GetOriginalRequestURL if you care about what it was before redirects. > > With the current struct, there's no simple way to find what URL committed. > You'd have to check server_redirect_url, and if that's empty, fall back to > target_url. > > Would it be simpler to use if we change server_redirect_url to committed_url and > have it present even there's no redirects? I agree. though for some cases (e.g. download urls) this url may not be technically committed. how about we rename target_url to original_request_url, and server_redirect_url to actual_target_url? https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:54: // Structure to keep tracks of resolved IP address of a host. On 2016/10/17 23:54:04, Charlie Reis wrote: > nit: track Done. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:98: // and added to navigation_map_ at the end of DidFinishNavigation(...). On 2016/10/17 23:54:04, Charlie Reis wrote: > Please clarify where navigation_map_ lives. Done. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:100: scoped_refptr<SafeBrowsingNavigationObserverManager> manager_; On 2016/10/17 23:54:04, Charlie Reis wrote: > nit: I'd find this section more readable with blank lines between members, given > the comments. Just my 2 cents, though. Blank lines added. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc (right): https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:26: static const double kUserGestureTTLInSecond = 0.5; On 2016/10/17 23:54:04, Charlie Reis wrote: > That seems pretty low to me-- is that a value used elsewhere? (On a machine > where things are running slowly, it could easily take more than 0.5 seconds to > get from the click to the DidStartNavigation, for example. Do you think 1.0 sec makes more sense? Do you know any UMA metrics can provide some hint? https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h (right): https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:39: static GURL ClearEmptyURLRef(const GURL& url); On 2016/10/17 23:54:04, Charlie Reis wrote: > nit: ClearEmptyRef Done. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:108: // in URLChainEntry in ClientDownlodRequest. On 2016/10/17 23:54:04, Charlie Reis wrote: > nit: ClientDownloadRequest Done. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:109: HostToIpMap host_to_ip_map_; On 2016/10/17 23:54:04, Charlie Reis wrote: > It seems a bit complex/unfortunate to have to track this given how the IP > address can change over time, but I see that you've accounted for it. What's > the reason it's necessary? SB backend wants this info to see :(1) if website is cloaking, (2) any malicious/hijacked DNS servers, (3) potentially match them to a known malicious IPs. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:111: content::NotificationRegistrar registrar_; On 2016/10/17 23:54:04, Charlie Reis wrote: > nti: Add blank line after. Done.
Thanks! A few quick comments, since I'm out an event today. On 2016/10/18 19:06:06, Jialiu Lin wrote: > Hi creis@, > Thanks for your comments! Added TODOs in code to indicate source_url might not > be accurate if frames are targeting each other. > And yes, I do want to land this CL before the initiating url/origin is resolved > (since we don't have an ETA on that). > It will also make it easier to land incremental changes afterward. Ok-- let's be sure not to start basing any decisions on these values until the initiating URL issue is fixed, though. I'm ok with landing things to test out the code, though. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc (right): https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:28: : source_url(), On 2016/10/18 19:06:04, Jialiu Lin wrote: > On 2016/10/17 23:54:03, Charlie Reis wrote: > > nit: These empty initializations aren't needed, right? Most code I've seen > > omits them. > > Acknowledged. Compiler complains if I omit this default constructor. It is used > by safe_browsing_navigation_observer_manager.cc, and in tests. Acknowledged. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:104: // If we already seen this navigation_handle before, no need to do anything. On 2016/10/18 19:06:04, Jialiu Lin wrote: > On 2016/10/17 23:54:03, Charlie Reis wrote: > > I'm not sure whether it's a good idea to ignore it. It might be possible to > get > > two DidStarts if navigation does a cross-process transfer, for example, in > which > > case you might want to update the data structure. (Even if none of the > current > > properties can change during a transfer, I'd worry about bugs if we added a > > property later, like which RFH it's in.) > Acknowledged. I added a TODO here, since this 2 DidStarts scenario may also > requires changes in the why how I handle user gestures. > If navifation does a cross-process transfer, will it generates 2 DidStarts all > the time or only occasionally? Since Safe Browsing backend will aggregate these > referrer chains, we don't need to get every NavigationEvent correct in all > corner cases. This happens on every transfer, not just occasionally. You should be able to test it by doing a server redirect to the Chrome Web Store (e.g., fire up testserver.py and go to http://127.0.0.1:8080/server-redirect?https://chrome.google.com/webstore). (I'm away from a debugger right now so I haven't verified it manually, but I'm pretty sure that's how it works.) Because of this and your user gesture concern, we may want to consider fixing it before landing the CL, or at least characterize exactly what might be broken in those cases. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:114: navigation_handle->GetFrameTreeNodeId()); On 2016/10/18 19:06:04, Jialiu Lin wrote: > On 2016/10/17 23:54:03, Charlie Reis wrote: > > This doesn't look safe, at least for general uses of |host| as the RFH being > > navigated. The navigation could be started from a pending RenderFrameHost, > not > > the current RenderFrameHost for the node. (If you explicitly want the > > |current_frame_host|, then this would be ok.) > Let's keep it for now. Since |host| here is only used to figure out the > source_url. After we can get the true initiating url from NavigationHandle, we > don't need this RHF anymore. I see. We should name it current_frame_host to be clear if we keep it. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.h (right): https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:49: GURL server_redirect_url; // Last server redirect url in server redirect On 2016/10/18 19:06:05, Jialiu Lin wrote: > On 2016/10/17 23:54:04, Charlie Reis wrote: > > This sounds a bit awkward to use. In most places (e.g., NavigationEntry's > URL), > > we store the URL that ended up committing after all the redirects, and you can > > call GetOriginalRequestURL if you care about what it was before redirects. > > > > With the current struct, there's no simple way to find what URL committed. > > You'd have to check server_redirect_url, and if that's empty, fall back to > > target_url. > > > > Would it be simpler to use if we change server_redirect_url to committed_url > and > > have it present even there's no redirects? > > I agree. though for some cases (e.g. download urls) this url may not be > technically committed. how about we rename target_url to original_request_url, > and server_redirect_url to actual_target_url? Good point about downloads not committing. Maybe destination_url would be closer to terminology we use elsewhere? https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc (right): https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:26: static const double kUserGestureTTLInSecond = 0.5; On 2016/10/18 19:06:05, Jialiu Lin wrote: > On 2016/10/17 23:54:04, Charlie Reis wrote: > > That seems pretty low to me-- is that a value used elsewhere? (On a machine > > where things are running slowly, it could easily take more than 0.5 seconds to > > get from the click to the DidStartNavigation, for example. > Do you think 1.0 sec makes more sense? Do you know any UMA metrics can provide > some hint? I'd check with csharrison@ about which metrics might shed light on this-- I think he's tracked a lot of navigation timing values that might be helpful here. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h (right): https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:109: HostToIpMap host_to_ip_map_; On 2016/10/18 19:06:05, Jialiu Lin wrote: > On 2016/10/17 23:54:04, Charlie Reis wrote: > > It seems a bit complex/unfortunate to have to track this given how the IP > > address can change over time, but I see that you've accounted for it. What's > > the reason it's necessary? > > SB backend wants this info to see :(1) if website is cloaking, (2) any > malicious/hijacked DNS servers, (3) potentially match them to a known malicious > IPs. Acknowledged. https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc (right): https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:118: // If there was URL previously committed in this render frame host, nit: in the current RenderFrameHost https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:124: // added to NavigationHandle. Thanks. Can you add a reference to https://crbug.com/651895? We should also copy this TODO to the source_url declaration, since it affects anyone who might use this value. https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.h (right): https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:43: base::Time timestamp; // When this NavigationEvent is last updated. I meant the name of the member should be last_updated rather than timestamp. :) ("timestamp" just says what type it is, without indicating if it's the start, commit time, or last updated time.)
From my point of view, LGTM, modulo these comments and whatever Charlie has. https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc (right): https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:165: if (navigation_handle_map_.find(navigation_handle) == I'm not sure if this gets compiled out on release builds. To ensure it does, you could do DCHECK(..find(..) == ..end()). Same below. https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.h (right): https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:43: base::Time timestamp; // When this NavigationEvent is last updated. On 2016/10/19 17:10:46, Charlie Reis (OOO until 10-19) wrote: > I meant the name of the member should be last_updated rather than timestamp. :) > ("timestamp" just says what type it is, without indicating if it's the start, > commit time, or last updated time.) +1 https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc (right): https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:156: void VerifyNavigationEvent(const GURL& expected_source_url, These repeated GURLs and bools are pretty easy to mix up, and are hard to tell what the args are when you call the function -- this makes it hard to verify the calls are correct. Is there another way to do this? A few ideas: 1) Add comments to each arg in each function call 2) Expand all the EXPECT_EQ's in each test. That way the names for each arg are clear 3) Start with some default-value'd Navigation event, set the relevant fields, and then compare the NavEvents with a helper function. https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:177: // Since all testing pages have the same host, there is onlly one entry in nit:only https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:183: actual_host_ip_map->at(embedded_test_server()->base_url().host()) nit: Easier to read w/ a temporary auto& ip_list = actual_host_ip_map->at(embedded_test_server()->base_url().host()); ASSERT_EQ(1, ip_list.size()) ASSERT_EQ(1, ip_list.back().ip) https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:424: EXPECT_FALSE(nav_map->at(blank_url).at(0).source_tab_id == nit: EXPECT_EQ and _NE rather than TRUE and FALSE. https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:562: // If we have already seem an IP associated with a host, update its timestamp. Nit: space between tests s/seem/seen/
The CQ bit was checked by jialiul@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Thank both of you for your patient and thorough review! All your remaining comments are addressed. :-) https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc (right): https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:104: // If we already seen this navigation_handle before, no need to do anything. On 2016/10/19 17:10:45, Charlie Reis (OOO until 10-19) wrote: > On 2016/10/18 19:06:04, Jialiu Lin wrote: > > On 2016/10/17 23:54:03, Charlie Reis wrote: > > > I'm not sure whether it's a good idea to ignore it. It might be possible to > > get > > > two DidStarts if navigation does a cross-process transfer, for example, in > > which > > > case you might want to update the data structure. (Even if none of the > > current > > > properties can change during a transfer, I'd worry about bugs if we added a > > > property later, like which RFH it's in.) > > Acknowledged. I added a TODO here, since this 2 DidStarts scenario may also > > requires changes in the why how I handle user gestures. > > If navifation does a cross-process transfer, will it generates 2 DidStarts all > > the time or only occasionally? Since Safe Browsing backend will aggregate > these > > referrer chains, we don't need to get every NavigationEvent correct in all > > corner cases. > > This happens on every transfer, not just occasionally. You should be able to > test it by doing a server redirect to the Chrome Web Store (e.g., fire up > testserver.py and go to > http://127.0.0.1:8080/server-redirect?https://chrome.google.com/webstore). (I'm > away from a debugger right now so I haven't verified it manually, but I'm pretty > sure that's how it works.) > > Because of this and your user gesture concern, we may want to consider fixing it > before landing the CL, or at least characterize exactly what might be broken in > those cases. Agree. refactored this function to address this concern. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:114: navigation_handle->GetFrameTreeNodeId()); On 2016/10/19 17:10:45, Charlie Reis (OOO until 10-19) wrote: > On 2016/10/18 19:06:04, Jialiu Lin wrote: > > On 2016/10/17 23:54:03, Charlie Reis wrote: > > > This doesn't look safe, at least for general uses of |host| as the RFH being > > > navigated. The navigation could be started from a pending RenderFrameHost, > > not > > > the current RenderFrameHost for the node. (If you explicitly want the > > > |current_frame_host|, then this would be ok.) > > Let's keep it for now. Since |host| here is only used to figure out the > > source_url. After we can get the true initiating url from NavigationHandle, we > > don't need this RHF anymore. > > I see. We should name it current_frame_host to be clear if we keep it. Done. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.h (right): https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:49: GURL server_redirect_url; // Last server redirect url in server redirect On 2016/10/19 17:10:45, Charlie Reis (OOO until 10-19) wrote: > On 2016/10/18 19:06:05, Jialiu Lin wrote: > > On 2016/10/17 23:54:04, Charlie Reis wrote: > > > This sounds a bit awkward to use. In most places (e.g., NavigationEntry's > > URL), > > > we store the URL that ended up committing after all the redirects, and you > can > > > call GetOriginalRequestURL if you care about what it was before redirects. > > > > > > With the current struct, there's no simple way to find what URL committed. > > > You'd have to check server_redirect_url, and if that's empty, fall back to > > > target_url. > > > > > > Would it be simpler to use if we change server_redirect_url to committed_url > > and > > > have it present even there's no redirects? > > > > I agree. though for some cases (e.g. download urls) this url may not be > > technically committed. how about we rename target_url to > original_request_url, > > and server_redirect_url to actual_target_url? > > Good point about downloads not committing. Maybe destination_url would be > closer to terminology we use elsewhere? SG. renamed to destination_url. https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc (right): https://codereview.chromium.org/2302913003/diff/440001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:26: static const double kUserGestureTTLInSecond = 0.5; On 2016/10/19 17:10:45, Charlie Reis (OOO until 10-19) wrote: > On 2016/10/18 19:06:05, Jialiu Lin wrote: > > On 2016/10/17 23:54:04, Charlie Reis wrote: > > > That seems pretty low to me-- is that a value used elsewhere? (On a machine > > > where things are running slowly, it could easily take more than 0.5 seconds > to > > > get from the click to the DidStartNavigation, for example. > > Do you think 1.0 sec makes more sense? Do you know any UMA metrics can > provide > > some hint? > > I'd check with csharrison@ about which metrics might shed light on this-- I > think he's tracked a lot of navigation timing values that might be helpful here. Thanks! I'll also do more research on this. Will update this value shortly. https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc (right): https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:118: // If there was URL previously committed in this render frame host, On 2016/10/19 17:10:46, Charlie Reis (OOO until 10-19) wrote: > nit: in the current RenderFrameHost Done. https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:124: // added to NavigationHandle. On 2016/10/19 17:10:45, Charlie Reis (OOO until 10-19) wrote: > Thanks. Can you add a reference to https://crbug.com/651895 > > We should also copy this TODO to the source_url declaration, since it affects > anyone who might use this value. Sure. Done. https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:165: if (navigation_handle_map_.find(navigation_handle) == On 2016/10/19 22:56:14, Nathan Parker wrote: > I'm not sure if this gets compiled out on release builds. To ensure it does, > you could do > DCHECK(..find(..) == ..end()). Same below. Done. https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.h (right): https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:43: base::Time timestamp; // When this NavigationEvent is last updated. On 2016/10/19 22:56:14, Nathan Parker wrote: > On 2016/10/19 17:10:46, Charlie Reis (OOO until 10-19) wrote: > > I meant the name of the member should be last_updated rather than timestamp. > :) > > ("timestamp" just says what type it is, without indicating if it's the start, > > commit time, or last updated time.) > > +1 Done. https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:43: base::Time timestamp; // When this NavigationEvent is last updated. On 2016/10/19 22:56:14, Nathan Parker wrote: > On 2016/10/19 17:10:46, Charlie Reis (OOO until 10-19) wrote: > > I meant the name of the member should be last_updated rather than timestamp. > :) > > ("timestamp" just says what type it is, without indicating if it's the start, > > commit time, or last updated time.) > > +1 Done. https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc (right): https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:156: void VerifyNavigationEvent(const GURL& expected_source_url, On 2016/10/19 22:56:14, Nathan Parker wrote: > These repeated GURLs and bools are pretty easy to mix up, and are hard to tell > what the args are when you call the function -- this makes it hard to verify the > calls are correct. Is there another way to do this? A few ideas: > > 1) Add comments to each arg in each function call > 2) Expand all the EXPECT_EQ's in each test. That way the names for each arg are > clear > 3) Start with some default-value'd Navigation event, set the relevant fields, > and then compare the NavEvents with a helper function. Adding comments to the first call of this function in each test. https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:177: // Since all testing pages have the same host, there is onlly one entry in On 2016/10/19 22:56:14, Nathan Parker wrote: > nit:only Done. https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:183: actual_host_ip_map->at(embedded_test_server()->base_url().host()) On 2016/10/19 22:56:14, Nathan Parker wrote: > nit: Easier to read w/ a temporary > auto& ip_list = > actual_host_ip_map->at(embedded_test_server()->base_url().host()); > ASSERT_EQ(1, ip_list.size()) > ASSERT_EQ(1, ip_list.back().ip) Done. https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:424: EXPECT_FALSE(nav_map->at(blank_url).at(0).source_tab_id == On 2016/10/19 22:56:14, Nathan Parker wrote: > nit: EXPECT_EQ and _NE rather than TRUE and FALSE. Done. https://codereview.chromium.org/2302913003/diff/460001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:562: // If we have already seem an IP associated with a host, update its timestamp. On 2016/10/19 22:56:14, Nathan Parker wrote: > Nit: space between tests > s/seem/seen/ Done.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_asan_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
The CQ bit was checked by jialiul@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_asan_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
Patchset #20 (id:500001) has been deleted
The CQ bit was checked by jialiul@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was checked by jialiul@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: android_clang_dbg_recipe on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_clan...)
Patchset #20 (id:520001) has been deleted
Finally, all bots are happy now. Let me know if you have further comments. Thanks!
Thanks! I think this is almost ready to go. The comments below are mostly nits to make this easier to follow, since I think it will be important for others to understand how the tracking works. I also looked through the tests, which I didn't have a chance to do before. It's a great set of example cases this code has to deal with. One question in there about how we'll tie together the NavigationEvents to link the initial URL to the eventual download (which I'm sure I understood at one point). :) https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc (right): https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:78: // TODO(jialiul): This method will be called by TabHelpers::AttachTabHelpers. nit: TODO above NOTIMPLEMENTED https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:99: // Called when a navigation started in the WebContents. |navigation_handle| in nit: s/started/starts/ nit: s/in parameter/parameter/ (there's no ambiguity, since there aren't any out parameters) https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:106: // It is possible to see multiple DidStartNaivgation(..) with the same nit: multiple DidStartNavigation calls (Note the "Naiv" typo) https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:128: content::RenderFrameHost* current_frame_host = nit: Move this declaration under the comment below, since it should go away when we resolve the TODO. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:132: // If there was URL previously committed in the current RenderFrameHost, nit: a URL https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:136: // TODO(jialiul): source_url will be inccorect when another frame is targeting nit: incorrect Maybe also mention source_tab_id and source_main_frame_url in the TODO? I think those have the same limitation, right? https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:145: navigation_handle->GetURL()); I'm curious, is there anything that depends on the source URL being non-empty? It seems like a browser-initiated navigation might be best treated like it has no source URL, since it may be unrelated to what the user is currently looking at. (Similarly, until the initiator URL plumbing is added, it's a bit unusual to say that a URL is its own src URL for the first navigation in a frame.) The only reason I mention is that it might be more difficult to analyze the data if you have to always check whether the source_url is different from the original_request_url before deciding whether some other page was responsible for the navigation. If you leave it empty in the cases that no other page was responsible, it might be easier to analyze the data. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:169: navigation_handle_map_.end()); You might want to rephrase this to a if (...) { NOTREACHED; return; } That way we don't crash on line 172 if the handle isn't found in a release build. Same in DidFinishNavigation. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.h (right): https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:31: // navigation involves frames target each other. nit targeting https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:46: base::Time last_updated; // When this NavigationEvent is last updated. nit: s/is/was/ https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:100: // corresponding entry in this map will be removed from navigation_handle_map_ nit: entries https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc (right): https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:129: // all the navigation event we need. nit: events https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:147: // Wait until all the navigation completes. nit: navigations complete. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:214: // Since this test uses java script to mimic clicking on a link, there is no nit: Javascript Still, this sounds incorrect to me. ClickTestLink uses ExecuteScript, which boils down to ExecuteJavaScriptWithUserGestureForTests. I would expect it to have a user gesture, so I'm puzzled why it doesn't. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:218: initial_url, // source_main_frame_url Seems like this would be easier to read if the source_main_frame_url came before original_request_url (similar to the declaration in NavigationEvent). Similarly, it's probably useful for destination_url to come right after original_request_url. (Same in the unittest file.) https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:268: VerifyNavigationEvent(download_url, download_url, download_url, false, false, I think it would be useful to format this like the previous call, annotating what each argument is. (Same in the tests below.) Can you also include a comment for why there are two items in the vector? I'm not sure what this second one is, and it seems to have lost the attribution to initial_url. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:273: // Click on a link which navigates to a page then redirect to download using nit: redirects to a download https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:292: redirect_url, // destination_url This is a bit surprising when first reading the test, since you'd expect this to be download_url at first glance. It's probably worth mentioning in a comment that client redirects commit and then generate a second NavigationEvent, unlike server redirects. (Are there any tests that show the server redirect behavior? I don't see any. You can trigger one with "/server-redirect?dest_url" and the embedded test server will make it work.) https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:299: // Click on a link which navigates to a page then redirect to download using nit: redirects to a download https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:321: false, redirect_url, nav_map->at(redirect_url).at(1)); What's going on in this one? I'm not sure why we would be going from redirect_url to redirect_url. (Or is this because we set the source_url to the destination_url when there is no source_url? Maybe this would be clearer if the source_url were empty in cases like this.) https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:386: // Click on a link which redirect twice until reaches download using a mixture nit: redirects twice until it https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:420: GURL blank_url = GURL("about:blank"); nit: url::kAboutBlankURL https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:445: download_url, nav_map->at(download_url).at(0)); Can you remind me how we're going to link this NavigationEvent to the one with initial_url? I forget what ties them together. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:490: // flacky on other platforms. nit: flaky https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc (right): https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:21: namespace safe_browsing { nit: Add blank line before. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:26: static const double kUserGestureTTLInSecond = 1.0; Is this based on new data, or should we include a TODO to set this to something based on UMA metrics? https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:34: ? now - timestamp_in_double > kUserGestureTTLInSecond nit: Can you add some parentheses on this line and the one before to make it easier to read? Or even better, rephrase to: if (timestamp_in_double <= now) return true; return (now - timestamp_in_double) > kUserGestureTTLInSecond; https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h (right): https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:36: // kUserGestureTTLInSecond; nit: Period, not semicolon. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:38: // Helper function to strip empty ref fragment from a URL. Let's add a justification for why this is useful. (You mentioned earlier that many pages end up with a "#" in the URL due to Javascript, and we don't want that to have a separate entry in the maps.) https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:44: // pointed by |nav_event| will be no longer accessible after this function. nit: pointed to by https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:105: // sub frame URLs' hosts. Since it is possible for a host to resolve to more nit: subframe https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_unittest.cc (right): https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_unittest.cc:68: // Navigation in current tab nit: End comment with period. https://codereview.chromium.org/2302913003/diff/540001/chrome/test/data/safe_... File chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html (right): https://codereview.chromium.org/2302913003/diff/540001/chrome/test/data/safe_... chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html:16: return true; Do these need to return true? https://codereview.chromium.org/2302913003/diff/540001/chrome/test/data/safe_... chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html:24: // Trigger a download in new tab. Let's elaborate a bit, since it's more complex than that-- this triggers a redirect to a download in a new tab with no opener. https://codereview.chromium.org/2302913003/diff/540001/chrome/test/data/safe_... chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html:92: </a><br> nit: Maybe put a blank line after each <br>, as in the cases above? I think that makes it easier to read. https://codereview.chromium.org/2302913003/diff/540001/chrome/test/data/safe_... chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html:105: <a id="new_tab_download" href="#" onclick="downloadInNewTab(false)"> nit: downloadInNewTab doesn't take a parameter anymore.
Thanks, creis@! These comments are super helpful! I'm currently working on a regression and will come back to this CL tomorrow or Wednesday. I will ping you when I resolve all the issues. Thanks,
Hi creis@, Thanks for your thorough review! I have addressed all your comments! Since the NavigationEvents we keep track of have both source url (source tab is), destination url (target tab id) and the is_user_initiated boolean, and they are keyed by the destination url in NavgationMap, we can easily do backward search given a download url (and its tab id). We'll continue searching for source url until one of the following condition meets: (1) we reach a NavigationEvent that initiated by user ( Of cause we need to search twice since we want to keep track of two user gestures if possible) (2) no more match in navigation_map_ Some corner cases: (1) If source url is empty (e.g. new tab download), we continue backward search using the destination url. (2) if there are multiple matching NavigationEvents, we pick the one with the most recent timestamp. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc (right): https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:78: // TODO(jialiul): This method will be called by TabHelpers::AttachTabHelpers. On 2016/10/25 04:52:26, Charlie Reis wrote: > nit: TODO above NOTIMPLEMENTED Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:99: // Called when a navigation started in the WebContents. |navigation_handle| in On 2016/10/25 04:52:26, Charlie Reis wrote: > nit: s/started/starts/ > nit: s/in parameter/parameter/ > (there's no ambiguity, since there aren't any out parameters) Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:106: // It is possible to see multiple DidStartNaivgation(..) with the same On 2016/10/25 04:52:26, Charlie Reis wrote: > nit: multiple DidStartNavigation calls > (Note the "Naiv" typo) Good eye! https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:128: content::RenderFrameHost* current_frame_host = On 2016/10/25 04:52:26, Charlie Reis wrote: > nit: Move this declaration under the comment below, since it should go away when > we resolve the TODO. Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:132: // If there was URL previously committed in the current RenderFrameHost, On 2016/10/25 04:52:26, Charlie Reis wrote: > nit: a URL Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:136: // TODO(jialiul): source_url will be inccorect when another frame is targeting On 2016/10/25 04:52:26, Charlie Reis wrote: > nit: incorrect > > Maybe also mention source_tab_id and source_main_frame_url in the TODO? I think > those have the same limitation, right? Yes, indeed. Comment refined. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:145: navigation_handle->GetURL()); On 2016/10/25 04:52:26, Charlie Reis wrote: > I'm curious, is there anything that depends on the source URL being non-empty? > It seems like a browser-initiated navigation might be best treated like it has > no source URL, since it may be unrelated to what the user is currently looking > at. (Similarly, until the initiator URL plumbing is added, it's a bit unusual > to say that a URL is its own src URL for the first navigation in a frame.) > > The only reason I mention is that it might be more difficult to analyze the data > if you have to always check whether the source_url is different from the > original_request_url before deciding whether some other page was responsible for > the navigation. If you leave it empty in the cases that no other page was > responsible, it might be easier to analyze the data. Make sense. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.cc:169: navigation_handle_map_.end()); On 2016/10/25 04:52:26, Charlie Reis wrote: > You might want to rephrase this to a if (...) { NOTREACHED; return; } > > That way we don't crash on line 172 if the handle isn't found in a release > build. Same in DidFinishNavigation. Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.h (right): https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:31: // navigation involves frames target each other. On 2016/10/25 04:52:26, Charlie Reis wrote: > nit targeting Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:46: base::Time last_updated; // When this NavigationEvent is last updated. On 2016/10/25 04:52:26, Charlie Reis wrote: > nit: s/is/was/ Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:100: // corresponding entry in this map will be removed from navigation_handle_map_ On 2016/10/25 04:52:26, Charlie Reis wrote: > nit: entries Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc (right): https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:129: // all the navigation event we need. On 2016/10/25 04:52:27, Charlie Reis wrote: > nit: events Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:147: // Wait until all the navigation completes. On 2016/10/25 04:52:26, Charlie Reis wrote: > nit: navigations complete. Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:214: // Since this test uses java script to mimic clicking on a link, there is no On 2016/10/25 04:52:27, Charlie Reis wrote: > nit: Javascript > > Still, this sounds incorrect to me. ClickTestLink uses ExecuteScript, which > boils down to ExecuteJavaScriptWithUserGestureForTests. I would expect it to > have a user gesture, so I'm puzzled why it doesn't. Not sure why, but when I tested manually, WebContentsObserver::DidGetUserInteraction() kicks in, but when used ExecuteScript(), DidGetUserInteraction() never get triggered. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:218: initial_url, // source_main_frame_url On 2016/10/25 04:52:26, Charlie Reis wrote: > Seems like this would be easier to read if the source_main_frame_url came before > original_request_url (similar to the declaration in NavigationEvent). > > Similarly, it's probably useful for destination_url to come right after > original_request_url. > > (Same in the unittest file.) Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:268: VerifyNavigationEvent(download_url, download_url, download_url, false, false, On 2016/10/25 04:52:27, Charlie Reis wrote: > I think it would be useful to format this like the previous call, annotating > what each argument is. (Same in the tests below.) > > Can you also include a comment for why there are two items in the vector? I'm > not sure what this second one is, and it seems to have lost the attribution to > initial_url. The first one comes from NOTIFICATION_RETARGETING, which indicates the actual initiator. The second one is the true navigation event. So after we can obtain the true initiator info from NavigationHandle, the first one will be gone. All the other retargeting related test cases (new tab download, target blank, etc) have an extra NavigationEvent too due to the same reason. I added a TODO to serve as a reminder. These tests need to be updated after crbug.com/651895 is fixed. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:273: // Click on a link which navigates to a page then redirect to download using On 2016/10/25 04:52:26, Charlie Reis wrote: > nit: redirects to a download Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:292: redirect_url, // destination_url On 2016/10/25 04:52:27, Charlie Reis wrote: > This is a bit surprising when first reading the test, since you'd expect this to > be download_url at first glance. It's probably worth mentioning in a comment > that client redirects commit and then generate a second NavigationEvent, unlike > server redirects. > > (Are there any tests that show the server redirect behavior? I don't see any. > You can trigger one with "/server-redirect?dest_url" and the embedded test > server will make it work.) Sure. comment added. Server side redirection is tested in unittests by using SimulateRedirect() https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:299: // Click on a link which navigates to a page then redirect to download using On 2016/10/25 04:52:27, Charlie Reis wrote: > nit: redirects to a download Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:321: false, redirect_url, nav_map->at(redirect_url).at(1)); On 2016/10/25 04:52:27, Charlie Reis wrote: > What's going on in this one? I'm not sure why we would be going from > redirect_url to redirect_url. (Or is this because we set the source_url to the > destination_url when there is no source_url? Maybe this would be clearer if the > source_url were empty in cases like this.) Yes, source should be empty after changes in observer.cc. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:386: // Click on a link which redirect twice until reaches download using a mixture On 2016/10/25 04:52:26, Charlie Reis wrote: > nit: redirects twice until it Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:420: GURL blank_url = GURL("about:blank"); On 2016/10/25 04:52:27, Charlie Reis wrote: > nit: url::kAboutBlankURL Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:445: download_url, nav_map->at(download_url).at(0)); On 2016/10/25 04:52:27, Charlie Reis wrote: > Can you remind me how we're going to link this NavigationEvent to the one with > initial_url? I forget what ties them together. Since all the NavigationEvents are keyed on their destination url, given a download url (and its tab id), look up NavigationMap to find corresponding NavigationEvent, then search for the source_url and source tab id as the new destination url. Continue doing this until we reach a NavigationEvent initiated by user or there is no more match in the map. In case NavigationEvent has empty source_url and does not initiated by user (this rules out browser initiated navigation in a new tab), we then continue the backward search using destination_url. If two NavigationEvents have the same destination url (and tab id), then we pick the most recent one based on their timestamps. So in this specific case, download_url -> blank_url->empty url (blank_url)->initial_url https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:490: // flacky on other platforms. On 2016/10/25 04:52:27, Charlie Reis wrote: > nit: flaky Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc (right): https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:21: namespace safe_browsing { On 2016/10/25 04:52:27, Charlie Reis wrote: > nit: Add blank line before. Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:26: static const double kUserGestureTTLInSecond = 1.0; On 2016/10/25 04:52:27, Charlie Reis wrote: > Is this based on new data, or should we include a TODO to set this to something > based on UMA metrics? dominickn@ (who had some experiences with user gesture) told me that "Blink has a 1 second window after a user gesture where every thing is considered to have a gesture". So I use the same period here. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:34: ? now - timestamp_in_double > kUserGestureTTLInSecond On 2016/10/25 04:52:27, Charlie Reis wrote: > nit: Can you add some parentheses on this line and the one before to make it > easier to read? Or even better, rephrase to: > > if (timestamp_in_double <= now) > return true; > > return (now - timestamp_in_double) > kUserGestureTTLInSecond; Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h (right): https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:36: // kUserGestureTTLInSecond; On 2016/10/25 04:52:27, Charlie Reis wrote: > nit: Period, not semicolon. Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:38: // Helper function to strip empty ref fragment from a URL. On 2016/10/25 04:52:27, Charlie Reis wrote: > Let's add a justification for why this is useful. (You mentioned earlier that > many pages end up with a "#" in the URL due to Javascript, and we don't want > that to have a separate entry in the maps.) Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:44: // pointed by |nav_event| will be no longer accessible after this function. On 2016/10/25 04:52:27, Charlie Reis wrote: > nit: pointed to by Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.h:105: // sub frame URLs' hosts. Since it is possible for a host to resolve to more On 2016/10/25 04:52:27, Charlie Reis wrote: > nit: subframe Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_unittest.cc (right): https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_unittest.cc:68: // Navigation in current tab On 2016/10/25 04:52:27, Charlie Reis wrote: > nit: End comment with period. Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/test/data/safe_... File chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html (right): https://codereview.chromium.org/2302913003/diff/540001/chrome/test/data/safe_... chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html:16: return true; On 2016/10/25 04:52:27, Charlie Reis wrote: > Do these need to return true? Actually no. Not sure why I did that... Removed all the return statements. https://codereview.chromium.org/2302913003/diff/540001/chrome/test/data/safe_... chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html:24: // Trigger a download in new tab. On 2016/10/25 04:52:27, Charlie Reis wrote: > Let's elaborate a bit, since it's more complex than that-- this triggers a > redirect to a download in a new tab with no opener. Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/test/data/safe_... chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html:92: </a><br> On 2016/10/25 04:52:27, Charlie Reis wrote: > nit: Maybe put a blank line after each <br>, as in the cases above? I think > that makes it easier to read. Done. https://codereview.chromium.org/2302913003/diff/540001/chrome/test/data/safe_... chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html:105: <a id="new_tab_download" href="#" onclick="downloadInNewTab(false)"> On 2016/10/25 04:52:27, Charlie Reis wrote: > nit: downloadInNewTab doesn't take a parameter anymore. Ah, thanks for catching this.
Thanks! I didn't get a chance to look at it today (stability sheriff and other craziness), but I'll try to go through it tomorrow.
Thanks! This seems good, with the known caveats we've discussed. LGTM with nits (and one request for a browser test for server redirects). Thanks for your patience on the long review-- big CLs can take a while, and this just happens to touch a lot of the corner cases of navigation. :) https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc (right): https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:214: // Since this test uses java script to mimic clicking on a link, there is no On 2016/10/27 02:15:20, Jialiu Lin wrote: > On 2016/10/25 04:52:27, Charlie Reis wrote: > > nit: Javascript > > > > Still, this sounds incorrect to me. ClickTestLink uses ExecuteScript, which > > boils down to ExecuteJavaScriptWithUserGestureForTests. I would expect it to > > have a user gesture, so I'm puzzled why it doesn't. > Not sure why, but when I tested manually, > WebContentsObserver::DidGetUserInteraction() kicks in, but when used > ExecuteScript(), DidGetUserInteraction() never get triggered. Hmm, I wonder if we have two different ways of tracking user gestures. We should mention this in the comment, since it's pretty surprising to me. (I suppose we could check with dominickn@, but it's nothing to delay this CL over.) https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:218: initial_url, // source_main_frame_url On 2016/10/27 02:15:20, Jialiu Lin wrote: > On 2016/10/25 04:52:26, Charlie Reis wrote: > > Seems like this would be easier to read if the source_main_frame_url came > before > > original_request_url (similar to the declaration in NavigationEvent). > > > > Similarly, it's probably useful for destination_url to come right after > > original_request_url. > > > > (Same in the unittest file.) > > Done. Thanks! I'm finding the tests much easier to read now. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:268: VerifyNavigationEvent(download_url, download_url, download_url, false, false, On 2016/10/27 02:15:20, Jialiu Lin wrote: > On 2016/10/25 04:52:27, Charlie Reis wrote: > > I think it would be useful to format this like the previous call, annotating > > what each argument is. (Same in the tests below.) > > > > Can you also include a comment for why there are two items in the vector? I'm > > not sure what this second one is, and it seems to have lost the attribution to > > initial_url. > The first one comes from NOTIFICATION_RETARGETING, which indicates the actual > initiator. The second one is the true navigation event. > So after we can obtain the true initiator info from NavigationHandle, the first > one will be gone. > All the other retargeting related test cases (new tab download, target blank, > etc) have an extra NavigationEvent too due to the same reason. > > I added a TODO to serve as a reminder. These tests need to be updated after > crbug.com/651895 is fixed. Ah, that explains it. Thanks! https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:292: redirect_url, // destination_url On 2016/10/27 02:15:20, Jialiu Lin wrote: > On 2016/10/25 04:52:27, Charlie Reis wrote: > > This is a bit surprising when first reading the test, since you'd expect this > to > > be download_url at first glance. It's probably worth mentioning in a comment > > that client redirects commit and then generate a second NavigationEvent, > unlike > > server redirects. > > > > (Are there any tests that show the server redirect behavior? I don't see any. > > > You can trigger one with "/server-redirect?dest_url" and the embedded test > > server will make it work.) > Sure. comment added. > Server side redirection is tested in unittests by using SimulateRedirect() We've found that unit tests (while they're useful for some things and certainly fast) tend to not provide good coverage against regressions in practice. I'd suggest also including a browser test here for server redirects if possible, since it shouldn't be too hard with the "/server-redirect?" URL syntax. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:445: download_url, nav_map->at(download_url).at(0)); On 2016/10/27 02:15:20, Jialiu Lin wrote: > On 2016/10/25 04:52:27, Charlie Reis wrote: > > Can you remind me how we're going to link this NavigationEvent to the one with > > initial_url? I forget what ties them together. > > Since all the NavigationEvents are keyed on their destination url, given a > download url (and its tab id), look up NavigationMap to find corresponding > NavigationEvent, then search for the source_url and source tab id as the new > destination url. Continue doing this until we reach a NavigationEvent initiated > by user or there is no more match in the map. > In case NavigationEvent has empty source_url and does not initiated by user > (this rules out browser initiated navigation in a new tab), we then continue the > backward search using destination_url. If two NavigationEvents have the same > destination url (and tab id), then we pick the most recent one based on their > timestamps. > > So in this specific case, download_url -> blank_url->empty url > (blank_url)->initial_url Thanks. As discussed, the "most recent" approach has some downsides (e.g., it might allow an attacker to partly misdirect blame to another URL he/she controls), but I don't think it should be a problem if we just care about finding URLs the attacker controls. I'm curious-- that code (to generate the chain of events) will eventually live in Chrome, right? When we get around to adding it, we should make it the last step in each of these tests, ensuring that it always gives us back the initial_url. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc (right): https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_manager.cc:26: static const double kUserGestureTTLInSecond = 1.0; On 2016/10/27 02:15:20, Jialiu Lin wrote: > On 2016/10/25 04:52:27, Charlie Reis wrote: > > Is this based on new data, or should we include a TODO to set this to > something > > based on UMA metrics? > dominickn@ (who had some experiences with user gesture) told me that "Blink has > a 1 second window after a user gesture where every thing is considered to have a > gesture". So I use the same period here. Acknowledged. https://codereview.chromium.org/2302913003/diff/580001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.h (right): https://codereview.chromium.org/2302913003/diff/580001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:40: // from initial_request_url. nit: Fix indent. https://codereview.chromium.org/2302913003/diff/580001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:45: nit: Not sure why some of these have blank lines between and others don't. Are they broken into meaningful blocks, or should we just take out all the blank lines? https://codereview.chromium.org/2302913003/diff/580001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:68: // and sub-frame navigations) nit: End with period. https://codereview.chromium.org/2302913003/diff/580001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc (right): https://codereview.chromium.org/2302913003/diff/580001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:332: VerifyNavigationEvent(initial_url, // source_url Same TODO about Retargeting here? https://codereview.chromium.org/2302913003/diff/580001/chrome/test/data/safe_... File chrome/test/data/safe_browsing/download_protection/navigation_observer/iframe_retargeting.html (right): https://codereview.chromium.org/2302913003/diff/580001/chrome/test/data/safe_... chrome/test/data/safe_browsing/download_protection/navigation_observer/iframe_retargeting.html:4: function downloadInNewTab() { nit: Fix indent.
Patchset #23 (id:600001) has been deleted
Patchset #24 (id:640001) has been deleted
Patchset #23 (id:620001) has been deleted
Server side redirect browser test added. Thank you so much creis@! I learnt a lot from your thorough reviews. Thanks! :-) https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc (right): https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:214: // Since this test uses java script to mimic clicking on a link, there is no On 2016/10/28 23:36:08, Charlie Reis wrote: > On 2016/10/27 02:15:20, Jialiu Lin wrote: > > On 2016/10/25 04:52:27, Charlie Reis wrote: > > > nit: Javascript > > > > > > Still, this sounds incorrect to me. ClickTestLink uses ExecuteScript, which > > > boils down to ExecuteJavaScriptWithUserGestureForTests. I would expect it > to > > > have a user gesture, so I'm puzzled why it doesn't. > > Not sure why, but when I tested manually, > > WebContentsObserver::DidGetUserInteraction() kicks in, but when used > > ExecuteScript(), DidGetUserInteraction() never get triggered. > > Hmm, I wonder if we have two different ways of tracking user gestures. We > should mention this in the comment, since it's pretty surprising to me. (I > suppose we could check with dominickn@, but it's nothing to delay this CL over.) It is probably because DidGetUserInteraction() only responses to input events (a.k.a MouseDown, GestureScrollBegin,TouchStart, RawKeyDown), whereas ExecuteJavaScriptWithUserGestureForTests() directly sets the has_user_gesture bit in NavigationHandle. I would say DidGetUserInteraction() is more reliable and it prevents bad guys to use javascript to "click" around to evade. https://cs.chromium.org/chromium/src/content/browser/renderer_host/render_wid... I'll confirm with dominickn@ later. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:292: redirect_url, // destination_url On 2016/10/28 23:36:08, Charlie Reis wrote: > On 2016/10/27 02:15:20, Jialiu Lin wrote: > > On 2016/10/25 04:52:27, Charlie Reis wrote: > > > This is a bit surprising when first reading the test, since you'd expect > this > > to > > > be download_url at first glance. It's probably worth mentioning in a > comment > > > that client redirects commit and then generate a second NavigationEvent, > > unlike > > > server redirects. > > > > > > (Are there any tests that show the server redirect behavior? I don't see > any. > > > > > You can trigger one with "/server-redirect?dest_url" and the embedded test > > > server will make it work.) > > Sure. comment added. > > Server side redirection is tested in unittests by using SimulateRedirect() > > We've found that unit tests (while they're useful for some things and certainly > fast) tend to not provide good coverage against regressions in practice. I'd > suggest also including a browser test here for server redirects if possible, > since it shouldn't be too hard with the "/server-redirect?" URL syntax. Sure. redirect browser test added. https://codereview.chromium.org/2302913003/diff/540001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:445: download_url, nav_map->at(download_url).at(0)); On 2016/10/28 23:36:08, Charlie Reis wrote: > On 2016/10/27 02:15:20, Jialiu Lin wrote: > > On 2016/10/25 04:52:27, Charlie Reis wrote: > > > Can you remind me how we're going to link this NavigationEvent to the one > with > > > initial_url? I forget what ties them together. > > > > Since all the NavigationEvents are keyed on their destination url, given a > > download url (and its tab id), look up NavigationMap to find corresponding > > NavigationEvent, then search for the source_url and source tab id as the new > > destination url. Continue doing this until we reach a NavigationEvent > initiated > > by user or there is no more match in the map. > > In case NavigationEvent has empty source_url and does not initiated by user > > (this rules out browser initiated navigation in a new tab), we then continue > the > > backward search using destination_url. If two NavigationEvents have the same > > destination url (and tab id), then we pick the most recent one based on their > > timestamps. > > > > So in this specific case, download_url -> blank_url->empty url > > (blank_url)->initial_url > > Thanks. As discussed, the "most recent" approach has some downsides (e.g., it > might allow an attacker to partly misdirect blame to another URL he/she > controls), but I don't think it should be a problem if we just care about > finding URLs the attacker controls. > > I'm curious-- that code (to generate the chain of events) will eventually live > in Chrome, right? When we get around to adding it, we should make it the last > step in each of these tests, ensuring that it always gives us back the > initial_url. Yes, sure. That's my plan too. https://codereview.chromium.org/2302913003/diff/580001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.h (right): https://codereview.chromium.org/2302913003/diff/580001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:40: // from initial_request_url. On 2016/10/28 23:36:09, Charlie Reis wrote: > nit: Fix indent. Done. https://codereview.chromium.org/2302913003/diff/580001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:45: On 2016/10/28 23:36:09, Charlie Reis wrote: > nit: Not sure why some of these have blank lines between and others don't. Are > they broken into meaningful blocks, or should we just take out all the blank > lines? Done. https://codereview.chromium.org/2302913003/diff/580001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:68: // and sub-frame navigations) On 2016/10/28 23:36:09, Charlie Reis wrote: > nit: End with period. Done. https://codereview.chromium.org/2302913003/diff/580001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc (right): https://codereview.chromium.org/2302913003/diff/580001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:332: VerifyNavigationEvent(initial_url, // source_url On 2016/10/28 23:36:09, Charlie Reis wrote: > Same TODO about Retargeting here? Done. https://codereview.chromium.org/2302913003/diff/580001/chrome/test/data/safe_... File chrome/test/data/safe_browsing/download_protection/navigation_observer/iframe_retargeting.html (right): https://codereview.chromium.org/2302913003/diff/580001/chrome/test/data/safe_... chrome/test/data/safe_browsing/download_protection/navigation_observer/iframe_retargeting.html:4: function downloadInNewTab() { On 2016/10/28 23:36:09, Charlie Reis wrote: > nit: Fix indent. Done.
Great! Still LGTM, though it looks like a few of your changes didn't make it into the latest patch set. https://codereview.chromium.org/2302913003/diff/660001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.h (right): https://codereview.chromium.org/2302913003/diff/660001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:45: nit: Looks like the blank line comment wasn't addressed. :) (That, or I don't see what the groupings are here.) https://codereview.chromium.org/2302913003/diff/660001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:68: // and sub-frame navigations) nit: Still needs a period. :) https://codereview.chromium.org/2302913003/diff/660001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc (right): https://codereview.chromium.org/2302913003/diff/660001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:215: // gesture), and DidGetUserInteraction() does not response to ExecuteScript(), nit: s/response/respond/ https://codereview.chromium.org/2302913003/diff/660001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:770: IN_PROC_BROWSER_TEST_F(SBNavigationObserverBrowserTest, ServerRedirect) { Thanks! I'm glad we have browser test coverage for this.
Oops, you're right. There were changes not uploaded in the last patch. Fixed. Thanks! https://codereview.chromium.org/2302913003/diff/660001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer.h (right): https://codereview.chromium.org/2302913003/diff/660001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:45: On 2016/10/31 16:56:45, Charlie Reis wrote: > nit: Looks like the blank line comment wasn't addressed. :) (That, or I don't > see what the groupings are here.) Done. https://codereview.chromium.org/2302913003/diff/660001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer.h:68: // and sub-frame navigations) On 2016/10/31 16:56:46, Charlie Reis wrote: > nit: Still needs a period. :) Done. https://codereview.chromium.org/2302913003/diff/660001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc (right): https://codereview.chromium.org/2302913003/diff/660001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:215: // gesture), and DidGetUserInteraction() does not response to ExecuteScript(), On 2016/10/31 16:56:46, Charlie Reis wrote: > nit: s/response/respond/ Done. https://codereview.chromium.org/2302913003/diff/660001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:770: IN_PROC_BROWSER_TEST_F(SBNavigationObserverBrowserTest, ServerRedirect) { On 2016/10/31 16:56:46, Charlie Reis wrote: > Thanks! I'm glad we have browser test coverage for this. :-)
The CQ bit was checked by jialiul@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from nparker@chromium.org, creis@chromium.org Link to the patchset: https://codereview.chromium.org/2302913003/#ps680001 (title: "nit")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: ios-simulator on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-simulator/bui...)
The CQ bit was checked by jialiul@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: ios-simulator on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-simulator/bui...)
The CQ bit was checked by jialiul@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: ios-simulator on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-simulator/bui...)
The CQ bit was checked by jialiul@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Committed patchset #24 (id:680001)
Message was sent while issue was closed.
Description was changed from ========== Add SafeBrowsingNavigationObserver to listen to navigation events happen on all frames. These navigation events can help safe browsing service identify suspicious downloads that intentionally hide their referrers and/or landing pages, and report these pieces of info to safe browsing backend. This CL only covers the observing part. More management code (e.g. periodically cleanup, actual reporting, etc) will come up shortly. BUG=639467 ========== to ========== Add SafeBrowsingNavigationObserver to listen to navigation events happen on all frames. These navigation events can help safe browsing service identify suspicious downloads that intentionally hide their referrers and/or landing pages, and report these pieces of info to safe browsing backend. This CL only covers the observing part. More management code (e.g. periodically cleanup, actual reporting, etc) will come up shortly. BUG=639467 Committed: https://crrev.com/602009dfabda593406bfd7f5171e8c4183335bf5 Cr-Commit-Position: refs/heads/master@{#428865} ==========
Message was sent while issue was closed.
Patchset 24 (id:??) landed as https://crrev.com/602009dfabda593406bfd7f5171e8c4183335bf5 Cr-Commit-Position: refs/heads/master@{#428865}
Message was sent while issue was closed.
lgtm Congrats, and thanks for being patient on the other dependencies. This is a thorny part of the code!
Message was sent while issue was closed.
A revert of this CL (patchset #24 id:680001) has been created in https://codereview.chromium.org/2470743002/ by kjellander@chromium.org. The reason for reverting is: This is causing flakiness failures in the newly added SBNavigationObserverBrowserTest.SubFrameNewTabDownload test on Linux ChromiumOS Tests (1): see http://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=... It starts with the next build after the one where the CL was submitted: https://build.chromium.org/p/chromium.chromiumos/builders/Linux%20ChromiumOS%....
Message was sent while issue was closed.
Description was changed from ========== Add SafeBrowsingNavigationObserver to listen to navigation events happen on all frames. These navigation events can help safe browsing service identify suspicious downloads that intentionally hide their referrers and/or landing pages, and report these pieces of info to safe browsing backend. This CL only covers the observing part. More management code (e.g. periodically cleanup, actual reporting, etc) will come up shortly. BUG=639467 Committed: https://crrev.com/602009dfabda593406bfd7f5171e8c4183335bf5 Cr-Commit-Position: refs/heads/master@{#428865} ========== to ========== Add SafeBrowsingNavigationObserver to listen to navigation events happen on all frames. These navigation events can help safe browsing service identify suspicious downloads that intentionally hide their referrers and/or landing pages, and report these pieces of info to safe browsing backend. This CL only covers the observing part. More management code (e.g. periodically cleanup, actual reporting, etc) will come up shortly. BUG=639467 Committed: https://crrev.com/602009dfabda593406bfd7f5171e8c4183335bf5 Cr-Commit-Position: refs/heads/master@{#428865} ==========
The CQ bit was checked by jialiul@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: ios-device on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-device/builds...)
Patchset #25 (id:700001) has been deleted
The CQ bit was checked by jialiul@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by jialiul@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from nparker@chromium.org, creis@chromium.org Link to the patchset: https://codereview.chromium.org/2302913003/#ps720001 (title: "rebase, and fix flaky browser tests")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Description was changed from ========== Add SafeBrowsingNavigationObserver to listen to navigation events happen on all frames. These navigation events can help safe browsing service identify suspicious downloads that intentionally hide their referrers and/or landing pages, and report these pieces of info to safe browsing backend. This CL only covers the observing part. More management code (e.g. periodically cleanup, actual reporting, etc) will come up shortly. BUG=639467 Committed: https://crrev.com/602009dfabda593406bfd7f5171e8c4183335bf5 Cr-Commit-Position: refs/heads/master@{#428865} ========== to ========== Add SafeBrowsingNavigationObserver to listen to navigation events happen on all frames. These navigation events can help safe browsing service identify suspicious downloads that intentionally hide their referrers and/or landing pages, and report these pieces of info to safe browsing backend. This CL only covers the observing part. More management code (e.g. periodically cleanup, actual reporting, etc) will come up shortly. BUG=639467 Committed: https://crrev.com/602009dfabda593406bfd7f5171e8c4183335bf5 Cr-Commit-Position: refs/heads/master@{#428865} ==========
Message was sent while issue was closed.
Committed patchset #25 (id:720001)
Message was sent while issue was closed.
Description was changed from ========== Add SafeBrowsingNavigationObserver to listen to navigation events happen on all frames. These navigation events can help safe browsing service identify suspicious downloads that intentionally hide their referrers and/or landing pages, and report these pieces of info to safe browsing backend. This CL only covers the observing part. More management code (e.g. periodically cleanup, actual reporting, etc) will come up shortly. BUG=639467 Committed: https://crrev.com/602009dfabda593406bfd7f5171e8c4183335bf5 Cr-Commit-Position: refs/heads/master@{#428865} ========== to ========== Add SafeBrowsingNavigationObserver to listen to navigation events happen on all frames. These navigation events can help safe browsing service identify suspicious downloads that intentionally hide their referrers and/or landing pages, and report these pieces of info to safe browsing backend. This CL only covers the observing part. More management code (e.g. periodically cleanup, actual reporting, etc) will come up shortly. BUG=639467 Committed: https://crrev.com/602009dfabda593406bfd7f5171e8c4183335bf5 Committed: https://crrev.com/1782326f8a2c81a58136a0475d83f6c1c11db38d Cr-Original-Commit-Position: refs/heads/master@{#428865} Cr-Commit-Position: refs/heads/master@{#429071} ==========
Message was sent while issue was closed.
Patchset 25 (id:??) landed as https://crrev.com/1782326f8a2c81a58136a0475d83f6c1c11db38d Cr-Commit-Position: refs/heads/master@{#429071}
Message was sent while issue was closed.
In the future, please have the diff reviewed before re-landing after a revert, or mention that you're TBR'ing the diff so that reviewers know to take a look. PS 25 LGTM with one concern we can address in a followup CL (assuming it sticks). https://codereview.chromium.org/2302913003/diff/720001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc (right): https://codereview.chromium.org/2302913003/diff/720001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:149: for (int i = 0; i < navigation_count-1; i++) { Why is this navigation_count-1? Maybe we should document what |navigation_count| is supposed to be, since this seems unexpected.
Message was sent while issue was closed.
Sorry, I forgot to TBR. I'm actually not sure if this one will stick. I wasn't able to reproduce the test failure on my local builds. it seems all flaky tests involves two tabs. My educated guess is that evaluation kicks in before one of these tab finishes all navigations. |navigation_count| refers to the number of navigations expected to occur after test starts. That's the number of times WaitForLoadStopWithoutSuccessCheck(..) need to be called. The navigation_count -1 is due to the fact that one WaitForLoadStopWithoutSuccessCheck(..) is already called before the for loop. I'll add more comment in a followup CL or in my next re-land. Thanks! On Tue, Nov 1, 2016 at 1:01 PM <creis@chromium.org> wrote: > In the future, please have the diff reviewed before re-landing after a > revert, > or mention that you're TBR'ing the diff so that reviewers know to take a > look. > > PS 25 LGTM with one concern we can address in a followup CL (assuming it > sticks). > > > > https://codereview.chromium.org/2302913003/diff/720001/chrome/browser/safe_br... > File > > chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc > (right): > > > https://codereview.chromium.org/2302913003/diff/720001/chrome/browser/safe_br... > > chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:149: > for (int i = 0; i < navigation_count-1; i++) { > Why is this navigation_count-1? Maybe we should document what > |navigation_count| is supposed to be, since this seems unexpected. > > https://codereview.chromium.org/2302913003/ > -- You received this message because you are subscribed to the Google Groups "Chromium-reviews" group. To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
Message was sent while issue was closed.
A revert of this CL (patchset #25 id:720001) has been created in https://codereview.chromium.org/2471623002/ by avi@chromium.org. The reason for reverting is: SPP browsertests are failing. https://build.chromium.org/p/chromium.linux/builders/Linux%20Tests/builds/48395 ../../chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:488: Failure Value of: nav_map->size() Actual: 2 Expected: std::size_t(3) Which is: 3 .
Message was sent while issue was closed.
Description was changed from ========== Add SafeBrowsingNavigationObserver to listen to navigation events happen on all frames. These navigation events can help safe browsing service identify suspicious downloads that intentionally hide their referrers and/or landing pages, and report these pieces of info to safe browsing backend. This CL only covers the observing part. More management code (e.g. periodically cleanup, actual reporting, etc) will come up shortly. BUG=639467 Committed: https://crrev.com/602009dfabda593406bfd7f5171e8c4183335bf5 Committed: https://crrev.com/1782326f8a2c81a58136a0475d83f6c1c11db38d Cr-Original-Commit-Position: refs/heads/master@{#428865} Cr-Commit-Position: refs/heads/master@{#429071} ========== to ========== Add SafeBrowsingNavigationObserver to listen to navigation events happen on all frames. These navigation events can help safe browsing service identify suspicious downloads that intentionally hide their referrers and/or landing pages, and report these pieces of info to safe browsing backend. This CL only covers the observing part. More management code (e.g. periodically cleanup, actual reporting, etc) will come up shortly. BUG=639467 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_site_isolation Committed: https://crrev.com/602009dfabda593406bfd7f5171e8c4183335bf5 Committed: https://crrev.com/1782326f8a2c81a58136a0475d83f6c1c11db38d Cr-Original-Commit-Position: refs/heads/master@{#428865} Cr-Commit-Position: refs/heads/master@{#429071} ==========
Message was sent while issue was closed.
On 2016/11/01 20:44:16, Avi wrote: > A revert of this CL (patchset #25 id:720001) has been created in > https://codereview.chromium.org/2471623002/ by mailto:avi@chromium.org. > > The reason for reverting is: SPP browsertests are failing. > > https://build.chromium.org/p/chromium.linux/builders/Linux%20Tests/builds/48395 > > ../../chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:488: > Failure > Value of: nav_map->size() > Actual: 2 > Expected: std::size_t(3) > Which is: 3 > . Thanks Avi. Jialiu, is this more of the same flakiness? I think we run these tests in --site-per-process mode on the CQ now, so I'm guessing they passed there before landing and failed after. Just to be safe, though, I added "CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_site_isolation" to the CL description to make sure that we include linux_site_isolation try bot coverage. Can you see if they pass or fail when you run them with --site-per-process locally? Also, I'd recommend running the tests locally with --gtest_repeat=N (e.g., 10) to try to catch some of the flakiness.
The CQ bit was checked by jialiul@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Patchset #26 (id:740001) has been deleted
The CQ bit was checked by jialiul@chromium.org to run a CQ dry run
The CQ bit was unchecked by jialiul@chromium.org
The CQ bit was checked by jialiul@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Thanks for being patient with me. Please take a look at patch 26. I did 20 fold test on with and without --site_per_process flag. It seems flakiness has nothing to do with this flag. The flakiness is caused by evaluation kicking in before all the navigations finish, thus yields unexpected results. And the flakiness is not within safe_browsing_navigation_observer or safe_browsing_navigation_manager code, purely in browser test code. I did a couple of things to mitigate: (1) refined ClickTestLink() function, make it more specific about how many navigations on either current tab or new tab should be waited before evaluation can start (2) for all the tests involve new tab, stop the current tab from refreshing. (3) other small tweaks to slightly simplify test cases without losing the core part we want to test. I did another 10 fold local tests after these changes, seems fine so far.
Thanks for tracking down the flakiness. A few thoughts and suggestions below before we reland. https://codereview.chromium.org/2302913003/diff/720001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc (right): https://codereview.chromium.org/2302913003/diff/720001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:148: content::WaitForLoadStopWithoutSuccessCheck(main_web_contents); Yes, this was likely broken for links that opened in new tabs. Your tab_added_observer should hopefully resolve it. https://codereview.chromium.org/2302913003/diff/780001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc (right): https://codereview.chromium.org/2302913003/diff/780001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:93: false); Did this contribute to the flakiness? https://codereview.chromium.org/2302913003/diff/780001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:152: content::WebContents* current_tab_web_contents = nit: current_web_contents https://codereview.chromium.org/2302913003/diff/780001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:165: content::WaitForLoadStopWithoutSuccessCheck(current_tab_web_contents); This looks problematic. Can you use a TestNavigationObserver instead and specify the number of navigations in the constructor? https://codereview.chromium.org/2302913003/diff/780001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:174: content::WaitForLoadStopWithoutSuccessCheck(new_tab_web_contents); Hmm. I'm worried about this as well, and it's not clear whether we'll be adding the TestNavigationObserver early enough on the new tab for it to work. Worth a try, though. https://codereview.chromium.org/2302913003/diff/780001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:443: // Click on a link which redirects to another page using window.location. Why isn't this a download anymore? Is there a way to switch it back with the TestNavigationObserver change? (If not, it's not a big deal, but it concerns me if there are cases that we're unable to test.) https://codereview.chromium.org/2302913003/diff/780001/chrome/test/data/safe_... File chrome/test/data/safe_browsing/download_protection/navigation_observer/iframe_retargeting.html (right): https://codereview.chromium.org/2302913003/diff/780001/chrome/test/data/safe_... chrome/test/data/safe_browsing/download_protection/navigation_observer/iframe_retargeting.html:5: var tab = window.open('navigation_observer_tests.html'); What was the reason for changing this test? https://codereview.chromium.org/2302913003/diff/780001/chrome/test/data/safe_... File chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html (right): https://codereview.chromium.org/2302913003/diff/780001/chrome/test/data/safe_... chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html:15: window.location.assign("about:blank"); Why is this changed? I would expect assign() and href assignment to be equivalent, and at the moment the comment is incorrect. https://codereview.chromium.org/2302913003/diff/780001/chrome/test/data/safe_... chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html:102: <a id="window_location_redirection" href="#" onclick="windowLocationRedirection(); return false;"> Just curious-- what's the reason return false is needed?
https://codereview.chromium.org/2302913003/diff/780001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc (right): https://codereview.chromium.org/2302913003/diff/780001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:174: content::WaitForLoadStopWithoutSuccessCheck(new_tab_web_contents); On 2016/11/03 22:58:45, Charlie Reis wrote: > Hmm. I'm worried about this as well, and it's not clear whether we'll be adding > the TestNavigationObserver early enough on the new tab for it to work. Worth a > try, though. Maybe we can use a single TestNavigationObserver and use a combined count of navigations, if we use StartWatchingNewWebContents()?
The CQ bit was checked by jialiul@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was checked by jialiul@chromium.org to run a CQ dry run
Patchset #28 (id:800001) has been deleted
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Thanks creis@. StartWatchingNewWebContents(..) is indeed the magical function that resolves all the problems. https://codereview.chromium.org/2302913003/diff/780001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc (right): https://codereview.chromium.org/2302913003/diff/780001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:93: false); On 2016/11/03 22:58:45, Charlie Reis wrote: > Did this contribute to the flakiness? Somewhat, disabling safe browsing will prevent download protection and incident reporting service to intervene downloads. These service might impact downloads and stall test in some corner cases. Since safe browsing is not relevant here , it is safer to disable it. https://codereview.chromium.org/2302913003/diff/780001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:152: content::WebContents* current_tab_web_contents = On 2016/11/03 22:58:45, Charlie Reis wrote: > nit: current_web_contents Done. https://codereview.chromium.org/2302913003/diff/780001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:165: content::WaitForLoadStopWithoutSuccessCheck(current_tab_web_contents); On 2016/11/03 22:58:45, Charlie Reis wrote: > This looks problematic. Can you use a TestNavigationObserver instead and > specify the number of navigations in the constructor? Sure. done. https://codereview.chromium.org/2302913003/diff/780001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:174: content::WaitForLoadStopWithoutSuccessCheck(new_tab_web_contents); On 2016/11/03 at 23:43:40, Charlie Reis wrote: > On 2016/11/03 22:58:45, Charlie Reis wrote: > > Hmm. I'm worried about this as well, and it's not clear whether we'll be adding > > the TestNavigationObserver early enough on the new tab for it to work. Worth a > > try, though. > > Maybe we can use a single TestNavigationObserver and use a combined count of navigations, if we use StartWatchingNewWebContents()? This is a magical function! It resolves all the flakiness. :-) https://codereview.chromium.org/2302913003/diff/780001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:443: // Click on a link which redirects to another page using window.location. On 2016/11/03 at 22:58:45, Charlie Reis wrote: > Why isn't this a download anymore? Is there a way to switch it back with the TestNavigationObserver change? (If not, it's not a big deal, but it concerns me if there are cases that we're unable to test.) Change back to download after the StartWatchingNewWebContents() change. https://codereview.chromium.org/2302913003/diff/780001/chrome/test/data/safe_... File chrome/test/data/safe_browsing/download_protection/navigation_observer/iframe_retargeting.html (right): https://codereview.chromium.org/2302913003/diff/780001/chrome/test/data/safe_... chrome/test/data/safe_browsing/download_protection/navigation_observer/iframe_retargeting.html:5: var tab = window.open('navigation_observer_tests.html'); On 2016/11/03 22:58:45, Charlie Reis wrote: > What was the reason for changing this test? Changed it back after using StartWatchingNewWebContents(). https://codereview.chromium.org/2302913003/diff/780001/chrome/test/data/safe_... File chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html (right): https://codereview.chromium.org/2302913003/diff/780001/chrome/test/data/safe_... chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html:15: window.location.assign("about:blank"); On 2016/11/03 22:58:45, Charlie Reis wrote: > Why is this changed? I would expect assign() and href assignment to be > equivalent, and at the moment the comment is incorrect. yes, window.location.assign and window.location.href is mostly equivalent. And according to W3Schools, assign() seems more reliable. And change it back to redirect to download since the flakiness is fixed. https://codereview.chromium.org/2302913003/diff/780001/chrome/test/data/safe_... chrome/test/data/safe_browsing/download_protection/navigation_observer/navigation_observer_tests.html:102: <a id="window_location_redirection" href="#" onclick="windowLocationRedirection(); return false;"> On 2016/11/03 22:58:45, Charlie Reis wrote: > Just curious-- what's the reason return false is needed? Returning false prevents the current page from refreshing (which is not necessary for our tests).
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Thanks-- I'm not sure all of the remaining changes are necessary, but I think they're fine to include. Diff from PS25 to PS28 LGTM. https://codereview.chromium.org/2302913003/diff/780001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc (right): https://codereview.chromium.org/2302913003/diff/780001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:93: false); On 2016/11/04 05:38:00, Jialiu Lin (OOO Nov7-11) wrote: > On 2016/11/03 22:58:45, Charlie Reis wrote: > > Did this contribute to the flakiness? > > Somewhat, disabling safe browsing will prevent download protection and incident > reporting service to intervene downloads. These service might impact downloads > and stall test in some corner cases. Since safe browsing is not relevant here , > it is safer to disable it. Acknowledged. https://codereview.chromium.org/2302913003/diff/820001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc (right): https://codereview.chromium.org/2302913003/diff/820001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:167: nit: Remove extra blank line.
Thanks! https://codereview.chromium.org/2302913003/diff/820001/chrome/browser/safe_br... File chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc (right): https://codereview.chromium.org/2302913003/diff/820001/chrome/browser/safe_br... chrome/browser/safe_browsing/safe_browsing_navigation_observer_browsertest.cc:167: On 2016/11/04 16:40:57, Charlie Reis wrote: > nit: Remove extra blank line. Done.
The CQ bit was checked by jialiul@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from nparker@chromium.org, creis@chromium.org Link to the patchset: https://codereview.chromium.org/2302913003/#ps840001 (title: "resolve final nit")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Description was changed from ========== Add SafeBrowsingNavigationObserver to listen to navigation events happen on all frames. These navigation events can help safe browsing service identify suspicious downloads that intentionally hide their referrers and/or landing pages, and report these pieces of info to safe browsing backend. This CL only covers the observing part. More management code (e.g. periodically cleanup, actual reporting, etc) will come up shortly. BUG=639467 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_site_isolation Committed: https://crrev.com/602009dfabda593406bfd7f5171e8c4183335bf5 Committed: https://crrev.com/1782326f8a2c81a58136a0475d83f6c1c11db38d Cr-Original-Commit-Position: refs/heads/master@{#428865} Cr-Commit-Position: refs/heads/master@{#429071} ========== to ========== Add SafeBrowsingNavigationObserver to listen to navigation events happen on all frames. These navigation events can help safe browsing service identify suspicious downloads that intentionally hide their referrers and/or landing pages, and report these pieces of info to safe browsing backend. This CL only covers the observing part. More management code (e.g. periodically cleanup, actual reporting, etc) will come up shortly. BUG=639467 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_site_isolation Committed: https://crrev.com/602009dfabda593406bfd7f5171e8c4183335bf5 Committed: https://crrev.com/1782326f8a2c81a58136a0475d83f6c1c11db38d Cr-Original-Commit-Position: refs/heads/master@{#428865} Cr-Commit-Position: refs/heads/master@{#429071} ==========
Message was sent while issue was closed.
Committed patchset #29 (id:840001)
Message was sent while issue was closed.
Description was changed from ========== Add SafeBrowsingNavigationObserver to listen to navigation events happen on all frames. These navigation events can help safe browsing service identify suspicious downloads that intentionally hide their referrers and/or landing pages, and report these pieces of info to safe browsing backend. This CL only covers the observing part. More management code (e.g. periodically cleanup, actual reporting, etc) will come up shortly. BUG=639467 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_site_isolation Committed: https://crrev.com/602009dfabda593406bfd7f5171e8c4183335bf5 Committed: https://crrev.com/1782326f8a2c81a58136a0475d83f6c1c11db38d Cr-Original-Commit-Position: refs/heads/master@{#428865} Cr-Commit-Position: refs/heads/master@{#429071} ========== to ========== Add SafeBrowsingNavigationObserver to listen to navigation events happen on all frames. These navigation events can help safe browsing service identify suspicious downloads that intentionally hide their referrers and/or landing pages, and report these pieces of info to safe browsing backend. This CL only covers the observing part. More management code (e.g. periodically cleanup, actual reporting, etc) will come up shortly. BUG=639467 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_site_isolation Committed: https://crrev.com/602009dfabda593406bfd7f5171e8c4183335bf5 Committed: https://crrev.com/1782326f8a2c81a58136a0475d83f6c1c11db38d Committed: https://crrev.com/53acce88363fb6a9ed54e991d17b203cfdf6326d Cr-Original-Original-Commit-Position: refs/heads/master@{#428865} Cr-Original-Commit-Position: refs/heads/master@{#429071} Cr-Commit-Position: refs/heads/master@{#429976} ==========
Message was sent while issue was closed.
Patchset 29 (id:??) landed as https://crrev.com/53acce88363fb6a9ed54e991d17b203cfdf6326d Cr-Commit-Position: refs/heads/master@{#429976} |
