|
|
Created:
9 years, 11 months ago by jianli Modified:
9 years, 6 months ago CC:
chromium-reviews, Paweł Hajdan Jr., darin-cc_chromium.org, rdsmith+dwatch_chromium.org, brettw-cc_chromium.org, adeboor Base URL:
svn://chrome-svn/chrome/trunk/src/ Visibility:
Public. |
DescriptionFix bug 69468: Drag download does not work if there is URL redirection involved.
The fix is to keep track of the original URL.
BUG=69468
TEST=Manual test to verify.
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=71494
Patch Set 1 #
Total comments: 4
Messages
Total messages: 13 (0 generated)
On 2011/01/12 23:54:21, jianli wrote: Jian: I'm reluctant, as I think that both download_item.h and download_create_info.h have way too many random member variables already, and I'd rather not add to them. But there may not be a better way to solve this problem. Pawel, do you have any ideas on this one? The basic problem being addressed (as I understand it; Jian, please correct me if I get this wrong) is associating a drag operation with a download operation. When a drag-and-drop occurs, a DragDownloadFile is created and it executes the download on behalf of the drag. It initiates a download by spawning a download_util::DownloadUrl on the IO thread, but that invocation doesn't return any information. So when it's later trying to figure out what DownloadItem corresponds to the download it initiated, it tries to do a URL match. If the target URL does a redirect, that doesn't work. This CL adds an "original_url_" field to DownloadCreateInfo and DownloadItem, and does the match on that. What I'd ideally like to see instead is a callback task passed down through the stack (DragDownloadFile::InitiateDownload->DownloadUrlToFile->download_util::DownloadUrl). DownloadResourceHandler would allocate it's download id in the constructor (currently it's allocated in DownloadResourceHandler::OnResponseStarted(), presumably to conserve ids, but I don't think there's any real need for that), and the ResourceDispatcherHost could get that id and return it to the DragDownloadFile through the callback. Then we don't have to do this after the fact association based on URL, which is a little ugly anyway--in theory, you might have multiple downloads from the same URL, and you'd like to get the one that's really associated with the download you initiated--that *probably* doesn't matter, but in an active server context, it might. But that's a fair amount of work in somewhat tricky code, so I'm hoping Pawel has a better idea.
On Wed, Jan 12, 2011 at 5:47 PM, <rdsmith@chromium.org> wrote: > On 2011/01/12 23:54:21, jianli wrote: > > Jian: I'm reluctant, as I think that both download_item.h and > download_create_info.h have way too many random member variables already, > and > I'd rather not add to them. But there may not be a better way to solve > this > problem. > > Pawel, do you have any ideas on this one? The basic problem being > addressed (as > I understand it; Jian, please correct me if I get this wrong) is > associating a > drag operation with a download operation. When a drag-and-drop occurs, a > DragDownloadFile is created and it executes the download on behalf of the > drag. > It initiates a download by spawning a download_util::DownloadUrl on the IO > thread, but that invocation doesn't return any information. So when it's > later > trying to figure out what DownloadItem corresponds to the download it > initiated, > it tries to do a URL match. If the target URL does a redirect, that > doesn't > work. This CL adds an "original_url_" field to DownloadCreateInfo and > DownloadItem, and does the match on that. > Yes, you understand the whole problem and this fix correctly. > > What I'd ideally like to see instead is a callback task passed down through > the > stack > > (DragDownloadFile::InitiateDownload->DownloadUrlToFile->download_util::DownloadUrl). > DownloadResourceHandler would allocate it's download id in the constructor > (currently it's allocated in DownloadResourceHandler::OnResponseStarted(), > presumably to conserve ids, but I don't think there's any real need for > that), > and the ResourceDispatcherHost could get that id and return it to the > DragDownloadFile through the callback. Then we don't have to do this after > the > fact association based on URL, which is a little ugly anyway--in theory, > you > might have multiple downloads from the same URL, and you'd like to get the > one > that's really associated with the download you initiated--that *probably* > doesn't matter, but in an active server context, it might. But that's a > fair > amount of work in somewhat tricky code, so I'm hoping Pawel has a better > idea. The current download observation model seems to be too complicated. I agree we could simply pass down DownloadItem::Observer when calling DownloadManager::DownloadUrlToFile. With this, we do not need to hook up with DownloadManager observer. > > > > http://codereview.chromium.org/6131009/ >
On 2011/01/13 02:21:05, jianli wrote: > The current download observation model seems to be too complicated. I > agree we could simply pass down DownloadItem::Observer when > calling DownloadManager::DownloadUrlToFile. With this, we do not need to > hook up with DownloadManager observer. I liked this when I first thought about it, but on thinking about it more I'm not sure it's a good idea. The Notify/Observe model, as I understand it, implies a contract where the observer agrees to stop observation before destruction (so that the notifier doesn't call methods on a destroyed object). But that requires the observer to always have a pointer/reference to the notifier, which DragDownloadFile won't have if it passes itself down as a notifier. So I think we need a different answer. There is a lot of complexity in the download system, some of which is necessary (downloads by their nature will have feet in IO, File, and UI threads) and much of which isn't. That's a large part of why I'd like to avoid dumping extra members into DownloadItem/DownloadCreateInfo :-J.
On Wed, Jan 12, 2011 at 6:55 PM, <rdsmith@chromium.org> wrote: > On 2011/01/13 02:21:05, jianli wrote: > >> The current download observation model seems to be too complicated. I >> agree we could simply pass down DownloadItem::Observer when >> calling DownloadManager::DownloadUrlToFile. With this, we do not need to >> hook up with DownloadManager observer. >> > > I liked this when I first thought about it, but on thinking about it more > I'm > not sure it's a good idea. The Notify/Observe model, as I understand it, > implies a contract where the observer agrees to stop observation before > destruction (so that the notifier doesn't call methods on a destroyed > object). > But that requires the observer to always have a pointer/reference to the > notifier, which DragDownloadFile won't have if it passes itself down as a > notifier. So I think we need a different answer. > What do you suggest? It seems that we need to dump an extra member to DownloadItem/DownloadCreateInfo since we do not have any existing member for this "association" purpose. Adding original_url is the simplest one since it involves the least code changes, but it is for fixing this specific bug only. We can add other member, like callback, or some kind of ID or tag that can be used to build up the association between the initiator and the downloader. > > There is a lot of complexity in the download system, some of which is > necessary > (downloads by their nature will have feet in IO, File, and UI threads) and > much > of which isn't. That's a large part of why I'd like to avoid dumping extra > members into DownloadItem/DownloadCreateInfo :-J. > > > > http://codereview.chromium.org/6131009/ >
Ideally we should postpone making such changes to the download code until it's properly cleaned up, or make the change itself super-clean (that may mean refactoring some of the current code). http://codereview.chromium.org/6131009/diff/1/chrome/browser/download/downloa... File chrome/browser/download/download_item.h (right): http://codereview.chromium.org/6131009/diff/1/chrome/browser/download/downloa... chrome/browser/download/download_item.h:191: GURL original_url() const { return original_url_; } We try to avoid adding more accessors like this one. http://codereview.chromium.org/6131009/diff/1/chrome/browser/download/downloa... chrome/browser/download/download_item.h:263: // The original URL before any redirection. Now we have three URLs, and it's super-easy to confuse them all. The difference should be much better documented. Do we actually need all three?
On 2011/01/12 23:54:21, jianli wrote: Jian: How important is this change to you? Is this something you ran into and said "I know how to fix that", or is it something that's blocking some other high priority thing you want to do? If the former, let's hold off on doing it until we're further along in the refactor. There are a couple of things that Andy Hendrickson and I have on our plates that would require solving this problem (resuming failed downloads, unifying SavePackage (which currently uses url matching and hence is presumably vulnerable to the redirect problem you're trying to solve as well) and DownloadManager systems). OTOH if you wait for us, my guess is that it'll be about six months until we get to it (yes, there's a lot to do). The way I think we should solve this is to get an ID of some sort for the dispatched download when it's dispatched. I see how to get the IO on the IO thread, but what I don't see is how to return it back to the DragDownloadFile in a way that's resilient to objects going away (i.e. if the DragDownloadFile dispatches a request to the IO thread to start a download, and is then deleted, a return callback might do things with a deleted object.) The way the rest of the download system solves this problem is by having long-lived objects (DownloadManager, DownloadFileManager) that receive such messages and dispatch to the appropriate objects if they exist, so that might be the right way to go for the the DragDownloadFile. But it's not clear how to do it in a way that integrates cleanly with the rest of the system.
Hi folks, For context, we had to turn off the attachment drag-out feature in gmail because of an xss bug. While it was off, we changed mail attachment downloading to come from a different domain, to avoid browser misbehavior (content sniffing) and potential security hole exploits even in the face of correct behavior. This necessitated a redirect. While the feature has been off, we've been receiving numerous support requests, as people really like the feature. So we would like to turn it on as soon as possible. Adam On Jan 13, 2011 7:00 AM, <rdsmith@chromium.org> wrote: > On 2011/01/12 23:54:21, jianli wrote: > > Jian: How important is this change to you? Is this something you ran into > and > said "I know how to fix that", or is it something that's blocking some other > high priority thing you want to do? If the former, let's hold off on doing > it > until we're further along in the refactor. There are a couple of things > that > Andy Hendrickson and I have on our plates that would require solving this > problem (resuming failed downloads, unifying SavePackage (which currently > uses > url matching and hence is presumably vulnerable to the redirect problem > you're > trying to solve as well) and DownloadManager systems). OTOH if you wait > for us, > my guess is that it'll be about six months until we get to it (yes, there's > a > lot to do). > > The way I think we should solve this is to get an ID of some sort for the > dispatched download when it's dispatched. I see how to get the IO on the IO > thread, but what I don't see is how to return it back to the > DragDownloadFile in > a way that's resilient to objects going away (i.e. if the DragDownloadFile > dispatches a request to the IO thread to start a download, and is then > deleted, > a return callback might do things with a deleted object.) The way the rest > of > the download system solves this problem is by having long-lived objects > (DownloadManager, DownloadFileManager) that receive such messages and > dispatch > to the appropriate objects if they exist, so that might be the right way to > go > for the the DragDownloadFile. But it's not clear how to do it in a way that > integrates cleanly with the rest of the system. > > > http://codereview.chromium.org/6131009/
Pawel: Given this context, I'm inclined to let this change go through. Sync'ing based on url is how several current pieces of code solve this problem, and if you're syncing based on url, saving the original url is a reasonable way to do it. Extra info in the DownloadCreateInfo and DownloadItem is repugnant, but doesn't mess up the code too much more. And I don't see a cleaner way to do this without a fair amount of refactoring (and some magic I haven't figured out yet about object lifetimes, though I will eventually do so :-}). Are you ok with that? I'll do a second review of the code if/when we have basic agreement on the high level approach. On 2011/01/13 17:45:48, adeboor_google.com wrote: > Hi folks, > > For context, we had to turn off the attachment drag-out feature in gmail > because of an xss bug. While it was off, we changed mail attachment > downloading to come from a different domain, to avoid browser misbehavior > (content sniffing) and potential security hole exploits even in the face of > correct behavior. This necessitated a redirect. While the feature has been > off, we've been receiving numerous support requests, as people really like > the feature. So we would like to turn it on as soon as possible. > > Adam > On Jan 13, 2011 7:00 AM, <mailto:rdsmith@chromium.org> wrote: > > On 2011/01/12 23:54:21, jianli wrote: > > > > Jian: How important is this change to you? Is this something you ran into > > and > > said "I know how to fix that", or is it something that's blocking some > other > > high priority thing you want to do? If the former, let's hold off on doing > > > it > > until we're further along in the refactor. There are a couple of things > > that > > Andy Hendrickson and I have on our plates that would require solving this > > problem (resuming failed downloads, unifying SavePackage (which currently > > uses > > url matching and hence is presumably vulnerable to the redirect problem > > you're > > trying to solve as well) and DownloadManager systems). OTOH if you wait > > for us, > > my guess is that it'll be about six months until we get to it (yes, > there's > > a > > lot to do). > > > > The way I think we should solve this is to get an ID of some sort for the > > dispatched download when it's dispatched. I see how to get the IO on the > IO > > thread, but what I don't see is how to return it back to the > > DragDownloadFile in > > a way that's resilient to objects going away (i.e. if the DragDownloadFile > > dispatches a request to the IO thread to start a download, and is then > > deleted, > > a return callback might do things with a deleted object.) The way the rest > > > of > > the download system solves this problem is by having long-lived objects > > (DownloadManager, DownloadFileManager) that receive such messages and > > dispatch > > to the appropriate objects if they exist, so that might be the right way > to > > go > > for the the DragDownloadFile. But it's not clear how to do it in a way > that > > integrates cleanly with the rest of the system. > > > > > > http://codereview.chromium.org/6131009/
Randy, you are actively working on this code so I'm fine with this CL. However, I think the difference between now three URLs must be documented. We can land this CL quickly, and then document in a follow-up. Ideally the follow-up should also reduce the number of URLs we store, but that's probably too complicated for now.
Thank you for your flexibility, folks. Adam On Jan 13, 2011 11:07 PM, <phajdan.jr@chromium.org> wrote:
LGTM, though I'd like a bit more documentation on the new member variables as noted below. http://codereview.chromium.org/6131009/diff/1/chrome/browser/download/downloa... File chrome/browser/download/download_item.h (right): http://codereview.chromium.org/6131009/diff/1/chrome/browser/download/downloa... chrome/browser/download/download_item.h:191: GURL original_url() const { return original_url_; } On 2011/01/13 08:47:59, Paweł Hajdan Jr. wrote: > We try to avoid adding more accessors like this one. It's sorta like the difference between structures and classes; structures just carry data around, but classes hide that data and present an abstracted view of it. There's no abstraction on top of any of the URLs, so I think direct set/get accessors are appropriate. This might suggest a useful refactoring, where the pure, carried around data associated with the DownloadItem is exposed through a separate interface (maybe returns a const struct blah&) or maybe even is somehow a separate object, and DownloadItem is a more standard class, with abstractions and lack of access to internal data members. But that's not this CL :-}. http://codereview.chromium.org/6131009/diff/1/chrome/browser/download/downloa... chrome/browser/download/download_item.h:263: // The original URL before any redirection. On 2011/01/13 08:47:59, Paweł Hajdan Jr. wrote: > Now we have three URLs, and it's super-easy to confuse them all. The difference > should be much better documented. Do we actually need all three? I agree with this. I'd suggest changing the doc on url_ to "The URL from which we are downloading", adding "... redirection by the server for this url" to the doc above. In terms of whether we actually need all three, it's a hard question to answer. I just spent some time with codesearch, and found that url_/url() is used for: * download_util::CreateDownloadItemValue (-> DOM) * AutomationProvider::GetDictionaryFromDownloadItem (which could be used in a whole lot of ways--it's going to be hard to trace the uses of this down). * DragDownloadFile::ModelChanged (the usage this CL is modifying) * Browser::OnStartDownload(): Avoid showing the animation for downloads that are extensions. * OpenChromeExtension: Used for notification of extension install. referrer_url_/referrer_url() is only used in two of the above locations: the automation provider dictionary creation and OpenChromeExtension. Those usages make me nervous that we do indeed need all three, though possibly we should use the original_url_ (added here) for some of them.
I will add more documentation as you suggest and land the patch. Thanks a lot. On Fri, Jan 14, 2011 at 11:43 AM, <rdsmith@chromium.org> wrote: > LGTM, though I'd like a bit more documentation on the new member variables > as > noted below. > > > > > > http://codereview.chromium.org/6131009/diff/1/chrome/browser/download/downloa... > File chrome/browser/download/download_item.h (right): > > > http://codereview.chromium.org/6131009/diff/1/chrome/browser/download/downloa... > chrome/browser/download/download_item.h:191: GURL original_url() const { > return original_url_; } > On 2011/01/13 08:47:59, Paweł Hajdan Jr. wrote: > >> We try to avoid adding more accessors like this one. >> > > It's sorta like the difference between structures and classes; > structures just carry data around, but classes hide that data and > present an abstracted view of it. There's no abstraction on top of any > of the URLs, so I think direct set/get accessors are appropriate. This > might suggest a useful refactoring, where the pure, carried around data > associated with the DownloadItem is exposed through a separate interface > (maybe returns a const struct blah&) or maybe even is somehow a separate > object, and DownloadItem is a more standard class, with abstractions and > lack of access to internal data members. But that's not this CL :-}. > > > > http://codereview.chromium.org/6131009/diff/1/chrome/browser/download/downloa... > chrome/browser/download/download_item.h:263: // The original URL before > any redirection. > On 2011/01/13 08:47:59, Paweł Hajdan Jr. wrote: > >> Now we have three URLs, and it's super-easy to confuse them all. The >> > difference > >> should be much better documented. Do we actually need all three? >> > > I agree with this. I'd suggest changing the doc on url_ to "The URL > from which we are downloading", adding "... redirection by the server > for this url" to the doc above. > > In terms of whether we actually need all three, it's a hard question to > answer. I just spent some time with codesearch, and found that > url_/url() is used for: > * download_util::CreateDownloadItemValue (-> DOM) > * AutomationProvider::GetDictionaryFromDownloadItem (which could be used > in a whole lot of ways--it's going to be hard to trace the uses of this > down). > * DragDownloadFile::ModelChanged (the usage this CL is modifying) > * Browser::OnStartDownload(): Avoid showing the animation for downloads > that are extensions. > * OpenChromeExtension: Used for notification of extension install. > > referrer_url_/referrer_url() is only used in two of the above locations: > the automation provider dictionary creation and OpenChromeExtension. > > Those usages make me nervous that we do indeed need all three, though > possibly we should use the original_url_ (added here) for some of them. > > Yes, we might need to use original_url for some of them. We need to do more investigations here. > > http://codereview.chromium.org/6131009/ > |