|
|
Created:
8 years, 1 month ago by mkosiba (inactive) Modified:
8 years ago CC:
chromium-reviews Base URL:
http://git.chromium.org/external/google-url.git@master Visibility:
Public. |
DescriptionMake ResolveRelative work with all hierarchical URLs.
This makes the ResolveRelative method work correctly for all
hierarchical URLs while preserving the special support for the
"standard" schemes.
The change is needed for Android WebView, where it is perfectly
valid for an embedder to load a data URL with a base URL that is
a custom scheme (like myapp://resources/) and then later intercept
those requests.
BUG=159832
Patch Set 1 #Patch Set 2 : #Patch Set 3 : #Patch Set 4 : #Patch Set 5 : #
Total comments: 7
Patch Set 6 : address Joth's feedback #
Total comments: 4
Patch Set 7 : improve unittests and UT docs #
Total comments: 6
Patch Set 8 : #
Total comments: 4
Patch Set 9 : fix nits #
Total comments: 1
Messages
Total messages: 27 (0 generated)
Hey! PTAL. The problem I'm trying to solve is that in WebKit KURLGoogle calls url_util::ResolveRelative in init(). If the base URL has a non-"standard" scheme this would result in init() returning and empty string, which then causes WebKit to ignore the entire request. For the android WebView, we need to support embedders using custom schemes (which together with the intercepting mechanism we provide means, that you could do stuff like: <img src="fractal://mandelbrot/" />) Thanks! Martin
We should also run this by abarth and/or eseidel. Some history: we used to treat all schemes with "://" as relative. This is the correct behavior as documented in the spec, and would make your case work. However, this ends up breaking the web (noticed in bug 160!) and had to remove that for compat: http://code.google.com/p/google-url/source/detail?r=123 Now we only treat specifically known schemes as standard. The scary thing about your change is that it introduces an inconsistency between the handling of resolving relative URLs and just canonicalizing a URL. For example, if we load the URL "asdf://Foo/bar" in the browser won't think it's hierarchical and will treat the stuff following the scheme as opaque data, leaving everything unchanged (modulo escaping non-ASCII and invalid characters). Your change doesn't affect this. However, with your change if we have that URL asdf://Foo/bar" and resolve "zzz" on it, it will be treated as hierarchical for the purposes of that translation and it will be canonicalized to "asdf://foo/zzz" (host lower-cased). I'm not entirely sure what this will break, but in general these inconsistencies are very bad, and lead to security problems and origin checking failures. The core problem here is that some Android apps are expecting the specified HTML behavior, but Chrome can not supply this behavior for full compat with existing platforms. We could say that "://" means hierarchical on Android only, but that sounds non-optimal and also a likely source of bugs.
> We could say that "://" means hierarchical on Android only, but that sounds > non-optimal and also a likely source of bugs. We try to avoid these sorts of behavior differences if at all possible because they result in different pools of content making contradictory assumptions about how browsers work. Is this only about the "content" scheme? Perhaps we just need to recognize that scheme as being hierarchical.
@brett - thanks for explaining. I read the bug and agree that the proposed solution is scary. @abarth - unfortunately (see the CR description for a bit more detail) it's not only about content. We'd like to support any scheme that the third party app authors invent. Maybe dynamically adding the scheme of the base URL to the list of standard schemes would solve the problem? If the URL is specified as a base URL (and has the :// bit) could we treat it as "standard" then? Would it be less scary if we did that on android_webview only? If so this would require 2 things: - @abarth - we would need some way of getting the base URL if it was specified via the <base /> tag. From a quick glance this should be fairly easy. - @brettw? - currently the list of "standard" schemes is locked as part of initialization (in ContentMainDelegate I think), for what I assume are security reasons. Would there be potential issues in not locking the list? On Thu, Nov 1, 2012 at 9:17 PM, <abarth@chromium.org> wrote: > We could say that "://" means hierarchical on Android only, but that >> sounds >> non-optimal and also a likely source of bugs. >> > > We try to avoid these sorts of behavior differences if at all possible > because > they result in different pools of content making contradictory assumptions > about > how browsers work. > > Is this only about the "content" scheme? Perhaps we just need to > recognize that > scheme as being hierarchical. > > https://codereview.chromium.**org/11367010/<https://codereview.chromium.org/1... >
Thinking about this last night, I'm not sure that your proposed change is that bad. I'm interested in Adam's opinion. Under normal usage, the only way for a page to have a non-standard scheme and have content is for it to be "data:" (or maybe something else similar). The only thing your change would affect in this situation is if the first character after the data: is a slash, and the page has non-relative URLs. Non-relative URLs already won't do anything interesting, so the fact that we'd resolve them to garbage and potentially do destructive normalization on them (assuming they're hierarchical) doesn't seem dangerous or surprising. For the Android case, some other app is scripting the WebView and we have a potentially random scheme, and we're resolving URLs relative to that. What will happen is if the app is giving us URLs that aren't canonicalized under hierarchical rules is that they will seemingly randomly be canonicalized according to these rules. From Chrome's perspective, this isn't a problem. All the URLs are opaque data and useless to us for the purposes of same origin checks. From the app's perspective, this may be a bit weird but I bet it doesn't come up yet. The main thing that will happen in practice is that if the hostname for the URL has upper-case in it, it will be lowercased sometimes and sometimes not, and the app needs to expect this. So I think that your patch isn't "bad" from this perspective. A separate question is whether your patch is sufficient for solving the problem on Android. If apps are giving us URLs that Chrome doesn't think are standard, I think a bunch of stuff will be broken, like same origin checks. Your patch won't fix some of these things, so the apps may find themselves restricted in what they can do. Thinking out loud, maybe your current patch is enough to unbreak most current content, and then maybe we need to work to have some recommended guidelines for apps doing this. Like we could reserve a scheme that we register as standard and recommend that app users use that if they're going to be doing request interception. Brett On Fri, Nov 2, 2012 at 3:52 AM, Martin Kosiba <mkosiba@chromium.org> wrote: > @brett - thanks for explaining. I read the bug and agree that the proposed > solution is scary. > > @abarth - unfortunately (see the CR description for a bit more detail) it's > not only about content. We'd like to support any scheme that the third party > app authors invent. > > Maybe dynamically adding the scheme of the base URL to the list of standard > schemes would solve the problem? If the URL is specified as a base URL (and > has the :// bit) could we treat it as "standard" then? Would it be less > scary if we did that on android_webview only? > If so this would require 2 things: > - @abarth - we would need some way of getting the base URL if it was > specified via the <base /> tag. From a quick glance this should be fairly > easy. > - @brettw? - currently the list of "standard" schemes is locked as part of > initialization (in ContentMainDelegate I think), for what I assume are > security reasons. Would there be potential issues in not locking the list? > > > On Thu, Nov 1, 2012 at 9:17 PM, <abarth@chromium.org> wrote: >>> >>> We could say that "://" means hierarchical on Android only, but that >>> sounds >>> non-optimal and also a likely source of bugs. >> >> >> We try to avoid these sorts of behavior differences if at all possible >> because >> they result in different pools of content making contradictory assumptions >> about >> how browsers work. >> >> Is this only about the "content" scheme? Perhaps we just need to >> recognize that >> scheme as being hierarchical. >> >> https://codereview.chromium.org/11367010/ > >
On Fri, Nov 2, 2012 at 10:42 PM, Brett Wilson <brettw@chromium.org> wrote: > Under normal usage, the only way for a page to have a non-standard > scheme and have content is for it to be "data:" (or maybe something > else similar). The only thing your change would affect in this > situation is if the first character after the data: is a slash, and > the page has non-relative URLs. Non-relative URLs already won't do > anything interesting, so the fact that we'd resolve them to garbage > and potentially do destructive normalization on them (assuming they're > hierarchical) doesn't seem dangerous or surprising. > > For the Android case, some other app is scripting the WebView and we > have a potentially random scheme, and we're resolving URLs relative to > that. What will happen is if the app is giving us URLs that aren't > canonicalized under hierarchical rules is that they will seemingly > randomly be canonicalized according to these rules. hmm.. that doesn't sound too terrible. It may require some app authors to change their interception code if the base URLs they're providing end up being different after canonicalization, I assume this could be fixed by using Java URI.equals instead of a string compare? > > From Chrome's perspective, this isn't a problem. All the URLs are > opaque data and useless to us for the purposes of same origin checks. > From the app's perspective, this may be a bit weird but I bet it > doesn't come up yet. The main thing that will happen in practice is > that if the hostname for the URL has upper-case in it, it will be > lowercased sometimes and sometimes not, and the app needs to expect > this. > > So I think that your patch isn't "bad" from this perspective. > > agreed. > A separate question is whether your patch is sufficient for solving > the problem on Android. If apps are giving us URLs that Chrome doesn't > think are standard, I think a bunch of stuff will be broken, like same > origin checks. Your patch won't fix some of these things, so the apps > may find themselves restricted in what they can do. > > Could you briefly explain (or point to the code) how the same origin checks work in Chrome? Especially the bit in the previous paragraph where you mentioned that URLs are useless for this purpose? > Thinking out loud, maybe your current patch is enough to unbreak most > current content, and then maybe we need to work to have some > recommended guidelines for apps doing this. Like we could reserve a > scheme that we register as standard and recommend that app users use > that if they're going to be doing request interception. > > This is something we're considering. It should be trivial to change apps to use a custom hostname within the http:// protocol instead of a custom scheme Another thing we were considering is to require 3rd party app authors to register the protocols before usage. > Brett > > On Fri, Nov 2, 2012 at 3:52 AM, Martin Kosiba <mkosiba@chromium.org> > wrote: > > @brett - thanks for explaining. I read the bug and agree that the > proposed > > solution is scary. > > > > @abarth - unfortunately (see the CR description for a bit more detail) > it's > > not only about content. We'd like to support any scheme that the third > party > > app authors invent. > > > > Maybe dynamically adding the scheme of the base URL to the list of > standard > > schemes would solve the problem? If the URL is specified as a base URL > (and > > has the :// bit) could we treat it as "standard" then? Would it be less > > scary if we did that on android_webview only? > > If so this would require 2 things: > > - @abarth - we would need some way of getting the base URL if it was > > specified via the <base /> tag. From a quick glance this should be fairly > > easy. > > - @brettw? - currently the list of "standard" schemes is locked as > part of > > initialization (in ContentMainDelegate I think), for what I assume are > > security reasons. Would there be potential issues in not locking the > list? > > > > > > On Thu, Nov 1, 2012 at 9:17 PM, <abarth@chromium.org> wrote: > >>> > >>> We could say that "://" means hierarchical on Android only, but that > >>> sounds > >>> non-optimal and also a likely source of bugs. > >> > >> > >> We try to avoid these sorts of behavior differences if at all possible > >> because > >> they result in different pools of content making contradictory > assumptions > >> about > >> how browsers work. > >> > >> Is this only about the "content" scheme? Perhaps we just need to > >> recognize that > >> scheme as being hierarchical. > >> > >> https://codereview.chromium.org/11367010/ > > > > >
On Wed, Nov 7, 2012 at 8:34 AM, Martin Kosiba <mkosiba@chromium.org> wrote: > On Fri, Nov 2, 2012 at 10:42 PM, Brett Wilson <brettw@chromium.org> wrote: >> >> Under normal usage, the only way for a page to have a non-standard >> scheme and have content is for it to be "data:" (or maybe something >> else similar). The only thing your change would affect in this >> situation is if the first character after the data: is a slash, and >> the page has non-relative URLs. Non-relative URLs already won't do >> anything interesting, so the fact that we'd resolve them to garbage >> and potentially do destructive normalization on them (assuming they're >> hierarchical) doesn't seem dangerous or surprising. >> >> For the Android case, some other app is scripting the WebView and we >> have a potentially random scheme, and we're resolving URLs relative to >> that. What will happen is if the app is giving us URLs that aren't >> canonicalized under hierarchical rules is that they will seemingly >> randomly be canonicalized according to these rules. > > > hmm.. that doesn't sound too terrible. It may require some app authors to > change their interception code if the base URLs they're providing end up > being different after canonicalization, I assume this could be fixed by > using Java URI.equals instead of a string compare? > >> >> >> >> From Chrome's perspective, this isn't a problem. All the URLs are >> opaque data and useless to us for the purposes of same origin checks. >> From the app's perspective, this may be a bit weird but I bet it >> doesn't come up yet. The main thing that will happen in practice is >> that if the hostname for the URL has upper-case in it, it will be >> lowercased sometimes and sometimes not, and the app needs to expect >> this. >> >> So I think that your patch isn't "bad" from this perspective. >> > > agreed. > >> >> A separate question is whether your patch is sufficient for solving >> the problem on Android. If apps are giving us URLs that Chrome doesn't >> think are standard, I think a bunch of stuff will be broken, like same >> origin checks. Your patch won't fix some of these things, so the apps >> may find themselves restricted in what they can do. >> > > Could you briefly explain (or point to the code) how the same origin checks > work in Chrome? Especially the bit in the previous paragraph where you > mentioned that URLs are useless for this purpose? I don't think same origin checks work for non-hierarchical URLs. I *believe* (Adam can correct me) that we'll always say that the origins don't match. This may make some features not work, although I don't know off the top of my head what. >> Thinking out loud, maybe your current patch is enough to unbreak most >> current content, and then maybe we need to work to have some >> recommended guidelines for apps doing this. Like we could reserve a >> scheme that we register as standard and recommend that app users use >> that if they're going to be doing request interception. >> > > This is something we're considering. It should be trivial to change apps to > use a custom hostname within the http:// protocol instead of a custom scheme > Another thing we were considering is to require 3rd party app authors to > register the protocols before usage. If you encourage people to use http, either there should be a standard hostname for this purpose that we control, or we should use a different scheme and register it as one of the list of standard ones. I think using a new scheme will be cleaner and has less potential for weird issues down the road. We do this for extensions: chrome-extension://<extension ID>/... Brett
A bit more info here. So I tried adding "content" to the hierarchical URL whitelist and that doesn't work. Android resolves the content provider by looking up the package and class name, which are case-sensitive, so while resolving the base URL worked, canonicalization was causing problems elsewhere. Soooo... I'm sort of back to square one. One thought I had was that the problematic point here is resolving a path as relative to some 'random' URL (what I mean is that <img src="content://org.foo.bar.Provider/image1"/> works fine as the code is right now), so I thought that maybe a better approach would be to change what happens when we fail in DoResolveRelative. Right now resolving a relative path against a 'random' base URL will fail and not pass anything out to the caller. What if DoResolveRelative called ResolveRelativeURL in line 255? That way we'd sort of make a best-effort attempt (which might be not what you expect) at doing something useful, and worst case Chrome would simply fail creating a job for the request. This should be OK (as in no new unexpected behavior for supported schemes) if you assume that all the schemes that Chrome has registered factories for have also been added to the whitelist (which is the case, AFAIK). WDYT? On Wed, Nov 7, 2012 at 6:31 PM, Brett Wilson <brettw@chromium.org> wrote: > On Wed, Nov 7, 2012 at 8:34 AM, Martin Kosiba <mkosiba@chromium.org> > wrote: > > On Fri, Nov 2, 2012 at 10:42 PM, Brett Wilson <brettw@chromium.org> > wrote: > >> > >> Under normal usage, the only way for a page to have a non-standard > >> scheme and have content is for it to be "data:" (or maybe something > >> else similar). The only thing your change would affect in this > >> situation is if the first character after the data: is a slash, and > >> the page has non-relative URLs. Non-relative URLs already won't do > >> anything interesting, so the fact that we'd resolve them to garbage > >> and potentially do destructive normalization on them (assuming they're > >> hierarchical) doesn't seem dangerous or surprising. > >> > >> For the Android case, some other app is scripting the WebView and we > >> have a potentially random scheme, and we're resolving URLs relative to > >> that. What will happen is if the app is giving us URLs that aren't > >> canonicalized under hierarchical rules is that they will seemingly > >> randomly be canonicalized according to these rules. > > > > > > hmm.. that doesn't sound too terrible. It may require some app authors to > > change their interception code if the base URLs they're providing end up > > being different after canonicalization, I assume this could be fixed by > > using Java URI.equals instead of a string compare? > > > >> > >> > >> > >> From Chrome's perspective, this isn't a problem. All the URLs are > >> opaque data and useless to us for the purposes of same origin checks. > >> From the app's perspective, this may be a bit weird but I bet it > >> doesn't come up yet. The main thing that will happen in practice is > >> that if the hostname for the URL has upper-case in it, it will be > >> lowercased sometimes and sometimes not, and the app needs to expect > >> this. > >> > >> So I think that your patch isn't "bad" from this perspective. > >> > > > > agreed. > > > >> > >> A separate question is whether your patch is sufficient for solving > >> the problem on Android. If apps are giving us URLs that Chrome doesn't > >> think are standard, I think a bunch of stuff will be broken, like same > >> origin checks. Your patch won't fix some of these things, so the apps > >> may find themselves restricted in what they can do. > >> > > > > Could you briefly explain (or point to the code) how the same origin > checks > > work in Chrome? Especially the bit in the previous paragraph where you > > mentioned that URLs are useless for this purpose? > > I don't think same origin checks work for non-hierarchical URLs. I > *believe* (Adam can correct me) that we'll always say that the origins > don't match. This may make some features not work, although I don't > know off the top of my head what. > > > >> Thinking out loud, maybe your current patch is enough to unbreak most > >> current content, and then maybe we need to work to have some > >> recommended guidelines for apps doing this. Like we could reserve a > >> scheme that we register as standard and recommend that app users use > >> that if they're going to be doing request interception. > >> > > > > This is something we're considering. It should be trivial to change apps > to > > use a custom hostname within the http:// protocol instead of a custom > scheme > > Another thing we were considering is to require 3rd party app authors to > > register the protocols before usage. > > If you encourage people to use http, either there should be a standard > hostname for this purpose that we control, or we should use a > different scheme and register it as one of the list of standard ones. > I think using a new scheme will be cleaner and has less potential for > weird issues down the road. We do this for extensions: > chrome-extension://<extension ID>/... > > Brett >
Can you provide a little more background? The URLs that we're generating... are they defined by the system or do apps use their own? Is there a particular scheme it uses or is it random? Initially I got the impression that apps used whatever they wanted and manually intercepted requests. But I see now you're talking about the system doing some lookups for the "content" scheme. Can you clarify the scope of the users and what they expect? I don't totally follow your suggested fix. Maybe it would be easy enough to just upload a patch? Brett
On 2012/11/14 00:09:56, brettw wrote: > Can you provide a little more background? sure! > The URLs that we're generating... are they defined by the system or do > apps use their own? Is there a particular scheme it uses or is it > random? Initially I got the impression that apps used whatever they > wanted and manually intercepted requests. But I see now you're talking > about the system doing some lookups for the "content" scheme. Can you > clarify the scope of the users and what they expect? So there are 2 classes of.. things: 1) content:// URIs which are handled by the system, more details are here: http://developer.android.com/guide/topics/providers/content-provider-basics.html but it essentially works like this: trying to access a content://org.package.ProviderName/resource_name url in the WebView will prompt the system to load the org.package.ProviderName class, and call something like query('resource_name') on an instance of that class. 2) URLs with other 'random' schemes - which are handled entirely by the app by using URLRequest intercepting > I don't totally follow your suggested fix. Maybe it would be easy > enough to just upload a patch? So I tried doing that and realized it won't work (in case you're interested it would work exactly like the current patch, except that non-standard urls would be marked as invalid. But since, like you've mentioned, Chrome treats invalid URLs as 'radioactive', we can't really do anything useful with those).
Hey! I uploaded a patch that makes the 'use whatever URL as base' scenario as compatible with KURL behavior as I could get it. The key points that I wanted to achieve are: 1) resolving a relative url against a non-standard base_spec works, 2) content://class.name.of.Provider/ is not lowercased 3) resolving ../img.jpg against content://foo.Bar/ doesn't result in content://img.jpg 4) resolving ../file against scheme:/base results in scheme:/file rather than an invalid URL. The downside is that some 'bad' URLs (like the data:/file.html which was found by the unit tests) are now valid (just like in the previous patch). I realize that this looks pretty hacky and I'm wondering if it would be better if this behavior were optional, as none of the other Chrome ports need this functionality. https://codereview.chromium.org/11367010/diff/11002/src/url_util.cc File src/url_util.cc (right): https://codereview.chromium.org/11367010/diff/11002/src/url_util.cc#newcode244 src/url_util.cc:244: int num_slashes = url_parse::CountConsecutiveSlashes( the reason I count slashes here (instead of the previous Parse-based code) is that ResolveRelativeURL doesn't use the Parse code to create output_parsed. That would mean having to calculate Parsed::is_hierarchical in at lease 3 different places (2 in Parse and 1 in ResolveRelativeURL). 1 more copy seems better than 3 more.
Ping?
https://codereview.chromium.org/11367010/diff/11002/src/url_util.cc File src/url_util.cc (right): https://codereview.chromium.org/11367010/diff/11002/src/url_util.cc#newcode247 src/url_util.cc:247: base_is_hierarchical_only = num_slashes == 1; I think I might find the logic below a bit simpler if this were just base_is_hierarchical = num_slashes > 0; we'd then just say "base_is_hierarchical || standard_base_scheme" at line 258 https://codereview.chromium.org/11367010/diff/11002/src/url_util.cc#newcode275 src/url_util.cc:275: } else if (base_is_authority_based && is_relative) { Seems this would be simpler if you swapped the order, and then you don't need is_hierarchical_only as it drops out by default:- if (is_relative && base_is_authority_based && !standard_base_scheme) { ... // authority preserving logic } else if (is_relative) { ... // old logic } https://codereview.chromium.org/11367010/diff/11002/src/url_util.cc#newcode290 src/url_util.cc:290: // re-created. I can't quite follow this (but I don't have so much experience in this code). Possibly an example would help, but it may be too verbose
https://codereview.chromium.org/11367010/diff/11002/src/url_util.cc File src/url_util.cc (right): https://codereview.chromium.org/11367010/diff/11002/src/url_util.cc#newcode247 src/url_util.cc:247: base_is_hierarchical_only = num_slashes == 1; On 2012/11/29 02:30:41, joth wrote: > I think I might find the logic below a bit simpler if this were just > base_is_hierarchical = num_slashes > 0; > > we'd then just say "base_is_hierarchical || standard_base_scheme" at line 258 Done. https://codereview.chromium.org/11367010/diff/11002/src/url_util.cc#newcode275 src/url_util.cc:275: } else if (base_is_authority_based && is_relative) { On 2012/11/29 02:30:41, joth wrote: > Seems this would be simpler if you swapped the order, and then you don't need > is_hierarchical_only as it drops out by default:- > > if (is_relative && base_is_authority_based && !standard_base_scheme) { > ... // authority preserving logic > } else if (is_relative) { > ... // old logic > } Done. https://codereview.chromium.org/11367010/diff/11002/src/url_util.cc#newcode290 src/url_util.cc:290: // re-created. On 2012/11/29 02:30:41, joth wrote: > I can't quite follow this (but I don't have so much experience in this code). > Possibly an example would help, but it may be too verbose umm.. so the deal is that the contents of the output_parsed structure were partially copied over from the base_parsed_authority structure which are different from the ones that would have been copied over from base_parsed. an example would be a url like scheme://foo.com/bar in base_parsed we would get host:[], path:[foo.com/bar] in base_parsed_authority we get host:[foo.com], path:[/bar] so then after calling ResolveRelativeURL("scheme://foo.com/bar", "/baz") we would get output parsed host[foo.com], path[/baz] but what we want is host[] path [foo.com/baz]. This is because we don't want to promote the path URL to a standard URL only because it was resolved against a base (also a bunch of otherwise useful DCHECKs would need to be disabled).
Sorry I've been slow on this, these URL things need some time with undivided attention that I haven't had. Hopefully I can get to this today.
Ping? You want to chat on VC about this?
Sorry for the delay. https://codereview.chromium.org/11367010/diff/20001/src/gurl_unittest.cc File src/gurl_unittest.cc (right): https://codereview.chromium.org/11367010/diff/20001/src/gurl_unittest.cc#newc... src/gurl_unittest.cc:203: {"data:/blahblah", "file.html", true, "data:/file.html"}, This case seems fine to me. But it seems like we should have a different section heading, since in this case "data" is being treated as quasi-standard. https://codereview.chromium.org/11367010/diff/20001/src/url_canon_unittest.cc File src/url_canon_unittest.cc (right): https://codereview.chromium.org/11367010/diff/20001/src/url_canon_unittest.cc... src/url_canon_unittest.cc:2039: // Non-standard URLs. Can you expand on this comment a bit about what you're trying to test? Additionally, your code does different things for one and two slashes. I don't see those differences reflected in the tests here (seeing that would make it more obvious to me decide whether this is the right thing or not).
another version is out, PTAL https://codereview.chromium.org/11367010/diff/20001/src/gurl_unittest.cc File src/gurl_unittest.cc (right): https://codereview.chromium.org/11367010/diff/20001/src/gurl_unittest.cc#newc... src/gurl_unittest.cc:203: {"data:/blahblah", "file.html", true, "data:/file.html"}, On 2012/12/06 00:25:10, brettw wrote: > This case seems fine to me. But it seems like we should have a different section > heading, since in this case "data" is being treated as quasi-standard. Done. https://codereview.chromium.org/11367010/diff/20001/src/url_canon_unittest.cc File src/url_canon_unittest.cc (right): https://codereview.chromium.org/11367010/diff/20001/src/url_canon_unittest.cc... src/url_canon_unittest.cc:2039: // Non-standard URLs. On 2012/12/06 00:25:10, brettw wrote: > Can you expand on this comment a bit about what you're trying to test? > > Additionally, your code does different things for one and two slashes. I don't > see those differences reflected in the tests here (seeing that would make it > more obvious to me decide whether this is the right thing or not). Done. https://codereview.chromium.org/11367010/diff/27001/src/url_util_unittest.cc File src/url_util_unittest.cc (right): https://codereview.chromium.org/11367010/diff/27001/src/url_util_unittest.cc#... src/url_util_unittest.cc:252: TEST(URLUtilTest, TestResolveRelativeWithNonStandardBase) { The test case in url_canon_unittest didn't actually exercise any of the new code (at most I could add additional expectations to ResolveRelative and ResolveRelativeURL) and I think it is preferable to add a new test case here rather than add to the one in GURL. https://codereview.chromium.org/11367010/diff/27001/src/url_util_unittest.cc#... src/url_util_unittest.cc:277: {"content://content.Provider/", "//other.Provider", true, "content://other.provider/"}, this is a bit inconsistent, but acceptable (the Android WebView doesn't support resolving // anyway) and I can't think of a non-terrible way of fixing this.
https://codereview.chromium.org/11367010/diff/27001/src/url_util_unittest.cc File src/url_util_unittest.cc (right): https://codereview.chromium.org/11367010/diff/27001/src/url_util_unittest.cc#... src/url_util_unittest.cc:252: TEST(URLUtilTest, TestResolveRelativeWithNonStandardBase) { Maybe have a function comment explicitly calling out this is for non-standard yet hierarchical schemes (per GURL::IsStandard()) https://codereview.chromium.org/11367010/diff/27001/src/url_util_unittest.cc#... src/url_util_unittest.cc:280: for (size_t i = 0; i < ARRAYSIZE_UNSAFE(cases); i++) { while you're here, you could write this as a gtest data-driven test? See TEST_P(X509CertificateNameVerifyTest, VerifyHostname) in http://src.chromium.org/viewvc/chrome/trunk/src/net/base/x509_certificate_uni... for example
https://codereview.chromium.org/11367010/diff/27001/src/url_util_unittest.cc File src/url_util_unittest.cc (right): https://codereview.chromium.org/11367010/diff/27001/src/url_util_unittest.cc#... src/url_util_unittest.cc:252: TEST(URLUtilTest, TestResolveRelativeWithNonStandardBase) { On 2012/12/06 20:02:26, joth wrote: > Maybe have a function comment explicitly calling out this is for non-standard > yet hierarchical schemes (per GURL::IsStandard()) Done. https://codereview.chromium.org/11367010/diff/27001/src/url_util_unittest.cc#... src/url_util_unittest.cc:280: for (size_t i = 0; i < ARRAYSIZE_UNSAFE(cases); i++) { On 2012/12/06 20:02:26, joth wrote: > while you're here, you could write this as a gtest data-driven test? See > TEST_P(X509CertificateNameVerifyTest, VerifyHostname) > in > http://src.chromium.org/viewvc/chrome/trunk/src/net/base/x509_certificate_uni... > for example I tried doing that, but didn't like the end result. This adds a fair amount of noise and doesn't seem to buy much.. sample error output is: [ FAILED ] URLUtilResolveRelativeTest.NonStandardBase/2, where GetParam() = 32-byte object <51-70 5E-00 00-00 00-00 2D-70 5E-00 00-00 00-00 01-00 00-00 00-00 00-00 6A-70 5E-00 00-00 00-00> (0 ms) TEST_P isn't used anywhere else in googleurl, even though many test cases are in fact data-driven (the prevailing pattern is the one I applied here), so I'm going to stick to the for loop.
On 10 December 2012 04:32, <mkosiba@chromium.org> wrote: > > https://codereview.chromium.**org/11367010/diff/27001/src/** > url_util_unittest.cc<https://codereview.chromium.org/11367010/diff/27001/src/url_util_unittest.cc> > File src/url_util_unittest.cc (right): > > https://codereview.chromium.**org/11367010/diff/27001/src/** > url_util_unittest.cc#**newcode252<https://codereview.chromium.org/11367010/diff/27001/src/url_util_unittest.cc#newcode252> > src/url_util_unittest.cc:252: TEST(URLUtilTest, > TestResolveRelativeWithNonStan**dardBase) { > On 2012/12/06 20:02:26, joth wrote: > >> Maybe have a function comment explicitly calling out this is for >> > non-standard > >> yet hierarchical schemes (per GURL::IsStandard()) >> > > Done. > > > https://codereview.chromium.**org/11367010/diff/27001/src/** > url_util_unittest.cc#**newcode280<https://codereview.chromium.org/11367010/diff/27001/src/url_util_unittest.cc#newcode280> > src/url_util_unittest.cc:280: for (size_t i = 0; i < > ARRAYSIZE_UNSAFE(cases); i++) { > On 2012/12/06 20:02:26, joth wrote: > >> while you're here, you could write this as a gtest data-driven test? >> > See > >> TEST_P(**X509CertificateNameVerifyTest, VerifyHostname) >> in >> > > http://src.chromium.org/**viewvc/chrome/trunk/src/net/** > base/x509_certificate_**unittest.cc<http://src.chromium.org/viewvc/chrome/trunk/src/net/base/x509_certificate_unittest.cc> > >> for example >> > > I tried doing that, but didn't like the end result. This adds a fair > amount of noise and doesn't seem to buy much.. sample error output is: > [ FAILED ] URLUtilResolveRelativeTest.**NonStandardBase/2, where > GetParam() = 32-byte object <51-70 5E-00 00-00 00-00 2D-70 5E-00 00-00 > 00-00 01-00 00-00 00-00 00-00 6A-70 5E-00 00-00 00-00> (0 ms) > > TEST_P isn't used anywhere else in googleurl, even though many test > cases are in fact data-driven (the prevailing pattern is the one I > applied here), so I'm going to stick to the for loop. > > yeah, local consistency rules. Thanks for looking into it. > https://codereview.chromium.**org/11367010/<https://codereview.chromium.org/1... >
Hey Brett! Would it be helpful if we were to stick this under OS_ANDROID or some global settings variable that would be enabled only for the webview build?
I chatted a bit more with Brett on this - my understanding is:- - strongly prefer not to make this feature optional, so we can have consistent story on 'how chromium deals with URLs'. We could look to add option as last resort if needed to unblock dev/testing. - ideally land this without any config option, but let it bake on dev channel for a good time before going to beta. - M25 branches Monday. So lets land this Tuesday+ - In line with that, Brett will endeavor to finish off the review in next day or two. Thanks!
LGTM. Just 2 nits. After the branch, ping me and I'll check in the final patch for you. Then you can roll deps to pull into Chrome. https://codereview.chromium.org/11367010/diff/33001/src/url_util.cc File src/url_util.cc (right): https://codereview.chromium.org/11367010/diff/33001/src/url_util.cc#newcode266 src/url_util.cc:266: url_parse::Parsed base_parsed_authority; I'd put this line right below the comment (above ParseStandardURL where the info is filled in). https://codereview.chromium.org/11367010/diff/33001/src/url_util_unittest.cc File src/url_util_unittest.cc (right): https://codereview.chromium.org/11367010/diff/33001/src/url_util_unittest.cc#... src/url_util_unittest.cc:260: } resolve_non_standard_cases[] = { It would be helpful to see what's going on here if you also added the base case of no slash after the colon. Even though this is covered elsewhere, it makes it easier to follow how the slashes change the behavior since so much of following what's going on is just looking at the unit tests.
Thanks Brett! https://codereview.chromium.org/11367010/diff/33001/src/url_util.cc File src/url_util.cc (right): https://codereview.chromium.org/11367010/diff/33001/src/url_util.cc#newcode266 src/url_util.cc:266: url_parse::Parsed base_parsed_authority; On 2012/12/13 21:31:44, brettw wrote: > I'd put this line right below the comment (above ParseStandardURL where the info > is filled in). Done. https://codereview.chromium.org/11367010/diff/33001/src/url_util_unittest.cc File src/url_util_unittest.cc (right): https://codereview.chromium.org/11367010/diff/33001/src/url_util_unittest.cc#... src/url_util_unittest.cc:260: } resolve_non_standard_cases[] = { On 2012/12/13 21:31:44, brettw wrote: > It would be helpful to see what's going on here if you also added the base case > of no slash after the colon. Even though this is covered elsewhere, it makes it > easier to follow how the slashes change the behavior since so much of following > what's going on is just looking at the unit tests. Done. https://codereview.chromium.org/11367010/diff/40002/src/url_util_unittest.cc File src/url_util_unittest.cc (right): https://codereview.chromium.org/11367010/diff/40002/src/url_util_unittest.cc#... src/url_util_unittest.cc:284: {"about:blank", "custom://Authority", true, "custom://Authority"}, For now we're going to live with this and ask app developers to append a slash at the end of their custom scheme and/or content:// scheme base URL. This will need to be re-visited some time later (http://crbug.com/166675) but I'd like to see if this change sticks before I try and fix this.
Checked in as googleurl@181. You should do a patch to pull this version into Chrome via DEPS. You don't need review but you should do try runs in case anything breaks (the last time we did this it broke some unexpected things). You may also want to coordinate with the webkit sheriff when you land the DEPS change since it may break/change some layout tests (my guess is that it fixes some and hopefully won't break any, but you never know).
Message was sent while issue was closed.
Thanks again! The change to roll DEPS is here: https://codereview.chromium.org/11644070/ but there were some trybot failures so I'll most likely end up submitting it next year. |