|
|
DescriptionUpstream "nsurl_util.{h,mm}" from Chrome on iOS repository.
I created the new directory net/base/mac/ to hold shared files between OSX and
iOS. The logic in nsurl_util was mostly duplicated code, so I replaced the bulk
of the code with a single new function EscapeNSURLPrecursor.
BUG=
Committed: https://crrev.com/006a7cfbba2e1c415c9b0d63767a73f14815e27e
Cr-Commit-Position: refs/heads/master@{#306690}
Patch Set 1 : Original files from iOS repository, moved to net/base/mac/ and renamed. #Patch Set 2 : Updated files to reflect upstreaming. Diff against patch set 1. #
Total comments: 8
Patch Set 3 : Comments from mmenke. #
Total comments: 8
Patch Set 4 : Comments from mmenke, round two. #
Messages
Total messages: 32 (10 generated)
Patchset #2 (id:20001) has been deleted
erikchen@chromium.org changed reviewers: + droger@chromium.org
droger: Please review.
droger@chromium.org changed reviewers: + mmenke@chromium.org
lgtm, but I am not familiar with net/ architecture, lets add a net OWNER. +mmenke as net/ OWNER mmenke: note that the code has already been reviewed and used in the iOS fork for a long time. There is probably no need to do a detailed review of the code again. https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... File net/base/mac/url_conversions.mm (right): https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... net/base/mac/url_conversions.mm:41: #if defined(OS_IOS) You can remove this ifdef, and the include on line 20, because we no longer support these older OSes.
https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... File net/base/mac/url_conversions.mm (right): https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... net/base/mac/url_conversions.mm:59: // pct-encode them for RFC 1738. We do not need a new method to do this. Set net::EscapePath, and its escape list: https://code.google.com/p/chromium/codesearch#chromium/src/net/base/escape.cc... Note: That list does exclude double quotes (") (As does RFC 1738), so if you need those escaped, you do need to do that yourself. It also escapes #, which this mentions below. Actually, though, doesn't GURL itself also escape the same characters as well?
https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... File net/base/mac/url_conversions.mm (right): https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... net/base/mac/url_conversions.mm:59: // pct-encode them for RFC 1738. On 2014/11/21 15:48:55, mmenke wrote: > We do not need a new method to do this. Set net::EscapePath, and its escape > list: > https://code.google.com/p/chromium/codesearch#chromium/src/net/base/escape.cc... > > Note: That list does exclude double quotes (") (As does RFC 1738), so if you > need those escaped, you do need to do that yourself. > > > It also escapes #, which this mentions below. > > Actually, though, doesn't GURL itself also escape the same characters as well? mmenke: What are your expectations? (I ask only so that I reduce the number of feedback iterations on this CL) My current best guess is the following: Some of the logic in this method is very similar to logic in escape.cc. This CL should not introduce duplicate code - it should reuse the existing code. 1. This code block and the next one (that converts %) is very similar to EscapePath. I should make a new method in escape.cc EscapeURLForConversionToNSURL which performs the correct conversion. 2. The following code block is very similar to EscapeNonASCII. I should use EscapeNonASCII and make a new method called EscapeControlCharacters to replace this code block. 3. I don't see any equivalent logic for the # replacement, so I should just leave it as is. Lastly, I'll need to make sure that nothing funny happens in the std::string -> NSString conversion, which used to happen at the beginning of this method, but will now happen near the end of this method. Let me know if this description accurately matches your expectations.
https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... File net/base/mac/url_conversions.mm (right): https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... net/base/mac/url_conversions.mm:59: // pct-encode them for RFC 1738. On 2014/11/22 00:11:39, erikchen wrote: > On 2014/11/21 15:48:55, mmenke wrote: > > We do not need a new method to do this. Set net::EscapePath, and its escape > > list: > > > https://code.google.com/p/chromium/codesearch#chromium/src/net/base/escape.cc... > > > > Note: That list does exclude double quotes (") (As does RFC 1738), so if you > > need those escaped, you do need to do that yourself. > > > > > > It also escapes #, which this mentions below. > > > > Actually, though, doesn't GURL itself also escape the same characters as well? > > mmenke: What are your expectations? (I ask only so that I reduce the number of > feedback iterations on this CL) > > My current best guess is the following: > Some of the logic in this method is very similar to logic in escape.cc. This CL > should not introduce duplicate code - it should reuse the existing code. > > 1. This code block and the next one (that converts %) is very similar to > EscapePath. I should make a new method in escape.cc > EscapeURLForConversionToNSURL which performs the correct conversion. > 2. The following code block is very similar to EscapeNonASCII. I should use > EscapeNonASCII and make a new method called EscapeControlCharacters to replace > this code block. > 3. I don't see any equivalent logic for the # replacement, so I should just > leave it as is. > > Lastly, I'll need to make sure that nothing funny happens in the std::string -> > NSString conversion, which used to happen at the beginning of this method, but > will now happen near the end of this method. > > Let me know if this description accurately matches your expectations. I'd like to duplicate as little code as possible. First, I want to know that GURL.spec() does not already have all the escaping you need. I'm not currently convinced of this. It does escaping itself, and we have to unescape GURLs before we can display them in the omnibox. Looks like it also escapes double quotes, so I'm not even convinced this code currently does anything at all (Other than the # logic). If we need additional escape logic, I'd like it in net/escape.h or url/. If we need to escape extra #'s (1738 isn't clear on this topic - it could just be referring to the path/query string, since it says all #'s should be escaped, but that's clearly false. RFC 3986 is clearer on the subject, though, and seems to agree with this code), I'd really rather see it done in GURL, rather than here, if we can manage that. I'm not a big fan of non-cross-platform URL handling logic. Brettw owns that code, and we should run it by him, to see what he thinks. Sound reasonable?
On 2014/11/22 01:04:13, mmenke wrote: > https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... > File net/base/mac/url_conversions.mm (right): > > https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... > net/base/mac/url_conversions.mm:59: // pct-encode them for RFC 1738. > On 2014/11/22 00:11:39, erikchen wrote: > > On 2014/11/21 15:48:55, mmenke wrote: > > > We do not need a new method to do this. Set net::EscapePath, and its escape > > > list: > > > > > > https://code.google.com/p/chromium/codesearch#chromium/src/net/base/escape.cc... > > > > > > Note: That list does exclude double quotes (") (As does RFC 1738), so if > you > > > need those escaped, you do need to do that yourself. > > > > > > > > > It also escapes #, which this mentions below. > > > > > > Actually, though, doesn't GURL itself also escape the same characters as > well? > > > > mmenke: What are your expectations? (I ask only so that I reduce the number of > > feedback iterations on this CL) > > > > My current best guess is the following: > > Some of the logic in this method is very similar to logic in escape.cc. This > CL > > should not introduce duplicate code - it should reuse the existing code. > > > > 1. This code block and the next one (that converts %) is very similar to > > EscapePath. I should make a new method in escape.cc > > EscapeURLForConversionToNSURL which performs the correct conversion. > > 2. The following code block is very similar to EscapeNonASCII. I should use > > EscapeNonASCII and make a new method called EscapeControlCharacters to replace > > this code block. > > 3. I don't see any equivalent logic for the # replacement, so I should just > > leave it as is. > > > > Lastly, I'll need to make sure that nothing funny happens in the std::string > -> > > NSString conversion, which used to happen at the beginning of this method, but > > will now happen near the end of this method. > > > > Let me know if this description accurately matches your expectations. > > I'd like to duplicate as little code as possible. > > First, I want to know that GURL.spec() does not already have all the escaping > you need. I'm not currently convinced of this. It does escaping itself, and we > have to unescape GURLs before we can display them in the omnibox. Looks like it > also escapes double quotes, so I'm not even convinced this code currently does > anything at all (Other than the # logic). If we need additional escape logic, > I'd like it in net/escape.h or url/. > > If we need to escape extra #'s (1738 isn't clear on this topic - it could just > be referring to the path/query string, since it says all #'s should be escaped, > but that's clearly false. RFC 3986 is clearer on the subject, though, and seems > to agree with this code), I'd really rather see it done in GURL, rather than > here, if we can manage that. I'm not a big fan of non-cross-platform URL > handling logic. Brettw owns that code, and we should run it by him, to see what > he thinks. > > Sound reasonable? That sounds reasonable to me. fwiw, I suspect that this logic is going to need some custom massaging, because its catering to NSURL's interpretation of the RFC specs, which might be outdated. Also, I attempted to land this CL in url/ earlier, and abarth@ believed that the CL did not belong in url/. https://codereview.chromium.org/719783005/ It's possible that he will think differently if the CL is changed to be platform-agnostic URL escaping, but I don't want to land anything behind his back, given his strong feelings on the subject.
https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... File net/base/mac/url_conversions_unittest.mm (right): https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... net/base/mac/url_conversions_unittest.mm:67: @"http://www.mysite.com/test?test#test?test", Oh...if the extra logic is only needed for fragments, something I hadn't considered, then should only apply it to the fragment portion of the URL.
On 2014/11/22 01:59:04, mmenke wrote: > https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... > File net/base/mac/url_conversions_unittest.mm (right): > > https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... > net/base/mac/url_conversions_unittest.mm:67: > @"http://www.mysite.com/test?test#test?test", > Oh...if the extra logic is only needed for fragments, something I hadn't > considered, then should only apply it to the fragment portion of the URL. So the only open question is whether GURL.spec() already performs the same conversion?
On 2014/11/22 02:11:36, erikchen wrote: > On 2014/11/22 01:59:04, mmenke wrote: > > > https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... > > File net/base/mac/url_conversions_unittest.mm (right): > > > > > https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... > > net/base/mac/url_conversions_unittest.mm:67: > > @"http://www.mysite.com/test?test#test?test", > > Oh...if the extra logic is only needed for fragments, something I hadn't > > considered, then should only apply it to the fragment portion of the URL. > > So the only open question is whether GURL.spec() already performs the same > conversion? Yes. I've done a little experimenting, and it looks like it does the same escaping to everything before the fragment. If we just apply stuff to the fragment only, I'm *much* more comfortable with this change, even if it has to be iOS only. Note that this should be systematically checked (Or the code should be read) to verify this.
On 2014/11/22 02:24:54, mmenke wrote: > On 2014/11/22 02:11:36, erikchen wrote: > > On 2014/11/22 01:59:04, mmenke wrote: > > > > > > https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... > > > File net/base/mac/url_conversions_unittest.mm (right): > > > > > > > > > https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... > > > net/base/mac/url_conversions_unittest.mm:67: > > > @"http://www.mysite.com/test?test#test?test", > > > Oh...if the extra logic is only needed for fragments, something I hadn't > > > considered, then should only apply it to the fragment portion of the URL. > > > > So the only open question is whether GURL.spec() already performs the same > > conversion? > > Yes. I've done a little experimenting, and it looks like it does the same > escaping to everything before the fragment. If we just apply stuff to the > fragment only, I'm *much* more comfortable with this change, even if it has to > be iOS only. Note that this should be systematically checked (Or the code > should be read) to verify this. One other thing that I didn't make clear enough. If only the query string needs the extra logic, we should be using GURL::query() and the Replacements class, rather than creating new code to locate the query string.
On 2014/11/24 15:53:49, mmenke wrote: > On 2014/11/22 02:24:54, mmenke wrote: > > On 2014/11/22 02:11:36, erikchen wrote: > > > On 2014/11/22 01:59:04, mmenke wrote: > > > > > > > > > > https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... > > > > File net/base/mac/url_conversions_unittest.mm (right): > > > > > > > > > > > > > > https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... > > > > net/base/mac/url_conversions_unittest.mm:67: > > > > @"http://www.mysite.com/test?test#test?test", > > > > Oh...if the extra logic is only needed for fragments, something I hadn't > > > > considered, then should only apply it to the fragment portion of the URL. > > > > > > So the only open question is whether GURL.spec() already performs the same > > > conversion? > > > > Yes. I've done a little experimenting, and it looks like it does the same > > escaping to everything before the fragment. If we just apply stuff to the > > fragment only, I'm *much* more comfortable with this change, even if it has to > > be iOS only. Note that this should be systematically checked (Or the code > > should be read) to verify this. > > One other thing that I didn't make clear enough. If only the query string needs > the extra logic, we should be using GURL::query() and the Replacements class, > rather than creating new code to locate the query string. GURL behaves differently from the logic in url_conversions. The easiest way to see some sample differences is to run URLConversionTest.TestNSURLCreationFromStrings, with a modification to also test against GURL conversion. Differences I've spotted: - Handling of # in the fragment. - GURL doesn't escape '[' and ']', but the NSURL conversion does. - GURL handles 7F differently. GURL converts '\177' to "\x7F", whereas NSURL conversion does "%7F". I dove into the implementation of the canonicalization logic in GURL, and it is exceedingly complex. I do not think that we can effectively use the logic in GURL to replace the logic in this method. Instead, I'll move forward with replacing as much of the duplicated logic as possible with logic from net/base/escape.h. Given that we're not passing the string into GURL, that also means that we won't be able to restrict logic to the fragment of the URL. Sound reasonable?
On 2014/11/26 00:42:22, erikchen wrote: > On 2014/11/24 15:53:49, mmenke wrote: > > On 2014/11/22 02:24:54, mmenke wrote: > > > On 2014/11/22 02:11:36, erikchen wrote: > > > > On 2014/11/22 01:59:04, mmenke wrote: > > > > > > > > > > > > > > > https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... > > > > > File net/base/mac/url_conversions_unittest.mm (right): > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... > > > > > net/base/mac/url_conversions_unittest.mm:67: > > > > > @"http://www.mysite.com/test?test#test?test", > > > > > Oh...if the extra logic is only needed for fragments, something I hadn't > > > > > considered, then should only apply it to the fragment portion of the > URL. > > > > > > > > So the only open question is whether GURL.spec() already performs the same > > > > conversion? > > > > > > Yes. I've done a little experimenting, and it looks like it does the same > > > escaping to everything before the fragment. If we just apply stuff to the > > > fragment only, I'm *much* more comfortable with this change, even if it has > to > > > be iOS only. Note that this should be systematically checked (Or the code > > > should be read) to verify this. > > > > One other thing that I didn't make clear enough. If only the query string > needs > > the extra logic, we should be using GURL::query() and the Replacements class, > > rather than creating new code to locate the query string. > > GURL behaves differently from the logic in url_conversions. The easiest way to > see some sample differences is to run > URLConversionTest.TestNSURLCreationFromStrings, with a modification to also test > against GURL conversion. > > Differences I've spotted: > - Handling of # in the fragment. > - GURL doesn't escape '[' and ']', but the NSURL conversion does. > - GURL handles 7F differently. GURL converts '\177' to "\x7F", whereas NSURL > conversion does "%7F". > > I dove into the implementation of the canonicalization logic in GURL, and it is > exceedingly complex. I do not think that we can effectively use the logic in > GURL to replace the logic in this method. > > Instead, I'll move forward with replacing as much of the duplicated logic as > possible with logic from net/base/escape.h. Given that we're not passing the > string into GURL, that also means that we won't be able to restrict logic to > the fragment of the URL. Sound reasonable? You may not be passing the string into GURL, but you're getting the string *out* of a GURL, so GURL logic has already been applied. Duplicating string parsing logic is rarely a good idea. So you're getting a string out of GURL. It should already have UTF8 escaped, and everything your want escaped escaped except [, ], \x7F in the path/query string, and a whole bunch of characters in the query string. My suggestion: Make a new method in escape.cc that escapes all the characters you want ([, ], 0x7F, #, etc) and apply it to the path, query string, and every element of the fragment but the first character (Which is #, if it's non-empty). Use GURL::Replacements to do this. GURL won't unescape those characters. That makes the conversion about 5 lines of code. https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... File net/base/mac/url_conversions_unittest.mm (right): https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... net/base/mac/url_conversions_unittest.mm:175: "%7B%7C%7D~", Should have a test with an IPv6 literal. i.e. http://[::1]/path/ To make sure you aren't incorrectly escaping the brackets.
Patchset #3 (id:60001) has been deleted
Patchset #3 (id:80001) has been deleted
Patchset #3 (id:100001) has been deleted
Patchset #3 (id:120001) has been deleted
mmenke: PTAL I followed your suggestions, and the result is much cleaner. Thanks!
https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... File net/base/mac/url_conversions.mm (right): https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... net/base/mac/url_conversions.mm:41: #if defined(OS_IOS) On 2014/11/21 10:49:39, droger wrote: > You can remove this ifdef, and the include on line 20, because we no longer > support these older OSes. Done. https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... File net/base/mac/url_conversions_unittest.mm (right): https://codereview.chromium.org/747773002/diff/40001/net/base/mac/url_convers... net/base/mac/url_conversions_unittest.mm:175: "%7B%7C%7D~", On 2014/11/26 14:24:02, mmenke wrote: > Should have a test with an IPv6 literal. > > i.e. > > http://[::1]/path/ > > To make sure you aren't incorrectly escaping the brackets. Done.
LGTM, modulo concerns about leaks. https://codereview.chromium.org/747773002/diff/140001/net/base/escape.h File net/base/escape.h (right): https://codereview.chromium.org/747773002/diff/140001/net/base/escape.h#newco... net/base/escape.h:35: NET_EXPORT std::string EscapeNSURLPrecursor(const std::string& precursor); optional: Suggest surrounding this in #ifdef OS_MACOSX (That's define on iOS and OSX, right?). Not because compiling it elsewhere is the end of the world, but just to indicate there's no reason to use it eslewhere. https://codereview.chromium.org/747773002/diff/140001/net/base/mac/url_conver... File net/base/mac/url_conversions.mm (right): https://codereview.chromium.org/747773002/diff/140001/net/base/mac/url_conver... net/base/mac/url_conversions.mm:24: // GURL leaves these characters unencoded in the path, query, and ref. This Maybe "GURL leaves some of these characters..." https://codereview.chromium.org/747773002/diff/140001/net/base/mac/url_conver... net/base/mac/url_conversions.mm:47: return [NSURL URLWithString:escaped_url_string]; Returning a NSString from a scoped_nsobject doesn't result in returning a freed object? https://codereview.chromium.org/747773002/diff/140001/net/base/mac/url_conver... File net/base/mac/url_conversions_unittest.mm (right): https://codereview.chromium.org/747773002/diff/140001/net/base/mac/url_conver... net/base/mac/url_conversions_unittest.mm:214: NSURL* url = net::NSURLWithGURL(GURL(base::SysNSStringToUTF8(inputStr))); This looks like a leak to me... Should NSURLWithGURL be returning a scoped_nsobject?
https://codereview.chromium.org/747773002/diff/140001/net/base/mac/url_conver... File net/base/mac/url_conversions.mm (right): https://codereview.chromium.org/747773002/diff/140001/net/base/mac/url_conver... net/base/mac/url_conversions.mm:47: return [NSURL URLWithString:escaped_url_string]; On 2014/12/03 15:30:12, mmenke wrote: > Returning a NSString from a scoped_nsobject doesn't result in returning a freed > object? This code looks fine. In objective C, everything in reference counted, and scoped_nsobject is equivalent to scoped_refptr in C++. NSURL will keep a reference and prevent the string from being destroyed. https://codereview.chromium.org/747773002/diff/140001/net/base/mac/url_conver... File net/base/mac/url_conversions_unittest.mm (right): https://codereview.chromium.org/747773002/diff/140001/net/base/mac/url_conver... net/base/mac/url_conversions_unittest.mm:214: NSURL* url = net::NSURLWithGURL(GURL(base::SysNSStringToUTF8(inputStr))); On 2014/12/03 15:30:13, mmenke wrote: > This looks like a leak to me... Should NSURLWithGURL be returning a > scoped_nsobject? This is not a leak. In objective-C all methods are expected to return auto-released objects, unless the name of the method starts with "new" or "create". Thus NSURLWithGURL returns a autoreleased object, which means that the auto-release pool has a reference to it. Auto-released objects are in some sense temporary, because they will live until the autorelease pool is cleared, which happens when the control returns to the main thread loop.
https://codereview.chromium.org/747773002/diff/140001/net/base/escape.h File net/base/escape.h (right): https://codereview.chromium.org/747773002/diff/140001/net/base/escape.h#newco... net/base/escape.h:35: NET_EXPORT std::string EscapeNSURLPrecursor(const std::string& precursor); On 2014/12/03 15:30:12, mmenke wrote: > optional: Suggest surrounding this in #ifdef OS_MACOSX (That's define on iOS > and OSX, right?). Not because compiling it elsewhere is the end of the world, > but just to indicate there's no reason to use it eslewhere. Done. https://codereview.chromium.org/747773002/diff/140001/net/base/mac/url_conver... File net/base/mac/url_conversions.mm (right): https://codereview.chromium.org/747773002/diff/140001/net/base/mac/url_conver... net/base/mac/url_conversions.mm:24: // GURL leaves these characters unencoded in the path, query, and ref. This On 2014/12/03 15:30:13, mmenke wrote: > Maybe "GURL leaves some of these characters..." Done.
The CQ bit was checked by erikchen@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/747773002/160001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: android_dbg_tests_recipe on tryserver.chromium.linux (http://build.chromium.org/p/tryserver.chromium.linux/builders/android_dbg_tes...)
The CQ bit was checked by erikchen@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/747773002/160001
Message was sent while issue was closed.
Committed patchset #4 (id:160001)
Message was sent while issue was closed.
Patchset 4 (id:??) landed as https://crrev.com/006a7cfbba2e1c415c9b0d63767a73f14815e27e Cr-Commit-Position: refs/heads/master@{#306690} |