Index: url/url_canon.h |
diff --git a/url/url_canon.h b/url/url_canon.h |
index 95d53453f64362efc98d8141c11691573eaae165..2fa145337081f4796d0d8a87bb4f81b67977847b 100644 |
--- a/url/url_canon.h |
+++ b/url/url_canon.h |
@@ -379,6 +379,31 @@ URL_EXPORT void CanonicalizeHostVerbose(const base::char16* spec, |
CanonOutput* output, |
CanonHostInfo* host_info); |
+// Canonicalizes a string according to the host canonicalization rules. Unlike |
+// CanonicalizeHost, this will not check for IP addresses which can change the |
+// meaning (and canonicalization) of the components. This means it is possible |
+// to call this for sub-components of a host name without corruption. |
+// |
+// As an example, "01.02.03.04.com" is a canonical hostname. If you called |
+// CanonicalizeHost on the substring "01.02.03.04" it will get "fixed" to |
+// "1.2.3.4" which will produce an invalid host name when reassembled. This |
+// is more common because all numbers by themselves are considered IP |
Peter Kasting
2016/10/22 05:04:20
Nit: "more common" than what? More common than on
Peter Kasting
2016/10/25 01:33:32
(Still confused about this)
brettw
2016/10/25 20:28:17
Done.
|
+// addresses, so "5" canonicalizes to "0.0.0.5". |
+// |
+// Be careful, unless the input is guaranteed ASCII, it's not possible to split |
Peter Kasting
2016/10/22 05:04:20
Nit: first comma -> semicolon
|
+// host names on anything but "." because Punycode works on each dot-separated |
+// substring as a unit. |
Peter Kasting
2016/10/22 05:04:20
Nit: Maybe this paragraph should be more direct: "
brettw
2016/10/24 21:45:24
I mostly adopted your suggested wording, with some additions.
|
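The hazard described in the header comment above can be sketched with a small, self-contained toy model. This is not the url_canon API: `ToyCanonicalizeHost`, `ToyCanonicalizeHostSubstring`, `ToyParseIPv4`, and `SplitOnDots` are hypothetical stand-ins that model only ASCII lowercasing and the IPv4 rewrite of all-numeric input. The IPv4 parsing is deliberately simplified (decimal components only, with the last component filling the remaining low-order bytes), so treat it as an illustration of the comment's two examples rather than of the real canonicalizer.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Splits a host on '.', the only split point the comment describes as safe.
std::vector<std::string> SplitOnDots(const std::string& host) {
  std::vector<std::string> parts;
  std::string current;
  for (char c : host) {
    if (c == '.') {
      parts.push_back(current);
      current.clear();
    } else {
      current.push_back(c);
    }
  }
  parts.push_back(current);
  return parts;
}

// Hypothetical substring canonicalizer: lowercases ASCII letters and never
// looks for IP addresses.
std::string ToyCanonicalizeHostSubstring(const std::string& s) {
  std::string out;
  for (char c : s)
    out.push_back(c >= 'A' && c <= 'Z' ? static_cast<char>(c - 'A' + 'a') : c);
  return out;
}

// Hypothetical, simplified IPv4 parse (assumption: decimal only, one to four
// components, last component fills the remaining low-order bytes).
std::optional<std::string> ToyParseIPv4(const std::string& s) {
  const std::vector<std::string> parts = SplitOnDots(s);
  if (parts.size() > 4) return std::nullopt;
  std::uint64_t value = 0;
  for (std::size_t i = 0; i < parts.size(); ++i) {
    const std::string& p = parts[i];
    if (p.empty() || p.size() > 10) return std::nullopt;
    std::uint64_t n = 0;
    for (char c : p) {
      if (c < '0' || c > '9') return std::nullopt;
      n = n * 10 + static_cast<std::uint64_t>(c - '0');
    }
    const bool last = (i + 1 == parts.size());
    const int bytes = last ? 4 - static_cast<int>(i) : 1;
    const std::uint64_t max = (1ULL << (8 * bytes)) - 1;
    if (n > max) return std::nullopt;
    value = (value << (8 * bytes)) | n;
  }
  std::string out;
  for (int shift = 24; shift >= 0; shift -= 8) {
    out += std::to_string((value >> shift) & 0xFF);
    if (shift > 0) out.push_back('.');
  }
  return out;
}

// Hypothetical full-host canonicalizer: rewrites anything that parses as IPv4,
// otherwise falls back to the substring rules.
std::string ToyCanonicalizeHost(const std::string& s) {
  if (std::optional<std::string> ip = ToyParseIPv4(s)) return *ip;
  return ToyCanonicalizeHostSubstring(s);
}
```

In this toy model, `ToyCanonicalizeHost("5")` yields `"0.0.0.5"` and `ToyCanonicalizeHost("01.02.03.04")` yields `"1.2.3.4"`, so reassembling with `".com"` gives `"1.2.3.4.com"` instead of the original `"01.02.03.04.com"`. Running the same pieces through `ToyCanonicalizeHostSubstring` leaves the digits untouched.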
+// |
+// Returns true if the host was valid. This function will treat a 0-length |
+// host as valid (because it's designed to be used for substrings) while the |
+// full version above will mark empty hosts as broken. |
+URL_EXPORT bool CanonicalizeHostSubstring(const char* spec, |
+ const Component& host, |
+ CanonOutput* output); |
+URL_EXPORT bool CanonicalizeHostSubstring(const base::char16* spec, |
+ const Component& host, |
+ CanonOutput* output); |
+ |
// IP addresses. |
// |
// Tries to interpret the given host name as an IPv4 or IPv6 address. If it is |