Side by Side Diff: components/url_formatter/url_formatter.h

Issue 1258813002: Implement a new IDN display policy (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master
Patch Set: url_canon test: wchar* needs a surrogate pair on *nix Created 4 years, 9 months ago
OLD | NEW
1 // Copyright 2015 The Chromium Authors. All rights reserved. 1 // Copyright 2015 The Chromium Authors. All rights reserved.
2 // Use of this source code is governed by a BSD-style license that can be 2 // Use of this source code is governed by a BSD-style license that can be
3 // found in the LICENSE file. 3 // found in the LICENSE file.
4 4
5 // url_formatter contains routines for formatting URLs in a way that can be 5 // url_formatter contains routines for formatting URLs in a way that can be
6 // safely and securely displayed to users. For example, it is responsible 6 // safely and securely displayed to users. For example, it is responsible
7 // for determining when to convert an IDN A-Label (e.g. "xn--[something]") 7 // for determining when to convert an IDN A-Label (e.g. "xn--[something]")
8 // into the IDN U-Label. 8 // into the IDN U-Label.
9 // 9 //
10 // Note that this formatting is only intended for display purposes; it would 10 // Note that this formatting is only intended for display purposes; it would
(...skipping 35 matching lines...)
46 // If the scheme is 'http://', it's removed. 46 // If the scheme is 'http://', it's removed.
47 extern const FormatUrlType kFormatUrlOmitHTTP; 47 extern const FormatUrlType kFormatUrlOmitHTTP;
48 48
49 // Omits the path if it is just a slash and there is no query or ref. This is 49 // Omits the path if it is just a slash and there is no query or ref. This is
50 // meaningful for non-file "standard" URLs. 50 // meaningful for non-file "standard" URLs.
51 extern const FormatUrlType kFormatUrlOmitTrailingSlashOnBareHostname; 51 extern const FormatUrlType kFormatUrlOmitTrailingSlashOnBareHostname;
52 52
53 // Convenience for omitting all unecessary types. 53 // Convenience for omitting all unecessary types.
54 extern const FormatUrlType kFormatUrlOmitAll; 54 extern const FormatUrlType kFormatUrlOmitAll;
55 55
56 // Creates a string representation of |url|. The IDN host name may be in Unicode 56 // Creates a string representation of |url|. The IDN host name is turned to
57 // if |languages| accepts the Unicode representation. |format_type| is a bitmask 57 // Unicode if the Unicode representation is deemed safe. |languages| is not
58 // used any more and will be removed. |format_type| is a bitmask
58 // of FormatUrlTypes, see it for details. |unescape_rules| defines how to clean 59 // of FormatUrlTypes, see it for details. |unescape_rules| defines how to clean
59 // the URL for human readability. You will generally want |UnescapeRule::SPACES| 60 // the URL for human readability. You will generally want |UnescapeRule::SPACES|
60 // for display to the user if you can handle spaces, or |UnescapeRule::NORMAL| 61 // for display to the user if you can handle spaces, or |UnescapeRule::NORMAL|
61 // if not. If the path part and the query part seem to be encoded in %-encoded 62 // if not. If the path part and the query part seem to be encoded in %-encoded
62 // UTF-8, decodes %-encoding and UTF-8. 63 // UTF-8, decodes %-encoding and UTF-8.
63 // 64 //
64 // The last three parameters may be NULL. 65 // The last three parameters may be NULL.
65 // 66 //
66 // |new_parsed| will be set to the parsing parameters of the resultant URL. 67 // |new_parsed| will be set to the parsing parameters of the resultant URL.
67 // 68 //
(...skipping 53 matching lines...)
121 inline base::string16 FormatUrl(const GURL& url, const std::string& languages) { 122 inline base::string16 FormatUrl(const GURL& url, const std::string& languages) {
122 return FormatUrl(url, languages, kFormatUrlOmitAll, net::UnescapeRule::SPACES, 123 return FormatUrl(url, languages, kFormatUrlOmitAll, net::UnescapeRule::SPACES,
123 nullptr, nullptr, nullptr); 124 nullptr, nullptr, nullptr);
124 } 125 }
125 126
126 // Returns whether FormatUrl() would strip a trailing slash from |url|, given a 127 // Returns whether FormatUrl() would strip a trailing slash from |url|, given a
127 // format flag including kFormatUrlOmitTrailingSlashOnBareHostname. 128 // format flag including kFormatUrlOmitTrailingSlashOnBareHostname.
128 bool CanStripTrailingSlash(const GURL& url); 129 bool CanStripTrailingSlash(const GURL& url);
129 130
130 // Formats the host in |url| and appends it to |output|. The host formatter 131 // Formats the host in |url| and appends it to |output|. The host formatter
131 // takes the same accept languages component as ElideURL(). 132 // takes the same accept languages component as ElideURL(), but it does not
133 // affect the result. It'll be removed.
132 void AppendFormattedHost(const GURL& url, 134 void AppendFormattedHost(const GURL& url,
133 const std::string& languages, 135 const std::string& languages,
134 base::string16* output); 136 base::string16* output);
135 137
136 // Converts the given host name to unicode characters. This can be called for 138 // Converts the given host name to unicode characters. This can be called for
137 // any host name, if the input is not IDN or is invalid in some way, we'll just 139 // any host name, if the input is not IDN or is invalid in some way, we'll just
138 // return the ASCII source so it is still usable. 140 // return the ASCII source so it is still usable.
139 // 141 //
140 // The input should be the canonicalized ASCII host name from GURL. This 142 // The input should be the canonicalized ASCII host name from GURL. This
141 // function does NOT accept UTF-8! 143 // function does NOT accept UTF-8!
142 // 144 // |languages| is not used any more and will be removed.
143 // |languages| is a comma separated list of ISO 639 language codes. It
144 // is used to determine whether a hostname is 'comprehensible' to a user
145 // who understands languages listed. |host| will be converted to a
146 // human-readable form (Unicode) ONLY when each component of |host| is
147 // regarded as 'comprehensible'. Scipt-mixing is not allowed except that
148 // Latin letters in the ASCII range can be mixed with a limited set of
149 // script-language pairs (currently Han, Kana and Hangul for zh,ja and ko).
150 // When |languages| is empty, even that mixing is not allowed.
151 base::string16 IDNToUnicode(const std::string& host, 145 base::string16 IDNToUnicode(const std::string& host,
152 const std::string& languages); 146 const std::string& languages);
153 147
154 // If |text| starts with "www." it is removed, otherwise |text| is returned 148 // If |text| starts with "www." it is removed, otherwise |text| is returned
155 // unmodified. 149 // unmodified.
156 base::string16 StripWWW(const base::string16& text); 150 base::string16 StripWWW(const base::string16& text);
157 151
158 // Runs |url|'s host through StripWWW(). |url| must be valid. 152 // Runs |url|'s host through StripWWW(). |url| must be valid.
159 base::string16 StripWWWFromHost(const GURL& url); 153 base::string16 StripWWWFromHost(const GURL& url);
160 154
161 } // namespace url_formatter 155 } // namespace url_formatter
162 156
163 #endif // COMPONENTS_URL_FORMATTER_URL_FORMATTER_H_ 157 #endif // COMPONENTS_URL_FORMATTER_URL_FORMATTER_H_
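
For readers without a Chromium checkout, the documented contract of StripWWW() ("If |text| starts with "www." it is removed, otherwise |text| is returned unmodified") can be illustrated with a minimal standalone sketch. This is not the Chromium implementation: the name StripWwwSketch is hypothetical, std::u16string stands in for base::string16, and the case-sensitive prefix match is an assumption, since the header comment does not say how case is handled.

```cpp
#include <string>
#include <string_view>

// Hypothetical stand-in for url_formatter::StripWWW(). It uses
// std::u16string in place of base::string16 so that it builds with any
// C++17 compiler and no Chromium dependencies.
std::u16string StripWwwSketch(std::u16string_view text) {
  constexpr std::u16string_view kPrefix = u"www.";
  // Assumption: the prefix comparison is case-sensitive. The header
  // comment does not specify this.
  if (text.substr(0, kPrefix.size()) == kPrefix)
    text.remove_prefix(kPrefix.size());
  // Inputs without the prefix are returned unmodified.
  return std::u16string(text);
}
```

Under these assumptions, StripWwwSketch(u"www.example.com") yields u"example.com", while u"example.com" and u"wwwexample.com" come back unchanged because neither starts with the full "www." prefix.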