Chromium Code Reviews

Unified Diff: components/url_formatter/idn_spoof_checker.h

Issue 2784933002: Mitigate spoofing attempt using Latin letters. (Closed)
Patch Set: Delete two more accidentally added files Created 3 years, 7 months ago
Index: components/url_formatter/idn_spoof_checker.h
diff --git a/components/url_formatter/idn_spoof_checker.h b/components/url_formatter/idn_spoof_checker.h
index 41eafc637b1ce2847df644f26768fbbfa687b20a..41b46fcd09bcaadc5d64dd31d126d2d56e2c058f 100644
--- a/components/url_formatter/idn_spoof_checker.h
+++ b/components/url_formatter/idn_spoof_checker.h
@@ -17,6 +17,7 @@
// 'icu' does not work. Use U_ICU_NAMESPACE.
namespace U_ICU_NAMESPACE {
+class Transliterator;
class UnicodeString;
} // namespace U_ICU_NAMESPACE
@@ -40,19 +41,35 @@ class IDNSpoofChecker {
// See the function body for details on the specific safety checks performed.
bool SafeToDisplayAsUnicode(base::StringPiece16 label, bool is_tld_ascii);
+ // Returns true if |hostname| or the last few components of |hostname| looks
+ // similar to one of top domains listed in top_domains/alexa_domains.list. Two
+ // checks are done:
+ // 1. Calculate the skeleton of |hostname| based on the Unicode confusable
+ // character list and look it up in the pre-calculated skeleton list of
+ // top domains.
+ // 2. Look up the diacritic-free version of |hostname| in the list of
+ // top domains. Note that non-IDN hostnames will not get here.
+ bool SimilarToTopDomains(base::StringPiece16 hostname);
+
private:
// Sets allowed characters in IDN labels and turns on USPOOF_CHAR_LIMIT.
void SetAllowedUnicodeSet(UErrorCode* status);
// Returns true if all the Cyrillic letters in |label| belong to a set of
Peter Kasting 2017/05/15 18:55:31 Nit: I would put a blank line above this comment a
jungshik at Google 2017/05/17 23:11:04 Done.
// Cyrillic letters that look like ASCII Latin letters.
bool IsMadeOfLatinAlikeCyrillic(const icu::UnicodeString& label);
+ // Returns true if the confusability skeleton for |hostname| is calculated
+ // successfully and stored in |skeleton|.
Peter Kasting 2017/05/15 18:55:31 Nit: This comment sounds like the function just ch
jungshik at Google 2017/05/17 23:11:04 ooops. Yeah, it's gone a few PS's ago. Thank you f
+ bool GetSkeleton(base::StringPiece16 hostname, std::string* skeleton);
USpoofChecker* checker_;
icu::UnicodeSet deviation_characters_;
icu::UnicodeSet non_ascii_latin_letters_;
icu::UnicodeSet kana_letters_exceptions_;
+ icu::UnicodeSet combining_diacritics_exceptions_;
icu::UnicodeSet cyrillic_letters_;
icu::UnicodeSet cyrillic_letters_latin_alike_;
+ icu::UnicodeSet lgc_letters_n_ascii_;
+ std::unique_ptr<icu::Transliterator> transliterator_;
IDNSpoofChecker(const IDNSpoofChecker&) = delete;
void operator=(const IDNSpoofChecker&) = delete;
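
Note on the two checks described in the SimilarToTopDomains() comment: the sketch below illustrates the idea only and is not the code from this patch. It assumes a tiny hand-written lookalike table and diacritic table in place of ICU's confusables data and Transliterator, and it compares the whole hostname rather than "the last few components". The names ToySkeleton, ToyStripDiacritics, and SimilarToTopDomainsSketch are hypothetical.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical toy stand-in for the Unicode confusables data. The real
// skeleton algorithm uses a much larger table; this maps a few lookalikes.
std::u32string ToySkeleton(const std::u32string& s) {
  static const std::unordered_map<char32_t, char32_t> kLookalikes = {
      {U'\u0430', U'a'},  // Cyrillic small a
      {U'\u0435', U'e'},  // Cyrillic small ie
      {U'\u043E', U'o'},  // Cyrillic small o
      {U'\u0440', U'p'},  // Cyrillic small er
      {U'\u0441', U'c'},  // Cyrillic small es
  };
  std::u32string out;
  for (char32_t c : s) {
    auto it = kLookalikes.find(c);
    out.push_back(it == kLookalikes.end() ? c : it->second);
  }
  return out;
}

// Hypothetical toy diacritic removal: drops combining marks in
// U+0300..U+036F and maps a few precomposed letters to their base letter.
std::u32string ToyStripDiacritics(const std::u32string& s) {
  static const std::unordered_map<char32_t, char32_t> kBase = {
      {U'\u00E9', U'e'},  // e with acute
      {U'\u00F6', U'o'},  // o with diaeresis
      {U'\u00FC', U'u'},  // u with diaeresis
  };
  std::u32string out;
  for (char32_t c : s) {
    if (c >= 0x0300 && c <= 0x036F)
      continue;  // skip combining diacritical marks
    auto it = kBase.find(c);
    out.push_back(it == kBase.end() ? c : it->second);
  }
  return out;
}

// Check 1: compare the hostname's skeleton with each top domain's skeleton.
// Check 2: look up the diacritic-free hostname in the top domain list.
// Callers are assumed to pass IDN hostnames only, as the comment notes.
bool SimilarToTopDomainsSketch(
    const std::u32string& hostname,
    const std::unordered_set<std::u32string>& top_domains) {
  const std::u32string skeleton = ToySkeleton(hostname);
  for (const std::u32string& domain : top_domains) {
    if (ToySkeleton(domain) == skeleton)
      return true;
  }
  return top_domains.count(ToyStripDiacritics(hostname)) > 0;
}
```

The sketch recomputes the top-domain skeletons on every call for brevity; the header comment describes a pre-calculated skeleton list, which avoids that cost.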