Chromium Code Reviews

Unified Diff: components/url_formatter/top_domains/README

Issue 2784933002: Mitigate spoofing attempt using Latin letters. (Closed)
Patch Set: add similarity check unittests Created 3 years, 8 months ago
Index: components/url_formatter/top_domains/README
diff --git a/components/url_formatter/top_domains/README b/components/url_formatter/top_domains/README
new file mode 100644
index 0000000000000000000000000000000000000000..804f4e722899ff2b9db9674c9fc51c82f3902485
--- /dev/null
+++ b/components/url_formatter/top_domains/README
@@ -0,0 +1,23 @@
+* alexa_10k_domains.list
+ It is an input to make_top_domain_list and consists of a list of the Alexa
+ top 10k domains (one per line).
+ It's derived from
+ src/tools/perf/page_sets/alexa1-10000-urls.json by running the following:
ncarter (slow) 2017/04/20 22:26:59 IIRC the alexa10000 from page_sets was almost five
+
+ grep http ../../../tools/perf/page_sets/alexa1-10000-urls.json | \
+ sed -r -e 's;^.*"https?://(.*)/".*$;\1;' -e 's/www\.//' | \
+ awk 'BEGIN {FS="."} { printf("%s%s\n", NF > 3 ? "#" : "", $0); } \
+ END {printf ("# for testing\ndigklmo68.com\ndigklmo68.co.uk\n");}' > \
ncarter (slow) 2017/04/20 22:26:58 This would probably be better as a python script,
+ alexa_10k_domains.list
+
+* alexa_10k_names_and_skeletons.gperf
+
+ It is generated by running make_top_domain_list and checked in.
+ No command-line arguments are needed.
+
+ $ ninja -C $build_outdir make_top_domain_list
+ $ $build_outdir/make_top_domain_list
+
+ During a build, it is processed by base/dafsa/make_dafsa.py to generate
+ alexa_10k_names_and_skeletons-inc.cc that is included by
+ components/url_formatter/url_formatter.cc
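
Following the review suggestion that the grep/sed/awk pipeline might be better as a Python script, here is a minimal sketch of what an equivalent could look like. It is not part of the patch. The function name `domains_from_lines` is hypothetical, and the code assumes only what the shell pipeline itself implies about the input: each URL sits on its own line in the form "http(s)://host/".

```python
import re

# Same pattern as the first sed expression: keep only the host part of a
# quoted URL that ends in "/".
URL_RE = re.compile(r'^.*"https?://(.*)/".*$')


def domains_from_lines(lines):
    """Mirror the README's grep | sed | awk pipeline (hypothetical helper)."""
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if "http" not in line:  # grep http
            continue
        line = URL_RE.sub(r"\1", line)  # sed: extract the host
        # sed 's/www\.//' removes only the first "www." it finds,
        # wherever that occurs in the line.
        line = line.replace("www.", "", 1)
        # awk with FS=".": prefix "#" when there are more than 3 labels.
        prefix = "#" if len(line.split(".")) > 3 else ""
        out.append(prefix + line)
    # awk END block: append the test-only entries.
    out += ["# for testing", "digklmo68.com", "digklmo68.co.uk"]
    return out
```

A caller would read tools/perf/page_sets/alexa1-10000-urls.json line by line, pass the lines to the function, and write the returned entries, one per line, to alexa_10k_domains.list. The sketch deliberately keeps the pipeline's behavior as is, including the unanchored "www." removal, so that its output should match the shell version for the same input.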
