components/autofill/content/renderer/form_autofill_util.cc - Issue 1508293006: Check url path as well as document title to detect formless autofill page

Unified Diff: components/autofill/content/renderer/form_autofill_util.cc

Issue 1508293006: Check url path as well as document title to detect formless autofill page (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master

Patch Set: Add non-ASCII tests Created 5 years ago


Index: components/autofill/content/renderer/form_autofill_util.cc

diff --git a/components/autofill/content/renderer/form_autofill_util.cc b/components/autofill/content/renderer/form_autofill_util.cc

index 5c7f0d6102e281c6fa8f457f2e2e7e5acb38a5be..8c97c194801864ededeea2effd16e7b52353ebb2 100644

--- a/components/autofill/content/renderer/form_autofill_util.cc

+++ b/components/autofill/content/renderer/form_autofill_util.cc

@@ -1404,36 +1404,56 @@ bool UnownedCheckoutFormElementsAndFieldSetsToFormData(

FormFieldData* field) {

// Only attempt formless Autofill on checkout flows. This avoids the many

// false positives found on the non-checkout web. See

- // http://crbug.com/462375. For now this early abort only applies to

- // English-language pages, because the regex is not translated. Note that

- // an empty "lang" attribute counts as English. A potential problem is that

- // this only checks document.title(), but should actually check the main

- // frame's title. Thus it may make bad decisions for iframes.

+ // http://crbug.com/462375.

WebElement html_element = document.documentElement();

jungshik at Google 2015/12/16 21:14:04 Although not very common, there are documents with

+ // For now this restriction only applies to English-language pages, because

+ // the keywords are not translated. Note that an empty "lang" attribute

+ // counts as English.

jungshik at Google 2015/12/14 23:53:56 It'd be better to treat an empty 'lang' as in the cur

std::string lang;

if (!html_element.isNull())

lang = html_element.getAttribute("lang").utf8();

- if (lang.empty() ||

- base::StartsWith(lang, "en", base::CompareCase::INSENSITIVE_ASCII)) {

- std::string title(base::UTF16ToUTF8(base::string16(document.title())));

- const char* const kKeywords[] = {

- "payment",

- "checkout",

- "address",

- "delivery",

- "shipping",

- };

- bool found = false;

- for (const auto& keyword : kKeywords) {

- if (title.find(keyword) != base::string16::npos) {

- found = true;

- break;

- }

+ if (!lang.empty() &&

+ !base::StartsWith(lang, "en", base::CompareCase::INSENSITIVE_ASCII)) {

jungshik at Google 2015/12/14 23:53:57 This assumes that there is no language code (3-let
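
The comment above is truncated, so the following is only one possible reading of the concern: a plain prefix test on "en" also accepts longer primary subtags that merely begin with those letters (for example "enm", the ISO 639 code for Middle English). A minimal standalone sketch that compares the whole primary subtag instead; the helper name IsEnglishOrUnspecified is hypothetical and not part of this patch, and accepting '_' as a separator is an assumption made for leniency, not something the patch does:

```cpp
#include <cctype>
#include <string>

// Hypothetical helper, not Chromium code. Returns true when |lang| is empty
// (treated as English, as in the patch) or when its primary subtag is
// exactly "en", compared case-insensitively.
bool IsEnglishOrUnspecified(const std::string& lang) {
  if (lang.empty())
    return true;
  // The primary subtag is everything before the first separator. '-' is the
  // standard separator; '_' is accepted here only as an assumption.
  // find_first_of returns npos when no separator exists, and substr then
  // yields the whole string.
  const std::string primary = lang.substr(0, lang.find_first_of("-_"));
  return primary.size() == 2 &&
         std::tolower(static_cast<unsigned char>(primary[0])) == 'e' &&
         std::tolower(static_cast<unsigned char>(primary[1])) == 'n';
}
```

With this variant, "en", "EN-us" and the empty string count as English, while "enm" and "fr" do not.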

+ return UnownedFormElementsAndFieldSetsToFormData(

+ fieldsets, control_elements, element, document, extract_mask, form,

+ field);

+ }

+ // A potential problem is that this only checks document.title(), but should

+ // actually check the main frame's title. Thus it may make bad decisions for

+ // iframes.

+ base::string16 title(base::ToLowerASCII(base::string16(document.title())));

+ // Don't check the path for URLs without a standard format path component,

+ // such as data:.

+ std::string path;

+ GURL url(document.url());

+ if (url.IsStandard())

+ path = base::ToLowerASCII(url.path());

+ const char* const kKeywords[] = {

+ "payment",

+ "checkout",

+ "address",

+ "delivery",

+ "shipping",

+ };

+ bool found = false;

+ for (const auto& keyword : kKeywords) {

+ // Compare char16 elements of |title| with char elements of |keyword| using

+ // operator==.

+ auto title_pos = std::search(title.begin(), title.end(),

+ keyword, keyword + strlen(keyword));

+ if (title_pos != title.end() ||

+ path.find(keyword) != std::string::npos) {

+ found = true;

+ break;

}

- if (!found)

- return false;

}

+ if (!found)

+ return false;

return UnownedFormElementsAndFieldSetsToFormData(

Evan Stade 2016/01/08 21:53:16 seems like you can just stick this inside the loop
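
The comment above is truncated, so its exact suggestion is not known. As a rough illustration of the keyword check in this hunk, here is a standalone sketch that returns from inside the loop instead of tracking a |found| flag. It uses std::u16string in place of base::string16 and a hand-written ASCII lowercase helper in place of base::ToLowerASCII; ToLowerAscii and ContainsCheckoutKeyword are hypothetical names, not Chromium functions:

```cpp
#include <algorithm>
#include <cstring>
#include <string>

// Hypothetical stand-in for base::ToLowerASCII: lowercases only A-Z and
// leaves every other code unit (including non-ASCII) untouched.
template <typename Str>
Str ToLowerAscii(Str s) {
  for (auto& c : s) {
    if (c >= 'A' && c <= 'Z')
      c = static_cast<typename Str::value_type>(c - 'A' + 'a');
  }
  return s;
}

// Returns true if the title or the URL path contains a checkout keyword.
bool ContainsCheckoutKeyword(const std::u16string& title,
                             const std::string& path) {
  static const char* const kKeywords[] = {
      "payment", "checkout", "address", "delivery", "shipping",
  };
  const std::u16string lower_title = ToLowerAscii(title);
  const std::string lower_path = ToLowerAscii(path);
  for (const char* keyword : kKeywords) {
    // std::search compares char16_t elements of the title with char elements
    // of the keyword using operator==; this is safe because the keywords are
    // plain ASCII.
    auto title_pos = std::search(lower_title.begin(), lower_title.end(),
                                 keyword, keyword + std::strlen(keyword));
    if (title_pos != lower_title.end() ||
        lower_path.find(keyword) != std::string::npos) {
      return true;  // Early exit replaces the |found| flag.
    }
  }
  return false;
}
```

For example, ContainsCheckoutKeyword(u"Secure CHECKOUT", "/") and ContainsCheckoutKeyword(u"", "/cart/shipping") both return true, while a title of u"Home" with the path "/index.html" returns false.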

fieldsets, control_elements, element, document, extract_mask, form,