components/omnibox/browser/url_index_private_data.cc - Issue 2187343002: Generating autocomplete results with and without word breaks in the Omnibox.

Side by Side Diff: components/omnibox/browser/url_index_private_data.cc

Issue 2187343002: Generating autocomplete results with and without word breaks in the Omnibox. (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master

Patch Set: Created 4 years, 4 months ago

OLD	NEW
1 // Copyright (c) 2012 The Chromium Authors. All rights reserved.	1 // Copyright (c) 2012 The Chromium Authors. All rights reserved.

2 // Use of this source code is governed by a BSD-style license that can be	2 // Use of this source code is governed by a BSD-style license that can be

3 // found in the LICENSE file.	3 // found in the LICENSE file.

4	4

5 #include "components/omnibox/browser/url_index_private_data.h"	5 #include "components/omnibox/browser/url_index_private_data.h"

6	6

7 #include <stdint.h>	7 #include <stdint.h>

8	8

9 #include <functional>	9 #include <functional>

10 #include <iterator>	10 #include <iterator>

(...skipping 143 matching lines...)
154	154

155 ScoredHistoryMatches URLIndexPrivateData::HistoryItemsForTerms(	155 ScoredHistoryMatches URLIndexPrivateData::HistoryItemsForTerms(

156 base::string16 search_string,	156 base::string16 search_string,

157 size_t cursor_position,	157 size_t cursor_position,

158 size_t max_matches,	158 size_t max_matches,

159 bookmarks::BookmarkModel* bookmark_model,	159 bookmarks::BookmarkModel* bookmark_model,

160 TemplateURLService* template_url_service) {	160 TemplateURLService* template_url_service) {

161 // If cursor position is set and useful (not at either end of the	161 // If cursor position is set and useful (not at either end of the

162 // string), allow the search string to be broken at cursor position.	162 // string), allow the search string to be broken at cursor position.

163 // We do this by pretending there's a space where the cursor is.	163 // We do this by pretending there's a space where the cursor is.

	164 // Furthermore, we also keep a copy of the original search string

	165 // without the pretend space. The rationale behind this is to

	166 // build the History ID Set with both the search string with and

	167 // without the pretend space.

	168 base::string16 search_string_without_break;
	Peter Kasting 2016/07/28 20:32:20
	It would be nice to avoid the need for this temp and keep all the added logic here together. One way might be: move the conditional that was here down to where you added your other block, so we unconditionally search for the original input string. Then combine this block and the other new block, so that we check if we should insert a space, and if so, go ahead and look in the history DB for its words and merge in the results.
	Lavar Askew 2016/08/10 15:24:03
	Done.
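A rough sketch of the restructuring suggested in the comment above, not the code that eventually landed. `IdSet`, `Lookup` and `CandidateIds` are hypothetical stand-ins for `HistoryIDSet`, the lower-case/unescape/split/`HistoryIDSetFromWords()` pipeline, and the relevant part of `HistoryItemsForTerms()`. The original string is always searched, and the pretend-space variant is looked up and merged only when the cursor is strictly inside the string, so no `search_string_without_break` temp is needed.

```cpp
#include <cstddef>
#include <functional>
#include <set>
#include <string>

// Hypothetical stand-ins for HistoryIDSet and the word-lookup pipeline.
using IdSet = std::set<int>;
using Lookup = std::function<IdSet(const std::u16string&)>;

IdSet CandidateIds(const std::u16string& search_string,
                   std::size_t cursor_position,
                   const Lookup& lookup) {
  // Always search for the input exactly as the user typed it.
  IdSet ids = lookup(search_string);

  // If the cursor is set and not at either end of the string, also search
  // with a pretend space at the cursor and merge those results in.
  if (cursor_position != std::u16string::npos &&
      cursor_position > 0 &&
      cursor_position < search_string.length()) {
    std::u16string with_break = search_string;
    with_break.insert(cursor_position, u" ");
    const IdSet extra = lookup(with_break);
    ids.insert(extra.begin(), extra.end());
  }
  return ids;
}
```

Because `std::set::insert` ignores duplicates, ids found by both lookups appear once in the merged set.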
164 if ((cursor_position != base::string16::npos) &&	169 if ((cursor_position != base::string16::npos) &&

165 (cursor_position < search_string.length()) &&	170 (cursor_position < search_string.length()) &&

166 (cursor_position > 0)) {	171 (cursor_position > 0)) {

167 search_string.insert(cursor_position, base::ASCIIToUTF16(" "));	172

	173 base::string16 blank_space =

	174 base::ASCIIToUTF16(" ");

	175

	176 if ((search_string.compare(cursor_position, 1, blank_space) != 0) ||

	177 (search_string.compare(cursor_position - 1, 1, blank_space) != 0) ||
	Peter Kasting 2016/07/28 20:32:20
	I suspect that rather than comparing against " " specifically, you want to check if this is somewhere ICU would line-break, to match how String16VectorFromString16() splits into words. The dumbest way to do this would be to simply call String16VectorFromString16() twice, once with the space and once without, and throw away one set if it's the same as the other. There may be a better way, though. I would check with mpearson and jshin.
	Lavar Askew 2016/08/10 15:24:03
	Done. I have removed this code after following your "copy-and-pastes" refactoring suggestion below.
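A minimal sketch of the "split twice and discard duplicates" idea from the comment above. It assumes a stand-in splitter, `SplitOnSpaces`, that breaks only on ASCII spaces. The real `String16VectorFromString16()` follows ICU break rules, so the two would disagree on many inputs. `Words` and `WordSetsToSearch` are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Words = std::vector<std::u16string>;

// Stand-in splitter (assumption): breaks on ASCII spaces only, unlike the
// ICU-based String16VectorFromString16() in the real code.
Words SplitOnSpaces(const std::u16string& s) {
  Words words;
  std::u16string current;
  for (char16_t c : s) {
    if (c == u' ') {
      if (!current.empty()) {
        words.push_back(current);
        current.clear();
      }
    } else {
      current.push_back(c);
    }
  }
  if (!current.empty())
    words.push_back(current);
  return words;
}

// Returns the word lists worth looking up: always the split of the original
// string, plus the split with a pretend space at the cursor if it differs.
std::vector<Words> WordSetsToSearch(const std::u16string& search_string,
                                    std::size_t cursor_position) {
  std::vector<Words> result;
  result.push_back(SplitOnSpaces(search_string));
  if (cursor_position != std::u16string::npos &&
      cursor_position > 0 &&
      cursor_position < search_string.length()) {
    std::u16string with_break = search_string;
    with_break.insert(cursor_position, u" ");
    Words broken = SplitOnSpaces(with_break);
    // Inserting a space where the splitter already breaks changes nothing,
    // so the second list is thrown away in that case.
    if (broken != result.front())
      result.push_back(std::move(broken));
  }
  return result;
}
```

With this splitter, a cursor in the middle of `foobar` yields two word lists, while a cursor next to the existing space in `foo bar` yields one.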
	178 (search_string.compare(cursor_position + 1, 1, blank_space) != 0)) {
	Peter Kasting 2016/07/28 20:32:20
	I think you don't want to compare cursor_position + 1. That would be the next character after whatever is to the right of the cursor.
	Lavar Askew 2016/08/10 15:24:03
	Done. I have removed this code.
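To illustrate the indexing point in the comment above: a cursor at position `p` sits between `s[p - 1]` and `s[p]`, so those are the only two characters adjacent to it. The helper below is a hypothetical illustration, not something from the patch, and it checks for a literal space only (the ICU concern raised earlier still applies).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the patch. A cursor at position p sits
// between s[p - 1] (to its left) and s[p] (to its right); s[p + 1] is one
// character further right and is not adjacent to the cursor.
bool CursorTouchesSpace(const std::u16string& s, std::size_t cursor_position) {
  // Same precondition as the patch: the cursor is strictly inside the string.
  assert(cursor_position > 0 && cursor_position < s.length());
  return s[cursor_position - 1] == u' ' || s[cursor_position] == u' ';
}
```

For `foo bar`, a cursor at position 2 has `o` on both sides. The space is at index 3, which is `cursor_position + 1` and is not adjacent to the cursor.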
	179

	180 search_string_without_break = search_string;

	181

	182 search_string.insert(cursor_position, blank_space);

	183 }

168 }	184 }

169 pre_filter_item_count_ = 0;	185 pre_filter_item_count_ = 0;

170 post_filter_item_count_ = 0;	186 post_filter_item_count_ = 0;

171 post_scoring_item_count_ = 0;	187 post_scoring_item_count_ = 0;

172 // The search string we receive may contain escaped characters. For reducing	188 // The search string we receive may contain escaped characters. For reducing

173 // the index we need individual, lower-cased words, ignoring escapings. For	189 // the index we need individual, lower-cased words, ignoring escapings. For

174 // the final filtering we need whitespace separated substrings possibly	190 // the final filtering we need whitespace separated substrings possibly

175 // containing escaped characters.	191 // containing escaped characters.

176 base::string16 lower_raw_string(base::i18n::ToLower(search_string));	192 base::string16 lower_raw_string(base::i18n::ToLower(search_string));

177 base::string16 lower_unescaped_string =	193 base::string16 lower_unescaped_string =

(...skipping 13 matching lines...)
191 search_term_cache_.clear(); // Invalidate the term cache.	207 search_term_cache_.clear(); // Invalidate the term cache.

192 return scored_items;	208 return scored_items;

193 }	209 }

194	210

195 // Reset used_ flags for search_term_cache_. We use a basic mark-and-sweep	211 // Reset used_ flags for search_term_cache_. We use a basic mark-and-sweep

196 // approach.	212 // approach.

197 ResetSearchTermCache();	213 ResetSearchTermCache();

198	214

199 HistoryIDSet history_id_set = HistoryIDSetFromWords(lower_words);	215 HistoryIDSet history_id_set = HistoryIDSetFromWords(lower_words);

200	216

	217 // Add to history_id_set the ids that are related to the original search

	218 // string without the break.

	219 if (!search_string_without_break.empty() &&

	220 search_string.compare(search_string_without_break) != 0) {
	Peter Kasting 2016/07/28 20:32:20
	It looks to me as if this copy-and-pastes some of the code above, and perhaps the duplicated parts should be pulled out into a helper function instead.
	Lavar Askew 2016/08/10 15:24:03
	Done.
	221 base::string16 lower_raw_string_without_break(

	222 base::i18n::ToLower(search_string_without_break));

	223

	224 base::string16 lower_unescaped_string_without_break =

	225 net::UnescapeURLComponent(lower_raw_string_without_break,

	226 net::UnescapeRule::SPACES | net::UnescapeRule::PATH_SEPARATORS |

	227 net::UnescapeRule::URL_SPECIAL_CHARS_EXCEPT_PATH_SEPARATORS);

	228

	229 String16Vector lower_words_without_break(

	230 String16VectorFromString16(

	231 lower_unescaped_string_without_break, false, nullptr));

	232

	233 HistoryIDSet history_id_set_without_break =

	234 HistoryIDSetFromWords(lower_words_without_break);

	235

	236 if (history_id_set_without_break.size() > 0) {

	237 history_id_set.insert(history_id_set_without_break.begin(),

	238 history_id_set_without_break.end());

	239 }

	240 }

201 // Trim the candidate pool if it is large. Note that we do not filter out	241 // Trim the candidate pool if it is large. Note that we do not filter out

202 // items that do not contain the search terms as proper substrings -- doing	242 // items that do not contain the search terms as proper substrings -- doing

203 // so is the performance-costly operation we are trying to avoid in order	243 // so is the performance-costly operation we are trying to avoid in order

204 // to maintain omnibox responsiveness.	244 // to maintain omnibox responsiveness.

205 const size_t kItemsToScoreLimit = 500;	245 const size_t kItemsToScoreLimit = 500;

206 pre_filter_item_count_ = history_id_set.size();	246 pre_filter_item_count_ = history_id_set.size();

207 // If we trim the results set we do not want to cache the results for next	247 // If we trim the results set we do not want to cache the results for next

208 // time as the user's ultimately desired result could easily be eliminated	248 // time as the user's ultimately desired result could easily be eliminated

209 // in this early rough filter.	249 // in this early rough filter.

210 bool was_trimmed = (pre_filter_item_count_ > kItemsToScoreLimit);	250 bool was_trimmed = (pre_filter_item_count_ > kItemsToScoreLimit);

(...skipping 1143 matching lines...)
1354 // First cut: typed count, visit count, recency.	1394 // First cut: typed count, visit count, recency.

1355 // TODO(mrossetti): This is too simplistic. Consider an approach which ranks	1395 // TODO(mrossetti): This is too simplistic. Consider an approach which ranks

1356 // recently visited (within the last 12/24 hours) as highly important. Get	1396 // recently visited (within the last 12/24 hours) as highly important. Get

1357 // input from mpearson.	1397 // input from mpearson.

1358 if (r1.typed_count() != r2.typed_count())	1398 if (r1.typed_count() != r2.typed_count())

1359 return (r1.typed_count() > r2.typed_count());	1399 return (r1.typed_count() > r2.typed_count());

1360 if (r1.visit_count() != r2.visit_count())	1400 if (r1.visit_count() != r2.visit_count())

1361 return (r1.visit_count() > r2.visit_count());	1401 return (r1.visit_count() > r2.visit_count());

1362 return (r1.last_visit() > r2.last_visit());	1402 return (r1.last_visit() > r2.last_visit());

1363 }	1403 }
