Diff: components/omnibox/browser/url_index_private_data.h

Issue 2187343002: Generating autocomplete results with and without word breaks in the Omnibox. (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master
Patch Set: corrected formatting and improved comments. Also added unit test MatchWithAndWithoutCursorWordBreak
Created 4 years, 4 months ago
@@ -1,10 +1,10 @@
 // Copyright (c) 2012 The Chromium Authors. All rights reserved.
 // Use of this source code is governed by a BSD-style license that can be
 // found in the LICENSE file.
 
 #ifndef COMPONENTS_OMNIBOX_BROWSER_URL_INDEX_PRIVATE_DATA_H_
 #define COMPONENTS_OMNIBOX_BROWSER_URL_INDEX_PRIVATE_DATA_H_
 
 #include <stddef.h>
 
 #include <set>
(...skipping 32 matching lines...)
@@ -43,33 +43,32 @@
 // this class is for exclusive use by the InMemoryURLIndex class there should
 // be no calls from any other class.
 //
 // All public member functions are called on the main thread unless otherwise
 // annotated.
 class URLIndexPrivateData
     : public base::RefCountedThreadSafe<URLIndexPrivateData> {
  public:
   URLIndexPrivateData();
 
-  // Given a base::string16 in |term_string|, scans the history index and
-  // returns a vector with all scored, matching history items. The
-  // |term_string| is broken down into individual terms (words), each of which
-  // must occur in the candidate history item's URL or page title for the item
-  // to qualify; however, the terms do not necessarily have to be adjacent. We
-  // also allow breaking |term_string| at |cursor_position| (if
-  // set). Once we have a set of candidates, they are filtered to ensure
-  // that all |term_string| terms, as separated by whitespace and the
-  // cursor (if set), occur within the candidate's URL or page title.
-  // Scores are then calculated on no more than |kItemsToScoreLimit|
-  // candidates, as the scoring of such a large number of candidates may
-  // cause perceptible typing response delays in the omnibox. This is
-  // likely to occur for short omnibox terms such as 'h' and 'w' which
+  // Given a |term_string|, scans the history index and returns a vector with
+  // all scored, matching history items. The |term_string| is broken down into
+  // individual terms (words), each of which must occur in the candidate
+  // history item's URL or page title for the item to qualify; however, the
+  // terms do not necessarily have to be adjacent. We also allow breaking
+  // |term_string| at |cursor_position| (if set). Once we have a set of
+  // candidates, they are filtered to ensure that all |term_string| terms, as
+  // separated by whitespace and the cursor (if set), occur within the
+  // candidate's URL or page title. Scores are then calculated on no more than
+  // |kItemsToScoreLimit| candidates, as the scoring of such a large number of
+  // candidates may cause perceptible typing response delays in the omnibox.
+  // This is likely to occur for short omnibox terms such as 'h' and 'w' which
   // will be found in nearly all history candidates. Results are sorted by
   // descending score. The full results set (i.e. beyond the
   // |kItemsToScoreLimit| limit) will be retained and used for subsequent calls
   // to this function. In total, |max_matches| of items will be returned in the
   // |ScoredHistoryMatches| vector.
   ScoredHistoryMatches HistoryItemsForTerms(
       base::string16 term_string,
       size_t cursor_position,
       size_t max_matches,
       bookmarks::BookmarkModel* bookmark_model,
(...skipping 246 matching lines...)
@@ -322,20 +321,45 @@
   bool RestoreWordStartsMap(
       const in_memory_url_index::InMemoryURLIndexCacheItem& cache);
 
   // Determines if |gurl| has a whitelisted scheme and returns true if so.
   static bool URLSchemeIsWhitelisted(const GURL& gurl,
                                      const std::set<std::string>& whitelist);
 
   // Cache of search terms.
   SearchTermCacheMap search_term_cache_;
 
+  // Support functions for HistoryItemsForTerms
Mark P 2016/08/22 18:24:11 This whole block (comment and functions) should go
Lavar Askew 2016/08/23 20:38:20 Done.
+
+  // This function is responsible for splitting |search_string| into a vector
+  // of terms based on 'true' whitespace as opposed to escaped whitespace,
+  // e.g., When the user types "colspec=ID%20Mstone Release" we get two
+  // 'terms': "colspec=id%20mstone" and "release". Each URL in the users
Mark P 2016/08/22 18:24:11 nit: users -> user's
Lavar Askew 2016/08/23 20:38:20 Done.
+  // history is scored against the a union of the vector of terms. The scored
Mark P 2016/08/22 18:24:11 nit: omit "the a union of "
Lavar Askew 2016/08/23 20:38:19 Done.
+  // hisory matches are returned unordered.
Mark P 2016/08/22 18:24:11 This last sentence is wrong. They are returned or
Lavar Askew 2016/08/23 20:38:19 Done.
+  ScoredHistoryMatches GetScoredItemsForSearchString(
+      const base::string16& search_string,
+      const HistoryIDSet& history_id_set,
+      const size_t max_matches,
+      bookmarks::BookmarkModel* bookmark_model,
+      TemplateURLService* template_url_service);
+
+  // Returns a vector of individual words from |search_string|, in order of
+  // appearance, duplicates allowed.
+  String16Vector ExtractIndividualWordVector(
+      const base::string16& search_string);
+
+  // Given a |term_string|, scans the history index and returns the set of
Mark P 2016/08/22 18:24:11 nit: a |term_string| -> |lower_words|
Lavar Askew 2016/08/23 20:38:20 Done.
+  // history item IDs. Will return an emptyHistoryIDSet containing all the
Mark P 2016/08/22 18:24:11 nit: insert missing word "matching" before "histor
Lavar Askew 2016/08/23 20:38:20 Done.
+  // terms in |term_string|.
Mark P 2016/08/22 18:24:11 For the last sentence, did you mean: Will return a
Lavar Askew 2016/08/23 20:38:20 Done.
+  HistoryIDSet GetHistoryIDSet(const String16Vector& lower_words);
+
   // Start of data members that are cached -------------------------------------
 
   // The version of the cache file most recently used to restore this instance
   // of the private data. If the private data was rebuilt from the history
   // database this will be 0.
   int restored_cache_version_;
 
   // The last time the data was rebuilt from the history database.
   base::Time last_time_rebuilt_from_history_;
 
(...skipping 45 matching lines...)
@@ -387,10 +411,10 @@
   int saved_cache_version_;
 
   // Used for unit testing only. Records the number of candidate history items
   // at three stages in the index searching process.
   size_t pre_filter_item_count_;    // After word index is queried.
   size_t post_filter_item_count_;   // After trimming large result set.
   size_t post_scoring_item_count_;  // After performing final filter/scoring.
 };
 
 #endif  // COMPONENTS_OMNIBOX_BROWSER_URL_INDEX_PRIVATE_DATA_H_
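
Note on the term-splitting behavior described in the comments of this patch: the header comments say that the search string is split on 'true' whitespace (so "%20" stays inside its term), that terms are lower-cased, and that an extra break may be introduced at the cursor position. The sketch below is only an illustration of that description, not the code from this change. It uses std::string instead of base::string16, and the names SplitIntoLowerTerms and SplitWithCursorBreak are hypothetical helpers invented for this example.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not Chromium code): splits |input| on literal
// whitespace and lower-cases each term. Escaped whitespace such as "%20" is
// ordinary text here, so it stays inside its term.
std::vector<std::string> SplitIntoLowerTerms(const std::string& input) {
  std::vector<std::string> terms;
  std::istringstream stream(input);
  std::string term;
  while (stream >> term) {  // operator>> skips runs of whitespace.
    std::transform(term.begin(), term.end(), term.begin(), [](unsigned char c) {
      return static_cast<char>(std::tolower(c));
    });
    terms.push_back(term);
  }
  return terms;
}

// Hypothetical helper (not Chromium code): same as above, but additionally
// treats |cursor_position| as a word break when it falls strictly inside
// |input|. Positions at either end, or past the end, add no break.
std::vector<std::string> SplitWithCursorBreak(const std::string& input,
                                              std::size_t cursor_position) {
  std::string with_break = input;
  if (cursor_position > 0 && cursor_position < with_break.size())
    with_break.insert(cursor_position, " ");
  return SplitIntoLowerTerms(with_break);
}
```

With these assumptions, SplitIntoLowerTerms("colspec=ID%20Mstone Release") yields the two terms quoted in the header comment, and SplitWithCursorBreak("foobar", 3) yields "foo" and "bar", while a cursor at the end of the input leaves "foobar" as a single term.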