Chromium Code Reviews

Side by Side Diff: components/omnibox/browser/url_index_private_data.h

Issue 1841653003: Drop |languages| from {Format,Elide}Url* and IDNToUnicode (Closed)
Base URL: https://chromium.googlesource.com/chromium/src.git@master
Patch Set: fix typo in elide_url.cc
Created 4 years, 8 months ago
OLD | NEW
1 // Copyright (c) 2012 The Chromium Authors. All rights reserved. 1 // Copyright (c) 2012 The Chromium Authors. All rights reserved.
2 // Use of this source code is governed by a BSD-style license that can be 2 // Use of this source code is governed by a BSD-style license that can be
3 // found in the LICENSE file. 3 // found in the LICENSE file.
4 4
5 #ifndef COMPONENTS_OMNIBOX_BROWSER_URL_INDEX_PRIVATE_DATA_H_ 5 #ifndef COMPONENTS_OMNIBOX_BROWSER_URL_INDEX_PRIVATE_DATA_H_
6 #define COMPONENTS_OMNIBOX_BROWSER_URL_INDEX_PRIVATE_DATA_H_ 6 #define COMPONENTS_OMNIBOX_BROWSER_URL_INDEX_PRIVATE_DATA_H_
7 7
8 #include <stddef.h> 8 #include <stddef.h>
9 9
10 #include <set> 10 #include <set>
(...skipping 48 matching lines...)
59 // set). Once we have a set of candidates, they are filtered to ensure 59 // set). Once we have a set of candidates, they are filtered to ensure
60 // that all |term_string| terms, as separated by whitespace and the 60 // that all |term_string| terms, as separated by whitespace and the
61 // cursor (if set), occur within the candidate's URL or page title. 61 // cursor (if set), occur within the candidate's URL or page title.
62 // Scores are then calculated on no more than |kItemsToScoreLimit| 62 // Scores are then calculated on no more than |kItemsToScoreLimit|
63 // candidates, as the scoring of such a large number of candidates may 63 // candidates, as the scoring of such a large number of candidates may
64 // cause perceptible typing response delays in the omnibox. This is 64 // cause perceptible typing response delays in the omnibox. This is
65 // likely to occur for short omnibox terms such as 'h' and 'w' which 65 // likely to occur for short omnibox terms such as 'h' and 'w' which
66 // will be found in nearly all history candidates. Results are sorted by 66 // will be found in nearly all history candidates. Results are sorted by
67 // descending score. The full results set (i.e. beyond the 67 // descending score. The full results set (i.e. beyond the
68 // |kItemsToScoreLimit| limit) will be retained and used for subsequent calls 68 // |kItemsToScoreLimit| limit) will be retained and used for subsequent calls
69 // to this function. |languages| is used to help parse/format the URLs in the 69 // to this function. In total, |max_matches| of items will be returned in the
70 // history index. In total, |max_matches| of items will be returned in the
71 // |ScoredHistoryMatches| vector. 70 // |ScoredHistoryMatches| vector.
72 ScoredHistoryMatches HistoryItemsForTerms( 71 ScoredHistoryMatches HistoryItemsForTerms(
73 base::string16 term_string, 72 base::string16 term_string,
74 size_t cursor_position, 73 size_t cursor_position,
75 size_t max_matches, 74 size_t max_matches,
76 const std::string& languages,
77 bookmarks::BookmarkModel* bookmark_model, 75 bookmarks::BookmarkModel* bookmark_model,
78 TemplateURLService* template_url_service); 76 TemplateURLService* template_url_service);
79 77
80 // Adds the history item in |row| to the index if it does not already already 78 // Adds the history item in |row| to the index if it does not already already
81 // exist and it meets the minimum 'quick' criteria. If the row already exists 79 // exist and it meets the minimum 'quick' criteria. If the row already exists
82 // in the index then the index will be updated if the row still meets the 80 // in the index then the index will be updated if the row still meets the
83 // criteria, otherwise the row will be removed from the index. Returns true 81 // criteria, otherwise the row will be removed from the index. Returns true
84 // if the index was actually updated. |languages| gives a list of language 82 // if the index was actually updated. |scheme_whitelist| is used to filter
85 // encodings by which the URLs and page titles are broken down into words and 83 // non-qualifying schemes. |history_service| is used to schedule an update to
86 // characters. |scheme_whitelist| is used to filter non-qualifying schemes. 84 // the recent visits component of this URL's entry in the index.
87 // |history_service| is used to schedule an update to the recent visits
88 // component of this URL's entry in the index.
89 bool UpdateURL(history::HistoryService* history_service, 85 bool UpdateURL(history::HistoryService* history_service,
90 const history::URLRow& row, 86 const history::URLRow& row,
91 const std::string& languages,
92 const std::set<std::string>& scheme_whitelist, 87 const std::set<std::string>& scheme_whitelist,
93 base::CancelableTaskTracker* tracker); 88 base::CancelableTaskTracker* tracker);
94 89
95 // Updates the entry for |url_id| in the index, replacing its 90 // Updates the entry for |url_id| in the index, replacing its
96 // recent visits information with |recent_visits|. If |url_id| 91 // recent visits information with |recent_visits|. If |url_id|
97 // is not in the index, does nothing. 92 // is not in the index, does nothing.
98 void UpdateRecentVisits(history::URLID url_id, 93 void UpdateRecentVisits(history::URLID url_id,
99 const history::VisitVector& recent_visits); 94 const history::VisitVector& recent_visits);
100 95
101 // Using |history_service| schedules an update (using the historyDB 96 // Using |history_service| schedules an update (using the historyDB
102 // thread) for the recent visits information for |url_id|. Unless 97 // thread) for the recent visits information for |url_id|. Unless
103 // something unexpectedly goes wrong, UdpateRecentVisits() should 98 // something unexpectedly goes wrong, UdpateRecentVisits() should
104 // eventually be called from a callback. 99 // eventually be called from a callback.
105 void ScheduleUpdateRecentVisits(history::HistoryService* history_service, 100 void ScheduleUpdateRecentVisits(history::HistoryService* history_service,
106 history::URLID url_id, 101 history::URLID url_id,
107 base::CancelableTaskTracker* tracker); 102 base::CancelableTaskTracker* tracker);
108 103
109 // Deletes index data for the history item with the given |url|. 104 // Deletes index data for the history item with the given |url|.
110 // The item may not have actually been indexed, which is the case if it did 105 // The item may not have actually been indexed, which is the case if it did
111 // not previously meet minimum 'quick' criteria. Returns true if the index 106 // not previously meet minimum 'quick' criteria. Returns true if the index
112 // was actually updated. 107 // was actually updated.
113 bool DeleteURL(const GURL& url); 108 bool DeleteURL(const GURL& url);
114 109
115 // Constructs a new object by restoring its contents from the cache file 110 // Constructs a new object by restoring its contents from the cache file
116 // at |path|. Returns the new URLIndexPrivateData which on success will 111 // at |path|. Returns the new URLIndexPrivateData which on success will
117 // contain the restored data but upon failure will be empty. |languages| 112 // contain the restored data but upon failure will be empty.
118 // is used to break URLs and page titles into words. This function 113 // This function should be run on the the file thread.
119 // should be run on the the file thread.
120 static scoped_refptr<URLIndexPrivateData> RestoreFromFile( 114 static scoped_refptr<URLIndexPrivateData> RestoreFromFile(
121 const base::FilePath& path, 115 const base::FilePath& path);
122 const std::string& languages);
123 116
124 // Constructs a new object by rebuilding its contents from the history 117 // Constructs a new object by rebuilding its contents from the history
125 // database in |history_db|. Returns the new URLIndexPrivateData which on 118 // database in |history_db|. Returns the new URLIndexPrivateData which on
126 // success will contain the rebuilt data but upon failure will be empty. 119 // success will contain the rebuilt data but upon failure will be empty.
127 // |languages| gives a list of language encodings by which the URLs and page
128 // titles are broken down into words and characters.
129 static scoped_refptr<URLIndexPrivateData> RebuildFromHistory( 120 static scoped_refptr<URLIndexPrivateData> RebuildFromHistory(
130 history::HistoryDatabase* history_db, 121 history::HistoryDatabase* history_db,
131 const std::string& languages,
132 const std::set<std::string>& scheme_whitelist); 122 const std::set<std::string>& scheme_whitelist);
133 123
134 // Writes |private_data| as a cache file to |file_path| and returns success. 124 // Writes |private_data| as a cache file to |file_path| and returns success.
135 static bool WritePrivateDataToCacheFileTask( 125 static bool WritePrivateDataToCacheFileTask(
136 scoped_refptr<URLIndexPrivateData> private_data, 126 scoped_refptr<URLIndexPrivateData> private_data,
137 const base::FilePath& file_path); 127 const base::FilePath& file_path);
138 128
139 // Creates a copy of ourself. 129 // Creates a copy of ourself.
140 scoped_refptr<URLIndexPrivateData> Duplicate() const; 130 scoped_refptr<URLIndexPrivateData> Duplicate() const;
141 131
(...skipping 54 matching lines...)
196 }; 186 };
197 typedef std::map<base::string16, SearchTermCacheItem> SearchTermCacheMap; 187 typedef std::map<base::string16, SearchTermCacheItem> SearchTermCacheMap;
198 188
199 // A helper class which performs the final filter on each candidate 189 // A helper class which performs the final filter on each candidate
200 // history URL match, inserting accepted matches into |scored_matches_|. 190 // history URL match, inserting accepted matches into |scored_matches_|.
201 class AddHistoryMatch { 191 class AddHistoryMatch {
202 public: 192 public:
203 AddHistoryMatch(bookmarks::BookmarkModel* bookmark_model, 193 AddHistoryMatch(bookmarks::BookmarkModel* bookmark_model,
204 TemplateURLService* template_url_service, 194 TemplateURLService* template_url_service,
205 const URLIndexPrivateData& private_data, 195 const URLIndexPrivateData& private_data,
206 const std::string& languages,
207 const base::string16& lower_string, 196 const base::string16& lower_string,
208 const String16Vector& lower_terms, 197 const String16Vector& lower_terms,
209 const base::Time now); 198 const base::Time now);
210 AddHistoryMatch(const AddHistoryMatch& other); 199 AddHistoryMatch(const AddHistoryMatch& other);
211 ~AddHistoryMatch(); 200 ~AddHistoryMatch();
212 201
213 void operator()(const HistoryID history_id); 202 void operator()(const HistoryID history_id);
214 203
215 ScoredHistoryMatches ScoredMatches() const { return scored_matches_; } 204 ScoredHistoryMatches ScoredMatches() const { return scored_matches_; }
216 205
217 private: 206 private:
218 friend class InMemoryURLIndexTest; 207 friend class InMemoryURLIndexTest;
219 FRIEND_TEST_ALL_PREFIXES(InMemoryURLIndexTest, AddHistoryMatch); 208 FRIEND_TEST_ALL_PREFIXES(InMemoryURLIndexTest, AddHistoryMatch);
220 bookmarks::BookmarkModel* bookmark_model_; 209 bookmarks::BookmarkModel* bookmark_model_;
221 TemplateURLService* template_url_service_; 210 TemplateURLService* template_url_service_;
222 const URLIndexPrivateData& private_data_; 211 const URLIndexPrivateData& private_data_;
223 const std::string& languages_;
224 ScoredHistoryMatches scored_matches_; 212 ScoredHistoryMatches scored_matches_;
225 const base::string16& lower_string_; 213 const base::string16& lower_string_;
226 const String16Vector& lower_terms_; 214 const String16Vector& lower_terms_;
227 WordStarts lower_terms_to_word_starts_offsets_; 215 WordStarts lower_terms_to_word_starts_offsets_;
228 const base::Time now_; 216 const base::Time now_;
229 }; 217 };
230 218
231 // A helper predicate class used to filter excess history items when the 219 // A helper predicate class used to filter excess history items when the
232 // candidate results set is too large. 220 // candidate results set is too large.
233 class HistoryItemFactorGreater { 221 class HistoryItemFactorGreater {
(...skipping 14 matching lines...)
248 HistoryIDSet HistoryIDSetFromWords(const String16Vector& unsorted_words); 236 HistoryIDSet HistoryIDSetFromWords(const String16Vector& unsorted_words);
249 237
250 // Helper function to HistoryIDSetFromWords which composes a set of history 238 // Helper function to HistoryIDSetFromWords which composes a set of history
251 // ids for the given term given in |term|. 239 // ids for the given term given in |term|.
252 HistoryIDSet HistoryIDsForTerm(const base::string16& term); 240 HistoryIDSet HistoryIDsForTerm(const base::string16& term);
253 241
254 // Given a set of Char16s, finds words containing those characters. 242 // Given a set of Char16s, finds words containing those characters.
255 WordIDSet WordIDSetForTermChars(const Char16Set& term_chars); 243 WordIDSet WordIDSetForTermChars(const Char16Set& term_chars);
256 244
257 // Indexes one URL history item as described by |row|. Returns true if the 245 // Indexes one URL history item as described by |row|. Returns true if the
258 // row was actually indexed. |languages| gives a list of language encodings by 246 // row was actually indexed. |scheme_whitelist| is used to filter
259 // which the URLs and page titles are broken down into words and characters. 247 // non-qualifying schemes. If |history_db| is not NULL then this function
260 // |scheme_whitelist| is used to filter non-qualifying schemes. If 248 // uses the history database synchronously to get the URL's recent visits
261 // |history_db| is not NULL then this function uses the history database 249 // information. This mode should/ only be used on the historyDB thread.
262 // synchronously to get the URL's recent visits information. This mode should 250 // If |history_db| is NULL, then this function uses |history_service| to
263 // only be used on the historyDB thread. If |history_db| is NULL, then 251 // schedule a task on the historyDB thread to fetch and update the recent
264 // this function uses |history_service| to schedule a task on the 252 // visits information.
265 // historyDB thread to fetch and update the recent visits
266 // information.
267 bool IndexRow(history::HistoryDatabase* history_db, 253 bool IndexRow(history::HistoryDatabase* history_db,
268 history::HistoryService* history_service, 254 history::HistoryService* history_service,
269 const history::URLRow& row, 255 const history::URLRow& row,
270 const std::string& languages,
271 const std::set<std::string>& scheme_whitelist, 256 const std::set<std::string>& scheme_whitelist,
272 base::CancelableTaskTracker* tracker); 257 base::CancelableTaskTracker* tracker);
273 258
274 // Parses and indexes the words in the URL and page title of |row| and 259 // Parses and indexes the words in the URL and page title of |row| and
275 // calculate the word starts in each, saving the starts in |word_starts|. 260 // calculate the word starts in each, saving the starts in |word_starts|.
276 // |languages| gives a list of language encodings by which the URLs and page
277 // titles are broken down into words and characters.
278 void AddRowWordsToIndex(const history::URLRow& row, 261 void AddRowWordsToIndex(const history::URLRow& row,
279 RowWordStarts* word_starts, 262 RowWordStarts* word_starts);
280 const std::string& languages);
281 263
282 // Given a single word in |uni_word|, adds a reference for the containing 264 // Given a single word in |uni_word|, adds a reference for the containing
283 // history item identified by |history_id| to the index. 265 // history item identified by |history_id| to the index.
284 void AddWordToIndex(const base::string16& uni_word, HistoryID history_id); 266 void AddWordToIndex(const base::string16& uni_word, HistoryID history_id);
285 267
286 // Creates a new entry in the word/history map for |word_id| and add 268 // Creates a new entry in the word/history map for |word_id| and add
287 // |history_id| as the initial element of the word's set. 269 // |history_id| as the initial element of the word's set.
288 void AddWordHistory(const base::string16& uni_word, HistoryID history_id); 270 void AddWordHistory(const base::string16& uni_word, HistoryID history_id);
289 271
290 // Updates an existing entry in the word/history index by adding the 272 // Updates an existing entry in the word/history index by adding the
(...skipping 26 matching lines...)
317 void SaveCharWordMap( 299 void SaveCharWordMap(
318 in_memory_url_index::InMemoryURLIndexCacheItem* cache) const; 300 in_memory_url_index::InMemoryURLIndexCacheItem* cache) const;
319 void SaveWordIDHistoryMap( 301 void SaveWordIDHistoryMap(
320 in_memory_url_index::InMemoryURLIndexCacheItem* cache) const; 302 in_memory_url_index::InMemoryURLIndexCacheItem* cache) const;
321 void SaveHistoryInfoMap( 303 void SaveHistoryInfoMap(
322 in_memory_url_index::InMemoryURLIndexCacheItem* cache) const; 304 in_memory_url_index::InMemoryURLIndexCacheItem* cache) const;
323 void SaveWordStartsMap( 305 void SaveWordStartsMap(
324 in_memory_url_index::InMemoryURLIndexCacheItem* cache) const; 306 in_memory_url_index::InMemoryURLIndexCacheItem* cache) const;
325 307
326 // Decode a data structure from the protobuf |cache|. Return false if there 308 // Decode a data structure from the protobuf |cache|. Return false if there
327 // is any kind of failure. |languages| will be used to break URLs and page 309 // is any kind of failure.
328 // titles into words
329 bool RestorePrivateData( 310 bool RestorePrivateData(
330 const in_memory_url_index::InMemoryURLIndexCacheItem& cache, 311 const in_memory_url_index::InMemoryURLIndexCacheItem& cache);
331 const std::string& languages);
332 bool RestoreWordList( 312 bool RestoreWordList(
333 const in_memory_url_index::InMemoryURLIndexCacheItem& cache); 313 const in_memory_url_index::InMemoryURLIndexCacheItem& cache);
334 bool RestoreWordMap( 314 bool RestoreWordMap(
335 const in_memory_url_index::InMemoryURLIndexCacheItem& cache); 315 const in_memory_url_index::InMemoryURLIndexCacheItem& cache);
336 bool RestoreCharWordMap( 316 bool RestoreCharWordMap(
337 const in_memory_url_index::InMemoryURLIndexCacheItem& cache); 317 const in_memory_url_index::InMemoryURLIndexCacheItem& cache);
338 bool RestoreWordIDHistoryMap( 318 bool RestoreWordIDHistoryMap(
339 const in_memory_url_index::InMemoryURLIndexCacheItem& cache); 319 const in_memory_url_index::InMemoryURLIndexCacheItem& cache);
340 bool RestoreHistoryInfoMap( 320 bool RestoreHistoryInfoMap(
341 const in_memory_url_index::InMemoryURLIndexCacheItem& cache); 321 const in_memory_url_index::InMemoryURLIndexCacheItem& cache);
342 bool RestoreWordStartsMap( 322 bool RestoreWordStartsMap(
343 const in_memory_url_index::InMemoryURLIndexCacheItem& cache, 323 const in_memory_url_index::InMemoryURLIndexCacheItem& cache);
344 const std::string& languages);
345 324
346 // Determines if |gurl| has a whitelisted scheme and returns true if so. 325 // Determines if |gurl| has a whitelisted scheme and returns true if so.
347 static bool URLSchemeIsWhitelisted(const GURL& gurl, 326 static bool URLSchemeIsWhitelisted(const GURL& gurl,
348 const std::set<std::string>& whitelist); 327 const std::set<std::string>& whitelist);
349 328
350 // Cache of search terms. 329 // Cache of search terms.
351 SearchTermCacheMap search_term_cache_; 330 SearchTermCacheMap search_term_cache_;
352 331
353 // Start of data members that are cached ------------------------------------- 332 // Start of data members that are cached -------------------------------------
354 333
(...skipping 53 matching lines...)
408 int saved_cache_version_; 387 int saved_cache_version_;
409 388
410 // Used for unit testing only. Records the number of candidate history items 389 // Used for unit testing only. Records the number of candidate history items
411 // at three stages in the index searching process. 390 // at three stages in the index searching process.
412 size_t pre_filter_item_count_; // After word index is queried. 391 size_t pre_filter_item_count_; // After word index is queried.
413 size_t post_filter_item_count_; // After trimming large result set. 392 size_t post_filter_item_count_; // After trimming large result set.
414 size_t post_scoring_item_count_; // After performing final filter/scoring. 393 size_t post_scoring_item_count_; // After performing final filter/scoring.
415 }; 394 };
416 395
417 #endif // COMPONENTS_OMNIBOX_BROWSER_URL_INDEX_PRIVATE_DATA_H_ 396 #endif // COMPONENTS_OMNIBOX_BROWSER_URL_INDEX_PRIVATE_DATA_H_
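
Note on HistoryItemsForTerms(): the header comment describes a pipeline of filtering candidates so that every whitespace-separated term occurs in the URL or page title, scoring a bounded number of them, sorting by descending score, and returning at most |max_matches| items. The sketch below illustrates that flow only. It is not the Chromium implementation: Candidate, SplitTerms, MatchesAllTerms, ItemsForTerms, raw_score, and items_to_score_limit are hypothetical names, std::string stands in for base::string16, matching is case-sensitive, the cursor split is omitted, and the cap is applied while filtering rather than by a separate trimming step.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for an indexed history row (not a Chromium type).
struct Candidate {
  std::string url;
  std::string title;
  int raw_score;  // Placeholder for whatever the real scorer would compute.
};

// Splits |term_string| on whitespace.
std::vector<std::string> SplitTerms(const std::string& term_string) {
  std::istringstream stream(term_string);
  std::vector<std::string> terms;
  for (std::string term; stream >> term;)
    terms.push_back(term);
  return terms;
}

// Returns true if every term occurs in the candidate's URL or page title.
bool MatchesAllTerms(const Candidate& candidate,
                     const std::vector<std::string>& terms) {
  return std::all_of(terms.begin(), terms.end(),
                     [&candidate](const std::string& term) {
                       return candidate.url.find(term) != std::string::npos ||
                              candidate.title.find(term) != std::string::npos;
                     });
}

// Filters |candidates|, keeps at most |items_to_score_limit| of the accepted
// ones, sorts them by descending score, and returns at most |max_matches|.
std::vector<Candidate> ItemsForTerms(const std::vector<Candidate>& candidates,
                                     const std::string& term_string,
                                     std::size_t max_matches,
                                     std::size_t items_to_score_limit) {
  assert(items_to_score_limit > 0);  // A zero limit would never score anything.
  const std::vector<std::string> terms = SplitTerms(term_string);

  std::vector<Candidate> matches;
  for (const Candidate& candidate : candidates) {
    if (matches.size() >= items_to_score_limit)
      break;  // Bound the scoring work for very short, very common terms.
    if (MatchesAllTerms(candidate, terms))
      matches.push_back(candidate);
  }

  std::stable_sort(matches.begin(), matches.end(),
                   [](const Candidate& a, const Candidate& b) {
                     return a.raw_score > b.raw_score;
                   });
  if (matches.size() > max_matches)
    matches.erase(matches.begin() + max_matches, matches.end());
  return matches;
}
```

Note that no |languages| argument appears anywhere in the sketch, which mirrors the direction of this patch: the term matching described in the comment does not depend on a language list.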