Chromium Code Reviews| OLD | NEW |
|---|---|
| 1 // Copyright (c) 2012 The Chromium Authors. All rights reserved. | 1 // Copyright (c) 2012 The Chromium Authors. All rights reserved. |
| 2 // Use of this source code is governed by a BSD-style license that can be | 2 // Use of this source code is governed by a BSD-style license that can be |
| 3 // found in the LICENSE file. | 3 // found in the LICENSE file. |
| 4 | 4 |
| 5 #include "components/omnibox/browser/url_index_private_data.h" | 5 #include "components/omnibox/browser/url_index_private_data.h" |
| 6 | 6 |
| 7 #include <stdint.h> | 7 #include <stdint.h> |
| 8 | 8 |
| 9 #include <functional> | 9 #include <functional> |
| 10 #include <iterator> | 10 #include <iterator> |
| (...skipping 191 matching lines...) Expand 10 before | Expand all | Expand 10 after Loading... | |
| 202 if (lower_words.empty()) | 202 if (lower_words.empty()) |
| 203 continue; | 203 continue; |
| 204 HistoryIDSet history_id_set = HistoryIDSetFromWords(lower_words); | 204 HistoryIDSet history_id_set = HistoryIDSetFromWords(lower_words); |
| 205 pre_filter_item_count_ += history_id_set.size(); | 205 pre_filter_item_count_ += history_id_set.size(); |
| 206 // Trim the candidate pool if it is large. Note that we do not filter out | 206 // Trim the candidate pool if it is large. Note that we do not filter out |
| 207 // items that do not contain the search terms as proper substrings -- | 207 // items that do not contain the search terms as proper substrings -- |
| 208 // doing so is the performance-costly operation we are trying to avoid in | 208 // doing so is the performance-costly operation we are trying to avoid in |
| 209 // order to maintain omnibox responsiveness. | 209 // order to maintain omnibox responsiveness. |
| 210 const size_t kItemsToScoreLimit = 500; | 210 const size_t kItemsToScoreLimit = 500; |
| 211 if (history_id_set.size() > kItemsToScoreLimit) { | 211 if (history_id_set.size() > kItemsToScoreLimit) { |
| 212 HistoryIDVector history_ids; | 212 HistoryIDVector history_ids = {history_id_set.begin(), |
| 213 std::copy(history_id_set.begin(), history_id_set.end(), | 213 history_id_set.end()}; |
|
Peter Kasting
2017/02/17 01:33:46
Nit: Prefer the direct constructor:
History
dyaroshev
2017/02/17 20:17:22
Done.
| |
| 214 std::back_inserter(history_ids)); | 214 |
| 215 // Trim down the set by sorting by typed-count, visit-count, and last | 215 // Trim down the set by sorting by typed-count, visit-count, and last |
| 216 // visit. | 216 // visit. |
| 217 HistoryItemFactorGreater item_factor_functor(history_info_map_); | 217 HistoryItemFactorGreater item_factor_functor(history_info_map_); |
| 218 std::partial_sort(history_ids.begin(), | 218 std::nth_element(history_ids.begin(), |
|
Peter Kasting
2017/02/17 01:33:46
I've never seen this algorithm before. Good use o
dyaroshev
2017/02/17 20:17:22
thx) While we are here - I'm not 100% sure, but it
Peter Kasting
2017/02/17 21:52:26
Correct, we're going to score them and then sort t
| |
| 219 history_ids.begin() + kItemsToScoreLimit, | 219 history_ids.begin() + kItemsToScoreLimit, |
| 220 history_ids.end(), item_factor_functor); | 220 history_ids.end(), item_factor_functor); |
| 221 history_id_set.clear(); | 221 history_id_set = {history_ids.begin(), |
| 222 std::copy(history_ids.begin(), history_ids.begin() + kItemsToScoreLimit, | 222 history_ids.begin() + kItemsToScoreLimit}; |
| 223 std::inserter(history_id_set, history_id_set.end())); | |
| 224 post_filter_item_count_ += history_id_set.size(); | 223 post_filter_item_count_ += history_id_set.size(); |
| 225 } else { | 224 } else { |
| 226 post_filter_item_count_ += pre_filter_item_count_; | 225 post_filter_item_count_ += pre_filter_item_count_; |
| 227 } | 226 } |
| 228 ScoredHistoryMatches temp_scored_items; | 227 ScoredHistoryMatches temp_scored_items; |
| 229 HistoryIdSetToScoredMatches(history_id_set, lower_raw_string, | 228 HistoryIdSetToScoredMatches(history_id_set, lower_raw_string, |
| 230 template_url_service, bookmark_model, | 229 template_url_service, bookmark_model, |
| 231 &temp_scored_items); | 230 &temp_scored_items); |
| 232 scored_items.insert(scored_items.end(), temp_scored_items.begin(), | 231 scored_items.insert(scored_items.end(), temp_scored_items.begin(), |
| 233 temp_scored_items.end()); | 232 temp_scored_items.end()); |
| (...skipping 270 matching lines...) Expand 10 before | Expand all | Expand 10 after Loading... | |
| 504 // a string like "http://www.somewebsite.com" which, from our perspective, | 503 // a string like "http://www.somewebsite.com" which, from our perspective, |
| 505 // is four words: 'http', 'www', 'somewebsite', and 'com'. | 504 // is four words: 'http', 'www', 'somewebsite', and 'com'. |
| 506 HistoryIDSet history_id_set; | 505 HistoryIDSet history_id_set; |
| 507 String16Vector words(unsorted_words); | 506 String16Vector words(unsorted_words); |
| 508 // Sort the words into the longest first as such are likely to narrow down | 507 // Sort the words into the longest first as such are likely to narrow down |
| 509 // the results quicker. Also, single character words are the most expensive | 508 // the results quicker. Also, single character words are the most expensive |
| 510 // to process so save them for last. | 509 // to process so save them for last. |
| 511 std::sort(words.begin(), words.end(), LengthGreater); | 510 std::sort(words.begin(), words.end(), LengthGreater); |
| 512 for (String16Vector::iterator iter = words.begin(); iter != words.end(); | 511 for (String16Vector::iterator iter = words.begin(); iter != words.end(); |
| 513 ++iter) { | 512 ++iter) { |
| 514 base::string16 uni_word = *iter; | 513 HistoryIDSet term_history_set = HistoryIDsForTerm(*iter); |
| 515 HistoryIDSet term_history_set = HistoryIDsForTerm(uni_word); | |
| 516 if (term_history_set.empty()) { | 514 if (term_history_set.empty()) { |
|
Peter Kasting
2017/02/17 01:33:46
Nit: No {}
dyaroshev
2017/02/17 20:17:22
Done.
| |
| 517 history_id_set.clear(); | 515 return {}; |
|
Peter Kasting
2017/02/17 01:33:46
Nit: I'd return HistoryIDSet(); (3 similar places)
dyaroshev
2017/02/17 20:17:22
Done.
| |
| 518 break; | |
| 519 } | 516 } |
| 520 if (iter == words.begin()) { | 517 if (iter == words.begin()) { |
| 521 history_id_set.swap(term_history_set); | 518 history_id_set = std::move(term_history_set); |
| 522 } else { | 519 } else { |
| 523 HistoryIDSet new_history_id_set = base::STLSetIntersection<HistoryIDSet>( | 520 history_id_set = base::STLSetIntersection<HistoryIDSet>(history_id_set, |
| 524 history_id_set, term_history_set); | 521 term_history_set); |
| 525 history_id_set.swap(new_history_id_set); | |
| 526 } | 522 } |
| 527 } | 523 } |
| 528 return history_id_set; | 524 return history_id_set; |
| 529 } | 525 } |
| 530 | 526 |
| 531 HistoryIDSet URLIndexPrivateData::HistoryIDsForTerm( | 527 HistoryIDSet URLIndexPrivateData::HistoryIDsForTerm( |
| 532 const base::string16& term) { | 528 const base::string16& term) { |
| 533 if (term.empty()) | 529 if (term.empty()) |
| 534 return HistoryIDSet(); | 530 return HistoryIDSet(); |
| 535 | 531 |
| (...skipping 49 matching lines...) Expand 10 before | Expand all | Expand 10 after Loading... | |
| 585 | 581 |
| 586 // Reduce the word set with any leftover, unprocessed characters. | 582 // Reduce the word set with any leftover, unprocessed characters. |
| 587 if (!unique_chars.empty()) { | 583 if (!unique_chars.empty()) { |
| 588 WordIDSet leftover_set(WordIDSetForTermChars(unique_chars)); | 584 WordIDSet leftover_set(WordIDSetForTermChars(unique_chars)); |
| 589 // We might come up empty on the leftovers. | 585 // We might come up empty on the leftovers. |
| 590 if (leftover_set.empty()) { | 586 if (leftover_set.empty()) { |
| 591 search_term_cache_[term] = SearchTermCacheItem(); | 587 search_term_cache_[term] = SearchTermCacheItem(); |
| 592 return HistoryIDSet(); | 588 return HistoryIDSet(); |
| 593 } | 589 } |
| 594 // Or there may not have been a prefix from which to start. | 590 // Or there may not have been a prefix from which to start. |
| 595 if (prefix_chars.empty()) { | 591 word_id_set = prefix_chars.empty() ? leftover_set |
| 596 word_id_set.swap(leftover_set); | 592 : base::STLSetIntersection<WordIDSet>( |
| 597 } else { | 593 word_id_set, leftover_set); |
| 598 WordIDSet new_word_id_set = base::STLSetIntersection<WordIDSet>( | |
| 599 word_id_set, leftover_set); | |
| 600 word_id_set.swap(new_word_id_set); | |
| 601 } | |
| 602 } | 594 } |
| 603 | 595 |
| 604 // We must filter the word list because the resulting word set surely | 596 // We must filter the word list because the resulting word set surely |
| 605 // contains words which do not have the search term as a proper subset. | 597 // contains words which do not have the search term as a proper subset. |
| 606 for (WordIDSet::iterator word_set_iter = word_id_set.begin(); | 598 word_id_set.erase(std::remove_if(word_id_set.begin(), word_id_set.end(), |
| 607 word_set_iter != word_id_set.end(); ) { | 599 [&](WordID word_id) { |
| 608 if (word_list_[*word_set_iter].find(term) == base::string16::npos) | 600 return word_list_[word_id].find(term) == |
| 609 word_set_iter = word_id_set.erase(word_set_iter); | 601 base::string16::npos; |
| 610 else | 602 }), |
| 611 ++word_set_iter; | 603 word_id_set.end()); |
|
Peter Kasting
2017/02/17 01:33:46
Nit: Slightly shorter, reads a bit less awkwardly
| |
| 612 } | 604 |
| 613 } else { | 605 } else { |
| 614 word_id_set = WordIDSetForTermChars(Char16SetFromString16(term)); | 606 word_id_set = WordIDSetForTermChars(Char16SetFromString16(term)); |
| 615 } | 607 } |
| 616 | 608 |
| 617 // If any words resulted then we can compose a set of history IDs by unioning | 609 // If any words resulted then we can compose a set of history IDs by unioning |
| 618 // the sets from each word. | 610 // the sets from each word. |
| 619 HistoryIDSet history_id_set; | 611 auto history_id_set = [&]() -> HistoryIDSet { |
|
Peter Kasting
2017/02/17 01:33:46
Why do this in a lambda? Why not just "inline" th
| |
| 620 if (!word_id_set.empty()) { | 612 HistoryIDVector buffer; |
|
Peter Kasting
2017/02/17 01:33:46
Nit: Deserves a comment about why it's important t
| |
| 621 for (WordIDSet::iterator word_id_iter = word_id_set.begin(); | 613 |
|
Peter Kasting
2017/02/17 01:33:46
Nit: No blank line
| |
| 622 word_id_iter != word_id_set.end(); ++word_id_iter) { | 614 for (WordID word_id : word_id_set) { |
| 623 WordID word_id = *word_id_iter; | |
| 624 WordIDHistoryMap::iterator word_iter = word_id_history_map_.find(word_id); | 615 WordIDHistoryMap::iterator word_iter = word_id_history_map_.find(word_id); |
| 625 if (word_iter != word_id_history_map_.end()) { | 616 if (word_iter == word_id_history_map_.end()) |
| 626 HistoryIDSet& word_history_id_set(word_iter->second); | 617 continue; |
|
Peter Kasting
2017/02/17 01:33:46
Nit: I probably wouldn't change the old form here
| |
| 627 history_id_set.insert(word_history_id_set.begin(), | 618 HistoryIDSet& word_history_id_set(word_iter->second); |
| 628 word_history_id_set.end()); | 619 buffer.insert(buffer.end(), word_history_id_set.begin(), |
| 629 } | 620 word_history_id_set.end()); |
| 630 } | 621 } |
| 631 } | 622 |
| 623 return {buffer.begin(), buffer.end()}; | |
| 624 }(); | |
| 632 | 625 |
| 633 // Record a new cache entry for this word if the term is longer than | 626 // Record a new cache entry for this word if the term is longer than |
| 634 // a single character. | 627 // a single character. |
| 635 if (term_length > 1) | 628 if (term_length > 1) |
| 636 search_term_cache_[term] = SearchTermCacheItem(word_id_set, history_id_set); | 629 search_term_cache_[term] = SearchTermCacheItem(word_id_set, history_id_set); |
| 637 | 630 |
| 638 return history_id_set; | 631 return history_id_set; |
| 639 } | 632 } |
| 640 | 633 |
| 641 WordIDSet URLIndexPrivateData::WordIDSetForTermChars( | 634 WordIDSet URLIndexPrivateData::WordIDSetForTermChars( |
| 642 const Char16Set& term_chars) { | 635 const Char16Set& term_chars) { |
| 643 WordIDSet word_id_set; | 636 WordIDSet word_id_set; |
| 644 for (Char16Set::const_iterator c_iter = term_chars.begin(); | |
| 645 c_iter != term_chars.end(); ++c_iter) { | |
| 646 CharWordIDMap::iterator char_iter = char_word_map_.find(*c_iter); | |
| 647 if (char_iter == char_word_map_.end()) { | |
| 648 // A character was not found so there are no matching results: bail. | |
| 649 word_id_set.clear(); | |
| 650 break; | |
| 651 } | |
| 652 WordIDSet& char_word_id_set(char_iter->second); | |
| 653 // It is possible for there to no longer be any words associated with | |
| 654 // a particular character. Give up in that case. | |
| 655 if (char_word_id_set.empty()) { | |
| 656 word_id_set.clear(); | |
| 657 break; | |
| 658 } | |
| 659 | 637 |
| 660 if (c_iter == term_chars.begin()) { | 638 for (base::char16 c : term_chars) { |
| 661 // First character results becomes base set of results. | 639 CharWordIDMap::iterator char_iter = char_word_map_.find(c); |
| 662 word_id_set = char_word_id_set; | 640 if (char_iter == char_word_map_.end()) |
|
Peter Kasting
2017/02/17 01:33:46
Nit: Preserve the old comments
dyaroshev
2017/02/17 20:17:23
Done.
| |
| 663 } else { | 641 return {}; |
| 664 // Subsequent character results get intersected in. | 642 |
| 665 WordIDSet new_word_id_set = base::STLSetIntersection<WordIDSet>( | 643 const WordIDSet& char_word_id_set(char_iter->second); |
| 666 word_id_set, char_word_id_set); | 644 if (char_word_id_set.empty()) |
| 667 word_id_set.swap(new_word_id_set); | 645 return {}; |
| 668 } | 646 |
| 647 word_id_set = | |
| 648 base::STLSetIntersection<WordIDSet>(word_id_set, char_word_id_set); | |
|
Peter Kasting
2017/02/17 01:33:46
Does dropping the assignment the old code used in
dyaroshev
2017/02/17 20:17:23
Yes, my mistake. Done. Reverted almost everything
| |
| 669 } | 649 } |
| 650 | |
| 670 return word_id_set; | 651 return word_id_set; |
| 671 } | 652 } |
| 672 | 653 |
| 673 void URLIndexPrivateData::HistoryIdSetToScoredMatches( | 654 void URLIndexPrivateData::HistoryIdSetToScoredMatches( |
| 674 HistoryIDSet history_id_set, | 655 HistoryIDSet history_id_set, |
| 675 const base::string16& lower_raw_string, | 656 const base::string16& lower_raw_string, |
| 676 const TemplateURLService* template_url_service, | 657 const TemplateURLService* template_url_service, |
| 677 bookmarks::BookmarkModel* bookmark_model, | 658 bookmarks::BookmarkModel* bookmark_model, |
| 678 ScoredHistoryMatches* scored_items) const { | 659 ScoredHistoryMatches* scored_items) const { |
| 679 if (history_id_set.empty()) | 660 if (history_id_set.empty()) |
| (...skipping 17 matching lines...) Expand all Loading... | |
| 697 // are some form of whitespace), but this is such a rare edge case that it's | 678 // are some form of whitespace), but this is such a rare edge case that it's |
| 698 // not worth the time. | 679 // not worth the time. |
| 699 if (lower_raw_terms.empty()) | 680 if (lower_raw_terms.empty()) |
| 700 return; | 681 return; |
| 701 | 682 |
| 702 WordStarts lower_terms_to_word_starts_offsets; | 683 WordStarts lower_terms_to_word_starts_offsets; |
| 703 CalculateWordStartsOffsets(lower_raw_terms, | 684 CalculateWordStartsOffsets(lower_raw_terms, |
| 704 &lower_terms_to_word_starts_offsets); | 685 &lower_terms_to_word_starts_offsets); |
| 705 | 686 |
| 706 // Filter bad matches and other matches we don't want to display. | 687 // Filter bad matches and other matches we don't want to display. |
| 707 for (auto it = history_id_set.begin();;) { | 688 history_id_set.erase( |
| 708 it = std::find_if(it, history_id_set.end(), | 689 std::remove_if(history_id_set.begin(), history_id_set.end(), |
| 709 [this, template_url_service](const HistoryID history_id) { | 690 [this, template_url_service](const HistoryID history_id) { |
| 710 return ShouldFilter(history_id, template_url_service); | 691 return ShouldFilter(history_id, template_url_service); |
| 711 }); | 692 }), |
| 712 if (it == history_id_set.end()) | 693 history_id_set.end()); |
| 713 break; | |
| 714 it = history_id_set.erase(it); | |
| 715 } | |
| 716 | 694 |
| 717 // Score the matches. | 695 // Score the matches. |
| 718 const size_t num_matches = history_id_set.size(); | 696 const size_t num_matches = history_id_set.size(); |
| 719 const base::Time now = base::Time::Now(); | 697 const base::Time now = base::Time::Now(); |
| 720 std::transform( | 698 std::transform( |
| 721 history_id_set.begin(), history_id_set.end(), | 699 history_id_set.begin(), history_id_set.end(), |
| 722 std::back_inserter(*scored_items), [&](const HistoryID history_id) { | 700 std::back_inserter(*scored_items), [&](const HistoryID history_id) { |
| 723 auto hist_pos = history_info_map_.find(history_id); | 701 auto hist_pos = history_info_map_.find(history_id); |
| 724 const history::URLRow& hist_item = hist_pos->second.url_row; | 702 const history::URLRow& hist_item = hist_pos->second.url_row; |
| 725 auto starts_pos = word_starts_map_.find(history_id); | 703 auto starts_pos = word_starts_map_.find(history_id); |
| (...skipping 97 matching lines...) Expand 10 before | Expand all | Expand 10 after Loading... | |
| 823 HistoryID history_id = static_cast<HistoryID>(row.id()); | 801 HistoryID history_id = static_cast<HistoryID>(row.id()); |
| 824 // Split URL into individual, unique words then add in the title words. | 802 // Split URL into individual, unique words then add in the title words. |
| 825 const GURL& gurl(row.url()); | 803 const GURL& gurl(row.url()); |
| 826 const base::string16& url = | 804 const base::string16& url = |
| 827 bookmarks::CleanUpUrlForMatching(gurl, nullptr); | 805 bookmarks::CleanUpUrlForMatching(gurl, nullptr); |
| 828 String16Set url_words = String16SetFromString16(url, | 806 String16Set url_words = String16SetFromString16(url, |
| 829 word_starts ? &word_starts->url_word_starts_ : nullptr); | 807 word_starts ? &word_starts->url_word_starts_ : nullptr); |
| 830 const base::string16& title = bookmarks::CleanUpTitleForMatching(row.title()); | 808 const base::string16& title = bookmarks::CleanUpTitleForMatching(row.title()); |
| 831 String16Set title_words = String16SetFromString16(title, | 809 String16Set title_words = String16SetFromString16(title, |
| 832 word_starts ? &word_starts->title_word_starts_ : nullptr); | 810 word_starts ? &word_starts->title_word_starts_ : nullptr); |
| 833 String16Set words = base::STLSetUnion<String16Set>(url_words, title_words); | 811 for (const auto& word : |
| 834 for (String16Set::iterator word_iter = words.begin(); | 812 base::STLSetUnion<String16Set>(url_words, title_words)) |
| 835 word_iter != words.end(); ++word_iter) | 813 AddWordToIndex(word, history_id); |
| 836 AddWordToIndex(*word_iter, history_id); | |
| 837 | 814 |
| 838 search_term_cache_.clear(); // Invalidate the term cache. | 815 search_term_cache_.clear(); // Invalidate the term cache. |
| 839 } | 816 } |
| 840 | 817 |
| 841 void URLIndexPrivateData::AddWordToIndex(const base::string16& term, | 818 void URLIndexPrivateData::AddWordToIndex(const base::string16& term, |
| 842 HistoryID history_id) { | 819 HistoryID history_id) { |
| 843 WordMap::iterator word_pos = word_map_.find(term); | 820 WordMap::iterator word_pos = word_map_.find(term); |
| 844 if (word_pos != word_map_.end()) | 821 if (word_pos != word_map_.end()) |
| 845 UpdateWordHistory(word_pos->second, history_id); | 822 UpdateWordHistory(word_pos->second, history_id); |
| 846 else | 823 else |
| (...skipping 12 matching lines...) Expand all Loading... | |
| 859 } | 836 } |
| 860 word_map_[term] = word_id; | 837 word_map_[term] = word_id; |
| 861 | 838 |
| 862 HistoryIDSet history_id_set; | 839 HistoryIDSet history_id_set; |
| 863 history_id_set.insert(history_id); | 840 history_id_set.insert(history_id); |
| 864 word_id_history_map_[word_id] = history_id_set; | 841 word_id_history_map_[word_id] = history_id_set; |
| 865 AddToHistoryIDWordMap(history_id, word_id); | 842 AddToHistoryIDWordMap(history_id, word_id); |
| 866 | 843 |
| 867 // For each character in the newly added word (i.e. a word that is not | 844 // For each character in the newly added word (i.e. a word that is not |
| 868 // already in the word index), add the word to the character index. | 845 // already in the word index), add the word to the character index. |
| 869 Char16Set characters = Char16SetFromString16(term); | 846 for (base::char16 uni_char : Char16SetFromString16(term)) { |
|
Peter Kasting
2017/02/17 01:33:46
Nit: No {}
| |
| 870 for (Char16Set::iterator uni_char_iter = characters.begin(); | 847 char_word_map_[uni_char].insert(word_id); |
|
Peter Kasting
2017/02/17 01:33:46
Nice simplification!
dyaroshev
2017/02/17 20:17:23
I rewrote AddWordToIndex, AddWordHistory and Updat
| |
| 871 uni_char_iter != characters.end(); ++uni_char_iter) { | |
| 872 base::char16 uni_char = *uni_char_iter; | |
| 873 CharWordIDMap::iterator char_iter = char_word_map_.find(uni_char); | |
| 874 if (char_iter != char_word_map_.end()) { | |
| 875 // Update existing entry in the char/word index. | |
| 876 WordIDSet& word_id_set(char_iter->second); | |
| 877 word_id_set.insert(word_id); | |
| 878 } else { | |
| 879 // Create a new entry in the char/word index. | |
| 880 WordIDSet word_id_set; | |
| 881 word_id_set.insert(word_id); | |
| 882 char_word_map_[uni_char] = word_id_set; | |
| 883 } | |
| 884 } | 848 } |
| 885 } | 849 } |
| 886 | 850 |
| 887 void URLIndexPrivateData::UpdateWordHistory(WordID word_id, | 851 void URLIndexPrivateData::UpdateWordHistory(WordID word_id, |
| 888 HistoryID history_id) { | 852 HistoryID history_id) { |
| 889 WordIDHistoryMap::iterator history_pos = word_id_history_map_.find(word_id); | 853 WordIDHistoryMap::iterator history_pos = word_id_history_map_.find(word_id); |
| 890 DCHECK(history_pos != word_id_history_map_.end()); | 854 DCHECK(history_pos != word_id_history_map_.end()); |
| 891 HistoryIDSet& history_id_set(history_pos->second); | 855 HistoryIDSet& history_id_set(history_pos->second); |
| 892 history_id_set.insert(history_id); | 856 history_id_set.insert(history_id); |
| 893 AddToHistoryIDWordMap(history_id, word_id); | 857 AddToHistoryIDWordMap(history_id, word_id); |
| 894 } | 858 } |
| 895 | 859 |
| 896 void URLIndexPrivateData::AddToHistoryIDWordMap(HistoryID history_id, | 860 void URLIndexPrivateData::AddToHistoryIDWordMap(HistoryID history_id, |
| 897 WordID word_id) { | 861 WordID word_id) { |
| 898 HistoryIDWordMap::iterator iter = history_id_word_map_.find(history_id); | 862 history_id_word_map_[history_id].insert(word_id); |
| 899 if (iter != history_id_word_map_.end()) { | |
| 900 WordIDSet& word_id_set(iter->second); | |
| 901 word_id_set.insert(word_id); | |
| 902 } else { | |
| 903 WordIDSet word_id_set; | |
| 904 word_id_set.insert(word_id); | |
| 905 history_id_word_map_[history_id] = word_id_set; | |
| 906 } | |
| 907 } | 863 } |
| 908 | 864 |
| 909 void URLIndexPrivateData::RemoveRowFromIndex(const history::URLRow& row) { | 865 void URLIndexPrivateData::RemoveRowFromIndex(const history::URLRow& row) { |
| 910 RemoveRowWordsFromIndex(row); | 866 RemoveRowWordsFromIndex(row); |
| 911 HistoryID history_id = static_cast<HistoryID>(row.id()); | 867 HistoryID history_id = static_cast<HistoryID>(row.id()); |
| 912 history_info_map_.erase(history_id); | 868 history_info_map_.erase(history_id); |
| 913 word_starts_map_.erase(history_id); | 869 word_starts_map_.erase(history_id); |
| 914 } | 870 } |
| 915 | 871 |
| 916 void URLIndexPrivateData::RemoveRowWordsFromIndex(const history::URLRow& row) { | 872 void URLIndexPrivateData::RemoveRowWordsFromIndex(const history::URLRow& row) { |
| 917 // Remove the entries in history_id_word_map_ and word_id_history_map_ for | 873 // Remove the entries in history_id_word_map_ and word_id_history_map_ for |
| 918 // this row. | 874 // this row. |
| 919 HistoryID history_id = static_cast<HistoryID>(row.id()); | 875 HistoryID history_id = static_cast<HistoryID>(row.id()); |
| 920 WordIDSet word_id_set = history_id_word_map_[history_id]; | 876 WordIDSet word_id_set = history_id_word_map_[history_id]; |
| 921 history_id_word_map_.erase(history_id); | 877 history_id_word_map_.erase(history_id); |
| 922 | 878 |
| 923 // Reconcile any changes to word usage. | 879 // Reconcile any changes to word usage. |
| 924 for (WordIDSet::iterator word_id_iter = word_id_set.begin(); | 880 for (WordID word_id : word_id_set) { |
| 925 word_id_iter != word_id_set.end(); ++word_id_iter) { | 881 auto word_id_history_map_iter = word_id_history_map_.find(word_id); |
| 926 WordID word_id = *word_id_iter; | 882 |
|
Peter Kasting
2017/02/17 01:33:46
Nit: No blank line
dyaroshev
2017/02/17 20:17:22
Done.
| |
| 927 word_id_history_map_[word_id].erase(history_id); | 883 if (word_id_history_map_iter == word_id_history_map_.end()) |
|
Peter Kasting
2017/02/17 01:33:46
Can this conditional succeed? The old code doesn'
dyaroshev
2017/02/17 20:17:22
Seems like no, we always add word to this map. And
| |
| 928 if (!word_id_history_map_[word_id].empty()) | 884 continue; |
| 929 continue; // The word is still in use. | 885 |
| 886 word_id_history_map_iter->second.erase(history_id); | |
| 887 if (!word_id_history_map_iter->second.empty()) | |
| 888 continue; | |
| 930 | 889 |
| 931 // The word is no longer in use. Reconcile any changes to character usage. | 890 // The word is no longer in use. Reconcile any changes to character usage. |
| 932 base::string16 word = word_list_[word_id]; | 891 base::string16 word = word_list_[word_id]; |
| 933 Char16Set characters = Char16SetFromString16(word); | 892 for (base::char16 uni_char : Char16SetFromString16(word)) { |
| 934 for (Char16Set::iterator uni_char_iter = characters.begin(); | 893 auto char_word_map_iter = char_word_map_.find(uni_char); |
| 935 uni_char_iter != characters.end(); ++uni_char_iter) { | 894 char_word_map_iter->second.erase(word_id); |
| 936 base::char16 uni_char = *uni_char_iter; | 895 if (char_word_map_iter->second.empty()) |
| 937 char_word_map_[uni_char].erase(word_id); | 896 char_word_map_.erase(char_word_map_iter); |
| 938 if (char_word_map_[uni_char].empty()) | |
| 939 char_word_map_.erase(uni_char); // No longer in use. | |
| 940 } | 897 } |
| 941 | 898 |
| 942 // Complete the removal of references to the word. | 899 // Complete the removal of references to the word. |
| 943 word_id_history_map_.erase(word_id); | 900 word_id_history_map_.erase(word_id_history_map_iter); |
| 944 word_map_.erase(word); | 901 word_map_.erase(word); |
| 945 word_list_[word_id] = base::string16(); | 902 word_list_[word_id] = base::string16(); |
| 946 available_words_.insert(word_id); | 903 available_words_.insert(word_id); |
| 947 } | 904 } |
| 948 } | 905 } |
| 949 | 906 |
| 950 void URLIndexPrivateData::ResetSearchTermCache() { | 907 void URLIndexPrivateData::ResetSearchTermCache() { |
| 951 for (SearchTermCacheMap::iterator iter = search_term_cache_.begin(); | 908 for (auto& item : search_term_cache_) |
| 952 iter != search_term_cache_.end(); ++iter) | 909 item.second.used_ = false; |
| 953 iter->second.used_ = false; | |
| 954 } | 910 } |
| 955 | 911 |
| 956 bool URLIndexPrivateData::SaveToFile(const base::FilePath& file_path) { | 912 bool URLIndexPrivateData::SaveToFile(const base::FilePath& file_path) { |
| 957 base::TimeTicks beginning_time = base::TimeTicks::Now(); | 913 base::TimeTicks beginning_time = base::TimeTicks::Now(); |
| 958 InMemoryURLIndexCacheItem index_cache; | 914 InMemoryURLIndexCacheItem index_cache; |
| 959 SavePrivateData(&index_cache); | 915 SavePrivateData(&index_cache); |
| 960 std::string data; | 916 std::string data; |
| 961 if (!index_cache.SerializeToString(&data)) { | 917 if (!index_cache.SerializeToString(&data)) { |
| 962 LOG(WARNING) << "Failed to serialize the InMemoryURLIndex cache."; | 918 LOG(WARNING) << "Failed to serialize the InMemoryURLIndex cache."; |
| 963 return false; | 919 return false; |
| (...skipping 24 matching lines...) Expand all Loading... | |
| 988 SaveWordIDHistoryMap(cache); | 944 SaveWordIDHistoryMap(cache); |
| 989 SaveHistoryInfoMap(cache); | 945 SaveHistoryInfoMap(cache); |
| 990 SaveWordStartsMap(cache); | 946 SaveWordStartsMap(cache); |
| 991 } | 947 } |
| 992 | 948 |
| 993 void URLIndexPrivateData::SaveWordList(InMemoryURLIndexCacheItem* cache) const { | 949 void URLIndexPrivateData::SaveWordList(InMemoryURLIndexCacheItem* cache) const { |
| 994 if (word_list_.empty()) | 950 if (word_list_.empty()) |
| 995 return; | 951 return; |
| 996 WordListItem* list_item = cache->mutable_word_list(); | 952 WordListItem* list_item = cache->mutable_word_list(); |
| 997 list_item->set_word_count(word_list_.size()); | 953 list_item->set_word_count(word_list_.size()); |
| 998 for (String16Vector::const_iterator iter = word_list_.begin(); | 954 for (const base::string16& word : word_list_) |
| 999 iter != word_list_.end(); ++iter) | 955 list_item->add_word(base::UTF16ToUTF8(word)); |
| 1000 list_item->add_word(base::UTF16ToUTF8(*iter)); | |
| 1001 } | 956 } |
| 1002 | 957 |
| 1003 void URLIndexPrivateData::SaveWordMap(InMemoryURLIndexCacheItem* cache) const { | 958 void URLIndexPrivateData::SaveWordMap(InMemoryURLIndexCacheItem* cache) const { |
| 1004 if (word_map_.empty()) | 959 if (word_map_.empty()) |
| 1005 return; | 960 return; |
| 1006 WordMapItem* map_item = cache->mutable_word_map(); | 961 WordMapItem* map_item = cache->mutable_word_map(); |
| 1007 map_item->set_item_count(word_map_.size()); | 962 map_item->set_item_count(word_map_.size()); |
| 1008 for (WordMap::const_iterator iter = word_map_.begin(); | 963 for (const auto& elem : word_map_) { |
| 1009 iter != word_map_.end(); ++iter) { | |
| 1010 WordMapEntry* map_entry = map_item->add_word_map_entry(); | 964 WordMapEntry* map_entry = map_item->add_word_map_entry(); |
| 1011 map_entry->set_word(base::UTF16ToUTF8(iter->first)); | 965 map_entry->set_word(base::UTF16ToUTF8(elem.first)); |
| 1012 map_entry->set_word_id(iter->second); | 966 map_entry->set_word_id(elem.second); |
| 1013 } | 967 } |
| 1014 } | 968 } |
| 1015 | 969 |
| 1016 void URLIndexPrivateData::SaveCharWordMap( | 970 void URLIndexPrivateData::SaveCharWordMap( |
| 1017 InMemoryURLIndexCacheItem* cache) const { | 971 InMemoryURLIndexCacheItem* cache) const { |
| 1018 if (char_word_map_.empty()) | 972 if (char_word_map_.empty()) |
| 1019 return; | 973 return; |
| 1020 CharWordMapItem* map_item = cache->mutable_char_word_map(); | 974 CharWordMapItem* map_item = cache->mutable_char_word_map(); |
| 1021 map_item->set_item_count(char_word_map_.size()); | 975 map_item->set_item_count(char_word_map_.size()); |
| 1022 for (CharWordIDMap::const_iterator iter = char_word_map_.begin(); | 976 for (const auto& elem : char_word_map_) { |
| 1023 iter != char_word_map_.end(); ++iter) { | |
| 1024 CharWordMapEntry* map_entry = map_item->add_char_word_map_entry(); | 977 CharWordMapEntry* map_entry = map_item->add_char_word_map_entry(); |
| 1025 map_entry->set_char_16(iter->first); | 978 map_entry->set_char_16(elem.first); |
| 1026 const WordIDSet& word_id_set(iter->second); | 979 const WordIDSet& word_id_set(elem.second); |
| 1027 map_entry->set_item_count(word_id_set.size()); | 980 map_entry->set_item_count(word_id_set.size()); |
| 1028 for (WordIDSet::const_iterator set_iter = word_id_set.begin(); | 981 for (WordID word_id : word_id_set) |
| 1029 set_iter != word_id_set.end(); ++set_iter) | 982 map_entry->add_word_id(word_id); |
| 1030 map_entry->add_word_id(*set_iter); | |
| 1031 } | 983 } |
| 1032 } | 984 } |
| 1033 | 985 |
| 1034 void URLIndexPrivateData::SaveWordIDHistoryMap( | 986 void URLIndexPrivateData::SaveWordIDHistoryMap( |
| 1035 InMemoryURLIndexCacheItem* cache) const { | 987 InMemoryURLIndexCacheItem* cache) const { |
| 1036 if (word_id_history_map_.empty()) | 988 if (word_id_history_map_.empty()) |
| 1037 return; | 989 return; |
| 1038 WordIDHistoryMapItem* map_item = cache->mutable_word_id_history_map(); | 990 WordIDHistoryMapItem* map_item = cache->mutable_word_id_history_map(); |
| 1039 map_item->set_item_count(word_id_history_map_.size()); | 991 map_item->set_item_count(word_id_history_map_.size()); |
| 1040 for (WordIDHistoryMap::const_iterator iter = word_id_history_map_.begin(); | 992 for (const auto& elem : word_id_history_map_) { |
| 1041 iter != word_id_history_map_.end(); ++iter) { | |
| 1042 WordIDHistoryMapEntry* map_entry = | 993 WordIDHistoryMapEntry* map_entry = |
| 1043 map_item->add_word_id_history_map_entry(); | 994 map_item->add_word_id_history_map_entry(); |
| 1044 map_entry->set_word_id(iter->first); | 995 map_entry->set_word_id(elem.first); |
| 1045 const HistoryIDSet& history_id_set(iter->second); | 996 const HistoryIDSet& history_id_set(elem.second); |
| 1046 map_entry->set_item_count(history_id_set.size()); | 997 map_entry->set_item_count(history_id_set.size()); |
| 1047 for (HistoryIDSet::const_iterator set_iter = history_id_set.begin(); | 998 for (HistoryID history_id : history_id_set) |
| 1048 set_iter != history_id_set.end(); ++set_iter) | 999 map_entry->add_history_id(history_id); |
| 1049 map_entry->add_history_id(*set_iter); | |
| 1050 } | 1000 } |
| 1051 } | 1001 } |
| 1052 | 1002 |
| 1053 void URLIndexPrivateData::SaveHistoryInfoMap( | 1003 void URLIndexPrivateData::SaveHistoryInfoMap( |
| 1054 InMemoryURLIndexCacheItem* cache) const { | 1004 InMemoryURLIndexCacheItem* cache) const { |
| 1055 if (history_info_map_.empty()) | 1005 if (history_info_map_.empty()) |
| 1056 return; | 1006 return; |
| 1057 HistoryInfoMapItem* map_item = cache->mutable_history_info_map(); | 1007 HistoryInfoMapItem* map_item = cache->mutable_history_info_map(); |
| 1058 map_item->set_item_count(history_info_map_.size()); | 1008 map_item->set_item_count(history_info_map_.size()); |
| 1059 for (HistoryInfoMap::const_iterator iter = history_info_map_.begin(); | 1009 for (const auto& elem : history_info_map_) { |
| 1060 iter != history_info_map_.end(); ++iter) { | |
| 1061 HistoryInfoMapEntry* map_entry = map_item->add_history_info_map_entry(); | 1010 HistoryInfoMapEntry* map_entry = map_item->add_history_info_map_entry(); |
| 1062 map_entry->set_history_id(iter->first); | 1011 map_entry->set_history_id(elem.first); |
| 1063 const history::URLRow& url_row(iter->second.url_row); | 1012 const history::URLRow& url_row(elem.second.url_row); |
| 1064 // Note: We only save information that contributes to the index so there | 1013 // Note: We only save information that contributes to the index so there |
| 1065 // is no need to save search_term_cache_ (not persistent). | 1014 // is no need to save search_term_cache_ (not persistent). |
| 1066 map_entry->set_visit_count(url_row.visit_count()); | 1015 map_entry->set_visit_count(url_row.visit_count()); |
| 1067 map_entry->set_typed_count(url_row.typed_count()); | 1016 map_entry->set_typed_count(url_row.typed_count()); |
| 1068 map_entry->set_last_visit(url_row.last_visit().ToInternalValue()); | 1017 map_entry->set_last_visit(url_row.last_visit().ToInternalValue()); |
| 1069 map_entry->set_url(url_row.url().spec()); | 1018 map_entry->set_url(url_row.url().spec()); |
| 1070 map_entry->set_title(base::UTF16ToUTF8(url_row.title())); | 1019 map_entry->set_title(base::UTF16ToUTF8(url_row.title())); |
| 1071 const VisitInfoVector& visits(iter->second.visits); | 1020 for (const auto& visit : elem.second.visits) { |
| 1072 for (VisitInfoVector::const_iterator visit_iter = visits.begin(); | |
| 1073 visit_iter != visits.end(); ++visit_iter) { | |
| 1074 HistoryInfoMapEntry_VisitInfo* visit_info = map_entry->add_visits(); | 1021 HistoryInfoMapEntry_VisitInfo* visit_info = map_entry->add_visits(); |
| 1075 visit_info->set_visit_time(visit_iter->first.ToInternalValue()); | 1022 visit_info->set_visit_time(visit.first.ToInternalValue()); |
| 1076 visit_info->set_transition_type(visit_iter->second); | 1023 visit_info->set_transition_type(visit.second); |
| 1077 } | 1024 } |
| 1078 } | 1025 } |
| 1079 } | 1026 } |
| 1080 | 1027 |
| 1081 void URLIndexPrivateData::SaveWordStartsMap( | 1028 void URLIndexPrivateData::SaveWordStartsMap( |
| 1082 InMemoryURLIndexCacheItem* cache) const { | 1029 InMemoryURLIndexCacheItem* cache) const { |
| 1083 if (word_starts_map_.empty()) | 1030 if (word_starts_map_.empty()) |
| 1084 return; | 1031 return; |
| 1085 // For unit testing: Enable saving of the cache as an earlier version to | 1032 // For unit testing: Enable saving of the cache as an earlier version to |
| 1086 // allow testing of cache file upgrading in ReadFromFile(). | 1033 // allow testing of cache file upgrading in ReadFromFile(). |
| 1087 // TODO(mrossetti): Instead of intruding on production code with this kind of | 1034 // TODO(mrossetti): Instead of intruding on production code with this kind of |
| 1088 // test harness, save a copy of an older version cache with known results. | 1035 // test harness, save a copy of an older version cache with known results. |
| 1089 // Implement this when switching the caching over to SQLite. | 1036 // Implement this when switching the caching over to SQLite. |
| 1090 if (saved_cache_version_ < 1) | 1037 if (saved_cache_version_ < 1) |
| 1091 return; | 1038 return; |
| 1092 | 1039 |
| 1093 WordStartsMapItem* map_item = cache->mutable_word_starts_map(); | 1040 WordStartsMapItem* map_item = cache->mutable_word_starts_map(); |
| 1094 map_item->set_item_count(word_starts_map_.size()); | 1041 map_item->set_item_count(word_starts_map_.size()); |
| 1095 for (WordStartsMap::const_iterator iter = word_starts_map_.begin(); | 1042 for (const auto& entrie : word_starts_map_) { |
|
Peter Kasting
2017/02/17 01:33:46
Nit: Did you mean "entry"? (many places)
dyaroshev
2017/02/17 20:17:22
Oops, spelling( Done.
| |
| 1096 iter != word_starts_map_.end(); ++iter) { | |
| 1097 WordStartsMapEntry* map_entry = map_item->add_word_starts_map_entry(); | 1043 WordStartsMapEntry* map_entry = map_item->add_word_starts_map_entry(); |
| 1098 map_entry->set_history_id(iter->first); | 1044 map_entry->set_history_id(entrie.first); |
| 1099 const RowWordStarts& word_starts(iter->second); | 1045 const RowWordStarts& word_starts(entrie.second); |
| 1100 for (WordStarts::const_iterator i = word_starts.url_word_starts_.begin(); | 1046 for (auto url_word_start : word_starts.url_word_starts_) |
| 1101 i != word_starts.url_word_starts_.end(); ++i) | 1047 map_entry->add_url_word_starts(url_word_start); |
| 1102 map_entry->add_url_word_starts(*i); | 1048 for (auto title_word_start : word_starts.title_word_starts_) |
| 1103 for (WordStarts::const_iterator i = word_starts.title_word_starts_.begin(); | 1049 map_entry->add_title_word_starts(title_word_start); |
| 1104 i != word_starts.title_word_starts_.end(); ++i) | |
| 1105 map_entry->add_title_word_starts(*i); | |
| 1106 } | 1050 } |
| 1107 } | 1051 } |
| 1108 | 1052 |
| 1109 bool URLIndexPrivateData::RestorePrivateData( | 1053 bool URLIndexPrivateData::RestorePrivateData( |
| 1110 const InMemoryURLIndexCacheItem& cache) { | 1054 const InMemoryURLIndexCacheItem& cache) { |
| 1111 last_time_rebuilt_from_history_ = | 1055 last_time_rebuilt_from_history_ = |
| 1112 base::Time::FromInternalValue(cache.last_rebuild_timestamp()); | 1056 base::Time::FromInternalValue(cache.last_rebuild_timestamp()); |
| 1113 const base::TimeDelta rebuilt_ago = | 1057 const base::TimeDelta rebuilt_ago = |
| 1114 base::Time::Now() - last_time_rebuilt_from_history_; | 1058 base::Time::Now() - last_time_rebuilt_from_history_; |
| 1115 if ((rebuilt_ago > base::TimeDelta::FromDays(7)) || | 1059 if ((rebuilt_ago > base::TimeDelta::FromDays(7)) || |
| (...skipping 22 matching lines...) Expand all Loading... | |
| 1138 bool URLIndexPrivateData::RestoreWordList( | 1082 bool URLIndexPrivateData::RestoreWordList( |
| 1139 const InMemoryURLIndexCacheItem& cache) { | 1083 const InMemoryURLIndexCacheItem& cache) { |
| 1140 if (!cache.has_word_list()) | 1084 if (!cache.has_word_list()) |
| 1141 return false; | 1085 return false; |
| 1142 const WordListItem& list_item(cache.word_list()); | 1086 const WordListItem& list_item(cache.word_list()); |
| 1143 uint32_t expected_item_count = list_item.word_count(); | 1087 uint32_t expected_item_count = list_item.word_count(); |
| 1144 uint32_t actual_item_count = list_item.word_size(); | 1088 uint32_t actual_item_count = list_item.word_size(); |
| 1145 if (actual_item_count == 0 || actual_item_count != expected_item_count) | 1089 if (actual_item_count == 0 || actual_item_count != expected_item_count) |
| 1146 return false; | 1090 return false; |
| 1147 const RepeatedPtrField<std::string>& words(list_item.word()); | 1091 const RepeatedPtrField<std::string>& words(list_item.word()); |
| 1148 for (RepeatedPtrField<std::string>::const_iterator iter = words.begin(); | 1092 word_list_.reserve(words.size()); |
| 1149 iter != words.end(); ++iter) | 1093 std::transform( |
| 1150 word_list_.push_back(base::UTF8ToUTF16(*iter)); | 1094 words.begin(), words.end(), std::back_inserter(word_list_), |
| 1095 [](const std::string& word) { return base::UTF8ToUTF16(word); }); | |
| 1151 return true; | 1096 return true; |
| 1152 } | 1097 } |
| 1153 | 1098 |
| 1154 bool URLIndexPrivateData::RestoreWordMap( | 1099 bool URLIndexPrivateData::RestoreWordMap( |
| 1155 const InMemoryURLIndexCacheItem& cache) { | 1100 const InMemoryURLIndexCacheItem& cache) { |
| 1156 if (!cache.has_word_map()) | 1101 if (!cache.has_word_map()) |
| 1157 return false; | 1102 return false; |
| 1158 const WordMapItem& list_item(cache.word_map()); | 1103 const WordMapItem& list_item(cache.word_map()); |
| 1159 uint32_t expected_item_count = list_item.item_count(); | 1104 uint32_t expected_item_count = list_item.item_count(); |
| 1160 uint32_t actual_item_count = list_item.word_map_entry_size(); | 1105 uint32_t actual_item_count = list_item.word_map_entry_size(); |
| 1161 if (actual_item_count == 0 || actual_item_count != expected_item_count) | 1106 if (actual_item_count == 0 || actual_item_count != expected_item_count) |
| 1162 return false; | 1107 return false; |
| 1163 const RepeatedPtrField<WordMapEntry>& entries(list_item.word_map_entry()); | 1108 for (const auto& entrie : list_item.word_map_entry()) |
| 1164 for (RepeatedPtrField<WordMapEntry>::const_iterator iter = entries.begin(); | 1109 word_map_[base::UTF8ToUTF16(entrie.word())] = entrie.word_id(); |
| 1165 iter != entries.end(); ++iter) | 1110 |
| 1166 word_map_[base::UTF8ToUTF16(iter->word())] = iter->word_id(); | |
| 1167 return true; | 1111 return true; |
| 1168 } | 1112 } |
| 1169 | 1113 |
| 1170 bool URLIndexPrivateData::RestoreCharWordMap( | 1114 bool URLIndexPrivateData::RestoreCharWordMap( |
| 1171 const InMemoryURLIndexCacheItem& cache) { | 1115 const InMemoryURLIndexCacheItem& cache) { |
| 1172 if (!cache.has_char_word_map()) | 1116 if (!cache.has_char_word_map()) |
| 1173 return false; | 1117 return false; |
| 1174 const CharWordMapItem& list_item(cache.char_word_map()); | 1118 const CharWordMapItem& list_item(cache.char_word_map()); |
| 1175 uint32_t expected_item_count = list_item.item_count(); | 1119 uint32_t expected_item_count = list_item.item_count(); |
| 1176 uint32_t actual_item_count = list_item.char_word_map_entry_size(); | 1120 uint32_t actual_item_count = list_item.char_word_map_entry_size(); |
| 1177 if (actual_item_count == 0 || actual_item_count != expected_item_count) | 1121 if (actual_item_count == 0 || actual_item_count != expected_item_count) |
| 1178 return false; | 1122 return false; |
| 1179 const RepeatedPtrField<CharWordMapEntry>& | 1123 |
| 1180 entries(list_item.char_word_map_entry()); | 1124 for (const auto& entrie : list_item.char_word_map_entry()) { |
| 1181 for (RepeatedPtrField<CharWordMapEntry>::const_iterator iter = | 1125 expected_item_count = entrie.item_count(); |
| 1182 entries.begin(); iter != entries.end(); ++iter) { | 1126 actual_item_count = entrie.word_id_size(); |
| 1183 expected_item_count = iter->item_count(); | |
| 1184 actual_item_count = iter->word_id_size(); | |
| 1185 if (actual_item_count == 0 || actual_item_count != expected_item_count) | 1127 if (actual_item_count == 0 || actual_item_count != expected_item_count) |
| 1186 return false; | 1128 return false; |
| 1187 base::char16 uni_char = static_cast<base::char16>(iter->char_16()); | 1129 base::char16 uni_char = static_cast<base::char16>(entrie.char_16()); |
| 1188 WordIDSet word_id_set; | 1130 const RepeatedField<int32_t>& word_ids(entrie.word_id()); |
| 1189 const RepeatedField<int32_t>& word_ids(iter->word_id()); | 1131 char_word_map_[uni_char] = {word_ids.begin(), word_ids.end()}; |
| 1190 for (RepeatedField<int32_t>::const_iterator jiter = word_ids.begin(); | |
| 1191 jiter != word_ids.end(); ++jiter) | |
| 1192 word_id_set.insert(*jiter); | |
| 1193 char_word_map_[uni_char] = word_id_set; | |
| 1194 } | 1132 } |
| 1195 return true; | 1133 return true; |
| 1196 } | 1134 } |
| 1197 | 1135 |
| 1198 bool URLIndexPrivateData::RestoreWordIDHistoryMap( | 1136 bool URLIndexPrivateData::RestoreWordIDHistoryMap( |
| 1199 const InMemoryURLIndexCacheItem& cache) { | 1137 const InMemoryURLIndexCacheItem& cache) { |
| 1200 if (!cache.has_word_id_history_map()) | 1138 if (!cache.has_word_id_history_map()) |
| 1201 return false; | 1139 return false; |
| 1202 const WordIDHistoryMapItem& list_item(cache.word_id_history_map()); | 1140 const WordIDHistoryMapItem& list_item(cache.word_id_history_map()); |
| 1203 uint32_t expected_item_count = list_item.item_count(); | 1141 uint32_t expected_item_count = list_item.item_count(); |
| 1204 uint32_t actual_item_count = list_item.word_id_history_map_entry_size(); | 1142 uint32_t actual_item_count = list_item.word_id_history_map_entry_size(); |
| 1205 if (actual_item_count == 0 || actual_item_count != expected_item_count) | 1143 if (actual_item_count == 0 || actual_item_count != expected_item_count) |
| 1206 return false; | 1144 return false; |
| 1207 const RepeatedPtrField<WordIDHistoryMapEntry>& | 1145 for (const auto& entrie : list_item.word_id_history_map_entry()) { |
| 1208 entries(list_item.word_id_history_map_entry()); | 1146 expected_item_count = entrie.item_count(); |
| 1209 for (RepeatedPtrField<WordIDHistoryMapEntry>::const_iterator iter = | 1147 actual_item_count = entrie.history_id_size(); |
| 1210 entries.begin(); iter != entries.end(); ++iter) { | |
| 1211 expected_item_count = iter->item_count(); | |
| 1212 actual_item_count = iter->history_id_size(); | |
| 1213 if (actual_item_count == 0 || actual_item_count != expected_item_count) | 1148 if (actual_item_count == 0 || actual_item_count != expected_item_count) |
| 1214 return false; | 1149 return false; |
| 1215 WordID word_id = iter->word_id(); | 1150 WordID word_id = entrie.word_id(); |
| 1216 HistoryIDSet history_id_set; | 1151 const RepeatedField<int64_t>& history_ids(entrie.history_id()); |
| 1217 const RepeatedField<int64_t>& history_ids(iter->history_id()); | 1152 word_id_history_map_[word_id] = {history_ids.begin(), history_ids.end()}; |
| 1218 for (RepeatedField<int64_t>::const_iterator jiter = history_ids.begin(); | 1153 for (HistoryID history_id : history_ids) |
| 1219 jiter != history_ids.end(); ++jiter) { | 1154 AddToHistoryIDWordMap(history_id, word_id); |
| 1220 history_id_set.insert(*jiter); | |
| 1221 AddToHistoryIDWordMap(*jiter, word_id); | |
| 1222 } | |
| 1223 word_id_history_map_[word_id] = history_id_set; | |
| 1224 } | 1155 } |
| 1225 return true; | 1156 return true; |
| 1226 } | 1157 } |
| 1227 | 1158 |
| 1228 bool URLIndexPrivateData::RestoreHistoryInfoMap( | 1159 bool URLIndexPrivateData::RestoreHistoryInfoMap( |
| 1229 const InMemoryURLIndexCacheItem& cache) { | 1160 const InMemoryURLIndexCacheItem& cache) { |
| 1230 if (!cache.has_history_info_map()) | 1161 if (!cache.has_history_info_map()) |
| 1231 return false; | 1162 return false; |
| 1232 const HistoryInfoMapItem& list_item(cache.history_info_map()); | 1163 const HistoryInfoMapItem& list_item(cache.history_info_map()); |
| 1233 uint32_t expected_item_count = list_item.item_count(); | 1164 uint32_t expected_item_count = list_item.item_count(); |
| 1234 uint32_t actual_item_count = list_item.history_info_map_entry_size(); | 1165 uint32_t actual_item_count = list_item.history_info_map_entry_size(); |
| 1235 if (actual_item_count == 0 || actual_item_count != expected_item_count) | 1166 if (actual_item_count == 0 || actual_item_count != expected_item_count) |
| 1236 return false; | 1167 return false; |
| 1237 const RepeatedPtrField<HistoryInfoMapEntry>& | 1168 |
| 1238 entries(list_item.history_info_map_entry()); | 1169 for (const auto& entrie : list_item.history_info_map_entry()) { |
| 1239 for (RepeatedPtrField<HistoryInfoMapEntry>::const_iterator iter = | 1170 HistoryID history_id = entrie.history_id(); |
| 1240 entries.begin(); iter != entries.end(); ++iter) { | 1171 GURL url(entrie.url()); |
|
Peter Kasting
2017/02/17 01:33:46
Nit: Prefer = to () here
dyaroshev
2017/02/17 20:17:22
Removed the temporary altogether.
| |
| 1241 HistoryID history_id = iter->history_id(); | |
| 1242 GURL url(iter->url()); | |
| 1243 history::URLRow url_row(url, history_id); | 1172 history::URLRow url_row(url, history_id); |
| 1244 url_row.set_visit_count(iter->visit_count()); | 1173 url_row.set_visit_count(entrie.visit_count()); |
| 1245 url_row.set_typed_count(iter->typed_count()); | 1174 url_row.set_typed_count(entrie.typed_count()); |
| 1246 url_row.set_last_visit(base::Time::FromInternalValue(iter->last_visit())); | 1175 url_row.set_last_visit(base::Time::FromInternalValue(entrie.last_visit())); |
| 1247 if (iter->has_title()) { | 1176 if (entrie.has_title()) { |
|
Peter Kasting
2017/02/17 01:33:46
Nit: No {}
dyaroshev
2017/02/17 20:17:22
Done.
| |
| 1248 base::string16 title(base::UTF8ToUTF16(iter->title())); | 1177 url_row.set_title(base::UTF8ToUTF16(entrie.title())); |
| 1249 url_row.set_title(title); | |
| 1250 } | 1178 } |
| 1251 history_info_map_[history_id].url_row = url_row; | 1179 history_info_map_[history_id].url_row = std::move(url_row); |
| 1252 | 1180 |
| 1253 // Restore visits list. | 1181 // Restore visits list. |
| 1254 VisitInfoVector visits; | 1182 VisitInfoVector visits; |
| 1255 visits.reserve(iter->visits_size()); | 1183 visits.reserve(entrie.visits_size()); |
| 1256 for (int i = 0; i < iter->visits_size(); ++i) { | 1184 for (const auto& entrie_visit : entrie.visits()) { |
| 1257 visits.push_back(std::make_pair( | 1185 visits.emplace_back( |
| 1258 base::Time::FromInternalValue(iter->visits(i).visit_time()), | 1186 base::Time::FromInternalValue(entrie_visit.visit_time()), |
| 1259 ui::PageTransitionFromInt(iter->visits(i).transition_type()))); | 1187 ui::PageTransitionFromInt(entrie_visit.transition_type())); |
| 1260 } | 1188 } |
| 1261 history_info_map_[history_id].visits = visits; | 1189 history_info_map_[history_id].visits = std::move(visits); |
| 1262 } | 1190 } |
| 1263 return true; | 1191 return true; |
| 1264 } | 1192 } |
| 1265 | 1193 |
| 1266 bool URLIndexPrivateData::RestoreWordStartsMap( | 1194 bool URLIndexPrivateData::RestoreWordStartsMap( |
| 1267 const InMemoryURLIndexCacheItem& cache) { | 1195 const InMemoryURLIndexCacheItem& cache) { |
| 1268 // Note that this function must be called after RestoreHistoryInfoMap() has | 1196 // Note that this function must be called after RestoreHistoryInfoMap() has |
| 1269 // been run as the word starts may have to be recalculated from the urls and | 1197 // been run as the word starts may have to be recalculated from the urls and |
| 1270 // page titles. | 1198 // page titles. |
| 1271 if (cache.has_word_starts_map()) { | 1199 if (cache.has_word_starts_map()) { |
| 1272 const WordStartsMapItem& list_item(cache.word_starts_map()); | 1200 const WordStartsMapItem& list_item(cache.word_starts_map()); |
| 1273 uint32_t expected_item_count = list_item.item_count(); | 1201 uint32_t expected_item_count = list_item.item_count(); |
| 1274 uint32_t actual_item_count = list_item.word_starts_map_entry_size(); | 1202 uint32_t actual_item_count = list_item.word_starts_map_entry_size(); |
| 1275 if (actual_item_count == 0 || actual_item_count != expected_item_count) | 1203 if (actual_item_count == 0 || actual_item_count != expected_item_count) |
| 1276 return false; | 1204 return false; |
| 1277 const RepeatedPtrField<WordStartsMapEntry>& | 1205 const RepeatedPtrField<WordStartsMapEntry>& |
| 1278 entries(list_item.word_starts_map_entry()); | 1206 entries(list_item.word_starts_map_entry()); |
| 1279 for (RepeatedPtrField<WordStartsMapEntry>::const_iterator iter = | 1207 for (const auto& entrie : entries) { |
| 1280 entries.begin(); iter != entries.end(); ++iter) { | 1208 HistoryID history_id = entrie.history_id(); |
| 1281 HistoryID history_id = iter->history_id(); | |
| 1282 RowWordStarts word_starts; | 1209 RowWordStarts word_starts; |
| 1283 // Restore the URL word starts. | 1210 // Restore the URL word starts. |
| 1284 const RepeatedField<int32_t>& url_starts(iter->url_word_starts()); | 1211 const RepeatedField<int32_t>& url_starts(entrie.url_word_starts()); |
| 1285 for (RepeatedField<int32_t>::const_iterator jiter = url_starts.begin(); | 1212 word_starts.url_word_starts_ = {url_starts.begin(), url_starts.end()}; |
| 1286 jiter != url_starts.end(); ++jiter) | 1213 |
| 1287 word_starts.url_word_starts_.push_back(*jiter); | |
| 1288 // Restore the page title word starts. | 1214 // Restore the page title word starts. |
| 1289 const RepeatedField<int32_t>& title_starts(iter->title_word_starts()); | 1215 const RepeatedField<int32_t>& title_starts(entrie.title_word_starts()); |
| 1290 for (RepeatedField<int32_t>::const_iterator jiter = title_starts.begin(); | 1216 word_starts.title_word_starts_ = {title_starts.begin(), |
| 1291 jiter != title_starts.end(); ++jiter) | 1217 title_starts.end()}; |
| 1292 word_starts.title_word_starts_.push_back(*jiter); | 1218 |
| 1293 word_starts_map_[history_id] = word_starts; | 1219 word_starts_map_[history_id] = std::move(word_starts); |
| 1294 } | 1220 } |
| 1295 } else { | 1221 } else { |
| 1296 // Since the cache did not contain any word starts we must rebuild them from | 1222 // Since the cache did not contain any word starts we must rebuild them from |
| 1297 // the URL and page titles. | 1223 // the URL and page titles. |
| 1298 for (HistoryInfoMap::const_iterator iter = history_info_map_.begin(); | 1224 for (const auto& entrie : history_info_map_) { |
| 1299 iter != history_info_map_.end(); ++iter) { | |
| 1300 RowWordStarts word_starts; | 1225 RowWordStarts word_starts; |
| 1301 const history::URLRow& row(iter->second.url_row); | 1226 const history::URLRow& row(entrie.second.url_row); |
| 1302 const base::string16& url = | 1227 const base::string16& url = |
| 1303 bookmarks::CleanUpUrlForMatching(row.url(), nullptr); | 1228 bookmarks::CleanUpUrlForMatching(row.url(), nullptr); |
| 1304 String16VectorFromString16(url, false, &word_starts.url_word_starts_); | 1229 String16VectorFromString16(url, false, &word_starts.url_word_starts_); |
| 1305 const base::string16& title = | 1230 const base::string16& title = |
| 1306 bookmarks::CleanUpTitleForMatching(row.title()); | 1231 bookmarks::CleanUpTitleForMatching(row.title()); |
| 1307 String16VectorFromString16(title, false, &word_starts.title_word_starts_); | 1232 String16VectorFromString16(title, false, &word_starts.title_word_starts_); |
| 1308 word_starts_map_[iter->first] = word_starts; | 1233 word_starts_map_[entrie.first] = std::move(word_starts); |
| 1309 } | 1234 } |
| 1310 } | 1235 } |
| 1311 return true; | 1236 return true; |
| 1312 } | 1237 } |
| 1313 | 1238 |
| 1314 // static | 1239 // static |
| 1315 bool URLIndexPrivateData::URLSchemeIsWhitelisted( | 1240 bool URLIndexPrivateData::URLSchemeIsWhitelisted( |
| 1316 const GURL& gurl, | 1241 const GURL& gurl, |
| 1317 const std::set<std::string>& whitelist) { | 1242 const std::set<std::string>& whitelist) { |
| 1318 return whitelist.find(gurl.scheme()) != whitelist.end(); | 1243 return whitelist.find(gurl.scheme()) != whitelist.end(); |
| (...skipping 62 matching lines...) Expand 10 before | Expand all | Expand 10 after Loading... | |
| 1381 // First cut: typed count, visit count, recency. | 1306 // First cut: typed count, visit count, recency. |
| 1382 // TODO(mrossetti): This is too simplistic. Consider an approach which ranks | 1307 // TODO(mrossetti): This is too simplistic. Consider an approach which ranks |
| 1383 // recently visited (within the last 12/24 hours) as highly important. Get | 1308 // recently visited (within the last 12/24 hours) as highly important. Get |
| 1384 // input from mpearson. | 1309 // input from mpearson. |
| 1385 if (r1.typed_count() != r2.typed_count()) | 1310 if (r1.typed_count() != r2.typed_count()) |
| 1386 return (r1.typed_count() > r2.typed_count()); | 1311 return (r1.typed_count() > r2.typed_count()); |
| 1387 if (r1.visit_count() != r2.visit_count()) | 1312 if (r1.visit_count() != r2.visit_count()) |
| 1388 return (r1.visit_count() > r2.visit_count()); | 1313 return (r1.visit_count() > r2.visit_count()); |
| 1389 return (r1.last_visit() > r2.last_visit()); | 1314 return (r1.last_visit() > r2.last_visit()); |
| 1390 } | 1315 } |
| OLD | NEW |