|
|
Created:
10 years, 4 months ago by mrossetti Modified:
9 years, 7 months ago CC:
chromium-reviews, ben+cc_chromium.org, brettw-cc_chromium.org, Paweł Hajdan Jr. Base URL:
svn://svn.chromium.org/chrome/trunk/src/ Visibility:
Public. |
DescriptionNext step integrating the HistoryQuickProvider: Implement index initialization and population and provide unit test with test data.
BUG=None
TEST=None
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=56962
Patch Set 1 #Patch Set 2 : '' #Patch Set 3 : '' #Patch Set 4 : '' #Patch Set 5 : '' #Patch Set 6 : '' #Patch Set 7 : '' #Patch Set 8 : '' #Patch Set 9 : '' #
Total comments: 42
Patch Set 10 : '' #Patch Set 11 : '' #
Total comments: 38
Patch Set 12 : '' #
Total comments: 12
Patch Set 13 : '' #Patch Set 14 : '' #
Messages
Total messages: 13 (0 generated)
The winbot and linuxbot failures do not appear to be related to this CL, but I will kick off another round to see if they were intermittent and can be cleared.
The test setup for linux fails to find sqlite3. Researching.
'snprintf' does not appear to be available on win. Researching.
This is now ready for review.
http://codereview.chromium.org/3138006/diff/8002/6006 File chrome/browser/history/in_memory_url_index.cc (right): http://codereview.chromium.org/3138006/diff/8002/6006#newcode9 chrome/browser/history/in_memory_url_index.cc:9: #include <queue> What is this include for? http://codereview.chromium.org/3138006/diff/8002/6006#newcode11 chrome/browser/history/in_memory_url_index.cc:11: #include "app/sql/statement.h" Is this still needed? http://codereview.chromium.org/3138006/diff/8002/6006#newcode20 chrome/browser/history/in_memory_url_index.cc:20: using std::string; I don't see too many uses of "using std::string" in the codebase. Usually we explicitly say std::string everywhere. http://codereview.chromium.org/3138006/diff/8002/6006#newcode61 chrome/browser/history/in_memory_url_index.cc:61: #if 0 I would prefer to remove the #if 0'd code and simply have a TODO saying what needs to be done. http://codereview.chromium.org/3138006/diff/8002/6006#newcode73 chrome/browser/history/in_memory_url_index.cc:73: transform(url.begin(), url.end(), url.begin(), tolower); These three lines are duplicated below. http://codereview.chromium.org/3138006/diff/8002/6006#newcode84 chrome/browser/history/in_memory_url_index.cc:84: transform(url.begin(), url.end(), url.begin(), tolower); How will this work for non-ascii urls? http://codereview.chromium.org/3138006/diff/8002/6006#newcode103 chrome/browser/history/in_memory_url_index.cc:103: string16 uni_string) { Can this be string16&? http://codereview.chromium.org/3138006/diff/8002/6006#newcode118 chrome/browser/history/in_memory_url_index.cc:118: return words; I don't have a good sense for how expensive set copies are. Should we be passing in a set* and populating that instead? http://codereview.chromium.org/3138006/diff/8002/6006#newcode122 chrome/browser/history/in_memory_url_index.cc:122: string16 uni_word) { Can this be string16&? http://codereview.chromium.org/3138006/diff/8002/6006#newcode124 chrome/browser/history/in_memory_url_index.cc:124: for (string16::const_iterator iter = uni_word.begin(); How does this work with multi-byte strings? http://codereview.chromium.org/3138006/diff/8002/6006#newcode137 chrome/browser/history/in_memory_url_index.cc:137: word_id_history_map_.find(word_id); Indent 4 spaces. http://codereview.chromium.org/3138006/diff/8002/6006#newcode157 chrome/browser/history/in_memory_url_index.cc:157: InMemoryURLIndex::CharactersFromString16(uni_word); Do you need to specify InMemoryURLIndex::? http://codereview.chromium.org/3138006/diff/8002/6007 File chrome/browser/history/in_memory_url_index.h (right): http://codereview.chromium.org/3138006/diff/8002/6007#newcode57 chrome/browser/history/in_memory_url_index.h:57: static const int kLowQualityMatchAgeLimitInDays = 3; These constants and RecentThreshold() are also in history_types.h. http://codereview.chromium.org/3138006/diff/8002/6007#newcode64 chrome/browser/history/in_memory_url_index.h:64: friend class InMemoryURLIndexTest_Initialization_Test; // unit-test. I prefer to make the test fixture a friend and write fixture methods that expose whatever is needed. If you don't want to do that, gtest has a FRIEND_TEST macro. http://codereview.chromium.org/3138006/diff/8002/6007#newcode78 chrome/browser/history/in_memory_url_index.h:78: typedef std::map<char16, WordIDSet> CharWordIDMap; How will this work with multi-byte characters? http://codereview.chromium.org/3138006/diff/8002/6007#newcode83 chrome/browser/history/in_memory_url_index.h:83: typedef URLID HistoryID; // uint16 might actually work for us. Remove this comment (it's covered by the TODO). http://codereview.chromium.org/3138006/diff/8002/6007#newcode102 chrome/browser/history/in_memory_url_index.h:102: // TODO(mrossetti): Rip out either the SQLite filtering or the in-memory Does this TODO still apply? http://codereview.chromium.org/3138006/diff/8002/6008 File chrome/browser/history/in_memory_url_index_unittest.cc (right): http://codereview.chromium.org/3138006/diff/8002/6008#newcode44 chrome/browser/history/in_memory_url_index_unittest.cc:44: public URLDatabase { I don't like having the test fixture inherit from URLDatabase. Do we do this anywhere else in Chromium? http://codereview.chromium.org/3138006/diff/8002/6008#newcode56 chrome/browser/history/in_memory_url_index_unittest.cc:56: FILE_PATH_LITERAL("url_history_provider_test.db.txt")); Why not AppendASCII here also? http://codereview.chromium.org/3138006/diff/8002/6008#newcode60 chrome/browser/history/in_memory_url_index_unittest.cc:60: std::ifstream proto_file(history_proto_path.value().c_str()); Is there any existing code for populating a DB from a file? It seems silly to have to write this ourselves. http://codereview.chromium.org/3138006/diff/8002/6009 File chrome/test/data/History/url_history_provider_test.db.txt (right): http://codereview.chromium.org/3138006/diff/8002/6009#newcode17 chrome/test/data/History/url_history_provider_test.db.txt:17: INSERT INTO "urls" VALUES(1,'http://www.reuters.com/article/idUSN0839880620100708','UPDATE 1-US 30-yr mortgage rate drops to new record low | Reuters',0,29,1,3,2); Do we have to worry about using real/external urls and titles?
Excellent review comments! Thanks! All comments addressed. http://codereview.chromium.org/3138006/diff/8002/6006 File chrome/browser/history/in_memory_url_index.cc (right): http://codereview.chromium.org/3138006/diff/8002/6006#newcode9 chrome/browser/history/in_memory_url_index.cc:9: #include <queue> On 2010/08/17 19:53:20, rohitrao wrote: > What is this include for? Done. http://codereview.chromium.org/3138006/diff/8002/6006#newcode11 chrome/browser/history/in_memory_url_index.cc:11: #include "app/sql/statement.h" On 2010/08/17 19:53:20, rohitrao wrote: > Is this still needed? Done. http://codereview.chromium.org/3138006/diff/8002/6006#newcode20 chrome/browser/history/in_memory_url_index.cc:20: using std::string; On 2010/08/17 19:53:20, rohitrao wrote: > I don't see too many uses of "using std::string" in the codebase. Usually we > explicitly say std::string everywhere. Done. http://codereview.chromium.org/3138006/diff/8002/6006#newcode61 chrome/browser/history/in_memory_url_index.cc:61: #if 0 On 2010/08/17 19:53:20, rohitrao wrote: > I would prefer to remove the #if 0'd code and simply have a TODO saying what > needs to be done. Done. http://codereview.chromium.org/3138006/diff/8002/6006#newcode73 chrome/browser/history/in_memory_url_index.cc:73: transform(url.begin(), url.end(), url.begin(), tolower); On 2010/08/17 19:53:20, rohitrao wrote: > These three lines are duplicated below. Done. http://codereview.chromium.org/3138006/diff/8002/6006#newcode84 chrome/browser/history/in_memory_url_index.cc:84: transform(url.begin(), url.end(), url.begin(), tolower); On 2010/08/17 19:53:20, rohitrao wrote: > How will this work for non-ascii urls? Replaced with l10n_util::ToLower(), which will take into consideration multi-byte characters. http://codereview.chromium.org/3138006/diff/8002/6006#newcode103 chrome/browser/history/in_memory_url_index.cc:103: string16 uni_string) { On 2010/08/17 19:53:20, rohitrao wrote: > Can this be string16&? Done. It's okay in this situation (the incoming string is tolower'ed) but to be safe I added a note to the prototype. http://codereview.chromium.org/3138006/diff/8002/6006#newcode118 chrome/browser/history/in_memory_url_index.cc:118: return words; On 2010/08/17 19:53:20, rohitrao wrote: > I don't have a good sense for how expensive set copies are. Should we be > passing in a set* and populating that instead? I believe the compiler is smart enough to optimize this so that no copy is required. http://codereview.chromium.org/3138006/diff/8002/6006#newcode122 chrome/browser/history/in_memory_url_index.cc:122: string16 uni_word) { On 2010/08/17 19:53:20, rohitrao wrote: > Can this be string16&? Yep! Made it a string const&. http://codereview.chromium.org/3138006/diff/8002/6006#newcode124 chrome/browser/history/in_memory_url_index.cc:124: for (string16::const_iterator iter = uni_word.begin(); On 2010/08/17 19:53:20, rohitrao wrote: > How does this work with multi-byte strings? Multi-byte-edness does not matter except when lower-casing these strings. I've added a note to the header discussing this topic. http://codereview.chromium.org/3138006/diff/8002/6006#newcode137 chrome/browser/history/in_memory_url_index.cc:137: word_id_history_map_.find(word_id); On 2010/08/17 19:53:20, rohitrao wrote: > Indent 4 spaces. Done. http://codereview.chromium.org/3138006/diff/8002/6006#newcode157 chrome/browser/history/in_memory_url_index.cc:157: InMemoryURLIndex::CharactersFromString16(uni_word); On 2010/08/17 19:53:20, rohitrao wrote: > Do you need to specify InMemoryURLIndex::? Done. http://codereview.chromium.org/3138006/diff/8002/6007 File chrome/browser/history/in_memory_url_index.h (right): http://codereview.chromium.org/3138006/diff/8002/6007#newcode57 chrome/browser/history/in_memory_url_index.h:57: static const int kLowQualityMatchAgeLimitInDays = 3; On 2010/08/17 19:53:20, rohitrao wrote: > These constants and RecentThreshold() are also in history_types.h. Oops. http://codereview.chromium.org/3138006/diff/8002/6007#newcode64 chrome/browser/history/in_memory_url_index.h:64: friend class InMemoryURLIndexTest_Initialization_Test; // unit-test. On 2010/08/17 19:53:20, rohitrao wrote: > I prefer to make the test fixture a friend and write fixture methods that expose > whatever is needed. If you don't want to do that, gtest has a FRIEND_TEST > macro. Done. http://codereview.chromium.org/3138006/diff/8002/6007#newcode78 chrome/browser/history/in_memory_url_index.h:78: typedef std::map<char16, WordIDSet> CharWordIDMap; On 2010/08/17 19:53:20, rohitrao wrote: > How will this work with multi-byte characters? Multi-byte-edness does not matter at this level. I've added some comments at the top of the header about this. http://codereview.chromium.org/3138006/diff/8002/6007#newcode83 chrome/browser/history/in_memory_url_index.h:83: typedef URLID HistoryID; // uint16 might actually work for us. On 2010/08/17 19:53:20, rohitrao wrote: > Remove this comment (it's covered by the TODO). Done. http://codereview.chromium.org/3138006/diff/8002/6007#newcode102 chrome/browser/history/in_memory_url_index.h:102: // TODO(mrossetti): Rip out either the SQLite filtering or the in-memory On 2010/08/17 19:53:20, rohitrao wrote: > Does this TODO still apply? Done. http://codereview.chromium.org/3138006/diff/8002/6008 File chrome/browser/history/in_memory_url_index_unittest.cc (right): http://codereview.chromium.org/3138006/diff/8002/6008#newcode44 chrome/browser/history/in_memory_url_index_unittest.cc:44: public URLDatabase { On 2010/08/17 19:53:20, rohitrao wrote: > I don't like having the test fixture inherit from URLDatabase. Do we do this > anywhere else in Chromium? Yes. This pattern is used frequently for delegate and notification tests, etc., but I was following the examples found in starred_url_database_test.cc and url_database_unittest.cc. http://codereview.chromium.org/3138006/diff/8002/6008#newcode56 chrome/browser/history/in_memory_url_index_unittest.cc:56: FILE_PATH_LITERAL("url_history_provider_test.db.txt")); On 2010/08/17 19:53:20, rohitrao wrote: > Why not AppendASCII here also? Revised. Now using FILE_PATH_LITERAL instead of AppendASCII. http://codereview.chromium.org/3138006/diff/8002/6008#newcode60 chrome/browser/history/in_memory_url_index_unittest.cc:60: std::ifstream proto_file(history_proto_path.value().c_str()); On 2010/08/17 19:53:20, rohitrao wrote: > Is there any existing code for populating a DB from a file? It seems silly to > have to write this ourselves. I couldn't find anything. Perhaps I could move this into sql/connection.h/.cc. Call it 'bool Restore(const FilePath& path)'. Opinion? http://codereview.chromium.org/3138006/diff/8002/6009 File chrome/test/data/History/url_history_provider_test.db.txt (right): http://codereview.chromium.org/3138006/diff/8002/6009#newcode17 chrome/test/data/History/url_history_provider_test.db.txt:17: INSERT INTO "urls" VALUES(1,'http://www.reuters.com/article/idUSN0839880620100708','UPDATE 1-US 30-yr mortgage rate drops to new record low | Reuters',0,29,1,3,2); On 2010/08/17 19:53:20, rohitrao wrote: > Do we have to worry about using real/external urls and titles? I tried to be careful about using anything that might have any user-identifiable information. I don't think I missed any. I checked with markmentovai and he approved this type of URL.
Made some minor cleanup changes.
+ pkasting to review
I skimmed the algorithms, and mostly reviewed for style. http://codereview.chromium.org/3138006/diff/47001/33005 File chrome/browser/history/in_memory_url_index.cc (right): http://codereview.chromium.org/3138006/diff/47001/33005#newcode19 chrome/browser/history/in_memory_url_index.cc:19: Nit: One blank line http://codereview.chromium.org/3138006/diff/47001/33005#newcode27 chrome/browser/history/in_memory_url_index.cc:27: #define TRIM_PARAMETERS 1 This #define doesn't look to be used anywhere. Also, if you do need it, consider using "static const bool" instead of a preprocessor define. http://codereview.chromium.org/3138006/diff/47001/33005#newcode29 chrome/browser/history/in_memory_url_index.cc:29: static const int32_t kURLToLowerBufferSize = 2048; Also not used? http://codereview.chromium.org/3138006/diff/47001/33005#newcode38 chrome/browser/history/in_memory_url_index.cc:38: if (history_db) { Nit: Instead of nesting and |success|, use early-returns of "false". And the while() loop could be: while (history_enum.GetNextURL(&row)) { if (... && !IndexRow(row)) return false; } return true; http://codereview.chromium.org/3138006/diff/47001/33005#newcode62 chrome/browser/history/in_memory_url_index.cc:62: // TODO(mrossetti): Find or implement a ConvertPercentEncoding and use it You mean unescaping? We have unescaping code; talk to brettw about what you want. http://codereview.chromium.org/3138006/diff/47001/33005#newcode134 chrome/browser/history/in_memory_url_index.cc:134: } else { Nit: Use early returns in this function to avoid most of the nesting. For error handling, if a case really should not be able to occur, then just DCHECK (or CHECK if continuing would be a security hole) and continue without handling: DCHECK_NE(word_id_history_map_.end(), history_pos); ... (see Chromium style guide on motivation) http://codereview.chromium.org/3138006/diff/47001/33005#newcode178 chrome/browser/history/in_memory_url_index.cc:178: return recent_threshold; Nit: Just return the calculation directly and avoid the temp http://codereview.chromium.org/3138006/diff/47001/33006 File chrome/browser/history/in_memory_url_index.h (right): http://codereview.chromium.org/3138006/diff/47001/33006#newcode21 chrome/browser/history/in_memory_url_index.h:21: namespace base { class Time; } // Forward declaration of base::Time. Nit: I'd move the "class Time" to a separate line and drop the comment (which is redundant with the code). http://codereview.chromium.org/3138006/diff/47001/33006#newcode49 chrome/browser/history/in_memory_url_index.h:49: ~InMemoryURLIndex() {} Nit: Please move the constructor and destructor impls into the C++ file (see erg's recent emails regarding inlining). http://codereview.chromium.org/3138006/diff/47001/33006#newcode60 chrome/browser/history/in_memory_url_index.h:60: // Given a vector containing one or more words as String16's, scan the Nit: String16's -> string16s http://codereview.chromium.org/3138006/diff/47001/33006#newcode63 chrome/browser/history/in_memory_url_index.h:63: // qualify, however, the terms do not necessarily have to be adjacent. Mechanics Nazi nit: "qualify," -> "qualify;" http://codereview.chromium.org/3138006/diff/47001/33006#newcode70 chrome/browser/history/in_memory_url_index.h:70: friend class InMemoryURLIndexTest; // unit-test. Nit: Omit this comment (obviousness) http://codereview.chromium.org/3138006/diff/47001/33006#newcode83 chrome/browser/history/in_memory_url_index.h:83: // A map from character to word_id's. Nit: omit ' http://codereview.chromium.org/3138006/diff/47001/33006#newcode98 chrome/browser/history/in_memory_url_index.h:98: : char_set_(char_set), word_id_set_(word_id_set), used_(used) {} Nit: When the initializer list can't go on the same line as the function name, use one line per arg. http://codereview.chromium.org/3138006/diff/47001/33006#newcode134 chrome/browser/history/in_memory_url_index.h:134: String16Vector word_list_; // A list of all of indexed words. The index of Nit: Write this comment above the declaration so you can use the whole line. http://codereview.chromium.org/3138006/diff/47001/33007 File chrome/browser/history/in_memory_url_index_unittest.cc (right): http://codereview.chromium.org/3138006/diff/47001/33007#newcode23 chrome/browser/history/in_memory_url_index_unittest.cc:23: /* Nit: We tend to use // comment blocks for these, I believe, and that's what you've used elsewhere. http://codereview.chromium.org/3138006/diff/47001/33007#newcode39 chrome/browser/history/in_memory_url_index_unittest.cc:39: #define UNIT_TEST_COMMAND_BUFFER_MAX_SIZE 2048 Nit: Consider a "static const int" (and perhaps scoping it to the body of SetUp()). http://codereview.chromium.org/3138006/diff/47001/33007#newcode57 chrome/browser/history/in_memory_url_index_unittest.cc:57: FILE_PATH_LITERAL("url_history_provider_test.db.txt")); Nit: Hmmmm... it might be nice to replace the separate text file with a const array of structs, each one containing the appropriate data for an "INSERT" call. Then write the actual sqlite statement in a loop here. That way, the data is in this file, and it's in a block that doesn't have any SQL syntax obscuring anything, so it's easier to inspect. http://codereview.chromium.org/3138006/diff/47001/33007#newcode113 chrome/browser/history/in_memory_url_index_unittest.cc:113: // There should have been 25 of the 29 urls accepted during filtering. Nit: Is there any way to justify or calculate these values? They're super magic and if a test broke I'd have no idea what to do to fix it.
Issues addressed, lint errors also. Ready for another go. http://codereview.chromium.org/3138006/diff/47001/33005 File chrome/browser/history/in_memory_url_index.cc (right): http://codereview.chromium.org/3138006/diff/47001/33005#newcode19 chrome/browser/history/in_memory_url_index.cc:19: On 2010/08/19 20:59:21, Peter Kasting wrote: > Nit: One blank line Done. http://codereview.chromium.org/3138006/diff/47001/33005#newcode27 chrome/browser/history/in_memory_url_index.cc:27: #define TRIM_PARAMETERS 1 On 2010/08/19 20:59:21, Peter Kasting wrote: > This #define doesn't look to be used anywhere. > > Also, if you do need it, consider using "static const bool" instead of a > preprocessor define. Removed. Leftovers! http://codereview.chromium.org/3138006/diff/47001/33005#newcode29 chrome/browser/history/in_memory_url_index.cc:29: static const int32_t kURLToLowerBufferSize = 2048; On 2010/08/19 20:59:21, Peter Kasting wrote: > Also not used? More leftovers. Removed. http://codereview.chromium.org/3138006/diff/47001/33005#newcode38 chrome/browser/history/in_memory_url_index.cc:38: if (history_db) { On 2010/08/19 20:59:21, Peter Kasting wrote: > Nit: Instead of nesting and |success|, use early-returns of "false". And the > while() loop could be: > > while (history_enum.GetNextURL(&row)) { > if (... && !IndexRow(row)) > return false; > } > return true; Changed to early-returns. http://codereview.chromium.org/3138006/diff/47001/33005#newcode62 chrome/browser/history/in_memory_url_index.cc:62: // TODO(mrossetti): Find or implement a ConvertPercentEncoding and use it On 2010/08/19 20:59:21, Peter Kasting wrote: > You mean unescaping? We have unescaping code; talk to brettw about what you > want. Done. http://codereview.chromium.org/3138006/diff/47001/33005#newcode134 chrome/browser/history/in_memory_url_index.cc:134: } else { On 2010/08/19 20:59:21, Peter Kasting wrote: > Nit: Use early returns in this function to avoid most of the nesting. > > For error handling, if a case really should not be able to occur, then just > DCHECK (or CHECK if continuing would be a security hole) and continue without > handling: > > DCHECK_NE(word_id_history_map_.end(), history_pos); > ... > > (see Chromium style guide on motivation) Done. http://codereview.chromium.org/3138006/diff/47001/33005#newcode178 chrome/browser/history/in_memory_url_index.cc:178: return recent_threshold; On 2010/08/19 20:59:21, Peter Kasting wrote: > Nit: Just return the calculation directly and avoid the temp Done. http://codereview.chromium.org/3138006/diff/47001/33006 File chrome/browser/history/in_memory_url_index.h (right): http://codereview.chromium.org/3138006/diff/47001/33006#newcode21 chrome/browser/history/in_memory_url_index.h:21: namespace base { class Time; } // Forward declaration of base::Time. On 2010/08/19 20:59:21, Peter Kasting wrote: > Nit: I'd move the "class Time" to a separate line and drop the comment (which is > redundant with the code). Done. http://codereview.chromium.org/3138006/diff/47001/33006#newcode49 chrome/browser/history/in_memory_url_index.h:49: ~InMemoryURLIndex() {} On 2010/08/19 20:59:21, Peter Kasting wrote: > Nit: Please move the constructor and destructor impls into the C++ file (see > erg's recent emails regarding inlining). Done. http://codereview.chromium.org/3138006/diff/47001/33006#newcode60 chrome/browser/history/in_memory_url_index.h:60: // Given a vector containing one or more words as String16's, scan the On 2010/08/19 20:59:21, Peter Kasting wrote: > Nit: String16's -> string16s Done. http://codereview.chromium.org/3138006/diff/47001/33006#newcode63 chrome/browser/history/in_memory_url_index.h:63: // qualify, however, the terms do not necessarily have to be adjacent. On 2010/08/19 20:59:21, Peter Kasting wrote: > Mechanics Nazi nit: "qualify," -> "qualify;" Indeed! CMoS 5.69 concurs! Excellent catch! http://codereview.chromium.org/3138006/diff/47001/33006#newcode70 chrome/browser/history/in_memory_url_index.h:70: friend class InMemoryURLIndexTest; // unit-test. On 2010/08/19 20:59:21, Peter Kasting wrote: > Nit: Omit this comment (obviousness) Done. http://codereview.chromium.org/3138006/diff/47001/33006#newcode83 chrome/browser/history/in_memory_url_index.h:83: // A map from character to word_id's. On 2010/08/19 20:59:21, Peter Kasting wrote: > Nit: omit ' Done. http://codereview.chromium.org/3138006/diff/47001/33006#newcode98 chrome/browser/history/in_memory_url_index.h:98: : char_set_(char_set), word_id_set_(word_id_set), used_(used) {} On 2010/08/19 20:59:21, Peter Kasting wrote: > Nit: When the initializer list can't go on the same line as the function name, > use one line per arg. Done. http://codereview.chromium.org/3138006/diff/47001/33006#newcode134 chrome/browser/history/in_memory_url_index.h:134: String16Vector word_list_; // A list of all of indexed words. The index of On 2010/08/19 20:59:21, Peter Kasting wrote: > Nit: Write this comment above the declaration so you can use the whole line. Done. http://codereview.chromium.org/3138006/diff/47001/33007 File chrome/browser/history/in_memory_url_index_unittest.cc (right): http://codereview.chromium.org/3138006/diff/47001/33007#newcode23 chrome/browser/history/in_memory_url_index_unittest.cc:23: /* On 2010/08/19 20:59:21, Peter Kasting wrote: > Nit: We tend to use // comment blocks for these, I believe, and that's what > you've used elsewhere. Done. http://codereview.chromium.org/3138006/diff/47001/33007#newcode39 chrome/browser/history/in_memory_url_index_unittest.cc:39: #define UNIT_TEST_COMMAND_BUFFER_MAX_SIZE 2048 On 2010/08/19 20:59:21, Peter Kasting wrote: > Nit: Consider a "static const int" (and perhaps scoping it to the body of > SetUp()). Done. http://codereview.chromium.org/3138006/diff/47001/33007#newcode57 chrome/browser/history/in_memory_url_index_unittest.cc:57: FILE_PATH_LITERAL("url_history_provider_test.db.txt")); On 2010/08/19 20:59:21, Peter Kasting wrote: > Nit: Hmmmm... it might be nice to replace the separate text file with a const > array of structs, each one containing the appropriate data for an "INSERT" call. > Then write the actual sqlite statement in a loop here. That way, the data is > in this file, and it's in a block that doesn't have any SQL syntax obscuring > anything, so it's easier to inspect. I'm using a separate file for (at least) two reasons: 1) So that the file can be updated more easily when extending the test suite, particularly when adding tests for search results validation when simulating an omnibox search. 2) Because I've got another CL in the wings where the text file is scanned by a perl script and the expected counts in the InMemoryURLIndexTest.Initialization test are calculated. I plan to extend this script to provide expected results counts when the search portion of this provider is added (that's waiting in the wings as well). (I figured we could debate perl vs. python when I propose that CL. ;^) http://codereview.chromium.org/3138006/diff/47001/33007#newcode113 chrome/browser/history/in_memory_url_index_unittest.cc:113: // There should have been 25 of the 29 urls accepted during filtering. On 2010/08/19 20:59:21, Peter Kasting wrote: > Nit: Is there any way to justify or calculate these values? They're super magic > and if a test broke I'd have no idea what to do to fix it. Please see reply to previous comment.
LGTM with a couple questions http://codereview.chromium.org/3138006/diff/49001/43007 File chrome/browser/history/in_memory_url_index.cc (right): http://codereview.chromium.org/3138006/diff/49001/43007#newcode37 chrome/browser/history/in_memory_url_index.cc:37: if (history_db) { The previous snapshot returned false on !history_db; this returns true. Which is correct? Nit: You could still add an early-return to un-nest this whole conditional body. http://codereview.chromium.org/3138006/diff/49001/43007#newcode63 chrome/browser/history/in_memory_url_index.cc:63: UnescapeRule::SPACES + Nit: Prefer '|' to '+' on bitmask fields. Personally, I'd also just wrap like: string16 url(WideToUTF16(net::FormatUrl(gurl, UTF16ToWide(languages_), net::kFormatUrlOmitUsernamePassword, UnescapeRule::SPACES | UnescapeRule::URL_SPECIAL_CHARS, NULL, NULL, NULL))); ...just because I dislike lots of vertical whitespace. Finally, do you also want UnescapeRule::REPLACE_PLUS_WITH_SPACE? That way search queries will have spaces in them instead of pluses... I guess this depends on how you tokenize. http://codereview.chromium.org/3138006/diff/49001/43007#newcode101 chrome/browser/history/in_memory_url_index.cc:101: string16 const& uni_string) { Nit: We tend to use "const string16&" (2 places) http://codereview.chromium.org/3138006/diff/49001/43007#newcode138 chrome/browser/history/in_memory_url_index.cc:138: } else { Nit: Still seems like you could make use of early-returns to avoid long nested blocks. http://codereview.chromium.org/3138006/diff/49001/43008 File chrome/browser/history/in_memory_url_index.h (right): http://codereview.chromium.org/3138006/diff/49001/43008#newcode62 chrome/browser/history/in_memory_url_index.h:62: // Given a vector containing one or more words as string16's, scan the Nit: Remove apostrophe http://codereview.chromium.org/3138006/diff/49001/43008#newcode117 chrome/browser/history/in_memory_url_index.h:117: String16Set WordsFromString16(string16 const& uni_string); Nit: We try to use "const string16&" (2 places)
Addressed issues. Will commit after merge and testing. http://codereview.chromium.org/3138006/diff/49001/43007 File chrome/browser/history/in_memory_url_index.cc (right): http://codereview.chromium.org/3138006/diff/49001/43007#newcode37 chrome/browser/history/in_memory_url_index.cc:37: if (history_db) { On 2010/08/20 18:41:09, Peter Kasting wrote: > The previous snapshot returned false on !history_db; this returns true. Which > is correct? > > Nit: You could still add an early-return to un-nest this whole conditional body. Done. http://codereview.chromium.org/3138006/diff/49001/43007#newcode63 chrome/browser/history/in_memory_url_index.cc:63: UnescapeRule::SPACES + On 2010/08/20 18:41:09, Peter Kasting wrote: > Nit: Prefer '|' to '+' on bitmask fields. Personally, I'd also just wrap like: > > string16 url(WideToUTF16(net::FormatUrl(gurl, UTF16ToWide(languages_), > net::kFormatUrlOmitUsernamePassword, > UnescapeRule::SPACES | UnescapeRule::URL_SPECIAL_CHARS, NULL, NULL, > NULL))); > > ...just because I dislike lots of vertical whitespace. Finally, do you also > want UnescapeRule::REPLACE_PLUS_WITH_SPACE? That way search queries will have > spaces in them instead of pluses... I guess this depends on how you tokenize. Rewrapped. The tokenization breaks on '+' and spaces so it doesn't make a real difference. http://codereview.chromium.org/3138006/diff/49001/43007#newcode101 chrome/browser/history/in_memory_url_index.cc:101: string16 const& uni_string) { On 2010/08/20 18:41:09, Peter Kasting wrote: > Nit: We tend to use "const string16&" (2 places) Okay, I'll change it (though I learned to read types from right-to-left) to match our style. http://codereview.chromium.org/3138006/diff/49001/43007#newcode138 chrome/browser/history/in_memory_url_index.cc:138: } else { On 2010/08/20 18:41:09, Peter Kasting wrote: > Nit: Still seems like you could make use of early-returns to avoid long nested > blocks. Done by creating two functions, one for updating existing entries in the WordIDHistoryMap (corresponding to the first half of the 'if') and another for adding a new entry in that map (corresponding to the second half of the 'if'). http://codereview.chromium.org/3138006/diff/49001/43008 File chrome/browser/history/in_memory_url_index.h (right): http://codereview.chromium.org/3138006/diff/49001/43008#newcode62 chrome/browser/history/in_memory_url_index.h:62: // Given a vector containing one or more words as string16's, scan the On 2010/08/20 18:41:09, Peter Kasting wrote: > Nit: Remove apostrophe Done. http://codereview.chromium.org/3138006/diff/49001/43008#newcode117 chrome/browser/history/in_memory_url_index.h:117: String16Set WordsFromString16(string16 const& uni_string); On 2010/08/20 18:41:09, Peter Kasting wrote: > Nit: We try to use "const string16&" (2 places) Changed throughout.
Great, LGTM! |