|
|
Created:
4 years, 6 months ago by vakh (use Gerrit instead) Modified:
4 years, 5 months ago CC:
asvitkine+watch_chromium.org, woz, chromium-reviews, grt+watch_chromium.org, noé Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionFetch incremental updates. Store new state in V4Store.
This CL does the following:
1. Makes V4Database the source of truth about the number and status of the lists being updated.
2. Fetch updates for 2 different lists: Malware and Social Engineering. Previously we were just downloading Malware.
3. Update requests are now incremental. Each V4Store updates its "state" when processing an update response so the next update request is incremental on top of that state.
4. Tests to verify that the V4Database and V4Store are being updated correctly.
5. Force enable V4 updates for debug builds (bringing it back).
Overall design document: https://goto.google.com/design-doc-v4store
After this CL:
1. V4LocalDatabaseManager tells V4Database about which lists to fetch (and their filename) at init time.
2. From here on, V4Database becomes the single source of truth for all information about the lists.
3. V4Database creates a V4Store for each of the lists.
4. When the V4Database is ready (this happens on the task runner), the update manager schedules the next update.
5. When the update is received, V4Database determines which stores to update and updates them on the task runner.
6. Updating a V4Store is done by creating a new store that's equivalent to the old store + the update.
7. When the new store is ready, the V4Database swaps it in for the old store.
8. When all the stores that needed to be updated have been updated, the update manager schedules the next update.
BUG=543161
Committed: https://crrev.com/370e9320906395ba0c795af2d3b4cedf864f936d
Cr-Commit-Position: refs/heads/master@{#402565}
Patch Set 1 #Patch Set 2 : Update BUILD.gn #Patch Set 3 : Remove some logging #Patch Set 4 : Undo forced frequent updates useful for debugging #Patch Set 5 : Nit: Use base::SStringPrintf instead of string concat #
Total comments: 40
Patch Set 6 : nparker comments addressed #Patch Set 7 : Nits: Fixed a crash. Corrected a DCHECK. #Patch Set 8 : Nits: Added some comments in BUILD.gn. Using #else for platform_type until I resolve the android bu… #
Total comments: 96
Patch Set 9 : CR feedback: shess@ (except store_state_map ownership) #Patch Set 10 : Don't cache store state. Makes ownership straightforward. #Patch Set 11 : Use PlatformTest as the base class for V4UpdateProtocolManagerTest #Patch Set 12 : git fetch && git pull && gclient sync #Patch Set 13 : Nit: Make V4Store's member variables private instead of protected, because why not? #Patch Set 14 : git fetch && git pull #
Total comments: 22
Patch Set 15 : git fetch && pull #Patch Set 16 : Address comments by shess@ #Patch Set 17 : Nit: store_file_name_map ready at compile time #Patch Set 18 : Minor: Filename format in CamelCase: <ThreatEntryType><ThreatType>.store #Patch Set 19 : CR feedback from shess@ #Patch Set 20 : git fetch && git pull && gclient sync #
Total comments: 16
Patch Set 21 : Minor: CR feedback: shess@ #Patch Set 22 : git fetch && git pull && gclient sync #Patch Set 23 : Minor: shess@ latest CR #Patch Set 24 : git fetch && git pull && gclient sync #
Total comments: 6
Patch Set 25 : Minor: Fix DCHECK #
Total comments: 2
Dependent Patchsets: Messages
Total messages: 72 (10 generated)
Description was changed from ========== Fetch incremental updates. Store new state in V4Store. This CL does the following: 1. Fetch updates for 2 different lists. Malware and Social Engineering. 2. Force enable V4 updates on Debug builds. 3. Update requests are now incremental. 4. Tests to verify that the V4Database and V4Store are being updated correctly. BUG=543161 ========== to ========== Fetch incremental updates. Store new state in V4Store. This CL does the following: 1. Makes V4Database the source of truth about the number and status of the lists being updated. 2. Fetch updates for 2 different lists: Malware and Social Engineering. Previously we were just downloading Malware. 3. Update requests are now incremental. Each V4Store updates its "state" when processing an update response so the next update request is incremental on top of that state. 4. Tests to verify that the V4Database and V4Store are being updated correctly. 5. Force enable V4 updates for debug builds (bringing it back). Overall design document: https://goto.google.com/design-doc-v4store After this CL: 1. V4LocalDatabaseManager tells V4Database about which lists to fetch (and their filename) at init time. 2. From here on, V4Database becomes the single source of truth for all information about the lists. 3. V4Database creates a V4Store for each of the lists. 4. When the V4Database is ready (this happens on the task runner), the update manager schedules the next update. 5. When the update is received, V4Database determines which stores to update and updates them on the task runner. 6. Updating a V4Store is done by creating a new store that's equivalent to the old store + the update. 7. When the new store is ready, the V4Database swaps it in for the old store. 8. When all the stores that needed to be updated have been updated, the update manager schedules the next update. BUG=543161 ==========
Update BUILD.gn
Remove some logging
vakh@chromium.org changed reviewers: + nparker@chromium.org, shess@chromium.org
shess@chromium.org: Please review changes in * nparker@chromium.org: Please review changes in *
Undo forced frequent updates useful for debugging
Nit: Use base::SStringPrintf instead of string concat
Ping.
I'm excited about updates coming in! https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... File components/safe_browsing_db/v4_database.cc (right): https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_database.cc:31: if (!base_path.IsAbsolute() || store_file_name_map.empty()) { Could just dcheck on these, and then the caller doesn't need to decide what to do with a false return value, assuming this just indicates a programming error https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_database.cc:117: db_task_runner_->PostTask( Increment before posting, since otherwise the posted task could finish before it increments. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_database.cc:124: DVLOG(1) << "Got update for unexpected identifier: " << identifier; I'd say this is more severe, and could be a DCHECK. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_database.cc:161: (*store_state_map_)[store_map_iter.first] = store_map_iter.second->state(); I'm confused: is the store_state_map just a view into the store_map? If so why store it separately? https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... File components/safe_browsing_db/v4_database.h (right): https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_database.h:75: // Updates the stores with the reponse received from the SafeBrowsing service nit: response https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_database.h:121: void UpdatedStoreReady(UpdateListIdentifier, std::unique_ptr<V4Store>); missing argument names. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_database.h:132: const std::unique_ptr<StoreStateMap> store_state_map_; Could this be just a StoreStateMap, and not a ptr? https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... File components/safe_browsing_db/v4_database_unittest.cc (right): https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_database_unittest.cc:215: FakeV4StoreFactory* factory = new FakeV4StoreFactory(false); Could this go in SetUp(), since you probably don't want to use real stores anywhere in this file? https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_database_unittest.cc:222: base::RunLoop().RunUntilIdle(); A chunk of this code is repeated in tests below. How about putting it in a function? Otherwise it's hard to tell if it's the same, and harder to change reliably. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_database_unittest.cc:257: // Verify that a new store was created. Is new one created by the update? I thought the list was set at db-create-time. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... File components/safe_browsing_db/v4_local_database_manager.cc (right): https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_local_database_manager.cc:199: #else An idea: How about requiring a known OS? You could check if MAC_OS, and then the #else use an #error (i.e. fail to compile). That way we don't get surprises. And we should figure out what to do with chromeos. (==linux?) https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_local_database_manager.cc:204: (*store_file_name_map)[update_list_identifier] = "os_url_malware.store"; We should come up with a list of all the filenames we plan to use (for existing lists), to make sure we've got a naming strategy that'll work going forward. And document that. I think we should put them in a "Safe Browsing/" directory, and given that most things in the data dir use Upper Case and Spaces, we should stick to that too. (Could be done later) This could be defined at compile time rather than run time, to make it easier to read and edit. StoreFilenameMap default_store_filename_map = { {{PLATFORM, URL, MALWARE_THREAT}, "some malware filename"}, {{PLATFORM, URL, SOCIAL_ENGINEERING_PUBLIC}, "yadda yadda"}, ... } With that, PLATFORM could be a #define'd macro. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... File components/safe_browsing_db/v4_protocol_manager_util.h (right): https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_protocol_manager_util.h:74: // created inside the user's profile directory. s/profile/data The'll be one set of files shared across profiles. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_protocol_manager_util.h:75: // For instance, the UpdateListIdentifier could be for URL expressions for UwS How about just referencing the place that defines them? That could list the naming semantics. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... File components/safe_browsing_db/v4_store.cc (right): https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_store.cc:23: : task_runner_(other.task_runner_), store_path_(other.store_path_) {} Document what this constructor does or will do in the future, since (I expect) it'll only copy some of the state. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_store.cc:54: // new_store is done updataing, pass it to the callback. nit: updating https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... File components/safe_browsing_db/v4_store.h (right): https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_store.h:65: os << store.DebugString(); Does this need to be inline? The code could do in the .cc file. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... File components/safe_browsing_db/v4_update_protocol_manager.h (right): https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_update_protocol_manager.h:145: const StoreStateMap* store_state_map_; Q: Does this need to be stored here, or can it just be passed when an update is scheduled? The fewer copies the better.
nparker comments addressed
Thanks for the review. PTAL. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... File components/safe_browsing_db/v4_database.cc (right): https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_database.cc:31: if (!base_path.IsAbsolute() || store_file_name_map.empty()) { On 2016/06/15 21:24:11, Nathan Parker wrote: > Could just dcheck on these, and then the caller doesn't need to decide what to > do with a false return value, assuming this just indicates a programming error Done. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_database.cc:117: db_task_runner_->PostTask( On 2016/06/15 21:24:11, Nathan Parker wrote: > Increment before posting, since otherwise the posted task could finish before it > increments. Done. Though that's not a problem since the synchronization is achieved by having it call back UpdatedStoreReady which can't run until the current method finishes since they both run on the IO thread. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_database.cc:124: DVLOG(1) << "Got update for unexpected identifier: " << identifier; On 2016/06/15 21:24:12, Nathan Parker wrote: > I'd say this is more severe, and could be a DCHECK. Done. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_database.cc:161: (*store_state_map_)[store_map_iter.first] = store_map_iter.second->state(); On 2016/06/15 21:24:13, Nathan Parker wrote: > I'm confused: is the store_state_map just a view into the store_map? If so why > store it separately? Yes, it is a view. If we do not have a store_state_map_, we would need to get the state of the stores from store_map_ for each update. And since most updates do not lead to state changes, most of the time, the state of each store would be the same as last time. So, instead of looking up the state from the stores for each update, the store_state_map provides that information right away. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... File components/safe_browsing_db/v4_database.h (right): https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_database.h:75: // Updates the stores with the reponse received from the SafeBrowsing service On 2016/06/15 21:24:13, Nathan Parker wrote: > nit: response Done. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_database.h:121: void UpdatedStoreReady(UpdateListIdentifier, std::unique_ptr<V4Store>); On 2016/06/15 21:24:15, Nathan Parker wrote: > missing argument names. Done. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_database.h:132: const std::unique_ptr<StoreStateMap> store_state_map_; On 2016/06/15 21:24:14, Nathan Parker wrote: > Could this be just a StoreStateMap, and not a ptr? Done. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... File components/safe_browsing_db/v4_database_unittest.cc (right): https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_database_unittest.cc:215: FakeV4StoreFactory* factory = new FakeV4StoreFactory(false); On 2016/06/15 21:24:17, Nathan Parker wrote: > Could this go in SetUp(), since you probably don't want to use real stores > anywhere in this file? Done. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_database_unittest.cc:222: base::RunLoop().RunUntilIdle(); On 2016/06/15 21:24:17, Nathan Parker wrote: > A chunk of this code is repeated in tests below. How about putting it in a > function? Otherwise it's hard to tell if it's the same, and harder to change > reliably. Done. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_database_unittest.cc:257: // Verify that a new store was created. On 2016/06/15 21:24:17, Nathan Parker wrote: > Is new one created by the update? I thought the list was set at db-create-time. A new store is created but since the map holds unique_ptrs, as soon as we point the DB to the new store, the old one becomes unonwned and marked for deletion. This check just ensures that the address of the new store is different from that of the old store. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... File components/safe_browsing_db/v4_local_database_manager.cc (right): https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_local_database_manager.cc:199: #else On 2016/06/15 21:24:18, Nathan Parker wrote: > An idea: How about requiring a known OS? You could check if MAC_OS, and then > the #else use an #error (i.e. fail to compile). That way we don't get > surprises. And we should figure out what to do with chromeos. (==linux?) Done. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_local_database_manager.cc:204: (*store_file_name_map)[update_list_identifier] = "os_url_malware.store"; On 2016/06/15 21:24:19, Nathan Parker wrote: > We should come up with a list of all the filenames we plan to use (for existing > lists), to make sure we've got a naming strategy that'll work going forward. > And document that. > > I think we should put them in a "Safe Browsing/" directory, and given that most > things in the data dir use Upper Case and Spaces, we should stick to that too. > > (Could be done later) This could be defined at compile time rather than run > time, to make it easier to read and edit. > > StoreFilenameMap default_store_filename_map = { > {{PLATFORM, URL, MALWARE_THREAT}, "some malware filename"}, > {{PLATFORM, URL, SOCIAL_ENGINEERING_PUBLIC}, "yadda yadda"}, > ... > } > > With that, PLATFORM could be a #define'd macro. Acknowledged. Work for a later CL. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... File components/safe_browsing_db/v4_protocol_manager_util.h (right): https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_protocol_manager_util.h:74: // created inside the user's profile directory. On 2016/06/15 21:24:19, Nathan Parker wrote: > s/profile/data > > The'll be one set of files shared across profiles. Done. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_protocol_manager_util.h:75: // For instance, the UpdateListIdentifier could be for URL expressions for UwS On 2016/06/15 21:24:19, Nathan Parker wrote: > How about just referencing the place that defines them? That could list the > naming semantics. There's no such place right now. The only place where we define these has a TODO for finding a better way to do this. Once there's a definitive place, I'll point this comment to that place. Added a TODO for that. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... File components/safe_browsing_db/v4_store.cc (right): https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_store.cc:23: : task_runner_(other.task_runner_), store_path_(other.store_path_) {} On 2016/06/15 21:24:21, Nathan Parker wrote: > Document what this constructor does or will do in the future, since (I expect) > it'll only copy some of the state. Done. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_store.cc:54: // new_store is done updataing, pass it to the callback. On 2016/06/15 21:24:19, Nathan Parker wrote: > nit: updating Done. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... File components/safe_browsing_db/v4_store.h (right): https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_store.h:65: os << store.DebugString(); On 2016/06/15 21:24:21, Nathan Parker wrote: > Does this need to be inline? The code could do in the .cc file. Done. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... File components/safe_browsing_db/v4_update_protocol_manager.h (right): https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_update_protocol_manager.h:145: const StoreStateMap* store_state_map_; On 2016/06/15 21:24:21, Nathan Parker wrote: > Q: Does this need to be stored here, or can it just be passed when an update is > scheduled? The fewer copies the better. As discussed offline, the reason it is stored here is that when an update request to the PVer4 service fails, the update manager needs to know the last state of the stores. It could either get a pointer to the V4Database, which is probably an overkill, or a pointer to the state map. I've updated the code so that once the v4LocalDbManager sets up the database, it passes a reference to the db's state map to the V4UpdateManager. This is done once for an instance of the V4UpdateManager (only one or zero of which exist at any time).
Nits: Fixed a crash. Corrected a DCHECK.
I'm super confused why the Android GN builds are trying to compile the v4_local_database_manager.cc The BUILD.gn file clearly states that it is a dependency of ":safe_browsing_db", not ":safe_browsing_db_shared" or "::safe_browsing_db_mobile"
Interestingly, all the failing targets are trying to compile both ":safe_browsing_api_handler" and ":v4_local_database_manager" which should be mutually exclusive. Clearly something wrong with the dependency tree structure calculation.
Nits: Added some comments in BUILD.gn. Using #else for platform_type until I resolve the android build failure more gracefully.
On 2016/06/16 07:08:13, vakh wrote: > Interestingly, all the failing targets are trying to compile both > ":safe_browsing_api_handler" and ":v4_local_database_manager" which should be > mutually exclusive. > Clearly something wrong with the dependency tree structure calculation. Could it be crbug.com/618066 ?
lgtm with some comments https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... File components/safe_browsing_db/v4_database.cc (right): https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_database.cc:161: (*store_state_map_)[store_map_iter.first] = store_map_iter.second->state(); On 2016/06/16 01:07:35, vakh wrote: > On 2016/06/15 21:24:13, Nathan Parker wrote: > > I'm confused: is the store_state_map just a view into the store_map? If so > why > > store it separately? > > Yes, it is a view. > If we do not have a store_state_map_, we would need to get the state of the > stores from store_map_ for each update. And since most updates do not lead to > state changes, most of the time, the state of each store would be the same as > last time. > > So, instead of looking up the state from the stores for each update, the > store_state_map provides that information right away. What if you created this map on the heap only when it was time to schedule an update, and passed it to the protocol manager (passing ownership of it)? That way you remove the copy here and the protocol manager doesn't have to keep a ptr back into this class. That would decouple them even more. It should be the case that the state should not change before the update manager completes its next fetch, yes? Keeping two data structures in sync can increase the risk of bugs where they're not in sync. And we'd need semantics to ensure they're accessed on the correct threads (I think they are, but it's not obvious from the code). If you think the current way is better then go ahead and submit and we'll hash it out later (file a bug?). I don't want this point to hold you up. https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... File components/safe_browsing_db/v4_local_database_manager.cc (right): https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_local_database_manager.cc:204: (*store_file_name_map)[update_list_identifier] = "os_url_malware.store"; On 2016/06/16 01:07:35, vakh wrote: > On 2016/06/15 21:24:19, Nathan Parker wrote: > > We should come up with a list of all the filenames we plan to use (for > existing > > lists), to make sure we've got a naming strategy that'll work going forward. > > And document that. > > > > I think we should put them in a "Safe Browsing/" directory, and given that > most > > things in the data dir use Upper Case and Spaces, we should stick to that too. > > > > (Could be done later) This could be defined at compile time rather than run > > time, to make it easier to read and edit. > > > > StoreFilenameMap default_store_filename_map = { > > {{PLATFORM, URL, MALWARE_THREAT}, "some malware filename"}, > > {{PLATFORM, URL, SOCIAL_ENGINEERING_PUBLIC}, "yadda yadda"}, > > ... > > } > > > > With that, PLATFORM could be a #define'd macro. > > Acknowledged. Work for a later CL. When we we change the names later, we'll need to write code to delete these since they'll be left on some users' disk.
As you can see, I'm a bit against checking in logging code. AFAICT it's essentially impossible to remove logging code, once you check it in. Or at least nobody ever does, because they no longer remember if it's important or not. Better to just not check it in at all. [Note that you can often enough just keep a local CL with your logging code and rebase to carry it forward.] https://codereview.chromium.org/2062013002/diff/140001/chrome/browser/safe_br... File chrome/browser/safe_browsing/services_delegate_impl.cc (right): https://codereview.chromium.org/2062013002/diff/140001/chrome/browser/safe_br... chrome/browser/safe_browsing/services_delegate_impl.cc:191: #endif I'd be comfortable with just having the #if case return like before, and the #else case return true. Or leading with #ifndef NDEBUG and return true, then return like before. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_database.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:40: return true; This seems like it always returns true. Since it's static, it can't be virtual, so I'm confused why the change from void? https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:111: if (old_store->state() != response.new_client_state()) { I'm not entirely liking the auto for |response| and |old_store|. Maybe I'm just too traditionalist, but it feels like almost anything could pop out, here. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:119: response, current_task_runner, store_ready_callback)); I'm slightly questioning whether this is better done as a series of separately posted per-store tasks, or a single db-level task which iterates the stores. I can see that there are cases where the set of responses will be greater than the set of stores updated, but it feels like the common case is that all of the responses result in updates. Moving this to a db-level task would mean that you wouldn't have to track pending_store_updates_, and you also wouldn't have to have store_ready_callback coalescing to call db_updated_callback_. In fact, in that case you could probably pass db_updated_callback as a parameter to ApplyUpdate() and get rid of the member variable (which seems like kind of a wart). https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:123: NOTREACHED(); The two lines just create a poor DCHECK, don't they? DVLOG() means only in debug mode, NOTREACHED() will abort in debug. You can probably say NOTREACHED() << "Got update...", too. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:129: } Should this be a post rather than a direct run, so that the callback runs async in either case? https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:140: // condition in decrementing pending_store_updates_. Do we have any sort of thread annotations? This comment seems like it belongs with pending_store_updates_ definition, because it needs to apply to all accesses, not just this one. Maybe it would make sense to add a wrapper for accesses to enforce thread usage guidelines, and have the decrement wrapper chain invoke the callback? https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_database.h (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.h:23: // This callback is Run once the database has finished processing the update I don't think Run is a proper noun in this context. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.h:55: // been specified in \store_file_name_map\. The created V4Database runs Why the backslashes for this case? What does it mean for the filename to _not_ be specified in store_file_name_map? Or are they just not in the map? If StoreFileNameMap is the thing describing the filenames for stores, maybe just "containing stores in |store_file_name_map|"? https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.h:57: // method. When the database creation is complete, it runs the The ApplyUpdate snippet feels pretty contrived, like maybe you need to split this into "The part about the setup process" and "The part about the things that are setup". https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.h:61: // attempt to create the database at all. This last bit seems like an API mis-use, rather than something to return false for. Empty store_file_name_map is unexpected, so DCHECK makes sense, but it seems like things should work just fine if that case happens (I mean, it won't _work_ work, but it can just do nothing successfully, I wouldn't optimize for that case in any way). Actually, I guess I kind of feel the same about non-absolute base_path. It would be reasonable to DCHECK, but have the code in general drop it on the floor and spin up a noop instance. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.h:79: const StoreStateMap* store_state_map() { return &store_state_map_; } This feels like it shouldn't be exposed to public:. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.h:124: // Updates the store state map after reading the stores from disk on startup. This is called on startup? https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.h:143: unsigned pending_store_updates_; I think countable things are recommended to be int or size_t. That said, I wonder about tracking "count of outstanding stores to update" versus just tracking the actual outstanding stores to update. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_local_database_manager.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:175: PopulateStoreFileNameMap(&store_file_name_map); Could PopulateStoreFileNameMap() just return a map rather than requiring a pointer? And then it wouldn't be PopulateSFNM, but CreateSFNM? https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:197: #else I think there should be no default, #elif defined(OS_OSX) https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:201: // |platform_type| won't be declared. I vote for no comment. I think anyone debugging why platform_type isn't defined would see that either their platform will need a new enum. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:206: update_list_identifier.threat_type = SOCIAL_ENGINEERING_PUBLIC; I am not entirely comfortable with this, because it makes assumptions about update_list_identifier's implementation which may not hold over time. Would it make sense to have: (*store_file_name_map)[UpdateListIdentifier(platform_type, URL, SOCIAL_ENGINEERING_PUBLIC)] = "os_url_soceng.store"? or malware_identifier and soceng_identifier separately. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:250: v4_database_->ApplyUpdate(responses); WRT my comment elsewhere with many store tasks versus a single db task, this is the place I think would be cleaner if it was: v4_database_->ApplyUpdate(responses, base::Bind(&V4LDM::DatabaseUpdated, this)); Then the reader would know looking only at this function where they need to look next. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_local_database_manager.h (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.h:78: // DatabaseUpdated which schedules the next update request. This feels really specific to the implementation of V4Database. Suggest rewriting this to describe what the function does, without concrete info about some other class. Something like "Called after database update to schedule next update request", maybe. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_protocol_manager_util.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_protocol_manager_util.cc:27: } I'm not sure how this integrates with the overall stream. Maybe get some {} around the aggregate so that it's clear that this is a unit. Actually, I'm not _entirely_ comfortable having logging stuff like this. It has an unclear cost which probably won't be of general utility. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_protocol_manager_util.cc:48: UpdateListIdentifier::UpdateListIdentifier() {} Is the bare version ever used? If so, will everything work correctly if the member variables are initialized with random data which may be out of range? If there aren't obvious defaults to use, it might be worthwhile to see if you can make this private: to prevent accidents. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_protocol_manager_util.h (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_protocol_manager_util.h:74: // created inside the user's data directory. This comment replicates a lot of specifics from the declaration. It's a typedef, so it defines things, the typedef already says hash_map, and the key and value are pretty clear. Maybe something more like "The set of interesting lists and ASCII filenames for their hash-prefix stores. The stores are created inside the user-data directory." Note that most of the hyphenated words in here don't really need to be hyphenated. "hash prefixes" should be fine, "filename" should be fine, "on disk" should be fine. Hyphenate when you need to tie the modifiers together, like "on-disk data" or "user-data directory". https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_protocol_manager_util.h:82: // Used to represent the state of each store. Similar here, language could be less passive. "Represents the state of each store" or "Represents state of stores". https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_store.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_store.cc:31: DVLOG(1) << "Destroying V4Store: " << *this; Is this necessary to check in? https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_store.cc:38: GetHumanReadableState().c_str()); Why the interposition? Why not just return base::StringPrintf() results directly? https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_store.cc:57: DVLOG(1) << "Created V4Store: " << new_store->DebugString(); I'd have expected operator<< to do this for you. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_store.cc:67: return state_base64; If you really want to have this, just inline it into DebugString(). https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_store.h (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_store.h:46: UpdatedStoreReadyCallback); While you're still early days, maybe try to keep this alphabetized. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_store.h:51: std::string DebugString() const; Is it necessary to have debug-output stuff in here? https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_store.h:56: std::string state_; Is this the entire state of the store? Or is it some sort of etag-type hash blob? Probably worth a comment. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_store.h:61: // different from |other|. This feels like you're using the copy constructor as a convenience. Anyplace that can reach this private declaration can also reach the instance variables, I think that just explicitly passing the task runner and store path is going to be much easier to understand at that point than some comment in some other file. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_store.h:67: std::ostream& operator<<(std::ostream& os, const V4Store& store); I like this even less than the version on the list identifier. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_update_protocol_manager.h (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_update_protocol_manager.h:143: const StoreStateMap* store_state_map_; Having the instance retain a pointer to a map owned by another object seems like a recipe for use-after-free errors. Is there any way to have a single reference to this, which others access via accessor? Or, better yet, which only one object needs to look at?
Also, there's a subset of this CL which is just renaming type things. I fully support pulling out pieces which are more stand-alone and getting those reviewed in expedited fashion to make it easier to see what's left.
CR feedback: shess@ (except store_state_map ownership)
shess@ -- addressed most of your comments. Except the ownership of store_state_map, which I am still thinking about. Left those comments unresolved. https://codereview.chromium.org/2062013002/diff/140001/chrome/browser/safe_br... File chrome/browser/safe_browsing/services_delegate_impl.cc (right): https://codereview.chromium.org/2062013002/diff/140001/chrome/browser/safe_br... chrome/browser/safe_browsing/services_delegate_impl.cc:191: #endif On 2016/06/17 22:53:42, Scott Hess wrote: > I'd be comfortable with just having the #if case return like before, and the > #else case return true. Or leading with #ifndef NDEBUG and return true, then > return like before. Done. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_database.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:40: return true; On 2016/06/17 22:53:42, Scott Hess wrote: > This seems like it always returns true. Since it's static, it can't be virtual, > so I'm confused why the change from void? I was initially returning false for the DCHECKS above (instead of the DCHECKs) but nparker@ said that this should be a harder fail so changed it to DCHECK, but forgot to revert the return type chage. Done now. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:111: if (old_store->state() != response.new_client_state()) { On 2016/06/17 22:53:42, Scott Hess wrote: > I'm not entirely liking the auto for |response| and |old_store|. Maybe I'm just > too traditionalist, but it feels like almost anything could pop out, here. "for (const auto& iter ...)" is a commonly used and encouraged pattern so leaving the response as is, but I've changed the declaration for old_store. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:119: response, current_task_runner, store_ready_callback)); On 2016/06/17 22:53:42, Scott Hess wrote: > I'm slightly questioning whether this is better done as a series of separately > posted per-store tasks, or a single db-level task which iterates the stores. I > can see that there are cases where the set of responses will be greater than the > set of stores updated, but it feels like the common case is that all of the > responses result in updates. > > Moving this to a db-level task would mean that you wouldn't have to track > pending_store_updates_, and you also wouldn't have to have store_ready_callback > coalescing to call db_updated_callback_. In fact, in that case you could > probably pass db_updated_callback as a parameter to ApplyUpdate() and get rid of > the member variable (which seems like kind of a wart). Writing a store to disk is a long operation so the idea is to keep individual tasks on the task runner short in order to not hog the runner for all stores at once. I'm not much familiar with the guidelines about this. Do you know if it is OK to run long operations on the task runner? If it is, then doing the update all at once is definitely better. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:123: NOTREACHED(); On 2016/06/17 22:53:42, Scott Hess wrote: > The two lines just create a poor DCHECK, don't they? DVLOG() means only in > debug mode, NOTREACHED() will abort in debug. > > You can probably say NOTREACHED() << "Got update...", too. Done. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:129: } On 2016/06/17 22:53:42, Scott Hess wrote: > Should this be a post rather than a direct run, so that the callback runs async > in either case? Done. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:140: // condition in decrementing pending_store_updates_. On 2016/06/17 22:53:42, Scott Hess wrote: > Do we have any sort of thread annotations? Sorry, I'm not sure what you mean. > This comment seems like it belongs > with pending_store_updates_ definition, because it needs to apply to all > accesses, not just this one. Done. > > Maybe it would make sense to add a wrapper for accesses to enforce thread usage > guidelines, and have the decrement wrapper chain invoke the callback? Do you know of any example that I can look at to understand how this can be done and what exactly it does? https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_database.h (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.h:23: // This callback is Run once the database has finished processing the update On 2016/06/17 22:53:42, Scott Hess wrote: > I don't think Run is a proper noun in this context. Changed to scheduled. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.h:55: // been specified in \store_file_name_map\. The created V4Database runs On 2016/06/17 22:53:42, Scott Hess wrote: > Why the backslashes for this case? > > What does it mean for the filename to _not_ be specified in store_file_name_map? > Or are they just not in the map? If StoreFileNameMap is the thing describing > the filenames for stores, maybe just "containing stores in > |store_file_name_map|"? Done. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.h:57: // method. When the database creation is complete, it runs the On 2016/06/17 22:53:43, Scott Hess wrote: > The ApplyUpdate snippet feels pretty contrived, like maybe you need to split > this into "The part about the setup process" and "The part about the things that > are setup". Removed the ApplyUpdate part of the comment since that is described by the comment about ApplyUpdate itself. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.h:61: // attempt to create the database at all. On 2016/06/17 22:53:42, Scott Hess wrote: > This last bit seems like an API mis-use, rather than something to return false > for. Empty store_file_name_map is unexpected, so DCHECK makes sense, but it > seems like things should work just fine if that case happens (I mean, it won't > _work_ work, but it can just do nothing successfully, I wouldn't optimize for > that case in any way). > > Actually, I guess I kind of feel the same about non-absolute base_path. It > would be reasonable to DCHECK, but have the code in general drop it on the floor > and spin up a noop instance. Done. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.h:124: // Updates the store state map after reading the stores from disk on startup. On 2016/06/17 22:53:42, Scott Hess wrote: > This is called on startup? Updated to say: "when the database is being created" https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.h:143: unsigned pending_store_updates_; On 2016/06/17 22:53:42, Scott Hess wrote: > I think countable things are recommended to be int or size_t. Changed to int. > > That said, I wonder about tracking "count of outstanding stores to update" > versus just tracking the actual outstanding stores to update. I thought of that at first, but couldn't figure out any advantage of doing that, over jut keeping a counter. The only thing that I could come up with was to help debug or track how long each store update took or if there are any that are not getting updated at all for some reason. Finally decided to keep it simple because those cases aren't really worthy enough. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_local_database_manager.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:175: PopulateStoreFileNameMap(&store_file_name_map); On 2016/06/17 22:53:43, Scott Hess wrote: > Could PopulateStoreFileNameMap() just return a map rather than requiring a > pointer? And then it wouldn't be PopulateSFNM, but CreateSFNM? Done. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:197: #else On 2016/06/17 22:53:43, Scott Hess wrote: > I think there should be no default, > #elif defined(OS_OSX) This code is getting compiled on Android for an unknown reason -- it should not. I'll look into that later but in the mean time, in order to not break the compilation, I defaulted to one of the platforms. If I leave the platform_type empty, the server won't recognize the request and return an error response which will pollute our UMA metrics so leaving it as-is for now. I know it is wrong, but probably less wrong than the alternative. Filed: http://crbug.com/621647 https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:201: // |platform_type| won't be declared. On 2016/06/17 22:53:43, Scott Hess wrote: > I vote for no comment. I think anyone debugging why platform_type isn't defined > would see that either their platform will need a new enum. Done. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:206: update_list_identifier.threat_type = SOCIAL_ENGINEERING_PUBLIC; On 2016/06/17 22:53:43, Scott Hess wrote: > I am not entirely comfortable with this, because it makes assumptions about > update_list_identifier's implementation which may not hold over time. Would it > make sense to have: > > (*store_file_name_map)[UpdateListIdentifier(platform_type, URL, > SOCIAL_ENGINEERING_PUBLIC)] = "os_url_soceng.store"? > > or malware_identifier and soceng_identifier separately. Done. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:250: v4_database_->ApplyUpdate(responses); On 2016/06/17 22:53:43, Scott Hess wrote: > WRT my comment elsewhere with many store tasks versus a single db task, this is > the place I think would be cleaner if it was: > v4_database_->ApplyUpdate(responses, base::Bind(&V4LDM::DatabaseUpdated, this)); > Then the reader would know looking only at this function where they need to look > next. Still not sure about updating all stores at once, please see my comment in v4_database. Since the method to be called back is the same every time (without any arguments), isn't it better to pass it at construction time? https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_local_database_manager.h (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.h:78: // DatabaseUpdated which schedules the next update request. On 2016/06/17 22:53:43, Scott Hess wrote: > This feels really specific to the implementation of V4Database. Suggest > rewriting this to describe what the function does, without concrete info about > some other class. Something like "Called after database update to schedule next > update request", maybe. Done. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_protocol_manager_util.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_protocol_manager_util.cc:27: } On 2016/06/17 22:53:43, Scott Hess wrote: > I'm not sure how this integrates with the overall stream. Maybe get some {} > around the aggregate so that it's clear that this is a unit. You mean just empty {}? It's not clear to me how that would be beneficial. > > Actually, I'm not _entirely_ comfortable having logging stuff like this. It has > an unclear cost which probably won't be of general utility. This isn't logging stuff on its own and is very useful in debugging things (specially when gdb isn't that useful). It does provide value in terms of time saved debugging. Is there a recommended way of checking in code that is useful in the near future (while building stuff), but may be not in the long term? I could add a crbug to remove this (and other debugging aids) after a quarter. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_protocol_manager_util.cc:48: UpdateListIdentifier::UpdateListIdentifier() {} On 2016/06/17 22:53:43, Scott Hess wrote: > Is the bare version ever used? > > If so, will everything work correctly if the member variables are initialized > with random data which may be out of range? If there aren't obvious defaults to > use, it might be worthwhile to see if you can make this private: to prevent > accidents. Done. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_protocol_manager_util.h (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_protocol_manager_util.h:74: // created inside the user's data directory. On 2016/06/17 22:53:43, Scott Hess wrote: > This comment replicates a lot of specifics from the declaration. It's a > typedef, so it defines things, the typedef already says hash_map, and the key > and value are pretty clear. Maybe something more like "The set of interesting > lists and ASCII filenames for their hash-prefix stores. The stores are created > inside the user-data directory." > > Note that most of the hyphenated words in here don't really need to be > hyphenated. "hash prefixes" should be fine, "filename" should be fine, "on > disk" should be fine. Hyphenate when you need to tie the modifiers together, > like "on-disk data" or "user-data directory". Done. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_protocol_manager_util.h:82: // Used to represent the state of each store. On 2016/06/17 22:53:43, Scott Hess wrote: > Similar here, language could be less passive. "Represents the state of each > store" or "Represents state of stores". Done. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_store.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_store.cc:31: DVLOG(1) << "Destroying V4Store: " << *this; On 2016/06/17 22:53:43, Scott Hess wrote: > Is this necessary to check in? Done. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_store.cc:38: GetHumanReadableState().c_str()); On 2016/06/17 22:53:43, Scott Hess wrote: > Why the interposition? Why not just return base::StringPrintf() results > directly? Done. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_store.cc:57: DVLOG(1) << "Created V4Store: " << new_store->DebugString(); On 2016/06/17 22:53:43, Scott Hess wrote: > I'd have expected operator<< to do this for you. Done. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_store.cc:67: return state_base64; On 2016/06/17 22:53:43, Scott Hess wrote: > If you really want to have this, just inline it into DebugString(). Done. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_store.h (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_store.h:46: UpdatedStoreReadyCallback); On 2016/06/17 22:53:43, Scott Hess wrote: > While you're still early days, maybe try to keep this alphabetized. Done. In each visibility section: Constructur/Destructor Sorted simple methods Sorted other methods https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_store.h:51: std::string DebugString() const; On 2016/06/17 22:53:43, Scott Hess wrote: > Is it necessary to have debug-output stuff in here? Useful for debugging. Not linked into the binary unless used. As I mentioned in the other file, I find them useful right now but probably won't be very useful in about 3-6 months. I could add a crbug to remove these in 6 months. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_store.h:56: std::string state_; On 2016/06/17 22:53:43, Scott Hess wrote: > Is this the entire state of the store? Or is it some sort of etag-type hash > blob? Probably worth a comment. Done. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_store.h:61: // different from |other|. On 2016/06/17 22:53:43, Scott Hess wrote: > This feels like you're using the copy constructor as a convenience. Anyplace > that can reach this private declaration can also reach the instance variables, I > think that just explicitly passing the task runner and store path is going to be > much easier to understand at that point than some comment in some other file. Done. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_store.h:67: std::ostream& operator<<(std::ostream& os, const V4Store& store); On 2016/06/17 22:53:43, Scott Hess wrote: > I like this even less than the version on the list identifier. The UpdateListIdentifier is a struct so its members were visible. Not the case here so I need the DebugString method.
Don't cache store state. Makes ownership straightforward.
Use PlatformTest as the base class for V4UpdateProtocolManagerTest
On 2016/06/17 22:54:48, Scott Hess wrote: > Also, there's a subset of this CL which is just renaming type things. I fully > support pulling out pieces which are more stand-alone and getting those reviewed > in expedited fashion to make it easier to see what's left. http://crrev.com/2080603005
git fetch && git pull && gclient sync
I've removed the caching of the StoreStateMap from V4Database since it was difficult to justify keeping it over managing its lifetime. (This change was done in patch 10) https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... File components/safe_browsing_db/v4_database.cc (right): https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_database.cc:161: (*store_state_map_)[store_map_iter.first] = store_map_iter.second->state(); On 2016/06/17 21:34:19, Nathan Parker wrote: > On 2016/06/16 01:07:35, vakh wrote: > > On 2016/06/15 21:24:13, Nathan Parker wrote: > > > I'm confused: is the store_state_map just a view into the store_map? If so > > why > > > store it separately? > > > > Yes, it is a view. > > If we do not have a store_state_map_, we would need to get the state of the > > stores from store_map_ for each update. And since most updates do not lead to > > state changes, most of the time, the state of each store would be the same as > > last time. > > > > So, instead of looking up the state from the stores for each update, the > > store_state_map provides that information right away. > > What if you created this map on the heap only when it was time to schedule an > update, and passed it to the protocol manager (passing ownership of it)? That > way you remove the copy here and the protocol manager doesn't have to keep a ptr > back into this class. That would decouple them even more. It should be the > case that the state should not change before the update manager completes its > next fetch, yes? > > Keeping two data structures in sync can increase the risk of bugs where they're > not in sync. And we'd need semantics to ensure they're accessed on the correct > threads (I think they are, but it's not obvious from the code). > > If you think the current way is better then go ahead and submit and we'll hash > it out later (file a bug?). I don't want this point to hold you up. Done. See patch 10. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_database.h (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.h:79: const StoreStateMap* store_state_map() { return &store_state_map_; } On 2016/06/17 22:53:42, Scott Hess wrote: > This feels like it shouldn't be exposed to public:. Done. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_update_protocol_manager.h (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_update_protocol_manager.h:143: const StoreStateMap* store_state_map_; On 2016/06/17 22:53:43, Scott Hess wrote: > Having the instance retain a pointer to a map owned by another object seems like > a recipe for use-after-free errors. Is there any way to have a single reference > to this, which others access via accessor? Or, better yet, which only one > object needs to look at? Done.
Nit: Make V4Store's member variables private instead of protected, because why not?
git fetch && git pull
On 2016/06/17 at 22:54:48, shess wrote: > Also, there's a subset of this CL which is just renaming type things. I fully support pulling out pieces which are more stand-alone and getting those reviewed in expedited fashion to make it easier to see what's left. Done. Please see patch 14.
OK, this is a review-of-responses-to-comments. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_database.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:111: if (old_store->state() != response.new_client_state()) { On 2016/06/20 22:28:42, vakh wrote: > On 2016/06/17 22:53:42, Scott Hess wrote: > > I'm not entirely liking the auto for |response| and |old_store|. Maybe I'm > just > > too traditionalist, but it feels like almost anything could pop out, here. > > "for (const auto& iter ...)" is a commonly used and encouraged pattern so > leaving the response as is, but I've changed the declaration for old_store. Yeah, I'm just feeling like this case is not incredibly obvious what the various calls are doing, without popping over to the header to look at the definition of ListUpdateResponse or StoreMap. I mean like state() and new_client_state(). It's easy enough to guess the intention, but harder to see if the intention departed from what's actually happening :-). https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:119: response, current_task_runner, store_ready_callback)); On 2016/06/20 22:28:42, vakh wrote: > On 2016/06/17 22:53:42, Scott Hess wrote: > > I'm slightly questioning whether this is better done as a series of separately > > posted per-store tasks, or a single db-level task which iterates the stores. > I > > can see that there are cases where the set of responses will be greater than > the > > set of stores updated, but it feels like the common case is that all of the > > responses result in updates. > > > > Moving this to a db-level task would mean that you wouldn't have to track > > pending_store_updates_, and you also wouldn't have to have > store_ready_callback > > coalescing to call db_updated_callback_. In fact, in that case you could > > probably pass db_updated_callback as a parameter to ApplyUpdate() and get rid > of > > the member variable (which seems like kind of a wart). > > Writing a store to disk is a long operation so the idea is to keep individual > tasks on the task runner short in order to not hog the runner for all stores at > once. I'm not much familiar with the guidelines about this. > Do you know if it is OK to run long operations on the task runner? If it is, > then doing the update all at once is definitely better. I wouldn't be upset about running long jobs on the task runner - as far as I can tell, there's no particular recommendation or best practices on such things. If the infrastructure needs to make guarantees on whether things can be dispatched, it should either be making assertions about task runtime, or using a pool with appropriate serialization. If you think a single long-running task is poor practice, it still might make sense to post a single background task which chains to itself to pump stores through. But since each task is asymmetric, I'm not sure how valuable that would really be. It might make more sense if they ran in parallel, but that also may be too aggressive. [By pumping in the background, I mean a task which takes a collection of stores and a callback, then repeatedly does: - remove a store from the collection. - write that store. - if empty, post the callback. - else, repost the task with the reduced collection. That removes the outstanding counter, and allows to have a single point to dispatch the updated callback. ] https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:140: // condition in decrementing pending_store_updates_. On 2016/06/20 22:28:42, vakh wrote: > On 2016/06/17 22:53:42, Scott Hess wrote: > > Do we have any sort of thread annotations? > Sorry, I'm not sure what you mean. In some systems (perhaps not Chromium), you can annotate variables or accesses to say "This can only happen on thread X", and then automated tooling will verify that's happening. Simply making a comment like this is brittle. > > This comment seems like it belongs > > with pending_store_updates_ definition, because it needs to apply to all > > accesses, not just this one. > Done. > > > > > Maybe it would make sense to add a wrapper for accesses to enforce thread > usage > > guidelines, and have the decrement wrapper chain invoke the callback? > Do you know of any example that I can look at to understand how this can be done > and what exactly it does? What I mean is accessors coded like: void set_pending_store_updates(size_t p) { DCHECK_CURRENTLY_ON(BrowserThread::IO); pending_store_updates_ = p; } size_t pending_store_updates() const { DCHECK_CURRENTLY_ON(BrowserThread::IO); return pending_store_updates_; } void decrement_pending_store_updates() { DCHECK_CURRENTLY_ON(BrowserThread::IO); pending_store_updates_--; if (!pending_store_updates_) { db_updated_callback_.Run(); } } It's a bit heavyweight and redundant, but it does enforce your invariant. OTOH, it is a bit heavyweight to read! I think having the comment on the variable declaration is better, and it's not like this is a general-use class. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_database.h (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.h:23: // This callback is Run once the database has finished processing the update On 2016/06/20 22:28:42, vakh wrote: > On 2016/06/17 22:53:42, Scott Hess wrote: > > I don't think Run is a proper noun in this context. > > Changed to scheduled. :-). Should have been more clear, I meant that "Run" is written like a proper noun, like Mexico or Larry, but here it should be "run". Not that it was not _the_ proper noun to put there! Scheduled is fine, too. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.h:143: unsigned pending_store_updates_; On 2016/06/20 22:28:42, vakh wrote: > On 2016/06/17 22:53:42, Scott Hess wrote: > > That said, I wonder about tracking "count of outstanding stores to update" > > versus just tracking the actual outstanding stores to update. > I thought of that at first, but couldn't figure out any advantage of doing that, > over jut keeping a counter. The only thing that I could come up with was to help > debug or track how long each store update took or if there are any that are not > getting updated at all for some reason. Finally decided to keep it simple > because those cases aren't really worthy enough. OK. Mostly just worried about whether counting the same thing in multiple places can get out of sync. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_local_database_manager.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:197: #else On 2016/06/20 22:28:42, vakh wrote: > On 2016/06/17 22:53:43, Scott Hess wrote: > > I think there should be no default, > > #elif defined(OS_OSX) > > This code is getting compiled on Android for an unknown reason -- it should not. > I'll look into that later but in the mean time, in order to not break the > compilation, I defaulted to one of the platforms. > > If I leave the platform_type empty, the server won't recognize the request and > return an error response which will pollute our UMA metrics so leaving it as-is > for now. > > I know it is wrong, but probably less wrong than the alternative. Filed: > http://crbug.com/621647 Suggest having an explicit case for OS_OSX, plus an explicit #else case, with a TODO(vakh) referencing that bug. Perhaps even a comment about why the X_PLATFORM you're choosing for the #else case won't pollute your metrics. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:250: v4_database_->ApplyUpdate(responses); On 2016/06/20 22:28:43, vakh wrote: > On 2016/06/17 22:53:43, Scott Hess wrote: > > WRT my comment elsewhere with many store tasks versus a single db task, this > is > > the place I think would be cleaner if it was: > > v4_database_->ApplyUpdate(responses, base::Bind(&V4LDM::DatabaseUpdated, > this)); > > Then the reader would know looking only at this function where they need to > look > > next. > > Still not sure about updating all stores at once, please see my comment in > v4_database. > > Since the method to be called back is the same every time (without any > arguments), isn't it better to pass it at construction time? It's more efficient, probably. But a call like this: object->GoDoTheThing(parameter, parameter, base::Bind(&MyCallback, this)); makes it reasonably obvious what's happening. As compared to: void SetUpt() { object->SetTheCallback(base::Bind(&MyCallback, this)); } // hundreds of lines of code. object->GoDoTheThing(parameter, parameter); In the latter case, it's not immediately clear that there even is a flow-of-control thing, it reads like a synchronous operation. If you do want to initialize up front, then make sure you review your other callback use to see if they should all be regularized that way :-). https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_protocol_manager_util.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_protocol_manager_util.cc:27: } On 2016/06/20 22:28:43, vakh wrote: > On 2016/06/17 22:53:43, Scott Hess wrote: > > I'm not sure how this integrates with the overall stream. Maybe get some {} > > around the aggregate so that it's clear that this is a unit. > You mean just empty {}? It's not clear to me how that would be beneficial. I mean that whoever is invoking this might be coded like: LOG(ERROR) << "Got to " << __FUNCTION__ << ", list_id: " << list_id << ", something_else: " << something_else; this is going to come out like: Got to MyFunction, list_id: hash: 0x8888; platform_type: WINDOWS; threat_entry_type: TYPE; threat_type: TYPE, something_else: some string or something. everything is flat, so it's not clear where the list_id ends. So I'm suggesting something to make that clear, like: Got to MyFunction, list_id: {hash: 0x8888; platform_type: WINDOWS; threat_entry_type: TYPE; threat_type: TYPE}, something_else: some string or something. > > Actually, I'm not _entirely_ comfortable having logging stuff like this. It has > > an unclear cost which probably won't be of general utility. > This isn't logging stuff on its own and is very useful in debugging things > (specially when gdb isn't that useful). It does provide value in terms of time > saved debugging. Is there a recommended way of checking in code that is useful > in the near future (while building stuff), but may be not in the long term? > > I could add a crbug to remove this (and other debugging aids) after a quarter. Do you expect that you'll be regularly communicating with other developers to find these debug lines? What I've generally found with DLOG output is that often enough, I end up having to rewrite whatever is there anyhow, because whatever thing I was debugging last time was fixed, and the new thing is new otherwise I'd have debugged it last time. If it's just a convenience so that you don't have to rewrite the log lines in the future, you should keep them on a local git branch, rather than checking them into the repo. [BTW, I do this all the time. Usually it's pretty straight-forward, I'll just cherry-pick the logging patch over as needed to debug something.] I guess where I'm going is that IMHO you should only checkin logging code if you have a reason to believe you'll regularly ask for the results of the logging code. Often enough, the easiest answer is to not checkin the logging code until you find the case where you would have asked for it, _then_ check it in. 99% of the time you'd have never asked, and the logging code will site in the tree forever, waiting for someone to actually use it. The kinds of cases I hate are when I'm trying to debug a build break, and there are a hundred lines of logging output which I can't tell if it is actually telling me there's a problem, or if someone is logging it just in case it might be useful someday. Almost all the time, it's just-in-case logging, and I'm annoyed that I have to wade through it looking for the pertinent data. I _think_ you're using DVLOG, so that's less of an issue - but, of course, in that case it means that you can't have the case of finding the log output retrospectively, you have to intend to get it. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_store.h (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_store.h:67: std::ostream& operator<<(std::ostream& os, const V4Store& store); On 2016/06/20 22:28:43, vakh wrote: > On 2016/06/17 22:53:43, Scott Hess wrote: > > I like this even less than the version on the list identifier. > > The UpdateListIdentifier is a struct so its members were visible. Not the case > here so I need the DebugString method. Acknowledged.
Apologies, I lost mental containment and bailed on the last few files. They had fewer changes, I'm sure they were probably solid :-). https://codereview.chromium.org/2062013002/diff/260001/chrome/browser/safe_br... File chrome/browser/safe_browsing/services_delegate_impl.cc (right): https://codereview.chromium.org/2062013002/diff/260001/chrome/browser/safe_br... chrome/browser/safe_browsing/services_delegate_impl.cc:20: #ifdef NDEBUG Grrr, stupid compiler. WDYT of structuring the later thing as: #ifndef NDEBUG return true; #endif return base::FeatureList::IsEnabled( kSafeBrowsingV4LocalDatabaseManagerEnabled); ??? https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... File components/safe_browsing_db/v4_database.cc (right): https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:30: // thread. Did this comment float away from the code it refers to? https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:102: // current thread (IO thread). IO thread is implied two lines up. I think "current thread" is more correct, because it will be right even if this moves to a different thread. https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... File components/safe_browsing_db/v4_database.h (right): https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... components/safe_browsing_db/v4_database.h:134: // possibility of a race condition in decrementing it. I don't think decremented is the key feature, because it is also set by another method. Just say that it should only be accessed on the IO thread, don't worry about how it is accessed. https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... File components/safe_browsing_db/v4_database_unittest.cc (right): https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... components/safe_browsing_db/v4_database_unittest.cc:81: expected_identifiers_.push_back(win_malware_id); I guess that worked out well! https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... components/safe_browsing_db/v4_database_unittest.cc:141: } This feels odd to me. It's setting a member variable to a combination of other member variables and a parameter, and it's only ever called once per test fixture. At this point, it feels like SetupInfoMapAndExpectedState() just belongs up in SetUp(). The callback happens on the logical test thread, so it can can just access the expected_*_ variables directly, rather than having them as parameters. expected_resets_successfully could then also be a member variable for the test fixture to set, and at that point this callback can also just be moved into SetUp() also. If that's entirely wrong, then instead of having this set the callback as a side effect, it might be clearer if this returned the callback which the calling code sets. https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... components/safe_browsing_db/v4_database_unittest.cc:153: EXPECT_EQ(1ul, new_store_state_map->count(identifier)); The new_store_map case needs to be an ASSERT, because below you deref it. The new_store_state_map can remain EXPECT, but I think being consistent between the two is probably better. 1u should be sufficient, and places less constraints on size_t. [I hate that there is no contains(). Yeah, checking count() or find()!=end() works, but it's stupidly indirect.] https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... components/safe_browsing_db/v4_database_unittest.cc:162: new_store_map->at(identifier).get()); I'd say either consistently use [] or at(), just to reduce the ??? in the reader's head. I'm neutral on which to use, (*new_store_map_)[].get() is atrocious, too. But *shrug*. https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... components/safe_browsing_db/v4_database_unittest.cc:192: SetupNewDatabaseReadyCallback(true); My earlier suggestion would push the two Setup*() calls into SetUp() ... the Register could also go there if you had SetUp() setup a factory with a pointer to a boolean which the test fixture could set. I could see that going either way, though. https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... File components/safe_browsing_db/v4_store.cc (right): https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... components/safe_browsing_db/v4_store.cc:35: store_path_.value().c_str(), state_base64.c_str()); You should be able to drop |value| and just return base::StringPrintf().
git fetch && pull
Address comments by shess@
Thanks for another round of review, shess@ I've addressed some of the comments and responded to others. PTAL. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_database.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:111: if (old_store->state() != response.new_client_state()) { On 2016/06/21 at 21:03:44, Scott Hess wrote: > On 2016/06/20 22:28:42, vakh wrote: > > On 2016/06/17 22:53:42, Scott Hess wrote: > > > I'm not entirely liking the auto for |response| and |old_store|. Maybe I'm > > just > > > too traditionalist, but it feels like almost anything could pop out, here. > > > > "for (const auto& iter ...)" is a commonly used and encouraged pattern so > > leaving the response as is, but I've changed the declaration for old_store. > > Yeah, I'm just feeling like this case is not incredibly obvious what the various calls are doing, without popping over to the header to look at the definition of ListUpdateResponse or StoreMap. I mean like state() and new_client_state(). It's easy enough to guess the intention, but harder to see if the intention departed from what's actually happening :-). Fair enough. Done. Changed for response. For the other occurrences, those are map iterators and I'd prefer keeping them that way. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:140: // condition in decrementing pending_store_updates_. On 2016/06/21 at 21:03:43, Scott Hess wrote: > On 2016/06/20 22:28:42, vakh wrote: > > On 2016/06/17 22:53:42, Scott Hess wrote: > > > Do we have any sort of thread annotations? > > Sorry, I'm not sure what you mean. > > In some systems (perhaps not Chromium), you can annotate variables or accesses to say "This can only happen on thread X", and then automated tooling will verify that's happening. Simply making a comment like this is brittle. > Acknowledged. I've moved the comment next to the variable declaration. > > > This comment seems like it belongs > > > with pending_store_updates_ definition, because it needs to apply to all > > > accesses, not just this one. > > Done. > > > > > > > > Maybe it would make sense to add a wrapper for accesses to enforce thread > > usage > > > guidelines, and have the decrement wrapper chain invoke the callback? > > Do you know of any example that I can look at to understand how this can be done > > and what exactly it does? > > What I mean is accessors coded like: > > void set_pending_store_updates(size_t p) { > DCHECK_CURRENTLY_ON(BrowserThread::IO); > pending_store_updates_ = p; > } > size_t pending_store_updates() const { > DCHECK_CURRENTLY_ON(BrowserThread::IO); > return pending_store_updates_; > } > void decrement_pending_store_updates() { > DCHECK_CURRENTLY_ON(BrowserThread::IO); > pending_store_updates_--; > if (!pending_store_updates_) { > db_updated_callback_.Run(); > } > } > > It's a bit heavyweight and redundant, but it does enforce your invariant. > > OTOH, it is a bit heavyweight to read! I think having the comment on the variable declaration is better, and it's not like this is a general-use class. Yes, I agree that it is a bit heavyweight to read, specially when this is for a private variable so any code can already read/modify it without having to use those interfaces. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_database.h (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.h:23: // This callback is Run once the database has finished processing the update On 2016/06/21 at 21:03:44, Scott Hess wrote: > On 2016/06/20 22:28:42, vakh wrote: > > On 2016/06/17 22:53:42, Scott Hess wrote: > > > I don't think Run is a proper noun in this context. > > > > Changed to scheduled. > > :-). Should have been more clear, I meant that "Run" is written like a proper noun, like Mexico or Larry, but here it should be "run". Not that it was not _the_ proper noun to put there! Scheduled is fine, too. I had a hunch that's what you meant but wasn't sure. "scheduled" is also more accurate :) https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.h:143: unsigned pending_store_updates_; On 2016/06/21 at 21:03:44, Scott Hess wrote: > On 2016/06/20 22:28:42, vakh wrote: > > On 2016/06/17 22:53:42, Scott Hess wrote: > > > That said, I wonder about tracking "count of outstanding stores to update" > > > versus just tracking the actual outstanding stores to update. > > I thought of that at first, but couldn't figure out any advantage of doing that, > > over jut keeping a counter. The only thing that I could come up with was to help > > debug or track how long each store update took or if there are any that are not > > getting updated at all for some reason. Finally decided to keep it simple > > because those cases aren't really worthy enough. > > OK. Mostly just worried about whether counting the same thing in multiple places can get out of sync. Yes, I agree but I think in this particular case, we are using thread synchronization to ensure that all the increments happen before all the decrements so I it's not risky. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_local_database_manager.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:197: #else On 2016/06/21 at 21:03:44, Scott Hess wrote: > On 2016/06/20 22:28:42, vakh wrote: > > On 2016/06/17 22:53:43, Scott Hess wrote: > > > I think there should be no default, > > > #elif defined(OS_OSX) > > > > This code is getting compiled on Android for an unknown reason -- it should not. > > I'll look into that later but in the mean time, in order to not break the > > compilation, I defaulted to one of the platforms. > > > > If I leave the platform_type empty, the server won't recognize the request and > > return an error response which will pollute our UMA metrics so leaving it as-is > > for now. > > > > I know it is wrong, but probably less wrong than the alternative. Filed: > > http://crbug.com/621647 > > Suggest having an explicit case for OS_OSX, plus an explicit #else case, with a TODO(vakh) referencing that bug. Perhaps even a comment about why the X_PLATFORM you're choosing for the #else case won't pollute your metrics. Done. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:250: v4_database_->ApplyUpdate(responses); On 2016/06/21 at 21:03:44, Scott Hess wrote: > On 2016/06/20 22:28:43, vakh wrote: > > On 2016/06/17 22:53:43, Scott Hess wrote: > > > WRT my comment elsewhere with many store tasks versus a single db task, this > > is > > > the place I think would be cleaner if it was: > > > v4_database_->ApplyUpdate(responses, base::Bind(&V4LDM::DatabaseUpdated, > > this)); > > > Then the reader would know looking only at this function where they need to > > look > > > next. > > > > Still not sure about updating all stores at once, please see my comment in > > v4_database. > > > > Since the method to be called back is the same every time (without any > > arguments), isn't it better to pass it at construction time? > > It's more efficient, probably. But a call like this: > > object->GoDoTheThing(parameter, parameter, base::Bind(&MyCallback, this)); > > makes it reasonably obvious what's happening. As compared to: > > void SetUpt() { > object->SetTheCallback(base::Bind(&MyCallback, this)); > } > // hundreds of lines of code. > object->GoDoTheThing(parameter, parameter); > > In the latter case, it's not immediately clear that there even is a flow-of-control thing, it reads like a synchronous operation. > > If you do want to initialize up front, then make sure you review your other callback use to see if they should all be regularized that way :-). The other callbacks in this class also work the same way i.e. setup callback at init and then keep getting called back. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_protocol_manager_util.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_protocol_manager_util.cc:27: } On 2016/06/21 at 21:03:44, Scott Hess wrote: > On 2016/06/20 22:28:43, vakh wrote: > > On 2016/06/17 22:53:43, Scott Hess wrote: > > > I'm not sure how this integrates with the overall stream. Maybe get some {} > > > around the aggregate so that it's clear that this is a unit. > > You mean just empty {}? It's not clear to me how that would be beneficial. > > I mean that whoever is invoking this might be coded like: > LOG(ERROR) << "Got to " << __FUNCTION__ << ", list_id: " << list_id << ", something_else: " << something_else; > this is going to come out like: > Got to MyFunction, list_id: hash: 0x8888; platform_type: WINDOWS; threat_entry_type: TYPE; threat_type: TYPE, something_else: some string or something. > everything is flat, so it's not clear where the list_id ends. So I'm suggesting something to make that clear, like: > Got to MyFunction, list_id: {hash: 0x8888; platform_type: WINDOWS; threat_entry_type: TYPE; threat_type: TYPE}, something_else: some string or something. > Done. > > > Actually, I'm not _entirely_ comfortable having logging stuff like this. It has > > > an unclear cost which probably won't be of general utility. > > This isn't logging stuff on its own and is very useful in debugging things > > (specially when gdb isn't that useful). It does provide value in terms of time > > saved debugging. Is there a recommended way of checking in code that is useful > > in the near future (while building stuff), but may be not in the long term? > > > > I could add a crbug to remove this (and other debugging aids) after a quarter. > > Do you expect that you'll be regularly communicating with other developers to find these debug lines? What I've generally found with DLOG output is that often enough, I end up having to rewrite whatever is there anyhow, because whatever thing I was debugging last time was fixed, and the new thing is new otherwise I'd have debugged it last time. If it's just a convenience so that you don't have to rewrite the log lines in the future, you should keep them on a local git branch, rather than checking them into the repo. [BTW, I do this all the time. Usually it's pretty straight-forward, I'll just cherry-pick the logging patch over as needed to debug something.] > > I guess where I'm going is that IMHO you should only checkin logging code if you have a reason to believe you'll regularly ask for the results of the logging code. Often enough, the easiest answer is to not checkin the logging code until you find the case where you would have asked for it, _then_ check it in. 99% of the time you'd have never asked, and the logging code will site in the tree forever, waiting for someone to actually use it. > > The kinds of cases I hate are when I'm trying to debug a build break, and there are a hundred lines of logging output which I can't tell if it is actually telling me there's a problem, or if someone is logging it just in case it might be useful someday. Almost all the time, it's just-in-case logging, and I'm annoyed that I have to wade through it looking for the pertinent data. I _think_ you're using DVLOG, so that's less of an issue - but, of course, in that case it means that you can't have the case of finding the log output retrospectively, you have to intend to get it. Acknowledged. WRT this particular method, it is not outputting any log by itself so there's no harm in keeping it. As you pointed out, the other statements are also DVLOGs so they'll only be printed in a DEBUG build and that too only about once in half-an-hour so other developers shouldn't be too affected.
Nit: store_file_name_map ready at compile time
Minor: Filename format in CamelCase: <ThreatEntryType><ThreatType>.store
https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... File components/safe_browsing_db/v4_local_database_manager.cc (right): https://codereview.chromium.org/2062013002/diff/80001/components/safe_browsin... components/safe_browsing_db/v4_local_database_manager.cc:204: (*store_file_name_map)[update_list_identifier] = "os_url_malware.store"; On 2016/06/17 21:34:19, Nathan Parker wrote: > On 2016/06/16 01:07:35, vakh wrote: > > On 2016/06/15 21:24:19, Nathan Parker wrote: > > > We should come up with a list of all the filenames we plan to use (for > > existing > > > lists), to make sure we've got a naming strategy that'll work going forward. > > > > And document that. > > > > > > I think we should put them in a "Safe Browsing/" directory, and given that > > most > > > things in the data dir use Upper Case and Spaces, we should stick to that > too. > > > > > > (Could be done later) This could be defined at compile time rather than run > > > time, to make it easier to read and edit. > > > > > > StoreFilenameMap default_store_filename_map = { > > > {{PLATFORM, URL, MALWARE_THREAT}, "some malware filename"}, > > > {{PLATFORM, URL, SOCIAL_ENGINEERING_PUBLIC}, "yadda yadda"}, > > > ... > > > } > > > > > > With that, PLATFORM could be a #define'd macro. > > > > Acknowledged. Work for a later CL. > > When we we change the names later, we'll need to write code to delete these > since they'll be left on some users' disk. Done. The files are being created under the "Safe Browsing/" directory, which we get from safe_browsing_service. Inside that directory, I don't know of any good reason to have spaces in names, but I agree that Chromium doesn't use underscores so I've changed the files to have CamelCase. See patch 17.
CR feedback from shess@
Thanks for the review. I have addressed all the comments so far. PTAL. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_database.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:119: response, current_task_runner, store_ready_callback)); On 2016/06/21 21:03:44, Scott Hess wrote: > On 2016/06/20 22:28:42, vakh wrote: > > On 2016/06/17 22:53:42, Scott Hess wrote: > > > I'm slightly questioning whether this is better done as a series of > separately > > > posted per-store tasks, or a single db-level task which iterates the stores. > > > I > > > can see that there are cases where the set of responses will be greater than > > the > > > set of stores updated, but it feels like the common case is that all of the > > > responses result in updates. > > > > > > Moving this to a db-level task would mean that you wouldn't have to track > > > pending_store_updates_, and you also wouldn't have to have > > store_ready_callback > > > coalescing to call db_updated_callback_. In fact, in that case you could > > > probably pass db_updated_callback as a parameter to ApplyUpdate() and get > rid > > of > > > the member variable (which seems like kind of a wart). > > > > Writing a store to disk is a long operation so the idea is to keep individual > > tasks on the task runner short in order to not hog the runner for all stores > at > > once. I'm not much familiar with the guidelines about this. > > Do you know if it is OK to run long operations on the task runner? If it is, > > then doing the update all at once is definitely better. > > I wouldn't be upset about running long jobs on the task runner - as far as I can > tell, there's no particular recommendation or best practices on such things. If > the infrastructure needs to make guarantees on whether things can be dispatched, > it should either be making assertions about task runtime, or using a pool with > appropriate serialization. > > If you think a single long-running task is poor practice, it still might make > sense to post a single background task which chains to itself to pump stores > through. But since each task is asymmetric, I'm not sure how valuable that > would really be. It might make more sense if they ran in parallel, but that > also may be too aggressive. > > [By pumping in the background, I mean a task which takes a collection of stores > and a callback, then repeatedly does: > - remove a store from the collection. > - write that store. > - if empty, post the callback. > - else, repost the task with the reduced collection. > That removes the outstanding counter, and allows to have a single point to > dispatch the updated callback. > ] If you feel strongly about making this change, I can go ahead and do it. However, I get the sense that both of us agree and disagree with each other, and none of us feels strongly about it. The current implementation is along the lines of what I proposed in the design document so I propose that we keep the current implementation. If at a later time we have a strong and evidence-based reason to change it, we'll do it then. WDYT? https://codereview.chromium.org/2062013002/diff/260001/chrome/browser/safe_br... File chrome/browser/safe_browsing/services_delegate_impl.cc (right): https://codereview.chromium.org/2062013002/diff/260001/chrome/browser/safe_br... chrome/browser/safe_browsing/services_delegate_impl.cc:20: #ifdef NDEBUG On 2016/06/21 22:23:25, Scott Hess wrote: > Grrr, stupid compiler. > > WDYT of structuring the later thing as: > > #ifndef NDEBUG > return true; > #endif > return base::FeatureList::IsEnabled( > kSafeBrowsingV4LocalDatabaseManagerEnabled); > > ??? D'oh! Should have done that in the first place. Done now. https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... File components/safe_browsing_db/v4_database.cc (right): https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:30: // thread. On 2016/06/21 22:23:25, Scott Hess wrote: > Did this comment float away from the code it refers to? This is a comment about what this method does, so I've kept it at the top of the method definition. Would it be better to have it just before the method definition? https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:102: // current thread (IO thread). On 2016/06/21 22:23:25, Scott Hess wrote: > IO thread is implied two lines up. I think "current thread" is more correct, > because it will be right even if this moves to a different thread. Done. https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... File components/safe_browsing_db/v4_database.h (right): https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... components/safe_browsing_db/v4_database.h:134: // possibility of a race condition in decrementing it. On 2016/06/21 22:23:25, Scott Hess wrote: > I don't think decremented is the key feature, because it is also set by another > method. Just say that it should only be accessed on the IO thread, don't worry > about how it is accessed. Done. https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... File components/safe_browsing_db/v4_database_unittest.cc (right): https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... components/safe_browsing_db/v4_database_unittest.cc:81: expected_identifiers_.push_back(win_malware_id); On 2016/06/21 22:23:25, Scott Hess wrote: > I guess that worked out well! Acknowledged. https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... components/safe_browsing_db/v4_database_unittest.cc:141: } On 2016/06/21 22:23:25, Scott Hess wrote: > This feels odd to me. It's setting a member variable to a combination of other > member variables and a parameter, and it's only ever called once per test > fixture. > > At this point, it feels like SetupInfoMapAndExpectedState() just belongs up in > SetUp(). The callback happens on the logical test thread, so it can can just > access the expected_*_ variables directly, rather than having them as > parameters. expected_resets_successfully could then also be a member variable > for the test fixture to set, and at that point this callback can also just be > moved into SetUp() also. > > If that's entirely wrong, then instead of having this set the callback as a side > effect, it might be clearer if this returned the callback which the calling code > sets. Makes good sense. Done. https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... components/safe_browsing_db/v4_database_unittest.cc:153: EXPECT_EQ(1ul, new_store_state_map->count(identifier)); On 2016/06/21 22:23:25, Scott Hess wrote: > The new_store_map case needs to be an ASSERT, because below you deref it. The > new_store_state_map can remain EXPECT, but I think being consistent between the > two is probably better. > Done > 1u should be sufficient, and places less constraints on size_t. [I hate that > there is no contains(). Yeah, checking count() or find()!=end() works, but it's > stupidly indirect.] Wholeheartedly agree :-) https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... components/safe_browsing_db/v4_database_unittest.cc:162: new_store_map->at(identifier).get()); On 2016/06/21 22:23:25, Scott Hess wrote: > I'd say either consistently use [] or at(), just to reduce the ??? in the > reader's head. I'm neutral on which to use, (*new_store_map_)[].get() is > atrocious, too. But *shrug*. I prefer to use [], but: 1. I really dislike: (*new_store_map_)[] 2. In order for the [] operator to be available, I need to change the new_store_map from 'const StoreMap*' to 'StoreMap*', which IMO better to have than the minor readability gains from using [] So, I've changed all occurences to use at() https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... components/safe_browsing_db/v4_database_unittest.cc:192: SetupNewDatabaseReadyCallback(true); On 2016/06/21 22:23:25, Scott Hess wrote: > My earlier suggestion would push the two Setup*() calls into SetUp() ... the > Register could also go there if you had SetUp() setup a factory with a pointer > to a boolean which the test fixture could set. I could see that going either > way, though. I've moved the two Setup* methods to SetUp as you mentioned but the Register suggestions seems a little bit of an overkill so not doing that for now. If you feel strongly about it (I don't), I'll change it. https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... File components/safe_browsing_db/v4_store.cc (right): https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... components/safe_browsing_db/v4_store.cc:35: store_path_.value().c_str(), state_base64.c_str()); On 2016/06/21 22:23:25, Scott Hess wrote: > You should be able to drop |value| and just return base::StringPrintf(). Done.
git fetch && git pull && gclient sync
I agree we're getting close to the end of the line, here. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_database.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:111: if (old_store->state() != response.new_client_state()) { On 2016/06/21 23:19:35, vakh wrote: > On 2016/06/21 at 21:03:44, Scott Hess wrote: > > On 2016/06/20 22:28:42, vakh wrote: > > > On 2016/06/17 22:53:42, Scott Hess wrote: > > > > I'm not entirely liking the auto for |response| and |old_store|. Maybe > I'm > > > just > > > > too traditionalist, but it feels like almost anything could pop out, here. > > > > > > "for (const auto& iter ...)" is a commonly used and encouraged pattern so > > > leaving the response as is, but I've changed the declaration for old_store. > > > > Yeah, I'm just feeling like this case is not incredibly obvious what the > various calls are doing, without popping over to the header to look at the > definition of ListUpdateResponse or StoreMap. I mean like state() and > new_client_state(). It's easy enough to guess the intention, but harder to see > if the intention departed from what's actually happening :-). > > Fair enough. Done. Changed for response. > For the other occurrences, those are map iterators and I'd prefer keeping them > that way. Acknowledged. I feel like this looks less magical. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:119: response, current_task_runner, store_ready_callback)); On 2016/06/23 06:23:49, vakh wrote: > On 2016/06/21 21:03:44, Scott Hess wrote: > > On 2016/06/20 22:28:42, vakh wrote: > > > On 2016/06/17 22:53:42, Scott Hess wrote: > > > > I'm slightly questioning whether this is better done as a series of > > separately > > > > posted per-store tasks, or a single db-level task which iterates the > stores. > > > > > I > > > > can see that there are cases where the set of responses will be greater > than > > > the > > > > set of stores updated, but it feels like the common case is that all of > the > > > > responses result in updates. > > > > > > > > Moving this to a db-level task would mean that you wouldn't have to track > > > > pending_store_updates_, and you also wouldn't have to have > > > store_ready_callback > > > > coalescing to call db_updated_callback_. In fact, in that case you could > > > > probably pass db_updated_callback as a parameter to ApplyUpdate() and get > > rid > > > of > > > > the member variable (which seems like kind of a wart). > > > > > > Writing a store to disk is a long operation so the idea is to keep > individual > > > tasks on the task runner short in order to not hog the runner for all stores > > at > > > once. I'm not much familiar with the guidelines about this. > > > Do you know if it is OK to run long operations on the task runner? If it is, > > > then doing the update all at once is definitely better. > > > > I wouldn't be upset about running long jobs on the task runner - as far as I > can > > tell, there's no particular recommendation or best practices on such things. > If > > the infrastructure needs to make guarantees on whether things can be > dispatched, > > it should either be making assertions about task runtime, or using a pool with > > appropriate serialization. > > > > If you think a single long-running task is poor practice, it still might make > > sense to post a single background task which chains to itself to pump stores > > through. But since each task is asymmetric, I'm not sure how valuable that > > would really be. It might make more sense if they ran in parallel, but that > > also may be too aggressive. > > > > [By pumping in the background, I mean a task which takes a collection of > stores > > and a callback, then repeatedly does: > > - remove a store from the collection. > > - write that store. > > - if empty, post the callback. > > - else, repost the task with the reduced collection. > > That removes the outstanding counter, and allows to have a single point to > > dispatch the updated callback. > > ] > > If you feel strongly about making this change, I can go ahead and do it. > However, I get the sense that both of us agree and disagree with each other, and > none of us feels strongly about it. > The current implementation is along the lines of what I proposed in the design > document so I propose that we keep the current implementation. If at a later > time we have a strong and evidence-based reason to change it, we'll do it then. > WDYT? OK, I do think it's a style thing rather than a real functional issue. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_local_database_manager.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:250: v4_database_->ApplyUpdate(responses); On 2016/06/21 23:19:35, vakh wrote: > On 2016/06/21 at 21:03:44, Scott Hess wrote: > > On 2016/06/20 22:28:43, vakh wrote: > > > On 2016/06/17 22:53:43, Scott Hess wrote: > > > > WRT my comment elsewhere with many store tasks versus a single db task, > this > > > is > > > > the place I think would be cleaner if it was: > > > > v4_database_->ApplyUpdate(responses, base::Bind(&V4LDM::DatabaseUpdated, > > > this)); > > > > Then the reader would know looking only at this function where they need > to > > > look > > > > next. > > > > > > Still not sure about updating all stores at once, please see my comment in > > > v4_database. > > > > > > Since the method to be called back is the same every time (without any > > > arguments), isn't it better to pass it at construction time? > > > > It's more efficient, probably. But a call like this: > > > > object->GoDoTheThing(parameter, parameter, base::Bind(&MyCallback, this)); > > > > makes it reasonably obvious what's happening. As compared to: > > > > void SetUpt() { > > object->SetTheCallback(base::Bind(&MyCallback, this)); > > } > > // hundreds of lines of code. > > object->GoDoTheThing(parameter, parameter); > > > > In the latter case, it's not immediately clear that there even is a > flow-of-control thing, it reads like a synchronous operation. > > > > If you do want to initialize up front, then make sure you review your other > callback use to see if they should all be regularized that way :-). > > The other callbacks in this class also work the same way i.e. setup callback at > init and then keep getting called back. AFAICT there's only that one callback which is stored as a member variable. I may be misreading something. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_protocol_manager_util.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_protocol_manager_util.cc:27: } On 2016/06/21 23:19:35, vakh wrote: > Acknowledged. WRT this particular method, it is not outputting any log by itself > so there's no harm in keeping it. As you pointed out, the other statements are > also DVLOGs so they'll only be printed in a DEBUG build and that too only about > once in half-an-hour so other developers shouldn't be too affected. I'll grant that there may not be _much_ harm in keeping it, but I think "no harm in keeping it" isn't right. Unnecessary log lines add to compile time in all cases, reduce readability, and in this case add to binary size in debug, and possibly adds irrelevant output if someone uses the wrong --verbose rules. Those costs potentially apply to everyone on the project, whereas the benefits probably only apply to you. [This be didn't get into my bonnet with Chrome - I used to greatly despise having a production emergency where my first order of business was to write filters to let me try to figure out which bits of a 100MB daemond.ERROR file might be relevant.] https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... File components/safe_browsing_db/v4_database.cc (right): https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:30: // thread. On 2016/06/23 06:23:50, vakh wrote: > On 2016/06/21 22:23:25, Scott Hess wrote: > > Did this comment float away from the code it refers to? > > This is a comment about what this method does, so I've kept it at the top of the > method definition. Would it be better to have it just before the method > definition? I'd put it on the method itself, since inline like this is looks like it's telling you what the next few lines do. The comment out in the header file pretty much covers the same ground, so another option is to just drop it entirely. That way you don't have to keep the code and two comments in sync. https://codereview.chromium.org/2062013002/diff/380001/chrome/browser/safe_br... File chrome/browser/safe_browsing/services_delegate_impl.cc (right): https://codereview.chromium.org/2062013002/diff/380001/chrome/browser/safe_br... chrome/browser/safe_browsing/services_delegate_impl.cc:193: kSafeBrowsingV4LocalDatabaseManagerEnabled); Is this clang-format? If not, prefer to keep it the same as the previous format. Yes, OCD, I like diffs to be minimal, until they become maximal. I wonder if you changed it to: #ifndef NDEBUG return true; #endif return ...; if that would allow removing the #ifdef above? https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... File components/safe_browsing_db/v4_database.cc (right): https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:155: base::MakeUnique<StoreStateMap>(); Can this do the right thing as a StoreStateMap without the unique_ptr? I think so. Just a modest preference, if you think it makes it less clear that it does the right thing, unique_ptr is reasonable. https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... File components/safe_browsing_db/v4_database_unittest.cc (right): https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... components/safe_browsing_db/v4_database_unittest.cc:82: V4Database::RegisterStoreFactoryForTest(factory_.get()); I think your point about not wanting to put the Register() into setup is reasonable. But it occurs to me that you might want to either put Register(NULL) in TearDown(), or leak this object, rather than leaving a potentially-stale reference outstanding. https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... File components/safe_browsing_db/v4_local_database_manager.cc (right): https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:30: // TODO(vakh): Once that bug is fixes, this should be removed. If we leave s/fixes/fixed/ https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... File components/safe_browsing_db/v4_update_protocol_manager.cc (right): https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... components/safe_browsing_db/v4_update_protocol_manager.cc:196: const StoreStateMap* store_state_map) { WRT my other comment about "Does StoreStateMap need unique_ptr?", in this case the code requires that store_state_map not be NULL, so it could be a reference. I'm not entirely clear if there's potential to get a NULL here, now that I think about it.
Minor: CR feedback: shess@
git fetch && git pull && gclient sync
https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_database.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:119: response, current_task_runner, store_ready_callback)); On 2016/06/24 23:01:18, Scott Hess wrote: > On 2016/06/23 06:23:49, vakh wrote: > > On 2016/06/21 21:03:44, Scott Hess wrote: > > > On 2016/06/20 22:28:42, vakh wrote: > > > > On 2016/06/17 22:53:42, Scott Hess wrote: > > > > > I'm slightly questioning whether this is better done as a series of > > > separately > > > > > posted per-store tasks, or a single db-level task which iterates the > > stores. > > > > > > > I > > > > > can see that there are cases where the set of responses will be greater > > than > > > > the > > > > > set of stores updated, but it feels like the common case is that all of > > the > > > > > responses result in updates. > > > > > > > > > > Moving this to a db-level task would mean that you wouldn't have to > track > > > > > pending_store_updates_, and you also wouldn't have to have > > > > store_ready_callback > > > > > coalescing to call db_updated_callback_. In fact, in that case you > could > > > > > probably pass db_updated_callback as a parameter to ApplyUpdate() and > get > > > rid > > > > of > > > > > the member variable (which seems like kind of a wart). > > > > > > > > Writing a store to disk is a long operation so the idea is to keep > > individual > > > > tasks on the task runner short in order to not hog the runner for all > stores > > > at > > > > once. I'm not much familiar with the guidelines about this. > > > > Do you know if it is OK to run long operations on the task runner? If it > is, > > > > then doing the update all at once is definitely better. > > > > > > I wouldn't be upset about running long jobs on the task runner - as far as I > > can > > > tell, there's no particular recommendation or best practices on such things. > > > If > > > the infrastructure needs to make guarantees on whether things can be > > dispatched, > > > it should either be making assertions about task runtime, or using a pool > with > > > appropriate serialization. > > > > > > If you think a single long-running task is poor practice, it still might > make > > > sense to post a single background task which chains to itself to pump stores > > > through. But since each task is asymmetric, I'm not sure how valuable that > > > would really be. It might make more sense if they ran in parallel, but that > > > also may be too aggressive. > > > > > > [By pumping in the background, I mean a task which takes a collection of > > stores > > > and a callback, then repeatedly does: > > > - remove a store from the collection. > > > - write that store. > > > - if empty, post the callback. > > > - else, repost the task with the reduced collection. > > > That removes the outstanding counter, and allows to have a single point to > > > dispatch the updated callback. > > > ] > > > > If you feel strongly about making this change, I can go ahead and do it. > > However, I get the sense that both of us agree and disagree with each other, > and > > none of us feels strongly about it. > > The current implementation is along the lines of what I proposed in the design > > document so I propose that we keep the current implementation. If at a later > > time we have a strong and evidence-based reason to change it, we'll do it > then. > > WDYT? > > OK, I do think it's a style thing rather than a real functional issue. Acknowledged. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_local_database_manager.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:250: v4_database_->ApplyUpdate(responses); On 2016/06/24 23:01:19, Scott Hess wrote: > On 2016/06/21 23:19:35, vakh wrote: > > On 2016/06/21 at 21:03:44, Scott Hess wrote: > > > On 2016/06/20 22:28:43, vakh wrote: > > > > On 2016/06/17 22:53:43, Scott Hess wrote: > > > > > WRT my comment elsewhere with many store tasks versus a single db task, > > this > > > > is > > > > > the place I think would be cleaner if it was: > > > > > v4_database_->ApplyUpdate(responses, base::Bind(&V4LDM::DatabaseUpdated, > > > > this)); > > > > > Then the reader would know looking only at this function where they need > > to > > > > look > > > > > next. > > > > > > > > Still not sure about updating all stores at once, please see my comment in > > > > v4_database. > > > > > > > > Since the method to be called back is the same every time (without any > > > > arguments), isn't it better to pass it at construction time? > > > > > > It's more efficient, probably. But a call like this: > > > > > > object->GoDoTheThing(parameter, parameter, base::Bind(&MyCallback, this)); > > > > > > makes it reasonably obvious what's happening. As compared to: > > > > > > void SetUpt() { > > > object->SetTheCallback(base::Bind(&MyCallback, this)); > > > } > > > // hundreds of lines of code. > > > object->GoDoTheThing(parameter, parameter); > > > > > > In the latter case, it's not immediately clear that there even is a > > flow-of-control thing, it reads like a synchronous operation. > > > > > > If you do want to initialize up front, then make sure you review your other > > callback use to see if they should all be regularized that way :-). > > > > The other callbacks in this class also work the same way i.e. setup callback > at > > init and then keep getting called back. > > AFAICT there's only that one callback which is stored as a member variable. I > may be misreading something. Your reading is right. What I wrote is different than what I meant, and is incorrect. The other callback (NewDbReadyCallback) is only called back once not repeatedly. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_protocol_manager_util.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_protocol_manager_util.cc:27: } On 2016/06/24 23:01:19, Scott Hess wrote: > On 2016/06/21 23:19:35, vakh wrote: > > Acknowledged. WRT this particular method, it is not outputting any log by > itself > > so there's no harm in keeping it. As you pointed out, the other statements are > > also DVLOGs so they'll only be printed in a DEBUG build and that too only > about > > once in half-an-hour so other developers shouldn't be too affected. > It is being used in a valid case: if we get an update for a list that we didn't expect, it prints that list's identifier in logs. In V4Database::ApplyUpdate() The alternative is to "inline" this method in that error condition. Given that, if you still feel that it is not useful to have this method, please continue reading this comment. Otherwise please ignore the rest of my comment. > I'll grant that there may not be _much_ harm in keeping it, but I think "no harm > in keeping it" isn't right. Unnecessary log lines It is only being used when an expected condition occurs so I wouldn't call it unnecessary. > add to compile time in all cases, Sure, but so would the inline case. > reduce readability Better than inline case. >, and in this case add to binary size in debug, So would the inline case > and possibly adds irrelevant output if someone uses the wrong --verbose rules. I've removed all other calls to this method already so there shouldn't be any irrelevant output. > Those costs potentially apply to everyone on the project, whereas the benefits > probably only apply to you. > The alternative is that I remove this method and the "inline" printing of the identifier and fail with no information in the logs. > [This be didn't get into my bonnet with Chrome - I used to greatly despise > having a production emergency where my first order of business was to write > filters to let me try to figure out which bits of a 100MB daemond.ERROR file > might be relevant.] Off topic, feel free to ignore: "bonnet with X" -- I couldn't understand what this means and even Google search didn't reveal anything. https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... File components/safe_browsing_db/v4_database.cc (right): https://codereview.chromium.org/2062013002/diff/260001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:30: // thread. On 2016/06/24 23:01:19, Scott Hess wrote: > On 2016/06/23 06:23:50, vakh wrote: > > On 2016/06/21 22:23:25, Scott Hess wrote: > > > Did this comment float away from the code it refers to? > > > > This is a comment about what this method does, so I've kept it at the top of > the > > method definition. Would it be better to have it just before the method > > definition? > > I'd put it on the method itself, since inline like this is looks like it's > telling you what the next few lines do. > > The comment out in the header file pretty much covers the same ground, so > another option is to just drop it entirely. That way you don't have to keep the > code and two comments in sync. Done. https://codereview.chromium.org/2062013002/diff/380001/chrome/browser/safe_br... File chrome/browser/safe_browsing/services_delegate_impl.cc (right): https://codereview.chromium.org/2062013002/diff/380001/chrome/browser/safe_br... chrome/browser/safe_browsing/services_delegate_impl.cc:193: kSafeBrowsingV4LocalDatabaseManagerEnabled); On 2016/06/24 23:01:19, Scott Hess wrote: > Is this clang-format? If not, prefer to keep it the same as the previous > format. Yes, OCD, I like diffs to be minimal, until they become maximal. > > I wonder if you changed it to: > #ifndef NDEBUG > return true; > #endif > return ...; > > if that would allow removing the #ifdef above? That's a compile error because the compiler sees statements after an unconditional return in DEBUG mode. https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... File components/safe_browsing_db/v4_database.cc (right): https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:155: base::MakeUnique<StoreStateMap>(); On 2016/06/24 23:01:19, Scott Hess wrote: > Can this do the right thing as a StoreStateMap without the unique_ptr? I think > so. Just a modest preference, if you think it makes it less clear that it does > the right thing, unique_ptr is reasonable. Just trying to not create another map on return. It is possible that the compiler will be smart and not create a copy but a unique_ptr just makes sure of that. I don't feel too strongly either way but it is such a simple method that either implementation is clear. I'd be happy to change it if you prefer that. https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... File components/safe_browsing_db/v4_database_unittest.cc (right): https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... components/safe_browsing_db/v4_database_unittest.cc:82: V4Database::RegisterStoreFactoryForTest(factory_.get()); On 2016/06/24 23:01:19, Scott Hess wrote: > I think your point about not wanting to put the Register() into setup is > reasonable. But it occurs to me that you might want to either put > Register(NULL) in TearDown(), or leak this object, rather than leaving a > potentially-stale reference outstanding. Done. https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... File components/safe_browsing_db/v4_local_database_manager.cc (right): https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:30: // TODO(vakh): Once that bug is fixes, this should be removed. If we leave On 2016/06/24 23:01:19, Scott Hess wrote: > s/fixes/fixed/ Done. https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... File components/safe_browsing_db/v4_update_protocol_manager.cc (right): https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... components/safe_browsing_db/v4_update_protocol_manager.cc:196: const StoreStateMap* store_state_map) { On 2016/06/24 23:01:19, Scott Hess wrote: > WRT my other comment about "Does StoreStateMap need unique_ptr?", in this case > the code requires that store_state_map not be NULL, so it could be a reference. > > I'm not entirely clear if there's potential to get a NULL here, now that I think > about it. It can't be NULL but I added a check with NOTREACHED, instead of causing a crash later.
vakh@chromium.org changed reviewers: + rkaplow@chromium.org
rkaplow@ -- Could you please review the changes in histograms.xml? Thanks.
https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_local_database_manager.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:250: v4_database_->ApplyUpdate(responses); On 2016/06/27 19:38:26, vakh wrote: > On 2016/06/24 23:01:19, Scott Hess wrote: > > On 2016/06/21 23:19:35, vakh wrote: > > > On 2016/06/21 at 21:03:44, Scott Hess wrote: > > > > On 2016/06/20 22:28:43, vakh wrote: > > > > > On 2016/06/17 22:53:43, Scott Hess wrote: > > > > > > WRT my comment elsewhere with many store tasks versus a single db > task, > > > this > > > > > is > > > > > > the place I think would be cleaner if it was: > > > > > > v4_database_->ApplyUpdate(responses, > base::Bind(&V4LDM::DatabaseUpdated, > > > > > this)); > > > > > > Then the reader would know looking only at this function where they > need > > > to > > > > > look > > > > > > next. > > > > > > > > > > Still not sure about updating all stores at once, please see my comment > in > > > > > v4_database. > > > > > > > > > > Since the method to be called back is the same every time (without any > > > > > arguments), isn't it better to pass it at construction time? > > > > > > > > It's more efficient, probably. But a call like this: > > > > > > > > object->GoDoTheThing(parameter, parameter, base::Bind(&MyCallback, > this)); > > > > > > > > makes it reasonably obvious what's happening. As compared to: > > > > > > > > void SetUpt() { > > > > object->SetTheCallback(base::Bind(&MyCallback, this)); > > > > } > > > > // hundreds of lines of code. > > > > object->GoDoTheThing(parameter, parameter); > > > > > > > > In the latter case, it's not immediately clear that there even is a > > > flow-of-control thing, it reads like a synchronous operation. > > > > > > > > If you do want to initialize up front, then make sure you review your > other > > > callback use to see if they should all be regularized that way :-). > > > > > > The other callbacks in this class also work the same way i.e. setup callback > > at > > > init and then keep getting called back. > > > > AFAICT there's only that one callback which is stored as a member variable. I > > may be misreading something. > > Your reading is right. What I wrote is different than what I meant, and is > incorrect. > The other callback (NewDbReadyCallback) is only called back once not repeatedly. OK, I guess I'll leave this to you. I find the stored callback to be a little undesirable in this case, so it might be worthwhile to have a comment on this call saying "This is async, when complete DatabaseUpdated() will be called". Alternately, ApplyUpdate() could be renamed to imply that it's not synchronous, but I can't think of a sensible name. Hmm. Another option would be to have it be ApplyUpdate(responses, callback), and have the ApplyUpdate() implementation DCHECK() that the member callback storage is null (no outstanding ApplyUpdate in process), then store the callback for later, then clear the callback when sent. That puts the pattern I like in place, while still allowing multiple implementation options (loop in calling thread posting multiple tasks, loop in background, or loop in background pumping tasks). https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_protocol_manager_util.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_protocol_manager_util.cc:27: } On 2016/06/27 19:38:27, vakh wrote: > On 2016/06/24 23:01:19, Scott Hess wrote: > > [This be didn't get into my bonnet with Chrome - I used to greatly despise > > having a production emergency where my first order of business was to write > > filters to let me try to figure out which bits of a 100MB daemond.ERROR file > > might be relevant.] > > Off topic, feel free to ignore: "bonnet with X" -- I couldn't understand what > this means and even Google search didn't reveal anything. Sigh, "bee in my bonnet". Basically "Why does Scott hate log lines so much?" I've pummeled you enough with that thread for one review. https://codereview.chromium.org/2062013002/diff/380001/chrome/browser/safe_br... File chrome/browser/safe_browsing/services_delegate_impl.cc (right): https://codereview.chromium.org/2062013002/diff/380001/chrome/browser/safe_br... chrome/browser/safe_browsing/services_delegate_impl.cc:193: kSafeBrowsingV4LocalDatabaseManagerEnabled); On 2016/06/27 19:38:27, vakh wrote: > On 2016/06/24 23:01:19, Scott Hess wrote: > > Is this clang-format? If not, prefer to keep it the same as the previous > > format. Yes, OCD, I like diffs to be minimal, until they become maximal. > > > > I wonder if you changed it to: > > #ifndef NDEBUG > > return true; > > #endif > > return ...; > > > > if that would allow removing the #ifdef above? > > That's a compile error because the compiler sees statements after an > unconditional return in DEBUG mode. Really? I hate compilers. I know for sure that at various points I've put "return true;" at the top of a function to mock out that functionality to let me get through that code. At this point, I kind of dislike all of the options. So I guess whatever works for you is probably where it will end. I wonder how this feels: if (IsEnabled(kKey)) { return true; } #ifdef NDEBUG return false; #else return true; #endif The compiler can't complain about IsEnabled() being unnecessary because it could have a side effect, so I think that would at least get rid of the need to bracket the constant's declaration (which is gross). https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... File components/safe_browsing_db/v4_database.cc (right): https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:155: base::MakeUnique<StoreStateMap>(); On 2016/06/27 19:38:27, vakh wrote: > On 2016/06/24 23:01:19, Scott Hess wrote: > > Can this do the right thing as a StoreStateMap without the unique_ptr? I > think > > so. Just a modest preference, if you think it makes it less clear that it > does > > the right thing, unique_ptr is reasonable. > > Just trying to not create another map on return. It is possible that the > compiler will be smart and not create a copy but a unique_ptr just makes sure of > that. I don't feel too strongly either way but it is such a simple method that > either implementation is clear. > I'd be happy to change it if you prefer that. Acknowledged. I guess I want to trust the language to do RVO, and it will probably do RVO, and the structure is probably small enough not to matter anyhow, but I think I agree with you that with a unique_ptr<> it's pretty clear that it's not making a copy, and as a reader I don't have to think about whether a copy is being made. https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... File components/safe_browsing_db/v4_update_protocol_manager.cc (right): https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... components/safe_browsing_db/v4_update_protocol_manager.cc:196: const StoreStateMap* store_state_map) { On 2016/06/27 19:38:27, vakh wrote: > On 2016/06/24 23:01:19, Scott Hess wrote: > > WRT my other comment about "Does StoreStateMap need unique_ptr?", in this case > > the code requires that store_state_map not be NULL, so it could be a > reference. > > > > I'm not entirely clear if there's potential to get a NULL here, now that I > think > > about it. > > It can't be NULL but I added a check with NOTREACHED, instead of causing a crash > later. To be clear, where I was going with all of that is that if it can't be NULL, then if it were a reference this code wouldn't have to ask the question, the calling code would have to. If it had a well-defined result for the NULL case which can be used in the regular flow, it makes sense to do the check here. If the NULL case should be detected and lead to not sending the request, it should be detected in the caller, instead.
https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... File components/safe_browsing_db/v4_update_protocol_manager.cc (right): https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... components/safe_browsing_db/v4_update_protocol_manager.cc:196: const StoreStateMap* store_state_map) { On 2016/06/27 22:19:17, Scott Hess wrote: > On 2016/06/27 19:38:27, vakh wrote: > > On 2016/06/24 23:01:19, Scott Hess wrote: > > > WRT my other comment about "Does StoreStateMap need unique_ptr?", in this > case > > > the code requires that store_state_map not be NULL, so it could be a > > reference. > > > > > > I'm not entirely clear if there's potential to get a NULL here, now that I > > think > > > about it. > > > > It can't be NULL but I added a check with NOTREACHED, instead of causing a > crash > > later. > > To be clear, where I was going with all of that is that if it can't be NULL, > then if it were a reference this code wouldn't have to ask the question, the > calling code would have to. If it had a well-defined result for the NULL case > which can be used in the regular flow, it makes sense to do the check here. If > the NULL case should be detected and lead to not sending the request, it should > be detected in the caller, instead. That was ambiguous - I mean if this function's signature were |const StoreStateMap&| then this code could assume it always has a map. The map could be empty, of course :-).
Minor: shess@ latest CR
https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_local_database_manager.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:250: v4_database_->ApplyUpdate(responses); On 2016/06/27 22:19:17, Scott Hess wrote: > On 2016/06/27 19:38:26, vakh wrote: > > On 2016/06/24 23:01:19, Scott Hess wrote: > > > On 2016/06/21 23:19:35, vakh wrote: > > > > On 2016/06/21 at 21:03:44, Scott Hess wrote: > > > > > On 2016/06/20 22:28:43, vakh wrote: > > > > > > On 2016/06/17 22:53:43, Scott Hess wrote: > > > > > > > WRT my comment elsewhere with many store tasks versus a single db > > task, > > > > this > > > > > > is > > > > > > > the place I think would be cleaner if it was: > > > > > > > v4_database_->ApplyUpdate(responses, > > base::Bind(&V4LDM::DatabaseUpdated, > > > > > > this)); > > > > > > > Then the reader would know looking only at this function where they > > need > > > > to > > > > > > look > > > > > > > next. > > > > > > > > > > > > Still not sure about updating all stores at once, please see my > comment > > in > > > > > > v4_database. > > > > > > > > > > > > Since the method to be called back is the same every time (without any > > > > > > arguments), isn't it better to pass it at construction time? > > > > > > > > > > It's more efficient, probably. But a call like this: > > > > > > > > > > object->GoDoTheThing(parameter, parameter, base::Bind(&MyCallback, > > this)); > > > > > > > > > > makes it reasonably obvious what's happening. As compared to: > > > > > > > > > > void SetUpt() { > > > > > object->SetTheCallback(base::Bind(&MyCallback, this)); > > > > > } > > > > > // hundreds of lines of code. > > > > > object->GoDoTheThing(parameter, parameter); > > > > > > > > > > In the latter case, it's not immediately clear that there even is a > > > > flow-of-control thing, it reads like a synchronous operation. > > > > > > > > > > If you do want to initialize up front, then make sure you review your > > other > > > > callback use to see if they should all be regularized that way :-). > > > > > > > > The other callbacks in this class also work the same way i.e. setup > callback > > > at > > > > init and then keep getting called back. > > > > > > AFAICT there's only that one callback which is stored as a member variable. > I > > > may be misreading something. > > > > Your reading is right. What I wrote is different than what I meant, and is > > incorrect. > > The other callback (NewDbReadyCallback) is only called back once not > repeatedly. > > OK, I guess I'll leave this to you. I find the stored callback to be a little > undesirable in this case, so it might be worthwhile to have a comment on this > call saying "This is async, when complete DatabaseUpdated() will be called". > Alternately, ApplyUpdate() could be renamed to imply that it's not synchronous, > but I can't think of a sensible name. > > Hmm. Another option would be to have it be ApplyUpdate(responses, callback), > and have the ApplyUpdate() implementation DCHECK() that the member callback > storage is null (no outstanding ApplyUpdate in process), then store the callback > for later, then clear the callback when sent. That puts the pattern I like in > place, while still allowing multiple implementation options (loop in calling > thread posting multiple tasks, loop in background, or loop in background pumping > tasks). Done. https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... File components/safe_browsing_db/v4_protocol_manager_util.cc (right): https://codereview.chromium.org/2062013002/diff/140001/components/safe_browsi... components/safe_browsing_db/v4_protocol_manager_util.cc:27: } On 2016/06/27 22:19:17, Scott Hess wrote: > On 2016/06/27 19:38:27, vakh wrote: > > On 2016/06/24 23:01:19, Scott Hess wrote: > > > [This be didn't get into my bonnet with Chrome - I used to greatly despise > > > having a production emergency where my first order of business was to write > > > filters to let me try to figure out which bits of a 100MB daemond.ERROR file > > > might be relevant.] > > > > Off topic, feel free to ignore: "bonnet with X" -- I couldn't understand what > > this means and even Google search didn't reveal anything. > > Sigh, "bee in my bonnet". Basically "Why does Scott hate log lines so much?" > I've pummeled you enough with that thread for one review. Acknowledged. https://codereview.chromium.org/2062013002/diff/380001/chrome/browser/safe_br... File chrome/browser/safe_browsing/services_delegate_impl.cc (right): https://codereview.chromium.org/2062013002/diff/380001/chrome/browser/safe_br... chrome/browser/safe_browsing/services_delegate_impl.cc:193: kSafeBrowsingV4LocalDatabaseManagerEnabled); On 2016/06/27 22:19:17, Scott Hess wrote: > On 2016/06/27 19:38:27, vakh wrote: > > On 2016/06/24 23:01:19, Scott Hess wrote: > > > Is this clang-format? If not, prefer to keep it the same as the previous > > > format. Yes, OCD, I like diffs to be minimal, until they become maximal. > > > > > > I wonder if you changed it to: > > > #ifndef NDEBUG > > > return true; > > > #endif > > > return ...; > > > > > > if that would allow removing the #ifdef above? > > > > That's a compile error because the compiler sees statements after an > > unconditional return in DEBUG mode. > > Really? I hate compilers. I know for sure that at various points I've put > "return true;" at the top of a function to mock out that functionality to let me > get through that code. > > At this point, I kind of dislike all of the options. So I guess whatever works > for you is probably where it will end. > > I wonder how this feels: > if (IsEnabled(kKey)) { > return true; > } > #ifdef NDEBUG > return false; > #else > return true; > #endif > > The compiler can't complain about IsEnabled() being unnecessary because it could > have a side effect, so I think that would at least get rid of the need to > bracket the constant's declaration (which is gross). I feel like this is 6 of one and half-a-dozen of the other. Let me leave it as-is. We'll be removing it sooner rather than later anyway. https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... File components/safe_browsing_db/v4_update_protocol_manager.cc (right): https://codereview.chromium.org/2062013002/diff/380001/components/safe_browsi... components/safe_browsing_db/v4_update_protocol_manager.cc:196: const StoreStateMap* store_state_map) { On 2016/06/27 22:37:58, Scott Hess wrote: > On 2016/06/27 22:19:17, Scott Hess wrote: > > On 2016/06/27 19:38:27, vakh wrote: > > > On 2016/06/24 23:01:19, Scott Hess wrote: > > > > WRT my other comment about "Does StoreStateMap need unique_ptr?", in this > > case > > > > the code requires that store_state_map not be NULL, so it could be a > > > reference. > > > > > > > > I'm not entirely clear if there's potential to get a NULL here, now that I > > > think > > > > about it. > > > > > > It can't be NULL but I added a check with NOTREACHED, instead of causing a > > crash > > > later. > > > > To be clear, where I was going with all of that is that if it can't be NULL, > > then if it were a reference this code wouldn't have to ask the question, the > > calling code would have to. If it had a well-defined result for the NULL case > > which can be used in the regular flow, it makes sense to do the check here. > If > > the NULL case should be detected and lead to not sending the request, it > should > > be detected in the caller, instead. > > That was ambiguous - I mean if this function's signature were |const > StoreStateMap&| then this code could assume it always has a map. The map could > be empty, of course :-). Done. Also added a DCHECK to ensure the store_state_map isn't empty at least in DEBUG builds because that's unexpected.
git fetch && git pull && gclient sync
lgtm histogram lg
lgtm. Minor nit on the DCHECK(x >= y), you can ignore me or fix it and retain my lgtm. [I honestly don't recall if DCHECK_op() had any logging advantages like EXPECT_EQ().] https://codereview.chromium.org/2062013002/diff/460001/components/safe_browsi... File components/safe_browsing_db/v4_database.cc (right): https://codereview.chromium.org/2062013002/diff/460001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:96: db_updated_callback_ = db_updated_callback; OK, +1! https://codereview.chromium.org/2062013002/diff/460001/components/safe_browsi... File components/safe_browsing_db/v4_local_database_manager.cc (right): https://codereview.chromium.org/2062013002/diff/460001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:256: v4_database_->ApplyUpdate(responses, db_updated_callback_); Is there any possibility of receiving this call after StopOnIOThread() has been called? I think not, if I read things correctly that's handled by the update-protocol-manager, and the stop will delete that, so it will delete the callback pointing to this, so everyone wins. Just want to make sure I read it correctly. https://codereview.chromium.org/2062013002/diff/460001/components/safe_browsi... File components/safe_browsing_db/v4_update_protocol_manager.cc (right): https://codereview.chromium.org/2062013002/diff/460001/components/safe_browsi... components/safe_browsing_db/v4_update_protocol_manager.cc:197: DCHECK(store_state_map.size() >= 1); DCHECK_GE() or DCHECK_LE() depending on preference. [Or DCHECK(!store_state_map.empty()), I suppose.]
https://codereview.chromium.org/2062013002/diff/460001/components/safe_browsi... File components/safe_browsing_db/v4_database.cc (right): https://codereview.chromium.org/2062013002/diff/460001/components/safe_browsi... components/safe_browsing_db/v4_database.cc:96: db_updated_callback_ = db_updated_callback; On 2016/06/28 18:18:06, Scott Hess (OOO Jul 1-Aug 7) wrote: > OK, +1! Acknowledged. https://codereview.chromium.org/2062013002/diff/460001/components/safe_browsi... File components/safe_browsing_db/v4_local_database_manager.cc (right): https://codereview.chromium.org/2062013002/diff/460001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:256: v4_database_->ApplyUpdate(responses, db_updated_callback_); On 2016/06/28 18:18:06, Scott Hess (OOO Jul 1-Aug 7) wrote: > Is there any possibility of receiving this call after StopOnIOThread() has been > called? I think not, if I read things correctly that's handled by the > update-protocol-manager, and the stop will delete that, so it will delete the > callback pointing to this, so everyone wins. Just want to make sure I read it > correctly. That's my understanding too with the use of unique_ptr and reset(). https://codereview.chromium.org/2062013002/diff/460001/components/safe_browsi... File components/safe_browsing_db/v4_update_protocol_manager.cc (right): https://codereview.chromium.org/2062013002/diff/460001/components/safe_browsi... components/safe_browsing_db/v4_update_protocol_manager.cc:197: DCHECK(store_state_map.size() >= 1); On 2016/06/28 18:18:06, Scott Hess (OOO Jul 1-Aug 7) wrote: > DCHECK_GE() or DCHECK_LE() depending on preference. > > [Or DCHECK(!store_state_map.empty()), I suppose.] Sorry, should have done that in the first place. Done now.
Minor: Fix DCHECK
The CQ bit was checked by vakh@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from nparker@chromium.org, rkaplow@chromium.org, shess@chromium.org Link to the patchset: https://codereview.chromium.org/2062013002/#ps480001 (title: "Minor: Fix DCHECK")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: linux_android_rel_ng on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi...)
The CQ bit was checked by vakh@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Description was changed from ========== Fetch incremental updates. Store new state in V4Store. This CL does the following: 1. Makes V4Database the source of truth about the number and status of the lists being updated. 2. Fetch updates for 2 different lists: Malware and Social Engineering. Previously we were just downloading Malware. 3. Update requests are now incremental. Each V4Store updates its "state" when processing an update response so the next update request is incremental on top of that state. 4. Tests to verify that the V4Database and V4Store are being updated correctly. 5. Force enable V4 updates for debug builds (bringing it back). Overall design document: https://goto.google.com/design-doc-v4store After this CL: 1. V4LocalDatabaseManager tells V4Database about which lists to fetch (and their filename) at init time. 2. From here on, V4Database becomes the single source of truth for all information about the lists. 3. V4Database creates a V4Store for each of the lists. 4. When the V4Database is ready (this happens on the task runner), the update manager schedules the next update. 5. When the update is received, V4Database determines which stores to update and updates them on the task runner. 6. Updating a V4Store is done by creating a new store that's equivalent to the old store + the update. 7. When the new store is ready, the V4Database swaps it in for the old store. 8. When all the stores that needed to be updated have been updated, the update manager schedules the next update. BUG=543161 ========== to ========== Fetch incremental updates. Store new state in V4Store. This CL does the following: 1. Makes V4Database the source of truth about the number and status of the lists being updated. 2. Fetch updates for 2 different lists: Malware and Social Engineering. Previously we were just downloading Malware. 3. Update requests are now incremental. Each V4Store updates its "state" when processing an update response so the next update request is incremental on top of that state. 4. Tests to verify that the V4Database and V4Store are being updated correctly. 5. Force enable V4 updates for debug builds (bringing it back). Overall design document: https://goto.google.com/design-doc-v4store After this CL: 1. V4LocalDatabaseManager tells V4Database about which lists to fetch (and their filename) at init time. 2. From here on, V4Database becomes the single source of truth for all information about the lists. 3. V4Database creates a V4Store for each of the lists. 4. When the V4Database is ready (this happens on the task runner), the update manager schedules the next update. 5. When the update is received, V4Database determines which stores to update and updates them on the task runner. 6. Updating a V4Store is done by creating a new store that's equivalent to the old store + the update. 7. When the new store is ready, the V4Database swaps it in for the old store. 8. When all the stores that needed to be updated have been updated, the update manager schedules the next update. BUG=543161 ==========
Message was sent while issue was closed.
Committed patchset #25 (id:480001)
Message was sent while issue was closed.
Description was changed from ========== Fetch incremental updates. Store new state in V4Store. This CL does the following: 1. Makes V4Database the source of truth about the number and status of the lists being updated. 2. Fetch updates for 2 different lists: Malware and Social Engineering. Previously we were just downloading Malware. 3. Update requests are now incremental. Each V4Store updates its "state" when processing an update response so the next update request is incremental on top of that state. 4. Tests to verify that the V4Database and V4Store are being updated correctly. 5. Force enable V4 updates for debug builds (bringing it back). Overall design document: https://goto.google.com/design-doc-v4store After this CL: 1. V4LocalDatabaseManager tells V4Database about which lists to fetch (and their filename) at init time. 2. From here on, V4Database becomes the single source of truth for all information about the lists. 3. V4Database creates a V4Store for each of the lists. 4. When the V4Database is ready (this happens on the task runner), the update manager schedules the next update. 5. When the update is received, V4Database determines which stores to update and updates them on the task runner. 6. Updating a V4Store is done by creating a new store that's equivalent to the old store + the update. 7. When the new store is ready, the V4Database swaps it in for the old store. 8. When all the stores that needed to be updated have been updated, the update manager schedules the next update. BUG=543161 ========== to ========== Fetch incremental updates. Store new state in V4Store. This CL does the following: 1. Makes V4Database the source of truth about the number and status of the lists being updated. 2. Fetch updates for 2 different lists: Malware and Social Engineering. Previously we were just downloading Malware. 3. Update requests are now incremental. Each V4Store updates its "state" when processing an update response so the next update request is incremental on top of that state. 4. Tests to verify that the V4Database and V4Store are being updated correctly. 5. Force enable V4 updates for debug builds (bringing it back). Overall design document: https://goto.google.com/design-doc-v4store After this CL: 1. V4LocalDatabaseManager tells V4Database about which lists to fetch (and their filename) at init time. 2. From here on, V4Database becomes the single source of truth for all information about the lists. 3. V4Database creates a V4Store for each of the lists. 4. When the V4Database is ready (this happens on the task runner), the update manager schedules the next update. 5. When the update is received, V4Database determines which stores to update and updates them on the task runner. 6. Updating a V4Store is done by creating a new store that's equivalent to the old store + the update. 7. When the new store is ready, the V4Database swaps it in for the old store. 8. When all the stores that needed to be updated have been updated, the update manager schedules the next update. BUG=543161 Committed: https://crrev.com/370e9320906395ba0c795af2d3b4cedf864f936d Cr-Commit-Position: refs/heads/master@{#402565} ==========
Message was sent while issue was closed.
Patchset 25 (id:??) landed as https://crrev.com/370e9320906395ba0c795af2d3b4cedf864f936d Cr-Commit-Position: refs/heads/master@{#402565}
Message was sent while issue was closed.
On 2016/06/28 22:14:48, commit-bot: I haz the power wrote: > Patchset 25 (id:??) landed as > https://crrev.com/370e9320906395ba0c795af2d3b4cedf864f936d > Cr-Commit-Position: refs/heads/master@{#402565} This appears to be causing a failed sizes step: https://build.chromium.org/p/chromium/builders/Linux%20x64/builds/21568/steps... As Sheriff, I will be reverting this. Since it is too large for the revert patchset button, I am using drover instead this time.
Message was sent while issue was closed.
https://codereview.chromium.org/2062013002/diff/480001/components/safe_browsi... File components/safe_browsing_db/v4_local_database_manager.cc (right): https://codereview.chromium.org/2062013002/diff/480001/components/safe_browsi... components/safe_browsing_db/v4_local_database_manager.cc:42: "UrlSoceng.store"}}; There's your sizes problem. Sigh. Probably just have a function to create and individually insert items.
Message was sent while issue was closed.
On 2016/06/28 at 23:46:12, shess wrote: > https://codereview.chromium.org/2062013002/diff/480001/components/safe_browsi... > File components/safe_browsing_db/v4_local_database_manager.cc (right): > > https://codereview.chromium.org/2062013002/diff/480001/components/safe_browsi... > components/safe_browsing_db/v4_local_database_manager.cc:42: "UrlSoceng.store"}}; > There's your sizes problem. Sigh. > > Probably just have a function to create and individually insert items. petewill@ : Can I push a quick fix instead? I have it ready and about to send it for review (in 15 mins). PLMK.
Message was sent while issue was closed.
On 2016/06/29 at 01:12:37, vakh wrote: > On 2016/06/28 at 23:46:12, shess wrote: > > https://codereview.chromium.org/2062013002/diff/480001/components/safe_browsi... > > File components/safe_browsing_db/v4_local_database_manager.cc (right): > > > > https://codereview.chromium.org/2062013002/diff/480001/components/safe_browsi... > > components/safe_browsing_db/v4_local_database_manager.cc:42: "UrlSoceng.store"}}; > > There's your sizes problem. Sigh. > > > > Probably just have a function to create and individually insert items. > > petewill@ : Can I push a quick fix instead? I have it ready and about to send it for review (in 15 mins). PLMK. https://codereview.chromium.org/2101153004
Message was sent while issue was closed.
brucedawson@chromium.org changed reviewers: + brucedawson@chromium.org
Message was sent while issue was closed.
https://codereview.chromium.org/2062013002/diff/480001/components/safe_browsi... File components/safe_browsing_db/v4_store.cc (right): https://codereview.chromium.org/2062013002/diff/480001/components/safe_browsi... components/safe_browsing_db/v4_store.cc:33: return base::StringPrintf("path: %s; state: %s", store_path_.value().c_str(), Unfortunately... store_path_.value() is a wstring in Windows, so you can't print it with %s. crbug.com/626411 filed. |