|
|
Created:
5 years, 2 months ago by nhiroki Modified:
5 years, 2 months ago CC:
chromium-reviews, michaeln, jsbell+serviceworker_chromium.org, tzik, serviceworker-reviews, jam, darin-cc_chromium.org, horo+watch_chromium.org, kinuko+serviceworker, kinuko+watch, blink-worker-reviews_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionServiceWorker: Schedule DeleteAndStartOver when failing to store resource ids
Storing uncommitted or purgeable resource ids maybe bumps the next available
resource id stored in ServiceWorkerDatabase. If these database operations fail,
subsequent resources could use conflicted ids.
To avoid the confliction, this CL makes ServiceWorkerStorage check the result of
storing resource ids and schedule DeleteAndStartOver if there is an error.
BUG=485900, 505322
Committed: https://crrev.com/979ef438eb2908e88c1963de44a0942cc978d79e
Cr-Commit-Position: refs/heads/master@{#354758}
Patch Set 1 : #
Total comments: 3
Patch Set 2 : simplify the patch #
Total comments: 7
Messages
Total messages: 22 (7 generated)
Patchset #1 (id:1) has been deleted
Patchset #1 (id:20001) has been deleted
nhiroki@chromium.org changed reviewers: + falken@chromium.org, horo@chromium.org
PTAL, thanks! https://codereview.chromium.org/1393783002/diff/40001/content/browser/service... File content/browser/service_worker/service_worker_script_cache_map.cc (left): https://codereview.chromium.org/1393783002/diff/40001/content/browser/service... content/browser/service_worker/service_worker_script_cache_map.cc:66: context_->storage()->DoomUncommittedResponse(LookupResourceId(url)); Renamed this to ClearUncommittedResourceId "DoomUncommitted" sounds odd to me because "uncommited" means that the resource hasn't been stored in the diskcache yet. https://codereview.chromium.org/1393783002/diff/40001/content/browser/service... File content/browser/service_worker/service_worker_storage.cc (left): https://codereview.chromium.org/1393783002/diff/40001/content/browser/service... content/browser/service_worker/service_worker_storage.cc:562: StartPurgingResources(std::vector<int64>(1, id)); |id| should not be stored in the diskcache yet, so we don't have to attempt to purge it.
My understanding was uncommitted resources are in the disk cache, but they don't yet belong to an installed registration. Once the registration is successfully stored, they move from the uncommitted list to the committed list. Is this not accurate? ServiceWorkerDatabase has comments like: // As new resources are put into the diskcache, they go into an uncommitted // list. When a registration is saved that refers to those ids, they're // removed from that list. When a resource no longer has any registrations or // caches referring to it, it's added to the purgeable list. Periodically, // the purgeable list can be purged from the diskcache. At system startup, all // uncommitted ids are moved to the purgeable list.
Thank you for reviewing! https://codereview.chromium.org/1393783002/diff/40001/content/browser/service... File content/browser/service_worker/service_worker_script_cache_map.cc (left): https://codereview.chromium.org/1393783002/diff/40001/content/browser/service... content/browser/service_worker/service_worker_script_cache_map.cc:66: context_->storage()->DoomUncommittedResponse(LookupResourceId(url)); On 2015/10/07 06:10:27, nhiroki wrote: > Renamed this to ClearUncommittedResourceId > > "DoomUncommitted" sounds odd to me because "uncommited" means that the resource > hasn't been stored in the diskcache yet. Sorry for confusing you. This comment was completely wrong. I really meant to say that this is a failure case and the resource for |url| wasn't written in the diskcache, so I thought we don't have to doom it here. But, now I'm thinking that the resource could be partially written in the disk cache even if the failure case, so it might be better always to doom the resource here... This is not what I really want to fix in this CL, so I'll revert this change...
On 2015/10/07 07:37:02, falken wrote: > My understanding was uncommitted resources are in the disk cache, but they don't > yet belong to an installed registration. To be more accurate, ServiceWorkerScriptCacheMap adds a resource to the uncommitted list when it starts caching the resource, so some of resources in the list could be completely written in the disk cache, some of other resources could be partially written and the others couldn't be written yet. > Once the registration is successfully stored, they move from the uncommitted list to the committed list. Correct. ServiceWorkerDatabase::WriteRegistration moves them from the uncommitted list to the committed list. > Is this not accurate? ServiceWorkerDatabase has comments like: > > // As new resources are put into the diskcache, they go into an uncommitted > // list. When a registration is saved that refers to those ids, they're > // removed from that list. When a resource no longer has any registrations or > // caches referring to it, it's added to the purgeable list. Periodically, > // the purgeable list can be purged from the diskcache. At system startup, all > // uncommitted ids are moved to the purgeable list.
Patchset #2 (id:60001) has been deleted
Patchset #2 (id:80001) has been deleted
Patchset #2 (id:100001) has been deleted
Updated. Can you take another look? Thanks!
Code change looks good. Do you think it's useful and feasible to make a test for this change? I guess we'd need a MockDatabase class that you can override and make WriteUncommittedResourceIds fail. https://codereview.chromium.org/1393783002/diff/120001/content/browser/servic... File content/browser/service_worker/service_worker_script_cache_map.cc (right): https://codereview.chromium.org/1393783002/diff/120001/content/browser/servic... content/browser/service_worker/service_worker_script_cache_map.cc:51: context_->storage()->StoreUncommittedResourceId(resource_id); Yeah, this naming has been a bit inconsistent since the beginning. See also in service_worker_types.h: static const int64 kInvalidServiceWorkerResourceId = -1; static const int64 kInvalidServiceWorkerResponseId = -1; https://codereview.chromium.org/1393783002/diff/120001/content/browser/servic... File content/browser/service_worker/service_worker_storage.cc (right): https://codereview.chromium.org/1393783002/diff/120001/content/browser/servic... content/browser/service_worker/service_worker_storage.cc:540: DCHECK_NE(kInvalidServiceWorkerResponseId, resource_id); kInvalidServiceWorkerResourceId? probably we want a bigger code cleanup that just deletes one https://codereview.chromium.org/1393783002/diff/120001/content/browser/servic... content/browser/service_worker/service_worker_storage.cc:1156: if (status != ServiceWorkerDatabase::STATUS_OK) Do we have/want UMA for delete and start over frequency and the failure that triggered it?
On 2015/10/15 05:03:40, falken wrote: > Code change looks good. Do you think it's useful and feasible to make a test for > this change? I guess we'd need a MockDatabase class that you can override and > make WriteUncommittedResourceIds fail. Adding mock tests sounds good. We might want to have such mock tests not only for this change but also for the entire storage class because test coverage of of failure paths is not good overall. I'll file a separate issue for that.
Thank you for reviewing! https://codereview.chromium.org/1393783002/diff/120001/content/browser/servic... File content/browser/service_worker/service_worker_storage.cc (right): https://codereview.chromium.org/1393783002/diff/120001/content/browser/servic... content/browser/service_worker/service_worker_storage.cc:540: DCHECK_NE(kInvalidServiceWorkerResponseId, resource_id); On 2015/10/15 05:03:40, falken wrote: > kInvalidServiceWorkerResourceId? > > probably we want a bigger code cleanup that just deletes one Yeah, I'll make a follow-up patch to deprecate kInvalidSWResponseId. https://codereview.chromium.org/1393783002/diff/120001/content/browser/servic... content/browser/service_worker/service_worker_storage.cc:1156: if (status != ServiceWorkerDatabase::STATUS_OK) On 2015/10/15 05:03:40, falken wrote: > Do we have/want UMA for delete and start over frequency and the failure that > triggered it? What does "frequency" means? The ratio of DeleteAndStartOver to all operations? If so, we might be able to calculate the ratio using results of lower level database/diskcache operations possibly to trigger DeleteAndStartOver (ie. ServiceWorker.Database.* and ServiceWorker.Storage.* UMAs) like this: frequency = (all failures on database and DiskCache) / (all results on database and diskcache)
On 2015/10/16 09:08:37, nhiroki wrote: > On 2015/10/15 05:03:40, falken wrote: > > Code change looks good. Do you think it's useful and feasible to make a test > for > > this change? I guess we'd need a MockDatabase class that you can override and > > make WriteUncommittedResourceIds fail. > > Adding mock tests sounds good. We might want to have such mock tests not only > for this change but also for the entire storage class because test coverage of > of failure paths is not good overall. I'll file a separate issue for that. Filed: http://crbug.com/544022
lgtm https://codereview.chromium.org/1393783002/diff/120001/content/browser/servic... File content/browser/service_worker/service_worker_storage.cc (right): https://codereview.chromium.org/1393783002/diff/120001/content/browser/servic... content/browser/service_worker/service_worker_storage.cc:1156: if (status != ServiceWorkerDatabase::STATUS_OK) On 2015/10/16 09:15:40, nhiroki wrote: > On 2015/10/15 05:03:40, falken wrote: > > Do we have/want UMA for delete and start over frequency and the failure that > > triggered it? > > What does "frequency" means? The ratio of DeleteAndStartOver to all operations? > If so, we might be able to calculate the ratio using results of lower level > database/diskcache operations possibly to trigger DeleteAndStartOver (ie. > ServiceWorker.Database.* and ServiceWorker.Storage.* UMAs) like this: > > frequency = (all failures on database and DiskCache) / (all results on > database and diskcache) Yea... I didn't have something specific in mind, just wanted to know if we have UMA that gives an idea of how often this failure happens. I guess we have ServiceWorker.Storage.DeleteAndStartOverResult so can monitor the raw count of that, or maybe the ratio of user count of DeleteAndStartOverResult to user count of OpenResult? Would it help to have something like DeleteAndStartOverReason that's logged whenever it's called, to help debugging if someday we discover a rise in DeleteAndStartOver calls? Anyway, maybe this discussion is out of the scope of this particular CL.
lgtm
Thank you! https://codereview.chromium.org/1393783002/diff/120001/content/browser/servic... File content/browser/service_worker/service_worker_storage.cc (right): https://codereview.chromium.org/1393783002/diff/120001/content/browser/servic... content/browser/service_worker/service_worker_storage.cc:1156: if (status != ServiceWorkerDatabase::STATUS_OK) On 2015/10/19 04:09:35, falken wrote: > On 2015/10/16 09:15:40, nhiroki wrote: > > On 2015/10/15 05:03:40, falken wrote: > > > Do we have/want UMA for delete and start over frequency and the failure that > > > triggered it? > > > > What does "frequency" means? The ratio of DeleteAndStartOver to all > operations? > > If so, we might be able to calculate the ratio using results of lower level > > database/diskcache operations possibly to trigger DeleteAndStartOver (ie. > > ServiceWorker.Database.* and ServiceWorker.Storage.* UMAs) like this: > > > > frequency = (all failures on database and DiskCache) / (all results on > > database and diskcache) > > Yea... I didn't have something specific in mind, just wanted to know if we have > UMA that gives an idea of how often this failure happens. > > I guess we have ServiceWorker.Storage.DeleteAndStartOverResult so can monitor > the raw count of that, or maybe the ratio of user count of > DeleteAndStartOverResult to user count of OpenResult? We'd like to monitor two things: (1) how many users encounter DASO, and (2) how many times DASO happen on the same user. The ratio of UC of DASO to UC of OpenResult would be helpful for (1). The ratio of the raw count of DASO to UC of DASO would be helpful for (2). > Would it help to have > something like DeleteAndStartOverReason that's logged whenever it's called, to > help debugging if someday we discover a rise in DeleteAndStartOver calls? Recording failure reasons could be helpful to know which storage operation (database access pattern) tends to fail and contains coding error... I'll consider adding them. > Anyway, maybe this discussion is out of the scope of this particular CL.
The CQ bit was checked by nhiroki@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1393783002/120001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1393783002/120001
Message was sent while issue was closed.
Committed patchset #2 (id:120001)
Message was sent while issue was closed.
Patchset 2 (id:??) landed as https://crrev.com/979ef438eb2908e88c1963de44a0942cc978d79e Cr-Commit-Position: refs/heads/master@{#354758} |