|
|
Chromium Code Reviews|
Created:
11 years, 3 months ago by jennb Modified:
9 years, 6 months ago Reviewers:
michaeln CC:
chromium-reviews_googlegroups.com, darin (slow to review) Base URL:
svn://chrome-svn/chrome/trunk/src/ Visibility:
Public. |
DescriptionImplementation of application cache update algorithm.
Does not include storage nor processing of pending master entries.
TEST=verify update functionality
BUG=none
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=28123
Patch Set 1 #
Total comments: 2
Patch Set 2 : '' #
Total comments: 24
Patch Set 3 : '' #
Total comments: 85
Patch Set 4 : '' #Patch Set 5 : '' #
Total comments: 71
Patch Set 6 : '' #
Total comments: 13
Patch Set 7 : '' #Patch Set 8 : '' #
Total comments: 3
Patch Set 9 : '' #
Total comments: 24
Patch Set 10 : '' #Patch Set 11 : '' #
Total comments: 11
Patch Set 12 : '' #Messages
Total messages: 36 (0 generated)
Hi Michael, I'm still working on this, but here's an early snapshot in case you wanted to take a peek at the updater progress. Jenn
Thnx for the preview. http://codereview.chromium.org/201070/diff/1/4 File webkit/appcache/appcache_updater.cc (right): http://codereview.chromium.org/201070/diff/1/4#newcode95 Line 95: // TODO(jennb): need a request context? ah... i'll look into providing access to the request context to our appcache lib http://codereview.chromium.org/201070/diff/1/4#newcode141 Line 141: // TODO(jennb): stream data to storage? Streaming to response storage is going to be an async operation, so this method probably wants a completion callback which will continue reading.
Another snapshot of my progress. The remaining TODOs will probably stay TODOs in this CL. I'll start working on unittests for this tomorrow. Look for question in my TODO comments. Thanks, Jenn On 2009/09/09 21:19:20, michaeln wrote: > Thnx for the preview. > > http://codereview.chromium.org/201070/diff/1/4 > File webkit/appcache/appcache_updater.cc (right): > > http://codereview.chromium.org/201070/diff/1/4#newcode95 > Line 95: // TODO(jennb): need a request context? > ah... i'll look into providing access to the request context to our appcache lib > > http://codereview.chromium.org/201070/diff/1/4#newcode141 > Line 141: // TODO(jennb): stream data to storage? > Streaming to response storage is going to be an async operation, so this method > probably wants a completion callback which will continue reading.
This is looking really good overall, there's quite a lot of there there to look over in detail. I haven't done that yet, but i wanted to send early comments on one topic in particular, see the comment on Line 445. I'll look more closely at this tomorrow and get you a more complete review. http://codereview.chromium.org/201070/diff/5001/3006 File webkit/appcache/appcache_updater.cc (right): http://codereview.chromium.org/201070/diff/5001/3006#newcode46 Line 46: NotifyHostMap::value_type(frontend, HostIds(host->host_id()))); nit: maybe insert and empty vector, then unconditionally push_back below http://codereview.chromium.org/201070/diff/5001/3006#newcode50 Line 50: ret.first->second.push_back(host->host_id()); any chance the host is already in the collection... just checking? http://codereview.chromium.org/201070/diff/5001/3006#newcode64 Line 64: HostIds host_ids = it->second; you can avoid making a copy of the vector here, maybe HostIds& hosts_ids = iter->second; or frontend->OnEventRaised(it->second, event_id); http://codereview.chromium.org/201070/diff/5001/3006#newcode73 Line 73: : service_(service), this indentation looks odd to my eye http://codereview.chromium.org/201070/diff/5001/3006#newcode96 Line 96: PendingMasters::value_type(*new_master_resource, host)); What happens when the 'host' goes away prior to update completion? This could be a place for a WeakPtr? http://codereview.chromium.org/201070/diff/5001/3006#newcode136 Line 136: UpdaterRequestInfo::MANIFEST_FETCH : UpdaterRequestInfo::MANIFEST_REFETCH; nit: 4 space indents on line wraps, there are other instances of this too http://codereview.chromium.org/201070/diff/5001/3006#newcode415 Line 415: const AppCacheEntry entry = it->second; maybe a reference, const AppCacheEntry& http://codereview.chromium.org/201070/diff/5001/3006#newcode445 Line 445: request->Start(); I think we don't want to issue so many requests at a time, actually i think the spec says we may work on up to two at a time in parallel. Maybe start two initially, and as each completes, start the next in the list.
I still haven't looked thru with an eye for update algorithm correctness, or URLRequest usage correctness... but I do have some comments on packaging, and apis for retrieving groups/caches from storage. http://codereview.chromium.org/201070/diff/5001/3004 File webkit/appcache/appcache.cc (right): http://codereview.chromium.org/201070/diff/5001/3004#newcode24 Line 24: : owning_group_(NULL), indentation is off http://codereview.chromium.org/201070/diff/5001/3003 File webkit/appcache/appcache.h (right): http://codereview.chromium.org/201070/diff/5001/3003#newcode34 Line 34: explicit AppCache(AppCacheService *service); I wonder if we should construct the appcache with a Manifest& for this use case (updater making a new, unstored instance)? And the AppCache could .swap any contained collections as it sees fit, stripping the ctor arg of the parsed results (so overall less copying of data required). The public interface could be simplified a bit too as a result, no public setters for fallbacks, whitelists, and such. Also this could more future proof the public class interface for this use case if/when new manifest file elements appear. The parse results reflecting new elements would be similarly injested by this ctor. (fyi: i have a hankering to add manifest file features at some point in the future:) http://codereview.chromium.org/201070/diff/5001/3005 File webkit/appcache/appcache_group.h (right): http://codereview.chromium.org/201070/diff/5001/3005#newcode39 Line 39: AppCacheUpdater* update_process() { return updater_; } Naming... wdyt about referring to this as an AppCacheUpdateJob? More of a question than a suggestion. So we use "Job" as the term for long lived processes (since Task is kindof reserved for xthread operations in chrome speak). There is precedence for "job" in chrome speak for long lived things like this. http://codereview.chromium.org/201070/diff/5001/3006 File webkit/appcache/appcache_updater.cc (right): http://codereview.chromium.org/201070/diff/5001/3006#newcode72 Line 72: AppCacheGroup* group) Some packaging details, but we have to think about how updates are initiated a bit, and what the preconditions are... - group exists in storage, but is not loaded - group exists in storage, and is loaded - group does not exist in storage, but is being installed already - group does not exist in storage, and is not being installed already x - updater exists already - updater does not exist yet I think we can take the 'foo' is already being loaded vs not out of the picture provided we have a couple of bottlenecks thru which 'foos' are loaded... service->LoadFoo(x, completion_callback); ... that deals with it already being loaded and just waits for completion. See comments below about the Load methods. ... any other cases? ---------------------------------- What if the way to initiate an update is by calling a method of Group. You can't express 'start an update' without having a group instance. // for use when starting an update w/o a new master entry group->StartUpdate(); // for use when starting an update w a new master entry group->StartUpdateWithMasterEntry(host, new_master_entry); -------------------- And the service has a LoadOrCreateGroup(manifest_url, callback) method which observes semantics similar to those outlined for LoadCache(id) in the other CL under review right now. - group exists in storage, but is not loaded LoadOrCreateGroup happens asyncly and eventually calls GroupLoadedCallback with the goods (group and newest cache loaded). Updater continues. - group exists in storage and is loaded LoadOrCreateGroup happens immediately, updater continues. - group does not exist in storage, but is being installed already LoadOrCreateGroup happens immediately (returning the in-memory only group), updater continues - group does not exist in storage, and is not being installed already LoadOrCreateGroup happens asyncly, after consulting the db and discovering no group, a new group is created in-memory only and returned, updater continues. -------------------- So... callsites in AppCacheHost would use LoadCache or LoadOrCreateGroup at select cache time, and upon completion, call the appropriate group->StartUpdate method. And that method deals with if an updater is already running and pokes the new master entry into it as needed.
New snapshot uploaded to address feedback so far. Still need to work on unittests. http://codereview.chromium.org/201070/diff/5001/3004 File webkit/appcache/appcache.cc (right): http://codereview.chromium.org/201070/diff/5001/3004#newcode24 Line 24: : owning_group_(NULL), On 2009/09/11 21:37:39, michaeln wrote: > indentation is off Done and fixed above as well. Fixed in appcache_group.cc too. http://codereview.chromium.org/201070/diff/5001/3003 File webkit/appcache/appcache.h (right): http://codereview.chromium.org/201070/diff/5001/3003#newcode34 Line 34: explicit AppCache(AppCacheService *service); On 2009/09/11 21:37:39, michaeln wrote: > I wonder if we should construct the appcache with a Manifest& for this use case > (updater making a new, unstored instance)? And the AppCache could .swap any > contained collections as it sees fit, stripping the ctor arg of the parsed > results (so overall less copying of data required). The public interface could > be simplified a bit too as a result, no public setters for fallbacks, > whitelists, and such. > > Also this could more future proof the public class interface for this use case > if/when new manifest file elements appear. The parse results reflecting new > elements would be similarly injested by this ctor. (fyi: i have a hankering to > add manifest file features at some point in the future:) Done. http://codereview.chromium.org/201070/diff/5001/3005 File webkit/appcache/appcache_group.h (right): http://codereview.chromium.org/201070/diff/5001/3005#newcode39 Line 39: AppCacheUpdater* update_process() { return updater_; } On 2009/09/11 21:37:39, michaeln wrote: > Naming... wdyt about referring to this as an AppCacheUpdateJob? More of a > question than a suggestion. > > So we use "Job" as the term for long lived processes (since Task is kindof > reserved for xthread operations in chrome speak). There is precedence for "job" > in chrome speak for long lived things like this. Done. http://codereview.chromium.org/201070/diff/5001/3006 File webkit/appcache/appcache_updater.cc (right): http://codereview.chromium.org/201070/diff/5001/3006#newcode46 Line 46: NotifyHostMap::value_type(frontend, HostIds(host->host_id()))); On 2009/09/11 02:37:49, michaeln wrote: > nit: maybe insert and empty vector, then unconditionally push_back below Done. http://codereview.chromium.org/201070/diff/5001/3006#newcode50 Line 50: ret.first->second.push_back(host->host_id()); On 2009/09/11 02:37:49, michaeln wrote: > any chance the host is already in the collection... just checking? In the hosts associated with a cache case, no dups. In the hosts for pending master entries case... maybe? Added de-duping logic just in case. http://codereview.chromium.org/201070/diff/5001/3006#newcode64 Line 64: HostIds host_ids = it->second; On 2009/09/11 02:37:49, michaeln wrote: > you can avoid making a copy of the vector here, maybe > HostIds& hosts_ids = iter->second; > or > frontend->OnEventRaised(it->second, event_id); > Done. http://codereview.chromium.org/201070/diff/5001/3006#newcode72 Line 72: AppCacheGroup* group) On 2009/09/11 21:37:39, michaeln wrote: > Some packaging details, but we have to think about how updates are initiated a > bit, and what the preconditions are... > > - group exists in storage, but is not loaded > - group exists in storage, and is loaded > - group does not exist in storage, but is being installed already > - group does not exist in storage, and is not being installed already > x > - updater exists already > - updater does not exist yet > > I think we can take the 'foo' is already being loaded vs not out of the picture > provided we have a couple of bottlenecks thru which 'foos' are loaded... > service->LoadFoo(x, completion_callback); > ... that deals with it already being loaded and just waits for completion. See > comments below about the Load methods. > > ... any other cases? > > ---------------------------------- > > What if the way to initiate an update is by calling a method of Group. You can't > express 'start an update' without having a group instance. > > // for use when starting an update w/o a new master entry > group->StartUpdate(); > > // for use when starting an update w a new master entry > group->StartUpdateWithMasterEntry(host, new_master_entry); > > -------------------- > > And the service has a LoadOrCreateGroup(manifest_url, callback) method which > observes semantics similar to those outlined for LoadCache(id) in the other CL > under review right now. > > - group exists in storage, but is not loaded > LoadOrCreateGroup happens asyncly and eventually calls GroupLoadedCallback with > the goods (group and newest cache loaded). Updater continues. > > - group exists in storage and is loaded > LoadOrCreateGroup happens immediately, updater continues. > > - group does not exist in storage, but is being installed already > LoadOrCreateGroup happens immediately (returning the in-memory only group), > updater continues > > - group does not exist in storage, and is not being installed already > LoadOrCreateGroup happens asyncly, after consulting the db and discovering no > group, a new group is created in-memory only and returned, updater continues. > > -------------------- > > So... callsites in AppCacheHost would use LoadCache or LoadOrCreateGroup at > select cache time, and upon completion, call the appropriate group->StartUpdate > method. > > And that method deals with if an updater is already running and pokes the new > master entry into it as needed. > > Changed to expect update job to be invoked via group. http://codereview.chromium.org/201070/diff/5001/3006#newcode73 Line 73: : service_(service), On 2009/09/11 02:37:49, michaeln wrote: > this indentation looks odd to my eye I had the indentation on all my constructor initializer lists wrong. Should be fixed everywhere now. http://codereview.chromium.org/201070/diff/5001/3006#newcode96 Line 96: PendingMasters::value_type(*new_master_resource, host)); On 2009/09/11 02:37:49, michaeln wrote: > What happens when the 'host' goes away prior to update completion? This could be > a place for a WeakPtr? Yes, I am planning to use WeakPtr after I merge with your CL. http://codereview.chromium.org/201070/diff/5001/3006#newcode136 Line 136: UpdaterRequestInfo::MANIFEST_FETCH : UpdaterRequestInfo::MANIFEST_REFETCH; On 2009/09/11 02:37:49, michaeln wrote: > nit: 4 space indents on line wraps, there are other instances of this too Done. Checked the other .cc files too. Hopefully I got all the places. http://codereview.chromium.org/201070/diff/5001/3006#newcode415 Line 415: const AppCacheEntry entry = it->second; On 2009/09/11 02:37:49, michaeln wrote: > maybe a reference, const AppCacheEntry& Done. http://codereview.chromium.org/201070/diff/5001/3006#newcode445 Line 445: request->Start(); On 2009/09/11 02:37:49, michaeln wrote: > I think we don't want to issue so many requests at a time, actually i think the > spec says we may work on up to two at a time in parallel. > > Maybe start two initially, and as each completes, start the next in the list. Spec says 2 or more and I was hoping browser code would be smart enough to queue extra requests. But, I'll change it to fetch 2 initially. That will also make the progress event more useful.
http://codereview.chromium.org/201070/diff/1003/1009 File webkit/appcache/appcache_update_job.cc (right): http://codereview.chromium.org/201070/diff/1003/1009#newcode73 Line 73: manifest_url_ = GURL(group_->manifest_url()); i don't think you need GURL() ctor here... manifest_url_ = group->manifeest_url(); ... should work http://codereview.chromium.org/201070/diff/1003/1009#newcode81 Line 81: void AppCacheUpdateJob::Update(AppCacheHost* host, Maybe call this method Start() ? http://codereview.chromium.org/201070/diff/1003/1009#newcode92 Line 92: frontend = host->frontend(); a helper method/function to notify a single host could simplify this one-off event dispatching code... void NotifySingleHost(host, event_id) { std::vector<int> ids(host->host_id()); host->frontend()->OnEventRaised(ids, event_id); } http://codereview.chromium.org/201070/diff/1003/1009#newcode127 Line 127: // TODO(jennb): need a request context? note to self: make a CL for this http://codereview.chromium.org/201070/diff/1003/1009#newcode221 Line 221: inprogress_cache_ = NULL; inprogress_cache_ should be NULL at this time (i think), since its only created after we have parse the manifest file which happens down below... right? Also, i think there's no need to reset to null here, the scoped_refptr will release upon destruction See my the comment below for more on this. http://codereview.chromium.org/201070/diff/1003/1009#newcode256 Line 256: host->set_selected_cache(inprogress_cache_); I think i'd like to make a more clear distinction between an association with a complete cache and an association with an in progress cache. Those are very different states and i think having them represented differently in our object model would be helpful, making it easier to reason about, and making the resulting code more self-documenting. What if we have the notion of a "potential cache" vs a "selected cache". When an update job starts, it creates a "potential cache" right off the bat. Hosts with new master entries get associated with that "potential cache". AppCacheHost has two methods for use by the update job to express these two associations. A host can only have one type of association at a time. // Establishes association with an incomplete cache. SetPotentialCache(cache); // Establishes association with a complete cache. // This also breaks any association with a different potential cache. SetSelectedCache(cache); The UpdateJob calls SetPotentialCache(potential_cache_) prior to returning from Start() when there is a new master entry involved. Then as the UpdateJob continues it later calls SetSelectedCache(potential_cache_) or SetSelectedCache(newest_cache_) or SetSelectedCache(NULL) on these 'potential' hosts depending on what happens. --------------------------------- There are a couple of scenarios to think thru around update jobs with "new master entries" and "potential hosts". In all cases until the master entry is successfully cached, resource loads for them should be hitting the network. 1) There is no manifest update, master entry is fetched In this case, the new master entry gets added to the existing cache, and after successful insertion of that new master entry, any 'potential host' with that entry should start using the existing cache immediately without calling swap or reloading or anything. Once the new entry is safely stored, UpdateJob calls.. potentialHost->SetSelectedCache(newest_cache_); // newest now contains the new master entry 2) There is a manifest update, master entry is fetched In this case, the new master entry gets added to a new cache, and after successful capture of the entire thing including the new master entry, any 'potential host' with that master entry should start using the newly created cache without calling swap or reloading. Once the entire thing is safely stored, UpdateJob calls... potentialHost->SetSelectedCache(potential_cache_); // its now the newest complete cache 3) Manifest fetch fails. potentialHost->SetSelectedCache(NULL); 4) There is no manifest update, but the master entry fetch fails potentialHost->SetSelectedCache(NULL); 5) There is a manifest update that otherwise succeeds, but a new master entry fetch fails potentialHost->SetSelectedCache(NULL); --------------------------------- At least that's my reading of the spec, although its really something discern the intent of that draft... do you buy that is the desired and specified behavior? And if so, wdyt of this way of representing/expressing things in our object model? I've updated my CL under review along these lines. http://codereview.chromium.org/201070/diff/1003/1009#newcode272 Line 272: // TODO(jennb): associate storage with the new entry note to self: service->GetNextFooId() http://codereview.chromium.org/201070/diff/1003/1009#newcode278 Line 278: // being processed, mark the entry as being foreign. We're setup to handle foreign entry detection further downstream, since to detect them here would require parsing (or sniffing) the html/xml document (and i'm not clear on what happens with xslt'ified xml docs really). Rather than a TODO here, maybe a Note about the opportunity to optimize and a comment that we handle detection of these further downstream at cache selection time. http://codereview.chromium.org/201070/diff/1003/1009#newcode322 Line 322: // Does "same parameters" include all original pending master entries? Good question, i think the intent is yes, this is going to be tricky. There's a fair amount of state that can't be dropped to pull that off. Feels like you'd have to schedule 'this' instance of the UpdateJob to Restart momentarily (and to continue to collect new pending master entries while in waiting to restart limbo). http://codereview.chromium.org/201070/diff/1003/1009#newcode349 Line 349: // De-dup hosts in case different pending master entries have the same host. It shouldn't be possible to have a single host with multiple pending master entries. A master entry is the main resource for a 'page' and there can only be one of those. If this is the only possible source of dups, don't think we need to check for that here. http://codereview.chromium.org/201070/diff/1003/1009#newcode350 Line 350: base::hash_set<int> hosts_seen; also... host ids are not unique within the system... only unique within a given child process (backend/frontend pair). http://codereview.chromium.org/201070/diff/1003/1009#newcode370 Line 370: host_notifier.AddHosts(inprogress_cache_->associated_hosts()); as coded, this also picks up all 'pending master hosts' once the update job has gotten to the point of creating the in progress cache... but prior to reaching that point it doesn't pick up that group... subtly squishy per other comments about more clearly representing cache2host associations and distiguishing between potential hosts and established hosts... maybe have the three (four) distinct Nofify many methods to express where events should be targeted by the update logic... NotifyAllAssociatedHosts // sends to both groups NotifyEstablishedHosts // only sends to hosts with a pre-existing complete cache selected NotifyPotentialHosts // only sends to host w/o with new master entries ... NotifySingleHost // sends to exactly one If we go with that arrangement void NotifyAllAssociatedHosts(event) { NotifyAssociatedHosts(event, true, true); } void NotifyEstablishedHosts(event) { NotifyAssociatedHosts(event, true, false); } void NotifyPotentialHosts(event) { NotifyAssociatedHosts(event, false, true); } void NotifyAssociatedHosts(event, bool established, bool potential) { HostNotifier notifier; if (potential) { notifier->AddHosts(potential_cache_->associated_hosts()); notifier.SendNotifications(e); } if (established) { add old cache hosts; add newest cache hosts; } notifier.SendNotifications(e) } http://codereview.chromium.org/201070/diff/1003/1009#newcode394 Line 394: // TODO(jennb): get stored manifest data from entry into previous_manifest Reading the old manifest from storage is definitely going to be async. http://codereview.chromium.org/201070/diff/1003/1009#newcode400 Line 400: manifest.explicit_urls.begin(); indent is off http://codereview.chromium.org/201070/diff/1003/1009#newcode406 Line 406: manifest.fallback_namespaces.begin(); indentation http://codereview.chromium.org/201070/diff/1003/1009#newcode414 Line 414: group_->newest_complete_cache()->entries(); indent one more tab stop http://codereview.chromium.org/201070/diff/1003/1010 File webkit/appcache/appcache_update_job.h (right): http://codereview.chromium.org/201070/diff/1003/1010#newcode59 Line 59: void FetchManifest(bool is_first_fetch = true); We generally avoid default parameter values, at some point they were explicitly 'forbidden' in the style guide. I'm not sure if they still are, but there a compelling need for it here? http://codereview.chromium.org/201070/diff/1003/1010#newcode92 Line 92: bool MaybeUpdateDone(); naming.... MaybeCompleteJob()? http://codereview.chromium.org/201070/diff/1003/1010#newcode111 Line 111: UrlFileList url_file_list_; naming nit: maybe UrlFetchList http://codereview.chromium.org/201070/diff/1003/1010#newcode115 Line 115: // removed when they fetched (not when the fetch completes). comment nit: 'removed when the fetch is initiated' and perhaps remove the parenthetical comment
some more comments... http://codereview.chromium.org/201070/diff/1003/1009 File webkit/appcache/appcache_update_job.cc (right): http://codereview.chromium.org/201070/diff/1003/1009#newcode112 Line 112: NotifyAllAssociatedHosts(CHECKING_EVENT); So at this point... we would call NotifyEstablishedHosts() to make it clear that this event is only delivered to those guys. http://codereview.chromium.org/201070/diff/1003/1009#newcode116 Line 116: frontend->OnEventRaised(host_to_notify, CHECKING_EVENT); NotifySingleHost() for clarity http://codereview.chromium.org/201070/diff/1003/1009#newcode177 Line 177: return false; maybe for now, schedule a task to continue reading and then return false... so pretend we write the data to response storage after one trip thru the message loop. http://codereview.chromium.org/201070/diff/1003/1009#newcode220 Line 220: NotifyAllPendingMasterHosts(ERROR_EVENT); NotifyEstablishedHosts(obsolete) NotifyPotentialHOsts(error) http://codereview.chromium.org/201070/diff/1003/1009#newcode259 Line 259: NotifyAllAssociatedHosts(DOWNLOADING_EVENT); "all" works here as is http://codereview.chromium.org/201070/diff/1003/1009#newcode339 Line 339: NotifyAllAssociatedHosts(CACHED_EVENT); for a cache attempt, there can only be "potential" hosts... all works here just fine to identify the entire group, but it cast a wider net than is needed. http://codereview.chromium.org/201070/diff/1003/1009#newcode341 Line 341: NotifyAllAssociatedHosts(UPDATE_READY_EVENT); this part of the spec is not clear to me... maybe you know the details of what it says... * should the newly complete cache be in use for the "potential" hosts without having to call swap() or do they have to call swap() to make it so. * is the answer to the above any different for the cache attempt vs update cases? * if swap() is not required for "potential" associated hosts, should UPDATE_READY or CACHED be called for them? http://codereview.chromium.org/201070/diff/1003/1009#newcode442 Line 442: urls_to_fetch_.erase(urls_to_fetch_.begin()); Will erase delete the storage for GURL& url here? http://codereview.chromium.org/201070/diff/1003/1009#newcode492 Line 492: pending_master_entries_.clear(); What sequence of events does a "potential" host see in the case of NO_UPDATE, but a new master entry was added? http://codereview.chromium.org/201070/diff/1003/1009#newcode512 Line 512: inprogress_cache_ = NULL; // no storage to clean up for new cache while there is nothing in the sqlDatabase to cleanup, i think we'll have to clean up any responses that have been been stored in the diskcache/httpcache.
Still didn't make it all the way thru... http://codereview.chromium.org/201070/diff/1003/1009 File webkit/appcache/appcache_update_job.cc (right): http://codereview.chromium.org/201070/diff/1003/1009#newcode71 Line 71: group_(group) { initalize more data members... master_entries_completed_, type_, mode_, http://codereview.chromium.org/201070/diff/1003/1009#newcode85 Line 85: PendingMasters::value_type(new_master_resource, host)); There is a corner case here that's missing. We can get the same 'new_master_resource' here from multiple hosts. This can happen in a couple of different ways. 1) User opens the same page in two different tabs. 2) User opens a page, closes it, then comes back prior to the job having run-till-completion. Both of those are pretty likely really. http://codereview.chromium.org/201070/diff/1003/1009#newcode210 Line 210: } Where are requests deleted? This may be a good place to do so. fyi... from url_request.h // The lifetime of an instance of this class is completely controlled by the // consumer, and the instance is not required to live on the heap or be // allocated in any special way. It is also valid to delete an URLRequest // object during the handling of a callback to its delegate. Of course, once // the URLRequest is deleted, no further callbacks to its delegate will occur. http://codereview.chromium.org/201070/diff/1003/1009#newcode213 Line 213: void AppCacheUpdateJob::HandleManifestFetchCompleted(URLRequest* request) { might be nice to test request->status().is_success() up front here to make the conditions below easier to follow. if (!request->status().is_success() { mode_ = CACHE_FAILURE; MaybeUpdateDone(); return; } http://codereview.chromium.org/201070/diff/1003/1009#newcode222 Line 222: delete this; Instead of calling delete this directly, consider using... MessageLoop::current()->DeleteSoon(FROM_HERE, this); ... to allow the stack to unwind prior to deletion. A little safer and easier to maintain. http://codereview.chromium.org/201070/diff/1003/1009#newcode240 Line 240: if (!request->status().is_success() || mime_type != "text/cache-manifest" || Would be good to have kManifestMimeType = "text/cache-manifest" as a constant in appcache_interface.h .cc http://codereview.chromium.org/201070/diff/1003/1009#newcode242 Line 242: !ParseManifest(manifest_url_, manifest_data_.data(), This logic has us attempting to parse 500 response codes as manifest files, and if the parse succeeds accepting that data into the cache. I think checking for 200 would be good for now. We may want to revisit to allow testing with file:// urls. http://codereview.chromium.org/201070/diff/1003/1009#newcode251 Line 251: inprogress_cache_ = new AppCache(service_, manifest); reminder: inprogress_cache_->set_owning_group() http://codereview.chromium.org/201070/diff/1003/1009#newcode257 Line 257: } We also need to store the manifest resource in response storage. Does that happen elsewhere, if not, here seems like a good place for that. http://codereview.chromium.org/201070/diff/1003/1009#newcode270 Line 270: if (request->status().is_success()) { Need to check the HTTP status code, we want 200 responses only in the cache. The is_success() is not indicative of a successful http status code. That just means the request completed and we got a well formed set of headers and a response body. This indicates that an os error didn't occur. The actual HTTP response we successfully received could be a 500 Internal Error. http://codereview.chromium.org/201070/diff/1003/1009#newcode284 Line 284: // complete cache. I think this logic is off, this may be due to is_success() interpretion. The spec says something like... if (response_code != 200) { if (explicit || fallback) run cache failure steps else if (response_code == 404 or 410) skip it else copy it from previous cache } else { store the good reponse we just received } http://codereview.chromium.org/201070/diff/1003/1009#newcode483 Line 483: if (master_entries_completed_ != pending_master_entries_.size() || For initial checkin... what's the tactical plan? master_entries_completed_ is never incremented, so jobs will never complete as currently coded. http://codereview.chromium.org/201070/diff/1003/1009#newcode514 Line 514: group_ = NULL; // no storage to clean up for new group Don't think you need to reset inprogress_cache_ or group_ to NULL prior to deletion. The dtor will release things for you. Also, i think setting group to NULL here will leave it with a non-idle UpdateStatus, which doesn't seem right. I know it (the group) should be going bye-bye very soon, but its still inconsistent. http://codereview.chromium.org/201070/diff/1003/1010 File webkit/appcache/appcache_update_job.h (right): http://codereview.chromium.org/201070/diff/1003/1010#newcode20 Line 20: class AppCacheUpdateJob : public URLRequest::Delegate { We definitely need a means of canceling an ongoing job for the browser shutdown case. Also, as currently coded, an UpdateJob will run-till-completion even if there are no longer any related hosts/documents open in the system. We need to decide what the lifecycle for these things wants to be? Always run-till-completion could be abused in some ways. I'm not sure that's the best policy. Maybe we start off with a more restrictive policy, run-till-no_related_hosts open anymore. http://codereview.chromium.org/201070/diff/1003/1010#newcode22 Line 22: // Methods for URLRequest::Delegate. We may need to do something with OnCertificateRequested too. What does ResourceDispatcherHost do with that callback? http://codereview.chromium.org/201070/diff/1003/1010#newcode108 Line 108: UpdateMode mode_; 'type' and 'mode' aren't very descriptive... maybe helpers like... IsUpgradeAttempt() IsCacheAttempt() FoundNoUpdate() mode= NOUPATE FoundUpdate() mode=DOWNLOADING HasUpdateFailure() mode=CACHE_FAILURE would help with readability in the .cc file.
New snapshot uploaded. This is as far as I got today. Addressed all feedback comments (leaving notify all associated hosts alone to keep it matching html5 spec). Barely started on unittests. http://codereview.chromium.org/201070/diff/1003/1009 File webkit/appcache/appcache_update_job.cc (right): http://codereview.chromium.org/201070/diff/1003/1009#newcode71 Line 71: group_(group) { On 2009/09/15 03:37:45, michaeln wrote: > initalize more data members... > master_entries_completed_, > type_, > mode_, Done. http://codereview.chromium.org/201070/diff/1003/1009#newcode73 Line 73: manifest_url_ = GURL(group_->manifest_url()); On 2009/09/14 04:17:58, michaeln wrote: > i don't think you need GURL() ctor here... > manifest_url_ = group->manifeest_url(); > ... should work Done. http://codereview.chromium.org/201070/diff/1003/1009#newcode81 Line 81: void AppCacheUpdateJob::Update(AppCacheHost* host, On 2009/09/14 04:17:58, michaeln wrote: > Maybe call this method Start() ? Renamed StartUpdate(). http://codereview.chromium.org/201070/diff/1003/1009#newcode85 Line 85: PendingMasters::value_type(new_master_resource, host)); On 2009/09/15 03:37:45, michaeln wrote: > There is a corner case here that's missing. > > We can get the same 'new_master_resource' here from multiple hosts. This can > happen in a couple of different ways. > > 1) User opens the same page in two different tabs. > 2) User opens a page, closes it, then comes back prior to the job having > run-till-completion. > > Both of those are pretty likely really. Done. http://codereview.chromium.org/201070/diff/1003/1009#newcode92 Line 92: frontend = host->frontend(); On 2009/09/14 04:17:58, michaeln wrote: > a helper method/function to notify a single host could simplify this one-off > event dispatching code... > > void NotifySingleHost(host, event_id) { > std::vector<int> ids(host->host_id()); > host->frontend()->OnEventRaised(ids, event_id); > } > Done. I was trying to avoid constructing the vector more than once, but I agree that made the code hard to read. http://codereview.chromium.org/201070/diff/1003/1009#newcode112 Line 112: NotifyAllAssociatedHosts(CHECKING_EVENT); On 2009/09/14 18:50:33, michaeln wrote: > So at this point... we would call NotifyEstablishedHosts() to make it clear that > this event is only delivered to those guys. Leaving as is. http://codereview.chromium.org/201070/diff/1003/1009#newcode116 Line 116: frontend->OnEventRaised(host_to_notify, CHECKING_EVENT); On 2009/09/14 18:50:33, michaeln wrote: > NotifySingleHost() for clarity Done. http://codereview.chromium.org/201070/diff/1003/1009#newcode177 Line 177: return false; On 2009/09/14 18:50:33, michaeln wrote: > maybe for now, schedule a task to continue reading and then return false... so > pretend we write the data to response storage after one trip thru the message > loop. Done. http://codereview.chromium.org/201070/diff/1003/1009#newcode210 Line 210: } On 2009/09/15 03:37:45, michaeln wrote: > Where are requests deleted? This may be a good place to do so. > > fyi... from url_request.h > // The lifetime of an instance of this class is completely controlled by the > // consumer, and the instance is not required to live on the heap or be > // allocated in any special way. It is also valid to delete an URLRequest > // object during the handling of a callback to its delegate. Of course, once > // the URLRequest is deleted, no further callbacks to its delegate will occur. > Good catch. Added here and when requests are canceled. http://codereview.chromium.org/201070/diff/1003/1009#newcode213 Line 213: void AppCacheUpdateJob::HandleManifestFetchCompleted(URLRequest* request) { On 2009/09/15 03:37:45, michaeln wrote: > might be nice to test request->status().is_success() up front here to make the > conditions below easier to follow. > > if (!request->status().is_success() { > mode_ = CACHE_FAILURE; > MaybeUpdateDone(); > return; > } > Re-orged this method. Should be much clearer now. Thanks for pointing out that status().is_success() != successful http code. http://codereview.chromium.org/201070/diff/1003/1009#newcode221 Line 221: inprogress_cache_ = NULL; On 2009/09/14 04:17:58, michaeln wrote: > inprogress_cache_ should be NULL at this time (i think), since its only created > after we have parse the manifest file which happens down below... right? Also, i > think there's no need to reset to null here, the scoped_refptr will release upon > destruction > > See my the comment below for more on this. Done. http://codereview.chromium.org/201070/diff/1003/1009#newcode222 Line 222: delete this; On 2009/09/15 03:37:45, michaeln wrote: > Instead of calling delete this directly, consider using... > MessageLoop::current()->DeleteSoon(FROM_HERE, this); > ... to allow the stack to unwind prior to deletion. A little safer and easier to > maintain. Done. http://codereview.chromium.org/201070/diff/1003/1009#newcode240 Line 240: if (!request->status().is_success() || mime_type != "text/cache-manifest" || On 2009/09/15 03:37:45, michaeln wrote: > Would be good to have kManifestMimeType = "text/cache-manifest" as a constant in > appcache_interface.h .cc Done. http://codereview.chromium.org/201070/diff/1003/1009#newcode251 Line 251: inprogress_cache_ = new AppCache(service_, manifest); On 2009/09/15 03:37:45, michaeln wrote: > reminder: inprogress_cache_->set_owning_group() Done. http://codereview.chromium.org/201070/diff/1003/1009#newcode257 Line 257: } On 2009/09/15 03:37:45, michaeln wrote: > We also need to store the manifest resource in response storage. Does that > happen elsewhere, if not, here seems like a good place for that. The manifest is stored after we know we won't be discarding this cache (in HandleManifestRefetchCompleted). http://codereview.chromium.org/201070/diff/1003/1009#newcode270 Line 270: if (request->status().is_success()) { On 2009/09/15 03:37:45, michaeln wrote: > Need to check the HTTP status code, we want 200 responses only in the cache. > > The is_success() is not indicative of a successful http status code. That just > means the request completed and we got a well formed set of headers and a > response body. This indicates that an os error didn't occur. > > The actual HTTP response we successfully received could be a 500 Internal Error. > Done. http://codereview.chromium.org/201070/diff/1003/1009#newcode278 Line 278: // being processed, mark the entry as being foreign. On 2009/09/14 04:17:58, michaeln wrote: > We're setup to handle foreign entry detection further downstream, since to > detect them here would require parsing (or sniffing) the html/xml document (and > i'm not clear on what happens with xslt'ified xml docs really). > > Rather than a TODO here, maybe a Note about the opportunity to optimize and a > comment that we handle detection of these further downstream at cache selection > time. > Done. http://codereview.chromium.org/201070/diff/1003/1009#newcode284 Line 284: // complete cache. On 2009/09/15 03:37:45, michaeln wrote: > I think this logic is off, this may be due to is_success() interpretion. The > spec says something like... > > if (response_code != 200) { > if (explicit || fallback) > run cache failure steps > else if (response_code == 404 or 410) > skip it > else > copy it from previous cache > } else { > store the good reponse we just received > } > > Fixed. Not sure what I was thinking before... http://codereview.chromium.org/201070/diff/1003/1009#newcode322 Line 322: // Does "same parameters" include all original pending master entries? On 2009/09/14 04:17:58, michaeln wrote: > Good question, i think the intent is yes, this is going to be tricky. There's a > fair amount of state that can't be dropped to pull that off. Feels like you'd > have to schedule 'this' instance of the UpdateJob to Restart momentarily (and to > continue to collect new pending master entries while in waiting to restart > limbo). I agree that the spec intends the answer to be "yes". Will figure this out in the next CL iteration. http://codereview.chromium.org/201070/diff/1003/1009#newcode341 Line 341: NotifyAllAssociatedHosts(UPDATE_READY_EVENT); On 2009/09/14 18:50:33, michaeln wrote: > this part of the spec is not clear to me... maybe you know the details of what > it says... > > * should the newly complete cache be in use for the "potential" hosts without > having to call swap() or do they have to call swap() to make it so. > > * is the answer to the above any different for the cache attempt vs update > cases? > > * if swap() is not required for "potential" associated hosts, should > UPDATE_READY or CACHED be called for them? Here's my reading of the spec: On CACHED event, "potential" hsots do not have to use swap. They already know they are associated with the newly complete cache. On UPDATE_READY, all hosts can call swapCache(), but for the "potential" hosts, that results in a no-op. Or, if "potential" hosts can tell they are "new" hosts, they can skip calling swapCache(). (Still use UPDATE_READY in this case as it is correctly describes what has happened during the update process.) http://codereview.chromium.org/201070/diff/1003/1009#newcode349 Line 349: // De-dup hosts in case different pending master entries have the same host. On 2009/09/14 04:17:58, michaeln wrote: > It shouldn't be possible to have a single host with multiple pending master > entries. A master entry is the main resource for a 'page' and there can only be > one of those. If this is the only possible source of dups, don't think we need > to check for that here. Gone. http://codereview.chromium.org/201070/diff/1003/1009#newcode350 Line 350: base::hash_set<int> hosts_seen; On 2009/09/14 04:17:58, michaeln wrote: > also... host ids are not unique within the system... only unique within a given > child process (backend/frontend pair). Gone. No longer de-duping. http://codereview.chromium.org/201070/diff/1003/1009#newcode394 Line 394: // TODO(jennb): get stored manifest data from entry into previous_manifest On 2009/09/14 04:17:58, michaeln wrote: > Reading the old manifest from storage is definitely going to be async. Done. http://codereview.chromium.org/201070/diff/1003/1009#newcode400 Line 400: manifest.explicit_urls.begin(); On 2009/09/14 04:17:58, michaeln wrote: > indent is off Done. http://codereview.chromium.org/201070/diff/1003/1009#newcode406 Line 406: manifest.fallback_namespaces.begin(); On 2009/09/14 04:17:58, michaeln wrote: > indentation Done. http://codereview.chromium.org/201070/diff/1003/1009#newcode414 Line 414: group_->newest_complete_cache()->entries(); On 2009/09/14 04:17:58, michaeln wrote: > indent one more tab stop Done. http://codereview.chromium.org/201070/diff/1003/1009#newcode442 Line 442: urls_to_fetch_.erase(urls_to_fetch_.begin()); On 2009/09/14 18:50:33, michaeln wrote: > Will erase delete the storage for GURL& url here? Wow, good catch. http://codereview.chromium.org/201070/diff/1003/1009#newcode483 Line 483: if (master_entries_completed_ != pending_master_entries_.size() || On 2009/09/15 03:37:45, michaeln wrote: > For initial checkin... what's the tactical plan? > > master_entries_completed_ is never incremented, so jobs will never complete as > currently coded. > Commented out insertion of pending master entries into job for now. http://codereview.chromium.org/201070/diff/1003/1009#newcode492 Line 492: pending_master_entries_.clear(); On 2009/09/14 18:50:33, michaeln wrote: > What sequence of events does a "potential" host see in the case of NO_UPDATE, > but a new master entry was added? if master_entry failed to download, the "potential" host sees ERROR else the host sees OnCacheSelected(newest_complete_cache) and NO_UPDATE http://codereview.chromium.org/201070/diff/1003/1009#newcode512 Line 512: inprogress_cache_ = NULL; // no storage to clean up for new cache On 2009/09/14 18:50:33, michaeln wrote: > while there is nothing in the sqlDatabase to cleanup, i think we'll have to > clean up any responses that have been been stored in the diskcache/httpcache. Added a TODO note. http://codereview.chromium.org/201070/diff/1003/1009#newcode514 Line 514: group_ = NULL; // no storage to clean up for new group On 2009/09/15 03:37:45, michaeln wrote: > Don't think you need to reset inprogress_cache_ or group_ to NULL prior to > deletion. The dtor will release things for you. > > Also, i think setting group to NULL here will leave it with a non-idle > UpdateStatus, which doesn't seem right. I know it (the group) should be going > bye-bye very soon, but its still inconsistent. Done. http://codereview.chromium.org/201070/diff/1003/1010 File webkit/appcache/appcache_update_job.h (right): http://codereview.chromium.org/201070/diff/1003/1010#newcode20 Line 20: class AppCacheUpdateJob : public URLRequest::Delegate { On 2009/09/15 03:37:45, michaeln wrote: > We definitely need a means of canceling an ongoing job for the browser shutdown > case. > > Also, as currently coded, an UpdateJob will run-till-completion even if there > are no longer any related hosts/documents open in the system. > > We need to decide what the lifecycle for these things wants to be? > > Always run-till-completion could be abused in some ways. I'm not sure that's the > best policy. Maybe we start off with a more restrictive policy, > run-till-no_related_hosts open anymore. Added a TODO to add api to cancel this job. Punting for now on when cancel should/can be called. http://codereview.chromium.org/201070/diff/1003/1010#newcode22 Line 22: // Methods for URLRequest::Delegate. On 2009/09/15 03:37:45, michaeln wrote: > We may need to do something with OnCertificateRequested too. What does > ResourceDispatcherHost do with that callback? ResourceDispatcherHost queries the user to pick a certificate. I'm letting the default behavior (continue with no certificate) run for now, which will probably result in the url request failing. http://codereview.chromium.org/201070/diff/1003/1010#newcode59 Line 59: void FetchManifest(bool is_first_fetch = true); On 2009/09/14 04:17:58, michaeln wrote: > We generally avoid default parameter values, at some point they were explicitly > 'forbidden' in the style guide. I'm not sure if they still are, but there a > compelling need for it here? Done. http://codereview.chromium.org/201070/diff/1003/1010#newcode92 Line 92: bool MaybeUpdateDone(); On 2009/09/14 04:17:58, michaeln wrote: > naming.... MaybeCompleteJob()? Changed to MaybeCompleteUpdate(); http://codereview.chromium.org/201070/diff/1003/1010#newcode108 Line 108: UpdateMode mode_; On 2009/09/15 03:37:45, michaeln wrote: > 'type' and 'mode' aren't very descriptive... maybe helpers like... > IsUpgradeAttempt() IsCacheAttempt() > FoundNoUpdate() mode= NOUPATE > FoundUpdate() mode=DOWNLOADING > HasUpdateFailure() mode=CACHE_FAILURE > would help with readability in the .cc file. > Renamed type_ to update_type_ for clarity. Renamed UpdateMode and mode_ to UpdateState and update_state_ as that FSM terminology is more commonly used. http://codereview.chromium.org/201070/diff/1003/1010#newcode111 Line 111: UrlFileList url_file_list_; On 2009/09/14 04:17:58, michaeln wrote: > naming nit: maybe UrlFetchList Wanted clear distinction between this and urls_to_fetch list, thus I went with the naming used in the spec. http://codereview.chromium.org/201070/diff/1003/1010#newcode115 Line 115: // removed when they fetched (not when the fetch completes). On 2009/09/14 04:17:58, michaeln wrote: > comment nit: 'removed when the fetch is initiated' and perhaps remove the > parenthetical comment Reworded.
And this is as far as i got... http://codereview.chromium.org/201070/diff/1003/1010 File webkit/appcache/appcache_update_job.h (right): http://codereview.chromium.org/201070/diff/1003/1010#newcode22 Line 22: // Methods for URLRequest::Delegate. > ResourceDispatcherHost queries the user to pick a certificate. I'm letting the > default behavior (continue with no certificate) run for now, which will probably > result in the url request failing. let's ping darin about this one? http://codereview.chromium.org/201070/diff/8002/8005 File webkit/appcache/appcache.cc (right): http://codereview.chromium.org/201070/diff/8002/8005#newcode20 Line 20: service_(service) { thnx for fixing the indents http://codereview.chromium.org/201070/diff/8002/8004 File webkit/appcache/appcache.h (right): http://codereview.chromium.org/201070/diff/8002/8004#newcode36 Line 36: explicit AppCache(AppCacheService *service, Manifest& manifest); with two ctor args, no need for explicit http://codereview.chromium.org/201070/diff/8002/8004#newcode36 Line 36: explicit AppCache(AppCacheService *service, Manifest& manifest); Manifest* instead of &, i think that's a google-style thing still, doesnt like non-const references. http://codereview.chromium.org/201070/diff/8002/8004#newcode63 Line 63: const std::set<AppCacheHost*>& associated_hosts() { the method can be const too... associated_hosts() const { http://codereview.chromium.org/201070/diff/8002/8004#newcode75 Line 75: const std::vector<FallbackNamespace>& fallback_namespaces() { ditto http://codereview.chromium.org/201070/diff/8002/8008 File webkit/appcache/appcache_interfaces.h (right): http://codereview.chromium.org/201070/diff/8002/8008#newcode17 Line 17: static const std::string kManifestMimeType = "text/cache-manifest"; oops, please define this constant as a primitive char[]. we avoid running ctors at DLL load time for performance. you can look at chrome/common/url_constants.h .cc for how we do string constants... or just look here :) // .h extern const char kManifestMimeType[]; // .cc const char kManifestMimeType[] = "text/cache-manifest"; http://codereview.chromium.org/201070/diff/8002/8012 File webkit/appcache/appcache_unittest.cc (right): http://codereview.chromium.org/201070/diff/8002/8012#newcode79 Line 79: EXPECT_EQ(2, manifest.explicit_urls.size()); // untouched also check to see that the manifest's namespaces are empty http://codereview.chromium.org/201070/diff/8002/8010 File webkit/appcache/appcache_update_job.cc (right): http://codereview.chromium.org/201070/diff/8002/8010#newcode100 Line 100: DCHECK(group_->update_job() == this); you could move this DCHECK to the top of the method body to get it out of the way early http://codereview.chromium.org/201070/diff/8002/8010#newcode136 Line 136: } else { for one-line blocks, you can leave the curly braces off http://codereview.chromium.org/201070/diff/8002/8010#newcode203 Line 203: void AppCacheUpdateJob::OnResponseCompleted(URLRequest* request) { Hmmmm... there may be something we put into Gears that we should think about here... look for "503" handling in here... http://cs/p?sa=N&cd=1&ct=rc#googleclient/gears/opensource/gears/localserver/c... http://codereview.chromium.org/201070/diff/8002/8010#newcode223 Line 223: void AppCacheUpdateJob::HandleManifestFetchCompleted(URLRequest* request) { I think you want to check .is_success() up top here to avoid accepting partial responses into the cache (for all HandleCompleted calls). Here's the thing... if the connection drops after having received the headers, but prior to fully receiving the response body, the response_code will be valid, may even say 200. http://codereview.chromium.org/201070/diff/8002/8010#newcode229 Line 229: mime_type == kManifestMimeType) { do we get a Content-Type header for 304s? http://codereview.chromium.org/201070/diff/8002/8010#newcode231 Line 231: ContinueHandleManifestFetchCompleted(request, true); we could have a 304 response here, this would be a cache failure condition http://codereview.chromium.org/201070/diff/8002/8010#newcode233 Line 233: ContinueHandleManifestFetchCompleted(request, false); indentation, and the next else clause too http://codereview.chromium.org/201070/diff/8002/8010#newcode235 Line 235: CheckIfManifestChanged(); // continues asynchronously This last else is the expected 200 case. I find this block of logic difficult to follow. Its easy to not spot logic errors or to introduce new ones. Could you take a shot at rearranging this method body for clarity? Maybe something like... if (!is_success()) CACHE_FAILURE; switch (GetResponseCode()) { case 200: break; case 304: break; case 404: case 410: break; default: CACHE_FAILURE; } http://codereview.chromium.org/201070/diff/8002/8010#newcode255 Line 255: Manifest manifest; What do you think of checking the mimetype here? The tests in HandleManifestFetchCompleted are all oriented towards "do we have a 200 OK response we may need to think about". Once you get past that door, its time to examine the material content of the reponse, including its content-type. wdyt? http://codereview.chromium.org/201070/diff/8002/8010#newcode576 Line 576: MessageLoop::current()->DeleteSoon(FROM_HERE, this); definitely rename the method to DeleteSoon http://codereview.chromium.org/201070/diff/8002/8011 File webkit/appcache/appcache_update_job.h (right): http://codereview.chromium.org/201070/diff/8002/8011#newcode50 Line 50: typedef std::map<GURL, AppCacheEntry> UrlFileList; we have a typedef for this, AppCache::EntryMap, the data member name should be sufficient to describe its purpose http://codereview.chromium.org/201070/diff/8002/8011#newcode63 Line 63: enum UpdateState { UpdateState is definitely an improvement. In looking at the .cc file, there is an UpdateStatus for the group and this UpdateState type, its easy to lose track of which is which. Not sure this is a good idea, but InternalUpdateState? Is the first state really unknown or is it FETCHING_MANIFEST? http://codereview.chromium.org/201070/diff/8002/8011#newcode83 Line 83: // Returns false if response data processed asynchronously, in which data _is_ processed http://codereview.chromium.org/201070/diff/8002/8011#newcode108 Line 108: bool SkipUrlFetch(const AppCacheEntry& entry); maybe ShouldSkipUrlFetch? http://codereview.chromium.org/201070/diff/8002/8011#newcode121 Line 121: // the update process. describe the return value in the doc comment http://codereview.chromium.org/201070/diff/8002/8011#newcode133 Line 133: void UpdateFinished(); maybe DeleteSoon? http://codereview.chromium.org/201070/diff/8002/8011#newcode152 Line 152: std::vector<GURL> urls_to_fetch_; can we use a std::deque here
Getting there... http://codereview.chromium.org/201070/diff/8002/8010 File webkit/appcache/appcache_update_job.cc (right): http://codereview.chromium.org/201070/diff/8002/8010#newcode127 Line 127: request->SetUserData(this, new UpdateJobInfo(fetch_type)); Please also set the BYPASS_INTERCEPT load flag for all requests initiated by the UpdateJob. What that flag does is circumvent URLRequest::Interceptors, they won't get called, the appcache interceptor as well as all other. In particular, we want to bypass the Gears interceptor for these requests so we aren't loading resources from Gears.LocalServer and putting them in AppCache. (note to self: ensure a potential new 'master entry' wasn't pulled from gears) http://codereview.chromium.org/201070/diff/8002/8010#newcode183 Line 183: MessageLoop::current()->PostTask(FROM_HERE, NewRunnableMethod( See ScopedMethodFactory<T> to avoid derivation from RefCounted. http://codereview.chromium.org/201070/diff/8002/8010#newcode287 Line 287: void AppCacheUpdateJob::HandleUrlFetchCompleted(URLRequest* request) { gotta restructure this for if !is_success() handling. http://codereview.chromium.org/201070/diff/8002/8010#newcode342 Line 342: // TODO(jennb): add manifest_data_ to storage and put storage key in entry ah... i see... good http://codereview.chromium.org/201070/diff/8002/8010#newcode346 Line 346: inprogress_cache_->set_complete(true); i think we need to defer set_complete(true) until we've committed to storage, ditto for the firing of CACHED or UPDATE_READY events. here's why... upon new version of application script being installed for offline use, the app should feel free to begin upgrading other persistent storage schemas (Databases and/or the structure of data in LocalStorage) in ways that the newly installed application script can interpret... but these schema upgrade may involve breaking changes when viewed by the old application script. so... don't signal DONE until the new application script is in the bank. And definitely need to put set_complete(true) adjacent to the signaling of DONE'ness, with no async operations in between. http://codereview.chromium.org/201070/diff/8002/8010#newcode350 Line 350: // TODO(jennb): write new group to storage here? we should have a storage api that gestures to store cache + newest group in one fell swoop... a single transaction http://codereview.chromium.org/201070/diff/8002/8010#newcode459 Line 459: remove the extra blank line http://codereview.chromium.org/201070/diff/8002/8010#newcode473 Line 473: // repeatedly. Order of fetching doesn't matter. use a std::deque :) http://codereview.chromium.org/201070/diff/8002/8010#newcode481 Line 481: LoadFromNewestCache(url, entry); // load is async we just need meta-data here, cache-control headers... is that right? http://codereview.chromium.org/201070/diff/8002/8010#newcode485 Line 485: request->SetUserData(this, new UpdateJobInfo(UpdateJobInfo::URL_FETCH)); BYPASS_INTERCEPTION http://codereview.chromium.org/201070/diff/8002/8010#newcode499 Line 499: // TODO(jennb): decide if entry should be skipped to expire it from cache hmmmm... httpcache/diskcache may be keeping last access data for us to help base this decision on. http://codereview.chromium.org/201070/diff/8002/8011 File webkit/appcache/appcache_update_job.h (right): http://codereview.chromium.org/201070/diff/8002/8011#newcode24 Line 24: public base::RefCountedThreadSafe<AppCacheUpdateJob> { I was wondering why refcounted? You should look at ScopedMethodFactory. http://codereview.chromium.org/201070/diff/8002/8011#newcode48 Line 48: typedef std::vector<base::WeakPtr<AppCacheHost> > PendingHosts; I think the observer model may be better suited for keeping track of these AppCacheHosts (and eventually observing the main resource data as it gets fed back to us from the client). So instead of WeakPtrs here, we observe OnDestructionImmiment(host), and pull them from the collection at that time. See the CL i sent out for review yesterday. Also, consider keeping this collection of PendindMasters in the Group itself rather than the UpdateJob. a. That is how the spec phrases the algoritm b. Will more naturally survive the big 'Restart' step.
ok... made it all the way thru... last set of comments... http://codereview.chromium.org/201070/diff/8002/8010 File webkit/appcache/appcache_update_job.cc (right): http://codereview.chromium.org/201070/diff/8002/8010#newcode514 Line 514: // Else, add url back to urls_to_fetch and call FetchUrls(). ah... i see... this method will maybe migrate the entry from the existing 'newest' cache to the 'inprogress' cache... excellent * the method name could use some wordsmithing... MaybeMigrateExistingEntry? * if/when push back into the fetch_queue, will need to be flagged so we don't get here again Hmmmmm... * we also want to ulilize the existing headers when composing the URLRequest... which could tie up lots of memory when these requeued things are working thru the queue again (but you could put on the near side of the fetchqueue to avoid that) * what keeps MaybeUpdateComplete from thinking we're done while we're in this state... maybe we have two queues along these lines... validate_queue_; fetch_queue_ * urls go thru both queues, first validate then if needed fetch. * the validation phase involves async read and examinination of existing headers to determine if we can migrate, and move on (no reading of existing response data involved, just copy response_id to new entry) * if (!migrated) { // prepare a URLRequest with IFMS conditional headers per the existing headers fetch_queue_.push_back(url, unstarted_request); ProcessUrls(); // will pop() and start next fetch and next validate } * instead of FetchUrls, we have ProcessUrls which pops both queues and StartValidation() or StartsFetch() (each of which increments urls_being_validated_ or urls_being_fetched_ accordingly). * MaybeUpdateComplete checks for emptiness of both queues and !urls_being_validated_ and !urls_being_fetched_ (when validation or fetch completes decrement the urls_being_munched_on_)
New snapshot uploaded. Added a few unittests. http://codereview.chromium.org/201070/diff/8002/8004 File webkit/appcache/appcache.h (right): http://codereview.chromium.org/201070/diff/8002/8004#newcode36 Line 36: explicit AppCache(AppCacheService *service, Manifest& manifest); On 2009/09/16 06:24:11, michaeln wrote: > with two ctor args, no need for explicit Ctor is gone. Added separate method to extract manifest data instead to allow one to create the cache, use the manifest for other purposes, then hand manifest data over to the cache. http://codereview.chromium.org/201070/diff/8002/8004#newcode36 Line 36: explicit AppCache(AppCacheService *service, Manifest& manifest); On 2009/09/16 06:24:11, michaeln wrote: > Manifest* instead of &, i think that's a google-style thing still, doesnt like > non-const references. This ctor is gone, but fixed this style violation in other areas. Constructor is gone, http://codereview.chromium.org/201070/diff/8002/8004#newcode63 Line 63: const std::set<AppCacheHost*>& associated_hosts() { On 2009/09/16 06:24:11, michaeln wrote: > the method can be const too... > associated_hosts() const { Done. http://codereview.chromium.org/201070/diff/8002/8008 File webkit/appcache/appcache_interfaces.h (right): http://codereview.chromium.org/201070/diff/8002/8008#newcode17 Line 17: static const std::string kManifestMimeType = "text/cache-manifest"; On 2009/09/16 06:24:11, michaeln wrote: > oops, please define this constant as a primitive char[]. > we avoid running ctors at DLL load time for performance. > you can look at chrome/common/url_constants.h .cc for how we do string > constants... or just look here :) > > // .h > extern const char kManifestMimeType[]; > > // .cc > const char kManifestMimeType[] = "text/cache-manifest"; > Done. http://codereview.chromium.org/201070/diff/8002/8012 File webkit/appcache/appcache_unittest.cc (right): http://codereview.chromium.org/201070/diff/8002/8012#newcode79 Line 79: EXPECT_EQ(2, manifest.explicit_urls.size()); // untouched On 2009/09/16 06:24:11, michaeln wrote: > also check to see that the manifest's namespaces are empty I ended up taking out this check. Don't care if manifest is mangled in a particular way or not. Only care that relevant data makes it into the cache. http://codereview.chromium.org/201070/diff/8002/8010 File webkit/appcache/appcache_update_job.cc (right): http://codereview.chromium.org/201070/diff/8002/8010#newcode100 Line 100: DCHECK(group_->update_job() == this); On 2009/09/16 06:24:11, michaeln wrote: > you could move this DCHECK to the top of the method body to get it out of the > way early Done. http://codereview.chromium.org/201070/diff/8002/8010#newcode127 Line 127: request->SetUserData(this, new UpdateJobInfo(fetch_type)); On 2009/09/16 16:23:57, michaeln wrote: > Please also set the BYPASS_INTERCEPT load flag for all requests initiated by the > UpdateJob. What that flag does is circumvent URLRequest::Interceptors, they > won't get called, the appcache interceptor as well as all other. In particular, > we want to bypass the Gears interceptor for these requests so we aren't loading > resources from Gears.LocalServer and putting them in AppCache. > > (note to self: ensure a potential new 'master entry' wasn't pulled from gears) Done. http://codereview.chromium.org/201070/diff/8002/8010#newcode136 Line 136: } else { On 2009/09/16 06:24:11, michaeln wrote: > for one-line blocks, you can leave the curly braces off Done. http://codereview.chromium.org/201070/diff/8002/8010#newcode183 Line 183: MessageLoop::current()->PostTask(FROM_HERE, NewRunnableMethod( On 2009/09/16 16:23:57, michaeln wrote: > See ScopedMethodFactory<T> to avoid derivation from RefCounted. Done. http://codereview.chromium.org/201070/diff/8002/8010#newcode203 Line 203: void AppCacheUpdateJob::OnResponseCompleted(URLRequest* request) { On 2009/09/16 06:24:11, michaeln wrote: > Hmmmm... there may be something we put into Gears that we should think about > here... look for "503" handling in here... > > http://cs/p?sa=N&cd=1&ct=rc#googleclient/gears/opensource/gears/localserver/c... Gears retries up to 3 times on 503. However it does not take the retry-after header into account. Saving this to revisit in a separate CL. http://codereview.chromium.org/201070/diff/8002/8010#newcode223 Line 223: void AppCacheUpdateJob::HandleManifestFetchCompleted(URLRequest* request) { On 2009/09/16 06:24:11, michaeln wrote: > I think you want to check .is_success() up top here to avoid accepting partial > responses into the cache (for all HandleCompleted calls). > > Here's the thing... if the connection drops after having received the headers, > but prior to fully receiving the response body, the response_code will be valid, > may even say 200. > Ah, did not know that response codes may be set if !is_success. Re-orged entire method. Did not use switch{} because it's convenient to check more than just response code in the if/else-if clauses. http://codereview.chromium.org/201070/diff/8002/8010#newcode229 Line 229: mime_type == kManifestMimeType) { On 2009/09/16 06:24:11, michaeln wrote: > do we get a Content-Type header for 304s? Don't know, but revised code no longer checks for the header on a 304. http://codereview.chromium.org/201070/diff/8002/8010#newcode255 Line 255: Manifest manifest; On 2009/09/16 06:24:11, michaeln wrote: > What do you think of checking the mimetype here? The tests in > HandleManifestFetchCompleted are all oriented towards "do we have a 200 OK > response we may need to think about". Once you get past that door, its time to > examine the material content of the reponse, including its content-type. > > wdyt? Better to check the mimetype before triggering a stoage load of the manifest data from the newest complete cache to avoid an unnecessary call to storage. http://codereview.chromium.org/201070/diff/8002/8010#newcode287 Line 287: void AppCacheUpdateJob::HandleUrlFetchCompleted(URLRequest* request) { On 2009/09/16 16:23:57, michaeln wrote: > gotta restructure this for if !is_success() handling. Done. http://codereview.chromium.org/201070/diff/8002/8010#newcode346 Line 346: inprogress_cache_->set_complete(true); On 2009/09/16 16:23:57, michaeln wrote: > i think we need to defer set_complete(true) until we've committed to storage, > ditto for the firing of CACHED or UPDATE_READY events. here's why... > > upon new version of application script being installed for offline use, the app > should feel free to begin upgrading other persistent storage schemas (Databases > and/or the structure of data in LocalStorage) in ways that the newly installed > application script can interpret... but these schema upgrade may involve > breaking changes when viewed by the old application script. > > so... don't signal DONE until the new application script is in the bank. And > definitely need to put set_complete(true) adjacent to the signaling of > DONE'ness, with no async operations in between. Added TODO notes. http://codereview.chromium.org/201070/diff/8002/8010#newcode350 Line 350: // TODO(jennb): write new group to storage here? On 2009/09/16 16:23:57, michaeln wrote: > we should have a storage api that gestures to store cache + newest group in one > fell swoop... a single transaction That'd be nice. Make sure it knows the difference between a brand new group vs an existing group that got a cache added. http://codereview.chromium.org/201070/diff/8002/8010#newcode459 Line 459: On 2009/09/16 16:23:57, michaeln wrote: > remove the extra blank line Done. http://codereview.chromium.org/201070/diff/8002/8010#newcode473 Line 473: // repeatedly. Order of fetching doesn't matter. On 2009/09/16 16:23:57, michaeln wrote: > use a std::deque :) Done. http://codereview.chromium.org/201070/diff/8002/8010#newcode481 Line 481: LoadFromNewestCache(url, entry); // load is async On 2009/09/16 16:23:57, michaeln wrote: > we just need meta-data here, cache-control headers... is that right? Yes. We'll need a storage api call to just load headers. http://codereview.chromium.org/201070/diff/8002/8010#newcode485 Line 485: request->SetUserData(this, new UpdateJobInfo(UpdateJobInfo::URL_FETCH)); On 2009/09/16 16:23:57, michaeln wrote: > BYPASS_INTERCEPTION Done. http://codereview.chromium.org/201070/diff/8002/8010#newcode514 Line 514: // Else, add url back to urls_to_fetch and call FetchUrls(). On 2009/09/16 17:10:56, michaeln wrote: > ah... i see... this method will maybe migrate the entry from the existing > 'newest' cache to the 'inprogress' cache... excellent > > * the method name could use some wordsmithing... MaybeMigrateExistingEntry? > > * if/when push back into the fetch_queue, will need to be flagged so we don't > get here again Good point. Added note to the comment. > > Hmmmmm... > > * we also want to ulilize the existing headers when composing the URLRequest... > which could tie up lots of memory when these requeued things are working thru > the queue again (but you could put on the near side of the fetchqueue to avoid > that) Don't know what 'near side of the fetchqueue' means. > > * what keeps MaybeUpdateComplete from thinking we're done while we're in this > state... maybe we have two queues along these lines... > > validate_queue_; > fetch_queue_ > > * urls go thru both queues, first validate then if needed fetch. > > * the validation phase involves async read and examinination of existing headers > to determine if we can migrate, and move on (no reading of existing response > data involved, just copy response_id to new entry) > > * if (!migrated) { > // prepare a URLRequest with IFMS conditional headers per the existing What is IFMS? > headers > fetch_queue_.push_back(url, unstarted_request); > ProcessUrls(); // will pop() and start next fetch and next validate > } > > * instead of FetchUrls, we have ProcessUrls which pops both queues and > StartValidation() or StartsFetch() (each of which increments > urls_being_validated_ or urls_being_fetched_ accordingly). > > * MaybeUpdateComplete checks for emptiness of both queues and > !urls_being_validated_ and !urls_being_fetched_ (when validation or fetch > completes decrement the urls_being_munched_on_) > Fixed by correcting check in MaybeUpdateComplete. http://codereview.chromium.org/201070/diff/8002/8010#newcode576 Line 576: MessageLoop::current()->DeleteSoon(FROM_HERE, this); On 2009/09/16 06:24:11, michaeln wrote: > definitely rename the method to DeleteSoon Done. http://codereview.chromium.org/201070/diff/8002/8011 File webkit/appcache/appcache_update_job.h (right): http://codereview.chromium.org/201070/diff/8002/8011#newcode24 Line 24: public base::RefCountedThreadSafe<AppCacheUpdateJob> { On 2009/09/16 16:23:57, michaeln wrote: > I was wondering why refcounted? You should look at ScopedMethodFactory. Done. http://codereview.chromium.org/201070/diff/8002/8011#newcode48 Line 48: typedef std::vector<base::WeakPtr<AppCacheHost> > PendingHosts; On 2009/09/16 16:23:57, michaeln wrote: > I think the observer model may be better suited for keeping track of these > AppCacheHosts (and eventually observing the main resource data as it gets fed > back to us from the client). > > So instead of WeakPtrs here, we observe OnDestructionImmiment(host), and pull > them from the collection at that time. See the CL i sent out for review > yesterday. > > Also, consider keeping this collection of PendindMasters in the Group itself > rather than the UpdateJob. > a. That is how the spec phrases the algoritm > b. Will more naturally survive the big 'Restart' step. Also saving for separate CL. 'Restart': if master entries have completed their download, how could they be re-included in a 'Restart'? There will be no further notifications about their download status. http://codereview.chromium.org/201070/diff/8002/8011#newcode50 Line 50: typedef std::map<GURL, AppCacheEntry> UrlFileList; On 2009/09/16 06:24:11, michaeln wrote: > we have a typedef for this, AppCache::EntryMap, the data member name should be > sufficient to describe its purpose Done. http://codereview.chromium.org/201070/diff/8002/8011#newcode63 Line 63: enum UpdateState { On 2009/09/16 06:24:11, michaeln wrote: > UpdateState is definitely an improvement. > > In looking at the .cc file, there is an UpdateStatus for the group and this > UpdateState type, its easy to lose track of which is which. Not sure this is a > good idea, but InternalUpdateState? > > Is the first state really unknown or is it FETCHING_MANIFEST? Changed to InternalUpdateState. Added states for FETCH_MANIFEST and REFETCH_MANIFEST. http://codereview.chromium.org/201070/diff/8002/8011#newcode83 Line 83: // Returns false if response data processed asynchronously, in which On 2009/09/16 06:24:11, michaeln wrote: > data _is_ processed Done. http://codereview.chromium.org/201070/diff/8002/8011#newcode108 Line 108: bool SkipUrlFetch(const AppCacheEntry& entry); On 2009/09/16 06:24:11, michaeln wrote: > maybe ShouldSkipUrlFetch? Done. http://codereview.chromium.org/201070/diff/8002/8011#newcode121 Line 121: // the update process. On 2009/09/16 06:24:11, michaeln wrote: > describe the return value in the doc comment Done. http://codereview.chromium.org/201070/diff/8002/8011#newcode133 Line 133: void UpdateFinished(); On 2009/09/16 06:24:11, michaeln wrote: > maybe DeleteSoon? Done. http://codereview.chromium.org/201070/diff/8002/8011#newcode152 Line 152: std::vector<GURL> urls_to_fetch_; On 2009/09/16 06:24:11, michaeln wrote: > can we use a std::deque here Ah, that's what I was looking for. Changed. I had found queue but that didn't support .clear() so went back to vector.
> Don't know what 'near side of the fetchqueue' means I mean push_front() vs push_back(), so the requeue'd guy with more storage allocated on its behalf cuts in line instead of going to the back of the line.
> Also saving for separate CL. Agreed! > 'Restart': if master entries have completed their download, how could they be > re-included in a 'Restart'? There will be no further notifications about their > download status. That state would also be held in the PendindMasters collection itself. Furthermore as pending master responses data is received, it's spool to responseStorage independent of the UpdateJob. (So, while we're fetching the manifest, the pending master is getting written to disk). The UpdateJob (or a restarted one) can check for completion at leisure, or sign up to observe when completion occurs, or poke the system to stop saving and delete the pending master entry on OBSOLETE or CACHE_FAILURE conditions where the resource wont be needed. This is for later ;)
> Gears retries up to 3 times on 503. However it does not take the > retry-after header into account. Yes, it only pays attention to 503 if retry-after == 0. There is also a larger 503 retry loop bracketing the entire UpdateTask, from the initial manifest fetch to the final manifest (refetch). > Saving this to revisit in a separate CL. Yes, please to put a "TODO(xxx): think about 503s..." here's a publicly accessible link to the relevant gears source file... http://code.google.com/p/gears/source/browse/trunk/gears/localserver/common/u...
here's a couple of comments outside of the core new class and its tests... i'm looking at those now... http://codereview.chromium.org/201070/diff/15/8017 File webkit/appcache/appcache.h (right): http://codereview.chromium.org/201070/diff/15/8017#newcode71 Line 71: void ExtractManifestData(Manifest* manifest); the directionality is not real clear... Is the caller asking the appcache to extract its state and provide a Manifest representation of that state? Or is the caller asking the appcache to InitializeWithManifest(Manifest*)? http://codereview.chromium.org/201070/diff/15/8027 File webkit/appcache/appcache_group_unittest.cc (right): http://codereview.chromium.org/201070/diff/15/8027#newcode150 Line 150: delete update; maybe EXPECT_EQ(NULL, group->update_job_) here http://codereview.chromium.org/201070/diff/15/8026 File webkit/appcache/appcache_unittest.cc (right): http://codereview.chromium.org/201070/diff/15/8026#newcode80 Line 80: EXPECT_TRUE(cache->online_whitelist_all_); it would be nice to check that these collections are actually not copied, we want them to be swap'd in, if that changes unexpectedly for some reason, we'd want to know... ala... EXPECT_TRUE(manfiest.fallback_namespaces.empty()); EXPECT_TRUE(online_whitelist_namespaces.empty());
http://codereview.chromium.org/201070/diff/8002/8010 File webkit/appcache/appcache_update_job.cc (right): http://codereview.chromium.org/201070/diff/8002/8010#newcode255 Line 255: Manifest manifest; Fair enough. http://codereview.chromium.org/201070/diff/8002/8010#newcode350 Line 350: // TODO(jennb): write new group to storage here? yup http://codereview.chromium.org/201070/diff/15/8024#newcode231 Line 231: ContinueHandleManifestFetchCompleted(request, true); this method reads much nicer... thank you! http://codereview.chromium.org/201070/diff/15/8024#newcode302 Line 302: // whose value doesn't match the manifest url of the application cache maybe rearrange the sequence of things here a little to put the cache/filelist population up front, followed by host associations and notifications, following by saying GO http://codereview.chromium.org/201070/diff/15/8024#newcode307 Line 307: if (update_state_ != CACHE_FAILURE) { i like these state checks http://codereview.chromium.org/201070/diff/15/8024#newcode394 Line 394: // about duplicate hosts being added to the notifier. nice http://codereview.chromium.org/201070/diff/15/8025 File webkit/appcache/appcache_update_job.h (right): http://codereview.chromium.org/201070/diff/15/8025#newcode27 Line 27: ScopedRunnableMethodFactory<AppCacheUpdateJob> method_factory_; move this down to the private section below instead of making a new private section up here
New snapshot uploaded. http://codereview.chromium.org/201070/diff/15/8017 File webkit/appcache/appcache.h (right): http://codereview.chromium.org/201070/diff/15/8017#newcode71 Line 71: void ExtractManifestData(Manifest* manifest); On 2009/09/16 22:24:08, michaeln wrote: > the directionality is not real clear... > > Is the caller asking the appcache to extract its state and provide a Manifest > representation of that state? > > Or is the caller asking the appcache to InitializeWithManifest(Manifest*)? Renamed. http://codereview.chromium.org/201070/diff/15/8027 File webkit/appcache/appcache_group_unittest.cc (right): http://codereview.chromium.org/201070/diff/15/8027#newcode150 Line 150: delete update; On 2009/09/16 22:24:08, michaeln wrote: > maybe EXPECT_EQ(NULL, group->update_job_) here Done. http://codereview.chromium.org/201070/diff/15/8026 File webkit/appcache/appcache_unittest.cc (right): http://codereview.chromium.org/201070/diff/15/8026#newcode80 Line 80: EXPECT_TRUE(cache->online_whitelist_all_); On 2009/09/16 22:24:08, michaeln wrote: > it would be nice to check that these collections are actually not copied, we > want them to be swap'd in, if that changes unexpectedly for some reason, we'd > want to know... ala... > > EXPECT_TRUE(manfiest.fallback_namespaces.empty()); > EXPECT_TRUE(online_whitelist_namespaces.empty()); > Done. http://codereview.chromium.org/201070/diff/15/8024 File webkit/appcache/appcache_update_job.cc (right): http://codereview.chromium.org/201070/diff/15/8024#newcode302 Line 302: inprogress_cache_->ExtractManifestData(&manifest); On 2009/09/16 22:58:52, michaeln wrote: > maybe rearrange the sequence of things here a little to put the cache/filelist > population up front, followed by host associations and notifications, following > by saying GO Done. http://codereview.chromium.org/201070/diff/15/8025 File webkit/appcache/appcache_update_job.h (right): http://codereview.chromium.org/201070/diff/15/8025#newcode27 Line 27: ScopedRunnableMethodFactory<AppCacheUpdateJob> method_factory_; On 2009/09/16 22:58:52, michaeln wrote: > move this down to the private section below instead of making a new private > section up here Done. I thought the example in task.h was instructing me to put it up top.
http://codereview.chromium.org/201070/diff/17001/18009 File webkit/appcache/appcache_update_job.cc (right): http://codereview.chromium.org/201070/diff/17001/18009#newcode139 Line 139: request->Start(); We'll need to hang on to a reference to this request in order to cancel it. http://codereview.chromium.org/201070/diff/17001/18009#newcode253 Line 253: group_->set_obsolete(true); I'm looking at this callsite, i think storage->MarkAsObsoleteAndDelete(group) would work here, then raise the events upon completion. http://codereview.chromium.org/201070/diff/17001/18009#newcode376 Line 376: // TODO(jennb): write new group to storage here I think storage->StoreGroupAndNewestCache() works here, then raise CACHED_EVENT/UPDATE_READY_EVENT upon completion.
On Tue, Sep 29, 2009 at 12:08 PM, <michaeln@chromium.org> wrote: > > http://codereview.chromium.org/201070/diff/17001/18009 > File webkit/appcache/appcache_update_job.cc (right): > > http://codereview.chromium.org/201070/diff/17001/18009#newcode139 > Line 139: request->Start(); > We'll need to hang on to a reference to this request in order to cancel > it. > Yep, already found this while working on unittests. > > http://codereview.chromium.org/201070/diff/17001/18009#newcode253 > Line 253: group_->set_obsolete(true); > I'm looking at this callsite, i think > storage->MarkAsObsoleteAndDelete(group) would work here, then raise the > events upon completion. > We don't want to delete the group here. We simply want to mark it obsolete. Group gets deleted when it is no longer in use. > http://codereview.chromium.org/201070/diff/17001/18009#newcode376 > Line 376: // TODO(jennb): write new group to storage here > I think storage->StoreGroupAndNewestCache() works here, then raise > CACHED_EVENT/UPDATE_READY_EVENT upon completion. > > Yes, storage in separate CL. I hope to have the unittests finished soon and will upload a new snapshot then. Jenn > http://codereview.chromium.org/201070 >
> > http://codereview.chromium.org/201070/diff/17001/18009#newcode253 > > Line 253: group_->set_obsolete(true); > > I'm looking at this callsite, i think > > storage->MarkAsObsoleteAndDelete(group) would work here, then raise the > > events upon completion. > > > > We don't want to delete the group here. We simply want to mark it obsolete. > Group gets deleted when it is no longer in use. That could work for me. When an obsolete 'item' falls out of use, then delete. We have to think thru marking as obsolete and then crashing prior to 'item' falling out of use. Upon restart, it should not appear as a valid thing. I think this is a small impedance mismatch we can work thru during respective code reviews.
New patch uploaded. Finally done with unittests. Added some code to simulate async behavior for now until storage code is added (separate CL). Also added data files for the HTTP test server.
Hi Jenn, I got distracted yesterday with a couple of things and didn't get you review comments. I'm digging into it now. While I'm digging, can you give me an update on couple issues left dangling from before and where we stand with them in the current snapshot... 1) Always run-till-completion vs not for unattended UpdateJobs. In particular at browser shutdown time. 2) State of affairs with pending master entries. How does the CL stub that out for now. http://codereview.chromium.org/201070/diff/24001/24006 File webkit/appcache/appcache_group.cc (right): http://codereview.chromium.org/201070/diff/24001/24006#newcode25 Line 25: service_->AddGroup(this); Depending on what we do about when to terminate unattended update jobs. This may not apply, but given the current run-till-completion semantics i think a DCHECK(!update_job_) would make sense here. http://codereview.chromium.org/201070/diff/24001/24013 File webkit/appcache/appcache_unittest.cc (right): http://codereview.chromium.org/201070/diff/24001/24013#newcode65 Line 65: FallbackNamespace(GURL("http://fb1.com"), GURL("http://fbone.com"))); nit: indent two more spaces (i think i fits) http://codereview.chromium.org/201070/diff/24001/24013#newcode74 Line 74: EXPECT_EQ(expected, fallbacks.size()); I think it'd be slightly better to put constant values in the EXPECT_EQ statements instead of the local 'expected' var.
Haven't look thru the unit tests yet, but here's a pass thru the job.cc and .h files. http://codereview.chromium.org/201070/diff/24001/24011 File webkit/appcache/appcache_update_job.cc (right): http://codereview.chromium.org/201070/diff/24001/24011#newcode35 Line 35: // TODO(jennb): need storage info to stream response data to storage I think an instance of AppCacheResponseWriter works here for that purpose. http://codereview.chromium.org/201070/diff/24001/24011#newcode86 Line 86: DCHECK(!manifest_url_request_); maybe also DCHECK(pending_url_fetches_.empty()), i think thats a valid assertion here http://codereview.chromium.org/201070/diff/24001/24011#newcode130 Line 130: DCHECK(!manifest_url_request_); Any chance that the manifest refetch can be initiated while in the process the completing of the initial manifest fetch (and this data member is not yet nulled out) in the empty manifest file case? http://codereview.chromium.org/201070/diff/24001/24011#newcode195 Line 195: // For now, schedule a task to continue reading to simulate async-ness. I think AppCacheResponseWriter.WriteData will work here. http://codereview.chromium.org/201070/diff/24001/24011#newcode260 Line 260: group_->set_obsolete(true); We need to involve storage in some form here, once a group/cache appears as 'obsolete' to webapp logic, it shouldn't later (say after browser crash/restart) appear as unobsoleted. http://codereview.chromium.org/201070/diff/24001/24011#newcode323 Line 323: // TODO(jennb): associate storage with the new entry Not for this CL... I think at this point the response data should have been written to AppCacheResponseWriter.WriteData, with maybe the last write still pending completion. We will still need to write the response info to storage and finalize the response, see AppCacheResponseWriter.WriteInfo and Finalize. Maybe that gets initiated here? http://codereview.chromium.org/201070/diff/24001/24011#newcode356 Line 356: // For now, post message to copy entry data We can avoid actually copying the bytes. All that is needed here is to add an entry to the 'in_progress_cache' that has the same 'response_id' as the corresponding entry from the 'newest_complete_cache'. So no async diskIO steps involved. http://codereview.chromium.org/201070/diff/24001/24011#newcode523 Line 523: bool storage_checked = urls_to_fetch_.front().second; i think popping then repushing back on the front on behalf of the MaybeLoadNewestCache case works, but its a little subtle and delicate... fragile For example, the repushing that can be done as a consequence of MaybeLoadFromNewestCache MUST be async for this to work, it cannot happen in the nested method call, otherwise the call to .pop_front() below will eat the requeued item. http://codereview.chromium.org/201070/diff/24001/24011#newcode564 Line 564: // TODO(jennb): load HTTP headers for copy_me and wait for callback I think storage.LoadResponseInfo will work here. http://codereview.chromium.org/201070/diff/24001/24011#newcode665 Line 665: // TODO(jennb): code other cancel cleanup do we need to iterate over pending_url_requests_ and .Cancel() delete. I think it may be sufficient to just delete them w/o calling cancel... STLUtil macros are handy for that. http://codereview.chromium.org/201070/diff/24001/24012 File webkit/appcache/appcache_update_job.h (right): http://codereview.chromium.org/201070/diff/24001/24012#newcode32 Line 32: // Methods for URLRequest::Delegate. maybe put these in the private section, so the public API exposed by this class is more clearly prominent in the .h file... StartUpdate() and CancelUpdate() http://codereview.chromium.org/201070/diff/24001/24012#newcode45 Line 45: static void RetryUpdate(); // TODO(jennb): flesh out params It may be premature to have this static method for this purpose. We need to get to it (to be sure). Unclear if this is the right API to trigger that behavior. I expect it will need to be 'scoped' to a particular service in some way (so as NOT to happen should the profile/service get disposed of after a retry is scheduled). For now, I'd suggest just a TODO comment in the .cc file where we expect to gesture 'RetrySoon', and leave it at that. http://codereview.chromium.org/201070/diff/24001/24012#newcode47 Line 47: void CancelUpdate(); Cool, where is this called from (if anywhere yet)? http://codereview.chromium.org/201070/diff/24001/24012#newcode54 Line 54: // in different tabs. similar to retry impl details, i think some of this pending master support may be premature http://codereview.chromium.org/201070/diff/24001/24012#newcode55 Line 55: // TODO(jennb): detect when hosts are deleted i think this particular TODO should be addressed in this CL in some form, otherwise we can crash... I think OnDestructionImminent notification works for this purpose. http://codereview.chromium.org/201070/diff/24001/24012#newcode59 Line 59: typedef std::pair<GURL, bool> UrlsToFetch; // flag TRUE if storage checked UrlToFetch, singular http://codereview.chromium.org/201070/diff/24001/24012#newcode70 Line 70: // when their download asynchronously completes. The comment is true but seems misplaced as a doc comment on this enum. I think the name and enum values are self-documenting enough that you don't need a doc comment here at all. The particular comment you have would be more appropriate as a code comment around the code that handle pending master entry completion.
No new snapshot yet. Just replying. On Thu, Oct 1, 2009 at 11:37 AM, <michaeln@chromium.org> wrote: > Hi Jenn, I got distracted yesterday with a couple of things and didn't get > you > review comments. I'm digging into it now. While I'm digging, can you give > me an > update on couple issues left dangling from before and where we stand with > them > in the current snapshot... > > 1) Always run-till-completion vs not for unattended UpdateJobs. In > particular at > browser shutdown time. > > There is a Cancel method stubbed out, but we'll have to solve this in a separate CL. > 2) State of affairs with pending master entries. How does the CL stub that > out > for now. > > Pending master entries are ignored in this CL. Will be addressed in a separate CL. > > http://codereview.chromium.org/201070/diff/24001/24006 > File webkit/appcache/appcache_group.cc (right): > > http://codereview.chromium.org/201070/diff/24001/24006#newcode25 > Line 25: service_->AddGroup(this); > Depending on what we do about when to terminate unattended update jobs. > This may not apply, but given the current run-till-completion semantics > i think a DCHECK(!update_job_) would make sense here. > I'm assuming you mean in the dtor? > > http://codereview.chromium.org/201070/diff/24001/24013 > File webkit/appcache/appcache_unittest.cc (right): > > http://codereview.chromium.org/201070/diff/24001/24013#newcode65 > Line 65: FallbackNamespace(GURL("http://fb1.com"), > GURL("http://fbone.com"))); > nit: indent two more spaces (i think i fits) > > Done. > http://codereview.chromium.org/201070/diff/24001/24013#newcode74 > Line 74: EXPECT_EQ(expected, fallbacks.size()); > I think it'd be slightly better to put constant values in the EXPECT_EQ > statements instead of the local 'expected' var. > > > gcc doesn't like constants unless I cast them to size_t. To avoid a bunch of casts, I followed what other code does by using a size_t var. Jenn > http://codereview.chromium.org/201070 >
Still no new snapshot yet. Just replying again for now. On Thu, Oct 1, 2009 at 2:19 PM, <michaeln@chromium.org> wrote: > Haven't look thru the unit tests yet, but here's a pass thru the job.cc and > .h > files. > > > http://codereview.chromium.org/201070/diff/24001/24011 > File webkit/appcache/appcache_update_job.cc (right): > > http://codereview.chromium.org/201070/diff/24001/24011#newcode35 > Line 35: // TODO(jennb): need storage info to stream response data to > storage > I think an instance of AppCacheResponseWriter works here for that > purpose. > > http://codereview.chromium.org/201070/diff/24001/24011#newcode86 > Line 86: DCHECK(!manifest_url_request_); > maybe also DCHECK(pending_url_fetches_.empty()), i think thats a valid > assertion here > > Done. > http://codereview.chromium.org/201070/diff/24001/24011#newcode130 > Line 130: DCHECK(!manifest_url_request_); > Any chance that the manifest refetch can be initiated while in the > process the completing of the initial manifest fetch (and this data > member is not yet nulled out) in the empty manifest file case? > > Not possible. Initial manifest fetch must complete before the manifest is refetched. > http://codereview.chromium.org/201070/diff/24001/24011#newcode195 > Line 195: // For now, schedule a task to continue reading to simulate > async-ness. > I think AppCacheResponseWriter.WriteData will work here. > > http://codereview.chromium.org/201070/diff/24001/24011#newcode260 > Line 260: group_->set_obsolete(true); > We need to involve storage in some form here, once a group/cache appears > as 'obsolete' to webapp logic, it shouldn't later (say after browser > crash/restart) appear as unobsoleted. > We could have something similar to the MarkAsForeign api. Though if a group is obsolete, but we lose that info on restart, it will be detected again when anything is loaded from that group's appcache as that will trigger another update. > > http://codereview.chromium.org/201070/diff/24001/24011#newcode323 > Line 323: // TODO(jennb): associate storage with the new entry > Not for this CL... > > I think at this point the response data should have been written to > AppCacheResponseWriter.WriteData, with maybe the last write still > pending completion. We will still need to write the response info to > storage and finalize the response, see AppCacheResponseWriter.WriteInfo > and Finalize. > > Maybe that gets initiated here? > Could be here, or could be upon first write as soon as we know the response_id (that's the same as response_storage_id?). That way if the url fetch does not succeed, we have the id in the entry to clean up any stored data when we discard the entry. > > http://codereview.chromium.org/201070/diff/24001/24011#newcode356 > Line 356: // For now, post message to copy entry data > We can avoid actually copying the bytes. All that is needed here is to > add an entry to the 'in_progress_cache' that has the same 'response_id' > as the corresponding entry from the 'newest_complete_cache'. So no async > diskIO steps involved. > Doh! That's how I had it originally and then got overzealous with async simulations. Changed back. > > http://codereview.chromium.org/201070/diff/24001/24011#newcode523 > Line 523: bool storage_checked = urls_to_fetch_.front().second; > i think popping then repushing back on the front on behalf of the > MaybeLoadNewestCache case works, but its a little subtle and delicate... > fragile > > For example, the repushing that can be done as a consequence of > MaybeLoadFromNewestCache MUST be async for this to work, it cannot > happen in the nested method call, otherwise the call to .pop_front() > below will eat the requeued item. > Moved call to .pop_front() immediately after accessing front element. That makes this method reentrant. I realized the re-entrancy would be an issue when I saw your storage CL comments saying some things would be called back immediately, rather than async, if already loaded. Let's leave this repushing for now, and revisit as needed when storage code is added and we can more thoroughly test this scenario. > http://codereview.chromium.org/201070/diff/24001/24011#newcode564 > Line 564: // TODO(jennb): load HTTP headers for copy_me and wait for > callback > I think storage.LoadResponseInfo will work here. > > http://codereview.chromium.org/201070/diff/24001/24011#newcode665 > Line 665: // TODO(jennb): code other cancel cleanup > do we need to iterate over pending_url_requests_ and .Cancel() delete. I > think it may be sufficient to just delete them w/o calling cancel... > STLUtil macros are handy for that. > Yes, but I did not flesh out the Cancel method because I was trying to keep all cancel logic for a separate CL, except for the barebones that I needed for a couple unittests in this CL. There's plenty more that Cancel needs to do. > > http://codereview.chromium.org/201070/diff/24001/24012 > File webkit/appcache/appcache_update_job.h (right): > > http://codereview.chromium.org/201070/diff/24001/24012#newcode32 > Line 32: // Methods for URLRequest::Delegate. > maybe put these in the private section, so the public API exposed by > this class is more clearly prominent in the .h file... StartUpdate() and > CancelUpdate() > Don't they have to be public for URLRequest to call them? I don't want too many friend classes. Also seems reasonable to have this as part of this class API as this class subclasses URLRequest::Delegate. > > http://codereview.chromium.org/201070/diff/24001/24012#newcode45 > Line 45: static void RetryUpdate(); // TODO(jennb): flesh out params > It may be premature to have this static method for this purpose. We need > to get to it (to be sure). Unclear if this is the right API to trigger > that behavior. I expect it will need to be 'scoped' to a particular > service in some way (so as NOT to happen should the profile/service get > disposed of after a retry is scheduled). > > For now, I'd suggest just a TODO comment in the .cc file where we expect > to gesture 'RetrySoon', and leave it at that. > That was essentially my placeholder TODO, but I took it out for now to avoid getting stuck thinking static method when coming back to this. I'm thinking the delayed task will be aimed at the group's RetryUpdate method, and possibly store the task in the group so the group can cancel it if another update is triggered before the retry fires. > http://codereview.chromium.org/201070/diff/24001/24012#newcode47 > Line 47: void CancelUpdate(); > Cool, where is this called from (if anywhere yet)? > Only added because I needed it for a couple unittests for now. > > http://codereview.chromium.org/201070/diff/24001/24012#newcode54 > Line 54: // in different tabs. > similar to retry impl details, i think some of this pending master > support may be premature > All pending master support will be revisited in a separate CL. I'm going to leave this in there for now so that I don't lose the logic for checking them to know when an update is done. They are essentially placeholders for now in the code to remind me where they fit in the update algorithm. > > http://codereview.chromium.org/201070/diff/24001/24012#newcode55 > Line 55: // TODO(jennb): detect when hosts are deleted > i think this particular TODO should be addressed in this CL in some > form, otherwise we can crash... I think OnDestructionImminent > notification works for this purpose. > We won't crash because no pending master entries are ever inserted in this CL, and thus none of these hosts are ever accessed in the code. I prefer to wait to address them in the separate CL where I figure out pending master entries as that would obsolete any OnDestructionImminent code I add in this CL. Plus I can't test OnDestructionImminent logic added in this CL without any pending master entries. > > http://codereview.chromium.org/201070/diff/24001/24012#newcode59 > Line 59: typedef std::pair<GURL, bool> UrlsToFetch; // flag TRUE if > storage checked > UrlToFetch, singular > Done. > > http://codereview.chromium.org/201070/diff/24001/24012#newcode70 > Line 70: // when their download asynchronously completes. > The comment is true but seems misplaced as a doc comment on this enum. I > think the name and enum values are self-documenting enough that you > don't need a doc comment here at all. The particular comment you have > would be more appropriate as a code comment around the code that handle > pending master entry completion. > > Deleted. Jenn > > http://codereview.chromium.org/201070 >
Haven't made it all the way thru the unit tests yet... i like the general shape of the test harness. http://codereview.chromium.org/201070/diff/24001/24015 File webkit/appcache/appcache_update_job_unittest.cc (right): http://codereview.chromium.org/201070/diff/24001/24015#newcode103 Line 103: io_thread.StartWithOptions(options); Would it make sense to start one io_thread for class AppCacheUpdateJobTest and to resuse that thread for the individual tests? http://codereview.chromium.org/201070/diff/24001/24015#newcode140 Line 140: WaitForUpdateToFinish(); Maybe 'delete update' can imply Cancel and we don't actually need a Cancel method. Not sure if thats a good idea or not. Is there anything about Cancel() that forces it to be async? This may be a better idea... Maybe Cancel() should be guaranteed to be syncrhonous. So while remains a programming error do delete a running task (DCHECK's in dtor apply), there is recourse... when needed a caller can... update.Cancel(); delete update; ... at will A lot of care has been taken in the existing URLRequest interface and in the StorageAPIs out for review to make killing async operations unscary from the callers perspective. In the StorageAPI... appcache_storage.CancelRequests(delegate); // kills all storage requests delete appcache_reader; delete appcache_writer; delete appcache_storage; ... are all safe things to do http://codereview.chromium.org/201070/diff/24001/24015#newcode276 Line 276: http_server_ = HTTPTestServer::CreateServer(kDocRoot, NULL); Is it expensive to spin up an HTTPTestServer? If so it may make sense to spin one test server up for the class and resuse for the individual tests (similar to reusing the iothread but probably more bang for the buck in this case). http://codereview.chromium.org/201070/diff/24001/24015#newcode825 Line 825: &AppCacheUpdateJobTest::WaitForUpdateToFinish)); This will spin furiously until the job unwinds. Some kind of UpdateJob Observer or Delegate callback mechanism may be better suited for this. And likely for production code, so the 'service' can Cancel things if need be.
>> 1) Always run-till-completion vs not for unattended UpdateJobs. In >> particular at browser shutdown time. > > There is a Cancel method stubbed out, but we'll have to solve this in a > separate CL. One follow up question on this, how far will an UpdateJob get before error'ing out in the current snapshot when run in the context of chrome? I'm wonder how big the window is once a UpdateJob starts till it goes away on its own accord. >> 2) State of affairs with pending master entries. How does the CL stub that >> out for now. > > Pending master entries are ignored in this CL. Will be addressed in a > separate CL. Ah, I missed that nothing got inserted into these collections for now. Yea, that works for me. >> http://codereview.chromium.org/201070/diff/24001/24011#newcode260 >> Line 260: group_->set_obsolete(true); >> We need to involve storage in some form here, once a group/cache appears >> as 'obsolete' to webapp logic, it shouldn't later (say after browser >> crash/restart) appear as unobsoleted. > > We could have something similar to the MarkAsForeign api. Though if a group > is obsolete, but we lose that info on restart, it will be detected again > when anything is loaded from that group's appcache as that will trigger > another update. I'm modifying the StorageAPIs in light of this a bit. No new snapshot uploaded but... i'm thinking we have DoomGroup(group) to schedule deletion upon no longer in use (regardless of being obsolete), and MarkGroupAsObsolete(group, callback) to do the persistent book-keeping for the case i'm concerned about. Its true that upon restart and a future run of a UpdateJob, we'll notice that its obsolete again and clean things up. But once we inform webapp code that the thing is obsolete, the webapp may take actions that have side-effects outside of the AppCache (update local Databases or LocalStorage or send a message to its server). Upon waking up later, it may see an inconsistent view of its world if the group is not obsolete and some of these other things indicate that it is. Taking care here does not completely solve that class of problem, but its one less source of a webapp getting into an inconsistent state. >> http://codereview.chromium.org/201070/diff/24001/24011#newcode260 >> Line 260: group_->set_obsolete(true); >> We need to involve storage in some form here, once a group/cache appears >> as 'obsolete' to webapp logic, it shouldn't later (say after browser >> crash/restart) appear as unobsoleted. > > Could be here, or could be upon first write as soon as we know the > response_id (that's the same as response_storage_id?). That way if the url > fetch does not succeed, we have the id in the entry to clean up any stored > data when we discard the entry. Yes, response_id == response_storage_id in these discussion. I think i've got a handle on cleaning up if the fetch doesn't succeed case (and other faiure mode cases too). A newly written entry will be automatically 'Doomed' unless it is 'Finalized', no other action required by the caller. See AppCacheResponseWriter.Finalize(). > Let's leave this repushing for now, and revisit as needed when storage code > is added and we can more thoroughly test this scenario. That sounds good, we should trade notes on corner of the appcache code we think should be revisited at some point, but I don't want to slow down our forward progress on building things out for house cleaning tasks.
New snapshot uploaded. Hopefully, all the try bots will be happy now. On Thu, Oct 1, 2009 at 4:55 PM, <michaeln@chromium.org> wrote: > Haven't made it all the way thru the unit tests yet... i like the general > shape > of the test harness. > > > http://codereview.chromium.org/201070/diff/24001/24015 > File webkit/appcache/appcache_update_job_unittest.cc (right): > > http://codereview.chromium.org/201070/diff/24001/24015#newcode103 > Line 103: io_thread.StartWithOptions(options); > Would it make sense to start one io_thread for class > AppCacheUpdateJobTest and to resuse that thread for the individual > tests? > Switched to one IO thread for all tests. > > http://codereview.chromium.org/201070/diff/24001/24015#newcode140 > Line 140: WaitForUpdateToFinish(); > Maybe 'delete update' can imply Cancel and we don't actually need a > Cancel method. Not sure if thats a good idea or not. Is there anything > about Cancel() that forces it to be async? > > This may be a better idea... > > Maybe Cancel() should be guaranteed to be syncrhonous. So while remains > a programming error do delete a running task (DCHECK's in dtor apply), > there is recourse... when needed a caller can... > update.Cancel(); > delete update; > ... at will > > A lot of care has been taken in the existing URLRequest interface and in > the StorageAPIs out for review to make killing async operations unscary > from the callers perspective. In the StorageAPI... > appcache_storage.CancelRequests(delegate); // kills all storage > requests > delete appcache_reader; > delete appcache_writer; > delete appcache_storage; > ... are all safe things to do > Delete now implies cancel. > > http://codereview.chromium.org/201070/diff/24001/24015#newcode276 > Line 276: http_server_ = HTTPTestServer::CreateServer(kDocRoot, NULL); > Is it expensive to spin up an HTTPTestServer? If so it may make sense to > spin one test server up for the class and resuse for the individual > tests (similar to reusing the iothread but probably more bang for the > buck in this case). > Other tests fire up one http server per test, but I switched to using one http server for all tests as we have so many. > http://codereview.chromium.org/201070/diff/24001/24015#newcode825 > Line 825: &AppCacheUpdateJobTest::WaitForUpdateToFinish)); > This will spin furiously until the job unwinds. Some kind of UpdateJob > Observer or Delegate callback mechanism may be better suited for this. > And likely for production code, so the 'service' can Cancel things if > need be. > > Added observer on AppCacheGroup to be informed when the update job is finished. On Fri, Oct 2, 2009 at 12:58 PM, <michaeln@chromium.org> wrote: > 1) Always run-till-completion vs not for unattended UpdateJobs. In >>> particular at browser shutdown time. >>> >> > There is a Cancel method stubbed out, but we'll have to solve this in a >> separate CL. >> > > One follow up question on this, how far will an UpdateJob get before > error'ing > out in the current snapshot when run in the context of chrome? I'm wonder > how > big the window is once a UpdateJob starts till it goes away on its own > accord. An update job will error out if (in increasing amount of progress): - manifest cannot be fetched - manifest fails to parse - a URL fetch fails for an explicit or fallback entry - manifest refetch fails (or does not return the same data as we don't have retry logic yet) Jenn
This is so close. > An update job will error out if (in increasing amount of progress): > - manifest cannot be fetched > - manifest fails to parse > - a URL fetch fails for an explicit or fallback entry > - manifest refetch fails (or does not return the same data as we don't have > retry logic yet) So they'll run till they've retrieved all of the resource in the manifest. This probably has some pretty bad consequence on browser exit prior to having completed. I'm a little uncomfortable with checking in with that. Here's something i think works for this initial checkin... * Make group->StartUpdate (and family) actually not do anything for now, just return. * Disable the TEST(AppCacheGroupTest, StartUpdate) for now. The real job unittests don't call group->StartUpdate (and family), so they're unaffected by this, but group->StartUpdate continues to be a noop when invoked in Chrome or the test_shell browser. >> Moving URLRequest::Delegate methods to the private section > Don't they have to be public for URLRequest to call them? I don't want too > many friend classes. Also seems reasonable to have this as part of this > class API as this class subclasses URLRequest::Delegate. The don't have to be public for URLRequest to call them, URLRequest doesn't know about AppCacheUpdateJob at compile time. It's a fairly common technique to 'hide' methods like this (delegate callbacks that are part of the impl, but are not really part of the API exposed by the class itself). http://codereview.chromium.org/201070/diff/26006/26011 File webkit/appcache/appcache_group.cc (right): http://codereview.chromium.org/201070/diff/26006/26011#newcode86 Line 86: AppCacheHost* host, const GURL& new_master_resource) { simply return here w/o starting a job until we have run-till-completion &| cancel on shutdown worked out http://codereview.chromium.org/201070/diff/26006/26011#newcode93 Line 93: void AppCacheGroup::SetUpdateStatus(UpdateStatus status) { if (status == update_status_) return; ... to avoid notifying complete when nothing really changed http://codereview.chromium.org/201070/diff/26006/26019 File webkit/appcache/appcache_group_unittest.cc (right): http://codereview.chromium.org/201070/diff/26006/26019#newcode134 Line 134: TEST(AppCacheGroupTest, StartUpdate) { TEST(AppCacheGroupTest, DISABLED_StartUpdate) until we have run-till-completion &| cancel on shutdown handled in some way. http://codereview.chromium.org/201070/diff/26006/26016 File webkit/appcache/appcache_update_job.cc (right): http://codereview.chromium.org/201070/diff/26006/26016#newcode668 Line 668: MessageLoop::current()->DeleteSoon(FROM_HERE, this); I think we have to revisit the delete/cancel semantics. There are some races in here. The class calls DeleteSoon on itself which makes deleting instances, or even hanging on to instance from the outside, from the outside a dangerous thing. A TODO works for me for the initial checking. http://codereview.chromium.org/201070/diff/26006/26020 File webkit/appcache/appcache_update_job_unittest.cc (right): http://codereview.chromium.org/201070/diff/26006/26020#newcode95 Line 95: URLRequest::RegisterProtocolFactory("http", NULL); I think these are getting called on the 'main' thread and not on the seperate IO thread you've setup to do the bulk of the work... is that right? It would be good to perform this cleanup on the IO thread itself, so the appcache library is only getting called from a single thread. I think that can be accomplished by doing the cleanup in UpdateFinished() just prior calling event_->Signal(). Also, would be good to cleanup the 'service_' along with the other objects. http://codereview.chromium.org/201070/diff/26006/26020#newcode98 Line 98: static void SetUpTestCase() { Nice... its not real clear from looking at gtest.h that this is run once for all 'fixtures' instead of once per each TEST_F. Would you add a comment saying that here? http://codereview.chromium.org/201070/diff/26006/26020#newcode355 Line 355: AppCacheHost* host2 = new AppCacheHost(2, frontend2, service_.get()); In reviewing these cases, in addition to examing what behavior is being tested, i'm also looking for how these 'host' and 'frontend' objects are cleaned up. That latter thing is kind of distracting compared to the former thing. I see their added to the hosts_ or frontends_ collection down below which get cleaned on on test completion, but i keep having to confirm that for each case. Maybe a way to make the harness hide those bookeeping details could help with readability. You have some helpers for CreateService and MakeCacheForGroup that hide some mundane tasks (which is good). It may make sense to have some helpers to MakeMockFrontend and MakeAppCacheHost that take care of adding the resulting object to the collections for cleanup. The test cases themselves would remain more focused on what artifacts of 'starting' an update with the inputs provided should have. http://codereview.chromium.org/201070/diff/26006/26020#newcode846 Line 846: group_->AddObserver(this); what if the update is already finished? if (group_->update_status() == IDLE) UpdateFinished(); else group_->AddObserver(this); http://codereview.chromium.org/201070/diff/26006/26020#newcode852 Line 852: // Finish up outside of observer callback so that group can be deleted. It looks should be safe to release the group from within this observer callback. At this point, we're in the UpdateJob dtor and the UpdateJob still has its reference to the group. Update return the UpdateJob's ref to the group will be released. Does something go awry if UpdateFinished is called directly, or do you just want to unwind the stack for other reasons? I think its fine to unwind the stack, just curious. http://codereview.chromium.org/201070/diff/26006/26020#newcode862 Line 862: VerifyExpectations(); Invoke cleanup on IO thread here. http://codereview.chromium.org/201070/diff/26006/26020#newcode867 Line 867: void CreateService() { maybe MakeService for consistency with Make<OtherThings> naming
No new snapshot yet. Just replying. On Mon, Oct 5, 2009 at 12:16 PM, <michaeln@chromium.org> wrote: > Moving URLRequest::Delegate methods to the private section >>> >> > Don't they have to be public for URLRequest to call them? I don't want >> too >> many friend classes. Also seems reasonable to have this as part of this >> class API as this class subclasses URLRequest::Delegate. >> > > The don't have to be public for URLRequest to call them, URLRequest doesn't > know > about AppCacheUpdateJob at compile time. It's a fairly common technique to > 'hide' methods like this (delegate callbacks that are part of the impl, but > are > not really part of the API exposed by the class itself). > > > Done. > http://codereview.chromium.org/201070/diff/26006/26011 > File webkit/appcache/appcache_group.cc (right): > > http://codereview.chromium.org/201070/diff/26006/26011#newcode93 > Line 93: void AppCacheGroup::SetUpdateStatus(UpdateStatus status) { > if (status == update_status_) > return; > > ... to avoid notifying complete when nothing really changed > Done. > > http://codereview.chromium.org/201070/diff/26006/26020 > File webkit/appcache/appcache_update_job_unittest.cc (right): > > http://codereview.chromium.org/201070/diff/26006/26020#newcode846 > Line 846: group_->AddObserver(this); > what if the update is already finished? > > if (group_->update_status() == IDLE) > UpdateFinished(); > else > group_->AddObserver(this); > > Done. > http://codereview.chromium.org/201070/diff/26006/26020#newcode852 > Line 852: // Finish up outside of observer callback so that group can be > deleted. > It looks should be safe to release the group from within this observer > callback. At this point, we're in the UpdateJob dtor and the UpdateJob > still has its reference to the group. Update return the UpdateJob's ref > to the group will be released. > > Does something go awry if UpdateFinished is called directly, or do you > just want to unwind the stack for other reasons? > > I think its fine to unwind the stack, just curious. > > > Things can go awry as the AppCacheUpdateJobTest dtor runs in the UI thread. If I signal the UI thread to continue while in the observer callback, the group still exists, which causes the AppCacheUpdateJobTest dtor to fail the DCHECK when AppCacheService dtor is called. Jenn http://codereview.chromium.org/201070 >
> Things can go awry as the AppCacheUpdateJobTest dtor runs in the UI thread. > If I signal the UI thread to continue while in the observer callback, the > group still exists, which causes the AppCacheUpdateJobTest dtor to fail the > DCHECK when AppCacheService dtor is called. I see... i had some comments about that UI thread vs IO thread arrangement as well.
New snapshot uploaded. - AppCacheGroup::StartUpdate family now does not do actually start any updates. Commented out with a TODO. - AppCacheGroupTest::StartUpdate test body commented out. Adding DISABLED_ prefix voids the FRIEND_TEST for this test, resulting in compile errors. Easiest to just comment out test body. - Added TODO for DeleteSoon. - Test cleans up on IO thread. - Added helpers to create host and mock frontend in tests. - CreateService renamed to MakeService. Jenn On Mon, Oct 5, 2009 at 1:54 PM, <michaeln@chromium.org> wrote: > This is so close. > > An update job will error out if (in increasing amount of progress): >> - manifest cannot be fetched >> - manifest fails to parse >> - a URL fetch fails for an explicit or fallback entry >> - manifest refetch fails (or does not return the same data as we don't >> have >> retry logic yet) >> > > So they'll run till they've retrieved all of the resource in the manifest. > This > probably has some pretty bad consequence on browser exit prior to having > completed. I'm a little uncomfortable with checking in with that. > > Here's something i think works for this initial checkin... > > * Make group->StartUpdate (and family) actually not do anything for now, > just > return. > * Disable the TEST(AppCacheGroupTest, StartUpdate) for now. > > The real job unittests don't call group->StartUpdate (and family), so > they're > unaffected by this, but group->StartUpdate continues to be a noop when > invoked > in Chrome or the test_shell browser. > > Moving URLRequest::Delegate methods to the private section >>> >> Don't they have to be public for URLRequest to call them? I don't want >> too >> many friend classes. Also seems reasonable to have this as part of this >> class API as this class subclasses URLRequest::Delegate. >> > > The don't have to be public for URLRequest to call them, URLRequest doesn't > know > about AppCacheUpdateJob at compile time. It's a fairly common technique to > 'hide' methods like this (delegate callbacks that are part of the impl, but > are > not really part of the API exposed by the class itself). > > > > http://codereview.chromium.org/201070/diff/26006/26011 > File webkit/appcache/appcache_group.cc (right): > > http://codereview.chromium.org/201070/diff/26006/26011#newcode86 > Line 86: AppCacheHost* host, const GURL& new_master_resource) { > simply return here w/o starting a job until we have run-till-completion > &| cancel on shutdown worked out > > http://codereview.chromium.org/201070/diff/26006/26011#newcode93 > Line 93: void AppCacheGroup::SetUpdateStatus(UpdateStatus status) { > if (status == update_status_) > return; > > ... to avoid notifying complete when nothing really changed > > http://codereview.chromium.org/201070/diff/26006/26019 > File webkit/appcache/appcache_group_unittest.cc (right): > > http://codereview.chromium.org/201070/diff/26006/26019#newcode134 > Line 134: TEST(AppCacheGroupTest, StartUpdate) { > TEST(AppCacheGroupTest, DISABLED_StartUpdate) > > until we have run-till-completion &| cancel on shutdown handled in some > way. > > http://codereview.chromium.org/201070/diff/26006/26016 > File webkit/appcache/appcache_update_job.cc (right): > > http://codereview.chromium.org/201070/diff/26006/26016#newcode668 > Line 668: MessageLoop::current()->DeleteSoon(FROM_HERE, this); > I think we have to revisit the delete/cancel semantics. There are some > races in here. The class calls DeleteSoon on itself which makes deleting > instances, or even hanging on to instance from the outside, from the > outside a dangerous thing. > > A TODO works for me for the initial checking. > > http://codereview.chromium.org/201070/diff/26006/26020 > File webkit/appcache/appcache_update_job_unittest.cc (right): > > http://codereview.chromium.org/201070/diff/26006/26020#newcode95 > Line 95: URLRequest::RegisterProtocolFactory("http", NULL); > I think these are getting called on the 'main' thread and not on the > seperate IO thread you've setup to do the bulk of the work... is that > right? > > It would be good to perform this cleanup on the IO thread itself, so the > appcache library is only getting called from a single thread. > > I think that can be accomplished by doing the cleanup in > UpdateFinished() just prior calling event_->Signal(). > > Also, would be good to cleanup the 'service_' along with the other > objects. > > http://codereview.chromium.org/201070/diff/26006/26020#newcode98 > Line 98: static void SetUpTestCase() { > Nice... its not real clear from looking at gtest.h that this is run once > for all 'fixtures' instead of once per each TEST_F. Would you add a > comment saying that here? > > http://codereview.chromium.org/201070/diff/26006/26020#newcode355 > Line 355: AppCacheHost* host2 = new AppCacheHost(2, frontend2, > service_.get()); > In reviewing these cases, in addition to examing what behavior is being > tested, i'm also looking for how these 'host' and 'frontend' objects are > cleaned up. > > That latter thing is kind of distracting compared to the former thing. I > see their added to the hosts_ or frontends_ collection down below which > get cleaned on on test completion, but i keep having to confirm that for > each case. > > Maybe a way to make the harness hide those bookeeping details could help > with readability. You have some helpers for CreateService and > MakeCacheForGroup that hide some mundane tasks (which is good). It may > make sense to have some helpers to MakeMockFrontend and MakeAppCacheHost > that take care of adding the resulting object to the collections for > cleanup. The test cases themselves would remain more focused on what > artifacts of 'starting' an update with the inputs provided should have. > > http://codereview.chromium.org/201070/diff/26006/26020#newcode846 > Line 846: group_->AddObserver(this); > what if the update is already finished? > > if (group_->update_status() == IDLE) > UpdateFinished(); > else > group_->AddObserver(this); > > http://codereview.chromium.org/201070/diff/26006/26020#newcode852 > Line 852: // Finish up outside of observer callback so that group can be > deleted. > It looks should be safe to release the group from within this observer > callback. At this point, we're in the UpdateJob dtor and the UpdateJob > still has its reference to the group. Update return the UpdateJob's ref > to the group will be released. > > Does something go awry if UpdateFinished is called directly, or do you > just want to unwind the stack for other reasons? > > I think its fine to unwind the stack, just curious. > > http://codereview.chromium.org/201070/diff/26006/26020#newcode862 > Line 862: VerifyExpectations(); > Invoke cleanup on IO thread here. > > http://codereview.chromium.org/201070/diff/26006/26020#newcode867 > Line 867: void CreateService() { > maybe MakeService for consistency with Make<OtherThings> naming > > > http://codereview.chromium.org/201070 >
LGTM! |
