|
|
Chromium Code Reviews|
Created:
4 years, 1 month ago by dougarnett Modified:
4 years, 1 month ago CC:
chromium-reviews, romax+watch_chromium.org, fgorski+watch_chromium.org, dewittj+watch_chromium.org, petewil+watch_chromium.org, chili+watch_chromium.org, dimich+watch_chromium.org Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
Description[OfflinePages] Fixes RequestCoordinator bug not clearing state on timeout
Factors out some OfflinerDoneCallback logic so that it may be used for handling timeout case.
BUG=663948
Committed: https://crrev.com/5d66b4a41712bb24405cc261ea0ada34c4ee9eb6
Cr-Commit-Position: refs/heads/master@{#431638}
Patch Set 1 #
Total comments: 3
Patch Set 2 : Broke out some OfflinerDoneCallback logic and reworked timeout handling #
Total comments: 4
Patch Set 3 : Adress petewil comments #Patch Set 4 : Merge #
Messages
Total messages: 33 (20 generated)
The CQ bit was checked by dougarnett@chromium.org to run a CQ dry run
dougarnett@chromium.org changed reviewers: + petewil@chromium.org, romax@chromium.org
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: ios-device on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-device/builds...) ios-simulator on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-simulator/bui...) mac_chromium_compile_dbg_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_comp...) mac_chromium_rel_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
https://codereview.chromium.org/2493683002/diff/1/components/offline_pages/ba... File components/offline_pages/background/request_coordinator.cc (right): https://codereview.chromium.org/2493683002/diff/1/components/offline_pages/ba... components/offline_pages/background/request_coordinator.cc:247: AbortRequestAttempt(active_request_.get()); When we get a timeout like this, we should get the OfflinerDoneCallback, which should be the one to mark is_busy_ to false. Is that callback not happening somehow? If it is not happening, that's a problem it is what calls TryNextRequest()
https://codereview.chromium.org/2493683002/diff/1/components/offline_pages/ba... File components/offline_pages/background/request_coordinator.cc (right): https://codereview.chromium.org/2493683002/diff/1/components/offline_pages/ba... components/offline_pages/background/request_coordinator.cc:247: AbortRequestAttempt(active_request_.get()); On 2016/11/10 01:03:30, Pete Williamson wrote: > When we get a timeout like this, we should get the OfflinerDoneCallback, which > should be the one to mark is_busy_ to false. Is that callback not happening > somehow? If it is not happening, that's a problem it is what calls > TryNextRequest() Offliner::Cancel contract is to not call the callback: https://cs.chromium.org/chromium/src/components/offline_pages/background/offl... so don't see how the early return from here ever worked. I will look at breaking out common "completion" logic from OfflinerDoneCallback to call from HandleWatchdogTimeout and see if that seems viable.
https://codereview.chromium.org/2493683002/diff/1/components/offline_pages/ba... File components/offline_pages/background/request_coordinator.cc (right): https://codereview.chromium.org/2493683002/diff/1/components/offline_pages/ba... components/offline_pages/background/request_coordinator.cc:247: AbortRequestAttempt(active_request_.get()); On 2016/11/10 16:38:38, dougarnett wrote: > On 2016/11/10 01:03:30, Pete Williamson wrote: > > When we get a timeout like this, we should get the OfflinerDoneCallback, which > > should be the one to mark is_busy_ to false. Is that callback not happening > > somehow? If it is not happening, that's a problem it is what calls > > TryNextRequest() > > Offliner::Cancel contract is to not call the callback: > https://cs.chromium.org/chromium/src/components/offline_pages/background/offl... > so don't see how the early return from here ever worked. > > I will look at breaking out common "completion" logic from OfflinerDoneCallback > to call from HandleWatchdogTimeout and see if that seems viable. OK, in that case, it is my misunderstanding of the contract. So, we need to look at what OfflinerDoneCallback does, and do the relevant parts . Doing it from HandleWatchdogTimeout seems like a good idea. We need to at least clear the is_busy_ flag and active_request_, and call TryNextRequest(). We may also need to track some UMA. We should remove the request if it is over the retry limit, and call MarkCompleted() if it is not over the retry limit. There may be an opportunity to refactor some common tasks with OfflinerDoneCallback and HandleWatchdogTimeout into a single function.
The CQ bit was checked by dougarnett@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Description was changed from ========== [OfflinePages] Fixes RequestCoordinator bug not clearing state on timeout BUG=663948 ========== to ========== [OfflinePages] Fixes RequestCoordinator bug not clearing state on timeout Factors out some OfflinerDoneCallback logic so that it may be used for handling timeout case. BUG=663948 ==========
On 2016/11/10 19:01:29, Pete Williamson wrote: > https://codereview.chromium.org/2493683002/diff/1/components/offline_pages/ba... > File components/offline_pages/background/request_coordinator.cc (right): > > https://codereview.chromium.org/2493683002/diff/1/components/offline_pages/ba... > components/offline_pages/background/request_coordinator.cc:247: > AbortRequestAttempt(active_request_.get()); > On 2016/11/10 16:38:38, dougarnett wrote: > > On 2016/11/10 01:03:30, Pete Williamson wrote: > > > When we get a timeout like this, we should get the OfflinerDoneCallback, > which > > > should be the one to mark is_busy_ to false. Is that callback not happening > > > somehow? If it is not happening, that's a problem it is what calls > > > TryNextRequest() > > > > Offliner::Cancel contract is to not call the callback: > > > https://cs.chromium.org/chromium/src/components/offline_pages/background/offl... > > so don't see how the early return from here ever worked. > > > > I will look at breaking out common "completion" logic from > OfflinerDoneCallback > > to call from HandleWatchdogTimeout and see if that seems viable. > > OK, in that case, it is my misunderstanding of the contract. So, we need to > look at what OfflinerDoneCallback does, and do the relevant parts . Doing it > from HandleWatchdogTimeout seems like a good idea. We need to at least clear > the is_busy_ flag and active_request_, and call TryNextRequest(). We may also > need to track some UMA. We should remove the request if it is over the retry > limit, and call MarkCompleted() if it is not over the retry limit. > > There may be an opportunity to refactor some common tasks with > OfflinerDoneCallback and HandleWatchdogTimeout into a single function. PTAL Now have broken out some OfflinerDoneCallback into couple of supporting methods that are now used by timeout path. [Have verified retrying now happening via logging and dropping timeout value.]
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
https://codereview.chromium.org/2493683002/diff/20001/components/offline_page... File components/offline_pages/background/request_coordinator.cc (right): https://codereview.chromium.org/2493683002/diff/20001/components/offline_page... components/offline_pages/background/request_coordinator.cc:515: if (ShouldTryNextRequest(watchdog_status)) Why call ShouldTryNextRequest() here? In this case, the answer is always yes (because timed out returns true). It might be better to just call TryNextRequest. https://codereview.chromium.org/2493683002/diff/20001/components/offline_page... File components/offline_pages/background/request_coordinator_unittest.cc (right): https://codereview.chromium.org/2493683002/diff/20001/components/offline_page... components/offline_pages/background/request_coordinator_unittest.cc:1077: request.MarkAttemptCompleted(); Instead of calling MarkAttemptCompleted, why not just call set_completed_attempt_count()?
https://codereview.chromium.org/2493683002/diff/20001/components/offline_page... File components/offline_pages/background/request_coordinator.cc (right): https://codereview.chromium.org/2493683002/diff/20001/components/offline_page... components/offline_pages/background/request_coordinator.cc:515: if (ShouldTryNextRequest(watchdog_status)) On 2016/11/11 00:24:04, Pete Williamson wrote: > Why call ShouldTryNextRequest() here? In this case, the answer is always yes > (because timed out returns true). It might be better to just call > TryNextRequest. Yeah, almost did. The idea here was for the policy to be in the ShouldTry method instead of sprinkled around (one place to look/change, no contradictions). https://codereview.chromium.org/2493683002/diff/20001/components/offline_page... File components/offline_pages/background/request_coordinator_unittest.cc (right): https://codereview.chromium.org/2493683002/diff/20001/components/offline_page... components/offline_pages/background/request_coordinator_unittest.cc:1077: request.MarkAttemptCompleted(); On 2016/11/11 00:24:04, Pete Williamson wrote: > Instead of calling MarkAttemptCompleted, why not just call > set_completed_attempt_count()? Done.
lgtm
The CQ bit was checked by dougarnett@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by dougarnett@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Failed to apply patch for
components/offline_pages/background/request_coordinator.cc:
While running git apply --index -p1;
error: patch failed:
components/offline_pages/background/request_coordinator.cc:302
error: components/offline_pages/background/request_coordinator.cc: patch does
not apply
Patch: components/offline_pages/background/request_coordinator.cc
Index: components/offline_pages/background/request_coordinator.cc
diff --git a/components/offline_pages/background/request_coordinator.cc
b/components/offline_pages/background/request_coordinator.cc
index
098024afd8436c03a6875d2c1f49a780973de2b2..b26b228c18d2d127412d9cd15d82018dab955b83
100644
--- a/components/offline_pages/background/request_coordinator.cc
+++ b/components/offline_pages/background/request_coordinator.cc
@@ -241,13 +241,15 @@ void
RequestCoordinator::StopPrerendering(Offliner::RequestStatus stop_status) {
DCHECK(active_request_.get());
offliner_->Cancel();
- // If we timed out, let the offliner done callback handle it.
- if (stop_status == Offliner::RequestStatus::REQUEST_COORDINATOR_TIMED_OUT)
- return;
-
- // Otherwise, this attempt never really had a chance to run, mark it
- // aborted.
- AbortRequestAttempt(*active_request_.get());
+ if (stop_status == Offliner::RequestStatus::REQUEST_COORDINATOR_TIMED_OUT)
{
+ // Consider watchdog timeout a completed attempt.
+ SavePageRequest request(*active_request_.get());
+ UpdateRequestForCompletedAttempt(request,
+
Offliner::REQUEST_COORDINATOR_TIMED_OUT);
+ } else {
+ // Otherwise consider this stop an aborted attempt.
+ UpdateRequestForAbortedAttempt(*active_request_.get());
+ }
}
// Stopping offliner means it will not call callback so set last status.
@@ -302,7 +304,8 @@ bool RequestCoordinator::CancelActiveRequestIfItMatches(
return false;
}
-void RequestCoordinator::AbortRequestAttempt(const SavePageRequest& request) {
+void RequestCoordinator::UpdateRequestForAbortedAttempt(
+ const SavePageRequest& request) {
if (request.started_attempt_count() >= policy_->GetMaxStartedTries()) {
const BackgroundSavePageResult result(
BackgroundSavePageResult::START_COUNT_EXCEEDED);
@@ -506,7 +509,10 @@ void RequestCoordinator::StopProcessing(
}
void RequestCoordinator::HandleWatchdogTimeout() {
- StopProcessing(Offliner::REQUEST_COORDINATOR_TIMED_OUT);
+ Offliner::RequestStatus watchdog_status =
+ Offliner::REQUEST_COORDINATOR_TIMED_OUT;
+ StopPrerendering(watchdog_status);
+ TryNextRequest();
}
// Returns true if the caller should expect a callback, false otherwise. For
@@ -741,16 +747,25 @@ void RequestCoordinator::OfflinerDoneCallback(const
SavePageRequest& request,
RecordOfflinerResultUMA(request.client_id(), request.creation_time(),
last_offlining_status_);
watchdog_timer_.Stop();
-
is_busy_ = false;
active_request_.reset(nullptr);
+ UpdateRequestForCompletedAttempt(request, status);
+ if (ShouldTryNextRequest(status))
+ TryNextRequest();
+ else
+ scheduler_callback_.Run(true);
+}
+
+void RequestCoordinator::UpdateRequestForCompletedAttempt(
+ const SavePageRequest& request,
+ Offliner::RequestStatus status) {
if (status == Offliner::RequestStatus::FOREGROUND_CANCELED ||
status == Offliner::RequestStatus::PRERENDERING_CANCELED) {
// Update the request for the canceled attempt.
// TODO(dougarnett): See if we can conclusively identify other attempt
// aborted cases to treat this way (eg, for Render Process Killed).
- AbortRequestAttempt(request);
+ UpdateRequestForAbortedAttempt(request);
} else if (status == Offliner::RequestStatus::SAVED) {
// Remove the request from the queue if it succeeded.
RemoveAttemptedRequest(request, BackgroundSavePageResult::SUCCESS);
@@ -777,29 +792,27 @@ void RequestCoordinator::OfflinerDoneCallback(const
SavePageRequest& request,
weak_ptr_factory_.GetWeakPtr(), request.request_id(),
request.client_id()));
}
+}
- // Determine whether we might try another request in this
- // processing window based on how the previous request completed.
- switch (status) {
+bool RequestCoordinator::ShouldTryNextRequest(
+ Offliner::RequestStatus previous_request_status) {
+ switch (previous_request_status) {
case Offliner::RequestStatus::SAVED:
case Offliner::RequestStatus::SAVE_FAILED:
case Offliner::RequestStatus::REQUEST_COORDINATOR_CANCELED:
case Offliner::RequestStatus::REQUEST_COORDINATOR_TIMED_OUT:
case Offliner::RequestStatus::PRERENDERING_FAILED_NO_RETRY:
- // Consider processing another request in this service window.
- TryNextRequest();
- break;
+ return true;
case Offliner::RequestStatus::FOREGROUND_CANCELED:
case Offliner::RequestStatus::PRERENDERING_CANCELED:
case Offliner::RequestStatus::PRERENDERING_FAILED:
// No further processing in this service window.
- // Let the scheduler know we are done processing.
- scheduler_callback_.Run(true);
- break;
+ return false;
default:
// Make explicit choice about new status codes that actually reach here.
// Their default is no further processing in this service window.
NOTREACHED();
+ return false;
}
}
The CQ bit was checked by dougarnett@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from petewil@chromium.org Link to the patchset: https://codereview.chromium.org/2493683002/#ps60001 (title: "Merge")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Description was changed from ========== [OfflinePages] Fixes RequestCoordinator bug not clearing state on timeout Factors out some OfflinerDoneCallback logic so that it may be used for handling timeout case. BUG=663948 ========== to ========== [OfflinePages] Fixes RequestCoordinator bug not clearing state on timeout Factors out some OfflinerDoneCallback logic so that it may be used for handling timeout case. BUG=663948 ==========
Message was sent while issue was closed.
Committed patchset #4 (id:60001)
Message was sent while issue was closed.
Description was changed from ========== [OfflinePages] Fixes RequestCoordinator bug not clearing state on timeout Factors out some OfflinerDoneCallback logic so that it may be used for handling timeout case. BUG=663948 ========== to ========== [OfflinePages] Fixes RequestCoordinator bug not clearing state on timeout Factors out some OfflinerDoneCallback logic so that it may be used for handling timeout case. BUG=663948 Committed: https://crrev.com/5d66b4a41712bb24405cc261ea0ada34c4ee9eb6 Cr-Commit-Position: refs/heads/master@{#431638} ==========
Message was sent while issue was closed.
Patchset 4 (id:??) landed as https://crrev.com/5d66b4a41712bb24405cc261ea0ada34c4ee9eb6 Cr-Commit-Position: refs/heads/master@{#431638} |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
