|
|
Chromium Code Reviews|
Created:
4 years, 4 months ago by dougarnett Modified:
4 years, 4 months ago CC:
chromium-reviews, romax+watch_chromium.org, fgorski+watch_chromium.org, dewittj+watch_chromium.org, petewil+watch_chromium.org, chili+watch_chromium.org, dimich+watch_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionSkips processing more SavePageLater requests in current processing window when one completes unsuccessfully.
Ideally, we would consider processing another request in the same processing
window for certain types of page-specific failures (follow-on work) but the
default needs to be more conservative in case we are seeing system resource
related failures/cancels.
BUG=632119
Committed: https://crrev.com/e91a689c5f795a9c7f7d472d75c970d237897e5c
Cr-Commit-Position: refs/heads/master@{#409554}
Patch Set 1 #
Total comments: 3
Patch Set 2 : Allow processing to continue for some cases with TODO to split PrerenderingFailed #
Total comments: 7
Patch Set 3 : Makes targeted status codes more explicit #Patch Set 4 : DCHECK in default case #
Total comments: 2
Messages
Total messages: 35 (20 generated)
The CQ bit was checked by dougarnett@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
dougarnett@chromium.org changed reviewers: + petewil@chromium.org
This is simple fix for now. It will be more involved to support continuing processing for some types of failures - to surface failure types in some form up to Coordinator and to defer the request that just failed in case attempt count gets set higher than one.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
https://codereview.chromium.org/2196293002/diff/1/components/offline_pages/ba... File components/offline_pages/background/request_coordinator.cc (right): https://codereview.chromium.org/2196293002/diff/1/components/offline_pages/ba... components/offline_pages/background/request_coordinator.cc:237: if (status == Offliner::RequestStatus::SAVED) { Let's fix this right while we are here. THere aren't all that many cases: REQUEST_COORDINATOR_CANCELED - Let's try the next one. PRERENDERING_CANCELED - Presumably this happens because the user started using the device, so don't continue. PRERENDERING_FAILED - If it was the page's fault, try the next one. If the prerenderer crashed, abort. Can we tell the difference? SAVE_FAILED - Let's try the next one, then. FOREGROUND_CANCELED - Not sure how this is different than PRERENDERING_CANCELLED.
On 2016/08/01 19:56:27, Pete Williamson wrote: > https://codereview.chromium.org/2196293002/diff/1/components/offline_pages/ba... > File components/offline_pages/background/request_coordinator.cc (right): > > https://codereview.chromium.org/2196293002/diff/1/components/offline_pages/ba... > components/offline_pages/background/request_coordinator.cc:237: if (status == > Offliner::RequestStatus::SAVED) { > Let's fix this right while we are here. Trying the next one means we need to either remove the retry support (so configuring the attempt count > 1 won't just run same one again) or add support somehow for RequestPicker to skip over this one. An existing mechanism we could consider is setting some backoff in the activation_time on the SavePageRequest. We can ship without this optimization but we can ship with basic fix - that is my proposal in the CL for landing this required fix first. > THere aren't all that many cases: > REQUEST_COORDINATOR_CANCELED - Let's try the next one. > PRERENDERING_CANCELED - Presumably this happens because the user started > using the device, so don't continue. > > PRERENDERING_FAILED - If it was the page's fault, try the next one. If the > prerenderer crashed, abort. Can we tell the difference? This is TODO - we could "know" for couple cases now but need to determine how to surface that to Coordinator. Perhaps via PRERENDERING_FAILED_NO_RETRY would make sense for things like UNSUPPORTED_SCHEME or CREATING_AUDIO_STREAM). Or we could plumb through detailed error code (as int of FinalStatus or as new, duplicating enum) and pass to a Policy method to tell us to continue or not. Seems preferable to do more evaluation on failures we actually hit first though - and then decide how to deal with most impactful ones. > > SAVE_FAILED - Let's try the next one, then. > > FOREGROUND_CANCELED - Not sure how this is different than > PRERENDERING_CANCELLED. FOREGROUND_CANCELED is detecting Chrome going to the foreground on low-end device (so need to cease processing). PRERENDERING_CANCELED is prerender stack discovering it should stop running presumably for some system resource reason (can also be for navigation reason for non-OFFLINE_PAGES prerender requests). We should cease for both of these.
On 2016/08/01 20:18:01, dougarnett wrote: > On 2016/08/01 19:56:27, Pete Williamson wrote: > > > https://codereview.chromium.org/2196293002/diff/1/components/offline_pages/ba... > > File components/offline_pages/background/request_coordinator.cc (right): > > > > > https://codereview.chromium.org/2196293002/diff/1/components/offline_pages/ba... > > components/offline_pages/background/request_coordinator.cc:237: if (status == > > Offliner::RequestStatus::SAVED) { > > Let's fix this right while we are here. > > Trying the next one means we need to either remove the retry support (so > configuring the attempt count > 1 won't just run same one again) or add support > somehow for RequestPicker to skip over this one. An existing mechanism we could > consider is setting some backoff in the activation_time on the SavePageRequest. I think we already removed retry support. > > We can ship without this optimization but we can ship with basic fix - that is > my proposal in the CL for landing this required fix first. > > > THere aren't all that many cases: > > REQUEST_COORDINATOR_CANCELED - Let's try the next one. > > PRERENDERING_CANCELED - Presumably this happens because the user started > > using the device, so don't continue. > > > > PRERENDERING_FAILED - If it was the page's fault, try the next one. If > the > > prerenderer crashed, abort. Can we tell the difference? > > This is TODO - we could "know" for couple cases now but need to determine how to > surface that to Coordinator. Perhaps via PRERENDERING_FAILED_NO_RETRY would make > sense for things like UNSUPPORTED_SCHEME or CREATING_AUDIO_STREAM). Or we could > plumb through detailed error code (as int of FinalStatus or as new, duplicating > enum) and pass to a Policy method to tell us to continue or not. > > Seems preferable to do more evaluation on failures we actually hit first though > - and then decide how to deal with most impactful ones. > > > > > SAVE_FAILED - Let's try the next one, then. > > > > FOREGROUND_CANCELED - Not sure how this is different than > > PRERENDERING_CANCELLED. > > FOREGROUND_CANCELED is detecting Chrome going to the foreground on low-end > device (so need to cease processing). > PRERENDERING_CANCELED is prerender stack discovering it should stop running > presumably for some system resource reason (can also be for navigation reason > for non-OFFLINE_PAGES prerender requests). > > We should cease for both of these.
Nice patch. I think in general it makes sense to simplify the processing for now (~M54). The goal for us is to start experimental usage soon and then observe the actual behavior in the field. Observing the behavior via UMA is easier if the main processing/scheduling logic is simple so it is possible to build theories of what's actually happening by looking at cumulative histograms of every run. Often, we have to 'reverse engineer' actual behavior from accumulated stats which may be not even solvable. Eventually we need to add retry logic (and corresponding UMA to tell us how the policy happen to be working). However, assuming it'still functional, the simplest initial solution will be easiest to observe in the field. I bet we'll find things we don't expect now (as we found a lot of 'unknown error pages' and 'online redirect loops' in case of last1) and simple logic will make it possible to reason about what's happening. So while there can be a good way to batch several opportunistic loads into a single processing window (and we may have to do it very soon), I'm glad we can start experimenting on something that is easier to reason about.
The CQ bit was checked by dougarnett@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Allow processing for some codes now with splitting PrenderingFailed still being a TODO.
https://codereview.chromium.org/2196293002/diff/1/components/offline_pages/ba... File components/offline_pages/background/request_coordinator_unittest.cc (right): https://codereview.chromium.org/2196293002/diff/1/components/offline_pages/ba... components/offline_pages/background/request_coordinator_unittest.cc:409: TEST_F(RequestCoordinatorTest, OfflinerDoneRequestCanceled) { Why are we removing this test? Intentional? https://codereview.chromium.org/2196293002/diff/20001/components/offline_page... File components/offline_pages/background/request_coordinator.cc (right): https://codereview.chromium.org/2196293002/diff/20001/components/offline_page... components/offline_pages/background/request_coordinator.cc:240: case Offliner::RequestStatus::PRERENDERING_CANCELED: Good! Let's be a bit more explicit for the cases where we don't keep processing by calling out the cases: case Offliner::RequestStatus::PRERENDER_FAILED: case Offliner::RequestStatus::FOREGOUND_CANCELLED: ... default: break;
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
fgorski@chromium.org changed reviewers: + fgorski@chromium.org
drive-by https://codereview.chromium.org/2196293002/diff/20001/components/offline_page... File components/offline_pages/background/request_coordinator.cc (right): https://codereview.chromium.org/2196293002/diff/20001/components/offline_page... components/offline_pages/background/request_coordinator.cc:237: case Offliner::RequestStatus::SAVED: It is a bit worrying that you have two scope operators here. Perhaps the enum does not belong inside offliner? https://codereview.chromium.org/2196293002/diff/20001/components/offline_page... components/offline_pages/background/request_coordinator.cc:244: // Stop further processing in this service window. https://google.github.io/styleguide/cppguide.html#Loops_and_Switch_Statements
The CQ bit was checked by dougarnett@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by dougarnett@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
https://codereview.chromium.org/2196293002/diff/1/components/offline_pages/ba... File components/offline_pages/background/request_coordinator_unittest.cc (right): https://codereview.chromium.org/2196293002/diff/1/components/offline_pages/ba... components/offline_pages/background/request_coordinator_unittest.cc:409: TEST_F(RequestCoordinatorTest, OfflinerDoneRequestCanceled) { On 2016/08/02 22:20:00, Pete Williamson wrote: > Why are we removing this test? Intentional? Intentional - wasn't really testing anything additional and have some merge conflict coming up with your change that also defines 2nd request values. My plan is to inject mock RequestPicker (per line 392 TODO) when I split the PRENDERING_FAILED codes so we can really verify whether TryNextRequest gets called or not. https://codereview.chromium.org/2196293002/diff/20001/components/offline_page... File components/offline_pages/background/request_coordinator.cc (right): https://codereview.chromium.org/2196293002/diff/20001/components/offline_page... components/offline_pages/background/request_coordinator.cc:237: case Offliner::RequestStatus::SAVED: On 2016/08/03 03:40:04, fgorski wrote: > It is a bit worrying that you have two scope operators here. Perhaps the enum > does not belong inside offliner? worrying? This is part of the Offliner interface so not sure of better place to define it. Happy to hear more specific recommendation for reducing scope or just using "using". https://codereview.chromium.org/2196293002/diff/20001/components/offline_page... components/offline_pages/background/request_coordinator.cc:240: case Offliner::RequestStatus::PRERENDERING_CANCELED: On 2016/08/02 22:20:00, Pete Williamson wrote: > Good! > Let's be a bit more explicit for the cases where we don't keep processing by > calling out the cases: > > case Offliner::RequestStatus::PRERENDER_FAILED: > case Offliner::RequestStatus::FOREGOUND_CANCELLED: > ... > default: > break; Done. https://codereview.chromium.org/2196293002/diff/20001/components/offline_page... components/offline_pages/background/request_coordinator.cc:244: // Stop further processing in this service window. On 2016/08/03 03:40:05, fgorski wrote: > https://google.github.io/styleguide/cppguide.html#Loops_and_Switch_Statements Done.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
lgtm with a comment https://codereview.chromium.org/2196293002/diff/20001/components/offline_page... File components/offline_pages/background/request_coordinator.cc (right): https://codereview.chromium.org/2196293002/diff/20001/components/offline_page... components/offline_pages/background/request_coordinator.cc:237: case Offliner::RequestStatus::SAVED: On 2016/08/03 15:33:26, dougarnett wrote: > On 2016/08/03 03:40:04, fgorski wrote: > > It is a bit worrying that you have two scope operators here. Perhaps the enum > > does not belong inside offliner? > > worrying? This is part of the Offliner interface so not sure of better place to > define it. Happy to hear more specific recommendation for reducing scope or just > using "using". Acknowledged. https://codereview.chromium.org/2196293002/diff/60001/components/offline_page... File components/offline_pages/background/request_coordinator.cc (right): https://codereview.chromium.org/2196293002/diff/60001/components/offline_page... components/offline_pages/background/request_coordinator.cc:244: case Offliner::RequestStatus::FOREGROUND_CANCELED: I think you might be missing UNKNOWN and LOADED.
https://codereview.chromium.org/2196293002/diff/60001/components/offline_page... File components/offline_pages/background/request_coordinator.cc (right): https://codereview.chromium.org/2196293002/diff/60001/components/offline_page... components/offline_pages/background/request_coordinator.cc:244: case Offliner::RequestStatus::FOREGROUND_CANCELED: On 2016/08/03 16:18:59, fgorski wrote: > I think you might be missing UNKNOWN and LOADED. Right, actually intentional here as those are interim codes that should not be received by callback. I'm a bit torn about including them explicitly and asserting vs. just let default do it and not bothering the Coordinator maintainer with these interim codes. Also, I have a TODO supported by pasko to merge the PrerendingOffliner and PrerenderingLoader which would allow me to stop using LOADED. UNKNOWN is still needed for initialization and testing.
lgtm
The CQ bit was checked by dougarnett@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Committed patchset #4 (id:60001)
Message was sent while issue was closed.
Description was changed from ========== Skips processing more SavePageLater requests in current processing window when one completes unsuccessfully. Ideally, we would consider processing another request in the same processing window for certain types of page-specific failures (follow-on work) but the default needs to be more conservative in case we are seeing system resource related failures/cancels. BUG=632119 ========== to ========== Skips processing more SavePageLater requests in current processing window when one completes unsuccessfully. Ideally, we would consider processing another request in the same processing window for certain types of page-specific failures (follow-on work) but the default needs to be more conservative in case we are seeing system resource related failures/cancels. BUG=632119 Committed: https://crrev.com/e91a689c5f795a9c7f7d472d75c970d237897e5c Cr-Commit-Position: refs/heads/master@{#409554} ==========
Message was sent while issue was closed.
Patchset 4 (id:??) landed as https://crrev.com/e91a689c5f795a9c7f7d472d75c970d237897e5c Cr-Commit-Position: refs/heads/master@{#409554} |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
