|
|
Created:
4 years, 11 months ago by bacek_yandex Modified:
4 years, 10 months ago CC:
chromium-reviews, cbentzel+watch_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionWhen URLRequestHttpJob processing cached response with GetDictionary header we already have seen the header in SdchOwner. So just ignore it.
R=rdsmith@chromium.org
Committed: https://crrev.com/41dc425d047c421c46d35b42587d71da6df37283
Cr-Commit-Position: refs/heads/master@{#372192}
Patch Set 1 #Patch Set 2 : Fix compilation #Patch Set 3 : Don't download SDCH Dictionaries for cached responses #
Total comments: 1
Patch Set 4 : Test added #Messages
Total messages: 26 (5 generated)
So as I read the CL, the functional change is that, if a Get-Dictionary is seen from reading a request from cache, the dictionary is still fetched, but only from the cache; if it's not in the cache, no network request is generated. Presuming I understand correctly, I'm not sure it's the right behavior. I can see an argument that the dictionary shouldn't be fetched at all; after all, it was presumably fetched when the entry was loaded from the network, and after that point, whether it sticks around should be based on whether it's being *used*. Alternatively, if the entry keeps being fetched from the cache, that indicates to me that the web site is being revisited often, and thus that the dictionary should be fetched. I think I lean more towards the first argument, which suggests to me that the OnGetDictionary signal shouldn't be raised if the entry it is seen in is a cached entry. Would you be willing to shift this CL over to behaving in that fashion?
On 2016/01/11 02:42:04, Randy Smith - Not in Fridays wrote: > So as I read the CL, the functional change is that, if a Get-Dictionary is seen > from reading a request from cache, the dictionary is still fetched, but only > from the cache; if it's not in the cache, no network request is generated. Yes, this is correct. > Presuming I understand correctly, I'm not sure it's the right behavior. I can > see an argument that the dictionary shouldn't be fetched at all; after all, it > was presumably fetched when the entry was loaded from the network, and after > that point, whether it sticks around should be based on whether it's being > *used*. Alternatively, if the entry keeps being fetched from the cache, that > indicates to me that the web site is being revisited often, and thus that the > dictionary should be fetched. > > I think I lean more towards the first argument, which suggests to me that the > OnGetDictionary signal shouldn't be raised if the entry it is seen in is a > cached entry. Would you be willing to shift this CL over to behaving in that > fashion? There is a slight problem with this approach. Currently we have 4+% of Sdch3.ResponseCorruptionDetected.Cached:RESPONSE_NO_DICTIONARY errors. Which means that we have cached response without loaded dictionary. OTOH, Sdch3.ProblemCodes_5:DICTIONARY_PREVIOUSLY_SCHEDULED_TO_DOWNLOAD stats shows that we are trying to load Dictionaries too often. With my CL I'm trying to reduce latter one. And for former one I'm looking at idea of getting rid of "meta-refresh workaround to reload html resource" and implement kind of full proper reloading bypassing cache.
On 2016/01/11 04:43:40, bacek_yandex wrote: > On 2016/01/11 02:42:04, Randy Smith - Not in Fridays wrote: > > So as I read the CL, the functional change is that, if a Get-Dictionary is > seen > > from reading a request from cache, the dictionary is still fetched, but only > > from the cache; if it's not in the cache, no network request is generated. > > Yes, this is correct. > > > Presuming I understand correctly, I'm not sure it's the right behavior. I can > > see an argument that the dictionary shouldn't be fetched at all; after all, it > > was presumably fetched when the entry was loaded from the network, and after > > that point, whether it sticks around should be based on whether it's being > > *used*. Alternatively, if the entry keeps being fetched from the cache, that > > indicates to me that the web site is being revisited often, and thus that the > > dictionary should be fetched. > > > > I think I lean more towards the first argument, which suggests to me that the > > OnGetDictionary signal shouldn't be raised if the entry it is seen in is a > > cached entry. Would you be willing to shift this CL over to behaving in that > > fashion? > > There is a slight problem with this approach. Currently we have 4+% of > Sdch3.ResponseCorruptionDetected.Cached:RESPONSE_NO_DICTIONARY errors. Which > means that we have cached response without loaded dictionary. OTOH, > Sdch3.ProblemCodes_5:DICTIONARY_PREVIOUSLY_SCHEDULED_TO_DOWNLOAD stats shows > that we are trying to load Dictionaries too often. With my CL I'm trying to > reduce latter one. So I think that these are two independent problems, and the existence of the Cached:RESPONSE_NO_DICTIONARY case doesn't apply to the value or lack of same of this CL. If we get a RESPONSE_NO_DICTIONARY error, that means that we did *not* have an entry in the cache for with a Get-Dictionary in it, as only uncached entries have those. Please let me know if you see some connection I've missed. In terms of the DICTIONARY_PREVIOUSLY_SCHEDULED_TO_DOWNLOAD statistic, I don't see the value in reducing it, as those downloads are blocked from using up network bandwidth; that's what the code means ("Sorry, didn't let this go out because it already went out"). So there's negligible cost to those cases. If we have some cleaner way of eliminating attempts to load dictionaries that we know we don't need to load, I'm happy to do so. But I don't see the value of loading dictionaries only from the cache if we find a Get-Dictionary in a cache entry; what do we gain? WRT that entry, we *know* we don't need that dictionary, because we have the unencoded version in the cache. Is it that we're assuming a correlation between having loading entries from the same domain, some of which may be in the cache and some not? > And for former one I'm looking at idea of getting rid of "meta-refresh > workaround to reload html resource" and implement kind of full proper reloading > bypassing cache. That would be wonderful (I hate the meta-refresh workaround), but I'll warn you that I looked at doing that and spent some time brainstorming with other folks working on the net stack, and it looked like it'd be a fairly major hassle (you need to restart the request with new parameters and make sure you don't mess up any of the state machines attached to it in doing so).
On 2016/01/14 23:47:56, Randy Smith - Not in Fridays wrote: > > So I think that these are two independent problems, and the existence of the > Cached:RESPONSE_NO_DICTIONARY case doesn't apply to the value or lack of same of > this CL. If we get a RESPONSE_NO_DICTIONARY error, that means that we did *not* > have an entry in the cache for with a Get-Dictionary in it, as only uncached > entries have those. Please let me know if you see some connection I've missed. > > > In terms of the DICTIONARY_PREVIOUSLY_SCHEDULED_TO_DOWNLOAD statistic, I don't > see the value in reducing it, as those downloads are blocked from using up > network bandwidth; that's what the code means ("Sorry, didn't let this go out > because it already went out"). So there's negligible cost to those cases. If > we have some cleaner way of eliminating attempts to load dictionaries that we > know we don't need to load, I'm happy to do so. But I don't see the value of > loading dictionaries only from the cache if we find a Get-Dictionary in a cache > entry; what do we gain? WRT that entry, we *know* we don't need that > dictionary, because we have the unencoded version in the cache. Is it that > we're assuming a correlation between having loading entries from the same > domain, some of which may be in the cache and some not? Actually, after quite a bit of thinking I agree with your original proposal to just disable loading dictionaries for cached responses. It's too late anyway to recover from absent dictionary. > > And for former one I'm looking at idea of getting rid of "meta-refresh > > workaround to reload html resource" and implement kind of full proper > reloading > > bypassing cache. > > That would be wonderful (I hate the meta-refresh workaround), but I'll warn you > that I looked at doing that and spent some time brainstorming with other folks > working on the net stack, and it looked like it'd be a fairly major hassle (you > need to restart the request with new parameters and make sure you don't mess up > any of the state machines attached to it in doing so). Working on it... It's way too hard. Especially with all stuff around NotificationHeadersComplete and many many layers of abstractions.
Would you also put in a test for this behavior? (Sorry about taking a long time to get back to you--I was traveling last week and am sick this week.) https://codereview.chromium.org/1563163002/diff/40001/net/url_request/url_req... File net/url_request/url_request_http_job.cc (right): https://codereview.chromium.org/1563163002/diff/40001/net/url_request/url_req... net/url_request/url_request_http_job.cc:370: // useless or too late. nit: If you're willing, I'd like to phrase this comment a bit differently: "Don't re-notify the existence of a Get-Dictionary header for a cached response; a previous notification was already sent." Officially, this is just notifying the manager (and any observers through it) that we've seen an Get-Dictionary header, it's not actually requesting a download (which is important because of the potential circular dependency in the net stack). Part of me wants to wired the information as to whether it's a cached response through the notification (because the decision as to what to do with a Get-Dictionary header really belongs in SdchOwner) but that seems like way too much trouble when it's fairly clear we don't want to do anything with it. If the information is ever needed, we can do that wiring later.
On 2016/01/26 20:44:10, Randy Smith - Not in Fridays wrote: > Would you also put in a test for this behavior? Will do. > (Sorry about taking a long time to get back to you--I was traveling last week > and am sick this week.) No worries. Life happens. > https://codereview.chromium.org/1563163002/diff/40001/net/url_request/url_req... > File net/url_request/url_request_http_job.cc (right): > > https://codereview.chromium.org/1563163002/diff/40001/net/url_request/url_req... > net/url_request/url_request_http_job.cc:370: // useless or too late. > nit: If you're willing, I'd like to phrase this comment a bit differently: > "Don't re-notify the existence of a Get-Dictionary header for a cached response; > a previous notification was already sent." > > Officially, this is just notifying the manager (and any observers through it) > that we've seen an Get-Dictionary header, it's not actually requesting a > download (which is important because of the potential circular dependency in the > net stack). Part of me wants to wired the information as to whether it's a > cached response through the notification (because the decision as to what to do > with a Get-Dictionary header really belongs in SdchOwner) but that seems like > way too much trouble when it's fairly clear we don't want to do anything with > it. If the information is ever needed, we can do that wiring later. Basically it was in my original version. Just last step should be different: instead of redirecting to SdchDictionaryDownloader::ScheduleReload SdchOwner can just ignore it. Shall I resurrect first version with that change?
On 2016/01/27 01:04:21, bacek_yandex wrote: > On 2016/01/26 20:44:10, Randy Smith - Not in Fridays wrote: > > Would you also put in a test for this behavior? > > Will do. > > > (Sorry about taking a long time to get back to you--I was traveling last week > > and am sick this week.) > > No worries. Life happens. > > > > https://codereview.chromium.org/1563163002/diff/40001/net/url_request/url_req... > > File net/url_request/url_request_http_job.cc (right): > > > > > https://codereview.chromium.org/1563163002/diff/40001/net/url_request/url_req... > > net/url_request/url_request_http_job.cc:370: // useless or too late. > > nit: If you're willing, I'd like to phrase this comment a bit differently: > > "Don't re-notify the existence of a Get-Dictionary header for a cached > response; > > a previous notification was already sent." > > > > Officially, this is just notifying the manager (and any observers through it) > > that we've seen an Get-Dictionary header, it's not actually requesting a > > download (which is important because of the potential circular dependency in > the > > net stack). Part of me wants to wired the information as to whether it's a > > cached response through the notification (because the decision as to what to > do > > with a Get-Dictionary header really belongs in SdchOwner) but that seems like > > way too much trouble when it's fairly clear we don't want to do anything with > > it. If the information is ever needed, we can do that wiring later. > > Basically it was in my original version. Just last step should be different: > instead of redirecting to SdchDictionaryDownloader::ScheduleReload SdchOwner can > just ignore it. Shall I resurrect first version with that change? Nah. In this case I think my "Don't put in complexity until it's needed" heuristic overrides my "make sure the decisions are made in the section of code that has conceptual responsibility for those decisions" heuristic (though I do believe both).
On 2016/01/27 16:16:20, Randy Smith - Not in Fridays wrote: > > just ignore it. Shall I resurrect first version with that change? > > Nah. In this case I think my "Don't put in complexity until it's needed" > heuristic overrides my "make sure the decisions are made in the section of code > that has conceptual responsibility for those decisions" heuristic (though I do > believe both). Make sense. Then this CL is done from my point of view. Shall we invite someone else to also review it?
On 2016/01/27 20:45:52, bacek_yandex wrote: > On 2016/01/27 16:16:20, Randy Smith - Not in Fridays wrote: > > > > just ignore it. Shall I resurrect first version with that change? > > > > Nah. In this case I think my "Don't put in complexity until it's needed" > > heuristic overrides my "make sure the decisions are made in the section of > code > > that has conceptual responsibility for those decisions" heuristic (though I do > > believe both). > > Make sense. Then this CL is done from my point of view. Shall we invite someone > else to also review it? Whoops, missed that you had put in the test. Nope, let me give it another read through and then we're good.
On 2016/01/27 20:46:36, Randy Smith - Not in Fridays wrote: > On 2016/01/27 20:45:52, bacek_yandex wrote: > > On 2016/01/27 16:16:20, Randy Smith - Not in Fridays wrote: > > > > > > just ignore it. Shall I resurrect first version with that change? > > > > > > Nah. In this case I think my "Don't put in complexity until it's needed" > > > heuristic overrides my "make sure the decisions are made in the section of > > code > > > that has conceptual responsibility for those decisions" heuristic (though I > do > > > believe both). > > > > Make sense. Then this CL is done from my point of view. Shall we invite > someone > > else to also review it? > > Whoops, missed that you had put in the test. Nope, let me give it another read > through and then we're good. Actually, could you bring the CL description in line with the current implementation?
Description was changed from ========== Pass cached status from Response up to SdchOwner. Main reason to avoid useless reload of Dictionaries when we restore session or navigate back. 1. Change SdchObserver.OnGetDictionary method to accept cached_response parameter. 2. Expose generic SdchDictionaryFetcher.Schedule method instead of separate Schedule and ScheduleReload to simplify handling of cached responses in SdhcOwner. R=rdsmith@chromium.org ========== to ========== When URLRequestHttpJob processing cached response with GetDictionary header we already has said header in SdchOwner. So just ignore it. R=rdsmith@chromium.org ==========
On 2016/01/27 20:47:40, Randy Smith - Not in Fridays wrote: > > Actually, could you bring the CL description in line with the current > implementation? Something like this?
On 2016/01/27 20:50:23, bacek_yandex wrote: > On 2016/01/27 20:47:40, Randy Smith - Not in Fridays wrote: > > > > Actually, could you bring the CL description in line with the current > > implementation? > > Something like this? suggestion: "has said" -> "have seen the"? Beyond that, looks good. (I'm still on the hook for a full pass review, which I expect I'll get done today.)
Description was changed from ========== When URLRequestHttpJob processing cached response with GetDictionary header we already has said header in SdchOwner. So just ignore it. R=rdsmith@chromium.org ========== to ========== When URLRequestHttpJob processing cached response with GetDictionary header we already have seen the header in SdchOwner. So just ignore it. R=rdsmith@chromium.org ==========
On 2016/01/27 20:54:02, Randy Smith - Not in Fridays wrote: > > suggestion: "has said" -> "have seen the"? Beyond that, looks good. > > (I'm still on the hook for a full pass review, which I expect I'll get done > today.) Done. I'm going to slightly rearrange unittests in separate CL. Just to group SDCH related tests into single fixture.
LGTM. Thanks! (I'm not completely happy with the use of RunLoop in the test, but it's consistent with what's elsewhere in the file, I don't think adding new tests that are out of sync with what's already there makes sense, and I'm not going to ask you to cleanup that whole file as part of this CL :-}. But for the record, I think the proper use of RunLoop is more like what's in content/browser/download/download_browsertest.cc::DownloadCreateObserver.)
The CQ bit was checked by bacek@yandex-team.ru
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1563163002/60001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1563163002/60001
On 2016/01/28 17:32:57, Randy Smith - Not in Fridays wrote: > (I'm not completely happy with the use of RunLoop in the test, but it's > consistent with what's elsewhere in the file, I don't think adding new tests > that are out of sync with what's already there makes sense, and I'm not going to > ask you to cleanup that whole file as part of this CL :-}. But for the record, > I think the proper use of RunLoop is more like what's in > content/browser/download/download_browsertest.cc::DownloadCreateObserver.) Do you mean posting QuitClosure to runloop? I can do it. I was going to look closely to those tests anyway.
On 2016/01/28 21:26:18, bacek_yandex wrote: > On 2016/01/28 17:32:57, Randy Smith - Not in Fridays wrote: > > (I'm not completely happy with the use of RunLoop in the test, but it's > > consistent with what's elsewhere in the file, I don't think adding new tests > > that are out of sync with what's already there makes sense, and I'm not going > to > > ask you to cleanup that whole file as part of this CL :-}. But for the > record, > > I think the proper use of RunLoop is more like what's in > > content/browser/download/download_browsertest.cc::DownloadCreateObserver.) > > Do you mean posting QuitClosure to runloop? I can do it. I was going to look > closely to those tests anyway. You're doing the same thing the rest of that file is doing, so leave it alone. I just wanted to make you aware that in other files there's a better way (creating an instance of RunLoop, getting a QuitClosure from it, then running that instance of run loop--it protects against nested run loops).
Message was sent while issue was closed.
Description was changed from ========== When URLRequestHttpJob processing cached response with GetDictionary header we already have seen the header in SdchOwner. So just ignore it. R=rdsmith@chromium.org ========== to ========== When URLRequestHttpJob processing cached response with GetDictionary header we already have seen the header in SdchOwner. So just ignore it. R=rdsmith@chromium.org ==========
Message was sent while issue was closed.
Committed patchset #4 (id:60001)
Message was sent while issue was closed.
Description was changed from ========== When URLRequestHttpJob processing cached response with GetDictionary header we already have seen the header in SdchOwner. So just ignore it. R=rdsmith@chromium.org ========== to ========== When URLRequestHttpJob processing cached response with GetDictionary header we already have seen the header in SdchOwner. So just ignore it. R=rdsmith@chromium.org Committed: https://crrev.com/41dc425d047c421c46d35b42587d71da6df37283 Cr-Commit-Position: refs/heads/master@{#372192} ==========
Message was sent while issue was closed.
Patchset 4 (id:??) landed as https://crrev.com/41dc425d047c421c46d35b42587d71da6df37283 Cr-Commit-Position: refs/heads/master@{#372192} |