Chromium Code Reviews

Issue 517043: Http cache: Avoid resuming (and keeping) truncated entries... (Closed)

Created:
10 years, 11 months ago by rvargas (doing something else)
Modified:
9 years, 6 months ago
Reviewers:
eroman
CC:
chromium-reviews@googlegroups.com, darin (slow to review), Paweł Hajdan Jr.
Visibility:
Public.

Description

Http cache: Avoid resuming (and keeping) truncated entries if the server doesn't provide a strong validator. We require:
- A strong etag or a strong last-modified date.
- The total content length.
- Lack of an explicit rejection of ranges (Accept-Ranges: none).
This aligns better with the conditions used by Firefox.

BUG=30220, b/2329250
TEST=unittests

Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=35761
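Taken together, the conditions in the description amount to a predicate over the cached response headers. A minimal standalone sketch (not the actual HttpCache::Transaction code; the function name, parameters, and the simplified validator handling are illustrative only):

```cpp
#include <string>

// Hypothetical sketch of the resume-eligibility test described above.
// A truncated entry may be resumed only if the server gave us a strong
// validator, the total content length, and did not reject ranges outright.
bool CanResumeTruncatedEntry(const std::string& etag,
                             const std::string& last_modified,
                             long long content_length,
                             const std::string& accept_ranges) {
  // Weak validators ("W/...") must not be used for byte-range requests.
  bool strong_etag = !etag.empty() && etag.compare(0, 2, "W/") != 0;
  // Last-Modified strength (the "old enough" heuristic) is elided here.
  bool strong_last_modified = !last_modified.empty();
  if (!strong_etag && !strong_last_modified)
    return false;             // no strong validator at all
  if (content_length <= 0)
    return false;             // total length must be known
  if (accept_ranges == "none")
    return false;             // server explicitly rejects range requests
  return true;
}
```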

Patch Set 1 #

Patch Set 2 : '' #

Total comments: 3

Patch Set 3 : '' #

Patch Set 4 : '' #

Patch Set 5 : Add Accept-Ranges:none check #

Unified diffs Side-by-side diffs Delta from patch set Stats (+264 lines, -17 lines) Patch
M net/http/http_cache_transaction.cc View 1 2 3 4 3 chunks +10 lines, -2 lines 0 comments Download
M net/http/http_cache_unittest.cc View 1 2 3 4 7 chunks +155 lines, -11 lines 0 comments Download
M net/http/http_response_headers.h View 1 2 3 1 chunk +4 lines, -0 lines 0 comments Download
M net/http/http_response_headers.cc View 1 2 3 1 chunk +26 lines, -0 lines 0 comments Download
M net/http/http_response_headers_unittest.cc View 1 2 3 1 chunk +54 lines, -0 lines 0 comments Download
M net/http/partial_data.cc View 3 chunks +15 lines, -4 lines 0 comments Download

Messages

Total messages: 11 (0 generated)
rvargas (doing something else)
10 years, 11 months ago (2010-01-06 00:35:41 UTC) #1
eroman
http://codereview.chromium.org/517043/diff/4001/4004 File net/http/http_response_headers.cc (right): http://codereview.chromium.org/517043/diff/4001/4004#newcode1015 net/http/http_response_headers.cc:1015: if (slash == std::string::npos || !slash) nit: i think ...
10 years, 11 months ago (2010-01-06 02:20:33 UTC) #2
rvargas (doing something else)
http://codereview.chromium.org/517043/diff/4001/4006 File net/http/http_response_headers_unittest.cc (right): http://codereview.chromium.org/517043/diff/4001/4006#newcode1448 net/http/http_response_headers_unittest.cc:1448: "ETag: \"foo\"\n", On 2010/01/06 02:20:33, eroman wrote: > Why ...
10 years, 11 months ago (2010-01-06 02:25:18 UTC) #3
eroman
lgtm
10 years, 11 months ago (2010-01-06 03:58:01 UTC) #4
darin (slow to review)
Why require both etag and a last modified date? Why the strong check? Firefox uses ...
10 years, 11 months ago (2010-01-06 06:03:52 UTC) #5
rvargas (doing something else)
On 2010/01/06 06:03:52, darin wrote: > Why require both etag and a last modified date? ...
10 years, 11 months ago (2010-01-06 19:05:45 UTC) #6
rvargas (doing something else)
Or do you think it is better to also use last-modified (when there is no ...
10 years, 11 months ago (2010-01-07 01:37:13 UTC) #7
darin (slow to review)
On Wed, Jan 6, 2010 at 11:05 AM, <rvargas@chromium.org> wrote: > On 2010/01/06 06:03:52, darin ...
10 years, 11 months ago (2010-01-07 08:14:15 UTC) #8
rvargas (doing something else)
On Thu, Jan 7, 2010 at 12:14 AM, Darin Fisher <darin@chromium.org> wrote: > > > ...
10 years, 11 months ago (2010-01-07 19:12:30 UTC) #9
rvargas (doing something else)
> > Firefox only has support for resuming an entry (not for caching random > ...
10 years, 11 months ago (2010-01-07 19:14:07 UTC) #10
darin (slow to review)
10 years, 11 months ago (2010-01-07 19:22:53 UTC) #11
On Thu, Jan 7, 2010 at 11:12 AM, Ricardo Vargas <rvargas@chromium.org> wrote:

>
> On Thu, Jan 7, 2010 at 12:14 AM, Darin Fisher <darin@chromium.org> wrote:
>
>>
>>
>> On Wed, Jan 6, 2010 at 11:05 AM, <rvargas@chromium.org> wrote:
>>
>>> On 2010/01/06 06:03:52, darin wrote:
>>>> Why require both etag and a last modified date?  Why the strong check?
>>>> Firefox uses these checks:
>>>> http://mxr.mozilla.org/mozilla/source/netwerk/protocol/http/src/nsHttpRespons...
>>>> -Darin
>>>
>>> I guess I misunderstood the comment about what the entity id is
>>> (escaped_etag/size/lastmod):
>>> http://mxr.mozilla.org/firefox/source/netwerk/protocol/http/src/nsHttpChannel...
>>>
>> That is part of the code used to support nsHttpChannel::ResumeAt, which is
>> for resuming downloads.  The user of the API is explicitly asking for a
>> range request.  That's why an If-Match / If-Unmodified-Since request is
>> made.
>>
>> That's very different from the code used to transparently complete a
>> partial cache entry.
>>
>> See:
>> http://mxr.mozilla.org/firefox/source/netwerk/protocol/http/src/nsHttpChannel...
>>
>>> so I thought that all three parts were required by FF.
>>>
>>> However, given that the Last-Modified date is not as strong as an etag,
>>> and that
>>> if we really wanted to use that we should probably need to use the 60
>>> secs
>>> heuristic (and that also requires looking for a Date header so the actual
>>> required headers are even more), I thought that a good compromise was to
>>> settle
>>> for any type of Last-modified but still with a strong etag... just to be
>>> more
>>> conservative as to when to use the optimization. (Using weak validators
>>> for a
>>> byte-range request is a client bug).
>>>
>>> Note that in this particular bug (the internal one), the use of strong or
>>> weak
>>> validators doesn't change the outcome (just etag, just last-modified or
>>> both),
>>> and we would be relying on the lack of a content-length header.
>>>
>>> It also seems that looking for "Accept-Ranges: bytes" is kind of superfluous
>>> since the
>>> spec clearly states that any server must be prepared to receive byte
>>> range
>>> requests even if its support was not advertised... and this particular
>>> server/proxy replies to that request, it's just that it changed the
>>> content. I'd
>>> expect most servers will not bother advertising that.
>>>
>>> I was actually thinking about adding a histogram to measure the presence
>>> of
>>> these headers in the wild. I'd be happy to drop the requirement of
>>> last-modified
>>> but I'm not so sure about dropping the etag requirement.
>>>
>>>
>> I think matching Firefox is a good idea.  It is well supported.
>>
>> Send "If-Range: <etag-value>" if there was an ETag header, else send
>> "If-Range: <last-modified-value>" if there was a Last-Modified header.
>>
>
> This is exactly what we do... for the second request that we make (the one
> that actually fetches the end of the entry).
>
> Firefox only has support for resuming an entry (not for caching random
> changes) so they can simply ask the server for the last part and then (after
> receiving the response) start reading from the cache in order to return data
> to the caller. We have a lot of infrastructure already to support arbitrary
> combinations of data stored/not stored by the cache so in order to complete
> a truncated request we simply re-use all that code... so we end up making
> two requests, the first one to validate the cached data and the second one
> to fetch the end of the resource. This is slightly less efficient but way
> simpler than adding more logic to special case truncated entries.
>
> So I don't think we should match "exactly" what ff does.
>
> Going back to my question, regardless of the reason for the request
> (resuming a download or transparently completing one), weak validators
> should not be used for a byte range request. We could follow the spec and
> allow a single strong one (either etag or last-modified), or we could be
> more restrictive. We already agreed in using content-length as a required
> header (not required by the spec), and I was being even more cautious this
> time and reducing even further our chances of having corrupt data when
> servers/proxies don't do what they should... but I'm now leaning toward not
> being so paranoid and just allowing a last-modified by itself (after applying
> the heuristic of being old enough).
>
> By the way, I'm not doing that in this CL, but I was thinking that the
> behavior (whatever we decide to do) should also be applied for other
> byte-range requests, e.g. media resources (although to be fair, in that
> case we don't mix data received with 200 with data received with 206 so we
> don't have to require content-length (we can't), but we should check for
> weak etags/dates)
>


Thanks for the clear explanation.  I'm convinced.

-Darin
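The fallback Darin describes (put the ETag in If-Range if present, else the Last-Modified value) applies to the second request, the one that fetches the tail of the truncated entry. A hypothetical helper, not Chromium's actual PartialData code, could look like this (names and the header string layout are illustrative):

```cpp
#include <string>

// Sketch: build the extra headers for the byte-range request that
// completes a truncated entry, resuming at the first missing byte.
// Prefer the ETag as the If-Range validator; fall back to Last-Modified.
std::string BuildRangeRequestHeaders(const std::string& etag,
                                     const std::string& last_modified,
                                     long long cached_bytes) {
  std::string headers =
      "Range: bytes=" + std::to_string(cached_bytes) + "-\r\n";
  if (!etag.empty())
    headers += "If-Range: " + etag + "\r\n";
  else if (!last_modified.empty())
    headers += "If-Range: " + last_modified + "\r\n";
  return headers;
}
```

If the validator no longer matches, the server answers 200 with the full entity instead of 206, and the stale cached prefix must be discarded.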



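The "heuristic of being old enough" mentioned in the thread comes from RFC 2616 section 13.3.3: a Last-Modified value may be treated as an implicitly strong validator only when it is at least 60 seconds older than the response's Date header. A sketch of that check (hypothetical helper; parsing of the header dates into time_t is assumed to have happened elsewhere):

```cpp
#include <ctime>

// Sketch of the Last-Modified strength heuristic (RFC 2616 sec. 13.3.3):
// treat Last-Modified as strong only if it predates the Date header by
// a comfortable margin, so the resource was not changing as it was served.
bool IsStrongLastModified(time_t last_modified, time_t response_date) {
  const time_t kMinAgeSeconds = 60;  // margin suggested by the RFC
  return last_modified > 0 && response_date > 0 &&
         response_date - last_modified >= kMinAgeSeconds;
}
```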
