net/http/http_cache_transaction.cc - Issue 23710059: Release the cache entry on deferred redirect.

Unified Diff: net/http/http_cache_transaction.cc

Issue 23710059: Release the cache entry on deferred redirect. (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master

Patch Set: Disable new tests on Chrome Frame Created 7 years, 3 months ago

Use n/p to move between diff chunks; N/P to move between comments. Draft comments are only viewable by you.

Jump to:

View side-by-side diff with in-line comments

Download patch

Index: net/http/http_cache_transaction.cc

diff --git a/net/http/http_cache_transaction.cc b/net/http/http_cache_transaction.cc

index a69bcc0b4397632ecec8b46f39dcadf913873cc0..1057e20e0d28ba364299bd552681fdd49d91ee0e 100644

--- a/net/http/http_cache_transaction.cc

+++ b/net/http/http_cache_transaction.cc

@@ -449,10 +449,12 @@ bool HttpCache::Transaction::GetFullRequestHeaders(

void HttpCache::Transaction::DoneReading() {

if (cache_.get() && entry_) {

- DCHECK(reading_);

- DCHECK_NE(mode_, UPDATE);

rvargas (doing something else) 2013/09/17 19:53:25 Is this line failing? UPDATE means that the caller

davidben 2013/09/17 22:16:04 Hrm. Yeah, I'm not sure why I removed it... I thin

- if (mode_ & WRITE)

+ if (mode_ & WRITE) {

DoneWritingToEntry(true);

+ } else {

+ cache_->DoneReadingFromEntry(entry_, this);

+ entry_ = NULL;

+ }

}

@@ -1340,10 +1342,6 @@ int HttpCache::Transaction::DoTruncateCachedMetadataComplete(int result) {

}

- // If this response is a redirect, then we can stop writing now. (We don't

- // need to cache the response body of a redirect.)

- if (response_.headers->IsRedirect(NULL))

- DoneWritingToEntry(true);

rvargas (doing something else) 2013/09/17 19:53:25 Even though it is redundant in the sense that the

davidben 2013/09/17 22:16:04 My reasoning for removing it was that this way we

rvargas (doing something else) 2013/09/18 18:37:10 I see your point. On the other hand, having to rel

mmenke 2013/09/18 18:40:39 We don't have to have the caller do anything but r

rvargas (doing something else) 2013/09/18 19:09:12 Here we are doing two things: If we know this is a

Here we are doing two things: If we know this is a redirect, we don't bother to store the response body regardless of the caller reading it or not. That's a small optimization that seems completely internal to the cache. Granted, it is small enough to consider removing it. The second thing is to expect a further call from the caller or not, in order to detect errors coming from higher layers. There, the regular contract is that the transaction should be read until we return 0. If that doesn't happen, we assume that there was an error and we should figure out if we want to discard the entry, mark it as truncated, or mark it as good. From this point of view, DoneReading is a signal that tells the cache that the result was OK, but the cache still has heuristics to determine by itself if the resource should be discarded in the absence of that signal. One of those heuristics is having received part of the response body (aka, having something valuable to preserve). This is another part of that heuristic: for a redirect, having part of the body is not significant. We are not removing all the heuristics and replacing them with expecting DoneReading to be called (and we cannot, because we still need a default if DoneReading is not called). Removing this part of the heuristic seems to me like formalizing a contract between the transaction and the job that says that caching of a redirect is completely under control of the job, that not being the case for any other return code.

davidben 2013/09/18 19:35:43 Well, it's a requirement to release the cache entr

Well, it's a requirement to release the cache entry. :-) Is it not also a requirement in the filter case though? Maybe I'm misunderstanding the original purpose of DoneReading.

But it's not completely internal to the cache, right? Although for this run, whether the body is cached or not does not affect the current caller, it affects the next caller. If the next caller happens to want to read it, they'll get something empty instead of real body. Which is why I thought it would be better if all the logic for whether a response body is read or not was in one place.

Isn't it then similarly under control of the job for filters, where DoneReading is also called? I reused DoneReading here just because it seemed like a similar signal; the filter logic knows the response body is useless past a certain point and, likewise, the redirect logic knows the response body is useless past a certain point that happens to be its start. It does seem weird to me that, in both cases, the information about where a request's response body effectively ends is at a higher level than the cache, rather than a lower level, but yeah.

mmenke 2013/09/18 19:38:31 The old contract is effectively, "After receiving

On 2013/09/18 19:09:12, rvargas wrote: > On 2013/09/18 18:40:39, mmenke wrote: > > On 2013/09/18 18:37:10, rvargas wrote: > > > On 2013/09/17 22:16:04, David Benjamin wrote: > > > > On 2013/09/17 19:53:25, rvargas wrote: > > > > > Even though it is redundant in the sense that the job _should_ call > > > > DoneReading, > > > > > it is better to keep this logic here so that we don't depend on the job > > > > issuing > > > > > that call. > > > > > > > > My reasoning for removing it was that this way we have only one place that > > > > declares redirects have no response data. Also redirects are something > that > > > > happens across transactions rather than within a single one, so moving it > > out > > > of > > > > HttpCache::Transaction seemed cleaner. > > > > > > > > I don't feel very strongly though. I can put it back if you prefer. > > > > > > I see your point. On the other hand, having to rely on the caller doing > > > something in order to cache a resource as opposed to making the decision by > > > itself seems weird. It is a trivial check to do, and leaves DoneReading as > an > > > optimization as opposed to requirement. > > > > > > I actually prefer leaving this check in. > > > > We don't have to have the caller do anything but read the body to cache the > > resource. Should the cache know the caller, on redirect, won't bother to read > > the body? That seems like a decision that should be up to the caller, not the > > cache. > > Here we are doing two things: If we know this is a redirect, we don't bother to > store the response body regardless of the caller reading it or not. That's a > small optimization that seems completely internal to the cache. Granted, it is > small enough to consider removing it. > > The second thing is to expect a further call from the caller or not, in order to > detect errors coming from higher layers. There, the regular contract is that the > transaction should be read until we return 0. If that doesn't happen, we assume > that there was an error and we should figure out if we want to discard the > entry, mark it as truncated, or mark it as good. From this point of view, > DoneReading is a signal that tells the cache that the result was OK, but the > cache still has heuristics to determine by itself if the resource should be > discarded in the absence of that signal.

The old contract is effectively, "After receiving the headers, if it's a redirect, we're done. Otherwise, we read until we get back a value <= 0, or until we abort." Normally if we abort, I'm assuming the cache assumes the transfer was incomplete, so we effectively don't cache the result. This CL moves/copies that "We never care about (or even try to read) the body of a redirect" logic into the UrlRequestJob instead of the HttpCacheTransaction. While I defer to you on this, I'm not really a fan of having that assumption duplicated at two different layers.

rvargas (doing something else) 2013/09/18 21:41:32 sure, but what the cache keeps or drops is not par

On 2013/09/18 19:38:31, mmenke wrote: > On 2013/09/18 19:09:12, rvargas wrote: > > On 2013/09/18 18:40:39, mmenke wrote: > > > On 2013/09/18 18:37:10, rvargas wrote: > > > > On 2013/09/17 22:16:04, David Benjamin wrote: > > > > > On 2013/09/17 19:53:25, rvargas wrote: > > > > > > Even though it is redundant in the sense that the job _should_ call > > > > > DoneReading, > > > > > > it is better to keep this logic here so that we don't depend on the > job > > > > > issuing > > > > > > that call. > > > > > > > > > > My reasoning for removing it was that this way we have only one place > that > > > > > declares redirects have no response data. Also redirects are something > > that > > > > > happens across transactions rather than within a single one, so moving > it > > > out > > > > of > > > > > HttpCache::Transaction seemed cleaner. > > > > > > > > > > I don't feel very strongly though. I can put it back if you prefer. > > > > > > > > I see your point. On the other hand, having to rely on the caller doing > > > > something in order to cache a resource as opposed to making the decision > by > > > > itself seems weird. It is a trivial check to do, and leaves DoneReading as > > an > > > > optimization as opposed to requirement. > > > > > > > > I actually prefer leaving this check in. > > > > > > We don't have to have the caller do anything but read the body to cache the > > > resource. Should the cache know the caller, on redirect, won't bother to > read > > > the body? That seems like a decision that should be up to the caller, not > the > > > cache. > > > > Here we are doing two things: If we know this is a redirect, we don't bother > to > > store the response body regardless of the caller reading it or not. That's a > > small optimization that seems completely internal to the cache. Granted, it is > > small enough to consider removing it. > > > > The second thing is to expect a further call from the caller or not, in order > to > > detect errors coming from higher layers. There, the regular contract is that > the > > transaction should be read until we return 0. If that doesn't happen, we > assume > > that there was an error and we should figure out if we want to discard the > > entry, mark it as truncated, or mark it as good. From this point of view, > > DoneReading is a signal that tells the cache that the result was OK, but the > > cache still has heuristics to determine by itself if the resource should be > > discarded in the absence of that signal. > > The old contract is effectively, "After receiving the headers, if it's a > redirect, we're done. Otherwise, we read until we get back a value <= 0, or > until we abort."

sure, but what the cache keeps or drops is not part of the contract so it is free to inspect whatever it wants to decide what to do.

rvargas (doing something else) 2013/09/18 21:41:32 Sort of. The contract for releasing the entry is t

Sort of. The contract for releasing the entry is that the transaction writing to it should be cancelled. I wouldn't really want to change the meaning of "I'm not going to read anymore" to be "release the entry for the next reader". In the case of filters, the result of not calling DoneReading() is that the resource will be marked as truncated (assuming there is something on the body), and the next user will have to revalidate and attempt to complete the download. That may or may not result in an unusable cached resource (as in it may not be possible to do a revalidation), so it _may_ degenerate into the same behavior for redirects (where we would drop the cached response).

That's a very good point. It would be bad to skip the body if the caller wants it for some reason. off with the check!

mmenke 2013/09/18 22:14:31 I think falsely claiming an HTTP redirect response

On 2013/09/18 21:41:32, rvargas wrote: > On 2013/09/18 19:38:31, mmenke wrote: > > On 2013/09/18 19:09:12, rvargas wrote: > > > On 2013/09/18 18:40:39, mmenke wrote: > > > > On 2013/09/18 18:37:10, rvargas wrote: > > > > > On 2013/09/17 22:16:04, David Benjamin wrote: > > > > > > On 2013/09/17 19:53:25, rvargas wrote: > > > > > > > Even though it is redundant in the sense that the job _should_ call > > > > > > DoneReading, > > > > > > > it is better to keep this logic here so that we don't depend on the > > job > > > > > > issuing > > > > > > > that call. > > > > > > > > > > > > My reasoning for removing it was that this way we have only one place > > that > > > > > > declares redirects have no response data. Also redirects are something > > > that > > > > > > happens across transactions rather than within a single one, so moving > > it > > > > out > > > > > of > > > > > > HttpCache::Transaction seemed cleaner. > > > > > > > > > > > > I don't feel very strongly though. I can put it back if you prefer. > > > > > > > > > > I see your point. On the other hand, having to rely on the caller doing > > > > > something in order to cache a resource as opposed to making the decision > > by > > > > > itself seems weird. It is a trivial check to do, and leaves DoneReading > as > > > an > > > > > optimization as opposed to requirement. > > > > > > > > > > I actually prefer leaving this check in. > > > > > > > > We don't have to have the caller do anything but read the body to cache > the > > > > resource. Should the cache know the caller, on redirect, won't bother to > > read > > > > the body? That seems like a decision that should be up to the caller, not > > the > > > > cache. > > > > > > Here we are doing two things: If we know this is a redirect, we don't bother > > to > > > store the response body regardless of the caller reading it or not. That's a > > > small optimization that seems completely internal to the cache. Granted, it > is > > > small enough to consider removing it. > > > > > > The second thing is to expect a further call from the caller or not, in > order > > to > > > detect errors coming from higher layers. There, the regular contract is that > > the > > > transaction should be read until we return 0. If that doesn't happen, we > > assume > > > that there was an error and we should figure out if we want to discard the > > > entry, mark it as truncated, or mark it as good. From this point of view, > > > DoneReading is a signal that tells the cache that the result was OK, but the > > > cache still has heuristics to determine by itself if the resource should be > > > discarded in the absence of that signal. > > > > The old contract is effectively, "After receiving the headers, if it's a > > redirect, we're done. Otherwise, we read until we get back a value <= 0, or > > until we abort." > > sure, but what the cache keeps or drops is not part of the contract so it is > free to inspect whatever it wants to decide what to do.

I think falsely claiming an HTTP redirect response has a 0-length body does call the contract into question a bit.

next_state_ = STATE_PARTIAL_HEADERS_RECEIVED;

return OK;

}

« no previous file with comments | « net/http/http_cache.cc ('k') | net/http/http_cache_unittest.cc » ('j') | net/http/http_cache_unittest.cc » ('J')