|
|
Index: net/http/http_cache_transaction.cc |
diff --git a/net/http/http_cache_transaction.cc b/net/http/http_cache_transaction.cc |
index a69bcc0b4397632ecec8b46f39dcadf913873cc0..1057e20e0d28ba364299bd552681fdd49d91ee0e 100644 |
--- a/net/http/http_cache_transaction.cc |
+++ b/net/http/http_cache_transaction.cc |
@@ -449,10 +449,12 @@ bool HttpCache::Transaction::GetFullRequestHeaders( |
void HttpCache::Transaction::DoneReading() { |
if (cache_.get() && entry_) { |
- DCHECK(reading_); |
- DCHECK_NE(mode_, UPDATE); |
rvargas (doing something else)
2013/09/17 19:53:25
Is this line failing? UPDATE means that the caller
Is this line failing? UPDATE means that the caller is attempting manual
revalidation, and the transaction is not even able to read the data, just the
headers.
Being here for UPDATE means that something is weird and most likely someone is
doing something unexpected. Maybe it triggers because line 1346 is gone...
davidben
2013/09/17 22:16:04
Hrm. Yeah, I'm not sure why I removed it... I thin
On 2013/09/17 19:53:25, rvargas wrote:
> Is this line failing? UPDATE means that the caller is attempting manual
> revalidation, and the transaction is not even able to read the data, just the
> headers.
>
> Being here for UPDATE means that something is weird and most likely someone is
> doing something unexpected. Maybe it triggers because line 1346 is gone...
Hrm. Yeah, I'm not sure why I removed it... I think I was confused about the
state machine and thought it was actually legitimate state to be in here.
Keeping it in there doesn't break net_unittests for me, and makes sense on a
second read through the code. I've put it back.
|
- if (mode_ & WRITE) |
+ if (mode_ & WRITE) { |
DoneWritingToEntry(true); |
+ } else { |
+ cache_->DoneReadingFromEntry(entry_, this); |
+ entry_ = NULL; |
+ } |
} |
} |
@@ -1340,10 +1342,6 @@ int HttpCache::Transaction::DoTruncateCachedMetadataComplete(int result) { |
} |
} |
- // If this response is a redirect, then we can stop writing now. (We don't |
- // need to cache the response body of a redirect.) |
- if (response_.headers->IsRedirect(NULL)) |
- DoneWritingToEntry(true); |
rvargas (doing something else)
2013/09/17 19:53:25
Even though it is redundant in the sense that the
Even though it is redundant in the sense that the job _should_ call DoneReading,
it is better to keep this logic here so that we don't depend on the job issuing
that call.
davidben
2013/09/17 22:16:04
My reasoning for removing it was that this way we
On 2013/09/17 19:53:25, rvargas wrote:
> Even though it is redundant in the sense that the job _should_ call
DoneReading,
> it is better to keep this logic here so that we don't depend on the job
issuing
> that call.
My reasoning for removing it was that this way we have only one place that
declares redirects have no response data. Also redirects are something that
happens across transactions rather than within a single one, so moving it out of
HttpCache::Transaction seemed cleaner.
I don't feel very strongly though. I can put it back if you prefer.
rvargas (doing something else)
2013/09/18 18:37:10
I see your point. On the other hand, having to rel
On 2013/09/17 22:16:04, David Benjamin wrote:
> On 2013/09/17 19:53:25, rvargas wrote:
> > Even though it is redundant in the sense that the job _should_ call
> DoneReading,
> > it is better to keep this logic here so that we don't depend on the job
> issuing
> > that call.
>
> My reasoning for removing it was that this way we have only one place that
> declares redirects have no response data. Also redirects are something that
> happens across transactions rather than within a single one, so moving it out
of
> HttpCache::Transaction seemed cleaner.
>
> I don't feel very strongly though. I can put it back if you prefer.
I see your point. On the other hand, having to rely on the caller doing
something in order to cache a resource as opposed to making the decision by
itself seems weird. It is a trivial check to do, and leaves DoneReading as an
optimization as opposed to requirement.
I actually prefer leaving this check in.
mmenke
2013/09/18 18:40:39
We don't have to have the caller do anything but r
On 2013/09/18 18:37:10, rvargas wrote:
> On 2013/09/17 22:16:04, David Benjamin wrote:
> > On 2013/09/17 19:53:25, rvargas wrote:
> > > Even though it is redundant in the sense that the job _should_ call
> > DoneReading,
> > > it is better to keep this logic here so that we don't depend on the job
> > issuing
> > > that call.
> >
> > My reasoning for removing it was that this way we have only one place that
> > declares redirects have no response data. Also redirects are something that
> > happens across transactions rather than within a single one, so moving it
out
> of
> > HttpCache::Transaction seemed cleaner.
> >
> > I don't feel very strongly though. I can put it back if you prefer.
>
> I see your point. On the other hand, having to rely on the caller doing
> something in order to cache a resource as opposed to making the decision by
> itself seems weird. It is a trivial check to do, and leaves DoneReading as an
> optimization as opposed to requirement.
>
> I actually prefer leaving this check in.
We don't have to have the caller do anything but read the body to cache the
resource. Should the cache know the caller, on redirect, won't bother to read
the body? That seems like a decision that should be up to the caller, not the
cache.
rvargas (doing something else)
2013/09/18 19:09:12
Here we are doing two things: If we know this is a
On 2013/09/18 18:40:39, mmenke wrote:
> On 2013/09/18 18:37:10, rvargas wrote:
> > On 2013/09/17 22:16:04, David Benjamin wrote:
> > > On 2013/09/17 19:53:25, rvargas wrote:
> > > > Even though it is redundant in the sense that the job _should_ call
> > > DoneReading,
> > > > it is better to keep this logic here so that we don't depend on the job
> > > issuing
> > > > that call.
> > >
> > > My reasoning for removing it was that this way we have only one place that
> > > declares redirects have no response data. Also redirects are something
that
> > > happens across transactions rather than within a single one, so moving it
> out
> > of
> > > HttpCache::Transaction seemed cleaner.
> > >
> > > I don't feel very strongly though. I can put it back if you prefer.
> >
> > I see your point. On the other hand, having to rely on the caller doing
> > something in order to cache a resource as opposed to making the decision by
> > itself seems weird. It is a trivial check to do, and leaves DoneReading as
an
> > optimization as opposed to requirement.
> >
> > I actually prefer leaving this check in.
>
> We don't have to have the caller do anything but read the body to cache the
> resource. Should the cache know the caller, on redirect, won't bother to read
> the body? That seems like a decision that should be up to the caller, not the
> cache.
Here we are doing two things: If we know this is a redirect, we don't bother to
store the response body regardless of the caller reading it or not. That's a
small optimization that seems completely internal to the cache. Granted, it is
small enough to consider removing it.
The second thing is to expect a further call from the caller or not, in order to
detect errors coming from higher layers. There, the regular contract is that the
transaction should be read until we return 0. If that doesn't happen, we assume
that there was an error and we should figure out if we want to discard the
entry, mark it as truncated, or mark it as good. From this point of view,
DoneReading is a signal that tells the cache that the result was OK, but the
cache still has heuristics to determine by itself if the resource should be
discarded in the absence of that signal.
One of those heuristics is having received part of the response body (aka,
having something valuable to preserve). This is another part of that heuristic:
for a redirect, having part of the body is not significant.
We are not removing all the heuristics and replacing them with expecting
DoneReading to be called (and we cannot, because we still need a default if
DoneReading is not called).
Removing this part of the heuristic seems to me like formalizing a contract
between the transaction and the job that says that caching of a redirect is
completely under control of the job, that not being the case for any other
return code.
davidben
2013/09/18 19:35:43
Well, it's a requirement to release the cache entr
On 2013/09/18 19:09:12, rvargas wrote:
> On 2013/09/18 18:40:39, mmenke wrote:
> > On 2013/09/18 18:37:10, rvargas wrote:
> > > On 2013/09/17 22:16:04, David Benjamin wrote:
> > > > On 2013/09/17 19:53:25, rvargas wrote:
> > > > > Even though it is redundant in the sense that the job _should_ call
> > > > DoneReading,
> > > > > it is better to keep this logic here so that we don't depend on the
job
> > > > issuing
> > > > > that call.
> > > >
> > > > My reasoning for removing it was that this way we have only one place
that
> > > > declares redirects have no response data. Also redirects are something
> that
> > > > happens across transactions rather than within a single one, so moving
it
> > out
> > > of
> > > > HttpCache::Transaction seemed cleaner.
> > > >
> > > > I don't feel very strongly though. I can put it back if you prefer.
> > >
> > > I see your point. On the other hand, having to rely on the caller doing
> > > something in order to cache a resource as opposed to making the decision
by
> > > itself seems weird. It is a trivial check to do, and leaves DoneReading as
> an
> > > optimization as opposed to requirement.
> > >
> > > I actually prefer leaving this check in.
Well, it's a requirement to release the cache entry. :-)
Is it not also a requirement in the filter case though? Maybe I'm
misunderstanding the original purpose of DoneReading.
> > We don't have to have the caller do anything but read the body to cache the
> > resource. Should the cache know the caller, on redirect, won't bother to
read
> > the body? That seems like a decision that should be up to the caller, not
the
> > cache.
>
> Here we are doing two things: If we know this is a redirect, we don't bother
to
> store the response body regardless of the caller reading it or not. That's a
> small optimization that seems completely internal to the cache. Granted, it is
> small enough to consider removing it.
But it's not completely internal to the cache, right? Although for this run,
whether the body is cached or not does not affect the current caller, it affects
the next caller. If the next caller happens to want to read it, they'll get
something empty instead of real body. Which is why I thought it would be better
if all the logic for whether a response body is read or not was in one place.
> The second thing is to expect a further call from the caller or not, in order
to
> detect errors coming from higher layers. There, the regular contract is that
the
> transaction should be read until we return 0. If that doesn't happen, we
assume
> that there was an error and we should figure out if we want to discard the
> entry, mark it as truncated, or mark it as good. From this point of view,
> DoneReading is a signal that tells the cache that the result was OK, but the
> cache still has heuristics to determine by itself if the resource should be
> discarded in the absence of that signal.
>
> One of those heuristics is having received part of the response body (aka,
> having something valuable to preserve). This is another part of that
heuristic:
> for a redirect, having part of the body is not significant.
>
> We are not removing all the heuristics and replacing them with expecting
> DoneReading to be called (and we cannot, because we still need a default if
> DoneReading is not called).
>
> Removing this part of the heuristic seems to me like formalizing a contract
> between the transaction and the job that says that caching of a redirect is
> completely under control of the job, that not being the case for any other
> return code.
Isn't it then similarly under control of the job for filters, where DoneReading
is also called? I reused DoneReading here just because it seemed like a similar
signal; the filter logic knows the response body is useless past a certain point
and, likewise, the redirect logic knows the response body is useless past a
certain point that happens to be its start. It does seem weird to me that, in
both cases, the information about where a request's response body effectively
ends is at a higher level than the cache, rather than a lower level, but yeah.
mmenke
2013/09/18 19:38:31
The old contract is effectively, "After receiving
On 2013/09/18 19:09:12, rvargas wrote:
> On 2013/09/18 18:40:39, mmenke wrote:
> > On 2013/09/18 18:37:10, rvargas wrote:
> > > On 2013/09/17 22:16:04, David Benjamin wrote:
> > > > On 2013/09/17 19:53:25, rvargas wrote:
> > > > > Even though it is redundant in the sense that the job _should_ call
> > > > DoneReading,
> > > > > it is better to keep this logic here so that we don't depend on the
job
> > > > issuing
> > > > > that call.
> > > >
> > > > My reasoning for removing it was that this way we have only one place
that
> > > > declares redirects have no response data. Also redirects are something
> that
> > > > happens across transactions rather than within a single one, so moving
it
> > out
> > > of
> > > > HttpCache::Transaction seemed cleaner.
> > > >
> > > > I don't feel very strongly though. I can put it back if you prefer.
> > >
> > > I see your point. On the other hand, having to rely on the caller doing
> > > something in order to cache a resource as opposed to making the decision
by
> > > itself seems weird. It is a trivial check to do, and leaves DoneReading as
> an
> > > optimization as opposed to requirement.
> > >
> > > I actually prefer leaving this check in.
> >
> > We don't have to have the caller do anything but read the body to cache the
> > resource. Should the cache know the caller, on redirect, won't bother to
read
> > the body? That seems like a decision that should be up to the caller, not
the
> > cache.
>
> Here we are doing two things: If we know this is a redirect, we don't bother
to
> store the response body regardless of the caller reading it or not. That's a
> small optimization that seems completely internal to the cache. Granted, it is
> small enough to consider removing it.
>
> The second thing is to expect a further call from the caller or not, in order
to
> detect errors coming from higher layers. There, the regular contract is that
the
> transaction should be read until we return 0. If that doesn't happen, we
assume
> that there was an error and we should figure out if we want to discard the
> entry, mark it as truncated, or mark it as good. From this point of view,
> DoneReading is a signal that tells the cache that the result was OK, but the
> cache still has heuristics to determine by itself if the resource should be
> discarded in the absence of that signal.
The old contract is effectively, "After receiving the headers, if it's a
redirect, we're done. Otherwise, we read until we get back a value <= 0, or
until we abort." Normally if we abort, I'm assuming the cache assumes the
transfer was incomplete, so we effectively don't cache the result.
This CL moves/copies that "We never care about (or even try to read) the body of
a redirect" logic into the UrlRequestJob instead of the HttpCacheTransaction.
While I defer to you on this, I'm not really a fan of having that assumption
duplicated at two different layers.
> One of those heuristics is having received part of the response body (aka,
> having something valuable to preserve). This is another part of that
heuristic:
> for a redirect, having part of the body is not significant.
>
> We are not removing all the heuristics and replacing them with expecting
> DoneReading to be called (and we cannot, because we still need a default if
> DoneReading is not called).
>
> Removing this part of the heuristic seems to me like formalizing a contract
> between the transaction and the job that says that caching of a redirect is
> completely under control of the job, that not being the case for any other
> return code.
rvargas (doing something else)
2013/09/18 21:41:32
sure, but what the cache keeps or drops is not par
On 2013/09/18 19:38:31, mmenke wrote:
> On 2013/09/18 19:09:12, rvargas wrote:
> > On 2013/09/18 18:40:39, mmenke wrote:
> > > On 2013/09/18 18:37:10, rvargas wrote:
> > > > On 2013/09/17 22:16:04, David Benjamin wrote:
> > > > > On 2013/09/17 19:53:25, rvargas wrote:
> > > > > > Even though it is redundant in the sense that the job _should_ call
> > > > > DoneReading,
> > > > > > it is better to keep this logic here so that we don't depend on the
> job
> > > > > issuing
> > > > > > that call.
> > > > >
> > > > > My reasoning for removing it was that this way we have only one place
> that
> > > > > declares redirects have no response data. Also redirects are something
> > that
> > > > > happens across transactions rather than within a single one, so moving
> it
> > > out
> > > > of
> > > > > HttpCache::Transaction seemed cleaner.
> > > > >
> > > > > I don't feel very strongly though. I can put it back if you prefer.
> > > >
> > > > I see your point. On the other hand, having to rely on the caller doing
> > > > something in order to cache a resource as opposed to making the decision
> by
> > > > itself seems weird. It is a trivial check to do, and leaves DoneReading
as
> > an
> > > > optimization as opposed to requirement.
> > > >
> > > > I actually prefer leaving this check in.
> > >
> > > We don't have to have the caller do anything but read the body to cache
the
> > > resource. Should the cache know the caller, on redirect, won't bother to
> read
> > > the body? That seems like a decision that should be up to the caller, not
> the
> > > cache.
> >
> > Here we are doing two things: If we know this is a redirect, we don't bother
> to
> > store the response body regardless of the caller reading it or not. That's a
> > small optimization that seems completely internal to the cache. Granted, it
is
> > small enough to consider removing it.
> >
> > The second thing is to expect a further call from the caller or not, in
order
> to
> > detect errors coming from higher layers. There, the regular contract is that
> the
> > transaction should be read until we return 0. If that doesn't happen, we
> assume
> > that there was an error and we should figure out if we want to discard the
> > entry, mark it as truncated, or mark it as good. From this point of view,
> > DoneReading is a signal that tells the cache that the result was OK, but the
> > cache still has heuristics to determine by itself if the resource should be
> > discarded in the absence of that signal.
>
> The old contract is effectively, "After receiving the headers, if it's a
> redirect, we're done. Otherwise, we read until we get back a value <= 0, or
> until we abort."
sure, but what the cache keeps or drops is not part of the contract so it is
free to inspect whatever it wants to decide what to do.
> Normally if we abort, I'm assuming the cache assumes the
> transfer was incomplete, so we effectively don't cache the result.
>
> This CL moves/copies that "We never care about (or even try to read) the body
of
> a redirect" logic into the UrlRequestJob instead of the HttpCacheTransaction.
> While I defer to you on this, I'm not really a fan of having that assumption
> duplicated at two different layers.
>
> > One of those heuristics is having received part of the response body (aka,
> > having something valuable to preserve). This is another part of that
> heuristic:
> > for a redirect, having part of the body is not significant.
> >
> > We are not removing all the heuristics and replacing them with expecting
> > DoneReading to be called (and we cannot, because we still need a default if
> > DoneReading is not called).
> >
> > Removing this part of the heuristic seems to me like formalizing a contract
> > between the transaction and the job that says that caching of a redirect is
> > completely under control of the job, that not being the case for any other
> > return code.
>
rvargas (doing something else)
2013/09/18 21:41:32
Sort of. The contract for releasing the entry is t
On 2013/09/18 19:35:43, David Benjamin wrote:
> On 2013/09/18 19:09:12, rvargas wrote:
> > On 2013/09/18 18:40:39, mmenke wrote:
> > > On 2013/09/18 18:37:10, rvargas wrote:
> > > > On 2013/09/17 22:16:04, David Benjamin wrote:
> > > > > On 2013/09/17 19:53:25, rvargas wrote:
> > > > > > Even though it is redundant in the sense that the job _should_ call
> > > > > DoneReading,
> > > > > > it is better to keep this logic here so that we don't depend on the
> job
> > > > > issuing
> > > > > > that call.
> > > > >
> > > > > My reasoning for removing it was that this way we have only one place
> that
> > > > > declares redirects have no response data. Also redirects are something
> > that
> > > > > happens across transactions rather than within a single one, so moving
> it
> > > out
> > > > of
> > > > > HttpCache::Transaction seemed cleaner.
> > > > >
> > > > > I don't feel very strongly though. I can put it back if you prefer.
> > > >
> > > > I see your point. On the other hand, having to rely on the caller doing
> > > > something in order to cache a resource as opposed to making the decision
> by
> > > > itself seems weird. It is a trivial check to do, and leaves DoneReading
as
> > an
> > > > optimization as opposed to requirement.
> > > >
> > > > I actually prefer leaving this check in.
>
> Well, it's a requirement to release the cache entry. :-)
>
> Is it not also a requirement in the filter case though? Maybe I'm
> misunderstanding the original purpose of DoneReading.
Sort of. The contract for releasing the entry is that the transaction writing to
it should be cancelled. I wouldn't really want to change the meaning of "I'm not
going to read anymore" to be "release the entry for the next reader".
In the case of filters, the result of not calling DoneReading() is that the
resource will be marked as truncated (assuming there is something on the body),
and the next user will have to revalidate and attempt to complete the download.
That may or may not result in an unusable cached resource (as in it may not be
possible to do a revalidation), so it _may_ degenerate into the same behavior
for redirects (where we would drop the cached response).
>
>
> > > We don't have to have the caller do anything but read the body to cache
the
> > > resource. Should the cache know the caller, on redirect, won't bother to
> read
> > > the body? That seems like a decision that should be up to the caller, not
> the
> > > cache.
> >
> > Here we are doing two things: If we know this is a redirect, we don't bother
> to
> > store the response body regardless of the caller reading it or not. That's a
> > small optimization that seems completely internal to the cache. Granted, it
is
> > small enough to consider removing it.
>
> But it's not completely internal to the cache, right? Although for this run,
> whether the body is cached or not does not affect the current caller, it
affects
> the next caller. If the next caller happens to want to read it, they'll get
> something empty instead of real body. Which is why I thought it would be
better
> if all the logic for whether a response body is read or not was in one place.
That's a very good point. It would be bad to skip the body if the caller wants
it for some reason.
off with the check!
>
>
> > The second thing is to expect a further call from the caller or not, in
order
> to
> > detect errors coming from higher layers. There, the regular contract is that
> the
> > transaction should be read until we return 0. If that doesn't happen, we
> assume
> > that there was an error and we should figure out if we want to discard the
> > entry, mark it as truncated, or mark it as good. From this point of view,
> > DoneReading is a signal that tells the cache that the result was OK, but the
> > cache still has heuristics to determine by itself if the resource should be
> > discarded in the absence of that signal.
> >
> > One of those heuristics is having received part of the response body (aka,
> > having something valuable to preserve). This is another part of that
> heuristic:
> > for a redirect, having part of the body is not significant.
> >
> > We are not removing all the heuristics and replacing them with expecting
> > DoneReading to be called (and we cannot, because we still need a default if
> > DoneReading is not called).
> >
> > Removing this part of the heuristic seems to me like formalizing a contract
> > between the transaction and the job that says that caching of a redirect is
> > completely under control of the job, that not being the case for any other
> > return code.
>
> Isn't it then similarly under control of the job for filters, where
DoneReading
> is also called? I reused DoneReading here just because it seemed like a
similar
> signal; the filter logic knows the response body is useless past a certain
point
> and, likewise, the redirect logic knows the response body is useless past a
> certain point that happens to be its start. It does seem weird to me that, in
> both cases, the information about where a request's response body effectively
> ends is at a higher level than the cache, rather than a lower level, but yeah.
mmenke
2013/09/18 22:14:31
I think falsely claiming an HTTP redirect response
On 2013/09/18 21:41:32, rvargas wrote:
> On 2013/09/18 19:38:31, mmenke wrote:
> > On 2013/09/18 19:09:12, rvargas wrote:
> > > On 2013/09/18 18:40:39, mmenke wrote:
> > > > On 2013/09/18 18:37:10, rvargas wrote:
> > > > > On 2013/09/17 22:16:04, David Benjamin wrote:
> > > > > > On 2013/09/17 19:53:25, rvargas wrote:
> > > > > > > Even though it is redundant in the sense that the job _should_
call
> > > > > > DoneReading,
> > > > > > > it is better to keep this logic here so that we don't depend on
the
> > job
> > > > > > issuing
> > > > > > > that call.
> > > > > >
> > > > > > My reasoning for removing it was that this way we have only one
place
> > that
> > > > > > declares redirects have no response data. Also redirects are
something
> > > that
> > > > > > happens across transactions rather than within a single one, so
moving
> > it
> > > > out
> > > > > of
> > > > > > HttpCache::Transaction seemed cleaner.
> > > > > >
> > > > > > I don't feel very strongly though. I can put it back if you prefer.
> > > > >
> > > > > I see your point. On the other hand, having to rely on the caller
doing
> > > > > something in order to cache a resource as opposed to making the
decision
> > by
> > > > > itself seems weird. It is a trivial check to do, and leaves
DoneReading
> as
> > > an
> > > > > optimization as opposed to requirement.
> > > > >
> > > > > I actually prefer leaving this check in.
> > > >
> > > > We don't have to have the caller do anything but read the body to cache
> the
> > > > resource. Should the cache know the caller, on redirect, won't bother
to
> > read
> > > > the body? That seems like a decision that should be up to the caller,
not
> > the
> > > > cache.
> > >
> > > Here we are doing two things: If we know this is a redirect, we don't
bother
> > to
> > > store the response body regardless of the caller reading it or not. That's
a
> > > small optimization that seems completely internal to the cache. Granted,
it
> is
> > > small enough to consider removing it.
> > >
> > > The second thing is to expect a further call from the caller or not, in
> order
> > to
> > > detect errors coming from higher layers. There, the regular contract is
that
> > the
> > > transaction should be read until we return 0. If that doesn't happen, we
> > assume
> > > that there was an error and we should figure out if we want to discard the
> > > entry, mark it as truncated, or mark it as good. From this point of view,
> > > DoneReading is a signal that tells the cache that the result was OK, but
the
> > > cache still has heuristics to determine by itself if the resource should
be
> > > discarded in the absence of that signal.
> >
> > The old contract is effectively, "After receiving the headers, if it's a
> > redirect, we're done. Otherwise, we read until we get back a value <= 0, or
> > until we abort."
>
> sure, but what the cache keeps or drops is not part of the contract so it is
> free to inspect whatever it wants to decide what to do.
I think falsely claiming an HTTP redirect response has a 0-length body does call
the contract into question a bit.
|
next_state_ = STATE_PARTIAL_HEADERS_RECEIVED; |
return OK; |
} |