|
|
Created:
5 years, 8 months ago by mmenke Modified:
5 years, 8 months ago CC:
chromium-reviews, cbentzel+watch_chromium.org, asvitkine+watch_chromium.org, rkaplow Base URL:
https://chromium.googlesource.com/chromium/src.git@fetcher4 Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionAdd histograms for a number of hacks in the HTTP header parsing logic.
The hacks are: HTTP/0.9 support, allowing bonus characters before
HTTP responses, and accepting a truncated set of HTTP headers.
Also rename the confusingly named "ParseResponseHeaders" and
"DoParseResponseHeaders" methods.
BUG=472762
Committed: https://crrev.com/7174c6a47d6d00d2aae3fc1f24f7865ca1a5bad8
Cr-Commit-Position: refs/heads/master@{#325527}
Patch Set 1 #Patch Set 2 : Merge #
Total comments: 7
Patch Set 3 : Fix search #
Total comments: 4
Patch Set 4 : Response to comments #
Messages
Total messages: 17 (6 generated)
Patchset #2 (id:20001) has been deleted
Patchset #1 (id:1) has been deleted
mmenke@chromium.org changed reviewers: + davidben@chromium.org
David: By showing the least bit of interest in the effort, you implicitly volunteered to review this CL. At least it's small. :)
https://chromiumcodereview.appspot.com/1074263003/diff/60001/net/http/http_st... File net/http/http_stream_parser.cc (right): https://chromiumcodereview.appspot.com/1074263003/diff/60001/net/http/http_st... net/http/http_stream_parser.cc:952: if (strspn(read_buf_->StartOfBuffer(), " \t\r\n") >= I believe read_buf_ needn't be NUL-terminated, so this can read off the edge of the buffer if it happens to be all whitespace. (Shouldn't this check be after LocateEndOfHeaders is called? Seems this would also include cases where there was slop in the buffer from the previous response.) https://chromiumcodereview.appspot.com/1074263003/diff/60001/tools/metrics/hi... File tools/metrics/histograms/histograms.xml (right): https://chromiumcodereview.appspot.com/1074263003/diff/60001/tools/metrics/hi... tools/metrics/histograms/histograms.xml:18633: + Records the frequency of a number of hacky HTTP header parsing rules are Do we care more about frequency or proportion of responses being of this or that type. I've always been kinda dubious of these "event" histograms from the prerender side since their denominators are weird. They're hard to read.
https://codereview.chromium.org/1074263003/diff/60001/net/http/http_stream_pa... File net/http/http_stream_parser.cc (right): https://codereview.chromium.org/1074263003/diff/60001/net/http/http_stream_pa... net/http/http_stream_parser.cc:952: if (strspn(read_buf_->StartOfBuffer(), " \t\r\n") >= On 2015/04/13 19:55:14, David Benjamin wrote: > I believe read_buf_ needn't be NUL-terminated, so this can read off the edge of > the buffer if it happens to be all whitespace. This is certainly true...On the other hand, we *know* it contains "HTTP" somewhere, since "response_header_start_offset_" is greater than 0, so it doesn't have to be \0 terminated for this to work, for reasonable implementations of strspn...Though depending on that is really weird. Fixed. > (Shouldn't this check be after > LocateEndOfHeaders is called? Seems this would also include cases where there > was slop in the buffer from the previous response.) This is only called after HttpUtil::LocateStartOfStatusLine is called, and finds the start of the HTTP response. Anything before that is the junk we're randomly ignoring before the headers we're trying to parse. Note that when the HttpStreamParser is constructed, everything in the input buffer is after the end of the previous response. We only end up here when we've already called LocateEndOfHeaders, and determined we don't have HTTP headers. If we put this at line 938, we'd also miss the case where we have truncated headers *and* skipped bytes before the header. https://codereview.chromium.org/1074263003/diff/60001/tools/metrics/histogram... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/1074263003/diff/60001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:18633: + Records the frequency of a number of hacky HTTP header parsing rules are On 2015/04/13 19:55:14, David Benjamin wrote: > Do we care more about frequency or proportion of responses being of this or that > type. I've always been kinda dubious of these "event" histograms from the > prerender side since their denominators are weird. They're hard to read. I agree with your dislike of event-based histograms. Unfortunately, we need some baseline other than the frequency with which other hacks are executed. We could record one separate histogram per hack, with used vs not-used, but these will be are very high use histograms, even if they only have two buckets, and I think the way they're stored, server side, means that results in a lot of overhead. Also, I expect we'll see much less than 1% of responses trigger any of these hacks, so I'm hoping that we'll basically get percentage use (As hack frequency drops to 0, <use of a particular hack> / (<requests> + <use count of all hacks>) approaches the percentage of requests that use that particular hack.
lgtm https://codereview.chromium.org/1074263003/diff/60001/net/http/http_stream_pa... File net/http/http_stream_parser.cc (right): https://codereview.chromium.org/1074263003/diff/60001/net/http/http_stream_pa... net/http/http_stream_parser.cc:952: if (strspn(read_buf_->StartOfBuffer(), " \t\r\n") >= On 2015/04/13 21:12:21, mmenke wrote: > On 2015/04/13 19:55:14, David Benjamin wrote: > > I believe read_buf_ needn't be NUL-terminated, so this can read off the edge > of > > the buffer if it happens to be all whitespace. > > This is certainly true...On the other hand, we *know* it contains "HTTP" > somewhere, since "response_header_start_offset_" is greater than 0, so it > doesn't have to be \0 terminated for this to work, for reasonable > implementations of strspn...Though depending on that is really weird. Fixed. > > > (Shouldn't this check be after > > LocateEndOfHeaders is called? Seems this would also include cases where there > > was slop in the buffer from the previous response.) > > This is only called after HttpUtil::LocateStartOfStatusLine is called, and finds > the start of the HTTP response. Anything before that is the junk we're randomly > ignoring before the headers we're trying to parse. Note that when the > HttpStreamParser is constructed, everything in the input buffer is after the end > of the previous response. > > We only end up here when we've already called LocateEndOfHeaders, and determined > we don't have HTTP headers. If we put this at line 938, we'd also miss the case > where we have truncated headers *and* skipped bytes before the header. Oh, I see. StartOfBuffer is also where LocateStartOfStatusLine starts searching, and we memmove on slop. https://codereview.chromium.org/1074263003/diff/60001/tools/metrics/histogram... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/1074263003/diff/60001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:18633: + Records the frequency of a number of hacky HTTP header parsing rules are On 2015/04/13 21:12:21, mmenke wrote: > On 2015/04/13 19:55:14, David Benjamin wrote: > > Do we care more about frequency or proportion of responses being of this or > that > > type. I've always been kinda dubious of these "event" histograms from the > > prerender side since their denominators are weird. They're hard to read. > > I agree with your dislike of event-based histograms. Unfortunately, we need > some baseline other than the frequency with which other hacks are executed. We > could record one separate histogram per hack, with used vs not-used, but these > will be are very high use histograms, even if they only have two buckets, and I > think the way they're stored, server side, means that results in a lot of > overhead. > > Also, I expect we'll see much less than 1% of responses trigger any of these > hacks, so I'm hoping that we'll basically get percentage use (As hack frequency > drops to 0, <use of a particular hack> / (<requests> + <use count of all hacks>) > approaches the percentage of requests that use that particular hack. I suppose UseCounter is the same deal. Are separate histograms that expensive that 1 vs 3 high-use histograms makes a big difference? I would think that they'd be a drop in the bucket. *shrug* Whatever. (3 because SKIPPED_WS_PREFIX and SKIPPED_NON_WS_PREFIX can be part of the same histogram.)
mmenke@chromium.org changed reviewers: + isherman@chromium.org
[+isherman]: Please review histograms.xml. Also, could you comment on my understanding of things: Even if a histogram only has a couple of buckets, what really matters for server-side overhead is how often those buckets are incremented, moreso than how many buckets there are? See discussion for more context. https://codereview.chromium.org/1074263003/diff/60001/tools/metrics/histogram... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/1074263003/diff/60001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:18633: + Records the frequency of a number of hacky HTTP header parsing rules are On 2015/04/14 23:41:40, David Benjamin wrote: > On 2015/04/13 21:12:21, mmenke wrote: > > On 2015/04/13 19:55:14, David Benjamin wrote: > > > Do we care more about frequency or proportion of responses being of this or > > that > > > type. I've always been kinda dubious of these "event" histograms from the > > > prerender side since their denominators are weird. They're hard to read. > > > > I agree with your dislike of event-based histograms. Unfortunately, we need > > some baseline other than the frequency with which other hacks are executed. > We > > could record one separate histogram per hack, with used vs not-used, but these > > will be are very high use histograms, even if they only have two buckets, and > I > > think the way they're stored, server side, means that results in a lot of > > overhead. > > > > Also, I expect we'll see much less than 1% of responses trigger any of these > > hacks, so I'm hoping that we'll basically get percentage use (As hack > frequency > > drops to 0, <use of a particular hack> / (<requests> + <use count of all > hacks>) > > approaches the percentage of requests that use that particular hack. > > I suppose UseCounter is the same deal. Are separate histograms that expensive > that 1 vs 3 high-use histograms makes a big difference? I would think that > they'd be a drop in the bucket. *shrug* Whatever. > > (3 because SKIPPED_WS_PREFIX and SKIPPED_NON_WS_PREFIX can be part of the same > histogram.) I think that each increment of 1 results in another entry in a table, which is what makes them expensive.
+Rob in case he wants to comment on storage efficiency. LGTM % nits. I think it would also be fine to split this into three separate histograms, if you prefer. The advantage of having a single histogram is that (1) it lets you easily compare the prevalence of individual hacks w.r.t. one another, and (2) it's slightly more efficient. The server only stores data for buckets that were emitted to; so yes, it's the buckets that are emitted to most frequently that are most important to consolidate. https://codereview.chromium.org/1074263003/diff/80001/tools/metrics/histogram... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/1074263003/diff/80001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:18635: + we can remove the hacks. nit: "frequency of" -> "frequency with which" https://codereview.chromium.org/1074263003/diff/80001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:18635: + we can remove the hacks. nit: "Histogram should" -> "This histogram should"
Going to go ahead and land this as-is, out of paranoia about histogram size (And because one of the histograms is a bit of a pain to calculate as a percent of requests without touching more code). https://codereview.chromium.org/1074263003/diff/80001/tools/metrics/histogram... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/1074263003/diff/80001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:18635: + we can remove the hacks. On 2015/04/15 00:08:06, Ilya Sherman wrote: > nit: "Histogram should" -> "This histogram should" Done. https://codereview.chromium.org/1074263003/diff/80001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:18635: + we can remove the hacks. On 2015/04/15 00:08:06, Ilya Sherman wrote: > nit: "frequency of" -> "frequency with which" Done.
And thanks for the reviews.
The CQ bit was checked by mmenke@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from isherman@chromium.org, davidben@chromium.org Link to the patchset: https://codereview.chromium.org/1074263003/#ps100001 (title: "Response to comments")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1074263003/100001
Message was sent while issue was closed.
Committed patchset #4 (id:100001)
Message was sent while issue was closed.
Patchset 4 (id:??) landed as https://crrev.com/7174c6a47d6d00d2aae3fc1f24f7865ca1a5bad8 Cr-Commit-Position: refs/heads/master@{#325527} |