|
|
Created:
3 years, 7 months ago by Maks Orlovich Modified:
3 years, 3 months ago CC:
chromium-reviews, cbentzel+watch_chromium.org, gavinp+disk_chromium.org, net-reviews_chromium.org Target Ref:
refs/heads/master Project:
chromium Visibility:
Public. |
DescriptionFor files smaller than a configurable threshold we will only
issue a single file read, immediately reading in all of
metadata and stream 0 and stream 1 content upon entry opening.
Files exceeding the threshold will keep the old behavior of only
reading stream 0 and metadata (with somewhat more complicated
sequence of file ops), and will read stream 1 content on-demand.
The exact threshold will be determined by an experiment, with the
code defaulting to 0 (e.g. disabled).
Prefetching stream 1 content like this seem to help cold read performance significantly on Mac (and less so on Android and Linux, haven't seen much effect on Windows).
Doc for experiment: https://goo.gl/RgW942
A few bits and pieces based on jkarlin's 2872943002
BUG=719979
Review-Url: https://codereview.chromium.org/2874833005
Cr-Commit-Position: refs/heads/master@{#498256}
Committed: https://chromium.googlesource.com/chromium/src/+/2acfa3ba04b5902deefdc2aacdcc1d71e785d9be
Patch Set 1 #Patch Set 2 : Actually pre-read stream1 stuff now, and wire it out of SimpleSynchronousEntry, lifting some bits f… #Patch Set 3 : Get it to actually serve reads, c&p'ing more stuff from jkarlin's CL and some existing bits. #Patch Set 4 : Avoid copies. This seems to work, but needs a bunch of new tests and refinement #Patch Set 5 : Rollback overcomplicate zero copy stuff, doesn't do much. Fix copy-paste error. Rebase #Patch Set 6 : Refactor some to reduce code dupe, though this is the easy part. #Patch Set 7 : Refactor + range check the in-memory reads. Probably needs an another round, but a bit cleaner. #Patch Set 8 : More refactors. #Patch Set 9 : Remove no longer needed function split #Patch Set 10 : Update tests. #Patch Set 11 : Rename + comment a method. #
Total comments: 7
Patch Set 12 : Add some metrics and an experiment knob. Not really happy with coverage, though. #
Total comments: 38
Patch Set 13 : Rebase #Patch Set 14 : Apply some of the (easier) feedback, small refactors. #Patch Set 15 : Rename method, tweak some formatting. #Patch Set 16 : Add tests for prefetch, apply some more feedback. #Patch Set 17 : Remove some now-useless include additions (after moving stuff around) #Patch Set 18 : Make DiskCacheSimplePrefetchTest.YesPrefetchNoRead run standalone. (Also remove some unwanted debug… #Patch Set 19 : Group and array up data + crc to cut down on code dupe and excessive arg counts. #
Total comments: 43
Patch Set 20 : Apply most of review feedback, still need to deal with the struct thing #
Total comments: 6
Patch Set 21 : Apply the easy part of the feedback. #Patch Set 22 : Add tests for CRC and footer histograms, fix bug in footer histogram reporting #Patch Set 23 : More direct computation of file layout. Might revert based on feedback. #Patch Set 24 : One more range check. #
Total comments: 2
Patch Set 25 : Apply review feedback. #
Total comments: 14
Patch Set 26 : Apply some of the feedback, a bit more to discuss still. #Patch Set 27 : Apply some of the feedback, a bit more to discuss still. #
Total comments: 1
Patch Set 28 : Tweak histogram description based on feedback #Messages
Total messages: 122 (71 generated)
The CQ bit was checked by morlovich@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was checked by morlovich@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was checked by morlovich@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Description was changed from ========== Sketching out the "read it all in one gulp if it's small" stuff BUG=719979 ========== to ========== ... prefetching stream 1 content along with all the metadata + stream 0 content we always read. Current revision does it for anything <= 32KiB, though perhaps it may be worth doing this as an experiment? Prefetching like this seem to help cold read performance significantly on Mac (and less so on Android and Linux, haven't seen much effect on Windows). A few bits and pieces based on jkarlin's 2872943002 BUG=719979 ==========
The CQ bit was checked by morlovich@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Description was changed from ========== ... prefetching stream 1 content along with all the metadata + stream 0 content we always read. Current revision does it for anything <= 32KiB, though perhaps it may be worth doing this as an experiment? Prefetching like this seem to help cold read performance significantly on Mac (and less so on Android and Linux, haven't seen much effect on Windows). A few bits and pieces based on jkarlin's 2872943002 BUG=719979 ========== to ========== ... prefetching stream 1 content along with all the metadata + stream 0 content we always read. Current revision does it for anything <= 32KiB, though perhaps it may be worth doing this as an experiment? Prefetching like this seem to help cold read performance significantly on Mac (and less so on Android and Linux, haven't seen much effect on Windows). As a bonus, it also means that the first data/stream 1 read request from the client can be answered immediately for files to which this applies. A few bits and pieces based on jkarlin's 2872943002 BUG=719979 ==========
morlovich@chromium.org changed reviewers: + gavinp@chromium.org, jkarlin@chromium.org, pasko@chromium.org
Might make sense to turn this into an experiment (over the prefetch size)?
A few questions for sanity checking before reviewing the code: * I saw in the bug that this used to regress on warm cases on Linux/Android, is it no longer the case? * How will this affect browser memory consumption on low end devices? * 32KiB sounds arbitrary ;) should we instead call readahead(2) to allow the system to prioritize I/O better? (Not sure if OSX supports it, saw signs that it may, also not sure whether there is something analogous on Windows, on Linux you have it after reading the first byte or even at open, and the readahead window is adjustable, but not sure how much in practice, the max is adjustable by OEMs - tricky area and keeps moving in the kernel)
On 2017/07/10 14:28:59, pasko wrote: > A few questions for sanity checking before reviewing the code: > > * I saw in the bug that this used to regress on warm cases on Linux/Android, is > it no longer the case? Rather the "headers only cases", it seems: https://docs.google.com/spreadsheets/d/1l4zb18ez2cYNHlmTb3zLwslS7ejQpnVq90Aeh... (SR1 was an earlier revision, SR2 was one w/a bunch of copy avoidance that didn't do much, so current is basically SR1 + lots of refactoring, but it doesn't matter to much) > * How will this affect browser memory consumption on low end devices? Well, it reads payload it may not need, so something like 32K * # of open entries that are in early stage of HttpCache::Transaction state machine, worst case being as long as validation fetches are going on? > > * 32KiB sounds arbitrary ;) Not completely --- it's based around network stack read sizes, and observation it covers a lot of files, but that's a couple of steps removed, so it won't actually be enough for a full read. So yeah, something something experiment, with potentially different settings for different platforms? > should we instead call readahead(2) to allow the > system to prioritize I/O better? (Not sure if OSX supports it, saw signs that it > may Can't seem to find an analogue?
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
OK, overall looks safe and non-invasive. How do you want to proceed? * add variable length prefetches and Finch? * replace jkarlin's change or land that one first? * new histograms? * does it require more local benchmarking? On 2017/07/10 15:02:37, Maks Orlovich wrote: > On 2017/07/10 14:28:59, pasko wrote: > > A few questions for sanity checking before reviewing the code: > > > > * I saw in the bug that this used to regress on warm cases on Linux/Android, > is > > it no longer the case? > > Rather the "headers only cases", it seems: > https://docs.google.com/spreadsheets/d/1l4zb18ez2cYNHlmTb3zLwslS7ejQpnVq90Aeh... > (SR1 was an earlier revision, SR2 was one w/a bunch of copy avoidance that > didn't do much, so current > is basically SR1 + lots of refactoring, but it doesn't matter to much) Do you suggest to disable prefetch for Android/Linux as a possible outcome of experimentation? > > * How will this affect browser memory consumption on low end devices? > > Well, it reads payload it may not need, so something like 32K * # of open > entries that are in > early stage of HttpCache::Transaction state machine, worst case being as long as > validation fetches > are going on? I don't understand the http cache transaction enough (not proud of it), do we doom the entry on revalidation or truncate? I think the latter. Should we clear stream_1_prefetch_data_ on truncate then? Sounds like a layering violation .. but I guess as usual it's easier to manage than guaranteeing exclusivity of prefetches vs. writes in http cache transaction? I would accept that. > > * 32KiB sounds arbitrary ;) > > Not completely --- it's based around network stack read sizes, and observation > it covers a lot of files, Ah, makes sense. So basically the size of the typical buffer for reading in. > but that's a couple of steps removed, so it won't actually be enough for a full > read. > > So yeah, something something experiment, with potentially different settings for > different platforms? Yeah, we can throw exact same groups at all platforms and then figure which ones are bettr. We can look at low-end devices separately as well. > > should we instead call readahead(2) to allow the > > system to prioritize I/O better? (Not sure if OSX supports it, saw signs that > it > > may > > Can't seem to find an analogue? That was premature, I saw this: https://github.com/xdevs23/busybox-osx/blob/master/miscutils/readahead.c, which after closer inspection it appears to be excluded from compilation. mmap+madvise is another option on OSX, but not sure it will eat the mmap overhead. https://codereview.chromium.org/2874833005/diff/200001/net/disk_cache/simple/... File net/disk_cache/simple/simple_entry_impl.cc (right): https://codereview.chromium.org/2874833005/diff/200001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.cc:857: ReadInMemoryStreamData(stream_1_prefetch_data_.get(), offset, buf_len, buf, perhaps we could record how much time it takes to do so? this may help to reason about speedups/slowdowns? adding prefethes to netlog may be less trivial, though close to something my inner perf nerd would put in a wishlist. WDYT? also, record how much of stream data invalidation is happening?
On 2017/07/11 13:24:12, pasko wrote: > OK, overall looks safe and non-invasive. > > How do you want to proceed? > * add variable length prefetches and Finch? So code-wise it would be a base::Feature with an associated param, with default being existing behavior? Doesn't seem like for this particular thing we need to do any flushing since it shouldn't affect the contents.... Then presumably some doc/bug process stuff for Finch, followed by a Google-land CL to configure the experiment? > * replace jkarlin's change or land that one first? That change is far from a complete one, so there is nothing to land. There was a bit of useful starting code for me to lift, though. > * new histograms? Some discussion of that in reply to code comments (which I still can't figure out how to not make a separate e-mail). > * does it require more local benchmarking? I've added some Windows benchmarking, which is mostly inconclusive except for the SSD + no antivirus case, which... is probably not where the focus should be. (I don't expect the change to make much of a difference with Windows Defender, since that reads in entire file at once itself) Still need to redo mac+HDD data, but I seem to have misrecorded my password, so will need to bug Gavin to reset it. > Do you suggest to disable prefetch for Android/Linux as a possible outcome of > experimentation? Absolutely. Setting threshold to zero ought to do it. Hmm, actually should probably just error out if file_size is zero rather than try to reason about zero-length allocations? > I don't understand the http cache transaction enough (not proud of it), do we > doom the entry on revalidation or truncate? I think the latter. I think both can happen. > Should we clear stream_1_prefetch_data_ on truncate then? > Sounds like a layering violation .. This code already should? A truncate is just a write, after all, and I am not sure why you consider it a layering violation, it's the backend's business how it manages data between memory and disk, and this sort of thing should only be observable in some reads becoming immediate rather than roundtrip'ing via some event loops. > > So yeah, something something experiment, with potentially different settings > for > > different platforms? > > Yeah, we can throw exact same groups at all platforms and then figure which ones > are bettr. We can look at low-end devices separately as well. Is that the "Low memory device (android)" split? Can we actually know that in //net when setting the defaults?
https://codereview.chromium.org/2874833005/diff/200001/net/disk_cache/simple/... File net/disk_cache/simple/simple_entry_impl.cc (right): https://codereview.chromium.org/2874833005/diff/200001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.cc:857: ReadInMemoryStreamData(stream_1_prefetch_data_.get(), offset, buf_len, buf, On 2017/07/11 13:24:12, pasko wrote: > perhaps we could record how much time it takes to do so? this may help to reason > about speedups/slowdowns? It's a memcpy? Doesn't sound that interesting... The Sync analogue may be more important since that does CRC, at least. > > adding prefethes to netlog may be less trivial, though close to something my > inner perf nerd would put in a wishlist. WDYT? net_log_ isn't wired at SimpleSyncEntry, but we could log on this end at around l.1162 if a separate event sounds interesting? > > also, record how much of stream data invalidation is happening? Hmm, maybe rather record prefetches done vs. prefetches read? That seems a little easier to interpret, and also includes the case of data prefetched, not used, but not actively invalidated (can happen e.g. in the cant_conditionalize case).
> On 2017/07/11 13:24:12, pasko wrote: > > OK, overall looks safe and non-invasive. > > > > How do you want to proceed? > > * add variable length prefetches and Finch? > > So code-wise it would be a base::Feature with an associated param, with default > being existing behavior? > Doesn't seem like for this particular thing we need to do any flushing since it > shouldn't affect the > contents.... Yes. > Then presumably some doc/bug process stuff for Finch, followed by a Google-land > CL to configure the experiment? Yes, sorry about this overhead. > > * replace jkarlin's change or land that one first? > > That change is far from a complete one, so there is nothing to land. There was > a bit of useful starting code for me to lift, though. OK, let's close that one if it does not intend to land? > > * new histograms? > > Some discussion of that in reply to code comments (which I still can't figure > out how to not make a separate e-mail). Thank you. > > * does it require more local benchmarking? > > I've added some Windows benchmarking, which is mostly inconclusive except for > the SSD + no antivirus case, which... is probably not where the focus should > be. (I don't expect the change to make much of a difference with Windows > Defender, since that reads in entire file at once itself) > > Still need to redo mac+HDD data, but I seem to have misrecorded my password, > so will need to bug Gavin to reset it. Ack, good to know, thanks for the update. > > Do you suggest to disable prefetch for Android/Linux as a possible outcome > > of experimentation? > > Absolutely. Setting threshold to zero ought to do it. Hmm, actually should > probably just error out if file_size is zero rather than try to reason about > zero-length allocations? Not sure which allocation you mean, and I feel relaxed about overhead of one unnecessary small allocation per cache entry. > > I don't understand the http cache transaction enough (not proud of it), do we > > doom the entry on revalidation or truncate? I think the latter. > > I think both can happen. > > > Should we clear stream_1_prefetch_data_ on truncate then? > > Sounds like a layering violation .. > > This code already should? A truncate is just a write, after all, and I am not > sure why you consider it a layering violation, it's the backend's business how > it manages data between memory and disk, and this sort of thing should only be > observable in some reads becoming immediate rather than roundtrip'ing via some > event loops. Right, I asked this question, but then found that your change does it properly, but forgot to remove the question. Sorry for noise. > > > So yeah, something something experiment, with potentially different > > > settings for different platforms? > > > > Yeah, we can throw exact same groups at all platforms and then figure which > > ones are bettr. We can look at low-end devices separately as well. > > Is that the "Low memory device (android)" split? Can we actually know that in > //net when setting the defaults? Absolutely: base::SysInfo::IsLowEndDevice(). https://codereview.chromium.org/2874833005/diff/200001/net/disk_cache/simple/... File net/disk_cache/simple/simple_entry_impl.cc (right): https://codereview.chromium.org/2874833005/diff/200001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.cc:857: ReadInMemoryStreamData(stream_1_prefetch_data_.get(), offset, buf_len, buf, On 2017/07/11 14:34:06, Maks Orlovich wrote: > On 2017/07/11 13:24:12, pasko wrote: > > perhaps we could record how much time it takes to do so? this may help to > > reason about speedups/slowdowns? > > It's a memcpy? Doesn't sound that interesting... The Sync analogue may be more > important since that does CRC, at least. sorry, wrong pointer, I was asking whether we can time the interval of prefetching from disk. > > adding prefethes to netlog may be less trivial, though close to something my > > inner perf nerd would put in a wishlist. WDYT? > > net_log_ isn't wired at SimpleSyncEntry, but we could log on this end at around > l.1162 if a separate event sounds interesting? Yeah, I was not sure whether you can provide a custom time interval to a netlog .. > > also, record how much of stream data invalidation is happening? > > Hmm, maybe rather record prefetches done vs. prefetches read? That seems a > little easier to interpret, and also includes the case of data prefetched, not > used, but not actively invalidated (can happen e.g. in the cant_conditionalize > case). Neat! I like it.
The CQ bit was checked by morlovich@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Add some of the metrics, and an experiment knob. I am suddenly a bit worried about coverage... https://codereview.chromium.org/2874833005/diff/200001/net/disk_cache/simple/... File net/disk_cache/simple/simple_entry_impl.cc (right): https://codereview.chromium.org/2874833005/diff/200001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.cc:857: ReadInMemoryStreamData(stream_1_prefetch_data_.get(), offset, buf_len, buf, > sorry, wrong pointer, I was asking whether we can time the interval > of prefetching from disk. It's not really a separate event, since it replaces all the disk I/O we would be normally doing, so ... kinda? > Yeah, I was not sure whether you can provide a custom time interval > to a netlog Time interval? You kinda lost me here...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: android_n5x_swarming_rel on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_n5x_...)
Finally got to understand how the disk IO is organized. Generally looks good. On the issue description: please update it, also explain that the small "_0" files (with stream0 and stream1) are read to a buffer first and then if this buffer is present, it replaces all file access. My comments below are a mix between nits and a suggestion for a not-so-trivial refactor. Sorry about that. The optimal way to read it is to find the largest comment in GetEOFRecordData() and continuing the discussion from there, we can sort out the remaining comments later. https://codereview.chromium.org/2874833005/diff/200001/net/disk_cache/simple/... File net/disk_cache/simple/simple_entry_impl.cc (right): https://codereview.chromium.org/2874833005/diff/200001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.cc:857: ReadInMemoryStreamData(stream_1_prefetch_data_.get(), offset, buf_len, buf, On 2017/07/12 16:06:09, Maks Orlovich wrote: > > > sorry, wrong pointer, I was asking whether we can time the interval > > of prefetching from disk. > > It's not really a separate event, since it replaces all the disk I/O we would be > normally doing, so ... kinda? > > > > Yeah, I was not sure whether you can provide a custom time interval > > to a netlog > > Time interval? You kinda lost me here... lulz :) I was writing it without knowing that you decided to read the whole file first and then figure out which parts are which. So yeah, while it's a good fast way, I should take back my attempts at measuring time it took to "read potentially unnecessary parts". So I guess I should say "Acknowledged". Acknowledged https://codereview.chromium.org/2874833005/diff/200001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.cc:1162: if (in_results->stream_1_data.get()) { almost a copy of the block above, perhaps make stream_data_ an array and parametrize this? https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... File net/disk_cache/simple/simple_entry_impl.h (right): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.h:350: bool first_stream1_read_; // used for metrics only. nit: for less ambiguity: is_initial_stream1_read? However, if we put this under a wrapped base::File that I suggested in another comment, this would go away. https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.h:403: // Unlike other streams, stream 0 data is read from the disk when the entry is this "Unlike other streams" will need to be updated because it is "sometimes" like stream 1, right? https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.cc (left): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1308: int32_t* out_data_size) const { Following how the state migrates from disk to the buffer and across EOF checks is becoming difficult to understand. It was non-trivial before the change, but now it hits near my thresholds :) For example: ReadAndValidateStream0 sets the buffer and then sits on top of PreRead to get it back from the buffer. Another example: GetEOFRecordData hides reading something else under the scenes. These things make it non-trivial to guess how the functionality is layered, and the file has 1750 lines. This makes me wonder whether we can redesign it a little. How about wrapping base::File and hiding from the sync entry the fact that it has a buffer cached? This would make the sync entry virtually unchanged (modulo clearing the buffer sometimes, copying from it into the SimpleEntryImpl, histograms will need to know cache type), all the EOF checking stays the same. Will this lead too over-copying? Any other problems? https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:123: void RecordSyncOpenPrefetchStatus(net::CacheType cache_type, bool result) { my favorite: nit on naming, because it is always possible to nit on names :) "Sync" is implied and unnecessary in this context. Are we planning to extend the status from boolean? If not, then better be explicit: RecordWhetherDidPrefetchedOnOpen(...) https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:279: &out_results->stream_1_crc32); too many output arguments some set under non-trivial conditions? Maybe consider pre-filling a SimpleEntryCreationResults in InitializeForOpen() or make another struct? https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:722: int SimpleSynchronousEntry::PreReadStreamPayload( Why Pre? It just reads, no? https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1262: int SimpleSynchronousEntry::ReadAndValidateStream0( this fetches stream 1 as well, so the name of the function needs to be updated https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1280: RecordSyncOpenPrefetchStatus(cache_type_, false); nit: it is usually easier to read if the short branch comes first https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1382: if (offset >= file_size || end.ValueOrDie() >= file_size) feel free to insert a DCHECK for the overflow, but not a CHECK! If reading some corrupt data can lead to an overflow here, then we must handle it. An explanation for why it can/cannot happen would be handy because this function can later be used for something else. https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.h (right): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.h:87: uint32_t stream_1_crc32; are you packing this to make the sizeof(SimpleEntryCreationResults) smaller? I think saving a few bytes is not important here, while readability would improve if we group the data and the crc fields together, possibly wrapped in a separate struct.
a few bits on testing as well https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/disk_ca... File net/disk_cache/disk_cache_test_base.cc (right): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/disk_ca... net/disk_cache/disk_cache_test_base.cc:95: // Make sure to cover the prefetch path in SimpleCache. doing work in constructors is generally discouraged... The bigger issue is that we need a test to experiment with different params, while most of the tests should run with param 0. There is a standard mechanism to get coverage for different values of params as we roll to channels, there is a json file to override the base::Feature for testing and set the most popular configuration there. When submitting the Finch config you'll likely get a warning about that. So that'd be preferable IMO. https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... File net/disk_cache/simple/simple_experiment.cc (right): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_experiment.cc:70: return base::GetFieldTrialParamByFeatureAsInt(kSimpleCachePrefetchExperiment, please also add a test that sets the experiment param differently and confirms that things work on the SSEntry level. https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... File net/disk_cache/simple/simple_experiment.h (right): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_experiment.h:39: NET_EXPORT_PRIVATE int GetSimpleCachePrefetchSize(); I think this file is for experiments that depend on persistent state on disk (hence some index manipulations). The prefetch experiment does not need any of this, hence it can be kept local to the sync entry. Not sure we can remove the NET_EXPORT_PRIVATE if we want to test it, but the explicit separation would be good.
Commenting on stuff that needs discussing rather than just doing. https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/disk_ca... File net/disk_cache/disk_cache_test_base.cc (right): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/disk_ca... net/disk_cache/disk_cache_test_base.cc:95: // Make sure to cover the prefetch path in SimpleCache. On 2017/07/18 14:02:56, pasko wrote: > doing work in constructors is generally discouraged... > > > The bigger issue is that we need a test to experiment with different params, > while most of the tests should run with param 0. There is a standard mechanism > to get coverage for different values of params as we roll to channels, there is > a json file to override the base::Feature for testing and set the most popular > configuration there. When submitting the Finch config you'll likely get a > warning about that. So that'd be preferable IMO. Any keyword to search for/pointer for that mechanism? https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.cc (left): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1308: int32_t* out_data_size) const { On 2017/07/18 13:46:31, pasko wrote: > Following how the state migrates from disk to the buffer and across EOF checks > is becoming difficult to understand. It was non-trivial before the change, but > now it hits near my thresholds :) > > For example: ReadAndValidateStream0 sets the buffer and then sits on top of > PreRead to get it back from the buffer. Another example: GetEOFRecordData hides > reading something else under the scenes. These things make it non-trivial to > guess how the functionality is layered, and the file has 1750 lines. > > This makes me wonder whether we can redesign it a little. How about wrapping > base::File and hiding from the sync entry the fact that it has a buffer cached? > This would make the sync entry virtually unchanged (modulo clearing the buffer > sometimes, copying from it into the SimpleEntryImpl, histograms will need to > know cache type), all the EOF checking stays the same. > > Will this lead too over-copying? Any other problems? So my idea was that ReadFromFileOrPrefetched would abstract whether reading from file or prefetch, though not fully since you still need to drag the two extra arguments around (might makes sense to bundle them up, maybe even as a StringPiece). GetEOFRecordData is kinda where it shows extra roughness since that's also used outside initial read and for files_[1]. Maybe a smaller change would be to make ReadFromFileOrPrefetch take file number, so that handles of the wrinkle here? I need to sleep on the wrapping-File idea, but what bother me about it a little is that files_ is permanent state, while the intent with the prefetch buffer was to just to have it for the duration of initial read. Also you one probably still want to do some of the method additions I did, like creating PreReadStreamPayload, since otherwise a whole bunch of stuff gets dupllicated. https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:722: int SimpleSynchronousEntry::PreReadStreamPayload( On 2017/07/18 13:46:31, pasko wrote: > Why Pre? It just reads, no? Because it's only used in the prefetch path? https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1262: int SimpleSynchronousEntry::ReadAndValidateStream0( On 2017/07/18 13:46:31, pasko wrote: > this fetches stream 1 as well, so the name of the function needs to be updated How awful is ReadAndValidateStream0AndMaybe1 ? Any ideas for a better name? https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1382: if (offset >= file_size || end.ValueOrDie() >= file_size) On 2017/07/18 13:46:31, pasko wrote: > feel free to insert a DCHECK for the overflow, but not a CHECK! > > If reading some corrupt data can lead to an overflow here, then we must handle > it. An explanation for why it can/cannot happen would be handy because this > function can later be used for something else. See the conditional in line 1380, this can't actually fail at this point. I guess it would be better to use AssignIfValid, though?
for the stuff worth discussing, developing the idea further https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/disk_ca... File net/disk_cache/disk_cache_test_base.cc (right): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/disk_ca... net/disk_cache/disk_cache_test_base.cc:95: // Make sure to cover the prefetch path in SimpleCache. On 2017/07/18 14:32:31, morlovich wrote: > On 2017/07/18 14:02:56, pasko wrote: > > doing work in constructors is generally discouraged... > > > > > > The bigger issue is that we need a test to experiment with different params, > > while most of the tests should run with param 0. There is a standard mechanism > > to get coverage for different values of params as we roll to channels, there > is > > a json file to override the base::Feature for testing and set the most popular > > configuration there. When submitting the Finch config you'll likely get a > > warning about that. So that'd be preferable IMO. > > Any keyword to search for/pointer for that mechanism? go/finch101 and search for "testing-config", but you'll need to read the whole page anyway, which makes searching unnecessary https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.cc (left): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1308: int32_t* out_data_size) const { On 2017/07/18 14:32:31, morlovich wrote: > On 2017/07/18 13:46:31, pasko wrote: > > Following how the state migrates from disk to the buffer and across EOF checks > > is becoming difficult to understand. It was non-trivial before the change, but > > now it hits near my thresholds :) > > > > For example: ReadAndValidateStream0 sets the buffer and then sits on top of > > PreRead to get it back from the buffer. Another example: GetEOFRecordData > hides > > reading something else under the scenes. These things make it non-trivial to > > guess how the functionality is layered, and the file has 1750 lines. > > > > This makes me wonder whether we can redesign it a little. How about wrapping > > base::File and hiding from the sync entry the fact that it has a buffer > cached? > > This would make the sync entry virtually unchanged (modulo clearing the buffer > > sometimes, copying from it into the SimpleEntryImpl, histograms will need to > > know cache type), all the EOF checking stays the same. > > > > Will this lead too over-copying? Any other problems? > > So my idea was that ReadFromFileOrPrefetched would abstract whether reading from > file or prefetch, though not fully since you still need to drag the two extra > arguments around (might makes sense to bundle them up, maybe even as a > StringPiece). GetEOFRecordData is kinda where it shows extra roughness since > that's also used outside initial read and for files_[1]. Maybe a smaller change > would be to make ReadFromFileOrPrefetch take file number, so that handles of the > wrinkle here? Yeah, I understand your way to put ReadFromFileOrPrefetched under everything. It is just not easy to figure from sync_entry.h, and the name is not very memorable, so in a month from now I'd need to rediscover it again. Tossing file numbers vs streams are actually also confusing. Thinking about those files: could make a something like a CacheStreams object that abstracts away file manipulation as much as possible. The sync entry operates on top of streams 0, 1 and 2. What CacheStreams would do: * mapping of stream to file * opening/closing streams and associated laziness * pull/push the key * it does _not_ know about EOF records and crc32/sha256, those are just part of streams * has this readahead optimization that allows to memcpy later instead of a fair read * this is where the abstraction leaks: allows the client to 'drop caches' to save memory (the streams object drops it under the scenes when encountering a write/truncate) this makes it sort of a logical lower layer, which will allow not carrying the prefetch_buf around and removes the usual file/offset calculation/passing clutter in all codepaths. We still have the problem of verifying all CRC32 and dealing with variants of SHA256 in the SSE. Would it help? https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:722: int SimpleSynchronousEntry::PreReadStreamPayload( On 2017/07/18 14:32:31, morlovich wrote: > On 2017/07/18 13:46:31, pasko wrote: > > Why Pre? It just reads, no? > > Because it's only used in the prefetch path? ah, ok https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:745: reinterpret_cast<const Bytef*>((*stream_data)->data()), In case you have similar allergies to reinterpret_cast repetitions, perhaps worth a function in simple_util to get a crc32 of a void* and size? https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:750: RecordCheckEOFResult(cache_type_, CHECK_EOF_RESULT_CRC_MISMATCH); just realized that this might slightly increase the crc mismatch rate because we would newly-fail for prefetched stream1 that we would have otherwise discarded. Sounds not very important (because mismatches are rare), but wanted to highlight to be less worried later if we get alerts about that. https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1262: int SimpleSynchronousEntry::ReadAndValidateStream0( On 2017/07/18 14:32:31, morlovich wrote: > On 2017/07/18 13:46:31, pasko wrote: > > this fetches stream 1 as well, so the name of the function needs to be updated > > How awful is ReadAndValidateStream0AndMaybe1 ? > Any ideas for a better name? As a name it sounds good to me https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1382: if (offset >= file_size || end.ValueOrDie() >= file_size) On 2017/07/18 14:32:31, morlovich wrote: > On 2017/07/18 13:46:31, pasko wrote: > > feel free to insert a DCHECK for the overflow, but not a CHECK! > > > > If reading some corrupt data can lead to an overflow here, then we must handle > > it. An explanation for why it can/cannot happen would be handy because this > > function can later be used for something else. > > See the conditional in line 1380, this can't actually fail at this point. I > guess it would be better to use AssignIfValid, though? Ah, missed that. AssignIfValid() looks lengthy, I'd prefer ValueOrDefault(), but ValueOrDie() would also work for me.
> go/finch101 and search for "testing-config", but you'll need to read the whole > page anyway, which makes searching unnecessary Thanks. Looked at it, working on the doc ATM. > Thinking about those files: could make a something like a CacheStreams object > that abstracts away file manipulation as much as possible. The sync entry > operates on top of streams 0, 1 and 2. What CacheStreams would do: > > * mapping of stream to file > * opening/closing streams and associated laziness > * pull/push the key > * it does _not_ know about EOF records and crc32/sha256, those are just part of > streams So this is what breaks in this idea, I think, since those things are needed to sort out the packing of streams 0 and 1 in file 0. Much of complexity in ReadAndValidate... comes exactly from sorting out those headers (and you can see it sort of leaking across abstraction boundaries already existing for that). > > https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... > net/disk_cache/simple/simple_synchronous_entry.cc:745: reinterpret_cast<const > Bytef*>((*stream_data)->data()), > In case you have similar allergies to reinterpret_cast repetitions, perhaps > worth a function in simple_util to get a crc32 of a void* and size? Probably, since https://cs.chromium.org/chromium/src/net/disk_cache/simple/simple_index_file.... is also a thing, and we probably do it for writes, too. > > Ah, missed that. AssignIfValid() looks lengthy, I'd prefer ValueOrDefault(), but > ValueOrDie() would also work for me. Well, what I like about AssignIfValid is that it'll make the conditionals all together...
On 2017/07/21 14:23:42, Maks Orlovich wrote: > > go/finch101 and search for "testing-config", but you'll need to read the > whole > > page anyway, which makes searching unnecessary > > Thanks. Looked at it, working on the doc ATM. Doc, with chromium.org access: https://docs.google.com/document/d/1u4udJ8fWV-GOoyZAh5NeOwHYlToCGiQX6YiyXff6V...
The CQ bit was checked by morlovich@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Flushing some comments, some things still in progress. https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... File net/disk_cache/simple/simple_entry_impl.h (right): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.h:350: bool first_stream1_read_; // used for metrics only. On 2017/07/18 13:46:31, pasko wrote: > nit: for less ambiguity: is_initial_stream1_read? However, if we put this under > a wrapped base::File that I suggested in another comment, this would go away. Done. https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:123: void RecordSyncOpenPrefetchStatus(net::CacheType cache_type, bool result) { On 2017/07/18 13:46:31, pasko wrote: > my favorite: nit on naming, because it is always possible to nit on names :) > > "Sync" is implied and unnecessary in this context. Are we planning to extend the > status from boolean? If not, then better be explicit: > > RecordWhetherDidPrefetchedOnOpen(...) Done'ish https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:722: int SimpleSynchronousEntry::PreReadStreamPayload( On 2017/07/19 16:28:26, pasko wrote: > On 2017/07/18 14:32:31, morlovich wrote: > > On 2017/07/18 13:46:31, pasko wrote: > > > Why Pre? It just reads, no? > > > > Because it's only used in the prefetch path? > > ah, ok Acknowledged. https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:750: RecordCheckEOFResult(cache_type_, CHECK_EOF_RESULT_CRC_MISMATCH); On 2017/07/19 16:28:26, pasko wrote: > just realized that this might slightly increase the crc mismatch rate because we > would newly-fail for prefetched stream1 that we would have otherwise discarded. > > Sounds not very important (because mismatches are rare), but wanted to highlight > to be less worried later if we get alerts about that. Acknowledged. https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1280: RecordSyncOpenPrefetchStatus(cache_type_, false); On 2017/07/18 13:46:31, pasko wrote: > nit: it is usually easier to read if the short branch comes first [Citation needed], but done. https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1382: if (offset >= file_size || end.ValueOrDie() >= file_size) > Ah, missed that. AssignIfValid() looks lengthy, I'd prefer ValueOrDefault(), but > ValueOrDie() would also work for me. Ended up with AssignIfValid anyway, but I think it works better with how the code evolved.. https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.h (right): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.h:87: uint32_t stream_1_crc32; On 2017/07/18 13:46:31, pasko wrote: > are you packing this to make the sizeof(SimpleEntryCreationResults) smaller? I > think saving a few bytes is not important here, while readability would improve > if we group the data and the crc fields together, possibly wrapped in a separate > struct. > > Done.
On 2017/07/21 17:32:15, Maks Orlovich wrote: > On 2017/07/21 14:23:42, Maks Orlovich wrote: > > > go/finch101 and search for "testing-config", but you'll need to read the > > whole > > > page anyway, which makes searching unnecessary > > > > Thanks. Looked at it, working on the doc ATM. > > Doc, with http://chromium.org access: > https://docs.google.com/document/d/1u4udJ8fWV-GOoyZAh5NeOwHYlToCGiQX6YiyXff6V... why not make it world-readable? Having a chromium.org account is not a requirement for contributing to our codebase.
On 2017/07/21 14:23:42, Maks Orlovich wrote: > > go/finch101 and search for "testing-config", but you'll need to read the > whole > > page anyway, which makes searching unnecessary > > Thanks. Looked at it, working on the doc ATM. > > > Thinking about those files: could make a something like a CacheStreams object > > that abstracts away file manipulation as much as possible. The sync entry > > operates on top of streams 0, 1 and 2. What CacheStreams would do: > > > > * mapping of stream to file > > * opening/closing streams and associated laziness > > * pull/push the key > > * it does _not_ know about EOF records and crc32/sha256, those are just part > of > > streams > > So this is what breaks in this idea, I think, since those things are needed to > sort out the packing of streams > 0 and 1 in file 0. Much of complexity in ReadAndValidate... comes exactly from > sorting out those headers > (and you can see it sort of leaking across abstraction boundaries already > existing for that). Yeah, it is not very pure, the SSEntry will need to know stream pecularities .. for example, in order to validate the key it would need to read end of stream0 if there is a sha256, but it does not need to know which file it is in, and does not need to push various out_crc32 down one layer, which I think will lead to shorter and more concise functions. Not without a few by-design violations, of course, which need attention anyway, so we'll document them. I think I'll be able to mentally handle this review, and I am not too worried because I am actively fantasizing about not having to review this code soon, but still kind of interested in simplifying to avoid the blockfile-cache-like dilemma in the future. WDYT?
https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1280: RecordSyncOpenPrefetchStatus(cache_type_, false); On 2017/07/21 18:37:33, Maks Orlovich wrote: > On 2017/07/18 13:46:31, pasko wrote: > > nit: it is usually easier to read if the short branch comes first > > [Citation needed], but done. unfortunately, could not find a good one, and now hesitating as a result, on such a basic thing, wow. I could argue that this looks analogous to the "early return" argument [1], where I generally prefer early returns, but there is a point that "more common things" should go first. Certainly this helps understanding in some cases. I just read this code as all edge cases everywhere disregarding what is common on purpose, so probably it's just me. Sorry for the noise. [1] https://groups.google.com/a/chromium.org/forum/#!topic/chromium-dev/NO2CUzFVmUA
The CQ bit was checked by morlovich@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was checked by morlovich@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
> Yeah, it is not very pure, the SSEntry will need to know stream pecularities .. > for example, in order to validate the key it would need to read end of stream0 > if there is a sha256, but it does not need to know which file it is in, and does > not need to push various out_crc32 down one layer, which I think will lead to > shorter and more concise functions. Not without a few by-design violations, of > course, which need attention anyway, so we'll document them. > > I think I'll be able to mentally handle this review, and I am not too worried > because I am actively fantasizing about not having to review this code soon, but > still kind of interested in simplifying to avoid the blockfile-cache-like > dilemma in the future. > > WDYT? I think it will make things harder to read, since it will be less obvious that regular ops just go straight to disk. The current situation confines the prefetch stuff almost fully to the open path, which I think makes it clearer. Having said that, I do agree that the layout stuff is kinda messy right now, and there are some really low-hanging fruit out there, like code not consistently using stream_index or file_index, but rather "index" in spot.
Flushing a few more things https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/disk_ca... File net/disk_cache/disk_cache_test_base.cc (right): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/disk_ca... net/disk_cache/disk_cache_test_base.cc:95: // Make sure to cover the prefetch path in SimpleCache. > go/finch101 and search for "testing-config", but you'll need to read the whole > page anyway, which makes searching unnecessary Removed this from here (and added for the dedicated test) https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... File net/disk_cache/simple/simple_experiment.cc (right): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_experiment.cc:70: return base::GetFieldTrialParamByFeatureAsInt(kSimpleCachePrefetchExperiment, On 2017/07/18 14:02:57, pasko wrote: > please also add a test that sets the experiment param differently and confirms > that things work on the SSEntry level. Done. https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... File net/disk_cache/simple/simple_experiment.h (right): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_experiment.h:39: NET_EXPORT_PRIVATE int GetSimpleCachePrefetchSize(); On 2017/07/18 14:02:57, pasko wrote: > I think this file is for experiments that depend on persistent state on disk > (hence some index manipulations). The prefetch experiment does not need any of > this, hence it can be kept local to the sync entry. Not sure we can remove the > NET_EXPORT_PRIVATE if we want to test it, but the explicit separation would be > good. OK, done. This PoV is more convincing with the eviction experiment here.. https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:745: reinterpret_cast<const Bytef*>((*stream_data)->data()), On 2017/07/19 16:28:26, pasko wrote: > In case you have similar allergies to reinterpret_cast repetitions, perhaps > worth a function in simple_util to get a crc32 of a void* and size? Split out into https://chromium-review.googlesource.com/581803
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_asan_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
Oh, the failure is due to sharding sensitivity, I think. Will fix.
The CQ bit was checked by morlovich@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
morlovich@chromium.org changed reviewers: - jkarlin@chromium.org
Description was changed from ========== ... prefetching stream 1 content along with all the metadata + stream 0 content we always read. Current revision does it for anything <= 32KiB, though perhaps it may be worth doing this as an experiment? Prefetching like this seem to help cold read performance significantly on Mac (and less so on Android and Linux, haven't seen much effect on Windows). As a bonus, it also means that the first data/stream 1 read request from the client can be answered immediately for files to which this applies. A few bits and pieces based on jkarlin's 2872943002 BUG=719979 ========== to ========== ... prefetching stream 1 content along with all the metadata + stream 0 content we always read. Current revision does it for anything <= 32KiB, though perhaps it may be worth doing this as an experiment? Prefetching like this seem to help cold read performance significantly on Mac (and less so on Android and Linux, haven't seen much effect on Windows). As a bonus, it also means that the first data/stream 1 read request from the client can be answered immediately for files to which this applies. A few bits and pieces based on jkarlin's 2872943002 Doc for experiment: https://docs.google.com/a/chromium.org/document/d/1u4udJ8fWV-GOoyZAh5NeOwHYlT... BUG=719979 ==========
The CQ bit was checked by morlovich@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
PTAL --- I think I incorporated all feedback that's not the major refactor suggestion I disagree with. https://codereview.chromium.org/2874833005/diff/200001/net/disk_cache/simple/... File net/disk_cache/simple/simple_entry_impl.cc (right): https://codereview.chromium.org/2874833005/diff/200001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.cc:1162: if (in_results->stream_1_data.get()) { On 2017/07/18 13:46:30, pasko wrote: > almost a copy of the block above, perhaps make stream_data_ an array and > parametrize this? Done. https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:279: &out_results->stream_1_crc32); On 2017/07/18 13:46:31, pasko wrote: > too many output arguments some set under non-trivial conditions? Maybe consider > pre-filling a SimpleEntryCreationResults in InitializeForOpen() or make another > struct? Ended up grouping into array of structs, also simplifying the CRC code.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: win_chromium_x64_rel_ng on master.tryserver.chromium.win (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_x64_...)
Pardon for slow response, travelling/meetings/funevents/ishouldcomplainsomewhere. Overall looks correct modulo maybe one histogram is over-recorded. I have not looked at unittests yet, instead I took the approach of overwhelming you with naming nits and bikesheds. WRT refactoring .. I think things got more readable since the early patchsets, but I still cannot reason about the thing until I (re)draw the whole callgraph, which is makes me want to avoid touching this place ever again. On the other hand, you are the primary owner of this thing, so it is up to you how to keep up with it in the long run. Now to the nits ---\ | | V Please improve the commit description. Here are a few tips on how to aim at good commit descriptions: https://sites.google.com/a/chromium.org/dev/developers/contributing-code#writ... > ... prefetching stream 1 content along with all the metadata + > stream 0 content we always read. nit: s/.../SimpleCache:/ nit: maybe a shorter line overall? like: "SimpleCache: Additionally prefetch stream1 to memory on open" > Current revision does it for anything <= 32KiB, though perhaps > it may be worth doing this as an experiment? > Prefetching like this seem to help cold read performance significantly on Mac > (and less so on Android and Linux, haven't seen much effect on Windows). please also mention adding the finch experiment, and a few words on what parameter(s) it has for perf tuning. > As a bonus, it also means that the first data/stream 1 read request from the > client can be answered immediately for files to which this applies. One could interpret 'immediately' as returning synchronously (which is possible, but probably risky - hence not in this change), which is not the case. Maybe define 'immediately' at the top of the file as returning without doing disk IO or use another term? > A few bits and pieces based on jkarlin's 2872943002 it this part relevant? > Doc for experiment: https://docs.google.com/a/chromium.org/document/d/1u4udJ8fWV-GOoyZAh5NeOwHYlT... nit: short link to avoid overflowing the line? https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/entry_u... File net/disk_cache/entry_unittest.cc (right): https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/entry_u... net/disk_cache/entry_unittest.cc:3447: const int kBufferSize = 50000; // to avoid quick read The context is not easy to guess from this comment. Perhaps we could make a constant like kMaxFilePrefetchSize and add a compile_assert to verify that the buffer size is bigger? A comment can explain that otherwise the EXPECT_EQ(net::ERR_IO_PENDING, ...) would not work. Or .. maybe the explanation is different? https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... File net/disk_cache/simple/simple_entry_impl.h (right): https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.h:85: // Adds another reader/writer to this entry, if possible, returning |this| to nit: perhaps mention somewhere in this file that it may read the file contents in advance to return synchronously from ReadData? https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.h:89: // Creates this entry, if possible. Returns |this| to |entry|. same nit about comment as OpenEntry https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.h:304: // Called after completion of asynchronous IO and receiving file metadata for it can also be called after reading the stream from memory https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.h:318: void ReadInMemoryStreamData(net::GrowableIOBuffer* in_buf, naming bikeshed: 'Stream' is slightly confusing because we are not providing stream index to the method. Also seems important to reflect the posting activity. Suggestion: ReadFromBufferAndPostReply()? https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.h:415: // the first read call on stream 1 synchronous. If a write to the stream oops, it's not synchronous because we PostTask every time. Apologies if I suggested this formulation, I forgot that we do not return synchronously. https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:715: int rv = GetEOFRecordData(base::StringPiece(), stream_index, entry_stat, base::StringPiece is one of those rare classes that does not require explicit constructor, and StringPiece parameters are designed to accept both C++ and C strings, so a nullptr would work here. If you agree that nullptr is more readable, then I'd prefer also a comment: GetEOFRecordData(nullptr /* file_0_prefetch */, ...) https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1321: file_0_prefetch, 0, extra_stream_0_read, *out_entry_stat, stream_0_eof, maybe a few explanations how args correspond to params? like: s!extra_stream_0_read!extra_stream_0_size /* extra_size */! https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1334: rv = PreReadStreamPayload(file_0_prefetch, 1, 0, *out_entry_stat, s!0,!0 /* extra_size */,! https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1407: sizeof(SimpleFileEOF), generally it is less error-prone to to use sizeof(variable) instead of sizeof(type) https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1422: SIMPLE_CACHE_UMA(BOOLEAN, "SyncCheckEOFHasCrc", cache_type_, Seems like both prefetch and last ReadData would record this histogram for stream1... Am I right? In this case we would have different counts in this histogram in control/ experiment group, which would be not ideal for comparisons. Did not look at tests yet .. maybe worth some histogram tester work ... https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.h (right): https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.h:85: struct SimpleStreamPrefetchData { Arguably we should use "a struct only for passive objects that carry data; everything else is a class", and scoped_refptr is part of it and is _not_ a passive object. Magical things happen inside the object pointed to by |data| when we copy the struct. class? https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.h:93: struct SimpleEntryCreationResults { then this should also be a class? https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.h:320: SimpleStreamPrefetchData* stream_prefetch_data); We should explain that |stream_prefetch_data| is a two-element array, otherwise the declaration is confusing. Also, how about the array syntax: SimpleStreamPrefetchData stream_prefetch_data[2] https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.h:353: scoped_refptr<net::GrowableIOBuffer>* stream_data, why not producing SimpleStreamPrefetchData instead of the last two output params? https://codereview.chromium.org/2874833005/diff/360001/tools/metrics/histogra... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/2874833005/diff/360001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:73456: + on first read op. s!first read op.!first read op of each entry! Also: if there are multiple concurrent openers/readers/writers of the entry, only the result from a single read is recorded. Maybe it'd be nice to mention that?
Description was changed from ========== ... prefetching stream 1 content along with all the metadata + stream 0 content we always read. Current revision does it for anything <= 32KiB, though perhaps it may be worth doing this as an experiment? Prefetching like this seem to help cold read performance significantly on Mac (and less so on Android and Linux, haven't seen much effect on Windows). As a bonus, it also means that the first data/stream 1 read request from the client can be answered immediately for files to which this applies. A few bits and pieces based on jkarlin's 2872943002 Doc for experiment: https://docs.google.com/a/chromium.org/document/d/1u4udJ8fWV-GOoyZAh5NeOwHYlT... BUG=719979 ========== to ========== For files smaller than a configurable threshold we will only issue a single file read, immediately reading in all of metadata and stream 0 and stream 1 content upon entry opening. Files exceeding the threshold will keep the old behavior of only reading stream 0 and metadata (with somewhat more complicated sequence of file ops), and will read stream 1 content on-demand. The exact threshold will be determined by an experiment, with the code defaulting to 0 (e.g. disabled). Prefetching stream 1 content like this seem to help cold read performance significantly on Mac (and less so on Android and Linux, haven't seen much effect on Windows). Doc for experiment: https://goo.gl/RgW942 A few bits and pieces based on jkarlin's 2872943002 BUG=719979 ==========
The CQ bit was checked by morlovich@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Flushing most of the comments. Thanks for the feedback, and don't worry about the timing --- Cache thread migration has been keeping me plenty occupied. https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/entry_u... File net/disk_cache/entry_unittest.cc (right): https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/entry_u... net/disk_cache/entry_unittest.cc:3447: const int kBufferSize = 50000; // to avoid quick read On 2017/08/04 01:28:34, pasko wrote: > The context is not easy to guess from this comment. Perhaps we could make a > constant like kMaxFilePrefetchSize and add a compile_assert to verify that the > buffer size is bigger? A comment can explain that otherwise the > EXPECT_EQ(net::ERR_IO_PENDING, ...) would not work. Or .. maybe the explanation > is different? Can't really give an exact number of that (since people could set it as high as the experiment framework permits...), but I did expand upon the comment. https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... File net/disk_cache/simple/simple_entry_impl.h (right): https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.h:85: // Adds another reader/writer to this entry, if possible, returning |this| to On 2017/08/04 01:28:34, pasko wrote: > nit: perhaps mention somewhere in this file that it may read the file contents > in advance to return synchronously from ReadData? Acknowledged. https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.h:89: // Creates this entry, if possible. Returns |this| to |entry|. On 2017/08/04 01:28:34, pasko wrote: > same nit about comment as OpenEntry Doesn't apply since Create fails if the entry already exists. https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.h:304: // Called after completion of asynchronous IO and receiving file metadata for On 2017/08/04 01:28:34, pasko wrote: > it can also be called after reading the stream from memory Rephrased. https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.h:318: void ReadInMemoryStreamData(net::GrowableIOBuffer* in_buf, On 2017/08/04 01:28:34, pasko wrote: > naming bikeshed: 'Stream' is slightly confusing because we are not providing > stream index to the method. Also seems important to reflect the posting > activity. Suggestion: ReadFromBufferAndPostReply()? Done. https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.h:415: // the first read call on stream 1 synchronous. If a write to the stream On 2017/08/04 01:28:34, pasko wrote: > oops, it's not synchronous because we PostTask every time. Apologies if I > suggested this formulation, I forgot that we do not return synchronously. So did I, actually. I think this was the original motivation Josh had, but it kinda got overtaken by me liking single-read. Dropped this sentence. https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:715: int rv = GetEOFRecordData(base::StringPiece(), stream_index, entry_stat, On 2017/08/04 01:28:34, pasko wrote: > base::StringPiece is one of those rare classes that does not require explicit > constructor, and StringPiece parameters are designed to accept both C++ and C > strings, so a nullptr would work here. I prefer the explicit construction since it increases typechecking. (As you can't pass a StringPiece to something that take a Object*, while you can with nullptr) https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1321: file_0_prefetch, 0, extra_stream_0_read, *out_entry_stat, stream_0_eof, On 2017/08/04 01:28:34, pasko wrote: > maybe a few explanations how args correspond to params? > > like: > > s!extra_stream_0_read!extra_stream_0_size /* extra_size */! I went for extra_post_stream_0_read as name instead, to make it clear that it's not really part of stream data. Commented stream_index, too. https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1334: rv = PreReadStreamPayload(file_0_prefetch, 1, 0, *out_entry_stat, On 2017/08/04 01:28:34, pasko wrote: > s!0,!0 /* extra_size */,! Did a variant of that. https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1407: sizeof(SimpleFileEOF), On 2017/08/04 01:28:34, pasko wrote: > generally it is less error-prone to to use sizeof(variable) instead of > sizeof(type) It would have to be sizeof(*variable) in this case, though....? I am more confident in my ability to know what I am reading --- when it's a struct, rather than a buffer, anyway --- than to not forget a * in a way the typechecker can't catch. https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1422: SIMPLE_CACHE_UMA(BOOLEAN, "SyncCheckEOFHasCrc", cache_type_, On 2017/08/04 01:28:34, pasko wrote: > Seems like both prefetch and last ReadData would record this histogram for > stream1... Am I right? I think it can't actually happen. To hit sync-side ReadData on stream1, it would need to have the prefetched stream1 data discarded due to a write, and we don't/can't verify checksums if a stream has been written to. https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.h (right): https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.h:320: SimpleStreamPrefetchData* stream_prefetch_data); On 2017/08/04 01:28:35, pasko wrote: > We should explain that |stream_prefetch_data| is a two-element array, otherwise > the declaration is confusing. Also, how about the array syntax: > > SimpleStreamPrefetchData stream_prefetch_data[2] Done (and elsewhere) https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.h:353: scoped_refptr<net::GrowableIOBuffer>* stream_data, On 2017/08/04 01:28:34, pasko wrote: > why not producing SimpleStreamPrefetchData instead of the last two output > params? Done. https://codereview.chromium.org/2874833005/diff/360001/tools/metrics/histogra... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/2874833005/diff/360001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:73456: + on first read op. On 2017/08/04 01:28:35, pasko wrote: > s!first read op.!first read op of each entry! > > Also: if there are multiple concurrent openers/readers/writers of the entry, > only the result from a single read is recorded. Maybe it'd be nice to mention > that? Yeah, good point. I think they're pretty rare, though, since that sounds kinda hard to interpret.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.h (right): https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.h:85: struct SimpleStreamPrefetchData { On 2017/08/04 01:28:35, pasko wrote: > Arguably we should use "a struct only for passive objects that carry data; > everything else is a class", and scoped_refptr is part of it and is _not_ a > passive object. Magical things happen inside the object pointed to by |data| > when we copy the struct. > > class? Is it that different from having an std::string member?
https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/entry_u... File net/disk_cache/entry_unittest.cc (right): https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/entry_u... net/disk_cache/entry_unittest.cc:3447: const int kBufferSize = 50000; // to avoid quick read On 2017/08/04 18:35:43, Maks Orlovich wrote: > On 2017/08/04 01:28:34, pasko wrote: > > The context is not easy to guess from this comment. Perhaps we could make a > > constant like kMaxFilePrefetchSize and add a compile_assert to verify that the > > buffer size is bigger? A comment can explain that otherwise the > > EXPECT_EQ(net::ERR_IO_PENDING, ...) would not work. Or .. maybe the > explanation > > is different? > > Can't really give an exact number of that (since people could set it as high as > the experiment framework permits...), Oh, my comment is about adding a constant and a compile assert only in the test, and compare it only with the default. This may still break if we set an insanely large number in fieldtrial_testing_config.json, but I guess we won't do that.. > but I did expand upon the comment. The comment worksforme as well, thanks https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... File net/disk_cache/simple/simple_entry_impl.h (right): https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.h:85: // Adds another reader/writer to this entry, if possible, returning |this| to On 2017/08/04 18:35:44, Maks Orlovich wrote: > On 2017/08/04 01:28:34, pasko wrote: > > nit: perhaps mention somewhere in this file that it may read the file contents > > in advance to return synchronously from ReadData? > > Acknowledged. sry, it was a leftover comment when I thought that we return synchronously https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.h:89: // Creates this entry, if possible. Returns |this| to |entry|. On 2017/08/04 18:35:44, Maks Orlovich wrote: > On 2017/08/04 01:28:34, pasko wrote: > > same nit about comment as OpenEntry > > Doesn't apply since Create fails if the entry already exists. Acknowledged. https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:715: int rv = GetEOFRecordData(base::StringPiece(), stream_index, entry_stat, On 2017/08/04 18:35:44, Maks Orlovich wrote: > On 2017/08/04 01:28:34, pasko wrote: > > base::StringPiece is one of those rare classes that does not require explicit > > constructor, and StringPiece parameters are designed to accept both C++ and C > > strings, so a nullptr would work here. > > I prefer the explicit construction since it increases typechecking. > (As you can't pass a StringPiece to something that take a Object*, while you can > with nullptr) OK, makes sense. I am not sure I fully understand the benefits of StringPiece here compared to char[]. Not having to pass size as additional parameter? https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1334: rv = PreReadStreamPayload(file_0_prefetch, 1, 0, *out_entry_stat, On 2017/08/04 18:35:44, Maks Orlovich wrote: > On 2017/08/04 01:28:34, pasko wrote: > > s!0,!0 /* extra_size */,! > > Did a variant of that. Thanks for that, I forgot the style guide https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1422: SIMPLE_CACHE_UMA(BOOLEAN, "SyncCheckEOFHasCrc", cache_type_, On 2017/08/04 18:35:44, Maks Orlovich wrote: > On 2017/08/04 01:28:34, pasko wrote: > > Seems like both prefetch and last ReadData would record this histogram for > > stream1... Am I right? > > I think it can't actually happen. To hit sync-side ReadData on stream1, it would > need to have the prefetched stream1 data discarded due to a write, and we > don't/can't verify checksums if a stream has been written to. Even though this might be currently true, I find it easy to overlook in later changes. Please add a test to verify that SyncCheckEOFHasCrc is recorded only once for both prefetched and non-prefetched stream1 cases when we read the entry from the beginning to the end. Hopefully in two variants: when there is CRC and when there is not. https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.h (right): https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.h:85: struct SimpleStreamPrefetchData { On 2017/08/07 13:09:01, Maks Orlovich wrote: > On 2017/08/04 01:28:35, pasko wrote: > > Arguably we should use "a struct only for passive objects that carry data; > > everything else is a class", and scoped_refptr is part of it and is _not_ a > > passive object. Magical things happen inside the object pointed to by |data| > > when we copy the struct. > > > > class? > > Is it that different from having an std::string member? I think std::string and scoped_refptr equally make me want to make it a class. Also there is " If in doubt, make it a class" in the relevant part of the styleguide... https://codereview.chromium.org/2874833005/diff/360001/tools/metrics/histogra... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/2874833005/diff/360001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:73456: + on first read op. On 2017/08/04 18:35:44, Maks Orlovich wrote: > On 2017/08/04 01:28:35, pasko wrote: > > s!first read op.!first read op of each entry! > > > > Also: if there are multiple concurrent openers/readers/writers of the entry, > > only the result from a single read is recorded. Maybe it'd be nice to mention > > that? > > Yeah, good point. I think they're pretty rare, though, since that sounds kinda > hard to interpret. well, if they were too rare, we would not have invested into dealing with the cache lock. These edge cases are hard to interpret and remember, that's why I think being more explicit in a description here would help a lot. https://codereview.chromium.org/2874833005/diff/380001/net/disk_cache/simple/... File net/disk_cache/simple/simple_entry_impl.h (right): https://codereview.chromium.org/2874833005/diff/380001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.h:318: // and updating metadata as appropriate. If |callback| is non-null, it will nit: s/as appropriate// https://codereview.chromium.org/2874833005/diff/380001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.h:319: // be posted to with the return code. s/posted to/posted to the current task runner/ ? ... since a callback cannot be posted to (though it may be that my English is lacking here, lemme know) https://codereview.chromium.org/2874833005/diff/380001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/380001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1333: /*extra_size = */ 0, *out_entry_stat, nit: one extra space before 'extra' for consistency
Flushing the easy stuff while look at the how to test the CRC thing. https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:715: int rv = GetEOFRecordData(base::StringPiece(), stream_index, entry_stat, On 2017/08/09 12:28:21, pasko wrote: > On 2017/08/04 18:35:44, Maks Orlovich wrote: > > On 2017/08/04 01:28:34, pasko wrote: > > > base::StringPiece is one of those rare classes that does not require > explicit > > > constructor, and StringPiece parameters are designed to accept both C++ and > C > > > strings, so a nullptr would work here. > > > > I prefer the explicit construction since it increases typechecking. > > (As you can't pass a StringPiece to something that take a Object*, while you > can > > with nullptr) > > OK, makes sense. I am not sure I fully understand the benefits of StringPiece > here compared to char[]. Not having to pass size as additional parameter? Yeah, bundles up the pointer and length together, so they are a single parameter and it's clear they're associated. https://codereview.chromium.org/2874833005/diff/360001/tools/metrics/histogra... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/2874833005/diff/360001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:73456: + on first read op. > well, if they were too rare, we would not have invested into dealing with the > cache lock. These edge cases are hard to interpret and remember, that's why I > think being more explicit in a description here would help a lot. Oh, wow, I thought I changed this... Huh, when I open the file in my source tree I get this: " Whether a read from stream 1 was satisfied from prefetch data. Reported only on first read op of entry (including if there are multiple readers, or even some writers). " Not sure what's going on here. https://codereview.chromium.org/2874833005/diff/380001/net/disk_cache/simple/... File net/disk_cache/simple/simple_entry_impl.h (right): https://codereview.chromium.org/2874833005/diff/380001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.h:318: // and updating metadata as appropriate. If |callback| is non-null, it will On 2017/08/09 12:28:22, pasko wrote: > nit: s/as appropriate// Done. https://codereview.chromium.org/2874833005/diff/380001/net/disk_cache/simple/... net/disk_cache/simple/simple_entry_impl.h:319: // be posted to with the return code. On 2017/08/09 12:28:22, pasko wrote: > s/posted to/posted to the current task runner/ ? ... since a callback cannot be > posted to (though it may be that my English is lacking here, lemme know) Done. https://codereview.chromium.org/2874833005/diff/380001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/380001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1333: /*extra_size = */ 0, *out_entry_stat, On 2017/08/09 12:28:22, pasko wrote: > nit: one extra space before 'extra' for consistency Done.
The CQ bit was checked by morlovich@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
flushing responses https://codereview.chromium.org/2874833005/diff/360001/tools/metrics/histogra... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/2874833005/diff/360001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:73456: + on first read op. On 2017/08/09 15:25:01, Maks Orlovich wrote: > > > well, if they were too rare, we would not have invested into dealing with the > > cache lock. These edge cases are hard to interpret and remember, that's why I > > think being more explicit in a description here would help a lot. > > Oh, wow, I thought I changed this... Huh, when I open the file in my source tree > I get this: > > " > Whether a read from stream 1 was satisfied from prefetch data. Reported only > on first read op of entry (including if there are multiple readers, or even > some writers). > " > Not sure what's going on here. Thanks! When I was replying I did not look at your latest changes because your response did not obviously suggest that you made the changes. Confusionz :)
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1422: SIMPLE_CACHE_UMA(BOOLEAN, "SyncCheckEOFHasCrc", cache_type_, On 2017/08/09 12:28:21, pasko wrote: > On 2017/08/04 18:35:44, Maks Orlovich wrote: > > On 2017/08/04 01:28:34, pasko wrote: > > > Seems like both prefetch and last ReadData would record this histogram for > > > stream1... Am I right? > > > > I think it can't actually happen. To hit sync-side ReadData on stream1, it > would > > need to have the prefetched stream1 data discarded due to a write, and we > > don't/can't verify checksums if a stream has been written to. > > Even though this might be currently true, I find it easy to overlook in later > changes. Please add a test to verify that SyncCheckEOFHasCrc is recorded only > once for both prefetched and non-prefetched stream1 cases when we read the entry > from the beginning to the end. Hopefully in two variants: when there is CRC and > when there is not. Good call on this since while SyncCheckEOFHasCrc was indeed recorded fine, SyncCheckEOFResult was handled inconsistently. (Will upload fix + new tests after the histogram thing is done with..)
Going over this, I really like storing the prefetched data explicitly in the element when the entire entry is small enough to read at once. Smart, saves seeks, etc... But I think that the code in the synchronous entry perpetuates some of the awkward paths in ReadAndValidateStream0(...AndMaybeStream1), and that is undesirable. Would it make sense to land some version of https://codereview.chromium.org/2090263004 , which will greatly simplify this code path (and should have some speedup, and certainly seems to in our benchmarks), then rebase this change to be downstream of that? This would make the syncronous entry part of the code a lot simpler. Maks, wdyt?
On 2017/08/10 16:43:26, gavinp wrote: > Going over this, I really like storing the prefetched data explicitly in the > element when the entire entry is small enough to read at once. > > Smart, saves seeks, etc... > > But I think that the code in the synchronous entry perpetuates some of the > awkward paths in ReadAndValidateStream0(...AndMaybeStream1), and that is > undesirable. > > Would it make sense to land some version of > https://codereview.chromium.org/2090263004 , which will greatly simplify this > code path (and should have some speedup, and certainly seems to in our > benchmarks), then rebase this change to be downstream of that? Could you elaborate what you find simpler there? I think in general this version feels to me to give a better high-level overview; though the thing that does bother me is the whole "// Pretend this file has a null stream zero, and contains the optional key" bit, and some of the other weirdness around that. Changing this to compute the offset for the stream 0 eof directly, as well as stream 1 does seem like something that can be done directly pretty reasonably, and it's not clear the existing code is *really* using the sizing abstractions rather than piercing them.
The CQ bit was checked by morlovich@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Reworked the size computation to be what I feel is more natural --- Gavin, Egor, what do you think? Easy enough to revert if it feels worse. Also a previous revision added the requested checksum stat tests.
The CQ bit was checked by morlovich@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was checked by morlovich@chromium.org to run a CQ dry run
On 2017/08/10 16:57:14, Maks Orlovich wrote: > On 2017/08/10 16:43:26, gavinp wrote: > > Going over this, I really like storing the prefetched data explicitly in the > > element when the entire entry is small enough to read at once. > > > > Smart, saves seeks, etc... > > > > But I think that the code in the synchronous entry perpetuates some of the > > awkward paths in ReadAndValidateStream0(...AndMaybeStream1), and that is > > undesirable. > > > > Would it make sense to land some version of > > https://codereview.chromium.org/2090263004 , which will greatly simplify this > > code path (and should have some speedup, and certainly seems to in our > > benchmarks), then rebase this change to be downstream of that? > > Could you elaborate what you find simpler there? I think in general this version > > feels to me to give a better high-level overview; though the thing that does > bother me is the whole > "// Pretend this file has a null stream zero, and contains the optional key" > bit, and some of the other weirdness around that. > > Changing this to compute the offset for the stream 0 eof directly, as well as > stream 1 does seem like something that can be done directly pretty reasonably, > and it's not clear the existing code is *really* using the sizing abstractions > rather than piercing them. Maybe let's have a short 1:1 to go through this. I think the other CL really cleans up a lot of the spaghetti that's piercing abstractions, and stops so many methods from having multiple uses. I think rebasing this on top of it will cut the size of this CL, and probably make it faster. Maybe go over this together. I think the final upload of https://codereview.chromium.org/2090263004 will delete a lot of code; for testing right now it's keeping a flag.
> Maybe let's have a short 1:1 to go through this. Sure, whenever convenient for you. (But please do see what I just uploaded, mostly rev.23 slightly fixed in 24)
On 2017/08/10 18:59:23, Maks Orlovich wrote: > Reworked the size computation to be what I feel is more natural --- Gavin, Egor, > what do you think? Easy enough to revert if it feels worse. I don't have much preference, both versions of this 100 line function are equally easy for me to read. I feel like .. since Gavin is actively looking into this, I don't want to unnecessarily slow you guys down by being closely involved. And I don't have much more to say other than asking for a higher level abstraction to proxy things .. like file streams .. but .. we already looked at it and it had a few questionable consequences .. nothing more to suggest from me. Let me know if you want more of my involvement, like making The Final Scan.
Maks, I've gone through it, and I think I'm OK with this, we don't need to have a 1:1 to go through. I'll use the time instead to do a detailed review. You're welcome to come by all the same of course.
On 2017/08/15 17:45:41, gavinp wrote: > Maks, > > I've gone through it, and I think I'm OK with this, we don't need to have a 1:1 > to go through. I'll use the time instead to do a detailed review. You're welcome > to come by all the same of course. Thanks Gavin, I'll look forward to the review then.
> Thanks Gavin, I'll look forward to the review then. Sorry, but ping on this...
Would really appreciate if I could proceed with improving this...
this lgtm https://codereview.chromium.org/2874833005/diff/460001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/460001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1362: if (!ok) { Could we just move the expression initializing ok into this if statement?
The CQ bit was checked by morlovich@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
morlovich@chromium.org changed reviewers: + mpearson@chromium.org
+mpearson for the histograms. https://codereview.chromium.org/2874833005/diff/460001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/460001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:1362: if (!ok) { On 2017/08/25 17:46:46, gavinp wrote: > Could we just move the expression initializing ok into this if statement? Done.
https://codereview.chromium.org/2874833005/diff/480001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/480001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:122: "GetSimpleCachePrefetchExperiment", base::FEATURE_DISABLED_BY_DEFAULT}; Err, I should probably drop the "Get" here, shouldn't I?
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Sorry this took me a while. Somehow I missed it in my inbox, and because it was on the old code review site, I wasn't reminded of it. --mark https://codereview.chromium.org/2874833005/diff/480001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/480001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:122: "GetSimpleCachePrefetchExperiment", base::FEATURE_DISABLED_BY_DEFAULT}; On 2017/08/25 19:43:26, Maks Orlovich wrote: > Err, I should probably drop the "Get" here, shouldn't I? Yes. :-P You may also want to drop "Experiment" in the string and variable name. IsEnabled(kSimpleCachePrefetch) looks pretty readable to me. https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:75064: + Whether a read from stream 1 was satisfied from prefetch data. Reported only nit: prefetch -> prefetched https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:75065: + on first read op of entry (including if there are multiple readers, or even nit: on first read op of entry -> on the first read of the stream (right?) (for clarity) https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:75066: + some writers). nit: Please define / explain "stream 1". (Maybe as a sentence between those two existing sentence.) https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:75156: + entry. nit: "entry" is vague. Please be clearer. URL? Also, please state what causes this to be emitted, i.e., what causes an entry to be "opened"? Is this always user-initiated?
https://codereview.chromium.org/2874833005/diff/480001/net/disk_cache/simple/... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/480001/net/disk_cache/simple/... net/disk_cache/simple/simple_synchronous_entry.cc:122: "GetSimpleCachePrefetchExperiment", base::FEATURE_DISABLED_BY_DEFAULT}; On 2017/08/29 04:52:29, Mark P wrote: > On 2017/08/25 19:43:26, Maks Orlovich wrote: > > Err, I should probably drop the "Get" here, shouldn't I? > > Yes. :-P Done. > > You may also want to drop "Experiment" in the string and variable name. > > IsEnabled(kSimpleCachePrefetch) looks pretty readable to me. Other things in net/disk_cache/simple/ have the suffix so I'll stay consistent with them. https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:75064: + Whether a read from stream 1 was satisfied from prefetch data. Reported only On 2017/08/29 04:52:30, Mark P wrote: > nit: prefetch -> prefetched Done. https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:75065: + on first read op of entry (including if there are multiple readers, or even On 2017/08/29 04:52:30, Mark P wrote: > nit: > on first read op of entry > -> > on the first read of the stream > (right?) > > (for clarity) Indeed. Reads from other streams wouldn't affect this. https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:75066: + some writers). On 2017/08/29 04:52:30, Mark P wrote: > nit: Please define / explain "stream 1". (Maybe as a sentence between those two > existing sentence.) (see below) https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:75156: + entry. On 2017/08/29 04:52:29, Mark P wrote: > nit: "entry" is vague. Please be clearer. URL? > Also, please state what causes this to be emitted, i.e., what causes an entry to > be "opened"? Is this always user-initiated? So this is obvious when one knows the disk_cache API: there is a disk_cache::Entry class, and one gets one via an OpenEntry method (or a CreateEntry, but that one makes an empty one, so there is nothing to read). How much context familiarity is expected of documentation strings here? The "stream" thing is slightly different, in that the API docs only use it in a couple spots, and generally use a vaguer "with the given index", which I am not entirely fond off as its less specific; it might make sense to just say |index| == 1 in the above descriptions, though, to match the header signature?
The CQ bit was checked by morlovich@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_chromeos_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
The CQ bit was checked by morlovich@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:75156: + entry. On 2017/08/29 13:24:12, Maks Orlovich wrote: > On 2017/08/29 04:52:29, Mark P wrote: > > nit: "entry" is vague. Please be clearer. URL? > > Also, please state what causes this to be emitted, i.e., what causes an entry > to > > be "opened"? Is this always user-initiated? > > So this is obvious when one knows the disk_cache API: there is a > disk_cache::Entry class, and one gets one via an OpenEntry method (or a > CreateEntry, but that one makes an empty one, so there is nothing to read). How > much context familiarity is expected of documentation strings here? Enough so that people who don't understand the feature can roughly understand what's going on. https://chromium.googlesource.com/chromium/src.git/+/HEAD/tools/metrics/histo... For example, rather than saying "when opening the entry", say "when the disk_cache API OpenEntry function is called." This makes it clean enough so someone can have some idea where to look for further documentation. Perhaps even mention whether this API is called only by web developers or whether it is called internally by Chrome too, as that can make a difference in how to interpret changes in its value. > The "stream" thing is slightly different, in that the API docs only use it in a > couple spots, and generally use a vaguer "with the given index", which I am not > entirely fond off as its less specific; it might make sense to just say |index| > == 1 in the above descriptions, though, to match the header signature? That doesn't sound any better to me: it's more technical yet no more enlightening. Conceptually what does "stream 1" or "stream at a given index (with index == 1)" represent? It must be something special, else you'd be logging all streams... For example, is it simply the first stream that was written to? https://codereview.chromium.org/2874833005/diff/520001/tools/metrics/histogra... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/2874833005/diff/520001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:75205: + only on the first read op of the stream (including if there are multiple op -> operation No need to abbreviate in writing.
https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:75156: + entry. > Enough so that people who don't understand the feature can roughly understand > what's going on. > https://chromium.googlesource.com/chromium/src.git/+/HEAD/tools/metrics/histo... > > For example, rather than saying "when opening the entry", say "when the > disk_cache API OpenEntry function is called." > > This makes it clean enough so someone can have some idea where to look for > further documentation. > Perhaps even mention whether this API is called only by web developers or > whether it is called internally by Chrome too, as that can make a difference in > how to interpret changes in its value. Done, using the full C++ name to make it clear it's not any sort of Web platform API, but rather an internal thing. > > > The "stream" thing is slightly different, in that the API docs only use it in > a > > couple spots, and generally use a vaguer "with the given index", which I am > not > > entirely fond off as its less specific; it might make sense to just say > |index| > > == 1 in the above descriptions, though, to match the header signature? > > That doesn't sound any better to me: it's more technical yet no more > enlightening. > Conceptually what does "stream 1" or "stream at a given index (with index == 1)" > represent? It must be something special, else you'd be logging all streams... > For example, is it simply the first stream that was written to? Gave some context.
histograms.xml with one warning below https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:75156: + entry. On 2017/08/29 19:04:10, Maks Orlovich wrote: > > > Enough so that people who don't understand the feature can roughly understand > > what's going on. > > > https://chromium.googlesource.com/chromium/src.git/+/HEAD/tools/metrics/histo... > > > > For example, rather than saying "when opening the entry", say "when the > > disk_cache API OpenEntry function is called." > > > > This makes it clean enough so someone can have some idea where to look for > > further documentation. > > Perhaps even mention whether this API is called only by web developers or > > whether it is called internally by Chrome too, as that can make a difference > in > > how to interpret changes in its value. > > Done, using the full C++ name to make it clear it's not any sort of > Web platform API, but rather an internal thing. Okay, this is good enough. However, I should mention that we try to avoid metrics and metrics descriptions that are logged deep inside function calls. This can be a problem when people refactor or add new code. If the function then gets called in additional contexts, the meaning of the histogram will change, though no one will likely end up realizing and revising the description. Or if people refactor the code, the description no longer makes sense. A good practice is to prefer histogram and user action descriptions that are clearly tied to a user-initiated event, or something that happens under particular conditions when dealing with a user event. This is probably less of an issue because this function is only called in one place at the moment and that isn't likely to change. Nonetheless, I feel I should provide this advice. > > > > > > The "stream" thing is slightly different, in that the API docs only use it > in > > a > > > couple spots, and generally use a vaguer "with the given index", which I am > > not > > > entirely fond off as its less specific; it might make sense to just say > > |index| > > > == 1 in the above descriptions, though, to match the header signature? > > > > That doesn't sound any better to me: it's more technical yet no more > > enlightening. > > Conceptually what does "stream 1" or "stream at a given index (with index == > 1)" > > represent? It must be something special, else you'd be logging all streams... > > For example, is it simply the first stream that was written to? > > Gave some context. That helps, thanks.
On 2017/08/29 19:13:26, Mark P wrote: > histograms.xml with one warning below (If this was meant to be a LGTM, I think it wasn't?) > Okay, this is good enough. However, I should mention that we try to avoid > metrics and metrics descriptions that are logged deep inside function > calls. This can be a problem when people refactor or add new code. > If the function then gets called in additional contexts, the meaning > of the histogram will change, though no one will likely end up realizing > and revising the description. Or if people refactor the code, the > description no longer makes sense. > > A good practice is to prefer histogram and user action descriptions > that are clearly tied to a user-initiated event, or something that happens > under particular conditions when dealing with a user event. These aren't that far in, though they do have a peculiar flavor since they are trying to help evaluate a performance change... But basically, when an HTTP request is handled for the user, there will be an OpenEntry call, then a ReadData from stream 0, and maybe a ReadData from stream 1. One metric determines what % of the time this particular implementation prefetches stream 1 data, and the second which % of the time that's useful (with some caveats in that it gets a little weird when you have concurrent access, though that's uncommon). (The above is true for SimpleCache.Http. infix... but there are other clients of this internal API, which get their own infixes).
lgtm On 2017/08/29 19:40:38, Maks Orlovich wrote: > On 2017/08/29 19:13:26, Mark P wrote: > > histograms.xml with one warning below > > (If this was meant to be a LGTM, I think it wasn't?) oops, I missed the magic word. > > > Okay, this is good enough. However, I should mention that we try to avoid > > metrics and metrics descriptions that are logged deep inside function > > calls. This can be a problem when people refactor or add new code. > > If the function then gets called in additional contexts, the meaning > > of the histogram will change, though no one will likely end up realizing > > and revising the description. Or if people refactor the code, the > > description no longer makes sense. > > > > A good practice is to prefer histogram and user action descriptions > > that are clearly tied to a user-initiated event, or something that happens > > under particular conditions when dealing with a user event. > > These aren't that far in, though they do have a peculiar flavor since they are > trying > to help evaluate a performance change... But basically, when an HTTP request is > handled > for the user, there will be an OpenEntry call, then a ReadData from stream 0, > and maybe > a ReadData from stream 1. One metric determines what % of the time this > particular implementation > prefetches stream 1 data, and the second which % of the time that's useful (with > some caveats > in that it gets a little weird when you have concurrent access, though that's > uncommon). > > (The above is true for SimpleCache.Http. infix... but there are other clients of > this internal API, > which get their own infixes).
The CQ bit was checked by morlovich@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from gavinp@chromium.org Link to the patchset: https://codereview.chromium.org/2874833005/#ps540001 (title: "Tweak histogram description based on feedback")
CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
CQ is committing da patch. Bot data: {"patchset_id": 540001, "attempt_start_ts": 1504041156853440, "parent_rev": "c5b58a676675c9ffd2f0daad83e5038f3b6cced4", "commit_rev": "c11194298e5fae423ed47cf6acccd0a717f436d1"}
CQ is committing da patch. Bot data: {"patchset_id": 540001, "attempt_start_ts": 1504041156853440, "parent_rev": "f2046fbe79e4e19c2391506be1303c412d04f5c7", "commit_rev": "2acfa3ba04b5902deefdc2aacdcc1d71e785d9be"}
Message was sent while issue was closed.
Description was changed from ========== For files smaller than a configurable threshold we will only issue a single file read, immediately reading in all of metadata and stream 0 and stream 1 content upon entry opening. Files exceeding the threshold will keep the old behavior of only reading stream 0 and metadata (with somewhat more complicated sequence of file ops), and will read stream 1 content on-demand. The exact threshold will be determined by an experiment, with the code defaulting to 0 (e.g. disabled). Prefetching stream 1 content like this seem to help cold read performance significantly on Mac (and less so on Android and Linux, haven't seen much effect on Windows). Doc for experiment: https://goo.gl/RgW942 A few bits and pieces based on jkarlin's 2872943002 BUG=719979 ========== to ========== For files smaller than a configurable threshold we will only issue a single file read, immediately reading in all of metadata and stream 0 and stream 1 content upon entry opening. Files exceeding the threshold will keep the old behavior of only reading stream 0 and metadata (with somewhat more complicated sequence of file ops), and will read stream 1 content on-demand. The exact threshold will be determined by an experiment, with the code defaulting to 0 (e.g. disabled). Prefetching stream 1 content like this seem to help cold read performance significantly on Mac (and less so on Android and Linux, haven't seen much effect on Windows). Doc for experiment: https://goo.gl/RgW942 A few bits and pieces based on jkarlin's 2872943002 BUG=719979 Review-Url: https://codereview.chromium.org/2874833005 Cr-Commit-Position: refs/heads/master@{#498256} Committed: https://chromium.googlesource.com/chromium/src/+/2acfa3ba04b5902deefdc2aacdcc... ==========
Message was sent while issue was closed.
Committed patchset #28 (id:540001) as https://chromium.googlesource.com/chromium/src/+/2acfa3ba04b5902deefdc2aacdcc... |