Issue 2874833005: SimpleCache: read small files all at once.

Maks Orlovich

The CQ bit was checked by morlovich@chromium.org to run a CQ dry run

3 years, 5 months ago (2017-07-07 17:03:31 UTC) #1

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2874833005/80001

3 years, 5 months ago (2017-07-07 17:03:46 UTC) #2

Maks Orlovich

The CQ bit was checked by morlovich@chromium.org to run a CQ dry run

3 years, 5 months ago (2017-07-07 19:59:36 UTC) #3

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2874833005/160001

3 years, 5 months ago (2017-07-07 19:59:56 UTC) #4

Maks Orlovich

The CQ bit was checked by morlovich@chromium.org to run a CQ dry run

3 years, 5 months ago (2017-07-07 20:22:41 UTC) #5

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2874833005/180001

3 years, 5 months ago (2017-07-07 20:22:59 UTC) #6

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 5 months ago (2017-07-07 21:38:45 UTC) #7

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

3 years, 5 months ago (2017-07-07 21:38:47 UTC) #8

Maks Orlovich

Description was changed from ========== Sketching out the "read it all in one gulp if ...

3 years, 5 months ago (2017-07-10 13:30:00 UTC) #9

Maks Orlovich

The CQ bit was checked by morlovich@chromium.org to run a CQ dry run

3 years, 5 months ago (2017-07-10 13:44:09 UTC) #10

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2874833005/200001

3 years, 5 months ago (2017-07-10 13:44:26 UTC) #11

Maks Orlovich

Description was changed from ========== ... prefetching stream 1 content along with all the metadata ...

3 years, 5 months ago (2017-07-10 13:47:32 UTC) #12

Maks Orlovich

morlovich@chromium.org changed reviewers: + gavinp@chromium.org, jkarlin@chromium.org, pasko@chromium.org

3 years, 5 months ago (2017-07-10 13:50:57 UTC) #13

Maks Orlovich

Might make sense to turn this into an experiment (over the prefetch size)?

3 years, 5 months ago (2017-07-10 13:51:00 UTC) #14

pasko

A few questions for sanity checking before reviewing the code: * I saw in the ...

3 years, 5 months ago (2017-07-10 14:28:59 UTC) #15

Maks Orlovich

On 2017/07/10 14:28:59, pasko wrote: > A few questions for sanity checking before reviewing the ...

3 years, 5 months ago (2017-07-10 15:02:37 UTC) #16

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 5 months ago (2017-07-10 15:46:03 UTC) #17

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

3 years, 5 months ago (2017-07-10 15:46:05 UTC) #18

pasko

3 years, 5 months ago (2017-07-11 13:24:12 UTC) #19

OK, overall looks safe and non-invasive.

How do you want to proceed?
* add variable length prefetches and Finch?
* replace jkarlin's change or land that one first?
* new histograms?
* does it require more local benchmarking?

On 2017/07/10 15:02:37, Maks Orlovich wrote:
> On 2017/07/10 14:28:59, pasko wrote:
> > A few questions for sanity checking before reviewing the code:
> > 
> > * I saw in the bug that this used to regress on warm cases on Linux/Android,
> > is it no longer the case?
> 
> Rather the "headers only cases", it seems:
> https://docs.google.com/spreadsheets/d/1l4zb18ez2cYNHlmTb3zLwslS7ejQpnVq90Aeh...
> (SR1 was an earlier revision, SR2 was one w/a bunch of copy avoidance that
> didn't do much, so current is basically SR1 + lots of refactoring, but it
> doesn't matter too much)

Do you suggest to disable prefetch for Android/Linux as a possible outcome of
experimentation?

> > * How will this affect browser memory consumption on low end devices?
> 
> Well, it reads payload it may not need, so something like 32K * # of open
> entries that are in early stage of HttpCache::Transaction state machine,
> worst case being as long as validation fetches are going on?

I don't understand the http cache transaction enough (not proud of it), do we
doom the entry on revalidation or truncate? I think the latter. Should we clear
stream_1_prefetch_data_ on truncate then? Sounds like a layering violation ..
but I guess as usual it's easier to manage than guaranteeing exclusivity of
prefetches vs. writes in http cache transaction? I would accept that.

> > * 32KiB sounds arbitrary ;) 
> 
> Not completely --- it's based around network stack read sizes, and observation
> it covers a lot of files, 

Ah, makes sense. So basically the size of the typical buffer for reading in.

> but that's a couple of steps removed, so it won't actually be enough for a
> full read.
>
> So yeah, something something experiment, with potentially different settings
> for different platforms?

Yeah, we can throw the exact same groups at all platforms and then figure out
which ones are better. We can look at low-end devices separately as well.

> > should we instead call readahead(2) to allow the
> > system to prioritize I/O better? (Not sure if OSX supports it, saw signs
> > that it may
> 
> Can't seem to find an analogue?

That was premature, I saw this:
https://github.com/xdevs23/busybox-osx/blob/master/miscutils/readahead.c, which
on closer inspection appears to be excluded from compilation.

mmap+madvise is another option on OSX, but not sure it will eat the mmap
overhead.

https://codereview.chromium.org/2874833005/diff/200001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_entry_impl.cc (right):

https://codereview.chromium.org/2874833005/diff/200001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.cc:857:
ReadInMemoryStreamData(stream_1_prefetch_data_.get(), offset, buf_len, buf,
perhaps we could record how much time it takes to do so? this may help to reason
about speedups/slowdowns?

adding prefetches to netlog may be less trivial, though close to something my
inner perf nerd would put in a wishlist. WDYT?

also, record how much of stream data invalidation is happening?

Maks Orlovich

3 years, 5 months ago (2017-07-11 14:32:58 UTC) #20

On 2017/07/11 13:24:12, pasko wrote:
> OK, overall looks safe and non-invasive.
> 
> How do you want to proceed?
> * add variable length prefetches and Finch?

So code-wise it would be a base::Feature with an associated param, with default
being existing behavior?
Doesn't seem like for this particular thing we need to do any flushing since it
shouldn't affect the
contents....

Then presumably some doc/bug process stuff for Finch, followed by a Google-land
CL to configure the experiment?

> * replace jkarlin's change or land that one first?

That change is far from a complete one, so there is nothing to land. There was a
bit of 
useful starting code for me to lift, though.

> * new histograms?

Some discussion of that in reply to code comments (which I still can't figure
out 
how to not make a separate e-mail).

> * does it require more local benchmarking?

I've added some Windows benchmarking, which is mostly inconclusive except for
the 
SSD + no antivirus case, which... is probably not where the focus should be.
(I don't expect the change to make much of a difference with Windows Defender,
since that
 reads in entire file at once itself)

Still need to redo mac+HDD data, but I seem to have misrecorded my password, so
will need 
to bug Gavin to reset it.

> Do you suggest to disable prefetch for Android/Linux as a possible outcome of
> experimentation?

Absolutely. Setting threshold to zero ought to do it. Hmm, actually should
probably 
just error out if file_size is zero rather than try to reason about zero-length
allocations?
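The decision being discussed here (prefetch only below a size threshold, treat a zero threshold as "off", and reject zero-length files up front) can be sketched as below. This is a minimal standalone illustration, not the code in the CL: `OpenReadPlan` and `ChooseOpenReadPlan` are hypothetical names, and whether the real comparison is `<` or `<=` is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration only; not identifiers from the CL.
enum class OpenReadPlan { kError, kPrefetchWholeFile, kReadPiecewise };

// Decides how an entry file is read on open.
// - file_size <= 0 is rejected so nobody has to reason about
//   zero-length allocations.
// - prefetch_threshold == 0 disables prefetching, because every valid
//   file has a size of at least 1.
OpenReadPlan ChooseOpenReadPlan(int64_t file_size, int64_t prefetch_threshold) {
  assert(prefetch_threshold >= 0);
  if (file_size <= 0)
    return OpenReadPlan::kError;
  if (file_size <= prefetch_threshold)  // Assumed inclusive bound.
    return OpenReadPlan::kPrefetchWholeFile;
  return OpenReadPlan::kReadPiecewise;
}
```

With this shape, an experiment group that sets the threshold to zero gets the pre-existing piecewise behavior without a separate on/off switch.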

> I don't understand the http cache transaction enough (not proud of it), do we
> doom the entry on revalidation or truncate? I think the latter. 

I think both can happen.

> Should we clear stream_1_prefetch_data_ on truncate then? 
> Sounds like a layering violation ..

This code already should? A truncate is just a write, after all, and I am not
sure why you 
consider it a layering violation, it's the backend's business how it manages
data between 
memory and disk, and this sort of thing should only be observable in some reads
becoming
immediate rather than roundtrip'ing via some event loops.

> > So yeah, something something experiment, with potentially different settings
> for
> > different platforms?
> 
> Yeah, we can throw the exact same groups at all platforms and then figure out
> which ones are better. We can look at low-end devices separately as well.

Is that the "Low memory device (android)" split? Can we actually know that in
//net 
when setting the defaults?

Maks Orlovich

https://codereview.chromium.org/2874833005/diff/200001/net/disk_cache/simple/simple_entry_impl.cc File net/disk_cache/simple/simple_entry_impl.cc (right): https://codereview.chromium.org/2874833005/diff/200001/net/disk_cache/simple/simple_entry_impl.cc#newcode857 net/disk_cache/simple/simple_entry_impl.cc:857: ReadInMemoryStreamData(stream_1_prefetch_data_.get(), offset, buf_len, buf, On 2017/07/11 13:24:12, pasko wrote: ...

3 years, 5 months ago (2017-07-11 14:34:07 UTC) #21

pasko

3 years, 5 months ago (2017-07-11 16:17:31 UTC) #22

> On 2017/07/11 13:24:12, pasko wrote:
> > OK, overall looks safe and non-invasive.
> > 
> > How do you want to proceed?
> > * add variable length prefetches and Finch?
>
> So code-wise it would be a base::Feature with an associated param, with
> default being existing behavior?
> Doesn't seem like for this particular thing we need to do any flushing since
> it shouldn't affect the contents....

Yes.

> Then presumably some doc/bug process stuff for Finch, followed by a
> Google-land CL to configure the experiment?

Yes, sorry about this overhead.

> > * replace jkarlin's change or land that one first?
>
> That change is far from a complete one, so there is nothing to land. There was
> a bit of useful starting code for me to lift, though.

OK, let's close that one if it is not intended to land?

> > * new histograms?
>
> Some discussion of that in reply to code comments (which I still can't figure
> out how to not make a separate e-mail).

Thank you.

> > * does it require more local benchmarking?
>
> I've added some Windows benchmarking, which is mostly inconclusive except for
> the SSD + no antivirus case, which... is probably not where the focus should
> be.  (I don't expect the change to make much of a difference with Windows
> Defender, since that reads in entire file at once itself)
>
> Still need to redo mac+HDD data, but I seem to have misrecorded my password,
> so will need to bug Gavin to reset it.

Ack, good to know, thanks for the update.

> > Do you suggest to disable prefetch for Android/Linux as a possible outcome
> > of experimentation?
>
> Absolutely. Setting threshold to zero ought to do it. Hmm, actually should
> probably just error out if file_size is zero rather than try to reason about
> zero-length allocations?

Not sure which allocation you mean, and I feel relaxed about overhead of one
unnecessary small allocation per cache entry.

> > I don't understand the http cache transaction enough (not proud of it), do
> > we doom the entry on revalidation or truncate? I think the latter.
>
> I think both can happen.
>
> > Should we clear stream_1_prefetch_data_ on truncate then? 
> > Sounds like a layering violation ..
>
> This code already should? A truncate is just a write, after all, and I am not
> sure why you consider it a layering violation, it's the backend's business how
> it manages data between memory and disk, and this sort of thing should only be
> observable in some reads becoming immediate rather than roundtrip'ing via some
> event loops.

Right, I asked this question, then found that your change does it properly, but
forgot to remove the question. Sorry for the noise.

> > > So yeah, something something experiment, with potentially different
> > > settings for different platforms?
> > 
> > Yeah, we can throw the exact same groups at all platforms and then figure
> > out which ones are better. We can look at low-end devices separately as well.
>
> Is that the "Low memory device (android)" split? Can we actually know that in
> //net when setting the defaults?

Absolutely: base::SysInfo::IsLowEndDevice().

https://codereview.chromium.org/2874833005/diff/200001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_entry_impl.cc (right):

https://codereview.chromium.org/2874833005/diff/200001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.cc:857:
ReadInMemoryStreamData(stream_1_prefetch_data_.get(), offset, buf_len, buf,
On 2017/07/11 14:34:06, Maks Orlovich wrote:
> On 2017/07/11 13:24:12, pasko wrote:
> > perhaps we could record how much time it takes to do so? this may help to
> > reason about speedups/slowdowns?
> 
> It's a memcpy? Doesn't sound that interesting... The Sync analogue may be more
> important since that does CRC, at least. 

sorry, wrong pointer, I was asking whether we can time the interval of
prefetching from disk.

> > adding prefetches to netlog may be less trivial, though close to something my
> > inner perf nerd would put in a wishlist. WDYT?
> 
> net_log_ isn't wired at SimpleSyncEntry, but we could log on this end at
> around l.1162 if a separate event sounds interesting?

Yeah, I was not sure whether you can provide a custom time interval to a netlog
..

> > also, record how much of stream data invalidation is happening?
> 
> Hmm, maybe rather record prefetches done vs. prefetches read? That seems a
> little easier to interpret, and also includes the case of data prefetched, not
> used, but not actively invalidated (can happen e.g. in the cant_conditionalize
> case).

Neat! I like it.
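The "prefetches done vs. prefetches read" idea can be sketched as a small per-entry tracker whose final state is what would be reported to a histogram. This is only an illustration under assumptions: `PrefetchOutcome` and `PrefetchTracker` are hypothetical names, and the actual histogram names and buckets in the CL are not shown in this thread.

```cpp
#include <cassert>

// Hypothetical outcome buckets; the real metric layout is not specified here.
enum class PrefetchOutcome {
  kNotPrefetched,      // Entry was opened without prefetching stream 1.
  kPrefetchedUnused,   // Data was prefetched but never served to a reader,
                       // whether or not it was actively invalidated.
  kPrefetchedAndRead,  // At least one read was served from the prefetch.
};

class PrefetchTracker {
 public:
  void OnPrefetched() { prefetched_ = true; }

  // Called when a read is satisfied from the in-memory prefetch buffer.
  void OnReadServedFromPrefetch() {
    assert(prefetched_);  // Cannot serve from a buffer that was never filled.
    read_ = true;
  }

  // The value to record once, e.g. when the entry is closed.
  PrefetchOutcome Outcome() const {
    if (!prefetched_)
      return PrefetchOutcome::kNotPrefetched;
    return read_ ? PrefetchOutcome::kPrefetchedAndRead
                 : PrefetchOutcome::kPrefetchedUnused;
  }

 private:
  bool prefetched_ = false;
  bool read_ = false;
};
```

Counting unused prefetches in one bucket is what makes the cant_conditionalize case visible without tracking invalidations separately.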

Maks Orlovich

The CQ bit was checked by morlovich@chromium.org to run a CQ dry run

3 years, 5 months ago (2017-07-12 16:04:38 UTC) #23

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2874833005/220001

3 years, 5 months ago (2017-07-12 16:04:50 UTC) #24

Maks Orlovich

Add some of the metrics, and an experiment knob. I am suddenly a bit worried ...

3 years, 5 months ago (2017-07-12 16:06:09 UTC) #25

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 5 months ago (2017-07-12 17:08:47 UTC) #26

commit-bot: I haz the power

Dry run: Try jobs failed on following builders: android_n5x_swarming_rel on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_n5x_swarming_rel/builds/218727)

3 years, 5 months ago (2017-07-12 17:08:48 UTC) #27

pasko

3 years, 5 months ago (2017-07-18 13:46:31 UTC) #28

Finally got to understand how the disk IO is organized. Generally looks good.

On the issue description: please update it, also explain that the small "_0"
files (with stream0 and stream1) are read to a buffer first and then if this
buffer is present, it replaces all file access.

My comments below are a mix between nits and a suggestion for a not-so-trivial
refactor. Sorry about that. The optimal way to read it is to find the largest
comment in GetEOFRecordData() and continuing the discussion from there, we can
sort out the remaining comments later.

https://codereview.chromium.org/2874833005/diff/200001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_entry_impl.cc (right):

https://codereview.chromium.org/2874833005/diff/200001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.cc:857:
ReadInMemoryStreamData(stream_1_prefetch_data_.get(), offset, buf_len, buf,
On 2017/07/12 16:06:09, Maks Orlovich wrote:
> 
> > sorry, wrong pointer, I was asking whether we can time the interval 
> > of prefetching from disk.
> 
> It's not really a separate event, since it replaces all the disk I/O we would
> be normally doing, so ... kinda?
> 
> 
> > Yeah, I was not sure whether you can provide a custom time interval 
> > to a netlog
> 
> Time interval? You kinda lost me here...

lulz :) I was writing it without knowing that you decided to read the whole file
first and then figure out which parts are which. So yeah, while it's a good fast
way, I should take back my attempts at measuring time it took to "read
potentially unnecessary parts".

So I guess I should say "Acknowledged".

Acknowledged

https://codereview.chromium.org/2874833005/diff/200001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.cc:1162: if
(in_results->stream_1_data.get()) {
almost a copy of the block above, perhaps make stream_data_ an array and
parametrize this?

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_entry_impl.h (right):

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.h:350: bool first_stream1_read_;  //
used for metrics only.
nit: for less ambiguity: is_initial_stream1_read? However, if we put this under
a wrapped base::File that I suggested in another comment, this would go away.

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.h:403: // Unlike other streams, stream 0
data is read from the disk when the entry is
this "Unlike other streams" will need to be updated because it is "sometimes"
like stream 1, right?

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_synchronous_entry.cc (left):

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1308: int32_t* out_data_size)
const {
Following how the state migrates from disk to the buffer and across EOF checks
is becoming difficult to understand. It was non-trivial before the change, but
now it hits near my thresholds :)

For example: ReadAndValidateStream0 sets the buffer and then sits on top of
PreRead to get it back from the buffer. Another example: GetEOFRecordData hides
reading something else behind the scenes. These things make it non-trivial to
guess how the functionality is layered, and the file has 1750 lines.

This makes me wonder whether we can redesign it a little. How about wrapping
base::File and hiding from the sync entry the fact that it has a buffer cached?
This would make the sync entry virtually unchanged (modulo clearing the buffer
sometimes, copying from it into the SimpleEntryImpl, histograms will need to
know cache type), all the EOF checking stays the same.

Will this lead to over-copying? Any other problems?

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_synchronous_entry.cc (right):

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:123: void
RecordSyncOpenPrefetchStatus(net::CacheType cache_type, bool result) {
my favorite: nit on naming, because it is always possible to nit on names :)

"Sync" is implied and unnecessary in this context. Are we planning to extend the
status from boolean? If not, then better be explicit:

RecordWhetherDidPrefetchedOnOpen(...)

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:279:
&out_results->stream_1_crc32);
too many output arguments, some set under non-trivial conditions? Maybe consider
pre-filling a SimpleEntryCreationResults in InitializeForOpen() or make another
struct?

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:722: int
SimpleSynchronousEntry::PreReadStreamPayload(
Why Pre? It just reads, no?

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1262: int
SimpleSynchronousEntry::ReadAndValidateStream0(
this fetches stream 1 as well, so the name of the function needs to be updated

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1280:
RecordSyncOpenPrefetchStatus(cache_type_, false);
nit: it is usually easier to read if the short branch comes first

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1382: if (offset >= file_size
|| end.ValueOrDie() >= file_size)
feel free to insert a DCHECK for the overflow, but not a CHECK!

If reading some corrupt data can lead to an overflow here, then we must handle
it. An explanation for why it can/cannot happen would be handy because this
function can later be used for something else.

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_synchronous_entry.h (right):

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.h:87: uint32_t stream_1_crc32;
are you packing this to make the sizeof(SimpleEntryCreationResults) smaller? I
think saving a few bytes is not important here, while readability would improve
if we group the data and the crc fields together, possibly wrapped in a separate
struct.

pasko

3 years, 5 months ago (2017-07-18 14:02:57 UTC) #29

a few bits on testing as well

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/disk_ca...
File net/disk_cache/disk_cache_test_base.cc (right):

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/disk_ca...
net/disk_cache/disk_cache_test_base.cc:95: // Make sure to cover the prefetch
path in SimpleCache.
doing work in constructors is generally discouraged...


The bigger issue is that we need a test to experiment with different params,
while most of the tests should run with param 0. There is a standard mechanism
to get coverage for different values of params as we roll to channels, there is
a json file to override the base::Feature for testing and set the most popular
configuration there. When submitting the Finch config you'll likely get a
warning about that. So that'd be preferable IMO.

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_experiment.cc (right):

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_experiment.cc:70: return
base::GetFieldTrialParamByFeatureAsInt(kSimpleCachePrefetchExperiment,
please also add a test that sets the experiment param differently and confirms
that things work on the SSEntry level.

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_experiment.h (right):

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_experiment.h:39: NET_EXPORT_PRIVATE int
GetSimpleCachePrefetchSize();
I think this file is for experiments that depend on persistent state on disk
(hence some index manipulations). The prefetch experiment does not need any of
this, hence it can be kept local to the sync entry. Not sure we can remove the
NET_EXPORT_PRIVATE if we want to test it, but the explicit separation would be
good.

morlovich

3 years, 5 months ago (2017-07-18 14:32:31 UTC) #30

Commenting on stuff that needs discussing rather than just doing.

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/disk_ca...
File net/disk_cache/disk_cache_test_base.cc (right):

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/disk_ca...
net/disk_cache/disk_cache_test_base.cc:95: // Make sure to cover the prefetch
path in SimpleCache.
On 2017/07/18 14:02:56, pasko wrote:
> doing work in constructors is generally discouraged...
> 
> 
> The bigger issue is that we need a test to experiment with different params,
> while most of the tests should run with param 0. There is a standard mechanism
> to get coverage for different values of params as we roll to channels, there
> is a json file to override the base::Feature for testing and set the most
> popular configuration there. When submitting the Finch config you'll likely
> get a warning about that. So that'd be preferable IMO.

Any keyword to search for/pointer for that mechanism?

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_synchronous_entry.cc (left):

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1308: int32_t* out_data_size)
const {
On 2017/07/18 13:46:31, pasko wrote:
> Following how the state migrates from disk to the buffer and across EOF checks
> is becoming difficult to understand. It was non-trivial before the change, but
> now it hits near my thresholds :)
> 
> For example: ReadAndValidateStream0 sets the buffer and then sits on top of
> PreRead to get it back from the buffer. Another example: GetEOFRecordData
> hides reading something else behind the scenes. These things make it
> non-trivial to guess how the functionality is layered, and the file has 1750
> lines.
> 
> This makes me wonder whether we can redesign it a little. How about wrapping
> base::File and hiding from the sync entry the fact that it has a buffer
> cached? This would make the sync entry virtually unchanged (modulo clearing
> the buffer sometimes, copying from it into the SimpleEntryImpl, histograms
> will need to know cache type), all the EOF checking stays the same.
> 
> Will this lead to over-copying? Any other problems?

So my idea was that ReadFromFileOrPrefetched would abstract whether we are
reading from the file or the prefetch buffer, though not fully, since you still
need to drag the two extra arguments around (it might make sense to bundle them
up, maybe even as a StringPiece). GetEOFRecordData is kinda where it shows extra
roughness, since that's also used outside the initial read and for files_[1].
Maybe a smaller change would be to make ReadFromFileOrPrefetch take a file
number, so that handles the wrinkle here?

I need to sleep on the wrapping-File idea, but what bothers me about it a little
is that files_ is permanent state, while the intent with the prefetch buffer was
just to have it for the duration of the initial read.

Also, one would probably still want to do some of the method additions I did,
like creating PreReadStreamPayload, since otherwise a whole bunch of stuff gets
duplicated.
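The abstraction described above, one read helper that serves from the prefetch buffer when it exists and falls back to the file otherwise, might look roughly like this. It is a standalone sketch under assumptions: the signature is invented for illustration, `FileReader` is a hypothetical stand-in for a positional base::File read, and the prefetch buffer is assumed to hold the whole file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical stand-in for a positional file read; returns true on success.
using FileReader =
    std::function<bool(std::size_t offset, std::size_t size, char* dest)>;

// Reads `size` bytes at `offset` into `dest`. If `prefetched` is non-null it
// is assumed to contain the entire file, and the read becomes a bounds-checked
// memcpy; otherwise the request is forwarded to `read_file`.
bool ReadFromFileOrPrefetched(const FileReader& read_file,
                              const std::vector<char>* prefetched,
                              std::size_t offset,
                              std::size_t size,
                              char* dest) {
  assert(dest != nullptr || size == 0);
  if (prefetched == nullptr)
    return read_file(offset, size, dest);
  // Compare against the remaining bytes so that offset + size is never formed.
  if (offset > prefetched->size() || size > prefetched->size() - offset)
    return false;
  if (size > 0)
    std::memcpy(dest, prefetched->data() + offset, size);
  return true;
}
```

The extra `prefetched` argument is the part that still has to be dragged through every caller, which is the roughness being discussed.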

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_synchronous_entry.cc (right):

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:722: int
SimpleSynchronousEntry::PreReadStreamPayload(
On 2017/07/18 13:46:31, pasko wrote:
> Why Pre? It just reads, no?

Because it's only used in the prefetch path?

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1262: int
SimpleSynchronousEntry::ReadAndValidateStream0(
On 2017/07/18 13:46:31, pasko wrote:
> this fetches stream 1 as well, so the name of the function needs to be updated

How awful is ReadAndValidateStream0AndMaybe1 ?
Any ideas for a better name?

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1382: if (offset >= file_size
|| end.ValueOrDie() >= file_size)
On 2017/07/18 13:46:31, pasko wrote:
> feel free to insert a DCHECK for the overflow, but not a CHECK!
> 
> If reading some corrupt data can lead to an overflow here, then we must handle
> it. An explanation for why it can/cannot happen would be handy because this
> function can later be used for something else.

See the conditional in line 1380, this can't actually fail at this point. I
guess it would be better to use AssignIfValid, though?
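One way to make the "can this overflow?" question moot is to rearrange the comparison so that no sum is ever computed. The following is a standalone sketch, not the code at line 1382: `RangeFitsInFile` is a hypothetical name, it does not use base::CheckedNumeric, and it checks a half-open range, which may differ from the exact condition in the CL.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if the half-open byte range [offset, offset + size) lies
// entirely inside a file of `file_size` bytes. Corrupt (negative or huge)
// offsets and sizes are rejected rather than allowed to overflow.
bool RangeFitsInFile(int64_t offset, int64_t size, int64_t file_size) {
  assert(file_size >= 0);
  if (offset < 0 || size < 0)
    return false;
  if (offset > file_size)
    return false;
  // Safe: 0 <= offset <= file_size, so the subtraction cannot overflow.
  return size <= file_size - offset;
}
```

Because the only arithmetic is a subtraction of two values already known to be ordered, there is nothing left for a DCHECK or CHECK to guard.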

pasko

3 years, 5 months ago (2017-07-19 16:28:26 UTC) #31

for the stuff worth discussing, developing the idea further

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/disk_ca...
File net/disk_cache/disk_cache_test_base.cc (right):

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/disk_ca...
net/disk_cache/disk_cache_test_base.cc:95: // Make sure to cover the prefetch
path in SimpleCache.
On 2017/07/18 14:32:31, morlovich wrote:
> On 2017/07/18 14:02:56, pasko wrote:
> > doing work in constructors is generally discouraged...
> > 
> > 
> > The bigger issue is that we need a test to experiment with different params,
> > while most of the tests should run with param 0. There is a standard
> > mechanism to get coverage for different values of params as we roll to
> > channels, there is a json file to override the base::Feature for testing and
> > set the most popular configuration there. When submitting the Finch config
> > you'll likely get a warning about that. So that'd be preferable IMO.
> 
> Any keyword to search for/pointer for that mechanism?

go/finch101 and search for "testing-config", but you'll need to read the whole
page anyway, which makes searching unnecessary

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_synchronous_entry.cc (left):

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1308: int32_t* out_data_size)
const {
On 2017/07/18 14:32:31, morlovich wrote:
> On 2017/07/18 13:46:31, pasko wrote:
> > Following how the state migrates from disk to the buffer and across EOF checks
> > is becoming difficult to understand. It was non-trivial before the change, but
> > now it hits near my thresholds :)
> > 
> > For example: ReadAndValidateStream0 sets the buffer and then sits on top of
> > PreRead to get it back from the buffer. Another example: GetEOFRecordData hides
> > reading something else behind the scenes. These things make it non-trivial to
> > guess how the functionality is layered, and the file has 1750 lines.
> > 
> > This makes me wonder whether we can redesign it a little. How about wrapping
> > base::File and hiding from the sync entry the fact that it has a buffer cached?
> > This would make the sync entry virtually unchanged (modulo clearing the buffer
> > sometimes, copying from it into the SimpleEntryImpl, histograms will need to
> > know cache type), all the EOF checking stays the same.
> > 
> > Will this lead to over-copying? Any other problems?
> 
> So my idea was that ReadFromFileOrPrefetched would abstract whether reading from
> file or prefetch, though not fully since you still need to drag the two extra
> arguments around (might make sense to bundle them up, maybe even as a
> StringPiece). GetEOFRecordData is kinda where it shows extra roughness since
> that's also used outside initial read and for files_[1]. Maybe a smaller change
> would be to make ReadFromFileOrPrefetch take file number, so that handles some of
> the wrinkle here?

Yeah, I understand your way to put ReadFromFileOrPrefetched under everything. It
is just not easy to figure out from sync_entry.h, and the name is not very
memorable, so a month from now I'd need to rediscover it. Tossing file
numbers vs streams around is actually also confusing.

Thinking about those files: we could make something like a CacheStreams object
that abstracts away file manipulation as much as possible. The sync entry
operates on top of streams 0, 1 and 2. What CacheStreams would do:

* mapping of stream to file
* opening/closing streams and associated laziness
* pull/push the key
* it does _not_ know about EOF records and crc32/sha256, those are just part of
streams
* has this readahead optimization that allows a memcpy later instead of a fair
read
* this is where the abstraction leaks: allows the client to 'drop caches' to
save memory (the streams object drops it behind the scenes when encountering a
write/truncate)

This makes it sort of a logical lower layer, which will allow not carrying the
prefetch_buf around and removes the usual file/offset calculation/passing
clutter in all codepaths. We still have the problem of verifying all CRC32 and
dealing with variants of SHA256 in the SSE.

Would it help?
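
To make the proposal above a bit more concrete, here is a rough sketch of what
such a layer might look like. Everything in it is hypothetical: only the name
CacheStreams comes from the comment above, while the methods, the in-memory
stand-in for files, and the assumed mapping (streams 0 and 1 in file 0, stream
2 in file 1) are illustrative and not the actual SimpleCache code. Offsets are
plain file offsets here; a real version would translate stream offsets.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch only; names and layout are illustrative.
class CacheStreams {
 public:
  // Stand-in for the on-disk files: files_[i] holds the bytes of file i.
  explicit CacheStreams(std::vector<std::string> files)
      : files_(std::move(files)) {}

  // Mapping of stream to file (assumed: streams 0 and 1 share file 0).
  static int FileIndexForStream(int stream_index) {
    return stream_index == 2 ? 1 : 0;
  }

  // Readahead: keep a whole file in memory so later reads are a memcpy.
  void Prefetch(int file_index) {
    prefetched_index_ = file_index;
    prefetch_buf_ = files_[file_index];
  }

  // The leaky part: the client may drop the cached copy to save memory.
  void DropCaches() {
    prefetched_index_ = -1;
    prefetch_buf_.clear();
  }

  bool HasPrefetch() const { return prefetched_index_ >= 0; }

  // Copies |len| bytes at |offset| of the file backing |stream_index|.
  // Returns the number of bytes copied, or -1 if the range is out of bounds.
  int Read(int stream_index, std::size_t offset, std::size_t len,
           char* out) const {
    const int file_index = FileIndexForStream(stream_index);
    const std::string& src =
        file_index == prefetched_index_ ? prefetch_buf_ : files_[file_index];
    if (offset > src.size() || len > src.size() - offset)
      return -1;
    std::memcpy(out, src.data() + offset, len);
    return static_cast<int>(len);
  }

  // A write invalidates the readahead buffer for the affected file.
  void Write(int stream_index, std::size_t offset, const std::string& data) {
    const int file_index = FileIndexForStream(stream_index);
    if (file_index == prefetched_index_)
      DropCaches();
    std::string& dst = files_[file_index];
    if (dst.size() < offset + data.size())
      dst.resize(offset + data.size());
    std::memcpy(&dst[offset], data.data(), data.size());
  }

 private:
  std::vector<std::string> files_;
  int prefetched_index_ = -1;
  std::string prefetch_buf_;
};
```

The sync entry would then only ever talk in terms of stream indices, and the
prefetch buffer would no longer need to be threaded through every call.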

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_synchronous_entry.cc (right):

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:722: int
SimpleSynchronousEntry::PreReadStreamPayload(
On 2017/07/18 14:32:31, morlovich wrote:
> On 2017/07/18 13:46:31, pasko wrote:
> > Why Pre? It just reads, no?
> 
> Because it's only used in the prefetch path?

ah, ok

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:745: reinterpret_cast<const
Bytef*>((*stream_data)->data()),
In case you have similar allergies to reinterpret_cast repetitions, perhaps
worth a function in simple_util to get a crc32 of a void* and size?

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:750:
RecordCheckEOFResult(cache_type_, CHECK_EOF_RESULT_CRC_MISMATCH);
just realized that this might slightly increase the crc mismatch rate because we
would newly-fail for prefetched stream1 that we would have otherwise discarded.

Sounds not very important (because mismatches are rare), but wanted to highlight
to be less worried later if we get alerts about that.

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1262: int
SimpleSynchronousEntry::ReadAndValidateStream0(
On 2017/07/18 14:32:31, morlovich wrote:
> On 2017/07/18 13:46:31, pasko wrote:
> > this fetches stream 1 as well, so the name of the function needs to be updated
> 
> How awful is ReadAndValidateStream0AndMaybe1 ?
> Any ideas for a better name?

As a name it sounds good to me

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1382: if (offset >= file_size
|| end.ValueOrDie() >= file_size)
On 2017/07/18 14:32:31, morlovich wrote:
> On 2017/07/18 13:46:31, pasko wrote:
> > feel free to insert a DCHECK for the overflow, but not a CHECK!
> > 
> > If reading some corrupt data can lead to an overflow here, then we must handle
> > it. An explanation for why it can/cannot happen would be handy because this
> > function can later be used for something else.
> 
> See the conditional in line 1380, this can't actually fail at this point. I
> guess it would be better to use AssignIfValid, though?

Ah, missed that. AssignIfValid() looks lengthy, I'd prefer ValueOrDefault(), but
ValueOrDie() would also work for me.
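
For readers following along, the concern in this thread can be sketched without
Chromium's base::CheckedNumeric: a range check on values read from a possibly
corrupt file should treat overflow as a validation failure rather than crash.
The function name and the half-open [offset, offset + size) convention below
are assumptions for illustration; the actual code in the CL may bound things
differently.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: returns true iff [offset, offset + size) lies entirely
// inside a file of |file_size| bytes. Overflow in offset + size is reported as
// "does not fit" instead of being allowed to wrap or abort the process.
bool RangeFitsInFile(int64_t offset, int64_t size, int64_t file_size) {
  if (offset < 0 || size < 0 || file_size < 0)
    return false;
  // offset + size would overflow int64_t, so the input must be bogus.
  if (size > std::numeric_limits<int64_t>::max() - offset)
    return false;
  return offset + size <= file_size;
}
```

With an earlier guard like this in place, a later ValueOrDie()-style call on
the same sum cannot fire, which is the situation described for line 1380.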

Maks Orlovich

> go/finch101 and search for "testing-config", but you'll need to read the whole > page ...

3 years, 5 months ago (2017-07-21 14:23:42 UTC) #32

Maks Orlovich

On 2017/07/21 14:23:42, Maks Orlovich wrote: > > go/finch101 and search for "testing-config", but you'll ...

3 years, 5 months ago (2017-07-21 17:32:15 UTC) #33

Maks Orlovich

The CQ bit was checked by morlovich@chromium.org to run a CQ dry run

3 years, 5 months ago (2017-07-21 18:25:32 UTC) #34

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2874833005/260001

3 years, 5 months ago (2017-07-21 18:25:50 UTC) #35

Maks Orlovich

3 years, 5 months ago (2017-07-21 18:37:34 UTC) #36

Flushing some comments, some things still in progress.

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_entry_impl.h (right):

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.h:350: bool first_stream1_read_;  //
used for metrics only.
On 2017/07/18 13:46:31, pasko wrote:
> nit: for less ambiguity: is_initial_stream1_read? However, if we put this under
> a wrapped base::File that I suggested in another comment, this would go away.

Done.

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_synchronous_entry.cc (right):

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:123: void
RecordSyncOpenPrefetchStatus(net::CacheType cache_type, bool result) {
On 2017/07/18 13:46:31, pasko wrote:
> my favorite: nit on naming, because it is always possible to nit on names :)
> 
> "Sync" is implied and unnecessary in this context. Are we planning to extend the
> status from boolean? If not, then better be explicit:
> 
> RecordWhetherDidPrefetchedOnOpen(...)

Done'ish

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:722: int
SimpleSynchronousEntry::PreReadStreamPayload(
On 2017/07/19 16:28:26, pasko wrote:
> On 2017/07/18 14:32:31, morlovich wrote:
> > On 2017/07/18 13:46:31, pasko wrote:
> > > Why Pre? It just reads, no?
> > 
> > Because it's only used in the prefetch path?
> 
> ah, ok

Acknowledged.

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:750:
RecordCheckEOFResult(cache_type_, CHECK_EOF_RESULT_CRC_MISMATCH);
On 2017/07/19 16:28:26, pasko wrote:
> just realized that this might slightly increase the crc mismatch rate because we
> would newly-fail for prefetched stream1 that we would have otherwise discarded.
> 
> Sounds not very important (because mismatches are rare), but wanted to highlight
> to be less worried later if we get alerts about that.

Acknowledged.

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1280:
RecordSyncOpenPrefetchStatus(cache_type_, false);
On 2017/07/18 13:46:31, pasko wrote:
> nit: it is usually easier to read if the short branch comes first

[Citation needed], but done.

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1382: if (offset >= file_size
|| end.ValueOrDie() >= file_size)
> Ah, missed that. AssignIfValid() looks lengthy, I'd prefer ValueOrDefault(), but
> ValueOrDie() would also work for me.

Ended up with AssignIfValid anyway, but I think it works better with how the
code evolved..

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_synchronous_entry.h (right):

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.h:87: uint32_t stream_1_crc32;
On 2017/07/18 13:46:31, pasko wrote:
> are you packing this to make the sizeof(SimpleEntryCreationResults) smaller? I
> think saving a few bytes is not important here, while readability would improve
> if we group the data and the crc fields together, possibly wrapped in a separate
> struct.
> 
> 

Done.

pasko

On 2017/07/21 17:32:15, Maks Orlovich wrote: > On 2017/07/21 14:23:42, Maks Orlovich wrote: > > ...

3 years, 5 months ago (2017-07-24 12:04:36 UTC) #37

pasko

On 2017/07/21 14:23:42, Maks Orlovich wrote: > > go/finch101 and search for "testing-config", but you'll ...

3 years, 5 months ago (2017-07-24 12:22:31 UTC) #38

pasko

https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/simple_synchronous_entry.cc File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/simple/simple_synchronous_entry.cc#newcode1280 net/disk_cache/simple/simple_synchronous_entry.cc:1280: RecordSyncOpenPrefetchStatus(cache_type_, false); On 2017/07/21 18:37:33, Maks Orlovich wrote: > ...

3 years, 5 months ago (2017-07-24 12:22:46 UTC) #39

Maks Orlovich

The CQ bit was checked by morlovich@chromium.org to run a CQ dry run

3 years, 5 months ago (2017-07-25 15:48:41 UTC) #40

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2874833005/300001

3 years, 5 months ago (2017-07-25 15:48:49 UTC) #41

Maks Orlovich

The CQ bit was checked by morlovich@chromium.org to run a CQ dry run

3 years, 5 months ago (2017-07-25 15:55:45 UTC) #42

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2874833005/320001

3 years, 5 months ago (2017-07-25 15:55:50 UTC) #43

Maks Orlovich

> Yeah, it is not very pure, the SSEntry will need to know stream peculiarities ...

3 years, 5 months ago (2017-07-25 16:06:10 UTC) #44

Maks Orlovich

Flushing a few more things https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/disk_cache_test_base.cc File net/disk_cache/disk_cache_test_base.cc (right): https://codereview.chromium.org/2874833005/diff/220001/net/disk_cache/disk_cache_test_base.cc#newcode95 net/disk_cache/disk_cache_test_base.cc:95: // Make sure to ...

3 years, 5 months ago (2017-07-25 16:06:30 UTC) #45

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 5 months ago (2017-07-25 17:26:40 UTC) #46

commit-bot: I haz the power

Dry run: Try jobs failed on following builders: linux_chromium_asan_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_asan_rel_ng/builds/419259)

3 years, 5 months ago (2017-07-25 17:26:42 UTC) #47

Maks Orlovich

Oh, the failure is due to sharding sensitivity, I think. Will fix.

3 years, 5 months ago (2017-07-25 17:46:19 UTC) #48

Maks Orlovich

The CQ bit was checked by morlovich@chromium.org to run a CQ dry run

3 years, 5 months ago (2017-07-25 19:10:29 UTC) #49

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2874833005/340001

3 years, 5 months ago (2017-07-25 19:10:46 UTC) #50

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 5 months ago (2017-07-25 22:10:24 UTC) #51

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

3 years, 5 months ago (2017-07-25 22:10:25 UTC) #52

Maks Orlovich

morlovich@chromium.org changed reviewers: - jkarlin@chromium.org

3 years, 4 months ago (2017-07-28 16:53:40 UTC) #53

Maks Orlovich

Description was changed from ========== ... prefetching stream 1 content along with all the metadata ...

3 years, 4 months ago (2017-07-28 17:25:39 UTC) #54

Maks Orlovich

The CQ bit was checked by morlovich@chromium.org to run a CQ dry run

3 years, 4 months ago (2017-07-28 17:25:54 UTC) #55

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2874833005/360001

3 years, 4 months ago (2017-07-28 17:26:10 UTC) #56

Maks Orlovich

PTAL --- I think I incorporated all feedback that's not the major refactor suggestion I ...

3 years, 4 months ago (2017-07-28 17:27:35 UTC) #57

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 4 months ago (2017-07-28 20:25:32 UTC) #58

commit-bot: I haz the power

Dry run: Try jobs failed on following builders: win_chromium_x64_rel_ng on master.tryserver.chromium.win (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_x64_rel_ng/builds/481190)

3 years, 4 months ago (2017-07-28 20:25:33 UTC) #59

pasko

3 years, 4 months ago (2017-08-04 01:28:35 UTC) #60

Pardon for slow response,
travelling/meetings/funevents/ishouldcomplainsomewhere.

Overall looks correct modulo maybe one histogram is over-recorded. I have not
looked at unittests yet, instead I took the approach of overwhelming you with
naming nits and bikesheds.

WRT refactoring .. I think things got more readable since the early patchsets,
but I still cannot reason about the thing until I (re)draw the whole callgraph,
which makes me want to avoid touching this place ever again. On the other
hand, you are the primary owner of this thing, so it is up to you how to keep up
with it in the long run.

Now to the nits ---\
                   |
                   |
                   V

Please improve the commit description. Here are a few tips on how to aim at good
commit descriptions:

https://sites.google.com/a/chromium.org/dev/developers/contributing-code#writ...

> ... prefetching stream 1 content along with all the metadata + 
> stream 0 content we always read.

nit: s/.../SimpleCache:/
nit: maybe a shorter line overall? like: "SimpleCache: Additionally prefetch
stream1 to memory on open"

> Current revision does it for anything <= 32KiB, though perhaps
> it may be worth doing this as an experiment?

> Prefetching like this seem to help cold read performance significantly on Mac
> (and less so on Android and Linux, haven't seen much effect on Windows).

please also mention adding the finch experiment, and a few words on what
parameter(s) it has for perf tuning.

> As a bonus, it also means that the first data/stream 1 read request from the
> client can be answered immediately for files to which this applies.

One could interpret 'immediately' as returning synchronously (which is possible,
but probably risky - hence not in this change), which is not the case.  Maybe
define 'immediately' at the top of the file as returning without doing disk IO
or use another term?

> A few bits and pieces based on jkarlin's 2872943002

is this part relevant?

> Doc for experiment:
https://docs.google.com/a/chromium.org/document/d/1u4udJ8fWV-GOoyZAh5NeOwHYlT...

nit: short link to avoid overflowing the line?

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/entry_u...
File net/disk_cache/entry_unittest.cc (right):

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/entry_u...
net/disk_cache/entry_unittest.cc:3447: const int kBufferSize = 50000;  // to
avoid quick read
The context is not easy to guess from this comment. Perhaps we could make a
constant like kMaxFilePrefetchSize and add a compile_assert to verify that the
buffer size is bigger? A comment can explain that otherwise the
EXPECT_EQ(net::ERR_IO_PENDING, ...) would not work. Or .. maybe the explanation
is different?
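
A minimal sketch of the constant-plus-compile-time-check idea, assuming the
32KiB default threshold mentioned in the CL description. The name
kMaxFilePrefetchSize is the one suggested above; its value here is an
assumption for illustration and would not track any value set through the
experiment framework.

```cpp
// Assumed default prefetch threshold (32 KiB per the CL description); files at
// or below this size may be served from the prefetch buffer.
constexpr int kMaxFilePrefetchSize = 32 * 1024;

// Must exceed the prefetch threshold; otherwise the first read could be
// answered from memory and the test's EXPECT_EQ(net::ERR_IO_PENDING, ...)
// would no longer hold.
constexpr int kBufferSize = 50000;

static_assert(kBufferSize > kMaxFilePrefetchSize,
              "kBufferSize must be larger than the prefetch threshold");
```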

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_entry_impl.h (right):

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.h:85: // Adds another reader/writer to
this entry, if possible, returning |this| to
nit: perhaps mention somewhere in this file that it may read the file contents
in advance to return synchronously from ReadData?

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.h:89: // Creates this entry, if
possible. Returns |this| to |entry|.
same nit about comment as OpenEntry

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.h:304: // Called after completion of
asynchronous IO and receiving file metadata for
it can also be called after reading the stream from memory

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.h:318: void
ReadInMemoryStreamData(net::GrowableIOBuffer* in_buf,
naming bikeshed: 'Stream' is slightly confusing because we are not providing
stream index to the method. Also seems important to reflect the posting
activity. Suggestion: ReadFromBufferAndPostReply()?

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.h:415: // the first read call on stream
1 synchronous. If a write to the stream
oops, it's not synchronous because we PostTask every time. Apologies if I
suggested this formulation, I forgot that we do not return synchronously.

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_synchronous_entry.cc (right):

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:715: int rv =
GetEOFRecordData(base::StringPiece(), stream_index, entry_stat,
base::StringPiece is one of those rare classes that does not require explicit
constructor, and StringPiece parameters are designed to accept both C++ and C
strings, so a nullptr would work here.

If you agree that nullptr is more readable, then I'd prefer also a comment:
GetEOFRecordData(nullptr /* file_0_prefetch */, ...)

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1321: file_0_prefetch, 0,
extra_stream_0_read, *out_entry_stat, stream_0_eof,
maybe a few explanations how args correspond to params?

like:

s!extra_stream_0_read!extra_stream_0_size /* extra_size */!

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1334: rv =
PreReadStreamPayload(file_0_prefetch, 1, 0, *out_entry_stat,
s!0,!0 /* extra_size */,!

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1407: sizeof(SimpleFileEOF),
generally it is less error-prone to use sizeof(variable) instead of
sizeof(type)

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1422:
SIMPLE_CACHE_UMA(BOOLEAN, "SyncCheckEOFHasCrc", cache_type_,
Seems like both prefetch and last ReadData would record this histogram for
stream1... Am I right?

In this case we would have different counts in this histogram in the control and
experiment groups, which would not be ideal for comparisons.

Did not look at tests yet .. maybe worth some histogram tester work ...

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_synchronous_entry.h (right):

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.h:85: struct
SimpleStreamPrefetchData {
Arguably we should use "a struct only for passive objects that carry data;
everything else is a class", and scoped_refptr is part of it and is _not_ a
passive object. Magical things happen inside the object pointed to by |data|
when we copy the struct.

class?
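
To illustrate the "magical things happen when we copy the struct" point, here
is a small sketch that uses std::shared_ptr as a stand-in for scoped_refptr
(both are reference-counting smart pointers, though their APIs differ). The
type and function names are hypothetical.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for SimpleStreamPrefetchData. std::shared_ptr plays
// the role of scoped_refptr<net::GrowableIOBuffer> here.
struct PrefetchData {
  std::shared_ptr<std::vector<char>> data;
  uint32_t crc32 = 0;
};

// Copies the struct and reports how many owners the buffer has while the copy
// is alive. The copy looks like a plain data copy but bumps the reference
// count on the shared buffer.
long OwnersDuringCopy(const PrefetchData& original) {
  PrefetchData copy = original;
  return copy.data.use_count();
}
```

Whether that side effect is enough to call the type "not passive" is the
judgment call being debated here.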

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.h:93: struct
SimpleEntryCreationResults {
then this should also be a class?

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.h:320: SimpleStreamPrefetchData*
stream_prefetch_data);
We should explain that |stream_prefetch_data| is a two-element array, otherwise
the declaration is confusing. Also, how about the array syntax:

SimpleStreamPrefetchData stream_prefetch_data[2]
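
One caveat worth keeping in mind with the array syntax, sketched below with
hypothetical names: in a function parameter the [2] is documentation only. The
compiler adjusts the parameter to a pointer, so the length is not enforced.

```cpp
#include <type_traits>

// Hypothetical stand-in for SimpleStreamPrefetchData.
struct PrefetchData {
  int stream_index = -1;
};

// The [2] tells readers that two elements are expected, but the parameter
// type is still PrefetchData*.
void FillBoth(PrefetchData stream_prefetch_data[2]) {
  for (int i = 0; i < 2; ++i)
    stream_prefetch_data[i].stream_index = i;
}

static_assert(
    std::is_same<decltype(&FillBoth), void (*)(PrefetchData*)>::value,
    "array parameters are adjusted to pointers");
```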

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.h:353:
scoped_refptr<net::GrowableIOBuffer>* stream_data,
why not produce SimpleStreamPrefetchData instead of the last two output
params?

https://codereview.chromium.org/2874833005/diff/360001/tools/metrics/histogra...
File tools/metrics/histograms/histograms.xml (right):

https://codereview.chromium.org/2874833005/diff/360001/tools/metrics/histogra...
tools/metrics/histograms/histograms.xml:73456: +    on first read op.
s!first read op.!first read op of each entry!

Also: if there are multiple concurrent openers/readers/writers of the entry,
only the result from a single read is recorded. Maybe it'd be nice to mention
that?

Maks Orlovich

Description was changed from ========== ... prefetching stream 1 content along with all the metadata ...

3 years, 4 months ago (2017-08-04 18:28:56 UTC) #61

Maks Orlovich

The CQ bit was checked by morlovich@chromium.org to run a CQ dry run

3 years, 4 months ago (2017-08-04 18:34:26 UTC) #62

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2874833005/380001

3 years, 4 months ago (2017-08-04 18:34:36 UTC) #63

Maks Orlovich

3 years, 4 months ago (2017-08-04 18:35:44 UTC) #64

Flushing most of the comments. Thanks for the feedback, and don't worry about
the timing --- Cache thread migration has been keeping me plenty occupied.

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/entry_u...
File net/disk_cache/entry_unittest.cc (right):

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/entry_u...
net/disk_cache/entry_unittest.cc:3447: const int kBufferSize = 50000;  // to
avoid quick read
On 2017/08/04 01:28:34, pasko wrote:
> The context is not easy to guess from this comment. Perhaps we could make a
> constant like kMaxFilePrefetchSize and add a compile_assert to verify that the
> buffer size is bigger? A comment can explain that otherwise the
> EXPECT_EQ(net::ERR_IO_PENDING, ...) would not work. Or .. maybe the explanation
> is different?

Can't really give an exact number for that (since people could set it as high as
the experiment framework permits...), but I did expand upon the comment.

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_entry_impl.h (right):

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.h:85: // Adds another reader/writer to
this entry, if possible, returning |this| to
On 2017/08/04 01:28:34, pasko wrote:
> nit: perhaps mention somewhere in this file that it may read the file contents
> in advance to return synchronously from ReadData?

Acknowledged.

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.h:89: // Creates this entry, if
possible. Returns |this| to |entry|.
On 2017/08/04 01:28:34, pasko wrote:
> same nit about comment as OpenEntry

Doesn't apply since Create fails if the entry already exists.

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.h:304: // Called after completion of
asynchronous IO and receiving file metadata for
On 2017/08/04 01:28:34, pasko wrote:
> it can also be called after reading the stream from memory

Rephrased.

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.h:318: void
ReadInMemoryStreamData(net::GrowableIOBuffer* in_buf,
On 2017/08/04 01:28:34, pasko wrote:
> naming bikeshed: 'Stream' is slightly confusing because we are not providing
> stream index to the method. Also seems important to reflect the posting
> activity. Suggestion: ReadFromBufferAndPostReply()?

Done.

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.h:415: // the first read call on stream
1 synchronous. If a write to the stream
On 2017/08/04 01:28:34, pasko wrote:
> oops, it's not synchronous because we PostTask every time. Apologies if I
> suggested this formulation, I forgot that we do not return synchronously.

So did I, actually. I think this was the original motivation Josh had, but it
kinda got overtaken by me liking single-read. Dropped this sentence.

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_synchronous_entry.cc (right):

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:715: int rv =
GetEOFRecordData(base::StringPiece(), stream_index, entry_stat,
On 2017/08/04 01:28:34, pasko wrote:
> base::StringPiece is one of those rare classes that does not require explicit
> constructor, and StringPiece parameters are designed to accept both C++ and C
> strings, so a nullptr would work here.

I prefer the explicit construction since it increases typechecking.
(You can't pass a StringPiece to something that takes an Object*, while you can
with nullptr.)

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1321: file_0_prefetch, 0,
extra_stream_0_read, *out_entry_stat, stream_0_eof,
On 2017/08/04 01:28:34, pasko wrote:
> maybe a few explanations how args correspond to params?
> 
> like:
> 
> s!extra_stream_0_read!extra_stream_0_size /* extra_size */!

I went for extra_post_stream_0_read as name instead, to make it clear that it's
not really part of stream data. Commented stream_index, too.

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1334: rv =
PreReadStreamPayload(file_0_prefetch, 1, 0, *out_entry_stat,
On 2017/08/04 01:28:34, pasko wrote:
> s!0,!0 /* extra_size */,!

Did a variant of that.

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1407: sizeof(SimpleFileEOF),
On 2017/08/04 01:28:34, pasko wrote:
> generally it is less error-prone to use sizeof(variable) instead of
> sizeof(type)

It would have to be sizeof(*variable) in this case, though....?
I am more confident in my ability to know what I am reading --- when it's a
struct, rather than a buffer, anyway --- than to not forget a * in a way the
typechecker can't catch.
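
Both sides of this exchange can be seen in a small sketch. The record type and
function below are hypothetical stand-ins, not the real SimpleFileEOF handling:
with a pointer to the record, the variable-based spelling has to be
sizeof(*out), and writing sizeof(out) by mistake still compiles but yields the
pointer size.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a fixed-size on-disk record.
struct EofRecord {
  uint64_t magic;
  uint32_t flags;
  uint32_t crc32;
  uint32_t stream_size;
};

// Copies one record out of |src| if enough bytes are available.
bool CopyRecord(const char* src, std::size_t src_len, EofRecord* out) {
  // The variable-based and type-based spellings agree...
  static_assert(sizeof(*out) == sizeof(EofRecord), "same size, two spellings");
  // ...but dropping the '*' silently measures the pointer instead.
  static_assert(sizeof(out) == sizeof(EofRecord*), "this is the pointer size");
  if (src_len < sizeof(*out))
    return false;
  std::memcpy(out, src, sizeof(*out));
  return true;
}
```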

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1422:
SIMPLE_CACHE_UMA(BOOLEAN, "SyncCheckEOFHasCrc", cache_type_,
On 2017/08/04 01:28:34, pasko wrote:
> Seems like both prefetch and last ReadData would record this histogram for
> stream1... Am I right?

I think it can't actually happen. To hit sync-side ReadData on stream1, it would
need to have the prefetched stream1 data discarded due to a write, and we
don't/can't verify checksums if a stream has been written to.

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_synchronous_entry.h (right):

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.h:320: SimpleStreamPrefetchData*
stream_prefetch_data);
On 2017/08/04 01:28:35, pasko wrote:
> We should explain that |stream_prefetch_data| is a two-element array, otherwise
> the declaration is confusing. Also, how about the array syntax:
> 
> SimpleStreamPrefetchData stream_prefetch_data[2]

Done (and elsewhere)

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.h:353:
scoped_refptr<net::GrowableIOBuffer>* stream_data,
On 2017/08/04 01:28:34, pasko wrote:
> why not produce SimpleStreamPrefetchData instead of the last two output
> params?

Done.

https://codereview.chromium.org/2874833005/diff/360001/tools/metrics/histogra...
File tools/metrics/histograms/histograms.xml (right):

https://codereview.chromium.org/2874833005/diff/360001/tools/metrics/histogra...
tools/metrics/histograms/histograms.xml:73456: +    on first read op.
On 2017/08/04 01:28:35, pasko wrote:
> s!first read op.!first read op of each entry!
> 
> Also: if there are multiple concurrent openers/readers/writers of the entry,
> only the result from a single read is recorded. Maybe it'd be nice to mention
> that?

Yeah, good point. I think they're pretty rare, though, since that sounds kinda
hard to interpret.

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 4 months ago (2017-08-04 20:30:29 UTC) #65

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

3 years, 4 months ago (2017-08-04 20:30:30 UTC) #66

Maks Orlovich

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/simple_synchronous_entry.h File net/disk_cache/simple/simple_synchronous_entry.h (right): https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/simple_synchronous_entry.h#newcode85 net/disk_cache/simple/simple_synchronous_entry.h:85: struct SimpleStreamPrefetchData { On 2017/08/04 01:28:35, pasko wrote: > ...

3 years, 4 months ago (2017-08-07 13:09:02 UTC) #67

pasko

3 years, 4 months ago (2017-08-09 12:28:22 UTC) #68

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/entry_u...
File net/disk_cache/entry_unittest.cc (right):

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/entry_u...
net/disk_cache/entry_unittest.cc:3447: const int kBufferSize = 50000;  // to
avoid quick read
On 2017/08/04 18:35:43, Maks Orlovich wrote:
> On 2017/08/04 01:28:34, pasko wrote:
> > The context is not easy to guess from this comment. Perhaps we could make a
> > constant like kMaxFilePrefetchSize and add a compile_assert to verify that
> > the buffer size is bigger? A comment can explain that otherwise the
> > EXPECT_EQ(net::ERR_IO_PENDING, ...) would not work. Or .. maybe the
> > explanation is different?
> 
> Can't really give an exact number of that (since people could set it as high
> as the experiment framework permits...),

Oh, my comment was about adding a constant and a compile assert only in the
test, and comparing it only against the default. This may still break if we set
an insanely large number in fieldtrial_testing_config.json, but I guess we
won't do that.

> but I did expand upon the comment.

The comment worksforme as well, thanks
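
For reference, the test-only constant and compile-time check suggested above could look roughly like the sketch below. Only `kBufferSize = 50000` comes from the test under review; `kDefaultMaxFilePrefetchSize`, its value, and the `WouldPrefetch` helper are hypothetical names invented for illustration, and the real default lives elsewhere and may well differ.

```cpp
#include <cassert>

// Hypothetical stand-in for the default prefetch threshold. The name and the
// value are assumptions for this sketch, not taken from the patch.
constexpr int kDefaultMaxFilePrefetchSize = 32 * 1024;

// Buffer size used by the test under review.
constexpr int kBufferSize = 50000;

// Breaks the build if the buffer ever stops exceeding the assumed default, in
// which case the read could be served from prefetched data and the
// EXPECT_EQ(net::ERR_IO_PENDING, ...) in the test would no longer hold.
static_assert(kBufferSize > kDefaultMaxFilePrefetchSize,
              "kBufferSize must exceed the default prefetch threshold");

// Hypothetical helper: true when a file of |size| bytes would be read in one
// gulp under the assumed default threshold.
constexpr bool WouldPrefetch(int size) {
  return size <= kDefaultMaxFilePrefetchSize;
}
```

As noted in the thread, a check like this only guards against the default; a larger value configured through the experiment framework would not be caught at compile time.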

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_entry_impl.h (right):

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.h:85: // Adds another reader/writer to
this entry, if possible, returning |this| to
On 2017/08/04 18:35:44, Maks Orlovich wrote:
> On 2017/08/04 01:28:34, pasko wrote:
> > nit: perhaps mention somewhere in this file that it may read the file
> > contents in advance to return synchronously from ReadData?
> 
> Acknowledged.

sry, it was a leftover comment when I thought that we return synchronously

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.h:89: // Creates this entry, if
possible. Returns |this| to |entry|.
On 2017/08/04 18:35:44, Maks Orlovich wrote:
> On 2017/08/04 01:28:34, pasko wrote:
> > same nit about comment as OpenEntry
> 
> Doesn't apply since Create fails if the entry already exists.

Acknowledged.

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_synchronous_entry.cc (right):

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:715: int rv =
GetEOFRecordData(base::StringPiece(), stream_index, entry_stat,
On 2017/08/04 18:35:44, Maks Orlovich wrote:
> On 2017/08/04 01:28:34, pasko wrote:
> > base::StringPiece is one of those rare classes that does not require
> > explicit constructor, and StringPiece parameters are designed to accept
> > both C++ and C strings, so a nullptr would work here.
> 
> I prefer the explicit construction since it increases typechecking.
> (As you can't pass a StringPiece to something that take a Object*, while you
> can with nullptr)

OK, makes sense. I am not sure I fully understand the benefits of StringPiece
here compared to char[]. Not having to pass size as additional parameter?

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1334: rv =
PreReadStreamPayload(file_0_prefetch, 1, 0, *out_entry_stat,
On 2017/08/04 18:35:44, Maks Orlovich wrote:
> On 2017/08/04 01:28:34, pasko wrote:
> > s!0,!0 /* extra_size */,!
> 
> Did a variant of that.

Thanks for that, I forgot the style guide

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1422:
SIMPLE_CACHE_UMA(BOOLEAN, "SyncCheckEOFHasCrc", cache_type_,
On 2017/08/04 18:35:44, Maks Orlovich wrote:
> On 2017/08/04 01:28:34, pasko wrote:
> > Seems like both prefetch and last ReadData would record this histogram for
> > stream1... Am I right?
> 
> I think it can't actually happen. To hit sync-side ReadData on stream1, it
> would need to have the prefetched stream1 data discarded due to a write, and
> we don't/can't verify checksums if a stream has been written to.

Even though this might be currently true, I find it easy to overlook in later
changes. Please add a test to verify that SyncCheckEOFHasCrc is recorded only
once for both prefetched and non-prefetched stream1 cases when we read the entry
from the beginning to the end. Hopefully in two variants: when there is CRC and
when there is not.

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_synchronous_entry.h (right):

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.h:85: struct
SimpleStreamPrefetchData {
On 2017/08/07 13:09:01, Maks Orlovich wrote:
> On 2017/08/04 01:28:35, pasko wrote:
> > Arguably we should use "a struct only for passive objects that carry data;
> > everything else is a class", and scoped_refptr is part of it and is _not_ a
> > passive object. Magical things happen inside the object pointed to by |data|
> > when we copy the struct.
> > 
> > class?
> 
> Is it that different from having an std::string member?

I think std::string and scoped_refptr equally make me want to make it a class.
Also there is "If in doubt, make it a class" in the relevant part of the
styleguide...
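
To make the "magical things happen when we copy the struct" point concrete, here is a minimal sketch. It is not the code under review: `std::shared_ptr` stands in for `scoped_refptr`, `std::vector<char>` for the I/O buffer, and `PrefetchDataSketch`, `stream_crc32`, and `OwnersAfterCopy` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Stand-in for the struct discussed above. std::shared_ptr plays the role of
// scoped_refptr here; all names in this sketch are illustrative assumptions.
struct PrefetchDataSketch {
  std::shared_ptr<std::vector<char>> data;
  unsigned stream_crc32 = 0;
};

// Copies |original| and reports how many owners the shared buffer has while
// the copy is still alive.
long OwnersAfterCopy(const PrefetchDataSketch& original) {
  PrefetchDataSketch copy = original;  // Reads like a plain member-wise copy...
  return copy.data.use_count();        // ...but it bumped a reference count.
}
```

Copying the struct silently changes state in the pointed-to object (the owner count goes from 1 to 2 and back), which is the behaviour the style-guide argument is about. Whether that is enough to call the type "not passive" is the judgment call being debated.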

https://codereview.chromium.org/2874833005/diff/360001/tools/metrics/histogra...
File tools/metrics/histograms/histograms.xml (right):

https://codereview.chromium.org/2874833005/diff/360001/tools/metrics/histogra...
tools/metrics/histograms/histograms.xml:73456: +    on first read op.
On 2017/08/04 18:35:44, Maks Orlovich wrote:
> On 2017/08/04 01:28:35, pasko wrote:
> > s!first read op.!first read op of each entry!
> > 
> > Also: if there are multiple concurrent openers/readers/writers of the entry,
> > only the result from a single read is recorded. Maybe it'd be nice to
> > mention that?
> 
> Yeah, good point. I think they're pretty rare, though, since that sounds kinda
> hard to interpret.

well, if they were too rare, we would not have invested into dealing with the
cache lock. These edge cases are hard to interpret and remember, that's why I
think being more explicit in a description here would help a lot.

https://codereview.chromium.org/2874833005/diff/380001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_entry_impl.h (right):

https://codereview.chromium.org/2874833005/diff/380001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.h:318: // and updating metadata as
appropriate. If |callback| is non-null, it will
nit: s/as appropriate//

https://codereview.chromium.org/2874833005/diff/380001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.h:319: // be posted to with the return
code.
s/posted to/posted to the current task runner/ ? ... since a callback cannot be
posted to (though it may be that my English is lacking here, lemme know)

https://codereview.chromium.org/2874833005/diff/380001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_synchronous_entry.cc (right):

https://codereview.chromium.org/2874833005/diff/380001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1333: /*extra_size = */ 0,
*out_entry_stat,
nit: one extra space before 'extra' for consistency

Maks Orlovich

3 years, 4 months ago (2017-08-09 15:25:03 UTC) #69

Flushing the easy stuff while I look at how to test the CRC thing.

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_synchronous_entry.cc (right):

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:715: int rv =
GetEOFRecordData(base::StringPiece(), stream_index, entry_stat,
On 2017/08/09 12:28:21, pasko wrote:
> On 2017/08/04 18:35:44, Maks Orlovich wrote:
> > On 2017/08/04 01:28:34, pasko wrote:
> > > base::StringPiece is one of those rare classes that does not require
> > > explicit constructor, and StringPiece parameters are designed to accept
> > > both C++ and C strings, so a nullptr would work here.
> > 
> > I prefer the explicit construction since it increases typechecking.
> > (As you can't pass a StringPiece to something that take a Object*, while you
> > can with nullptr)
> 
> OK, makes sense. I am not sure I fully understand the benefits of StringPiece
> here compared to char[]. Not having to pass size as additional parameter?

Yeah, bundles up the pointer and length together, so they are a single parameter
and it's clear they're associated.
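
A small sketch of both points, under stated assumptions: `std::string_view` is used as a stand-in for `base::StringPiece`, and `Object`, `PayloadSize`, `HasObject`, and `EmptyPayloadSize` are hypothetical names for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical type standing in for "something that takes an Object*".
struct Object {};

// Pointer and length travel together as a single parameter.
std::size_t PayloadSize(std::string_view payload) { return payload.size(); }

// A function taking a raw pointer accepts nullptr without complaint.
bool HasObject(const Object* object) { return object != nullptr; }

// An explicitly constructed empty view can only be passed where a view is
// expected; HasObject(std::string_view()) would not compile.
std::size_t EmptyPayloadSize() {
  return PayloadSize(std::string_view());
}
```

One caveat on the stand-in: unlike the `base::StringPiece` behaviour described earlier in the thread, `std::string_view` must not be constructed from a null `const char*`, so the explicit empty construction is the only safe spelling there.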

https://codereview.chromium.org/2874833005/diff/360001/tools/metrics/histogra...
File tools/metrics/histograms/histograms.xml (right):

https://codereview.chromium.org/2874833005/diff/360001/tools/metrics/histogra...
tools/metrics/histograms/histograms.xml:73456: +    on first read op.

> well, if they were too rare, we would not have invested into dealing with the
> cache lock. These edge cases are hard to interpret and remember, that's why I
> think being more explicit in a description here would help a lot.

Oh, wow, I thought I changed this... Huh, when I open the file in my source tree
I get this:

"
    Whether a read from stream 1 was satisfied from prefetch data. Reported only
    on first read op of entry (including if there are multiple readers, or even
    some writers).
"
Not sure what's going on here.

https://codereview.chromium.org/2874833005/diff/380001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_entry_impl.h (right):

https://codereview.chromium.org/2874833005/diff/380001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.h:318: // and updating metadata as
appropriate. If |callback| is non-null, it will
On 2017/08/09 12:28:22, pasko wrote:
> nit: s/as appropriate//

Done.

https://codereview.chromium.org/2874833005/diff/380001/net/disk_cache/simple/...
net/disk_cache/simple/simple_entry_impl.h:319: // be posted to with the return
code.
On 2017/08/09 12:28:22, pasko wrote:
> s/posted to/posted to the current task runner/ ? ... since a callback cannot
> be posted to (though it may be that my English is lacking here, lemme know)

Done.

https://codereview.chromium.org/2874833005/diff/380001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_synchronous_entry.cc (right):

https://codereview.chromium.org/2874833005/diff/380001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:1333: /*extra_size = */ 0,
*out_entry_stat,
On 2017/08/09 12:28:22, pasko wrote:
> nit: one extra space before 'extra' for consistency

Done.

Maks Orlovich

The CQ bit was checked by morlovich@chromium.org to run a CQ dry run

3 years, 4 months ago (2017-08-09 15:25:09 UTC) #70

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2874833005/400001

3 years, 4 months ago (2017-08-09 15:26:04 UTC) #71

pasko

flushing responses https://codereview.chromium.org/2874833005/diff/360001/tools/metrics/histograms/histograms.xml File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/2874833005/diff/360001/tools/metrics/histograms/histograms.xml#newcode73456 tools/metrics/histograms/histograms.xml:73456: + on first read op. On 2017/08/09 ...

3 years, 4 months ago (2017-08-09 15:33:58 UTC) #72

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 4 months ago (2017-08-09 17:49:42 UTC) #73

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

3 years, 4 months ago (2017-08-09 17:49:44 UTC) #74

Maks Orlovich

https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/simple_synchronous_entry.cc File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/360001/net/disk_cache/simple/simple_synchronous_entry.cc#newcode1422 net/disk_cache/simple/simple_synchronous_entry.cc:1422: SIMPLE_CACHE_UMA(BOOLEAN, "SyncCheckEOFHasCrc", cache_type_, On 2017/08/09 12:28:21, pasko wrote: > ...

3 years, 4 months ago (2017-08-09 17:52:29 UTC) #75

gavinp

Going over this, I really like storing the prefetched data explicitly in the element when ...

3 years, 4 months ago (2017-08-10 16:43:26 UTC) #76

Maks Orlovich

On 2017/08/10 16:43:26, gavinp wrote: > Going over this, I really like storing the prefetched ...

3 years, 4 months ago (2017-08-10 16:57:14 UTC) #77

Maks Orlovich

The CQ bit was checked by morlovich@chromium.org to run a CQ dry run

3 years, 4 months ago (2017-08-10 17:27:32 UTC) #78

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2874833005/420001

3 years, 4 months ago (2017-08-10 17:27:41 UTC) #79

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 4 months ago (2017-08-10 18:50:36 UTC) #80

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

3 years, 4 months ago (2017-08-10 18:50:37 UTC) #81

Maks Orlovich

Reworked the size computation to be what I feel is more natural --- Gavin, Egor, ...

3 years, 4 months ago (2017-08-10 18:59:23 UTC) #82

Maks Orlovich

The CQ bit was checked by morlovich@chromium.org to run a CQ dry run

3 years, 4 months ago (2017-08-10 19:01:15 UTC) #83

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2874833005/440001

3 years, 4 months ago (2017-08-10 19:01:25 UTC) #84

Maks Orlovich

The CQ bit was checked by morlovich@chromium.org to run a CQ dry run

3 years, 4 months ago (2017-08-10 19:07:27 UTC) #85

gavinp

On 2017/08/10 16:57:14, Maks Orlovich wrote: > On 2017/08/10 16:43:26, gavinp wrote: > > Going ...

3 years, 4 months ago (2017-08-10 19:07:27 UTC) #86

Maks Orlovich

> Maybe let's have a short 1:1 to go through this. Sure, whenever convenient for ...

3 years, 4 months ago (2017-08-10 19:12:26 UTC) #87

pasko

On 2017/08/10 18:59:23, Maks Orlovich wrote: > Reworked the size computation to be what I ...

3 years, 4 months ago (2017-08-11 14:38:56 UTC) #88

gavinp

Maks, I've gone through it, and I think I'm OK with this, we don't need ...

3 years, 4 months ago (2017-08-15 17:45:41 UTC) #89

Maks Orlovich

On 2017/08/15 17:45:41, gavinp wrote: > Maks, > > I've gone through it, and I ...

3 years, 4 months ago (2017-08-15 17:52:19 UTC) #90

Maks Orlovich

> Thanks Gavin, I'll look forward to the review then. Sorry, but ping on this...

3 years, 4 months ago (2017-08-22 14:01:28 UTC) #91

Maks Orlovich

Would really appreciate if I could proceed with improving this...

3 years, 3 months ago (2017-08-25 16:51:17 UTC) #92

gavinp

this lgtm https://codereview.chromium.org/2874833005/diff/460001/net/disk_cache/simple/simple_synchronous_entry.cc File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/460001/net/disk_cache/simple/simple_synchronous_entry.cc#newcode1362 net/disk_cache/simple/simple_synchronous_entry.cc:1362: if (!ok) { Could we just move ...

3 years, 3 months ago (2017-08-25 17:46:46 UTC) #93

Maks Orlovich

The CQ bit was checked by morlovich@chromium.org to run a CQ dry run

3 years, 3 months ago (2017-08-25 18:18:54 UTC) #94

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2874833005/480001

3 years, 3 months ago (2017-08-25 18:18:58 UTC) #95

Maks Orlovich

morlovich@chromium.org changed reviewers: + mpearson@chromium.org

3 years, 3 months ago (2017-08-25 18:19:53 UTC) #96

Maks Orlovich

+mpearson for the histograms. https://codereview.chromium.org/2874833005/diff/460001/net/disk_cache/simple/simple_synchronous_entry.cc File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/460001/net/disk_cache/simple/simple_synchronous_entry.cc#newcode1362 net/disk_cache/simple/simple_synchronous_entry.cc:1362: if (!ok) { On 2017/08/25 ...

3 years, 3 months ago (2017-08-25 18:19:54 UTC) #97

Maks Orlovich

https://codereview.chromium.org/2874833005/diff/480001/net/disk_cache/simple/simple_synchronous_entry.cc File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/2874833005/diff/480001/net/disk_cache/simple/simple_synchronous_entry.cc#newcode122 net/disk_cache/simple/simple_synchronous_entry.cc:122: "GetSimpleCachePrefetchExperiment", base::FEATURE_DISABLED_BY_DEFAULT}; Err, I should probably drop the "Get" ...

3 years, 3 months ago (2017-08-25 19:43:27 UTC) #98

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 3 months ago (2017-08-25 19:49:54 UTC) #99

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

3 years, 3 months ago (2017-08-25 19:49:56 UTC) #100

Mark P

Sorry this took me a while. Somehow I missed it in my inbox, and because ...

3 years, 3 months ago (2017-08-29 04:52:30 UTC) #101

Maks Orlovich

3 years, 3 months ago (2017-08-29 13:24:13 UTC) #102

https://codereview.chromium.org/2874833005/diff/480001/net/disk_cache/simple/...
File net/disk_cache/simple/simple_synchronous_entry.cc (right):

https://codereview.chromium.org/2874833005/diff/480001/net/disk_cache/simple/...
net/disk_cache/simple/simple_synchronous_entry.cc:122:
"GetSimpleCachePrefetchExperiment", base::FEATURE_DISABLED_BY_DEFAULT};
On 2017/08/29 04:52:29, Mark P wrote:
> On 2017/08/25 19:43:26, Maks Orlovich wrote:
> > Err, I should probably drop the "Get" here, shouldn't I?
> 
> Yes. :-P

Done. 

> 
> You may also want to drop "Experiment" in the string and variable name.
> 
> IsEnabled(kSimpleCachePrefetch) looks pretty readable to me.

Other things in net/disk_cache/simple/ have the suffix so I'll stay consistent
with them.

https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra...
File tools/metrics/histograms/histograms.xml (right):

https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra...
tools/metrics/histograms/histograms.xml:75064: +    Whether a read from stream 1
was satisfied from prefetch data. Reported only
On 2017/08/29 04:52:30, Mark P wrote:
> nit: prefetch -> prefetched

Done.

https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra...
tools/metrics/histograms/histograms.xml:75065: +    on first read op of entry
(including if there are multiple readers, or even
On 2017/08/29 04:52:30, Mark P wrote:
> nit:
> on first read op of entry
> ->
> on the first read of the stream
> (right?)
> 
> (for clarity)

Indeed. Reads from other streams wouldn't affect this.

https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra...
tools/metrics/histograms/histograms.xml:75066: +    some writers).
On 2017/08/29 04:52:30, Mark P wrote:
> nit: Please define / explain "stream 1".  (Maybe as a sentence between those
> two existing sentence.)

(see below)

https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra...
tools/metrics/histograms/histograms.xml:75156: +    entry.
On 2017/08/29 04:52:29, Mark P wrote:
> nit: "entry" is vague.  Please be clearer.  URL?
> Also, please state what causes this to be emitted, i.e., what causes an entry
> to be "opened"?  Is this always user-initiated?

So this is obvious when one knows the disk_cache API: there is a
disk_cache::Entry class, and one gets one via an OpenEntry method (or a
CreateEntry, but that one makes an empty one, so there is nothing to read). How
much context familiarity is expected of documentation strings here?

The "stream" thing is slightly different, in that the API docs only use it in a
couple of spots, and generally use a vaguer "with the given index", which I am
not entirely fond of as it's less specific; it might make sense to just say
|index| == 1 in the above descriptions, though, to match the header signature?

Maks Orlovich

The CQ bit was checked by morlovich@chromium.org to run a CQ dry run

3 years, 3 months ago (2017-08-29 13:29:56 UTC) #103

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2874833005/500001

3 years, 3 months ago (2017-08-29 13:30:06 UTC) #104

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 3 months ago (2017-08-29 14:47:16 UTC) #105

commit-bot: I haz the power

Dry run: Try jobs failed on following builders: linux_chromium_chromeos_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_chromeos_rel_ng/builds/499389)

3 years, 3 months ago (2017-08-29 14:47:17 UTC) #106

Maks Orlovich

The CQ bit was checked by morlovich@chromium.org to run a CQ dry run

3 years, 3 months ago (2017-08-29 16:18:44 UTC) #107

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2874833005/520001

3 years, 3 months ago (2017-08-29 16:18:55 UTC) #108

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 3 months ago (2017-08-29 17:35:51 UTC) #109

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

3 years, 3 months ago (2017-08-29 17:35:53 UTC) #110

Mark P

3 years, 3 months ago (2017-08-29 18:34:58 UTC) #111

https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra...
File tools/metrics/histograms/histograms.xml (right):

https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra...
tools/metrics/histograms/histograms.xml:75156: +    entry.
On 2017/08/29 13:24:12, Maks Orlovich wrote:
> On 2017/08/29 04:52:29, Mark P wrote:
> > nit: "entry" is vague.  Please be clearer.  URL?
> > Also, please state what causes this to be emitted, i.e., what causes an
> > entry to be "opened"?  Is this always user-initiated?
> 
> So this is obvious when one knows the disk_cache API: there is a
> disk_cache::Entry class, and one gets one via an OpenEntry method (or a
> CreateEntry, but that one makes an empty one, so there is nothing to read).
> How much context familiarity is expected of documentation strings here?

Enough so that people who don't understand the feature can roughly understand
what's going on.
https://chromium.googlesource.com/chromium/src.git/+/HEAD/tools/metrics/histo...

For example, rather than saying "when opening the entry", say "when the
disk_cache API OpenEntry function is called."
This makes it clear enough that someone can have some idea where to look for
further documentation.
Perhaps even mention whether this API is called only by web developers or
whether it is called internally by Chrome too, as that can make a difference in
how to interpret changes in its value.

> The "stream" thing is slightly different, in that the API docs only use it in
> a couple spots, and generally use a vaguer "with the given index", which I am
> not entirely fond off as its less specific; it might make sense to just say
> |index| == 1 in the above descriptions, though, to match the header signature?

That doesn't sound any better to me: it's more technical yet no more
enlightening.
Conceptually what does "stream 1" or "stream at a given index (with index == 1)"
represent?  It must be something special, else you'd be logging all streams...
For example, is it simply the first stream that was written to?

https://codereview.chromium.org/2874833005/diff/520001/tools/metrics/histogra...
File tools/metrics/histograms/histograms.xml (right):

https://codereview.chromium.org/2874833005/diff/520001/tools/metrics/histogra...
tools/metrics/histograms/histograms.xml:75205: +    only on the first read op of
the stream (including if there are multiple
op -> operation

No need to abbreviate in writing.

Maks Orlovich

https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histograms/histograms.xml File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histograms/histograms.xml#newcode75156 tools/metrics/histograms/histograms.xml:75156: + entry. > Enough so that people who don't ...

3 years, 3 months ago (2017-08-29 19:04:11 UTC) #112

Mark P

3 years, 3 months ago (2017-08-29 19:13:26 UTC) #113

histograms.xml with one warning below

https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra...
File tools/metrics/histograms/histograms.xml (right):

https://codereview.chromium.org/2874833005/diff/480001/tools/metrics/histogra...
tools/metrics/histograms/histograms.xml:75156: +    entry.
On 2017/08/29 19:04:10, Maks Orlovich wrote:
> 
> > Enough so that people who don't understand the feature can roughly
> > understand what's going on.
> > https://chromium.googlesource.com/chromium/src.git/+/HEAD/tools/metrics/histo...
> > 
> > For example, rather than saying "when opening the entry", say "when the
> > disk_cache API OpenEntry function is called."
> >
> > This makes it clean enough so someone can have some idea where to look for
> > further documentation.
> > Perhaps even mention whether this API is called only by web developers or
> > whether it is called internally by Chrome too, as that can make a difference
> > in how to interpret changes in its value.
> 
> Done, using the full C++ name to make it clear it's not any sort of 
> Web platform API, but rather an internal thing.

Okay, this is good enough.  However, I should mention that we try to avoid
metrics and metrics descriptions that are logged deep inside function
calls.  This can be a problem when people refactor or add new code.
If the function then gets called in additional contexts, the meaning
of the histogram will change, though no one will likely end up realizing
and revising the description.  Or if people refactor the code, the
description no longer makes sense.

A good practice is to prefer histogram and user action descriptions
that are clearly tied to a user-initiated event, or something that happens
under particular conditions when dealing with a user event.

This is probably less of an issue because this function is only
called in one place at the moment and that isn't likely to change.
Nonetheless, I feel I should provide this advice.

> 
> > 
> > > The "stream" thing is slightly different, in that the API docs only use it
> > > in a couple spots, and generally use a vaguer "with the given index", which
> > > I am not entirely fond off as its less specific; it might make sense to
> > > just say |index| == 1 in the above descriptions, though, to match the
> > > header signature?
> > 
> > That doesn't sound any better to me: it's more technical yet no more
> > enlightening.
> > Conceptually what does "stream 1" or "stream at a given index (with index ==
> > 1)" represent?  It must be something special, else you'd be logging all
> > streams...
> > For example, is it simply the first stream that was written to?
> 
> Gave some context. 

That helps, thanks.

Maks Orlovich

On 2017/08/29 19:13:26, Mark P wrote: > histograms.xml with one warning below (If this was ...

3 years, 3 months ago (2017-08-29 19:40:38 UTC) #114

Mark P

lgtm On 2017/08/29 19:40:38, Maks Orlovich wrote: > On 2017/08/29 19:13:26, Mark P wrote: > ...

3 years, 3 months ago (2017-08-29 19:49:05 UTC) #115

Maks Orlovich

The patchset sent to the CQ was uploaded after l-g-t-m from gavinp@chromium.org Link to the ...

3 years, 3 months ago (2017-08-29 21:12:37 UTC) #117

commit-bot: I haz the power

CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2874833005/540001

3 years, 3 months ago (2017-08-29 21:12:53 UTC) #118

commit-bot: I haz the power

CQ is committing da patch. Bot data: {"patchset_id": 540001, "attempt_start_ts": 1504041156853440, "parent_rev": "c5b58a676675c9ffd2f0daad83e5038f3b6cced4", "commit_rev": "c11194298e5fae423ed47cf6acccd0a717f436d1"}

3 years, 3 months ago (2017-08-29 22:24:19 UTC) #119

commit-bot: I haz the power

CQ is committing da patch. Bot data: {"patchset_id": 540001, "attempt_start_ts": 1504041156853440, "parent_rev": "f2046fbe79e4e19c2391506be1303c412d04f5c7", "commit_rev": "2acfa3ba04b5902deefdc2aacdcc1d71e785d9be"}

3 years, 3 months ago (2017-08-29 22:24:47 UTC) #120

commit-bot: I haz the power

Description was changed from ========== For files smaller than a configurable threshold we will only ...

3 years, 3 months ago (2017-08-29 22:25:16 UTC) #121

commit-bot: I haz the power

3 years, 3 months ago (2017-08-29 22:25:18 UTC) #122

Message was sent while issue was closed.

Committed patchset #28 (id:540001) as
https://chromium.googlesource.com/chromium/src/+/2acfa3ba04b5902deefdc2aacdcc...

Issue 2874833005: SimpleCache: read small files all at once. (Closed)

Description

Patch Set 1

Patch Set 2 : Actually pre-read stream1 stuff now, and wire it out of SimpleSynchronousEntry, lifting some bits f…

Patch Set 3 : Get it to actually serve reads, c&p'ing more stuff from jkarlin's CL and some existing bits.

Patch Set 4 : Avoid copies. This seems to work, but needs a bunch of new tests and refinement

Patch Set 5 : Rollback overcomplicate zero copy stuff, doesn't do much. Fix copy-paste error. Rebase

Patch Set 6 : Refactor some to reduce code dupe, though this is the easy part.

Patch Set 7 : Refactor + range check the in-memory reads. Probably needs an another round, but a bit cleaner.

Patch Set 8 : More refactors.

Patch Set 9 : Remove no longer needed function split

Patch Set 10 : Update tests.

Patch Set 11 : Rename + comment a method.

Patch Set 12 : Add some metrics and an experiment knob. Not really happy with coverage, though.

Patch Set 13 : Rebase

Patch Set 14 : Apply some of the (easier) feedback, small refactors.

Patch Set 15 : Rename method, tweak some formatting.

Patch Set 16 : Add tests for prefetch, apply some more feedback.

Patch Set 17 : Remove some now-useless include additions (after moving stuff around)

Patch Set 18 : Make DiskCacheSimplePrefetchTest.YesPrefetchNoRead run standalone. (Also remove some unwanted debug…

Patch Set 19 : Group and array up data + crc to cut down on code dupe and excessive arg counts.

Patch Set 20 : Apply most of review feedback, still need to deal with the struct thing

Patch Set 21 : Apply the easy part of the feedback.

Patch Set 22 : Add tests for CRC and footer histograms, fix bug in footer histogram reporting

Patch Set 23 : More direct computation of file layout. Might revert based on feedback.

Patch Set 24 : One more range check.

Patch Set 25 : Apply review feedback.

Patch Set 26 : Apply some of the feedback, a bit more to discuss still.

Patch Set 27 : Apply some of the feedback, a bit more to discuss still.

Patch Set 28 : Tweak histogram description based on feedback

Messages