Issue 17816008: Disk cache: Introduce BlockBitmaps for V3.

rvargas (doing something else)

7 years, 6 months ago (2013-06-26 21:41:13 UTC) #1

gavinp

The most serious problem is about the magic constants 1 and 4. Otherwise, I think ...

7 years, 6 months ago (2013-06-27 07:06:17 UTC) #2

rvargas (doing something else)

As for testing... I'm having a hard time figuring out how to proceed from this ...

7 years, 5 months ago (2013-06-27 19:57:54 UTC) #3

As for testing... I'm having a hard time figuring out how to proceed from this
point forward, given that I'm almost done with changes that can be isolated,
built and unit tested (BTW, this CL builds and unit tests the code, or at least
the portion of the code not covered by other unit tests after everything lands).

I'm seriously considering sending CLs that don't compile at all while things are
landing because the tests require a somewhat working backend + backend_worker +
work_item + entry and the only way to get everything at once building is by
adding lots of ifdefs to basically ignore all the missing pieces (that we want
to keep because a lot of code is not appearing from scratch)... and that may be
uglier than just accepting that the code doesn't build.

But I can go with #ifdef 0 if you think that's better.

https://codereview.chromium.org/17816008/diff/1/net/disk_cache/v3/block_bitma...
File net/disk_cache/v3/block_bitmaps.cc (right):

https://codereview.chromium.org/17816008/diff/1/net/disk_cache/v3/block_bitma...
net/disk_cache/v3/block_bitmaps.cc:29: if (block_type < BLOCK_256 || block_count
< 1 || block_count > 4)
On 2013/06/27 07:06:18, gavinp wrote:
> I don't think it's good to just put 1 and 4 throughout the code here; I know
> that a file can use between 1 and 4 blocks in a blockfile, but a name would
> really help here.
> 
> Since this affects the disk format, should the name be in disk_format_base.h?

Updated the use of 4. 1 is not really magic.

https://codereview.chromium.org/17816008/diff/1/net/disk_cache/v3/block_bitma...
net/disk_cache/v3/block_bitmaps.cc:62: if (!address.is_initialized() ||
address.is_separate_file())
On 2013/06/27 07:06:18, gavinp wrote:
> Why isn't this an error?

I'm mainly following the current code semantics: This method is forgiving in
that any error condition is "handled" internally without bothering the caller.
The pattern allows the caller to perform cleanup of possibly invalid or corrupt
entries without having to explicitly ignore errors.

gavinp

Thanks Ricardo. I hadn't seen the tests when I made my earlier comment; I'm sorry ...

7 years, 5 months ago (2013-06-28 21:47:45 UTC) #4

Thanks Ricardo. I hadn't seen the tests when I made my earlier comment; I'm
sorry for the silly question.

For future patches, adding code that doesn't compile is fine; but since we have
tests let's avoid removing anything from net.gyp, so I guess we are stuck with
ifdefs if we're adding things in any of the already landed tests going forward?

Right now, my biggest concern in this code is the BlockFile class. I'm uploading
now so we can start talking about it. There's a lot of misc style comments in
here too, but I think we need to figure out the BlockFile class first.

Thanks!

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/block_file...
File net/disk_cache/block_files.cc (right):

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/block_file...
net/disk_cache/block_files.cc:275: DCHECK(block_files_.size() >=
kFirstAdditionalBlockFile);
Nit: Consider using the DCHECK_LE macro here, and many other places through this
CL.

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
File net/disk_cache/v3/block_bitmaps.cc (right):

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
net/disk_cache/v3/block_bitmaps.cc:29: if (block_type < BLOCK_256 || block_count
< 1 || block_count > kMaxNumBlocks)
block_type < BLOCK_256 is hard for me to understand. Could we do that test
instead with equality comparisons, so the list of invalid types is in the test?

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
net/disk_cache/v3/block_bitmaps.cc:66: if (header_num < 0)
Why isn't this a DCHECK?

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
net/disk_cache/v3/block_bitmaps.cc:130: COMPILE_ASSERT(RANKINGS == 1,
invalid_file_type);
What does this do?

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
net/disk_cache/v3/block_bitmaps.cc:152: if (!found) {
It took me a few passes to understand the logic of the nested loop here; the use
of "found" in the test inside of the test, and the assertion/test confused me.

Is this code equivalent:

do {
  if (bitmaps_[header_num].CanAllocate(block_count))
    break;
  header_num = bitmaps_[header_num]->next_file;
} while (header_num);
if (!header_num)
  header_num = -1;

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
net/disk_cache/v3/block_bitmaps.cc:153: NOTREACHED();
This scares me a bit. Is this an assertion, or an error case? It is coded as
both.

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
net/disk_cache/v3/block_bitmaps.cc:158:
HISTOGRAM_TIMES("DiskCache.GetFileForNewBlock", TimeTicks::Now() - start);
Should we be separating out v3 histograms by name? This histogram is strongly
parallel to an existing one, but I'm not seeing the benefit of mixing them. What
do you think?

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
File net/disk_cache/v3/block_bitmaps_unittest.cc (right):

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
net/disk_cache/v3/block_bitmaps_unittest.cc:10: 
I think this whole file can be put in an anonymous namespace.

https://codereview.chromium.org/17816008/diff/51001/net/disk_cache/block_file...
File net/disk_cache/block_files.cc (right):

https://codereview.chromium.org/17816008/diff/51001/net/disk_cache/block_file...
net/disk_cache/block_files.cc:200: int empty_blocks = 0;
What's empty_blocks for?

https://codereview.chromium.org/17816008/diff/51001/net/disk_cache/block_file...
net/disk_cache/block_files.cc:204: have_space = true;
Can we just return true here?

https://codereview.chromium.org/17816008/diff/51001/net/disk_cache/block_file...
net/disk_cache/block_files.cc:207: return have_space;
And return false here?

https://codereview.chromium.org/17816008/diff/51001/net/disk_cache/block_file...
net/disk_cache/block_files.cc:292: bool BlockFiles::CreateBlock(FileType
block_type, int block_count,
When is CreateBlock called for EXTERNAL, FILES, ENTRIES or EVICTED? Should that
be a DCHECK?

https://codereview.chromium.org/17816008/diff/51001/net/disk_cache/block_file...
net/disk_cache/block_files.cc:295: if (block_type < RANKINGS || block_type >
BLOCK_4K ||
These inequalities are hard for me to read, the ordering of the enum isn't
really easy to keep paged in for someone studying this code.

Could we consider cleaning this up a bit while we're here? Perhaps:

if ((block_type != RANKINGS && block_type != BLOCK_256 && block_type != BLOCK_1K
&& block_type != BLOCK_4K) || ...

Which has the virtue of listing precisely the list of blocks that can do this.

https://codereview.chromium.org/17816008/diff/51001/net/disk_cache/block_file...
net/disk_cache/block_files.cc:299: if (!init_)
Nit: could just add this to the earlier test, and it would be a lot more
readable if it was the first condition tested.

https://codereview.chromium.org/17816008/diff/51001/net/disk_cache/block_files.h
File net/disk_cache/block_files.h (right):

https://codereview.chromium.org/17816008/diff/51001/net/disk_cache/block_file...
net/disk_cache/block_files.h:28: class NET_EXPORT_PRIVATE BlockHeader {
This class continues to look like a lot of mechanism for a BlockFileHeader*; and
it introduces the potential for confusing double indirection if it's not used
carefully. I can understand why we don't want these semantic methods to be in
the file format definition, but I don't think this class is the right way to do
it.

I also understand that this is intended to be a kind of abstraction boundary,
between things on memory and things that might be on disk. That's why it
introduces these semantic methods. But I can't get my head around this class as
the right way to introduce this abstraction boundary. If you have a
BlockFIleHeader object, you can do all of these semantic operations.

I have two suggestions:

1. Have this class inherit from BlockFileHeader, and do the (sketchy)
static_cast from BlockFileHeader* to BlockHeader* when you need one of these
objects. We'd need a compile assert to make sure BlockHeader truly is POD from
the struct; I think sizeof() being equal should do it, since BlockFileHeader is
POD adding a virtual method would add a vptr.

2. Write this interface in the C-style for structure methods; i.e.

namespace disk_cache {

namespace block_header {

  bool CreateMapBlock(BlockFileHeader*, int target, int size, int* index);
  bool DeleteMapBlock(BlockFileHeader*, int index, int block_size);
...

}

}

Of these, I like (2) best, after all, the data is exactly a BlockFIleHeader;
callers can know that. But if they've brought in this header, they get the
semantic operations on that disk structure.

But, (1) is most like the current idea here. There's a data type that captures
the distinction. There will be a lot of static casts.

What do you think of these suggestions? Do you see any advantages or
disadvantages to them?

https://codereview.chromium.org/17816008/diff/51001/net/disk_cache/block_file...
net/disk_cache/block_files.h:67: BlockFileHeader* operator->() { return header_;
}
The Google C++ style guide has a strong presumption against operator
overloading:
http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Operator_Overl...

Is this a "rare, special circumstance"? How?

rvargas (doing something else)

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/block_files.cc File net/disk_cache/block_files.cc (right): https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/block_files.cc#newcode275 net/disk_cache/block_files.cc:275: DCHECK(block_files_.size() >= kFirstAdditionalBlockFile); On 2013/06/28 21:47:45, gavinp wrote: > ...

7 years, 5 months ago (2013-07-11 19:54:55 UTC) #5

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/block_file...
File net/disk_cache/block_files.cc (right):

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/block_file...
net/disk_cache/block_files.cc:275: DCHECK(block_files_.size() >=
kFirstAdditionalBlockFile);
On 2013/06/28 21:47:45, gavinp wrote:
> Nit: Consider using the DCHECK_LE macro here, and many other places through
this
> CL.

Already changed

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
File net/disk_cache/v3/block_bitmaps.cc (right):

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
net/disk_cache/v3/block_bitmaps.cc:66: if (header_num < 0)
On 2013/06/28 21:47:45, gavinp wrote:
> Why isn't this a DCHECK?

Same reason I mentioned before: (this way) DeleteBlock can be called with an
address that doesn't contain data (or is not even the address of a block) and if
there's nothing to do this function doesn't do anything.

That behavior is valuable when cleaning up a mess.

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
net/disk_cache/v3/block_bitmaps.cc:130: COMPILE_ASSERT(RANKINGS == 1,
invalid_file_type);
On 2013/06/28 21:47:45, gavinp wrote:
> What does this do?

Makes sure that the next line doesn't result in a negative number. To be honest,
this was mostly for a reader thinking that we may end up using a negative index.

I'm sad Rietveld is not doing a better job showing actual changes, but this
function should have been aligned with line 236 of the old file.

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
net/disk_cache/v3/block_bitmaps.cc:152: if (!found) {
On 2013/06/28 21:47:45, gavinp wrote:
> It took me a few passes to understand the logic of the nested loop here; the
use
> of "found" in the test inside of the test, and the assertion/test confused me.
> 
> Is this code equivalent:
> 
> do {
>   if (bitmaps_[header_num].CanAllocate(block_count))
>     break;
>   header_num = bitmaps_[header_num]->next_file;
> } while (header_num);
> if (!header_num)
>   header_num = -1;

Yes and no. In general it is not the same because we may start looking at the
first file (header_num = 0), and find space... which would make us think that we
failed to find space.

It can be the same if we ignore the general case and assume that this code will
never be called with block_type = RANKINGS... that condition is true today, but
it may become false in the future after we delete the code for V2 and may decide
to reuse that file type for something else. I didn't want to restrict this code
that way.

I can add a comment to point out why it looks this way.

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
net/disk_cache/v3/block_bitmaps.cc:153: NOTREACHED();
On 2013/06/28 21:47:45, gavinp wrote:
> This scares me a bit. Is this an assertion, or an error case? It is coded as
> both.

Not that uncommon pattern. It is an assertion in that tests or developer
versions should break into a debugger if this happens. On the other hand, I
expect a reasonable action if this happens on the field... as in creation of the
current entry to fail.

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
net/disk_cache/v3/block_bitmaps.cc:158:
HISTOGRAM_TIMES("DiskCache.GetFileForNewBlock", TimeTicks::Now() - start);
On 2013/06/28 21:47:45, gavinp wrote:
> Should we be separating out v3 histograms by name? This histogram is strongly
> parallel to an existing one, but I'm not seeing the benefit of mixing them.
What
> do you think?

This histogram is exactly the same one as before and I see no reason to rename
it. In the short term we can tell them apart by version and or experiment.

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
File net/disk_cache/v3/block_bitmaps_unittest.cc (right):

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
net/disk_cache/v3/block_bitmaps_unittest.cc:10: 
On 2013/06/28 21:47:45, gavinp wrote:
> I think this whole file can be put in an anonymous namespace.

Tests usually don't go into namespaces.

https://codereview.chromium.org/17816008/diff/51001/net/disk_cache/block_file...
File net/disk_cache/block_files.cc (right):

https://codereview.chromium.org/17816008/diff/51001/net/disk_cache/block_file...
net/disk_cache/block_files.cc:200: int empty_blocks = 0;
On 2013/06/28 21:47:45, gavinp wrote:
> What's empty_blocks for?

left overs... sorry about that.

https://codereview.chromium.org/17816008/diff/51001/net/disk_cache/block_file...
net/disk_cache/block_files.cc:292: bool BlockFiles::CreateBlock(FileType
block_type, int block_count,
On 2013/06/28 21:47:45, gavinp wrote:
> When is CreateBlock called for EXTERNAL, FILES, ENTRIES or EVICTED? Should
that
> be a DCHECK?

Done.

https://codereview.chromium.org/17816008/diff/51001/net/disk_cache/block_file...
net/disk_cache/block_files.cc:295: if (block_type < RANKINGS || block_type >
BLOCK_4K ||
On 2013/06/28 21:47:45, gavinp wrote:
> These inequalities are hard for me to read, the ordering of the enum isn't
> really easy to keep paged in for someone studying this code.
> 
> Could we consider cleaning this up a bit while we're here? Perhaps:
> 
> if ((block_type != RANKINGS && block_type != BLOCK_256 && block_type !=
BLOCK_1K
> && block_type != BLOCK_4K) || ...
> 
> Which has the virtue of listing precisely the list of blocks that can do this.

Done.

https://codereview.chromium.org/17816008/diff/51001/net/disk_cache/block_file...
net/disk_cache/block_files.cc:299: if (!init_)
On 2013/06/28 21:47:45, gavinp wrote:
> Nit: could just add this to the earlier test, and it would be a lot more
> readable if it was the first condition tested.

In my mind the two checks do two different things: the first one is for being
asked to do something that doesn't make much sense, the second one is for being
accessed at an improper time. The typical answer to the former is in the order
of "invalid arguments", which tends to be more serious than using the object
correctly but at a bad time (and that's why that test goes first... not that it
really matters)

I find multiple conditions on an if harder to follow that multiple individual
statements.

https://codereview.chromium.org/17816008/diff/51001/net/disk_cache/block_files.h
File net/disk_cache/block_files.h (right):

https://codereview.chromium.org/17816008/diff/51001/net/disk_cache/block_file...
net/disk_cache/block_files.h:28: class NET_EXPORT_PRIVATE BlockHeader {
On 2013/06/28 21:47:45, gavinp wrote:
> This class continues to look like a lot of mechanism for a BlockFileHeader*;
and
> it introduces the potential for confusing double indirection if it's not used
> carefully. I can understand why we don't want these semantic methods to be in
> the file format definition, but I don't think this class is the right way to
do
> it.
> 
> I also understand that this is intended to be a kind of abstraction boundary,
> between things on memory and things that might be on disk. That's why it
> introduces these semantic methods. But I can't get my head around this class
as
> the right way to introduce this abstraction boundary. If you have a
> BlockFIleHeader object, you can do all of these semantic operations.
> 
> I have two suggestions:
> 
> 1. Have this class inherit from BlockFileHeader, and do the (sketchy)
> static_cast from BlockFileHeader* to BlockHeader* when you need one of these
> objects. We'd need a compile assert to make sure BlockHeader truly is POD from
> the struct; I think sizeof() being equal should do it, since BlockFileHeader
is
> POD adding a virtual method would add a vptr.
> 
> 2. Write this interface in the C-style for structure methods; i.e.
> 
> namespace disk_cache {
> 
> namespace block_header {
> 
>   bool CreateMapBlock(BlockFileHeader*, int target, int size, int* index);
>   bool DeleteMapBlock(BlockFileHeader*, int index, int block_size);
> ...
> 
> }
> 
> }
> 
> Of these, I like (2) best, after all, the data is exactly a BlockFIleHeader;
> callers can know that. But if they've brought in this header, they get the
> semantic operations on that disk structure.
> 
> But, (1) is most like the current idea here. There's a data type that captures
> the distinction. There will be a lot of static casts.
> 
> What do you think of these suggestions? Do you see any advantages or
> disadvantages to them?

I'm not sure if it's a matter of naming what is bothering you.

What I want is an abstraction of the operations that I can do with the metadata
of a single logical block file (every word is important here because another CL
will be adding support for a block file that is implemented with more than one
physical file on disk).

That basically means Create and delete a block etc up to Size (as I said before,
Size is not needed on the final version). All those methods operate on the
concept of the metadata of a block file (whatever name that has). So I don't
want a namespace, I want a class because I have an entity that behaves like an
object. And I want a user of that class to know that if it needs to look for an
empty space to store a block on a given file, it invokes a method on the object
that represents that file.

So we have 8 methods that operate on that object and one method that retrieves a
pointer to the struct (the actual header) if the caller needs to access one of
the 11 fields of that struct.

I cannot simply inherit from the struct because this concept is not a struct
with methods... the underlying (in principle private) member is a pointer to a
struct: a pointer that comes from a memory mapped section, from a file (handle)
owned by another entity.

I _could_ remove the accessor to the struct but that in practice means that
users of this class will most likely need to keep two objects to interact with
two views of an entity: an instance of this class to FixAllocationCounters, and
a separate pointer to the header to get the number of the next file of the same
type. Not exactly a gain in overall clarity, simplicity or encapsulation.

My last point is that I don't want to over engineer things to accommodate code
that should go away soon (when we drop support for v2), but this particular
class moves in the direction that I want the code to go: more encapsulation and
reliance on higher level interfaces.

https://codereview.chromium.org/17816008/diff/51001/net/disk_cache/block_file...
net/disk_cache/block_files.h:67: BlockFileHeader* operator->() { return header_;
}
On 2013/06/28 21:47:45, gavinp wrote:
> The Google C++ style guide has a strong presumption against operator
> overloading:
>
http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Operator_Overl...
> 
> Is this a "rare, special circumstance"? How?

Happy to remove it.

gavinp

+rdsmith Randy, I think me and Ricardo are talking past each other a bit or ...

7 years, 5 months ago (2013-07-15 18:23:35 UTC) #6

+rdsmith

Randy,

I think me and Ricardo are talking past each other a bit or confusing each other
with our shared architectural concerns about the new BlockHeader class
introduced in this CL. Since that's my top level architectural concern here, I
want to invite you in to take a look over this. Please look over anything else
you like, but pay particular attention to the comments in block_files.h/cc.

Thanks!

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
File net/disk_cache/v3/block_bitmaps.cc (right):

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
net/disk_cache/v3/block_bitmaps.cc:130: COMPILE_ASSERT(RANKINGS == 1,
invalid_file_type);
On 2013/07/11 19:54:55, rvargas wrote:
> On 2013/06/28 21:47:45, gavinp wrote:
> > What does this do?
> 
> Makes sure that the next line doesn't result in a negative number. To be
honest,
> this was mostly for a reader thinking that we may end up using a negative
index.

Why not:

DCHECK_GT(0, block_type);

I'm at least one reader who would get that more easily than a discussion of the
ordering of the FileType enum, which is what the COMPILE_ASSERT ends up being.

> 
> I'm sad Rietveld is not doing a better job showing actual changes, but this
> function should have been aligned with line 236 of the old file.

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
net/disk_cache/v3/block_bitmaps.cc:152: if (!found) {
On 2013/07/11 19:54:55, rvargas wrote:
> On 2013/06/28 21:47:45, gavinp wrote:
> > It took me a few passes to understand the logic of the nested loop here; the
> use
> > of "found" in the test inside of the test, and the assertion/test confused
me.
> > 
> > Is this code equivalent:
> > 
> > do {
> >   if (bitmaps_[header_num].CanAllocate(block_count))
> >     break;
> >   header_num = bitmaps_[header_num]->next_file;
> > } while (header_num);
> > if (!header_num)
> >   header_num = -1;
> 
> Yes and no. In general it is not the same because we may start looking at the
> first file (header_num = 0), and find space... which would make us think that
we
> failed to find space.
> 
> It can be the same if we ignore the general case and assume that this code
will
> never be called with block_type = RANKINGS... that condition is true today,
but
> it may become false in the future after we delete the code for V2 and may
decide
> to reuse that file type for something else. I didn't want to restrict this
code
> that way.
> 
> I can add a comment to point out why it looks this way.

The two loops could be made a lot more clear if we renamed
"NeedToGrowBlockFile". Maybe "NeedToGrowBlockFileToStoreThisManyMoreBlocks ?
Once I went and reread the definition things made a bit more sense to me. Or
perhaps expand the comment right in "if (!found)", what weren't we searching
before?

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
net/disk_cache/v3/block_bitmaps.cc:153: NOTREACHED();
On 2013/07/11 19:54:55, rvargas wrote:
> On 2013/06/28 21:47:45, gavinp wrote:
> > This scares me a bit. Is this an assertion, or an error case? It is coded as
> > both.
> 
> Not that uncommon pattern. It is an assertion in that tests or developer
> versions should break into a debugger if this happens. On the other hand, I
> expect a reasonable action if this happens on the field... as in creation of
the
> current entry to fail.

Why not crash in the field? If this is impossible (a DCHECK), then the crash
will be really diagnostic, right?

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
File net/disk_cache/v3/block_bitmaps_unittest.cc (right):

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
net/disk_cache/v3/block_bitmaps_unittest.cc:10: 
On 2013/07/11 19:54:55, rvargas wrote:
> On 2013/06/28 21:47:45, gavinp wrote:
> > I think this whole file can be put in an anonymous namespace.
> 
> Tests usually don't go into namespaces.

Yes, they do. For instance, take a peek at tcp_client_socket_unittest.cc,
https://code.google.com/p/chromium/codesearch#chromium/src/net/socket/tcp_cli...


On a per file basis for all chrome:

    ~/chrome/src$ git grep -n -e '^[^ ]*TEST' -l -- '*test.cc' | wc
       3334    3334  177565
    ~/chrome/src$ TESTS=`git grep -n -e '^[^ ]*TEST' -l -- '*test.cc'`
    ~/chrome/src$ git grep -n -e '^\([^ ]*TEST\|}  // namespace$\)' -- $TESTS |
tac | awk -F :  -- '{if (last == $1) next; last = $1; print}' | grep namespace |
wc
        577    1731   40215
  
So it looks like about 577/3334 = 17% of tests are in the anonymous namespace.

    ~/chrome/src$ git grep -n -e '^\([^ ]*TEST\|}  // namespace \)' -- $TESTS |
tac | awk -F :  -- '{if (last == $1) next; last = $1; print}' | grep namespace |
wc
        2094    8376  166222

Another 2094/3334 = 63% of tests are in named namespaces.

Only 20% of tests are not in a namespace. That's the unusual case.

But even if a namespace wasn't usual, it'd be the right thing to do. Putting it
in an anonymous namespace isolates it. And you also learn that none of the tests
require friendship. If we're not doing it elsewhere, that's a mistake. Let's get
it right here.

https://codereview.chromium.org/17816008/diff/51001/net/disk_cache/block_files.h
File net/disk_cache/block_files.h (right):

https://codereview.chromium.org/17816008/diff/51001/net/disk_cache/block_file...
net/disk_cache/block_files.h:28: class NET_EXPORT_PRIVATE BlockHeader {
On 2013/07/11 19:54:55, rvargas wrote:
> On 2013/06/28 21:47:45, gavinp wrote:
> > This class continues to look like a lot of mechanism for a BlockFileHeader*;
> and
> > it introduces the potential for confusing double indirection if it's not
used
> > carefully. I can understand why we don't want these semantic methods to be
in
> > the file format definition, but I don't think this class is the right way to
> do
> > it.
> > 
> > I also understand that this is intended to be a kind of abstraction
boundary,
> > between things on memory and things that might be on disk. That's why it
> > introduces these semantic methods. But I can't get my head around this class
> as
> > the right way to introduce this abstraction boundary. If you have a
> > BlockFIleHeader object, you can do all of these semantic operations.
> > 
> > I have two suggestions:
> > 
> > 1. Have this class inherit from BlockFileHeader, and do the (sketchy)
> > static_cast from BlockFileHeader* to BlockHeader* when you need one of these
> > objects. We'd need a compile assert to make sure BlockHeader truly is POD
from
> > the struct; I think sizeof() being equal should do it, since BlockFileHeader
> is
> > POD adding a virtual method would add a vptr.
> > 
> > 2. Write this interface in the C-style for structure methods; i.e.
> > 
> > namespace disk_cache {
> > 
> > namespace block_header {
> > 
> >   bool CreateMapBlock(BlockFileHeader*, int target, int size, int* index);
> >   bool DeleteMapBlock(BlockFileHeader*, int index, int block_size);
> > ...
> > 
> > }
> > 
> > }
> > 
> > Of these, I like (2) best, after all, the data is exactly a BlockFIleHeader;
> > callers can know that. But if they've brought in this header, they get the
> > semantic operations on that disk structure.
> > 
> > But, (1) is most like the current idea here. There's a data type that
captures
> > the distinction. There will be a lot of static casts.
> > 
> > What do you think of these suggestions? Do you see any advantages or
> > disadvantages to them?
> 
> I'm not sure if it's a matter of naming what is bothering you.

No, it's the structure. This abstraction does only two things; it hides a cast,
and it adds data methods here rather than with the disk format; and using this
class obscures that none of these methods are not actually just operating on a
BlockHeader. I tried to make two suggestions that unobscure that, and make the
casts more clear.

I can understand the desire to keep the methods that operate on the disk format
separate from the disk format. But I'm troubled by a class that is just a
pointer and a bunch of methods that really just operate on the disk format type.

> What I want is an abstraction of the operations that I can do with the
metadata
> of a single logical block file (every word is important here because another
CL
> will be adding support for a block file that is implemented with more than one
> physical file on disk).
> 
> That basically means Create and delete a block etc up to Size (as I said
before,
> Size is not needed on the final version). All those methods operate on the
> concept of the metadata of a block file (whatever name that has). So I don't
> want a namespace, I want a class because I have an entity that behaves like an
> object. And I want a user of that class to know that if it needs to look for
an
> empty space to store a block on a given file, it invokes a method on the
object
> that represents that file.
> 
> So we have 8 methods that operate on that object and one method that retrieves
a
> pointer to the struct (the actual header) if the caller needs to access one of
> the 11 fields of that struct.
> 
> I cannot simply inherit from the struct because this concept is not a struct
> with methods... the underlying (in principle private) member is a pointer to a
> struct: a pointer that comes from a memory mapped section, from a file
(handle)
> owned by another entity.
> I _could_ remove the accessor to the struct but that in practice means that
> users of this class will most likely need to keep two objects to interact with
> two views of an entity: an instance of this class to FixAllocationCounters,
and
> a separate pointer to the header to get the number of the next file of the
same
> type. Not exactly a gain in overall clarity, simplicity or encapsulation.
> 
> My last point is that I don't want to over engineer things to accommodate code
> that should go away soon (when we drop support for v2), but this particular
> class moves in the direction that I want the code to go: more encapsulation
and
> reliance on higher level interfaces.

Is there a final version of this class that provides something more than just a
BlockFileHeader*? Will every method on it require that data? How can we separate
methods that just operate on BlockFileHeader* from the methods that use the new
private data in a BlockHeader*?

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
File net/disk_cache/block_files.cc (right):

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/block_files.cc:276: DCHECK_GE(block_files_.size(),
Convention is to put the constant first, and the thing being tested second, so
this needs to be reversed into a DCHECK_LE.

Randy Smith (Not in Mondays)

On 2013/07/15 18:23:35, gavinp wrote: > +rdsmith > > Randy, > > I think me ...

7 years, 5 months ago (2013-07-15 18:52:14 UTC) #7

Randy Smith (Not in Mondays)

If I'm understanding your separate points correctly, this sounds like a future / present conflict: ...

7 years, 5 months ago (2013-07-16 18:01:41 UTC) #8

If I'm understanding your separate points correctly, this sounds like a future /
present conflict:

+ Future: Ricardo has a plan for where he wants the code to go that involves
abstracting out the allocation of blocks from the physical file that they go
into (Ricardo: I'm intentionally trying to simplify from "metadata of a single
logical block file" in hopes that if I've gotten the key point wrong you'll
object.  To me, based on what you're saying so far, it sounds like what you're
looking for is an allocation abstraction, not something that has any other
particular attributes of a file).  I don't think this is about memory/disk
separation so much as about logical/physical separation.

+ Present: Gavin is concerned that in the code as it currently stands, the
BlockHeader class is a pretty pure wrapper around BlockFileHeader; effectively
that it "is a" BlockFileHeader, and creating the new class will both confuse
people reading the code about the relationship between BlockHeader and
BlockFileHeader, and make interacting with BlockFileHeaders itself confusing
because the user will have to indirect through BlockHeader to get to them. 

(Obviously, if my summary of either of your positions is incorrect or
incomplete, please call that out, otherwise I'm sure I'll be heading in a
completely wrong direction.)

That conflict makes me want to have a more full understanding of where this code
is headed.  Ricardo, if (if) it's easy to put up a CL of what you expect the
code to look like when you do the logical/physical separation (it doesn't have
to be completely or compile), that would be very useful, at least for me, in
evaluating the abstraction.  If not, could you give a sketch of what you expect
the different entities in your final state to look like, and what their
relationships are going to be?  I'm specifically curious about what would happen
to the Header() method, since I wouldn't think that that would make any sense if
the abstraction you seem to be aiming for were fully realized.  But I may be
misunderstanding the abstraction.  I think if we more fully understand where the
code is headed, we can do a better job of figuring out how it should get there.

One other top level point: As I read this CL, BlockHeader wasn't introduced in
this CL (at least, it's in my LKGR checkout, and the diff I read in this CL
doesn't have a lot of changes to BlockHeader). That makes me think that we
shouldn't block this CL, which seems primarily about block bitmaps, because of
issues with BlockHeader, and instead should carry on this discussion in some
other format (maybe the sketch CL mentioned above, if Ricardo can throw together
such a thing).  Obviously there's a danger in letting the BlockHeader issue
slide if we do that, but it still strikes me as the right thing to do--if we can
deal with issues orthogonally, we should, and BlockHeader is already in the
backing tree.

gavinp

This effectively is the first review of BlockHeader, and should be treated as such. BlockHeader ...

7 years, 5 months ago (2013-07-16 18:18:33 UTC) #9

This effectively is the first review of BlockHeader, and should be treated as
such. BlockHeader was introduced in https://codereview.chromium.org/16843007/ .
At that time, I made the same objections, and Ricardo asked me to LGTM it until
implementation and users landed. I did, and this is the first CL giving such
users. We can proceed either treating this review as a new landing of
BlockHeader, or we can revert r207665 and proceed from there.

Ricardo hopefully will provide more context on where I'm going, but I have been
presuming that the CL showing the full v3 cache implementation remains accurate.
It is at
https://codereview.chromium.org/15203004/diff/32001/net/disk_cache/block_files.h
, and has block_files essentially as shown.

On 2013/07/16 18:01:41, rdsmith wrote:
> If I'm understanding your separate points correctly, this sounds like a future
/
> present conflict:
> 
> + Future: Ricardo has a plan for where he wants the code to go that involves
> abstracting out the allocation of blocks from the physical file that they go
> into (Ricardo: I'm intentionally trying to simplify from "metadata of a single
> logical block file" in hopes that if I've gotten the key point wrong you'll
> object.  To me, based on what you're saying so far, it sounds like what you're
> looking for is an allocation abstraction, not something that has any other
> particular attributes of a file).  I don't think this is about memory/disk
> separation so much as about logical/physical separation.
> 
> + Present: Gavin is concerned that in the code as it currently stands, the
> BlockHeader class is a pretty pure wrapper around BlockFileHeader; effectively
> that it "is a" BlockFileHeader, and creating the new class will both confuse
> people reading the code about the relationship between BlockHeader and
> BlockFileHeader, and make interacting with BlockFileHeaders itself confusing
> because the user will have to indirect through BlockHeader to get to them. 
> 
> (Obviously, if my summary of either of your positions is incorrect or
> incomplete, please call that out, otherwise I'm sure I'll be heading in a
> completely wrong direction.)

Your discussion of my concerns seems accurate.

> 
> That conflict makes me want to have a more full understanding of where this
code
> is headed.  Ricardo, if (if) it's easy to put up a CL of what you expect the
> code to look like when you do the logical/physical separation (it doesn't have
> to be completely or compile), that would be very useful, at least for me, in
> evaluating the abstraction.  If not, could you give a sketch of what you
expect
> the different entities in your final state to look like, and what their
> relationships are going to be?  I'm specifically curious about what would
happen
> to the Header() method, since I wouldn't think that that would make any sense
if
> the abstraction you seem to be aiming for were fully realized.  But I may be
> misunderstanding the abstraction.  I think if we more fully understand where
the
> code is headed, we can do a better job of figuring out how it should get
there.
> 
> One other top level point: As I read this CL, BlockHeader wasn't introduced in
> this CL (at least, it's in my LKGR checkout, and the diff I read in this CL
> doesn't have a lot of changes to BlockHeader). That makes me think that we
> shouldn't block this CL, which seems primarily about block bitmaps, because of
> issues with BlockHeader, and instead should carry on this discussion in some
> other format (maybe the sketch CL mentioned above, if Ricardo can throw
together
> such a thing).  Obviously there's a danger in letting the BlockHeader issue
> slide if we do that, but it still strikes me as the right thing to do--if we
can
> deal with issues orthogonally, we should, and BlockHeader is already in the
> backing tree.

gavinp

Whoops: I'm writing not lgtm here just to undue my earlier accidental Ell Gee Tee ...

7 years, 5 months ago (2013-07-16 18:18:56 UTC) #10

rvargas (doing something else)

On 2013/07/16 18:01:41, rdsmith wrote: > If I'm understanding your separate points correctly, this sounds ...

7 years, 5 months ago (2013-07-17 03:04:20 UTC) #11

On 2013/07/16 18:01:41, rdsmith wrote:
> If I'm understanding your separate points correctly, this sounds like a future
/
> present conflict:
> 
> + Future: Ricardo has a plan for where he wants the code to go that involves
> abstracting out the allocation of blocks from the physical file that they go
> into (Ricardo: I'm intentionally trying to simplify from "metadata of a single
> logical block file" in hopes that if I've gotten the key point wrong you'll
> object.  To me, based on what you're saying so far, it sounds like what you're
> looking for is an allocation abstraction, not something that has any other
> particular attributes of a file).  I don't think this is about memory/disk
> separation so much as about logical/physical separation.
> 
> + Present: Gavin is concerned that in the code as it currently stands, the
> BlockHeader class is a pretty pure wrapper around BlockFileHeader; effectively
> that it "is a" BlockFileHeader, and creating the new class will both confuse
> people reading the code about the relationship between BlockHeader and
> BlockFileHeader, and make interacting with BlockFileHeaders itself confusing
> because the user will have to indirect through BlockHeader to get to them. 
> 
> (Obviously, if my summary of either of your positions is incorrect or
> incomplete, please call that out, otherwise I'm sure I'll be heading in a
> completely wrong direction.)
> 
> That conflict makes me want to have a more full understanding of where this
code
> is headed.  Ricardo, if (if) it's easy to put up a CL of what you expect the
> code to look like when you do the logical/physical separation (it doesn't have
> to be completely or compile), that would be very useful, at least for me, in
> evaluating the abstraction.  If not, could you give a sketch of what you
expect
> the different entities in your final state to look like, and what their
> relationships are going to be?  I'm specifically curious about what would
happen
> to the Header() method, since I wouldn't think that that would make any sense
if
> the abstraction you seem to be aiming for were fully realized.  But I may be
> misunderstanding the abstraction.  I think if we more fully understand where
the
> code is headed, we can do a better job of figuring out how it should get
there.
> 
> One other top level point: As I read this CL, BlockHeader wasn't introduced in
> this CL (at least, it's in my LKGR checkout, and the diff I read in this CL
> doesn't have a lot of changes to BlockHeader). That makes me think that we
> shouldn't block this CL, which seems primarily about block bitmaps, because of
> issues with BlockHeader, and instead should carry on this discussion in some
> other format (maybe the sketch CL mentioned above, if Ricardo can throw
together
> such a thing).  Obviously there's a danger in letting the BlockHeader issue
> slide if we do that, but it still strikes me as the right thing to do--if we
can
> deal with issues orthogonally, we should, and BlockHeader is already in the
> backing tree.

A slight clarification: when this class landed Gavin actually expressed in
person to be convinced about the logical value of the class (as in if it makes
sense or not). That said, I pointed out at that time that it is always possible
to change anything from a class when more users are implemented, and that would
be this CL so I'm more than happy to put the issue to rest now, even though I
thought we were past that point already.

I apologize because I don't anticipate having a lot of time this week to be
fully responsive to this review. I'm not sure I would agree with the summary
about the future of the code (or at least I may understand something different
with that description), so I'll try to write something soon to be more clear
about it.

In the mean time, there is a document that should provide a lot of background,
in case you have not seen it yet:
http://www.chromium.org/developers/design-documents/network-stack/disk-cache/...
(a longer version is available at
https://docs.google.com/a/google.com/document/d/1UnBlsdJusO4pdgzC4Ut194YVki16...
but it mostly adds justification for the details of the design)

Changes to the structure of the block files are minimal (other than splitting
each file in two), so the section on the old document is also relevant
(http://www.chromium.org/developers/design-documents/network-stack/disk-cache#...)

About the future of the code, there is the V3 as implemented
(https://codereview.chromium.org/17507006/ is close enough to the latest
version), and there is the fact that V2 will be removed when we deploy V3 so a
lot of code will go away.

Thanks for your patience.

Randy Smith (Not in Mondays)

Quick heads up here that I won't be able to respond to this until this ...

7 years, 5 months ago (2013-07-17 12:44:12 UTC) #12

Randy Smith (Not in Mondays)

On 2013/07/17 03:04:20, rvargas wrote: > On 2013/07/16 18:01:41, rdsmith wrote: > > If I'm ...

7 years, 5 months ago (2013-07-18 01:04:59 UTC) #13

On 2013/07/17 03:04:20, rvargas wrote:
> On 2013/07/16 18:01:41, rdsmith wrote:
> > If I'm understanding your separate points correctly, this sounds like a
future
> /
> > present conflict:
> > 
> > + Future: Ricardo has a plan for where he wants the code to go that involves
> > abstracting out the allocation of blocks from the physical file that they go
> > into (Ricardo: I'm intentionally trying to simplify from "metadata of a
single
> > logical block file" in hopes that if I've gotten the key point wrong you'll
> > object.  To me, based on what you're saying so far, it sounds like what
you're
> > looking for is an allocation abstraction, not something that has any other
> > particular attributes of a file).  I don't think this is about memory/disk
> > separation so much as about logical/physical separation.
> > 
> > + Present: Gavin is concerned that in the code as it currently stands, the
> > BlockHeader class is a pretty pure wrapper around BlockFileHeader;
effectively
> > that it "is a" BlockFileHeader, and creating the new class will both confuse
> > people reading the code about the relationship between BlockHeader and
> > BlockFileHeader, and make interacting with BlockFileHeaders itself confusing
> > because the user will have to indirect through BlockHeader to get to them. 
> > 
> > (Obviously, if my summary of either of your positions is incorrect or
> > incomplete, please call that out, otherwise I'm sure I'll be heading in a
> > completely wrong direction.)
> > 
> > That conflict makes me want to have a more full understanding of where this
> code
> > is headed.  Ricardo, if (if) it's easy to put up a CL of what you expect the
> > code to look like when you do the logical/physical separation (it doesn't
have
> > to be completely or compile), that would be very useful, at least for me, in
> > evaluating the abstraction.  If not, could you give a sketch of what you
> expect
> > the different entities in your final state to look like, and what their
> > relationships are going to be?  I'm specifically curious about what would
> happen
> > to the Header() method, since I wouldn't think that that would make any
sense
> if
> > the abstraction you seem to be aiming for were fully realized.  But I may be
> > misunderstanding the abstraction.  I think if we more fully understand where
> the
> > code is headed, we can do a better job of figuring out how it should get
> there.
> > 
> > One other top level point: As I read this CL, BlockHeader wasn't introduced
in
> > this CL (at least, it's in my LKGR checkout, and the diff I read in this CL
> > doesn't have a lot of changes to BlockHeader). That makes me think that we
> > shouldn't block this CL, which seems primarily about block bitmaps, because
of
> > issues with BlockHeader, and instead should carry on this discussion in some
> > other format (maybe the sketch CL mentioned above, if Ricardo can throw
> together
> > such a thing).  Obviously there's a danger in letting the BlockHeader issue
> > slide if we do that, but it still strikes me as the right thing to do--if we
> can
> > deal with issues orthogonally, we should, and BlockHeader is already in the
> > backing tree.
> 
> A slight clarification: when this class landed Gavin actually expressed in
> person to be convinced about the logical value of the class (as in if it makes
> sense or not). That said, I pointed out at that time that it is always
possible
> to change anything from a class when more users are implemented, and that
would
> be this CL so I'm more than happy to put the issue to rest now, even though I
> thought we were past that point already.
> 
> I apologize because I don't anticipate having a lot of time this week to be
> fully responsive to this review. I'm not sure I would agree with the summary
> about the future of the code (or at least I may understand something different
> with that description), so I'll try to write something soon to be more clear
> about it.
> 
> In the mean time, there is a document that should provide a lot of background,
> in case you have not seen it yet:
>
http://www.chromium.org/developers/design-documents/network-stack/disk-cache/...
> (a longer version is available at
>
https://docs.google.com/a/google.com/document/d/1UnBlsdJusO4pdgzC4Ut194YVki16...
> but it mostly adds justification for the details of the design)
> 
> Changes to the structure of the block files are minimal (other than splitting
> each file in two), so the section on the old document is also relevant
>
(http://www.chromium.org/developers/design-documents/network-stack/disk-cache#...)
> 
> About the future of the code, there is the V3 as implemented
> (https://codereview.chromium.org/17507006/ is close enough to the latest
> version), and there is the fact that V2 will be removed when we deploy V3 so a
> lot of code will go away.
> 
> Thanks for your patience.

No worries on my end--as indicated, I'm feeling a bit snowed at the moment too. 
I'll study the material you've provided, but won't worry about doing any kind of
detailed response until I hear from you again.  Thanks!

gavinp

On 2013/07/17 03:04:20, rvargas wrote: > On 2013/07/16 18:01:41, rdsmith wrote: > > If I'm ...

7 years, 5 months ago (2013-07-20 02:53:38 UTC) #14

On 2013/07/17 03:04:20, rvargas wrote:
> On 2013/07/16 18:01:41, rdsmith wrote:
> > If I'm understanding your separate points correctly, this sounds like a
future
> /
> > present conflict:
> > 
> > + Future: Ricardo has a plan for where he wants the code to go that involves
> > abstracting out the allocation of blocks from the physical file that they go
> > into (Ricardo: I'm intentionally trying to simplify from "metadata of a
single
> > logical block file" in hopes that if I've gotten the key point wrong you'll
> > object.  To me, based on what you're saying so far, it sounds like what
you're
> > looking for is an allocation abstraction, not something that has any other
> > particular attributes of a file).  I don't think this is about memory/disk
> > separation so much as about logical/physical separation.
> > 
> > + Present: Gavin is concerned that in the code as it currently stands, the
> > BlockHeader class is a pretty pure wrapper around BlockFileHeader;
effectively
> > that it "is a" BlockFileHeader, and creating the new class will both confuse
> > people reading the code about the relationship between BlockHeader and
> > BlockFileHeader, and make interacting with BlockFileHeaders itself confusing
> > because the user will have to indirect through BlockHeader to get to them. 
> > 
> > (Obviously, if my summary of either of your positions is incorrect or
> > incomplete, please call that out, otherwise I'm sure I'll be heading in a
> > completely wrong direction.)
> > 
> > That conflict makes me want to have a more full understanding of where this
> code
> > is headed.  Ricardo, if (if) it's easy to put up a CL of what you expect the
> > code to look like when you do the logical/physical separation (it doesn't
have
> > to be completely or compile), that would be very useful, at least for me, in
> > evaluating the abstraction.  If not, could you give a sketch of what you
> expect
> > the different entities in your final state to look like, and what their
> > relationships are going to be?  I'm specifically curious about what would
> happen
> > to the Header() method, since I wouldn't think that that would make any
sense
> if
> > the abstraction you seem to be aiming for were fully realized.  But I may be
> > misunderstanding the abstraction.  I think if we more fully understand where
> the
> > code is headed, we can do a better job of figuring out how it should get
> there.
> > 
> > One other top level point: As I read this CL, BlockHeader wasn't introduced
in
> > this CL (at least, it's in my LKGR checkout, and the diff I read in this CL
> > doesn't have a lot of changes to BlockHeader). That makes me think that we
> > shouldn't block this CL, which seems primarily about block bitmaps, because
of
> > issues with BlockHeader, and instead should carry on this discussion in some
> > other format (maybe the sketch CL mentioned above, if Ricardo can throw
> together
> > such a thing).  Obviously there's a danger in letting the BlockHeader issue
> > slide if we do that, but it still strikes me as the right thing to do--if we
> can
> > deal with issues orthogonally, we should, and BlockHeader is already in the
> > backing tree.
> 
> A slight clarification: when this class landed Gavin actually expressed in
> person to be convinced about the logical value of the class (as in if it makes
> sense or not). That said, I pointed out at that time that it is always
possible
> to change anything from a class when more users are implemented, and that
would
> be this CL so I'm more than happy to put the issue to rest now, even though I
> thought we were past that point already.
> 
> I apologize because I don't anticipate having a lot of time this week to be
> fully responsive to this review. I'm not sure I would agree with the summary
> about the future of the code (or at least I may understand something different
> with that description), so I'll try to write something soon to be more clear
> about it.

Ricardo, I just saw that you updated the V3 blockfile summary CL, at
https://codereview.chromium.org/15203004/ . That's awesome.

I'm sad that you disagree with my discussion of the plans, I would have been
helpfully informed if you had said how. I said that the CL you just updated
contained them, I hope you updating this shows that was correct. I also said
that block_files.h in that CL is precisely as it is in this CL. That is also
true in your most recent upload, so far as I can tell.

I know I'd benefit from you telling me how I'm wrong.

>
> In the mean time, there is a document that should provide a lot of background,
> in case you have not seen it yet:
>
http://www.chromium.org/developers/design-documents/network-stack/disk-cache/...
> (a longer version is available at
>
https://docs.google.com/a/google.com/document/d/1UnBlsdJusO4pdgzC4Ut194YVki16...
> but it mostly adds justification for the details of the design)
> 
> Changes to the structure of the block files are minimal (other than splitting
> each file in two), so the section on the old document is also relevant
>
(http://www.chromium.org/developers/design-documents/network-stack/disk-cache#...)
> 
> About the future of the code, there is the V3 as implemented
> (https://codereview.chromium.org/17507006/ is close enough to the latest
> version), and there is the fact that V2 will be removed when we deploy V3 so a
> lot of code will go away.
> 
> Thanks for your patience.

gavinp

Just FYI: rdsmith, I'm blocking any response until after I've heard from you. Thanks.

7 years, 5 months ago (2013-07-20 02:54:25 UTC) #15

Randy Smith (Not in Mondays)

On 2013/07/20 02:54:25, gavinp wrote: > Just FYI: rdsmith, I'm blocking any response until after ...

7 years, 5 months ago (2013-07-20 12:16:16 UTC) #16

Randy Smith (Not in Mondays)

Ricardo: I've spent some time going over https://codereview.chromium.org/15203004 to understand the general shape of where ...

7 years, 5 months ago (2013-07-23 23:02:02 UTC) #17

rvargas (doing something else)

On 2013/07/23 23:02:02, rdsmith wrote: > Ricardo: I've spent some time going over > https://codereview.chromium.org/15203004 ...

7 years, 5 months ago (2013-07-24 02:21:06 UTC) #18

On 2013/07/23 23:02:02, rdsmith wrote:
> Ricardo: I've spent some time going over
> https://codereview.chromium.org/15203004 to understand the general shape of
> where you're going with the v3 cache, and I think my basic question is what
you
> feel the utility of the BlockHeader abstraction is (as opposed to one that
more
> generally encapsulates the things you might want to do with a
BlockFileHeader). 
> 
> 
> I'm interested both on a conceptual level and on the practical level. 
> Conceptually, I'd like to understand your intentions for the abstraction.   Is
> the issue that you want to pull out the allocation, as I suggest above, that
you
> want to make a distinction between in-memory and on-disk operations, or that
you
> have plans for something, presumably not included in 15203004, that would be a
> further abstraction from BlockFileHeader?  
> 
> Practically, I'd like to understand the code value you expect to get from the
> abstraction.  My personal perspective on abstractions is that they are for
> reducing the amount of knowledge required to edit code (down from details of
the
> implementation of the abstraction to only the interface in the header).  But I
> don't see that as happening here, since just about every routine in
> block_bitmaps.cc (which I believe is the code that BlockHeader was written
for?)
> seems to both use the BlockHeader interface and reach through to
> BlockFileHeader.  So I suspect you have a different value in mind for use of
the
> class than I'm aware of.
> 
> I'm happy to continue this conversation by CL, but we might also want to talk
> directly.  I'd be up for either a VC chat, or talking while I'm in California
> next week--just let me know.
> 
> Thanks much in advance!

Earlier today I started drawing a diagram of what I want but it's not helping
much so maybe it's better if I go with plain text instead.
I hope this is not too boring.

A logical block file in disk is a collection of semi-random fields on the header
+ the allocation map + the actual user data (blocks). Going forward, the memory
mapped part is split from the non-mapped part, so the actual user data (blocks)
are going to an independent physical file.

BlockFileHeader is a plain C struct that defines how the first part should look.
It has the plain fields of the header plus the allocation map, because all that
is memory mapped. I feel very strongly about providing a plain structure without
any interference of code itself as the file format definition (meaning this is
not a class and it doesn't expose any methods at all), because allowing third
parties to understand what we store without having to cleanup our headers is
important.

Next up is BlockHeader. This is a class that represents a logical block file for
some parts of the code (BlockBitmaps). It means that it has methods that expose
"file" level operations, as CreateMapBlock or NeedToGrowBlockFile, because that
is what a high user level wants to do with a block file. In this context,
BlockHeader represents all the metadata of the file (actual fields from the
structure + the allocation bitmap), but in a way also the user data, except that
the user will never really interact directly with the user data part of the file
because that means file IO so in practice all that is handled by knowing the
address that stores a given block. In other words, it doesn't provide access to
the data because the user should not write or read directly from that, not
because that is not part of "the file" represented by the "header".

The next level up has two entities: BlockFiles and BlockBitmaps. They both sit
at the same level of abstraction and they are both direct users of BlockHeader.
While BlockHeader represents a single logical file, these two objects represent
the collection of all block files used by the cache. These two objects also
expose an interface to for instance CreateBlock(), but what they have to do is
pick up the right file to use and then use the proper method on object
(BlockHeader) to create the block. Again, this just allocates space on a data
section somewhere and returns the address to use to reference that data.

There are two basic differences between BlockFiles and BlockBitmaps:
1. The former does everything for V2
2. The latter is only used for V3 and ensures that no IO takes place (doesn't
know about real file handles). In this case, an instance of BlockFiles actually
does some work on the proper thread, like creating and extending files because
this object is still keeping all the handles to actual files on disk, but it's
not used to manipulate blocks directly.

Then there's the issue of encapsulation. BlockHeader is not intended to isolate
consumers from the fact that a file header contains a field with the number of
entries on the file. It's purpose is to provide methods to make the life of
consumers simpler. For the most part, BlockBitmaps doesn't interact directly
with fields on the header, and never with the allocation bitmap. However, when
it needs to perform something a little lower level (say, getting the id of the
current file), it goes through the BlockHeader object and accesses the required
field. The alternatives would be to expose all needed bits through dedicated
accessors, or attempt to encapsulate all uses of those bits through very simple
"functional" wrappers and they both seem like pure overhead to me (it's not like
the consumer will forget that it is accessing a field just because it is using a
dedicated accessor). Another option would be to remove the header() accessor and
force the caller to keep two objects instead of one, just for the sake of it.

BlockFiles is a little different because it uses raw BlockFileHeader pointers
more often. This is in part because it used to do that all the time (there was
no BlockHeader class). There is opportunity to revisit this class and reduce the
use of BlockFileHeader, especially once V2 support is eliminated.

So, to answer your question, conceptually BlockHeader is the class that
represents the header (and the header represents the file). This could be part
of the actual file format definition (aka, no BlockFileHeader structure), but I
don't consider adding methods there as an option (neither really, removing the
structure). So if this represents the file, a method to allocate a block on a
file belongs here as opposed to a namespace... and if the file has an id,
consumers should be able to get to that id through this object.

I hope this makes sense now.

Randy Smith (Not in Mondays)

Gavin: Ricardo and I chatted for a bit this afternoon, and got to a place ...

7 years, 4 months ago (2013-07-30 22:31:58 UTC) #19

Gavin: Ricardo and I chatted for a bit this afternoon, and got to a place that
I'm comfortable with based on my current understanding of the code and design. 
I wanted to writeup our conclusions to make sure I have the same understanding
of the conversation as Ricardo, and see whether/how much they deal with your
concerns.

My primary concern in reading the code wasn't so much the existence of both
BlockHeader and BlockFileHeader, or the fact that BlockHeader wraps
BlockFileHeader, as the fact that most consumers of them use them both.  To me,
that's an abstraction that doesn't have much use; it doesn't allow any reduction
of information within the consumer code.  I'd rather have a single abstraction
that has more methods, even if those methods aren't necessarily at the same
level.  So I was interested in either unifying the two classes (through the
proposals you made for unification or something like them) or in actually
separating their usage.  My preference was for separation.  

The two consumers of BlockHeader that I found were BlockBitmaps (the v3 future)
and BlockFiles (the v2 past.  Mostly :-}.).  For BlockBitmaps my sense, which
Ricardo agreed with, is that it would be relatively easy to make it only target
BlockHeader.  There are O(2) methods in BlockHeader which would need to be
implemented by just reading BlockFileHeader data members, and that doesn't seem
like a particularly high price to pay to get a type (BlockFileHeader) out of the
conceptual pool required for reading/maintaining BlockBitmaps. 

BlockFiles is trickier.  Much of it is going away, which raises the question of
how much effort is worth putting in getting it "right".  However, some of it
will be sticking around, because BlockHeader is only about manipulating the
allocation map (loosely), and BlockFiles will be used for, e.g. remapping and
growing files.  Ricardo believes that after the v2 cache is deprecated,
BlockFiles can be written to only use BlockFileHeader, but he's reluctant to put
a lot of effort into it right now.

That final state (BlockHeader is a wrapper around BlockFileHeader that provides
block allocating/freeing primitives, BlockBitmaps uses only BlockHeader,
BlockFiles uses only BlockFileHeader) is one that I think I'd be happy with. 
I'm not completely happy with leaving the current mixed usage inside of
BlockFiles (I'm a big believer from my own career in "temporary solutions
aren't" :-J), but I'm willing to accept it as part of the process of getting to
the new v3 state.

So what this would mean for this CL from my perspective: 
+ BlockBitmaps and BlockHeader be modified so that BlockBitmaps doesn't use
BlockFileHeader at all.
+ A TODO be put at the BlockHeader::Header() declaration indicating that it
should be removed after the v2 cache has been deprecated.  

Does this deal with some amount of the concerns you have around BlockHeader and
BlockFileHeader?  What concerns remain?  (If it doesn't deal with any of the
concerns, then I haven't done a good job of groking your feedback and we should
talk directly, which is fine.)

gavinp

On 2013/07/30 22:31:58, rdsmith wrote: > Gavin: Ricardo and I chatted for a bit this ...

7 years, 4 months ago (2013-07-31 19:09:26 UTC) #20

On 2013/07/30 22:31:58, rdsmith wrote:
> Gavin: Ricardo and I chatted for a bit this afternoon, and got to a place that
> I'm comfortable with based on my current understanding of the code and design.

> I wanted to writeup our conclusions to make sure I have the same understanding
> of the conversation as Ricardo, and see whether/how much they deal with your
> concerns.
> 
> My primary concern in reading the code wasn't so much the existence of both
> BlockHeader and BlockFileHeader, or the fact that BlockHeader wraps
> BlockFileHeader, as the fact that most consumers of them use them both.  To
me,
> that's an abstraction that doesn't have much use; it doesn't allow any
reduction
> of information within the consumer code.  I'd rather have a single abstraction
> that has more methods, even if those methods aren't necessarily at the same
> level.  So I was interested in either unifying the two classes (through the
> proposals you made for unification or something like them) or in actually
> separating their usage.  My preference was for separation.  
> 
> The two consumers of BlockHeader that I found were BlockBitmaps (the v3
future)
> and BlockFiles (the v2 past.  Mostly :-}.).  For BlockBitmaps my sense, which
> Ricardo agreed with, is that it would be relatively easy to make it only
target
> BlockHeader.  There are O(2) methods in BlockHeader which would need to be
> implemented by just reading BlockFileHeader data members, and that doesn't
seem
> like a particularly high price to pay to get a type (BlockFileHeader) out of
the
> conceptual pool required for reading/maintaining BlockBitmaps. 
> 
> BlockFiles is trickier.  Much of it is going away, which raises the question
of
> how much effort is worth putting in getting it "right".  However, some of it
> will be sticking around, because BlockHeader is only about manipulating the
> allocation map (loosely), and BlockFiles will be used for, e.g. remapping and
> growing files.  Ricardo believes that after the v2 cache is deprecated,
> BlockFiles can be written to only use BlockFileHeader, but he's reluctant to
put
> a lot of effort into it right now.
> 
> That final state (BlockHeader is a wrapper around BlockFileHeader that
provides
> block allocating/freeing primitives, BlockBitmaps uses only BlockHeader,
> BlockFiles uses only BlockFileHeader) is one that I think I'd be happy with. 
> I'm not completely happy with leaving the current mixed usage inside of
> BlockFiles (I'm a big believer from my own career in "temporary solutions
> aren't" :-J), but I'm willing to accept it as part of the process of getting
to
> the new v3 state.
> 
> So what this would mean for this CL from my perspective: 
> + BlockBitmaps and BlockHeader be modified so that BlockBitmaps doesn't use
> BlockFileHeader at all.
> + A TODO be put at the BlockHeader::Header() declaration indicating that it
> should be removed after the v2 cache has been deprecated.  
> 
> Does this deal with some amount of the concerns you have around BlockHeader
and
> BlockFileHeader?  What concerns remain?  (If it doesn't deal with any of the
> concerns, then I haven't done a good job of groking your feedback and we
should
> talk directly, which is fine.)

Thank you for taking the time on this Randy, and for taking the the time to
explain the reasoning to me. This, together with our conversation was the first
time I had realized:
  1. That the MappedFile* constructor for BlockHeader is expected to dissappear
in the final V3 cache (is this correct, Ricardo?)
  2. That the BlockHeader::Header() method is expected to disappear in the final
V3 cache (is this correct, Ricardo?)
Certainly if these are both true, and both commented with TODOs, some of the
architectural concern around this class disappears. I had not realized the
extent to which these two abstraction busters were going to disappear in V3. :(

The second question is whether the relationship between a BlockHeader and a
BlockFileHeader is an "is a" relationship or not. Certainly, every
BlockFileHeader object has one and exactly one BlockFile object. Likewise, every
BlockFile object has one and exactly one BlockFileHeader object. It bothers me
architecturally that every method on BlockFileHeader is actually a method on a
BlockFile*, but we are just using language semantics to hide that fact. Where I
see a full isomorphism, I suspect the relationship is identity, although I
respect your suggestion that isomorphism is not identity. Given that, and given
(1) and (2) above, and the presumption in favour of the reviewer, I think I can
accept this class.

Now let's continue with this review. :D

Randy Smith (Not in Mondays)

On 2013/07/31 19:09:26, gavinp wrote: > On 2013/07/30 22:31:58, rdsmith wrote: > > Gavin: Ricardo ...

7 years, 4 months ago (2013-07-31 22:30:58 UTC) #21

On 2013/07/31 19:09:26, gavinp wrote:
> On 2013/07/30 22:31:58, rdsmith wrote:
> > Gavin: Ricardo and I chatted for a bit this afternoon, and got to a place
that
> > I'm comfortable with based on my current understanding of the code and
design.
> 
> > I wanted to writeup our conclusions to make sure I have the same
understanding
> > of the conversation as Ricardo, and see whether/how much they deal with your
> > concerns.
> > 
> > My primary concern in reading the code wasn't so much the existence of both
> > BlockHeader and BlockFileHeader, or the fact that BlockHeader wraps
> > BlockFileHeader, as the fact that most consumers of them use them both.  To
> me,
> > that's an abstraction that doesn't have much use; it doesn't allow any
> reduction
> > of information within the consumer code.  I'd rather have a single
abstraction
> > that has more methods, even if those methods aren't necessarily at the same
> > level.  So I was interested in either unifying the two classes (through the
> > proposals you made for unification or something like them) or in actually
> > separating their usage.  My preference was for separation.  
> > 
> > The two consumers of BlockHeader that I found were BlockBitmaps (the v3
> future)
> > and BlockFiles (the v2 past.  Mostly :-}.).  For BlockBitmaps my sense,
which
> > Ricardo agreed with, is that it would be relatively easy to make it only
> target
> > BlockHeader.  There are O(2) methods in BlockHeader which would need to be
> > implemented by just reading BlockFileHeader data members, and that doesn't
> seem
> > like a particularly high price to pay to get a type (BlockFileHeader) out of
> the
> > conceptual pool required for reading/maintaining BlockBitmaps. 
> > 
> > BlockFiles is trickier.  Much of it is going away, which raises the question
> of
> > how much effort is worth putting in getting it "right".  However, some of it
> > will be sticking around, because BlockHeader is only about manipulating the
> > allocation map (loosely), and BlockFiles will be used for, e.g. remapping
and
> > growing files.  Ricardo believes that after the v2 cache is deprecated,
> > BlockFiles can be written to only use BlockFileHeader, but he's reluctant to
> put
> > a lot of effort into it right now.
> > 
> > That final state (BlockHeader is a wrapper around BlockFileHeader that
> provides
> > block allocating/freeing primitives, BlockBitmaps uses only BlockHeader,
> > BlockFiles uses only BlockFileHeader) is one that I think I'd be happy with.

> > I'm not completely happy with leaving the current mixed usage inside of
> > BlockFiles (I'm a big believer from my own career in "temporary solutions
> > aren't" :-J), but I'm willing to accept it as part of the process of getting
> to
> > the new v3 state.
> > 
> > So what this would mean for this CL from my perspective: 
> > + BlockBitmaps and BlockHeader be modified so that BlockBitmaps doesn't use
> > BlockFileHeader at all.
> > + A TODO be put at the BlockHeader::Header() declaration indicating that it
> > should be removed after the v2 cache has been deprecated.  
> > 
> > Does this deal with some amount of the concerns you have around BlockHeader
> and
> > BlockFileHeader?  What concerns remain?  (If it doesn't deal with any of the
> > concerns, then I haven't done a good job of groking your feedback and we
> should
> > talk directly, which is fine.)
> 
> Thank you for taking the time on this Randy, and for taking the the time to
> explain the reasoning to me. This, together with our conversation was the
first
> time I had realized:
>   1. That the MappedFile* constructor for BlockHeader is expected to
dissappear
> in the final V3 cache (is this correct, Ricardo?)
>   2. That the BlockHeader::Header() method is expected to disappear in the
final
> V3 cache (is this correct, Ricardo?)
> Certainly if these are both true, and both commented with TODOs, some of the
> architectural concern around this class disappears. I had not realized the
> extent to which these two abstraction busters were going to disappear in V3.
:(
> 
> The second question is whether the relationship between a BlockHeader and a
> BlockFileHeader is an "is a" relationship or not. Certainly, every
> BlockFileHeader object has one and exactly one BlockFile object. Likewise,
every
> BlockFile object has one and exactly one BlockFileHeader object. It bothers me
> architecturally that every method on BlockFileHeader is actually a method on a
> BlockFile*, but we are just using language semantics to hide that fact. Where
I
> see a full isomorphism, I suspect the relationship is identity, although I
> respect your suggestion that isomorphism is not identity. Given that, and
given
> (1) and (2) above, and the presumption in favour of the reviewer, I think I
can
> accept this class.

Gavin: Just confirming that when you say BlockFile above, you mean BlockHeader? 
I'm not aware of a BlockFile class, just a BlockFileHeader structure and a
BlockHeader class.  BlockFiles and BlockBitmaps are higher level classes that
manage the group of block files.  But I don't think there's a BlockFile class. 
So I'm interpreting you as saying that it bothers you architecturally that every
method on BlockHeader is effectively a method on BlockFileHeader because it only
and completely manipulates or reads BlockFileHeader's data.  Please let me know
if I've got that wrong.  Thanks!

> 
> Now let's continue with this review. :D

gavinp

On 2013/07/31 22:30:58, rdsmith wrote: > On 2013/07/31 19:09:26, gavinp wrote: > > On 2013/07/30 ...

7 years, 4 months ago (2013-08-05 15:12:13 UTC) #22

On 2013/07/31 22:30:58, rdsmith wrote:
> On 2013/07/31 19:09:26, gavinp wrote:
> > On 2013/07/30 22:31:58, rdsmith wrote:
> > > Gavin: Ricardo and I chatted for a bit this afternoon, and got to a place
> that
> > > I'm comfortable with based on my current understanding of the code and
> design.
> > 
> > > I wanted to writeup our conclusions to make sure I have the same
> understanding
> > > of the conversation as Ricardo, and see whether/how much they deal with
your
> > > concerns.
> > > 
> > > My primary concern in reading the code wasn't so much the existence of
both
> > > BlockHeader and BlockFileHeader, or the fact that BlockHeader wraps
> > > BlockFileHeader, as the fact that most consumers of them use them both. 
To
> > me,
> > > that's an abstraction that doesn't have much use; it doesn't allow any
> > reduction
> > > of information within the consumer code.  I'd rather have a single
> abstraction
> > > that has more methods, even if those methods aren't necessarily at the
same
> > > level.  So I was interested in either unifying the two classes (through
the
> > > proposals you made for unification or something like them) or in actually
> > > separating their usage.  My preference was for separation.  
> > > 
> > > The two consumers of BlockHeader that I found were BlockBitmaps (the v3
> > future)
> > > and BlockFiles (the v2 past.  Mostly :-}.).  For BlockBitmaps my sense,
> which
> > > Ricardo agreed with, is that it would be relatively easy to make it only
> > target
> > > BlockHeader.  There are O(2) methods in BlockHeader which would need to be
> > > implemented by just reading BlockFileHeader data members, and that doesn't
> > seem
> > > like a particularly high price to pay to get a type (BlockFileHeader) out
of
> > the
> > > conceptual pool required for reading/maintaining BlockBitmaps. 
> > > 
> > > BlockFiles is trickier.  Much of it is going away, which raises the
question
> > of
> > > how much effort is worth putting in getting it "right".  However, some of
it
> > > will be sticking around, because BlockHeader is only about manipulating
the
> > > allocation map (loosely), and BlockFiles will be used for, e.g. remapping
> and
> > > growing files.  Ricardo believes that after the v2 cache is deprecated,
> > > BlockFiles can be written to only use BlockFileHeader, but he's reluctant
to
> > put
> > > a lot of effort into it right now.
> > > 
> > > That final state (BlockHeader is a wrapper around BlockFileHeader that
> > provides
> > > block allocating/freeing primitives, BlockBitmaps uses only BlockHeader,
> > > BlockFiles uses only BlockFileHeader) is one that I think I'd be happy
with.
> 
> > > I'm not completely happy with leaving the current mixed usage inside of
> > > BlockFiles (I'm a big believer from my own career in "temporary solutions
> > > aren't" :-J), but I'm willing to accept it as part of the process of
getting
> > to
> > > the new v3 state.
> > > 
> > > So what this would mean for this CL from my perspective: 
> > > + BlockBitmaps and BlockHeader be modified so that BlockBitmaps doesn't
use
> > > BlockFileHeader at all.
> > > + A TODO be put at the BlockHeader::Header() declaration indicating that
it
> > > should be removed after the v2 cache has been deprecated.  
> > > 
> > > Does this deal with some amount of the concerns you have around
BlockHeader
> > and
> > > BlockFileHeader?  What concerns remain?  (If it doesn't deal with any of
the
> > > concerns, then I haven't done a good job of groking your feedback and we
> > should
> > > talk directly, which is fine.)
> > 
> > Thank you for taking the time on this Randy, and for taking the the time to
> > explain the reasoning to me. This, together with our conversation was the
> first
> > time I had realized:
> >   1. That the MappedFile* constructor for BlockHeader is expected to
> dissappear
> > in the final V3 cache (is this correct, Ricardo?)
> >   2. That the BlockHeader::Header() method is expected to disappear in the
> final
> > V3 cache (is this correct, Ricardo?)
> > Certainly if these are both true, and both commented with TODOs, some of the
> > architectural concern around this class disappears. I had not realized the
> > extent to which these two abstraction busters were going to disappear in V3.
> :(
> > 
> > The second question is whether the relationship between a BlockHeader and a
> > BlockFileHeader is an "is a" relationship or not. Certainly, every
> > BlockFileHeader object has one and exactly one BlockFile object. Likewise,
> every
> > BlockFile object has one and exactly one BlockFileHeader object. It bothers
me
> > architecturally that every method on BlockFileHeader is actually a method on
a
> > BlockFile*, but we are just using language semantics to hide that fact.
Where
> I
> > see a full isomorphism, I suspect the relationship is identity, although I
> > respect your suggestion that isomorphism is not identity. Given that, and
> given
> > (1) and (2) above, and the presumption in favour of the reviewer, I think I
> can
> > accept this class.
> 
> Gavin: Just confirming that when you say BlockFile above, you mean
BlockHeader? 
> I'm not aware of a BlockFile class, just a BlockFileHeader structure and a
> BlockHeader class.  BlockFiles and BlockBitmaps are higher level classes that
> manage the group of block files.  But I don't think there's a BlockFile class.

> So I'm interpreting you as saying that it bothers you architecturally that
every
> method on BlockHeader is effectively a method on BlockFileHeader because it
only
> and completely manipulates or reads BlockFileHeader's data.  Please let me
know
> if I've got that wrong.  Thanks!

Yes, that's what I meant, I'm sorry that I got lost in name soup, and as a
result my comment was far more confusing than it needed to be.

The BlockHeader class, declared in "block_files.h" is the one that confuses me,
because it only manipulates the BlockFileHeader, declared in
"disk_format_base.h". I find the relationship confusing, in part because the
blockfile cache generally contains a lot of classes, and the fine distinctions
between each of them escape me. I'm sorry for the confusing comment that
resulted from my name confusion, but I also submit it illustrates the trouble I
have reading code here.

> 
> > 
> > Now let's continue with this review. :D

gavinp

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache/block_files.cc File net/disk_cache/block_files.cc (right): https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache/block_files.cc#newcode199 net/disk_cache/block_files.cc:199: for (int i = 0; i < kMaxNumBlocks; i++) ...

7 years, 4 months ago (2013-08-05 17:06:15 UTC) #23

rvargas (doing something else)

New version uploaded. The last one is just rebasing to tot (no real changes) https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_bitmaps.cc ...

7 years, 4 months ago (2013-08-05 19:50:45 UTC) #24

New version uploaded. The last one is just rebasing to tot (no real changes)

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
File net/disk_cache/v3/block_bitmaps.cc (right):

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
net/disk_cache/v3/block_bitmaps.cc:153: NOTREACHED();
On 2013/07/15 18:23:35, gavinp wrote:
> On 2013/07/11 19:54:55, rvargas wrote:
> > On 2013/06/28 21:47:45, gavinp wrote:
> > > This scares me a bit. Is this an assertion, or an error case? It is coded
as
> > > both.
> > 
> > Not that uncommon pattern. It is an assertion in that tests or developer
> > versions should break into a debugger if this happens. On the other hand, I
> > expect a reasonable action if this happens on the field... as in creation of
> the
> > current entry to fail.
> 
> Why not crash in the field? If this is impossible (a DCHECK), then the crash
> will be really diagnostic, right?

I din't say it was impossible to get here. All is needed is to get really behind
growing the files. That's no reason to crash the whole browser.

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
File net/disk_cache/v3/block_bitmaps_unittest.cc (right):

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
net/disk_cache/v3/block_bitmaps_unittest.cc:10: 
What problem are you trying to solve?

The usual guidance of unit test files for chrome was to avoid namespaces so that
the test was not using unintentionally any code inside the namespace under test.
The second reason was to somewhat use the same pattern that actual consumers
would have to use when dealing with the code, and using the explicit namespace
is one of those things that consumers have to deal with.

A clear problem with that is when tests need to friend the code under test, so a
good number of test files started getting that namespace. In that sense, it is
kind of funny to have a written rule saying "don't use namespaces" when half of
the files would have to anyway.

Given that these tests don't _have to_ live inside net, I still think it is a
good idea to follow the old guidance.

Now, about an anonymous namespace... what problem would that be solving,
exactly? Name collisions with another, unrelated test? Isn't that the reason to
be very explicit about the name used for the tests? I would prefer a world
without name collisions than a world when collisions are handled by virtue of
anonymous namespaces.

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
File net/disk_cache/block_files.cc (right):

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/block_files.cc:199: for (int i = 0; i < kMaxNumBlocks; i++) {
On 2013/08/05 17:06:15, gavinp wrote:
> I did a real double take reading this loop. Why do we initialize i = 0 and not
> with block_count - 1?

To avoid a negative values?

done

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/block_files.cc:276: DCHECK_GE(block_files_.size(),
On 2013/07/15 18:23:36, gavinp wrote:
> Convention is to put the constant first, and the thing being tested second, so
> this needs to be reversed into a DCHECK_LE.

I believe you are confusing the requirements of EXPECT_LE with the one of
DCHECK_LE.

The test macros expect the first argument to be the constant value. On the other
hand, the recommended convention for the runtime macros is to follow the pattern
used when coding (and that means make the code more readable, usually by having
the constant after the variable)

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
File net/disk_cache/block_files.h (right):

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/block_files.h:27: // Note that this class doesn't perform any
file operation.
On 2013/08/05 17:06:15, gavinp wrote:
> Is "file operation" well defined?
> 
> This class definitely does a lot of IO. Can we expand on this comment a bit,
> both in explaining what a BlockHeader is, and what it means to not do file
> operations?

I meant the usual meaning of file operations (file IO). This class doesn't do
file IO by itself any more than any class that touches memory (almost any
class?) does file IO.

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/block_files.h:74: typedef std::vector<BlockHeader>
BlockFilesBitmaps;
On 2013/08/05 17:06:15, gavinp wrote:
> In reviewing this code, I think this typedef is causing me some pain. Can we
> just use std::vector<BlockHeader> everywhere, it's only seven characters
longer?
> I think it's referenced only twice?

It is used more that twice on the V3 code. This is the type that is transferred
between BlockFiles and BlockBitmaps whenever the mappings change.

BlockFilesMappings ?

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
File net/disk_cache/v3/block_bitmaps.cc (right):

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/v3/block_bitmaps.cc:157: header_num = -1;
On 2013/08/05 17:06:15, gavinp wrote:
> NOTREACHED() is a DCHECK which says this is impossible. The code to handle
this
> case in release is thus redundant. If this really is impossible, then don't
> fixup header_num, or consider a CHECK() to force the crash so we can debug it.
> 
> On the other hand, if this actually happens, perhaps a DLOG is more
appropriate?

See the other comment about crashing the browser.

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
File net/disk_cache/v3/block_bitmaps.h (right):

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/v3/block_bitmaps.h:17: // This class handles the set of
block-files open by the disk cache.
On 2013/08/05 17:06:15, gavinp wrote:
> Can we expand on this comment a bit? I'm still a bit confused about what a
> bitmap is? Also "handles" is a really general verb. More specifics will put
this
> in context.

Expended the comment. I can think of handles/represents/encapsulates but the
first one is the most accurate imo.

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
File net/disk_cache/v3/block_bitmaps_unittest.cc (right):

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/v3/block_bitmaps_unittest.cc:16: const int kNumHeaders = 10;  //
Just a bunch of "files".
On 2013/08/05 17:06:15, gavinp wrote:
> I don't understand this comment. Can you remove it or expand on it?

Done.

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/v3/block_bitmaps_unittest.cc:36:
EXPECT_TRUE(block_bitmaps.CreateBlock(disk_cache::BLOCK_1K, block_size,
On 2013/08/05 17:06:15, gavinp wrote:
> Should this be an ASSERT? What is the value of address[i] on failure?

Done.

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/v3/block_bitmaps_unittest.cc:41: EXPECT_EQ(start / 4, (start +
block_size - 1) / 4);
On 2013/08/05 17:06:15, gavinp wrote:
> I think this is equivalent to: 
>   EXPECT_GT(4, (i % 4) + (start % 4));
> and also
>   EXPECT_EQ(0, (start % 4 + block_size - 1) / 4);
> Is that right?
> 
> But I still don't understand what's being asserted about the start addresses.
Is
> it that we make our allocations aligned in some way?
> 
> Could we document this with a comment on the test failure, would:
>   << "Block was not allocated aligned to ... boundary" 
> work?

Done.

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/v3/block_bitmaps_unittest.cc:50: // 10 bits per each four
entries, so 250 bits total. All entries should go to
On 2013/08/05 17:06:15, gavinp wrote:
> Can we fix this grammar while we're here? Is it "We used 10 bits for each of
> four entries, ..." ?

Done.

Randy Smith (Not in Mondays)

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache/block_files.h File net/disk_cache/block_files.h (right): https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache/block_files.h#newcode27 net/disk_cache/block_files.h:27: // Note that this class doesn't perform any file ...

7 years, 4 months ago (2013-08-05 20:23:48 UTC) #25

rvargas (doing something else)

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache/block_files.h File net/disk_cache/block_files.h (right): https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache/block_files.h#newcode27 net/disk_cache/block_files.h:27: // Note that this class doesn't perform any file ...

7 years, 4 months ago (2013-08-05 21:03:01 UTC) #26

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
File net/disk_cache/block_files.h (right):

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/block_files.h:27: // Note that this class doesn't perform any
file operation.
On 2013/08/05 20:23:49, rdsmith wrote:
> On 2013/08/05 19:50:45, rvargas wrote:
> > On 2013/08/05 17:06:15, gavinp wrote:
> > > Is "file operation" well defined?
> > > 
> > > This class definitely does a lot of IO. Can we expand on this comment a
bit,
> > > both in explaining what a BlockHeader is, and what it means to not do file
> > > operations?
> > 
> > I meant the usual meaning of file operations (file IO). This class doesn't
do
> > file IO by itself any more than any class that touches memory (almost any
> > class?) does file IO.
> 
> For what it's worth, this confused me a bit too when I read it.  Touching
memory
> that's mapped from a file means you can't make any guarantees about when file
> data changes on disk, so I'm not sure what use the comment has.  Is there a
way
> in which consumer code relies on this class not performing explicit file
> operations, while it performing implicit file changes is ok?  (Or am I
confused
> about it touching data mapped from a file somehow?)

Now I'm confused too. I can certainly remove the comment... the only reason for
the comment was to point out the main design difference between working with
this object (or a collection of these objects, like BlockFilesBitmaps), and
working instead with BlockFiles. As in BackendImplV2 deals with BlockFiles and
it should expect all kind of synchronous IO operations to take place, as opposed
to BackendImplV3 which, by using this class (or specifically BlockBitmaps which
relies on BlockFilesBitmaps which is a vector of BlockHeader(s)), can expect all
calls to be non-blocking.

Another way to put it is that this class is IO thread friendly, as opposed to
BlockFiles... but saying that violates the abstraction of threads of net.

Randy Smith (Not in Mondays)

On 2013/08/05 21:03:01, rvargas wrote: > https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache/block_files.h > File net/disk_cache/block_files.h (right): > > https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache/block_files.h#newcode27 > ...

7 years, 4 months ago (2013-08-05 21:37:59 UTC) #27

On 2013/08/05 21:03:01, rvargas wrote:
>
https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
> File net/disk_cache/block_files.h (right):
> 
>
https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
> net/disk_cache/block_files.h:27: // Note that this class doesn't perform any
> file operation.
> On 2013/08/05 20:23:49, rdsmith wrote:
> > On 2013/08/05 19:50:45, rvargas wrote:
> > > On 2013/08/05 17:06:15, gavinp wrote:
> > > > Is "file operation" well defined?
> > > > 
> > > > This class definitely does a lot of IO. Can we expand on this comment a
> bit,
> > > > both in explaining what a BlockHeader is, and what it means to not do
file
> > > > operations?
> > > 
> > > I meant the usual meaning of file operations (file IO). This class doesn't
> do
> > > file IO by itself any more than any class that touches memory (almost any
> > > class?) does file IO.
> > 
> > For what it's worth, this confused me a bit too when I read it.  Touching
> memory
> > that's mapped from a file means you can't make any guarantees about when
file
> > data changes on disk, so I'm not sure what use the comment has.  Is there a
> way
> > in which consumer code relies on this class not performing explicit file
> > operations, while it performing implicit file changes is ok?  (Or am I
> confused
> > about it touching data mapped from a file somehow?)
> 
> Now I'm confused too. I can certainly remove the comment... the only reason
for
> the comment was to point out the main design difference between working with
> this object (or a collection of these objects, like BlockFilesBitmaps), and
> working instead with BlockFiles. As in BackendImplV2 deals with BlockFiles and
> it should expect all kind of synchronous IO operations to take place, as
opposed
> to BackendImplV3 which, by using this class (or specifically BlockBitmaps
which
> relies on BlockFilesBitmaps which is a vector of BlockHeader(s)), can expect
all
> calls to be non-blocking.
> 
> Another way to put it is that this class is IO thread friendly, as opposed to
> BlockFiles... but saying that violates the abstraction of threads of net.

Ah!  That makes sense.  My confusion when I first encountered it was around
whether there was some meaning having to do with guaranteeing consistency in the
files.  

If it's just around guaranteeing that it doesn't do operations not permitted on
the IO thread, how about "No blocking file operations"?

rvargas (doing something else)

On 2013/08/05 21:37:59, rdsmith wrote: > On 2013/08/05 21:03:01, rvargas wrote: > > > https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache/block_files.h ...

7 years, 4 months ago (2013-08-05 21:40:47 UTC) #28

On 2013/08/05 21:37:59, rdsmith wrote:
> On 2013/08/05 21:03:01, rvargas wrote:
> >
>
https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
> > File net/disk_cache/block_files.h (right):
> > 
> >
>
https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
> > net/disk_cache/block_files.h:27: // Note that this class doesn't perform any
> > file operation.
> > On 2013/08/05 20:23:49, rdsmith wrote:
> > > On 2013/08/05 19:50:45, rvargas wrote:
> > > > On 2013/08/05 17:06:15, gavinp wrote:
> > > > > Is "file operation" well defined?
> > > > > 
> > > > > This class definitely does a lot of IO. Can we expand on this comment
a
> > bit,
> > > > > both in explaining what a BlockHeader is, and what it means to not do
> file
> > > > > operations?
> > > > 
> > > > I meant the usual meaning of file operations (file IO). This class
doesn't
> > do
> > > > file IO by itself any more than any class that touches memory (almost
any
> > > > class?) does file IO.
> > > 
> > > For what it's worth, this confused me a bit too when I read it.  Touching
> > memory
> > > that's mapped from a file means you can't make any guarantees about when
> file
> > > data changes on disk, so I'm not sure what use the comment has.  Is there
a
> > way
> > > in which consumer code relies on this class not performing explicit file
> > > operations, while it performing implicit file changes is ok?  (Or am I
> > confused
> > > about it touching data mapped from a file somehow?)
> > 
> > Now I'm confused too. I can certainly remove the comment... the only reason
> for
> > the comment was to point out the main design difference between working with
> > this object (or a collection of these objects, like BlockFilesBitmaps), and
> > working instead with BlockFiles. As in BackendImplV2 deals with BlockFiles
and
> > it should expect all kind of synchronous IO operations to take place, as
> opposed
> > to BackendImplV3 which, by using this class (or specifically BlockBitmaps
> which
> > relies on BlockFilesBitmaps which is a vector of BlockHeader(s)), can expect
> all
> > calls to be non-blocking.
> > 
> > Another way to put it is that this class is IO thread friendly, as opposed
to
> > BlockFiles... but saying that violates the abstraction of threads of net.
> 
> Ah!  That makes sense.  My confusion when I first encountered it was around
> whether there was some meaning having to do with guaranteeing consistency in
the
> files.  
> 
> If it's just around guaranteeing that it doesn't do operations not permitted
on
> the IO thread, how about "No blocking file operations"?

ok... will change on the next round (+ a little more on what the class is)

gavinp

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_bitmaps.cc File net/disk_cache/v3/block_bitmaps.cc (right): https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_bitmaps.cc#newcode153 net/disk_cache/v3/block_bitmaps.cc:153: NOTREACHED(); On 2013/08/05 19:50:45, rvargas wrote: > On 2013/07/15 ...

7 years, 4 months ago (2013-08-06 01:11:02 UTC) #29

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
File net/disk_cache/v3/block_bitmaps.cc (right):

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
net/disk_cache/v3/block_bitmaps.cc:153: NOTREACHED();
On 2013/08/05 19:50:45, rvargas wrote:
> On 2013/07/15 18:23:35, gavinp wrote:
> > On 2013/07/11 19:54:55, rvargas wrote:
> > > On 2013/06/28 21:47:45, gavinp wrote:
> > > > This scares me a bit. Is this an assertion, or an error case? It is
coded
> as
> > > > both.
> > > 
> > > Not that uncommon pattern. It is an assertion in that tests or developer
> > > versions should break into a debugger if this happens. On the other hand,
I
> > > expect a reasonable action if this happens on the field... as in creation
of
> > the
> > > current entry to fail.
> > 
> > Why not crash in the field? If this is impossible (a DCHECK), then the crash
> > will be really diagnostic, right?
> 
> I din't say it was impossible to get here. All is needed is to get really
behind
> growing the files. That's no reason to crash the whole browser.

Aha, so should this be a DLOG, then. Recall that the Chomium style guide says
that "you should not handle DCHECK() failures." This code is handling a DCHECK
failure.

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
File net/disk_cache/v3/block_bitmaps_unittest.cc (right):

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
net/disk_cache/v3/block_bitmaps_unittest.cc:10: 
On 2013/08/05 19:50:45, rvargas wrote:
> What problem are you trying to solve?
> 
> The usual guidance of unit test files for chrome was to avoid namespaces so
that
> the test was not using unintentionally any code inside the namespace under
test.
> The second reason was to somewhat use the same pattern that actual consumers
> would have to use when dealing with the code, and using the explicit namespace
> is one of those things that consumers have to deal with.
> 
> A clear problem with that is when tests need to friend the code under test, so
a
> good number of test files started getting that namespace. In that sense, it is
> kind of funny to have a written rule saying "don't use namespaces" when half
of
> the files would have to anyway.
> 
> Given that these tests don't _have to_ live inside net, I still think it is a
> good idea to follow the old guidance.
> 
> Now, about an anonymous namespace... what problem would that be solving,
> exactly? Name collisions with another, unrelated test? Isn't that the reason
to
> be very explicit about the name used for the tests? I would prefer a world
> without name collisions than a world when collisions are handled by virtue of
> anonymous namespaces.

The anonymous namespace is the only one I advocated, so I will only address
that.

- A test in an anonymous namespace is guaranteed not to be a friend of another
class. That's a huge virtue that helps readability: it's a strong assertion
about the relationship of the test to the published interfaces of the classes it
includes.

- Name collisions at linkage are bad, indeed. Further, they have undefined
results as far as I'm aware. Having translation units export symbols that aren't
even in a header is just asking for trouble down the line. It would be wonderful
if we never collided in names, but many tests mock similar things, many tests
create similar test classes, it just happens. Defined results win over undefined
results every time.

- Linking is not free. The general linking problem is O(n^2), and reducing n is
a great path to reducing link time.

- The Google C++ style guide tells me that using namespaces is encouraged. See
http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml?showone=Namesp...

- The Chromium style guide tells me that "Items local to a .cc file should be
wrapped in an unnamed namespace", which is stronger, and appears fairly
definitive in the case of these tests.

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
File net/disk_cache/block_files.cc (right):

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/block_files.cc:199: for (int i = 0; i < kMaxNumBlocks; i++) {
On 2013/08/05 19:50:45, rvargas wrote:
> On 2013/08/05 17:06:15, gavinp wrote:
> > I did a real double take reading this loop. Why do we initialize i = 0 and
not
> > with block_count - 1?
> 
> To avoid a negative values?

Is it possible for block_count to have a zero or negative value? Should that be
a DCHECK?

> 
> done

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/block_files.cc:276: DCHECK_GE(block_files_.size(),
On 2013/08/05 19:50:45, rvargas wrote:
> On 2013/07/15 18:23:36, gavinp wrote:
> > Convention is to put the constant first, and the thing being tested second,
so
> > this needs to be reversed into a DCHECK_LE.
> 
> I believe you are confusing the requirements of EXPECT_LE with the one of
> DCHECK_LE.
> 
> The test macros expect the first argument to be the constant value. On the
other
> hand, the recommended convention for the runtime macros is to follow the
pattern
> used when coding (and that means make the code more readable, usually by
having
> the constant after the variable)

I am not aware of this recommended convention, and this contradicts what I've
been told. Can you please provide me with a link to this recommendation and how
it was shared with us?

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
File net/disk_cache/block_files.h (right):

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/block_files.h:27: // Note that this class doesn't perform any
file operation.
On 2013/08/05 19:50:45, rvargas wrote:
> On 2013/08/05 17:06:15, gavinp wrote:
> > Is "file operation" well defined?
> > 
> > This class definitely does a lot of IO. Can we expand on this comment a bit,
> > both in explaining what a BlockHeader is, and what it means to not do file
> > operations?
> 
> I meant the usual meaning of file operations (file IO). This class doesn't do
> file IO by itself any more than any class that touches memory (almost any
> class?) does file IO.

Is this class constructed on something other than a memory mapped file in
anything other than tests? Can you tell me more about that situation?

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/block_files.h:27: // Note that this class doesn't perform any
file operation.
On 2013/08/05 21:03:02, rvargas wrote:
> On 2013/08/05 20:23:49, rdsmith wrote:
> > On 2013/08/05 19:50:45, rvargas wrote:
> > > On 2013/08/05 17:06:15, gavinp wrote:
> > > > Is "file operation" well defined?
> > > > 
> > > > This class definitely does a lot of IO. Can we expand on this comment a
> bit,
> > > > both in explaining what a BlockHeader is, and what it means to not do
file
> > > > operations?
> > > 
> > > I meant the usual meaning of file operations (file IO). This class doesn't
> do
> > > file IO by itself any more than any class that touches memory (almost any
> > > class?) does file IO.
> > 
> > For what it's worth, this confused me a bit too when I read it.  Touching
> memory
> > that's mapped from a file means you can't make any guarantees about when
file
> > data changes on disk, so I'm not sure what use the comment has.  Is there a
> way
> > in which consumer code relies on this class not performing explicit file
> > operations, while it performing implicit file changes is ok?  (Or am I
> confused
> > about it touching data mapped from a file somehow?)
> 
> Now I'm confused too. I can certainly remove the comment... the only reason
for
> the comment was to point out the main design difference between working with
> this object (or a collection of these objects, like BlockFilesBitmaps), and
> working instead with BlockFiles. As in BackendImplV2 deals with BlockFiles and
> it should expect all kind of synchronous IO operations to take place, as
opposed
> to BackendImplV3 which, by using this class (or specifically BlockBitmaps
which
> relies on BlockFilesBitmaps which is a vector of BlockHeader(s)), can expect
all
> calls to be non-blocking.
> 
> Another way to put it is that this class is IO thread friendly, as opposed to
> BlockFiles... but saying that violates the abstraction of threads of net.

Dereferencing a pointer into a memory map is no different than calling read(2)
on some offset into the file. If the latter is prohibited, so is the former. It
is NOT OK to memory map a file and then use that memory map on the IO thread. Do
not be confused by the fact that a thread guard doesn't prevent this.

Further, reading and writing to memory that is, in all cases, expected to be
backed by a memory map is precisely doing IO using read(2) and write(2). If this
class was designed with the expectation that it dereferences pointers into
memory maps, it was designed to do IO.

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
File net/disk_cache/v3/block_bitmaps.cc (right):

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/v3/block_bitmaps.cc:157: header_num = -1;
On 2013/08/05 19:50:45, rvargas wrote:
> On 2013/08/05 17:06:15, gavinp wrote:
> > NOTREACHED() is a DCHECK which says this is impossible. The code to handle
> this
> > case in release is thus redundant. If this really is impossible, then don't
> > fixup header_num, or consider a CHECK() to force the crash so we can debug
it.
> > 
> > On the other hand, if this actually happens, perhaps a DLOG is more
> appropriate?
> 
> See the other comment about crashing the browser.

You should not handle DCHECK() failures.

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
File net/disk_cache/v3/block_bitmaps.h (right):

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/v3/block_bitmaps.h:17: // This class handles the set of
block-files open by the disk cache.
On 2013/08/05 19:50:45, rvargas wrote:
> On 2013/08/05 17:06:15, gavinp wrote:
> > Can we expand on this comment a bit? I'm still a bit confused about what a
> > bitmap is? Also "handles" is a really general verb. More specifics will put
> this
> > in context.
> 
> Expended the comment. I can think of handles/represents/encapsulates but the
> first one is the most accurate imo.

None of "handles/represents/encapsulates" are great words to use when explaining
a new object to a naive audience, though they sometimes work. The trouble is
they're really general. I want to be able to read a class description in a
header and put the class in context of what it's used for; knowing about what it
operates on, what operations it does, and what consumers it's aimed at are
great.

The expansion helps me, but I'm still a bit befuddled. What does it mean to not
use dedicated files? This interface is used by what consumers? I really am at a
loss here.

Randy Smith (Not in Mondays)

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache/block_files.h File net/disk_cache/block_files.h (right): https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache/block_files.h#newcode27 net/disk_cache/block_files.h:27: // Note that this class doesn't perform any file ...

7 years, 4 months ago (2013-08-06 15:16:27 UTC) #30

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
File net/disk_cache/block_files.h (right):

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/block_files.h:27: // Note that this class doesn't perform any
file operation.
On 2013/08/06 01:11:02, gavinp wrote:
> On 2013/08/05 21:03:02, rvargas wrote:
> > On 2013/08/05 20:23:49, rdsmith wrote:
> > > On 2013/08/05 19:50:45, rvargas wrote:
> > > > On 2013/08/05 17:06:15, gavinp wrote:
> > > > > Is "file operation" well defined?
> > > > > 
> > > > > This class definitely does a lot of IO. Can we expand on this comment
a
> > bit,
> > > > > both in explaining what a BlockHeader is, and what it means to not do
> file
> > > > > operations?
> > > > 
> > > > I meant the usual meaning of file operations (file IO). This class
doesn't
> > do
> > > > file IO by itself any more than any class that touches memory (almost
any
> > > > class?) does file IO.
> > > 
> > > For what it's worth, this confused me a bit too when I read it.  Touching
> > memory
> > > that's mapped from a file means you can't make any guarantees about when
> file
> > > data changes on disk, so I'm not sure what use the comment has.  Is there
a
> > way
> > > in which consumer code relies on this class not performing explicit file
> > > operations, while it performing implicit file changes is ok?  (Or am I
> > confused
> > > about it touching data mapped from a file somehow?)
> > 
> > Now I'm confused too. I can certainly remove the comment... the only reason
> for
> > the comment was to point out the main design difference between working with
> > this object (or a collection of these objects, like BlockFilesBitmaps), and
> > working instead with BlockFiles. As in BackendImplV2 deals with BlockFiles
and
> > it should expect all kind of synchronous IO operations to take place, as
> opposed
> > to BackendImplV3 which, by using this class (or specifically BlockBitmaps
> which
> > relies on BlockFilesBitmaps which is a vector of BlockHeader(s)), can expect
> all
> > calls to be non-blocking.
> > 
> > Another way to put it is that this class is IO thread friendly, as opposed
to
> > BlockFiles... but saying that violates the abstraction of threads of net.
> 
> Dereferencing a pointer into a memory map is no different than calling read(2)
> on some offset into the file. If the latter is prohibited, so is the former.
It
> is NOT OK to memory map a file and then use that memory map on the IO thread.
Do
> not be confused by the fact that a thread guard doesn't prevent this.
> 
> Further, reading and writing to memory that is, in all cases, expected to be
> backed by a memory map is precisely doing IO using read(2) and write(2). If
this
> class was designed with the expectation that it dereferences pointers into
> memory maps, it was designed to do IO.

Just to make explicit what may be obvious to both of you (based on a
conversation Gavin and I had last night): This equivalence is because reading or
writing to a memory mapped to a file may block as the file is paged in.  In some
ways it's worse than using read/write because you don't know ahead of time if
you get blocked, and if you do the code doesn't let you know.

Randy Smith (Not in Mondays)

On 2013/08/06 15:16:27, rdsmith wrote: > Just to make explicit what may be obvious to ...

7 years, 4 months ago (2013-08-07 19:46:54 UTC) #31

On 2013/08/06 15:16:27, rdsmith wrote:
> Just to make explicit what may be obvious to both of you (based on a
> conversation Gavin and I had last night): This equivalence is because reading
or
> writing to a memory mapped to a file may block as the file is paged in.  In
some
> ways it's worse than using read/write because you don't know ahead of time if
> you get blocked, and if you do the code doesn't let you know.

Gavin and I spent some time talking about this offline yesterday, since it seems
core to the v3 cache design and "doing file IO on the IO thread" is considered
Really Bad :-}.  I'll summarize the conversation here for the record (and Gavin,
please correct me if I'm saying anything that doesn't match your memory of the
conversation).  

Touching memory that's mapped from a file is somewhat like touching regularly
allocated memory, somewhat like touching data segment memory, and somewhat like
calling file operations.  On Linux (which is the OS Gavin and I are both the
most familiar with) the breakdown looks like:
* First memory dereference: The memory reference may block as the page is paged
in.  Often the OS will have started reading in the first couple of pages upon
mmap, so if the area of the file mapped only has a couple of pages, the blocking
won't happen.  This is believed to be functionally identical to referencing data
segment memory (which is paged in from the executable).
* Later memory dereferences: These may block as the page is paged in, but only
if the page has been evicted.  This will generally occur due to memory pressure.
 This is believed to be functionally equivalent to normal paging for heap
allocated memory.
* Writing to memory: This is different.  For heap allocated or data segment
memory, it simply marks the page dirty.  That page may or may not be cleaned to
the page store, presumably primarily based on whether the OS thinks it'll need
the page in the future.  For mmap'd memory, the page is scheduled to be flushed
to disk.  This is similar to what happens with a "write" to a file when the page
is in the block cache (though the write case requires a copy in the OS).

So the summary of the above is: On Linux, touching page mapped memory seems like
it won't introduce any IO thread jank that we're already accepting because of
the paging system.  If mapping pages substantially increases the working set
size, that could increase memory pressure and thus jank, but it doesn't seem
from the V3 cache design as if this increases the working set any larger than it
would be for equivalent control structures that weren't memory mapped.  So
functionally, on Linux, this is probably fine.

On Android (I know, not a target platform, bear with me) the situation is
different, in that a) memory pressure is a lot higher, and b) regularly memory
is just in core; if you're running, it's not paged out.  That's not true for
mmap'd page data (or data segment data, until it's written to, at which point it
becomes regular memory).  So on Android this could substantially change the jank
expected on the IO thread by introducing paging where there wasn't any before.  

An obvious important question is how the behavior is different on Windows and
Mac.  My assumption is that Mac and Linux are the same, but it might be
worthwhile verifying that.  My assumption is that Windows is different, that
Ricardo knows the differences pretty well, and that there isn't a problem there
either, but it would be good to verify all of *those* assumptions :-}.  I have
no idea what this looks like on IOS.

My primary concerns, having gone through the above exercise, is around
documentation and maintainability.  I think the assumptions that programmers
will make reading the class are similar to the ones I made when I first read it
(which is that there's nothing to worry about here because we're just
manipulating data structures in memory).  And I don't think those assumptions
are correct; the analysis that indicates that memory mapped files can be used on
threads that aren't allowed to do file IO actually digs fairly deep into OS
behavior. 

Ricardo, would you be willing to write-up an analysis like the above somewhere
justifying why it's ok to touch memory mapped files, and reference it from the
BlockHeader comment?  I'd also like something say "Careful analysis is needed
before using this on Android or IOS" with the same reference, just so a future
port doesn't trip over this.

Randy Smith (Not in Mondays)

On 2013/08/07 19:46:54, rdsmith wrote: > On 2013/08/06 15:16:27, rdsmith wrote: > > Just to ...

7 years, 4 months ago (2013-08-07 19:59:49 UTC) #32

On 2013/08/07 19:46:54, rdsmith wrote:
> On 2013/08/06 15:16:27, rdsmith wrote:
> > Just to make explicit what may be obvious to both of you (based on a
> > conversation Gavin and I had last night): This equivalence is because
reading
> or
> > writing to a memory mapped to a file may block as the file is paged in.  In
> some
> > ways it's worse than using read/write because you don't know ahead of time
if
> > you get blocked, and if you do the code doesn't let you know.
> 
> Gavin and I spent some time talking about this offline yesterday, since it
seems
> core to the v3 cache design and "doing file IO on the IO thread" is considered
> Really Bad :-}.  I'll summarize the conversation here for the record (and
Gavin,
> please correct me if I'm saying anything that doesn't match your memory of the
> conversation).  
> 
> Touching memory that's mapped from a file is somewhat like touching regularly
> allocated memory, somewhat like touching data segment memory, and somewhat
like
> calling file operations.  On Linux (which is the OS Gavin and I are both the
> most familiar with) the breakdown looks like:
> * First memory dereference: The memory reference may block as the page is
paged
> in.  Often the OS will have started reading in the first couple of pages upon
> mmap, so if the area of the file mapped only has a couple of pages, the
blocking
> won't happen.  This is believed to be functionally identical to referencing
data
> segment memory (which is paged in from the executable).
> * Later memory dereferences: These may block as the page is paged in, but only
> if the page has been evicted.  This will generally occur due to memory
pressure.
>  This is believed to be functionally equivalent to normal paging for heap
> allocated memory.
> * Writing to memory: This is different.  For heap allocated or data segment
> memory, it simply marks the page dirty.  That page may or may not be cleaned
to
> the page store, presumably primarily based on whether the OS thinks it'll need
> the page in the future.  For mmap'd memory, the page is scheduled to be
flushed
> to disk.  This is similar to what happens with a "write" to a file when the
page
> is in the block cache (though the write case requires a copy in the OS).
> 
> So the summary of the above is: On Linux, touching page mapped memory seems
like
> it won't introduce any IO thread jank that we're already accepting because of
> the paging system.  If mapping pages substantially increases the working set
> size, that could increase memory pressure and thus jank, but it doesn't seem
> from the V3 cache design as if this increases the working set any larger than
it
> would be for equivalent control structures that weren't memory mapped.  So
> functionally, on Linux, this is probably fine.
> 
> On Android (I know, not a target platform, bear with me) the situation is
> different, in that a) memory pressure is a lot higher, and b) regularly memory
> is just in core; if you're running, it's not paged out.  That's not true for
> mmap'd page data (or data segment data, until it's written to, at which point
it
> becomes regular memory).  So on Android this could substantially change the
jank
> expected on the IO thread by introducing paging where there wasn't any before.
 
> 
> An obvious important question is how the behavior is different on Windows and
> Mac.  My assumption is that Mac and Linux are the same, but it might be
> worthwhile verifying that.  My assumption is that Windows is different, that
> Ricardo knows the differences pretty well, and that there isn't a problem
there
> either, but it would be good to verify all of *those* assumptions :-}.  I have
> no idea what this looks like on IOS.
> 
> My primary concerns, having gone through the above exercise, is around
> documentation and maintainability.  I think the assumptions that programmers
> will make reading the class are similar to the ones I made when I first read
it
> (which is that there's nothing to worry about here because we're just
> manipulating data structures in memory).  And I don't think those assumptions
> are correct; the analysis that indicates that memory mapped files can be used
on
> threads that aren't allowed to do file IO actually digs fairly deep into OS
> behavior. 
> 
> Ricardo, would you be willing to write-up an analysis like the above somewhere
> justifying why it's ok to touch memory mapped files, and reference it from the
> BlockHeader comment?  I'd also like something say "Careful analysis is needed
> before using this on Android or IOS" with the same reference, just so a future
> port doesn't trip over this.

Suffix: On a whim, I chatted with Darin about this (I figured if I was worried
about the meaning of a high level chromium architectural restriction in a corner
case, he wasn't a bad person to ask).  He wasn't concerned about touching memory
mapped files in general, but was concerned about how several small IOs would
affect chrome startup time (because most IO at startup is bulk reading and
pre-reading of DLLs and similar).  So his basic feedback was that it didn't
bother him architecturally, but he thought it would be important to verify that
it wasn't actually causing startup performance issues.  Of course, I'm not
certain how to do that :-}.  But the advice/suggestion makes sense to me.

gavinp

On 2013/08/07 19:46:54, rdsmith wrote: > On 2013/08/06 15:16:27, rdsmith wrote: > > Just to ...

7 years, 4 months ago (2013-08-07 20:13:45 UTC) #33

On 2013/08/07 19:46:54, rdsmith wrote:
> On 2013/08/06 15:16:27, rdsmith wrote:
> > Just to make explicit what may be obvious to both of you (based on a
> > conversation Gavin and I had last night): This equivalence is because
reading
> or
> > writing to a memory mapped to a file may block as the file is paged in.  In
> some
> > ways it's worse than using read/write because you don't know ahead of time
if
> > you get blocked, and if you do the code doesn't let you know.
> 
> Gavin and I spent some time talking about this offline yesterday, since it
seems
> core to the v3 cache design and "doing file IO on the IO thread" is considered
> Really Bad :-}.  I'll summarize the conversation here for the record (and
Gavin,
> please correct me if I'm saying anything that doesn't match your memory of the
> conversation).  
> 
> Touching memory that's mapped from a file is somewhat like touching regularly
> allocated memory, somewhat like touching data segment memory, and somewhat
like
> calling file operations.  On Linux (which is the OS Gavin and I are both the
> most familiar with) the breakdown looks like:
> * First memory dereference: The memory reference may block as the page is
paged
> in.  Often the OS will have started reading in the first couple of pages upon
> mmap, so if the area of the file mapped only has a couple of pages, the
blocking
> won't happen.  This is believed to be functionally identical to referencing
data
> segment memory (which is paged in from the executable).
> * Later memory dereferences: These may block as the page is paged in, but only
> if the page has been evicted.  This will generally occur due to memory
pressure.
>  This is believed to be functionally equivalent to normal paging for heap
> allocated memory.
> * Writing to memory: This is different.  For heap allocated or data segment
> memory, it simply marks the page dirty.  That page may or may not be cleaned
to
> the page store, presumably primarily based on whether the OS thinks it'll need
> the page in the future.  For mmap'd memory, the page is scheduled to be
flushed
> to disk.  This is similar to what happens with a "write" to a file when the
page
> is in the block cache (though the write case requires a copy in the OS).

I had a meeting today with tytso to ask about these distinctions. He told me
that when you call write(2) on linux, a 30s flush timer is started for the dirty
pages; they will be flushed at that timer if fsync() is not called first.
However, in the case of dirty pages through a memory map, no such timer starts.
So in Linux, expect memory mapped dirty pages to flush only at explicit calls to
msync(), removing the map or process exit. They may flush earlier due to memory
pressure, but there is no timer.

My apologies for my bad memory yesterday.

> 
> So the summary of the above is: On Linux, touching page mapped memory seems
like
> it won't introduce any IO thread jank that we're already accepting because of
> the paging system.  If mapping pages substantially increases the working set
> size, that could increase memory pressure and thus jank, but it doesn't seem
> from the V3 cache design as if this increases the working set any larger than
it
> would be for equivalent control structures that weren't memory mapped.  So
> functionally, on Linux, this is probably fine.
> 
> On Android (I know, not a target platform, bear with me) the situation is
> different, in that a) memory pressure is a lot higher, and b) regularly memory
> is just in core; if you're running, it's not paged out.  That's not true for
> mmap'd page data (or data segment data, until it's written to, at which point
it
> becomes regular memory).  So on Android this could substantially change the
jank
> expected on the IO thread by introducing paging where there wasn't any before.
 
> 
> An obvious important question is how the behavior is different on Windows and
> Mac.  My assumption is that Mac and Linux are the same, but it might be
> worthwhile verifying that.  My assumption is that Windows is different, that
> Ricardo knows the differences pretty well, and that there isn't a problem
there
> either, but it would be good to verify all of *those* assumptions :-}.  I have
> no idea what this looks like on IOS.
> 
> My primary concerns, having gone through the above exercise, is around
> documentation and maintainability.  I think the assumptions that programmers
> will make reading the class are similar to the ones I made when I first read
it
> (which is that there's nothing to worry about here because we're just
> manipulating data structures in memory).  And I don't think those assumptions
> are correct; the analysis that indicates that memory mapped files can be used
on
> threads that aren't allowed to do file IO actually digs fairly deep into OS
> behavior. 
> 
> Ricardo, would you be willing to write-up an analysis like the above somewhere
> justifying why it's ok to touch memory mapped files, and reference it from the
> BlockHeader comment?  I'd also like something say "Careful analysis is needed
> before using this on Android or IOS" with the same reference, just so a future
> port doesn't trip over this.

rvargas (doing something else)

My apologies for not commenting on this thread before and letting you guys go on ...

7 years, 4 months ago (2013-08-08 02:50:00 UTC) #34

My apologies for not commenting on this thread before and letting you guys go on
a implications hunting trip. I was distracted with other stuff and I didn't
think you were going to spend much time here.

Yes, the point that you were missing is that, in general, not only any code can
page fault when accessing data (assuming an OS with virtual memory), but it will
also page fault when accessing code, and both things can be triggered by memory
pressure. So, again, in general, this code is not much different than other code
that touches memory.

Down to the dirty details, the other important detail to keep in mind is that
the memory mapped section of a block file is exactly the first 8KB. There are
usually maybe 7 block files, maybe a little more, but we are talking about
probably less than 20 pages. That's without considering the index itself, of
course.

Furthermore, to deal with the issue of startup latency and random blocking
without memory pressure, all mapped sections are loaded at initialization. Today
that is performed by directly reading the section on all OSes, but that will
change to touching all pages.

While I appreciate your concern about being misled by reading a given header on
this code, I don't really believe that the way to learn about the high level
implications of a design is by reading random headers. I believe that should be
documented on a design doc, giving the overall picture of how things are
supposed to work, and what are the requirements for porting the code to other
platforms. Code reuse of a random class from inside a net subfolder is something
that I'm not particularly concerned about, because we are very clear about what
is the public interface of net.

Android is not the target of this design. If it were, there would not be memory
maps given the overall brokenness of the feature on that OS.

New version up.

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
File net/disk_cache/v3/block_bitmaps_unittest.cc (right):

https://codereview.chromium.org/17816008/diff/26002/net/disk_cache/v3/block_b...
net/disk_cache/v3/block_bitmaps_unittest.cc:10: 
file_a.cc:
namespace {
TEST(a, foo){
}
}

file_b.cc:
namespace {
TEST(a, foo){
}
}

That should compile and link alright... what happens when I run tests with
--gtest_filter=a.foo ?

I honestly don't want to delve into gtest and/or gmock implementation to see
what it does.

If you are arguing about placing common infrastructure (mocks etc) inside an
anonymous namespace, I'm fine with that (see http_cache_unittest.cc for an
example). This file has only tests.

If you're still not convinced that's fine. I'm not asking you (right now) to
change the code you are writing. However, as a whole, your point is not
enforceable at the moment. See:

https://groups.google.com/a/chromium.org/forum/#!searchin/chromium-dev/test$2...

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
File net/disk_cache/block_files.cc (right):

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/block_files.cc:199: for (int i = 0; i < kMaxNumBlocks; i++) {
On 2013/08/06 01:11:02, gavinp wrote:
> On 2013/08/05 19:50:45, rvargas wrote:
> > On 2013/08/05 17:06:15, gavinp wrote:
> > > I did a real double take reading this loop. Why do we initialize i = 0 and
> not
> > > with block_count - 1?
> > 
> > To avoid a negative values?
> 
> Is it possible for block_count to have a zero or negative value? Should that
be
> a DCHECK?

Do you mean like the first line of this method (on the version posted when I
sent this comment)
> 
> > 
> > done
> 
>

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/block_files.cc:276: DCHECK_GE(block_files_.size(),
On 2013/08/06 01:11:02, gavinp wrote:
> On 2013/08/05 19:50:45, rvargas wrote:
> > On 2013/07/15 18:23:36, gavinp wrote:
> > > Convention is to put the constant first, and the thing being tested
second,
> so
> > > this needs to be reversed into a DCHECK_LE.
> > 
> > I believe you are confusing the requirements of EXPECT_LE with the one of
> > DCHECK_LE.
> > 
> > The test macros expect the first argument to be the constant value. On the
> other
> > hand, the recommended convention for the runtime macros is to follow the
> pattern
> > used when coding (and that means make the code more readable, usually by
> having
> > the constant after the variable)
> 
> I am not aware of this recommended convention, and this contradicts what I've
> been told. Can you please provide me with a link to this recommendation and
how
> it was shared with us?
> 

I have not access to all emails that have talked about style issues (and
honestly looking up that is not a great use of time), but the current style
guide says
Prefer (foo == 0) to (0 == foo).

The reasoning being that it is generally easier to follow the flow that way.

I'm positive I've seen more than once the point about using regular (as opposed
to "test") ordering for dchecks.

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
File net/disk_cache/v3/block_bitmaps.h (right):

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/v3/block_bitmaps.h:17: // This class handles the set of
block-files open by the disk cache.
On 2013/08/06 01:11:02, gavinp wrote:
> On 2013/08/05 19:50:45, rvargas wrote:
> > On 2013/08/05 17:06:15, gavinp wrote:
> > > Can we expand on this comment a bit? I'm still a bit confused about what a
> > > bitmap is? Also "handles" is a really general verb. More specifics will
put
> > this
> > > in context.
> > 
> > Expended the comment. I can think of handles/represents/encapsulates but the
> > first one is the most accurate imo.
> 
> None of "handles/represents/encapsulates" are great words to use when
explaining
> a new object to a naive audience, though they sometimes work. The trouble is
> they're really general. I want to be able to read a class description in a
> header and put the class in context of what it's used for; knowing about what
it
> operates on, what operations it does, and what consumers it's aimed at are
> great.
> 
> The expansion helps me, but I'm still a bit befuddled. What does it mean to
not
> use dedicated files? This interface is used by what consumers? I really am at
a
> loss here.

"dedicated files" is a basic concept of this cache: a resource can be stored
inside a block file, or on a dedicated file. I was afraid you were going to read
a statement without an explicit mention to the exception (storage needed by the
cache which is _not_ handled by this class) and come back saying that the
comment was misleading.

While it is possible to list all known consumers of a class on its declaration,
I generally consider that practice to be incorrect because a class should not
care about who uses it, and it breaks the logical structure of dependencies.
Granted, the final version of this class has basically one implied user because
the constructor takes a BackendImplV3 so I might as well document at this point
who is the caller, but that is certainly not my first impulse when documenting a
class.

As for what it does (what's the meaning of "handling"), a quick glance at the 6
public methods convey a much clear idea of what it does than adding text
repeating (although in advance) how to use this class. In fact it mainly creates
and deletes blocks, so I honestly considered saying that when you asked for an
expanded comment. My issue is that I see it basically as repetition, and it is
really all that the backend needs to do with blockfiles as an entity (which is
the actual point of this class and it is expressed in the first line of the
text). Ha, I just realized that I actually wrote that it creates and deletes
blocks even though I really don't like that as a "description".

Keep in mind that this is not a general purpose class, and it is not intended to
be used by anybody outside of the implementation of this cache. It even has a
comment pointing stray readers to the proper public interface.

At the end of the day, it is much faster if you just write the text that you
want to see here and I copy it.

gavinp

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache/v3/block_bitmaps.h File net/disk_cache/v3/block_bitmaps.h (right): https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache/v3/block_bitmaps.h#newcode17 net/disk_cache/v3/block_bitmaps.h:17: // This class handles the set of block-files open ...

7 years, 4 months ago (2013-08-08 03:04:45 UTC) #35

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
File net/disk_cache/v3/block_bitmaps.h (right):

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/v3/block_bitmaps.h:17: // This class handles the set of
block-files open by the disk cache.
On 2013/08/08 02:50:01, rvargas wrote:
> On 2013/08/06 01:11:02, gavinp wrote:
> > On 2013/08/05 19:50:45, rvargas wrote:
> > > On 2013/08/05 17:06:15, gavinp wrote:
> > > > Can we expand on this comment a bit? I'm still a bit confused about what
a
> > > > bitmap is? Also "handles" is a really general verb. More specifics will
> put
> > > this
> > > > in context.
> > > 
> > > Expended the comment. I can think of handles/represents/encapsulates but
the
> > > first one is the most accurate imo.
> > 
> > None of "handles/represents/encapsulates" are great words to use when
> explaining
> > a new object to a naive audience, though they sometimes work. The trouble is
> > they're really general. I want to be able to read a class description in a
> > header and put the class in context of what it's used for; knowing about
what
> it
> > operates on, what operations it does, and what consumers it's aimed at are
> > great.
> > 
> > The expansion helps me, but I'm still a bit befuddled. What does it mean to
> not
> > use dedicated files? This interface is used by what consumers? I really am
at
> a
> > loss here.
> 
> "dedicated files" is a basic concept of this cache: a resource can be stored
> inside a block file, or on a dedicated file. I was afraid you were going to
read
> a statement without an explicit mention to the exception (storage needed by
the
> cache which is _not_ handled by this class) and come back saying that the
> comment was misleading.
> 
> While it is possible to list all known consumers of a class on its
declaration,
> I generally consider that practice to be incorrect because a class should not
> care about who uses it, and it breaks the logical structure of dependencies.
> Granted, the final version of this class has basically one implied user
because
> the constructor takes a BackendImplV3 so I might as well document at this
point
> who is the caller, but that is certainly not my first impulse when documenting
a
> class.
> 
> As for what it does (what's the meaning of "handling"), a quick glance at the
6
> public methods convey a much clear idea of what it does than adding text
> repeating (although in advance) how to use this class. In fact it mainly
creates
> and deletes blocks, so I honestly considered saying that when you asked for an
> expanded comment. My issue is that I see it basically as repetition, and it is
> really all that the backend needs to do with blockfiles as an entity (which is
> the actual point of this class and it is expressed in the first line of the
> text). Ha, I just realized that I actually wrote that it creates and deletes
> blocks even though I really don't like that as a "description".
> 
> Keep in mind that this is not a general purpose class, and it is not intended
to
> be used by anybody outside of the implementation of this cache. It even has a
> comment pointing stray readers to the proper public interface.
> 
> At the end of the day, it is much faster if you just write the text that you
> want to see here and I copy it.

If I understood what this class did, I would be happy to do so.

However, I do not.

Randy Smith (Not in Mondays)

On 2013/08/08 02:50:00, rvargas wrote: > My apologies for not commenting on this thread before ...

7 years, 4 months ago (2013-08-09 18:54:20 UTC) #36

On 2013/08/08 02:50:00, rvargas wrote:
> My apologies for not commenting on this thread before and letting you guys go
on
> a implications hunting trip. I was distracted with other stuff and I didn't
> think you were going to spend much time here.

No worries, at least for me--I'm happy to have dug down to the floor on this.

> Yes, the point that you were missing is that, in general, not only any code
can
> page fault when accessing data (assuming an OS with virtual memory), but it
will
> also page fault when accessing code, and both things can be triggered by
memory
> pressure. So, again, in general, this code is not much different than other
code
> that touches memory.

Yes (on most platforms).  I'd add the suffix that figuring that out actually
took some exploration of the detailed paging behavior of each system.

> Down to the dirty details, the other important detail to keep in mind is that
> the memory mapped section of a block file is exactly the first 8KB. There are
> usually maybe 7 block files, maybe a little more, but we are talking about
> probably less than 20 pages. That's without considering the index itself, of
> course.
> 
> Furthermore, to deal with the issue of startup latency and random blocking
> without memory pressure, all mapped sections are loaded at initialization. 

Excellent!  I hadn't realized that, and I agree with you that it's a good thing.

> Today
> that is performed by directly reading the section on all OSes, but that will
> change to touching all pages.
> 
> While I appreciate your concern about being misled by reading a given header
on
> this code, I don't really believe that the way to learn about the high level
> implications of a design is by reading random headers. I believe that should
be
> documented on a design doc, giving the overall picture of how things are
> supposed to work, and what are the requirements for porting the code to other
> platforms. Code reuse of a random class from inside a net subfolder is
something
> that I'm not particularly concerned about, because we are very clear about
what
> is the public interface of net.

Agreed, but I *would* like a reference to that design doc with a note about the
design being dependent on the paging behavior of the platform from some
appropriate header file.  I think when people are modifying and enhancing code,
they don't usually look at design docs, so I want something to alert them that
in this case they might want to.  Maybe in backend_impl.h?  (I'm not fixated on
that location, I just want it somewhere where, if someone goes to port the v3
cache to a new platform at some point in the future, they'll see it).

(Also: Do you talk about this issue in the design doc?  If so, my apologies,
since I don't remember that from reading it, and it might have saved some time
:-{.)

> Android is not the target of this design. If it were, there would not be
memory
> maps given the overall brokenness of the feature on that OS.

Yep.  I'm just trying to put guards in against future changes by someone other
than the three of us.

Randy Smith (Not in Mondays)

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache/v3/block_bitmaps.h File net/disk_cache/v3/block_bitmaps.h (right): https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache/v3/block_bitmaps.h#newcode17 net/disk_cache/v3/block_bitmaps.h:17: // This class handles the set of block-files open ...

7 years, 4 months ago (2013-08-09 19:11:48 UTC) #37

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
File net/disk_cache/v3/block_bitmaps.h (right):

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/v3/block_bitmaps.h:17: // This class handles the set of
block-files open by the disk cache.
On 2013/08/08 03:04:46, gavinp wrote:
> On 2013/08/08 02:50:01, rvargas wrote:
> > On 2013/08/06 01:11:02, gavinp wrote:
> > > On 2013/08/05 19:50:45, rvargas wrote:
> > > > On 2013/08/05 17:06:15, gavinp wrote:
> > > > > Can we expand on this comment a bit? I'm still a bit confused about
what
> a
> > > > > bitmap is? Also "handles" is a really general verb. More specifics
will
> > put
> > > > this
> > > > > in context.
> > > > 
> > > > Expended the comment. I can think of handles/represents/encapsulates but
> the
> > > > first one is the most accurate imo.
> > > 
> > > None of "handles/represents/encapsulates" are great words to use when
> > explaining
> > > a new object to a naive audience, though they sometimes work. The trouble
is
> > > they're really general. I want to be able to read a class description in a
> > > header and put the class in context of what it's used for; knowing about
> what
> > it
> > > operates on, what operations it does, and what consumers it's aimed at are
> > > great.
> > > 
> > > The expansion helps me, but I'm still a bit befuddled. What does it mean
to
> > not
> > > use dedicated files? This interface is used by what consumers? I really am
> at
> > a
> > > loss here.
> > 
> > "dedicated files" is a basic concept of this cache: a resource can be stored
> > inside a block file, or on a dedicated file. I was afraid you were going to
> read
> > a statement without an explicit mention to the exception (storage needed by
> the
> > cache which is _not_ handled by this class) and come back saying that the
> > comment was misleading.
> > 
> > While it is possible to list all known consumers of a class on its
> declaration,
> > I generally consider that practice to be incorrect because a class should
not
> > care about who uses it, and it breaks the logical structure of dependencies.
> > Granted, the final version of this class has basically one implied user
> because
> > the constructor takes a BackendImplV3 so I might as well document at this
> point
> > who is the caller, but that is certainly not my first impulse when
documenting
> a
> > class.
> > 
> > As for what it does (what's the meaning of "handling"), a quick glance at
the
> 6
> > public methods convey a much clear idea of what it does than adding text
> > repeating (although in advance) how to use this class. In fact it mainly
> creates
> > and deletes blocks, so I honestly considered saying that when you asked for
an
> > expanded comment. My issue is that I see it basically as repetition, and it
is
> > really all that the backend needs to do with blockfiles as an entity (which
is
> > the actual point of this class and it is expressed in the first line of the
> > text). Ha, I just realized that I actually wrote that it creates and deletes
> > blocks even though I really don't like that as a "description".
> > 
> > Keep in mind that this is not a general purpose class, and it is not
intended
> to
> > be used by anybody outside of the implementation of this cache. It even has
a
> > comment pointing stray readers to the proper public interface.
> > 
> > At the end of the day, it is much faster if you just write the text that you
> > want to see here and I copy it.
> 
> If I understood what this class did, I would be happy to do so.
> 
> However, I do not.
> 

At risk of saying something completely wrong (which seems to be my modus
operandi in this review :-}), may I give it a try?

"This class is the interface in the v3 disk cache to the set of files holding
cached data that is small enough to not be efficiently stored in a dedicated
file (i.e. < kMaxBlockSize).  It is primarily used to allocate and free regions
in those files used to store data."

If that doesn't address your concerns, Gavin, could you give some specific
questions you imagine someone reading this code for the first time would have on
reading it?

(Ricardo: If I'm saying something incorrect or that you object to, please
correct me--I'm focussing on Gavin since I think this is just an issue of
addressing his concerns about documentation.)

rvargas (doing something else)

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache/v3/block_bitmaps.h File net/disk_cache/v3/block_bitmaps.h (right): https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache/v3/block_bitmaps.h#newcode17 net/disk_cache/v3/block_bitmaps.h:17: // This class handles the set of block-files open ...

7 years, 4 months ago (2013-08-09 19:53:40 UTC) #38

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
File net/disk_cache/v3/block_bitmaps.h (right):

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/v3/block_bitmaps.h:17: // This class handles the set of
block-files open by the disk cache.
On 2013/08/09 19:11:49, rdsmith wrote:
> On 2013/08/08 03:04:46, gavinp wrote:
> > On 2013/08/08 02:50:01, rvargas wrote:
> > > On 2013/08/06 01:11:02, gavinp wrote:
> > > > On 2013/08/05 19:50:45, rvargas wrote:
> > > > > On 2013/08/05 17:06:15, gavinp wrote:
> > > > > > Can we expand on this comment a bit? I'm still a bit confused about
> what
> > a
> > > > > > bitmap is? Also "handles" is a really general verb. More specifics
> will
> > > put
> > > > > this
> > > > > > in context.
> > > > > 
> > > > > Expended the comment. I can think of handles/represents/encapsulates
but
> > the
> > > > > first one is the most accurate imo.
> > > > 
> > > > None of "handles/represents/encapsulates" are great words to use when
> > > explaining
> > > > a new object to a naive audience, though they sometimes work. The
trouble
> is
> > > > they're really general. I want to be able to read a class description in
a
> > > > header and put the class in context of what it's used for; knowing about
> > what
> > > it
> > > > operates on, what operations it does, and what consumers it's aimed at
are
> > > > great.
> > > > 
> > > > The expansion helps me, but I'm still a bit befuddled. What does it mean
> to
> > > not
> > > > use dedicated files? This interface is used by what consumers? I really
am
> > at
> > > a
> > > > loss here.
> > > 
> > > "dedicated files" is a basic concept of this cache: a resource can be
stored
> > > inside a block file, or on a dedicated file. I was afraid you were going
to
> > read
> > > a statement without an explicit mention to the exception (storage needed
by
> > the
> > > cache which is _not_ handled by this class) and come back saying that the
> > > comment was misleading.
> > > 
> > > While it is possible to list all known consumers of a class on its
> > declaration,
> > > I generally consider that practice to be incorrect because a class should
> not
> > > care about who uses it, and it breaks the logical structure of
dependencies.
> > > Granted, the final version of this class has basically one implied user
> > because
> > > the constructor takes a BackendImplV3 so I might as well document at this
> > point
> > > who is the caller, but that is certainly not my first impulse when
> documenting
> > a
> > > class.
> > > 
> > > As for what it does (what's the meaning of "handling"), a quick glance at
> the
> > 6
> > > public methods convey a much clear idea of what it does than adding text
> > > repeating (although in advance) how to use this class. In fact it mainly
> > creates
> > > and deletes blocks, so I honestly considered saying that when you asked
for
> an
> > > expanded comment. My issue is that I see it basically as repetition, and
it
> is
> > > really all that the backend needs to do with blockfiles as an entity
(which
> is
> > > the actual point of this class and it is expressed in the first line of
the
> > > text). Ha, I just realized that I actually wrote that it creates and
deletes
> > > blocks even though I really don't like that as a "description".
> > > 
> > > Keep in mind that this is not a general purpose class, and it is not
> intended
> > to
> > > be used by anybody outside of the implementation of this cache. It even
has
> a
> > > comment pointing stray readers to the proper public interface.
> > > 
> > > At the end of the day, it is much faster if you just write the text that
you
> > > want to see here and I copy it.
> > 
> > If I understood what this class did, I would be happy to do so.
> > 
> > However, I do not.
> > 
> 
> At risk of saying something completely wrong (which seems to be my modus
> operandi in this review :-}), may I give it a try?
> 
> "This class is the interface in the v3 disk cache to the set of files holding
> cached data that is small enough to not be efficiently stored in a dedicated
> file (i.e. < kMaxBlockSize).  It is primarily used to allocate and free
regions
> in those files used to store data."
> 
> If that doesn't address your concerns, Gavin, could you give some specific
> questions you imagine someone reading this code for the first time would have
on
> reading it?
> 
> (Ricardo: If I'm saying something incorrect or that you object to, please
> correct me--I'm focussing on Gavin since I think this is just an issue of
> addressing his concerns about documentation.)
> 
>

Thanks. I'm more than happy happy to use that description.

rvargas (doing something else)

On 2013/08/09 18:54:20, rdsmith wrote: > On 2013/08/08 02:50:00, rvargas wrote: > > My apologies ...

7 years, 4 months ago (2013-08-09 20:06:22 UTC) #39

On 2013/08/09 18:54:20, rdsmith wrote:
> On 2013/08/08 02:50:00, rvargas wrote:
> > My apologies for not commenting on this thread before and letting you guys
go
> on
> > a implications hunting trip. I was distracted with other stuff and I didn't
> > think you were going to spend much time here.
> 
> No worries, at least for me--I'm happy to have dug down to the floor on this.
> 
> > Yes, the point that you were missing is that, in general, not only any code
> can
> > page fault when accessing data (assuming an OS with virtual memory), but it
> will
> > also page fault when accessing code, and both things can be triggered by
> memory
> > pressure. So, again, in general, this code is not much different than other
> code
> > that touches memory.
> 
> Yes (on most platforms).  I'd add the suffix that figuring that out actually
> took some exploration of the detailed paging behavior of each system.
> 
> > Down to the dirty details, the other important detail to keep in mind is
that
> > the memory mapped section of a block file is exactly the first 8KB. There
are
> > usually maybe 7 block files, maybe a little more, but we are talking about
> > probably less than 20 pages. That's without considering the index itself, of
> > course.
> > 
> > Furthermore, to deal with the issue of startup latency and random blocking
> > without memory pressure, all mapped sections are loaded at initialization. 
> 
> Excellent!  I hadn't realized that, and I agree with you that it's a good
thing.
> 
> > Today
> > that is performed by directly reading the section on all OSes, but that will
> > change to touching all pages.
> > 
> > While I appreciate your concern about being misled by reading a given header
> on
> > this code, I don't really believe that the way to learn about the high level
> > implications of a design is by reading random headers. I believe that should
> be
> > documented on a design doc, giving the overall picture of how things are
> > supposed to work, and what are the requirements for porting the code to
other
> > platforms. Code reuse of a random class from inside a net subfolder is
> something
> > that I'm not particularly concerned about, because we are very clear about
> what
> > is the public interface of net.
> 
> Agreed, but I *would* like a reference to that design doc with a note about
the
> design being dependent on the paging behavior of the platform from some
> appropriate header file.  I think when people are modifying and enhancing
code,
> they don't usually look at design docs, so I want something to alert them that
> in this case they might want to.  Maybe in backend_impl.h?  (I'm not fixated
on
> that location, I just want it somewhere where, if someone goes to port the v3
> cache to a new platform at some point in the future, they'll see it).
> 
> (Also: Do you talk about this issue in the design doc?  If so, my apologies,
> since I don't remember that from reading it, and it might have saved some time
> :-{.)
> 
> > Android is not the target of this design. If it were, there would not be
> memory
> > maps given the overall brokenness of the feature on that OS.
> 
> Yep.  I'm just trying to put guards in against future changes by someone other
> than the three of us.

Yes, I agree something should be said somewhere, probably on a future CL. I
don't think this is the right header though, given that in reality this class is
not even aware that the headers are memory mapped by someone else, or where this
code is running (thread) etc. I expect backend_impl_v3.cc will have a hefty
section about how things fit together.

I didn't mean to imply that the v3 design doc currently covers implementation
details (threading, code structure or similar)... I can add that now if that
helps you (it probably does, because really, following the code is not easy at
all). I was just planning to add that later. (v2's doc has some mentions about
system requirements and use of mmap)

Believe it or not, I still have the vision of a flash-specific cache as the long
term future for mobile (or otherwise underpowered, flash based devices). I
believe (may be wrong) that the V3 docs mentions everywhere the design as a
desktop cache, but it also needs a section about platform requirements for
portability (in the past, internal and external discussions have been handled
mostly with direct communications, but usually people go through the doc first
and that sets the basic expectations). I don't think our design docs are
sufficiently clear about how they fit together or long term expectations.

Randy Smith (Not in Mondays)

On 2013/08/09 20:06:22, rvargas wrote: > On 2013/08/09 18:54:20, rdsmith wrote: > > On 2013/08/08 ...

7 years, 4 months ago (2013-08-12 14:51:35 UTC) #40

On 2013/08/09 20:06:22, rvargas wrote:
> On 2013/08/09 18:54:20, rdsmith wrote:
> > On 2013/08/08 02:50:00, rvargas wrote:
> > > My apologies for not commenting on this thread before and letting you guys
> go
> > on
> > > a implications hunting trip. I was distracted with other stuff and I
didn't
> > > think you were going to spend much time here.
> > 
> > No worries, at least for me--I'm happy to have dug down to the floor on
this.
> > 
> > > Yes, the point that you were missing is that, in general, not only any
code
> > can
> > > page fault when accessing data (assuming an OS with virtual memory), but
it
> > will
> > > also page fault when accessing code, and both things can be triggered by
> > memory
> > > pressure. So, again, in general, this code is not much different than
other
> > code
> > > that touches memory.
> > 
> > Yes (on most platforms).  I'd add the suffix that figuring that out actually
> > took some exploration of the detailed paging behavior of each system.
> > 
> > > Down to the dirty details, the other important detail to keep in mind is
> that
> > > the memory mapped section of a block file is exactly the first 8KB. There
> are
> > > usually maybe 7 block files, maybe a little more, but we are talking about
> > > probably less than 20 pages. That's without considering the index itself,
of
> > > course.
> > > 
> > > Furthermore, to deal with the issue of startup latency and random blocking
> > > without memory pressure, all mapped sections are loaded at initialization.

> > 
> > Excellent!  I hadn't realized that, and I agree with you that it's a good
> thing.
> > 
> > > Today
> > > that is performed by directly reading the section on all OSes, but that
will
> > > change to touching all pages.
> > > 
> > > While I appreciate your concern about being misled by reading a given
header
> > on
> > > this code, I don't really believe that the way to learn about the high
level
> > > implications of a design is by reading random headers. I believe that
should
> > be
> > > documented on a design doc, giving the overall picture of how things are
> > > supposed to work, and what are the requirements for porting the code to
> other
> > > platforms. Code reuse of a random class from inside a net subfolder is
> > something
> > > that I'm not particularly concerned about, because we are very clear about
> > what
> > > is the public interface of net.
> > 
> > Agreed, but I *would* like a reference to that design doc with a note about
> the
> > design being dependent on the paging behavior of the platform from some
> > appropriate header file.  I think when people are modifying and enhancing
> code,
> > they don't usually look at design docs, so I want something to alert them
that
> > in this case they might want to.  Maybe in backend_impl.h?  (I'm not fixated
> on
> > that location, I just want it somewhere where, if someone goes to port the
v3
> > cache to a new platform at some point in the future, they'll see it).
> > 
> > (Also: Do you talk about this issue in the design doc?  If so, my apologies,
> > since I don't remember that from reading it, and it might have saved some
time
> > :-{.)
> > 
> > > Android is not the target of this design. If it were, there would not be
> > memory
> > > maps given the overall brokenness of the feature on that OS.
> > 
> > Yep.  I'm just trying to put guards in against future changes by someone
other
> > than the three of us.
> 
> Yes, I agree something should be said somewhere, probably on a future CL. I
> don't think this is the right header though, given that in reality this class
is
> not even aware that the headers are memory mapped by someone else, or where
this
> code is running (thread) etc. I expect backend_impl_v3.cc will have a hefty
> section about how things fit together.
> 
> I didn't mean to imply that the v3 design doc currently covers implementation
> details (threading, code structure or similar)... I can add that now if that
> helps you (it probably does, because really, following the code is not easy at
> all). I was just planning to add that later. (v2's doc has some mentions about
> system requirements and use of mmap)

I'm good with all that.  My only concern at this point is that it happen before
you finish the implementation :-}, but I'll trust you not to let it drop through
the cracks.  I certainly agree that it's not particularly related to this CL;
this is just the CL where I got to the point of understanding the V3 design well
enough to make the request.

> Believe it or not, I still have the vision of a flash-specific cache as the
long
> term future for mobile (or otherwise underpowered, flash based devices). I
> believe (may be wrong) that the V3 docs mentions everywhere the design as a
> desktop cache, but it also needs a section about platform requirements for
> portability (in the past, internal and external discussions have been handled
> mostly with direct communications, but usually people go through the doc first
> and that sets the basic expectations). I don't think our design docs are
> sufficiently clear about how they fit together or long term expectations.

No argument--we tend to do things in a patchwork fashion based on the whims of
reviewers :-{.

rvargas (doing something else)

Is there any pending item to land this CL?

7 years, 4 months ago (2013-08-24 02:24:43 UTC) #41

gavinp

I'm glad we talked through why it's OK to do IO on the IO thread ...

7 years, 3 months ago (2013-08-26 14:30:55 UTC) #42

I'm glad we talked through why it's OK to do IO on the IO thread despite the
prohibition when we're using these memory maps; especially given that read(2)
and write(2) are pretty thin layers on top of memcpy() into the buffer cache, it
was pretty interesting to me to expose those pragmatic trade offs.

This change LGTM, either way on my nits below. But I'd really be helped if you
could answer my questions. Thanks!

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
File net/disk_cache/block_files.h (right):

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/block_files.h:32: explicit BlockHeader(MappedFile* file);
On 2013/08/05 17:06:15, gavinp wrote:
> Is it appropriate to add a TODO(rvargas): Remove this constructor after v3
cache
> finishes landing?

Do these TODO comments make sense? I was asking as much to make sure I
understand what's happening as to improve the code?

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/block_files.h:67: // Returns a pointer to the underlying
BlockFileHeader.
On 2013/08/05 17:06:15, gavinp wrote:
> Is it appropriate to add a TODO(rvargas): Remove this accessor after v3 cache
> finishes landing?

Do these TODO comments make sense? I was asking as much to make sure I
understand what's happening as to improve the code?

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/block_files.h:74: typedef std::vector<BlockHeader>
BlockFilesBitmaps;
On 2013/08/05 19:50:45, rvargas wrote:
> On 2013/08/05 17:06:15, gavinp wrote:
> > In reviewing this code, I think this typedef is causing me some pain. Can we
> > just use std::vector<BlockHeader> everywhere, it's only seven characters
> longer?
> > I think it's referenced only twice?
> 
> It is used more that twice on the V3 code. This is the type that is
transferred
> between BlockFiles and BlockBitmaps whenever the mappings change.
> 
> BlockFilesMappings ?

I'll go look at the V3 code. But I'm still scared; I worry that STL typedefs
hide subtleties that deserve attention.

Can we just not use the typedef here, and then later introduce it in one of the
downstream CLs in which we use it more?

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
File net/disk_cache/v3/block_bitmaps_unittest.cc (right):

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/v3/block_bitmaps_unittest.cc:41: EXPECT_EQ(start / 4, (start +
block_size - 1) / 4);
On 2013/08/05 19:50:45, rvargas wrote:
> On 2013/08/05 17:06:15, gavinp wrote:
> > I think this is equivalent to: 
> >   EXPECT_GT(4, (i % 4) + (start % 4));
> > and also
> >   EXPECT_EQ(0, (start % 4 + block_size - 1) / 4);
> > Is that right?
> > 
> > But I still don't understand what's being asserted about the start
addresses.
> Is
> > it that we make our allocations aligned in some way?
> > 
> > Could we document this with a comment on the test failure, would:
> >   << "Block was not allocated aligned to ... boundary" 
> > work?
> 
> Done.

I don't know why it's important not to cross a four block boundary. Why is it
bad?

https://codereview.chromium.org/17816008/diff/156001/net/disk_cache/block_fil...
File net/disk_cache/block_files.cc (right):

https://codereview.chromium.org/17816008/diff/156001/net/disk_cache/block_fil...
net/disk_cache/block_files.cc:159: uint8* byte_map =
reinterpret_cast<uint8*>(header_->allocation_map);
Is this a strict aliasing violation? Why not?

https://codereview.chromium.org/17816008/diff/156001/net/disk_cache/v3/block_...
File net/disk_cache/v3/block_bitmaps_unittest.cc (right):

https://codereview.chromium.org/17816008/diff/156001/net/disk_cache/v3/block_...
net/disk_cache/v3/block_bitmaps_unittest.cc:55: for (int i =0; i < 29; i++) {
Nit: i = 0.

Randy Smith (Not in Mondays)

Just in case you're waiting for me, here's my LGTM. Note that I've only done ...

7 years, 3 months ago (2013-09-03 19:09:27 UTC) #43

rvargas (doing something else)

thanks https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache/block_files.h File net/disk_cache/block_files.h (right): https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache/block_files.h#newcode32 net/disk_cache/block_files.h:32: explicit BlockHeader(MappedFile* file); On 2013/08/26 14:30:56, gavinp wrote: ...

7 years, 3 months ago (2013-09-11 02:49:52 UTC) #44

thanks

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
File net/disk_cache/block_files.h (right):

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/block_files.h:32: explicit BlockHeader(MappedFile* file);
On 2013/08/26 14:30:56, gavinp wrote:
> On 2013/08/05 17:06:15, gavinp wrote:
> > Is it appropriate to add a TODO(rvargas): Remove this constructor after v3
> cache
> > finishes landing?
> 
> Do these TODO comments make sense? I was asking as much to make sure I
> understand what's happening as to improve the code?

I added a TODO at line 81 of PS9... I thought that made it clear that the class
has more than needed while V2 and V3 coexist.

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/block_files.h:74: typedef std::vector<BlockHeader>
BlockFilesBitmaps;
On 2013/08/26 14:30:56, gavinp wrote:
> On 2013/08/05 19:50:45, rvargas wrote:
> > On 2013/08/05 17:06:15, gavinp wrote:
> > > In reviewing this code, I think this typedef is causing me some pain. Can
we
> > > just use std::vector<BlockHeader> everywhere, it's only seven characters
> > longer?
> > > I think it's referenced only twice?
> > 
> > It is used more that twice on the V3 code. This is the type that is
> transferred
> > between BlockFiles and BlockBitmaps whenever the mappings change.
> > 
> > BlockFilesMappings ?
> 
> I'll go look at the V3 code. But I'm still scared; I worry that STL typedefs
> hide subtleties that deserve attention.
> 
> Can we just not use the typedef here, and then later introduce it in one of
the
> downstream CLs in which we use it more?

I could move the define to block_bitmaps.h in this CL, but it will be moving
back to this file later on (because it is used by BlockFiles) so I don't see
much value in doing that.

How about leaving it here for the time being and talking about its need when
there are more users?. I don't share your concern about stl defines but we
should talk about that later.

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
File net/disk_cache/v3/block_bitmaps_unittest.cc (right):

https://codereview.chromium.org/17816008/diff/5629499534213120/net/disk_cache...
net/disk_cache/v3/block_bitmaps_unittest.cc:41: EXPECT_EQ(start / 4, (start +
block_size - 1) / 4);
On 2013/08/26 14:30:56, gavinp wrote:
> On 2013/08/05 19:50:45, rvargas wrote:
> > On 2013/08/05 17:06:15, gavinp wrote:
> > > I think this is equivalent to: 
> > >   EXPECT_GT(4, (i % 4) + (start % 4));
> > > and also
> > >   EXPECT_EQ(0, (start % 4 + block_size - 1) / 4);
> > > Is that right?
> > > 
> > > But I still don't understand what's being asserted about the start
> addresses.
> > Is
> > > it that we make our allocations aligned in some way?
> > > 
> > > Could we document this with a comment on the test failure, would:
> > >   << "Block was not allocated aligned to ... boundary" 
> > > work?
> > 
> > Done.
> 
> I don't know why it's important not to cross a four block boundary. Why is it
> bad?

There are four basic approaches:
1. All resources are given a single block.
2. A resource is given 1 or more blocks, but all resources have a fix alignment.
3. There is a "natural" alignment that depends on the resource size.
4. There is no alignment at all.

The code roughly grows in complexity with each approach. The first and second
are basically equivalent in terms of wasted space. The last two are also
equivalent (at least on average). We use the third one. It means that 4 and 3
block resources are aligned at 4, 2 block resources at 2, and single block
resources can go anywhere.

https://codereview.chromium.org/17816008/diff/156001/net/disk_cache/block_fil...
File net/disk_cache/block_files.cc (right):

https://codereview.chromium.org/17816008/diff/156001/net/disk_cache/block_fil...
net/disk_cache/block_files.cc:159: uint8* byte_map =
reinterpret_cast<uint8*>(header_->allocation_map);
On 2013/08/26 14:30:56, gavinp wrote:
> Is this a strict aliasing violation? Why not?

It probably is, but not in the typical way of affecting possible optimizations.
Both variables are raw pointers to a large buffer and the code is not really
doing things with two pointers in a way that may confuse the optimizer.

https://codereview.chromium.org/17816008/diff/156001/net/disk_cache/v3/block_...
File net/disk_cache/v3/block_bitmaps_unittest.cc (right):

https://codereview.chromium.org/17816008/diff/156001/net/disk_cache/v3/block_...
net/disk_cache/v3/block_bitmaps_unittest.cc:55: for (int i =0; i < 29; i++) {
On 2013/08/26 14:30:56, gavinp wrote:
> Nit: i = 0.

Done.

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/rvargas@chromium.org/17816008/184001

7 years, 3 months ago (2013-09-11 18:53:58 UTC) #45

commit-bot: I haz the power

Retried try job too often on win7_aura for step(s) browser_tests http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=win7_aura&number=77909

7 years, 3 months ago (2013-09-12 04:43:16 UTC) #46

rvargas (doing something else)

7 years, 3 months ago (2013-09-13 21:51:58 UTC) #47

Message was sent while issue was closed.

Committed patchset #10 manually as r223133 (presubmit successful).

Issue 17816008: Disk cache: Introduce BlockBitmaps for V3. (Closed)

Description

Patch Set 1 #

Patch Set 2 : #

Patch Set 3 : #

Patch Set 4 : #

Patch Set 5 : #

Patch Set 6 : #

Patch Set 7 : Rebase #

Patch Set 8 : Remove dchecks #

Patch Set 9 : New BlockBitmaps description #

Patch Set 10 : #

Messages