|
|
Created:
5 years, 1 month ago by bcwhite Modified:
4 years, 11 months ago Reviewers:
Primiano Tucci (use gerrit), Alexander Potapenko, benwells, chrisha, Nico, Alexei Svitkine (slow), JF, pasko, dvyukov, Dmitry Vyukov, mdempsky CC:
chromium-reviews, gavinp+memory_chromium.org, vmpstr+watch_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionCreate "persistent memory allocator" for persisting and sharing objects.
This allocator reserves memory within a specific block that itself can
be persisted and/or shared between processes by various means.
It is lock-free for operation on all platforms (even Android which
does not support inter-process locks) and self-checking to deal
with security concerns where not all processes accessing the memory
are fully trustworthy.
BUG=546019
Committed: https://crrev.com/34ae4983d4a24f5136bf8fda7b618842920962b0
Cr-Commit-Position: refs/heads/master@{#370377}
Patch Set 1 #
Total comments: 53
Patch Set 2 : fixed compile problems; added paging test #
Total comments: 2
Patch Set 3 : addressed review comments by chrisha #Patch Set 4 : changed int32 to int32_t #
Total comments: 40
Patch Set 5 : addressed review comments by Alexander #
Total comments: 4
Patch Set 6 : added parallel test (and fixed a bug found by it) #Patch Set 7 : cleaned up some checks no longer necessary after previous fix #Patch Set 8 : added iterable loop detection; added "malicious" test #Patch Set 9 : simplified loop detection #
Total comments: 2
Patch Set 10 : moved flags to Atomic32 #
Total comments: 34
Patch Set 11 : addressed review comments by Alexander and Dmitry #Patch Set 12 : some 'git cl format' changes and remove double-space after periods #
Total comments: 20
Patch Set 13 : addressed review comments by Chris #
Total comments: 6
Patch Set 14 : rebased #
Total comments: 38
Patch Set 15 : addressed review comments by Dmitry #Patch Set 16 : fixed signed/unsigned comparison; improved comment #Patch Set 17 : new MakeIterable enqueue algorithm #Patch Set 18 : remove unnecessary barriers in Allocate; rename Corrupted->Corrupt to match tense of Full #Patch Set 19 : use size_t for sizes on external interface #Patch Set 20 : fix signed/unsigned compile issues #Patch Set 21 : improvements to external interface ('offset' becomes 'reference') #Patch Set 22 : hardening of GetAllocSize() and more signed/unsigned compile problem fixes #Patch Set 23 : even more signed/unsigned comparison fixes #Patch Set 24 : good grief -- when will it end? #
Total comments: 30
Patch Set 25 : addressed review comments by Chris #
Total comments: 22
Patch Set 26 : addressed review comments; added metrics #Patch Set 27 : add missing <string> import #Patch Set 28 : SharedMemoryAllocator -> PersistentMemoryAllocator #
Total comments: 4
Patch Set 29 : overflow-safe validation of passed size_t #Patch Set 30 : use 'volatile' for shared memory; use std::atomic for member flag #
Total comments: 6
Patch Set 31 : changed from subtle::Atomic32 to std::atomic<uint32_t> #Patch Set 32 : fixed a few remaining int32_t values #Patch Set 33 : fixed unittest signed/unsigned usage #Patch Set 34 : iterator improvements (to allow comparisons and hide internal details) #
Total comments: 9
Patch Set 35 : support for read-only (const) memory segments; support arbitrary iteration start; more tests #Patch Set 36 : addressed review comments by Dmity #Patch Set 37 : added File allocator and test #Patch Set 38 : fix delete of new[]; record failed allocs in histogram #Patch Set 39 : harden used(); file allocator now read-only #Patch Set 40 : moved from base/memory to base/metrics #Patch Set 41 : added IsMemoryAcceptable and IsFileAcceptable methods for pre-checking source #Patch Set 42 : rebased #
Total comments: 47
Patch Set 43 : addesed comments by Alexander Potapenko #Patch Set 44 : addessed more comments by Alexander Potapenko #
Total comments: 51
Patch Set 45 : addressed review comments by Chris #Dependent Patchsets: Messages
Total messages: 153 (26 generated)
bcwhite@chromium.org changed reviewers: + chrisha@chromium.org, glider@chromium.org, pasko@chromium.org
I'm still working on a multi-threaded test for the allocator to verify its parallel operation but wanted to get this out there for review. Thanks for everybody's insight on the original document and pseudo-code!
A first review based just on the header file. I started looking at the .cc but I'll look in more detail tomorrow. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:13: #define ALLOC_ALIGNMENT 16 Windows only provides 8-byte alignment... and a true lower bound is 4 bytes. Why 16? Should this vary across platforms? Also, we tend to use constants in the anonymous namespace rather than defines. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:33: const int32 BLOCK_COOKIE_ALLOCATED = 0xC8799269; For all of these constants the more common style is kBlockCookieAllocated. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:39: struct SharedMemoryAllocator::BlockHeader { Description of this struct. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:40: int32 size; // number of bytes in this block, including header micronit: Comments should be full sentences. Number of bytes, ... header. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:42: int32 type; // a number provided by caller indicating data type Is this needed? https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:46: struct SharedMemoryAllocator::SharedMetaData { Ditto. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:51: int32 reserved[2]; What is this for? https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:54: char flags[2]; // align to next int (not strictly needed but avoid confusion) Use a bitfield for flags? (Easier to grow in the future.) https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:65: #define OFFSET_QUEUE offsetof(SharedMetaData, queue) Use a constant here? https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:68: int32 page) This doesn't fit on the previous line? https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:75: static_assert(sizeof(BlockHeader) % ALLOC_ALIGNMENT == 0, Put these static asserts below the struct definitions? https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:103: // ...or something malicious has been playing with the meta-data. ubernit: metadata is one word https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... File base/memory/shared_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:14: // multiple processes. More detailed comments about the thread safety guarantees this achieves, and the fact that this is monotonic (ie: no frees...) https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:28: // caller. The allocator needs to know only the block's |base| address, the micronit: The more common pattern in Chrome is a single space after a period when starting a new sentence. (Roughly 5:1 usage of single space versus double space, from a few quick greps.) https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:35: SharedMemoryAllocator(void* base, int32 size, int32 page); (We now use int32_t instead of int32.) Shouldn't these all be uint32_t, or size_t? Given that they are strictly non-negative quantities. (Here and most everywhere in the API.) https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:46: template<typename T> T* GetObject(int32 offset, int32 type, T* unused) { Why pass a T* unused instead of just getting having the caller specify the type via GetObject<T>(...)? Syntactic sugar? Also, I'm curious what this is needed for. Presumably users are going to make an allocation and remember the returned pointer, and won't need to go try and look up the object again by offset? Why should users be aware of the concept of offsets at all? And finally, why do we need the notion of types? This makes this non-standard from most allocators, which only care about ranges of memory of a fixed size. If the thing you need to build on *top* of the allocator needs to be explicitly type aware, then provide the type info at that layer? https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:64: void MakeIterable(int32 offset); Why do you need this? The implementation already has all the information required to perform iteration... https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:84: struct SharedMetaData; Metadata is a single word, no need to capitalize Data. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:93: int32 mem_page_; Do we really care about page sizes? What's the harm in allowing allocations to cross a page boundary? (This happens all the time with most any allocator. I'd be more worried about crossing cache lines for small allocations.) And if we do, isn't this an implementation detail? (ie: platform specific, but knowable from the system and doesn't need to be provided) Maybe you only need to inform the allocator whether or not you care about crossing page boundaries, but not what the page size is?
https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:13: #define ALLOC_ALIGNMENT 16 On 2015/10/29 21:05:13, chrisha wrote: > Windows only provides 8-byte alignment... and a true lower bound is 4 bytes. Why > 16? Should this vary across platforms? > > Also, we tend to use constants in the anonymous namespace rather than defines. It shouldn't be 4 because then 64-bit values could get split across multiple bus reads. I can drop it to 8. I put it at 16 because the block header is 16 bytes which guarantees that a block header can't span cache lines. I tried an anonymous constant but ran into some signed/unsigned compiler errors whereas the #define just worked. That may have changed and the code progressed; I'll try it again. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:33: const int32 BLOCK_COOKIE_ALLOCATED = 0xC8799269; On 2015/10/29 21:05:13, chrisha wrote: > For all of these constants the more common style is kBlockCookieAllocated. Done. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:39: struct SharedMemoryAllocator::BlockHeader { On 2015/10/29 21:05:14, chrisha wrote: > Description of this struct. Done. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:40: int32 size; // number of bytes in this block, including header On 2015/10/29 21:05:13, chrisha wrote: > micronit: Comments should be full sentences. > > Number of bytes, ... header. That's technically a sentence fragment. I can make full sentences but they'll be redundant ("This is...") and likely won't fit on a single line. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:42: int32 type; // a number provided by caller indicating data type On 2015/10/29 21:05:13, chrisha wrote: > Is this needed? I think so, yes. When allocating, the caller knows the type of the object it's creating. When iterating, the caller has no idea what it is getting back. If there is only ever one type of object being made iterable (likely for the Shared Histogram use-case), it's a moot point but in the general case, some other information needs to be made available. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:46: struct SharedMemoryAllocator::SharedMetaData { On 2015/10/29 21:05:13, chrisha wrote: > Ditto. Done. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:51: int32 reserved[2]; On 2015/10/29 21:05:13, chrisha wrote: > What is this for? Padding. This structure must pad out to a multiple of the alloc-alignment and the "queue" block header must be at the very end. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:54: char flags[2]; // align to next int (not strictly needed but avoid confusion) On 2015/10/29 21:05:13, chrisha wrote: > Use a bitfield for flags? (Easier to grow in the future.) Bitfields are read-modify-write which are race-conditions in a parallel system. Chars are guaranteed to be able to be written without affecting any other bytes on the machine-word. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:65: #define OFFSET_QUEUE offsetof(SharedMetaData, queue) On 2015/10/29 21:05:13, chrisha wrote: > Use a constant here? A constant doesn't work because SharedMetaData is a private member and so can't be accessed by a global variable. I could make the constant part of the class but that seemed to make things worse. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:68: int32 page) On 2015/10/29 21:05:13, chrisha wrote: > This doesn't fit on the previous line? So it does. At one point I was trying uint32. (bad idea since Atomic32 is signed) Done. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:75: static_assert(sizeof(BlockHeader) % ALLOC_ALIGNMENT == 0, On 2015/10/29 21:05:13, chrisha wrote: > Put these static asserts below the struct definitions? They don't work there because the structures are private to the class. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:103: // ...or something malicious has been playing with the meta-data. On 2015/10/29 21:05:13, chrisha wrote: > ubernit: metadata is one word Done. Here and elsewhere. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... File base/memory/shared_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:14: // multiple processes. On 2015/10/29 21:05:14, chrisha wrote: > More detailed comments about the thread safety guarantees this achieves, and the > fact that this is monotonic (ie: no frees...) Done. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:28: // caller. The allocator needs to know only the block's |base| address, the On 2015/10/29 21:05:14, chrisha wrote: > micronit: The more common pattern in Chrome is a single space after a period > when starting a new sentence. (Roughly 5:1 usage of single space versus double > space, from a few quick greps.) I can't. I'm sorry. <hanging head in shame> Blame it on my 7th grade typing teacher. Always 2 spaces after a period. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:35: SharedMemoryAllocator(void* base, int32 size, int32 page); > (We now use int32_t instead of int32.) Awww.... shoot. > Shouldn't these all be uint32_t, or size_t? Given that they are strictly > non-negative quantities. (Here and most everywhere in the API.) I tried going that way early on. However, Atomic32 is always a signed value and the mixing of signed and unsigned values causes either excessive compiler errors or excessive casting. Since we're talking small values (kilobytes at most), there isn't any danger of overflowing even a signed integer. It's much simpler this way. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:46: template<typename T> T* GetObject(int32 offset, int32 type, T* unused) { On 2015/10/29 21:05:14, chrisha wrote: > Why pass a T* unused instead of just getting having the caller specify the type > via GetObject<T>(...)? Syntactic sugar? That works. Done. > Also, I'm curious what this is needed for. Presumably users are going to make an > allocation and remember the returned pointer, and won't need to go try and look > up the object again by offset? Why should users be aware of the concept of > offsets at all? Callers need to be returned an offset from an allocation in case they want to store that reference inside another object. A Histogram, for example, has a name. That name needs to be stored in the shared memory and the object needs to reference it. References can be only offsets since pointers would be invalid in another process should it map the segment at a different address. > And finally, why do we need the notion of types? This makes this non-standard > from most allocators, which only care about ranges of memory of a fixed size. If > the thing you need to build on *top* of the allocator needs to be explicitly > type aware, then provide the type info at that layer? It could certainly be done that way. I did it this way to force some basic checking based on type. By passing knowledge of the object to the allocator, it can ensure that the memory it is returning is capable of at least holding that object. If that was not done, it would be possible to craft a malicious memory segment that could cause a segmentation fault in another process. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:64: void MakeIterable(int32 offset); On 2015/10/29 21:05:14, chrisha wrote: > Why do you need this? The implementation already has all the information > required to perform iteration... Unfortunately, it doesn't. If you try to make it iterable during the allocation, you get a race condition between the filling in of the iteration information (the block header) and the update of the "global_freeptr". It's possible for a process to iterate into the newly allocated block before size and other things get set. See comments on my design document for details. As a bonus, not all allocations need to be made iterable. The histogram names, for example, can be omitted. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:84: struct SharedMetaData; On 2015/10/29 21:05:14, chrisha wrote: > Metadata is a single word, no need to capitalize Data. Done. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:93: int32 mem_page_; > Do we really care about page sizes? What's the harm in allowing allocations to > cross a page boundary? (This happens all the time with most any allocator. I'd > be more worried about crossing cache lines for small allocations.) Probably not. However, if this were to be used against a memory-mapped file, then an allocation that crossed page boundaries could cause two pages to be dirtied and written to disk. If pages aren't a concern on whatever memory is being assigned, just set it to zero and it'll be ignored. > And if we do, isn't this an implementation detail? (ie: platform specific, but > knowable from the system and doesn't need to be provided) Maybe you only need to > inform the allocator whether or not you care about crossing page boundaries, but > not what the page size is? Possibly but since the management of the memory segment is external, I think it best to get the page size knowledge from that external source rather than make assumptions here.
I think this code is over-complicated by the seemingly unnecessary corruption checks. Can you please elaborate in the comments in which cases the corruption may happen? Right now this looks like an attempt to detect random writes to the buffer by the code outside of SharedMemoryAllocator. I don't think there's a point in doing so. https://codereview.chromium.org/1410213004/diff/20001/base/memory/shared_memo... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/20001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:276: corrupted_ = true; Can SetCorrupted() and IsCorrupted() be invoked concurrently? If so, that's a data race. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:16: // RAM bus access. 16 was chosen so that the block header would always fall Nit: please 's/. /. ' here and below. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:42: int32_t size; // number of bytes in this block, including header Please fix the comments according to https://google.github.io/styleguide/cppguide.html#Punctuation,_Spelling_and_G... https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:43: int32_t cookie; // constant value indicating completed allocation I think it's better to align the comments to this struct's fields, but YMMV. Ditto for the struct below. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:55: int32_t reserved[2]; // padding to ensure size is multiple of alignment I think it's more common to put the padding at the end of the structure, not in the middle. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:71: #define OFFSET_NULL 0 // the equivalest NULL value for an offset I think '0' is good enough to not introduce a constant effectively meaning zero. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:74: int32_t page) s/page/page_size/ https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:96: if (shared_meta_->cookie != 0 || Can these checks be moved to the SharedMetadata constructor? Also, you seem to be just asserting the whole memory buffer is zeroed. Instead of writing this if-statement you can: a) scan the memory to make sure all the bytes are zeroes b) zero-fill this memory https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:99: shared_meta_->freeptr != 0 || shared_meta_->freeptr is an atomic and should be accessed as such. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:113: // This is still safe to do even if corruption has been detected. Why bother if we've detected the corruption? Shouldn't we just bail out? https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:117: subtle::NoBarrier_Store(&shared_meta_->freeptr, sizeof(SharedMetadata)); Shouldn't this be a Release_Store? shared_meta_->freeptr is read using an Acquire_Load. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:175: size = page_free; It may be worth noticing in the comments that the size of the allocated block may be greater than the requested size + header size + alignment. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:191: // zeros, any other value indicates that something has run amuck. Can you please elaborate on how's that possible? https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:211: meminfo->free = shared_meta_->corrupted ? 0 : remaining - sizeof(BlockHeader); Shouldn't we use IsCorrupted() here? https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:235: // Ensure that the tail pointer didn't change while reading next. Why do you need that? What happens if the tail pointer changes after you ensure that? https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:236: if (tail == subtle::Release_Load(&shared_meta_->tailptr)) { Most certainly you do not want to use Release_Load(). https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:242: &block->next, OFFSET_QUEUE, offset) == OFFSET_QUEUE) { Please mind the indentation. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:282: void SharedMemoryAllocator::SetCorrupted() { Can SetCorrupted() and IsCorrupted() be called from different threads? https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:296: bool SharedMemoryAllocator::IsFull() { How will IsFull() be used? Looks like a data race between it and Allocate() is possible. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:305: if (offset < (int)(special ? OFFSET_QUEUE : sizeof(SharedMetadata))) It's unclear what 'special' stands for.
glider@chromium.org changed reviewers: + dvyukov@chromium.org
+dvyukov for the lock-free stuff.
The implementation is certainly interesting. I am worried of wasting time reviewing it, so I asked the security team to for an ack about this first.
> I think this code is over-complicated by the seemingly unnecessary corruption > checks. Can you please elaborate in the comments in which cases the corruption > may happen? The desire is to be able to use this to get histogram information from the Rendered back into the Browser. Because the Renderer is insecure and could be compromised, it's necessary to harden the allocator such that a malicious process can't cause other processes to crash when using the allocator. The stored data will have to be similarly hardened. I'll get to the rest of your comments shortly.
https://codereview.chromium.org/1410213004/diff/20001/base/memory/shared_memo... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/20001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:276: corrupted_ = true; On 2015/10/30 06:53:13, Alexander Potapenko wrote: > Can SetCorrupted() and IsCorrupted() be invoked concurrently? > If so, that's a data race. It is, but not an important one. Correct operation does not depend on this value; it is advisory only. If one thread continues because it just missed the setting of the flag, it will still run without any failure; it's safe regardless of the outcome of the race. Once set, the flag will never be reset which would be an issue if possible. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:42: int32_t size; // number of bytes in this block, including header On 2015/10/30 06:53:13, Alexander Potapenko wrote: > Please fix the comments according to > https://google.github.io/styleguide/cppguide.html#Punctuation,_Spelling_and_G... Done. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:43: int32_t cookie; // constant value indicating completed allocation On 2015/10/30 06:53:13, Alexander Potapenko wrote: > I think it's better to align the comments to this struct's fields, but YMMV. > Ditto for the struct below. *I* prefer it but I've gotten nags in the past that I shouldn't do it. Let me see if presumbit complains about too many spaces before the double-slash if I do... https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:55: int32_t reserved[2]; // padding to ensure size is multiple of alignment On 2015/10/30 06:53:13, Alexander Potapenko wrote: > I think it's more common to put the padding at the end of the structure, not in > the middle. The "BlockHeader" entry must be the very last entry so that it starts on a proper alignment boundary. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:71: #define OFFSET_NULL 0 // the equivalest NULL value for an offset On 2015/10/30 06:53:13, Alexander Potapenko wrote: > I think '0' is good enough to not introduce a constant effectively meaning zero. I could just use NULL since it is defined as ``0''. I was trying to be descriptive (self-documenting code) without adding confusion since everybody associates NULL with pointers. Thoughts? https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:74: int32_t page) On 2015/10/30 06:53:13, Alexander Potapenko wrote: > s/page/page_size/ Done. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:96: if (shared_meta_->cookie != 0 || On 2015/10/30 06:53:13, Alexander Potapenko wrote: > Can these checks be moved to the SharedMetadata constructor? I don't think so because (1) it's never constructed and (2) it would have to be called only for newly created allocations. > Also, you seem to be just asserting the whole memory buffer is zeroed. > Instead of writing this if-statement you can: > a) scan the memory to make sure all the bytes are zeroes > b) zero-fill this memory It must be zero but only when first attached. If other processes attach it, the global-cookie will be set indicating that it is already initialized. Also, it's not allowed to zero the memory here because (1) it doesn't know if it's the first or later attachment and (2) accessing all memory would "realize" that memory -- in the case of memory-mapped files, the file size could start at zero and auto-grow as needed; we don't want to "need" it before we really need it. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:99: shared_meta_->freeptr != 0 || On 2015/10/30 06:53:13, Alexander Potapenko wrote: > shared_meta_->freeptr is an atomic and should be accessed as such. Right. It doesn't actually affect anything at this point but I think TSAN or similar checks these things. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:113: // This is still safe to do even if corruption has been detected. On 2015/10/30 06:53:13, Alexander Potapenko wrote: > Why bother if we've detected the corruption? Shouldn't we just bail out? It doesn't change anything either way. I opted for simpler code but could use "else" or "return" to skip it. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:117: subtle::NoBarrier_Store(&shared_meta_->freeptr, sizeof(SharedMetadata)); On 2015/10/30 06:53:13, Alexander Potapenko wrote: > Shouldn't this be a Release_Store? > shared_meta_->freeptr is read using an Acquire_Load. My understanding is that acquire/release affects only memory ordering of operations directly around this statement and has no bearing on other parts of the code or other threads. From atomicops.h: "Acquire" operations ensure that no later memory access can be reordered ahead of the operation. "Release" operations ensure that no previous memory access can be reordered after the operation. "Barrier" operations have both "Acquire" and "Release" semantics. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:175: size = page_free; On 2015/10/30 06:53:13, Alexander Potapenko wrote: > It may be worth noticing in the comments that the size of the allocated block > may be greater than the requested size + header size + alignment. Done. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:191: // zeros, any other value indicates that something has run amuck. On 2015/10/30 06:53:13, Alexander Potapenko wrote: > Can you please elaborate on how's that possible? Done. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:211: meminfo->free = shared_meta_->corrupted ? 0 : remaining - sizeof(BlockHeader); On 2015/10/30 06:53:13, Alexander Potapenko wrote: > Shouldn't we use IsCorrupted() here? Definitely. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:235: // Ensure that the tail pointer didn't change while reading next. On 2015/10/30 06:53:13, Alexander Potapenko wrote: > Why do you need that? Only the read of the tail pointer is atomic but we need to read both the tail pointer and the next pointer from it in an atomic fashion. The way to do this is to read both non-atomically and then verify after the last read that the first read is still valid/unchanged. I'll add that to the comment. > What happens if the tail pointer changes after you ensure that? Then we go about preparing to append the current node in the wrong place but fail to do so at the CAS down on line 241/242, starting over. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:236: if (tail == subtle::Release_Load(&shared_meta_->tailptr)) { On 2015/10/30 06:53:13, Alexander Potapenko wrote: > Most certainly you do not want to use Release_Load(). Why? It ensures no previous memory access in the code above will get reordered after this. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:242: &block->next, OFFSET_QUEUE, offset) == OFFSET_QUEUE) { On 2015/10/30 06:53:13, Alexander Potapenko wrote: > Please mind the indentation. Isn't it supposed to be indented 4 spaces from the start of the function call? https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:282: void SharedMemoryAllocator::SetCorrupted() { On 2015/10/30 06:53:13, Alexander Potapenko wrote: > Can SetCorrupted() and IsCorrupted() be called from different threads? Certainly. See previous comment on this. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:296: bool SharedMemoryAllocator::IsFull() { On 2015/10/30 06:53:13, Alexander Potapenko wrote: > How will IsFull() be used? > Looks like a data race between it and Allocate() is possible. It's just informational to the caller. Data-race is not important for the same reason as IsCorrupted() -- it's safe either way. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:305: if (offset < (int)(special ? OFFSET_QUEUE : sizeof(SharedMetadata))) On 2015/10/30 06:53:13, Alexander Potapenko wrote: > It's unclear what 'special' stands for. "special" indicates that we may try to access block headers not available to callers but still accessed by this module. By having internal dereferences go through this same function, the allocator is hardened against corruption. I'll add a comment. I could change the name of the parameter to "system" or "meta_ok" if that would be more clear.
https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:13: #define ALLOC_ALIGNMENT 16 On 2015/10/29 23:40:57, bcwhite wrote: > On 2015/10/29 21:05:13, chrisha wrote: > > Windows only provides 8-byte alignment... and a true lower bound is 4 bytes. > Why > > 16? Should this vary across platforms? > > > > Also, we tend to use constants in the anonymous namespace rather than defines. > > It shouldn't be 4 because then 64-bit values could get split across multiple bus > reads. I can drop it to 8. > > I put it at 16 because the block header is 16 bytes which guarantees that a > block header can't span cache lines. > > I tried an anonymous constant but ran into some signed/unsigned compiler errors > whereas the #define just worked. That may have changed and the code progressed; > I'll try it again. 8 sgtm https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:42: int32 type; // a number provided by caller indicating data type On 2015/10/29 23:40:57, bcwhite wrote: > On 2015/10/29 21:05:13, chrisha wrote: > > Is this needed? > > I think so, yes. When allocating, the caller knows the type of the object it's > creating. When iterating, the caller has no idea what it is getting back. If > there is only ever one type of object being made iterable (likely for the Shared > Histogram use-case), it's a moot point but in the general case, some other > information needs to be made available. Many heap implementations are iterable. They don't have a clue about what is being stored in it, and don't persist type information. If you need the type information as a user of the heap then you should store whatever metadata you want in your own 'header' inside of the returned allocation. Again, this is mixing things that are specific to UMA (where type info is required) into the allocator, which really doesn't care. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:54: char flags[2]; // align to next int (not strictly needed but avoid confusion) On 2015/10/29 23:40:57, bcwhite wrote: > On 2015/10/29 21:05:13, chrisha wrote: > > Use a bitfield for flags? (Easier to grow in the future.) > > Bitfields are read-modify-write which are race-conditions in a parallel system. > Chars are guaranteed to be able to be written without affecting any other bytes > on the machine-word. The only thing that's atomic is a machine word write, so writing chars is racy. You need to be writing an entire machine word and using atomic if the thing is ever to be safely modified. I'd still use a bitfield (and a union to an int32). I'd then atomically read the current flags, modify the bit I care about, and atomically compare and exchange until success. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:65: #define OFFSET_QUEUE offsetof(SharedMetaData, queue) On 2015/10/29 23:40:57, bcwhite wrote: > On 2015/10/29 21:05:13, chrisha wrote: > > Use a constant here? > > A constant doesn't work because SharedMetaData is a private member and so can't > be accessed by a global variable. I could make the constant part of the class > but that seemed to make things worse. sgtm https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... File base/memory/shared_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:28: // caller. The allocator needs to know only the block's |base| address, the On 2015/10/29 23:40:58, bcwhite wrote: > On 2015/10/29 21:05:14, chrisha wrote: > > micronit: The more common pattern in Chrome is a single space after a period > > when starting a new sentence. (Roughly 5:1 usage of single space versus double > > space, from a few quick greps.) > > I can't. I'm sorry. <hanging head in shame> Blame it on my 7th grade typing > teacher. Always 2 spaces after a period. http://www.slate.com/articles/technology/technology/2011/01/space_invaders.html https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:35: SharedMemoryAllocator(void* base, int32 size, int32 page); On 2015/10/29 23:40:57, bcwhite wrote: > > (We now use int32_t instead of int32.) > > Awww.... shoot. > > > Shouldn't these all be uint32_t, or size_t? Given that they are strictly > > non-negative quantities. (Here and most everywhere in the API.) > > I tried going that way early on. However, Atomic32 is always a signed value and > the mixing of signed and unsigned values causes either excessive compiler errors > or excessive casting. Since we're talking small values (kilobytes at most), > there isn't any danger of overflowing even a signed integer. It's much simpler > this way. Okay, sounds like a reasonable reason to me. Worth a comment somewhere? https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:46: template<typename T> T* GetObject(int32 offset, int32 type, T* unused) { > Callers need to be returned an offset from an allocation in case they want to > store that reference inside another object. A Histogram, for example, has a > name. That name needs to be stored in the shared memory and the object needs to > reference it. References can be only offsets since pointers would be invalid in > another process should it map the segment at a different address. The offset isn't sufficient in this case, because things may live in multiple independent memory segments, so you'll need both some kind of segment ID and an offset. Why not just abstract this away entirely and have the UMA allocator tie things together with IDs? (ie: hashes of the name, the bucket metadata, etc). The actual renderer using this allocator will keep direct pointers to things, and can store sufficient metadata into the individual allocations (representing various parts of the histograms) so that they can subsequently be tied together when read. No need for pointers or offsets at all. > > And finally, why do we need the notion of types? This makes this non-standard > > from most allocators, which only care about ranges of memory of a fixed size. > If > > the thing you need to build on *top* of the allocator needs to be explicitly > > type aware, then provide the type info at that layer? > > It could certainly be done that way. I did it this way to force some basic > checking based on type. By passing knowledge of the object to the allocator, it > can ensure that the memory it is returning is capable of at least holding that > object. If that was not done, it would be possible to craft a malicious memory > segment that could cause a segmentation fault in another process. I feel like you're mixing things that are specific to the UMA storage layer and things that are specific to the allocator. The allocator need only be able to return chunks of memory with a given size. Anybody with access to the memory can screw up the allocator and the data stored within it, regardless of any security measures you put in place. I agree that a simple 'consistency' check (checksum of block headers? constant cookie?) are useful because they quickly provide information about derails. However, the type means nothing to somebody parsing a range of allocations... it means something to the *user* of the segment, but not to the segment itself. I strongly feel it should be in the storage layer, not the allocator layer. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:64: void MakeIterable(int32 offset); On 2015/10/29 23:40:58, bcwhite wrote: > On 2015/10/29 21:05:14, chrisha wrote: > > Why do you need this? The implementation already has all the information > > required to perform iteration... > > Unfortunately, it doesn't. If you try to make it iterable during the > allocation, you get a race condition between the filling in of the iteration > information (the block header) and the update of the "global_freeptr". It's > possible for a process to iterate into the newly allocated block before size and > other things get set. See comments on my design document for details. > > As a bonus, not all allocations need to be made iterable. The histogram names, > for example, can be omitted. Still implementation details. This should be entirely internal: - carve out space by incrementing the cursor (Any other allocations will strictly go to the end, no conflict. The allocation is still not iterable, but that's fine.) - initialize the block header, the memory, etc. (The allocation is still not iterable, but that's fine.) - weave the allocation into the link list using atomics. Thus, the 'MakeIterable' (weave into the link list) could be entirely private and an implementation detail. If something doesn't need to be iterable (ie: the reader doesn't care about), then I wouldn't store it in the shared memory segment. I'd store it in the regular heap and let it die with the process. I'd only use the shared memory for things that must be communicated, in which case all of the allocations need to be iterable. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:93: int32 mem_page_; > Probably not. However, if this were to be used against a memory-mapped file, > then an allocation that crossed page boundaries could cause two pages to be > dirtied and written to disk. > > If pages aren't a concern on whatever memory is being assigned, just set it to > zero and it'll be ignored. Okay, I'm fine with that. https://codereview.chromium.org/1410213004/diff/80001/base/memory/shared_memo... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/80001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:93: // being initialized. It's unshared and single-threaded... Maybe it's worth having two separate constructors. One that is explicitly supposed to be the initializer and initial user of a memory region, and another that 'attaches' to one that's already initialized? This document the user intent, and separates the two bits of code. It also should be documented that the 'initializer' needs to have exclusive access before any other SharedMemoryAllocators can be attached to it. I'd also document the intended use and workflow in the .h with a brief example. https://codereview.chromium.org/1410213004/diff/80001/base/memory/shared_memo... File base/memory/shared_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/80001/base/memory/shared_memo... base/memory/shared_memory_allocator.h:19: // or processes that may be compromised and thus have malicious intent) How can this possibly be true? If I'm a malicious thread I can simply write whatever I want all over memory, and fool the allocator into thinking anything I want it to think. I don't this strong security against malicious writers is a requirement. Some kind of weak consistency is useful, but making this all out secure in the face of something in our process is both unrealistic and unnecessary. https://codereview.chromium.org/1410213004/diff/80001/base/memory/shared_memo... base/memory/shared_memory_allocator.h:26: // greatly complicate the allocation algorithm. Pointer invalidation isn't a reason, as this is true of all allocators, and memory management is the user's responsibility. (It doesn't support it because of the complication aspect, and the fact that we don't actually require it.)
Okay, after offline discussions: I'm okay with the "alloc" and "commit" phase (ie: MakeIterable). Maybe we want to rephrase the interface to that language? I'm okay with the concept of offsets, but don't think we should be thinking about those for the UMA use-case, as we will likely require multiple segments. I'm not 100% sold on the concept of the 'type' being in the allocator interface itself. It's something that the UMA use-case can make use of, but it's not generally required. Maybe just call it a cookie, because that's basically what it is? (ie: it has no relationship with nor bearing on actual C++ types) And maybe this belongs in the UMA component rather than base? Unless there are other people that will require this primitive as well?
https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:43: int32_t cookie; // constant value indicating completed allocation On 2015/10/30 14:01:10, bcwhite wrote: > On 2015/10/30 06:53:13, Alexander Potapenko wrote: > > I think it's better to align the comments to this struct's fields, but YMMV. > > Ditto for the struct below. > > *I* prefer it but I've gotten nags in the past that I shouldn't do it. Let me > see if presumbit complains about too many spaces before the double-slash if I > do... How about letting "git cl format" solve this problem? https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:242: &block->next, OFFSET_QUEUE, offset) == OFFSET_QUEUE) { On 2015/10/30 14:01:10, bcwhite wrote: > On 2015/10/30 06:53:13, Alexander Potapenko wrote: > > Please mind the indentation. > > Isn't it supposed to be indented 4 spaces from the start of the function call? git cl format will take care of these things for you
https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:13: #define ALLOC_ALIGNMENT 16 On 2015/10/30 14:36:46, chrisha wrote: > On 2015/10/29 23:40:57, bcwhite wrote: > > On 2015/10/29 21:05:13, chrisha wrote: > > > Windows only provides 8-byte alignment... and a true lower bound is 4 bytes. > > Why > > > 16? Should this vary across platforms? > > > > > > Also, we tend to use constants in the anonymous namespace rather than > defines. > > > > It shouldn't be 4 because then 64-bit values could get split across multiple > bus > > reads. I can drop it to 8. > > > > I put it at 16 because the block header is 16 bytes which guarantees that a > > block header can't span cache lines. > > > > I tried an anonymous constant but ran into some signed/unsigned compiler > errors > > whereas the #define just worked. That may have changed and the code > progressed; > > I'll try it again. > > 8 sgtm Done. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.cc:54: char flags[2]; // align to next int (not strictly needed but avoid confusion) > > Bitfields are read-modify-write which are race-conditions in a parallel > system. > > Chars are guaranteed to be able to be written without affecting any other > bytes > > on the machine-word. > > The only thing that's atomic is a machine word write, so writing chars is racy. > You need to be writing an entire machine word and using atomic if the thing is > ever to be safely modified. That used to be the case but C++-11 changed that. http://www.cpptalk.net/array-thread-safety-in-c-11-vt60124.html http://en.cppreference.com/w/cpp/language/memory_model Now a "char" is a memory location and "different threads of execution are always allowed to access (read and modify) different memory locations concurrently, with no interference and no synchronization requirements." (2nd link) https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... File base/memory/shared_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:28: // caller. The allocator needs to know only the block's |base| address, the On 2015/10/30 14:36:46, chrisha wrote: > On 2015/10/29 23:40:58, bcwhite wrote: > > On 2015/10/29 21:05:14, chrisha wrote: > > > micronit: The more common pattern in Chrome is a single space after a period > > > when starting a new sentence. (Roughly 5:1 usage of single space versus > double > > > space, from a few quick greps.) > > > > I can't. I'm sorry. <hanging head in shame> Blame it on my 7th grade typing > > teacher. Always 2 spaces after a period. > > http://www.slate.com/articles/technology/technology/2011/01/space_invaders.html Heh! Yeah, I've read that. It even makes sense. But when I say "I can't", I mean I physically can't. Double spacing after periods is such a ingrained habit that I can't break. And I have tried. I'm pedantic enough that I had to try. It doesn't help that it now simply looks completely "wrong" to me for a single space to be there. I'll do a global search/replace when this is nearing completion. If I do it sooner, more will sneak in. In the mean time, please suffer my personal failing in this regard. :-) https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:35: SharedMemoryAllocator(void* base, int32 size, int32 page); On 2015/10/30 14:36:46, chrisha wrote: > On 2015/10/29 23:40:57, bcwhite wrote: > > > (We now use int32_t instead of int32.) > > > > Awww.... shoot. > > > > > Shouldn't these all be uint32_t, or size_t? Given that they are strictly > > > non-negative quantities. (Here and most everywhere in the API.) > > > > I tried going that way early on. However, Atomic32 is always a signed value > and > > the mixing of signed and unsigned values causes either excessive compiler > errors > > or excessive casting. Since we're talking small values (kilobytes at most), > > there isn't any danger of overflowing even a signed integer. It's much > simpler > > this way. > > Okay, sounds like a reasonable reason to me. Worth a comment somewhere? Done. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:46: template<typename T> T* GetObject(int32 offset, int32 type, T* unused) { On 2015/10/30 14:36:46, chrisha wrote: > > Callers need to be returned an offset from an allocation in case they want to > > store that reference inside another object. A Histogram, for example, has a > > name. That name needs to be stored in the shared memory and the object needs > to > > reference it. References can be only offsets since pointers would be invalid > in > > another process should it map the segment at a different address. > > The offset isn't sufficient in this case, because things may live in multiple > independent memory segments, so you'll need both some kind of segment ID and an > offset. > > Why not just abstract this away entirely and have the UMA allocator tie things > together with IDs? (ie: hashes of the name, the bucket metadata, etc). The > actual renderer using this allocator will keep direct pointers to things, and > can store sufficient metadata into the individual allocations (representing > various parts of the histograms) so that they can subsequently be tied together > when read. No need for pointers or offsets at all. > > > > And finally, why do we need the notion of types? This makes this > non-standard > > > from most allocators, which only care about ranges of memory of a fixed > size. > > If > > > the thing you need to build on *top* of the allocator needs to be explicitly > > > type aware, then provide the type info at that layer? > > > > It could certainly be done that way. I did it this way to force some basic > > checking based on type. By passing knowledge of the object to the allocator, > it > > can ensure that the memory it is returning is capable of at least holding that > > object. If that was not done, it would be possible to craft a malicious > memory > > segment that could cause a segmentation fault in another process. > > I feel like you're mixing things that are specific to the UMA storage layer and > things that are specific to the allocator. > > The allocator need only be able to return chunks of memory with a given size. > Anybody with access to the memory can screw up the allocator and the data stored > within it, regardless of any security measures you put in place. I agree that a > simple 'consistency' check (checksum of block headers? constant cookie?) are > useful because they quickly provide information about derails. However, the type > means nothing to somebody parsing a range of allocations... it means something > to the *user* of the segment, but not to the segment itself. I strongly feel it > should be in the storage layer, not the allocator layer. Acknowledged. https://codereview.chromium.org/1410213004/diff/1/base/memory/shared_memory_a... base/memory/shared_memory_allocator.h:64: void MakeIterable(int32 offset); On 2015/10/30 14:36:46, chrisha wrote: > On 2015/10/29 23:40:58, bcwhite wrote: > > On 2015/10/29 21:05:14, chrisha wrote: > > > Why do you need this? The implementation already has all the information > > > required to perform iteration... > > > > Unfortunately, it doesn't. If you try to make it iterable during the > > allocation, you get a race condition between the filling in of the iteration > > information (the block header) and the update of the "global_freeptr". It's > > possible for a process to iterate into the newly allocated block before size > and > > other things get set. See comments on my design document for details. > > > > As a bonus, not all allocations need to be made iterable. The histogram > names, > > for example, can be omitted. > > Still implementation details. This should be entirely internal: > > - carve out space by incrementing the cursor > (Any other allocations will strictly go to the end, no conflict. The > allocation is still not iterable, but that's fine.) > - initialize the block header, the memory, etc. > (The allocation is still not iterable, but that's fine.) > - weave the allocation into the link list using atomics. > > Thus, the 'MakeIterable' (weave into the link list) could be entirely private > and an implementation detail. > > If something doesn't need to be iterable (ie: the reader doesn't care about), > then I wouldn't store it in the shared memory segment. I'd store it in the > regular heap and let it die with the process. I'd only use the shared memory for > things that must be communicated, in which case all of the allocations need to > be iterable. Acknowledged. https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/60001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:242: &block->next, OFFSET_QUEUE, offset) == OFFSET_QUEUE) { On 2015/10/30 15:11:53, chrisha wrote: > On 2015/10/30 14:01:10, bcwhite wrote: > > On 2015/10/30 06:53:13, Alexander Potapenko wrote: > > > Please mind the indentation. > > > > Isn't it supposed to be indented 4 spaces from the start of the function call? > > git cl format will take care of these things for you I tried that. It made it worse (IMO) by splitting the parameters across the first and second lines and indenting all the way to the parenthesis. I'll do a full "git cl format" cleanup at some point.
https://codereview.chromium.org/1410213004/diff/80001/base/memory/shared_memo... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/80001/base/memory/shared_memo... base/memory/shared_memory_allocator.cc:93: // being initialized. It's unshared and single-threaded... On 2015/10/30 14:36:46, chrisha wrote: > Maybe it's worth having two separate constructors. One that is explicitly > supposed to be the initializer and initial user of a memory region, and another > that 'attaches' to one that's already initialized? Maybe. That seems a reasonable idea though it might break the use-case of a memory-mapped file where you don't know if the file exists or not. Let me think on it. > I'd also document the intended use and workflow in the .h with a brief example. Done.
Before proceeding with the review I'd love to make several important points. 1. There is no such thing as "non-harmful data race". This is undefined behavior according to the C++ Standard. Period. If you want your behavior to be somewhat defined, you need to synchronize the accesses and/or use atomics. Unless a memory location is marked as atomic and is accessed using atomic operations, you cannon assume that memory location is updated atomically, even if that's word-sized. 2. TSan is a best-effort tool, it won't detect all bugs in your lock-free code unless that code is extensively tested. It's not enough to rely on TSan to be sure there're no errors in your code. I also suggest you take a look at http://preshing.com/20120913/acquire-and-release-semantics/ (and the related articles maybe) as a refresher for atomicops semantics (in particular acquire-release atomics and barriers). I'll respond to other comments shortly (am on a conference now)
Before proceeding with the review I'd love to make several important points. 1. There is no such thing as "non-harmful data race". This is undefined behavior according to the C++ Standard. Period. If you want your behavior to be somewhat defined, you need to synchronize the accesses and/or use atomics. Unless a memory location is marked as atomic and is accessed using atomic operations, you cannon assume that memory location is updated atomically, even if that's word-sized. 2. TSan is a best-effort tool, it won't detect all bugs in your lock-free code unless that code is extensively tested. It's not enough to rely on TSan to be sure there're no errors in your code. I also suggest you take a look at http://preshing.com/20120913/acquire-and-release-semantics/ (and the related articles maybe) as a refresher for atomicops semantics (in particular acquire-release atomics and barriers). I'll respond to other comments shortly (am on a conference now)
I still have a strong belief that the corruption checks should be less invasive. How about factoring them out to functions that perform cookie validation etc.? Also, SetCorruption()/IsCorrupted() must be private allocator methods. I'd really want to hear from the security team about this solution and the need to check for corruptions.
> 1. There is no such thing as "non-harmful data race". This is undefined behavior > according to the C++ Standard. Period. There is a definition in the C++ standard and there is a general definition: Two things racing to do something with different possible outcomes depending on timing. In referring to the latter, it is not necessarily a bad thing. If all possible results are acceptable then that "race condition" (in the general sense) is acceptable. > If you want your behavior to be somewhat defined, you need to synchronize the > accesses and/or use atomics. If you want the behavior to be *rigorously* defined, yes. > Unless a memory location is marked as atomic and is accessed using atomic > operations, you cannon assume that memory location is updated atomically, even > if that's word-sized. Absolutely. But if there are no read-modify-write operations and the only value that will ever be written is a 1 and you don't need to be synchronized with another thread, then you don't really need to do atomic operations. > I also suggest you take a look at > http://preshing.com/20120913/acquire-and-release-semantics/ (and the related > articles maybe) as a refresher for atomicops semantics (in particular > acquire-release atomics and barriers). There are definitely memory locations that require atomic access. The allocation loop requires it to avoid conflict and double-allocations. The iteration queue requires it for proper appending. Atomic variables are used in those cases. There are definitely memory locations that do not require atomic access. They are informational and the system will behave properly regardless of their value. For example, the "corrupted" flag will short-circuit some functions but those functions will still be safe even if that flag isn't set. After all, they have to be safe even when corruption hasn't yet been detected. Whether it's detected an flagged by another thread a nanosecond after the flag is checked does not change anything.
On 2015/10/30 19:27:16, bcwhite wrote: > > 1. There is no such thing as "non-harmful data race". This is undefined > behavior > > according to the C++ Standard. Period. > > There is a definition in the C++ standard and there is a general definition: > Two things racing to do something with different possible outcomes depending on > timing. > > In referring to the latter, it is not necessarily a bad thing. If all possible > results are acceptable then that "race condition" (in the general sense) is > acceptable. I am sorry, but it is not acceptable. The data races I am talking about (e.g. the one between SetCorrupted() and IsCorrupted(), to be precise) are not only "race conditions in the general sense", but also race conditions from the C++ Standard perspective. Because the behavior in the case of a race undefined, the compilers may take advantage of this fact and assume no race can ever occur. You can take a look at Hans Boehm's "How to miscompile programs with 'benign' data races" (http://static.usenix.org/event/hotpar11/tech/final_files/Boehm.pdf) for examples of how this can affect the compilation. > > > If you want your behavior to be somewhat defined, you need to synchronize the > > accesses and/or use atomics. > > If you want the behavior to be *rigorously* defined, yes. This is exactly what we want. I can also suggest you take a look at the "Benign" data races in Chromium" discussion on chromium-dev several months ago. It's been stated quite clear that there should be no so-called benign races in Chromium. > > > Unless a memory location is marked as atomic and is accessed using atomic > > operations, you cannon assume that memory location is updated atomically, even > > if that's word-sized. > > Absolutely. But if there are no read-modify-write operations and the only value > that will ever be written is a 1 and you don't need to be synchronized with > another thread, then you don't really need to do atomic operations. This is false. For example, the compiler may safely introduce additional reads and writes of that variable to use it as scratch space. > > > I also suggest you take a look at > > http://preshing.com/20120913/acquire-and-release-semantics/ (and the related > > articles maybe) as a refresher for atomicops semantics (in particular > > acquire-release atomics and barriers). > > There are definitely memory locations that require atomic access. The > allocation loop requires it to avoid conflict and double-allocations. The > iteration queue requires it for proper appending. Atomic variables are used in > those cases. > > There are definitely memory locations that do not require atomic access. They > are informational and the system will behave properly regardless of their value. > For example, the "corrupted" flag will short-circuit some functions but those > functions will still be safe even if that flag isn't set. After all, they have > to be safe even when corruption hasn't yet been detected. Whether it's detected > an flagged by another thread a nanosecond after the flag is checked does not > change anything. C++ Standard, 1.10.21: ===================== The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior. ===================== You must use relaxed atomics if you do not care about the order of execution.
> I am sorry, but it is not acceptable. The data races I am talking about (e.g. > the one between SetCorrupted() and IsCorrupted(), to be precise) are not only > "race conditions in the general sense", but also race conditions from the C++ > Standard perspective. Because the behavior in the case of a race undefined, the > compilers may take advantage of this fact and assume no race can ever occur. You > can take a look at Hans Boehm's "How to miscompile programs with 'benign' data > races" (http://static.usenix.org/event/hotpar11/tech/final_files/Boehm.pdf) for > examples of how this can affect the compilation. Thanks for the specific example and link! Let me read through it and then get back to you on this... (might not be for a few days -- have a Happy Hallowe'en)
https://codereview.chromium.org/1410213004/diff/160001/base/memory/shared_mem... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/160001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:269: if (tail == subtle::Release_Load(&shared_meta_->tailptr)) { > Why? It ensures no previous memory access in the code above will get reordered > after this. The problem with Release_Load() and Acquire_Store() is that they do not create happens-before arcs between memory accesses, so that's much harder to reason about the order of accesses when using them. Having them around is usually a red flag. FWIW if you really think you need a memory barrier between the loads of &block->next and &shared_meta_->tailptr, it's better to attribute that barrier to the former memory access, making it an Acquire_Load(). Please note that a single memory barrier is not enough to make sure nothing is reordered with your memory access. Because there are other threads that may access the same memory (otherwise you wouldn't care about the ordering at all), there must be pairing memory barriers that are executed on those threads. Please check http://www.1024cores.net/home/lock-free-algorithms/so-what-is-a-memory-model-... for an example.
https://codereview.chromium.org/1410213004/diff/160001/base/memory/shared_mem... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/160001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:264: // Ensure that the tail pointer didn't change while reading next. Only If I'm understanding correctly, you don't actually need to re-read the tail pointer to ensure it hasn't changed. You just need to make sure that when updating |next| and |tailptr| you do that in the following order: 1. write |next| 2. issue a memory barrier 3. write |tailptr| There's already a barrier between the load of |tailptr| (which is an acquire_load) and the read of |next|: a. read |tailptr| b. issue a memory barrier c. read |next| This guarantees that the reader will either see the old value of |tailptr| (which will result in it obtaining the old result of |next|, just because GetBlock will get you the old value), or the new value of |tailptr|, which makes it unavoidable to see the new value of |next|.
> > Absolutely. But if there are no read-modify-write operations and the only > value > > that will ever be written is a 1 and you don't need to be synchronized with > > another thread, then you don't really need to do atomic operations. > > This is false. For example, the compiler may safely introduce additional reads > and writes of that variable to use it as scratch space. Okay, let me see if I understand this correctly... An optimizing compiler, realizing that it's about to write a "1" to a location and believing it has exclusive access to the variable, could choose to write other values to that location in the interim. We'll ignore for the moment that it would be completely ridiculous for the compiler to do so. Memory writes are relatively expensive and so there are certainly better places for the compiler to place temporaries; this isn't the case of re-reading after discarding a previously read copy. But, "ridiculous" does not mean "impossible" so we have to accommodate this case. Do I have the right of it? Anyway, I can easily make these flags atomic. The base/atomicops has only 32-bit and 64-bit atomics. Is it okay to use std::atomic<char> in these cases? > C++ Standard, 1.10.21: > ===================== > The execution of a program contains a data race if it contains two conflicting > actions in different threads, > at least one of which is not atomic, and neither happens before the other. Any > such data race results in > undefined behavior. > ===================== To be fair, it's only undefined as to the outcome of those operations. If the code after that point runs just fine for all possible outcomes (which is the case with my code), then there is no harm.
You could use bitfields or chars. Then you could use an atomic 32-bit read, update the value, followed by an atomic compare and exchange, in a loop, like a spin lock. On Sat, Oct 31, 2015, 18:28 <bcwhite@chromium.org> wrote: > > > Absolutely. But if there are no read-modify-write operations and the > > only > > value > > > that will ever be written is a 1 and you don't need to be synchronized > > with > > > another thread, then you don't really need to do atomic operations. > > > This is false. For example, the compiler may safely introduce additional > > reads > > and writes of that variable to use it as scratch space. > > Okay, let me see if I understand this correctly... > > An optimizing compiler, realizing that it's about to write a "1" to a > location > and believing it has exclusive access to the variable, could choose to > write > other values to that location in the interim. > > We'll ignore for the moment that it would be completely ridiculous for the > compiler to do so. Memory writes are relatively expensive and so there are > certainly better places for the compiler to place temporaries; this isn't > the > case of re-reading after discarding a previously read copy. > But, "ridiculous" > does not mean "impossible" so we have to accommodate this case. > > Do I have the right of it? > > Anyway, I can easily make these flags atomic. The base/atomicops has only > 32-bit and 64-bit atomics. Is it okay to use std::atomic<char> in these > cases? > > > > C++ Standard, 1.10.21: > > ===================== > > The execution of a program contains a data race if it contains two > > conflicting > > actions in different threads, > > at least one of which is not atomic, and neither happens before the > > other. Any > > such data race results in > > undefined behavior. > > ===================== > > To be fair, it's only undefined as to the outcome of those operations. If > the > code after that point runs just fine for all possible outcomes (which is > the > case with my code), then there is no harm. > > https://codereview.chromium.org/1410213004/ > To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
On 2015/10/31 22:28:48, bcwhite wrote: > > C++ Standard, 1.10.21: > > ===================== > > The execution of a program contains a data race if it contains two conflicting > > actions in different threads, > > at least one of which is not atomic, and neither happens before the other. Any > > such data race results in > > undefined behavior. > > ===================== > > To be fair, it's only undefined as to the outcome of those operations. If the > code after that point runs just fine for all possible outcomes (which is the > case with my code), then there is no harm. No. You this is a wrong assumption. Such undefined behavior can affect other memory locations. Generally: this is not the right place to discuss hardware/software co-design. Implementors of the software stack below us optimize for the memory model in very tricky ways and can sometimes outsmart you in further version of systems. So, looking at the code of the compiler, all virtualizations implementations, all CPUs on Earth would not give you a guarantee that this code won't blow up, all you can do is maintain correctness in the given memory model. Thanks.
On 2015/10/31 22:28:48, bcwhite wrote: > > > Absolutely. But if there are no read-modify-write operations and the only > > value > > > that will ever be written is a 1 and you don't need to be synchronized with > > > another thread, then you don't really need to do atomic operations. > > > > This is false. For example, the compiler may safely introduce additional reads > > and writes of that variable to use it as scratch space. > > Okay, let me see if I understand this correctly... > > An optimizing compiler, realizing that it's about to write a "1" to a location > and believing it has exclusive access to the variable, could choose to write > other values to that location in the interim. > > We'll ignore for the moment that it would be completely ridiculous for the > compiler to do so. Memory writes are relatively expensive and so there are > certainly better places for the compiler to place temporaries; this isn't the > case of re-reading after discarding a previously read copy. But, "ridiculous" > does not mean "impossible" so we have to accommodate this case. > > Do I have the right of it? Yes. I have given another example below, which is more practical, because it removes code, not adds it. > Anyway, I can easily make these flags atomic. This is really the best thing to do. > The base/atomicops has only > 32-bit and 64-bit atomics. Is it okay to use std::atomic<char> in these cases? std::atomic isn't allowed in Chromium yet, but we are almost there (I think that's a matter of a month or so) You can follow Chris's suggestion and use a CAS loop for now. Don't think it'll affect the performance much. > To be fair, it's only undefined as to the outcome of those operations. If the > code after that point runs just fine for all possible outcomes (which is the > case with my code), then there is no harm. You cannot think of all the possible outcomes - exactly because the behavior is undefined, and the compiler is free to do whatever it wants. Imagine the following situation: one thread is doing corruption checks by reading a non-atomic flag, and the other one may (or may not) set that flag under certain circumstances. An advanced compiler working in the whole program mode may figure out that there are no stores to that flag which happen before loads from it (because e.g. the thread itself doesn't modify the flag, an the other thread does that asynchronously). The compiler can therefore choose to remove all the corruption checks, or just move them to places where they don't make much sense (e.g. to the beginning of the program). The C++ Standard allows such optimization, and it can certainly make conforming code faster.
> > The base/atomicops has only > > 32-bit and 64-bit atomics. Is it okay to use std::atomic<char> in these > cases? > std::atomic isn't allowed in Chromium yet, but we are almost there (I think > that's a matter of a month or so) > You can follow Chris's suggestion and use a CAS loop for now. Don't think it'll > affect the performance much. I guess there isn't much choice. I'll make it happen. We can revisit std::atomic in the future when it's allowed and decide then if it's worth using. Thanks for your patience and explanations. I appreciate it. As you might guess, I have more background in hardware, asm (though not x86 asm), and "simple" C so look at it a bit differently.
https://codereview.chromium.org/1410213004/diff/180001/base/atomicops_unittes... File base/atomicops_unittest.cc (right): https://codereview.chromium.org/1410213004/diff/180001/base/atomicops_unittes... base/atomicops_unittest.cc:92: AtomicType fail = base::subtle::NoBarrier_CompareAndSwap(&value, 0, 2); I suggest you commit this change separately, it really doesn't relate to the rest much.
On 2015/11/02 19:40:53, Alexander Potapenko wrote: > https://codereview.chromium.org/1410213004/diff/180001/base/atomicops_unittes... > File base/atomicops_unittest.cc (right): > > https://codereview.chromium.org/1410213004/diff/180001/base/atomicops_unittes... > base/atomicops_unittest.cc:92: AtomicType fail = > base::subtle::NoBarrier_CompareAndSwap(&value, 0, 2); > I suggest you commit this change separately, it really doesn't relate to the > rest much. Will do.
https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:49: base::subtle::Atomic32 loaded_flags = base::subtle::NoBarrier_Load(flags); It's not immediately evident whether the currently existing flags require any barriers. But to be sure nothing breaks in the future, SetFlag() and CheckFlag() must probably constitute a happens-before arc. These two flags (or other flags introduced later) may be used to signal that some bits of metadata have been changed, i.e.: 1. The user modifies the metadata and sets the corresponding flag 2. He then checks the flag value in another thread. 3. In the case the flag is true, the user assumes the metadata bits have already been changed. This assumption is quite natural, but is only possible if the flag is written using a Release_Store (it already is) and read using an Acquire_Load(). (See base::CancellationFlag, which has similar semantics) https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:115: DCHECK(base && reinterpret_cast<uintptr_t>(base) % kAllocAlignment == 0); These invariants shouldn't be checked often, should be fine to make them CHECKs. https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:359: subtle::NoBarrier_Store(&corrupted_, 1); Why do you need both corrupted_ and kFlagCorrupted? Isn't any of the two enough? https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... File base/memory/shared_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:30: // or previously initialized memory. In the first case, construction must Please run clang-format in order to fix the extra spaces. https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:109: void SetCorrupted(); Shall we move these two to the private section?
dvyukov@google.com changed reviewers: + dvyukov@google.com
https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:20: // It shouldn't be less than 8 so that 64-bit values can be read in a single What architecture do you have in mind? And what about cache? In C++ proper formulation of this would be "... to ensure proper alignment for all types". https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:53: void SetFlag(base::subtle::Atomic32* flags, int flag, bool set) { You never pass set=false. Please delete it. This code is complex enough without dead code. https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:58: if (base::subtle::Release_CompareAndSwap( You use NoBarrier_Load to load flags, so Release_CompareAndSwap cannot possibly be the right thing. Release/Acquire barrier must always be paired. If you don't need synchronization use NoBarrier to both load and set; if you need synchronization then change NoBarrier_Load to Acquire_Load. https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:75: subtle::Atomic32 next; // Pointer to the next block when iterating add . at the end of comment for consistency https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:272: subtle::NoBarrier_Store(&block->next, OFFSET_QUEUE); // will be tail block Why don't you use a lock-free stack? The stack algorithm is much simpler. It will also eliminate the SharedMetadata.queue, OFFSET_QUEUE, special flag in GetBlock and just greatly simplify lots of things. https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:344: if (state->loop_detector == OFFSET_QUEUE) There is a simpler way to do it: count number of iterations in state->niter if state->niter > freeptr/sizeof(BlockHeader), we have a loop https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... File base/memory/shared_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:37: // written to a location. This is an important distinction as it supports the Is it really important? Looks more confusing than important to me. I guess any code can work with both types of memory. And actually you cannot even say what type you have. E.g. if you call calloc, the returned memory can actually be just "read as zero until something different is written". And even if you explicitly write zeros, it still can be "fake memory" that reads as zeros. https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:65: // code and size-of |unused| are compared to ensure the reference is valid I guess it is s/unused/T/ https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:85: // the GetObject() call. s/GetObject/GetType/ https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... File base/memory/shared_memory_allocator_unittest.cc (right): https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator_unittest.cc:160: AllocatorThread t5("t5", memory, TEST_MEMORY_SIZE, TEST_MEMORY_PAGE); Also run iteration thread concurrently. https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator_unittest.cc:193: AllocatorThread t5("t5", memory, TEST_MEMORY_SIZE, TEST_MEMORY_PAGE); Also run iteration thread concurrently.
addressed review comments by Alexander and Dmitry
https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:20: // It shouldn't be less than 8 so that 64-bit values can be read in a single On 2015/11/03 14:06:46, Dmitry Vyukov wrote: > What architecture do you have in mind? And what about cache? > In C++ proper formulation of this would be "... to ensure proper alignment for > all types". Comment expanded and updated. I don't have a particular architecture in mind other than 64-bit but cache is a consideration. I originally had an alignment of 16 so at least the block header is always within a cache line but it was felt by others that it wasn't worth the extra ~4 bytes of wasted padding. https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:49: base::subtle::Atomic32 loaded_flags = base::subtle::NoBarrier_Load(flags); On 2015/11/03 08:12:54, Alexander Potapenko wrote: > It's not immediately evident whether the currently existing flags require any > barriers. > But to be sure nothing breaks in the future, SetFlag() and CheckFlag() must > probably constitute a happens-before arc. > > These two flags (or other flags introduced later) may be used to signal that > some bits of metadata have been changed, i.e.: > 1. The user modifies the metadata and sets the corresponding flag > 2. He then checks the flag value in another thread. > 3. In the case the flag is true, the user assumes the metadata bits have already > been changed. > > This assumption is quite natural, but is only possible if the flag is written > using a Release_Store (it already is) and read using an Acquire_Load(). (See > base::CancellationFlag, which has similar semantics) Done. https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:53: void SetFlag(base::subtle::Atomic32* flags, int flag, bool set) { On 2015/11/03 14:06:45, Dmitry Vyukov wrote: > You never pass set=false. Please delete it. This code is complex enough without > dead code. Done. https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:58: if (base::subtle::Release_CompareAndSwap( On 2015/11/03 14:06:45, Dmitry Vyukov wrote: > You use NoBarrier_Load to load flags, so Release_CompareAndSwap cannot possibly > be the right thing. Release/Acquire barrier must always be paired. If you don't > need synchronization use NoBarrier to both load and set; if you need > synchronization then change NoBarrier_Load to Acquire_Load. Done. https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:75: subtle::Atomic32 next; // Pointer to the next block when iterating On 2015/11/03 14:06:45, Dmitry Vyukov wrote: > add . at the end of comment for consistency Done. https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:115: DCHECK(base && reinterpret_cast<uintptr_t>(base) % kAllocAlignment == 0); On 2015/11/03 08:12:54, Alexander Potapenko wrote: > These invariants shouldn't be checked often, should be fine to make them CHECKs. Done. https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:272: subtle::NoBarrier_Store(&block->next, OFFSET_QUEUE); // will be tail block On 2015/11/03 14:06:46, Dmitry Vyukov wrote: > Why don't you use a lock-free stack? > The stack algorithm is much simpler. It will also eliminate the > SharedMetadata.queue, OFFSET_QUEUE, special flag in GetBlock and just greatly > simplify lots of things. Honestly, because an M&S Queue was what Alexander recommended when I was doing the design. But, I also need to be able to iterate through the queue and continue iteration after stopping if new items get added to the collection. That means "forward" pointers from the head whereas a Stack has only "backward" pointers from the tail. https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:344: if (state->loop_detector == OFFSET_QUEUE) On 2015/11/03 14:06:46, Dmitry Vyukov wrote: > There is a simpler way to do it: > > count number of iterations in state->niter > if state->niter > freeptr/sizeof(BlockHeader), we have a loop Great! Though mine worked, it had to change anyway because it occurred to me that two loops could go undetected. https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:359: subtle::NoBarrier_Store(&corrupted_, 1); On 2015/11/03 08:12:54, Alexander Potapenko wrote: > Why do you need both corrupted_ and kFlagCorrupted? > Isn't any of the two enough? The shared flag can't be trusted since a malicious actor could overwrite it. The local version is immune to foreign actors. Thus, if we see it, we copy it locally and once known, always restore it globally. I'll add a comment. https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... File base/memory/shared_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:30: // or previously initialized memory. In the first case, construction must On 2015/11/03 08:12:54, Alexander Potapenko wrote: > Please run clang-format in order to fix the extra spaces. It's on my to-do list. https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:37: // written to a location. This is an important distinction as it supports the On 2015/11/03 14:06:46, Dmitry Vyukov wrote: > Is it really important? Looks more confusing than important to me. > I guess any code can work with both types of memory. And actually you cannot > even say what type you have. E.g. if you call calloc, the returned memory can > actually be just "read as zero until something different is written". And even > if you explicitly write zeros, it still can be "fake memory" that reads as > zeros. It's important because the histogram data is likely several KB in size but could be hundreds of MB in size. One option we're exploring is to just allocate 256MiB of non-pinned memory. If that works then we can eliminate the complexities of growing the required memory segment by allocating and sharing more as needed but we don't want to waste such a huge swath of RAM if it's not actually getting used. https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:65: // code and size-of |unused| are compared to ensure the reference is valid On 2015/11/03 14:06:46, Dmitry Vyukov wrote: > I guess it is s/unused/T/ Done. https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:85: // the GetObject() call. On 2015/11/03 14:06:46, Dmitry Vyukov wrote: > s/GetObject/GetType/ Done. https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:109: void SetCorrupted(); On 2015/11/03 08:12:54, Alexander Potapenko wrote: > Shall we move these two to the private section? IsCorrupted definitely needs to be public. SetCorrupted should be public because corruption could be detected by the caller (during examination of a returned object) and should let the allocator know. https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... File base/memory/shared_memory_allocator_unittest.cc (right): https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator_unittest.cc:160: AllocatorThread t5("t5", memory, TEST_MEMORY_SIZE, TEST_MEMORY_PAGE); On 2015/11/03 14:06:46, Dmitry Vyukov wrote: > Also run iteration thread concurrently. The test thread is doing that concurrently with the separate allocator threads. Is there a benefit to spawning another thread to concentrate on that?
https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:272: subtle::NoBarrier_Store(&block->next, OFFSET_QUEUE); // will be tail block On 2015/11/03 16:28:19, bcwhite wrote: > On 2015/11/03 14:06:46, Dmitry Vyukov wrote: > > Why don't you use a lock-free stack? > > The stack algorithm is much simpler. It will also eliminate the > > SharedMetadata.queue, OFFSET_QUEUE, special flag in GetBlock and just greatly > > simplify lots of things. > > Honestly, because an M&S Queue was what Alexander recommended when I was doing > the design. But, I also need to be able to iterate through the queue and > continue iteration after stopping if new items get added to the collection. > That means "forward" pointers from the head whereas a Stack has only "backward" > pointers from the tail. Was that another Alexander? I never mentioned M&S queues (of which I first heard from the design doc)
On 2015/11/03 17:25:16, Alexander Potapenko wrote: > https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... > File base/memory/shared_memory_allocator.cc (right): > > https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... > base/memory/shared_memory_allocator.cc:272: > subtle::NoBarrier_Store(&block->next, OFFSET_QUEUE); // will be tail block > On 2015/11/03 16:28:19, bcwhite wrote: > > On 2015/11/03 14:06:46, Dmitry Vyukov wrote: > > > Why don't you use a lock-free stack? > > > The stack algorithm is much simpler. It will also eliminate the > > > SharedMetadata.queue, OFFSET_QUEUE, special flag in GetBlock and just > greatly > > > simplify lots of things. > > > > Honestly, because an M&S Queue was what Alexander recommended when I was doing > > the design. But, I also need to be able to iterate through the queue and > > continue iteration after stopping if new items get added to the collection. > > That means "forward" pointers from the head whereas a Stack has only > "backward" > > pointers from the tail. > > Was that another Alexander? I never mentioned M&S queues (of which I first heard > from the design doc) Oh, sorry. It was Dmitry that mention M&S but on the comment you started with "This is tricky..." https://docs.google.com/document/d/1YvBXzi745UDjiFVLjP8IWUMFgYL9gDb2J4xuwcv-c...
On 2015/11/03 17:29:40, bcwhite wrote: > On 2015/11/03 17:25:16, Alexander Potapenko wrote: > > > https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... > > File base/memory/shared_memory_allocator.cc (right): > > > > > https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... > > base/memory/shared_memory_allocator.cc:272: > > subtle::NoBarrier_Store(&block->next, OFFSET_QUEUE); // will be tail block > > On 2015/11/03 16:28:19, bcwhite wrote: > > > On 2015/11/03 14:06:46, Dmitry Vyukov wrote: > > > > Why don't you use a lock-free stack? > > > > The stack algorithm is much simpler. It will also eliminate the > > > > SharedMetadata.queue, OFFSET_QUEUE, special flag in GetBlock and just > > greatly > > > > simplify lots of things. > > > > > > Honestly, because an M&S Queue was what Alexander recommended when I was > doing > > > the design. But, I also need to be able to iterate through the queue and > > > continue iteration after stopping if new items get added to the collection. > > > That means "forward" pointers from the head whereas a Stack has only > > "backward" > > > pointers from the tail. > > > > Was that another Alexander? I never mentioned M&S queues (of which I first > heard > > from the design doc) > > Oh, sorry. It was Dmitry that mention M&S but on the comment you started with > "This is tricky..." > https://docs.google.com/document/d/1YvBXzi745UDjiFVLjP8IWUMFgYL9gDb2J4xuwcv-c... Apparently Dmitry referred to M&S as a more complex solution :)
On 2015/11/03 17:32:13, Alexander Potapenko wrote: > On 2015/11/03 17:29:40, bcwhite wrote: > > On 2015/11/03 17:25:16, Alexander Potapenko wrote: > > > > > > https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... > > > File base/memory/shared_memory_allocator.cc (right): > > > > > > > > > https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... > > > base/memory/shared_memory_allocator.cc:272: > > > subtle::NoBarrier_Store(&block->next, OFFSET_QUEUE); // will be tail block > > > On 2015/11/03 16:28:19, bcwhite wrote: > > > > On 2015/11/03 14:06:46, Dmitry Vyukov wrote: > > > > > Why don't you use a lock-free stack? > > > > > The stack algorithm is much simpler. It will also eliminate the > > > > > SharedMetadata.queue, OFFSET_QUEUE, special flag in GetBlock and just > > > greatly > > > > > simplify lots of things. > > > > > > > > Honestly, because an M&S Queue was what Alexander recommended when I was > > doing > > > > the design. But, I also need to be able to iterate through the queue and > > > > continue iteration after stopping if new items get added to the > collection. > > > > That means "forward" pointers from the head whereas a Stack has only > > > "backward" > > > > pointers from the tail. > > > > > > Was that another Alexander? I never mentioned M&S queues (of which I first > > heard > > > from the design doc) > > > > Oh, sorry. It was Dmitry that mention M&S but on the comment you started with > > "This is tricky..." > > > https://docs.google.com/document/d/1YvBXzi745UDjiFVLjP8IWUMFgYL9gDb2J4xuwcv-c... > > Apparently Dmitry referred to M&S as a more complex solution :) The first suggestion is nice but isn't fail-safe for iteration. If the process crashes between the allocation and the update of the block-header then it's not possible to safely recover.
https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/180001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:272: subtle::NoBarrier_Store(&block->next, OFFSET_QUEUE); // will be tail block On 2015/11/03 16:28:19, bcwhite wrote: > On 2015/11/03 14:06:46, Dmitry Vyukov wrote: > > Why don't you use a lock-free stack? > > The stack algorithm is much simpler. It will also eliminate the > > SharedMetadata.queue, OFFSET_QUEUE, special flag in GetBlock and just greatly > > simplify lots of things. > > Honestly, because an M&S Queue was what Alexander recommended when I was doing > the design. But, I also need to be able to iterate through the queue and > continue iteration after stopping if new items get added to the collection. > That means "forward" pointers from the head whereas a Stack has only "backward" > pointers from the tail. Makes sense. I guess we could do something along of the following lines for stack iteration. Namely, re-read stack head, and it some new elements get added we iterate from new head to the previous head. But at this point I am not sure it worth it. struct Iter { Node* pos; Node* head; Node* next_head; int32 n; }; void Alloc::IterInit(Iter *iter) { iter->pos = 0; iter->head = 0; iter->last_head = 0; iter->n = 0; } void* Alloc::IterNext(Iter *iter) { if (iter->pos == iter->head) { iter->head = iter->next_head; iter->pos = iter->next_head = head_; } if (iter->pos == iter->head) return NULL; void *cur = iter->pos; iter->pos = iter->pos->next; return cur; } I will look at the queue algorithm tomorrow.
Another round of comments on the API. https://codereview.chromium.org/1410213004/diff/220001/base/memory/shared_mem... File base/memory/shared_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/220001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:50: }; Each of these structs is worth of a one line comment describing what they are/will be used for. https://codereview.chromium.org/1410213004/diff/220001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:61: SharedMemoryAllocator(void* base, int32_t size, int32_t page_size); The comments for this still aren't clear. The description earlier up states that construction must complete before it is shared to others, but there's no comment of this fact here. This constructor is also able to be both the 'initializing' constructor and an 'attaching' constructor. Maybe document that fact very clearly? (I still prefer separate constructors, or an additional parameter that declares whether the constructor is expected to initialize the memory or attach to an already initialized allocator.) https://codereview.chromium.org/1410213004/diff/220001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:79: T* GetType(int32_t offset, int32_t type) { This is poorly named, as it's not actually getting a type. It's getting a pointer to an object, and validating its type. GetAsType? GetAtOffset? Also, 'type' is overloading here and throughout this class. A type means something very specific in C++, and that is not what |type| is. Maybe you want |type_id|? Or maybe you want to make this more generic and call this a |cookie|? And given that there's a special value (0), I'd like a static constant (well, typed enum is the new hotness) for that magic value: kAnyCookie, or kAnyTypeId? This function also wants a counterpart to convert a typed pointer to an offset, that also checks validity, size, etc. https://codereview.chromium.org/1410213004/diff/220001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:86: // the GetType() call. Comment that the user is expected to Commit/MakeIterable the returned allocation once it is initialized/readable. Comment that the user should check the flags for failure reason? https://codereview.chromium.org/1410213004/diff/220001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:92: // be allocated. Also note that this is inherently 'racy' in the sense that the information is only valid instantaneously, and may no longer be true due to activity on other threads. Could also mention any guarantees this gives under single threaded use. https://codereview.chromium.org/1410213004/diff/220001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:97: void MakeIterable(int32_t offset); Do we want to use the seemingly standard 'Allocate' and 'Commit' language? I'd also move this declaration directly below Allocate, as they are intimately related. https://codereview.chromium.org/1410213004/diff/220001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:105: int32_t GetNextIterable(Iterator* state, int32_t* type); Is it worth making this have STL iterator semantics so it can be used in STL algorithms and for-range constructs? Not a lot more complexity for much more power. https://codereview.chromium.org/1410213004/diff/220001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:111: bool IsCorrupted(); Does SetCorrupted need to be exposed for anyone to call? Or does it really only need to private, and called internally? https://codereview.chromium.org/1410213004/diff/220001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:127: int32_t mem_page_; Should these be stored in this object, or in the header itself? As is it's possible for different threads/processes to use different values for these... https://codereview.chromium.org/1410213004/diff/220001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:128: int32_t last_seen_; Comment for what this is? While you're at it, comment all of these member variables and functions.
> > > Why don't you use a lock-free stack? > > > The stack algorithm is much simpler. It will also eliminate the > > > SharedMetadata.queue, OFFSET_QUEUE, special flag in GetBlock and just > greatly > > > simplify lots of things. > > > > Honestly, because an M&S Queue was what Alexander recommended when I was doing > > the design. But, I also need to be able to iterate through the queue and > > continue iteration after stopping if new items get added to the collection. > > That means "forward" pointers from the head whereas a Stack has only > "backward" > > pointers from the tail. > > Makes sense. > > I guess we could do something along of the following lines for stack iteration. > Namely, re-read stack head, and it some new elements get added we iterate from > new head to the previous head. But at this point I am not sure it worth it. An iterator has state so can remember the previous point it stopped. Going forward is still a problem. The order might not make a difference but it seems like we should at least try to return them in the same order they were allocated even though that isn't guaranteed because, well, what is the order when multiple threads all try at about the same time? Going backwards up the stack would mean the iterator keeping the head it started from as well as the previous head it started from. When done and trying to continue, move start-head to previous-head, current-head to start-head and go. Sounds doable but not really any less complex overall.
https://codereview.chromium.org/1410213004/diff/240001/base/memory/shared_mem... File base/memory/shared_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/240001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:54: enum { Use a typed enum: enum : int32_t { kTypeIdAny = 0 }; https://code.google.com/p/chromium/codesearch#chromium/src/third_party/llvm/i... https://codereview.chromium.org/1410213004/diff/240001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:101: // Allocated objects can be added to an internal list that can then be Shouldn't this be "should" rather than "can"? Is there any reason for allocations not to be made iterable? https://codereview.chromium.org/1410213004/diff/240001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:146: char* mem_base_; // Same. (char because sizeof guaranteed 1) ubernit: Do we even need to store this? Or maybe just have a private accessor that does the casting for us starting from shared_meta_? (It's a little wasteful to store the exact same pointer twice.)
https://codereview.chromium.org/1410213004/diff/220001/base/memory/shared_mem... File base/memory/shared_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/220001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:50: }; On 2015/11/03 18:20:51, chrisha wrote: > Each of these structs is worth of a one line comment describing what they > are/will be used for. Done. https://codereview.chromium.org/1410213004/diff/220001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:61: SharedMemoryAllocator(void* base, int32_t size, int32_t page_size); On 2015/11/03 18:20:51, chrisha wrote: > The comments for this still aren't clear. > > The description earlier up states that construction must complete before it is > shared to others, but there's no comment of this fact here. This constructor is > also able to be both the 'initializing' constructor and an 'attaching' > constructor. Maybe document that fact very clearly? > > (I still prefer separate constructors, or an additional parameter that declares > whether the constructor is expected to initialize the memory or attach to an > already initialized allocator.) Comment updated and extended. Regarding multiple ctors... They would have the same nominal signature making it difficult to distinguish them when viewing the code. Also, it may be that the caller doesn't know whether the segment is new or existing; it seems best to leave the test for that in code that knows what it's doing. Fewer things to go wrong. https://codereview.chromium.org/1410213004/diff/220001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:79: T* GetType(int32_t offset, int32_t type) { On 2015/11/03 18:20:51, chrisha wrote: > This is poorly named, as it's not actually getting a type. It's getting a > pointer to an object, and validating its type. It was originally "GetObject" but it turns out that this name gets mangled by some VS header files into "GetObjectW". <sigh> How about GetAsObject<T>? > Also, 'type' is overloading here and throughout this class. A type means > something very specific in C++, and that is not what |type| is. Maybe you want > |type_id|? Or maybe you want to make this more generic and call this a |cookie|? Done. > And given that there's a special value (0), I'd like a static constant (well, > typed enum is the new hotness) for that magic value: kAnyCookie, or kAnyTypeId? Done. https://codereview.chromium.org/1410213004/diff/220001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:86: // the GetType() call. On 2015/11/03 18:20:51, chrisha wrote: > Comment that the user is expected to Commit/MakeIterable the returned allocation > once it is initialized/readable. > > Comment that the user should check the flags for failure reason? Done. https://codereview.chromium.org/1410213004/diff/220001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:92: // be allocated. On 2015/11/03 18:20:51, chrisha wrote: > Also note that this is inherently 'racy' in the sense that the information is > only valid instantaneously, and may no longer be true due to activity on other > threads. Could also mention any guarantees this gives under single threaded use. Done. https://codereview.chromium.org/1410213004/diff/220001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:97: void MakeIterable(int32_t offset); On 2015/11/03 18:20:51, chrisha wrote: > Do we want to use the seemingly standard 'Allocate' and 'Commit' language? > > I'd also move this declaration directly below Allocate, as they are intimately > related. I moved it. It's not necessary to commit all allocations, just those you want iterable. https://codereview.chromium.org/1410213004/diff/220001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:105: int32_t GetNextIterable(Iterator* state, int32_t* type); On 2015/11/03 18:20:51, chrisha wrote: > Is it worth making this have STL iterator semantics so it can be used in STL > algorithms and for-range constructs? Not a lot more complexity for much more > power. I'm not sure. STL iterators have a designated "end()" beyond which they cannot iterate. We insert iterable objects just before "end" so we couldn't be able to continue iteration and get those new objects. https://codereview.chromium.org/1410213004/diff/220001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:111: bool IsCorrupted(); On 2015/11/03 18:20:51, chrisha wrote: > Does SetCorrupted need to be exposed for anyone to call? Or does it really only > need to private, and called internally? I was thinking that the caller might want to report it if it detects corruption within data objects. I'm flexible either way. https://codereview.chromium.org/1410213004/diff/220001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:127: int32_t mem_page_; On 2015/11/03 18:20:51, chrisha wrote: > Should these be stored in this object, or in the header itself? As is it's > possible for different threads/processes to use different values for these... They need to be stored in both places. We want to compare them during attachment to ensure everything is still in good shape but we can't trust the shared values when verifying boundary conditions. I'm assuming that the mechanism passing the shared segments around provides reliable size information provided by the OS and the page size will likely be a compile-time constant. https://codereview.chromium.org/1410213004/diff/220001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:128: int32_t last_seen_; On 2015/11/03 18:20:51, chrisha wrote: > Comment for what this is? While you're at it, comment all of these member > variables and functions. Done. https://codereview.chromium.org/1410213004/diff/240001/base/memory/shared_mem... File base/memory/shared_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/240001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:54: enum { On 2015/11/03 21:25:25, chrisha wrote: > Use a typed enum: > > enum : int32_t { kTypeIdAny = 0 }; > > https://code.google.com/p/chromium/codesearch#chromium/src/third_party/llvm/i... Done. https://codereview.chromium.org/1410213004/diff/240001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:101: // Allocated objects can be added to an internal list that can then be On 2015/11/03 21:25:24, chrisha wrote: > Shouldn't this be "should" rather than "can"? Is there any reason for > allocations not to be made iterable? Sure. In the case of a Histogram, it has it's metadata (min/max/etc) and it's counts and maybe the bucket-ranges (still investigating that one). Since the last two are both dynamic length, that will likely be three different allocations from the shared memory (again, still investigating). Because the metadata will have direct "offset" references to the others, only the metadata object need be made iterable once everything else is created. The others could be made iterable but it's not necessary since no other process will need to find them that way. Data structures tend to be complex and generally only the top-level ones need to be made visible, just like classes with internal sub-object pointers make them private. https://codereview.chromium.org/1410213004/diff/240001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:146: char* mem_base_; // Same. (char because sizeof guaranteed 1) On 2015/11/03 21:25:24, chrisha wrote: > ubernit: Do we even need to store this? Or maybe just have a private accessor > that does the casting for us starting from shared_meta_? > > (It's a little wasteful to store the exact same pointer twice.) Correct on all counts. It's just so much simpler to have a "char*" because it allows "base + offset" calculations without any casting. I could do it as a #define or inline method, or I could do the same casting mem_base_ as shared_meta_. Actually, I originally did just that (casting mem_base_ as SharedMetadata*) but this way just made the code so much more readable. But if you prefer, I'll remove the duplicate and alter the code to cast with a #define.
https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:174: if (size < 0) { check that size != 0 as well in GetNextIterable and in other places you assume that min block size is (sizeof(BlockHeader) + kAllocAlignment) zero size can, in particular, falsely fail the loop check during iteration https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:180: size += sizeof(BlockHeader); check for overflow, rendered can pass INT_MAX-1 nothing good will come out of that you can just check that size < page_size as first check https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:182: if (size > mem_page_) check that size <= page_size https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:192: int32_t freeptr = subtle::Acquire_Load(&shared_meta_->freeptr); What do we acquire here? Where is the pairing release? https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:212: if (size > page_free) { %K returns value in [0, K), not [1, K] check for page_free = 0 https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:214: if (subtle::Release_CompareAndSwap( What do we release here? Where is the pairing acquire? https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:216: block->size = page_free; Why do we need this? We not don't iterate the region itself, only through the queue. So it is OK to just bump freeptr. https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:234: if (subtle::Release_CompareAndSwap( What do we release here? Where is the pairing acquire? https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:255: block->size = size; These should be atomic stores as they race with allocation checks in another thread. Or alternatively you can just not check these fields on allocation. https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:295: if (tail == subtle::Release_Load(&shared_meta_->tailptr)) { Why do we need the atomic read of both fields? next or (next and tail) can change right after this check, so this is pointless just use the return value from the failed CAS: tail = tailptr next = CAS(&tail->next, OFFSET_QUEUE, offset) if next == OFFSET_QUEUE enqueued else CAS(&tailptr, tail, next) // help Also, Release_Load is a weird function without clearly defined semantics. It will make porting to C++ atomics harder and ThreadSanitizer does not understand it. Please don't use it. https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:331: int32_t next = subtle::NoBarrier_Load(&block->next); this needs to be Acquire_Load, this is what acquires visibility of new nodes from threads that concurrently add new nodes synchronizes with Release_CAS(&block->next) in MakeIterable there is another super-subtle ordering issue here (worth a comment) currently it is possible that we load shared_meta_->freeptr first, then another thread allocates and enqueues a node, then we load block->next and observe the new node, as the result we can falsely fail the loop check and falsely mark the region as corrupted if we use acquire_load here, it synchronizes with enqueue of the node, which in turn synchronizes with allocation of the node (presumably happens in the same thread); so the result of the load of freeptr below must cover the newly observed node https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:341: int32_t freeptr = std::min(subtle::Acquire_Load(&shared_meta_->freeptr), visibility over what do we acquire here? https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:387: bool special) { Split special flag into two flags: one allows to get pointer to OFFSET_QUEUE, second allows to get pointer to non-allocated blocks. Different call sites want one or another. This will also get rid of the -size hack in Allocate (don't need to check cookie, type and size for new blocks).
https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:174: if (size < 0) { On 2015/11/04 13:52:29, Dmitry Vyukov wrote: > check that size != 0 as well > in GetNextIterable and in other places you assume that min block size is > (sizeof(BlockHeader) + kAllocAlignment) > zero size can, in particular, falsely fail the loop check during iteration Done. https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:180: size += sizeof(BlockHeader); On 2015/11/04 13:52:29, Dmitry Vyukov wrote: > check for overflow, rendered can pass INT_MAX-1 > nothing good will come out of that > you can just check that size < page_size as first check Done. https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:192: int32_t freeptr = subtle::Acquire_Load(&shared_meta_->freeptr); On 2015/11/04 13:52:29, Dmitry Vyukov wrote: > What do we acquire here? Where is the pairing release? It's the CAS on line 214 or 234 (only one is executed). Unless I'm not understanding something. As an aside, why do they have to be paired? Logically there is a symmetry but in actual operation, aren't they completely independent? https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:212: if (size > page_free) { On 2015/11/04 13:52:29, Dmitry Vyukov wrote: > %K returns value in [0, K), not [1, K] > check for page_free = 0 I want [0, K). If "freeptr" points to the start of a 4KiB page then "page_free" should be 4096. page_free can never be zero. https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:216: block->size = page_free; On 2015/11/04 13:52:29, Dmitry Vyukov wrote: > Why do we need this? > We not don't iterate the region itself, only through the queue. So it is OK to > just bump freeptr. I suppose it could be omitted now. It was part of the original design that provided iteration and recovery by making it possible to step through all allocated objects. Race conditions proved that insufficient in an active system, hence the queue. It feels "right" to have it, though. And since the block header has to be padded for alignment anyway, removing it would just leave a "reserved" field in its place; no savings there. So I'd like to keep it if you don't object too much. https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:255: block->size = size; On 2015/11/04 13:52:29, Dmitry Vyukov wrote: > These should be atomic stores as they race with allocation checks in another > thread. > Or alternatively you can just not check these fields on allocation. You mean the checks on lines 247-250? https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:295: if (tail == subtle::Release_Load(&shared_meta_->tailptr)) { > Why do we need the atomic read of both fields? This is how it is done in the M&S Queue paper -- the PDF link is in the comments at the top of this file. I expanded this comment based on my understanding but I could be off. > next or (next and tail) can change right after this check, so this is pointless > just use the return value from the failed CAS: They can change but the algorithm will fail later on and retry. I believe the original paper assumes CAS has only a true/false return code. As you say, it may be possible to modify the algorithm to take advantage of our version returning the existing value before the attempt. For example, it may be possible to combine the two "if" statements on lines 298 and 300 but there are so many subtleties with these things that I'm loath to go against a researched paper by people who know far more than I because there seems a better way with the tools I have that they didn't. > Also, Release_Load is a weird function without clearly defined semantics. It > will make porting to C++ atomics harder and ThreadSanitizer does not understand > it. Please don't use it. Interesting. I was looking at it only with regard to before/after ordering. I'll change it to an Acquire. https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:331: int32_t next = subtle::NoBarrier_Load(&block->next); On 2015/11/04 13:52:28, Dmitry Vyukov wrote: > this needs to be Acquire_Load, this is what acquires visibility of new nodes > from threads that concurrently add new nodes > synchronizes with Release_CAS(&block->next) in MakeIterable > > there is another super-subtle ordering issue here (worth a comment) > currently it is possible that we load shared_meta_->freeptr first, then another > thread allocates and enqueues a node, then we load block->next and observe the > new node, as the result we can falsely fail the loop check and falsely mark the > region as corrupted > if we use acquire_load here, it synchronizes with enqueue of the node, which in > turn synchronizes with allocation of the node (presumably happens in the same > thread); so the result of the load of freeptr below must cover the newly > observed node Whew! I've added a comment according to my understanding. Please check that I did actually understand. :-D Note, though, that the calculation below will actually return max+1 because there is a memory-segment header that's worth a minimum allocation. A dozen allocations could be made in between but we're only returning one at a time. https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:341: int32_t freeptr = std::min(subtle::Acquire_Load(&shared_meta_->freeptr), On 2015/11/04 13:52:29, Dmitry Vyukov wrote: > visibility over what do we acquire here? There must be something I don't understand about acquire/release naming because to me it's just an atomic load of the value. https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:387: bool special) { On 2015/11/04 13:52:29, Dmitry Vyukov wrote: > Split special flag into two flags: one allows to get pointer to OFFSET_QUEUE, > second allows to get pointer to non-allocated blocks. > Different call sites want one or another. > This will also get rid of the -size hack in Allocate (don't need to check > cookie, type and size for new blocks). Done.
https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:212: if (size > page_free) { On 2015/11/04 17:18:55, bcwhite wrote: > On 2015/11/04 13:52:29, Dmitry Vyukov wrote: > > %K returns value in [0, K), not [1, K] > > check for page_free = 0 > > I want [0, K). If "freeptr" points to the start of a 4KiB page then "page_free" > should be 4096. page_free can never be zero. I may be missing something then. If we get page_free==0, then (size > page_free) is true, and we will try to advice freeptr by 0, which will cause an infinite loop. If page_free==0, we must not even try to advice freeptr. We have a whole page. So I think it should be: if (page_free == 0) page_free = mem_page_; No? https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:216: block->size = page_free; On 2015/11/04 17:18:55, bcwhite wrote: > On 2015/11/04 13:52:29, Dmitry Vyukov wrote: > > Why do we need this? > > We not don't iterate the region itself, only through the queue. So it is OK to > > just bump freeptr. > > I suppose it could be omitted now. It was part of the original design that > provided iteration and recovery by making it possible to step through all > allocated objects. Race conditions proved that insufficient in an active > system, hence the queue. > > It feels "right" to have it, though. And since the block header has to be > padded for alignment anyway, removing it would just leave a "reserved" field in > its place; no savings there. So I'd like to keep it if you don't object too > much. I don't object too much. But then add a comment. Future readers won't know about "the original design". And also it is the only usage of kBlockCookieWasted, which makes it look very suspicious. https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:255: block->size = size; On 2015/11/04 17:18:55, bcwhite wrote: > On 2015/11/04 13:52:29, Dmitry Vyukov wrote: > > These should be atomic stores as they race with allocation checks in another > > thread. > > Or alternatively you can just not check these fields on allocation. > > You mean the checks on lines 247-250? I mean checks in GetBlock done by another thread which now tries to allocate this block. https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:295: if (tail == subtle::Release_Load(&shared_meta_->tailptr)) { On 2015/11/04 17:18:55, bcwhite wrote: > > Why do we need the atomic read of both fields? > > This is how it is done in the M&S Queue paper -- the PDF link is in the comments > at the top of this file. I expanded this comment based on my understanding but > I could be off. > > > > next or (next and tail) can change right after this check, so this is > pointless > > just use the return value from the failed CAS: > > They can change but the algorithm will fail later on and retry. > > I believe the original paper assumes CAS has only a true/false return code. As > you say, it may be possible to modify the algorithm to take advantage of our > version returning the existing value before the attempt. For example, it may be > possible to combine the two "if" statements on lines 298 and 300 but there are > so many subtleties with these things that I'm loath to go against a researched > paper by people who know far more than I because there seems a better way with > the tools I have that they didn't. > > > > Also, Release_Load is a weird function without clearly defined semantics. It > > will make porting to C++ atomics harder and ThreadSanitizer does not > understand > > it. Please don't use it. > > Interesting. I was looking at it only with regard to before/after ordering. > I'll change it to an Acquire. What will break if we remove the CAS?
https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:212: if (size > page_free) { > > I want [0, K). If "freeptr" points to the start of a 4KiB page then > "page_free" > > should be 4096. page_free can never be zero. > > I may be missing something then. > If we get page_free==0 mem_page_ > 0 therefore freeptr % mem_page_ < mem_page_ therefore mem_page_ - freeptr % mem_page_ > 0 therefore page_free > 0 https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:216: block->size = page_free; On 2015/11/04 17:33:08, Dmitry Vyukov wrote: > On 2015/11/04 17:18:55, bcwhite wrote: > > On 2015/11/04 13:52:29, Dmitry Vyukov wrote: > > > Why do we need this? > > > We not don't iterate the region itself, only through the queue. So it is OK > to > > > just bump freeptr. > > > > I suppose it could be omitted now. It was part of the original design that > > provided iteration and recovery by making it possible to step through all > > allocated objects. Race conditions proved that insufficient in an active > > system, hence the queue. > > > > It feels "right" to have it, though. And since the block header has to be > > padded for alignment anyway, removing it would just leave a "reserved" field > in > > its place; no savings there. So I'd like to keep it if you don't object too > > much. > > > I don't object too much. But then add a comment. > Future readers won't know about "the original design". And also it is the only > usage of kBlockCookieWasted, which makes it look very suspicious. Done. https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:295: if (tail == subtle::Release_Load(&shared_meta_->tailptr)) { On 2015/11/04 17:33:08, Dmitry Vyukov wrote: > On 2015/11/04 17:18:55, bcwhite wrote: > > > Why do we need the atomic read of both fields? > > > > This is how it is done in the M&S Queue paper -- the PDF link is in the > comments > > at the top of this file. I expanded this comment based on my understanding > but > > I could be off. > > > > > > > next or (next and tail) can change right after this check, so this is > > pointless > > > just use the return value from the failed CAS: > > > > They can change but the algorithm will fail later on and retry. > > > > I believe the original paper assumes CAS has only a true/false return code. > As > > you say, it may be possible to modify the algorithm to take advantage of our > > version returning the existing value before the attempt. For example, it may > be > > possible to combine the two "if" statements on lines 298 and 300 but there are > > so many subtleties with these things that I'm loath to go against a researched > > paper by people who know far more than I because there seems a better way with > > the tools I have that they didn't. > > > > > > > Also, Release_Load is a weird function without clearly defined semantics. It > > > will make porting to C++ atomics harder and ThreadSanitizer does not > > understand > > > it. Please don't use it. > > > > Interesting. I was looking at it only with regard to before/after ordering. > > I'll change it to an Acquire. > > > What will break if we remove the CAS? As I understand it... If we remove the block-next update CAS on line 300 then it's possible for another thread to insert into the queue only to be overwritten by this thread. The previous entry will become disconnected. If we remove the tailptr update CAS on line 318 then it's possible for tailptr to point to a node that isn't actually the tail. This should get corrected by the CAS on line 310 during the next enqueue attempt but will be in a bad state until then.
https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:192: int32_t freeptr = subtle::Acquire_Load(&shared_meta_->freeptr); On 2015/11/04 17:18:55, bcwhite wrote: > On 2015/11/04 13:52:29, Dmitry Vyukov wrote: > > What do we acquire here? Where is the pairing release? > > It's the CAS on line 214 or 234 (only one is executed). Unless I'm not > understanding something. > > As an aside, why do they have to be paired? Logically there is a symmetry but > in actual operation, aren't they completely independent? Acquire or release operation that is not paired with the opposite operation is a no-op. Ok, it is paired with CAS on line 214 or 234. But what is the data that we want to synchronize with this Acquire? Or put another way: why did you use Acquire instead of NoBarrier? https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:212: if (size > page_free) { On 2015/11/04 18:40:16, bcwhite wrote: > > > I want [0, K). If "freeptr" points to the start of a 4KiB page then > > "page_free" > > > should be 4096. page_free can never be zero. > > > > I may be missing something then. > > If we get page_free==0 > > mem_page_ > 0 therefore > freeptr % mem_page_ < mem_page_ therefore > mem_page_ - freeptr % mem_page_ > 0 therefore > page_free > 0 Aha! I missed "mem_page_ - " part. Sorry. https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:295: if (tail == subtle::Release_Load(&shared_meta_->tailptr)) { On 2015/11/04 17:18:55, bcwhite wrote: > > Why do we need the atomic read of both fields? > > This is how it is done in the M&S Queue paper -- the PDF link is in the comments > at the top of this file. I expanded this comment based on my understanding but > I could be off. > > > > next or (next and tail) can change right after this check, so this is > pointless > > just use the return value from the failed CAS: > > They can change but the algorithm will fail later on and retry. > > I believe the original paper assumes CAS has only a true/false return code. As > you say, it may be possible to modify the algorithm to take advantage of our > version returning the existing value before the attempt. For example, it may be > possible to combine the two "if" statements on lines 298 and 300 but there are > so many subtleties with these things that I'm loath to go against a researched > paper by people who know far more than I because there seems a better way with > the tools I have that they didn't. We need to understand this algorithm well enough to be sure that it is correct and to be able to do future modifications. These kind of papers known to contain bugs as well. And they are usually sloppy about proper memory barrier. I propose to change this logic to: for (;;) { tail = tailptr next = CAS(&tail->next, OFFSET_QUEUE, offset) if next == OFFSET_QUEUE { CAS(&tailptr, tail, offset) return } else CAS(&tailptr, tail, next) // help } It is way simpler. I can actually understand how it works. The current "atomic" load of 2 fields raises more questions: - what if block->next transitioned from OFFSET_QUEUE to offset of some node, is it OK? how does it affect algorithm operation? - what if block->next changes right after we do Release_Load check? Is it ok? - if above is OK, why do we do the check at all? - what if shared_meta_->tailptr changes right after we do the check? Is it ok? https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:341: int32_t freeptr = std::min(subtle::Acquire_Load(&shared_meta_->freeptr), On 2015/11/04 17:18:55, bcwhite wrote: > On 2015/11/04 13:52:29, Dmitry Vyukov wrote: > > visibility over what do we acquire here? > > There must be something I don't understand about acquire/release naming because > to me it's just an atomic load of the value. Just atomic load is NoBarrier_Load. Acquire/Release also give you mutual ordering of memory access as perceived by other threads.
https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:192: int32_t freeptr = subtle::Acquire_Load(&shared_meta_->freeptr); > > > What do we acquire here? Where is the pairing release? > > > > It's the CAS on line 214 or 234 (only one is executed). Unless I'm not > > understanding something. > > > > As an aside, why do they have to be paired? Logically there is a symmetry but > > in actual operation, aren't they completely independent? > > Acquire or release operation that is not paired with the opposite operation is a > no-op. > > Ok, it is paired with CAS on line 214 or 234. But what is the data that we want > to synchronize with this Acquire? Or put another way: why did you use Acquire > instead of NoBarrier? I've been thinking about acquire/release in terms of how they translate into machine code. At their base, they simply prevent re-ordering memory access from after/before their location. I wasn't considering any "higher level" reasoning since obviously machine instructions don't actually know anything about what is happening on other threads. With a lock/mutex, you "release" a lock you previous "acquired". But with atomics, it seems that you "acquire" a value "released" by another thread, a value that something else claims it is finished with, at least for the moment. There are no _code_ requirements to match acquire/release (unlike a mutex) but conceptually you should consider this kind of pairing... and it should be documented in comments. Am I thinking of this correctly? If so, then... It's acquiring the state indicating where unallocated memory (currently) begins. https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:295: if (tail == subtle::Release_Load(&shared_meta_->tailptr)) { > We need to understand this algorithm well enough to be sure that it is correct > and to be able to do future modifications. These kind of papers known to contain > bugs as well. And they are usually sloppy about proper memory barrier. Fair enough. Let's see what happens. > > I propose to change this logic to: > > for (;;) { > tail = tailptr > next = CAS(&tail->next, OFFSET_QUEUE, offset) > if next == OFFSET_QUEUE { > CAS(&tailptr, tail, offset) > return > } else > CAS(&tailptr, tail, next) // help > } Nice. I don't see any flaws in it. The parallel test hasn't crashed or failed; I have it running continuously at the moment. Check my comments to make sure I understood everything correctly.
https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:192: int32_t freeptr = subtle::Acquire_Load(&shared_meta_->freeptr); On 2015/11/05 14:37:15, bcwhite wrote: > > > > What do we acquire here? Where is the pairing release? > > > > > > It's the CAS on line 214 or 234 (only one is executed). Unless I'm not > > > understanding something. > > > > > > As an aside, why do they have to be paired? Logically there is a symmetry > but > > > in actual operation, aren't they completely independent? > > > > Acquire or release operation that is not paired with the opposite operation is > a > > no-op. > > > > Ok, it is paired with CAS on line 214 or 234. But what is the data that we > want > > to synchronize with this Acquire? Or put another way: why did you use Acquire > > instead of NoBarrier? > > I've been thinking about acquire/release in terms of how they translate into > machine code. At their base, they simply prevent re-ordering memory access from > after/before their location. I wasn't considering any "higher level" reasoning > since obviously machine instructions don't actually know anything about what is > happening on other threads. > > With a lock/mutex, you "release" a lock you previous "acquired". But with > atomics, it seems that you "acquire" a value "released" by another thread, a > value that something else claims it is finished with, at least for the moment. > > There are no _code_ requirements to match acquire/release (unlike a mutex) but > conceptually you should consider this kind of pairing... and it should be > documented in comments. > > Am I thinking of this correctly? Correct. > If so, then... > > It's acquiring the state indicating where unallocated memory (currently) begins. Well, a NoBarrier_Load will perfectly return where unallocated memory begins. Memory barriers are about _associated_ state. Consider: // thread 1 data = 42; Release_Store(&data_ready, 1); // thread 2 if (Acquire_Load(&data_ready) == 1) assert(data == 42); Here data_ready is synchronization variable and data is the associated state that we synchronize by using acquire/release. If there is no associated state, then there is no point in using memory barriers. E.g.: // thread 1 while (NoBarrier_Load(&stop) == 0) { ... } // thread 2 NoBarrier_Store(&stop, 1); The question is: here shared_meta_->freeptr is the synchronization variable, if we use acquire, then there must be some associated state (which is not freeptr itself -- we already loaded it). What is that state? Or put it another way: reordering of memory accesses to shared_meta_->freeptr and what else we want to prevent by using barriers? Excessive memory barriers won't break the algorithm (it is missing barrier that can break it). My concern is more clarity and documentation. If I see an acquire or release, then I start looking for the pairing operation and try to understand what state we synchronize by using barriers. If I find them, then good. If I don't, then I continue looking, at some point I will have to stop with impression that I am missing something important here.
I don't see any issues in the code, so LGTM.
https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:192: int32_t freeptr = subtle::Acquire_Load(&shared_meta_->freeptr); > > If so, then... > > > > It's acquiring the state indicating where unallocated memory (currently) > begins. > > Well, a NoBarrier_Load will perfectly return where unallocated memory begins. > Memory barriers are about _associated_ state. Consider: Ahhh! So it's not that specific value we're acquiring or releasing, it's things that are related to it. The value is more a kind of pivot around which the related variables must show integrity. So because "freeptr" is all alone, we don't need any barriers; any atomic access will do and the Acquire/Release can be changed to NoBarrier. In the case of tailptr/next, then there must be acquire/release semantics to ensure that access to those _two_ values happen in a guaranteed order. Neither the compiler nor the cpu actually know what values are related so they just ensure that all memory access before/after remains before/after in the face of optimizations. I'll change these to NoBarrier ops.
> Nice. I don't see any flaws in it. The parallel test hasn't crashed or failed; > I have it running continuously at the moment. Check my comments to make sure I > understood everything correctly. Having a parallel test is good, but please keep in mind that running it on x86 may not uncover all the possible problems, just because that's a strong memory ordered CPU, and every instruction has implicit acquire/release semantics (I'm not saying explicit memory ordering is bad because of ARM and the C++ standard) I've fired off a TSan tryjob just to increase the level of confidence.
On 2015/11/05 17:06:30, bcwhite wrote: > https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... > File base/memory/shared_memory_allocator.cc (right): > > https://codereview.chromium.org/1410213004/diff/260001/base/memory/shared_mem... > base/memory/shared_memory_allocator.cc:192: int32_t freeptr = > subtle::Acquire_Load(&shared_meta_->freeptr); > > > If so, then... > > > > > > It's acquiring the state indicating where unallocated memory (currently) > > begins. > > > > Well, a NoBarrier_Load will perfectly return where unallocated memory begins. > > Memory barriers are about _associated_ state. Consider: > > Ahhh! So it's not that specific value we're acquiring or releasing, it's things > that are related to it. The value is more a kind of pivot around which the > related variables must show integrity. > > So because "freeptr" is all alone, we don't need any barriers; any atomic access > will do and the Acquire/Release can be changed to NoBarrier. Right. > In the case of tailptr/next, then there must be acquire/release semantics to > ensure that access to those _two_ values happen in a guaranteed order. Neither > the compiler nor the cpu actually know what values are related so they just > ensure that all memory access before/after remains before/after in the face of > optimizations. In the case of enqueue, it is also important to use Release for the operation that makes the new node available for iterating threads. Because the associated state in this case is the node data (e.g. initialized histogram). And consequently the iteration must use Acquire when it discovers new nodes to acquire visibility over the node data.
Chris, any further comments?
A few more comments, mostly nits. Also, do we want to house this in base/metrics, as discussed offline? https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:56: enum { You can type this to be the same as the 'flags_' field. This is also worthy of a small comment that these values are what get stored in 'flags_'. https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:57: kFlagCorrupt, It's more common in Chrome to define the flags in the (1 << 0), (1 << 1), etc, so the shift is already done in the constant. Makes reading/writing code more readable. https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:72: flags, loaded_flags, new_flags) == loaded_flags) { All of this is more readable if you hide the bitshift in the enum. https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:231: // for completeness sake. What will your hardening code make of a partial block header? (ie: suppose there are insufficient bytes left before the page boundary to house a full block header) Do you maybe want to flood fill with some 'padding' cookie as a more general solution? (Also more visually obvious when looking at memory and trying to make sense of a truncated block header.) https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... File base/memory/shared_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:57: kTypeIdAny = 0 // Match any type-id inside GetAsObject(). Also worth having a int32_t kReferenceInvalid = 0? https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:78: // Get an object referenced by an |ref|. For safety reasons, the |type_id| by a* |ref| https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:89: // segment, it makes not guarantees of the validity of the data within the no* https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:93: T* GetAsObject(Reference ref, uint32_t type_id) { This function also wants a counterpart to convert a typed pointer to an offset, that also checks validity, size, etc. (Especially if you want users to be able to bake-in 'reference' based pointers. I'd keep the exact format of Reference as completely opaque and expect users to convert back and forth via the API. This also allows you to change how its represented in the future.) https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:114: void MakeIterable(Reference ref); Can't this return true on success, false on failure? Avoids having to call 'IsCorrupted' after every call to MakeIterable. https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:120: // return _less_ than could actually be allocated. This last sentence is at odds with the sentence before it. Either suffix it with " at the exact moment of the query.", or simply remove it altogether? https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:124: // returns both the reference reference to the object as well as the |type_id| reference reference https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:153: char* mem_base_; // Same. (char because sizeof guaranteed 1) This can be replaced with a pair of member functions: char* mem_base() { return reinterpret_cast<char*>(shared_meta_); } const char* mem_base() const { return reinterpret_cast<const char*>(shared_meta_); } https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:156: subtle::Atomic32 corrupted_; // TODO(bcwhite): Use std::atomic<char> when ok. This should be called flags_, no? As it also stores a 'full' bit? I'd also nuke the <char> comment, as you have an equivalent one in the .cc file.
https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:56: enum { On 2015/11/09 16:53:51, chrisha wrote: > You can type this to be the same as the 'flags_' field. This is also worthy of a > small comment that these values are what get stored in 'flags_'. Done. https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:57: kFlagCorrupt, On 2015/11/09 16:53:51, chrisha wrote: > It's more common in Chrome to define the flags in the (1 << 0), (1 << 1), etc, > so the shift is already done in the constant. Makes reading/writing code more > readable. Done. https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:72: flags, loaded_flags, new_flags) == loaded_flags) { On 2015/11/09 16:53:51, chrisha wrote: > All of this is more readable if you hide the bitshift in the enum. Done. https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:231: // for completeness sake. On 2015/11/09 16:53:51, chrisha wrote: > What will your hardening code make of a partial block header? (ie: suppose there > are insufficient bytes left before the page boundary to house a full block > header) Do you maybe want to flood fill with some 'padding' cookie as a more > general solution? (Also more visually obvious when looking at memory and trying > to make sense of a truncated block header.) There is code below to round-up an allocation size should the request leave insufficient remaining space for even a block header. There should be a check here, though, to protect against corruption. https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... File base/memory/shared_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:57: kTypeIdAny = 0 // Match any type-id inside GetAsObject(). On 2015/11/09 16:53:51, chrisha wrote: > Also worth having a int32_t kReferenceInvalid = 0? I don't think it's necessary. Like with pointers, the check for validity inside Chrome code is always "if (myref) { ... }" https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:78: // Get an object referenced by an |ref|. For safety reasons, the |type_id| On 2015/11/09 16:53:51, chrisha wrote: > by a* |ref| Doh! I *knew* I was going to miss something in the rename from object->ref. https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:89: // segment, it makes not guarantees of the validity of the data within the On 2015/11/09 16:53:51, chrisha wrote: > no* Done. https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:93: T* GetAsObject(Reference ref, uint32_t type_id) { On 2015/11/09 16:53:51, chrisha wrote: > This function also wants a counterpart to convert a typed pointer to an offset, > that also checks validity, size, etc. > > (Especially if you want users to be able to bake-in 'reference' based pointers. > I'd keep the exact format of Reference as completely opaque and expect users to > convert back and forth via the API. This also allows you to change how its > represented in the future.) Possibly. I've left it out because I haven't come across cause to use it and I've generally been asked to leave out code "for future use". https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:114: void MakeIterable(Reference ref); On 2015/11/09 16:53:51, chrisha wrote: > Can't this return true on success, false on failure? Avoids having to call > 'IsCorrupted' after every call to MakeIterable. It's never supposed to fail so I don't think there as a need to check. If there is corruption then it's likely there is nothing the caller can do. A return flag would indicate that the caller should check the value and act but that would be code never executed except in the presence of a serious failure. https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:120: // return _less_ than could actually be allocated. On 2015/11/09 16:53:51, chrisha wrote: > This last sentence is at odds with the sentence before it. Either suffix it with > " at the exact moment of the query.", or simply remove it altogether? Done. https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:124: // returns both the reference reference to the object as well as the |type_id| On 2015/11/09 16:53:51, chrisha wrote: > reference reference Done. https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:153: char* mem_base_; // Same. (char because sizeof guaranteed 1) On 2015/11/09 16:53:51, chrisha wrote: > This can be replaced with a pair of member functions: > > char* mem_base() { return reinterpret_cast<char*>(shared_meta_); } > const char* mem_base() const { return reinterpret_cast<const > char*>(shared_meta_); } Done. https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:156: subtle::Atomic32 corrupted_; // TODO(bcwhite): Use std::atomic<char> when ok. On 2015/11/09 16:53:51, chrisha wrote: > This should be called flags_, no? As it also stores a 'full' bit? I'd also nuke > the <char> comment, as you have an equivalent one in the .cc file. It's only a single flag and has either a 0 or 1 value. There has to be a "local" copy that can't be reset by an external process.
Looks mostly good. https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... File base/memory/shared_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:19: // This class provides for thread-secure (i.e. safe against other threads Has this design been reviewed by the security team already? I'd say corruption detection is nice, but that's a best-effort security mechanism and should not be relied on much (I'm not a real security engineer though). https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... File base/memory/shared_memory_allocator_unittest.cc (right): https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator_unittest.cc:20: You're missing the tests for: -- using the allocator across processes -- reusing the already-initialized memory buffer (with kGlobalCookie set)
Looks mostly good.
I cannot approve this until I get an answer to this question: why is this complexity in the security boundary preferable to registering a process-private memory region in breakpad and allowing it to pass the data to the browser (or persist it to a file directly) using existing mechanisms? Some numbers: $ wc -l base/memory/shared_memory_allocator.cc 460 base/memory/shared_memory_allocator.cc $ grep subtle base/memory/shared_memory_allocator.cc | wc -l 32 Seems not worth the maintenance costs.
On 2015/11/09 18:15:22, pasko wrote: > I cannot approve this until I get an answer to this question: > > why is this complexity in the security boundary preferable to registering a > process-private memory region in breakpad and allowing it to pass the data to > the browser (or persist it to a file directly) using existing mechanisms? Not all processes have breakpad available. Setup is one example but there are more; it may want to use a memory-mapped file or it may just want to write the whole memory segment out on exit without having to import code to otherwise serialize the histograms. There are also issues, as I recall, of breakpad being able to do complex operations and memory allocations during clean-up, something that would definitely be necessary with the existing state. *How* the allocator is used will vary. It can be used with Breakpad and a local block of memory but then, during a crash, Breakpad would have to do nothing more than open file, write memory segment, close file. The allocator will still have to be thread-safe, though, whether that be a lock or a lockfree algorithm. There's also the ability to record metrics from inside Breakpad to be accessed by the parent process. That's not to say it can't be done in other ways but this allocator provides flexibility in how it is done.
https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... File base/memory/shared_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:19: // This class provides for thread-secure (i.e. safe against other threads On 2015/11/09 18:03:46, Alexander Potapenko wrote: > Has this design been reviewed by the security team already? > I'd say corruption detection is nice, but that's a best-effort security > mechanism and should not be relied on much (I'm not a real security engineer > though). Yes, from the security team Justin Schuh gave some initial "moderately positive" comments and Matthew Dempsky from has been over the design. JF Bastien also wants to take a closer look when he's back from vacation. https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... File base/memory/shared_memory_allocator_unittest.cc (right): https://codereview.chromium.org/1410213004/diff/450001/base/memory/shared_mem... base/memory/shared_memory_allocator_unittest.cc:20: On 2015/11/09 18:03:46, Alexander Potapenko wrote: > You're missing the tests for: > -- using the allocator across processes The ParallelismTest and CorruptionTest are intra-process but create multiple SharedMemoryAllocator objects and thus behave the same as if they were multiple independent processes. > -- reusing the already-initialized memory buffer (with kGlobalCookie set) No explicit test but exists as part of the above.
Thank you for the background, Brian. That's very helpful. On 2015/11/09 19:20:58, bcwhite wrote: > On 2015/11/09 18:15:22, pasko wrote: > > I cannot approve this until I get an answer to this question: > > > > why is this complexity in the security boundary preferable to registering a > > process-private memory region in breakpad and allowing it to pass the data to > > the browser (or persist it to a file directly) using existing mechanisms? > > Not all processes have breakpad available. Setup is one example but there are > more; Can you enumerate all of them? I am quite interested, also interested in knowing why we don't have breakpad for all these. > it may want to use a memory-mapped file or it may just want to write the > whole memory segment out on exit without having to import code to otherwise > serialize the histograms. If we write 'whole memory segment on exit' wouldn't we have race conditions in C++ memory model sense? With Breakpad we know that there is only one thread remaining, we can copy the data with just memcpy() after that. > There are also issues, as I recall, of breakpad being able to do complex > operations and memory allocations during clean-up, something that would > definitely be necessary with the existing state. Not sure what it is about. Copying data from a buffer of known size into a pipe can be done without any allocations. > *How* the allocator is used will vary. It can be used with Breakpad and a local > block of memory but then, during a crash, Breakpad would have to do nothing more > than open file, write memory segment, close file. The allocator will still have > to be thread-safe, though, whether that be a lock or a lockfree algorithm. For this we do not need shared memory. Code with locks are less error-prone to write and maintain than lockfree code. > There's also the ability to record metrics from inside Breakpad to be accessed > by the parent process. Is it worth the complexity? How many metrics does one need to record from Breakpad? > That's not to say it can't be done in other ways but this allocator provides > flexibility in how it is done. This flexibility comes at maintenance cost and unpleasant bus factor. I am not convinced that the tradeoff we are making is the right one.
> > > why is this complexity in the security boundary preferable to registering a > > > process-private memory region in breakpad and allowing it to pass the data > to > > > the browser (or persist it to a file directly) using existing mechanisms? > > > > Not all processes have breakpad available. Setup is one example but there are > > more; > > Can you enumerate all of them? I am quite interested, also interested in knowing > why we don't have breakpad for all these. No, I can't enumerate all of them because I'm not familiar with all of Chrome. Chris said in an email: It is desirable to have a reusable UMA metric writing component that can be easily used by other processes and software in the Chrome ecosystem (installer, foil, asan, browser process watcher, etc). These want to be able to leave UMA data in a convenient format for a browser process to pick up reports and deliver them to the server. The persistent layout we've been talking about doubles as a file format. > > it may want to use a memory-mapped file or it may just want to write the > > whole memory segment out on exit without having to import code to otherwise > > serialize the histograms. > > If we write 'whole memory segment on exit' wouldn't we have race conditions in > C++ memory model sense? With Breakpad we know that there is only one thread > remaining, we can copy the data with just memcpy() after that. Writing the entire block out to a file would be done when the process was shutting down so there wouldn't be other threads running. But even if there were, we don't care. Incomplete allocations aren't important (we're shutting down) and a missed histogram update (or even corruption) isn't critical. > > There are also issues, as I recall, of breakpad being able to do complex > > operations and memory allocations during clean-up, something that would > > definitely be necessary with the existing state. > > Not sure what it is about. Copying data from a buffer of known size into a pipe > can be done without any allocations. Histograms don't offer that. They "serialize" by creating a snapshot (i.e. duplicate) of the information that can then be uploaded to UMA. Sure it's possible to do that in a different manner but it would be a parallel set of code that adds no value other than "emergency read" semantics. The allocator is what stores all histogram data in a "buffer of known size". > > *How* the allocator is used will vary. It can be used with Breakpad and a > local > > block of memory but then, during a crash, Breakpad would have to do nothing > more > > than open file, write memory segment, close file. The allocator will still > have > > to be thread-safe, though, whether that be a lock or a lockfree algorithm. > > For this we do not need shared memory. Code with locks are less error-prone to > write and maintain than lockfree code. Correct. But I'm not convinced that all processes can be relied upon to always handle their own clean-up on all operating systems. Histograms span the entire code-base. While the lockfree allocator is certainly more complex than a basic lock one, it isn't that difficult to understand and it's more efficient (which is important for histograms though allocation is only a tiny part of histogram operations). > > There's also the ability to record metrics from inside Breakpad to be accessed > > by the parent process. > > Is it worth the complexity? How many metrics does one need to record from > Breakpad? I couldn't say but the number doesn't really matter. It only takes one if it's an important one. > > That's not to say it can't be done in other ways but this allocator provides > > flexibility in how it is done. > > This flexibility comes at maintenance cost and unpleasant bus factor. I am not > convinced that the tradeoff we are making is the right one. I understand. The complexities and, more importantly, _subtleties_ of a lockfree algorithm are cause for concern. Even with a moderately simple algorithm it requires considerable brain-power to consider the myriad of possibilities in which multiple concurrent threads could move through the code. Couple that with optimizing compilers and out-of-order execution on the CPU and great care must be taken. Testing is hard and complete testing is impossible. While I perhaps undertook the lockfree approach more lightly than I should have, I still believe it's the right way forward. There's no single convincing reason; it's the culmination of everything that tells me this is the best way.
OK, thanks for responses. Given the goal of "easy" creation of UMA regions in various binaries, this makes sense. Overall design sounds good. I don't have many cycles to make a detailed review right now, and I trust the opinion of dvyukov@, so please go ahead with remaining reviewers, I'll probably take a look at this after it's committed.
mdempsky@chromium.org changed reviewers: + mdempsky@chromium.org
(I'm going to assume others have checked the algorithm for correctness in non-malicious scenarios, so I'm just focusing on security-specific concerns.) https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:16: // is far simpler than the alternative. I notice that 1) there's no checks to ensure the size_t values still fit into int32_t, and 2) lots of C style casts. How about adding an "int32_t SizeToInt(size_t)" function so we can consistently D?CHECK, and SizeToInt(x) is a little more concise than static_cast<int32_t>(x) everywhere. Also lets you put this comment on SizeToInt to explain why it's necessary. https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:112: // This can't be a constant because SharedMetadata is a private definition. FWIW, this is valid C++: class Foo { struct Bar; static const size_t kBarOff; }; struct Foo::Bar { int x; }; const size_t Foo::kBarOff = offsetof(Foo::Bar, x); https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:114: #define REF_NULL 0 // the equivalest NULL value for a reference typo: equivalent https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:394: subtle::NoBarrier_Store(&corrupted_, 1); (I take individual SharedMemoryAllocator objects meant to be safe for concurrent use within a process?) https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:438: if (block->size < size) Like JF and I mentioned, for C++ correctness, block should be a pointer to a volatile BlockHeader and *all* accesses to its member variables need to be atomic to guard against malicious processes. (Probably GetAsObject, etc. should return pointers-to-volatile too to help enforce this.) https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... File base/memory/shared_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:84: // TIME before accessing it or risk crashing! Once dereferenced, the pointer I'm a little unclear on these last two sentences. What precisely does "accessing it" refer to? My first impression is it means dereferencing the pointer, which sounds like I need to do something like: // Get my int pointer. int* p = GetAsObject<int>(ref, kIntType); // Check that it's not-null before dereferencing. if (p == nullptr) return; *p++; // [... later on in the same function...] // I need to check *again* that p is still not-null?? if (p == nullptr) return; *p++; Also, "once dereferenced, the pointer is safe to reuse forever" sounds confusing. Dereferencing to me means pointer dereferencing, but I'm wondering if maybe here it means using GetAsObject to convert a Reference into a T*? https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:122: // Iterating uses a |state| structure (initialized by CreateIterator) and nit: I think "uses an |Iterator| structure" is meant here? https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:128: int32_t GetNextIterable(Iterator* state, uint32_t* type_id); Should the return type be Reference? https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:136: // shared segment, possibly my a malicious actor. Once detected, future typo: s/my/by/ https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:148: // This convenience function eliminates constant casting of the base pointer nit: Say "repeated casting" since "constant casting" might be misinterpreted as referring to const_cast? https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... base/memory/shared_memory_allocator.h:158: char* mem_base_; // Same. (char because sizeof guaranteed 1) Same as what?
https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:438: if (block->size < size) On 2015/11/10 19:47:55, mdempsky wrote: > Like JF and I mentioned, for C++ correctness, block should be a pointer to a > volatile BlockHeader and *all* accesses to its member variables need to be > atomic to guard against malicious processes. (Probably GetAsObject, etc. should > return pointers-to-volatile too to help enforce this.) By "volatile" did you mean "atomic"? (this is basically the same with base/atomicops.h, but I don't think C++11 atomics have volatile semantics)
https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:16: // is far simpler than the alternative. On 2015/11/10 19:47:55, mdempsky wrote: > I notice that 1) there's no checks to ensure the size_t values still fit into > int32_t, and 2) lots of C style casts. How about adding an "int32_t > SizeToInt(size_t)" function so we can consistently D?CHECK, and SizeToInt(x) is > a little more concise than static_cast<int32_t>(x) everywhere. Also lets you > put this comment on SizeToInt to explain why it's necessary. There are size_t checks where it's used (which is only as part of the external interface), for example on line #196: usize > (size_t)std::numeric_limits<int32_t>::max() I used c-style casts for constants like "(int32_t)sizeof(BlockHeader)" because it's just so much clearer and, well, it's an immediate constant. https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:112: // This can't be a constant because SharedMetadata is a private definition. On 2015/11/10 19:47:55, mdempsky wrote: > FWIW, this is valid C++: > > class Foo { > struct Bar; > static const size_t kBarOff; > }; > > struct Foo::Bar { > int x; > }; > const size_t Foo::kBarOff = offsetof(Foo::Bar, x); That would require the constant to be declared in the header file, no? https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:114: #define REF_NULL 0 // the equivalest NULL value for a reference On 2015/11/10 19:47:55, mdempsky wrote: > typo: equivalent Done. https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:394: subtle::NoBarrier_Store(&corrupted_, 1); On 2015/11/10 19:47:55, mdempsky wrote: > (I take individual SharedMemoryAllocator objects meant to be safe for concurrent > use within a process?) Yes. https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:438: if (block->size < size) On 2015/11/10 19:47:55, mdempsky wrote: > Like JF and I mentioned, for C++ correctness, block should be a pointer to a > volatile BlockHeader and *all* accesses to its member variables need to be > atomic to guard against malicious processes. (Probably GetAsObject, etc. should > return pointers-to-volatile too to help enforce this.) You mean "volatile T* GetAsObject(...) {...}"? I think any objects being fetched and expecting continued shared access would have to use Atomic data values so wouldn't this be redundant? Near as I can tell, volatile would only help with reads. Writes to volatile data won't necessary be immediately visible to other cores and so volatile isn't sufficient. To be honest, I don't understand what use "volatile" has in these days of multi-core processing.
https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:16: // is far simpler than the alternative. On 2015/11/10 21:17:34, bcwhite wrote: > I used c-style casts for constants like "(int32_t)sizeof(BlockHeader)" because > it's just so much clearer and, well, it's an immediate constant. Style guide says "Do not use C-style casts." https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:112: // This can't be a constant because SharedMetadata is a private definition. On 2015/11/10 21:17:34, bcwhite wrote: > That would require the constant to be declared in the header file, no? Declared, yes. Is that problematic somehow? https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:438: if (block->size < size) On 2015/11/10 20:47:10, Alexander Potapenko wrote: > By "volatile" did you mean "atomic"? JF said they need to both volatile and atomic (and I trust that he understand C++ memory model correctness requirements better than I do). I'll have to defer to him for elaborations on why volatile is necessary. https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:438: if (block->size < size) On 2015/11/10 21:17:35, bcwhite wrote: > You mean "volatile T* GetAsObject(...) {...}"? Yes. > I think any objects being > fetched and expecting continued shared access would have to use Atomic data > values so wouldn't this be redundant? (Again, have to defer to JF.)
https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... File base/memory/shared_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/470001/base/memory/shared_mem... base/memory/shared_memory_allocator.cc:112: // This can't be a constant because SharedMetadata is a private definition. On 2015/11/10 22:45:52, mdempsky wrote: > On 2015/11/10 21:17:34, bcwhite wrote: > > That would require the constant to be declared in the header file, no? > > Declared, yes. Is that problematic somehow? Nope. Just making sure I understood.
FYI: I'm going to rename this to PersistentMemoryAllocator to reflect that data within it is persistent and cannot be released as well as to correlate with the fact that it can be used with other mechanisms (such as shared memory, mmap'd files, or death-rattle handlers) to survive process termination. Support for operating in shared memory across multiple processes is essential for some use cases but not the primary purpose.
Description was changed from ========== Create "shared memory allocator" for sharing objects. This allocator reserves memory within a specific block that itself can be persisted and/or shared between processes by various means. It is lock-free for operation on all platforms (even Android which does not support inter-process locks) and self-checking to deal with security concerns where not all processes accessing the memory are fully trustworthy. BUG=546019 ========== to ========== Create "persistent memory allocator" for persisting and sharing objects. This allocator reserves memory within a specific block that itself can be persisted and/or shared between processes by various means. It is lock-free for operation on all platforms (even Android which does not support inter-process locks) and self-checking to deal with security concerns where not all processes accessing the memory are fully trustworthy. BUG=546019 ==========
https://codereview.chromium.org/1410213004/diff/530001/base/memory/persistent... File base/memory/persistent_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/530001/base/memory/persistent... base/memory/persistent_memory_allocator.cc:234: int32_t size = static_cast<int32_t>(usize + sizeof(BlockHeader)); Can "usize + sizeof(BlockHeader)" overflow int32_t?
https://codereview.chromium.org/1410213004/diff/530001/base/memory/persistent... File base/memory/persistent_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/530001/base/memory/persistent... base/memory/persistent_memory_allocator.cc:234: int32_t size = static_cast<int32_t>(usize + sizeof(BlockHeader)); On 2015/11/16 16:01:51, Alexander Potapenko wrote: > Can "usize + sizeof(BlockHeader)" overflow int32_t? Yes, but then... umm, yeah. Right. I used to have a check here for too big but merged it into "size <= sizeof(BlockHeader)" which will catch overflow but not loss of precision. Good catch; I'll restore the other check.
https://codereview.chromium.org/1410213004/diff/530001/base/memory/persistent... File base/memory/persistent_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/530001/base/memory/persistent... base/memory/persistent_memory_allocator.cc:234: int32_t size = static_cast<int32_t>(usize + sizeof(BlockHeader)); On 2015/11/17 01:35:08, bcwhite wrote: > On 2015/11/16 16:01:51, Alexander Potapenko wrote: > > Can "usize + sizeof(BlockHeader)" overflow int32_t? > > Yes, but then... umm, yeah. Right. I used to have a check here for too big but > merged it into "size <= sizeof(BlockHeader)" which will catch overflow but not > loss of precision. Good catch; I'll restore the other check. Oh, wait. There is a "usize > int32::max" as well so it is checked. Definitely too late in the day for me.
https://codereview.chromium.org/1410213004/diff/530001/base/memory/persistent... File base/memory/persistent_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/530001/base/memory/persistent... base/memory/persistent_memory_allocator.cc:234: int32_t size = static_cast<int32_t>(usize + sizeof(BlockHeader)); On 2015/11/17 02:39:26, bcwhite wrote: > On 2015/11/17 01:35:08, bcwhite wrote: > > On 2015/11/16 16:01:51, Alexander Potapenko wrote: > > > Can "usize + sizeof(BlockHeader)" overflow int32_t? > > > > Yes, but then... umm, yeah. Right. I used to have a check here for too big > but > > merged it into "size <= sizeof(BlockHeader)" which will catch overflow but not > > loss of precision. Good catch; I'll restore the other check. > > Oh, wait. There is a "usize > int32::max" as well so it is checked. Definitely > too late in the day for me. Consider usize = int32::max - 1. In the case sizeof(BlockHeader) > 1, "usize + sizeof(BlockHeader)" is going to overflow int32_t, which is going to lead to undefined behavior. Because the compiler is free to assume usize + sizeof(BlockHeader) never overflows, it may safely discard the "size <= (int)sizeof(BlockHeader)" check at line 237 (see the examples in http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html). As a result the NOTREACHED() branch will never be taken, despite we're allocating too much memory. I think making |size| an uint32_t is the easiest way to fix this problem (still need to carefully look at further uses of this variable for the case it overflows somewhere else).
> > > > Can "usize + sizeof(BlockHeader)" overflow int32_t? > > > > > > Yes, but then... umm, yeah. Right. I used to have a check here for too big > > but > > > merged it into "size <= sizeof(BlockHeader)" which will catch overflow but > not > > > loss of precision. Good catch; I'll restore the other check. > > > > Oh, wait. There is a "usize > int32::max" as well so it is checked. > Definitely > > too late in the day for me. > > Consider usize = int32::max - 1. > In the case sizeof(BlockHeader) > 1, "usize + sizeof(BlockHeader)" is going to > overflow int32_t, which is going to lead to undefined behavior. That's pretty warped stuff. In essence, overflowing a signed integer is well-defined but converting from a value outside of the range of the destination type is implementation-defined. This should protect against that, then: // Round up the requested size, plus header, to the next allocation alignment. int32_t size = static_cast<int32_t>(usize + sizeof(BlockHeader)); size = (size + (kAllocAlignment - 1)) & ~(kAllocAlignment - 1); if (usize > (size_t)kSegmentMaxSize - sizeof(BlockHeader) || size <= (int)sizeof(BlockHeader) || size > mem_page_) { NOTREACHED(); return kReferenceNull; } > I think making |size| an uint32_t is the easiest way to fix this problem (still > need to carefully look at further uses of this variable for the case it > overflows somewhere else). I would if Atomic32 were unsigned but it's not. Best to convert the input parameter to a int32_t as soon as possible and in one place than to deal with the signed/unsigned mix through the rest of the code.
On 2015/11/17 13:40:10, bcwhite wrote: > > > > > Can "usize + sizeof(BlockHeader)" overflow int32_t? > > > > > > > > Yes, but then... umm, yeah. Right. I used to have a check here for too > big > > > but > > > > merged it into "size <= sizeof(BlockHeader)" which will catch overflow but > > not > > > > loss of precision. Good catch; I'll restore the other check. > > > > > > Oh, wait. There is a "usize > int32::max" as well so it is checked. > > Definitely > > > too late in the day for me. > > > > Consider usize = int32::max - 1. > > In the case sizeof(BlockHeader) > 1, "usize + sizeof(BlockHeader)" is going to > > overflow int32_t, which is going to lead to undefined behavior. > > That's pretty warped stuff. In essence, overflowing a signed integer is > well-defined Only if you're compiling with -fwrapv, otherwise the result is undefined. > but converting from a value outside of the range of the destination > type is implementation-defined. > This should protect against that, then: > > // Round up the requested size, plus header, to the next allocation alignment. > int32_t size = static_cast<int32_t>(usize + sizeof(BlockHeader)); As mentioned before, 'usize + sizeof(BlockHeader)' is already UB, the code below isn't helping. > size = (size + (kAllocAlignment - 1)) & ~(kAllocAlignment - 1); > if (usize > (size_t)kSegmentMaxSize - sizeof(BlockHeader) || > size <= (int)sizeof(BlockHeader) || size > mem_page_) { > NOTREACHED(); > return kReferenceNull; > } > > > > I think making |size| an uint32_t is the easiest way to fix this problem > (still > > need to carefully look at further uses of this variable for the case it > > overflows somewhere else). > > I would if Atomic32 were unsigned but it's not. Best to convert the input > parameter to a int32_t as soon as possible and in one place than to deal with > the signed/unsigned mix through the rest of the code. Ok, how about checking for the overflow before it actually happens then: if (usize < int32::max - sizeof(BlockHeader)) size = static_cast<int32_t>(usize + sizeof(BlockHeader));
> > This should protect against that, then: > > > > // Round up the requested size, plus header, to the next allocation > alignment. > > int32_t size = static_cast<int32_t>(usize + sizeof(BlockHeader)); > As mentioned before, 'usize + sizeof(BlockHeader)' is already UB, the code below > isn't helping. The "if" statement below validates "usize" directly and will attempt to validate "size" only if there was no possible way that this operation could have overflowed in any way. Is there something else? I guess I'm assuming that the machine can't throw an exception on overflow, just that the result would be implementation-defined. This was previously two conditions, pretty much as you gave below. Easy enough to do again. > > size = (size + (kAllocAlignment - 1)) & ~(kAllocAlignment - 1); > > if (usize > (size_t)kSegmentMaxSize - sizeof(BlockHeader) || > > size <= (int)sizeof(BlockHeader) || size > mem_page_) { > > NOTREACHED(); > > return kReferenceNull; > > } > > Ok, how about checking for the overflow before it actually happens then: > > if (usize < int32::max - sizeof(BlockHeader)) > size = static_cast<int32_t>(usize + sizeof(BlockHeader));
Sorry, looks like I've misunderstood the code. The following line: int32_t size = static_cast<int32_t>(usize + sizeof(BlockHeader)); does not contain undefined behavior, because you're adding up unsigned values, for which the overflow is defined. The cast of size_t to int32_t is implementation-defined, so probably not scary enough. However on the next line: size = (size + (kAllocAlignment - 1)) & ~(kAllocAlignment - 1); |size| may finally overflow. I suggest making this addition in the size_t land and then converting the result to int32_t just for the case. It's harder to reason about which conditions the compiler may choose to discard because of the overflow, but another reason to do so is that UBSan is gonna report this bug when we enable it in Chromium. I'll let the security team chime in for the possible security consequences, if any. Modulo the fact no signed operation overflows the check you've written so far look fine.
> Sorry, looks like I've misunderstood the code. > > The following line: > int32_t size = static_cast<int32_t>(usize + sizeof(BlockHeader)); > does not contain undefined behavior, because you're adding up unsigned values, > for which the overflow is defined. > The cast of size_t to int32_t is implementation-defined, so probably not scary > enough. > > However on the next line: > size = (size + (kAllocAlignment - 1)) & ~(kAllocAlignment - 1); > |size| may finally overflow. It could, but only in cases where the check of "usize > (size_t)kSegmentMaxSize - sizeof(BlockHeader)" would fail below and thus the (questionable) result would be completely ignored anyway. Still, UBSan in unlikely to notice that subtlety so best to make things explicit just to avoid future headaches. > I suggest making this addition in the size_t land and then converting the result > to int32_t just for the case. > It's harder to reason about which conditions the compiler may choose to discard > because of the overflow, but another reason to do so is that UBSan is gonna > report this bug when we enable it in Chromium. I'll let the security team chime > in for the possible security consequences, if any.
jfb@chromium.org changed reviewers: + jfb@chromium.org
(post-vacation review as requested) I haven't looked in-depth yet, but two things come to mind: (1) Could you instead use std::atomic? thakis said elsewhere that he thinks we can start using them now since the toolchains should all support them. That would simplify things significantly. (2) Are you using these atomics to communicate across processes? If so you should use volatile atomic to denote "external modification". I'm being a bit of a pedant here, but shared memory is ill-defined in C++ and volatile atomic is well suited to express what you want: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2016.html This is one of the gotchas about moving from atomicops.h to std::atomic, but in exchange std::atomic gives you unsigned ints, as well as 2's complement signed ints (i.e. no UB on overflow).
> (1) Could you instead use std::atomic? thakis said elsewhere that he thinks we > can start using them now since the toolchains should all support them. That > would simplify things significantly. I asked but was told they were "not quite ready" (comment #27). > (2) Are you using these atomics to communicate across processes? If so you > should use volatile atomic to denote "external modification". I'm being a bit of > a pedant here, but shared memory is ill-defined in C++ and volatile atomic is > well suited to express what you want: > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2016.html > This is one of the gotchas about moving from atomicops.h to std::atomic, but in > exchange std::atomic gives you unsigned ints, as well as 2's complement signed > ints (i.e. no UB on overflow). There were some questions about what "volatile" did (comment #70 and #71). Looking at the link you provided, it seems to claim that "volatile" isn't really useful without "atomic" but nothing about "atomic" not being useful without "volatile". Am I missing something?
On 2015/11/20 18:15:10, bcwhite wrote: > > (1) Could you instead use std::atomic? thakis said elsewhere that he thinks we > > can start using them now since the toolchains should all support them. That > > would simplify things significantly. > > I asked but was told they were "not quite ready" (comment #27). In a separate / more recent discussion thakis said they should now be ready. jschuh is moving in that direction for another patch: https://codereview.chromium.org/1463683002/ > > (2) Are you using these atomics to communicate across processes? If so you > > should use volatile atomic to denote "external modification". I'm being a bit > of > > a pedant here, but shared memory is ill-defined in C++ and volatile atomic is > > well suited to express what you want: > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2016.html > > This is one of the gotchas about moving from atomicops.h to std::atomic, but > in > > exchange std::atomic gives you unsigned ints, as well as 2's complement signed > > ints (i.e. no UB on overflow). > > There were some questions about what "volatile" did (comment #70 and #71). > Looking at the link you provided, it seems to claim that "volatile" isn't really > useful without "atomic" but nothing about "atomic" not being useful without > "volatile". Am I missing something? C++11 atomics can also be volatile, which tells the compiler among other things "there can be external modification which you are not aware of" (it also means the compiler must respect ordering of volatiles between each other, allowing you to use them for signals and such). In the case of shared memory (in the same process, or between different processes) it means the compiler can't do usual atomic optimization such as those in http://wg21.link/n4455. Without volatile the compiler can prove that a variable is use atomically only in certain places, and assume no other accesses occur as "as-if" the atomic uses. Realistically compilers today don't do much of this, but they will in the future and the bugs it'll create will be wonderful to track down. volatile doesn't provide no-tear guarantees, nor write ordering guarantees. It merely tells the compiler that it can't elide or duplicate accesses, and can't reorder volatile accesses w.r.t. each other (it can reorder non-volatile w.r.t. volatile). To get no-tear and happens-before you need atomic, so in the case of the ill-defined shared memory you need volatile+atomic. Furthermore, for cross-process shared memory you need lock-free atomics! In the same process you're guaranteed address-free even when not lock-free, but across process you have no guarantee that non-lock-free operations work properly (they in fact do not on implementations I'm aware of). As an added recommendation, I'd have a check that the types you're using in shared memory have CHECK(var.is_lock_free()). Practically speaking, this means you can't use 64-bit values because they can tear on some ARM and MIPS CPUs.
On 2015/11/20 18:45:13, JF wrote: > On 2015/11/20 18:15:10, bcwhite wrote: > > > (1) Could you instead use std::atomic? thakis said elsewhere that he thinks > we > > > can start using them now since the toolchains should all support them. That > > > would simplify things significantly. > > > > I asked but was told they were "not quite ready" (comment #27). > > In a separate / more recent discussion thakis said they should now be ready. > jschuh is moving in that direction for another patch: > https://codereview.chromium.org/1463683002/ Comment 27 says "almost ready", and that time is now. See thread "C++11 library feature rollout" on chromium-dev from 9 days ago. (Nobody has tried using std::atomic everywhere yet so there might be toolchain issues, but it has a chance of working. We should probably try to use base/atomicops_internals_portable.h everywhere – filed http://crbug.com/559247 for that)
> Comment 27 says "almost ready", and that time is now. See thread "C++11 library > feature rollout" on chromium-dev from 9 days ago. (Nobody has tried using > std::atomic everywhere yet so there might be toolchain issues, but it has a > chance of working. We should probably try to use > base/atomicops_internals_portable.h everywhere – filed http://crbug.com/559247 > for that) Well all right! I've uploaded a new patch that uses std::atomic for the "corrupt_" member flag and also turned on the "volatile" qualifier for all the internal references. If you could take a look to make sure I'm on the right track with this, I'll finish the conversion of the rest of the atomics next week.
https://codereview.chromium.org/1410213004/diff/570001/base/memory/persistent... File base/memory/persistent_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/570001/base/memory/persistent... base/memory/persistent_memory_allocator.cc:464: CHECK(corrupt_.is_lock_free()); It would be useful to explain why this is here (multi-process sharedmem atomics must be lock-free). https://codereview.chromium.org/1410213004/diff/570001/base/memory/persistent... File base/memory/persistent_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/570001/base/memory/persistent... base/memory/persistent_memory_allocator.h:103: // other processes, it is not necessarily the case. The internaal typo "internaal" https://codereview.chromium.org/1410213004/diff/570001/base/memory/persistent... base/memory/persistent_memory_allocator.h:107: reinterpret_cast<volatile T*>(GetBlockData(ref, type_id, sizeof(T)))); I'm not sure I understand this: is the T* returned in shared memory?
https://codereview.chromium.org/1410213004/diff/570001/base/memory/persistent... File base/memory/persistent_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/570001/base/memory/persistent... base/memory/persistent_memory_allocator.h:107: reinterpret_cast<volatile T*>(GetBlockData(ref, type_id, sizeof(T)))); On 2015/11/20 20:43:31, JF wrote: > I'm not sure I understand this: is the T* returned in shared memory? That is correct. Note that it may or may not be shared depending on whether the code that created the Allocator shared the memory or not. There are uses for "persistent allocation" that don't involve actually sharing it with other processes. I think it's best for a user of this class to know that it should recognize returned objects as volatile if it gave volatile memory to the class during construction.
> volatile doesn't provide no-tear guarantees, nor write ordering guarantees. It > merely tells the compiler that it can't elide or duplicate accesses, and can't > reorder volatile accesses w.r.t. each other (it can reorder non-volatile w.r.t. > volatile). If the compiler cannot reorder volatile access with respect to other volatile variables... does that mean it inserts memory barrier instructions? It would do no good for the compiler to forgo this optimization when the CPU could do so internally. Or do CPUs have non-barrier ways of preventing the re-ordering of memory operations?
Dmitry, sorry to trouble you again but would you look over this change to std::atomic to ensure I haven't botched anything? Thanks. https://codereview.chromium.org/1410213004/diff/570001/base/memory/persistent... File base/memory/persistent_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/570001/base/memory/persistent... base/memory/persistent_memory_allocator.cc:464: CHECK(corrupt_.is_lock_free()); On 2015/11/20 20:43:31, JF wrote: > It would be useful to explain why this is here (multi-process sharedmem atomics > must be lock-free). Done. https://codereview.chromium.org/1410213004/diff/570001/base/memory/persistent... File base/memory/persistent_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/570001/base/memory/persistent... base/memory/persistent_memory_allocator.h:103: // other processes, it is not necessarily the case. The internaal On 2015/11/20 20:43:31, JF wrote: > typo "internaal" Done.
Patchset #34 (id:650001) has been deleted
Patchset #33 (id:630001) has been deleted
Patchset #33 (id:670001) has been deleted
On 2015/11/20 22:01:26, bcwhite wrote: > > volatile doesn't provide no-tear guarantees, nor write ordering guarantees. It > > merely tells the compiler that it can't elide or duplicate accesses, and can't > > reorder volatile accesses w.r.t. each other (it can reorder non-volatile > w.r.t. > > volatile). > > If the compiler cannot reorder volatile access with respect to other volatile > variables... does that mean it inserts memory barrier instructions? It would do > no good for the compiler to forgo this optimization when the CPU could do so > internally. Or do CPUs have non-barrier ways of preventing the re-ordering of > memory operations? No barriers are inserted, so what you're inferring is exactly right: using volatile doesn't mean what it naively seems to mean, and CPUs can totally reorder accesses **unless the accesses are to special memory**. Note that special memory (e.g. device memory) isn't defined by the standard, so you're in dangerous waters there and volatile isn't a huge comfort. Sorry for the slow reply, my desktop just died so I had to handle that interrupt. I can take a deeper look at the code, but I'm assuming Dmitry has more context for it already so it may be better if he reviews it.
bcwhite@chromium.org changed reviewers: + thakis@chromium.org
thakis@chromium.org: Please review changes in base/* The algorithm has been vetted; now I need permission to submit it to "base". It has been suggested that I place it under base/metrics/ for now since that is (will be) the only user at this time and let someone else move it back to base/memory/ if it becomes needed elsewhere.
LGTM with nits https://codereview.chromium.org/1410213004/diff/710001/base/memory/persistent... File base/memory/persistent_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/710001/base/memory/persistent... base/memory/persistent_memory_allocator.cc:288: // -sizeof(header) so accouting for that will yield an expected size of We don't pass -sizeof(header) anymore. https://codereview.chromium.org/1410213004/diff/710001/base/memory/persistent... base/memory/persistent_memory_allocator.cc:395: if (block->next.compare_exchange_strong(next, ref, Since this file is so rich with comments and you have strong/weak remark above, it's worth nothing that strong is actually required here, otherwise we will CAS tailptr to kReferenceQueue in the else branch. https://codereview.chromium.org/1410213004/diff/710001/base/memory/persistent... base/memory/persistent_memory_allocator.cc:418: std::memory_order_acquire); FWIW, the holy standard says: 20.7.2.5 template<class T> bool atomic_compare_exchange_strong_explicit( shared_ptr<T>* p, shared_ptr<T>* v, shared_ptr<T> w, memory_order success, memory_order failure); Requires: failure shall not be memory_order_release, memory_order_acq_rel, or stronger than success. So I guess you need success=acq_rel, failure=acquire https://codereview.chromium.org/1410213004/diff/710001/base/memory/persistent... File base/memory/persistent_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/710001/base/memory/persistent... base/memory/persistent_memory_allocator.h:12: #include "base/atomicops.h" Do you still need this include? https://codereview.chromium.org/1410213004/diff/710001/base/memory/persistent... base/memory/persistent_memory_allocator.h:39: // It should be noted that memory doesn't need to actually have zeros written I still don't understand this comment. All memory is like that. Malloc() can well give you non-pinned/non-existent memory as well. OS can also push any page to swap at any moment, so that page suddenly becomes non-pinned. Moreover, even if you write a non-zero value to a page (so that now it is surely exists), but then write zero, OS can observe that and vanish the page asynchronously. The distinction you are talking about is non-important and non-observable from user-space. This remark equally applies to any code that just needs zeros. E.g. is a global variable actually exist in memory, or just "reads as zero"? I think we need to remove this part. Besides being non-necessary (how could it not support such memory?) it can be very confusing for developers who are not familiar with virtual memory/OS internal works.
https://codereview.chromium.org/1410213004/diff/710001/base/memory/persistent... File base/memory/persistent_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/710001/base/memory/persistent... base/memory/persistent_memory_allocator.cc:288: // -sizeof(header) so accouting for that will yield an expected size of On 2015/12/03 20:51:37, Dmitry Vyukov wrote: > We don't pass -sizeof(header) anymore. Heh. The fun of documentation. Fixed. https://codereview.chromium.org/1410213004/diff/710001/base/memory/persistent... base/memory/persistent_memory_allocator.cc:395: if (block->next.compare_exchange_strong(next, ref, On 2015/12/03 20:51:37, Dmitry Vyukov wrote: > Since this file is so rich with comments and you have strong/weak remark above, > it's worth nothing that strong is actually required here, otherwise we will CAS > tailptr to kReferenceQueue in the else branch. Done. https://codereview.chromium.org/1410213004/diff/710001/base/memory/persistent... base/memory/persistent_memory_allocator.cc:418: std::memory_order_acquire); On 2015/12/03 20:51:37, Dmitry Vyukov wrote: > FWIW, the holy standard says: > > 20.7.2.5 > > template<class T> > bool atomic_compare_exchange_strong_explicit( > shared_ptr<T>* p, shared_ptr<T>* v, shared_ptr<T> w, > memory_order success, memory_order failure); > > Requires: failure shall not be memory_order_release, memory_order_acq_rel, or > stronger than > success. > > > > So I guess you need success=acq_rel, failure=acquire Done. Of note, the standard also says... http://en.cppreference.com/w/cpp/atomic/atomic/compare_exchange In the (2) and (4) versions [single "order" parameter] order is used for both read-modify-write and load operations, except that std::memory_order_acquire and std::memory_order_relaxed are used for the load operation if order == std::memory_order_acq_rel, or order == std::memory_order_release respectively. However, MSVC just passes "order" as both "success" and "failure" which then crashes if order was "std::memory_order_acq_rel" because that's not allowed in the "failure" case. Lovely. https://codereview.chromium.org/1410213004/diff/710001/base/memory/persistent... File base/memory/persistent_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/710001/base/memory/persistent... base/memory/persistent_memory_allocator.h:39: // It should be noted that memory doesn't need to actually have zeros written On 2015/12/03 20:51:37, Dmitry Vyukov wrote: > I still don't understand this comment. All memory is like that. Malloc() can > well give you non-pinned/non-existent memory as well. OS can also push any page > to swap at any moment, so that page suddenly becomes non-pinned. Moreover, even > if you write a non-zero value to a page (so that now it is surely exists), but > then write zero, OS can observe that and vanish the page asynchronously. > The distinction you are talking about is non-important and non-observable from > user-space. This remark equally applies to any code that just needs zeros. E.g. > is a global variable actually exist in memory, or just "reads as zero"? I guess it is somewhat redundant since the previous paragraph states the memory must be "clean (i.e. zeroed)". Done. Note, however, that malloc() doesn't guarantee that the memory will be zero if it was un-pinned but now is pinned because malloc doesn't guarantee initialization of any kind. It could pin the memory on read and return whatever garbage was there previously; such might be inefficient but within spec.
Patchset #37 (id:770001) has been deleted
thakis@chromium.org: Please review changes in base/* The algorithm has been vetted; now I need permission to submit it to "base". It has been suggested that I place it under base/metrics/ for now since that is (will be) the only user at this time and let someone else move it back to base/memory/ if it becomes needed elsewhere.
Patchset #39 (id:830001) has been deleted
Patchset #39 (id:850001) has been deleted
Patchset #39 (id:870001) has been deleted
Patchset #39 (id:890001) has been deleted
thakis@chromium.org changed reviewers: + primiano@chromium.org
Yes, if this is for base/metrics it should be in base/metrics for now. (+primiano for all things allocator)
On 2015/12/15 20:17:44, Nico wrote: > Yes, if this is for base/metrics it should be in base/metrics for now. > (+primiano for all things allocator) I'll try to learn context on this tomorrow, but glancing through the CL looks like we might want to keep track of this in bit.ly/memory-infra. Filed crbug.com/570004 to keep track of this. Thanks for looping me in.
Moved to base/metrics. crbug/750004 updated with more information, including original design-doc.
Patchset #41 (id:950001) has been deleted
bcwhite@chromium.org changed reviewers: + asvitkine@chromium.org
Happy New Year! 2 months, 10 reviewers, 42 revisions ... and counting :-) Primiano, would you take a look at this when you get a chance? Thanks.
On 2016/01/07 20:11:56, bcwhite wrote: > Primiano, would you take a look at this when you get a chance? Thanks. I am not an owner here so not sure you need my review here. I think Nico CC-ed me to keep me in the loops on stuff that happens on memory. From my viewpoint, what I care about is what I wrote in crbug.com/570004 If this is allocating new memory, we should keep track of this somewhere. If this is just mapping memory allocated by something else that we already tracking, it's all good. Tracking or not, I don't think this CL should be blocked on the tracking side, which can be done in a separate cl, if needed. I guess you already have enough code here for one cl :)
On 2016/01/11 11:16:22, Primiano Tucci wrote: > On 2016/01/07 20:11:56, bcwhite wrote: > > Primiano, would you take a look at this when you get a chance? Thanks. > > I am not an owner here so not sure you need my review here. > I think Nico CC-ed me to keep me in the loops on stuff that happens on memory. > > From my viewpoint, what I care about is what I wrote in crbug.com/570004 > If this is allocating new memory, we should keep track of this somewhere. > If this is just mapping memory allocated by something else that we already > tracking, it's all good. > Tracking or not, I don't think this CL should be blocked on the tracking side, > which can be done in a separate cl, if needed. > I guess you already have enough code here for one cl :) It's just mapping already-allocated memory. Nico, back to you. Are you okay with it now that it's been moved to base/metrics?
lgtm!
The CQ bit was checked by bcwhite@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from dvyukov@google.com Link to the patchset: https://codereview.chromium.org/1410213004/#ps990001 (title: "rebased")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1410213004/990001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1410213004/990001
The CQ bit was unchecked by asvitkine@chromium.org
Sorry, I just want to clarify whether this was thoroughly reviewed by a Chromium committer. It seems the only L-G-T-Ms are from dvyukov@ (who seems to only have 1 issues reviewed and 1 issued created according to code review and Nico, who didn't provide any comments. Nico, did you thoroughly review this or were you deferring to other reviewers and just providing rubber-stamp approval for base/?
https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... File base/metrics/persistent_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator.cc:150: CHECK(((SharedMetadata*)0)->freeptr.is_lock_free()); Sorry, are you dereferencing a NULL pointer here? https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... File base/metrics/persistent_memory_allocator_unittest.cc (right): https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:74: scoped_ptr<char[]> mem_segment_; Should be protected. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:79: std::string base_name(TEST_NAME); Looks like this test case is testing several scenarios - can you please add comments before each? https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:241: AllocatorThread t1("t1", memory, TEST_MEMORY_SIZE, TEST_MEMORY_PAGE); How about making t1..t5 an array and employing loops here and below? https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:289: memory[offset] = value; I'm concerned this may lead to test flakiness in the case of an actual crash. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:301: // Attempt to cause crashes or loops by expressly creating dangerous coditions. s/coditions/conditions https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:313: EXPECT_EQ(5U, CountIterables()); s/5U/5 (here and at other places, it doesn't really improve readability) https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:320: CountIterables(); // loop: 1-2-3-4-3 Isn't the allocator corrupted already after this line? https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:424: if (FilePersistentMemoryAllocator::IsFileAcceptable(*mmfile)) { Can you rephrase this as: if (minsize==filesize) { ASSERT_TRUE(FilePersistentMemoryAllocator::IsFileAcceptable(*mmfile)); } https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:426: FilePersistentMemoryAllocator allocator(mmfile.release(), 0, ""); This variable is unused, the compiler may warn about it. Do we need it? https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:448: EXPECT_TRUE(allocator.IsCorrupt()); // Gargbage data so it should be. s/Gargbage/Garbage
https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... File base/metrics/persistent_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator.cc:150: CHECK(((SharedMetadata*)0)->freeptr.is_lock_free()); On 2016/01/13 18:34:00, Alexander Potapenko wrote: > Sorry, are you dereferencing a NULL pointer here? No. It's a static method of the type that doesn't access the data but is a fixed value based on the template definition and template parameters. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... File base/metrics/persistent_memory_allocator_unittest.cc (right): https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:74: scoped_ptr<char[]> mem_segment_; On 2016/01/13 18:34:00, Alexander Potapenko wrote: > Should be protected. Done. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:79: std::string base_name(TEST_NAME); On 2016/01/13 18:34:00, Alexander Potapenko wrote: > Looks like this test case is testing several scenarios - can you please add > comments before each? Done. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:241: AllocatorThread t1("t1", memory, TEST_MEMORY_SIZE, TEST_MEMORY_PAGE); On 2016/01/13 18:34:01, Alexander Potapenko wrote: > How about making t1..t5 an array and employing loops here and below? It was more complicated to do so because of the lack of ability to provide constructor parameters to objects inside an array. They could be dynamically allocated but that means scoped_ptrs, etc. This was easier but I can change it if you'd like. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:289: memory[offset] = value; On 2016/01/13 18:34:00, Alexander Potapenko wrote: > I'm concerned this may lead to test flakiness in the case of an actual crash. I'm not sure what you mean by "in the case of an actual crash". It's supposed to crash; or rather, it's supposed to *fail* to crash. Any failure of this particular test means there's a bug in the code somewhere. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:301: // Attempt to cause crashes or loops by expressly creating dangerous coditions. On 2016/01/13 18:34:01, Alexander Potapenko wrote: > s/coditions/conditions Done. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:313: EXPECT_EQ(5U, CountIterables()); On 2016/01/13 18:34:01, Alexander Potapenko wrote: > s/5U/5 > > (here and at other places, it doesn't really improve readability) The "U" is required or there are signed/unsigned comparison errors thrown by the Clang compiler. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:320: CountIterables(); // loop: 1-2-3-4-3 On 2016/01/13 18:34:01, Alexander Potapenko wrote: > Isn't the allocator corrupted already after this line? Corruption doesn't stop or prevent iteration. It's still allowed to try to retrieve as much information as possible. So while yes, the flag is set, it still has to detect the loop. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:424: if (FilePersistentMemoryAllocator::IsFileAcceptable(*mmfile)) { On 2016/01/13 18:34:00, Alexander Potapenko wrote: > Can you rephrase this as: > if (minsize==filesize) { > ASSERT_TRUE(FilePersistentMemoryAllocator::IsFileAcceptable(*mmfile)); > } I need the function to run for all sizes. It must return True for minsize==filesize. For all other cases, either result is allowed. The purpose of running it is to ensure it doesn't crash as data becomes invalid. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:426: FilePersistentMemoryAllocator allocator(mmfile.release(), 0, ""); On 2016/01/13 18:34:00, Alexander Potapenko wrote: > This variable is unused, the compiler may warn about it. Do we need it? The constructor needs to be run to ensure it can't crash on any data that IsFileAcceptable says is okay. Good point about "unused", though. Would be bad in the compiler optimized out the call. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:448: EXPECT_TRUE(allocator.IsCorrupt()); // Gargbage data so it should be. On 2016/01/13 18:34:00, Alexander Potapenko wrote: > s/Gargbage/Garbage Done.
https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... File base/metrics/persistent_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator.cc:313: uint32_t type_id) { OOC: did clang-format put the arguments on the different lines? https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... File base/metrics/persistent_memory_allocator_unittest.cc (right): https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:1: // Copyright 2015 The Chromium Authors. All rights reserved. Dunno if this should be 2016 or it's fine to keep the year in which you started writing this code. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:34: int32_t onething; Not that it makes any difference, but can this be a plain int? https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:98: EXPECT_GE(sizeof(TestObject1) + 7, allocator_->GetAllocSize(block1)); Where does this '7' comes from (here and below)? https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:198: Reference block2 = allocator_->Allocate(TEST_MEMORY_PAGE - 16, 2); What are we testing here? Where does '16' come from? In which case will this constant change? https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:232: unsigned count_; I think it's better to make count_ and iterable_ private and declare PersistentMemoryAllocatorTest a friend class. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:239: TEST_F(PersistentMemoryAllocatorTest, ParallelismTest) { Please add a commend describing what you're testing here. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:241: AllocatorThread t1("t1", memory, TEST_MEMORY_SIZE, TEST_MEMORY_PAGE); On 2016/01/13 21:31:39, bcwhite wrote: > On 2016/01/13 18:34:01, Alexander Potapenko wrote: > > How about making t1..t5 an array and employing loops here and below? > > It was more complicated to do so because of the lack of ability to provide > constructor parameters to objects inside an array. They could be dynamically > allocated but that means scoped_ptrs, etc. This was easier but I can change it > if you'd like. I just thought this could let you shorten ParallelismTest and CorruptionTest. I don't insist if there're only two such tests, but in the case you're planning to add more, it's better to change that code. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:289: memory[offset] = value; On 2016/01/13 21:31:39, bcwhite wrote: > On 2016/01/13 18:34:00, Alexander Potapenko wrote: > > I'm concerned this may lead to test flakiness in the case of an actual crash. > > I'm not sure what you mean by "in the case of an actual crash". It's supposed > to crash; or rather, it's supposed to *fail* to crash. Any failure of this > particular test means there's a bug in the code somewhere. My point is that in the case there's a bug in the code somewhere it won't be detected every time because you write random bytes to the memory, so the test becomes flaky. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:313: EXPECT_EQ(5U, CountIterables()); On 2016/01/13 21:31:39, bcwhite wrote: > On 2016/01/13 18:34:01, Alexander Potapenko wrote: > > s/5U/5 > > > > (here and at other places, it doesn't really improve readability) > > The "U" is required or there are signed/unsigned comparison errors thrown by the > Clang compiler. This is quite strange. `clang -Wsign-compare` does not complain about a comparison between an unsigned int returned by a function and a positive integer literal (and it sure should not). Are the errors reported because of the EXPECT_EQ macro? https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:320: CountIterables(); // loop: 1-2-3-4-3 On 2016/01/13 21:31:39, bcwhite wrote: > On 2016/01/13 18:34:01, Alexander Potapenko wrote: > > Isn't the allocator corrupted already after this line? > > Corruption doesn't stop or prevent iteration. It's still allowed to try to > retrieve as much information as possible. So while yes, the flag is set, it > still has to detect the loop. If the flag is already set after the first CountIterables() call, let us perform the 'EXPECT_TRUE(allocator_->IsCorrupt())' check right after that call. Otherwise it's hard to tell whether the allocator had been corrupted by the first assignment or the later ones. This brings up the questions whether you need the second and third loop checks (probably yes) and whether you want to setup the blocks before each check (I'd say yes). Also, you don't check the return values of CountIterables() calls - what are your expectations in the case there's a loop? https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:335: EXPECT_FALSE(allocator.IsFull()); We need a positive test for IsFull() somewhere. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:426: FilePersistentMemoryAllocator allocator(mmfile.release(), 0, ""); On 2016/01/13 21:31:39, bcwhite wrote: > On 2016/01/13 18:34:00, Alexander Potapenko wrote: > > This variable is unused, the compiler may warn about it. Do we need it? > > The constructor needs to be run to ensure it can't crash on any data that > IsFileAcceptable says is okay. > > Good point about "unused", though. Would be bad in the compiler optimized out > the call. You can cast the variable to void to prevent that: FilePersistentMemoryAllocator allocator(mmfile.release(), 0, ""); (void)allocator;
Still looking at the .cc file, but some comments for now... https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... File base/metrics/persistent_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.h:113: // contents are valid, just that the pramaters won't cause the program to parameters* https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.h:130: // histograms will be backed my memory provided by this very allocator. backed by* https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.h:131: void CreateHistograms(const std::string& name); I'm not sure I follow what this does? CreateHistograms, plural, but only a single name is passed in? No bucket sizes? No other details? If these have implicit sizes, etc, for tracking internal usage, maybe make that a little more explicit in this comment? The function name is also very generic and could be made more specific. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.h:143: // zero will match any though the size is still checked. NULL is returned kTypeIdAny https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.h:146: // TIME before accessing it or risk crashing! Once dereferenced, the pointer Remove double space after . https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.h:214: void CreateIterator(Iterator* state, Reference starting_after) const; Not clear if the object pointer to by |starting_after| will be visited as the first item in the iteration... this is consistent with typical iterator semantics and the comment, but the name leads one to believe that the iteration will first visit the item following it. Clarify the comment and/or name? https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.h:232: void UpdateStaticHistograms(); Maybe rename CreateHistograms above to CreateStaticHistograms? https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.h:309: class BASE_EXPORT FilePersistentMemoryAllocator Oh? There's no way to have a writable FilePersistent allocator, even if we specify a maximum size up front? This seems like it would be eminently so processes don't need to have a "death rattle" handler, but can rely on the OS to do that for them...
https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... File base/metrics/persistent_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.cc:47: const uint32_t kBlockCookieAllocated = 0xC8799269; Worth putting these values in a typed enum? Say, BlockCookieState? https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.cc:97: uint32_t _padding_; // Padding to match required allocation size. Why the trailing and leading underscores for this name? Does this represent the padding of the whole shared allocator, or is it strictly a placeholder that pads out the actual SharedMetadata structure? If the former, then maybe some static_asserts that ensure the size is as expected, and that any alignments/offsets are met? https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.cc:143: "SharedMetadata is not a multiple of kAllocAlignment"); static_asserts don't need to be in functions... place these directly below the definitions of the of the two structures above? https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.cc:211: DCHECK(mem_size_ >= shared_meta()->freeptr.load()); Can't use DCHECK_GE? https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.cc:257: name + ".UsedKiB", 1, 256 << 10, 100, HistogramBase::kNoFlags); Where is the storage for these histograms? How will these make it out of processes whose intent is to use persistent memory allocator backed histograms? Can they end up being housed in this allocator? Should they be, by convention? https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.cc:275: if (size <= sizeof(BlockHeader) || ref + size >= mem_size_) Wouldn't this imply corruption? https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.cc:402: // writing beyond the allocated space and into unallocated space. Not worth checking the block body? Maybe at least under #ifndef NDEBUG? https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.cc:555: SetFlag(const_cast<volatile std::atomic<uint32_t>*>(&shared_meta()->flags), Multiline "if" body requires braces. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... File base/metrics/persistent_memory_allocator_unittest.cc (right): https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator_unittest.cc:25: // Duplicat from persistent_memory_allocator.cc which is not visible here. Duplicated* Maybe easier to make it visible from the .h, declared as an extern? Guarantees that it won't deviate between the two places... https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator_unittest.cc:55: memset(mem_segment_.get(), 0, TEST_MEMORY_SIZE); nit: Maybe use ::memset, in order to make explicit the fact that you want the one in the global namespace. (This is the norm in many Chromium projects, although not necessarily in Chrome itself. I find it useful because there is also std::memset, and occasionally other projects define their own variants.) https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator_unittest.cc:98: // Validate allocation of test object and make sure it can be rerefernecd referenced* https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator_unittest.cc:173: // Check that an objcets type can be changed. object's https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator_unittest.cc:206: // Ensure that iteraton and access through third allocator works. iteration* https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator_unittest.cc:231: class AllocatorThread : public SimpleThread { This class is worthy of a top level comment describing what it does. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator_unittest.cc:302: // This test doesn't verify anything other than it doesn't crash. It's goal Its* goal https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator_unittest.cc:303: // is to coding errors that aren't otherwise tested for, much like a "fuzzer" is to find coding ^^^^ https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator_unittest.cc:351: // failure, the call will hang and the test killed for taking to long. too* long
https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... File base/metrics/persistent_memory_allocator_unittest.cc (right): https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:1: // Copyright 2015 The Chromium Authors. All rights reserved. On 2016/01/14 10:54:16, Alexander Potapenko wrote: > Dunno if this should be 2016 or it's fine to keep the year in which you started > writing this code. Acknowledged. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:34: int32_t onething; On 2016/01/14 10:54:17, Alexander Potapenko wrote: > Not that it makes any difference, but can this be a plain int? Done. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:98: EXPECT_GE(sizeof(TestObject1) + 7, allocator_->GetAllocSize(block1)); On 2016/01/14 10:54:16, Alexander Potapenko wrote: > Where does this '7' comes from (here and below)? Done. Defined as constant. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:198: Reference block2 = allocator_->Allocate(TEST_MEMORY_PAGE - 16, 2); On 2016/01/14 10:54:16, Alexander Potapenko wrote: > What are we testing here? Where does '16' come from? In which case will this > constant change? Done. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:232: unsigned count_; On 2016/01/14 10:54:17, Alexander Potapenko wrote: > I think it's better to make count_ and iterable_ private and declare > PersistentMemoryAllocatorTest a friend class. That would have to be: FRIEND_TEST_ALL_PREFIXES(PersistentMemoryAllocatorTest, CorruptionTest); FRIEND_TEST_ALL_PREFIXES(PersistentMemoryAllocatorTest, MaliciousTest); Added accessor method instead. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:239: TEST_F(PersistentMemoryAllocatorTest, ParallelismTest) { On 2016/01/14 10:54:16, Alexander Potapenko wrote: > Please add a commend describing what you're testing here. Done. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:241: AllocatorThread t1("t1", memory, TEST_MEMORY_SIZE, TEST_MEMORY_PAGE); On 2016/01/14 10:54:17, Alexander Potapenko wrote: > On 2016/01/13 21:31:39, bcwhite wrote: > > On 2016/01/13 18:34:01, Alexander Potapenko wrote: > > > How about making t1..t5 an array and employing loops here and below? > > > > It was more complicated to do so because of the lack of ability to provide > > constructor parameters to objects inside an array. They could be dynamically > > allocated but that means scoped_ptrs, etc. This was easier but I can change > it > > if you'd like. > > I just thought this could let you shorten ParallelismTest and CorruptionTest. I > don't insist if there're only two such tests, but in the case you're planning to > add more, it's better to change that code. Acknowledged. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:289: memory[offset] = value; On 2016/01/14 10:54:16, Alexander Potapenko wrote: > On 2016/01/13 21:31:39, bcwhite wrote: > > On 2016/01/13 18:34:00, Alexander Potapenko wrote: > > > I'm concerned this may lead to test flakiness in the case of an actual > crash. > > > > I'm not sure what you mean by "in the case of an actual crash". It's supposed > > to crash; or rather, it's supposed to *fail* to crash. Any failure of this > > particular test means there's a bug in the code somewhere. > > My point is that in the case there's a bug in the code somewhere it won't be > detected every time because you write random bytes to the memory, so the test > becomes flaky. I know but at least there is indication that something is wrong somewhere which can then be investigated further. A stack dump, if present, would also narrow the search quite significantly. This test allowed me to find a few bugs when it was first written (and run in a loop for a while). I believe it's important to leave present, more as protection for future changes than in the expectation it will find additional issues with the current code. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:313: EXPECT_EQ(5U, CountIterables()); On 2016/01/14 10:54:16, Alexander Potapenko wrote: > On 2016/01/13 21:31:39, bcwhite wrote: > > On 2016/01/13 18:34:01, Alexander Potapenko wrote: > > > s/5U/5 > > > > > > (here and at other places, it doesn't really improve readability) > > > > The "U" is required or there are signed/unsigned comparison errors thrown by > the > > Clang compiler. > > This is quite strange. > `clang -Wsign-compare` does not complain about a comparison between an unsigned > int returned by a function and a positive integer literal (and it sure should > not). > Are the errors reported because of the EXPECT_EQ macro? The errors were always in the EXPECT_ macros. The Windows build didn't have a problem with it but various trybots did. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:320: CountIterables(); // loop: 1-2-3-4-3 > If the flag is already set after the first CountIterables() call, let us perform > the 'EXPECT_TRUE(allocator_->IsCorrupt())' check right after that call. > Otherwise it's hard to tell whether the allocator had been corrupted by the > first assignment or the later ones. Done. > This brings up the questions whether you need the second and third loop checks > (probably yes) and whether you want to setup the blocks before each check (I'd > say yes). The other loops trials test other edge cases; I'll add some comments. Iteration isn't allowed to change any state in the persistent memory (because it could be a read-only segment) so there shouldn't be any behavior differences when not recreating the entire structure. I could change it to a parameterized test, with the loop-to being the parameter, if you'd prefer. > Also, you don't check the return values of CountIterables() calls - what are > your expectations in the case there's a loop? It's undefined. In the case of a loop, iterable items may get returned multiple times so there's no way of knowing just what the count will be. The only requirement is that it stops. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:335: EXPECT_FALSE(allocator.IsFull()); On 2016/01/14 10:54:17, Alexander Potapenko wrote: > We need a positive test for IsFull() somewhere. There's an implicit one in the Parallelism test. I'll make it explicit. https://codereview.chromium.org/1410213004/diff/990001/base/metrics/persisten... base/metrics/persistent_memory_allocator_unittest.cc:426: FilePersistentMemoryAllocator allocator(mmfile.release(), 0, ""); On 2016/01/14 10:54:16, Alexander Potapenko wrote: > On 2016/01/13 21:31:39, bcwhite wrote: > > On 2016/01/13 18:34:00, Alexander Potapenko wrote: > > > This variable is unused, the compiler may warn about it. Do we need it? > > > > The constructor needs to be run to ensure it can't crash on any data that > > IsFileAcceptable says is okay. > > > > Good point about "unused", though. Would be bad in the compiler optimized out > > the call. > > You can cast the variable to void to prevent that: > FilePersistentMemoryAllocator allocator(mmfile.release(), 0, ""); > (void)allocator; I wasn't aware of that trick. Done. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... File base/metrics/persistent_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.cc:47: const uint32_t kBlockCookieAllocated = 0xC8799269; On 2016/01/19 16:37:41, chrisha wrote: > Worth putting these values in a typed enum? > > Say, BlockCookieState? I don't think so. It's not an "enumeration" and never used externally so there's no need for type-checking. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.cc:97: uint32_t _padding_; // Padding to match required allocation size. On 2016/01/19 16:37:41, chrisha wrote: > Why the trailing and leading underscores for this name? Just to indicate that it's not something that should be used. It's gone in a new upload I have because the "id" is growing to 64 bits. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.cc:143: "SharedMetadata is not a multiple of kAllocAlignment"); On 2016/01/19 16:37:41, chrisha wrote: > static_asserts don't need to be in functions... place these directly below the > definitions of the of the two structures above? They have to be in the scope of the class to be able to access class-private variables. Won't work at file-scope. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.cc:211: DCHECK(mem_size_ >= shared_meta()->freeptr.load()); On 2016/01/19 16:37:41, chrisha wrote: > Can't use DCHECK_GE? Done. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.cc:257: name + ".UsedKiB", 1, 256 << 10, 100, HistogramBase::kNoFlags); On 2016/01/19 16:37:41, chrisha wrote: > Where is the storage for these histograms? How will these make it out of > processes whose intent is to use persistent memory allocator backed histograms? > Can they end up being housed in this allocator? Should they be, by convention? Storage is wherever the histogram factory chooses to put them. Your question is the very reason why these are in CreateHistograms() rather in the ctor (as they were originally). If this allocator is to be used for histogram storage then it can be set after construction but before calling this method. If not or if it's an allocator for something completely different, just call it immediately after construction. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.cc:275: if (size <= sizeof(BlockHeader) || ref + size >= mem_size_) On 2016/01/19 16:37:40, chrisha wrote: > Wouldn't this imply corruption? Done. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.cc:402: // writing beyond the allocated space and into unallocated space. On 2016/01/19 16:37:41, chrisha wrote: > Not worth checking the block body? Maybe at least under #ifndef NDEBUG? Tough call. I could be useful on a release build to prevent violating the expectations of the caller in the presence of a malicious actor but the cost of such is probably too high. As for a debug build, it should never happen because we don't expect malicious actors. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.cc:555: SetFlag(const_cast<volatile std::atomic<uint32_t>*>(&shared_meta()->flags), On 2016/01/19 16:37:41, chrisha wrote: > Multiline "if" body requires braces. Done. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... File base/metrics/persistent_memory_allocator.h (right): https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.h:113: // contents are valid, just that the pramaters won't cause the program to On 2016/01/18 22:40:28, chrisha wrote: > parameters* Done. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.h:130: // histograms will be backed my memory provided by this very allocator. On 2016/01/18 22:40:28, chrisha wrote: > backed by* Done. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.h:131: void CreateHistograms(const std::string& name); On 2016/01/18 22:40:28, chrisha wrote: > I'm not sure I follow what this does? CreateHistograms, plural, but only a > single name is passed in? No bucket sizes? No other details? > > If these have implicit sizes, etc, for tracking internal usage, maybe make that > a little more explicit in this comment? > > The function name is also very generic and could be made more specific. Done. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.h:143: // zero will match any though the size is still checked. NULL is returned On 2016/01/18 22:40:28, chrisha wrote: > kTypeIdAny Done. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.h:146: // TIME before accessing it or risk crashing! Once dereferenced, the pointer On 2016/01/18 22:40:28, chrisha wrote: > Remove double space after . Done. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.h:214: void CreateIterator(Iterator* state, Reference starting_after) const; On 2016/01/18 22:40:28, chrisha wrote: > Not clear if the object pointer to by |starting_after| will be visited as the > first item in the iteration... this is consistent with typical iterator > semantics and the comment, but the name leads one to believe that the iteration > will first visit the item following it. Clarify the comment and/or name? Done. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.h:232: void UpdateStaticHistograms(); On 2016/01/18 22:40:28, chrisha wrote: > Maybe rename CreateHistograms above to CreateStaticHistograms? Static referred to those that don't get updated automatically. Names and comment updated. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.h:309: class BASE_EXPORT FilePersistentMemoryAllocator On 2016/01/18 22:40:28, chrisha wrote: > Oh? There's no way to have a writable FilePersistent allocator, even if we > specify a maximum size up front? This seems like it would be eminently so > processes don't need to have a "death rattle" handler, but can rely on the OS to > do that for them... Agreed but right now the underlying MemoryMappedFile allows for read-only access. I hope to change that (and this) when the time comes. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... File base/metrics/persistent_memory_allocator_unittest.cc (right): https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator_unittest.cc:25: // Duplicat from persistent_memory_allocator.cc which is not visible here. On 2016/01/19 16:37:41, chrisha wrote: > Duplicated* > > Maybe easier to make it visible from the .h, declared as an extern? Guarantees > that it won't deviate between the two places... Done. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator_unittest.cc:55: memset(mem_segment_.get(), 0, TEST_MEMORY_SIZE); On 2016/01/19 16:37:41, chrisha wrote: > nit: Maybe use ::memset, in order to make explicit the fact that you want the > one in the global namespace. > > (This is the norm in many Chromium projects, although not necessarily in Chrome > itself. I find it useful because there is also std::memset, and occasionally > other projects define their own variants.) Done. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator_unittest.cc:98: // Validate allocation of test object and make sure it can be rerefernecd On 2016/01/19 16:37:41, chrisha wrote: > referenced* Done. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator_unittest.cc:173: // Check that an objcets type can be changed. On 2016/01/19 16:37:41, chrisha wrote: > object's Done. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator_unittest.cc:206: // Ensure that iteraton and access through third allocator works. On 2016/01/19 16:37:41, chrisha wrote: > iteration* Done. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator_unittest.cc:231: class AllocatorThread : public SimpleThread { On 2016/01/19 16:37:41, chrisha wrote: > This class is worthy of a top level comment describing what it does. Done. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator_unittest.cc:302: // This test doesn't verify anything other than it doesn't crash. It's goal On 2016/01/19 16:37:41, chrisha wrote: > Its* goal Done. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator_unittest.cc:303: // is to coding errors that aren't otherwise tested for, much like a "fuzzer" On 2016/01/19 16:37:41, chrisha wrote: > is to find coding > ^^^^ Done. https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator_unittest.cc:351: // failure, the call will hang and the test killed for taking to long. On 2016/01/19 16:37:41, chrisha wrote: > too* long Done.
Okay, this lgtm now! Odyssey nearly over? :)
The CQ bit was checked by bcwhite@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from thakis@chromium.org, dvyukov@google.com Link to the patchset: https://codereview.chromium.org/1410213004/#ps1050001 (title: "addressed review comments by Chris")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1410213004/1050001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1410213004/1050001
Message was sent while issue was closed.
Description was changed from ========== Create "persistent memory allocator" for persisting and sharing objects. This allocator reserves memory within a specific block that itself can be persisted and/or shared between processes by various means. It is lock-free for operation on all platforms (even Android which does not support inter-process locks) and self-checking to deal with security concerns where not all processes accessing the memory are fully trustworthy. BUG=546019 ========== to ========== Create "persistent memory allocator" for persisting and sharing objects. This allocator reserves memory within a specific block that itself can be persisted and/or shared between processes by various means. It is lock-free for operation on all platforms (even Android which does not support inter-process locks) and self-checking to deal with security concerns where not all processes accessing the memory are fully trustworthy. BUG=546019 ==========
Message was sent while issue was closed.
Committed patchset #45 (id:1050001)
Message was sent while issue was closed.
Description was changed from ========== Create "persistent memory allocator" for persisting and sharing objects. This allocator reserves memory within a specific block that itself can be persisted and/or shared between processes by various means. It is lock-free for operation on all platforms (even Android which does not support inter-process locks) and self-checking to deal with security concerns where not all processes accessing the memory are fully trustworthy. BUG=546019 ========== to ========== Create "persistent memory allocator" for persisting and sharing objects. This allocator reserves memory within a specific block that itself can be persisted and/or shared between processes by various means. It is lock-free for operation on all platforms (even Android which does not support inter-process locks) and self-checking to deal with security concerns where not all processes accessing the memory are fully trustworthy. BUG=546019 Committed: https://crrev.com/34ae4983d4a24f5136bf8fda7b618842920962b0 Cr-Commit-Position: refs/heads/master@{#370377} ==========
Message was sent while issue was closed.
Patchset 45 (id:??) landed as https://crrev.com/34ae4983d4a24f5136bf8fda7b618842920962b0 Cr-Commit-Position: refs/heads/master@{#370377}
Message was sent while issue was closed.
LGTM https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... File base/metrics/persistent_memory_allocator.cc (right): https://codereview.chromium.org/1410213004/diff/1030001/base/metrics/persiste... base/metrics/persistent_memory_allocator.cc:97: uint32_t _padding_; // Padding to match required allocation size. On 2016/01/19 19:49:40, bcwhite wrote: > On 2016/01/19 16:37:41, chrisha wrote: > > Why the trailing and leading underscores for this name? > > Just to indicate that it's not something that should be used. It's gone in a > new upload I have because the "id" is growing to 64 bits. For the future: please avoid using leading underscores, such names are reserved according to the C++ Standard.
Message was sent while issue was closed.
Looks like one of the tests added by this fails on the 32-bit clang/win bots: https://build.chromium.org/p/chromium.fyi/builders/CrWinClang%20tester/builds... Do you know why?
Message was sent while issue was closed.
On 2016/01/20 20:09:42, Nico wrote: > Looks like one of the tests added by this fails on the 32-bit clang/win bots: > https://build.chromium.org/p/chromium.fyi/builders/CrWinClang%20tester/builds... > > Do you know why? I think the test incorrectly initializes the allocator Id with zero.
Message was sent while issue was closed.
benwells@chromium.org changed reviewers: + benwells@chromium.org
Message was sent while issue was closed.
Is it expected that PersistentMemoryAllocatorTest.CorruptionTest would introduce data races picked up by TSAN? See https://build.chromium.org/p/chromium.memory.fyi/builders/Linux%20TSan%20Test... and https://crbug.com/579867
Message was sent while issue was closed.
On 2016/01/21 06:59:11, benwells wrote: > Is it expected that PersistentMemoryAllocatorTest.CorruptionTest would introduce > data races picked up by TSAN? > > See > https://build.chromium.org/p/chromium.memory.fyi/builders/Linux%20TSan%20Test... > > and https://crbug.com/579867 You need to convert those writes of random values to relaxed atomics to avoid data races.
Message was sent while issue was closed.
On 2016/01/21 11:40:52, Alexander Potapenko wrote: > On 2016/01/21 06:59:11, benwells wrote: > > Is it expected that PersistentMemoryAllocatorTest.CorruptionTest would > introduce > > data races picked up by TSAN? > > > > See > > > https://build.chromium.org/p/chromium.memory.fyi/builders/Linux%20TSan%20Test... > > > > and https://crbug.com/579867 > > You need to convert those writes of random values to relaxed atomics to avoid > data races. Thanks for the tip. I'll look into the test errors once I'm in the office.
Message was sent while issue was closed.
On 2016/01/21 11:44:20, bcwhite wrote: > On 2016/01/21 11:40:52, Alexander Potapenko wrote: > > On 2016/01/21 06:59:11, benwells wrote: > > > Is it expected that PersistentMemoryAllocatorTest.CorruptionTest would > > introduce > > > data races picked up by TSAN? > > > > > > See > > > > > > https://build.chromium.org/p/chromium.memory.fyi/builders/Linux%20TSan%20Test... > > > > > > and https://crbug.com/579867 > > > > You need to convert those writes of random values to relaxed atomics to avoid > > data races. I don't think that is enough. The "corruptor" code is just doing random writes while valid code is sometimes doing atomic read/write and sometimes doing regular read/write (because it knows that there should be nothing operating in parallel on those memory locations). If I make the corruptor do atomic writes, won't that still cause a TSan error when valid code is doing non-atomic work? If that is the case... Perhaps I can just disable that particular test during the TSan test?
Message was sent while issue was closed.
On 2016/01/21 18:37:51, bcwhite wrote: > On 2016/01/21 11:44:20, bcwhite wrote: > > On 2016/01/21 11:40:52, Alexander Potapenko wrote: > > > On 2016/01/21 06:59:11, benwells wrote: > > > > Is it expected that PersistentMemoryAllocatorTest.CorruptionTest would > > > introduce > > > > data races picked up by TSAN? > > > > > > > > See > > > > > > > > > > https://build.chromium.org/p/chromium.memory.fyi/builders/Linux%20TSan%20Test... > > > > > > > > and https://crbug.com/579867 > > > > > > You need to convert those writes of random values to relaxed atomics to > avoid > > > data races. > > I don't think that is enough. The "corruptor" code is just doing random writes > while valid code is sometimes doing atomic read/write and sometimes doing > regular read/write (because it knows that there should be nothing operating in > parallel on those memory locations). If I make the corruptor do atomic writes, > won't that still cause a TSan error when valid code is doing non-atomic work? > > If that is the case... Perhaps I can just disable that particular test during > the TSan test? Agreed, relaxed loads aren't enough. Why do you need this test to be parallel in the first place?
Message was sent while issue was closed.
On 2016/01/20 20:09:42, Nico wrote: > Looks like one of the tests added by this fails on the 32-bit clang/win bots: > https://build.chromium.org/p/chromium.fyi/builders/CrWinClang%20tester/builds... > > Do you know why? Seems to be an issue with reading 64-bit values from a read-only memory mapped file. https://code.google.com/p/chromium/issues/detail?id=580241
Message was sent while issue was closed.
> > I don't think that is enough. The "corruptor" code is just doing random > writes > > while valid code is sometimes doing atomic read/write and sometimes doing > > regular read/write (because it knows that there should be nothing operating in > > parallel on those memory locations). If I make the corruptor do atomic > writes, > > won't that still cause a TSan error when valid code is doing non-atomic work? > > > > If that is the case... Perhaps I can just disable that particular test during > > the TSan test? > > Agreed, relaxed loads aren't enough. > Why do you need this test to be parallel in the first place? It was just a simple simulation of a parallel system with a malicious actor. I could serialize it, I guess, and achieve pretty much the same results. No way to just disable it under TSan?
Message was sent while issue was closed.
On 2016/01/21 20:15:01, bcwhite wrote: > > > I don't think that is enough. The "corruptor" code is just doing random > > writes > > > while valid code is sometimes doing atomic read/write and sometimes doing > > > regular read/write (because it knows that there should be nothing operating > in > > > parallel on those memory locations). If I make the corruptor do atomic > > writes, > > > won't that still cause a TSan error when valid code is doing non-atomic > work? > > > > > > If that is the case... Perhaps I can just disable that particular test > during > > > the TSan test? > > > > Agreed, relaxed loads aren't enough. > > Why do you need this test to be parallel in the first place? > > It was just a simple simulation of a parallel system with a malicious actor. I > could serialize it, I guess, and achieve pretty much the same results. > > No way to just disable it under TSan? This is piling hacks to simulate a malicious actor, but you could map the memory location at a different address (or a different process), and access that one from your malicious actor. Your malicious actor would need to use volatile, and you're still in UB land, but that's kind of the entire point of the test IIUC.
Message was sent while issue was closed.
On 2016/01/21 20:15:01, bcwhite wrote: > > > I don't think that is enough. The "corruptor" code is just doing random > > writes > > > while valid code is sometimes doing atomic read/write and sometimes doing > > > regular read/write (because it knows that there should be nothing operating > in > > > parallel on those memory locations). If I make the corruptor do atomic > > writes, > > > won't that still cause a TSan error when valid code is doing non-atomic > work? > > > > > > If that is the case... Perhaps I can just disable that particular test > during > > > the TSan test? > > > > Agreed, relaxed loads aren't enough. > > Why do you need this test to be parallel in the first place? > > It was just a simple simulation of a parallel system with a malicious actor. I > could serialize it, I guess, and achieve pretty much the same results. > > No way to just disable it under TSan? Could I just surround the "corrupt" write with ANNOTATE_IGNORE_WRITES_BEGIN/END?
Message was sent while issue was closed.
On 2016/01/21 20:44:27, bcwhite wrote: > On 2016/01/21 20:15:01, bcwhite wrote: > > > > I don't think that is enough. The "corruptor" code is just doing random > > > writes > > > > while valid code is sometimes doing atomic read/write and sometimes doing > > > > regular read/write (because it knows that there should be nothing > operating > > in > > > > parallel on those memory locations). If I make the corruptor do atomic > > > writes, > > > > won't that still cause a TSan error when valid code is doing non-atomic > > work? > > > > > > > > If that is the case... Perhaps I can just disable that particular test > > during > > > > the TSan test? > > > > > > Agreed, relaxed loads aren't enough. > > > Why do you need this test to be parallel in the first place? > > > > It was just a simple simulation of a parallel system with a malicious actor. > I > > could serialize it, I guess, and achieve pretty much the same results. > > > > No way to just disable it under TSan? > > Could I just surround the "corrupt" write with ANNOTATE_IGNORE_WRITES_BEGIN/END? Please don't use ANNOTATE_XXX, those are deprecated. Perhaps just disabling the test under TSan is ok, as the test is performing races on purpose.
Message was sent while issue was closed.
On 2016/01/21 21:19:50, Alexander Potapenko wrote: > On 2016/01/21 20:44:27, bcwhite wrote: > > On 2016/01/21 20:15:01, bcwhite wrote: > > > > > I don't think that is enough. The "corruptor" code is just doing random > > > > writes > > > > > while valid code is sometimes doing atomic read/write and sometimes > doing > > > > > regular read/write (because it knows that there should be nothing > > operating > > > in > > > > > parallel on those memory locations). If I make the corruptor do atomic > > > > writes, > > > > > won't that still cause a TSan error when valid code is doing non-atomic > > > work? > > > > > > > > > > If that is the case... Perhaps I can just disable that particular test > > > during > > > > > the TSan test? > > > > > > > > Agreed, relaxed loads aren't enough. > > > > Why do you need this test to be parallel in the first place? > > > > > > It was just a simple simulation of a parallel system with a malicious actor. > > > I > > > could serialize it, I guess, and achieve pretty much the same results. > > > > > > No way to just disable it under TSan? > > > > Could I just surround the "corrupt" write with > ANNOTATE_IGNORE_WRITES_BEGIN/END? > > Please don't use ANNOTATE_XXX, those are deprecated. > > Perhaps just disabling the test under TSan is ok, as the test is performing > races on purpose. Sure. The question is, how to disable them only on TSan...
Message was sent while issue was closed.
On 2016/01/22 00:17:16, bcwhite wrote: > On 2016/01/21 21:19:50, Alexander Potapenko wrote: > > On 2016/01/21 20:44:27, bcwhite wrote: > > > On 2016/01/21 20:15:01, bcwhite wrote: > > > > > > I don't think that is enough. The "corruptor" code is just doing > random > > > > > writes > > > > > > while valid code is sometimes doing atomic read/write and sometimes > > doing > > > > > > regular read/write (because it knows that there should be nothing > > > operating > > > > in > > > > > > parallel on those memory locations). If I make the corruptor do > atomic > > > > > writes, > > > > > > won't that still cause a TSan error when valid code is doing > non-atomic > > > > work? > > > > > > > > > > > > If that is the case... Perhaps I can just disable that particular > test > > > > during > > > > > > the TSan test? > > > > > > > > > > Agreed, relaxed loads aren't enough. > > > > > Why do you need this test to be parallel in the first place? > > > > > > > > It was just a simple simulation of a parallel system with a malicious > actor. > > > > > I > > > > could serialize it, I guess, and achieve pretty much the same results. > > > > > > > > No way to just disable it under TSan? > > > > > > Could I just surround the "corrupt" write with > > ANNOTATE_IGNORE_WRITES_BEGIN/END? > > > > Please don't use ANNOTATE_XXX, those are deprecated. > > > > Perhaps just disabling the test under TSan is ok, as the test is performing > > races on purpose. > > Sure. The question is, how to disable them only on TSan... Fixed: https://codereview.chromium.org/1611393003 |