Issue 2675433002: profiler: decouple ThreadHeapUsageTracker from allocator shim internals

Primiano Tucci (use gerrit)

The CQ bit was checked by primiano@chromium.org to run a CQ dry run

3 years, 10 months ago (2017-02-01 20:56:12 UTC) #1

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2675433002/1

3 years, 10 months ago (2017-02-01 20:56:28 UTC) #2

Primiano Tucci (use gerrit)

Description was changed from ========== Background: ----------- Until crrev.com/2658723007, the allocator shim was initialized at ...

3 years, 10 months ago (2017-02-01 20:57:36 UTC) #3

Description was changed from

==========
Background: ----------- Until crrev.com/2658723007, the allocator shim was
initialized at compile time and hence always ready to be used. After that CL,
the shim requires a run-time initialization. The code in ThreadHeapUsageTracker
currently assumes that the shim is initialized by directly calling into the shim
chain even before a malloc/free call is intercepted. There are two possible
solutions to this: 1. Initialize the shim before ThreadHeapUsageTracker is
initialized. 2. Prevent ThreadHeapUsageTracker from directly entering the shim.
(more discussion in https://codereview.chromium.org/2658723007/#msg53).

1. is a one liner but looks quite hacky. ThreadHeapUsageTracker is
lazily initialized when a Thread (the main thread) is created (in the
TrackedObject construction). This would mean, in practice, putting
the shim initialization in tracked_objects.cc which is quite obscure.

This CL:
--------
The approach used here is 2. The existing code had already the
right sentinel logic to detect and prevent re-entrancy. This CL
is just extending that to the dtor and just using new/delete to
create the TLS object.
This allows to initialize the shim in a less obscure place (e.g.
content_main_runner).

BUG=665567
==========

to

==========
Background:
-----------
Until crrev.com/2658723007, the allocator shim was initialized at
compile time and hence always ready to be used. After that CL,
the shim requires a run-time initialization. The code in
ThreadHeapUsageTracker currently assumes that the shim is initialized
by directly calling into the shim chain even before a malloc/free call
is intercepted.
There are two possible solutions to this:
1. Initialize the shim before ThreadHeapUsageTracker is initialized.
2. Prevent ThreadHeapUsageTracker from directly entering the shim.
(more discussion in https://codereview.chromium.org/2658723007/#msg53).

1. is a one liner but looks quite hacky. ThreadHeapUsageTracker is
lazily initialized when a Thread (the main thread) is created (in the
TrackedObject construction). This would mean, in practice, putting
the shim initialization in tracked_objects.cc which is quite obscure.

This CL:
--------
The approach used here is 2. The existing code had already the
right sentinel logic to detect and prevent re-entrancy. This CL
is just extending that to the dtor and just using new/delete to
create the TLS object.
This allows to initialize the shim in a less obscure place (e.g.
content_main_runner).

BUG=665567
TEST=Build with use_experimental_allocator_shim=true on Mac, run base_unittests
==========

Primiano Tucci (use gerrit)

primiano@chromium.org changed reviewers: + erikchen@chromium.org, siggi@chromium.org

3 years, 10 months ago (2017-02-01 20:57:37 UTC) #4

Primiano Tucci (use gerrit)

Description was changed from ========== Background: ----------- Until crrev.com/2658723007, the allocator shim was initialized at ...

3 years, 10 months ago (2017-02-01 21:53:09 UTC) #5

Description was changed from

==========
Background:
-----------
Until crrev.com/2658723007, the allocator shim was initialized at
compile time and hence always ready to be used. After that CL,
the shim requires a run-time initialization. The code in
ThreadHeapUsageTracker currently assumes that the shim is initialized
by directly calling into the shim chain even before a malloc/free call
is intercepted.
There are two possible solutions to this:
1. Initialize the shim before ThreadHeapUsageTracker is initialized.
2. Prevent ThreadHeapUsageTracker from directly entering the shim.
(more discussion in https://codereview.chromium.org/2658723007/#msg53).

1. is a one liner but looks quite hacky. ThreadHeapUsageTracker is
lazily initialized when a Thread (the main thread) is created (in the
TrackedObject construction). This would mean, in practice, putting
the shim initialization in tracked_objects.cc which is quite obscure.

This CL:
--------
The approach used here is 2. The existing code had already the
right sentinel logic to detect and prevent re-entrancy. This CL
is just extending that to the dtor and just using new/delete to
create the TLS object.
This allows to initialize the shim in a less obscure place (e.g.
content_main_runner).

BUG=665567
TEST=Build with use_experimental_allocator_shim=true on Mac, run base_unittests
==========

to

==========
profiler: decouple ThreadHeapUsageTracker from allocator shim internals

Background:
-----------
Until crrev.com/2658723007, the allocator shim was initialized at
compile time and hence always ready to be used. After that CL,
the shim requires a run-time initialization. The code in
ThreadHeapUsageTracker currently assumes that the shim is initialized
by directly calling into the shim chain even before a malloc/free call
is intercepted.
There are two possible solutions to this:
1. Initialize the shim before ThreadHeapUsageTracker is initialized.
2. Prevent ThreadHeapUsageTracker from directly entering the shim.
(more discussion in https://codereview.chromium.org/2658723007/#msg53).

1. is a one liner but not particularly clean. ThreadHeapUsageTracker is
lazily initialized when a Thread (the main thread) is created (in the
TrackedObject construction). This would mean, in practice, putting
the shim initialization in tracked_objects.cc which is quite obscure.

This CL:
--------
The approach used here is 2. The existing code had already the
right sentinel logic to detect and prevent re-entrancy. This CL
is just extending that to the dtor and just using new/delete to
create the TLS object.
This allows to initialize the shim in a less obscure place (e.g.
content_main_runner).

BUG=665567,644385
TEST=Build with use_experimental_allocator_shim=true on Mac, run base_unittests
==========

Primiano Tucci (use gerrit)

The CQ bit was checked by primiano@chromium.org to run a CQ dry run

3 years, 10 months ago (2017-02-01 22:14:59 UTC) #7

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2675433002/20001

3 years, 10 months ago (2017-02-01 22:15:19 UTC) #8

Sigurður Ásgeirsson

lgtm with a nit. Feel free to address or not. https://codereview.chromium.org/2675433002/diff/20001/base/debug/thread_heap_usage_tracker.cc File base/debug/thread_heap_usage_tracker.cc (right): https://codereview.chromium.org/2675433002/diff/20001/base/debug/thread_heap_usage_tracker.cc#newcode33 ...

3 years, 10 months ago (2017-02-01 22:21:27 UTC) #9

Primiano Tucci (use gerrit)

https://codereview.chromium.org/2675433002/diff/20001/base/debug/thread_heap_usage_tracker.cc File base/debug/thread_heap_usage_tracker.cc (right): https://codereview.chromium.org/2675433002/diff/20001/base/debug/thread_heap_usage_tracker.cc#newcode33 base/debug/thread_heap_usage_tracker.cc:33: ThreadHeapUsage* const kInitializationOrTeardownSentinel = On 2017/02/01 22:21:27, Sigurður Ásgeirsson ...

3 years, 10 months ago (2017-02-01 22:39:05 UTC) #10

https://codereview.chromium.org/2675433002/diff/20001/base/debug/thread_heap_...
File base/debug/thread_heap_usage_tracker.cc (right):

https://codereview.chromium.org/2675433002/diff/20001/base/debug/thread_heap_...
base/debug/thread_heap_usage_tracker.cc:33: ThreadHeapUsage* const
kInitializationOrTeardownSentinel =
On 2017/02/01 22:21:27, Sigurður Ásgeirsson wrote:
> I'd be tempted to use two sentinels here, as you have a state machine going
> 
> unintialized -> being initialized -> initialized -> de-initialized
> 
> where re-initialization should never happen. I worry that when/if this
derails,
> it'll be a bear to figure out what's derailed with one sentinel only. This
does
> however imply a second comparison (or something like a CHECK) in
> GetOrCreateThreadUsage...

Yeah I thought to this. Honestly the only reason I didn't do that was to avoid a
double if, fearing to slow down the TUT. Thinking more though, I think I can
come up with a kSentinelMask, so keep still one branch but two distinct value

https://codereview.chromium.org/2675433002/diff/20001/base/debug/thread_heap_...
base/debug/thread_heap_usage_tracker.cc:184: allocator_usage = new
ThreadHeapUsage();
On 2017/02/01 22:21:27, Sigurður Ásgeirsson wrote:
> comically this'll lead to the shim accounting its own usage, except on Mac :).

Hmm how? This shouldn't happen because we set the setinel up, which would cause
the GetOrCreate to return nullptr and hence skip the accounting (same on
deletion below)

https://codereview.chromium.org/2675433002/diff/20001/base/debug/thread_heap_...
base/debug/thread_heap_usage_tracker.cc:299: if (thread_heap_usage ==
kInitializationOrTeardownSentinel)
On 2017/02/01 22:21:27, Sigurður Ásgeirsson wrote:
> How would this ever happen?

Ha! this is tricky but will happen all the times.
This destructor is called after the TLS slot is cleared up. So re-setting the
sentinel (the line below) will re-create the TLS slot. so this destructor will
be called a second time immediately after this returns. the second time though
the slot contains the sentinel and we early out.
Thankfully the TLS code explicitly mentions and supports this case. see 
https://cs.chromium.org/chromium/src/base/threading/thread_local_storage.cc?r...

Sigurður Ásgeirsson

https://codereview.chromium.org/2675433002/diff/20001/base/debug/thread_heap_usage_tracker.cc File base/debug/thread_heap_usage_tracker.cc (right): https://codereview.chromium.org/2675433002/diff/20001/base/debug/thread_heap_usage_tracker.cc#newcode184 base/debug/thread_heap_usage_tracker.cc:184: allocator_usage = new ThreadHeapUsage(); On 2017/02/01 22:39:05, Primiano Tucci ...

3 years, 10 months ago (2017-02-01 22:46:16 UTC) #11

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 10 months ago (2017-02-01 23:14:35 UTC) #12

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

3 years, 10 months ago (2017-02-01 23:14:36 UTC) #13

Primiano Tucci (use gerrit)

The CQ bit was checked by primiano@chromium.org to run a CQ dry run

3 years, 10 months ago (2017-02-02 00:31:04 UTC) #14

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2675433002/40001

3 years, 10 months ago (2017-02-02 00:31:32 UTC) #15

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 10 months ago (2017-02-02 00:34:56 UTC) #16

commit-bot: I haz the power

Dry run: Try jobs failed on following builders: ios-device on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-device/builds/146049) ios-device-xcode-clang on ...

3 years, 10 months ago (2017-02-02 00:34:57 UTC) #17

Primiano Tucci (use gerrit)

https://codereview.chromium.org/2675433002/diff/20001/base/debug/thread_heap_usage_tracker.cc File base/debug/thread_heap_usage_tracker.cc (right): https://codereview.chromium.org/2675433002/diff/20001/base/debug/thread_heap_usage_tracker.cc#newcode299 base/debug/thread_heap_usage_tracker.cc:299: if (thread_heap_usage == kInitializationOrTeardownSentinel) On 2017/02/01 22:46:16, Sigurður Ásgeirsson ...

3 years, 10 months ago (2017-02-02 00:38:27 UTC) #18

Primiano Tucci (use gerrit)

The CQ bit was checked by primiano@chromium.org to run a CQ dry run

3 years, 10 months ago (2017-02-02 01:04:22 UTC) #19

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2675433002/60001

3 years, 10 months ago (2017-02-02 01:04:57 UTC) #20

Primiano Tucci (use gerrit)

primiano@chromium.org changed reviewers: + dcheng@chromium.org

3 years, 10 months ago (2017-02-02 01:09:06 UTC) #21

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 10 months ago (2017-02-02 02:20:44 UTC) #23

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

3 years, 10 months ago (2017-02-02 02:20:45 UTC) #24

dcheng

LGTM https://codereview.chromium.org/2675433002/diff/60001/base/debug/thread_heap_usage_tracker.cc File base/debug/thread_heap_usage_tracker.cc (right): https://codereview.chromium.org/2675433002/diff/60001/base/debug/thread_heap_usage_tracker.cc#newcode272 base/debug/thread_heap_usage_tracker.cc:272: DCHECK(thread_heap_usage != kInitializationSentinel); Minor nit: DCHECK_NE?

3 years, 10 months ago (2017-02-03 06:57:16 UTC) #25

Primiano Tucci (use gerrit)

https://codereview.chromium.org/2675433002/diff/60001/base/debug/thread_heap_usage_tracker.cc File base/debug/thread_heap_usage_tracker.cc (right): https://codereview.chromium.org/2675433002/diff/60001/base/debug/thread_heap_usage_tracker.cc#newcode272 base/debug/thread_heap_usage_tracker.cc:272: DCHECK(thread_heap_usage != kInitializationSentinel); On 2017/02/03 06:57:16, dcheng wrote: > ...

3 years, 10 months ago (2017-02-03 14:50:24 UTC) #26

Primiano Tucci (use gerrit)

The CQ bit was checked by primiano@chromium.org

3 years, 10 months ago (2017-02-03 14:50:33 UTC) #27

Primiano Tucci (use gerrit)

The patchset sent to the CQ was uploaded after l-g-t-m from siggi@chromium.org Link to the ...

3 years, 10 months ago (2017-02-03 14:50:33 UTC) #28

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2675433002/60001

3 years, 10 months ago (2017-02-03 14:50:47 UTC) #29

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 10 months ago (2017-02-03 16:45:27 UTC) #30

commit-bot: I haz the power

Failed to apply patch for base/debug/thread_heap_usage_tracker.cc: While running git apply --index -p1; error: patch failed: ...

3 years, 10 months ago (2017-02-03 16:45:29 UTC) #31

Failed to apply patch for base/debug/thread_heap_usage_tracker.cc:
While running git apply --index -p1;
  error: patch failed: base/debug/thread_heap_usage_tracker.cc:136
  error: base/debug/thread_heap_usage_tracker.cc: patch does not apply

Patch:       base/debug/thread_heap_usage_tracker.cc
Index: base/debug/thread_heap_usage_tracker.cc
diff --git a/base/debug/thread_heap_usage_tracker.cc
b/base/debug/thread_heap_usage_tracker.cc
index
25af20b0aa7d20ecdfa04af0bda3d6d0ab2eb884..bb6873bf305c6878488bcd5489fc377b6c3f89a0
100644
--- a/base/debug/thread_heap_usage_tracker.cc
+++ b/base/debug/thread_heap_usage_tracker.cc
@@ -6,6 +6,7 @@
 
 #include <stdint.h>
 #include <algorithm>
+#include <limits>
 #include <new>
 #include <type_traits>
 
@@ -30,8 +31,11 @@ using base::allocator::AllocatorDispatch;
 
 ThreadLocalStorage::StaticSlot g_thread_allocator_usage = TLS_INITIALIZER;
 
-ThreadHeapUsage* const kInitializingSentinel =
-    reinterpret_cast<ThreadHeapUsage*>(-1);
+const uintptr_t kSentinelMask = std::numeric_limits<uintptr_t>::max() - 1;
+ThreadHeapUsage* const kInitializationSentinel =
+    reinterpret_cast<ThreadHeapUsage*>(kSentinelMask);
+ThreadHeapUsage* const kTeardownSentinel =
+    reinterpret_cast<ThreadHeapUsage*>(kSentinelMask | 1);
 
 bool g_heap_tracking_enabled = false;
 
@@ -136,21 +140,16 @@ AllocatorDispatch allocator_dispatch = {
     &FreeFn,  &GetSizeEstimateFn,      nullptr};
 
 ThreadHeapUsage* GetOrCreateThreadUsage() {
-  ThreadHeapUsage* allocator_usage =
-      static_cast<ThreadHeapUsage*>(g_thread_allocator_usage.Get());
-  if (allocator_usage == kInitializingSentinel)
+  auto tls_ptr = reinterpret_cast<uintptr_t>(g_thread_allocator_usage.Get());
+  if ((tls_ptr & kSentinelMask) == kSentinelMask)
     return nullptr;  // Re-entrancy case.
 
+  auto* allocator_usage = reinterpret_cast<ThreadHeapUsage*>(tls_ptr);
   if (allocator_usage == nullptr) {
     // Prevent reentrancy due to the allocation below.
-    g_thread_allocator_usage.Set(kInitializingSentinel);
-
-    // Delegate the allocation of the per-thread structure to the underlying
-    // heap shim, for symmetry with the deallocation. Otherwise interposing
-    // shims may mis-attribute or mis-direct this allocation.
-    const AllocatorDispatch* next = allocator_dispatch.next;
-    allocator_usage = new (next->alloc_function(next, sizeof(ThreadHeapUsage)))
-        ThreadHeapUsage();
+    g_thread_allocator_usage.Set(kInitializationSentinel);
+
+    allocator_usage = new ThreadHeapUsage();
     static_assert(std::is_pod<ThreadHeapUsage>::value,
                   "AllocatorDispatch must be POD");
     memset(allocator_usage, 0, sizeof(*allocator_usage));
@@ -261,12 +260,22 @@ ThreadHeapUsageTracker::GetDispatchForTesting() {
 
 void ThreadHeapUsageTracker::EnsureTLSInitialized() {
   if (!g_thread_allocator_usage.initialized()) {
-    g_thread_allocator_usage.Initialize([](void* allocator_usage) {
-      // Delegate the freeing of the per-thread structure to the next-lower
-      // heap shim. Otherwise this free will re-initialize the TLS on thread
-      // exit.
-      allocator_dispatch.next->free_function(allocator_dispatch.next,
-                                             allocator_usage);
+    g_thread_allocator_usage.Initialize([](void* thread_heap_usage) {
+      // This destructor will be called twice. Once to destroy the actual
+      // ThreadHeapUsage instance and a second time, immediately after, for the
+      // sentinel. Re-setting the TLS slow (below) does re-initialize the TLS
+      // slot. The ThreadLocalStorage code is designed to deal with this use
+      // case (see comments in ThreadHeapUsageTracker::EnsureTLSInitialized)
and
+      // will re-call the destructor with the kTeardownSentinel as arg.
+      if (thread_heap_usage == kTeardownSentinel)
+        return;
+      DCHECK(thread_heap_usage != kInitializationSentinel);
+
+      // Deleting the ThreadHeapUsage TLS object will re-enter the shim and hit
+      // RecordFree() above. The sentinel prevents RecordFree() from
re-creating
+      // another ThreadHeapUsage object.
+      g_thread_allocator_usage.Set(kTeardownSentinel);
+      delete static_cast<ThreadHeapUsage*>(thread_heap_usage);
     });
   }
 }

Primiano Tucci (use gerrit)

The CQ bit was checked by primiano@chromium.org

3 years, 10 months ago (2017-02-06 18:42:33 UTC) #32

Primiano Tucci (use gerrit)

The patchset sent to the CQ was uploaded after l-g-t-m from dcheng@chromium.org, siggi@chromium.org Link to ...

3 years, 10 months ago (2017-02-06 18:42:34 UTC) #33

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2675433002/80001

3 years, 10 months ago (2017-02-06 18:43:08 UTC) #34

commit-bot: I haz the power

CQ is committing da patch. Bot data: {"patchset_id": 80001, "attempt_start_ts": 1486406553610370, "parent_rev": "555c81dca33110db593ac5c0951579a372f6878b", "commit_rev": "e9ffc9c6104f4cc7ff5e62a94869f111c645106a"}

3 years, 10 months ago (2017-02-06 19:52:48 UTC) #35

commit-bot: I haz the power

Description was changed from ========== profiler: decouple ThreadHeapUsageTracker from allocator shim internals Background: ----------- Until ...

3 years, 10 months ago (2017-02-06 19:53:28 UTC) #36

Message was sent while issue was closed.

Description was changed from

==========
profiler: decouple ThreadHeapUsageTracker from allocator shim internals

Background:
-----------
Until crrev.com/2658723007, the allocator shim was initialized at
compile time and hence always ready to be used. After that CL,
the shim requires a run-time initialization. The code in
ThreadHeapUsageTracker currently assumes that the shim is initialized
by directly calling into the shim chain even before a malloc/free call
is intercepted.
There are two possible solutions to this:
1. Initialize the shim before ThreadHeapUsageTracker is initialized.
2. Prevent ThreadHeapUsageTracker from directly entering the shim.
(more discussion in https://codereview.chromium.org/2658723007/#msg53).

1. is a one liner but not particularly clean. ThreadHeapUsageTracker is
lazily initialized when a Thread (the main thread) is created (in the
TrackedObject construction). This would mean, in practice, putting
the shim initialization in tracked_objects.cc which is quite obscure.

This CL:
--------
The approach used here is 2. The existing code had already the
right sentinel logic to detect and prevent re-entrancy. This CL
is just extending that to the dtor and just using new/delete to
create the TLS object.
This allows to initialize the shim in a less obscure place (e.g.
content_main_runner).

BUG=665567,644385
TEST=Build with use_experimental_allocator_shim=true on Mac, run base_unittests
==========

to

==========
profiler: decouple ThreadHeapUsageTracker from allocator shim internals

Background:
-----------
Until crrev.com/2658723007, the allocator shim was initialized at
compile time and hence always ready to be used. After that CL,
the shim requires a run-time initialization. The code in
ThreadHeapUsageTracker currently assumes that the shim is initialized
by directly calling into the shim chain even before a malloc/free call
is intercepted.
There are two possible solutions to this:
1. Initialize the shim before ThreadHeapUsageTracker is initialized.
2. Prevent ThreadHeapUsageTracker from directly entering the shim.
(more discussion in https://codereview.chromium.org/2658723007/#msg53).

1. is a one liner but not particularly clean. ThreadHeapUsageTracker is
lazily initialized when a Thread (the main thread) is created (in the
TrackedObject construction). This would mean, in practice, putting
the shim initialization in tracked_objects.cc which is quite obscure.

This CL:
--------
The approach used here is 2. The existing code had already the
right sentinel logic to detect and prevent re-entrancy. This CL
is just extending that to the dtor and just using new/delete to
create the TLS object.
This allows to initialize the shim in a less obscure place (e.g.
content_main_runner).

BUG=665567,644385
TEST=Build with use_experimental_allocator_shim=true on Mac, run base_unittests

Review-Url: https://codereview.chromium.org/2675433002
Cr-Commit-Position: refs/heads/master@{#448360}
Committed:
https://chromium.googlesource.com/chromium/src/+/e9ffc9c6104f4cc7ff5e62a94869...
==========

commit-bot: I haz the power

3 years, 10 months ago (2017-02-06 19:53:28 UTC) #37

Message was sent while issue was closed.

Committed patchset #5 (id:80001) as
https://chromium.googlesource.com/chromium/src/+/e9ffc9c6104f4cc7ff5e62a94869...

Issue 2675433002: profiler: decouple ThreadHeapUsageTracker from allocator shim internals (Closed)

Description

Patch Set 1 #

Patch Set 2 : typo in comment #

Patch Set 3 : Use two sentinels and add comment #

Patch Set 4 : rebase #

Patch Set 5 : rebase #

Messages