|
|
Created:
9 years ago by jar (doing other things) Modified:
9 years ago CC:
chromium-reviews, brettw-cc_chromium.org Base URL:
svn://chrome-svn/chrome/trunk/src/ Visibility:
Public. |
DescriptionMake ThreadLocalStorage more posix pthread compliant
This is my first shot at cleaning up the implementation.
1) I support multiple calls to destructors, as suggested in pthread
standards
2) I added volatile so that the destructors can't cause any compiler
confusion if they are nulled out while a second thread is calling
the list of destructors.
3) Windows already avoids producing a key which has value zero
<good!>, but I added some DHECKs to make this fact more obvious. I
plan to try to enforce that in the posix is similarly
constrained... but that will be in a future CL.
4) I did some extra cleaning of destructor handling, so that it is
plausible that TCMalloc can use this service (and not have to hack its
own). The problem *was* that initialization called new, which would
trigger allocation activities before a thread local heap could be
setup <sigh>. This is now fixed (in this CL). I also handled the
destructor calls with greater care, avoiding calling delete after all
the destructors have been called (as that would re-incarnate a
TCMalloc heap). This is also fixed.
r=willchan
BUG=105410
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=112030
Patch Set 1 #Patch Set 2 : '' #Patch Set 3 : '' #Patch Set 4 : '' #
Total comments: 8
Patch Set 5 : '' #Patch Set 6 : '' #
Total comments: 26
Patch Set 7 : '' #Patch Set 8 : '' #Patch Set 9 : '' #Patch Set 10 : '' #Patch Set 11 : '' #
Total comments: 5
Patch Set 12 : '' #
Total comments: 2
Patch Set 13 : '' #
Messages
Total messages: 18 (0 generated)
This is my first shot at cleaning up the implementation. 1) I support multiple calls to destructors, as suggested in pthread standards 2) I add some atomic calls so that the destructors are guaranteed visible on other threads. 3) Windows already avoids producing a key which has value zero <good!>, but I added some DHECKs to make this fact more obvious. I plan to try to enforce that in the posix is similarly constrained... but that will be in a future CL. 4) I did some extra cleaning of destructor handling, so that it is plausible that TCMalloc can use this service (and not have to hack its own). The problem *was* that initialization called new, which would trigger allocation activities before a thread local heap could be setup <sigh>. This is now fixed (in this CL). I also handled the destructor calls with greater care, avoiding calling delete after all the destructors have been called (as that would re-incarnate a TCMalloc heap). This is also fixed.
It seems to me that it should be possible to write a test for this code. Is that correct? http://codereview.chromium.org/8702014/diff/7002/base/threading/thread_local_... File base/threading/thread_local_storage_win.cc (right): http://codereview.chromium.org/8702014/diff/7002/base/threading/thread_local_... base/threading/thread_local_storage_win.cc:54: // even call new until after we're setup) You should explain that this is because an allocator may use thread local storage. http://codereview.chromium.org/8702014/diff/7002/base/threading/thread_local_... base/threading/thread_local_storage_win.cc:132: // even call delete[] after we're done with destructors.) You should again explain that this is because the allocator may depend on thread local data structures that may no longer exist.
http://codereview.chromium.org/8702014/diff/7002/base/threading/thread_local_... File base/threading/thread_local_storage_win.cc (right): http://codereview.chromium.org/8702014/diff/7002/base/threading/thread_local_... base/threading/thread_local_storage_win.cc:85: // Setup our destructor. Make sure that any thread that is dostroyed can see s/dostroyed/destroyed/ http://codereview.chromium.org/8702014/diff/7002/base/threading/thread_local_... base/threading/thread_local_storage_win.cc:87: subtle::NoBarrier_Store(&tls_destructors_[slot_], It's not obvious to me that the NoBarrier variants are what you want to use here. I would expect to do an Acquire_Load() and Release_Store(). Can you explain why it's ok to use NoBarrier_Store() and NoBarrier_Load()?
This message is great. I think it should move into the changelist description. On 2011/11/27 06:34:44, jar wrote: > This is my first shot at cleaning up the implementation. > > 1) I support multiple calls to destructors, as suggested in pthread standards > > 2) I add some atomic calls so that the destructors are guaranteed visible on > other threads. > > 3) Windows already avoids producing a key which has value zero <good!>, but I > added some DHECKs to make this fact more obvious. I plan to try to enforce that > in the posix is similarly constrained... but that will be in a future CL. > > 4) I did some extra cleaning of destructor handling, so that it is plausible > that TCMalloc can use this service (and not have to hack its own). The problem > *was* that initialization called new, which would trigger allocation activities > before a thread local heap could be setup <sigh>. This is now fixed (in this > CL). I also handled the destructor calls with greater care, avoiding calling > delete after all the destructors have been called (as that would re-incarnate a > TCMalloc heap). This is also fixed.
Response to comments by willchan Added unit test coverage of repeated calls to the destructor, and added additional comments to the code and the CL description.
http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... File base/threading/thread_local_storage.h (right): http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage.h:87: static const int PTHREAD_DESTRUCTOR_ITERATIONS = kThreadLocalStorageSize; I don't see why we want to expose this value in the header... or why we want to go against the chromium naming convention for that matter. In fact, even in the cc file I'd remove the reference to pthreads. http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... File base/threading/thread_local_storage_win.cc (right): http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage_win.cc:90: // Setup our destructor. Make sure that any thread that is dostroyed can see nit: typo http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage_win.cc:92: subtle::NoBarrier_Store(&tls_destructors_[slot_], Isn't it more readable to do a regular assignment? (assuming we don't want barriers)
http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... File base/threading/thread_local_storage.h (right): http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage.h:85: // The maximum number of times to try to clear slots by caling destructors. s/caling/calling/ http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... File base/threading/thread_local_storage_unittest.cc (right): http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage_unittest.cc:62: // Destructors should never be called wit a NULL. s/wit/with/ http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage_unittest.cc:63: ASSERT_NE(ptr, reinterpret_cast<int*>(NULL)); Should be ASSERT_OP(expected, actual). You have it reversed here. http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage_unittest.cc:109: EXPECT_EQ(values[index], kFinalTlsValue); Looks like this one was wrong too, can you fix the order? http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... File base/threading/thread_local_storage_win.cc (right): http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage_win.cc:53: // As a result, any attempt to call new (or malloc) will lazilly cause such a s/lazilly/lazily/g http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage_win.cc:57: // Use use a stack allocated vector, so that we don't have dependence s/Use use/Use/g http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage_win.cc:61: memset(stack_allocated_tls_data, 0, sizeof(stack_allocated_tls_data)); s/sizeof/arraysize/ http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage_win.cc:62: // Ensure that any rentrant calls change the temp version. re-entrant http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage_win.cc:67: memcpy(tls_data, stack_allocated_tls_data, sizeof(stack_allocated_tls_data)); s/sizeof/arraysize/
Changes per comments by rvargas and willchan. I moved several Windows-only-statics out of the header file and into the Windows-only CC file as well. http://codereview.chromium.org/8702014/diff/7002/base/threading/thread_local_... File base/threading/thread_local_storage_win.cc (right): http://codereview.chromium.org/8702014/diff/7002/base/threading/thread_local_... base/threading/thread_local_storage_win.cc:54: // even call new until after we're setup) On 2011/11/28 20:58:28, willchan wrote: > You should explain that this is because an allocator may use thread local > storage. Done. http://codereview.chromium.org/8702014/diff/7002/base/threading/thread_local_... base/threading/thread_local_storage_win.cc:85: // Setup our destructor. Make sure that any thread that is dostroyed can see On 2011/11/28 21:04:32, willchan wrote: > s/dostroyed/destroyed/ Done. http://codereview.chromium.org/8702014/diff/7002/base/threading/thread_local_... base/threading/thread_local_storage_win.cc:87: subtle::NoBarrier_Store(&tls_destructors_[slot_], As per our discussion, I added a pile of comments justifying this. On 2011/11/28 21:04:32, willchan wrote: > It's not obvious to me that the NoBarrier variants are what you want to use > here. I would expect to do an Acquire_Load() and Release_Store(). Can you > explain why it's ok to use NoBarrier_Store() and NoBarrier_Load()? http://codereview.chromium.org/8702014/diff/7002/base/threading/thread_local_... base/threading/thread_local_storage_win.cc:132: // even call delete[] after we're done with destructors.) On 2011/11/28 20:58:28, willchan wrote: > You should again explain that this is because the allocator may depend on thread > local data structures that may no longer exist. Done. http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... File base/threading/thread_local_storage.h (right): http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage.h:85: // The maximum number of times to try to clear slots by caling destructors. On 2011/11/29 01:46:49, willchan wrote: > s/caling/calling/ Done. http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage.h:87: static const int PTHREAD_DESTRUCTOR_ITERATIONS = kThreadLocalStorageSize; On 2011/11/29 01:26:31, rvargas wrote: > I don't see why we want to expose this value in the header... or why we want to > go against the chromium naming convention for that matter. > > In fact, even in the cc file I'd remove the reference to pthreads. Done. http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... File base/threading/thread_local_storage_unittest.cc (right): http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage_unittest.cc:62: // Destructors should never be called wit a NULL. On 2011/11/29 01:46:49, willchan wrote: > s/wit/with/ Done. http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage_unittest.cc:63: ASSERT_NE(ptr, reinterpret_cast<int*>(NULL)); On 2011/11/29 01:46:49, willchan wrote: > Should be ASSERT_OP(expected, actual). You have it reversed here. Done. http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage_unittest.cc:109: EXPECT_EQ(values[index], kFinalTlsValue); On 2011/11/29 01:46:49, willchan wrote: > Looks like this one was wrong too, can you fix the order? Done. http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... File base/threading/thread_local_storage_win.cc (right): http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage_win.cc:53: // As a result, any attempt to call new (or malloc) will lazilly cause such a On 2011/11/29 01:46:49, willchan wrote: > s/lazilly/lazily/g Done. http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage_win.cc:57: // Use use a stack allocated vector, so that we don't have dependence On 2011/11/29 01:46:49, willchan wrote: > s/Use use/Use/g Done. http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage_win.cc:61: memset(stack_allocated_tls_data, 0, sizeof(stack_allocated_tls_data)); On 2011/11/29 01:46:49, willchan wrote: > s/sizeof/arraysize/ I think I want the size of the array. Array size would only give me a count (ratio of array size to element size), and not give me number of bytes. I could equally well have done: kThreadLocalStorageSize * sizeof (stack_allocated_tls_data[0]) ...but the above is clearer (subtle, in that one of the few times that an array is not equivalent to a pointer is when used as an arg to sizeof). http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage_win.cc:62: // Ensure that any rentrant calls change the temp version. On 2011/11/29 01:46:49, willchan wrote: > re-entrant Done. http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage_win.cc:67: memcpy(tls_data, stack_allocated_tls_data, sizeof(stack_allocated_tls_data)); On 2011/11/29 01:46:49, willchan wrote: > s/sizeof/arraysize/ Again... I think I have the right form. http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage_win.cc:90: // Setup our destructor. Make sure that any thread that is dostroyed can see On 2011/11/29 01:26:31, rvargas wrote: > nit: typo Done. http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage_win.cc:92: subtle::NoBarrier_Store(&tls_destructors_[slot_], On 2011/11/29 01:26:31, rvargas wrote: > Isn't it more readable to do a regular assignment? (assuming we don't want > barriers) Yeah... but I think it is a bit wrong. Perhaps if windows was only x86 it would be ok... but I'm not sure if I should count on that.
(if it wasn't clear, I really didn't looked at the code too much, just enough be picky). Thanks. http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... File base/threading/thread_local_storage_win.cc (right): http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage_win.cc:92: subtle::NoBarrier_Store(&tls_destructors_[slot_], On 2011/11/29 02:32:45, jar wrote: > On 2011/11/29 01:26:31, rvargas wrote: > > Isn't it more readable to do a regular assignment? (assuming we don't want > > barriers) > > Yeah... but I think it is a bit wrong. Perhaps if windows was only x86 it would > be ok... but I'm not sure if I should count on that. The way I'm reading this code (I may be wrong) is that we don't need a memory barrier here, regardless of the platform where this code is compiled. So I see this basically as a documentation line, telling the future reader of the code that the implications were already considered. If that's the case, I'd prefer a simple assignment and a comment saying that a barrier is not needed, that way the comment is clear, and the code is also clear. Your call :)
lgtm http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... File base/threading/thread_local_storage_win.cc (right): http://codereview.chromium.org/8702014/diff/15001/base/threading/thread_local... base/threading/thread_local_storage_win.cc:67: memcpy(tls_data, stack_allocated_tls_data, sizeof(stack_allocated_tls_data)); On 2011/11/29 02:32:45, jar wrote: > On 2011/11/29 01:46:49, willchan wrote: > > s/sizeof/arraysize/ > > Again... I think I have the right form. > Oops, you're right. Ignore me.
After further discussion with rvargas, I've relented and removed the use of atomics to store the destructor array. Although I thought that the atomic forces a save to main memory (without a barrier), I think that at best this sort of construct will only help slightly when there is a race (between a tls-free, and a thread destruction). I'll assume we won't race (or will catch and fix such), and that the atomic *might* not do what I *thought* it does (which is also what willchan was suggesting). Bottom line: revert to using non-atomic storage for the destructor array, which is no worse than prior status quo. The rest of the CL is unchanged (from the last LGTM).
Linux was failing, apparently because it did not willing to re-try the destructor calls more than about 4 times. I changed the unit test to call a smaller number of times, so hopefully it will work on all platoforms now. The error was: base/threading/thread_local_storage_unittest.cc:110: Failure Value of: kFinalTlsValue Actual: 30583 Expected: values[index] Which is: 30587 which suggested that I had actually (in the CL) exchanged the arguments to EXPECT(...) putting them backwards. I reverted the one on line 110 (that this CL changed).
After sleeping on it... I decided to at least add a volatile qualifier to the array of destructors, along with a comment as to why. Truth be known, in our codebase, we don't free TLS entries (perhaps we should remove that call form the API??), but I like code to be as clean/defensive as possible. The windows implementation supports lazy idempotent re-initialization of its keys (when a destructor is not used), and it would be nice to support the same feature in the posix implementation. To do that, we should probably have a lock (rather than the interlocked-exchange that is done in Windows), and then I'll just put all these destructor accesses inside that lock (and stop thinking about the problem). I would even do that now, except we really need a linker initialized spin lock (and a lazy singleton induced lock won't cut it, as it would call malloc :-( ). For now, the code is at least better than it was before, and it should be good enough that I can move on to fixing XP to properly support this service. I can then complete the thread cleanup issues on Windows (by migrating migrating windows versions of TCMalloc and nspr to use this base TLS service).
http://codereview.chromium.org/8702014/diff/13012/base/threading/thread_local... File base/threading/thread_local_storage_win.cc (right): http://codereview.chromium.org/8702014/diff/13012/base/threading/thread_local... base/threading/thread_local_storage_win.cc:31: volatile ThreadLocalStorage::TLSDestructorFunc I'm always super wary of use of volatile. Are you sure it's doing what you think it's doing? Please refer to http://en.wikipedia.org/wiki/Volatile_variable#In_C_and_C.2B.2B. http://codereview.chromium.org/8702014/diff/13012/base/threading/thread_local... base/threading/thread_local_storage_win.cc:118: g_tls_destructors[slot_] = NULL; extra horizontal whitespace?
Response to Willchan's comments. http://codereview.chromium.org/8702014/diff/13012/base/threading/thread_local... File base/threading/thread_local_storage_win.cc (right): http://codereview.chromium.org/8702014/diff/13012/base/threading/thread_local... base/threading/thread_local_storage_win.cc:31: volatile ThreadLocalStorage::TLSDestructorFunc I'm beyond super wary, and almost tend to think of it as useless. This is one of the rare cases where it seems applicable, and that point seems to be agreed to by the article you reference. Search for the text: "However, foo might represent a location that can be changed by other elements of the computer system at any time,... [use volatile]" That is indeed the situation we are put in with an asynchronous change via the tls-free call on another thread. We wish to preclude any compiler optimizations, including reading it form memory more than once (per use). They also summarize it as: "prevent the (pseudo)compiler from applying any optimizations on the code that assume values of variables cannot change "on their own." " These variables can change, and hence should not be read more than once, when the value of the first read is tested and re-used. On 2011/11/29 18:52:14, willchan wrote: > I'm always super wary of use of volatile. Are you sure it's doing what you think > it's doing? Please refer to > http://en.wikipedia.org/wiki/Volatile_variable#In_C_and_C.2B.2B. http://codereview.chromium.org/8702014/diff/13012/base/threading/thread_local... base/threading/thread_local_storage_win.cc:118: g_tls_destructors[slot_] = NULL; On 2011/11/29 18:52:14, willchan wrote: > extra horizontal whitespace? Done.
http://codereview.chromium.org/8702014/diff/13012/base/threading/thread_local... File base/threading/thread_local_storage_win.cc (right): http://codereview.chromium.org/8702014/diff/13012/base/threading/thread_local... base/threading/thread_local_storage_win.cc:31: volatile ThreadLocalStorage::TLSDestructorFunc On 2011/11/29 18:52:14, willchan wrote: > I'm always super wary of use of volatile. Are you sure it's doing what you think > it's doing? Please refer to > http://en.wikipedia.org/wiki/Volatile_variable#In_C_and_C.2B.2B. Given that this is Windows specific code (and assuming that we're not going to move away from VS), this version is actually more strict than the previous one (the one that was using the atomic type but nobarrier calls). Atomic implies memory barriers and no optimizations: http://msdn.microsoft.com/en-us/library/12a04hfd(VS.80).aspx That said, we probably want to revisit the whole issue soon, because chatting with Jim I'm now wary that at least the TLS API is not right.
OK, lgtm http://codereview.chromium.org/8702014/diff/11010/base/threading/thread_local... File base/threading/thread_local_storage.h (right): http://codereview.chromium.org/8702014/diff/11010/base/threading/thread_local... base/threading/thread_local_storage.h:82: #endif // OS_WIN This shouldn't be indented
http://codereview.chromium.org/8702014/diff/11010/base/threading/thread_local... File base/threading/thread_local_storage.h (right): http://codereview.chromium.org/8702014/diff/11010/base/threading/thread_local... base/threading/thread_local_storage.h:82: #endif // OS_WIN On 2011/11/29 21:25:31, willchan wrote: > This shouldn't be indented Done. |