Created: 4 years, 6 months ago by fdoray
Modified: 4 years, 5 months ago
CC: chromium-reviews, gab+watch_chromium.org, robliao+watch_chromium.org, fdoray+watch_chromium.org
Base URL: https://chromium.googlesource.com/chromium/src.git@master
Target Ref: refs/pending/heads/master
Project: chromium
Visibility: Public
Description:
TaskScheduler: Atomic operations in TaskTracker
With this CL, TaskTracker never acquires a lock except when shutdown is
in progress.
TaskTracker::State wraps an Atomic32 whose LSB indicates whether shutdown
has been initiated and whose other bits count the number of tasks
blocking shutdown. This State is accessed in the following situations:
A) Before a BLOCK_SHUTDOWN task is posted
Increment the Atomic32 by 2 [1] and read it. If the "shutdown initiated"
bit is set, acquire the lock [2] to:
- DCHECK that shutdown hasn't completed
- Record a UMA histogram
B) Before a CONTINUE_ON_SHUTDOWN or SKIP_ON_SHUTDOWN task is posted:
Read the Atomic32. The task is allowed to be posted iff the
"shutdown initiated" bit isn't set.
C) Before a BLOCK_SHUTDOWN task is executed.
Read the Atomic32. DCHECK that there are tasks blocking shutdown
(number of tasks blocking shutdown was incremented when the task
was posted).
D) Before a SKIP_ON_SHUTDOWN task is executed.
Increment the Atomic32 by 2 [1] and read it.
If the "shutdown initiated" bit isn't set:
The task is allowed to be executed.
If the "shutdown initiated" bit is set:
Decrement the Atomic32 by 2 [1] (to revert the incorrect increment) and
read it. If the number of tasks blocking shutdown became zero, do [3]
(note: this can trigger [3] more than once during shutdown as multiple
threads realize that shutdown has been requested, but [3] is resilient
to this).
E) Before a CONTINUE_ON_SHUTDOWN task is executed.
Read the Atomic32. Allow the task to be executed iff the "shutdown
initiated" bit isn't set.
F) After a BLOCK_SHUTDOWN or SKIP_ON_SHUTDOWN task is executed:
Decrement the Atomic32 by 2 [1] and read it. If the number of tasks
blocking shutdown became zero and "shutdown initiated", do [3].
G) When Shutdown() is called:
Acquire the lock.
Instantiate the shutdown complete event.
Increment the Atomic32 by 1 (to set the "shutdown initiated" bit) and
read it. From this moment, if a thread causes the number of tasks
blocking shutdown to become zero, it will do [3].
If there are no tasks blocking shutdown, signal the event and return.
Wait until the shutdown event is signaled.
[1] Incrementing the Atomic32 by 2 increments the number of tasks
blocking shutdown by 1 because the LSB is used to indicate whether
shutdown has been initiated.
[2] The TaskTracker must be locked to access the shutdown event and the
number of BLOCK_SHUTDOWN tasks posted during shutdown.
[3] These actions are performed by OnNumTasksBlockingShutdownIsZero():
- Acquire the lock.
- DCHECK that shutdown has started (i.e. "shutdown initiated" bit is
set and shutdown event exists).
- Signal the shutdown complete event (note: this event can be signaled
more than once as multiple threads reach the conclusion that
shutdown is complete, cf. the SKIP_ON_SHUTDOWN dance above, but this
is harmless)
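The A-G protocol above can be sketched as follows. This is a minimal illustrative sketch, not the actual Chromium class: std::atomic stands in for base::subtle::Atomic32, assert() stands in for DCHECK, and the class/method names are assumptions based on the description.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Illustrative sketch of the bit-packed state described above: the LSB is
// the "shutdown initiated" flag, the remaining bits count the number of
// tasks blocking shutdown.
class ShutdownState {
 public:
  static constexpr int32_t kShutdownInitiatedMask = 1;
  static constexpr int32_t kNumTasksIncrement = 2;  // +2 == one more task [1]

  // G: set the "shutdown initiated" bit; returns true iff tasks still
  // block shutdown (i.e. Shutdown() must wait for the event).
  bool StartShutdown() {
    int32_t new_bits = bits_.fetch_add(kShutdownInitiatedMask,
                                       std::memory_order_acq_rel) +
                       kShutdownInitiatedMask;
    assert(new_bits & kShutdownInitiatedMask);  // must not be called twice
    return (new_bits >> 1) != 0;
  }

  // A/D: increment the task count; returns true iff shutdown has started.
  bool IncrementNumTasksBlockingShutdown() {
    int32_t new_bits = bits_.fetch_add(kNumTasksIncrement,
                                       std::memory_order_acq_rel) +
                       kNumTasksIncrement;
    return new_bits & kShutdownInitiatedMask;
  }

  // D/F: decrement the task count; returns true iff this thread saw the
  // count reach zero while shutdown was in progress (i.e. do [3]).
  bool DecrementNumTasksBlockingShutdown() {
    int32_t new_bits = bits_.fetch_sub(kNumTasksIncrement,
                                       std::memory_order_acq_rel) -
                       kNumTasksIncrement;
    return (new_bits & kShutdownInitiatedMask) && (new_bits >> 1) == 0;
  }

  // B/E: posting/executing CONTINUE_ON_SHUTDOWN tasks only reads the bit.
  bool ShutdownHasStarted() const {
    return bits_.load(std::memory_order_acquire) & kShutdownInitiatedMask;
  }

 private:
  std::atomic<int32_t> bits_{0};
};
```

Every operation is a single atomic read-modify-write that returns the new value, which is what lets each caller decide locally whether it was the one that drove the count to zero.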
Benchmark - 10 million increment/decrement per thread

                       Atomic    Lock
4 threads Win Z840       1.5s    2.2s
4 threads Ubuntu VM      0.8s    4.9s
4 threads MacBookPro     1.8s  270.7s
8 threads Win Z840       3.0s    5.5s
8 threads Ubuntu VM      1.6s   11.8s
8 threads MacBookPro     4.2s  559.0s
BUG=553459
Committed: https://crrev.com/caa8d6675e165cd04fc22f24b36c1076d6720388
Cr-Commit-Position: refs/heads/master@{#406260}
Patch Set 1 #
Total comments: 13
Patch Set 2 : CR robliao #8 #
Patch Set 3 : CR gab #10 #
Patch Set 4 : CR gab #10 #
Total comments: 14
Patch Set 5 : CR gab #26 #
Total comments: 13
Patch Set 6 : CR gab/robliao #30-31 #
Total comments: 24
Patch Set 7 : CR danakj/gab #38-39 #
Total comments: 2
Patch Set 8 : CR danakj #45-46 #
Total comments: 7
Patch Set 9 : CR danakj #51 (add memory barriers) #
Messages
Total messages: 71 (23 generated)
Micro-benchmark: atomic counter vs. lock.
Release build on Windows.

4 threads:
Atomic: 0.19s
Lock: 1.9s

8 threads:
Atomic: 0.33s
Lock: 6.8s

Code:

class AtomicIncrementDecrementThread : public SimpleThread {
 public:
  AtomicIncrementDecrementThread(subtle::Atomic32* counter)
      : SimpleThread(""), counter_(counter) {}

  void Run() override {
    for (size_t i = 0; i < 1000000; ++i) {
      subtle::NoBarrier_AtomicIncrement(counter_, 1);
      subtle::NoBarrier_AtomicIncrement(counter_, -1);
    }
  }

 private:
  subtle::Atomic32* counter_;

  DISALLOW_COPY_AND_ASSIGN(AtomicIncrementDecrementThread);
};

class LockIncrementDecrementThread : public SimpleThread {
 public:
  LockIncrementDecrementThread(int32_t* counter, Lock* lock)
      : SimpleThread(""), counter_(counter), lock_(lock) {}

  void Run() override {
    for (size_t i = 0; i < 10000000; ++i) {
      {
        AutoLock auto_lock(*lock_);
        ++counter_;
      }
      {
        AutoLock auto_lock(*lock_);
        --counter_;
      }
    }
  }

 private:
  int32_t* counter_;
  Lock* lock_;

  DISALLOW_COPY_AND_ASSIGN(LockIncrementDecrementThread);
};

TEST(TaskTrackerPerfTest, Atomic) {
  subtle::Atomic32 counter = 0;

  TimeTicks start(TimeTicks::Now());

  std::vector<std::unique_ptr<AtomicIncrementDecrementThread>> threads;
  for (size_t i = 0; i < 8; ++i) {
    threads.push_back(WrapUnique(new AtomicIncrementDecrementThread(&counter)));
    threads.back()->Start();
  }
  for (const auto& thread : threads)
    thread->Join();

  TimeTicks end(TimeTicks::Now());
  TimeDelta elapsed_time = end - start;
  LOG(ERROR) << "Elapsed time: " << elapsed_time;

  EXPECT_EQ(0, subtle::NoBarrier_Load(&counter));
}

TEST(TaskTrackerPerfTest, Lock) {
  int32_t counter = 0;
  Lock lock;

  TimeTicks start(TimeTicks::Now());

  std::vector<std::unique_ptr<LockIncrementDecrementThread>> threads;
  for (size_t i = 0; i < 8; ++i) {
    threads.push_back(
        WrapUnique(new LockIncrementDecrementThread(&counter, &lock)));
    threads.back()->Start();
  }
  for (const auto& thread : threads)
    thread->Join();

  TimeTicks end(TimeTicks::Now());
  TimeDelta elapsed_time = end - start;
  LOG(ERROR) << "Elapsed time: " << elapsed_time;

  AutoLock auto_lock(lock);
  EXPECT_EQ(0, counter);
}
The CQ bit was checked by fdoray@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2019763002/1 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/2019763002/1
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
fdoray@chromium.org changed reviewers: + gab@chromium.org, robliao@chromium.org
PTAL
https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr...
File base/task_scheduler/task_tracker.cc (right):

https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr...
base/task_scheduler/task_tracker.cc:46: subtle::Barrier_AtomicIncrement(&bits_, kShutdownHasStartedMask);
If this gets called twice, doesn't that mean the LSB is now 0?

https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr...
base/task_scheduler/task_tracker.cc:80: const auto new_bits = subtle::NoBarrier_AtomicIncrement(
Is this safe without a barrier? If the atomic increment in IncrementNumTasks
gets reordered after DecrementNumTasks, we just told someone that the num of
tasks blocking shutdown hit 0.

https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr...
File base/task_scheduler/task_tracker.h (right):

https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr...
base/task_scheduler/task_tracker.h:76: void OnNumTasksBlockingShutdownIsZero();
Alternately OnBlockingShutdownTasksComplete
PTAnL

https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr...
File base/task_scheduler/task_tracker.cc (right):

https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr...
base/task_scheduler/task_tracker.cc:46: subtle::Barrier_AtomicIncrement(&bits_, kShutdownHasStartedMask);
On 2016/05/27 22:16:14, robliao wrote:
> If this gets called twice, doesn't that mean the LSB is now 0?
Yes. StartShutdown() can't be called twice. Added a DCHECK.

https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr...
base/task_scheduler/task_tracker.cc:80: const auto new_bits = subtle::NoBarrier_AtomicIncrement(
On 2016/05/27 22:16:14, robliao wrote:
> Is this safe without a barrier?
> If the atomic increment in IncrementNumTasks gets reordered after
> DecrementNumTasks, we just told someone that the num of tasks blocking shutdown
> hit 0.
I believe a barrier "specifies how regular, non-atomic memory accesses are
to be ordered around an atomic operation"
http://en.cppreference.com/w/cpp/atomic/memory_order
The atomic increment cannot be reordered after the atomic decrement.

https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr...
File base/task_scheduler/task_tracker.h (right):

https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr...
base/task_scheduler/task_tracker.h:76: void OnNumTasksBlockingShutdownIsZero();
On 2016/05/27 22:16:14, robliao wrote:
> Alternately
> OnBlockingShutdownTasksComplete
Done.
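The double-call hazard discussed above (a second increment of the LSB would flip the "shutdown initiated" bit back to 0) and the DCHECK that guards against it can be sketched as follows. This is an illustrative sketch only: std::atomic and assert() stand in for base::subtle::Atomic32 and DCHECK, and the names are assumptions.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

constexpr int32_t kShutdownHasStartedMask = 1;  // the LSB
std::atomic<int32_t> bits{0};

void StartShutdown() {
  // fetch_add returns the value *before* the increment; if the LSB was
  // already set, a second call would clear it, so assert it was 0.
  int32_t old_bits =
      bits.fetch_add(kShutdownHasStartedMask, std::memory_order_acq_rel);
  assert(!(old_bits & kShutdownHasStartedMask));  // must not be called twice
}
```

Checking the returned old value, rather than re-reading the variable, is what makes the check race-free.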
A few CL description notes and a major correction on the benchmark code below:

1) "The task is allowed the task to be executed." ?

2) "Before a BLOCK_SHUTDOWN task is executed. Read the Atomic32. DCHECK that
   the number of tasks blocking shutdown is zero (this number was incremented
   when the task was posted)."
   is NOT zero?

3) "Increment the Atomic32 by 2 [1] and read it."
   Document why inc+read before checking shutdown state is key to avoiding a
   shutdown race.

4) "BLOCK_SHUTDOWN of SKIP_ON_SHUTDOWN"
                   ^ "or"

5) "If the number of tasks blocking shutdown became zero, do [3]."
   Shouldn't [3] only be called if "shutdown initiated"?

6) In [3]: "DCHECK that the shutdown CV exists or that shutdown has
   completed."
   Is it possible to end up in [3] and have shutdown be completed? Shouldn't
   the last caller running [3] be the one releasing the CV and marking
   shutdown as completed?

On 2016/05/27 18:13:19, fdoray wrote:
> Micro-benchmark: atomic counter vs. lock.
> Release build on Windows.
>
> 4 threads:
> Atomic: 0.19s
> Lock: 1.9s
>
> 8 threads:
> Atomic: 0.33s
> Lock: 6.8s
>
> Code:
>
> class AtomicIncrementDecrementThread : public SimpleThread {
>  public:
>   AtomicIncrementDecrementThread(subtle::Atomic32* counter)
>       : SimpleThread(""), counter_(counter) {}
>
>   void Run() override {
>     for (size_t i = 0; i < 1000000; ++i) {
>       subtle::NoBarrier_AtomicIncrement(counter_, 1);
>       subtle::NoBarrier_AtomicIncrement(counter_, -1);
>     }
>   }
>
>  private:
>   subtle::Atomic32* counter_;
>
>   DISALLOW_COPY_AND_ASSIGN(AtomicIncrementDecrementThread);
> };
>
> class LockIncrementDecrementThread : public SimpleThread {
>  public:
>   LockIncrementDecrementThread(int32_t* counter, Lock* lock)
>       : SimpleThread(""), counter_(counter), lock_(lock) {}
>
>   void Run() override {
>     for (size_t i = 0; i < 10000000; ++i) {

Hmmm, AFAICT this is 10 million iterations whereas your atomic test is 1M
:-O! Maybe use a constant for both to avoid such zero counting mistakes?

Also, please add benchmarking results (once fixed) to the CL description
(easier to see post commit and easier to edit during review :-)).

>       {
>         AutoLock auto_lock(*lock_);
>         ++counter_;
>       }
>       {
>         AutoLock auto_lock(*lock_);
>         --counter_;
>       }
>     }
>   }
>
>  private:
>   int32_t* counter_;
>   Lock* lock_;
>
>   DISALLOW_COPY_AND_ASSIGN(LockIncrementDecrementThread);
> };
>
> TEST(TaskTrackerPerfTest, Atomic) {
>   subtle::Atomic32 counter = 0;
>
>   TimeTicks start(TimeTicks::Now());
>
>   std::vector<std::unique_ptr<AtomicIncrementDecrementThread>> threads;
>   for (size_t i = 0; i < 8; ++i) {
>     threads.push_back(WrapUnique(new AtomicIncrementDecrementThread(&counter)));
>     threads.back()->Start();
>   }
>   for (const auto& thread : threads)
>     thread->Join();
>
>   TimeTicks end(TimeTicks::Now());
>   TimeDelta elapsed_time = end - start;
>   LOG(ERROR) << "Elapsed time: " << elapsed_time;
>
>   EXPECT_EQ(0, subtle::NoBarrier_Load(&counter));
> }
>
> TEST(TaskTrackerPerfTest, Lock) {
>   int32_t counter = 0;
>   Lock lock;
>
>   TimeTicks start(TimeTicks::Now());
>
>   std::vector<std::unique_ptr<LockIncrementDecrementThread>> threads;
>   for (size_t i = 0; i < 8; ++i) {
>     threads.push_back(
>         WrapUnique(new LockIncrementDecrementThread(&counter, &lock)));
>     threads.back()->Start();
>   }
>   for (const auto& thread : threads)
>     thread->Join();
>
>   TimeTicks end(TimeTicks::Now());
>   TimeDelta elapsed_time = end - start;
>   LOG(ERROR) << "Elapsed time: " << elapsed_time;
>
>   AutoLock auto_lock(lock);
>   EXPECT_EQ(0, counter);
> }
Description was changed from ========== TaskScheduler: Atomic operations in TaskTracker With this CL, TaskTracker never acquires a lock except when shutdown is in progress. TaskTracker::State wraps an Atomic32 whose LSB indicates whether shutdown has been initiated and whose other bits count the number of tasks blocking shutdown. This State is accessed in the following situations: - Before a BLOCK_SHUTDOWN task is posted Increment the Atomic32 by 2 [1] and read it. If the "shutdown initiated" bit is set, acquire the lock [2] to: - DCHECK that shutdown hasn't completed - Record a UMA histogram - Before a CONTINUE_ON_SHUTDOWN or SKIP_ON_SHUTDOWN task is posted: Read the Atomic32. The task is allowed to be posted iff the "shutdown initiated" bit isn't set. - Before a BLOCK_SHUTDOWN task is executed. Read the Atomic32. DCHECK that the number of tasks blocking shutdown is zero (this number was incremented when the task was posted). - Before a SKIP_ON_SHUTDOWN task is executed. Increment the Atomic32 by 2 [1] and read it. If the "shutdown initiated" bit isn't set: The task is allowed the task to be executed. If the "shutdown initiated bit is set: Decrement the Atomic32 by 2 [1] (to revert the incorrect increment) and read it. If the number of tasks blocking shutdown became zero, do [3]. - Before a CONTINUE_ON_SHUTDOWN task is executed. Read the Atomic32. Allow the task to be executed iff the "shutdown initiated" bit isn't set. - After a BLOCK_SHUTDOWN of SKIP_ON_SHUTDOWN task is executed: Decrement the Atomic32 by 2 [1] and read it. If the number of tasks blocking shutdown became zero, do [3]. - When Shutdown() is called: Acquire the lock. Increment the Atomic32 by 1 (to set the "shutdown initiated bit"). From this moment, if a thread causes the number of tasks blocking shutdown to become zero, it will do [3]. Instantiate the shutdown CV. a. Read the number of tasks blocking shutdown. b. If there are no tasks blocking shutdown, go to e. c. 
Wait for the shutdown CV to be signaled. d. Go back to a. e. Set the "shutdown completed" flag. [1] Incrementing the Atomic32 by 2 increments the number of tasks blocking shutdown by 1 because the LSB is used to indicate whether shutdown has been initiated. [2] The TaskTracker must be locked to access the shutdown CV, the "shutdown completed" boolean and the number of BLOCK_SHUTDOWN tasks posted during shutdown. [3] These actions are performed by OnNumTasksBlockingShutdownIsZero(): - Acquire the lock. - DCHECK that the shutdown CV exists or that shutdown has completed. This condition must be true because Shutdown() holds the lock: - Between the time it sets the "shutdown initiated" flag and the time it instantiates the CV. - Between the time it deletes the CV and the time it sets the "shutdown completed" flag. - If the shutdown CV exists, signal it. BUG=553459 ========== to ========== TaskScheduler: Atomic operations in TaskTracker With this CL, TaskTracker never acquires a lock except when shutdown is in progress. TaskTracker::State wraps an Atomic32 whose LSB indicates whether shutdown has been initiated and whose other bits count the number of tasks blocking shutdown. This State is accessed in the following situations: - Before a BLOCK_SHUTDOWN task is posted Increment the Atomic32 by 2 [1] and read it. If the "shutdown initiated" bit is set, acquire the lock [2] to: - DCHECK that shutdown hasn't completed - Record a UMA histogram - Before a CONTINUE_ON_SHUTDOWN or SKIP_ON_SHUTDOWN task is posted: Read the Atomic32. The task is allowed to be posted iff the "shutdown initiated" bit isn't set. - Before a BLOCK_SHUTDOWN task is executed. Read the Atomic32. DCHECK that the number of tasks blocking shutdown is zero (this number was incremented when the task was posted). - Before a SKIP_ON_SHUTDOWN task is executed. Increment the Atomic32 by 2 [1] and read it. If the "shutdown initiated" bit isn't set: The task is allowed the task to be executed. 
If the "shutdown initiated bit is set: Decrement the Atomic32 by 2 [1] (to revert the incorrect increment) and read it. If the number of tasks blocking shutdown became zero, do [3]. - Before a CONTINUE_ON_SHUTDOWN task is executed. Read the Atomic32. Allow the task to be executed iff the "shutdown initiated" bit isn't set. - After a BLOCK_SHUTDOWN of SKIP_ON_SHUTDOWN task is executed: Decrement the Atomic32 by 2 [1] and read it. If the number of tasks blocking shutdown became zero, do [3]. - When Shutdown() is called: Acquire the lock. Increment the Atomic32 by 1 (to set the "shutdown initiated bit"). From this moment, if a thread causes the number of tasks blocking shutdown to become zero, it will do [3]. Instantiate the shutdown CV. a. Read the number of tasks blocking shutdown. b. If there are no tasks blocking shutdown, go to e. c. Wait for the shutdown CV to be signaled. d. Go back to a. e. Set the "shutdown completed" flag. [1] Incrementing the Atomic32 by 2 increments the number of tasks blocking shutdown by 1 because the LSB is used to indicate whether shutdown has been initiated. [2] The TaskTracker must be locked to access the shutdown CV, the "shutdown completed" boolean and the number of BLOCK_SHUTDOWN tasks posted during shutdown. [3] These actions are performed by OnNumTasksBlockingShutdownIsZero(): - Acquire the lock. - DCHECK that the shutdown CV exists or that shutdown has completed. This condition must be true because Shutdown() holds the lock: - Between the time it sets the "shutdown initiated" flag and the time it instantiates the CV. - Between the time it deletes the CV and the time it sets the "shutdown completed" flag. - If the shutdown CV exists, signal it. Benchmark - 10 million increment/decrement per thread Atomic Lock 4 threads Win Z840 1.5s 2.2s 4 thread Ubuntu VM 0.8s 4.9s 8 threads Win Z840 3.0s 5.5s 8 thread Ubuntu VM 1.6s 11.8s 8 threads atomic BUG=553459 ==========
https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr...
File base/task_scheduler/task_tracker.cc (right):

https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr...
base/task_scheduler/task_tracker.cc:80: const auto new_bits = subtle::NoBarrier_AtomicIncrement(
On 2016/05/30 15:48:04, fdoray wrote:
> On 2016/05/27 22:16:14, robliao wrote:
> > Is this safe without a barrier?
> > If the atomic increment in IncrementNumTasks gets reordered after
> > DecrementNumTasks, we just told someone that the num of tasks blocking
> shutdown
> > hit 0.
>
> I believe a barrier "specifies how regular, non-atomic memory accesses are
> to be ordered around an atomic operation"
> http://en.cppreference.com/w/cpp/atomic/memory_order
> The atomic increment cannot be reordered after the atomic decrement.
Our usage is not unlike shared_ptr refcounts. From the doc:
Typical use for relaxed memory ordering is incrementing counters, such as the
reference counters of std::shared_ptr, since this only requires atomicity,
but not ordering or synchronization (note that decrementing the shared_ptr
counters requires acquire-release synchronization with the destructor)
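The shared_ptr-refcount analogy above can be sketched as follows: increments only need atomicity (relaxed order), while the decrement that may release the resource needs acquire-release so that everything done while a reference was held happens-before the destruction. This is an illustrative sketch, not Chromium code; the names are assumptions.

```cpp
#include <atomic>

// Minimal refcount illustrating the ordering quoted above.
struct RefCounted {
  std::atomic<int> refs{1};  // constructed with one reference

  void AddRef() {
    // Relaxed is enough: nothing is decided based on the new value, we only
    // need the increment itself to be atomic.
    refs.fetch_add(1, std::memory_order_relaxed);
  }

  // Returns true when the last reference is dropped, i.e. the caller is
  // responsible for destroying the object.
  bool Release() {
    // acq_rel: the release half publishes this thread's writes to whoever
    // drops the last reference; the acquire half on the final decrement
    // makes every other thread's writes visible before destruction runs.
    return refs.fetch_sub(1, std::memory_order_acq_rel) == 1;
  }
};
```

The TaskTracker decrement paths (D and F) play the role of Release() here: the thread that observes the count reaching zero is the one that triggers [3].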
Description was changed. Updated description:

==========
TaskScheduler: Atomic operations in TaskTracker

With this CL, TaskTracker never acquires a lock except when shutdown is
in progress.

TaskTracker::State wraps an Atomic32 whose LSB indicates whether shutdown
has been initiated and whose other bits count the number of tasks blocking
shutdown. This State is accessed in the following situations:

- Before a BLOCK_SHUTDOWN task is posted:
  Increment the Atomic32 by 2 [1] and read it. If the "shutdown initiated"
  bit is set, acquire the lock [2] to:
  - DCHECK that shutdown hasn't completed
  - Record a UMA histogram

- Before a CONTINUE_ON_SHUTDOWN or SKIP_ON_SHUTDOWN task is posted:
  Read the Atomic32. The task is allowed to be posted iff the
  "shutdown initiated" bit isn't set.

- Before a BLOCK_SHUTDOWN task is executed:
  Read the Atomic32. DCHECK that there are tasks blocking shutdown (the
  number of tasks blocking shutdown was incremented when the task was
  posted).

- Before a SKIP_ON_SHUTDOWN task is executed:
  Increment the Atomic32 by 2 [1] and read it.
  If the "shutdown initiated" bit isn't set: the task is allowed to be
  executed.
  If the "shutdown initiated" bit is set: decrement the Atomic32 by 2 [1]
  (to revert the incorrect increment) and read it. If the number of tasks
  blocking shutdown became zero, do [3].
  Note: Reading the "shutdown initiated" bit and incrementing the number
  of tasks blocking shutdown only if it isn't set (without re-checking it
  afterwards) wouldn't work because the bit can be set between the two
  operations.

- Before a CONTINUE_ON_SHUTDOWN task is executed:
  Read the Atomic32. Allow the task to be executed iff the "shutdown
  initiated" bit isn't set.

- After a BLOCK_SHUTDOWN or SKIP_ON_SHUTDOWN task is executed:
  Decrement the Atomic32 by 2 [1] and read it. If the number of tasks
  blocking shutdown became zero and "shutdown initiated", do [3].

- When Shutdown() is called:
  Acquire the lock. Increment the Atomic32 by 1 (to set the "shutdown
  initiated" bit). From this moment, if a thread causes the number of
  tasks blocking shutdown to become zero, it will do [3]. Instantiate
  the shutdown CV.
  a. Read the number of tasks blocking shutdown.
  b. If there are no tasks blocking shutdown, go to e.
  c. Wait for the shutdown CV to be signaled.
  d. Go back to a.
  e. Set the "shutdown completed" flag.

[1] Incrementing the Atomic32 by 2 increments the number of tasks blocking
shutdown by 1 because the LSB is used to indicate whether shutdown has
been initiated.

[2] The TaskTracker must be locked to access the shutdown CV, the
"shutdown completed" boolean and the number of BLOCK_SHUTDOWN tasks posted
during shutdown.

[3] These actions are performed by OnNumTasksBlockingShutdownIsZero():
- Acquire the lock.
- DCHECK that the shutdown CV exists or that shutdown has completed.
  This condition must be true because Shutdown() holds the lock:
  - Between the time it sets the "shutdown initiated" flag and the time
    it instantiates the CV.
  - Between the time it deletes the CV and the time it sets the
    "shutdown completed" flag.
- If the shutdown CV exists, signal it.

Benchmark - 10 million increment/decrement per thread

                      Atomic    Lock
4 threads Win Z840      1.5s    2.2s
4 threads Ubuntu VM     0.8s    4.9s
8 threads Win Z840      3.0s    5.5s
8 threads Ubuntu VM     1.6s   11.8s

BUG=553459
==========
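The bit-packed State described above can be sketched as follows; the helper names are hypothetical, not the actual TaskTracker API, and std::atomic stands in for base/atomicops. Bit 0 is "shutdown initiated", the remaining bits count the tasks blocking shutdown, so "increment by 2" adds one blocking task without touching the shutdown bit.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

constexpr uint32_t kShutdownInitiatedMask = 1;
constexpr uint32_t kNumTasksBlockingShutdownIncrement = 2;

std::atomic<uint32_t> g_state{0};

// True iff the "shutdown initiated" bit (LSB) is set in |bits|.
bool ShutdownInitiated(uint32_t bits) {
  return (bits & kShutdownInitiatedMask) != 0;
}

// The count of tasks blocking shutdown lives in the upper 31 bits.
uint32_t NumTasksBlockingShutdown(uint32_t bits) {
  return bits >> 1;
}

// "Increment the Atomic32 by 2 [1] and read it."
uint32_t IncrementNumTasksBlockingShutdown() {
  return g_state.fetch_add(kNumTasksBlockingShutdownIncrement,
                           std::memory_order_relaxed) +
         kNumTasksBlockingShutdownIncrement;
}
```

Packing both pieces of state into one word is what lets a single atomic read-modify-write both register a blocking task and learn whether shutdown has started, with no window between the two.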
Description was changed. Updated description:

==========
TaskScheduler: Atomic operations in TaskTracker

With this CL, TaskTracker never acquires a lock except when shutdown is
in progress.

TaskTracker::State wraps an Atomic32 whose LSB indicates whether shutdown
has been initiated and whose other bits count the number of tasks blocking
shutdown. This State is accessed in the following situations:

- Before a BLOCK_SHUTDOWN task is posted:
  Increment the Atomic32 by 2 [1] and read it. If the "shutdown initiated"
  bit is set, acquire the lock [2] to:
  - DCHECK that shutdown hasn't completed
  - Record a UMA histogram

- Before a CONTINUE_ON_SHUTDOWN or SKIP_ON_SHUTDOWN task is posted:
  Read the Atomic32. The task is allowed to be posted iff the
  "shutdown initiated" bit isn't set.

- Before a BLOCK_SHUTDOWN task is executed:
  Read the Atomic32. DCHECK that there are tasks blocking shutdown (the
  number of tasks blocking shutdown was incremented when the task was
  posted).

- Before a SKIP_ON_SHUTDOWN task is executed:
  Increment the Atomic32 by 2 [1] and read it.
  If the "shutdown initiated" bit isn't set: the task is allowed to be
  executed.
  If the "shutdown initiated" bit is set: decrement the Atomic32 by 2 [1]
  (to revert the incorrect increment) and read it. If the number of tasks
  blocking shutdown became zero, do [3].
  Note: Reading the "shutdown initiated" bit and incrementing the number
  of tasks blocking shutdown only if it isn't set (without re-checking it
  afterwards) wouldn't work because the bit can be set between the two
  operations.

- Before a CONTINUE_ON_SHUTDOWN task is executed:
  Read the Atomic32. Allow the task to be executed iff the "shutdown
  initiated" bit isn't set.

- After a BLOCK_SHUTDOWN or SKIP_ON_SHUTDOWN task is executed:
  Decrement the Atomic32 by 2 [1] and read it. If the number of tasks
  blocking shutdown became zero and "shutdown initiated", do [3].

- When Shutdown() is called:
  Acquire the lock. Instantiate the shutdown event. Increment the
  Atomic32 by 1 (to set the "shutdown initiated" bit) and read it. From
  this moment, if a thread causes the number of tasks blocking shutdown
  to become zero, it will do [3]. If there are no tasks blocking
  shutdown, signal the event and return. Wait until the shutdown event
  is signaled.

[1] Incrementing the Atomic32 by 2 increments the number of tasks blocking
shutdown by 1 because the LSB is used to indicate whether shutdown has
been initiated.

[2] The TaskTracker must be locked to access the shutdown event and the
number of BLOCK_SHUTDOWN tasks posted during shutdown.

[3] These actions are performed by OnNumTasksBlockingShutdownIsZero():
- Acquire the lock.
- DCHECK that shutdown has started (i.e. the "shutdown initiated" bit is
  set and the shutdown event exists).
- Signal the shutdown event.

Benchmark - 10 million increment/decrement per thread

                       Atomic    Lock
4 threads Win Z840       1.5s    2.2s
4 threads Ubuntu VM      0.8s    4.9s
4 threads MacBookPro     1.8s  270.7s
8 threads Win Z840       3.0s    5.5s
8 threads Ubuntu VM      1.6s   11.8s
8 threads MacBookPro     4.2s  559.0s

BUG=553459
==========
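A minimal sketch of the SKIP_ON_SHUTDOWN execute path from the description above (optimistically increment, re-check the shutdown bit, revert on failure). Names are hypothetical, std::atomic stands in for base/atomicops, and an atomic bool stands in for signaling the shutdown event under the lock.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

constexpr uint32_t kShutdownInitiatedMask = 1;
constexpr uint32_t kNumTasksIncrement = 2;  // LSB is the shutdown bit.

std::atomic<uint32_t> g_state{0};
std::atomic<bool> g_shutdown_event_signaled{false};

// Stand-in for signaling the shutdown event under the lock ([3] above).
void OnNumTasksBlockingShutdownIsZero() {
  g_shutdown_event_signaled.store(true, std::memory_order_release);
}

// Stand-in for the part of Shutdown() that sets the LSB.
void InitiateShutdown() {
  g_state.fetch_add(kShutdownInitiatedMask, std::memory_order_relaxed);
}

// Returns true iff the SKIP_ON_SHUTDOWN task is allowed to run.
bool BeforeRunSkipOnShutdownTask() {
  const uint32_t new_bits =
      g_state.fetch_add(kNumTasksIncrement, std::memory_order_relaxed) +
      kNumTasksIncrement;
  if (!(new_bits & kShutdownInitiatedMask))
    return true;  // Shutdown hasn't been initiated: run the task.
  // Shutdown started between posting and running: revert the increment.
  const uint32_t reverted_bits =
      g_state.fetch_sub(kNumTasksIncrement, std::memory_order_acq_rel) -
      kNumTasksIncrement;
  if ((reverted_bits >> 1) == 0)
    OnNumTasksBlockingShutdownIsZero();
  return false;
}

// After a task that blocked shutdown finishes: decrement and, if shutdown
// was initiated and no blocking tasks remain, signal.
void AfterTaskExecuted() {
  const uint32_t new_bits =
      g_state.fetch_sub(kNumTasksIncrement, std::memory_order_acq_rel) -
      kNumTasksIncrement;
  if ((new_bits & kShutdownInitiatedMask) && (new_bits >> 1) == 0)
    OnNumTasksBlockingShutdownIsZero();
}
```

The increment-then-re-check shape is the point of the Note above: checking the bit first and incrementing only if clear would leave a window in which Shutdown() could set the bit and miss the task.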
On 2016/05/31 13:08:17, gab wrote:
> A few CL description notes and a major correction on the benchmark code below:
>
> 1) "The task is allowed the task to be executed." ?

Done

> 2) "Before a BLOCK_SHUTDOWN task is executed.
>    Read the Atomic32. DCHECK that the number of tasks blocking shutdown is
>    zero (this number was incremented when the task was posted)."
> is NOT zero?

Done

> 3) "Increment the Atomic32 by 2 [1] and read it."
> Document why inc+read before checking shutdown state is key to avoiding a
> shutdown race.

Done, in CL description and in code comment.

> 4) "BLOCK_SHUTDOWN of SKIP_ON_SHUTDOWN"
>    ^ "or"

Done

> 5) "If the number of tasks blocking shutdown became zero, do [3]."
> Shouldn't [3] only be called if "shutdown initiated"?

Done. Fixed CL description.

> 6) In [3]: "DCHECK that the shutdown CV exists or that shutdown has completed."
> Is it possible to end up in [3] and have shutdown be completed? Shouldn't the
> last caller running [3] be the one releasing the CV and marking shutdown as
> completed?

~Done. A worker thread may try to run a SKIP_ON_SHUTDOWN task *after* shutdown
has completed. It will increment+decrement the number of tasks blocking
shutdown and potentially do [3]. In the latest code, I never release the event:
I use "event exists and is signaled" as an indicator that shutdown has
completed.
> Updated benchmark code:

(updated results in the CL's description)

static const size_t kNumIterations = 10000000;
static const size_t kNumThreads = 8;

class AtomicIncrementDecrementThread : public SimpleThread {
 public:
  using Data = subtle::Atomic32;

  AtomicIncrementDecrementThread(Data* data)
      : SimpleThread(""), counter_(data) {}

  void Run() override {
    for (size_t i = 0; i < kNumIterations; ++i) {
      subtle::NoBarrier_AtomicIncrement(counter_, 1);
      subtle::NoBarrier_AtomicIncrement(counter_, -1);
    }
  }

 private:
  subtle::Atomic32* counter_;

  DISALLOW_COPY_AND_ASSIGN(AtomicIncrementDecrementThread);
};

class LockIncrementDecrementThread : public SimpleThread {
 public:
  using Data = std::pair<int32_t, Lock>;

  LockIncrementDecrementThread(Data* data)
      : SimpleThread(""), counter_(&data->first), lock_(&data->second) {}

  void Run() override {
    for (size_t i = 0; i < kNumIterations; ++i) {
      {
        AutoLock auto_lock(*lock_);
        ++*counter_;  // Dereference: ++counter_ would increment the pointer.
      }
      {
        AutoLock auto_lock(*lock_);
        --*counter_;
      }
    }
  }

 private:
  int32_t* counter_;
  Lock* lock_;

  DISALLOW_COPY_AND_ASSIGN(LockIncrementDecrementThread);
};

template <typename T>
class TaskTrackerPerfTest : public testing::Test {};

using ThreadTypes = ::testing::Types<AtomicIncrementDecrementThread,
                                     LockIncrementDecrementThread>;
TYPED_TEST_CASE(TaskTrackerPerfTest, ThreadTypes);

TYPED_TEST(TaskTrackerPerfTest, TaskTrackerPerfTest) {
  typename TypeParam::Data data;
  TimeTicks start(TimeTicks::Now());
  std::vector<std::unique_ptr<TypeParam>> threads;
  for (size_t i = 0; i < kNumThreads; ++i) {
    threads.push_back(WrapUnique(new TypeParam(&data)));
    threads.back()->Start();
  }
  for (const auto& thread : threads)
    thread->Join();
  TimeTicks end(TimeTicks::Now());
  TimeDelta elapsed_time = end - start;
  LOG(ERROR) << "Elapsed time: " << elapsed_time;
}
https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr...
File base/task_scheduler/task_tracker.cc (right):

https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr...
base/task_scheduler/task_tracker.cc:80: const auto new_bits = subtle::NoBarrier_AtomicIncrement(
On 2016/06/01 18:32:26, robliao wrote:
> On 2016/05/30 15:48:04, fdoray wrote:
> > On 2016/05/27 22:16:14, robliao wrote:
> > > Is this safe without a barrier?
> > > If the atomic increment in IncrementNumTasks gets reordered after
> > > DecrementNumTasks, we just told someone that the num of tasks blocking
> > > shutdown hit 0.
> >
> > I believe a barrier "specifies how regular, non-atomic memory accesses are
> > to be ordered around an atomic operation"
> > http://en.cppreference.com/w/cpp/atomic/memory_order The atomic increment
> > cannot be reordered after the atomic decrement.
>
> Our usage is not unlike shared_ptr refcounts. From the doc:
> Typical use for relaxed memory ordering is incrementing counters, such as the
> reference counters of std::shared_ptr, since this only requires atomicity,
> but not ordering or synchronization (note that decrementing the shared_ptr
> counters requires acquire-release synchronization with the destructor)

Are you suggesting that I use Barrier_AtomicIncrement instead of
NoBarrier_AtomicIncrement here? That would add a memory barrier which I don't
think is useful since after calling DecrementNumTasksBlockingShutdown(), we
always call OnBlockingShutdownTasksComplete() which acquires a lock (implicit
memory barrier).
https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr...
File base/task_scheduler/task_tracker.cc (right):

https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr...
base/task_scheduler/task_tracker.cc:80: const auto new_bits = subtle::NoBarrier_AtomicIncrement(
On 2016/06/06 16:43:35, fdoray wrote:
> On 2016/06/01 18:32:26, robliao wrote:
> > On 2016/05/30 15:48:04, fdoray wrote:
> > > On 2016/05/27 22:16:14, robliao wrote:
> > > > Is this safe without a barrier?
> > > > If the atomic increment in IncrementNumTasks gets reordered after
> > > > DecrementNumTasks, we just told someone that the num of tasks blocking
> > > > shutdown hit 0.
> > >
> > > I believe a barrier "specifies how regular, non-atomic memory accesses
> > > are to be ordered around an atomic operation"
> > > http://en.cppreference.com/w/cpp/atomic/memory_order The atomic increment
> > > cannot be reordered after the atomic decrement.
> >
> > Our usage is not unlike shared_ptr refcounts. From the doc:
> > Typical use for relaxed memory ordering is incrementing counters, such as
> > the reference counters of std::shared_ptr, since this only requires
> > atomicity, but not ordering or synchronization (note that decrementing the
> > shared_ptr counters requires acquire-release synchronization with the
> > destructor)
>
> Are you suggesting that I use Barrier_AtomicIncrement instead of
> NoBarrier_AtomicIncrement here? That would add a memory barrier which I don't
> think is useful since after calling DecrementNumTasksBlockingShutdown(), we
> always call OnBlockingShutdownTasksComplete() which acquires a lock (implicit
> memory barrier).

Yup, because it seems like we'll have a race without the barrier.

We decrement the count. An increment gets reordered before we make the call to
OnBlockingShutdownTasksComplete. The count is now non-zero when we make the
call.
robliao@: PTAnL https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr... base/task_scheduler/task_tracker.cc:80: const auto new_bits = subtle::NoBarrier_AtomicIncrement( On 2016/06/06 19:16:30, robliao wrote: > On 2016/06/06 16:43:35, fdoray wrote: > > On 2016/06/01 18:32:26, robliao wrote: > > > On 2016/05/30 15:48:04, fdoray wrote: > > > > On 2016/05/27 22:16:14, robliao wrote: > > > > > Is this safe without a barrier? > > > > > If the atomic increment in IncrementNumTasks gets reordered after > > > > > DecrementNumTasks, we just told someone that the num of tasks blocking > > > > shutdown > > > > > hit 0. > > > > > > > > I believe a barrier "specifies how regular, non-atomic memory accesses are > > to > > > be > > > > ordered around an atomic operation" > > > > http://en.cppreference.com/w/cpp/atomic/memory_order The atomic increment > > > cannot > > > > be reordered after the atomic decrement. > > > > > > Our usage is not unlike shared_ptr refcounts. From the doc: > > > Typical use for relaxed memory ordering is incrementing counters, such as > the > > > reference counters of std::shared_ptr, since this only requires atomicity, > but > > > not ordering or synchronization (note that decrementing the shared_ptr > > counters > > > requires acquire-release synchronization with the destructor) > > > > Are you suggesting that I use Barrier_AtomicIncrement instead of > > NoBarrier_AtomicIncrement here? That would add a memory barrier which I don't > > think is useful since after calling DecrementNumTasksBlockingShutdown(), we > > always call OnBlockingShutdownTasksComplete() which acquires a lock (implicit > > memory barrier). > > Yup, because it seems like we'll have a race without the barrier. > > We decrement the count. An increment gets reordered before we make the call to > OnBlockingShutdownTasksComplete. 
The count is now non-zero when we make the call.

Do you worry about an increment on thread A occurring between the decrement and the call to OnBlockingShutdownTasksComplete() on thread B? This increment must have been caused by one of these:

- A BLOCK_SHUTDOWN task was posted from a CONTINUE_ON_SHUTDOWN task or from a thread that isn't joined before the TaskScheduler is shut down [1]: We should disallow that.
- A SKIP_ON_SHUTDOWN task is about to be scheduled (BeforeRunTask): The counter will be decremented immediately after because the shutdown bit is set (https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas...). It doesn't matter if OnBlockingShutdownTasksComplete() is called on thread B when the counter hasn't been decremented yet, because no task is really blocking shutdown.

[1] The task can't have been posted from a SKIP_ON_SHUTDOWN or BLOCK_SHUTDOWN task. Otherwise, the counter wouldn't have become zero when it was decremented on thread B.
https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr... base/task_scheduler/task_tracker.cc:80: const auto new_bits = subtle::NoBarrier_AtomicIncrement( On 2016/06/16 13:19:37, fdoray wrote: > On 2016/06/06 19:16:30, robliao wrote: > > On 2016/06/06 16:43:35, fdoray wrote: > > > On 2016/06/01 18:32:26, robliao wrote: > > > > On 2016/05/30 15:48:04, fdoray wrote: > > > > > On 2016/05/27 22:16:14, robliao wrote: > > > > > > Is this safe without a barrier? > > > > > > If the atomic increment in IncrementNumTasks gets reordered after > > > > > > DecrementNumTasks, we just told someone that the num of tasks blocking > > > > > shutdown > > > > > > hit 0. > > > > > > > > > > I believe a barrier "specifies how regular, non-atomic memory accesses > are > > > to > > > > be > > > > > ordered around an atomic operation" > > > > > http://en.cppreference.com/w/cpp/atomic/memory_order The atomic > increment > > > > cannot > > > > > be reordered after the atomic decrement. > > > > > > > > Our usage is not unlike shared_ptr refcounts. From the doc: > > > > Typical use for relaxed memory ordering is incrementing counters, such as > > the > > > > reference counters of std::shared_ptr, since this only requires atomicity, > > but > > > > not ordering or synchronization (note that decrementing the shared_ptr > > > counters > > > > requires acquire-release synchronization with the destructor) > > > > > > Are you suggesting that I use Barrier_AtomicIncrement instead of > > > NoBarrier_AtomicIncrement here? That would add a memory barrier which I > don't > > > think is useful since after calling DecrementNumTasksBlockingShutdown(), we > > > always call OnBlockingShutdownTasksComplete() which acquires a lock > (implicit > > > memory barrier). > > > > Yup, because it seems like we'll have a race without the barrier. 
> > > > We decrement the count. An increment gets reordered before we make the call to > > OnBlockingShutdownTasksComplete. The count is now non-zero when we make the > > call. > > Do you worry about increment on thread A occurring between decrement and call to > OnBlockingShutdownTasksComplete() on thread B? This increment must have been > caused by one of these: > > - A BLOCK_SHUTDOWN task was posted from a CONTINUE_ON_SHUTDOWN task or from a > thread that isn't joined before the TaskScheduler is shut down [1]: We should > disallow that. > - A SKIP_ON_SHUTDOWN task is about to be scheduled (BeforeRunTask): The counter > will be decremented immediately after because the shutdown bit is set > (https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas...). > It doesn't matter if OnBlockingShutdownTasksComplete() is called on thread B > when the counter hasn't been decremented yet because no task is really blocking > shutdown. > > [1] The task can't have been posted from a SKIP_ON_SHUTDOWN or BLOCK_SHUTDOWN > task. Otherwise, the counter wouldn't have become zero when it was decremented > on thread B. I'm considering the atomics in isolation to the rest of the system. Relaxing the memory consistency requirement based off of existing knowledge on how the rest of the system is currently calling the state makes this less resilient towards changes in calling pattern. It would be difficult for us to trace bugs through here if we ran OnBlockingShutdownTasksComplete with a non-zero value in num_tasks_blocking_shutdown.
https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr... base/task_scheduler/task_tracker.cc:80: const auto new_bits = subtle::NoBarrier_AtomicIncrement( On 2016/06/16 20:40:45, robliao wrote: > On 2016/06/16 13:19:37, fdoray wrote: > > On 2016/06/06 19:16:30, robliao wrote: > > > On 2016/06/06 16:43:35, fdoray wrote: > > > > On 2016/06/01 18:32:26, robliao wrote: > > > > > On 2016/05/30 15:48:04, fdoray wrote: > > > > > > On 2016/05/27 22:16:14, robliao wrote: > > > > > > > Is this safe without a barrier? > > > > > > > If the atomic increment in IncrementNumTasks gets reordered after > > > > > > > DecrementNumTasks, we just told someone that the num of tasks > blocking > > > > > > shutdown > > > > > > > hit 0. > > > > > > > > > > > > I believe a barrier "specifies how regular, non-atomic memory accesses > > are > > > > to > > > > > be > > > > > > ordered around an atomic operation" > > > > > > http://en.cppreference.com/w/cpp/atomic/memory_order The atomic > > increment > > > > > cannot > > > > > > be reordered after the atomic decrement. > > > > > > > > > > Our usage is not unlike shared_ptr refcounts. From the doc: > > > > > Typical use for relaxed memory ordering is incrementing counters, such > as > > > the > > > > > reference counters of std::shared_ptr, since this only requires > atomicity, > > > but > > > > > not ordering or synchronization (note that decrementing the shared_ptr > > > > counters > > > > > requires acquire-release synchronization with the destructor) > > > > > > > > Are you suggesting that I use Barrier_AtomicIncrement instead of > > > > NoBarrier_AtomicIncrement here? 
That would add a memory barrier which I > > don't > > > > think is useful since after calling DecrementNumTasksBlockingShutdown(), > we > > > > always call OnBlockingShutdownTasksComplete() which acquires a lock > > (implicit > > > > memory barrier). > > > > > > Yup, because it seems like we'll have a race without the barrier. > > > > > > We decrement the count. An increment gets reordered before we make the call > to > > > OnBlockingShutdownTasksComplete. The count is now non-zero when we make the > > > call. > > > > Do you worry about increment on thread A occurring between decrement and call > to > > OnBlockingShutdownTasksComplete() on thread B? This increment must have been > > caused by one of these: > > > > - A BLOCK_SHUTDOWN task was posted from a CONTINUE_ON_SHUTDOWN task or from a > > thread that isn't joined before the TaskScheduler is shut down [1]: We should > > disallow that. > > - A SKIP_ON_SHUTDOWN task is about to be scheduled (BeforeRunTask): The > counter > > will be decremented immediately after because the shutdown bit is set > > > (https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas...). > > It doesn't matter if OnBlockingShutdownTasksComplete() is called on thread B > > when the counter hasn't been decremented yet because no task is really > blocking > > shutdown. > > > > [1] The task can't have been posted from a SKIP_ON_SHUTDOWN or BLOCK_SHUTDOWN > > task. Otherwise, the counter wouldn't have become zero when it was decremented > > on thread B. > > I'm considering the atomics in isolation to the rest of the system. Relaxing the > memory consistency requirement based off of existing knowledge on how the rest > of the system is currently calling the state makes this less resilient towards > changes in calling pattern. It would be difficult for us to trace bugs through > here if we ran OnBlockingShutdownTasksComplete with a non-zero value in > num_tasks_blocking_shutdown. I'm not against adding a memory barrier... 
but it won't prevent another thread from incrementing the counter (can happen in BeforeRunTask with shutdown_behavior == SKIP_ON_SHUTDOWN) between a counter decrement and a call to OnBlockingShutdownTasksComplete().
https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2019763002/diff/1/base/task_scheduler/task_tr... base/task_scheduler/task_tracker.cc:80: const auto new_bits = subtle::NoBarrier_AtomicIncrement( On 2016/06/17 20:28:49, fdoray wrote: > On 2016/06/16 20:40:45, robliao wrote: > > On 2016/06/16 13:19:37, fdoray wrote: > > > On 2016/06/06 19:16:30, robliao wrote: > > > > On 2016/06/06 16:43:35, fdoray wrote: > > > > > On 2016/06/01 18:32:26, robliao wrote: > > > > > > On 2016/05/30 15:48:04, fdoray wrote: > > > > > > > On 2016/05/27 22:16:14, robliao wrote: > > > > > > > > Is this safe without a barrier? > > > > > > > > If the atomic increment in IncrementNumTasks gets reordered after > > > > > > > > DecrementNumTasks, we just told someone that the num of tasks > > blocking > > > > > > > shutdown > > > > > > > > hit 0. > > > > > > > > > > > > > > I believe a barrier "specifies how regular, non-atomic memory > accesses > > > are > > > > > to > > > > > > be > > > > > > > ordered around an atomic operation" > > > > > > > http://en.cppreference.com/w/cpp/atomic/memory_order The atomic > > > increment > > > > > > cannot > > > > > > > be reordered after the atomic decrement. > > > > > > > > > > > > Our usage is not unlike shared_ptr refcounts. From the doc: > > > > > > Typical use for relaxed memory ordering is incrementing counters, such > > as > > > > the > > > > > > reference counters of std::shared_ptr, since this only requires > > atomicity, > > > > but > > > > > > not ordering or synchronization (note that decrementing the shared_ptr > > > > > counters > > > > > > requires acquire-release synchronization with the destructor) > > > > > > > > > > Are you suggesting that I use Barrier_AtomicIncrement instead of > > > > > NoBarrier_AtomicIncrement here? 
That would add a memory barrier which I > > > don't > > > > > think is useful since after calling DecrementNumTasksBlockingShutdown(), > > we > > > > > always call OnBlockingShutdownTasksComplete() which acquires a lock > > > (implicit > > > > > memory barrier). > > > > > > > > Yup, because it seems like we'll have a race without the barrier. > > > > > > > > We decrement the count. An increment gets reordered before we make the > call > > to > > > > OnBlockingShutdownTasksComplete. The count is now non-zero when we make > the > > > > call. > > > > > > Do you worry about increment on thread A occurring between decrement and > call > > to > > > OnBlockingShutdownTasksComplete() on thread B? This increment must have been > > > caused by one of these: > > > > > > - A BLOCK_SHUTDOWN task was posted from a CONTINUE_ON_SHUTDOWN task or from > a > > > thread that isn't joined before the TaskScheduler is shut down [1]: We > should > > > disallow that. > > > - A SKIP_ON_SHUTDOWN task is about to be scheduled (BeforeRunTask): The > > counter > > > will be decremented immediately after because the shutdown bit is set > > > > > > (https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas...). > > > It doesn't matter if OnBlockingShutdownTasksComplete() is called on thread B > > > when the counter hasn't been decremented yet because no task is really > > blocking > > > shutdown. > > > > > > [1] The task can't have been posted from a SKIP_ON_SHUTDOWN or > BLOCK_SHUTDOWN > > > task. Otherwise, the counter wouldn't have become zero when it was > decremented > > > on thread B. > > > > I'm considering the atomics in isolation to the rest of the system. Relaxing > the > > memory consistency requirement based off of existing knowledge on how the rest > > of the system is currently calling the state makes this less resilient towards > > changes in calling pattern. 
> > It would be difficult for us to trace bugs through here if we ran
> > OnBlockingShutdownTasksComplete with a non-zero value in
> > num_tasks_blocking_shutdown.
>
> I'm not against adding a memory barrier... but it won't prevent another thread
> from incrementing the counter (can happen in BeforeRunTask with
> shutdown_behavior == SKIP_ON_SHUTDOWN) between a counter decrement and a call
> to OnBlockingShutdownTasksComplete().

Oddly enough, I think the same thing can happen in the lock world. An unlucky thread needs to increment the count and waits on the currently running thread that happens to decrement the count to zero. The old behavior is to DCHECK and then disallow execution. We can plausibly do the same here: the increment "fails" and the task is rejected.
lvg, couple things => In CL description: 1) With this CL, TaskTracker never acquires a lock except when shutdown is in progress. "is NOT in progress" 2) If the "shutdown initiated bit is set: Decrement the Atomic32 by 2 [1] (to revert the incorrect increment) and read it. If the number of tasks blocking shutdown became zero, do [3]. Add a note after "do [3]" that: " (note: this can trigger [3] more than once during shutdown as multiple threads realize that shutdown has been requested, but [3] is resilient to this). (I think this + (5.b) below resolves your point of contention with Rob?) 3) Note: Reading the "shutdown initiated" bit and incrementing the number of tasks blocking shutdown if it isn't set (without re-checking it afterwards) wouldn't work because the bit can be set between the two operations. I don't think this note is necessary, it's implicit that any atomic must be update/read in one step. 4) Instantiate the shutdown event. s/shutdown event/shutdown complete event/ ? 5) - Signal the shutdown event. a) s/shutdown event/shutdown complete event/ ? b) + "(note: this event can be signaled more than once per multiple threads reaching the conclusion that shutdown is complete, ref. SKIP_ON_SHUTDOWN dance above, but this is harmless) https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas... base/task_scheduler/task_tracker.cc:55: new_value / kNumTasksBlockingShutdownIncrement; I think it would be simpler to have the shutdown bit in the MSB. With constexpr the mask should be computable in kShutdownHasStartedMask at compile time as something like 0x1 << 31. Then you don't need weird divisions and +2/-2 business. https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas... 
base/task_scheduler/task_tracker.cc:212: return !!shutdown_event_ && !shutdown_event_->IsSignaled(); Don't think you need !! because the && explicitly turns this into a boolean expression already. https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas... base/task_scheduler/task_tracker.cc:278: // between the two operations. As mentioned in my CL description comments I don't think this note is required. https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas... File base/task_scheduler/task_tracker.h (right): https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas... base/task_scheduler/task_tracker.h:80: // Synchronizes access to all members below. s/all members below/shutdown related members below/ ? https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas... base/task_scheduler/task_tracker.h:81: mutable SchedulerLock lock_; shutdown_lock_ ? https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas... File base/task_scheduler/task_tracker_unittest.cc (right): https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas... base/task_scheduler/task_tracker_unittest.cc:96: ; rm https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas... base/task_scheduler/task_tracker_unittest.cc:175: #define WAIT_FOR_ASYNC_ShutdownCompleted() \ CAPS
Description was changed from ========== TaskScheduler: Atomic operations in TaskTracker With this CL, TaskTracker never acquires a lock except when shutdown is in progress. TaskTracker::State wraps an Atomic32 whose LSB indicates whether shutdown has been initiated and whose other bits count the number of tasks blocking shutdown. This State is accessed in the following situations: - Before a BLOCK_SHUTDOWN task is posted Increment the Atomic32 by 2 [1] and read it. If the "shutdown initiated" bit is set, acquire the lock [2] to: - DCHECK that shutdown hasn't completed - Record a UMA histogram - Before a CONTINUE_ON_SHUTDOWN or SKIP_ON_SHUTDOWN task is posted: Read the Atomic32. The task is allowed to be posted iff the "shutdown initiated" bit isn't set. - Before a BLOCK_SHUTDOWN task is executed. Read the Atomic32. DCHECK that there are tasks blocking shutdown (number of tasks blocking shutdown was incremented when the task was posted). - Before a SKIP_ON_SHUTDOWN task is executed. Increment the Atomic32 by 2 [1] and read it. If the "shutdown initiated" bit isn't set: The task is allowed to be executed. If the "shutdown initiated bit is set: Decrement the Atomic32 by 2 [1] (to revert the incorrect increment) and read it. If the number of tasks blocking shutdown became zero, do [3]. Note: Reading the "shutdown initiated" bit and incrementing the number of tasks blocking shutdown if it isn't set (without re-checking it afterwards) wouldn't work because the bit can be set between the two operations. - Before a CONTINUE_ON_SHUTDOWN task is executed. Read the Atomic32. Allow the task to be executed iff the "shutdown initiated" bit isn't set. - After a BLOCK_SHUTDOWN or SKIP_ON_SHUTDOWN task is executed: Decrement the Atomic32 by 2 [1] and read it. If the number of tasks blocking shutdown became zero and "shutdown initiated", do [3]. - When Shutdown() is called: Acquire the lock. Instantiate the shutdown event. 
Increment the Atomic32 by 1 (to set the "shutdown initiated bit") and read it. From this moment, if a thread causes the number of tasks blocking shutdown to become zero, it will do [3]. If there are no tasks blocking shutdown, signal the event and return. Wait until the shutdown event is signaled. [1] Incrementing the Atomic32 by 2 increments the number of tasks blocking shutdown by 1 because the LSB is used to indicate whether shutdown has been initiated. [2] The TaskTracker must be locked to access the shutdown event and the number of BLOCK_SHUTDOWN tasks posted during shutdown. [3] These actions are performed by OnNumTasksBlockingShutdownIsZero(): - Acquire the lock. - DCHECK that shutdown has started (i.e. "shutdown initiated" bit is set and shutdown event exists). - Signal the shutdown event. Benchmark - 10 million increment/decrement per thread Atomic Lock 4 threads Win Z840 1.5s 2.2s 4 threads Ubuntu VM 0.8s 4.9s 4 threads MacBookPro 1.8s 270.7s 8 threads Win Z840 3.0s 5.5s 8 threads Ubuntu VM 1.6s 11.8s 8 threads MacBookPro 4.2s 559.0s BUG=553459 ========== to ========== TaskScheduler: Atomic operations in TaskTracker With this CL, TaskTracker never acquires a lock except when shutdown is not in progress. TaskTracker::State wraps an Atomic32 whose LSB indicates whether shutdown has been initiated and whose other bits count the number of tasks blocking shutdown. This State is accessed in the following situations: A) Before a BLOCK_SHUTDOWN task is posted Increment the Atomic32 by 2 [1] and read it. If the "shutdown initiated" bit is set, acquire the lock [2] to: - DCHECK that shutdown hasn't completed - Record a UMA histogram B) Before a CONTINUE_ON_SHUTDOWN or SKIP_ON_SHUTDOWN task is posted: Read the Atomic32. The task is allowed to be posted iff the "shutdown initiated" bit isn't set. C) Before a BLOCK_SHUTDOWN task is executed. Read the Atomic32. 
DCHECK that there are tasks blocking shutdown (number of tasks blocking shutdown was incremented when the task was posted).

D) Before a SKIP_ON_SHUTDOWN task is executed.
Increment the Atomic32 by 2 [1] and read it.
If the "shutdown initiated" bit isn't set: The task is allowed to be executed.
If the "shutdown initiated" bit is set: Decrement the Atomic32 by 2 [1] (to revert the incorrect increment) and read it. If the number of tasks blocking shutdown became zero, do [3]. (note: this can trigger [3] more than once during shutdown as multiple threads realize that shutdown has been requested, but [3] is resilient to this).

E) Before a CONTINUE_ON_SHUTDOWN task is executed.
Read the Atomic32. Allow the task to be executed iff the "shutdown initiated" bit isn't set.

F) After a BLOCK_SHUTDOWN or SKIP_ON_SHUTDOWN task is executed:
Decrement the Atomic32 by 2 [1] and read it. If the number of tasks blocking shutdown became zero and "shutdown initiated", do [3].

G) When Shutdown() is called:
Acquire the lock. Instantiate the shutdown complete event. Increment the Atomic32 by 1 (to set the "shutdown initiated" bit) and read it. From this moment, if a thread causes the number of tasks blocking shutdown to become zero, it will do [3]. If there are no tasks blocking shutdown, signal the event and return. Wait until the shutdown event is signaled.

[1] Incrementing the Atomic32 by 2 increments the number of tasks blocking shutdown by 1 because the LSB is used to indicate whether shutdown has been initiated.
[2] The TaskTracker must be locked to access the shutdown event and the number of BLOCK_SHUTDOWN tasks posted during shutdown.
[3] These actions are performed by OnNumTasksBlockingShutdownIsZero():
- Acquire the lock.
- DCHECK that shutdown has started (i.e. "shutdown initiated" bit is set and shutdown event exists).
- Signal the shutdown complete event (note: this event can be signaled more than once as multiple threads reach the conclusion that shutdown is complete, ref. SKIP_ON_SHUTDOWN dance above, but this is harmless)

Benchmark - 10 million increment/decrement per thread
                       Atomic    Lock
4 threads Win Z840     1.5s      2.2s
4 threads Ubuntu VM    0.8s      4.9s
4 threads MacBookPro   1.8s      270.7s
8 threads Win Z840     3.0s      5.5s
8 threads Ubuntu VM    1.6s      11.8s
8 threads MacBookPro   4.2s      559.0s

BUG=553459
==========
PTAnL

https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas... File base/task_scheduler/task_tracker.cc (right):

https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas... base/task_scheduler/task_tracker.cc:55: new_value / kNumTasksBlockingShutdownIncrement;
On 2016/06/20 16:46:20, gab wrote:
> I think it would be simpler to have the shutdown bit in the MSB. With constexpr
> the mask should be computable in kShutdownHasStartedMask at compile time as
> something like 0x1 << 31.
>
> Then you don't need weird divisions and +2/-2 business.
I don't know how I would set the shutdown bit if it were the MSB because base/atomicops.h ...
- ... doesn't have an atomic "and" (can't do atomic &= 2^31)
- ... only works with signed integers (can't do atomic += 2^31)

https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas... base/task_scheduler/task_tracker.cc:212: return !!shutdown_event_ && !shutdown_event_->IsSignaled();
On 2016/06/20 16:46:20, gab wrote:
> Don't think you need !! because the && explicitly turns this into a boolean
> expression already.
Done.

https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas... base/task_scheduler/task_tracker.cc:278: // between the two operations.
On 2016/06/20 16:46:20, gab wrote:
> As mentioned in my CL description comments I don't think this note is required.
Done.

https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas... File base/task_scheduler/task_tracker.h (right):

https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas... base/task_scheduler/task_tracker.h:80: // Synchronizes access to all members below.
On 2016/06/20 16:46:21, gab wrote:
> s/all members below/shutdown related members below/ ?
Done.

https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas...
base/task_scheduler/task_tracker.h:81: mutable SchedulerLock lock_; On 2016/06/20 16:46:21, gab wrote: > shutdown_lock_ ? Done. https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas... File base/task_scheduler/task_tracker_unittest.cc (right): https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas... base/task_scheduler/task_tracker_unittest.cc:96: ; On 2016/06/20 16:46:21, gab wrote: > rm Done. https://codereview.chromium.org/2019763002/diff/60001/base/task_scheduler/tas... base/task_scheduler/task_tracker_unittest.cc:175: #define WAIT_FOR_ASYNC_ShutdownCompleted() \ On 2016/06/20 16:46:21, gab wrote: > CAPS Done.
Description was changed from ========== TaskScheduler: Atomic operations in TaskTracker With this CL, TaskTracker never acquires a lock except when shutdown is not in progress. TaskTracker::State wraps an Atomic32 whose LSB indicates whether shutdown has been initiated and whose other bits count the number of tasks blocking shutdown. This State is accessed in the following situations: A) Before a BLOCK_SHUTDOWN task is posted Increment the Atomic32 by 2 [1] and read it. If the "shutdown initiated" bit is set, acquire the lock [2] to: - DCHECK that shutdown hasn't completed - Record a UMA histogram B) Before a CONTINUE_ON_SHUTDOWN or SKIP_ON_SHUTDOWN task is posted: Read the Atomic32. The task is allowed to be posted iff the "shutdown initiated" bit isn't set. C) Before a BLOCK_SHUTDOWN task is executed. Read the Atomic32. DCHECK that there are tasks blocking shutdown (number of tasks blocking shutdown was incremented when the task was posted). D) Before a SKIP_ON_SHUTDOWN task is executed. Increment the Atomic32 by 2 [1] and read it. If the "shutdown initiated" bit isn't set: The task is allowed to be executed. If the "shutdown initiated bit is set: Decrement the Atomic32 by 2 [1] (to revert the incorrect increment) and read it. If the number of tasks blocking shutdown became zero, do [3]. (note: this can trigger [3] more than once during shutdown as multiple threads realize that shutdown has been requested, but [3] is resilient to this). E) Before a CONTINUE_ON_SHUTDOWN task is executed. Read the Atomic32. Allow the task to be executed iff the "shutdown initiated" bit isn't set. F) After a BLOCK_SHUTDOWN or SKIP_ON_SHUTDOWN task is executed: Decrement the Atomic32 by 2 [1] and read it. If the number of tasks blocking shutdown became zero and "shutdown initiated", do [3]. G) When Shutdown() is called: Acquire the lock. Instantiate the shutdown complete event. Increment the Atomic32 by 1 (to set the "shutdown initiated bit") and read it. 
From this moment, if a thread causes the number of tasks blocking shutdown to become zero, it will do [3]. If there are no tasks blocking shutdown, signal the event and return. Wait until the shutdown event is signaled. [1] Incrementing the Atomic32 by 2 increments the number of tasks blocking shutdown by 1 because the LSB is used to indicate whether shutdown has been initiated. [2] The TaskTracker must be locked to access the shutdown event and the number of BLOCK_SHUTDOWN tasks posted during shutdown. [3] These actions are performed by OnNumTasksBlockingShutdownIsZero(): - Acquire the lock. - DCHECK that shutdown has started (i.e. "shutdown initiated" bit is set and shutdown event exists). - Signal the shutdown complete event (note: this event can be signaled more than once per multiple threads reaching the conclusion that shutdown is complete, ref. SKIP_ON_SHUTDOWN dance above, but this is harmless) Benchmark - 10 million increment/decrement per thread Atomic Lock 4 threads Win Z840 1.5s 2.2s 4 threads Ubuntu VM 0.8s 4.9s 4 threads MacBookPro 1.8s 270.7s 8 threads Win Z840 3.0s 5.5s 8 threads Ubuntu VM 1.6s 11.8s 8 threads MacBookPro 4.2s 559.0s BUG=553459 ========== to ========== TaskScheduler: Atomic operations in TaskTracker With this CL, TaskTracker never acquires a lock except when shutdown is not in progress. TaskTracker::State wraps an Atomic32 whose LSB indicates whether shutdown has been initiated and whose other bits count the number of tasks blocking shutdown. This State is accessed in the following situations: A) Before a BLOCK_SHUTDOWN task is posted Increment the Atomic32 by 2 [1] and read it. If the "shutdown initiated" bit is set, acquire the lock [2] to: - DCHECK that shutdown hasn't completed - Record a UMA histogram B) Before a CONTINUE_ON_SHUTDOWN or SKIP_ON_SHUTDOWN task is posted: Read the Atomic32. The task is allowed to be posted iff the "shutdown initiated" bit isn't set. C) Before a BLOCK_SHUTDOWN task is executed. Read the Atomic32. 
DCHECK that there are tasks blocking shutdown (number of tasks blocking shutdown was incremented when the task was posted).

D) Before a SKIP_ON_SHUTDOWN task is executed.
Increment the Atomic32 by 2 [1] and read it.
If the "shutdown initiated" bit isn't set: The task is allowed to be executed.
If the "shutdown initiated" bit is set: Decrement the Atomic32 by 2 [1] (to revert the incorrect increment) and read it. If the number of tasks blocking shutdown became zero, do [3] (note: this can trigger [3] more than once during shutdown as multiple threads realize that shutdown has been requested, but [3] is resilient to this).

E) Before a CONTINUE_ON_SHUTDOWN task is executed.
Read the Atomic32. Allow the task to be executed iff the "shutdown initiated" bit isn't set.

F) After a BLOCK_SHUTDOWN or SKIP_ON_SHUTDOWN task is executed:
Decrement the Atomic32 by 2 [1] and read it. If the number of tasks blocking shutdown became zero and "shutdown initiated", do [3].

G) When Shutdown() is called:
Acquire the lock. Instantiate the shutdown complete event. Increment the Atomic32 by 1 (to set the "shutdown initiated" bit) and read it. From this moment, if a thread causes the number of tasks blocking shutdown to become zero, it will do [3]. If there are no tasks blocking shutdown, signal the event and return. Wait until the shutdown event is signaled.

[1] Incrementing the Atomic32 by 2 increments the number of tasks blocking shutdown by 1 because the LSB is used to indicate whether shutdown has been initiated.
[2] The TaskTracker must be locked to access the shutdown event and the number of BLOCK_SHUTDOWN tasks posted during shutdown.
[3] These actions are performed by OnNumTasksBlockingShutdownIsZero():
- Acquire the lock.
- DCHECK that shutdown has started (i.e. "shutdown initiated" bit is set and shutdown event exists).
- Signal the shutdown complete event (note: this event can be signaled more than once as multiple threads reach the conclusion that shutdown is complete, ref.
SKIP_ON_SHUTDOWN dance above, but this is harmless) Benchmark - 10 million increment/decrement per thread Atomic Lock 4 threads Win Z840 1.5s 2.2s 4 threads Ubuntu VM 0.8s 4.9s 4 threads MacBookPro 1.8s 270.7s 8 threads Win Z840 3.0s 5.5s 8 threads Ubuntu VM 1.6s 11.8s 8 threads MacBookPro 4.2s 559.0s BUG=553459 ==========
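The bit-packing described above can be sketched as follows. This is a hypothetical stand-in using std::atomic<int32_t> in place of base's subtle::Atomic32 (method and constant names follow the review discussion, not the committed Chromium code, and the default sequentially-consistent ordering is used for simplicity; the barrier question is debated later in this thread):

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Sketch of TaskTracker::State: the LSB is the "shutdown initiated" bit;
// the remaining bits count the tasks blocking shutdown.
class State {
 public:
  static constexpr int32_t kShutdownHasStartedMask = 1;
  static constexpr int32_t kNumTasksBlockingShutdownBitOffset = 1;
  static constexpr int32_t kNumTasksBlockingShutdownIncrement =
      1 << kNumTasksBlockingShutdownBitOffset;

  // Case G: sets the "shutdown initiated" bit (which must not be set
  // already, since this adds 1 rather than OR-ing the bit in). Returns
  // true iff tasks were blocking shutdown at that moment.
  bool StartShutdown() {
    const int32_t new_bits =
        bits_.fetch_add(kShutdownHasStartedMask) + kShutdownHasStartedMask;
    return (new_bits >> kNumTasksBlockingShutdownBitOffset) != 0;
  }

  // Cases A/D: increments the number of tasks blocking shutdown by adding
  // 2 [1]. Returns true iff shutdown had already been initiated.
  bool IncrementNumTasksBlockingShutdown() {
    const int32_t new_bits =
        bits_.fetch_add(kNumTasksBlockingShutdownIncrement) +
        kNumTasksBlockingShutdownIncrement;
    return (new_bits & kShutdownHasStartedMask) != 0;
  }

  // Case F: decrements the count. Returns true iff shutdown has started
  // and this was the last task blocking it, i.e. the caller must do [3].
  bool DecrementNumTasksBlockingShutdown() {
    const int32_t new_bits =
        bits_.fetch_sub(kNumTasksBlockingShutdownIncrement) -
        kNumTasksBlockingShutdownIncrement;
    const bool shutdown_started = (new_bits & kShutdownHasStartedMask) != 0;
    return shutdown_started &&
           (new_bits >> kNumTasksBlockingShutdownBitOffset) == 0;
  }

  bool HasShutdownStarted() const {
    return (bits_.load() & kShutdownHasStartedMask) != 0;
  }

  int32_t NumTasksBlockingShutdown() const {
    return bits_.load() >> kNumTasksBlockingShutdownBitOffset;
  }

 private:
  std::atomic<int32_t> bits_{0};
};
```

One difference from the CL: subtle::Barrier_AtomicIncrement returns the incremented value, whereas std::atomic's fetch_add returns the previous value, so the new value is reconstructed here by re-applying the delta.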
lgtm w/ few comments, thanks! https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas... base/task_scheduler/task_tracker.cc:48: subtle::Barrier_AtomicIncrement(&bits_, kShutdownHasStartedMask); NoBarrier? https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas... base/task_scheduler/task_tracker.cc:79: const auto num_tasks_blocking_shutdown = #if DCHECK_IS_ON() https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas... base/task_scheduler/task_tracker.cc:101: static constexpr subtle::Atomic32 kNumTasksBlockingShutdownIncrement = 2; s/2/1 << kShutdownHasStartedMask/ to make it clearer? And then above perhaps also: "new_bits / kNumTasksBlockingShutdownIncrement" => "new_bits >> kShutdownHasStartedMask" ? I prefer shift over division since this is really what it is (reading all bits but the mask).
lgtm

https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas... File base/task_scheduler/task_tracker.cc (right):

https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas... base/task_scheduler/task_tracker.cc:48: subtle::Barrier_AtomicIncrement(&bits_, kShutdownHasStartedMask);
On 2016/06/20 21:06:51, gab wrote:
> NoBarrier?

Do we have a sense of how much a barrier slows the code down? If it's not much, keeping it will certainly make the code more robust. OTOH, agreed that in the current usage, we can get away without a barrier.

https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas... base/task_scheduler/task_tracker.cc:105: subtle::Atomic32 bits_ = 0;
Comment that the correctness of the lack of barriers depends on there being a single value here. Adding another atomic variable that is read together with |bits_| means the atomic behavior needs to be reassessed.

https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas... File base/task_scheduler/task_tracker_unittest.cc (right):

https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas... base/task_scheduler/task_tracker_unittest.cc:63: enum Action {
Nit: enum class.
The CQ bit was checked by robliao@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: ios-simulator on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-simulator/bui...) ios-simulator-gn on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-simulator-gn/...) mac_chromium_compile_dbg_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_comp...) mac_chromium_rel_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
fdoray@chromium.org changed reviewers: + danakj@chromium.org
danakj@: Can you review this CL? Thanks. https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas... base/task_scheduler/task_tracker.cc:48: subtle::Barrier_AtomicIncrement(&bits_, kShutdownHasStartedMask); On 2016/06/20 21:32:20, robliao wrote: > On 2016/06/20 21:06:51, gab wrote: > > NoBarrier? > > Do we have a sense of how much a barrier slows the code down? If it's not much, > keeping it will certainly make the code more robust. > > OTOH, agreed that in the current usage, we can get away without a barrier. Done. The Barrier is not required here because there is no state other than the atomic variable that needs to be synchronized with other threads. https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas... base/task_scheduler/task_tracker.cc:79: const auto num_tasks_blocking_shutdown = On 2016/06/20 21:06:51, gab wrote: > #if DCHECK_IS_ON() Done. https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas... base/task_scheduler/task_tracker.cc:101: static constexpr subtle::Atomic32 kNumTasksBlockingShutdownIncrement = 2; On 2016/06/20 21:06:51, gab wrote: > s/2/1 << kShutdownHasStartedMask/ to make it clearer? > > And then above perhaps also: > > "new_bits / kNumTasksBlockingShutdownIncrement" => "new_bits >> > kShutdownHasStartedMask" ? > > I prefer shift over division since this is really what it is (reading all bits > but the mask). Done. https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas... base/task_scheduler/task_tracker.cc:105: subtle::Atomic32 bits_ = 0; On 2016/06/20 21:32:20, robliao wrote: > Comment on the structure of the lack of barriers depends on a single value here. > Adding another atomic variable and reading from it with bits_ means the the > atomic behavior needs to be reassessed. Done. 
https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas... File base/task_scheduler/task_tracker_unittest.cc (right): https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas... base/task_scheduler/task_tracker_unittest.cc:63: enum Action { On 2016/06/20 21:32:20, robliao wrote: > Nit: enum class. Done.
nits still lgtm

https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas... File base/task_scheduler/task_tracker.cc (right):

https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas... base/task_scheduler/task_tracker.cc:101: static constexpr subtle::Atomic32 kNumTasksBlockingShutdownIncrement = 2;
On 2016/06/29 18:22:37, fdoray wrote:
> On 2016/06/20 21:06:51, gab wrote:
> > s/2/1 << kShutdownHasStartedMask/ to make it clearer?
> >
> > And then above perhaps also:
> >
> > "new_bits / kNumTasksBlockingShutdownIncrement" => "new_bits >>
> > kShutdownHasStartedMask" ?
> >
> > I prefer shift over division since this is really what it is (reading all bits
> > but the mask).
>
> Done.

Actually, now that I look at the updated code, shifting by a mask looks wrong (sorry!). I.e. if the mask was instead, say, 00001111, we would need to shift by 4, not 15.

I still prefer shifting, but shifting by the mask is wrong. The required shift is really log_2(kShutdownHasStartedMask) + 1, but that sounds overcomplicated to derive an offset in this simple case.

Maybe simply define:

kNumTasksBlockingShutdownBitOffset = 1

and then use it to shift, and you can increment by (1 << kNumTasksBlockingShutdownBitOffset)? Up to you.

https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... File base/task_scheduler/task_tracker.cc (right):

https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:110: // correct because it encapsulates all shutdown state. Atomic behavior would
s/Atomic behavior/Barrier semantics/ ?
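The shift-vs-mask point above can be shown with a tiny demo: shifting by a mask's *value* only coincides with the correct bit offset when the mask is 0b1. The constants here are illustrative, not from the CL:

```cpp
#include <cassert>
#include <cstdint>

// A hypothetical wider mask, to show where shift-by-mask diverges from
// shift-by-offset. With kMask = 0b1 (the CL's case), both give 1, which
// hides the bug.
constexpr int32_t kMask = 0b1111;  // 4 low bits reserved for flags
constexpr int32_t kOffset = 4;     // correct shift = number of mask bits

constexpr int32_t CountFromBitsWrong(int32_t bits) {
  return bits >> kMask;            // shifts by 15, not 4
}

constexpr int32_t CountFromBitsRight(int32_t bits) {
  return bits >> kOffset;          // discards exactly the flag bits
}
```

E.g. with bits = 0x30 (count 3 stored above four flag bits), the correct form yields 3 while shift-by-mask yields 0.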
https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:60: bool ShutdownHasStarted() const { nit: i prefer names that start with an action, in the cases of getters, indicating a query: HasShutdownStarted, AreTasksBlockingShutdown. https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:79: // Verify that there is no overflow. Signed integer overflow is undefined is it not? I think you'll need to instead DCHECK on the iteration before where adding anything else *would* be overflow. But is DCHECK enough? What prevents it from happening, should we reject tasks if it would overflow? https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:81: new_bits >> kShutdownHasStartedMask; You mean / Increment? https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:96: new_bits >> kShutdownHasStartedMask; you mean / Increment? https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:104: 1 << kShutdownHasStartedMask; I don't think this is quite right logically, if mask was 11b you'd want to shift by 2, not by 3. Maybe just write 2 (or mask+1 if you need to do this mathy) with a comment explaining it's the number of bits in the mask? https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:134: // From now, if a thread causes the number of tasks blocking shutdown to What happens if a BLOCK_SHUTDOWN task is posted right here? It sees shutdown has started, takes the lock waits for this block to exit, then increments num_block_shutdown_tasks_posted_during_shutdown_ and returns *true*? 
I think that's different than before right? Then we have a task in queue after Shutdown is done. https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:212: bool TaskTracker::ShutdownCompleted() const { same naming nit: IsShutdownComplete? https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:228: if (shutdown_started) { If shutdown started, it may also have finished by now, how do we handle that? https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:254: return !state_->ShutdownHasStarted(); What if shutdown started between this and the check above?
danakj@: PTAnL https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas... base/task_scheduler/task_tracker.cc:101: static constexpr subtle::Atomic32 kNumTasksBlockingShutdownIncrement = 2; On 2016/06/29 21:38:47, gab wrote: > On 2016/06/29 18:22:37, fdoray wrote: > > On 2016/06/20 21:06:51, gab wrote: > > > s/2/1 << kShutdownHasStartedMask/ to make it clearer? > > > > > > And then above perhaps also: > > > > > > "new_bits / kNumTasksBlockingShutdownIncrement" => "new_bits >> > > > kShutdownHasStartedMask" ? > > > > > > I prefer shift over division since this is really what it is (reading all > bits > > > but the mask). > > > > Done. > > Actually now that I look at the updated code, shifting by a mask looks wrong > (sorry!). I.e. if the mask was instead say 00001111 we would need to shift by 4 > not 15. > > I still prefer shifting but shifting by the mask is wrong, The required shift is > really log_2(kShutdownHasStartedMask)+1 but that sounds overcomplicated to > derive an offset in this simple case. > > Maybe simply define: > > kNumTasksBlockingShutdownBitOffset = 1 > > and then use it to shift and you can increment by (1 << > kNumTasksBlockingShutdownBitOffset)? > > Up to you. Done. https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:60: bool ShutdownHasStarted() const { On 2016/06/30 22:25:36, danakj wrote: > nit: i prefer names that start with an action, in the cases of getters, > indicating a query: HasShutdownStarted, AreTasksBlockingShutdown. Done. https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:79: // Verify that there is no overflow. 
On 2016/06/30 22:25:36, danakj wrote: > Signed integer overflow is undefined is it not? I think you'll need to instead > DCHECK on the iteration before where adding anything else *would* be overflow. > But is DCHECK enough? What prevents it from happening, should we reject tasks if > it would overflow? I moved the DCHECK before the increment. Note that it is racy. This counter will overflow if there are 2^30 *pending* tasks. I assume that if this happens, something is wrong. Getting a crash report (CHECK instead of DCHECK) might be a better solution than silently rejecting tasks. https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:81: new_bits >> kShutdownHasStartedMask; On 2016/06/30 22:25:36, danakj wrote: > You mean / Increment? Gab had a similar comment https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas... It is now new_bits >> kNumTasksBlockingShutdownBitOffset. https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:96: new_bits >> kShutdownHasStartedMask; On 2016/06/30 22:25:36, danakj wrote: > you mean / Increment? ditto https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:104: 1 << kShutdownHasStartedMask; On 2016/06/30 22:25:36, danakj wrote: > I don't think this is quite right logically, if mask was 11b you'd want to shift > by 2, not by 3. Maybe just write 2 (or mask+1 if you need to do this mathy) with > a comment explaining it's the number of bits in the mask? ditto https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:110: // correct because it encapsulates all shutdown state. Atomic behavior would On 2016/06/29 21:38:47, gab wrote: > s/Atomic behavior/Barrier semantics/ ? Done. https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... 
base/task_scheduler/task_tracker.cc:134: // From now, if a thread causes the number of tasks blocking shutdown to On 2016/06/30 22:25:36, danakj wrote: > What happens if a BLOCK_SHUTDOWN task is posted right here? > > It sees shutdown has started, takes the lock waits for this block to exit, then > increments num_block_shutdown_tasks_posted_during_shutdown_ and returns *true*? > > I think that's different than before right? Then we have a task in queue after > Shutdown is done. You mean if a BLOCK_SHUTDOWN task is posted right here when the number of tasks blocking shutdown is zero? Line 238 will DCHECK. Posting a BLOCK_SHUTDOWN task when shutdown is complete isn't supported. I would argue that shutdown is complete when Shutdown() has entered and the number of tasks blocking shutdown is zero. If we disallowed posting BLOCK_SHUTDOWN tasks from CONTINUE_ON_SHUTDOWN tasks and from external threads that aren't joined before the task scheduler is shut down, this couldn't happen. https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:212: bool TaskTracker::ShutdownCompleted() const { On 2016/06/30 22:25:36, danakj wrote: > same naming nit: IsShutdownComplete? Done. https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:228: if (shutdown_started) { On 2016/06/30 22:25:36, danakj wrote: > If shutdown started, it may also have finished by now, how do we handle that? If shutdown is complete, line 238 will DCHECK: DCHECK(!shutdown_event_->IsSignaled()); https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:254: return !state_->ShutdownHasStarted(); On 2016/06/30 22:25:36, danakj wrote: > What if shutdown started between this and the check above? Which check above? The rest of this method is for BLOCK_SHUTDOWN tasks and this line is for non-BLOCK_SHUTDOWN tasks.
https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:134: // From now, if a thread causes the number of tasks blocking shutdown to On 2016/07/04 20:53:11, fdoray wrote: > On 2016/06/30 22:25:36, danakj wrote: > > What happens if a BLOCK_SHUTDOWN task is posted right here? > > > > It sees shutdown has started, takes the lock waits for this block to exit, > then > > increments num_block_shutdown_tasks_posted_during_shutdown_ and returns > *true*? > > > > I think that's different than before right? Then we have a task in queue after > > Shutdown is done. > > You mean if a BLOCK_SHUTDOWN task is posted right here when the number of tasks > blocking shutdown is zero? Line 238 will DCHECK. > > Posting a BLOCK_SHUTDOWN task when shutdown is complete isn't supported. The DCHECK is that the event is not Signaled. It is not Signalled yet here, that happens just below. There is some time between the StartShutdown() call the Signal of the shutdown event, which is where BLOCK_SHUTDOWN can be posted (shutdown isn't done), but it will see that shutdown has started and assumes we will Wait here, which we wouldn't. > I would > argue that shutdown is complete when Shutdown() has entered and the number of > tasks blocking shutdown is zero. But the code is using the Signal to determine it right? > If we disallowed posting BLOCK_SHUTDOWN tasks from CONTINUE_ON_SHUTDOWN tasks > and from external threads that aren't joined before the task scheduler is shut > down, this couldn't happen.
https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:81: new_bits >> kShutdownHasStartedMask; On 2016/07/04 20:53:11, fdoray wrote: > On 2016/06/30 22:25:36, danakj wrote: > > You mean / Increment? > > Gab had a similar comment > https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas... > > It is now new_bits >> kNumTasksBlockingShutdownBitOffset. Are you sure you want to >> a signed integer? Does DCHECK_GE even make sense after you did a right shift? I assumed you were doing division to preserve the sign bit?
PTAnL. Let me know if I didn't understand your concerns correctly. https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:134: // From now, if a thread causes the number of tasks blocking shutdown to On 2016/07/06 18:50:24, danakj wrote: > On 2016/07/04 20:53:11, fdoray wrote: > > On 2016/06/30 22:25:36, danakj wrote: > > > What happens if a BLOCK_SHUTDOWN task is posted right here? > > > > > > It sees shutdown has started, takes the lock waits for this block to exit, > > then > > > increments num_block_shutdown_tasks_posted_during_shutdown_ and returns > > *true*? > > > > > > I think that's different than before right? Then we have a task in queue > after > > > Shutdown is done. > > > > You mean if a BLOCK_SHUTDOWN task is posted right here when the number of > tasks > > blocking shutdown is zero? Line 238 will DCHECK. > > > > Posting a BLOCK_SHUTDOWN task when shutdown is complete isn't supported. > > The DCHECK is that the event is not Signaled. It is not Signalled yet here, that > happens just below. There is some time between the StartShutdown() call the > Signal of the shutdown event, which is where BLOCK_SHUTDOWN can be posted > (shutdown isn't done), but it will see that shutdown has started and assumes we > will Wait here, which we wouldn't. > > > I would > > argue that shutdown is complete when Shutdown() has entered and the number of > > tasks blocking shutdown is zero. > > But the code is using the Signal to determine it right? > > > If we disallowed posting BLOCK_SHUTDOWN tasks from CONTINUE_ON_SHUTDOWN tasks > > and from external threads that aren't joined before the task scheduler is shut > > down, this couldn't happen. > Thread A: state_->StartShutdown(); Thread B: Starts posting a BLOCK_SHUTDOWN task. 
BeforePostTask blocks when it tries to acquire |shutdown_lock_| because it is still held by thread A. Thread A: Signals the shutdown event to indicate that shutdown is complete. Releases the lock. Thread B: Acquires the lock. DCHECK(!shutdown_event_->IsSignaled()) fails because it is not valid to post a task when shutdown is complete (and shutdown is complete as soon as StartShutdown() has been called and num tasks blocking shutdown is zero).
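The interleaving above hinges on the shutdown event being signaled under the same lock that BeforePostTask acquires. A minimal single-process sketch, with std::mutex and a plain bool standing in for base::Lock and base::WaitableEvent (names are assumptions modeled on the discussion, not the actual code, and the real code DCHECKs rather than returning false):

```cpp
#include <cassert>
#include <mutex>

class ShutdownGate {
 public:
  // Thread A, after StartShutdown() with no tasks blocking shutdown:
  // signals completion while still holding |shutdown_lock_|'s critical
  // section, so no BLOCK_SHUTDOWN post can slip in between.
  void CompleteShutdown() {
    std::lock_guard<std::mutex> guard(shutdown_lock_);
    shutdown_event_signaled_ = true;  // "Signal the shutdown event"
  }

  // Thread B, posting a BLOCK_SHUTDOWN task after observing that shutdown
  // has started: blocks on the lock, then either records the post or
  // detects that shutdown already completed.
  bool BeforePostBlockShutdownTask() {
    std::lock_guard<std::mutex> guard(shutdown_lock_);
    if (shutdown_event_signaled_)
      return false;  // posting after shutdown completion is a bug
    ++num_block_shutdown_tasks_posted_during_shutdown_;
    return true;
  }

 private:
  std::mutex shutdown_lock_;
  bool shutdown_event_signaled_ = false;
  int num_block_shutdown_tasks_posted_during_shutdown_ = 0;
};
```

The lock is what makes the Thread A / Thread B trace deterministic: Thread B cannot observe "shutdown started but not yet complete" and then proceed as if Shutdown() will still wait.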
On 2016/07/06 18:51:48, danakj wrote:
> https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta...
> File base/task_scheduler/task_tracker.cc (right):
>
> https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta...
> base/task_scheduler/task_tracker.cc:81: new_bits >> kShutdownHasStartedMask;
> On 2016/07/04 20:53:11, fdoray wrote:
> > On 2016/06/30 22:25:36, danakj wrote:
> > > You mean / Increment?
> >
> > Gab had a similar comment
> > https://codereview.chromium.org/2019763002/diff/80001/base/task_scheduler/tas...
> >
> > It is now new_bits >> kNumTasksBlockingShutdownBitOffset.
>
> Are you sure you want to >> a signed integer? Does DCHECK_GE even make sense
> after you did a right shift? I assumed you were doing division to preserve the
> sign bit?

Standard says "The value of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a non-negative value, the value of the result is the integral part of the quotient of E1/2^E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined." http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf

As long as |bits_| isn't negative, bit shift is correct. Let me know if you prefer a division.
On 2016/07/06 21:11:47, fdoray wrote: > As long as |bits_| isn't negative, bit shift is correct. Let me know if you > prefer a division. Ok, then DCHECKing >= 0 after shifting seems pointless though. https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2019763002/diff/100001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:134: // From now, if a thread causes the number of tasks blocking shutdown to On 2016/07/06 19:19:40, fdoray wrote: > On 2016/07/06 18:50:24, danakj wrote: > > On 2016/07/04 20:53:11, fdoray wrote: > > > On 2016/06/30 22:25:36, danakj wrote: > > > > What happens if a BLOCK_SHUTDOWN task is posted right here? > > > > > > > > It sees shutdown has started, takes the lock waits for this block to exit, > > > then > > > > increments num_block_shutdown_tasks_posted_during_shutdown_ and returns > > > *true*? > > > > > > > > I think that's different than before right? Then we have a task in queue > > after > > > > Shutdown is done. > > > > > > You mean if a BLOCK_SHUTDOWN task is posted right here when the number of > > tasks > > > blocking shutdown is zero? Line 238 will DCHECK. > > > > > > Posting a BLOCK_SHUTDOWN task when shutdown is complete isn't supported. > > > > The DCHECK is that the event is not Signaled. It is not Signalled yet here, > that > > happens just below. There is some time between the StartShutdown() call the > > Signal of the shutdown event, which is where BLOCK_SHUTDOWN can be posted > > (shutdown isn't done), but it will see that shutdown has started and assumes > we > > will Wait here, which we wouldn't. > > > > > I would > > > argue that shutdown is complete when Shutdown() has entered and the number > of > > > tasks blocking shutdown is zero. > > > > But the code is using the Signal to determine it right? 
> > > > > If we disallowed posting BLOCK_SHUTDOWN tasks from CONTINUE_ON_SHUTDOWN > tasks > > > and from external threads that aren't joined before the task scheduler is > shut > > > down, this couldn't happen. > > > > Thread A: state_->StartShutdown(); > Thread B: Starts posting a BLOCK_SHUTDOWN task. BeforePostTask blocks when it > tries to acquire |shutdown_lock_| because it is still held by thread A. > Thread A: Signals the shutdown event to indicate that shutdown is complete. > Releases the lock. > Thread B: Acquires the lock. DCHECK(!shutdown_event_->IsSignaled()) fails > because it is not valid to post a task when shutdown is complete (and shutdown > is complete as soon as StartShutdown() has been called and num tasks blocking > shutdown is zero). Ah I see. The lock makes this work. Ok thanks for the explanation. Please consider adding some comments in places to explain this, like on the lock grabs and stuff.
https://codereview.chromium.org/2019763002/diff/120001/base/task_scheduler/ta... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2019763002/diff/120001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:58: new_value / kNumTasksBlockingShutdownIncrement; Why are some / Increment, and some are >> Offset?
danakj@: PTAnL https://codereview.chromium.org/2019763002/diff/120001/base/task_scheduler/ta... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2019763002/diff/120001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:58: new_value / kNumTasksBlockingShutdownIncrement; On 2016/07/07 23:05:21, danakj wrote: > Why are some / Increment, and some are >> Offset? Everything should have been >>. Fixed in latest patch set.
https://codereview.chromium.org/2019763002/diff/140001/base/task_scheduler/ta... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2019763002/diff/140001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:82: DCHECK_LT(num_tasks_blocking_shutdown, So I'm still a bit unsure why this is a DCHECK instead of handling it. What makes you confident this won't actually happen? https://codereview.chromium.org/2019763002/diff/140001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:114: // correct because it encapsulates all shutdown state. Barrier semantics would I spent some time learning about atomics yesterday, and I don't understand this comment about not needing barriers. Can you explain this further? AFAIK using atomic increment etc ensures that the operation happens atomically, so that no one would ever see the value half-changed. But it does not guarantee that other threads will see the new value at that time, they may never see it. But the logic in this patch depends on changes to the bits_ value being visible on other threads immediately.
@dana: I'm chiming in for Francois while he's on vacation (and I'm also OOO starting tomorrow but at least this is one more round :-)) https://codereview.chromium.org/2019763002/diff/140001/base/task_scheduler/ta... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2019763002/diff/140001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:114: // correct because it encapsulates all shutdown state. Barrier semantics would On 2016/07/08 21:21:29, danakj wrote: > I spent some time learning about atomics yesterday, and I don't understand this > comment about not needing barriers. Can you explain this further? > > AFAIK using atomic increment etc ensures that the operation happens atomically, > so that no one would ever see the value half-changed. But it does not guarantee > that other threads will see the new value at that time, they may never see it. > But the logic in this patch depends on changes to the bits_ value being visible > on other threads immediately. Quoting Francois from way further up in this review: """ I believe a barrier "specifies how regular, non-atomic memory accesses are to be ordered around an atomic operation" http://en.cppreference.com/w/cpp/atomic/memory_order The atomic increment cannot be reordered after the atomic decrement. """ So what we understand is that atomic operations themselves are always atomic (and see each other in memory). Adding a barrier merely makes sure that everything else around it has also been synchronized. In this case since we merely care about that specific value and nothing else around it: NoBarrier is appropriate. For example: atomic_ref_count.h uses NoBarrier on increment but Barrier on decrement. It does so not to make sure multiple threads don't hit 0 but to make sure that the thread that hits zero inserts a barrier so that it sees the most recent version of surrounding non-atomic state (e.g. to ensure a clean deletion or whatever else the zero'ed ref triggers). 
In our case we don't need barriers either way as no non-atomic state depends on being synced based on calls to TaskTracker::State.
https://codereview.chromium.org/2114993003/ is super relevant, another case of people considering atomics, and another reason I've been trying to understand how they work at all, yay.
https://codereview.chromium.org/2019763002/diff/140001/base/task_scheduler/ta... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2019763002/diff/140001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:114: // correct because it encapsulates all shutdown state. Barrier semantics would On 2016/07/13 18:49:10, gab wrote: > On 2016/07/08 21:21:29, danakj wrote: > > I spent some time learning about atomics yesterday, and I don't understand > this > > comment about not needing barriers. Can you explain this further? > > > > AFAIK using atomic increment etc ensures that the operation happens > atomically, > > so that no one would ever see the value half-changed. But it does not > guarantee > > that other threads will see the new value at that time, they may never see it. > > But the logic in this patch depends on changes to the bits_ value being > visible > > on other threads immediately. > > Quoting Francois from way further up in this review: > """ > I believe a barrier "specifies how regular, non-atomic memory accesses are to be > ordered around an atomic operation" > http://en.cppreference.com/w/cpp/atomic/memory_order The atomic increment cannot > be reordered after the atomic decrement. > """ > > So what we understand is that atomic operations themselves are always atomic > (and see each other in memory). Adding a barrier merely makes sure that > everything else around it has also been synchronized. In this case since we > merely care about that specific value and nothing else around it: NoBarrier is > appropriate. That's not my understanding at all. The atomic operation means that you'll never see the operation half-complete. It does not promise that other threads will ever see the new value vs the old value. 
Here's what I read the other day on this: http://preshing.com/20130618/atomic-vs-non-atomic-operations/

Our base atomics almost always use std::memory_order_relaxed, which means that "Relaxed operation: there are no synchronization or ordering constraints, only atomicity is required of this operation." It only guarantees the memory modified/read by the atomic operation is seen entirely before or entirely after the operation, but it provides no guarantees about what value is seen otherwise.

Without the barrier, thread A may change the value, and thread B may never read the new value (but the atomics will guarantee it doesn't read a half-changed value).

> For example: atomic_ref_count.h uses NoBarrier on increment but Barrier on
> decrement. It does so not to make sure multiple threads don't hit 0 but to make
> sure that the thread that hits zero inserts a barrier so that it sees the most
> recent version of surrounding non-atomic state (e.g. to ensure a clean deletion
> or whatever else the zero'ed ref triggers). In our case we don't need barriers
> either way as no non-atomic state depends on being synced based on calls to
> TaskTracker::State.

I think the problem is where we have situations like:

Thread A: Posts a task, TaskTracker::BeforePostTask does IncrementNumTasksBlockingShutdown().
Thread B: Runs a task, DCHECKs AreTasksBlockingShutdown(), what guarantees it sees it?

Maybe in this case there's a barrier somewhere due to the fact the task has made it to the other thread. But are we guaranteed that the atomic op made it too, since we used relaxed? Or:

Thread A: TaskTracker::BeforePostTask() does IncrementNumTasksBlockingShutdown(), sees false for shutdown started.
Thread B: TaskTracker::Shutdown() sets the shutdown flag.
Thread A: checks HasShutdownStarted(), this check may not see the shutdown state changed.

Is that okay? Maybe that should just be returning the already queried bool then anyways? Shutdown could start just after it checks the flag too.
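The ordering question debated above is the classic publish/consume pattern. A sketch, using std::atomic for self-containment (this mirrors base's NoBarrier vs Acquire/Release operations but is not the TaskTracker code): a relaxed store makes the *flag* update atomic yet does not order the non-atomic payload write with it, whereas release/acquire does.

```cpp
#include <atomic>
#include <cassert>

struct Published {
  int payload = 0;            // non-atomic state guarded by |ready|
  std::atomic<bool> ready{false};
};

void Publish(Published* p, int value) {
  p->payload = value;
  // memory_order_release: a thread that later observes ready == true via an
  // acquire load is guaranteed to also see payload == value. With
  // memory_order_relaxed there would be no such guarantee.
  p->ready.store(true, std::memory_order_release);
}

bool TryConsume(const Published* p, int* out) {
  if (!p->ready.load(std::memory_order_acquire))
    return false;
  *out = p->payload;  // ordered after the release store that set |ready|
  return true;
}
```

The fdoray/gab position in this thread is that State needs no such pairing because no non-atomic state is published alongside |bits_|; the release/acquire pair only becomes necessary once an atomic flag guards other memory, as in this sketch.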
PTAnL https://codereview.chromium.org/2019763002/diff/140001/base/task_scheduler/ta... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2019763002/diff/140001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:82: DCHECK_LT(num_tasks_blocking_shutdown, On 2016/07/08 21:21:29, danakj wrote: > So I'm still a bit unsure why this is a DCHECK instead of handling it. What > makes you confident this won't actually happen? Having 2^30 pending tasks requires more than 2^36 bytes of RAM (sizeof(Task) == 120). That means that a 32-bit Chrome will get an OOM error before we reach 2^30 pending tasks. We don't handle that gracefully. Similarly, I don't think we should handle incrementing the number of tasks blocking shutdown above 2^30 (maybe we should even remove the DCHECK?). Note that |num_tasks_blocking_shutdown| is the number of tasks that are *currently* blocking shutdown, not the number of tasks that have blocked shutdown over the lifetime of the process. https://codereview.chromium.org/2019763002/diff/140001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:114: // correct because it encapsulates all shutdown state. Barrier semantics would According to your link, atomic operations without barriers offer no ordering guarantees. Knowing that, I agree that we should use barriers every time we read/write |bits_|. On 2016/07/13 22:39:17, danakj wrote: > On 2016/07/13 18:49:10, gab wrote: > > On 2016/07/08 21:21:29, danakj wrote: > > > I spent some time learning about atomics yesterday, and I don't understand > > this > > > comment about not needing barriers. Can you explain this further? > > > > > > AFAIK using atomic increment etc ensures that the operation happens > > atomically, > > > so that no one would ever see the value half-changed. But it does not > > guarantee > > > that other threads will see the new value at that time, they may never see > it. 
> > > But the logic in this patch depends on changes to the bits_ value being > > visible > > > on other threads immediately. > > > > Quoting Francois from way further up in this review: > > """ > > I believe a barrier "specifies how regular, non-atomic memory accesses are to > be > > ordered around an atomic operation" > > http://en.cppreference.com/w/cpp/atomic/memory_order The atomic increment > cannot > > be reordered after the atomic decrement. > > """ > > > > So what we understand is that atomic operations themselves are always atomic > > (and see each other in memory). Adding a barrier merely makes sure that > > everything else around it has also been synchronized. In this case since we > > merely care about that specific value and nothing else around it: NoBarrier is > > appropriate. > > That's not my understanding at all. The atomic operation means that you'll never > see the operation half-complete. It does not promise that other threads will > ever see the new value vs the old value. > > Here's what I read the other day on this: > http://preshing.com/20130618/atomic-vs-non-atomic-operations/ > > Our base atomics almost always use std::memory_order_relaxed, which means that > "Relaxed operation: there are no synchronization or ordering constraints, only > atomicity is required of this operation." it only guarantees the memory > modified/read by the atomic operation is seen entirely before or entirely after > the operation. But provides no guarantees about what value is seen otherwise. > > Without the barrier thread A may change the value, and thread B may never read > the new value (but the atomics will guarantee it doesn't read a half-changed > value). > > > For example: atomic_ref_count.h uses NoBarrier on increment but Barrier on > > decrement. It does so not to make sure multiple threads don't hit 0 but to > make > > sure that the thread that hits zero inserts a barrier so that it sees the most > > recent version of surrounding non-atomic state (e.g. 
to ensure a clean > deletion > > or whatever else the zero'ed ref triggers). In our case we don't need barriers > > either way as no non-atomic state depends on being synced based on calls to > > TaskTracker::State. > > I think the problem is where we have situtations like: > > Thread A: Posts a task, TaskTracker::BeforePostTask does > IncrementNumTasksBlockingShutdown(). > Thread B: Runs a task, DCHECKs AreTasksBlockingShutdown(), what guantees it sees > it? Maybe in this case there's a barrier somewhere due to the fact the task has > made it to the other thread. But are we guaranteed that the atomic op made it > too, since we used relaxed? This should no longer be an issue with Barrier_AtomicIncrement and Acquire_Load (since IncrementNumTasksBlockingShutdown runs before AreTasksBlockingShutdown and a memory barrier is used, the DCHECK should never fail). > > Or > > Thread A: TaskTracker::BeforePostTask() does > IncrementNumTasksBlockingShutdown(), sees false for shutdown started. > Thread B: TaskTracker::Shutdown() sets the shutdown flag. > Thread A: checks HasShutdownStarted(), this check may not see the shutdown state > changed. Is that okay? Maybe that should just be returning the already queried > bool then anyways? Shutdown could start just after it checks the flag too. We never call both IncrementNumTasksBlockingShutdown() and HasShutdownStarted() in BeforePostTask() (there is a code path for BLOCK_SHUTDOWN tasks and a code path for other tasks).
Thanks, LGTM
The CQ bit was checked by fdoray@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from robliao@chromium.org, gab@chromium.org Link to the patchset: https://codereview.chromium.org/2019763002/#ps160001 (title: "CR danakj #51 (add memory barriers)")
The CQ bit was unchecked by fdoray@chromium.org
The CQ bit was checked by fdoray@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: mac_chromium_rel_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
The CQ bit was checked by fdoray@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Description was changed. The description now reads:

==========
TaskScheduler: Atomic operations in TaskTracker

With this CL, TaskTracker never acquires a lock except when shutdown is in progress.

TaskTracker::State wraps an Atomic32 whose LSB indicates whether shutdown has been initiated and whose other bits count the number of tasks blocking shutdown. This State is accessed in the following situations:

A) Before a BLOCK_SHUTDOWN task is posted:
Increment the Atomic32 by 2 [1] and read it. If the "shutdown initiated" bit is set, acquire the lock [2] to:
- DCHECK that shutdown hasn't completed
- Record a UMA histogram

B) Before a CONTINUE_ON_SHUTDOWN or SKIP_ON_SHUTDOWN task is posted:
Read the Atomic32. The task is allowed to be posted iff the "shutdown initiated" bit isn't set.

C) Before a BLOCK_SHUTDOWN task is executed:
Read the Atomic32. DCHECK that there are tasks blocking shutdown (the number of tasks blocking shutdown was incremented when the task was posted).

D) Before a SKIP_ON_SHUTDOWN task is executed:
Increment the Atomic32 by 2 [1] and read it.
If the "shutdown initiated" bit isn't set: the task is allowed to be executed.
If the "shutdown initiated" bit is set: decrement the Atomic32 by 2 [1] (to revert the incorrect increment) and read it. If the number of tasks blocking shutdown became zero, do [3] (note: this can trigger [3] more than once during shutdown as multiple threads realize that shutdown has been requested, but [3] is resilient to this).

E) Before a CONTINUE_ON_SHUTDOWN task is executed:
Read the Atomic32. Allow the task to be executed iff the "shutdown initiated" bit isn't set.

F) After a BLOCK_SHUTDOWN or SKIP_ON_SHUTDOWN task is executed:
Decrement the Atomic32 by 2 [1] and read it. If the number of tasks blocking shutdown became zero and "shutdown initiated", do [3].

G) When Shutdown() is called:
Acquire the lock. Instantiate the shutdown complete event. Increment the Atomic32 by 1 (to set the "shutdown initiated" bit) and read it. From this moment, if a thread causes the number of tasks blocking shutdown to become zero, it will do [3]. If there are no tasks blocking shutdown, signal the event and return. Wait until the shutdown event is signaled.

[1] Incrementing the Atomic32 by 2 increments the number of tasks blocking shutdown by 1 because the LSB is used to indicate whether shutdown has been initiated.
[2] The TaskTracker must be locked to access the shutdown event and the number of BLOCK_SHUTDOWN tasks posted during shutdown.
[3] These actions are performed by OnNumTasksBlockingShutdownIsZero():
- Acquire the lock.
- DCHECK that shutdown has started (i.e. the "shutdown initiated" bit is set and the shutdown event exists).
- Signal the shutdown complete event (note: this event can be signaled more than once when multiple threads reach the conclusion that shutdown is complete, ref. SKIP_ON_SHUTDOWN dance above, but this is harmless).

Benchmark - 10 million increment/decrement per thread
                       Atomic  Lock
4 threads Win Z840     1.5s    2.2s
4 threads Ubuntu VM    0.8s    4.9s
4 threads MacBookPro   1.8s    270.7s
8 threads Win Z840     3.0s    5.5s
8 threads Ubuntu VM    1.6s    11.8s
8 threads MacBookPro   4.2s    559.0s

BUG=553459
==========
Message was sent while issue was closed.
Committed patchset #9 (id:160001)
Message was sent while issue was closed.
CQ bit was unchecked.
Message was sent while issue was closed.
Description was changed to append the commit information to the existing description (the description text itself is unchanged): Committed: https://crrev.com/caa8d6675e165cd04fc22f24b36c1076d6720388 Cr-Commit-Position: refs/heads/master@{#406260}
Message was sent while issue was closed.
Patchset 9 (id:??) landed as https://crrev.com/caa8d6675e165cd04fc22f24b36c1076d6720388 Cr-Commit-Position: refs/heads/master@{#406260}
Message was sent while issue was closed.
lgtm, still wrapping my head around the resolution below, thanks! https://codereview.chromium.org/2019763002/diff/140001/base/task_scheduler/ta... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2019763002/diff/140001/base/task_scheduler/ta... base/task_scheduler/task_tracker.cc:114: // correct because it encapsulates all shutdown state. Barrier semantics would Now that we've changed to use barriers, it would be worth it to redo the benchmarks in the CL description to get another column for barriers versus no barriers. Great reads I found to educate myself further: http://preshing.com/20130618/atomic-vs-non-atomic-operations/ http://preshing.com/20120710/memory-barriers-are-like-source-control-operations/ http://preshing.com/20120913/acquire-and-release-semantics/ http://preshing.com/20120612/an-introduction-to-lock-free-programming/ http://preshing.com/20130922/acquire-and-release-fences/ Given all of this I think the barriers are correct for the consistency of this API alone (although as was mentioned in practice there are other things ensuring consistencies of the actual calls but this API can't know that). So long as the benchmarks w/ barriers show that it doesn't matter much I'm happy with this -- otherwise we may want to go a notch further.
Message was sent while issue was closed.
On 2016/07/19 16:22:21, gab wrote:
> lgtm, still wrapping my head around the resolution below, thanks!
>
> https://codereview.chromium.org/2019763002/diff/140001/base/task_scheduler/ta...
> File base/task_scheduler/task_tracker.cc (right):
>
> https://codereview.chromium.org/2019763002/diff/140001/base/task_scheduler/ta...
> base/task_scheduler/task_tracker.cc:114: // correct because it encapsulates all
> shutdown state. Barrier semantics would
> Now that we've changed to use barriers, it would be worth it to redo the
> benchmarks in the CL description to get another column for barriers versus no
> barriers.
>
> Great reads I found to educate myself further:
>
> http://preshing.com/20130618/atomic-vs-non-atomic-operations/
> http://preshing.com/20120710/memory-barriers-are-like-source-control-operations/
> http://preshing.com/20120913/acquire-and-release-semantics/
> http://preshing.com/20120612/an-introduction-to-lock-free-programming/
> http://preshing.com/20130922/acquire-and-release-fences/
>
> Given all of this I think the barriers are correct for the consistency of this
> API alone (although as was mentioned in practice there are other things ensuring
> consistencies of the actual calls but this API can't know that).
>
> So long as the benchmarks w/ barriers show that it doesn't matter much I'm happy
> with this -- otherwise we may want to go a notch further.

Updated benchmarks.
tl;dr: Memory barriers don't affect performance.

Benchmark - 10 million increment/decrement per thread
10 runs per test case - average (stdev)
                      Atomic no barrier  Atomic barrier  Lock
8 threads Win Z840    3.04s (0.06)       3.01s (0.05)    5.53s (0.72)
8 threads MacBookPro  3.75s (0.16)       3.76s (0.13)    557.29s (6.12)
Message was sent while issue was closed.
On Tue, Jul 19, 2016 at 1:18 PM, <fdoray@chromium.org> wrote: > On 2016/07/19 16:22:21, gab wrote: > > lgtm, still wrapping my head around the resolution below, thanks! > > > > > > https://codereview.chromium.org/2019763002/diff/140001/base/task_scheduler/ta... > > File base/task_scheduler/task_tracker.cc (right): > > > > > > https://codereview.chromium.org/2019763002/diff/140001/base/task_scheduler/ta... > > base/task_scheduler/task_tracker.cc:114: // correct because it > encapsulates > all > > shutdown state. Barrier semantics would > > Now that we've changed to use barriers, it would be worth it to redo the > > benchmarks in the CL description to get another column for barriers > versus no > > barriers. > > > > Great reads I found to educate myself further: > > > > http://preshing.com/20130618/atomic-vs-non-atomic-operations/ > > > > http://preshing.com/20120710/memory-barriers-are-like-source-control-operations/ > > http://preshing.com/20120913/acquire-and-release-semantics/ > > http://preshing.com/20120612/an-introduction-to-lock-free-programming/ > > http://preshing.com/20130922/acquire-and-release-fences/ > > > > Given all of this I think the barriers are correct for the consistency > of this > > API alone (although as was mentioned in practice there are other things > ensuring > > consistencies of the actual calls but this API can't know that). > > > > So long as the benchmarks w/ barriers show that it doesn't matter much > I'm > happy > > with this -- otherwise we may want to go a notch further. > > Updated benchmarks. > tl;dr: Memory barriers don't affect performance. > > Benchmark - 10 million increment/decrement per thread > 10 runs per test case - average (stdev) > Atomic no barrier Atomic barrier Lock > 8 threads Win Z840 3.04s (0.06) 3.01s (0.05) 5.53s (0.72) > 8 threads MacBookPro 3.75s (0.16) 3.76s (0.13) 557.29s (6.12) > Barriers have 0 impact on x64 fwiw. Might be different on arm (this was mentioned on the CL I linked to earlier). 
Message was sent while issue was closed.
On 2016/07/19 20:20:19, danakj wrote: > On Tue, Jul 19, 2016 at 1:18 PM, <mailto:fdoray@chromium.org> wrote: > > > On 2016/07/19 16:22:21, gab wrote: > > > lgtm, still wrapping my head around the resolution below, thanks! > > > > > > > > > > > https://codereview.chromium.org/2019763002/diff/140001/base/task_scheduler/ta... > > > File base/task_scheduler/task_tracker.cc (right): > > > > > > > > > > > https://codereview.chromium.org/2019763002/diff/140001/base/task_scheduler/ta... > > > base/task_scheduler/task_tracker.cc:114: // correct because it > > encapsulates > > all > > > shutdown state. Barrier semantics would > > > Now that we've changed to use barriers, it would be worth it to redo the > > > benchmarks in the CL description to get another column for barriers > > versus no > > > barriers. > > > > > > Great reads I found to educate myself further: > > > > > > http://preshing.com/20130618/atomic-vs-non-atomic-operations/ > > > > > > > > http://preshing.com/20120710/memory-barriers-are-like-source-control-operations/ > > > http://preshing.com/20120913/acquire-and-release-semantics/ > > > http://preshing.com/20120612/an-introduction-to-lock-free-programming/ > > > http://preshing.com/20130922/acquire-and-release-fences/ > > > > > > Given all of this I think the barriers are correct for the consistency > > of this > > > API alone (although as was mentioned in practice there are other things > > ensuring > > > consistencies of the actual calls but this API can't know that). > > > > > > So long as the benchmarks w/ barriers show that it doesn't matter much > > I'm > > happy > > > with this -- otherwise we may want to go a notch further. > > > > Updated benchmarks. > > tl;dr: Memory barriers don't affect performance. 
> >
> > Benchmark - 10 million increment/decrement per thread
> > 10 runs per test case - average (stdev)
> >                       Atomic no barrier  Atomic barrier  Lock
> > 8 threads Win Z840    3.04s (0.06)       3.01s (0.05)    5.53s (0.72)
> > 8 threads MacBookPro  3.75s (0.16)       3.76s (0.13)    557.29s (6.12)
>
> Barriers have 0 impact on x64 fwiw. Might be different on arm (this was
> mentioned on the CL I linked to earlier).

Performance will also vary with the underlying platform implementation. Might also want to check Linux. There's currently an open issue where these are implemented poorly on Linux, although this impacts the NoBarrier version more. https://bugs.chromium.org/p/chromium/issues/detail?id=592903