|
|
DescriptionWILL BE MERGED Change Ownership of Sequence to the Single Thread SchedulerWorker Delegate
This change will be merged into https://codereview.chromium.org/2698843006/
During shutdown for tests, it's possible for tasks to linger around and
not be destroyed because of a reference cycle. The
SingleThreadTaskRunner references a sequence that references a task
that references the original SingleThreadTaskRunner.
The fix here changes ownership of the sequence to the SchedulerWorker
delegate and adds a OnMainExit to clear this sequence upon thread
termination.
BUG=684080
Patch Set 1 #
Total comments: 13
Patch Set 2 : CR Feedback #
Total comments: 10
Patch Set 3 : CR Feedback #
Total comments: 12
Messages
Total messages: 23 (13 generated)
The CQ bit was checked by robliao@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
robliao@chromium.org changed reviewers: + fdoray@chromium.org, gab@chromium.org
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_android_rel_ng on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi...)
The CQ bit was checked by robliao@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
https://codereview.chromium.org/2726073002/diff/1/base/task_scheduler/schedul... File base/task_scheduler/scheduler_single_thread_task_runner_manager.cc (right): https://codereview.chromium.org/2726073002/diff/1/base/task_scheduler/schedul... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:94: // the tasks to destroy themselves. DCHECK(sequence->HasOneRef()); https://codereview.chromium.org/2726073002/diff/1/base/task_scheduler/schedul... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:167: Sequence* sequence = GetDelegate()->sequence(); When shutdown completes, the thread exits and the sequence is destroyed. If a task is posted after that, this will crash. I don't think this is right. Instead of dropping the reference to the Sequence in OnMainExit(), you could Terminate() the Sequence (destroy all its tasks and prevent new tasks from being added). https://codereview.chromium.org/2726073002/diff/1/base/task_scheduler/schedul... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:208: << "SchedulerSingleThreadTaskRunners must outlive " I think this would be clearer from a user point of view: "All references to SingleThreadTaskRunners should be dropped before deleting TaskScheduler." https://codereview.chromium.org/2726073002/diff/1/base/task_scheduler/schedul... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:264: subtle::NoBarrier_AtomicIncrement(&workers_unregistered_during_join_, 1); Use base::AtomicRefCountInc and base::AtomicRefCountIsZero. https://codereview.chromium.org/2726073002/diff/1/base/task_scheduler/schedul... File base/task_scheduler/scheduler_single_thread_task_runner_manager.h (right): https://codereview.chromium.org/2726073002/diff/1/base/task_scheduler/schedul... base/task_scheduler/scheduler_single_thread_task_runner_manager.h:62: #if DCHECK_IS_ON()
Patchset #2 (id:20001) has been deleted
Description was changed from ========== Change Ownership of Sequence to the Single Thread SchedulerWorker Delegate During shutdown for tests, it's possible for tasks to linger around and not be destroyed because of a reference cycle. The SingleThreadTaskRunner references a sequence that references a task that references the original SingleThreadTaskRunner. The fix here changes ownership of the sequence to the SchedulerWorker delegate and adds a OnMainExit to clear this sequence upon thread termination. BUG=684080 ========== to ========== WILL BE MERGED Change Ownership of Sequence to the Single Thread SchedulerWorker Delegate This change will be merged into https://codereview.chromium.org/2698843006/ During shutdown for tests, it's possible for tasks to linger around and not be destroyed because of a reference cycle. The SingleThreadTaskRunner references a sequence that references a task that references the original SingleThreadTaskRunner. The fix here changes ownership of the sequence to the SchedulerWorker delegate and adds a OnMainExit to clear this sequence upon thread termination. BUG=684080 ==========
Thanks for the comments! https://codereview.chromium.org/2726073002/diff/1/base/task_scheduler/schedul... File base/task_scheduler/scheduler_single_thread_task_runner_manager.cc (right): https://codereview.chromium.org/2726073002/diff/1/base/task_scheduler/schedul... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:94: // the tasks to destroy themselves. On 2017/03/02 14:37:48, fdoray wrote: > DCHECK(sequence->HasOneRef()); In general for refcounted things, it's a dangerous practice to assert OneRef remaining at a certain point. If one ref is remaining and we fail to cleanup, the ConcurrentJoin tests will fail. If someone in the future legitimately needs a ref to survive past the delegate and they do cleanup correctly, they'll DCHECK here. https://codereview.chromium.org/2726073002/diff/1/base/task_scheduler/schedul... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:167: Sequence* sequence = GetDelegate()->sequence(); On 2017/03/02 14:37:48, fdoray wrote: > When shutdown completes, the thread exits and the sequence is destroyed. If a > task is posted after that, this will crash. I don't think this is right. Instead > of dropping the reference to the Sequence in OnMainExit(), you could Terminate() > the Sequence (destroy all its tasks and prevent new tasks from being added). Handled by making this a scoped_refptr. At this point, the caller of the PostTask will clean up the task (which if there are thread checkers, is probably a violation of the Shutdown + PostTask contract by the caller). https://codereview.chromium.org/2726073002/diff/1/base/task_scheduler/schedul... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:208: << "SchedulerSingleThreadTaskRunners must outlive " On 2017/03/02 14:37:48, fdoray wrote: > I think this would be clearer from a user point of view: > "All references to SingleThreadTaskRunners should be dropped before deleting > TaskScheduler." While that's true, it doesn't make sense in the context of the unit tests strangely enough since we don't have a TaskScheduler. I've added the note in to cover both cases. https://codereview.chromium.org/2726073002/diff/1/base/task_scheduler/schedul... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:264: subtle::NoBarrier_AtomicIncrement(&workers_unregistered_during_join_, 1); On 2017/03/02 14:37:48, fdoray wrote: > Use base::AtomicRefCountInc and base::AtomicRefCountIsZero. Those functions are intended for reference counting. The author(s) don't seem to want people to directly call these functions: https://cs.chromium.org/chromium/src/base/atomic_ref_count.h?rcl=27c78942d0fb... https://codereview.chromium.org/2726073002/diff/1/base/task_scheduler/schedul... File base/task_scheduler/scheduler_single_thread_task_runner_manager.h (right): https://codereview.chromium.org/2726073002/diff/1/base/task_scheduler/schedul... base/task_scheduler/scheduler_single_thread_task_runner_manager.h:62: On 2017/03/02 14:37:48, fdoray wrote: > #if DCHECK_IS_ON() Done.
https://codereview.chromium.org/2726073002/diff/1/base/task_scheduler/schedul... File base/task_scheduler/scheduler_single_thread_task_runner_manager.cc (right): https://codereview.chromium.org/2726073002/diff/1/base/task_scheduler/schedul... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:264: subtle::NoBarrier_AtomicIncrement(&workers_unregistered_during_join_, 1); On 2017/03/03 04:24:17, robliao (PST plus 17h) wrote: > On 2017/03/02 14:37:48, fdoray wrote: > > Use base::AtomicRefCountInc and base::AtomicRefCountIsZero. > > Those functions are intended for reference counting. > The author(s) don't seem to want people to directly call these functions: > https://cs.chromium.org/chromium/src/base/atomic_ref_count.h?rcl=27c78942d0fb... They want people to use base/memory/ref_counted.h instead of base/atomic_ref_count.h, but they don't say anything about using base/atomic_ref_counter.h instead of base/atomicops.h. You wouldn't be the first user of this header. I'll let you decide what's best. https://codereview.chromium.org/2726073002/diff/40001/base/task_scheduler/sch... File base/task_scheduler/scheduler_single_thread_task_runner_manager.cc (right): https://codereview.chromium.org/2726073002/diff/40001/base/task_scheduler/sch... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:99: // the tasks to destroy themselves. Need to grab the lock to clear the pointer. To avoid deleting tasks in the scope of the lock, you could do: scoped_refptr<Sequence> sequence; { AutoSchedulerLock auto_lock(sequence_lock_); sequence = std::move(sequence_); } // Reference released at the end of the scope. https://codereview.chromium.org/2726073002/diff/40001/base/task_scheduler/sch... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:109: return sequence_.get(); no .get(); https://codereview.chromium.org/2726073002/diff/40001/base/task_scheduler/sch... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:180: // shutdonw or joined). We will destroy the task in this context instead. s/shutdonw/shutdown/ https://codereview.chromium.org/2726073002/diff/40001/base/task_scheduler/sch... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:180: // shutdonw or joined). We will destroy the task in this context instead. I don't find "We will destroy the task in this context instead." useful. The task is always dropped when it can't be posted, no matter the reason. https://codereview.chromium.org/2726073002/diff/40001/base/task_scheduler/sch... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:188: } The sequence and tasks it contained before the PushTask() may be destroyed here. To avoid that, you could: 1. Never reset |SchedulerWorkerDelegate::sequence_|; 2. Add Sequence::Terminate() (see below). 3. Call Sequence::Terminate() from OnMainExit(). 4. Do not take a strong ref in PostTaskNow(). That would guarantee that a task is destroyed: - On the thread, or - In its *own* post task call site. void Sequence::Terminate() { { AutoSchedulerLock auto_lock(sequence_lock_); // All methods become no-op and queue is never accessed // when |terminated_| is true. terminated_ = true; } queue_.clear(); }
Patchset #3 (id:60001) has been deleted
Patchset #3 (id:80001) has been deleted
https://codereview.chromium.org/2726073002/diff/40001/base/task_scheduler/sch... File base/task_scheduler/scheduler_single_thread_task_runner_manager.cc (right): https://codereview.chromium.org/2726073002/diff/40001/base/task_scheduler/sch... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:99: // the tasks to destroy themselves. On 2017/03/03 05:31:13, fdoray wrote: > Need to grab the lock to clear the pointer. To avoid deleting tasks in the scope > of the lock, you could do: > > scoped_refptr<Sequence> sequence; > { > AutoSchedulerLock auto_lock(sequence_lock_); > sequence = std::move(sequence_); > } > // Reference released at the end of the scope. Nice. Yes indeed! https://codereview.chromium.org/2726073002/diff/40001/base/task_scheduler/sch... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:109: return sequence_.get(); On 2017/03/03 05:31:13, fdoray wrote: > no .get(); Done. https://codereview.chromium.org/2726073002/diff/40001/base/task_scheduler/sch... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:180: // shutdonw or joined). We will destroy the task in this context instead. On 2017/03/03 05:31:13, fdoray wrote: > I don't find "We will destroy the task in this context instead." useful. The > task is always dropped when it can't be posted, no matter the reason. Sounds good. Removed. https://codereview.chromium.org/2726073002/diff/40001/base/task_scheduler/sch... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:180: // shutdonw or joined). We will destroy the task in this context instead. On 2017/03/03 05:31:13, fdoray wrote: > s/shutdonw/shutdown/ Done. https://codereview.chromium.org/2726073002/diff/40001/base/task_scheduler/sch... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:188: } On 2017/03/03 05:31:13, fdoray wrote: > The sequence and tasks it contained before the PushTask() may be destroyed here. > To avoid that, you could: > > 1. Never reset |SchedulerWorkerDelegate::sequence_|; > 2. Add Sequence::Terminate() (see below). > 3. Call Sequence::Terminate() from OnMainExit(). > 4. Do not take a strong ref in PostTaskNow(). > > That would guarantee that a task is destroyed: > - On the thread, or > - In its *own* post task call site. > > void Sequence::Terminate() { > { > AutoSchedulerLock auto_lock(sequence_lock_); > // All methods become no-op and queue is never accessed > // when |terminated_| is true. > terminated_ = true; > } > queue_.clear(); > } Offline discussion: There is still a potential for tasks to get destroyed on the caller thread even if we implement a way to clear the sequence. We're going to keep things as is for now and see if this is a real problem in the field. In either way, it's likely callers that do encounter this may be doing the wrong thing.
lgtm with ping https://codereview.chromium.org/2726073002/diff/1/base/task_scheduler/schedul... File base/task_scheduler/scheduler_single_thread_task_runner_manager.cc (right): https://codereview.chromium.org/2726073002/diff/1/base/task_scheduler/schedul... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:264: subtle::NoBarrier_AtomicIncrement(&workers_unregistered_during_join_, 1); On 2017/03/03 05:31:13, fdoray wrote: > On 2017/03/03 04:24:17, robliao (PST plus 17h) wrote: > > On 2017/03/02 14:37:48, fdoray wrote: > > > Use base::AtomicRefCountInc and base::AtomicRefCountIsZero. > > > > Those functions are intended for reference counting. > > The author(s) don't seem to want people to directly call these functions: > > > https://cs.chromium.org/chromium/src/base/atomic_ref_count.h?rcl=27c78942d0fb... > > They want people to use base/memory/ref_counted.h instead of > base/atomic_ref_count.h, but they don't say anything about using > base/atomic_ref_counter.h instead of base/atomicops.h. You wouldn't be the first > user of this header. I'll let you decide what's best. ping
https://codereview.chromium.org/2726073002/diff/1/base/task_scheduler/schedul... File base/task_scheduler/scheduler_single_thread_task_runner_manager.cc (right): https://codereview.chromium.org/2726073002/diff/1/base/task_scheduler/schedul... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:264: subtle::NoBarrier_AtomicIncrement(&workers_unregistered_during_join_, 1); On 2017/03/03 07:22:39, fdoray wrote: > On 2017/03/03 05:31:13, fdoray wrote: > > On 2017/03/03 04:24:17, robliao (PST plus 17h) wrote: > > > On 2017/03/02 14:37:48, fdoray wrote: > > > > Use base::AtomicRefCountInc and base::AtomicRefCountIsZero. > > > > > > Those functions are intended for reference counting. > > > The author(s) don't seem to want people to directly call these functions: > > > > > > https://cs.chromium.org/chromium/src/base/atomic_ref_count.h?rcl=27c78942d0fb... > > > > They want people to use base/memory/ref_counted.h instead of > > base/atomic_ref_count.h, but they don't say anything about using > > base/atomic_ref_counter.h instead of base/atomicops.h. You wouldn't be the > first > > user of this header. I'll let you decide what's best. > > ping The intent of atomic_ref_count.h as written is to handle reference counting. We aren't quite doing that here. We just need a counter. I would be more okay using AtomicRefCountInc if it were instead AtomicIncrement (no refcount indication). Having said that, it might be an interesting refactor of atomic_ref_count -> atomic_count
Message was sent while issue was closed.
On 2017/03/03 07:50:05, robliao wrote: > https://codereview.chromium.org/2726073002/diff/1/base/task_scheduler/schedul... > File base/task_scheduler/scheduler_single_thread_task_runner_manager.cc (right): > > https://codereview.chromium.org/2726073002/diff/1/base/task_scheduler/schedul... > base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:264: > subtle::NoBarrier_AtomicIncrement(&workers_unregistered_during_join_, 1); > On 2017/03/03 07:22:39, fdoray wrote: > > On 2017/03/03 05:31:13, fdoray wrote: > > > On 2017/03/03 04:24:17, robliao (PST plus 17h) wrote: > > > > On 2017/03/02 14:37:48, fdoray wrote: > > > > > Use base::AtomicRefCountInc and base::AtomicRefCountIsZero. > > > > > > > > Those functions are intended for reference counting. > > > > The author(s) don't seem to want people to directly call these functions: > > > > > > > > > > https://cs.chromium.org/chromium/src/base/atomic_ref_count.h?rcl=27c78942d0fb... > > > > > > They want people to use base/memory/ref_counted.h instead of > > > base/atomic_ref_count.h, but they don't say anything about using > > > base/atomic_ref_counter.h instead of base/atomicops.h. You wouldn't be the > > first > > > user of this header. I'll let you decide what's best. > > > > ping > > The intent of atomic_ref_count.h as written is to handle reference counting. We > aren't quite doing that here. We just need a counter. > I would be more okay using AtomicRefCountInc if it were instead AtomicIncrement > (no refcount indication). > > Having said that, it might be an interesting refactor of atomic_ref_count -> > atomic_count This was successfully merged into https://codereview.chromium.org/2698843006/
Message was sent while issue was closed.
lgtm w/ comments https://codereview.chromium.org/2726073002/diff/100001/base/task_scheduler/sc... File base/task_scheduler/scheduler_single_thread_task_runner_manager.cc (right): https://codereview.chromium.org/2726073002/diff/100001/base/task_scheduler/sc... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:232: << "There cannot be outstanding SingleThreadTaskRunners upon destruction" nit: space before closing " to lead next word. https://codereview.chromium.org/2726073002/diff/100001/base/task_scheduler/sc... File base/task_scheduler/scheduler_single_thread_task_runner_manager_unittest.cc (right): https://codereview.chromium.org/2726073002/diff/100001/base/task_scheduler/sc... base/task_scheduler/scheduler_single_thread_task_runner_manager_unittest.cc:45: DCHECK_EQ(FOREGROUND_WORKER_POOL, params_vector.size()); EXPECT_EQ instead of DCHECK_EQ in tests. https://codereview.chromium.org/2726073002/diff/100001/base/task_scheduler/sc... base/task_scheduler/scheduler_single_thread_task_runner_manager_unittest.cc:75: virtual void TearDownSingleThreadTaskRunnerManager() { Instead of having this virtual method simply to skip the join call, keep it inline in TearDown() but only do it if (single_thread_task_runner_manager_), allowing early reset in fixtures that care? (i.e your other fixture can override TearDown() and do its thing before handing off to its parent's TearDown()?) https://codereview.chromium.org/2726073002/diff/100001/base/task_scheduler/sc... base/task_scheduler/scheduler_single_thread_task_runner_manager_unittest.cc:285: DISALLOW_COPY_AND_ASSIGN(CallJoinFromDifferentThread); empty line before DISALLOW...() macro https://codereview.chromium.org/2726073002/diff/100001/base/task_scheduler/sc... base/task_scheduler/scheduler_single_thread_task_runner_manager_unittest.cc:306: TEST_F(TaskSchedulerSingleThreadTaskRunnerManagerJoinTest, ConcurrentJoin) { How is concurrent join stressing the code added in this CL? https://codereview.chromium.org/2726073002/diff/100001/base/task_scheduler/sc... base/task_scheduler/scheduler_single_thread_task_runner_manager_unittest.cc:332: ConcurrentJoinExtraSkippedTask) { And how is this one different? Please document what each test is stressing, otherwise we're just piling tests and it makes future refactorings more cumbersome.
Message was sent while issue was closed.
Done in https://codereview.chromium.org/2753443005/ Add all further discussion there to make following the threads a little easier. https://codereview.chromium.org/2726073002/diff/100001/base/task_scheduler/sc... File base/task_scheduler/scheduler_single_thread_task_runner_manager.cc (right): https://codereview.chromium.org/2726073002/diff/100001/base/task_scheduler/sc... base/task_scheduler/scheduler_single_thread_task_runner_manager.cc:232: << "There cannot be outstanding SingleThreadTaskRunners upon destruction" On 2017/03/15 20:00:13, gab wrote: > nit: space before closing " to lead next word. Done. https://codereview.chromium.org/2726073002/diff/100001/base/task_scheduler/sc... File base/task_scheduler/scheduler_single_thread_task_runner_manager_unittest.cc (right): https://codereview.chromium.org/2726073002/diff/100001/base/task_scheduler/sc... base/task_scheduler/scheduler_single_thread_task_runner_manager_unittest.cc:45: DCHECK_EQ(FOREGROUND_WORKER_POOL, params_vector.size()); On 2017/03/15 20:00:13, gab wrote: > EXPECT_EQ instead of DCHECK_EQ in tests. I would prefer to fail fast here since there's no point in continuing the test if any of these checks fail. If these conditions are violated we're in one of the following scenarios: 1) Someone swapped the order. The tests will fail but won't crash the shard. 2) Someone added some worker pool types but forgot to update the mapper function and params vector generation. We could crash the shard due to vector out of bounds accesses. Since we could still crash the shard, we might as well crash early. https://codereview.chromium.org/2726073002/diff/100001/base/task_scheduler/sc... base/task_scheduler/scheduler_single_thread_task_runner_manager_unittest.cc:75: virtual void TearDownSingleThreadTaskRunnerManager() { On 2017/03/15 20:00:13, gab wrote: > Instead of having this virtual method simply to skip the join call, keep it > inline in TearDown() but only do it if (single_thread_task_runner_manager_), > allowing early reset in fixtures that care? (i.e your other fixture can override > TearDown() and do its thing before handing off to its parent's TearDown()?) That seemed pretty subtle. It's nicer when a class can fully control the validity of the pointers. A virtual method call is super cheap in that this is just a test. This way we don't have to check in each test if the pointers are cleared by the test. https://codereview.chromium.org/2726073002/diff/100001/base/task_scheduler/sc... base/task_scheduler/scheduler_single_thread_task_runner_manager_unittest.cc:285: DISALLOW_COPY_AND_ASSIGN(CallJoinFromDifferentThread); On 2017/03/15 20:00:13, gab wrote: > empty line before DISALLOW...() macro Done. https://codereview.chromium.org/2726073002/diff/100001/base/task_scheduler/sc... base/task_scheduler/scheduler_single_thread_task_runner_manager_unittest.cc:306: TEST_F(TaskSchedulerSingleThreadTaskRunnerManagerJoinTest, ConcurrentJoin) { On 2017/03/15 20:00:13, gab wrote: > How is concurrent join stressing the code added in this CL? Added a comment. Both of these tests cause two different failures without the fix. // Exercises the codepath where the workers are unavailable for unregistration // because of a Join call. https://codereview.chromium.org/2726073002/diff/100001/base/task_scheduler/sc... base/task_scheduler/scheduler_single_thread_task_runner_manager_unittest.cc:332: ConcurrentJoinExtraSkippedTask) { On 2017/03/15 20:00:13, gab wrote: > And how is this one different? Please document what each test is stressing, > otherwise we're just piling tests and it makes future refactorings more > cumbersome. Added a comment. // Tests to make sure that tasks are properly cleaned up at Join, allowing // SingleThreadTaskRunners to unregister themselves. |