|
|
DescriptionSynchronize memory in TaskTracker::Flush().
To prevent races, make sure that code that runs after
TaskTracker::Flush() sees memory changed by all tasks it was waiting
for.
BUG=717527
Review-Url: https://codereview.chromium.org/2857613002
Cr-Commit-Position: refs/heads/master@{#469008}
Committed: https://chromium.googlesource.com/chromium/src/+/9888c32790d9934ce4a4446349c9a691536561cd
Patch Set 1 #
Total comments: 6
Patch Set 2 : update_comment_in_header_file #
Messages
Total messages: 24 (13 generated)
The CQ bit was checked by fdoray@chromium.org to run a CQ dry run
fdoray@chromium.org changed reviewers: + gab@chromium.org, robliao@chromium.org
PTAL
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
https://codereview.chromium.org/2857613002/diff/1/base/task_scheduler/task_tr... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2857613002/diff/1/base/task_scheduler/task_tr... base/task_scheduler/task_tracker.cc:218: while (subtle::Acquire_Load(&num_pending_undelayed_tasks_) != 0 && What was the failure that caused this? https://codereview.chromium.org/2857613002/diff/1/base/task_scheduler/task_tr... base/task_scheduler/task_tracker.cc:487: subtle::Barrier_AtomicIncrement(&num_pending_undelayed_tasks_, -1); Is this barrier actually necessary? What other memory was impacted?
On 2017/05/02 14:23:20, robliao wrote: > https://codereview.chromium.org/2857613002/diff/1/base/task_scheduler/task_tr... > File base/task_scheduler/task_tracker.cc (right): > > https://codereview.chromium.org/2857613002/diff/1/base/task_scheduler/task_tr... > base/task_scheduler/task_tracker.cc:218: while > (subtle::Acquire_Load(&num_pending_undelayed_tasks_) != 0 && > What was the failure that caused this? > > https://codereview.chromium.org/2857613002/diff/1/base/task_scheduler/task_tr... > base/task_scheduler/task_tracker.cc:487: > subtle::Barrier_AtomicIncrement(&num_pending_undelayed_tasks_, -1); > Is this barrier actually necessary? What other memory was impacted? See bug: https://crbug.com/717527 In TaskSchedulerWorkerPoolImplPostTaskBeforeStartTest.PostTasksBeforeStart, there is a race between a task writing state in |barrier| and the main thread destroying |barrier|. |barrier| is destroyed after TaskTracker::Flush(). Unfortunately, TaskTracker::Flush() doesn't provide memory synchronization with the task that wrote the state. Code: A void TaskTracker::Flush() { B AutoSchedulerLock auto_lock(flush_lock_); C while (subtle::NoBarrier_Load(&num_pending_undelayed_tasks_) != 0 && D !IsShutdownComplete()) { E flush_cv_->Wait(); F } G } H void TaskTracker::DecrementNumPendingUndelayedTasks() { I const auto new_num_pending_undelayed_tasks = J subtle::NoBarrier_AtomicIncrement(&num_pending_undelayed_tasks_, -1); K DCHECK_GE(new_num_pending_undelayed_tasks, 0); L if (new_num_pending_undelayed_tasks == 0) { M AutoSchedulerLock auto_lock(flush_lock_); N flush_cv_->Signal(); O } P } Thread Line Main A Main B Sched H Sched I Sched J Main C |num_pending_undelayed_tasks_| is zero Main F Main G Flush() has returned without memory synchronization with task. Sched K Sched L Sched M Acquiring |flush_lock_| provides memory synchronization, but it is too late. Main thread may already have accessed stale memory.
https://codereview.chromium.org/2857613002/diff/1/base/task_scheduler/task_tr... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2857613002/diff/1/base/task_scheduler/task_tr... base/task_scheduler/task_tracker.cc:218: while (subtle::Acquire_Load(&num_pending_undelayed_tasks_) != 0 && On 2017/05/02 14:23:20, robliao wrote: > What was the failure that caused this? Offline Discussion: The Acquire here is probably not necessary due to the flush_lock_. We also aren't concerned about the effects of writes reordered within this thread for visibility to other threads. task_tracker.h will need to be updated too. // Number of undelayed tasks that haven't completed their execution. Is // incremented and decremented without a barrier. When it reaches zero, // |flush_lock_| is acquired (forcing memory synchronization) and |flush_cv_| // is signaled. subtle::Atomic32 num_pending_undelayed_tasks_ = 0; https://codereview.chromium.org/2857613002/diff/1/base/task_scheduler/task_tr... base/task_scheduler/task_tracker.cc:487: subtle::Barrier_AtomicIncrement(&num_pending_undelayed_tasks_, -1); On 2017/05/02 14:23:20, robliao wrote: > Is this barrier actually necessary? What other memory was impacted? Offline discussion: The memory writes of the task finishing could be reordered after this atomic decrement, which is why we need a barrier here so that when another thread sees 0, it's guaranteed that all writes are visible.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Thought about this for a while, I think this is the cleanest solution, lgtm https://codereview.chromium.org/2857613002/diff/1/base/task_scheduler/task_tr... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2857613002/diff/1/base/task_scheduler/task_tr... base/task_scheduler/task_tracker.cc:218: while (subtle::Acquire_Load(&num_pending_undelayed_tasks_) != 0 && My thoughts (no action required): So the problem is that it's possible that this sees 0 before |flush_lock_| is acquired because it dropped it to 0 in DecrementNumPendingUndelayedTasks() So what needs to happen is DecrementNumPendingUndelayedTasks() not only calls flush_cv_->Signal() inside the lock but also sets another piece of state that the conditional here will depend on (e.g "bool flushed_ = true;"). The atomic |num_pending_undelayed_tasks_| is still useful to make sure we only grab the lock on sync points (0 -> 1 and 1 -> 0)... But meh... overall the barriers are simpler I think... PS: I do think we need the Acquire here as this is what forces this to happen-after the Release semantics of the Barrier_AtomicIncrement.
The CQ bit was checked by fdoray@chromium.org to run a CQ dry run
The CQ bit was unchecked by fdoray@chromium.org
https://codereview.chromium.org/2857613002/diff/1/base/task_scheduler/task_tr... File base/task_scheduler/task_tracker.cc (right): https://codereview.chromium.org/2857613002/diff/1/base/task_scheduler/task_tr... base/task_scheduler/task_tracker.cc:218: while (subtle::Acquire_Load(&num_pending_undelayed_tasks_) != 0 && On 2017/05/02 15:44:47, robliao wrote: > On 2017/05/02 14:23:20, robliao wrote: > > What was the failure that caused this? > > Offline Discussion: The Acquire here is probably not necessary due to the > flush_lock_. We also aren't concerned about the effects of writes reordered > within this thread for visibility to other threads. > > task_tracker.h will need to be updated too. > // Number of undelayed tasks that haven't completed their execution. Is > // incremented and decremented without a barrier. When it reaches zero, > // |flush_lock_| is acquired (forcing memory synchronization) and |flush_cv_| > // is signaled. > subtle::Atomic32 num_pending_undelayed_tasks_ = 0; Updating comment in header file: Done.
The CQ bit was checked by fdoray@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from gab@chromium.org Link to the patchset: https://codereview.chromium.org/2857613002/#ps20001 (title: "update_comment_in_header_file")
CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
lgtm
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: linux_chromium_chromeos_ozone_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
The CQ bit was checked by fdoray@chromium.org
CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
CQ is committing da patch. Bot data: {"patchset_id": 20001, "attempt_start_ts": 1493828844842010, "parent_rev": "1d44ab0e3d0a7cd05ce1da956bb87a85a8b88174", "commit_rev": "9888c32790d9934ce4a4446349c9a691536561cd"}
Message was sent while issue was closed.
Description was changed from ========== Synchronize memory in TaskTracker::Flush(). To prevent races, make sure that code that runs after TaskTracker::Flush() sees memory changed by all tasks it was waiting for. BUG=717527 ========== to ========== Synchronize memory in TaskTracker::Flush(). To prevent races, make sure that code that runs after TaskTracker::Flush() sees memory changed by all tasks it was waiting for. BUG=717527 Review-Url: https://codereview.chromium.org/2857613002 Cr-Commit-Position: refs/heads/master@{#469008} Committed: https://chromium.googlesource.com/chromium/src/+/9888c32790d9934ce4a4446349c9... ==========
Message was sent while issue was closed.
Committed patchset #2 (id:20001) as https://chromium.googlesource.com/chromium/src/+/9888c32790d9934ce4a4446349c9... |