|
|
DescriptionFixes for SkRWBuffer
Do not call SkBufferHead::validate in SkROBuffer's destructor, which
may be called in a separate thread from SkRWBuffer::append. validate()
reads SkBufferBlock::fUsed, and append() writes to it, resulting in
a data race.
Update some comments to be more clear about how it is safe to use
these classes across threads.
Test the readers in separate threads.
In addition, make sure it is safe to create a reader even when no
data has been appended. Add tests for this case.
Mark a parameter to SkBufferHead::validate() as const, reflecting
its use.
BUG=chromium:601578
BUG=chromium:605479
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1871953002
Committed: https://skia.googlesource.com/skia/+/635164028594b4af0086ec85b5e4570dd11091da
Patch Set 1 #
Total comments: 7
Patch Set 2 : Replace long comment with function call #Patch Set 3 : Rebase #Patch Set 4 : Remove fTail changes #Patch Set 5 : Capture arrays by reference #Patch Set 6 : Attempt to avoid data race in SkBufferBlock->fNext #Patch Set 7 : Capture only the stream/reader needed #Patch Set 8 : Change ownership in test, plus remove test after SkRWBuffer deletion #Patch Set 9 : Fix indentation #Patch Set 10 : Restore original test (in addition to threaded one) #
Depends on Patchset: Messages
Total messages: 29 (13 generated)
Description was changed from ========== Fix data race in SkRWBuffer SkBufferBlock::fUsed may be updated by the writer while a reader is attempting to read it. This is only on the last block, so keep track of the last block at SkROBuffer creation time and how much data it has. This way, it can read that value instead of fUsed. Continue to read fUsed on earlier SkBufferBlocks, which will never be updated. Do not call SkBufferHead::validate in SkROBuffer's destructor, to avoid another race. Update some comments to be more clear about how it is safe to use these classes across threads. Test the readers in separate threads. In addition, make sure it is safe to create a reader even when no data has been appended. Add tests for this case. BUG=chromium:601578 ========== to ========== Fix data race in SkRWBuffer SkBufferBlock::fUsed may be updated by the writer while a reader is attempting to read it. This is only on the last block, so keep track of the last block at SkROBuffer creation time and how much data it has. This way, it can read that value instead of fUsed. Continue to read fUsed on earlier SkBufferBlocks, which will never be updated. Do not call SkBufferHead::validate in SkROBuffer's destructor, to avoid another race. Update some comments to be more clear about how it is safe to use these classes across threads. Test the readers in separate threads. In addition, make sure it is safe to create a reader even when no data has been appended. Add tests for this case. BUG=chromium:601578 GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is... ==========
scroggo@google.com changed reviewers: + bungeman@google.com, mtklein@google.com, reed@google.com
reed@ - public API bungeman@ - avoid the data race, as we discussed mtklein@ - questions about multithreading in a test Feel free to review other aspects as well. https://codereview.chromium.org/1871953002/diff/1/include/core/SkRWBuffer.h File include/core/SkRWBuffer.h (right): https://codereview.chromium.org/1871953002/diff/1/include/core/SkRWBuffer.h#n... include/core/SkRWBuffer.h:56: const SkROBuffer* fBuffer; SkROBuffer is a const class, so should I remove this "const" here, as we have done with SkPicture, etc? https://codereview.chromium.org/1871953002/diff/1/include/core/SkRWBuffer.h#n... include/core/SkRWBuffer.h:70: const size_t fTailUsed; I'm not a huge fan of this name, but I don't have a better one. This is the value of fUsed when this ROBuffer was created; fUsed may have increased since then. It's more like fTailKnownUsed or something like that. Suggestions? https://codereview.chromium.org/1871953002/diff/1/tests/DataRefTest.cpp File tests/DataRefTest.cpp (right): https://codereview.chromium.org/1871953002/diff/1/tests/DataRefTest.cpp#newco... tests/DataRefTest.cpp:313: tasks.add([reporter, i, readers, streams] { DM will put this whole test in a task, and it will spawn its own tasks. I'm not sure whether the subtasks will be running at the same time as we are appending to the SkRWBuffer. If not, it defeats the point of this test, although it was useful for me in developing (when there weren't lots of tests/gms competing for threads). Should there be a way to run a multithreaded test in DM that is the only thing running?
https://codereview.chromium.org/1871953002/diff/1/tests/DataRefTest.cpp File tests/DataRefTest.cpp (right): https://codereview.chromium.org/1871953002/diff/1/tests/DataRefTest.cpp#newco... tests/DataRefTest.cpp:297: // Declaring this SkTaskGroup inside this block ensures that all tasks will You may just want to call .wait() instead of having to write out this long note? https://codereview.chromium.org/1871953002/diff/1/tests/DataRefTest.cpp#newco... tests/DataRefTest.cpp:313: tasks.add([reporter, i, readers, streams] { On 2016/04/08 at 16:15:29, scroggo wrote: > DM will put this whole test in a task, and it will spawn its own tasks. I'm not sure whether the subtasks will be running at the same time as we are appending to the SkRWBuffer. If not, it defeats the point of this test, although it was useful for me in developing (when there weren't lots of tests/gms competing for threads). Should there be a way to run a multithreaded test in DM that is the only thing running? The tasks this tests spawns can run in parallel with each other, with other tests, with GMs, whatever. We usually try to start tasks in stack order to help memory locality, but it's not guaranteed, and obviously we cannot guarantee the order that tasks finish. So they really may be running in parallel to anything else. If you have a set of tasks that cannot run at the same time safely, they all need to be marked as serial. DM will then run them serially on the main thread. Other tasks will still be running around them on different threads. Today the only way to signal that a unit test should serialize is by writing it as GPU unit test and not passing --gpu_threading. Do not piggy back on this unless your code also becomes thread safe when Ganesh is. But you can look at DM.cpp to see how it decides to put these test in gSerialTests. The simplest way to make sure two tests cannot have threading issues is to write them as one test.
mike and I had a chat about this. lets talk more before proceeding with this approach
Description was changed from ========== Fix data race in SkRWBuffer SkBufferBlock::fUsed may be updated by the writer while a reader is attempting to read it. This is only on the last block, so keep track of the last block at SkROBuffer creation time and how much data it has. This way, it can read that value instead of fUsed. Continue to read fUsed on earlier SkBufferBlocks, which will never be updated. Do not call SkBufferHead::validate in SkROBuffer's destructor, to avoid another race. Update some comments to be more clear about how it is safe to use these classes across threads. Test the readers in separate threads. In addition, make sure it is safe to create a reader even when no data has been appended. Add tests for this case. BUG=chromium:601578 GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is... ========== to ========== Fixes for SkRWBuffer Do not call SkBufferHead::validate in SkROBuffer's destructor, which may be called in a separate thread from SkRWBuffer::append. validate() reads SkBufferBlock::fUsed, and append() writes to it, resulting in a data race. Update some comments to be more clear about how it is safe to use these classes across threads. Test the readers in separate threads. In addition, make sure it is safe to create a reader even when no data has been appended. Add tests for this case. Mark a parameter to SkBufferHead::validate() as const, reflecting its use. BUG=chromium:601578 GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is... ==========
On 2016/04/08 17:17:00, reed1 wrote: > mike and I had a chat about this. lets talk more before proceeding with this > approach To recap: reed@ already landed crrev.com/1872853002, which removed one of the flavors of SkRWBuffer::apend(), removing a data race. I've rebased on that, and removed my changes to avoid that data race. This CL still fixes another data race, use of an empty SkRWBuffer, and updates some comments. PTAL at patch set 4 https://codereview.chromium.org/1871953002/diff/1/tests/DataRefTest.cpp File tests/DataRefTest.cpp (right): https://codereview.chromium.org/1871953002/diff/1/tests/DataRefTest.cpp#newco... tests/DataRefTest.cpp:297: // Declaring this SkTaskGroup inside this block ensures that all tasks will On 2016/04/08 17:13:49, mtklein wrote: > You may just want to call .wait() instead of having to write out this long note? Done. https://codereview.chromium.org/1871953002/diff/1/tests/DataRefTest.cpp#newco... tests/DataRefTest.cpp:313: tasks.add([reporter, i, readers, streams] { On 2016/04/08 17:13:49, mtklein wrote: > On 2016/04/08 at 16:15:29, scroggo wrote: > > DM will put this whole test in a task, and it will spawn its own tasks. I'm > not sure whether the subtasks will be running at the same time as we are > appending to the SkRWBuffer. If not, it defeats the point of this test, although > it was useful for me in developing (when there weren't lots of tests/gms > competing for threads). Should there be a way to run a multithreaded test in DM > that is the only thing running? > > The tasks this tests spawns can run in parallel with each other, with other > tests, with GMs, whatever. We usually try to start tasks in stack order to help > memory locality, but it's not guaranteed, and obviously we cannot guarantee the > order that tasks finish. So they really may be running in parallel to anything > else. > > If you have a set of tasks that cannot run at the same time safely, they all > need to be marked as serial. DM will then run them serially on the main thread. > Other tasks will still be running around them on different threads. > > Today the only way to signal that a unit test should serialize is by writing it > as GPU unit test and not passing --gpu_threading. Do not piggy back on this > unless your code also becomes thread safe when Ganesh is. But you can look at > DM.cpp to see how it decides to put these test in gSerialTests. > > The simplest way to make sure two tests cannot have threading issues is to write > them as one test. I have a different issue. It is perfectly safe to run this test (and its subtasks) in parallel with other tests. That said, this test itself is only interesting if its main thread (which is calling SkRWBuffer::append()) runs in parallel to its subtasks (which read from the SkROBuffer), so we test any racing between them. If tasks might be run in stack order, that seems ideal - even if it doesn't always happen it may help us catch any bugs that pop up when it does. If we want to feel better about these tasks running in parallel with the main thread, I think we want to run a task group *before* we kick off any parallel work. gSerialTests starts after, meaning that it runs at the same time as the parallel stuff.
> I have a different issue. It is perfectly safe to run this test (and its subtasks) in parallel with other tests. That said, this test itself is only interesting if its main thread (which is calling SkRWBuffer::append()) runs in parallel to its subtasks (which read from the SkROBuffer), so we test any racing between them. If tasks might be run in stack order, that seems ideal - even if it doesn't always happen it may help us catch any bugs that pop up when it does. Do you mean, can line 300 run at the same time as line 307? Yes, they can. > I think we want to run a task group *before* we kick off any parallel work. I don't understand what you're getting at here. A task group is parallel work.
On 2016/04/11 17:05:49, mtklein wrote: > > I have a different issue. It is perfectly safe to run this test (and its > subtasks) in parallel with other tests. That said, this test itself is only > interesting if its main thread (which is calling SkRWBuffer::append()) runs in > parallel to its subtasks (which read from the SkROBuffer), so we test any racing > between them. If tasks might be run in stack order, that seems ideal - even if > it doesn't always happen it may help us catch any bugs that pop up when it does. > > Do you mean, can line 300 run at the same time as line 307? Yes, they can. > > > > I think we want to run a task group *before* we kick off any parallel work. > > I don't understand what you're getting at here. A task group is parallel work. mtklein@ and I discussed in person. This test is sufficient. PTAL.
lgtm
The CQ bit was checked by scroggo@google.com
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1871953002/60001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1871953002/60001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: Build-Win-MSVC-x86_64-Debug-Trybot on client.skia.compile (JOB_FAILED, http://build.chromium.org/p/client.skia.compile/builders/Build-Win-MSVC-x86_6...)
The CQ bit was checked by scroggo@google.com
The patchset sent to the CQ was uploaded after l-g-t-m from reed@google.com Link to the patchset: https://codereview.chromium.org/1871953002/#ps80001 (title: "Capture arrays by reference")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1871953002/80001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1871953002/80001
Message was sent while issue was closed.
Description was changed from ========== Fixes for SkRWBuffer Do not call SkBufferHead::validate in SkROBuffer's destructor, which may be called in a separate thread from SkRWBuffer::append. validate() reads SkBufferBlock::fUsed, and append() writes to it, resulting in a data race. Update some comments to be more clear about how it is safe to use these classes across threads. Test the readers in separate threads. In addition, make sure it is safe to create a reader even when no data has been appended. Add tests for this case. Mark a parameter to SkBufferHead::validate() as const, reflecting its use. BUG=chromium:601578 GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is... ========== to ========== Fixes for SkRWBuffer Do not call SkBufferHead::validate in SkROBuffer's destructor, which may be called in a separate thread from SkRWBuffer::append. validate() reads SkBufferBlock::fUsed, and append() writes to it, resulting in a data race. Update some comments to be more clear about how it is safe to use these classes across threads. Test the readers in separate threads. In addition, make sure it is safe to create a reader even when no data has been appended. Add tests for this case. Mark a parameter to SkBufferHead::validate() as const, reflecting its use. BUG=chromium:601578 GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is... Committed: https://skia.googlesource.com/skia/+/d06920a29fe11c68bde2b93948ec99f277bb8459 ==========
Message was sent while issue was closed.
Committed patchset #5 (id:80001) as https://skia.googlesource.com/skia/+/d06920a29fe11c68bde2b93948ec99f277bb8459
Message was sent while issue was closed.
A revert of this CL (patchset #5 id:80001) has been created in https://codereview.chromium.org/1882853004/ by bungeman@google.com. The reason for reverting is: Making MSAN and TSAN rather unhappy. https://build.chromium.org/p/client.skia/builders/Test-Ubuntu-GCC-GCE-CPU-AVX... https://build.chromium.org/p/client.skia/builders/Test-Ubuntu-GCC-GCE-CPU-AVX....
Message was sent while issue was closed.
Description was changed from ========== Fixes for SkRWBuffer Do not call SkBufferHead::validate in SkROBuffer's destructor, which may be called in a separate thread from SkRWBuffer::append. validate() reads SkBufferBlock::fUsed, and append() writes to it, resulting in a data race. Update some comments to be more clear about how it is safe to use these classes across threads. Test the readers in separate threads. In addition, make sure it is safe to create a reader even when no data has been appended. Add tests for this case. Mark a parameter to SkBufferHead::validate() as const, reflecting its use. BUG=chromium:601578 GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is... Committed: https://skia.googlesource.com/skia/+/d06920a29fe11c68bde2b93948ec99f277bb8459 ========== to ========== Fixes for SkRWBuffer Do not call SkBufferHead::validate in SkROBuffer's destructor, which may be called in a separate thread from SkRWBuffer::append. validate() reads SkBufferBlock::fUsed, and append() writes to it, resulting in a data race. Update some comments to be more clear about how it is safe to use these classes across threads. Test the readers in separate threads. In addition, make sure it is safe to create a reader even when no data has been appended. Add tests for this case. Mark a parameter to SkBufferHead::validate() as const, reflecting its use. BUG=chromium:601578 GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is... Committed: https://skia.googlesource.com/skia/+/d06920a29fe11c68bde2b93948ec99f277bb8459 ==========
On 2016/04/14 21:56:39, bungeman1 wrote: > A revert of this CL (patchset #5 id:80001) has been created in > https://codereview.chromium.org/1882853004/ by mailto:bungeman@google.com. > > The reason for reverting is: Making MSAN and TSAN rather unhappy. > > https://build.chromium.org/p/client.skia/builders/Test-Ubuntu-GCC-GCE-CPU-AVX... > > https://build.chromium.org/p/client.skia/builders/Test-Ubuntu-GCC-GCE-CPU-AVX.... The TSAN failure revolves around SkRWBuffer::append() and SkROBuffer::Iter::next(). Both access SkBufferBlock->fNext. append() may change fNext from nullptr to a newly created SkBufferBlock. next() reads it to read the next SkBufferBlock. According to my understanding, these should not be a problem, since next() only reads fNext if fRemaining > 0. We know that fNext only gets set once (not counting to nullptr at initialization), and if it is, when we get to next(), it will have already been set OR fRemaining will be zero. So if it's possible that append() is writing to fNext, next() will not read it. But TSAN is still unhappy. (As is MSAN, but that one is harder to follow.) I've uploaded a CL which revives my earlier idea to keep track of the tail. I'm guessing that will make TSAN no happier though (running trybots now).
Description was changed from ========== Fixes for SkRWBuffer Do not call SkBufferHead::validate in SkROBuffer's destructor, which may be called in a separate thread from SkRWBuffer::append. validate() reads SkBufferBlock::fUsed, and append() writes to it, resulting in a data race. Update some comments to be more clear about how it is safe to use these classes across threads. Test the readers in separate threads. In addition, make sure it is safe to create a reader even when no data has been appended. Add tests for this case. Mark a parameter to SkBufferHead::validate() as const, reflecting its use. BUG=chromium:601578 GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is... Committed: https://skia.googlesource.com/skia/+/d06920a29fe11c68bde2b93948ec99f277bb8459 ========== to ========== Fixes for SkRWBuffer Do not call SkBufferHead::validate in SkROBuffer's destructor, which may be called in a separate thread from SkRWBuffer::append. validate() reads SkBufferBlock::fUsed, and append() writes to it, resulting in a data race. Update some comments to be more clear about how it is safe to use these classes across threads. Test the readers in separate threads. In addition, make sure it is safe to create a reader even when no data has been appended. Add tests for this case. Mark a parameter to SkBufferHead::validate() as const, reflecting its use. BUG=chromium:601578 BUG=chromium:605479 GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is... ==========
On 2016/04/18 19:23:24, scroggo wrote: > On 2016/04/14 21:56:39, bungeman1 wrote: > > A revert of this CL (patchset #5 id:80001) has been created in > > https://codereview.chromium.org/1882853004/ by mailto:bungeman@google.com. > > > > The reason for reverting is: Making MSAN and TSAN rather unhappy. > > > > > https://build.chromium.org/p/client.skia/builders/Test-Ubuntu-GCC-GCE-CPU-AVX... > > > > > https://build.chromium.org/p/client.skia/builders/Test-Ubuntu-GCC-GCE-CPU-AVX.... > > The TSAN failure revolves around SkRWBuffer::append() and > SkROBuffer::Iter::next(). Both access SkBufferBlock->fNext. > > append() may change fNext from nullptr to a newly created SkBufferBlock. next() > reads it to read the next SkBufferBlock. > > According to my understanding, these should not be a problem, since next() only > reads fNext if fRemaining > 0. We know that fNext only gets set once (not > counting to nullptr at initialization), and if it is, when we get to next(), it > will have already been set OR fRemaining will be zero. So if it's possible that > append() is writing to fNext, next() will not read it. > > But TSAN is still unhappy. (As is MSAN, but that one is harder to follow.) I've > uploaded a CL which revives my earlier idea to keep track of the tail. I'm > guessing that will make TSAN no happier though (running trybots now). I was too pessimistic - TSAN is happy with using fTail :) And I have finally fixed MSAN! (I think - patchset 1 of crrev.com/1906893002, which I used to help me isolate the problem, passed MSAN the first time, and passed locally. After patchset 2 failed, I tried 1 again locally and it failed, too. So it seems to be inconsistent.) I dropped the arrays, thinking they might be part of the problem. (The arrays are uninitialized, after all, from i+1 to N-1. I expect i to be okay though...) Instead I use an sk_sp for the SkROBuffer (reader) and just finish the test after spawning all the threads (which now delete the streams, too). PTAL. I revived storing the Tail in the SkROBuffer, which crrev.com/1872853002 was (at least partially) attempting to avoid.
The CQ bit was checked by scroggo@google.com
The patchset sent to the CQ was uploaded after l-g-t-m from reed@google.com Link to the patchset: https://codereview.chromium.org/1871953002/#ps180001 (title: "Restore original test (in addition to threaded one)")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1871953002/180001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1871953002/180001
Message was sent while issue was closed.
Description was changed from ========== Fixes for SkRWBuffer Do not call SkBufferHead::validate in SkROBuffer's destructor, which may be called in a separate thread from SkRWBuffer::append. validate() reads SkBufferBlock::fUsed, and append() writes to it, resulting in a data race. Update some comments to be more clear about how it is safe to use these classes across threads. Test the readers in separate threads. In addition, make sure it is safe to create a reader even when no data has been appended. Add tests for this case. Mark a parameter to SkBufferHead::validate() as const, reflecting its use. BUG=chromium:601578 BUG=chromium:605479 GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is... ========== to ========== Fixes for SkRWBuffer Do not call SkBufferHead::validate in SkROBuffer's destructor, which may be called in a separate thread from SkRWBuffer::append. validate() reads SkBufferBlock::fUsed, and append() writes to it, resulting in a data race. Update some comments to be more clear about how it is safe to use these classes across threads. Test the readers in separate threads. In addition, make sure it is safe to create a reader even when no data has been appended. Add tests for this case. Mark a parameter to SkBufferHead::validate() as const, reflecting its use. BUG=chromium:601578 BUG=chromium:605479 GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is... Committed: https://skia.googlesource.com/skia/+/635164028594b4af0086ec85b5e4570dd11091da ==========
Message was sent while issue was closed.
Committed patchset #10 (id:180001) as https://skia.googlesource.com/skia/+/635164028594b4af0086ec85b5e4570dd11091da |