|
|
Chromium Code Reviews
DescriptionAvoid holding locks across sync IPCs in command buffer
Pepper drops its global lock and reacquires it during synchronous IPCs, so these can turn into deadlocks.
BUG=418651
Patch Set 1 #
Total comments: 3
Patch Set 2 : #
Total comments: 3
Messages
Total messages: 17 (3 generated)
jbauman@chromium.org changed reviewers: + vmiura@chromium.org
penghuang@chromium.org changed reviewers: + penghuang@chromium.org
https://codereview.chromium.org/613843003/diff/1/gpu/command_buffer/client/pr... File gpu/command_buffer/client/program_info_manager.cc (right): https://codereview.chromium.org/613843003/diff/1/gpu/command_buffer/client/pr... gpu/command_buffer/client/program_info_manager.cc:456: return gl->GetUniformLocationHelper(program, name); Does this gl call use sync IPC? https://codereview.chromium.org/613843003/diff/1/gpu/command_buffer/client/pr... gpu/command_buffer/client/program_info_manager.cc:490: return gl->GetActiveAttribHelper( Same here? https://codereview.chromium.org/613843003/diff/1/gpu/command_buffer/client/pr... gpu/command_buffer/client/program_info_manager.cc:524: return gl->GetActiveUniformHelper( same?
On 2014/09/29 23:28:59, Peng wrote: > https://codereview.chromium.org/613843003/diff/1/gpu/command_buffer/client/pr... > File gpu/command_buffer/client/program_info_manager.cc (right): > > https://codereview.chromium.org/613843003/diff/1/gpu/command_buffer/client/pr... > gpu/command_buffer/client/program_info_manager.cc:456: return > gl->GetUniformLocationHelper(program, name); > Does this gl call use sync IPC? You're right, these can all used sync IPCs. Fixed. > > https://codereview.chromium.org/613843003/diff/1/gpu/command_buffer/client/pr... > gpu/command_buffer/client/program_info_manager.cc:490: return > gl->GetActiveAttribHelper( > Same here? > > https://codereview.chromium.org/613843003/diff/1/gpu/command_buffer/client/pr... > gpu/command_buffer/client/program_info_manager.cc:524: return > gl->GetActiveUniformHelper( > same?
Tested this CL with the app several times, and the deadlock never happened again. thanks.
Tested this CL with the app several times, and the deadlock never happened again. thanks.
piman@chromium.org changed reviewers: + piman@chromium.org
https://codereview.chromium.org/613843003/diff/20001/gpu/command_buffer/clien... File gpu/command_buffer/client/share_group.cc (right): https://codereview.chromium.org/613843003/diff/20001/gpu/command_buffer/clien... gpu/command_buffer/client/share_group.cc:65: } I think this change would cause a potential race if bind_generates_resource is true, the race leading to a command buffer loss. Thread 1 does glDeleteTextures(1, &id) Thread 2 does glBindTexture(GL_TEXTURE_2D, id), glFlush() Where id is the same. If the ordering is such as the bind + flush in Thread 2 happens after the l.59 (the CommandBufferHelper::Flush) on thread 1, but before the lock+FreeId, then we will send a bind command to the GPU process, which will recreate the client id on the service side (since it happened after the delete), but even though it marks it as used on the client side (on thread 2), it will be freed right away (on thread 1). So after that point we have an inconsistency between the service and client side, and the next glGenTextures that reuses the client id (since it's freed on the client side) will lose the context on the service side (GLES2DecoderImpl::GenTexturesHelper returns false because the client id is in use, so GLES2DecoderImpl::HandleGenTexturesImmediate raises the kInvalidArguments error). We really need the sending of the Delete command and the FreeID to be atomic when bind_generates_resource is true, unless we add one more level of virtualization.
On 2014/10/02 21:44:02, piman (Very slow to review) wrote: > https://codereview.chromium.org/613843003/diff/20001/gpu/command_buffer/clien... > File gpu/command_buffer/client/share_group.cc (right): > > https://codereview.chromium.org/613843003/diff/20001/gpu/command_buffer/clien... > gpu/command_buffer/client/share_group.cc:65: } > I think this change would cause a potential race if bind_generates_resource is > true, the race leading to a command buffer loss. > > Thread 1 does glDeleteTextures(1, &id) > Thread 2 does glBindTexture(GL_TEXTURE_2D, id), glFlush() > > Where id is the same. > > If the ordering is such as the bind + flush in Thread 2 happens after the l.59 > (the CommandBufferHelper::Flush) on thread 1, but before the lock+FreeId, then > we will send a bind command to the GPU process, which will recreate the client > id on the service side (since it happened after the delete), but even though it > marks it as used on the client side (on thread 2), it will be freed right away > (on thread 1). So after that point we have an inconsistency between the service > and client side, and the next glGenTextures that reuses the client id (since > it's freed on the client side) will lose the context on the service side > (GLES2DecoderImpl::GenTexturesHelper returns false because the client id is in > use, so GLES2DecoderImpl::HandleGenTexturesImmediate raises the > kInvalidArguments error). > > We really need the sending of the Delete command and the FreeID to be atomic > when bind_generates_resource is true, unless we add one more level of > virtualization. Couldn't the same issue happen if the glBindTexture executes before the glDeleteTextures, and the flush (on thread 2) executes afterward? I think whether it starts executing immediately before or in the middle of the other function doesn't change anything, so at least this patch doesn't make it much worse. It feels like these sorts of cases count as user error.
On Thu, Oct 2, 2014 at 3:56 PM, <jbauman@chromium.org> wrote: > On 2014/10/02 21:44:02, piman (Very slow to review) wrote: > > https://codereview.chromium.org/613843003/diff/20001/gpu/ > command_buffer/client/share_group.cc > >> File gpu/command_buffer/client/share_group.cc (right): >> > > > https://codereview.chromium.org/613843003/diff/20001/gpu/ > command_buffer/client/share_group.cc#newcode65 > >> gpu/command_buffer/client/share_group.cc:65: } >> I think this change would cause a potential race if >> bind_generates_resource is >> true, the race leading to a command buffer loss. >> > > Thread 1 does glDeleteTextures(1, &id) >> Thread 2 does glBindTexture(GL_TEXTURE_2D, id), glFlush() >> > > Where id is the same. >> > > If the ordering is such as the bind + flush in Thread 2 happens after the >> l.59 >> (the CommandBufferHelper::Flush) on thread 1, but before the lock+FreeId, >> then >> we will send a bind command to the GPU process, which will recreate the >> client >> id on the service side (since it happened after the delete), but even >> though >> > it > >> marks it as used on the client side (on thread 2), it will be freed right >> away >> (on thread 1). So after that point we have an inconsistency between the >> > service > >> and client side, and the next glGenTextures that reuses the client id >> (since >> it's freed on the client side) will lose the context on the service side >> (GLES2DecoderImpl::GenTexturesHelper returns false because the client id >> is in >> use, so GLES2DecoderImpl::HandleGenTexturesImmediate raises the >> kInvalidArguments error). >> > > We really need the sending of the Delete command and the FreeID to be >> atomic >> when bind_generates_resource is true, unless we add one more level of >> virtualization. >> > > Couldn't the same issue happen if the glBindTexture executes before the > glDeleteTextures, and the flush (on thread 2) executes afterward? I think > whether it starts executing immediately before or in the middle of the > other > function doesn't change anything, so at least this patch doesn't make it > much > worse. That's a bug that should be fixed (e.g. by flushing under lock), or dealt with another way... > It feels like these sorts of cases count as user error. > These are not allowed to lose the context though, per spec. They may generate appropriate GL errors. > > https://codereview.chromium.org/613843003/ > To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
And update for this CL? On 2014/10/03 03:11:52, piman (Very slow to review) wrote: > On Thu, Oct 2, 2014 at 3:56 PM, <mailto:jbauman@chromium.org> wrote: > > > On 2014/10/02 21:44:02, piman (Very slow to review) wrote: > > > > https://codereview.chromium.org/613843003/diff/20001/gpu/ > > command_buffer/client/share_group.cc > > > >> File gpu/command_buffer/client/share_group.cc (right): > >> > > > > > > https://codereview.chromium.org/613843003/diff/20001/gpu/ > > command_buffer/client/share_group.cc#newcode65 > > > >> gpu/command_buffer/client/share_group.cc:65: } > >> I think this change would cause a potential race if > >> bind_generates_resource is > >> true, the race leading to a command buffer loss. > >> > > > > Thread 1 does glDeleteTextures(1, &id) > >> Thread 2 does glBindTexture(GL_TEXTURE_2D, id), glFlush() > >> > > > > Where id is the same. > >> > > > > If the ordering is such as the bind + flush in Thread 2 happens after the > >> l.59 > >> (the CommandBufferHelper::Flush) on thread 1, but before the lock+FreeId, > >> then > >> we will send a bind command to the GPU process, which will recreate the > >> client > >> id on the service side (since it happened after the delete), but even > >> though > >> > > it > > > >> marks it as used on the client side (on thread 2), it will be freed right > >> away > >> (on thread 1). So after that point we have an inconsistency between the > >> > > service > > > >> and client side, and the next glGenTextures that reuses the client id > >> (since > >> it's freed on the client side) will lose the context on the service side > >> (GLES2DecoderImpl::GenTexturesHelper returns false because the client id > >> is in > >> use, so GLES2DecoderImpl::HandleGenTexturesImmediate raises the > >> kInvalidArguments error). > >> > > > > We really need the sending of the Delete command and the FreeID to be > >> atomic > >> when bind_generates_resource is true, unless we add one more level of > >> virtualization. > >> > > > > Couldn't the same issue happen if the glBindTexture executes before the > > glDeleteTextures, and the flush (on thread 2) executes afterward? I think > > whether it starts executing immediately before or in the middle of the > > other > > function doesn't change anything, so at least this patch doesn't make it > > much > > worse. > > > That's a bug that should be fixed (e.g. by flushing under lock), or dealt > with another way... > > > > It feels like these sorts of cases count as user error. > > > > These are not allowed to lose the context though, per spec. They may > generate appropriate GL errors. > > > > > > https://codereview.chromium.org/613843003/ > > > > To unsubscribe from this group and stop receiving emails from it, send an email > to mailto:chromium-reviews+unsubscribe@chromium.org.
https://codereview.chromium.org/613843003/diff/20001/gpu/command_buffer/clien... File gpu/command_buffer/client/share_group.cc (right): https://codereview.chromium.org/613843003/diff/20001/gpu/command_buffer/clien... gpu/command_buffer/client/share_group.cc:65: } On 2014/10/02 21:44:02, piman (Very slow to review) wrote: > I think this change would cause a potential race if bind_generates_resource is > true, the race leading to a command buffer loss. > > Thread 1 does glDeleteTextures(1, &id) > Thread 2 does glBindTexture(GL_TEXTURE_2D, id), glFlush() > > Where id is the same. > > If the ordering is such as the bind + flush in Thread 2 happens after the l.59 > (the CommandBufferHelper::Flush) on thread 1, but before the lock+FreeId, then > we will send a bind command to the GPU process, which will recreate the client > id on the service side (since it happened after the delete), but even though it > marks it as used on the client side (on thread 2), it will be freed right away > (on thread 1). So after that point we have an inconsistency between the service > and client side, and the next glGenTextures that reuses the client id (since > it's freed on the client side) will lose the context on the service side > (GLES2DecoderImpl::GenTexturesHelper returns false because the client id is in > use, so GLES2DecoderImpl::HandleGenTexturesImmediate raises the > kInvalidArguments error). > > We really need the sending of the Delete command and the FreeID to be atomic > when bind_generates_resource is true, unless we add one more level of > virtualization. I think delete_fn and Flush() do not use sync IPC currently. So it is fine to call them with the lock locked (will not cause deadlock in pepper). Without the change in this file, does other part of the CL LGTY?
The program_info_manager.cc part is ok, but see below. https://codereview.chromium.org/613843003/diff/20001/gpu/command_buffer/clien... File gpu/command_buffer/client/share_group.cc (right): https://codereview.chromium.org/613843003/diff/20001/gpu/command_buffer/clien... gpu/command_buffer/client/share_group.cc:65: } On 2014/12/05 18:16:42, Peng wrote: > On 2014/10/02 21:44:02, piman (Very slow to review) wrote: > > I think this change would cause a potential race if bind_generates_resource is > > true, the race leading to a command buffer loss. > > > > Thread 1 does glDeleteTextures(1, &id) > > Thread 2 does glBindTexture(GL_TEXTURE_2D, id), glFlush() > > > > Where id is the same. > > > > If the ordering is such as the bind + flush in Thread 2 happens after the l.59 > > (the CommandBufferHelper::Flush) on thread 1, but before the lock+FreeId, then > > we will send a bind command to the GPU process, which will recreate the client > > id on the service side (since it happened after the delete), but even though > it > > marks it as used on the client side (on thread 2), it will be freed right away > > (on thread 1). So after that point we have an inconsistency between the > service > > and client side, and the next glGenTextures that reuses the client id (since > > it's freed on the client side) will lose the context on the service side > > (GLES2DecoderImpl::GenTexturesHelper returns false because the client id is in > > use, so GLES2DecoderImpl::HandleGenTexturesImmediate raises the > > kInvalidArguments error). > > > > We really need the sending of the Delete command and the FreeID to be atomic > > when bind_generates_resource is true, unless we add one more level of > > virtualization. > > I think delete_fn and Flush() do not use sync IPC currently. So it is fine to > call them with the lock locked (will not cause deadlock in pepper). Without the > change in this file, does other part of the CL LGTY? delete_fn will use a sync IPC in some cases, e.g. if the command buffer is full.
Is this abandoned?
On 2015/06/25 01:49:52, vmiura wrote: > Is this abandoned? Yes. I think there were concerns it might be racy in some cases (I haven't read the review comments). And the real problem it was trying to fix was: https://code.google.com/p/chromium/issues/detail?id=418651 ...which was fixed in a different, more robust way. This CL would only be an optimization now, I think. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
