Created: 7 years, 10 months ago by reveman
Modified: 7 years, 10 months ago
CC: chromium-reviews, apatrick_chromium
Base URL: svn://svn.chromium.org/chrome/trunk/src
Visibility: Public.
Description
Re-land: Mark async texture uploads as completed from the upload thread.
This reduces the latency between when an upload completes and when the
client is notified.
BUG=173802
NOTRY=True
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=183010
Patch Set 1 #
Total comments: 17
Patch Set 2 : Keep replies. #
Total comments: 2
Patch Set 3 : Avoid unnecessary polling by GpuCommandBufferStub. #
Total comments: 2
Patch Set 4 : add check to make sure shared memory is valid in AsyncPixelTransfersCompletedQuery::End. #
Patch Set 5 : Fix shutdown issue #
Total comments: 2
Patch Set 6 : fix typo #
Patch Set 7 : rebase #
Messages
Total messages: 34 (0 generated)
https://codereview.chromium.org/12213073/diff/1/ui/gl/DEPS File ui/gl/DEPS (right): https://codereview.chromium.org/12213073/diff/1/ui/gl/DEPS#newcode4 ui/gl/DEPS:4: "+gpu/command_buffer/common", I assume that this is not cool. We need to know what the QuerySync struct looks like to set it directly from the upload thread in async_pixel_transfer_delegate_android.cc. Should we move this struct to ui/gl?
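For context, the QuerySync struct discussed here is the per-query block of shared memory that the service writes and the client polls. A rough sketch of its shape, with field types approximated rather than copied from the tree:

// gpu/command_buffer/common - approximate shape of the shared-memory query
// block; the service writes process_count/result, the client polls them.
struct QuerySync {
  void Reset() {
    process_count = 0;
    result = 0;
  }
  uint32 process_count;  // compared against the query's submit_count()
  uint64 result;
};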
https://codereview.chromium.org/12213073/diff/1/ui/gl/async_pixel_transfer_de... File ui/gl/async_pixel_transfer_delegate_android.cc (right): https://codereview.chromium.org/12213073/diff/1/ui/gl/async_pixel_transfer_de... ui/gl/async_pixel_transfer_delegate_android.cc:306: base::Lock transfer_in_progress_lock_; we could use base::subtle::Atomic32 here instead if performance is a concern.
https://codereview.chromium.org/12213073/diff/1/gpu/command_buffer/service/as... File gpu/command_buffer/service/async_pixel_transfer_delegate_mock.h (right): https://codereview.chromium.org/12213073/diff/1/gpu/command_buffer/service/as... gpu/command_buffer/service/async_pixel_transfer_delegate_mock.h:37: uint32 submit_count)); It was a bit unclear to me before what submit_count meant, and it's even less clear in this context. Could there be a better name? https://codereview.chromium.org/12213073/diff/1/gpu/command_buffer/service/qu... File gpu/command_buffer/service/query_manager.cc (right): https://codereview.chromium.org/12213073/diff/1/gpu/command_buffer/service/qu... gpu/command_buffer/service/query_manager.cc:200: mem_params.shm_data_size = sizeof(QuerySync); I was debating whether we needed to do this in this instance. The Query class is actually ref-counted... So if that is legit, we could ensure a gpu-thread reference is kept for the duration of the async call. However, I'm suspicious of the Query being reference counted in the first place. Can it really outlive its manager safely? This is part of the reason I was hesitant to do query completion on the thread. I also used a weak pointer rather than holding a reference, to be sure. If we keep the dupe, we should investigate removing it along with the texture memory dupe. We just need an intermediate class that can be reference counted within a process. https://codereview.chromium.org/12213073/diff/1/gpu/command_buffer/service/qu... gpu/command_buffer/service/query_manager.cc:204: manager()->decoder()->GetAsyncPixelTransferDelegate()->AsyncNotifyCompletion( I think to avoid the layering violation, we could do the query setting via a callback that we create right here and pass along with the memory etc. The delegate can decide whether it needs to dupe the memory, and when/what-thread to execute the callback on. The callback will need to be very minimal in what it does, since it could happen on another thread. https://codereview.chromium.org/12213073/diff/1/gpu/command_buffer/service/qu... gpu/command_buffer/service/query_manager.cc:217: if (sync->process_count != submit_count()) Is this thread safe? Given its simplicity maybe it is? I think we should add a comment at a minimum that we are comparing something that is being changed on another thread. https://codereview.chromium.org/12213073/diff/1/ui/gl/async_pixel_transfer_de... File ui/gl/async_pixel_transfer_delegate_android.cc (right): https://codereview.chromium.org/12213073/diff/1/ui/gl/async_pixel_transfer_de... ui/gl/async_pixel_transfer_delegate_android.cc:306: base::Lock transfer_in_progress_lock_; On 2013/02/07 19:59:02, David Reveman wrote: > we could use base::subtle::Atomic32 here instead if performance is a concern. Can you explain a bit why we need a lock or atomic int at all? https://codereview.chromium.org/12213073/diff/1/ui/gl/async_pixel_transfer_de... File ui/gl/async_pixel_transfer_delegate_stub.cc (right): https://codereview.chromium.org/12213073/diff/1/ui/gl/async_pixel_transfer_de... ui/gl/async_pixel_transfer_delegate_stub.cc:83: gpu::gles2::QuerySync* sync = static_cast<gpu::gles2::QuerySync*>( See my earlier comment. I think the layering violation can be avoided with a callback.
https://codereview.chromium.org/12213073/diff/1/ui/gl/async_pixel_transfer_de... File ui/gl/async_pixel_transfer_delegate_android.cc (right): https://codereview.chromium.org/12213073/diff/1/ui/gl/async_pixel_transfer_de... ui/gl/async_pixel_transfer_delegate_android.cc:36: class TextureUploadStats Can this be another change? I don't understand how upload stats are related to this change. And it adds another lock, each instance of which requires a lot of justification, IMO.
I also forgot to mention that this patch changes the async query result from always being 1 to always being 0. We don't use the result so it shouldn't matter. https://codereview.chromium.org/12213073/diff/1/gpu/command_buffer/service/as... File gpu/command_buffer/service/async_pixel_transfer_delegate_mock.h (right): https://codereview.chromium.org/12213073/diff/1/gpu/command_buffer/service/as... gpu/command_buffer/service/async_pixel_transfer_delegate_mock.h:37: uint32 submit_count)); On 2013/02/07 20:22:56, epenner wrote: > It was a bit unclear to me before what submit_count meant, and it's even less > clear in this context. Could there be a better name? This is what the query system uses for the value that should be written to the shared memory at query completion. I wouldn't want to change it unless we change all of the query system. https://codereview.chromium.org/12213073/diff/1/gpu/command_buffer/service/qu... File gpu/command_buffer/service/query_manager.cc (right): https://codereview.chromium.org/12213073/diff/1/gpu/command_buffer/service/qu... gpu/command_buffer/service/query_manager.cc:200: mem_params.shm_data_size = sizeof(QuerySync); On 2013/02/07 20:22:56, epenner wrote: > I was debating whether we needed to do this in this instance. The Query > class is actually ref-counted... So if that is legit, we could ensure a > gpu-thread reference is kept for the duration of the async call. > > However, I'm suspicious of the Query being reference counted in the first place. > Can it really outlive its manager safely? This is part of the reason I was > hesitant to do query completion on the thread. I also used a weak pointer rather > than holding a reference, to be sure. > > If we keep the dupe, we should investigate removing it along with the texture > memory dupe. We just need an intermediate class that can be reference counted > within a process. I think the safe way to do this for now is to duplicate it, and we can later change it if possible along with the texture memory. https://codereview.chromium.org/12213073/diff/1/gpu/command_buffer/service/qu... gpu/command_buffer/service/query_manager.cc:204: manager()->decoder()->GetAsyncPixelTransferDelegate()->AsyncNotifyCompletion( On 2013/02/07 20:22:56, epenner wrote: > I think to avoid the layering violation, we could do the query setting via a > callback that we create right here and pass along with the memory etc. The > delegate can decide whether it needs to dupe the memory, and when/what-thread to > execute the callback on. The callback will need to be very minimal in what it > does, since it could happen on another thread. Yea, sounds like that might be a pretty clean solution. And it will get rid of that inappropriate include, yay! https://codereview.chromium.org/12213073/diff/1/gpu/command_buffer/service/qu... gpu/command_buffer/service/query_manager.cc:217: if (sync->process_count != submit_count()) On 2013/02/07 20:22:56, epenner wrote: > Is this thread safe? Given its simplicity maybe it is? I think we should add a > comment at a minimum that we are comparing something that is being changed on > another thread. This is consistent with the code in the client-side query tracker. The query system already guarantees that this is atomic across processes. I think that adding some additional thread-safety guards would just be confusing. I'll add a comment though. https://codereview.chromium.org/12213073/diff/1/ui/gl/async_pixel_transfer_de...
File ui/gl/async_pixel_transfer_delegate_android.cc (right): https://codereview.chromium.org/12213073/diff/1/ui/gl/async_pixel_transfer_de... ui/gl/async_pixel_transfer_delegate_android.cc:36: class TextureUploadStats On 2013/02/07 20:36:34, epenner wrote: > Can this be another change? I don't understand how upload stats are related to > this change. And it adds another lock, each instance of which requires a lot of > justification, IMO. I can't get rid of the replies and keep collecting upload stats correctly without this. We can of course remove the stats, but that breaks the benchmarks extension. https://codereview.chromium.org/12213073/diff/1/ui/gl/async_pixel_transfer_de... ui/gl/async_pixel_transfer_delegate_android.cc:306: base::Lock transfer_in_progress_lock_; On 2013/02/07 20:22:56, epenner wrote: > On 2013/02/07 19:59:02, David Reveman wrote: > > we could use base::subtle::Atomic32 here instead if performance is a concern. > > Can you explain a bit why we need a lock or atomic int at all? Both threads will access this. The main thread reads it to determine when the transfer is done; the upload thread sets it when done.
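For reference, a minimal sketch of the base::subtle::Atomic32 alternative mentioned above; the method names are illustrative, not the code in this patch:

#include "base/atomicops.h"

class TransferStateInternal {
 public:
  TransferStateInternal() : transfer_in_progress_(1) {}

  // Upload thread: publish completion with release semantics once the
  // glTexSubImage2D call has returned.
  void MarkTransferCompleted() {
    base::subtle::Release_Store(&transfer_in_progress_, 0);
  }

  // GPU (main) thread: poll completion with acquire semantics.
  bool TransferIsInProgress() const {
    return base::subtle::Acquire_Load(&transfer_in_progress_) != 0;
  }

 private:
  volatile base::subtle::Atomic32 transfer_in_progress_;
};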
I think I understand this more and I think it could be hugely simplified by keeping the reply to do non-latency-critical stuff... Every atomic int and lock adds a huge mental burden to someone unfamiliar with this code. https://codereview.chromium.org/12213073/diff/1/ui/gl/async_pixel_transfer_de... File ui/gl/async_pixel_transfer_delegate_android.cc (right): https://codereview.chromium.org/12213073/diff/1/ui/gl/async_pixel_transfer_de... ui/gl/async_pixel_transfer_delegate_android.cc:36: class TextureUploadStats I see, let's not get rid of the reply then? We should minimize the impact of this latency-savings optimization on design cleanliness. Could the query code also benefit from the reply to reduce a lot of the changes there? Could we do *only* the query completion on the upload thread and handle the rest in a reply to the GPU thread? It appears a lot of GPU-side code was written to work around not having the reply task. https://codereview.chromium.org/12213073/diff/1/ui/gl/async_pixel_transfer_de... ui/gl/async_pixel_transfer_delegate_android.cc:306: base::Lock transfer_in_progress_lock_; That's what atomic ints are used for, yes, but I'm asking why it's required from a design perspective. Could this benefit from the reply as well? If the reply is posted to the GPU thread before we mark the completion, it's impossible for the renderer to do something on the GPU thread before the GPU thread handles the reply.
Dang. I realized the normal PostTaskAndReply doesn't do what I said. It will mark the completion and then post the reply task *after*, which leaves open the smallest of smallest of races. To get around that we can post one task to the upload thread followed by one PostTaskAndReply which handles the reply part... This is ugly, but I think the right solution for 'too many tasks' is batching.
Or maybe post the reply manually from the upload thread.
On 2013/02/07 22:52:23, epenner wrote: > Dang. I realized the normal PostTaskAndReply doesn't do what I said. It will > mark the completion and then post the reply task *after*, which leaves open the > smallest of smallest of races. To get around that we can post one task to the > upload thread followed by one PostTaskAndReply which handles the reply part... > This is ugly, but I think the right solution for 'too many tasks' is batching. And sorry again, my suggestion was in the wrong order. My suggestion is:
- Dupe Memory
- PostTaskAndReply(NoOpUploadtask, NonCriticalGpuAccountingTask)
- PostTask(LatencyCriticalCompletionTask)
If there are any races in that, please feel free to poke holes in this idea. But from what I can tell it's race-free.
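A minimal sketch of that ordering, using the placeholder task names from the suggestion rather than anything in the actual patch; the key point is that the reply is already queued on the GPU thread before the latency-critical task runs:

#include "base/bind.h"
#include "base/message_loop_proxy.h"

void NoOpUploadTask() {}
void NonCriticalGpuAccountingTask() { /* stats, cleanup; runs on GPU thread */ }
void LatencyCriticalCompletionTask() { /* mark the query/sync complete */ }

void ScheduleCompletion(base::MessageLoopProxy* upload_loop) {
  // 1. (Dupe the shared memory first so it stays valid for the async work.)
  // 2. The no-op task runs first on the upload thread, which posts the
  //    accounting reply back to the calling (GPU) thread right away.
  upload_loop->PostTaskAndReply(FROM_HERE,
                                base::Bind(&NoOpUploadTask),
                                base::Bind(&NonCriticalGpuAccountingTask));
  // 3. Because the upload thread runs its tasks in order, by the time this
  //    task marks the transfer complete the reply is already in the GPU
  //    thread's queue.
  upload_loop->PostTask(FROM_HERE,
                        base::Bind(&LatencyCriticalCompletionTask));
}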
I think you're right, removing the replies from AsyncPixelTransferDelegateAndroid as part of this CL is not necessary. At least not on android where we have a gpu thread. However, it's not clear to me that this would be safe if we were using a gpu process. In that case, can we make the assumption that a task posted to the message loop will run before an IO handler triggered by an event just a microsecond later? If it's not appropriate to make that assumption on non-android then maybe it's not best to do this on android either. Anyhow, the latest patch gets the replies back but keeps the modifications to the query manager, which I think are good as they allow us to avoid posting yet another task to the upload thread (I'm seeing 0.15-0.25ms delay between each task so not adding another seems important). I also like how this change makes async transfer queries behave more like other types of queries and we no longer have a set of pending transfer queries that the query manager is unaware of. PTAL. Btw, doesn't seem like TransferStateInternal need to be RefCountedThreadSafe. Latest patch makes it just RefCounted. Am I missing something here?
From a quick look it's looking really good! I'll look at it closely tonight. > In that case, can we make the assumption that a task posted to the message loop will run before an IO handler triggered by an event just a microsecond later? I'm not sure if there is a distinction on Android, as the 'GPU process' still exists. It still communicates via IPCs where it would previously, it's just that it shares the same process as the browser. Can you describe the race you are thinking of in detail? Which message / IPCs specifically are you thinking of? > Btw, doesn't seem like TransferStateInternal need to be RefCountedThreadSafe. Latest patch makes it just RefCounted. Am I missing something here? I'm very curious about this myself! The question is, is this safe? otherThread->PostTaskAndReply( SomeClosure, SomeClosureWithReferenceParameter) The reference parameter *should* only be accessed on the calling thread and not the other thread. However, what if the closure is copied-by-value or something while on the other thread? I'd just like to know for sure that it is *NOT* touched by the other thread.
On 2013/02/08 05:23:28, epenner wrote: > From a quick look it's looking really good! I'll look at it closely tonight. > > > In that case, can we make the assumption that a task posted to the message > loop will run before an IO handler triggered by an event just a microsecond > later? > > I'm not sure if there is a distinction on Android, as the 'GPU process' still > exists. It still communicates via IPCs where it would previously, it's just that > it shares the same process as the browser. Can you describe the race you are > thinking of in detail? Which message / IPCs specifically are you thinking of? Let's say our message loop looks something like this: while (!quit_) { while (renderer_channel->CanReadWithoutBlocking()) renderer_channel->Dispatch(); while (pending_tasks_.size()) pending_tasks_.front.Run(); pending_tasks_.front. } > > > Btw, doesn't seem like TransferStateInternal need to be RefCountedThreadSafe. > Latest patch makes it just RefCounted. Am I missing something here? > > I'm very curious about this myself! The question is, is this safe? > otherThread->PostTaskAndReply( > SomeClosure, > SomeClosureWithReferenceParameter) > The reference parameter *should* only be accessed on the calling thread and not > the other thread. However, what if the closure is copied-by-value or something > while on the other thread? I'd just like to know for sure that it is *NOT* > touched by the other thread.
LGTM. However, I think we need a GPU reviewer for the Query changes. The query changes might have merit on their own, but if it's not solving a problem for M25 I'd be tempted to keep this minimal and just have AsyncNotifyCompletion take two callbacks (one fast and one slow doing the old stuff). But it's up to you. Either way, we should discuss any race possibilities you are worried about in depth. If I've missed something I'd like to understand it so we can address it. My understanding was that the only race risk is that the renderer could receive the completion signal and render something using the texture before one of the replies executes on the main GPU thread. But that should be prevented entirely by the reply being added to the message loop before the render process is even notified of completion.
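To illustrate the "two callbacks" idea, a hypothetical interface sketch; this is not the signature in the tree, just the shape being suggested:

#include "base/callback.h"

class AsyncPixelTransferDelegate {
 public:
  virtual ~AsyncPixelTransferDelegate() {}
  // |fast_callback| may run directly on the upload thread as soon as the
  // transfer finishes and should only write the query's shared-memory state.
  // |slow_callback| runs later on the GPU thread and does the old,
  // non-latency-critical bookkeeping that the replies used to do.
  virtual void AsyncNotifyCompletion(const base::Closure& fast_callback,
                                     const base::Closure& slow_callback) = 0;
};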
On 2013/02/08 05:55:55, David Reveman wrote: > On 2013/02/08 05:23:28, epenner wrote: > > From a quick look it's looking really good! I'll look at it closely tonight. > > > > > In that case, can we make the assumption that a task posted to the message > > loop will run before an IO handler triggered by an event just a microsecond > > later? > > > > I'm not sure if there is a distinction on Android, as the 'GPU process' still > > exists. It still communicates via IPCs where it would previously, it's just > that > > it shares the same process as the browser. Can you describe the race you are > > thinking of in detail? Which message / IPCs specifically are you thinking of? > > Let's say our message loop looks something like this: > > while (!quit_) { > while (renderer_channel->CanReadWithoutBlocking()) > renderer_channel->Dispatch(); > while (pending_tasks_.size()) > pending_tasks_.front.Run(); > pending_tasks_.front. > } Oops, accidentally hit send. So let's say it looks something like this: while (!quit_) { while (renderer_channel->CanReadWithoutBlocking()) renderer_channel->Dispatch(); while (pending_tasks_.size()) { pending_tasks_.front.Run(); pending_tasks_.pop_back(); } } which of course is very far from the real thing but we should be able to think of the message loop as a black box and it could be replaced by something that resembles the code above. In this case, just because a task was added to the pending task list before renderer_channel->CanReadWithoutBlocking() would return true doesn't guarantee the task will be run first. > > > > > > Btw, doesn't seem like TransferStateInternal need to be > RefCountedThreadSafe. > > Latest patch makes it just RefCounted. Am I missing something here? > > > > I'm very curious about this myself! The question is, is this safe? > > otherThread->PostTaskAndReply( > > SomeClosure, > > SomeClosureWithReferenceParameter) > > The reference parameter *should* only be accessed on the calling thread and > not > > the other thread. However, what if the closure is copied-by-value or something > > while on the other thread? I'd just like to know for sure that it is *NOT* > > touched by the other thread. We should be safe if we use base::Unretained() and make sure not to ref it on the other thread. We should add the const modifier to prevent future mistakes. I'll fix that.
> We should be safe if we use base::Unretained() and make sure not to ref it on > the other thread. We should add the const modifier to prevent future mistakes. > I'll fix that. Nevermind, we obviously have members in TransferStateInternal that we modify from the upload thread, so we can't use the const modifier.
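For illustration, the binding pattern being discussed might look roughly like this; the function names are placeholders, and the point is that no AddRef/Release happens on the upload thread while the reply closure keeps the object alive from the GPU thread:

#include "base/bind.h"
#include "base/memory/ref_counted.h"
#include "base/message_loop_proxy.h"

struct TransferStateInternal : public base::RefCounted<TransferStateInternal> {
  // Members written by the upload thread live here.
};

void PerformUpload(TransferStateInternal* state) {}  // runs on upload thread
void UploadCompleted(scoped_refptr<TransferStateInternal> state) {}  // GPU thread

void Schedule(base::MessageLoopProxy* upload_loop,
              const scoped_refptr<TransferStateInternal>& state) {
  upload_loop->PostTaskAndReply(
      FROM_HERE,
      // The upload task only sees a raw pointer, so no AddRef/Release
      // happens on the upload thread.
      base::Bind(&PerformUpload, base::Unretained(state.get())),
      // The reply holds a reference that is taken and dropped on the GPU
      // thread, keeping |state| alive until after the upload task has run.
      base::Bind(&UploadCompleted, state));
}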
Gregg, please have a look at the QueryManager changes in the patch when you have a chance.
> Oops, accidentally hit send. So let's say it looks something like this: > > while (!quit_) { > while (renderer_channel->CanReadWithoutBlocking()) > renderer_channel->Dispatch(); > > while (pending_tasks_.size()) { > pending_tasks_.front.Run(); > pending_tasks_.pop_back(); > } > } > > which of course is very far from the real thing but we should be able to think > of the message loop as a black box and it could be replaced by something that > resembles the code above. > > In this case, just because a task was added to the pending task list before > renderer_channel->CanReadWithoutBlocking() would return true doesn't guarantee > the task will be run first. So you're saying basically that the GPU thread's raw tasks could be reordered. Yes, task order on the GPU thread is assumed right now. I think such a re-ordering message loop should probably use a different type/interface to make clear this massive semantic change, but in any event I guess the main thing for now is to confirm that this doesn't happen like you say.
On 2013/02/08 18:59:46, epenner wrote: > > Oops, accidentally hit send. So let's say it looks something like this: > > > > while (!quit_) { > > while (renderer_channel->CanReadWithoutBlocking()) > > renderer_channel->Dispatch(); > > > > while (pending_tasks_.size()) { > > pending_tasks_.front.Run(); > > pending_tasks_.pop_back(); > > } > > } > > > > which of course is very far from the real thing but we should be able to think > > of the message loop as a black box and it could be replaced by something that > > resembles the code above. > > > > In this case, just because a task was added to the pending task list before > > renderer_channel->CanReadWithoutBlocking() would return true doesn't guarantee > > the task will be run first. > > So you're saying basically that the GPU thread's raw tasks could be reordered. Not really. There are two assumptions that need to be correct for the current patch to be without any race conditions when using this:
upload_thread->PostTaskAndReply(PerformUpload, SetTransferComplete)
upload_thread->PostTask(NotifyCompletion)
1. The SetTransferComplete callback needs to be in the caller thread's pending task queue at the time NotifyCompletion is run.
2. Any IPC events from the renderer need to go through the same task queue.
I'm guessing it's OK to make these assumptions. It felt a bit suspicious since, if we were to use an IPC channel for the SetTransferComplete notification, condition 1 would break. > task order on the GPU thread is assumed right now. I think such a re-ordering > message loop should probably use a different type/interface to make clear this > massive semantic change, but in any event I guess the main thing for now is to > confirm that this doesn't happen like you say. Right. I've done some more tracing and I think it's critical to get rid of these replies for good performance. The upload thread has low priority so avoiding waking up other threads after each upload significantly impacts how packed the uploads will be on the thread. So we probably want to get rid of these replies after all, but I think it's better to do that in a follow-up patch unless the assumptions made in the current patch are determined to be inappropriate.
Sounds good. Assumptions seem solid to me unless I'm missing something big. Personally I'd prefer a "mega-task" to locks if we need to optimize it, but let's get this in first, true.
https://codereview.chromium.org/12213073/diff/7002/gpu/command_buffer/service... File gpu/command_buffer/service/query_manager.cc (right): https://codereview.chromium.org/12213073/diff/7002/gpu/command_buffer/service... gpu/command_buffer/service/query_manager.cc:210: Buffer buffer = manager()->decoder()->GetSharedMemoryBuffer(shm_id()); Is there any chance this will fail? https://codereview.chromium.org/12213073/diff/12002/gpu/command_buffer/servic... File gpu/command_buffer/service/query_manager.cc (right): https://codereview.chromium.org/12213073/diff/12002/gpu/command_buffer/servic... gpu/command_buffer/service/query_manager.cc:188: void *data = static_cast<int8*>(mem_params.shared_memory->memory()) + how do you know this memory is still valid?
https://codereview.chromium.org/12213073/diff/7002/gpu/command_buffer/service... File gpu/command_buffer/service/query_manager.cc (right): https://codereview.chromium.org/12213073/diff/7002/gpu/command_buffer/service... gpu/command_buffer/service/query_manager.cc:210: Buffer buffer = manager()->decoder()->GetSharedMemoryBuffer(shm_id()); On 2013/02/11 17:34:09, greggman wrote: > Is there any chance this will fail? Good point. Added a check for this. https://codereview.chromium.org/12213073/diff/12002/gpu/command_buffer/servic... File gpu/command_buffer/service/query_manager.cc (right): https://codereview.chromium.org/12213073/diff/12002/gpu/command_buffer/servic... gpu/command_buffer/service/query_manager.cc:188: void *data = static_cast<int8*>(mem_params.shared_memory->memory()) + On 2013/02/11 17:34:09, greggman wrote: > how do you know this memory is still valid? AsyncPixelTransferDelegate is responsible for making sure this function is called with valid memory. The stub implementation makes this callback synchronously, so if the memory is valid in AsyncPixelTransfersCompletedQuery::End, it's also valid here. The Android implementation dups the shared memory to ensure it's still valid when this callback is run.
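For readers unfamiliar with the dup trick: a rough sketch of duplicating shared memory within the same process, assuming base::SharedMemory::ShareToProcess is used for the handle dup (an approximation of the era's common pattern, not a copy of the patch):

#include "base/basictypes.h"
#include "base/memory/scoped_ptr.h"
#include "base/memory/shared_memory.h"
#include "base/process_util.h"

// Returns a new, independently owned mapping of |shared_memory| that stays
// valid even if the original object is destroyed, or NULL on failure.
base::SharedMemory* DuplicateSharedMemory(base::SharedMemory* shared_memory,
                                          uint32 size) {
  base::SharedMemoryHandle duped_handle;
  if (!shared_memory->ShareToProcess(base::GetCurrentProcessHandle(),
                                     &duped_handle))
    return NULL;
  scoped_ptr<base::SharedMemory> duped(
      new base::SharedMemory(duped_handle, false /* read_only */));
  if (!duped->Map(size))
    return NULL;
  return duped.release();
}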
lgtm
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/reveman@chromium.org/12213073/5003
Message was sent while issue was closed.
Change committed as 181883
This latest patch fixes the problem reported here: https://code.google.com/p/chromium/issues/detail?id=175863 During shutdown we were calling ProcessPendingTransferQueries() after having destroyed the scheduler. Not sure this is the best way to handle this. Maybe all this async upload related code should be moved out of MakeCurrent?
lgtm https://codereview.chromium.org/12213073/diff/16001/ui/gl/async_pixel_transfe... File ui/gl/async_pixel_transfer_delegate_android.cc (right): https://codereview.chromium.org/12213073/diff/16001/ui/gl/async_pixel_transfe... ui/gl/async_pixel_transfer_delegate_android.cc:437: // In practice, they are complete when the CPU glSubTexImage2D completes. s/SubTex/TexSub/?
https://codereview.chromium.org/12213073/diff/16001/ui/gl/async_pixel_transfe... File ui/gl/async_pixel_transfer_delegate_android.cc (right): https://codereview.chromium.org/12213073/diff/16001/ui/gl/async_pixel_transfe... ui/gl/async_pixel_transfer_delegate_android.cc:437: // In practice, they are complete when the CPU glSubTexImage2D completes. On 2013/02/13 20:31:10, greggman wrote: > s/SubTex/TexSub/? Done.
Eric, I just rebased this onto your safe shared memory change. PTAL.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/reveman@chromium.org/12213073/27001
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/reveman@chromium.org/12213073/27001
lgtm
Commit queue rejected this change because the description was changed between the time the change entered the commit queue and the time it was ready to commit. You can safely check the commit box again.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/reveman@chromium.org/12213073/27001
Message was sent while issue was closed.
Change committed as 183010