Created: 5 years, 3 months ago by David Yen
Modified: 5 years, 3 months ago
CC: chromium-reviews, darin-cc_chromium.org, jam, piman+watch_chromium.org, sunnyps, vmiura
Base URL: https://chromium.googlesource.com/chromium/src.git@master
Target Ref: refs/pending/heads/master
Project: chromium
Visibility: Public
Description

GPU channels now maintain a global order number for each processed IPC.

Each IPC that a GPU channel processes is now assigned a global order
number. As messages are processed, each channel can be queried for the
order number it is currently processing, the highest order number it
has already processed, and the highest order number it will eventually
process.

The global order number also replaces the count of processed messages
used before. Technically the two do not carry the same information
(i.e. it is no longer possible to query how many messages a channel
has processed), but the old code never used the counter that way. Its
actual use cases, checking whether a particular IPC message has been
processed and whether a channel has been active, can both be served by
the new order number.
BUG=514815
Committed: https://crrev.com/4151ec766a698853621a1fff2b2b80c8bc36cc0c
Cr-Commit-Position: refs/heads/master@{#347428}
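For context, the query surface the description mentions looks roughly like the sketch below. Only GetUnprocessedOrderNum() is named in the review discussion that follows; the other two signatures are assumptions:

    #include <cstdint>

    // Sketch only: per-channel order-number queries.
    class GpuChannel {
     public:
      // Order number of the IPC currently being processed (assumed name).
      uint32_t GetCurrentOrderNum() const { return current_order_num_; }
      // Highest order number this channel has finished processing (assumed name).
      uint32_t GetProcessedOrderNum() const { return processed_order_num_; }
      // Highest order number queued for this channel; reads the message
      // queue under its lock (name confirmed in the review below).
      uint32_t GetUnprocessedOrderNum() const;

     private:
      uint32_t current_order_num_ = 0;
      uint32_t processed_order_num_ = 0;
    };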
Patch Set 1
Patch Set 2: Added support for current_order_num, unprocessed_order_num
Patch Set 3: Removed unused function
Patch Set 4: GPU Channel now shares a single queue between IO/Main threads
Patch Set 5: Use 32 bit number for order number instead
Patch Set 6: No longer process all messages, rely on message loop for fairness
Patch Set 7: Reduce a lock when handling message
Patch Set 8: Fixed some comments and how GetNextMessageTimeTick() works
Patch Set 9: Updated command buffer stub to use 32 bit order numbers
Total comments: 24
Patch Set 10: GPU Channel message queue placed in own class, applied suggestions
Total comments: 22
Patch Set 11: Move sync point retire to main thread, refactored some things
Patch Set 12: Reordered include
Total comments: 6
Patch Set 13: DCHECK with unsigned 0
Patch Set 14: Removed some unused variables, reformatted
Patch Set 15: Enforce that queue is not to be modified after being disabled
Total comments: 2
Patch Set 16: Schedule handle message when queue is not empty after we do not return early
Patch Set 17: No need to forward declare GpuChannelMessage
Total comments: 3
Patch Set 18: Fix windows error
Patch Set 19: Added requested DCHECKs
Patch Set 20: Reordered includes, started global order counter at 1
Patch Set 21: sunnyps pointed out HasQueuedMessages() needs to check both queues, added helper function for this
Patch Set 22: rebase
Patch Set 23: Fix merge error
Messages
Total messages: 30 (7 generated)
dyen@chromium.org changed reviewers: + jam@chromium.org, jbauman@chromium.org
+jbauman for review of global order numbers implementation. +jam for review of "IPC::MessageFilter" usage and bypassing OnMessageReceived().
I've refactored the GPU Channel messages so all messages are handled on a single locked deque. PTAL.
dyen@chromium.org changed reviewers: + piman@chromium.org
+piman for review also
https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... File content/common/gpu/gpu_channel.cc (right):

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:113: gpu_channel_io_->gpu_channel_manager()->GenerateGlobalOrderNumber();
I don't think gpu_channel_io_ is safe to use, because the filter may outlive the GpuChannel.

My suggestion is to move the message queue (and the lock) to the filter, and use the existing weak pointer to post the task when needed.

For GenerateGlobalOrderNumber, I think it's fine to use a StaticAtomicSequenceNumber, global in this file. There are other options if we need it to be owned by the GpuChannelManager (e.g. a thread-safe refcounted class).

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:432: STLDeleteElements(&channel_messages_);
We need to handle unprocessed InsertSyncPoint here, otherwise they'll never get retired.

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:698: m = channel_messages_.front();
Can we pop_front and release the lock? RetireSyncPoint could cause another stub to become scheduled and call StubSchedulingChanged, which could call ScheduleHandleMessage that takes the lock again. It would be better to make sure RetireSyncPoint is only called with the lock not held.

We may need to take the lock again and push the message back at the front if needed, or alternatively figure out whether the stub is scheduled/preempted or not first, pop the message, then release the lock, and only then handle the Retire.

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:785: ScheduleHandleMessageLocked();
Arguably we could do this while the lock is taken the first time, to avoid taking the lock a second time. The only place where channel_messages_.empty() may change after that is if we push GpuCommandBufferMsg_Rescheduled, which will call ScheduleHandleMessageLocked anyway.

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:869: if (!handle_messages_scheduled_) {
You really only need to post a task if the queue was empty. If the queue is not empty, the main thread is responsible for processing all messages, posting tasks if needed. Because you're adding the message with the lock taken, it's either one of 2 things:
- either the queue was empty, the main thread is idle, so we need to post a task
- or the queue was not empty, so the main thread is either descheduled or it will have posted a task to handle the next message. Either way it will continue processing, including this new message, if/when it is scheduled.

If we post a message here anyway, we may end up posting 2 tasks per message (once here, and once on the main thread after processing the previous message), which is inefficient (lots of tasks will end up early exiting).

That way, you can even remove handle_messages_scheduled_.

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... File content/common/gpu/gpu_channel.h (right):

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel.h:268: };
nit: move to before OnDestroy (within each visibility section, order is types, methods, fields per style guide)

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel.h:310: base::subtle::Atomic32 unprocessed_order_num_;
As far as I can tell, this is only used in a DCHECK, so I'm not sure it's useful to optimize it as an Atomic32. Instead you can easily get it by going over the messages in the queue (if you take the lock, which is fine to do in a DCHECK).

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... File content/common/gpu/gpu_channel_manager.cc (right):

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel_manager.cc:267: for (auto& kv : gpu_channels_)
nit: needs braces per style guide

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... File content/common/gpu/gpu_channel_manager.h (right):

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel_manager.h:165: uint32_t global_order_counter_;
Use an AtomicSequenceNumber, then you don't need extra thread checks.
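For reference, a minimal sketch of the counter piman suggests here; this is an illustration, not the CL's code, and the function name plus the start-at-1 offset (compare patch set 20) are assumptions:

    // Sketch only: a file-scope atomic counter needs no thread checks.
    #include "base/atomic_sequence_num.h"

    namespace {
    // Trivially constructible, so it is safe as a file-scope static.
    base::StaticAtomicSequenceNumber g_global_order_counter;
    }  // namespace

    uint32_t GenerateGlobalOrderNumber() {
      // GetNext() is thread-safe and starts at 0; add 1 so real order
      // numbers start at 1 and an unset value of 0 is never a real number.
      return static_cast<uint32_t>(g_global_order_counter.GetNext()) + 1;
    }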
https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... File content/common/gpu/gpu_channel.cc (right):

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:113: gpu_channel_io_->gpu_channel_manager()->GenerateGlobalOrderNumber();
On 2015/08/31 23:15:04, piman (OOO back 2015-08-31) wrote:
> I don't think gpu_channel_io_ is safe to use, because the filter may outlive the GpuChannel.
>
> My suggestion is to move the message queue (and the lock) to the filter, and use the existing weak pointer to post the task when needed.
>
> For GenerateGlobalOrderNumber, I think it's fine to use a StaticAtomicSequenceNumber, global in this file. There are other options if we need it to be owned by the GpuChannelManager (e.g. a thread-safe refcounted class).

Having the filter own the messages didn't feel right, since most of the operations are done on the GpuChannel side. I made it a separate object instead, shared by both, with the GpuChannel-specific functions private and GpuChannel as a friend. The global order number doesn't really need to be owned by the GpuChannelManager, and it's only accessed on the IO thread, so I think a regular static number in GpuChannelMessageFilter is perfect.

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:432: STLDeleteElements(&channel_messages_);
On 2015/08/31 23:15:04, piman (OOO back 2015-08-31) wrote:
> We need to handle unprocessed InsertSyncPoint here, otherwise they'll never get retired.

Done. Although isn't this the same as how it was done before? Sync points were always retired within a task on the main thread. I admit I am not familiar with what guarantees we have about queued tasks being run by the time the destructor is called.

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:698: m = channel_messages_.front();
On 2015/08/31 23:15:04, piman (OOO back 2015-08-31) wrote:
> Can we pop_front and release the lock? RetireSyncPoint could cause another stub to become scheduled and call StubSchedulingChanged, which could call ScheduleHandleMessage that takes the lock again. It would be better to make sure RetireSyncPoint is only called with the lock not held.
>
> We may need to take the lock again and push the message back at the front if needed, or alternatively figure out whether the stub is scheduled/preempted or not first, pop the message, then release the lock, and only then handle the Retire.

I would have to think about it a bit, whether we can pop at the beginning and push at the front if we are being pre-empted. It is probably okay, but I currently have an invariant that only "out of order" messages are pushed to the front, to make reasoning about the state of the queue easier (which is why pushes are all handled by functions).

The simplest solution is to simply move the retire sync point handling to after we have released the lock, though. I am not too concerned about the old sync point code because the new wait/release based on global order will go in shortly after this. I have moved it now.

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:785: ScheduleHandleMessageLocked();
On 2015/08/31 23:15:04, piman (OOO back 2015-08-31) wrote:
> Arguably we could do this while the lock is taken the first time, to avoid taking the lock a second time. The only place where channel_messages_.empty() may change after that is if we push GpuCommandBufferMsg_Rescheduled, which will call ScheduleHandleMessageLocked anyway.

Good point, I guess I was trying to replicate the old code and didn't think this through. Done.

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:869: if (!handle_messages_scheduled_) {
On 2015/08/31 23:15:04, piman (OOO back 2015-08-31) wrote:
> You really only need to post a task if the queue was empty. If the queue is not empty, the main thread is responsible for processing all messages, posting tasks if needed. Because you're adding the message with the lock taken, it's either one of 2 things:
> - either the queue was empty, the main thread is idle, so we need to post a task
> - or the queue was not empty, so the main thread is either descheduled or it will have posted a task to handle the next message. Either way it will continue processing, including this new message, if/when it is scheduled.
>
> If we post a message here anyway, we may end up posting 2 tasks per message (once here, and once on the main thread after processing the previous message), which is inefficient (lots of tasks will end up early exiting).
>
> That way, you can even remove handle_messages_scheduled_.

Done. Although we cannot get rid of handle_messages_scheduled_ because the descheduler uses it.

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... File content/common/gpu/gpu_channel.h (right):

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel.h:268: };
On 2015/08/31 23:15:04, piman (OOO back 2015-08-31) wrote:
> nit: move to before OnDestroy (within each visibility section, order is types, methods, fields per style guide)

Done.

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel.h:310: base::subtle::Atomic32 unprocessed_order_num_;
On 2015/08/31 23:15:04, piman (OOO back 2015-08-31) wrote:
> As far as I can tell, this is only used in a DCHECK, so I'm not sure it's useful to optimize it as an Atomic32. Instead you can easily get it by going over the messages in the queue (if you take the lock, which is fine to do in a DCHECK).

This is going to be used later in the actual sync point wait implementation. It is always written on the IO thread and read from the main thread via the GetUnprocessedOrderNum() function.

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... File content/common/gpu/gpu_channel_manager.cc (right):

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel_manager.cc:267: for (auto& kv : gpu_channels_)
On 2015/08/31 23:15:04, piman (OOO back 2015-08-31) wrote:
> nit: needs braces per style guide

Done, although I was following the style in the function below :). I changed the function below as well.

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... File content/common/gpu/gpu_channel_manager.h (right):

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel_manager.h:165: uint32_t global_order_counter_;
On 2015/08/31 23:15:04, piman (OOO back 2015-08-31) wrote:
> Use an AtomicSequenceNumber, then you don't need extra thread checks.

This number is only operated on from the IO thread, so it doesn't need to be atomic. But since I've moved it to GpuChannelMessageFilter as a private static number, I don't think the thread checker is necessary anymore, since it's easy to verify that it is only called from the IO thread.
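A rough shape of the shared queue object David describes, as a sketch only; RefCountedThreadSafe anticipates a fix requested later in this review, and any member names beyond those quoted in the thread are assumptions:

    #include <deque>
    #include "base/memory/ref_counted.h"
    #include "base/synchronization/lock.h"
    #include "ipc/ipc_message.h"

    struct GpuChannelMessage;  // Queue entry; see the later sketch.

    // Sketch only: one queue object shared by the IO-thread filter and the
    // main-thread GpuChannel, with GpuChannel-only operations kept private.
    class GpuChannelMessageQueue
        : public base::RefCountedThreadSafe<GpuChannelMessageQueue> {
     public:
      // IO thread: called by GpuChannelMessageFilter for each routed message.
      void PushBackMessage(uint32_t order_number, const IPC::Message& message);

     private:
      friend class GpuChannel;  // Main-thread-only entry points below.
      friend class base::RefCountedThreadSafe<GpuChannelMessageQueue>;
      ~GpuChannelMessageQueue();

      void ScheduleHandleMessage();  // Posts a HandleMessage task if needed.

      base::Lock channel_messages_lock_;
      std::deque<GpuChannelMessage*> channel_messages_;
    };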
https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... File content/common/gpu/gpu_channel.cc (right):

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:432: STLDeleteElements(&channel_messages_);
On 2015/09/01 02:01:52, David Yen wrote:
> On 2015/08/31 23:15:04, piman (OOO back 2015-08-31) wrote:
> > We need to handle unprocessed InsertSyncPoint here, otherwise they'll never get retired.
>
> Done. Although isn't this the same as how it was done before? Sync points were always retired within a task on the main thread. I admit I am not familiar with what guarantees we have about queued tasks being run by the time the destructor is called.

Before, InsertSyncPointOnMainThread would always run (static method), out-of-order, and retire the sync point immediately if the stub doesn't exist. If it did, the sync point would be added to the stub, and GpuCommandBufferStub::Destroy would retire the sync point if it hadn't been retired.

Now, it is handled in-order (that's ok), but that supposes HandleMessage gets to it. Nothing guarantees that if the channel is destroyed early. In particular, the fact that we use a weak pointer means the task is not run if the GpuChannel is destroyed.

(Note that GpuCommandBufferStub::AddSyncPoint has almost no purpose now, since you retire immediately after it, except for "future" sync points. If we remove those, then we don't need that logic in GpuCommandBufferStub.)

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:698: m = channel_messages_.front();
On 2015/09/01 02:01:52, David Yen wrote:
> On 2015/08/31 23:15:04, piman (OOO back 2015-08-31) wrote:
> > Can we pop_front and release the lock? RetireSyncPoint could cause another stub to become scheduled and call StubSchedulingChanged, which could call ScheduleHandleMessage that takes the lock again. It would be better to make sure RetireSyncPoint is only called with the lock not held.
> >
> > We may need to take the lock again and push the message back at the front if needed, or alternatively figure out whether the stub is scheduled/preempted or not first, pop the message, then release the lock, and only then handle the Retire.
>
> I would have to think about it a bit, whether we can pop at the beginning and push at the front if we are being pre-empted. It is probably okay, but I currently have an invariant that only "out of order" messages are pushed to the front, to make reasoning about the state of the queue easier (which is why pushes are all handled by functions).
>
> The simplest solution is to simply move the retire sync point handling to after we have released the lock, though. I am not too concerned about the old sync point code because the new wait/release based on global order will go in shortly after this. I have moved it now.

That's fine.

A possibility for out-of-order messages is to keep them in a separate queue. That way, special handling may be easier (in particular, GetNextMessageTimeTick could simply look at the front of both queues).

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:869: if (!handle_messages_scheduled_) {
On 2015/09/01 02:01:52, David Yen wrote:
> On 2015/08/31 23:15:04, piman (OOO back 2015-08-31) wrote:
> > You really only need to post a task if the queue was empty. If the queue is not empty, the main thread is responsible for processing all messages, posting tasks if needed. Because you're adding the message with the lock taken, it's either one of 2 things:
> > - either the queue was empty, the main thread is idle, so we need to post a task
> > - or the queue was not empty, so the main thread is either descheduled or it will have posted a task to handle the next message. Either way it will continue processing, including this new message, if/when it is scheduled.
> >
> > If we post a message here anyway, we may end up posting 2 tasks per message (once here, and once on the main thread after processing the previous message), which is inefficient (lots of tasks will end up early exiting).
> >
> > That way, you can even remove handle_messages_scheduled_.
>
> Done. Although we cannot get rid of handle_messages_scheduled_ because the descheduler uses it.

I think the only place this is used is GpuCommandBufferStub::PollWork, and you could use HasQueuedMessages() for it instead.

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... File content/common/gpu/gpu_channel.cc (right):

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:87: class GpuChannelMessageQueue : public base::RefCounted<GpuChannelMessageQueue> {
Needs to be RefCountedThreadSafe

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:98: const IPC::Message& message) {
nit: indent, here and below (run git cl format)

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:112: channel_messages_.push_front(new ChannelMessage(static_cast<uint32_t>(-1),
nit: maybe define a const uint32_t kOutOfOrderNumber = -1u; and use here and below.

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:134: }
nit: all 3 functions (and PushUnfinishedMessage) could maybe use a helper:

    void PushMessageAndSchedule(ChannelMessage* message) {
      base::AutoLock auto_lock(channel_messages_lock_);
      const bool was_empty = channel_messages_.empty();
      channel_messages_.push_back(message);
      if (was_empty)
        ScheduleHandleMessageLocked();
    }

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:161: virtual ~GpuChannelMessageQueue() {
This logic needs to happen on the main thread. Since this is refcounted by both the filter and the GpuChannel, you have no guarantee where the destructor is run.

Instead, cleaning up the sync point messages should probably happen by an explicit call from GpuChannel in its destructor. This could also set a flag so that the filter stops creating sync point numbers and allocating ChannelMessages.

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:163: for (const auto& msg : channel_messages_) {
I'd suggest making this ChannelMessage* msg for clarity. auto& makes it sound like it's a ChannelMessage reference.

Also, you can simply delete msg in this loop, rather than running a second loop in STLDeleteElements.

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:165: gpu_channel_manager_->sync_point_manager()->RetireSyncPoint(sync_point);

    if (sync_point)
      gpu_channel_manager_->sync_point_manager()->RetireSyncPoint(sync_point);

(RetireSyncPoint is non-trivial, you don't want to run it on 0 for every non-InsertSyncPoint message.)

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:211: base::subtle::Atomic32 unprocessed_order_num_;
If we really need this, I'd prefer it to be a uint32_t, and accessed under the lock. Other access patterns are incorrect unless proven correct - it encodes the highest global order in the queue, but that's racy if you don't have the lock held.

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:525: base::WeakPtr<GpuChannel> gpu_channel_;
I don't think you need this any more.

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:815: if (m->message.type() == GpuCommandBufferMsg_InsertSyncPoint::ID) {
Why not move that part of the logic after the lock release too?

In particular, if unscheduled/preempted, we'll go through this multiple times, which means we'll do AddSyncPoint multiple times - that's bad.

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... File content/common/gpu/gpu_channel.h (right):

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... content/common/gpu/gpu_channel.h:58: struct ChannelMessage;
nit: GpuChannelMessage? ChannelMessage is unspecific enough that ODR violations seem likely.
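Putting these nits together, the queue entry might end up looking like the following sketch; the field set is inferred from the review (order number, IPC message, optional sync point) and is not the CL verbatim:

    #include <cstdint>
    #include "ipc/ipc_message.h"

    // Sketch only. kOutOfOrderNumber marks messages handled out of band.
    const uint32_t kOutOfOrderNumber = static_cast<uint32_t>(-1);

    struct GpuChannelMessage {
      GpuChannelMessage(uint32_t order, const IPC::Message& msg)
          : order_number(order), message(msg) {}

      const uint32_t order_number;  // kOutOfOrderNumber for out-of-order messages.
      const IPC::Message message;
      uint32_t sync_point_number = 0;  // Non-zero only for InsertSyncPoint.
    };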
Note this currently does not pass PRESUBMIT checks; I tracked it down and put a fix here: https://codereview.chromium.org/1309283003/

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... File content/common/gpu/gpu_channel.cc (right):

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:432: STLDeleteElements(&channel_messages_);
On 2015/09/01 03:55:26, piman (OOO back 2015-08-31) wrote:
> On 2015/09/01 02:01:52, David Yen wrote:
> > On 2015/08/31 23:15:04, piman (OOO back 2015-08-31) wrote:
> > > We need to handle unprocessed InsertSyncPoint here, otherwise they'll never get retired.
> >
> > Done. Although isn't this the same as how it was done before? Sync points were always retired within a task on the main thread. I admit I am not familiar with what guarantees we have about queued tasks being run by the time the destructor is called.
>
> Before, InsertSyncPointOnMainThread would always run (static method), out-of-order, and retire the sync point immediately if the stub doesn't exist. If it did, the sync point would be added to the stub, and GpuCommandBufferStub::Destroy would retire the sync point if it hadn't been retired.
>
> Now, it is handled in-order (that's ok), but that supposes HandleMessage gets to it. Nothing guarantees that if the channel is destroyed early. In particular, the fact that we use a weak pointer means the task is not run if the GpuChannel is destroyed.
>
> (Note that GpuCommandBufferStub::AddSyncPoint has almost no purpose now, since you retire immediately after it, except for "future" sync points. If we remove those, then we don't need that logic in GpuCommandBufferStub.)

Oh I see, I didn't notice that it was static before. That makes sense.

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:698: m = channel_messages_.front();
On 2015/09/01 03:55:26, piman (OOO back 2015-08-31) wrote:
> On 2015/09/01 02:01:52, David Yen wrote:
> > On 2015/08/31 23:15:04, piman (OOO back 2015-08-31) wrote:
> > > Can we pop_front and release the lock? RetireSyncPoint could cause another stub to become scheduled and call StubSchedulingChanged, which could call ScheduleHandleMessage that takes the lock again. It would be better to make sure RetireSyncPoint is only called with the lock not held.
> > >
> > > We may need to take the lock again and push the message back at the front if needed, or alternatively figure out whether the stub is scheduled/preempted or not first, pop the message, then release the lock, and only then handle the Retire.
> >
> > I would have to think about it a bit, whether we can pop at the beginning and push at the front if we are being pre-empted. It is probably okay, but I currently have an invariant that only "out of order" messages are pushed to the front, to make reasoning about the state of the queue easier (which is why pushes are all handled by functions).
> >
> > The simplest solution is to simply move the retire sync point handling to after we have released the lock, though. I am not too concerned about the old sync point code because the new wait/release based on global order will go in shortly after this. I have moved it now.
>
> That's fine.
>
> A possibility for out-of-order messages is to keep them in a separate queue. That way, special handling may be easier (in particular, GetNextMessageTimeTick could simply look at the front of both queues).

Done.

https://codereview.chromium.org/1308913004/diff/160001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:869: if (!handle_messages_scheduled_) {
On 2015/09/01 03:55:26, piman (OOO back 2015-08-31) wrote:
> On 2015/09/01 02:01:52, David Yen wrote:
> > On 2015/08/31 23:15:04, piman (OOO back 2015-08-31) wrote:
> > > You really only need to post a task if the queue was empty. If the queue is not empty, the main thread is responsible for processing all messages, posting tasks if needed. Because you're adding the message with the lock taken, it's either one of 2 things:
> > > - either the queue was empty, the main thread is idle, so we need to post a task
> > > - or the queue was not empty, so the main thread is either descheduled or it will have posted a task to handle the next message. Either way it will continue processing, including this new message, if/when it is scheduled.
> > >
> > > If we post a message here anyway, we may end up posting 2 tasks per message (once here, and once on the main thread after processing the previous message), which is inefficient (lots of tasks will end up early exiting).
> > >
> > > That way, you can even remove handle_messages_scheduled_.
> >
> > Done. Although we cannot get rid of handle_messages_scheduled_ because the descheduler uses it.
>
> I think the only place this is used is GpuCommandBufferStub::PollWork, and you could use HasQueuedMessages() for it instead.

Actually, looking at GpuCommandBufferStub::PollWork(), we can now detect whether anything has been processed or queued by simply comparing UnprocessedOrderNumber(). I've modified the function to use that instead and removed handle_messages_scheduled_. This could potentially be faster if it just accessed the global order number instead, but I believe the scheduling code is going to change a lot after streams go in, right?

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... File content/common/gpu/gpu_channel.cc (right):

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:87: class GpuChannelMessageQueue : public base::RefCounted<GpuChannelMessageQueue> {
On 2015/09/01 03:55:26, piman (OOO back 2015-08-31) wrote:
> Needs to be RefCountedThreadSafe

Done.

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:98: const IPC::Message& message) {
On 2015/09/01 03:55:26, piman (OOO back 2015-08-31) wrote:
> nit: indent, here and below (run git cl format)

Done.

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:112: channel_messages_.push_front(new ChannelMessage(static_cast<uint32_t>(-1),
On 2015/09/01 03:55:26, piman (OOO back 2015-08-31) wrote:
> nit: maybe define a const uint32_t kOutOfOrderNumber = -1u; and use here and below.

Done.

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:134: }
On 2015/09/01 03:55:26, piman (OOO back 2015-08-31) wrote:
> nit: all 3 functions (and PushUnfinishedMessage) could maybe use a helper:
>
>     void PushMessageAndSchedule(ChannelMessage* message) {
>       base::AutoLock auto_lock(channel_messages_lock_);
>       const bool was_empty = channel_messages_.empty();
>       channel_messages_.push_back(message);
>       if (was_empty)
>         ScheduleHandleMessageLocked();
>     }

Done.

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:161: virtual ~GpuChannelMessageQueue() {
On 2015/09/01 03:55:26, piman (OOO back 2015-08-31) wrote:
> This logic needs to happen on the main thread. Since this is refcounted by both the filter and the GpuChannel, you have no guarantee where the destructor is run.
>
> Instead, cleaning up the sync point messages should probably happen by an explicit call from GpuChannel in its destructor. This could also set a flag so that the filter stops creating sync point numbers and allocating ChannelMessages.

Ah, I just noticed that too. I've moved it to the main thread and added an "enabled_" boolean. I also had to move the sync point generating code behind the lock, or else there would be a race condition; it is all handled within GpuChannelMessageQueue::GenerateSyncPointMessage() now.

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:163: for (const auto& msg : channel_messages_) {
On 2015/09/01 03:55:26, piman (OOO back 2015-08-31) wrote:
> I'd suggest making this ChannelMessage* msg for clarity. auto& makes it sound like it's a ChannelMessage reference.
>
> Also, you can simply delete msg in this loop, rather than running a second loop in STLDeleteElements.

Done.

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:165: gpu_channel_manager_->sync_point_manager()->RetireSyncPoint(sync_point);
On 2015/09/01 03:55:26, piman (OOO back 2015-08-31) wrote:
>     if (sync_point)
>       gpu_channel_manager_->sync_point_manager()->RetireSyncPoint(sync_point);
>
> (RetireSyncPoint is non-trivial, you don't want to run it on 0 for every non-InsertSyncPoint message.)

Done.

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:211: base::subtle::Atomic32 unprocessed_order_num_;
On 2015/09/01 03:55:26, piman (OOO back 2015-08-31) wrote:
> If we really need this, I'd prefer it to be a uint32_t, and accessed under the lock. Other access patterns are incorrect unless proven correct - it encodes the highest global order in the queue, but that's racy if you don't have the lock held.

Done.

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:525: base::WeakPtr<GpuChannel> gpu_channel_;
On 2015/09/01 03:55:26, piman (OOO back 2015-08-31) wrote:
> I don't think you need this any more.

Done.

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:815: if (m->message.type() == GpuCommandBufferMsg_InsertSyncPoint::ID) {
On 2015/09/01 03:55:26, piman (OOO back 2015-08-31) wrote:
> Why not move that part of the logic after the lock release too?
>
> In particular, if unscheduled/preempted, we'll go through this multiple times, which means we'll do AddSyncPoint multiple times - that's bad.

Done.

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... File content/common/gpu/gpu_channel.h (right):

https://codereview.chromium.org/1308913004/diff/180001/content/common/gpu/gpu... content/common/gpu/gpu_channel.h:58: struct ChannelMessage;
On 2015/09/01 03:55:27, piman (OOO back 2015-08-31) wrote:
> nit: GpuChannelMessage? ChannelMessage is unspecific enough that ODR violations seem likely.

Done.
One last thing, LGTM after that.

https://codereview.chromium.org/1308913004/diff/220001/content/common/gpu/gpu... File content/common/gpu/gpu_channel.cc (right):

https://codereview.chromium.org/1308913004/diff/220001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:191: msg->sync_point_number);
Same comment as the previous patch: we should probably not call RetireSyncPoint while the lock is held. My suggestion: take the lock, set enabled to false, delete OOO messages, and swap channel_messages_ with an empty queue. Then you can release the lock and do the loop on the swapped queue safely.
https://codereview.chromium.org/1308913004/diff/220001/content/common/gpu/gpu... File content/common/gpu/gpu_channel.cc (right):

https://codereview.chromium.org/1308913004/diff/220001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:191: msg->sync_point_number);
On 2015/09/01 19:23:44, piman (OOO back 2015-08-31) wrote:
> Same comment as the previous patch: we should probably not call RetireSyncPoint while the lock is held. My suggestion: take the lock, set enabled to false, delete OOO messages, and swap channel_messages_ with an empty queue. Then you can release the lock and do the loop on the swapped queue safely.

I thought of another issue: we would leak anything that gets pushed after the queue has been disabled. Since I now fixed that by checking enabled before pushing, we guarantee the queue will not be touched after enabled is false.

How do you feel about just setting enabled to false with the lock, then clearing the queues without the lock? With that guarantee it should be safe to modify the queue without the lock.
https://codereview.chromium.org/1308913004/diff/220001/content/common/gpu/gpu... File content/common/gpu/gpu_channel.cc (right):

https://codereview.chromium.org/1308913004/diff/220001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:655: bool GpuChannel::OnMessageReceived(const IPC::Message& message) {
Eventually we should restructure the code to remove this completely.

https://codereview.chromium.org/1308913004/diff/220001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:840: message_queue_->ScheduleHandleMessage();
If the stub is descheduled and there are at least two messages in the queue, this could cause it to spin trying to execute messages. I think it should check whether the stub is scheduled before doing ScheduleHandleMessage.

https://codereview.chromium.org/1308913004/diff/250002/content/common/gpu/gpu... File content/common/gpu/gpu_channel.cc (right):

https://codereview.chromium.org/1308913004/diff/250002/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:1016: processed_order_num_ = order_number;
Can we DCHECK that processed_order_num_ only ever increases?
On 2015/09/01 19:34:34, David Yen wrote:
> https://codereview.chromium.org/1308913004/diff/220001/content/common/gpu/gpu... File content/common/gpu/gpu_channel.cc (right):
>
> https://codereview.chromium.org/1308913004/diff/220001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:191: msg->sync_point_number);
> On 2015/09/01 19:23:44, piman (OOO back 2015-08-31) wrote:
> > Same comment as the previous patch: we should probably not call RetireSyncPoint while the lock is held. My suggestion: take the lock, set enabled to false, delete OOO messages, and swap channel_messages_ with an empty queue. Then you can release the lock and do the loop on the swapped queue safely.
>
> I thought of another issue: we would leak anything that gets pushed after the queue has been disabled. Since I now fixed that by checking enabled before pushing, we guarantee the queue will not be touched after enabled is false.
>
> How do you feel about just setting enabled to false with the lock, then clearing the queues without the lock? With that guarantee it should be safe to modify the queue without the lock.

That sounds good to me. Let's make sure TSAN doesn't complain, but hopefully it should be ok.
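The agreed teardown could look like this sketch: flip enabled_ under the lock so no thread can push anymore, then drain both queues lock-free and retire sync points with no lock held. Method and member names are assumptions building on the earlier sketches:

    // Sketch only: called from the GpuChannel destructor on the main thread.
    void GpuChannelMessageQueue::DeleteAndDisableMessages(
        SyncPointManager* sync_point_manager) {
      {
        base::AutoLock auto_lock(channel_messages_lock_);
        DCHECK(enabled_);
        enabled_ = false;  // Pushes now early-out; the queues are frozen.
      }
      // Safe without the lock: nothing can touch the queues anymore.
      for (GpuChannelMessage* msg : channel_messages_) {
        // Retire unprocessed InsertSyncPoints so waiters are not stuck,
        // with no lock held (RetireSyncPoint can re-enter scheduling).
        if (msg->sync_point_number)
          sync_point_manager->RetireSyncPoint(msg->sync_point_number);
        delete msg;
      }
      channel_messages_.clear();
      STLDeleteElements(&out_of_order_messages_);
    }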
https://codereview.chromium.org/1308913004/diff/220001/content/common/gpu/gpu... File content/common/gpu/gpu_channel.cc (right):

https://codereview.chromium.org/1308913004/diff/220001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:840: message_queue_->ScheduleHandleMessage();
On 2015/09/01 20:03:06, jbauman wrote:
> If the stub is descheduled and there are at least two messages in the queue, this could cause it to spin trying to execute messages. I think it should check whether the stub is scheduled before doing ScheduleHandleMessage.

Good point.
https://codereview.chromium.org/1308913004/diff/220001/content/common/gpu/gpu... File content/common/gpu/gpu_channel.cc (right):

https://codereview.chromium.org/1308913004/diff/220001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:840: message_queue_->ScheduleHandleMessage();
On 2015/09/01 20:36:36, piman (OOO back 2015-08-31) wrote:
> On 2015/09/01 20:03:06, jbauman wrote:
> > If the stub is descheduled and there are at least two messages in the queue, this could cause it to spin trying to execute messages. I think it should check whether the stub is scheduled before doing ScheduleHandleMessage.
>
> Good point.

I think it's easier to store whether the message queue is empty here and schedule the handle message only after we do not return early. PTAL.
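What patch set 16 describes might look roughly like the sketch below; GetNextMessage() and ProcessMessage() are hypothetical helpers, while PushUnfinishedMessage(), HasQueuedMessages(), and ScheduleHandleMessage() are named elsewhere in this review:

    // Sketch only: reschedule exactly once, and only when we did not
    // return early for an unscheduled stub.
    void GpuChannel::HandleMessage() {
      GpuChannelMessage* m = message_queue_->GetNextMessage();  // Hypothetical.
      if (!m)
        return;
      GpuCommandBufferStub* stub = stubs_.Lookup(m->message.routing_id());
      if (stub && !stub->IsScheduled()) {
        // Early return: StubSchedulingChanged will reschedule us when the
        // stub becomes runnable, so do not spin here.
        message_queue_->PushUnfinishedMessage(m);
        return;
      }
      ProcessMessage(m);  // Hypothetical: dispatch to the stub and delete.
      // Only now, knowing we made progress, kick the next iteration.
      if (message_queue_->HasQueuedMessages())
        message_queue_->ScheduleHandleMessage();
    }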
lgtm
drive-by: It looks like a fair bit of complexity in this CL is due to out-of-order messages. Would it be simpler to handle them on the IO thread and add them to a waiting_offset / waiting_token structure rather than process them as messages on the main thread? So that all messages on the main thread are always processed in order.
lgtm with one nit.

https://codereview.chromium.org/1308913004/diff/300001/content/common/gpu/gpu... File content/common/gpu/gpu_channel.cc (left):

https://codereview.chromium.org/1308913004/diff/300001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:561: handle_messages_scheduled_ = true;
One thing I am a bit worried about is that we may have multiple HandleMessages scheduled at a time if, e.g., the stub is descheduled, an IPC comes in (causing a ScheduleHandleMessage), and then it's rescheduled (causing another ScheduleHandleMessage). Hopefully the unfairness there will be acceptable, though.

https://codereview.chromium.org/1308913004/diff/300001/content/common/gpu/gpu... File content/common/gpu/gpu_channel.cc (right):

https://codereview.chromium.org/1308913004/diff/300001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:168: virtual ~GpuChannelMessageQueue() {}
DCHECK(channel_messages_.empty()); DCHECK(out_of_order_messages_.empty());
On 2015/09/01 22:13:26, sunnyps wrote:
> drive-by: It looks like a fair bit of complexity in this CL is due to out-of-order messages. Would it be simpler to handle them on the IO thread and add them to a waiting_offset / waiting_token structure rather than process them as messages on the main thread? So that all messages on the main thread are always processed in order.

I don't think removing the out-of-order messages would simplify it much; it would just remove the code that chooses between the two queues, basically the if blocks in GetNextMessageTimeTick(), PushUnfinishedMessage(), and HandleMessage(). Everything else would be exactly the same.
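For reference, the two-queue choice in GetNextMessageTimeTick() could look like this sketch; the time_received field is an assumption about the queue entry:

    // Sketch only: the next wakeup is the earlier of the two queue fronts.
    base::TimeTicks GpuChannelMessageQueue::GetNextMessageTimeTick() {
      base::AutoLock auto_lock(channel_messages_lock_);
      base::TimeTicks next_tick;
      if (!channel_messages_.empty())
        next_tick = channel_messages_.front()->time_received;
      if (!out_of_order_messages_.empty()) {
        base::TimeTicks ooo_tick = out_of_order_messages_.front()->time_received;
        if (next_tick.is_null() || ooo_tick < next_tick)
          next_tick = ooo_tick;
      }
      return next_tick;  // Null if both queues are empty.
    }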
https://codereview.chromium.org/1308913004/diff/250002/content/common/gpu/gpu... File content/common/gpu/gpu_channel.cc (right):

https://codereview.chromium.org/1308913004/diff/250002/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:1016: processed_order_num_ = order_number;
On 2015/09/01 20:03:06, jbauman wrote:
> Can we DCHECK that processed_order_num_ only ever increases?

Missed this in my last patch. Done now.

https://codereview.chromium.org/1308913004/diff/300001/content/common/gpu/gpu... File content/common/gpu/gpu_channel.cc (right):

https://codereview.chromium.org/1308913004/diff/300001/content/common/gpu/gpu... content/common/gpu/gpu_channel.cc:168: virtual ~GpuChannelMessageQueue() {}
On 2015/09/01 22:30:07, jbauman wrote:
> DCHECK(channel_messages_.empty()); DCHECK(out_of_order_messages_.empty());

Done.
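The monotonicity DCHECK jbauman asked for might look like this; the surrounding method is assumed:

    // Sketch only: order numbers handed out by the IO thread strictly
    // increase, so the per-channel processed number must as well.
    void GpuChannel::MessageProcessed(uint32_t order_number) {
      if (order_number != kOutOfOrderNumber) {
        DCHECK_GT(order_number, processed_order_num_);
        processed_order_num_ = order_number;
      }
    }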
The CQ bit was checked by dyen@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from piman@chromium.org, jbauman@chromium.org Link to the patchset: https://codereview.chromium.org/1308913004/#ps360001 (title: "Reordered includes, started global order counter at 1")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1308913004/360001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1308913004/360001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: linux_android_rel_ng on tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_android_r...)
The CQ bit was checked by dyen@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from jbauman@chromium.org, piman@chromium.org Link to the patchset: https://codereview.chromium.org/1308913004/#ps420001 (title: "Fix merge error")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1308913004/420001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1308913004/420001
Message was sent while issue was closed.
Committed patchset #23 (id:420001)
Message was sent while issue was closed.
Patchset 23 (id:??) landed as https://crrev.com/4151ec766a698853621a1fff2b2b80c8bc36cc0c Cr-Commit-Position: refs/heads/master@{#347428}