|
|
DescriptionMessagePumpForGpu - reduce potential work starvation
MessagePumpForGpu has been on Canary for a few days and I
am still assessing whether it is better than standard
MessagePumpForUI in terms of reducing [GPU hang] crashes.
While it has eliminated hangs with some signatures the
rate of hangs in WaitForWork has increased.
When experimenting with MsgWaitForMultipleObjectsEx I
realized that it favors messages over events - when an
event is signaled and a message is available in the queue
it returns a result that indicates a message even when
the message doesn't belong to a current thread (e.g. a
browser UI message). With a high volume of messages it
might be possible for WaitForWork to keep looping and
starve the work processing. It is possible with Direct
Composition where some of UI input messages have to go
through the GPU process message loop because its D3D
surface window covers the entire browser window.
The fix is to do an additional check for the event. Please
note that this is a much less common code path so this
shouldn't affect performance much in scenarios like
video playback. Also I've added some diagnostics code to
help with crash dump analysis.
BUG=588798, 596190
Committed: https://crrev.com/33490baaa1e5601b9284e28f58a99a4dc0db90ea
Cr-Commit-Position: refs/heads/master@{#397886}
Patch Set 1 : #
Total comments: 1
Patch Set 2 : Addressed feedback #
Total comments: 2
Patch Set 3 : Fixed typo #
Messages
Total messages: 19 (11 generated)
Patchset #1 (id:1) has been deleted
Patchset #1 (id:20001) has been deleted
Description was changed from ========== WaitForWork fix BUG= ========== to ========== MessagePumpForGpu - reduce potential work starvation MessagePumpForGpu has been on Canary for a few days and I am still assessing whether it is better than standard MessagePumpForUI in terms of reducing [GPU hang] crashes. While it has eliminated hangs with some signatures the rate of hangs in WaitForWork has increased. When experimenting with MsgWaitForMultipleObjectsEx I realized that it favors messages over events - when an event is signalled and a message is available in the queue it returns a result that indicates a message even when the message doesn't belong to a current thread (e.g. a browser UI message). With a high volume of messages it might be possible for WaitForWork to keep looping and starve the work processing. It is possible with Direct Composition where some of UI input messages have go through the GPU process message look because its D3D surface window covers the entire browser window. The fix to do an additional check for the event. Please note that this is a much less common codepath so this shouldn't affect performance much in scenarios like video playback. Also I've added some diagnostics code to help with crash dump analysis. BUG=588798,596190 ==========
stanisc@chromium.org changed reviewers: + brucedawson@chromium.org, mark@chromium.org
PTAL
I’m going on vacation, so I defer to Bruce. Wait for his review, but once he likes it, LGTM from me.
Description was changed from ========== MessagePumpForGpu - reduce potential work starvation MessagePumpForGpu has been on Canary for a few days and I am still assessing whether it is better than standard MessagePumpForUI in terms of reducing [GPU hang] crashes. While it has eliminated hangs with some signatures the rate of hangs in WaitForWork has increased. When experimenting with MsgWaitForMultipleObjectsEx I realized that it favors messages over events - when an event is signalled and a message is available in the queue it returns a result that indicates a message even when the message doesn't belong to a current thread (e.g. a browser UI message). With a high volume of messages it might be possible for WaitForWork to keep looping and starve the work processing. It is possible with Direct Composition where some of UI input messages have go through the GPU process message look because its D3D surface window covers the entire browser window. The fix to do an additional check for the event. Please note that this is a much less common codepath so this shouldn't affect performance much in scenarios like video playback. Also I've added some diagnostics code to help with crash dump analysis. BUG=588798,596190 ========== to ========== MessagePumpForGpu - reduce potential work starvation MessagePumpForGpu has been on Canary for a few days and I am still assessing whether it is better than standard MessagePumpForUI in terms of reducing [GPU hang] crashes. While it has eliminated hangs with some signatures the rate of hangs in WaitForWork has increased. When experimenting with MsgWaitForMultipleObjectsEx I realized that it favors messages over events - when an event is signalled and a message is available in the queue it returns a result that indicates a message even when the message doesn't belong to a current thread (e.g. a browser UI message). With a high volume of messages it might be possible for WaitForWork to keep looping and starve the work processing. It is possible with Direct Composition where some of UI input messages have to go through the GPU process message loop because its D3D surface window covers the entire browser window. The fix is to do an additional check for the event. Please note that this is a much less common codepath so this shouldn't affect performance much in scenarios like video playback. Also I've added some diagnostics code to help with crash dump analysis. BUG=588798,596190 ==========
Description was changed from ========== MessagePumpForGpu - reduce potential work starvation MessagePumpForGpu has been on Canary for a few days and I am still assessing whether it is better than standard MessagePumpForUI in terms of reducing [GPU hang] crashes. While it has eliminated hangs with some signatures the rate of hangs in WaitForWork has increased. When experimenting with MsgWaitForMultipleObjectsEx I realized that it favors messages over events - when an event is signalled and a message is available in the queue it returns a result that indicates a message even when the message doesn't belong to a current thread (e.g. a browser UI message). With a high volume of messages it might be possible for WaitForWork to keep looping and starve the work processing. It is possible with Direct Composition where some of UI input messages have to go through the GPU process message loop because its D3D surface window covers the entire browser window. The fix is to do an additional check for the event. Please note that this is a much less common codepath so this shouldn't affect performance much in scenarios like video playback. Also I've added some diagnostics code to help with crash dump analysis. BUG=588798,596190 ========== to ========== MessagePumpForGpu - reduce potential work starvation MessagePumpForGpu has been on Canary for a few days and I am still assessing whether it is better than standard MessagePumpForUI in terms of reducing [GPU hang] crashes. While it has eliminated hangs with some signatures the rate of hangs in WaitForWork has increased. When experimenting with MsgWaitForMultipleObjectsEx I realized that it favors messages over events - when an event is signaled and a message is available in the queue it returns a result that indicates a message even when the message doesn't belong to a current thread (e.g. a browser UI message). With a high volume of messages it might be possible for WaitForWork to keep looping and starve the work processing. It is possible with Direct Composition where some of UI input messages have to go through the GPU process message loop because its D3D surface window covers the entire browser window. The fix is to do an additional check for the event. Please note that this is a much less common code path so this shouldn't affect performance much in scenarios like video playback. Also I've added some diagnostics code to help with crash dump analysis. BUG=588798,596190 ==========
https://codereview.chromium.org/2036603002/diff/40001/base/message_loop/messa... File base/message_loop/message_pump_win.cc (right): https://codereview.chromium.org/2036603002/diff/40001/base/message_loop/messa... base/message_loop/message_pump_win.cc:539: if (WaitForSingleObject(event_, 0) == WAIT_OBJECT_0) Previously this loop would favor window messages or a signaled event. This changes it to favor signaled events over windows messages. This is okay as an experiment, but I'm not seeing a clear explanation for why this is likely to be better. The risk is that if we favor signaled events then we could potentially starve the other processes that share this message queue (I think, it's all very messy). Note that if work requests were coming in via WM_USER window messages then there would be stronger guarantees of fairness. Note that it shouldn't be possible to 'flood' the message queue with windows messages. Even a high-end gaming mouse probably doesn't send WM_MOUSEMOVE messages any faster than 250-1,000 Hz, and this loop should be able to remove them at least 100 times faster than that. If there are signs of starvation due to excess messages we need to understand that, perhaps by tracking the incoming message rate.
Patchset #2 (id:60001) has been deleted
I've done a bit more experimentation with tracing timing of suspected PeekMessage calls, then Bruce and I looked at context switches in ETW profile. It looks like there some context switches at PeekMessage inside WaitForWork. According to my tracing results PeekMessage might take up to 0.25 ms on my workstation although usually it is within 0.05-0.1 ms. After discussing this with Bruce I decided to simplify the code and remove the special case with PeekMessage altogether. The outer loop still processes messages and it probably doesn't make sense to treat messages that belong to this thread differently from messages that belong to the browser UI. This might be a bit less efficient but only when user moves the mouse in non-client area. I've added a temporary diagnostics code for crash dump analysis that captures timing of setting the event and entering the wait.
lgtm https://codereview.chromium.org/2036603002/diff/80001/base/message_loop/messa... File base/message_loop/message_pump_win.cc (right): https://codereview.chromium.org/2036603002/diff/80001/base/message_loop/messa... base/message_loop/message_pump_win.cc:453: // Remove this when the bug is fixes. fixes -> fixed https://codereview.chromium.org/2036603002/diff/80001/base/message_loop/messa... base/message_loop/message_pump_win.cc:528: // Remove this when the bug is fixes. fixes -> fixed
The CQ bit was checked by stanisc@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from mark@chromium.org, brucedawson@chromium.org Link to the patchset: https://codereview.chromium.org/2036603002/#ps100001 (title: "Fixed typo")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2036603002/100001
Message was sent while issue was closed.
Description was changed from ========== MessagePumpForGpu - reduce potential work starvation MessagePumpForGpu has been on Canary for a few days and I am still assessing whether it is better than standard MessagePumpForUI in terms of reducing [GPU hang] crashes. While it has eliminated hangs with some signatures the rate of hangs in WaitForWork has increased. When experimenting with MsgWaitForMultipleObjectsEx I realized that it favors messages over events - when an event is signaled and a message is available in the queue it returns a result that indicates a message even when the message doesn't belong to a current thread (e.g. a browser UI message). With a high volume of messages it might be possible for WaitForWork to keep looping and starve the work processing. It is possible with Direct Composition where some of UI input messages have to go through the GPU process message loop because its D3D surface window covers the entire browser window. The fix is to do an additional check for the event. Please note that this is a much less common code path so this shouldn't affect performance much in scenarios like video playback. Also I've added some diagnostics code to help with crash dump analysis. BUG=588798,596190 ========== to ========== MessagePumpForGpu - reduce potential work starvation MessagePumpForGpu has been on Canary for a few days and I am still assessing whether it is better than standard MessagePumpForUI in terms of reducing [GPU hang] crashes. While it has eliminated hangs with some signatures the rate of hangs in WaitForWork has increased. When experimenting with MsgWaitForMultipleObjectsEx I realized that it favors messages over events - when an event is signaled and a message is available in the queue it returns a result that indicates a message even when the message doesn't belong to a current thread (e.g. a browser UI message). With a high volume of messages it might be possible for WaitForWork to keep looping and starve the work processing. It is possible with Direct Composition where some of UI input messages have to go through the GPU process message loop because its D3D surface window covers the entire browser window. The fix is to do an additional check for the event. Please note that this is a much less common code path so this shouldn't affect performance much in scenarios like video playback. Also I've added some diagnostics code to help with crash dump analysis. BUG=588798,596190 ==========
Message was sent while issue was closed.
Committed patchset #3 (id:100001)
Message was sent while issue was closed.
Description was changed from ========== MessagePumpForGpu - reduce potential work starvation MessagePumpForGpu has been on Canary for a few days and I am still assessing whether it is better than standard MessagePumpForUI in terms of reducing [GPU hang] crashes. While it has eliminated hangs with some signatures the rate of hangs in WaitForWork has increased. When experimenting with MsgWaitForMultipleObjectsEx I realized that it favors messages over events - when an event is signaled and a message is available in the queue it returns a result that indicates a message even when the message doesn't belong to a current thread (e.g. a browser UI message). With a high volume of messages it might be possible for WaitForWork to keep looping and starve the work processing. It is possible with Direct Composition where some of UI input messages have to go through the GPU process message loop because its D3D surface window covers the entire browser window. The fix is to do an additional check for the event. Please note that this is a much less common code path so this shouldn't affect performance much in scenarios like video playback. Also I've added some diagnostics code to help with crash dump analysis. BUG=588798,596190 ========== to ========== MessagePumpForGpu - reduce potential work starvation MessagePumpForGpu has been on Canary for a few days and I am still assessing whether it is better than standard MessagePumpForUI in terms of reducing [GPU hang] crashes. While it has eliminated hangs with some signatures the rate of hangs in WaitForWork has increased. When experimenting with MsgWaitForMultipleObjectsEx I realized that it favors messages over events - when an event is signaled and a message is available in the queue it returns a result that indicates a message even when the message doesn't belong to a current thread (e.g. a browser UI message). With a high volume of messages it might be possible for WaitForWork to keep looping and starve the work processing. It is possible with Direct Composition where some of UI input messages have to go through the GPU process message loop because its D3D surface window covers the entire browser window. The fix is to do an additional check for the event. Please note that this is a much less common code path so this shouldn't affect performance much in scenarios like video playback. Also I've added some diagnostics code to help with crash dump analysis. BUG=588798,596190 Committed: https://crrev.com/33490baaa1e5601b9284e28f58a99a4dc0db90ea Cr-Commit-Position: refs/heads/master@{#397886} ==========
Message was sent while issue was closed.
Patchset 3 (id:??) landed as https://crrev.com/33490baaa1e5601b9284e28f58a99a4dc0db90ea Cr-Commit-Position: refs/heads/master@{#397886} |