|
|
Chromium Code Reviews
DescriptionIgnore calls to Reset() when in error state
No checks are made for STATE_ERROR in DecoderStream::Reset(). This leads
to some problems when:
- The Decoder is succesfully initialized, but falls back to a new
decoder on first decode error, and the fallback decoder initialization
fails, right before a call to Reset() is executed.
- The Decoder sends a DECODE_ERROR when there is a pending demuxer read,
right before a call to Reset() is executed.
This CL shortcircuits calls to Reset() when the DecoderStream is in
STATE_ERROR. This changes the behavior of Reset(): calling it will
no longer change the STATE_ERROR to STATE_NORMAL upon successful reset.
This is acceptable, since no one is calling Reset() to recover from
errors. It is only being called by AudioRendererImpl and
VideoRendererImpl when flushing.
BUG=597605, 607454
TEST=media unit passed, extensive runs of fuzzed faulty media passed for 607454
Committed: https://crrev.com/1a359d181818d87fbef52d5b003972877aa4bc1d
Cr-Commit-Position: refs/heads/master@{#391650}
Patch Set 1 #
Total comments: 4
Patch Set 2 : #Patch Set 3 : #Patch Set 4 : #Patch Set 5 : #Messages
Total messages: 19 (5 generated)
Description was changed from ========== Ignore calls to Reset() when in error state No checks are made for STATE_ERROR in DecoderStream::Reset(). This leads to some problems when: - The Decoder is succesfully initialized, but falls back to a new decoder on first decode error, and the fallback decoder initialization fails, right before a call to Reset() is executed. - The Decoder sends a DECODE_ERROR when there is a pending demuxer read, right before a call to Reset() is executed. This CL shortcircuits calls to Reset() when the DecoderStream is in STATE_ERROR. This changes the behavior of Reset(): calling it will no longer change the STATE_ERROR to STATE_NORMAL upon successful reset. This is acceptable, since no one is calling Reset() to recover from errors. It is only being called by AudioRendererImpl and VideoRendererImpl when flushing. BUG=597605, 607454 TEST=media unit tests were run, a lot of the fuzzed faulty media as well REVIEW=sandersd@chromium.org, dalecurtis@chromium.org ========== to ========== Ignore calls to Reset() when in error state No checks are made for STATE_ERROR in DecoderStream::Reset(). This leads to some problems when: - The Decoder is succesfully initialized, but falls back to a new decoder on first decode error, and the fallback decoder initialization fails, right before a call to Reset() is executed. - The Decoder sends a DECODE_ERROR when there is a pending demuxer read, right before a call to Reset() is executed. This CL shortcircuits calls to Reset() when the DecoderStream is in STATE_ERROR. This changes the behavior of Reset(): calling it will no longer change the STATE_ERROR to STATE_NORMAL upon successful reset. This is acceptable, since no one is calling Reset() to recover from errors. It is only being called by AudioRendererImpl and VideoRendererImpl when flushing. BUG=597605, 607454 TEST=media unit passed, extensive runs of fuzzed faulty media passed for 607454 ==========
tguilbert@chromium.org changed reviewers: + dalecurtis@chromium.org, sandersd@chromium.org
I have not run into any DCHECK failures since removing STATE_ERROR as a valid state when entering ResetDecoder() or OnDecoderReset(). Is there a slew of tests that I should put this change through, or should our current tests be enough? There might still be some new bugs that will bubble up due to the removal of the "STATE_ERROR" --> "STATE_NORMAL" possible transition that might have been happening in OnDecoderReset() (here https://code.google.com/p/chromium/codesearch#chromium/src/media/filters/deco...). Thanks! Thomas
So this can occur if the decoder encounters an error, but the renderer hasn't responded to the error notification yet? You might be able to test this by postdelaytasking the error event from the decoder stream and then issuing a seek during the interim.
https://codereview.chromium.org/1939993002/diff/1/media/filters/decoder_strea... File media/filters/decoder_stream.cc (right): https://codereview.chromium.org/1939993002/diff/1/media/filters/decoder_strea... media/filters/decoder_stream.cc:168: // before a Reset() is executed. See crbug.com/597605 or crbug.com/607454. or -> and. https://codereview.chromium.org/1939993002/diff/1/media/filters/decoder_strea... media/filters/decoder_stream.cc:170: task_runner_->PostTask(FROM_HERE, base::ResetAndReturn(&reset_cb_)); Should this be below the |read_cb_| check? It seems that entering an error state should be responsible for firing that callback, and so here we should be safe to put the state check first.
> So this can occur if the decoder encounters an error, but the renderer hasn't > responded to the error notification yet? You might be able to test this by > postdelaytasking the error event from the decoder stream and then issuing a seek > during the interim. For 507605: The issue arises on resume, when creating a new Renderer while the read request from the previous Renderer's DecoderStream has not completed. I do not think that reporting or delaying reports of decode errors in the "soon-to-be-discarded-renderer" will affect this case; IIUC we discard errors in the pipeline when suspending, and this is when the decode error would bubble up. For 607454: I was able to exercise the code path and the ASAN build doesn't crash anymore. https://codereview.chromium.org/1939993002/diff/1/media/filters/decoder_strea... File media/filters/decoder_stream.cc (right): https://codereview.chromium.org/1939993002/diff/1/media/filters/decoder_strea... media/filters/decoder_stream.cc:168: // before a Reset() is executed. See crbug.com/597605 or crbug.com/607454. On 2016/05/02 22:57:03, sandersd wrote: > or -> and. Done. https://codereview.chromium.org/1939993002/diff/1/media/filters/decoder_strea... media/filters/decoder_stream.cc:170: task_runner_->PostTask(FROM_HERE, base::ResetAndReturn(&reset_cb_)); On 2016/05/02 22:57:03, sandersd wrote: > Should this be below the |read_cb_| check? It seems that entering an error state > should be responsible for firing that callback, and so here we should be safe to > put the state check first. This is something that I considered. The comments above Reset() mention that it fires any pending read callbacks. I therefore purposely put |reset_cb_| after |read_cb_|, but that may not be the right thing. There are two places where STATE_ERROR is set. - In OnDecodeDone, if we receive a DECODE_ERROR, we set the state and then fire any |read_cb_|. - In CompleteDecoderReinitialization(), we set the state, followed by firing |reset_cb_| and returning if it is present, OR satisfying |read_cb_|. In either of the cases where we enter STATE_ERROR, I don't think firing or not firing the of |read_cb_| will cause problems. The |read_cb_| should always be null when we are in a STATE_ERROR (resulting in a no-op), except in a codepath that should never happen (queuing a Read() while a Reset() is happening).
On 2016/05/03 at 00:06:41, tguilbert wrote: > > So this can occur if the decoder encounters an error, but the renderer hasn't > > responded to the error notification yet? You might be able to test this by > > postdelaytasking the error event from the decoder stream and then issuing a seek > > during the interim. > > For 507605: > The issue arises on resume, when creating a new Renderer while the read request from the previous Renderer's DecoderStream has not completed. I do not think that reporting or delaying reports of decode errors in the "soon-to-be-discarded-renderer" will affect this case; IIUC we discard errors in the pipeline when suspending, and this is when the decode error would bubble up. > Is this issue fixed in your CL? How are we getting into a case where there's a pending read? Suspend flushes the renderer which was something we wish we could avoid.
On 2016/05/03 00:14:11, DaleCurtis wrote: > On 2016/05/03 at 00:06:41, tguilbert wrote: > > > So this can occur if the decoder encounters an error, but the renderer > hasn't > > > responded to the error notification yet? You might be able to test this by > > > postdelaytasking the error event from the decoder stream and then issuing a > seek > > > during the interim. > > > > For 507605: > > The issue arises on resume, when creating a new Renderer while the read > request from the previous Renderer's DecoderStream has not completed. I do not > think that reporting or delaying reports of decode errors in the > "soon-to-be-discarded-renderer" will affect this case; IIUC we discard errors in > the pipeline when suspending, and this is when the decode error would bubble up. > > > > Is this issue fixed in your CL? How are we getting into a case where there's a > pending read? Suspend flushes the renderer which was something we wish we could > avoid. If we are in STATE_PENDING_DEMUXER_READ and we get a DECODE_ERROR, we enter STATE_ERROR. If Reset() is executed then, we never return at this line: https://code.google.com/p/chromium/codesearch#chromium/src/media/filters/deco... Which means that we enter ResetDecoder(), even if we were still waiting for a buffer from the demuxer. At this point while typing this reply, I realized that we never actually enter ReinitializeDecoder at the end of OnDecoderReset(), because this condition will stop the flow https://code.google.com/p/chromium/codesearch#chromium/src/media/filters/deco.... I originally thought that the extra pending read was from the DecoderStream requesting buffers after OnDecoderReinitialized(). It is now much more likely that the pending read is the leftover read that was not waited for, as a result of the STATE_PENDING_DEMUXER_READ (which was something I had considered but not revisited). How often do we have "micro suspensions"? As in, how likely is it that we transition from suspend to resume faster than the time it takes to complete a demuxer read?
On 2016/05/03 at 00:38:48, tguilbert wrote: > On 2016/05/03 00:14:11, DaleCurtis wrote: > > On 2016/05/03 at 00:06:41, tguilbert wrote: > > > > So this can occur if the decoder encounters an error, but the renderer > > hasn't > > > > responded to the error notification yet? You might be able to test this by > > > > postdelaytasking the error event from the decoder stream and then issuing a > > seek > > > > during the interim. > > > > > > For 507605: > > > The issue arises on resume, when creating a new Renderer while the read > > request from the previous Renderer's DecoderStream has not completed. I do not > > think that reporting or delaying reports of decode errors in the > > "soon-to-be-discarded-renderer" will affect this case; IIUC we discard errors in > > the pipeline when suspending, and this is when the decode error would bubble up. > > > > > > > Is this issue fixed in your CL? How are we getting into a case where there's a > > pending read? Suspend flushes the renderer which was something we wish we could > > avoid. > > If we are in STATE_PENDING_DEMUXER_READ and we get a DECODE_ERROR, we enter STATE_ERROR. > If Reset() is executed then, we never return at this line: > https://code.google.com/p/chromium/codesearch#chromium/src/media/filters/deco... > Which means that we enter ResetDecoder(), even if we were still waiting for a buffer from the demuxer. > > At this point while typing this reply, I realized that we never actually enter ReinitializeDecoder at the end of OnDecoderReset(), because this condition will stop the flow https://code.google.com/p/chromium/codesearch#chromium/src/media/filters/deco.... I originally thought that the extra pending read was from the DecoderStream requesting buffers after OnDecoderReinitialized(). It is now much more likely that the pending read is the leftover read that was not waited for, as a result of the STATE_PENDING_DEMUXER_READ (which was something I had considered but not revisited). > > How often do we have "micro suspensions"? As in, how likely is it that we transition from suspend to resume faster than the time it takes to complete a demuxer read? I don't think this should be possible. suspend -> resume in an effectively synchronous process IIUC, one which we start by waiting for all pending reads to complete before entering the suspended state. Are you suggesting that we're resuming while still processing the suspend? I don't think that can happen either since we enter the "SUSPENDING" state once we start suspending, a subsequent resume will just set a pending resume flag IIRC.
I stopped the clusterfuzz minimized test after 90 mins of not running into a DCHECK. The reset logic is still complicated. If anyone has a suggestion as to how to simplify it, please let me know. If not, this will have to go into the code-health brushup of this class.
On 2016/05/03 21:35:33, ThomasGuilbert wrote: > I stopped the clusterfuzz minimized test after 90 mins of not running into a > DCHECK. > > The reset logic is still complicated. If anyone has a suggestion as to how to > simplify it, please let me know. If not, this will have to go into the > code-health brushup of this class. Sorry, I've got nothing that isn't on the same scale as the existing bug. Does this mean you won't be creating a patch set with error separated into a flag?
lgtm
lgtm, though it feels like we should delete the pending read state now. I remember you saying that was messy so this is fine for now.
The CQ bit was checked by tguilbert@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1939993002/80001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1939993002/80001
Message was sent while issue was closed.
Description was changed from ========== Ignore calls to Reset() when in error state No checks are made for STATE_ERROR in DecoderStream::Reset(). This leads to some problems when: - The Decoder is succesfully initialized, but falls back to a new decoder on first decode error, and the fallback decoder initialization fails, right before a call to Reset() is executed. - The Decoder sends a DECODE_ERROR when there is a pending demuxer read, right before a call to Reset() is executed. This CL shortcircuits calls to Reset() when the DecoderStream is in STATE_ERROR. This changes the behavior of Reset(): calling it will no longer change the STATE_ERROR to STATE_NORMAL upon successful reset. This is acceptable, since no one is calling Reset() to recover from errors. It is only being called by AudioRendererImpl and VideoRendererImpl when flushing. BUG=597605, 607454 TEST=media unit passed, extensive runs of fuzzed faulty media passed for 607454 ========== to ========== Ignore calls to Reset() when in error state No checks are made for STATE_ERROR in DecoderStream::Reset(). This leads to some problems when: - The Decoder is succesfully initialized, but falls back to a new decoder on first decode error, and the fallback decoder initialization fails, right before a call to Reset() is executed. - The Decoder sends a DECODE_ERROR when there is a pending demuxer read, right before a call to Reset() is executed. This CL shortcircuits calls to Reset() when the DecoderStream is in STATE_ERROR. This changes the behavior of Reset(): calling it will no longer change the STATE_ERROR to STATE_NORMAL upon successful reset. This is acceptable, since no one is calling Reset() to recover from errors. It is only being called by AudioRendererImpl and VideoRendererImpl when flushing. BUG=597605, 607454 TEST=media unit passed, extensive runs of fuzzed faulty media passed for 607454 ==========
Message was sent while issue was closed.
Committed patchset #5 (id:80001)
Message was sent while issue was closed.
Description was changed from ========== Ignore calls to Reset() when in error state No checks are made for STATE_ERROR in DecoderStream::Reset(). This leads to some problems when: - The Decoder is succesfully initialized, but falls back to a new decoder on first decode error, and the fallback decoder initialization fails, right before a call to Reset() is executed. - The Decoder sends a DECODE_ERROR when there is a pending demuxer read, right before a call to Reset() is executed. This CL shortcircuits calls to Reset() when the DecoderStream is in STATE_ERROR. This changes the behavior of Reset(): calling it will no longer change the STATE_ERROR to STATE_NORMAL upon successful reset. This is acceptable, since no one is calling Reset() to recover from errors. It is only being called by AudioRendererImpl and VideoRendererImpl when flushing. BUG=597605, 607454 TEST=media unit passed, extensive runs of fuzzed faulty media passed for 607454 ========== to ========== Ignore calls to Reset() when in error state No checks are made for STATE_ERROR in DecoderStream::Reset(). This leads to some problems when: - The Decoder is succesfully initialized, but falls back to a new decoder on first decode error, and the fallback decoder initialization fails, right before a call to Reset() is executed. - The Decoder sends a DECODE_ERROR when there is a pending demuxer read, right before a call to Reset() is executed. This CL shortcircuits calls to Reset() when the DecoderStream is in STATE_ERROR. This changes the behavior of Reset(): calling it will no longer change the STATE_ERROR to STATE_NORMAL upon successful reset. This is acceptable, since no one is calling Reset() to recover from errors. It is only being called by AudioRendererImpl and VideoRendererImpl when flushing. BUG=597605, 607454 TEST=media unit passed, extensive runs of fuzzed faulty media passed for 607454 Committed: https://crrev.com/1a359d181818d87fbef52d5b003972877aa4bc1d Cr-Commit-Position: refs/heads/master@{#391650} ==========
Message was sent while issue was closed.
Patchset 5 (id:??) landed as https://crrev.com/1a359d181818d87fbef52d5b003972877aa4bc1d Cr-Commit-Position: refs/heads/master@{#391650} |
