|
|
Created:
5 years, 5 months ago by henrika (OOO until Aug 14) Modified:
5 years, 5 months ago CC:
chromium-reviews, darin-cc_chromium.org, feature-media-reviews_chromium.org, jam Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionFixes issue where Web Speech API drops a frame every 5.1 seconds
TBR=tommi
BUG=506051
TEST=content_unittests --gtest_filter=Speech* and manual testing using https://www.google.com/intl/en/chrome/demos/speech.html
Verified 8000, 11025, 16000, 22050, 32000, 44100, 48000, 96000 and 192000 as native input sample rate.
Committed: https://crrev.com/fd4c2189c51bfad5d6f0be146b1d0ad0a6f50022
Cr-Commit-Position: refs/heads/master@{#338013}
Patch Set 1 #
Total comments: 15
Patch Set 2 : Feedback from Dale #
Total comments: 6
Patch Set 3 : nits #
Messages
Total messages: 26 (8 generated)
henrika@chromium.org changed reviewers: + dalecurtis@chromium.org
https://codereview.chromium.org/1211203006/diff/1/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.cc (right): https://codereview.chromium.org/1211203006/diff/1/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:157: DVLOG(1) << "SRI::ODC::Convert done (" << convert_count_ << ")"; seems like this will always log a value higher than 0 (up to max int of course) and convert_count_ is otherwise not used (or maybe I'm missing something). It seems like that if we really need this variable, we should get rid of waiting_for_input_.
henrika@chromium.org changed reviewers: + tommi@chromium.org
https://codereview.chromium.org/1211203006/diff/1/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.cc (right): https://codereview.chromium.org/1211203006/diff/1/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:157: DVLOG(1) << "SRI::ODC::Convert done (" << convert_count_ << ")"; Got it. Plan is to remove this counter. Just wanted to see where (~5.1) we end up in a state where we don't ask for new input data but instead read from the internal storage inside the Converter (that is when we miss 102ms of input data currently).
Just some pedantic nits and a repeat comment :) https://codereview.chromium.org/1211203006/diff/1/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.cc (right): https://codereview.chromium.org/1211203006/diff/1/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:156: audio_converter_.Convert(output_bus_.get()); As mentioned in the email thread, you should probably just have a "bool provide_input_called_" and if you have data and it isn't, then call convert again and generate another AudioChunk. https://codereview.chromium.org/1211203006/diff/1/media/base/audio_converter.cc File media/base/audio_converter.cc (right): https://codereview.chromium.org/1211203006/diff/1/media/base/audio_converter.... media/base/audio_converter.cc:95: DVLOG(1) << "AudioConverter::PrimeWithSilence"; Remove before submit and invert conditional; i.e. if (resampler_) PrimeWithSilence(). https://codereview.chromium.org/1211203006/diff/1/media/base/multi_channel_re... File media/base/multi_channel_resampler.cc (right): https://codereview.chromium.org/1211203006/diff/1/media/base/multi_channel_re... media/base/multi_channel_resampler.cc:47: for (size_t i = 0; i < resamplers_.size(); ++i) { No {} for single line please.
Thanks. Please check my comment/question. https://codereview.chromium.org/1211203006/diff/1/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.cc (right): https://codereview.chromium.org/1211203006/diff/1/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:156: audio_converter_.Convert(output_bus_.get()); Smart. But note that I have modified the input size to 100ms (used to be 102) in this patch as well. And it will result in that ProvideInput is always called. Do you want me to restore 102 and use "double convert when needed" or keep 100. I don't remember the exact reason why we selected 102 in the first place.
https://codereview.chromium.org/1211203006/diff/1/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.cc (right): https://codereview.chromium.org/1211203006/diff/1/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:156: audio_converter_.Convert(output_bus_.get()); On 2015/07/07 19:30:05, henrika wrote: > Smart. But note that I have modified the input size to 100ms (used to be 102) in > this patch as well. And it will result in that ProvideInput is always called. Do > you want me to restore 102 and use "double convert when needed" or keep 100. I > don't remember the exact reason why we selected 102 in the first place. Well, I'm not confident your modification always ensures a ProvideInput() call; due to rounding over time you may end up without a ProvideInput() call, possibly speech input is short enough that this isn't a problem though. I'm not sure it's possible to make a stricter guarantee than "at most 1 call." https://codereview.chromium.org/1211203006/diff/1/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:555: #if defined(OS_WIN) I'm surprised we're not using the native params for all platforms, but I guess if it works it works.
https://codereview.chromium.org/1211203006/diff/1/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.cc (right): https://codereview.chromium.org/1211203006/diff/1/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:156: audio_converter_.Convert(output_bus_.get()); Got it. I will test your scheme using 102 where I know it will be triggered once every 5th seconds. Then go back to 100 but keep the scheme just in case. https://codereview.chromium.org/1211203006/diff/1/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:555: #if defined(OS_WIN) Hmm, but we are on all but Win XP. All others use native. Clear, or am I missing something?
https://codereview.chromium.org/1211203006/diff/1/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.cc (right): https://codereview.chromium.org/1211203006/diff/1/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:156: audio_converter_.Convert(output_bus_.get()); On 2015/07/07 21:14:08, henrika wrote: > Got it. I will test your scheme using 102 where I know it will be triggered once > every 5th seconds. Then go back to 100 but keep the scheme just in case. I think it's fine to use 100ms like you do w/ prime with silence, you just need to also handle the extra-convert case too. https://codereview.chromium.org/1211203006/diff/1/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:555: #if defined(OS_WIN) On 2015/07/07 21:14:08, henrika wrote: > Hmm, but we are on all but Win XP. All others use native. Clear, or am I missing > something? Ah no, I misread and inverted the check.
Thanks Dale, PTAL. https://codereview.chromium.org/1211203006/diff/1/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.cc (right): https://codereview.chromium.org/1211203006/diff/1/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:156: audio_converter_.Convert(output_bus_.get()); Done. Hope you are OK with how I handle the extra convert case. I wanted to make it clear that it should be a rare event. https://codereview.chromium.org/1211203006/diff/1/media/base/audio_converter.cc File media/base/audio_converter.cc (right): https://codereview.chromium.org/1211203006/diff/1/media/base/audio_converter.... media/base/audio_converter.cc:95: DVLOG(1) << "AudioConverter::PrimeWithSilence"; On 2015/07/07 16:38:17, DaleCurtis wrote: > Remove before submit and invert conditional; i.e. if (resampler_) > PrimeWithSilence(). Done. https://codereview.chromium.org/1211203006/diff/1/media/base/multi_channel_re... File media/base/multi_channel_resampler.cc (right): https://codereview.chromium.org/1211203006/diff/1/media/base/multi_channel_re... media/base/multi_channel_resampler.cc:47: for (size_t i = 0; i < resamplers_.size(); ++i) { On 2015/07/07 16:38:17, DaleCurtis wrote: > No {} for single line please. Done.
Assuming chunk size is fine, lgtm https://codereview.chromium.org/1211203006/diff/20001/content/browser/speech/... File content/browser/speech/speech_recognizer_impl.cc (right): https://codereview.chromium.org/1211203006/diff/20001/content/browser/speech/... content/browser/speech/speech_recognizer_impl.cc:565: ((in_params.sample_rate() * chunk_duration_ms) / 1000.0) + 0.5; One last thing, can verify that the chunk size for 8kHz -> 16kHz is greater than the output request size? I'm kind of surprised it's not a problem.
https://codereview.chromium.org/1211203006/diff/20001/content/browser/speech/... File content/browser/speech/speech_recognizer_impl.cc (right): https://codereview.chromium.org/1211203006/diff/20001/content/browser/speech/... content/browser/speech/speech_recognizer_impl.cc:282: // See http://crbug.com/506051 regarding why one extra convert call can Oh, you could just do a while (!data_was_converted) arund the above section. https://codereview.chromium.org/1211203006/diff/20001/content/browser/speech/... content/browser/speech/speech_recognizer_impl.cc:285: DCHECK(false); Remove dcheck and dlog, I don't think it's unexpected.
Thanks Dale. I verified that 8->16 worked as well. Please check out my comments. https://codereview.chromium.org/1211203006/diff/20001/content/browser/speech/... File content/browser/speech/speech_recognizer_impl.cc (right): https://codereview.chromium.org/1211203006/diff/20001/content/browser/speech/... content/browser/speech/speech_recognizer_impl.cc:282: // See http://crbug.com/506051 regarding why one extra convert call can Great point, tried that. But what then happens is: #1: data_was_converted (dwc) is false => convert and dwc is true => #2: break while loop but now dwc is true and we never enter the while loop again.. Yes, there are ways around it but I figured that the current style makes it clear that the second loop is an extra-ordinary thing and I did not want to risk ending up in an infinite loop. I can make changes in a follow-up if you like. https://codereview.chromium.org/1211203006/diff/20001/content/browser/speech/... content/browser/speech/speech_recognizer_impl.cc:285: DCHECK(false); On 2015/07/08 18:41:47, DaleCurtis wrote: > Remove dcheck and dlog, I don't think it's unexpected. Done. https://codereview.chromium.org/1211203006/diff/20001/content/browser/speech/... content/browser/speech/speech_recognizer_impl.cc:565: ((in_params.sample_rate() * chunk_duration_ms) / 1000.0) + 0.5; SRI::output_parameters: format: 1 channels: 1 channel_layout: 2 sample_rate: 16000 bits_per_sample: 16 frames_per_buffer: 1600 SRI::input_parameters: format: 1 channels: 2 channel_layout: 3 sample_rate: 8000 bits_per_sample: 16 frames_per_buffer: 800 AudioConverter::ChunkSize: 1600 It seems to work fine for me.
henrika@chromium.org changed reviewers: - tommi@chromium.org
The CQ bit was checked by henrika@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from dalecurtis@chromium.org Link to the patchset: https://codereview.chromium.org/1211203006/#ps40001 (title: "nits")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1211203006/40001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: chromium_presubmit on tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/chromium_presub...)
henrika@chromium.org changed reviewers: + tommi@chromium.org
Landing with TBR=tommi@ since he is the owner. Tommi, please ping me if you want some final changes.
The CQ bit was checked by henrika@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1211203006/40001
Message was sent while issue was closed.
Committed patchset #3 (id:40001)
Message was sent while issue was closed.
Patchset 3 (id:??) landed as https://crrev.com/fd4c2189c51bfad5d6f0be146b1d0ad0a6f50022 Cr-Commit-Position: refs/heads/master@{#338013}
Message was sent while issue was closed.
sorry, missed the ping. lgtm and thanks for fixing! |