content/renderer/media/webrtc_audio_processor.cc - Issue 54383003: Added an "enable-audio-processor" flag and WebRtcAudioProcessor class

Side by Side Diff: content/renderer/media/webrtc_audio_processor.cc

Issue 54383003: Added an "enable-audio-processor" flag and WebRtcAudioProcessor class (Closed) Base URL: svn://svn.chromium.org/chrome/trunk/src

Patch Set: rebased and added an include Created 7 years, 1 month ago


OLD	NEW
(Empty)
	1 // Copyright 2013 The Chromium Authors. All rights reserved.

	2 // Use of this source code is governed by a BSD-style license that can be

	3 // found in the LICENSE file.

	4

	5 #include "content/renderer/media/webrtc_audio_processor.h"

	6

	7 #include "base/command_line.h"

	8 #include "base/debug/trace_event.h"

	9 #include "content/public/common/content_switches.h"

	10 #include "content/renderer/media/webrtc_audio_processor_options.h"

	11 #include "media/audio/audio_parameters.h"

	12 #include "media/base/audio_converter.h"

	13 #include "media/base/audio_fifo.h"

	14 #include "media/base/channel_layout.h"

	15

	16 namespace content {

	17

	18 namespace {

	19

	20 using webrtc::AudioProcessing;

	21 using webrtc::MediaConstraintsInterface;

	22

	23 #if defined(ANDROID)

	24 const int kAudioProcessingSampleRate = 16000;

	25 #else

	26 const int kAudioProcessingSampleRate = 32000;

	27 #endif

	28 const int kAudioProcessingNumberOfChannel = 1;

	29

	30 const int kMaxNumberOfBuffersInFifo = 2;

	31

	32 } // namespace

	33

	34 class WebRtcAudioProcessor::WebRtcAudioConverter

	35 : public media::AudioConverter::InputCallback {

	36 public:

	37 WebRtcAudioConverter(const media::AudioParameters& source_params,

	38 const media::AudioParameters& sink_params)

	39 : source_params_(source_params),

	40 sink_params_(sink_params),

	41 audio_converter_(source_params, sink_params_, false) {

	42 worker_thread_checker_.DetachFromThread();
	Henrik Grunell 2013/11/18 13:19:34
	How does the threading model look like? Push, Convert and destructor must be called on the same thread, but not necessarily the same as the constructor? Why is that? Why not all on the same thread?

	no longer working on chromium (xians1) 2013/11/18 17:41:46
	The capture converter is created in the main render thread but used in the audio capture thread. the render converter is created and used in the audio render thread.
	The reason why the capture converter is different from the render converter is that on the capture side, we get a SetCaptureFormat() notification when the format has changed, this happens on the main render thread, and the source is responsible for stopping the data flow before this SetCaptureFormat() callback, so that the clients can re-initialize the components.
	But on the render side we don't have such callback.
	If you are asking can we can do the same thing to the capture converter as how we do with the render converter. Like checking the format for each callback, and do reinitialization if format has changed, then we have everything on the audio capture thread. We may be able to do this, but I don't think it is a good idea to allow changing resources on the fly.

	Henrik Grunell 2013/11/19 08:52:22
	Ah of course, there's two thread checkers.
	43

	44 audio_converter_.AddInput(this);

	45 // Create and initialize audio fifo and audio bus wrapper.

	46 // The size of the FIFO should be at least twice of the source buffer size

	47 // or twice of the sink buffer size.

	48 int buffer_size = std::max(

	49 kMaxNumberOfBuffersInFifo * source_params_.frames_per_buffer(),

	50 kMaxNumberOfBuffersInFifo * sink_params_.frames_per_buffer());

	51 fifo_.reset(new media::AudioFifo(source_params_.channels(), buffer_size));

	52 // TODO(xians): Use CreateWrapper to save one memcpy.

	53 audio_wrapper_ = media::AudioBus::Create(sink_params_.channels(),

	54 sink_params_.frames_per_buffer());

	55 }

	56

	57 virtual ~WebRtcAudioConverter() {

	58 DCHECK(create_thread_checker_.CalledOnValidThread());

	59 audio_converter_.RemoveInput(this);

	60 }

	61

	62 void Push(media::AudioBus* audio_source) {

	63 // Called on the audio thread, which is the capture audio thread for

	64 // |WebRtcAudioProcessor::capture_converter_|, and render audio thread for

	65 // |WebRtcAudioProcessor::render_converter_|.

	66 // And it must be the same thread as calling Convert().

	67 worker_thread_checker_.CalledOnValidThread();
	Henrik Grunell 2013/11/18 13:19:34
	I may be ignorant regarding ThreadChecker, but don't you need a DCHECK? This will only set the valid thread, right?

	no longer working on chromium (xians1) 2013/11/18 17:41:46
	right, it is needed.
	68 fifo_->Push(audio_source);

	69 }

	70

	71 bool Convert(webrtc::AudioFrame* out) {

	72 // Called on the audio thread, which is the capture audio thread for

	73 // |WebRtcAudioProcessor::capture_converter_|, and render audio thread for

	74 // |WebRtcAudioProcessor::render_converter_|.

	75 // Return false if there is no 10ms data in the FIFO.

	76 worker_thread_checker_.CalledOnValidThread();

	77 if (fifo_->frames() < (source_params_.sample_rate() / 100))

	78 return false;

	79

	80 // Convert 10ms data to the output format, this will trigger ProvideInput().

	81 audio_converter_.Convert(audio_wrapper_.get());

	82

	83 // TODO(xians): Figure out a better way to handle the interleaved and

	84 // deinterleaved format switching.

	85 audio_wrapper_->ToInterleaved(audio_wrapper_->frames(),

	86 sink_params_.bits_per_sample() / 8,

	87 out->data_);

	88

	89 out->samples_per_channel_ = sink_params_.frames_per_buffer();

	90 out->sample_rate_hz_ = sink_params_.sample_rate();

	91 out->speech_type_ = webrtc::AudioFrame::kNormalSpeech;

	92 out->vad_activity_ = webrtc::AudioFrame::kVadUnknown;

	93 out->num_channels_ = sink_params_.channels();

	94

	95 return true;

	96 }

	97

	98 const media::AudioParameters& source_parameters() const {

	99 return source_params_;

	100 }

	101 const media::AudioParameters& sink_parameters() const {

	102 return sink_params_;

	103 }

	104

	105 private:

	106 // AudioConverter::InputCallback implementation.

	107 virtual double ProvideInput(media::AudioBus* audio_bus,

	108 base::TimeDelta buffer_delay) OVERRIDE {

	109 // Called on realtime audio thread.

	110 // TODO(xians): Figure out why the first Convert() triggers ProvideInput

	111 // two times.

	112 if (fifo_->frames() < audio_bus->frames())

	113 return 0;

	114

	115 fifo_->Consume(audio_bus, 0, audio_bus->frames());

	116 return 1.0;

	117 }

	118

	119 base::ThreadChecker create_thread_checker_;

	120 base::ThreadChecker worker_thread_checker_;

	121 media::AudioParameters source_params_;

	122 media::AudioParameters sink_params_;

	123

	124 // TODO(xians): consider using SincResampler to save some memcpy.

	125 // Handles mixing and resampling between input and output parameters.

	126 media::AudioConverter audio_converter_;

	127 scoped_ptr<media::AudioBus> audio_wrapper_;

	128 scoped_ptr<media::AudioFifo> fifo_;

	129 };

	130

	131 WebRtcAudioProcessor::WebRtcAudioProcessor(

	132 const webrtc::MediaConstraintsInterface* constraints)

	133 : render_delay_ms_(0) {

	134 capture_thread_checker_.DetachFromThread();

	135 render_thread_checker_.DetachFromThread();

	136 InitializeAudioProcessingModule(constraints);

	137 }

	138

	139 WebRtcAudioProcessor::~WebRtcAudioProcessor() {

	140 DCHECK(main_thread_checker_.CalledOnValidThread());

	141 StopAudioProcessing();

	142 }

	143

	144 void WebRtcAudioProcessor::SetCaptureFormat(

	145 const media::AudioParameters& source_params) {

	146 DCHECK(main_thread_checker_.CalledOnValidThread());

	147 DCHECK(source_params.IsValid());

	148

	149 // Create and initialize audio converter for the source data.

	150 // When the webrtc AudioProcessing is enabled, the sink format of the

	151 // converter will be the same as the post-processed data format, which is

	152 // 32k mono for desktops and 16k mono for Android. When the AudioProcessing

	153 // is disabled, the sink format will be the same as the source format.

	154 const int sink_sample_rate = audio_processing_ ?

	155 kAudioProcessingSampleRate : source_params.sample_rate();

	156 const media::ChannelLayout sink_channel_layout = audio_processing_ ?

	157 media::CHANNEL_LAYOUT_MONO : source_params.channel_layout();

	158

	159 // WebRtc is using 10ms data as its native packet size.

	160 media::AudioParameters sink_params(

	161 media::AudioParameters::AUDIO_PCM_LOW_LATENCY, sink_channel_layout,

	162 sink_sample_rate, 16, sink_sample_rate / 100);

	163 capture_converter_.reset(

	164 new WebRtcAudioConverter(source_params, sink_params));

	165 }

	166

	167 void WebRtcAudioProcessor::PushCaptureData(media::AudioBus* audio_source) {

	168 capture_thread_checker_.CalledOnValidThread();
	Henrik Grunell 2013/11/18 13:19:34
	Shouldn't you return if |audio_processor_| == NULL?

	no longer working on chromium (xians1) 2013/11/18 17:41:46
	No, we handle the case when audio_processor_ is NULL, where this class works as a FIFO.

	Henrik Grunell 2013/11/19 08:52:22
	OK.
	169 capture_converter_->Push(audio_source);

	170 }

	171

	172 bool WebRtcAudioProcessor::ProcessAndConsumeData(

	173 base::TimeDelta capture_delay, int volume, bool key_pressed,

	174 int16** out) {

	175 capture_thread_checker_.CalledOnValidThread();

	176 TRACE_EVENT0("audio",

	177 "WebRtcAudioProcessor::ProcessAndConsumeData");

	178

	179 if (!capture_converter_->Convert(&capture_frame_))

	180 return false;

	181

	182 ProcessData(&capture_frame_, capture_delay, volume, key_pressed);

	183 *out = capture_frame_.data_;

	184

	185 return true;

	186 }

	187

	188 const media::AudioParameters& WebRtcAudioProcessor::OutputFormat() const {

	189 return capture_converter_->sink_parameters();

	190 }

	191

	192 void WebRtcAudioProcessor::ProcessData(webrtc::AudioFrame* audio_frame,
	Henrik Grunell 2013/11/18 13:19:34
	Does it make sense to set the delay, volume and key pressed in other separate functions? I'm not sure how the system behind is designed. Or if these values are strongly tied to an audio frame, maybe they should be in there audio frame class?

	no longer working on chromium (xians1) 2013/11/18 17:41:46
	They are tied to the audio_frame. From the code perspective, there won't be any benefit by extracting them out to a separate method.

	Henrik Grunell 2013/11/19 08:52:22
	OK, should they be added to AudioFrame? Or are they only used here?

	no longer working on chromium (xians1) 2013/11/21 15:59:25
	They are only used here, we need to pass them to webrtc::AudioProcessing module, then most of them will not be used again, except that the delay value needs to be cached for video/audio sync.
	193 base::TimeDelta capture_delay,

	194 int volume,

	195 bool key_pressed) {

	196 capture_thread_checker_.CalledOnValidThread();

	197 if (!audio_processing_)

	198 return;

	199

	200 TRACE_EVENT0("audio", "WebRtcAudioProcessor::Process10MsData");

	201 DCHECK_EQ(audio_processing_->sample_rate_hz(),

	202 capture_converter_->sink_parameters().sample_rate());

	203 DCHECK_EQ(audio_processing_->num_input_channels(),

	204 capture_converter_->sink_parameters().channels());

	205 DCHECK_EQ(audio_processing_->num_output_channels(),

	206 capture_converter_->sink_parameters().channels());

	207

	208 base::subtle::Atomic32 render_delay_ms =

	209 base::subtle::Acquire_Load(&render_delay_ms_);

	210 int64 capture_delay_ms = capture_delay.InMilliseconds();

	211 DCHECK_LT(capture_delay_ms,

	212 std::numeric_limits<base::subtle::Atomic32>::max());

	213 int total_delay_ms = capture_delay_ms + render_delay_ms;

	214 if (total_delay_ms > 1000) {

	215 LOG(WARNING) << "Large audio delay, capture delay: " << capture_delay_ms

	216 << "ms; render delay: " << render_delay_ms << "ms";

	217 }

	218

	219 audio_processing_->set_stream_delay_ms(total_delay_ms);

	220 webrtc::GainControl* agc = audio_processing_->gain_control();

	221 if (agc->set_stream_analog_level(volume))

	222 NOTREACHED();

	223 int err = audio_processing_->ProcessStream(audio_frame);

	224 DCHECK(!err) << "ProcessStream() error: " << err;

	225

	226 // TODO(xians): Add support for AGC, typing detection, audio level

	227 // calculation, stereo swapping.

	228 }

	229

	230 void WebRtcAudioProcessor::PushRenderData(

	231 const int16* render_audio, int sample_rate, int number_of_channels,

	232 int number_of_frames, base::TimeDelta render_delay) {

	233 render_thread_checker_.CalledOnValidThread();

	234

	235 // Return immediately if the echo cancellation is off.

	236 if (!audio_processing_ ||

	237 !audio_processing_->echo_cancellation()->is_enabled())

	238 return;

	239

	240 TRACE_EVENT0("audio",

	241 "WebRtcAudioProcessor::FeedRenderDataToAudioProcessing");

	242 int64 new_render_delay_ms = render_delay.InMilliseconds();

	243 DCHECK_LT(new_render_delay_ms,

	244 std::numeric_limits<base::subtle::Atomic32>::max());

	245 base::subtle::Release_Store(&render_delay_ms_, new_render_delay_ms);

	246

	247 InitializeRenderConverterIfNeeded(sample_rate, number_of_channels,

	248 number_of_frames);

	249

	250 // TODO(xians): Avoid this extra interleave/deinterleave.

	251 render_data_bus_->FromInterleaved(render_audio,

	252 render_data_bus_->frames(),

	253 sizeof(render_audio[0]));

	254 render_converter_->Push(render_data_bus_.get());

	255 while (render_converter_->Convert(&render_frame_)) {

	256 audio_processing_->AnalyzeReverseStream(&render_frame_);

	257 }

	258 }

	259

	260 void WebRtcAudioProcessor::InitializeAudioProcessingModule(

	261 const webrtc::MediaConstraintsInterface* constraints) {

	262 if (!CommandLine::ForCurrentProcess()->HasSwitch(

	263 switches::kEnableAudioTrackProcessing)) {

	264 return;

	265 }

	266

	267 if (!constraints)

	268 return;

	269

	270 const bool enable_aec = GetPropertyFromConstraints(

	271 constraints, MediaConstraintsInterface::kEchoCancellation);

	272 const bool enable_ns = GetPropertyFromConstraints(

	273 constraints, MediaConstraintsInterface::kNoiseSuppression);

	274 const bool enable_high_pass_filter = GetPropertyFromConstraints(

	275 constraints, MediaConstraintsInterface::kHighpassFilter);

	276 const bool start_aec_dump = GetPropertyFromConstraints(

	277 constraints, MediaConstraintsInterface::kInternalAecDump);

	278 #if defined(IOS) || defined(ANDROID)

	279 const bool enable_experimental_aec = false;

	280 const bool enable_typing_detection = false;

	281 #else

	282 const bool enable_experimental_aec = GetPropertyFromConstraints(

	283 constraints, MediaConstraintsInterface::kExperimentalEchoCancellation);

	284 const bool enable_typing_detection = GetPropertyFromConstraints(

	285 constraints, MediaConstraintsInterface::kTypingNoiseDetection);

	286 #endif

	287

	288 // Return immediately if no audio processing component is enabled.

	289 if (!enable_aec && !enable_experimental_aec && !enable_ns &&

	290 !enable_high_pass_filter && !enable_typing_detection) {

	291 return;

	292 }

	293

	294 // Create and configure the audio processing if it does not exist.

	295 if (!audio_processing_)

	296 audio_processing_.reset(webrtc::AudioProcessing::Create(0));

	297

	298 // Enable the audio processing components.

	299 if (enable_aec) {

	300 EnableEchoCancellation(audio_processing_.get());

	301 if (enable_experimental_aec)

	302 EnableExperimentalEchoCancellation(audio_processing_.get());

	303 }

	304

	305 if (enable_ns)

	306 EnableNoiseSuppression(audio_processing_.get());

	307

	308 if (enable_high_pass_filter)

	309 EnableHighPassFilter(audio_processing_.get());

	310

	311 if (enable_typing_detection)

	312 EnableTypingDetection(audio_processing_.get());

	313

	314 if (enable_aec && start_aec_dump)

	315 StartAecDump(audio_processing_.get());

	316

	317 // Configure the audio format the audio processing is running on. This

	318 // has to be done after all the needed components are enabled.

	319 if (audio_processing_->set_sample_rate_hz(kAudioProcessingSampleRate))

	320 NOTREACHED();

	321 if (audio_processing_->set_num_channels(kAudioProcessingNumberOfChannel,

	322 kAudioProcessingNumberOfChannel))

	323 NOTREACHED();

	324 }

	325

	326 void WebRtcAudioProcessor::InitializeRenderConverterIfNeeded(

	327 int sample_rate, int number_of_channels, int frames_per_buffer) {

	328 // TODO(xians): Figure out if we need to handle the buffer size change.

	329 if (render_converter_.get() &&

	330 render_converter_->source_parameters().sample_rate() == sample_rate &&

	331 render_converter_->source_parameters().channels() == number_of_channels) {

	332 // Do nothing if the |render_converter_| has been setup properly.

	333 return;

	334 }

	335

	336 media::AudioParameters source_params(

	337 media::AudioParameters::AUDIO_PCM_LOW_LATENCY,

	338 media::GuessChannelLayout(number_of_channels), sample_rate, 16,

	339 frames_per_buffer);

	340 media::AudioParameters sink_params(

	341 media::AudioParameters::AUDIO_PCM_LOW_LATENCY,

	342 media::CHANNEL_LAYOUT_MONO, kAudioProcessingSampleRate, 16,

	343 kAudioProcessingSampleRate / 100);

	344 render_converter_.reset(new WebRtcAudioConverter(source_params, sink_params));

	345 render_data_bus_ = media::AudioBus::Create(number_of_channels,

	346 frames_per_buffer);

	347 }

	348

	349 void WebRtcAudioProcessor::StopAudioProcessing() {

	350 if (!audio_processing_.get())

	351 return;

	352

	353 // It is safe to stop the AEC dump even it is not started.

	354 StopAecDump(audio_processing_.get());

	355

	356 audio_processing_.reset();

	357 }

	358

	359 } // namespace content
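
The Push()/Convert() pair in WebRtcAudioConverter buffers incoming audio in a FIFO and only hands out data once a full 10 ms is available. The sketch below isolates that gating idea in standalone C++, as an illustration rather than the patch's implementation. The class name TenMsChunker and its methods are hypothetical, it handles mono 16-bit samples only, and it omits the resampling and channel mixing that media::AudioConverter performs in the real code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical stand-in for the FIFO plus 10 ms gating in
// WebRtcAudioConverter. Not Chromium code: mono int16 samples only,
// and no resampling or channel mixing.
class TenMsChunker {
 public:
  explicit TenMsChunker(int sample_rate_hz)
      : frames_per_10ms_(static_cast<std::size_t>(sample_rate_hz / 100)) {}

  // Appends captured samples to the FIFO; plays the role of Push().
  void Push(const std::vector<int16_t>& samples) {
    fifo_.insert(fifo_.end(), samples.begin(), samples.end());
  }

  // Moves exactly 10 ms of audio into |out| and returns true. Returns
  // false and leaves |out| untouched when less than 10 ms is buffered.
  bool Convert(std::vector<int16_t>* out) {
    if (fifo_.size() < frames_per_10ms_)
      return false;
    const auto chunk_end =
        fifo_.begin() + static_cast<std::ptrdiff_t>(frames_per_10ms_);
    out->assign(fifo_.begin(), chunk_end);
    fifo_.erase(fifo_.begin(), chunk_end);
    return true;
  }

  std::size_t buffered_frames() const { return fifo_.size(); }

 private:
  const std::size_t frames_per_10ms_;
  std::deque<int16_t> fifo_;
};
```

A caller would typically push one capture buffer and then loop on Convert() until it returns false, which is the shape PushRenderData() uses with its while loop.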

OLD	NEW
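
The review thread on line 67 notes that calling CalledOnValidThread() without wrapping it in a DCHECK only binds the checker to the calling thread and discards the result. The sketch below imitates that behaviour in standalone C++ to show why the return value has to be asserted. SimpleThreadChecker and PushChecked are hypothetical names, assert() stands in for DCHECK, and the class is a simplified imitation, not base::ThreadChecker itself.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical, simplified imitation of a detachable thread checker.
class SimpleThreadChecker {
 public:
  SimpleThreadChecker() : valid_thread_(std::this_thread::get_id()) {}

  // Forgets the bound thread; the next CalledOnValidThread() rebinds.
  void DetachFromThread() {
    std::lock_guard<std::mutex> lock(mutex_);
    valid_thread_ = std::thread::id();
  }

  // Binds to the calling thread if detached, then reports whether the
  // caller is the bound thread. The result is only useful if checked.
  bool CalledOnValidThread() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (valid_thread_ == std::thread::id())
      valid_thread_ = std::this_thread::get_id();
    return valid_thread_ == std::this_thread::get_id();
  }

 private:
  std::mutex mutex_;
  std::thread::id valid_thread_;
};

// Sketch of a checked entry point: the assertion is what turns the
// call into an actual thread check in debug builds.
inline void PushChecked(SimpleThreadChecker* checker) {
  assert(checker->CalledOnValidThread());
  // ... push data into the FIFO here ...
}
```

With this shape, a checker detached in the constructor binds to whichever thread calls PushChecked() first, and a later call from any other thread makes CalledOnValidThread() return false, so the assertion fires in debug builds.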
