|
|
Created:
8 years, 9 months ago by Primiano Tucci (use gerrit) Modified:
8 years, 8 months ago CC:
chromium-reviews, joi+watch-content_chromium.org, darin-cc_chromium.org, jam, stuartmorgan Base URL:
svn://svn.chromium.org/chrome/trunk/src Visibility:
Public. |
DescriptionSpeech refactoring: Reimplemented speech_recognizer as a FSM (CL1.5)
BUG=116954
TEST=none
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=132179
Patch Set 1 : #Patch Set 2 : Rebased from master. #
Total comments: 92
Patch Set 3 : Fixed according to Satish review + rebase. #Patch Set 4 : Rebased with Issue 9791070 (soon to be master). #Patch Set 5 : Rewrote FSM according to Satish indications. #Patch Set 6 : Minor style fixes. #
Total comments: 22
Patch Set 7 : Reverted switch-style FSM and fixed according to Hans comments. #Patch Set 8 : Rebased from master due to renames in media:: package. #
Total comments: 36
Patch Set 9 : Fixed according to Bulach review. #
Total comments: 6
Patch Set 10 : Fixed according to Satish review. #
Total comments: 10
Patch Set 11 : Fixed according to Bulach review (+ win compile error). #
Total comments: 1
Patch Set 12 : Added CONTENT_EXPORT on GoogleOneShotRemoteEngineConfig to address compilation issues on win. #
Messages
Total messages: 27 (0 generated)
Great start. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... File content/browser/speech/speech_recognizer_impl.cc (right): http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:19: #define UNREACHABLE_CONDITION() do { NOTREACHED(); return state_; } while(0) can this be changed to a method InvalidInput() along the lines of the existing DoNothing() method? You can remove the 'do-while' usage as well http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:30: // SpeechRecognitionEngine does not deliver a result (in reasonable time)? for remote engines, the network connection should timeout automatically so a watchdog is probably not required. Since thats the only one we support now could remove this todo http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:32: // Enables spontaneous transition from WaitingForSpeech to RecognizingSpeech, add newline above http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:34: const bool skipSilenceDetectionForTesting = false; This doesn't seem to be set to true anywhere else so code checking for this runs always now. Is it going to be in a future CL? If so, you can remove it here and add it in that CL http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:91: google_sr_engine->SetConfig(google_sr_config); Is this config ever changed after creating the engine? If not, can we pass it as an arg to the constructor instead of a separate method? http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:113: caller_id_(caller_id), this initializer list should be in the same order as in the class declaration (.h file). I think the Clang builder will turn red otherwise. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:128: SpeechRecognizerImpl::~SpeechRecognizerImpl() { add a DCHECK to verify you are in a valid (idle?) state when destroyed http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:137: // Imagine what would happen if a Start has been enqueued from another thread 137-145 looks like a scare tactic :) and could be removed. 134-136 is quite explanatory http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:151: this, kStartRequest, args)); could make it simple by replacing 'args' with 'FSMEventArgs()' http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:178: return state_ >= kStartingRecognition && state_ <= kRecognizingSpeech; Would checking for audio_controller_ != NULL be more authoritative? http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:197: args.audio_data = new AudioChunk(data, static_cast<size_t>(size), add a comment here that the event handler takes ownership http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:198: kNumBitsPerAudioSample / 8); since we are assuming kNumBitsPerAudioSample as a multiple of 8, can you add a COMPILE_ASSERT at the top to validate this assumption? If it ever gets changed to something else (e.g. 15 bits per sample) this assert would catch it http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:206: FSMEvent event = kRecognitionResult; can this value be passed directly to the base::Bind call below? http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:216: FSMEvent event = kRecognitionError; ditto http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:230: // Event dispatching must be sequential, otherwise it will break all the rules add newline above full length comments such as these http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:232: DCHECK_EQ(event_dispatch_nesting_level_, 0); could be clearer if this variable was a bool such as 'in_dispatch_event_', set to true here and to false at the end of the method http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:234: // Guard against the delegate freeing us until we finish processing the event. ditto http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:237: event_ = event; These look a bit dangerous as they are invalid after this method returns. Can you pass them as arguments and not have as member variables? http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:242: // The audio pipeline must be processed before the ProcessEvent, otherwise it add newline above http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:247: if (args.audio_data) this cleanup should be part of the FSMEventArgs destructor. Better still you can make audio_data into a scoped_ptr so a custom destructor isn't required http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:264: const bool always = true; remove this as its used only in the next line http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:265: const bool route_audio_to_clipper = always; only use 1 space on either side of = and && operators, we don't align RHS of assignments across multiple lines like here http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:272: AudioChunk& recorded_audio_data = *(event_args_->audio_data); use "const AudioChunk&" http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:277: clipper_detected_clip_ = DetectClipping(recorded_audio_data); clipper_detected_clip_ is set here and used in UpdateSignalAndNoiseLevels() call made a few lines below. So seems like it shouldn't be a member variable, instead move it local here and pass as a parameter to the method below http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:280: endpointer_.ProcessAudio(recorded_audio_data, &rms_); ditto for 'rms_' http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:284: UpdateSignalAndNoiseLevels(rms_); since this is the only method making use of clipping information, seems like DetectClipping should be called above this line and no need for 'route_audio_to_clipper' check http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:292: SpeechRecognizerImpl::FSMState SpeechRecognizerImpl::ProcessEvent( DispatchEvent and ProcessEvent are too similar, please find a more suitable name for this method or merge with DispatchEvent http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:300: case kStartRequest: return InitializeAndStartRecording(); since this is the only valid event in this state, can you rewrite as switch (event) { case kStartRequest: return InitializeAndStartRecording(); default: return DoNothing(); } http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:310: case kAbortRequest: return Abort(); would be simpler to collapse multiple similar handlers: switch (event) { case kAbortRequest: case kStopCaptureRequest: case kRecognitionError: case kAudioError: return Abort(event); case kAudioData: return StartSpeechRecognition(); default: return InvalidInput(); } Same for every other switch below http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:321: case kAbortRequest: return Abort(); hmm, since kAbortRequest, kRecognitionError and kAudioError are always triggering 'return Abort' in every state, may be these can be moved outside at the top and handled in a single place (if you also add a 'default: InvalidInput();' case). Up to you though.. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:395: const int samples_per_packet = kAudioSampleRate * add parentheses around (kAudioSampleRate * ..) / 1000 ? http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:417: // This was the first audio packet recorded, so start a request to the update comment to say that the first audio packet has been received so start the recognition engine http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:422: // TODO(primiano) this is a little hack, since TakeAudioChunk() is already add newline above http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:424: // architectural beauty of this class. But it is the best tradeoff, unless we could remove reference to 'architectural beauty' :) http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:430: SpeechRecognizerImpl::FSMState SpeechRecognizerImpl::EnvironmentEstimation() { this method's name doesn't indicate what it actually does, can you use a more appropriate name? http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:503: bool has_error, const SpeechRecognitionError& error) { can we change 'error' to be a pointer and remove 'has_error' parameter? http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:514: //TODO(primiano) reset the engine? Why, after all? This comment is unclear, please reword if required or remove. Also should the next line remain commented? http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:518: if (state_ > kWaitingForSpeech && state_ < kWaitingFinalResult) would be useful for the unittest to verify that all these callbacks to listener come as expected http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:534: // This is in preparation for future speech recognition functions. remove these commented lines http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:554: return state_; // Just keep the current state. 2 spaces before //, here and other places in this file http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:572: return num_samples_recorded_ * 1000 / kAudioSampleRate; use parenthesis around (num_samples_recorded_ * 1000) / ... http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... File content/browser/speech/speech_recognizer_impl.h (right): http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.h:60: kIdle = 0, enum values should be MACRO_STYLE http://dev.chromium.org/developers/coding-style (in the Naming section) http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.h:70: kAbortRequest = 0, seems like we can drop the 'Request' suffix in these 3 as the rest don't use it http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.h:84: content::SpeechRecognitionError error; change to speech_error http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.h:105: FSMState InitializeAndStartRecording(); rename to StartRecording http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.h:106: FSMState StartSpeechRecognition(); rename to StartRecognitionEngine
http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... File content/browser/speech/speech_recognizer_impl.cc (right): http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:19: #define UNREACHABLE_CONDITION() do { NOTREACHED(); return state_; } while(0) On 2012/03/27 09:47:42, Satish wrote: > can this be changed to a method InvalidInput() along the lines of the existing > DoNothing() method? You can remove the 'do-while' usage as well I used a macro since, in case of bugs/failing DCHECKS, the line number that is pointed out refers to the code in the FSM switch matrix, helping a lot the debugging. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:30: // SpeechRecognitionEngine does not deliver a result (in reasonable time)? On 2012/03/27 09:47:42, Satish wrote: > for remote engines, the network connection should timeout automatically so a > watchdog is probably not required. Since thats the only one we support now could > remove this todo Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:32: // Enables spontaneous transition from WaitingForSpeech to RecognizingSpeech, On 2012/03/27 09:47:42, Satish wrote: > add newline above Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:34: const bool skipSilenceDetectionForTesting = false; On 2012/03/27 09:47:42, Satish wrote: > This doesn't seem to be set to true anywhere else so code checking for this runs > always now. Is it going to be in a future CL? If so, you can remove it here and > add it in that CL Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:91: google_sr_engine->SetConfig(google_sr_config); On 2012/03/27 09:47:42, Satish wrote: > Is this config ever changed after creating the engine? If not, can we pass it as > an arg to the constructor instead of a separate method? It can be changed, so that the recognition engine can be reused avoiding to istantiate a new engine for each request (e.g. the JS changes the .lang property and reissues .Start() on the recognition object). http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:113: caller_id_(caller_id), On 2012/03/27 09:47:42, Satish wrote: > this initializer list should be in the same order as in the class declaration > (.h file). I think the Clang builder will turn red otherwise. Isn't it? (except for non pod fields and pod fields that do not require initialization, thus skipped). http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:128: SpeechRecognizerImpl::~SpeechRecognizerImpl() { On 2012/03/27 09:47:42, Satish wrote: > add a DCHECK to verify you are in a valid (idle?) state when destroyed Hmm, the browser could be closed while a recognition is in progress, which would erroneously fail the DCHECK. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:137: // Imagine what would happen if a Start has been enqueued from another thread On 2012/03/27 09:47:42, Satish wrote: > 137-145 looks like a scare tactic :) and could be removed. 134-136 is quite > explanatory Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:151: this, kStartRequest, args)); On 2012/03/27 09:47:42, Satish wrote: > could make it simple by replacing 'args' with 'FSMEventArgs()' Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:178: return state_ >= kStartingRecognition && state_ <= kRecognizingSpeech; On 2012/03/27 09:47:42, Satish wrote: > Would checking for audio_controller_ != NULL be more authoritative? IMHO, all the decisions related to the evolution of a FSM should refer to its state and not on the side effects caused by previous transitions. However, I agree with the DCHECK and slightly revised the code, making more pedantic checks. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:197: args.audio_data = new AudioChunk(data, static_cast<size_t>(size), On 2012/03/27 09:47:42, Satish wrote: > add a comment here that the event handler takes ownership Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:198: kNumBitsPerAudioSample / 8); On 2012/03/27 09:47:42, Satish wrote: > since we are assuming kNumBitsPerAudioSample as a multiple of 8, can you add a > COMPILE_ASSERT at the top to validate this assumption? If it ever gets changed > to something else (e.g. 15 bits per sample) this assert would catch it Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:206: FSMEvent event = kRecognitionResult; On 2012/03/27 09:47:42, Satish wrote: > can this value be passed directly to the base::Bind call below? It must! honestly don't know why I did pass through that var, probably a rest of some copy/paste. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:216: FSMEvent event = kRecognitionError; On 2012/03/27 09:47:42, Satish wrote: > ditto Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:230: // Event dispatching must be sequential, otherwise it will break all the rules On 2012/03/27 09:47:42, Satish wrote: > add newline above full length comments such as these Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:232: DCHECK_EQ(event_dispatch_nesting_level_, 0); On 2012/03/27 09:47:42, Satish wrote: > could be clearer if this variable was a bool such as 'in_dispatch_event_', set > to true here and to false at the end of the method Right. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:234: // Guard against the delegate freeing us until we finish processing the event. On 2012/03/27 09:47:42, Satish wrote: > ditto Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:237: event_ = event; On 2012/03/27 09:47:42, Satish wrote: > These look a bit dangerous as they are invalid after this method returns. Can > you pass them as arguments and not have as member variables? Mmm what do you mean? They are used only by (private) functions that are uniquely called in DispatchEvent. Perhaps we could add a DCHECK(in_event_processing_) on each of those functions, but I don't know if it is worth. Or if we want to be very paranoid instead of setting the event_args_ pointer to NULL we could set it to an empty (static) object. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:242: // The audio pipeline must be processed before the ProcessEvent, otherwise it On 2012/03/27 09:47:42, Satish wrote: > add newline above Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:247: if (args.audio_data) On 2012/03/27 09:47:42, Satish wrote: > this cleanup should be part of the FSMEventArgs destructor. Better still you can > make audio_data into a scoped_ptr so a custom destructor isn't required AudioChunk is now refcounted and should be destroyed automatically upon return of this function. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:264: const bool always = true; On 2012/03/27 09:47:42, Satish wrote: > remove this as its used only in the next line Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:265: const bool route_audio_to_clipper = always; On 2012/03/27 09:47:42, Satish wrote: > only use 1 space on either side of = and && operators, we don't align RHS of > assignments across multiple lines like here Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:272: AudioChunk& recorded_audio_data = *(event_args_->audio_data); On 2012/03/27 09:47:42, Satish wrote: > use "const AudioChunk&" Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:277: clipper_detected_clip_ = DetectClipping(recorded_audio_data); On 2012/03/27 09:47:42, Satish wrote: > clipper_detected_clip_ is set here and used in UpdateSignalAndNoiseLevels() call > made a few lines below. So seems like it shouldn't be a member variable, instead > move it local here and pass as a parameter to the method below Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:280: endpointer_.ProcessAudio(recorded_audio_data, &rms_); On 2012/03/27 09:47:42, Satish wrote: > ditto for 'rms_' Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:284: UpdateSignalAndNoiseLevels(rms_); On 2012/03/27 09:47:42, Satish wrote: > since this is the only method making use of clipping information, seems like > DetectClipping should be called above this line and no need for > 'route_audio_to_clipper' check Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:292: SpeechRecognizerImpl::FSMState SpeechRecognizerImpl::ProcessEvent( On 2012/03/27 09:47:42, Satish wrote: > DispatchEvent and ProcessEvent are too similar, please find a more suitable name > for this method or merge with DispatchEvent Done. ExecuteTransitionAndGetNextState Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:300: case kStartRequest: return InitializeAndStartRecording(); On 2012/03/27 09:47:42, Satish wrote: > since this is the only valid event in this state, can you rewrite as > switch (event) { > case kStartRequest: > return InitializeAndStartRecording(); > default: > return DoNothing(); > } Hmm IMHO it might introduce bugs if a new event is introduced. In the current implementation is a new event is introduced and is not explicitly processed if will be caught by the final UNREACHABLE_CONDITION. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:310: case kAbortRequest: return Abort(); On 2012/03/27 09:47:42, Satish wrote: > would be simpler to collapse multiple similar handlers: > switch (event) { > case kAbortRequest: > case kStopCaptureRequest: > case kRecognitionError: > case kAudioError: > return Abort(event); > case kAudioData: > return StartSpeechRecognition(); > default: > return InvalidInput(); > } > > Same for every other switch below IMHO it would become more difficult to read (since the order of events will not be the same in all cases) and to maintain. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:321: case kAbortRequest: return Abort(); On 2012/03/27 09:47:42, Satish wrote: > hmm, since kAbortRequest, kRecognitionError and kAudioError are always > triggering 'return Abort' in every state, may be these can be moved outside at > the top and handled in a single place (if you also add a 'default: > InvalidInput();' case). Up to you though.. They are not exactly equivalent since they trigger different behaviors inside Abort. Btw, I would like to keep the current structure: I know that it can be compacted, but I think that its verbosity helps to see all the possible race conditions that might happen, that would be otherwise hidden grouping cases or factoring events. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:395: const int samples_per_packet = kAudioSampleRate * On 2012/03/27 09:47:42, Satish wrote: > add parentheses around (kAudioSampleRate * ..) / 1000 ? Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:417: // This was the first audio packet recorded, so start a request to the On 2012/03/27 09:47:42, Satish wrote: > update comment to say that the first audio packet has been received so start the > recognition engine Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:422: // TODO(primiano) this is a little hack, since TakeAudioChunk() is already On 2012/03/27 09:47:42, Satish wrote: > add newline above Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:424: // architectural beauty of this class. But it is the best tradeoff, unless we On 2012/03/27 09:47:42, Satish wrote: > could remove reference to 'architectural beauty' :) Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:430: SpeechRecognizerImpl::FSMState SpeechRecognizerImpl::EnvironmentEstimation() { On 2012/03/27 09:47:42, Satish wrote: > this method's name doesn't indicate what it actually does, can you use a more > appropriate name? Done. WaitEnvironmentEstimationCompletion http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:503: bool has_error, const SpeechRecognitionError& error) { On 2012/03/27 09:47:42, Satish wrote: > can we change 'error' to be a pointer and remove 'has_error' parameter? Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:514: //TODO(primiano) reset the engine? Why, after all? On 2012/03/27 09:47:42, Satish wrote: > This comment is unclear, please reword if required or remove. Also should the > next line remain commented? Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:518: if (state_ > kWaitingForSpeech && state_ < kWaitingFinalResult) On 2012/03/27 09:47:42, Satish wrote: > would be useful for the unittest to verify that all these callbacks to listener > come as expected Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:534: // This is in preparation for future speech recognition functions. On 2012/03/27 09:47:42, Satish wrote: > remove these commented lines Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:554: return state_; // Just keep the current state. On 2012/03/27 09:47:42, Satish wrote: > 2 spaces before //, here and other places in this file Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.cc:572: return num_samples_recorded_ * 1000 / kAudioSampleRate; On 2012/03/27 09:47:42, Satish wrote: > use parenthesis around (num_samples_recorded_ * 1000) / ... Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... File content/browser/speech/speech_recognizer_impl.h (right): http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.h:60: kIdle = 0, On 2012/03/27 09:47:42, Satish wrote: > enum values should be MACRO_STYLE > http://dev.chromium.org/developers/coding-style (in the Naming section) Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.h:70: kAbortRequest = 0, On 2012/03/27 09:47:42, Satish wrote: > seems like we can drop the 'Request' suffix in these 3 as the rest don't use it I added the suffix because their name should represent an event. Removing that suffix would make them sound as actions, not event. Furthermore, I wonder if any name clash might occur renaming them just to ABORT and START. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.h:84: content::SpeechRecognitionError error; On 2012/03/27 09:47:42, Satish wrote: > change to speech_error Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.h:105: FSMState InitializeAndStartRecording(); On 2012/03/27 09:47:42, Satish wrote: > rename to StartRecording Done. http://codereview.chromium.org/9835049/diff/3001/content/browser/speech/speec... content/browser/speech/speech_recognizer_impl.h:106: FSMState StartSpeechRecognition(); On 2012/03/27 09:47:42, Satish wrote: > rename to StartRecognitionEngine Done.
Reminder: need to slightly refine the FSM adding WAIT_AUDIO_CLOSURE state, in order to deal with recent media:: package change and unit tests.
- Minor style fixes - Added a comment in the code regarding to recent changes of AudioInputController class (Async Close)
I just have some style comments. Also, I'm not that keen on having a state-transition table.. I found the switch-case from before easier to read. http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.cc (right): http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:20: #define BIND(x) base::Bind(&SpeechRecognizerImpl::x, this) Hmm, not super happy about this macro and the use of a state-transition table.. I found the switch-case version much easier to read :/ http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:103: COMPILE_ASSERT((SpeechRecognizerImpl::kNumBitsPerAudioSample & 0x7) == 0, I think using the % operator instead of & would make this more readable. http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:135: // NOTE: all the external events and request should be enqueued (PostTask), even s/request/requests/ ? http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:162: // If you're doing it, probably you have some design issues. i'm not sure this comment adds much.. i think the dcheck says it all http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:225: // represent a problem for the browser itself, since since refcounting protects s/since since/since/ http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:305: ProcessAudioPipeline(*(args.audio_data.get())); I think you can just do ProcessAudioPipeline(*args.audio_data); http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:310: const TransitionFunction& transition = fsm[state_][event]; i liked the switch-case better http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:404: // started and the delegate notifies about the event. s/notifies/notified/ http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:505: listener_->OnRecognitionError(caller_id_, *error); just a thought (maybe for the future).. i wonder what is preferable -- to fire the error event first, or the onsoundend/onaudioend first. http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.h (right): http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.h:99: // Callback called on IO thread by audio_controller->Close(). nit: maybe put a blank line before this, since it's not part of the EventHandler methods. http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl_unittest.cc (right): http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl_unittest.cc:168: float noise_volume) OVERRIDE { is there a space too much here?
Just one reply to Hans's comment earlier.. will review in detail once this is agreed upon http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.cc (right): http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:310: const TransitionFunction& transition = fsm[state_][event]; On 2012/04/02 16:05:59, hans wrote: > i liked the switch-case better I was thinking earlier that a table would be appealing because every case statement in the switch was calling a function and each of those required the same data (event, state, audio data/error, ..). In particular I was thinking we could reduce redundancy in function calls/parameter passing. But the above table along with usage of BIND macro isn't as readable as I hoped. Instead of items being a closure can we use an id for the operation to perform? I see we have a handful of operations so far. enum EventHandler { HANDLER_DO_NOTHING, HANDLER_ABORT, HANDLER_START_RECORDING, ... }; EventHandler g_fsm[STATE_MAX][EVENT_MAX] = { /* STATE_IDLE */ { /* EVENT_ABORT */ HANDLER_DO_NOTHING, /* EVENT_START */ HANDLER_START_RECORDING, /* EVENT_STOP_CAPTURE */ HANDLER_DO_NOTHING, /* EVENT_AUDIO_DATA */ HANDLER_DO_NOTHING, /* EVENT_ENGINE_RESULT */ HANDLER_DO_NOTHING, /* EVENT_ENGINE_ERROR */ HANDLER_DO_NOTHING, /* EVENT_AUDIO_ERROR */ HANDLER_DO_NOTHING, }, /* STATE_STARTING */ { ... } }; and DispatchEvent would have a switch-case for the handler IDs vs. the actual calls with the right set of parameters.
On 2012/04/02 21:57:09, Satish wrote: > On 2012/04/02 16:05:59, hans wrote: > > i liked the switch-case better > > I was thinking earlier that a table would be appealing because every case > statement in the switch was calling a function and each of those required the > same data (event, state, audio data/error, ..). In particular I was thinking we > could reduce redundancy in function calls/parameter passing. Well, it does as far as I see (event args are passed transparently as arguments of each transition function). > But the above table along with usage of BIND macro isn't as readable as I hoped. > Instead of items being a closure can we use an id for the operation to perform? > I see we have a handful of operations so far. > > enum EventHandler { > HANDLER_DO_NOTHING, > HANDLER_ABORT, > HANDLER_START_RECORDING, > ... > }; > EventHandler g_fsm[STATE_MAX][EVENT_MAX] = { > /* STATE_IDLE */ > { > /* EVENT_ABORT */ HANDLER_DO_NOTHING, > /* EVENT_START */ HANDLER_START_RECORDING, > /* EVENT_STOP_CAPTURE */ HANDLER_DO_NOTHING, > /* EVENT_AUDIO_DATA */ HANDLER_DO_NOTHING, > /* EVENT_ENGINE_RESULT */ HANDLER_DO_NOTHING, > /* EVENT_ENGINE_ERROR */ HANDLER_DO_NOTHING, > /* EVENT_AUDIO_ERROR */ HANDLER_DO_NOTHING, > }, > /* STATE_STARTING */ > { > ... > } > }; > > and DispatchEvent would have a switch-case for the handler IDs vs. the actual > calls with the right set of parameters. Honestly I really don't see the benefits of this additional dispatching: we first switched to the look-up table to get rid of switch-es, and now we want to use it just for getting an id and switch-ing over it? Isn't it the same of calling BIND-> HANDLER(FunctionName)? Furthermore wouldn't it make the class more difficult to read (events, transition functions, now handlers, a LUT and a switch) and maintain?
Reverted to switch-style FSM, using always FSMEventArgs (now it embeds also the event in order to have a unique argument) and not a private field for passing event to transition functions. http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.cc (right): http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:20: #define BIND(x) base::Bind(&SpeechRecognizerImpl::x, this) On 2012/04/02 16:05:59, hans wrote: > Hmm, not super happy about this macro and the use of a state-transition table.. > I found the switch-case version much easier to read :/ Reverted to switch-style FSM as agreed. http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:103: COMPILE_ASSERT((SpeechRecognizerImpl::kNumBitsPerAudioSample & 0x7) == 0, On 2012/04/02 16:05:59, hans wrote: > I think using the % operator instead of & would make this more readable. Done. http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:135: // NOTE: all the external events and request should be enqueued (PostTask), even On 2012/04/02 16:05:59, hans wrote: > s/request/requests/ ? Done. http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:162: // If you're doing it, probably you have some design issues. On 2012/04/02 16:05:59, hans wrote: > i'm not sure this comment adds much.. i think the dcheck says it all Agree, removed the last line. http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:225: // represent a problem for the browser itself, since since refcounting protects On 2012/04/02 16:05:59, hans wrote: > s/since since/since/ Done. http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:305: ProcessAudioPipeline(*(args.audio_data.get())); On 2012/04/02 16:05:59, hans wrote: > I think you can just do ProcessAudioPipeline(*args.audio_data); Done. http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:404: // started and the delegate notifies about the event. On 2012/04/02 16:05:59, hans wrote: > s/notifies/notified/ Done. http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:505: listener_->OnRecognitionError(caller_id_, *error); On 2012/04/02 16:05:59, hans wrote: > just a thought (maybe for the future).. i wonder what is preferable -- to fire > the error event first, or the onsoundend/onaudioend first. We should think on the implications that it might have on the end-user. Perhaps the situation will be clearer once we have put all the stuff together (including JS APIs). By now I used this ordering just because it reflects the actual situation (CloseAudioController is called just before these callbacks). http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.h (right): http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.h:99: // Callback called on IO thread by audio_controller->Close(). On 2012/04/02 16:05:59, hans wrote: > nit: maybe put a blank line before this, since it's not part of the EventHandler > methods. Done. http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl_unittest.cc (right): http://codereview.chromium.org/9835049/diff/28001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl_unittest.cc:168: float noise_volume) OVERRIDE { On 2012/04/02 16:05:59, hans wrote: > is there a space too much here? Done.
lgtm
drive-by comments: I'm not 100% familiar with it though, so I'm not sure if it all makes sense :) http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.cc (right): http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:20: #define NOT_FEASIBLE() do { NOTREACHED(); return state_; } while(0) nit: we avoid using macros as much as possible.. this case specifically, this macro requires "state_" to be defined at the call site.. better to just make these two lines explicit where needed.. http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:77: speech::GoogleOneShotRemoteEngineConfig google_sr_config; nit: prefer to call "remote_engine_config" http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:88: speech::GoogleOneShotRemoteEngine* google_sr_engine = nit: remote_engine. also, just to clarify could add a comment: // SpeechRecognizerImpl takes ownership. http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:277: return DoNothing(event_args); I find this is a bit hard to follow.. would it be simpler to just list the event we're interested in this state, and then have a "default:" for everything else? http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:295: return Abort(event_args); ditto here... maybe something like: case EVENT_AUDIO_DATA: return StartRecognitionEngine(event_args); case EVENT_ABORT: case EVENT_STOP_CAPTURE: case EVENT_ENGINE_ERROR: ... return Abort(event_args); default: NOTREACHED(); return ...; http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:395: if (route_to_endpointer) { nit: we normally avoid {} on single line if blocks: if (route_to_endpointer) endpointer_.ProcessAudio(raw_audio, &rms); http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:417: VLOG(1) << "SpeechRecognizerImpl starting audio capture."; nit: DVLOG? http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:441: if (audio_controller_.get() == NULL) { nit: if (!audio_controller_.get()) { http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:477: } else { nit: here, 491 and 500, remove the final "else" block and just do the return in the last line.. http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:508: VLOG(1) << "Concluding recognition"; nit: DVLOG? http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:523: // However, SPEECH_RECOGNITION_ERROR_ABORTED is not caught in UI layers which UI layers? I think it's about the renderers, right? our UI layers (i.e., the browser) doesn't catch any exceptions.. http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:535: const SpeechRecognitionError& error) { can we avoid this overload? http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:568: // This is in preparation for future speech recognition functions. nit: indent http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:576: VLOG(1) << "Got valid result"; nit: DVLOG http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:614: audio_level_ += (level - audio_level_) * kUpSmoothingFactor; nit: you can probably simplify this with: const smoothing_factor = (level > audio_level_) ? kUpSmoothingFactor : kDownSmoothingFactor; audio_level_ += (level - audio_level_) * smoothing_factor; http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.h (right): http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.h:92: FSMEventArgs(FSMEvent event_value); nits: move the constructor / destructor before the members, and add a \n after the dtor.. also, mark the ctor explicit. http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.h:130: void CloseAudioControllerAsynchronously(); nits: we tend to declare all non-virtuals first, then all virtuals afterwards.. http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl_unittest.cc (right): http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl_unittest.cc:110: GoogleOneShotRemoteEngine* sr_engine = nit: // Ownership is transferred to SpeechRecognizerImpl.
Thanks for the review. http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.cc (right): http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:20: #define NOT_FEASIBLE() do { NOTREACHED(); return state_; } while(0) On 2012/04/04 15:38:17, bulach wrote: > nit: we avoid using macros as much as possible.. this case specifically, this > macro requires "state_" to be defined at the call site.. better to just make > these two lines explicit where needed.. Ok. Turned into a regular function like the others. http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:77: speech::GoogleOneShotRemoteEngineConfig google_sr_config; On 2012/04/04 15:38:17, bulach wrote: > nit: prefer to call "remote_engine_config" Done. http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:88: speech::GoogleOneShotRemoteEngine* google_sr_engine = On 2012/04/04 15:38:17, bulach wrote: > nit: remote_engine. > > also, just to clarify could add a comment: > // SpeechRecognizerImpl takes ownership. Done. http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:277: return DoNothing(event_args); On 2012/04/04 15:38:17, bulach wrote: > I find this is a bit hard to follow.. > would it be simpler to just list the event we're interested in this state, and > then have a "default:" for everything else? Hmm, the point is that is not obvious whether the default case should be DoNothing or NOT_FEASIBLE. IMHO making all the cases explicit forces to reason (and hopefully react properly) on corner cases that would be otherwise hidden by a default case. http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:295: return Abort(event_args); On 2012/04/04 15:38:17, bulach wrote: > ditto here... > maybe something like: > case EVENT_AUDIO_DATA: > return StartRecognitionEngine(event_args); > case EVENT_ABORT: > case EVENT_STOP_CAPTURE: > case EVENT_ENGINE_ERROR: > ... > return Abort(event_args); > default: > NOTREACHED(); > return ...; Grouped adjacent cases ending with the same action (but I preserved the event order, IMO helps readability). http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:395: if (route_to_endpointer) { On 2012/04/04 15:38:17, bulach wrote: > nit: we normally avoid {} on single line if blocks: > if (route_to_endpointer) > endpointer_.ProcessAudio(raw_audio, &rms); Done. http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:417: VLOG(1) << "SpeechRecognizerImpl starting audio capture."; On 2012/04/04 15:38:17, bulach wrote: > nit: DVLOG? Done. http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:441: if (audio_controller_.get() == NULL) { On 2012/04/04 15:38:17, bulach wrote: > nit: if (!audio_controller_.get()) { Hmm is it strict? I feel to violate my moral and ethic rules doing it :) I see that there are a lot of other code which uses explicit (!/=)= NULL checks. http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:477: } else { On 2012/04/04 15:38:17, bulach wrote: > nit: here, 491 and 500, remove the final "else" block and just do the return in > the last line.. Done. http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:508: VLOG(1) << "Concluding recognition"; On 2012/04/04 15:38:17, bulach wrote: > nit: DVLOG? Done. http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:523: // However, SPEECH_RECOGNITION_ERROR_ABORTED is not caught in UI layers On 2012/04/04 15:38:17, bulach wrote: > which UI layers? I think it's about the renderers, right? our UI layers (i.e., > the browser) doesn't catch any exceptions.. Done. http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:535: const SpeechRecognitionError& error) { On 2012/04/04 15:38:17, bulach wrote: > can we avoid this overload? Hmm I guess it would make more verbose statements as "return AbortWithError(SpeechRecognitionError(...))" http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:568: // This is in preparation for future speech recognition functions. On 2012/04/04 15:38:17, bulach wrote: > nit: indent Done. http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:576: VLOG(1) << "Got valid result"; On 2012/04/04 15:38:17, bulach wrote: > nit: DVLOG Done. http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:614: audio_level_ += (level - audio_level_) * kUpSmoothingFactor; On 2012/04/04 15:38:17, bulach wrote: > nit: you can probably simplify this with: > > const smoothing_factor = (level > audio_level_) ? > kUpSmoothingFactor : kDownSmoothingFactor; > audio_level_ += (level - audio_level_) * smoothing_factor; It was code "inherited" from the original class, btw it makes perfect sense. http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.h (right): http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.h:92: FSMEventArgs(FSMEvent event_value); On 2012/04/04 15:38:17, bulach wrote: > nits: move the constructor / destructor before the members, and add a \n after > the dtor.. also, mark the ctor explicit. Done. http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.h:130: void CloseAudioControllerAsynchronously(); On 2012/04/04 15:38:17, bulach wrote: > nits: we tend to declare all non-virtuals first, then all virtuals afterwards.. Done. http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl_unittest.cc (right): http://codereview.chromium.org/9835049/diff/23012/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl_unittest.cc:110: GoogleOneShotRemoteEngine* sr_engine = On 2012/04/04 15:38:17, bulach wrote: > nit: // Ownership is transferred to SpeechRecognizerImpl. Done.
LGTM from me with the below nits. Happy to have this land once Marcus approves as well http://codereview.chromium.org/9835049/diff/35001/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.h (right): http://codereview.chromium.org/9835049/diff/35001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.h:30: // TODO(primiano) Next CL: Remove the Impl suffix and the exported add newline above full length comments such as this http://codereview.chromium.org/9835049/diff/35001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.h:113: int GetElapsedTimeMs() const; can we separate logically the above methods from below and add comments explaining that the above ones handle the FSM transitions? Also add individual comments for the ones from this line below (e.g. what does GetElapsedTimeMs calculate, what does UpdateSignalAndNoiseLevels update etc) http://codereview.chromium.org/9835049/diff/35001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.h:143: bool in_event_dispatching_; probably reword as 'is_dispatching_event_'
http://codereview.chromium.org/9835049/diff/35001/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.h (right): http://codereview.chromium.org/9835049/diff/35001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.h:30: // TODO(primiano) Next CL: Remove the Impl suffix and the exported On 2012/04/12 08:58:33, Satish wrote: > add newline above full length comments such as this Done. http://codereview.chromium.org/9835049/diff/35001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.h:113: int GetElapsedTimeMs() const; On 2012/04/12 08:58:33, Satish wrote: > can we separate logically the above methods from below and add comments > explaining that the above ones handle the FSM transitions? Also add individual > comments for the ones from this line below (e.g. what does GetElapsedTimeMs > calculate, what does UpdateSignalAndNoiseLevels update etc) Done. http://codereview.chromium.org/9835049/diff/35001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.h:143: bool in_event_dispatching_; On 2012/04/12 08:58:33, Satish wrote: > probably reword as 'is_dispatching_event_' Done.
lgtm thanks! just some nits, feel free to go after addressing them.. http://codereview.chromium.org/9835049/diff/41001/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.cc (right): http://codereview.chromium.org/9835049/diff/41001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:380: float rms = 0; nit: 0.0f http://codereview.chromium.org/9835049/diff/41001/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.h (right): http://codereview.chromium.org/9835049/diff/41001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.h:124: int GetElapsedTimeMs() const; nit: maybe use base::TimeDelta instead of specifying the time unit? http://codereview.chromium.org/9835049/diff/41001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.h:142: void OnAudioClosed(media::AudioInputController*); nit: move this method up before the virtuals http://codereview.chromium.org/9835049/diff/41001/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl_unittest.cc (right): http://codereview.chromium.org/9835049/diff/41001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl_unittest.cc:110: // SpeechRecognizerImpl takes ownership of sr_engine. there was a thread on chromium-dev to un-inline code, even from tests.. maybe do this as a follow up?
http://codereview.chromium.org/9835049/diff/41001/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl_unittest.cc (right): http://codereview.chromium.org/9835049/diff/41001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl_unittest.cc:110: // SpeechRecognizerImpl takes ownership of sr_engine. On 2012/04/12 16:20:16, bulach wrote: > there was a thread on chromium-dev to un-inline code, even from tests.. maybe do > this as a follow up? please don't do this unless you feel strongly inclined to do so (I personally find it clutters things up more). as this isn't in the style guide, it's up to personal preferences
http://codereview.chromium.org/9835049/diff/41001/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.h (right): http://codereview.chromium.org/9835049/diff/41001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.h:32: // /content/public/browser/speech_recognizer.h interface since this class should that'd be great! (the fewer interfaces in content/public the better). i'm curious, how would extensions use this?
http://codereview.chromium.org/9835049/diff/41001/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl_unittest.cc (right): http://codereview.chromium.org/9835049/diff/41001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl_unittest.cc:110: // SpeechRecognizerImpl takes ownership of sr_engine. On 2012/04/12 16:21:50, John Abd-El-Malek wrote: > On 2012/04/12 16:20:16, bulach wrote: > > there was a thread on chromium-dev to un-inline code, even from tests.. maybe > do > > this as a follow up? > > please don't do this unless you feel strongly inclined to do so (I personally > find it clutters things up more). as this isn't in the style guide, it's up to > personal preferences ahn, thanks for the clarification jam! :) it wasn't clear to me if it was personal preference or if we were going to eventually move to that model, glad you clarified.
On 2012/04/12 16:22:59, John Abd-El-Malek wrote: > http://codereview.chromium.org/9835049/diff/41001/content/browser/speech/spee... > File content/browser/speech/speech_recognizer_impl.h (right): > > http://codereview.chromium.org/9835049/diff/41001/content/browser/speech/spee... > content/browser/speech/speech_recognizer_impl.h:32: // > /content/public/browser/speech_recognizer.h interface since this class should > that'd be great! (the fewer interfaces in content/public the better). i'm > curious, how would extensions use this? The plan is that everything within chromium that needs speech recognition should interface exclusively with the SpeechRecognitionManager interface and receive events exclusively through the SpeechRecognitionEventListener interface. All the other classes should be "hidden black magic". I will try adapt later speech input extensions support, which currently use the recognizer directly (and, among the other things, share very similar code with the SpeechRecognitionManager)
On Thu, Apr 12, 2012 at 9:44 AM, <primiano@chromium.org> wrote: > On 2012/04/12 16:22:59, John Abd-El-Malek wrote: > > http://codereview.chromium.**org/9835049/diff/41001/** > content/browser/speech/speech_**recognizer_impl.h<http://codereview.chromium.org/9835049/diff/41001/content/browser/speech/speech_recognizer_impl.h> > >> File content/browser/speech/speech_**recognizer_impl.h (right): >> > > > http://codereview.chromium.**org/9835049/diff/41001/** > content/browser/speech/speech_**recognizer_impl.h#newcode32<http://codereview.chromium.org/9835049/diff/41001/content/browser/speech/speech_recognizer_impl.h#newcode32> > >> content/browser/speech/speech_**recognizer_impl.h:32: // >> /content/public/browser/**speech_recognizer.h interface since this class >> should >> that'd be great! (the fewer interfaces in content/public the better). i'm >> curious, how would extensions use this? >> > > The plan is that everything within chromium that needs speech recognition > should > interface exclusively with the SpeechRecognitionManager interface and > receive > events exclusively through the SpeechRecognitionEventListener interface. > All the > other classes should be "hidden black magic". > I will try adapt later speech input extensions support, which currently > use the > recognizer directly (and, among the other things, share very similar code > with > the SpeechRecognitionManager) > sounds great > > http://codereview.chromium.**org/9835049/<http://codereview.chromium.org/9835... >
http://codereview.chromium.org/9835049/diff/41001/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.cc (right): http://codereview.chromium.org/9835049/diff/41001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.cc:380: float rms = 0; On 2012/04/12 16:20:16, bulach wrote: > nit: 0.0f Done. http://codereview.chromium.org/9835049/diff/41001/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.h (right): http://codereview.chromium.org/9835049/diff/41001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.h:124: int GetElapsedTimeMs() const; On 2012/04/12 16:20:16, bulach wrote: > nit: maybe use base::TimeDelta instead of specifying the time unit? Hmm it is not a true time (not derived by system clock) but an "ideal" measurement based on the amount of audio frames received (which stands on the assumption that audio is captured at the specified rate), thus it is always integral and multiple of a known base. http://codereview.chromium.org/9835049/diff/41001/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.h:142: void OnAudioClosed(media::AudioInputController*); On 2012/04/12 16:20:16, bulach wrote: > nit: move this method up before the virtuals Oops, sorry, did it again! :/ http://codereview.chromium.org/9835049/diff/45002/content/browser/speech/spee... File content/browser/speech/speech_recognizer_impl.h (right): http://codereview.chromium.org/9835049/diff/45002/content/browser/speech/spee... content/browser/speech/speech_recognizer_impl.h:73: STATE_MAX_VALUE = STATE_WAITING_FINAL_RESULT Renamed due to a name clash on windows (STATE_MAX defined in windows.h).
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/primiano@chromium.org/9835049/45002
Try job failure for 9835049-45002 (retry) on win for step "compile" (clobber build). It's a second try, previously, step "compile" failed. http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=win&number...
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/primiano@chromium.org/9835049/48003
Try job failure for 9835049-48003 (retry) on mac_rel for steps "browser_tests, unit_tests". It's a second try, previously, steps "browser_tests, unit_tests" failed. http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=mac_rel&nu...
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/primiano@chromium.org/9835049/48003
Change committed as 132179 |