OLD | NEW |
1 <p id="classSummary"> | 1 <p id="classSummary"> |
2 Use the <code>chrome.experimental.tts</code> module to play synthesized | 2 Use the <code>chrome.experimental.tts</code> module to play synthesized |
3 text-to-speech (TTS) from your extension or packaged app, or to register | 3 text-to-speech (TTS) from your extension or packaged app. |
4 as a speech provider for other extensions and packaged apps that want to speak. | 4 See also the related |
| 5 <a href="experimental.tts_engine.html">experimental.tts_engine</a> |
| 6 module which allows an extension to implement a speech engine. |
5 </p> | 7 </p> |
6 | 8 |
7 <p class="note"><b>Give us feedback:</b> If you have suggestions, | 9 <p class="note"><b>Give us feedback:</b> If you have suggestions, |
8 especially changes that should be made before stabilizing the first | 10 especially changes that should be made before stabilizing the first |
9 version of this API, please send your ideas to the | 11 version of this API, please send your ideas to the |
10 <a href="http://groups.google.com/a/chromium.org/group/chromium-extensions">chromium-extensions</a> | 12 <a href="http://groups.google.com/a/chromium.org/group/chromium-extensions">chromium-extensions</a> |
11 group.</p> | 13 group.</p> |
12 | 14 |
13 <h2 id="overview">Overview</h2> | 15 <h2 id="overview">Overview</h2> |
14 | 16 |
15 <p>To enable this experimental API, visit | 17 <p>To enable this experimental API, visit |
16 <b>chrome://flags</b> and enable <b>Experimental Extension APIs</b>. | 18 <b>chrome://flags</b> and enable <b>Experimental Extension APIs</b>. |
17 | 19 |
18 <p>Chrome provides native support for speech on Windows (using SAPI | 20 <p>Chrome provides native support for speech on Windows (using SAPI |
19 5), Mac OS X, and Chrome OS, using speech synthesis capabilities | 21 5), Mac OS X, and Chrome OS, using speech synthesis capabilities |
20 provided by the operating system. On all platforms, the user can | 22 provided by the operating system. On all platforms, the user can |
21 install extensions that register themselves as alternative speech | 23 install extensions that register themselves as alternative speech |
22 synthesis providers.</p> | 24 engines.</p> |
23 | 25 |
24 <h2 id="generating_speech">Generating speech</h2> | 26 <h2 id="generating_speech">Generating speech</h2> |
25 | 27 |
26 <p>Call <code>speak()</code> from your extension or | 28 <p>Call <code>speak()</code> from your extension or |
27 packaged app to speak. For example:</p> | 29 packaged app to speak. For example:</p> |
28 | 30 |
29 <pre>chrome.experimental.tts.speak('Hello, world.');</pre> | 31 <pre>chrome.experimental.tts.speak('Hello, world.');</pre> |
30 | 32 |
| 33 <p>To stop speaking immediately, just call <code>stop()</code>: |
| 34 |
| 35 <pre>chrome.experimental.tts.stop();</pre> |
| 36 |
31 <p>You can provide options that control various properties of the speech, | 37 <p>You can provide options that control various properties of the speech, |
32 such as its rate, pitch, and more. For example:</p> | 38 such as its rate, pitch, and more. For example:</p> |
33 | 39 |
34 <pre>chrome.experimental.tts.speak('Hello, world.', {'rate': 0.8});</pre> | 40 <pre>chrome.experimental.tts.speak('Hello, world.', {'rate': 2.0});</pre> |
35 | 41 |
36 <p>It's also a good idea to specify the locale so that a synthesizer | 42 <p>It's also a good idea to specify the language so that a synthesizer |
37 supporting that language (and regional dialect, if applicable) is chosen.</p> | 43 supporting that language (and regional dialect, if applicable) is chosen.</p> |
38 | 44 |
39 <pre>chrome.experimental.tts.speak( | 45 <pre>chrome.experimental.tts.speak( |
40 'Hello, world.', | 46 'Hello, world.', {'lang': 'en-US', 'rate': 2.0});</pre> |
41 { | 47 |
42 'locale': 'en-US', | 48 <p>By default, each call to <code>speak()</code> will interrupt any |
43 'rate': 0.8 | 49 ongoing speech and speak immediately. To determine if a call would be |
| 50 interrupting anything, you can call <code>isSpeaking()</code>, or |
| 51 you can use the <code>enqueue</code> option to cause this utterance to |
| 52 be added to a queue of utterances that will be spoken when the current |
| 53 utterance has finished. |
| 54 |
| 55 <pre>chrome.experimental.tts.speak( |
| 56 'Speak this first.'); |
| 57 chrome.experimental.tts.speak( |
| 58 'Speak this next, when the first sentence is done.', {'enqueue': true}); |
| 59 </pre> |
| 60 |
| 61 <p>A complete description of all options can be found in the |
| 62 <a href="#method-speak">speak() method documentation</a> below. |
| 63 Not all speech engines will support all options.</p> |
| 64 |
| 65 <p>To catch errors and make sure you're calling <code>speak()</code> |
| 66 correctly, pass a callback function that takes no arguments. Inside |
| 67 the callback, check |
| 68 <a href="extension.html#property-lastError">chrome.extension.lastError</a> |
| 69 to see if there were any errors.</p> |
| 70 |
| 71 <pre>chrome.experimental.tts.speak( |
| 72 utterance, |
| 73 options, |
| 74 function() { |
| 75 if (chrome.extension.lastError) { |
| 76 console.log('Error: ' + chrome.extension.lastError.message); |
| 77 } |
44 });</pre> | 78 });</pre> |
45 | 79 |
46 <p>Not all speech engines will support all options.</p> | 80 <p>The callback is invoked right away, before the speech engine has started |
| 81 generating speech. The purpose of the callback is to alert you to syntax |
| 82 errors in your use of the TTS API, not all possible errors that might occur |
| 83 in the process of synthesizing and outputting speech. To catch these errors |
| 84 too, you need to use an event listener, described below. |
47 | 85 |
48 <p>You can also pass a callback function that will be called when the | 86 <h2 id="events">Listening to events</h2> |
49 speech has finished. For example, suppose we have an image on our page | |
50 displaying a picture of a face with a closed mouth. We could open the mouth | |
51 while speaking, and close it when done.</p> | |
52 | 87 |
53 <pre>faceImage.src = 'open_mouth.png'; | 88 <p>To get more real-time information about the status of synthesized speech, |
54 chrome.experimental.tts.speak( | 89 pass an event listener in the options to <code>speak()</code>, like this:</p> |
55 'Hello, world.', null, function() { | |
56 faceImage.src = 'closed_mouth.png'; | |
57 }); | |
58 </pre> | |
59 | 90 |
60 <p>To stop speaking immediately, just call <code>stop()</code>. Call | 91 <pre>chrome.experimental.tts.speak( |
61 <code>isSpeaking()</code> to find out if a TTS engine is currently speaking.</p> | 92 utterance, { |
| 93 'onevent': function(event) { |
| 94 console.log('Event ' + event.type + ' at position ' + event.charIndex); |
| 95 if (event.type == 'error') { |
| 96 console.log('Error: ' + event.errorMessage); |
| 97 } |
| 98 } |
| 99 }, |
| 100 callback);</pre> |
62 | 101 |
63 <p>You can check to see if an error occurred by checking | 102 <p>Each event includes an event type, the character index of the current |
64 <code>chrome.extension.lastError</code> inside the callback function.</p> | 103 speech relative to the utterance, and for error events, an optional |
| 104 error message. The event types are:</p> |
| 105 |
| 106 <ul> |
| 107 <li><code>'start'</code>: the engine has started speaking the utterance. |
| 108 <li><code>'word'</code>: a word boundary was reached. Use |
| 109 <code>event.charIndex</code> to determine the current speech |
| 110 position. |
| 111 <li><code>'sentence'</code>: a sentence boundary was reached. Use |
| 112 <code>event.charIndex</code> to determine the current speech |
| 113 position. |
| 114 <li><code>'marker'</code>: an SSML marker was reached. Use |
| 115 <code>event.charIndex</code> to determine the current speech |
| 116 position. |
| 117 <li><code>'end'</code>: the engine has finished speaking the utterance. |
| 118 <li><code>'interrupted'</code>: this utterance was interrupted by another |
| 119 call to <code>speak()</code> or <code>stop()</code> and did not |
| 120 finish. |
| 121 <li><code>'cancelled'</code>: this utterance was cancelled by another |
| 122 call to <code>speak()</code> or <code>stop()</code> and never |
| 123 began to speak at all. |
| 124 <li><code>'error'</code>: An engine-specific error occurred and |
| 125 this utterance cannot be spoken. |
| 126 Check <code>event.errorMessage</code> for details. |
| 127 </ul> |
| 128 |
| 129 <p>Four of the event types, <code>'end'</code>, <code>'interrupted'</code>, |
| 130 <code>'cancelled'</code>, and <code>'error'</code>, are <i>final</i>. After |
| 131 one of those events is received, this utterance will no longer speak and |
| 132 no new events from this utterance will be received.</p> |
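The final-event rule above can be sketched as a small wrapper that reports completion exactly once. This is only an illustration: `speakUntilDone` and `isFinalEvent` are hypothetical helper names, not part of the API, and the `tts` argument is assumed to be an object with the same `speak(utterance, options)` shape as `chrome.experimental.tts`.

```javascript
// Event types the documentation above describes as final.
var FINAL_EVENT_TYPES = ['end', 'interrupted', 'cancelled', 'error'];

// Returns true if no further events can follow an event of this type.
function isFinalEvent(type) {
  return FINAL_EVENT_TYPES.indexOf(type) !== -1;
}

// Hypothetical helper: speaks `utterance` through `tts` and calls
// `onDone(event)` once, when the first final event arrives.
// Any caller-supplied `onevent` listener still receives every event.
function speakUntilDone(tts, utterance, options, onDone) {
  var finished = false;
  var merged = {};
  for (var key in options) {
    if (Object.prototype.hasOwnProperty.call(options, key)) {
      merged[key] = options[key];
    }
  }
  merged.onevent = function(event) {
    if (options && options.onevent) {
      options.onevent(event);
    }
    if (!finished && isFinalEvent(event.type)) {
      finished = true;
      onDone(event);
    }
  };
  tts.speak(utterance, merged);
}

// Possible usage inside an extension (assumes the experimental API is enabled):
// speakUntilDone(chrome.experimental.tts, 'Hello, world.', {'lang': 'en-US'},
//     function(event) { console.log('Finished with: ' + event.type); });
```

A wrapper like this is one way to replace the earlier "callback when speech has finished" pattern, since the `speak()` callback no longer signals completion.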
| 133 |
| 134 <p>Some TTS engines may not support all event types, and some may not even |
| 135 support any events at all. To require that the speech engine used sends |
| 136 the events you're interested in, you can pass a list of event types in |
| 137 the <code>requiredEventTypes</code> member of the options object, or use |
| 138 <code>getVoices</code> to choose a voice that has the events you need. |
| 139 Both are documented below. |
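As a rough sketch of the second approach, the helper below picks the first voice that matches a language and reports every event type the caller needs. `pickVoice` and `voiceSupportsEvents` are hypothetical names, and the code assumes that each voice's `eventTypes` field is an array of event type strings; check the `TtsVoice` documentation before relying on that.

```javascript
// Returns true if `voice` lists every event type in `neededEventTypes`.
// Assumption: voice.eventTypes is an array of strings such as 'word' or 'end'.
function voiceSupportsEvents(voice, neededEventTypes) {
  var available = voice.eventTypes || [];
  for (var i = 0; i < neededEventTypes.length; i++) {
    if (available.indexOf(neededEventTypes[i]) === -1) {
      return false;
    }
  }
  return true;
}

// Hypothetical helper: returns the first voice whose `lang` equals `lang`
// (or any voice if `lang` is empty) and that supports all needed events,
// or null if there is no such voice.
function pickVoice(voices, lang, neededEventTypes) {
  for (var i = 0; i < voices.length; i++) {
    var langMatches = !lang || voices[i].lang === lang;
    if (langMatches && voiceSupportsEvents(voices[i], neededEventTypes)) {
      return voices[i];
    }
  }
  return null;
}

// Usage inside an extension, guarded so the snippet is harmless elsewhere.
if (typeof chrome !== 'undefined' && chrome.experimental &&
    chrome.experimental.tts) {
  chrome.experimental.tts.getVoices(function(voices) {
    var voice = pickVoice(voices, 'en-US', ['word', 'end']);
    console.log(voice ? 'Using ' + voice.voiceName : 'No suitable voice');
  });
}
```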
65 | 140 |
66 <h2 id="ssml">SSML markup</h2> | 141 <h2 id="ssml">SSML markup</h2> |
67 | 142 |
68 <p>Utterances used in this API may include markup using the | 143 <p>Utterances used in this API may include markup using the |
69 <a href="http://www.w3.org/TR/speech-synthesis">Speech Synthesis Markup | 144 <a href="http://www.w3.org/TR/speech-synthesis">Speech Synthesis Markup |
70 Language (SSML)</a>. For example: | 145 Language (SSML)</a>. If you use SSML, the first argument to |
| 146 <code>speak()</code> should be a complete SSML document with an XML |
| 147 header and a top-level <code><speak></code> tag, not a document |
| 148 fragment. |
71 | 149 |
72 <pre>chrome.experimental.tts.speak('The <emphasis>second</emphasis> word of this sentence was emphasized.');</pre> | 150 For example: |
| 151 |
| 152 <pre>chrome.experimental.tts.speak( |
| 153 '<?xml version="1.0"?>' + |
| 154 '<speak>' + |
| 155 ' The <emphasis>second</emphasis> ' + |
| 156 ' word of this sentence was emphasized.' + |
| 157 '</speak>');</pre> |
73 | 158 |
74 <p>Not all speech engines will support all SSML tags, and some may not support | 159 <p>Not all speech engines will support all SSML tags, and some may not support |
75 SSML at all, but all engines are expected to ignore any SSML they don't | 160 SSML at all, but all engines are required to ignore any SSML they don't |
76 support and still speak the underlying text.</p> | 161 support and still speak the underlying text.</p> |
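When the spoken text comes from user data, it may help to build the SSML document programmatically so that characters such as `<` and `&` do not break the markup. The sketch below is one way to do that; `escapeXml` and `buildSsml` are hypothetical helpers, not part of the API, and the output follows the document shape described above (XML header plus a top-level `<speak>` tag).

```javascript
// Escapes the characters that are significant in XML text content.
function escapeXml(text) {
  return String(text)
      .replace(/&/g, '&amp;')
      .replace(/</g, '&lt;')
      .replace(/>/g, '&gt;');
}

// Hypothetical helper: builds a complete SSML document in which
// `emphasized` is wrapped in an <emphasis> tag between two plain-text parts.
function buildSsml(before, emphasized, after) {
  return '<?xml version="1.0"?>' +
      '<speak>' +
      escapeXml(before) +
      '<emphasis>' + escapeXml(emphasized) + '</emphasis>' +
      escapeXml(after) +
      '</speak>';
}

// Possible usage inside an extension:
// chrome.experimental.tts.speak(
//     buildSsml('The ', 'second', ' word of this sentence was emphasized.'));
```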
77 | 162 |
78 <h2 id="provider">Implementing a speech provider</h2> | 163 <h2 id="choosing_voice">Choosing a voice</h2> |
79 | 164 |
80 <p>An extension can register itself as a speech provider. By doing so, it | 165 <p>By default, Chrome will choose the most appropriate voice for each |
81 can intercept some or all calls to functions such as | 166 utterance you want to speak, based on the language and gender. On most |
82 <code>speak()</code> and <code>stop()</code> and provide an alternate | 167 Windows, Mac OS X, and Chrome OS systems, speech synthesis provided by |
83 implementation. Extensions are free to use any available web technology | 168 the operating system should be able to speak any text in at least one |
84 to provide speech, including streaming audio from a server, HTML5 audio, | 169 language. Some users may have a variety of voices available, though, |
85 Native Client, or Flash. An extension could even do something different | 170 from their operating system and from speech engines implemented by other |
86 with the utterances, like display closed captions in a pop-up window or | 171 Chrome extensions. In those cases, you can implement custom code to choose |
87 send them as log messages to a remote server.</p> | 172 the appropriate voice, or present the user with a list of choices.</p> |
88 | 173 |
89 <p>To provide TTS, an extension must first declare all voices it provides | 174 <p>To get a list of all voices, call <code>getVoices()</code> and pass it |
90 in the extension manifest, like this:</p> | 175 a function that receives an array of <code>TtsVoice</code> objects as its |
| 176 argument:</p> |
91 | 177 |
92 <pre>{ | 178 <pre>chrome.experimental.tts.getVoices( |
93 "name": "My TTS Provider", | 179 function(voices) { |
94 "version": "1.0", | 180 for (var i = 0; i < voices.length; i++) { |
95 <b>"permissions": ["experimental"] | 181 console.log('Voice ' + i + ':'); |
96 "tts": { | 182 console.log(' name: ' + voices[i].voiceName); |
97 "voices": [ | 183 console.log(' lang: ' + voices[i].lang); |
98 { | 184 console.log(' gender: ' + voices[i].gender); |
99 "voiceName": "Alice", | 185 console.log(' extension id: ' + voices[i].extensionId); |
100 "locale": "en-US", | 186 console.log(' event types: ' + voices[i].eventTypes); |
101 "gender": "female" | |
102 }, | |
103 { | |
104 "voiceName": "Pat", | |
105 "locale": "en-US" | |
106 } | 187 } |
107 ] | 188 });</pre> |
108 },</b> | |
109 "background_page": "background.html", | |
110 }</pre> | |
111 | |
112 <p>An extension can specify any number of voices. The three | |
113 parameters—<code>voiceName</code>, <code>locale</code>, | |
114 and <code>gender</code>—are all optional. If they are all unspecified, | |
115 the extension will handle all speech from all clients. If any of them | |
116 are specified, they can be used to filter speech requests. For | |
117 example, if a voice only supports French, it should set the locale to | |
118 'fr' (or something more specific like 'fr-FR') so that only utterances | |
119 in that locale are routed to that extension.</p> | |
120 | |
121 <p>To handle speech calls, the extension should register listeners | |
122 for <code>onSpeak</code> and <code>onStop</code>, like this:</p> | |
123 | |
124 <pre>var speakListener = function(utterance, options, callback) { | |
125 ... | |
126 callback(); | |
127 }; | |
128 var stopListener = function() { | |
129 ... | |
130 }; | |
131 chrome.experimental.tts.onSpeak.addListener(speakListener); | |
132 chrome.experimental.tts.onStop.addListener(stopListener);</pre> | |
133 | |
134 <p class="warning"><b>Important:</b> Don't forget to call the callback | |
135 function from your speak listener!</p> | |
136 | |
137 <p>If an extension does not register listeners for both | |
138 <code>onSpeak</code> and <code>onStop</code>, it will not intercept any | |
139 speech calls, regardless of what is in the manifest. | |
140 | |
141 <p>The decision of whether or not to send a given speech request to an | |
142 extension is based solely on whether the extension supports the given voice | |
143 parameters in its manifest and has registered listeners | |
144 for <code>onSpeak</code> and <code>onStop</code>. In other words, | |
145 there's no way for an extension to receive a speech request and | |
146 dynamically decide whether to handle it or not.</p> | |