OLD | NEW |
1 <?xml version="1.0" encoding="US-ASCII"?> | 1 <?xml version="1.0" encoding="US-ASCII"?> |
2 <!DOCTYPE rfc SYSTEM "rfc2629.dtd"> | 2 <!DOCTYPE rfc SYSTEM "rfc2629.dtd"> |
3 <?rfc toc="yes"?> | 3 <?rfc toc="yes"?> |
4 <?rfc tocompact="yes"?> | 4 <?rfc tocompact="yes"?> |
5 <?rfc tocdepth="3"?> | 5 <?rfc tocdepth="3"?> |
6 <?rfc tocindent="yes"?> | 6 <?rfc tocindent="yes"?> |
7 <?rfc symrefs="yes"?> | 7 <?rfc symrefs="yes"?> |
8 <?rfc sortrefs="yes"?> | 8 <?rfc sortrefs="yes"?> |
9 <?rfc comments="yes"?> | 9 <?rfc comments="yes"?> |
10 <?rfc inline="yes"?> | 10 <?rfc inline="yes"?> |
11 <?rfc compact="yes"?> | 11 <?rfc compact="yes"?> |
12 <?rfc subcompact="no"?> | 12 <?rfc subcompact="no"?> |
13 <rfc category="std" docName="draft-valin-codec-opus-update-00" | 13 <rfc category="std" docName="draft-ietf-codec-opus-update-01" |
14 ipr="trust200902"> | 14 ipr="trust200902"> |
15 <front> | 15 <front> |
16 <title abbrev="Opus Update">Updates to the Opus Audio Codec</title> | 16 <title abbrev="Opus Update">Updates to the Opus Audio Codec</title> |
17 | 17 |
18 <author initials="JM" surname="Valin" fullname="Jean-Marc Valin"> | 18 <author initials="JM" surname="Valin" fullname="Jean-Marc Valin"> |
19 <organization>Mozilla Corporation</organization> | 19 <organization>Mozilla Corporation</organization> |
20 <address> | 20 <address> |
21 <postal> | 21 <postal> |
22 <street>650 Castro Street</street> | 22 <street>331 E. Evelyn Avenue</street> |
23 <city>Mountain View</city> | 23 <city>Mountain View</city> |
24 <region>CA</region> | 24 <region>CA</region> |
25 <code>94041</code> | 25 <code>94041</code> |
26 <country>USA</country> | 26 <country>USA</country> |
27 </postal> | 27 </postal> |
28 <phone>+1 650 903-0800</phone> | 28 <phone>+1 650 903-0800</phone> |
29 <email>jmvalin@jmvalin.ca</email> | 29 <email>jmvalin@jmvalin.ca</email> |
30 </address> | 30 </address> |
31 </author> | 31 </author> |
32 | 32 |
33 <author initials="T." surname="Terriberry" fullname="Timothy B. Terriberry"> | |
34 <organization>Mozilla Corporation</organization> | |
35 <address> | |
36 <postal> | |
37 <street>650 Castro Street</street> | |
38 <city>Mountain View</city> | |
39 <region>CA</region> | |
40 <code>94041</code> | |
41 <country>USA</country> | |
42 </postal> | |
43 <phone>+1 650 903-0800</phone> | |
44 <email>tterriberry@mozilla.com</email> | |
45 </address> | |
46 </author> | |
47 | |
48 <author initials="K." surname="Vos" fullname="Koen Vos"> | 33 <author initials="K." surname="Vos" fullname="Koen Vos"> |
49 <organization>Skype Technologies S.A.</organization> | 34 <organization>vocTone</organization> |
50 <address> | 35 <address> |
51 <postal> | 36 <postal> |
52 <street>Soder Malarstrand 43</street> | 37 <street></street> |
53 <city>Stockholm</city> | 38 <city></city> |
54 <region></region> | 39 <region></region> |
55 <code>11825</code> | 40 <code></code> |
56 <country>SE</country> | 41 <country></country> |
57 </postal> | 42 </postal> |
58 <phone>+46 73 085 7619</phone> | 43 <phone></phone> |
59 <email>koen.vos@skype.net</email> | 44 <email>koenvos74@gmail.com</email> |
60 </address> | 45 </address> |
61 </author> | 46 </author> |
62 | 47 |
63 | 48 |
64 | 49 |
65 <date day="12" month="July" year="2013" /> | 50 <date day="4" month="September" year="2014" /> |
66 | 51 |
67 <abstract> | 52 <abstract> |
68 <t>This document addresses minor issues that were found in the specification | 53 <t>This document addresses minor issues that were found in the specification |
69 of the Opus audio codec in <xref target="RFC6716">RFC 6716</xref>.</t> | 54 of the Opus audio codec in <xref target="RFC6716">RFC 6716</xref>.</t> |
70 </abstract> | 55 </abstract> |
71 </front> | 56 </front> |
72 | 57 |
73 <middle> | 58 <middle> |
74 <section title="Introduction"> | 59 <section title="Introduction"> |
75 <t>This document address minor issues that were discovered in the reference | 60 <t>This document addresses minor issues that were discovered in the reference |
76 implementation of the Opus codec that serves as the specification in | 61 implementation of the Opus codec that serves as the specification in |
77 <xref target="RFC6716">RFC 6716</xref>. Only issues affecting the decoder are | 62 <xref target="RFC6716">RFC 6716</xref>. Only issues affecting the decoder are |
78 listed here. An up-to-date implementation of the Opus encoder can be found at | 63 listed here. An up-to-date implementation of the Opus encoder can be found at |
79 http://opus-codec.org/. The updated specification remains fully compatible with | 64 http://opus-codec.org/. The updated specification remains fully compatible with |
80 the original specification and only one of the changes results in any difference | 65 the original specification and only one of the changes results in any difference |
81 in the audio output. | 66 in the audio output. |
82 </t> | 67 </t> |
83 </section> | 68 </section> |
84 | 69 |
85 <section title="Terminology"> | 70 <section title="Terminology"> |
86 <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | 71 <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", |
87 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | 72 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this |
88 document are to be interpreted as described in <xref | 73 document are to be interpreted as described in <xref |
89 target="RFC2119">RFC 2119</xref>.</t> | 74 target="RFC2119">RFC 2119</xref>.</t> |
90 </section> | 75 </section> |
91 | 76 |
92 <section title="Stereo State Reset in SILK"> | 77 <section title="Stereo State Reset in SILK"> |
93 <t>The reference implementation does not reinitialize the stereo state | 78 <t>The reference implementation does not reinitialize the stereo state |
94 during a mode switch. The old stereo memory can produce a brief impulse | 79 during a mode switch. The old stereo memory can produce a brief impulse |
95 (i.e. single sample) in the decoded audio. This can be fixed by changing | 80 (i.e. single sample) in the decoded audio. This can be fixed by changing |
96 silk/dec_API.c at line 72: | 81 silk/dec_API.c at line 72: |
97 <figure> | 82 <figure> |
98 <artwork><![CDATA[ | 83 <artwork><![CDATA[ |
99 for( n = 0; n < DECODER_NUM_CHANNELS; n++ ) { | 84 for( n = 0; n < DECODER_NUM_CHANNELS; n++ ) { |
100 ret = silk_init_decoder( &channel_state[ n ] ); | 85 ret = silk_init_decoder( &channel_state[ n ] ); |
101 } | 86 } |
102 + silk_memset(&((silk_decoder *)decState)->sStereo, 0, | 87 + silk_memset(&((silk_decoder *)decState)->sStereo, 0, |
103 + sizeof(((silk_decoder *)decState)->sStereo)); | 88 + sizeof(((silk_decoder *)decState)->sStereo)); |
104 + /* Not strictly needed, but it's cleaner that way */ | 89 + /* Not strictly needed, but it's cleaner that way */ |
105 + ((silk_decoder *)decState)->prev_decode_only_middle = 0; | 90 + ((silk_decoder *)decState)->prev_decode_only_middle = 0; |
106 | 91 |
107 return ret; | 92 return ret; |
108 } | 93 } |
109 ]]></artwork> | 94 ]]></artwork> |
110 </figure> | 95 </figure> |
111 This change affects the normative part of the decoder. Fortunately, | 96 This change affects the normative part of the decoder. Fortunately, |
112 the modified decoder is still compliant with the original specification because | 97 the modified decoder is still compliant with the original specification because |
(...skipping 27 matching lines...) |
140 - len -= padding; | 125 - len -= padding; |
141 } | 126 } |
142 ]]></artwork> | 127 ]]></artwork> |
143 </figure> | 128 </figure> |
144 </t> | 129 </t> |
145 <t>This packet parsing issue is limited to reading memory up | 130 <t>This packet parsing issue is limited to reading memory up |
146 to about 60 kB beyond the compressed buffer. This can only be triggered | 131 to about 60 kB beyond the compressed buffer. This can only be triggered |
147 by a compressed packet more than about 16 MB long, so it's not a problem | 132 by a compressed packet more than about 16 MB long, so it's not a problem |
148 for RTP. In theory, it <spanx style="emph">could</spanx> crash a file | 133 for RTP. In theory, it <spanx style="emph">could</spanx> crash a file |
149 decoder (e.g. Opus in Ogg) if the memory just after the incoming packet | 134 decoder (e.g. Opus in Ogg) if the memory just after the incoming packet |
150 is out-of-range, but that could not be achieved when attempted in a production | 135 is out-of-range, but our attempts to trigger such a crash in a production |
151 application built using an affected version of the Opus decoder.</t> | 136 application built using an affected version of the Opus decoder failed.</t> |
152 </section> | 137 </section> |
153 | 138 |
154 <section anchor="resampler" title="Resampler buffer"> | 139 <section anchor="resampler" title="Resampler buffer"> |
155 <t>The SILK resampler had the following issues: | 140 <t>The SILK resampler had the following issues: |
156 <list style="numbers"> | 141 <list style="numbers"> |
157 <t>The calls to memcpy() were using sizeof(opus_int32), but the type of the | 142 <t>The calls to memcpy() were using sizeof(opus_int32), but the type of the |
158 local buffer was opus_int16.</t> | 143 local buffer was opus_int16.</t> |
159 <t>Because the size was wrong, this potentially allowed the source | 144 <t>Because the size was wrong, this potentially allowed the source |
160 and destination regions of the memcpy overlap. | 145 and destination regions of the memcpy() to overlap. |
161 We <spanx style="emph">believe</spanx> that nSamplesIn is at least fs_in_khZ, | 146 We <spanx style="emph">believe</spanx> that nSamplesIn is at least fs_in_khZ, |
162 which is at least 8. | 147 which is at least 8. |
163 Since RESAMPLER_ORDER_FIR_12 is only 8,that should not be a problem once | 148 Since RESAMPLER_ORDER_FIR_12 is only 8, that should not be a problem once |
164 the type size is fixed.</t> | 149 the type size is fixed.</t> |
165 <t>The size of the buffer used RESAMPLER_MAX_BATCH_SIZE_IN, but the | 150 <t>The size of the buffer used RESAMPLER_MAX_BATCH_SIZE_IN, but the |
166 data stored in it was actually _twice_ the input batch size | 151 data stored in it was actually _twice_ the input batch size |
167 (nSamplesIn<<1).</t> | 152 (nSamplesIn<<1).</t> |
168 </list></t> | 153 </list></t> |
169 <t> | 154 <t> |
170 The fact that the code never produced any error in testing (including when run under the | 155 The fact that the code never produced any error in testing (including when run under the |
171 Valgrind memory debugger), suggests that in practice | 156 Valgrind memory debugger), suggests that in practice |
172 the batch sizes are reasonable enough that none of the issues above | 157 the batch sizes are reasonable enough that none of the issues above |
173 was ever a problem. However, proving that is non-obvious. | 158 was ever a problem. However, proving that is non-obvious. |
174 </t> | 159 </t> |
175 <t>The code can be fixed by applying the following changes to like 70 of silk/resampler_private_IIR_FIR.c: | 160 <t>The code can be fixed by applying the following changes to line 70 of silk/resampler_private_IIR_FIR.c: |
176 <figure> | 161 <figure> |
177 <artwork><![CDATA[ | 162 <artwork><![CDATA[ |
178 opus_int16 out[], /* O Output signal */ | |
179 const opus_int16 in[], /* I Input signal */ | |
180 opus_int32 inLen /* I Number of input samples */ | |
181 ) | 163 ) |
182 { | 164 { |
183 silk_resampler_state_struct *S = (silk_resampler_state_struct *)SS; | 165 silk_resampler_state_struct *S = \ |
| 166 (silk_resampler_state_struct *)SS; |
184 opus_int32 nSamplesIn; | 167 opus_int32 nSamplesIn; |
185 opus_int32 max_index_Q16, index_increment_Q16; | 168 opus_int32 max_index_Q16, index_increment_Q16; |
186 - opus_int16 buf[ RESAMPLER_MAX_BATCH_SIZE_IN + RESAMPLER_ORDER_FIR_12 ]; | 169 - opus_int16 buf[ RESAMPLER_MAX_BATCH_SIZE_IN + \ |
187 + opus_int16 buf[ 2*RESAMPLER_MAX_BATCH_SIZE_IN + RESAMPLER_ORDER_FIR_12 ]; | 170 RESAMPLER_ORDER_FIR_12 ]; |
| 171 + opus_int16 buf[ 2*RESAMPLER_MAX_BATCH_SIZE_IN + \ |
| 172 RESAMPLER_ORDER_FIR_12 ]; |
188 | 173 |
189 /* Copy buffered samples to start of buffer */ | 174 /* Copy buffered samples to start of buffer */ |
190 - silk_memcpy( buf, S->sFIR, RESAMPLER_ORDER_FIR_12 * sizeof( opus_int32 ) ); | 175 - silk_memcpy( buf, S->sFIR, RESAMPLER_ORDER_FIR_12 \ |
191 + silk_memcpy( buf, S->sFIR, RESAMPLER_ORDER_FIR_12 * sizeof( opus_int16 ) ); | 176 * sizeof( opus_int32 ) ); |
| 177 + silk_memcpy( buf, S->sFIR, RESAMPLER_ORDER_FIR_12 \ |
| 178 * sizeof( opus_int16 ) ); |
192 | 179 |
193 /* Iterate over blocks of frameSizeIn input samples */ | 180 /* Iterate over blocks of frameSizeIn input samples */ |
194 index_increment_Q16 = S->invRatio_Q16; | 181 index_increment_Q16 = S->invRatio_Q16; |
195 while( 1 ) { | 182 while( 1 ) { |
196 nSamplesIn = silk_min( inLen, S->batchSize ); | 183 nSamplesIn = silk_min( inLen, S->batchSize ); |
197 | 184 |
198 /* Upsample 2x */ | 185 /* Upsample 2x */ |
199 silk_resampler_private_up2_HQ( S->sIIR, &buf[ RESAMPLER_ORDER_FIR_12 ], in, nSamplesIn ); | 186 silk_resampler_private_up2_HQ( S->sIIR, &buf[ \ |
| 187 RESAMPLER_ORDER_FIR_12 ], in, nSamplesIn ); |
200 | 188 |
201 max_index_Q16 = silk_LSHIFT32( nSamplesIn, 16 + 1 ); /* + 1 because 2x upsampling */ | 189 max_index_Q16 = silk_LSHIFT32( nSamplesIn, 16 + 1 \ |
202 out = silk_resampler_private_IIR_FIR_INTERPOL( out, buf, max_index_Q16, index_increment_Q16 ); | 190 ); /* + 1 because 2x upsampling */ |
| 191 out = silk_resampler_private_IIR_FIR_INTERPOL( out, \ |
| 192 buf, max_index_Q16, index_increment_Q16 ); |
203 in += nSamplesIn; | 193 in += nSamplesIn; |
204 inLen -= nSamplesIn; | 194 inLen -= nSamplesIn; |
205 | 195 |
206 if( inLen > 0 ) { | 196 if( inLen > 0 ) { |
207 /* More iterations to do; copy last part of filtered signal to beginning of buffer */ | 197 /* More iterations to do; copy last part of \ |
208 - silk_memcpy( buf, &buf[ nSamplesIn << 1 ], RESAMPLER_ORDER_FIR_12 * sizeof( opus_int32 ) ); | 198 filtered signal to beginning of buffer */ |
209 + silk_memmove( buf, &buf[ nSamplesIn << 1 ], RESAMPLER_ORDER_FIR_12 * sizeof( opus_int16 ) ); | 199 - silk_memcpy( buf, &buf[ nSamplesIn << 1 ], \ |
| 200 RESAMPLER_ORDER_FIR_12 * sizeof( opus_int32 ) ); |
| 201 + silk_memmove( buf, &buf[ nSamplesIn << 1 ], \ |
| 202 RESAMPLER_ORDER_FIR_12 * sizeof( opus_int16 ) ); |
210 } else { | 203 } else { |
211 break; | 204 break; |
212 } | 205 } |
213 } | 206 } |
214 | 207 |
215 /* Copy last part of filtered signal to the state for the next call */ | 208 /* Copy last part of filtered signal to the state for \ |
216 - silk_memcpy( S->sFIR, &buf[ nSamplesIn << 1 ], RESAMPLER_ORDER_FIR_12 * sizeof( opus_int32 ) ); | 209 the next call */ |
217 + silk_memcpy( S->sFIR, &buf[ nSamplesIn << 1 ], RESAMPLER_ORDER_FIR_12 * sizeof( opus_int16 ) ); | 210 - silk_memcpy( S->sFIR, &buf[ nSamplesIn << 1 ], \ |
| 211 RESAMPLER_ORDER_FIR_12 * sizeof( opus_int32 ) ); |
| 212 + silk_memcpy( S->sFIR, &buf[ nSamplesIn << 1 ], \ |
| 213 RESAMPLER_ORDER_FIR_12 * sizeof( opus_int16 ) ); |
218 } | 214 } |
219 ]]></artwork> | 215 ]]></artwork> |
220 </figure> | 216 </figure> |
| 217 Note: due to RFC formatting conventions, lines exceeding the column width |
| 218 in the patch above are split using a backslash character. The backslashes |
| 219 at the end of a line and the white space at the beginning |
| 220 of the following line are not part of the patch. A properly formatted patch |
| 221 including the three changes above is available at |
| 222 <eref target="http://jmvalin.ca/misc_stuff/opus_update.patch"/>. |
221 </t> | 223 </t> |
222 </section> | 224 </section> |
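The overlap hazard described in the resampler section can be illustrated in isolation. The following is a minimal sketch, not code from the reference implementation: `ORDER` and `carry_history()` are hypothetical stand-ins for RESAMPLER_ORDER_FIR_12 and the in-loop copy, and the buffer layout (history samples followed by 2*n_in upsampled samples) is assumed from the patch above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for RESAMPLER_ORDER_FIR_12 (8 in the text). */
#define ORDER 8

/* Move the last ORDER samples of the filtered block to the start of buf.
 * buf is assumed to hold ORDER history samples followed by 2*n_in
 * upsampled samples, so the tail starts at index 2*n_in. */
static void carry_history(int16_t *buf, int n_in)
{
    assert(buf != NULL && n_in >= 1);
    /* Source and destination overlap whenever 2*n_in < ORDER, so
     * memmove() is used; memcpy() is undefined for overlapping regions.
     * The byte count is taken from the element type of buf itself, so
     * it cannot drift out of sync with the declared sample type. */
    memmove(buf, &buf[n_in << 1], ORDER * sizeof(buf[0]));
}
```

Deriving the size from `sizeof(buf[0])` rather than naming a type is one way to avoid the opus_int32/opus_int16 mismatch; the patch itself names the type explicitly.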
223 | 225 |
224 <section title="Downmix to Mono"> | 226 <section title="Downmix to Mono"> |
225 <t>The last issue is not strictly a bug, but it is an issue that has been reported | 227 <t>The last issue is not strictly a bug, but it is an issue that has been reported |
226 when downmixing Opus decoded stream to mono, whether this is done inside the decoder | 228 when downmixing an Opus decoded stream to mono, whether this is done inside the decoder |
227 or as a post-processing on the stereo decoder output. Opus intensity stereo allows | 229 or as a post-processing step on the stereo decoder output. Opus intensity stereo allows |
228 optionally coding the two channels 180-degrees out of phase on a per-band basis. | 230 optionally coding the two channels 180-degrees out of phase on a per-band basis. |
229 This provides better stereo quality than forcing the two channels to be in phase, | 231 This provides better stereo quality than forcing the two channels to be in phase, |
230 but when the output is downmixed to mono, the energy in the affected bands is cancelled | 232 but when the output is downmixed to mono, the energy in the affected bands is cancelled |
231 sometimes resulting in audible artefacts. | 233 sometimes resulting in audible artefacts. |
232 </t> | 234 </t> |
233 <t>A possible work-around for this issue would be to optionally allow the decoder to | 235 <t>As a work-around for this issue, the decoder MAY choose not to apply the 180-degree |
234 not apply the 180-degree phase shift when the output is meant to be downmixed (inside or | 236 phase shift when the output is meant to be downmixed (inside or |
236 </t> | 238 </t> |
237 </section> | 239 </section> |
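The cancellation described in the downmix section can be seen with a toy time-domain example. This is only a sketch under simplifying assumptions: `downmix_to_mono()` is a hypothetical helper, and the whole signal is inverted here, whereas the codec applies the 180-degree shift per band in the frequency domain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: average two channels into one mono channel. */
static void downmix_to_mono(const float *left, const float *right,
                            float *mono, size_t n)
{
    assert(left != NULL && right != NULL && mono != NULL);
    for (size_t i = 0; i < n; i++) {
        /* If right[i] == -left[i] (180 degrees out of phase), the sum
         * is zero and the energy of that signal is lost. */
        mono[i] = 0.5f * (left[i] + right[i]);
    }
}
```

Feeding the helper a right channel that is the negation of the left yields silence, while feeding it two in-phase copies preserves the signal, which is the effect the work-around relies on.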
238 <section anchor="IANA" title="IANA Considerations"> | 240 <section anchor="IANA" title="IANA Considerations"> |
239 <t>This document makes no request of IANA.</t> | 241 <t>This document makes no request of IANA.</t> |
240 | 242 |
241 <t>Note to RFC Editor: this section may be removed on publication as an | 243 <t>Note to RFC Editor: this section may be removed on publication as an |
242 RFC.</t> | 244 RFC.</t> |
243 </section> | 245 </section> |
244 | 246 |
245 <section anchor="Acknowledgements" title="Acknowledgements"> | 247 <section anchor="Acknowledgements" title="Acknowledgements"> |
246 <t>We would like to thank Juri Aedla for reporting the issue with the parsing of | 248 <t>We would like to thank Juri Aedla for reporting the issue with the parsing of |
247 the Opus padding.</t> | 249 the Opus padding.</t> |
248 </section> | 250 </section> |
249 </middle> | 251 </middle> |
250 | 252 |
251 <back> | 253 <back> |
252 <references title="References"> | 254 <references title="References"> |
253 <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml"?> | 255 <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml"?> |
254 <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.6716.xml"?> | 256 <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.6716.xml"?> |
255 | 257 |
256 | 258 |
257 </references> | 259 </references> |
258 </back> | 260 </back> |
259 </rfc> | 261 </rfc> |