| OLD | NEW |
| 1 Parsing | 1 Parsing |
| 2 ======= | 2 ======= |
| 3 | 3 |
| 4 Parsing in Sky is a strict pipeline consisting of five stages: | 4 Parsing in Sky is a strict pipeline consisting of five stages: |
| 5 | 5 |
| 6 - decoding, which converts incoming bytes into Unicode characters | 6 - decoding, which converts incoming bytes into Unicode characters |
| 7 using UTF-8. | 7 using UTF-8. |
| 8 | 8 |
| 9 - normalising, which manipulates the sequence of characters. | 9 - normalising, which manipulates the sequence of characters. |
| 10 | 10 |
| (...skipping 74 matching lines...) |
| 85 | 85 |
| 86 6. Switch to the **expect a string** state. | 86 6. Switch to the **expect a string** state. |
| 87 | 87 |
| 88 | 88 |
| 89 ### Tokeniser states ### | 89 ### Tokeniser states ### |
| 90 | 90 |
| 91 #### **Signature** state #### | 91 #### **Signature** state #### |
| 92 | 92 |
| 93 If the current character is... | 93 If the current character is... |
| 94 | 94 |
| 95 * '```#```': If the _parsing context_ is not an Application, switch to | 95 * '``#``': If the _parsing context_ is not an Application, switch to |
| 96 the _failed signature_ state. Otherwise, expect the string | 96 the _failed signature_ state. Otherwise, expect the string |
| 97 "```#!mojo mojo:sky```", with _after signature_ as the _success_ | 97 "``#!mojo mojo:sky``", with _after signature_ as the _success_ |
| 98 state and _failed signature_ as the _failure_ state. | 98 state and _failed signature_ as the _failure_ state. |
| 99 | 99 |
| 100 * '```S```': If the _parsing context_ is not a Module, switch to the | 100 * '``S``': If the _parsing context_ is not a Module, switch to the |
| 101 _failed signature_ state. Otherwise, expect the string | 101 _failed signature_ state. Otherwise, expect the string |
| 102 "```SKY MODULE```", with _after signature_ as the _success_ state, | 102 "``SKY MODULE``", with _after signature_ as the _success_ state, |
| 103 and _failed signature_ as the _failure_ state. | 103 and _failed signature_ as the _failure_ state. |
| 104 | 104 |
| 105 * Anything else: Jump to the **failed signature** state. | 105 * Anything else: Jump to the **failed signature** state. |
| 106 | 106 |
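The choice made in the **signature** state can be sketched as a small lookup. This is only an illustration: the function name and the string values used for the _parsing context_ (`"application"`, `"module"`) are assumptions, not names from the spec.

```python
def signature_expectation(first_char, parsing_context):
    """Return the signature string to expect, or None for the failed signature state.

    `parsing_context` is assumed to be "application" or "module" (hypothetical values).
    """
    if first_char == "#" and parsing_context == "application":
        return "#!mojo mojo:sky"
    if first_char == "S" and parsing_context == "module":
        return "SKY MODULE"
    # Wrong first character, or right character in the wrong context.
    return None
```

A non-`None` result corresponds to entering the **expect a string** state with _after signature_ as the _success_ state and _failed signature_ as the _failure_ state.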
| 107 | 107 |
| 108 #### **Expect a string** state #### | 108 #### **Expect a string** state #### |
| 109 | 109 |
| 110 If the current character is not the same as the <i>index</i>th character in | 110 If the current character is not the same as the <i>index</i>th character in |
| 111 _expectation_, then switch to the _failure_ state. | 111 _expectation_, then switch to the _failure_ state. |
| 112 | 112 |
| (...skipping 22 matching lines...) |
| 135 If the current character is... | 135 If the current character is... |
| 136 | 136 |
| 137 * U+000A: Consume the character and switch to the **data** state. | 137 * U+000A: Consume the character and switch to the **data** state. |
| 138 * Anything else: Consume the character and stay in this state. | 138 * Anything else: Consume the character and stay in this state. |
| 139 | 139 |
| 140 | 140 |
| 141 ### **Data** state ### | 141 ### **Data** state ### |
| 142 | 142 |
| 143 If the current character is... | 143 If the current character is... |
| 144 | 144 |
| 145 * '```<```': Consume the character and switch to the **tag open** state. | 145 * '``<``': Consume the character and switch to the **tag open** state. |
| 146 | 146 |
| 147 * '```&```': Consume the character and switch to the **character | 147 * '``&``': Consume the character and switch to the **character |
| 148 reference** state, with the _return state_ set to the **data** | 148 reference** state, with the _return state_ set to the **data** |
| 149 state, the _extra terminating character_ unset (or set to U+0000, | 149 state, the _extra terminating character_ unset (or set to U+0000, |
| 150 which has the same effect), and the _emitting operation_ being to | 150 which has the same effect), and the _emitting operation_ being to |
| 151 emit a character token for the given character. | 151 emit a character token for the given character. |
| 152 | 152 |
| 153 * Anything else: Emit the current input character as a character | 153 * Anything else: Emit the current input character as a character |
| 154 token. Consume the character. Stay in this state. | 154 token. Consume the character. Stay in this state. |
| 155 | 155 |
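As a rough sketch, one step of the **data** state could look like the following. The function name and the `(kind, value)` token tuples are illustrative assumptions. The extra parameters passed to the **character reference** state (_return state_, _extra terminating character_, _emitting operation_) are left out.

```python
def data_state_step(ch):
    """Return (next_state, emitted_tokens) for one character in the data state.

    The character is consumed in every branch, as in the list above.
    """
    if ch == "<":
        return ("tag open", [])
    if ch == "&":
        # Character reference parameters are omitted in this sketch.
        return ("character reference", [])
    return ("data", [("character", ch)])
```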
| 156 | 156 |
| 157 ### **Script raw data** state ### | 157 ### **Script raw data** state ### |
| 158 | 158 |
| 159 If the current character is... | 159 If the current character is... |
| 160 | 160 |
| 161 * '```<```': Consume the character and switch to the **script raw | 161 * '``<``': Consume the character and switch to the **script raw |
| 162 data: close 1** state. | 162 data: close 1** state. |
| 163 | 163 |
| 164 * Anything else: Emit the current input character as a character | 164 * Anything else: Emit the current input character as a character |
| 165 token. Consume the character. Stay in this state. | 165 token. Consume the character. Stay in this state. |
| 166 | 166 |
| 167 | 167 |
| 168 ### **Script raw data: close 1** state ### | 168 ### **Script raw data: close 1** state ### |
| 169 | 169 |
| 170 If the current character is... | 170 If the current character is... |
| 171 | 171 |
| 172 * '```/```': Consume the character and switch to the **script raw | 172 * '``/``': Consume the character and switch to the **script raw |
| 173 data: close 2** state. | 173 data: close 2** state. |
| 174 | 174 |
| 175 * Anything else: Emit '```<```' character tokens. Consume the | 175 * Anything else: Emit '``<``' character tokens. Consume the |
| 176 character. Switch to the **script raw data** state. | 176 character. Switch to the **script raw data** state. |
| 177 | 177 |
| 178 | 178 |
| 179 ### **Script raw data: close 2** state ### | 179 ### **Script raw data: close 2** state ### |
| 180 | 180 |
| 181 If the current character is... | 181 If the current character is... |
| 182 | 182 |
| 183 * '```s```': Consume the character and switch to the **script raw | 183 * '``s``': Consume the character and switch to the **script raw |
| 184 data: close 3** state. | 184 data: close 3** state. |
| 185 | 185 |
| 186 * Anything else: Emit '```</```' character tokens. Consume the | 186 * Anything else: Emit '``</``' character tokens. Consume the |
| 187 character. Switch to the **script raw data** state. | 187 character. Switch to the **script raw data** state. |
| 188 | 188 |
| 189 | 189 |
| 190 ### **Script raw data: close 3** state ### | 190 ### **Script raw data: close 3** state ### |
| 191 | 191 |
| 192 If the current character is... | 192 If the current character is... |
| 193 | 193 |
| 194 * '```c```': Consume the character and switch to the **script raw | 194 * '``c``': Consume the character and switch to the **script raw |
| 195 data: close 4** state. | 195 data: close 4** state. |
| 196 | 196 |
| 197 * Anything else: Emit '```</s```' character tokens. Consume the | 197 * Anything else: Emit '``</s``' character tokens. Consume the |
| 198 character. Switch to the **script raw data** state. | 198 character. Switch to the **script raw data** state. |
| 199 | 199 |
| 200 | 200 |
| 201 ### **Script raw data: close 4** state ### | 201 ### **Script raw data: close 4** state ### |
| 202 | 202 |
| 203 If the current character is... | 203 If the current character is... |
| 204 | 204 |
| 205 * '```r```': Consume the character and switch to the **script raw | 205 * '``r``': Consume the character and switch to the **script raw |
| 206 data: close 5** state. | 206 data: close 5** state. |
| 207 | 207 |
| 208 * Anything else: Emit '```</sc```' character tokens. Consume the | 208 * Anything else: Emit '``</sc``' character tokens. Consume the |
| 209 character. Switch to the **script raw data** state. | 209 character. Switch to the **script raw data** state. |
| 210 | 210 |
| 211 | 211 |
| 212 ### **Script raw data: close 5** state ### | 212 ### **Script raw data: close 5** state ### |
| 213 | 213 |
| 214 If the current character is... | 214 If the current character is... |
| 215 | 215 |
| 216 * '```i```': Consume the character and switch to the **script raw | 216 * '``i``': Consume the character and switch to the **script raw |
| 217 data: close 6** state. | 217 data: close 6** state. |
| 218 | 218 |
| 219 * Anything else: Emit '```</scr```' character tokens. Consume the | 219 * Anything else: Emit '``</scr``' character tokens. Consume the |
| 220 character. Switch to the **script raw data** state. | 220 character. Switch to the **script raw data** state. |
| 221 | 221 |
| 222 | 222 |
| 223 ### **Script raw data: close 6** state ### | 223 ### **Script raw data: close 6** state ### |
| 224 | 224 |
| 225 If the current character is... | 225 If the current character is... |
| 226 | 226 |
| 227 * '```p```': Consume the character and switch to the **script raw | 227 * '``p``': Consume the character and switch to the **script raw |
| 228 data: close 7** state. | 228 data: close 7** state. |
| 229 | 229 |
| 230 * Anything else: Emit '```</scri```' character tokens. Consume the | 230 * Anything else: Emit '``</scri``' character tokens. Consume the |
| 231 character. Switch to the **script raw data** state. | 231 character. Switch to the **script raw data** state. |
| 232 | 232 |
| 233 | 233 |
| 234 ### **Script raw data: close 7** state ### | 234 ### **Script raw data: close 7** state ### |
| 235 | 235 |
| 236 If the current character is... | 236 If the current character is... |
| 237 | 237 |
| 238 * '```t```': Consume the character and switch to the **script raw | 238 * '``t``': Consume the character and switch to the **script raw |
| 239 data: close 8** state. | 239 data: close 8** state. |
| 240 | 240 |
| 241 * Anything else: Emit '```</scrip```' character tokens. Consume the | 241 * Anything else: Emit '``</scrip``' character tokens. Consume the |
| 242 character. Switch to the **script raw data** state. | 242 character. Switch to the **script raw data** state. |
| 243 | 243 |
| 244 | 244 |
| 245 ### **Script raw data: close 8** state ### | 245 ### **Script raw data: close 8** state ### |
| 246 | 246 |
| 247 If the current character is... | 247 If the current character is... |
| 248 | 248 |
| 249 * U+0020, U+000A, '```/```', '```>```': Create an end tag token, and | 249 * U+0020, U+000A, '``/``', '``>``': Create an end tag token, and |
| 250 let its tag name be the string '```script```'. Switch to the | 250 let its tag name be the string '``script``'. Switch to the |
| 251 **before attribute name** state without consuming the character. | 251 **before attribute name** state without consuming the character. |
| 252 | 252 |
| 253 * Anything else: Emit '```</script```' character tokens. Consume the | 253 * Anything else: Emit '``</script``' character tokens. Consume the |
| 254 character. Switch to the **script raw data** state. | 254 character. Switch to the **script raw data** state. |
| 255 | 255 |
| 256 | 256 |
| 257 ### **Style raw data** state ### | 257 ### **Style raw data** state ### |
| 258 | 258 |
| 259 If the current character is... | 259 If the current character is... |
| 260 | 260 |
| 261 * '```<```': Consume the character and switch to the **style raw | 261 * '``<``': Consume the character and switch to the **style raw |
| 262 data: close 1** state. | 262 data: close 1** state. |
| 263 | 263 |
| 264 * Anything else: Emit the current input character as a character | 264 * Anything else: Emit the current input character as a character |
| 265 token. Consume the character. Stay in this state. | 265 token. Consume the character. Stay in this state. |
| 266 | 266 |
| 267 | 267 |
| 268 ### **Style raw data: close 1** state ### | 268 ### **Style raw data: close 1** state ### |
| 269 | 269 |
| 270 If the current character is... | 270 If the current character is... |
| 271 | 271 |
| 272 * '```/```': Consume the character and switch to the **style raw | 272 * '``/``': Consume the character and switch to the **style raw |
| 273 data: close 2** state. | 273 data: close 2** state. |
| 274 | 274 |
| 275 * Anything else: Emit '```<```' character tokens. Consume the | 275 * Anything else: Emit '``<``' character tokens. Consume the |
| 276 character. Switch to the **style raw data** state. | 276 character. Switch to the **style raw data** state. |
| 277 | 277 |
| 278 | 278 |
| 279 ### **Style raw data: close 2** state ### | 279 ### **Style raw data: close 2** state ### |
| 280 | 280 |
| 281 If the current character is... | 281 If the current character is... |
| 282 | 282 |
| 283 * '```s```': Consume the character and switch to the **style raw | 283 * '``s``': Consume the character and switch to the **style raw |
| 284 data: close 3** state. | 284 data: close 3** state. |
| 285 | 285 |
| 286 * Anything else: Emit '```</```' character tokens. Consume the | 286 * Anything else: Emit '``</``' character tokens. Consume the |
| 287 character. Switch to the **style raw data** state. | 287 character. Switch to the **style raw data** state. |
| 288 | 288 |
| 289 | 289 |
| 290 ### **Style raw data: close 3** state ### | 290 ### **Style raw data: close 3** state ### |
| 291 | 291 |
| 292 If the current character is... | 292 If the current character is... |
| 293 | 293 |
| 294 * '```t```': Consume the character and switch to the **style raw | 294 * '``t``': Consume the character and switch to the **style raw |
| 295 data: close 4** state. | 295 data: close 4** state. |
| 296 | 296 |
| 297 * Anything else: Emit '```</s```' character tokens. Consume the | 297 * Anything else: Emit '``</s``' character tokens. Consume the |
| 298 character. Switch to the **style raw data** state. | 298 character. Switch to the **style raw data** state. |
| 299 | 299 |
| 300 | 300 |
| 301 ### **Style raw data: close 4** state ### | 301 ### **Style raw data: close 4** state ### |
| 302 | 302 |
| 303 If the current character is... | 303 If the current character is... |
| 304 | 304 |
| 305 * '```y```': Consume the character and switch to the **style raw | 305 * '``y``': Consume the character and switch to the **style raw |
| 306 data: close 5** state. | 306 data: close 5** state. |
| 307 | 307 |
| 308 * Anything else: Emit '```</st```' character tokens. Consume the | 308 * Anything else: Emit '``</st``' character tokens. Consume the |
| 309 character. Switch to the **style raw data** state. | 309 character. Switch to the **style raw data** state. |
| 310 | 310 |
| 311 | 311 |
| 312 ### **Style raw data: close 5** state ### | 312 ### **Style raw data: close 5** state ### |
| 313 | 313 |
| 314 If the current character is... | 314 If the current character is... |
| 315 | 315 |
| 316 * '```l```': Consume the character and switch to the **style raw | 316 * '``l``': Consume the character and switch to the **style raw |
| 317 data: close 6** state. | 317 data: close 6** state. |
| 318 | 318 |
| 319 * Anything else: Emit '```</sty```' character tokens. Consume the | 319 * Anything else: Emit '``</sty``' character tokens. Consume the |
| 320 character. Switch to the **style raw data** state. | 320 character. Switch to the **style raw data** state. |
| 321 | 321 |
| 322 | 322 |
| 323 ### **Style raw data: close 6** state ### | 323 ### **Style raw data: close 6** state ### |
| 324 | 324 |
| 325 If the current character is... | 325 If the current character is... |
| 326 | 326 |
| 327 * '```e```': Consume the character and switch to the **style raw | 327 * '``e``': Consume the character and switch to the **style raw |
| 328 data: close 7** state. | 328 data: close 7** state. |
| 329 | 329 |
| 330 * Anything else: Emit '```</styl```' character tokens. Consume the | 330 * Anything else: Emit '``</styl``' character tokens. Consume the |
| 331 character. Switch to the **style raw data** state. | 331 character. Switch to the **style raw data** state. |
| 332 | 332 |
| 333 | 333 |
| 334 ### **Style raw data: close 7** state ### | 334 ### **Style raw data: close 7** state ### |
| 335 | 335 |
| 336 If the current character is... | 336 If the current character is... |
| 337 | 337 |
| 338 * U+0020, U+000A, '```/```', '```>```': Create an end tag token, and | 338 * U+0020, U+000A, '``/``', '``>``': Create an end tag token, and |
| 339 let its tag name be the string '```style```'. Switch to the | 339 let its tag name be the string '``style``'. Switch to the |
| 340 **before attribute name** state without consuming the character. | 340 **before attribute name** state without consuming the character. |
| 341 | 341 |
| 342 * Anything else: Emit '```</style```' character tokens. Consume the | 342 * Anything else: Emit '``</style``' character tokens. Consume the |
| 343 character. Switch to the **style raw data** state. | 343 character. Switch to the **style raw data** state. |
| 344 | 344 |
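The numbered **close N** states for script and style raw data all follow one pattern: match the closing tag one character at a time, and on a mismatch emit the characters matched so far. A minimal sketch of that pattern, using a counter instead of one state per character, might look like this. The function name, the token tuples, and the counter are assumptions made for illustration. The sketch follows the text literally, so the mismatching character is consumed without being emitted. Behaviour at end of input is not covered by the states above and is left unspecified here.

```python
def raw_data_tokens(text, tag_name="script"):
    """Tokenise raw data until an end tag for `tag_name` is recognised.

    Returns (tokens, index). When an end tag is found, `index` is the
    position of the terminating character, which is left unconsumed for the
    "before attribute name" state. Otherwise `index` is len(text).
    """
    closer = "</" + tag_name
    tokens = []
    matched = 0  # how many characters of `closer` have matched ("close N")
    i = 0
    while i < len(text):
        ch = text[i]
        if matched == len(closer):
            # Final close state: only these characters complete the end tag.
            if ch in (" ", "\n", "/", ">"):
                tokens.append(("end tag", tag_name))
                return tokens, i  # terminator is not consumed
            tokens.extend(("character", c) for c in closer)
            matched = 0
        elif ch == closer[matched]:
            matched += 1
        elif matched == 0:
            # Plain raw data state: emit the character as-is.
            tokens.append(("character", ch))
        else:
            # Mismatch part-way: emit the matched prefix, then return to
            # the raw data state. `ch` itself is consumed but not emitted.
            tokens.extend(("character", c) for c in closer[:matched])
            matched = 0
        i += 1
    return tokens, i
```

For example, `raw_data_tokens("a</script>")` yields a character token for `a`, then an end tag token, and stops at index 9 with the `>` still unconsumed. Passing `tag_name="style"` gives the style variant.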
| 345 | 345 |
| 346 ### **Tag open** state ### | 346 ### **Tag open** state ### |
| 347 | 347 |
| 348 If the current character is... | 348 If the current character is... |
| 349 | 349 |
| 350 * '```!```': Consume the character and switch to the **comment start | 350 * '``!``': Consume the character and switch to the **comment start |
| 351 1** state. | 351 1** state. |
| 352 | 352 |
| 353 * '```/```': Consume the character and switch to the **close tag** | 353 * '``/``': Consume the character and switch to the **close tag** |
| 354 state. | 354 state. |
| 355 | 355 |
| 356 * '```>```': Emit character tokens for '```<>```'. Consume the current | 356 * '``>``': Emit character tokens for '``<>``'. Consume the current |
| 357 character. Switch to the **data** state. | 357 character. Switch to the **data** state. |
| 358 | 358 |
| 359 * '```0```'..'```9```', '```a```'..'```z```', '```A```'..'```Z```', | 359 * '``0``'..'``9``', '``a``'..'``z``', '``A``'..'``Z``', |
| 360 '```-```', '```_```', '```.```': Create a start tag token, let its | 360 '``-``', '``_``', '``.``': Create a start tag token, let its |
| 361 tag name be the current character, consume the current character and | 361 tag name be the current character, consume the current character and |
| 362 switch to the **tag name** state. | 362 switch to the **tag name** state. |
| 363 | 363 |
| 364 * Anything else: Emit the character token for '```<```'. Switch to the | 364 * Anything else: Emit the character token for '``<``'. Switch to the |
| 365 **data** state without consuming the current character. | 365 **data** state without consuming the current character. |
| 366 | 366 |
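The branches of the **tag open** state can be sketched as a single function. The names `TAG_NAME_START` and `tag_open_step` are hypothetical, and creating the start tag token is left to the caller in this sketch.

```python
import string

# Characters that may begin a tag name, per the list above.
TAG_NAME_START = set(string.ascii_letters + string.digits + "-_.")


def tag_open_step(ch):
    """Return (next_state, emitted_text, consumed) for the tag open state."""
    if ch == "!":
        return ("comment start 1", "", True)
    if ch == "/":
        return ("close tag", "", True)
    if ch == ">":
        return ("data", "<>", True)
    if ch in TAG_NAME_START:
        # The caller would create a start tag token named `ch` here.
        return ("tag name", "", True)
    # Anything else: emit '<' and reprocess `ch` in the data state.
    return ("data", "<", False)
```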
| 367 | 367 |
| 368 ### **Close tag** state ### | 368 ### **Close tag** state ### |
| 369 | 369 |
| 370 If the current character is... | 370 If the current character is... |
| 371 | 371 |
| 372 * '```>```': Emit character tokens for '```</>```'. Consume the current | 372 * '``>``': Emit character tokens for '``</>``'. Consume the current |
| 373 character. Switch to the **data** state. | 373 character. Switch to the **data** state. |
| 374 | 374 |
| 375 * '```0```'..'```9```', '```a```'..'```z```', '```A```'..'```Z```', | 375 * '``0``'..'``9``', '``a``'..'``z``', '``A``'..'``Z``', |
| 376 '```-```', '```_```', '```.```': Create an end tag token, let its | 376 '``-``', '``_``', '``.``': Create an end tag token, let its |
| 377 tag name be the current character, consume the current character and | 377 tag name be the current character, consume the current character and |
| 378 switch to the **tag name** state. | 378 switch to the **tag name** state. |
| 379 | 379 |
| 380 * Anything else: Emit the character tokens for '```</```'. Switch to | 380 * Anything else: Emit the character tokens for '``</``'. Switch to |
| 381 the **data** state without consuming the current character. | 381 the **data** state without consuming the current character. |
| 382 | 382 |
| 383 | 383 |
| 384 ### **Tag name** state ### | 384 ### **Tag name** state ### |
| 385 | 385 |
| 386 If the current character is... | 386 If the current character is... |
| 387 | 387 |
| 388 * U+0020, U+000A: Consume the current character. Switch to the | 388 * U+0020, U+000A: Consume the current character. Switch to the |
| 389 **before attribute name** state. | 389 **before attribute name** state. |
| 390 | 390 |
| 391 * '```/```': Consume the current character. Switch to the **void tag** | 391 * '``/``': Consume the current character. Switch to the **void tag** |
| 392 state. | 392 state. |
| 393 | 393 |
| 394 * '```>```': Consume the current character. Switch to the **after | 394 * '``>``': Consume the current character. Switch to the **after |
| 395 tag** state. | 395 tag** state. |
| 396 | 396 |
| 397 * Anything else: Append the current character to the tag name, and | 397 * Anything else: Append the current character to the tag name, and |
| 398 consume the current character. Stay in this state. | 398 consume the current character. Stay in this state. |
| 399 | 399 |
| 400 | 400 |
| 401 ### **Void tag** state ### | 401 ### **Void tag** state ### |
| 402 | 402 |
| 403 If the current character is... | 403 If the current character is... |
| 404 | 404 |
| 405 * '```>```': Consume the current character. Switch to the **after void | 405 * '``>``': Consume the current character. Switch to the **after void |
| 406 tag** state. | 406 tag** state. |
| 407 | 407 |
| 408 * Anything else: Switch to the **before attribute name** state without | 408 * Anything else: Switch to the **before attribute name** state without |
| 409 consuming the current character. | 409 consuming the current character. |
| 410 | 410 |
| 411 | 411 |
| 412 ### **Before attribute name** state ### | 412 ### **Before attribute name** state ### |
| 413 | 413 |
| 414 If the current character is... | 414 If the current character is... |
| 415 | 415 |
| 416 * U+0020, U+000A: Consume the current character. Stay in this state. | 416 * U+0020, U+000A: Consume the current character. Stay in this state. |
| 417 | 417 |
| 418 * '```/```': Consume the current character. Switch to the **void tag** | 418 * '``/``': Consume the current character. Switch to the **void tag** |
| 419 state. | 419 state. |
| 420 | 420 |
| 421 * '```>```': Consume the current character. Switch to the **after | 421 * '``>``': Consume the current character. Switch to the **after |
| 422 tag** state. | 422 tag** state. |
| 423 | 423 |
| 424 * Anything else: Create a new attribute in the tag token, and set its | 424 * Anything else: Create a new attribute in the tag token, and set its |
| 425 name to the current character. Consume the current character. Switch | 425 name to the current character. Consume the current character. Switch |
| 426 to the **attribute name** state. | 426 to the **attribute name** state. |
| 427 | 427 |
| 428 | 428 |
| 429 ### **Attribute name** state ### | 429 ### **Attribute name** state ### |
| 430 | 430 |
| 431 If the current character is... | 431 If the current character is... |
| 432 | 432 |
| 433 * U+0020, U+000A: Consume the current character. Switch to the **after | 433 * U+0020, U+000A: Consume the current character. Switch to the **after |
| 434 attribute name** state. | 434 attribute name** state. |
| 435 | 435 |
| 436 * '```/```': Consume the current character. Switch to the **void tag** | 436 * '``/``': Consume the current character. Switch to the **void tag** |
| 437 state. | 437 state. |
| 438 | 438 |
| 439 * '```=```': Consume the current character. Switch to the **before | 439 * '``=``': Consume the current character. Switch to the **before |
| 440 attribute value** state. | 440 attribute value** state. |
| 441 | 441 |
| 442 * '```>```': Consume the current character. Switch to the **after | 442 * '``>``': Consume the current character. Switch to the **after |
| 443 tag** state. | 443 tag** state. |
| 444 | 444 |
| 445 * Anything else: Append the current character to the most recently | 445 * Anything else: Append the current character to the most recently |
| 446 added attribute's name, and consume the current character. Stay in | 446 added attribute's name, and consume the current character. Stay in |
| 447 this state. | 447 this state. |
| 448 | 448 |
| 449 | 449 |
| 450 ### **After attribute name** state ### | 450 ### **After attribute name** state ### |
| 451 | 451 |
| 452 If the current character is... | 452 If the current character is... |
| 453 | 453 |
| 454 * U+0020, U+000A: Consume the current character. Stay in this state. | 454 * U+0020, U+000A: Consume the current character. Stay in this state. |
| 455 | 455 |
| 456 * '```/```': Consume the current character. Switch to the **void tag** | 456 * '``/``': Consume the current character. Switch to the **void tag** |
| 457 state. | 457 state. |
| 458 | 458 |
| 459 * '```=```': Consume the current character. Switch to the **before | 459 * '``=``': Consume the current character. Switch to the **before |
| 460 attribute value** state. | 460 attribute value** state. |
| 461 | 461 |
| 462 * '```>```': Consume the current character. Switch to the **after | 462 * '``>``': Consume the current character. Switch to the **after |
| 463 tag** state. | 463 tag** state. |
| 464 | 464 |
| 465 * Anything else: Create a new attribute in the tag token, and set its | 465 * Anything else: Create a new attribute in the tag token, and set its |
| 466 name to the current character. Consume the current character. Switch | 466 name to the current character. Consume the current character. Switch |
| 467 to the **attribute name** state. | 467 to the **attribute name** state. |
| 468 | 468 |
| 469 | 469 |
| 470 ### **Before attribute value** state ### | 470 ### **Before attribute value** state ### |
| 471 | 471 |
| 472 If the current character is... | 472 If the current character is... |
| 473 | 473 |
| 474 * U+0020, U+000A: Consume the current character. Stay in this state. | 474 * U+0020, U+000A: Consume the current character. Stay in this state. |
| 475 | 475 |
| 476 * '```>```': Consume the current character. Switch to the **after | 476 * '``>``': Consume the current character. Switch to the **after |
| 477 tag** state. | 477 tag** state. |
| 478 | 478 |
| 479 * '```'```': Consume the current character. Switch to the | 479 * '``'``': Consume the current character. Switch to the |
| 480 **single-quoted attribute value** state. | 480 **single-quoted attribute value** state. |
| 481 | 481 |
| 482 * '```"```': Consume the current character. Switch to the | 482 * '``"``': Consume the current character. Switch to the |
| 483 **double-quoted attribute value** state. | 483 **double-quoted attribute value** state. |
| 484 | 484 |
| 485 * Anything else: Set the value of the most recently added attribute to | 485 * Anything else: Set the value of the most recently added attribute to |
| 486 the current character. Consume the current character. Switch to the | 486 the current character. Consume the current character. Switch to the |
| 487 **unquoted attribute value** state. | 487 **unquoted attribute value** state. |
| 488 | 488 |
| 489 | 489 |
| 490 ### **Single-quoted attribute value** state ### | 490 ### **Single-quoted attribute value** state ### |
| 491 | 491 |
| 492 If the current character is... | 492 If the current character is... |
| 493 | 493 |
| 494 * '```'```': Consume the current character. Switch to the | 494 * '``'``': Consume the current character. Switch to the |
| 495 **before attribute name** state. | 495 **before attribute name** state. |
| 496 | 496 |
| 497 * '```&```': Consume the character and switch to the **character | 497 * '``&``': Consume the character and switch to the **character |
| 498 reference** state, with the _return state_ set to the | 498 reference** state, with the _return state_ set to the |
| 499 **single-quoted attribute value** state, the _extra terminating | 499 **single-quoted attribute value** state, the _extra terminating |
| 500 character_ set to '```'```', and the _emitting operation_ being to | 500 character_ set to '``'``', and the _emitting operation_ being to |
| 501 append the given character to the value of the most recently added | 501 append the given character to the value of the most recently added |
| 502 attribute. | 502 attribute. |
| 503 | 503 |
| 504 * Anything else: Append the current character to the value of the most | 504 * Anything else: Append the current character to the value of the most |
| 505 recently added attribute. Consume the current character. Stay in | 505 recently added attribute. Consume the current character. Stay in |
| 506 this state. | 506 this state. |
| 507 | 507 |
| 508 | 508 |
| 509 ### **Double-quoted attribute value** state ### | 509 ### **Double-quoted attribute value** state ### |
| 510 | 510 |
| 511 If the current character is... | 511 If the current character is... |
| 512 | 512 |
| 513 * '```"```': Consume the current character. Switch to the | 513 * '``"``': Consume the current character. Switch to the |
| 514 **before attribute name** state. | 514 **before attribute name** state. |
| 515 | 515 |
| 516 * '```&```': Consume the character and switch to the **character | 516 * '``&``': Consume the character and switch to the **character |
| 517 reference** state, with the _return state_ set to the | 517 reference** state, with the _return state_ set to the |
| 518 **double-quoted attribute value** state, the _extra terminating | 518 **double-quoted attribute value** state, the _extra terminating |
| 519 character_ set to '```"```', and the _emitting operation_ being to | 519 character_ set to '``"``', and the _emitting operation_ being to |
| 520 append the given character to the value of the most recently added | 520 append the given character to the value of the most recently added |
| 521 attribute. | 521 attribute. |
| 522 | 522 |
| 523 * Anything else: Append the current character to the value of the most | 523 * Anything else: Append the current character to the value of the most |
| 524 recently added attribute. Consume the current character. Stay in | 524 recently added attribute. Consume the current character. Stay in |
| 525 this state. | 525 this state. |
| 526 | 526 |
| 527 | 527 |
| 528 ### **Unquoted attribute value** state ### | 528 ### **Unquoted attribute value** state ### |
| 529 | 529 |
| 530 If the current character is... | 530 If the current character is... |
| 531 | 531 |
| 532 * U+0020, U+000A: Consume the current character. Switch to the | 532 * U+0020, U+000A: Consume the current character. Switch to the |
| 533 **before attribute name** state. | 533 **before attribute name** state. |
| 534 | 534 |
| 535 * '```>```': Consume the current character. Switch to the **after | 535 * '``>``': Consume the current character. Switch to the **after |
| 536 tag** state. | 536 tag** state. |
| 537 | 537 |
| 538 * '```&```': Consume the character and switch to the **character | 538 * '``&``': Consume the character and switch to the **character |
| 539 reference** state, with the _return state_ set to the **unquoted | 539 reference** state, with the _return state_ set to the **unquoted |
| 540 attribute value** state, the _extra terminating character_ unset (or | 540 attribute value** state, the _extra terminating character_ unset (or |
| 541 set to U+0000, which has the same effect), and the _emitting | 541 set to U+0000, which has the same effect), and the _emitting |
| 542 operation_ being to append the given character to the value of the | 542 operation_ being to append the given character to the value of the |
| 543 most recently added attribute. | 543 most recently added attribute. |
| 544 | 544 |
| 545 * Anything else: Append the current character to the value of the most | 545 * Anything else: Append the current character to the value of the most |
| 546 recently added attribute. Consume the current character. Stay in | 546 recently added attribute. Consume the current character. Stay in |
| 547 this state. | 547 this state. |
| 548 | 548 |
| 549 | 549 |
| 550 ### **After tag** state ### | 550 ### **After tag** state ### |
| 551 | 551 |
| 552 Emit the tag token. | 552 Emit the tag token. |
| 553 | 553 |
| 554 If the tag token was a start tag token and the tag name was | 554 If the tag token was a start tag token and the tag name was |
| 555 '```script```', then switch to the **script raw data** state. | 555 '``script``', then switch to the **script raw data** state. |
| 556 | 556 |
| 557 If the tag token was a start tag token and the tag name was | 557 If the tag token was a start tag token and the tag name was |
| 558 '```style```', then switch to the **style raw data** state. | 558 '``style``', then switch to the **style raw data** state. |
| 559 | 559 |
| 560 Otherwise, switch to the **data** state. | 560 Otherwise, switch to the **data** state. |
| 561 | 561 |
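The choice of next state in the **after tag** state can be sketched as a small function. This is only an illustration in Python; the function name and the use of plain strings for state names are assumptions, not part of the specification.

```python
def next_state_after_tag(is_start_tag, tag_name):
    """Return the name of the state to enter once the tag token is emitted."""
    # Only start tags named 'script' or 'style' switch to a raw data state.
    if is_start_tag and tag_name == "script":
        return "script raw data"
    if is_start_tag and tag_name == "style":
        return "style raw data"
    # End tags and every other start tag go back to the data state.
    return "data"
```

Under these assumptions an end tag named `script` returns to the **data** state, since only start tags trigger the raw data states.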
| 562 | 562 |
| 563 ### **After void tag** state ### | 563 ### **After void tag** state ### |
| 564 | 564 |
| 565 Emit the tag token. | 565 Emit the tag token. |
| 566 | 566 |
| 567 If the tag token is a start tag token, emit an end tag token with the | 567 If the tag token is a start tag token, emit an end tag token with the |
| 568 same tag name. | 568 same tag name. |
| 569 | 569 |
| 570 Switch to the **data** state. | 570 Switch to the **data** state. |
| 571 | 571 |
| 572 | 572 |
| 573 ### **Comment start 1** state ### | 573 ### **Comment start 1** state ### |
| 574 | 574 |
| 575 If the current character is... | 575 If the current character is... |
| 576 | 576 |
| 577 * '```-```': Consume the character and switch to the **comment start | 577 * '``-``': Consume the character and switch to the **comment start |
| 578 2** state. | 578 2** state. |
| 579 | 579 |
| 580 * '```>```': Emit character tokens for '```<!>```'. Consume the | 580 * '``>``': Emit character tokens for '``<!>``'. Consume the |
| 581 current character. Switch to the **data** state. | 581 current character. Switch to the **data** state. |
| 582 | 582 |
| 583 | 583 |
| 584 ### **Comment start 2** state ### | 584 ### **Comment start 2** state ### |
| 585 | 585 |
| 586 If the current character is... | 586 If the current character is... |
| 587 | 587 |
| 588 * '```-```': Consume the character and switch to the **comment** | 588 * '``-``': Consume the character and switch to the **comment** |
| 589 state. | 589 state. |
| 590 | 590 |
| 591 * '```>```': Emit character tokens for '```<!->```'. Consume the | 591 * '``>``': Emit character tokens for '``<!->``'. Consume the |
| 592 current character. Switch to the **data** state. | 592 current character. Switch to the **data** state. |
| 593 | 593 |
| 594 | 594 |
| 595 ### **Comment** state ### | 595 ### **Comment** state ### |
| 596 | 596 |
| 597 If the current character is... | 597 If the current character is... |
| 598 | 598 |
| 599 * '```-```': Consume the character and switch to the **comment end 1** | 599 * '``-``': Consume the character and switch to the **comment end 1** |
| 600 state. | 600 state. |
| 601 | 601 |
| 602 * Anything else: Consume the character and switch to the **comment** | 602 * Anything else: Consume the character and switch to the **comment** |
| 603 state. | 603 state. |
| 604 | 604 |
| 605 | 605 |
| 606 ### **Comment end 1** state ### | 606 ### **Comment end 1** state ### |
| 607 | 607 |
| 608 If the current character is... | 608 If the current character is... |
| 609 | 609 |
| 610 * '```-```': Consume the character, switch to the **comment end 2** | 610 * '``-``': Consume the character, switch to the **comment end 2** |
| 611 state. | 611 state. |
| 612 | 612 |
| 613 * Anything else: Consume the character, and switch to the **comment** | 613 * Anything else: Consume the character, and switch to the **comment** |
| 614 state. | 614 state. |
| 615 | 615 |
| 616 | 616 |
| 617 ### **Comment end 2** state ### | 617 ### **Comment end 2** state ### |
| 618 | 618 |
| 619 If the current character is... | 619 If the current character is... |
| 620 | 620 |
| 621 * '```>```': Consume the character and switch to the **data** state. | 621 * '``>``': Consume the character and switch to the **data** state. |
| 622 | 622 |
| 623 * '```-```': Consume the character, but stay in this state. | 623 * '``-``': Consume the character, but stay in this state. |
| 624 | 624 |
| 625 * Anything else: Consume the character, and switch to the **comment** | 625 * Anything else: Consume the character, and switch to the **comment** |
| 626 state. | 626 state. |
| 627 | 627 |
| 628 | 628 |
| 629 ### **Character reference** state ### | 629 ### **Character reference** state ### |
| 630 | 630 |
| 631 Let _raw value_ be the string '```&```'. | 631 Let _raw value_ be the string '``&``'. |
| 632 | 632 |
| 633 Append the current character to _raw value_. | 633 Append the current character to _raw value_. |
| 634 | 634 |
| 635 If the current character is... | 635 If the current character is... |
| 636 | 636 |
| 637 * '```#```': Consume the character, and switch to the **numeric | 637 * '``#``': Consume the character, and switch to the **numeric |
| 638 character reference** state. | 638 character reference** state. |
| 639 | 639 |
| 640 * '```l```': Consume the character and switch to the **named character | 640 * '``l``': Consume the character and switch to the **named character |
| 641 reference L** state. | 641 reference L** state. |
| 642 | 642 |
| 643 * '```a```': Consume the character and switch to the **named character | 643 * '``a``': Consume the character and switch to the **named character |
| 644 reference A** state. | 644 reference A** state. |
| 645 | 645 |
| 646 * '```g```': Consume the character and switch to the **named character | 646 * '``g``': Consume the character and switch to the **named character |
| 647 reference G** state. | 647 reference G** state. |
| 648 | 648 |
| 649 * '```q```': Consume the character and switch to the **named character | 649 * '``q``': Consume the character and switch to the **named character |
| 650 reference Q** state. | 650 reference Q** state. |
| 651 | 651 |
| 652 * Any other character in the range '```0```'..'```9```', | 652 * Any other character in the range '``0``'..'``9``', |
| 653 '```a```'..'```f```', '```A```'..'```F```': Consume the character | 653 '``a``'..'``f``', '``A``'..'``F``': Consume the character |
| 654 and switch to the **bad named character reference** state. | 654 and switch to the **bad named character reference** state. |
| 655 | 655 |
| 656 * Anything else: Run the _emitting operation_ for all but the last | 656 * Anything else: Run the _emitting operation_ for all but the last |
| 657 character in _raw value_, and switch to the **data** state without | 657 character in _raw value_, and switch to the **data** state without |
| 658 consuming the current character. | 658 consuming the current character. |
| 659 | 659 |
| 660 | 660 |
| 661 ### **Numeric character reference** state ### | 661 ### **Numeric character reference** state ### |
| 662 | 662 |
| 663 Append the current character to _raw value_. | 663 Append the current character to _raw value_. |
| 664 | 664 |
| 665 If the current character is... | 665 If the current character is... |
| 666 | 666 |
| 667 * '```x```', '```X```': Let _value_ be zero, consume the character, | 667 * '``x``', '``X``': Let _value_ be zero, consume the character, |
| 668 and switch to the **hexadecimal numeric character reference** state. | 668 and switch to the **hexadecimal numeric character reference** state. |
| 669 | 669 |
| 670 * '```0```'..'```9```': Let _value_ be the numeric value of the | 670 * '``0``'..'``9``': Let _value_ be the numeric value of the |
| 671 current character interpreted as a decimal digit, consume the | 671 current character interpreted as a decimal digit, consume the |
| 672 character, and switch to the **decimal numeric character reference** | 672 character, and switch to the **decimal numeric character reference** |
| 673 state. | 673 state. |
| 674 | 674 |
| 675 * Anything else: Run the _emitting operation_ for all but the last | 675 * Anything else: Run the _emitting operation_ for all but the last |
| 676 character in _raw value_, and switch to the **data** state without | 676 character in _raw value_, and switch to the **data** state without |
| 677 consuming the current character. | 677 consuming the current character. |
| 678 | 678 |
| 679 | 679 |
| 680 ### **Hexadecimal numeric character reference** state ### | 680 ### **Hexadecimal numeric character reference** state ### |
| 681 | 681 |
| 682 Append the current character to _raw value_. | 682 Append the current character to _raw value_. |
| 683 | 683 |
| 684 If the current character is... | 684 If the current character is... |
| 685 | 685 |
| 686 * '```0```'..'```9```', '```a```'..'```f```', '```A```'..'```F```': | 686 * '``0``'..'``9``', '``a``'..'``f``', '``A``'..'``F``': |
| 687 Let _value_ be sixteen times _value_ plus the numeric value of the | 687 Let _value_ be sixteen times _value_ plus the numeric value of the |
| 688 current character interpreted as a hexadecimal digit. | 688 current character interpreted as a hexadecimal digit. |
| 689 | 689 |
| 690 * '```;```': Consume the character. If _value_ is between 0x0001 and | 690 * '``;``': Consume the character. If _value_ is between 0x0001 and |
| 691 0x10FFFF inclusive, but is not between 0xD800 and 0xDFFF inclusive, | 691 0x10FFFF inclusive, but is not between 0xD800 and 0xDFFF inclusive, |
| 692 run the _emitting operation_ with a Unicode character having the | 692 run the _emitting operation_ with a Unicode character having the |
| 693 scalar value _value_; otherwise, run the _emitting operation_ with | 693 scalar value _value_; otherwise, run the _emitting operation_ with |
| 694 the character U+FFFD. Then, in either case, switch to the _return | 694 the character U+FFFD. Then, in either case, switch to the _return |
| 695 state_. | 695 state_. |
| 696 | 696 |
| 697 * Anything else: Run the _emitting operation_ for all but the last | 697 * Anything else: Run the _emitting operation_ for all but the last |
| 698 character in _raw value_, and switch to the **data** state without | 698 character in _raw value_, and switch to the **data** state without |
| 699 consuming the current character. | 699 consuming the current character. |
| 700 | 700 |
| 701 | 701 |
| 702 ### **Decimal numeric character reference** state ### | 702 ### **Decimal numeric character reference** state ### |
| 703 | 703 |
| 704 Append the current character to _raw value_. | 704 Append the current character to _raw value_. |
| 705 | 705 |
| 706 If the current character is... | 706 If the current character is... |
| 707 | 707 |
| 708 * '```0```'..'```9```': Let _value_ be ten times _value_ plus the | 708 * '``0``'..'``9``': Let _value_ be ten times _value_ plus the |
| 709 numeric value of the current character interpreted as a decimal | 709 numeric value of the current character interpreted as a decimal |
| 710 digit. | 710 digit. |
| 711 | 711 |
| 712 * '```;```': Consume the character. If _value_ is between 0x0001 and | 712 * '``;``': Consume the character. If _value_ is between 0x0001 and |
| 713 0x10FFFF inclusive, but is not between 0xD800 and 0xDFFF inclusive, | 713 0x10FFFF inclusive, but is not between 0xD800 and 0xDFFF inclusive, |
| 714 run the _emitting operation_ with a Unicode character having the | 714 run the _emitting operation_ with a Unicode character having the |
| 715 scalar value _value_; otherwise, run the _emitting operation_ with | 715 scalar value _value_; otherwise, run the _emitting operation_ with |
| 716 the character U+FFFD. Then, in either case, switch to the _return | 716 the character U+FFFD. Then, in either case, switch to the _return |
| 717 state_. | 717 state_. |
| 718 | 718 |
| 719 * Anything else: Run the _emitting operation_ for all but the last | 719 * Anything else: Run the _emitting operation_ for all but the last |
| 720 character in _raw value_, and switch to the **data** state without | 720 character in _raw value_, and switch to the **data** state without |
| 721 consuming the current character. | 721 consuming the current character. |
| 722 | 722 |
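The arithmetic shared by the hexadecimal and decimal numeric character reference states can be sketched in Python as below. The helper names `accumulate_digits` and `resolve_numeric_reference` are hypothetical. The sketch assumes the digits have already been collected into a string, whereas the states above process one character at a time.

```python
REPLACEMENT_CHARACTER = "\uFFFD"

def accumulate_digits(digits, base):
    """Fold digits into a value: value = base * value + digit, starting at zero."""
    value = 0
    for digit in digits:
        value = value * base + int(digit, base)
    return value

def resolve_numeric_reference(value):
    """Apply the check made when ';' is seen: return the character with that
    scalar value, or U+FFFD for zero, surrogates and out-of-range values."""
    in_range = 0x0001 <= value <= 0x10FFFF
    is_surrogate = 0xD800 <= value <= 0xDFFF
    if in_range and not is_surrogate:
        return chr(value)
    return REPLACEMENT_CHARACTER
```

For instance, `&#x41;` and `&#65;` both accumulate to 65 and resolve to `A`, while `&#0;` and `&#xD800;` resolve to U+FFFD.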
| 723 | 723 |
| 724 ### **Named character reference L** state ### | 724 ### **Named character reference L** state ### |
| 725 | 725 |
| 726 Append the current character to _raw value_. | 726 Append the current character to _raw value_. |
| 727 | 727 |
| 728 If the current character is... | 728 If the current character is... |
| 729 | 729 |
| 730 * '```t```': Let _character_ be '```<```', consume the current | 730 * '``t``': Let _character_ be '``<``', consume the current |
| 731 character, and switch to the **after named character reference** | 731 character, and switch to the **after named character reference** |
| 732 state. | 732 state. |
| 733 | 733 |
| 734 * Anything else: Switch to the _bad named character reference_ state | 734 * Anything else: Switch to the _bad named character reference_ state |
| 735 without consuming the character. | 735 without consuming the character. |
| 736 | 736 |
| 737 | 737 |
| 738 ### **Named character reference A** state ### | 738 ### **Named character reference A** state ### |
| 739 | 739 |
| 740 Append the current character to _raw value_. | 740 Append the current character to _raw value_. |
| 741 | 741 |
| 742 If the current character is... | 742 If the current character is... |
| 743 | 743 |
| 744 * '```p```': Consume the current character and switch to the **named | 744 * '``p``': Consume the current character and switch to the **named |
| 745 character reference AP** state. | 745 character reference AP** state. |
| 746 | 746 |
| 747 * '```m```': Consume the current character and switch to the **named | 747 * '``m``': Consume the current character and switch to the **named |
| 748 character reference AM** state. | 748 character reference AM** state. |
| 749 | 749 |
| 750 * Anything else: Switch to the _bad named character reference_ state | 750 * Anything else: Switch to the _bad named character reference_ state |
| 751 without consuming the character. | 751 without consuming the character. |
| 752 | 752 |
| 753 | 753 |
| 754 ### **Named character reference AM** state ### | 754 ### **Named character reference AM** state ### |
| 755 | 755 |
| 756 Append the current character to _raw value_. | 756 Append the current character to _raw value_. |
| 757 | 757 |
| 758 If the current character is... | 758 If the current character is... |
| 759 | 759 |
| 760 * '```p```': Let _character_ be '```&```', consume the current | 760 * '``p``': Let _character_ be '``&``', consume the current |
| 761 character, and switch to the **after named character reference** | 761 character, and switch to the **after named character reference** |
| 762 state. | 762 state. |
| 763 | 763 |
| 764 * Anything else: Switch to the _bad named character reference_ state | 764 * Anything else: Switch to the _bad named character reference_ state |
| 765 without consuming the character. | 765 without consuming the character. |
| 766 | 766 |
| 767 | 767 |
| 768 ### **Named character reference AP** state ### | 768 ### **Named character reference AP** state ### |
| 769 | 769 |
| 770 Append the current character to _raw value_. | 770 Append the current character to _raw value_. |
| 771 | 771 |
| 772 If the current character is... | 772 If the current character is... |
| 773 | 773 |
| 774 * '```o```': Consume the current character and switch to the **named | 774 * '``o``': Consume the current character and switch to the **named |
| 775 character reference APO** state. | 775 character reference APO** state. |
| 776 | 776 |
| 777 * Anything else: Switch to the _bad named character reference_ state | 777 * Anything else: Switch to the _bad named character reference_ state |
| 778 without consuming the character. | 778 without consuming the character. |
| 779 | 779 |
| 780 | 780 |
| 781 ### **Named character reference APO** state ### | 781 ### **Named character reference APO** state ### |
| 782 | 782 |
| 783 Append the current character to _raw value_. | 783 Append the current character to _raw value_. |
| 784 | 784 |
| 785 If the current character is... | 785 If the current character is... |
| 786 | 786 |
| 787 * '```s```': Let _character_ be '```'```', consume the current | 787 * '``s``': Let _character_ be '``'``', consume the current |
| 788 character, and switch to the **after named character reference** | 788 character, and switch to the **after named character reference** |
| 789 state. | 789 state. |
| 790 | 790 |
| 791 * Anything else: Switch to the _bad named character reference_ state | 791 * Anything else: Switch to the _bad named character reference_ state |
| 792 without consuming the character. | 792 without consuming the character. |
| 793 | 793 |
| 794 | 794 |
| 795 ### **Named character reference G** state ### | 795 ### **Named character reference G** state ### |
| 796 | 796 |
| 797 Append the current character to _raw value_. | 797 Append the current character to _raw value_. |
| 798 | 798 |
| 799 If the current character is... | 799 If the current character is... |
| 800 | 800 |
| 801 * '```t```': Let _character_ be '```>```', consume the current | 801 * '``t``': Let _character_ be '``>``', consume the current |
| 802 character, and switch to the **after named character reference** | 802 character, and switch to the **after named character reference** |
| 803 state. | 803 state. |
| 804 | 804 |
| 805 * Anything else: Switch to the _bad named character reference_ state | 805 * Anything else: Switch to the _bad named character reference_ state |
| 806 without consuming the character. | 806 without consuming the character. |
| 807 | 807 |
| 808 | 808 |
| 809 ### **Named character reference Q** state ### | 809 ### **Named character reference Q** state ### |
| 810 | 810 |
| 811 Append the current character to _raw value_. | 811 Append the current character to _raw value_. |
| 812 | 812 |
| 813 If the current character is... | 813 If the current character is... |
| 814 | 814 |
| 815 * '```u```': Consume the current character and switch to the **named | 815 * '``u``': Consume the current character and switch to the **named |
| 816 character reference QU** state. | 816 character reference QU** state. |
| 817 | 817 |
| 818 * Anything else: Switch to the _bad named character reference_ state | 818 * Anything else: Switch to the _bad named character reference_ state |
| 819 without consuming the character. | 819 without consuming the character. |
| 820 | 820 |
| 821 | 821 |
| 822 ### **Named character reference QU** state ### | 822 ### **Named character reference QU** state ### |
| 823 | 823 |
| 824 Append the current character to _raw value_. | 824 Append the current character to _raw value_. |
| 825 | 825 |
| 826 If the current character is... | 826 If the current character is... |
| 827 | 827 |
| 828 * '```o```': Consume the current character and switch to the **named | 828 * '``o``': Consume the current character and switch to the **named |
| 829 character reference QUO** state. | 829 character reference QUO** state. |
| 830 | 830 |
| 831 * Anything else: Switch to the _bad named character reference_ state | 831 * Anything else: Switch to the _bad named character reference_ state |
| 832 without consuming the character. | 832 without consuming the character. |
| 833 | 833 |
| 834 | 834 |
| 835 ### **Named character reference QUO** state ### | 835 ### **Named character reference QUO** state ### |
| 836 | 836 |
| 837 Append the current character to _raw value_. | 837 Append the current character to _raw value_. |
| 838 | 838 |
| 839 If the current character is... | 839 If the current character is... |
| 840 | 840 |
| 841 * '```t```': Let _character_ be '```"```', consume the current | 841 * '``t``': Let _character_ be '``"``', consume the current |
| 842 character, and switch to the **after named character reference** | 842 character, and switch to the **after named character reference** |
| 843 state. | 843 state. |
| 844 | 844 |
| 845 * Anything else: Switch to the _bad named character reference_ state | 845 * Anything else: Switch to the _bad named character reference_ state |
| 846 without consuming the character. | 846 without consuming the character. |
| 847 | 847 |
| 848 | 848 |
| 849 ### **After named character reference** state ### | 849 ### **After named character reference** state ### |
| 850 | 850 |
| 851 Append the current character to _raw value_. | 851 Append the current character to _raw value_. |
| 852 | 852 |
| 853 If the current character is... | 853 If the current character is... |
| 854 | 854 |
| 855 * '```;```': Consume the character. Run the _emitting operation_ with | 855 * '``;``': Consume the character. Run the _emitting operation_ with |
| 856 the character _character_. Switch to the _return state_. | 856 the character _character_. Switch to the _return state_. |
| 857 | 857 |
| 858 * The _extra terminating character_: Run the _emitting operation_ with | 858 * The _extra terminating character_: Run the _emitting operation_ with |
| 859 the character U+FFFD. Switch to the _return state_ without consuming | 859 the character U+FFFD. Switch to the _return state_ without consuming |
| 860 the current character. | 860 the current character. |
| 861 | 861 |
| 862 * Anything else: Switch to the _bad named character reference_ state | 862 * Anything else: Switch to the _bad named character reference_ state |
| 863 without consuming the current character. | 863 without consuming the current character. |
| 864 | 864 |
| 865 | 865 |
| 866 ### **Bad named character reference** state ### | 866 ### **Bad named character reference** state ### |
| 867 | 867 |
| 868 Append the current character to _raw value_. | 868 Append the current character to _raw value_. |
| 869 | 869 |
| 870 If the current character is... | 870 If the current character is... |
| 871 | 871 |
| 872 * '```;```': Consume the character. Run the _emitting operation_ with | 872 * '``;``': Consume the character. Run the _emitting operation_ with |
| 873 the character U+FFFD. Switch to the _return state_. | 873 the character U+FFFD. Switch to the _return state_. |
| 874 | 874 |
| 875 * The _extra terminating character_: Switch to the _return state_ | 875 * The _extra terminating character_: Switch to the _return state_ |
| 876 without consuming the current character. | 876 without consuming the current character. |
| 877 | 877 |
| 878 * Any other character in the range '```0```'..'```9```', | 878 * Any other character in the range '``0``'..'``9``', |
| 879 '```a```'..'```f```', '```A```'..'```F```': Consume the character | 879 '``a``'..'``f``', '``A``'..'``F``': Consume the character |
| 880 and stay in this state. | 880 and stay in this state. |
| 881 | 881 |
| 882 * Anything else: Run the _emitting operation_ for all but the last | 882 * Anything else: Run the _emitting operation_ for all but the last |
| 883 character in _raw value_, and switch to the **data** state without | 883 character in _raw value_, and switch to the **data** state without |
| 884 consuming the current character. | 884 consuming the current character. |
| 885 | 885 |
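Taken together, the named character reference states accept exactly five names on their success path: `lt`, `gt`, `amp`, `apos` and `quot`, each followed by '``;``'. The Python sketch below models only that success path with a lookup table. The function name is hypothetical, and the failure paths (the **bad named character reference** state and the _extra terminating character_) are deliberately not modelled.

```python
# Names recognised by the states above, mapped to the _character_ they set.
NAMED_REFERENCES = {
    "lt": "<",
    "gt": ">",
    "amp": "&",
    "apos": "'",
    "quot": '"',
}

def match_named_reference(text, pos):
    """text[pos] is the first character after '&'. Return (character, new_pos)
    for a complete, ';'-terminated reference, or None otherwise."""
    for name, character in NAMED_REFERENCES.items():
        if text.startswith(name + ";", pos):
            return character, pos + len(name) + 1
    return None  # the tokeniser would take one of the failure paths here
```

A state-per-letter tokeniser and this table should agree on well-formed input; they differ only in how malformed references are reported, which the sketch leaves out.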
| 886 | 886 |
| 887 Token cleanup stage | 887 Token cleanup stage |
| 888 ------------------- | 888 ------------------- |
| 889 | 889 |
| (...skipping 18 matching lines...) |
| 908 | 908 |
| 909 To construct a node tree from a _sequence of tokens_ and a document | 909 To construct a node tree from a _sequence of tokens_ and a document |
| 910 _document_: | 910 _document_: |
| 911 | 911 |
| 912 1. Initialize the _stack of open nodes_ to be _document_. | 912 1. Initialize the _stack of open nodes_ to be _document_. |
| 913 2. Consider each token _token_ in the _sequence of tokens_ in turn, as | 913 2. Consider each token _token_ in the _sequence of tokens_ in turn, as |
| 914 follows. If a token is to be skipped, then jump straight to the | 914 follows. If a token is to be skipped, then jump straight to the |
| 915 next token, without doing any more work with the skipped token. | 915 next token, without doing any more work with the skipped token. |
| 916 - If _token_ is a string token, | 916 - If _token_ is a string token, |
| 917 1. If the value of the token contains only U+0020 and U+000A | 917 1. If the value of the token contains only U+0020 and U+000A |
| 918 characters, and there is no ```t``` element on the _stack of | 918 characters, and there is no ``t`` element on the _stack of |
| 919 open nodes_, then skip the token. | 919 open nodes_, then skip the token. |
| 920 2. Create a text node _node_ whose character data is the value of | 920 2. Create a text node _node_ whose character data is the value of |
| 921 the token. | 921 the token. |
| 922 3. Append _node_ to the top node in the _stack of open nodes_. | 922 3. Append _node_ to the top node in the _stack of open nodes_. |
| 923 - If _token_ is a start tag token, | 923 - If _token_ is a start tag token, |
| 924 1. Create an element _node_ with tag name and attributes given by | 924 1. Create an element _node_ with tag name and attributes given by |
| 925 the token. | 925 the token. |
| 926 2. Append _node_ to the top node in the _stack of open nodes_. | 926 2. Append _node_ to the top node in the _stack of open nodes_. |
| 927 - If _token_ is an end tag token: | 927 - If _token_ is an end tag token: |
| 928 1. Let _node_ be the topmost node in the _stack of open nodes_ | 928 1. Let _node_ be the topmost node in the _stack of open nodes_ |
| 929 whose tag name is the same as the token's tag name, if any. If | 929 whose tag name is the same as the token's tag name, if any. If |
| 930 there isn't one, skip this token. | 930 there isn't one, skip this token. |
| 931 2. If there's a ```template``` element in the _stack of open | 931 2. If there's a ``template`` element in the _stack of open |
| 932 nodes_ above _node_, then skip this token. | 932 nodes_ above _node_, then skip this token. |
| 933 3. Pop nodes from the _stack of open nodes_ until _node_ has been | 933 3. Pop nodes from the _stack of open nodes_ until _node_ has been |
| 934 popped. | 934 popped. |
| 935 4. If _node_'s tag name is ```script```, then yield until there | 935 4. If _node_'s tag name is ``script``, then yield until there |
| 936 are no pending import loads, then execute the script given by | 936 are no pending import loads, then execute the script given by |
| 937 the element's contents. | 937 the element's contents. |
| 938 3. Yield until there are no pending import loads. | 938 3. Yield until there are no pending import loads. |
| 939 4. Fire a ```load``` event at the _parsing context_ object. | 939 4. Fire a ``load`` event at the _parsing context_ object. |
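The token loop above can be sketched in Python as follows. The `Node` class, the tuple shapes used for tokens and the `build_tree` name are all assumptions made for illustration. The sketch omits attributes, script execution, import loads and the `load` event. It also pushes each new element onto the _stack of open nodes_, which the start tag steps seem to imply but do not state.

```python
class Node:
    def __init__(self, name, text=None):
        self.name = name      # tag name, or '#document' / '#text'
        self.text = text      # character data for text nodes
        self.children = []

def build_tree(tokens):
    """tokens: ('string', value), ('start', tag_name) or ('end', tag_name)."""
    document = Node("#document")
    stack = [document]  # the stack of open nodes; the last item is the top
    for token in tokens:
        kind = token[0]
        if kind == "string":
            value = token[1]
            inside_t = any(node.name == "t" for node in stack)
            # Skip whitespace-only strings unless a 't' element is open.
            if value.strip(" \n") == "" and not inside_t:
                continue
            stack[-1].children.append(Node("#text", value))
        elif kind == "start":
            node = Node(token[1])
            stack[-1].children.append(node)
            stack.append(node)  # assumption: the element becomes the top node
        elif kind == "end":
            # Find the topmost open node with a matching tag name.
            index = None
            for i in range(len(stack) - 1, 0, -1):
                if stack[i].name == token[1]:
                    index = i
                    break
            if index is None:
                continue  # no matching open node: skip the token
            if any(node.name == "template" for node in stack[index + 1:]):
                continue  # a template above the match blocks the end tag
            del stack[index:]  # pop up to and including the matched node
    return document
```

With this sketch, an end tag that matches an element outside an open `template` is ignored, so following text still lands inside the `template`.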