| OLD | NEW |
| 1 Parsing | 1 Parsing |
| 2 ======= | 2 ======= |
| 3 | 3 |
| 4 Parsing in Sky is a strict pipeline consisting of five stages: | 4 Parsing in Sky is a strict pipeline consisting of five stages: |
| 5 | 5 |
| 6 - decoding, which converts incoming bytes into Unicode characters | 6 - decoding, which converts incoming bytes into Unicode characters |
| 7 using UTF-8. | 7 using UTF-8. |
| 8 | 8 |
| 9 - normalising, which manipulates the sequence of characters. | 9 - normalising, which manipulates the sequence of characters. |
| 10 | 10 |
| 11 - tokenising, which converts these characters into three kinds of | 11 - tokenising, which converts these characters into four kinds of |
| 12 tokens: character tokens, start tag tokens, and end tag tokens. | 12 tokens: character tokens, start tag tokens, end tag tokens, and |
| 13 Character tokens have a single character value. Tag tokens have a | 13 automatic end tag tokens. Character tokens have a single character |
| 14 tag name, and a list of name/value pairs known as attributes. | 14 value. Start and end tag tokens have a tag name, and a list of |
| | 15 name/value pairs known as attributes. |
| 15 | 16 |
| 16 - token cleanup, which converts sequences of character tokens into | 17 - token cleanup, which converts sequences of character tokens into |
| 17 string tokens, and removes duplicate attributes in tag tokens. | 18 string tokens, and removes duplicate attributes in tag tokens. |
| 18 | 19 |
| 19 - tree construction, which converts these tokens into a tree of nodes. | 20 - tree construction, which converts these tokens into a tree of nodes. |
| 20 | 21 |
| 21 Later stages cannot affect earlier stages. | 22 Later stages cannot affect earlier stages. |
| 22 | 23 |
| 23 When a sequence of bytes is to be parsed, there is always a defined | 24 When a sequence of bytes is to be parsed, there is always a defined |
| 24 _parsing context_, which is either an Application object or a Module | 25 _parsing context_, which is either an Application object or a Module |
| (...skipping 339 matching lines...) |
| 364 switch to the **tag name** state. | 365 switch to the **tag name** state. |
| 365 | 366 |
| 366 * Anything else: Emit the character token for '``<``'. Switch to the | 367 * Anything else: Emit the character token for '``<``'. Switch to the |
| 367 **data** state without consuming the current character. | 368 **data** state without consuming the current character. |
| 368 | 369 |
| 369 | 370 |
| 370 #### **Close tag** state #### | 371 #### **Close tag** state #### |
| 371 | 372 |
| 372 If the current character is... | 373 If the current character is... |
| 373 | 374 |
| 374 * '``>``': Emit character tokens for '``</>``'. Consume the current | 375 * '``>``': Emit an automatic end tag token. Switch to the **data** |
| 375 character. Switch to the **data** state. | 376 state. |
| 376 | 377 |
| 377 * '``0``'..'``9``', '``a``'..'``z``', '``A``'..'``Z``', | 378 * '``0``'..'``9``', '``a``'..'``z``', '``A``'..'``Z``', |
| 378 '``-``', '``_``', '``.``': Create an end tag token, let its | 379 '``-``', '``_``', '``.``': Create an end tag token, let its |
| 379 tag name be the current character, consume the current character and | 380 tag name be the current character, consume the current character and |
| 380 switch to the **tag name** state. | 381 switch to the **tag name** state. |
| 381 | 382 |
| 382 * Anything else: Emit the character tokens for '``</``'. Switch to | 383 * Anything else: Emit the character tokens for '``</``'. Switch to |
| 383 the **data** state without consuming the current character. | 384 the **data** state without consuming the current character. |
| 384 | 385 |
| 385 | 386 |
| (...skipping 159 matching lines...) |
| 545 * Anything else: Append the current character to the value of the most | 546 * Anything else: Append the current character to the value of the most |
| 546 recently added attribute. Consume the current character. Stay in | 547 recently added attribute. Consume the current character. Stay in |
| 547 this state. | 548 this state. |
| 548 | 549 |
| 549 | 550 |
| 550 #### **After tag** state #### | 551 #### **After tag** state #### |
| 551 | 552 |
| 552 Emit the tag token. | 553 Emit the tag token. |
| 553 | 554 |
| 554 If the tag token was a start tag token and the tag name was | 555 If the tag token was a start tag token and the tag name was |
| 555 '``script``', then and switch to the **script raw data** state. | 556 '``script``', then switch to the **script raw data** state. |
| 556 | 557 |
| 557 If the tag token was a start tag token and the tag name was | 558 If the tag token was a start tag token and the tag name was |
| 558 '``style``', then and switch to the **style raw data** state. | 559 '``style``', then switch to the **style raw data** state. |
| 559 | 560 |
| 560 Otherwise, switch to the **data** state. | 561 Otherwise, switch to the **data** state. |
| 561 | 562 |
| 562 | 563 |
| 563 #### **After void tag** state #### | 564 #### **After void tag** state #### |
| 564 | 565 |
| 565 Emit the tag token. | 566 Emit the tag token. |
| 566 | 567 |
| 567 If the tag token is a start tag token, emit an end tag token with the | 568 If the tag token is a start tag token, emit an end tag token with the |
| 568 same tag name. | 569 same tag name. |
| (...skipping 256 matching lines...) |
| 825 whose tag name is _tag name_, if any. If there isn't one, skip | 826 whose tag name is _tag name_, if any. If there isn't one, skip |
| 826 this token. | 827 this token. |
| 827 3. If there's a ``template`` element in the _stack of open | 828 3. If there's a ``template`` element in the _stack of open |
| 828 nodes_ above _node_, then skip this token. | 829 nodes_ above _node_, then skip this token. |
| 829 4. Pop nodes from the _stack of open nodes_ until _node_ has been | 830 4. Pop nodes from the _stack of open nodes_ until _node_ has been |
| 830 popped. | 831 popped. |
| 831 5. If _node_'s tag name is ``script``, then yield until _imported | 832 5. If _node_'s tag name is ``script``, then yield until _imported |
| 832 modules_ contains no entries with unresolved promises, then | 833 modules_ contains no entries with unresolved promises, then |
| 833 execute the script given by the element's contents, using the | 834 execute the script given by the element's contents, using the |
| 834 associated names as appropriate. | 835 associated names as appropriate. |
| | 836 - If _token_ is an automatic end tag token: |
| | 837 1. Pop the top node from the _stack of open nodes_, if any. |
| 835 4. Yield until _imported modules_ has no promises. | 838 4. Yield until _imported modules_ has no promises. |
| 836 5. Fire a ``load`` event at the _parsing context_ object. | 839 5. Fire a ``load`` event at the _parsing context_ object. |
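
The behaviour on the NEW side of this diff can be sketched roughly as follows, covering the **Close tag** state and the tree-construction step for automatic end tag tokens. This is an illustrative Python sketch, not the Sky implementation. The names `CharToken`, `EndTagToken`, `AutoEndTagToken`, `Step`, `close_tag_step`, and `apply_auto_end_tag` are hypothetical, and nodes are represented as plain tag-name strings. The NEW text no longer says whether '``>``' is consumed. The sketch assumes it is, as the OLD text stated explicitly.

```python
from dataclasses import dataclass
from typing import List, NamedTuple, Optional
import string

# Characters that may begin a tag name in the Close tag state.
TAG_NAME_START = set(string.digits + string.ascii_letters + "-_.")


@dataclass
class CharToken:
    value: str


@dataclass
class EndTagToken:
    name: str


@dataclass
class AutoEndTagToken:
    """Hypothetical token type for '</>' (NEW side of the diff)."""


class Step(NamedTuple):
    emitted: list                   # tokens emitted by this step
    pending: Optional[EndTagToken]  # tag token still being built, if any
    next_state: str                 # name of the state to switch to
    consumed: bool                  # whether the current character was consumed


def close_tag_step(ch: str) -> Step:
    """One step of the Close tag state for the current character `ch`."""
    if ch == ">":
        # Assumption: '>' is consumed, as the OLD text said explicitly.
        return Step([AutoEndTagToken()], None, "data", True)
    if ch in TAG_NAME_START:
        # Start an end tag token whose name begins with this character.
        return Step([], EndTagToken(name=ch), "tag name", True)
    # Anything else: emit '</' as text and let the data state reprocess ch.
    return Step([CharToken("<"), CharToken("/")], None, "data", False)


def apply_auto_end_tag(open_nodes: List[str]) -> None:
    """Tree construction: pop the top of the stack of open nodes, if any."""
    if open_nodes:
        open_nodes.pop()
```

For example, `close_tag_step(">")` yields a single automatic end tag token and returns to the data state, and `apply_auto_end_tag` on a stack of `["div", "span"]` leaves `["div"]`. Unlike a named end tag, the automatic end tag needs no tag-name lookup or ``template`` check in this sketch, because it only ever pops the innermost open node.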