OLD | NEW |
1 Parsing | 1 Parsing |
2 ======= | 2 ======= |
3 | 3 |
4 Parsing in Sky is a strict pipeline consisting of four stages: | 4 Parsing in Sky is a strict pipeline consisting of four stages: |
5 | 5 |
6 - decoding, which converts incoming bytes into Unicode characters | 6 - decoding, which converts incoming bytes into Unicode characters |
7 using UTF-8 | 7 using UTF-8 |
8 | 8 |
9 - normalising, which converts certain sequences of characters | 9 - normalising, which converts certain sequences of characters |
10 | 10 |
(...skipping 131 matching lines...)
142 state, the _extra terminating character_ unset (or set to U+0000, | 142 state, the _extra terminating character_ unset (or set to U+0000, |
143 which has the same effect), and the _emitting operation_ being to | 143 which has the same effect), and the _emitting operation_ being to |
144 emit a character token for the given character. | 144 emit a character token for the given character. |
145 | 145 |
146 * Anything else: Emit the current input character as a character | 146 * Anything else: Emit the current input character as a character |
147 token. Consume the character. Stay in this state. | 147 token. Consume the character. Stay in this state. |
148 | 148 |
149 | 149 |
150 ### **Script raw data** state ### | 150 ### **Script raw data** state ### |
151 | 151 |
152 TOOD(ianh): spec this | 152 If the current character is... |
| 153 |
| 154 * '```<```': Consume the character and switch to the **script raw |
| 155 data: close 1** state. |
| 156 |
| 157 * Anything else: Emit the current input character as a character |
| 158 token. Consume the character. Stay in this state. |
| 159 |
| 160 |
| 161 ### **Script raw data: close 1** state ### |
| 162 |
| 163 If the current character is... |
| 164 |
| 165 * '```/```': Consume the character and switch to the **script raw |
| 166 data: close 2** state. |
| 167 |
| 168 * Anything else: Emit '```<```' character tokens. Consume the |
| 169 character. Switch to the **script raw data** state. |
| 170 |
| 171 |
| 172 ### **Script raw data: close 2** state ### |
| 173 |
| 174 If the current character is... |
| 175 |
| 176 * '```s```': Consume the character and switch to the **script raw |
| 177 data: close 3** state. |
| 178 |
| 179 * Anything else: Emit '```</```' character tokens. Consume the |
| 180 character. Switch to the **script raw data** state. |
| 181 |
| 182 |
| 183 ### **Script raw data: close 3** state ### |
| 184 |
| 185 If the current character is... |
| 186 |
| 187 * '```c```': Consume the character and switch to the **script raw |
| 188 data: close 4** state. |
| 189 |
| 190 * Anything else: Emit '```</s```' character tokens. Consume the |
| 191 character. Switch to the **script raw data** state. |
| 192 |
| 193 |
| 194 ### **Script raw data: close 4** state ### |
| 195 |
| 196 If the current character is... |
| 197 |
| 198 * '```r```': Consume the character and switch to the **script raw |
| 199 data: close 5** state. |
| 200 |
| 201 * Anything else: Emit '```</sc```' character tokens. Consume the |
| 202 character. Switch to the **script raw data** state. |
| 203 |
| 204 |
| 205 ### **Script raw data: close 5** state ### |
| 206 |
| 207 If the current character is... |
| 208 |
| 209 * '```i```': Consume the character and switch to the **script raw |
| 210 data: close 6** state. |
| 211 |
| 212 * Anything else: Emit '```</scr```' character tokens. Consume the |
| 213 character. Switch to the **script raw data** state. |
| 214 |
| 215 |
| 216 ### **Script raw data: close 6** state ### |
| 217 |
| 218 If the current character is... |
| 219 |
| 220 * '```p```': Consume the character and switch to the **script raw |
| 221 data: close 7** state. |
| 222 |
| 223 * Anything else: Emit '```</scri```' character tokens. Consume the |
| 224 character. Switch to the **script raw data** state. |
| 225 |
| 226 |
| 227 ### **Script raw data: close 7** state ### |
| 228 |
| 229 If the current character is... |
| 230 |
| 231 * '```t```': Consume the character and switch to the **script raw |
| 232 data: close 8** state. |
| 233 |
| 234 * Anything else: Emit '```</scrip```' character tokens. Consume the |
| 235 character. Switch to the **script raw data** state. |
| 236 |
| 237 |
| 238 ### **Script raw data: close 8** state ### |
| 239 |
| 240 If the current character is... |
| 241 |
| 242 * U+0020, U+000A, '```/```', '```>```': Create an end tag token, and |
| 243 let its tag name be the string '```script```'. Switch to the |
| 244 **before attribute name** state without consuming the character. |
| 245 |
| 246 * Anything else: Emit '```</script```' character tokens. Consume the |
| 247 character. Switch to the **script raw data** state. |
153 | 248 |
154 | 249 |
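The new **script raw data** states walk through the characters of '```</script```' one at a time. The following is a minimal Python sketch of that walk, not the Sky implementation. The function name `scan_raw_data`, its return shape, and the behaviour at end of input (stop without flushing a partial match) are assumptions made for illustration. The sketch follows the text as written, so the "Anything else" branches of the close states consume the mismatching character without emitting it.

```python
def scan_raw_data(text, tag_name="script"):
    """Sketch of the raw data / close N states for one tag name.

    Returns (emitted_text, index), where index is the position of the
    unconsumed character that ended the raw data (the point where the spec
    switches to the before attribute name state), or None if the input ran
    out first. End-of-input handling is an assumption; the spec text shown
    here does not cover it.
    """
    close_seq = "</" + tag_name  # one character is matched per close state
    emitted = []
    matched = 0  # 0 = raw data state, N = "close N" state
    i = 0
    while i < len(text):
        ch = text[i]
        if matched == len(close_seq):
            # Final close state: a terminator creates the end tag token.
            if ch in (" ", "\n", "/", ">"):
                return "".join(emitted), i  # character is not consumed
            emitted.append(close_seq)  # emit e.g. '</script' as characters
            matched = 0
        elif ch == close_seq[matched]:
            matched += 1  # advance to the next close state
        elif matched == 0:
            emitted.append(ch)  # raw data state: emit and stay
        else:
            emitted.append(close_seq[:matched])  # emit the partial match
            matched = 0  # back to the raw data state
        i += 1  # every branch that reaches here consumes the character
    return "".join(emitted), None


if __name__ == "__main__":
    print(scan_raw_data("x = 1;</script>"))
    print(scan_raw_data("p{}</style ", "style"))
```

Passing `"style"` as the tag name gives the seven-state variant, since '```</style```' is one character shorter.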
155 ### **Style raw data** state ### | 250 ### **Style raw data** state ### |
156 | 251 |
157 TOOD(ianh): spec this | 252 If the current character is... |
| 253 |
| 254 * '```<```': Consume the character and switch to the **style raw |
| 255 data: close 1** state. |
| 256 |
| 257 * Anything else: Emit the current input character as a character |
| 258 token. Consume the character. Stay in this state. |
158 | 259 |
159 | 260 |
160 ### **After tag** state ### | 261 ### **Style raw data: close 1** state ### |
161 | 262 |
162 Emit the tag token. | 263 If the current character is... |
163 | 264 |
164 If the tag token was a start tag token and the tag name was | 265 * '```/```': Consume the character and switch to the **style raw |
165 '```script```', then switch to the **script raw data** state. | 266 data: close 2** state. |
166 | 267 |
167 If the tag token was a start tag token and the tag name was | 268 * Anything else: Emit '```<```' character tokens. Consume the |
168 '```style```', then switch to the **style raw data** state. | 269 character. Switch to the **style raw data** state. |
169 | 270 |
170 Otherwise, switch to the **data** state. | 271 |
| 272 ### **Style raw data: close 2** state ### |
| 273 |
| 274 If the current character is... |
| 275 |
| 276 * '```s```': Consume the character and switch to the **style raw |
| 277 data: close 3** state. |
| 278 |
| 279 * Anything else: Emit '```</```' character tokens. Consume the |
| 280 character. Switch to the **style raw data** state. |
| 281 |
| 282 |
| 283 ### **Style raw data: close 3** state ### |
| 284 |
| 285 If the current character is... |
| 286 |
| 287 * '```t```': Consume the character and switch to the **style raw |
| 288 data: close 4** state. |
| 289 |
| 290 * Anything else: Emit '```</s```' character tokens. Consume the |
| 291 character. Switch to the **style raw data** state. |
| 292 |
| 293 |
| 294 ### **Style raw data: close 4** state ### |
| 295 |
| 296 If the current character is... |
| 297 |
| 298 * '```y```': Consume the character and switch to the **style raw |
| 299 data: close 5** state. |
| 300 |
| 301 * Anything else: Emit '```</st```' character tokens. Consume the |
| 302 character. Switch to the **style raw data** state. |
| 303 |
| 304 |
| 305 ### **Style raw data: close 5** state ### |
| 306 |
| 307 If the current character is... |
| 308 |
| 309 * '```l```': Consume the character and switch to the **style raw |
| 310 data: close 6** state. |
| 311 |
| 312 * Anything else: Emit '```</sty```' character tokens. Consume the |
| 313 character. Switch to the **style raw data** state. |
| 314 |
| 315 |
| 316 ### **Style raw data: close 6** state ### |
| 317 |
| 318 If the current character is... |
| 319 |
| 320 * '```e```': Consume the character and switch to the **style raw |
| 321 data: close 7** state. |
| 322 |
| 323 * Anything else: Emit '```</styl```' character tokens. Consume the |
| 324 character. Switch to the **style raw data** state. |
| 325 |
| 326 |
| 327 ### **Style raw data: close 7** state ### |
| 328 |
| 329 If the current character is... |
| 330 |
| 331 * U+0020, U+000A, '```/```', '```>```': Create an end tag token, and |
| 332 let its tag name be the string '```style```'. Switch to the |
| 333 **before attribute name** state without consuming the character. |
| 334 |
| 335 * Anything else: Emit '```</style```' character tokens. Consume the |
| 336 character. Switch to the **style raw data** state. |
171 | 337 |
172 | 338 |
173 ### **Tag open** state ### | 339 ### **Tag open** state ### |
174 | 340 |
175 If the current character is... | 341 If the current character is... |
176 | 342 |
177 * '```!```': Consume the character and switch to the **comment start | 343 * '```!```': Consume the character and switch to the **comment start |
178 1** state. | 344 1** state. |
179 | 345 |
180 * '```/```': Consume the character and switch to the **close tag | 346 * '```/```': Consume the character and switch to the **close tag |
(...skipping 186 matching lines...)
367 attribute value** state, the _extra terminating character_ unset (or | 533 attribute value** state, the _extra terminating character_ unset (or |
368 set to U+0000, which has the same effect), and the _emitting | 534 set to U+0000, which has the same effect), and the _emitting |
369 operation_ being to append the given character to the value of the | 535 operation_ being to append the given character to the value of the |
370 most recently added attribute. | 536 most recently added attribute. |
371 | 537 |
372 * Anything else: Append the current character to the value of the most | 538 * Anything else: Append the current character to the value of the most |
373 recently added attribute. Consume the current character. Stay in | 539 recently added attribute. Consume the current character. Stay in |
374 this state. | 540 this state. |
375 | 541 |
376 | 542 |
| 543 ### **After tag** state ### |
| 544 |
| 545 Emit the tag token. |
| 546 |
| 547 If the tag token was a start tag token and the tag name was |
| 548 '```script```', then switch to the **script raw data** state. |
| 549 |
| 550 If the tag token was a start tag token and the tag name was |
| 551 '```style```', then switch to the **style raw data** state. |
| 552 |
| 553 Otherwise, switch to the **data** state. |
| 554 |
| 555 |
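The **after tag** state is a three-way dispatch on the token that was just emitted. A minimal Python sketch, assuming tokens are represented as hypothetical `(kind, tag_name)` tuples and states as plain strings (neither representation comes from the spec):

```python
def after_tag(token, emitted):
    """Emit the tag token and return the name of the next tokenizer state.

    `token` is a (kind, tag_name) tuple where kind is "start" or "end";
    this shape is an illustrative assumption.
    """
    emitted.append(token)  # "Emit the tag token."
    kind, tag_name = token
    if kind == "start" and tag_name == "script":
        return "script raw data"
    if kind == "start" and tag_name == "style":
        return "style raw data"
    return "data"  # end tags and all other start tags


if __name__ == "__main__":
    out = []
    print(after_tag(("start", "script"), out))
    print(after_tag(("end", "script"), out))
```

Only start tags switch to a raw data state; an end tag named '```script```' or '```style```' returns to the **data** state.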
377 ### **Comment start 1** state ### | 556 ### **Comment start 1** state ### |
378 | 557 |
379 If the current character is... | 558 If the current character is... |
380 | 559 |
381 * '```-```': Consume the character and switch to the **comment start | 560 * '```-```': Consume the character and switch to the **comment start |
382 2** state. | 561 2** state. |
383 | 562 |
384 * '```>```': Emit character tokens for '```<!>```'. Consume the | 563 * '```>```': Emit character tokens for '```<!>```'. Consume the |
385 current character. Switch to the **data** state. | 564 current character. Switch to the **data** state. |
386 | 565 |
(...skipping 324 matching lines...)
711 - Pop nodes from the _stack of open nodes_ until a node with | 890 - Pop nodes from the _stack of open nodes_ until a node with |
712 a _tagName_ equal to _token.tagName_ has been popped. | 891 a _tagName_ equal to _token.tagName_ has been popped. |
713 2. Otherwise, ignore _token_. | 892 2. Otherwise, ignore _token_. |
714 - If _token_ is a comment token, | 893 - If _token_ is a comment token, |
715 1. Ignore _token_. | 894 1. Ignore _token_. |
716 - If _token_ is an EOF token, | 895 - If _token_ is an EOF token, |
717 1. Pop all the nodes from the _stack of open nodes_. | 896 1. Pop all the nodes from the _stack of open nodes_. |
718 2. Signal _document_ that parsing is complete. | 897 2. Signal _document_ that parsing is complete. |
719 | 898 |
720 TODO(ianh): <template>, <t> | 899 TODO(ianh): <template>, <t> |