[WIP] Streaming CSS parser
This patch introduces a CSSParserTokenStream class, which is a lazily
tokenized list of CSSParserTokens. It has a similar interface to
CSSParserTokenRange, but because it tokenizes lazily it lets us query
the character offset up to which we have tokenized. Instead of
tokenizing entire stylesheets up front, tokenization is now interleaved
with parsing. This does *not* make the parser interruptible.
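To illustrate the idea, here is a minimal sketch of a lazily tokenizing stream that exposes its character offset. This is toy code with a trivial whitespace "tokenizer", not Blink's actual CSSParserTokenStream API; all names here are illustrative.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

struct Token {
  std::string value;  // empty value doubles as the EOF token here
};

class TokenStream {
 public:
  explicit TokenStream(std::string input) : input_(std::move(input)) {}

  // Peek() tokenizes one token of lookahead on demand. Note that this
  // advances the underlying character offset.
  const Token& Peek() {
    if (!has_lookahead_) {
      lookahead_ = Lex();
      has_lookahead_ = true;
    }
    return lookahead_;
  }

  Token Consume() {
    Peek();
    has_lookahead_ = false;
    return lookahead_;
  }

  bool AtEnd() { return Peek().value.empty(); }

  // Character offset of how far we have tokenized (includes lookahead).
  size_t Offset() const { return offset_; }

 private:
  // Stand-in lexer: splits on whitespace instead of real CSS tokenization.
  Token Lex() {
    while (offset_ < input_.size() &&
           std::isspace(static_cast<unsigned char>(input_[offset_]))) {
      ++offset_;
    }
    std::string v;
    while (offset_ < input_.size() &&
           !std::isspace(static_cast<unsigned char>(input_[offset_]))) {
      v += input_[offset_++];
    }
    return Token{v};
  }

  std::string input_;
  size_t offset_ = 0;
  Token lookahead_;
  bool has_lookahead_ = false;
};
```

With this shape, the parser can read Offset() at any point to learn how far into the source text tokenization has progressed, which an up-front token vector cannot provide.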
Lazy parsing now stores just the start offset of a declaration list
instead of a vector of CSSParserTokens. A function on CSSTokenizer is
added to efficiently skip over a block to support this, without needing
to actually tokenize. Most of the complexity in this comes from url
tokens, which have interesting error recovery. The empty-block
optimization in lazy parsing is removed, although I suspect it is
generally not useful as empty blocks are only going to be used as
sentinel values. We could re-add this later by checking the number of
characters if needed (the minimum required is 3, e.g. "x:0"), although
it's slightly awkward as we would skip the block but then have to rewind
to tokenize it.
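The skip-over-a-block idea can be sketched as a raw character scan that tracks brace depth and treats strings and comments opaquely. This is a simplified, hypothetical standalone version, not the CSSTokenizer function itself; in particular it omits the url-token error recovery mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the offset just past the '}' matching the '{' at `start`,
// or std::string::npos if the block is unterminated.
size_t SkipBlock(const std::string& s, size_t start) {
  assert(s[start] == '{');
  int depth = 0;
  for (size_t i = start; i < s.size(); ++i) {
    char c = s[i];
    if (c == '"' || c == '\'') {
      // Skip a string literal, honoring backslash escapes.
      char quote = c;
      for (++i; i < s.size() && s[i] != quote; ++i) {
        if (s[i] == '\\') ++i;
      }
    } else if (c == '/' && i + 1 < s.size() && s[i + 1] == '*') {
      // Skip a /* ... */ comment.
      size_t end = s.find("*/", i + 2);
      if (end == std::string::npos) return std::string::npos;
      i = end + 1;
    } else if (c == '{') {
      ++depth;
    } else if (c == '}') {
      if (--depth == 0) return i + 1;
    }
  }
  return std::string::npos;
}
```

Because the scan never materializes tokens, skipping a declaration block is O(length) with no allocation, and the recorded start offset is enough to tokenize the block later if the rule's contents are actually needed.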
The CSSParserObserverWrapper class, which was used to store character
offset information about tokens and comment locations, is removed since
we can now extract the information directly from stream objects. This
means anywhere that requires this offset information now needs to
operate on stream objects instead of ranges. This requires being careful
when calling peek()/atEnd() as token lookahead advances the current
offset.
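The lookahead caveat can be made concrete with a toy stream (again hypothetical names, not Blink's classes): because Peek() tokenizes ahead, the offset must be captured *before* any peek if you want the position where a construct starts.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

class CharStream {
 public:
  explicit CharStream(std::string s) : s_(std::move(s)) {}

  // Peek() lexes one character of lookahead, advancing offset_.
  char Peek() {
    if (!has_lookahead_ && offset_ < s_.size()) {
      lookahead_ = s_[offset_++];
      has_lookahead_ = true;
    }
    return has_lookahead_ ? lookahead_ : '\0';
  }

  size_t Offset() const { return offset_; }

 private:
  std::string s_;
  size_t offset_ = 0;
  char lookahead_ = '\0';
  bool has_lookahead_ = false;
};
```

Usage pattern:

```cpp
CharStream s("div{}");
size_t rule_start = s.Offset();  // capture before peeking
s.Peek();                        // lookahead advances the offset
// s.Offset() is now past the peeked character, but the rule still
// starts at rule_start.
```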
For at-rules, we now need to pass in the offsets where the preludes
start, so that the generic at-rule parsing logic stays independent of
the individual at-rule logic. For style rules we now take
a stream from the start of the selector, instead of a range and a stream
for the block, as the observer requires callbacks for the selector
structure. The callbacks for @import rules now also include the imported
url (i.e. contain the entire prelude) for simplicity.
This patch removes some of the tracing metrics we have around
tokenization and parsing. As these are now interleaved, we can no longer
have separate measurements for tokenization and parsing. We also lose
the information about number of tokens as when lazy parsing is enabled
we will skip tokenization inside style rule declaration blocks.
************* TODO: Work out what to do with blink_style bucketing
This patch greatly reduces the memory requirements of stylesheet parsing
as we discard tokens after parsing them. Without this patch we allocate
a large contiguous chunk of memory for storing tokens (which can be up
to a few MB for large web properties). With this patch, the vector
usually caps at 128 tokens or fewer.
The two changes in InspectorStyleSheet are due to slight changes in how
we call the CSSParserObserver. Firstly we now call it for style rules
even when the selector is invalid (startRuleHeader, observeSelector,
endRuleHeader, but no start/endRuleBlock). Secondly we now nest keyframe
rules inside the @keyframes rules (endRuleBlock for @keyframes is called
after the nested blocks).
************* TODO: Investigate and comment on performance
************* TODO: Comment on custom property stuff
Some more details are available in the design doc:
https://docs.google.com/document/d/125OYuPzEXLziVNzzgClkRdggS1iyGEqC75c_H61rPJM/edit
BUG=661854