|
|
Created:
6 years, 3 months ago by jbroman Modified:
6 years, 3 months ago CC:
blink-reviews, jamesr, blink-reviews-rendering, krit, zoltan1, eae+blinkwatch, leviw+renderwatch, danakj, Rik, jchaffraix+rendering, pdr., rwlbuis, Stephen Chennney, rune+blink Base URL:
svn://svn.chromium.org/blink/trunk Project:
blink Visibility:
Public. |
DescriptionTextBlob: Start caching a text blob per InlineTextBox.
BUG=377845
Committed: https://src.chromium.org/viewvc/blink?view=rev&revision=182243
Patch Set 1 #Patch Set 2 : rewrite in light of changed Font::drawText signature #
Total comments: 13
Patch Set 3 : rebase, review comments #Patch Set 4 : turn off runtime-enabled feature for landing #
Messages
Total messages: 24 (7 generated)
Patchset #2 (id:20001) has been deleted
Patchset #3 (id:60001) has been deleted
Patchset #2 (id:40001) has been deleted
jbroman@chromium.org changed reviewers: + fmalita@chromium.org, reed@google.com
I was originally going to plumb this through Font::drawText, but it's non-trivial to get the float it's now supposed to return. We could return a dummy value in the cached-textblob case (but that's wrong and confusing), or cache the width (even though we are guaranteed not to need it), or compute it based on bounds (I don't think this is correct). So instead I made TextPainter responsible for building and then drawing the text blob. This is pretty limited at present, but it does A previous version of this patch got about 30% reduction in record time on desktop Wikipedia, and I expect this one is similar. Unfortunately, an OS upgrade broke Telemetry on me, so I can't measure right now. I will of course revert the RuntimeEnabledFeatures.in change before landing.
On 2014/09/11 18:11:42, jbroman wrote: > This is pretty limited at present, but it does ...get most of the simple text in practice, and seems like a reasonable starting point.
Layout test results include 3 expected changes, and the remainder (canvas tests) are unrelated to this change, and fmalita@ is investigating them: https://code.google.com/p/chromium/issues/detail?id=412445
Awesome! On 2014/09/11 18:11:42, jbroman wrote: > I was originally going to plumb this through Font::drawText, but it's > non-trivial to get the float it's now supposed to return. We could return a > dummy value in the cached-textblob case (but that's wrong and confusing), or > cache the width (even though we are guaranteed not to need it), or compute it > based on bounds (I don't think this is correct). Yeah, Font::draw(simple?)Text does look like a better spot to hook the cache. Since only GraphicsContext::drawBidiText seems to care about the float return value, can we make that an optional out-param in Font::drawText and bypass the cache when the value is requested? > A previous version of this patch got about 30% reduction in record time on > desktop Wikipedia, and I expect this one is similar. Unfortunately, an OS > upgrade broke Telemetry on me, so I can't measure right now. Ouch, good to know I should stay clear of that Trusty upgrade for now :) Can you run it through cluster telemetry instead? I'd also like to get a feel of the memory impact for text-heavy pages before landing -- I think there are some telemetry measurements we can use. https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/In... File Source/core/rendering/InlineTextBox.h (right): https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/In... Source/core/rendering/InlineTextBox.h:165: TextBlobPtr m_textBlobCache; naming nit: this sounds a bit like a general/multiple entry cache, maybe m_cachedBlob or m_cachedTextBlob would be clearer? https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/Te... File Source/core/rendering/TextPainter.cpp (right): https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/Te... Source/core/rendering/TextPainter.cpp:121: // the blob here and then pass that reference to Skia. We may be able to attain the same using something like TextBlobPtr throwAwayBlob; if (!textBlobCache) { textBlobCache = &throwAwayBlob; } if (!*textBlobCache && canUseTextBlobs) *textBlobCache = m_font.buildTextBlob(textRunPaintInfo, m_textOrigin, m_graphicsContext->couldUseLCDRenderedText()); if (*textBlobCache && canUseTextBlobs) { m_font.drawTextBlob(m_graphicsContext, textBlobCache->get(), m_textOrigin.data()); } else { m_graphicsContext->drawText(m_font, textRunPaintInfo, m_textOrigin); } WDYT? https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/Te... File Source/core/rendering/TextPainter.h (right): https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/Te... Source/core/rendering/TextPainter.h:9: #include "platform/fonts/TextBlob.h" Could we use a forward declaration in the header? https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/Te... Source/core/rendering/TextPainter.h:69: void paintInternal(int startOffset, int endOffset, int truncationPoint, TextBlobPtr* textBlobCache = 0); We may avoid hauling textBlobCache all over the place if we stick it into TextRunPaintInfo (and use that for passing startOffset and endOffset too). Not sure how intrusive a change that is, maybe we can clean up later.
On 2014/09/12 13:21:24, Florin Malita wrote: > Awesome! > > > On 2014/09/11 18:11:42, jbroman wrote: > > I was originally going to plumb this through Font::drawText, but it's > > non-trivial to get the float it's now supposed to return. We could return a > > dummy value in the cached-textblob case (but that's wrong and confusing), or > > cache the width (even though we are guaranteed not to need it), or compute it > > based on bounds (I don't think this is correct). > > Yeah, Font::draw(simple?)Text does look like a better spot to hook the cache. > > Since only GraphicsContext::drawBidiText seems to care about the float return > value, can we make that an optional out-param in Font::drawText and bypass the > cache when the value is requested? Could possibly do. I'm of two minds about this. I don't really like the idea of "we'll ignore argument i if you also supply argument j" in API design. It could have really confusing performance characteristics if some caller started asking for width and suddenly found everything dramatically slower. > > A previous version of this patch got about 30% reduction in record time on > > desktop Wikipedia, and I expect this one is similar. Unfortunately, an OS > > upgrade broke Telemetry on me, so I can't measure right now. > > Ouch, good to know I should stay clear of that Trusty upgrade for now :) Can you > run it through cluster telemetry instead? Managed to fix it (not actually a Trusty issue; it turns out someone just landed a Telemetry change while I was upgrading my workstation so I blamed Trusty), though the fix isn't yet landed in ToT. Here are a bunch of record time numbers on some of the more text-y pages we have (from my N7 now, since eseidel@ asked for mobile on the doc). top_10_mobile Wikipedia SkPicture 2.11 -> 1.34 ms -37%, SkRecord 2.01 -> 1.25 ms -38% top_10_mobile Google SkPicture 3.98 -> 3.39 ms -15%, SkRecord 3.77 -> 3.20 ms -15% top_10_mobile Yahoo! SkPicture 3.55 -> 2.91 ms -18%, SkRecord 3.32 -> 2.71 ms -18% top_10_mobile Baidu SkPicture 3.06 -> 1.94 ms -37%, SkRecord 2.81 -> 1.84 ms -34% key_silk_cases Google "boogie" SkPicture 3.65 -> 3.10 ms -15%, SkRecord 3.51 -> 2.97 ms -15% key_mobile_sites NYTimes SkPicture 1.70 -> 1.28 ms -25%, SkRecord 1.65 -> 1.24 ms -25% > I'd also like to get a feel of the memory impact for text-heavy pages before > landing -- I think there are some telemetry measurements we can use. The mobile memory one complains about a missing .so when I try to use it. There are better and worse cases for memory. Here are some of the torture cases on my workstation measured with the task manager. ;) HTML5 spec (single-page, about 450k words) - wait for load, then check renderer memory (161,932k -> 166,764k) - scroll to bottom with PgDn, then back up, then check renderer memory (361,972k -> 578,696k) Wikipedia: Google Chrome (about 12k words) - scroll to bottom then back up (58,908k -> 64,668k) Keep in mind that with impl-side painting, we record the viewport + 8000px in each direction, which is only about 1% of the HTML5 spec, but is most of the wiki article. Since this patch does not discard blobs as long as the box is held by the RenderObject tree, we do see significant memory impact if there is a lot of text >8000px offscreen. The difference in the "wait for load" only case is probably mostly the null pointer per InlineTextBox. The scrolling case difference probably includes the cost of holding way-offscreen text blobs. This is a torture case, though -- I expect few pages would suffer so extremely.
On 2014/09/12 19:06:01, jbroman wrote: > Here are a bunch of record time numbers on some of the more text-y pages we have > (from my N7 now, since eseidel@ asked for mobile on the doc). > > top_10_mobile Wikipedia SkPicture 2.11 -> 1.34 ms -37%, SkRecord 2.01 > -> 1.25 ms -38% > top_10_mobile Google SkPicture 3.98 -> 3.39 ms -15%, SkRecord 3.77 > -> 3.20 ms -15% > top_10_mobile Yahoo! SkPicture 3.55 -> 2.91 ms -18%, SkRecord 3.32 > -> 2.71 ms -18% > top_10_mobile Baidu SkPicture 3.06 -> 1.94 ms -37%, SkRecord 2.81 > -> 1.84 ms -34% > key_silk_cases Google "boogie" SkPicture 3.65 -> 3.10 ms -15%, SkRecord 3.51 > -> 2.97 ms -15% > key_mobile_sites NYTimes SkPicture 1.70 -> 1.28 ms -25%, SkRecord 1.65 > -> 1.24 ms -25% > > > I'd also like to get a feel of the memory impact for text-heavy pages before > > landing -- I think there are some telemetry measurements we can use. > > The mobile memory one complains about a missing .so when I try to use it. > > There are better and worse cases for memory. Here are some of the torture cases > on my workstation measured with the task manager. ;) > > HTML5 spec (single-page, about 450k words) > - wait for load, then check renderer memory (161,932k -> 166,764k) > - scroll to bottom with PgDn, then back up, then check renderer memory (361,972k > -> 578,696k) > > Wikipedia: Google Chrome (about 12k words) > - scroll to bottom then back up (58,908k -> 64,668k) > > Keep in mind that with impl-side painting, we record the viewport + 8000px in > each direction, which is only about 1% of the HTML5 spec, but is most of the > wiki article. > Since this patch does not discard blobs as long as the box is held by the > RenderObject tree, we do see significant memory impact if there is a lot of text > >8000px offscreen. > > The difference in the "wait for load" only case is probably mostly the null > pointer per InlineTextBox. > The scrolling case difference probably includes the cost of holding > way-offscreen text blobs. > > This is a torture case, though -- I expect few pages would suffer so extremely. I'm not too worried about the blob mem overhead at this point, we haven't started squeezing them yet. Just want to make sure that the extra InlineTextBox pointer is not going to have a significant impact with blobs/caching disabled... looks about 3% for text-heavy pages if I parse your numbers right, which will likely light some perf bots up. Ideally we'll end up anchoring the bobs on RenderText which should make things (much?) better, but it's kind of unfortunate that we have to take a hit first.
these didn't get published before; sending now; will reply to your new comments later https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/In... File Source/core/rendering/InlineTextBox.h (right): https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/In... Source/core/rendering/InlineTextBox.h:165: TextBlobPtr m_textBlobCache; On 2014/09/12 13:21:23, Florin Malita wrote: > naming nit: this sounds a bit like a general/multiple entry cache, maybe > m_cachedBlob or m_cachedTextBlob would be clearer? m_cachedTextBlob was previously the name, and I bikeshedded away. Will change it back in the next patch set. https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/Te... File Source/core/rendering/TextPainter.cpp (right): https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/Te... Source/core/rendering/TextPainter.cpp:121: // the blob here and then pass that reference to Skia. On 2014/09/12 13:21:23, Florin Malita wrote: > We may be able to attain the same using something like > > TextBlobPtr throwAwayBlob; > if (!textBlobCache) { > textBlobCache = &throwAwayBlob; > } > > if (!*textBlobCache && canUseTextBlobs) > *textBlobCache = m_font.buildTextBlob(textRunPaintInfo, m_textOrigin, > m_graphicsContext->couldUseLCDRenderedText()); > if (*textBlobCache && canUseTextBlobs) { > m_font.drawTextBlob(m_graphicsContext, textBlobCache->get(), > m_textOrigin.data()); > } else { > m_graphicsContext->drawText(m_font, textRunPaintInfo, m_textOrigin); > } > > WDYT? Yeah, okay. I'm not sure whether that's clearer or not, but I have no objection. https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/Te... File Source/core/rendering/TextPainter.h (right): https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/Te... Source/core/rendering/TextPainter.h:9: #include "platform/fonts/TextBlob.h" On 2014/09/12 13:21:24, Florin Malita wrote: > Could we use a forward declaration in the header? Tricky because it's a typedef for a templated class, so I think I'd have to forward-declare RefPtr (e.g. via wtf/RefPtr.h) and SkTextBlob, and then create the typedef. I'm not really sure this is worth addressing now (especially since this header only ends up in two translation units at present anyhow). https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/Te... Source/core/rendering/TextPainter.h:69: void paintInternal(int startOffset, int endOffset, int truncationPoint, TextBlobPtr* textBlobCache = 0); On 2014/09/12 13:21:24, Florin Malita wrote: > We may avoid hauling textBlobCache all over the place if we stick it into > TextRunPaintInfo (and use that for passing startOffset and endOffset too). Not > sure how intrusive a change that is, maybe we can clean up later. I'm mostly afraid that you can pass TextRunPaintInfo into places that would not respect that, which might be confusing.
On 2014/09/12 20:36:35, Florin Malita wrote: > On 2014/09/12 19:06:01, jbroman wrote: > > Here are a bunch of record time numbers on some of the more text-y pages we > have > > (from my N7 now, since eseidel@ asked for mobile on the doc). > > > > top_10_mobile Wikipedia SkPicture 2.11 -> 1.34 ms -37%, SkRecord > 2.01 > > -> 1.25 ms -38% > > top_10_mobile Google SkPicture 3.98 -> 3.39 ms -15%, SkRecord > 3.77 > > -> 3.20 ms -15% > > top_10_mobile Yahoo! SkPicture 3.55 -> 2.91 ms -18%, SkRecord > 3.32 > > -> 2.71 ms -18% > > top_10_mobile Baidu SkPicture 3.06 -> 1.94 ms -37%, SkRecord > 2.81 > > -> 1.84 ms -34% > > key_silk_cases Google "boogie" SkPicture 3.65 -> 3.10 ms -15%, SkRecord > 3.51 > > -> 2.97 ms -15% > > key_mobile_sites NYTimes SkPicture 1.70 -> 1.28 ms -25%, SkRecord > 1.65 > > -> 1.24 ms -25% > > > > > I'd also like to get a feel of the memory impact for text-heavy pages before > > > landing -- I think there are some telemetry measurements we can use. > > > > The mobile memory one complains about a missing .so when I try to use it. > > > > There are better and worse cases for memory. Here are some of the torture > cases > > on my workstation measured with the task manager. ;) > > > > HTML5 spec (single-page, about 450k words) > > - wait for load, then check renderer memory (161,932k -> 166,764k) > > - scroll to bottom with PgDn, then back up, then check renderer memory > (361,972k > > -> 578,696k) > > > > Wikipedia: Google Chrome (about 12k words) > > - scroll to bottom then back up (58,908k -> 64,668k) > > > > Keep in mind that with impl-side painting, we record the viewport + 8000px in > > each direction, which is only about 1% of the HTML5 spec, but is most of the > > wiki article. > > Since this patch does not discard blobs as long as the box is held by the > > RenderObject tree, we do see significant memory impact if there is a lot of > text > > >8000px offscreen. > > > > The difference in the "wait for load" only case is probably mostly the null > > pointer per InlineTextBox. > > The scrolling case difference probably includes the cost of holding > > way-offscreen text blobs. > > > > This is a torture case, though -- I expect few pages would suffer so > extremely. > > I'm not too worried about the blob mem overhead at this point, we haven't > started squeezing them yet. Just want to make sure that the extra InlineTextBox > pointer is not going to have a significant impact with blobs/caching disabled... > looks about 3% for text-heavy pages if I parse your numbers right, which will > likely light some perf bots up. > > Ideally we'll end up anchoring the bobs on RenderText which should make things > (much?) better, but it's kind of unfortunate that we have to take a hit first. The alternative is using a map hosted somewhere higher (global, document, or something like that) to map the InlineTextBox pointers to text blobs. Slower to look up, but will cause the memory impact with text blobs disabled to be negligible. I'm not sure whether RenderText will be the right place to hold this, but my (unproven) suspicion is that the number of inline text boxes per RenderText is on three or so on average (since I suspect a lot of text nodes don't end up being broken), so I'm not sure it will be a colossal win.
fmalita@chromium.org changed reviewers: + eae@chromium.org, leviw@chromium.org
On 2014/09/15 14:24:54, jbroman wrote: > The alternative is using a map hosted somewhere higher (global, document, or > something like that) to map the InlineTextBox pointers to text blobs. Slower to > look up, but will cause the memory impact with text blobs disabled to be > negligible. > A side cache seems easier to swallow at this point, especially if the hashmap lookup is not impacting blob caching gains significantly. Can you give it a quick try and verify that it doesn't obliterate blob perf? I'm actually on the fence and could be persuaded either way, but it seems more cautious to defer paying the mem penalty until we actually turn blob caching on. Curious what other folks think. https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/In... File Source/core/rendering/InlineTextBox.cpp (right): https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/In... Source/core/rendering/InlineTextBox.cpp:85: void InlineTextBox::markDirty() Do we need to invalidate the cache on markDirty()?
On 2014/09/16 00:18:29, Florin Malita wrote: > On 2014/09/15 14:24:54, jbroman wrote: > > The alternative is using a map hosted somewhere higher (global, document, or > > something like that) to map the InlineTextBox pointers to text blobs. Slower > to > > look up, but will cause the memory impact with text blobs disabled to be > > negligible. > > > > A side cache seems easier to swallow at this point, especially if the hashmap > lookup is not impacting blob caching gains significantly. Can you give it a > quick try and verify that it doesn't obliterate blob perf? > > I'm actually on the fence and could be persuaded either way, but it seems more > cautious to defer paying the mem penalty until we actually turn blob caching on. > Curious what other folks think. Long term we probably want the cache at a higher level, for now though I agree that a side cache is likely the right approach.
On 2014/09/16 00:27:06, eae wrote: > On 2014/09/16 00:18:29, Florin Malita wrote: > > On 2014/09/15 14:24:54, jbroman wrote: > > > The alternative is using a map hosted somewhere higher (global, document, or > > > something like that) to map the InlineTextBox pointers to text blobs. Slower > > to > > > look up, but will cause the memory impact with text blobs disabled to be > > > negligible. > > > > > > > A side cache seems easier to swallow at this point, especially if the hashmap > > lookup is not impacting blob caching gains significantly. Can you give it a > > quick try and verify that it doesn't obliterate blob perf? > > > > I'm actually on the fence and could be persuaded either way, but it seems more > > cautious to defer paying the mem penalty until we actually turn blob caching > on. > > Curious what other folks think. > > Long term we probably want the cache at a higher level, for now though I agree > that a side cache is likely the right approach. InlineTextBox is already virtual, so can we know at node-creation time that we will want storage for a blob, and only have that slot for subclasses that need it? Just a thought to same lots of unused ptrs.
Moved to a HashMap on the side. Performance hit seems negligible (mobile Wiki -34%, mobile Google -15%, mobile Yahoo -18%, mobile Baidu -37% on my Nexus 7). I expect memory usage to be lower when the REF is off, but slightly higher when it is on (compared to the previous patch set). https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/In... File Source/core/rendering/InlineTextBox.cpp (right): https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/In... Source/core/rendering/InlineTextBox.cpp:85: void InlineTextBox::markDirty() On 2014/09/16 00:18:29, Florin Malita wrote: > Do we need to invalidate the cache on markDirty()? I don't believe that we can actually paint a dirty InlineTextBox, but I'm not certain. I'll invalidate to be safe, and leave a FIXME. https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/In... File Source/core/rendering/InlineTextBox.h (right): https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/In... Source/core/rendering/InlineTextBox.h:165: TextBlobPtr m_textBlobCache; On 2014/09/13 00:06:42, jbroman wrote: > On 2014/09/12 13:21:23, Florin Malita wrote: > > naming nit: this sounds a bit like a general/multiple entry cache, maybe > > m_cachedBlob or m_cachedTextBlob would be clearer? > > m_cachedTextBlob was previously the name, and I bikeshedded away. Will change it > back in the next patch set. Actually, using a hash map on the side makes this moot. https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/Te... File Source/core/rendering/TextPainter.cpp (right): https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/Te... Source/core/rendering/TextPainter.cpp:121: // the blob here and then pass that reference to Skia. On 2014/09/13 00:06:42, jbroman wrote: > On 2014/09/12 13:21:23, Florin Malita wrote: > > We may be able to attain the same using something like > > > > TextBlobPtr throwAwayBlob; > > if (!textBlobCache) { > > textBlobCache = &throwAwayBlob; > > } > > > > if (!*textBlobCache && canUseTextBlobs) > > *textBlobCache = m_font.buildTextBlob(textRunPaintInfo, m_textOrigin, > > m_graphicsContext->couldUseLCDRenderedText()); > > if (*textBlobCache && canUseTextBlobs) { > > m_font.drawTextBlob(m_graphicsContext, textBlobCache->get(), > > m_textOrigin.data()); > > } else { > > m_graphicsContext->drawText(m_font, textRunPaintInfo, m_textOrigin); > > } > > > > WDYT? > > Yeah, okay. I'm not sure whether that's clearer or not, but I have no objection. I've changed to a variation on this theme.
LGTM (don't forget to switch the RTEF off before landing :P)
The CQ bit was checked by jbroman@chromium.org
The CQ bit was checked by jbroman@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patchset/554613004/120001
On 2014/09/18 13:20:50, Florin Malita wrote: > LGTM > > (don't forget to switch the RTEF off before landing :P) Aww, you're no fun. :P
Message was sent while issue was closed.
Committed patchset #4 (id:120001) as 182243
Message was sent while issue was closed.
https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/In... File Source/core/rendering/InlineTextBox.cpp (right): https://codereview.chromium.org/554613004/diff/80001/Source/core/rendering/In... Source/core/rendering/InlineTextBox.cpp:85: void InlineTextBox::markDirty() On 2014/09/17 22:25:37, jbroman wrote: > On 2014/09/16 00:18:29, Florin Malita wrote: > > Do we need to invalidate the cache on markDirty()? > > I don't believe that we can actually paint a dirty InlineTextBox, but I'm not > certain. I'll invalidate to be safe, and leave a FIXME. Just to be clear: we should *never* paint a dirty InlineTextBox. |