|
|
DescriptionPicture Recording: fix the performance bottleneck in SkDeferredCanvas::isFullFrame
blink skips all pending commands during picture recording if it is drawing an opaque full-frame
geometry or image. This may improve performance for some edge cases. To recognize an opaque
full-frame drawing should be cheap enough. Otherwise, the overhead will offset the improvement.
Unfortunately, data from perf for content_shell on Nexus7 shows that SkDeferredCanvas::isFullFrame
is far from cheap. Table below shows that how much isFullFrame() costs in the whole render process.
benchmark percentage
my local benchmark(draw 1000 sprites) 4.1%
speedReading 2.8%
FishIETank(1000 fishes) 1.5%
GUIMark3 Bitmap 2.0%
By contrast, real recording (SkGPipeCanvas::drawBitmapRectToRect) and real rasterization
(GrDrawTarget::drawRect) cost ~4% and ~6% in the whole render process respectively. Apparently,
SkDeferredCanvas::isFullFrame() is nontrivial.
getDeviceSize() is the main contributor to this hotspot. The change simply save the canvasSize and
reuse it among drawings if it is not a fresh frame. This change cut off ~65% (or improved ~2 times)
of isFullFrame().
telemetry smoothness canvas_tough_test didn't show obvious improvement or regression.
BUG=411166
Committed: https://skia.googlesource.com/skia/+/8e45c3777d886ba3fe239bb549d06b0693692152
Committed: https://skia.googlesource.com/skia/+/49005bf8929dd8ca86431e13414d683b509cd538
Patch Set 1 #Patch Set 2 : pass skia unittest DeferredCanvas in dm #Patch Set 3 : use interface in SkDeferredCanvas #
Total comments: 7
Patch Set 4 : coding style + add a interface SkDeferredCanvas::canvasSizeChanged #Patch Set 5 : expose an API SkDeferredCanvas::getCanvasSize according to reed's suggestion #Patch Set 6 : add a test case for SkDeferredCanvas::getCanvasSize into tests/DeferredCanvasTest.cpp #
Total comments: 4
Patch Set 7 : nits #Patch Set 8 : a small fix for memory leak in unit test #
Messages
Total messages: 33 (5 generated)
yunchao.he@intel.com changed reviewers: + junov@chromium.org, tomhudson@google.com
PTAL. Thanks a lot! As you know, we discussed the c1k problem in Graphics-dev: https://groups.google.com/a/chromium.org/forum/#!topic/graphics-dev/RzHob5VS8Dg. I suppose this is my last patch after 2 weeks investigation. I will make a summary is these days. And file a bug for some other bottlenecks. https://codereview.chromium.org/545813002/diff/40001/src/utils/SkDeferredCanv... File src/utils/SkDeferredCanvas.cpp (right): https://codereview.chromium.org/545813002/diff/40001/src/utils/SkDeferredCanv... src/utils/SkDeferredCanvas.cpp:636: if (isFreshFrame()) { maybe I should implement a new method named isNewFrame(), and update fNewFrame only in SkDeferredDevice::flush. Don't update it in SkDeferredDevice::skipPendingCommands.
reed@google.com changed reviewers: + reed@google.com
https://codereview.chromium.org/545813002/diff/40001/include/utils/SkDeferred... File include/utils/SkDeferredCanvas.h (right): https://codereview.chromium.org/545813002/diff/40001/include/utils/SkDeferred... include/utils/SkDeferredCanvas.h:252: SkISize canvasSize; nit: fCanvasSize https://codereview.chromium.org/545813002/diff/40001/src/utils/SkDeferredCanv... File src/utils/SkDeferredCanvas.cpp (right): https://codereview.chromium.org/545813002/diff/40001/src/utils/SkDeferredCanv... src/utils/SkDeferredCanvas.cpp:636: if (isFreshFrame()) { nit: this->isFreshFrame()
Thank you for finding this hotspot. This is great. https://codereview.chromium.org/545813002/diff/40001/src/utils/SkDeferredCanv... File src/utils/SkDeferredCanvas.cpp (right): https://codereview.chromium.org/545813002/diff/40001/src/utils/SkDeferredCanv... src/utils/SkDeferredCanvas.cpp:636: if (isFreshFrame()) { isFreshFrame is not the right condition for refreshing the cached size. In general, caches should be invalidated by actions that affect the value that is cached. The only calls that could affect the device size are: setSurface(), saveLayer, and a restore call that is associated with a saveLayer. These actions are completely independent of flushing.
On 2014/09/05 13:54:50, junov wrote: > Thank you for finding this hotspot. This is great. After poking around the code a bit, it's not obvious to me *why* it's a hotspot. There are a couple of virtual method calls and an ISize constructor? SkCanvas::getDeviceSize() redirects to SkCanvas::getBaseLayerSize(), which does not appear to be overriden. That calls SkBaseDevice::getDevice(), then width() and height() on the device. If we have to get the device off the MCRec stack it's with front(), which is constant-time. The SkDeferredDevice doesn't seem to be involved?
On 2014/09/05 14:31:48, tomhudson wrote: > On 2014/09/05 13:54:50, junov wrote: > > Thank you for finding this hotspot. This is great. > > After poking around the code a bit, it's not obvious to me *why* it's a hotspot. > There are a couple of virtual method calls and an ISize constructor? > > SkCanvas::getDeviceSize() redirects to SkCanvas::getBaseLayerSize(), which does > not appear to be overriden. > That calls SkBaseDevice::getDevice(), then width() and height() on the device. > If we have to get the device off the MCRec stack it's with front(), which is > constant-time. > The SkDeferredDevice doesn't seem to be involved? Good point. @yunchao: Does your profiling data identify any specific instructions that are critically slow inside getDeviceSize()? Perhaps there is something specific we can address to avoid CPU pipeline stalls, cache misses, or something like that.
On 2014/09/05 14:58:39, junov wrote: > On 2014/09/05 14:31:48, tomhudson wrote: > > On 2014/09/05 13:54:50, junov wrote: > > > Thank you for finding this hotspot. This is great. > > > > After poking around the code a bit, it's not obvious to me *why* it's a > hotspot. > > There are a couple of virtual method calls and an ISize constructor? > > > > SkCanvas::getDeviceSize() redirects to SkCanvas::getBaseLayerSize(), which > does > > not appear to be overriden. > > That calls SkBaseDevice::getDevice(), then width() and height() on the device. > > If we have to get the device off the MCRec stack it's with front(), which is > > constant-time. > > The SkDeferredDevice doesn't seem to be involved? > > Good point. > @yunchao: Does your profiling data identify any specific instructions that are > critically slow inside getDeviceSize()? Perhaps there is something specific we > can address to avoid CPU pipeline stalls, cache misses, or something like that. Tom and Justin, the call stack of this hotspot: SkCanvas::getDeviceSize -> SkCanvas::getBaselayerSize -> SkCanvas::getDevice -> SkBaseDevice::width -> SkDeferredDevice::imageInfo -> SkGpuDevice::imageInfo -> GrSurface::info. I think the virtual functions (SkDeferredDevice::imangeInfo and SkGpuDevice::imageInfo) are the main contributors. Dynamically binding to virtual functions over and over again is nontrivial.
On 2014/09/09 08:32:42, yunchao wrote: > On 2014/09/05 14:58:39, junov wrote: > > On 2014/09/05 14:31:48, tomhudson wrote: > > > On 2014/09/05 13:54:50, junov wrote: > > > > Thank you for finding this hotspot. This is great. > > > > > > After poking around the code a bit, it's not obvious to me *why* it's a > > hotspot. > > > There are a couple of virtual method calls and an ISize constructor? > > > > > > SkCanvas::getDeviceSize() redirects to SkCanvas::getBaseLayerSize(), which > > does > > > not appear to be overriden. > > > That calls SkBaseDevice::getDevice(), then width() and height() on the > device. > > > If we have to get the device off the MCRec stack it's with front(), which is > > > constant-time. > > > The SkDeferredDevice doesn't seem to be involved? > > > > Good point. > > @yunchao: Does your profiling data identify any specific instructions that are > > critically slow inside getDeviceSize()? Perhaps there is something specific we > > can address to avoid CPU pipeline stalls, cache misses, or something like > that. > > Tom and Justin, the call stack of this hotspot: > SkCanvas::getDeviceSize -> SkCanvas::getBaselayerSize -> SkCanvas::getDevice -> > SkBaseDevice::width -> SkDeferredDevice::imageInfo -> SkGpuDevice::imageInfo -> > GrSurface::info. > I think the virtual functions (SkDeferredDevice::imangeInfo and > SkGpuDevice::imageInfo) are the main contributors. Dynamically binding to > virtual functions over and over again is nontrivial. The profile data shows that the hotspot is caused by 1) dynamic binding to virtual functions, 2) deep call stack and 3) GrSurface::info itself.
Thanks for your reviews. I updated the code accordingly. PTAL. https://codereview.chromium.org/545813002/diff/40001/include/utils/SkDeferred... File include/utils/SkDeferredCanvas.h (right): https://codereview.chromium.org/545813002/diff/40001/include/utils/SkDeferred... include/utils/SkDeferredCanvas.h:252: SkISize canvasSize; On 2014/09/05 13:18:33, reed1 wrote: > nit: fCanvasSize Done. https://codereview.chromium.org/545813002/diff/40001/src/utils/SkDeferredCanv... File src/utils/SkDeferredCanvas.cpp (right): https://codereview.chromium.org/545813002/diff/40001/src/utils/SkDeferredCanv... src/utils/SkDeferredCanvas.cpp:636: if (isFreshFrame()) { On 2014/09/05 13:18:33, reed1 wrote: > nit: this->isFreshFrame() Done. https://codereview.chromium.org/545813002/diff/40001/src/utils/SkDeferredCanv... src/utils/SkDeferredCanvas.cpp:636: if (isFreshFrame()) { On 2014/09/05 13:54:49, junov wrote: > isFreshFrame is not the right condition for refreshing the cached size. In > general, caches should be invalidated by actions that affect the value that is > cached. The only calls that could affect the device size are: setSurface(), > saveLayer, and a restore call that is associated with a saveLayer. These > actions are completely independent of flushing. Done. Added a new interface SkDeferredCanvas::canvasSizeChanged.
Patchset #4 (id:60001) has been deleted
Can we instead expose a public getCanvasSize() instead? Under the hood, it can still use this dirty-flag to decide when its "cache" of the size is out-of-date. I think that is a smaller API exposure than this sort of bare flag.
On 2014/09/09 13:10:12, reed1 wrote: > Can we instead expose a public getCanvasSize() instead? Under the hood, it can > still use this dirty-flag to decide when its "cache" of the size is out-of-date. > I think that is a smaller API exposure than this sort of bare flag. Also, why not put this fix in SkCanvas::getDeviceSize? The hotspot is not in code that is specific to SkDeferedCanvas, other than the fact that SkDeferredCanvas calls SkCanvas::getDeviceSize a lot. By putting the fix into SkCanvas, more call sites would benefit. reed: what do you think?
On 2014/09/09 13:31:37, junov wrote: > On 2014/09/09 13:10:12, reed1 wrote: > > Can we instead expose a public getCanvasSize() instead? Under the hood, it can > > still use this dirty-flag to decide when its "cache" of the size is > out-of-date. > > I think that is a smaller API exposure than this sort of bare flag. > > Also, why not put this fix in SkCanvas::getDeviceSize? The hotspot is not in > code that is specific to SkDeferedCanvas, other than the fact that > SkDeferredCanvas calls SkCanvas::getDeviceSize a lot. By putting the fix into > SkCanvas, more call sites would benefit. > > reed: what do you think? Hi Justin, I thought of this too. One problem of putting this fix into SkCanvas is that we need another virtual function to check the flag in SkDeferredCanvas, just like the function canvasSizeChanged(). Then dynamic binding will be called over and over again per frame too. Only if SkCanvas drawing API(drawImage, drawRect...) are called frequently per frame, getDeviceSize() will become a bottleneck. This situation exists in Canvas2D cases, and they are not edge cases in Canvas2D. And this fix is design to cover these cases. Other call sites (the are from chromium base layer drawing) rarely called hundreds of SkCanvas drawing API per frame, and they will introduce an overhead of dynamical binding for virtual functions to all cases. So I think put this fix in SkDeferredCanvas may be appropriate.
On 2014/09/10 13:51:01, yunchao wrote: > On 2014/09/09 13:31:37, junov wrote: > > On 2014/09/09 13:10:12, reed1 wrote: > > > Can we instead expose a public getCanvasSize() instead? Under the hood, it > can > > > still use this dirty-flag to decide when its "cache" of the size is > > out-of-date. > > > I think that is a smaller API exposure than this sort of bare flag. > > > > Also, why not put this fix in SkCanvas::getDeviceSize? The hotspot is not in > > code that is specific to SkDeferedCanvas, other than the fact that > > SkDeferredCanvas calls SkCanvas::getDeviceSize a lot. By putting the fix into > > SkCanvas, more call sites would benefit. > > > > reed: what do you think? > > Hi Justin, I thought of this too. One problem of putting this fix into SkCanvas > is that we need another virtual function to check the flag in SkDeferredCanvas, > just like the function canvasSizeChanged(). Then dynamic binding will be called > over and over again per frame too. But the flag and canvasSizeChanged() could all be in SkCanvas.
On 2014/09/10 14:06:15, junov wrote: > On 2014/09/10 13:51:01, yunchao wrote: > > On 2014/09/09 13:31:37, junov wrote: > > > On 2014/09/09 13:10:12, reed1 wrote: > > > > Can we instead expose a public getCanvasSize() instead? Under the hood, it > > can > > > > still use this dirty-flag to decide when its "cache" of the size is > > > out-of-date. > > > > I think that is a smaller API exposure than this sort of bare flag. > > > > > > Also, why not put this fix in SkCanvas::getDeviceSize? The hotspot is not in > > > code that is specific to SkDeferedCanvas, other than the fact that > > > SkDeferredCanvas calls SkCanvas::getDeviceSize a lot. By putting the fix > into > > > SkCanvas, more call sites would benefit. > > > > > > reed: what do you think? > > > > Hi Justin, I thought of this too. One problem of putting this fix into > SkCanvas > > is that we need another virtual function to check the flag in > SkDeferredCanvas, > > just like the function canvasSizeChanged(). Then dynamic binding will be > called > > over and over again per frame too. > > But the flag and canvasSizeChanged() could all be in SkCanvas. Since this has not come up anywhere bug deferredcanvas, I might suggest we try out the change there, to keep it localized (easier to iterate as well). If later on we're sure its a good idea, we can consider pushing it deeper.
On 2014/09/10 14:14:16, reed1 wrote: > Since this has not come up anywhere bug deferredcanvas, I might suggest we try > out the change there, to keep it localized (easier to iterate as well). If later > on we're sure its a good idea, we can consider pushing it deeper. Fair enough. yunchao: can you add a unit test for this in tests/DeferredCanvasTest.cpp Thanks.
On 2014/09/10 14:32:06, junov wrote: > On 2014/09/10 14:14:16, reed1 wrote: > > Since this has not come up anywhere bug deferredcanvas, I might suggest we try > > out the change there, to keep it localized (easier to iterate as well). If > later > > on we're sure its a good idea, we can consider pushing it deeper. > > Fair enough. > > yunchao: can you add a unit test for this in tests/DeferredCanvasTest.cpp > Thanks. Justin, test case is added into tests/DeferredCanvasTest.cpp, PTAL. In addition, I made 2 small revision: 1) To call SkCanvas::getBaseLayerSize instead of SkCanvas::getDeviceSize to update canvas size. The latter is marked as deprecated in SkCanvas.h 2) don't assign true to fCanvasSizeChanged in SkDeferredCanvas::willSaveLayer and corresponding willRestore. saveLayer may change the drawable area size, just like clip. It doesn't change the canvas size (the root device size). What do you think about this?
another thought: if we mark the 2 new fields mutable, we can make the public getter const, which might be clearer. https://codereview.chromium.org/545813002/diff/120001/src/utils/SkDeferredCan... File src/utils/SkDeferredCanvas.cpp (right): https://codereview.chromium.org/545813002/diff/120001/src/utils/SkDeferredCan... src/utils/SkDeferredCanvas.cpp:529: fCanvasSizeChanged = true; nit: often in skia we name this sort of boolean "Dirty" -- meaning the cache value field it refers to is dirty and needs to be recomputed. https://codereview.chromium.org/545813002/diff/120001/src/utils/SkDeferredCan... src/utils/SkDeferredCanvas.cpp:530: fCanvasSize.setEmpty(); nit: may be clear to name this fCachedCanvasSize, so we don't ever think it could be different from the computed size.
On 2014/09/11 11:14:47, yunchao wrote: > > In addition, I made 2 small revision: > 1) To call SkCanvas::getBaseLayerSize instead of SkCanvas::getDeviceSize to > update canvas size. The latter is marked as deprecated in SkCanvas.h I agree with that. These are not equivalent, but I think querying the base layer is actually the right thing to do here, but it does not really matter in the current implementation because > 2) don't assign true to fCanvasSizeChanged in SkDeferredCanvas::willSaveLayer > and corresponding willRestore. saveLayer may change the drawable area size, just > like clip. It doesn't change the canvas size (the root device size). What do you > think about this? Agreed, especially since you are now querying the base layer size.
Thanks for your review, reed. I have updated the code accordingly. PTAL. Thanks a lot. https://codereview.chromium.org/545813002/diff/120001/src/utils/SkDeferredCan... File src/utils/SkDeferredCanvas.cpp (right): https://codereview.chromium.org/545813002/diff/120001/src/utils/SkDeferredCan... src/utils/SkDeferredCanvas.cpp:529: fCanvasSizeChanged = true; On 2014/09/11 15:38:13, reed1 wrote: > nit: often in skia we name this sort of boolean "Dirty" -- meaning the cache > value field it refers to is dirty and needs to be recomputed. Done. https://codereview.chromium.org/545813002/diff/120001/src/utils/SkDeferredCan... src/utils/SkDeferredCanvas.cpp:530: fCanvasSize.setEmpty(); On 2014/09/11 15:38:13, reed1 wrote: > nit: may be clear to name this fCachedCanvasSize, so we don't ever think it > could be different from the computed size. Done.
On 2014/09/12 10:13:28, yunchao wrote: > Thanks for your review, reed. I have updated the code accordingly. PTAL. Thanks > a lot. > > https://codereview.chromium.org/545813002/diff/120001/src/utils/SkDeferredCan... > File src/utils/SkDeferredCanvas.cpp (right): > > https://codereview.chromium.org/545813002/diff/120001/src/utils/SkDeferredCan... > src/utils/SkDeferredCanvas.cpp:529: fCanvasSizeChanged = true; > On 2014/09/11 15:38:13, reed1 wrote: > > nit: often in skia we name this sort of boolean "Dirty" -- meaning the cache > > value field it refers to is dirty and needs to be recomputed. > > Done. > > https://codereview.chromium.org/545813002/diff/120001/src/utils/SkDeferredCan... > src/utils/SkDeferredCanvas.cpp:530: fCanvasSize.setEmpty(); > On 2014/09/11 15:38:13, reed1 wrote: > > nit: may be clear to name this fCachedCanvasSize, so we don't ever think it > > could be different from the computed size. > > Done. lgtm
change lgtm Question : can "setSurface" really change the size of the backing behind the canvas? How does the canvas update its existing clips? This is explicitly not a feature of SkCanvas (changing the backend).
On 2014/09/12 16:06:36, reed1 wrote: > change lgtm > > Question : can "setSurface" really change the size of the backing behind the > canvas? How does the canvas update its existing clips? This is explicitly not a > feature of SkCanvas (changing the backend). It could in theory, but Blink does not use it that way. If the canvas size changes, the SkDeferredCanvas gets torn down and a new one gets created. This is because of how the ImageBuffer and ExternalLayer architectures in Blink and cc were designed.
The CQ bit was checked by yunchao.he@intel.com
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patchset/545813002/140001
Message was sent while issue was closed.
Committed patchset #7 (id:140001) as 8e45c3777d886ba3fe239bb549d06b0693692152
Message was sent while issue was closed.
A revert of this CL (patchset #7 id:140001) has been created in https://codereview.chromium.org/571053002/ by mtklein@google.com. The reason for reverting is: This is leaking memory: http://108.170.220.120:10117/builders/Test-Ubuntu13.10-GCE-NoGPU-x86_64-Debug....
Message was sent while issue was closed.
On 2014/09/15 13:00:04, mtklein wrote: > The reason for reverting is: This is leaking memory: It appears that the problem isn't the runtime code, just the unit test.
On 2014/09/15 13:05:38, tomhudson wrote: > On 2014/09/15 13:00:04, mtklein wrote: > > The reason for reverting is: This is leaking memory: > > It appears that the problem isn't the runtime code, just the unit test. Yeah. The runtime code is quite easy. The memory leak happens in unit test. One SkSurface allocated in unit test is not released. Thanks for your information, Tom. I would like to re-land it because it is just a small fix in unit test.
The CQ bit was checked by yunchao.he@intel.com
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patchset/545813002/160001
Message was sent while issue was closed.
Committed patchset #8 (id:160001) as 49005bf8929dd8ca86431e13414d683b509cd538 |