|
|
DescriptionImplement SkImageFilter::Cache with SkResourceCache.
The single global cache now uses the global SkResourceCache,
and any Create()ed cache uses a local SkResourceCache.
No real public API changes (and only deletes).
I don't see any pixel diffs on .skps or GMs.
Don't see any significant perf difference on 8888 or gpu configs.
DM peak memory usage did drop by about 113M, close to the 128M cache size.
BUG=skia:3662
Landing PS8 temporarily to trigger the perf bots.
TBR=reed@google.com
Committed: https://skia.googlesource.com/skia/+/75135d8ae1aa12e8e6bfce63291e5e876a77546f
Perf impact of PS8 looks pretty darn neutral:
https://perf.skia.org/kernel/?query=bench_type%3Dplayback%26source_type%3Dskp%26sub_result%3Dmin_ms&commit1=242397a1c95e34e3e2e5e85c85c81090317115cb&commit2=75135d8ae1aa12e8e6bfce63291e5e876a77546f
https://perf.skia.org/#4252
Patch Set 1 #Patch Set 2 : no friend #Patch Set 3 : tweaks #Patch Set 4 : remove purge #Patch Set 5 : include SkPixelRef #Patch Set 6 : Copy if allocator and getDiscardable() dont agree #Patch Set 7 : force the copy to try it on the perf bots #Patch Set 8 : alternative force #Patch Set 9 : restore conditional #Patch Set 10 : rearrange, comment #Patch Set 11 : build #
Total comments: 3
Messages
Total messages: 41 (13 generated)
The CQ bit was checked by mtklein@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1370323002/40001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1370323002/40001
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-Shared-Trybot on client.skia (JOB_FAILED, http://build.chromium.org/p/client.skia/builders/Test-Ubuntu-GCC-GCE-CPU-AVX2...)
The CQ bit was checked by mtklein@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1370323002/80001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1370323002/80001
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
mtklein@chromium.org changed reviewers: + reed@google.com, robertphillips@google.com
Question : where/how does the bitmap get Allocated? To switch over to SkResourceCache, they all need to be allocated via the RC's allocator... so the pixels will be allocated via DiscardableMemory...
On 2015/09/29 17:30:55, reed1 wrote: > Question : where/how does the bitmap get Allocated? To switch over to > SkResourceCache, they all need to be allocated via the RC's allocator... so the > pixels will be allocated via DiscardableMemory... Today the image filter subclasses handle the allocation in onFilterImage(). So options here are: 1) thread the cache or an Allocator through onFilterImage(), make sure we use it everywhere; 2) when putting things into the cache, deep copy if the allocators disagree. 1) is probably best, but 2) is easy to write and get right. So I'm going to mock up 2) here first, and see how much we suffer.
The CQ bit was checked by mtklein@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1370323002/120001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1370323002/120001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: Build-Win-MSVC-x86-Debug-Trybot on client.skia.compile (JOB_FAILED, http://build.chromium.org/p/client.skia.compile/builders/Build-Win-MSVC-x86-D...) Build-Win-MSVC-x86_64-Debug-Trybot on client.skia.compile (JOB_FAILED, http://build.chromium.org/p/client.skia.compile/builders/Build-Win-MSVC-x86_6...)
The CQ bit was checked by mtklein@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1370323002/140001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1370323002/140001
Message was sent while issue was closed.
Committed patchset #8 (id:140001) as https://skia.googlesource.com/skia/+/75135d8ae1aa12e8e6bfce63291e5e876a77546f
Message was sent while issue was closed.
A revert of this CL (patchset #8 id:140001) has been created in https://codereview.chromium.org/1381523002/ by mtklein@chromium.org. The reason for reverting is: Landed PS8 temporarily to trigger the perf bots. .
The CQ bit was checked by mtklein@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1370323002/180001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1370323002/180001
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: Build-Ubuntu-GCC-x86_64-Release-Trybot on client.skia.compile (JOB_FAILED, http://build.chromium.org/p/client.skia.compile/builders/Build-Ubuntu-GCC-x86...)
The CQ bit was checked by mtklein@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1370323002/200001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1370323002/200001
PTAL Looks like the hack of just copying as needed is working out fine.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
senorblanco@chromium.org changed reviewers: + senorblanco@chromium.org
I love this approach, but I think we need to try to get rid of that copy. I've added a comment with a rough idea of how it might be done. One good HTML test for caching is http://letmespellitoutforyou.com/samples/svg/filter_terrain.svg. You can run it locally in Chrome with --show-fps-counter. This is also checked into Telemetry here: https://chromeperf.appspot.com/report?sid=650a3cc1b51aad218887575e5bf77735b39... https://codereview.chromium.org/1370323002/diff/200001/src/core/SkImageFilter... File src/core/SkImageFilter.cpp (right): https://codereview.chromium.org/1370323002/diff/200001/src/core/SkImageFilter... src/core/SkImageFilter.cpp:537: // Image filters allocate their own result bitmaps. How much cleaner/faster would this code become if we changed this? I.e., if we changed image filters to use a callback for allocation, or something like that? If it helps, a central pinch-point would be SkImageFilter::DeviceProxy::createDevice(). That should handle almost all filters, I think. https://codereview.chromium.org/1370323002/diff/200001/src/core/SkImageFilter... src/core/SkImageFilter.cpp:540: // The expected case in Chrome is: rcIsDiscardable == true, bmIsDiscardable == false. So you're saying the expected case in Chrome is that we're going to have to copy? This seems unfortunate, especially for caches that are written once and never used (e.g., a large DAG with mostly-animated results). https://codereview.chromium.org/1370323002/diff/200001/src/core/SkImageFilter... src/core/SkImageFilter.cpp:583: SkImageFilter::Cache* SkImageFilter::Cache::Create(size_t maxBytes) { Rename this to CreateLocal()?
Stephen, mike was trying this out because the current situation is pretty bad (given the size of the cache and that it is not discardable). I think I understand the direction you're suggesting, about plumbing the allocator. Perhaps you could sketch an alternate CL, given your understanding of how/where devices/bitmaps are allocated today.
On 2015/10/13 18:55:13, reed1 wrote: > Stephen, mike was trying this out because the current situation is pretty bad > (given the size of the cache and that it is not discardable). > > I think I understand the direction you're suggesting, about plumbing the > allocator. Perhaps you could sketch an alternate CL, given your understanding of > how/where devices/bitmaps are allocated today. I also agree that threading an allocator through is the best design, and the way I'd write it from scratch today. I stopped short because I wasn't willing to invest the effort to do that. And, that land-and-revert of PS 8 makes it look like the copy may not be a problem (PS 8 forced the copy, just like it'd happen in Chrome). Think running that patch through CT would be a good way of answering that for Chrome? If it's no big deal, it might be nice to land first to get this memory accounted for, then finesse the copy later. I'd even go as far as landing it with an unconditional copy, to both simplify the code and make sure everyone's testing the same paths for correctness and performance. While you're thinking, I'll get going on building Chrome with this patch to see how filter_terrain.svg fares.
On 2015/10/14 15:19:55, mtklein wrote: > On 2015/10/13 18:55:13, reed1 wrote: > > Stephen, mike was trying this out because the current situation is pretty bad > > (given the size of the cache and that it is not discardable). > > > > I think I understand the direction you're suggesting, about plumbing the > > allocator. Perhaps you could sketch an alternate CL, given your understanding > of > > how/where devices/bitmaps are allocated today. Off the top of my head, I think have SkRasterDevice create discardable memory when it sees the "fForImageFilter" in the CreateInfo might work, assuming devices have access to the discardable memory pool. That wouldn't require any API changes, I think. Another option would be to have DeviceProxy::createDevice() allocate a discardable bitmap, and wrap a device around that (if devices can be wrapped around bitmaps still?). That'd be less intrusive, but it wouldn't work for the GPU side. Maybe a createDiscardableDevice() on SkDevice? I dunno. > I also agree that threading an allocator through is the best design, and the way > I'd write it from scratch today. I stopped short because I wasn't willing to > invest the effort to do that. And, that land-and-revert of PS 8 makes it look > like the copy may not be a problem (PS 8 forced the copy, just like it'd happen > in Chrome). Think running that patch through CT would be a good way of > answering that for Chrome? If it's no big deal, it might be nice to land first > to get this memory accounted for, then finesse the copy later. I'd even go as > far as landing it with an unconditional copy, to both simplify the code and make > sure everyone's testing the same paths for correctness and performance. The other place this copy is going to hit us is Canvas, which in the display list canvas mode uses an image filter when it needs to resize the canvas for zooming, etc. It shouldn't be caching at all, but right now it is, which means with this change we'll be doing a copy of the canvas backing store on every frame, AFAICT. > While you're thinking, I'll get going on building Chrome with this patch to see > how filter_terrain.svg fares. Great, thanks. Note that that one (and image_filter_dag bench in Skia) test whether caching is functional (bound on a heavy blur which is done 5 times or 1 time if it's working), but may not be affected by the copy, since they're dominated by the blur. We have a paucity of image filter benches, sadly.
> Great, thanks. Note that that one (and image_filter_dag bench in Skia) test > whether caching is functional (bound on a heavy blur which is done 5 times or > 1 time if it's working), but may not be affected by the copy, since they're > dominated by the blur. We have a paucity of image filter benches, sadly. Yeah, things being dominated by the image filter itself is exactly what I'm hoping allows us to just copy. As far as I can tell I'm getting the same framerate on that page with my stock M47 and with a two-weeks-ago tree with this CL applied. (I added a trace event locally just to make sure we're really copying. We are.) Looks like image_filter_dag takes a 2% hit. Do we have a test case / bench for that zooming Canvas?
On 2015/10/14 16:28:04, mtklein wrote: > > Great, thanks. Note that that one (and image_filter_dag bench in Skia) test > > whether caching is functional (bound on a heavy blur which is done 5 times or > > 1 time if it's working), but may not be affected by the copy, since they're > > dominated by the blur. We have a paucity of image filter benches, sadly. > > Yeah, things being dominated by the image filter itself is exactly what I'm > hoping allows us to just copy. Definitely true for these cases, where the filters are slow. Unfortunately, not always true. > As far as I can tell I'm getting the same framerate on that page with my stock > M47 and with a two-weeks-ago tree with this CL applied. (I added a trace event > locally just to make sure we're really copying. We are.) Looks like > image_filter_dag takes a 2% hit. Good; that verifies that the caching is working at least. > Do we have a test case / bench for that zooming Canvas? Sadly no. I have a Skia bench I'm working on for other purposes that should be a lot less dominated by the filter time, though. Let me upload that and cc you.
Agree that https://codereview.chromium.org/1401173006/ shows that the copy is significant compared to trivial image filters.
On 2015/10/14 16:10:32, Stephen White wrote: > On 2015/10/14 15:19:55, mtklein wrote: > > On 2015/10/13 18:55:13, reed1 wrote: > > > Stephen, mike was trying this out because the current situation is pretty > bad > > > (given the size of the cache and that it is not discardable). > > > > > > I think I understand the direction you're suggesting, about plumbing the > > > allocator. Perhaps you could sketch an alternate CL, given your > understanding > > of > > > how/where devices/bitmaps are allocated today. > > Off the top of my head, I think have SkRasterDevice create discardable memory > when it sees the "fForImageFilter" in the CreateInfo might work, assuming > devices > have access to the discardable memory pool. That wouldn't require any API > changes, > I think. > > Another option would be to have DeviceProxy::createDevice() allocate a > discardable > bitmap, and wrap a device around that (if devices can be wrapped around bitmaps > still?). > That'd be less intrusive, but it wouldn't work for the GPU side. Maybe a > createDiscardableDevice() > on SkDevice? I dunno. I've realized there's a problem with both of these approaches: they require that all image filters go through the proxy->createDevice() path to allocate bitmaps. The GPU and generic Skia filters already do, but filters with custom pixel loops often use tryAllocPixels() directly on the destination SkBitmap. I've put up a patch here to switch them over: https://codereview.chromium.org/1414843003 > > I also agree that threading an allocator through is the best design, and the > way > > I'd write it from scratch today. I stopped short because I wasn't willing to > > invest the effort to do that. And, that land-and-revert of PS 8 makes it look > > like the copy may not be a problem (PS 8 forced the copy, just like it'd > happen > > in Chrome). Think running that patch through CT would be a good way of > > answering that for Chrome? If it's no big deal, it might be nice to land > first > > to get this memory accounted for, then finesse the copy later. I'd even go as > > far as landing it with an unconditional copy, to both simplify the code and > make > > sure everyone's testing the same paths for correctness and performance. > > The other place this copy is going to hit us is Canvas, which in the display > list canvas > mode uses an image filter when it needs to resize the canvas for zooming, etc. > It shouldn't be caching at all, but right now it is, which means with this > change we'll > be doing a copy of the canvas backing store on every frame, AFAICT. > > > While you're thinking, I'll get going on building Chrome with this patch to > see > > how filter_terrain.svg fares. > > Great, thanks. Note that that one (and image_filter_dag bench in Skia) test > whether caching is functional (bound on a heavy blur which is done 5 times or > 1 time if it's working), but may not be affected by the copy, since they're > dominated by the blur. We have a paucity of image filter benches, sadly.
On 2015/10/20 16:08:13, Stephen White wrote: > On 2015/10/14 16:10:32, Stephen White wrote: > > On 2015/10/14 15:19:55, mtklein wrote: > > > On 2015/10/13 18:55:13, reed1 wrote: > > > > Stephen, mike was trying this out because the current situation is pretty > > bad > > > > (given the size of the cache and that it is not discardable). > > > > > > > > I think I understand the direction you're suggesting, about plumbing the > > > > allocator. Perhaps you could sketch an alternate CL, given your > > understanding > > > of > > > > how/where devices/bitmaps are allocated today. > > > > Off the top of my head, I think have SkRasterDevice create discardable memory > > when it sees the "fForImageFilter" in the CreateInfo might work, assuming > > devices > > have access to the discardable memory pool. That wouldn't require any API > > changes, > > I think. > > > > Another option would be to have DeviceProxy::createDevice() allocate a > > discardable > > bitmap, and wrap a device around that (if devices can be wrapped around > bitmaps > > still?). > > That'd be less intrusive, but it wouldn't work for the GPU side. Maybe a > > createDiscardableDevice() > > on SkDevice? I dunno. > > I've realized there's a problem with both of these approaches: they require that > all image filters go through the proxy->createDevice() path to allocate bitmaps. > The GPU > and generic Skia filters already do, but filters with custom pixel loops often > use > tryAllocPixels() directly on the destination SkBitmap. > > I've put up a patch here to switch them over: > https://codereview.chromium.org/1414843003 > > > > I also agree that threading an allocator through is the best design, and the > > way > > > I'd write it from scratch today. I stopped short because I wasn't willing > to > > > invest the effort to do that. And, that land-and-revert of PS 8 makes it > look > > > like the copy may not be a problem (PS 8 forced the copy, just like it'd > > happen > > > in Chrome). Think running that patch through CT would be a good way of > > > answering that for Chrome? If it's no big deal, it might be nice to land > > first > > > to get this memory accounted for, then finesse the copy later. I'd even go > as > > > far as landing it with an unconditional copy, to both simplify the code and > > make > > > sure everyone's testing the same paths for correctness and performance. > > > > The other place this copy is going to hit us is Canvas, which in the display > > list canvas > > mode uses an image filter when it needs to resize the canvas for zooming, etc. > > It shouldn't be caching at all, but right now it is, which means with this > > change we'll > > be doing a copy of the canvas backing store on every frame, AFAICT. > > > > > While you're thinking, I'll get going on building Chrome with this patch to > > see > > > how filter_terrain.svg fares. > > > > Great, thanks. Note that that one (and image_filter_dag bench in Skia) test > > whether caching is functional (bound on a heavy blur which is done 5 times or > > 1 time if it's working), but may not be affected by the copy, since they're > > dominated by the blur. We have a paucity of image filter benches, sadly. Just out of curiosity, does this look neutral to you? https://doc-0jeti-135dm-s-googleusercontent.commondatastorage.googleapis.com/...
On 2015/10/20 16:21:42, mtklein wrote: > On 2015/10/20 16:08:13, Stephen White wrote: > > On 2015/10/14 16:10:32, Stephen White wrote: > > > On 2015/10/14 15:19:55, mtklein wrote: > > > > On 2015/10/13 18:55:13, reed1 wrote: > > > > > Stephen, mike was trying this out because the current situation is > pretty > > > bad > > > > > (given the size of the cache and that it is not discardable). > > > > > > > > > > I think I understand the direction you're suggesting, about plumbing the > > > > > allocator. Perhaps you could sketch an alternate CL, given your > > > understanding > > > > of > > > > > how/where devices/bitmaps are allocated today. > > > > > > Off the top of my head, I think have SkRasterDevice create discardable > memory > > > when it sees the "fForImageFilter" in the CreateInfo might work, assuming > > > devices > > > have access to the discardable memory pool. That wouldn't require any API > > > changes, > > > I think. > > > > > > Another option would be to have DeviceProxy::createDevice() allocate a > > > discardable > > > bitmap, and wrap a device around that (if devices can be wrapped around > > bitmaps > > > still?). > > > That'd be less intrusive, but it wouldn't work for the GPU side. Maybe a > > > createDiscardableDevice() > > > on SkDevice? I dunno. > > > > I've realized there's a problem with both of these approaches: they require > that > > all image filters go through the proxy->createDevice() path to allocate > bitmaps. > > The GPU > > and generic Skia filters already do, but filters with custom pixel loops often > > use > > tryAllocPixels() directly on the destination SkBitmap. > > > > I've put up a patch here to switch them over: > > https://codereview.chromium.org/1414843003 > > > > > > I also agree that threading an allocator through is the best design, and > the > > > way > > > > I'd write it from scratch today. I stopped short because I wasn't willing > > to > > > > invest the effort to do that. And, that land-and-revert of PS 8 makes it > > look > > > > like the copy may not be a problem (PS 8 forced the copy, just like it'd > > > happen > > > > in Chrome). Think running that patch through CT would be a good way of > > > > answering that for Chrome? If it's no big deal, it might be nice to land > > > first > > > > to get this memory accounted for, then finesse the copy later. I'd even > go > > as > > > > far as landing it with an unconditional copy, to both simplify the code > and > > > make > > > > sure everyone's testing the same paths for correctness and performance. > > > > > > The other place this copy is going to hit us is Canvas, which in the display > > > list canvas > > > mode uses an image filter when it needs to resize the canvas for zooming, > etc. > > > It shouldn't be caching at all, but right now it is, which means with this > > > change we'll > > > be doing a copy of the canvas backing store on every frame, AFAICT. > > > > > > > While you're thinking, I'll get going on building Chrome with this patch > to > > > see > > > > how filter_terrain.svg fares. > > > > > > Great, thanks. Note that that one (and image_filter_dag bench in Skia) test > > > whether caching is functional (bound on a heavy blur which is done 5 times > or > > > 1 time if it's working), but may not be affected by the copy, since they're > > > dominated by the blur. We have a paucity of image filter benches, sadly. > > Just out of curiosity, does this look neutral to you? > https://doc-0jeti-135dm-s-googleusercontent.commondatastorage.googleapis.com/... I don't have access to that link.
On 2015/10/20 19:46:34, Stephen White wrote: > On 2015/10/20 16:21:42, mtklein wrote: > > On 2015/10/20 16:08:13, Stephen White wrote: > > > On 2015/10/14 16:10:32, Stephen White wrote: > > > > On 2015/10/14 15:19:55, mtklein wrote: > > > > > On 2015/10/13 18:55:13, reed1 wrote: > > > > > > Stephen, mike was trying this out because the current situation is > > pretty > > > > bad > > > > > > (given the size of the cache and that it is not discardable). > > > > > > > > > > > > I think I understand the direction you're suggesting, about plumbing > the > > > > > > allocator. Perhaps you could sketch an alternate CL, given your > > > > understanding > > > > > of > > > > > > how/where devices/bitmaps are allocated today. > > > > > > > > Off the top of my head, I think have SkRasterDevice create discardable > > memory > > > > when it sees the "fForImageFilter" in the CreateInfo might work, assuming > > > > devices > > > > have access to the discardable memory pool. That wouldn't require any API > > > > changes, > > > > I think. > > > > > > > > Another option would be to have DeviceProxy::createDevice() allocate a > > > > discardable > > > > bitmap, and wrap a device around that (if devices can be wrapped around > > > bitmaps > > > > still?). > > > > That'd be less intrusive, but it wouldn't work for the GPU side. Maybe a > > > > createDiscardableDevice() > > > > on SkDevice? I dunno. > > > > > > I've realized there's a problem with both of these approaches: they require > > that > > > all image filters go through the proxy->createDevice() path to allocate > > bitmaps. > > > The GPU > > > and generic Skia filters already do, but filters with custom pixel loops > often > > > use > > > tryAllocPixels() directly on the destination SkBitmap. > > > > > > I've put up a patch here to switch them over: > > > https://codereview.chromium.org/1414843003 > > > > > > > > I also agree that threading an allocator through is the best design, and > > the > > > > way > > > > > I'd write it from scratch today. I stopped short because I wasn't > willing > > > to > > > > > invest the effort to do that. And, that land-and-revert of PS 8 makes > it > > > look > > > > > like the copy may not be a problem (PS 8 forced the copy, just like it'd > > > > happen > > > > > in Chrome). Think running that patch through CT would be a good way of > > > > > answering that for Chrome? If it's no big deal, it might be nice to > land > > > > first > > > > > to get this memory accounted for, then finesse the copy later. I'd even > > go > > > as > > > > > far as landing it with an unconditional copy, to both simplify the code > > and > > > > make > > > > > sure everyone's testing the same paths for correctness and performance. > > > > > > > > The other place this copy is going to hit us is Canvas, which in the > display > > > > list canvas > > > > mode uses an image filter when it needs to resize the canvas for zooming, > > etc. > > > > It shouldn't be caching at all, but right now it is, which means with this > > > > change we'll > > > > be doing a copy of the canvas backing store on every frame, AFAICT. > > > > > > > > > While you're thinking, I'll get going on building Chrome with this patch > > to > > > > see > > > > > how filter_terrain.svg fares. > > > > > > > > Great, thanks. Note that that one (and image_filter_dag bench in Skia) > test > > > > whether caching is functional (bound on a heavy blur which is done 5 times > > or > > > > 1 time if it's working), but may not be affected by the copy, since > they're > > > > dominated by the blur. We have a paucity of image filter benches, sadly. > > > > Just out of curiosity, does this look neutral to you? > > > https://doc-0jeti-135dm-s-googleusercontent.commondatastorage.googleapis.com/... > > I don't have access to that link. Weird. Can you get to it from here http://build.chromium.org/p/client.skia.fyi/builders/CT-Perf-10k-Linux-RR-Par... ? (CT Perf Results) |