|
|
Created:
6 years, 4 months ago by Tom Hudson Modified:
6 years, 1 month ago CC:
enne_, reviews_skia.org, schenney, vangelis, digit Base URL:
https://skia.googlesource.com/skia.git@master Project:
skia Visibility:
Public. |
DescriptionImplement SkPicture::bytesUsed() for SkRecord backend
BUG=chromium:230419
R=mtklein@google.com,reed@google.com
Committed: https://skia.googlesource.com/skia/+/158fcaa031d105dc999d9813fee8927db56a871c
Patch Set 1 #
Total comments: 1
Patch Set 2 : Move to SkPictureUtils, start unit test #
Total comments: 5
Patch Set 3 : Straightforward rebase / address Mike's comments #Patch Set 4 : Unit test prevents bloat #
Total comments: 6
Patch Set 5 : Respond to review, add deletion listener tracking #
Total comments: 2
Patch Set 6 : Fix SkTileGrid computation #Patch Set 7 : Use Mike's SkVarAlloc changes to dynamically compute size #Patch Set 8 : Fix minor edit error in SkTileGrid::bytesUsed #
Total comments: 5
Patch Set 9 : Tweak SkPictureUtils doc comment #Patch Set 10 : SkRecord visitor recursively measures SkPictures #
Total comments: 7
Patch Set 11 : Streamline visitor #Patch Set 12 : Tweak expected upper bound of data structure size #Patch Set 13 : NOT FOR COMMIT debug logs #Patch Set 14 : NOT FOR COMMIT more logging #Patch Set 15 : More tweaks for object size #
Messages
Total messages: 65 (7 generated)
Support a year+-old feature request from Telemetry/Svelte once we shift to SkRecord. reed@: one new function on SkPicture public API to report the bytes used; returns 0 for the non-SkRecord implementation.
impl LGTM https://codereview.chromium.org/490253003/diff/1/include/core/SkPicture.h File include/core/SkPicture.h (right): https://codereview.chromium.org/490253003/diff/1/include/core/SkPicture.h#new... include/core/SkPicture.h:212: size_t bytesUsed() const; I'd perhaps hedge to approximateBytesUsed(), like ~OpCount() above.
Agree that this is impossible to return exactly. What is the current use-case and importance. What will we accidentally break in the future if our impl makes different approximations down the road?
The use case is that (a year and a quarter ago) we were interested in tracking the size of pictures that were being created, both for benchmarks and in UMA. Nat's the one who initiated the bug, but enne@, reveman@, or vangelis@ might be able to provide more context and confirm whether it's still relevant.
It'd be nice to have benchmark coverage of the sizes. Not intended for decision making. There'd be hookup required chrome side once this was available but its simple. I'm not actively asking for this though for any particular immediately required thing.
In out-of-band discussion, Mike Reed suggested implementing this in SkPictureUtils, but I think that requires declaring a new friend or exposing SkPicture internals.
On 2014/08/25 17:27:18, tomhudson wrote: > In out-of-band discussion, Mike Reed suggested implementing this in > SkPictureUtils, but I think that requires declaring a new friend or exposing > SkPicture internals. Agreed. This fits best in SkPicture itself.
On 2014/08/22 at 14:06:09, tomhudson wrote: > The use case is that (a year and a quarter ago) we were interested in tracking the size of pictures that were being created, both for benchmarks and in UMA. Nat's the one who initiated the bug, but enne@, reveman@, or vangelis@ might be able to provide more context and confirm whether it's still relevant. This isn't that relevant any more. Our worry was during creating impl-side painting that we were recording large SkPictures or keeping alive stale SkPictures and would blow out memory usage. SkPicture size was only implemented behind a compile-time flag at the time so we weren't able to use it and just went on without that knowledge at the time. Since metrics haven't blown up, it's probably not a big deal. It's interesting, but it's low priority. It's not at all as important as bugs like https://code.google.com/p/chromium/issues/detail?id=113309 or https://code.google.com/p/chromium/issues/detail?id=269080.
Moves code to SkPictureUtils at the suggestion of reed@ to avoid putting any more non-core-drawing functions in the SkPicture API. We'll consider similarly moving the entry points of other query/analysis functions, even if the implementation needs to remain part of SkPicture, but not in this CL. Interesting fun fact from the unit test: an empty SkPicture weighs in at 200B. (112 for the SkPicture itself, 88 for the SkRecord, no bounding box hierarchy?)
Slimming Paint wants this to monitor their migration, so we need to dust this off and come to consensus on API.
schenney@chromium.org changed reviewers: + schenney@chromium.org
We now have a pressing use case with the Slimming Paint project tracking the overhead of the display list architecture, and providing a metric to drive down memory overhead compared to the current implementation. Approximate is good enough, provided it's within a few % and stable across runs.
Might want to start fresh. All these implementation details have changed. Do you care that we're not counting bytes held by reference (e.g., paths, bitmap pixels)? Those tend to be rather large, but won't be counted here. Text will though. https://codereview.chromium.org/490253003/diff/20001/src/core/SkQuadTree.h File src/core/SkQuadTree.h (right): https://codereview.chromium.org/490253003/diff/20001/src/core/SkQuadTree.h#ne... src/core/SkQuadTree.h:75: virtual size_t bytesUsed() const SK_OVERRIDE; No more quad tree. https://codereview.chromium.org/490253003/diff/20001/src/core/SkRTree.cpp File src/core/SkRTree.cpp (right): https://codereview.chromium.org/490253003/diff/20001/src/core/SkRTree.cpp#new... src/core/SkRTree.cpp:454: byteCount += fDeferredInserts.reserved() * sizeof(Branch); There probably isn't an fDeferredInserts any more. https://codereview.chromium.org/490253003/diff/20001/src/core/SkRecord.h File src/core/SkRecord.h (right): https://codereview.chromium.org/490253003/diff/20001/src/core/SkRecord.h#newc... src/core/SkRecord.h:113: size_t bytesUsed() const { return fAlloc.totalCapacity() + fAlloc doesn't know how large it is any more, so you'll need to add a new member to SkRecord and update it when calling alloc(). https://codereview.chromium.org/490253003/diff/20001/src/core/SkTileGrid.cpp File src/core/SkTileGrid.cpp (right): https://codereview.chromium.org/490253003/diff/20001/src/core/SkTileGrid.cpp#... src/core/SkTileGrid.cpp:172: byteCount = fTiles[i].reserved() * sizeof(Entry); This is going to need an update too. https://codereview.chromium.org/490253003/diff/20001/src/utils/SkPictureUtils... File src/utils/SkPictureUtils.cpp (right): https://codereview.chromium.org/490253003/diff/20001/src/utils/SkPictureUtils... src/utils/SkPictureUtils.cpp:228: if (!pict->fRecord.get()) { You can assume fRecord now.
On 2014/11/17 20:19:32, mtklein wrote: > Might want to start fresh. All these implementation details have changed. > > Do you care that we're not counting bytes held by reference (e.g., paths, bitmap > pixels)? Those tend to be rather large, but won't be counted here. Text will > though. It's not a problem to leave out referenced objects that we would expect to reference the same regardless of the picture structure. I think all the externally allocated and ref-counted objects fall into this category.
Naive rebase was easy with Mike's instructions. Added a test that the size of an empty SkPicture doesn't grow. It's 176B, and I think we agreed in meeting today that was worth reducing if we're going to go to dense forests of these things. reed@ API review: new function is in SkPictureUtils; SkPicture just needed a friend.
https://codereview.chromium.org/490253003/diff/60001/include/utils/SkPictureU... File include/utils/SkPictureUtils.h (right): https://codereview.chromium.org/490253003/diff/60001/include/utils/SkPictureU... include/utils/SkPictureUtils.h:83: static size_t approximateBytesUsed(const SkPicture* pict); nit: Skia Static Methods Are Capitalized. "Used"? Can the name or dox give a better idea of what is being measured?
chrishtr@chromium.org changed reviewers: + chrishtr@chromium.org
https://codereview.chromium.org/490253003/diff/60001/src/core/SkTileGrid.cpp File src/core/SkTileGrid.cpp (right): https://codereview.chromium.org/490253003/diff/60001/src/core/SkTileGrid.cpp#... src/core/SkTileGrid.cpp:181: for (int i = 0; i < fXTiles * fYTiles; i++) { Will this be slow? Overall, what asymptotic runtime should we expect for the public bytesUsed() method?
https://codereview.chromium.org/490253003/diff/60001/tests/PictureTest.cpp File tests/PictureTest.cpp (right): https://codereview.chromium.org/490253003/diff/60001/tests/PictureTest.cpp#ne... tests/PictureTest.cpp:1730: // A SkPicture also has SkRecord and possibly SkBBHierarchy as members. As written, there will be no BBH. Did you want to add one?
tomhudson@google.com changed reviewers: + tomhudson@google.com
https://codereview.chromium.org/490253003/diff/60001/include/utils/SkPictureU... File include/utils/SkPictureUtils.h (right): https://codereview.chromium.org/490253003/diff/60001/include/utils/SkPictureU... include/utils/SkPictureUtils.h:83: static size_t approximateBytesUsed(const SkPicture* pict); On 2014/11/17 21:32:26, reed1 wrote: > nit: Skia Static Methods Are Capitalized. Done. > "Used"? Can the name or dox give a better idea of what is being measured? /** * How many bytes are allocated to hold the SkPicture. * Includes operations, parameters, bounding data, deletion listeners; * does not include large objects that SkRecord holds a reference to * (e.g. paths, pixels backing bitmaps). */ Happy to rename if we've got a better candidate. https://codereview.chromium.org/490253003/diff/60001/src/core/SkTileGrid.cpp File src/core/SkTileGrid.cpp (right): https://codereview.chromium.org/490253003/diff/60001/src/core/SkTileGrid.cpp#... src/core/SkTileGrid.cpp:181: for (int i = 0; i < fXTiles * fYTiles; i++) { On 2014/11/17 21:34:21, chrishtr wrote: > Will this be slow? Overall, what asymptotic runtime should we expect for the > public bytesUsed() method? If the tile grid just spans a screen on the Nexus 9, it'd be 8x6, so ~50 adds and multiplies, but there should be a buffer above and below the screen. If that's 2kpx each way we're looking at 24x6 -> 150 adds and multiplies? If it's expensive in practice, I suspect it'll be much more reasonable to cache than to try to track as we go, but either ought to be feasible.
https://codereview.chromium.org/490253003/diff/60001/src/core/SkTileGrid.cpp File src/core/SkTileGrid.cpp (right): https://codereview.chromium.org/490253003/diff/60001/src/core/SkTileGrid.cpp#... src/core/SkTileGrid.cpp:182: byteCount = fTiles[i].reserved() * sizeof(unsigned); I think you want += here? While you're at it, can't hurt to move the multiply out of the loop. https://codereview.chromium.org/490253003/diff/80001/src/core/SkRecord.h File src/core/SkRecord.h (right): https://codereview.chromium.org/490253003/diff/80001/src/core/SkRecord.h#newc... src/core/SkRecord.h:241: unsigned fBytesAllocated; Now that I'm thinking about it, it seems counterproductive to add a member here just to track how large it is. More than that, this fBytesAllocated will count how many bytes we asked for, not how many we've allocated to serve that request. Ideally those scale well with each other, but the whole point of this is to assert that, right? How often do you guys figure you'll be calling ApproximateBytesUsed, and do you care if it doesn't work on all platforms? What about Debug vs. Release, etc? SkVarAlloc could implement a byte-precise bytesAllocated() without any extra memory usage at the cost of non-constant run time. Mapping malloc_size / malloc_usable_size / _msize down the chain of fPrev pointers should work, though to date I haven't really liked to depend on non-standard methods like those outside of debugging code.
https://codereview.chromium.org/490253003/diff/80001/src/core/SkRecord.h File src/core/SkRecord.h (right): https://codereview.chromium.org/490253003/diff/80001/src/core/SkRecord.h#newc... src/core/SkRecord.h:241: unsigned fBytesAllocated; On 2014/11/17 at 23:15:33, mtklein wrote: > Now that I'm thinking about it, it seems counterproductive to add a member here just to track how large it is. More than that, this fBytesAllocated will count how many bytes we asked for, not how many we've allocated to serve that request. Ideally those scale well with each other, but the whole point of this is to assert that, right? > > How often do you guys figure you'll be calling ApproximateBytesUsed, and do you care if it doesn't work on all platforms? What about Debug vs. Release, etc? It will only be when we want to collect debug info. But it needs to work in production as well as debug environments, so we can integrate with about:tracing. > > SkVarAlloc could implement a byte-precise bytesAllocated() without any extra memory usage at the cost of non-constant run time. Mapping malloc_size / malloc_usable_size / _msize down the chain of fPrev pointers should work, though to date I haven't really liked to depend on non-standard methods like those outside of debugging code. non-constant runtime is probably ok so long as it isn't too slow. How slow are we talking about? Regarding debugging code: it'll only run when the page is under explicit debug inspection (via the user turning on about:tracing, or us running rasterize_and_record. So potentially unsafe or slow behavior is more acceptable.
> non-constant runtime is probably ok so long as it isn't too slow. How slow are > we talking about? > Regarding debugging code: it'll only run when the page is under explicit debug > inspection (via the user turning on about:tracing, or us running > rasterize_and_record. So potentially unsafe or slow behavior is more acceptable. It'll sort of scale O(lg N) given the way SkVarAlloc grows by doubling, until it hits its 64K page cap, where it will begin scaling linearly. I don't think calling these malloc+ methods is particularly unsafe, but I'm not very familiar with them. They may involve synchronization, but probably not system calls. Best way to know how much that strategy would cost is to try.
On 2014/11/17 23:15:33, mtklein wrote: > https://codereview.chromium.org/490253003/diff/60001/src/core/SkTileGrid.cpp > File src/core/SkTileGrid.cpp (right): > > https://codereview.chromium.org/490253003/diff/60001/src/core/SkTileGrid.cpp#... > src/core/SkTileGrid.cpp:182: byteCount = fTiles[i].reserved() * > sizeof(unsigned); > I think you want += here? > > While you're at it, can't hurt to move the multiply out of the loop. D('oh)ne. > https://codereview.chromium.org/490253003/diff/80001/src/core/SkRecord.h > File src/core/SkRecord.h (right): > > https://codereview.chromium.org/490253003/diff/80001/src/core/SkRecord.h#newc... > src/core/SkRecord.h:241: unsigned fBytesAllocated; > Now that I'm thinking about it, it seems counterproductive to add a member here > just to track how large it is. More than that, this fBytesAllocated will count > how many bytes we asked for, not how many we've allocated to serve that request. > Ideally those scale well with each other, but the whole point of this is to > assert that, right? > > How often do you guys figure you'll be calling ApproximateBytesUsed, and do you > care if it doesn't work on all platforms? What about Debug vs. Release, etc? > > SkVarAlloc could implement a byte-precise bytesAllocated() without any extra > memory usage at the cost of non-constant run time. Mapping malloc_size / > malloc_usable_size / _msize down the chain of fPrev pointers should work, though > to date I haven't really liked to depend on non-standard methods like those > outside of debugging code. I'd like to keep this more-or-less constant time, but I understand that it's ironic to do that by storing it in a member at the same time we're trying to keep size down. Meh.
malloc_usable_size() was deprecated ~4 years ago on Android; although jemalloc normally ships with it on Linux, it seems to be hidden there as well. So we can't have nice things? +Digit: TLDR we're trying to track size allocated for some Skia data structures in Clank. The change we're considering would create many more of the allocated objects, so we don't want to bloat them to carry around their sizes. Your groups posts on the subject were: https://groups.google.com/forum/#!searchin/android-ndk/malloc_usable_size/and... https://groups.google.com/forum/#!searchin/android-ndk/malloc_usable_size/and...
On 2014/11/18 16:33:53, tomhudson wrote: > malloc_usable_size() was deprecated ~4 years ago on Android; although jemalloc > normally ships with it on Linux, it seems to be hidden there as well. So we > can't have nice things? > > +Digit: TLDR we're trying to track size allocated for some Skia data structures > in Clank. The change we're considering would create many more of the allocated > objects, so we don't want to bloat them to carry around their sizes. Your groups > posts on the subject were: > > https://groups.google.com/forum/#!searchin/android-ndk/malloc_usable_size/and... > https://groups.google.com/forum/#!searchin/android-ndk/malloc_usable_size/and... Tom, are you working on this: https://codereview.chromium.org/730193003/ ?
On 2014/11/18 16:34:48, mtklein wrote: > Tom, are you working on this: https://codereview.chromium.org/730193003/ ? Nope, I wasn't aware of that, but Clank TOT isn't using dlmalloc, so I don't trust that patch to work in situ.
> Nope, I wasn't aware of that, but Clank TOT isn't using dlmalloc, so I don't > trust that patch to work in situ. OK. Going to have it return 0 on Android instead. If we care and figure out good way to detect and call into jemalloc, should be easy to follow up with that.
On 2014/11/18 16:43:30, mtklein wrote: > OK. Going to have it return 0 on Android instead. If we care and figure out > good way to detect and call into jemalloc, should be easy to follow up with > that. ... but that makes the exact measurement approach completely wrong on Android, which is our target platform for these changes?
> ... but that makes the exact measurement approach completely wrong on Android, > which is our target platform for these changes? If you say so. Except for malloc itself, nothing in picture recording is platform specific. Changes you seen on a desktop will parallel what happens on a phone.
> It's not a problem to leave out referenced objects that we would expect to > reference the same regardless of the picture structure. I think all the > externally allocated and ref-counted objects fall into this category. Oh hey, what about nested pictures? Do you guys intend to call drawPicture()? Sort of makes gaming the metric trivially easy. :)
> > How often do you guys figure you'll be calling ApproximateBytesUsed, and do you care if it doesn't work on all platforms? What about Debug vs. Release, etc? > > It will only be when we want to collect debug info. But it needs to work in production as well as debug environments, so we can integrate with about:tracing. Yeah. I think ideally we would be able to add this to about:tracing, which means that when somebody has requested a particular trace category, we would call this function. At this point, I would probably want to turn it on along with the trace that includes serialization of pictures, so it doesn't have to be that cheap relatively speaking. We also want to call it from test microbenchmarks, but it also doesn't have to be cheap there. We want to be able to compare the old approach (one large SkPicture) with slimming paint's current approach (lots of little pictures), so anything that is shared among pictures like bitmaps shouldn't be counted so that we can have a fair comparison. I also don't think there are any plans to use this in production at runtime to make real decisions with. I think the lazy calculation that's being done now seems like a great approach rather than trying to be faster or smarter about it.
reed@: PTAL at public API again; is the documentation added to SkPictureUtil::ApproximateBytesUsed() adequate? Do you have a better name in mind? Rewrote to dynamically measure size using mtklein@'s SkVarAlloc patch, which still *might* have problems with Chrome for Android.
impl lgtm https://codereview.chromium.org/490253003/diff/140001/include/utils/SkPicture... File include/utils/SkPictureUtils.h (right): https://codereview.chromium.org/490253003/diff/140001/include/utils/SkPicture... include/utils/SkPictureUtils.h:86: * (e.g. paths, pixels backing bitmaps). Might add nested pictures to the list of things we don't count? Took me a while to remember that. https://codereview.chromium.org/490253003/diff/140001/src/core/SkRTree.h File src/core/SkRTree.h (right): https://codereview.chromium.org/490253003/diff/140001/src/core/SkRTree.h#newc... src/core/SkRTree.h:57: virtual size_t bytesUsed() const SK_OVERRIDE; Might want to move this above the "only public for tests" line, but I don't think it'll be a big source of confusion either way.
https://codereview.chromium.org/490253003/diff/140001/include/utils/SkPicture... File include/utils/SkPictureUtils.h (right): https://codereview.chromium.org/490253003/diff/140001/include/utils/SkPicture... include/utils/SkPictureUtils.h:86: * (e.g. paths, pixels backing bitmaps). On 2014/11/18 19:26:14, mtklein wrote: > Might add nested pictures to the list of things we don't count? Took me a while > to remember that. Done. https://codereview.chromium.org/490253003/diff/140001/src/core/SkRTree.h File src/core/SkRTree.h (right): https://codereview.chromium.org/490253003/diff/140001/src/core/SkRTree.h#newc... src/core/SkRTree.h:57: virtual size_t bytesUsed() const SK_OVERRIDE; Since we don't really want this used for production decision-making, 'only for tests' seems like a reasonable way to tag it.
On 2014/11/18 19:35:12, tomhudson wrote: > https://codereview.chromium.org/490253003/diff/140001/include/utils/SkPicture... > File include/utils/SkPictureUtils.h (right): > > https://codereview.chromium.org/490253003/diff/140001/include/utils/SkPicture... > include/utils/SkPictureUtils.h:86: * (e.g. paths, pixels backing bitmaps). > On 2014/11/18 19:26:14, mtklein wrote: > > Might add nested pictures to the list of things we don't count? Took me a > while > > to remember that. > > Done. > > https://codereview.chromium.org/490253003/diff/140001/src/core/SkRTree.h > File src/core/SkRTree.h (right): > > https://codereview.chromium.org/490253003/diff/140001/src/core/SkRTree.h#newc... > src/core/SkRTree.h:57: virtual size_t bytesUsed() const SK_OVERRIDE; > Since we don't really want this used for production decision-making, 'only for > tests' seems like a reasonable way to tag it. Sorry, I totally spaced on the SkPicture refs and was head down not checking email. I was only thinking of the shared objects like bitmaps, gradients, etc. With Slimming Paint we really do care about SkPicture refs because that is in fact the way we would be blowing out memory. My assumption was that we would get everything that used the SkVarAlloc or SkChunkAlloc for its memory.
https://codereview.chromium.org/490253003/diff/140001/src/core/SkRTree.h File src/core/SkRTree.h (right): https://codereview.chromium.org/490253003/diff/140001/src/core/SkRTree.h#newc... src/core/SkRTree.h:57: virtual size_t bytesUsed() const SK_OVERRIDE; On 2014/11/18 19:35:12, tomhudson wrote: > Since we don't really want this used for production decision-making, 'only for > tests' seems like a reasonable way to tag it. Uh, okay, but that's a more confusing answer than "oh, didn't see that". This use case is not testing. We'd not expect people to look down here before deciding whether or not to use ApproximateBytesUsed(). If we're going to have a note that says "Don't use this for anything but diagnostics," it belongs in that header. This method isn't public for testing. It's public to match the SkBBH interface.
> Sorry, I totally spaced on the SkPicture refs and was head down not checking > email. I was only thinking of the shared objects like bitmaps, gradients, etc. > With Slimming Paint we really do care about SkPicture refs because that is in > fact the way we would be blowing out memory. > > My assumption was that we would get everything that used the SkVarAlloc or > SkChunkAlloc for its memory. Nested pictures have been allocated, recorded, and are stand alone objects all before you can record them into another picture. They do both use SkVarAlloc, but they're using different SkVarAllocs. This, incidentally, is my main hesitation with recording thousands of tiny pictures. It's not so much you're going to be using more RAM---you will, but at 168 bytes, the overhead is only 1 or 2 DrawFoo ops---but you're going to be calling malloc (and free) thousands of times more often. Amortizing this allocation is pretty much the most important thing we do to make recording SkPictures fast, but you won't be able to hide that. Is there a single scope that's convenient to peg the lifetime of all these micropictures to? If so, we could thread an arena (e.g. SkVarAlloc) through all of them, perhaps restoring some that amortization. It'd take some work to really commit to that, though. There are lots of allocations, not just what we allocate with SkVarAlloc today, but the call to new SkRecord, the calls inside BBH factories to return a new SkBBH*, allocation done filling bounding boxes, allocation done by layer analysis, etc. Those are all one-off mallocs today that will trickle up the profile when we start doing O(N) of them.
On 2014/11/18 at 19:59:22, mtklein wrote: > > Sorry, I totally spaced on the SkPicture refs and was head down not checking > > email. I was only thinking of the shared objects like bitmaps, gradients, etc. > > With Slimming Paint we really do care about SkPicture refs because that is in > > fact the way we would be blowing out memory. > > > > My assumption was that we would get everything that used the SkVarAlloc or > > SkChunkAlloc for its memory. > > Nested pictures have been allocated, recorded, and are stand alone objects all before you can record them into another picture. They do both use SkVarAlloc, but they're using different SkVarAllocs. > > This, incidentally, is my main hesitation with recording thousands of tiny pictures. It's not so much you're going to be using more RAM---you will, but at 168 bytes, the overhead is only 1 or 2 DrawFoo ops---but you're going to be calling malloc (and free) thousands of times more often. Amortizing this allocation is pretty much the most important thing we do to make recording SkPictures fast, but you won't be able to hide that. > > Is there a single scope that's convenient to peg the lifetime of all these micropictures to? If so, we could thread an arena (e.g. SkVarAlloc) through all of them, perhaps restoring some that amortization. It'd take some work to really commit to that, though. There are lots of allocations, not just what we allocate with SkVarAlloc today, but the call to new SkRecord, the calls inside BBH factories to return a new SkBBH*, allocation done filling bounding boxes, allocation done by layer analysis, etc. Those are all one-off mallocs today that will trickle up the profile when we start doing O(N) of them. Yes, we can, and should, inject a common arena, to avoid the expense of malloc. Can we adopt such a strategy incrementally? e.g. by migrating some mallocs to it and not others?
On 2014/11/18 19:41:58, mtklein wrote: > Uh, okay, but that's a more confusing answer than "oh, didn't see that". Good point; done. If we aren't keeping a word around with size information, it looks like tracking the size of nested pictures is going to add a O(N) visit to all the SkRecords objects to find the SkRecords::DrawPicture and recurse? > This, incidentally, is my main hesitation with recording thousands of tiny > pictures. It's not so much you're going to be using more RAM---you will, but at > 168 bytes, the overhead is only 1 or 2 DrawFoo ops---but you're going to be > calling malloc (and free) thousands of times more often. It seems like part of the reason Android was able to get away with this architecture for their DisplayLists is they expect to have fewer than we do - Views are relatively heavyweight in their API, and there's one DL per View IIRC.
> If we aren't keeping a word around with size information, it looks like > tracking the size of nested pictures is going to add a O(N) visit to > all the SkRecords objects to find the SkRecords::DrawPicture and recurse? Yep. This is one of those passes that will be cheaper than you'd expect, given you're only looking for/at DrawPictures. I'm not sure I see how another word of storage would let you dodge this. You mean you want to do the calculation at record-time?
> Yes, we can, and should, inject a common arena, to avoid the expense of malloc. > Can we adopt such a strategy incrementally? e.g. by migrating some mallocs to it > and not others? Oh, absolutely. I'm sure there will always be allocs we don't bother consolidating. If it were me, I'd start exactly as-is, consolidating nothing, and profile, cry, hack, profile, smile, repeat...
On 2014/11/18 at 20:28:11, mtklein wrote: > > Yes, we can, and should, inject a common arena, to avoid the expense of malloc. > > Can we adopt such a strategy incrementally? e.g. by migrating some mallocs to it > > and not others? > > Oh, absolutely. I'm sure there will always be allocs we don't bother consolidating. > If it were me, I'd start exactly as-is, consolidating nothing, and profile, cry, hack, > profile, smile, repeat... Agreed, let's take that approach. Do we have the pattern to inject an arena ready to go? Or does that need some design work?
> Agreed, let's take that approach. Do we have the pattern to inject an arena > ready to go? Or does that need some design work? Let's not design the solution until we see the problem and understand its scope. Seems easy to guess wrong.
On 2014/11/18 at 20:38:04, mtklein wrote: > > Agreed, let's take that approach. Do we have the pattern to inject an arena > > ready to go? Or does that need some design work? > > Let's not design the solution until we see the problem and understand its scope. Seems easy to guess wrong. I guess my question is merely whether you're pretty sure there is a reasonable way to proceed when we see it on a profile. Seems likely?
> I guess my question is merely whether you're pretty sure there is a reasonable > way to proceed when we see it on a profile. Seems likely? My point is that what's best to do depends on the profile we see. But, no, I don't think we're backed into a corner where changing how we allocate things is impossible. We can always change code. It's just a matter of effort. We're a bunch of coders. Someone will either figure out a clever way to solve the problem, or prove it's unsolvable.
On 2014/11/18 20:31:30, chrishtr wrote: > On 2014/11/18 at 20:28:11, mtklein wrote: > > > Yes, we can, and should, inject a common arena, to avoid the expense of > malloc. > > > Can we adopt such a strategy incrementally? e.g. by migrating some mallocs > to it > > > and not others? > > > > Oh, absolutely. I'm sure there will always be allocs we don't bother > consolidating. > > If it were me, I'd start exactly as-is, consolidating nothing, and profile, > cry, hack, > > profile, smile, repeat... > > Agreed, let's take that approach. Do we have the pattern to inject an arena > ready to go? Or does that need some design work? See proposal 1 in https://docs.google.com/document/d/1qNPBk60388ah5FoHwVunwbNZJtv5-n0N97cGDj0Ld... It's not an existing pattern, but it is one that addresses the display item problem with a single allocator. I think it's worth implementing as an experiment.
Here's a draft with a SkRecord visitor to recursively measure SkPictures, and another trivial unit test to exercise it. Fodder for future optimizing: according to the function a single empty SkPicture is 168B, with another picture wrapped around we consume 448B.
On 2014/11/18 21:28:26, tomhudson wrote: > Here's a draft with a SkRecord visitor to recursively measure SkPictures, and > another trivial unit test to exercise it. > > Fodder for future optimizing: according to the function a single empty SkPicture > is 168B, with another picture wrapped around we consume 448B. Meets our primary concerns for Slimming Paint benchmarks. I'd be happy plumbing this through and, in fact, will get started now. The APi seems robust even if the details slow down commit.
https://codereview.chromium.org/490253003/diff/180001/src/core/SkRecord.h File src/core/SkRecord.h (right): https://codereview.chromium.org/490253003/diff/180001/src/core/SkRecord.h#new... src/core/SkRecord.h:242: #endif //SkRecord_DEFINED oh snap https://codereview.chromium.org/490253003/diff/180001/src/utils/SkPictureUtil... File src/utils/SkPictureUtils.cpp (right): https://codereview.chromium.org/490253003/diff/180001/src/utils/SkPictureUtil... src/utils/SkPictureUtils.cpp:225: template <typename T> void operator()(const T& op) { Maybe just this? struct MeasureRecords { template <typename T> size_t operator()(const T&) { return 0; } size_t operator()(const DrawPicture& p) { return SPU::ABU(p.picture); } }; ... for (unsigned i = 0; i < pict->fRecord->count(); i++) { byteCount += pict->fRecord->visit<size_t>(i, MeasureRecords()); } https://codereview.chromium.org/490253003/diff/180001/src/utils/SkPictureUtil... src/utils/SkPictureUtils.cpp:232: this->fSize += SkPictureUtils::ApproximateBytesUsed(p.picture); I applaud your adherence to the style guide, but this may be hypercorrection. f already means this->. :)
https://codereview.chromium.org/490253003/diff/180001/src/utils/SkPictureUtil... File src/utils/SkPictureUtils.cpp (right): https://codereview.chromium.org/490253003/diff/180001/src/utils/SkPictureUtil... src/utils/SkPictureUtils.cpp:225: template <typename T> void operator()(const T& op) { On 2014/11/18 21:38:32, mtklein wrote: > Maybe just this? > > struct MeasureRecords { > template <typename T> size_t operator()(const T&) { return 0; } > size_t operator()(const DrawPicture& p) { return SPU::ABU(p.picture); } > }; > ... > for (unsigned i = 0; i < pict->fRecord->count(); i++) { > byteCount += pict->fRecord->visit<size_t>(i, MeasureRecords()); > } Oh, sadly I think you have to do MeasureRecords v; for ( ... ) { byteCount += pict->fRecord->visit<size_t>(i,v); } :( Non-const references don't get their lifetime extended, even if there's no state to mutate.
tomhudson@google.com changed required reviewers: + reed@google.com
https://codereview.chromium.org/490253003/diff/180001/src/core/SkRecord.h File src/core/SkRecord.h (right): https://codereview.chromium.org/490253003/diff/180001/src/core/SkRecord.h#new... src/core/SkRecord.h:242: #endif //SkRecord_DEFINED On 2014/11/18 21:38:32, mtklein wrote: > oh snap Sigh, I guess there is no standard spec here? SkCanvas doesn't even have a #endif // comment. Undone. https://codereview.chromium.org/490253003/diff/180001/src/utils/SkPictureUtil... File src/utils/SkPictureUtils.cpp (right): https://codereview.chromium.org/490253003/diff/180001/src/utils/SkPictureUtil... src/utils/SkPictureUtils.cpp:225: template <typename T> void operator()(const T& op) { On 2014/11/18 21:46:23, mtklein wrote: > On 2014/11/18 21:38:32, mtklein wrote: > > Maybe just this? > > > > struct MeasureRecords { > > template <typename T> size_t operator()(const T&) { return 0; } > > size_t operator()(const DrawPicture& p) { return SPU::ABU(p.picture); } > > }; > > ... > > for (unsigned i = 0; i < pict->fRecord->count(); i++) { > > byteCount += pict->fRecord->visit<size_t>(i, MeasureRecords()); > > } > > Oh, sadly I think you have to do > > MeasureRecords v; > for ( ... ) { > byteCount += pict->fRecord->visit<size_t>(i,v); > } > > :( Non-const references don't get their lifetime extended, even if there's no > state to mutate. Acknowledged. (I think you posted this at the same time I was compiling it...)
https://codereview.chromium.org/490253003/diff/180001/src/core/SkRecord.h File src/core/SkRecord.h (right): https://codereview.chromium.org/490253003/diff/180001/src/core/SkRecord.h#new... src/core/SkRecord.h:242: #endif //SkRecord_DEFINED On 2014/11/18 21:49:37, tomhudson wrote: > On 2014/11/18 21:38:32, mtklein wrote: > > oh snap > > Sigh, I guess there is no standard spec here? SkCanvas doesn't even have a > #endif // comment. > > Undone. Goodness, why would we ever have a standard spec for this?! Of course this doesn't matter. I tease. But, my way is slightly most awesome. '#ifndef ' '#define ' and '#undef//' are all exactly 8 characters, so it feels good to see it all line up as you write the first few lines of a new header file.
On 2014/11/18 21:54:35, mtklein wrote: > https://codereview.chromium.org/490253003/diff/180001/src/core/SkRecord.h > File src/core/SkRecord.h (right): > > https://codereview.chromium.org/490253003/diff/180001/src/core/SkRecord.h#new... > src/core/SkRecord.h:242: #endif //SkRecord_DEFINED > On 2014/11/18 21:49:37, tomhudson wrote: > > On 2014/11/18 21:38:32, mtklein wrote: > > > oh snap > > > > Sigh, I guess there is no standard spec here? SkCanvas doesn't even have a > > #endif // comment. > > > > Undone. > > Goodness, why would we ever have a standard spec for this?! Of course this > doesn't matter. I tease. > > But, my way is slightly most awesome. '#ifndef ' '#define ' and '#undef//' are > all exactly 8 characters, so it feels good to see it all line up as you write > the first few lines of a new header file. Err, '#endif//'
(forgot to say, lgtm)
api lgtm
The CQ bit was checked by tomhudson@google.com
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/490253003/200001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: Test-Ubuntu13.10-GCE-NoGPU-x86_64-Debug-Trybot on client.skia (http://build.chromium.org/p/client.skia/builders/Test-Ubuntu13.10-GCE-NoGPU-x...)
The CQ bit was checked by tomhudson@google.com
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/490253003/280001
Message was sent while issue was closed.
Committed patchset #15 (id:280001) as https://skia.googlesource.com/skia/+/158fcaa031d105dc999d9813fee8927db56a871c |