|
|
Chromium Code Reviews
DescriptionSpeed up InvalidationRegion
Staging invalidation rectangles in a vector allows us to skip building
a complex SkRegion for cases with many invalidations (> 256). For cases
with fewer invalidations we will still build a full SkRegion.
R=enne@chromium.org
BUG=606069
CQ_INCLUDE_TRYBOTS=tryserver.blink:linux_blink_rel
Committed: https://crrev.com/62045ceab0fd83013d6cb781ba727d3d3ce68e18
Cr-Commit-Position: refs/heads/master@{#398755}
Patch Set 1 #
Total comments: 2
Patch Set 2 : Limit pending_rects_ to kMaxInvalidationRectCount #
Total comments: 1
Messages
Total messages: 27 (9 generated)
Description was changed from ========== Speed up InvalidationRegion Staging invalidation rectangles in a vector allows us to skip building a complex SkRegion for cases with many invalidations (> 256). For cases with fewer invalidations we will still build a full SkRegion. R=enne@chromium.org BUG=606069 ========== to ========== Speed up InvalidationRegion Staging invalidation rectangles in a vector allows us to skip building a complex SkRegion for cases with many invalidations (> 256). For cases with fewer invalidations we will still build a full SkRegion. R=enne@chromium.org BUG=606069 CQ_INCLUDE_TRYBOTS=tryserver.blink:linux_blink_rel ==========
The CQ bit was checked by vmiura@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2054473002/1
PTAL. This change gets pretty much all the improvement measured with kMaxInvalidationRectCount == 1 on Animometer, without changing the rect count for <256 invalidations.
Most of the HTML & SVG tests on Animometer show some improvement, 5 to 10%, and a few show 25 to 30% improvement.
lgtm Thanks! That's nice to hear that just the vector maintains the same speedup. https://codereview.chromium.org/2054473002/diff/1/cc/base/invalidation_region.cc File cc/base/invalidation_region.cc (right): https://codereview.chromium.org/2054473002/diff/1/cc/base/invalidation_region... cc/base/invalidation_region.cc:32: pending_rects_.push_back(rect); I'm prematurely optimizing here, but what about early-outing pushing back additional rects (and resizing the vector) if we have more rects then the max? You could just union it into one of the other existing pending rects.
The CQ bit was checked by vmiura@chromium.org to run a CQ dry run
https://codereview.chromium.org/2054473002/diff/1/cc/base/invalidation_region.cc File cc/base/invalidation_region.cc (right): https://codereview.chromium.org/2054473002/diff/1/cc/base/invalidation_region... cc/base/invalidation_region.cc:32: pending_rects_.push_back(rect); On 2016/06/08 19:05:14, enne wrote: > I'm prematurely optimizing here, but what about early-outing pushing back > additional rects (and resizing the vector) if we have more rects then the max? > You could just union it into one of the other existing pending rects. Done.
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2054473002/20001
chrishtr@chromium.org changed reviewers: + chrishtr@chromium.org
\o/
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_android_rel_ng on tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi...)
The CQ bit was checked by vmiura@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from enne@chromium.org Link to the patchset: https://codereview.chromium.org/2054473002/#ps20001 (title: "Limit pending_rects_ to kMaxInvalidationRectCount")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2054473002/20001
Message was sent while issue was closed.
Committed patchset #2 (id:20001)
Message was sent while issue was closed.
Description was changed from ========== Speed up InvalidationRegion Staging invalidation rectangles in a vector allows us to skip building a complex SkRegion for cases with many invalidations (> 256). For cases with fewer invalidations we will still build a full SkRegion. R=enne@chromium.org BUG=606069 CQ_INCLUDE_TRYBOTS=tryserver.blink:linux_blink_rel ========== to ========== Speed up InvalidationRegion Staging invalidation rectangles in a vector allows us to skip building a complex SkRegion for cases with many invalidations (> 256). For cases with fewer invalidations we will still build a full SkRegion. R=enne@chromium.org BUG=606069 CQ_INCLUDE_TRYBOTS=tryserver.blink:linux_blink_rel Committed: https://crrev.com/62045ceab0fd83013d6cb781ba727d3d3ce68e18 Cr-Commit-Position: refs/heads/master@{#398755} ==========
Message was sent while issue was closed.
Patchset 2 (id:??) landed as https://crrev.com/62045ceab0fd83013d6cb781ba727d3d3ce68e18 Cr-Commit-Position: refs/heads/master@{#398755}
Message was sent while issue was closed.
danakj@chromium.org changed reviewers: + danakj@chromium.org
Message was sent while issue was closed.
So the take away here is that a vector of rects is more performant than a Region? Curious what is the usage pattern that makes this true here? Looks of unioning? Very little, and lots of iterating? What's the point of maintaining the region_ member variable now? It looks like this is become a vector of rects that can generate a Region basically?
Message was sent while issue was closed.
On 2016/06/09 19:02:13, danakj wrote: > So the take away here is that a vector of rects is more performant than a > Region? Curious what is the usage pattern that makes this true here? Looks of > unioning? Very little, and lots of iterating? I haven't looked deeply into SkRegion's implementation. Isn't Region::Union some O(n ln(n)) operation? > What's the point of maintaining the region_ member variable now? It looks like > this is become a vector of rects that can generate a Region basically? Basically. The Swap(Region*) api is the reason I kept region_, but if we replaced Swap() with GetRegion() or something then it could go away.
Message was sent while issue was closed.
On Thu, Jun 9, 2016 at 12:15 PM, <vmiura@chromium.org> wrote: > On 2016/06/09 19:02:13, danakj wrote: > > So the take away here is that a vector of rects is more performant than a > > Region? Curious what is the usage pattern that makes this true here? > Looks of > > unioning? Very little, and lots of iterating? > > I haven't looked deeply into SkRegion's implementation. Isn't Region::Union > some O(n ln(n)) operation? > Ya it is, I guess you're saying the usage is lots of unioning with little querying? At some point we're putting those rects into a Region, so using a vector in between is strictly overhead. Is the win here coming from unioning with rect_[0] a lot of times in these cases? Trying to understand what makes this good. > > > > What's the point of maintaining the region_ member variable now? It > looks like > > this is become a vector of rects that can generate a Region basically? > > Basically. The Swap(Region*) api is the reason I kept region_, but if we > replaced Swap() with GetRegion() or something then it could go away. > > https://codereview.chromium.org/2054473002/ > -- You received this message because you are subscribed to the Google Groups "Chromium-reviews" group. To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
Message was sent while issue was closed.
I think the performance hit is unioning and finding the set of non-overlapping rects to construct the SkRegion. When the are too many rects, then there's just "simple unioning" and the bounding box of all rects is used as invalidation. By that, it makes it guaranteed to only do the SkRegion work when the number of rects is small. Followup work here should really be to use a vector of rects elsewhere instead of pushing into an SkRegion, but I suspect this was just trying to be conservative and make a minimal change for the important win.
Message was sent while issue was closed.
On 2016/06/09 20:00:01, enne wrote: > I think the performance hit is unioning and finding the set of non-overlapping > rects to construct the SkRegion. When the are too many rects, then there's just > "simple unioning" and the bounding box of all rects is used as invalidation. By > that, it makes it guaranteed to only do the SkRegion work when the number of > rects is small. > > Followup work here should really be to use a vector of rects elsewhere instead > of pushing into an SkRegion, but I suspect this was just trying to be > conservative and make a minimal change for the important win. I think the main hit that we would've had before is for the regions that have more than 256 rects, since the previous approach would collapse the 256 into a bounds, but then continue unioning (in the region sense) the rest of the rects. So technically, you could end up having "higher resolution" region, where the first 256 are collapsed, but the rest are individual. With this approach, once we breach 256 we always have just a single bounds. Also, rects.size() + region complexity might also effectively reduce 256 to a smaller number thus giving us more performance when iterating. That is, a single rect doesn't necessarily translate into one region complexity unit, it depends on the rects, so rects.size() might be larger than the resulting region complexity.
Message was sent while issue was closed.
Sorry didn't have time to go into more detail earlier. I haven't noticed the iteration overhead stand out. We were limiting the region complexity to 256 rects, and also it's off the critical path here. I The cost I measured was on insertion for large numbers of invalidation rects, e.g. the "Design" test case does over 3000 invalidations, with many partial overlaps generating a complex contour. The test is main thread bound so the insertion cost is what's making a difference. The old code at different kMaxInvalidationRectCount values: kMaxInvalidationRectCount = 256 Design score = 131.82 kMaxInvalidationRectCount = 16 Design score = 140.25 kMaxInvalidationRectCount = 1 Design score = 153.73 Old approach: O(N ln(M)) where N = Num Unions, M = Max region complexity This approach: O(N) for N > 256 O(N + N ln(M)) for N <= 256 so a fixed small order overhead added
Message was sent while issue was closed.
https://codereview.chromium.org/2054473002/diff/20001/cc/base/invalidation_re... File cc/base/invalidation_region.cc (right): https://codereview.chromium.org/2054473002/diff/20001/cc/base/invalidation_re... cc/base/invalidation_region.cc:32: if (pending_rects_.size() >= kMaxInvalidationRectCount) I just realized after this change, we never get (pending_rects_.size() > kMaxInvalidationRectCount) so in FinalizePendingRects we'll always take the region building path. I'll upload a fix. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
