Created: 6 years, 10 months ago by epenner
Modified: 6 years, 10 months ago
CC: chromium-reviews, jbauman+watch_chromium.org, jam, sievers+watch_chromium.org, joi+watch-content_chromium.org, danakj+watch_chromium.org, darin-cc_chromium.org, kalyank, piman+watch_chromium.org, cc-bugs_chromium.org, android-webview-reviews_chromium.org, boliu, aelias_OOO_until_Jul13, ccameron
Base URL: https://chromium.googlesource.com/chromium/src.git@master
Visibility: Public.
Description: CC/GPU: Add a soft limit to the compositor.
On low-end devices (and, increasingly, high-end ones) we want to
reduce memory use as much as possible for normal content without
triggering calamitous fallbacks such as raster-on-demand
or flickering.
This adds a hard memory limit in addition to the soft limit.
The hard limit will be used only to avoid calamitous behavior.
Limits are the same for this patch and will be tweaked in
future patches.
BUG=339144
NOTRY=true
No-try since this is cc-only, cc_unittests pass, and the errors are not related.
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=251180
Patch Set 1
Patch Set 2
Patch Set 3: Use CC Setting.
Total comments: 12
Patch Set 4: Format
Patch Set 5: Tests + Rebase.
Patch Set 6: Rebase.
Patch Set 7: Rebase.
Messages
Total messages: 32 (0 generated)
Ptal. I wanted to get this patch up before I go too far (given the large amount of plumbing involved). At the moment, the hard limit is set to the same as the soft limit. In terms of the final calculation, I'd like to keep it simple and just have the hard limit be a multiple of the soft limit (e.g. 2X), with the soft limit reduced in some cases (if 2X would make the hard limit too high).
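The calculation proposed here could be sketched roughly like this. This is a hypothetical illustration: the struct, function name, and device-cap parameter are invented for the example, not taken from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical sketch of the proposed limit derivation: the hard limit
// is a fixed multiple of the soft limit, and the soft limit is reduced
// when that multiple would push the hard limit past a device cap.
struct MemoryLimits {
  int64_t soft_bytes;
  int64_t hard_bytes;
};

MemoryLimits DeriveLimits(int64_t soft_bytes, int64_t device_cap_bytes) {
  const int64_t kHardLimitMultiplier = 2;
  int64_t hard_bytes = soft_bytes * kHardLimitMultiplier;
  if (hard_bytes > device_cap_bytes) {
    // 2X would make the hard limit too high: cap it, and shrink the
    // soft limit so the ratio between the two limits is preserved.
    hard_bytes = device_cap_bytes;
    soft_bytes = hard_bytes / kHardLimitMultiplier;
  }
  return {soft_bytes, hard_bytes};
}
```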
On 2014/02/11 05:57:33, epenner wrote:
> Ptal.
>
> I wanted to get this patch up before I go too far
> (given the large amount of plumbing involved).
>
> At the moment, the hard limit is set to the same as the soft
> limit. In terms of the final calculation, I'd like to keep
> it simple and just have the hard limit be a multiple of the
> soft limit (e.g. 2X), with the soft limit reduced in some
> cases (if 2X would make the hard limit too high).

This seems too complicated to me.

I think the best solution is to "always use as little memory as possible, and never use more than the limit given to you". This is the policy that I've been slowly moving all platforms to, as the old "cache as much stuff as you can, up to the limit" strategy has broken them. Maybe we should move Android to this policy too?

The place where it's controlled is the following block:

  // Use a more conservative memory allocation policy on Linux and Mac because
  // the platform is unstable when under memory pressure.
  // http://crbug.com/145600 (Linux)
  // http://crbug.com/141377 (Mac)
  #if defined(OS_MACOSX) || (defined(OS_LINUX) && !defined(OS_CHROMEOS))
    priority_cutoff_ = MemoryAllocation::CUTOFF_ALLOW_NICE_TO_HAVE;
  #endif

You can also change it to MemoryAllocation::REQUIRED to be even more conservative.

If that doesn't accomplish this, then maybe the impl-side-painting tile management should be changed so that the priority-cutoff does this.
On 2014/02/11 17:45:16, ccameron1 wrote:
> On 2014/02/11 05:57:33, epenner wrote:
> > At the moment, the hard limit is set to the same as the soft
> > limit. In terms of the final calculation, I'd like to keep
> > it simple and just have the hard limit be a multiple of the
> > soft limit (e.g. 2X), with the soft limit reduced in some
> > cases (if 2X would make the hard limit too high).
>
> This seems too complicated to me.
>
> I think the best solution is to "always use as little memory as possible, and
> never use more than the limit given to you".

I think this ends up being an argument over who has the 'ugly heuristic'. Another possibility is to just increase the single limit from the GPU memory manager, and then CC can make two limits out of it. Would that be acceptable from the GPU-memory-manager perspective?

Indeed, the goal of this patch is exactly what you are suggesting: "use as little memory as possible". It was decided in some design talks that it would be accomplished in two ways:

1.) Visible content is 'extra special' and can use the entire limit.
2.) Non-visible content will 'use as little as possible' in two different ways:
    - Bounded by distance
    - Bounded by a limit that is lower than the hard limit

It's perfectly acceptable to me that we only pass the hard limit, and CC derives the rest by itself. Enne/vmpstr/aelias, does that sound acceptable? It will be considerably less plumbing, and CC will generate hard/soft limits from the hard limit it gets.
On 2014/02/11 19:35:37, epennerAtGoogle wrote:
> Enne/vmpstr/aelias, does that sound acceptable? It will be considerably less
> plumbing, and CC will generate hard/soft limits from the hard limit it gets.

My opinion would be that it'd be better to have the memory manager involved in hard and soft limits. If there's only one really long static tab, then the memory manager could give out a very high soft limit and that tab could just cache everything, saving power by not having to reraster the same content.

I agree that it is more complicated to have more information in the memory manager, but isn't this information it could use to make the system as a whole behave better?
On 2014/02/11 19:45:53, enne wrote:
> My opinion would be that it'd be better to have the memory manager involved in
> hard and soft limits. If there's only one really long static tab, then the
> memory manager could give out a very high soft limit and that tab could just
> cache everything, saving power by not having to reraster the same content.
>
> I agree that it is more complicated to have more information in the memory
> manager, but isn't this information it could use to make the system as a whole
> behave better?

I tend to agree that since we have the GPU memory manager, it seems like the right place to provide these limits. We discussed at length how best the compositor should 'use as little as possible', and the three strategies above were the best we could come up with. The case Enne mentions is one way in which different platforms could expose different caching behaviors, which is likely what we'll want in the long run. If we did this tuning in CC, it would effectively be coupling some heuristics in CC (how to calculate the two limits) to the heuristics in the GPU memory manager (how to calculate the total limit).
TL;DR: What about putting this in CC to start with, and then, if we see a concrete way that this can have additional value by being sent to the memory manager, consider plumbing it through?

Remember, the gpu memory manager acts at a very long distance from the renderers in terms of latency, and it has to limit how often it broadcasts messages to everyone -- if it wakes up to tweak what people are doing, it goes and tells all of the renderers to do something, which is bad for power, performance, etc.

----

Details: (this is me going through a memory manager ptsd episode ... sorry ...)

I think some of this is getting at "what is the memory manager for". And I'd say that it's there to (1) compute and know the actual memory limits of the system (based on GL extensions, etc) and (2) coordinate the renderers' memory use, sending limits to each renderer. Note that on Android, (2) isn't really a thing, since there's just one renderer (in effect). The information the gpu memory manager has is just "we have X bytes to work with".

On 2014/02/11 19:53:39, epennerAtGoogle wrote:
> It's perfectly acceptable to me that we only pass the hard limit, and CC derives
> the rest by itself.

I'd really strongly advise that you do this. Plus, "this is what hard means and this is what soft means, etc" is really compositor-specific, so this involves casting everything up into an abstract space in the memory manager (which doesn't know what a compositor is, in theory), and then projecting all of that back down, and adding a bunch of latency, too. I see nothing concrete being gained and lots of concrete things being lost (latency, information, etc).

To be specific for this patch: if the idea was to say "set a soft limit of the total limit divided by two", what do we gain by sending that to the gpu memory manager?
> > My opinion would be that it'd be better to have the memory manager involved in
> > hard and soft limits. If there's only one really long static tab, then the
> > memory manager could give out a very high soft limit and that tab could just
> > cache everything, saving power by not having to reraster the same content.

But everything involved in this decision is compositor-side -- the compositor knows what's important and what isn't, and it also knows how much memory it can use, so why send all of that to the gpu process? We'd have to somehow find an encoding to say "well, this is static and it's long". This is especially true on Android, where having (effectively) one compositor means that you're not even coordinating between compositors in the gpumemmgr -- you're just saying "hey, we have X bytes total to do the best we can on this one page". I don't see how doing the roundtrip to the memmgr helps this.

> > I agree that it is more complicated to have more information in the memory
> > manager, but isn't this information it could use to make the system as a whole
> > behave better?

I think that's a lot easier imagined and said than actually done -- basically every scenario that I thought would be helped by a memory manager was made worse (remember, with multiple renderers, you have to deal with a lot of latency, not being able to predict future interaction and future resource needs, etc). When discussing things that are made better by projecting this, I'd stick with the very very very concrete examples, and I'd ask "how is this better than doing it in cc/?". I put a lot of time into trying to get the memory manager to do complicated things, and was very happy when we just tore most of it out because it was unnecessarily complicated. I wouldn't suggest going down that road again.
Well said. In a utopia of fast synchronous IPCs, I think the complex cases would work a lot better. The 2X heuristic will likely get more complicated down the road, but it could be done via the CC embedder, which also has IsLowEndDevice() and some #ifdefs for this type of thing, etc. Really I could go either way, so I'll let Enne/ccameron be the deciders if you two are comfortable hashing it out.
I don't feel strongly about this. If ccameron doesn't think this is worthwhile, then we should punt on it. If we come up with some use case that we want to optimize (like single long tab with few invalidations), we can revisit this.
Okay, I removed all the plumbing and implemented the CC/TileManager bit.

Reveman +R
CCameron +cc

Ptal. I'll land with the limits being the same initially, so that limit changes can be tweaked/rolled back separately. Tests coming up.
https://codereview.chromium.org/140673009/diff/160009/cc/resources/tile_manag...
File cc/resources/tile_manager.cc (right):

https://codereview.chromium.org/140673009/diff/160009/cc/resources/tile_manag...
cc/resources/tile_manager.cc:685:
nit: no need for this line. I'm surprised the presubmit check didn't complain about this..

https://codereview.chromium.org/140673009/diff/160009/cc/resources/tile_manag...
cc/resources/tile_manager.cc:724: soft_bytes_left = std::max<int64>(0, soft_bytes_left);
does it hurt to let this go negative? in that case, can you adjust the types above so it doesn't, instead of having to do std::max here?
https://codereview.chromium.org/140673009/diff/160009/cc/resources/tile_manag...
File cc/resources/tile_manager.cc (right):

https://codereview.chromium.org/140673009/diff/160009/cc/resources/tile_manag...
cc/resources/tile_manager.cc:685:
On 2014/02/12 20:08:02, David Reveman wrote:
> nit: no need for this line. I'm surprised the presubmit check didn't complain
> about this..

Sorry, I didn't rerun git cl format. Done.

https://codereview.chromium.org/140673009/diff/160009/cc/resources/tile_manag...
cc/resources/tile_manager.cc:724: soft_bytes_left = std::max<int64>(0, soft_bytes_left);
On 2014/02/12 20:08:02, David Reveman wrote:
> does it hurt to let this go negative? in that case, can you adjust the types
> above so it doesn't, instead of having to do std::max here?

Do you mean make it unsigned and add an if-statement here to avoid overflow instead of std::max()?

I was going to use size_t, but I changed to signed since this is more concise (less code than preventing overflow would be). I totally don't mind changing it to size_t if that is the normal pattern though.
https://codereview.chromium.org/140673009/diff/160009/cc/resources/tile_manag...
File cc/resources/tile_manager.cc (right):

https://codereview.chromium.org/140673009/diff/160009/cc/resources/tile_manag...
cc/resources/tile_manager.cc:724: soft_bytes_left = std::max<int64>(0, soft_bytes_left);
On 2014/02/12 21:54:34, epennerAtGoogle wrote:
> Do you mean make it unsigned and add an if-statement here to avoid overflow
> instead of std::max()?
>
> I was going to use size_t, but I changed to signed since this is more concise
> (less code than preventing overflow would be). I totally don't mind changing it
> to size_t if that is the normal pattern though.

I was thinking you'd keep this signed and just let it become negative instead of doing std::max. I think that'd be the cleanest, but you should probably change a few of the types above to signed too in that case.
On 2014/02/12 22:14:52, David Reveman wrote:
> I was thinking you'd keep this signed and just let it become negative instead of
> doing std::max. I think that'd be the cleanest, but you should probably change a
> few of the types above to signed too in that case.

Oh, I gotcha. Okay, let me look into that. I'll put it up together with tests shortly.
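For illustration, the two accounting styles debated in this exchange might look like the following. The helper names are invented for the example; the real accounting lives in cc/resources/tile_manager.cc.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Style 1: clamp the running budget at zero after every allocation,
// as the std::max<int64> call under review does.
int64_t SpendClamped(int64_t bytes_left, int64_t tile_bytes) {
  return std::max<int64_t>(0, bytes_left - tile_bytes);
}

// Style 2 (Reveman's suggestion): keep the budget signed and let it go
// negative. A negative value behaves like zero as long as callers only
// test FitsInBudget() before granting memory.
int64_t SpendSigned(int64_t bytes_left, int64_t tile_bytes) {
  return bytes_left - tile_bytes;
}

bool FitsInBudget(int64_t bytes_left, int64_t tile_bytes) {
  return bytes_left >= tile_bytes;
}
```

Both styles reject further tiles once the budget is exhausted; the signed variant simply skips the clamp at the cost of carrying a possibly negative running total.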
https://codereview.chromium.org/140673009/diff/160009/cc/resources/tile_manag...
File cc/resources/tile_manager.cc (right):

https://codereview.chromium.org/140673009/diff/160009/cc/resources/tile_manag...
cc/resources/tile_manager.cc:622: int64 soft_bytes_available =
maybe visible_bytes_available and repaint_bytes_available?

https://codereview.chromium.org/140673009/diff/160009/cc/resources/tile_manag...
cc/resources/tile_manager.cc:680: (mts.bin == NOW_BIN) ? hard_bytes_left : soft_bytes_left;
So in SMOOTHNESS_TAKES_PRIORITY, we prioritize the active tree. However, it is possible that the pending tree has this tile in the NOW_BIN (even though the active tree has it as SOON or EVENTUALLY or even NEVER). I think this means that mts.bin == NOW_BIN is not sufficient to check visibility. Also, it means that "visible" tiles might come after prepaint tiles in the priority ordering, something that I think your patch assumes doesn't happen.

I think we should still be judging visibility as something that is visible on either tree, wdyt? In fact, I think that should be some sort of a tile setting like "visible on either tree".

Also, that's from me looking at the code in GetTilesWithAssignedBins, which is kinda complicated. Feel free to correct me if I got it wrong.
https://codereview.chromium.org/140673009/diff/160009/cc/resources/tile_manag...
File cc/resources/tile_manager.cc (right):

https://codereview.chromium.org/140673009/diff/160009/cc/resources/tile_manag...
cc/resources/tile_manager.cc:622: int64 soft_bytes_available =
On 2014/02/12 22:19:36, vmpstr wrote:
> maybe visible_bytes_available and repaint_bytes_available?

s/repaint/prepaint/
https://codereview.chromium.org/140673009/diff/160009/cc/resources/tile_manag...
File cc/resources/tile_manager.cc (right):

https://codereview.chromium.org/140673009/diff/160009/cc/resources/tile_manag...
cc/resources/tile_manager.cc:622: int64 soft_bytes_available =
On 2014/02/12 22:20:20, vmpstr wrote:
> On 2014/02/12 22:19:36, vmpstr wrote:
> > maybe visible_bytes_available and repaint_bytes_available?
>
> s/repaint/prepaint/

Arg! That's what I called it before! ;) I can change it back if that's the preference, but there are some details that make 'visible' not quite what we are after (at least at the moment). See below.

https://codereview.chromium.org/140673009/diff/160009/cc/resources/tile_manag...
cc/resources/tile_manager.cc:680: (mts.bin == NOW_BIN) ? hard_bytes_left : soft_bytes_left;
I had to deal with this sticky case before when we were trying to avoid raster-on-demand at low memory limits. That was a problem of assuming that visible 'required-to-activate' tiles always come first. I believe this code is okay, but we should be double-sure, so please let me know if there is anything wrong with my thinking:

In GetTilesWithAssignedBins, when we are in SMOOTHNESS we use the active tree's priority for all tiles:
https://code.google.com/p/chromium/codesearch#chromium/src/cc/resources/tile_...

What this means is that some visible 'required-to-activate' tiles' mts.bin will not be set to NOW_BIN. However, similar to raster-on-demand, we don't actually strictly need those tiles when we are near the soft limit. We can simply delay activation instead, as we do here:
https://code.google.com/p/chromium/codesearch#chromium/src/cc/resources/tile_...

So visibility isn't quite the indicator of OOM when we are in smoothness mode. Once the user exits smoothness mode, all the tiles will be in the NOW_BIN, and we'll need more memory at that time, and will use the hard limit temporarily to activate.

I actually don't like this policy in the long run, and I think we should always reserve memory for visible tiles first. This requires us to be able to reserve memory using one ordering, while painting that memory with a different ordering... I haven't thought about how this will work in the future design, but it's probably worth going over.

OH! And it should be mts.bin <= NOW_BIN by the looks of it. I'll change that.
lgtm assuming everyone else is happy with it. I just have one comment request below.

https://codereview.chromium.org/140673009/diff/160009/cc/resources/tile_manag...
File cc/resources/tile_manager.cc (right):

https://codereview.chromium.org/140673009/diff/160009/cc/resources/tile_manag...
cc/resources/tile_manager.cc:622: int64 soft_bytes_available =
On 2014/02/12 23:08:46, epennerAtGoogle wrote:
> Arg! That's what I called it before! ;) I can change it back if that's the
> preference, but there are some details that make 'visible' not quite what we
> are after (at least at the moment). See below.

Yeah, I see the problem. I guess soft/hard is OK; I just have to get used to it a little bit, as the mapping of soft to prepaint and hard to visible isn't very well established in my brain :P

https://codereview.chromium.org/140673009/diff/160009/cc/resources/tile_manag...
cc/resources/tile_manager.cc:680: (mts.bin == NOW_BIN) ? hard_bytes_left : soft_bytes_left;
On 2014/02/12 23:08:46, epennerAtGoogle wrote:
> So visibility isn't quite the indicator of OOM when we are in smoothness mode.
> Once the user exits smoothness mode, all the tiles will be in the NOW_BIN,
> and we'll need more memory at that time, and will use the hard limit
> temporarily to activate.

I guess the exception is NEW_CONTENT_TAKES_PRIORITY, but tbh I have no idea when we enter this mode. However, in there we do want active-tree now-bin tiles to use the hard memory limit, since otherwise we might checkerboard.

Overall, I can buy that we only want the hard limit to be used for what the tile manager itself decides is now-bin (which might exclude some pending visible tiles). In that spirit though, could you add an explicit comment somewhere around here to explain this discussion? It almost looks like we care about visible tiles, and I want to make it clear that we know that it's just NOW_BIN (or less) that we care about.

> I actually don't like this policy in the long run, and I think we should always
> reserve memory for visible tiles first.

Yeah, I think in the future I want the tile manager to "know" that a tile is visible. Period. It doesn't matter which tree it comes from or whether it exists on one tree or both; just the fact that it's visible somewhere should be an indicator of the tile's importance. But maybe that's simplistic.

> OH! And it should be mts.bin <= NOW_BIN by the looks of it. I'll change that.

Good catch :)
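A minimal sketch of the budget selection discussed in this exchange. The enum values and helper function here are illustrative stand-ins, not the actual cc/ definitions; the sketch assumes the convention from the discussion that a lower bin value means higher priority.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative bin ordering: lower value == higher priority.
enum TileBin {
  NOW_AND_READY_TO_DRAW_BIN = 0,  // Ahead of NOW_BIN in priority.
  NOW_BIN = 1,                    // Needed for the current frame.
  SOON_BIN = 2,                   // Prepaint.
  EVENTUALLY_BIN = 3,
  NEVER_BIN = 4,
};

// Bins at NOW_BIN priority or better draw from the hard limit; prepaint
// bins are held to the soft limit. Using <= rather than == is the fix
// noted in the thread: it also covers bins ahead of NOW_BIN.
int64_t BytesLeftForBin(TileBin bin,
                        int64_t hard_bytes_left,
                        int64_t soft_bytes_left) {
  return (bin <= NOW_BIN) ? hard_bytes_left : soft_bytes_left;
}
```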
Ptal. Okay, new patch up with tests. There was a minor bug in the tests where we didn't use 'resource mode' after resetting the tree priority. That is fixed, and I found enough tests to modify to exercise this pretty well.

Regarding int64s, I briefly got paranoid that int64 might be more expensive, so I used size_t after all (with a check for overflow). It seems cleaner to be consistent throughout with size_t, so I just left it that way.

FYI, we are very close to 32-bit memory limits on the perf tests. Should we reduce the tile size or something there?
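The size_t approach with an overflow check mentioned here might look like the following sketch (the helper name is hypothetical): an unsigned budget cannot go negative, so the subtraction is guarded to avoid wrapping around to a huge value.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: deduct tile_bytes from an unsigned running
// budget without letting the subtraction wrap.
size_t SpendBytes(size_t bytes_left, size_t tile_bytes) {
  if (tile_bytes >= bytes_left)
    return 0;  // Unsigned underflow here would wrap to near SIZE_MAX.
  return bytes_left - tile_bytes;
}
```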
lgtm
The CQ bit was checked by epenner@google.com
The CQ bit was unchecked by epenner@chromium.org
The CQ bit was checked by epenner@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/epenner@chromium.org/140673009/550001
The CQ bit was unchecked by commit-bot@chromium.org
Failed to apply patch for cc/trees/layer_tree_settings.cc:
While running patch -p1 --forward --force --no-backup-if-mismatch;
  patching file cc/trees/layer_tree_settings.cc
  Hunk #1 FAILED at 51.
  1 out of 1 hunk FAILED -- saving rejects to file cc/trees/layer_tree_settings.cc.rej

Patch: cc/trees/layer_tree_settings.cc
Index: cc/trees/layer_tree_settings.cc
diff --git a/cc/trees/layer_tree_settings.cc b/cc/trees/layer_tree_settings.cc
index 0ace95a6868ec18abe3501083c79cdc39dd0cca0..7eaf431e8229da7e9d8b8087fc39262aff1171de 100644
--- a/cc/trees/layer_tree_settings.cc
+++ b/cc/trees/layer_tree_settings.cc
@@ -51,14 +51,14 @@ LayerTreeSettings::LayerTreeSettings()
       // At 256x256 tiles, 128 tiles cover an area of 2048x4096 pixels.
       max_tiles_for_interest_area(128),
       max_unused_resource_memory_percentage(100),
+      max_memory_for_prepaint_percentage(100),
       highp_threshold_min(0),
       strict_layer_property_change_checking(false),
       use_map_image(false),
       ignore_root_layer_flings(false),
       use_rgba_4444_textures(false),
       touch_hit_testing(true),
-      texture_id_allocation_chunk_size(64) {
-}
+      texture_id_allocation_chunk_size(64) {}

 LayerTreeSettings::~LayerTreeSettings() {}
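Judging only by the name of the new setting in the diff above (max_memory_for_prepaint_percentage), the soft (prepaint) limit is presumably derived from the single limit as a percentage. This is an assumption based on the setting's name; the actual use of the setting lives in TileManager, and the helper below is purely illustrative.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mapping from the single limit to a soft (prepaint)
// limit via a percentage setting, as the setting's name suggests.
int64_t SoftLimitBytes(int64_t hard_limit_bytes,
                       int64_t prepaint_percentage) {
  return hard_limit_bytes * prepaint_percentage / 100;
}
```

With the default of 100, the soft limit equals the hard limit, matching the patch's stated plan to land with both limits the same.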
The CQ bit was checked by epenner@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/epenner@chromium.org/140673009/880001
The CQ bit was unchecked by commit-bot@chromium.org
Retried try job too often on win_rel for step(s) browser_tests http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=win_rel&nu...
The CQ bit was checked by epenner@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/epenner@chromium.org/140673009/880001
Message was sent while issue was closed.
Change committed as 251180