Chromium Code Reviews

Created: 5 years, 8 months ago by haraken
Modified: 5 years, 8 months ago
CC: blink-reviews, oilpan-reviews, Mads Ager (chromium), eae+blinkwatch, blink-reviews-dom_chromium.org, dglazkov+blink, kouhei+heap_chromium.org, rwlbuis, Hannes Payer (out of office), ulan
Base URL: svn://svn.chromium.org/blink/trunk
Target Ref: refs/heads/master
Project: blink
Visibility: Public
Description

Oilpan: Consider PartitionAlloc's memory usage when triggering a GC
Oilpan builds sometimes show increased peak memory usage because Oilpan does not account for memory allocated by embedder allocators (e.g., PartitionAlloc) and thus fails to trigger a GC when one is needed.
This CL fixes the issue by taking PartitionAlloc's memory usage into account when deciding whether to trigger an Oilpan GC. See the design document (https://docs.google.com/document/d/1QMDr1YDEXdvZXoOC7RuHH6VRU7ot3SpArPlHokoBVPw/edit) for more details.
BUG=474470
Committed: https://src.chromium.org/viewvc/blink?view=rev&revision=194282
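The scheduling change can be sketched roughly as follows. This is a minimal, hypothetical sketch: the struct, function name, and growth factor are illustrative and are not the actual ThreadState code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch of the idea in this CL: the GC trigger considers
// Oilpan's own allocation *plus* external (PartitionAlloc) memory,
// instead of looking at the Oilpan heap alone. All names and the
// growth factor below are illustrative.
struct HeapStats {
    size_t oilpanAllocated;         // bytes allocated on the Oilpan heap since the last GC
    size_t oilpanMarked;            // bytes that survived the last GC (estimated live size)
    size_t partitionAllocCommitted; // committed PartitionAlloc pages, in bytes
};

// Schedule a GC once total usage (including external memory) exceeds a
// multiple of the estimated live size.
bool shouldScheduleGC(const HeapStats& s, double growthFactor)
{
    size_t estimatedLive = s.oilpanMarked;
    size_t current = s.oilpanAllocated + s.oilpanMarked + s.partitionAllocCommitted;
    return current > static_cast<size_t>(estimatedLive * growthFactor);
}
```

With this shape, a page that allocates heavily in PartitionAlloc (e.g., many Strings held by DOM nodes) raises `current` and can trigger a GC even if the Oilpan heap itself has grown little.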
Patch Set 1 #
Patch Set 2 : #
Total comments: 3
Patch Set 3 : #
Patch Set 4 : #
Patch Set 5 : #
Total comments: 1

Messages
Total messages: 44 (4 generated)
haraken@chromium.org changed reviewers: + oilpan-reviews@chromium.org, sigbjornf@opera.com
I verified that GCs are properly triggered in textarea-dom.html. I'll collect full performance numbers of blink_perf overnight. yuta-san: Would you apply this CL to ToT and verify that the peak memory increase in Google+ is gone? Sigbjorn: This CL is mostly ready for review. I'd be happy if you could take a look when you have a cycle.
Isn't this the CL&approach that was considered ~ a month ago, and decided against as it led to regressions?
On 2015/04/07 18:20:31, sof wrote: > Isn't this the CL&approach that was considered ~ a month ago, and decided > against as it led to regressions? The original CL was doing too many things and it was not clear where the regressions were introduced. (I failed to get performance data last night; let me give it another try tonight.) We're looking at memory consumption of memory.top_7_stress and observing cases where peak memory increases because Oilpan doesn't trigger a GC when a lot of objects are allocated in PartitionAlloc. We need to fix the problem somehow.
https://codereview.chromium.org/1063083002/diff/20001/Source/platform/heap/Th... File Source/platform/heap/ThreadState.cpp (right): https://codereview.chromium.org/1063083002/diff/20001/Source/platform/heap/Th... Source/platform/heap/ThreadState.cpp:863: Heap::setEstimatedLiveObjectSize(Heap::markedObjectSize() + Heap::externalObjectSizeAtLastGC()); Hmm, this feels like arbitrary measures to combine together. I think it would be better if we kept the different components of a GC scheduling policy apart, i.e., external allocation amounts (by way of a global PartitionAlloc measurement, or targeted registration of external memory allocations, as added recently on trunk) are considered as a triggering component separately from the rest(*). Either because they are above some set threshold, or because their rate of growth suggests a GC might be in order and the danger of not taking a possibly imprecise conservative GC is greater than the danger of taking one. And, if so, effectively signal an urgent GC (which we can keep as a mechanism? methinks of possible memory pressure listener uses of it.) * - or you can make them conditional on one another to filter out 'pointless' GCs.
On 2015/04/10 07:21:09, sof wrote: > https://codereview.chromium.org/1063083002/diff/20001/Source/platform/heap/Th... > File Source/platform/heap/ThreadState.cpp (right): > > https://codereview.chromium.org/1063083002/diff/20001/Source/platform/heap/Th... > Source/platform/heap/ThreadState.cpp:863: > Heap::setEstimatedLiveObjectSize(Heap::markedObjectSize() + > Heap::externalObjectSizeAtLastGC()); > Hmm, this feels like arbitrary measures to combine together. > > I think it would be better if we kept the different components of a GC > scheduling policy apart, i.e., external allocation amounts (by way of a global > PartitionAlloc measurement, or targeted registration of external memory > allocations added like now on trunk) is considered as a triggering component > separately from the rest(*). Either because it is above some set threshold or > that the derivative suggests a GC might be in order & the dangers of not taking > a possibly imprecise conservative GC is greater than not doing so. And, if so, > effectively signal an urgent GC (which we can keep as a mechanism? methinks of > possible memory pressure listener uses of it.) > > * - or you can make them conditional on one another to filter out 'pointless' > GCs. Thanks for review. BTW, yutak@ confirmed that this CL mostly solves the peak memory increase in memory.top_7_stress. I'll collect performance numbers tonight.
This is a performance result: https://docs.google.com/spreadsheets/d/1qFZ5gca9okdtI1ndhw-v-tbU6sU9gmjY35fIl... It's sad that the result is flaky (again)... I manually verified that the only suspicious regressions are dromaeo.domcoremodify.*. As far as I see http://haraken.info/a/results.html, it seems that dromaeo.domcoremodify.createElement regresses a bit but dromaeo.domcoremodify.cloneNode improves a bit. In summary, it seems this CL will cause no performance change. On 2015/04/10 07:21:09, sof wrote: > https://codereview.chromium.org/1063083002/diff/20001/Source/platform/heap/Th... > File Source/platform/heap/ThreadState.cpp (right): > > https://codereview.chromium.org/1063083002/diff/20001/Source/platform/heap/Th... > Source/platform/heap/ThreadState.cpp:863: > Heap::setEstimatedLiveObjectSize(Heap::markedObjectSize() + > Heap::externalObjectSizeAtLastGC()); > Hmm, this feels like arbitrary measures to combine together. > > I think it would be better if we kept the different components of a GC > scheduling policy apart, i.e., external allocation amounts (by way of a global > PartitionAlloc measurement, or targeted registration of external memory > allocations added like now on trunk) is considered as a triggering component > separately from the rest(*). Either because it is above some set threshold or > that the derivative suggests a GC might be in order & the dangers of not taking > a possibly imprecise conservative GC is greater than not doing so. And, if so, > effectively signal an urgent GC (which we can keep as a mechanism? methinks of > possible memory pressure listener uses of it.) > > * - or you can make them conditional on one another to filter out 'pointless' > GCs. My gut feeling is that it wouldn't be realistic to perfectly report the external memory to Oilpan's GC. 
Of course, it doesn't need to be perfect, but we at least want to avoid whack-a-mole fixes, e.g., adding the report calls only to TextNode and to whatever else turns up as a problem in the future. On the other hand, I agree with you that just combining the two different memory components wouldn't be an ideal fix either. One thing we could do is observe the increase in Partitions::totalCommittedPages() and use the diff to determine the GC timing. What do you think?
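The diff-based idea could look roughly like this. This is a hypothetical sketch: the class and threshold are made up for illustration, and the `totalCommittedPages()`-style value is simply sampled at GC time.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch of the diff-based heuristic: rather than summing
// absolute sizes, record PartitionAlloc's committed size at each GC and
// treat a large increase since then as extra GC pressure.
class ExternalMemoryWatcher {
public:
    // Called when a GC finishes, with PartitionAlloc's committed size.
    void didFinishGC(size_t committedNow) { m_committedAtLastGC = committedNow; }

    // True if external memory grew by more than `threshold` bytes since
    // the last GC. The threshold value is illustrative.
    bool externalPressure(size_t committedNow, size_t threshold) const
    {
        if (committedNow <= m_committedAtLastGC)
            return false;
        return committedNow - m_committedAtLastGC > threshold;
    }

private:
    size_t m_committedAtLastGC = 0;
};
```

The advantage of the diff is that a large but stable external heap (e.g., cached resources) does not keep forcing GCs; only fresh external growth counts as pressure.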
On 2015/04/13 06:06:02, haraken wrote: > This is a performance result: > https://docs.google.com/spreadsheets/d/1qFZ5gca9okdtI1ndhw-v-tbU6sU9gmjY35fIl... > > It's sad that the result is flaky (again)... I manually verified that the only > suspicious regressions are dromaeo.domcoremodify.*. As far as I see > http://haraken.info/a/results.html, it seems that > dromaeo.domcoremodify.createElement regresses a bit but > dromaeo.domcoremodify.cloneNode improves a bit. In summary, it seems this CL > will cause no performance change. > That seems like a problematic test for this CL; when locally testing this CL on Windows (32-bit), it OOMs.
On 2015/04/13 13:16:21, sof wrote: > On 2015/04/13 06:06:02, haraken wrote: > > This is a performance result: > > > https://docs.google.com/spreadsheets/d/1qFZ5gca9okdtI1ndhw-v-tbU6sU9gmjY35fIl... > > > > It's sad that the result is flaky (again)... I manually verified that the only > > suspicious regressions are dromaeo.domcoremodify.*. As far as I see > > http://haraken.info/a/results.html, it seems that > > dromaeo.domcoremodify.createElement regresses a bit but > > dromaeo.domcoremodify.cloneNode improves a bit. In summary, it seems this CL > > will cause no performance change. > > > > That seems like a problematic test for this CL; when locally testing this CL on > Windows (32-bit), it OOMs. Thanks, that is good to know... (In theory, this CL is going to solve that problem, but some heuristics would be going wrong. Let me check it.)
https://codereview.chromium.org/1063083002/diff/20001/Source/platform/heap/Th... File Source/platform/heap/ThreadState.cpp (left): https://codereview.chromium.org/1063083002/diff/20001/Source/platform/heap/Th... Source/platform/heap/ThreadState.cpp:559: if (Heap::markedObjectSize() + allocatedObjectSize >= 300 * 1024 * 1024) { If you change this to

size_t currentObjectSize = allocatedObjectSize + Heap::markedObjectSize() + WTF::Partitions::totalSizeOfCommittedPages();
if (currentObjectSize >= 300 * 1024 * 1024) {
    return allocatedObjectSize > estimatedLiveObjectSize / 2;
}

and you do not consider totalSizeOfCommittedPages() anywhere else, dom-modify (or textarea-dom) no longer OOMs on win32. See https://codereview.chromium.org/1085023002/
bratell@opera.com changed reviewers: + bratell@opera.com
I don't know the code well enough to determine the effect of this, but as I see it this is tricky and duplicates research and effort done by the V8 people. Adding a few good ones on CC in case they have any input (see the Google Doc for an explanation of what the patch is trying to do). https://codereview.chromium.org/1063083002/diff/20001/Source/platform/heap/Th... File Source/platform/heap/ThreadState.cpp (right): https://codereview.chromium.org/1063083002/diff/20001/Source/platform/heap/Th... Source/platform/heap/ThreadState.cpp:580: fprintf(stderr, "conservative GC: allocatedObjectSize=%ld estimatedLiveObjectSize=%ld markedObjectSize=%ld totalSizeOfCommittedPages=%ld currentObjectSize=%ld\n", Heap::allocatedObjectSize() / 1024 / 1024, Heap::estimatedLiveObjectSize() / 1024 / 1024, Heap::markedObjectSize() / 1024 / 1024, WTF::Partitions::totalSizeOfCommittedPages() / 1024 / 1024, currentObjectSize / 1024 / 1024); The fprintf needs to be removed before this lands.
hpayer@chromium.org changed reviewers: + hpayer@chromium.org
Considering both spaces as one space is fine from a heuristics perspective. However, there is one thing in the design doc that concerns me: You mention that you use a heap growing factor of up to 5. That gives you for example 500M from a 100M live set. Without compaction, that is problematic. Any observed OOM issues with that?
On 2015/04/17 13:47:07, Hannes Payer wrote: > Considering both spaces as one space is fine from a heuristics perspective. > However, there is one thing in the > design doc that concerns me: You mention that you use a heap growing factor of > up to 5. That gives you for example 500M from a 100M live set. Without > compaction, that is problematic. Any observed OOM issues with that? If the memory consumption exceeds 300 MB, the heap growing factor goes down to 0.5. However, I agree that a factor of 5 would be a bit too large. I'll tune the value in a follow-up. (This CL does not intend to change any thresholds; it only starts considering the memory consumption of PartitionAlloc.) I'll address Sigbjorn's comment next week.
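To make the growing-factor arithmetic above concrete, here is an illustrative sketch (not the actual predicate in ThreadState.cpp):

```cpp
#include <cassert>
#include <cstddef>

// With growing factor f, a GC is only triggered once the heap reaches
// roughly f times the live set measured at the last GC. A factor of 5
// on a 100 MB live set therefore allows the heap to reach ~500 MB,
// which is dangerous without compaction; a factor of 1.5 caps it at
// ~150 MB.
size_t heapLimitBytes(size_t liveSetBytes, double growingFactor)
{
    return static_cast<size_t>(liveSetBytes * growingFactor);
}
```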
On 2015/04/14 20:13:19, sof wrote: > https://codereview.chromium.org/1063083002/diff/20001/Source/platform/heap/Th... > File Source/platform/heap/ThreadState.cpp (left): > > https://codereview.chromium.org/1063083002/diff/20001/Source/platform/heap/Th... > Source/platform/heap/ThreadState.cpp:559: if (Heap::markedObjectSize() + > allocatedObjectSize >= 300 * 1024 * 1024) { > If you change this to > > size_t currentObjectSize = allocatedObjectSize + Heap::markedObjectSize() + > WTF::Partitions::totalSizeOfCommittedPages(); > if (currentObjectSize >= 300 * 1024 * 1024) { > return allocatedObjectSize > estimatedLiveObjectSize / 2; > } > > and you do not consider totalSizeOfCommittedPages() anywhere else, dom-modify > (or textarea-dom) no longer OOMs on win32. See > https://codereview.chromium.org/1085023002/ Changing |Heap::markedObjectSize() + allocatedObjectSize| to |currentObjectSize| makes sense. Wouldn't that change alone be enough to avoid the OOMs on Windows? That said, it is sad that we're hitting the 300 MB branch. That branch is the last resort; a GC is expected to happen before we reach it. As Hannes pointed out, maybe we should lower the 500% heap growing factor.
On 2015/04/19 01:02:53, haraken wrote: > On 2015/04/14 20:13:19, sof wrote: > > > https://codereview.chromium.org/1063083002/diff/20001/Source/platform/heap/Th... > > File Source/platform/heap/ThreadState.cpp (left): > > > > > https://codereview.chromium.org/1063083002/diff/20001/Source/platform/heap/Th... > > Source/platform/heap/ThreadState.cpp:559: if (Heap::markedObjectSize() + > > allocatedObjectSize >= 300 * 1024 * 1024) { > > If you change this to > > > > size_t currentObjectSize = allocatedObjectSize + Heap::markedObjectSize() + > > WTF::Partitions::totalSizeOfCommittedPages(); > > if (currentObjectSize >= 300 * 1024 * 1024) { > > return allocatedObjectSize > estimatedLiveObjectSize / 2; > > } > > > > and you do not consider totalSizeOfCommittedPages() anywhere else, dom-modify > > (or textarea-dom) no longer OOMs on win32. See > > https://codereview.chromium.org/1085023002/ > > Changing the |Heap::markedObjectSize() + allocatedObjectSize| to > |currentObjectSize| makes sense. Wouldn't the change be enough to avoid the OOMs > in win? > > That said, it is sad that we're hitting the 300 MB branch. The branch is the > last resort and a GC is expected to happen before hitting the branch. As harness > pointed out, maybe we should lower the heap growing factor of 500%. This is a perf result for Patch Set 3: https://docs.google.com/spreadsheets/d/1qFZ5gca9okdtI1ndhw-v-tbU6sU9gmjY35fIl... The patch set 3 changed |Heap::markedObjectSize() + allocatedObjectSize| to |currentObjectSize|, and 500% to 400%. Again, the result is flaky (dom_modify_createElement regresses but dom_modify_createNode improves) and it's hard to say if this is an improvement or regression, but at least I can say this doesn't cause a clear regression.
On 2015/04/19 08:36:10, haraken wrote: > On 2015/04/19 01:02:53, haraken wrote: > > On 2015/04/14 20:13:19, sof wrote: > > > > > > https://codereview.chromium.org/1063083002/diff/20001/Source/platform/heap/Th... > > > File Source/platform/heap/ThreadState.cpp (left): > > > > > > > > > https://codereview.chromium.org/1063083002/diff/20001/Source/platform/heap/Th... > > > Source/platform/heap/ThreadState.cpp:559: if (Heap::markedObjectSize() + > > > allocatedObjectSize >= 300 * 1024 * 1024) { > > > If you change this to > > > > > > size_t currentObjectSize = allocatedObjectSize + Heap::markedObjectSize() > + > > > WTF::Partitions::totalSizeOfCommittedPages(); > > > if (currentObjectSize >= 300 * 1024 * 1024) { > > > return allocatedObjectSize > estimatedLiveObjectSize / 2; > > > } > > > > > > and you do not consider totalSizeOfCommittedPages() anywhere else, > dom-modify > > > (or textarea-dom) no longer OOMs on win32. See > > > https://codereview.chromium.org/1085023002/ > > > > Changing the |Heap::markedObjectSize() + allocatedObjectSize| to > > |currentObjectSize| makes sense. Wouldn't the change be enough to avoid the > OOMs > > in win? > > > > That said, it is sad that we're hitting the 300 MB branch. The branch is the > > last resort and a GC is expected to happen before hitting the branch. As > harness > > pointed out, maybe we should lower the heap growing factor of 500%. > > This is a perf result for Patch Set 3: > https://docs.google.com/spreadsheets/d/1qFZ5gca9okdtI1ndhw-v-tbU6sU9gmjY35fIl... > > The patch set 3 changed |Heap::markedObjectSize() + allocatedObjectSize| to > |currentObjectSize|, and 500% to 400%. > > Again, the result is flaky (dom_modify_createElement regresses but > dom_modify_createNode improves) and it's hard to say if this is an improvement > or regression, but at least I can say this doesn't cause a clear regression. 
For ps#3, if I just peg down the limit from 500% to 400% (i.e., do not consider the global allocation pressure for the > 300Mb bailout check), it still OOMs on win32 for the known problematic perf tests.
I would assume that the OOM comes from the fact that Oilpan grows the heap way too aggressively. Other GC systems have a maximum heap growing factor of 1.5. Bigger factors give you performance (since no GCs happen) but may be very dangerous in the long run (OOM). I don't think you have to take external memory into account when you stay closer to the live memory in Oilpan. And if things go really wrong, force an emergency GC, though that may not help too much in Oilpan since objects are non-moving and may occupy a lot of fragmented space. Which again suggests not growing the heap too aggressively.
On 2015/04/19 23:43:03, Hannes Payer wrote: > I would assume that the OOM comes from the fact that Oilpan grows the heap way > to aggressively. Other GC-systems have a maximum heap growing factor of 1.5. > Bigger factors give you performance (since no GCs happen) but may be very > dangerous in the long run (OOM). > > I don't think you have to take external memory into account, when you stay > closer to the live memory in Oilpan. And if thing go really wrong, force an > emergency GC, but that may not help too much in Oilpan since objects are > non-moving and may occupy a lot of fragmented space. Which again suggests to not > grow the heap too aggressively. Hannes's point makes sense. I changed the 500% to 200% in patch set 4. Interestingly, I observe no regression in the LargeDistribution and EventDispatching* benchmarks. It seems that what matters in these benchmarks is the >32 MB threshold, not the percentage. Sigbjorn: Would you mind checking whether patch set 4 hits OOM on Windows? I'll collect full performance numbers of blink_perf tonight.
On 2015/04/20 12:31:04, haraken wrote: > On 2015/04/19 23:43:03, Hannes Payer wrote: > > I would assume that the OOM comes from the fact that Oilpan grows the heap way > > to aggressively. Other GC-systems have a maximum heap growing factor of 1.5. > > Bigger factors give you performance (since no GCs happen) but may be very > > dangerous in the long run (OOM). > > > > I don't think you have to take external memory into account, when you stay > > closer to the live memory in Oilpan. And if thing go really wrong, force an > > emergency GC, but that may not help too much in Oilpan since objects are > > non-moving and may occupy a lot of fragmented space. Which again suggests to > not > > grow the heap too aggressively. > > Harness's point makes sense. > > I changed the 500% to 200% in the patch set 4. Interestingly, I observe no > regression in LargeDistribution and EventDispatching* benchmarks. It seems that > what matters in these benchmarks are the >32 MB threshold, not the %. > > Sigbjorn: Would you mind checking if the patch set 4 hits OOM in win? I'll > collect full performance numbers of blink_perf tonight. Same behavior as in #18 (ps#4 doesn't OOM, but OOMs if the 300Mb check is adjusted like there.) However, this doesn't strike me as a good idea -- if we need to perform GCs more often, that should really be handled as idle/precise GCs, not as conservative & abrupt GCs _unless we really have to_. So I don't understand the underlying reasoning here.
On 2015/04/20 13:02:03, sof wrote: > On 2015/04/20 12:31:04, haraken wrote: > > On 2015/04/19 23:43:03, Hannes Payer wrote: > > > I would assume that the OOM comes from the fact that Oilpan grows the heap > way > > > to aggressively. Other GC-systems have a maximum heap growing factor of 1.5. > > > Bigger factors give you performance (since no GCs happen) but may be very > > > dangerous in the long run (OOM). > > > > > > I don't think you have to take external memory into account, when you stay > > > closer to the live memory in Oilpan. And if thing go really wrong, force an > > > emergency GC, but that may not help too much in Oilpan since objects are > > > non-moving and may occupy a lot of fragmented space. Which again suggests to > > not > > > grow the heap too aggressively. > > > > Harness's point makes sense. > > > > I changed the 500% to 200% in the patch set 4. Interestingly, I observe no > > regression in LargeDistribution and EventDispatching* benchmarks. It seems > that > > what matters in these benchmarks are the >32 MB threshold, not the %. > > > > Sigbjorn: Would you mind checking if the patch set 4 hits OOM in win? I'll > > collect full performance numbers of blink_perf tonight. > > Same behavior as in #18 (ps#4 doesn't OOM, but OOMs if the 300Mb check is adjust > like there.) > > However, this doesn't strike me as a good idea -- if we need to perform GCs more > often, that should really be handled as idle/precise GCs not as conservative & > abrupt GCs _unless we really have to_. So I don't understand the underlying > reasoning here. Thanks, then how about: - Consider Partitions::totalCommittedSize in a triggering condition of an idle GC. - Consider Partitions::totalCommittedSize in a triggering condition of a precise GC. - Consider Partitions::totalCommittedSize in the 300 MB check in a conservative GC. - Not consider Partitions::totalCommittedSize in a triggering condition of a conservative GC. ? 
(The original problem we need to solve is the peak memory increase in memory.top_7_stress. I guess considering Partitions::totalCommittedSize in idle GCs would be enough to solve the problem.)
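The proposed split could be sketched like this. The function names and growth multipliers are hypothetical; only the structure (external memory feeding the idle/precise triggers and the 300 MB last-resort check, but not the ordinary conservative trigger) reflects the proposal, and only the 300 MB limit comes from the CL.

```cpp
#include <cassert>
#include <cstddef>

// Cheap, non-disruptive triggers count external (PartitionAlloc) memory.
bool shouldScheduleIdleGC(size_t oilpanSize, size_t externalSize, size_t estimatedLive)
{
    return oilpanSize + externalSize > estimatedLive * 2; // external memory counted
}

bool shouldSchedulePreciseGC(size_t oilpanSize, size_t externalSize, size_t estimatedLive)
{
    return oilpanSize + externalSize > estimatedLive * 3; // external memory counted
}

bool shouldForceConservativeGC(size_t oilpanSize, size_t externalSize, size_t estimatedLive)
{
    // The ordinary conservative trigger ignores external memory...
    if (oilpanSize > estimatedLive * 4)
        return true;
    // ...but the 300 MB last-resort check includes it.
    return oilpanSize + externalSize >= static_cast<size_t>(300) * 1024 * 1024;
}
```

The ordering matters: external allocation pressure is first given a chance to be absorbed by idle or precise GCs, and an abrupt conservative GC is reserved for the case where total memory is about to become dangerous.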
On 2015/04/20 13:47:12, haraken wrote: > On 2015/04/20 13:02:03, sof wrote: > > On 2015/04/20 12:31:04, haraken wrote: > > > On 2015/04/19 23:43:03, Hannes Payer wrote: > > > > I would assume that the OOM comes from the fact that Oilpan grows the heap > > way > > > > to aggressively. Other GC-systems have a maximum heap growing factor of > 1.5. > > > > Bigger factors give you performance (since no GCs happen) but may be very > > > > dangerous in the long run (OOM). > > > > > > > > I don't think you have to take external memory into account, when you stay > > > > closer to the live memory in Oilpan. And if thing go really wrong, force > an > > > > emergency GC, but that may not help too much in Oilpan since objects are > > > > non-moving and may occupy a lot of fragmented space. Which again suggests > to > > > not > > > > grow the heap too aggressively. > > > > > > Harness's point makes sense. > > > > > > I changed the 500% to 200% in the patch set 4. Interestingly, I observe no > > > regression in LargeDistribution and EventDispatching* benchmarks. It seems > > that > > > what matters in these benchmarks are the >32 MB threshold, not the %. > > > > > > Sigbjorn: Would you mind checking if the patch set 4 hits OOM in win? I'll > > > collect full performance numbers of blink_perf tonight. > > > > Same behavior as in #18 (ps#4 doesn't OOM, but OOMs if the 300Mb check is > adjust > > like there.) > > > > However, this doesn't strike me as a good idea -- if we need to perform GCs > more > > often, that should really be handled as idle/precise GCs not as conservative & > > abrupt GCs _unless we really have to_. So I don't understand the underlying > > reasoning here. > > Thanks, then how about: > > - Consider Partitions::totalCommittedSize in a triggering condition of an idle > GC. > - Consider Partitions::totalCommittedSize in a triggering condition of a precise > GC. > - Consider Partitions::totalCommittedSize in the 300 MB check in a conservative > GC. 
> - Not consider Partitions::totalCommittedSize in a triggering condition of a > conservative GC. > > ? > > (The original problem we need to solve is the peak memory increase in > memory.top_7_stress. I guess considering Partitions::totalCommittedSize in idle > GCs would be enough to solve the problem.) Good, that clarifies. Let's try to take that tack -- GCs due to external allocation buildup are preferably done as idle/precise GCs, so try tuning down the limits for those predicates (first.) For an allocation hog like the dom-modify microbenchmark, idle GCs won't get to run, so I think we're stuck with conservative GCs to handle it.
On 2015/04/20 13:02:03, sof wrote: > On 2015/04/20 12:31:04, haraken wrote: > > On 2015/04/19 23:43:03, Hannes Payer wrote: > > > I would assume that the OOM comes from the fact that Oilpan grows the heap > way > > > to aggressively. Other GC-systems have a maximum heap growing factor of 1.5. > > > Bigger factors give you performance (since no GCs happen) but may be very > > > dangerous in the long run (OOM). > > > > > > I don't think you have to take external memory into account, when you stay > > > closer to the live memory in Oilpan. And if thing go really wrong, force an > > > emergency GC, but that may not help too much in Oilpan since objects are > > > non-moving and may occupy a lot of fragmented space. Which again suggests to > > not > > > grow the heap too aggressively. > > > > Harness's point makes sense. > > > > I changed the 500% to 200% in the patch set 4. Interestingly, I observe no > > regression in LargeDistribution and EventDispatching* benchmarks. It seems > that > > what matters in these benchmarks are the >32 MB threshold, not the %. > > > > Sigbjorn: Would you mind checking if the patch set 4 hits OOM in win? I'll > > collect full performance numbers of blink_perf tonight. > > Same behavior as in #18 (ps#4 doesn't OOM, but OOMs if the 300Mb check is adjust > like there.) Sorry, one more question here. Do you mean that if we remove the 300 MB check from the PS4, it hits OOM? What the PS4 is doing is: a) Trigger a conservative GC on a 100% increase if <300 MB. b) Trigger a conservative GC on a 50% increase if >300 MB. It seems a bit strange to me that we don't hit OOM with a) and b) but hit OOM with only a).
On 2015/04/20 14:13:52, haraken wrote: > On 2015/04/20 13:02:03, sof wrote: > > On 2015/04/20 12:31:04, haraken wrote: > > > On 2015/04/19 23:43:03, Hannes Payer wrote: > > > > I would assume that the OOM comes from the fact that Oilpan grows the heap > > way > > > > to aggressively. Other GC-systems have a maximum heap growing factor of > 1.5. > > > > Bigger factors give you performance (since no GCs happen) but may be very > > > > dangerous in the long run (OOM). > > > > > > > > I don't think you have to take external memory into account, when you stay > > > > closer to the live memory in Oilpan. And if thing go really wrong, force > an > > > > emergency GC, but that may not help too much in Oilpan since objects are > > > > non-moving and may occupy a lot of fragmented space. Which again suggests > to > > > not > > > > grow the heap too aggressively. > > > > > > Harness's point makes sense. > > > > > > I changed the 500% to 200% in the patch set 4. Interestingly, I observe no > > > regression in LargeDistribution and EventDispatching* benchmarks. It seems > > that > > > what matters in these benchmarks are the >32 MB threshold, not the %. > > > > > > Sigbjorn: Would you mind checking if the patch set 4 hits OOM in win? I'll > > > collect full performance numbers of blink_perf tonight. > > > > Same behavior as in #18 (ps#4 doesn't OOM, but OOMs if the 300Mb check is > adjust > > like there.) > > Sorry, one more question here. Do you mean that if we remove the 300 MB check > from the PS4, it hits OOM? > > What the PS4 is doing is: > > a) Trigger a conservative GC on a 100% increase if <300 MB. > b) Trigger a conservative GC on a 50% increase if >300 MB. > > It seems a bit strange to me that we don't hit OOM with a) and b) but hit OOM > with only a). That would be confusing if it did OOM with just a), as it doesn't. If I revert the "300Mb check" back to what it was in ps#2, it OOMs on dom-modify, regardless of the threshold used for a). 
i.e., considering totalSizeOfCommittedPages() matters for b).
On 2015/04/20 14:35:45, sof wrote: > On 2015/04/20 14:13:52, haraken wrote: > > On 2015/04/20 13:02:03, sof wrote: > > > On 2015/04/20 12:31:04, haraken wrote: > > > > On 2015/04/19 23:43:03, Hannes Payer wrote: > > > > > I would assume that the OOM comes from the fact that Oilpan grows the > heap > > > way > > > > > to aggressively. Other GC-systems have a maximum heap growing factor of > > 1.5. > > > > > Bigger factors give you performance (since no GCs happen) but may be > very > > > > > dangerous in the long run (OOM). > > > > > > > > > > I don't think you have to take external memory into account, when you > stay > > > > > closer to the live memory in Oilpan. And if thing go really wrong, force > > an > > > > > emergency GC, but that may not help too much in Oilpan since objects are > > > > > non-moving and may occupy a lot of fragmented space. Which again > suggests > > to > > > > not > > > > > grow the heap too aggressively. > > > > > > > > Harness's point makes sense. > > > > > > > > I changed the 500% to 200% in the patch set 4. Interestingly, I observe no > > > > regression in LargeDistribution and EventDispatching* benchmarks. It seems > > > that > > > > what matters in these benchmarks are the >32 MB threshold, not the %. > > > > > > > > Sigbjorn: Would you mind checking if the patch set 4 hits OOM in win? I'll > > > > collect full performance numbers of blink_perf tonight. > > > > > > Same behavior as in #18 (ps#4 doesn't OOM, but OOMs if the 300Mb check is > > adjust > > > like there.) > > > > Sorry, one more question here. Do you mean that if we remove the 300 MB check > > from the PS4, it hits OOM? > > > > What the PS4 is doing is: > > > > a) Trigger a conservative GC on a 100% increase if <300 MB. > > b) Trigger a conservative GC on a 50% increase if >300 MB. > > > > It seems a bit strange to me that we don't hit OOM with a) and b) but hit OOM > > with only a). > > That would be confusing if it did OOM with just a), as it doesn't. 
> > If I revert the "300Mb check" back to what it was in ps#2, it OOMs on > dom-modify, regardless of the threshold used for a). i.e., considering > totalSizeOfCommittedPages() matters for b). If we revert the 300Mb check to the PS2's one and then hit OOM, then would that mean that we'll hit OOM if we don't do anything in the 300Mb check? In other words, if we have only a), then we'll hit OOM. > That would be confusing if it did OOM with just a), as it doesn't. So I think ^^^ is happening, that's why I'm confused.
On 2015/04/20 14:42:05, haraken wrote:
> If we revert the 300Mb check to the PS2's one and then hit OOM, would that
> mean that we'll hit OOM if we don't do anything in the 300Mb check? In other
> words, if we have only a), then we'll hit OOM.
>
> > That would be confusing if it did OOM with just a), as it doesn't.
>
> So I think ^^^ is happening, that's why I'm confused.

Sorry, shouldn't have brought up that earlier variation :)

(Heap::markedObjectSize() + allocatedObjectSize) does exceed 300Mb for
dom-modify, but if the inner check isn't true, we will OOM. I don't have
access to the sequence of values seen for these various counters leading up to
the OOM hitting, I'm afraid.

Without the b) check, just a), it doesn't OOM, but it runs very close to it.

We've already determined that b) needs to be over currentObjectSize though,
right? In dom-modify's case it will release the Strings that all the Text
nodes hold onto earlier.
On 2015/04/20 15:11:53, sof wrote:
> On 2015/04/20 14:42:05, haraken wrote:
> > On 2015/04/20 14:35:45, sof wrote:
> > > On 2015/04/20 14:13:52, haraken wrote:
> > > > On 2015/04/20 13:02:03, sof wrote:
> > > > > On 2015/04/20 12:31:04, haraken wrote:
> > > > > > On 2015/04/19 23:43:03, Hannes Payer wrote:
> > > > > > > I would assume that the OOM comes from the fact that Oilpan grows
> > > > > > > the heap way too aggressively. Other GC systems have a maximum
> > > > > > > heap growing factor of 1.5. Bigger factors give you performance
> > > > > > > (since no GCs happen) but may be very dangerous in the long run
> > > > > > > (OOM).
> > > > > > >
> > > > > > > I don't think you have to take external memory into account when
> > > > > > > you stay closer to the live memory in Oilpan. And if things go
> > > > > > > really wrong, force an emergency GC, but that may not help too
> > > > > > > much in Oilpan since objects are non-moving and may occupy a lot
> > > > > > > of fragmented space. Which again suggests not growing the heap
> > > > > > > too aggressively.
> > > > > >
> > > > > > Hannes's point makes sense.
> > > > > >
> > > > > > I changed the 500% to 200% in patch set 4. Interestingly, I observe
> > > > > > no regression in the LargeDistribution and EventDispatching*
> > > > > > benchmarks. It seems that what matters in these benchmarks is the
> > > > > > >32 MB threshold, not the %.
> > > > > >
> > > > > > Sigbjorn: Would you mind checking if patch set 4 hits OOM on win?
> > > > > > I'll collect full performance numbers of blink_perf tonight.
> > > > >
> > > > > Same behavior as in #18 (ps#4 doesn't OOM, but OOMs if the 300Mb
> > > > > check is adjusted like there.)
> > > >
> > > > Sorry, one more question here. Do you mean that if we remove the 300 MB
> > > > check from PS4, it hits OOM?
> > > >
> > > > What PS4 is doing is:
> > > >
> > > > a) Trigger a conservative GC on a 100% increase if <300 MB.
> > > > b) Trigger a conservative GC on a 50% increase if >300 MB.
> > > >
> > > > It seems a bit strange to me that we don't hit OOM with a) and b) but
> > > > hit OOM with only a).
> > >
> > > That would be confusing if it did OOM with just a), as it doesn't.
> > >
> > > If I revert the "300Mb check" back to what it was in ps#2, it OOMs on
> > > dom-modify, regardless of the threshold used for a). i.e., considering
> > > totalSizeOfCommittedPages() matters for b).
> >
> > If we revert the 300Mb check to the PS2's one and then hit OOM, would that
> > mean that we'll hit OOM if we don't do anything in the 300Mb check? In
> > other words, if we have only a), then we'll hit OOM.
> >
> > > That would be confusing if it did OOM with just a), as it doesn't.
> >
> > So I think ^^^ is happening, that's why I'm confused.
>
> Sorry, shouldn't have brought up that earlier variation :)
>
> (Heap::markedObjectSize() + allocatedObjectSize) does exceed 300Mb for
> dom-modify, but if the inner check isn't true, we will OOM. I don't have
> access to the sequence of values seen for these various counters leading up
> to the OOM hitting, I'm afraid.
>
> Without the b) check, just a), it doesn't OOM, but it runs very close to it.
>
> We've already determined that b) needs to be over currentObjectSize though,
> right? In dom-modify's case it will release the Strings that all the Text
> nodes hold onto earlier.

Hmm, I don't fully understand what's going on there. It seems that in
dom-modify a GC that happens around 400 MB (300MB < 400MB < 300MB*1.5 <
300MB*2) is mandatory. I wonder why that memory cannot be collected in a
prior GC that happened at 200MB.

Either way, let's discuss with concrete code:

  if (currentObjectSize > 300 MB) { // I think this is fine.
      return currentObjectSize > estimatedLiveObjectSize * 3 / 2; // I think this is fine too.
  }
  return currentObjectSize >= 32 * 1024 * 1024 && currentObjectSize > estimatedLiveObjectSize * 2; // This is in question.

- I personally think that currentObjectSize > estimatedLiveObjectSize * 2 is
reasonable. That said, I'm fine with starting with a more conservative
approach; i.e., as discussed above, not considering totalCommittedSize here.

- However, it's not that trivial to not consider totalCommittedSize. We cannot
use allocatedObjectSize > 4 * estimatedLiveObjectSize, because
estimatedLiveObjectSize is a value considering totalCommittedSize. We cannot
use allocatedObjectSize > 4 * Heap::markedObjectSize(), because
Heap::markedObjectSize() might be close to 0 if shouldForceConservativeGC is
called from a worker thread (if lazy sweeping of the main thread is not yet
complete).
On 2015/04/20 16:02:22, haraken wrote:
> Hmm, I don't fully understand what's going on there. It seems that in
> dom-modify a GC that happens around 400 MB (300MB < 400MB < 300MB*1.5 <
> 300MB*2) is mandatory. I wonder why that memory cannot be collected in a
> prior GC that happened at 200MB.

For dom-modify, it is dependent on the allocations made before the
"createTextNode" stage runs -- if I move it up & first, disabling b) and only
using a) (with 200% or 500% as threshold), it no longer OOMs.

> - I personally think that currentObjectSize > estimatedLiveObjectSize * 2 is
> reasonable. That said, I'm fine with starting with a more conservative
> approach; i.e., as discussed above, not considering totalCommittedSize here.

Either:

- leave the final check as currentObjectSize >= 32 * 1024 * 1024 &&
currentObjectSize > 5 * estimatedLiveObjectSize, and evaluate performance on
the problematic perf test. Everything else as in ps#4.
- use ps#4 (200% as threshold instead of 500%) and run perf tests, but check
that the conservative GC telemetry counts before & after are about the same.

I'm fine with either as a next step.
On 2015/04/21 12:39:17, sof wrote:
> Either:
>
> - leave the final check as currentObjectSize >= 32 * 1024 * 1024 &&
> currentObjectSize > 5 * estimatedLiveObjectSize, and evaluate performance on
> the problematic perf test. Everything else as in ps#4.
> - use ps#4 (200% as threshold instead of 500%) and run perf tests, but check
> that the conservative GC telemetry counts before & after are about the same.
>
> I'm fine with either as a next step.

I collected the performance numbers and found that the 200% approach regresses
create-element.html, dromaeo.dommodify.createElement etc.

https://docs.google.com/spreadsheets/d/1qFZ5gca9okdtI1ndhw-v-tbU6sU9gmjY35fIl...

I'm now measuring performance numbers for the 500% approach. If the numbers are
fine, I'm planning to land the 500% one. I still think that 500% is too big,
but it is not worse than what we're doing today. We can make the threshold
better in a follow-up CL.
On 2015/04/22 12:59:23, haraken wrote:
> > - leave the final check as currentObjectSize >= 32 * 1024 * 1024 &&
> > currentObjectSize > 5 * estimatedLiveObjectSize, and evaluate performance
> > on the problematic perf test. Everything else as in ps#4.

^^^ I did this in PS5 and confirmed that there is no regression in
dromaeo.dommodify.*, create-element, EventsDispatching* and LargeDistribution.

PTAL at PS5.
lgtm. I'll verify win32 status tomorrow morning.
On 2015/04/22 21:39:48, sof wrote:
> lgtm.
>
> I'll verify win32 status tomorrow morning.

No OOMs, but the performance on textarea-edit is quite abysmal. Checking ToT
status for that one.
On 2015/04/23 07:51:26, sof wrote:
> No OOMs, but the performance on textarea-edit is quite abysmal. Checking ToT
> status for that one.

Not specific to this CL (nor Oilpan), so all clear wrt win32.
On 2015/04/23 08:26:43, sof wrote:
> Not specific to this CL (nor Oilpan), so all clear wrt win32.

Thanks for all the verifications; let me land this.
The CQ bit was checked by haraken@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1063083002/80001
Factor 2, cool. How far can you push it down without regressing performance? Something below 1.5 would be nice (and is widely used in other garbage collected systems).
On 2015/04/23 08:35:32, Hannes Payer wrote:
> Factor 2, cool. How far can you push it down without regressing performance?
> Something below 1.5 would be nice (and is widely used in other garbage
> collected systems).

If we're referring to the same piece of logic, please keep in mind that this
is for conservative GCs that will go ahead there & then. For Oilpan, the
preference is for idle/precise GCs once back at the event loop rather than
conservative GCs.
According to Sigbjorn's results, it seems that win32 hits OOM when currentObjectSize (which takes PartitionAlloc's memory usage into account) exceeds 400 MB or so. Are we hitting physical memory OOM or address space OOM? It seems to me that 400 MB is a bit too small to hit OOM in win32 and I'm wondering what's going on there.
Message was sent while issue was closed.
Committed patchset #5 (id:80001) as https://src.chromium.org/viewvc/blink?view=rev&revision=194282
https://codereview.chromium.org/1063083002/diff/80001/Source/platform/heap/Th...
File Source/platform/heap/ThreadState.cpp (right):

https://codereview.chromium.org/1063083002/diff/80001/Source/platform/heap/Th...
Source/platform/heap/ThreadState.cpp:521: return currentObjectSize >= 1024 *
1024 && currentObjectSize > estimatedLiveObjectSize * 3 / 2;

Re: (currentObjectSize >= 1024 * 1024) -- wouldn't this always be true since
PA's allocations are included?

And if so, wouldn't it schedule an idle GC without having allocated anything
on the heap (as estimatedLiveObjectSize is 0)?
On 2015/04/23 15:20:33, sof wrote:
> Re: (currentObjectSize >= 1024 * 1024) -- wouldn't this always be true since
> PA's allocations are included?
>
> And if so, wouldn't it schedule an idle GC without having allocated anything
> on the heap (as estimatedLiveObjectSize is 0)?

Thanks for digging into this.

A fix would be to set some proper initial value on estimatedLiveObjectSize?
On 2015/04/23 15:27:44, haraken wrote:
> Thanks for digging into this.
>
> A fix would be to set some proper initial value on estimatedLiveObjectSize?

May I suggest having the LHS of (&&) use allocatedObjectSize instead (like
before)?