Chromium Code Reviews

Created: 4 years, 10 months ago by haraken
Modified: 4 years, 10 months ago
CC: chromium-reviews, oilpan-reviews, Mads Ager (chromium), blink-reviews, kinuko+watch, kouhei+heap_chromium.org, blink-reviews-wtf_chromium.org, Mikhail
Base URL: https://chromium.googlesource.com/chromium/src.git@master
Target Ref: refs/pending/heads/master
Project: chromium
Visibility: Public
Description

Oilpan: Discard unused system pages when sweeping NormalPageHeaps
This CL makes Oilpan's GC discard empty system pages (i.e., call madvise(MADV_FREE)) in free lists.
This CL fixes regressions in memory.memory_health_plan.
BUG=
Committed: https://crrev.com/6fed8736a88eecf12017be4066db47b7427de8e5
Cr-Commit-Position: refs/heads/master@{#374616}
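The core idea of the CL can be sketched as follows. Only whole system pages fully contained in a free-list gap may be returned to the kernel; partial pages at either end must be kept. All names below are illustrative stand-ins, not Oilpan's real API; the actual CL calls madvise(MADV_FREE) on the page-aligned subrange, and the `beginAddress < endAddress` guard matches the one discussed at HeapPage.cpp:1111:

```cpp
#include <cassert>
#include <cstdint>

// Assumed 4 KiB system page size, as on typical Linux/Android targets.
constexpr uintptr_t kSystemPageSize = 4096;

inline uintptr_t roundUpToSystemPage(uintptr_t address) {
    return (address + kSystemPageSize - 1) & ~(kSystemPageSize - 1);
}

inline uintptr_t roundDownToSystemPage(uintptr_t address) {
    return address & ~(kSystemPageSize - 1);
}

// Returns how many bytes of the free region [begin, end) could be
// handed to madvise(MADV_FREE), or 0 if the region does not cover a
// whole system page.
uintptr_t discardableBytes(uintptr_t begin, uintptr_t end) {
    uintptr_t beginAddress = roundUpToSystemPage(begin);
    uintptr_t endAddress = roundDownToSystemPage(end);
    if (beginAddress < endAddress) {
        // A real implementation would do:
        //   madvise(reinterpret_cast<void*>(beginAddress),
        //           endAddress - beginAddress, MADV_FREE);
        return endAddress - beginAddress;
    }
    return 0;
}
```

Note that a free region smaller than one page, or one straddling a page boundary without containing a full page, yields nothing; this is why the review below focuses on large contiguous gaps.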
Patch Set 1
Patch Set 2 (Total comments: 2)
Patch Set 3
Patch Set 4 (Total comments: 1)
Patch Set 5 (Total comments: 1)
Patch Set 6 (Total comments: 1)
Patch Set 7
Patch Set 8 (Total comments: 1)
Patch Set 9
Patch Set 10
Patch Set 11

Messages
Total messages: 43 (13 generated)
haraken@chromium.org changed reviewers: + yutak@chromium.org
Not for review. yutak@: Would you check if this CL fixes the regression on memory_health_plan?
Description was changed to add detail: the CL makes Oilpan's GC discard empty system pages (i.e., call madvise(MADV_FREE)) in free lists, fixing regressions in memory.memory_health_plan.
haraken@chromium.org changed reviewers: + sigbjornf@opera.com
PTAL.

Sigbjorn: yutak@ confirmed that this CL mostly fixes the regressions in memory.memory_health_plan on low-RAM devices. So I want to land this, but I'm wondering if we should enable discardSystemPages only on low-RAM devices or on all devices (including desktops). Do you have any idea?
On 2016/02/04 09:45:55, haraken wrote: > PTAL > > Sigbjorn: yutak@ confirmed that this CL mostly fixes the regressions in > memory.memory_health_plan on low-RAM devices. So I want to land this but I'm > wondering if we should enable discardSystemPages only on low-RAM devices or on > all devices (including desktops). Do you have any idea? i.e., I'm a bit concerned about its performance impact.
A sign of fragmentation - any particular (sub)heap that has a longer freelist? If there's a way to avoid doing the discard marking for every freelist entry, that'd be preferable.

https://codereview.chromium.org/1666083002/diff/20001/third_party/WebKit/Sour...
File third_party/WebKit/Source/platform/heap/HeapPage.cpp (right):
third_party/WebKit/Source/platform/heap/HeapPage.cpp:1111: if (beginAddress < endAddress)

I understand the concern; for other live objects on that page, their next access will be potentially costly.
https://codereview.chromium.org/1666083002/diff/20001/third_party/WebKit/Sour...
File third_party/WebKit/Source/platform/heap/HeapPage.cpp (right):
third_party/WebKit/Source/platform/heap/HeapPage.cpp:1111: if (beginAddress < endAddress)

On 2016/02/04 10:20:04, sof wrote:
> I understand the concern; for other live objects on that page, their next access
> will be potentially costly.

We could make the condition a little bit tighter, i.e., discard memory only when the unused interval is more than e.g. 32KB.

According to my experiment, the major contributor is large-sized buckets, so I suspect there's enough memory that can be collected just by discarding longer contiguous unused regions.

But that's a kind of trade-off; on low-mem devices we probably should return every unused 4K page, while on desktops we have some more freedom in delaying the disposal. Large contiguous regions should just be prioritized over shorter ones.
On 2016/02/04 10:50:11, Yuta Kitamura wrote:
> [...]
> But that's a kind of trade-off; on low-mem devices we probably should
> return every unused 4K page, while on desktops we have some more
> freedom in delaying the disposal. Large contiguous regions should
> just be prioritized over shorter ones.

From the perspective of performance, I'm not sure how much the choice of threshold (32KB or 4KB) makes a difference. Remember that Oilpan uses free lists with a worst-fit strategy. This means that if you have one 64KB free-list entry and three 4KB free-list entries, the 64KB entry is used first. So what matters for performance is whether the 64KB entry is discarded, not whether the 4KB entries are discarded.

For safety, my proposal here is to enable the discarding only when isLowEndDevice is true.
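The worst-fit behaviour this argument relies on can be sketched with a toy bucketed free list (illustrative names and layout, not Oilpan's real data structures): allocation always draws from the largest non-empty bucket, so the biggest free chunk is reused first, and discarding it is what hurts the next allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Toy worst-fit free list: buckets keyed by chunk size; each bucket
// holds chunk sizes as stand-ins for real free-list entry pointers.
struct WorstFitFreeList {
    std::map<size_t, std::vector<size_t>> buckets;

    void add(size_t size) { buckets[size].push_back(size); }

    // Worst fit: take from the largest bucket that can satisfy the
    // request; return 0 if nothing fits.
    size_t allocate(size_t requested) {
        auto it = buckets.rbegin();  // largest size class first
        if (it == buckets.rend() || it->first < requested)
            return 0;
        size_t chunk = it->second.back();
        it->second.pop_back();
        if (it->second.empty())
            buckets.erase(it->first);
        return chunk;
    }
};
```

With one 64KB entry and three 4KB entries, the first allocation of any small size is served from the 64KB entry, which is why discarding only large regions touches the hot path.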
On 2016/02/04 11:12:49, haraken wrote:
> [...]
> For safety, my proposal here is to enable the discarding only when
> isLowEndDevice is true.

On my local measurements discarding only >32KB chunks has caused a slight regression, so I assume doing only >32KB should be enough to recover the regression.
I defer the decision to you; actually, if isLowEndDevice is true, I think we should drop everything we can, so your proposal sounds fine as a band-aid to fix the metrics.
On 2016/02/05 04:15:40, Yuta Kitamura wrote:
> [...]
> I defer the decision to you; actually, if isLowEndDevice is true,
> I think we should drop everything we can, so your proposal sounds
> fine as a band-aid to fix the metrics.

I tried to measure performance for blink_perf benchmarks, but the CL caused a couple of crashes during the measurement. Maybe the CL is doing something wrong. Will take a look.
Still debugging.

https://codereview.chromium.org/1666083002/diff/60001/third_party/WebKit/Sour...
File third_party/WebKit/Source/platform/heap/HeapPage.cpp (right):
third_party/WebKit/Source/platform/heap/HeapPage.cpp:107: if (gcInfo->hasFinalizer()) {

I crash here with the following log:

header=0x307fed8420b0 gcInfo=0x40bda50 index=279 not hasFinalizer
header=0x307fed8420d8 gcInfo=0x40bda50 index=279 not hasFinalizer
header=0x307fed842100 gcInfo=0x40bda50 index=279 not hasFinalizer
header=0x307fed842128 gcInfo=0x40bda50 index=279 not hasFinalizer
header=0x307fed842150 gcInfo=0x40bda50 index=279 not hasFinalizer
header=0x307fed842188 gcInfo=0x40bda50 index=279 not hasFinalizer
trying to discard 0x307fed841d80 - 0x307fed846b98
discarded 307fed842000 - 307fed846000 size=16384
trying to discard 0x307fed846be8 - 0x307fed846d38
trying to discard 0x307fed846d88 - 0x307fed846fb0
header=0x307fed847000 gcInfo=0x3333333333333333 index=0

This means that the header at 0x307fed847000 has already been cleared out. However, the previous discarding was conducted for [307fed842000, 307fed846000), which doesn't contain 0x307fed847000. It's strange that 0x307fed847000 is cleared out...
https://codereview.chromium.org/1666083002/diff/80001/third_party/WebKit/Sour...
File third_party/WebKit/Source/platform/heap/HeapPage.cpp (right):
third_party/WebKit/Source/platform/heap/HeapPage.cpp:1161: discardPages(startOfGap + sizeof(FreeListEntry), headerAddress);

OK, I found the cause. I need to add '+ sizeof(FreeListEntry)' here. Otherwise, we end up clearing the FreeList header that was just added by the addToFreeList call above.

I think this CL now works correctly. I'll measure the performance again.
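The bug and its fix can be modeled in a few lines (the struct layout and names here are illustrative, not Oilpan's real ones): addToFreeList() writes a live FreeListEntry link at the start of the coalesced gap, so the discardable range must begin sizeof(FreeListEntry) bytes in, or the discard wipes out the link that was just written. Discarding is simulated with a zero-fill standing in for madvise(MADV_FREE):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-in for the inline free-list link node at the start of a gap.
struct FreeListEntry {
    void* next;
    size_t size;
};

// Simulate discarding by zero-filling the range (a real build would
// madvise(MADV_FREE) the page-aligned subrange instead).
void discardRange(char* begin, char* end) {
    memset(begin, 0, static_cast<size_t>(end - begin));
}

// Writes a free-list header at startOfGap, discards the gap either
// with or without the fix, and reports whether the header survived.
bool headerSurvivesDiscard(char* startOfGap, char* endOfGap, bool applyFix) {
    FreeListEntry* entry = reinterpret_cast<FreeListEntry*>(startOfGap);
    entry->next = nullptr;
    entry->size = static_cast<size_t>(endOfGap - startOfGap);

    char* discardStart = applyFix ? startOfGap + sizeof(FreeListEntry)
                                  : startOfGap;  // the buggy version
    discardRange(discardStart, endOfGap);
    return entry->size != 0;  // zeroed size means the link was clobbered
}
```

Without the offset, the zeroed header is exactly the "gcInfo=0x3333333333333333"-style corruption seen in the crash log above: the sweep later reads a link that discarding destroyed.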
On 2016/02/08 08:45:09, haraken wrote:
> [...]
> I think this CL now works correctly. I'll measure the performance again.

I confirmed that discarding pages regresses LargeDistribution.html by 8% on Linux. So I decided to enable the discarding only on low-RAM devices.

yutak@: PTAL.
LGTM

https://codereview.chromium.org/1666083002/diff/100001/third_party/WebKit/Sou...
File third_party/WebKit/Source/platform/heap/HeapPage.cpp (right):
third_party/WebKit/Source/platform/heap/HeapPage.cpp:33: #include "base/sys_info.h"

Don't you need to update DEPS for this new dependency?
The CQ bit was checked by haraken@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from yutak@chromium.org Link to the patchset: https://codereview.chromium.org/1666083002/#ps120001 (title: " ")
> [...]
> third_party/WebKit/Source/platform/heap/HeapPage.cpp:33: #include "base/sys_info.h"
> Don't you need to update DEPS for this new dependency?

Done.
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1666083002/120001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1666083002/120001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: mac_chromium_gn_rel on tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_gn_r...)
The CQ bit was checked by haraken@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from yutak@chromium.org Link to the patchset: https://codereview.chromium.org/1666083002/#ps140001 (title: " ")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1666083002/140001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1666083002/140001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on the following builders:
android_chromium_gn_compile_dbg on tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_chro...)
android_chromium_gn_compile_rel on tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_chro...)
cast_shell_linux on tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/cast_shell_linu...)
linux_chromium_compile_dbg_ng on tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
https://codereview.chromium.org/1666083002/diff/140001/third_party/WebKit/Sour...
File third_party/WebKit/Source/platform/heap/HeapPage.cpp (right):
third_party/WebKit/Source/platform/heap/HeapPage.cpp:1120: bool isLowEndDevice = base::SysInfo::IsLowEndDevice();

Do we have to sample this each time a page is swept?
On 2016/02/08 11:03:25, sof wrote:
> [...]
> third_party/WebKit/Source/platform/heap/HeapPage.cpp:1120: bool isLowEndDevice =
> base::SysInfo::IsLowEndDevice();
> Do we have to sample this each time a page is swept?

Made it a static bool variable in Heap.
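The caching suggested here can be sketched as follows. The CL stores the result in a static member of Heap; a function-local static is used below as a stand-in, and the query function is a hypothetical substitute for base::SysInfo::IsLowEndDevice() that counts its invocations so the once-only behaviour is observable:

```cpp
#include <cassert>

static int g_queryCount = 0;

// Hypothetical stand-in for base::SysInfo::IsLowEndDevice().
bool queryIsLowEndDevice() {
    ++g_queryCount;
    return false;
}

// The query runs once, on first use; every later sweep reads the
// cached value instead of re-sampling system info.
bool cachedIsLowEndDevice() {
    static const bool lowEnd = queryIsLowEndDevice();
    return lowEnd;
}
```

C++11 guarantees the initializer of a function-local static runs exactly once even under concurrent first calls, which makes this safe to read from the sweeper without extra locking.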
(fyi, this one isn't quite ready wrt IsLowEndDevice() usage)
On 2016/02/09 14:07:30, sof wrote: > (fyi, this one isn't quite ready wrt IsLowEndDevice() usage) Now the tests pass; landing.
The CQ bit was checked by haraken@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from yutak@chromium.org Link to the patchset: https://codereview.chromium.org/1666083002/#ps200001 (title: " ")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1666083002/200001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1666083002/200001
are one or more of the bots config'ed as low-end?
On 2016/02/10 07:14:54, sof wrote: > are one or more of the bots config'ed as low-end? I don't think so. Also I guess ASSERTs are enabled on all bots that run tests. Which means that page discarding is not enabled even if the bot is configed as low-end.
Message was sent while issue was closed.
Committed patchset #11 (id:200001)
Message was sent while issue was closed.
On 2016/02/10 07:20:21, haraken wrote:
> [...]
> Also I guess ASSERTs are enabled on all bots that run tests. Which means that
> page discarding is not enabled even if the bot is configed as low-end.

oh, right. could/should it be using !defined(NDEBUG) instead of !ENABLE(ASSERT)? (i hope i got the negations right there.. i.e., enable it in release builds.)
Message was sent while issue was closed.
On 2016/02/10 07:32:46, sof wrote:
> [...]
> oh, right. could/should it be using !defined(NDEBUG) instead of !ENABLE(ASSERT)?
> (i hope i got the negations right there.. i.e., enable it in release builds.)

Hmm, not really. When ASSERTs are enabled, we zap the free lists to catch use-after-free. So we cannot discard pages of the free lists. If we want to discard the pages, we need to disable the asserts about use-after-free.
Message was sent while issue was closed.
On 2016/02/10 07:45:03, haraken wrote:
> [...]
> Hmm, not really. When ASSERTs are enabled, we zap the free lists to catch
> use-after-free. So we cannot discard pages of the free lists. If we want to
> discard the pages, we need to disable the asserts about use-after-free.

Correctness-wise it ought to work, but ok. That this code path is untested is something we just have to accept for now. "Solving" fragmentation is the longer-term issue to address, I think.
Message was sent while issue was closed.
On 2016/02/10 08:23:17, sof wrote:
> [...]
> Correctness-wise it ought to work, but ok. That this code path is untested is
> something we just have to accept for now. "Solving" fragmentation is the
> longer-term issue to address, I think.

Yeah, agreed.
