|
|
Chromium Code Reviews|
Created:
4 years, 6 months ago by Xiaocheng Modified:
4 years, 6 months ago CC:
chromium-reviews, creis+watch_chromium.org, darin-cc_chromium.org, jam, mkwst+moarreviews-renderer_chromium.org, mlamouri+watch-content_chromium.org, nasko+codewatch_chromium.org, scheduler-bugs_chromium.org, tracing+reviews_chromium.org, wfh+watch_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionMake sure FrameScheduler is cleared when LocalFrame::detach() finishes
This patch is a relanding of https://codereview.chromium.org/1907453002 (Make
RenderFrameImpl own its frame blame context) with necessary modification:
Since a FrameScheduler may be created in LocalFrame::detach() due to scripting,
this patch adds an extra clearing of |m_frameScheduler| to make sure it is cleared.
BUG=618599
Committed: https://crrev.com/01c3b106a46f15a7250ce869b294fcc35fe0662f
Cr-Commit-Position: refs/heads/master@{#399607}
Patch Set 1 #
Total comments: 4
Patch Set 2 : Use thread checker in BlameContext::weak_ptr() #Patch Set 3 : (Failed attempt) Introduce and use WebFrameScheduler::setBlameContext #Patch Set 4 : Use m_useNullBlameContext flag to instruct FrameScheduler #Patch Set 5 : Make sure FrameScheduler is cleared when LocalFrame::detach() is done #
Messages
Total messages: 30 (9 generated)
Description was changed from ========== Non-owners should store weak pointers of BlameContext, not raw pointers BlameContext objects may have complex lifecycle management. Keeping raw pointers in non-owner objects introduces the risk of use-after-free crashes. This patch ensures that no object may store a BlameContext as a raw pointer and eliminates the risk. Consequently, the relanding of https://codereview.chromium.org/1907453002 (Make RenderFrameImpl own its frame blame context) is allowed, and is contained in this patch. BUG=618599 ========== to ========== Non-owners should store weak pointers of BlameContext, not raw pointers BlameContext objects may have complex lifecycle management. Keeping raw pointers in non-owner objects introduces the risk of use-after-free crashes. This patch ensures that no object may store a BlameContext as a raw pointer and eliminates the risk. Consequently, https://codereview.chromium.org/1907453002 (Make RenderFrameImpl own its frame blame context) can be relanded, and is contained in this patch. BUG=618599 ==========
xiaochengh@chromium.org changed reviewers: + jochen@chromium.org, primiano@chromium.org, skyostil@chromium.org
PTAL. I've also investigated into the use-after-free bug crbug.com/605480 and found the exact cause of the crashing. However, I think the appropriate way to fix it is not digging into the lifecycle of WebFrameScheduler --- it requires catching all possible routes from LocalFrame::detach() to creation of WebFrameScheduler, which doesn't seem quite doable for a Blink object. So the correct way is just not to store any raw pointer.
https://codereview.chromium.org/2053233002/diff/1/components/scheduler/child/... File components/scheduler/child/single_thread_idle_task_runner.cc (right): https://codereview.chromium.org/2053233002/diff/1/components/scheduler/child/... components/scheduler/child/single_thread_idle_task_runner.cc:87: blame_context_ = blame_context ? blame_context->weak_ptr() : nullptr; how do you know that here you are on the right thread (the one where blame_context lives) so invoking weak_ptr() is safe? https://codereview.chromium.org/2053233002/diff/1/content/renderer/render_fra... File content/renderer/render_frame_impl.cc (right): https://codereview.chromium.org/2053233002/diff/1/content/renderer/render_fra... content/renderer/render_frame_impl.cc:2641: return blame_context_.get(); how do you know that this is the right thread (The one where blamecontext lives) and that weakptr.get() is safe?
Thanks for the review. https://codereview.chromium.org/2053233002/diff/1/components/scheduler/child/... File components/scheduler/child/single_thread_idle_task_runner.cc (right): https://codereview.chromium.org/2053233002/diff/1/components/scheduler/child/... components/scheduler/child/single_thread_idle_task_runner.cc:87: blame_context_ = blame_context ? blame_context->weak_ptr() : nullptr; On 2016/06/10 at 07:50:35, Primiano Tucci wrote: > how do you know that here you are on the right thread (the one where blame_context lives) so invoking weak_ptr() is safe? Are you implying any potentially risky design? In my understanding, it's currently safe. The only call site of SingleThreadIdleTaskRunner::SetBlameContext() is in RendererBlinkPlatformImpl::ctor(), which can only be run in the main thread. The only BlameContext that can be passed to this function is the singleton RendererBlinkPlatformImpl::top_level_blame_context_, which also live in the main thread. I can also add a DCHECK(thread_checker_.CalledOnValidThread()) in BlameContext::weak_ptr() to make sure that it's called on the correct thread. https://codereview.chromium.org/2053233002/diff/1/content/renderer/render_fra... File content/renderer/render_frame_impl.cc (right): https://codereview.chromium.org/2053233002/diff/1/content/renderer/render_fra... content/renderer/render_frame_impl.cc:2641: return blame_context_.get(); On 2016/06/10 at 07:50:35, Primiano Tucci wrote: > how do you know that this is the right thread (The one where blamecontext lives) and that weakptr.get() is safe? blame_context_ is an std::unique_ptr so we are safe here.
Ok I am probably misunderstanding the goal of this CL. Is the deal here that you are using WeakPtr always on the same thread only to detect deletion ? In this case fine for me as long as Sami is happy. The only think that scares when I see WeakPtr is that clients might be mistakenly assuming it is safe to use/pass cross-threads. Which is not true. the only safe use of WeakPtr cross-threads is by using it to PostTask (or directly try to dereference) onto the thread where the object is supposed to live (% it has been destroyed).
On 2016/06/10 at 08:22:32, primiano wrote: > Ok I am probably misunderstanding the goal of this CL. Is the deal here that you are using WeakPtr always on the same thread only to detect deletion ? In this case fine for me as long as Sami is happy. Yes. The original goal is to fix the memory leak of FrameBlameContext, but it has raw pointers stored elsewhere. So I think it's better to just store weak pointers instead. Sorry I should have made it clear at the beginning. > > The only think that scares when I see WeakPtr is that clients might be mistakenly assuming it is safe to use/pass cross-threads. Which is not true. the only safe use of WeakPtr cross-threads is by using it to PostTask (or directly try to dereference) onto the thread where the object is supposed to live (% it has been destroyed). Does it help if I change the comment of BlameContext::weak_ptr() into "Creates and returns a WeakPtr to itself", so that clients know that the function must be called on the correct thread?
Description was changed from ========== Non-owners should store weak pointers of BlameContext, not raw pointers BlameContext objects may have complex lifecycle management. Keeping raw pointers in non-owner objects introduces the risk of use-after-free crashes. This patch ensures that no object may store a BlameContext as a raw pointer and eliminates the risk. Consequently, https://codereview.chromium.org/1907453002 (Make RenderFrameImpl own its frame blame context) can be relanded, and is contained in this patch. BUG=618599 ========== to ========== Non-owners should store weak pointers of BlameContext, not raw pointers Currently, there is a memory leak with BlameContext. However, some non-owner objects are storing raw pointers to BlameContexts, making it difficult to safely delete a BlameContext without causing use-after -free crashing. This patch solves the issue by ensuring that non-owner objects must store a BlameContext as a weak pointer, so that deletions can be automatically detected. This patch also relands https://codereview.chromium.org/1907453002 (Make RenderFrameImpl own its frame blame context) with necessary modifications. BUG=618599 ==========
On 2016/06/10 08:36:07, Xiaocheng wrote: > > Does it help if I change the comment of BlameContext::weak_ptr() into "Creates > and returns a WeakPtr to itself", so that clients know that the function must be > called on the correct thread? The name is fine. I think just some DCHECK(ThreadChecker) might help avoiding misinterpretations of intents w.r.t. crossing threads. Thanks for the explanation.
On 2016/06/10 at 08:45:47, primiano wrote: > On 2016/06/10 08:36:07, Xiaocheng wrote: > > > > Does it help if I change the comment of BlameContext::weak_ptr() into "Creates > > and returns a WeakPtr to itself", so that clients know that the function must be > > called on the correct thread? > > The name is fine. I think just some DCHECK(ThreadChecker) might help avoiding misinterpretations of intents w.r.t. crossing threads. > Thanks for the explanation. Done in patch 2.
Your comment in the linked bug made it sound like we were just forgetting to clear the pointer when the frame went away. Is this not the case, or are there other places where we might still hit a use after free? In general I would prefer explicitly managing the lifetime of the BlameContext if possible.
On 2016/06/10 at 09:16:51, skyostil wrote: > Your comment in the linked bug made it sound like we were just forgetting to clear the pointer when the frame went away. Is this not the case, or are there other places where we might still hit a use after free? In general I would prefer explicitly managing the lifetime of the BlameContext if possible. Well, in some sense, every use-after-free can be understood as "forgetting to clear the pointer when something goes away". If we add another |m_frameScheduler.reset()| into LocalFrame::detach() after |m_loader.stopAllLoaders()|, the crashing in crbug.com/605480 can no longer be reproduced. But I'm not sure if it's enough. The code after |m_loader.stopAllLoaders()| looks quite complicated (at least to me). If any subsequent line hides another call to LocalFrame::frameScheduler(), we need to clear |m_frameScheduler| again. That doesn't look like an elegant solution. Or maybe |m_frameScheduler| is cleared to early? Is it assumed in the design that in LocalFrame::detach(), we are not expecting any use of the frame's scheduler after calling |m_frameScheduler.reset()|? If so, then it's an issue with |m_frameScheduler| instead.
On 2016/06/10 09:53:46, Xiaocheng wrote: > Well, in some sense, every use-after-free can be understood as "forgetting to > clear the pointer when something goes away". > > If we add another |m_frameScheduler.reset()| into LocalFrame::detach() after > |m_loader.stopAllLoaders()|, the crashing in crbug.com/605480 can no longer be > reproduced. But I'm not sure if it's enough. The code after > |m_loader.stopAllLoaders()| looks quite complicated (at least to me). If any > subsequent line hides another call to LocalFrame::frameScheduler(), we need to > clear |m_frameScheduler| again. That doesn't look like an elegant solution. > > Or maybe |m_frameScheduler| is cleared to early? Is it assumed in the design > that in LocalFrame::detach(), we are not expecting any use of the frame's > scheduler after calling |m_frameScheduler.reset()|? If so, then it's an issue > with |m_frameScheduler| instead. Right, ideally we wouldn't be using a FrameScheduler after the frame gets detached. One option might be return a null scheduler if the frame has been detached, although I'm not sure if the call sites can cope with that. If a FrameScheduler gets recreated after detaching, will it get a null blame context or a pointer to the old one? Maybe we could make sure it gets a null in that case.
After some thought I still prefer using WeakPtrs instead of clearing raw pointers manually. I'm not sure if manual management is preferred, but it's clearly more complicated and error-prone. On 2016/06/10 at 10:05:11, skyostil wrote: > On 2016/06/10 09:53:46, Xiaocheng wrote: > > Well, in some sense, every use-after-free can be understood as "forgetting to > > clear the pointer when something goes away". > > > > If we add another |m_frameScheduler.reset()| into LocalFrame::detach() after > > |m_loader.stopAllLoaders()|, the crashing in crbug.com/605480 can no longer be > > reproduced. But I'm not sure if it's enough. The code after > > |m_loader.stopAllLoaders()| looks quite complicated (at least to me). If any > > subsequent line hides another call to LocalFrame::frameScheduler(), we need to > > clear |m_frameScheduler| again. That doesn't look like an elegant solution. > > > > Or maybe |m_frameScheduler| is cleared to early? Is it assumed in the design > > that in LocalFrame::detach(), we are not expecting any use of the frame's > > scheduler after calling |m_frameScheduler.reset()|? If so, then it's an issue > > with |m_frameScheduler| instead. > > Right, ideally we wouldn't be using a FrameScheduler after the frame gets detached. One option might be return a null scheduler if the frame has been detached, although I'm not sure if the call sites can cope with that. With a rough code search, allowing LocalFrame::frameScheduler() to return null seems to be affecting too much. I'm not seeing a strong reason for that. In fact I have a possibly unrelated question. Let's ignore BlameContext for now since it's not supposed to affect any important logic, just like tracing. Then why are we clearing the FrameScheduler in detach(), given that it (i) is already managed as an OwnPtr, and (ii) will be created whenever any client requests for it? Clearing the FrameScheduler doesn't seem to accomplish anything. > > If a FrameScheduler gets recreated after detaching, will it get a null blame context or a pointer to the old one? Maybe we could make sure it gets a null in that case. Currently it's the old one, which is the direct troublemaker. However, if we shouldn't manually clear the FrameScheduler in the first place, then we need to make sure that its stored BlameContext pointers are cleared before the BCs are deleted. Once again, it's the question of WeakPtr vs. manual management, and I prefer WeakPtr.
On 2016/06/10 14:09:45, Xiaocheng wrote: > After some thought I still prefer using WeakPtrs instead of clearing raw > pointers manually. I'm not sure if manual management is preferred, but it's > clearly more complicated and error-prone. > > On 2016/06/10 at 10:05:11, skyostil wrote: > > On 2016/06/10 09:53:46, Xiaocheng wrote: > > > Well, in some sense, every use-after-free can be understood as "forgetting > to > > > clear the pointer when something goes away". > > > > > > If we add another |m_frameScheduler.reset()| into LocalFrame::detach() after > > > |m_loader.stopAllLoaders()|, the crashing in crbug.com/605480 can no longer > be > > > reproduced. But I'm not sure if it's enough. The code after > > > |m_loader.stopAllLoaders()| looks quite complicated (at least to me). If any > > > subsequent line hides another call to LocalFrame::frameScheduler(), we need > to > > > clear |m_frameScheduler| again. That doesn't look like an elegant solution. > > > > > > Or maybe |m_frameScheduler| is cleared to early? Is it assumed in the design > > > that in LocalFrame::detach(), we are not expecting any use of the frame's > > > scheduler after calling |m_frameScheduler.reset()|? If so, then it's an > issue > > > with |m_frameScheduler| instead. > > > > Right, ideally we wouldn't be using a FrameScheduler after the frame gets > detached. One option might be return a null scheduler if the frame has been > detached, although I'm not sure if the call sites can cope with that. > > With a rough code search, allowing LocalFrame::frameScheduler() to return null > seems to be affecting too much. I'm not seeing a strong reason for that. > > In fact I have a possibly unrelated question. Let's ignore BlameContext for now > since it's not supposed to affect any important logic, just like tracing. Then > why are we clearing the FrameScheduler in detach(), given that it (i) is already > managed as an OwnPtr, and (ii) will be created whenever any client requests for > it? Clearing the FrameScheduler doesn't seem to accomplish anything. I think it's only about making sure the scheduler frame tree matches the hierarchy of the the frame tree. But like you said, the frame scheduler can get recreated so this assumption doesn't really hold completely. > > > > If a FrameScheduler gets recreated after detaching, will it get a null blame > context or a pointer to the old one? Maybe we could make sure it gets a null in > that case. > > Currently it's the old one, which is the direct troublemaker. However, if we > shouldn't manually clear the FrameScheduler in the first place, then we need to > make sure that its stored BlameContext pointers are cleared before the BCs are > deleted. Once again, it's the question of WeakPtr vs. manual management, and I > prefer WeakPtr. The only reason I'm opposed to WeakPtr is that it's basically an admission that we don't know what's going on. I'd rather not be in that state :) WDYT about making the BlameContext become null after detach?
On 2016/06/10 at 16:31:13, skyostil wrote: > On 2016/06/10 14:09:45, Xiaocheng wrote: > > After some thought I still prefer using WeakPtrs instead of clearing raw > > pointers manually. I'm not sure if manual management is preferred, but it's > > clearly more complicated and error-prone. > > > > On 2016/06/10 at 10:05:11, skyostil wrote: > > > On 2016/06/10 09:53:46, Xiaocheng wrote: > > > > Well, in some sense, every use-after-free can be understood as "forgetting > > to > > > > clear the pointer when something goes away". > > > > > > > > If we add another |m_frameScheduler.reset()| into LocalFrame::detach() after > > > > |m_loader.stopAllLoaders()|, the crashing in crbug.com/605480 can no longer > > be > > > > reproduced. But I'm not sure if it's enough. The code after > > > > |m_loader.stopAllLoaders()| looks quite complicated (at least to me). If any > > > > subsequent line hides another call to LocalFrame::frameScheduler(), we need > > to > > > > clear |m_frameScheduler| again. That doesn't look like an elegant solution. > > > > > > > > Or maybe |m_frameScheduler| is cleared to early? Is it assumed in the design > > > > that in LocalFrame::detach(), we are not expecting any use of the frame's > > > > scheduler after calling |m_frameScheduler.reset()|? If so, then it's an > > issue > > > > with |m_frameScheduler| instead. > > > > > > Right, ideally we wouldn't be using a FrameScheduler after the frame gets > > detached. One option might be return a null scheduler if the frame has been > > detached, although I'm not sure if the call sites can cope with that. > > > > With a rough code search, allowing LocalFrame::frameScheduler() to return null > > seems to be affecting too much. I'm not seeing a strong reason for that. > > > > In fact I have a possibly unrelated question. Let's ignore BlameContext for now > > since it's not supposed to affect any important logic, just like tracing. Then > > why are we clearing the FrameScheduler in detach(), given that it (i) is already > > managed as an OwnPtr, and (ii) will be created whenever any client requests for > > it? Clearing the FrameScheduler doesn't seem to accomplish anything. > > I think it's only about making sure the scheduler frame tree matches the hierarchy of the the frame tree. But like you said, the frame scheduler can get recreated so this assumption doesn't really hold completely. > > > > > > > If a FrameScheduler gets recreated after detaching, will it get a null blame > > context or a pointer to the old one? Maybe we could make sure it gets a null in > > that case. > > > > Currently it's the old one, which is the direct troublemaker. However, if we > > shouldn't manually clear the FrameScheduler in the first place, then we need to > > make sure that its stored BlameContext pointers are cleared before the BCs are > > deleted. Once again, it's the question of WeakPtr vs. manual management, and I > > prefer WeakPtr. > > The only reason I'm opposed to WeakPtr is that it's basically an admission that we don't know what's going on. I'd rather not be in that state :) > > WDYT about making the BlameContext become null after detach? It seems that I overcomplicated the problem -- I was thinking about what if the same LocalFrame gets reattached, but there's not even an attach() function for Frame and its subclasses. So yeah, setting the BlameContext to be null after detaching is enough.
Description was changed from ========== Non-owners should store weak pointers of BlameContext, not raw pointers Currently, there is a memory leak with BlameContext. However, some non-owner objects are storing raw pointers to BlameContexts, making it difficult to safely delete a BlameContext without causing use-after -free crashing. This patch solves the issue by ensuring that non-owner objects must store a BlameContext as a weak pointer, so that deletions can be automatically detected. This patch also relands https://codereview.chromium.org/1907453002 (Make RenderFrameImpl own its frame blame context) with necessary modifications. BUG=618599 ========== to ========== Set FrameScheduler's BlameContext to null when LocalFrame is detached This patch is a relanding of https://codereview.chromium.org/1907453002 (Make RenderFrameImpl own its frame blame context) with necessary modifications: 1. Introduce WebFrameScheduler::setBlameContext() so that a FrameScheduler's BlameContext can be modified after initialization. 2. In LocalFrame::detach(), use the new method to clear the stored BlameContext from the FrameScheduler. BUG=618599 ==========
Description was changed from ========== Set FrameScheduler's BlameContext to null when LocalFrame is detached This patch is a relanding of https://codereview.chromium.org/1907453002 (Make RenderFrameImpl own its frame blame context) with necessary modifications: 1. Introduce WebFrameScheduler::setBlameContext() so that a FrameScheduler's BlameContext can be modified after initialization. 2. In LocalFrame::detach(), use the new method to clear the stored BlameContext from the FrameScheduler. BUG=618599 ========== to ========== Instruct FrameScheduler to use null BlameContext when LocalFrame is detached This patch is a relanding of https://codereview.chromium.org/1907453002 (Make RenderFrameImpl own its frame blame context) with necessary modification: A flag m_useNullBlameContext is added to LocalFrame. After a LocalFrame is being detached and the FrameScheduler is cleared, this flag is set to false so that if the FrameScheduler gets recreated later, it gets a null BlameContext; BUG=618599 ==========
Description was changed from ========== Instruct FrameScheduler to use null BlameContext when LocalFrame is detached This patch is a relanding of https://codereview.chromium.org/1907453002 (Make RenderFrameImpl own its frame blame context) with necessary modification: A flag m_useNullBlameContext is added to LocalFrame. After a LocalFrame is being detached and the FrameScheduler is cleared, this flag is set to false so that if the FrameScheduler gets recreated later, it gets a null BlameContext; BUG=618599 ========== to ========== Instruct FrameScheduler to use null BlameContext when LocalFrame is detached This patch is a relanding of https://codereview.chromium.org/1907453002 (Make RenderFrameImpl own its frame blame context) with necessary modification: A flag m_useNullBlameContext is added to LocalFrame. After a LocalFrame is being detached and the FrameScheduler is cleared, this flag is set to false so that if the FrameScheduler gets recreated later, it gets a null BlameContext. BUG=618599 ==========
Description was changed from ========== Instruct FrameScheduler to use null BlameContext when LocalFrame is detached This patch is a relanding of https://codereview.chromium.org/1907453002 (Make RenderFrameImpl own its frame blame context) with necessary modification: A flag m_useNullBlameContext is added to LocalFrame. After a LocalFrame is being detached and the FrameScheduler is cleared, this flag is set to false so that if the FrameScheduler gets recreated later, it gets a null BlameContext. BUG=618599 ========== to ========== Make sure FrameScheduler is cleared when LocalFrame::detach() finishes This patch is a relanding of https://codereview.chromium.org/1907453002 (Make RenderFrameImpl own its frame blame context) with necessary modification: Since a FrameScheduler may be created in LocalFrame::detach() due to scripting, this patch adds an extra clearing of |m_frameScheduler| to make sure it is cleared. BUG=618599 ==========
PTAL at Patch 5, which is just a relanding of https://codereview.chromium.org/1907453002/ plus an extra clearing of FrameScheduler to handle the recreation in detaching. I tried (in Patch 3) another approach of adding a WebFrameScheduler::setBlameContext() to set the BlameContext to null without clearing the FrameScheduler in LocalFrame::detach(). It failed as the tasks queues don't seem to be allowed to be cleared by Blink GC, in which case a ThreadChecker assertion is hit in scheduler::TimeDomain::UnregisterQueue() (details available in the try bot logs in Patch 3). Since we have to manually clear the FrameScheduler, there's no longer any need to add WebFrameScheduler::setBlameContext(). That's how Patch 5 comes.
Thanks, lgtm.
lgtm
The CQ bit was checked by xiaochengh@chromium.org
Thanks for the comments!
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2053233002/80001
Message was sent while issue was closed.
Committed patchset #5 (id:80001)
Message was sent while issue was closed.
CQ bit was unchecked
Message was sent while issue was closed.
Description was changed from ========== Make sure FrameScheduler is cleared when LocalFrame::detach() finishes This patch is a relanding of https://codereview.chromium.org/1907453002 (Make RenderFrameImpl own its frame blame context) with necessary modification: Since a FrameScheduler may be created in LocalFrame::detach() due to scripting, this patch adds an extra clearing of |m_frameScheduler| to make sure it is cleared. BUG=618599 ========== to ========== Make sure FrameScheduler is cleared when LocalFrame::detach() finishes This patch is a relanding of https://codereview.chromium.org/1907453002 (Make RenderFrameImpl own its frame blame context) with necessary modification: Since a FrameScheduler may be created in LocalFrame::detach() due to scripting, this patch adds an extra clearing of |m_frameScheduler| to make sure it is cleared. BUG=618599 Committed: https://crrev.com/01c3b106a46f15a7250ce869b294fcc35fe0662f Cr-Commit-Position: refs/heads/master@{#399607} ==========
Message was sent while issue was closed.
Patchset 5 (id:??) landed as https://crrev.com/01c3b106a46f15a7250ce869b294fcc35fe0662f Cr-Commit-Position: refs/heads/master@{#399607} |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
