|
|
Chromium Code Reviews|
Created:
4 years, 6 months ago by Anand Mistry (off Chromium) Modified:
4 years, 6 months ago CC:
chromium-reviews, extensions-reviews_chromium.org, qsr+mojo_chromium.org, Aaron Boodman, Lei Zhang, viettrungluu+watch_chromium.org, yzshen+watch_chromium.org, abarth-chromium, tommycli, darin-cc_chromium.org, chromium-apps-reviews_chromium.org, darin (slow to review), ben+mojo_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionAlways leak the Mojo parent pipe handle on child process shutdown.
In legacy Chrome-IPC, the browser uses the disconnection on the IPC
pipe as indication that the child process has died. It does this by
intentionally holding on to the parent IPC pipe in the child as the
child process exits. When the child exits, the OS closes the pipe
and the browser receives a disconnection event and assumes the child
process is dead.
When using ChannelMojo, the IPC pipe is a Mojo message pipe. In the
child, Mojo is is told to perform an orderly shutdown when the child
process is shutting down. This results in the IPC pipe being closed
before the child process dies. In this case, when the browser sees
the pipe closure, the child process may not be dead, violating the
assumption that the child is dead, which causes problems with child
termination.
To maintain existing semantics and avoid issues with child
termination, the EDK always leaks the parent pipe in the child instead
of closing it. The parent will then only see the pipe close when the
child process exits. The EDK will only leak the parent-child OS pipe
in the child process. All other pipes will be closed when Mojo is shut
down.
BUG=618121
Committed: https://crrev.com/e64e2962ecf525fff64e48ba3766f97eca5bdd2e
Cr-Commit-Position: refs/heads/master@{#399949}
Patch Set 1 #Patch Set 2 : Get rid of embedder API. #Patch Set 3 : comment #
Messages
Total messages: 29 (11 generated)
amistry@chromium.org changed reviewers: + jam@chromium.org, rockot@chromium.org
rockot: Because EDK. jam: Because ChildThreadImpl and https://codereview.chromium.org/2058653002/ System developers HATE this! One weird trick to get race-free child process shutdown.
On 2016/06/14 at 04:57:43, amistry wrote: > rockot: Because EDK. > jam: Because ChildThreadImpl and https://codereview.chromium.org/2058653002/ > > System developers HATE this! One weird trick to get race-free child process shutdown. This API makes me sad. Can you please help me understand the reason why it's a problem for the browser to notice channel closure before process death, and why there's nothing else we can do about it?
On 2016/06/14 05:05:58, Ken Rockot wrote: > On 2016/06/14 at 04:57:43, amistry wrote: > > rockot: Because EDK. > > jam: Because ChildThreadImpl and https://codereview.chromium.org/2058653002/ > > > > System developers HATE this! One weird trick to get race-free child process > shutdown. > > This API makes me sad. Can you please help me understand the reason why it's a > problem for the browser to notice channel closure before process death, and why > there's nothing else we can do about it? You and me both. I shed a tear before mailing this out. The explanation is in https://codereview.chromium.org/2058653002/, specifically the big comment block I put in content/browser/browser_child_process_host_impl.cc When BrowserChildProcessHostImpl detects an IPC channel disconnection, it assumes the child is dead. Then as part of determining how the child died, it SIGKILLs the child before waitpid()ing on it. If the child is still alive, as might be the case in ChannelMojo, it ends up being killed by the SIGKILL and the termination status is "process was killed" instead of "process died peacefully". The reason the child is SIGKILL'd is: 1. It is assumed the child is dead, so it shouldn't matter unless something _really_ strange happened. 2. It allows for a non-blocking waitpid(), which is needed because waitpid() can't timeout, and we can't use threads in the zygote to simulate a timeout.
On 2016/06/14 at 05:20:08, amistry wrote: > On 2016/06/14 05:05:58, Ken Rockot wrote: > > On 2016/06/14 at 04:57:43, amistry wrote: > > > rockot: Because EDK. > > > jam: Because ChildThreadImpl and https://codereview.chromium.org/2058653002/ > > > > > > System developers HATE this! One weird trick to get race-free child process > > shutdown. > > > > This API makes me sad. Can you please help me understand the reason why it's a > > problem for the browser to notice channel closure before process death, and why > > there's nothing else we can do about it? > > You and me both. I shed a tear before mailing this out. > > The explanation is in https://codereview.chromium.org/2058653002/, specifically the big comment block I put in content/browser/browser_child_process_host_impl.cc > When BrowserChildProcessHostImpl detects an IPC channel disconnection, it assumes the child is dead. Then as part of determining how the child died, it SIGKILLs the child before waitpid()ing on it. If the child is still alive, as might be the case in ChannelMojo, it ends up being killed by the SIGKILL and the termination status is "process was killed" instead of "process died peacefully". The reason the child is SIGKILL'd is: > 1. It is assumed the child is dead, so it shouldn't matter unless something _really_ strange happened. > 2. It allows for a non-blocking waitpid(), which is needed because waitpid() can't timeout, and we can't use threads in the zygote to simulate a timeout. I see. So what if we just always leaked it?
On 2016/06/14 05:24:55, Ken Rockot wrote: > On 2016/06/14 at 05:20:08, amistry wrote: > > On 2016/06/14 05:05:58, Ken Rockot wrote: > > > On 2016/06/14 at 04:57:43, amistry wrote: > > > > rockot: Because EDK. > > > > jam: Because ChildThreadImpl and > https://codereview.chromium.org/2058653002/ > > > > > > > > System developers HATE this! One weird trick to get race-free child > process > > > shutdown. > > > > > > This API makes me sad. Can you please help me understand the reason why it's > a > > > problem for the browser to notice channel closure before process death, and > why > > > there's nothing else we can do about it? > > > > You and me both. I shed a tear before mailing this out. > > > > The explanation is in https://codereview.chromium.org/2058653002/, > specifically the big comment block I put in > content/browser/browser_child_process_host_impl.cc > > When BrowserChildProcessHostImpl detects an IPC channel disconnection, it > assumes the child is dead. Then as part of determining how the child died, it > SIGKILLs the child before waitpid()ing on it. If the child is still alive, as > might be the case in ChannelMojo, it ends up being killed by the SIGKILL and the > termination status is "process was killed" instead of "process died peacefully". > The reason the child is SIGKILL'd is: > > 1. It is assumed the child is dead, so it shouldn't matter unless something > _really_ strange happened. > > 2. It allows for a non-blocking waitpid(), which is needed because waitpid() > can't timeout, and we can't use threads in the zygote to simulate a timeout. > > I see. So what if we just always leaked it? We could, but only the parent pipe in the child. It would get rid of the embedder API (yay!). But I wonder if DrMemory or *SAN might complain about the leaked handle in certain tests.
On 2016/06/14 at 05:36:53, amistry wrote: > On 2016/06/14 05:24:55, Ken Rockot wrote: > > On 2016/06/14 at 05:20:08, amistry wrote: > > > On 2016/06/14 05:05:58, Ken Rockot wrote: > > > > On 2016/06/14 at 04:57:43, amistry wrote: > > > > > rockot: Because EDK. > > > > > jam: Because ChildThreadImpl and > > https://codereview.chromium.org/2058653002/ > > > > > > > > > > System developers HATE this! One weird trick to get race-free child > > process > > > > shutdown. > > > > > > > > This API makes me sad. Can you please help me understand the reason why it's > > a > > > > problem for the browser to notice channel closure before process death, and > > why > > > > there's nothing else we can do about it? > > > > > > You and me both. I shed a tear before mailing this out. > > > > > > The explanation is in https://codereview.chromium.org/2058653002/, > > specifically the big comment block I put in > > content/browser/browser_child_process_host_impl.cc > > > When BrowserChildProcessHostImpl detects an IPC channel disconnection, it > > assumes the child is dead. Then as part of determining how the child died, it > > SIGKILLs the child before waitpid()ing on it. If the child is still alive, as > > might be the case in ChannelMojo, it ends up being killed by the SIGKILL and the > > termination status is "process was killed" instead of "process died peacefully". > > The reason the child is SIGKILL'd is: > > > 1. It is assumed the child is dead, so it shouldn't matter unless something > > _really_ strange happened. > > > 2. It allows for a non-blocking waitpid(), which is needed because waitpid() > > can't timeout, and we can't use threads in the zygote to simulate a timeout. > > > > I see. So what if we just always leaked it? > > We could, but only the parent pipe in the child. It would get rid of the embedder API (yay!). But I wonder if DrMemory or *SAN might complain about the leaked handle in certain tests. Seems reasonable. They might complain but we should be able to suppress their warnings given that it's intentional behavior.
Description was changed from ========== Leak the Mojo parent pipe handle on child process shutdown. Child process shutdown assumes the child holds onto the IPC pipe until it exits. This implies that when the browser notices the IPC disconnection, the child is dead. When using ChannelMojo, this isn't true because Mojo performs an orderly shutdown before the child exits, which causes the parent pipe to be closed before the child exits. To solve this, force the EDK to leak the parent pipe instead of closing it. The parent will then only see the pipe close when the child process exits. BUG=618121 ========== to ========== Always leak the Mojo parent pipe handle on child process shutdown. Child process shutdown assumes the child holds onto the IPC pipe until it exits. This implies that when the browser notices the IPC disconnection, the child is dead. When using ChannelMojo, this isn't true because Mojo performs an orderly shutdown before the child exits, which causes the parent pipe to be closed before the child exits. To solve this, the EDK always leaks the parent pipe in the child instead of closing it. The parent will then only see the pipe close when the child process exits. BUG=618121 ==========
Description was changed from ========== Always leak the Mojo parent pipe handle on child process shutdown. Child process shutdown assumes the child holds onto the IPC pipe until it exits. This implies that when the browser notices the IPC disconnection, the child is dead. When using ChannelMojo, this isn't true because Mojo performs an orderly shutdown before the child exits, which causes the parent pipe to be closed before the child exits. To solve this, the EDK always leaks the parent pipe in the child instead of closing it. The parent will then only see the pipe close when the child process exits. BUG=618121 ========== to ========== Always leak the Mojo parent pipe handle on child process shutdown. Child process shutdown assumes the child holds onto the IPC pipe until it exits. This implies that when the browser notices the IPC disconnection, the child is dead. When using ChannelMojo, this isn't true because Mojo performs an orderly shutdown before the child exits, which causes the parent pipe to be closed before the child exits. To solve this, the EDK always leaks the parent pipe in the child instead of closing it. The parent will then only see the pipe close when the child process exits. BUG=618121 ==========
On 2016/06/14 05:48:10, Ken Rockot wrote: > On 2016/06/14 at 05:36:53, amistry wrote: > > On 2016/06/14 05:24:55, Ken Rockot wrote: > > > On 2016/06/14 at 05:20:08, amistry wrote: > > > > On 2016/06/14 05:05:58, Ken Rockot wrote: > > > > > On 2016/06/14 at 04:57:43, amistry wrote: > > > > > > rockot: Because EDK. > > > > > > jam: Because ChildThreadImpl and > > > https://codereview.chromium.org/2058653002/ > > > > > > > > > > > > System developers HATE this! One weird trick to get race-free child > > > process > > > > > shutdown. > > > > > > > > > > This API makes me sad. Can you please help me understand the reason why > it's > > > a > > > > > problem for the browser to notice channel closure before process death, > and > > > why > > > > > there's nothing else we can do about it? > > > > > > > > You and me both. I shed a tear before mailing this out. > > > > > > > > The explanation is in https://codereview.chromium.org/2058653002/, > > > specifically the big comment block I put in > > > content/browser/browser_child_process_host_impl.cc > > > > When BrowserChildProcessHostImpl detects an IPC channel disconnection, it > > > assumes the child is dead. Then as part of determining how the child died, > it > > > SIGKILLs the child before waitpid()ing on it. If the child is still alive, > as > > > might be the case in ChannelMojo, it ends up being killed by the SIGKILL and > the > > > termination status is "process was killed" instead of "process died > peacefully". > > > The reason the child is SIGKILL'd is: > > > > 1. It is assumed the child is dead, so it shouldn't matter unless > something > > > _really_ strange happened. > > > > 2. It allows for a non-blocking waitpid(), which is needed because > waitpid() > > > can't timeout, and we can't use threads in the zygote to simulate a timeout. > > > > > > I see. So what if we just always leaked it? > > > > We could, but only the parent pipe in the child. It would get rid of the > embedder API (yay!). But I wonder if DrMemory or *SAN might complain about the > leaked handle in certain tests. > > Seems reasonable. They might complain but we should be able to suppress their > warnings given that it's intentional behavior. Ok. PTAL. I could have made "leak" a constructor argument to Channel, but since this is a special case, I figured it's better to have an explicit "leak this handle".
lgtm the cl description is a bit confusing: "When using ChannelMojo, this isn't true because Mojo performs an orderly shutdown before the child exits, which causes the parent pipe to be closed before the child exits.". I'm confused because IPC::ChannelProxy also by default closes the pipe. Just we chose not to do this for the pipe to the child process so that as you saw the disconnect signal means the child is gone.
lgtm
Description was changed from ========== Always leak the Mojo parent pipe handle on child process shutdown. Child process shutdown assumes the child holds onto the IPC pipe until it exits. This implies that when the browser notices the IPC disconnection, the child is dead. When using ChannelMojo, this isn't true because Mojo performs an orderly shutdown before the child exits, which causes the parent pipe to be closed before the child exits. To solve this, the EDK always leaks the parent pipe in the child instead of closing it. The parent will then only see the pipe close when the child process exits. BUG=618121 ========== to ========== Always leak the Mojo parent pipe handle on child process shutdown. In legacy Chrome-IPC, when a child process shuts downs, it holds onto the IPC pipe while the process exits. This allows the browser process to notice the pipe closure and assume the child process is dead. When using ChannelMojo, the IPC pipe is a Mojo message pipe and Mojo performs an orderly shutdown of the system, which results in the IPC pipe being closed before the child process dies. In this case, when the browser sees the pipe closure, the child process may not be dead, violating the assumption that the child is dead, which causes problems with child termination. To solve this, the EDK always leaks the parent pipe in the child instead of closing it. The parent will then only see the pipe close when the child process exits. BUG=618121 ==========
On 2016/06/14 16:27:11, jam wrote: > lgtm > > the cl description is a bit confusing: "When using ChannelMojo, this isn't true > because Mojo performs an orderly shutdown before the child exits, which causes > the parent pipe to be closed before the child exits.". I'm confused because > IPC::ChannelProxy also by default closes the pipe. Just we chose not to do this > for the pipe to the child process so that as you saw the disconnect signal means > the child is gone. I've re-written the description. Hopefully it's clearer.
The CQ bit was checked by amistry@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2065453004/40001
On 2016/06/15 00:23:22, Anand Mistry wrote: > On 2016/06/14 16:27:11, jam wrote: > > lgtm > > > > the cl description is a bit confusing: "When using ChannelMojo, this isn't > true > > because Mojo performs an orderly shutdown before the child exits, which causes > > the parent pipe to be closed before the child exits.". I'm confused because > > IPC::ChannelProxy also by default closes the pipe. Just we chose not to do > this > > for the pipe to the child process so that as you saw the disconnect signal > means > > the child is gone. > > I've re-written the description. Hopefully it's clearer. not quite. what i was trying to get to is that leaking the ipc pipe so that pipe-error = process-died is something higher level and is configured outside ipc layer. i.e. with both chrome ipc and mojo, both by default dont leak pipe and close it when the channel is closed. this behavior is only configured for the main pipe to each child process.
The CQ bit was unchecked by amistry@chromium.org
Description was changed from ========== Always leak the Mojo parent pipe handle on child process shutdown. In legacy Chrome-IPC, when a child process shuts downs, it holds onto the IPC pipe while the process exits. This allows the browser process to notice the pipe closure and assume the child process is dead. When using ChannelMojo, the IPC pipe is a Mojo message pipe and Mojo performs an orderly shutdown of the system, which results in the IPC pipe being closed before the child process dies. In this case, when the browser sees the pipe closure, the child process may not be dead, violating the assumption that the child is dead, which causes problems with child termination. To solve this, the EDK always leaks the parent pipe in the child instead of closing it. The parent will then only see the pipe close when the child process exits. BUG=618121 ========== to ========== Always leak the Mojo parent pipe handle on child process shutdown. In legacy Chrome-IPC, the browser uses the disconnection on the IPC pipe as indication that the child process has died. It does this by intentionally holding on to the parent IPC pipe in the child as the child process exits. When the child exits, the OS closes the pipe and the browser receives a disconnection event and assumes the child process is dead. When using ChannelMojo, the IPC pipe is a Mojo message pipe. In the child, Mojo is is told to perform an orderly shutdown when the child process is shutting down. This results in the IPC pipe being closed before the child process dies. In this case, when the browser sees the pipe closure, the child process may not be dead, violating the assumption that the child is dead, which causes problems with child termination. To maintain existing semantics and avoid issues with child termination, the EDK always leaks the parent pipe in the child instead of closing it. The parent will then only see the pipe close when the child process exits. The EDK will only leak the parent-child OS pipe in the child process. All other pipes will be closed when Mojo is shut down. BUG=618121 ==========
On 2016/06/15 04:24:38, jam wrote: > On 2016/06/15 00:23:22, Anand Mistry wrote: > > On 2016/06/14 16:27:11, jam wrote: > > > lgtm > > > > > > the cl description is a bit confusing: "When using ChannelMojo, this isn't > > true > > > because Mojo performs an orderly shutdown before the child exits, which > causes > > > the parent pipe to be closed before the child exits.". I'm confused because > > > IPC::ChannelProxy also by default closes the pipe. Just we chose not to do > > this > > > for the pipe to the child process so that as you saw the disconnect signal > > means > > > the child is gone. > > > > I've re-written the description. Hopefully it's clearer. > > not quite. what i was trying to get to is that leaking the ipc pipe so that > pipe-error = process-died is something higher level and is configured outside > ipc layer. i.e. with both chrome ipc and mojo, both by default dont leak pipe > and close it when the channel is closed. this behavior is only configured for > the main pipe to each child process. How about now?
On 2016/06/15 06:57:20, Anand Mistry wrote: > On 2016/06/15 04:24:38, jam wrote: > > On 2016/06/15 00:23:22, Anand Mistry wrote: > > > On 2016/06/14 16:27:11, jam wrote: > > > > lgtm > > > > > > > > the cl description is a bit confusing: "When using ChannelMojo, this isn't > > > true > > > > because Mojo performs an orderly shutdown before the child exits, which > > causes > > > > the parent pipe to be closed before the child exits.". I'm confused > because > > > > IPC::ChannelProxy also by default closes the pipe. Just we chose not to do > > > this > > > > for the pipe to the child process so that as you saw the disconnect signal > > > means > > > > the child is gone. > > > > > > I've re-written the description. Hopefully it's clearer. > > > > not quite. what i was trying to get to is that leaking the ipc pipe so that > > pipe-error = process-died is something higher level and is configured outside > > ipc layer. i.e. with both chrome ipc and mojo, both by default dont leak pipe > > and close it when the channel is closed. this behavior is only configured for > > the main pipe to each child process. > > How about now? yep looks great, thanks
The CQ bit was checked by jam@chromium.org
The CQ bit was checked by jam@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2065453004/40001
Message was sent while issue was closed.
Description was changed from ========== Always leak the Mojo parent pipe handle on child process shutdown. In legacy Chrome-IPC, the browser uses the disconnection on the IPC pipe as indication that the child process has died. It does this by intentionally holding on to the parent IPC pipe in the child as the child process exits. When the child exits, the OS closes the pipe and the browser receives a disconnection event and assumes the child process is dead. When using ChannelMojo, the IPC pipe is a Mojo message pipe. In the child, Mojo is is told to perform an orderly shutdown when the child process is shutting down. This results in the IPC pipe being closed before the child process dies. In this case, when the browser sees the pipe closure, the child process may not be dead, violating the assumption that the child is dead, which causes problems with child termination. To maintain existing semantics and avoid issues with child termination, the EDK always leaks the parent pipe in the child instead of closing it. The parent will then only see the pipe close when the child process exits. The EDK will only leak the parent-child OS pipe in the child process. All other pipes will be closed when Mojo is shut down. BUG=618121 ========== to ========== Always leak the Mojo parent pipe handle on child process shutdown. In legacy Chrome-IPC, the browser uses the disconnection on the IPC pipe as indication that the child process has died. It does this by intentionally holding on to the parent IPC pipe in the child as the child process exits. When the child exits, the OS closes the pipe and the browser receives a disconnection event and assumes the child process is dead. When using ChannelMojo, the IPC pipe is a Mojo message pipe. In the child, Mojo is is told to perform an orderly shutdown when the child process is shutting down. This results in the IPC pipe being closed before the child process dies. In this case, when the browser sees the pipe closure, the child process may not be dead, violating the assumption that the child is dead, which causes problems with child termination. To maintain existing semantics and avoid issues with child termination, the EDK always leaks the parent pipe in the child instead of closing it. The parent will then only see the pipe close when the child process exits. The EDK will only leak the parent-child OS pipe in the child process. All other pipes will be closed when Mojo is shut down. BUG=618121 ==========
Message was sent while issue was closed.
Committed patchset #3 (id:40001)
Message was sent while issue was closed.
CQ bit was unchecked
Message was sent while issue was closed.
Description was changed from ========== Always leak the Mojo parent pipe handle on child process shutdown. In legacy Chrome-IPC, the browser uses the disconnection on the IPC pipe as indication that the child process has died. It does this by intentionally holding on to the parent IPC pipe in the child as the child process exits. When the child exits, the OS closes the pipe and the browser receives a disconnection event and assumes the child process is dead. When using ChannelMojo, the IPC pipe is a Mojo message pipe. In the child, Mojo is is told to perform an orderly shutdown when the child process is shutting down. This results in the IPC pipe being closed before the child process dies. In this case, when the browser sees the pipe closure, the child process may not be dead, violating the assumption that the child is dead, which causes problems with child termination. To maintain existing semantics and avoid issues with child termination, the EDK always leaks the parent pipe in the child instead of closing it. The parent will then only see the pipe close when the child process exits. The EDK will only leak the parent-child OS pipe in the child process. All other pipes will be closed when Mojo is shut down. BUG=618121 ========== to ========== Always leak the Mojo parent pipe handle on child process shutdown. In legacy Chrome-IPC, the browser uses the disconnection on the IPC pipe as indication that the child process has died. It does this by intentionally holding on to the parent IPC pipe in the child as the child process exits. When the child exits, the OS closes the pipe and the browser receives a disconnection event and assumes the child process is dead. When using ChannelMojo, the IPC pipe is a Mojo message pipe. In the child, Mojo is is told to perform an orderly shutdown when the child process is shutting down. This results in the IPC pipe being closed before the child process dies. In this case, when the browser sees the pipe closure, the child process may not be dead, violating the assumption that the child is dead, which causes problems with child termination. To maintain existing semantics and avoid issues with child termination, the EDK always leaks the parent pipe in the child instead of closing it. The parent will then only see the pipe close when the child process exits. The EDK will only leak the parent-child OS pipe in the child process. All other pipes will be closed when Mojo is shut down. BUG=618121 Committed: https://crrev.com/e64e2962ecf525fff64e48ba3766f97eca5bdd2e Cr-Commit-Position: refs/heads/master@{#399949} ==========
Message was sent while issue was closed.
Patchset 3 (id:??) landed as https://crrev.com/e64e2962ecf525fff64e48ba3766f97eca5bdd2e Cr-Commit-Position: refs/heads/master@{#399949} |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
