|
|
Created:
7 years, 3 months ago by meacer Modified:
7 years, 1 month ago CC:
chromium-reviews, joi+watch-content_chromium.org, darin-cc_chromium.org, jochen (gone - plz use gerrit), kinuko Base URL:
svn://svn.chromium.org/chrome/trunk/src Visibility:
Public. |
DescriptionKill worker process by way of a sync IPC message before it cleans up.
When a worker process shuts down, it shuts down WebKit. If there are
other threads running in the worker process, this leads to crashes.
This fix tries to kill the worker process forcibly so that no cleanup
takes place.
BUG=249502
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=233099
Patch Set 1 #Patch Set 2 : Fix comment #
Total comments: 4
Patch Set 3 : Fix indentation #
Total comments: 2
Patch Set 4 : Kill process in worker thread rather than websharedworker_stub #Patch Set 5 : Remove unused header #
Total comments: 2
Patch Set 6 : Kill worker process in OnShutdown, that is after browser process says OK to shut down #Patch Set 7 : Handle shutdown IPC in WorkerThread #Patch Set 8 : Change process exit code #
Total comments: 3
Patch Set 9 : Add process_ checks #
Messages
Total messages: 32 (0 generated)
Michael, this is the patch that implements your suggestion :) Layout tests pass and I don't see the crashes in the bug.
tryjobs would be great to see https://codereview.chromium.org/23496052/diff/3001/content/browser/worker_hos... File content/browser/worker_host/worker_process_host.h (right): https://codereview.chromium.org/23496052/diff/3001/content/browser/worker_hos... content/browser/worker_host/worker_process_host.h:205: void OnForceKillWorkerProcess(int worker_route_id, indent is off https://codereview.chromium.org/23496052/diff/3001/content/worker/websharedwo... File content/worker/websharedworker_stub.cc (right): https://codereview.chromium.org/23496052/diff/3001/content/worker/websharedwo... content/worker/websharedworker_stub.cc:51: worker_thread->Send(new WorkerProcessHostMsg_ForceKillWorker( I think WorkerThread::Shutdown() would be a better place to do this. With this call here, the first shared worker to exit will cause the entire process to die, including any other shared workers in that process. Despite it's name, class WorkerThread is actually a one per process sort of class.
https://codereview.chromium.org/23496052/diff/3001/content/browser/worker_hos... File content/browser/worker_host/worker_process_host.h (right): https://codereview.chromium.org/23496052/diff/3001/content/browser/worker_hos... content/browser/worker_host/worker_process_host.h:205: void OnForceKillWorkerProcess(int worker_route_id, On 2013/09/12 00:22:11, michaeln wrote: > indent is off Done. https://codereview.chromium.org/23496052/diff/3001/content/worker/websharedwo... File content/worker/websharedworker_stub.cc (right): https://codereview.chromium.org/23496052/diff/3001/content/worker/websharedwo... content/worker/websharedworker_stub.cc:51: worker_thread->Send(new WorkerProcessHostMsg_ForceKillWorker( On 2013/09/12 00:22:11, michaeln wrote: > I think WorkerThread::Shutdown() would be a better place to do this. > > With this call here, the first shared worker to exit will cause the entire > process to die, including any other shared workers in that process. > > Despite it's name, class WorkerThread is actually a one per process sort of > class. I started with that, but the route_id being removed from worker thread caused the IPC to be lost. I'll take another look. Also, from reading the code there seems to be a one to one mapping between WorkerThread and WebSharedWorkerStub. Is that correct (probably not)?
> Also, from reading the code there seems to be a one to one mapping between > WorkerThread and WebSharedWorkerStub. Is that correct (probably not)? Your reading of the code is correct, right now, a chromium worker process host at most 1 worker. It should be capable of hosting a larger number of workers but right now its not. Let's not add to the reasons for why its not. https://codereview.chromium.org/23496052/diff/12001/content/common/worker_mes... File content/common/worker_messages.h (right): https://codereview.chromium.org/23496052/diff/12001/content/common/worker_mes... content/common/worker_messages.h:149: bool /* result */) You don't need the route_id or handl params for this message. The browser process knows exactly which child process to terminate based on who sent it.
Now it's killing the process in the worker thread. I haven't yet removed the code in WorkerThread::Shutdown. https://codereview.chromium.org/23496052/diff/12001/content/common/worker_mes... File content/common/worker_messages.h (right): https://codereview.chromium.org/23496052/diff/12001/content/common/worker_mes... content/common/worker_messages.h:149: bool /* result */) On 2013/09/12 02:03:20, michaeln wrote: > You don't need the route_id or handl params for this message. The browser > process knows exactly which child process to terminate based on who sent it. Done.
https://codereview.chromium.org/23496052/diff/24001/content/worker/worker_thr... File content/worker/worker_thread.cc (right): https://codereview.chromium.org/23496052/diff/24001/content/worker/worker_thr... content/worker/worker_thread.cc:63: Send(new WorkerProcessHostMsg_ForceKillWorker()); Does it not work to have this line as the first and only line of WorkerThread::Shutdown? In the 'final release' sequence, there's an opportunity for the browser process to change its mind about shutting down the child afterall. With current code, when WorkerProcessHost::CanShutdown() returns false, the 'final release' doesn't result in a shutdown. This CL changes that behavior and will kill the process even if CanShutdown says 'false'. Not sure we want that behavior change.
https://codereview.chromium.org/23496052/diff/24001/content/worker/worker_thr... File content/worker/worker_thread.cc (right): https://codereview.chromium.org/23496052/diff/24001/content/worker/worker_thr... content/worker/worker_thread.cc:63: Send(new WorkerProcessHostMsg_ForceKillWorker()); On 2013/09/16 22:00:41, michaeln wrote: > Does it not work to have this line as the first and only line of > WorkerThread::Shutdown? It doesn't work. > > In the 'final release' sequence, there's an opportunity for the browser process > to change its mind about shutting down the child afterall. With current code, > when WorkerProcessHost::CanShutdown() returns false, the 'final release' doesn't > result in a shutdown. This CL changes that behavior and will kill the process > even if CanShutdown says 'false'. Not sure we want that behavior change. > How about overriding ChildThread::OnShutdown then? That should be called after the browser says that the process can be shut down.
On 2013/09/16 22:08:43, Mustafa Acer wrote: > https://codereview.chromium.org/23496052/diff/24001/content/worker/worker_thr... > File content/worker/worker_thread.cc (right): > > https://codereview.chromium.org/23496052/diff/24001/content/worker/worker_thr... > content/worker/worker_thread.cc:63: Send(new > WorkerProcessHostMsg_ForceKillWorker()); > On 2013/09/16 22:00:41, michaeln wrote: > > Does it not work to have this line as the first and only line of > > WorkerThread::Shutdown? > > It doesn't work. Will that's too bad, i guess we're too far along in the shutdown sequence to send IPCs at this point? > How about overriding ChildThread::OnShutdown then? That should be called after > the browser says that the process can be shut down. I can't help but notice that method is not virtual, then again OnMessageReceived() is virtual, so the WorkerThread derivative could handle the ChildProcessMsg_Shutdown message without ever calling base::OnMessageReceived(). I'd say it's probably better to override OnMessageReceived for this. Let's ask jam what he thinks?
> I can't help but notice that method is not virtual, then again > OnMessageReceived() is virtual, so the WorkerThread derivative could handle the > ChildProcessMsg_Shutdown message without ever calling base::OnMessageReceived(). > I'd say it's probably better to override OnMessageReceived for this. Not sure if I understand correctly. Can you see if patch #6 will work the way you described? ChildThread::OnMessageReceived is what will call ChildThread::OnShutdown on a ChildProcessMsg_Shutdown message. Since OnShutdown is now virtual, WorkerThread::OnShutdown will be called and kill the process. I can't see why OnShutdown should call base::OnMessageReceived. Maybe I'm missing something?
On 2013/09/16 22:44:21, Mustafa Emre Acer wrote: > > I can't help but notice that method is not virtual, then again > > OnMessageReceived() is virtual, so the WorkerThread derivative could handle > the > > ChildProcessMsg_Shutdown message without ever calling > base::OnMessageReceived(). > > I'd say it's probably better to override OnMessageReceived for this. > > Not sure if I understand correctly. Can you see if patch #6 will work the way > you described? > > ChildThread::OnMessageReceived is what will call ChildThread::OnShutdown on a > ChildProcessMsg_Shutdown message. Since OnShutdown is now virtual, > WorkerThread::OnShutdown will be called and kill the process. I can't see why > OnShutdown should call base::OnMessageReceived. Maybe I'm missing something? I was suggesting there's no need to make OnShutdown()virtual. If a derived class wants to process a message differently, it can override OnMessageReceived and handle it there. WorkerThread::OnMessageReceived could call its own nonvirtual OnShutdown method with zero changes to the ChildThread class.
> I was suggesting there's no need to make OnShutdown()virtual. If a derived class > wants to process a message differently, it can override OnMessageReceived and > handle it there. WorkerThread::OnMessageReceived could call its own nonvirtual > OnShutdown method with zero changes to the ChildThread class. Right, I misunderstood then :) Would you mind taking a look at the last patch? I'll rebase later on.
pending trybot happiness, this lgtm! https://codereview.chromium.org/23496052/diff/41001/content/browser/worker_ho... File content/browser/worker_host/worker_process_host.cc (right): https://codereview.chromium.org/23496052/diff/41001/content/browser/worker_ho... content/browser/worker_host/worker_process_host.cc:394: base::KillProcess( Is this the same low level api we use elsewhere to kill child processes (like from the taskmanager or that fast renderer exit code path)... just checking
On 2013/09/17 20:30:37, michaeln wrote: > pending trybot happiness, this lgtm! > > https://codereview.chromium.org/23496052/diff/41001/content/browser/worker_ho... > File content/browser/worker_host/worker_process_host.cc (right): > > https://codereview.chromium.org/23496052/diff/41001/content/browser/worker_ho... > content/browser/worker_host/worker_process_host.cc:394: base::KillProcess( > Is this the same low level api we use elsewhere to kill child processes (like > from the taskmanager or that fast renderer exit code path)... just checking Yes, taskmanager and other places uses this to kill processes (I can't find it in the fast renderer exit code path though).
+atwilson: Can you please review content/browser/worker_host/* and content/worker/* files? Context: We get crashes if there are threads doing work during a worker process exit. This change sends a sync IPC from the worker process to be killed by the browser, so that no cleanup is done.
Adding atwilson@ for real this time, please see my previous comment :)
lgtm, although I'm really not in a position to confirm whether it's OK to just kill the process without even *trying* to shutdown WebKit cleanly. John/Jochen - any objections?
I don't think that's the right solution. We shouldn't call from random threads into WebKit anyway (but only from the main thread). Also, we should fix all call sites to release WebKit objects and stop calling into WebKit before WebKit shuts down.
> We shouldn't call from random threads into WebKit anyway (but only from the main > thread). That ship sailed a long time ago. Parts of the WebKitAPI are intended to be called from background threads. > we should fix all call sites to release WebKit objects The problem is not calling into blink, its the other way around. Blink calling back out thru nulled out and deleted WebKitAPI interface pointers. This solution in the CL is better than writing code to carefully terminate and join webcore's background threads within WebKit::shutdown() without introducing a deadlock. Any solution of that form would likely not be correct and would practically speaking never be fully tested. Security vulnerabilities would still be there. This solution makes this class of shutdown bug impossible. Still lgtm.
On 2013/09/19 01:34:02, Andrew T Wilson wrote: > lgtm, although I'm really not in a position to confirm whether it's OK to just > kill the process without even *trying* to shutdown WebKit cleanly. John/Jochen - > any objections? 1) How much work is it to shut down threads in webkit cleanly so we don't do this band aid? 2) Is this really exploitable?
On 2013/10/02 20:13:11, jam wrote: > On 2013/09/19 01:34:02, Andrew T Wilson wrote: > > lgtm, although I'm really not in a position to confirm whether it's OK to just > > kill the process without even *trying* to shutdown WebKit cleanly. John/Jochen > - > > any objections? > > 1) How much work is it to shut down threads in webkit cleanly so we don't do > this band aid? > 2) Is this really exploitable? Conclusion from the email thread: - It seems there is still some risk of exploitation even though UAF happens during worker process exit - The actual mitigation (threads shutting down correctly) will land in M33 Should we land this patch in the meanwhile?
> Should we land this patch in the meanwhile? I've made my point of view pretty clear on the topic, so i assume you're not asking me. Jochen's been looking into rationalizing the unwinding/ending/joining of these webcore threads. Maybe he can better answer jam's question of "how much work is it" question? A quick chat with him turned up an interesting factoid... in google3 in some cases the preferred way of terminating background threads is to just not do that and to terminate the process instead... huh... how dirty of them ;)
i don't have a strong opinion. If you think this improves the situation until this is fixed, feel free to land it.
On 2013/10/15 23:49:26, jochen wrote: > i don't have a strong opinion. If you think this improves the situation until > this is fixed, feel free to land it. Considering that there is a risk of exploitation and that the patch solves a few similar issues, I've decided to land this soon. Once Jochen's work on shutting down threads correctly lands, everything should all be good.
Cris, can you take a look at content/common/worker_messages.h?
Adding the right cdn@
On 2013/11/01 01:47:13, Mustafa Emre Acer wrote: > Adding the right cdn@ LGTM for IPC changes with nit.
https://codereview.chromium.org/23496052/diff/41001/content/browser/worker_ho... File content/browser/worker_host/worker_process_host.cc (right): https://codereview.chromium.org/23496052/diff/41001/content/browser/worker_ho... content/browser/worker_host/worker_process_host.cc:395: process_->GetData().handle, RESULT_CODE_NORMAL_EXIT, false); nit: may want to check process_launched_ == true before this. Not sure if this can happen before the process launched or what you will get back as the handle if it does.
https://codereview.chromium.org/23496052/diff/41001/content/browser/worker_ho... File content/browser/worker_host/worker_process_host.cc (right): https://codereview.chromium.org/23496052/diff/41001/content/browser/worker_ho... content/browser/worker_host/worker_process_host.cc:395: process_->GetData().handle, RESULT_CODE_NORMAL_EXIT, false); On 2013/11/01 23:26:34, Cris Neckar wrote: > nit: may want to check process_launched_ == true before this. Not sure if this > can happen before the process launched or what you will get back as the handle > if it does. Done.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/meacer@chromium.org/23496052/108001
Retried try job too often on win_rel for step(s) browser_tests http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=win_rel&nu...
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/meacer@chromium.org/23496052/108001
Message was sent while issue was closed.
Change committed as 233099 |