|
|
Created:
7 years, 2 months ago by hubbe Modified:
6 years, 10 months ago CC:
chromium-reviews, darin-cc_chromium.org, jam, Fredrik Öhrn Base URL:
svn://svn.chromium.org/chrome/trunk/src Visibility:
Public. |
DescriptionFix posix IPC channel hanging problem.
If a channel closes right before a send call, listeners might not be notified of
the problem, which can cause hangs. This CL fixes that and adds a test that makes
sure that this does not happen in the future.
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=248060
Patch Set 1 #Patch Set 2 : actual fix #
Total comments: 2
Patch Set 3 : added another test #Patch Set 4 : make new test pass #
Total comments: 4
Patch Set 5 : git cl try #Patch Set 6 : try calling OnChannelError only once #Patch Set 7 : early-fail send after close #Patch Set 8 : LOG(INFO) -> LOG(WARNING) to make presubmit happy #Messages
Total messages: 36 (0 generated)
Not sure if this ever happens, except when the mac kernel goes berserk. But it occurred to me that there was two problems: 1 the mac kernel had a bug, 2 the IPC layer got stuck instead of dying gracefully. This CL fixes #2. The first upload contains the test, should fail on trybots, the second upload contains the fix, which should make the test work again.
plus scott
On 2013/10/18 23:23:25, cpu wrote: > plus scott Ping?
LGTM, I think. I'm a little out of my depth, though. https://codereview.chromium.org/30133002/diff/40001/ipc/ipc_channel_posix.cc File ipc/ipc_channel_posix.cc (right): https://codereview.chromium.org/30133002/diff/40001/ipc/ipc_channel_posix.cc#... ipc/ipc_channel_posix.cc:713: return ProcessOutgoingMessages(); Could this have the same race condition if the other end closes after accept but before ProcessOutgoingMessages()?
https://codereview.chromium.org/30133002/diff/40001/ipc/ipc_channel_posix.cc File ipc/ipc_channel_posix.cc (right): https://codereview.chromium.org/30133002/diff/40001/ipc/ipc_channel_posix.cc#... ipc/ipc_channel_posix.cc:713: return ProcessOutgoingMessages(); On 2013/10/22 23:52:46, shess wrote: > Could this have the same race condition if the other end closes after accept but > before ProcessOutgoingMessages()? Yes, I think so. I'll see if I can trigger this in a test.
lgtm
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/hubbe@chromium.org/30133002/40001
Retried try job too often on win_rel for step(s) browser_tests http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=win_rel&nu...
is this CL abandoned?
On 2013/11/11 23:15:57, cpu wrote: > is this CL abandoned? No, but it's been on an extended vacation since I've been quite busy with other stuff. I'll pick it up again soon.
On 2013/11/11 23:41:58, hubbe wrote: > On 2013/11/11 23:15:57, cpu wrote: > > is this CL abandoned? > > No, but it's been on an extended vacation since I've been quite busy with other > stuff. > I'll pick it up again soon. Added test & fix for the other race that shess pointed out. PTAL. /Hubbe
lgtm
lgtm. https://codereview.chromium.org/30133002/diff/440001/ipc/ipc_channel_posix_un... File ipc/ipc_channel_posix_unittest.cc (right): https://codereview.chromium.org/30133002/diff/440001/ipc/ipc_channel_posix_un... ipc/ipc_channel_posix_unittest.cc:252: in_chan.Close(); // simulate remote process dying at an unforunate time. sp "unfortunate". https://codereview.chromium.org/30133002/diff/440001/ipc/ipc_channel_posix_un... ipc/ipc_channel_posix_unittest.cc:261: // If a connection closes right before an Connect() call, "a Connect() call". Also paragraph formatting looks weird WRT previous test's comment.
https://codereview.chromium.org/30133002/diff/440001/ipc/ipc_channel_posix_un... File ipc/ipc_channel_posix_unittest.cc (right): https://codereview.chromium.org/30133002/diff/440001/ipc/ipc_channel_posix_un... ipc/ipc_channel_posix_unittest.cc:252: in_chan.Close(); // simulate remote process dying at an unforunate time. On 2013/11/13 21:33:22, shess wrote: > sp "unfortunate". Done. https://codereview.chromium.org/30133002/diff/440001/ipc/ipc_channel_posix_un... ipc/ipc_channel_posix_unittest.cc:261: // If a connection closes right before an Connect() call, On 2013/11/13 21:33:22, shess wrote: > "a Connect() call". Also paragraph formatting looks weird WRT previous test's > comment. Done.
On 2013/11/20 20:21:21, hubbe wrote: > https://codereview.chromium.org/30133002/diff/440001/ipc/ipc_channel_posix_un... > File ipc/ipc_channel_posix_unittest.cc (right): > > https://codereview.chromium.org/30133002/diff/440001/ipc/ipc_channel_posix_un... > ipc/ipc_channel_posix_unittest.cc:252: in_chan.Close(); // simulate remote > process dying at an unforunate time. > On 2013/11/13 21:33:22, shess wrote: > > sp "unfortunate". > > Done. > > https://codereview.chromium.org/30133002/diff/440001/ipc/ipc_channel_posix_un... > ipc/ipc_channel_posix_unittest.cc:261: // If a connection closes right before an > Connect() call, > On 2013/11/13 21:33:22, shess wrote: > > "a Connect() call". Also paragraph formatting looks weird WRT previous test's > > comment. > > Done. I had to do one more (minor) things to make all tests pass. Please take a look before I submit.
lgtm
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/hubbe@chromium.org/30133002/730001
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/hubbe@chromium.org/30133002/730001
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/hubbe@chromium.org/30133002/730001
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/hubbe@chromium.org/30133002/730001
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/hubbe@chromium.org/30133002/730001
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/hubbe@chromium.org/30133002/730001
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/hubbe@chromium.org/30133002/730001
Retried try job too often on chromium_presubmit for step(s) presubmit http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=chromium_p...
Any news on this? https://codereview.chromium.org/131513018/ wants to fix the same thing (I think).
I haven't given up, but fixing things is harder than I thought, because it is sometimes difficult to detect if the ipc channel is not yet open or have already been closed. On Wed, Jan 29, 2014 at 11:13 AM, <shess@chromium.org> wrote: > Any news on this? > > https://codereview.chromium.org/131513018/ wants to fix the same thing (I > think). > > https://codereview.chromium.org/30133002/ > To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
On 2014/01/29 19:13:51, shess wrote: > Any news on this? > > https://codereview.chromium.org/131513018/ wants to fix the same thing (I > think). Correct. It triggers a crash in Opera for Android (crbug.com/338709) so we'd really like to see it fixed.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/hubbe@chromium.org/30133002/730001
On 2014/01/30 09:45:25, Fredrik Öhrn wrote: > On 2014/01/29 19:13:51, shess wrote: > > Any news on this? > > > > https://codereview.chromium.org/131513018/ wants to fix the same thing (I > > think). > > Correct. > > It triggers a crash in Opera for Android (crbug.com/338709) so we'd really like > to see it fixed. I thought there was some more work to do on this, but now that I look at it I cannot remember what it was, and tests seem to pass, so I'm going to try to submit again.
Retried try job too often on chromium_presubmit for step(s) presubmit http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=chromium_p...
CQ bit was unchecked on CL. Ignoring.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/hubbe@chromium.org/30133002/750001
Message was sent while issue was closed.
Change committed as 248060
Message was sent while issue was closed.
CQ bit was unchecked on CL. Ignoring.
Message was sent while issue was closed.
A revert of this CL has been created in https://codereview.chromium.org/150883002/ by hubbe@chromium.org. The reason for reverting is: Seems to cause some leaks. Reverting while I investigate. http://build.chromium.org/p/chromium.memory/builders/Linux%20ASAN%20Tests%20%... . |