|
|
Created:
4 years, 11 months ago by Xianzhu Modified:
4 years, 11 months ago CC:
chromium-reviews, klundberg+watch_chromium.org, mikecase+watch_chromium.org, yfriedman+watch_chromium.org, jbudorick+watch_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionFix crash in device_forwarder
There are many crashes in device_forwarder on bots, e.g.
https://build.chromium.org/p/tryserver.chromium.android/builders/linux_android_rel_ng/builds/5713/steps/stack_tool_for_tombstones/logs/stdio/text
The original method was incorrect because
ServiceDelegate::DeleteControllerOnInternalThread()
may execute after ServiceDelegate::~ServiceDelegate().
Committed: https://crrev.com/91b8d5ad02d99b59cb17e100eb645b2fefb4eb7f
Cr-Commit-Position: refs/heads/master@{#368466}
Patch Set 1 #
Total comments: 3
Messages
Total messages: 23 (8 generated)
wangxianzhu@chromium.org changed reviewers: + jbudorick@chromium.org, skyostil@chromium.org
wangxianzhu@chromium.org changed reviewers: + qinmin@chromium.org
lgtm
The CQ bit was checked by wangxianzhu@chromium.org
The CQ bit was unchecked by wangxianzhu@chromium.org
fwiw, most of the "crashes" on the bots are when we kill the device_forwarder. https://codereview.chromium.org/1571643003/diff/1/tools/android/forwarder2/de... File tools/android/forwarder2/device_forwarder_main.cc (left): https://codereview.chromium.org/1571643003/diff/1/tools/android/forwarder2/de... tools/android/forwarder2/device_forwarder_main.cc:56: // DeleteSoon() is not used here since it would imply reading |controller_| I hadn't reviewed yet because I'd like to look into this a bit more. Switching to DeleteSoon despite the previous comment is a little concerning.
fwiw, most of the "crashes" on the bots are when we kill the device_forwarder. https://codereview.chromium.org/1571643003/diff/1/tools/android/forwarder2/de... File tools/android/forwarder2/device_forwarder_main.cc (left): https://codereview.chromium.org/1571643003/diff/1/tools/android/forwarder2/de... tools/android/forwarder2/device_forwarder_main.cc:56: // DeleteSoon() is not used here since it would imply reading |controller_| I hadn't reviewed yet because I'd like to look into this a bit more. Switching to DeleteSoon despite the previous comment is a little concerning.
The CQ bit was checked by wangxianzhu@chromium.org
The CQ bit was unchecked by wangxianzhu@chromium.org
https://codereview.chromium.org/1571643003/diff/1/tools/android/forwarder2/de... File tools/android/forwarder2/device_forwarder_main.cc (left): https://codereview.chromium.org/1571643003/diff/1/tools/android/forwarder2/de... tools/android/forwarder2/device_forwarder_main.cc:56: // DeleteSoon() is not used here since it would imply reading |controller_| On 2016/01/08 21:41:08, jbudorick wrote: > I hadn't reviewed yet because I'd like to look into this a bit more. Switching > to DeleteSoon despite the previous comment is a little concerning. I don't think the previous comment is correct. The only risk of reading controller_ from the main thread is that ~ServerDelegate() is called at the same time of StartController(), but the original code could not avoid the risk. The risk is very low so I would just ignore it.
On 2016/01/08 21:41:08, jbudorick wrote: > fwiw, most of the "crashes" on the bots are when we kill the device_forwarder. Yes, but the crashes make it hard to find the real crashes from logcat dump and tombstone dump. Sometimes all 8 tomestones are device_forwarder crashes.
https://codereview.chromium.org/1571643003/diff/1/tools/android/forwarder2/de... File tools/android/forwarder2/device_forwarder_main.cc (left): https://codereview.chromium.org/1571643003/diff/1/tools/android/forwarder2/de... tools/android/forwarder2/device_forwarder_main.cc:56: // DeleteSoon() is not used here since it would imply reading |controller_| On 2016/01/08 21:47:03, Xianzhu wrote: > On 2016/01/08 21:41:08, jbudorick wrote: > > I hadn't reviewed yet because I'd like to look into this a bit more. Switching > > to DeleteSoon despite the previous comment is a little concerning. > > I don't think the previous comment is correct. The only risk of reading > controller_ from the main thread is that ~ServerDelegate() is called at the same > time of StartController(), but the original code could not avoid the risk. The > risk is very low so I would just ignore it. It looks like the original would avoid that risk by posting StartController to controller_thread_. I'm not crazy about this, but it does seem to be an improvement over the existing code. Could we add something like a WaitableEvent that gets set by DeleteControllerOnInternalThread after resetting controller_ that this dtor could wait on after posting the task?
On 2016/01/08 22:11:37, jbudorick wrote: > https://codereview.chromium.org/1571643003/diff/1/tools/android/forwarder2/de... > File tools/android/forwarder2/device_forwarder_main.cc (left): > > https://codereview.chromium.org/1571643003/diff/1/tools/android/forwarder2/de... > tools/android/forwarder2/device_forwarder_main.cc:56: // DeleteSoon() is not > used here since it would imply reading |controller_| > On 2016/01/08 21:47:03, Xianzhu wrote: > > On 2016/01/08 21:41:08, jbudorick wrote: > > > I hadn't reviewed yet because I'd like to look into this a bit more. > Switching > > > to DeleteSoon despite the previous comment is a little concerning. > > > > I don't think the previous comment is correct. The only risk of reading > > controller_ from the main thread is that ~ServerDelegate() is called at the > same > > time of StartController(), but the original code could not avoid the risk. The > > risk is very low so I would just ignore it. > > It looks like the original would avoid that risk by posting StartController to > controller_thread_. > > I'm not crazy about this, but it does seem to be an improvement over the > existing code. Could we add something like a WaitableEvent that gets set by > DeleteControllerOnInternalThread after resetting controller_ that this dtor > could wait on after posting the task? The simplest way to avoid all risks is to make ServerDelegate ref counted, so that it can only be deleted after all pending tasks finish. This is much simpler than using WaitableEvents.
On 2016/01/08 22:19:26, Xianzhu wrote: > On 2016/01/08 22:11:37, jbudorick wrote: > > > https://codereview.chromium.org/1571643003/diff/1/tools/android/forwarder2/de... > > File tools/android/forwarder2/device_forwarder_main.cc (left): > > > > > https://codereview.chromium.org/1571643003/diff/1/tools/android/forwarder2/de... > > tools/android/forwarder2/device_forwarder_main.cc:56: // DeleteSoon() is not > > used here since it would imply reading |controller_| > > On 2016/01/08 21:47:03, Xianzhu wrote: > > > On 2016/01/08 21:41:08, jbudorick wrote: > > > > I hadn't reviewed yet because I'd like to look into this a bit more. > > Switching > > > > to DeleteSoon despite the previous comment is a little concerning. > > > > > > I don't think the previous comment is correct. The only risk of reading > > > controller_ from the main thread is that ~ServerDelegate() is called at the > > same > > > time of StartController(), but the original code could not avoid the risk. > The > > > risk is very low so I would just ignore it. > > > > It looks like the original would avoid that risk by posting StartController to > > controller_thread_. > > > > I'm not crazy about this, but it does seem to be an improvement over the > > existing code. Could we add something like a WaitableEvent that gets set by > > DeleteControllerOnInternalThread after resetting controller_ that this dtor > > could wait on after posting the task? > > The simplest way to avoid all risks is to make ServerDelegate ref counted, so > that it can only be deleted after all pending tasks finish. This is much simpler > than using WaitableEvents. Works for me.
On 2016/01/08 22:20:47, jbudorick wrote: > On 2016/01/08 22:19:26, Xianzhu wrote: > > On 2016/01/08 22:11:37, jbudorick wrote: > > > > > > https://codereview.chromium.org/1571643003/diff/1/tools/android/forwarder2/de... > > > File tools/android/forwarder2/device_forwarder_main.cc (left): > > > > > > > > > https://codereview.chromium.org/1571643003/diff/1/tools/android/forwarder2/de... > > > tools/android/forwarder2/device_forwarder_main.cc:56: // DeleteSoon() is not > > > used here since it would imply reading |controller_| > > > On 2016/01/08 21:47:03, Xianzhu wrote: > > > > On 2016/01/08 21:41:08, jbudorick wrote: > > > > > I hadn't reviewed yet because I'd like to look into this a bit more. > > > Switching > > > > > to DeleteSoon despite the previous comment is a little concerning. > > > > > > > > I don't think the previous comment is correct. The only risk of reading > > > > controller_ from the main thread is that ~ServerDelegate() is called at > the > > > same > > > > time of StartController(), but the original code could not avoid the risk. > > The > > > > risk is very low so I would just ignore it. > > > > > > It looks like the original would avoid that risk by posting StartController > to > > > controller_thread_. > > > > > > I'm not crazy about this, but it does seem to be an improvement over the > > > existing code. Could we add something like a WaitableEvent that gets set by > > > DeleteControllerOnInternalThread after resetting controller_ that this dtor > > > could wait on after posting the task? > > > > The simplest way to avoid all risks is to make ServerDelegate ref counted, so > > that it can only be deleted after all pending tasks finish. This is much > simpler > > than using WaitableEvents. > > Works for me. Just looked into ref-counted path a bit and found that we still can't schedule a task to delete controller_ which holds a reference to the delegate because we are already in delegate's destructor. We need a much bigger change to make it work. WaitableEvent also can't resolve the root problem of asynchronous StartController(). The original intent of avoiding main thread access to controller_ is because controller_ might be written by StartController() at the same time. However if this happened, we would crash in a way much worse (read-after-free of the delegate's fields) than the little chance of multi-thread access to a scoped_ptr. I think the latest patch is the best choice because it fixes the crash in a simple way and doesn't increase any existing risk.
On 2016/01/08 23:02:59, Xianzhu wrote: > On 2016/01/08 22:20:47, jbudorick wrote: > > On 2016/01/08 22:19:26, Xianzhu wrote: > > > On 2016/01/08 22:11:37, jbudorick wrote: > > > > > > > > > > https://codereview.chromium.org/1571643003/diff/1/tools/android/forwarder2/de... > > > > File tools/android/forwarder2/device_forwarder_main.cc (left): > > > > > > > > > > > > > > https://codereview.chromium.org/1571643003/diff/1/tools/android/forwarder2/de... > > > > tools/android/forwarder2/device_forwarder_main.cc:56: // DeleteSoon() is > not > > > > used here since it would imply reading |controller_| > > > > On 2016/01/08 21:47:03, Xianzhu wrote: > > > > > On 2016/01/08 21:41:08, jbudorick wrote: > > > > > > I hadn't reviewed yet because I'd like to look into this a bit more. > > > > Switching > > > > > > to DeleteSoon despite the previous comment is a little concerning. > > > > > > > > > > I don't think the previous comment is correct. The only risk of reading > > > > > controller_ from the main thread is that ~ServerDelegate() is called at > > the > > > > same > > > > > time of StartController(), but the original code could not avoid the > risk. > > > The > > > > > risk is very low so I would just ignore it. > > > > > > > > It looks like the original would avoid that risk by posting > StartController > > to > > > > controller_thread_. > > > > > > > > I'm not crazy about this, but it does seem to be an improvement over the > > > > existing code. Could we add something like a WaitableEvent that gets set > by > > > > DeleteControllerOnInternalThread after resetting controller_ that this > dtor > > > > could wait on after posting the task? > > > > > > The simplest way to avoid all risks is to make ServerDelegate ref counted, > so > > > that it can only be deleted after all pending tasks finish. This is much > > simpler > > > than using WaitableEvents. > > > > Works for me. > > Just looked into ref-counted path a bit and found that we still can't schedule a > task to delete controller_ which holds a reference to the delegate because we > are already in delegate's destructor. We need a much bigger change to make it > work. > > WaitableEvent also can't resolve the root problem of asynchronous > StartController(). The original intent of avoiding main thread access to > controller_ is because controller_ might be written by StartController() at the > same time. However if this happened, we would crash in a way much worse > (read-after-free of the delegate's fields) than the little chance of > multi-thread access to a scoped_ptr. > > I think the latest patch is the best choice because it fixes the crash in a > simple way and doesn't increase any existing risk. Hrm, ok. Thanks for looking into this. lgtm
The CQ bit was checked by wangxianzhu@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1571643003/1 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1571643003/1
Message was sent while issue was closed.
Committed patchset #1 (id:1)
Message was sent while issue was closed.
Description was changed from ========== Fix crash in device_forwarder There are many crashes in device_forwarder on bots, e.g. https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi... The original method was incorrect because ServiceDelegate::DeleteControllerOnInternalThread() may execute after ServiceDelegate::~ServiceDelegate(). ========== to ========== Fix crash in device_forwarder There are many crashes in device_forwarder on bots, e.g. https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi... The original method was incorrect because ServiceDelegate::DeleteControllerOnInternalThread() may execute after ServiceDelegate::~ServiceDelegate(). Committed: https://crrev.com/91b8d5ad02d99b59cb17e100eb645b2fefb4eb7f Cr-Commit-Position: refs/heads/master@{#368466} ==========
Message was sent while issue was closed.
Patchset 1 (id:??) landed as https://crrev.com/91b8d5ad02d99b59cb17e100eb645b2fefb4eb7f Cr-Commit-Position: refs/heads/master@{#368466} |