|
|
Created:
10 years, 4 months ago by akalin Modified:
9 years, 7 months ago CC:
chromium-reviews, brettw-cc_chromium.org, Evan Martin Visibility:
Public. |
DescriptionDelete tasks unconditionally in MessageLoop::DeletePendingTasks().
BUG=6532
TEST=existing unit tests
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=54584
Patch Set 1 #
Total comments: 1
Messages
Total messages: 8 (0 generated)
+darin evmar is of the opinion that we should just delete tasks unconditionally and we can deal with the fallout. What do you think?
sure, the last time i tried to do this there were some mysterious crashes on the reliability bot. i was never able to track them down. it looked like deleting a task without the task being run was somehow causing memory corruption. the crash stack was unrelated to tasks. good luck!! LGTM :-) -darin On Wed, Jul 28, 2010 at 6:34 PM, <akalin@chromium.org> wrote: > Reviewers: darin, > > Message: > +darin > > evmar is of the opinion that we should just delete tasks unconditionally > and we > can deal with the fallout. What do you think? > > Description: > Delete tasks unconditionally in MessageLoop::DeletePendingTasks(). > > BUG=6532 > TEST=existing unit tests > > Please review this at http://codereview.chromium.org/3076015/show > > Affected files: > M base/message_loop.cc > > > Index: base/message_loop.cc > diff --git a/base/message_loop.cc b/base/message_loop.cc > index > 1d86f79d4d1a9f912d94fa093655d0c6c4f87618..d67e64ad76b586077eb953883b459d64ab06f90b > 100644 > --- a/base/message_loop.cc > +++ b/base/message_loop.cc > @@ -466,37 +466,18 @@ bool MessageLoop::DeletePendingTasks() { > // tasks. > AddToDelayedWorkQueue(pending_task); > } else { > - // TODO(darin): Delete all tasks once it is safe to do so. > - // Until it is totally safe, just do it when running Purify or > - // Valgrind. > -#if defined(PURIFY) > delete pending_task.task; > -#elif defined(OS_POSIX) > - if (RUNNING_ON_VALGRIND) > - delete pending_task.task; > -#endif // defined(OS_POSIX) > } > } > did_work |= !deferred_non_nestable_work_queue_.empty(); > while (!deferred_non_nestable_work_queue_.empty()) { > - // TODO(darin): Delete all tasks once it is safe to do so. > - // Until it is totaly safe, only delete them under Purify and > Valgrind. > - Task* task = NULL; > -#if defined(PURIFY) > - task = deferred_non_nestable_work_queue_.front().task; > -#elif defined(OS_POSIX) > - if (RUNNING_ON_VALGRIND) > - task = deferred_non_nestable_work_queue_.front().task; > -#endif > + delete deferred_non_nestable_work_queue_.front().task; > deferred_non_nestable_work_queue_.pop(); > - if (task) > - delete task; > } > did_work |= !delayed_work_queue_.empty(); > while (!delayed_work_queue_.empty()) { > - Task* task = delayed_work_queue_.top().task; > + delete delayed_work_queue_.top().task; > delayed_work_queue_.pop(); > - delete task; > } > return did_work; > } > > >
Please revert the change, and then implement the change per comments below. I believe the change listed here should cause crashes in a variety of scenarios. I think that the same task could easily be destroyed more than once. Mostly, try to use the existing code, only removing the ifdefs. Thanks, Jim
Oops... my comment didn't make it in my publication. I just entered it again. Please revert and use the existing code structure to avoid problems and crashes. Thanks, Jim http://codereview.chromium.org/3076015/diff/1/2 File base/message_loop.cc (right): http://codereview.chromium.org/3076015/diff/1/2#newcode479 base/message_loop.cc:479: delete delayed_work_queue_.top().task; A task deletion may have a side effect, which may push a new task onto a queue. As a result, you have to remove it from the queue before calling for the deletion. This was all carefully done in the existing code, and you should try to maintain that format (removing only the ifdefs.
Ah, I see. I'll do that, and add a comment so it won't get changed back again. On 2010/08/03 02:40:35, jar wrote: > Oops... my comment didn't make it in my publication. I just entered it again. > Please revert and use the existing code structure to avoid problems and crashes. > > Thanks, > > Jim > > http://codereview.chromium.org/3076015/diff/1/2 > File base/message_loop.cc (right): > > http://codereview.chromium.org/3076015/diff/1/2#newcode479 > base/message_loop.cc:479: delete delayed_work_queue_.top().task; > A task deletion may have a side effect, which may push a new task onto a queue. > As a result, you have to remove it from the queue before calling for the > deletion. This was all carefully done in the existing code, and you should try > to maintain that format (removing only the ifdefs.
The critical thing is the code in and around message loop is generally very carefully crafted as re-entrant code, working to not hold any state on the stack that could be corrupted by the side-effects of executing an arbitrary task. Any comment would really have to be written time and again throughout the file. In general, any changes to the message loop have to be very carefully reviewed. I'd say it is a wasted effort to add the "be careful: reentrant code" comment, other than putting a giant caveat at the top of the file, perhaps saying that this is such critical code, that perhaps we need more than a single code review before making any changes. When you make your planned change, even if you do it "correctly" (following the pattern supplied), there is a very good chance that something very bad will happen. Specifically, the deletion of tasks without executing them may leave all sorts of components in unstable states. As the task is deleted, the resources held for use as task arguments (via reference counting) will also be deleted. Unfortunately, it is not clear that it is safe to delete the arguments without having run the associated task <ut oh>. In addition, the action of deleting the task, causing arguments to be deleted, may indeed have side effects, such as the posting of more tasks! In the extreme, deleting of tasks is almost as dangerous as running tasks, which of course can spawn new tasks, and preclude any shutdown ever <oops>. That sort of response will show up as hung browsers (infinite loop of re-posting tasks), causing users to be forced to manually kill browsers, and then we'll see our crash rate sky rocket. Note that the limited unit tests will probably never reveal the complex set of tasks that may be pending in the field :-(. IMO, it is a dangerous thing to try to delete these tasks as you are proposing. As Darin says, "Good luck!" You need to watch the crash rate carefully, and be ready to completely revert to the current status quo of not deleting these tasks at shutdown (which causes finite shutdown leaks... and no recurring leaks... but no known crashes... and that safety may be pretty nearly proved). There is also a reasonable chance that once this change is made, some future otherwise innocuous change will induce the infinite loop of tasks re-posts, and that will be a rather hard problem to diagnose in the field (other than seeing the crash rate jump due to manual process terminations!). Soooo.... all I can say is "Good Luck!" I'd strongly recommend against this change (no matter how well crafted)... but I think I said that in the bug already. Jim On Mon, Aug 2, 2010 at 8:05 PM, <akalin@chromium.org> wrote: > Ah, I see. I'll do that, and add a comment so it won't get changed back > again. > > > On 2010/08/03 02:40:35, jar wrote: > >> Oops... my comment didn't make it in my publication. I just entered it >> again. >> > > Please revert and use the existing code structure to avoid problems and >> > crashes. > > Thanks, >> > > Jim >> > > http://codereview.chromium.org/3076015/diff/1/2 >> File base/message_loop.cc (right): >> > > http://codereview.chromium.org/3076015/diff/1/2#newcode479 >> base/message_loop.cc:479: delete delayed_work_queue_.top().task; >> A task deletion may have a side effect, which may push a new task onto a >> > queue. > >> As a result, you have to remove it from the queue before calling for the >> deletion. This was all carefully done in the existing code, and you >> should >> > try > >> to maintain that format (removing only the ifdefs. >> > > > > http://codereview.chromium.org/3076015/show >
This makes me stop and think... what's the purpose of this change? If anything, I think we should change shutdown to do the following: 1) run all critical shutdown tasks (save user data to disk), 2) terminate the browser process. We don't have any need to free memory carefully at shutdown, other than to pacify valgrind ;-) -Darin On Mon, Aug 2, 2010 at 10:17 PM, Jim Roskind <jar@chromium.org> wrote: > The critical thing is the code in and around message loop is generally very > carefully crafted as re-entrant code, working to not hold any state on the > stack that could be corrupted by the side-effects of executing an arbitrary > task. Any comment would really have to be written time and again throughout > the file. In general, any changes to the message loop have to be very > carefully reviewed. > > I'd say it is a wasted effort to add the "be careful: reentrant code" > comment, other than putting a giant caveat at the top of the file, perhaps > saying that this is such critical code, that perhaps we need more than a > single code review before making any changes. > > When you make your planned change, even if you do it "correctly" (following > the pattern supplied), there is a very good chance that something very bad > will happen. Specifically, the deletion of tasks without executing them may > leave all sorts of components in unstable states. As the task is deleted, > the resources held for use as task arguments (via reference counting) will > also be deleted. Unfortunately, it is not clear that it is safe to delete > the arguments without having run the associated task <ut oh>. In addition, > the action of deleting the task, causing arguments to be deleted, may indeed > have side effects, such as the posting of more tasks! In the extreme, > deleting of tasks is almost as dangerous as running tasks, which of course > can spawn new tasks, and preclude any shutdown ever <oops>. That sort of > response will show up as hung browsers (infinite loop of re-posting tasks), > causing users to be forced to manually kill browsers, and then we'll see our > crash rate sky rocket. > > Note that the limited unit tests will probably never reveal the complex set > of tasks that may be pending in the field :-(. > > IMO, it is a dangerous thing to try to delete these tasks as you are > proposing. As Darin says, "Good luck!" You need to watch the crash rate > carefully, and be ready to completely revert to the current status quo of > not deleting these tasks at shutdown (which causes finite shutdown leaks... > and no recurring leaks... but no known crashes... and that safety may be > pretty nearly proved). > > There is also a reasonable chance that once this change is made, some > future otherwise innocuous change will induce the infinite loop of tasks > re-posts, and that will be a rather hard problem to diagnose in the field > (other than seeing the crash rate jump due to manual process terminations!). > > Soooo.... all I can say is "Good Luck!" I'd strongly recommend against > this change (no matter how well crafted)... but I think I said that in the > bug already. > > Jim > > > On Mon, Aug 2, 2010 at 8:05 PM, <akalin@chromium.org> wrote: > >> Ah, I see. I'll do that, and add a comment so it won't get changed back >> again. >> >> >> On 2010/08/03 02:40:35, jar wrote: >> >>> Oops... my comment didn't make it in my publication. I just entered it >>> again. >>> >> >> Please revert and use the existing code structure to avoid problems and >>> >> crashes. >> >> Thanks, >>> >> >> Jim >>> >> >> http://codereview.chromium.org/3076015/diff/1/2 >>> File base/message_loop.cc (right): >>> >> >> http://codereview.chromium.org/3076015/diff/1/2#newcode479 >>> base/message_loop.cc:479: delete delayed_work_queue_.top().task; >>> A task deletion may have a side effect, which may push a new task onto a >>> >> queue. >> >>> As a result, you have to remove it from the queue before calling for the >>> deletion. This was all carefully done in the existing code, and you >>> should >>> >> try >> >>> to maintain that format (removing only the ifdefs. >>> >> >> >> >> http://codereview.chromium.org/3076015/show >> > >
+1: Faster shutdown, with OS doing process cleanup, seems like a real win. This change can only slow shutdown. Jim On Mon, Aug 2, 2010 at 11:32 PM, Darin Fisher <darin@chromium.org> wrote: > This makes me stop and think... what's the purpose of this change? If > anything, I think we should change shutdown to do the following: 1) run all > critical shutdown tasks (save user data to disk), 2) terminate the browser > process. We don't have any need to free memory carefully at shutdown, other > than to pacify valgrind ;-) > > -Darin > > > On Mon, Aug 2, 2010 at 10:17 PM, Jim Roskind <jar@chromium.org> wrote: > >> The critical thing is the code in and around message loop is generally >> very carefully crafted as re-entrant code, working to not hold any state on >> the stack that could be corrupted by the side-effects of executing an >> arbitrary task. Any comment would really have to be written time and again >> throughout the file. In general, any changes to the message loop have to be >> very carefully reviewed. >> >> I'd say it is a wasted effort to add the "be careful: reentrant code" >> comment, other than putting a giant caveat at the top of the file, perhaps >> saying that this is such critical code, that perhaps we need more than a >> single code review before making any changes. >> >> When you make your planned change, even if you do it "correctly" >> (following the pattern supplied), there is a very good chance that something >> very bad will happen. Specifically, the deletion of tasks without executing >> them may leave all sorts of components in unstable states. As the task is >> deleted, the resources held for use as task arguments (via reference >> counting) will also be deleted. Unfortunately, it is not clear that it is >> safe to delete the arguments without having run the associated task <ut oh>. >> In addition, the action of deleting the task, causing arguments to be >> deleted, may indeed have side effects, such as the posting of more tasks! >> In the extreme, deleting of tasks is almost as dangerous as running tasks, >> which of course can spawn new tasks, and preclude any shutdown ever <oops>. >> That sort of response will show up as hung browsers (infinite loop of >> re-posting tasks), causing users to be forced to manually kill browsers, and >> then we'll see our crash rate sky rocket. >> >> Note that the limited unit tests will probably never reveal the complex >> set of tasks that may be pending in the field :-(. >> >> IMO, it is a dangerous thing to try to delete these tasks as you are >> proposing. As Darin says, "Good luck!" You need to watch the crash rate >> carefully, and be ready to completely revert to the current status quo of >> not deleting these tasks at shutdown (which causes finite shutdown leaks... >> and no recurring leaks... but no known crashes... and that safety may be >> pretty nearly proved). >> >> There is also a reasonable chance that once this change is made, some >> future otherwise innocuous change will induce the infinite loop of tasks >> re-posts, and that will be a rather hard problem to diagnose in the field >> (other than seeing the crash rate jump due to manual process terminations!). >> >> Soooo.... all I can say is "Good Luck!" I'd strongly recommend against >> this change (no matter how well crafted)... but I think I said that in the >> bug already. >> >> Jim >> >> >> On Mon, Aug 2, 2010 at 8:05 PM, <akalin@chromium.org> wrote: >> >>> Ah, I see. I'll do that, and add a comment so it won't get changed back >>> again. >>> >>> >>> On 2010/08/03 02:40:35, jar wrote: >>> >>>> Oops... my comment didn't make it in my publication. I just entered it >>>> again. >>>> >>> >>> Please revert and use the existing code structure to avoid problems and >>>> >>> crashes. >>> >>> Thanks, >>>> >>> >>> Jim >>>> >>> >>> http://codereview.chromium.org/3076015/diff/1/2 >>>> File base/message_loop.cc (right): >>>> >>> >>> http://codereview.chromium.org/3076015/diff/1/2#newcode479 >>>> base/message_loop.cc:479: delete delayed_work_queue_.top().task; >>>> A task deletion may have a side effect, which may push a new task onto a >>>> >>> queue. >>> >>>> As a result, you have to remove it from the queue before calling for the >>>> deletion. This was all carefully done in the existing code, and you >>>> should >>>> >>> try >>> >>>> to maintain that format (removing only the ifdefs. >>>> >>> >>> >>> >>> http://codereview.chromium.org/3076015/show >>> >> >> > |