|
|
Created:
11 years, 1 month ago by Scott Hess - ex-Googler Modified:
9 years, 6 months ago CC:
chromium-reviews_googlegroups.com, John Grabowski Visibility:
Public. |
Description[Mac] Delay TabContents::Close() when event-tracking.
The close is delayed until the main event loop restarts.
Renderers can cause UI state changes which can badly break
event-tracking loops. The basic pattern is "Run JavaScript to close
window after a timeout, and start event-tracking loop and keep it
going across the timeout." Things crash when UI elements attempt to
refer to freed objects. Examples:
- Last tab in a window closes while dragging the window.
- Last tab in a window closes while bookmark bar context menu visible.
- Last tab in a window closes while download shelf context menu visible.
- Tab closes while dragging a link over the tab.
- Tab closes while dragging a link from the tab.
- Tab closes while back/forward context menu is open.
- Tab closes while click-hold in the tab's close button.
- Tab closes while closing info bar.
- Tab closes while tab context menu is visible.
- Probably more I'm not aware of.
Supersedes (and reverts) previous fix for issues 25462 and 25465.
BUG=25462, 25463, 25465, 26135, 26136, 26137, 25467
TEST=See bugs for repro cases.
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=31183
Patch Set 1 #
Total comments: 15
Patch Set 2 : Mark and dmac suggestions. #Patch Set 3 : 0.0 is supposed to work. #Patch Set 4 : Match nameing from dmac's change. #Patch Set 5 : Merged with dmac's change. #Patch Set 6 : Oops, included file moved. #
Messages
Total messages: 25 (0 generated)
Superceeds http://codereview.chromium.org/348047
On 2009/11/04 22:51:14, shess wrote: > Superceeds http://codereview.chromium.org/348047 For the commit comment: supersedes
On 2009/11/04 22:55:35, Avi wrote: > On 2009/11/04 22:51:14, shess wrote: > > Superceeds http://codereview.chromium.org/348047 > > For the commit comment: supersedes http://www.merriam-webster.com/dictionary/supercede > Supercede has occurred as a spelling variant of supersede > since the 17th century, and it is common in current > published writing. It continues, however, to be widely > regarded as an error. Darn, _so_ close.
take a look at http://codereview.chromium.org/345051 which may fix this as well. Testing now.
On 2009/11/04 23:18:01, dmac wrote: > take a look at http://codereview.chromium.org/345051 which may fix this as > well. Testing now. So http://codereview.chromium.org/345051 would fix 25462, 25463 and 25465. I'd have to look closely at the others to see what's going on there.
The problem in most of these is that the Chrome-side objects unwind while the Cocoa-side objects are still retained in various ways. There are some cases where the target still exists but the C++ object it wraps does not, and other cases where the target is owned by a C++ object, so it doesn't even exist. This change effectively pins the tab until it is safe to let things go. -scott On Wed, Nov 4, 2009 at 5:06 PM, <dmaclach@chromium.org> wrote: > On 2009/11/04 23:18:01, dmac wrote: >> >> take a look at http://codereview.chromium.org/345051 which may fix this >> =A0as >> well. Testing now. > > So http://codereview.chromium.org/345051 would fix 25462, 25463 and 25465= . > I'd > have to look closely at the others to see what's going on there. > > http://codereview.chromium.org/362013 >
Adding myself to the cc here (as if I wasn't working on enough already)
Understood. Keeping stuff alive in the top level autorelease pool should accomplish and should fix this more generally (and not have to do interesting things with delay:0.002. On Nov 4, 2009, at 4:16 PM, Scott Hess wrote: > The problem in most of these is that the Chrome-side objects unwind > while the Cocoa-side objects are still retained in various ways. > There are some cases where the target still exists but the C++ object > it wraps does not, and other cases where the target is owned by a C++ > object, so it doesn't even exist. This change effectively pins the > tab until it is safe to let things go. > > -scott > > > On Wed, Nov 4, 2009 at 5:06 PM, <dmaclach@chromium.org> wrote: >> On 2009/11/04 23:18:01, dmac wrote: >>> >>> take a look at http://codereview.chromium.org/345051 which may fix >>> this >>> as >>> well. Testing now. >> >> So http://codereview.chromium.org/345051 would fix 25462, 25463 and >> 25465. >> I'd >> have to look closely at the others to see what's going on there. >> >> http://codereview.chromium.org/362013 >>
http://codereview.chromium.org/362013/diff/1/8 File chrome/browser/tab_contents/tab_contents_view_mac.mm (right): http://codereview.chromium.org/362013/diff/1/8#newcode286 Line 286: afterDelay:0.02]; I swear to goodness that when I was setting this to 0, it was firing too early. But dmac's remark about funky 0.002 made me go back and test, and now I wonder if it wasn't just twitchy mouse fingers? The 0.02 value comes from data-mining other uses in this directory, but I have no justification for it. 0 seems to work, and is certainly easier to justify.
http://codereview.chromium.org/362013/diff/1/3 File chrome/browser/chrome_application_mac.mm (right): http://codereview.chromium.org/362013/diff/1/3#newcode260 Line 260: isEventTracking_ = NO; Guard against nested -sendEvent: calls. At the entry to this method, save off lastIsEventTracking = isEventTracking_ before setting isEventTracking_ to YES. Then here, restore isEventTracking_ = lastIsEventTracking. http://codereview.chromium.org/362013/diff/1/8 File chrome/browser/tab_contents/tab_contents_view_mac.mm (right): http://codereview.chromium.org/362013/diff/1/8#newcode286 Line 286: afterDelay:0.02]; Yeah, 0.02?!
Scott, we need to decide what to do with the CL vs my CL (http://codereview.chromium.org/345051). I think my CL plus some minor changes elsewhere will cover all the bugs that your CL covers, plus a couple on my side. It also should start to remove the need to ever have to worry about calling things like [performSelector:@selector(close:) ... withDelay:n]. Our two solutions are similar in some ways, but I think that mine is going more to the root of the problem which is when ObjC objects are being released vis-a-vis when their C++ counterparts are disappearing. I don't want us to be unnecessarily duplicating work. What do you think?
Maybe I need to patch your change in and play with it, because I don't see how your change really covers the same stuff. With my change, when you're event tracking, the tab stays there until you are done. So if you are in the tab right-click menu, you have to dismiss the right-click menu before the tab can close. This does mean that you might make a selection from the menu which will be ignored, but the menu makes sense while it's up. AFAICT, with your change the tab (or window) might disappear out from under you, so now you have a menu which applies to an object you cannot see, though it still exists because it hasn't been autoreleased. Additionally, my change pins the Chrome-side objects in memory. AFAICT, your change affects the Cocoa-side objects which are autoreleased. But my change is dealing with cases where a Chrome object is being destructed and releasing references to a Cocoa object, but where the crash is when the Cocoa object references the Chrome object via a weak reference. So we'd still have to audit the code for places where we need to clear weak references and check them before dereferencing them. Basically, I don't think there's a downside to having both changes. They address an overlapping set of problem cases. Your change addresses cases which stem from UI controls initiating object-graph destruction, mine addresses cases which stem from renderers or plug-ins initiating destruction. -scott On Wed, Nov 4, 2009 at 8:28 PM, <dmaclach@chromium.org> wrote: > Scott, we need to decide what to do with the CL vs my CL > (http://codereview.chromium.org/345051). > > I think my CL plus some minor changes elsewhere will cover all the bugs that > your CL covers, plus a couple on my side. It also should start to remove the > need to ever have to worry about calling things like > [performSelector:@selector(close:) ... withDelay:n]. Our two solutions are > similar in some ways, but I think that mine is going more to the root of the > problem which is when ObjC objects are being released vis-a-vis when their > C++ > counterparts are disappearing. I don't want us to be unnecessarily > duplicating > work. What do you think? > > http://codereview.chromium.org/362013 >
Hmm, as a specific for-instance of a case which my change doesn't currently address, but which I will work on soon, is tab-dragging when the tab closes out from under you. The tab that you're dragging disappears, but the window and tab-contents continue to survive and be draggable. Nothing crashes, because deallocated objects aren't referenced, but having someone partially deconstruct the window while dragging is inappropriate. The right solution is to either cancel the drag and yank the tab, or pin the tab up until you're done dragging it, THEN yank it. There are a handful of those that can either be done ad-hoc, or they can be done with something like my change. From talking to Darin, I think it's reasonable to expect my change to have cross-platform application (see the dragging stanza right belong my new code in TabContents::Close()). The NSObject delayed send currently in there is a hack. I spoke with Darin about this, and he described how to do it using PostNonNestableTask(), which does things very directly without the bridging around. But in practice our message loop isn't working the way he describes. As I understand it, PostNonNestableTask() is supposed to post items which will be processed when we get back out to the main event loop (from -sendEvent:), but we're processing them during event-tracking loops. Kind of the same problem as you're solving, but from the Chrome side. Since DeleteSoon() and ReleaseSoon() are implemented in terms of PostNonNestableTask(), this may be the root cause of some ownership things that we've had to fix in ad-hoc fashion. -scott On Wed, Nov 4, 2009 at 9:41 PM, Scott Hess <shess@chromium.org> wrote: > Maybe I need to patch your change in and play with it, because I don't > see how your change really covers the same stuff. =A0With my change, > when you're event tracking, the tab stays there until you are done. > So if you are in the tab right-click menu, you have to dismiss the > right-click menu before the tab can close. =A0This does mean that you > might make a selection from the menu which will be ignored, but the > menu makes sense while it's up. =A0AFAICT, with your change the tab (or > window) might disappear out from under you, so now you have a menu > which applies to an object you cannot see, though it still exists > because it hasn't been autoreleased. > > Additionally, my change pins the Chrome-side objects in memory. > AFAICT, your change affects the Cocoa-side objects which are > autoreleased. =A0But my change is dealing with cases where a Chrome > object is being destructed and releasing references to a Cocoa object, > but where the crash is when the Cocoa object references the Chrome > object via a weak reference. =A0So we'd still have to audit the code for > places where we need to clear weak references and check them before > dereferencing them. > > Basically, I don't think there's a downside to having both changes. > They address an overlapping set of problem cases. =A0Your change > addresses cases which stem from UI controls initiating object-graph > destruction, mine addresses cases which stem from renderers or > plug-ins initiating destruction. > > -scott > > > > On Wed, Nov 4, 2009 at 8:28 PM, =A0<dmaclach@chromium.org> wrote: >> Scott, we need to decide what to do with the CL vs my CL >> (http://codereview.chromium.org/345051). >> >> I think my CL plus some minor changes elsewhere will cover all the bugs = that >> your CL covers, plus a couple on my side. It also should start to remove= the >> need to ever have to worry about calling things like >> [performSelector:@selector(close:) ... withDelay:n]. Our two solutions a= re >> similar in some ways, but I think that mine is going more to the root of= the >> problem which is when ObjC objects are being released vis-a-vis when the= ir >> C++ >> counterparts are disappearing. I don't want us to be unnecessarily >> duplicating >> work. What do you think? >> >> http://codereview.chromium.org/362013 >> >
http://codereview.chromium.org/362013/diff/1/2 File chrome/browser/chrome_application_mac.h (right): http://codereview.chromium.org/362013/diff/1/2#newcode13 Line 13: BOOL isEventTracking_; @private http://codereview.chromium.org/362013/diff/1/8 File chrome/browser/tab_contents/tab_contents_view_mac.mm (right): http://codereview.chromium.org/362013/diff/1/8#newcode75 Line 75: [cocoa_view_.get() cancelDeferredClose]; do you need a get() here? http://codereview.chromium.org/362013/diff/1/8#newcode282 Line 282: TabContentsViewCocoa* view = cocoa_view_.get(); do you need the get? or could you just use [cocoa_view_ cancelDeferredClose] etc. http://codereview.chromium.org/362013/diff/1/8#newcode516 Line 516: [NSObject cancelPreviousPerformRequestsWithTarget:self]; do you want to cancel all requests or is + (void)cancelPreviousPerformRequestsWithTarget:(id)aTarget selector:(SEL)aSelector object:(id)anArgument; more appropriate so you can just target closes?
[Have not compiled the uploaded CL in the interests of getting to bed.] http://codereview.chromium.org/362013/diff/1/2 File chrome/browser/chrome_application_mac.h (right): http://codereview.chromium.org/362013/diff/1/2#newcode13 Line 13: BOOL isEventTracking_; On 2009/11/05 06:13:57, dmac wrote: > @private Done. [This bit might want some refactoring between my change and your change. Wouldn't seem to be any particular reason to have multiple ways to say "Am I in -sendEvent:?". IMHO can be a TODO() added by whoever gets to merge things.] http://codereview.chromium.org/362013/diff/1/3 File chrome/browser/chrome_application_mac.mm (right): http://codereview.chromium.org/362013/diff/1/3#newcode260 Line 260: isEventTracking_ = NO; On 2009/11/05 02:51:21, Mark Mentovai wrote: > Guard against nested -sendEvent: calls. At the entry to this method, save off > lastIsEventTracking = isEventTracking_ before setting isEventTracking_ to YES. > Then here, restore isEventTracking_ = lastIsEventTracking. Really? Ick. OK. http://codereview.chromium.org/362013/diff/1/8 File chrome/browser/tab_contents/tab_contents_view_mac.mm (right): http://codereview.chromium.org/362013/diff/1/8#newcode75 Line 75: [cocoa_view_.get() cancelDeferredClose]; On 2009/11/05 06:13:57, dmac wrote: > do you need a get() here? Do not know. I'm sure there is a nice rule which indicates when to message directly and when to use get(), but darned if I can figure it out. The other code in this file does [cocoa_view_.get() message] 11 times to [cocoa_view_ message] 6 times. I'll make the suggested change, but I'm not at the office so I won't get to check out its operation until tomorrow. http://codereview.chromium.org/362013/diff/1/8#newcode282 Line 282: TabContentsViewCocoa* view = cocoa_view_.get(); On 2009/11/05 06:13:57, dmac wrote: > do you need the get? or could you just use [cocoa_view_ cancelDeferredClose] > etc. Done. http://codereview.chromium.org/362013/diff/1/8#newcode286 Line 286: afterDelay:0.02]; On 2009/11/05 02:51:21, Mark Mentovai wrote: > Yeah, 0.02?! So ... is this suggesting that 0.0 should work fine? Like I said, I thought earlier that it didn't work, which supported the cases I was finding where a small value (0.02) was passed instead of 0.0. But when I tested it today, 0.0 was fine, so now I suspect that the 0.02 cases I've found are red herring. I'm willing to do the right thing here, but the docs say that 0 "does not necessarily" call immediately, which is not nearly so heartening as a straight-up statement would be. Again, please be specific if you recommend 0.0. http://codereview.chromium.org/362013/diff/1/8#newcode516 Line 516: [NSObject cancelPreviousPerformRequestsWithTarget:self]; On 2009/11/05 06:13:57, dmac wrote: > do you want to cancel all requests or is + > (void)cancelPreviousPerformRequestsWithTarget:(id)aTarget > selector:(SEL)aSelector object:(id)anArgument; more appropriate so you can just > target closes? Honestly, I used the shorter version because the more specific version was annoying to format for no apparent gain. Changed to use the more specific version.
On 2009/11/05 06:41:13, shess wrote: > On 2009/11/05 06:13:57, dmac wrote: > > do you need a get() here? > > Do not know. I'm sure there is a nice rule which indicates when to message > directly and when to use get(), but darned if I can figure it out. The other > code in this file does [cocoa_view_.get() message] 11 times to [cocoa_view_ > message] 6 times. > > I'll make the suggested change, but I'm not at the office so I won't get to > check out its operation until tomorrow. > My understanding is that you shouldn't ever need the .get() when you are directly messaging it as an ObjC-object. You do need it when you are assigning it to another pointer.... but I'm probably wrong ;-)
http://codereview.chromium.org/362013/diff/1/8 File chrome/browser/tab_contents/tab_contents_view_mac.mm (right): http://codereview.chromium.org/362013/diff/1/8#newcode75 Line 75: [cocoa_view_.get() cancelDeferredClose]; shess wrote: > Do not know. I'm sure there is a nice rule which indicates when to message > directly and when to use get(), but darned if I can figure it out. The rule is simple. Don't use .get() unless you have to. You only "have to" when the compiler makes you do it. There are some stupid cases where the implicit conversions break down, even with the hints we've given, and you have to use .get() to specify the receiver. One case I uncovered was NSShadow receivers with the 10.6 SDK. http://codereview.chromium.org/205016. http://codereview.chromium.org/362013/diff/1/8#newcode286 Line 286: afterDelay:0.02]; On 2009/11/05 06:41:14, shess wrote: > Again, please be specific if you recommend 0.0. I'd prefer, in order: (most preferable) 0.0 std::numeric_limits<NSTimeInterval>::min() other magic numbers (least preferable) 0.02 is another magic number, and it's actually quite a long time. std::numeric_limits<NSTimeInterval>::min() is the new and improved name for DBL_MIN. In my experience, I don't think -[NSObject performSelector:withObject:afterDelay:] ever executes immediately, I think the selector is always deferred back to the loop. To be clear, and to expand on your comment, the docs say: The minimum time before which the message is sent. Specifying a delay of 0 does not necessarily cause the selector to be performed immediately. The selector is still queued on the thread’s run loop and performed as soon as possible. I read this as meaning "the loop always fires off the selector;" the "immediately" bit means "this might not be the very next thing that the loop does," or even "depending on the shape of your program, you might not get back to the loop immediately after making this call." That matches my own experience.
> I read this as meaning "the loop always fires off the selector;" the > "immediately" bit means "this might not be the very next thing that the loop > does," or even "depending on the shape of your program, you might not get back > to the loop immediately after making this call." That matches my own > experience. This matches my experience as well, and I know that lots of code (Apple and non-Apple) depend on this behavior.
On 2009/11/05 17:09:41, dmac wrote: > > I read this as meaning "the loop always fires off the selector;" the > > "immediately" bit means "this might not be the very next thing that the loop > > does," or even "depending on the shape of your program, you might not get back > > to the loop immediately after making this call." That matches my own > > experience. > > This matches my experience as well, and I know that lots of code (Apple and > non-Apple) depend on this behavior. 0.0 it is. Also renamed variables and methods in CrApplication to match dmac's change.
Who's gonna go first, you or dmac? Mark Scott wrote: > On 2009/11/05 17:09:41, dmac wrote: >> >> > I read this as meaning "the loop always fires off the selector;" the >> > "immediately" bit means "this might not be the very next thing that th= e >> > loop >> > does," or even "depending on the shape of your program, you might not >> > get > > back >> >> > to the loop immediately after making this call." =A0That matches my ow= n >> > experience. > >> This matches my experience as well, and I know that lots of code (Apple >> and >> non-Apple) depend on this behavior. > > 0.0 it is. =A0Also renamed variables and methods in CrApplication to matc= h > dmac's > change. > > http://codereview.chromium.org/362013 >
dmac can go. With his change in place, mine will require trivial merging (mainly deleting my CrApplication changes, I think). -scott On Thu, Nov 5, 2009 at 12:21 PM, Mark Mentovai <mark@chromium.org> wrote: > Who's gonna go first, you or dmac? > > Mark > > Scott wrote: >> On 2009/11/05 17:09:41, dmac wrote: >>> >>> > I read this as meaning "the loop always fires off the selector;" the >>> > "immediately" bit means "this might not be the very next thing that t= he >>> > loop >>> > does," or even "depending on the shape of your program, you might not >>> > get >> >> back >>> >>> > to the loop immediately after making this call." =A0That matches my o= wn >>> > experience. >> >>> This matches my experience as well, and I know that lots of code (Apple >>> and >>> non-Apple) depend on this behavior. >> >> 0.0 it is. =A0Also renamed variables and methods in CrApplication to mat= ch >> dmac's >> change. >> >> http://codereview.chromium.org/362013 >> >
shess. My cl is in (31135). 13 other changes in there. Hopefully the tree stays green. On Nov 5, 2009, at 12:41 PM, Scott Hess wrote: > dmac can go. With his change in place, mine will require trivial > merging (mainly deleting my CrApplication changes, I think). > > -scott > > > On Thu, Nov 5, 2009 at 12:21 PM, Mark Mentovai <mark@chromium.org> > wrote: >> Who's gonna go first, you or dmac? >> >> Mark >> >> Scott wrote: >>> On 2009/11/05 17:09:41, dmac wrote: >>>> >>>>> I read this as meaning "the loop always fires off the selector;" >>>>> the >>>>> "immediately" bit means "this might not be the very next thing >>>>> that the >>>>> loop >>>>> does," or even "depending on the shape of your program, you >>>>> might not >>>>> get >>> >>> back >>>> >>>>> to the loop immediately after making this call." That matches >>>>> my own >>>>> experience. >>> >>>> This matches my experience as well, and I know that lots of code >>>> (Apple >>>> and >>>> non-Apple) depend on this behavior. >>> >>> 0.0 it is. Also renamed variables and methods in CrApplication to >>> match >>> dmac's >>> change. >>> >>> http://codereview.chromium.org/362013 >>> >>
On 2009/11/05 22:09:30, dmaclach_google.com wrote: > shess. My cl is in (31135). 13 other changes in there. Hopefully the > tree stays green. > > On Nov 5, 2009, at 12:41 PM, Scott Hess wrote: > > dmac can go. With his change in place, mine will require trivial > > merging (mainly deleting my CrApplication changes, I think). > > > > On Thu, Nov 5, 2009 at 12:21 PM, Mark Mentovai <mailto:mark@chromium.org> wrote: > >> Who's gonna go first, you or dmac? This exchange sounds like an implicit LGTM, but I figured I should ping just in case it's more of a "When you break the tree, what order will it be in?" query. I already have another fix queued up based on this CL :-). -scott
Sometimes when I mean to say LGTM, I don't actually say it. LGTM.
On 2009/11/05 23:44:27, Mark Mentovai wrote: > Sometimes when I mean to say LGTM, I don't actually say it. > > LGTM. LGTM here too. |