Issue 107343008: mac: Fix a history swiper bug where the shield wouldn't disappear.

Issue 107343008: mac: Fix a history swiper bug where the shield wouldn't disappear. (Closed)

Created:
6 years, 12 months ago by erikchen

Modified:
6 years, 11 months ago

Reviewers:
Robert Sesek, cmp, Nico

CC:
chromium-reviews, sail

Base URL:
https://chromium.googlesource.com/chromium/src.git@master

Visibility:
Public.

More Reviews

Description

mac: Fix a history swiper bug where the shield wouldn't disappear. There were 3 separate problems. 1. Retain bug :( 2. historyOverlay was removing the view from the superview during dealloc, which is atypical. If the view didn't exist, historyOverlay was actually creating the view, and sending a callback to the delegate. 3. We were not specifying an NSAnimationContext grouping/duration, which non-deterministically would use a 0-duration animation, which actually prevented the animation and callback from triggering. As a result, the shield would never disappear. On a side note: unit_tests did not work when run with --gtest-filter=HistoryOverlay* because we did not specify an NSAnimationContext with a non-zero duration. 0-duration animation contexts cause animations to not fire, which meant that the delegate callback never went through. BUG=330312 TEST=manual Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=242911

Patch Set 1 #

Total comments: 4

Patch Set 2 : fix unit tests #

Patch Set 3 : combine unit tests #

Patch Set 4 : Add a missing animation context to history overlay dismissal. #

Patch Set 5 : Rebase against top of tree. #

Created: 6 years, 11 months ago

Download [raw] [tar.bz2]

	Unified diffs	Side-by-side diffs	Delta from patch set	Stats (+13 lines, -22 lines)			Patch
M	chrome/browser/renderer_host/chrome_render_widget_host_view_mac_history_swiper.mm	View		1 chunk	+1 line, -1 line	0 comments	Download
M	chrome/browser/ui/cocoa/history_overlay_controller.mm	View	1 2 3	3 chunks	+6 lines, -7 lines	0 comments	Download
M	chrome/browser/ui/cocoa/history_overlay_controller_unittest.mm	View	1 2	3 chunks	+6 lines, -14 lines	0 comments	Download

Messages

Total messages: 28 (0 generated)

Expand Messages | Collapse Messages

erikchen

Please take a look. I haven't confirmed that this fixes the problem, but it seems ...

6 years, 12 months ago (2013-12-27 23:29:13 UTC) #1

Nico

https://codereview.chromium.org/107343008/diff/1/chrome/browser/ui/cocoa/history_overlay_controller.mm File chrome/browser/ui/cocoa/history_overlay_controller.mm (left): https://codereview.chromium.org/107343008/diff/1/chrome/browser/ui/cocoa/history_overlay_controller.mm#oldcode97 chrome/browser/ui/cocoa/history_overlay_controller.mm:97: } This was added in https://chromiumcodereview.appspot.com/10913066 / https://chromiumcodereview.appspot.com/16632009 . ...

6 years, 12 months ago (2013-12-27 23:56:13 UTC) #2

Nico

https://codereview.chromium.org/107343008/diff/1/chrome/browser/renderer_host/chrome_render_widget_host_view_mac_history_swiper.mm File chrome/browser/renderer_host/chrome_render_widget_host_view_mac_history_swiper.mm (left): https://codereview.chromium.org/107343008/diff/1/chrome/browser/renderer_host/chrome_render_widget_host_view_mac_history_swiper.mm#oldcode249 chrome/browser/renderer_host/chrome_render_widget_host_view_mac_history_swiper.mm:249: historyOverlay_ = [historyOverlay retain]; I guess this path isn't ...

6 years, 12 months ago (2013-12-28 00:08:30 UTC) #3

erikchen

PTAL question: When I run the unit tests locally, with my changes reverted, 'DismissClearsAnimations' hangs. ...

6 years, 11 months ago (2013-12-30 19:20:33 UTC) #4

Robert Sesek

On 2013/12/30 19:20:33, erikchen1 wrote: > PTAL > > question: When I run the unit ...

6 years, 11 months ago (2013-12-30 19:23:27 UTC) #5

erikchen

On 2013/12/30 19:23:27, rsesek wrote: > On 2013/12/30 19:20:33, erikchen1 wrote: > > PTAL > ...

6 years, 11 months ago (2013-12-30 20:11:49 UTC) #6

Robert Sesek

On 2013/12/30 20:11:49, erikchen1 wrote: > On 2013/12/30 19:23:27, rsesek wrote: > > On 2013/12/30 ...

6 years, 11 months ago (2013-12-30 21:05:46 UTC) #7

erikchen

On 2013/12/30 21:05:46, rsesek wrote: > On 2013/12/30 20:11:49, erikchen1 wrote: > > On 2013/12/30 ...

6 years, 11 months ago (2013-12-30 21:40:21 UTC) #8

On 2013/12/30 21:05:46, rsesek wrote:
> On 2013/12/30 20:11:49, erikchen1 wrote:
> > On 2013/12/30 19:23:27, rsesek wrote:
> > > On 2013/12/30 19:20:33, erikchen1 wrote:
> > > > PTAL
> > > > 
> > > > question: When I run the unit tests locally, with my changes reverted,
> > > > 'DismissClearsAnimations' hangs. Since I've migrated the code over to
the
> > > other
> > > > test, now both of them hang locally. Any idea what's going on?
> > > 
> > > These tests all run the MessagePump (essentially CFRunLoop) while waiting
on
> > the
> > > animation to complete. They assign a completion block to then stop the
loop
> > once
> > > the animation completes. If it's hanging, -animationDidStop:finished: is
not
> > > being called (or the mock is not matching it). C.f. lines 90-98.
> > 
> > I'm trying to figure out why the test 'DismissClearsAnimations' hangs on my
> > local machine on a clean repo, but works on the try-bots. This difference
> makes
> > me think that the error is not in code-land but in a compile/run time
> > configuration problem.
> 
> The bots may run the tests using a different test runner strategy. I'm not
> abreast of all the changes that happened in this area recently, but there are
> threads on chromium-dev about it. At the very least, try using the exact
> invocation locally that's printed to the stdout of the bots.

I've already tried running the tests with all the relevant flags from the
testing bots. As far as I can tell, the flags pertain to sharding and
multi-threading, neither of which are particularly relevant to this bug. Given
that the tests still pass on the testing-bots, I'm not going to worry too much
about this. My best guess is that there's been a behavioral change in 10.9 vs
10.7/10.6 to CFRunLoop. If you want me to fix the test so that it runs locally,
I would just rewrite the logic to use a more standard form of flushing the
CFRunLoop (rather than calling Quit() from within the run loop.

Robert Sesek

On 2013/12/30 21:40:21, erikchen1 wrote: > On 2013/12/30 21:05:46, rsesek wrote: > > On 2013/12/30 ...

6 years, 11 months ago (2013-12-30 21:51:32 UTC) #9

On 2013/12/30 21:40:21, erikchen1 wrote:
> On 2013/12/30 21:05:46, rsesek wrote:
> > On 2013/12/30 20:11:49, erikchen1 wrote:
> > > On 2013/12/30 19:23:27, rsesek wrote:
> > > > On 2013/12/30 19:20:33, erikchen1 wrote:
> > > > > PTAL
> > > > > 
> > > > > question: When I run the unit tests locally, with my changes reverted,
> > > > > 'DismissClearsAnimations' hangs. Since I've migrated the code over to
> the
> > > > other
> > > > > test, now both of them hang locally. Any idea what's going on?
> > > > 
> > > > These tests all run the MessagePump (essentially CFRunLoop) while
waiting
> on
> > > the
> > > > animation to complete. They assign a completion block to then stop the
> loop
> > > once
> > > > the animation completes. If it's hanging, -animationDidStop:finished: is
> not
> > > > being called (or the mock is not matching it). C.f. lines 90-98.
> > > 
> > > I'm trying to figure out why the test 'DismissClearsAnimations' hangs on
my
> > > local machine on a clean repo, but works on the try-bots. This difference
> > makes
> > > me think that the error is not in code-land but in a compile/run time
> > > configuration problem.
> > 
> > The bots may run the tests using a different test runner strategy. I'm not
> > abreast of all the changes that happened in this area recently, but there
are
> > threads on chromium-dev about it. At the very least, try using the exact
> > invocation locally that's printed to the stdout of the bots.
> 
> I've already tried running the tests with all the relevant flags from the
> testing bots. As far as I can tell, the flags pertain to sharding and
> multi-threading, neither of which are particularly relevant to this bug. Given

I wouldn't assume that the sharding and threading do not affect this, since test
parallelization and nondeterministic ordering definitely affect the test runs.

> that the tests still pass on the testing-bots, I'm not going to worry too much
> about this. My best guess is that there's been a behavioral change in 10.9 vs
> 10.7/10.6 to CFRunLoop. If you want me to fix the test so that it runs
locally,
> I would just rewrite the logic to use a more standard form of flushing the
> CFRunLoop (rather than calling Quit() from within the run loop.

Do the other tests that use this pattern pass, though (you haven't mentioned
otherwise, so I'm guessing yes)? I'd be interesting to know why that is the case
if so. What's a "more standard form of flushing the CFRunLoop"? It's completely
safe to call CFRunLoopStop() on the running loop (which is what Quit() does).

erikchen

On 2013/12/30 21:51:32, rsesek wrote: > On 2013/12/30 21:40:21, erikchen1 wrote: > > On 2013/12/30 ...

6 years, 11 months ago (2013-12-30 22:22:41 UTC) #10

On 2013/12/30 21:51:32, rsesek wrote:
> On 2013/12/30 21:40:21, erikchen1 wrote:
> > On 2013/12/30 21:05:46, rsesek wrote:
> > > On 2013/12/30 20:11:49, erikchen1 wrote:
> > > > On 2013/12/30 19:23:27, rsesek wrote:
> > > > > On 2013/12/30 19:20:33, erikchen1 wrote:
> > > > > > PTAL
> > > > > > 
> > > > > > question: When I run the unit tests locally, with my changes
reverted,
> > > > > > 'DismissClearsAnimations' hangs. Since I've migrated the code over
to
> > the
> > > > > other
> > > > > > test, now both of them hang locally. Any idea what's going on?
> > > > > 
> > > > > These tests all run the MessagePump (essentially CFRunLoop) while
> waiting
> > on
> > > > the
> > > > > animation to complete. They assign a completion block to then stop the
> > loop
> > > > once
> > > > > the animation completes. If it's hanging, -animationDidStop:finished:
is
> > not
> > > > > being called (or the mock is not matching it). C.f. lines 90-98.
> > > > 
> > > > I'm trying to figure out why the test 'DismissClearsAnimations' hangs on
> my
> > > > local machine on a clean repo, but works on the try-bots. This
difference
> > > makes
> > > > me think that the error is not in code-land but in a compile/run time
> > > > configuration problem.
> > > 
> > > The bots may run the tests using a different test runner strategy. I'm not
> > > abreast of all the changes that happened in this area recently, but there
> are
> > > threads on chromium-dev about it. At the very least, try using the exact
> > > invocation locally that's printed to the stdout of the bots.
> > 
> > I've already tried running the tests with all the relevant flags from the
> > testing bots. As far as I can tell, the flags pertain to sharding and
> > multi-threading, neither of which are particularly relevant to this bug.
Given
> 
> I wouldn't assume that the sharding and threading do not affect this, since
test
> parallelization and nondeterministic ordering definitely affect the test runs.
You are right.
I was running my tests with a --gtest_filter. It is possible that testing state
from previous tests was leaking into this test, which allowed it to succeed,
though that's pretty scary in and of itself.
> 
> > that the tests still pass on the testing-bots, I'm not going to worry too
much
> > about this. My best guess is that there's been a behavioral change in 10.9
vs
> > 10.7/10.6 to CFRunLoop. If you want me to fix the test so that it runs
> locally,
> > I would just rewrite the logic to use a more standard form of flushing the
> > CFRunLoop (rather than calling Quit() from within the run loop.
> 
> Do the other tests that use this pattern pass, though (you haven't mentioned
> otherwise, so I'm guessing yes)? I'd be interesting to know why that is the
case
> if so. What's a "more standard form of flushing the CFRunLoop"? It's
completely
> safe to call CFRunLoopStop() on the running loop (which is what Quit() does).
A search in our code base for 'new base::MessagePumpNSRunLoop' shows that there
are exactly 4 matches. The other 3 are used in different ways from the one that
we're discussing. The pattern we're looking at right now is unique.

The way I've seen and used CFRunLoop in testing environments is with the api
'CFRunLoopRunInMode'.
while (waitForConditionToHappen) {
  // Let the current run loop run for a small number of seconds.
  CFRunLoopRunInMode(mode, 0.001, NO);
}

I'd have to take a more careful look at how we use run loops in our testing
code, since it looks like we're making an NSRunLoop, and then using CFRunLoop
apis.

Robert Sesek

On 2013/12/30 22:22:41, erikchen1 wrote: > On 2013/12/30 21:51:32, rsesek wrote: > > On 2013/12/30 ...

6 years, 11 months ago (2013-12-30 22:43:56 UTC) #11

On 2013/12/30 22:22:41, erikchen1 wrote:
> On 2013/12/30 21:51:32, rsesek wrote:
> > On 2013/12/30 21:40:21, erikchen1 wrote:
> > > On 2013/12/30 21:05:46, rsesek wrote:
> > > > On 2013/12/30 20:11:49, erikchen1 wrote:
> > > > > On 2013/12/30 19:23:27, rsesek wrote:
> > > > > > On 2013/12/30 19:20:33, erikchen1 wrote:
> > > > > > > PTAL
> > > > > > > 
> > > > > > > question: When I run the unit tests locally, with my changes
> reverted,
> > > > > > > 'DismissClearsAnimations' hangs. Since I've migrated the code over
> to
> > > the
> > > > > > other
> > > > > > > test, now both of them hang locally. Any idea what's going on?
> > > > > > 
> > > > > > These tests all run the MessagePump (essentially CFRunLoop) while
> > waiting
> > > on
> > > > > the
> > > > > > animation to complete. They assign a completion block to then stop
the
> > > loop
> > > > > once
> > > > > > the animation completes. If it's hanging,
-animationDidStop:finished:
> is
> > > not
> > > > > > being called (or the mock is not matching it). C.f. lines 90-98.
> > > > > 
> > > > > I'm trying to figure out why the test 'DismissClearsAnimations' hangs
on
> > my
> > > > > local machine on a clean repo, but works on the try-bots. This
> difference
> > > > makes
> > > > > me think that the error is not in code-land but in a compile/run time
> > > > > configuration problem.
> > > > 
> > > > The bots may run the tests using a different test runner strategy. I'm
not
> > > > abreast of all the changes that happened in this area recently, but
there
> > are
> > > > threads on chromium-dev about it. At the very least, try using the exact
> > > > invocation locally that's printed to the stdout of the bots.
> > > 
> > > I've already tried running the tests with all the relevant flags from the
> > > testing bots. As far as I can tell, the flags pertain to sharding and
> > > multi-threading, neither of which are particularly relevant to this bug.
> Given
> > 
> > I wouldn't assume that the sharding and threading do not affect this, since
> test
> > parallelization and nondeterministic ordering definitely affect the test
runs.
> You are right.
> I was running my tests with a --gtest_filter. It is possible that testing
state
> from previous tests was leaking into this test, which allowed it to succeed,
> though that's pretty scary in and of itself.
> > 
> > > that the tests still pass on the testing-bots, I'm not going to worry too
> much
> > > about this. My best guess is that there's been a behavioral change in 10.9
> vs
> > > 10.7/10.6 to CFRunLoop. If you want me to fix the test so that it runs
> > locally,
> > > I would just rewrite the logic to use a more standard form of flushing the
> > > CFRunLoop (rather than calling Quit() from within the run loop.
> > 
> > Do the other tests that use this pattern pass, though (you haven't mentioned
> > otherwise, so I'm guessing yes)? I'd be interesting to know why that is the
> case
> > if so. What's a "more standard form of flushing the CFRunLoop"? It's
> completely
> > safe to call CFRunLoopStop() on the running loop (which is what Quit()
does).
> A search in our code base for 'new base::MessagePumpNSRunLoop' shows that
there
> are exactly 4 matches. The other 3 are used in different ways from the one
that
> we're discussing. The pattern we're looking at right now is unique.

It's unique in that it's executing Quit() from a block, but all of the results
from codesearch run the loop, wait for an asynchronous event, and then call
Quit() from the running loop. They all seem quite similar to me.

> The way I've seen and used CFRunLoop in testing environments is with the api
> 'CFRunLoopRunInMode'.
> while (waitForConditionToHappen) {
>   // Let the current run loop run for a small number of seconds.
>   CFRunLoopRunInMode(mode, 0.001, NO);
> }

This is functionally equivalent to what is being done now (in terms of how the
run loop is activated) except that it also invokes a busy-wait loop. I don't
think this preferable.

> I'd have to take a more careful look at how we use run loops in our testing
> code, since it looks like we're making an NSRunLoop, and then using CFRunLoop
> apis.

Yes, this is true and could be problematic. The issue is that NSRunLoop provides
no programmatic stopping function and IIRC CFRunLoop alone cannot run
animations. In the test environment, this doesn't seem to cause issues though.

erikchen

Found the problem - we were missing an animation context, so we were in fact ...

6 years, 11 months ago (2013-12-31 00:36:07 UTC) #12

erikchen

On 2013/12/31 00:36:07, erikchen1 wrote: > Found the problem - we were missing an animation ...

6 years, 11 months ago (2014-01-02 17:35:13 UTC) #13

Nico

On Thu, Jan 2, 2014 at 9:35 AM, <erikchen@chromium.org> wrote: > On 2013/12/31 00:36:07, erikchen1 ...

6 years, 11 months ago (2014-01-02 18:39:12 UTC) #15