Issue 1294913004: Enable HitTestCache by default. (Closed)

Created:
5 years, 4 months ago by dtapuska

Modified:
5 years, 4 months ago

Reviewers:
esprehn

CC:
blink-reviews, blink-reviews-rendering, eae+blinkwatch, jchaffraix+rendering, leviw+renderwatch, pdr+renderingwatchlist_chromium.org, szager+layoutwatch_chromium.org, zoltan1

Base URL:
https://chromium.googlesource.com/chromium/blink.git@master

Target Ref:
refs/heads/master

Project:
blink

Visibility:
Public.

Description

Enable HitTestCache by default.

UMA metrics indicate we have great validity. Enable the hit test cache in release by default. Keep the validity checks around when ASSERTs are enabled.

BUG=398920

Committed: https://src.chromium.org/viewvc/blink?view=rev&revision=200852

Patch Set 1 #

Total comments: 3

Patch Set 2 : Remove UMA metrics for validity #

Patch Set 3 : Fix layout test #

Created: 5 years, 4 months ago

Stats (+22 lines, -66 lines)

M  LayoutTests/fast/events/hit-test-counts-expected.txt  4 chunks  +7 lines, -7 lines    0 comments
M  Source/core/layout/HitTestCache.h                     2 chunks  +0 lines, -17 lines   0 comments
M  Source/core/layout/HitTestCache.cpp                   1 chunk   +0 lines, -15 lines   0 comments
M  Source/core/layout/HitTestResult.h                    1 chunk   +0 lines, -1 line     0 comments
M  Source/core/layout/HitTestResult.cpp                  1 chunk   +0 lines, -12 lines   0 comments
M  Source/core/layout/LayoutView.cpp                     1 chunk   +15 lines, -14 lines  0 comments

Messages

Total messages: 22 (7 generated)

dtapuska

https://codereview.chromium.org/1294913004/diff/1/Source/core/layout/LayoutView.cpp File Source/core/layout/LayoutView.cpp (left): https://codereview.chromium.org/1294913004/diff/1/Source/core/layout/LayoutView.cpp#oldcode117 Source/core/layout/LayoutView.cpp:117: m_hitTestCache->verifyCachedResult(result, cacheResult); verifyCachedResult logs some UMA metrics. rbyers@ indicated ...

5 years, 4 months ago (2015-08-17 21:23:15 UTC) #2

esprehn

What does "we have great validity" mean? Is it 100% correct? https://codereview.chromium.org/1294913004/diff/1/Source/core/layout/LayoutView.cpp File Source/core/layout/LayoutView.cpp (left): ...

5 years, 4 months ago (2015-08-18 20:27:41 UTC) #3

dtapuska

On 2015/08/18 20:27:41, esprehn wrote: > What does "we have great validity" mean? Is it ...

5 years, 4 months ago (2015-08-18 20:42:37 UTC) #4

dtapuska

5 years, 4 months ago (2015-08-18 20:56:34 UTC) #5

esprehn

11 in 500m means we're still failing sometimes though? How do we go about fixing ...

5 years, 4 months ago (2015-08-18 21:04:32 UTC) #6

dtapuska

On 2015/08/18 21:04:32, esprehn wrote: > 11 in 500m means we're still failing sometimes though? ...

5 years, 4 months ago (2015-08-18 21:14:52 UTC) #7

Rick Byers

5 years, 4 months ago (2015-08-18 21:32:42 UTC) #8

On 2015/08/18 21:14:52, dtapuska wrote:
> On 2015/08/18 21:04:32, esprehn wrote:
> > 11 in 500m means we're still failing sometimes though? How do we go about
> > fixing that correctness?
>
> Yes it still means that there are some failures. What isn't clear to me is that
> if all the open hit test bugs can cause it not to be deterministic.
>
> I can continue thinking about the problem but I'm completely out of ideas as it
> stands.
>
> What is consistent with the current 11 failures is that they are all of code
> 151. Which implies the inner node, the pseudo node and the local point don't
> match. So that implies it pretty much found a different DOM node than what was
> in the cache at a different location.
>
> I'm not certain what I can do to try to grab more information from these
> failures. I have not received any reports of any failures in debug mode; so the
> non-release assert isn't grabbing any useful information. Turning it into a
> release assert is one possibility but that would add potential crashes at a rate
> of 1 in 50 million.
>
> Moving the mouse a single pixel causes it not to use the cache; so my biggest
> bet is that the hit test is non-deterministic sometimes.

Right.  My argument to Dave was that this is more rigor than we've probably ever
subjected the hit-testing code to.  If we could easily identify these 11 cases,
I would not be surprised to find that most (or all) of them were bugs in hit
testing itself.  We're getting dramatically decreasing returns on our investment
here - Dave's time could be way better spent on the bugs we KNOW we have (which
surely occur more often than 0.0000022% of the time because we have reports from
users).
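
For reference, the figures quoted in this thread can be checked with a little arithmetic. The sketch below is not part of the patch; it assumes "11 in 500m" means exactly 11 failures in 500,000,000 hit tests, and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Failure rate as a percentage: failures / total * 100.
// (Illustrative helper; the name is hypothetical.)
constexpr double failurePercent(std::uint64_t failures, std::uint64_t total) {
    return 100.0 * static_cast<double>(failures) / static_cast<double>(total);
}

// The same rate expressed as "1 in N", rounded down.
constexpr std::uint64_t oneInN(std::uint64_t failures, std::uint64_t total) {
    return total / failures;
}
```

With those assumed inputs, failurePercent(11, 500000000) comes out at about 0.0000022%, and oneInN gives roughly 1 in 45 million, which is in line with the "1 in 50 million" estimate quoted above.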

Of course we could pretty easily test his hypothesis if we want to.  Maybe
(independent from all hit test cache work) we should add code that for 1 in
10,000 hit tests does the hit test again a 2nd time and compares the result.
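
A minimal sketch of that sampling idea, written against a generic callable rather than Blink's real hit-testing API. The class name, its members and the use of std::function are all assumptions for illustration; nothing here exists in the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical helper, not Blink code: runs `hitTest` and, once every
// `samplePeriod` calls, runs it a second time and counts disagreements.
template <typename Result>
class SampledDeterminismChecker {
public:
    explicit SampledDeterminismChecker(std::uint64_t samplePeriod)
        : m_samplePeriod(samplePeriod), m_calls(0), m_samples(0), m_mismatches(0) {
        assert(samplePeriod > 0);
    }

    // Always returns the result of the first run; the second run is only
    // used for comparison.
    Result run(const std::function<Result()>& hitTest) {
        Result first = hitTest();
        if (++m_calls % m_samplePeriod == 0) {
            ++m_samples;
            if (!(hitTest() == first))
                ++m_mismatches;
        }
        return first;
    }

    std::uint64_t samples() const { return m_samples; }
    std::uint64_t mismatches() const { return m_mismatches; }

private:
    std::uint64_t m_samplePeriod;
    std::uint64_t m_calls;
    std::uint64_t m_samples;
    std::uint64_t m_mismatches;
};
```

With a period of 10,000 this would re-run one hit test in every 10,000 and could report the mismatch count through whatever metrics channel is convenient. A fixed period is used here only to keep the sketch deterministic; a random sample would work just as well.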

Perhaps more likely is that we're doing something somewhere that changes the
layout tree without dirtying layout correctly.  Again we could probably test
this by occasionally triggering a supposedly unnecessary layout and verifying
nothing changed.  If we find this happens extremely rarely, would that really be
reason not to ship this cache?

esprehn

5 years, 4 months ago (2015-08-18 21:50:08 UTC) #9

On 2015/08/18 at 21:32:42, rbyers wrote:
> On 2015/08/18 21:14:52, dtapuska wrote:
> > On 2015/08/18 21:04:32, esprehn wrote:
> > > 11 in 500m means we're still failing sometimes though? How do we go about
> > > fixing that correctness?
> >
> > Yes it still means that there are some failures. What isn't clear to me is that
> > if all the open hit test bugs can cause it not to be deterministic.
> >
> > I can continue thinking about the problem but I'm completely out of ideas as it
> > stands.
> >
> > What is consistent with the current 11 failures is that they are all of code
> > 151. Which implies the inner node, the pseudo node and the local point don't
> > match. So that implies it pretty much found a different DOM node than what was
> > in the cache at a different location.
> >
> > I'm not certain what I can do to try to grab more information from these
> > failures. I have not received any reports of any failures in debug mode; so the
> > non-release assert isn't grabbing any useful information. Turning it into a
> > release assert is one possibility but that would add potential crashes at a rate
> > of 1 in 50 million.
> >
> > Moving the mouse a single pixel causes it not to use the cache; so my biggest
> > bet is that the hit test is non-deterministic sometimes.
>
> Right.  My argument to Dave was that this is more rigor than we've probably ever
> subjected the hit-testing code to.  If we could easily identify these 11 cases,
> I would not be surprised to find that most (or all) of them were bugs in hit
> testing itself.  We're getting dramatically decreasing returns on our investment
> here - Dave's time could be way better spent on the bugs we KNOW we have (which
> surely occur more often than 0.0000022% of the time because we have reports from
> users).
>
> Of course we could pretty easily test his hypothesis if we want to.  Maybe
> (independent from all hit test cache work) we should add code that for 1 in
> 10,000 hit tests does the hit test again a 2nd time and compares the result.
>
> Perhaps more likely is that we're doing something somewhere that changes the
> layout tree without dirtying layout correctly.  Again we could probably test
> this by occasionally triggering a supposedly unnecessary layout and verifying
> nothing changed.  If we find this happens extremely rarely, would that really be
> reason not to ship this cache?

Could we add a RELEASE_ASSERT to make this crash and collect those URLs? I think
this is probably okay to ship, but it makes me sad to ship some known bugs like
this.

lgtm

dtapuska

5 years, 4 months ago (2015-08-18 22:10:53 UTC) #10

On 2015/08/18 21:50:08, esprehn wrote:
> On 2015/08/18 at 21:32:42, rbyers wrote:
> > On 2015/08/18 21:14:52, dtapuska wrote:
> > > On 2015/08/18 21:04:32, esprehn wrote:
> > > > 11 in 500m means we're still failing sometimes though? How do we go about
> > > > fixing that correctness?
> > >
> > > Yes it still means that there are some failures. What isn't clear to me is that
> > > if all the open hit test bugs can cause it not to be deterministic.
> > >
> > > I can continue thinking about the problem but I'm completely out of ideas as it
> > > stands.
> > >
> > > What is consistent with the current 11 failures is that they are all of code
> > > 151. Which implies the inner node, the pseudo node and the local point don't
> > > match. So that implies it pretty much found a different DOM node than what was
> > > in the cache at a different location.
> > >
> > > I'm not certain what I can do to try to grab more information from these
> > > failures. I have not received any reports of any failures in debug mode; so the
> > > non-release assert isn't grabbing any useful information. Turning it into a
> > > release assert is one possibility but that would add potential crashes at a rate
> > > of 1 in 50 million.
> > >
> > > Moving the mouse a single pixel causes it not to use the cache; so my biggest
> > > bet is that the hit test is non-deterministic sometimes.
> >
> > Right.  My argument to Dave was that this is more rigor than we've probably ever
> > subjected the hit-testing code to.  If we could easily identify these 11 cases,
> > I would not be surprised to find that most (or all) of them were bugs in hit
> > testing itself.  We're getting dramatically decreasing returns on our investment
> > here - Dave's time could be way better spent on the bugs we KNOW we have (which
> > surely occur more often than 0.0000022% of the time because we have reports from
> > users).
> >
> > Of course we could pretty easily test his hypothesis if we want to.  Maybe
> > (independent from all hit test cache work) we should add code that for 1 in
> > 10,000 hit tests does the hit test again a 2nd time and compares the result.
> >
> > Perhaps more likely is that we're doing something somewhere that changes the
> > layout tree without dirtying layout correctly.  Again we could probably test
> > this by occasionally triggering a supposedly unnecessary layout and verifying
> > nothing changed.  If we find this happens extremely rarely, would that really be
> > reason not to ship this cache?
>
> Could we add a RELEASE_ASSERT to make this crash and collect those urls? I think
> this is probably okay to ship, but it makes me sad to ship some known bugs like
> this.
>
> lgtm

I hear you, Elliott; I don't like to admit defeat either. I spent a few moons
trying different things to get the failures out. What I suggest is that we land
this and add a temporary release assert after m46 branches. That way we can
capture some data in canary builds: leave it in for a few days and see what we
get with it.
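
As a rough illustration of what such a temporary check could look like, here is a standalone sketch. It is not the Blink implementation: the struct, the enum and the function names are hypothetical, and the only detail taken from this thread is that a mismatch means the inner node, the pseudo node and the local point disagree between the cached and the fresh result.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for the parts of a hit test result compared here.
struct HitSnapshot {
    const void* innerNode;
    const void* pseudoNode;
    int localX;
    int localY;
};

inline bool sameHit(const HitSnapshot& a, const HitSnapshot& b) {
    return a.innerNode == b.innerNode && a.pseudoNode == b.pseudoNode
        && a.localX == b.localX && a.localY == b.localY;
}

// What to do when the cached result disagrees with a fresh hit test.
enum class OnMismatch {
    Report, // return false and let the caller log or ignore it
    Crash   // abort, in the spirit of a temporary release assert
};

// Returns true when the cached result matches the fresh one.
inline bool verifyCachedHit(const HitSnapshot& cached, const HitSnapshot& fresh,
                            OnMismatch policy) {
    if (sameHit(cached, fresh))
        return true;
    if (policy == OnMismatch::Crash)
        std::abort(); // a crash report would then carry the failing page
    return false;
}
```

Switching the policy to Crash only in canary builds would keep the extra crash rate away from stable users while still collecting data on the rare mismatches.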