Created: 6 years, 10 months ago by hajimehoshi
Modified: 6 years, 9 months ago
CC: blink-reviews, haraken
Base URL: svn://svn.chromium.org/blink/trunk
Visibility: Public.
Description
DOM-object leak detection in run_webkit_tests.py
We are implementing a DOM-object leak detector: when the number of
DOM objects has increased after a test, we can assume that the test
causes a leak. This feature is available in content_shell behind the flag
--enable-leak-detection.
This patch exposes the leak detection feature in run_webkit_tests.py
via the same --enable-leak-detection flag.
content_shell is assumed to report a detected leak in the following
format:
#LEAK - renderer pid PID (DETAIL_IN_JSON_FORMAT)
and with this patch run_webkit_tests.py treats such a line as a kind of error.
See also:
https://docs.google.com/a/chromium.org/document/d/1sFAsZxeISKnbGdXoLZlB2tDZ8pvO102ePFQx6TX4X14/edit
BUG=332630
TEST=test-webkitpy
Committed: https://src.chromium.org/viewvc/blink?view=rev&revision=170255
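To illustrate the reporting format above, here is a minimal sketch of how a runner-side script might recognize and decode such a line. The helper name and the JSON keys in the comment are assumptions for illustration, not part of the patch.

import json
import re

# Hypothetical helper: recognizes the leak line content_shell is assumed to
# print, e.g.
#   #LEAK - renderer pid 1234 ({"numberOfLiveDocuments": [1, 2]})
_LEAK_LINE_RE = re.compile(r'^#LEAK - renderer pid (\d+) \((.*)\)$')

def parse_leak_line(stderr_line):
    """Returns (pid, detail_dict) for a leak report line, or None otherwise."""
    match = _LEAK_LINE_RE.match(stderr_line.strip())
    if not match:
        return None
    pid = int(match.group(1))
    try:
        detail = json.loads(match.group(2))
    except ValueError:
        detail = {}
    return pid, detail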
Patch Set 1: .
Total comments: 6
Patch Set 2: Add outputting leak logs
Patch Set 3: Remove .coverage
Patch Set 4: (rebasing)
Patch Set 5: Bug fix: Reset driver._leaked when starting tests
Patch Set 6: Bug fix: unit test
Messages
Total messages: 40 (0 generated)
Can you take a look? Thank you in advance.
+dpranke who's a better reviewer for this than me
I'm not sure if this patch is quite right. When we detect leaks, do we actually also crash the process? If we don't, I'd probably want to change how this is done a bit.

https://chromiumcodereview.appspot.com/148153009/diff/20001/Tools/Scripts/web...
File Tools/Scripts/webkitpy/layout_tests/port/driver.py (right):

https://chromiumcodereview.appspot.com/148153009/diff/20001/Tools/Scripts/web...
Tools/Scripts/webkitpy/layout_tests/port/driver.py:116: # WebKitTestRunenr can report back subprocess DOM-object leaks by printing
WebKitTestRunner doesn't exist in Blink :) I'd probably change this to "content_shell can report back DOM object leaks by printing".
On 2014/01/28 20:11:46, Dirk Pranke wrote:
> I'm not sure if this patch is quite right. When we detect leaks, do we actually
> also crash the process? If we don't, I'd probably want to change how this is
> done a bit.

Note that "we" means "the leak detection code inside content_shell", not run-webkit-tests. In other words, has the process also crashed?
> When we detect leaks, do we actually also crash the process?

No. There is no special reason why I treated leaks as crashes in this Python script. Could you tell me how you wanted to change this?
On 2014/01/29 01:59:24, hajimehoshi wrote:
> > When we detect leaks, do we actually also crash the process?
>
> No. There is no special reason why I treated leaks as crashes in this Python
> script. Could you tell me how you wanted to change this?

IIUC, we don't crash the process, but content_shell would stop running after hitting the leaking test?
On 2014/01/29 01:59:24, hajimehoshi wrote:
> Could you tell me how you wanted to change this?

Yes, I was waiting for the answer to figure out how I wanted to comment on the patch, which I will do now (in a separate email :).
https://chromiumcodereview.appspot.com/148153009/diff/20001/Tools/Scripts/web...
File Tools/Scripts/webkitpy/layout_tests/port/driver.py (right):

https://chromiumcodereview.appspot.com/148153009/diff/20001/Tools/Scripts/web...
Tools/Scripts/webkitpy/layout_tests/port/driver.py:189: crash_log += 'DOM-object leaking was detected.\n'
Given that the process isn't actually crashing, it is probably wrong to be making any changes to _check_for_driver_crash() or inside the if block that starts on line 178.

I suggest that you add lines before line 167 that parse the stderr output and, if they find the '# LEAK - ' line, mark something to tell the driver code to restart content_shell (assuming that's what you want to have happen) and propagate the right information back up in the DriverOutput.

Let me know if the approach I describe turns out to have problems.

https://chromiumcodereview.appspot.com/148153009/diff/20001/Tools/Scripts/web...
Tools/Scripts/webkitpy/layout_tests/port/driver.py:344: or error_line.startswith("#LEAK - ")):
See comments above.
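A minimal sketch of the shape suggested here, assuming a hypothetical helper class; the names (_LeakState, check_stderr_line, and the idea of a leak field on DriverOutput) are illustrative and are not the code that actually landed in driver.py.

LEAK_PREFIX = '#LEAK - '

class _LeakState(object):
    """Tracks whether the current test leaked, based on stderr lines."""

    def __init__(self):
        self.leaked = False
        self.leak_log = ''

    def check_stderr_line(self, error_line):
        # Called for each stderr line read while running a test.
        if error_line.startswith(LEAK_PREFIX):
            self.leaked = True
            self.leak_log += error_line
            # The driver would then restart content_shell before the next
            # test and propagate leaked/leak_log upwards, e.g. via a field
            # on DriverOutput (field name hypothetical).

state = _LeakState()
state.check_stderr_line('#LEAK - renderer pid 1234 ({})\n')
assert state.leaked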
Thank you! I added output of the detected leaks, separated from crashes. Sorry for the lack of tests; I'll add them soon.

https://codereview.chromium.org/148153009/diff/20001/Tools/Scripts/webkitpy/l...
File Tools/Scripts/webkitpy/layout_tests/port/driver.py (right):

https://codereview.chromium.org/148153009/diff/20001/Tools/Scripts/webkitpy/l...
Tools/Scripts/webkitpy/layout_tests/port/driver.py:116: # WebKitTestRunenr can report back subprocess DOM-object leaks by printing
On 2014/01/28 20:11:46, Dirk Pranke wrote:
> WebKitTestRunner doesn't exist in Blink :) I'd probably change this to
> "content_shell can report back DOM object leaks by printing".
Done.

https://codereview.chromium.org/148153009/diff/20001/Tools/Scripts/webkitpy/l...
Tools/Scripts/webkitpy/layout_tests/port/driver.py:189: crash_log += 'DOM-object leaking was detected.\n'
On 2014/01/30 01:43:21, Dirk Pranke wrote:
> Given that the process isn't actually crashing, it is probably wrong to be
> making any changes to _check_for_driver_crash() or inside the if block that
> starts on line 178.
> ...
Done.

https://codereview.chromium.org/148153009/diff/20001/Tools/Scripts/webkitpy/l...
Tools/Scripts/webkitpy/layout_tests/port/driver.py:344: or error_line.startswith("#LEAK - ")):
On 2014/01/30 01:43:21, Dirk Pranke wrote:
> See comments above.
Done.
+kouhei (CC)
Oh dear, I'm afraid I wasn't clear enough, and you've done a lot more work than I would've preferred :(.

Given that we're not going to run with this flag on all the time (as far as I know), we probably don't want to support making this an entirely different type of failure like Failure, ImageOnlyFailure, etc. I would prefer to treat this like we do ASAN failures, and treat them as a kind of crash.

So, what I meant in my previous comments was that the driver should report back that the test crashed (so that we don't need to change all of the other files), but that the code in driver.py itself should not confuse the two situations.

I'm afraid the "treat this as an entirely different kind of failure" approach is just going to make things harder to manage and maintain. Does that make sense?
> (kouhei)
> IIUC, we don't crash the process, but content_shell would stop running after
> hitting the leaking test?

No, as long as content_shell is executed in multi-process mode; I think it is not run with the flag --single-process on the bots. Sorry for the late response.

On 2014/02/14 21:27:00, Dirk Pranke wrote:
> I would prefer to treat this like we do ASAN failures, and treat them as a kind
> of crash.
> ...
> I'm afraid the "treat this as an entirely different kind of failure" approach
> is just going to make things harder to manage and maintain. Does that make
> sense?

We discussed this and concluded that it would be better to treat leaks as errors distinct from crashes. Your assumption that the leak detection is not executed all the time is right, but we are still against treating leaks as crashes like ASAN failures, because the number of current leaks is quite big: we have found more than one thousand leaks in the layout tests. If leaks were treated as crashes, it would be difficult to distinguish true crashes from the existing leaks.

Additionally, we want to show the details of leaks in the logs, such as how many objects were added by loading a page. A stack trace is shown in the crash logs, but it is not needed in leak logs.
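A sketch of the kind of leak log described here, reporting object-count increases rather than a stack trace; the JSON keys and the output layout are only examples, not a specification of content_shell's output.

import json

def format_leak_log(pid, detail_json):
    # detail_json is the DETAIL_IN_JSON_FORMAT part of the '#LEAK - ' line;
    # each value is assumed to be a [count_before_test, count_after_test] pair.
    detail = json.loads(detail_json)
    lines = ['leak detected in renderer pid %d:' % pid]
    for name, (before, after) in sorted(detail.items()):
        lines.append('  %s: %d -> %d (+%d)' % (name, before, after, after - before))
    return '\n'.join(lines)

print(format_leak_log(1234,
    '{"numberOfLiveDocuments": [1, 2], "numberOfLiveNodes": [100, 250]}'))
# leak detected in renderer pid 1234:
#   numberOfLiveDocuments: 1 -> 2 (+1)
#   numberOfLiveNodes: 100 -> 250 (+150)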
On 2014/02/17 04:05:29, hajimehoshi wrote:
> We discussed this and concluded that it would be better to treat leaks as
> errors distinct from crashes. ... If leaks were treated as crashes, it would be
> difficult to distinguish true crashes from the existing leaks.

Hm. Okay, what if you were to simply append the leak output to stdout and treat it as a text failure?

I guess I'm not really sure what the plans for the leak detection are; do we expect to have bots for these configurations and people actively monitoring them and triaging them, etc.? Do we expect normal developers to have to be aware of which tests are exposing leaks?
On 2014/02/18 23:35:43, Dirk Pranke wrote:
> Hm. Okay, what if you were to simply append the leak output to stdout and treat
> it as a text failure?
>
> I guess I'm not really sure what the plans for the leak detection are; do we
> expect to have bots for these configurations and people actively monitoring
> them and triaging them, etc.? Do we expect normal developers to have to be
> aware of which tests are exposing leaks?

Sorry for the lack of explanation. Right, we want to execute the leak detector on the try bots. There are currently over 1000 leaks in the layout tests and the number is increasing; additionally, nobody cares about this... By executing the detector on the try bots, we can share the situation and everyone can monitor and triage the leaks. Of course, we can't treat them as blockers because there are too many, so we want to treat them like the ASAN tests for a while.

We aim to show the result in the HTML output of run_webkit_tests.py and eventually on the waterfall page.

I've just written the text about running this on try bots:
https://docs.google.com/a/chromium.org/document/d/1sFAsZxeISKnbGdXoLZlB2tDZ8p...
On 2014/02/19 04:49:09, hajimehoshi wrote:
> Sorry for the lack of explanation. Right, we want to execute the leak detector
> on the try bots. There are currently over 1000 leaks in the layout tests and
> the number is increasing; additionally, nobody cares about this... By executing
> the detector on the try bots, we can share the situation and everyone can
> monitor and triage the leaks. Of course, we can't treat them as blockers
> because there are too many, so we want to treat them like the ASAN tests for a
> while.
> ...

Okay, so if we were to append the list of detected leaks to the normal text output, this would have the advantage that (a) you could re-use the text failure type and (b) you could land new baselines that included the leak messages; this would also mean that we could just have leak checking always be on (though I don't know what kind of performance hit leak checking causes).

Does appending the leak messages to the text output sound like a possibility to you? For now, you could leave the baselines alone, and track the leak failures using a new, dedicated test expectations file, e.g., LayoutTests/LeakingExpectations or something, that would be used when --enable-leak-detection was passed. Then, if it turned out this was useful we can consider making it be on by default and/or updating baselines.
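For illustration, entries in such a file might follow the usual TestExpectations syntax with a dedicated result token; the file name, the [ Leak ] token, and the test paths below are hypothetical, following the suggestion above rather than anything in this patch.

# LayoutTests/LeakingExpectations (hypothetical)
crbug.com/332630 fast/dom/some-leaking-test.html [ Leak ]
crbug.com/332630 svg/custom/another-leaking-test.svg [ Leak ]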
On 2014/02/19 06:59:17, Dirk Pranke wrote:
> Okay, so if we were to append the list of detected leaks to the normal text
> output, this would have the advantage that (a) you could re-use the text
> failure type and (b) you could land new baselines that included the leak
> messages; this would also mean that we could just have leak checking always be
> on (though I don't know what kind of performance hit leak checking causes).
> ...

> (a) you could re-use the text failure type

I understand that this would minimize changes in the test runner script, but I think leaks and text failures are different concepts. Treating this as a CRASH type would align with how we currently handle ASAN, but unlike ASAN, which actually crashes when it fails, the leak detector keeps running, and we do not need stack trace info either.

> (b) you could land new baselines that included the leak messages.

This isn't clear to me.
1) Ideally we should have no leaks at all, so the baseline files should all be without leak messages. I like the idea of having a separate expectations file like LayoutTests/LeakingExpectations for the known leak failures.
2) Many tests are known to cause leaks but do not have text baseline files. This is especially the case in svg, where the expectation is also written in SVG. It is even very common to have a leak without using JavaScript at all.

Performance-wise, the effect of enabling leak detection is that we aggressively force GC between tests. I'm not sure how much this will change test execution time, but I think it is within an acceptable range. @hajimehoshi: Do you have any data on this?
I wonder how useful this is if it currently reports thousands of leaks. I'd expect that we fix those leaks first (or at least a large part of them), so it would make more sense to optimize our infrastructure for the case where leaks are the exception, not the rule. Wdyt?
Okay, well, I can't say that I'm wild about this approach, but I have failed to convince you otherwise. I don't want to stand in the way of us getting leak detection going, so I'm okay with trying this approach for now and seeing what happens. However, just to make sure I'm not missing something, Ojan, can you take a look at the basic idea here and make sure you're okay with it (a new test failure type for leaks)? The actual code changes lgtm.
I'm not sure I understand how this code is intended to work. What happens if a test both crashes and leaks? Or fails the image diff and leaks? Or times out and leaks?

Another disadvantage of having a new failure type is that you'll need to update garden-o-matic and the flakiness dashboard to understand the new type. On the plus side, once you do, those tools will surface leak information.

An alternate proposal: what if we treat leak information the same way we do stderr output? It gets reported on the results page, but in a separate text file.
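A sketch of that alternative, mirroring how stderr is already written out per test in the results directory; the '-leak-log.txt' suffix is a made-up name for illustration.

import errno
import os

def write_leak_log(results_dir, test_name, leak_log):
    # Hypothetical: write <results_dir>/<test_name>-leak-log.txt, alongside
    # the existing per-test -stderr.txt files shown on the results page.
    if not leak_log:
        return
    out_path = os.path.join(results_dir, test_name + '-leak-log.txt')
    try:
        os.makedirs(os.path.dirname(out_path))
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise
    with open(out_path, 'w') as f:
        f.write(leak_log)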
On 2014/02/22 00:11:52, ojan wrote:
> An alternate proposal: what if we treat leak information the same way we do
> stderr output? It gets reported on the results page, but in a separate text
> file.

If we did that, I don't think we'd have a good way to suppress expected leaks (which they probably need for triaging).
run-webkit-tests --leaks used to work based on stderr, IIRC. Then again, it used the Mac OS X "leaks" tool, which allowed you to pass it a file containing suppressions.
Sorry for my terribly late reply. We have been tackling the problem that content_shell itself causes leaks, and we can now see the light at the end of the tunnel.

On 2014/02/22 00:11:52, ojan wrote:
> I'm not sure I understand how this code is intended to work. What happens if a
> test both crashes and leaks?

A leak is treated like a crash, but with lower priority than a crash, so the crash will be detected.

> Or fails the image diff and leaks?

IIUC, the leak will be detected.

> Or times out and leaks?

IIUC, the timeout will be detected.

> Another disadvantage of having a new failure type is that you'll need to update
> garden-o-matic and the flakiness dashboard to understand the new type. On the
> plus side, once you do, those tools will surface leak information.

We don't think gardeners should care about the detected leaks for now, because for the initial period the leak detection infrastructure will not be stable. Let us think about garden-o-matic and other tools later.
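A small sketch of the precedence just described (crash and timeout take priority over a leak, and a leak takes priority over an image-only failure). The function and result names are illustrative, not the runner's actual code, and the relative ordering of crash versus timeout is an assumption.

def classify_result(crashed, timed_out, leaked, image_diff_failed):
    # Crash/timeout before leak, leak before image failure, per the answers
    # above; the crash-vs-timeout ordering itself is assumed here.
    if crashed:
        return 'CRASH'
    if timed_out:
        return 'TIMEOUT'
    if leaked:
        return 'LEAK'
    if image_diff_failed:
        return 'IMAGE'
    return 'PASS'

assert classify_result(True, False, True, False) == 'CRASH'
assert classify_result(False, False, True, True) == 'LEAK'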
The CQ bit was checked by hajimehoshi@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/hajimehoshi@chromium.org/148153009/420001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: tryserver.blink on linux_blink_rel
The CQ bit was checked by hajimehoshi@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/hajimehoshi@chromium.org/148153009/440001
The CQ bit was unchecked by hajimehoshi@chromium.org
The CQ bit was checked by hajimehoshi@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/hajimehoshi@chromium.org/148153009/460001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: tryserver.blink on win_blink_rel
The CQ bit was checked by hajimehoshi@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/hajimehoshi@chromium.org/148153009/460001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: tryserver.blink on linux_blink_dbg
The CQ bit was checked by hajimehoshi@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/hajimehoshi@chromium.org/148153009/460001
Message was sent while issue was closed.
Change committed as 170255