Created: 6 years, 10 months ago by hajimehoshi
Modified: 6 years, 9 months ago
CC: blink-reviews, haraken
Base URL: svn://svn.chromium.org/blink/trunk
Visibility: Public.
Description
DOM-object leak detection in run_webkit_tests.py
We are implementing a DOM-object leak detector: when the number of
DOM objects has increased after a test, we can assume that the test
causes a leak. This feature is available in content_shell behind the flag
--enable-leak-detection.
This patch exposes the leak detection feature in run_webkit_tests.py
via the same --enable-leak-detection flag.
content_shell is assumed to report a detected leak in the following
format:
#LEAK - renderer pid PID (DETAIL_IN_JSON_FORMAT)
and with this patch run_webkit_tests.py treats such a line as a kind of error.
See also:
https://docs.google.com/a/chromium.org/document/d/1sFAsZxeISKnbGdXoLZlB2tDZ8pvO102ePFQx6TX4X14/edit
BUG=332630
TEST=test-webkitpy
Committed: https://src.chromium.org/viewvc/blink?view=rev&revision=170255
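To illustrate the reporting format above, here is a minimal sketch of how a runner-side script might recognize and decode such a line. The helper name and the JSON keys in the comment are assumptions for illustration, not part of the patch.

import json
import re

# Hypothetical helper: recognizes the leak line content_shell is assumed to
# print, e.g.
#   #LEAK - renderer pid 1234 ({"numberOfLiveDocuments": [1, 2]})
_LEAK_LINE_RE = re.compile(r'^#LEAK - renderer pid (\d+) \((.*)\)$')

def parse_leak_line(stderr_line):
    """Returns (pid, detail_dict) for a leak report line, or None otherwise."""
    match = _LEAK_LINE_RE.match(stderr_line.strip())
    if not match:
        return None
    pid = int(match.group(1))
    try:
        detail = json.loads(match.group(2))
    except ValueError:
        detail = {}
    return pid, detail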
Patch Set 1: .
Total comments: 6
Patch Set 2: Add outputting leak logs
Patch Set 3: Remove .coverage
Patch Set 4: (rebasing)
Patch Set 5: Bug fix: Reset driver._leaked when starting tests
Patch Set 6: Bug fix: unit test
Messages
Total messages: 40 (0 generated)
Can you take a look? Thank you in advance.
+dpranke who's a better reviewer for this than me
I'm not sure if this patch is quite right. When we detect leaks, do we actually also crash the process? If we don't, I'd probably want to change how this is done a bit.

https://chromiumcodereview.appspot.com/148153009/diff/20001/Tools/Scripts/web...
File Tools/Scripts/webkitpy/layout_tests/port/driver.py (right):

https://chromiumcodereview.appspot.com/148153009/diff/20001/Tools/Scripts/web...
Tools/Scripts/webkitpy/layout_tests/port/driver.py:116: # WebKitTestRunenr can report back subprocess DOM-object leaks by printing
WebKitTestRunner doesn't exist in Blink :) I'd probably change this to "content_shell can report back DOM object leaks by printing".
On 2014/01/28 20:11:46, Dirk Pranke wrote:
> I'm not sure if this patch is quite right. When we detect leaks, do we actually
> also crash the process? If we don't, I'd probably want to change how this is
> done a bit.

Note that "we" means "the leak detection code inside content_shell", not run-webkit-tests. In other words, has the process also crashed?
> When we detect leaks, do we actually also crash the process?

No. There is no special reason why I treated leaks as crashes in this Python script. Could you tell me how you wanted to change this?
On 2014/01/29 01:59:24, hajimehoshi wrote:
> > When we detect leaks, do we actually also crash the process?
>
> No. There is no special reason why I treated leaks as crashes in this Python
> script. Could you tell me how you wanted to change this?

IIUC, we don't crash the process, but content_shell would stop running after hitting the leaking test?
On 2014/01/29 01:59:24, hajimehoshi wrote:
> Could you tell me how you wanted to change this?

Yes, I was waiting for the answer to figure out how I wanted to comment on the patch, which I will do now (in a separate email :).
https://chromiumcodereview.appspot.com/148153009/diff/20001/Tools/Scripts/web...
File Tools/Scripts/webkitpy/layout_tests/port/driver.py (right):

https://chromiumcodereview.appspot.com/148153009/diff/20001/Tools/Scripts/web...
Tools/Scripts/webkitpy/layout_tests/port/driver.py:189: crash_log += 'DOM-object leaking was detected.\n'
Given that the process isn't actually crashing, it is probably wrong to be making any changes to _check_for_driver_crash() or inside the if block that starts on line 178.

I suggest that you add lines before line 167 that parse the stderr output and, if they find the '# LEAK - ' line, mark something to tell the driver code to restart content_shell (assuming that's what you want to have happen) and propagate the right information back up in the DriverOutput.

Let me know if the approach I describe turns out to have problems.

https://chromiumcodereview.appspot.com/148153009/diff/20001/Tools/Scripts/web...
Tools/Scripts/webkitpy/layout_tests/port/driver.py:344: or error_line.startswith("#LEAK - ")):
See comments above.
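A minimal sketch of the shape suggested here, assuming a hypothetical helper class; the names (_LeakState, check_stderr_line, and the idea of a leak field on DriverOutput) are illustrative and are not the code that actually landed in driver.py.

LEAK_PREFIX = '#LEAK - '

class _LeakState(object):
    """Tracks whether the current test leaked, based on stderr lines."""

    def __init__(self):
        self.leaked = False
        self.leak_log = ''

    def check_stderr_line(self, error_line):
        # Called for each stderr line read while running a test.
        if error_line.startswith(LEAK_PREFIX):
            self.leaked = True
            self.leak_log += error_line
            # The driver would then restart content_shell before the next
            # test and propagate leaked/leak_log upwards, e.g. via a field
            # on DriverOutput (field name hypothetical).

state = _LeakState()
state.check_stderr_line('#LEAK - renderer pid 1234 ({})\n')
assert state.leaked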
Thank you! I added output of the detected leaks, separated from crashes. Sorry for the lack of tests; I'll add them soon.

https://codereview.chromium.org/148153009/diff/20001/Tools/Scripts/webkitpy/l...
File Tools/Scripts/webkitpy/layout_tests/port/driver.py (right):

https://codereview.chromium.org/148153009/diff/20001/Tools/Scripts/webkitpy/l...
Tools/Scripts/webkitpy/layout_tests/port/driver.py:116: # WebKitTestRunenr can report back subprocess DOM-object leaks by printing
On 2014/01/28 20:11:46, Dirk Pranke wrote:
> WebKitTestRunner doesn't exist in Blink :) I'd probably change this to
> "content_shell can report back DOM object leaks by printing".
Done.

https://codereview.chromium.org/148153009/diff/20001/Tools/Scripts/webkitpy/l...
Tools/Scripts/webkitpy/layout_tests/port/driver.py:189: crash_log += 'DOM-object leaking was detected.\n'
On 2014/01/30 01:43:21, Dirk Pranke wrote:
> Given that the process isn't actually crashing, it is probably wrong to be
> making any changes to _check_for_driver_crash() or inside the if block that
> starts on line 178.
> ...
Done.

https://codereview.chromium.org/148153009/diff/20001/Tools/Scripts/webkitpy/l...
Tools/Scripts/webkitpy/layout_tests/port/driver.py:344: or error_line.startswith("#LEAK - ")):
On 2014/01/30 01:43:21, Dirk Pranke wrote:
> See comments above.
Done.
+kouhei (CC)
Oh dear, I'm afraid I wasn't clear enough, and you've done a lot more work than I would've preferred :(.

Given that we're not going to run with this flag on all the time (as far as I know), we probably don't want to support making this an entirely different type of failure like Failure, ImageOnlyFailure, etc. I would prefer to treat this like we do ASAN failures, and treat them as a kind of crash.

So, what I meant in my previous comments was that the driver should report back that the test crashed (so that we don't need to change all of the other files), but that the code in driver.py itself should not confuse the two situations.

I'm afraid the "treat this as an entirely different kind of failure" approach is just going to make things harder to manage and maintain. Does that make sense?
> (kouhei)
> IIUC, we don't crash the process, but content_shell would stop running after
> hitting the leaking test?

No, as long as content_shell is executed in multi-process mode; I think it is not run with the flag --single-process on the bots. Sorry for the late response.

On 2014/02/14 21:27:00, Dirk Pranke wrote:
> I would prefer to treat this like we do ASAN failures, and treat them as a kind
> of crash.
> ...
> I'm afraid the "treat this as an entirely different kind of failure" approach
> is just going to make things harder to manage and maintain. Does that make
> sense?

We discussed this and concluded that it would be better to treat leaks as errors distinct from crashes. Your assumption that the leak detection is not executed all the time is right, but we are still against treating leaks as crashes like ASAN failures, because the number of current leaks is quite big: we have found more than one thousand leaks in the layout tests. If leaks were treated as crashes, it would be difficult to distinguish true crashes from the existing leaks.

Additionally, we want to show the details of leaks in the logs, such as how many objects were added by loading a page. A stack trace is shown in the crash logs, but it is not needed in leak logs.
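A sketch of the kind of leak log described here, reporting object-count increases rather than a stack trace; the JSON keys and the output layout are only examples, not a specification of content_shell's output.

import json

def format_leak_log(pid, detail_json):
    # detail_json is the DETAIL_IN_JSON_FORMAT part of the '#LEAK - ' line;
    # each value is assumed to be a [count_before_test, count_after_test] pair.
    detail = json.loads(detail_json)
    lines = ['leak detected in renderer pid %d:' % pid]
    for name, (before, after) in sorted(detail.items()):
        lines.append('  %s: %d -> %d (+%d)' % (name, before, after, after - before))
    return '\n'.join(lines)

print(format_leak_log(1234,
    '{"numberOfLiveDocuments": [1, 2], "numberOfLiveNodes": [100, 250]}'))
# leak detected in renderer pid 1234:
#   numberOfLiveDocuments: 1 -> 2 (+1)
#   numberOfLiveNodes: 100 -> 250 (+150)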
On 2014/02/17 04:05:29, hajimehoshi wrote:
> We discussed this and concluded that it would be better to treat leaks as
> errors distinct from crashes. ... If leaks were treated as crashes, it would be
> difficult to distinguish true crashes from the existing leaks.

Hm. Okay, what if you were to simply append the leak output to stdout and treat it as a text failure?

I guess I'm not really sure what the plans for the leak detection are; do we expect to have bots for these configurations and people actively monitoring them and triaging them, etc.? Do we expect normal developers to have to be aware of which tests are exposing leaks?
On 2014/02/18 23:35:43, Dirk Pranke wrote:
> Hm. Okay, what if you were to simply append the leak output to stdout and treat
> it as a text failure?
>
> I guess I'm not really sure what the plans for the leak detection are; do we
> expect to have bots for these configurations and people actively monitoring
> them and triaging them, etc.? Do we expect normal developers to have to be
> aware of which tests are exposing leaks?

Sorry for the lack of explanation. Right, we want to execute the leak detector on the try bots. There are currently over 1000 leaks in the layout tests and the number is increasing; additionally, nobody cares about this... By executing the detector on the try bots, we can share the situation and everyone can monitor and triage the leaks. Of course, we can't treat them as blockers because there are too many, so we want to treat them like the ASAN tests for a while.

We aim to show the result in the HTML output of run_webkit_tests.py and eventually on the waterfall page.

I've just written the text about running this on try bots:
https://docs.google.com/a/chromium.org/document/d/1sFAsZxeISKnbGdXoLZlB2tDZ8p...
On 2014/02/19 04:49:09, hajimehoshi wrote:
> Sorry for the lack of explanation. Right, we want to execute the leak detector
> on the try bots. There are currently over 1000 leaks in the layout tests and
> the number is increasing; additionally, nobody cares about this... By executing
> the detector on the try bots, we can share the situation and everyone can
> monitor and triage the leaks. Of course, we can't treat them as blockers
> because there are too many, so we want to treat them like the ASAN tests for a
> while.
> ...

Okay, so if we were to append the list of detected leaks to the normal text output, this would have the advantage that (a) you could re-use the text failure type and (b) you could land new baselines that included the leak messages; this would also mean that we could just have leak checking always be on (though I don't know what kind of performance hit leak checking causes).

Does appending the leak messages to the text output sound like a possibility to you? For now, you could leave the baselines alone, and track the leak failures using a new, dedicated test expectations file, e.g., LayoutTests/LeakingExpectations or something, that would be used when --enable-leak-detection was passed. Then, if it turned out this was useful we can consider making it be on by default and/or updating baselines.
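For illustration, entries in such a file might follow the usual TestExpectations syntax with a dedicated result token; the file name, the [ Leak ] token, and the test paths below are hypothetical, following the suggestion above rather than anything in this patch.

# LayoutTests/LeakingExpectations (hypothetical)
crbug.com/332630 fast/dom/some-leaking-test.html [ Leak ]
crbug.com/332630 svg/custom/another-leaking-test.svg [ Leak ]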
On 2014/02/19 06:59:17, Dirk Pranke wrote:
> Okay, so if we were to append the list of detected leaks to the normal text
> output, this would have the advantage that (a) you could re-use the text
> failure type and (b) you could land new baselines that included the leak
> messages; this would also mean that we could just have leak checking always be
> on (though I don't know what kind of performance hit leak checking causes).
> ...

> (a) you could re-use the text failure type

I understand that this would minimize changes in the test runner script, but I think leaks and text failures are different concepts. Treating this as a CRASH type would align with how we currently handle ASAN, but unlike ASAN, which actually crashes when it fails, the leak detector keeps running, and we do not need stack trace info either.

> (b) you could land new baselines that included the leak messages.

This isn't clear to me.
1) Ideally we should have no leaks at all, so the baseline files should all be without leak messages. I like the idea of having a separate expectations file like LayoutTests/LeakingExpectations for the known leak failures.
2) Many tests are known to cause leaks but do not have text baseline files. This is especially the case in svg, where the expectation is also written in SVG. It is even very common to have a leak without using JavaScript at all.

Performance-wise, the effect of enabling leak detection is that we aggressively force GC between tests. I'm not sure how much this will change test execution time, but I think it is within an acceptable range. @hajimehoshi: Do you have any data on this?
I wonder how useful this is if it currently reports thousands of leaks. I'd expect that we fix those leaks first (or at least a large part of them), so it would make more sense to optimize our infrastructure for the case where leaks are the exception, not the rule. Wdyt?
Okay, well, I can't say that I'm wild about this approach, but I have failed to convince you otherwise. I don't want to stand in the way of us getting leak detection going, so I'm okay with trying this approach for now and seeing what happens. However, just to make sure I'm not missing something, Ojan, can you take a look at the basic idea here and make sure you're okay with it (a new test failure type for leaks)? The actual code changes lgtm.
I'm not sure I understand how this code is intended to work. What happens if a test both crashes and leaks? Or fails the image diff and leaks? Or times out and leaks?

Another disadvantage of having a new failure type is that you'll need to update garden-o-matic and the flakiness dashboard to understand the new type. On the plus side, once you do, those tools will surface leak information.

An alternate proposal: what if we treat leak information the same way we do stderr output? It gets reported on the results page, but in a separate text file.
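A sketch of that alternative, mirroring how stderr is already written out per test in the results directory; the '-leak-log.txt' suffix is a made-up name for illustration.

import errno
import os

def write_leak_log(results_dir, test_name, leak_log):
    # Hypothetical: write <results_dir>/<test_name>-leak-log.txt, alongside
    # the existing per-test -stderr.txt files shown on the results page.
    if not leak_log:
        return
    out_path = os.path.join(results_dir, test_name + '-leak-log.txt')
    try:
        os.makedirs(os.path.dirname(out_path))
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise
    with open(out_path, 'w') as f:
        f.write(leak_log)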
On 2014/02/22 00:11:52, ojan wrote:
> An alternate proposal: what if we treat leak information the same way we do
> stderr output? It gets reported on the results page, but in a separate text
> file.

If we did that, I don't think we'd have a good way to suppress expected leaks (which they probably need for triaging).
run-webkit-tests --leaks used to work based on stderr, IIRC. Then again, it used the Mac OS X "leaks" tool, which allowed you to pass it a file containing suppressions.
Sorry for my terribly late reply. We have been tackling the problem that content_shell itself causes leaks, and we can now see the light at the end of the tunnel.

On 2014/02/22 00:11:52, ojan wrote:
> I'm not sure I understand how this code is intended to work. What happens if a
> test both crashes and leaks?

A leak is treated like a crash, but with lower priority than a crash, so the crash will be detected.

> Or fails the image diff and leaks?

IIUC, the leak will be detected.

> Or times out and leaks?

IIUC, the timeout will be detected.

> Another disadvantage of having a new failure type is that you'll need to update
> garden-o-matic and the flakiness dashboard to understand the new type. On the
> plus side, once you do, those tools will surface leak information.

We don't think gardeners should care about the detected leaks for now, because for the initial period the leak detection infrastructure will not be stable. Let us think about garden-o-matic and other tools later.
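A small sketch of the precedence just described (crash and timeout take priority over a leak, and a leak takes priority over an image-only failure). The function and result names are illustrative, not the runner's actual code, and the relative ordering of crash versus timeout is an assumption.

def classify_result(crashed, timed_out, leaked, image_diff_failed):
    # Crash/timeout before leak, leak before image failure, per the answers
    # above; the crash-vs-timeout ordering itself is assumed here.
    if crashed:
        return 'CRASH'
    if timed_out:
        return 'TIMEOUT'
    if leaked:
        return 'LEAK'
    if image_diff_failed:
        return 'IMAGE'
    return 'PASS'

assert classify_result(True, False, True, False) == 'CRASH'
assert classify_result(False, False, True, True) == 'LEAK'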
The CQ bit was checked by hajimehoshi@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/hajimehoshi@chromium.org/148153009/420001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: tryserver.blink on linux_blink_rel
The CQ bit was checked by hajimehoshi@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/hajimehoshi@chromium.org/148153009/440001
The CQ bit was unchecked by hajimehoshi@chromium.org
The CQ bit was checked by hajimehoshi@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/hajimehoshi@chromium.org/148153009/460001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: tryserver.blink on win_blink_rel
The CQ bit was checked by hajimehoshi@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/hajimehoshi@chromium.org/148153009/460001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: tryserver.blink on linux_blink_dbg
The CQ bit was checked by hajimehoshi@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/hajimehoshi@chromium.org/148153009/460001
Message was sent while issue was closed.
Change committed as 170255