Created: 4 years, 4 months ago by Charlie Harrison
Modified: 4 years, 4 months ago
CC: chromium-reviews, Dirk Pranke
Base URL: https://chromium.googlesource.com/chromium/src.git@master
Target Ref: refs/pending/heads/master
Project: chromium
Visibility: Public
Description:
Add testing configs for ParseHTMLOnMainThread experiment
BUG=623165
Committed: https://crrev.com/46be1b831ffec878df1b258a4f26872451d7795e
Cr-Commit-Position: refs/heads/master@{#411880}
Patch Set 1
Patch Set 2: fix bug in sync case
Patch Set 3: Fix test
Total comments: 4
Patch Set 4: Edit browsertest
Total comments: 3
Patch Set 5: add virtual test suites and make field trial configs for Enabled group only
Total comments: 6
Patch Set 6: Devlin review
Total comments: 4
Patch Set 7: nits

Messages
Total messages: 51 (29 generated)
The CQ bit was checked by csharrison@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_asan_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
The CQ bit was checked by csharrison@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_chromeos_ozone_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
The CQ bit was checked by csharrison@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: chromeos_amd64-generic_chromium_compile_only_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/chromeos_amd64-...) chromeos_x86-generic_chromium_compile_only_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/chromeos_x86-ge...)
The CQ bit was checked by csharrison@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: cast_shell_linux on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/cast_shell_linu...)
csharrison@chromium.org changed reviewers: + dpranke@chromium.org, rdevlin.cronin@chromium.org
rdevlin, do the extension test changes seem reasonable? I think these are just race conditions surfaced by the experiment. +dpranke for test configs. Note that this experiment has 4 groups, so I gave each group its own platform. This surfaced one test error in extension tests.
https://codereview.chromium.org/2221193002/diff/40001/chrome/test/data/extens...
File chrome/test/data/extensions/api_test/extension_resource_request_policy/web_accessible/accessible_redirect_resource.html (right):

https://codereview.chromium.org/2221193002/diff/40001/chrome/test/data/extens...
chrome/test/data/extensions/api_test/extension_resource_request_policy/web_accessible/accessible_redirect_resource.html:2: window.onload = function() {
Why does this no longer work? It seems like something web pages or extensions would potentially do. (Admittedly, not quite so trivially, but the idea of modifying the location before the DOM content is fully loaded is not in and of itself insane.)
https://codereview.chromium.org/2221193002/diff/40001/chrome/test/data/extens...
File chrome/test/data/extensions/api_test/extension_resource_request_policy/web_accessible/accessible_redirect_resource.html (right):

https://codereview.chromium.org/2221193002/diff/40001/chrome/test/data/extens...
chrome/test/data/extensions/api_test/extension_resource_request_policy/web_accessible/accessible_redirect_resource.html:2: window.onload = function() {
On 2016/08/09 20:23:38, Devlin wrote:
> Why does this no longer work? It seems like something web pages or extensions
> would potentially do. (Admittedly, not quite so trivially, but the idea of
> modifying the location before the DOM content is fully loaded is not in and of
> itself insane.)

It still works, but there's a race in your browsertest fixture, where I believe you expect two load events, so we hang waiting for the second one because this page load is aborted before reaching that point.

At least, that was my analysis.
https://codereview.chromium.org/2221193002/diff/40001/chrome/test/data/extens...
File chrome/test/data/extensions/api_test/extension_resource_request_policy/web_accessible/accessible_redirect_resource.html (right):

https://codereview.chromium.org/2221193002/diff/40001/chrome/test/data/extens...
chrome/test/data/extensions/api_test/extension_resource_request_policy/web_accessible/accessible_redirect_resource.html:2: window.onload = function() {
On 2016/08/09 20:27:41, Charlie Harrison wrote:
> On 2016/08/09 20:23:38, Devlin wrote:
> > Why does this no longer work? It seems like something web pages or extensions
> > would potentially do. (Admittedly, not quite so trivially, but the idea of
> > modifying the location before the DOM content is fully loaded is not in and of
> > itself insane.)
>
> It still works, but there's a race in your browsertest fixture, where I believe
> you expect two load events, so we hang waiting for the second one because this
> page load is aborted before reaching that point.
>
> At least, that was my analysis.

Ah, okay, I assume you're talking about this line:
https://chromium.googlesource.com/chromium/src/+/5cf9d45c437b7b2d899e46f2f324...

Rather than change this file, I'd prefer to change the browsertest. Can we update it to wait for the expected URL to load rather than a set number of navigations (e.g., use UrlLoadObserver from ui_test_utils)? That way, we keep this behavior and make sure it doesn't break, still wait for the proper url to load, and aren't relying on whether or not a load finished.

WDYT?
https://codereview.chromium.org/2221193002/diff/40001/chrome/test/data/extens...
File chrome/test/data/extensions/api_test/extension_resource_request_policy/web_accessible/accessible_redirect_resource.html (right):

https://codereview.chromium.org/2221193002/diff/40001/chrome/test/data/extens...
chrome/test/data/extensions/api_test/extension_resource_request_policy/web_accessible/accessible_redirect_resource.html:2: window.onload = function() {
On 2016/08/09 20:47:45, Devlin wrote:
> On 2016/08/09 20:27:41, Charlie Harrison wrote:
> > On 2016/08/09 20:23:38, Devlin wrote:
> > > Why does this no longer work? It seems like something web pages or extensions
> > > would potentially do. (Admittedly, not quite so trivially, but the idea of
> > > modifying the location before the DOM content is fully loaded is not in and of
> > > itself insane.)
> >
> > It still works, but there's a race in your browsertest fixture, where I believe
> > you expect two load events, so we hang waiting for the second one because this
> > page load is aborted before reaching that point.
> >
> > At least, that was my analysis.
>
> Ah, okay, I assume you're talking about this line:
> https://chromium.googlesource.com/chromium/src/+/5cf9d45c437b7b2d899e46f2f324...
>
> Rather than change this file, I'd prefer to change the browsertest. Can we
> update it to wait for the expected URL to load rather than a set number of
> navigations (e.g., use UrlLoadObserver from ui_test_utils)? That way, we keep
> this behavior and make sure it doesn't break, still wait for the proper url to
> load, and aren't relying on whether or not a load finished.
>
> WDYT?

Yep, that solution sounds good to me! Will upload a new PS shortly.
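[Editor's note: for readers following the suggestion, the pattern looks roughly like this inside the browsertest fixture. This is a sketch only, not the exact patch; page_that_redirects is a hypothetical stand-in for whatever navigation the test performs, and expected_url stands for the extension resource the redirect should end up on.]

    // Wait for a specific URL to finish loading rather than counting load
    // stops. UrlLoadObserver comes from chrome/test/base/ui_test_utils.h.
    GURL expected_url = extension->GetURL("test.png");
    ui_test_utils::UrlLoadObserver observer(
        expected_url, content::NotificationService::AllSources());
    ui_test_utils::NavigateToURL(browser(), page_that_redirects);
    observer.Wait();  // Returns once expected_url loads, no matter how many
                      // intermediate navigations the redirect produces.

This removes the dependency on the number of load events, which is exactly what races when the first page load aborts early.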
The CQ bit was checked by csharrison@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: chromium_presubmit on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/chromium_presub...)
dpranke@chromium.org changed reviewers: + isherman@chromium.org
It's probably better to have a //testing/variations/ OWNER approve this, as I'm not all that familiar with the experimenting stuff. Maybe isherman@?
https://codereview.chromium.org/2221193002/diff/60001/testing/variations/fiel...
File testing/variations/fieldtrial_testing_config_linux.json (right):

https://codereview.chromium.org/2221193002/diff/60001/testing/variations/fiel...
testing/variations/fieldtrial_testing_config_linux.json:167: }
I think the best practice is to test the configuration that you are most likely to ship, and to test that configuration on each platform, rather than testing one configuration per platform. This allows us to catch, e.g. performance regressions, even if they happen only on a single platform.
https://codereview.chromium.org/2221193002/diff/60001/testing/variations/fiel...
File testing/variations/fieldtrial_testing_config_linux.json (right):

https://codereview.chromium.org/2221193002/diff/60001/testing/variations/fiel...
testing/variations/fieldtrial_testing_config_linux.json:167: }
On 2016/08/10 05:53:13, Ilya Sherman wrote:
> I think the best practice is to test the configuration that you are most likely
> to ship, and to test that configuration on each platform, rather than testing
> one configuration per platform. This allows us to catch, e.g. performance
> regressions, even if they happen only on a single platform.

That makes sense, but I'd argue that shipping experiment groups that are not tested (or tested on a small subset of tests) is worse than missing platform specific behavior of an experiment group.

In this case I think we can reach a compromise, as most of the code this touches is tested by layout tests which can have virtual test suites added. However as discovered by this CL this doesn't catch everything :)

WDYT?
isherman@chromium.org changed reviewers: + asvitkine@chromium.org, rkaplow@chromium.org
+Rob and Alexei for their thoughts on field trial configs with multiple experimental groups (PTAL at the CL and past discussion)

https://codereview.chromium.org/2221193002/diff/60001/testing/variations/fiel...
File testing/variations/fieldtrial_testing_config_linux.json (right):

https://codereview.chromium.org/2221193002/diff/60001/testing/variations/fiel...
testing/variations/fieldtrial_testing_config_linux.json:167: }
On 2016/08/10 12:05:31, Charlie Harrison wrote:
> On 2016/08/10 05:53:13, Ilya Sherman wrote:
> > I think the best practice is to test the configuration that you are most likely
> > to ship, and to test that configuration on each platform, rather than testing
> > one configuration per platform. This allows us to catch, e.g. performance
> > regressions, even if they happen only on a single platform.
>
> That makes sense, but I'd argue that shipping experiment groups that are not
> tested (or tested on a small subset of tests) is worse than missing platform
> specific behavior of an experiment group.
>
> In this case I think we can reach a compromise, as most of the code this touches
> is tested by layout tests which can have virtual test suites added. However as
> discovered by this CL this doesn't catch everything :)
>
> WDYT?

In general, each feature's tests should test all the configurations for that feature. These config files are for running the *entire* test suite with the most likely field trial configuration for shipping, to see if there are any unexpected regressions detected by more general tests. Thus, we prefer to test a single configuration, the one that is actually going to be launched to users assuming that nothing looks wrong. We only require/recommend adding these configs once the field trial is reaching the Beta channel, on the assumption that by that point, it's usually pretty clear which configuration is the most promising one.

Now, there's a lot of assumptions in the above, which aren't necessarily correct -- most notably, that there's enough data prior to testing on Beta to determine which configuration is most promising. For this particular experiment, which configuration are you *expecting* to ship? Is there a single candidate, or multiple ones?
On 2016/08/10 18:53:21, Ilya Sherman wrote:
> +Rob and Alexei for their thoughts on field trial configs with multiple
> experimental groups (PTAL at the CL and past discussion)
>
> https://codereview.chromium.org/2221193002/diff/60001/testing/variations/fiel...
> File testing/variations/fieldtrial_testing_config_linux.json (right):
>
> https://codereview.chromium.org/2221193002/diff/60001/testing/variations/fiel...
> testing/variations/fieldtrial_testing_config_linux.json:167: }
> On 2016/08/10 12:05:31, Charlie Harrison wrote:
> > On 2016/08/10 05:53:13, Ilya Sherman wrote:
> > > I think the best practice is to test the configuration that you are most likely
> > > to ship, and to test that configuration on each platform, rather than testing
> > > one configuration per platform. This allows us to catch, e.g. performance
> > > regressions, even if they happen only on a single platform.
> >
> > That makes sense, but I'd argue that shipping experiment groups that are not
> > tested (or tested on a small subset of tests) is worse than missing platform
> > specific behavior of an experiment group.
> >
> > In this case I think we can reach a compromise, as most of the code this touches
> > is tested by layout tests which can have virtual test suites added. However as
> > discovered by this CL this doesn't catch everything :)
> >
> > WDYT?
>
> In general, each feature's tests should test all the configurations for that
> feature. These config files are for running the *entire* test suite with the
> most likely field trial configuration for shipping, to see if there are any
> unexpected regressions detected by more general tests. Thus, we prefer to test
> a single configuration, the one that is actually going to be launched to users
> assuming that nothing looks wrong. We only require/recommend adding these
> configs once the field trial is reaching the Beta channel, on the assumption
> that by that point, it's usually pretty clear which configuration is the most
> promising one.
>
> Now, there's a lot of assumptions in the above, which aren't necessarily correct
> -- most notably, that there's enough data prior to testing on Beta to determine
> which configuration is most promising. For this particular experiment, which
> configuration are you *expecting* to ship? Is there a single candidate, or
> multiple ones?

Agree with everything Ilya has said. I don't like the idea of using platforms to fake testing multiple experiment groups - I would prefer correctly testing the most likely experiment. It would be nice if we had the option to test multiple groups, but since we don't have that option yet, I'd prefer using the model of testing the best candidate. Usually at the time that this part gets setup, there has already been experimentation on canary/dev, so generally at this point the groups are only slightly different - maybe differences in parameters which are less risky.

If the group that we expect to launch ends up changing, it will then change here before shipping, of course, so any bugs will get shaken out here in either case.
On 2016/08/10 at 19:14:54, rkaplow wrote:
> On 2016/08/10 18:53:21, Ilya Sherman wrote:
> > +Rob and Alexei for their thoughts on field trial configs with multiple
> > experimental groups (PTAL at the CL and past discussion)
> >
> > https://codereview.chromium.org/2221193002/diff/60001/testing/variations/fiel...
> > File testing/variations/fieldtrial_testing_config_linux.json (right):
> >
> > https://codereview.chromium.org/2221193002/diff/60001/testing/variations/fiel...
> > testing/variations/fieldtrial_testing_config_linux.json:167: }
> > On 2016/08/10 12:05:31, Charlie Harrison wrote:
> > > On 2016/08/10 05:53:13, Ilya Sherman wrote:
> > > > I think the best practice is to test the configuration that you are most likely
> > > > to ship, and to test that configuration on each platform, rather than testing
> > > > one configuration per platform. This allows us to catch, e.g. performance
> > > > regressions, even if they happen only on a single platform.
> > >
> > > That makes sense, but I'd argue that shipping experiment groups that are not
> > > tested (or tested on a small subset of tests) is worse than missing platform
> > > specific behavior of an experiment group.
> > >
> > > In this case I think we can reach a compromise, as most of the code this touches
> > > is tested by layout tests which can have virtual test suites added. However as
> > > discovered by this CL this doesn't catch everything :)
> > >
> > > WDYT?
> >
> > In general, each feature's tests should test all the configurations for that
> > feature. These config files are for running the *entire* test suite with the
> > most likely field trial configuration for shipping, to see if there are any
> > unexpected regressions detected by more general tests. Thus, we prefer to test
> > a single configuration, the one that is actually going to be launched to users
> > assuming that nothing looks wrong. We only require/recommend adding these
> > configs once the field trial is reaching the Beta channel, on the assumption
> > that by that point, it's usually pretty clear which configuration is the most
> > promising one.
> >
> > Now, there's a lot of assumptions in the above, which aren't necessarily correct
> > -- most notably, that there's enough data prior to testing on Beta to determine
> > which configuration is most promising. For this particular experiment, which
> > configuration are you *expecting* to ship? Is there a single candidate, or
> > multiple ones?
>
> Agree with everything Ilya has said. I don't like the idea of using platforms to
> fake testing multiple experiment groups - I would prefer correctly testing the
> most likely experiment. It would be nice if we had the option to test multiple
> groups, but since we don't have that option yet, I'd prefer using the model of
> testing the best candidate. Usually at the time that this part gets setup, there
> has already been experimentation on canary/dev, so generally at this point the
> groups are only slightly different - maybe differences in parameters which are
> less risky.
>
> If the group that we expect to launch ends up changing, it will then change here
> before shipping, of course, so any bugs will get shaken out here in either case.

Thank you both for the excellent explanations. A few points:

- These experiments are in Dev; I only wanted this to increase coverage due to my own nervousness after finding a pretty serious bug in a group that was not tested.
- There is a way to increase coverage without abusing multiple platforms: virtual test suites for LayoutTests pointing at a subset of webkit_tests. I will do that instead.
- I think we will be able to use Dev data to inform us of the proper Beta groups (and the most likely candidate to ship).

All in all, I think what you are saying makes sense for this experiment, and I was being too aggressive with this CL. I'll update the CL.
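[Editor's note: for concreteness, the single-candidate model Ilya and Rob describe corresponds to an entry of roughly this shape in testing/variations/fieldtrial_testing_config_linux.json. This is a sketch following the general format of the per-platform files at the time, using the study name from this CL's title and the group name from the "Enabled group only" patch set; the authoritative version is the file in the patch.]

    {
        "ParseHTMLOnMainThread": [
            {
                "group_name": "Enabled"
            }
        ]
    }

With one such entry per per-platform file, the bots that honor these configs run the full test suite under the one configuration that is actually expected to ship.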
The CQ bit was checked by csharrison@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
OK, I have made the changes to the configs and added virtual test suites for two experiment groups covering the fast/parser tests. Thanks again for taking a look.
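[Editor's note: a virtual test suite entry in third_party/WebKit/LayoutTests/VirtualTestSuites has roughly this shape. This is a hypothetical sketch: the prefix and the exact flag below are assumptions, not values taken from the patch.]

    {
        "prefix": "parse_html_on_main_thread",
        "base": "fast/parser",
        "args": ["--enable-features=ParseHTMLOnMainThread"]
    }

Each entry reruns the base directory (here fast/parser) with the listed flags, which is how an experiment group can get layout-test coverage without touching the field trial testing configs.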
extensions lgtm with nit

https://codereview.chromium.org/2221193002/diff/80001/chrome/browser/extensio...
File chrome/browser/extensions/extension_resource_request_policy_apitest.cc (left):

https://codereview.chromium.org/2221193002/diff/80001/chrome/browser/extensio...
chrome/browser/extensions/extension_resource_request_policy_apitest.cc:269: ASSERT_TRUE(content::ExecuteScriptAndExtractString(
Using JS to get the url was probably overkill, but I wonder if we should have a WebContents::GetLastCommittedURL() check here. It's pretty cheap and would double-check that the tab in question is actually the one that loaded the url (though I doubt any other ever would).

https://codereview.chromium.org/2221193002/diff/80001/chrome/browser/extensio...
chrome/browser/extensions/extension_resource_request_policy_apitest.cc:280: ASSERT_TRUE(content::ExecuteScriptAndExtractString(
ditto

https://codereview.chromium.org/2221193002/diff/80001/chrome/browser/extensio...
File chrome/browser/extensions/extension_resource_request_policy_apitest.cc (right):

https://codereview.chromium.org/2221193002/diff/80001/chrome/browser/extensio...
chrome/browser/extensions/extension_resource_request_policy_apitest.cc:268: GURL("chrome-extension://ggmldgjhdenlnjjjmehkomheglpmijnf/test.png"),
nit: instead, use a reference to the extension (LoadExtension should return an Extension*) and use extension->GetURL("test.png").
Thanks Devlin

https://codereview.chromium.org/2221193002/diff/80001/chrome/browser/extensio...
File chrome/browser/extensions/extension_resource_request_policy_apitest.cc (left):

https://codereview.chromium.org/2221193002/diff/80001/chrome/browser/extensio...
chrome/browser/extensions/extension_resource_request_policy_apitest.cc:269: ASSERT_TRUE(content::ExecuteScriptAndExtractString(
On 2016/08/10 20:20:08, Devlin wrote:
> Using JS to get the url was probably overkill, but I wonder if we should have a
> WebContents::GetLastCommittedURL() check here. It's pretty cheap and would
> double-check that the tab in question is actually the one that loaded the url
> (though I doubt any other ever would).

Done.

https://codereview.chromium.org/2221193002/diff/80001/chrome/browser/extensio...
chrome/browser/extensions/extension_resource_request_policy_apitest.cc:280: ASSERT_TRUE(content::ExecuteScriptAndExtractString(
On 2016/08/10 20:20:08, Devlin wrote:
> ditto

Done.

https://codereview.chromium.org/2221193002/diff/80001/chrome/browser/extensio...
File chrome/browser/extensions/extension_resource_request_policy_apitest.cc (right):

https://codereview.chromium.org/2221193002/diff/80001/chrome/browser/extensio...
chrome/browser/extensions/extension_resource_request_policy_apitest.cc:268: GURL("chrome-extension://ggmldgjhdenlnjjjmehkomheglpmijnf/test.png"),
On 2016/08/10 20:20:08, Devlin wrote:
> nit: instead, use a reference to the extension (LoadExtension should return an
> Extension*) and use extension->GetURL("test.png").

Done.
(s lgtm)

https://codereview.chromium.org/2221193002/diff/100001/chrome/browser/extensi...
File chrome/browser/extensions/extension_resource_request_policy_apitest.cc (right):

https://codereview.chromium.org/2221193002/diff/100001/chrome/browser/extensi...
chrome/browser/extensions/extension_resource_request_policy_apitest.cc:241: .AppendASCII("web_accessible"));
nitty nit: leave in an ASSERT_TRUE(extension) so that this still gives us a nice error if the extension fails to load.

https://codereview.chromium.org/2221193002/diff/100001/chrome/browser/extensi...
chrome/browser/extensions/extension_resource_request_policy_apitest.cc:276: EXPECT_EQ(browser()
nitty nit: expected, actual
so
EXPECT_EQ(accessible_url,
          <web_contents_url>)
https://codereview.chromium.org/2221193002/diff/100001/chrome/browser/extensi...
File chrome/browser/extensions/extension_resource_request_policy_apitest.cc (right):

https://codereview.chromium.org/2221193002/diff/100001/chrome/browser/extensi...
chrome/browser/extensions/extension_resource_request_policy_apitest.cc:241: .AppendASCII("web_accessible"));
On 2016/08/10 20:42:35, Devlin wrote:
> nitty nit: leave in an ASSERT_TRUE(extension) so that this still gives us a nice
> error if the extension fails to load.

Done. Sorry about that.

https://codereview.chromium.org/2221193002/diff/100001/chrome/browser/extensi...
chrome/browser/extensions/extension_resource_request_policy_apitest.cc:276: EXPECT_EQ(browser()
On 2016/08/10 20:42:35, Devlin wrote:
> nitty nit: expected, actual
> so
> EXPECT_EQ(accessible_url,
>           <web_contents_url>)

Done.
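[Editor's note: put together, the check discussed in these nits ends up looking roughly like this. This is assembled from the review comments rather than copied from the patch; extension_dir is a placeholder for the real path expression in the test.]

    // ASSERT_TRUE gives a clear failure message if the extension fails to load.
    base::FilePath extension_dir;  // path to the web_accessible test extension
    const extensions::Extension* extension = LoadExtension(extension_dir);
    ASSERT_TRUE(extension);
    GURL accessible_url = extension->GetURL("test.png");
    // ... navigate and wait for accessible_url to finish loading ...
    // EXPECT_EQ takes (expected, actual), so the expected URL comes first.
    EXPECT_EQ(accessible_url,
              browser()->tab_strip_model()->GetActiveWebContents()
                  ->GetLastCommittedURL());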
Thanks. Field trial testing configs lgtm.
csharrison@chromium.org changed reviewers: + kouhei@chromium.org
+kouhei, would you take a look at the virtual tests?
lgtm
The CQ bit was checked by csharrison@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from rdevlin.cronin@chromium.org
Link to the patchset: https://codereview.chromium.org/2221193002/#ps120001 (title: "nits")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Committed patchset #7 (id:120001)
Message was sent while issue was closed.
Description was changed from
==========
Add testing configs for ParseHTMLOnMainThread experiment

BUG=623165
==========
to
==========
Add testing configs for ParseHTMLOnMainThread experiment

BUG=623165

Committed: https://crrev.com/46be1b831ffec878df1b258a4f26872451d7795e
Cr-Commit-Position: refs/heads/master@{#411880}
==========
Message was sent while issue was closed.
Patchset 7 (id:??) landed as https://crrev.com/46be1b831ffec878df1b258a4f26872451d7795e
Cr-Commit-Position: refs/heads/master@{#411880}