Created: 6 years, 5 months ago by Sergiy Byelozyorov
Modified: 6 years, 5 months ago
CC: chromium-reviews, dpranke_chromium.org
Base URL: svn://svn.chromium.org/chrome/trunk/src
Project: chromium
Visibility: Public
Description: Created an empty expectations file to ignore flaky tests
This is used by https://codereview.chromium.org/373223003/
BUG=390600
R=eseidel@chromium.org, stip@chromium.org, hinoka@chromium.org
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=284062
Patch Set 1
Patch Set 2
Patch Set 3
Patch Set 4
Patch Set 5
Patch Set 6
Patch Set 7: Updated file format description
Total comments: 6
Patch Set 8 : #Patch Set 9 : Removed text about allow editing file manually #Messages
Total messages: 27 (0 generated)
PTAL
Stip is OOO until Friday. LGTM. The code is likely more interesting, but the file format looks very similar to TestExpectations, which is good.
https://codereview.chromium.org/376653002/diff/120001/tools/ignorer_bot/ignor...
File tools/ignorer_bot/ignored_failed_tests.txt (right):
https://codereview.chromium.org/376653002/diff/120001/tools/ignorer_bot/ignor...
tools/ignorer_bot/ignored_failed_tests.txt:6: # http://crbug.com/ISSUE [ PLATFORM1, PLATFORM2, ... ] TESTNAME
Are you sure you want commas in this list?
https://codereview.chromium.org/376653002/diff/120001/tools/ignorer_bot/ignor...
File tools/ignorer_bot/ignored_failed_tests.txt (right):
https://codereview.chromium.org/376653002/diff/120001/tools/ignorer_bot/ignor...
tools/ignorer_bot/ignored_failed_tests.txt:6: # http://crbug.com/ISSUE [ PLATFORM1, PLATFORM2, ... ] TESTNAME
On 2014/07/08 19:51:59, eseidel wrote:
> Are you sure you want commas in this list?
Yes, how else should I separate platform definitions? For example, a definition with 3 platforms will look as follows:

http://crbug.com/12345 [ OS_LINUX CPU_64_BITS MODE_RELEASE, OS_LINUX CPU_32_BITS MODE_DEBUG, OS_WIN CPU_32_BITS MODE_RELEASE ] MyTests.TestOne

Spaces are used to split flags within a single platform definition, while commas are used to split platforms.
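(For illustration only: a minimal Python sketch of how a line in the format described above could be split into its bug, platform definitions, and test name. The helper name and the regular expression are assumptions made for this sketch, not code from the CL.)

import re

def parse_expectation_line(line):
    # Hypothetical sketch of the proposed format: commas separate platform
    # definitions, spaces separate flags within a single platform definition.
    match = re.match(r'\s*(\S+)\s*\[(.*)\]\s*(\S+)\s*$', line)
    if not match:
        raise ValueError('Malformed expectation line: %r' % line)
    bug, platforms_blob, test = match.groups()
    platforms = [part.split() for part in platforms_blob.split(',') if part.strip()]
    return bug, platforms, test

bug, platforms, test = parse_expectation_line(
    'http://crbug.com/12345 [ OS_LINUX CPU_64_BITS MODE_RELEASE, OS_WIN CPU_32_BITS MODE_RELEASE ] MyTests.TestOne')
# platforms == [['OS_LINUX', 'CPU_64_BITS', 'MODE_RELEASE'], ['OS_WIN', 'CPU_32_BITS', 'MODE_RELEASE']]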
Adding Sergey to lgtm instead of stip.
On 2014/07/09 13:34:14, Sergiy Byelozyorov wrote: > Adding Sergey to lgtm instead of stip. Oh. Sergiy is not an OWNER. Adding hinoka instead.
OK.
On 2014/07/09 13:37:35, Sergiy Byelozyorov wrote: > On 2014/07/09 13:34:14, Sergiy Byelozyorov wrote: > > Adding Sergey to lgtm instead of stip. > > Oh. Sergiy is not an OWNER. Adding hinoka instead. eseidel's LGTM should be sufficient, go ahead.
I think this file should be in the same repository as the tests that are being skipped. That way the test flakiness list is tied to a specific source code revision.

https://codereview.chromium.org/376653002/diff/120001/tools/ignorer_bot/ignor...
File tools/ignorer_bot/ignored_failed_tests.txt (right):
https://codereview.chromium.org/376653002/diff/120001/tools/ignorer_bot/ignor...
tools/ignorer_bot/ignored_failed_tests.txt:6: # http://crbug.com/ISSUE [ PLATFORM1, PLATFORM2, ... ] TESTNAME
Can we use exactly the same format as third_party/WebKit/LayoutTests/FlakyTests? We should use the same code to parse the files, and it's less overhead for developers to need to understand only one format.

Here's the list of things I see that are different:
- No http:// before the bug numbers.
- Include the failure type at the end [ Failure ]. Right now that's the only mode for gtests, but in the future we'll want to distinguish timeouts and crashes like we do for blink tests.
- Use the same platform identifiers: s/OS_Linux/Linux, s/MODE_RELEASE/Release.
- No 32-bit vs. 64-bit identifiers. We used to have this in Blink and it was used so rarely that it wasn't worth the confusion. It's really rare to have tests that are flaky only on one of 32 bit or 64 bit. We can add this back in the future if we find a need for it, but let's start with leaving it out.

https://codereview.chromium.org/376653002/diff/120001/tools/ignorer_bot/ignor...
tools/ignorer_bot/ignored_failed_tests.txt:17: # ignores highly flaky tests. However, it may also be edited manually as long as
I think we should keep this file 100% auto-generated. No manual changes. That way we can freely remove things when they stop being flaky. If people manually add lines, we won't know what we can remove and what we can't.
https://codereview.chromium.org/376653002/diff/120001/tools/ignorer_bot/ignor...
File tools/ignorer_bot/ignored_failed_tests.txt (right):
https://codereview.chromium.org/376653002/diff/120001/tools/ignorer_bot/ignor...
tools/ignorer_bot/ignored_failed_tests.txt:6: # http://crbug.com/ISSUE [ PLATFORM1, PLATFORM2, ... ] TESTNAME
On 2014/07/09 17:34:02, ojan-only-code-yellow-reviews wrote:
> Can we use exactly the same format as third_party/WebKit/LayoutTests/FlakyTests? We should use the same code to parse the files, and it's less overhead for developers to need to understand only one format.
>
> Here's the list of things I see that are different:
> - No http:// before the bug numbers.
Done.
> - Include the failure type at the end [ Failure ]. Right now that's the only mode for gtests, but in the future we'll want to distinguish timeouts and crashes like we do for blink tests.
Right now this is not the only type. We ignore flaky tests which fail in any way. I can add this, but for now it will be ignored by the test runner, which ignores any failure of a test in the list.
> - Use the same platform identifiers: s/OS_Linux/Linux, s/MODE_RELEASE/Release.
This will require writing and supporting a hard-coded mapping from flags to platforms. In particular, this mapping will have to be updated each time new platforms are added. Why not use flags directly?
> - No 32-bit vs. 64-bit identifiers. We used to have this in Blink and it was used so rarely that it wasn't worth the confusion. It's really rare to have tests that are flaky only on one of 32 bit or 64 bit. We can add this back in the future if we find a need for it, but let's start with leaving it out.
Why would this be confusing? Developers will hardly ever read this file, and if the test is flaky on both 32-bit and 64-bit platforms, both will be added automatically.

https://codereview.chromium.org/376653002/diff/120001/tools/ignorer_bot/ignor...
tools/ignorer_bot/ignored_failed_tests.txt:17: # ignores highly flaky tests. However, it may also be edited manually as long as
On 2014/07/09 17:34:02, ojan-only-code-yellow-reviews wrote:
> I think we should keep this file 100% auto-generated. No manual changes. That way we can freely remove things when they stop being flaky. If people manually add lines, we won't know what we can remove and what we can't.
In the first version we will not remove tests automatically, and people will need to edit this file to restore them manually.
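(For context on the mapping being debated: such a mapping could be as small as a dictionary along the lines of the sketch below. All identifier names here are illustrative assumptions, not an agreed-on scheme and not code from the CL.)

# Hypothetical mapping from gtest-runner flags to Blink-style specifiers.
FLAG_TO_SPECIFIER = {
    'OS_LINUX': 'Linux',
    'OS_MACOSX': 'Mac',
    'OS_WIN': 'Win',
    'MODE_RELEASE': 'Release',
    'MODE_DEBUG': 'Debug',
}

def to_specifiers(flags):
    # Flags with no Blink-style equivalent (e.g. CPU_32_BITS) are simply dropped here.
    return [FLAG_TO_SPECIFIER[flag] for flag in flags if flag in FLAG_TO_SPECIFIER]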
This is only used by ignorer_bot, correct? Not the test binaries themselves? I'd strongly consider using a serialization such as YAML or JSON for this instead of your own syntax. YAML is preferred as it has built-in #-style comments while JSON does not. You'd have the benefit that other consumers of the code (such as any monitoring or analysis) would be able to parse your code with less logic than if you had your own format.
Where is the code that produces and consumes this file? Might be good to upload those so we can be sure we're working from the same set of assumptions about how all this stuff hooks together.

The question of location got lost. I would expect this file to either be in the chrome source tree or the new infra.git repo. The important thing is that when a developer syncs to a specific revision they get the right set of ignored tests.

On 2014/07/11 at 18:14:22, stip wrote:
> This is only used by ignorer_bot, correct? Not the test binaries themselves?

In my mental model, ignorer_bot spits out the file. The file is consumed by the test harness/binaries to figure out which test failures to ignore.

> I'd strongly consider using a serialization such as YAML or JSON for this instead of your own syntax. YAML is preferred as it has built-in #-style comments while JSON does not. You'd have the benefit that other consumers of the code (such as any monitoring or analysis) would be able to parse your code with less logic than if you had your own format.

For some test types, there will be a manually curated list in addition to the auto-generated one. We have this for layout tests. TestExpectations (and a few other files) are manually curated and FlakyTests is auto-generated. We shouldn't have different formats for the manually and auto-generated ones.

I'm a big fan of the idea of not having the parsing code littered across multiple places though. What I'd probably do instead of changing the format is to make a small self-contained python script that parses this file and outputs JSON, then have all the tooling call out to that python script. So, the only things that would interact with the file format directly are this one python script and human beings.

I think Dirk is already working on making the parsing code that Blink uses self-contained and reusable (right Dirk?), so the extra work would just be to have that code output JSON.

On 2014/07/10 at 11:12:56, sergiyb wrote:
> > - Include the failure type at the end [ Failure ]. Right now that's the only mode for gtests, but in the future we'll want to distinguish timeouts and crashes like we do for blink tests.
> Right now this is not the only type. We ignore flaky tests which fail in any way.

gtests currently has multiple failure types? What are they?

> I can add this, but for now it will be ignored by the test runner, which ignores any failure of a test in the list.

Why not make it only ignore the failure type that actually happens? At least for layout tests, it's very common that a test will flakily time out, but if it doesn't time out then it passes. If that test started failing asserts, then it should still count as a failure. This is more important for local development than on the buildbots.

> > - Use the same platform identifiers: s/OS_Linux/Linux, s/MODE_RELEASE/Release.
> This will require writing and supporting a hard-coded mapping from flags to platforms. In particular, this mapping will have to be updated each time new platforms are added. Why not use flags directly?

The part I care about is that we have one format across all different test types. So, we'll need a mapping for some test harnesses regardless. Coupling the file format to what gtest uses internally seems bad to me. Now when you want to change gtest, you need to make sure all the other test harnesses are updated appropriately.

That said, I don't really care about the actual terms we use.
I care that we have one format so that developers don't need to make sense of different formats and so that the tools that consume this file don't need to make sense of different formats. If we're going to change formats, we should start by changing the Blink format to the new format. I don't think it's worth all that work just to avoid having a mapping for gtests when we'll need a mapping for other test harnesses.

It seems easiest to me to just use the Blink syntax wholesale for now, and we can change the syntax later if we want to. But if you feel strongly, then we should change the Blink syntax now.

Another aspect of the Blink format relevant to this is that there are meta identifiers, e.g. [ Win ] is a shorthand for [ XP Win7 ].

> > - No 32-bit vs. 64-bit identifiers. We used to have this in Blink and it was used so rarely that it wasn't worth the confusion. It's really rare to have tests that are flaky only on one of 32 bit or 64 bit. We can add this back in the future if we find a need for it, but let's start with leaving it out.
> Why would this be confusing? Developers will hardly ever read this file, and if the test is flaky on both 32-bit and 64-bit platforms, both will be added automatically.

We can add it for now. One problem is that we don't run all test suites on both 32-bit and 64-bit bots. So if a test is flaky on both platforms but we only run the tests on 32-bit, then a developer running the tests locally on 64-bit will not get that test ignored. So, to make that work, you need to check if a test is run on both 32-bit and 64-bit before deciding whether to stick the identifier in this file. Now you have a *lot* of extra complexity for very little benefit.

> https://codereview.chromium.org/376653002/diff/120001/tools/ignorer_bot/ignor...
> tools/ignorer_bot/ignored_failed_tests.txt:17: # ignores highly flaky tests. However, it may also be edited manually as long as
> On 2014/07/09 17:34:02, ojan-only-code-yellow-reviews wrote:
> > I think we should keep this file 100% auto-generated. No manual changes. That way we can freely remove things when they stop being flaky. If people manually add lines, we won't know what we can remove and what we can't.
> In the first version we will not remove tests automatically, and people will need to edit this file to restore them manually.

I think this needs to be in the first version. Removing tests automatically is very important. It's very common for a test to stop being flaky. If this file is always auto-generated, removing tests is very easy to do. Every time you generate the file, you completely overwrite the whole thing, and tests that stopped being flaky get removed from the list.

On 2014/07/09 at 13:30:19, sergiyb wrote:
> https://codereview.chromium.org/376653002/diff/120001/tools/ignorer_bot/ignor...
> File tools/ignorer_bot/ignored_failed_tests.txt (right):
> https://codereview.chromium.org/376653002/diff/120001/tools/ignorer_bot/ignor...
> tools/ignorer_bot/ignored_failed_tests.txt:6: # http://crbug.com/ISSUE [ PLATFORM1, PLATFORM2, ... ] TESTNAME
> On 2014/07/08 19:51:59, eseidel wrote:
> > Are you sure you want commas in this list?
> Yes, how else should I separate platform definitions? For example, a definition with 3 platforms will look as follows:
>
> http://crbug.com/12345 [ OS_LINUX CPU_64_BITS MODE_RELEASE, OS_LINUX CPU_32_BITS MODE_DEBUG, OS_WIN CPU_32_BITS MODE_RELEASE ] MyTests.TestOne
>
> Spaces are used to split flags within a single platform definition, while commas are used to split platforms.
I missed this earlier. In the Blink format, you add separate lines for each thing. So, your example above would be:

crbug.com/12345 [ OS_LINUX CPU_64_BITS MODE_RELEASE ] MyTests.TestOne
crbug.com/12345 [ OS_LINUX CPU_32_BITS MODE_DEBUG ] MyTests.TestOne
crbug.com/12345 [ OS_WIN CPU_32_BITS MODE_RELEASE ] MyTests.TestOne

Although, really the way you'd do this is:

crbug.com/12345 [ OS_LINUX ] MyTests.TestOne
crbug.com/12345 [ OS_WIN CPU_32_BITS MODE_RELEASE ] MyTests.TestOne

If a type of identifier is left out, then it's assumed to apply to both.

Also valid (but unnecessarily redundant) would be:

crbug.com/12345 [ OS_LINUX CPU_32_BITS CPU_64_BITS MODE_RELEASE MODE_DEBUG ] MyTests.TestOne
crbug.com/12345 [ OS_WIN CPU_32_BITS MODE_RELEASE ] MyTests.TestOne

We can bikeshed over the details of the syntax, but again I don't much care about the details. I care that we have one syntax everywhere.
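(To make the "small self-contained python script that parses this file and outputs JSON" idea concrete, here is a rough sketch assuming the one-entry-per-line Blink-style syntax shown above. All names are illustrative assumptions; this is not the actual Blink parser.)

import json
import sys

def parse_flaky_file(lines):
    # Sketch of a parser for lines of the form:
    #   crbug.com/NNN [ SPECIFIER1 SPECIFIER2 ... ] TESTNAME
    # Comment (#) and blank lines are skipped; malformed lines raise ValueError.
    entries = []
    for line in lines:
        line = line.split('#', 1)[0].strip()
        if not line:
            continue
        bug, rest = line.split('[', 1)
        specifiers, test = rest.split(']', 1)
        entries.append({
            'bug': bug.strip(),
            'specifiers': specifiers.split(),
            'test': test.strip(),
        })
    return entries

if __name__ == '__main__':
    # Usage sketch: parse_flaky.py < ignored_failed_tests.txt > ignored_failed_tests.json
    json.dump(parse_flaky_file(sys.stdin), sys.stdout, indent=2)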
On 2014/07/11 20:25:12, ojan-only-code-yellow-reviews wrote: > Where is the code that produces and consumes this file? Might be good to upload > those so we can be sure we're working from the same set of assumptions about how > all this stuff hooks together. > > The question of location got lost. I would expect this file to either be in the > chrome source tree or the new infra.git repo. The important thing is that when a > developer syncs to a specific revision they get the right set of ignored tests. > > On 2014/07/11 at 18:14:22, stip wrote: > > This is only used by ignorer_bot, correct? Not the test binaries themselves? > > In my mental model, ignorer_bot spits out the file. The file is consumed by the > test harness/binaries to figure out which test failures to ignore. > > > I'd strongly consider using a serialization such as YAML or JSON for this > instead of your own syntax. YAML is preferred as it has built-in #-style > comments while JSON does not. You'd have the benefit that other consumers of the > code (such as any monitoring or analysis) would be able to parse your code with > less logic than if you had your own format. > > For some test types, there will be a manually curated list in addition to the > auto-generated one. We have this for layout tests. TestExpectations (and a few > other files) are manually curated and FlakyTests is auto-generated. We shouldn't > have different formats for the manually and auto-generated ones. > > I'm a big fan of the idea of not having the parsing code littered across > multiple places though. What I'd probably do instead of changing the format is > to make a small self-contained python script that parses this file and outputs > JSON, then have all the tooling call out to that python script. So, the only > things that would interact with the file format directly are this one python > script and human beings. This seems slightly crazy to me. Why not just have everyone use this YAML or JSON file directly? As soon as you introduce a translation script you're asking for pain. Is the issue that currently layout tests and testexpectations have their own formats? We should use this opportunity to unify them on a standardized serialization instead of adding Yet Another Configuration Language. > > I think Dirk is already working on making the parsing code that Blink uses > self-contained and reusable (right Dirk?), so the extra work would just be to > have that code output JSON. > > On 2014/07/10 at 11:12:56, sergiyb wrote: > > > -Include the failure type at the end [ Failure ]. Right now that's the only > mode > > > for gtests, but in the future we'll want to distinguish timeouts and crashes > > > like we do for blink tests. > > Right now this is not the only type. We ignore flaky tests which fail on any > types. > > gtests currently has multiple failure types? What are they? > > > I can add this, but it will be for now ignored by the test runner, which > ignores any failures of the test in the list. > > Why not make it only ignore the failure type that actually happens? At least for > layout tests, it's very common that a test will flaky timeout, but if it doesn't > timeout then it passes. If that test started failing asserts, then it should > still count as a failure. This is more important for local development than on > the buildbots. > > > > -Use the same plaform identifiers: s/OS_Linux/Linux, s/MODE_RELEASE/Release. > > This will require writing and supporting a hard-coded mapping from flags to > platforms. 
In particular this mapping will have to be updated each time the new > platforms are added. Why not use flags directly? > > The part I care about is that we have one format across all different test > types. So, we'll need a mapping for some test harnesses regardless. Coupling the > file format to what gtest uses internally seems bad to me. Now when you want to > change gtest, you need to make sure all the other test harnesses are updated > appropriately. > > That said, I don't really care about the actual terms we use. I care that we > have one format so that developers don't need to make sense of different formats > and so that the tools that consume this file don't need to make sense of > different formats. If we're going to change formats, we should start by changing > the Blink format to the new format. I don't think it's worth all that work just > to avoid having a mapping for gtests when we'll need a mapping for other test > harnesses. > > It seems easiest to me to just use the Blink syntax wholesale for now and we can > change the syntax later if we want to. But if you feel strongly, then we should > change the Blink syntax now. > > Another aspect of the blink format relevant to this is that there are meta > identifiers, e.g. [ Win ] is a shorthand for [ XP Win7 ]. > > > > -No 32 bit vs. 64 bit identifiers. We used to have this in Blink and it was > used > > > so rarely that it wasn't worth the confusion. It's really rare to have tests > > > that are flaky only on one of 32 bit or 64 bit. We can add this back in the > > > future if we find a need for it, but lets start with leaving it out. > > Why would this be confusing? Developers will hardly ever read this file and if > the test is flaky on both 32-bit and 64-bit platforms both will be added > automatically. > > We can add it for now. One problem is that we don't run all test suites on both > 32bit and 64bit bots. So if a test is flaky on both platforms, we only run the > tests on 32bit, then a developer running the tests locally on 64bit will not get > that test ignored. So, to make that work, you need to check if a test is run on > both 32bit and 64bit before deciding whether to stick the identifier in this > file. Now you have a *lot* of extra complexity for very little benefit. > > > > https://codereview.chromium.org/376653002/diff/120001/tools/ignorer_bot/ignor... > > tools/ignorer_bot/ignored_failed_tests.txt:17: # ignores highly flaky tests. > However, it may also be edited manually as long as > > On 2014/07/09 17:34:02, ojan-only-code-yellow-reviews wrote: > > > I think we should keep this file 100% auto-generated. No manual changes. > That > > > way we can freely remove things when they stop being flaky. If people > manually > > > add lines, we won't know what we can remove and what we can't. > > In the first version we will not remove tests automatically and people will > need to edit this file to restore them manually. > > I think this needs to be in the first version. Removing tests automatically is > very important. It's very common for a test to stop being flaky. If this file is > always auto-generated, removing tests is very easy to do. Everytime you generate > the file, you completely overwrite the whole thing and tests that stopped being > flaky get removed from the list. > > On 2014/07/09 at 13:30:19, sergiyb wrote: > > > https://codereview.chromium.org/376653002/diff/120001/tools/ignorer_bot/ignor... 
> > File tools/ignorer_bot/ignored_failed_tests.txt (right): > > > > > https://codereview.chromium.org/376653002/diff/120001/tools/ignorer_bot/ignor... > > tools/ignorer_bot/ignored_failed_tests.txt:6: # http://crbug.com/ISSUE [ > PLATFORM1, PLATFORM2, ... ] TESTNAME > > On 2014/07/08 19:51:59, eseidel wrote: > > > Are you sure you want commas in this list? > > Yes, how otherwise should I separate platform definitions? For example a > definition with 3 platforms will look as following: > > > > http://crbug.com/12345 [ OS_LINUX CPU_64_BITS MODE_RELEASE, OS_LINUX > CPU_32_BITS MODE_DEBUG, > > OS_WIN CPU_32_BITS MODE_RELEASE ] MyTests.TestOne > > > > Spaces are used to split flags within a since platform definition, while > commas are used to split platforms. > > I missed this earlier. In the Blink format, you add separate lines for each > thing. So, you're example above would be: > crbug.com/12345 [ OS_LINUX CPU_64_BITS MODE_RELEASE ] MyTests.TestOne > crbug.com/12345 [ OS_LINUX CPU_32_BITS MODE_DEBUG ] MyTests.TestOne > crbug.com/12345 [ OS_WIN CPU_32_BITS MODE_RELEASE ] MyTests.TestOne > > Although, really the way you'd do this is: > crbug.com/12345 [ OS_LINUX ] MyTests.TestOne > crbug.com/12345 [ OS_WIN CPU_32_BITS MODE_RELEASE ] MyTests.TestOne > > If a type of an identifier is left out, then it's assumed to apply to both. > > Also valid (but unnecessarily redundant) would be: > crbug.com/12345 [ OS_LINUX CPU_32_BITS CPU_64_BITS MODE_RELEASE MODE_DEBUG ] > MyTests.TestOne > crbug.com/12345 [ OS_WIN CPU_32_BITS MODE_RELEASE ] MyTests.TestOne > > We can bikeshed over the details of the syntax, but again I don't much care > about the details. I care that we have one syntax everywhere.
> > On 2014/07/11 at 18:14:22, stip wrote:
> > > This is only used by ignorer_bot, correct? Not the test binaries themselves?
> >
> > In my mental model, ignorer_bot spits out the file. The file is consumed by the test harness/binaries to figure out which test failures to ignore.
> >
> > > I'd strongly consider using a serialization such as YAML or JSON for this instead of your own syntax. YAML is preferred as it has built-in #-style comments while JSON does not. You'd have the benefit that other consumers of the code (such as any monitoring or analysis) would be able to parse your code with less logic than if you had your own format.
> >
> > For some test types, there will be a manually curated list in addition to the auto-generated one. We have this for layout tests. TestExpectations (and a few other files) are manually curated and FlakyTests is auto-generated. We shouldn't have different formats for the manually and auto-generated ones.
> >
> > I'm a big fan of the idea of not having the parsing code littered across multiple places though. What I'd probably do instead of changing the format is to make a small self-contained python script that parses this file and outputs JSON, then have all the tooling call out to that python script. So, the only things that would interact with the file format directly are this one python script and human beings.
>
> This seems slightly crazy to me. Why not just have everyone use this YAML or JSON file directly? As soon as you introduce a translation script you're asking for pain. Is the issue that currently layout tests and testexpectations have their own formats? We should use this opportunity to unify them on a standardized serialization instead of adding Yet Another Configuration Language.

I don't feel strongly about whether we use the current format or JSON, but I do feel strongly that we have *one* format for developers to wrap their heads around. I'm totally fine switching to JSON. I certainly see the simplicity it gives us. But, if we do that, let's start by converting the Blink format over first.

Here's a compact-ish format I could imagine using:

{
  "legend": [ "bugs", "specifiers", "test", "expectations" ],
  "sections": [
    {
      "comment": "...",
      "expectations": [
        [["http://crbug.com/123"], ["Linux", "Debug"], "path/to/test", ["Pass", "Failure"]],
        [["http://crbug.com/123"], ["Linux", "Debug"], "path/to/test", ["Pass", "Failure"]],
        [["http://crbug.com/123"], ["Linux", "Debug"], "path/to/test", ["Pass", "Failure"]],
        [["http://crbug.com/123"], ["Linux", "Debug"], "path/to/test", ["Pass", "Failure"]],
        [["http://crbug.com/123"], ["Linux", "Debug"], "path/to/test", ["Pass", "Failure"]],
        [["http://crbug.com/123"], ["Linux", "Debug"], "path/to/test", ["Pass", "Failure"]],
        [["http://crbug.com/123"], ["Linux", "Debug"], "path/to/test", ["Pass", "Failure"]],
        [["http://crbug.com/123"], ["Linux", "Debug"], "path/to/test", ["Pass", "Failure"]]
      ]
    },
    {
      "expectations": [
        [["http://crbug.com/123"], ["Linux", "Debug"], "path/to/test", ["Pass", "Failure"]],
        [["http://crbug.com/123"], ["Linux", "Debug"], "path/to/test", ["Pass", "Failure"]]
      ]
    }
  ]
}

I'm not too familiar with YAML, but is there a way to do a one-line list? I think having each test span multiple lines would be a non-starter.
On 2014/07/15 02:52:31, ojan-only-code-yellow-reviews wrote: > > > On 2014/07/11 at 18:14:22, stip wrote: > > > > This is only used by ignorer_bot, correct? Not the test binaries > themselves? > > > > > > In my mental model, ignorer_bot spits out the file. The file is consumed by > the > > > test harness/binaries to figure out which test failures to ignore. > > > > > > > I'd strongly consider using a serialization such as YAML or JSON for this > > > instead of your own syntax. YAML is preferred as it has built-in #-style > > > comments while JSON does not. You'd have the benefit that other consumers of > the > > > code (such as any monitoring or analysis) would be able to parse your code > with > > > less logic than if you had your own format. > > > > > > For some test types, there will be a manually curated list in addition to > the > > > auto-generated one. We have this for layout tests. TestExpectations (and a > few > > > other files) are manually curated and FlakyTests is auto-generated. We > shouldn't > > > have different formats for the manually and auto-generated ones. > > > > > > I'm a big fan of the idea of not having the parsing code littered across > > > multiple places though. What I'd probably do instead of changing the format > is > > > to make a small self-contained python script that parses this file and > outputs > > > JSON, then have all the tooling call out to that python script. So, the only > > > things that would interact with the file format directly are this one python > > > script and human beings. > > > > This seems slightly crazy to me. Why not just have everyone use this YAML or > JSON file directly? As soon as you introduce a translation script you're asking > for pain. Is the issue that currently layout tests and testexpectations have > their own formats? We should use this opportunity to unify them on a > standardized serialization instead of adding Yet Another Configuration Language. > > I don't feel strongly about whether we use the current format or JSON, but I do > feel strongly that we have *one* format for developers to wrap their heads > around. I'm totally fine switching to JSON. I certainly see the simplicity it > gives us. But, if we do that, lets start by converting the Blink format over > first. > > Here's a compact-ish format I could imagine using: > { > "legend": [ "bugs", "specifiers", "test", "expectations" ], > "sections": [ > { > "comment": "...", > "expectations": [ > [["http://crbug.com/123"], ["Linux", "Debug"], "path/to/test", > ["Pass", "Failure"]], > [["http://crbug.com/123"], ["Linux", "Debug"], "path/to/test", > ["Pass", "Failure"]], > [["http://crbug.com/123"], ["Linux", "Debug"], "path/to/test", > ["Pass", "Failure"]], > [["http://crbug.com/123"], ["Linux", "Debug"], "path/to/test", > ["Pass", "Failure"]], > [["http://crbug.com/123"], ["Linux", "Debug"], "path/to/test", > ["Pass", "Failure"]], > [["http://crbug.com/123"], ["Linux", "Debug"], "path/to/test", > ["Pass", "Failure"]], > [["http://crbug.com/123"], ["Linux", "Debug"], "path/to/test", > ["Pass", "Failure"]], > [["http://crbug.com/123"], ["Linux", "Debug"], "path/to/test", > ["Pass", "Failure"]], > ] > }, > { > "expectations": [ > [["http://crbug.com/123"], ["Linux", "Debug"], "path/to/test", > ["Pass", "Failure"]], > [["http://crbug.com/123"], ["Linux", "Debug"], "path/to/test", > ["Pass", "Failure"]], > ] > } > ] > } > > I'm not too familiar with YAML, but is there a way to do a one-line list? I > think having each test span multiple lines would be a non-starter. 
http://www.yaml.org/YAML_for_ruby.html#simple_inline_array

Also, according to http://jsontoyaml.com/: "Technically speaking, JSON already is YAML since YAML allows for the use of brackets and commas in the exact same way that JSON does."
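(For illustration of both points, assuming PyYAML is available: a flow-style YAML list stays on one line and parses to the same structure as the equivalent JSON, and the JSON text itself is also accepted by the YAML parser. The expectation entry below is just a placeholder value.)

import json
import yaml  # PyYAML

yaml_text = '[[crbug.com/123], [Linux, Debug], path/to/test, [Pass, Failure]]'
json_text = '[["crbug.com/123"], ["Linux", "Debug"], "path/to/test", ["Pass", "Failure"]]'

# The one-line YAML flow sequence and the JSON text load to the same Python value.
assert yaml.safe_load(yaml_text) == json.loads(json_text)
# JSON is, for this kind of data, also valid YAML.
assert yaml.safe_load(json_text) == json.loads(json_text)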
If we can agree on something here, then we can propose it to blink-dev. I ran this by a couple of people and their reaction was pretty allergic, but I don't mind having the broader discussion.

I tried to get this to be YAML. I don't know YAML that well though, so I'm not sure I did it in the best way.

legend: [ bugs, specifiers, test, expectations ]
sections: [
  {
    comment: These tests fail because....,
    expectations: [[crbug.com/123], [Linux, Debug], path/to/test, [Pass, Failure]],
                  [[crbug.com/123], [Win], path/to/test2, [Failure]],
  },
  {
    expectations: [[Bug(ojan), crbug.com/234], [], path/to/test, [Pass, Failure]],
  },
]

We can't just use YAML comments because we need to be able to read the comments so that we can write them back out again.
I disagree that we should follow the same format for both layout tests and gtests. They are too different in nature. Layout tests are HTML files, while gtests are pieces of code with asserts. Layout tests have text and image expectations and may also be expected to fail in various ways, while gtests are always expected to pass all assertions; the various failure types are only provided to help developers with debugging.

Furthermore, gtests already have an established method to disable them, but that method was designed to be used by humans and can't be used to disable tests automatically, as it requires changing C++ source code. Therefore we will have some inconsistency anyway.

Finally, why does it matter if we have different formats as long as these files are only processed by the ignorer-bot and the test runner? There are no plans for developers to ever edit this file.

As for the format, I prefer JSON, but YAML is also okay.
I propose to move further discussion online.
lgtm to land as is.

We chatted offline about this and agreed that we should eventually have one file format for all test types, but also that we don't need to block this work on that. Dirk is planning on making the Blink parsing code for this generic, and we can convert the gtests format over to the shared format when he's done.

Some things that we should still resolve eventually, but that we can work out on infra-dev:
- I don't think we should allow manually editing this file. It's very important that we automatically remove lines from this file when tests stop being flaky, which manual editing makes hard.
- I don't understand why it makes sense for the file to be located here.
The CQ bit was checked by sergiyb@chromium.org
The CQ bit was unchecked by sergiyb@chromium.org
The CQ bit was checked by sergiyb@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/sergiyb@chromium.org/376653002/160001
On 2014/07/17 20:40:34, ojan-only-code-yellow-reviews wrote:
> lgtm to land as is.
>
> We chatted offline about this and agreed that we should eventually have one file format for all test types, but also that we don't need to block this work on that. Dirk is planning on making the Blink parsing code for this generic, and we can convert the gtests format over to the shared format when he's done.
Committed.
> Some things that we should still resolve eventually, but that we can work out on infra-dev:
>
> - I don't think we should allow manually editing this file. It's very important that we automatically remove lines from this file when tests stop being flaky, which manual editing makes hard.
Agreed. I removed the statement about allowing manual edits and placed a huge warning at the beginning of the file.
> - I don't understand why it makes sense for the file to be located here.
I wasn't sure where to place it and asked a couple of Chrome developers about it. If you have a better suggestion, I'll be glad to move the file there. Note that it must be in the Chrome repo, so that runtest.py has access to it when running tests.
Message was sent while issue was closed.
Change committed as 284062
Message was sent while issue was closed.
A revert of this CL (patchset #9 id:160001) has been created in https://codereview.chromium.org/549103003/ by sergiyb@chromium.org. The reason for reverting is: The ignorer-bot project has been delayed while we determine whether it is necessary at all. This file doesn't have to be in the repo.