OLD | NEW |
---|---|
1 # Perf Bot Sheriffing | 1 # Perf Bot Sheriffing |
2 | 2 |
3 The perf bot sheriff is responsible for keeping the bots on the chromium.perf | 3 The perf bot sheriff is responsible for keeping the bots on the chromium.perf |
4 waterfall up and running, and triaging performance test failures and flakes. | 4 waterfall up and running, and triaging performance test failures and flakes. |
5 | 5 |
6 **[Rotation calendar](https://calendar.google.com/calendar/embed?src=google.com_2fpmo740pd1unrui9d7cgpbg2k%40group.calendar.google.com)** | 6 **[Rotation calendar](https://calendar.google.com/calendar/embed?src=google.com_2fpmo740pd1unrui9d7cgpbg2k%40group.calendar.google.com)** |
7 | 7 |
8 ## Key Responsibilities | 8 ## Key Responsibilities |
9 | 9 |
10 * [Handle Device and Bot Failures](#botfailures) | 10 * [Handle Device and Bot Failures](#botfailures) |
(...skipping 20 matching lines...) | |
31 it easier to see a summary. | 31 it easier to see a summary. |
32 2. [Waterfall view](https://uberchromegw.corp.google.com/i/chromium.perf/waterfall) | 32 2. [Waterfall view](https://uberchromegw.corp.google.com/i/chromium.perf/waterfall) |
33 shows more details, including recent changes. | 33 shows more details, including recent changes. |
34 3. [Firefighter](https://chromiumperfstats.appspot.com/) shows traces of | 34 3. [Firefighter](https://chromiumperfstats.appspot.com/) shows traces of |
35 recent builds. It takes url parameter arguments: | 35 recent builds. It takes url parameter arguments: |
36 * **master** can be chromium.perf, tryserver.chromium.perf | 36 * **master** can be chromium.perf, tryserver.chromium.perf |
37 * **builder** can be a builder or tester name, like | 37 * **builder** can be a builder or tester name, like |
38 "Android Nexus5 Perf (2)" | 38 "Android Nexus5 Perf (2)" |
39 * **start_time** is seconds since the epoch. | 39 * **start_time** is seconds since the epoch. |
40 | 40 |
| 41 |
| 42 There is also [milo](https://luci-milo.appspot.com), which has the same data as |
| 43 buildbot, but mirrored in a different datastore. It is generally faster than |
| 44 buildbot, and links to it will not break, as the data is kept around for much |
| 45 longer. |
| 46 |
41 In addition to watching the waterfall directly, | 47 In addition to watching the waterfall directly, |
42 [Sheriff-O-Matic](https://sheriff-o-matic.appspot.com/chromium.perf) may | 48 [Sheriff-O-Matic](https://sheriff-o-matic.appspot.com/chromium.perf) may |
43 optionally be used to easily track the different issues and associate | 49 optionally be used to easily track the different issues and associate |
44 them with specific bugs. | 50 them with specific bugs. It also attempts to group together similar failures |
| 51 across different builders, so it can help to see a higher level perspective on |
| 52 what is happening on the perf waterfall. |
45 | 53 |
46 You can see a list of all previously filed bugs using the | 54 You can see a list of all previously filed bugs using the |
47 **[Performance-Sheriff-BotHealth](https://bugs.chromium.org/p/chromium/issues/list?can=2&q=label%3APerformance-Sheriff-BotHealth)** | 55 **[Performance-Sheriff-BotHealth](https://bugs.chromium.org/p/chromium/issues/list?can=2&q=label%3APerformance-Sheriff-BotHealth)** |
48 label in crbug. | 56 label in crbug. |
49 | 57 |
50 Please also check the recent | 58 Please also check the recent |
51 **[perf-sheriffs@chromium.org](https://groups.google.com/a/chromium.org/forum/#!forum/perf-sheriffs)** | 59 **[perf-sheriffs@chromium.org](https://groups.google.com/a/chromium.org/forum/#!forum/perf-sheriffs)** |
52 postings for important announcements about bot turndowns and other known issues. | 60 postings for important announcements about bot turndowns and other known issues. |
53 | 61 |
54 ## Handle Device and Bot Failures | 62 ## Handle Device and Bot Failures |
(...skipping 10 matching lines...) | |
65 * [chromium.perf buildslaves](https://build.chromium.org/p/chromium.perf/buildslaves) | 73 * [chromium.perf buildslaves](https://build.chromium.org/p/chromium.perf/buildslaves) |
66 * [tryserver.chromium.perf buildslaves](https://build.chromium.org/p/tryserver.chromium.perf/buildslaves) | 74 * [tryserver.chromium.perf buildslaves](https://build.chromium.org/p/tryserver.chromium.perf/buildslaves) |
67 | 75 |
68 The machines restart between test runs, so just looking for "Status: Not | 76 The machines restart between test runs, so just looking for "Status: Not |
69 connected" is not enough to indicate a problem. For each disconnected machine, | 77 connected" is not enough to indicate a problem. For each disconnected machine, |
70 you can also check the "Last heard from" column to ensure that it's been gone | 78 you can also check the "Last heard from" column to ensure that it's been gone |
71 for at least an hour. To get it running again, | 79 for at least an hour. To get it running again, |
72 [file a bug](https://bugs.chromium.org/p/chromium/issues/entry?labels=Pri-1,Performance-Sheriff-BotHealth,Infra-Troopers,OS-?&comment=Hostname:&summary=Buildslave+offline+on+chromium.perf) | 80 [file a bug](https://bugs.chromium.org/p/chromium/issues/entry?labels=Pri-1,Performance-Sheriff-BotHealth,Infra-Troopers,OS-?&comment=Hostname:&summary=Buildslave+offline+on+chromium.perf) |
73 against the current trooper and read [go/bug-a-trooper](http://go/bug-a-trooper) for contacting troopers. | 81 against the current trooper and read [go/bug-a-trooper](http://go/bug-a-trooper) for contacting troopers. |
74 | 82 |
| 83 The chrome infrastructure team also maintains a dashboard you can use to view |
| 84 some debugging information about down machines. This is available at |
| 85 [go/chrome-infra-mon](http://go/chrome-infra-mon). To debug offline buildslaves, |
Sergey Berezin
2016/09/29 18:23:01
nit: this is the first time I see that go/ link :-
martiniss
2016/09/29 20:07:14
Done.
| |
| 86 you can look at the "Individual machine" dashboard, under the "Machines" |
Sergey Berezin
2016/09/29 18:23:01
nit: provide direct link, e.g. vi/chrome_infra/Mac
martiniss
2016/09/29 20:08:35
Done
| |
| 87 section, which can show some useful information about the machine in question. |
| 88 |
75 ### Purple bots | 89 ### Purple bots |
76 | 90 |
77 When a bot goes purple, it's usually because of an infrastructure failure | 91 When a bot goes purple, it's usually because of an infrastructure failure |
78 outside of the tests. But you should first check the logs of a purple bot to | 92 outside of the tests. But you should first check the logs of a purple bot to |
79 try to better understand the problem. Sometimes a telemetry test failure can | 93 try to better understand the problem. Sometimes a telemetry test failure can |
80 turn the bot purple, for example. | 94 turn the bot purple, for example. |
81 | 95 |
82 If the bot goes purple and you believe it's an infrastructure issue, file a bug | 96 If the bot goes purple and you believe it's an infrastructure issue, file a bug |
83 with | 97 with |
84 [this template](https://bugs.chromium.org/p/chromium/issues/entry?labels=Pri-1,Performance-Sheriff-BotHealth,Infra-Troopers,OS-?&comment=Link+to+buildbot+status+page:&summary=Purple+Bot+on+chromium.perf), | 98 [this template](https://bugs.chromium.org/p/chromium/issues/entry?labels=Pri-1,Performance-Sheriff-BotHealth,Infra-Troopers,OS-?&comment=Link+to+buildbot+status+page:&summary=Purple+Bot+on+chromium.perf), |
(...skipping 175 matching lines...) | |
260 | 274 |
261 **[Pri-2 bugs](https://bugs.chromium.org/p/chromium/issues/list?can=2&q=label%3APerformance-Sheriff-BotHealth+label%3APri-2)** | 275 **[Pri-2 bugs](https://bugs.chromium.org/p/chromium/issues/list?can=2&q=label%3APerformance-Sheriff-BotHealth+label%3APri-2)** |
262 are for disabled tests. These should be pinged weekly, and work towards fixing | 276 are for disabled tests. These should be pinged weekly, and work towards fixing |
263 should be ongoing when the sheriff is not working on a Pri-1 issue. Here is the | 277 should be ongoing when the sheriff is not working on a Pri-1 issue. Here is the |
264 [list of Pri-2 bugs that have not been pinged in a week](https://bugs.chromium.org/p/chromium/issues/list?can=2&q=label:Performance-Sheriff-BotHealth%20label:Pri-2%20modified-before:today-7&sort=modified). | 278 [list of Pri-2 bugs that have not been pinged in a week](https://bugs.chromium.org/p/chromium/issues/list?can=2&q=label:Performance-Sheriff-BotHealth%20label:Pri-2%20modified-before:today-7&sort=modified). |
265 | 279 |
266 <!-- Unresolved issues: | 280 <!-- Unresolved issues: |
267 1. Do perf sheriffs watch the bisect waterfall? | 281 1. Do perf sheriffs watch the bisect waterfall? |
268 2. Do perf sheriffs watch the internal clank waterfall? | 282 2. Do perf sheriffs watch the internal clank waterfall? |
269 --> | 283 --> |
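
Note on the Firefighter entry (lines 34-39 of the doc): the three URL parameters it lists (master, builder, start_time) could be assembled as in the sketch below. The base URL comes from the doc. That Firefighter reads them as an ordinary query string is an assumption, and `firefighter_url` is a hypothetical helper, not part of any Chromium tool.

```python
from urllib.parse import urlencode

# Base URL as given in the doc under review.
FIREFIGHTER_URL = "https://chromiumperfstats.appspot.com/"


def firefighter_url(master, builder, start_time):
    """Build a Firefighter link (hypothetical helper).

    Assumes the parameters are passed as a plain query string;
    the doc only says Firefighter "takes url parameter arguments".
    """
    params = {
        "master": master,               # e.g. chromium.perf or tryserver.chromium.perf
        "builder": builder,             # builder or tester name
        "start_time": int(start_time),  # seconds since the epoch
    }
    # urlencode escapes the spaces and parentheses found in builder names.
    return FIREFIGHTER_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(firefighter_url("chromium.perf", "Android Nexus5 Perf (2)", 1475170000))
```

Building the link through `urlencode` avoids hand-escaping builder names such as "Android Nexus5 Perf (2)".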
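
Note on the buildslave check (lines 76-79 of the new doc): the rule that "Status: Not connected" alone is not enough, and that the machine should also have been gone for at least an hour, can be written as a small predicate. This is only a sketch. `needs_bug` and its arguments are hypothetical, and the status string and timestamp are assumed to have been read off the buildslaves page by hand or by some other means.

```python
from datetime import datetime, timedelta

# Threshold taken from the doc: gone "for at least an hour".
OFFLINE_THRESHOLD = timedelta(hours=1)


def needs_bug(status, last_heard_from, now):
    """Return True if a buildslave looks genuinely offline (hypothetical helper).

    Machines restart between test runs, so a disconnected status only
    counts once the "Last heard from" time is at least an hour old.
    """
    disconnected = status == "Not connected"
    return disconnected and (now - last_heard_from) >= OFFLINE_THRESHOLD


if __name__ == "__main__":
    now = datetime(2016, 9, 29, 20, 0)
    print(needs_bug("Not connected", datetime(2016, 9, 29, 18, 30), now))
    print(needs_bug("Not connected", datetime(2016, 9, 29, 19, 30), now))
```

Passing `now` explicitly keeps the check easy to test and avoids mixing time zones by accident.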