OLD | NEW |
1 # Perf Regression Sheriffing (go/perfregression-sheriff) | 1 # Perf Regression Sheriffing (go/perfregression-sheriff) |
2 | 2 |
3 The perf regression sheriff tracks performance regressions in Chrome's | 3 The perf regression sheriff tracks performance regressions in Chrome's |
4 continuous integration tests. Note that a [new rotation](perf_bot_sheriffing.md) | 4 continuous integration tests. Note that a [new rotation](perf_bot_sheriffing.md) |
5 has been created to ensure the builds and tests stay green, so the perf | 5 has been created to ensure the builds and tests stay green, so the perf |
6 regression sheriff role is now entirely focused on performance. | 6 regression sheriff role is now entirely focused on performance. |
7 | 7 |
8 **[Rotation calendar](https://calendar.google.com/calendar/embed?src=google.com_2fpmo740pd1unrui9d7cgpbg2k%40group.calendar.google.com)** | 8 **[Rotation calendar](https://calendar.google.com/calendar/embed?src=google.com_2fpmo740pd1unrui9d7cgpbg2k%40group.calendar.google.com)** |
9 | 9 |
10 ## Key Responsibilities | 10 ## Key Responsibilities |
11 | 11 |
12 * [Triage Regressions on the Perf Dashboard](#Triage-Regressions-on-the-Perf-Dashboard) | 12 * [Triage Regressions on the Perf Dashboard](#Triage-Regressions-on-the-Perf-Dashboard) |
13 * [Triaging Data Stoppage Alerts](#Triaging-Data-Stoppage-Alerts) | 13 * [Triaging Data Stoppage Alerts](#Triaging-Data-Stoppage-Alerts) |
14 * [Follow up on Performance Regressions](#Follow-up-on-Performance-Regressions) | 14 * [Follow up on Performance Regressions](#Follow-up-on-Performance-Regressions) |
15 * [Give Feedback on our Infrastructure](#Give-Feedback-on-our-Infrastructure) | 15 * [Give Feedback on our Infrastructure](#Give-Feedback-on-our-Infrastructure) |
16 | 16 |
17 ## Triage Regressions on the Perf Dashboard | 17 ## Triage Regressions on the Perf Dashboard |
18 | 18 |
19 Open the perf dashboard [alerts page](https://chromeperf.appspot.com/alerts). | 19 Open the perf dashboard [alerts page](https://chromeperf.appspot.com/alerts). |
20 | 20 |
21 In the upper right corner, **sign in with your Chromium account**. Signing in is | 21 In the upper right corner, **sign in with your Chromium account**. Signing in is |
22 important in order to be able to kick off bisect jobs, and see data from | 22 important in order to be able to kick off bisect jobs, and see data from |
23 internal waterfalls. | 23 internal waterfalls. |
24 | 24 |
25 Pick up **Chromium Perf Sheriff** from "Select an item ▼" drop down menu. There | 25 Pick up **Chromium Perf Sheriff** from "Select an item ▼" drop down menu. There |
26 are two tables of alerts that may be shown: | 26 are two tables of alerts that may be shown: |
27 | 27 |
28 * "Performance Alerts", which you should triage, and | 28 * "Performance Alerts" |
29 * "Data Stoppage Alerts", which you can ignore. | 29 * "Data Stoppage Alerts" |
30 | 30 |
31 For either type of alert, if there are no currently pending alerts, then the | 31 For either type of alert, if there are no currently pending alerts, then the |
32 table won't be shown. | 32 table won't be shown. |
33 | 33 |
34 The list can be sorted by clicking on the column header. When you click on the | 34 The list can be sorted by clicking on the column header. When you click on the |
35 checkbox next to an alert, all the other alerts that occurred in the same | 35 checkbox next to an alert, all the other alerts that occurred in the same |
36 revision range will be highlighted. | 36 revision range will be highlighted. |
37 | 37 |
38 Check the boxes next to the alerts you want to take a look at, and click the | 38 Check the boxes next to the alerts you want to take a look at, and click the |
39 "Graph" button. You'll be taken to a page with a table at the top listing all | 39 "Graph" button. You'll be taken to a page with a table at the top listing all |
(...skipping 39 matching lines...)
79 to see a broader revision range feel free to click on the alert on that graph | 79 to see a broader revision range feel free to click on the alert on that graph |
80 and kick off a bisect for it. There should be capacity to kick off as many | 80 and kick off a bisect for it. There should be capacity to kick off as many |
81 bisects as you feel are necessary to investigate; [give feedback](#feedback) | 81 bisects as you feel are necessary to investigate; [give feedback](#feedback) |
82 below if you feel that is not the case. | 82 below if you feel that is not the case. |
83 | 83 |
84 ## Triaging Data Stoppage Alerts | 84 ## Triaging Data Stoppage Alerts |
85 | 85 |
86 Data stoppage alerts are listed on the | 86 Data stoppage alerts are listed on the |
87 [perf dashboard alerts page](https://chromeperf.appspot.com/alerts). Whenever | 87 [perf dashboard alerts page](https://chromeperf.appspot.com/alerts). Whenever |
88 the dashboard is monitoring a metric, and that metric stops sending data, an | 88 the dashboard is monitoring a metric, and that metric stops sending data, an |
89 alert is fired. Some of these alerts are expected: | 89 alert is fired. See |
90 | 90 [triaging data stoppage alerts](triaging_data_stoppage_alerts.md) for more |
91 * When a telemetry benchmark is disabled, we get a data stoppage alert. | 91 details. |
92 Check the [code for the benchmark](https://code.google.com/p/chromium/codesearch#chromium/src/tools/perf/benchmarks/) | |
93 to see if it has been disabled, and if so associate the alert with the | |
94 bug for the disable. | |
95 * When a bot has been turned down. These should be announced to | |
96 perf-sheriffs@chromium.org, but if you can't find the bot on | |
97 [the waterfall](https://uberchromegw.corp.google.com/i/chromium.perf/) and | |
98 you didn't see the announcement, double check in the speed infra chat. | |
99 Ideally these will be associated with the bug for the bot turndown, but | |
100 it's okay to mark them invalid if you can't find the bug. | |
101 You can check the | |
102 [recipe](https://chromium.googlesource.com/chromium/tools/build/+/master/scripts/slave/recipe_modules/chromium_tests/chromium_perf.py) | |
103 to find a corresponding bot name for waterfall with one for dashboard. | |
104 | |
105 If there doesn't seem to be a valid reason for the alert, file a bug on it | |
106 using the perf dashboard, and cc [the owner](http://go/perf-owners). Then do | |
107 some diagnosis: | |
108 | |
109 * Look at the perf dashboard graph to see the last revision we got data for, | |
110 and note that in the bug. Click on the `buildbot stdio` link in the tooltip | |
111 to find the buildbot status page for the last good build, and increment | |
112 the build number to get the first build with no data, and note that in the | |
113 bug as well. Check for any changes to the test in the revision range. | |
114 * Go to the buildbot status page of the bot which should be running the test. | |
115 Is it running the test? If not, note that in the bug. | |
116 * If it is running the test and the test is failing, diagnose as a test | |
117 failure. | |
118 * If it is running the test and the test is passing, check the `json.output` | |
119 link on the buildbot status page for the test. This is the data the test | |
120 sent to the perf dashboard. Are there null values? Sometimes it lists a | |
121 reason as well. Please put your finding in the bug. | |
122 | 92 |
123 ## Follow up on Performance Regressions | 93 ## Follow up on Performance Regressions |
124 | 94 |
125 During your shift, you should try to follow up on each of the bugs you filed. | 95 During your shift, you should try to follow up on each of the bugs you filed. |
126 Once you've triaged all the alerts, check to see if the bisects have come back, | 96 Once you've triaged all the alerts, check to see if the bisects have come back, |
127 or if they failed. If the results came back, and a culprit was found, follow up | 97 or if they failed. If the results came back, and a culprit was found, follow up |
128 with the CL author. If the bisects failed to update the bug with results, please | 98 with the CL author. If the bisects failed to update the bug with results, please |
129 file a bug on it (see [feedback](#feedback) links below). | 99 file a bug on it (see [feedback](#feedback) links below). |
130 | 100 |
131 Also during your shift, please spend any spare time driving down bugs from the | 101 Also during your shift, please spend any spare time driving down bugs from the |
(...skipping 33 matching lines...)
165 [go/bad-bisects](https://docs.google.com/spreadsheets/d/13PYIlRGE8eZzsrSocA3SR2LEHdzc8n9ORUoOE2vtO6I/edit#gid=0). | 135 [go/bad-bisects](https://docs.google.com/spreadsheets/d/13PYIlRGE8eZzsrSocA3SR2LEHdzc8n9ORUoOE2vtO6I/edit#gid=0). |
166 The team triages these regularly. If you spot a really clear bug (bisect | 136 The team triages these regularly. If you spot a really clear bug (bisect |
167 job red, bugs not being updated with bisect results) please file it in | 137 job red, bugs not being updated with bisect results) please file it in |
168 crbug with component `Tests>AutoBisect`. If a bisect problem is blocking a | 138 crbug with component `Tests>AutoBisect`. If a bisect problem is blocking a |
169 perf regression bug triage, **please file a new bug with component | 139 perf regression bug triage, **please file a new bug with component |
170 `Tests>AutoBisect` and block the regression bug on the bisect bug**. This | 140 `Tests>AutoBisect` and block the regression bug on the bisect bug**. This |
171 makes it much easier for the team to triage, dupe, and close bugs on the | 141 makes it much easier for the team to triage, dupe, and close bugs on the |
172 infrastructure without affecting the state of the perf regression bugs. | 142 infrastructure without affecting the state of the perf regression bugs. |
173 * **Noisy Tests**: Please file a bug in crbug with component `Tests>Telemetry` | 143 * **Noisy Tests**: Please file a bug in crbug with component `Tests>Telemetry` |
174 and [cc the owner](http://go/perf-owners). | 144 and [cc the owner](http://go/perf-owners). |
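
Reviewer's aside on the diagnosis step that moved out of this file (old lines 118-121, checking the `json.output` link for null values): that check could be scripted rather than done by eye. The sketch below is only an illustration. The schema of `json.output` is not described in this document, so the helper walks any parsed JSON structure and reports the path of each null value. The function name `find_null_paths` and the sample payload are hypothetical.

```python
import json
from typing import Any, Iterator


def find_null_paths(node: Any, path: str = "$") -> Iterator[str]:
    """Yield a path string for every null (None) value in a parsed JSON tree."""
    if node is None:
        yield path
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from find_null_paths(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_null_paths(value, f"{path}[{index}]")


# Hypothetical payload: the real json.output layout may differ.
sample = json.loads(
    '{"charts": {"warm": {"value": null, "units": "ms"},'
    ' "cold": {"values": [1.5, null]}}}'
)
null_paths = list(find_null_paths(sample))
print(null_paths)
```

Any paths it reports are the kind of finding the old text asked sheriffs to note in the bug. An empty list suggests the test did send values and the stoppage lies elsewhere.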