# Perf Regression Sheriffing

The perf regression sheriff tracks performance regressions in Chrome's
continuous integration tests. Note that a [new rotation](perf_bot_sheriffing.md)
has been created to ensure the builds and tests stay green, so the perf
regression sheriff role is now entirely focused on performance.

## Key Responsibilities

* [Triage Regressions on the Perf Dashboard](#triage)
* [Follow up on Performance Regressions](#followup)
* [Give Feedback on our Infrastructure](#feedback)

### <a name="triage"></a> Triage Regressions on the Perf Dashboard

Open the perf dashboard [alerts page](https://chromeperf.appspot.com/alerts).

In the upper right corner, **sign in with your Google account**. You need to
be signed in to kick off bisect jobs and to see data from internal waterfalls.

The page shows two lists; you are responsible for triaging
**Performance Alerts**. The list can be sorted by clicking a column header.
When you click the checkbox next to an alert, all the other alerts that
occurred in the same revision range are highlighted.

Check the boxes next to the alerts you want to look at, and click the
"Graph" button. You'll be taken to a page with a table at the top listing all
the alerts whose revision ranges overlap the one you chose; below it, the
dashboard shows graphs of all the alerts checked in that table.

1. **Look at the graph**.
   * If the alert appears to be **within the noise**, click on the red
     exclamation point icon for it in the graph and hit the "Invalid" button.
   * If the alert is **visibly to the left or right of the actual
     regression**, click on it and use the "nudge" menu to move it into place.
   * If there is a line labeled "ref" on the graph, that is the reference
     build: an older version of Chrome, used to help us determine whether the
     jump in the graph was caused by a change to the bot or test rather than
     by a real performance regression. If **the ref build moved at the same
     time as the alert**, click on the alert and hit the "Invalid" button.
     (A rough sketch of the noise and ref-build judgments appears after this
     list.)
2. **Look at the other alerts** in the table to see if any should be grouped
   together. Note that the bisect will automatically dupe bugs if it finds
   they have the same culprit, so you don't need to be too aggressive about
   grouping alerts that might not be related. Some signs alerts should be
   grouped together:
   * If they're all in the same test suite
   * If they all regressed the same metric (a lot of commonality in the Test
     column)
3. **Triage the group of alerts**. Check all the alerts you believe are
   related, and press the triage button.
   * If one of the alerts already has a bug id, click "existing bug" and use
     that bug id.
   * Otherwise click "new bug". Be sure to cc the
     [test owner](http://go/perf-owners) on the bug.
4. **Look at the revision range** for the regression. You can see it in the
   tooltip on the graph. If you see any likely culprits, cc the authors on
   the bug (a sketch of listing the commits in a range appears after this
   list).
5. **Optionally, kick off more bisects**. The perf dashboard will
   automatically kick off a bisect for each bug you file. But if you think
   the regression is much clearer on one platform, or on a specific page of a
   page set, or you want to see a broader revision range, feel free to click
   on the alert on that graph and kick off a bisect for it (a toy sketch of
   the bisection idea also appears below). There should be capacity to kick
   off as many bisects as you feel are necessary to investigate;
   [give feedback](#feedback) below if you feel that is not the case.

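To make the step-1 judgments concrete, here is a minimal Python sketch of the
noise and ref-build checks. It is illustrative only, not dashboard code, and
every function name, threshold, and number in it is hypothetical.

```python
# Illustrative sketch only -- not code from the perf dashboard.
import statistics

def looks_like_real_step(before, after, threshold=3.0):
    """Treat a jump as a real regression only if the shift in the median
    dwarfs the noise (standard deviation) of the points before the alert."""
    noise = statistics.stdev(before)
    step = abs(statistics.median(after) - statistics.median(before))
    if noise == 0:
        return step > 0
    return step / noise >= threshold

def ref_build_moved_too(test_delta, ref_delta, tolerance=0.25):
    """If the reference build shifted by roughly the same amount at the same
    revision, suspect a bot or test change rather than a real regression."""
    return abs(test_delta - ref_delta) <= tolerance * abs(test_delta)

# A jump from ~100 to ~110 against sub-unit noise is clearly not noise...
print(looks_like_real_step([100.2, 99.8, 100.5, 99.9],
                           [110.1, 109.7, 110.4, 110.0]))  # True
# ...but if the ref build jumped by ~10 as well, mark the alert Invalid.
print(ref_build_moved_too(test_delta=10.0, ref_delta=9.5))  # True
```
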
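For step 4, one way to scan a revision range for likely culprits is a plain
`git log` over the range shown in the tooltip. A sketch, using placeholder
revision hashes:

```python
import subprocess

# List the commits between the last good and first bad revisions from the
# alert (the hashes below are placeholders, not real revisions).
result = subprocess.run(
    ["git", "log", "--oneline", "1a2b3c4..5d6e7f8"],
    capture_output=True, text=True, check=True)
print(result.stdout)
```
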
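And for step 5, a bisect job conceptually does a binary search over the
revision range, building and measuring the midpoint until it isolates the
first bad revision. A toy sketch of that idea, not the actual bisect bot
implementation:

```python
def bisect_first_bad(revs, is_regressed):
    """Binary search for the first revision showing the regression.
    revs[0] is known good, revs[-1] is known bad; is_regressed() stands in
    for building and benchmarking a single revision."""
    lo, hi = 0, len(revs) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_regressed(revs[mid]):
            hi = mid  # regression present: culprit is at or before mid
        else:
            lo = mid  # still fast: culprit is after mid
    return revs[hi]

# Toy example: revisions 0..99, regression introduced at revision 42.
print(bisect_first_bad(list(range(100)), lambda r: r >= 42))  # 42
```
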
### <a name="followup"></a> Follow up on Performance Regressions

During your shift, you should try to follow up on each of the bugs you filed.
Once you've triaged all the alerts, check whether the bisects have come back,
or whether they failed. If the results came back and a culprit was found,
follow up with the CL author. If the bisects failed to update the bug with
results, please file a bug on it (see the [feedback](#feedback) links below).

After your shift, please try to follow up weekly on the bugs you filed. Kick
off new bisects if the previous ones failed, and if a bisect picks a likely
culprit, follow up to ensure the CL author addresses the problem. If you are
certain that a specific CL caused a performance regression, and the author
does not have an immediate plan to address the problem, please revert the CL.

### <a name="feedback"></a> Give Feedback on our Infrastructure

Perf regression sheriffs have their eyes on the perf dashboard and bisects
more than anyone else, and their feedback is invaluable for keeping these
tools accurate and improving them. Please file bugs and feature requests
as you see them:

* **Perf Dashboard**: Please use the red "Report Issue" link in the navbar.
* **Perf Bisect/Trybots**: If a bisect is identifying the wrong CL as the
  culprit, missing a clear culprit, or not reproducing what appears to be a
  clear regression, please link the comment the bisect bot posted on the bug
  at [go/bad-bisects](https://docs.google.com/spreadsheets/d/13PYIlRGE8eZzsrSocA3SR2LEHdzc8n9ORUoOE2vtO6I/edit#gid=0).
  The team triages these regularly. If you spot a really clear bug (bisect
  job red, bugs not being updated with bisect results), please file it in
  crbug with the label `Cr-Tests-AutoBisect`.
* **Noisy Tests**: Please file a bug in crbug with the label
  `Cr-Tests-Telemetry` and [cc the owner](http://go/perf-owners).