Chromium Code Reviews

Side by Side Diff: tools/perf/docs/perf_bot_sheriffing.md

Issue 1903913002: [perf] Update perfbot sheriff docs to include buildslave monitoring. (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master
Patch Set: Wrap text. Created 4 years, 8 months ago
OLD | NEW
1 # Perf Bot Sheriffing 1 # Perf Bot Sheriffing
2 2
3 The perf bot sheriff is responsible for keeping the bots on the chromium.perf 3 The perf bot sheriff is responsible for keeping the bots on the chromium.perf
4 waterfall up and running, and triaging performance test failures and flakes. 4 waterfall up and running, and triaging performance test failures and flakes.
5 5
6 **[Rotation calendar](https://calendar.google.com/calendar/embed?src=google.com_2fpmo740pd1unrui9d7cgpbg2k%40group.calendar.google.com)** 6 **[Rotation calendar](https://calendar.google.com/calendar/embed?src=google.com_2fpmo740pd1unrui9d7cgpbg2k%40group.calendar.google.com)**
7 7
8 ## Key Responsibilities 8 ## Key Responsibilities
9 9
10 * [Handle Device and Bot Failures](#botfailures) 10 * [Handle Device and Bot Failures](#botfailures)
(...skipping 36 matching lines...)
47 You can see a list of all previously filed bugs using the 47 You can see a list of all previously filed bugs using the
48 **[Performance-BotHealth](https://bugs.chromium.org/p/chromium/issues/list?can=2&q=label%3APerformance-BotHealth)** 48 **[Performance-BotHealth](https://bugs.chromium.org/p/chromium/issues/list?can=2&q=label%3APerformance-BotHealth)**
49 label in crbug. 49 label in crbug.
50 50
51 Please also check the recent 51 Please also check the recent
52 **[perf-sheriffs@chromium.org](https://groups.google.com/a/chromium.org/forum/#!forum/perf-sheriffs)** 52 **[perf-sheriffs@chromium.org](https://groups.google.com/a/chromium.org/forum/#!forum/perf-sheriffs)**
53 postings for important announcements about bot turndowns and other known issues. 53 postings for important announcements about bot turndowns and other known issues.
54 54
55 ##<a name="botfailures"></a> Handle Device and Bot Failures 55 ##<a name="botfailures"></a> Handle Device and Bot Failures
56 56
57 ###<a name="offline"></a> Offline Buildslaves
58
59 Some build configurations, in particular the perf builders and trybots, have
60 multiple machines attached. If one or more of the machines go down, there are
61 still other machines running, so the console or waterfall view will still show
62 green, but those configs will run at reduced throughput. At least once during
63 your shift, you should check the lists of buildslaves and ensure they're all
64 running.
65
66 * [chromium.perf buildslaves](https://build.chromium.org/p/chromium.perf/buildslaves)
67 * [tryserver.chromium.perf buildslaves](https://build.chromium.org/p/tryserver.chromium.perf/buildslaves)
68
69 The machines restart between test runs, so just looking for "Status: Not
70 connected" is not enough to indicate a problem. For each disconnected machine,
71 you can also check the "Last heard from" column to ensure that it's been gone
72 for at least an hour. To get it running again,
73 [file a bug](https://bugs.chromium.org/p/chromium/issues/entry?labels=Pri-1,Performance-BotHealth,Infra-Troopers,OS-?&comment=Hostname:&summary=Buildslave+offline+on+chromium.perf)
74 against the current trooper.
75
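The rule the new Offline Buildslaves section describes (a machine is a problem only when it shows "Not connected" and its "Last heard from" time is at least an hour old) can be sketched as below. This is an illustration only, not part of the patch: the `Buildslave` record and both function names are hypothetical, and the sketch assumes the rows of the buildslaves page have already been collected by hand or by some other means.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

# Threshold taken from the doc text: "gone for at least an hour".
OFFLINE_THRESHOLD = timedelta(hours=1)


@dataclass
class Buildslave:
    """Hypothetical record for one row of a buildslaves page."""
    hostname: str
    connected: bool              # False when the page shows "Status: Not connected"
    last_heard_from: datetime    # value of the "Last heard from" column


def needs_trooper_bug(slave: Buildslave, now: datetime) -> bool:
    """Return True if the machine looks genuinely offline.

    Machines restart between test runs, so a disconnected machine only
    counts once it has been silent for at least OFFLINE_THRESHOLD.
    """
    if slave.connected:
        return False
    return now - slave.last_heard_from >= OFFLINE_THRESHOLD


def offline_hostnames(slaves: List[Buildslave], now: datetime) -> List[str]:
    """Hostnames to put in the "Hostname:" field of the trooper bug."""
    return [s.hostname for s in slaves if needs_trooper_bug(s, now)]
```

A machine that is merely restarting (disconnected for a few minutes) is filtered out, which matches the doc's warning that "Not connected" alone is not enough to indicate a problem.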
57 ###<a name="purplebots"></a> Purple bots 76 ###<a name="purplebots"></a> Purple bots
58 77
59 When a bot goes purple, it's it's usually because of an infrastructure failure 78 When a bot goes purple, it's usually because of an infrastructure failure
60 outside of the tests. But you should first check the logs of a purple bot to 79 outside of the tests. But you should first check the logs of a purple bot to
61 try to better understand the problem. Sometimes a telemetry test failure can 80 try to better understand the problem. Sometimes a telemetry test failure can
62 turn the bot purple, for example. 81 turn the bot purple, for example.
63 82
64 If the bot goes purple and you believe it's an infrastructure issue, file a bug 83 If the bot goes purple and you believe it's an infrastructure issue, file a bug
65 with 84 with
66 [this template](https://bugs.chromium.org/p/chromium/issues/entry?labels=Pri-1,Performance-BotHealth,Infra-Troopers,OS-?&comment=Link+to+buildbot+status+page:&summary=Purple+Bot+on+chromium.perf), 85 [this template](https://bugs.chromium.org/p/chromium/issues/entry?labels=Pri-1,Performance-BotHealth,Infra-Troopers,OS-?&comment=Link+to+buildbot+status+page:&summary=Purple+Bot+on+chromium.perf),
67 which will automatically add the bug to the trooper queue. Be sure to note 86 which will automatically add the bug to the trooper queue. Be sure to note
68 which step is failing, and paste any relevant info from the logs into the bug. 87 which step is failing, and paste any relevant info from the logs into the bug.
69 88
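For illustration, the purple-bot template link is an issue-entry URL with `labels`, `comment`, and `summary` query parameters, so the failing step can be pre-filled in the same way. The sketch below is not part of the patch: the function name and the `status_page` and `failing_step` arguments are hypothetical, and only the three query parameters and their values come from the template URL in the doc.

```python
from urllib.parse import urlencode

# Base of the template URL quoted in the doc.
ISSUE_ENTRY_URL = "https://bugs.chromium.org/p/chromium/issues/entry"


def purple_bot_bug_url(status_page: str, failing_step: str) -> str:
    """Build a pre-filled bug link for a purple bot (hypothetical helper).

    The "OS-?" label is left as the placeholder used in the doc's template;
    the sheriff still has to replace it with the right OS label.
    """
    params = {
        "labels": "Pri-1,Performance-BotHealth,Infra-Troopers,OS-?",
        "comment": (
            f"Link to buildbot status page: {status_page}\n"
            f"Failing step: {failing_step}"
        ),
        "summary": "Purple Bot on chromium.perf",
    }
    # urlencode percent-escapes commas, colons, and newlines; the tracker
    # receives the same values as with the hand-written template URL.
    return ISSUE_ENTRY_URL + "?" + urlencode(params)
```

Relevant log excerpts still need to be pasted into the bug by hand, as the doc asks.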
(...skipping 153 matching lines...)
223 242
224 **[Pri-2 bugs](https://bugs.chromium.org/p/chromium/issues/list?can=2&q=label%3APerformance-BotHealth+label%3APri-2)** 243 **[Pri-2 bugs](https://bugs.chromium.org/p/chromium/issues/list?can=2&q=label%3APerformance-BotHealth+label%3APri-2)**
225 are for disabled tests. These should be pinged weekly, and work towards fixing 244 are for disabled tests. These should be pinged weekly, and work towards fixing
226 should be ongoing when the sheriff is not working on a Pri-1 issue. Here is the 245 should be ongoing when the sheriff is not working on a Pri-1 issue. Here is the
227 [list of Pri-2 bugs that have not been pinged in a week](https://bugs.chromium.org/p/chromium/issues/list?can=2&q=label:Performance-BotHealth%20label:Pri-2%20modified-before:today-7&sort=modified). 246 [list of Pri-2 bugs that have not been pinged in a week](https://bugs.chromium.org/p/chromium/issues/list?can=2&q=label:Performance-BotHealth%20label:Pri-2%20modified-before:today-7&sort=modified).
228 247
229 <!-- Unresolved issues: 248 <!-- Unresolved issues:
230 1. Do perf sheriffs watch the bisect waterfall? 249 1. Do perf sheriffs watch the bisect waterfall?
231 2. Do perf sheriffs watch the internal clank waterfall? 250 2. Do perf sheriffs watch the internal clank waterfall?
232 --> 251 -->