Chromium Code Reviews

Unified Diff: tools/perf/docs/perf_bot_sheriffing.md

Issue 1903913002: [perf] Update perfbot sheriff docs to include buildslave monitoring. (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master
Patch Set: Wrap text. Created 4 years, 8 months ago
Index: tools/perf/docs/perf_bot_sheriffing.md
diff --git a/tools/perf/docs/perf_bot_sheriffing.md b/tools/perf/docs/perf_bot_sheriffing.md
index b13755b473d76abdd4b3f8b84de324444c3aa147..e1ea36ea4ac7f32cb90cf936fade65ae168ecf8d 100644
--- a/tools/perf/docs/perf_bot_sheriffing.md
+++ b/tools/perf/docs/perf_bot_sheriffing.md
@@ -54,9 +54,28 @@ postings for important announcements about bot turndowns and other known issues.
##<a name="botfailures"></a> Handle Device and Bot Failures
+###<a name="offline"></a> Offline Buildslaves
+
+Some build configurations, in particular the perf builders and trybots, have
+multiple machines attached. If one or more of the machines go down, there are
+still other machines running, so the console or waterfall view will still show
+green, but those configs will run at reduced throughput. At least once during
+your shift, you should check the lists of buildslaves and ensure they're all
+running.
+
+* [chromium.perf buildslaves](https://build.chromium.org/p/chromium.perf/buildslaves)
+* [tryserver.chromium.perf buildslaves](https://build.chromium.org/p/tryserver.chromium.perf/buildslaves)
+
+The machines restart between test runs, so just looking for "Status: Not
+connected" is not enough to indicate a problem. For each disconnected machine,
+you can also check the "Last heard from" column to ensure that it's been gone
+for at least an hour. To get it running again,
+[file a bug](https://bugs.chromium.org/p/chromium/issues/entry?labels=Pri-1,Performance-BotHealth,Infra-Troopers,OS-?&comment=Hostname:&summary=Buildslave+offline+on+chromium.perf)
+against the current trooper.
+
###<a name="purplebots"></a> Purple bots
-When a bot goes purple, it's it's usually because of an infrastructure failure
+When a bot goes purple, it's usually because of an infrastructure failure
outside of the tests. But you should first check the logs of a purple bot to
try to better understand the problem. Sometimes a telemetry test failure can
turn the bot purple, for example.
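
Note on the "Offline Buildslaves" section added above: the rule it describes (a machine counts as a problem only if it is "Not connected" and has not been heard from for at least an hour) can be sketched in Python as below. This is a minimal illustration, not part of the patch. The `Buildslave` record, its field names, and the `offline_buildslaves` helper are hypothetical; the sketch assumes the "Status" and "Last heard from" columns have already been read from the buildslaves page by some other means.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Buildslave:
    """Hypothetical record mirroring two columns of the buildslaves page."""
    hostname: str
    connected: bool        # False when the page shows "Status: Not connected"
    last_heard: datetime   # value of the "Last heard from" column


# The doc asks for "at least an hour" of silence before filing a bug.
OFFLINE_THRESHOLD = timedelta(hours=1)


def offline_buildslaves(slaves: List[Buildslave],
                        now: datetime,
                        threshold: timedelta = OFFLINE_THRESHOLD) -> List[Buildslave]:
    """Return machines that are disconnected and silent for >= threshold.

    Machines restart between test runs, so a disconnected machine that was
    heard from recently is treated as healthy and is not returned.
    """
    return [
        s for s in slaves
        if not s.connected and now - s.last_heard >= threshold
    ]
```

Each machine returned by such a check would be a candidate for the "Buildslave offline on chromium.perf" bug described in the section, with its hostname filled in.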