Side by Side Diff: site/dev/sheriffing/trooper.md

Issue 2263103003: Add trooper documentation for CT bots and for iOS "the service is invalid" (Closed) Base URL: https://skia.googlesource.com/skia@master
Patch Set: Remove milo link (created 4 years, 4 months ago)
OLD NEW
1 Infra Trooper Documentation 1 Infra Trooper Documentation
2 =========================== 2 ===========================
3 3
4 ### Contents ### 4 ### Contents ###
5 5
6 * [What does an Infra trooper do?](#what_is_a_trooper) 6 * [What does an Infra trooper do?](#what_is_a_trooper)
7 * [View current and upcoming troopers](#view_current_upcoming_troopers) 7 * [View current and upcoming troopers](#view_current_upcoming_troopers)
8 * [How to swap trooper shifts](#how_to_swap) 8 * [How to swap trooper shifts](#how_to_swap)
9 * [Tips for troopers](#tips) 9 * [Tips for troopers](#tips)
10 10
(...skipping 42 matching lines...)
53 - Disconnected devices (these are detected as the "wait for device" step failing) 53 - Disconnected devices (these are detected as the "wait for device" step failing)
54 54
55 - "Failed to execute query" may show a different query than the failing one; 55 - "Failed to execute query" may show a different query than the failing one;
56 dismiss the alert to get a new alert showing the query that is actually 56 dismiss the alert to get a new alert showing the query that is actually
57 failing. (All "failed to execute query" alerts are lumped into a single alert, 57 failing. (All "failed to execute query" alerts are lumped into a single alert,
58 which is why the failed query which initially triggered the alert may not be 58 which is why the failed query which initially triggered the alert may not be
59 failing any more but the alert is still active because another query is 59 failing any more but the alert is still active because another query is
60 failing.) 60 failing.)
61 61
62 - Where machines are located: 62 - Where machines are located:
63 - Machine name like "skia-vm-NNN" -> GCE 63 - Machine name like "skia-vm-NNN", "ct-vm-NNN" -> GCE
64 - Machine name ends with "a3", "a4", "m3" -> Chrome Golo 64 - Machine name ends with "a3", "a4", "m3" -> Chrome Golo
65 - Machine name ends with "m5" -> CT bare-metal bots in Chrome Golo
65 - Machine name starts with "skiabot-" -> Chapel Hill lab 66 - Machine name starts with "skiabot-" -> Chapel Hill lab
66 - Machine name starts with "win8" -> Chapel Hill lab (Windows machine 67 - Machine name starts with "win8" -> Chapel Hill lab (Windows machine
67 names can't be very long, so the "skiabot-shuttle-" prefix is dropped.) 68 names can't be very long, so the "skiabot-shuttle-" prefix is dropped.)
68 - slave11-c3 is a Chrome infra GCE machine (not to be confused with the Skia 69 - slave11-c3 is a Chrome infra GCE machine (not to be confused with the Skia
69 Buildbots GCE, which we refer to as simply "GCE") 70 Buildbots GCE, which we refer to as simply "GCE")
70 71
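The naming rules in the new version of the list above can be sketched as a small lookup. This is only an illustration, not part of the patch: the function name `machine_location` is hypothetical, "NNN" is assumed to mean one or more digits, and the order in which the rules are checked is an assumption for names that could match more than one rule.

```python
import re


def machine_location(name: str) -> str:
    """Guess where a bot lives from its machine name (hypothetical helper)."""
    # slave11-c3 is called out explicitly as a Chrome infra GCE machine.
    if name == "slave11-c3":
        return "Chrome infra GCE"
    # "skia-vm-NNN" and "ct-vm-NNN"; NNN is assumed to be digits.
    if re.fullmatch(r"(skia|ct)-vm-\d+", name):
        return "GCE"
    # "m5" machines are the CT bare-metal bots in the Chrome Golo.
    if name.endswith("m5"):
        return "CT bare-metal bots in Chrome Golo"
    if name.endswith(("a3", "a4", "m3")):
        return "Chrome Golo"
    # Windows names drop the "skiabot-shuttle-" prefix and start with "win8".
    if name.startswith(("skiabot-", "win8")):
        return "Chapel Hill lab"
    return "unknown"
```

For example, `machine_location("ct-vm-001")` falls under the GCE rule that this patch adds, and `machine_location("build1-m5")` falls under the new "m5" rule.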
71 - The [chrome-infra IRC channel](https://comlink.googleplex.com/chrome-infra) is 72 - The [chrome-infra IRC channel](https://comlink.googleplex.com/chrome-infra) is
72 useful for questions regarding bots managed by the Chrome Infra team and to 73 useful for questions regarding bots managed by the Chrome Infra team and to
73 get visibility into upstream failures that cause problems for us. 74 get visibility into upstream failures that cause problems for us.
74 75
(...skipping 12 matching lines...)
87 88
88 - If there is a problem with a bot in the Chrome Golo or Chrome infra GCE, the 89 - If there is a problem with a bot in the Chrome Golo or Chrome infra GCE, the
89 best course of action is to 90 best course of action is to
90 [file a bug](https://code.google.com/p/chromium/issues/entry?template=Build%20Infrastructure) 91 [file a bug](https://code.google.com/p/chromium/issues/entry?template=Build%20Infrastructure)
91 with the Chrome infra team. But if you know what you're doing: 92 with the Chrome infra team. But if you know what you're doing:
92 - To access bots in the Chrome Golo, 93 - To access bots in the Chrome Golo,
93 [follow these instructions](https://chrome-internal.googlesource.com/infra/infra_internal/+/master/doc/ssh.md). 94 [follow these instructions](https://chrome-internal.googlesource.com/infra/infra_internal/+/master/doc/ssh.md).
94 - Machine name ends with "a3" or "a4" -> ssh command looks like `ssh 95 - Machine name ends with "a3" or "a4" -> ssh command looks like `ssh
95 build3-a3.chrome` 96 build3-a3.chrome`
96 - Machine name ends with "m3" -> ssh command looks like `ssh build5-m3.golo` 97 - Machine name ends with "m3" -> ssh command looks like `ssh build5-m3.golo`
98 - Machine name ends with "m5" -> ssh command looks like `ssh build1-m5.golo`.
99 [Example bug](https://bugs.chromium.org/p/chromium/issues/detail?id=638193) to file to Infra Labs.
97 - For MacOS and Windows bots, you will be prompted for a password, which is 100 - For MacOS and Windows bots, you will be prompted for a password, which is
98 stored on [Valentine](https://valentine.corp.google.com/) as "Chrome Golo, 101 stored on [Valentine](https://valentine.corp.google.com/) as "Chrome Golo,
99 Perf, GPU bots - chrome-bot". 102 Perf, GPU bots - chrome-bot".
100 - To access bots in the Chrome infra GCE -> command looks like `gcutil 103 - To access bots in the Chrome infra GCE -> command looks like `gcutil
101 --project=google.com:chromecompute ssh --ssh_user=default slave11-c3` (or 104 --project=google.com:chromecompute ssh --ssh_user=default slave11-c3` (or
102 use the ccompute ssh script from the infra_internal repo). 105 use the ccompute ssh script from the infra_internal repo).
103 106
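As a rough sketch, the suffix-to-command rules above can be written as a pair of helpers that build the command strings. The function names are hypothetical and not part of the patch; the host suffixes and the `gcutil` invocation are copied from the text above, and nothing here runs the commands.

```python
def golo_ssh_command(machine: str) -> str:
    """Build the ssh command for a Chrome Golo bot (hypothetical helper)."""
    # "a3" and "a4" machines use the ".chrome" host suffix.
    if machine.endswith(("a3", "a4")):
        return f"ssh {machine}.chrome"
    # "m3" and "m5" machines use the ".golo" host suffix.
    if machine.endswith(("m3", "m5")):
        return f"ssh {machine}.golo"
    raise ValueError(f"not a recognized Chrome Golo machine name: {machine}")


def infra_gce_ssh_command(machine: str) -> str:
    """Build the gcutil command for a Chrome infra GCE bot (hypothetical helper)."""
    return (
        "gcutil --project=google.com:chromecompute ssh "
        f"--ssh_user=default {machine}"
    )
```

`golo_ssh_command("build1-m5")` gives the command for the "m5" case added in this patch. The helpers only format strings, so access still depends on following the linked ssh instructions.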
104 - Read over the [SkiaLab documentation](../testing/skialab) for more detail on 107 - Read over the [SkiaLab documentation](../testing/skialab) for more detail on
105 dealing with device alerts. 108 dealing with device alerts.
106 109
107 - To stop a buildslave for a device, log in to the host for that device, `cd 110 - To stop a buildslave for a device, log in to the host for that device, `cd
108 ~/buildbot/<slave name>/build/slave; make stop`. To start it again, 111 ~/buildbot/<slave name>/build/slave; make stop`. To start it again,
109 `TESTING_SLAVENAME=<slave name> make start`. 112 `TESTING_SLAVENAME=<slave name> make start`.
110 113
111 - Buildslaves can be slow to come up after reboot, but if the buildslave remains 114 - Buildslaves can be slow to come up after reboot, but if the buildslave remains
112 disconnected, you may need to start it manually. On Mac and Linux, check using 115 disconnected, you may need to start it manually. On Mac and Linux, check using
113 `ps aux | grep python` that neither buildbot nor gclient are running, then run 116 `ps aux | grep python` that neither buildbot nor gclient are running, then run
114 `~/skiabot-slave-start-on-boot.sh`. 117 `~/skiabot-slave-start-on-boot.sh`.
118
119 - Sometimes iOS builds fail with 'The service is invalid'. Try rebooting the iOS host to fix this.