OLD | NEW |
1 Infra Trooper Documentation | 1 Infra Trooper Documentation |
2 =========================== | 2 =========================== |
3 | 3 |
4 ### Contents ### | 4 ### Contents ### |
5 | 5 |
6 * [What does an Infra trooper do?](#what_is_a_trooper) | 6 * [What does an Infra trooper do?](#what_is_a_trooper) |
7 * [View current and upcoming troopers](#view_current_upcoming_troopers) | 7 * [View current and upcoming troopers](#view_current_upcoming_troopers) |
8 * [How to swap trooper shifts](#how_to_swap) | 8 * [How to swap trooper shifts](#how_to_swap) |
9 * [Tips for troopers](#tips) | 9 * [Tips for troopers](#tips) |
10 | 10 |
(...skipping 42 matching lines...) |
53 - Disconnected devices (these are detected as the "wait for device" step failing) | 53 - Disconnected devices (these are detected as the "wait for device" step failing) |
54 | 54 |
55 - "Failed to execute query" may show a different query than the failing one; | 55 - "Failed to execute query" may show a different query than the failing one; |
56 dismiss the alert to get a new alert showing the query that is actually | 56 dismiss the alert to get a new alert showing the query that is actually |
57 failing. (All "failed to execute query" alerts are lumped into a single alert, | 57 failing. (All "failed to execute query" alerts are lumped into a single alert, |
58 which is why the failed query which initially triggered the alert may not be | 58 which is why the failed query which initially triggered the alert may not be |
59 failing any more but the alert is still active because another query is | 59 failing any more but the alert is still active because another query is |
60 failing.) | 60 failing.) |
61 | 61 |
62 - Where machines are located: | 62 - Where machines are located: |
63 - Machine name like "skia-vm-NNN" -> GCE | 63 - Machine name like "skia-vm-NNN", "ct-vm-NNN" -> GCE |
64 - Machine name ends with "a3", "a4", "m3" -> Chrome Golo | 64 - Machine name ends with "a3", "a4", "m3" -> Chrome Golo |
| 65 - Machine name ends with "m5" -> CT bare-metal bots in Chrome Golo |
65 - Machine name starts with "skiabot-" -> Chapel Hill lab | 66 - Machine name starts with "skiabot-" -> Chapel Hill lab |
66 - Machine name starts with "win8" -> Chapel Hill lab (Windows machine | 67 - Machine name starts with "win8" -> Chapel Hill lab (Windows machine |
67 names can't be very long, so the "skiabot-shuttle-" prefix is dropped.) | 68 names can't be very long, so the "skiabot-shuttle-" prefix is dropped.) |
68 - slave11-c3 is a Chrome infra GCE machine (not to be confused with the Skia | 69 - slave11-c3 is a Chrome infra GCE machine (not to be confused with the Skia |
69 Buildbots GCE, which we refer to as simply "GCE") | 70 Buildbots GCE, which we refer to as simply "GCE") |
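
The naming rules in the NEW column of the list above can be sketched as a small lookup. This is only an illustration, not part of the change under review: the function name `machine_location` and the returned labels are hypothetical, and the order in which the rules are checked is an assumption, since the list does not say which rule wins if a name matches more than one.

```python
def machine_location(name: str) -> str:
    """Guess where a machine lives from its name (hypothetical helper).

    Rules are taken from the "Where machines are located" list; the check
    order below is an assumption, not something the document specifies.
    """
    # Special case called out explicitly in the list.
    if name == "slave11-c3":
        return "Chrome infra GCE"
    # Skia Buildbots GCE instances.
    if name.startswith(("skia-vm-", "ct-vm-")):
        return "GCE"
    # Chapel Hill lab machines; Windows names drop the "skiabot-shuttle-" prefix.
    if name.startswith(("skiabot-", "win8")):
        return "Chapel Hill lab"
    # CT bare-metal bots, which also sit in the Chrome Golo.
    if name.endswith("m5"):
        return "Chrome Golo (CT bare-metal)"
    if name.endswith(("a3", "a4", "m3")):
        return "Chrome Golo"
    return "unknown"
```

For example, `machine_location("build3-a3")` falls through to the suffix rules and reports the Chrome Golo.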
70 | 71 |
71 - The [chrome-infra IRC channel](https://comlink.googleplex.com/chrome-infra) is | 72 - The [chrome-infra IRC channel](https://comlink.googleplex.com/chrome-infra) is |
72 useful for questions regarding bots managed by the Chrome Infra team and to | 73 useful for questions regarding bots managed by the Chrome Infra team and to |
73 get visibility into upstream failures that cause problems for us. | 74 get visibility into upstream failures that cause problems for us. |
74 | 75 |
(...skipping 12 matching lines...) |
87 | 88 |
88 - If there is a problem with a bot in the Chrome Golo or Chrome infra GCE, the | 89 - If there is a problem with a bot in the Chrome Golo or Chrome infra GCE, the |
89 best course of action is to | 90 best course of action is to |
90 [file a bug](https://code.google.com/p/chromium/issues/entry?template=Build%20Infrastructure) | 91 [file a bug](https://code.google.com/p/chromium/issues/entry?template=Build%20Infrastructure) |
91 with the Chrome infra team. But if you know what you're doing: | 92 with the Chrome infra team. But if you know what you're doing: |
92 - To access bots in the Chrome Golo, | 93 - To access bots in the Chrome Golo, |
93 [follow these instructions](https://chrome-internal.googlesource.com/infra/infra_internal/+/master/doc/ssh.md). | 94 [follow these instructions](https://chrome-internal.googlesource.com/infra/infra_internal/+/master/doc/ssh.md). |
94 - Machine name ends with "a3" or "a4" -> ssh command looks like `ssh | 95 - Machine name ends with "a3" or "a4" -> ssh command looks like `ssh |
95 build3-a3.chrome` | 96 build3-a3.chrome` |
96 - Machine name ends with "m3" -> ssh command looks like `ssh build5-m3.golo` | 97 - Machine name ends with "m3" -> ssh command looks like `ssh build5-m3.golo` |
| 98 - Machine name ends with "m5" -> ssh command looks like `ssh build1-m5.golo`. |
| 99 [Example bug](https://bugs.chromium.org/p/chromium/issues/detail?id=638193) to file to Infra Labs. |
97 - For MacOS and Windows bots, you will be prompted for a password, which is | 100 - For MacOS and Windows bots, you will be prompted for a password, which is |
98 stored on [Valentine](https://valentine.corp.google.com/) as "Chrome Golo, | 101 stored on [Valentine](https://valentine.corp.google.com/) as "Chrome Golo, |
99 Perf, GPU bots - chrome-bot". | 102 Perf, GPU bots - chrome-bot". |
100 - To access bots in the Chrome infra GCE -> command looks like `gcutil | 103 - To access bots in the Chrome infra GCE -> command looks like `gcutil |
101 --project=google.com:chromecompute ssh --ssh_user=default slave11-c3` (or | 104 --project=google.com:chromecompute ssh --ssh_user=default slave11-c3` (or |
102 use the ccompute ssh script from the infra_internal repo). | 105 use the ccompute ssh script from the infra_internal repo). |
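
The Golo ssh patterns listed above (including the "m5" case added in the NEW column) could be captured in a helper like the one below. It is a sketch under assumptions: `golo_ssh_command` is a hypothetical name, it only builds the command string and does not run it, and it assumes the `.chrome` and `.golo` suffixes shown in the examples apply to every machine with the same name ending.

```python
def golo_ssh_command(name: str) -> str:
    """Build the ssh command string for a Chrome Golo bot (hypothetical helper).

    Assumes the domain suffix depends only on the machine-name ending,
    as in the examples `ssh build3-a3.chrome` and `ssh build5-m3.golo`.
    """
    if name.endswith(("a3", "a4")):
        return f"ssh {name}.chrome"
    if name.endswith(("m3", "m5")):
        return f"ssh {name}.golo"
    # Anything else is outside the patterns the document describes.
    raise ValueError(f"no known Golo ssh pattern for {name!r}")
```

Raising on unrecognized names, rather than guessing a suffix, keeps the helper from producing a command the document never describes.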
103 | 106 |
104 - Read over the [SkiaLab documentation](../testing/skialab) for more detail on | 107 - Read over the [SkiaLab documentation](../testing/skialab) for more detail on |
105 dealing with device alerts. | 108 dealing with device alerts. |
106 | 109 |
107 - To stop a buildslave for a device, log in to the host for that device, `cd | 110 - To stop a buildslave for a device, log in to the host for that device, `cd |
108 ~/buildbot/<slave name>/build/slave; make stop`. To start it again, | 111 ~/buildbot/<slave name>/build/slave; make stop`. To start it again, |
109 `TESTING_SLAVENAME=<slave name> make start`. | 112 `TESTING_SLAVENAME=<slave name> make start`. |
110 | 113 |
111 - Buildslaves can be slow to come up after reboot, but if the buildslave remains | 114 - Buildslaves can be slow to come up after reboot, but if the buildslave remains |
112 disconnected, you may need to start it manually. On Mac and Linux, check using | 115 disconnected, you may need to start it manually. On Mac and Linux, check using |
113 `ps aux | grep python` that neither buildbot nor gclient are running, then run | 116 `ps aux | grep python` that neither buildbot nor gclient are running, then run |
114 `~/skiabot-slave-start-on-boot.sh`. | 117 `~/skiabot-slave-start-on-boot.sh`. |
| 118 |
| 119 - Sometimes iOS builds fail with 'The service is invalid'. Try rebooting the iOS host to fix this. |