OLD | NEW |
(Empty) | |
| 1 # Chrome Network Bug Triage: Suggested Workflow |
| 2 |
| 3 [TOC] |
| 4 |
| 5 ## Looking for new crashers |
| 6 |
| 7 1. Go to [go/chromecrash](https://goto.google.com/chromecrash). |
| 8 |
| 9 2. For each platform, decide which releases to investigate. As per |
| 10 bug-triage.txt, this should be the most recent canary, the previous canary |
| 11 (if the most recent is less than a day old), and any of dev/beta/stable |
| 12 that were released in the last couple of days. |
| 13 |
| 14 3. For each release, in the "Process Type" frame, click on "browser". |
| 15 |
| 16 4. At the bottom of the "Magic Signature" frame, click "limit 1000". Reported |
| 17 crashers are sorted in decreasing order of the number of reports for that |
| 18 crash signature. |
| 19 |
| 20 5. Search the page for *"net::"*. |
| 21 |
| 22 6. For each found signature: |
| 23 * If there is a bug already filed, make sure it correctly describes the |
| 24 current bug (e.g. not closed, or not describing a long-past issue), and, |
| 25 if it is a *net* bug, that it is labeled as such. |
| 26 * Ignore signatures that only occur once, as memory corruption can easily |
| 27 cause one-off failures when the sample size is large enough. |
| 28 * Ignore signatures that only come from a single client ID, as individual |
| 29 machine malware and breakage can also easily cause one-off failures. |
| 30 * Click on the number of reports field to see details of the crash. Ignore |
| 31 it if it doesn't appear to be a network bug. |
| 32 * Otherwise, file a new bug directly from chromecrash. Note that this may |
| 33 result in filing bugs for low- and very-low- frequency crashes. That's |
| 34 ok; the bug tracker is a better tool to figure out whether or not we put |
| 35 resources into those crashes than a snap judgement when filing bugs. |
| 36 * For each bug you file, include the following information: |
| 37 * The backtrace. Note that the backtrace should not be added to the |
| 38 bug if Restrict-View-Google isn't set on the bug as it may contain |
| 39 PII. Filing the bug from the crash reporter should do this |
| 40 automatically, but check. |
| 41 * The channel in which the bug is seen (canary/dev/beta/stable), its |
| 42 frequency in that channel, and its rank among crashers in the |
| 43 channel. |
| 44 * The frequency of this signature in recent releases. This information |
| 45 is available by: |
| 46 1. Clicking on the signature in the "Magic Signature" list |
| 47 2. Clicking "Edit" on the dremel query at the top of the page |
| 48 3. Removing the "product.version='X.Y.Z.W' AND" string and clicking |
| 49 "Update". |
| 50 4. Clicking "Limit 1000" in the Product Version list in the |
| 51 resulting page (without this, the listing will be restricted to |
| 52 the releases in which the signature is most common, which will |
| 53 often not include the canary/dev release being investigated). |
| 54 5. Choosing some subset of that list, or all of it, to include in the |
| 55 bug. Make sure to indicate if there is a defined point in the |
| 56 past before which the signature is not present. |
| 57 |
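The ignore rules in step 6 can be sketched as a small filter. This is only an illustration, not part of the chromecrash tooling: the `Signature` record, its fields, and the sample signature names are all hypothetical, and deciding whether a crash is really a network bug still requires reading the report.

```python
from dataclasses import dataclass


@dataclass
class Signature:
    """Hypothetical record for one row of the "Magic Signature" list."""
    name: str
    report_count: int
    client_ids: frozenset


def needs_review(sig: Signature) -> bool:
    """Return True if a signature survives the ignore rules in step 6."""
    if "net::" not in sig.name:
        return False  # Step 5: only signatures containing "net::" are searched for.
    if sig.report_count <= 1:
        return False  # A single report is often just memory corruption.
    if len(sig.client_ids) <= 1:
        return False  # A single client suggests machine-specific breakage.
    return True


# Made-up sample data, for illustration only.
signatures = [
    Signature("net::ExampleRequest::Start", 40, frozenset({"a", "b", "c"})),
    Signature("net::ExampleJob::Run", 1, frozenset({"a"})),
    Signature("net::ExampleSession::Read", 12, frozenset({"z"})),
    Signature("base::ExampleLoop::Run", 90, frozenset({"a", "b"})),
]
to_review = [s.name for s in signatures if needs_review(s)]
print(to_review)
```

Signatures that pass the filter still need the manual checks above (existing bug, crash details) before a new bug is filed.
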
| 58 ## Identifying unlabeled network bugs on the tracker |
| 59 |
| 60 * Look at new unconfirmed bugs since noon PST on the last triager's rotation. |
| 61 [Use this issue tracker |
| 62 query](https://code.google.com/p/chromium/issues/list?can=2&q=status%3Aunconfirmed&sort=-id&num=1000). |
| 63 |
| 64 * Press **h** to bring up a preview of the bug text. |
| 65 |
| 66 * Use **j** and **k** to advance through bugs. |
| 67 |
| 68 * If a bug looks like it might be network/download/safe-browsing related, |
| 69 middle click (or command-click on OS X) to open it in a new tab. |
| 70 |
| 71 * If a user provides a crash ID for a bug that could be net-related, look at |
| 72 the crash stack at [go/crash](https://goto.google.com/crash), and see if it |
| 73 looks to be network related. Be sure to check if other bug reports have |
| 74 that stack trace, and mark as a dupe if so. Even if the bug isn't network |
| 75 related, paste the stack trace in the bug, so no one else has to look up |
| 76 the crash stack from the ID. |
| 77 * If there's no other information than the crash ID, ask for more details |
| 78 and add the Needs-Feedback label. |
| 79 |
| 80 * If network causes are possible, ask for a net-internals log (if it's not a |
| 81 browser crash) and attach the most specific internals-network label that's |
| 82 applicable. If there is no applicable narrower label or clear owner for the |
| 83 issue, or there are multiple possibilities, attach the internals-network |
| 84 label and proceed with further investigation. |
| 85 |
| 86 * If non-network causes also seem possible, attach those labels as well. |
| 87 |
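The tracker query linked at the top of this section is an ordinary URL-encoded query string, so variants of it can be built programmatically. A minimal sketch, assuming the parameter names seen in that link (`can`, `q`, `sort`, `num`); the helper `tracker_query_url` is hypothetical:

```python
from urllib.parse import urlencode

# Base URL taken from the issue tracker link in this section.
BASE = "https://code.google.com/p/chromium/issues/list"


def tracker_query_url(query, sort="-id", num=1000):
    """Build an issue-list URL for the given search query (hypothetical helper)."""
    params = {"can": 2, "q": query, "sort": sort, "num": num}
    return BASE + "?" + urlencode(params)


unconfirmed_url = tracker_query_url("status:unconfirmed")
print(unconfirmed_url)
```

Changing the `query` argument gives a different search while keeping the same sort order and page size.
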
| 88 ## Investigating Cr-Internals-Network bugs |
| 89 |
| 90 * It's recommended that while on triage duty, you subscribe to the |
| 91 Cr-Internals-Network label. To do this, go to |
| 92 https://code.google.com/p/chromium/issues/ and click on "Subscriptions". |
| 93 Enter "Cr-Internals-Network" and click submit. |
| 94 |
| 95 * Look through unconfirmed and untriaged Cr-Internals-Network bugs, |
| 96 prioritizing those updated within the last week. [Use this issue tracker |
| 97 query](https://code.google.com/p/chromium/issues/list?can=2&q=Cr%3DInternals-Network+-status%3AAssigned+-status%3AStarted+-status%3AAvailable+&sort=-modified). |
| 98 |
| 99 * If more information is needed from the reporter, ask for it and add the |
| 100 Needs-Feedback label. If the reporter has answered an earlier request for |
| 101 information, remove that label. |
| 102 |
| 103 * While investigating a new issue, change the status to Untriaged. |
| 104 |
| 105 * If a bug is a potential security issue (allows for code execution from a |
| 106 remote site, allows crossing security boundaries, unchecked array bounds, |
| 107 etc.), mark it Type-Bug-Security. If it has privacy implications (history or |
| 108 cookies discoverable by an entity that shouldn't be able to do so, incognito |
| 109 state being saved in memory or on disk beyond the lifetime of incognito |
| 110 tabs, etc.), mark it Cr-Privacy. |
| 111 |
| 112 * For bugs that already have a more specific network label, go ahead and remove |
| 113 the Cr-Internals-Network label and move on. |
| 114 |
| 115 * Try to figure out if it's really a network bug. See the common non-network |
| 116 labels section for descriptions of the labels commonly needed for issues |
| 117 incorrectly tagged as Cr-Internals-Network. |
| 118 |
| 119 * If it's not, attach appropriate labels and go no further. |
| 120 |
| 121 * If it may be a network bug, attach any additional labels that may be |
| 122 relevant, and continue investigating. Once you either determine it's a |
| 123 non-network bug, or identify accurate, more specific network labels, your |
| 124 job is done, though you should still ask for a net-internals dump if it seems |
| 125 likely to be useful. |
| 126 |
| 127 * Note that ChromeOS-specific network-related code (captive portal detection, |
| 128 connectivity detection, login, etc.) may not always have an appropriate more |
| 129 specific label, but is not in an area handled by the network stack team. |
| 130 Just make sure those bugs have the OS-Chrome label, and any more specific |
| 131 labels if applicable, and then move on. |
| 132 |
| 133 * Gather data and investigate. |
| 134 * Remember to add the Needs-Feedback label whenever waiting for the user to |
| 135 respond with more information, and remove it when not waiting on the |
| 136 user. |
| 137 * Try to reproduce locally. If you can, and it's a regression, use |
| 138 src/tools/bisect-builds.py to figure out when it regressed. |
| 139 * Ask for more data from the user as needed (net-internals dumps, repro |
| 140 case, crash ID from about:crashes, run tests, etc). |
| 141 * If asking for an about:net-internals dump, provide this link: |
| 142 https://sites.google.com/a/chromium.org/dev/for-testers/providing-network-details. |
| 143 You can grab the link from about:net-internals, as needed. |
| 144 |
| 145 * Try to figure out what's going on, and which more specific network label is |
| 146 most appropriate. |
| 147 |
| 148 * If it's a regression, browse through the git history of relevant files to try |
| 149 and figure out when it regressed. CC authors / primary reviewers of any |
| 150 strongly suspect CLs. |
| 151 |
| 152 * If you are having trouble with an issue, particularly with understanding |
| 153 net-internals logs, email the public net-dev@chromium.org list for help |
| 154 debugging. If it's a crasher, or for some other reason discussion needs to |
| 155 be done in private, use chrome-network-debugging@google.com. TODO(mmenke): |
| 156 Write up a net-internals tips and tricks doc. |
| 157 |
| 158 * If it appears to be a bug in the unowned core of the network stack (i.e. no |
| 159 sublabel applies, or only the Cr-Internals-Network-HTTP sublabel applies, and |
| 160 there's no clear owner), try to figure out the exact cause. |
| 161 |
| 162 ## Monitoring UMA histograms and Gasper alerts |
| 163 |
| 164 For each Gasper alert that fires, determine if it's a real alert and file a bug |
| 165 if so. |
| 166 |
| 167 * Don't file if the alert is coincident with a major volume change. The volume |
| 168 at a particular date can be determined by hovering the mouse over the |
| 169 appropriate location on the alert line. |
| 170 |
| 171 * Don't file if the alert is on a graph with very low volume (< ~200 data |
| 172 points); it's probably noise, and we probably don't care even if it isn't. |
| 173 |
| 174 * Don't file if the graph is really noisy (but eyeball it to decide if there is |
| 175 an underlying important shift under the noise). |
| 176 |
| 177 * Don't file if the alert is in the "Known Ignorable" list: |
| 178 * SimpleCache on Windows |
| 179 * DiskCache on Android |
| 180 |
| 181 For each Gasper alert, respond to chrome-network-debugging@google.com with a |
| 182 summary of the action you've taken and why, including issue link if an issue |
| 183 was filed. |
| 184 |
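The "don't file" rules above can be summarized as a single decision function. This is a rough sketch rather than anything Gasper provides: the function name, its parameters, and the exact cutoff of 200 data points (the text only says "< ~200") are assumptions, and whether a graph is too noisy remains a human judgement passed in as a flag.

```python
# Entries from the "Known Ignorable" list above, as (histogram family, platform).
KNOWN_IGNORABLE = {("SimpleCache", "Windows"), ("DiskCache", "Android")}

# Approximate threshold; the text gives "< ~200 data points".
LOW_VOLUME_THRESHOLD = 200


def should_file_bug(family, platform, data_points, volume_changed, too_noisy):
    """Return True if a Gasper alert looks real enough to file a bug for."""
    if volume_changed:
        return False  # Coincident with a major volume change.
    if data_points < LOW_VOLUME_THRESHOLD:
        return False  # Very low volume: probably noise.
    if too_noisy:
        return False  # Triager judged the graph too noisy to trust.
    if (family, platform) in KNOWN_IGNORABLE:
        return False  # Explicitly listed as ignorable.
    return True


print(should_file_bug("SimpleCache", "Windows", 5000, False, False))
print(should_file_bug("SimpleCache", "Linux", 5000, False, False))
```

Either way, the triager still replies to the alert with the action taken and the reason.
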
| 185 ## Investigating crashers |
| 186 |
| 187 * Only investigate crashers that are still occurring, as identified by the |
| 188 section above. If a search on go/crash indicates a crasher is no longer |
| 189 occurring, mark it as WontFix. |
| 190 |
| 191 * Particularly for Windows, look for weird DLLs associated with the crashes. |
| 192 If there are some, the crash may be caused by malware. You can often figure |
| 193 out if a DLL is malware by searching for it, though it's harder to figure |
| 194 out if a DLL is definitively not malware. |
| 195 |
| 196 * See if the same users are repeatedly running into the same issue. This can |
| 197 be accomplished by searching for (or clicking on) the client ID associated |
| 198 with a crash report, and seeing if there are multiple reports for the same |
| 199 crash. If this is the case, it may also be malware, or an issue with an |
| 200 unusual system/chrome/network config. |
| 201 |
| 202 * Dig through crash reports to figure out when the crash first appeared, and |
| 203 dig through revision history in related files to try and locate a suspect CL. |
| 204 TODO(mmenke): Add more detail here. |
| 205 |
| 206 * Load crash dumps and try to figure out a cause. See |
| 207 http://www.chromium.org/developers/crash-reports for more information. |
| 209 ## Dealing with old bugs |
| 210 |
| 211 * For all network issues (even those with owners or more specific labels): |
| 212 |
| 213 * If the issue has had the Needs-Feedback label for over a month, verify it |
| 214 is waiting on feedback from the user. If not, remove the label. |
| 215 Otherwise, go ahead and mark the issue WontFix due to lack of response |
| 216 and suggest the user file a new bug if the issue is still present. [Use |
| 217 this issue tracker query for old Needs-Feedback |
| 218 issues](https://code.google.com/p/chromium/issues/list?can=2&q=Cr%3AInternals-Network%20Needs=Feedback+modified-before%3Atoday-30&sort=-modified). |
| 219 |
| 220 * If a bug is over 2 months old, and the underlying problem was never |
| 221 reproduced or really understood: |
| 222 * If it's over a year old, go ahead and mark the issue as Archived. |
| 223 * Otherwise, ask reporters if the issue is still present, and attach |
| 224 the Needs-Feedback label. |
| 225 |
| 226 * Old unconfirmed or untriaged Cr-Internals-Network issues can be investigated |
| 227 just like newer ones. Crashers should generally be given the highest |
| 228 priority, since we can verify whether they still occur, followed by newer |
| 229 issues, as they're more likely to still be present and to have a |
| 230 still-responsive bug reporter. |
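
The age-based rules in this section can be sketched as two small functions. The function names and return strings are hypothetical, and the day counts are approximations of "a month" (30 days, matching the `today-30` term in the query above), "2 months" (assumed to be 60 days), and "a year" (assumed to be 365 days).

```python
def needs_feedback_action(days_labeled, waiting_on_user):
    """Decide what to do with an issue carrying the Needs-Feedback label."""
    if days_labeled <= 30:
        return "leave as is"
    if not waiting_on_user:
        return "remove Needs-Feedback"
    return "mark WontFix and suggest filing a new bug"


def stale_bug_action(age_days, reproduced_or_understood):
    """Decide what to do with an old bug based on its age."""
    if reproduced_or_understood or age_days <= 60:
        return "investigate normally"
    if age_days > 365:
        return "mark Archived"
    return "ask reporter if still present and add Needs-Feedback"


print(needs_feedback_action(45, waiting_on_user=True))
print(stale_bug_action(400, reproduced_or_understood=False))
```
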