# Addressing Performance Regressions

The bisect bot just picked your CL as the culprit in a performance regression
and assigned a bug to you! What should you do? Read on...

## About our performance tests

The [chromium.perf waterfall](perf_waterfall.md) is a continuous build which
runs performance tests on dozens of devices across Windows, Mac, Linux, and
Android Chrome and WebView. Often, a performance regression only affects a
certain type of hardware or a certain operating system, which may be different
from what you tested locally before landing your CL.

Each test has an owner, named in
[this spreadsheet](https://docs.google.com/spreadsheets/d/1xaAo0_SU3iDfGdqDJZX_jRV0QtkufwHUKH3kQKF3YQs/edit#gid=0),
whom you can cc on a performance bug if you have questions.

## Understanding the bisect results

The bisect bot spits out a comment on the bug that looks like this:

```
=== BISECT JOB RESULTS ===
Perf regression found with culprit

Suspected Commit
  Author : Your Name
  Commit : 15092e9195954cbc331cd58e344d0895fe03d0cd
  Date   : Wed Jun 14 03:09:47 2017
  Subject: Your CL Description.

Bisect Details
  Configuration: mac_pro_perf_bisect
  Benchmark    : system_health.common_desktop
  Metric       : timeToFirstContentfulPaint_avg/load_search/load_search_taobao
  Change       : 15.25% | 1010.02 -> 1164.04

Revision         Result              N
chromium@479147  1010.02 +- 1535.41  14  good
chromium@479209  699.332 +- 1282.01  6   good
chromium@479240  383.617 +- 917.038  6   good
chromium@479255  649.186 +- 1896.26  14  good
chromium@479262  788.828 +- 1897.91  14  good
chromium@479268  880.727 +- 2235.29  21  good
chromium@479269  886.511 +- 1150.91  6   good
chromium@479270  1164.04 +- 979.746  14  bad  <--

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=load.search.taobao system_health.common_desktop
```

There's a lot of information packed in that bug comment! Here's a breakdown:

* **What regressed exactly?** The comment gives you several details:
    * **The benchmark that regressed**: Under `Bisect Details`, you can see
      `Benchmark :`. In this case, the `system_health.common_desktop`
      benchmark regressed.
    * **What platform did it regress on?** Under `Configuration`, you can
      find some details on the bot that regressed. In this example, it is a
      Mac Pro.
    * **How do I run that locally?** Follow the instructions under
      `To Run This Test`. But be aware that if it regressed on Android and
      you're developing on Windows, you may not be able to reproduce locally.
      (See Debugging regressions below.)
    * **What is this testing?** Generally the metric
      (`timeToFirstContentfulPaint_avg`) gives some information. If you're
      not familiar, you can cc the
      [benchmark owner](https://docs.google.com/spreadsheets/d/1xaAo0_SU3iDfGdqDJZX_jRV0QtkufwHUKH3kQKF3YQs/edit#gid=0)
      to ask for help.
* **How severe is this regression?** There are different axes on which to
  answer that question:
    * **How much did performance regress?** The bisect bot answers this both
      in relative terms (`Change : 15.25%`) and absolute terms
      (`1010.02 -> 1164.04`). To understand the absolute terms, you'll need
      to look at the units on the performance graphs linked in comment #1
      of the bug (`https://chromeperf.appspot.com/group_report?bug_id=XXX`).
      In this example, the units are milliseconds; the time to load taobao
      regressed from ~1.01 seconds to ~1.16 seconds.
    * **How widespread is the regression?** The graphs linked in comment #1
      of the bug (`https://chromeperf.appspot.com/group_report?bug_id=XXX`)
      will give you an idea how widespread the regression is. The `Bot`
      column shows all the different bots the regression occurred on, and
      the `Test` column shows the metrics it regressed on. Often, the same
      metric is gathered on many different web pages. If you see a long
      list of pages, it's likely that the regression affects most pages; if
      it's short, maybe your regression is an edge case.

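The relative change the bisect bot reports can be sanity-checked by hand; a
quick sketch using the example numbers above (values in milliseconds):

```python
# Reproduce the "Change : 15.25%" figure from the example bisect output.
good_ms = 1010.02  # chromium@479147, the last known-good baseline
bad_ms = 1164.04   # chromium@479270, the first bad revision

change_pct = (bad_ms - good_ms) / good_ms * 100
print(f"{change_pct:.2f}%")  # 15.25%
```
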
## Debugging regressions

* **How do I run the test locally?** Follow the instructions under
  `To Run This Test` in the bisect comment. But be aware that regressions
  are often hardware- and/or platform-specific.
* **What do I do if I don't have the right hardware to test locally?** If
  you don't have a local machine that matches the specs of the hardware that
  regressed, you can run a perf tryjob on the same lab machines that ran the
  bisect that blamed your CL.
  [Here are the instructions for perf tryjobs](perf_trybots.md).
  Drop the `perf_bisect` from the bot name and substitute dashes for
  underscores to get the trybot name (`mac_pro_perf_bisect` -> `mac_pro`
  in the example above).
* **Can I get a trace?** For most metrics, yes. Here are the steps:
  1. Click on the `All graphs for this bug` link in comment #1. It should
     look like this:
     `https://chromeperf.appspot.com/group_report?bug_id=XXXX`
  2. Select a bot/test combo that looks like what the bisect bot originally
     caught. You might want to look through various regressions for a
     really large increase.
  3. On the graph, click on the exclamation point icon at the regression,
     and a tooltip comes up. There is a "trace" link in the tooltip; click
     it to open the trace that was recorded during the performance test.
* **Wait, what's a trace?** See the
  [documentation on tracing](https://www.chromium.org/developers/how-tos/trace-event-profiling-tool)
  to learn how to use traces to debug performance issues.
* **Are there debugging tips specific to certain benchmarks?**
    * **[Memory](https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md)**
    * **[Android binary size](apk_size_regressions.md)**

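The bisect-bot-to-trybot name conversion above can be sketched as a tiny
hypothetical helper (not part of the Chromium tooling; it implements only the
suffix drop shown in the example, so check perf_trybots.md for the exact
trybot names):

```python
def bisect_bot_to_trybot(bot_name: str) -> str:
    """Hypothetical helper: derive a trybot name from a bisect bot name.

    Mirrors the example above by dropping the trailing "_perf_bisect".
    Consult perf_trybots.md for the authoritative list of trybot names.
    """
    suffix = "_perf_bisect"
    if bot_name.endswith(suffix):
        return bot_name[: -len(suffix)]
    return bot_name

print(bisect_bot_to_trybot("mac_pro_perf_bisect"))  # mac_pro
```
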
## If you don't believe your CL could be the cause

There are some clear reasons to believe the bisect bot made a mistake:

* Your CL changes a test or some code that isn't compiled on the platform
  that regressed.
* Your CL is completely unrelated to the metric that regressed.
* You looked at the numbers the bisect spit out (see example above; the
  first column is the revision, the second column is the value at that
  revision, and the third column is the standard deviation), and:
    * The change attributed to your CL seems well within the noise, or
    * The change at your CL is an improvement (for example, the metric is
      bytes of memory used, and the value goes **down** at your CL), or
    * The change is far smaller than what's reported in the bug summary
      (for example, the bug says there is a 15% memory regression but the
      bisect found that your CL increases memory by 0.77%).

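For the "well within the noise" case, one rough sanity check (not the bisect
bot's actual statistics) is to compare the delta at your CL against the
reported standard deviations, here using the example numbers above:

```python
# Rough noise check with the example bisect numbers (milliseconds).
prev_value, prev_stddev = 886.511, 1150.91  # chromium@479269 (last good)
cl_value, cl_stddev = 1164.04, 979.746      # chromium@479270 (blamed CL)

delta = cl_value - prev_value  # ~277.5 ms
# A delta much smaller than either standard deviation suggests the result
# could be noise, and is worth confirming with another bisect.
print(delta < min(prev_stddev, cl_stddev))  # True
```
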
Do the following:

* Add a comment to the bug explaining why you believe your CL is not the
  cause of the regression.
* **Unassign yourself from the bug**. This lets our triage process know
  that you are not actively working on the bug.
* Kick off another bisect. You can do this by:
  1. Click on the `All graphs for this bug` link in comment #1. It should
     look like this:
     `https://chromeperf.appspot.com/group_report?bug_id=XXXX`
  2. Sign in to the dashboard with your chromium.org account in the upper
     right corner.
  3. Select a bot/test combo that looks like what the bisect bot originally
     caught. You might want to look through various regressions for a
     really clear increase.
  4. On the graph, click on the exclamation point icon at the regression,
     and a tooltip comes up. Click the `Bisect` button on the tooltip.

## If you believe the regression is justified

Sometimes you are aware that your CL caused a performance regression, but
you believe the CL should be landed as-is anyway. Chrome's
[core principles](https://www.chromium.org/developers/core-principles) state:

> If you make a change that regresses measured performance, you will be
> required to fix it or revert.

**It is your responsibility to justify the regression.** You must add a
comment on the bug explaining your justification clearly before WontFix-ing.

Here are some common justification scenarios:

* **Your change regresses this metric, but is a net positive for
  performance.** There are a few ways to demonstrate that this is true:
    * **Use benchmark results.** If your change has a positive impact, there
      should be clear improvements detected in benchmarks. You can look at
      all the changes (positive and negative) the perf dashboard detected by
      entering the commit position of a change into this url:
      `https://chromeperf.appspot.com/group_report?rev=YOUR_COMMIT_POS_HERE`
      All of these changes are generally changes found on a CL range, and
      may not be attributable to your CL. You can bisect any of these to
      find out if your CL caused the improvement, just like you can bisect
      to find out if it caused the regression.
    * **Use Finch trial results.** There are some types of changes that
      cannot be measured well in benchmarks. If you believe your case falls
      into this category, you can show that end users are not affected via
      a Finch trial. See the "End-user metrics" section of
      [How does Chrome measure performance](how_does_chrome_measure_performance.md).
* **Your change is a critical correctness or security fix.**
  It's true that sometimes something was "fast" because it was implemented
  incorrectly. In this case, a justification should clarify the performance
  cost we are paying for the fix and why it is worth it. Some things to
  include:
    * **What did the benchmark regression cost?** Look at the list of
      regressions in bug comment #1:
      `https://chromeperf.appspot.com/group_report?bug_id=XXXX`
      What is the absolute cost (5MiB RAM? 200ms on page load?)
      How many pages regressed? How many platforms?
    * **What do we gain?** It could be something like:
        * Reduced code complexity
        * Optimal code or UI correctness
        * Additional security
        * Knowledge via an experiment
        * Marketing - something good for users
    * **Is there a more performant way to solve the problem?**
      The [benchmark owner](https://docs.google.com/spreadsheets/d/1xaAo0_SU3iDfGdqDJZX_jRV0QtkufwHUKH3kQKF3YQs/edit#gid=0)
      can generally give you an idea how much work it would take to make a
      similarly-sized performance gain. For example, it might take 1.5
      engineering years to save 3MiB of RAM on Android; could you solve the
      problem in a way that takes less memory than that in less than 1.5
      years?
* **This performance metric is incorrect.** Not all tests are perfect. It's
  possible that your change did not regress performance, and it only
  appears to be a problem because the test is measuring incorrectly. If
  this is the case, you must explain clearly what the issue with the test
  is, and why you believe your change is performance neutral. Please
  include data from traces or other performance tools to clarify your
  claim.

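The two dashboard URL patterns used above differ only in their query
parameter; as an illustration (helper names are mine, not an official API):

```python
BASE = "https://chromeperf.appspot.com/group_report"

def bug_url(bug_id: int) -> str:
    # All graphs for the regressions attached to a bug.
    return f"{BASE}?bug_id={bug_id}"

def rev_url(commit_pos: int) -> str:
    # All changes (improvements and regressions) detected near a commit position.
    return f"{BASE}?rev={commit_pos}"

print(rev_url(479270))  # https://chromeperf.appspot.com/group_report?rev=479270
```
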
**In all cases,** make sure to cc the
[benchmark owner](https://docs.google.com/spreadsheets/d/1xaAo0_SU3iDfGdqDJZX_jRV0QtkufwHUKH3kQKF3YQs/edit#gid=0)
when writing a justification and WontFix-ing a bug. If you cannot come to
an agreement with the benchmark owner, you can escalate to
benhenry@chromium.org, the owner of speed releasing.