Chromium Code Reviews

| OLD | NEW |
|---|---|
| (Empty) | |
| 1 # Histogram Guidelines | |
|
rkaplow
2016/09/30 17:12:02
You could put the pointer to this file from histog
Mark P
2016/09/30 22:40:36
Done for both.
| |
| 2 | |
| 3 This document gives the best practices on how to use histograms in code and how | |
| 4 to document the histograms for the dashboard. There are three general types of | |
| 5 histograms: enumerated histograms (appropriate for enums), count histograms | |
| 6 (appropriate for arbitrary numbers), and sparse histograms (appropriate for | |
| 7 anything when precision is important over a wide range and/or the | |
| 8 range is not possible to specify a priori). | |
| 9 | |
| 10 [TOC] | |
| 11 | |
| 12 ## Emitting to Histograms | |
| 13 | |
| 14 ### Efficiency | |
| 15 | |
| 16 In general, the histogram code is highly optimized. Do not be concerned about | |
| 17 the processing cost of emitting to a histogram (unless you're using [sparse | |
| 18 histograms](#when-to-use-sparse-histograms)). | |
| 19 | |
| 20 ### Enum Histograms | |
| 21 | |
| 22 Enumerated histograms are most appropriate when you have a list of connected / | |
| 23 related states that should be analyzed jointly. For example, the set of | |
| 24 actions that can be done on the New Tab Page (use the omnibox, click a most | |
| 25 visited tile, click a bookmark, etc.) would make a good enumerated histogram. | |
|
rkaplow
2016/09/30 17:12:01
you may also want to add something like:
"It is o
Mark P
2016/09/30 22:40:36
Added a less wordy version. :-)
Thanks for the f
| |
| 26 | |
| 27 Please put a warning by the enum definition: | |
| 28 ``` | |
| 29 // These values are written to logs. New enum values can be added, but existing | |
| 30 // enums must never be renumbered or deleted and reused. | |
| 31 ``` | |
| 32 | |
| 33 Also, please explicitly set enum values `= 0`, `= 1`, `= 2`, etc. This makes it | |
|
rkaplow
2016/09/30 17:12:02
Probably worth mentioning that we allow appending
Mark P
2016/09/30 22:40:36
Done.
| |
| 34 clearer that the actual values are important. In addition, it helps confirm | |
| 35 the values align between the enum definition and histograms.xml. | |
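As an illustration of the two recommendations above, a hypothetical enum could look like the sketch below. The enum name, its entries, and the `kMaxValue` sentinel are assumptions made for this example; they are not taken from the change under review.

```cpp
// These values are written to logs. New enum values can be added, but existing
// enums must never be renumbered or deleted and reused.
enum class NewTabPageAction {  // Hypothetical name, for illustration only.
  kUsedOmnibox = 0,
  kClickedMostVisitedTile = 1,
  kClickedBookmark = 2,
  // Append new values above this line and update kMaxValue to match.
  kMaxValue = kClickedBookmark,
};

// Number of distinct values, usable as the exclusive upper bound of an
// enumerated histogram.
constexpr int kNewTabPageActionCount =
    static_cast<int>(NewTabPageAction::kMaxValue) + 1;

// Compile-time check that the explicit numbering is what the logs expect.
static_assert(static_cast<int>(NewTabPageAction::kUsedOmnibox) == 0,
              "logged values must stay stable");
```

Writing the values out makes a mismatch with the entries in histograms.xml easy to spot during review.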
| 36 | |
| 37 ### Count Histograms: Choosing Min and Max | |
|
rkaplow
2016/09/30 17:12:01
should we have a small introduction to counts here
Mark P
2016/09/30 22:40:36
Done. Seems a little wordy though. :-| *shrug*
| |
| 38 | |
| 39 For histogram max, choose a value so that very few emissions to the histogram | |
| 40 will exceed the max. If many emissions hit the max, it can be difficult to | |
| 41 compute statistics such as the average or high-order percentiles such as the 99th | |
| 42 percentile. Err on the side of too large a range versus too short a range. | |
|
rkaplow
2016/09/30 17:12:02
Also option, you can put something like:
"One rul
Mark P
2016/09/30 22:40:36
Added, and removed an existing clause. Then added
| |
| 43 | |
| 44 For histogram min, if you care about all possible values (zero and above), | |
| 45 choose a min of 1. (All histograms have an underflow bucket; emitted zeros | |
| 46 will go there. That's why a min of 1 is appropriate.) Otherwise, choose the | |
| 47 min appropriate for your particular situation. | |
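The effect of min and max can be sketched with a small standalone helper. This is not Chromium code: the function names are hypothetical, and the convention that a sample equal to max lands in the overflow bucket is an assumption of this sketch.

```cpp
#include <cassert>
#include <vector>

enum class BucketKind { kUnderflow, kInRange, kOverflow };

// Assumed convention for this sketch: samples below `min` land in the
// underflow bucket and samples at or above `max` land in the overflow bucket.
BucketKind Classify(int sample, int min, int max) {
  assert(min < max);
  if (sample < min) return BucketKind::kUnderflow;
  if (sample >= max) return BucketKind::kOverflow;
  return BucketKind::kInRange;
}

// Fraction of samples whose exact value is lost to the overflow bucket.
// A large fraction suggests the chosen max is too small.
double OverflowFraction(const std::vector<int>& samples, int min, int max) {
  if (samples.empty()) return 0.0;
  int overflow = 0;
  for (int sample : samples) {
    if (Classify(sample, min, max) == BucketKind::kOverflow) ++overflow;
  }
  return static_cast<double>(overflow) / samples.size();
}
```

With a min of 1, a sample of 0 is classified as underflow, so zeros are still counted separately from the in-range values.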
| 48 | |
| 49 ### Count Histograms: Choosing Number of Buckets | |
| 50 | |
| 51 Choose the smallest number of buckets that will get you the granularity you | |
| 52 need. By default, count histogram bucket sizes scale exponentially so you can | |
| 53 get fine granularity when the numbers are small yet still reasonable | |
| 54 resolution for larger numbers. The macros default to around 50 buckets, | |
| 55 which is appropriate for most purposes. Because histograms pre-allocate all | |
| 56 the buckets, the number of buckets selected directly dictates how much memory | |
| 57 is used. Do not exceed 100 buckets without good reason (and consider whether | |
| 58 [sparse histograms](#when-to-use-sparse-histograms) might work better for you | |
| 59 in that case--they do not pre-allocate their buckets). | |
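To see why exponentially scaled buckets give fine granularity for small values, here is a simplified sketch that spaces bucket lower bounds evenly on a log scale. It is an illustration under stated assumptions, not the histogram implementation itself; the function name and the exact rounding are choices made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Returns the lower bounds of `bucket_count` buckets: an underflow bucket
// starting at 0, a first regular bucket starting at `min`, and a final
// overflow bucket starting at `max`. Simplified illustration only.
std::vector<int> ExponentialBucketBounds(int min, int max, int bucket_count) {
  assert(min >= 1 && min < max && bucket_count >= 3);
  std::vector<int> bounds = {0, min};
  int current = min;
  for (int i = 2; i < bucket_count; ++i) {
    // Spread the remaining range evenly on a log scale.
    const int remaining = bucket_count - i;
    const double log_current = std::log(static_cast<double>(current));
    const double log_max = std::log(static_cast<double>(max));
    const double log_next = log_current + (log_max - log_current) / remaining;
    const int next = static_cast<int>(std::lround(std::exp(log_next)));
    // Always advance by at least 1 so that every bucket is non-empty.
    current = (next > current) ? next : current + 1;
    bounds.push_back(current);
  }
  return bounds;
}
```

Each bound is stored up front, which mirrors the point above: memory use grows with the number of buckets whether or not any samples ever land in them.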
| 60 | |
| 61 ### Count Histograms with Linear Ranges | |
| 62 | |
| 63 If you want equally spaced buckets of size 1, use an enumerated histogram. | |
| 64 While it's possible to do this with a count histogram, it's easy to make a | |
| 65 mistake when setting the min, max, and number of buckets (because you have | |
| 66 to remember how underflow and overflow buckets are handled) and end up with | |
| 67 a histogram whose buckets are mostly, but not all, of size 1. | |
| 68 Using an enumerated histogram with a max value of your own choice is less | |
| 69 error-prone. | |
| 70 | |
| 71 ### Testing | |
| 72 | |
| 73 Test your histograms using [chrome://histograms](chrome://histograms). Make | |
| 74 sure they're being emitted to when you expect and not emitted to at other times. | |
| 75 Also check that the values emitted to are correct. Finally, for count | |
| 76 histograms, make sure that buckets capture enough precision for your needs over | |
| 77 the range. | |
| 78 | |
| 79 ### Revising Histograms | |
| 80 | |
| 81 If you're changing the semantics of a histogram (when it's emitted, what buckets | |
| 82 mean, etc.), make it into a new histogram with a new name. Otherwise the | |
| 83 "Everything" view on the dashboard will be mixing two different interpretations | |
| 84 of the data and make no sense. | |
| 85 | |
| 86 ### Deleting Histograms | |
| 87 | |
| 88 Please delete the code that emits to histograms that are no longer needed. | |
| 89 Histograms take up memory. Cleaning up histograms that you no longer care about | |
| 90 is good! But see the note below on | |
| 91 [Deleting Histogram Entries](#deleting-histogram-entries). | |
| 92 | |
| 93 ## Documenting Histograms | |
| 94 | |
| 95 ### Add Histogram and Documentation in the Same Changelist | |
| 96 | |
| 97 If possible, please add the histograms.xml description in the same changelist | |
| 98 in which you add the histogram-emitting code. This has several benefits. One, | |
| 99 it sometimes happens that the histograms.xml reviewer has questions or concerns | |
| 100 about the histogram description that reveal problems with interpretation of the | |
| 101 data and call for a different recording strategy. Two, it allows the histogram | |
| 102 reviewer to easily review the emission code to see if it comports with these | |
| 103 best practices, and to look for other errors. | |
| 104 | |
| 105 ### Understandable to Everyone | |
| 106 | |
| 107 Histogram descriptions should be roughly understandable to someone not familiar | |
| 108 with your feature. Please add a sentence or two of background if | |
| 109 necessary. | |
|
rkaplow
2016/09/30 17:12:01
Maybe add:
It is good practice to note caveats as
Mark P
2016/09/30 22:40:36
I don't think the list of platforms should be adde
| |
| 110 | |
| 111 ### State When It Is Recorded | |
| 112 | |
| 113 Histogram descriptions should clearly state when the histogram is emitted | |
| 114 (profile open? network request received? etc.). | |
| 115 | |
| 116 ### Deleting Histogram Entries | |
| 117 | |
| 118 Do not delete histograms from histograms.xml. Instead, mark unused histograms | |
| 119 as obsolete. It would be bad if someone accidentally reused your old | |
|
rkaplow
2016/09/30 17:12:02
obsolete, with the associated date or milestone in
Mark P
2016/09/30 22:40:36
Both good suggestions. Integrated them both.
| |
| 120 histogram name and thereby corrupted new data with whatever old data is still | |
| 121 coming in. It's also useful to keep obsolete histogram descriptions in | |
| 122 histograms.xml--that way, if someone is searching for a histogram to answer | |
| 123 a particular question, they can learn if there was a histogram at some point | |
| 124 that did so even if it isn't active now. | |
| 125 | |
| 126 ## When To Use Sparse Histograms | |
| 127 | |
| 128 Sparse histograms are well suited for recording counts of exact sample values | |
| 129 that are sparsely distributed over a large range. | |
| 130 | |
| 131 The implementation uses a lock and a map, whereas other histogram types use a | |
| 132 vector and no lock. It is thus more costly to add values to, and each value | |
| 133 stored has more overhead, compared to the other histogram types. However, it | |
| 134 may be more efficient in memory if the total number of sample values is small | |
| 135 compared to the range of their values. | |
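The trade-off described above can be sketched with a minimal map-plus-lock class. This is a standalone illustration, not the code in `sparse_histogram.h`; the class and method names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>

// Sketch of a sparse histogram: one map entry per distinct sample value,
// guarded by a lock. Nothing is allocated for values that never occur.
class SparseHistogramSketch {
 public:
  // Takes the lock on every call, which is what makes adding a sample more
  // costly than indexing into a pre-allocated vector of buckets.
  void Add(int sample) {
    std::lock_guard<std::mutex> guard(lock_);
    ++counts_[sample];
  }

  // Returns how many times `sample` was added, or 0 if it was never seen.
  int CountFor(int sample) const {
    std::lock_guard<std::mutex> guard(lock_);
    auto it = counts_.find(sample);
    return it == counts_.end() ? 0 : it->second;
  }

  // Number of distinct sample values currently stored.
  std::size_t StoredEntries() const {
    std::lock_guard<std::mutex> guard(lock_);
    return counts_.size();
  }

 private:
  mutable std::mutex lock_;
  std::map<int, int> counts_;
};
```

Recording the values 404, 404, and 1000000 stores only two entries, whereas a pre-allocated histogram with exact buckets over the same range would need about a million.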
| 136 | |
| 137 For more information, see | |
| 138 [sparse_histogram.h](https://cs.chromium.org/chromium/src/base/metrics/sparse_histogram.h). | |