| OLD | NEW | 
|---|---|
| (Empty) |  | 
|  | 1 # Histogram Guidelines | 
|  | 2 | 
|  | 3 This document gives the best practices on how to use histograms in code and how | 
|  | 4 to document the histograms for the dashboard.  There are three general types of | 
|  | 5 histograms: enumerated histograms (appropriate for enums), count histograms | 
|  | 6 (appropriate for arbitrary numbers), and sparse histograms (appropriate when | 
|  | 7 precision is important over a wide range and/or the range cannot be specified | 
|  | 8 a priori). | 
|  | 9 | 
|  | 10 [TOC] | 
|  | 11 | 
|  | 12 ## Emitting to Histograms | 
|  | 13 | 
|  | 14 ### Directly Measure What You Want | 
|  | 15 | 
|  | 16 Measure exactly what you want, whether that's time used for a function call, | 
|  | 17 number of bytes transmitted to fetch a page, number of items in a list, etc. | 
|  | 18 Do not assume you can calculate what you want from other histograms.  Most of | 
|  | 19 the ways to do this are incorrect.  For example, if you want to know the time | 
|  | 20 taken by a function that does nothing but call two other functions, both of | 
|  | 21 which have histogram logging, you might think you can simply add up the | 
|  | 22 histograms for those functions to get the total time.  This is wrong. | 
|  | 23 If we knew which emissions came from which calls, we could pair them up and | 
|  | 24 derive the total time for the function.  However, histogram entries do not | 
|  | 25 come with timestamps, so we cannot pair them up.  If you simply add up the | 
|  | 26 two histograms to get the total histogram, you're implicitly assuming those | 
|  | 27 values are independent, which may not be the case.  Directly measure what you | 
|  | 28 care about; don't try to derive it from other data. | 
|  | 29 | 
|  | 30 ### Efficiency | 
|  | 31 | 
|  | 32 In general, the histogram code is highly optimized.  Do not be concerned about | 
|  | 33 the processing cost of emitting to a histogram (unless you're using [sparse | 
|  | 34 histograms](#when-to-use-sparse-histograms)). | 
|  | 35 | 
|  | 36 ### Enum Histograms | 
|  | 37 | 
|  | 38 Enumerated histograms are most appropriate when you have a list of connected / | 
|  | 39 related states that should be analyzed jointly.  For example, the set of | 
|  | 40 actions that can be done on the New Tab Page (use the omnibox, click a most | 
|  | 41 visited tile, click a bookmark, etc.) would make a good enumerated histogram. | 
|  | 42 If the total count of your histogram (i.e. the sum across all buckets) is | 
|  | 43 something meaningful--as it is in this example--that is generally a good sign. | 
|  | 44 However, the total count does not have to be meaningful for an enum histogram | 
|  | 45 to still be the right choice. | 
|  | 46 | 
|  | 47 You may append to your enum if the set of possible states/actions grows.  However, you | 
|  | 48 should not reorder, renumber, or otherwise reuse existing values.  As such, | 
|  | 49 please put this warning by the enum definition: | 
|  | 50 ``` | 
|  | 51 // These values are written to logs.  New enum values can be added, but existing | 
|  | 52 // enums must never be renumbered or deleted and reused. | 
|  | 53 ``` | 
|  | 54 | 
|  | 55 Also, please explicitly set enum values `= 0`, `= 1`, `= 2`, etc.  This makes | 
|  | 56 it clearer that the actual values are important.  In addition, it helps confirm | 
|  | 57 the values align between the enum definition and histograms.xml. | 
|  | 58 | 
|  | 59 ### Count Histograms | 
|  | 60 | 
|  | 61 [histogram_macros.h](https://cs.chromium.org/chromium/src/base/metrics/histogram_macros.h) | 
|  | 62 provides macros for some common count types such as memory or elapsed time, in | 
|  | 63 addition to general count macros.  These have reasonable default values; you | 
|  | 64 will not often need to choose the number of buckets or the histogram min.  You still | 
|  | 65 will need to choose the histogram max (use the advice below). | 
|  | 66 | 
|  | 67 If none of the default macros work well for you, please thoughtfully choose | 
|  | 68 a min, max, and bucket count for your histogram using the advice below. | 
|  | 69 | 
|  | 70 ### Count Histograms: Choosing Min and Max | 
|  | 71 | 
|  | 72 For histogram max, choose a value so that very few emissions to the histogram | 
|  | 73 will exceed the max.  If many emissions hit the max, it can be difficult to | 
|  | 74 compute statistics such as average.  One rule of thumb is at most 1% of samples | 
|  | 75 should be in the overflow bucket.  This allows analysis of the 99th percentile. | 
|  | 76 Err on the side of too large a range versus too short a range.  (Remember that if you choose poorly, you'll have to wait for another release cycle to fix it.) | 
|  | 77 | 
|  | 78 For histogram min, if you care about all possible values (zero and above), | 
|  | 79 choose a min of 1.  (All histograms have an underflow bucket; emitted zeros | 
|  | 80 will go there.  That's why a min of 1 is appropriate.)  Otherwise, choose the | 
|  | 81 min appropriate for your particular situation. | 
|  | 82 | 
|  | 83 ### Count Histograms: Choosing Number of Buckets | 
|  | 84 | 
|  | 85 Choose the smallest number of buckets that will get you the granularity you | 
|  | 86 need.  By default, count histogram bucket sizes scale exponentially so you can | 
|  | 87 get fine granularity when the numbers are small yet still reasonable | 
|  | 88 resolution for larger numbers.  The macros default to around 50 buckets, | 
|  | 89 which is appropriate for most purposes.  Because histograms pre-allocate all | 
|  | 90 the buckets, the number of buckets selected directly dictates how much memory | 
|  | 91 is used.  Do not exceed 100 buckets without good reason (and consider whether | 
|  | 92 [sparse histograms](#when-to-use-sparse-histograms) might work better for you | 
|  | 93 in that case--they do not pre-allocate their buckets). | 
|  | 94 | 
|  | 95 ### Count Histograms with Linear Ranges | 
|  | 96 | 
|  | 97 If you want equally spaced buckets of size 1, use an enumerated histogram. | 
|  | 98 While it's possible to do this with a count histogram, it's easy to make a | 
|  | 99 mistake when setting the min, max, and number of buckets (because you have | 
|  | 100 to remember how underflow and overflow buckets are handled) and end up with | 
|  | 101 a histogram in which most, but not all, buckets have size 1. | 
|  | 102 Using an enumerated histogram with a max value of your own choice is less | 
|  | 103 error-prone. | 
|  | 104 | 
|  | 105 ### Testing | 
|  | 106 | 
|  | 107 Test your histograms using [chrome://histograms](chrome://histograms).  Make | 
|  | 108 sure they're being emitted to when you expect and not emitted to at other times. | 
|  | 109 Also check that the values emitted to are correct.  Finally, for count | 
|  | 110 histograms, make sure that buckets capture enough precision for your needs over | 
|  | 111 the range. | 
|  | 112 | 
|  | 113 ### Revising Histograms | 
|  | 114 | 
|  | 115 If you're changing the semantics of a histogram (when it's emitted, what buckets | 
|  | 116 mean, etc.), make it into a new histogram with a new name.  Otherwise the | 
|  | 117 "Everything" view on the dashboard will be mixing two different interpretations | 
|  | 118 of the data and make no sense. | 
|  | 119 | 
|  | 120 ### Deleting Histograms | 
|  | 121 | 
|  | 122 Please delete the code that emits to histograms that are no longer needed. | 
|  | 123 Histograms take up memory.  Cleaning up histograms that you no longer care about | 
|  | 124 is good!  But see the note below on | 
|  | 125 [Deleting Histogram Entries](#deleting-histogram-entries). | 
|  | 126 | 
|  | 127 ## Documenting Histograms | 
|  | 128 | 
|  | 129 ### Add Histogram and Documentation in the Same Changelist | 
|  | 130 | 
|  | 131 If possible, please add the histograms.xml description in the same changelist | 
|  | 132 in which you add the histogram-emitting code.  This has several benefits.  One, | 
|  | 133 it sometimes happens that the histograms.xml reviewer has questions or concerns | 
|  | 134 about the histogram description that reveal problems with interpretation of the | 
|  | 135 data and call for a different recording strategy.  Two, it allows the histogram | 
|  | 136 reviewer to easily review the emission code to see if it comports with these | 
|  | 137 best practices, and to look for other errors. | 
|  | 138 | 
|  | 139 ### Understandable to Everyone | 
|  | 140 | 
|  | 141 Histogram descriptions should be roughly understandable to someone not familiar | 
|  | 142 with your feature.  Please add a sentence or two of background if necessary. | 
|  | 143 | 
|  | 144 It is good practice to note caveats associated with your histogram in this | 
|  | 145 section, such as which platforms are supported (if the set of supported | 
|  | 146 platforms is surprising).  E.g., a desktop feature that happens not to be logged | 
|  | 147 on Mac. | 
|  | 148 | 
|  | 149 ### State When It Is Recorded | 
|  | 150 | 
|  | 151 Histogram descriptions should clearly state when the histogram is emitted | 
|  | 152 (profile open? network request received? etc.). | 
|  | 153 | 
|  | 154 ### Deleting Histogram Entries | 
|  | 155 | 
|  | 156 Do not delete histograms from histograms.xml.  Instead, mark unused histograms | 
|  | 157 as obsolete, annotating them with the associated date or milestone in the | 
|  | 158 obsolete tag entry.  If your histogram is being replaced by a new version, we | 
|  | 159 suggest noting that in the previous histogram's description. | 
|  | 160 | 
|  | 161 Deleting histogram entries would be bad if someone were to accidentally reuse your | 
|  | 162 old histogram name and thereby corrupt new data with whatever old data is still | 
|  | 163 coming in.  It's also useful to keep obsolete histogram descriptions in | 
|  | 164 histograms.xml--that way, if someone is searching for a histogram to answer | 
|  | 165 a particular question, they can learn if there was a histogram at some point | 
|  | 166 that did so even if it isn't active now. | 
|  | 167 | 
|  | 168 ## When To Use Sparse Histograms | 
|  | 169 | 
|  | 170 Sparse histograms are well suited for recording counts of exact sample values | 
|  | 171 that are sparsely distributed over a large range. | 
|  | 172 | 
|  | 173 The implementation uses a lock and a map, whereas other histogram types use a | 
|  | 174 vector and no lock. It is thus more costly to add values to, and each value | 
|  | 175 stored has more overhead, compared to the other histogram types. However, it | 
|  | 176 may be more efficient in memory if the total number of sample values is small | 
|  | 177 compared to the range of their values. | 
|  | 178 | 
|  | 179 For more information, see | 
|  | 180 [sparse_histogram.h](https://cs.chromium.org/chromium/src/base/metrics/sparse_histogram.h). | 
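
The argument in "Directly Measure What You Want" can be illustrated with a small standalone sketch. It does not use Chromium's histogram code; `Histogram`, `Calls`, and every function name below are hypothetical and exist only for this illustration. Two sets of calls produce identical per-callee histograms yet different distributions of total time, so the total cannot be recovered from the two histograms alone.

```cpp
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a histogram: sample value -> number of samples.
using Histogram = std::map<int, int>;

// Each pair is one call of the outer function:
// {time spent in the first callee, time spent in the second callee}.
using Calls = std::vector<std::pair<int, int>>;

// What the first callee's own histogram would record.
Histogram FirstCalleeHistogram(const Calls& calls) {
  Histogram h;
  for (const auto& call : calls) ++h[call.first];
  return h;
}

// What the second callee's own histogram would record.
Histogram SecondCalleeHistogram(const Calls& calls) {
  Histogram h;
  for (const auto& call : calls) ++h[call.second];
  return h;
}

// What a direct measurement of the outer function would record.
Histogram TotalHistogram(const Calls& calls) {
  Histogram h;
  for (const auto& call : calls) ++h[call.first + call.second];
  return h;
}

// A fast first callee is always followed by a slow second callee, and vice
// versa.
Calls AntiCorrelatedCalls() { return {{1, 9}, {9, 1}}; }

// Both callees are fast together or slow together.
Calls CorrelatedCalls() { return {{1, 1}, {9, 9}}; }

// True when the per-callee histograms match but the totals do not.
bool SameCalleeHistogramsDifferentTotals() {
  const Calls a = AntiCorrelatedCalls();
  const Calls b = CorrelatedCalls();
  return FirstCalleeHistogram(a) == FirstCalleeHistogram(b) &&
         SecondCalleeHistogram(a) == SecondCalleeHistogram(b) &&
         TotalHistogram(a) != TotalHistogram(b);
}
```

In the first data set every call takes 10 units in total; in the second, one call takes 2 and the other 18. Anyone looking only at the two per-callee histograms could not tell these cases apart.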
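
The min, max, and bucket-count advice for count histograms is easier to reason about with a concrete bucket layout. The sketch below is not Chromium's bucketing algorithm; `ExponentialLowerBounds` and `BucketIndex` are hypothetical helpers written under the assumptions stated in the text above: the first bucket is an underflow bucket that catches values below the min, the last bucket is an overflow bucket starting at the max, and the buckets in between grow roughly exponentially.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the lower bound of each of `bucket_count`
// buckets. Bucket 0 is the underflow bucket [0, min); the last bucket is the
// overflow bucket [max, infinity); the rest are spaced evenly in log space.
std::vector<int> ExponentialLowerBounds(int min, int max,
                                        std::size_t bucket_count) {
  assert(min >= 1 && min < max && bucket_count >= 3);
  const std::size_t last = bucket_count - 1;
  std::vector<int> bounds(bucket_count);
  bounds[0] = 0;
  bounds[1] = min;
  const double log_max = std::log(static_cast<double>(max));
  for (std::size_t i = 2; i < last; ++i) {
    // Divide the remaining log-range equally among the remaining buckets.
    const double log_prev = std::log(static_cast<double>(bounds[i - 1]));
    const double log_step =
        (log_max - log_prev) / static_cast<double>(last - (i - 1));
    const int next =
        static_cast<int>(std::lround(std::exp(log_prev + log_step)));
    // Small values would otherwise round to the same bound twice.
    bounds[i] = std::max(next, bounds[i - 1] + 1);
  }
  bounds[last] = max;
  return bounds;
}

// Hypothetical helper: index of the bucket that `sample` falls into.
std::size_t BucketIndex(const std::vector<int>& bounds, int sample) {
  const auto it = std::upper_bound(bounds.begin(), bounds.end(), sample);
  if (it == bounds.begin()) return 0;  // Negative samples count as underflow.
  return static_cast<std::size_t>(it - bounds.begin()) - 1;
}
```

With a min of 1, a max of 1000, and 10 buckets, a sample of 0 lands in bucket 0 (underflow), a sample of 1 lands in bucket 1, and every sample of 1000 or more lands in bucket 9 (overflow). This is why a min of 1 still captures zeros, and why a max that is too small pushes many samples into one indistinguishable bucket.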