OLD | NEW |
---|---|
1 # Histogram Guidelines | 1 # Histogram Guidelines |
2 | 2 |
3 This document gives the best practices on how to use histograms in code and how | 3 This document gives the best practices on how to use histograms in code and how |
4 to document the histograms for the dashboard. There are three general types of | 4 to document the histograms for the dashboards. There are three general types |
5 histograms: enumerated histograms (appropriate for enums), count histograms | 5 of histograms: enumerated histograms (appropriate for enums), count histograms |
6 (appropriate for arbitrary numbers), and sparse histograms (appropriate for | 6 (appropriate for arbitrary numbers), and sparse histograms (appropriate for |
7 anything when precision is important over a wide range and/or the | 7 anything when precision is important over a wide range and/or the |
8 range is not possible to specify a priori). | 8 range is not possible to specify a priori). |
9 | 9 |
10 [TOC] | 10 [TOC] |
11 | 11 |
12 ## Emitting to Histograms | 12 ## Emitting to Histograms |
13 | 13 |
14 ### Directly Measure What You Want | 14 ### Directly Measure What You Want |
15 | 15 |
16 Measure exactly what you want, whether that's time used for a function call, | 16 Measure exactly what you want, whether that's time used for a function call, |
17 number of bytes transmitted to fetch a page, number of items in a list, etc. | 17 number of bytes transmitted to fetch a page, number of items in a list, etc. |
18 Do not assume you can calculate what you want from other histograms. Most of | 18 Do not assume you can calculate what you want from other histograms. Most of |
19 the ways to do this are incorrect. For example, if you want to know the time | 19 the ways to do this are incorrect. For example, if you want to know the time |
20 taken by a function that all it does is call two other functions, both of which | 20 taken by a function that all it does is call two other functions, both of which |
21 have histogram logging, you might think you can simply add up | 21 have histogram logging, you might think you can simply add up |
22 the histograms for those functions to get the total time. This is wrong. | 22 the histograms for those functions to get the total time. This is wrong. |
23 If we knew which emissions came from which calls, we could pair them up and | 23 If we knew which emissions came from which calls, we could pair them up and |
24 derive the total time for the function. However, histogram entries do not | 24 derive the total time for the function. However, histogram entries do not |
25 come with timestamps, so we cannot pair them up appropriately. If you simply add up the | 25 come with timestamps, so we cannot pair them up appropriately. If you simply add up the |
26 two histograms to get the total histogram, you're implicitly assuming those | 26 two histograms to get the total histogram, you're implicitly assuming those |
27 values are independent, which may not be the case. Directly measure what you | 27 values are independent, which may not be the case. Directly measure what you |
28 care about; don't try to derive it from other data. | 28 care about; don't try to derive it from other data. |
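As an illustration of the point above, here is a minimal Python sketch (not Chromium code; the timing values and function names are made up for the example). Two hypothetical workloads produce identical per-function histograms for the two inner functions, yet the directly measured total-time histograms differ, so the total cannot be recovered from the parts.

```python
from collections import Counter

# Each tuple is one call of the outer function: (ms spent in f, ms spent in g).
# These values are hypothetical and chosen only to make the effect visible.
correlated = [(1, 1), (1, 1), (9, 9), (9, 9)]      # slow f coincides with slow g
anticorrelated = [(1, 9), (1, 9), (9, 1), (9, 1)]  # slow f coincides with fast g


def per_function_histograms(calls):
    """Histograms that f and g would each record on their own."""
    f_hist = Counter(f for f, _ in calls)
    g_hist = Counter(g for _, g in calls)
    return f_hist, g_hist


def total_histogram(calls):
    """Histogram obtained by timing the outer function directly."""
    return Counter(f + g for f, g in calls)


# The per-function histograms cannot tell the two workloads apart...
same_inputs = (per_function_histograms(correlated)
               == per_function_histograms(anticorrelated))
print(same_inputs)

# ...but the directly measured totals are different distributions.
print(total_histogram(correlated))
print(total_histogram(anticorrelated))
```

Any procedure that derives the total from the two separate histograms must return the same answer for both workloads, so it is wrong for at least one of them.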
29 | 29 |
30 ### Efficiency | 30 ### Efficiency |
31 | 31 |
32 In general, the histogram code is highly optimized. Do not be concerned about | 32 In general, the histogram code is highly optimized. Do not be concerned about |
33 the processing cost of emitting to a histogram (unless you're using [sparse | 33 the processing cost of emitting to a histogram (unless you're using [sparse |
34 histograms](#when-to-use-sparse-histograms)). | 34 histograms](#When-To-Use-Sparse-Histograms)). |
35 | 35 |
36 ### Enum Histograms | 36 ### Enum Histograms |
37 | 37 |
38 Enumerated histograms are most appropriate when you have a list of connected / | 38 Enumerated histograms are most appropriate when you have a list of connected / |
39 related states that should be analyzed jointly. For example, the set of | 39 related states that should be analyzed jointly. For example, the set of |
40 actions that can be done on the New Tab Page (use the omnibox, click a most | 40 actions that can be done on the New Tab Page (use the omnibox, click a most |
41 visited tile, click a bookmark, etc.) would make a good enumerated histogram. | 41 visited tile, click a bookmark, etc.) would make a good enumerated histogram. |
42 If the total count of your histogram (i.e. the sum across all buckets) is | 42 If the total count of your histogram (i.e. the sum across all buckets) is |
43 something meaningful--as it is in this example--that is generally a good sign. | 43 something meaningful--as it is in this example--that is generally a good sign. |
44 However, the total count does not have to be meaningful for an enum histogram | 44 However, the total count does not have to be meaningful for an enum histogram |
(...skipping 15 matching lines...) | |
60 | 60 |
61 [histogram_macros.h](https://cs.chromium.org/chromium/src/base/metrics/histogram_macros.h) | 61 [histogram_macros.h](https://cs.chromium.org/chromium/src/base/metrics/histogram_macros.h) |
62 provides macros for some common count types such as memory or elapsed time, in | 62 provides macros for some common count types such as memory or elapsed time, in |
63 addition to general count macros. These have reasonable default values; you | 63 addition to general count macros. These have reasonable default values; you |
64 will not often need to choose number of buckets or histogram min. You still | 64 will not often need to choose number of buckets or histogram min. You still |
65 will need to choose the histogram max (use the advice below). | 65 will need to choose the histogram max (use the advice below). |
66 | 66 |
67 If none of the default macros work well for you, please thoughtfully choose | 67 If none of the default macros work well for you, please thoughtfully choose |
68 a min, max, and bucket count for your histogram using the advice below. | 68 a min, max, and bucket count for your histogram using the advice below. |
69 | 69 |
70 ### Count Histograms: Choosing Min and Max | 70 #### Count Histograms: Choosing Min and Max |
71 | 71 |
72 For histogram max, choose a value so that very few emissions to the histogram | 72 For histogram max, choose a value so that very few emissions to the histogram |
73 will exceed the max. If many emissions hit the max, it can be difficult to | 73 will exceed the max. If many emissions hit the max, it can be difficult to |
74 compute statistics such as average. One rule of thumb is at most 1% of samples | 74 compute statistics such as average. One rule of thumb is at most 1% of samples |
75 should be in the overflow bucket. This allows analysis of the 99th percentile. | 75 should be in the overflow bucket. This allows analysis of the 99th percentile. |
76 Err on the side of too large a range versus too short a range. (Remember that if you choose poorly, you'll have to wait for another release cycle to fix it.) | 76 Err on the side of too large a range versus too short a range. (Remember that if you choose poorly, you'll have to wait for another release cycle to fix it.) |
77 | 77 |
78 For histogram min, if you care about all possible values (zero and above), | 78 For histogram min, if you care about all possible values (zero and above), |
79 choose a min of 1. (All histograms have an underflow bucket; emitted zeros | 79 choose a min of 1. (All histograms have an underflow bucket; emitted zeros |
80 will go there. That's why a min of 1 is appropriate.) Otherwise, choose the | 80 will go there. That's why a min of 1 is appropriate.) Otherwise, choose the |
81 min appropriate for your particular situation. | 81 min appropriate for your particular situation. |
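The 1% rule of thumb can be checked against sample data before picking a max. The sketch below is a standalone Python illustration, not Chromium code: the helper names and the sample data are hypothetical, and it assumes that a sample equal to or above the max lands in the overflow bucket and a sample below the min lands in the underflow bucket.

```python
def overflow_fraction(samples, histogram_max):
    """Fraction of samples that would fall in the overflow bucket.

    Assumption for this sketch: values >= histogram_max overflow.
    """
    return sum(1 for s in samples if s >= histogram_max) / len(samples)


def underflow_fraction(samples, histogram_min):
    """Fraction of samples below histogram_min (the underflow bucket)."""
    return sum(1 for s in samples if s < histogram_min) / len(samples)


# Hypothetical measurements: the integers 0..999, each seen once.
samples = list(range(1000))

# A max of 100 pushes 90% of the samples into overflow: far too small.
too_small = overflow_fraction(samples, 100)

# A max of 990 leaves 1% in overflow, which still permits 99th-percentile analysis.
acceptable = overflow_fraction(samples, 990)

# With a min of 1, only the emitted zeros go to the underflow bucket.
zeros = underflow_fraction(samples, 1)

print(too_small, acceptable, zeros)
```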
82 | 82 |
83 ### Count Histograms: Choosing Number of Buckets | 83 #### Count Histograms: Choosing Number of Buckets |
84 | 84 |
85 Choose the smallest number of buckets that will get you the granularity you | 85 Choose the smallest number of buckets that will get you the granularity you |
86 need. By default, count histogram bucket sizes scale exponentially so you can | 86 need. By default, count histogram bucket sizes scale exponentially so you can |
87 get finely granularity when the numbers are small yet still reasonable | 87 get fine granularity when the numbers are small yet still reasonable resolution |
88 resolution for larger numbers. The macros default to bucket sizes around 50 | 88 for larger numbers. The macros default to 50 buckets which is appropriate for |
Mark P
2016/10/03 22:00:47
some default to 100; that's why I just wishy-washy
rkaplow
2016/10/04 15:40:33
ok, just added a note.
| |
89 which is appropriate for most purposes. Because histograms pre-allocate all | 89 most purposes. Because histograms pre-allocate all the buckets, the number of |
90 the buckets, the number of buckets selected directly dictates how much memory | 90 buckets selected directly dictates how much memory is used. Do not exceed 100 |
91 is used. Do not exceed 100 buckets without good reason (and consider whether | 91 buckets without good reason (and consider whether [sparse histograms](#When-To- |
92 [sparse histograms](#when-to-use-sparse-histograms) might work better for you | 92 Use-Sparse-Histograms) might work better for you in that case--they do not pre- |
93 in that case--they do not pre-allocate their buckets). | 93 allocate their buckets). |
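To see why exponentially scaled buckets give fine granularity for small values and coarser resolution for large ones, consider the following Python sketch. It uses a simple geometric spacing chosen for illustration; it is not the algorithm in Chromium's histogram code, and the function name is hypothetical.

```python
def exponential_bucket_lower_bounds(minimum, maximum, bucket_count):
    """Return strictly increasing integer lower bounds, spaced geometrically.

    Illustrative only: assumes 1 <= minimum < maximum and that
    bucket_count is small enough for the integer bounds to stay distinct
    without overshooting the maximum.
    """
    ratio = (maximum / minimum) ** (1 / (bucket_count - 1))
    bounds = []
    for i in range(bucket_count):
        bound = round(minimum * ratio ** i)
        # Rounding can repeat a value when the ratio is small; keep bounds distinct.
        if bounds and bound <= bounds[-1]:
            bound = bounds[-1] + 1
        bounds.append(bound)
    return bounds


bounds = exponential_bucket_lower_bounds(1, 1000, 10)
widths = [hi - lo for lo, hi in zip(bounds, bounds[1:])]

print(bounds)  # lower bound of each bucket
print(widths)  # bucket widths grow as the values grow
```

Because every bucket is allocated up front, doubling `bucket_count` doubles the number of counters stored, whether or not any sample ever lands in them.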
94 | |
95 ### Count Histograms with Linear Ranges | |
96 | |
97 If you want equally spaced buckets of size 1, use an enumerated histogram. | |
98 While it's possible to do this with a count histogram, it's easy to make a | |
99 mistake when setting the min, max, and number of buckets (because you have | |
100 to remember how underflow and overflow buckets are handled) and end up with | |
101 a histogram that ends up with mostly buckets of size 1 but not all. | |
102 Using an enumerated histogram with a max value of your own choice is less | |
103 error-prone. | |
104 | 94 |
105 ### Testing | 95 ### Testing |
106 | 96 |
107 Test your histograms using [chrome://histograms](chrome://histograms). Make | 97 Test your histograms using *chrome://histograms*. Make sure they're being |
Mark P
2016/10/03 22:00:47
*chrome://histograms*
will this make a link?
rkaplow
2016/10/04 15:40:33
No - this just turns it italic. I'm not sure how t
| |
108 sure they're being emitted to when you expect and not emitted to at other times. | 98 emitted to when you expect and not emitted to at other times. Also check that |
109 Also check that the values emitted to are correct. Finally, for count | 99 the values emitted to are correct. Finally, for count histograms, make sure |
110 histograms, make sure that buckets capture enough precision for your needs over | 100 that buckets capture enough precision for your needs over the range. |
111 the range. | |
112 | 101 |
113 ### Revising Histograms | 102 ### Revising Histograms |
114 | 103 |
115 If you're changing the semantics of a histogram (when it's emitted, what buckets | 104 If you're changing the semantics of a histogram (when it's emitted, what buckets |
116 mean, etc.), make it into a new histogram with a new name. Otherwise the | 105 mean, etc.), make it into a new histogram with a new name. Otherwise the |
117 "Everything" view on the dashboard will be mixing two different interpretations | 106 "Everything" view on the dashboard will be mixing two different interpretations |
118 of the data and make no sense. | 107 of the data and make no sense. |
119 | 108 |
120 ### Deleting Histograms | 109 ### Deleting Histograms |
121 | 110 |
122 Please delete the code that emits to histograms that are no longer needed. | 111 Please delete the code that emits to histograms that are no longer needed. |
123 Histograms take up memory. Cleaning up histograms that you no longer care about | 112 Histograms take up memory. Cleaning up histograms that you no longer care about |
124 is good! But see the note below on [Deleting Histogram Entries] | 113 is good! But see the note below on [Deleting Histogram Entries](#Deleting-Histogram-Entries). |
125 (#deleting-histogram-entries). | |
126 | 114 |
127 ## Documenting Histograms | 115 ## Documenting Histograms |
128 | 116 |
129 ### Add Histogram and Documentation in the Same Changelist | 117 ### Add Histogram and Documentation in the Same Changelist |
130 | 118 |
131 If possible, please add the histograms.xml description in the same changelist | 119 If possible, please add the histograms.xml description in the same changelist |
132 in which you add the histogram-emitting code. This has several benefits. One, | 120 in which you add the histogram-emitting code. This has several benefits. One, |
133 it sometimes happens that the histograms.xml reviewer has questions or concerns | 121 it sometimes happens that the histograms.xml reviewer has questions or concerns |
134 about the histogram description that reveal problems with interpretation of the | 122 about the histogram description that reveal problems with interpretation of the |
135 data and call for a different recording strategy. Two, it allows the histogram | 123 data and call for a different recording strategy. Two, it allows the histogram |
(...skipping 33 matching lines...) | |
169 | 157 |
170 Sparse histograms are well suited for recording counts of exact sample values | 158 Sparse histograms are well suited for recording counts of exact sample values |
171 that are sparsely distributed over a large range. | 159 that are sparsely distributed over a large range. |
172 | 160 |
173 The implementation uses a lock and a map, whereas other histogram types use a | 161 The implementation uses a lock and a map, whereas other histogram types use a |
174 vector and no lock. It is thus more costly to add values to, and each value | 162 vector and no lock. It is thus more costly to add values to, and each value |
175 stored has more overhead, compared to the other histogram types. However it | 163 stored has more overhead, compared to the other histogram types. However it |
176 may be more efficient in memory if the total number of sample values is small | 164 may be more efficient in memory if the total number of sample values is small |
177 compared to the range of their values. | 165 compared to the range of their values. |
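The memory trade-off described above can be sketched in Python. This is a simplified model, not the Chromium implementation: a list stands in for the pre-allocated vector, a dict stands in for the map, the lock is omitted, and the sample values are hypothetical.

```python
# Hypothetical samples: a few distinct values spread over a very wide range.
samples = [404, 500, 404, 200000, 404]

# Dense storage: one pre-allocated counter per possible exact value.
dense = [0] * (max(samples) + 1)
for value in samples:
    dense[value] += 1

# Sparse storage: a counter is created only for values actually seen.
sparse = {}
for value in samples:
    sparse[value] = sparse.get(value, 0) + 1

print(len(dense))   # counters allocated by the dense layout
print(len(sparse))  # counters allocated by the sparse layout
print(sparse)
```

Each dict entry costs more than a list slot and each insertion does more work, which mirrors the overhead noted above; the sparse layout only wins when the number of distinct values is small relative to the range.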
178 | 166 |
179 For more information, see [sparse_histograms.h] | 167 For more information, see [sparse_histograms.h](https://cs.chromium.org/chromium/src/base/metrics/sparse_histogram.h). |
180 (https://cs.chromium.org/chromium/src/base/metrics/sparse_histogram.h). | |