Chromium Code Reviews| Index: tools/metrics/histograms/README.md |
| diff --git a/tools/metrics/histograms/README.md b/tools/metrics/histograms/README.md |
| new file mode 100644 |
| index 0000000000000000000000000000000000000000..7fea174c9f050ec20c1d0b9f332786fcaedf827c |
| --- /dev/null |
| +++ b/tools/metrics/histograms/README.md |
| @@ -0,0 +1,138 @@ |
| +# Histogram Guidelines |
|
rkaplow
2016/09/30 17:12:02
You could put the pointer to this file from histog
Mark P
2016/09/30 22:40:36
Done for both.
|
| + |
| +This document gives the best practices on how to use histograms in code and how |
| +to document the histograms for the dashboard. There are three general types of |
| +histograms: enumerated histograms (appropriate for enums), count histograms |
| +(appropriate for arbitrary numbers), and sparse histograms (appropriate for |
| +anything when precision is important over a wide range and/or the range is |
| +not possible to specify a priori). |
| + |
| +[TOC] |
| + |
| +## Emitting to Histograms |
| + |
| +### Efficiency |
| + |
| +In general, the histogram code is highly optimized. Do not be concerned about |
| +the processing cost of emitting to a histogram (unless you're using [sparse |
| +histograms](#when-to-use-sparse-histograms)). |
| + |
| +### Enum Histograms |
| + |
| +Enumerated histograms are most appropriate when you have a list of connected / |
| +related states that should be analyzed jointly. For example, the set of |
| +actions that can be done on the New Tab Page (use the omnibox, click a most |
| +visited tile, click a bookmark, etc.) would make a good enumerated histogram. |
|
rkaplow
2016/09/30 17:12:01
you may also want to add something like:
"It is o
Mark P
2016/09/30 22:40:36
Added a less wordy version. :-)
Thanks for the f
|
| + |
| +Please put a warning by the enum definition: |
| +``` |
| +// These values are written to logs. New enum values can be added, but existing |
| +// enums must never be renumbered or deleted and reused. |
| +``` |
| + |
| +Also, please explicitly set enum values `= 0`, `= 1`, `= 2`, etc. This makes |
|
rkaplow
2016/09/30 17:12:02
Probably worth mentioning that we allow appending
Mark P
2016/09/30 22:40:36
Done.
|
| +it clearer that the actual values are important. In addition, it helps confirm |
| +the values align between the enum definition and histograms.xml. |
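As a rough illustration of the two enum guidelines above (the warning comment and explicit values), here is a minimal sketch for the New Tab Page example. The enum name and its entries are hypothetical and chosen only for illustration; they are not taken from Chromium.

```cpp
// Hypothetical enum for illustration only; the names are not from Chromium.
//
// These values are written to logs. New enum values can be added, but existing
// enums must never be renumbered or deleted and reused.
enum class NewTabPageAction {
  kUseOmnibox = 0,
  kClickMostVisitedTile = 1,
  kClickBookmark = 2,
  // Append new values here, using the next unused number.
};

// Compile-time check that a logged value has not drifted from the number
// that the matching histograms.xml entry would document.
static_assert(static_cast<int>(NewTabPageAction::kClickBookmark) == 2,
              "NewTabPageAction values are logged and must not be renumbered");
```

Writing the numbers out means that reordering or deleting an entry cannot silently change the value that is logged for the remaining entries.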
| + |
| +### Count Histograms: Choosing Min and Max |
|
rkaplow
2016/09/30 17:12:01
should we have a small introduction to counts here
Mark P
2016/09/30 22:40:36
Done. Seems a little wordy though. :-| *shrug*
|
| + |
| +For histogram max, choose a value so that very few emissions to the histogram |
| +will exceed the max. If many emissions hit the max, it can be difficult to |
| +compute statistics such as average or high order percentiles such as the 99th |
| +percentile. Err on the side of too large a range versus too short a range. |
|
rkaplow
2016/09/30 17:12:02
Also option, you can put something like:
"One rul
Mark P
2016/09/30 22:40:36
Added, and removed an existing clause. Then added
|
| + |
| +For histogram min, if you care about all possible values (zero and above), |
| +choose a min of 1. (All histograms have an underflow bucket; emitted zeros |
| +will go there. That's why a min of 1 is appropriate.) Otherwise, choose the |
| +min appropriate for your particular situation. |
| + |
| +### Count Histograms: Choosing Number of Buckets |
| + |
| +Choose the smallest number of buckets that will get you the granularity you |
| +need. By default, count histogram bucket sizes scale exponentially so you can |
| +get fine granularity when the numbers are small yet still reasonable |
| +resolution for larger numbers. The macros default to around 50 buckets, |
| +which is appropriate for most purposes. Because histograms pre-allocate all |
| +the buckets, the number of buckets selected directly dictates how much memory |
| +is used. Do not exceed 100 buckets without good reason (and consider whether |
| +[sparse histograms](#when-to-use-sparse-histograms) might work better for you |
| +in that case--they do not pre-allocate their buckets). |
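To make the exponential scaling concrete, here is a simplified sketch of how bucket boundaries could be laid out for a given min, max, and bucket count. It is not Chromium's actual implementation; the function name `ExponentialBucketBounds` and its layout (bucket 0 as the underflow bucket, the last bucket as the overflow bucket) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Simplified sketch, not Chromium's code. Returns the lower bound of each
// bucket: bucket 0 is the underflow bucket starting at 0, bucket 1 starts at
// `min`, and the last bucket is the overflow bucket starting at `max`.
std::vector<int> ExponentialBucketBounds(int min, int max,
                                         std::size_t bucket_count) {
  assert(min >= 1 && max > min && bucket_count >= 3);
  std::vector<int> bounds(bucket_count, 0);
  int current = min;
  bounds[1] = current;
  const double log_max = std::log(static_cast<double>(max));
  for (std::size_t i = 2; i < bucket_count; ++i) {
    // Spread the remaining log-range evenly over the remaining buckets.
    const double log_current = std::log(static_cast<double>(current));
    const double log_step = (log_max - log_current) / (bucket_count - i);
    const int next = static_cast<int>(
        std::floor(std::exp(log_current + log_step) + 0.5));
    // Always advance by at least 1 so the bounds stay strictly increasing.
    current = (next > current) ? next : current + 1;
    bounds[i] = current;
  }
  return bounds;
}
```

With a min of 1, a max of 1000, and 50 buckets, the first few buckets produced by this sketch are one unit wide while the last ones span far more, which is the trade-off described above. Every entry of the returned vector exists whether or not a sample ever lands in it, which is why the bucket count drives memory use.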
| + |
| +### Count Histograms with Linear Ranges |
| + |
| +If you want equally spaced buckets of size 1, use an enumerated histogram. |
| +While it's possible to do this with a count histogram, it's easy to make a |
| +mistake when setting the min, max, and number of buckets (because you have |
| +to remember how underflow and overflow buckets are handled) and end up with |
| +a histogram in which most, but not all, buckets have size 1. |
| +Using an enumerated histogram with a max value of your own choice is less |
| +error-prone. |
| + |
| +### Testing |
| + |
| +Test your histograms using [chrome://histograms](chrome://histograms). Make |
| +sure they're being emitted to when you expect and not emitted to at other times. |
| +Also check that the values emitted to are correct. Finally, for count |
| +histograms, make sure that buckets capture enough precision for your needs over |
| +the range. |
| + |
| +### Revising Histograms |
| + |
| +If you're changing the semantics of a histogram (when it's emitted, what buckets |
| +mean, etc.), make it into a new histogram with a new name. Otherwise the |
| +"Everything" view on the dashboard will be mixing two different interpretations |
| +of the data and make no sense. |
| + |
| +### Deleting Histograms |
| + |
| +Please delete the code that emits to histograms that are no longer needed. |
| +Histograms take up memory. Cleaning up histograms that you no longer care about |
| +is good! But see the note below on |
| +[Deleting Histogram Entries](#deleting-histogram-entries). |
| + |
| +## Documenting Histograms |
| + |
| +### Add Histogram and Documentation in the Same Changelist |
| + |
| +If possible, please add the histograms.xml description in the same changelist |
| +in which you add the histogram-emitting code. This has several benefits. One, |
| +it sometimes happens that the histograms.xml reviewer has questions or concerns |
| +about the histogram description that reveal problems with interpretation of the |
| +data and call for a different recording strategy. Two, it allows the histogram |
| +reviewer to easily review the emission code to see if it comports with these |
| +best practices, and to look for other errors. |
| + |
| +### Understandable to Everyone |
| + |
| +Histogram descriptions should be roughly understandable to someone not familiar |
| +with your feature. Please add a sentence or two of background if |
| +necessary. |
|
rkaplow
2016/09/30 17:12:01
Maybe add:
It is good practice to note caveats as
Mark P
2016/09/30 22:40:36
I don't think the list of platforms should be adde
|
| + |
| +### State When It Is Recorded |
| + |
| +Histogram descriptions should clearly state when the histogram is emitted |
| +(profile open? network request received? etc.). |
| + |
| +### Deleting Histogram Entries |
| + |
| +Do not delete histograms from histograms.xml. Instead, mark unused histograms |
| +as obsolete. It would be bad if someone accidentally reused your old |
|
rkaplow
2016/09/30 17:12:02
obsolete, with the associated date or milestone in
Mark P
2016/09/30 22:40:36
Both good suggestions. Integrated them both.
|
| +histogram name and thereby corrupted new data with whatever old data is still |
| +coming in. It's also useful to keep obsolete histogram descriptions in |
| +histograms.xml--that way, if someone is searching for a histogram to answer |
| +a particular question, they can learn if there was a histogram at some point |
| +that did so even if it isn't active now. |
| + |
| +## When To Use Sparse Histograms |
| + |
| +Sparse histograms are well suited for recording counts of exact sample values |
| +that are sparsely distributed over a large range. |
| + |
| +The implementation uses a lock and a map, whereas other histogram types use a |
| +vector and no lock. It is thus more costly to add values to, and each value |
| +stored has more overhead, compared to the other histogram types. However it |
| +may be more efficient in memory if the total number of sample values is small |
| +compared to the range of their values. |
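The trade-off described above can be sketched with two toy containers. This is only an illustration of the lock-plus-map versus pre-allocated-vector distinction; the class names `SparseCounts` and `DenseCounts` and their methods are hypothetical and do not mirror the real classes in base/metrics.

```cpp
#include <cstddef>
#include <map>
#include <mutex>
#include <vector>

// Illustrative only: a map guarded by a lock. An entry exists only for each
// distinct sample value that was actually recorded.
class SparseCounts {
 public:
  void Add(int sample) {
    std::lock_guard<std::mutex> guard(lock_);
    ++counts_[sample];
  }
  std::size_t DistinctValues() const {
    std::lock_guard<std::mutex> guard(lock_);
    return counts_.size();
  }
  int CountFor(int sample) const {
    std::lock_guard<std::mutex> guard(lock_);
    auto it = counts_.find(sample);
    return it == counts_.end() ? 0 : it->second;
  }

 private:
  mutable std::mutex lock_;
  std::map<int, int> counts_;
};

// Illustrative only: every bucket is allocated up front, used or not, and
// adding a sample is a plain indexed increment with no lock.
class DenseCounts {
 public:
  explicit DenseCounts(std::size_t bucket_count) : counts_(bucket_count, 0) {}
  void Add(std::size_t bucket) { ++counts_.at(bucket); }
  std::size_t AllocatedBuckets() const { return counts_.size(); }

 private:
  std::vector<int> counts_;
};
```

Recording the values 404, 404, and 1000000 in `SparseCounts` stores two entries, whereas a dense layout with exact buckets covering that range would need on the order of a million slots. Each `Add` on the sparse side pays for the lock and the map lookup.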
| + |
| +For more information, see |
| +[sparse_histogram.h](https://cs.chromium.org/chromium/src/base/metrics/sparse_histogram.h). |