|
|
DescriptionAdd histograms for temperatures and thermal states.
BUG=None
Committed: https://crrev.com/7fbd516e6641f9f02446cb2ae956a6454e81d0c9
Cr-Commit-Position: refs/heads/master@{#313133}
Patch Set 1 #
Total comments: 8
Patch Set 2 : Add thermald histograms #
Total comments: 6
Messages
Total messages: 14 (2 generated)
mka@chromium.org changed reviewers: + isherman@chromium.org, kemp@chromium.org, vadimb@chromium.org
ping
https://codereview.chromium.org/851503005/diff/1/tools/metrics/histograms/his... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/851503005/diff/1/tools/metrics/histograms/his... tools/metrics/histograms/histograms.xml:24626: + <summary>Temperature reading at sensor 0 of the CPU.</summary> Please add information about how often this is logged, otherwise it's unclear how to interpret that (i.e. logged when the temperature is changed, or every second, or once a day, or when the system is restarted - it's all different). Same for other Temperature histograms. https://codereview.chromium.org/851503005/diff/1/tools/metrics/histograms/his... tools/metrics/histograms/histograms.xml:24647: + Distribution of time spent in the states of the thermal zone of the CPU. 1. Same question about how often this is logged. 2. What are the states? Since it's not enum it will be an integer, how to interpret that integer? 3. So what is logged here? Time or state? E.g., say I have the histogram: 0-1: 50 1-2: 60 How one would understand that? 50ms spent in state 0? Or 50s? Or we've seen the system 50 times in state 0? Same for the other states.
https://codereview.chromium.org/851503005/diff/1/tools/metrics/histograms/his... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/851503005/diff/1/tools/metrics/histograms/his... tools/metrics/histograms/histograms.xml:24626: + <summary>Temperature reading at sensor 0 of the CPU.</summary> On 2015/01/21 18:48:39, vadimb1 wrote: > Please add information about how often this is logged, otherwise it's unclear > how to interpret that (i.e. logged when the temperature is changed, or every > second, or once a day, or when the system is restarted - it's all different). > Same for other Temperature histograms. my idea is to log it every second, however it could happen that the actual sensor reading is performed at a slightly lower rate (every few seconds). the component reporting the metrics doesn't actually read the sensor but subscribes to the sensor to be notified on changes. the polling rate of the sensor might vary between different sensors and devices. not sure whether to express this here in detail or say "every few seconds" https://codereview.chromium.org/851503005/diff/1/tools/metrics/histograms/his... tools/metrics/histograms/histograms.xml:24647: + Distribution of time spent in the states of the thermal zone of the CPU. On 2015/01/21 18:48:39, vadimb1 wrote: > 1. Same question about how often this is logged. every second > 2. What are the states? Since it's not enum it will be an integer, how to > interpret that integer? I left it vague on purpose as the interpretation of the values might vary across different devices. What is true for all devices is that a higher state corresponds to a higher temperature > 3. So what is logged here? Time or state? E.g., say I have the histogram: > 0-1: 50 > 1-2: 60 > How one would understand that? 50ms spent in state 0? Or 50s? Or we've seen the > system 50 times in state 0? the state is logged, so it would be 50s spent in state 0 > Same for the other states.
https://codereview.chromium.org/851503005/diff/1/tools/metrics/histograms/his... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/851503005/diff/1/tools/metrics/histograms/his... tools/metrics/histograms/histograms.xml:24626: + <summary>Temperature reading at sensor 0 of the CPU.</summary> On 2015/01/21 19:19:33, mka wrote: > On 2015/01/21 18:48:39, vadimb1 wrote: > > Please add information about how often this is logged, otherwise it's unclear > > how to interpret that (i.e. logged when the temperature is changed, or every > > second, or once a day, or when the system is restarted - it's all different). > > Same for other Temperature histograms. > > my idea is to log it every second, however it could happen that the actual > sensor reading is performed at a slightly lower rate (every few seconds). the > component reporting the metrics doesn't actually read the sensor but subscribes > to the sensor to be notified on changes. the polling rate of the sensor might > vary between different sensors and devices. > > not sure whether to express this here in detail or say "every few seconds" Well the idea of the summary (which is shown on the dashboards) is that anyone looking at it should get as much information as reasonably possible to fit in a few lines of text so they don't need to dig into the sources or e-mail you every time. Not every person looking at this is familiar with your project; we had lots of scenarios when people find a histogram (in many cases created a few years ago) by text search and want to quickly check if it is close to what they were looking for. In this case I would say "collected every few seconds (may vary between devices)". See for example "Platform.Temperature.Sensor01" that says "Temperature reading at sensor 1 (I2C_CPU-Object) taken every 30s." BTW what is the difference, can you use Platform.Temperature.* or is your histogram completely unrelated to that group? Also on this subject, most of the other Platform.* histograms start description with "Chrome OS" - now sure if that is your case but you may follow the example for consistency if it is (but up to you). https://codereview.chromium.org/851503005/diff/1/tools/metrics/histograms/his... tools/metrics/histograms/histograms.xml:24647: + Distribution of time spent in the states of the thermal zone of the CPU. On 2015/01/21 19:19:33, mka wrote: > On 2015/01/21 18:48:39, vadimb1 wrote: > > 1. Same question about how often this is logged. > > every second > > > 2. What are the states? Since it's not enum it will be an integer, how to > > interpret that integer? > > I left it vague on purpose as the interpretation of the values might vary across > different devices. What is true for all devices is that a higher state > corresponds to a higher temperature > > > 3. So what is logged here? Time or state? E.g., say I have the histogram: > > 0-1: 50 > > 1-2: 60 > > How one would understand that? 50ms spent in state 0? Or 50s? Or we've seen > the > > system 50 times in state 0? > > the state is logged, so it would be 50s spent in state 0 > > > Same for the other states. > Well, make it clearer from the summary. It's very similar to the previous histograms group but the summary reads differently. I would say something like "State of the termal zone, collected every second. State depends on device, higher state corresponds to a higher temperature" or "Time spent in the states of termal zone, one count collected every second...". Also you probably could use units="State" here. Then the histograms table will be State Quantity PDF CFD while without units it will be Range Quantity PDF CFD
Vadim, thanks a lot for your feedback https://codereview.chromium.org/851503005/diff/1/tools/metrics/histograms/his... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/851503005/diff/1/tools/metrics/histograms/his... tools/metrics/histograms/histograms.xml:24626: + <summary>Temperature reading at sensor 0 of the CPU.</summary> On 2015/01/21 22:32:56, vadimb1 wrote: > On 2015/01/21 19:19:33, mka wrote: > > On 2015/01/21 18:48:39, vadimb1 wrote: > > > Please add information about how often this is logged, otherwise it's > unclear > > > how to interpret that (i.e. logged when the temperature is changed, or every > > > second, or once a day, or when the system is restarted - it's all > different). > > > Same for other Temperature histograms. > > > > my idea is to log it every second, however it could happen that the actual > > sensor reading is performed at a slightly lower rate (every few seconds). the > > component reporting the metrics doesn't actually read the sensor but > subscribes > > to the sensor to be notified on changes. the polling rate of the sensor might > > vary between different sensors and devices. > > > > not sure whether to express this here in detail or say "every few seconds" > > Well the idea of the summary (which is shown on the dashboards) is that anyone > looking at it should get as much information as reasonably possible to fit in a > few lines of text so they don't need to dig into the sources or e-mail you every > time. Not every person looking at this is familiar with your project; we had > lots of scenarios when people find a histogram (in many cases created a few > years ago) by text search and want to quickly check if it is close to what they > were looking for. > > In this case I would say "collected every few seconds (may vary between > devices)". > > See for example "Platform.Temperature.Sensor01" that says "Temperature reading > at sensor 1 (I2C_CPU-Object) taken every 30s." ack, I'll update the description > BTW what is the difference, can you use Platform.Temperature.* or is your > histogram completely unrelated to that group? yes, my histogram is completely unrelated. I saw the Platform.Temperature.* histograms and considered to add the new histograms to this group, but as there are also the thermal zone histograms it seemed more appropriate to group them together under thermal > Also on this subject, most of the other Platform.* histograms start description > with "Chrome OS" - now sure if that is your case but you may follow the example > for consistency if it is (but up to you). Jetstream runs Chrome OS (Core), I'll add it to the description https://codereview.chromium.org/851503005/diff/1/tools/metrics/histograms/his... tools/metrics/histograms/histograms.xml:24647: + Distribution of time spent in the states of the thermal zone of the CPU. On 2015/01/21 22:32:56, vadimb1 wrote: > On 2015/01/21 19:19:33, mka wrote: > > On 2015/01/21 18:48:39, vadimb1 wrote: > > > 1. Same question about how often this is logged. > > > > every second > > > > > 2. What are the states? Since it's not enum it will be an integer, how to > > > interpret that integer? > > > > I left it vague on purpose as the interpretation of the values might vary > across > > different devices. What is true for all devices is that a higher state > > corresponds to a higher temperature > > > > > 3. So what is logged here? Time or state? E.g., say I have the histogram: > > > 0-1: 50 > > > 1-2: 60 > > > How one would understand that? 50ms spent in state 0? Or 50s? Or we've seen > > the > > > system 50 times in state 0? > > > > the state is logged, so it would be 50s spent in state 0 > > > > > Same for the other states. > > > > Well, make it clearer from the summary. It's very similar to the previous > histograms group but the summary reads differently. I would say something like > "State of the termal zone, collected every second. State depends on device, > higher state corresponds to a higher temperature" or "Time spent in the states > of termal zone, one count collected every second...". ok > Also you probably could use units="State" here. Then the histograms table will > be > State Quantity PDF CFD > while without units it will be > Range Quantity PDF CFD I'll update the histograms to use units
lgtm
https://codereview.chromium.org/851503005/diff/20001/tools/metrics/histograms... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/851503005/diff/20001/tools/metrics/histograms... tools/metrics/histograms/histograms.xml:24634: +<histogram name="Platform.Thermal.Temperature.Wifi0" units="Celsius"> nit: You use ".0" for CPU readings, but just "0" for Wifi. I suggest keeping those consistent. https://codereview.chromium.org/851503005/diff/20001/tools/metrics/histograms... tools/metrics/histograms/histograms.xml:24657: + Temperature reading at wireless interface 2 collected every few seconds (may What defines wireless interface "1" vs. "2", etc.? It might be clearer to use more semantically contentful names. https://codereview.chromium.org/851503005/diff/20001/tools/metrics/histograms... tools/metrics/histograms/histograms.xml:24664: +<histogram name="Platform.Thermal.Zone.Cpu.States" units="Thermal state"> Would it make sense for the state to be an enum? You could use a sparse histogram and hash a string, for example, if you wanted to be able to assign flexible labels to the enum.
https://codereview.chromium.org/851503005/diff/20001/tools/metrics/histograms... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/851503005/diff/20001/tools/metrics/histograms... tools/metrics/histograms/histograms.xml:24634: +<histogram name="Platform.Thermal.Temperature.Wifi0" units="Celsius"> On 2015/01/22 02:53:47, Ilya Sherman wrote: > nit: You use ".0" for CPU readings, but just "0" for Wifi. I suggest keeping > those consistent. In the case of the Wifi interface the 0 is the index of the interface, for the CPU it's the sensor number. Said that, for now a single sensor is used and I can't predict if that will ever change. Putting the index is an attempt of being prepared for more than one sensor. I'm not sure this is actually the right thing to do or if we should rather have a temperature Cpu and adapt the histograms when necessary, possibly deprecating the one without indexes. https://codereview.chromium.org/851503005/diff/20001/tools/metrics/histograms... tools/metrics/histograms/histograms.xml:24657: + Temperature reading at wireless interface 2 collected every few seconds (may On 2015/01/22 02:53:47, Ilya Sherman wrote: > What defines wireless interface "1" vs. "2", etc.? It might be clearer to use > more semantically contentful names. It's the interface number/name assigned by the OS. In our team we discussed more semantic schemes but didn't encounter a consistent and future proof one. Also for the engineers who will most often look at these histograms the hardware/system related names are probably more meaningful. https://codereview.chromium.org/851503005/diff/20001/tools/metrics/histograms... tools/metrics/histograms/histograms.xml:24664: +<histogram name="Platform.Thermal.Zone.Cpu.States" units="Thermal state"> On 2015/01/22 02:53:47, Ilya Sherman wrote: > Would it make sense for the state to be an enum? You could use a sparse > histogram and hash a string, for example, if you wanted to be able to assign > flexible labels to the enum. I considered initially to use labels like "warm", "hot", "critically hot", ... but discarded the idea because the number of states would be limited by the number of reasonable classifications of the thermal states in English language.
Okay, LGTM. Thanks for the clarifications.
The CQ bit was checked by mka@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/851503005/20001
Message was sent while issue was closed.
Committed patchset #2 (id:20001)
Message was sent while issue was closed.
Patchset 2 (id:??) landed as https://crrev.com/7fbd516e6641f9f02446cb2ae956a6454e81d0c9 Cr-Commit-Position: refs/heads/master@{#313133} |