|
|
Created:
7 years, 2 months ago by oystein (OOO til 10th of July) Modified:
7 years, 1 month ago CC:
chromium-reviews, Ilya Sherman, jar (doing other things), asvitkine+watch_chromium.org Base URL:
svn://svn.chromium.org/chrome/trunk/src Visibility:
Public. |
DescriptionPerformanceMonitor: Add a new UMA histograms to track average CPU utilization of the browser process
R=asvitkine@chromium.org, yoz@chromium.org
BUG=306713
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=230802
Patch Set 1 #
Total comments: 11
Patch Set 2 : Expanded docs #
Total comments: 6
Patch Set 3 : Increased histogram range #Patch Set 4 : Platform-independent CPU metrics #Patch Set 5 : Rebase #
Total comments: 4
Patch Set 6 : Added units tag to histogram xml #
Total comments: 4
Patch Set 7 : Docs #
Total comments: 2
Patch Set 8 : Comment update #
Messages
Total messages: 24 (0 generated)
LGTM https://codereview.chromium.org/29873002/diff/1/tools/metrics/histograms/hist... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/29873002/diff/1/tools/metrics/histograms/hist... tools/metrics/histograms/histograms.xml:11287: + Average CPU utilization of the browser process, read out each two-minute s/out/at/?
https://codereview.chromium.org/29873002/diff/1/chrome/browser/performance_mo... File chrome/browser/performance_monitor/process_metrics_history.cc (right): https://codereview.chromium.org/29873002/diff/1/chrome/browser/performance_mo... chrome/browser/performance_monitor/process_metrics_history.cc:82: 0, 1200, 20); Can you explain why you're using these parameters? What's 1200? https://codereview.chromium.org/29873002/diff/1/tools/metrics/histograms/hist... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/29873002/diff/1/tools/metrics/histograms/hist... tools/metrics/histograms/histograms.xml:11288: + interval. What are the units? Is it percentage? (if so, add units=%" above). Also, if it is %, please clarify what 100% means (e.g. on a dual cpu is 50% using one core fully or will that report as 100%?)
https://codereview.chromium.org/29873002/diff/1/tools/metrics/histograms/hist... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/29873002/diff/1/tools/metrics/histograms/hist... tools/metrics/histograms/histograms.xml:11288: + interval. On 2013/10/18 21:39:12, Alexei Svitkine wrote: > What are the units? Is it percentage? (if so, add units=%" above). Also, if it > is %, please clarify what 100% means (e.g. on a dual cpu is 50% using one core > fully or will that report as 100%?) I believe it's different on different OSes, so that needs to be documented here too.
https://codereview.chromium.org/29873002/diff/1/chrome/browser/performance_mo... File chrome/browser/performance_monitor/process_metrics_history.cc (right): https://codereview.chromium.org/29873002/diff/1/chrome/browser/performance_mo... chrome/browser/performance_monitor/process_metrics_history.cc:82: 0, 1200, 20); On 2013/10/18 21:39:12, Alexei Svitkine wrote: > Can you explain why you're using these parameters? What's 1200? 1200 is the maximum value we'd see on a 12-core non-Windows system. It's reasonably arbitrary, if anyone has any better suggestions. https://codereview.chromium.org/29873002/diff/1/tools/metrics/histograms/hist... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/29873002/diff/1/tools/metrics/histograms/hist... tools/metrics/histograms/histograms.xml:11287: + Average CPU utilization of the browser process, read out each two-minute On 2013/10/18 21:13:48, Yoyo Zhou wrote: > s/out/at/? Done. https://codereview.chromium.org/29873002/diff/1/tools/metrics/histograms/hist... tools/metrics/histograms/histograms.xml:11288: + interval. On 2013/10/18 21:49:31, Yoyo Zhou wrote: > On 2013/10/18 21:39:12, Alexei Svitkine wrote: > > What are the units? Is it percentage? (if so, add units=%" above). Also, if it > > is %, please clarify what 100% means (e.g. on a dual cpu is 50% using one core > > fully or will that report as 100%?) > > I believe it's different on different OSes, so that needs to be documented here > too. Done. https://codereview.chromium.org/29873002/diff/1/tools/metrics/histograms/hist... tools/metrics/histograms/histograms.xml:11288: + interval. On 2013/10/18 21:39:12, Alexei Svitkine wrote: > What are the units? Is it percentage? (if so, add units=%" above). Also, if it > is %, please clarify what 100% means (e.g. on a dual cpu is 50% using one core > fully or will that report as 100%?) Done.
https://codereview.chromium.org/29873002/diff/1/chrome/browser/performance_mo... File chrome/browser/performance_monitor/process_metrics_history.cc (right): https://codereview.chromium.org/29873002/diff/1/chrome/browser/performance_mo... chrome/browser/performance_monitor/process_metrics_history.cc:82: 0, 1200, 20); On 2013/10/21 14:38:45, Oystein wrote: > On 2013/10/18 21:39:12, Alexei Svitkine wrote: > > Can you explain why you're using these parameters? What's 1200? > > 1200 is the maximum value we'd see on a 12-core non-Windows system. It's > reasonably arbitrary, if anyone has any better suggestions. I'd suggest picking a value that's more future-proof (the current maximum of today is not the maximum of tomorrow). Alternatively, scale non-Windows values to 100% as the max and use UMA_HISTOGRAM_PERCENTAGE() (which will also give you distinct buckets for all values rather than exponential bucket ranges like UMA_HISTOGRAM_CUSTOM_COUNTS will produce). https://codereview.chromium.org/29873002/diff/70001/tools/metrics/histograms/... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/29873002/diff/70001/tools/metrics/histograms/... tools/metrics/histograms/histograms.xml:11325: + read as 400%. Can we perhaps transform the value such that its in the same range for all platforms? i.e. either multiply the value by the number of cores/cpus on Windows or divide it on other platforms?
https://codereview.chromium.org/29873002/diff/1/chrome/browser/performance_mo... File chrome/browser/performance_monitor/process_metrics_history.cc (right): https://codereview.chromium.org/29873002/diff/1/chrome/browser/performance_mo... chrome/browser/performance_monitor/process_metrics_history.cc:82: 0, 1200, 20); On 2013/10/21 14:53:46, Alexei Svitkine wrote: > On 2013/10/21 14:38:45, Oystein wrote: > > On 2013/10/18 21:39:12, Alexei Svitkine wrote: > > > Can you explain why you're using these parameters? What's 1200? > > > > 1200 is the maximum value we'd see on a 12-core non-Windows system. It's > > reasonably arbitrary, if anyone has any better suggestions. > > I'd suggest picking a value that's more future-proof (the current maximum of > today is not the maximum of tomorrow). > > Alternatively, scale non-Windows values to 100% as the max and use > UMA_HISTOGRAM_PERCENTAGE() (which will also give you distinct buckets for all > values rather than exponential bucket ranges like UMA_HISTOGRAM_CUSTOM_COUNTS > will produce). Sure, I'll bounce it to 6400 or something. https://codereview.chromium.org/29873002/diff/70001/tools/metrics/histograms/... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/29873002/diff/70001/tools/metrics/histograms/... tools/metrics/histograms/histograms.xml:11325: + read as 400%. On 2013/10/21 14:53:46, Alexei Svitkine wrote: > Can we perhaps transform the value such that its in the same range for all > platforms? > > i.e. either multiply the value by the number of cores/cpus on Windows or divide > it on other platforms? The plan is to eventually de-normalize the Windows one yeah (rather than the other way around, to avoid losing information for the non-Windows platforms). Since that involves extending some of the base libraries I didn't really want to bundle those changes with this, but I'm happy to do it if you think it's important to keep the values consistent between platforms.
https://codereview.chromium.org/29873002/diff/1/chrome/browser/performance_mo... File chrome/browser/performance_monitor/process_metrics_history.cc (right): https://codereview.chromium.org/29873002/diff/1/chrome/browser/performance_mo... chrome/browser/performance_monitor/process_metrics_history.cc:82: 0, 1200, 20); On 2013/10/21 15:15:05, Oystein wrote: > On 2013/10/21 14:53:46, Alexei Svitkine wrote: > > On 2013/10/21 14:38:45, Oystein wrote: > > > On 2013/10/18 21:39:12, Alexei Svitkine wrote: > > > > Can you explain why you're using these parameters? What's 1200? > > > > > > 1200 is the maximum value we'd see on a 12-core non-Windows system. It's > > > reasonably arbitrary, if anyone has any better suggestions. > > > > I'd suggest picking a value that's more future-proof (the current maximum of > > today is not the maximum of tomorrow). > > > > Alternatively, scale non-Windows values to 100% as the max and use > > UMA_HISTOGRAM_PERCENTAGE() (which will also give you distinct buckets for all > > values rather than exponential bucket ranges like UMA_HISTOGRAM_CUSTOM_COUNTS > > will produce). > > Sure, I'll bounce it to 6400 or something. All right. And you're okay with the exponential bucket ranges provided by this macro? https://codereview.chromium.org/29873002/diff/70001/tools/metrics/histograms/... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/29873002/diff/70001/tools/metrics/histograms/... tools/metrics/histograms/histograms.xml:11325: + read as 400%. On 2013/10/21 15:15:06, Oystein wrote: > On 2013/10/21 14:53:46, Alexei Svitkine wrote: > > Can we perhaps transform the value such that its in the same range for all > > platforms? > > > > i.e. either multiply the value by the number of cores/cpus on Windows or > divide > > it on other platforms? > > The plan is to eventually de-normalize the Windows one yeah (rather than the > other way around, to avoid losing information for the non-Windows platforms). > Since that involves extending some of the base libraries I didn't really want to > bundle those changes with this, but I'm happy to do it if you think it's > important to keep the values consistent between platforms. I think it's useful to keep it consistent from the start. Otherwise, when you do make that change, you'll need to add a new histogram (obsoleting this one) that has the new semantic.
https://codereview.chromium.org/29873002/diff/70001/tools/metrics/histograms/... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/29873002/diff/70001/tools/metrics/histograms/... tools/metrics/histograms/histograms.xml:11325: + read as 400%. On 2013/10/21 15:27:31, Alexei Svitkine wrote: > On 2013/10/21 15:15:06, Oystein wrote: > > On 2013/10/21 14:53:46, Alexei Svitkine wrote: > > > Can we perhaps transform the value such that its in the same range for all > > > platforms? > > > > > > i.e. either multiply the value by the number of cores/cpus on Windows or > > divide > > > it on other platforms? > > > > The plan is to eventually de-normalize the Windows one yeah (rather than the > > other way around, to avoid losing information for the non-Windows platforms). > > Since that involves extending some of the base libraries I didn't really want > to > > bundle those changes with this, but I'm happy to do it if you think it's > > important to keep the values consistent between platforms. > > I think it's useful to keep it consistent from the start. Otherwise, when you do > make that change, you'll need to add a new histogram (obsoleting this one) that > has the new semantic. Oh, really? Even if the ranges and buckets stay the same? I was assuming the change would effectively invalidate the historical Windows data.
https://codereview.chromium.org/29873002/diff/70001/tools/metrics/histograms/... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/29873002/diff/70001/tools/metrics/histograms/... tools/metrics/histograms/histograms.xml:11325: + read as 400%. On 2013/10/21 15:33:12, Oystein wrote: > On 2013/10/21 15:27:31, Alexei Svitkine wrote: > > On 2013/10/21 15:15:06, Oystein wrote: > > > On 2013/10/21 14:53:46, Alexei Svitkine wrote: > > > > Can we perhaps transform the value such that its in the same range for all > > > > platforms? > > > > > > > > i.e. either multiply the value by the number of cores/cpus on Windows or > > > divide > > > > it on other platforms? > > > > > > The plan is to eventually de-normalize the Windows one yeah (rather than the > > > other way around, to avoid losing information for the non-Windows > platforms). > > > Since that involves extending some of the base libraries I didn't really > want > > to > > > bundle those changes with this, but I'm happy to do it if you think it's > > > important to keep the values consistent between platforms. > > > > I think it's useful to keep it consistent from the start. Otherwise, when you > do > > make that change, you'll need to add a new histogram (obsoleting this one) > that > > has the new semantic. > > Oh, really? Even if the ranges and buckets stay the same? I was assuming the > change would effectively invalidate the historical Windows data. Right, the problem is it's not really possible to invalidate the old data (i.e. in the system). For example, if someone were to aggregate the histogram values (i.e. select the "Everything" version), then you'd get the old values aggregated with the new ones, leading to incorrect results. For this reason, we suggest any semantic change to a histogram should rename it (i.e. create a new one and obsolete the old one in histograms.xml). This is why it's better to make the other change to standardize the values first.
https://codereview.chromium.org/29873002/diff/70001/tools/metrics/histograms/... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/29873002/diff/70001/tools/metrics/histograms/... tools/metrics/histograms/histograms.xml:11325: + read as 400%. On 2013/10/21 16:31:07, Alexei Svitkine wrote: > On 2013/10/21 15:33:12, Oystein wrote: > > On 2013/10/21 15:27:31, Alexei Svitkine wrote: > > > On 2013/10/21 15:15:06, Oystein wrote: > > > > On 2013/10/21 14:53:46, Alexei Svitkine wrote: > > > > > Can we perhaps transform the value such that its in the same range for > all > > > > > platforms? > > > > > > > > > > i.e. either multiply the value by the number of cores/cpus on Windows or > > > > divide > > > > > it on other platforms? > > > > > > > > The plan is to eventually de-normalize the Windows one yeah (rather than > the > > > > other way around, to avoid losing information for the non-Windows > > platforms). > > > > Since that involves extending some of the base libraries I didn't really > > want > > > to > > > > bundle those changes with this, but I'm happy to do it if you think it's > > > > important to keep the values consistent between platforms. > > > > > > I think it's useful to keep it consistent from the start. Otherwise, when > you > > do > > > make that change, you'll need to add a new histogram (obsoleting this one) > > that > > > has the new semantic. > > > > Oh, really? Even if the ranges and buckets stay the same? I was assuming the > > change would effectively invalidate the historical Windows data. > > Right, the problem is it's not really possible to invalidate the old data (i.e. > in the system). For example, if someone were to aggregate the histogram values > (i.e. select the "Everything" version), then you'd get the old values aggregated > with the new ones, leading to incorrect results. For this reason, we suggest any > semantic change to a histogram should rename it (i.e. create a new one and > obsolete the old one in histograms.xml). > > This is why it's better to make the other change to standardize the values > first. Sure thing. The main reason I haven't already done this is lack of access to a Windows machine (won't have that for a few weeks), but I'll figure something out :).
jar, brettw: Following up to my email from yesterday, I added a new function to process_metrics to return a platform-independent value. I didn't modify the existing GetCPUUsage() as a look through the Chrome Task Manager seems to indicate that the authors specifically wanted to have consistent behavior with the equivalent OS tools (Activity Monitor for OS X, Task Manager for Win, etc).
I'm fine with the patch... but please consider the comment below. If you're good to go, I'll approve. https://codereview.chromium.org/29873002/diff/240001/chrome/browser/performan... File chrome/browser/performance_monitor/process_metrics_history.cc (right): https://codereview.chromium.org/29873002/diff/240001/chrome/browser/performan... chrome/browser/performance_monitor/process_metrics_history.cc:82: UMA_HISTOGRAM_CUSTOM_COUNTS("PerformanceMonitor.AverageCPU.BrowserProcess", This may be interesting to look at.... but it is hard to get a handle on "average" or "typical" CPU usage. The hassle is that you could avoid measuring this when the user was "not busy," but that would be biased, or you could do it "intermittently," but that gets lots of samples by users that "run for a long time," even if they don't do "that much work" in the browser. We'll see what this histogram shows.... but I suspect you'll have a hard time interpreting the data meaningfully. :-/ We have a similar problem trying ot look at (for example) memory usage. The issue of when to sample is the big confusing gotcha. https://codereview.chromium.org/29873002/diff/240001/tools/metrics/histograms... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/29873002/diff/240001/tools/metrics/histograms... tools/metrics/histograms/histograms.xml:11339: +<histogram name="PerformanceMonitor.AverageCPU.BrowserProcess"> nit; add a units tag, which is probably "PercentCpuUsage"
https://codereview.chromium.org/29873002/diff/240001/chrome/browser/performan... File chrome/browser/performance_monitor/process_metrics_history.cc (right): https://codereview.chromium.org/29873002/diff/240001/chrome/browser/performan... chrome/browser/performance_monitor/process_metrics_history.cc:82: UMA_HISTOGRAM_CUSTOM_COUNTS("PerformanceMonitor.AverageCPU.BrowserProcess", On 2013/10/22 22:30:03, jar wrote: > This may be interesting to look at.... but it is hard to get a handle on > "average" or "typical" CPU usage. The hassle is that you could avoid measuring > this when the user was "not busy," but that would be biased, or you could do it > "intermittently," but that gets lots of samples by users that "run for a long > time," even if they don't do "that much work" in the browser. > > We'll see what this histogram shows.... but I suspect you'll have a hard time > interpreting the data meaningfully. :-/ > > We have a similar problem trying ot look at (for example) memory usage. The > issue of when to sample is the big confusing gotcha. Definitely some good points, and to be honest I'm not quite sure what we'll find. I think what I most hope to gain from this is as a form of a canary; if there's many instances where the browser process uses a large amount of CPU over time (this is averages over every two minutes) we can be reasonably sure there's something wrong, so at least this might tell us the *scope* of the problem and hence how much effort to invest into further investigation. https://codereview.chromium.org/29873002/diff/240001/tools/metrics/histograms... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/29873002/diff/240001/tools/metrics/histograms... tools/metrics/histograms/histograms.xml:11339: +<histogram name="PerformanceMonitor.AverageCPU.BrowserProcess"> On 2013/10/22 22:30:03, jar wrote: > nit; add a units tag, which is probably "PercentCpuUsage" Done.
https://codereview.chromium.org/29873002/diff/400001/base/process/process_met... File base/process/process_metrics.h (right): https://codereview.chromium.org/29873002/diff/400001/base/process/process_met... base/process/process_metrics.h:172: // including Windows. Mention what the consistent values actually are.
https://codereview.chromium.org/29873002/diff/400001/base/process/process_met... File base/process/process_metrics.h (right): https://codereview.chromium.org/29873002/diff/400001/base/process/process_met... base/process/process_metrics.h:172: // including Windows. On 2013/10/24 14:52:00, Alexei Svitkine wrote: > Mention what the consistent values actually are. Actually meant to write "values" rather than "returns" (fixed now), but otherwise do you mean basically repeating the comment for GetCPUUsage() above?
https://codereview.chromium.org/29873002/diff/400001/base/process/process_met... File base/process/process_metrics.h (right): https://codereview.chromium.org/29873002/diff/400001/base/process/process_met... base/process/process_metrics.h:172: // including Windows. On 2013/10/24 15:01:06, Oystein wrote: > On 2013/10/24 14:52:00, Alexei Svitkine wrote: > > Mention what the consistent values actually are. > > Actually meant to write "values" rather than "returns" (fixed now), but > otherwise do you mean basically repeating the comment for GetCPUUsage() above? Not necessarily repeating, you can refer to it - just make it clear that it makes Windows consistent with other platforms and not vice versa, but being super explicit is even better (e.g. specify that 1 core maxed is 100% and 2 cores maxed is 200%, etc.) Also, now that I look at the comment above, it says "Returns the CPU usage in percent since the last time this method was called." Probably now it should say "since the last GetCPUUsage() or GetPlatformIndepententCPUUsage() call".
https://codereview.chromium.org/29873002/diff/400001/base/process/process_met... File base/process/process_metrics.h (right): https://codereview.chromium.org/29873002/diff/400001/base/process/process_met... base/process/process_metrics.h:172: // including Windows. On 2013/10/24 15:13:33, Alexei Svitkine wrote: > On 2013/10/24 15:01:06, Oystein wrote: > > On 2013/10/24 14:52:00, Alexei Svitkine wrote: > > > Mention what the consistent values actually are. > > > > Actually meant to write "values" rather than "returns" (fixed now), but > > otherwise do you mean basically repeating the comment for GetCPUUsage() above? > > Not necessarily repeating, you can refer to it - just make it clear that it > makes Windows consistent with other platforms and not vice versa, but being > super explicit is even better (e.g. specify that 1 core maxed is 100% and 2 > cores maxed is 200%, etc.) > > Also, now that I look at the comment above, it says "Returns the CPU usage in > percent since the last time this method was called." Probably now it should say > "since the last GetCPUUsage() or GetPlatformIndepententCPUUsage() call". Good point; done.
https://codereview.chromium.org/29873002/diff/490004/tools/metrics/histograms... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/29873002/diff/490004/tools/metrics/histograms... tools/metrics/histograms/histograms.xml:11370: + interval. On Windows this is a range from 0-100% of the total CPU capacity, Update comment now that its consistent between platforms.
https://codereview.chromium.org/29873002/diff/490004/tools/metrics/histograms... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/29873002/diff/490004/tools/metrics/histograms... tools/metrics/histograms/histograms.xml:11370: + interval. On Windows this is a range from 0-100% of the total CPU capacity, On 2013/10/24 15:48:12, Alexei Svitkine wrote: > Update comment now that its consistent between platforms. Done.
LGTM
LGTM, thanks!
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/oysteine@chromium.org/29873002/590002
Message was sent while issue was closed.
Change committed as 230802 |