|
|
Created:
7 years, 11 months ago by DaleCurtis Modified:
7 years, 11 months ago CC:
chromium-reviews, joi+watch-content_chromium.org, darin-cc_chromium.org, jam, feature-media-reviews_chromium.org, Ami GONE FROM CHROMIUM Base URL:
svn://svn.chromium.org/chrome/trunk/src Visibility:
Public. |
DescriptionTrack UMA stats for when the renderer side audio device wasn't ready.
A renderer that's not ready causes glitching, we should measure what
percentage of users are having issues to determine how big of a problem
this is. To start, we'll record the percentage of clients on each
platform who see more than 3 instances of an unprepared renderer.
BUG=none
TEST=<in progress> Run locally on each platform to ensure common case
is zero.
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=179011
Patch Set 1 #
Total comments: 6
Patch Set 2 : Comments. #
Total comments: 3
Patch Set 3 : Cleanup! #
Messages
Total messages: 18 (0 generated)
Chris and I discussed this earlier today and thought it'd be good to have some metrics on the percentage of users who are likely seeing recurring glitches. WDYT?
Thanks Dale, hopefully this will give us some good stats... some bikeshedding comments: https://codereview.chromium.org/11975031/diff/1/content/browser/renderer_host... File content/browser/renderer_host/media/audio_sync_reader.cc (right): https://codereview.chromium.org/11975031/diff/1/content/browser/renderer_host... content/browser/renderer_host/media/audio_sync_reader.cc:47: bool waited_too_many_times = renderer_wasnt_ready_ > 3; nit: should define a constant for this, something like kDeadlineMissedThreshold https://codereview.chromium.org/11975031/diff/1/content/browser/renderer_host... content/browser/renderer_host/media/audio_sync_reader.cc:51: "Media.AudioRendererWasNotReadyLinux", waited_too_many_times); how about "WasNotReady" -> "MissedDeadline"? https://codereview.chromium.org/11975031/diff/1/content/browser/renderer_host... content/browser/renderer_host/media/audio_sync_reader.cc:58: "Media.AudioRendererWasNotReadyWinWASAPI", waited_too_many_times); can we shorten "WinWASAPI" to just "WASAPI"? https://codereview.chromium.org/11975031/diff/1/content/browser/renderer_host... content/browser/renderer_host/media/audio_sync_reader.cc:61: "Media.AudioRendererWasNotReadyWinWaveOut", waited_too_many_times); can we shorten "WinWaveOut" to just "WaveOut"? https://codereview.chromium.org/11975031/diff/1/content/browser/renderer_host... File content/browser/renderer_host/media/audio_sync_reader.h (right): https://codereview.chromium.org/11975031/diff/1/content/browser/renderer_host... content/browser/renderer_host/media/audio_sync_reader.h:70: // Track the number of times the renderer wasn't ready and report a UMA stat nit: "wasn't ready" -> "missed its real-time deadline" https://codereview.chromium.org/11975031/diff/1/content/browser/renderer_host... content/browser/renderer_host/media/audio_sync_reader.h:72: size_t renderer_wasnt_ready_; nit: not too crazy about the name. How about missed_count_?
I was thinking more about this, and think it would be really useful to have one additional metric with is the "fraction" of deadline misses during the lifetime of the SyncReader. For example if AudioSyncReader::Read() is called 1000 times (in say a ten second period) and "misses" for 100 of them, then the miss fraction would be 0.1, as opposed to missing 3 times which would be a fraction of 0.003 Maybe it's even better to track the number of Read() calls separately from the # of miss calls, and we can calculate the fraction ourselves and get a bigger picture. I still think your basic metrics are good too.
Seems like Chris has added a comment on each new line already and I don't have much else to add. +1 on the fraction-proposal from Chris. Also, please ping me when this land with some links so I can check this stat from time to time as well. LGTM
Great idea to add the new stats BTW. Thanks for doing it.
do we have an idea on how many times we miss vs not-miss? do we think it'd be useful to also record a distribution of the ratio of missed-to-non-missed, where 0.0=never missed and 1.0=always missed?
I think we should decide on one set of UMA stats for the first attempt. Fractional buckets are possible, but the utility may be questionable. See the following for an example of HISTOGRAM_PERCENTAGE(): https://uma.googleplex.com/histograms/?dayCount=1&version=Everything&group=Sy... I think that's a bit too fine grained and we should do something like 10% buckets if we go this route. Otherwise we're going to end up with 5 * 100 row tables on our UMA page... In discussions with fischman though, he wasn't sure having the fractional buckets would provide us anything meaningful and instead recommended recording the reasons the deadline was missed. I.e. hardware called back too fast, renderer was never ready, etc.
On 2013/01/17 22:43:59, DaleCurtis wrote: > I think we should decide on one set of UMA stats for the first attempt. > Fractional buckets are possible, but the utility may be questionable. See the > following for an example of HISTOGRAM_PERCENTAGE(): > > https://uma.googleplex.com/histograms/?dayCount=1&version=Everything&group=Sy... > > I think that's a bit too fine grained and we should do something like 10% > buckets if we go this route. Otherwise we're going to end up with 5 * 100 row > tables on our UMA page... Sure, I agree that histograms would probably be overkill. That's not really what I had in mind. I was just thinking we could record two values, both integers: 1) the number of times Read() is called during the lifetime of the SyncReader 2) the number of times the Read() call missed the deadline That would give us two valuable pieces of information, the average duration that the SyncReader is active, and also an idea of the average flakiness of the stream. It really does make a difference if you hear a glitch during a two second interval vs. an hour. One glitch per hour is not nearly so bad in terms of QOS. Making an analogy with FPS, if you're rendering steadily at 60fps, and have one frame drop out in a whole hour, it's not as bad as dropping every other frame.
I don't think we can record two separate integer UMA stats and correlate them in any way. My understanding is that each may end up sampled slightly differently making a comparison unsound. Is that you were suggesting?
On 2013/01/17 23:45:58, DaleCurtis wrote: > I don't think we can record two separate integer UMA stats and correlate them in > any way. My understanding is that each may end up sampled slightly differently > making a comparison unsound. Is that you were suggesting? Ok, then how about a Read() count and a floating-point frac of miss/total? They would each be internally consistent. There has to be a way to record more detailed information than simply if a stream missed >3.
IIUC, we can't correlate between any two variables recorded separately. (cc: fischman). I think the count alone isn't particularly interesting, the fractional portion may be useful in a relative and/or holistic sense of audio pipeline health. What we hope to see is a 1/(x>0) graph.
On 2013/01/18 00:09:56, DaleCurtis wrote: > IIUC, we can't correlate between any two variables recorded separately. (cc: > fischman). I think the count alone isn't particularly interesting, the > fractional portion may be useful in a relative and/or holistic sense of audio > pipeline health. What we hope to see is a 1/(x>0) graph. I think it could be useful - probably good also to only record such fractional values for streams that have been running for a minimum amount of time (like > 1sec). It seems like we should avoid any of the cases where the stream is started/stopped abruptly. In other words, I'm more interested in getting stats on the steady-state performance for that metric, and not the "startup" behavior, although that too could be interesting as a separate bit of info.
Latest patch set adds fractional missed/total as requested.
https://codereview.chromium.org/11975031/diff/14001/content/browser/renderer_... File content/browser/renderer_host/media/audio_sync_reader.cc (right): https://codereview.chromium.org/11975031/diff/14001/content/browser/renderer_... content/browser/renderer_host/media/audio_sync_reader.cc:52: // 1 = 10%, 2 = 20%, ..., 10 = 100%. OOC why not UMA_HISTOGRAM_PERCENTAGE()? If there's a reason perhaps the comment be better served by explaining why we're not using it https://codereview.chromium.org/11975031/diff/14001/content/browser/renderer_... content/browser/renderer_host/media/audio_sync_reader.cc:64: "Media.AudioRendererMissedDeadlineOSX", percentage_missed, kMissedMax); do we need to have 5 different buckets? UMA already has the capability to display different histograms based on the OS at a finer grained level than what you have here the only tricky one is WASAPI vs WaveOut, but CoreAudioUtil::IsSupported() is essentially an OS check for Vista+ and I'd be willing to wager that based on crash reports the # of people that fail the DLL load check is small enough to not make a difference
https://codereview.chromium.org/11975031/diff/14001/content/browser/renderer_... File content/browser/renderer_host/media/audio_sync_reader.cc (right): https://codereview.chromium.org/11975031/diff/14001/content/browser/renderer_... content/browser/renderer_host/media/audio_sync_reader.cc:64: "Media.AudioRendererMissedDeadlineOSX", percentage_missed, kMissedMax); On 2013/01/24 23:34:39, scherkus wrote: > do we need to have 5 different buckets? UMA already has the capability to > display different histograms based on the OS at a finer grained level than what > you have here > > the only tricky one is WASAPI vs WaveOut, but CoreAudioUtil::IsSupported() is > essentially an OS check for Vista+ and I'd be willing to wager that based on > crash reports the # of people that fail the DLL load check is small enough to > not make a difference Great point. I totally forgot about the OS based filtering. A single UMA_HISTOGRAM_PERCENTAGE() is sufficient then. I was just trying to avoid 5x 100 line UMA stat entries.
lgtm lgtm!
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/dalecurtis@chromium.org/11975031/21002
Message was sent while issue was closed.
Change committed as 179011 |