Created: 6 years, 11 months ago by Yufeng Shen (Slow to review)
Modified: 6 years, 10 months ago
CC: chromium-reviews, chrome-speed-team+watch_google.com, telemetry+watch_chromium.org, Rick Byers
Base URL: http://git.chromium.org/chromium/src.git@master
Visibility: Public
Description: Collecting LatencyInfo in telemetry
This CL collects input event LatencyInfo from trace buffer and
generates input latency metrics:
mouse_wheel_latency:
From when the mouse wheel event reaches the RenderWidgetHost (RWH) to when the buffer is swapped.
touch_scroll_latency:
From when the touch event is generated to when the buffer is swapped due to the gesture scroll generated from the touch event.
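As an illustrative sketch (the helper name and data shapes here are hypothetical, not part of the CL), each of these metrics boils down to differencing a pair of trace timestamps per input event:

```python
# Hypothetical helper: each latency sample is the time from the input
# event's starting timestamp to the corresponding buffer-swap timestamp.
def ComputeLatencyMs(samples):
  """samples: list of (input_time_ms, swap_time_ms) pairs from trace data."""
  return [swap_ms - input_ms for input_ms, swap_ms in samples]

# e.g. two mouse-wheel events whose buffer swaps landed ~16 ms later:
print(ComputeLatencyMs([(10.0, 26.5), (42.0, 58.0)]))  # [16.5, 16.0]
```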
BUG=246034
TEST=telemetry smoothness test on top_25.json works on ChromeOS & Android.
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=249131
Patch Set 1 #
Total comments: 9
Patch Set 2 : adding unittests. Search browser main thread through name instead of exposing browser backend pid. #
Patch Set 3 : unittest added #
Total comments: 3
Patch Set 4 : use 95th percentile for input latency metric #
Total comments: 12
Patch Set 5 : refactor to make code easy to read #
Patch Set 6 : correct some formatting #
Patch Set 7 : rebase #
Patch Set 8 : fix presubmit error "unused variable" #
Messages
Total messages: 26 (0 generated)
On 2014/01/10 00:17:40, Yufeng Shen wrote: Please also test this with the reference build on Linux. I assume you won't have latency info there, but you want to make sure that the smoothness benchmark still works with an older browser version. The reference builds are in chrome/tools/test/reference_build/. You need to point telemetry to the executable with the --browser=exact --browser-executable=<path-to-your-chrome-reference-build-executable> command line options.
On 2014/01/10 19:03:05, Manfred Ernst wrote: > On 2014/01/10 00:17:40, Yufeng Shen wrote: > > Please also test this with the reference build on Linux. I assume you won't have > latency info there, but you want to make sure that the smoothness benchmark > still works with an older browser version. > > The reference builds are in chrome/tools/test/reference_build/. You need to > point telemetry to the executable with the --browser=exact > --browser-executable=<path-to-your-chrome-reference-build-executable> command > line options. So I ran the telemetry smoothness test against 3 reference builds linux dev 33.0.1750.22 linux beta 32.0.1700.72 linux stable 31.0.1650.63 M33 & M32 work fine. For M31, with or without my CL the smoothness test fails on some of the pages anyway.
On 2014/01/10 23:12:17, Yufeng Shen wrote: > On 2014/01/10 19:03:05, Manfred Ernst wrote: > > On 2014/01/10 00:17:40, Yufeng Shen wrote: > > > > Please also test this with the reference build on Linux. I assume you won't > have > > latency info there, but you want to make sure that the smoothness benchmark > > still works with an older browser version. > > > > The reference builds are in chrome/tools/test/reference_build/. You need to > > point telemetry to the executable with the --browser=exact > > --browser-executable=<path-to-your-chrome-reference-build-executable> command > > line options. > > So I ran the telemetry smoothness test against 3 reference builds > linux dev 33.0.1750.22 > linux beta 32.0.1700.72 > linux stable 31.0.1650.63 > > M33 & M32 work fine. For M31, with or without my CL the smoothness test fails > on some of the pages anyway. What I meant was the reference build used on the perf bots ( chrome/tools/test/reference_build/chrome_linux/chrome). That one needs to work, otherwise the perf bots will have trouble. The current version is Google Chrome 32.0.1700.19
On 2014/01/10 23:27:21, Manfred Ernst wrote: > On 2014/01/10 23:12:17, Yufeng Shen wrote: > > On 2014/01/10 19:03:05, Manfred Ernst wrote: > > > On 2014/01/10 00:17:40, Yufeng Shen wrote: > > > > > > Please also test this with the reference build on Linux. I assume you won't > > have > > > latency info there, but you want to make sure that the smoothness benchmark > > > still works with an older browser version. > > > > > > The reference builds are in chrome/tools/test/reference_build/. You need to > > > point telemetry to the executable with the --browser=exact > > > --browser-executable=<path-to-your-chrome-reference-build-executable> > command > > > line options. > > > > So I ran the telemetry smoothness test against 3 reference builds > > linux dev 33.0.1750.22 > > linux beta 32.0.1700.72 > > linux stable 31.0.1650.63 > > > > M33 & M32 work fine. For M31, with or without my CL the smoothness test fails > > on some of the pages anyway. > > What I meant was the reference build used on the per bots ( > chrome/tools/test/reference_build/chrome_linux/chrome). That one needs to work, > otherwise the perf bots will have trouble. The current version is Google Chrome > 32.0.1700.19 Thanks Manfred for pointing out this. Reference build 32.0.1700.19 tested with smoothness on top_25.json and it works.
On 2014/01/11 00:02:35, Yufeng Shen wrote: > On 2014/01/10 23:27:21, Manfred Ernst wrote: > > On 2014/01/10 23:12:17, Yufeng Shen wrote: > > > On 2014/01/10 19:03:05, Manfred Ernst wrote: > > > > On 2014/01/10 00:17:40, Yufeng Shen wrote: > > > > > > > > Please also test this with the reference build on Linux. I assume you > won't > > > have > > > > latency info there, but you want to make sure that the smoothness > benchmark > > > > still works with an older browser version. > > > > > > > > The reference builds are in chrome/tools/test/reference_build/. You need > to > > > > point telemetry to the executable with the --browser=exact > > > > --browser-executable=<path-to-your-chrome-reference-build-executable> > > command > > > > line options. > > > > > > So I ran the telemetry smoothness test against 3 reference builds > > > linux dev 33.0.1750.22 > > > linux beta 32.0.1700.72 > > > linux stable 31.0.1650.63 > > > > > > M33 & M32 work fine. For M31, with or without my CL the smoothness test > fails > > > on some of the pages anyway. > > > > What I meant was the reference build used on the per bots ( > > chrome/tools/test/reference_build/chrome_linux/chrome). That one needs to > work, > > otherwise the perf bots will have trouble. The current version is Google > Chrome > > 32.0.1700.19 > > > Thanks Manfred for pointing out this. > > Reference build 32.0.1700.19 tested with smoothness on top_25.json and it works. LGTM. We need an OWNER review from Nat as well for the telemetry part.
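The reference-build invocation described above would look roughly like the following (the benchmark runner's name and the page-set path are approximations for that era of the tree, not taken from this review):

```shell
# Point telemetry at the perf-bot reference build instead of a locally
# built Chrome (paths are illustrative; adjust to your checkout).
tools/perf/run_measurement smoothness tools/perf/page_sets/top_25.json \
    --browser=exact \
    --browser-executable=chrome/tools/test/reference_build/chrome_linux/chrome
```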
this should not get an lgtm without unit tests. I will review in depth later.
https://codereview.chromium.org/132433004/diff/1/tools/perf/metrics/rendering... File tools/perf/metrics/rendering_stats.py (right): https://codereview.chromium.org/132433004/diff/1/tools/perf/metrics/rendering... tools/perf/metrics/rendering_stats.py:44: def initInputLatencyFromTimeline(self, timeline_range): needs unit tests too https://codereview.chromium.org/132433004/diff/1/tools/perf/metrics/smoothnes... File tools/perf/metrics/smoothness.py (right): https://codereview.chromium.org/132433004/diff/1/tools/perf/metrics/smoothnes... tools/perf/metrics/smoothness.py:67: def AddResults(self, tab, results): needs unit tests https://codereview.chromium.org/132433004/diff/1/tools/telemetry/telemetry/co... File tools/telemetry/telemetry/core/browser.py (right): https://codereview.chromium.org/132433004/diff/1/tools/telemetry/telemetry/co... tools/telemetry/telemetry/core/browser.py:88: @property we never expose the browser pid. there are good reasons for this, even though it seems innocent. talk to @epenner about how to get the browser process in telemetry. he went through this exercise recently and probably can advise you on how to do it. https://codereview.chromium.org/132433004/diff/1/tools/telemetry/telemetry/co... File tools/telemetry/telemetry/core/timeline/model.py (right): https://codereview.chromium.org/132433004/diff/1/tools/telemetry/telemetry/co... tools/telemetry/telemetry/core/timeline/model.py:143: def GetProcess(self, pid): why not just model.process[pid]?
https://codereview.chromium.org/132433004/diff/1/tools/perf/metrics/rendering... File tools/perf/metrics/rendering_stats.py (right): https://codereview.chromium.org/132433004/diff/1/tools/perf/metrics/rendering... tools/perf/metrics/rendering_stats.py:44: def initInputLatencyFromTimeline(self, timeline_range): On 2014/01/13 18:49:32, nduca wrote: > needs unit tests too unittests added in tools/perf/metrics/rendering_stats_unittest.py https://codereview.chromium.org/132433004/diff/1/tools/perf/metrics/smoothnes... File tools/perf/metrics/smoothness.py (right): https://codereview.chromium.org/132433004/diff/1/tools/perf/metrics/smoothnes... tools/perf/metrics/smoothness.py:67: def AddResults(self, tab, results): On 2014/01/13 18:49:32, nduca wrote: > needs unit tests I assume the unittest for this is a test that starts a simple scrollable webpage, runs the smoothness measurement on it, and checks that the desired metrics show in the result. Currently the chrome side LatencyInfo plumbing has not totally gone through yet (I am close to landing this CL https://codereview.chromium.org/123563002/) so I don't have a chrome build that can return the desired input latency metric. Either I hold off on this CL until 123563002 is landed or I merge this one first and add the unittests later. https://codereview.chromium.org/132433004/diff/1/tools/telemetry/telemetry/co... File tools/telemetry/telemetry/core/browser.py (right): https://codereview.chromium.org/132433004/diff/1/tools/telemetry/telemetry/co... tools/telemetry/telemetry/core/browser.py:88: @property On 2014/01/13 18:49:32, nduca wrote: > we never expose the browser pid. there are good reasons for this, even though it > seems innocent. > > talk to @epenenr about how to get the browser process in telemetry. he went > through this exercise recently and probalby can advise you on how to do it. Talked to Eric and he suggests searching for the desired thread by name instead of acting on the process.
Since all input event LatencyInfo starts on the browser main thread, I added TimelineModel.GetBrowserMainThread() to return the thread named 'CrBrowserMain'; that thread's async slices have all the dumped LatencyInfo. https://codereview.chromium.org/132433004/diff/1/tools/telemetry/telemetry/co... File tools/telemetry/telemetry/core/timeline/model.py (right): https://codereview.chromium.org/132433004/diff/1/tools/telemetry/telemetry/co... tools/telemetry/telemetry/core/timeline/model.py:143: def GetProcess(self, pid): On 2014/01/13 18:49:32, nduca wrote: > why not just model.process[pid]? removed.
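The thread-name lookup described here can be sketched like this (stand-in classes for illustration only, not the actual telemetry timeline model):

```python
# Minimal stand-ins mirroring the approach in the discussion: rather than
# exposing the browser pid, walk every thread in the timeline model and
# pick out the one named 'CrBrowserMain'.
class Thread(object):
  def __init__(self, name, async_slices=None):
    self.name = name
    self.async_slices = async_slices or []

class TimelineModel(object):
  def __init__(self, threads):
    self._threads = threads

  def GetAllThreads(self):
    return list(self._threads)

  def GetBrowserMainThread(self):
    for thread in self.GetAllThreads():
      if thread.name == 'CrBrowserMain':
        return thread
    return None  # e.g. a trace without a browser main thread

model = TimelineModel([Thread('Compositor'), Thread('CrBrowserMain')])
print(model.GetBrowserMainThread().name)  # CrBrowserMain
```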
Nat, PTAL, thanks. https://codereview.chromium.org/132433004/diff/1/tools/perf/metrics/smoothnes... File tools/perf/metrics/smoothness.py (right): https://codereview.chromium.org/132433004/diff/1/tools/perf/metrics/smoothnes... tools/perf/metrics/smoothness.py:67: def AddResults(self, tab, results): On 2014/01/14 00:05:47, Yufeng Shen wrote: > On 2014/01/13 18:49:32, nduca wrote: > > needs unit tests > > I assume the unittests for this is a test that starts a simple scrolable webpage > and run the smoothness measurement on it and check that the desired metrics show > in the result. > > Currently the chrome side LatencyInfo plumbing has not totally gone > through yet (I am close in landing this CL > https://codereview.chromium.org/123563002/) so don't have a chrome > build that can return the desired input latency metric. > > Either I hold off on this CL until 123563002 is landed or I merge this one first > and add the unittests later. > Test added.
It's exciting to see this Yufeng, thanks! https://codereview.chromium.org/132433004/diff/220001/tools/perf/metrics/smoo... File tools/perf/metrics/smoothness.py (right): https://codereview.chromium.org/132433004/diff/220001/tools/perf/metrics/smoo... tools/perf/metrics/smoothness.py:96: results.Add('gesture_scroll_latency_75%', 'ms', Has there been any discussion of what the right metric to use is here? To me 75th percentile doesn't seem much more useful than median. Can we bump this up to 90th or 95th without the results on our pagesets getting too noisy?
https://codereview.chromium.org/132433004/diff/220001/tools/perf/metrics/smoo... File tools/perf/metrics/smoothness.py (right): https://codereview.chromium.org/132433004/diff/220001/tools/perf/metrics/smoo... tools/perf/metrics/smoothness.py:96: results.Add('gesture_scroll_latency_75%', 'ms', On 2014/01/21 13:57:17, Rick Byers wrote: > Has there been any discussion of what the right metric to use is here? To me > 75th percentile doesn't seem much more useful than median. Can we bump this up > to 90th or 95th without the results on our pagesets getting too noisy? Done.
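For reference, a 95th-percentile computation by linear interpolation looks like the following (a sketch of the statistic being discussed, not the CL's actual implementation):

```python
# Linear-interpolation percentile: sort the samples, then interpolate
# between the two values straddling the requested rank.
def Percentile(values, fraction):
  """Returns the percentile at `fraction` (e.g. 0.95) of `values`."""
  if not values:
    return 0.0
  ordered = sorted(values)
  idx = fraction * (len(ordered) - 1)
  low = int(idx)
  high = min(low + 1, len(ordered) - 1)
  return ordered[low] + (ordered[high] - ordered[low]) * (idx - low)

latencies_ms = [12.0, 14.0, 15.0, 16.0, 48.0]
print(Percentile(latencies_ms, 0.5))  # 15.0
```

Note how one 48 ms outlier pulls the 95th percentile far above the median, which is the motivation for tracking a high percentile rather than the mean alone.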
https://codereview.chromium.org/132433004/diff/220001/tools/perf/metrics/smoo... File tools/perf/metrics/smoothness.py (right): https://codereview.chromium.org/132433004/diff/220001/tools/perf/metrics/smoo... tools/perf/metrics/smoothness.py:96: results.Add('gesture_scroll_latency_75%', 'ms', On 2014/01/22 23:23:35, Yufeng Shen wrote: > On 2014/01/21 13:57:17, Rick Byers wrote: > > Has there been any discussion of what the right metric to use is here? To me > > 75th percentile doesn't seem much more useful than median. Can we bump this > up > > to 90th or 95th without the results on our pagesets getting too noisy? > > Done. Thanks. How consistent does this tend to be from run-to-run on Android? On Aura I know we have inconsistent scheduling that'll add noise, but hopefully we can get this to be relatively stable on Android.
On 2014/01/23 00:57:04, Rick Byers wrote: > https://codereview.chromium.org/132433004/diff/220001/tools/perf/metrics/smoo... > File tools/perf/metrics/smoothness.py (right): > > https://codereview.chromium.org/132433004/diff/220001/tools/perf/metrics/smoo... > tools/perf/metrics/smoothness.py:96: results.Add('gesture_scroll_latency_75%', > 'ms', > On 2014/01/22 23:23:35, Yufeng Shen wrote: > > On 2014/01/21 13:57:17, Rick Byers wrote: > > > Has there been any discussion of what the right metric to use is here? To > me > > > 75th percentile doesn't seem much more useful than median. Can we bump this > > up > > > to 90th or 95th without the results on our pagesets getting too noisy? > > > > Done. > > Thanks. How consistent does this tend to be from run-to-run on Android? On > Aura I know we have inconsistent scheduling that'll add noise, but hopefully we > can get this to be relatively stable on Android. My observation from manually running smoothness on top_25.json against Pixel and Nexus 4/7 is that for normal pages, Android is usually less than 16ms, and Pixel is usually 1 vsync slower than Android. From run-to-run, Android tends to be more consistent in that it usually doesn't miss the next vsync, while on Pixel it is more common for the event/frame to miss the next vsync. So I would expect on Pixel it is common to see results differ from run-to-run at the level of ~16ms. I would like to get the data to the perf dashboard soon so we can visually inspect how big the variance is from run-to-run.
lots of feedback, thank you for continuing to work on this! :) https://codereview.chromium.org/132433004/diff/280002/tools/perf/metrics/rend... File tools/perf/metrics/rendering_stats.py (right): https://codereview.chromium.org/132433004/diff/280002/tools/perf/metrics/rend... tools/perf/metrics/rendering_stats.py:47: ui_comp_name = 'INPUT_EVENT_LATENCY_UI_COMPONENT' this code is incredibly dense and hard to read and the variable names are hard to understand. what is a comp_name? remember, your job is not just to make this work, but to make it so that other people can read this and understand what is going on. https://codereview.chromium.org/132433004/diff/280002/tools/perf/metrics/rend... tools/perf/metrics/rendering_stats.py:53: for event in self.browser_main_thread.IterAllAsyncSlicesOfName(event_name): I don't get the point of event_name as a variable here, it's actually quite misleading https://codereview.chromium.org/132433004/diff/280002/tools/perf/metrics/rend... tools/perf/metrics/rendering_stats.py:53: for event in self.browser_main_thread.IterAllAsyncSlicesOfName(event_name): this function should really be split into a few functions so that you can unit test its pieces. there's one piece that takes a browser_main_thread and produces a class with the different events of interest in different arrays. then you might use a @property def mouse_wheel_latency_times(self): if self._mouse_wheel_latency_times: return self._mouse_wheel_latency_times .. build _mouse_wheel_latency_times so that the three loops below are simplified. Then, once you have this, you'll realize that you can write a unit test for this entire function without having to create a RenderStats object --- all you need to do is test the object in isolation. You'll still need a basic test that the renderstats integration with your new object works but it'll be better isolated. https://codereview.chromium.org/132433004/diff/280002/tools/perf/metrics/rend...
tools/perf/metrics/rendering_stats.py:83: if begin_comp_name in data: this huge set of branches should be broken into a set of helper functions. I can't understand what is going on here. It looks like you've got a GetGestureScrollLatencyFromEvent() and a GetTouchScrollLatencyFromEvent() maybe? https://codereview.chromium.org/132433004/diff/280002/tools/perf/metrics/rend... File tools/perf/metrics/rendering_stats_unittest.py (right): https://codereview.chromium.org/132433004/diff/280002/tools/perf/metrics/rend... tools/perf/metrics/rendering_stats_unittest.py:45: self.gesture_scroll_latency = [] these member variable names should be chosen to clearly explain *what* is in the array. You have to spend a huge amount of time reading the Add...Stats function to figure out what they are. https://codereview.chromium.org/132433004/diff/280002/tools/perf/metrics/rend... tools/perf/metrics/rendering_stats_unittest.py:244: AddBrowserInputLatencyStats(timer, 'MouseWheel', browser_main, it worries me that this is one huge unit test. Can you refactor this unit test to have tests of the new latency fields separate from the test of the existing stuff? https://codereview.chromium.org/132433004/diff/280002/tools/perf/metrics/smoo... File tools/perf/metrics/smoothness.py (right): https://codereview.chromium.org/132433004/diff/280002/tools/perf/metrics/smoo... tools/perf/metrics/smoothness.py:85: results.Add('mean_mouse_wheel_latency', 'ms', why report mean and 95%ile? can you do an experiment where you print out all the latency values from top_25 on a chromebook and then put those into a spreadsheet so we can see what the histogram of the data is? https://codereview.chromium.org/132433004/diff/280002/tools/telemetry/telemet... File tools/telemetry/telemetry/core/timeline/model.py (right): https://codereview.chromium.org/132433004/diff/280002/tools/telemetry/telemet...
tools/telemetry/telemetry/core/timeline/model.py:153: def GetBrowserMainThread(self): I'd like to remove this from this CL and have a CL that introduces just this. I think you're going to need to do something slightly different but we can do the review separate from this patch --- epenner and I need to review it. namely, the model class is browser agnostic. It doesn't know if it's a model that came from chrome, or firefox. it might even be a model that came from a devtools timeline recording, which doesn't have a browser main. i think what you want to do is use telemetry.core.backends.chrome_trace_result to find the CrRendererMain, and then add AddCoreObjectToContainerMapping for the browser object to the renderer main thread. Then, on this object, have a GetBrowserProcessFromTab(tab) that could look into the mapping just as GetRendererProcessFromTab does.
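The lazily built property sketched in the review above might look like the following (class and field names here are hypothetical, chosen only to illustrate the pattern):

```python
# Split the dense loop into an object that buckets events of interest once,
# on demand, so each bucket can be unit-tested without a RenderStats object.
class InputLatencyStats(object):
  def __init__(self, events):
    self._events = events
    self._mouse_wheel_latency_times = None

  @property
  def mouse_wheel_latency_times(self):
    # Build the bucket the first time it is asked for, then reuse it.
    if self._mouse_wheel_latency_times is None:
      self._mouse_wheel_latency_times = [
          e['latency_ms'] for e in self._events
          if e['type'] == 'MouseWheel']
    return self._mouse_wheel_latency_times

stats = InputLatencyStats([
    {'type': 'MouseWheel', 'latency_ms': 16.5},
    {'type': 'GestureScrollUpdate', 'latency_ms': 8.0},
])
print(stats.mouse_wheel_latency_times)  # [16.5]
```

The cached-list check uses `is None` rather than truthiness so that a legitimately empty result is not recomputed on every access.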
PTAL, thanks. https://codereview.chromium.org/132433004/diff/280002/tools/perf/metrics/rend... File tools/perf/metrics/rendering_stats.py (right): https://codereview.chromium.org/132433004/diff/280002/tools/perf/metrics/rend... tools/perf/metrics/rendering_stats.py:47: ui_comp_name = 'INPUT_EVENT_LATENCY_UI_COMPONENT' On 2014/01/23 18:34:45, nduca wrote: > this code is incredibly dense and hard to read and the variable names are hard > to understand. what is a comp_name? remember, your job is not just to make this > work, but to make it so that other people can read this and understand what is > going on. I refactored this into a few helper functions, hopefully making this easier to understand. https://codereview.chromium.org/132433004/diff/280002/tools/perf/metrics/rend... File tools/perf/metrics/rendering_stats_unittest.py (right): https://codereview.chromium.org/132433004/diff/280002/tools/perf/metrics/rend... tools/perf/metrics/rendering_stats_unittest.py:244: AddBrowserInputLatencyStats(timer, 'MouseWheel', browser_main, On 2014/01/23 18:34:45, nduca wrote: > it worries me that this is one huge unit test. Can you refactor this unit test > to have test of the new latency fields separate from the test of the existing > stuff? Done. https://codereview.chromium.org/132433004/diff/280002/tools/perf/metrics/smoo... File tools/perf/metrics/smoothness.py (right): https://codereview.chromium.org/132433004/diff/280002/tools/perf/metrics/smoo... tools/perf/metrics/smoothness.py:85: results.Add('mean_mouse_wheel_latency', 'ms', On 2014/01/23 18:34:45, nduca wrote: > why report mean and 95%ile? > > can you do an experiment where you print out all the latency values from top_25 > on a chromebook and then put those into a spreadsheet so we can see what the > histogram of the data are? So I only keep the mean value in the new patch. I have sent out an email "Some scrolling latency data on Pixel & Nexus 7" to the chrome-input about some of the patterns of the test results. 
Until we are confident which metrics we want to track, let's just use the mean value for now. https://codereview.chromium.org/132433004/diff/280002/tools/telemetry/telemet... File tools/telemetry/telemetry/core/timeline/model.py (right): https://codereview.chromium.org/132433004/diff/280002/tools/telemetry/telemet... tools/telemetry/telemetry/core/timeline/model.py:153: def GetBrowserMainThread(self): On 2014/01/23 18:34:45, nduca wrote: > I'd like to remove this from this CL and have a CL that introduces just this. I > think you're going to need to do something slightly different but we can do the > review separate from this patch --- epenner and i need to review it. > > > namely, the model class is browser agnostic. It doesn't know if its a model that > came from chrome, or firefox. it might even be a model that came from a devtools > timeline recording, which doesn't have a browser main. > > i think what you want to do is telemetry.core.backends.chrome_trace_result to > find the CrRendererMain, and then add AddCoreObjectToContainerMapping for the > browser object to the renderer main thread. > > Then, on this object, have a GetBrowserProcesssFromTab(tab) could look into the > mapping just as GetRendererProcessFromTab does. created a separate CL https://codereview.chromium.org/144423003/ for this.
Ping. Nat, can you take another look at this ?
lgtm thank you
The CQ bit was checked by miletus@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/miletus@chromium.org/132433004/540001
The CQ bit was unchecked by commit-bot@chromium.org
Retried try job too often on chromium_presubmit for step(s) presubmit http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=chromium_p...
The CQ bit was checked by miletus@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/miletus@chromium.org/132433004/710001
Message was sent while issue was closed.
Change committed as 249131