|
|
Created:
5 years, 10 months ago by Yufeng Shen (Slow to review) Modified:
5 years, 9 months ago CC:
chromium-reviews, telemetry-reviews_chromium.org, David Yen Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
Descriptionkey_mobile_sites_smooth: Reload pages before scrolling for LinkedIn and Wowwiki
We have shader compilation caching but telemetry can't benefit from it since
pages are tested with cache cleared.
We want to reflect the caching impact in our metrics. To enable the cache,
after the initial load of the page, we wait for a while for the page to
build some cache, and then reload the page and test the user interaction.
Linkedin and Wowwiki in key_mobile_sites_smooth are selected for this since
they are found to have expensive initial shader compilation cost.
BUG=430631, 453861, 451216
Committed: https://crrev.com/fb34317558c9f9421937aee2ff45356b4e3eb17c
Cr-Commit-Position: refs/heads/master@{#319102}
Patch Set 1 #
Total comments: 2
Patch Set 2 : use a new benchmark that runs with --page-repeat=2 & --discard-first-results #
Total comments: 1
Patch Set 3 : reload page before scrolling #
Total comments: 4
Patch Set 4 : lets do navigate->scroll->reload->navigate->scroll #
Total comments: 2
Patch Set 5 : address comments #Messages
Total messages: 39 (5 generated)
miletus@chromium.org changed reviewers: + jdduke@chromium.org, nednguyen@google.com, vmiura@chromium.org
https://codereview.chromium.org/959063002/diff/1/tools/perf/page_sets/key_mob... File tools/perf/page_sets/key_mobile_sites_smooth.py (right): https://codereview.chromium.org/959063002/diff/1/tools/perf/page_sets/key_mob... tools/perf/page_sets/key_mobile_sites_smooth.py:283: self.AddUserStory(KeyMobileSitesSmoothPage( Are we guaranteed that the same page will run on the same device? Would it make sense to have it done atomically, e.g., have a page that loads, scrolls, then reloads and scrolls, and we only capture the second sequence? Calling that the rerun?
https://codereview.chromium.org/959063002/diff/1/tools/perf/page_sets/key_mob... File tools/perf/page_sets/key_mobile_sites_smooth.py (right): https://codereview.chromium.org/959063002/diff/1/tools/perf/page_sets/key_mob... tools/perf/page_sets/key_mobile_sites_smooth.py:283: self.AddUserStory(KeyMobileSitesSmoothPage( On 2015/02/26 22:05:04, jdduke wrote: > Are we guaranteed that the same page will run on the same device? > I think one benchmark always runs on the same devices. And different benchmarks can run on different devices. > Would it make sense to have it done atomically, e.g., have a page that loads, > scrolls, then reloads and scrolls, and we only capture the second sequence? > Calling that the rerun? I was thinking of avoiding bias toward first run or second run. So I decide to capture both.
OWNERS, (Jared/Victor on smoothness perf, and Ned on tools/perf/page_sets/key_mobile_sites_pages.*), does this look good to you guys ?
On 2015/02/27 20:26:28, Yufeng Shen wrote: > OWNERS, > (Jared/Victor on smoothness perf, and > Ned on tools/perf/page_sets/key_mobile_sites_pages.*), > > does this look good to you guys ? Seems reasonable as long as Victor is OK with it.
On 2015/02/27 20:45:51, jdduke wrote: > On 2015/02/27 20:26:28, Yufeng Shen wrote: > > OWNERS, > > (Jared/Victor on smoothness perf, and > > Ned on tools/perf/page_sets/key_mobile_sites_pages.*), > > > > does this look good to you guys ? > > Seems reasonable as long as Victor is OK with it. I don't like this hack. Can you write pages that does all the actions you want and then redo them right after? With this hack, if the original page is removed due to reasons like flakiness, suddenly all your smoothness improvement due to cache metrics also regress. Even better if you creat new benchmarks that track this type of performance metrics.
miletus@chromium.org changed reviewers: + sullivan@chromium.org
On 2015/02/27 20:57:35, nednguyen(REVIEW IN OTHER ACC) wrote: > On 2015/02/27 20:45:51, jdduke wrote: > > On 2015/02/27 20:26:28, Yufeng Shen wrote: > > > OWNERS, > > > (Jared/Victor on smoothness perf, and > > > Ned on tools/perf/page_sets/key_mobile_sites_pages.*), > > > > > > does this look good to you guys ? > > > > Seems reasonable as long as Victor is OK with it. > > I don't like this hack. Can you write pages that does all the actions you want > and then redo them right after? > With this hack, if the original page is removed due to reasons like flakiness, > suddenly all your smoothness improvement due to cache metrics also regress. > > Even better if you creat new benchmarks that track this type of performance > metrics. hmmm, maybe a dup of the existing benchmark but with --page-repeat=2 ? Annie, is dashboard able to handle the output of measurement that runs with --page-repeat=2 ?
On 2015/02/27 21:22:00, Yufeng Shen wrote: > On 2015/02/27 20:57:35, nednguyen(REVIEW IN OTHER ACC) wrote: > > On 2015/02/27 20:45:51, jdduke wrote: > > > On 2015/02/27 20:26:28, Yufeng Shen wrote: > > > > OWNERS, > > > > (Jared/Victor on smoothness perf, and > > > > Ned on tools/perf/page_sets/key_mobile_sites_pages.*), > > > > > > > > does this look good to you guys ? > > > > > > Seems reasonable as long as Victor is OK with it. > > > > I don't like this hack. Can you write pages that does all the actions you want > > and then redo them right after? > > With this hack, if the original page is removed due to reasons like flakiness, > > suddenly all your smoothness improvement due to cache metrics also regress. > > > > Even better if you creat new benchmarks that track this type of performance > > metrics. > > hmmm, maybe a dup of the existing benchmark but with --page-repeat=2 ? > > Annie, is dashboard able to handle the output of measurement that runs with > --page-repeat=2 ? page_cycler already does this. You probably want --page-repeat & --discard-first-results
On 2015/02/27 21:37:26, nednguyen wrote: > On 2015/02/27 21:22:00, Yufeng Shen wrote: > > On 2015/02/27 20:57:35, nednguyen(REVIEW IN OTHER ACC) wrote: > > > On 2015/02/27 20:45:51, jdduke wrote: > > > > On 2015/02/27 20:26:28, Yufeng Shen wrote: > > > > > OWNERS, > > > > > (Jared/Victor on smoothness perf, and > > > > > Ned on tools/perf/page_sets/key_mobile_sites_pages.*), > > > > > > > > > > does this look good to you guys ? > > > > > > > > Seems reasonable as long as Victor is OK with it. > > > > > > I don't like this hack. Can you write pages that does all the actions you > want > > > and then redo them right after? > > > With this hack, if the original page is removed due to reasons like > flakiness, > > > suddenly all your smoothness improvement due to cache metrics also regress. > > > > > > Even better if you creat new benchmarks that track this type of performance > > > metrics. > > > > hmmm, maybe a dup of the existing benchmark but with --page-repeat=2 ? > > > > Annie, is dashboard able to handle the output of measurement that runs with > > --page-repeat=2 ? > page_cycler already does this. > > You probably want --page-repeat & --discard-first-results Perfect. This is also exactly what Jared has suggested at the first place. Only that I put it under a different benchmark. PTAL, thanks.
On 2015/02/27 21:37:26, nednguyen wrote: > On 2015/02/27 21:22:00, Yufeng Shen wrote: > > On 2015/02/27 20:57:35, nednguyen(REVIEW IN OTHER ACC) wrote: > > > On 2015/02/27 20:45:51, jdduke wrote: > > > > On 2015/02/27 20:26:28, Yufeng Shen wrote: > > > > > OWNERS, > > > > > (Jared/Victor on smoothness perf, and > > > > > Ned on tools/perf/page_sets/key_mobile_sites_pages.*), > > > > > > > > > > does this look good to you guys ? > > > > > > > > Seems reasonable as long as Victor is OK with it. > > > > > > I don't like this hack. Can you write pages that does all the actions you > want > > > and then redo them right after? > > > With this hack, if the original page is removed due to reasons like > flakiness, > > > suddenly all your smoothness improvement due to cache metrics also regress. > > > > > > Even better if you creat new benchmarks that track this type of performance > > > metrics. > > > > hmmm, maybe a dup of the existing benchmark but with --page-repeat=2 ? > > > > Annie, is dashboard able to handle the output of measurement that runs with > > --page-repeat=2 ? > page_cycler already does this. > > You probably want --page-repeat & --discard-first-results If that works, then making a separate benchmark with the pages we want sounds good. We do want to have the page reload, ideally, restarting the browser as well.
https://codereview.chromium.org/959063002/diff/20001/tools/perf/benchmarks/sm... File tools/perf/benchmarks/smoothness.py (right): https://codereview.chromium.org/959063002/diff/20001/tools/perf/benchmarks/sm... tools/perf/benchmarks/smoothness.py:163: """ It seems overkill to repeat the entire pageset (x2!) just to validate the shader cache... That's a pretty heavy burden for the perf bots that are already running long.
On 2015/02/27 23:25:04, vmiura wrote: > On 2015/02/27 21:37:26, nednguyen wrote: > > On 2015/02/27 21:22:00, Yufeng Shen wrote: > > > On 2015/02/27 20:57:35, nednguyen(REVIEW IN OTHER ACC) wrote: > > > > On 2015/02/27 20:45:51, jdduke wrote: > > > > > On 2015/02/27 20:26:28, Yufeng Shen wrote: > > > > > > OWNERS, > > > > > > (Jared/Victor on smoothness perf, and > > > > > > Ned on tools/perf/page_sets/key_mobile_sites_pages.*), > > > > > > > > > > > > does this look good to you guys ? > > > > > > > > > > Seems reasonable as long as Victor is OK with it. > > > > > > > > I don't like this hack. Can you write pages that does all the actions you > > want > > > > and then redo them right after? > > > > With this hack, if the original page is removed due to reasons like > > flakiness, > > > > suddenly all your smoothness improvement due to cache metrics also > regress. > > > > > > > > Even better if you creat new benchmarks that track this type of > performance > > > > metrics. > > > > > > hmmm, maybe a dup of the existing benchmark but with --page-repeat=2 ? > > > > > > Annie, is dashboard able to handle the output of measurement that runs with > > > --page-repeat=2 ? > > page_cycler already does this. > > > > You probably want --page-repeat & --discard-first-results > > If that works, then making a separate benchmark with the pages we want sounds > good. > > We do want to have the page reload, ideally, restarting the browser as well. --page-repeat will reload the page, but not restart the browser. Are you concerning about the coverage difference for in memory cache vs disk cache ?
On 2015/02/27 23:36:03, jdduke wrote: > https://codereview.chromium.org/959063002/diff/20001/tools/perf/benchmarks/sm... > File tools/perf/benchmarks/smoothness.py (right): > > https://codereview.chromium.org/959063002/diff/20001/tools/perf/benchmarks/sm... > tools/perf/benchmarks/smoothness.py:163: """ > It seems overkill to repeat the entire pageset (x2!) just to validate the shader > cache... That's a pretty heavy burden for the perf bots that are already running > long. or maybe just select a few pages, e.g. LinkedIn and Wowwiki to this rerun benchmark ?
On 2015/02/27 23:40:17, Yufeng Shen wrote: > On 2015/02/27 23:36:03, jdduke wrote: > > > https://codereview.chromium.org/959063002/diff/20001/tools/perf/benchmarks/sm... > > File tools/perf/benchmarks/smoothness.py (right): > > > > > https://codereview.chromium.org/959063002/diff/20001/tools/perf/benchmarks/sm... > > tools/perf/benchmarks/smoothness.py:163: """ > > It seems overkill to repeat the entire pageset (x2!) just to validate the > shader > > cache... That's a pretty heavy burden for the perf bots that are already > running > > long. > > or maybe just select a few pages, e.g. LinkedIn and Wowwiki to this rerun > benchmark ? Right, restrict it to the pages where we suspect the repeat count will make a meaningful difference.
On 2015/02/27 23:40:17, Yufeng Shen wrote: > On 2015/02/27 23:36:03, jdduke wrote: > > > https://codereview.chromium.org/959063002/diff/20001/tools/perf/benchmarks/sm... > > File tools/perf/benchmarks/smoothness.py (right): > > > > > https://codereview.chromium.org/959063002/diff/20001/tools/perf/benchmarks/sm... > > tools/perf/benchmarks/smoothness.py:163: """ > > It seems overkill to repeat the entire pageset (x2!) just to validate the > shader > > cache... That's a pretty heavy burden for the perf bots that are already > running > > long. > > or maybe just select a few pages, e.g. LinkedIn and Wowwiki to this rerun > benchmark ? Correction: after talking to Victor I realized that without restarting the browser, we are only testing the memory cache, not the disk cache.
On 2015/02/27 23:55:03, Yufeng Shen wrote: > On 2015/02/27 23:40:17, Yufeng Shen wrote: > > On 2015/02/27 23:36:03, jdduke wrote: > > > > > > https://codereview.chromium.org/959063002/diff/20001/tools/perf/benchmarks/sm... > > > File tools/perf/benchmarks/smoothness.py (right): > > > > > > > > > https://codereview.chromium.org/959063002/diff/20001/tools/perf/benchmarks/sm... > > > tools/perf/benchmarks/smoothness.py:163: """ > > > It seems overkill to repeat the entire pageset (x2!) just to validate the > > shader > > > cache... That's a pretty heavy burden for the perf bots that are already > > running > > > long. > > > > or maybe just select a few pages, e.g. LinkedIn and Wowwiki to this rerun > > benchmark ? > > Correction: after talking to Victor I realized that without restarting the > browser, > we are only testing the memory cache, not the disk cache. If you want to restart the browser in between runs, you can override SharedPageState to do that. May need to do some refactoring on SharedPageState's code structure to make it less adhoc for you.
On 2015/02/28 01:16:51, nednguyen wrote: > On 2015/02/27 23:55:03, Yufeng Shen wrote: > > On 2015/02/27 23:40:17, Yufeng Shen wrote: > > > On 2015/02/27 23:36:03, jdduke wrote: > > > > > > > > > > https://codereview.chromium.org/959063002/diff/20001/tools/perf/benchmarks/sm... > > > > File tools/perf/benchmarks/smoothness.py (right): > > > > > > > > > > > > > > https://codereview.chromium.org/959063002/diff/20001/tools/perf/benchmarks/sm... > > > > tools/perf/benchmarks/smoothness.py:163: """ > > > > It seems overkill to repeat the entire pageset (x2!) just to validate the > > > shader > > > > cache... That's a pretty heavy burden for the perf bots that are already > > > running > > > > long. > > > > > > or maybe just select a few pages, e.g. LinkedIn and Wowwiki to this rerun > > > benchmark ? > > > > Correction: after talking to Victor I realized that without restarting the > > browser, > > we are only testing the memory cache, not the disk cache. > > If you want to restart the browser in between runs, you can override > SharedPageState to do that. May need to do some refactoring on SharedPageState's > code structure to make it less adhoc for you. Alright, after talking to Jared offline, I feel the way forward is: 1) Apply the caching for Linkedin and Wowwiki in key_mobile_sites_smooth pageset so that we have a way to reflect & track the shader cache impact. 2) Add the no-cache version of Linkedin and Wowwiki to pathological_mobile_sites so we have a place to track the non-optimized version. (I will do this in a separate CL once https://codereview.chromium.org/964903002/ is cleared on how to cross-wire the json file to reuse WPR data). 3) My intention in this CL is only to test shader cache impact, not particularly for disk cache. So I will use the reload approach (no browser restart), which goes through the memoery cache. Verify that disk cache is working is beyond the scope of this CL. 4) To enable cache, after the page navigation, I make the page to wait for a while for the cache to be built, then reload the page and navigate again. (better solutions are welcome). PTAL, thanks.
On 2015/03/02 23:32:26, Yufeng Shen wrote: > 3) My intention in this CL is only to test shader cache impact, not particularly > for > disk cache. So I will use the reload approach (no browser restart), which goes > through > the memoery cache. Verify that disk cache is working is beyond the scope of this > CL. I thought the intention here was to verify the disk cache benefit? Should they effectively achieve the same result? Or are there fundamental behavioral differences between the two caching approaches?
If you want to benchmark smoothness value when there is caching, I strongly encourage you to make a new benchmark for that purpose. Otherwise, the normal smoothness value from other pages will dilute your "smoothness values when there is cache". https://codereview.chromium.org/959063002/diff/40001/tools/perf/page_sets/key... File tools/perf/page_sets/key_mobile_sites_smooth.py (right): https://codereview.chromium.org/959063002/diff/40001/tools/perf/page_sets/key... tools/perf/page_sets/key_mobile_sites_smooth.py:53: action_runner.Wait(5) I don't like random wait time. Please find another way to do the "wait long enough to build cache" operation. https://codereview.chromium.org/959063002/diff/40001/tools/perf/page_sets/key... tools/perf/page_sets/key_mobile_sites_smooth.py:58: # Why: Mobile wiki Make this comment the class's docstring
On 2015/03/03 17:10:11, nednguyen wrote: > If you want to benchmark smoothness value when there is caching, I strongly > encourage you to make a new benchmark for that purpose. Otherwise, the normal > smoothness value from other pages will dilute your "smoothness values when there > is cache". The intent behind key_mobile_sites is to get a feel for how some common, popular sites behave under normal interaction environments. Part of the problem currently is that many results are extremely noisy, and in this case, two sites in particular suffer from expensive shader compilation. If we have strong reason to believe that the cache will benefit these sites most of the time, then I don't see it as a dilution to enable caching for just these sites. I'm not inclined to do so for other sites as they don't exhibit the same bottleneck and I want to minimize historical deviation. Creating yet another benchmark that we probably won't monitor because it's validating a relatively narrow optimization isn't really helping anybody.
On 2015/03/03 17:14:09, jdduke wrote: > On 2015/03/03 17:10:11, nednguyen wrote: > > If you want to benchmark smoothness value when there is caching, I strongly > > encourage you to make a new benchmark for that purpose. Otherwise, the normal > > smoothness value from other pages will dilute your "smoothness values when > there > > is cache". > > The intent behind key_mobile_sites is to get a feel for how some common, popular > sites behave under normal interaction environments. > > Part of the problem currently is that many results are extremely noisy, and in > this case, two sites in particular suffer from expensive shader compilation. If > we have strong reason to believe that the cache will benefit these sites most of > the time, then I don't see it as a dilution to enable caching for just these > sites. I'm not inclined to do so for other sites as they don't exhibit the same > bottleneck and I want to minimize historical deviation. > > Creating yet another benchmark that we probably won't monitor because it's > validating a relatively narrow optimization isn't really helping anybody. I see. To recap, the purpose of this change is less about tracking the smoothness metrics when there is caching but more about reducing noise?
On 2015/03/03 17:36:44, nednguyen wrote: > I see. To recap, the purpose of this change is less about tracking the > smoothness metrics when there is caching but more about reducing noise? Right, we plan to move the non-cached versions of these pages to the pathological cases where we can track/improve non-cached performance, but in the meantime we still want to retain the pages in this benchmark (in a form that hopefully still reflects real world performance, i.e., with caching).
https://codereview.chromium.org/959063002/diff/40001/tools/perf/page_sets/key... File tools/perf/page_sets/key_mobile_sites_smooth.py (right): https://codereview.chromium.org/959063002/diff/40001/tools/perf/page_sets/key... tools/perf/page_sets/key_mobile_sites_smooth.py:53: action_runner.Wait(5) On 2015/03/03 17:10:10, nednguyen wrote: > I don't like random wait time. Please find another way to do the "wait long > enough to build cache" operation. Done. https://codereview.chromium.org/959063002/diff/40001/tools/perf/page_sets/key... tools/perf/page_sets/key_mobile_sites_smooth.py:58: # Why: Mobile wiki On 2015/03/03 17:10:10, nednguyen wrote: > Make this comment the class's docstring Done.
lg2me about how the reload action is handled. Please wait for other reviewers to resolve issues about whether this measures the right thing. https://codereview.chromium.org/959063002/diff/60001/tools/perf/page_sets/key... File tools/perf/page_sets/key_mobile_sites_smooth.py (right): https://codereview.chromium.org/959063002/diff/60001/tools/perf/page_sets/key... tools/perf/page_sets/key_mobile_sites_smooth.py:58: """# Why: Mobile wiki.""" You don't need the "#" here
smoothness OWNER Jared/Victor, PTAL, thanks. https://codereview.chromium.org/959063002/diff/60001/tools/perf/page_sets/key... File tools/perf/page_sets/key_mobile_sites_smooth.py (right): https://codereview.chromium.org/959063002/diff/60001/tools/perf/page_sets/key... tools/perf/page_sets/key_mobile_sites_smooth.py:58: """# Why: Mobile wiki.""" On 2015/03/03 19:16:41, nednguyen wrote: > You don't need the "#" here Done.
This looks generally good. I'd like to ask is there no way we could repeat the page test & use the 2nd result in a consistent way? At some point I can see us wanting to do the same for a page where just top level pre-scrolling is not what we want.
On 2015/03/03 19:55:34, vmiura wrote: > This looks generally good. > > I'd like to ask is there no way we could repeat the page test & use the 2nd > result in a consistent way? At some point I can see us wanting to do the same > for a page where just top level pre-scrolling is not what we want. Flags --page-repeat=2 and --discard-first-results give what you want. But it repeat the whole pageset, while in our case, we only want to repeat certain pages within the pageset, so that's why I have to roll it out manually. But if you want to apply the repeating on the whole pageset, the flags can work.
On 2015/03/03 20:00:10, Yufeng Shen wrote: > On 2015/03/03 19:55:34, vmiura wrote: > > This looks generally good. > > > > I'd like to ask is there no way we could repeat the page test & use the 2nd > > result in a consistent way? At some point I can see us wanting to do the same > > for a page where just top level pre-scrolling is not what we want. > > Flags --page-repeat=2 and --discard-first-results give what you want. > But it repeat the whole pageset, while in our case, we only want to repeat > certain pages within the pageset, so that's why I have to roll it out > manually. > > But if you want to apply the repeating on the whole pageset, the flags can > work. I assume everyone is happy with it now ? Can you guys give owner stamps ?
lgtm
The CQ bit was checked by miletus@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/959063002/80001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: ios_rel_device_ninja on tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios_rel_device_ni...)
The CQ bit was checked by miletus@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/959063002/80001
Message was sent while issue was closed.
Committed patchset #5 (id:80001)
Message was sent while issue was closed.
Patchset 5 (id:??) landed as https://crrev.com/fb34317558c9f9421937aee2ff45356b4e3eb17c Cr-Commit-Position: refs/heads/master@{#319102} |