
Issue 962793004: [Telemetry] Make "discard_first_result" apply to user_stories too. (Closed)

Created:
5 years, 9 months ago by slamm

Modified:
5 years, 9 months ago

Reviewers:
eakuefner, aiolos (Not reviewing), dtu, nednguyen, sullivan

CC:
chromium-reviews, telemetry-reviews_chromium.org

Base URL:
https://chromium.googlesource.com/chromium/src.git@master

Target Ref:
refs/pending/heads/master

Project:
chromium

Visibility:
Public.


Description

Make discarding the first result possible through Benchmark.ValueCanBeAddedPredicate (add an "is_first_result" arg to it).

The main motivation here is to eliminate special cases for PageTest methods/attributes in user_story_runner.Run.

BUG=440101

Committed: https://crrev.com/2cd4e23b41a986a172ed9a15e165d3a416d9056b
Cr-Commit-Position: refs/heads/master@{#319140}
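
The idea in the description can be illustrated with a minimal sketch. This is not the code from the patch: Results, WarmBenchmark, WillRunStory and run_stories are hypothetical stand-ins, and only the predicate name and its "is_first_result" argument come from the description above. The sketch assumes the results object tracks which stories it has already seen and passes that information to a predicate supplied by the benchmark.

```python
class Results(object):
    """Hypothetical stand-in for a results object (not Telemetry's real class)."""

    def __init__(self,
                 value_can_be_added_predicate=lambda value, is_first_result: True):
        self._predicate = value_can_be_added_predicate
        self._stories_seen = set()
        self._is_first_result = False
        self.values = []

    def WillRunStory(self, story_name):
        # A run counts as the "first result" if this story has not run before.
        self._is_first_result = story_name not in self._stories_seen
        self._stories_seen.add(story_name)

    def AddValue(self, value):
        # The benchmark-supplied predicate decides whether the value is kept.
        if self._predicate(value, self._is_first_result):
            self.values.append(value)


class WarmBenchmark(object):
    """Hypothetical benchmark that reports only warm (non-first) runs."""

    @classmethod
    def ValueCanBeAddedPredicate(cls, value, is_first_result):
        return not is_first_result


def run_stories(benchmark_class, story_names):
    """Runs each story name in order and returns the values that were kept."""
    results = Results(benchmark_class.ValueCanBeAddedPredicate)
    for run_index, name in enumerate(story_names):
        results.WillRunStory(name)
        results.AddValue((name, run_index))
    return results.values
```

With this shape the runner needs no knowledge of the discard rule; it only hands the benchmark's predicate to the results object, which is the kind of special-case removal the description refers to.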

Patch Set 1 #

Total comments: 3

Patch Set 2 : Address comments and make command-line override-behavior work. #

Patch Set 3 : Make the discard rule part of the results filter. (Not a cmd-line option.) #

Patch Set 4 : Add unit test. #

Patch Set 5 : Remove user_story_runner test for discarding the first result. #

Total comments: 6

Patch Set 6 : Address review comments. #

Created: 5 years, 9 months ago


Stats: +108 lines, -124 lines

Status	File	Delta from patch set	Chunks	Lines changed	Comments
M	tools/perf/benchmarks/gpu_times.py	1 2	1 chunk	+1 line, -1 line	0 comments
M	tools/perf/benchmarks/page_cycler.py	1 2	4 chunks	+7 lines, -5 lines	0 comments
M	tools/perf/benchmarks/session_restore.py	1 2	2 chunks	+5 lines, -0 lines	0 comments
M	tools/perf/benchmarks/startup.py	1 2	1 chunk	+4 lines, -0 lines	0 comments
M	tools/perf/measurements/page_cycler.py	1 2	1 chunk	+0 lines, -2 lines	0 comments
M	tools/perf/measurements/startup.py		1 chunk	+0 lines, -3 lines	0 comments
M	tools/telemetry/telemetry/benchmark.py	1 2	1 chunk	+9 lines, -4 lines	0 comments
M	tools/telemetry/telemetry/benchmark_unittest.py	1 2	1 chunk	+1 line, -1 line	0 comments
M	tools/telemetry/telemetry/results/page_test_results.py	1 2 3 4 5	4 chunks	+12 lines, -10 lines	0 comments
M	tools/telemetry/telemetry/results/page_test_results_unittest.py	1 2 3	4 chunks	+67 lines, -33 lines	0 comments
M	tools/telemetry/telemetry/results/results_options.py	1 2 3 4 5	1 chunk	+1 line, -1 line	0 comments
M	tools/telemetry/telemetry/user_story/user_story_runner.py	1 2	2 chunks	+1 line, -9 lines	0 comments
M	tools/telemetry/telemetry/user_story/user_story_runner_unittest.py	1 2 3 4	1 chunk	+0 lines, -55 lines	0 comments

Messages

Total messages: 37 (8 generated)


slamm

I updated the description: The main motivation here is to eliminate special cases for PageTest ...

5 years, 9 months ago (2015-02-27 01:14:25 UTC) #5

nednguyen

On 2015/02/27 03:35:32, sullivan wrote: > +dtu because command line flags I find --discard-first-result is ...

5 years, 9 months ago (2015-02-27 13:22:25 UTC) #8

nednguyen

https://codereview.chromium.org/962793004/diff/1/tools/perf/benchmarks/page_cycler.py File tools/perf/benchmarks/page_cycler.py (right): https://codereview.chromium.org/962793004/diff/1/tools/perf/benchmarks/page_cycler.py#newcode27 tools/perf/benchmarks/page_cycler.py:27: def ProcessCommandLineArgs(cls, parser, args): Don't you need to call ...

5 years, 9 months ago (2015-02-27 13:27:35 UTC) #9

dtu

On 2015/02/27 13:22:25, nednguyen wrote: > On 2015/02/27 03:35:32, sullivan wrote: > > +dtu because ...

5 years, 9 months ago (2015-02-27 22:10:02 UTC) #10

aiolos (Not reviewing)

On 2015/02/27 13:22:25, nednguyen wrote: > On 2015/02/27 03:35:32, sullivan wrote: > > +dtu because ...

5 years, 9 months ago (2015-02-27 22:11:55 UTC) #11

slamm

On 2015/02/27 22:11:55, aiolos wrote: > On 2015/02/27 13:22:25, nednguyen wrote: > > On 2015/02/27 ...

5 years, 9 months ago (2015-02-27 22:34:15 UTC) #12

nednguyen

On 2015/02/27 22:34:15, slamm wrote: > On 2015/02/27 22:11:55, aiolos wrote: > > On 2015/02/27 ...

5 years, 9 months ago (2015-02-28 01:05:21 UTC) #13

nednguyen


5 years, 9 months ago (2015-02-28 01:08:21 UTC) #14

On 2015/02/28 01:05:21, nednguyen wrote:
> On 2015/02/27 22:34:15, slamm wrote:
> > On 2015/02/27 22:11:55, aiolos wrote:
> > > On 2015/02/27 13:22:25, nednguyen wrote:
> > > > On 2015/02/27 03:35:32, sullivan wrote:
> > > > > +dtu because command line flags
> > > >
> > > > I find --discard-first-result is a useful commandline flag that's used quite
> > > > often in the past (we removed it at some point, which I didn't remember why).
> > > > Most of the time, the metrics of the first run results is different from the
> > > > rest because of some caching effect.
> > >
> > > I think with our discussion of wanting to reduce the number of command line
> > > flags, we need more justification than "being able to discard the first result
> > > is useful." If it's super useful for reducing noise, why don't we just do it by
> > > default and let the tests that don't want to use that behavior (anything that
> > > want cold times for example) overwrite it. Or if it's only useful in certain
> > > cases, why don't we just have people write a warm version of their tests for
> > > those cases?
> >
> > Ned convinced me by saying that Matt Lee had asked for it. I was also thinking
> > it would be handy for letting users see just how much slower the first run is
> > for the "warm" case.
> > (Although, now that I think about it, they could simply run the "cold" case to
> > know that.)
> >
> > Ned do you have more justification for the command-line flag.
> > If the command-line flag is shot down, could the benchmark somehow tell the
> > results instance to discard the first result?
>
> Another example that people may want to use --discard-first-result:
> https://codereview.chromium.org/959063002/

I don't have a strong opinion about whether we make --discard-first-result a
command-line flag or a benchmark option that users can specify. I do think
that this is a useful option in many cases, but not in all cases by default
(because it certainly increases the cycle time).

dtu


5 years, 9 months ago (2015-03-02 18:24:22 UTC) #15

On 2015/02/28 01:08:21, nednguyen wrote:
> I don't have a strong opinion about whether we make --discard-first-result a
> command-line flag or a benchmark option that users can specify. I do think
> that this is a useful option in many cases, but not in all cases by default
> (because it certainly increases the cycle time).

It should definitely be an option baked into the benchmark and we shouldn't
allow users to specify it at the command-line. We disallow having command-line
configurations that can affect the results of the benchmark.

nednguyen(REVIEW IN OTHER ACC)


5 years, 9 months ago (2015-03-02 22:31:41 UTC) #16

On 2015/03/02 18:24:22, dtu wrote:
> It should definitely be an option baked into the benchmark and we shouldn't
> allow users to specify it at the command-line. We disallow having command-line
> configurations that can affect the results of the benchmark.

Cool, I like this philosophy. @slamm, can you make this a benchmark or result
option?

slamm


5 years, 9 months ago (2015-03-02 22:59:14 UTC) #17

On 2015/03/02 22:31:41, nednguyen(REVIEW IN OTHER ACC) wrote:
> Cool, I like this philosophy. @slamm, can you make this a benchmark or result
> option?

Yes, I would like to make it work as a benchmark or result option.
My current idea is to make it a benchmark option and then have that set a
results option. The former being what a benchmark writer changes and the latter
being what gets passed to user_story_runner.

slamm


5 years, 9 months ago (2015-03-03 18:53:06 UTC) #18

On 2015/03/02 22:59:14, slamm wrote:
> Yes, I would like to make it work as a benchmark or result option.
> My current idea is to make it a benchmark option and then have that set a
> results option. The former being what a benchmark writer changes and the latter
> being what gets passed to user_story_runner.

Related to this, the page_cycler benchmark has a command-line arg,
--cold-load-percent. Its value determines whether or not the first result is
discarded. As part of my change, should cold-load-percent become a class
attribute?

nednguyen


5 years, 9 months ago (2015-03-03 19:18:33 UTC) #20

On 2015/03/03 18:53:06, slamm wrote:
> Related to this, the page_cycler benchmark has a command-line arg,
> --cold-load-percent. Its value determines whether or not the first result is
> discarded. As part of my change, should cold-load-percent become a class
> attribute?

I think we should make cold-load-percent a class attribute since this is a
command line flag that can affect the results.
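
A rough sketch of what "make cold-load-percent a class attribute" could look like. The class names below are hypothetical, and the rule inside the predicate (keep the first result only when cold loads are being measured) is an assumption based on the statement above that the flag's value determines whether the first result is discarded. It is not taken from the patch.

```python
class PageCyclerSketch(object):
    """Hypothetical benchmark base class; not the real page_cycler benchmark."""

    # Fixed per benchmark class instead of being read from the command line,
    # so the setting cannot vary between runs of the same benchmark.
    cold_load_percent = 50

    @classmethod
    def ValueCanBeAddedPredicate(cls, value, is_first_result):
        # Assumption: when cold loads are measured, the first (cold) result is
        # wanted; otherwise it is treated as warm-up and dropped.
        return cls.cold_load_percent > 0 or not is_first_result


class WarmOnlyPageCyclerSketch(PageCyclerSketch):
    """Hypothetical variant that measures warm loads only."""

    cold_load_percent = 0
```

A benchmark writer would then change behavior by subclassing and overriding the attribute, which keeps the choice in the benchmark definition rather than in a flag.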

aiolos (Not reviewing)


5 years, 9 months ago (2015-03-03 19:24:45 UTC) #21

On 2015/03/03 19:18:33, nednguyen wrote:
> I think we should make cold-load-percent a class attribute since this is a
> command line flag that can affect the results.

Agreed.

dtu


5 years, 9 months ago (2015-03-03 22:27:23 UTC) #23

On 2015/03/03 19:24:45, aiolos wrote:
> Agreed.

Yus

slamm

Remove user_story_runner test for discarding the first result.

5 years, 9 months ago (2015-03-03 22:59:34 UTC) #25

slamm

On 2015/03/03 22:59:34, slamm wrote:
> Remove user_story_runner test for discarding the first result.

PTAL.

5 years, 9 months ago (2015-03-03 23:34:10 UTC) #26

nednguyen

+Ethan for results/ code review. This approach is lg2me

5 years, 9 months ago (2015-03-04 19:04:59 UTC) #28

eakuefner

results changes lgtm https://codereview.chromium.org/962793004/diff/100001/tools/telemetry/telemetry/results/page_test_results.py File tools/telemetry/telemetry/results/page_test_results.py (right): https://codereview.chromium.org/962793004/diff/100001/tools/telemetry/telemetry/results/page_test_results.py#newcode26 tools/telemetry/telemetry/results/page_test_results.py:26: value_can_be_added_predicate=lambda v, f: True): f -> ...

5 years, 9 months ago (2015-03-04 19:19:05 UTC) #29

slamm

Thank you for the comments. I am quite pleased with how this turned out. It ...

5 years, 9 months ago (2015-03-04 20:59:31 UTC) #31

nednguyen

lgtm There is some risk that this may cause regression on those benchmarks. I can't ...

5 years, 9 months ago (2015-03-04 21:09:35 UTC) #32

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/962793004/120001

5 years, 9 months ago (2015-03-04 22:15:31 UTC) #35

commit-bot: I haz the power

5 years, 9 months ago (2015-03-04 22:22:52 UTC) #37

Message was sent while issue was closed.

Patchset 6 (id:??) landed as
https://crrev.com/2cd4e23b41a986a172ed9a15e165d3a416d9056b
Cr-Commit-Position: refs/heads/master@{#319140}
