|
|
DescriptionSeparates faster and slower bench alerts; sorts by delta.
Separate faster and slower bench alerts; sort by delta.
BUG=skia:2193
NOTRY=true
Committed: http://code.google.com/p/skia/source/detail?r=13512
Patch Set 1 #
Total comments: 12
Patch Set 2 : improvement #
Total comments: 12
Patch Set 3 : another round... #
Messages
Total messages: 16 (0 generated)
Looks good on the whole, some suggestions below. https://codereview.chromium.org/172683002/diff/1/bench/check_bench_regression... File bench/check_bench_regressions.py (right): https://codereview.chromium.org/172683002/diff/1/bench/check_bench_regression... bench/check_bench_regressions.py:140: float(elements[-3])) Not new to this CL... but these are strange magic indexes into the "elements" array. I suspect a more maintainable and Pythony approach would be something like a named tuple... see http://docs.python.org/2/library/collections.html#collections.namedtuple Or, at least, constant index values defined at the top of the module. Anyway, not essential for this CL, but I think it would be a good TODO to add in here. https://codereview.chromium.org/172683002/diff/1/bench/check_bench_regression... bench/check_bench_regressions.py:143: """Check if there are benches in the given revising out of range. TODO: document function arguments, return value, and possible exceptions raised in the docstring See http://google-styleguide.googlecode.com/svn/trunk/pyguide.html#Comments https://codereview.chromium.org/172683002/diff/1/bench/check_bench_regression... bench/check_bench_regressions.py:147: exceptions = {} Please document the structure of this dictionary. I think it is: exception message strings, keyed by off_ratio (ratio by which actual result varies from expectation) https://codereview.chromium.org/172683002/diff/1/bench/check_bench_regression... bench/check_bench_regressions.py:158: exception = 'Bench %s off range [%s, %s] (%s vs %s, %s%%).' % ( off -> out of https://codereview.chromium.org/172683002/diff/1/bench/check_bench_regression... bench/check_bench_regressions.py:167: if ratio < 1 and 'Faster:' not in outputs: I think it might be cleaner to build two expectations dicts (slower_than_expected, faster_than_expected) in the above loop, and then output them both here. Then you could do nice things like: 0 bench values faster than expected 18 bench values slower than expected (worst was xx% faster): blah blah or 4 bench values faster than expected (best was xx% faster): blah blah 18 bench values slower than expected (worst was xx% faster): blah blah https://codereview.chromium.org/172683002/diff/1/tools/tests/benchalerts/Perf... File tools/tests/benchalerts/Perf-Android-Nexus7-Tegra3-Arm7-Release/output-expected/stderr (right): https://codereview.chromium.org/172683002/diff/1/tools/tests/benchalerts/Perf... tools/tests/benchalerts/Perf-Android-Nexus7-Tegra3-Arm7-Release/output-expected/stderr:12: Bench desk_amazon.skp_record_,Perf-Android-Nexus7-Tegra3-Arm7-Release-25th off range [-1.0, 1.2] (1.213 vs 1.1, 10.2727272727%). we allow negative results? (-1.0 as lower bound?)
Thanks for the suggestions! PTAL. https://codereview.chromium.org/172683002/diff/1/bench/check_bench_regression... File bench/check_bench_regressions.py (right): https://codereview.chromium.org/172683002/diff/1/bench/check_bench_regression... bench/check_bench_regressions.py:140: float(elements[-3])) Done using the constant index approach, which is more lightweight. On 2014/02/19 18:51:08, epoger wrote: > Not new to this CL... but these are strange magic indexes into the "elements" > array. I suspect a more maintainable and Pythony approach would be something > like a named tuple... see > http://docs.python.org/2/library/collections.html#collections.namedtuple > > Or, at least, constant index values defined at the top of the module. > > Anyway, not essential for this CL, but I think it would be a good TODO to add in > here. https://codereview.chromium.org/172683002/diff/1/bench/check_bench_regression... bench/check_bench_regressions.py:143: """Check if there are benches in the given revising out of range. On 2014/02/19 18:51:08, epoger wrote: > TODO: document function arguments, return value, and possible exceptions raised > in the docstring > > See http://google-styleguide.googlecode.com/svn/trunk/pyguide.html#Comments Done. https://codereview.chromium.org/172683002/diff/1/bench/check_bench_regression... bench/check_bench_regressions.py:147: exceptions = {} On 2014/02/19 18:51:08, epoger wrote: > Please document the structure of this dictionary. I think it is: exception > message strings, keyed by off_ratio (ratio by which actual result varies from > expectation) Done. https://codereview.chromium.org/172683002/diff/1/bench/check_bench_regression... bench/check_bench_regressions.py:158: exception = 'Bench %s off range [%s, %s] (%s vs %s, %s%%).' % ( On 2014/02/19 18:51:08, epoger wrote: > off -> out of Done. https://codereview.chromium.org/172683002/diff/1/bench/check_bench_regression... bench/check_bench_regressions.py:167: if ratio < 1 and 'Faster:' not in outputs: On 2014/02/19 18:51:08, epoger wrote: > I think it might be cleaner to build two expectations dicts > (slower_than_expected, faster_than_expected) in the above loop, and then output > them both here. Then you could do nice things like: > > 0 bench values faster than expected > 18 bench values slower than expected (worst was xx% faster): > blah > blah > > or > > 4 bench values faster than expected (best was xx% faster): > blah > blah > 18 bench values slower than expected (worst was xx% faster): > blah > blah Done. https://codereview.chromium.org/172683002/diff/1/tools/tests/benchalerts/Perf... File tools/tests/benchalerts/Perf-Android-Nexus7-Tegra3-Arm7-Release/output-expected/stderr (right): https://codereview.chromium.org/172683002/diff/1/tools/tests/benchalerts/Perf... tools/tests/benchalerts/Perf-Android-Nexus7-Tegra3-Arm7-Release/output-expected/stderr:12: Bench desk_amazon.skp_record_,Perf-Android-Nexus7-Tegra3-Arm7-Release-25th off range [-1.0, 1.2] (1.213 vs 1.1, 10.2727272727%). Yes, the batch script algorithm could make the lower bound smaller than zero. I think it's good to leave them as is since they reflect the actual calculated ranges (not truncated by 0), so we still know if they would be too wide. On 2014/02/19 18:51:08, epoger wrote: > we allow negative results? (-1.0 as lower bound?)
Cool, getting down to the nits. https://codereview.chromium.org/172683002/diff/70001/bench/check_bench_regres... File bench/check_bench_regressions.py (right): https://codereview.chromium.org/172683002/diff/70001/bench/check_bench_regres... bench/check_bench_regressions.py:149: """Check if there are benches in the given revising out of range. revising -> revision ? Actually, revision disappeared from the arg list anyway. So I guess this should read: Check if any bench results are outside of expected range. https://codereview.chromium.org/172683002/diff/70001/bench/check_bench_regres... bench/check_bench_regressions.py:158: bench representation algorithm (default to "25th"). I don't see anything about defaulting to "25th" in this function. Maybe just remove "(default to "25th")" from this arg description? https://codereview.chromium.org/172683002/diff/70001/bench/check_bench_regres... bench/check_bench_regressions.py:164: Exception containing bench data that are out of range, if nonempty. nonempty -> any https://codereview.chromium.org/172683002/diff/70001/bench/check_bench_regres... bench/check_bench_regressions.py:170: # to a list of correspondig exception messages. correspondig -> corresponding https://codereview.chromium.org/172683002/diff/70001/bench/check_bench_regres... bench/check_bench_regressions.py:186: exceptions[0].setdefault(off_ratio, []).append(exception) Rather than magic values 0 and 1, please use defined constants or a named tuple. Or just create two separate dictionary variables (slower_than_expected, faster_than_expected). https://codereview.chromium.org/172683002/diff/70001/bench/check_bench_regres... bench/check_bench_regressions.py:197: header = '%s Slower benche(s) (Sorted):' % len(li) I think this would be clearer: '%s benches got slower (sorted by %% difference):' Part of this is: I think it's easier to read if you just leave out the "(s)" business to account for singular-vs-plural. I think that's distracting.
Thanks for the careful review! PTAL. https://codereview.chromium.org/172683002/diff/70001/bench/check_bench_regres... File bench/check_bench_regressions.py (right): https://codereview.chromium.org/172683002/diff/70001/bench/check_bench_regres... bench/check_bench_regressions.py:149: """Check if there are benches in the given revising out of range. On 2014/02/19 21:27:51, epoger wrote: > revising -> revision ? > > Actually, revision disappeared from the arg list anyway. So I guess this should > read: > > Check if any bench results are outside of expected range. Done. https://codereview.chromium.org/172683002/diff/70001/bench/check_bench_regres... bench/check_bench_regressions.py:158: bench representation algorithm (default to "25th"). Agreed. Done. On 2014/02/19 21:27:51, epoger wrote: > I don't see anything about defaulting to "25th" in this function. Maybe just > remove "(default to "25th")" from this arg description? https://codereview.chromium.org/172683002/diff/70001/bench/check_bench_regres... bench/check_bench_regressions.py:164: Exception containing bench data that are out of range, if nonempty. On 2014/02/19 21:27:51, epoger wrote: > nonempty -> any Done. https://codereview.chromium.org/172683002/diff/70001/bench/check_bench_regres... bench/check_bench_regressions.py:170: # to a list of correspondig exception messages. Done. How did you catch this one? On 2014/02/19 21:27:51, epoger wrote: > correspondig -> corresponding https://codereview.chromium.org/172683002/diff/70001/bench/check_bench_regres... bench/check_bench_regressions.py:186: exceptions[0].setdefault(off_ratio, []).append(exception) Done using constants. On 2014/02/19 21:27:51, epoger wrote: > Rather than magic values 0 and 1, please use defined constants or a named tuple. > Or just create two separate dictionary variables (slower_than_expected, > faster_than_expected). https://codereview.chromium.org/172683002/diff/70001/bench/check_bench_regres... bench/check_bench_regressions.py:197: header = '%s Slower benche(s) (Sorted):' % len(li) Agreed. Done. On 2014/02/19 21:27:51, epoger wrote: > I think this would be clearer: > > '%s benches got slower (sorted by %% difference):' > > Part of this is: I think it's easier to read if you just leave out the "(s)" > business to account for singular-vs-plural. I think that's distracting.
LGTM! Thanks for making this improvement so quickly, Ben!
The CQ bit was checked by bensong@google.com
CQ is trying da patch. Follow status at https://skia-tree-status.appspot.com/cq/bensong@google.com/172683002/140001
CQ is trying da patch. Follow status at https://skia-tree-status.appspot.com/cq/bensong@google.com/172683002/140001
CQ is trying da patch. Follow status at https://skia-tree-status.appspot.com/cq/bensong@google.com/172683002/140001
CQ is trying da patch. Follow status at https://skia-tree-status.appspot.com/cq/bensong@google.com/172683002/140001
CQ is trying da patch. Follow status at https://skia-tree-status.appspot.com/cq/bensong@google.com/172683002/140001
The CQ bit was unchecked by bensong@google.com
The CQ bit was checked by bensong@google.com
CQ is trying da patch. Follow status at https://skia-tree-status.appspot.com/cq/bensong@google.com/172683002/140001
Message was sent while issue was closed.
Change committed as 13512 |