|
|
Chromium Code Reviews
Descriptionwebkitpy: Improve performance of merge-results script.
This patch changes the JSON merging code to merge multiple things at
once, rather than incrementally merging two things together.
The performance improvement is significant, in walk clock time the
change is;
With 6 shards;
Before - 1m3.957s
After - 0m6.475s
With 18 shards;
Before - 1m57.577s
After - 0m7.163s
BUG=524758
Review-Url: https://codereview.chromium.org/2825253003
Cr-Commit-Position: refs/heads/master@{#465888}
Committed: https://chromium.googlesource.com/chromium/src/+/731836e08488b3f4875dd5377dfb9bc01e67b21e
Patch Set 1 #
Total comments: 15
Patch Set 2 : Fixing for review. #
Total comments: 4
Patch Set 3 : Fixes. #Patch Set 4 : Fixes. #
Messages
Total messages: 23 (15 generated)
The CQ bit was checked by tansell@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Description was changed from ========== webkitpy: Improve performance of merge-results script. This patch changes the JSON merging code to merge multiple things at once, rather than incrementally merging two things together. The performance improvement is significant, in walk clock time the change is; With 6 shards; Before - 1m3.957s After - 0m6.475s With 18 shards; Before - 1m57.577s After - 0m7.163s BUG=524758 ========== to ========== webkitpy: Improve performance of merge-results script. This patch changes the JSON merging code to merge multiple things at once, rather than incrementally merging two things together. The performance improvement is significant, in walk clock time the change is; With 6 shards; Before - 1m3.957s After - 0m6.475s With 18 shards; Before - 1m57.577s After - 0m7.163s BUG=524758 ==========
tansell@chromium.org changed reviewers: + mcgreevy@chromium.org, qyearsley@chromium.org
lgtm LGTM with optional nits. https://codereview.chromium.org/2825253003/diff/1/third_party/WebKit/Tools/Sc... File third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py (right): https://codereview.chromium.org/2825253003/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py:191: ordered_keys = collections.OrderedDict.fromkeys( Why do you need an OrderedDict here? Can't you just use a Set? https://codereview.chromium.org/2825253003/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py:197: for dobj in dicts: This nested for loop looks up every key in every dictionary. An alternative would be to not build up ordered_keys in the first place, but instead build up a dict of {key: [values to merge]} then once you've finished building this, iterate over it, merging the values. It would probably be faster at the cost of using more RAM.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
alancutter@chromium.org changed reviewers: + alancutter@chromium.org
lgtm https://codereview.chromium.org/2825253003/diff/1/third_party/WebKit/Tools/Sc... File third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py (right): https://codereview.chromium.org/2825253003/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py:163: """Merge two equal objects together.""" Remove "two". https://codereview.chromium.org/2825253003/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py:172: """Merge two things which are "list like" (tuples, lists, sets).""" Remove "two". https://codereview.chromium.org/2825253003/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py:184: """Merge two things which are dictionaries.""" Remove "two". https://codereview.chromium.org/2825253003/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py:189: dict_0, dict_n)) I think this is worth pulling out as a helper assert function. One that probably would live in MergeFailure. https://codereview.chromium.org/2825253003/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py:232: for obj in objs: Seems strange that it needs to test against every object, what fails if it only calls match_func() against the first object?
The CQ bit was checked by tansell@chromium.org to run a CQ dry run
https://codereview.chromium.org/2825253003/diff/1/third_party/WebKit/Tools/Sc... File third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py (right): https://codereview.chromium.org/2825253003/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py:163: """Merge two equal objects together.""" On 2017/04/19 07:43:43, alancutter wrote: > Remove "two". Done. https://codereview.chromium.org/2825253003/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py:172: """Merge two things which are "list like" (tuples, lists, sets).""" On 2017/04/19 07:43:43, alancutter wrote: > Remove "two". Done. https://codereview.chromium.org/2825253003/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py:184: """Merge two things which are dictionaries.""" On 2017/04/19 07:43:43, alancutter wrote: > Remove "two". Done. https://codereview.chromium.org/2825253003/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py:189: dict_0, dict_n)) On 2017/04/19 07:43:43, alancutter wrote: > I think this is worth pulling out as a helper assert function. One that probably > would live in MergeFailure. Done. https://codereview.chromium.org/2825253003/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py:191: ordered_keys = collections.OrderedDict.fromkeys( On 2017/04/19 06:57:20, mcgreevy wrote: > Why do you need an OrderedDict here? Can't you just use a Set? Set doesn't preserve the order and OrderedSet doesn't exist. https://codereview.chromium.org/2825253003/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py:197: for dobj in dicts: On 2017/04/19 06:57:20, mcgreevy wrote: > This nested for loop looks up every key in every dictionary. An alternative > would be to not build up ordered_keys in the first place, but instead build up a > dict of > > {key: [values to merge]} > > then once you've finished building this, iterate over it, merging the values. > It would probably be faster at the cost of using more RAM. I tried implementing this and it made no difference to run time (within the bounds of error).
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
LGTM, great performance improvement :-) https://codereview.chromium.org/2825253003/diff/20001/third_party/WebKit/Tool... File third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py (right): https://codereview.chromium.org/2825253003/diff/20001/third_party/WebKit/Tool... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py:339: beforen, new_json_data_n, aftern = self.load_jsonp( Using an underscore (before_n/after_n) may look a bit more consistent. https://codereview.chromium.org/2825253003/diff/20001/third_party/WebKit/Tool... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py:598: FilenameMatch('pywebsocket\\.ws\\.log-.*-err\\.txt$'), Could also use a "raw string" for strings that contain regexps. Not much of a difference except then you don't need to use double-backslashes, e.g. r'pywebsocket\.ws\.log-.*-err\.txt$'
https://codereview.chromium.org/2825253003/diff/1/third_party/WebKit/Tools/Sc... File third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py (right): https://codereview.chromium.org/2825253003/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py:191: ordered_keys = collections.OrderedDict.fromkeys( On 2017/04/19 at 08:38:39, mithro wrote: > On 2017/04/19 06:57:20, mcgreevy wrote: > > Why do you need an OrderedDict here? Can't you just use a Set? > > Set doesn't preserve the order and OrderedSet doesn't exist. Right, but why do you need to preserve the order? The use of OrderedDict implies that preserving the order is necessary, but as far as I can tell, it isn't. This harms readability, as I spend extra time studying the code trying to figure out why the key order needs to be preserved. If there is indeed a reason why the key order must be preserved, please add a comment explaining why. If in fact the key order is not significant, please use a Set instead.
https://codereview.chromium.org/2825253003/diff/1/third_party/WebKit/Tools/Sc... File third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py (right): https://codereview.chromium.org/2825253003/diff/1/third_party/WebKit/Tools/Sc... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py:191: ordered_keys = collections.OrderedDict.fromkeys( On 2017/04/20 00:08:25, mcgreevy wrote: > On 2017/04/19 at 08:38:39, mithro wrote: > > On 2017/04/19 06:57:20, mcgreevy wrote: > > > Why do you need an OrderedDict here? Can't you just use a Set? > > > > Set doesn't preserve the order and OrderedSet doesn't exist. > > Right, but why do you need to preserve the order? The use of OrderedDict > implies that preserving the order is necessary, but as far as I can tell, it > isn't. This harms readability, as I spend extra time studying the code trying to > figure out why the key order needs to be preserved. If there is indeed a reason > why the key order must be preserved, please add a comment explaining why. If in > fact the key order is not significant, please use a Set instead. I've added documentation on this. https://codereview.chromium.org/2825253003/diff/20001/third_party/WebKit/Tool... File third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py (right): https://codereview.chromium.org/2825253003/diff/20001/third_party/WebKit/Tool... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py:339: beforen, new_json_data_n, aftern = self.load_jsonp( On 2017/04/19 22:40:20, qyearsley wrote: > Using an underscore (before_n/after_n) may look a bit more consistent. Done now. This was actually the reason I didn't submit last night. Noticed this and then couldn't log into my desktop machine from my laptop to fix it :-( https://codereview.chromium.org/2825253003/diff/20001/third_party/WebKit/Tool... third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py:598: FilenameMatch('pywebsocket\\.ws\\.log-.*-err\\.txt$'), On 2017/04/19 22:40:20, qyearsley wrote: > Could also use a "raw string" for strings that contain regexps. Not much of a > difference except then you don't need to use double-backslashes, e.g. > r'pywebsocket\.ws\.log-.*-err\.txt$' Yeah, raw string can be a little bit dangerous to use for regexes but I do think it is probably clearer here. I also renamed FilenameMatch to FilenameRegexMatch (and NameMatch to NameRegexMatch) to be clearer that this should be a regex.
The CQ bit was checked by tansell@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from alancutter@chromium.org, mcgreevy@chromium.org, qyearsley@chromium.org Link to the patchset: https://codereview.chromium.org/2825253003/#ps60001 (title: "Fixes.")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
CQ is committing da patch.
Bot data: {"patchset_id": 60001, "attempt_start_ts": 1492653955386490,
"parent_rev": "6c75bf4e460533cfd8bfad64ab6f8ef5bfebb173", "commit_rev":
"731836e08488b3f4875dd5377dfb9bc01e67b21e"}
Message was sent while issue was closed.
Description was changed from ========== webkitpy: Improve performance of merge-results script. This patch changes the JSON merging code to merge multiple things at once, rather than incrementally merging two things together. The performance improvement is significant, in walk clock time the change is; With 6 shards; Before - 1m3.957s After - 0m6.475s With 18 shards; Before - 1m57.577s After - 0m7.163s BUG=524758 ========== to ========== webkitpy: Improve performance of merge-results script. This patch changes the JSON merging code to merge multiple things at once, rather than incrementally merging two things together. The performance improvement is significant, in walk clock time the change is; With 6 shards; Before - 1m3.957s After - 0m6.475s With 18 shards; Before - 1m57.577s After - 0m7.163s BUG=524758 Review-Url: https://codereview.chromium.org/2825253003 Cr-Commit-Position: refs/heads/master@{#465888} Committed: https://chromium.googlesource.com/chromium/src/+/731836e08488b3f4875dd5377dfb... ==========
Message was sent while issue was closed.
Committed patchset #4 (id:60001) as https://chromium.googlesource.com/chromium/src/+/731836e08488b3f4875dd5377dfb... |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
