|
|
Created:
7 years, 8 months ago by Mike Stip (use stip instead) Modified:
7 years, 5 months ago CC:
chromium-reviews, cmp-cc_chromium.org Visibility:
Public. |
DescriptionOnly calculate slave->build mapping once in json output.
This improves tryserver root json performance by approximately 567%.
Background for this patch is discussed in https://chromiumcodereview.appspot.com/13619004.
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=209365
Patch Set 1 #
Total comments: 15
Patch Set 2 : Address nits. #
Total comments: 6
Patch Set 3 : Fix nits. #
Total comments: 2
Patch Set 4 : Make buildcache only persist over a request. #Patch Set 5 : Added readme.chromium changes. #
Total comments: 9
Patch Set 6 : Address iannucci's comments. #Patch Set 7 : Update README.chromium to reflect new changes. #Patch Set 8 : Rename to custom_data. #
Messages
Total messages: 18 (0 generated)
Background: The tryserver json endpoint (http://build.chromium.org/p/tryserver.chromium/json) is a critical URL relied upon by CQ. It has very poor latency characteristics, wavering between 4 seconds during idle periods and 70s at peak load. When profiling the json endpoint, line_prof[1] and statprof[2] revealed a significant amount of time spent in SlaveJsonResource.asDict(). Each of the called functions and lines appeared to be reasonably optimal and did not take much time per call. However, the inner loop of SlaveJsonResource.asDict() iterates over every builder's cache. This same process is repeated for every slave in SlavesJsonResource, and results that do not match the currently looped-over slave are discarded. By creating a mapping of slaves to builders (and builds) once and persisting it across subsequent calls to SlaveJsonResource, this patch significantly reduces the overhead of /slaves and /json endpoints. On an unloaded Z620 with duplicated tryserver data, this patch reduces median latency from 8996ms to 1593ms[3]. This corresponds to a 5.67x speedup. [1] http://pythonhosted.org/line_profiler/ [2] https://pypi.python.org/pypi/statprof/ [3] === Benchmarking Setup === 1) duplicate live tryserver data by loading a postgres dump and copying over pickled builds 2) start up master and warm up cache by visiting http://localhost:8228/waterfall and http://localhost:8228/json?as_text=1 3) test with ab -t 200 -n 1000 -c 5 http://localhost:8228/json\?as_text=1 ==== Results with stock json code ===== Benchmarking localhost (be patient) Completed 100 requests Finished 110 requests Server Software: TwistedWeb/10.2.0 Server Hostname: localhost Server Port: 8228 Document Path: /json?as_text=1 Document Length: 338000 bytes Concurrency Level: 5 Time taken for tests: 201.109 seconds Complete requests: 110 Failed requests: 0 Write errors: 0 Total transferred: 37201890 bytesHTML transferred: 37180000 bytes Requests per second: 0.55 [#/sec] (mean) Time per request: 9141.322 [ms] (mean) Time per request: 1828.264 [ms] (mean, across all concurrent requests)Transfer rate: 180.65 [Kbytes/sec] received Connection Times (ms) min mean[+/-sd] median max Connect: 0 0 0.0 0 0 Processing: 5493 8965 1279.8 8996 13489 Waiting: 2829 8531 1575.8 8917 13482 Total: 5493 8965 1279.8 8996 13489 Percentage of the requests served within a certain time (ms) 50% 8996 66% 9319 75% 9487 80% 9957 90% 10466 95% 10677 98% 12148 99% 12268 100% 13489 (longest request) ==== Results with this patch applied ===== Benchmarking localhost (be patient)Completed 100 requests Completed 200 requests Completed 300 requests Completed 400 requests Completed 500 requests Completed 600 requests Finished 621 requests Server Software: TwistedWeb/10.2.0 Server Hostname: localhost Server Port: 8228 Document Path: /json?as_text=1 Document Length: 338000 bytes Concurrency Level: 5 Time taken for tests: 200.161 seconds Complete requests: 621 Failed requests: 0 Write errors: 0 Total transferred: 210021579 bytes HTML transferred: 209898000 bytes Requests per second: 3.10 [#/sec] (mean) Time per request: 1611.598 [ms] (mean) Time per request: 322.320 [ms] (mean, across all concurrent requests) Transfer rate: 1024.67 [Kbytes/sec] received Connection Times (ms) min mean[+/-sd] median max Connect: 0 0 0.0 0 0 Processing: 1210 1606 172.2 1593 2981 Waiting: 1102 1586 174.2 1581 2972 Total: 1210 1606 172.2 1593 2981 Percentage of the requests served within a certain time (ms) 50% 1593 66% 1613 75% 1624 80% 1633 90% 1661 95% 1730 98% 1797 99% 2738 100% 2981 (longest request)
Some comments. Also note you'll need to add this to our buildbot diff file. https://chromiumcodereview.appspot.com/13619004/diff/1/third_party/buildbot_8... File third_party/buildbot_8_4p1/buildbot/status/web/status_json.py (right): https://chromiumcodereview.appspot.com/13619004/diff/1/third_party/buildbot_8... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:417: buildcache = {} Maybe: self.buildcache = collections.defaultdict(dict) https://chromiumcodereview.appspot.com/13619004/diff/1/third_party/buildbot_8... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:424: buildcache=buildcache)) buildcache=self.buildcache https://chromiumcodereview.appspot.com/13619004/diff/1/third_party/buildbot_8... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:625: def __init__(self, status, slave_status, buildcache=None): why optional? https://chromiumcodereview.appspot.com/13619004/diff/1/third_party/buildbot_8... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:645: def mapSlavesToBuilds(self): I think this would be cleaner if it buildcache[self.name] (and lazily creates the rest of buildcache) https://chromiumcodereview.appspot.com/13619004/diff/1/third_party/buildbot_8... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:648: self.buildcache = {} 646-648 can be removed if param not optional. https://chromiumcodereview.appspot.com/13619004/diff/1/third_party/buildbot_8... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:657: slave = self.buildcache.get(build_status.getSlavename(), {}) if buildcache is defaultdict. 657-660 become: slave = self.buildcache[build_status.getSlavename()] slave.setdefault(builderName, []).append(build_status.getNumber()) https://chromiumcodereview.appspot.com/13619004/diff/1/third_party/buildbot_8... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:661: return self.buildcache[self.name] https://chromiumcodereview.appspot.com/13619004/diff/1/third_party/buildbot_8... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:666: results['builders'] = self.buildcache.get(self.name, {}) results['builders'] = getSlavesToBuildMap() https://chromiumcodereview.appspot.com/13619004/diff/1/third_party/buildbot_8... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:676: buildcache = {} Maybe self.buildcache = collections.defaultdict(dict) https://chromiumcodereview.appspot.com/13619004/diff/1/third_party/buildbot_8... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:682: buildcache=buildcache)) self.buildcache
Addressed comments. ptal! https://codereview.chromium.org/13619004/diff/1/third_party/buildbot_8_4p1/bu... File third_party/buildbot_8_4p1/buildbot/status/web/status_json.py (right): https://codereview.chromium.org/13619004/diff/1/third_party/buildbot_8_4p1/bu... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:424: buildcache=buildcache)) On 2013/04/04 21:24:38, Isaac wrote: > buildcache=self.buildcache It's probably not a good idea to persist this cache past init -- it should really only be used during construction and then discarded. https://codereview.chromium.org/13619004/diff/1/third_party/buildbot_8_4p1/bu... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:625: def __init__(self, status, slave_status, buildcache=None): On 2013/04/04 21:24:38, Isaac wrote: > why optional? It's only needed if you are repeatedly creating SlaveJsonResources. https://codereview.chromium.org/13619004/diff/1/third_party/buildbot_8_4p1/bu... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:645: def mapSlavesToBuilds(self): On 2013/04/04 21:24:38, Isaac wrote: > I think this would be cleaner if it buildcache[self.name] (and lazily creates > the rest of buildcache) The scan works by creating all slaves at once, so creating "the rest of buildcache" doesn't make sense. It's all or nothing. I've modified this to do the lazy build outside of mapSlavesToBuilds. https://codereview.chromium.org/13619004/diff/1/third_party/buildbot_8_4p1/bu... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:676: buildcache = {} On 2013/04/04 21:24:38, Isaac wrote: > Maybe self.buildcache = collections.defaultdict(dict) Done.
Still need diff added to README.chromium. I'm not comfortable approving this buildbot change, but the code looks good to me with a couple nits. https://codereview.chromium.org/13619004/diff/1/third_party/buildbot_8_4p1/bu... File third_party/buildbot_8_4p1/buildbot/status/web/status_json.py (right): https://codereview.chromium.org/13619004/diff/1/third_party/buildbot_8_4p1/bu... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:625: def __init__(self, status, slave_status, buildcache=None): On 2013/04/05 23:35:04, Mike Stipicevic wrote: > On 2013/04/04 21:24:38, Isaac wrote: > > why optional? > > It's only needed if you are repeatedly creating SlaveJsonResources. As far as I can tell, the only clients are in this class and they use the param. I believe option params are discouraged unless there is a present need for them. https://codereview.chromium.org/13619004/diff/5001/third_party/buildbot_8_4p1... File third_party/buildbot_8_4p1/buildbot/status/web/status_json.py (right): https://codereview.chromium.org/13619004/diff/5001/third_party/buildbot_8_4p1... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:647: def mapSlavesToBuilds(self): delete this method and inline to line 662
https://codereview.chromium.org/13619004/diff/5001/third_party/buildbot_8_4p1... File third_party/buildbot_8_4p1/buildbot/status/web/status_json.py (right): https://codereview.chromium.org/13619004/diff/5001/third_party/buildbot_8_4p1... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:418: buildcache = collections.defaultdict(dict) Why not create a single global instance? It'd be simpler and more efficient. https://codereview.chromium.org/13619004/diff/5001/third_party/buildbot_8_4p1... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:657: slave.setdefault(builderName, []).append( Note that it is effectively a memory leak, not a big one since it's just numbers but still.
https://codereview.chromium.org/13619004/diff/5001/third_party/buildbot_8_4p1... File third_party/buildbot_8_4p1/buildbot/status/web/status_json.py (right): https://codereview.chromium.org/13619004/diff/5001/third_party/buildbot_8_4p1... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:418: buildcache = collections.defaultdict(dict) On 2013/04/06 01:23:46, Marc-Antoine Ruel wrote: > Why not create a single global instance? It'd be simpler and more efficient. Because this is per-request. It's only a cache while the JSON is being build up, then it should be discarded. https://codereview.chromium.org/13619004/diff/5001/third_party/buildbot_8_4p1... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:647: def mapSlavesToBuilds(self): On 2013/04/05 23:45:01, Isaac wrote: > delete this method and inline to line 662 Done. https://codereview.chromium.org/13619004/diff/5001/third_party/buildbot_8_4p1... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:657: slave.setdefault(builderName, []).append( On 2013/04/06 01:23:46, Marc-Antoine Ruel wrote: > Note that it is effectively a memory leak, not a big one since it's just numbers > but still. buildcache gets thrown away after every request.
https://codereview.chromium.org/13619004/diff/10001/third_party/buildbot_8_4p... File third_party/buildbot_8_4p1/README.chromium (right): https://codereview.chromium.org/13619004/diff/10001/third_party/buildbot_8_4p... third_party/buildbot_8_4p1/README.chromium:2959: - break This does not look like the diff from origin.
lgtm https://codereview.chromium.org/13619004/diff/10001/third_party/buildbot_8_4p... File third_party/buildbot_8_4p1/buildbot/status/web/status_json.py (right): https://codereview.chromium.org/13619004/diff/10001/third_party/buildbot_8_4p... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:648: for builderName in self.getBuilders(): So the code inside the if statement will be executed exactly once per web request?
The previous code was based on an incorrect assumption of how twisted Resources work. The code has been updated to properly limit the scope of the buildrequest to each request. Line profiling verifies that, for a master with 79 builders, 10 requests calls the build sweep logic 790 times. patchset 5 has an estimated speedup of 609%. ===== stock code on unloaded z620 ===== Server Software: TwistedWeb/10.2.0 Server Hostname: localhost Server Port: 8228 Document Path: /json?as_text=1 Document Length: 294040 bytes Concurrency Level: 5 Time taken for tests: 201.056 seconds Complete requests: 87 Failed requests: 0 Write errors: 0 Total transferred: 25729865 bytes HTML transferred: 25712353 bytes Requests per second: 0.43 [#/sec] (mean) Time per request: 11554.942 [ms] (mean) Time per request: 2310.988 [ms] (mean, across all concurrent requests) Transfer rate: 124.97 [Kbytes/sec] received Connection Times (ms) min mean[+/-sd] median max Connect: 0 0 0.0 0 0 Processing: 7211 11178 1803.4 11395 17530 Waiting: 3367 10420 2292.3 11248 17525 Total: 7211 11178 1803.4 11395 17530 Percentage of the requests served within a certain time (ms) 50% 11393 66% 11524 75% 11620 80% 11748 90% 13406 95% 13598 98% 15883 99% 17530 100% 17530 (longest request) ===== with patchset 5 ===== Server Software: TwistedWeb/10.2.0 Server Hostname: localhost Server Port: 8228 Document Path: /json?as_text=1 Document Length: 318889 bytes Concurrency Level: 5 Time taken for tests: 38.211 seconds Complete requests: 100 Failed requests: 0 Write errors: 0 Total transferred: 31908800 bytes HTML transferred: 31888900 bytes Requests per second: 2.62 [#/sec] (mean) Time per request: 1910.534 [ms] (mean) Time per request: 382.107 [ms] (mean, across all concurrent requests) Transfer rate: 815.50 [Kbytes/sec] received Connection Times (ms) min mean[+/-sd] median max Connect: 0 0 0.0 0 0 Processing: 1582 1893 92.7 1909 2098 Waiting: 1340 1863 116.0 1897 2093 Total: 1582 1893 92.7 1909 2098 Percentage of the requests served within a certain time (ms) 50% 1909 66% 1921 75% 1936 80% 1951 90% 1980 95% 2055 98% 2074 99% 2098 100% 2098 (longest request)
Does the description need to be updated as well? https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... File third_party/buildbot_8_4p1/README.chromium (right): https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... third_party/buildbot_8_4p1/README.chromium:11: don't need this, I think? https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... File third_party/buildbot_8_4p1/buildbot/status/web/status_json.py (right): https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:642: # This builder hasn't been processed in this request yet. obvious comment is obvious? :) https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:646: build_status = builder_status.getBuild(-i) This loop construction looks ugly, but I'm assuming it's due to beep-do-doot-doo-BuildBot! https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:650: slave = buildcache[build_status.getSlavename()] how is slavename guaranteed to exist in buildcache at this point? https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:666: I think this whole block could be: if not hasattr(request, 'requestdata'): request.requestdata = { 'buildcache': collections.defaultdict(dict), 'buildercache': set() } I think? Or could something else set requestdata? https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:668: # Enhance it by adding more informations. depluaralize information while you're at it :)
updated https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... File third_party/buildbot_8_4p1/README.chromium (right): https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... third_party/buildbot_8_4p1/README.chromium:11: On 2013/06/30 08:16:16, iannucci wrote: > don't need this, I think? Done, now sure how that snuck in. https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... File third_party/buildbot_8_4p1/buildbot/status/web/status_json.py (right): https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:650: slave = buildcache[build_status.getSlavename()] On 2013/06/30 08:16:16, iannucci wrote: > how is slavename guaranteed to exist in buildcache at this point? collections.defaultdict(dict) https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:666: On 2013/06/30 08:16:16, iannucci wrote: > I think this whole block could be: > > if not hasattr(request, 'requestdata'): > request.requestdata = { > 'buildcache': collections.defaultdict(dict), > 'buildercache': set() > } > > I think? Or could something else set requestdata? twisted doesn't seem to have a place to stash application data during a request. so I added requestdata here with the idea that if we needed to stash other stuff later we could just put it here instead of adding more and more attrs.
On 2013/06/30 08:29:54, Mike Stipicevic wrote: > updated > > https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... > File third_party/buildbot_8_4p1/README.chromium (right): > > https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... > third_party/buildbot_8_4p1/README.chromium:11: > On 2013/06/30 08:16:16, iannucci wrote: > > don't need this, I think? > > Done, now sure how that snuck in. > > https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... > File third_party/buildbot_8_4p1/buildbot/status/web/status_json.py (right): > > https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... > third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:650: slave = > buildcache[build_status.getSlavename()] > On 2013/06/30 08:16:16, iannucci wrote: > > how is slavename guaranteed to exist in buildcache at this point? > > collections.defaultdict(dict) > > https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... > third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:666: > On 2013/06/30 08:16:16, iannucci wrote: > > I think this whole block could be: > > > > if not hasattr(request, 'requestdata'): > > request.requestdata = { > > 'buildcache': collections.defaultdict(dict), > > 'buildercache': set() > > } > > > > I think? Or could something else set requestdata? > > twisted doesn't seem to have a place to stash application data during a request. > so I added requestdata here with the idea that if we needed to stash other stuff > later we could just put it here instead of adding more and more attrs. in that case, can we call it data or custom_data or something?
On 2013/06/30 08:33:47, iannucci wrote: > On 2013/06/30 08:29:54, Mike Stipicevic wrote: > > updated > > > > > https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... > > File third_party/buildbot_8_4p1/README.chromium (right): > > > > > https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... > > third_party/buildbot_8_4p1/README.chromium:11: > > On 2013/06/30 08:16:16, iannucci wrote: > > > don't need this, I think? > > > > Done, now sure how that snuck in. > > > > > https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... > > File third_party/buildbot_8_4p1/buildbot/status/web/status_json.py (right): > > > > > https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... > > third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:650: slave = > > buildcache[build_status.getSlavename()] > > On 2013/06/30 08:16:16, iannucci wrote: > > > how is slavename guaranteed to exist in buildcache at this point? > > > > collections.defaultdict(dict) > > > > > https://chromiumcodereview.appspot.com/13619004/diff/19001/third_party/buildb... > > third_party/buildbot_8_4p1/buildbot/status/web/status_json.py:666: > > On 2013/06/30 08:16:16, iannucci wrote: > > > I think this whole block could be: > > > > > > if not hasattr(request, 'requestdata'): > > > request.requestdata = { > > > 'buildcache': collections.defaultdict(dict), > > > 'buildercache': set() > > > } > > > > > > I think? Or could something else set requestdata? > > > > twisted doesn't seem to have a place to stash application data during a > request. > > so I added requestdata here with the idea that if we needed to stash other > stuff > > later we could just put it here instead of adding more and more attrs. > > in that case, can we call it data or custom_data or something? done.
lgtm
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/xusydoc@chromium.org/13619004/28002
Message was sent while issue was closed.
Change committed as 209365
Message was sent while issue was closed.
On 2013/06/30 08:45:21, I haz the power (commit-bot) wrote: > Change committed as 209365 This sounds this should be upstreame?
Message was sent while issue was closed.
On 2013/07/09 16:00:39, Marc-Antoine Ruel wrote: > On 2013/06/30 08:45:21, I haz the power (commit-bot) wrote: > > Change committed as 209365 > > This sounds this should be upstreame? Upstreamed* |