|
|
Chromium Code Reviews
DescriptionPreviously alert.py has some pre-hook logic to update associated groups. This involved several sync gets + query, with some potential puts and deletes. The newer implementation strips the pre-hook out in favour of heavily batching async calls when alerts are updated.
So ran through StackDriver re-triaging a bug with 53 alerts (did the new one first so that any memcache boosts would benefit the old one not the new one), gathered these timings:
Demo: https://dev-simonhatch-66cc9519-dot-chromeperf.appspot.com/
Old:
@479 ms
https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issues?alt=json&sendEmail=true
<Modify Alerts>
@10,430 ms
https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issues/685469/comments?alt=json&sendEmail=true
@11,354 ms
https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?userIp=2620%3A0%3A1000%3A1409%3A4c33%3Ac669%3A2634%3Af3bd
New:
@770 ms
https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issues?alt=json&sendEmail=true
<Modify Alerts>
@2,072 ms
https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issues/685468/comments?alt=json&sendEmail=true
@3,121 ms
https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?userIp=2620%3A0%3A1000%3A1409%3A4c33%3Ac669%3A2634%3Af3bd
This would put the old implementation at about 9951 ms vs 1302 ms in this particular run if I'm understanding these traces correctly.
Could probably make some of these endpoints faster by returning futures from alert_group.ModifyAlertsAndAssociatedGroups() and weaving in with the surrounding functions (like in /file_bug we could theoretically async the issue_tracker stuff).
Also might want to file an issue after to refactor some of this Alert/AlertGroup code. Seems a bit brittle that you have these associated groups that need to be updated whenever attributes on an alert are modified.
BUG=catapult:#2771
Review-Url: https://codereview.chromium.org/2650683007
Committed: https://chromium.googlesource.com/external/github.com/catapult-project/catapult/+/20c5e0b31c76cfbcf8ee0f629fe4c3069367f3c4
Review-Url: https://codereview.chromium.org/2650683007
Committed: https://chromium.googlesource.com/external/github.com/catapult-project/catapult/+/289d83a49d829f8bbf490667f53bb20077c5f5b7
Patch Set 1 #
Total comments: 10
Patch Set 2 : Addressed comments. #Patch Set 3 : Fix race in delete. #
Messages
Total messages: 60 (50 generated)
Description was changed from ========== WIP alert.py prehook thing BUG=catapult:# ========== to ========== WIP alert.py prehook thing BUG=catapult:#2771 ==========
Patchset #4 (id:60001) has been deleted
Patchset #4 (id:80001) has been deleted
Patchset #4 (id:100001) has been deleted
Patchset #3 (id:40001) has been deleted
Patchset #2 (id:20001) has been deleted
Patchset #1 (id:1) has been deleted
Description was changed from ========== WIP alert.py prehook thing BUG=catapult:#2771 ========== to ========== Previously alert.py has some pre-hook logic to update associated groups. This involved several sync gets + query, with some potential puts and deletes. The newer implementation strips the pre-hook out in favour of heavily batching async calls when alerts are updated. So ran through StackDriver re-triaging a bug with 53 alerts, gathered these timings: Old: @479 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @10,430 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @11,354 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... New: @770 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @2,072 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @3,121 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... This would put the old implementation at about 9951 ms vs 1302 ms for newer implementation. BUG=catapult:#2771 ==========
Description was changed from ========== Previously alert.py has some pre-hook logic to update associated groups. This involved several sync gets + query, with some potential puts and deletes. The newer implementation strips the pre-hook out in favour of heavily batching async calls when alerts are updated. So ran through StackDriver re-triaging a bug with 53 alerts, gathered these timings: Old: @479 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @10,430 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @11,354 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... New: @770 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @2,072 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @3,121 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... This would put the old implementation at about 9951 ms vs 1302 ms for newer implementation. BUG=catapult:#2771 ========== to ========== Previously alert.py has some pre-hook logic to update associated groups. This involved several sync gets + query, with some potential puts and deletes. The newer implementation strips the pre-hook out in favour of heavily batching async calls when alerts are updated. So ran through StackDriver re-triaging a bug with 53 alerts, gathered these timings: Old: @479 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @10,430 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @11,354 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... New: @770 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @2,072 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @3,121 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... This would put the old implementation at about 9951 ms vs 1302 ms for newer implementation if I'm understanding these traces correctly. BUG=catapult:#2771 ==========
Patchset #1 (id:120001) has been deleted
Patchset #1 (id:140001) has been deleted
Patchset #1 (id:160001) has been deleted
Patchset #1 (id:180001) has been deleted
Patchset #1 (id:200001) has been deleted
Description was changed from ========== Previously alert.py has some pre-hook logic to update associated groups. This involved several sync gets + query, with some potential puts and deletes. The newer implementation strips the pre-hook out in favour of heavily batching async calls when alerts are updated. So ran through StackDriver re-triaging a bug with 53 alerts, gathered these timings: Old: @479 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @10,430 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @11,354 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... New: @770 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @2,072 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @3,121 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... This would put the old implementation at about 9951 ms vs 1302 ms for newer implementation if I'm understanding these traces correctly. BUG=catapult:#2771 ========== to ========== Previously alert.py has some pre-hook logic to update associated groups. This involved several sync gets + query, with some potential puts and deletes. The newer implementation strips the pre-hook out in favour of heavily batching async calls when alerts are updated. So ran through StackDriver re-triaging a bug with 53 alerts, gathered these timings: Old: @479 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @10,430 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @11,354 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... New: @770 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @2,072 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @3,121 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... This would put the old implementation at about 9951 ms vs 1302 ms for newer implementation if I'm understanding these traces correctly. Could probably make some of these endpoints faster by returning futures from alert_group.ModifyAlertsAndAssociatedGroups() and weaving in with the surrounding functions (like in /file_bug we could theoretically async the issue_tracker stuff). BUG=catapult:#2771 ==========
Description was changed from ========== Previously alert.py has some pre-hook logic to update associated groups. This involved several sync gets + query, with some potential puts and deletes. The newer implementation strips the pre-hook out in favour of heavily batching async calls when alerts are updated. So ran through StackDriver re-triaging a bug with 53 alerts, gathered these timings: Old: @479 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @10,430 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @11,354 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... New: @770 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @2,072 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @3,121 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... This would put the old implementation at about 9951 ms vs 1302 ms for newer implementation if I'm understanding these traces correctly. Could probably make some of these endpoints faster by returning futures from alert_group.ModifyAlertsAndAssociatedGroups() and weaving in with the surrounding functions (like in /file_bug we could theoretically async the issue_tracker stuff). BUG=catapult:#2771 ========== to ========== Previously alert.py has some pre-hook logic to update associated groups. This involved several sync gets + query, with some potential puts and deletes. The newer implementation strips the pre-hook out in favour of heavily batching async calls when alerts are updated. So ran through StackDriver re-triaging a bug with 53 alerts (did the new one first so that any memcache boosts would benefit the old one not the new one), gathered these timings: Old: @479 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @10,430 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @11,354 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... New: @770 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @2,072 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @3,121 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... This would put the old implementation at about 9951 ms vs 1302 ms in this particular run if I'm understanding these traces correctly. Could probably make some of these endpoints faster by returning futures from alert_group.ModifyAlertsAndAssociatedGroups() and weaving in with the surrounding functions (like in /file_bug we could theoretically async the issue_tracker stuff). BUG=catapult:#2771 ==========
Description was changed from ========== Previously alert.py has some pre-hook logic to update associated groups. This involved several sync gets + query, with some potential puts and deletes. The newer implementation strips the pre-hook out in favour of heavily batching async calls when alerts are updated. So ran through StackDriver re-triaging a bug with 53 alerts (did the new one first so that any memcache boosts would benefit the old one not the new one), gathered these timings: Old: @479 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @10,430 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @11,354 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... New: @770 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @2,072 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @3,121 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... This would put the old implementation at about 9951 ms vs 1302 ms in this particular run if I'm understanding these traces correctly. Could probably make some of these endpoints faster by returning futures from alert_group.ModifyAlertsAndAssociatedGroups() and weaving in with the surrounding functions (like in /file_bug we could theoretically async the issue_tracker stuff). BUG=catapult:#2771 ========== to ========== Previously alert.py has some pre-hook logic to update associated groups. This involved several sync gets + query, with some potential puts and deletes. The newer implementation strips the pre-hook out in favour of heavily batching async calls when alerts are updated. So ran through StackDriver re-triaging a bug with 53 alerts (did the new one first so that any memcache boosts would benefit the old one not the new one), gathered these timings: Old: @479 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @10,430 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @11,354 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... New: @770 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @2,072 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @3,121 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... This would put the old implementation at about 9951 ms vs 1302 ms in this particular run if I'm understanding these traces correctly. Could probably make some of these endpoints faster by returning futures from alert_group.ModifyAlertsAndAssociatedGroups() and weaving in with the surrounding functions (like in /file_bug we could theoretically async the issue_tracker stuff). Also might want to file an issue after to refactor some of this Alert/AlertGroup code. Seems a bit brittle that you have these associated groups that need to be updated whenever attributes on an alert are modified. BUG=catapult:#2771 ==========
simonhatch@chromium.org changed reviewers: + sullivan@chromium.org
Description was changed from ========== Previously alert.py has some pre-hook logic to update associated groups. This involved several sync gets + query, with some potential puts and deletes. The newer implementation strips the pre-hook out in favour of heavily batching async calls when alerts are updated. So ran through StackDriver re-triaging a bug with 53 alerts (did the new one first so that any memcache boosts would benefit the old one not the new one), gathered these timings: Old: @479 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @10,430 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @11,354 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... New: @770 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @2,072 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @3,121 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... This would put the old implementation at about 9951 ms vs 1302 ms in this particular run if I'm understanding these traces correctly. Could probably make some of these endpoints faster by returning futures from alert_group.ModifyAlertsAndAssociatedGroups() and weaving in with the surrounding functions (like in /file_bug we could theoretically async the issue_tracker stuff). Also might want to file an issue after to refactor some of this Alert/AlertGroup code. Seems a bit brittle that you have these associated groups that need to be updated whenever attributes on an alert are modified. BUG=catapult:#2771 ========== to ========== Previously alert.py has some pre-hook logic to update associated groups. This involved several sync gets + query, with some potential puts and deletes. The newer implementation strips the pre-hook out in favour of heavily batching async calls when alerts are updated. So ran through StackDriver re-triaging a bug with 53 alerts (did the new one first so that any memcache boosts would benefit the old one not the new one), gathered these timings: Demo: https://dev-simonhatch-66cc9519-dot-chromeperf.appspot.com/ Old: @479 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @10,430 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @11,354 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... New: @770 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @2,072 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @3,121 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... This would put the old implementation at about 9951 ms vs 1302 ms in this particular run if I'm understanding these traces correctly. Could probably make some of these endpoints faster by returning futures from alert_group.ModifyAlertsAndAssociatedGroups() and weaving in with the surrounding functions (like in /file_bug we could theoretically async the issue_tracker stuff). Also might want to file an issue after to refactor some of this Alert/AlertGroup code. Seems a bit brittle that you have these associated groups that need to be updated whenever attributes on an alert are modified. BUG=catapult:#2771 ==========
https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo... File dashboard/dashboard/models/alert_group_test.py (left): https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo... dashboard/dashboard/models/alert_group_test.py:64: # Add these anomalies to groups and put them again. When anomalies are I don't think this comment was correct. The old logic in alert.Alert._pre_put_hook explicitly checks for changes to the alert (specifically bug_id or revision_ranges). Since none of these ops change those, the extra put() shouldn't trigger any of that additional logic.
lgtm I wrote out my thoughts on tradeoffs between approaches as I was reading, but thinking through the code as a whole, this is probably the right balance of complex vs fast as written https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo... File dashboard/dashboard/models/alert_group.py (right): https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo... dashboard/dashboard/models/alert_group.py:84: # value to set and using kwargs let's us easily distringuish betwen Nit: lets us easily distinguish between https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo... dashboard/dashboard/models/alert_group.py:101: group_futures[a.group.id()] = a.group.get_async() I'm curious on the benefit of using get_async here over just putting these in a list and doing a get_multi(). With both approaches, the gets happen in parallel. Then the difference is in the 2nd pass: * With get_async, you start the 2nd pass a little sooner. but you basically do a get_result() right away , and so it gets a bit more work done until you hit whichever get turned out to be the long pole * With get_multi, you do all the gets in parallel, but don't start the 2nd pass until they are all finished. To me the question here is: Is the tradeoff between the complexity of get_async worth the speed improvement of getting started on the 2nd pass a bit faster? It seems like the advantage of starting the 2nd pass faster is not in running the python code sooner, since it is trivial, but more in kicking off some async queries faster? https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo... dashboard/dashboard/models/alert_group.py:130: # We cahce these rather than grab get_result() each time because we may s/cahce/cache/ https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo... dashboard/dashboard/models/alert_group.py:136: grouped_alerts = grouped_alerts_futures[a.group.id()].get_result() Queries are slower than gets, so I didn't realize this until I read this far, but here I have the opposite thought: it'd be more complex to implement this to use ndb.wait_any on the futures, but also faster because the queries would likely return different numbers of alerts, and thus you'd generally have a pattern where some come in faster than others. I'm hand-waving on the implementation here; you'd do something like grab the first result that came in, and then process all the alerts with that group id. WDYT? (For reference) https://cloud.google.com/appengine/docs/python/ndb/async#Future_wait_any https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo... File dashboard/dashboard/models/alert_group_test.py (left): https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo... dashboard/dashboard/models/alert_group_test.py:64: # Add these anomalies to groups and put them again. When anomalies are On 2017/01/30 19:17:20, shatch wrote: > I don't think this comment was correct. The old logic in > alert.Alert._pre_put_hook explicitly checks for changes to the alert > (specifically bug_id or revision_ranges). Since none of these ops change those, > the extra put() shouldn't trigger any of that additional logic. Agreed.
https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo... File dashboard/dashboard/models/alert_group.py (right): https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo... dashboard/dashboard/models/alert_group.py:84: # value to set and using kwargs let's us easily distringuish betwen On 2017/02/02 01:02:28, sullivan wrote: > Nit: > lets us easily distinguish between Done. https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo... dashboard/dashboard/models/alert_group.py:101: group_futures[a.group.id()] = a.group.get_async() On 2017/02/02 01:02:28, sullivan wrote: > I'm curious on the benefit of using get_async here over just putting these in a > list and doing a get_multi(). With both approaches, the gets happen in parallel. > Then the difference is in the 2nd pass: > > * With get_async, you start the 2nd pass a little sooner. but you basically do a > get_result() right away , and so it gets a bit more work done until you hit > whichever get turned out to be the long pole > * With get_multi, you do all the gets in parallel, but don't start the 2nd pass > until they are all finished. > > To me the question here is: Is the tradeoff between the complexity of get_async > worth the speed improvement of getting started on the 2nd pass a bit faster? It > seems like the advantage of starting the 2nd pass faster is not in running the > python code sooner, since it is trivial, but more in kicking off some async > queries faster? Pretty much, I figured at a high level there were 3 general approaches: 1. Using get_multi, easiest to implement, but you wait for worst case. 2. This implementation, tradeoff some complexity vs get_multi to kick-off asyncs earlier, don't necessarily have to wait for the worst case to start sending queries out. 3. Using wait_any, let's you wait the minimum but possibly more complex implementation (didn't try). https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo... dashboard/dashboard/models/alert_group.py:130: # We cahce these rather than grab get_result() each time because we may On 2017/02/02 01:02:28, sullivan wrote: > s/cahce/cache/ Done. https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo... dashboard/dashboard/models/alert_group.py:136: grouped_alerts = grouped_alerts_futures[a.group.id()].get_result() On 2017/02/02 01:02:28, sullivan wrote: > Queries are slower than gets, so I didn't realize this until I read this far, > but here I have the opposite thought: it'd be more complex to implement this to > use ndb.wait_any on the futures, but also faster because the queries would > likely return different numbers of alerts, and thus you'd generally have a > pattern where some come in faster than others. I'm hand-waving on the > implementation here; you'd do something like grab the first result that came in, > and then process all the alerts with that group id. WDYT? > > (For reference) > https://cloud.google.com/appengine/docs/python/ndb/async#Future_wait_any Yeah I actually considered going that route, but figured there was lower hanging fruit around this code than updating the alerts (ie. we could just use the async api for communicating with issue_tracker). I was looking at /add_point_queue and thinking a pattern like that would work there though, with a bunch of dependent reads on master/bot/test entities before the row puts. Since /add_point_queue doesn't do "other" stuff like communicating with issue_tracker, could be worth pursuing there as it's a hot path. Looks like there may be some lower hanging fruit in there though since it looks like find_anomalies is completely serial.
lgtm +eakuefner since he's interested in these performance discussions. But I think you made the right tradeoffs here, lgtm to submit.
The CQ bit was checked by simonhatch@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by simonhatch@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
CQ is committing da patch.
Bot data: {"patchset_id": 240001, "attempt_start_ts": 1486132682357080,
"parent_rev": "4dd2f587ffc33c062c8b0308567fdfbb27d6dc6c", "commit_rev":
"20c5e0b31c76cfbcf8ee0f629fe4c3069367f3c4"}
Message was sent while issue was closed.
Description was changed from ========== Previously alert.py has some pre-hook logic to update associated groups. This involved several sync gets + query, with some potential puts and deletes. The newer implementation strips the pre-hook out in favour of heavily batching async calls when alerts are updated. So ran through StackDriver re-triaging a bug with 53 alerts (did the new one first so that any memcache boosts would benefit the old one not the new one), gathered these timings: Demo: https://dev-simonhatch-66cc9519-dot-chromeperf.appspot.com/ Old: @479 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @10,430 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @11,354 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... New: @770 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @2,072 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @3,121 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... This would put the old implementation at about 9951 ms vs 1302 ms in this particular run if I'm understanding these traces correctly. Could probably make some of these endpoints faster by returning futures from alert_group.ModifyAlertsAndAssociatedGroups() and weaving in with the surrounding functions (like in /file_bug we could theoretically async the issue_tracker stuff). Also might want to file an issue after to refactor some of this Alert/AlertGroup code. Seems a bit brittle that you have these associated groups that need to be updated whenever attributes on an alert are modified. BUG=catapult:#2771 ========== to ========== Previously alert.py has some pre-hook logic to update associated groups. This involved several sync gets + query, with some potential puts and deletes. The newer implementation strips the pre-hook out in favour of heavily batching async calls when alerts are updated. So ran through StackDriver re-triaging a bug with 53 alerts (did the new one first so that any memcache boosts would benefit the old one not the new one), gathered these timings: Demo: https://dev-simonhatch-66cc9519-dot-chromeperf.appspot.com/ Old: @479 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @10,430 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @11,354 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... New: @770 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @2,072 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @3,121 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... This would put the old implementation at about 9951 ms vs 1302 ms in this particular run if I'm understanding these traces correctly. Could probably make some of these endpoints faster by returning futures from alert_group.ModifyAlertsAndAssociatedGroups() and weaving in with the surrounding functions (like in /file_bug we could theoretically async the issue_tracker stuff). Also might want to file an issue after to refactor some of this Alert/AlertGroup code. Seems a bit brittle that you have these associated groups that need to be updated whenever attributes on an alert are modified. BUG=catapult:#2771 Review-Url: https://codereview.chromium.org/2650683007 Committed: https://chromium.googlesource.com/external/github.com/catapult-project/catapu... ==========
Message was sent while issue was closed.
Committed patchset #2 (id:240001) as https://chromium.googlesource.com/external/github.com/catapult-project/catapu...
Message was sent while issue was closed.
A revert of this CL (patchset #2 id:240001) has been created in https://codereview.chromium.org/2670003006/ by simonhatch@chromium.org. The reason for reverting is: Fixing flakey test..
Message was sent while issue was closed.
Description was changed from ========== Previously alert.py has some pre-hook logic to update associated groups. This involved several sync gets + query, with some potential puts and deletes. The newer implementation strips the pre-hook out in favour of heavily batching async calls when alerts are updated. So ran through StackDriver re-triaging a bug with 53 alerts (did the new one first so that any memcache boosts would benefit the old one not the new one), gathered these timings: Demo: https://dev-simonhatch-66cc9519-dot-chromeperf.appspot.com/ Old: @479 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @10,430 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @11,354 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... New: @770 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @2,072 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @3,121 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... This would put the old implementation at about 9951 ms vs 1302 ms in this particular run if I'm understanding these traces correctly. Could probably make some of these endpoints faster by returning futures from alert_group.ModifyAlertsAndAssociatedGroups() and weaving in with the surrounding functions (like in /file_bug we could theoretically async the issue_tracker stuff). Also might want to file an issue after to refactor some of this Alert/AlertGroup code. Seems a bit brittle that you have these associated groups that need to be updated whenever attributes on an alert are modified. BUG=catapult:#2771 Review-Url: https://codereview.chromium.org/2650683007 Committed: https://chromium.googlesource.com/external/github.com/catapult-project/catapu... ========== to ========== Previously alert.py has some pre-hook logic to update associated groups. This involved several sync gets + query, with some potential puts and deletes. The newer implementation strips the pre-hook out in favour of heavily batching async calls when alerts are updated. So ran through StackDriver re-triaging a bug with 53 alerts (did the new one first so that any memcache boosts would benefit the old one not the new one), gathered these timings: Demo: https://dev-simonhatch-66cc9519-dot-chromeperf.appspot.com/ Old: @479 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @10,430 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @11,354 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... New: @770 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @2,072 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @3,121 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... This would put the old implementation at about 9951 ms vs 1302 ms in this particular run if I'm understanding these traces correctly. Could probably make some of these endpoints faster by returning futures from alert_group.ModifyAlertsAndAssociatedGroups() and weaving in with the surrounding functions (like in /file_bug we could theoretically async the issue_tracker stuff). Also might want to file an issue after to refactor some of this Alert/AlertGroup code. Seems a bit brittle that you have these associated groups that need to be updated whenever attributes on an alert are modified. BUG=catapult:#2771 Review-Url: https://codereview.chromium.org/2650683007 Committed: https://chromium.googlesource.com/external/github.com/catapult-project/catapu... ==========
The CQ bit was checked by simonhatch@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by simonhatch@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by simonhatch@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by simonhatch@chromium.org
The CQ bit was checked by simonhatch@chromium.org to run a CQ dry run
The CQ bit was unchecked by simonhatch@chromium.org
The CQ bit was checked by simonhatch@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by simonhatch@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from sullivan@chromium.org Link to the patchset: https://codereview.chromium.org/2650683007/#ps260001 (title: "Fix race in delete.")
The CQ bit was unchecked by simonhatch@chromium.org
On 2017/02/03 16:57:46, shatch wrote: > A revert of this CL (patchset #2 id:240001) has been created in > https://codereview.chromium.org/2670003006/ by mailto:simonhatch@chromium.org. > > The reason for reverting is: Fixing flakey test.. Ok test should be fixed, ran locally and via cq bunch of times. Assuming still ok to land.
The CQ bit was checked by simonhatch@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
CQ is committing da patch.
Bot data: {"patchset_id": 260001, "attempt_start_ts": 1486153842932700,
"parent_rev": "9d4491ddf17cbee8465eeac6b2bc7840dc4f4c4a", "commit_rev":
"289d83a49d829f8bbf490667f53bb20077c5f5b7"}
Message was sent while issue was closed.
Description was changed from ========== Previously alert.py has some pre-hook logic to update associated groups. This involved several sync gets + query, with some potential puts and deletes. The newer implementation strips the pre-hook out in favour of heavily batching async calls when alerts are updated. So ran through StackDriver re-triaging a bug with 53 alerts (did the new one first so that any memcache boosts would benefit the old one not the new one), gathered these timings: Demo: https://dev-simonhatch-66cc9519-dot-chromeperf.appspot.com/ Old: @479 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @10,430 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @11,354 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... New: @770 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @2,072 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @3,121 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... This would put the old implementation at about 9951 ms vs 1302 ms in this particular run if I'm understanding these traces correctly. Could probably make some of these endpoints faster by returning futures from alert_group.ModifyAlertsAndAssociatedGroups() and weaving in with the surrounding functions (like in /file_bug we could theoretically async the issue_tracker stuff). Also might want to file an issue after to refactor some of this Alert/AlertGroup code. Seems a bit brittle that you have these associated groups that need to be updated whenever attributes on an alert are modified. BUG=catapult:#2771 Review-Url: https://codereview.chromium.org/2650683007 Committed: https://chromium.googlesource.com/external/github.com/catapult-project/catapu... ========== to ========== Previously alert.py has some pre-hook logic to update associated groups. This involved several sync gets + query, with some potential puts and deletes. The newer implementation strips the pre-hook out in favour of heavily batching async calls when alerts are updated. So ran through StackDriver re-triaging a bug with 53 alerts (did the new one first so that any memcache boosts would benefit the old one not the new one), gathered these timings: Demo: https://dev-simonhatch-66cc9519-dot-chromeperf.appspot.com/ Old: @479 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @10,430 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @11,354 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... New: @770 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... <Modify Alerts> @2,072 ms https://monorail-prod.appspot.com/_ah/api/monorail/v1/projects/chromium/issue... @3,121 ms https://cr-buildbucket.appspot.com/api/discovery/v1/apis/buildbucket/v1/rest?... This would put the old implementation at about 9951 ms vs 1302 ms in this particular run if I'm understanding these traces correctly. Could probably make some of these endpoints faster by returning futures from alert_group.ModifyAlertsAndAssociatedGroups() and weaving in with the surrounding functions (like in /file_bug we could theoretically async the issue_tracker stuff). Also might want to file an issue after to refactor some of this Alert/AlertGroup code. Seems a bit brittle that you have these associated groups that need to be updated whenever attributes on an alert are modified. BUG=catapult:#2771 Review-Url: https://codereview.chromium.org/2650683007 Committed: https://chromium.googlesource.com/external/github.com/catapult-project/catapu... Review-Url: https://codereview.chromium.org/2650683007 Committed: https://chromium.googlesource.com/external/github.com/catapult-project/catapu... ==========
Message was sent while issue was closed.
Committed patchset #3 (id:260001) as https://chromium.googlesource.com/external/github.com/catapult-project/catapu... |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
