Issue 2650683007: Dashboard - Remove pre-hook logic from alert.py.

shatch

Description was changed from ========== WIP alert.py prehook thing BUG=catapult:# ========== to ========== WIP alert.py ...

3 years, 11 months ago (2017-01-25 19:00:49 UTC) #1

shatch

Description was changed from ========== WIP alert.py prehook thing BUG=catapult:#2771 ========== to ========== Previously alert.py ...

3 years, 11 months ago (2017-01-26 02:33:06 UTC) #8

shatch

Description was changed from ========== Previously alert.py has some pre-hook logic to update associated groups. ...

3 years, 11 months ago (2017-01-26 02:33:50 UTC) #9

shatch

Description was changed from ========== Previously alert.py has some pre-hook logic to update associated groups. ...

3 years, 10 months ago (2017-01-30 18:58:51 UTC) #15

shatch

Description was changed from ========== Previously alert.py has some pre-hook logic to update associated groups. ...

3 years, 10 months ago (2017-01-30 19:00:53 UTC) #16

shatch

Description was changed from ========== Previously alert.py has some pre-hook logic to update associated groups. ...

3 years, 10 months ago (2017-01-30 19:13:39 UTC) #17

shatch

simonhatch@chromium.org changed reviewers: + sullivan@chromium.org

3 years, 10 months ago (2017-01-30 19:14:42 UTC) #18

shatch

Description was changed from ========== Previously alert.py has some pre-hook logic to update associated groups. ...

3 years, 10 months ago (2017-01-30 19:16:44 UTC) #19

shatch

https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/models/alert_group_test.py File dashboard/dashboard/models/alert_group_test.py (left): https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/models/alert_group_test.py#oldcode64 dashboard/dashboard/models/alert_group_test.py:64: # Add these anomalies to groups and put them ...

3 years, 10 months ago (2017-01-30 19:17:20 UTC) #20

sullivan

3 years, 10 months ago (2017-02-02 01:02:29 UTC) #21

lgtm

I wrote out my thoughts on tradeoffs between approaches as I was reading, but
thinking through the code as a whole, this is probably the right balance of
complex vs fast as written

https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo...
File dashboard/dashboard/models/alert_group.py (right):

https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo...
dashboard/dashboard/models/alert_group.py:84: # value to set and using kwargs
let's us easily distringuish betwen
Nit:
lets us easily distinguish between

https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo...
dashboard/dashboard/models/alert_group.py:101: group_futures[a.group.id()] =
a.group.get_async()
I'm curious about the benefit of using get_async here over just putting these in a
list and doing a get_multi(). With both approaches, the gets happen in parallel.
The difference is in the 2nd pass:

* With get_async, you start the 2nd pass a little sooner, but you basically do a
get_result() right away, so it gets a bit more work done until you hit
whichever get turned out to be the long pole.
* With get_multi, you do all the gets in parallel, but don't start the 2nd pass
until they are all finished.

To me the question here is: Is the added complexity of get_async
worth the speed improvement of getting started on the 2nd pass a bit faster? It
seems like the advantage of starting the 2nd pass faster is not in running the
python code sooner, since it is trivial, but more in kicking off some async
queries faster?
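To make the comparison concrete, here is a minimal sketch of the two shapes being discussed. It is not the code under review: ndb is not importable outside App Engine, so the standard library's concurrent.futures stands in for it, and FAKE_GROUPS, fake_get, second_pass_step, multi_style and async_style are all hypothetical names invented for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the datastore: group id -> (latency in s, entity).
FAKE_GROUPS = {
    'g1': (0.01, {'id': 'g1', 'bug_id': 111}),
    'g2': (0.05, {'id': 'g2', 'bug_id': 222}),
    'g3': (0.02, {'id': 'g3', 'bug_id': None}),
}


def fake_get(group_id):
    """Simulates one blocking datastore get with a per-key latency."""
    latency, entity = FAKE_GROUPS[group_id]
    time.sleep(latency)
    return entity


def second_pass_step(entity):
    """Placeholder for the per-group follow-up work (e.g. issuing a query)."""
    return 'query-for-%s' % entity['id']


def multi_style(pool, group_ids):
    """get_multi-like: all gets run in parallel, 2nd pass starts after all finish."""
    entities = list(pool.map(fake_get, group_ids))  # blocks until the slowest get
    return [second_pass_step(e) for e in entities]


def async_style(pool, group_ids):
    """get_async-like: one future per id, each resolved in list order."""
    futures = {gid: pool.submit(fake_get, gid) for gid in group_ids}
    results = []
    for gid in group_ids:
        entity = futures[gid].result()  # blocks only on this particular get
        results.append(second_pass_step(entity))
    return results


with ThreadPoolExecutor(max_workers=4) as pool:
    ids = ['g1', 'g2', 'g3']
    multi_results = multi_style(pool, ids)
    async_results = async_style(pool, ids)
```

Both functions return the same list; the only difference is when the second-pass work for the fast gets is allowed to begin, which is the tradeoff raised above.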

https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo...
dashboard/dashboard/models/alert_group.py:130: # We cahce these rather than grab
get_result() each time because we may
s/cahce/cache/

https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo...
dashboard/dashboard/models/alert_group.py:136: grouped_alerts =
grouped_alerts_futures[a.group.id()].get_result()
Queries are slower than gets, so I didn't realize this until I read this far,
but here I have the opposite thought: it'd be more complex to implement this to
use ndb.wait_any on the futures, but also faster because the queries would
likely return different numbers of alerts, and thus you'd generally have a
pattern where some come in faster than others. I'm hand-waving on the
implementation here; you'd do something like grab the first result that came in,
and then process all the alerts with that group id. WDYT?

(For reference)
https://cloud.google.com/appengine/docs/python/ndb/async#Future_wait_any
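A rough sketch of the "grab whichever result comes in first" pattern, under the same caveat that this is not ndb code: concurrent.futures.wait with FIRST_COMPLETED is used here as a stand-in for wait_any, and FAKE_QUERIES, fake_query and process_in_completion_order are hypothetical names for illustration only.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

# Hypothetical data: group id -> (simulated query latency in s, alerts in group).
FAKE_QUERIES = {
    'g1': (0.06, ['a1', 'a2', 'a3']),
    'g2': (0.01, ['a4']),
    'g3': (0.03, ['a5', 'a6']),
}


def fake_query(group_id):
    """Simulates a query whose latency differs per group."""
    latency, alerts = FAKE_QUERIES[group_id]
    time.sleep(latency)
    return alerts


def process_in_completion_order(pool, group_ids):
    """Handles each group's alerts as soon as that group's query finishes."""
    pending = {pool.submit(fake_query, gid): gid for gid in group_ids}
    processed = {}
    while pending:
        # Block until at least one outstanding query has completed.
        done, _ = wait(list(pending), return_when=FIRST_COMPLETED)
        for future in done:
            gid = pending.pop(future)
            # Stand-in for "process all the alerts with that group id".
            processed[gid] = len(future.result())
    return processed


with ThreadPoolExecutor(max_workers=4) as pool:
    processed = process_in_completion_order(pool, ['g1', 'g2', 'g3'])
```

The extra bookkeeping (a future-to-group map plus a loop over whatever finished) is the added complexity mentioned above; the payoff is that no group waits on a slower, unrelated query.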

https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo...
File dashboard/dashboard/models/alert_group_test.py (left):

https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo...
dashboard/dashboard/models/alert_group_test.py:64: # Add these anomalies to
groups and put them again. When anomalies are
On 2017/01/30 19:17:20, shatch wrote:
> I don't think this comment was correct. The old logic in
> alert.Alert._pre_put_hook explicitly checks for changes to the alert
> (specifically bug_id or revision_ranges). Since none of these ops change
> those, the extra put() shouldn't trigger any of that additional logic.

Agreed.

shatch

3 years, 10 months ago (2017-02-02 16:10:42 UTC) #22

https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo...
File dashboard/dashboard/models/alert_group.py (right):

https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo...
dashboard/dashboard/models/alert_group.py:84: # value to set and using kwargs
let's us easily distringuish betwen
On 2017/02/02 01:02:28, sullivan wrote:
> Nit:
> lets us easily distinguish between

Done.

https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo...
dashboard/dashboard/models/alert_group.py:101: group_futures[a.group.id()] =
a.group.get_async()
On 2017/02/02 01:02:28, sullivan wrote:
> I'm curious about the benefit of using get_async here over just putting these
> in a list and doing a get_multi(). With both approaches, the gets happen in
> parallel. The difference is in the 2nd pass:
> 
> * With get_async, you start the 2nd pass a little sooner, but you basically do
> a get_result() right away, so it gets a bit more work done until you hit
> whichever get turned out to be the long pole.
> * With get_multi, you do all the gets in parallel, but don't start the 2nd
> pass until they are all finished.
> 
> To me the question here is: Is the added complexity of get_async worth the
> speed improvement of getting started on the 2nd pass a bit faster? It
> seems like the advantage of starting the 2nd pass faster is not in running the
> python code sooner, since it is trivial, but more in kicking off some async
> queries faster?

Pretty much. I figured at a high level there were 3 general approaches:

1. Using get_multi: easiest to implement, but you wait for the worst case.
2. This implementation: trades off some complexity vs get_multi to kick off
asyncs earlier, so you don't necessarily have to wait for the worst case to
start sending queries out.
3. Using wait_any: lets you wait the minimum, but is possibly a more complex
implementation (didn't try).

https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo...
dashboard/dashboard/models/alert_group.py:130: # We cahce these rather than grab
get_result() each time because we may
On 2017/02/02 01:02:28, sullivan wrote:
> s/cahce/cache/

Done.

https://codereview.chromium.org/2650683007/diff/220001/dashboard/dashboard/mo...
dashboard/dashboard/models/alert_group.py:136: grouped_alerts =
grouped_alerts_futures[a.group.id()].get_result()
On 2017/02/02 01:02:28, sullivan wrote:
> Queries are slower than gets, so I didn't realize this until I read this far,
> but here I have the opposite thought: it'd be more complex to implement this
> to use ndb.wait_any on the futures, but also faster because the queries would
> likely return different numbers of alerts, and thus you'd generally have a
> pattern where some come in faster than others. I'm hand-waving on the
> implementation here; you'd do something like grab the first result that came
> in, and then process all the alerts with that group id. WDYT?
> 
> (For reference)
> https://cloud.google.com/appengine/docs/python/ndb/async#Future_wait_any

Yeah, I actually considered going that route, but figured there was lower-hanging
fruit around this code than updating the alerts (i.e. we could just use the async
API for communicating with issue_tracker).

I was looking at /add_point_queue and thinking a pattern like that would work
there though, with a bunch of dependent reads on master/bot/test entities before
the row puts. Since /add_point_queue doesn't do "other" stuff like communicating
with issue_tracker, it could be worth pursuing there as it's a hot path.

Looks like there may be some lower-hanging fruit in there though, since it looks
like find_anomalies is completely serial.

sullivan

lgtm +eakuefner since he's interested in these performance discussions. But I think you made the ...

3 years, 10 months ago (2017-02-02 17:01:30 UTC) #23

shatch

The CQ bit was checked by simonhatch@chromium.org to run a CQ dry run

3 years, 10 months ago (2017-02-02 17:59:43 UTC) #24

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2650683007/240001

3 years, 10 months ago (2017-02-02 17:59:46 UTC) #25

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 10 months ago (2017-02-02 18:25:58 UTC) #26

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

3 years, 10 months ago (2017-02-02 18:25:59 UTC) #27

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2650683007/240001

3 years, 10 months ago (2017-02-03 14:38:13 UTC) #29

commit-bot: I haz the power

CQ is committing da patch. Bot data: {"patchset_id": 240001, "attempt_start_ts": 1486132682357080, "parent_rev": "4dd2f587ffc33c062c8b0308567fdfbb27d6dc6c", "commit_rev": "20c5e0b31c76cfbcf8ee0f629fe4c3069367f3c4"}

3 years, 10 months ago (2017-02-03 14:39:50 UTC) #30

commit-bot: I haz the power

Description was changed from ========== Previously alert.py has some pre-hook logic to update associated groups. ...

3 years, 10 months ago (2017-02-03 14:39:54 UTC) #31

commit-bot: I haz the power

Committed patchset #2 (id:240001) as https://chromium.googlesource.com/external/github.com/catapult-project/catapult/+/20c5e0b31c76cfbcf8ee0f629fe4c3069367f3c4

3 years, 10 months ago (2017-02-03 14:39:55 UTC) #32

shatch

A revert of this CL (patchset #2 id:240001) has been created in https://codereview.chromium.org/2670003006/ by simonhatch@chromium.org. ...

3 years, 10 months ago (2017-02-03 16:57:46 UTC) #33

shatch

Description was changed from ========== Previously alert.py has some pre-hook logic to update associated groups. ...

3 years, 10 months ago (2017-02-03 17:04:42 UTC) #34

shatch

The CQ bit was checked by simonhatch@chromium.org to run a CQ dry run

3 years, 10 months ago (2017-02-03 17:04:48 UTC) #35

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2650683007/260001

3 years, 10 months ago (2017-02-03 17:04:59 UTC) #36

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 10 months ago (2017-02-03 17:25:33 UTC) #37

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

3 years, 10 months ago (2017-02-03 17:25:33 UTC) #38

shatch

The CQ bit was checked by simonhatch@chromium.org to run a CQ dry run

3 years, 10 months ago (2017-02-03 19:35:32 UTC) #39

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2650683007/260001

3 years, 10 months ago (2017-02-03 19:35:43 UTC) #40

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 10 months ago (2017-02-03 19:37:13 UTC) #41

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

3 years, 10 months ago (2017-02-03 19:37:14 UTC) #42

shatch

The CQ bit was checked by simonhatch@chromium.org to run a CQ dry run

3 years, 10 months ago (2017-02-03 19:50:54 UTC) #43

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2650683007/260001

3 years, 10 months ago (2017-02-03 19:51:03 UTC) #44

shatch

The CQ bit was checked by simonhatch@chromium.org to run a CQ dry run

3 years, 10 months ago (2017-02-03 19:51:52 UTC) #46

shatch

The CQ bit was checked by simonhatch@chromium.org to run a CQ dry run

3 years, 10 months ago (2017-02-03 19:52:08 UTC) #48

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2650683007/260001

3 years, 10 months ago (2017-02-03 19:52:13 UTC) #49

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 10 months ago (2017-02-03 19:53:53 UTC) #50

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

3 years, 10 months ago (2017-02-03 19:53:54 UTC) #51

shatch

The patchset sent to the CQ was uploaded after l-g-t-m from sullivan@chromium.org Link to the ...

3 years, 10 months ago (2017-02-03 20:28:50 UTC) #53

shatch

On 2017/02/03 16:57:46, shatch wrote: > A revert of this CL (patchset #2 id:240001) has ...

3 years, 10 months ago (2017-02-03 20:30:32 UTC) #55

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2650683007/260001

3 years, 10 months ago (2017-02-03 20:30:55 UTC) #57

commit-bot: I haz the power

CQ is committing da patch. Bot data: {"patchset_id": 260001, "attempt_start_ts": 1486153842932700, "parent_rev": "9d4491ddf17cbee8465eeac6b2bc7840dc4f4c4a", "commit_rev": "289d83a49d829f8bbf490667f53bb20077c5f5b7"}

3 years, 10 months ago (2017-02-03 20:32:32 UTC) #58

commit-bot: I haz the power

Description was changed from ========== Previously alert.py has some pre-hook logic to update associated groups. ...

3 years, 10 months ago (2017-02-03 20:32:35 UTC) #59

commit-bot: I haz the power

3 years, 10 months ago (2017-02-03 20:32:37 UTC) #60

Message was sent while issue was closed.

Committed patchset #3 (id:260001) as
https://chromium.googlesource.com/external/github.com/catapult-project/catapu...

Patch Set 1

Patch Set 2: Addressed comments.

Patch Set 3: Fix race in delete.