Created: 4 years, 1 month ago by vakh (use Gerrit instead)
Modified: 4 years, 1 month ago
CC: chromium-reviews, vakh+watch_chromium.org
Target Ref: refs/pending/heads/master
Project: chromium
Visibility: Public.
Description:
Handle timeout for update requests.
1. Starts 30 second timer when the request to fetch updates is sent.
2. If a response is received within that time, the timer is reset.
3. Otherwise, the outstanding request is cancelled and another scheduled after the same duration as the one cancelled.
BUG=543161
Committed: https://crrev.com/1074a4f1111494a115606cecf8363abd63e9bb49
Cr-Commit-Position: refs/heads/master@{#431475}
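The three numbered steps in the description can be sketched as a small, Chromium-free state machine. This is only an illustrative model of the policy under discussion, not the actual V4UpdateProtocolManager; the UpdateRequestTracker class and its method names are invented for the example, and time is injected so the logic is checkable without a real timer:

```cpp
#include <cassert>
#include <chrono>
#include <optional>

using Seconds = std::chrono::seconds;

// Illustrative sketch of the timeout policy from the CL description.
class UpdateRequestTracker {
 public:
  static constexpr Seconds kTimeout{30};

  // Step 1: the update request is sent; remember when, arming the timeout.
  void OnRequestSent(Seconds now) { sent_at_ = now; }

  // Step 2: a response arrived within the window, so the timer is reset.
  // Returns false if there was no outstanding request or it already timed out.
  bool OnResponse(Seconds now) {
    if (!sent_at_ || now - *sent_at_ > kTimeout) return false;
    sent_at_.reset();
    return true;
  }

  // Step 3: the timer fired with no response; cancel the outstanding request
  // and return when the next one should be scheduled (after the regular
  // next_update_interval, rather than retrying immediately).
  std::optional<Seconds> OnTimerFired(Seconds now, Seconds next_update_interval) {
    if (!sent_at_ || now - *sent_at_ <= kTimeout) return std::nullopt;
    sent_at_.reset();                   // Cancel the outstanding request.
    return now + next_update_interval;  // Reschedule, don't hammer the server.
  }

 private:
  std::optional<Seconds> sent_at_;  // Set while a request is outstanding.
};
```

Note that OnTimerFired models the behavior that patch set 2 settled on (reschedule after next_update_interval_), not the original patch set 1 behavior of immediately issuing another request.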
Patch Set 1 #
Total comments: 8
Patch Set 2 : On timeout, schedule an update request after next_update_interval_ #
Patch Set 3 : Added a histogram for counting update timeouts #
Total comments: 8
Patch Set 4 : nparker@ review #
Patch Set 5 : rebase #
Patch Set 6 : Add UMA for the case when the update request does not time out #
Patch Set 7 : rebase #
Messages
Total messages: 64 (36 generated)
The CQ bit was checked by vakh@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Description was changed from ========== Handle timeout for update requests. 1. Starts 30 second timer when the request to fetch updates is sent. 2. If a response is received within that time, the timer is reset. 3. Otherwise, the outstanding request is cancelled and another sent. BUG=543161 ========== to ========== Handle timeout for update requests. 1. Starts 30 second timer when the request to fetch updates is sent. 2. If a response is received within that time, the timer is reset. 3. Otherwise, the outstanding request is cancelled and another sent. BUG=543161 ==========
vakh@chromium.org changed reviewers: + nparker@chromium.org, shess@chromium.org
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
https://codereview.chromium.org/2470923003/diff/1/components/safe_browsing_db... File components/safe_browsing_db/v4_update_protocol_manager.cc (right): https://codereview.chromium.org/2470923003/diff/1/components/safe_browsing_db... components/safe_browsing_db/v4_update_protocol_manager.cc:308: request_.reset(); This should log some UMA value to indicate the request failed. https://codereview.chromium.org/2470923003/diff/1/components/safe_browsing_db... components/safe_browsing_db/v4_update_protocol_manager.cc:309: IssueUpdateRequest(); This will keep trying every 30 seconds. Is that what we want? Maybe it should have a fixed number of total requests (3?) and then give up and wait for a full update cycle (30 min).
https://codereview.chromium.org/2470923003/diff/1/components/safe_browsing_db... File components/safe_browsing_db/v4_update_protocol_manager.cc (right): https://codereview.chromium.org/2470923003/diff/1/components/safe_browsing_db... components/safe_browsing_db/v4_update_protocol_manager.cc:309: IssueUpdateRequest(); On 2016/11/02 21:56:57, Nathan Parker wrote: > This will keep trying every 30 seconds. Is that what we want? Maybe it should > have a fixed number of total requests (3?) and then give up and wait for a full > update cycle (30 min). Does the spec address this? I see language about how to handle errors, but not about timeouts. If the requests are reaching the Internet, then we definitely shouldn't be stacking them up like this because an unresponsive server is definitely bad! If the requests are _not_ reaching the Internet, then we shouldn't be stacking them up because we're going to repeatedly power up the CPU to process our requests and the radio to attempt to send them.
https://codereview.chromium.org/2470923003/diff/1/components/safe_browsing_db... File components/safe_browsing_db/v4_update_protocol_manager.cc (right): https://codereview.chromium.org/2470923003/diff/1/components/safe_browsing_db... components/safe_browsing_db/v4_update_protocol_manager.cc:308: request_.reset(); On 2016/11/02 21:56:57, Nathan Parker wrote: > This should log some UMA value to indicate the request failed. I thought about it but I couldn't find a good way to log the occurrence of an event. Perhaps a UMA_HISTOGRAM_ENUMERATION that logs SUCCESS/FAILURE? Is there a better way to log this? https://codereview.chromium.org/2470923003/diff/1/components/safe_browsing_db... components/safe_browsing_db/v4_update_protocol_manager.cc:309: IssueUpdateRequest(); On 2016/11/02 21:56:57, Nathan Parker wrote: > This will keep trying every 30 seconds. Is that what we want? Maybe it should > have a fixed number of total requests (3?) Sure, that's possible. But what happens after that? We can't pause update requests. One way to do this is to perform backoff (different from the backoff specified by the protocol) and cap the backoff: 30 sec, 1 min, 2 min, 4 min, 4 min, 4 min... > and then give up and wait for a full > update cycle (30 min). Not sure what you mean here. This error is happening when we try to fetch the next update (typically) 30 mins after the last one. Are you saying that we should just skip an update entirely? That's worth considering but it means that one failure can lead to a somewhat stale DB. https://codereview.chromium.org/2470923003/diff/1/components/safe_browsing_db... components/safe_browsing_db/v4_update_protocol_manager.cc:309: IssueUpdateRequest(); On 2016/11/02 22:05:08, Scott Hess wrote: > On 2016/11/02 21:56:57, Nathan Parker wrote: > > This will keep trying every 30 seconds. Is that what we want? Maybe it should > > have a fixed number of total requests (3?) and then give up and wait for a > full > > update cycle (30 min). 
> > Does the spec address this? I see language about how to handle errors, but not about timeouts. No, I don't see it: https://developers.google.com/safe-browsing/v4/update-api > > If the requests are reaching the Internet, then we definitely shouldn't be > stacking them up like this because an unresponsive server is definitely bad! If > the requests are _not_ reaching the Internet, then we shouldn't be stacking them > up because we're going to repeatedly power up the CPU to process our requests and > the radio to attempt to send them. So, just drop the request on timeout and schedule the next update at the last known frequency? This would be similar to Nathan's proposal. I am neutral on that idea. Note that the request can also get dropped because of an intermittent connection failure and I am sure there are a variety of other reasons. I'll also check with the SafeBrowsing team if they have any opinion about it.
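The capped backoff floated in the exchange above ("30 sec, 1 min, 2 min, 4 min, 4 min, 4 min...") would look roughly like this. It is one of the options discussed, not what the CL ultimately landed (the landed code reschedules after the regular next_update_interval_ instead), and the function name is invented for the sketch:

```cpp
#include <algorithm>

// Doubling backoff capped at 4 minutes, distinct from the backoff the
// Safe Browsing protocol itself specifies for server errors.
int BackoffSeconds(int attempt) {
  constexpr int kBaseSeconds = 30;  // First retry delay (attempt 0).
  constexpr int kCapSeconds = 240;  // Cap at 4 minutes.
  int delay = kBaseSeconds;
  for (int i = 0; i < attempt; ++i) delay = std::min(delay * 2, kCapSeconds);
  return delay;
}
```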
https://codereview.chromium.org/2470923003/diff/1/components/safe_browsing_db... File components/safe_browsing_db/v4_update_protocol_manager.cc (right): https://codereview.chromium.org/2470923003/diff/1/components/safe_browsing_db... components/safe_browsing_db/v4_update_protocol_manager.cc:308: request_.reset(); On 2016/11/02 22:37:22, vakh (Varun Khaneja) wrote: > On 2016/11/02 21:56:57, Nathan Parker wrote: > > This should log some UMA value to indicate the request failed. > > I thought about it but I couldn't find a good way to log the occurence of an > event. Perhaps a UMA_HISTOGRAM_ENUMERATION that logs SUCCESS/FAILURE? Is there a > better way to log this? Yes, you can use a UMA_HISTOGRAM just as a counter. You could compare that to the one that logs the time of responses, or success/error statuses, to see what fraction of requests are timing out.
The CQ bit was checked by vakh@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
On timeout, schedule an update request after next_update_interval_
The CQ bit was checked by vakh@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Changed the behavior on update timeout to schedule a regular update using the existing value of next_update_interval_. Working on adding UMA. https://codereview.chromium.org/2470923003/diff/1/components/safe_browsing_db... File components/safe_browsing_db/v4_update_protocol_manager.cc (right): https://codereview.chromium.org/2470923003/diff/1/components/safe_browsing_db... components/safe_browsing_db/v4_update_protocol_manager.cc:309: IssueUpdateRequest(); On 2016/11/02 22:05:08, Scott Hess wrote: > On 2016/11/02 21:56:57, Nathan Parker wrote: > > This will keep trying every 30 seconds. Is that what we want? Maybe it should > > have a fixed number of total requests (3?) and then give up and wait for a > full > > update cycle (30 min). > > Does the spec address this? I see language about how to handle errors, but not > about timeouts. > > If the requests are reaching the Internet, then we definitely shouldn't be > stacking them up like this because an unresponsive server is definitely bad! If > the requests are _not_ reaching the Internet, then we shouldn't be stacking them > up because we're going to repeatedly power up the CPU to process our requests and > the radio to attempt to send them. Changed the code to schedule a normal update after timeout. Please note that the first request is scheduled between 1 and 5 minutes after startup so if the first update request after startup fails, it'll still be retried at the same random interval (between 1 and 5 minutes). I think that's reasonable because the DB may be really out of date at startup.
The CQ bit was checked by vakh@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Added a histogram for counting update timeouts
The CQ bit was checked by vakh@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
On 2016/11/05 00:32:21, Nathan Parker wrote: > https://codereview.chromium.org/2470923003/diff/1/components/safe_browsing_db... > File components/safe_browsing_db/v4_update_protocol_manager.cc (right): > > https://codereview.chromium.org/2470923003/diff/1/components/safe_browsing_db... > components/safe_browsing_db/v4_update_protocol_manager.cc:308: request_.reset(); > On 2016/11/02 22:37:22, vakh (Varun Khaneja) wrote: > > On 2016/11/02 21:56:57, Nathan Parker wrote: > > > This should log some UMA value to indicate the request failed. > > > > I thought about it but I couldn't find a good way to log the occurence of an > > event. Perhaps a UMA_HISTOGRAM_ENUMERATION that logs SUCCESS/FAILURE? Is there > a > > better way to log this? > > Yes, you can use a UMA_HISTOGRAM just as a counter. You could compare that to > the one that logs the time of responses, or success/error statuses, to see what > fraction of requests are timing out. Done. Histogram: SafeBrowsing.V4Update.Timeout.Count recorded 1 samples, mean = 1.0 (flags = 0x1) 0 O (0 = 0.0%) 1 ------------------------------------------------------------------------O (1 = 100.0%) {0.0%} 2 ...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
lgtm https://codereview.chromium.org/2470923003/diff/40001/components/safe_browsin... File components/safe_browsing_db/v4_update_protocol_manager.cc (right): https://codereview.chromium.org/2470923003/diff/40001/components/safe_browsin... components/safe_browsing_db/v4_update_protocol_manager.cc:306: UMA_HISTOGRAM_COUNTS_100("SafeBrowsing.V4Update.Timeout.Count", 1); Or UMA_HISTOGRAM_BOOLEAN("..", true) either way https://codereview.chromium.org/2470923003/diff/40001/components/safe_browsin... File components/safe_browsing_db/v4_update_protocol_manager.h (right): https://codereview.chromium.org/2470923003/diff/40001/components/safe_browsin... components/safe_browsing_db/v4_update_protocol_manager.h:182: // Used to interrupt and re-issue update requests that take too long. It doesn't re-issue, it just reschedules now, yes? And the CL description needs updating as well, to this effect. https://codereview.chromium.org/2470923003/diff/40001/components/safe_browsin... File components/safe_browsing_db/v4_update_protocol_manager_unittest.cc (right): https://codereview.chromium.org/2470923003/diff/40001/components/safe_browsin... components/safe_browsing_db/v4_update_protocol_manager_unittest.cc:232: I'm not clear on why you need to remove these RunPendingTasks.. https://codereview.chromium.org/2470923003/diff/40001/components/safe_browsin... components/safe_browsing_db/v4_update_protocol_manager_unittest.cc:358: // Now wait for the next request to be scheduled. Is there some state you can test between these to verify it changes? Like, maybe there shouldn't be a Fetcher(1) yet?
Description was changed from ========== Handle timeout for update requests. 1. Starts 30 second timer when the request to fetch updates is sent. 2. If a response is received within that time, the timer is reset. 3. Otherwise, the outstanding request is cancelled and another sent. BUG=543161 ========== to ========== Handle timeout for update requests. 1. Starts 30 second timer when the request to fetch updates is sent. 2. If a response is received within that time, the timer is reset. 3. Otherwise, the outstanding request is cancelled and another scheduled after the same duration as the one cancelled. BUG=543161 ==========
nparker@ review
The CQ bit was checked by vakh@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
vakh@chromium.org changed reviewers: + rkaplow@chromium.org
rkaplow@ -- Can you please review the one new histogram? Thanks. https://codereview.chromium.org/2470923003/diff/40001/components/safe_browsin... File components/safe_browsing_db/v4_update_protocol_manager.cc (right): https://codereview.chromium.org/2470923003/diff/40001/components/safe_browsin... components/safe_browsing_db/v4_update_protocol_manager.cc:306: UMA_HISTOGRAM_COUNTS_100("SafeBrowsing.V4Update.Timeout.Count", 1); On 2016/11/08 23:39:29, Nathan Parker wrote: > Or UMA_HISTOGRAM_BOOLEAN("..", true) > either way Done. https://codereview.chromium.org/2470923003/diff/40001/components/safe_browsin... File components/safe_browsing_db/v4_update_protocol_manager.h (right): https://codereview.chromium.org/2470923003/diff/40001/components/safe_browsin... components/safe_browsing_db/v4_update_protocol_manager.h:182: // Used to interrupt and re-issue update requests that take too long. On 2016/11/08 23:39:29, Nathan Parker wrote: > It doesn't re-issue, it just reschedules now, yes? > > And the CL description needs updating as well, to this effect. Done. https://codereview.chromium.org/2470923003/diff/40001/components/safe_browsin... File components/safe_browsing_db/v4_update_protocol_manager_unittest.cc (right): https://codereview.chromium.org/2470923003/diff/40001/components/safe_browsin... components/safe_browsing_db/v4_update_protocol_manager_unittest.cc:232: On 2016/11/08 23:39:29, Nathan Parker wrote: > I'm not clear on why you need to remove these RunPendingTasks.. IssueUpdateRequest queues a network request right away so there's no need to wait before getting |fetcher|. https://codereview.chromium.org/2470923003/diff/40001/components/safe_browsin... components/safe_browsing_db/v4_update_protocol_manager_unittest.cc:358: // Now wait for the next request to be scheduled. On 2016/11/08 23:39:29, Nathan Parker wrote: > Is there some state you can test between these to verify it changes? Like, > maybe there shouldn't be a Fetcher(1) yet? Done. 
Not much else to test.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_android_rel_ng on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/linux_androi...)
LGTM. Dangling philosophical question, though. If we're just going to retry on next retry, does a timeout matter? As opposed to just cancelling before the retry? I'm not familiar enough with the networking stack's subtleties to know if it would wedge things or something. There doesn't seem to be anything tracking elapsed time from request to response in this code. Is there something of the sort elsewhere? I'm not entirely sure whether that would help, since not only will requests have a 99th percentile response time, devices and networks will, so just because N-seconds is good for the fleet might not mean it's good for a particular user. But such stats might be useful to inform whether a particular timeout is _clearly_ in bounds, or a bit aggressive.
lgtm looks ok, but unusual there's no other case checked. Should this also have a false where it didn't timeout get tracked, or is that really not important?
rebase
The CQ bit was checked by vakh@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
On 2016/11/09 19:12:18, rkaplow wrote: > lgtm > > looks ok, but unusual there's no other case checked. Should this also have a > false where it didn't timeout get tracked, or is that really not important? Done.
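The metric that came out of this exchange, a single boolean histogram recorded on both paths (true on timeout, false on a timely response), can be modeled with a plain counter pair. UMA_HISTOGRAM_BOOLEAN is the real Chromium macro discussed above; the TimeoutStats type here is only an illustrative stand-in for what the aggregated data lets you compute:

```cpp
#include <cassert>

// Stand-in for one UMA_HISTOGRAM_BOOLEAN recorded on both the timeout
// path and the response path; the interesting aggregate is the fraction
// of update requests that hit the 30s timeout.
struct TimeoutStats {
  int timed_out = 0;
  int responded = 0;
  void Record(bool timeout) { (timeout ? timed_out : responded)++; }
  double TimeoutFraction() const {
    int total = timed_out + responded;
    return total ? static_cast<double>(timed_out) / total : 0.0;
  }
};
```

Recording both the true and the false case is what makes the fraction computable at all; a timeout-only counter (as in patch set 3) gives a numerator with no denominator.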
Add UMA for the case when the update request does not time out
The CQ bit was checked by vakh@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
On 2016/11/09 01:02:25, Scott Hess wrote: > LGTM. > > Dangling philosophical question, though. If we're just going to retry on next > retry, does a timeout matter? As opposed to just cancelling before the retry? > I'm not familiar enough with the networking stack's subtleties to know if it > would wedge things or something. Neither am I. I guess tracking the UMA counter would tell us how much we need to care about this case. > > There doesn't seem to be anything tracking elapsed time from request to response > in this code. Is there something of the sort elsewhere? I'm not entirely sure > whether that would help, since not only will requests have a 99th percentile > response time, devices and networks will, so just because N-seconds is good for > the fleet might not mean it's good for a particular user. But such stats might > be useful to inform whether a particular timeout is _clearly_ in bounds, or a > bit aggressive. Again, if we find the UMA stat to be too high (>2%, may be > 1%) we can look into it more closely. WDYT?
The CQ bit was checked by vakh@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from nparker@chromium.org, rkaplow@chromium.org, shess@chromium.org Link to the patchset: https://codereview.chromium.org/2470923003/#ps100001 (title: "Add UMA for the case when the update request does not time out")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
On 2016/11/10 22:10:11, vakh (Varun Khaneja) wrote: > On 2016/11/09 01:02:25, Scott Hess wrote: > > LGTM. > > > > Dangling philosophical question, though. If we're just going to retry on next > > retry, does a timeout matter? As opposed to just cancelling before the retry? > > > I'm not familiar enough with the networking stack's subtleties to know if it > > would wedge things or something. > > Neither am I. I guess tracking the UMA counter would tell us how much we need to > care about this case. > > > > > There doesn't seem to be anything tracking elapsed time from request to > response > > in this code. Is there something of the sort elsewhere? I'm not entirely > sure > > whether that would help, since not only will requests have a 99th percentile > > response time, devices and networks will, so just because N-seconds is good > for > > the fleet might not mean it's good for a particular user. But such stats > might > > be useful to inform whether a particular timeout is _clearly_ in bounds, or a > > bit aggressive. > > Again, if we find the UMA stat to be too high (>2%, may be > 1%) we can look > into it more closely. WDYT? I guess? It's hard to tell, because the most likely outcome of the UMA stat is to show a very tiny fraction of timeouts. Honestly, if it's something like 2% that maybe indicates Something Is Very Wrong (what does it mean if 2% of http requests are failing? That sounds terrible to me).
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders:
  android_compile_dbg on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_comp...)
  linux_chromium_asan_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
On 2016/11/10 22:18:36, Scott Hess wrote: > On 2016/11/10 22:10:11, vakh (Varun Khaneja) wrote: > > On 2016/11/09 01:02:25, Scott Hess wrote: > > > LGTM. > > > > > > Dangling philosophical question, though. If we're just going to retry on > next > > > retry, does a timeout matter? As opposed to just cancelling before the > retry? > > > > > I'm not familiar enough with the networking stack's subtleties to know if it > > > would wedge things or something. > > > > Neither am I. I guess tracking the UMA counter would tell us how much we need > to > > care about this case. > > > > > > > > There doesn't seem to be anything tracking elapsed time from request to > > response > > > in this code. Is there something of the sort elsewhere? I'm not entirely > > sure > > > whether that would help, since not only will requests have a 99th percentile > > > response time, devices and networks will, so just because N-seconds is good > > for > > > the fleet might not mean it's good for a particular user. But such stats > > might > > > be useful to inform whether a particular timeout is _clearly_ in bounds, or > a > > > bit aggressive. > > > > Again, if we find the UMA stat to be too high (>2%, may be > 1%) we can look > > into it more closely. WDYT? > > I guess? It's hard to tell, because the most likely outcome of the UMA stat is > to show a very tiny fraction of timeouts. Honestly, if it's something like 2% > that maybe indicates Something Is Very Wrong (what does it mean if 2% of http > requests are failing? That sounds terrible to me). Yes, 2% is high. The UMA counter I added in this CL will tell us what the real rate of 30s timeout is. Once we have that number, we can discuss whether 30s is too high or low enough.
The CQ bit was checked by vakh@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders:
  chromium_presubmit on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/chromium_presub...)
  linux_chromium_asan_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
  linux_chromium_chromeos_compile_dbg_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
  mac_chromium_rel_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
rebase
The CQ bit was checked by vakh@chromium.org to run a CQ dry run
The CQ bit was checked by vakh@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from nparker@chromium.org, rkaplow@chromium.org, shess@chromium.org Link to the patchset: https://codereview.chromium.org/2470923003/#ps120001 (title: "rebase")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Description was changed from ========== Handle timeout for update requests. 1. Starts 30 second timer when the request to fetch updates is sent. 2. If a response is received within that time, the timer is reset. 3. Otherwise, the outstanding request is cancelled and another scheduled after the same duration as the one cancelled. BUG=543161 ========== to ========== Handle timeout for update requests. 1. Starts 30 second timer when the request to fetch updates is sent. 2. If a response is received within that time, the timer is reset. 3. Otherwise, the outstanding request is cancelled and another scheduled after the same duration as the one cancelled. BUG=543161 ==========
Message was sent while issue was closed.
Committed patchset #7 (id:120001)
Message was sent while issue was closed.
Description was changed from ========== Handle timeout for update requests. 1. Starts 30 second timer when the request to fetch updates is sent. 2. If a response is received within that time, the timer is reset. 3. Otherwise, the outstanding request is cancelled and another scheduled after the same duration as the one cancelled. BUG=543161 ========== to ========== Handle timeout for update requests. 1. Starts 30 second timer when the request to fetch updates is sent. 2. If a response is received within that time, the timer is reset. 3. Otherwise, the outstanding request is cancelled and another scheduled after the same duration as the one cancelled. BUG=543161 Committed: https://crrev.com/1074a4f1111494a115606cecf8363abd63e9bb49 Cr-Commit-Position: refs/heads/master@{#431475} ==========
Message was sent while issue was closed.
Patchset 7 (id:??) landed as https://crrev.com/1074a4f1111494a115606cecf8363abd63e9bb49 Cr-Commit-Position: refs/heads/master@{#431475}