OLD | NEW |
(Empty) | |
| 1 gRPC Connection Backoff Protocol |
| 2 ================================ |
| 3 |
| 4 When a connection attempt to a backend fails, it is typically desirable to |
| 5 not retry immediately (to avoid flooding the network or the server with |
| 6 requests) and instead to perform some form of exponential backoff. |
| 7 |
| 8 We have several parameters: |
| 9 1. INITIAL_BACKOFF (how long to wait after the first failure before retrying) |
| 10 2. MULTIPLIER (factor with which to multiply backoff after a failed retry) |
| 11 3. MAX_BACKOFF (upper bound on backoff) |
| 12 4. MIN_CONNECT_TIMEOUT (minimum time we're willing to give a connection to |
| 13 complete) |
| 14 |
| 15 ## Proposed Backoff Algorithm |
| 16 |
| 17 Exponentially back off the start time of connection attempts up to a limit of |
| 18 MAX_BACKOFF, with jitter of up to ±JITTER times the current backoff. |
| 19 |
| 20 ``` |
| 21 ConnectWithBackoff() |
| 22   current_backoff = INITIAL_BACKOFF |
| 23   current_deadline = now() + INITIAL_BACKOFF |
| 24   while (TryConnect(Max(current_deadline, now() + MIN_CONNECT_TIMEOUT)) |
| 25          != SUCCESS) |
| 26     SleepUntil(current_deadline) |
| 27     current_backoff = Min(current_backoff * MULTIPLIER, MAX_BACKOFF) |
| 28     current_deadline = now() + current_backoff + |
| 29       UniformRandom(-JITTER * current_backoff, JITTER * current_backoff) |
| 30 |
| 31 ``` |
| 32 |
| 33 With the following specific parameter values: |
| 34 MIN_CONNECT_TIMEOUT = 20 seconds |
| 35 INITIAL_BACKOFF = 1 second |
| 36 MULTIPLIER = 1.6 |
| 37 MAX_BACKOFF = 120 seconds |
| 38 JITTER = 0.2 |
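
The pseudocode and parameter values above can be sketched in Python as follows. This is a minimal illustration, not an actual gRPC implementation: `try_connect` is a hypothetical callable that attempts one connection and gives up at the absolute deadline it receives, and the injectable `now`, `sleep`, and `rng` arguments and the `max_attempts` limit are assumptions added here so the loop can be exercised without real waiting.

```python
import random
import time

INITIAL_BACKOFF = 1.0       # seconds
MULTIPLIER = 1.6
MAX_BACKOFF = 120.0         # seconds
MIN_CONNECT_TIMEOUT = 20.0  # seconds
JITTER = 0.2


def connect_with_backoff(try_connect, now=time.monotonic, sleep=time.sleep,
                         rng=random.uniform, max_attempts=None):
    """Call try_connect(deadline) until it returns True.

    Returns the number of attempts made, or None if max_attempts
    (an addition for this sketch, not part of the protocol) is exhausted.
    """
    current_backoff = INITIAL_BACKOFF
    current_deadline = now() + INITIAL_BACKOFF
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        # Each attempt gets at least MIN_CONNECT_TIMEOUT to complete.
        if try_connect(max(current_deadline, now() + MIN_CONNECT_TIMEOUT)):
            return attempts
        # SleepUntil(current_deadline): a no-op if the deadline has passed.
        sleep(max(0.0, current_deadline - now()))
        # Grow the backoff, capped at MAX_BACKOFF, then add jitter.
        current_backoff = min(current_backoff * MULTIPLIER, MAX_BACKOFF)
        current_deadline = now() + current_backoff + rng(
            -JITTER * current_backoff, JITTER * current_backoff)
    return None
```

Passing a fake clock for `now` and `sleep` lets the schedule be inspected deterministically; with the defaults, the function blocks the calling thread between attempts.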
| 39 |
| 40 Implementations with pressing concerns (such as minimizing the number of wakeups |
| 41 on a mobile phone) may wish to use a different algorithm, and in particular |
| 42 different jitter logic. |
| 43 |
| 44 Alternate implementations must ensure that connection backoffs started at the |
| 45 same time disperse, and must not attempt connections substantially more often |
| 46 than the above algorithm. |
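
To illustrate the dispersal requirement, the sketch below computes when successive retries would start for a single client under the algorithm above. It rests on simplifying assumptions: every attempt fails instantly, the client wakes exactly at each deadline, and each client is modeled by a seeded `random.Random`. The function name `retry_offsets` is hypothetical.

```python
import random


def retry_offsets(n, seed, initial=1.0, multiplier=1.6,
                  max_backoff=120.0, jitter=0.2):
    """Return offsets (seconds after the first failure) of the first n retries.

    Assumes each attempt fails instantly, so a retry starts exactly at the
    deadline computed after the previous failure.
    """
    rng = random.Random(seed)
    offsets = []
    t = initial        # the first retry is not jittered in the algorithm above
    backoff = initial
    for _ in range(n):
        offsets.append(t)
        backoff = min(backoff * multiplier, max_backoff)
        t += backoff + rng.uniform(-jitter * backoff, jitter * backoff)
    return offsets


# Two clients that failed at the same instant, modeled with different seeds.
client_a = retry_offsets(6, seed=1)
client_b = retry_offsets(6, seed=2)
for a, b in zip(client_a, client_b):
    print(f"{a:8.2f} {b:8.2f}")
```

Under these assumptions both clients retry together once, at INITIAL_BACKOFF, and drift apart from the second retry onward. An alternate algorithm would need to show comparable spreading without shortening the gaps between attempts.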
| 47 |
| 48 ## Reset Backoff |
| 49 |
| 50 The backoff should be reset to INITIAL_BACKOFF at some point, so that the |
| 51 reconnection behavior is consistent whether the connection is newly started |
| 52 or was previously disconnected. |
| 53 |
| 54 We choose to reset the backoff when the SETTINGS frame is received. At that |
| 55 point, we know for sure that the connection was accepted by the server. |
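
The reset rule can be sketched as a small state holder. The class and method names here (`BackoffState`, `on_connect_failure`, `on_settings_frame_received`) are hypothetical and do not correspond to any gRPC API; the transport layer is assumed to call the second method when it sees the server's SETTINGS frame. Jitter is omitted to keep the focus on the reset.

```python
class BackoffState:
    """Tracks the current backoff and resets it once the server is confirmed."""

    def __init__(self, initial=1.0, multiplier=1.6, max_backoff=120.0):
        self.initial = initial
        self.multiplier = multiplier
        self.max_backoff = max_backoff
        self.current = initial

    def on_connect_failure(self):
        """Grow the backoff after a failed attempt and return the new value."""
        self.current = min(self.current * self.multiplier, self.max_backoff)
        return self.current

    def on_settings_frame_received(self):
        """The server accepted the connection: start over at the initial value."""
        self.current = self.initial


state = BackoffState()
state.on_connect_failure()
state.on_connect_failure()
state.on_settings_frame_received()
print(state.current)
```

Resetting on SETTINGS rather than on TCP connect means a connection that is accepted at the socket level but never completes the HTTP/2 handshake does not clear the accumulated backoff.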