Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(1810)

Issue 2826913002: [BattOr] Retry battor_agent commands when they fail. (Closed)

Created:
3 years, 8 months ago by aschulman
Modified:
3 years, 7 months ago
CC:
chromium-reviews, nednguyen, Zhen Wang
Target Ref:
refs/heads/master
Project:
chromium
Visibility:
Public.

Description

[BattOr] Retry battor_agent commands when they fail. Many battor_agent failures on the waterfall could be solved by just retrying the command (e.g., StartTracing). Previously, the battor_agent retried reads and a select few BattOr control messages when they failed (e.g., INIT). This was incomplete, as there were many errors that did not result in retried BattOr control messages. This CL replaces this incomplete per-control-message retry logic with retry logic for entire battor_agent commands. This will make the battor_agent robust to intermittent failures of any BattOr control message. This required fixing the battor_agent overuse of the ambiguous "Unexpected Message" error. It also required replacing the timeout logic. BUG=699581 Review-Url: https://codereview.chromium.org/2826913002 Cr-Commit-Position: refs/heads/master@{#472724} Committed: https://chromium.googlesource.com/chromium/src/+/b8e9b5f2237d0fad703d20fb0a22d45b6ebdb047

Patch Set 1 #

Patch Set 2 : Stored samples needed to be cleared before each retry of StopTracing. #

Patch Set 3 : Doubled retry attempts (each StopTracing attempt requires two retries) #

Total comments: 31

Patch Set 4 : Addressed Charlie's comments #

Total comments: 6

Patch Set 5 : [BattOr] Retry battor_agent commands when they fail. #

Patch Set 6 : Final comments #

Total comments: 1

Patch Set 7 : Swapped return with break #

Unified diffs Side-by-side diffs Delta from patch set Stats (+363 lines, -390 lines) Patch
M tools/battor_agent/battor_agent.h View 1 2 3 4 3 chunks +8 lines, -10 lines 0 comments Download
M tools/battor_agent/battor_agent.cc View 1 2 3 4 5 6 23 chunks +93 lines, -130 lines 0 comments Download
M tools/battor_agent/battor_agent_unittest.cc View 1 2 3 28 chunks +257 lines, -244 lines 0 comments Download
M tools/battor_agent/battor_connection_impl.cc View 1 chunk +2 lines, -0 lines 0 comments Download
M tools/battor_agent/battor_error.h View 1 chunk +1 line, -2 lines 0 comments Download
M tools/battor_agent/battor_error.cc View 1 chunk +2 lines, -4 lines 0 comments Download

Messages

Total messages: 36 (15 generated)
rnephew (Reviews Here)
Can you use this to run third_party/catapult/common/battor/battor/battor_wrapper_devicetest.py and make sure the battor_wrapper still behaves properly ...
3 years, 8 months ago (2017-04-21 22:10:45 UTC) #4
charliea (OOO until 10-5)
3 years, 7 months ago (2017-05-04 20:35:19 UTC) #6
charliea (OOO until 10-5)
I'm planning to take a look at this tomorrow.
3 years, 7 months ago (2017-05-04 20:36:17 UTC) #7
charliea (OOO until 10-5)
Randy, would you mind prereviewing this and passing it onto me when you think it's ...
3 years, 7 months ago (2017-05-05 19:47:33 UTC) #8
rnephew (Reviews Here)
https://codereview.chromium.org/2826913002/diff/40001/tools/battor_agent/battor_agent.cc File tools/battor_agent/battor_agent.cc (left): https://codereview.chromium.org/2826913002/diff/40001/tools/battor_agent/battor_agent.cc#oldcode530 tools/battor_agent/battor_agent.cc:530: connection_->Flush(); Do we not need to flush the connections ...
3 years, 7 months ago (2017-05-09 15:04:06 UTC) #10
charliea (OOO until 10-5)
https://codereview.chromium.org/2826913002/diff/40001/tools/battor_agent/battor_agent.cc File tools/battor_agent/battor_agent.cc (right): https://codereview.chromium.org/2826913002/diff/40001/tools/battor_agent/battor_agent.cc#newcode20 tools/battor_agent/battor_agent.cc:20: const uint8_t kMaxCommandAttempts = 10; On 2017/05/09 15:04:05, rnephew ...
3 years, 7 months ago (2017-05-09 19:58:09 UTC) #11
aschulman
Thanks for the comments Charlie and Randy. I replied to your comments. It doesn't seem ...
3 years, 7 months ago (2017-05-11 13:13:00 UTC) #12
rnephew (Reviews Here)
looks good to me, but I am not the most practiced at c++ style guides ...
3 years, 7 months ago (2017-05-11 17:19:02 UTC) #13
charliea (OOO until 10-5)
Overall, I think this is pretty close to being ready to go: just have a ...
3 years, 7 months ago (2017-05-12 22:03:43 UTC) #14
aschulman
On 2017/05/12 22:03:43, charliea wrote: > Overall, I think this is pretty close to being ...
3 years, 7 months ago (2017-05-15 18:49:06 UTC) #15
aschulman
https://codereview.chromium.org/2826913002/diff/40001/tools/battor_agent/battor_agent.cc File tools/battor_agent/battor_agent.cc (right): https://codereview.chromium.org/2826913002/diff/40001/tools/battor_agent/battor_agent.cc#newcode468 tools/battor_agent/battor_agent.cc:468: base::TimeDelta::FromSeconds(kBattOrControlMessageTimeoutSeconds)); On 2017/05/12 22:03:42, charliea wrote: > Could we ...
3 years, 7 months ago (2017-05-15 18:49:17 UTC) #16
charliea (OOO until 10-5)
lgtm w/ comments https://codereview.chromium.org/2826913002/diff/40001/tools/battor_agent/battor_agent_unittest.cc File tools/battor_agent/battor_agent_unittest.cc (right): https://codereview.chromium.org/2826913002/diff/40001/tools/battor_agent/battor_agent_unittest.cc#newcode431 tools/battor_agent/battor_agent_unittest.cc:431: TEST_F(BattOrAgentTest, StartTracingSucceedsAfterSetGainSendFails) { On 2017/05/15 18:49:17, ...
3 years, 7 months ago (2017-05-15 19:03:10 UTC) #17
aschulman
Final comments are taken care of in the latest patch set. Randy, after this is ...
3 years, 7 months ago (2017-05-15 22:26:25 UTC) #18
commit-bot: I haz the power
CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2826913002/100001
3 years, 7 months ago (2017-05-15 22:28:23 UTC) #21
commit-bot: I haz the power
Try jobs failed on following builders: linux_chromium_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_rel_ng/builds/455029) win_chromium_compile_dbg_ng on master.tryserver.chromium.win (JOB_FAILED, ...
3 years, 7 months ago (2017-05-15 23:14:57 UTC) #23
charliea (OOO until 10-5)
https://codereview.chromium.org/2826913002/diff/100001/tools/battor_agent/battor_agent.cc File tools/battor_agent/battor_agent.cc (right): https://codereview.chromium.org/2826913002/diff/100001/tools/battor_agent/battor_agent.cc#newcode542 tools/battor_agent/battor_agent.cc:542: base::ThreadTaskRunnerHandle::Get()->PostDelayedTask( Ah, looks like the presubmit caught a bug ...
3 years, 7 months ago (2017-05-16 16:13:08 UTC) #24
aschulman
On 2017/05/16 16:13:08, charliea wrote: > https://codereview.chromium.org/2826913002/diff/100001/tools/battor_agent/battor_agent.cc > File tools/battor_agent/battor_agent.cc (right): > > https://codereview.chromium.org/2826913002/diff/100001/tools/battor_agent/battor_agent.cc#newcode542 > ...
3 years, 7 months ago (2017-05-17 16:49:57 UTC) #25
commit-bot: I haz the power
CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2826913002/120001
3 years, 7 months ago (2017-05-17 20:16:46 UTC) #28
commit-bot: I haz the power
Try jobs failed on following builders: win_chromium_rel_ng on master.tryserver.chromium.win (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_rel_ng/builds/447583)
3 years, 7 months ago (2017-05-18 02:56:28 UTC) #30
commit-bot: I haz the power
CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2826913002/120001
3 years, 7 months ago (2017-05-18 07:41:37 UTC) #32
commit-bot: I haz the power
3 years, 7 months ago (2017-05-18 07:47:22 UTC) #36
Message was sent while issue was closed.
Committed patchset #7 (id:120001) as
https://chromium.googlesource.com/chromium/src/+/b8e9b5f2237d0fad703d20fb0a22...

Powered by Google App Engine
This is Rietveld 408576698