|
|
Created:
6 years, 4 months ago by navabi Modified:
6 years, 4 months ago CC:
chromium-reviews, klundberg+watch_chromium.org, yfriedman+watch_chromium.org, ilevy-cc_chromium.org Base URL:
svn://svn.chromium.org/chrome/trunk/src Project:
chromium Visibility:
Public. |
DescriptionIf device does not come back after provisioning, add to bad devices.
BUG=391071
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=287910
Patch Set 1 #
Total comments: 2
Patch Set 2 : Add device to bad devices if reboot times out. #
Total comments: 8
Patch Set 3 : Add check to see for no good devices after reboot. #Messages
Total messages: 13 (0 generated)
https://codereview.chromium.org/442173002/diff/1/build/android/provision_devi... File build/android/provision_devices.py (right): https://codereview.chromium.org/442173002/diff/1/build/android/provision_devi... build/android/provision_devices.py:235: device.WaitUntilFullyBooted(timeout=90) What happens if we hit this timeout? Does the step fail? Can we add the device to bad_devices on failure here, too?
https://codereview.chromium.org/442173002/diff/1/build/android/provision_devi... File build/android/provision_devices.py (right): https://codereview.chromium.org/442173002/diff/1/build/android/provision_devi... build/android/provision_devices.py:235: device.WaitUntilFullyBooted(timeout=90) On 2014/08/06 20:24:54, cjhopman wrote: > What happens if we hit this timeout? Does the step fail? Can we add the device > to bad_devices on failure here, too? Done.
PTAL
https://codereview.chromium.org/442173002/diff/20001/build/android/provision_... File build/android/provision_devices.py (right): https://codereview.chromium.org/442173002/diff/20001/build/android/provision_... build/android/provision_devices.py:209: WipeDevicesIfPossible(devices) I guess there is stuff in here, too, that could throw an exception and fail this step for a single bad device https://codereview.chromium.org/442173002/diff/20001/build/android/provision_... build/android/provision_devices.py:229: device_utils.DeviceUtils.parallel(devices).Reboot(True) This is just parallelizing sending the 'adb reboot' command? Is that really that much benefit? This would be much simpler if there was just one loop over the devices that did wipe-provision-reboot-wait (ideally we'd parrallelize that whole sequence). https://codereview.chromium.org/442173002/diff/20001/build/android/provision_... build/android/provision_devices.py:247: devices = [device for device in devices if device not in bad_devices] This should now raise NoDevicesError if devices is empty?
https://codereview.chromium.org/442173002/diff/20001/build/android/provision_... File build/android/provision_devices.py (right): https://codereview.chromium.org/442173002/diff/20001/build/android/provision_... build/android/provision_devices.py:209: WipeDevicesIfPossible(devices) On 2014/08/06 21:50:05, cjhopman wrote: > I guess there is stuff in here, too, that could throw an exception and fail this > step for a single bad device True. The original patch was for a particular place where this was failing (and a P0 that was changed to a P1). We have not seen this be a problem in practice here or on the reboot. It doesn't hurt to add it here though. https://codereview.chromium.org/442173002/diff/20001/build/android/provision_... build/android/provision_devices.py:229: device_utils.DeviceUtils.parallel(devices).Reboot(True) On 2014/08/06 21:50:05, cjhopman wrote: > This is just parallelizing sending the 'adb reboot' command? No. The True that is being passed to Reboot is to block for the reboot to complete. See pylib/device/device_utils.py > Is that really that much benefit? ). Yes. Especially on the try bots where there are 8 devices and cycle time matters the most. https://codereview.chromium.org/442173002/diff/20001/build/android/provision_... build/android/provision_devices.py:247: devices = [device for device in devices if device not in bad_devices] On 2014/08/06 21:50:05, cjhopman wrote: > This should now raise NoDevicesError if devices is empty? Good catch. Done.
https://codereview.chromium.org/442173002/diff/20001/build/android/provision_... File build/android/provision_devices.py (right): https://codereview.chromium.org/442173002/diff/20001/build/android/provision_... build/android/provision_devices.py:209: WipeDevicesIfPossible(devices) On 2014/08/06 21:58:51, navabi wrote: > On 2014/08/06 21:50:05, cjhopman wrote: > > I guess there is stuff in here, too, that could throw an exception and fail > this > > step for a single bad device > > True. The original patch was for a particular place where this was failing (and > a P0 that was changed to a P1). We have not seen this be a problem in practice > here or on the reboot. > > It doesn't hurt to add it here though. Yeah, I think the main point is that one bad device shouldn't cause the whole build to fail. But if we don't actually see the wipe cause such failures, then I'm not too worried about it. https://codereview.chromium.org/442173002/diff/20001/build/android/provision_... build/android/provision_devices.py:229: device_utils.DeviceUtils.parallel(devices).Reboot(True) On 2014/08/06 21:58:51, navabi wrote: > On 2014/08/06 21:50:05, cjhopman wrote: > > This is just parallelizing sending the 'adb reboot' command? > > No. The True that is being passed to Reboot is to block for the reboot to > complete. See pylib/device/device_utils.py > > > Is that really that much benefit? ). > > Yes. Especially on the try bots where there are 8 devices and cycle time matters > the most. Acknowledged.
lgtm
PTAL. I didn't add it to WipeDevicesIfPossible, though we could add that same logic. The original change was added because of a flakiness specifically during configuration that we are seeing on the bots (see the bug). As explained in the bug, this is a difficult to debug and test because it only happens about 3-4% of the time. I'd rather land this, see how it works and then decide if we need it other places.
On 2014/08/06 22:04:12, navabi wrote: > PTAL. > > I didn't add it to WipeDevicesIfPossible, though we could add that same logic. > The original change was added because of a flakiness specifically during > configuration that we are seeing on the bots (see the bug). > > As explained in the bug, this is a difficult to debug and test because it only > happens about 3-4% of the time. I'd rather land this, see how it works and then > decide if we need it other places. Thanks for the reviews Chris.
The CQ bit was checked by navabi@google.com
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/navabi@google.com/442173002/40001
Message was sent while issue was closed.
Change committed as 287910 |