Side by Side Diff: appengine/swarming/server/lease_management.py

Issue 2951023002: Only schedule termination task if an unconnected MP bot is dead (Closed)
Patch Set: Created 3 years, 6 months ago
OLD | NEW
1 # Copyright 2016 The LUCI Authors. All rights reserved. 1 # Copyright 2016 The LUCI Authors. All rights reserved.
2 # Use of this source code is governed under the Apache License, Version 2.0 2 # Use of this source code is governed under the Apache License, Version 2.0
3 # that can be found in the LICENSE file. 3 # that can be found in the LICENSE file.
4 4
5 """Lease management for machines leased from the Machine Provider. 5 """Lease management for machines leased from the Machine Provider.
6 6
7 Keeps a list of machine types which should be leased from the Machine Provider 7 Keeps a list of machine types which should be leased from the Machine Provider
8 and the list of machines of each type currently leased. 8 and the list of machines of each type currently leased.
9 9
10 Swarming integration with Machine Provider 10 Swarming integration with Machine Provider
(...skipping 888 matching lines...)
899 (event.ts - machine_lease.instruction_ts).total_seconds(), 899 (event.ts - machine_lease.instruction_ts).total_seconds(),
900 fields={ 900 fields={
901 'machine_type': machine_lease.machine_type.id(), 901 'machine_type': machine_lease.machine_type.id(),
902 }, 902 },
903 ) 903 )
904 return 904 return
905 905
906 # The bot hasn't connected yet. If it's dead or missing, release the lease. 906 # The bot hasn't connected yet. If it's dead or missing, release the lease.
907 # At this point we have sent the connection instruction so the bot could still 907 # At this point we have sent the connection instruction so the bot could still
908 # connect after we release the lease but before Machine Provider actually 908 # connect after we release the lease but before Machine Provider actually
909 # deletes the bot. Therefore we also schedule a termination task. If the bot 909 # deletes the bot. Therefore we also schedule a termination task if releasing
910 # connects, it will just shut itself down immediately. 910 # the bot. That way, if the bot connects, it will just shut itself down.
911 task_scheduler.schedule_request(
912 task_request.create_termination_task(machine_lease.hostname, True),
913 None,
914 check_acls=False,
915 )
916 bot_info = bot_management.get_info_key(machine_lease.hostname).get() 911 bot_info = bot_management.get_info_key(machine_lease.hostname).get()
917 if not bot_info: 912 if not bot_info:
918 logging.error( 913 logging.error(
919 'BotInfo missing:\nKey: %s\nHostname: %s', 914 'BotInfo missing:\nKey: %s\nHostname: %s',
920 machine_lease.key, 915 machine_lease.key,
921 machine_lease.hostname, 916 machine_lease.hostname,
922 ) 917 )
918 task_scheduler.schedule_request(
919 task_request.create_termination_task(machine_lease.hostname, True),
920 None,
921 check_acls=False,
922 )
923 if release(machine_lease): 923 if release(machine_lease):
924 clear_lease_request(machine_lease.key, machine_lease.client_request_id) 924 clear_lease_request(machine_lease.key, machine_lease.client_request_id)
925 return 925 return
926 if bot_info.is_dead(utils.utcnow()): 926 if bot_info.is_dead(utils.utcnow()):
927 logging.warning( 927 logging.warning(
928 'Bot failed to connect in time:\nKey: %s\nHostname: %s', 928 'Bot failed to connect in time:\nKey: %s\nHostname: %s',
929 machine_lease.key, 929 machine_lease.key,
930 machine_lease.hostname, 930 machine_lease.hostname,
931 ) 931 )
932 task_scheduler.schedule_request(
933 task_request.create_termination_task(machine_lease.hostname, True),
934 None,
935 check_acls=False,
936 )
932 if release(machine_lease): 937 if release(machine_lease):
933 cleanup_bot(machine_lease) 938 cleanup_bot(machine_lease)
934 939
935 940
936 def cleanup_bot(machine_lease): 941 def cleanup_bot(machine_lease):
937 """Cleans up entities after a bot is removed.""" 942 """Cleans up entities after a bot is removed."""
938 task_queues.cleanup_after_bot(machine_lease.hostname) 943 task_queues.cleanup_after_bot(machine_lease.hostname)
939 bot_management.get_info_key(machine_lease.hostname).delete() 944 bot_management.get_info_key(machine_lease.hostname).delete()
940 clear_lease_request(machine_lease.key, machine_lease.client_request_id) 945 clear_lease_request(machine_lease.key, machine_lease.client_request_id)
941 946
(...skipping 358 matching lines...)
1300 target_fields=ts_mon_metrics.TARGET_FIELDS, 1305 target_fields=ts_mon_metrics.TARGET_FIELDS,
1301 ) 1306 )
1302 ts_mon_metrics.machine_types_actual_size.set( 1307 ts_mon_metrics.machine_types_actual_size.set(
1303 utilization.idle, 1308 utilization.idle,
1304 fields={ 1309 fields={
1305 'busy': False, 1310 'busy': False,
1306 'machine_type': machine_type.key.id(), 1311 'machine_type': machine_type.key.id(),
1307 }, 1312 },
1308 target_fields=ts_mon_metrics.TARGET_FIELDS, 1313 target_fields=ts_mon_metrics.TARGET_FIELDS,
1309 ) 1314 )
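
The behavioral change in this patch (schedule the termination task only on the paths that release the lease, rather than unconditionally) can be summarized with a minimal sketch. This is not the code from lease_management.py: the function name, its parameters, and the returned strings are hypothetical stand-ins, and the follow-up cleanup calls (clear_lease_request, cleanup_bot) are omitted.

```python
def handle_unconnected_bot(bot_info, now, schedule_termination, release):
  """Sketch of the post-change flow for a bot that has not connected yet.

  All parameters are hypothetical stand-ins for illustration:
    bot_info: object exposing is_dead(now), or None if the BotInfo is missing.
    now: current time, passed through to is_dead().
    schedule_termination: callable that schedules the termination task.
    release: callable returning True if the lease was released.

  Returns 'waiting', 'released' or 'release_failed'.
  """
  if bot_info is not None and not bot_info.is_dead(now):
    # The bot may still connect in time, so nothing is scheduled or released.
    return 'waiting'
  # BotInfo is missing or the bot is dead: schedule the termination task
  # first, so a late-connecting bot shuts itself down, then release the lease.
  schedule_termination()
  return 'released' if release() else 'release_failed'
```

Before the patch, the equivalent of schedule_termination() ran ahead of the bot_info lookup, so a bot that was merely slow to connect would also receive a termination task.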