Commit def76c3

fix(setup): Add retry logic to post-setup
While troubleshooting the lab build failures, I found that post-setup is executed within seconds of the nova install completing, which may be too soon. If I can confirm it's a race condition, we will want a more intelligent way to determine when the cluster is ready before proceeding to post-setup. For now, this just adds dumb retry logic to the openstack commands to see if that helps.
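
One possible shape for the "more intelligent" readiness check mentioned above is a poll with a deadline in front of post-setup, rather than a retry around each command. The following is a minimal sketch, not part of this commit: the helper name `wait_until_ready` and the choice of probe are assumptions, and inside the nested heredocs of the lab script the `$` signs would need the same escaping as the committed function.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the commit): poll a probe command until it
# succeeds or a deadline passes.
# Usage: wait_until_ready TIMEOUT_SECONDS INTERVAL_SECONDS PROBE [ARGS...]
wait_until_ready() {
    local timeout="$1" interval="$2"
    shift 2
    # SECONDS is a bash builtin counting seconds since the shell started.
    local deadline=$(( SECONDS + timeout ))
    until "$@" >/dev/null 2>&1; do
        if [ "$SECONDS" -ge "$deadline" ]; then
            echo "not ready after ${timeout}s: $*" >&2
            return 1
        fi
        sleep "$interval"
    done
    echo "ready: $*"
}

# Example probe. Assumption: listing compute services only succeeds once the
# API is reachable and authentication works; a real check may need more.
# wait_until_ready 300 10 openstack --os-cloud default compute service list
```

Passing the probe as separate arguments avoids `eval`, and the non-zero return lets the caller decide whether to abort or continue when the cluster never becomes ready.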
1 parent 9565999 commit def76c3

1 file changed: 32 additions, 26 deletions


scripts/hyperconverged-lab.sh

Lines changed: 32 additions & 26 deletions
@@ -1261,32 +1261,38 @@ set -e
 sudo bash <<HERE
 sudo /opt/genestack/bin/setup-openstack-rc.sh
 source /opt/genestack/scripts/genestack.rc
-if ! openstack --os-cloud default flavor show ${LAB_NAME_PREFIX}-test; then
-    openstack --os-cloud default flavor create ${LAB_NAME_PREFIX}-test \
-        --public \
-        --ram 2048 \
-        --disk 10 \
-        --vcpus 2
-fi
-if ! openstack --os-cloud default network show flat; then
-    openstack --os-cloud default network create \
-        --share \
-        --availability-zone-hint az1 \
-        --external \
-        --provider-network-type flat \
-        --provider-physical-network physnet1 \
-        flat
-fi
-if ! openstack --os-cloud default subnet show flat_subnet; then
-    openstack --os-cloud default subnet create \
-        --subnet-range 192.168.102.0/24 \
-        --gateway 192.168.102.1 \
-        --dns-nameserver 1.1.1.1 \
-        --allocation-pool start=192.168.102.100,end=192.168.102.109 \
-        --dhcp \
-        --network flat \
-        flat_subnet
-fi
+
+# Function to retry openstack commands with backoff
+retry_openstack_command() {
+    local cmd="\\\$1"
+    local description="\\\$2"
+    local retry_count=0
+    local max_retries=30 # 30 retries * 10 seconds = 5 minutes
+
+    while [ \\\$retry_count -lt \\\$max_retries ]; do
+        if eval "\\\$cmd" 2>/dev/null; then
+            echo "\\\$description succeeded"
+            return 0
+        else
+            retry_count=\\\$((retry_count + 1))
+            echo "\\\$description failed (attempt \\\$retry_count/\\\$max_retries). Retrying in 10 seconds..."
+            if [ \\\$retry_count -eq \\\$max_retries ]; then
+                echo "\\\$description failed after \\\$max_retries attempts. Continuing anyway..."
+                return 1
+            fi
+            sleep 10
+        fi
+    done
+}
+
+# Create flavor with retry
+retry_openstack_command "if ! openstack --os-cloud default flavor show ${LAB_NAME_PREFIX}-test; then openstack --os-cloud default flavor create ${LAB_NAME_PREFIX}-test --public --ram 2048 --disk 10 --vcpus 2; fi" "Flavor setup"
+
+# Create network with retry
+retry_openstack_command "if ! openstack --os-cloud default network show flat; then openstack --os-cloud default network create --share --availability-zone-hint az1 --external --provider-network-type flat --provider-physical-network physnet1 flat; fi" "Network setup"
+
+# Create subnet with retry
+retry_openstack_command "if ! openstack --os-cloud default subnet show flat_subnet; then openstack --os-cloud default subnet create --subnet-range 192.168.102.0/24 --gateway 192.168.102.1 --dns-nameserver 1.1.1.1 --allocation-pool start=192.168.102.100,end=192.168.102.109 --dhcp --network flat flat_subnet; fi" "Subnet setup"
 HERE
 EOC
 
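
The comment in the diff describes the helper as retrying "with backoff", but the committed loop sleeps a fixed 10 seconds between attempts and discards stderr. If real backoff turns out to be useful, a variant along these lines might work. This is a sketch under assumptions, not the committed code: `retry_with_backoff` and `ensure_flat_network` are hypothetical names, the attempt count and delays are arbitrary, and the snippet is written as a standalone script, so the heredoc escaping used in the lab script is omitted.

```shell
#!/usr/bin/env bash
# Hypothetical variant of retry_openstack_command (not the committed code).
# Doubles the delay after each failure, capped at MAX_DELAY, and leaves the
# command's stderr visible so the failure cause is not hidden.
# Usage: retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY MAX_DELAY COMMAND [ARGS...]
retry_with_backoff() {
    local max_attempts="$1" delay="$2" max_delay="$3"
    shift 3
    local attempt=1
    while true; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after ${attempt} attempts: $*" >&2
            return 1
        fi
        echo "attempt ${attempt}/${max_attempts} failed; retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$(( attempt + 1 ))
        delay=$(( delay * 2 ))
        if [ "$delay" -gt "$max_delay" ]; then
            delay="$max_delay"
        fi
    done
}

# The "create if missing" step as a function, so it can be retried as a unit
# without building a command string for eval.
ensure_flat_network() {
    openstack --os-cloud default network show flat >/dev/null 2>&1 ||
        openstack --os-cloud default network create \
            --share \
            --availability-zone-hint az1 \
            --external \
            --provider-network-type flat \
            --provider-physical-network physnet1 \
            flat
}

# Example call (values are illustrative):
# retry_with_backoff 8 5 60 ensure_flat_network
```

Wrapping each step in a function keeps the multi-line flag layout of the original commands readable, which the single-string form in the diff gives up.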
