sled agent: don't special-case vmm-not-present handling for requests to stop (#6698)

gjcolombo · web-flow · commit 69da5d699dbf · 2024-09-27T09:24:42.000-07:00
When sled agent receives a request to stop a VMM that's not in the agent's VMM table, return `NoSuchVmm` instead of succeeding. This allows users to manually recover an instance that was Running prior to a sled reboot but hasn't yet been moved to Failed by the instance watcher.

Tested manually as follows:

1. Modify sled agent's VMM worker loop so that it doesn't publish VMM state before exiting; this is needed so that manually unregistering an instance from a sled doesn't cause it to go to Stopped.
2. Launch a dev cluster with both (1) and the change in this PR.
3. Start an instance, then send an HTTP DELETE to sled agent's internal API to forcibly unregister the VMM.
4. Observe that the instance remains Running in the console.
5. Stop the instance; observe that the "not found, going to Failed" message is displayed and that the instance then goes to Failed.

Fixes #4511.
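
The behavior change can be illustrated with a minimal, self-contained sketch. This is not the sled agent code: `VmmId`, `VmmTable`, `put_state_old`, and `put_state_new` are hypothetical names, the `VmmStateRequested` and `Error` enums are cut-down stand-ins for the real types, and the response channel is replaced by a plain return value. It only contrasts the two policies for a stop request aimed at a VMM that is missing from the table.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the Propolis/VMM ID used by sled agent.
type VmmId = u32;

/// Simplified copy of the requested-state enum (assumption: the real
/// enum has more variants than shown here).
#[derive(Clone, Copy, Debug, PartialEq)]
enum VmmStateRequested {
    Running,
    Stopped,
}

/// Simplified error type carrying only the variant discussed here.
#[derive(Debug, PartialEq)]
enum Error {
    NoSuchVmm(VmmId),
}

/// Hypothetical VMM table: maps each registered VMM to its last
/// requested state.
struct VmmTable {
    vmms: HashMap<VmmId, VmmStateRequested>,
}

impl VmmTable {
    /// Old policy: a stop request for an unknown VMM succeeds, so the
    /// caller cannot tell "already stopped" apart from "never heard of it".
    fn put_state_old(
        &mut self,
        id: VmmId,
        target: VmmStateRequested,
    ) -> Result<(), Error> {
        match self.vmms.get_mut(&id) {
            Some(state) => {
                *state = target;
                Ok(())
            }
            None if target == VmmStateRequested::Stopped => Ok(()),
            None => Err(Error::NoSuchVmm(id)),
        }
    }

    /// New policy: any request for an unknown VMM reports `NoSuchVmm`,
    /// regardless of the requested state.
    fn put_state_new(
        &mut self,
        id: VmmId,
        target: VmmStateRequested,
    ) -> Result<(), Error> {
        let Some(state) = self.vmms.get_mut(&id) else {
            return Err(Error::NoSuchVmm(id));
        };
        *state = target;
        Ok(())
    }
}
```

Under the new policy the caller always learns that the VMM is absent, which is what lets a stop request on a stale Running instance lead to the "not found, going to Failed" path described in step 5.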
diff --git a/sled-agent/src/instance_manager.rs b/sled-agent/src/instance_manager.rs
@@ -650,24 +650,8 @@ impl InstanceManagerRunner {
         target: VmmStateRequested,
     ) -> Result<(), Error> {
         let Some(instance) = self.get_propolis(propolis_id) else {
-            match target {
-                // If the instance isn't registered, then by definition it
-                // isn't running here. Allow requests to stop or destroy the
-                // instance to succeed to provide idempotency. This has to
-                // be handled here (that is, on the "instance not found"
-                // path) to handle the case where a stop request arrived,
-                // Propolis handled it, sled agent unregistered the
-                // instance, and only then did a second stop request
-                // arrive.
-                VmmStateRequested::Stopped => {
-                    tx.send(Ok(VmmPutStateResponse { updated_runtime: None }))
-                        .map_err(|_| Error::FailedSendClientClosed)?;
-                }
-                _ => {
-                    tx.send(Err(Error::NoSuchVmm(propolis_id)))
-                        .map_err(|_| Error::FailedSendClientClosed)?;
-                }
-            }
+            tx.send(Err(Error::NoSuchVmm(propolis_id)))
+                .map_err(|_| Error::FailedSendClientClosed)?;
             return Ok(());
         };
         instance.put_state(tx, target).await?;