Skip to content

Conversation

gurasinghMS
Copy link
Contributor

This PR will add a fault functionality while changing the cc enable bit of the nvme fault controller

@gurasinghMS gurasinghMS changed the title wip: cc.enable() delay during servicing for the NvmeFaultController nvme_test: cc.enable() delay during servicing for the NvmeFaultController Aug 28, 2025
@gurasinghMS gurasinghMS changed the title nvme_test: cc.enable() delay during servicing for the NvmeFaultController nvme_test: cc.enable() delay capability during servicing for the NvmeFaultController Aug 28, 2025
@gurasinghMS gurasinghMS marked this pull request as ready for review August 28, 2025 00:20
@Copilot Copilot AI review requested due to automatic review settings August 28, 2025 00:20
@gurasinghMS gurasinghMS requested review from a team as code owners August 28, 2025 00:20
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds fault injection capabilities for PCI controller management operations in the NvMe test framework, specifically enabling delay faults during the cc.enable() process. The changes allow testing of NvMe controller behavior when delays are introduced during controller enablement operations.

Key changes:

  • Introduces a new PciFaultConfig struct with support for delay behavior during cc.enable() operations
  • Updates all test instantiations to include the new PCI fault configuration
  • Implements fault processing logic in the controller's cc.enable() handler

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
vm/devices/storage/nvme_resources/src/fault.rs Adds new PciFaultConfig and PciFaultBehavior types for PCI fault injection
vm/devices/storage/nvme_test/src/pci.rs Implements fault handling in the cc.enable() logic and stores fault configuration
vm/devices/storage/nvme_test/src/tests/controller_tests.rs Updates test instantiations to include PCI fault configuration
vm/devices/storage/nvme_test/src/tests/shadow_doorbell_tests.rs Updates test instantiations to include PCI fault configuration
vm/devices/storage/disk_nvme/nvme_driver/src/tests.rs Updates test instantiations to include PCI fault configuration
vmm_tests/vmm_tests/tests/tests/multiarch/openhcl_servicing.rs Updates test instantiations and improves error message clarity
Comments suppressed due to low confidence (1)

vm/devices/storage/nvme_resources/src/fault.rs:9

  • The Duration import is used by the new PciFaultBehavior::Delay variant but there's no documentation explaining the expected duration ranges or behavior. Consider adding documentation to clarify appropriate delay values for testing scenarios.
use std::time::Duration;

.controller_management_fault_enable
{
PciFaultBehavior::Delay(duration) => {
std::thread::sleep(duration);
Copy link
Preview

Copilot AI Aug 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using std::thread::sleep() in an async context can block the entire thread and prevent other async tasks from running. Consider using tokio::time::sleep() or another async-compatible delay mechanism instead.

Copilot uses AI. Check for mistakes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have an async sleep available through PolledTimer that should be used instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is not async.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well thats what I get for just reading copilot's comment haha

@gurasinghMS
Copy link
Contributor Author

@mattkur I am adding another fault type enum for the PCI style faults as we discussed earlier. A little unsure of the naming here because technically speaking cc.en() is still an NVMe concept. I am a little unsure if calling this a PCI fault is the right thing to do.
TODO: Fault configuration should also have a builder-style implementation. Will make that a follow up PR to this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants