Why
Since Linux kernel v6.6, the VFIO subsystem has been extended to support a new device access model - VFIO cdev [1][2]. Together with the new IOMMUFD subsystem [3], advanced features of modern hardware can be fully exposed to and utilized by the guest, enabling a fully accelerated IOMMU in the guest. One example is fully accelerated GPU passthrough on NVIDIA's Grace Hopper/Blackwell systems [4].
What
Some highlights quoted from the VFIO kernel documentation [1]:
IOMMUFD is the new user API to manage I/O page tables from userspace. It intends to be the portal of delivering advanced userspace DMA features, while also providing a backwards compatibility interface for existing VFIO_TYPE1v2_IOMMU use cases. Eventually the vfio_iommu_type1 driver, as well as the legacy vfio container and group model is intended to be deprecated.
Traditionally user acquires a VFIO device fd via VFIO_GROUP_GET_DEVICE_FD in a VFIO group, which is considered as the legacy VFIO interface. With the new VFIO cdev interface, user can now acquire a device fd by directly opening a character device /dev/vfio/devices/vfioX.
How
Here is the tentative plan to complete this support:
- Support the new VFIO uAPIs in the `vfio-bindings` crate;
- Refactor the `vfio-ioctls` crate to support multiple VFIO interfaces, i.e. both the legacy and the cdev+iommufd interfaces;
- Add VFIO cdev+iommufd support to the `vfio-ioctls` crate.
My plan is to use Cloud Hypervisor [5] as the development and validation vehicle for this work. I will create separate sub-issues to track each of the work items listed above.
Please let me know if you have any questions or suggestions. Thank you.
[1] https://docs.kernel.org/driver-api/vfio.html#vfio-device-cdev
[2] https://lore.kernel.org/all/[email protected]/
[3] https://docs.kernel.org/userspace-api/iommufd.html
[4] https://resources.nvidia.com/en-us-grace-cpu/nvidia-grace-hopper
[5] https://github.com/cloud-hypervisor/cloud-hypervisor/