5 changes: 5 additions & 0 deletions README.md
@@ -48,6 +48,9 @@ The following specifications have been accepted by the UAPI group:
* [UAPI.13 Efficient time synchronisation for virtual machines](specs/vmclock.md):
Describes the format and mechanism to synchronize the guest clock.
([canonical online location](https://uapi-group.org/specifications/specs/vmclock/))
* [UAPI.14 Virtual Machine Generation ID](specs/vmgenid.md):
Describes the mechanism for detecting virtual machine rollback events.
([canonical online location](https://uapi-group.org/specifications/specs/vmgenid/))

## Work in Progress

@@ -90,6 +93,8 @@ This section clarifies on terms and abbreviations used in specs and other docume
- [*sysext*](specs/extension_image.md) – System Extension Image
(type of DDI that is overlaid on top of `/usr/` and `/opt/` via overlayfs and can extend the underlying OS vendor resources in a composable, immutable fashion)
- [*UKI*](specs/unified_kernel_image.md) - Unified Kernel Images (UEFI boot stub + kernel + initrd + more)
- [*VMClock*](specs/vmclock.md) – Virtual Machine Clock (efficient time synchronisation for virtual machines)
- [*VMGenID*](specs/vmgenid.md) – Virtual Machine Generation ID (mechanism for detecting VM rollback events)
- [*VOA*](specs/file_hierarchy_for_the_verification_of_os_artifacts.md) – Verification of OS Artifacts

## Participate
26 changes: 24 additions & 2 deletions specs/vmclock.md
@@ -134,7 +134,7 @@ complete, as described below.
| 0x50 | `uint64_t time_frac_sec` | Fractional part of reference time, in units of second / 2⁶⁴. |
| 0x58 | `uint64_t time_esterror_nanosec` | Estimated ± error of the time given in `time_sec` + `time_frac_sec`, in nanoseconds |
| 0x60 | `uint64_t time_maxerror_nanosec` | Maximum ± error of the time given in `time_sec` + `time_frac_sec`, in nanoseconds |
| 0x68 | `uint64_t vm_generation_count` | A change in this field indicates that the guest has been loaded from a snapshot. In addition to handling a disruption in time (which will also be signalled through the `disruption_marker` field), a guest may wish to discard UUIDs, reset network connections or reseed entropy, etc. |
| 0x68 | `uint64_t vm_generation_count` | A change in this field indicates that the guest has been cloned or loaded from a snapshot (see below). |
| 0x70 | ... | The size of the memory region containing this structure is given in the `size` field, which will typically be a full 4KiB page. New fields may be added here, advertised by newly-defined bits in the `flags` field, without changing the `version` field. |

### Feature Flags (0x18)
@@ -195,6 +195,28 @@ The value of this field shall be valid for the point in time referenced by the
| 0x04 | `VMCLOCK_LEAP_POST_POS` | A positive leap second occurred at the end of the previous month |
| 0x05 | `VMCLOCK_LEAP_POST_NEG` | A negative leap second occurred at the end of the previous month |

### VM Generation Count (0x68)

A change in this field indicates that the guest has been cloned or loaded from a snapshot. On observing a change, the operating system may wish to regenerate unique identifiers, reset network connections, or reseed its entropy sources.

The conditions under which this counter changes are identical to those of the [VMGenID device](vmgenid.md). The `vm_generation_count` changes whenever the VM is restored to an earlier or non-unique state:

- Snapshot restoration
- Backup recovery
- VM cloning/copying/import
- Disaster recovery failover

The `vm_generation_count` remains constant during normal VM operations:

- Pause/resume
- Shutdown/restart/reboot
- Host reboot or upgrade
- Live migration or lossless online failover

The `disruption_marker` and `vm_generation_count` fields indicate two orthogonal, but sometimes correlated, types of event. In most cases the `disruption_marker` also changes when the `vm_generation_count` changes, but the converse does not hold.

It is possible that a VM could be cloned (forked) while running on the same host, such that the precision of the hardware counter is not lost, but the uniqueness is. That would be the rare case where the `vm_generation_count` would be changed but not the `disruption_marker`.
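
As an illustration of treating the two fields independently, a guest might cache the last observed values and classify each change separately. The following is a minimal sketch, not part of the specification: the `vm_event` bits, the `vm_event_tracker` structure and the `vm_event_tracker_update()` helper are hypothetical names, and the caller is assumed to have already read both values consistently from the VMClock structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event bits for this sketch; the specification does not define them. */
enum vm_event {
    VM_EVENT_CLOCK_DISRUPTED = 1u << 0, /* disruption_marker changed */
    VM_EVENT_UNIQUENESS_LOST = 1u << 1, /* vm_generation_count changed */
};

/* Last values the guest observed (hypothetical bookkeeping structure). */
struct vm_event_tracker {
    uint64_t disruption_marker;
    uint64_t vm_generation_count;
};

/*
 * Compare freshly read field values against the cached ones, update the
 * cache, and report which kinds of event occurred. Either bit, both bits,
 * or neither may be set in the result.
 */
static unsigned int vm_event_tracker_update(struct vm_event_tracker *t,
                                            uint64_t disruption_marker,
                                            uint64_t vm_generation_count)
{
    unsigned int events = 0;

    assert(t != NULL);
    if (disruption_marker != t->disruption_marker) {
        events |= VM_EVENT_CLOCK_DISRUPTED;  /* e.g. recalibrate timekeeping */
        t->disruption_marker = disruption_marker;
    }
    if (vm_generation_count != t->vm_generation_count) {
        events |= VM_EVENT_UNIQUENESS_LOST;  /* e.g. reseed RNG, regenerate IDs */
        t->vm_generation_count = vm_generation_count;
    }
    return events;
}
```

A live migration would typically yield only the clock-disruption bit, a same-host fork only the uniqueness bit, and a snapshot restore usually both.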

## Calculating time

The VMClock structure provides the following values:
@@ -258,7 +280,7 @@ To expose VMClock to the operating system via ACPI, the firmware or hypervisor m

## Discovery via Device Tree

Similar to the ACPI binding above, the BIOS or hypervisor must place the
Similar to the ACPI binding above, the firmware or hypervisor must place the
`vmclock_abi` structure in an otherwise unused region of physical memory and
advertise its presence to the operating system. The Device Tree binding for the
`amazon,vmclock` node is as follows:
126 changes: 126 additions & 0 deletions specs/vmgenid.md
@@ -0,0 +1,126 @@
---
title: UAPI.14 VMGenID
category: Concepts
layout: default
version: 1.0
SPDX-License-Identifier: CC-BY-4.0
weight: 14
aliases:
- /UAPI.14
- /14
---

# UAPI.14 VMGenID: Virtual Machine Generation ID

| Version | Changes |
|---------|---------|
| 1.0 | Initial Release |

Virtual machine operations that restore a VM to an earlier point in time (such as applying snapshots, restoring from backup, cloning, or failover scenarios) can cause serious problems for applications that depend on unique identifiers or cryptographic entropy. The Virtual Machine Generation ID (VMGenID) device provides a mechanism for guest software to detect when such operations have occurred.

The VMGenID is a 128-bit cryptographically random identifier that changes whenever a virtual machine is cloned or restored to an earlier state. This allows applications to detect such events and take appropriate protective measures, such as reseeding random number generators, regenerating unique identifiers, or invalidating cached state.

## The vmgenid_abi Structure

The hypervisor provides 16 bytes in shared memory containing the generation ID. The structure can be represented as two little-endian 64-bit values, and must be placed in an 8-byte aligned buffer.

### Structure Fields

| Offset | Field | Description |
|--------|-------|-------------|
| 0x00 | `uint64_t generation_id_low` | Lower 64 bits of the 128-bit generation ID |
| 0x08 | `uint64_t generation_id_high` | Upper 64 bits of the 128-bit generation ID |

The generation ID is a 128-bit cryptographically random value that is unique across all VM instances and time. All 128 bits are random; it is *not* a Version 4 UUID.
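
The layout above could be expressed in C roughly as follows. This is a sketch under assumptions: the specification names the structure and its two fields, but the `load_le64()`, `vmgenid_decode()` and `vmgenid_equal()` helpers are hypothetical, and the 16 raw bytes are assumed to have already been copied out of the shared memory region.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The two fields from the table above, in host byte order once decoded. */
struct vmgenid_abi {
    uint64_t generation_id_low;   /* offset 0x00 */
    uint64_t generation_id_high;  /* offset 0x08 */
};

static_assert(sizeof(struct vmgenid_abi) == 16,
              "the generation ID occupies exactly 16 bytes");

/* Read a little-endian 64-bit value regardless of host endianness. */
static uint64_t load_le64(const uint8_t *p)
{
    uint64_t v = 0;

    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Decode 16 raw bytes copied from the shared region (hypothetical helper). */
static struct vmgenid_abi vmgenid_decode(const uint8_t raw[16])
{
    struct vmgenid_abi id = {
        .generation_id_low  = load_le64(raw),
        .generation_id_high = load_le64(raw + 8),
    };
    return id;
}

/* The ID is opaque: the only meaningful operation is an equality test. */
static bool vmgenid_equal(const struct vmgenid_abi *a,
                          const struct vmgenid_abi *b)
{
    return a->generation_id_low == b->generation_id_low &&
           a->generation_id_high == b->generation_id_high;
}
```

Because all 128 bits are random, the sketch deliberately offers no ordering or version extraction, only decoding and comparison.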

The generation ID changes whenever the VM is restored to an earlier or non-unique state:

- Snapshot restoration
- Backup recovery
- VM cloning/copying/import
- Disaster recovery failover

The generation ID remains constant during normal VM operations:

- Pause/resume
- Shutdown/restart/reboot
- Host reboot or upgrade
- Live migration or lossless online failover

Events such as live migrations, which merely disrupt the VM's clock without changing the uniqueness of its identity, do not result in a change to the generation ID. Conversely, cloning (forking) a running VM on the same host would result in a new generation ID without disrupting the timekeeping. Guests which want to detect clock disruption should use the [VMClock device](vmclock.md) for that purpose.

### GUID interoperability

If the generation ID is represented as a GUID for the purpose of storage or configuration by a Virtual Machine Monitor, it is recommended that:

- The generation ID shared to the guest is the little-endian representation of that GUID
- The textual representation of the GUID, in display or configuration, is the RFC 4122 standard big-endian form
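
To illustrate the recommendation, the sketch below converts the 16 bytes seen by the guest into the textual form. It assumes the conventional GUID memory layout, in which the first three fields (32, 16 and 16 bits) are stored little-endian and the remaining eight bytes are stored in order; the `vmgenid_format_guid()` and `vmgenid_matches_text()` helpers are hypothetical names, not part of the specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format the 16 bytes shared with the guest as a lower-case textual GUID.
 * The first three groups are byte-swapped because they are little-endian
 * in memory but written most significant byte first in text.
 * `out` must have room for 36 characters plus the terminating NUL.
 */
static int vmgenid_format_guid(const uint8_t raw[16], char out[37])
{
    assert(raw != NULL && out != NULL);
    return snprintf(out, 37,
                    "%02x%02x%02x%02x-%02x%02x-%02x%02x-"
                    "%02x%02x-%02x%02x%02x%02x%02x%02x",
                    raw[3], raw[2], raw[1], raw[0],   /* 32-bit field */
                    raw[5], raw[4],                   /* 16-bit field */
                    raw[7], raw[6],                   /* 16-bit field */
                    raw[8], raw[9],                   /* stored in order */
                    raw[10], raw[11], raw[12], raw[13], raw[14], raw[15]);
}

/* Check guest-visible bytes against a lower-case GUID string from configuration. */
static bool vmgenid_matches_text(const uint8_t raw[16], const char *text)
{
    char buf[37];

    vmgenid_format_guid(raw, buf);
    return strcmp(buf, text) == 0;
}
```

With this convention, guest memory beginning with the bytes `33 22 11 00` corresponds to a textual GUID beginning with `00112233`.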


## Discovery via ACPI

To expose VMGenID to the operating system via ACPI, the firmware or hypervisor must:

1. Place the shared `vmgenid_abi` structure somewhere in RAM, ROM or device memory space, which is guaranteed not to be used by the operating system. It must not be in ranges reported as `AddressRangeMemory` or `AddressRangeACPI`, and must not be in the same page as any memory which is expected to be mapped by a page table entry with caching disabled.

2. Expose a device somewhere in the ACPI namespace with:
   - a hardware ID (`_HID`) that is hypervisor-specific
   - a DOS Device Name ID (`_DDN`) of "VM_Gen_Counter"
   - a compatible ID (`_CID`) of "VM_Gen_Counter"

3. Attach to the device an `ADDR` method which when evaluated returns the 64-bit physical address of the generation ID structure as a package containing the low and high 32-bit address components in that order.

4. After the generation ID changes, the device shall raise an ACPI Notify operation using notification code 0x80. The device may raise the notify operation even if the generation ID has not changed.
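
On the guest side, steps 3 and 4 might be handled along the following lines. This is a sketch only: evaluating `ADDR` and receiving the notification depend on the operating system's ACPI layer, which is not shown, and the `vmgenid_addr_from_package()` and `vmgenid_handle_notify()` helpers and the `VMGENID_NOTIFY_CODE` macro are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Notification code given in step 4; the macro name is hypothetical. */
#define VMGENID_NOTIFY_CODE 0x80u

/* Combine the two package elements returned by ADDR (low first, then high). */
static uint64_t vmgenid_addr_from_package(uint32_t low, uint32_t high)
{
    return ((uint64_t)high << 32) | low;
}

/*
 * Handle a notification on the device. `current` holds the 16 bytes just
 * re-read from the shared structure and `cached` the previously seen ID.
 * Returns true only if the ID really changed, since the device is allowed
 * to notify without a change; the cache is updated in that case.
 */
static bool vmgenid_handle_notify(uint32_t code,
                                  const uint8_t current[16],
                                  uint8_t cached[16])
{
    assert(current != NULL && cached != NULL);
    if (code != VMGENID_NOTIFY_CODE)
        return false;               /* not a generation ID notification */
    if (memcmp(current, cached, 16) == 0)
        return false;               /* spurious notification, nothing to do */
    memcpy(cached, current, 16);
    return true;                    /* caller should reseed, regenerate IDs */
}
```

Comparing against a cached copy keeps the guest's reaction idempotent when the device raises the notification without an actual change.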

## Discovery via Device Tree

The firmware or hypervisor must place the `vmgenid_abi` structure in an otherwise unused region of physical memory and advertise its presence to the operating system. The Device Tree binding for the `microsoft,vmgenid` node is as follows:

```yaml
# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
%YAML 1.2
---
$id: http://devicetree.org/schemas/rng/microsoft,vmgenid.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#

title: Virtual Machine Generation ID

maintainers:
  - Jason A. Donenfeld <[email protected]>

description:
  Firmwares or hypervisors can use this devicetree to describe an
  interrupt and a shared resource to inject a Virtual Machine Generation ID.
  Virtual Machine Generation ID is a globally unique identifier (GUID) and
  the devicetree binding follows VMGenID specification.

properties:
  compatible:
    const: microsoft,vmgenid

  reg:
    description:
      Specifies a 16-byte VMGenID in endianness-agnostic hexadecimal format.
    maxItems: 1

  interrupts:
    description:
      Interrupt used to notify that a new VMGenID is available.
    maxItems: 1

required:
  - compatible
  - reg
  - interrupts

additionalProperties: false

examples:
  - |
    #include <dt-bindings/interrupt-controller/arm-gic.h>
    rng@80000000 {
      compatible = "microsoft,vmgenid";
      reg = <0x80000000 0x1000>;
      interrupts = <GIC_SPI 35 IRQ_TYPE_EDGE_RISING>;
    };
```