 Device Specification for Inter-VM shared memory device
 ------------------------------------------------------

-The Inter-VM shared memory device is designed to share a region of memory to
-userspace in multiple virtual guests. The memory region does not belong to any
-guest, but is a POSIX memory object on the host. Optionally, the device may
-support sending interrupts to other guests sharing the same memory region.
+The Inter-VM shared memory device is designed to share a memory region (created
+on the host via the POSIX shared memory API) between multiple QEMU processes
+running different guests. In order for all guests to be able to pick up the
+shared memory area, it is modeled by QEMU as a PCI device exposing said memory
+to the guest as a PCI BAR.
+The memory region does not belong to any guest, but is a POSIX memory object on
+the host. The host can access this shared memory if needed.
+
+The device also provides an optional communication mechanism between guests
+sharing the same memory object. See the 'Guest to guest communication' section
+for details.


 The Inter-VM PCI device
 -----------------------

-*BARs*
+From the VM point of view, the ivshmem PCI device supports three BARs.
+
+- BAR0 is a 1 Kbyte MMIO region to support registers and interrupts when MSI is
+  not used.
+- BAR1 is used for MSI-X when it is enabled in the device.
+- BAR2 is used to access the shared memory object.
+
+It is your choice how to use the device, but you must choose between two
+behaviors:
+
+- if you only need the shared memory part, you will map BAR2.
+  This way, you have access to the shared memory in the guest and can use it as
+  you see fit (memnic, for example, uses it in userland
+  http://dpdk.org/browse/memnic).
+
+- BAR0 and BAR1 are used to implement an optional communication mechanism
+  through interrupts in the guests. If you need an event mechanism between the
+  guests accessing the shared memory, you will most likely want to write a
+  kernel driver that will handle interrupts. See the 'Guest to guest
+  communication' section for details.
+
+The behavior is chosen when starting your QEMU processes:
+- no communication mechanism needed: the first QEMU to start creates the shared
+  memory on the host, subsequent QEMU processes will use it.
+
+- communication mechanism needed: an ivshmem server must be started before any
+  QEMU processes, then each QEMU process connects to the server unix socket.
+
+For more details on the QEMU ivshmem parameters, see qemu-doc documentation.
+
+
+Guest to guest communication
+----------------------------
+
+This section details the communication mechanism between the guests accessing
+the ivshmem shared memory.

-The device supports three BARs. BAR0 is a 1 Kbyte MMIO region to support
-registers. BAR1 is used for MSI-X when it is enabled in the device. BAR2 is
-used to map the shared memory object from the host. The size of BAR2 is
-specified when the guest is started and must be a power of 2 in size.
+*ivshmem server*

-*Registers*
+This server code is available in qemu.git/contrib/ivshmem-server.

-The device currently supports 4 registers of 32-bits each. Registers
-are used for synchronization between guests sharing the same memory object when
-interrupts are supported (this requires using the shared memory server).
+The server must be started on the host before any guest.
+It creates a shared memory object then waits for clients to connect on a unix
+socket.

-The server assigns each VM an ID number and sends this ID number to the QEMU
-process when the guest starts.
+For each client (QEMU process) that connects to the server:
+- the server assigns an ID for this client and sends this ID to it as the first
+  message,
+- the server sends an fd for the shared memory object to this client,
+- the server creates a new set of host eventfds associated to the new client and
+  sends this set to all already connected clients,
+- finally, the server sends all the eventfd sets for all clients to the new
+  client.
+
+The server signals all clients when one of them disconnects.
+
+The client IDs are limited to 16 bits because of the current implementation (see
+Doorbell register in 'PCI device registers' subsection). Hence at most 65536
+clients are supported.
+
+All the file descriptors (fd to the shared memory, eventfds for each client)
+are passed to clients using SCM_RIGHTS over the server unix socket.
+
+Apart from the current ivshmem implementation in QEMU, an ivshmem client has
+been provided in qemu.git/contrib/ivshmem-client for debug.
+
+*QEMU as an ivshmem client*
+
+At initialisation, when creating the ivshmem device, QEMU gets its ID from the
+server then makes it available through BAR0 IVPosition register for the VM to
+use (see 'PCI device registers' subsection).
+QEMU then uses the fd to the shared memory to map it to BAR2.
+eventfds for all other clients received from the server are stored to implement
+BAR0 Doorbell register (see 'PCI device registers' subsection).
+Finally, eventfds assigned to this QEMU process are used to send interrupts in
+this VM.
+
+*PCI device registers*
+
+From the VM point of view, the ivshmem PCI device supports 4 registers of
+32-bits each.

 enum ivshmem_registers {
     IntrMask = 0,
@@ -49,8 +122,8 @@ bit to 0 and unmasked by setting the first bit to 1.
 IVPosition Register: The IVPosition register is read-only and reports the
 guest's ID number. The guest IDs are non-negative integers. When using the
 server, since the server is a separate process, the VM ID will only be set when
-the device is ready (shared memory is received from the server and accessible via
-the device). If the device is not ready, the IVPosition will return -1.
+the device is ready (shared memory is received from the server and accessible
+via the device). If the device is not ready, the IVPosition will return -1.
 Applications should ensure that they have a valid VM ID before accessing the
 shared memory.

@@ -59,8 +132,8 @@ Doorbell register. The doorbell register is 32-bits, logically divided into
 two 16-bit fields. The high 16-bits are the guest ID to interrupt and the low
 16-bits are the interrupt vector to trigger. The semantics of the value
 written to the doorbell depends on whether the device is using MSI or a regular
-pin-based interrupt. In short, MSI uses vectors while regular interrupts set the
-status register.
+pin-based interrupt. In short, MSI uses vectors while regular interrupts set
+the status register.

 Regular Interrupts

@@ -71,7 +144,7 @@ interrupt in the destination guest.

 Message Signalled Interrupts

-A ivshmem device may support multiple MSI vectors. If so, the lower 16-bits
+An ivshmem device may support multiple MSI vectors. If so, the lower 16-bits
 written to the Doorbell register must be between 0 and the maximum number of
 vectors the guest supports. The lower 16 bits written to the doorbell is the
 MSI vector that will be raised in the destination guest. The number of MSI
@@ -83,14 +156,3 @@ interrupt itself should be communicated via the shared memory region. Devices
 supporting multiple MSI vectors can use different vectors to indicate different
 events have occurred. The semantics of interrupt vectors are left to the
 user's discretion.
-
-
-Usage in the Guest
-------------------
-
-The shared memory device is intended to be used with the provided UIO driver.
-Very little configuration is needed. The guest should map BAR0 to access the
-registers (an array of 32-bit ints allows simple writing) and map BAR2 to
-access the shared memory region itself. The size of the shared memory region
-is specified when the guest (or shared memory server) is started. A guest may
-map the whole shared memory region or only part of it.
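
Note on the Doorbell layout described in this patch: the spec says the high
16 bits of the value written select the guest ID to interrupt and the low 16
bits select the interrupt vector. A minimal sketch of that encoding in C
follows. The helper names (ivshmem_doorbell_value, ivshmem_doorbell_peer,
ivshmem_doorbell_vector, ivshmem_ring) are hypothetical and are not part of
QEMU or of any driver. The caller is assumed to pass a pointer to the Doorbell
register inside an already mapped BAR0, so no register offset is assumed here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: only the 16/16 bit split comes from the spec text. */

/* Build a Doorbell value: guest ID in the high 16 bits, vector in the low 16. */
static inline uint32_t ivshmem_doorbell_value(uint16_t peer_id, uint16_t vector)
{
    return ((uint32_t)peer_id << 16) | vector;
}

/* Extract the destination guest ID from a Doorbell value. */
static inline uint16_t ivshmem_doorbell_peer(uint32_t value)
{
    return (uint16_t)(value >> 16);
}

/* Extract the interrupt vector from a Doorbell value. */
static inline uint16_t ivshmem_doorbell_vector(uint32_t value)
{
    return (uint16_t)(value & 0xffffu);
}

/*
 * Write the encoded value to the Doorbell register.
 * Assumption: 'doorbell' points at the Doorbell register inside a BAR0
 * mapping that the caller has already set up.
 */
static inline void ivshmem_ring(volatile uint32_t *doorbell,
                                uint16_t peer_id, uint16_t vector)
{
    assert(doorbell);
    *doorbell = ivshmem_doorbell_value(peer_id, vector);
}
```

Because the guest ID occupies a 16-bit field, this encoding also shows why
the patch limits client IDs to 16 bits.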