Skip to content

Setting Virt-to-Boundary mappings for IPv6 inner addresses kernel panics due to alignment constraints #901

@FelixMcFelix

Description

@FelixMcFelix

During integrated testing of the IPv6 PR stack, @rcgoodfellow hit the following kernel panic when installing V2B mappings for use by OPTE:

root@oxz_switch:~# WARNING: panicked at xde/src/lib.rs:76:13:
kernel alloc greater than 8-byte alignment

panic[cpu0]/thread=fffffe0c319f0c40: panicked at xde/src/lib.rs:76:13:
kernel alloc greater than 8-byte alignment

fffffe0014eb0c30 xde:_RNvCsj0qrnF3AvHz_7___rustc17rust_begin_unwind+16d ()
fffffe0014eb0c60 xde:opte_panic_debug+3455ea33 ()
fffffe0014eb0e30 xde:_ZN9oxide_vpc6engine7overlay13Virt2Boundary17update_poptrie_v617hccd92ad035b12e59E+1694 ()
fffffe0014eb0ff0 xde:set_v2b_hdlr+a87 ()
fffffe0014eb1c90 xde:xde_ioc_opte_cmd+177 ()
fffffe0014eb1cc0 xde:xde_ioctl+77 ()
fffffe0014eb1d00 genunix:cdev_ioctl+3f ()
fffffe0014eb1d50 specfs:spec_ioctl+55 ()
fffffe0014eb1de0 genunix:fop_ioctl+40 ()
fffffe0014eb1f00 genunix:ioctl+144 ()
fffffe0014eb1f10 unix:brand_sys_syscall+1fe ()

dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
dumping:  0:09 100% done
100% done: 740227 pages dumped, dump succeeded
rebooting...

From what we can tell, update_poptrie_v6 constructs an Ipv6RoutingTable<T>(pub BTreeMap<(u128, u8), T>), and here T = TunnelEndpoint. TunnelEndpoint is byte-aligned, so we can rule that out. The key type (u128, u8) has an alignment of 16B because of that u128 member -- which causes us to blow up on one of the internal allocations of the BTreeMap. This is because kmem_alloc can only guarantee us 8B alignment, and we're honouring that:

opte/xde/src/lib.rs

Lines 54 to 84 in 46683e1

// On alignment, `kmem_alloc(9F)` has this of offer:
//
// > The allocated memory is at least double-word aligned, so it can
// > hold any C data structure. No greater alignment can be assumed.
//
// I really hate when documentation uses "word", because that seems to
// mean different things in different contexts. In this case I have to
// assume it means native integer size, or 32-bit in the case our our
// AMD64 kernel. This implis all allocations are at least 8-byte
// aligned, but could be more. However, the last sentence in the quote
// above says that you cannot assume alignment is ever greater than 8
// bytes. Therefore, it seems best to assume it's 8 bytes. For the
// purposes of implementing GlobalAlloc, I believe this means that we
// should return NULL for any Layout which requests more than 8-byte
// alignment (or probably just panic since I never expect this).
// Furthermore, things that can use smaller alignment will just have
// to live with the larger alignment.
struct KmemAlloc;
unsafe impl GlobalAlloc for KmemAlloc {
unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
if layout.align() > 8 {
panic!("kernel alloc greater than 8-byte alignment");
}
unsafe { kmem_alloc(layout.size(), KM_SLEEP) as *mut u8 }
}
unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
unsafe { kmem_free(ptr as *mut c_void, layout.size() as size_t) }
}
}

We need any of:

  • Replacing the K type in the BTreeMap with ((u64, u64), u8) or similar.
  • An alignment-aware allocation function provided by illumos.
  • Attempt to cook up our own memalign-like function in our GlobalAlloc implementation.

The first seems the most expedient, and the second is probably preferable in the long run (equally so when it comes to requesting cache-line aligned structs). The third is... unlikely and likely very tricky.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions