-
Notifications
You must be signed in to change notification settings - Fork 10
Description
During integrated testing of the IPv6 PR stack, @rcgoodfellow hit the following kernel panic when installing V2B mappings for use by OPTE:
root@oxz_switch:~# WARNING: panicked at xde/src/lib.rs:76:13:
kernel alloc greater than 8-byte alignment
panic[cpu0]/thread=fffffe0c319f0c40: panicked at xde/src/lib.rs:76:13:
kernel alloc greater than 8-byte alignment
fffffe0014eb0c30 xde:_RNvCsj0qrnF3AvHz_7___rustc17rust_begin_unwind+16d ()
fffffe0014eb0c60 xde:opte_panic_debug+3455ea33 ()
fffffe0014eb0e30 xde:_ZN9oxide_vpc6engine7overlay13Virt2Boundary17update_poptrie_v617hccd92ad035b12e59E+1694 ()
fffffe0014eb0ff0 xde:set_v2b_hdlr+a87 ()
fffffe0014eb1c90 xde:xde_ioc_opte_cmd+177 ()
fffffe0014eb1cc0 xde:xde_ioctl+77 ()
fffffe0014eb1d00 genunix:cdev_ioctl+3f ()
fffffe0014eb1d50 specfs:spec_ioctl+55 ()
fffffe0014eb1de0 genunix:fop_ioctl+40 ()
fffffe0014eb1f00 genunix:ioctl+144 ()
fffffe0014eb1f10 unix:brand_sys_syscall+1fe ()
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
dumping: 0:09 100% done
100% done: 740227 pages dumped, dump succeeded
rebooting...
From what we can tell, update_poptrie_v6 constructs an Ipv6RoutingTable<T>(pub BTreeMap<(u128, u8), T>), and here T = TunnelEndpoint. TunnelEndpoint is byte-aligned, so we can rule that out. The key type (u128, u8) has an alignment of 16B because of that u128 member -- which causes us to blow up on one of the internal allocations of the BTreeMap. This is because kmem_alloc can only guarantee us 8B alignment, and we're honouring that:
Lines 54 to 84 in 46683e1
| // On alignment, `kmem_alloc(9F)` has this of offer: | |
| // | |
| // > The allocated memory is at least double-word aligned, so it can | |
| // > hold any C data structure. No greater alignment can be assumed. | |
| // | |
| // I really hate when documentation uses "word", because that seems to | |
| // mean different things in different contexts. In this case I have to | |
| // assume it means native integer size, or 32-bit in the case our our | |
| // AMD64 kernel. This implis all allocations are at least 8-byte | |
| // aligned, but could be more. However, the last sentence in the quote | |
| // above says that you cannot assume alignment is ever greater than 8 | |
| // bytes. Therefore, it seems best to assume it's 8 bytes. For the | |
| // purposes of implementing GlobalAlloc, I believe this means that we | |
| // should return NULL for any Layout which requests more than 8-byte | |
| // alignment (or probably just panic since I never expect this). | |
| // Furthermore, things that can use smaller alignment will just have | |
| // to live with the larger alignment. | |
| struct KmemAlloc; | |
| unsafe impl GlobalAlloc for KmemAlloc { | |
| unsafe fn alloc(&self, layout: Layout) -> *mut u8 { | |
| if layout.align() > 8 { | |
| panic!("kernel alloc greater than 8-byte alignment"); | |
| } | |
| unsafe { kmem_alloc(layout.size(), KM_SLEEP) as *mut u8 } | |
| } | |
| unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) { | |
| unsafe { kmem_free(ptr as *mut c_void, layout.size() as size_t) } | |
| } | |
| } |
We need any of:
- Replacing the
Ktype in theBTreeMapwith((u64, u64), u8)or similar. - An alignment-aware allocation function provided by illumos.
- Attempt to cook up our own
memalign-like function in ourGlobalAllocimplementation.
The first seems the most expedient, and the second is probably preferable in the long run (equally so when it comes to requesting cache-line aligned structs). The third is... unlikely and likely very tricky.