Noted while working on #462/#475 -- this is a tracking issue to understand this as its own problem. Today we are converting InnerFlowIds to flow_id_sdt_arg structs, which is moderately costly as it occurs many times per packet. This creates one or two (current, or before+after) stack-local variables which are referenced without issue.
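To make the cost concrete, here is a minimal sketch of the convert-then-pass pattern described above. The field layout of `InnerFlowId` and `FlowIdSdtArg`, and the `fake_probe` and `fire_with_copy` helpers, are all hypothetical stand-ins (IPv4 only), not the real OPTE definitions or the real probe macros.

```rust
use std::net::Ipv4Addr;

/// Hypothetical stand-in for the engine's flow key (IPv4 only for brevity).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct InnerFlowId {
    pub proto: u8,
    pub src_ip: Ipv4Addr,
    pub dst_ip: Ipv4Addr,
    pub src_port: u16,
    pub dst_port: u16,
}

/// Hypothetical C-layout mirror of the flow key for a D script to decode.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct FlowIdSdtArg {
    pub proto: u8,
    pub src_ip4: u32,
    pub dst_ip4: u32,
    pub src_port: u16,
    pub dst_port: u16,
}

impl From<&InnerFlowId> for FlowIdSdtArg {
    fn from(f: &InnerFlowId) -> Self {
        Self {
            proto: f.proto,
            src_ip4: u32::from(f.src_ip),
            dst_ip4: u32::from(f.dst_ip),
            src_port: f.src_port,
            dst_port: f.dst_port,
        }
    }
}

/// Stand-in for an SDT probe site: it takes an untyped address, as the real
/// probes take uintptr_t, and reads the struct back out of it.
///
/// # Safety
/// `arg` must be the address of a live, initialized `FlowIdSdtArg`.
pub unsafe fn fake_probe(arg: usize) -> FlowIdSdtArg {
    unsafe { *(arg as *const FlowIdSdtArg) }
}

/// Convert, then hand the probe the address of the stack-local copy.
pub fn fire_with_copy(flow: &InnerFlowId) -> FlowIdSdtArg {
    let arg = FlowIdSdtArg::from(flow); // conversion paid on every call
    // SAFETY: `arg` lives on this stack frame for the whole call.
    unsafe { fake_probe(&arg as *const FlowIdSdtArg as usize) }
}
```

In this sketch the conversion runs on every call to `fire_with_copy`, whether or not anything is listening on the probe, which is the per-packet cost this issue is about.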
Removing this and passing in either a *const InnerFlowId or converting to a uintptr_t (as we do with our other args) leads to known panics in two locations so far. From some dumps I've captured:
Periodic flow expiry
panic[cpu14]/thread=fffffe009270ac20:
BAD TRAP: type=e (#pf Page fault) rp=fffffe009270a090 addr=18 occurred in module "xde" due to a NULL pointer dereference
sched:
#pf Page fault
Bad kernel fault at addr=0x18
pid=0, pc=0xfffffffff44db7be, sp=0xfffffe009270a180, eflags=0x10246
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 3406f8<smap,smep,osxsav,xmme,fxsr,pge,mce,pae,pse,de>
cr2: 18
cr3: 1a800000
cr8: 0
rdi: fffffe69f79cacf0 rsi: 0 rdx: 110000
rcx: 0 r8: 2 r9: fffffe009270a278
rax: 0 rbx: fffffe69f79cacf0 rbp: fffffe009270a1d0
r10: 8a2ba1a7901 r11: 6 r12: 4
r13: 1 r14: 4 r15: 0
fsb: fffffc7fef2d2a40 gsb: fffffe69deb02000 ds: 0
es: 0 fs: 0 gs: 0
trp: e err: 0 rip: fffffffff44db7be
cs: 30 rfl: 10246 rsp: fffffe009270a180
ss: 38
fffffe0092709fa0 unix:die+c0 ()
fffffe009270a080 unix:trap+999 ()
fffffe009270a090 unix:cmntrap+e9 ()
fffffe009270a1d0 xde:_ZN4core3fmt9Formatter12pad_integral17hdace542c09befd8aE+18e ()
fffffe009270a260 xde:_ZN4core3fmt3num53_$LT$impl$u20$core..fmt..LowerHex$u20$for$u20$u16$GT$3fmt17hba211b57c0906999E+7a ()
fffffe009270a2f0 xde:_ZN4core3fmt5write17h5e760e4f19caf97dE+1b3 ()
fffffe009270a3b0 xde:_ZN67_$LT$smoltcp..wire..ipv6..Address$u20$as$u20$core..fmt..Display$GT$3fmt17h6e9c915dfd6e131fE+195 ()
fffffe009270a440 xde:_ZN4core3fmt5write17h5e760e4f19caf97dE+1b3 ()
fffffe009270a4a0 xde:_ZN44_$LT$$RF$T$u20$as$u20$core..fmt..Display$GT$3fmt17h9128c724cf7b20c1E+61 ()
fffffe009270a530 xde:_ZN4core3fmt5write17h5e760e4f19caf97dE+1b3 ()
fffffe009270a590 xde:_ZN59_$LT$opte_api..ip..IpAddr$u20$as$u20$core..fmt..Display$GT$3fmt17haa9991ea1a942307E+74 ()
fffffe009270a620 xde:_ZN4core3fmt5write17h5e760e4f19caf97dE+1b3 ()
fffffe009270a6c0 xde:_ZN69_$LT$opte..engine..nat..OutboundNat$u20$as$u20$core..fmt..Display$GT$3fmt17hdcf3ff64a1fb60dcE+5b ()
fffffe009270a800 xde:_ZN121_$LT$alloc..collections..btree..map..ExtractIf$LT$K$C$V$C$F$C$A$GT$$u20$as$u20$core..iter..traits..iterator..Iterator$GT$4next17h6364d8716d0dcc4eE+1fc ()
fffffe009270a980 xde:_ZN4opte6engine10flow_table18FlowTable$LT$S$GT$12expire_flows17hb96f1e0513b25d84E+17f ()
fffffe009270ab10 xde:expire_periodic+91 ()
fffffe009270ab50 genunix:periodic_execute+f5 ()
fffffe009270ac00 genunix:taskq_thread+2a6 ()
fffffe009270ac10 unix:thread_start+b ()
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
Update TCP state
panic[cpu3]/thread=fffffe00934eac20:
BAD TRAP: type=e (#pf Page fault) rp=fffffe00934e9aa0 addr=c occurred in module "xde" due to a NULL pointer dereference
sched:
#pf Page fault
Bad kernel fault at addr=0xc
pid=0, pc=0xfffffffff44ee31b, sp=0xfffffe00934e9b90, eflags=0x10286
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 3406f8<smap,smep,osxsav,xmme,fxsr,pge,mce,pae,pse,de>
cr2: c
cr3: 1a800000
cr8: 0
rdi: 2 rsi: fffffe6ba8fb8228 rdx: fffffe00934ea0e8
rcx: 17 r8: a000000 r9: 4
rax: fffffe69e320cac0 rbx: 2 rbp: fffffe00934e9d50
r10: ff r11: 5 r12: fffffe69e8260e10
r13: fffffe00934ea0e8 r14: 17 r15: 2
fsb: fffffc7fef2d2a40 gsb: fffffe69de4bf000 ds: 0
es: 0 fs: 0 gs: 0
trp: e err: 0 rip: fffffffff44ee31b
cs: 30 rfl: 10286 rsp: fffffe00934e9b90
ss: 38
fffffe00934e99b0 unix:die+c0 ()
fffffe00934e9a90 unix:trap+999 ()
fffffe00934e9aa0 unix:cmntrap+e9 ()
fffffe00934e9d50 xde:_ZN4opte6engine4port13Port$LT$N$GT$16update_tcp_entry17h116fa2f846532633E+3b ()
fffffe00934e9f20 xde:_ZN4opte6engine4port13Port$LT$N$GT$16update_tcp_entry17h116fa2f846532633E+25c ()
fffffe00934ea2e0 xde:_ZN4opte6engine4port13Port$LT$N$GT$7process17h132132694f056ce9E+3bc ()
fffffe00934ea950 xde:xde_rx+43c ()
fffffe00934ea9a0 mac:mac_promisc_dispatch_one+60 ()
fffffe00934eaa20 mac:mac_promisc_dispatch+83 ()
fffffe00934eaa80 mac:mac_rx_common+47 ()
fffffe00934eaae0 mac:mac_rx+c6 ()
fffffe00934eab20 mac:mac_rx_ring+2b ()
fffffe00934eab60 igb:igb_intr_rx_work+5c ()
fffffe00934eab80 igb:igb_intr_rx+15 ()
fffffe00934eabd0 apix:apix_dispatch_by_vector+8c ()
fffffe00934eac00 apix:apix_dispatch_lowlevel+29 ()
fffffe0093499a40 unix:switch_sp_and_call+15 ()
fffffe0093499aa0 apix:apix_do_interrupt+f3 ()
fffffe0093499ab0 unix:cmnint+c3 ()
fffffe0093499ba0 unix:i86_mwait+12 ()
fffffe0093499bd0 unix:cpu_idle_mwait+14b ()
fffffe0093499be0 unix:cpu_idle_adaptive+19 ()
fffffe0093499c00 unix:idle+a8 ()
fffffe0093499c10 unix:thread_start+b ()
Both occur some distance from the actual SDT: a format statement on the supposedly-valid &InnerFlowId, and a match on a &InnerFlowId, respectively, before the probe occurs. Removing the probe causes these callsites to behave/compile correctly. Another SDT, layer-process-return, shows a different variation:
NAME DIR EPOCH FLOW BEFORE FLOW AFTER LEN RESULT
opte0 OUT 3 UDP,10.0.0.2:38231,10.0.0.1:10000 ,0.254.255.255:29,16.0.0.0:1980 133 Modified
0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef
0: 11 a1 02 00 0a 00 00 02 0a 00 00 01 00 00 00 00 ................
10: 00 00 00 00 02 00 00 03 02 11 00 0e cb 1f a1 fb ................
20: ff ff ff ff 57 95 10 27 ....W..'
0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef
0: 90 69 1b 93 00 fe ff ff 10 00 00 00 00 00 00 00 .i..............
10: c0 db 43 42 6a fe ff ff f4 33 a1 fb ff ff ff ff ..CBj....3......
20: 57 95 10 27 1d 00 bc 07 W..'....
Flow_before is obviously valid, while flow_after appears to point elsewhere. The only obvious difference I'm aware of is that flow_after is obtained directly via Packet::flow, while flow_before is obtained and explicitly copied out before pkt is modified.
EDIT: The last case is caused by our uintptr_t untyped args making it easy to pass the wrong thing. The actual kernel panics still stand even with accurate types, however.
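One way to guard against the wrong-argument mistake is to keep the pointee type attached to the address until the last moment. The sketch below is only an illustration of that idea: `ProbeArg`, `FlowId`, `Packet`, and this `layer_process_return` function are hypothetical names, not the existing probe interface, and it does nothing about the kernel panics above.

```rust
use std::marker::PhantomData;

/// Hypothetical flow key, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct FlowId {
    pub src_port: u16,
    pub dst_port: u16,
}

/// Hypothetical packet type that owns its current flow key.
pub struct Packet {
    pub len: usize,
    pub flow: FlowId,
}

/// An address that still remembers what it points to, and for how long
/// that target is borrowed.
pub struct ProbeArg<'a, T> {
    addr: usize,
    _target: PhantomData<&'a T>,
}

impl<'a, T> ProbeArg<'a, T> {
    /// The only way to build a ProbeArg is from a live reference.
    pub fn new(target: &'a T) -> Self {
        Self { addr: target as *const T as usize, _target: PhantomData }
    }

    /// The untyped value that would finally be handed to the probe.
    pub fn addr(&self) -> usize {
        self.addr
    }
}

/// Stand-in for a typed probe site: both arguments must point at a FlowId.
pub fn layer_process_return(
    before: ProbeArg<'_, FlowId>,
    after: ProbeArg<'_, FlowId>,
) -> (FlowId, FlowId) {
    // SAFETY: each ProbeArg was built from a reference whose borrow is
    // still held for the duration of this call.
    unsafe {
        (
            *(before.addr() as *const FlowId),
            *(after.addr() as *const FlowId),
        )
    }
}
```

With this signature, `ProbeArg::new(&pkt.flow)` is accepted, while `ProbeArg::new(&pkt)` produces a `ProbeArg<Packet>` and is rejected at compile time instead of showing up as garbage in the probe output.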