Conversation

@andreiw commented Oct 26, 2015

These can only be generated if the thread is not in HV
mode, even with VPM.

5.7.2 of PowerISA_v2.07: "If the thread is not in hypervisor state,
and either address translation is enabled and VPM1=1, or address
translation is disabled and VPM0=1, conditions that would have caused
a Data Storage or an Instruction Storage interrupt if the affected
memory were not virtualized instead cause a Hypervisor Data Storage
or a Hypervisor Instruction Storage interrupt respectively."

6.5.16: "A Hypervisor Data Storage interrupt occurs when no higher
priority exception exists, the thread is not in hypervisor state, and
..."

Signed-off-by: Andrei Warkentin [email protected]

ozbenh pushed a commit that referenced this pull request Nov 6, 2015
qpci_msix_pending() writes on pba region, causing qemu to SEGV:

  Program received signal SIGSEGV, Segmentation fault.
  [Switching to Thread 0x7ffff7fba8c0 (LWP 25882)]
  0x0000000000000000 in ?? ()
  (gdb) bt
  #0  0x0000000000000000 in  ()
  #1  0x00005555556556c5 in memory_region_oldmmio_write_accessor (mr=0x5555579f3f80, addr=0, value=0x7fffffffbf68, size=4, shift=0, mask=4294967295, attrs=...) at /home/elmarco/src/qemu/memory.c:434
  #2  0x00005555556558e1 in access_with_adjusted_size (addr=0, value=0x7fffffffbf68, size=4, access_size_min=1, access_size_max=4, access=0x55555565563e <memory_region_oldmmio_write_accessor>, mr=0x5555579f3f80, attrs=...) at /home/elmarco/src/qemu/memory.c:506
  qemu#3  0x00005555556581eb in memory_region_dispatch_write (mr=0x5555579f3f80, addr=0, data=0, size=4, attrs=...) at /home/elmarco/src/qemu/memory.c:1176
  qemu#4  0x000055555560b6f9 in address_space_rw (as=0x555555eff4e0 <address_space_memory>, addr=3759147008, attrs=..., buf=0x7fffffffc1b0 "", len=4, is_write=true) at /home/elmarco/src/qemu/exec.c:2439
  qemu#5  0x000055555560baa2 in cpu_physical_memory_rw (addr=3759147008, buf=0x7fffffffc1b0 "", len=4, is_write=1) at /home/elmarco/src/qemu/exec.c:2534
  qemu#6  0x000055555564c005 in cpu_physical_memory_write (addr=3759147008, buf=0x7fffffffc1b0, len=4) at /home/elmarco/src/qemu/include/exec/cpu-common.h:80
  qemu#7  0x000055555564cd9c in qtest_process_command (chr=0x55555642b890, words=0x5555578de4b0) at /home/elmarco/src/qemu/qtest.c:378
  qemu#8  0x000055555564db77 in qtest_process_inbuf (chr=0x55555642b890, inbuf=0x55555641b340) at /home/elmarco/src/qemu/qtest.c:569
  qemu#9  0x000055555564dc07 in qtest_read (opaque=0x55555642b890, buf=0x7fffffffc2e0 "writel 0xe0100800 0x0\n", size=22) at /home/elmarco/src/qemu/qtest.c:581
  qemu#10 0x000055555574ce3e in qemu_chr_be_write (s=0x55555642b890, buf=0x7fffffffc2e0 "writel 0xe0100800 0x0\n", len=22) at qemu-char.c:306
  qemu#11 0x0000555555751263 in tcp_chr_read (chan=0x55555642bcf0, cond=G_IO_IN, opaque=0x55555642b890) at qemu-char.c:2876
  qemu#12 0x00007ffff64c9a8a in g_main_context_dispatch (context=0x55555641c400) at gmain.c:3122

(without this patch, this can be reproduced with the ivshmem qtest)

Implement an empty mmio write to avoid the crash.
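
A minimal sketch of such a no-op write handler, assuming the standard
MemoryRegionOps interface (names here are illustrative, not necessarily
those of the actual patch):

    /* Illustrative read handler; the real one returns the MSI-X pending bits. */
    static uint64_t pba_read(void *opaque, hwaddr addr, unsigned size)
    {
        return 0;
    }

    /* The fix: an explicit no-op write, so dispatch never jumps through a
     * NULL callback when a guest (or a qtest) writes to the PBA region. */
    static void pba_write(void *opaque, hwaddr addr, uint64_t data, unsigned size)
    {
    }

    static const MemoryRegionOps pba_ops = {
        .read = pba_read,
        .write = pba_write,
        .endianness = DEVICE_LITTLE_ENDIAN,
    };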

Signed-off-by: Marc-André Lureau <[email protected]>
Reviewed-by: Paolo Bonzini <[email protected]>
ozbenh pushed a commit that referenced this pull request Nov 6, 2015
A new gdb command is added:

  qemu handlers

     That dumps an AioContext list (by default qemu_aio_context)
     possibly including a backtrace for cases it knows about
     (with the verbose option).  Intended to help find why something
     is hanging waiting for IO.

  Use 'qemu handlers --verbose iohandler_ctx'  to find out why
your incoming migration is stuck.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Message-id: [email protected]

V2:
  Merge into one command with optional handlers arg, and only do
    backtrace in verbose mode

 (gdb) qemu handlers
 ----
 {pfd = {fd = 6, events = 25, revents = 0}, io_read = 0x55869656ffd0
 <event_notifier_dummy_cb>, io_write = 0x0, deleted = 0, opaque =
 0x558698c4ce08, node = {le_next = 0x0, le_prev = 0x558698c4cdc0}}

 (gdb) qemu handlers iohandler_ctx
 ----
 {pfd = {fd = 9, events = 25, revents = 0}, io_read = 0x558696581380
 <fd_coroutine_enter>, io_write = 0x0, deleted = 0, opaque =
 0x558698dc99d0, node = {le_next = 0x558698c4cca0, le_prev =
 0x558698c4c1d0}}
 ----
 {pfd = {fd = 4, events = 25, revents = 0}, io_read = 0x55869657b330
 <sigfd_handler>, io_write = 0x0, deleted = 0, opaque = 0x4, node =
 {le_next = 0x558698c4c260, le_prev = 0x558699f72508}}
 ----
 {pfd = {fd = 5, events = 25, revents = 0}, io_read = 0x55869656ffd0
 <event_notifier_dummy_cb>, io_write = 0x0, deleted = 0, opaque =
 0x558698c4c218, node = {le_next = 0x0, le_prev = 0x558698c4ccc8}}
 ----
 (gdb) qemu handlers --verbose iohandler_ctx
 ----
 {pfd = {fd = 9, events = 25, revents = 0}, io_read = 0x558696581380
 <fd_coroutine_enter>, io_write = 0x0, deleted = 0, opaque =
 0x558698dc99d0, node = {le_next = 0x558698c4cca0, le_prev =
 0x558698c4c1d0}}
 #0  0x0000558696581820 in qemu_coroutine_switch
 (from_=from_@entry=0x558698cb3cf0, to_=to_@entry=0x7f421c37eac8,
 action=action@entry=COROUTINE_YIELD) at
 /home/dgilbert/git/qemu/coroutine-ucontext.c:177
 #1  0x0000558696580c00 in qemu_coroutine_yield () at
 /home/dgilbert/git/qemu/qemu-coroutine.c:145
 #2  0x00005586965814f5 in yield_until_fd_readable (fd=9) at
 /home/dgilbert/git/qemu/qemu-coroutine-io.c:90
 qemu#3  0x0000558696523937 in socket_get_buffer (opaque=0x55869a3dc620,
 buf=0x558698c505a0 "", pos=<optimized out>, size=32768) at
 /home/dgilbert/git/qemu/migration/qemu-file-unix.c:101
 qemu#4  0x0000558696521fac in qemu_fill_buffer (f=0x558698c50570) at
 /home/dgilbert/git/qemu/migration/qemu-file.c:227
 qemu#5  0x0000558696522989 in qemu_peek_byte (f=0x558698c50570, offset=0)
     at /home/dgilbert/git/qemu/migration/qemu-file.c:507
 qemu#6  0x0000558696522bf4 in qemu_get_be32 (f=0x558698c50570) at
 /home/dgilbert/git/qemu/migration/qemu-file.c:520
 qemu#7  0x0000558696522bf4 in qemu_get_be32 (f=f@entry=0x558698c50570)
     at /home/dgilbert/git/qemu/migration/qemu-file.c:604
 qemu#8  0x0000558696347e5c in qemu_loadvm_state (f=f@entry=0x558698c50570)
     at /home/dgilbert/git/qemu/migration/savevm.c:1821
 qemu#9  0x000055869651de8c in process_incoming_migration_co
 (opaque=0x558698c50570)
     at /home/dgilbert/git/qemu/migration/migration.c:336
 qemu#10 0x000055869658188a in coroutine_trampoline (i0=<optimized out>,
 i1=<optimized out>)
     at /home/dgilbert/git/qemu/coroutine-ucontext.c:80
 qemu#11 0x00007f420f05df10 in __start_context () at /lib64/libc.so.6
 qemu#12 0x00007ffc40815f50 in  ()
 qemu#13 0x0000000000000000 in  ()

  ----
Signed-off-by: Stefan Hajnoczi <[email protected]>
@ozbenh ozbenh force-pushed the powernv branch 3 times, most recently from 554592d to de26bb4 on November 11, 2015 00:24
@ozbenh ozbenh force-pushed the powernv branch 4 times, most recently from 7643b1c to 7ecd38c on November 24, 2015 02:06
ozbenh and others added 19 commits May 12, 2016 11:32
We don't use the resulting accessors and this gets in the way of
the split I/D TLB work.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
We rework the way the MMU indices are calculated, providing separate
indices for I and D side based on MSR:IR and MSR:DR respectively,
and thus no longer need to flush the TLB on context changes. This also
adds correct support for HV as a separate address space.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
On ppc64 especially, we flush the TLB on any slbie or tlbie instruction.

However, those instructions often come in bursts of 3 or more (a context
switch will favor a series of slbie's over an slbia, for example, if the
SLB has fewer than a certain number of entries in it, and tlbie's can
happen in a series; with PAPR, H_BULK_REMOVE can remove up to 4 entries
at a time).

Doing a tlb_flush() each time is a waste of time. We end up doing a memset
of the whole TLB, reloading it for the next instruction, memset'ing again,
etc...

Those instructions don't have to take effect immediately. For slbie, they
can wait for the next context synchronizing event. For tlbie, the next
tlbsync.

This implements batching by keeping a flag that indicates that we have a
TLB in need of flushing. We check it on interrupts, rfi's, isync's and
tlbsync and flush the TLB if needed.
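
A rough sketch of the batching idea (field and helper names are
illustrative; the actual patch may differ):

    /* slbie/tlbie no longer flush anything: they only mark the TLB stale. */
    static inline void ppc_tlb_mark_dirty(CPUPPCState *env)
    {
        env->tlb_need_flush = true;
    }

    /* Called on interrupts, rfi's, isync's and tlbsync. */
    static inline void check_tlb_flush(CPUPPCState *env)
    {
        if (env->tlb_need_flush) {
            env->tlb_need_flush = false;
            tlb_flush(CPU(ppc_env_get_cpu(env)), 1);
        }
    }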

This reduces the number of tlb_flush() calls on a boot to the Ubuntu
installer's first dialog screen from roughly 360K down to 36K.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
[clg: added a 'CPUPPCState *' variable in h_remove() and
      h_bulk_remove() ]
Signed-off-by: Cédric Le Goater <[email protected]>
We don't give them a KVM reg number yet as no current KVM version
supports HV mode.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
[clg: SPRs AMOR,DAWR,DARWX were already included in commit f401dd3]
Signed-off-by: Cédric Le Goater <[email protected]>
This helper is only used by the various instructions that can alter
MSR, not by interrupts. Add a comment to that effect to the interrupt
code as well, in case somebody wants to change this.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Reviewed-by: David Gibson <[email protected]>
This reworks emulation of the various "rfi" variants. I removed
some masking bits that I couldn't make sense of; the only bit that
I am aware we should mask here is POW, and the CPU's MSR mask should
take care of the rest.

This also fixes some problems when running 32-bit userspace under
a 64-bit kernel.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Reviewed-by: David Gibson <[email protected]>
We use an env. flag which is set to the initial value of MSR_HVB in
the msr_mask. We also adjust the POWER8 mask to set SHV.

Also use this to adjust ctx.hv so that it is *set* when the processor
doesn't have an HV mode (970 with Apple mode for example), thus enabling
hypervisor instructions/SPRs.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Not that anything remotely recent supports tlbia but ...

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Otherwise it will trip on the forms used in recent architecture versions.

Ideally, we should have different handlers for different architecture
levels but our current implementation of TLB flushing is dumb enough
that this will do for now.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Signed-off-by: Michael Neuling <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Otherwise tight loops at smt_low, which OPAL does for example,
eat so much CPU that we can't boot a kernel anymore. With that,
I can boot 8 CPUs just fine with powernv.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Reviewed-by: David Gibson <[email protected]>
This will enable decoding of hrfid.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Also use it to clamp the max SMT mode and ensure that the cpu_dt_id
values are offset by that value in order to preserve consistency with the
HW implementations.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Reviewed-by: David Gibson <[email protected]>
And move the code adjusting the MSR mask and calling kvmppc_set_papr()
to it. This allows us to add a few more things such as disabling setting
of MSR:HV and appropriate LPCR bits which will be used when fixing
the exception model.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Properly implement LPES0/1 handling for HV vs. !HV mode and fix AIL
implementation.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
We were initializing unused ones and missing some.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Under some circumstances, we need to direct ISI and DSI interrupts
at the hypervisor, turning them into HISI/HDSI, and using different
SPRs (HDSISR and HDAR) depending on the combination of MSR_DR and
the corresponding VPM bits in LPCR.

This moves part of the code into helpers that are fixed to select
the right exception type and registers. On pre-P7 processors, LPCR
is 0 which provides the old behaviour of directing the interrupts
at the supervisor.
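
The core of the routing decision could be sketched like this (constants
follow the commit message; the helper name is made up):

    /* Route a data storage fault either to the OS (DSI) or, when the
     * thread is not in hypervisor state and the relevant VPM bit is set
     * for the current translation mode, to the hypervisor (HDSI). */
    static int dsi_vector(CPUPPCState *env, bool relocation_on)
    {
        bool vpm = relocation_on ? !!(env->spr[SPR_LPCR] & LPCR_VPM1)
                                 : !!(env->spr[SPR_LPCR] & LPCR_VPM0);

        if (vpm && !msr_hv) {
            return POWERPC_EXCP_HDSI;   /* reported via HDSISR/HDAR */
        }
        return POWERPC_EXCP_DSI;        /* reported via DSISR/DAR */
    }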

Thanks to Andrei Warkentin for finding a bug when HV=1.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Recent server processors use the Hypervisor Emulation Assistance
interrupt for illegal instructions and *some* types of SPR accesses.

Also, the code was always generating invalid-instruction exceptions even
for privilege violations, due to setting the wrong flags.

Finally, the checking for PR/HV was open coded everywhere.

This reworks it all, using little helper macros for checking, and
adding the HV interrupt (which gets converted back to program check
in the slow path of excp_helper.c on CPUs that don't want it).
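
The helper macros could look roughly like this (a sketch; the exact
spelling in the patch may differ):

    /* Centralized privilege checks for the translator, replacing the
     * open-coded PR/HV tests at each privileged/hypervisor instruction. */
    #define CHK_SV                                                    \
        do {                                                          \
            if (unlikely(ctx->pr)) {                                  \
                gen_priv_exception(ctx, POWERPC_EXCP_PRIV_OPC);       \
                return;                                               \
            }                                                         \
        } while (0)

    #define CHK_HV                                                    \
        do {                                                          \
            if (unlikely(ctx->pr || !ctx->hv)) {                      \
                gen_priv_exception(ctx, POWERPC_EXCP_PRIV_OPC);       \
                return;                                               \
            }                                                         \
        } while (0)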

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Those instructions are only available in hypervisor real mode and
allow cache-inhibited guarded access to devices in that mode.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
ozbenh pushed a commit that referenced this pull request Aug 7, 2016
The vnc_server_info_get will allocate the VncServerInfo
struct and then call vnc_init_basic_info_from_server_addr
to populate the basic fields. If this returns an error
though, the qapi_free_VncServerInfo call will then crash
because the VncServerInfo struct instance was not properly
NULL-initialized and thus contains random stack garbage.

 #0  0x00007f1987c8e6f5 in raise () at /lib64/libc.so.6
 #1  0x00007f1987c902fa in abort () at /lib64/libc.so.6
 #2  0x00007f1987ccf600 in __libc_message () at /lib64/libc.so.6
 qemu#3  0x00007f1987cd7d4a in _int_free () at /lib64/libc.so.6
 qemu#4  0x00007f1987cdb2ac in free () at /lib64/libc.so.6
 qemu#5  0x00007f198b654f6e in g_free () at /lib64/libglib-2.0.so.0
 qemu#6  0x0000559193cdcf54 in visit_type_str (v=v@entry=
     0x5591972f14b0, name=name@entry=0x559193de1e29 "host", obj=obj@entry=0x5591961dbfa0, errp=errp@entry=0x7fffd7899d80)
     at qapi/qapi-visit-core.c:255
 qemu#7  0x0000559193cca8f3 in visit_type_VncBasicInfo_members (v=v@entry=
     0x5591972f14b0, obj=obj@entry=0x5591961dbfa0, errp=errp@entry=0x7fffd7899dc0) at qapi-visit.c:12307
 qemu#8  0x0000559193ccb523 in visit_type_VncServerInfo_members (v=v@entry=
     0x5591972f14b0, obj=0x5591961dbfa0, errp=errp@entry=0x7fffd7899e00) at qapi-visit.c:12632
 qemu#9  0x0000559193ccb60b in visit_type_VncServerInfo (v=v@entry=
     0x5591972f14b0, name=name@entry=0x0, obj=obj@entry=0x7fffd7899e48, errp=errp@entry=0x0) at qapi-visit.c:12658
 qemu#10 0x0000559193cb53d8 in qapi_free_VncServerInfo (obj=<optimized out>) at qapi-types.c:3970
 qemu#11 0x0000559193c1e6ba in vnc_server_info_get (vd=0x7f1951498010) at ui/vnc.c:233
 qemu#12 0x0000559193c24275 in vnc_connect (vs=0x559197b2f200, vs=0x559197b2f200, event=QAPI_EVENT_VNC_CONNECTED) at ui/vnc.c:284
 qemu#13 0x0000559193c24275 in vnc_connect (vd=vd@entry=0x7f1951498010, sioc=sioc@entry=0x559196bf9c00, skipauth=skipauth@entry=tru e, websocket=websocket@entry=false) at ui/vnc.c:3039
 qemu#14 0x0000559193c25806 in vnc_display_add_client (id=<optimized out>, csock=<optimized out>, skipauth=<optimized out>)
     at ui/vnc.c:3877
 qemu#15 0x0000559193a90c28 in qmp_marshal_add_client (args=<optimized out>, ret=<optimized out>, errp=0x7fffd7899f90)
     at qmp-marshal.c:105
 qemu#16 0x000055919399c2b7 in handle_qmp_command (parser=<optimized out>, tokens=<optimized out>)
     at /home/berrange/src/virt/qemu/monitor.c:3971
 qemu#17 0x0000559193ce3307 in json_message_process_token (lexer=0x559194ab0838, input=0x559194a6d940, type=JSON_RCURLY, x=111, y=1 2) at qobject/json-streamer.c:105
 qemu#18 0x0000559193cfa90d in json_lexer_feed_char (lexer=lexer@entry=0x559194ab0838, ch=125 '}', flush=flush@entry=false)
     at qobject/json-lexer.c:319
 qemu#19 0x0000559193cfaa1e in json_lexer_feed (lexer=0x559194ab0838, buffer=<optimized out>, size=<optimized out>)
     at qobject/json-lexer.c:369
 qemu#20 0x0000559193ce33c9 in json_message_parser_feed (parser=<optimized out>, buffer=<optimized out>, size=<optimized out>)
     at qobject/json-streamer.c:124
 qemu#21 0x000055919399a85b in monitor_qmp_read (opaque=<optimized out>, buf=<optimized out>, size=<optimized out>)
     at /home/berrange/src/virt/qemu/monitor.c:3987
 qemu#22 0x0000559193a87d00 in tcp_chr_read (chan=<optimized out>, cond=<optimized out>, opaque=0x559194a7d900)
     at qemu-char.c:2895
 qemu#23 0x00007f198b64f703 in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
 qemu#24 0x0000559193c484b3 in main_loop_wait () at main-loop.c:213
 qemu#25 0x0000559193c484b3 in main_loop_wait (timeout=<optimized out>) at main-loop.c:258
 qemu#26 0x0000559193c484b3 in main_loop_wait (nonblocking=<optimized out>) at main-loop.c:506
 qemu#27 0x0000559193964c55 in main () at vl.c:1908
 qemu#28 0x0000559193964c55 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4603

This was introduced in

  commit 98481bf
  Author: Eric Blake <[email protected]>
  Date:   Mon Oct 26 16:34:45 2015 -0600

    vnc: Hoist allocation of VncBasicInfo to callers

which added error reporting for vnc_init_basic_info_from_server_addr
but didn't change the g_malloc calls to g_malloc0.
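
A simplified illustration of why the zeroed allocation matters (not the
exact ui/vnc.c code; fill_in_basic_info stands in for the failing init
step):

    VncServerInfo *vnc_server_info_get_sketch(VncDisplay *vd)
    {
        /* g_malloc0 instead of g_malloc: all pointer members start NULL. */
        VncServerInfo *info = g_malloc0(sizeof(*info));

        if (fill_in_basic_info(vd, info) < 0) {     /* hypothetical helper */
            /* Safe: the generated free visitor skips NULL members instead
             * of freeing whatever stack garbage g_malloc left behind. */
            qapi_free_VncServerInfo(info);
            return NULL;
        }
        return info;
    }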

Signed-off-by: Daniel P. Berrange <[email protected]>
Message-id: [email protected]
Signed-off-by: Gerd Hoffmann <[email protected]>
ozbenh pushed a commit that referenced this pull request Nov 1, 2018
Spotted by ASAN:

=================================================================
==27907==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 4120 byte(s) in 1 object(s) allocated from:
    #0 0x7f913458ce50 in calloc (/lib64/libasan.so.5+0xeee50)
    #1 0x7f9133fd641d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5241d)
    #2 0x5561c6643c95 in qdict_crumple_test_recursive /home/elmarco/src/qq/tests/check-block-qdict.c:438
    qemu#3 0x7f9133ff7c49  (/lib64/libglib-2.0.so.0+0x73c49)

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <[email protected]>
[Screwed up in commit 2860b2b]
Reviewed-by: Markus Armbruster <[email protected]>
Signed-off-by: Markus Armbruster <[email protected]>
ozbenh pushed a commit that referenced this pull request Nov 1, 2018
Spotted by ASAN, during make check...

Direct leak of 40 byte(s) in 1 object(s) allocated from:
    #0 0x7f8e27262c48 in malloc (/lib64/libasan.so.5+0xeec48)
    #1 0x7f8e26a5f3c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5)
    #2 0x555ab67078a8 in qstring_from_str /home/elmarco/src/qq/qobject/qstring.c:67
    qemu#3 0x555ab67071e4 in qstring_new /home/elmarco/src/qq/qobject/qstring.c:24
    qemu#4 0x555ab6713fbf in qstring_from_escaped_str /home/elmarco/src/qq/qobject/json-parser.c:144
    qemu#5 0x555ab671738c in parse_literal /home/elmarco/src/qq/qobject/json-parser.c:506
    qemu#6 0x555ab67179c3 in parse_value /home/elmarco/src/qq/qobject/json-parser.c:569
    qemu#7 0x555ab6715123 in parse_pair /home/elmarco/src/qq/qobject/json-parser.c:306
    qemu#8 0x555ab6715483 in parse_object /home/elmarco/src/qq/qobject/json-parser.c:357
    qemu#9 0x555ab671798b in parse_value /home/elmarco/src/qq/qobject/json-parser.c:561
    qemu#10 0x555ab6717a6b in json_parser_parse_err /home/elmarco/src/qq/qobject/json-parser.c:592
    qemu#11 0x555ab4fd4dcf in handle_qmp_command /home/elmarco/src/qq/monitor.c:4257
    qemu#12 0x555ab6712c4d in json_message_process_token /home/elmarco/src/qq/qobject/json-streamer.c:105
    qemu#13 0x555ab67e01e2 in json_lexer_feed_char /home/elmarco/src/qq/qobject/json-lexer.c:323
    qemu#14 0x555ab67e0af6 in json_lexer_feed /home/elmarco/src/qq/qobject/json-lexer.c:373
    qemu#15 0x555ab6713010 in json_message_parser_feed /home/elmarco/src/qq/qobject/json-streamer.c:124
    qemu#16 0x555ab4fd58ec in monitor_qmp_read /home/elmarco/src/qq/monitor.c:4337
    qemu#17 0x555ab6559df2 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:175
    qemu#18 0x555ab6559e95 in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:187
    qemu#19 0x555ab6560127 in fd_chr_read /home/elmarco/src/qq/chardev/char-fd.c:66
    qemu#20 0x555ab65d9c73 in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84
    qemu#21 0x7f8e26a598ac in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x4c8ac)

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <[email protected]>
[Screwed up in commit b273145]
Cc: [email protected]
Reviewed-by: Markus Armbruster <[email protected]>
Signed-off-by: Markus Armbruster <[email protected]>
ozbenh pushed a commit that referenced this pull request Nov 1, 2018
If qio_channel_rdma_readv returns QIO_CHANNEL_ERR_BLOCK, the destination
qemu crashes.

The backtrace is:
(gdb) bt
    #0  0x0000000000000000 in ?? ()
    #1  0x00000000008db50e in qio_channel_set_aio_fd_handler (ioc=0x38111e0, ctx=0x3726080,
        io_read=0x8db841 <qio_channel_restart_read>, io_write=0x0, opaque=0x38111e0) at io/channel.c:
    #2  0x00000000008db952 in qio_channel_set_aio_fd_handlers (ioc=0x38111e0) at io/channel.c:438
    qemu#3  0x00000000008dbab4 in qio_channel_yield (ioc=0x38111e0, condition=G_IO_IN) at io/channel.c:47
    qemu#4  0x00000000007a870b in channel_get_buffer (opaque=0x38111e0, buf=0x440c038 "", pos=0, size=327
        at migration/qemu-file-channel.c:83
    qemu#5  0x00000000007a70f6 in qemu_fill_buffer (f=0x440c000) at migration/qemu-file.c:299
    qemu#6  0x00000000007a79d0 in qemu_peek_byte (f=0x440c000, offset=0) at migration/qemu-file.c:562
    qemu#7  0x00000000007a7a22 in qemu_get_byte (f=0x440c000) at migration/qemu-file.c:575
    qemu#8  0x00000000007a7c78 in qemu_get_be32 (f=0x440c000) at migration/qemu-file.c:655
    qemu#9  0x00000000007a0508 in qemu_loadvm_state (f=0x440c000) at migration/savevm.c:2126
    qemu#10 0x0000000000794141 in process_incoming_migration_co (opaque=0x0) at migration/migration.c:366
    qemu#11 0x000000000095c598 in coroutine_trampoline (i0=84033984, i1=0) at util/coroutine-ucontext.c:1
    qemu#12 0x00007f9c0db56d40 in ?? () from /lib64/libc.so.6
    qemu#13 0x00007f96fe858760 in ?? ()
    qemu#14 0x0000000000000000 in ?? ()

The RDMA QIOChannel does not implement io_set_aio_fd_handler, so
qio_channel_set_aio_fd_handler will access a NULL pointer.
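
A sketch of the missing hook (the fd plumbing and field names are
illustrative); it would be assigned to ioc_klass->io_set_aio_fd_handler
in the channel's class_init:

    static void qio_channel_rdma_set_aio_fd_handler(QIOChannel *ioc,
                                                    AioContext *ctx,
                                                    IOHandler *io_read,
                                                    IOHandler *io_write,
                                                    void *opaque)
    {
        QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);

        /* Register the RDMA completion-channel fd with the AioContext so
         * a yielded reader can be woken up again. */
        aio_set_fd_handler(ctx, rioc->rdma->comp_channel->fd, false,
                           io_read, io_write, NULL, opaque);
    }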

Signed-off-by: Lidong Chen <[email protected]>
Reviewed-by: Juan Quintela <[email protected]>
Signed-off-by: Juan Quintela <[email protected]>
ozbenh pushed a commit that referenced this pull request Nov 1, 2018
When qio_channel_read returns QIO_CHANNEL_ERR_BLOCK, the source qemu crashes.

The backtrace is:
    (gdb) bt
    #0  0x00007fb20aba91d7 in raise () from /lib64/libc.so.6
    #1  0x00007fb20abaa8c8 in abort () from /lib64/libc.so.6
    #2  0x00007fb20aba2146 in __assert_fail_base () from /lib64/libc.so.6
    qemu#3  0x00007fb20aba21f2 in __assert_fail () from /lib64/libc.so.6
    qemu#4  0x00000000008dba2d in qio_channel_yield (ioc=0x22f9e20, condition=G_IO_IN) at io/channel.c:460
    qemu#5  0x00000000007a870b in channel_get_buffer (opaque=0x22f9e20, buf=0x3d54038 "", pos=0, size=32768)
        at migration/qemu-file-channel.c:83
    qemu#6  0x00000000007a70f6 in qemu_fill_buffer (f=0x3d54000) at migration/qemu-file.c:299
    qemu#7  0x00000000007a79d0 in qemu_peek_byte (f=0x3d54000, offset=0) at migration/qemu-file.c:562
    qemu#8  0x00000000007a7a22 in qemu_get_byte (f=0x3d54000) at migration/qemu-file.c:575
    qemu#9  0x00000000007a7c46 in qemu_get_be16 (f=0x3d54000) at migration/qemu-file.c:647
    qemu#10 0x0000000000796db7 in source_return_path_thread (opaque=0x2242280) at migration/migration.c:1794
    qemu#11 0x00000000009428fa in qemu_thread_start (args=0x3e58420) at util/qemu-thread-posix.c:504
    qemu#12 0x00007fb20af3ddc5 in start_thread () from /lib64/libpthread.so.0
    qemu#13 0x00007fb20ac6b74d in clone () from /lib64/libc.so.6

This patch fixes it by invoking qio_channel_yield only when qemu_in_coroutine() is true.
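
The shape of the guard, in the spirit of channel_get_buffer (a sketch,
not the verbatim patch):

    ssize_t ret = qio_channel_readv(ioc, &iov, 1, NULL);
    if (ret == QIO_CHANNEL_ERR_BLOCK) {
        if (qemu_in_coroutine()) {
            qio_channel_yield(ioc, G_IO_IN);   /* coroutine context: yield */
        } else {
            qio_channel_wait(ioc, G_IO_IN);    /* plain thread: block */
        }
    }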

Signed-off-by: Lidong Chen <[email protected]>
Reviewed-by: Juan Quintela <[email protected]>
Signed-off-by: Juan Quintela <[email protected]>
ozbenh pushed a commit that referenced this pull request Nov 1, 2018
Because the RDMA QIOChannel does not implement a shutdown function,
if to_dst_file is set to an error, the return path thread will wait
forever, and the migration thread will wait for the return path
thread to exit.

the backtrace of return path thread is:

(gdb) bt
    #0  0x00007f372a76bb0f in ppoll () from /lib64/libc.so.6
    #1  0x000000000071dc24 in qemu_poll_ns (fds=0x7ef7091d0580, nfds=2, timeout=100000000)
        at qemu-timer.c:325
    #2  0x00000000006b2fba in qemu_rdma_wait_comp_channel (rdma=0xd424000)
        at migration/rdma.c:1501
    qemu#3  0x00000000006b3191 in qemu_rdma_block_for_wrid (rdma=0xd424000, wrid_requested=4000,
        byte_len=0x7ef7091d0640) at migration/rdma.c:1580
    qemu#4  0x00000000006b3638 in qemu_rdma_exchange_get_response (rdma=0xd424000,
        head=0x7ef7091d0720, expecting=3, idx=0) at migration/rdma.c:1726
    qemu#5  0x00000000006b3ad6 in qemu_rdma_exchange_recv (rdma=0xd424000, head=0x7ef7091d0720,
        expecting=3) at migration/rdma.c:1903
    qemu#6  0x00000000006b5d03 in qemu_rdma_get_buffer (opaque=0x6a57dc0, buf=0x5c80030 "", pos=8,
        size=32768) at migration/rdma.c:2714
    qemu#7  0x00000000006a9635 in qemu_fill_buffer (f=0x5c80000) at migration/qemu-file.c:232
    qemu#8  0x00000000006a9ecd in qemu_peek_byte (f=0x5c80000, offset=0)
        at migration/qemu-file.c:502
    qemu#9  0x00000000006a9f1f in qemu_get_byte (f=0x5c80000) at migration/qemu-file.c:515
    qemu#10 0x00000000006aa162 in qemu_get_be16 (f=0x5c80000) at migration/qemu-file.c:591
    qemu#11 0x00000000006a46d3 in source_return_path_thread (
        opaque=0xd826a0 <current_migration.37100>) at migration/migration.c:1331
    qemu#12 0x00007f372aa49e25 in start_thread () from /lib64/libpthread.so.0
    qemu#13 0x00007f372a77635d in clone () from /lib64/libc.so.6

the backtrace of migration thread is:

(gdb) bt
    #0  0x00007f372aa4af57 in pthread_join () from /lib64/libpthread.so.0
    #1  0x00000000007d5711 in qemu_thread_join (thread=0xd826f8 <current_migration.37100+88>)
        at util/qemu-thread-posix.c:504
    #2  0x00000000006a4bc5 in await_return_path_close_on_source (
        ms=0xd826a0 <current_migration.37100>) at migration/migration.c:1460
    qemu#3  0x00000000006a53e4 in migration_completion (s=0xd826a0 <current_migration.37100>,
        current_active_state=4, old_vm_running=0x7ef7089cf976, start_time=0x7ef7089cf980)
        at migration/migration.c:1695
    qemu#4  0x00000000006a5c54 in migration_thread (opaque=0xd826a0 <current_migration.37100>)
        at migration/migration.c:1837
    qemu#5  0x00007f372aa49e25 in start_thread () from /lib64/libpthread.so.0
    qemu#6  0x00007f372a77635d in clone () from /lib64/libc.so.6

Signed-off-by: Lidong Chen <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Juan Quintela <[email protected]>
Signed-off-by: Juan Quintela <[email protected]>
ozbenh pushed a commit that referenced this pull request Nov 1, 2018
tests/cdrom-test -p /x86_64/cdrom/boot/megasas

Produces the following ASAN leak.

==25700==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x7f06f8faac48 in malloc (/lib64/libasan.so.5+0xeec48)
    #1 0x7f06f87a73c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5)
    #2 0x55a729f17738 in pci_dma_sglist_init /home/elmarco/src/qq/include/hw/pci/pci.h:818
    qemu#3 0x55a729f2a706 in megasas_map_dcmd /home/elmarco/src/qq/hw/scsi/megasas.c:698
    qemu#4 0x55a729f39421 in megasas_handle_dcmd /home/elmarco/src/qq/hw/scsi/megasas.c:1574
    qemu#5 0x55a729f3f70d in megasas_handle_frame /home/elmarco/src/qq/hw/scsi/megasas.c:1955
    qemu#6 0x55a729f40939 in megasas_mmio_write /home/elmarco/src/qq/hw/scsi/megasas.c:2119
    qemu#7 0x55a729f41102 in megasas_port_write /home/elmarco/src/qq/hw/scsi/megasas.c:2170
    qemu#8 0x55a729220e60 in memory_region_write_accessor /home/elmarco/src/qq/memory.c:527
    qemu#9 0x55a7292212b3 in access_with_adjusted_size /home/elmarco/src/qq/memory.c:594
    qemu#10 0x55a72922cf70 in memory_region_dispatch_write /home/elmarco/src/qq/memory.c:1473
    qemu#11 0x55a7290f5907 in flatview_write_continue /home/elmarco/src/qq/exec.c:3255
    qemu#12 0x55a7290f5ceb in flatview_write /home/elmarco/src/qq/exec.c:3294
    qemu#13 0x55a7290f6457 in address_space_write /home/elmarco/src/qq/exec.c:3384
    qemu#14 0x55a7290f64a8 in address_space_rw /home/elmarco/src/qq/exec.c:3395
    qemu#15 0x55a72929ecb0 in kvm_handle_io /home/elmarco/src/qq/accel/kvm/kvm-all.c:1729
    qemu#16 0x55a7292a0db5 in kvm_cpu_exec /home/elmarco/src/qq/accel/kvm/kvm-all.c:1969
    qemu#17 0x55a7291c4212 in qemu_kvm_cpu_thread_fn /home/elmarco/src/qq/cpus.c:1215
    qemu#18 0x55a72a966a6c in qemu_thread_start /home/elmarco/src/qq/util/qemu-thread-posix.c:504
    qemu#19 0x7f06ed486593 in start_thread (/lib64/libpthread.so.0+0x7593)

Move the qemu_sglist_destroy() from megasas_complete_command() to
megasas_unmap_frame(), so map/unmap are balanced.

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Marc-André Lureau <[email protected]>
ozbenh pushed a commit that referenced this pull request Nov 1, 2018
Spotted by ASAN:

=================================================================
==5378==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 65536 byte(s) in 1 object(s) allocated from:
    #0 0x7f788f83bc48 in malloc (/lib64/libasan.so.5+0xeec48)
    #1 0x7f788c9923c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5)
    #2 0x5622a1fe37bc in coroutine_trampoline /home/elmarco/src/qq/util/coroutine-ucontext.c:116
    qemu#3 0x7f788a15d75f in __correctly_grouped_prefixwc (/lib64/libc.so.6+0x4c75f)

(Broken in commit 4c8158e.)

Signed-off-by: Marc-André Lureau <[email protected]>
Message-id: [email protected]
Signed-off-by: Max Reitz <[email protected]>
ozbenh pushed a commit that referenced this pull request Nov 1, 2018
We need to set cs->halted to 1 before calling ppc_set_compat. The reason
is that ppc_set_compat kicks the new thread created to manage the
hotplugged KVM virtual CPU, and that thread drives directly to the KVM_RUN
ioctl. When cs->halted is 1, the code:

int kvm_cpu_exec(CPUState *cpu)
...
     if (kvm_arch_process_async_events(cpu)) {
         atomic_set(&cpu->exit_request, 0);
         return EXCP_HLT;
     }
...

returns before it reaches KVM_RUN, giving the main thread time to
finish its job. Otherwise we can fall into a deadlock, because the KVM
thread will issue the KVM_RUN ioctl while the main thread is still
setting up KVM registers. Depending on how these jobs are scheduled,
we'll end up freezing QEMU.
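
The ordering fix boils down to something like this in the hotplug path
(a sketch; cs, cpu, compat_pvr and local_err are the path's existing
locals):

    /* Halt the new vCPU before ppc_set_compat() kicks its KVM thread, so
     * kvm_cpu_exec() returns EXCP_HLT instead of racing into KVM_RUN
     * while the main thread is still setting up registers. */
    cs->halted = 1;
    ppc_set_compat(cpu, compat_pvr, &local_err);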

The following output shows kvm_vcpu_ioctl sleeping because it cannot get
the mutex and never will.
PS: kvm_vcpu_ioctl was triggered by kvm_set_one_reg - compat_pvr.

STATE: TASK_UNINTERRUPTIBLE|TASK_WAKEKILL

PID: 61564  TASK: c000003e981e0780  CPU: 48  COMMAND: "qemu-system-ppc"
 #0 [c000003e982679a0] __schedule at c000000000b10a44
 #1 [c000003e98267a60] schedule at c000000000b113a8
 #2 [c000003e98267a90] schedule_preempt_disabled at c000000000b11910
 qemu#3 [c000003e98267ab0] __mutex_lock at c000000000b132ec
 qemu#4 [c000003e98267bc0] kvm_vcpu_ioctl at c00800000ea03140 [kvm]
 qemu#5 [c000003e98267d20] do_vfs_ioctl at c000000000407d30
 qemu#6 [c000003e98267dc0] ksys_ioctl at c000000000408674
 qemu#7 [c000003e98267e10] sys_ioctl at c0000000004086f8
 qemu#8 [c000003e98267e30] system_call at c00000000000b488

crash> struct -x kvm.vcpus 0xc000003da0000000
vcpus = {0xc000003db4880000, 0xc000003d52b80000, 0xc0000039e9c80000, 0xc000003d0e200000, 0xc000003d58280000, 0x0, 0x0, ...}

crash> struct -x kvm_vcpu.mutex.owner 0xc000003d58280000
  mutex.owner = {
    counter = 0xc000003a23a5c881 <- flag 1: waiters
  },

crash> bt 0xc000003a23a5c880
PID: 61579  TASK: c000003a23a5c880  CPU: 9   COMMAND: "CPU 4/KVM"
(active)

crash> struct -x kvm_vcpu.mutex.wait_list 0xc000003d58280000
  mutex.wait_list = {
    next = 0xc000003e98267b10,
    prev = 0xc000003e98267b10
  },

crash> struct -x mutex_waiter.task 0xc000003e98267b10
  task = 0xc000003e981e0780

The following command-line was used to reproduce the problem (note: gdb
and trace can change the results).

 $ qemu-ppc/build/ppc64-softmmu/qemu-system-ppc64 -cpu host \
     -enable-kvm -m 4096 \
     -smp 4,maxcpus=8,sockets=1,cores=2,threads=4 \
     -display none -nographic \
     -drive file=disk1.qcow2,format=qcow2
 ...
 (qemu) device_add host-spapr-cpu-core,core-id=4
[no interaction is possible after it, only SIGKILL to take the terminal
back]

Signed-off-by: Jose Ricardo Ziviani <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>
Signed-off-by: David Gibson <[email protected]>
ozbenh pushed a commit that referenced this pull request Nov 1, 2018
Spotted by ASAN doing some manual testing:

Direct leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0x7f5fcdc75e50 in calloc (/lib64/libasan.so.5+0xeee50)
    #1 0x7f5fcd47241d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5241d)
    #2 0x55f989be92ce in timer_new /home/elmarco/src/qq/include/qemu/timer.h:561
    qemu#3 0x55f989be92ff in timer_new_ms /home/elmarco/src/qq/include/qemu/timer.h:630
    qemu#4 0x55f989c0219d in hmp_migrate /home/elmarco/src/qq/hmp.c:2038
    qemu#5 0x55f98955927b in handle_hmp_command /home/elmarco/src/qq/monitor.c:3498
    qemu#6 0x55f98955fb8c in monitor_command_cb /home/elmarco/src/qq/monitor.c:4371
    qemu#7 0x55f98ad40f11 in readline_handle_byte /home/elmarco/src/qq/util/readline.c:393
    qemu#8 0x55f98955fa4f in monitor_read /home/elmarco/src/qq/monitor.c:4354
    qemu#9 0x55f98aae30d7 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:175
    qemu#10 0x55f98aae317a in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:187
    qemu#11 0x55f98aae940c in fd_chr_read /home/elmarco/src/qq/chardev/char-fd.c:66
    qemu#12 0x55f98ab63018 in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84
    qemu#13 0x7f5fcd46c8ac in g_main_dispatch gmain.c:3177

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>
ozbenh pushed a commit that referenced this pull request Nov 1, 2018
In qemu_laio_process_completions_and_submit, the AioContext is acquired
before the ioq_submit iteration and after qemu_laio_process_completions,
but the latter is not thread safe either.

This change avoids a number of random crashes when the Main Thread and
an IO Thread collide processing completions for the same AioContext.
This is an example of such crash:

 - The IO Thread is trying to acquire the AioContext at aio_co_enter,
   which evidences that it didn't lock it before:

Thread 3 (Thread 0x7fdfd8bd8700 (LWP 36743)):
 #0  0x00007fdfe0dd542d in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
 #1  0x00007fdfe0dd0de6 in _L_lock_870 () at /lib64/libpthread.so.0
 #2  0x00007fdfe0dd0cdf in __GI___pthread_mutex_lock (mutex=mutex@entry=0x5631fde0e6c0)
    at ../nptl/pthread_mutex_lock.c:114
 qemu#3  0x00005631fc0603a7 in qemu_mutex_lock_impl (mutex=0x5631fde0e6c0, file=0x5631fc23520f "util/async.c", line=511) at util/qemu-thread-posix.c:66
 qemu#4  0x00005631fc05b558 in aio_co_enter (ctx=0x5631fde0e660, co=0x7fdfcc0c2b40) at util/async.c:493
 qemu#5  0x00005631fc05b5ac in aio_co_wake (co=<optimized out>) at util/async.c:478
 qemu#6  0x00005631fbfc51ad in qemu_laio_process_completion (laiocb=<optimized out>) at block/linux-aio.c:104
 qemu#7  0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
    at block/linux-aio.c:222
 qemu#8  0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670)
    at block/linux-aio.c:237
 qemu#9  0x00005631fc05d978 in aio_dispatch_handlers (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:406
 qemu#10 0x00005631fc05e3ea in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=true)
    at util/aio-posix.c:693
 qemu#11 0x00005631fbd7ad96 in iothread_run (opaque=0x5631fde0e1c0) at iothread.c:64
 qemu#12 0x00007fdfe0dcee25 in start_thread (arg=0x7fdfd8bd8700) at pthread_create.c:308
 qemu#13 0x00007fdfe0afc34d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

 - The Main Thread is also processing completions from the same
   AioContext, and crashes due to failed assertion at util/iov.c:78:

Thread 1 (Thread 0x7fdfeb5eac80 (LWP 36740)):
 #0  0x00007fdfe0a391f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
 #1  0x00007fdfe0a3a8e8 in __GI_abort () at abort.c:90
 #2  0x00007fdfe0a32266 in __assert_fail_base (fmt=0x7fdfe0b84e68 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset")
    at assert.c:92
 qemu#3  0x00007fdfe0a32312 in __GI___assert_fail (assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset") at assert.c:101
 qemu#4  0x00005631fc065287 in iov_memset (iov=<optimized out>, iov_cnt=<optimized out>, offset=<optimized out>, offset@entry=65536, fillc=fillc@entry=0, bytes=15515191315812405248) at util/iov.c:78
 qemu#5  0x00005631fc065a63 in qemu_iovec_memset (qiov=<optimized out>, offset=offset@entry=65536, fillc=fillc@entry=0, bytes=<optimized out>) at util/iov.c:410
 qemu#6  0x00005631fbfc5178 in qemu_laio_process_completion (laiocb=0x7fdd920df630) at block/linux-aio.c:88
 qemu#7  0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
    at block/linux-aio.c:222
 qemu#8  0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670)
    at block/linux-aio.c:237
 qemu#9  0x00005631fbfc54ed in qemu_laio_poll_cb (opaque=<optimized out>) at block/linux-aio.c:272
 qemu#10 0x00005631fc05d85e in run_poll_handlers_once (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:497
 qemu#11 0x00005631fc05e2ca in aio_poll (blocking=false, ctx=0x5631fde0e660) at util/aio-posix.c:574
 qemu#12 0x00005631fc05e2ca in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=false)
    at util/aio-posix.c:604
 qemu#13 0x00005631fbfcb8a3 in bdrv_do_drained_begin (ignore_parent=<optimized out>, recursive=<optimized out>, bs=<optimized out>) at block/io.c:273
 qemu#14 0x00005631fbfcb8a3 in bdrv_do_drained_begin (bs=0x5631fe8b6200, recursive=<optimized out>, parent=0x0, ignore_bds_parents=<optimized out>, poll=<optimized out>) at block/io.c:390
 qemu#15 0x00005631fbfbcd2e in blk_drain (blk=0x5631fe83ac80) at block/block-backend.c:1590
 qemu#16 0x00005631fbfbe138 in blk_remove_bs (blk=blk@entry=0x5631fe83ac80) at block/block-backend.c:774
 qemu#17 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:401
 qemu#18 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:449
 qemu#19 0x00005631fbfc9a69 in commit_complete (job=0x5631fe8b94b0, opaque=0x7fdfcc1bb080)
    at block/commit.c:92
 qemu#20 0x00005631fbf7d662 in job_defer_to_main_loop_bh (opaque=0x7fdfcc1b4560) at job.c:973
 qemu#21 0x00005631fc05ad41 in aio_bh_poll (bh=0x7fdfcc01ad90) at util/async.c:90
 qemu#22 0x00005631fc05ad41 in aio_bh_poll (ctx=ctx@entry=0x5631fddffdb0) at util/async.c:118
 qemu#23 0x00005631fc05e210 in aio_dispatch (ctx=0x5631fddffdb0) at util/aio-posix.c:436
 qemu#24 0x00005631fc05ac1e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261
 qemu#25 0x00007fdfeaae44c9 in g_main_context_dispatch (context=0x5631fde00140) at gmain.c:3201
 qemu#26 0x00007fdfeaae44c9 in g_main_context_dispatch (context=context@entry=0x5631fde00140) at gmain.c:3854
 qemu#27 0x00005631fc05d503 in main_loop_wait () at util/main-loop.c:215
 qemu#28 0x00005631fc05d503 in main_loop_wait (timeout=<optimized out>) at util/main-loop.c:238
 qemu#29 0x00005631fc05d503 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:497
 qemu#30 0x00005631fbd81412 in main_loop () at vl.c:1866
 qemu#31 0x00005631fbc18ff3 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
    at vl.c:4647

 - A closer examination shows that s->io_q.in_flight appears to have
   gone backwards:

(gdb) frame 7
 qemu#7  0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
    at block/linux-aio.c:222
222	            qemu_laio_process_completion(laiocb);
(gdb) p s
$2 = (LinuxAioState *) 0x7fdfc0297670
(gdb) p *s
$3 = {aio_context = 0x5631fde0e660, ctx = 0x7fdfeb43b000, e = {rfd = 33, wfd = 33}, io_q = {plugged = 0,
    in_queue = 0, in_flight = 4294967280, blocked = false, pending = {sqh_first = 0x0,
      sqh_last = 0x7fdfc0297698}}, completion_bh = 0x7fdfc0280ef0, event_idx = 21, event_max = 241}
(gdb) p/x s->io_q.in_flight
$4 = 0xfffffff0

Signed-off-by: Sergio Lopez <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
ozbenh pushed a commit that referenced this pull request Nov 1, 2018
Spotted by ASAN while running:

$ tests/migration-test -p /x86_64/migration/postcopy/recovery

=================================================================
==18034==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 33864 byte(s) in 1 object(s) allocated from:
    #0 0x7f3da7f31e50 in calloc (/lib64/libasan.so.5+0xeee50)
    #1 0x7f3da644441d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5241d)
    #2 0x55af9db15440 in qemu_fopen_channel_input /home/elmarco/src/qemu/migration/qemu-file-channel.c:183
    qemu#3 0x55af9db15413 in channel_get_output_return_path /home/elmarco/src/qemu/migration/qemu-file-channel.c:159
    qemu#4 0x55af9db0d4ac in qemu_file_get_return_path /home/elmarco/src/qemu/migration/qemu-file.c:78
    qemu#5 0x55af9dad5e4f in open_return_path_on_source /home/elmarco/src/qemu/migration/migration.c:2295
    qemu#6 0x55af9dadb3bf in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:3111
    qemu#7 0x55af9dae1bf3 in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:91
    qemu#8 0x55af9daddeca in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:108
    qemu#9 0x55af9e13d3db in qio_task_complete /home/elmarco/src/qemu/io/task.c:158
    qemu#10 0x55af9e13ca03 in qio_task_thread_result /home/elmarco/src/qemu/io/task.c:89
    qemu#11 0x7f3da643b1ca in g_idle_dispatch gmain.c:5535

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>
ozbenh pushed a commit that referenced this pull request Nov 1, 2018
Recently, the test case has started failing because some job related
functions want to drop the AioContext lock even though it hasn't been
taken:

    (gdb) bt
    #0  0x00007f51c067c9fb in raise () from /lib64/libc.so.6
    #1  0x00007f51c067e77d in abort () from /lib64/libc.so.6
    #2  0x0000558c9d5dde7b in error_exit (err=<optimized out>, msg=msg@entry=0x558c9d6fe120 <__func__.18373> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36
    qemu#3  0x0000558c9d6b5263 in qemu_mutex_unlock_impl (mutex=mutex@entry=0x558c9f3999a0, file=file@entry=0x558c9d6fd36f "util/async.c", line=line@entry=516) at util/qemu-thread-posix.c:96
    qemu#4  0x0000558c9d6b0565 in aio_context_release (ctx=ctx@entry=0x558c9f399940) at util/async.c:516
    qemu#5  0x0000558c9d5eb3da in job_completed_txn_abort (job=0x558c9f68e640) at job.c:738
    qemu#6  0x0000558c9d5eb227 in job_finish_sync (job=0x558c9f68e640, finish=finish@entry=0x558c9d5eb8d0 <job_cancel_err>, errp=errp@entry=0x0) at job.c:986
    qemu#7  0x0000558c9d5eb8ee in job_cancel_sync (job=<optimized out>) at job.c:941
    qemu#8  0x0000558c9d64d853 in replication_close (bs=<optimized out>) at block/replication.c:148
    qemu#9  0x0000558c9d5e5c9f in bdrv_close (bs=0x558c9f41b020) at block.c:3420
    qemu#10 bdrv_delete (bs=0x558c9f41b020) at block.c:3629
    qemu#11 bdrv_unref (bs=0x558c9f41b020) at block.c:4685
    qemu#12 0x0000558c9d62a3f3 in blk_remove_bs (blk=blk@entry=0x558c9f42a7c0) at block/block-backend.c:783
    qemu#13 0x0000558c9d62a667 in blk_delete (blk=0x558c9f42a7c0) at block/block-backend.c:402
    qemu#14 blk_unref (blk=0x558c9f42a7c0) at block/block-backend.c:457
    qemu#15 0x0000558c9d5dfcea in test_secondary_stop () at tests/test-replication.c:478
    qemu#16 0x00007f51c1f13178 in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0
    qemu#17 0x00007f51c1f1337b in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0
    qemu#18 0x00007f51c1f1337b in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0
    qemu#19 0x00007f51c1f13552 in g_test_run_suite () from /lib64/libglib-2.0.so.0
    qemu#20 0x00007f51c1f13571 in g_test_run () from /lib64/libglib-2.0.so.0
    qemu#21 0x0000558c9d5de31f in main (argc=<optimized out>, argv=<optimized out>) at tests/test-replication.c:581

It is yet unclear whether this should really be considered a bug in the
test case or whether blk_unref() should work for callers that haven't
taken the AioContext lock, but in order to fix the build tests quickly,
just take the AioContext lock around blk_unref().
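
Concretely, the workaround amounts to something like (a sketch):

    AioContext *ctx = blk_get_aio_context(blk);

    aio_context_acquire(ctx);
    blk_unref(blk);
    aio_context_release(ctx);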

Signed-off-by: Kevin Wolf <[email protected]>
ozbenh pushed a commit that referenced this pull request Nov 1, 2018
A socket chardev may not have an associated address (when adding a client
fd manually, for example). But on disconnect, updating the socket filename
expects an address and may lead to this crash:

  Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
  0x0000555555d8c70c in SocketAddress_to_str (prefix=0x555556043062 "disconnected:", addr=0x0, is_listen=false, is_telnet=false) at /home/elmarco/src/qq/chardev/char-socket.c:388
  388	    switch (addr->type) {
  (gdb) bt
  #0  0x0000555555d8c70c in SocketAddress_to_str (prefix=0x555556043062 "disconnected:", addr=0x0, is_listen=false, is_telnet=false) at /home/elmarco/src/qq/chardev/char-socket.c:388
  #1  0x0000555555d8c8aa in update_disconnected_filename (s=0x555556b1ed00) at /home/elmarco/src/qq/chardev/char-socket.c:419
  #2  0x0000555555d8c959 in tcp_chr_disconnect (chr=0x555556b1ed00) at /home/elmarco/src/qq/chardev/char-socket.c:438
  qemu#3  0x0000555555d8cba1 in tcp_chr_hup (channel=0x555556b75690, cond=G_IO_HUP, opaque=0x555556b1ed00) at /home/elmarco/src/qq/chardev/char-socket.c:482
  qemu#4  0x0000555555da596e in qio_channel_fd_source_dispatch (source=0x555556bb68b0, callback=0x555555d8cb58 <tcp_chr_hup>, user_data=0x555556b1ed00) at /home/elmarco/src/qq/io/channel-watch.c:84

Replace filename with a generic "disconnected:socket" in this case.
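
A sketch of the fallback in update_disconnected_filename (field names
are illustrative):

    static void update_disconnected_filename(SocketChardev *s)
    {
        Chardev *chr = CHARDEV(s);

        g_free(chr->filename);
        if (s->addr) {
            chr->filename = SocketAddress_to_str("disconnected:", s->addr,
                                                 s->is_listen, s->is_telnet);
        } else {
            /* No address to format, e.g. a manually added client fd. */
            chr->filename = g_strdup("disconnected:socket");
        }
    }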

Signed-off-by: Marc-André Lureau <[email protected]>
Reviewed-by: Daniel P. Berrangé <[email protected]>
ozbenh pushed a commit that referenced this pull request Nov 1, 2018
Spotted by ASAN:
=================================================================
==11893==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 1120 byte(s) in 28 object(s) allocated from:
    #0 0x7fd0515b0c48 in malloc (/lib64/libasan.so.5+0xeec48)
    #1 0x7fd050ffa3c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5)
    #2 0x559e708b56a4 in qstring_from_str /home/elmarco/src/qq/qobject/qstring.c:66
    qemu#3 0x559e708b4fe0 in qstring_new /home/elmarco/src/qq/qobject/qstring.c:23
    qemu#4 0x559e708bda7d in parse_string /home/elmarco/src/qq/qobject/json-parser.c:143
    qemu#5 0x559e708c1009 in parse_literal /home/elmarco/src/qq/qobject/json-parser.c:484
    qemu#6 0x559e708c1627 in parse_value /home/elmarco/src/qq/qobject/json-parser.c:547
    qemu#7 0x559e708c1c67 in json_parser_parse /home/elmarco/src/qq/qobject/json-parser.c:573
    qemu#8 0x559e708bc0ff in json_message_process_token /home/elmarco/src/qq/qobject/json-streamer.c:92
    qemu#9 0x559e708d1655 in json_lexer_feed_char /home/elmarco/src/qq/qobject/json-lexer.c:292
    qemu#10 0x559e708d1fe1 in json_lexer_feed /home/elmarco/src/qq/qobject/json-lexer.c:339
    qemu#11 0x559e708bc856 in json_message_parser_feed /home/elmarco/src/qq/qobject/json-streamer.c:121
    qemu#12 0x559e708b8b4b in qobject_from_jsonv /home/elmarco/src/qq/qobject/qjson.c:69
    qemu#13 0x559e708b8d02 in qobject_from_json /home/elmarco/src/qq/qobject/qjson.c:83
    qemu#14 0x559e708a74ae in from_json_str /home/elmarco/src/qq/tests/check-qjson.c:30
    qemu#15 0x559e708a9f83 in utf8_string /home/elmarco/src/qq/tests/check-qjson.c:781
    qemu#16 0x7fd05101bc49 in test_case_run gtestutils.c:2255
    qemu#17 0x7fd05101bc49 in g_test_run_suite_internal gtestutils.c:2339

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <[email protected]>
Reviewed-by: Markus Armbruster <[email protected]>
Signed-off-by: Markus Armbruster <[email protected]>
shenki pushed a commit to open-power/qemu that referenced this pull request Dec 11, 2018
The way we determine if we can start the incoming migration was
changed to use migration_has_all_channels() in:

  commit 428d890
  Author: Juan Quintela <[email protected]>
  Date:   Mon Jul 24 13:06:25 2017 +0200

    migration: Create migration_has_all_channels

This method in turn calls multifd_recv_all_channels_created()
which is hardcoded to always return 'true' when multifd is
not in use. This is a latent bug...

...activated in a following commit where that return result
ends up acting as the flag to indicate whether it is possible
to start processing the migration:

  commit 36c2f8b
  Author: Juan Quintela <[email protected]>
  Date:   Wed Mar 7 08:40:52 2018 +0100

    migration: Delay start of migration main routines

This means that if channel initialization fails with normal
migration, it'll never notice and attempt to start the
incoming migration regardless and crash on a NULL pointer.

This can be seen, for example, if a client connects to a server
requiring TLS, but has an invalid x509 certificate:

qemu-system-x86_64: The certificate hasn't got a known issuer
qemu-system-x86_64: migration/migration.c:386: process_incoming_migration_co: Assertion `mis->from_src_file' failed.

 #0  0x00007fffebd24f2b in raise () at /lib64/libc.so.6
 ozbenh#1  0x00007fffebd0f561 in abort () at /lib64/libc.so.6
 ozbenh#2  0x00007fffebd0f431 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
 qemu#3  0x00007fffebd1d692 in  () at /lib64/libc.so.6
 qemu#4  0x0000555555ad027e in process_incoming_migration_co (opaque=<optimized out>) at migration/migration.c:386
 qemu#5  0x0000555555c45e8b in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:116
 qemu#6  0x00007fffebd3a6a0 in __start_context () at /lib64/libc.so.6
 qemu#7  0x0000000000000000 in  ()

To handle the non-multifd case, we check whether mis->from_src_file
is non-NULL. With this in place, the migration server drops the
rejected client and stays around waiting for another, hopefully
valid, client to arrive.
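
The check amounts to something like this (a sketch of
migration_has_all_channels):

    bool migration_has_all_channels(void)
    {
        MigrationIncomingState *mis = migration_incoming_get_current();

        /* Even without multifd, the main channel must have produced a
         * QEMUFile before the incoming migration can start. */
        return multifd_recv_all_channels_created() &&
               mis->from_src_file != NULL;
    }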

Signed-off-by: Daniel P. Berrangé <[email protected]>
Message-Id: <[email protected]>
Reviewed-by: Juan Quintela <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Juan Quintela <[email protected]>
shenki pushed a commit to open-power/qemu that referenced this pull request Dec 11, 2018
With a Spice port chardev, it is possible to reenter
monitor_qapi_event_queue() (when the client disconnects for
example). This will deadlock on monitor_lock.

Instead, use some TLS variables to check for recursion and queue the
events.
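
A simplified sketch of the guard (defer_event and flush_deferred_events
are hypothetical stand-ins for the per-thread queue handling):

    void monitor_qapi_event_queue(QAPIEvent event, QDict *qdict)
    {
        static __thread bool reentered;

        if (reentered) {
            /* Nested call from the emit path (e.g. a Spice disconnect):
             * queue it for later instead of re-taking monitor_lock. */
            defer_event(event, qdict);
            return;
        }
        reentered = true;
        monitor_qapi_event_queue_no_reenter(event, qdict);
        flush_deferred_events();
        reentered = false;
    }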

Fixes:
 (gdb) bt
 #0  0x00007fa69e7217fd in __lll_lock_wait () at /lib64/libpthread.so.0
 ozbenh#1  0x00007fa69e71acf4 in pthread_mutex_lock () at /lib64/libpthread.so.0
 ozbenh#2  0x0000563303567619 in qemu_mutex_lock_impl (mutex=0x563303d3e220 <monitor_lock>, file=0x5633036589a8 "/home/elmarco/src/qq/monitor.c", line=645) at /home/elmarco/src/qq/util/qemu-thread-posix.c:66
 qemu#3  0x0000563302fa6c25 in monitor_qapi_event_queue (event=QAPI_EVENT_SPICE_DISCONNECTED, qdict=0x56330602bde0, errp=0x7ffc6ab5e728) at /home/elmarco/src/qq/monitor.c:645
 qemu#4  0x0000563303549aca in qapi_event_send_spice_disconnected (server=0x563305afd630, client=0x563305745360, errp=0x563303d8d0f0 <error_abort>) at qapi/qapi-events-ui.c:149
 qemu#5  0x00005633033e600f in channel_event (event=3, info=0x5633061b0050) at /home/elmarco/src/qq/ui/spice-core.c:235
 qemu#6  0x00007fa69f6c86bb in reds_handle_channel_event (reds=<optimized out>, event=3, info=0x5633061b0050) at reds.c:316
 qemu#7  0x00007fa69f6b193b in main_dispatcher_self_handle_channel_event (info=0x5633061b0050, event=3, self=0x563304e088c0) at main-dispatcher.c:197
 qemu#8  0x00007fa69f6b193b in main_dispatcher_channel_event (self=0x563304e088c0, event=event@entry=3, info=0x5633061b0050) at main-dispatcher.c:197
 qemu#9  0x00007fa69f6d0833 in red_stream_push_channel_event (s=s@entry=0x563305ad8f50, event=event@entry=3) at red-stream.c:414
 qemu#10 0x00007fa69f6d086b in red_stream_free (s=0x563305ad8f50) at red-stream.c:388
 qemu#11 0x00007fa69f6b7ddc in red_channel_client_finalize (object=0x563304df2360) at red-channel-client.c:347
 qemu#12 0x00007fa6a56b7fb9 in g_object_unref () at /lib64/libgobject-2.0.so.0
 qemu#13 0x00007fa69f6ba212 in red_channel_client_push (rcc=0x563304df2360) at red-channel-client.c:1341
 qemu#14 0x00007fa69f68b259 in red_char_device_send_msg_to_client (client=<optimized out>, msg=0x5633059b6310, dev=0x563304e08bc0) at char-device.c:305
 qemu#15 0x00007fa69f68b259 in red_char_device_send_msg_to_clients (msg=0x5633059b6310, dev=0x563304e08bc0) at char-device.c:305
 qemu#16 0x00007fa69f68b259 in red_char_device_read_from_device (dev=0x563304e08bc0) at char-device.c:353
 qemu#17 0x000056330317d01d in spice_chr_write (chr=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111) at /home/elmarco/src/qq/chardev/spice.c:199
 qemu#18 0x00005633034deee7 in qemu_chr_write_buffer (s=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111, offset=0x7ffc6ab5ea70, write_all=false) at /home/elmarco/src/qq/chardev/char.c:112
 qemu#19 0x00005633034df054 in qemu_chr_write (s=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111, write_all=false) at /home/elmarco/src/qq/chardev/char.c:147
 qemu#20 0x00005633034e1e13 in qemu_chr_fe_write (be=0x563304dbb800, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111) at /home/elmarco/src/qq/chardev/char-fe.c:42
 qemu#21 0x0000563302fa6334 in monitor_flush_locked (mon=0x563304dbb800) at /home/elmarco/src/qq/monitor.c:425
 qemu#22 0x0000563302fa6520 in monitor_puts (mon=0x563304dbb800, str=0x563305de7e9e "") at /home/elmarco/src/qq/monitor.c:468
 qemu#23 0x0000563302fa680c in qmp_send_response (mon=0x563304dbb800, rsp=0x563304df5730) at /home/elmarco/src/qq/monitor.c:517
 qemu#24 0x0000563302fa6905 in qmp_queue_response (mon=0x563304dbb800, rsp=0x563304df5730) at /home/elmarco/src/qq/monitor.c:538
 qemu#25 0x0000563302fa6b5b in monitor_qapi_event_emit (event=QAPI_EVENT_SHUTDOWN, qdict=0x563304df5730) at /home/elmarco/src/qq/monitor.c:624
 qemu#26 0x0000563302fa6c4b in monitor_qapi_event_queue (event=QAPI_EVENT_SHUTDOWN, qdict=0x563304df5730, errp=0x7ffc6ab5ed00) at /home/elmarco/src/qq/monitor.c:649
 qemu#27 0x0000563303548cce in qapi_event_send_shutdown (guest=false, errp=0x563303d8d0f0 <error_abort>) at qapi/qapi-events-run-state.c:58
 qemu#28 0x000056330313bcd7 in main_loop_should_exit () at /home/elmarco/src/qq/vl.c:1822
 qemu#29 0x000056330313bde3 in main_loop () at /home/elmarco/src/qq/vl.c:1862
 qemu#30 0x0000563303143781 in main (argc=3, argv=0x7ffc6ab5f068, envp=0x7ffc6ab5f088) at /home/elmarco/src/qq/vl.c:4644

Note that the error report is now moved to the first caller, which may
receive an error for a recursed event. This is probably fine (95% of
callers use &error_abort; the rest have a NULL error and ignore it).

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <[email protected]>
Reviewed-by: Markus Armbruster <[email protected]>
[*_no_recurse renamed to *_no_reenter, local variables reordered]
Signed-off-by: Markus Armbruster <[email protected]>
daxtens pushed a commit to daxtens/qemu that referenced this pull request Jan 7, 2020
Currently, offloads disabled by the guest via the VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET
command are not preserved on VM migration.
Instead, all offloads reported by the guest features (via VIRTIO_PCI_GUEST_FEATURES)
get enabled.
What happens is: first VirtIONet::curr_guest_offloads is restored and the offloads
are set correctly:

 #0  qemu_set_offload (nc=0x555556a11400, csum=1, tso4=0, tso6=0, ecn=0, ufo=0) at net/net.c:474
 ozbenh#1  virtio_net_apply_guest_offloads (n=0x555557701ca0) at hw/net/virtio-net.c:720
 ozbenh#2  virtio_net_post_load_device (opaque=0x555557701ca0, version_id=11) at hw/net/virtio-net.c:2334
 qemu#3  vmstate_load_state (f=0x5555569dc010, vmsd=0x555556577c80 <vmstate_virtio_net_device>, opaque=0x555557701ca0, version_id=11)
     at migration/vmstate.c:168
 qemu#4  virtio_load (vdev=0x555557701ca0, f=0x5555569dc010, version_id=11) at hw/virtio/virtio.c:2197
 qemu#5  virtio_device_get (f=0x5555569dc010, opaque=0x555557701ca0, size=0, field=0x55555668cd00 <__compound_literal.5>) at hw/virtio/virtio.c:2036
 qemu#6  vmstate_load_state (f=0x5555569dc010, vmsd=0x555556577ce0 <vmstate_virtio_net>, opaque=0x555557701ca0, version_id=11) at migration/vmstate.c:143
 qemu#7  vmstate_load (f=0x5555569dc010, se=0x5555578189e0) at migration/savevm.c:829
 qemu#8  qemu_loadvm_section_start_full (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2211
 qemu#9  qemu_loadvm_state_main (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2395
 qemu#10 qemu_loadvm_state (f=0x5555569dc010) at migration/savevm.c:2467
 qemu#11 process_incoming_migration_co (opaque=0x0) at migration/migration.c:449

However, later on the features are restored, and the offloads get reset to
everything supported by those features:

 #0  qemu_set_offload (nc=0x555556a11400, csum=1, tso4=1, tso6=1, ecn=0, ufo=0) at net/net.c:474
 ozbenh#1  virtio_net_apply_guest_offloads (n=0x555557701ca0) at hw/net/virtio-net.c:720
 ozbenh#2  virtio_net_set_features (vdev=0x555557701ca0, features=5104441767) at hw/net/virtio-net.c:773
 qemu#3  virtio_set_features_nocheck (vdev=0x555557701ca0, val=5104441767) at hw/virtio/virtio.c:2052
 qemu#4  virtio_load (vdev=0x555557701ca0, f=0x5555569dc010, version_id=11) at hw/virtio/virtio.c:2220
 qemu#5  virtio_device_get (f=0x5555569dc010, opaque=0x555557701ca0, size=0, field=0x55555668cd00 <__compound_literal.5>) at hw/virtio/virtio.c:2036
 qemu#6  vmstate_load_state (f=0x5555569dc010, vmsd=0x555556577ce0 <vmstate_virtio_net>, opaque=0x555557701ca0, version_id=11) at migration/vmstate.c:143
 qemu#7  vmstate_load (f=0x5555569dc010, se=0x5555578189e0) at migration/savevm.c:829
 qemu#8  qemu_loadvm_section_start_full (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2211
 qemu#9  qemu_loadvm_state_main (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2395
 qemu#10 qemu_loadvm_state (f=0x5555569dc010) at migration/savevm.c:2467
 qemu#11 process_incoming_migration_co (opaque=0x0) at migration/migration.c:449

Fix this by preserving the state in a saved_guest_offloads field and
pushing offload initialization out to the new post-load hook.
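
A rough sketch of that approach (the field and hook names follow the wording
above; the real patch may differ in detail):

  /* VirtIONet (hw/net/virtio-net.h) grows a shadow field next to
   * curr_guest_offloads: */
  uint64_t saved_guest_offloads;

  /* Device post-load runs before the features are restored, so only
   * record the offloads here... */
  static int virtio_net_post_load_device(void *opaque, int version_id)
  {
      VirtIONet *n = opaque;

      n->saved_guest_offloads = n->curr_guest_offloads;
      return 0;
  }

  /* ...and re-apply them from a virtio-level post-load hook, which runs
   * after virtio_set_features_nocheck() has clobbered curr_guest_offloads: */
  static int virtio_net_post_load_virtio(VirtIODevice *vdev)
  {
      VirtIONet *n = VIRTIO_NET(vdev);

      n->curr_guest_offloads = n->saved_guest_offloads;
      virtio_net_apply_guest_offloads(n);
      return 0;
  }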

Cc: [email protected]
Signed-off-by: Mikhail Sennikovsky <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
daxtens pushed a commit to daxtens/qemu that referenced this pull request Jan 7, 2020
Guests can crash QEMU when writing to the PnP registers:

  $ echo 'writeb 0x800ff042 69' | qemu-system-sparc -M leon3_generic -S -bios /etc/magic -qtest stdio
  [I 1571938309.932255] OPENED
  [R +0.063474] writeb 0x800ff042 69
  Segmentation fault (core dumped)

  (gdb) bt
  #0  0x0000000000000000 in  ()
  ozbenh#1  0x0000555f4bcdf0bc in memory_region_write_with_attrs_accessor (mr=0x555f4d7be8c0, addr=66, value=0x7fff07d00f08, size=1, shift=0, mask=255, attrs=...) at memory.c:503
  ozbenh#2  0x0000555f4bcdf185 in access_with_adjusted_size (addr=66, value=0x7fff07d00f08, size=1, access_size_min=1, access_size_max=4, access_fn=0x555f4bcdeff4 <memory_region_write_with_attrs_accessor>, mr=0x555f4d7be8c0, attrs=...) at memory.c:539
  qemu#3  0x0000555f4bce2243 in memory_region_dispatch_write (mr=0x555f4d7be8c0, addr=66, data=69, op=MO_8, attrs=...) at memory.c:1489
  qemu#4  0x0000555f4bc80b20 in flatview_write_continue (fv=0x555f4d92c400, addr=2148528194, attrs=..., buf=0x7fff07d01120 "E", len=1, addr1=66, l=1, mr=0x555f4d7be8c0) at exec.c:3161
  qemu#5  0x0000555f4bc80c65 in flatview_write (fv=0x555f4d92c400, addr=2148528194, attrs=..., buf=0x7fff07d01120 "E", len=1) at exec.c:3201
  qemu#6  0x0000555f4bc80fb0 in address_space_write (as=0x555f4d7aa460, addr=2148528194, attrs=..., buf=0x7fff07d01120 "E", len=1) at exec.c:3291
  qemu#7  0x0000555f4bc8101d in address_space_rw (as=0x555f4d7aa460, addr=2148528194, attrs=..., buf=0x7fff07d01120 "E", len=1, is_write=true) at exec.c:3301
  qemu#8  0x0000555f4bcdb388 in qtest_process_command (chr=0x555f4c2ed7e0 <qtest_chr>, words=0x555f4db0c5d0) at qtest.c:432

Instead of crashing, log the access as unimplemented.
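
A minimal sketch of that pattern (the handler and ops names are assumed to
match the existing read side):

  #include "qemu/osdep.h"
  #include "qemu/log.h"
  #include "exec/memory.h"

  static void grlib_apb_pnp_write(void *opaque, hwaddr addr,
                                  uint64_t value, unsigned size)
  {
      /* The PnP area is read-only: log and ignore the access instead of
       * calling a NULL write handler. */
      qemu_log_mask(LOG_UNIMP, "%s: unimplemented write of 0x%" PRIx64
                    " to offset 0x%" HWADDR_PRIx "\n",
                    __func__, value, addr);
  }

  static const MemoryRegionOps grlib_apb_pnp_ops = {
      .read       = grlib_apb_pnp_read,   /* existing read handler */
      .write      = grlib_apb_pnp_write,  /* previously left NULL -> SEGV */
      .endianness = DEVICE_BIG_ENDIAN,
  };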

Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: KONRAD Frederic <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Laurent Vivier <[email protected]>
legoater pushed a commit to open-power/qemu that referenced this pull request Apr 13, 2021
We can't know whether the caller read enough data into the memory pointed
to by ext_hdr to cast it as an ip6_ext_hdr_routing.
Declare rt_hdr on the stack and fill it again from the iovec.

Since we already checked there is enough data in the iovec buffer,
simply add an assert() call to consume the bytes_read variable.
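
A rough shape of the resulting code in _eth_get_rss_ex_dst_addr() (variable
names such as pkt, pkt_frags and rthdr_offset are assumed from the
surrounding function):

  struct ip6_ext_hdr_routing rt_hdr;
  size_t input_size = iov_size(pkt, pkt_frags);
  size_t bytes_read;

  if (input_size < rthdr_offset + sizeof(rt_hdr)) {
      return false;
  }

  /* Copy the whole routing header onto the stack instead of casting the
   * caller's smaller ext_hdr buffer. */
  bytes_read = iov_to_buf(pkt, pkt_frags, rthdr_offset,
                          &rt_hdr, sizeof(rt_hdr));
  assert(bytes_read == sizeof(rt_hdr));   /* size was checked just above */

  if (rt_hdr.rtype == 2 && rt_hdr.segleft == 1) {
      /* extract the RSS destination address as before */
  }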

This fixes a 2-byte buffer overrun in eth_parse_ipv6_hdr() reported
by the QEMU fuzzer:

  $ cat << EOF | ./qemu-system-i386 -M pc-q35-5.0 \
    -accel qtest -monitor none \
    -serial none -nographic -qtest stdio
  outl 0xcf8 0x80001010
  outl 0xcfc 0xe1020000
  outl 0xcf8 0x80001004
  outw 0xcfc 0x7
  write 0x25 0x1 0x86
  write 0x26 0x1 0xdd
  write 0x4f 0x1 0x2b
  write 0xe1020030 0x4 0x190002e1
  write 0xe102003a 0x2 0x0807
  write 0xe1020048 0x4 0x12077cdd
  write 0xe1020400 0x4 0xba077cdd
  write 0xe1020420 0x4 0x190002e1
  write 0xe1020428 0x4 0x3509d807
  write 0xe1020438 0x1 0xe2
  EOF
  =================================================================
  ==2859770==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffdef904902 at pc 0x561ceefa78de bp 0x7ffdef904820 sp 0x7ffdef904818
  READ of size 1 at 0x7ffdef904902 thread T0
      #0 0x561ceefa78dd in _eth_get_rss_ex_dst_addr net/eth.c:410:17
      ozbenh#1 0x561ceefa41fb in eth_parse_ipv6_hdr net/eth.c:532:17
      ozbenh#2 0x561cef7de639 in net_tx_pkt_parse_headers hw/net/net_tx_pkt.c:228:14
      qemu#3 0x561cef7dbef4 in net_tx_pkt_parse hw/net/net_tx_pkt.c:273:9
      qemu#4 0x561ceec29f22 in e1000e_process_tx_desc hw/net/e1000e_core.c:730:29
      qemu#5 0x561ceec28eac in e1000e_start_xmit hw/net/e1000e_core.c:927:9
      qemu#6 0x561ceec1baab in e1000e_set_tdt hw/net/e1000e_core.c:2444:9
      qemu#7 0x561ceebf300e in e1000e_core_write hw/net/e1000e_core.c:3256:9
      qemu#8 0x561cef3cd4cd in e1000e_mmio_write hw/net/e1000e.c:110:5

  Address 0x7ffdef904902 is located in stack of thread T0 at offset 34 in frame
      #0 0x561ceefa320f in eth_parse_ipv6_hdr net/eth.c:486

    This frame has 1 object(s):
      [32, 34) 'ext_hdr' (line 487) <== Memory access at offset 34 overflows this variable
  HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
        (longjmp and C++ exceptions *are* supported)
  SUMMARY: AddressSanitizer: stack-buffer-overflow net/eth.c:410:17 in _eth_get_rss_ex_dst_addr
  Shadow bytes around the buggy address:
    0x10003df188d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x10003df188e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x10003df188f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x10003df18900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x10003df18910: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
  =>0x10003df18920:[02]f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
    0x10003df18930: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x10003df18940: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x10003df18950: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x10003df18960: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x10003df18970: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  Shadow byte legend (one shadow byte represents 8 application bytes):
    Addressable:           00
    Partially addressable: 01 02 03 04 05 06 07
    Stack left redzone:      f1
    Stack right redzone:     f3
  ==2859770==ABORTING

Add the corresponding qtest case with the fuzzer reproducer.

FWIW GCC 11 similarly reported:

  net/eth.c: In function 'eth_parse_ipv6_hdr':
  net/eth.c:410:15: error: array subscript 'struct ip6_ext_hdr_routing[0]' is partly outside array bounds of 'struct ip6_ext_hdr[1]' [-Werror=array-bounds]
    410 |     if ((rthdr->rtype == 2) && (rthdr->segleft == 1)) {
        |          ~~~~~^~~~~~~
  net/eth.c:485:24: note: while referencing 'ext_hdr'
    485 |     struct ip6_ext_hdr ext_hdr;
        |                        ^~~~~~~
  net/eth.c:410:38: error: array subscript 'struct ip6_ext_hdr_routing[0]' is partly outside array bounds of 'struct ip6_ext_hdr[1]' [-Werror=array-bounds]
    410 |     if ((rthdr->rtype == 2) && (rthdr->segleft == 1)) {
        |                                 ~~~~~^~~~~~~~~
  net/eth.c:485:24: note: while referencing 'ext_hdr'
    485 |     struct ip6_ext_hdr ext_hdr;
        |                        ^~~~~~~

Cc: [email protected]
Buglink: https://bugs.launchpad.net/qemu/+bug/1879531
Reported-by: Alexander Bulekov <[email protected]>
Reported-by: Miroslav Rezanina <[email protected]>
Reviewed-by: Stefano Garzarella <[email protected]>
Reviewed-by: Miroslav Rezanina <[email protected]>
Fixes: eb70002 ("net_pkt: Extend packet abstraction as required by e1000e functionality")
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
legoater pushed a commit to open-power/qemu that referenced this pull request Apr 13, 2021
Incoming enabled bitmaps are busy, because we do
bdrv_dirty_bitmap_create_successor() for them. But disabled bitmaps
being migrated are not marked busy, so the user can remove them during the
incoming migration. Then we may crash in cancel_incoming_locked() when we
try to remove a bitmap that was already removed by the user, like this:

 #0  qemu_mutex_lock_impl (mutex=0x5593d88c50d1, file=0x559680554b20
   "../block/dirty-bitmap.c", line=64) at ../util/qemu-thread-posix.c:77
 ozbenh#1  bdrv_dirty_bitmaps_lock (bs=0x5593d88c0ee9)
   at ../block/dirty-bitmap.c:64
 ozbenh#2  bdrv_release_dirty_bitmap (bitmap=0x5596810e9570)
   at ../block/dirty-bitmap.c:362
 qemu#3  cancel_incoming_locked (s=0x559680be8208 <dbm_state+40>)
   at ../migration/block-dirty-bitmap.c:918
 qemu#4  dirty_bitmap_load (f=0x559681d02b10, opaque=0x559680be81e0
   <dbm_state>, version_id=1) at ../migration/block-dirty-bitmap.c:1194
 qemu#5  vmstate_load (f=0x559681d02b10, se=0x559680fb5810)
   at ../migration/savevm.c:908
 qemu#6  qemu_loadvm_section_part_end (f=0x559681d02b10,
   mis=0x559680fb4a30) at ../migration/savevm.c:2473
 qemu#7  qemu_loadvm_state_main (f=0x559681d02b10, mis=0x559680fb4a30)
   at ../migration/savevm.c:2626
 qemu#8  postcopy_ram_listen_thread (opaque=0x0)
   at ../migration/savevm.c:1871
 qemu#9  qemu_thread_start (args=0x5596817ccd10)
   at ../util/qemu-thread-posix.c:521
 qemu#10 start_thread () at /lib64/libpthread.so.0
 qemu#11 clone () at /lib64/libc.so.6

Note the bs pointer taken from the bitmap: it's clearly misaligned. That's
because this is a use-after-free; the bitmap has already been freed.

So, let's make disabled bitmaps (being migrated) busy during incoming
migration.
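
bdrv_dirty_bitmap_set_busy() is the existing helper for that; a rough sketch
of both sides, with the surrounding structure and the flag name assumed:

  /* dirty_bitmap_load_start(): enabled bitmaps are already busy via
   * bdrv_dirty_bitmap_create_successor(); mark disabled ones busy too. */
  if (!(flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED)) {
      bdrv_dirty_bitmap_set_busy(bitmap, true);
  }

  /* ...and drop the busy flag again once the incoming migration for this
   * bitmap completes or is cancelled: */
  bdrv_dirty_bitmap_set_busy(bitmap, false);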

Signed-off-by: Vladimir Sementsov-Ogievskiy <[email protected]>
Signed-off-by: Stefan Hajnoczi <[email protected]>
Message-Id: <[email protected]>
legoater pushed a commit to open-power/qemu that referenced this pull request Apr 13, 2021
When building with --enable-sanitizers we get:

  Direct leak of 32 byte(s) in 2 object(s) allocated from:
      #0 0x5618479ec7cf in malloc (qemu-system-aarch64+0x233b7cf)
      ozbenh#1 0x7f675745f958 in g_malloc (/lib64/libglib-2.0.so.0+0x58958)
      ozbenh#2 0x561847f02ca2 in usb_packet_init hw/usb/core.c:531:5
      qemu#3 0x561848df4df4 in usb_ehci_init hw/usb/hcd-ehci.c:2575:5
      qemu#4 0x561847c119ac in ehci_sysbus_init hw/usb/hcd-ehci-sysbus.c:73:5
      qemu#5 0x56184a5bdab8 in object_init_with_type qom/object.c:375:9
      qemu#6 0x56184a5bd955 in object_init_with_type qom/object.c:371:9
      qemu#7 0x56184a5a2bda in object_initialize_with_type qom/object.c:517:5
      qemu#8 0x56184a5a24d5 in object_initialize qom/object.c:536:5
      qemu#9 0x56184a5a2f6c in object_initialize_child_with_propsv qom/object.c:566:5
      qemu#10 0x56184a5a2e60 in object_initialize_child_with_props qom/object.c:549:10
      qemu#11 0x56184a5a3a1e in object_initialize_child_internal qom/object.c:603:5
      qemu#12 0x561849542d18 in npcm7xx_init hw/arm/npcm7xx.c:427:5

Similarly to commit d710e1e ("usb: ehci: fix memory leak in
ehci"), fix by calling usb_ehci_finalize() to free the USBPacket.

Fixes: 7341ea0
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Gerd Hoffmann <[email protected]>
legoater pushed a commit to open-power/qemu that referenced this pull request Apr 13, 2021
When building with --enable-sanitizers we get:

  Direct leak of 16 byte(s) in 1 object(s) allocated from:
      #0 0x5618479ec7cf in malloc (qemu-system-aarch64+0x233b7cf)
      ozbenh#1 0x7f675745f958 in g_malloc (/lib64/libglib-2.0.so.0+0x58958)
      ozbenh#2 0x561847c2dcc9 in xlnx_dp_init hw/display/xlnx_dp.c:1259:5
      qemu#3 0x56184a5bdab8 in object_init_with_type qom/object.c:375:9
      qemu#4 0x56184a5a2bda in object_initialize_with_type qom/object.c:517:5
      qemu#5 0x56184a5a24d5 in object_initialize qom/object.c:536:5
      qemu#6 0x56184a5a2f6c in object_initialize_child_with_propsv qom/object.c:566:5
      qemu#7 0x56184a5a2e60 in object_initialize_child_with_props qom/object.c:549:10
      qemu#8 0x56184a5a3a1e in object_initialize_child_internal qom/object.c:603:5
      qemu#9 0x5618495aa431 in xlnx_zynqmp_init hw/arm/xlnx-zynqmp.c:273:5

The RX/TX FIFOs are created in xlnx_dp_init(); add xlnx_dp_finalize()
to destroy them.
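
A sketch of that finalize hook (the FIFO field names are assumed from
xlnx_dp_init()):

  static void xlnx_dp_finalize(Object *obj)
  {
      XlnxDPState *s = XLNX_DP(obj);

      /* Free the FIFOs created with fifo8_create() in xlnx_dp_init(). */
      fifo8_destroy(&s->rx_fifo);
      fifo8_destroy(&s->tx_fifo);
  }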

Fixes: 58ac482 ("introduce xlnx-dp")
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Alistair Francis <[email protected]>
Message-id: [email protected]
Signed-off-by: Peter Maydell <[email protected]>
legoater pushed a commit to open-power/qemu that referenced this pull request Apr 13, 2021
g_hash_table_add() always retains ownership of the pointer passed in as
the key. Its return status merely indicates whether the added entry was
new or replaced an existing entry. Thus the key must never be freed after
this method returns.
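
Reduced to its essence, the broken pattern and its fix look like this
(names are illustrative):

  gchar *name = g_strdup(module_name);

  /* Wrong: the table keeps 'name' whether the entry was new or replaced
   * an existing one, so freeing it here leaves a dangling key behind. */
  if (!g_hash_table_add(loaded_modules, name)) {
      g_free(name);   /* later lookups read freed memory (see ASAN below) */
  }

  /* Right: hand ownership to the table unconditionally and only use the
   * return value to decide whether the module was already loaded. */
  if (!g_hash_table_add(loaded_modules, g_strdup(module_name))) {
      return;   /* already loaded */
  }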

Spotted by ASAN:

==2407186==ERROR: AddressSanitizer: heap-use-after-free on address 0x6020003ac4f0 at pc 0x7ffff766659c bp 0x7fffffffd1d0 sp 0x7fffffffc980
READ of size 1 at 0x6020003ac4f0 thread T0
    #0 0x7ffff766659b  (/lib64/libasan.so.6+0x8a59b)
    ozbenh#1 0x7ffff6bfa843 in g_str_equal ../glib/ghash.c:2303
    ozbenh#2 0x7ffff6bf8167 in g_hash_table_lookup_node ../glib/ghash.c:493
    qemu#3 0x7ffff6bf9b78 in g_hash_table_insert_internal ../glib/ghash.c:1598
    qemu#4 0x7ffff6bf9c32 in g_hash_table_add ../glib/ghash.c:1689
    qemu#5 0x5555596caad4 in module_load_one ../util/module.c:233
    qemu#6 0x5555596ca949 in module_load_one ../util/module.c:225
    qemu#7 0x5555596ca949 in module_load_one ../util/module.c:225
    qemu#8 0x5555596cbdf4 in module_load_qom_all ../util/module.c:349

Typical C bug...

Fixes: 9062912 ("module: use g_hash_table_add()")
Cc: [email protected]
Signed-off-by: Marc-André Lureau <[email protected]>
Reviewed-by: Daniel P. Berrangé <[email protected]>
Message-Id: <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
legoater pushed a commit to open-power/qemu that referenced this pull request Jul 28, 2021
modinfo runs the preprocessor and therefore needs all generated input files
to be there.  The "depends" clause does not work in Meson 0.55.3, so for
now use "input".

Part #2: Update the rule for target-specific modules too.

Signed-off-by: Gerd Hoffmann <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
legoater pushed a commit to open-power/qemu that referenced this pull request Jul 28, 2021
…m' into staging

Bugfixes.

# gpg: Signature made Sat 24 Jul 2021 07:11:18 BST
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "[email protected]"
# gpg: Good signature from "Paolo Bonzini <[email protected]>" [full]
# gpg:                 aka "Paolo Bonzini <[email protected]>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini-gitlab/tags/for-upstream:
  qom: use correct field name when getting/setting alias properties
  qapi: introduce forwarding visitor
  gitlab: only let pages be published from default branch
  MAINTAINERS: Add memory_mapping.h and memory_mapping.c to "Memory API"
  MAINTAINERS: Add Peter Xu and myself as co-maintainer of "Memory API"
  MAINTAINERS: Replace Eduardo as "Host Memory Backends" maintainer
  i386: do not call cpudef-only models functions for max, host, base
  target/i386: Added consistency checks for CR3
  meson: fix dependencies for modinfo #2

Signed-off-by: Peter Maydell <[email protected]>
legoater pushed a commit to open-power/qemu that referenced this pull request Dec 17, 2021
When trying to use the pc-dimm device on a non-NUMA machine, we get:

  $ qemu-system-arm -M none -cpu max -S \
      -object memory-backend-file,id=mem1,size=1M,mem-path=/tmp/1m \
      -device pc-dimm,id=dimm1,memdev=mem1
  Segmentation fault (core dumped)

  (gdb) bt
  #0  pc_dimm_realize (dev=0x555556da3e90, errp=0x7fffffffcd10) at hw/mem/pc-dimm.c:184
  ozbenh#1  0x0000555555fe1f8f in device_set_realized (obj=0x555556da3e90, value=true, errp=0x7fffffffce18) at hw/core/qdev.c:531
  ozbenh#2  0x0000555555feb4a9 in property_set_bool (obj=0x555556da3e90, v=0x555556e54420, name=0x5555563c3c41 "realized", opaque=0x555556a704f0, errp=0x7fffffffce18) at qom/object.c:2257

To avoid that crash, restrict the pc-dimm NUMA check to machines
supporting NUMA, and do not allow the use of the 'node' property on
non-NUMA machines.
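
A sketch of the realize-time guard this implies (error wording and the exact
structure are illustrative):

  MachineState *ms = MACHINE(qdev_get_machine());

  if (!ms->numa_state) {
      /* Non-NUMA machine: only reject an explicit 'node', and skip the
       * range check that dereferenced ms->numa_state. */
      if (dimm->node) {
          error_setg(errp, "machine doesn't support NUMA, "
                     "the 'node' property must not be set");
          return;
      }
  } else if (dimm->node >= ms->numa_state->num_nodes) {
      error_setg(errp, "'node' property value %" PRIu32
                 " exceeds the number of NUMA nodes %d",
                 dimm->node, ms->numa_state->num_nodes);
      return;
  }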

Suggested-by: Igor Mammedov <[email protected]>
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
legoater pushed a commit to open-power/qemu that referenced this pull request Dec 17, 2021
Without the previous commit, when running 'make check-qtest-i386'
with QEMU configured with '--enable-sanitizers' we get:

  AddressSanitizer:DEADLYSIGNAL
  =================================================================
  ==287878==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000344
  ==287878==The signal is caused by a WRITE memory access.
  ==287878==Hint: address points to the zero page.
      #0 0x564b2e5bac27 in blk_inc_in_flight block/block-backend.c:1346:5
      ozbenh#1 0x564b2e5bb228 in blk_pwritev_part block/block-backend.c:1317:5
      ozbenh#2 0x564b2e5bcd57 in blk_pwrite block/block-backend.c:1498:11
      qemu#3 0x564b2ca1cdd3 in fdctrl_write_data hw/block/fdc.c:2221:17
      qemu#4 0x564b2ca1b2f7 in fdctrl_write hw/block/fdc.c:829:9
      qemu#5 0x564b2dc49503 in portio_write softmmu/ioport.c:201:9

Add the reproducer for CVE-2021-20196.

Suggested-by: Alexander Bulekov <[email protected]>
Reviewed-by: Darren Kenny <[email protected]>
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Message-id: [email protected]
Signed-off-by: John Snow <[email protected]>