qga: replace GetIfEntry with GetIfEntry2 by luzhipeng-zte · Pull Request #5 · mdroth/qemu

luzhipeng-zte · 2017-10-26T09:12:40Z

The data obtained by GetIfEntry is 32 bits, and it may overflow. Thus using GetIfEntry2 instead of GetIfEntry.

Signed-off-by: ZhiPeng Lu lu.zhipeng@zte.com.cn

qga: replace GetIfEntry The data obtained by GetIfEntry is 32 bits, and it may overflow. Thus using GetIfEntry2 instead of GetIfEntry.

The find_desc_by_name() from util/qemu-option.c relies on the .name not being NULL to call strcmp(). This check becomes unsafe when the list is not NULL-terminated, which is the case of nbd_runtime_opts in block/nbd.c, and can result in segmentation fault when strcmp() tries to access an invalid memory: #0 0x00007fff8c75f7d4 in __strcmp_power9 () from /lib64/libc.so.6 #1 0x00000000102d3ec8 in find_desc_by_name (desc=0x1036d6f0, name=0x28e46670 "server.path") at util/qemu-option.c:166 #2 0x00000000102d93e0 in qemu_opts_absorb_qdict (opts=0x28e47a80, qdict=0x28e469a0, errp=0x7fffec247c98) at util/qemu-option.c:1026 #3 0x000000001012a2e4 in nbd_open (bs=0x28e42290, options=0x28e469a0, flags=24578, errp=0x7fffec247d80) at block/nbd.c:406 #4 0x00000000100144e8 in bdrv_open_driver (bs=0x28e42290, drv=0x1036e070 <bdrv_nbd_unix>, node_name=0x0, options=0x28e469a0, open_flags=24578, errp=0x7fffec247f50) at block.c:1135 #5 0x0000000010015b04 in bdrv_open_common (bs=0x28e42290, file=0x0, options=0x28e469a0, errp=0x7fffec247f50) at block.c:1395 >From gdb, the desc[i].name was not NULL and resulted in strcmp() accessing an invalid memory: >>> p desc[5] $8 = { name = 0x1037f098 "R27A", type = 1561964883, help = 0xc0bbb23e <error: Cannot access memory at address 0xc0bbb23e>, def_value_str = 0x2 <error: Cannot access memory at address 0x2> } >>> p desc[6] $9 = { name = 0x103dac78 <__gcov0.do_qemu_init_bdrv_nbd_init> "\001", type = 272101528, help = 0x29ec0b754403e31f <error: Cannot access memory at address 0x29ec0b754403e31f>, def_value_str = 0x81f343b9 <error: Cannot access memory at address 0x81f343b9> } This patch fixes the segmentation fault in strcmp() by adding a NULL element at the end of nbd_runtime_opts.desc list, which is the common practice to most of other structs like runtime_opts in block/null.c. Thus, the desc[i].name != NULL check becomes safe because it will not evaluate to true when .desc list reached its end. Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com> Buglink: https://bugs.launchpad.net/qemu/+bug/1727259 Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.vnet.ibm.com> Message-Id: <20180105133241.14141-2-muriloo@linux.vnet.ibm.com> CC: qemu-stable@nongnu.org Fixes: 7ccc44f Signed-off-by: Eric Blake <eblake@redhat.com> (cherry picked from commit c436573) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

/public/qobject_is_equal_conversion: OK ================================================================= ==14396==ERROR: LeakSanitizer: detected memory leaks Direct leak of 56 byte(s) in 1 object(s) allocated from: #0 0x7f07682c5850 in malloc (/lib64/libasan.so.4+0xde850) #1 0x7f0767d12f0c in g_malloc ../glib/gmem.c:94 #2 0x7f0767d131cf in g_malloc_n ../glib/gmem.c:331 #3 0x562bd767371f in do_test_equality /home/elmarco/src/qq/tests/check-qobject.c:49 #4 0x562bd7674a35 in qobject_is_equal_dict_test /home/elmarco/src/qq/tests/check-qobject.c:267 #5 0x7f0767d37b04 in test_case_run ../glib/gtestutils.c:2237 #6 0x7f0767d37ec4 in g_test_run_suite_internal ../glib/gtestutils.c:2321 #7 0x7f0767d37f6d in g_test_run_suite_internal ../glib/gtestutils.c:2333 #8 0x7f0767d38184 in g_test_run_suite ../glib/gtestutils.c:2408 #9 0x7f0767d36e0d in g_test_run ../glib/gtestutils.c:1674 #10 0x562bd7674e75 in main /home/elmarco/src/qq/tests/check-qobject.c:327 #11 0x7f0766009039 in __libc_start_main (/lib64/libc.so.6+0x21039) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20180104160523.22995-9-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Note that data_dir[] will now point to allocated strings. Fixes: Direct leak of 16 byte(s) in 1 object(s) allocated from: #0 0x7f1448181850 in malloc (/lib64/libasan.so.4+0xde850) #1 0x7f1446ed8f0c in g_malloc ../glib/gmem.c:94 #2 0x7f1446ed91cf in g_malloc_n ../glib/gmem.c:331 #3 0x7f1446ef739a in g_strsplit ../glib/gstrfuncs.c:2364 #4 0x55cf276439d7 in main /home/elmarco/src/qq/vl.c:4311 #5 0x7f143dfad039 in __libc_start_main (/lib64/libc.so.6+0x21039) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20180104160523.22995-10-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Fixes leaks such as: Direct leak of 2 byte(s) in 1 object(s) allocated from: #0 0x7eff58beb850 in malloc (/lib64/libasan.so.4+0xde850) #1 0x7eff57942f0c in g_malloc ../glib/gmem.c:94 #2 0x7eff579431cf in g_malloc_n ../glib/gmem.c:331 #3 0x7eff5795f6eb in g_strdup ../glib/gstrfuncs.c:363 #4 0x55db720f1d46 in readline_hist_add /home/elmarco/src/qq/util/readline.c:258 #5 0x55db720f2d34 in readline_handle_byte /home/elmarco/src/qq/util/readline.c:387 #6 0x55db71539d00 in monitor_read /home/elmarco/src/qq/monitor.c:3896 #7 0x55db71f9be35 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:167 #8 0x55db71f9bed3 in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:179 #9 0x55db71fa013c in fd_chr_read /home/elmarco/src/qq/chardev/char-fd.c:66 #10 0x55db71fe18a8 in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84 #11 0x7eff5793a90b in g_main_dispatch ../glib/gmain.c:3182 #12 0x7eff5793b7ac in g_main_context_dispatch ../glib/gmain.c:3847 #13 0x55db720af3bd in glib_pollfds_poll /home/elmarco/src/qq/util/main-loop.c:214 #14 0x55db720af505 in os_host_main_loop_wait /home/elmarco/src/qq/util/main-loop.c:261 #15 0x55db720af6d6 in main_loop_wait /home/elmarco/src/qq/util/main-loop.c:515 #16 0x55db7184e0de in main_loop /home/elmarco/src/qq/vl.c:1995 #17 0x55db7185e956 in main /home/elmarco/src/qq/vl.c:4914 #18 0x7eff4ea17039 in __libc_start_main (/lib64/libc.so.6+0x21039) (while at it, use g_new0(ReadLineState), it's a bit easier to read) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20180104160523.22995-11-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

ASAN complains about: ==8856==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffd8a1fe168 at pc 0x561136cb4451 bp 0x7ffd8a1fe130 sp 0x7ffd8a1fd8e0 READ of size 16 at 0x7ffd8a1fe168 thread T0 #0 0x561136cb4450 in __asan_memcpy (/home/elmarco/src/qq/build/tests/test-crypto-ivgen+0x110450) #1 0x561136d2a6a7 in qcrypto_ivgen_essiv_calculate /home/elmarco/src/qq/crypto/ivgen-essiv.c:83:5 #2 0x561136d29af8 in qcrypto_ivgen_calculate /home/elmarco/src/qq/crypto/ivgen.c:72:12 #3 0x561136d07c8e in test_ivgen /home/elmarco/src/qq/tests/test-crypto-ivgen.c:148:5 #4 0x7f77772c3b04 in test_case_run /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2237 #5 0x7f77772c3ec4 in g_test_run_suite_internal /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2321 #6 0x7f77772c3f6d in g_test_run_suite_internal /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2333 #7 0x7f77772c3f6d in g_test_run_suite_internal /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2333 #8 0x7f77772c3f6d in g_test_run_suite_internal /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2333 #9 0x7f77772c4184 in g_test_run_suite /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2408 #10 0x7f77772c2e0d in g_test_run /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:1674 #11 0x561136d0799b in main /home/elmarco/src/qq/tests/test-crypto-ivgen.c:173:12 #12 0x7f77756e6039 in __libc_start_main (/lib64/libc.so.6+0x21039) #13 0x561136c13d89 in _start (/home/elmarco/src/qq/build/tests/test-crypto-ivgen+0x6fd89) Address 0x7ffd8a1fe168 is located in stack of thread T0 at offset 40 in frame #0 0x561136d2a40f in qcrypto_ivgen_essiv_calculate /home/elmarco/src/qq/crypto/ivgen-essiv.c:76 This frame has 1 object(s): [32, 40) 'sector.addr' <== Memory access at offset 40 overflows this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext (longjmp and C++ exceptions *are* supported) SUMMARY: AddressSanitizer: stack-buffer-overflow (/home/elmarco/src/qq/build/tests/test-crypto-ivgen+0x110450) in __asan_memcpy Shadow bytes around the buggy address: 0x100031437bd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437be0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437bf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437c10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 =>0x100031437c20: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00[f3]f3 f3 0x100031437c30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437c40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437c50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437c60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437c70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb It looks like the rest of the code copes with ndata being larger than sizeof(sector), so limit the memcpy() range. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <20180104160523.22995-13-marcandre.lureau@redhat.com> Tested-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Direct leak of 160 byte(s) in 4 object(s) allocated from: #0 0x55ed7678cda8 in calloc (/home/elmarco/src/qq/build/x86_64-softmmu/qemu-system-x86_64+0x797da8) #1 0x7f3f5e725f75 in g_malloc0 /home/elmarco/src/gnome/glib/builddir/../glib/gmem.c:124 #2 0x55ed778aa3a7 in query_option_descs /home/elmarco/src/qq/util/qemu-config.c:60:16 #3 0x55ed778aa307 in get_drive_infolist /home/elmarco/src/qq/util/qemu-config.c:140:19 #4 0x55ed778a9f40 in qmp_query_command_line_options /home/elmarco/src/qq/util/qemu-config.c:254:36 #5 0x55ed76d4868c in qmp_marshal_query_command_line_options /home/elmarco/src/qq/build/qmp-marshal.c:3078:14 #6 0x55ed77855dd5 in do_qmp_dispatch /home/elmarco/src/qq/qapi/qmp-dispatch.c:104:5 #7 0x55ed778558cc in qmp_dispatch /home/elmarco/src/qq/qapi/qmp-dispatch.c:131:11 #8 0x55ed768b592f in handle_qmp_command /home/elmarco/src/qq/monitor.c:3840:11 #9 0x55ed7786ccfe in json_message_process_token /home/elmarco/src/qq/qobject/json-streamer.c:105:5 #10 0x55ed778fe37c in json_lexer_feed_char /home/elmarco/src/qq/qobject/json-lexer.c:323:13 #11 0x55ed778fdde6 in json_lexer_feed /home/elmarco/src/qq/qobject/json-lexer.c:373:15 #12 0x55ed7786cd83 in json_message_parser_feed /home/elmarco/src/qq/qobject/json-streamer.c:124:12 #13 0x55ed768b559e in monitor_qmp_read /home/elmarco/src/qq/monitor.c:3882:5 #14 0x55ed77714f29 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:167:9 #15 0x55ed77714fde in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:179:9 #16 0x55ed7772ffad in tcp_chr_read /home/elmarco/src/qq/chardev/char-socket.c:440:13 #17 0x55ed7777113b in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84:12 #18 0x7f3f5e71d90b in g_main_dispatch /home/elmarco/src/gnome/glib/builddir/../glib/gmain.c:3182 #19 0x7f3f5e71e7ac in g_main_context_dispatch /home/elmarco/src/gnome/glib/builddir/../glib/gmain.c:3847 #20 0x55ed77886ffc in glib_pollfds_poll /home/elmarco/src/qq/util/main-loop.c:214:9 #21 0x55ed778865fd in os_host_main_loop_wait /home/elmarco/src/qq/util/main-loop.c:261:5 #22 0x55ed77886222 in main_loop_wait /home/elmarco/src/qq/util/main-loop.c:515:11 #23 0x55ed76d2a4df in main_loop /home/elmarco/src/qq/vl.c:1995:9 #24 0x55ed76d1cb4a in main /home/elmarco/src/qq/vl.c:4914:5 #25 0x7f3f555f6039 in __libc_start_main (/lib64/libc.so.6+0x21039) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20180104160523.22995-14-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

The coroutine is not finished by the time the test ends, resulting in ASAN warning: ==7005==ERROR: LeakSanitizer: detected memory leaks Direct leak of 312 byte(s) in 1 object(s) allocated from: #0 0x7fd35290fa38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38) #1 0x7fd3506c5f75 in g_malloc0 ../glib/gmem.c:124 #2 0x55994af03e47 in qemu_coroutine_new /home/elmarco/src/qemu/util/coroutine-ucontext.c:144 #3 0x55994aefed99 in qemu_coroutine_create /home/elmarco/src/qemu/util/qemu-coroutine.c:76 #4 0x55994ac1eb50 in verify_entered_step_1 /home/elmarco/src/qemu/tests/test-coroutine.c:80 #5 0x55994af03c75 in coroutine_trampoline /home/elmarco/src/qemu/util/coroutine-ucontext.c:119 #6 0x7fd34ec02bef (/lib64/libc.so.6+0x50bef) Do not yield() to let the coroutine terminate. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20180104160523.22995-17-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Spotted thanks to ASAN: ==25226==ERROR: AddressSanitizer: global-buffer-overflow on address 0x556715a1f120 at pc 0x556714b6f6b1 bp 0x7ffcdfac1360 sp 0x7ffcdfac1350 READ of size 1 at 0x556715a1f120 thread T0 #0 0x556714b6f6b0 in init_disasm /home/elmarco/src/qemu/disas/s390.c:219 #1 0x556714b6fa6a in print_insn_s390 /home/elmarco/src/qemu/disas/s390.c:294 #2 0x55671484d031 in monitor_disas /home/elmarco/src/qemu/disas.c:635 #3 0x556714862ec0 in memory_dump /home/elmarco/src/qemu/monitor.c:1324 #4 0x55671486342a in hmp_memory_dump /home/elmarco/src/qemu/monitor.c:1418 #5 0x5567148670be in handle_hmp_command /home/elmarco/src/qemu/monitor.c:3109 #6 0x5567148674ed in qmp_human_monitor_command /home/elmarco/src/qemu/monitor.c:613 #7 0x556714b00918 in qmp_marshal_human_monitor_command /home/elmarco/src/qemu/build/qmp-marshal.c:1704 #8 0x556715138a3e in do_qmp_dispatch /home/elmarco/src/qemu/qapi/qmp-dispatch.c:104 #9 0x556715138f83 in qmp_dispatch /home/elmarco/src/qemu/qapi/qmp-dispatch.c:131 #10 0x55671485cf88 in handle_qmp_command /home/elmarco/src/qemu/monitor.c:3839 #11 0x55671514e80b in json_message_process_token /home/elmarco/src/qemu/qobject/json-streamer.c:105 #12 0x5567151bf2dc in json_lexer_feed_char /home/elmarco/src/qemu/qobject/json-lexer.c:323 #13 0x5567151bf827 in json_lexer_feed /home/elmarco/src/qemu/qobject/json-lexer.c:373 #14 0x55671514ee62 in json_message_parser_feed /home/elmarco/src/qemu/qobject/json-streamer.c:124 #15 0x556714854b1f in monitor_qmp_read /home/elmarco/src/qemu/monitor.c:3881 #16 0x556715045440 in qemu_chr_be_write_impl /home/elmarco/src/qemu/chardev/char.c:172 #17 0x556715047184 in qemu_chr_be_write /home/elmarco/src/qemu/chardev/char.c:184 #18 0x55671505a8e6 in tcp_chr_read /home/elmarco/src/qemu/chardev/char-socket.c:440 #19 0x5567150943c3 in qio_channel_fd_source_dispatch /home/elmarco/src/qemu/io/channel-watch.c:84 #20 0x7fb90292b90b in g_main_dispatch ../glib/gmain.c:3182 #21 0x7fb90292c7ac in g_main_context_dispatch ../glib/gmain.c:3847 #22 0x556715162eca in glib_pollfds_poll /home/elmarco/src/qemu/util/main-loop.c:214 #23 0x556715163001 in os_host_main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:261 #24 0x5567151631fa in main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:515 #25 0x556714ad6d3b in main_loop /home/elmarco/src/qemu/vl.c:1950 #26 0x556714ade329 in main /home/elmarco/src/qemu/vl.c:4865 #27 0x7fb8fe5c9009 in __libc_start_main (/lib64/libc.so.6+0x21009) #28 0x5567147af4d9 in _start (/home/elmarco/src/qemu/build/s390x-softmmu/qemu-system-s390x+0xf674d9) 0x556715a1f120 is located 32 bytes to the left of global variable 'char_hci_type_info' defined in '/home/elmarco/src/qemu/hw/bt/hci-csr.c:493:23' (0x556715a1f140) of size 104 0x556715a1f120 is located 8 bytes to the right of global variable 's390_opcodes' defined in '/home/elmarco/src/qemu/disas/s390.c:860:33' (0x556715a15280) of size 40600 This fix is based on Andreas Arnez <arnez@linux.vnet.ibm.com> upstream commit: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commitdiff;h=9ace48f3d7d80ce09c5df60cccb433470410b11b 2014-08-19 Andreas Arnez <arnez@linux.vnet.ibm.com> * s390-dis.c (init_disasm): Simplify initialization of opc_index[]. This also fixes an access after the last element of s390_opcodes[]. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180104160523.22995-19-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

…e object missed in 60765b6. Thread 1 "qemu-system-aarch64" received signal SIGSEGV, Segmentation fault. address_space_init (as=0x0, root=0x55555726e410, name=name@entry=0x555555e3f0a7 "sdhci-dma") at memory.c:3050 3050 as->root = root; (gdb) bt #0 address_space_init (as=0x0, root=0x55555726e410, name=name@entry=0x555555e3f0a7 "sdhci-dma") at memory.c:3050 #1 0x0000555555af62c3 in sdhci_sysbus_realize (dev=<optimized out>, errp=0x7fff7f931150) at hw/sd/sdhci.c:1564 #2 0x00005555558b25e5 in zynqmp_sdhci_realize (dev=0x555557051520, errp=0x7fff7f931150) at hw/sd/zynqmp-sdhci.c:151 #3 0x0000555555a2e7f3 in device_set_realized (obj=0x555557051520, value=<optimized out>, errp=0x7fff7f931270) at hw/core/qdev.c:966 #4 0x0000555555ba3f74 in property_set_bool (obj=0x555557051520, v=<optimized out>, name=<optimized out>, opaque=0x555556e04a20, errp=0x7fff7f931270) at qom/object.c:1906 #5 0x0000555555ba51f4 in object_property_set (obj=obj@entry=0x555557051520, v=v@entry=0x5555576dbd60, name=name@entry=0x555555dd6306 "realized", errp=errp@entry=0x7fff7f931270) at qom/object.c:1102 Suggested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20180123132051.24448-1-f4bug@amsat.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

With a Spice port chardev, it is possible to reenter monitor_qapi_event_queue() (when the client disconnects for example). This will dead-lock on monitor_lock. Instead, use some TLS variables to check for recursion and queue the events. Fixes: (gdb) bt #0 0x00007fa69e7217fd in __lll_lock_wait () at /lib64/libpthread.so.0 #1 0x00007fa69e71acf4 in pthread_mutex_lock () at /lib64/libpthread.so.0 #2 0x0000563303567619 in qemu_mutex_lock_impl (mutex=0x563303d3e220 <monitor_lock>, file=0x5633036589a8 "/home/elmarco/src/qq/monitor.c", line=645) at /home/elmarco/src/qq/util/qemu-thread-posix.c:66 #3 0x0000563302fa6c25 in monitor_qapi_event_queue (event=QAPI_EVENT_SPICE_DISCONNECTED, qdict=0x56330602bde0, errp=0x7ffc6ab5e728) at /home/elmarco/src/qq/monitor.c:645 #4 0x0000563303549aca in qapi_event_send_spice_disconnected (server=0x563305afd630, client=0x563305745360, errp=0x563303d8d0f0 <error_abort>) at qapi/qapi-events-ui.c:149 #5 0x00005633033e600f in channel_event (event=3, info=0x5633061b0050) at /home/elmarco/src/qq/ui/spice-core.c:235 #6 0x00007fa69f6c86bb in reds_handle_channel_event (reds=<optimized out>, event=3, info=0x5633061b0050) at reds.c:316 #7 0x00007fa69f6b193b in main_dispatcher_self_handle_channel_event (info=0x5633061b0050, event=3, self=0x563304e088c0) at main-dispatcher.c:197 #8 0x00007fa69f6b193b in main_dispatcher_channel_event (self=0x563304e088c0, event=event@entry=3, info=0x5633061b0050) at main-dispatcher.c:197 #9 0x00007fa69f6d0833 in red_stream_push_channel_event (s=s@entry=0x563305ad8f50, event=event@entry=3) at red-stream.c:414 #10 0x00007fa69f6d086b in red_stream_free (s=0x563305ad8f50) at red-stream.c:388 #11 0x00007fa69f6b7ddc in red_channel_client_finalize (object=0x563304df2360) at red-channel-client.c:347 #12 0x00007fa6a56b7fb9 in g_object_unref () at /lib64/libgobject-2.0.so.0 #13 0x00007fa69f6ba212 in red_channel_client_push (rcc=0x563304df2360) at red-channel-client.c:1341 #14 0x00007fa69f68b259 in red_char_device_send_msg_to_client (client=<optimized out>, msg=0x5633059b6310, dev=0x563304e08bc0) at char-device.c:305 #15 0x00007fa69f68b259 in red_char_device_send_msg_to_clients (msg=0x5633059b6310, dev=0x563304e08bc0) at char-device.c:305 #16 0x00007fa69f68b259 in red_char_device_read_from_device (dev=0x563304e08bc0) at char-device.c:353 #17 0x000056330317d01d in spice_chr_write (chr=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111) at /home/elmarco/src/qq/chardev/spice.c:199 #18 0x00005633034deee7 in qemu_chr_write_buffer (s=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111, offset=0x7ffc6ab5ea70, write_all=false) at /home/elmarco/src/qq/chardev/char.c:112 #19 0x00005633034df054 in qemu_chr_write (s=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111, write_all=false) at /home/elmarco/src/qq/chardev/char.c:147 #20 0x00005633034e1e13 in qemu_chr_fe_write (be=0x563304dbb800, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111) at /home/elmarco/src/qq/chardev/char-fe.c:42 #21 0x0000563302fa6334 in monitor_flush_locked (mon=0x563304dbb800) at /home/elmarco/src/qq/monitor.c:425 #22 0x0000563302fa6520 in monitor_puts (mon=0x563304dbb800, str=0x563305de7e9e "") at /home/elmarco/src/qq/monitor.c:468 #23 0x0000563302fa680c in qmp_send_response (mon=0x563304dbb800, rsp=0x563304df5730) at /home/elmarco/src/qq/monitor.c:517 #24 0x0000563302fa6905 in qmp_queue_response (mon=0x563304dbb800, rsp=0x563304df5730) at /home/elmarco/src/qq/monitor.c:538 #25 0x0000563302fa6b5b in monitor_qapi_event_emit (event=QAPI_EVENT_SHUTDOWN, qdict=0x563304df5730) at /home/elmarco/src/qq/monitor.c:624 #26 0x0000563302fa6c4b in monitor_qapi_event_queue (event=QAPI_EVENT_SHUTDOWN, qdict=0x563304df5730, errp=0x7ffc6ab5ed00) at /home/elmarco/src/qq/monitor.c:649 #27 0x0000563303548cce in qapi_event_send_shutdown (guest=false, errp=0x563303d8d0f0 <error_abort>) at qapi/qapi-events-run-state.c:58 #28 0x000056330313bcd7 in main_loop_should_exit () at /home/elmarco/src/qq/vl.c:1822 #29 0x000056330313bde3 in main_loop () at /home/elmarco/src/qq/vl.c:1862 #30 0x0000563303143781 in main (argc=3, argv=0x7ffc6ab5f068, envp=0x7ffc6ab5f088) at /home/elmarco/src/qq/vl.c:4644 Note that error report is now moved to the first caller, which may receive an error for a recursed event. This is probably fine (95% of callers use &error_abort, the rest have NULL error and ignore it) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180731150144.14022-1-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [*_no_recurse renamed to *_no_reenter, local variables reordered] Signed-off-by: Markus Armbruster <armbru@redhat.com>

Spotted by ASAN, during make check... Direct leak of 40 byte(s) in 1 object(s) allocated from: #0 0x7f8e27262c48 in malloc (/lib64/libasan.so.5+0xeec48) #1 0x7f8e26a5f3c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5) #2 0x555ab67078a8 in qstring_from_str /home/elmarco/src/qq/qobject/qstring.c:67 #3 0x555ab67071e4 in qstring_new /home/elmarco/src/qq/qobject/qstring.c:24 #4 0x555ab6713fbf in qstring_from_escaped_str /home/elmarco/src/qq/qobject/json-parser.c:144 #5 0x555ab671738c in parse_literal /home/elmarco/src/qq/qobject/json-parser.c:506 #6 0x555ab67179c3 in parse_value /home/elmarco/src/qq/qobject/json-parser.c:569 #7 0x555ab6715123 in parse_pair /home/elmarco/src/qq/qobject/json-parser.c:306 #8 0x555ab6715483 in parse_object /home/elmarco/src/qq/qobject/json-parser.c:357 #9 0x555ab671798b in parse_value /home/elmarco/src/qq/qobject/json-parser.c:561 #10 0x555ab6717a6b in json_parser_parse_err /home/elmarco/src/qq/qobject/json-parser.c:592 #11 0x555ab4fd4dcf in handle_qmp_command /home/elmarco/src/qq/monitor.c:4257 #12 0x555ab6712c4d in json_message_process_token /home/elmarco/src/qq/qobject/json-streamer.c:105 #13 0x555ab67e01e2 in json_lexer_feed_char /home/elmarco/src/qq/qobject/json-lexer.c:323 #14 0x555ab67e0af6 in json_lexer_feed /home/elmarco/src/qq/qobject/json-lexer.c:373 #15 0x555ab6713010 in json_message_parser_feed /home/elmarco/src/qq/qobject/json-streamer.c:124 #16 0x555ab4fd58ec in monitor_qmp_read /home/elmarco/src/qq/monitor.c:4337 #17 0x555ab6559df2 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:175 #18 0x555ab6559e95 in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:187 #19 0x555ab6560127 in fd_chr_read /home/elmarco/src/qq/chardev/char-fd.c:66 #20 0x555ab65d9c73 in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84 #21 0x7f8e26a598ac in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x4c8ac) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180809114417.28718-4-marcandre.lureau@redhat.com> [Screwed up in commit b273145] Cc: qemu-stable@nongnu.org Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>

if qio_channel_rdma_readv return QIO_CHANNEL_ERR_BLOCK, the destination qemu crash. The backtrace is: (gdb) bt #0 0x0000000000000000 in ?? () #1 0x00000000008db50e in qio_channel_set_aio_fd_handler (ioc=0x38111e0, ctx=0x3726080, io_read=0x8db841 <qio_channel_restart_read>, io_write=0x0, opaque=0x38111e0) at io/channel.c: #2 0x00000000008db952 in qio_channel_set_aio_fd_handlers (ioc=0x38111e0) at io/channel.c:438 #3 0x00000000008dbab4 in qio_channel_yield (ioc=0x38111e0, condition=G_IO_IN) at io/channel.c:47 #4 0x00000000007a870b in channel_get_buffer (opaque=0x38111e0, buf=0x440c038 "", pos=0, size=327 at migration/qemu-file-channel.c:83 #5 0x00000000007a70f6 in qemu_fill_buffer (f=0x440c000) at migration/qemu-file.c:299 #6 0x00000000007a79d0 in qemu_peek_byte (f=0x440c000, offset=0) at migration/qemu-file.c:562 #7 0x00000000007a7a22 in qemu_get_byte (f=0x440c000) at migration/qemu-file.c:575 #8 0x00000000007a7c78 in qemu_get_be32 (f=0x440c000) at migration/qemu-file.c:655 #9 0x00000000007a0508 in qemu_loadvm_state (f=0x440c000) at migration/savevm.c:2126 #10 0x0000000000794141 in process_incoming_migration_co (opaque=0x0) at migration/migration.c:366 #11 0x000000000095c598 in coroutine_trampoline (i0=84033984, i1=0) at util/coroutine-ucontext.c:1 #12 0x00007f9c0db56d40 in ?? () from /lib64/libc.so.6 #13 0x00007f96fe858760 in ?? () #14 0x0000000000000000 in ?? () RDMA QIOChannel not implement io_set_aio_fd_handler. so qio_channel_set_aio_fd_handler will access NULL pointer. Signed-off-by: Lidong Chen <lidongchen@tencent.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>

when qio_channel_read return QIO_CHANNEL_ERR_BLOCK, the source qemu crash. The backtrace is: (gdb) bt #0 0x00007fb20aba91d7 in raise () from /lib64/libc.so.6 #1 0x00007fb20abaa8c8 in abort () from /lib64/libc.so.6 #2 0x00007fb20aba2146 in __assert_fail_base () from /lib64/libc.so.6 #3 0x00007fb20aba21f2 in __assert_fail () from /lib64/libc.so.6 #4 0x00000000008dba2d in qio_channel_yield (ioc=0x22f9e20, condition=G_IO_IN) at io/channel.c:460 #5 0x00000000007a870b in channel_get_buffer (opaque=0x22f9e20, buf=0x3d54038 "", pos=0, size=32768) at migration/qemu-file-channel.c:83 #6 0x00000000007a70f6 in qemu_fill_buffer (f=0x3d54000) at migration/qemu-file.c:299 #7 0x00000000007a79d0 in qemu_peek_byte (f=0x3d54000, offset=0) at migration/qemu-file.c:562 #8 0x00000000007a7a22 in qemu_get_byte (f=0x3d54000) at migration/qemu-file.c:575 #9 0x00000000007a7c46 in qemu_get_be16 (f=0x3d54000) at migration/qemu-file.c:647 #10 0x0000000000796db7 in source_return_path_thread (opaque=0x2242280) at migration/migration.c:1794 #11 0x00000000009428fa in qemu_thread_start (args=0x3e58420) at util/qemu-thread-posix.c:504 #12 0x00007fb20af3ddc5 in start_thread () from /lib64/libpthread.so.0 #13 0x00007fb20ac6b74d in clone () from /lib64/libc.so.6 This patch fixed by invoke qio_channel_yield only when qemu_in_coroutine(). Signed-off-by: Lidong Chen <lidongchen@tencent.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>

Because RDMA QIOChannel not implement shutdown function, If the to_dst_file was set error, the return path thread will wait forever. and the migration thread will wait return path thread exit. the backtrace of return path thread is: (gdb) bt #0 0x00007f372a76bb0f in ppoll () from /lib64/libc.so.6 #1 0x000000000071dc24 in qemu_poll_ns (fds=0x7ef7091d0580, nfds=2, timeout=100000000) at qemu-timer.c:325 #2 0x00000000006b2fba in qemu_rdma_wait_comp_channel (rdma=0xd424000) at migration/rdma.c:1501 #3 0x00000000006b3191 in qemu_rdma_block_for_wrid (rdma=0xd424000, wrid_requested=4000, byte_len=0x7ef7091d0640) at migration/rdma.c:1580 #4 0x00000000006b3638 in qemu_rdma_exchange_get_response (rdma=0xd424000, head=0x7ef7091d0720, expecting=3, idx=0) at migration/rdma.c:1726 #5 0x00000000006b3ad6 in qemu_rdma_exchange_recv (rdma=0xd424000, head=0x7ef7091d0720, expecting=3) at migration/rdma.c:1903 #6 0x00000000006b5d03 in qemu_rdma_get_buffer (opaque=0x6a57dc0, buf=0x5c80030 "", pos=8, size=32768) at migration/rdma.c:2714 #7 0x00000000006a9635 in qemu_fill_buffer (f=0x5c80000) at migration/qemu-file.c:232 #8 0x00000000006a9ecd in qemu_peek_byte (f=0x5c80000, offset=0) at migration/qemu-file.c:502 #9 0x00000000006a9f1f in qemu_get_byte (f=0x5c80000) at migration/qemu-file.c:515 #10 0x00000000006aa162 in qemu_get_be16 (f=0x5c80000) at migration/qemu-file.c:591 #11 0x00000000006a46d3 in source_return_path_thread ( opaque=0xd826a0 <current_migration.37100>) at migration/migration.c:1331 #12 0x00007f372aa49e25 in start_thread () from /lib64/libpthread.so.0 #13 0x00007f372a77635d in clone () from /lib64/libc.so.6 the backtrace of migration thread is: (gdb) bt #0 0x00007f372aa4af57 in pthread_join () from /lib64/libpthread.so.0 #1 0x00000000007d5711 in qemu_thread_join (thread=0xd826f8 <current_migration.37100+88>) at util/qemu-thread-posix.c:504 #2 0x00000000006a4bc5 in await_return_path_close_on_source ( ms=0xd826a0 <current_migration.37100>) at migration/migration.c:1460 #3 0x00000000006a53e4 in migration_completion (s=0xd826a0 <current_migration.37100>, current_active_state=4, old_vm_running=0x7ef7089cf976, start_time=0x7ef7089cf980) at migration/migration.c:1695 #4 0x00000000006a5c54 in migration_thread (opaque=0xd826a0 <current_migration.37100>) at migration/migration.c:1837 #5 0x00007f372aa49e25 in start_thread () from /lib64/libpthread.so.0 #6 0x00007f372a77635d in clone () from /lib64/libc.so.6 Signed-off-by: Lidong Chen <lidongchen@tencent.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>

tests/cdrom-test -p /x86_64/cdrom/boot/megasas Produces the following ASAN leak. ==25700==ERROR: LeakSanitizer: detected memory leaks Direct leak of 16 byte(s) in 1 object(s) allocated from: #0 0x7f06f8faac48 in malloc (/lib64/libasan.so.5+0xeec48) #1 0x7f06f87a73c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5) #2 0x55a729f17738 in pci_dma_sglist_init /home/elmarco/src/qq/include/hw/pci/pci.h:818 #3 0x55a729f2a706 in megasas_map_dcmd /home/elmarco/src/qq/hw/scsi/megasas.c:698 #4 0x55a729f39421 in megasas_handle_dcmd /home/elmarco/src/qq/hw/scsi/megasas.c:1574 #5 0x55a729f3f70d in megasas_handle_frame /home/elmarco/src/qq/hw/scsi/megasas.c:1955 #6 0x55a729f40939 in megasas_mmio_write /home/elmarco/src/qq/hw/scsi/megasas.c:2119 #7 0x55a729f41102 in megasas_port_write /home/elmarco/src/qq/hw/scsi/megasas.c:2170 #8 0x55a729220e60 in memory_region_write_accessor /home/elmarco/src/qq/memory.c:527 #9 0x55a7292212b3 in access_with_adjusted_size /home/elmarco/src/qq/memory.c:594 #10 0x55a72922cf70 in memory_region_dispatch_write /home/elmarco/src/qq/memory.c:1473 #11 0x55a7290f5907 in flatview_write_continue /home/elmarco/src/qq/exec.c:3255 #12 0x55a7290f5ceb in flatview_write /home/elmarco/src/qq/exec.c:3294 #13 0x55a7290f6457 in address_space_write /home/elmarco/src/qq/exec.c:3384 #14 0x55a7290f64a8 in address_space_rw /home/elmarco/src/qq/exec.c:3395 #15 0x55a72929ecb0 in kvm_handle_io /home/elmarco/src/qq/accel/kvm/kvm-all.c:1729 #16 0x55a7292a0db5 in kvm_cpu_exec /home/elmarco/src/qq/accel/kvm/kvm-all.c:1969 #17 0x55a7291c4212 in qemu_kvm_cpu_thread_fn /home/elmarco/src/qq/cpus.c:1215 #18 0x55a72a966a6c in qemu_thread_start /home/elmarco/src/qq/util/qemu-thread-posix.c:504 #19 0x7f06ed486593 in start_thread (/lib64/libpthread.so.0+0x7593) Move the qemu_sglist_destroy() from megasas_complete_command() to megasas_unmap_frame(), so map/unmap are balanced. Signed-off-by: Marc-AndrÃ© Lureau <marcandre.lureau@redhat.com> Message-Id: <20180814141247.32336-1-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>

Fixes repeated memory leaks of 18 bytes when using VNC: Direct leak of 831024 byte(s) in 46168 object(s) allocated from: ... #4 0x7f6d2f919bdd in g_strdup_vprintf glib/gstrfuncs.c:514 #5 0x56085cdcf660 in buffer_init util/buffer.c:59 #6 0x56085ca6a7ec in vnc_async_encoding_start ui/vnc-jobs.c:177 #7 0x56085ca6b815 in vnc_worker_thread_loop ui/vnc-jobs.c:240 Fixes: 543b958 ("vnc: attach names to buffers") Cc: Gerd Hoffmann <kraxel@redhat.com> CC: qemu-stable@nongnu.org Signed-off-by: Peter Wu <peter@lekensteyn.nl> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-id: 20180807221830.3844-1-peter@lekensteyn.nl Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

We need to set cs->halted to 1 before calling ppc_set_compat. The reason is that ppc_set_compat kicks up the new thread created to manage the hotplugged KVM virtual CPU and the code drives directly to KVM_RUN ioctl. When cs->halted is 1, the code: int kvm_cpu_exec(CPUState *cpu) ... if (kvm_arch_process_async_events(cpu)) { atomic_set(&cpu->exit_request, 0); return EXCP_HLT; } ... returns before it reaches KVM_RUN, giving time to the main thread to finish its job. Otherwise we can fall in a deadlock because the KVM thread will issue the KVM_RUN ioctl while the main thread is setting up KVM registers. Depending on how these jobs are scheduled we'll end up freezing QEMU. The following output shows kvm_vcpu_ioctl sleeping because it cannot get the mutex and never will. PS: kvm_vcpu_ioctl was triggered kvm_set_one_reg - compat_pvr. STATE: TASK_UNINTERRUPTIBLE|TASK_WAKEKILL PID: 61564 TASK: c000003e981e0780 CPU: 48 COMMAND: "qemu-system-ppc" #0 [c000003e982679a0] __schedule at c000000000b10a44 #1 [c000003e98267a60] schedule at c000000000b113a8 #2 [c000003e98267a90] schedule_preempt_disabled at c000000000b11910 #3 [c000003e98267ab0] __mutex_lock at c000000000b132ec #4 [c000003e98267bc0] kvm_vcpu_ioctl at c00800000ea03140 [kvm] #5 [c000003e98267d20] do_vfs_ioctl at c000000000407d30 #6 [c000003e98267dc0] ksys_ioctl at c000000000408674 #7 [c000003e98267e10] sys_ioctl at c0000000004086f8 #8 [c000003e98267e30] system_call at c00000000000b488 crash> struct -x kvm.vcpus 0xc000003da0000000 vcpus = {0xc000003db4880000, 0xc000003d52b80000, 0xc0000039e9c80000, 0xc000003d0e200000, 0xc000003d58280000, 0x0, 0x0, ...} crash> struct -x kvm_vcpu.mutex.owner 0xc000003d58280000 mutex.owner = { counter = 0xc000003a23a5c881 <- flag 1: waiters }, crash> bt 0xc000003a23a5c880 PID: 61579 TASK: c000003a23a5c880 CPU: 9 COMMAND: "CPU 4/KVM" (active) crash> struct -x kvm_vcpu.mutex.wait_list 0xc000003d58280000 mutex.wait_list = { next = 0xc000003e98267b10, prev = 0xc000003e98267b10 }, crash> struct -x mutex_waiter.task 0xc000003e98267b10 task = 0xc000003e981e0780 The following command-line was used to reproduce the problem (note: gdb and trace can change the results). $ qemu-ppc/build/ppc64-softmmu/qemu-system-ppc64 -cpu host \ -enable-kvm -m 4096 \ -smp 4,maxcpus=8,sockets=1,cores=2,threads=4 \ -display none -nographic \ -drive file=disk1.qcow2,format=qcow2 ... (qemu) device_add host-spapr-cpu-core,core-id=4 [no interaction is possible after it, only SIGKILL to take the terminal back] Signed-off-by: Jose Ricardo Ziviani <joserz@linux.ibm.com> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Spotted by ASAN doing some manual testing: Direct leak of 48 byte(s) in 1 object(s) allocated from: #0 0x7f5fcdc75e50 in calloc (/lib64/libasan.so.5+0xeee50) #1 0x7f5fcd47241d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5241d) #2 0x55f989be92ce in timer_new /home/elmarco/src/qq/include/qemu/timer.h:561 #3 0x55f989be92ff in timer_new_ms /home/elmarco/src/qq/include/qemu/timer.h:630 #4 0x55f989c0219d in hmp_migrate /home/elmarco/src/qq/hmp.c:2038 #5 0x55f98955927b in handle_hmp_command /home/elmarco/src/qq/monitor.c:3498 #6 0x55f98955fb8c in monitor_command_cb /home/elmarco/src/qq/monitor.c:4371 #7 0x55f98ad40f11 in readline_handle_byte /home/elmarco/src/qq/util/readline.c:393 #8 0x55f98955fa4f in monitor_read /home/elmarco/src/qq/monitor.c:4354 #9 0x55f98aae30d7 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:175 #10 0x55f98aae317a in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:187 #11 0x55f98aae940c in fd_chr_read /home/elmarco/src/qq/chardev/char-fd.c:66 #12 0x55f98ab63018 in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84 #13 0x7f5fcd46c8ac in g_main_dispatch gmain.c:3177 Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180901134652.25884-1-marcandre.lureau@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

In qemu_laio_process_completions_and_submit, the AioContext is acquired before the ioq_submit iteration and after qemu_laio_process_completions, but the latter is not thread safe either. This change avoids a number of random crashes when the Main Thread and an IO Thread collide processing completions for the same AioContext. This is an example of such crash: - The IO Thread is trying to acquire the AioContext at aio_co_enter, which evidences that it didn't lock it before: Thread 3 (Thread 0x7fdfd8bd8700 (LWP 36743)): #0 0x00007fdfe0dd542d in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135 #1 0x00007fdfe0dd0de6 in _L_lock_870 () at /lib64/libpthread.so.0 #2 0x00007fdfe0dd0cdf in __GI___pthread_mutex_lock (mutex=mutex@entry=0x5631fde0e6c0) at ../nptl/pthread_mutex_lock.c:114 #3 0x00005631fc0603a7 in qemu_mutex_lock_impl (mutex=0x5631fde0e6c0, file=0x5631fc23520f "util/async.c", line=511) at util/qemu-thread-posix.c:66 #4 0x00005631fc05b558 in aio_co_enter (ctx=0x5631fde0e660, co=0x7fdfcc0c2b40) at util/async.c:493 #5 0x00005631fc05b5ac in aio_co_wake (co=<optimized out>) at util/async.c:478 #6 0x00005631fbfc51ad in qemu_laio_process_completion (laiocb=<optimized out>) at block/linux-aio.c:104 #7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670) at block/linux-aio.c:222 #8 0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670) at block/linux-aio.c:237 #9 0x00005631fc05d978 in aio_dispatch_handlers (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:406 #10 0x00005631fc05e3ea in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=true) at util/aio-posix.c:693 #11 0x00005631fbd7ad96 in iothread_run (opaque=0x5631fde0e1c0) at iothread.c:64 #12 0x00007fdfe0dcee25 in start_thread (arg=0x7fdfd8bd8700) at pthread_create.c:308 #13 0x00007fdfe0afc34d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113 - The Main Thread is also processing completions from the same AioContext, and crashes due to failed assertion at util/iov.c:78: Thread 1 (Thread 0x7fdfeb5eac80 (LWP 36740)): #0 0x00007fdfe0a391f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x00007fdfe0a3a8e8 in __GI_abort () at abort.c:90 #2 0x00007fdfe0a32266 in __assert_fail_base (fmt=0x7fdfe0b84e68 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset") at assert.c:92 #3 0x00007fdfe0a32312 in __GI___assert_fail (assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset") at assert.c:101 #4 0x00005631fc065287 in iov_memset (iov=<optimized out>, iov_cnt=<optimized out>, offset=<optimized out>, offset@entry=65536, fillc=fillc@entry=0, bytes=15515191315812405248) at util/iov.c:78 #5 0x00005631fc065a63 in qemu_iovec_memset (qiov=<optimized out>, offset=offset@entry=65536, fillc=fillc@entry=0, bytes=<optimized out>) at util/iov.c:410 #6 0x00005631fbfc5178 in qemu_laio_process_completion (laiocb=0x7fdd920df630) at block/linux-aio.c:88 #7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670) at block/linux-aio.c:222 #8 0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670) at block/linux-aio.c:237 #9 0x00005631fbfc54ed in qemu_laio_poll_cb (opaque=<optimized out>) at block/linux-aio.c:272 #10 0x00005631fc05d85e in run_poll_handlers_once (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:497 #11 0x00005631fc05e2ca in aio_poll (blocking=false, ctx=0x5631fde0e660) at util/aio-posix.c:574 #12 0x00005631fc05e2ca in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=false) at util/aio-posix.c:604 #13 0x00005631fbfcb8a3 in bdrv_do_drained_begin (ignore_parent=<optimized out>, recursive=<optimized out>, bs=<optimized out>) at block/io.c:273 #14 0x00005631fbfcb8a3 in bdrv_do_drained_begin (bs=0x5631fe8b6200, recursive=<optimized out>, parent=0x0, ignore_bds_parents=<optimized out>, poll=<optimized out>) at block/io.c:390 #15 0x00005631fbfbcd2e in blk_drain (blk=0x5631fe83ac80) at block/block-backend.c:1590 #16 0x00005631fbfbe138 in blk_remove_bs (blk=blk@entry=0x5631fe83ac80) at block/block-backend.c:774 #17 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:401 #18 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:449 #19 0x00005631fbfc9a69 in commit_complete (job=0x5631fe8b94b0, opaque=0x7fdfcc1bb080) at block/commit.c:92 #20 0x00005631fbf7d662 in job_defer_to_main_loop_bh (opaque=0x7fdfcc1b4560) at job.c:973 #21 0x00005631fc05ad41 in aio_bh_poll (bh=0x7fdfcc01ad90) at util/async.c:90 #22 0x00005631fc05ad41 in aio_bh_poll (ctx=ctx@entry=0x5631fddffdb0) at util/async.c:118 #23 0x00005631fc05e210 in aio_dispatch (ctx=0x5631fddffdb0) at util/aio-posix.c:436 #24 0x00005631fc05ac1e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261 #25 0x00007fdfeaae44c9 in g_main_context_dispatch (context=0x5631fde00140) at gmain.c:3201 #26 0x00007fdfeaae44c9 in g_main_context_dispatch (context=context@entry=0x5631fde00140) at gmain.c:3854 #27 0x00005631fc05d503 in main_loop_wait () at util/main-loop.c:215 #28 0x00005631fc05d503 in main_loop_wait (timeout=<optimized out>) at util/main-loop.c:238 #29 0x00005631fc05d503 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:497 #30 0x00005631fbd81412 in main_loop () at vl.c:1866 #31 0x00005631fbc18ff3 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4647 - A closer examination shows that s->io_q.in_flight appears to have gone backwards: (gdb) frame 7 #7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670) at block/linux-aio.c:222 222 qemu_laio_process_completion(laiocb); (gdb) p s $2 = (LinuxAioState *) 0x7fdfc0297670 (gdb) p *s $3 = {aio_context = 0x5631fde0e660, ctx = 0x7fdfeb43b000, e = {rfd = 33, wfd = 33}, io_q = {plugged = 0, in_queue = 0, in_flight = 4294967280, blocked = false, pending = {sqh_first = 0x0, sqh_last = 0x7fdfc0297698}}, completion_bh = 0x7fdfc0280ef0, event_idx = 21, event_max = 241} (gdb) p/x s->io_q.in_flight $4 = 0xfffffff0 Signed-off-by: Sergio Lopez <slp@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Spotted by ASAN while running: $ tests/migration-test -p /x86_64/migration/postcopy/recovery ================================================================= ==18034==ERROR: LeakSanitizer: detected memory leaks Direct leak of 33864 byte(s) in 1 object(s) allocated from: #0 0x7f3da7f31e50 in calloc (/lib64/libasan.so.5+0xeee50) #1 0x7f3da644441d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5241d) #2 0x55af9db15440 in qemu_fopen_channel_input /home/elmarco/src/qemu/migration/qemu-file-channel.c:183 #3 0x55af9db15413 in channel_get_output_return_path /home/elmarco/src/qemu/migration/qemu-file-channel.c:159 #4 0x55af9db0d4ac in qemu_file_get_return_path /home/elmarco/src/qemu/migration/qemu-file.c:78 #5 0x55af9dad5e4f in open_return_path_on_source /home/elmarco/src/qemu/migration/migration.c:2295 #6 0x55af9dadb3bf in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:3111 #7 0x55af9dae1bf3 in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:91 #8 0x55af9daddeca in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:108 #9 0x55af9e13d3db in qio_task_complete /home/elmarco/src/qemu/io/task.c:158 #10 0x55af9e13ca03 in qio_task_thread_result /home/elmarco/src/qemu/io/task.c:89 #11 0x7f3da643b1ca in g_idle_dispatch gmain.c:5535 Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180925092245.29565-1-marcandre.lureau@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Recently, the test case has started failing because some job related functions want to drop the AioContext lock even though it hasn't been taken: (gdb) bt #0 0x00007f51c067c9fb in raise () from /lib64/libc.so.6 #1 0x00007f51c067e77d in abort () from /lib64/libc.so.6 #2 0x0000558c9d5dde7b in error_exit (err=<optimized out>, msg=msg@entry=0x558c9d6fe120 <__func__.18373> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36 #3 0x0000558c9d6b5263 in qemu_mutex_unlock_impl (mutex=mutex@entry=0x558c9f3999a0, file=file@entry=0x558c9d6fd36f "util/async.c", line=line@entry=516) at util/qemu-thread-posix.c:96 #4 0x0000558c9d6b0565 in aio_context_release (ctx=ctx@entry=0x558c9f399940) at util/async.c:516 #5 0x0000558c9d5eb3da in job_completed_txn_abort (job=0x558c9f68e640) at job.c:738 #6 0x0000558c9d5eb227 in job_finish_sync (job=0x558c9f68e640, finish=finish@entry=0x558c9d5eb8d0 <job_cancel_err>, errp=errp@entry=0x0) at job.c:986 #7 0x0000558c9d5eb8ee in job_cancel_sync (job=<optimized out>) at job.c:941 #8 0x0000558c9d64d853 in replication_close (bs=<optimized out>) at block/replication.c:148 #9 0x0000558c9d5e5c9f in bdrv_close (bs=0x558c9f41b020) at block.c:3420 #10 bdrv_delete (bs=0x558c9f41b020) at block.c:3629 #11 bdrv_unref (bs=0x558c9f41b020) at block.c:4685 #12 0x0000558c9d62a3f3 in blk_remove_bs (blk=blk@entry=0x558c9f42a7c0) at block/block-backend.c:783 #13 0x0000558c9d62a667 in blk_delete (blk=0x558c9f42a7c0) at block/block-backend.c:402 #14 blk_unref (blk=0x558c9f42a7c0) at block/block-backend.c:457 #15 0x0000558c9d5dfcea in test_secondary_stop () at tests/test-replication.c:478 #16 0x00007f51c1f13178 in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0 #17 0x00007f51c1f1337b in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0 #18 0x00007f51c1f1337b in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0 #19 0x00007f51c1f13552 in g_test_run_suite () from /lib64/libglib-2.0.so.0 #20 0x00007f51c1f13571 in g_test_run () from /lib64/libglib-2.0.so.0 #21 0x0000558c9d5de31f in main (argc=<optimized out>, argv=<optimized out>) at tests/test-replication.c:581 It is yet unclear whether this should really be considered a bug in the test case or whether blk_unref() should work for callers that haven't taken the AioContext lock, but in order to fix the build tests quickly, just take the AioContext lock around blk_unref(). Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Spotted by ASAN: ================================================================= ==11893==ERROR: LeakSanitizer: detected memory leaks Direct leak of 1120 byte(s) in 28 object(s) allocated from: #0 0x7fd0515b0c48 in malloc (/lib64/libasan.so.5+0xeec48) #1 0x7fd050ffa3c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5) #2 0x559e708b56a4 in qstring_from_str /home/elmarco/src/qq/qobject/qstring.c:66 #3 0x559e708b4fe0 in qstring_new /home/elmarco/src/qq/qobject/qstring.c:23 #4 0x559e708bda7d in parse_string /home/elmarco/src/qq/qobject/json-parser.c:143 #5 0x559e708c1009 in parse_literal /home/elmarco/src/qq/qobject/json-parser.c:484 #6 0x559e708c1627 in parse_value /home/elmarco/src/qq/qobject/json-parser.c:547 #7 0x559e708c1c67 in json_parser_parse /home/elmarco/src/qq/qobject/json-parser.c:573 #8 0x559e708bc0ff in json_message_process_token /home/elmarco/src/qq/qobject/json-streamer.c:92 #9 0x559e708d1655 in json_lexer_feed_char /home/elmarco/src/qq/qobject/json-lexer.c:292 #10 0x559e708d1fe1 in json_lexer_feed /home/elmarco/src/qq/qobject/json-lexer.c:339 #11 0x559e708bc856 in json_message_parser_feed /home/elmarco/src/qq/qobject/json-streamer.c:121 #12 0x559e708b8b4b in qobject_from_jsonv /home/elmarco/src/qq/qobject/qjson.c:69 #13 0x559e708b8d02 in qobject_from_json /home/elmarco/src/qq/qobject/qjson.c:83 #14 0x559e708a74ae in from_json_str /home/elmarco/src/qq/tests/check-qjson.c:30 #15 0x559e708a9f83 in utf8_string /home/elmarco/src/qq/tests/check-qjson.c:781 #16 0x7fd05101bc49 in test_case_run gtestutils.c:2255 #17 0x7fd05101bc49 in g_test_run_suite_internal gtestutils.c:2339 Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180901211917.10372-1-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>

A quick coredump on an incomplete command line: ./x86_64-softmmu/qemu-system-x86_64 -mon mode=control,pretty=on #0 0x00007ffff723d9e4 in g_str_hash () at /lib64/libglib-2.0.so.0 #1 0x00007ffff723ce38 in g_hash_table_lookup () at /lib64/libglib-2.0.so.0 #2 0x0000555555cc0073 in object_class_property_find (klass=0x5555566a94b0, name=0x0, errp=0x0) at qom/object.c:1135 #3 0x0000555555cc004b in object_class_property_find (klass=0x5555566a9440, name=0x0, errp=0x0) at qom/object.c:1129 #4 0x0000555555cbfe6e in object_property_find (obj=0x5555568348c0, name=0x0, errp=0x0) at qom/object.c:1080 #5 0x0000555555cc183d in object_resolve_path_component (parent=0x5555568348c0, part=0x0) at qom/object.c:1762 #6 0x0000555555d82071 in qemu_chr_find (name=0x0) at chardev/char.c:802 #7 0x00005555559d77cb in mon_init_func (opaque=0x0, opts=0x5555566b65a0, errp=0x0) at vl.c:2291 Fix it to instead fail gracefully. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20181023213600.364086-1-eblake@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>

When using the 9P2000.u version of the protocol, the following shell command line in the guest can cause QEMU to crash: while true; do rm -rf aa; mkdir -p a/b & touch a/b/c & mv a aa; done With 9P2000.u, file renaming is handled by the WSTAT command. The v9fs_wstat() function calls v9fs_complete_rename(), which calls v9fs_fix_path() for every fid whose path is affected by the change. The involved calls to v9fs_path_copy() may race with any other access to the fid path performed by some worker thread, causing a crash like shown below: Thread 12 "qemu-system-x86" received signal SIGSEGV, Segmentation fault. 0x0000555555a25da2 in local_open_nofollow (fs_ctx=0x555557d958b8, path=0x0, flags=65536, mode=0) at hw/9pfs/9p-local.c:59 59 while (*path && fd != -1) { (gdb) bt #0 0x0000555555a25da2 in local_open_nofollow (fs_ctx=0x555557d958b8, path=0x0, flags=65536, mode=0) at hw/9pfs/9p-local.c:59 #1 0x0000555555a25e0c in local_opendir_nofollow (fs_ctx=0x555557d958b8, path=0x0) at hw/9pfs/9p-local.c:92 #2 0x0000555555a261b8 in local_lstat (fs_ctx=0x555557d958b8, fs_path=0x555556b56858, stbuf=0x7fff84830ef0) at hw/9pfs/9p-local.c:185 #3 0x0000555555a2b367 in v9fs_co_lstat (pdu=0x555557d97498, path=0x555556b56858, stbuf=0x7fff84830ef0) at hw/9pfs/cofile.c:53 #4 0x0000555555a1e9e2 in v9fs_stat (opaque=0x555557d97498) at hw/9pfs/9p.c:1083 #5 0x0000555555e060a2 in coroutine_trampoline (i0=-669165424, i1=32767) at util/coroutine-ucontext.c:116 #6 0x00007fffef4f5600 in __start_context () at /lib64/libc.so.6 #7 0x0000000000000000 in () (gdb) The fix is to take the path write lock when calling v9fs_complete_rename(), like in v9fs_rename(). Impact: DoS triggered by unprivileged guest users. Fixes: CVE-2018-19489 Cc: P J P <ppandit@redhat.com> Reported-by: zhibin hu <noirfate@gmail.com> Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org> Signed-off-by: Greg Kurz <groug@kaod.org>

After iov_discard_front(), the iov may be smaller than its initial size. Fixes the heap-buffer-overflow spotted by ASAN: ==9036==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6060000001e0 at pc 0x7fe632eca3f0 bp 0x7ffddc4a05a0 sp 0x7ffddc49fd48 WRITE of size 32 at 0x6060000001e0 thread T0 #0 0x7fe632eca3ef (/lib64/libasan.so.5+0x773ef) #1 0x7fe632ecad23 in __interceptor_recvmsg (/lib64/libasan.so.5+0x77d23) #2 0x561e7491936b in vubr_backend_recv_cb /home/elmarco/src/qemu/tests/vhost-user-bridge.c:333 #3 0x561e74917711 in dispatcher_wait /home/elmarco/src/qemu/tests/vhost-user-bridge.c:160 #4 0x561e7491c3b5 in vubr_run /home/elmarco/src/qemu/tests/vhost-user-bridge.c:725 #5 0x561e7491c85c in main /home/elmarco/src/qemu/tests/vhost-user-bridge.c:806 #6 0x7fe631a6c412 in __libc_start_main (/lib64/libc.so.6+0x24412) #7 0x561e7491667d in _start (/home/elmarco/src/qemu/build/tests/vhost-user-bridge+0x3967d) 0x6060000001e0 is located 0 bytes to the right of 64-byte region [0x6060000001a0,0x6060000001e0) allocated by thread T0 here: #0 0x7fe632f42848 in __interceptor_malloc (/lib64/libasan.so.5+0xef848) #1 0x561e7493acd8 in virtqueue_alloc_element /home/elmarco/src/qemu/contrib/libvhost-user/libvhost-user.c:1848 #2 0x561e7493c2a8 in vu_queue_pop /home/elmarco/src/qemu/contrib/libvhost-user/libvhost-user.c:1954 #3 0x561e749189bf in vubr_backend_recv_cb /home/elmarco/src/qemu/tests/vhost-user-bridge.c:297 #4 0x561e74917711 in dispatcher_wait /home/elmarco/src/qemu/tests/vhost-user-bridge.c:160 #5 0x561e7491c3b5 in vubr_run /home/elmarco/src/qemu/tests/vhost-user-bridge.c:725 #6 0x561e7491c85c in main /home/elmarco/src/qemu/tests/vhost-user-bridge.c:806 #7 0x7fe631a6c412 in __libc_start_main (/lib64/libc.so.6+0x24412) SUMMARY: AddressSanitizer: heap-buffer-overflow (/lib64/libasan.so.5+0x773ef) Shadow bytes around the buggy address: 0x0c0c7fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0c0c7fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0c0c7fff8000: fa fa fa fa 00 00 00 00 00 00 05 fa fa fa fa fa 0x0c0c7fff8010: 00 00 00 00 00 00 00 00 fa fa fa fa fd fd fd fd 0x0c0c7fff8020: fd fd fd fd fa fa fa fa fd fd fd fd fd fd fd fd =>0x0c0c7fff8030: fa fa fa fa 00 00 00 00 00 00 00 00[fa]fa fa fa 0x0c0c7fff8040: fd fd fd fd fd fd fd fd fa fa fa fa fd fd fd fd 0x0c0c7fff8050: fd fd fd fd fa fa fa fa fd fd fd fd fd fd fd fd 0x0c0c7fff8060: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c0c7fff8070: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c0c7fff8080: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20181109173028.3372-1-marcandre.lureau@redhat.com> Signed-off-by: Paolo BOnzini <pbonzini@redhat.com>

Let start from the beginning: Commit b9e413d (in 2.9) "block: explicitly acquire aiocontext in aio callbacks that need it" added pairs of aio_context_acquire/release to mirror_write_complete and mirror_read_complete, when they were aio callbacks for blk_aio_* calls. Then, commit 2e1990b (in 3.0) "block/mirror: Convert to coroutines" dropped these blk_aio_* calls, than mirror_write_complete and mirror_read_complete are not callbacks more, and don't need additional aiocontext acquiring. Furthermore, mirror_read_complete calls blk_co_pwritev inside these pair of aio_context_acquire/release, which leads to the following dead-lock with mirror: (gdb) info thr Id Target Id Frame 3 Thread (LWP 145412) "qemu-system-x86" syscall () 2 Thread (LWP 145416) "qemu-system-x86" __lll_lock_wait () * 1 Thread (LWP 145411) "qemu-system-x86" __lll_lock_wait () (gdb) bt #0 __lll_lock_wait () #1 _L_lock_812 () #2 __GI___pthread_mutex_lock #3 qemu_mutex_lock_impl (mutex=0x561032dce420 <qemu_global_mutex>, file=0x5610327d8654 "util/main-loop.c", line=236) at util/qemu-thread-posix.c:66 #4 qemu_mutex_lock_iothread_impl #5 os_host_main_loop_wait (timeout=480116000) at util/main-loop.c:236 #6 main_loop_wait (nonblocking=0) at util/main-loop.c:497 #7 main_loop () at vl.c:1892 #8 main Printing contents of qemu_global_mutex, I see that "__owner = 145416", so, thr1 is main loop, and now it wants BQL, which is owned by thr2. (gdb) thr 2 (gdb) bt #0 __lll_lock_wait () #1 _L_lock_870 () #2 __GI___pthread_mutex_lock #3 qemu_mutex_lock_impl (mutex=0x561034d25dc0, ... #4 aio_context_acquire (ctx=0x561034d25d60) #5 dma_blk_cb #6 dma_blk_io #7 dma_blk_read #8 ide_dma_cb #9 bmdma_cmd_writeb #10 bmdma_write #11 memory_region_write_accessor #12 access_with_adjusted_size #15 flatview_write #16 address_space_write #17 address_space_rw #18 kvm_handle_io #19 kvm_cpu_exec #20 qemu_kvm_cpu_thread_fn #21 qemu_thread_start #22 start_thread #23 clone () Printing mutex in fr 2, I see "__owner = 145411", so thr2 wants aio context mutex, which is owned by thr1. Classic dead-lock. Then, let's check that aio context is hold by mirror coroutine: just print coroutine stack of first tracked request in mirror job target: (gdb) [...] (gdb) qemu coroutine 0x561035dd0860 #0 qemu_coroutine_switch #1 qemu_coroutine_yield #2 qemu_co_mutex_lock_slowpath #3 qemu_co_mutex_lock #4 qcow2_co_pwritev #5 bdrv_driver_pwritev #6 bdrv_aligned_pwritev #7 bdrv_co_pwritev #8 blk_co_pwritev #9 mirror_read_complete () at block/mirror.c:232 #10 mirror_co_read () at block/mirror.c:370 #11 coroutine_trampoline #12 __start_context Yes it is mirror_read_complete calling blk_co_pwritev after acquiring aio context. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

When a monitor is connected to a Spice chardev, the monitor cleanup can dead-lock: #0 0x00007f43446637fd in __lll_lock_wait () at /lib64/libpthread.so.0 #1 0x00007f434465ccf4 in pthread_mutex_lock () at /lib64/libpthread.so.0 #2 0x0000556dd79f22ba in qemu_mutex_lock_impl (mutex=0x556dd81c9220 <monitor_lock>, file=0x556dd7ae3648 "/home/elmarco/src/qq/monitor.c", line=645) at /home/elmarco/src/qq/util/qemu-thread-posix.c:66 #3 0x0000556dd7431bd5 in monitor_qapi_event_queue (event=QAPI_EVENT_SPICE_DISCONNECTED, qdict=0x556dd9abc850, errp=0x7fffb7bbddd8) at /home/elmarco/src/qq/monitor.c:645 #4 0x0000556dd79d476b in qapi_event_send_spice_disconnected (server=0x556dd98ee760, client=0x556ddaaa8560, errp=0x556dd82180d0 <error_abort>) at qapi/qapi-events-ui.c:149 #5 0x0000556dd7870fc1 in channel_event (event=3, info=0x556ddad1b590) at /home/elmarco/src/qq/ui/spice-core.c:235 #6 0x00007f434560a6bb in reds_handle_channel_event (reds=<optimized out>, event=3, info=0x556ddad1b590) at reds.c:316 #7 0x00007f43455f393b in main_dispatcher_self_handle_channel_event (info=0x556ddad1b590, event=3, self=0x556dd9a7d8c0) at main-dispatcher.c:197 #8 0x00007f43455f393b in main_dispatcher_channel_event (self=0x556dd9a7d8c0, event=event@entry=3, info=0x556ddad1b590) at main-dispatcher.c:197 #9 0x00007f4345612833 in red_stream_push_channel_event (s=s@entry=0x556ddae2ef40, event=event@entry=3) at red-stream.c:414 #10 0x00007f434561286b in red_stream_free (s=0x556ddae2ef40) at red-stream.c:388 #11 0x00007f43455f9ddc in red_channel_client_finalize (object=0x556dd9bb21a0) at red-channel-client.c:347 #12 0x00007f434b5f9fb9 in g_object_unref () at /lib64/libgobject-2.0.so.0 #13 0x00007f43455fc212 in red_channel_client_push (rcc=0x556dd9bb21a0) at red-channel-client.c:1341 #14 0x0000556dd76081ba in spice_port_set_fe_open (chr=0x556dd9925e20, fe_open=0) at /home/elmarco/src/qq/chardev/spice.c:241 #15 0x0000556dd796d74a in qemu_chr_fe_set_open (be=0x556dd9a37c00, fe_open=0) at /home/elmarco/src/qq/chardev/char-fe.c:340 #16 0x0000556dd796d4d9 in qemu_chr_fe_set_handlers (b=0x556dd9a37c00, fd_can_read=0x0, fd_read=0x0, fd_event=0x0, be_change=0x0, opaque=0x0, context=0x0, set_open=true) at /home/elmarco/src/qq/chardev/char-fe.c:280 #17 0x0000556dd796d359 in qemu_chr_fe_deinit (b=0x556dd9a37c00, del=false) at /home/elmarco/src/qq/chardev/char-fe.c:233 #18 0x0000556dd7432240 in monitor_data_destroy (mon=0x556dd9a37c00) at /home/elmarco/src/qq/monitor.c:786 #19 0x0000556dd743b968 in monitor_cleanup () at /home/elmarco/src/qq/monitor.c:4683 #20 0x0000556dd75ce776 in main (argc=3, argv=0x7fffb7bbe458, envp=0x7fffb7bbe478) at /home/elmarco/src/qq/vl.c:4660 Because spice code tries to emit a "disconnected" signal on the monitors. Fix this dead-lock by releasing the monitor lock for flush/destroy. monitor_lock protects mon_list, monitor_qapi_event_state and monitor_destroyed. monitor_flush() and monitor_data_destroy() don't access any of those variables. monitor_cleanup()'s loop is safe because it uses QTAILQ_FOREACH_SAFE(), and no further monitor can be added after calling monitor_cleanup() thanks to monitor_destroyed check in monitor_list_append(). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20181205203737.9011-8-marcandre.lureau@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>

Since commit ea9ce89, device_post_init() applies globals directly from machines and accelerator classes. There are cases, such as -device scsi-hd,help, where the machine is setup but there in no accelerator. Let's skip accelerator globals in this case. Fixes SEGV: #0 0x0000555558ea04ff in object_get_class (obj=0x0) at /home/elmarco/src/qemu/build/../qom/object.c:857 #1 0x000055555854c797 in object_apply_compat_props (obj=0x616000078980) at /home/elmarco/src/qemu/build/../hw/core/qdev.c:978 #2 0x000055555854c797 in object_apply_compat_props (obj=0x616000078980) at /home/elmarco/src/qemu/build/../hw/core/qdev.c:973 #3 0x000055555854c959 in device_post_init (obj=0x616000078980) at /home/elmarco/src/qemu/build/../hw/core/qdev.c:989 #4 0x0000555558e9e250 in object_post_init_with_type (ti=<optimized out>, obj=0x616000078980) at /home/elmarco/src/qemu/build/../qom/object.c:365 #5 0x0000555558e9e250 in object_initialize_with_type (data=0x616000078980, size=616, type=<optimized out>) at /home/elmarco/src/qemu/build/../qom/object.c:425 #6 0x0000555558e9e571 in object_new_with_type (type=0x613000031900) at /home/elmarco/src/qemu/build/../qom/object.c:588 #7 0x000055555830c048 in qmp_device_list_properties (typename=typename@entry=0x60200000c2d0 "scsi-hd", errp=errp@entry=0x7fffffffc540) at /home/elmarco/src/qemu/qmp.c:519 #8 0x00005555582c4027 in qdev_device_help (opts=<optimized out>) at /home/elmarco/src/qemu/qdev-monitor.c:283 #9 0x0000555559378fa2 in qemu_opts_foreach (list=<optimized out>, func=func@entry=0x5555582cfca0 <device_help_func>, opaque=opaque@entry=0x0, errp=errp@entry=0x0) at /home/elmarco/src/qemu/util/qemu-option.c:1171 https://bugzilla.redhat.com/show_bug.cgi?id=1664364 Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20190109102311.7635-1-marcandre.lureau@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Lukáš Doktor <ldoktor@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

Lukas reported an hard to reproduce QMP iothread hang on s390 that QEMU might hang at pthread_join() of the QMP monitor iothread before quitting: Thread 1 #0 0x000003ffad10932c in pthread_join #1 0x0000000109e95750 in qemu_thread_join at /home/thuth/devel/qemu/util/qemu-thread-posix.c:570 #2 0x0000000109c95a1c in iothread_stop #3 0x0000000109bb0874 in monitor_cleanup #4 0x0000000109b55042 in main While the iothread is still in the main loop: Thread 4 #0 0x000003ffad0010e4 in ?? #1 0x000003ffad553958 in g_main_context_iterate.isra.19 #2 0x000003ffad553d90 in g_main_loop_run #3 0x0000000109c9585a in iothread_run at /home/thuth/devel/qemu/iothread.c:74 #4 0x0000000109e94752 in qemu_thread_start at /home/thuth/devel/qemu/util/qemu-thread-posix.c:502 #5 0x000003ffad10825a in start_thread #6 0x000003ffad00dcf2 in thread_start IMHO it's because there's a race between the main thread and iothread when stopping the thread in following sequence: main thread iothread =========== ============== aio_poll() iothread_get_g_main_context set iothread->worker_context iothread_stop schedule iothread_stop_bh execute iothread_stop_bh [1] set iothread->running=false (since main_loop==NULL so skip to quit main loop. Note: although main_loop is NULL but worker_context is not!) atomic_read(&iothread->worker_context) [2] create main_loop object g_main_loop_run() [3] pthread_join() [4] We can see that when execute iothread_stop_bh() at [1] it's possible that main_loop is still NULL because it's only created until the first check of the worker_context later at [2]. Then the iothread will hang in the main loop [3] and it'll starve the main thread too [4]. Here the simple solution should be that we check again the "running" variable before check against worker_context. CC: Thomas Huth <thuth@redhat.com> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Lukáš Doktor <ldoktor@redhat.com> CC: Markus Armbruster <armbru@redhat.com> CC: Eric Blake <eblake@redhat.com> CC: Paolo Bonzini <pbonzini@redhat.com> Reported-by: Lukáš Doktor <ldoktor@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Tested-by: Thomas Huth <thuth@redhat.com> Message-id: 20190129051432.22023-1-peterx@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

It fixes heap-use-after-free which was found by clang's ASAN. Control flow of this use-after-free: main_thread: * Got SIGTERM and completes main loop * Calls migration_shutdown - migrate_fd_cancel (so, migration_thread begins to complete) - object_unref(OBJECT(current_migration)); migration_thread: * migration_iteration_finish -> schedule cleanup bh * object_unref(OBJECT(s)); (Now, current_migration is freed) * exits main_thread: * Calls vm_shutdown -> drain bdrvs -> main loop -> cleanup_bh -> use after free If you want to reproduce, these couple of sleeps will help: vl.c:4613: migration_shutdown(); + sleep(2); migration.c:3269: + sleep(1); trace_migration_thread_after_loop(); migration_iteration_finish(s); Original output: qemu-system-x86_64: terminating on signal 15 from pid 31980 (<unknown process>) ================================================================= ==31958==ERROR: AddressSanitizer: heap-use-after-free on address 0x61900001d210 at pc 0x555558a535ca bp 0x7fffffffb190 sp 0x7fffffffb188 READ of size 8 at 0x61900001d210 thread T0 (qemu-vm-0) #0 0x555558a535c9 in migrate_fd_cleanup migration/migration.c:1502:23 #1 0x5555594fde0a in aio_bh_call util/async.c:90:5 #2 0x5555594fe522 in aio_bh_poll util/async.c:118:13 #3 0x555559524783 in aio_poll util/aio-posix.c:725:17 #4 0x555559504fb3 in aio_wait_bh_oneshot util/aio-wait.c:71:5 #5 0x5555573bddf6 in virtio_blk_data_plane_stop hw/block/dataplane/virtio-blk.c:282:5 #6 0x5555589d5c09 in virtio_bus_stop_ioeventfd hw/virtio/virtio-bus.c:246:9 #7 0x5555589e9917 in virtio_pci_stop_ioeventfd hw/virtio/virtio-pci.c:287:5 #8 0x5555589e22bf in virtio_pci_vmstate_change hw/virtio/virtio-pci.c:1072:9 #9 0x555557628931 in virtio_vmstate_change hw/virtio/virtio.c:2257:9 #10 0x555557c36713 in vm_state_notify vl.c:1605:9 #11 0x55555716ef53 in do_vm_stop cpus.c:1074:9 #12 0x55555716eeff in vm_shutdown cpus.c:1092:12 #13 0x555557c4283e in main vl.c:4617:5 #14 0x7fffdfdb482f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) #15 0x555556ecb118 in _start (x86_64-softmmu/qemu-system-x86_64+0x1977118) 0x61900001d210 is located 144 bytes inside of 952-byte region [0x61900001d180,0x61900001d538) freed by thread T6 (live_migration) here: #0 0x555556f76782 in __interceptor_free /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:124:3 #1 0x555558d5fa94 in object_finalize qom/object.c:618:9 #2 0x555558d57651 in object_unref qom/object.c:1068:9 #3 0x555558a55588 in migration_thread migration/migration.c:3272:5 #4 0x5555595393f2 in qemu_thread_start util/qemu-thread-posix.c:502:9 #5 0x7fffe057f6b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9) previously allocated by thread T0 (qemu-vm-0) here: #0 0x555556f76b03 in __interceptor_malloc /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:146:3 #1 0x7ffff6ee37b8 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4f7b8) #2 0x555558d58031 in object_new qom/object.c:640:12 #3 0x555558a31f21 in migration_object_init migration/migration.c:139:25 #4 0x555557c41398 in main vl.c:4320:5 #5 0x7fffdfdb482f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) Thread T6 (live_migration) created by T0 (qemu-vm-0) here: #0 0x555556f5f0dd in pthread_create /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_interceptors.cc:210:3 #1 0x555559538cf9 in qemu_thread_create util/qemu-thread-posix.c:539:11 #2 0x555558a53304 in migrate_fd_connect migration/migration.c:3332:5 #3 0x555558a72bd8 in migration_channel_connect migration/channel.c:92:5 #4 0x555558a6ef87 in exec_start_outgoing_migration migration/exec.c:42:5 #5 0x555558a4f3c2 in qmp_migrate migration/migration.c:1922:9 #6 0x555558bb4f6a in qmp_marshal_migrate qapi/qapi-commands-migration.c:607:5 #7 0x555559363738 in do_qmp_dispatch qapi/qmp-dispatch.c:131:5 #8 0x555559362a15 in qmp_dispatch qapi/qmp-dispatch.c:174:11 #9 0x5555571bac15 in monitor_qmp_dispatch monitor.c:4124:11 #10 0x55555719a22d in monitor_qmp_bh_dispatcher monitor.c:4207:9 #11 0x5555594fde0a in aio_bh_call util/async.c:90:5 #12 0x5555594fe522 in aio_bh_poll util/async.c:118:13 #13 0x5555595201e0 in aio_dispatch util/aio-posix.c:460:5 #14 0x555559503553 in aio_ctx_dispatch util/async.c:261:5 #15 0x7ffff6ede196 in g_main_context_dispatch (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4a196) SUMMARY: AddressSanitizer: heap-use-after-free migration/migration.c:1502:23 in migrate_fd_cleanup Shadow bytes around the buggy address: 0x0c327fffb9f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c327fffba00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c327fffba10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c327fffba20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c327fffba30: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd =>0x0c327fffba40: fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c327fffba50: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c327fffba60: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c327fffba70: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c327fffba80: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c327fffba90: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Shadow gap: cc ==31958==ABORTING Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Message-Id: <20190408113343.2370-1-yury-kotov@yandex-team.ru> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Fixed up comment formatting

When writing a new board, adding device which uses other devices (container) or simply refactoring, one can discover the hard way his machine misses some devices. In the case of containers, the error is not obvious: $ qemu-system-microblaze -M xlnx-zynqmp-pmu ** ERROR:/source/qemu/qom/object.c:454:object_initialize_with_type: assertion failed: (type != NULL) Aborted (core dumped) And we have to look at the coredump to figure the error: (gdb) bt #1 0x00007f84773cf895 in abort () at /lib64/libc.so.6 #2 0x00007f847961fb53 in () at /lib64/libglib-2.0.so.0 #3 0x00007f847967a4de in g_assertion_message_expr () at /lib64/libglib-2.0.so.0 #4 0x000055c4bcac6c11 in object_initialize_with_type (data=data@entry=0x55c4bdf239e0, size=size@entry=2464, type=<optimized out>) at /source/qemu/qom/object.c:454 #5 0x000055c4bcac6e6d in object_initialize (data=data@entry=0x55c4bdf239e0, size=size@entry=2464, typename=typename@entry=0x55c4bcc7c643 "xlnx.zynqmp_ipi") at /source/qemu/qom/object.c:474 #6 0x000055c4bc9ea474 in xlnx_zynqmp_pmu_init (machine=0x55c4bdd46000) at /source/qemu/hw/microblaze/xlnx-zynqmp-pmu.c:176 #7 0x000055c4bca3b6cb in machine_run_board_init (machine=0x55c4bdd46000) at /source/qemu/hw/core/machine.c:1030 #8 0x000055c4bc95f6d2 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at /source/qemu/vl.c:4479 Since the caller knows the type name requested, we can simply display it to ease development. With this patch applied we get: $ qemu-system-microblaze -M xlnx-zynqmp-pmu qemu-system-microblaze: missing object type 'xlnx.zynqmp_ipi' Aborted (core dumped) Since the assert(type) check in object_initialize_with_type() is now impossible, remove it. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190427135642.16464-1-philmd@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

When emulating irqchip in qemu, such as following command: x86_64-softmmu/qemu-system-x86_64 -m 1024 -smp 4 -hda /home/test/test.img -machine kernel-irqchip=off --enable-kvm -vnc :0 -device edu -monitor stdio We will get a crash with following asan output: (qemu) /home/test/qemu5/qemu/hw/intc/ioapic.c:266:27: runtime error: index 35 out of bounds for type 'int [24]' ================================================================= ==113504==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61b000003114 at pc 0x5579e3c7a80f bp 0x7fd004bf8c10 sp 0x7fd004bf8c00 WRITE of size 4 at 0x61b000003114 thread T4 #0 0x5579e3c7a80e in ioapic_eoi_broadcast /home/test/qemu5/qemu/hw/intc/ioapic.c:266 #1 0x5579e3c6f480 in apic_eoi /home/test/qemu5/qemu/hw/intc/apic.c:428 #2 0x5579e3c720a7 in apic_mem_write /home/test/qemu5/qemu/hw/intc/apic.c:802 #3 0x5579e3b1e31a in memory_region_write_accessor /home/test/qemu5/qemu/memory.c:503 #4 0x5579e3b1e6a2 in access_with_adjusted_size /home/test/qemu5/qemu/memory.c:569 #5 0x5579e3b28d77 in memory_region_dispatch_write /home/test/qemu5/qemu/memory.c:1497 #6 0x5579e3a1b36b in flatview_write_continue /home/test/qemu5/qemu/exec.c:3323 #7 0x5579e3a1b633 in flatview_write /home/test/qemu5/qemu/exec.c:3362 #8 0x5579e3a1bcb1 in address_space_write /home/test/qemu5/qemu/exec.c:3452 #9 0x5579e3a1bd03 in address_space_rw /home/test/qemu5/qemu/exec.c:3463 #10 0x5579e3b8b979 in kvm_cpu_exec /home/test/qemu5/qemu/accel/kvm/kvm-all.c:2045 #11 0x5579e3ae4499 in qemu_kvm_cpu_thread_fn /home/test/qemu5/qemu/cpus.c:1287 #12 0x5579e4cbdb9f in qemu_thread_start util/qemu-thread-posix.c:502 #13 0x7fd0146376da in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76da) #14 0x7fd01436088e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x12188e This is because in ioapic_eoi_broadcast function, we uses 'vector' to index the 's->irq_eoi'. To fix this, we should uses the irq number. Signed-off-by: Li Qiang <liq3ea@163.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <20190622002119.126834-1-liq3ea@163.com>

The test aarch64 kernel is in an array defined with unsigned char aarch64_kernel[] = { [...] } which means it could be any size; currently it's quite small. However we write it to a file using init_bootfile(), which writes exactly 512 bytes to the file. This will break if we ever end up with a kernel larger than that, and will read garbage off the end of the array in the current setup where the kernel is smaller. Make init_bootfile() take an argument giving the length of the data to write. This allows us to use it for all architectures (previously s390 had a special-purpose init_bootfile_s390x which hardcoded the file to write so it could write the correct length). We assert that the x86 bootfile really is exactly 512 bytes as it should be (and as we were previously just assuming it was). This was detected by the clang-7 asan: ==15607==ERROR: AddressSanitizer: global-buffer-overflow on address 0x55a796f51d20 at pc 0x55a796b89c2f bp 0x7ffc58e89160 sp 0x7ffc58e88908 READ of size 512 at 0x55a796f51d20 thread T0 #0 0x55a796b89c2e in fwrite (/home/petmay01/linaro/qemu-from-laptop/qemu/build/sanitizers/tests/migration-test+0xb0c2e) #1 0x55a796c46492 in init_bootfile /home/petmay01/linaro/qemu-from-laptop/qemu/tests/migration-test.c:99:5 #2 0x55a796c46492 in test_migrate_start /home/petmay01/linaro/qemu-from-laptop/qemu/tests/migration-test.c:593 #3 0x55a796c44101 in test_baddest /home/petmay01/linaro/qemu-from-laptop/qemu/tests/migration-test.c:854:9 #4 0x7f906ffd3cc9 (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x72cc9) #5 0x7f906ffd3bfa (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x72bfa) #6 0x7f906ffd3bfa (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x72bfa) #7 0x7f906ffd3ea1 in g_test_run_suite (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x72ea1) #8 0x7f906ffd3ec0 in g_test_run (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x72ec0) #9 0x55a796c43707 in main /home/petmay01/linaro/qemu-from-laptop/qemu/tests/migration-test.c:1187:11 #10 0x7f906e9abb96 in __libc_start_main /build/glibc-OTsEL5/glibc-2.27/csu/../csu/libc-start.c:310 #11 0x55a796b6c2d9 in _start (/home/petmay01/linaro/qemu-from-laptop/qemu/build/sanitizers/tests/migration-test+0x932d9) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190702150311.20467-1-peter.maydell@linaro.org

Reading the RX_DATA register when the RX_FIFO is empty triggers an abort. This can be easily reproduced: $ qemu-system-arm -M emcraft-sf2 -monitor stdio -S QEMU 4.0.50 monitor - type 'help' for more information (qemu) x 0x40001010 Aborted (core dumped) (gdb) bt #1 0x00007f035874f895 in abort () at /lib64/libc.so.6 #2 0x00005628686591ff in fifo8_pop (fifo=0x56286a9a4c68) at util/fifo8.c:66 #3 0x00005628683e0b8e in fifo32_pop (fifo=0x56286a9a4c68) at include/qemu/fifo32.h:137 #4 0x00005628683e0efb in spi_read (opaque=0x56286a9a4850, addr=4, size=4) at hw/ssi/mss-spi.c:168 #5 0x0000562867f96801 in memory_region_read_accessor (mr=0x56286a9a4b60, addr=16, value=0x7ffeecb0c5c8, size=4, shift=0, mask=4294967295, attrs=...) at memory.c:439 #6 0x0000562867f96cdb in access_with_adjusted_size (addr=16, value=0x7ffeecb0c5c8, size=4, access_size_min=1, access_size_max=4, access_fn=0x562867f967c3 <memory_region_read_accessor>, mr=0x56286a9a4b60, attrs=...) at memory.c:569 #7 0x0000562867f99940 in memory_region_dispatch_read1 (mr=0x56286a9a4b60, addr=16, pval=0x7ffeecb0c5c8, size=4, attrs=...) at memory.c:1420 #8 0x0000562867f99a08 in memory_region_dispatch_read (mr=0x56286a9a4b60, addr=16, pval=0x7ffeecb0c5c8, size=4, attrs=...) at memory.c:1447 #9 0x0000562867f38721 in flatview_read_continue (fv=0x56286aec6360, addr=1073745936, attrs=..., buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4, addr1=16, l=4, mr=0x56286a9a4b60) at exec.c:3385 #10 0x0000562867f38874 in flatview_read (fv=0x56286aec6360, addr=1073745936, attrs=..., buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4) at exec.c:3423 #11 0x0000562867f388ea in address_space_read_full (as=0x56286aa3e890, addr=1073745936, attrs=..., buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4) at exec.c:3436 #12 0x0000562867f389c5 in address_space_rw (as=0x56286aa3e890, addr=1073745936, attrs=..., buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4, is_write=false) at exec.c:3466 #13 0x0000562867f3bdd7 in cpu_memory_rw_debug (cpu=0x56286aa19d00, addr=1073745936, buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4, is_write=0) at exec.c:3976 #14 0x000056286811ed51 in memory_dump (mon=0x56286a8c32d0, count=1, format=120, wsize=4, addr=1073745936, is_physical=0) at monitor/misc.c:730 #15 0x000056286811eff1 in hmp_memory_dump (mon=0x56286a8c32d0, qdict=0x56286b15c400) at monitor/misc.c:785 #16 0x00005628684740ee in handle_hmp_command (mon=0x56286a8c32d0, cmdline=0x56286a8caeb2 "0x40001010") at monitor/hmp.c:1082 From the datasheet "Actel SmartFusion Microcontroller Subsystem User's Guide" Rev.1, Table 13-3 "SPI Register Summary", this register has a reset value of 0. Check the FIFO is not empty before accessing it, else log an error message. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-id: 20190709113715.7761-3-philmd@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

In the previous commit we fixed a crash when the guest read a register that pop from an empty FIFO. By auditing the repository, we found another similar use with an easy way to reproduce: $ qemu-system-aarch64 -M xlnx-zcu102 -monitor stdio -S QEMU 4.0.50 monitor - type 'help' for more information (qemu) xp/b 0xfd4a0134 Aborted (core dumped) (gdb) bt #0 0x00007f6936dea57f in raise () at /lib64/libc.so.6 #1 0x00007f6936dd4895 in abort () at /lib64/libc.so.6 #2 0x0000561ad32975ec in xlnx_dp_aux_pop_rx_fifo (s=0x7f692babee70) at hw/display/xlnx_dp.c:431 #3 0x0000561ad3297dc0 in xlnx_dp_read (opaque=0x7f692babee70, offset=77, size=4) at hw/display/xlnx_dp.c:667 #4 0x0000561ad321b896 in memory_region_read_accessor (mr=0x7f692babf620, addr=308, value=0x7ffe05c1db88, size=4, shift=0, mask=4294967295, attrs=...) at memory.c:439 #5 0x0000561ad321bd70 in access_with_adjusted_size (addr=308, value=0x7ffe05c1db88, size=1, access_size_min=4, access_size_max=4, access_fn=0x561ad321b858 <memory_region_read_accessor>, mr=0x7f692babf620, attrs=...) at memory.c:569 #6 0x0000561ad321e9d5 in memory_region_dispatch_read1 (mr=0x7f692babf620, addr=308, pval=0x7ffe05c1db88, size=1, attrs=...) at memory.c:1420 #7 0x0000561ad321ea9d in memory_region_dispatch_read (mr=0x7f692babf620, addr=308, pval=0x7ffe05c1db88, size=1, attrs=...) at memory.c:1447 #8 0x0000561ad31bd742 in flatview_read_continue (fv=0x561ad69c04f0, addr=4249485620, attrs=..., buf=0x7ffe05c1dcf0 "\020\335\301\005\376\177", len=1, addr1=308, l=1, mr=0x7f692babf620) at exec.c:3385 #9 0x0000561ad31bd895 in flatview_read (fv=0x561ad69c04f0, addr=4249485620, attrs=..., buf=0x7ffe05c1dcf0 "\020\335\301\005\376\177", len=1) at exec.c:3423 #10 0x0000561ad31bd90b in address_space_read_full (as=0x561ad5bb3020, addr=4249485620, attrs=..., buf=0x7ffe05c1dcf0 "\020\335\301\005\376\177", len=1) at exec.c:3436 #11 0x0000561ad33b1c42 in address_space_read (len=1, buf=0x7ffe05c1dcf0 "\020\335\301\005\376\177", attrs=..., addr=4249485620, as=0x561ad5bb3020) at include/exec/memory.h:2131 #12 0x0000561ad33b1c42 in memory_dump (mon=0x561ad59c4530, count=1, format=120, wsize=1, addr=4249485620, is_physical=1) at monitor/misc.c:723 #13 0x0000561ad33b1fc1 in hmp_physical_memory_dump (mon=0x561ad59c4530, qdict=0x561ad6c6fd00) at monitor/misc.c:795 #14 0x0000561ad37b4a9f in handle_hmp_command (mon=0x561ad59c4530, cmdline=0x561ad59d0f22 "/b 0x00000000fd4a0134") at monitor/hmp.c:1082 Fix by checking the FIFO is not empty before popping from it. The datasheet is not clear about the reset value of this register, we choose to return '0'. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-id: 20190709113715.7761-4-philmd@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

When creating the admin queue in nvme_init() the variable that holds the number of queues created is modified before actual queue creation. This is a problem because if creating the queue fails then the variable is left in inconsistent state. This was actually observed when I tried to hotplug a nvme disk. The control got to nvme_file_open() which called nvme_init() which failed and thus nvme_close() was called which in turn called nvme_free_queue_pair() with queue being NULL. This lead to an instant crash: #0 0x000055d9507ec211 in nvme_free_queue_pair (bs=0x55d952ddb880, q=0x0) at block/nvme.c:164 #1 0x000055d9507ee180 in nvme_close (bs=0x55d952ddb880) at block/nvme.c:729 #2 0x000055d9507ee3d5 in nvme_file_open (bs=0x55d952ddb880, options=0x55d952bb1410, flags=147456, errp=0x7ffd8e19e200) at block/nvme.c:781 #3 0x000055d9507629f3 in bdrv_open_driver (bs=0x55d952ddb880, drv=0x55d95109c1e0 <bdrv_nvme>, node_name=0x0, options=0x55d952bb1410, open_flags=147456, errp=0x7ffd8e19e310) at block.c:1291 #4 0x000055d9507633d6 in bdrv_open_common (bs=0x55d952ddb880, file=0x0, options=0x55d952bb1410, errp=0x7ffd8e19e310) at block.c:1551 #5 0x000055d950766881 in bdrv_open_inherit (filename=0x0, reference=0x0, options=0x55d952bb1410, flags=32768, parent=0x55d9538ce420, child_role=0x55d950eaade0 <child_file>, errp=0x7ffd8e19e510) at block.c:3063 #6 0x000055d950765ae4 in bdrv_open_child_bs (filename=0x0, options=0x55d9541cdff0, bdref_key=0x55d950af33aa "file", parent=0x55d9538ce420, child_role=0x55d950eaade0 <child_file>, allow_none=true, errp=0x7ffd8e19e510) at block.c:2712 #7 0x000055d950766633 in bdrv_open_inherit (filename=0x0, reference=0x0, options=0x55d9541cdff0, flags=0, parent=0x0, child_role=0x0, errp=0x7ffd8e19e908) at block.c:3011 #8 0x000055d950766dba in bdrv_open (filename=0x0, reference=0x0, options=0x55d953d00390, flags=0, errp=0x7ffd8e19e908) at block.c:3156 #9 0x000055d9507cb635 in blk_new_open (filename=0x0, reference=0x0, options=0x55d953d00390, flags=0, errp=0x7ffd8e19e908) at block/block-backend.c:389 #10 0x000055d950465ec5 in blockdev_init (file=0x0, bs_opts=0x55d953d00390, errp=0x7ffd8e19e908) at blockdev.c:602 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Message-id: 927aae40b617ba7d4b6c7ffe74e6d7a2595f8e86.1562770546.git.mprivozn@redhat.com Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>

Even for block nodes with bs->drv == NULL, we can't just ignore a bdrv_set_aio_context() call. Leaving the node in its old context can mean that it's still in an iothread context in bdrv_close_all() during shutdown, resulting in an attempted unlock of the AioContext lock which we don't hold. This is an example stack trace of a related crash: #0 0x00007ffff59da57f in raise () at /lib64/libc.so.6 #1 0x00007ffff59c4895 in abort () at /lib64/libc.so.6 #2 0x0000555555b97b1e in error_exit (err=<optimized out>, msg=msg@entry=0x555555d386d0 <__func__.19059> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36 #3 0x0000555555b97f7f in qemu_mutex_unlock_impl (mutex=mutex@entry=0x5555568002f0, file=file@entry=0x555555d378df "util/async.c", line=line@entry=507) at util/qemu-thread-posix.c:97 #4 0x0000555555b92f55 in aio_context_release (ctx=ctx@entry=0x555556800290) at util/async.c:507 #5 0x0000555555b05cf8 in bdrv_prwv_co (child=child@entry=0x7fffc80012f0, offset=offset@entry=131072, qiov=qiov@entry=0x7fffffffd4f0, is_write=is_write@entry=true, flags=flags@entry=0) at block/io.c:833 #6 0x0000555555b060a9 in bdrv_pwritev (qiov=0x7fffffffd4f0, offset=131072, child=0x7fffc80012f0) at block/io.c:990 #7 0x0000555555b060a9 in bdrv_pwrite (child=0x7fffc80012f0, offset=131072, buf=<optimized out>, bytes=<optimized out>) at block/io.c:990 #8 0x0000555555ae172b in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568cc740, i=i@entry=0) at block/qcow2-cache.c:51 #9 0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568cc740) at block/qcow2-cache.c:248 #10 0x0000555555ae15de in qcow2_cache_flush (bs=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259 #11 0x0000555555ae16b1 in qcow2_cache_flush_dependency (c=0x5555568a1700, c=0x5555568a1700, bs=0x555556810680) at block/qcow2-cache.c:194 #12 0x0000555555ae16b1 in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568a1700, i=i@entry=0) at block/qcow2-cache.c:194 #13 0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568a1700) at block/qcow2-cache.c:248 #14 0x0000555555ae15de in qcow2_cache_flush (bs=bs@entry=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259 #15 0x0000555555ad242c in qcow2_inactivate (bs=bs@entry=0x555556810680) at block/qcow2.c:2124 #16 0x0000555555ad2590 in qcow2_close (bs=0x555556810680) at block/qcow2.c:2153 #17 0x0000555555ab0c62 in bdrv_close (bs=0x555556810680) at block.c:3358 #18 0x0000555555ab0c62 in bdrv_delete (bs=0x555556810680) at block.c:3542 #19 0x0000555555ab0c62 in bdrv_unref (bs=0x555556810680) at block.c:4598 #20 0x0000555555af4d72 in blk_remove_bs (blk=blk@entry=0x5555568103d0) at block/block-backend.c:785 #21 0x0000555555af4dbb in blk_remove_all_bs () at block/block-backend.c:483 #22 0x0000555555aae02f in bdrv_close_all () at block.c:3412 #23 0x00005555557f9796 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4776 The reproducer I used is a qcow2 image on gluster volume, where the virtual disk size (4 GB) is larger than the gluster volume size (64M), so we can easily trigger an ENOSPC. This backend is assigned to a virtio-blk device using an iothread, and then from the guest a 'dd if=/dev/zero of=/dev/vda bs=1G count=1' causes the VM to stop because of an I/O error. qemu_gluster_co_flush_to_disk() sets bs->drv = NULL on error, so when virtio-blk stops the dataplane, the block nodes stay in the iothread AioContext. A 'quit' monitor command issued from this paused state crashes the process. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1631227 Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> (cherry picked from commit 1bffe1a) *drop context dependency on e64f25f Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

Reading the RX_DATA register when the RX_FIFO is empty triggers an abort. This can be easily reproduced: $ qemu-system-arm -M emcraft-sf2 -monitor stdio -S QEMU 4.0.50 monitor - type 'help' for more information (qemu) x 0x40001010 Aborted (core dumped) (gdb) bt #1 0x00007f035874f895 in abort () at /lib64/libc.so.6 #2 0x00005628686591ff in fifo8_pop (fifo=0x56286a9a4c68) at util/fifo8.c:66 #3 0x00005628683e0b8e in fifo32_pop (fifo=0x56286a9a4c68) at include/qemu/fifo32.h:137 #4 0x00005628683e0efb in spi_read (opaque=0x56286a9a4850, addr=4, size=4) at hw/ssi/mss-spi.c:168 #5 0x0000562867f96801 in memory_region_read_accessor (mr=0x56286a9a4b60, addr=16, value=0x7ffeecb0c5c8, size=4, shift=0, mask=4294967295, attrs=...) at memory.c:439 #6 0x0000562867f96cdb in access_with_adjusted_size (addr=16, value=0x7ffeecb0c5c8, size=4, access_size_min=1, access_size_max=4, access_fn=0x562867f967c3 <memory_region_read_accessor>, mr=0x56286a9a4b60, attrs=...) at memory.c:569 #7 0x0000562867f99940 in memory_region_dispatch_read1 (mr=0x56286a9a4b60, addr=16, pval=0x7ffeecb0c5c8, size=4, attrs=...) at memory.c:1420 #8 0x0000562867f99a08 in memory_region_dispatch_read (mr=0x56286a9a4b60, addr=16, pval=0x7ffeecb0c5c8, size=4, attrs=...) at memory.c:1447 #9 0x0000562867f38721 in flatview_read_continue (fv=0x56286aec6360, addr=1073745936, attrs=..., buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4, addr1=16, l=4, mr=0x56286a9a4b60) at exec.c:3385 #10 0x0000562867f38874 in flatview_read (fv=0x56286aec6360, addr=1073745936, attrs=..., buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4) at exec.c:3423 #11 0x0000562867f388ea in address_space_read_full (as=0x56286aa3e890, addr=1073745936, attrs=..., buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4) at exec.c:3436 #12 0x0000562867f389c5 in address_space_rw (as=0x56286aa3e890, addr=1073745936, attrs=..., buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4, is_write=false) at exec.c:3466 #13 0x0000562867f3bdd7 in cpu_memory_rw_debug (cpu=0x56286aa19d00, addr=1073745936, buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4, is_write=0) at exec.c:3976 #14 0x000056286811ed51 in memory_dump (mon=0x56286a8c32d0, count=1, format=120, wsize=4, addr=1073745936, is_physical=0) at monitor/misc.c:730 #15 0x000056286811eff1 in hmp_memory_dump (mon=0x56286a8c32d0, qdict=0x56286b15c400) at monitor/misc.c:785 #16 0x00005628684740ee in handle_hmp_command (mon=0x56286a8c32d0, cmdline=0x56286a8caeb2 "0x40001010") at monitor/hmp.c:1082 From the datasheet "Actel SmartFusion Microcontroller Subsystem User's Guide" Rev.1, Table 13-3 "SPI Register Summary", this register has a reset value of 0. Check the FIFO is not empty before accessing it, else log an error message. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-id: 20190709113715.7761-3-philmd@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit c0bccee) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

In the previous commit we fixed a crash when the guest read a register that pop from an empty FIFO. By auditing the repository, we found another similar use with an easy way to reproduce: $ qemu-system-aarch64 -M xlnx-zcu102 -monitor stdio -S QEMU 4.0.50 monitor - type 'help' for more information (qemu) xp/b 0xfd4a0134 Aborted (core dumped) (gdb) bt #0 0x00007f6936dea57f in raise () at /lib64/libc.so.6 #1 0x00007f6936dd4895 in abort () at /lib64/libc.so.6 #2 0x0000561ad32975ec in xlnx_dp_aux_pop_rx_fifo (s=0x7f692babee70) at hw/display/xlnx_dp.c:431 #3 0x0000561ad3297dc0 in xlnx_dp_read (opaque=0x7f692babee70, offset=77, size=4) at hw/display/xlnx_dp.c:667 #4 0x0000561ad321b896 in memory_region_read_accessor (mr=0x7f692babf620, addr=308, value=0x7ffe05c1db88, size=4, shift=0, mask=4294967295, attrs=...) at memory.c:439 #5 0x0000561ad321bd70 in access_with_adjusted_size (addr=308, value=0x7ffe05c1db88, size=1, access_size_min=4, access_size_max=4, access_fn=0x561ad321b858 <memory_region_read_accessor>, mr=0x7f692babf620, attrs=...) at memory.c:569 #6 0x0000561ad321e9d5 in memory_region_dispatch_read1 (mr=0x7f692babf620, addr=308, pval=0x7ffe05c1db88, size=1, attrs=...) at memory.c:1420 #7 0x0000561ad321ea9d in memory_region_dispatch_read (mr=0x7f692babf620, addr=308, pval=0x7ffe05c1db88, size=1, attrs=...) at memory.c:1447 #8 0x0000561ad31bd742 in flatview_read_continue (fv=0x561ad69c04f0, addr=4249485620, attrs=..., buf=0x7ffe05c1dcf0 "\020\335\301\005\376\177", len=1, addr1=308, l=1, mr=0x7f692babf620) at exec.c:3385 #9 0x0000561ad31bd895 in flatview_read (fv=0x561ad69c04f0, addr=4249485620, attrs=..., buf=0x7ffe05c1dcf0 "\020\335\301\005\376\177", len=1) at exec.c:3423 #10 0x0000561ad31bd90b in address_space_read_full (as=0x561ad5bb3020, addr=4249485620, attrs=..., buf=0x7ffe05c1dcf0 "\020\335\301\005\376\177", len=1) at exec.c:3436 #11 0x0000561ad33b1c42 in address_space_read (len=1, buf=0x7ffe05c1dcf0 "\020\335\301\005\376\177", attrs=..., addr=4249485620, as=0x561ad5bb3020) at include/exec/memory.h:2131 #12 0x0000561ad33b1c42 in memory_dump (mon=0x561ad59c4530, count=1, format=120, wsize=1, addr=4249485620, is_physical=1) at monitor/misc.c:723 #13 0x0000561ad33b1fc1 in hmp_physical_memory_dump (mon=0x561ad59c4530, qdict=0x561ad6c6fd00) at monitor/misc.c:795 #14 0x0000561ad37b4a9f in handle_hmp_command (mon=0x561ad59c4530, cmdline=0x561ad59d0f22 "/b 0x00000000fd4a0134") at monitor/hmp.c:1082 Fix by checking the FIFO is not empty before popping from it. The datasheet is not clear about the reset value of this register, we choose to return '0'. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-id: 20190709113715.7761-4-philmd@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit a09ef50) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

commit a6f230c move blockbackend back to main AioContext on unplug. It set the AioContext of SCSIDevice to the main AioContex, but s->ctx is still the iothread AioContexï¼ˆif the scsi controller is configure with iothreadï¼‰. So if there are having in-flight requests during unplug, a failing assertion happend. The bt is below: (gdb) bt #0 0x0000ffff86aacbd0 in raise () from /lib64/libc.so.6 #1 0x0000ffff86aadf7c in abort () from /lib64/libc.so.6 #2 0x0000ffff86aa6124 in __assert_fail_base () from /lib64/libc.so.6 #3 0x0000ffff86aa61a4 in __assert_fail () from /lib64/libc.so.6 #4 0x0000000000529118 in virtio_scsi_ctx_check (d=<optimized out>, s=<optimized out>, s=<optimized out>) at /home/qemu-4.0.0/hw/scsi/virtio-scsi.c:246 #5 0x0000000000529ec4 in virtio_scsi_handle_cmd_req_prepare (s=0x2779ec00, req=0xffff740397d0) at /home/qemu-4.0.0/hw/scsi/virtio-scsi.c:559 #6 0x000000000052a228 in virtio_scsi_handle_cmd_vq (s=0x2779ec00, vq=0xffff7c6d7110) at /home/qemu-4.0.0/hw/scsi/virtio-scsi.c:603 #7 0x000000000052afa8 in virtio_scsi_data_plane_handle_cmd (vdev=<optimized out>, vq=0xffff7c6d7110) at /home/qemu-4.0.0/hw/scsi/virtio-scsi-dataplane.c:59 #8 0x000000000054d94c in virtio_queue_host_notifier_aio_poll (opaque=<optimized out>) at /home/qemu-4.0.0/hw/virtio/virtio.c:2452 assert(blk_get_aio_context(d->conf.blk) == s->ctx) failed. To avoid assertion failed, moving the "if" after qdev_simple_device_unplug_cb. In addition, to avoid another qemu crash below, add aio_disable_external before qdev_simple_device_unplug_cb, which disable the further processing of external clients when doing qdev_simple_device_unplug_cb. (gdb) bt #0 scsi_req_unref (req=0xffff6802c6f0) at hw/scsi/scsi-bus.c:1283 #1 0x00000000005294a4 in virtio_scsi_handle_cmd_req_submit (req=<optimized out>, s=<optimized out>) at /home/qemu-4.0.0/hw/scsi/virtio-scsi.c:589 #2 0x000000000052a2a8 in virtio_scsi_handle_cmd_vq (s=s@entry=0x9c90e90, vq=vq@entry=0xffff7c05f110) at /home/qemu-4.0.0/hw/scsi/virtio-scsi.c:625 #3 0x000000000052afd8 in virtio_scsi_data_plane_handle_cmd (vdev=<optimized out>, vq=0xffff7c05f110) at /home/qemu-4.0.0/hw/scsi/virtio-scsi-dataplane.c:60 #4 0x000000000054d97c in virtio_queue_host_notifier_aio_poll (opaque=<optimized out>) at /home/qemu-4.0.0/hw/virtio/virtio.c:2447 #5 0x00000000009b204c in run_poll_handlers_once (ctx=ctx@entry=0x6efea40, timeout=timeout@entry=0xffff7d7f7308) at util/aio-posix.c:521 #6 0x00000000009b2b64 in run_poll_handlers (ctx=ctx@entry=0x6efea40, max_ns=max_ns@entry=4000, timeout=timeout@entry=0xffff7d7f7308) at util/aio-posix.c:559 #7 0x00000000009b2ca0 in try_poll_mode (ctx=ctx@entry=0x6efea40, timeout=0xffff7d7f7308, timeout@entry=0xffff7d7f7348) at util/aio-posix.c:594 #8 0x00000000009b31b8 in aio_poll (ctx=0x6efea40, blocking=blocking@entry=true) at util/aio-posix.c:636 #9 0x00000000006973cc in iothread_run (opaque=0x6ebd800) at iothread.c:75 #10 0x00000000009b592c in qemu_thread_start (args=0x6efef60) at util/qemu-thread-posix.c:502 #11 0x0000ffff8057f8bc in start_thread () from /lib64/libpthread.so.0 #12 0x0000ffff804e5f8c in thread_start () from /lib64/libc.so.6 (gdb) p bus $1 = (SCSIBus *) 0x0 Signed-off-by: Zhengui li <lizhengui@huawei.com> Message-Id: <1563696502-7972-1-git-send-email-lizhengui@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1563829520-17525-1-git-send-email-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

If interface_count is NO_INTERFACE_INFO, let's not access the arrays out-of-bounds. ==994==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x625000243930 at pc 0x5642068086a8 bp 0x7f0b6f9ffa50 sp 0x7f0b6f9ffa40 READ of size 1 at 0x625000243930 thread T0 #0 0x5642068086a7 in usbredir_check_bulk_receiving /home/elmarco/src/qemu/hw/usb/redirect.c:1503 #1 0x56420681301c in usbredir_post_load /home/elmarco/src/qemu/hw/usb/redirect.c:2154 #2 0x5642068a56c2 in vmstate_load_state /home/elmarco/src/qemu/migration/vmstate.c:168 #3 0x56420688e2ac in vmstate_load /home/elmarco/src/qemu/migration/savevm.c:829 #4 0x5642068980cb in qemu_loadvm_section_start_full /home/elmarco/src/qemu/migration/savevm.c:2211 #5 0x564206899645 in qemu_loadvm_state_main /home/elmarco/src/qemu/migration/savevm.c:2395 #6 0x5642068998cf in qemu_loadvm_state /home/elmarco/src/qemu/migration/savevm.c:2467 #7 0x56420685f3e9 in process_incoming_migration_co /home/elmarco/src/qemu/migration/migration.c:449 #8 0x564207106c47 in coroutine_trampoline /home/elmarco/src/qemu/util/coroutine-ucontext.c:115 #9 0x7f0c0604e37f (/lib64/libc.so.6+0x4d37f) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Reviewed-by: Li Qiang <liq3ea@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190807084048.4258-1-marcandre.lureau@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

There's a race condition in which the tcp_chr_read() ioc handler can close a connection that is being written to from another thread. Running iotest 136 in a loop triggers this problem and crashes QEMU. (gdb) bt #0 0x00005558b842902d in object_get_class (obj=0x0) at qom/object.c:860 #1 0x00005558b84f92db in qio_channel_writev_full (ioc=0x0, iov=0x7ffc355decf0, niov=1, fds=0x0, nfds=0, errp=0x0) at io/channel.c:76 #2 0x00005558b84e0e9e in io_channel_send_full (ioc=0x0, buf=0x5558baf5beb0, len=138, fds=0x0, nfds=0) at chardev/char-io.c:123 #3 0x00005558b84e4a69 in tcp_chr_write (chr=0x5558ba460380, buf=0x5558baf5beb0 "...", len=138) at chardev/char-socket.c:135 #4 0x00005558b84dca55 in qemu_chr_write_buffer (s=0x5558ba460380, buf=0x5558baf5beb0 "...", len=138, offset=0x7ffc355dedd0, write_all=false) at chardev/char.c:112 #5 0x00005558b84dcbc2 in qemu_chr_write (s=0x5558ba460380, buf=0x5558baf5beb0 "...", len=138, write_all=false) at chardev/char.c:147 #6 0x00005558b84dfb26 in qemu_chr_fe_write (be=0x5558ba476610, buf=0x5558baf5beb0 "...", len=138) at chardev/char-fe.c:42 #7 0x00005558b8088c86 in monitor_flush_locked (mon=0x5558ba476610) at monitor.c:406 #8 0x00005558b8088e8c in monitor_puts (mon=0x5558ba476610, str=0x5558ba921e49 "") at monitor.c:449 #9 0x00005558b8089178 in qmp_send_response (mon=0x5558ba476610, rsp=0x5558bb161600) at monitor.c:498 #10 0x00005558b808920c in monitor_qapi_event_emit (event=QAPI_EVENT_SHUTDOWN, qdict=0x5558bb161600) at monitor.c:526 #11 0x00005558b8089307 in monitor_qapi_event_queue_no_reenter (event=QAPI_EVENT_SHUTDOWN, qdict=0x5558bb161600) at monitor.c:551 #12 0x00005558b80896c0 in qapi_event_emit (event=QAPI_EVENT_SHUTDOWN, qdict=0x5558bb161600) at monitor.c:626 #13 0x00005558b855f23b in qapi_event_send_shutdown (guest=false, reason=SHUTDOWN_CAUSE_HOST_QMP_QUIT) at qapi/qapi-events-run-state.c:43 #14 0x00005558b81911ef in qemu_system_shutdown (cause=SHUTDOWN_CAUSE_HOST_QMP_QUIT) at vl.c:1837 #15 0x00005558b8191308 in main_loop_should_exit () at vl.c:1885 #16 0x00005558b819140d in main_loop () at vl.c:1924 #17 0x00005558b8198c84 in main (argc=18, argv=0x7ffc355df3f8, envp=0x7ffc355df490) at vl.c:4665 This patch adds a lock to protect tcp_chr_disconnect() and socket_reconnect_timeout() Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Message-Id: <1565625509-404969-3-git-send-email-andrey.shinkevich@virtuozzo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Address Sanitizer shows memory leak in xhci_address_slot hw/usb/hcd-xhci.c:2156 and the stack is as bellow: Direct leak of 64 byte(s) in 4 object(s) allocated from: #0 0xffff91c6f5ab in realloc (/lib64/libasan.so.4+0xd35ab) #1 0xffff91987243 in g_realloc (/lib64/libglib-2.0.so.0+0x57243) #2 0xaaaab0b26a1f in qemu_iovec_add util/iov.c:296 #3 0xaaaab07e5ce3 in xhci_address_slot hw/usb/hcd-xhci.c:2156 #4 0xaaaab07e5ce3 in xhci_process_commands hw/usb/hcd-xhci.c:2493 #5 0xaaaab00058d7 in memory_region_write_accessor qemu/memory.c:507 #6 0xaaaab0000d87 in access_with_adjusted_size memory.c:573 #7 0xaaaab000abcf in memory_region_dispatch_write memory.c:1516 #8 0xaaaaaff59947 in flatview_write_continue exec.c:3367 #9 0xaaaaaff59c33 in flatview_write exec.c:3406 #10 0xaaaaaff63b3b in address_space_write exec.c:3496 #11 0xaaaab002f263 in kvm_cpu_exec accel/kvm/kvm-all.c:2288 #12 0xaaaaaffee427 in qemu_kvm_cpu_thread_fn cpus.c:1290 #13 0xaaaab0b1a943 in qemu_thread_start util/qemu-thread-posix.c:502 #14 0xffff908ce8bb in start_thread (/lib64/libpthread.so.0+0x78bb) #15 0xffff908165cb in thread_start (/lib64/libc.so.6+0xd55cb) Cc: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Ying Fang <fangying1@huawei.com> Reviewed-by: Li Qiang <liq3ea@gmail.com> Message-id: 20190827080209.2365-1-fangying1@huawei.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

Address Sanitizer shows memory leak in xhci_kick_epctx hw/usb/hcd-xhci.c:1912. A sglist is leaked when a packet is retired and returns USB_RET_NAK status. The leak stack is as bellow: Direct leak of 2688 byte(s) in 168 object(s) allocated from: #0 0xffffae8b11db in __interceptor_malloc (/lib64/libasan.so.4+0xd31db) #1 0xffffae5c9163 in g_malloc (/lib64/libglib-2.0.so.0+0x57163) #2 0xaaaabb6fb3f7 in qemu_sglist_init dma-helpers.c:43 #3 0xaaaabba705a7 in pci_dma_sglist_init include/hw/pci/pci.h:837 #4 0xaaaabba705a7 in xhci_xfer_create_sgl hw/usb/hcd-xhci.c:1443 #5 0xaaaabba705a7 in xhci_setup_packet hw/usb/hcd-xhci.c:1615 #6 0xaaaabba77a6f in xhci_kick_epctx hw/usb/hcd-xhci.c:1912 #7 0xaaaabbdaad27 in timerlist_run_timers util/qemu-timer.c:592 #8 0xaaaabbdab19f in qemu_clock_run_timers util/qemu-timer.c:606 #9 0xaaaabbdab19f in qemu_clock_run_all_timers util/qemu-timer.c:692 #10 0xaaaabbdab9a3 in main_loop_wait util/main-loop.c:524 #11 0xaaaabb6ff5e7 in main_loop vl.c:1806 #12 0xaaaabb1e1453 in main vl.c:4488 Signed-off-by: Ying Fang <fangying1@huawei.com> Message-id: 20190828062535.1573-1-fangying1@huawei.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

Even for block nodes with bs->drv == NULL, we can't just ignore a bdrv_set_aio_context() call. Leaving the node in its old context can mean that it's still in an iothread context in bdrv_close_all() during shutdown, resulting in an attempted unlock of the AioContext lock which we don't hold. This is an example stack trace of a related crash: #0 0x00007ffff59da57f in raise () at /lib64/libc.so.6 #1 0x00007ffff59c4895 in abort () at /lib64/libc.so.6 #2 0x0000555555b97b1e in error_exit (err=<optimized out>, msg=msg@entry=0x555555d386d0 <__func__.19059> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36 #3 0x0000555555b97f7f in qemu_mutex_unlock_impl (mutex=mutex@entry=0x5555568002f0, file=file@entry=0x555555d378df "util/async.c", line=line@entry=507) at util/qemu-thread-posix.c:97 #4 0x0000555555b92f55 in aio_context_release (ctx=ctx@entry=0x555556800290) at util/async.c:507 #5 0x0000555555b05cf8 in bdrv_prwv_co (child=child@entry=0x7fffc80012f0, offset=offset@entry=131072, qiov=qiov@entry=0x7fffffffd4f0, is_write=is_write@entry=true, flags=flags@entry=0) at block/io.c:833 #6 0x0000555555b060a9 in bdrv_pwritev (qiov=0x7fffffffd4f0, offset=131072, child=0x7fffc80012f0) at block/io.c:990 #7 0x0000555555b060a9 in bdrv_pwrite (child=0x7fffc80012f0, offset=131072, buf=<optimized out>, bytes=<optimized out>) at block/io.c:990 #8 0x0000555555ae172b in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568cc740, i=i@entry=0) at block/qcow2-cache.c:51 #9 0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568cc740) at block/qcow2-cache.c:248 #10 0x0000555555ae15de in qcow2_cache_flush (bs=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259 #11 0x0000555555ae16b1 in qcow2_cache_flush_dependency (c=0x5555568a1700, c=0x5555568a1700, bs=0x555556810680) at block/qcow2-cache.c:194 #12 0x0000555555ae16b1 in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568a1700, i=i@entry=0) at block/qcow2-cache.c:194 #13 0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568a1700) at block/qcow2-cache.c:248 #14 0x0000555555ae15de in qcow2_cache_flush (bs=bs@entry=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259 #15 0x0000555555ad242c in qcow2_inactivate (bs=bs@entry=0x555556810680) at block/qcow2.c:2124 #16 0x0000555555ad2590 in qcow2_close (bs=0x555556810680) at block/qcow2.c:2153 #17 0x0000555555ab0c62 in bdrv_close (bs=0x555556810680) at block.c:3358 #18 0x0000555555ab0c62 in bdrv_delete (bs=0x555556810680) at block.c:3542 #19 0x0000555555ab0c62 in bdrv_unref (bs=0x555556810680) at block.c:4598 #20 0x0000555555af4d72 in blk_remove_bs (blk=blk@entry=0x5555568103d0) at block/block-backend.c:785 #21 0x0000555555af4dbb in blk_remove_all_bs () at block/block-backend.c:483 #22 0x0000555555aae02f in bdrv_close_all () at block.c:3412 #23 0x00005555557f9796 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4776 The reproducer I used is a qcow2 image on gluster volume, where the virtual disk size (4 GB) is larger than the gluster volume size (64M), so we can easily trigger an ENOSPC. This backend is assigned to a virtio-blk device using an iothread, and then from the guest a 'dd if=/dev/zero of=/dev/vda bs=1G count=1' causes the VM to stop because of an I/O error. qemu_gluster_co_flush_to_disk() sets bs->drv = NULL on error, so when virtio-blk stops the dataplane, the block nodes stay in the iothread AioContext. A 'quit' monitor command issued from this paused state crashes the process. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1631227 Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> (cherry picked from commit 1bffe1a) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

If interface_count is NO_INTERFACE_INFO, let's not access the arrays out-of-bounds. ==994==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x625000243930 at pc 0x5642068086a8 bp 0x7f0b6f9ffa50 sp 0x7f0b6f9ffa40 READ of size 1 at 0x625000243930 thread T0 #0 0x5642068086a7 in usbredir_check_bulk_receiving /home/elmarco/src/qemu/hw/usb/redirect.c:1503 #1 0x56420681301c in usbredir_post_load /home/elmarco/src/qemu/hw/usb/redirect.c:2154 #2 0x5642068a56c2 in vmstate_load_state /home/elmarco/src/qemu/migration/vmstate.c:168 #3 0x56420688e2ac in vmstate_load /home/elmarco/src/qemu/migration/savevm.c:829 #4 0x5642068980cb in qemu_loadvm_section_start_full /home/elmarco/src/qemu/migration/savevm.c:2211 #5 0x564206899645 in qemu_loadvm_state_main /home/elmarco/src/qemu/migration/savevm.c:2395 #6 0x5642068998cf in qemu_loadvm_state /home/elmarco/src/qemu/migration/savevm.c:2467 #7 0x56420685f3e9 in process_incoming_migration_co /home/elmarco/src/qemu/migration/migration.c:449 #8 0x564207106c47 in coroutine_trampoline /home/elmarco/src/qemu/util/coroutine-ucontext.c:115 #9 0x7f0c0604e37f (/lib64/libc.so.6+0x4d37f) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Reviewed-by: Li Qiang <liq3ea@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190807084048.4258-1-marcandre.lureau@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> (cherry picked from commit 7b84b90) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

The 'blockdev-create' QMP command was introduced as experimental feature in commit b0292b8, using the assert() debug call. It got promoted to 'stable' command in 3fb588a, but the assert call was not removed. Some block drivers are optional, and bdrv_find_format() might return a NULL value, triggering the assertion. Stable code is not expected to abort, so return an error instead. This is easily reproducible when libnfs is not installed: ./configure [...] module support no Block whitelist (rw) Block whitelist (ro) libiscsi support yes libnfs support no [...] Start QEMU: $ qemu-system-x86_64 -S -qmp unix:/tmp/qemu.qmp,server,nowait Send the 'blockdev-create' with the 'nfs' driver: $ ( cat << 'EOF' {'execute': 'qmp_capabilities'} {'execute': 'blockdev-create', 'arguments': {'job-id': 'x', 'options': {'size': 0, 'driver': 'nfs', 'location': {'path': '/', 'server': {'host': '::1', 'type': 'inet'}}}}, 'id': 'x'} EOF ) | socat STDIO UNIX:/tmp/qemu.qmp {"QMP": {"version": {"qemu": {"micro": 50, "minor": 1, "major": 4}, "package": "v4.1.0-733-g89ea03a7dc"}, "capabilities": ["oob"]}} {"return": {}} QEMU crashes: $ gdb qemu-system-x86_64 core Program received signal SIGSEGV, Segmentation fault. (gdb) bt #0 0x00007ffff510957f in raise () at /lib64/libc.so.6 #1 0x00007ffff50f3895 in abort () at /lib64/libc.so.6 #2 0x00007ffff50f3769 in _nl_load_domain.cold.0 () at /lib64/libc.so.6 #3 0x00007ffff5101a26 in .annobin_assert.c_end () at /lib64/libc.so.6 #4 0x0000555555d7e1f1 in qmp_blockdev_create (job_id=0x555556baee40 "x", options=0x555557666610, errp=0x7fffffffc770) at block/create.c:69 #5 0x0000555555c96b52 in qmp_marshal_blockdev_create (args=0x7fffdc003830, ret=0x7fffffffc7f8, errp=0x7fffffffc7f0) at qapi/qapi-commands-block-core.c:1314 #6 0x0000555555deb0a0 in do_qmp_dispatch (cmds=0x55555645de70 <qmp_commands>, request=0x7fffdc005c70, allow_oob=false, errp=0x7fffffffc898) at qapi/qmp-dispatch.c:131 #7 0x0000555555deb2a1 in qmp_dispatch (cmds=0x55555645de70 <qmp_commands>, request=0x7fffdc005c70, allow_oob=false) at qapi/qmp-dispatch.c:174 With this patch applied, QEMU returns a QMP error: {'execute': 'blockdev-create', 'arguments': {'job-id': 'x', 'options': {'size': 0, 'driver': 'nfs', 'location': {'path': '/', 'server': {'host': '::1', 'type': 'inet'}}}}, 'id': 'x'} {"id": "x", "error": {"class": "GenericError", "desc": "Block driver 'nfs' not found or not supported"}} Cc: qemu-stable@nongnu.org Reported-by: Xu Tian <xutian@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit d90d5ca) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

If interface_count is NO_INTERFACE_INFO, let's not access the arrays out-of-bounds. ==994==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x625000243930 at pc 0x5642068086a8 bp 0x7f0b6f9ffa50 sp 0x7f0b6f9ffa40 READ of size 1 at 0x625000243930 thread T0 #0 0x5642068086a7 in usbredir_check_bulk_receiving /home/elmarco/src/qemu/hw/usb/redirect.c:1503 #1 0x56420681301c in usbredir_post_load /home/elmarco/src/qemu/hw/usb/redirect.c:2154 #2 0x5642068a56c2 in vmstate_load_state /home/elmarco/src/qemu/migration/vmstate.c:168 #3 0x56420688e2ac in vmstate_load /home/elmarco/src/qemu/migration/savevm.c:829 #4 0x5642068980cb in qemu_loadvm_section_start_full /home/elmarco/src/qemu/migration/savevm.c:2211 #5 0x564206899645 in qemu_loadvm_state_main /home/elmarco/src/qemu/migration/savevm.c:2395 #6 0x5642068998cf in qemu_loadvm_state /home/elmarco/src/qemu/migration/savevm.c:2467 #7 0x56420685f3e9 in process_incoming_migration_co /home/elmarco/src/qemu/migration/migration.c:449 #8 0x564207106c47 in coroutine_trampoline /home/elmarco/src/qemu/util/coroutine-ucontext.c:115 #9 0x7f0c0604e37f (/lib64/libc.so.6+0x4d37f) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Reviewed-by: Li Qiang <liq3ea@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190807084048.4258-1-marcandre.lureau@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> (cherry picked from commit 7b84b90) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

The 'blockdev-create' QMP command was introduced as experimental feature in commit b0292b8, using the assert() debug call. It got promoted to 'stable' command in 3fb588a, but the assert call was not removed. Some block drivers are optional, and bdrv_find_format() might return a NULL value, triggering the assertion. Stable code is not expected to abort, so return an error instead. This is easily reproducible when libnfs is not installed: ./configure [...] module support no Block whitelist (rw) Block whitelist (ro) libiscsi support yes libnfs support no [...] Start QEMU: $ qemu-system-x86_64 -S -qmp unix:/tmp/qemu.qmp,server,nowait Send the 'blockdev-create' with the 'nfs' driver: $ ( cat << 'EOF' {'execute': 'qmp_capabilities'} {'execute': 'blockdev-create', 'arguments': {'job-id': 'x', 'options': {'size': 0, 'driver': 'nfs', 'location': {'path': '/', 'server': {'host': '::1', 'type': 'inet'}}}}, 'id': 'x'} EOF ) | socat STDIO UNIX:/tmp/qemu.qmp {"QMP": {"version": {"qemu": {"micro": 50, "minor": 1, "major": 4}, "package": "v4.1.0-733-g89ea03a7dc"}, "capabilities": ["oob"]}} {"return": {}} QEMU crashes: $ gdb qemu-system-x86_64 core Program received signal SIGSEGV, Segmentation fault. (gdb) bt #0 0x00007ffff510957f in raise () at /lib64/libc.so.6 #1 0x00007ffff50f3895 in abort () at /lib64/libc.so.6 #2 0x00007ffff50f3769 in _nl_load_domain.cold.0 () at /lib64/libc.so.6 #3 0x00007ffff5101a26 in .annobin_assert.c_end () at /lib64/libc.so.6 #4 0x0000555555d7e1f1 in qmp_blockdev_create (job_id=0x555556baee40 "x", options=0x555557666610, errp=0x7fffffffc770) at block/create.c:69 #5 0x0000555555c96b52 in qmp_marshal_blockdev_create (args=0x7fffdc003830, ret=0x7fffffffc7f8, errp=0x7fffffffc7f0) at qapi/qapi-commands-block-core.c:1314 #6 0x0000555555deb0a0 in do_qmp_dispatch (cmds=0x55555645de70 <qmp_commands>, request=0x7fffdc005c70, allow_oob=false, errp=0x7fffffffc898) at qapi/qmp-dispatch.c:131 #7 0x0000555555deb2a1 in qmp_dispatch (cmds=0x55555645de70 <qmp_commands>, request=0x7fffdc005c70, allow_oob=false) at qapi/qmp-dispatch.c:174 With this patch applied, QEMU returns a QMP error: {'execute': 'blockdev-create', 'arguments': {'job-id': 'x', 'options': {'size': 0, 'driver': 'nfs', 'location': {'path': '/', 'server': {'host': '::1', 'type': 'inet'}}}}, 'id': 'x'} {"id": "x", "error": {"class": "GenericError", "desc": "Block driver 'nfs' not found or not supported"}} Cc: qemu-stable@nongnu.org Reported-by: Xu Tian <xutian@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit d90d5ca) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

When creating the admin queue in nvme_init() the variable that holds the number of queues created is modified before actual queue creation. This is a problem because if creating the queue fails then the variable is left in inconsistent state. This was actually observed when I tried to hotplug a nvme disk. The control got to nvme_file_open() which called nvme_init() which failed and thus nvme_close() was called which in turn called nvme_free_queue_pair() with queue being NULL. This lead to an instant crash: #0 0x000055d9507ec211 in nvme_free_queue_pair (bs=0x55d952ddb880, q=0x0) at block/nvme.c:164 #1 0x000055d9507ee180 in nvme_close (bs=0x55d952ddb880) at block/nvme.c:729 #2 0x000055d9507ee3d5 in nvme_file_open (bs=0x55d952ddb880, options=0x55d952bb1410, flags=147456, errp=0x7ffd8e19e200) at block/nvme.c:781 #3 0x000055d9507629f3 in bdrv_open_driver (bs=0x55d952ddb880, drv=0x55d95109c1e0 <bdrv_nvme>, node_name=0x0, options=0x55d952bb1410, open_flags=147456, errp=0x7ffd8e19e310) at block.c:1291 #4 0x000055d9507633d6 in bdrv_open_common (bs=0x55d952ddb880, file=0x0, options=0x55d952bb1410, errp=0x7ffd8e19e310) at block.c:1551 #5 0x000055d950766881 in bdrv_open_inherit (filename=0x0, reference=0x0, options=0x55d952bb1410, flags=32768, parent=0x55d9538ce420, child_role=0x55d950eaade0 <child_file>, errp=0x7ffd8e19e510) at block.c:3063 #6 0x000055d950765ae4 in bdrv_open_child_bs (filename=0x0, options=0x55d9541cdff0, bdref_key=0x55d950af33aa "file", parent=0x55d9538ce420, child_role=0x55d950eaade0 <child_file>, allow_none=true, errp=0x7ffd8e19e510) at block.c:2712 #7 0x000055d950766633 in bdrv_open_inherit (filename=0x0, reference=0x0, options=0x55d9541cdff0, flags=0, parent=0x0, child_role=0x0, errp=0x7ffd8e19e908) at block.c:3011 #8 0x000055d950766dba in bdrv_open (filename=0x0, reference=0x0, options=0x55d953d00390, flags=0, errp=0x7ffd8e19e908) at block.c:3156 #9 0x000055d9507cb635 in blk_new_open (filename=0x0, reference=0x0, options=0x55d953d00390, flags=0, errp=0x7ffd8e19e908) at block/block-backend.c:389 #10 0x000055d950465ec5 in blockdev_init (file=0x0, bs_opts=0x55d953d00390, errp=0x7ffd8e19e908) at blockdev.c:602 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Message-id: 927aae40b617ba7d4b6c7ffe74e6d7a2595f8e86.1562770546.git.mprivozn@redhat.com Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit 95667c3) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

Reading the RX_DATA register when the RX_FIFO is empty triggers an abort. This can be easily reproduced: $ qemu-system-arm -M emcraft-sf2 -monitor stdio -S QEMU 4.0.50 monitor - type 'help' for more information (qemu) x 0x40001010 Aborted (core dumped) (gdb) bt #1 0x00007f035874f895 in abort () at /lib64/libc.so.6 #2 0x00005628686591ff in fifo8_pop (fifo=0x56286a9a4c68) at util/fifo8.c:66 #3 0x00005628683e0b8e in fifo32_pop (fifo=0x56286a9a4c68) at include/qemu/fifo32.h:137 #4 0x00005628683e0efb in spi_read (opaque=0x56286a9a4850, addr=4, size=4) at hw/ssi/mss-spi.c:168 #5 0x0000562867f96801 in memory_region_read_accessor (mr=0x56286a9a4b60, addr=16, value=0x7ffeecb0c5c8, size=4, shift=0, mask=4294967295, attrs=...) at memory.c:439 #6 0x0000562867f96cdb in access_with_adjusted_size (addr=16, value=0x7ffeecb0c5c8, size=4, access_size_min=1, access_size_max=4, access_fn=0x562867f967c3 <memory_region_read_accessor>, mr=0x56286a9a4b60, attrs=...) at memory.c:569 #7 0x0000562867f99940 in memory_region_dispatch_read1 (mr=0x56286a9a4b60, addr=16, pval=0x7ffeecb0c5c8, size=4, attrs=...) at memory.c:1420 #8 0x0000562867f99a08 in memory_region_dispatch_read (mr=0x56286a9a4b60, addr=16, pval=0x7ffeecb0c5c8, size=4, attrs=...) at memory.c:1447 #9 0x0000562867f38721 in flatview_read_continue (fv=0x56286aec6360, addr=1073745936, attrs=..., buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4, addr1=16, l=4, mr=0x56286a9a4b60) at exec.c:3385 #10 0x0000562867f38874 in flatview_read (fv=0x56286aec6360, addr=1073745936, attrs=..., buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4) at exec.c:3423 #11 0x0000562867f388ea in address_space_read_full (as=0x56286aa3e890, addr=1073745936, attrs=..., buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4) at exec.c:3436 #12 0x0000562867f389c5 in address_space_rw (as=0x56286aa3e890, addr=1073745936, attrs=..., buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4, is_write=false) at exec.c:3466 #13 0x0000562867f3bdd7 in cpu_memory_rw_debug (cpu=0x56286aa19d00, addr=1073745936, buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4, is_write=0) at exec.c:3976 #14 0x000056286811ed51 in memory_dump (mon=0x56286a8c32d0, count=1, format=120, wsize=4, addr=1073745936, is_physical=0) at monitor/misc.c:730 #15 0x000056286811eff1 in hmp_memory_dump (mon=0x56286a8c32d0, qdict=0x56286b15c400) at monitor/misc.c:785 #16 0x00005628684740ee in handle_hmp_command (mon=0x56286a8c32d0, cmdline=0x56286a8caeb2 "0x40001010") at monitor/hmp.c:1082 From the datasheet "Actel SmartFusion Microcontroller Subsystem User's Guide" Rev.1, Table 13-3 "SPI Register Summary", this register has a reset value of 0. Check the FIFO is not empty before accessing it, else log an error message. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-id: 20190709113715.7761-3-philmd@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit c0bccee) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

In the previous commit we fixed a crash when the guest read a register that pop from an empty FIFO. By auditing the repository, we found another similar use with an easy way to reproduce: $ qemu-system-aarch64 -M xlnx-zcu102 -monitor stdio -S QEMU 4.0.50 monitor - type 'help' for more information (qemu) xp/b 0xfd4a0134 Aborted (core dumped) (gdb) bt #0 0x00007f6936dea57f in raise () at /lib64/libc.so.6 #1 0x00007f6936dd4895 in abort () at /lib64/libc.so.6 #2 0x0000561ad32975ec in xlnx_dp_aux_pop_rx_fifo (s=0x7f692babee70) at hw/display/xlnx_dp.c:431 #3 0x0000561ad3297dc0 in xlnx_dp_read (opaque=0x7f692babee70, offset=77, size=4) at hw/display/xlnx_dp.c:667 #4 0x0000561ad321b896 in memory_region_read_accessor (mr=0x7f692babf620, addr=308, value=0x7ffe05c1db88, size=4, shift=0, mask=4294967295, attrs=...) at memory.c:439 #5 0x0000561ad321bd70 in access_with_adjusted_size (addr=308, value=0x7ffe05c1db88, size=1, access_size_min=4, access_size_max=4, access_fn=0x561ad321b858 <memory_region_read_accessor>, mr=0x7f692babf620, attrs=...) at memory.c:569 #6 0x0000561ad321e9d5 in memory_region_dispatch_read1 (mr=0x7f692babf620, addr=308, pval=0x7ffe05c1db88, size=1, attrs=...) at memory.c:1420 #7 0x0000561ad321ea9d in memory_region_dispatch_read (mr=0x7f692babf620, addr=308, pval=0x7ffe05c1db88, size=1, attrs=...) at memory.c:1447 #8 0x0000561ad31bd742 in flatview_read_continue (fv=0x561ad69c04f0, addr=4249485620, attrs=..., buf=0x7ffe05c1dcf0 "\020\335\301\005\376\177", len=1, addr1=308, l=1, mr=0x7f692babf620) at exec.c:3385 #9 0x0000561ad31bd895 in flatview_read (fv=0x561ad69c04f0, addr=4249485620, attrs=..., buf=0x7ffe05c1dcf0 "\020\335\301\005\376\177", len=1) at exec.c:3423 #10 0x0000561ad31bd90b in address_space_read_full (as=0x561ad5bb3020, addr=4249485620, attrs=..., buf=0x7ffe05c1dcf0 "\020\335\301\005\376\177", len=1) at exec.c:3436 #11 0x0000561ad33b1c42 in address_space_read (len=1, buf=0x7ffe05c1dcf0 "\020\335\301\005\376\177", attrs=..., addr=4249485620, as=0x561ad5bb3020) at include/exec/memory.h:2131 #12 0x0000561ad33b1c42 in memory_dump (mon=0x561ad59c4530, count=1, format=120, wsize=1, addr=4249485620, is_physical=1) at monitor/misc.c:723 #13 0x0000561ad33b1fc1 in hmp_physical_memory_dump (mon=0x561ad59c4530, qdict=0x561ad6c6fd00) at monitor/misc.c:795 #14 0x0000561ad37b4a9f in handle_hmp_command (mon=0x561ad59c4530, cmdline=0x561ad59d0f22 "/b 0x00000000fd4a0134") at monitor/hmp.c:1082 Fix by checking the FIFO is not empty before popping from it. The datasheet is not clear about the reset value of this register, we choose to return '0'. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-id: 20190709113715.7761-4-philmd@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit a09ef50) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

It fixes heap-use-after-free which was found by clang's ASAN. Control flow of this use-after-free: main_thread: * Got SIGTERM and completes main loop * Calls migration_shutdown - migrate_fd_cancel (so, migration_thread begins to complete) - object_unref(OBJECT(current_migration)); migration_thread: * migration_iteration_finish -> schedule cleanup bh * object_unref(OBJECT(s)); (Now, current_migration is freed) * exits main_thread: * Calls vm_shutdown -> drain bdrvs -> main loop -> cleanup_bh -> use after free If you want to reproduce, these couple of sleeps will help: vl.c:4613: migration_shutdown(); + sleep(2); migration.c:3269: + sleep(1); trace_migration_thread_after_loop(); migration_iteration_finish(s); Original output: qemu-system-x86_64: terminating on signal 15 from pid 31980 (<unknown process>) ================================================================= ==31958==ERROR: AddressSanitizer: heap-use-after-free on address 0x61900001d210 at pc 0x555558a535ca bp 0x7fffffffb190 sp 0x7fffffffb188 READ of size 8 at 0x61900001d210 thread T0 (qemu-vm-0) #0 0x555558a535c9 in migrate_fd_cleanup migration/migration.c:1502:23 #1 0x5555594fde0a in aio_bh_call util/async.c:90:5 #2 0x5555594fe522 in aio_bh_poll util/async.c:118:13 #3 0x555559524783 in aio_poll util/aio-posix.c:725:17 #4 0x555559504fb3 in aio_wait_bh_oneshot util/aio-wait.c:71:5 #5 0x5555573bddf6 in virtio_blk_data_plane_stop hw/block/dataplane/virtio-blk.c:282:5 #6 0x5555589d5c09 in virtio_bus_stop_ioeventfd hw/virtio/virtio-bus.c:246:9 #7 0x5555589e9917 in virtio_pci_stop_ioeventfd hw/virtio/virtio-pci.c:287:5 #8 0x5555589e22bf in virtio_pci_vmstate_change hw/virtio/virtio-pci.c:1072:9 #9 0x555557628931 in virtio_vmstate_change hw/virtio/virtio.c:2257:9 #10 0x555557c36713 in vm_state_notify vl.c:1605:9 #11 0x55555716ef53 in do_vm_stop cpus.c:1074:9 #12 0x55555716eeff in vm_shutdown cpus.c:1092:12 #13 0x555557c4283e in main vl.c:4617:5 #14 0x7fffdfdb482f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) #15 0x555556ecb118 in _start (x86_64-softmmu/qemu-system-x86_64+0x1977118) 0x61900001d210 is located 144 bytes inside of 952-byte region [0x61900001d180,0x61900001d538) freed by thread T6 (live_migration) here: #0 0x555556f76782 in __interceptor_free /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:124:3 #1 0x555558d5fa94 in object_finalize qom/object.c:618:9 #2 0x555558d57651 in object_unref qom/object.c:1068:9 #3 0x555558a55588 in migration_thread migration/migration.c:3272:5 #4 0x5555595393f2 in qemu_thread_start util/qemu-thread-posix.c:502:9 #5 0x7fffe057f6b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9) previously allocated by thread T0 (qemu-vm-0) here: #0 0x555556f76b03 in __interceptor_malloc /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:146:3 #1 0x7ffff6ee37b8 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4f7b8) #2 0x555558d58031 in object_new qom/object.c:640:12 #3 0x555558a31f21 in migration_object_init migration/migration.c:139:25 #4 0x555557c41398 in main vl.c:4320:5 #5 0x7fffdfdb482f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) Thread T6 (live_migration) created by T0 (qemu-vm-0) here: #0 0x555556f5f0dd in pthread_create /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_interceptors.cc:210:3 #1 0x555559538cf9 in qemu_thread_create util/qemu-thread-posix.c:539:11 #2 0x555558a53304 in migrate_fd_connect migration/migration.c:3332:5 #3 0x555558a72bd8 in migration_channel_connect migration/channel.c:92:5 #4 0x555558a6ef87 in exec_start_outgoing_migration migration/exec.c:42:5 #5 0x555558a4f3c2 in qmp_migrate migration/migration.c:1922:9 #6 0x555558bb4f6a in qmp_marshal_migrate qapi/qapi-commands-migration.c:607:5 #7 0x555559363738 in do_qmp_dispatch qapi/qmp-dispatch.c:131:5 #8 0x555559362a15 in qmp_dispatch qapi/qmp-dispatch.c:174:11 #9 0x5555571bac15 in monitor_qmp_dispatch monitor.c:4124:11 #10 0x55555719a22d in monitor_qmp_bh_dispatcher monitor.c:4207:9 #11 0x5555594fde0a in aio_bh_call util/async.c:90:5 #12 0x5555594fe522 in aio_bh_poll util/async.c:118:13 #13 0x5555595201e0 in aio_dispatch util/aio-posix.c:460:5 #14 0x555559503553 in aio_ctx_dispatch util/async.c:261:5 #15 0x7ffff6ede196 in g_main_context_dispatch (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4a196) SUMMARY: AddressSanitizer: heap-use-after-free migration/migration.c:1502:23 in migrate_fd_cleanup Shadow bytes around the buggy address: 0x0c327fffb9f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c327fffba00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c327fffba10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c327fffba20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c327fffba30: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd =>0x0c327fffba40: fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c327fffba50: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c327fffba60: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c327fffba70: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c327fffba80: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c327fffba90: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Shadow gap: cc ==31958==ABORTING Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Message-Id: <20190408113343.2370-1-yury-kotov@yandex-team.ru> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Fixed up comment formatting (cherry picked from commit fd392cf) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

When 'system_reset' is called, the main loop clear the memory region cache before the BH has a chance to execute. Later when the deferred function is called, some assumptions that were made when scheduling them are no longer true when they actually execute. This is what happens using a virtio-blk device (fresh RHEL7.8 install): $ (sleep 12.3; echo system_reset; sleep 12.3; echo system_reset; sleep 1; echo q) \ | qemu-system-x86_64 -m 4G -smp 8 -boot menu=on \ -device virtio-blk-pci,id=image1,drive=drive_image1 \ -drive file=/var/lib/libvirt/images/rhel78.qcow2,if=none,id=drive_image1,format=qcow2,cache=none \ -device virtio-net-pci,netdev=net0,id=nic0,mac=52:54:00:c4:e7:84 \ -netdev tap,id=net0,script=/bin/true,downscript=/bin/true,vhost=on \ -monitor stdio -serial null -nographic (qemu) system_reset (qemu) system_reset (qemu) qemu-system-x86_64: hw/virtio/virtio.c:225: vring_get_region_caches: Assertion `caches != NULL' failed. Aborted (gdb) bt Thread 1 (Thread 0x7f109c17b680 (LWP 10939)): #0 0x00005604083296d1 in vring_get_region_caches (vq=0x56040a24bdd0) at hw/virtio/virtio.c:227 #1 0x000056040832972b in vring_avail_flags (vq=0x56040a24bdd0) at hw/virtio/virtio.c:235 #2 0x000056040832d13d in virtio_should_notify (vdev=0x56040a240630, vq=0x56040a24bdd0) at hw/virtio/virtio.c:1648 #3 0x000056040832d1f8 in virtio_notify_irqfd (vdev=0x56040a240630, vq=0x56040a24bdd0) at hw/virtio/virtio.c:1662 #4 0x00005604082d213d in notify_guest_bh (opaque=0x56040a243ec0) at hw/block/dataplane/virtio-blk.c:75 #5 0x000056040883dc35 in aio_bh_call (bh=0x56040a243f10) at util/async.c:90 #6 0x000056040883dccd in aio_bh_poll (ctx=0x560409161980) at util/async.c:118 #7 0x0000560408842af7 in aio_dispatch (ctx=0x560409161980) at util/aio-posix.c:460 #8 0x000056040883e068 in aio_ctx_dispatch (source=0x560409161980, callback=0x0, user_data=0x0) at util/async.c:261 #9 0x00007f10a8fca06d in g_main_context_dispatch () at /lib64/libglib-2.0.so.0 #10 0x0000560408841445 in glib_pollfds_poll () at util/main-loop.c:215 #11 0x00005604088414bf in os_host_main_loop_wait (timeout=0) at util/main-loop.c:238 #12 0x00005604088415c4 in main_loop_wait (nonblocking=0) at util/main-loop.c:514 #13 0x0000560408416b1e in main_loop () at vl.c:1923 #14 0x000056040841e0e8 in main (argc=20, argv=0x7ffc2c3f9c58, envp=0x7ffc2c3f9d00) at vl.c:4578 Fix this by cancelling the BH when the virtio dataplane is stopped. [This is version of the patch was modified as discussed with Philippe on the mailing list thread. --Stefan] Reported-by: Yihuang Yu <yihyu@redhat.com> Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Fixes: https://bugs.launchpad.net/qemu/+bug/1839428 Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190816171503.24761-1-philmd@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> (cherry picked from commit ebb6ff2) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

When 'system_reset' is called, the main loop clear the memory region cache before the BH has a chance to execute. Later when the deferred function is called, some assumptions that were made when scheduling them are no longer true when they actually execute. This is what happens using a virtio-blk device (fresh RHEL7.8 install): $ (sleep 12.3; echo system_reset; sleep 12.3; echo system_reset; sleep 1; echo q) \ | qemu-system-x86_64 -m 4G -smp 8 -boot menu=on \ -device virtio-blk-pci,id=image1,drive=drive_image1 \ -drive file=/var/lib/libvirt/images/rhel78.qcow2,if=none,id=drive_image1,format=qcow2,cache=none \ -device virtio-net-pci,netdev=net0,id=nic0,mac=52:54:00:c4:e7:84 \ -netdev tap,id=net0,script=/bin/true,downscript=/bin/true,vhost=on \ -monitor stdio -serial null -nographic (qemu) system_reset (qemu) system_reset (qemu) qemu-system-x86_64: hw/virtio/virtio.c:225: vring_get_region_caches: Assertion `caches != NULL' failed. Aborted (gdb) bt Thread 1 (Thread 0x7f109c17b680 (LWP 10939)): #0 0x00005604083296d1 in vring_get_region_caches (vq=0x56040a24bdd0) at hw/virtio/virtio.c:227 #1 0x000056040832972b in vring_avail_flags (vq=0x56040a24bdd0) at hw/virtio/virtio.c:235 #2 0x000056040832d13d in virtio_should_notify (vdev=0x56040a240630, vq=0x56040a24bdd0) at hw/virtio/virtio.c:1648 #3 0x000056040832d1f8 in virtio_notify_irqfd (vdev=0x56040a240630, vq=0x56040a24bdd0) at hw/virtio/virtio.c:1662 #4 0x00005604082d213d in notify_guest_bh (opaque=0x56040a243ec0) at hw/block/dataplane/virtio-blk.c:75 #5 0x000056040883dc35 in aio_bh_call (bh=0x56040a243f10) at util/async.c:90 #6 0x000056040883dccd in aio_bh_poll (ctx=0x560409161980) at util/async.c:118 #7 0x0000560408842af7 in aio_dispatch (ctx=0x560409161980) at util/aio-posix.c:460 #8 0x000056040883e068 in aio_ctx_dispatch (source=0x560409161980, callback=0x0, user_data=0x0) at util/async.c:261 #9 0x00007f10a8fca06d in g_main_context_dispatch () at /lib64/libglib-2.0.so.0 #10 0x0000560408841445 in glib_pollfds_poll () at util/main-loop.c:215 #11 0x00005604088414bf in os_host_main_loop_wait (timeout=0) at util/main-loop.c:238 #12 0x00005604088415c4 in main_loop_wait (nonblocking=0) at util/main-loop.c:514 #13 0x0000560408416b1e in main_loop () at vl.c:1923 #14 0x000056040841e0e8 in main (argc=20, argv=0x7ffc2c3f9c58, envp=0x7ffc2c3f9d00) at vl.c:4578 Fix this by cancelling the BH when the virtio dataplane is stopped. [This is version of the patch was modified as discussed with Philippe on the mailing list thread. --Stefan] Reported-by: Yihuang Yu <yihyu@redhat.com> Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Fixes: https://bugs.launchpad.net/qemu/+bug/1839428 Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190816171503.24761-1-philmd@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

When encounter error, multifd_send_thread should always notify who pay attention to it before exit. Otherwise it may block migration_thread at multifd_send_sync_main forever. Error as follow: ------------------------------------------------------------------------------- (gdb) bt #0 0x00007f4d669dfa0b in do_futex_wait.constprop.1 () from /lib64/libpthread.so.0 #1 0x00007f4d669dfa9f in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0 #2 0x00007f4d669dfb3b in sem_wait@@GLIBC_2.2.5 () from /lib64/libpthread.so.0 #3 0x0000562ccf0a5614 in qemu_sem_wait (sem=sem@entry=0x562cd1b698e8) at util/qemu-thread-posix.c:319 #4 0x0000562ccecb4752 in multifd_send_sync_main (rs=<optimized out>) at /qemu/migration/ram.c:1099 #5 0x0000562ccecb95f4 in ram_save_iterate (f=0x562cd0ecc000, opaque=<optimized out>) at /qemu/migration/ram.c:3550 #6 0x0000562ccef43c23 in qemu_savevm_state_iterate (f=0x562cd0ecc000, postcopy=false) at migration/savevm.c:1189 #7 0x0000562ccef3dcf3 in migration_iteration_run (s=0x562cd09fabf0) at migration/migration.c:3131 #8 migration_thread (opaque=opaque@entry=0x562cd09fabf0) at migration/migration.c:3258 #9 0x0000562ccf0a4c26 in qemu_thread_start (args=<optimized out>) at util/qemu-thread-posix.c:502 #10 0x00007f4d669d9e25 in start_thread () from /lib64/libpthread.so.0 #11 0x00007f4d6670635d in clone () from /lib64/libc.so.6 (gdb) f 4 #4 0x0000562ccecb4752 in multifd_send_sync_main (rs=<optimized out>) at /qemu/migration/ram.c:1099 1099 qemu_sem_wait(&p->sem_sync); (gdb) list 1094 } 1095 for (i = 0; i < migrate_multifd_channels(); i++) { 1096 MultiFDSendParams *p = &multifd_send_state->params[i]; 1097 1098 trace_multifd_send_sync_main_wait(p->id); 1099 qemu_sem_wait(&p->sem_sync); 1100 } 1101 trace_multifd_send_sync_main(multifd_send_state->packet_num); 1102 } 1103 (gdb) p i $1 = 0 (gdb) p multifd_send_state->params[0].pending_job $2 = 2 //It means the job before MULTIFD_FLAG_SYNC has already fail (gdb) p multifd_send_state->params[0].quit $3 = true Signed-off-by: Ivan Ren <ivanren@tencent.com> Message-Id: <1567044996-2362-1-git-send-email-ivanren@tencent.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

The 'blockdev-create' QMP command was introduced as experimental feature in commit b0292b8, using the assert() debug call. It got promoted to 'stable' command in 3fb588a, but the assert call was not removed. Some block drivers are optional, and bdrv_find_format() might return a NULL value, triggering the assertion. Stable code is not expected to abort, so return an error instead. This is easily reproducible when libnfs is not installed: ./configure [...] module support no Block whitelist (rw) Block whitelist (ro) libiscsi support yes libnfs support no [...] Start QEMU: $ qemu-system-x86_64 -S -qmp unix:/tmp/qemu.qmp,server,nowait Send the 'blockdev-create' with the 'nfs' driver: $ ( cat << 'EOF' {'execute': 'qmp_capabilities'} {'execute': 'blockdev-create', 'arguments': {'job-id': 'x', 'options': {'size': 0, 'driver': 'nfs', 'location': {'path': '/', 'server': {'host': '::1', 'type': 'inet'}}}}, 'id': 'x'} EOF ) | socat STDIO UNIX:/tmp/qemu.qmp {"QMP": {"version": {"qemu": {"micro": 50, "minor": 1, "major": 4}, "package": "v4.1.0-733-g89ea03a7dc"}, "capabilities": ["oob"]}} {"return": {}} QEMU crashes: $ gdb qemu-system-x86_64 core Program received signal SIGSEGV, Segmentation fault. (gdb) bt #0 0x00007ffff510957f in raise () at /lib64/libc.so.6 #1 0x00007ffff50f3895 in abort () at /lib64/libc.so.6 #2 0x00007ffff50f3769 in _nl_load_domain.cold.0 () at /lib64/libc.so.6 #3 0x00007ffff5101a26 in .annobin_assert.c_end () at /lib64/libc.so.6 #4 0x0000555555d7e1f1 in qmp_blockdev_create (job_id=0x555556baee40 "x", options=0x555557666610, errp=0x7fffffffc770) at block/create.c:69 #5 0x0000555555c96b52 in qmp_marshal_blockdev_create (args=0x7fffdc003830, ret=0x7fffffffc7f8, errp=0x7fffffffc7f0) at qapi/qapi-commands-block-core.c:1314 #6 0x0000555555deb0a0 in do_qmp_dispatch (cmds=0x55555645de70 <qmp_commands>, request=0x7fffdc005c70, allow_oob=false, errp=0x7fffffffc898) at qapi/qmp-dispatch.c:131 #7 0x0000555555deb2a1 in qmp_dispatch (cmds=0x55555645de70 <qmp_commands>, request=0x7fffdc005c70, allow_oob=false) at qapi/qmp-dispatch.c:174 With this patch applied, QEMU returns a QMP error: {'execute': 'blockdev-create', 'arguments': {'job-id': 'x', 'options': {'size': 0, 'driver': 'nfs', 'location': {'path': '/', 'server': {'host': '::1', 'type': 'inet'}}}}, 'id': 'x'} {"id": "x", "error": {"class": "GenericError", "desc": "Block driver 'nfs' not found or not supported"}} Cc: qemu-stable@nongnu.org Reported-by: Xu Tian <xutian@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Currently offloads disabled by guest via the VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET command are not preserved on VM migration. Instead all offloads reported by guest features (via VIRTIO_PCI_GUEST_FEATURES) get enabled. What happens is: first the VirtIONet::curr_guest_offloads gets restored and offloads are getting set correctly: #0 qemu_set_offload (nc=0x555556a11400, csum=1, tso4=0, tso6=0, ecn=0, ufo=0) at net/net.c:474 #1 virtio_net_apply_guest_offloads (n=0x555557701ca0) at hw/net/virtio-net.c:720 #2 virtio_net_post_load_device (opaque=0x555557701ca0, version_id=11) at hw/net/virtio-net.c:2334 #3 vmstate_load_state (f=0x5555569dc010, vmsd=0x555556577c80 <vmstate_virtio_net_device>, opaque=0x555557701ca0, version_id=11) at migration/vmstate.c:168 #4 virtio_load (vdev=0x555557701ca0, f=0x5555569dc010, version_id=11) at hw/virtio/virtio.c:2197 #5 virtio_device_get (f=0x5555569dc010, opaque=0x555557701ca0, size=0, field=0x55555668cd00 <__compound_literal.5>) at hw/virtio/virtio.c:2036 #6 vmstate_load_state (f=0x5555569dc010, vmsd=0x555556577ce0 <vmstate_virtio_net>, opaque=0x555557701ca0, version_id=11) at migration/vmstate.c:143 #7 vmstate_load (f=0x5555569dc010, se=0x5555578189e0) at migration/savevm.c:829 #8 qemu_loadvm_section_start_full (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2211 #9 qemu_loadvm_state_main (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2395 #10 qemu_loadvm_state (f=0x5555569dc010) at migration/savevm.c:2467 #11 process_incoming_migration_co (opaque=0x0) at migration/migration.c:449 However later on the features are getting restored, and offloads get reset to everything supported by features: #0 qemu_set_offload (nc=0x555556a11400, csum=1, tso4=1, tso6=1, ecn=0, ufo=0) at net/net.c:474 #1 virtio_net_apply_guest_offloads (n=0x555557701ca0) at hw/net/virtio-net.c:720 #2 virtio_net_set_features (vdev=0x555557701ca0, features=5104441767) at hw/net/virtio-net.c:773 #3 virtio_set_features_nocheck (vdev=0x555557701ca0, val=5104441767) at hw/virtio/virtio.c:2052 #4 virtio_load (vdev=0x555557701ca0, f=0x5555569dc010, version_id=11) at hw/virtio/virtio.c:2220 #5 virtio_device_get (f=0x5555569dc010, opaque=0x555557701ca0, size=0, field=0x55555668cd00 <__compound_literal.5>) at hw/virtio/virtio.c:2036 #6 vmstate_load_state (f=0x5555569dc010, vmsd=0x555556577ce0 <vmstate_virtio_net>, opaque=0x555557701ca0, version_id=11) at migration/vmstate.c:143 #7 vmstate_load (f=0x5555569dc010, se=0x5555578189e0) at migration/savevm.c:829 #8 qemu_loadvm_section_start_full (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2211 #9 qemu_loadvm_state_main (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2395 #10 qemu_loadvm_state (f=0x5555569dc010) at migration/savevm.c:2467 #11 process_incoming_migration_co (opaque=0x0) at migration/migration.c:449 Fix this by preserving the state in saved_guest_offloads field and pushing out offload initialization to the new post load hook. Cc: qemu-stable@nongnu.org Signed-off-by: Mikhail Sennikovsky <mikhail.sennikovskii@cloud.ionos.com> Signed-off-by: Jason Wang <jasowang@redhat.com>

Currently offloads disabled by guest via the VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET command are not preserved on VM migration. Instead all offloads reported by guest features (via VIRTIO_PCI_GUEST_FEATURES) get enabled. What happens is: first the VirtIONet::curr_guest_offloads gets restored and offloads are getting set correctly: #0 qemu_set_offload (nc=0x555556a11400, csum=1, tso4=0, tso6=0, ecn=0, ufo=0) at net/net.c:474 #1 virtio_net_apply_guest_offloads (n=0x555557701ca0) at hw/net/virtio-net.c:720 #2 virtio_net_post_load_device (opaque=0x555557701ca0, version_id=11) at hw/net/virtio-net.c:2334 #3 vmstate_load_state (f=0x5555569dc010, vmsd=0x555556577c80 <vmstate_virtio_net_device>, opaque=0x555557701ca0, version_id=11) at migration/vmstate.c:168 #4 virtio_load (vdev=0x555557701ca0, f=0x5555569dc010, version_id=11) at hw/virtio/virtio.c:2197 #5 virtio_device_get (f=0x5555569dc010, opaque=0x555557701ca0, size=0, field=0x55555668cd00 <__compound_literal.5>) at hw/virtio/virtio.c:2036 #6 vmstate_load_state (f=0x5555569dc010, vmsd=0x555556577ce0 <vmstate_virtio_net>, opaque=0x555557701ca0, version_id=11) at migration/vmstate.c:143 #7 vmstate_load (f=0x5555569dc010, se=0x5555578189e0) at migration/savevm.c:829 #8 qemu_loadvm_section_start_full (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2211 #9 qemu_loadvm_state_main (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2395 #10 qemu_loadvm_state (f=0x5555569dc010) at migration/savevm.c:2467 #11 process_incoming_migration_co (opaque=0x0) at migration/migration.c:449 However later on the features are getting restored, and offloads get reset to everything supported by features: #0 qemu_set_offload (nc=0x555556a11400, csum=1, tso4=1, tso6=1, ecn=0, ufo=0) at net/net.c:474 #1 virtio_net_apply_guest_offloads (n=0x555557701ca0) at hw/net/virtio-net.c:720 #2 virtio_net_set_features (vdev=0x555557701ca0, features=5104441767) at hw/net/virtio-net.c:773 #3 virtio_set_features_nocheck (vdev=0x555557701ca0, val=5104441767) at hw/virtio/virtio.c:2052 #4 virtio_load (vdev=0x555557701ca0, f=0x5555569dc010, version_id=11) at hw/virtio/virtio.c:2220 #5 virtio_device_get (f=0x5555569dc010, opaque=0x555557701ca0, size=0, field=0x55555668cd00 <__compound_literal.5>) at hw/virtio/virtio.c:2036 #6 vmstate_load_state (f=0x5555569dc010, vmsd=0x555556577ce0 <vmstate_virtio_net>, opaque=0x555557701ca0, version_id=11) at migration/vmstate.c:143 #7 vmstate_load (f=0x5555569dc010, se=0x5555578189e0) at migration/savevm.c:829 #8 qemu_loadvm_section_start_full (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2211 #9 qemu_loadvm_state_main (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2395 #10 qemu_loadvm_state (f=0x5555569dc010) at migration/savevm.c:2467 #11 process_incoming_migration_co (opaque=0x0) at migration/migration.c:449 Fix this by preserving the state in saved_guest_offloads field and pushing out offload initialization to the new post load hook. Cc: qemu-stable@nongnu.org Signed-off-by: Mikhail Sennikovsky <mikhail.sennikovskii@cloud.ionos.com> Signed-off-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit 7788c3f) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

Update commands-win32.c

4fa34bd

qga: replace GetIfEntry The data obtained by GetIfEntry is 32 bits, and it may overflow. Thus using GetIfEntry2 instead of GetIfEntry.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

qga: replace GetIfEntry with GetIfEntry2#5

qga: replace GetIfEntry with GetIfEntry2#5
luzhipeng-zte wants to merge 1 commit into
mdroth:qgafrom
luzhipeng-zte:qga

luzhipeng-zte commented Oct 26, 2017

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

luzhipeng-zte commented Oct 26, 2017

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant