Skip to content
Snippets Groups Projects
  • Vladimir Sementsov-Ogievskiy's avatar
    78632a3d
    monitor: hmp_qemu_io: acquire aio contex, fix crash · 78632a3d
    Vladimir Sementsov-Ogievskiy authored
    
    Max reported the following bug:
    
    $ ./qemu-img create -f raw src.img 1G
    $ ./qemu-img create -f raw dst.img 1G
    
    $ (echo '
       {"execute":"qmp_capabilities"}
       {"execute":"blockdev-mirror",
        "arguments":{"job-id":"mirror",
                     "device":"source",
                     "target":"target",
                     "sync":"full",
                     "filter-node-name":"mirror-top"}}
    '; sleep 3; echo '
       {"execute":"human-monitor-command",
        "arguments":{"command-line":
                     "qemu-io mirror-top \"write 0 1G\""}}') \
    | x86_64-softmmu/qemu-system-x86_64 \
       -qmp stdio \
       -blockdev file,node-name=source,filename=src.img \
       -blockdev file,node-name=target,filename=dst.img \
       -object iothread,id=iothr0 \
       -device virtio-blk,drive=source,iothread=iothr0
    
    crashes:
    
    0  raise () at /usr/lib/libc.so.6
    1  abort () at /usr/lib/libc.so.6
    2  error_exit
       (err=<optimized out>,
       msg=msg@entry=0x55fbb1634790 <__func__.27> "qemu_mutex_unlock_impl")
       at ../util/qemu-thread-posix.c:37
    3  qemu_mutex_unlock_impl
       (mutex=mutex@entry=0x55fbb25ab6e0,
       file=file@entry=0x55fbb1636957 "../util/async.c",
       line=line@entry=650)
       at ../util/qemu-thread-posix.c:109
    4  aio_context_release (ctx=ctx@entry=0x55fbb25ab680) at ../util/async.c:650
    5  bdrv_do_drained_begin
       (bs=bs@entry=0x55fbb3a87000, recursive=recursive@entry=false,
       parent=parent@entry=0x0,
       ignore_bds_parents=ignore_bds_parents@entry=false,
       poll=poll@entry=true) at ../block/io.c:441
    6  bdrv_do_drained_begin
       (poll=true, ignore_bds_parents=false, parent=0x0, recursive=false,
       bs=0x55fbb3a87000) at ../block/io.c:448
    7  blk_drain (blk=0x55fbb26c5a00) at ../block/block-backend.c:1718
    8  blk_unref (blk=0x55fbb26c5a00) at ../block/block-backend.c:498
    9  blk_unref (blk=0x55fbb26c5a00) at ../block/block-backend.c:491
    10 hmp_qemu_io (mon=0x7fffaf3fc7d0, qdict=<optimized out>)
       at ../block/monitor/block-hmp-cmds.c:628
    
    man pthread_mutex_unlock
    ...
        EPERM  The  mutex type is PTHREAD_MUTEX_ERRORCHECK or
        PTHREAD_MUTEX_RECURSIVE, or the mutex is a robust mutex, and the
        current thread does not own the mutex.
    
    So, thread doesn't own the mutex. And we have iothread here.
    
    Next, note that AIO_WAIT_WHILE() documents that ctx must be acquired
    exactly once by caller. But where is it acquired in the call stack?
    Seems nowhere.
    
    qemuio_command do acquire aio context.. But we need context acquired
    around blk_unref() as well and actually around blk_insert_bs() too.
    
    Let's refactor qemuio_command so that it doesn't acquire aio context
    but callers do that instead. This way we can cleanly acquire aio
    context in hmp_qemu_io() around all three calls.
    
    Reported-by: default avatarMax Reitz <mreitz@redhat.com>
    Signed-off-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
    Message-Id: <20210423134233.51495-1-vsementsov@virtuozzo.com>
    [mreitz: Fixed comment]
    Signed-off-by: default avatarMax Reitz <mreitz@redhat.com>
    78632a3d
    History
    monitor: hmp_qemu_io: acquire aio contex, fix crash
    Vladimir Sementsov-Ogievskiy authored
    
    Max reported the following bug:
    
    $ ./qemu-img create -f raw src.img 1G
    $ ./qemu-img create -f raw dst.img 1G
    
    $ (echo '
       {"execute":"qmp_capabilities"}
       {"execute":"blockdev-mirror",
        "arguments":{"job-id":"mirror",
                     "device":"source",
                     "target":"target",
                     "sync":"full",
                     "filter-node-name":"mirror-top"}}
    '; sleep 3; echo '
       {"execute":"human-monitor-command",
        "arguments":{"command-line":
                     "qemu-io mirror-top \"write 0 1G\""}}') \
    | x86_64-softmmu/qemu-system-x86_64 \
       -qmp stdio \
       -blockdev file,node-name=source,filename=src.img \
       -blockdev file,node-name=target,filename=dst.img \
       -object iothread,id=iothr0 \
       -device virtio-blk,drive=source,iothread=iothr0
    
    crashes:
    
    0  raise () at /usr/lib/libc.so.6
    1  abort () at /usr/lib/libc.so.6
    2  error_exit
       (err=<optimized out>,
       msg=msg@entry=0x55fbb1634790 <__func__.27> "qemu_mutex_unlock_impl")
       at ../util/qemu-thread-posix.c:37
    3  qemu_mutex_unlock_impl
       (mutex=mutex@entry=0x55fbb25ab6e0,
       file=file@entry=0x55fbb1636957 "../util/async.c",
       line=line@entry=650)
       at ../util/qemu-thread-posix.c:109
    4  aio_context_release (ctx=ctx@entry=0x55fbb25ab680) at ../util/async.c:650
    5  bdrv_do_drained_begin
       (bs=bs@entry=0x55fbb3a87000, recursive=recursive@entry=false,
       parent=parent@entry=0x0,
       ignore_bds_parents=ignore_bds_parents@entry=false,
       poll=poll@entry=true) at ../block/io.c:441
    6  bdrv_do_drained_begin
       (poll=true, ignore_bds_parents=false, parent=0x0, recursive=false,
       bs=0x55fbb3a87000) at ../block/io.c:448
    7  blk_drain (blk=0x55fbb26c5a00) at ../block/block-backend.c:1718
    8  blk_unref (blk=0x55fbb26c5a00) at ../block/block-backend.c:498
    9  blk_unref (blk=0x55fbb26c5a00) at ../block/block-backend.c:491
    10 hmp_qemu_io (mon=0x7fffaf3fc7d0, qdict=<optimized out>)
       at ../block/monitor/block-hmp-cmds.c:628
    
    man pthread_mutex_unlock
    ...
        EPERM  The  mutex type is PTHREAD_MUTEX_ERRORCHECK or
        PTHREAD_MUTEX_RECURSIVE, or the mutex is a robust mutex, and the
        current thread does not own the mutex.
    
    So, thread doesn't own the mutex. And we have iothread here.
    
    Next, note that AIO_WAIT_WHILE() documents that ctx must be acquired
    exactly once by caller. But where is it acquired in the call stack?
    Seems nowhere.
    
    qemuio_command do acquire aio context.. But we need context acquired
    around blk_unref() as well and actually around blk_insert_bs() too.
    
    Let's refactor qemuio_command so that it doesn't acquire aio context
    but callers do that instead. This way we can cleanly acquire aio
    context in hmp_qemu_io() around all three calls.
    
    Reported-by: default avatarMax Reitz <mreitz@redhat.com>
    Signed-off-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
    Message-Id: <20210423134233.51495-1-vsementsov@virtuozzo.com>
    [mreitz: Fixed comment]
    Signed-off-by: default avatarMax Reitz <mreitz@redhat.com>