    66547f41
    block/nvme: invoke blk_io_plug_call() outside q->lock · 66547f41
    Stefan Hajnoczi authored
    
    
    blk_io_plug_call() is invoked outside a blk_io_plug()/blk_io_unplug()
    section while opening the NVMe drive from:
    
      nvme_file_open() ->
      nvme_init() ->
      nvme_identify() ->
      nvme_admin_cmd_sync() ->
      nvme_submit_command() ->
      blk_io_plug_call()
    
    blk_io_plug_call() immediately invokes the given callback when the
    current thread is not plugged, as is the case during nvme_file_open().
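
The immediate-versus-deferred behavior described above can be sketched with a small thread-local model. This is a simplified illustration, not QEMU's implementation: the `model_*` names, the fixed-size `deferred` array, and the `bump` demo callback are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*plug_fn)(void *opaque);

#define MAX_DEFERRED 8  /* arbitrary limit chosen for this sketch */

/* Per-thread state: plug nesting depth plus the calls deferred so far. */
static _Thread_local unsigned plug_depth;
static _Thread_local struct {
    plug_fn fn;
    void *opaque;
} deferred[MAX_DEFERRED];
static _Thread_local size_t n_deferred;

/* Models entering a plugged section. */
static void model_plug(void)
{
    plug_depth++;
}

/* Models the plug call: run now if unplugged, otherwise defer. */
static void model_plug_call(plug_fn fn, void *opaque)
{
    if (plug_depth == 0) {
        fn(opaque);  /* not plugged: the callback runs in the caller's context */
        return;
    }
    assert(n_deferred < MAX_DEFERRED);
    deferred[n_deferred].fn = fn;
    deferred[n_deferred].opaque = opaque;
    n_deferred++;
}

/* Models leaving a plugged section: the outermost unplug runs deferred calls. */
static void model_unplug(void)
{
    assert(plug_depth > 0);
    if (--plug_depth > 0) {
        return;
    }
    size_t n = n_deferred;
    n_deferred = 0;
    for (size_t i = 0; i < n; i++) {
        deferred[i].fn(deferred[i].opaque);
    }
}

/* Hypothetical demo callback: increments the int that opaque points to. */
static void bump(void *opaque)
{
    ++*(int *)opaque;
}
```

In this model, calling `model_plug_call(bump, &counter)` with no enclosing `model_plug()` increments `counter` before the call returns. That corresponds to the situation during nvme_file_open(), where the callback runs while the caller is still executing.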
    
    Unfortunately, nvme_submit_command() calls blk_io_plug_call() with
    q->lock still held:
    
        ...
        q->sq.tail = (q->sq.tail + 1) % NVME_QUEUE_SIZE;
        q->need_kick++;
        blk_io_plug_call(nvme_unplug_fn, q);
        qemu_mutex_unlock(&q->lock);
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    
    nvme_unplug_fn() deadlocks trying to acquire q->lock because the lock is
    already acquired by the same thread. The symptom is that QEMU hangs
    during startup while opening the NVMe drive.
    
    Fix this by moving the blk_io_plug_call() outside q->lock. This is safe
    because no other thread runs code related to this queue and
    blk_io_plug_call()'s internal state is immune to thread safety issues
    since it is thread-local.
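
The two call orderings can be compared in a minimal sketch, assuming a POSIX error-checking mutex so that the self-relock is reported as EDEADLK instead of hanging. `DemoQueue` and the `demo_*` functions are hypothetical stand-ins for the NVMe queue, nvme_unplug_fn(), and blk_io_plug_call() on an unplugged thread. QEMU's own mutex is not assumed to behave like the error-checking one used here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;
    int need_kick;     /* submissions not yet signalled to the device */
    int kicks;         /* doorbell writes performed (modelled as a counter) */
    int relock_error;  /* error returned when the callback failed to lock */
} DemoQueue;

static void demo_queue_init(DemoQueue *q)
{
    pthread_mutexattr_t attr;

    pthread_mutexattr_init(&attr);
    /* Error-checking type: relocking from the owner thread returns EDEADLK. */
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&q->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    q->need_kick = 0;
    q->kicks = 0;
    q->relock_error = 0;
}

/* Stand-in for the unplug callback: it takes the queue lock itself. */
static void demo_unplug_fn(void *opaque)
{
    DemoQueue *q = opaque;
    int ret = pthread_mutex_lock(&q->lock);

    if (ret != 0) {
        q->relock_error = ret;  /* a normal mutex would hang here instead */
        return;
    }
    if (q->need_kick) {
        q->kicks++;
        q->need_kick = 0;
    }
    pthread_mutex_unlock(&q->lock);
}

/* Stand-in for the plug call on an unplugged thread: runs fn immediately. */
static void demo_plug_call(void (*fn)(void *), void *opaque)
{
    fn(opaque);
}

/* Ordering before the fix: the callback runs with the lock still held. */
static void demo_submit_buggy(DemoQueue *q)
{
    pthread_mutex_lock(&q->lock);
    q->need_kick++;
    demo_plug_call(demo_unplug_fn, q);
    pthread_mutex_unlock(&q->lock);
}

/* Ordering after the fix: release the lock, then make the plug call. */
static void demo_submit_fixed(DemoQueue *q)
{
    pthread_mutex_lock(&q->lock);
    q->need_kick++;
    pthread_mutex_unlock(&q->lock);
    demo_plug_call(demo_unplug_fn, q);
}
```

With the buggy ordering, the callback records EDEADLK and performs no kick. With the fixed ordering, the callback acquires the lock normally and the pending submissions are kicked.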
    
    Reported-by: Lukáš Doktor <ldoktor@redhat.com>
    Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    Tested-by: Lukas Doktor <ldoktor@redhat.com>
    Message-id: 20230712191628.252806-1-stefanha@redhat.com
    Fixes: f2e59000 ("block/nvme: convert to blk_io_plug_call() API")
    Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>