Skip to content
Snippets Groups Projects
  • Stefan Hajnoczi's avatar
    abfcd276
    dma-helpers: prevent dma_blk_cb() vs dma_aio_cancel() race · abfcd276
    Stefan Hajnoczi authored
    
    dma_blk_cb() only takes the AioContext lock around ->io_func(). That
    means the rest of dma_blk_cb() is not protected. In particular, the
    DMAAIOCB field accesses happen outside the lock.
    
    There is a race when the main loop thread holds the AioContext lock and
    invokes scsi_device_purge_requests() -> bdrv_aio_cancel() ->
    dma_aio_cancel() while an IOThread executes dma_blk_cb(). The dbs->acb
    field determines how cancellation proceeds. If dma_aio_cancel() sees
    dbs->acb == NULL while dma_blk_cb() is still running, the request can be
    completed twice (-ECANCELED and the actual return value).
    
    The following assertion can occur with virtio-scsi when an IOThread is
    used:
    
      ../hw/scsi/scsi-disk.c:368: scsi_dma_complete: Assertion `r->req.aiocb != NULL' failed.
    
    Fix the race by holding the AioContext across dma_blk_cb(). Now
    dma_aio_cancel() under the AioContext lock will not see
    inconsistent/intermediate states.
    
    Cc: Paolo Bonzini <pbonzini@redhat.com>
    Reviewed-by: default avatarEric Blake <eblake@redhat.com>
    Signed-off-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
    Message-Id: <20230221212218.1378734-3-stefanha@redhat.com>
    Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
    abfcd276
    History
    dma-helpers: prevent dma_blk_cb() vs dma_aio_cancel() race
    Stefan Hajnoczi authored
    
    dma_blk_cb() only takes the AioContext lock around ->io_func(). That
    means the rest of dma_blk_cb() is not protected. In particular, the
    DMAAIOCB field accesses happen outside the lock.
    
    There is a race when the main loop thread holds the AioContext lock and
    invokes scsi_device_purge_requests() -> bdrv_aio_cancel() ->
    dma_aio_cancel() while an IOThread executes dma_blk_cb(). The dbs->acb
    field determines how cancellation proceeds. If dma_aio_cancel() sees
    dbs->acb == NULL while dma_blk_cb() is still running, the request can be
    completed twice (-ECANCELED and the actual return value).
    
    The following assertion can occur with virtio-scsi when an IOThread is
    used:
    
      ../hw/scsi/scsi-disk.c:368: scsi_dma_complete: Assertion `r->req.aiocb != NULL' failed.
    
    Fix the race by holding the AioContext across dma_blk_cb(). Now
    dma_aio_cancel() under the AioContext lock will not see
    inconsistent/intermediate states.
    
    Cc: Paolo Bonzini <pbonzini@redhat.com>
    Reviewed-by: default avatarEric Blake <eblake@redhat.com>
    Signed-off-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
    Message-Id: <20230221212218.1378734-3-stefanha@redhat.com>
    Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>