    75dcb4d7
    virtio-blk: fix host notifier issues during dataplane start/stop · 75dcb4d7
    Stefan Hajnoczi authored
    
    
    The main loop thread can consume 100% CPU when using --device
    virtio-blk-pci,iothread=<iothread>. ppoll() constantly returns but
    reading virtqueue host notifiers fails with EAGAIN. The file descriptors
    are stale and remain registered with the AioContext because of bugs in
    the virtio-blk dataplane start/stop code.
    
    The problem is that the dataplane start/stop code involves drain
    operations, which call virtio_blk_drained_begin() and
    virtio_blk_drained_end() at points where the host notifier is not
    operational:
    - In virtio_blk_data_plane_start(), blk_set_aio_context() drains after
      vblk->dataplane_started has been set to true but the host notifier has
      not been attached yet.
    - In virtio_blk_data_plane_stop(), blk_drain() and blk_set_aio_context()
      drain after the host notifier has already been detached but with
      vblk->dataplane_started still set to true.
    
    I would like to simplify ->ioeventfd_start/stop() to avoid interactions
    with drain entirely, but couldn't find a way to do that. Instead, this
    patch accepts the fragile nature of the code and reorders it so that
    vblk->dataplane_started is false during drain operations. This way the
    virtio_blk_drained_begin() and virtio_blk_drained_end() calls don't
    touch the host notifier. The result is that
    virtio_blk_data_plane_start() and virtio_blk_data_plane_stop() have
    complete control over the host notifier and stale file descriptors are
    no longer left in the AioContext.
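
    The effect of the reordering can be illustrated with a toy model. This is
    a sketch under stated assumptions, not QEMU code: ToyBlk, toy_drain(),
    toy_start() and toy_stop() are hypothetical names, and a plain counter
    stands in for host notifier registrations in the AioContext. The only
    behaviour taken from the description above is that the drained_begin/end
    callbacks touch the host notifier only while dataplane_started is true.

```c
#include <stdbool.h>

/* Toy model only. All names here are hypothetical, not QEMU APIs. */
typedef struct {
    bool dataplane_started;
    int notifier_registrations; /* times the notifier fd is registered */
} ToyBlk;

/* Models virtio_blk_drained_begin(): detach only if dataplane is started. */
static void toy_drained_begin(ToyBlk *s)
{
    if (s->dataplane_started && s->notifier_registrations > 0) {
        s->notifier_registrations--;
    }
}

/* Models virtio_blk_drained_end(): re-attach only if dataplane is started. */
static void toy_drained_end(ToyBlk *s)
{
    if (s->dataplane_started) {
        s->notifier_registrations++;
    }
}

/* Stands in for a drain such as blk_drain() or blk_set_aio_context(). */
static void toy_drain(ToyBlk *s)
{
    toy_drained_begin(s);
    toy_drained_end(s);
}

/*
 * Start sequence. Returns the number of registrations afterwards;
 * exactly 1 is expected, anything more is a stale registration.
 */
int toy_start(bool flag_set_before_drain)
{
    ToyBlk s = { false, 0 };

    if (flag_set_before_drain) {
        s.dataplane_started = true;   /* old ordering */
    }
    toy_drain(&s);
    s.dataplane_started = true;       /* new ordering: set after the drain */
    s.notifier_registrations++;       /* start code attaches the notifier */
    return s.notifier_registrations;
}

/*
 * Stop sequence. Returns the number of registrations afterwards;
 * 0 is expected, anything more is a stale registration.
 */
int toy_stop(bool flag_cleared_before_drain)
{
    ToyBlk s = { true, 1 };

    if (flag_cleared_before_drain) {
        s.dataplane_started = false;  /* new ordering */
    }
    s.notifier_registrations--;       /* stop code detaches the notifier */
    toy_drain(&s);
    s.dataplane_started = false;      /* old ordering: cleared after drain */
    return s.notifier_registrations;
}
```

    In this model the old orderings leave one extra registration behind
    (toy_start(true) yields 2 and toy_stop(false) yields 1), while the new
    orderings yield exactly 1 and 0. The real code paths involve more state
    than this counter captures.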
    
    This patch fixes the 100% CPU consumption in the main loop thread and
    correctly moves host notifier processing to the IOThread.
    
    Fixes: 1665d932 ("virtio-blk: implement BlockDevOps->drained_begin()")
    Reported-by: Lukáš Doktor <ldoktor@redhat.com>
    Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    Tested-by: Lukas Doktor <ldoktor@redhat.com>
    Message-id: 20230704151527.193586-1-stefanha@redhat.com
    Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>