virtio-blk: fix host notifier issues during dataplane start/stop
The main loop thread can consume 100% CPU when using --device
virtio-blk-pci,iothread=<iothread>. ppoll() constantly returns but reading
virtqueue host notifiers fails with EAGAIN. The file descriptors are stale
and remain registered with the AioContext because of bugs in the virtio-blk
dataplane start/stop code.

The problem is that the dataplane start/stop code involves drain
operations, which call virtio_blk_drained_begin() and
virtio_blk_drained_end() at points where the host notifier is not
operational:

- In virtio_blk_data_plane_start(), blk_set_aio_context() drains after
  vblk->dataplane_started has been set to true but the host notifier has
  not been attached yet.

- In virtio_blk_data_plane_stop(), blk_drain() and blk_set_aio_context()
  drain after the host notifier has already been detached but with
  vblk->dataplane_started still set to true.

I would like to simplify ->ioeventfd_start/stop() to avoid interactions
with drain entirely, but couldn't find a way to do that. Instead, this
patch accepts the fragile nature of the code and reorders it so that
vblk->dataplane_started is false during drain operations. This way the
virtio_blk_drained_begin() and virtio_blk_drained_end() calls don't touch
the host notifier. The result is that virtio_blk_data_plane_start() and
virtio_blk_data_plane_stop() have complete control over the host notifier
and stale file descriptors are no longer left in the AioContext.

This patch fixes the 100% CPU consumption in the main loop thread and
correctly moves host notifier processing to the IOThread.
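For illustration, here is a minimal standalone model of the invariant the
reordering establishes. This is not the actual QEMU code: dataplane_start(),
dataplane_stop(), drain(), and the two booleans are hypothetical stand-ins
for virtio_blk_data_plane_start()/virtio_blk_data_plane_stop(), the drains
inside blk_set_aio_context()/blk_drain(), vblk->dataplane_started, and the
host notifier state.

/*
 * Standalone sketch of the new ordering -- NOT the real QEMU code.
 * All names below are illustrative stand-ins.
 */
#include <stdbool.h>
#include <stdio.h>

static bool dataplane_started;
static bool notifier_attached;

/* Model of virtio_blk_drained_begin(): only touches the host notifier
 * when dataplane appears to be running. */
static void drained_begin(void)
{
    if (dataplane_started) {
        notifier_attached = false;
    }
}

/* Model of virtio_blk_drained_end(). */
static void drained_end(void)
{
    if (dataplane_started) {
        notifier_attached = true;
    }
}

/* Model of an operation that drains internally, such as blk_drain() or
 * blk_set_aio_context(). */
static void drain(void)
{
    drained_begin();
    drained_end();
}

static void dataplane_start(void)
{
    drain();                   /* blk_set_aio_context() drains here; the
                                * flag is still false, so the callbacks
                                * above ignore the unattached notifier */
    dataplane_started = true;  /* from now on the callbacks may manage it */
    notifier_attached = true;  /* attach the host notifier */
}

static void dataplane_stop(void)
{
    notifier_attached = false; /* detach the host notifier */
    dataplane_started = false; /* clear the flag BEFORE draining... */
    drain();                   /* ...so blk_drain()/blk_set_aio_context()
                                * leave the detached notifier alone */
}

int main(void)
{
    dataplane_start();
    printf("after start: notifier_attached=%d\n", notifier_attached); /* 1 */
    dataplane_stop();
    printf("after stop:  notifier_attached=%d\n", notifier_attached); /* 0 */
    return 0;
}

Under the old ordering, dataplane_start() set the flag before the drain and
dataplane_stop() drained before clearing it, so drained_end() could
re-attach the notifier behind stop's back and leave a stale file descriptor
registered with the AioContext.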
Fixes: 1665d932 ("virtio-blk: implement BlockDevOps->drained_begin()")
Reported-by: Lukáš Doktor <ldoktor@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Lukas Doktor <ldoktor@redhat.com>
Message-id: 20230704151527.193586-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>