    dd67c1d7
    dataplane: submit I/O as a batch · dd67c1d7
    Ming Lei authored
    
    
    Before commit 580b6b2a (dataplane: use the QEMU block
    layer for I/O), dataplane for virtio-blk submitted block
    I/O as a batch.
    
    Commit 580b6b2a replaced the custom Linux AIO
    implementation (including batched I/O submission) with the QEMU
    block layer, but it caused a ~40% throughput regression
    in virtio-blk performance, and the loss of batched
    submission is one of the causes.
    
    This patch uses the newly introduced bdrv_io_plug() and
    bdrv_io_unplug() interfaces to submit I/O as a batch
    through the QEMU block layer. In my test, the change
    improves throughput by ~30% with 'aio=native'.
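
The plug/unplug idea can be illustrated with a minimal, standalone sketch. This is not QEMU code: DemoQueue, demo_plug(), demo_unplug(), demo_enqueue(), demo_run() and QUEUE_MAX are hypothetical names invented for this example, and the "submission" is only a counter standing in for a kernel submission call. It assumes that requests queued while plugged are held back and handed over in one call when the last plug is released.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical queue limit for this sketch; not a QEMU constant. */
#define QUEUE_MAX 64

/* Hypothetical stand-in for a block backend queue; not a QEMU type. */
typedef struct {
    int plugged;          /* nesting depth of plug calls */
    size_t pending;       /* requests queued but not yet submitted */
    size_t submit_calls;  /* simulated submission calls made so far */
    size_t submitted;     /* total requests handed over so far */
} DemoQueue;

/* Hand every pending request over in a single simulated call. */
static void demo_flush(DemoQueue *q)
{
    if (q->pending == 0) {
        return;
    }
    q->submit_calls++;
    q->submitted += q->pending;
    q->pending = 0;
}

/* Start holding requests back instead of submitting them one by one. */
static void demo_plug(DemoQueue *q)
{
    q->plugged++;
}

/* Release one plug level; the outermost unplug flushes the batch. */
static void demo_unplug(DemoQueue *q)
{
    assert(q->plugged > 0);
    if (--q->plugged == 0) {
        demo_flush(q);
    }
}

/* Queue one request; submit at once if unplugged or if the queue is full. */
static void demo_enqueue(DemoQueue *q)
{
    q->pending++;
    if (q->plugged == 0 || q->pending == QUEUE_MAX) {
        demo_flush(q);
    }
}

/* Queue n requests and return how many simulated submission calls it took. */
static size_t demo_run(size_t n, int use_plug)
{
    DemoQueue q = {0};

    if (use_plug) {
        demo_plug(&q);
    }
    for (size_t i = 0; i < n; i++) {
        demo_enqueue(&q);
    }
    if (use_plug) {
        demo_unplug(&q);
    }
    assert(q.submitted == n);
    return q.submit_calls;
}
```

In this sketch, demo_run(8, 0) makes one submission call per request, while demo_run(8, 1) makes a single call for all eight. Fewer submission calls per request is the effect the batching in this patch aims for.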
    
    The following is my fio test script:
    
    	[global]
    	direct=1
    	size=4G
    	bsrange=4k-4k
    	timeout=40
    	numjobs=4
    	ioengine=libaio
    	iodepth=64
    	filename=/dev/vdc
    	group_reporting=1
    
    	[f]
    	rw=randread
    
    Results on one of my small machines (host: x86_64, 2 cores, 4 threads; guest: 4 cores):
    	- qemu master: 65K IOPS
    	- qemu master with these patches: 92K IOPS
    	- 2.0.0 release (dataplane using custom Linux AIO): 104K IOPS
    
    Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Ming Lei <ming.lei@canonical.com>
    Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>