- Oct 06, 2021
-
-
Paolo Bonzini authored
Linux limits the size of iovecs to 1024 (UIO_MAXIOV in the kernel sources, IOV_MAX in POSIX). Because of this, on some host adapters requests with many iovecs are rejected with -EINVAL by the io_submit() or readv()/writev() system calls. In fact, the same limit applies to SG_IO as well. To fix both the EINVAL and the possible performance issues from using fewer iovecs than allowed by Linux (some HBAs have max_segments as low as 128), introduce a separate entry in BlockLimits to hold the max_segments value from sysfs. This new limit is used only for SG_IO and clamped to bs->bl.max_iov anyway, just like max_hw_transfer is clamped to bs->bl.max_transfer. Reported-by:
Halil Pasic <pasic@linux.ibm.com> Cc: Hanna Reitz <hreitz@redhat.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: qemu-block@nongnu.org Cc: qemu-stable@nongnu.org Fixes: 18473467 ("file-posix: try BLKSECTGET on block devices too, do not round to power of 2", 2021-06-25) Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210923130436.1187591-1-pbonzini@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
- Sep 29, 2021
-
-
Vladimir Sementsov-Ogievskiy authored
Now that all drivers are updated by the previous commit, we can drop the last limiter on pdiscard path: INT_MAX in bdrv_co_pdiscard(). Now everything is prepared for implementing incredibly cool and fast big-discard requests in NBD and qcow2. And any other driver which wants it of course. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Message-Id: <20210903102807.27127-12-vsementsov@virtuozzo.com> Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
We are going to support 64 bit discard requests. Now update the limit variable. It's absolutely safe. The variable is set in some drivers, and used in bdrv_co_pdiscard(). Update also max_pdiscard variable in bdrv_co_pdiscard(), so that bdrv_co_pdiscard() is now prepared for 64bit requests. The remaining logic including num, offset and bytes variables is already supporting 64bit requests. So the only thing that prevents 64 bit requests is limiting max_pdiscard variable to INT_MAX in bdrv_co_pdiscard(). We'll drop this limitation after updating all block drivers. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Message-Id: <20210903102807.27127-10-vsementsov@virtuozzo.com> Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
Now that all drivers are updated by previous commit, we can drop two last limiters on write-zeroes path: INT_MAX in bdrv_co_do_pwrite_zeroes() and bdrv_check_request32() in bdrv_co_pwritev_part(). Now everything is prepared for implementing incredibly cool and fast big-write-zeroes in NBD and qcow2. And any other driver which wants it of course. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Message-Id: <20210903102807.27127-9-vsementsov@virtuozzo.com> Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
We are going to support 64 bit write-zeroes requests. Now update the limit variable. It's absolutely safe. The variable is set in some drivers, and used in bdrv_co_do_pwrite_zeroes(). Update also max_write_zeroes variable in bdrv_co_do_pwrite_zeroes(), so that bdrv_co_do_pwrite_zeroes() is now prepared to 64bit requests. The remaining logic including num, offset and bytes variables is already supporting 64bit requests. So the only thing that prevents 64 bit requests is limiting max_write_zeroes variable to INT_MAX in bdrv_co_do_pwrite_zeroes(). We'll drop this limitation after updating all block drivers. Ah, we also have bdrv_check_request32() in bdrv_co_pwritev_part(). It will be modified to do bdrv_check_request() for write-zeroes path. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210903102807.27127-7-vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
We are generally moving to int64_t for both offset and bytes parameters on all io paths. Main motivation is realization of 64-bit write_zeroes operation for fast zeroing large disk chunks, up to the whole disk. We chose signed type, to be consistent with off_t (which is signed) and with possibility for signed return type (where negative value means error). So, convert driver write handlers parameters which are already 64bit to signed type. While being here, convert also flags parameter to be BdrvRequestFlags. Now let's consider all callers. Simple git grep '\->bdrv_\(aio\|co\)_pwritev\(_part\)\?' shows that's there three callers of driver function: bdrv_driver_pwritev() and bdrv_driver_pwritev_compressed() in block/io.c, both pass int64_t, checked by bdrv_check_qiov_request() to be non-negative. qcow2_save_vmstate() does bdrv_check_qiov_request(). Still, the functions may be called directly, not only by drv->... Let's check: git grep '\.bdrv_\(aio\|co\)_pwritev\(_part\)\?\s*=' | \ awk '{print $4}' | sed 's/,//' | sed 's/&//' | sort | uniq | \ while read func; do git grep "$func(" | \ grep -v "$func(BlockDriverState"; done shows several callers: qcow2: qcow2_co_truncate() write at most up to @offset, which is checked in generic qcow2_co_truncate() by bdrv_check_request(). qcow2_co_pwritev_compressed_task() pass the request (or part of the request) that already went through normal write path, so it should be OK qcow: qcow_co_pwritev_compressed() pass int64_t, it's updated by this patch quorum: quorum_co_pwrite_zeroes() pass int64_t and int - OK throttle: throttle_co_pwritev_compressed() pass int64_t, it's updated by this patch vmdk: vmdk_co_pwritev_compressed() pass int64_t, it's updated by this patch Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210903102807.27127-5-vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
We modify the request by adding an offset to vmstate. Let's check the modified request. It will help us to safely move .bdrv_co_preadv_part and .bdrv_co_pwritev_part to int64_t type of offset and bytes. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Message-Id: <20210903102807.27127-3-vsementsov@virtuozzo.com> Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
Only qcow2 driver supports vmstate. In qcow2 these requests go through .bdrv_co_p{read,write}v_part handlers. So, let's do our basic check for the request on vmstate generic handlers. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Message-Id: <20210903102807.27127-2-vsementsov@virtuozzo.com> Signed-off-by:
Eric Blake <eblake@redhat.com>
-
- Sep 15, 2021
-
-
Hanna Reitz authored
As we have attempted before (https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg06451.html, "file-posix: Cache lseek result for data regions"; https://lists.nongnu.org/archive/html/qemu-block/2021-02/msg00934.html, "file-posix: Cache next hole"), this patch seeks to reduce the number of SEEK_DATA/HOLE operations the file-posix driver has to perform. The main difference is that this time it is implemented as part of the general block layer code. The problem we face is that on some filesystems or in some circumstances, SEEK_DATA/HOLE is unreasonably slow. Given the implementation is outside of qemu, there is little we can do about its performance. We have already introduced the want_zero parameter to bdrv_co_block_status() to reduce the number of SEEK_DATA/HOLE calls unless we really want zero information; but sometimes we do want that information, because for files that consist largely of zero areas, special-casing those areas can give large performance boosts. So the real problem is with files that consist largely of data, so that inquiring the block status does not gain us much performance, but where such an inquiry itself takes a lot of time. To address this, we want to cache data regions. Most of the time, when bad performance is reported, it is in places where the image is iterated over from start to end (qemu-img convert or the mirror job), so a simple yet effective solution is to cache only the current data region. (Note that only caching data regions but not zero regions means that returning false information from the cache is not catastrophic: Treating zeroes as data is fine. While we try to invalidate the cache on zero writes and discards, such incongruences may still occur when there are other processes writing to the image.) We only use the cache for nodes without children (i.e. protocol nodes), because that is where the problem is: Drivers that rely on block-status implementations outside of qemu (e.g. SEEK_DATA/HOLE). Resolves: https://gitlab.com/qemu-project/qemu/-/issues/307 Signed-off-by:
Hanna Reitz <hreitz@redhat.com> Message-Id: <20210812084148.14458-3-hreitz@redhat.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Reviewed-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [hreitz: Added `local_file == bs` assertion, as suggested by Vladimir] Signed-off-by:
Hanna Reitz <hreitz@redhat.com>
-
- Aug 03, 2021
-
-
Kevin Wolf authored
When bdrv_pad_request() fails in bdrv_co_preadv_part(), bs->in_flight has been increased, but is never decreased again. This leads to a hang when trying to drain the block node. This bug was observed with Windows guests which issue a request that fully uses IOV_MAX during installation, so that when padding is necessary (O_DIRECT with a 4k sector size block device on the host), adding another entry causes failure. Call bdrv_dec_in_flight() to fix this. There is a larger problem to solve here because this request shouldn't even fail, but Windows doesn't seem to care and with this minimal fix the installation succeeds. So given that we're already in freeze, let's take this minimal fix for 6.1. Fixes: 98ca4549 Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1972079 Reported-by:
Qing Wang <qinwang@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Message-Id: <20210727154923.91067-1-kwolf@redhat.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
- Jul 06, 2021
-
-
Akihiko Odaki authored
Signed-off-by:
Akihiko Odaki <akihiko.odaki@gmail.com> Message-id: 20210705130458.97642-3-akihiko.odaki@gmail.com Signed-off-by:
Stefan Hajnoczi <stefanha@redhat.com>
-
- Jun 29, 2021
-
-
Eric Blake authored
No need to start a tracked request that will always fail. The choice to check read-only after bdrv_inc_in_flight() predates 1bc5f09f (block: Use tracked request for truncate), but waiting for serializing requests can make the effect more noticeable. Signed-off-by:
Eric Blake <eblake@redhat.com> Message-Id: <20210609163034.997943-1-eblake@redhat.com> Reviewed-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
- Jun 25, 2021
-
-
Paolo Bonzini authored
For block host devices, I/O can happen through either the kernel file descriptor I/O system calls (preadv/pwritev, io_submit, io_uring) or the SCSI passthrough ioctl SG_IO. In the latter case, the size of each transfer can be limited by the HBA, while for file descriptor I/O the kernel is able to split and merge I/O in smaller pieces as needed. Applying the HBA limits to file descriptor I/O results in more system calls and suboptimal performance, so this patch splits the max_transfer limit in two: max_transfer remains valid and is used in general, while max_hw_transfer is limited to the maximum hardware size. max_hw_transfer can then be included by the scsi-generic driver in the block limits page, to ensure that the stricter hardware limit is used. Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
- Jun 02, 2021
-
-
Vladimir Sementsov-Ogievskiy authored
It's better to use accessor function instead of bs->read_only directly. In some places use bdrv_is_writable() instead of checking both BDRV_O_RDWR set and BDRV_O_INACTIVE not set. In bdrv_open_common() it's a bit strange to add one more variable, but we are going to drop bs->read_only in the next patch, so new ro local variable substitutes it here. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210527154056.70294-2-vsementsov@virtuozzo.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
- May 14, 2021
-
-
Vladimir Sementsov-Ogievskiy authored
They are unused now. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Message-Id: <20210506090621.11848-3-vsementsov@virtuozzo.com> Reviewed-by:
Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by:
Max Reitz <mreitz@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
write-notifiers are used only for write-threshold. New code for such purpose should create filters. Let's better special-case write-threshold and drop write notifiers at all. (Actually, write-threshold is special-cased anyway, as the only user of write-notifiers) So, create a new direct interface for bdrv_co_write_req_prepare() and drop all write-notifier related logic from write-threshold.c. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Message-Id: <20210506090621.11848-2-vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Reviewed-by:
Stefan Hajnoczi <stefanha@redhat.com> [mreitz: Adjusted comment as per Eric's suggestion] Signed-off-by:
Max Reitz <mreitz@redhat.com>
-
- Apr 30, 2021
-
-
Vladimir Sementsov-Ogievskiy authored
To be used in further commit. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Kevin Wolf <kwolf@redhat.com> Message-Id: <20210428151804.439460-28-vsementsov@virtuozzo.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
- Feb 12, 2021
-
-
Vladimir Sementsov-Ogievskiy authored
It will be used to stop retrying NBD requests on mirror cancel. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Message-Id: <20210205163720.887197-2-vsementsov@virtuozzo.com> Signed-off-by:
Eric Blake <eblake@redhat.com>
-
- Feb 03, 2021
-
-
Vladimir Sementsov-Ogievskiy authored
We are generally moving to int64_t for both offset and bytes parameters on all io paths. Main motivation is realization of 64-bit write_zeroes operation for fast zeroing large disk chunks, up to the whole disk. We chose signed type, to be consistent with off_t (which is signed) and with possibility for signed return type (where negative value means error). So, convert now copy_range parameters which are already 64bit to signed type. It's safe as we don't work with requests overflowing BDRV_MAX_LENGTH (which is less than INT64_MAX), and do check the requests in bdrv_co_copy_range_internal() (by bdrv_check_request32(), which calls bdrv_check_request()). Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201211183934.169161-17-vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
We are generally moving to int64_t for both offset and bytes parameters on all io paths. Main motivation is realization of 64-bit write_zeroes operation for fast zeroing large disk chunks, up to the whole disk. We chose signed type, to be consistent with off_t (which is signed) and with possibility for signed return type (where negative value means error). Now, since bdrv_co_preadv_part() and bdrv_co_pwritev_part() have been updated, update all their wrappers. For all of them type of 'bytes' is widening, so callers are safe. We have update request_fn in blkverify.c simultaneously. Still it's just a pointer to one of bdrv_co_pwritev() or bdrv_co_preadv(), and type is widening for callers of the request_fn anyway. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201211183934.169161-16-vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> [eblake: grammar tweak] Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
We are generally moving to int64_t for both offset and bytes parameters on all io paths. Main motivation is realization of 64-bit write_zeroes operation for fast zeroing large disk chunks, up to the whole disk. We chose signed type, to be consistent with off_t (which is signed) and with possibility for signed return type (where negative value means error). So, prepare bdrv_co_preadv_part() and bdrv_co_pwritev_part() and their remaining dependencies now. bdrv_pad_request() is updated simultaneously, as pointer to bytes passed to it both from bdrv_co_pwritev_part() and bdrv_co_preadv_part(). So, all callers of bdrv_pad_request() are updated to pass 64bit bytes. bdrv_pad_request() is already good for 64bit requests, add corresponding assertion. Look at bdrv_co_preadv_part() and bdrv_co_pwritev_part(). Type is widening, so callers are safe. Let's look inside the functions. In bdrv_co_preadv_part() and bdrv_aligned_pwritev() we only pass bytes to other already int64_t interfaces (and some obviously safe calculations), it's OK. In bdrv_co_do_zero_pwritev() aligned_bytes may become large now, still it's passed to bdrv_aligned_pwritev which supports int64_t bytes. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201211183934.169161-15-vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
We are generally moving to int64_t for both offset and bytes parameters on all io paths. Main motivation is realization of 64-bit write_zeroes operation for fast zeroing large disk chunks, up to the whole disk. We chose signed type, to be consistent with off_t (which is signed) and with possibility for signed return type (where negative value means error). So, prepare bdrv_aligned_preadv() now. Make the bytes variable in bdrv_padding_rmw_read() int64_t, as it is only used for pass-through to bdrv_aligned_preadv(). All bdrv_aligned_preadv() callers are safe as type is widening. Let's look inside: - add a new-style assertion that request is good. - callees bdrv_is_allocated(), bdrv_co_do_copy_on_readv() supports int64_t bytes - conversion of bytes_remaining is OK, as we never have requests overflowing BDRV_MAX_LENGTH - looping through bytes_remaining is ok, num is updated to int64_t - for bdrv_driver_preadv we have same limit of max_transfer - qemu_iovec_memset is OK, as bytes+qiov_offset should not overflow qiov->size anyway (thanks to bdrv_check_qiov_request()) Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201211183934.169161-14-vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> [eblake: grammar tweak] Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
We are generally moving to int64_t for both offset and bytes parameters on all io paths. Main motivation is realization of 64-bit write_zeroes operation for fast zeroing large disk chunks, up to the whole disk. We chose signed type, to be consistent with off_t (which is signed) and with possibility for signed return type (where negative value means error). So, prepare bdrv_co_do_copy_on_readv() now. 'bytes' type widening, so callers are safe. Look at the function itself: bytes, skip_bytes and progress become int64_t. bdrv_round_to_clusters() is OK, cluster_bytes now may be large. trace_bdrv_co_do_copy_on_readv() is OK looping through cluster_bytes is still OK. pnum is still capped to max_transfer, and to MAX_BOUNCE_BUFFER when we are going to do COR operation. Therefor calculations in qemu_iovec_from_buf() and bdrv_driver_preadv() should not change. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201211183934.169161-13-vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
We are generally moving to int64_t for both offset and bytes parameters on all io paths. Main motivation is realization of 64-bit write_zeroes operation for fast zeroing large disk chunks, up to the whole disk. We chose signed type, to be consistent with off_t (which is signed) and with possibility for signed return type (where negative value means error). So, prepare bdrv_aligned_pwritev() now and convert the dependencies: bdrv_co_write_req_prepare() and bdrv_co_write_req_finish() to signed type bytes. Conversion of bdrv_co_write_req_prepare() and bdrv_co_write_req_finish() is definitely safe, as all requests in block/io must not overflow BDRV_MAX_LENGTH. Still add assertions. For bdrv_aligned_pwritev() 'bytes' type is widened, so callers are safe. Let's check usage of the parameter inside the function. Passing to bdrv_co_write_req_prepare() and bdrv_co_write_req_finish() is OK. Passing to qemu_iovec_* is OK after new assertion. All other callees are already updated to int64_t. Checking alignment is not changed, offset + bytes and qiov_offset + bytes calculations are safe (thanks to new assertions). max_transfer is kept to be int for now. It has a default of INT_MAX here, and some drivers may rely on it. It's to be refactored later. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201211183934.169161-12-vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
We are generally moving to int64_t for both offset and bytes parameters on all io paths. Main motivation is realization of 64-bit write_zeroes operation for fast zeroing large disk chunks, up to the whole disk. We chose signed type, to be consistent with off_t (which is signed) and with possibility for signed return type (where negative value means error). So, prepare bdrv_co_do_pwrite_zeroes() now. Callers are safe, as converting int to int64_t is safe. Concentrate on 'bytes' usage in the function (thx to Eric Blake): compute 'int tail' via % 'int alignment' - safe fragmentation loop 'int num' - still fragments with a cap on max_transfer use of 'num' within the loop MIN(bytes, max_transfer) as well as %alignment - still works, so calculations in if (head) {} are safe clamp size by 'int max_write_zeroes' - safe drv->bdrv_co_pwrite_zeroes(int) - safe because of clamping clamp size by 'int max_transfer' - safe buf allocation is still clamped to max_transfer qemu_iovec_init_buf(size_t) - safe because of clamping bdrv_driver_pwritev(uint64_t) - safe Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201211183934.169161-11-vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
We are generally moving to int64_t for both offset and bytes parameters on all io paths. Main motivation is realization of 64-bit write_zeroes operation for fast zeroing large disk chunks, up to the whole disk. We chose signed type, to be consistent with off_t (which is signed) and with possibility for signed return type (where negative value means error). So, convert driver wrappers parameters which are already 64bit to signed type. Requests in block/io.c must never exceed BDRV_MAX_LENGTH (which is less than INT64_MAX), which makes the conversion to signed 64bit type safe. Add corresponding assertions. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201211183934.169161-10-vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Eric Blake authored
We are generally moving to int64_t for both offset and bytes parameters on all io paths. Main motivation is realization of 64-bit write_zeroes operation for fast zeroing large disk chunks, up to the whole disk. We chose signed type, to be consistent with off_t (which is signed) and with possibility for signed return type (where negative value means error). All requests in block/io must not overflow BDRV_MAX_LENGTH, all external users of BdrvTrackedRequest already have corresponding assertions, so we are safe. Add some assertions still. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201211183934.169161-9-vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
Operations with qiov add more restrictions on bytes, let's cover it. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201211183934.169161-8-vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
Make bdrv_pad_request() honest: return error if qemu_iovec_init_extended() failed. Update also bdrv_padding_destroy() to clean the structure for safety. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201211183934.169161-6-vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
Prepare for the following patch when bdrv_pad_request() will be able to fail. Update the comments. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201211183934.169161-5-vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> [eblake: grammar tweak] Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
Calculation of sum may theoretically overflow, so use 64bit type and add some good assertions. Use int64_t constantly. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201211183934.169161-4-vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> [eblake: tweak assertion order] Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
Actually, we can't extend the io vector in all cases. Handle possible MAX_IOV and size_t overflows. For now add assertion to callers (actually they rely on success anyway) and fix them in the following patch. Add also some additional good assertions to qemu_iovec_init_slice() while being here. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201211183934.169161-3-vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
It's better to pass &error_abort than just assert that result is 0: on crash, we'll immediately see the reason in the backtrace. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201211183934.169161-2-vsementsov@virtuozzo.com> Reviewed-by:
Eric Blake <eblake@redhat.com> [eblake: fix iotest 206 fallout] Signed-off-by:
Eric Blake <eblake@redhat.com>
-
- Jan 26, 2021
-
-
Andrey Shinkevich authored
Add the new member supported_read_flags to the BlockDriverState structure. It will control the flags set for copy-on-read operations. Make the block generic layer evaluate supported read flags before they go to a block driver. Suggested-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by:
Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [vsementsov: use assert instead of abort] Reviewed-by:
Max Reitz <mreitz@redhat.com> Message-Id: <20201216061703.70908-8-vsementsov@virtuozzo.com> Signed-off-by:
Max Reitz <mreitz@redhat.com>
-
- Dec 18, 2020
-
-
Vladimir Sementsov-Ogievskiy authored
Add flag to make serialising request no wait: if there are conflicting requests, just return error immediately. It's will be used in upcoming preallocate filter. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Message-Id: <20201021145859.11201-7-vsementsov@virtuozzo.com> Signed-off-by:
Max Reitz <mreitz@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
We'll need a separate function, which will only "mark" request serialising with specified align but not wait for conflicting requests. So, it will be like old bdrv_mark_request_serialising(), before merging bdrv_wait_serialising_requests_locked() into it. To reduce the possible mess, let's do the following: Public function that does both marking and waiting will be called bdrv_make_request_serialising, and private function which will only "mark" will be called tracked_request_set_serialising(). Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Message-Id: <20201021145859.11201-6-vsementsov@virtuozzo.com> Signed-off-by:
Max Reitz <mreitz@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
bs is linked in req, so no needs to pass it separately. Most of tracked-requests API doesn't have bs argument. Actually, after this patch only tracked_request_begin has it, but it's for purpose. While being here, also add a comment about what "_locked" is. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20201021145859.11201-5-vsementsov@virtuozzo.com> Signed-off-by:
Max Reitz <mreitz@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
To be reused in separate. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20201021145859.11201-4-vsementsov@virtuozzo.com> Signed-off-by:
Max Reitz <mreitz@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
The comments states, that on misaligned request we should have already been waiting. But for bdrv_padding_rmw_read, we called bdrv_mark_request_serialising with align = request_alignment, and now we serialise with align = cluster_size. So we may have to wait again with larger alignment. Note, that the only user of BDRV_REQ_SERIALISING is backup which issues cluster-aligned requests, so seems the assertion should not fire for now. But it's wrong anyway. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20201021145859.11201-3-vsementsov@virtuozzo.com> Signed-off-by:
Max Reitz <mreitz@redhat.com>
-
- Dec 11, 2020
-
-
Kevin Wolf authored
If bdrv_co_yield_to_drain() is called for draining a block node that runs in a different AioContext, it keeps that AioContext locked while it yields and schedules a BH in the AioContext to do the actual drain. As long as executing the BH is the very next thing that the event loop of the node's AioContext does, this actually happens to work, but when it tries to execute something else that wants to take the AioContext lock, it will deadlock. (In the bug report, this other thing is a virtio-scsi device running virtio_scsi_data_plane_handle_cmd().) Instead, always drop the AioContext lock across the yield and reacquire it only when the coroutine is reentered. The BH needs to unconditionally take the lock for itself now. This fixes the 'block_resize' QMP command on a block node that runs in an iothread. Cc: qemu-stable@nongnu.org Fixes: eb94b81a Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1903511 Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Message-Id: <20201203172311.68232-4-kwolf@redhat.com> Reviewed-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-