- Jun 29, 2021
Philippe Mathieu-Daudé authored
Avoid accessing QCryptoTLSCreds internals by using the qcrypto_tls_creds_check_endpoint() helper.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
- Jun 25, 2021
Emanuele Giuseppe Esposito authored
By adding acquire/release pairs, we ensure that the .ret and .error_is_read fields are written by block_copy_dirty_clusters before .finished is true, and that they are read by the API user only after .finished is true. The atomics here are necessary because the fields are concurrently modified in coroutines and read outside them.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20210624072043.180494-6-eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
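A minimal sketch of the release/acquire pairing described above, using QEMU's qatomic helpers; the surrounding structure is abbreviated and only illustrates the ordering guarantee:

    /* Writer (block-copy coroutine): publish the results first, then flip
     * .finished with release semantics. */
    call_state->ret = ret;
    call_state->error_is_read = error_is_read;
    qatomic_store_release(&call_state->finished, true);

    /* Reader (API user, outside the coroutines): the acquire pairs with the
     * release above, so .ret and .error_is_read are guaranteed visible once
     * .finished reads as true. */
    if (qatomic_load_acquire(&call_state->finished)) {
        return call_state->ret;
    }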
Emanuele Giuseppe Esposito authored
Group the various structure fields, to better understand what we need to protect with a lock and what doesn't need it. Then, add a CoMutex to protect concurrent access to the block-copy data structures. This mutex also protects .copy_bitmap, because its thread-safe API does not prevent it from assigning two tasks to the same bitmap region.

Exceptions to the lock:
- .sleep_state is handled in the series "coroutine: new sleep/wake API" and thus left here as TODO.
- .finished, .cancelled and reads of .ret and .error_is_read will be protected in the following patch, because they are also used outside coroutines.
- .skip_unallocated is atomic. Including it under the mutex would enlarge the critical sections and make them much more complex. We can keep it atomic since it is only written from outside and read by block-copy coroutines.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20210624072043.180494-5-eesposit@redhat.com>
[vsementsov: fix typo in comment]
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
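A rough, abbreviated sketch of the kind of field grouping described above; the member list is illustrative, not the full BlockCopyState from block/block-copy.c:

    typedef struct BlockCopyState {
        /* Immutable after creation: no locking needed. */
        BdrvChild *source;
        BdrvChild *target;
        int64_t cluster_size;

        /* Atomic: written from outside, read by block-copy coroutines. */
        bool skip_unallocated;

        /* Everything below is protected by this CoMutex. */
        CoMutex lock;
        BdrvDirtyBitmap *copy_bitmap;   /* see note on task assignment above */
        int64_t in_flight_bytes;
        QLIST_HEAD(, BlockCopyTask) tasks;
    } BlockCopyState;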
Emanuele Giuseppe Esposito authored
Moving this function into task_end ensures that the progress is updated anyway, even if there is an error. It also helps with the next patch, allowing task_end to have only one critical section.

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20210624072043.180494-4-eesposit@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Paolo Bonzini authored
Put the logic to determine the copy size in a separate function, so that there is a simple state machine for the possible methods of copying data from one BlockDriverState to the other. Use .method instead of .copy_range as the in-out argument, and also include .zeroes as an additional copy method. While at it, store the common computation of block_copy_max_transfer into a new field of BlockCopyState, and make sure that we always obey max_transfer; that's more efficient even for the COPY_RANGE_READ_WRITE case.

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210624072043.180494-3-eesposit@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
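An illustrative sketch of a copy-method state machine of the kind the commit describes; the enumerator names and chunk-size choices here are assumptions for illustration, not a copy of the real block/block-copy.c code:

    typedef enum {
        COPY_READ_WRITE,    /* bounce buffer: read from source, write to target */
        COPY_WRITE_ZEROES,  /* region known to be zero: just write zeroes */
        COPY_RANGE_SMALL,   /* bdrv_co_copy_range() with small chunks */
        COPY_RANGE_FULL,    /* bdrv_co_copy_range() with large chunks */
    } BlockCopyMethod;

    /* Chunk size depends only on the current method and the cached limits. */
    static int64_t block_copy_chunk_size(BlockCopyState *s)
    {
        switch (s->method) {
        case COPY_READ_WRITE:
        case COPY_WRITE_ZEROES:
            return s->cluster_size;
        case COPY_RANGE_SMALL:
            return MIN(s->max_transfer, s->cluster_size);
        case COPY_RANGE_FULL:
            return MIN(s->max_transfer, s->copy_size);
        default:
            abort();
        }
    }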
Emanuele Giuseppe Esposito authored
Use a local variable instead of referencing BlockCopyState through a BlockCopyCallState or BlockCopyTask every time. This is in preparation for the next patches. No functional change intended.

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20210624072043.180494-2-eesposit@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Emanuele Giuseppe Esposito authored
ProgressMeter is protected by the AioContext mutex, which is taken by the block jobs and their callers (like blockdev). We would like to remove the dependency of block layer code on the AioContext mutex, since most drivers and the core I/O code already do not rely on it. Create a new C file to implement the ProgressMeter API, but keep the struct public, to avoid forcing allocation on the heap. Also add a mutex to be able to provide an accurate snapshot of the progress values to the caller.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210614081130.22134-5-eesposit@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
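A condensed sketch of the mutex-protected ProgressMeter; the field names follow the existing struct, while the snapshot helper is a simplified rendition of what such an API could look like:

    typedef struct ProgressMeter {
        uint64_t current;   /* work done so far */
        uint64_t total;     /* estimated total amount of work */
        QemuMutex lock;     /* keeps current/total consistent for readers */
    } ProgressMeter;

    /* Return a consistent snapshot of both values to the caller. */
    void progress_get_snapshot(ProgressMeter *pm, uint64_t *current,
                               uint64_t *total)
    {
        qemu_mutex_lock(&pm->lock);
        *current = pm->current;
        *total = pm->total;
        qemu_mutex_unlock(&pm->lock);
    }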
Paolo Bonzini authored
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20210614081130.22134-3-eesposit@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Paolo Bonzini authored
Similar to other handle_aiocb_* functions, handle_aiocb_ioctl needs to cater for the possibility that ioctl is interrupted by a signal. Otherwise, the I/O is incorrectly reported as a failure to the guest.

Reported-by: Gordon Watson <gwatson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
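A minimal sketch of the retry-on-EINTR pattern described above, assuming the usual shape of handle_aiocb_ioctl() in block/file-posix.c (the field names are simplified):

    static int handle_aiocb_ioctl(void *opaque)
    {
        RawPosixAIOData *aiocb = opaque;
        int ret;

        /* Retry if the ioctl is interrupted by a signal, instead of
         * reporting a spurious failure to the guest. */
        do {
            ret = ioctl(aiocb->aio_fildes, aiocb->ioctl.cmd, aiocb->ioctl.buf);
        } while (ret == -1 && errno == EINTR);

        return ret == -1 ? -errno : 0;
    }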
Joelle van Dyne authored
iOS hosts do not have these defined, so we fall back to the default behaviour.

Co-authored-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Joelle van Dyne <j@getutm.app>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini authored
Try all the possible ioctls for disk size as long as they are supported, to keep the #if ladder simple. Extracted and cleaned up from a patch by Joelle van Dyne and Warner Losh.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
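A hedged sketch of such an #if ladder; each branch is compiled only when the host defines the corresponding ioctl, and the exact set of ioctls tried in block/file-posix.c may differ:

    static int64_t raw_getlength_ioctl(int fd)
    {
    #if defined(BLKGETSIZE64)                /* Linux block devices */
        {
            uint64_t bytes;
            if (ioctl(fd, BLKGETSIZE64, &bytes) == 0) {
                return bytes;
            }
        }
    #endif
    #if defined(DIOCGMEDIASIZE)              /* FreeBSD */
        {
            off_t bytes;
            if (ioctl(fd, DIOCGMEDIASIZE, &bytes) == 0) {
                return bytes;
            }
        }
    #endif
    #if defined(DKIOCGETBLOCKCOUNT) && defined(DKIOCGETBLOCKSIZE)  /* macOS */
        {
            uint64_t blocks;
            uint32_t block_size;
            if (ioctl(fd, DKIOCGETBLOCKCOUNT, &blocks) == 0 &&
                ioctl(fd, DKIOCGETBLOCKSIZE, &block_size) == 0) {
                return blocks * (uint64_t)block_size;
            }
        }
    #endif
        return -errno;
    }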
Joelle van Dyne authored
On Darwin (iOS), there are no system-level APIs for directly accessing host block devices. We detect this at configure time.

Signed-off-by: Joelle van Dyne <j@getutm.app>
Message-Id: <20210315180341.31638-2-j@getutm.app>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini authored
bs->sg is only true for character devices, but block devices can also be used with scsi-block and scsi-generic. Unfortunately BLKSECTGET returns bytes in an int for /dev/sgN devices, and sectors in a short for block devices, so account for that in the code. The maximum transfer also need not be a power of 2 (for example I have seen disks with 1280 KiB maximum transfer), so there's no need to pass the result through pow2floor.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
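An illustrative sketch of handling the two BLKSECTGET conventions mentioned above (the helper name and error handling are simplified assumptions):

    /* BLKSECTGET reports bytes (in an int) for /dev/sgN character devices,
     * but 512-byte sectors (in a short) for block devices. */
    static int64_t get_max_transfer_length(int fd, const struct stat *st)
    {
        if (S_ISBLK(st->st_mode)) {
            unsigned short max_sectors = 0;
            if (ioctl(fd, BLKSECTGET, &max_sectors) == 0) {
                return (int64_t)max_sectors * 512;
            }
        } else {
            int max_bytes = 0;
            if (ioctl(fd, BLKSECTGET, &max_bytes) == 0) {
                return max_bytes;
            }
        }
        return -errno;
    }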
Paolo Bonzini authored
For block host devices, I/O can happen through either the kernel file descriptor I/O system calls (preadv/pwritev, io_submit, io_uring) or the SCSI passthrough ioctl SG_IO. In the latter case, the size of each transfer can be limited by the HBA, while for file descriptor I/O the kernel is able to split I/O into smaller pieces and merge it as needed. Applying the HBA limits to file descriptor I/O results in more system calls and suboptimal performance, so this patch splits the max_transfer limit in two: max_transfer remains valid and is used in general, while max_hw_transfer is limited to the maximum hardware size. max_hw_transfer can then be included by the scsi-generic driver in the block limits page, to ensure that the stricter hardware limit is used.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
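A hedged sketch of a hardware-limit counterpart to blk_get_max_transfer(), along the lines the commit describes; the exact field handling in the real helper may differ:

    /* Hardware (HBA) limit for SG_IO passthrough; file-descriptor I/O keeps
     * using blk_get_max_transfer() instead, letting the kernel split. */
    uint64_t blk_get_max_hw_transfer(BlockBackend *blk)
    {
        BlockDriverState *bs = blk_bs(blk);
        uint64_t max = INT_MAX;

        if (bs) {
            max = MIN_NON_ZERO(max, bs->bl.max_hw_transfer);
        }
        return ROUND_DOWN(max, blk_get_request_alignment(blk));
    }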
Paolo Bonzini authored
Block device requests must be aligned to bs->bl.request_alignment. It makes sense for drivers to align bs->bl.max_transfer the same way; however when there is no specified limit, blk_get_max_transfer just returns INT_MAX. Since the contract of the function does not specify that INT_MAX means "no maximum", just align the outcome of the function (whether INT_MAX or bs->bl.max_transfer) before returning it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
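A minimal sketch of the aligned getter described above; the real helper in the block backend code may differ in detail:

    uint32_t blk_get_max_transfer(BlockBackend *blk)
    {
        BlockDriverState *bs = blk_bs(blk);
        uint32_t max = INT_MAX;

        if (bs) {
            max = MIN_NON_ZERO(max, bs->bl.max_transfer);
        }
        /* Align the outcome (including the INT_MAX fallback) rather than
         * special-casing "no maximum". */
        return ROUND_DOWN(max, blk_get_request_alignment(blk));
    }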
Paolo Bonzini authored
I/O to a disk via read/write is not limited by the number of segments allowed by the host adapter; the kernel can split requests if needed, and the limit imposed by the host adapter can be very low (256k or so), chosen so that SG_IO does not return EINVAL if memory is heavily fragmented. Since this value is only interesting for SG_IO-based I/O, do not include it in max_transfer and only take it into account when patching the block limits VPD page in the scsi-generic device.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Paolo Bonzini authored
Even though it was only called for devices that have bs->sg set (which must be character devices), sg_get_max_segments looked at /sys/dev/block, which only works for block devices. On Linux the sg driver has its own way to provide the maximum number of iovecs in a scatter/gather list, so add support for it. The block device path is kept because it will be reinstated in the next patches.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
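A hedged sketch of querying the sg driver directly; SG_GET_SG_TABLESIZE is the Linux sg ioctl for the maximum scatter/gather table size, and read_queue_max_segments() stands in for the existing sysfs-based path for block devices (a hypothetical helper name):

    #include <scsi/sg.h>

    static int sg_get_max_segments(int fd, const struct stat *st)
    {
        if (S_ISCHR(st->st_mode)) {
            int tablesize;
            /* /dev/sgN: ask the sg driver itself. */
            if (ioctl(fd, SG_GET_SG_TABLESIZE, &tablesize) == 0) {
                return tablesize;
            }
            return -errno;
        }
        /* Block devices: keep reading
         * /sys/dev/block/<major>:<minor>/queue/max_segments. */
        return read_queue_max_segments(st->st_rdev);
    }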
- Jun 24, 2021
Hanna Reitz authored
In the bdrv_snapshot_goto() fallback code, we work with a pointer to either bs->file or bs->backing. We detach that child, close the node (with .bdrv_close()), apply the snapshot on the child node, and then re-open the node (with .bdrv_open()). In order for .bdrv_open() to attach the same child node that we had before, we pass "file={child-node}" or "backing={child-node}" to it. Therefore, when .bdrv_open() has returned success, we can assume that bs->file or bs->backing (respectively) points to our original child again. This is verified by an assertion.

All of this is not immediately clear from a quick glance at the code, so add a comment to the assertion explaining what it is for and why it is valid. It certainly confused Coverity.

Reported-by: Coverity (CID 1452774)
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210503095418.31521-1-mreitz@redhat.com>
[mreitz: s/close/detach/]
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
- Jun 18, 2021
Vladimir Sementsov-Ogievskiy authored
req->receiving is a flag marking that the request is at one specific yield point in nbd_co_do_receive_one_chunk(). Such a boolean flag is always better unset before scheduling the coroutine, to avoid double scheduling. So, let's be more careful.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-33-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
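A minimal sketch of the unset-before-wake pattern described above (the surrounding loop over in-flight requests is omitted):

    /* Wake a request coroutine parked at the "receiving" yield point.
     * Clear the flag first so the coroutine cannot be scheduled twice. */
    if (s->requests[i].receiving) {
        s->requests[i].receiving = false;
        aio_co_wake(s->requests[i].coroutine);
    }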
Vladimir Sementsov-Ogievskiy authored
We already have two similar helpers for other state. Let's add another one for convenience.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-32-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Vladimir Sementsov-Ogievskiy authored
The last remaining step needed to reuse the function is a coroutine wrapper: nbd_open() may be called from non-coroutine context. So, generate the wrapper and use it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-31-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
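A hedged sketch of what such a generated wrapper declaration could look like; the names follow the series, but the exact header, keyword and parameter list are assumptions:

    /* Declared in a header processed by scripts/block-coroutine-wrapper.py;
     * the script emits a non-coroutine nbd_do_establish_connection() that
     * creates a coroutine running the _co_ version and polls until done. */
    int generated_co_wrapper
    nbd_do_establish_connection(BlockDriverState *bs, Error **errp);

    /* The coroutine version, implemented in block/nbd.c: */
    int coroutine_fn
    nbd_co_do_establish_connection(BlockDriverState *bs, Error **errp);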
Vladimir Sementsov-Ogievskiy authored
We'll need the possibility of a non-blocking nbd_co_establish_connection(), so that it returns immediately, and returns success only if a connection was previously established in the background.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-30-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Vladimir Sementsov-Ogievskiy authored
Split out the part that we want to reuse for nbd_open().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-29-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Vladimir Sementsov-Ogievskiy authored
block/nbd doesn't need the underlying sioc channel anymore. So, we can update the nbd/client-connection interface to return only one top-most I/O channel, which is more straightforward.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-27-vsementsov@virtuozzo.com>
[eblake: squash in Vladimir's fixes for uninit usage caught by clang]
Signed-off-by: Eric Blake <eblake@redhat.com>
Vladimir Sementsov-Ogievskiy authored
Currently the sioc pointer is used just to pass from the socket connection to NBD negotiation. Drop the field and use local variables instead. With the next commit we'll update nbd/client-connection.c to behave appropriately (return only the top-most ioc, not two channels).

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-26-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Vladimir Sementsov-Ogievskiy authored
Negotiation during reconnect is now done in a thread, and s->sioc is not available during negotiation. Negotiation in the thread will be cancelled by nbd_client_connection_release(), called from nbd_clear_bdrvstate(). So, we don't need this code chunk anymore.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-25-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Vladimir Sementsov-Ogievskiy authored
Now that we can opt in to negotiation as part of the client connection thread, use that to simplify connection_co. This is another step on the way to moving all reconnect code into NBDClientConnection.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-24-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Vladimir Sementsov-Ogievskiy authored
To be reused in the following patch.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-23-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Vladimir Sementsov-Ogievskiy authored
Add arguments and logic to support nbd negotiation in the same thread after successful connection.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-20-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Vladimir Sementsov-Ogievskiy authored
We now have a bs-independent connection API, which consists of four functions:

  nbd_client_connection_new()
  nbd_client_connection_release()
  nbd_co_establish_connection()
  nbd_co_establish_connection_cancel()

Move them to a separate file together with the NBDClientConnection structure, which becomes private to the new API.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-18-vsementsov@virtuozzo.com>
[eblake: comment tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
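An outline of the four functions as a header sketch; the function names come from the commit text, while the parameter lists and return types shown here are illustrative assumptions:

    typedef struct NBDClientConnection NBDClientConnection;  /* opaque */

    NBDClientConnection *nbd_client_connection_new(const SocketAddress *saddr);
    void nbd_client_connection_release(NBDClientConnection *conn);

    QIOChannelSocket *coroutine_fn
    nbd_co_establish_connection(NBDClientConnection *conn, Error **errp);

    void coroutine_fn
    nbd_co_establish_connection_cancel(NBDClientConnection *conn);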
Vladimir Sementsov-Ogievskiy authored
This is the last step of creating a bs-independent NBD connection interface. With the next commit we can finally move it to a separate file.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-17-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Vladimir Sementsov-Ogievskiy authored
This is a step towards creating a bs-independent NBD connection interface.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-16-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Vladimir Sementsov-Ogievskiy authored
We are going to move the connection code to its own file, and want clear names and APIs first. The structure is shared between the user and (possibly) several runs of the connect thread, so it's wrong to call it "thread". Let's rename it to something more generic. Correspondingly, rename the connect_thread and thr variables to conn.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-15-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Vladimir Sementsov-Ogievskiy authored
nbd_co_establish_connection_cancel() actually needs only a pointer to NBDConnectThread. So, clean up its interface to take just that.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-14-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Vladimir Sementsov-Ogievskiy authored
We are going to split the connection code into a separate file. Now we are ready to give nbd_co_establish_connection() a clean and bs-independent interface.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-13-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Vladimir Sementsov-Ogievskiy authored
We don't need all these states; the code refactored to use two boolean variables looks simpler. While moving the comment in nbd_co_establish_connection(), rework it to give better information. Also, we are going to move the connection code to a separate file, where mentioning the drained section would be confusing. Also improve the comment in NBDConnectThread, dropping the removed state names from it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-12-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: comment tweak]
Signed-off-by: Eric Blake <eblake@redhat.com>
Vladimir Sementsov-Ogievskiy authored
Instead of managing the connect_bh, bh_ctx, and wait_connect fields, we can use a single link to the waiting coroutine with proper mutex protection.

The new logic is: nbd_co_establish_connection() sets wait_co under the mutex, releases the mutex, then yield()s. Note that wait_co may be scheduled by the thread immediately after the mutex is unlocked. Still, the main thread (or iothread) will not reach the code for entering the coroutine until the yield(), so we are safe.

connect_thread_func() and nbd_co_establish_connection_cancel() do the following to handle wait_co: under the mutex, if thr->wait_co is not NULL, make it NULL and schedule it. This way, we avoid scheduling the coroutine twice.

Still, the scheduling is a bit different: in connect_thread_func() we can just call aio_co_wake() under the mutex, because after commit [async: the main AioContext is only "current" if under the BQL] we are sure that aio_co_wake() will not try to acquire the AioContext and do qemu_aio_coroutine_enter(), but will simply schedule the coroutine with aio_co_schedule(). nbd_co_establish_connection_cancel() will be called from non-coroutine context in a further patch and will be able to go through the qemu_aio_coroutine_enter() path of aio_co_wake(), so keep the current behavior of waking the coroutine after the critical section.

Also, this commit reduces the dependence of nbd_co_establish_connection() on the internals of bs (we now use a generic pointer to the coroutine instead of direct use of s->connection_co). This is a step towards splitting the connection API out of nbd.c.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-11-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
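A condensed sketch of the wait_co handshake described above; the structure and field names follow the commit text, everything else is simplified:

    /* nbd_co_establish_connection(), in the requesting coroutine: */
    qemu_mutex_lock(&conn->mutex);
    conn->wait_co = qemu_coroutine_self();
    qemu_mutex_unlock(&conn->mutex);
    qemu_coroutine_yield();              /* woken exactly once, see below */

    /* connect_thread_func(), in the connection thread: */
    qemu_mutex_lock(&conn->mutex);
    if (conn->wait_co) {
        Coroutine *co = conn->wait_co;
        conn->wait_co = NULL;            /* clear first: never schedule twice */
        aio_co_wake(co);                 /* only schedules here, per the
                                            "async: ..." commit cited above */
    }
    qemu_mutex_unlock(&conn->mutex);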
Vladimir Sementsov-Ogievskiy authored
These fields are write-only. Drop them.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-10-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Vladimir Sementsov-Ogievskiy authored
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20210610100802.5888-9-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Roman Kagan authored
Simplify lifetime management of BDRVNBDState->connect_thread by delaying its possible cleanup until the BDRVNBDState itself goes away. This also reverts 0267101a "block/nbd: fix possible use after free of s->connect_thread", as now s->connect_thread can't be cleared until the very end.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
[vsementsov: rebase, revert 0267101a changes]
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: tweak comment]
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-8-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>