- Apr 06, 2022
-
-
Marc-André Lureau authored
Signed-off-by:
Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-33-marcandre.lureau@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Marc-André Lureau authored
Signed-off-by:
Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-26-marcandre.lureau@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Marc-André Lureau authored
Replace the global variables with inlined helper functions. getpagesize() is very likely annotated with a "const" function attribute (at least with glibc), and thus optimization should apply even better. This avoids the need for a constructor initialization too. Signed-off-by:
Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-12-marcandre.lureau@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
- Mar 29, 2022
-
-
Hanna Reitz authored
When the stream block job cuts out the nodes between top and base in stream_prepare(), it does not drain the subtree manually; it fetches the base node, and tries to insert it as the top node's backing node with bdrv_set_backing_hd(). bdrv_set_backing_hd() however will drain, and so the actual base node might change (because the base node is actually not part of the stream job) before the old base node passed to bdrv_set_backing_hd() is installed. This has two implications: First, the stream job does not keep a strong reference to the base node. Therefore, if it is deleted in bdrv_set_backing_hd()'s drain (e.g. because some other block job is drained to finish), we will get a use-after-free. We should keep a strong reference to that node. Second, even with such a strong reference, the problem remains that the base node might change before bdrv_set_backing_hd() actually runs and as a result the wrong base node is installed. Both effects can be seen in 030's TestParallelOps.test_overlapping_5() case, which has five nodes, and simultaneously streams from the middle node to the top node, and commits the middle node down to the base node. As it is, this will sometimes crash, namely when we encounter the above-described use-after-free. Taking a strong reference to the base node, we no longer get a crash, but the resuling block graph is less than ideal: The expected result is obviously that all middle nodes are cut out and the base node is the immediate backing child of the top node. However, if stream_prepare() takes a strong reference to its base node (the middle node), and then the commit job finishes in bdrv_set_backing_hd(), supposedly dropping that middle node, the stream job will just reinstall it again. Therefore, we need to keep the whole subtree drained in stream_prepare(), so that the graph modification it performs is effectively atomic, i.e. that the base node it fetches is still the base node when bdrv_set_backing_hd() sets it as the top node's backing node. Verify this by asserting in said 030's test case that the base node is always the top node's immediate backing child when both jobs are done. Signed-off-by:
Hanna Reitz <hreitz@redhat.com> Message-Id: <20220324140907.17192-1-hreitz@redhat.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Acked-by:
Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru>
-
- Mar 24, 2022
-
-
Philippe Mathieu-Daudé authored
"0x%u" format is very misleading, replace by "0x%x". Found running: $ git grep -E '0x%[0-9]*([lL]*|" ?PRI)[dDuU]' block/ Inspired-by:
Richard Henderson <richard.henderson@linaro.org> Signed-off-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Hanna Reitz <hreitz@redhat.com> Reviewed-by:
Daniel P. Berrangé <berrange@redhat.com> Reviewed-by:
Denis V. Lunev <den@openvz.org> Message-id: 20220323114718.58714-2-philippe.mathieu.daude@gmail.com Signed-off-by:
Stefan Hajnoczi <stefanha@redhat.com>
-
- Mar 22, 2022
-
-
Marc-André Lureau authored
One less qemu-specific macro. It also helps to make some headers/units only depend on glib, and thus moved in standalone projects eventually. Signed-off-by:
Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by:
Richard Henderson <richard.henderson@linaro.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org>
-
Marc-André Lureau authored
One less qemu-specific macro. It also helps to make some headers/units only depend on glib, and thus moved in standalone projects eventually. Signed-off-by:
Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by:
Richard W.M. Jones <rjones@redhat.com>
-
Stefano Garzarella authored
Commit d24f8023 ("block/rbd: increase dynamically the image size") added a workaround to support growing images (eg. qcow2), resizing the image before write operations that exceed the current size. We recently added support for write zeroes and without the workaround we can have problems with qcow2. So let's move the resize into qemu_rbd_start_co() and do it when the command is RBD_AIO_WRITE or RBD_AIO_WRITE_ZEROES. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2020993 Fixes: c56ac27d ("block/rbd: add write zeroes support") Signed-off-by:
Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20220317162638.41192-1-sgarzare@redhat.com> Signed-off-by:
Hanna Reitz <hreitz@redhat.com>
-
- Mar 21, 2022
-
-
Rao Lei authored
During the IO stress test, the IO request coroutine has a probability that is can't be awakened when the NBD server is killed. The GDB stack is as follows: (gdb) bt 0 0x00007f2ff990cbf6 in __ppoll (fds=0x55575de85000, nfds=1, timeout=<optimized out>, sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:44 1 0x000055575c302e7c in qemu_poll_ns (fds=0x55575de85000, nfds=1, timeout=599999603140) at ../util/qemu-timer.c:348 2 0x000055575c2d3c34 in fdmon_poll_wait (ctx=0x55575dc480f0, ready_list=0x7ffd9dd1dae0, timeout=599999603140) at ../util/fdmon-poll.c:80 3 0x000055575c2d350d in aio_poll (ctx=0x55575dc480f0, blocking=true) at ../util/aio-posix.c:655 4 0x000055575c16eabd in bdrv_do_drained_begin(bs=0x55575dee7fe0, recursive=false, parent=0x0, ignore_bds_parents=false, poll=true)at ../block/io.c:474 5 0x000055575c16eba6 in bdrv_drained_begin (bs=0x55575dee7fe0) at ../block/io.c:480 6 0x000055575c1aff33 in quorum_del_child (bs=0x55575dee7fe0, child=0x55575dcea690, errp=0x7ffd9dd1dd08) at ../block/quorum.c:1130 7 0x000055575c14239b in bdrv_del_child (parent_bs=0x55575dee7fe0, child=0x55575dcea690, errp=0x7ffd9dd1dd08) at ../block.c:7705 8 0x000055575c12da28 in qmp_x_blockdev_change(parent=0x55575df404c0 "colo-disk0", has_child=true, child=0x55575de867f0 "children.1", has_node=false, no de=0x0, errp=0x7ffd9dd1dd08) at ../blockdev.c:3676 9 0x000055575c258435 in qmp_marshal_x_blockdev_change (args=0x7f2fec008190, ret=0x7f2ff7b0bd98, errp=0x7f2ff7b0bd90) at qapi/qapi-commands-block-core.c :1675 10 0x000055575c2c6201 in do_qmp_dispatch_bh (opaque=0x7f2ff7b0be30) at ../qapi/qmp-dispatch.c:129 11 0x000055575c2ebb1c in aio_bh_call (bh=0x55575dc429c0) at ../util/async.c:141 12 0x000055575c2ebc2a in aio_bh_poll (ctx=0x55575dc480f0) at ../util/async.c:169 13 0x000055575c2d2d96 in aio_dispatch (ctx=0x55575dc480f0) at ../util/aio-posix.c:415 14 0x000055575c2ec07f in aio_ctx_dispatch (source=0x55575dc480f0, callback=0x0, user_data=0x0) at ../util/async.c:311 15 0x00007f2ff9e7cfbd in g_main_context_dispatch () at /lib/x86_64-linux-gnu/libglib-2.0.so.0 16 0x000055575c2fd581 in glib_pollfds_poll () at ../util/main-loop.c:232 17 0x000055575c2fd5ff in os_host_main_loop_wait (timeout=0) at ../util/main-loop.c:255 18 0x000055575c2fd710 in main_loop_wait (nonblocking=0) at ../util/main-loop.c:531 19 0x000055575bfa7588 in qemu_main_loop () at ../softmmu/runstate.c:726 20 0x000055575bbee57a in main (argc=60, argv=0x7ffd9dd1e0e8, envp=0x7ffd9dd1e2d0) at ../softmmu/main.c:50 (gdb) qemu coroutine 0x55575e16aac0 0 0x000055575c2ee7dc in qemu_coroutine_switch (from_=0x55575e16aac0, to_=0x7f2ff830fba0, action=COROUTINE_YIELD) at ../util/coroutine-ucontext.c:302 1 0x000055575c2fe2a9 in qemu_coroutine_yield () at ../util/qemu-coroutine.c:195 2 0x000055575c2fe93c in qemu_co_queue_wait_impl (queue=0x55575dc46170, lock=0x7f2b32ad9850) at ../util/qemu-coroutine-lock.c:56 3 0x000055575c17ddfb in nbd_co_send_request (bs=0x55575ebfaf20, request=0x7f2b32ad9920, qiov=0x55575dfc15d8) at ../block/nbd.c:478 4 0x000055575c17f931 in nbd_co_request (bs=0x55575ebfaf20, request=0x7f2b32ad9920, write_qiov=0x55575dfc15d8) at ../block/nbd.c:1182 5 0x000055575c17fe14 in nbd_client_co_pwritev (bs=0x55575ebfaf20, offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, flags=0) at ../block/nbd.c:1284 6 0x000055575c170d25 in bdrv_driver_pwritev (bs=0x55575ebfaf20, offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, qiov_offset=0, flags=0) at ../block/io.c:1264 7 0x000055575c1733b4 in bdrv_aligned_pwritev (child=0x55575dff6890, req=0x7f2b32ad9ad0, offset=403487858688, bytes=4538368, align=1, qiov=0x55575dfc15d8, qiov_offset=0, flags=0) at ../block/io.c:2126 8 0x000055575c173c67 in bdrv_co_pwritev_part (child=0x55575dff6890, offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, qiov_offset=0, flags=0) at ../block/io.c:2314 9 0x000055575c17391b in bdrv_co_pwritev (child=0x55575dff6890, offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, flags=0) at ../block/io.c:2233 10 0x000055575c1ee506 in replication_co_writev (bs=0x55575e9824f0, sector_num=788062224, remaining_sectors=8864, qiov=0x55575dfc15d8, flags=0) at ../block/replication.c:270 11 0x000055575c170eed in bdrv_driver_pwritev (bs=0x55575e9824f0, offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, qiov_offset=0, flags=0) at ../block/io.c:1297 12 0x000055575c1733b4 in bdrv_aligned_pwritev (child=0x55575dcea690, req=0x7f2b32ad9e00, offset=403487858688, bytes=4538368, align=512, qiov=0x55575dfc15d8, qiov_offset=0, flags=0) at ../block/io.c:2126 13 0x000055575c173c67 in bdrv_co_pwritev_part (child=0x55575dcea690, offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, qiov_offset=0, flags=0) at ../block/io.c:2314 14 0x000055575c17391b in bdrv_co_pwritev (child=0x55575dcea690, offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, flags=0) at ../block/io.c:2233 15 0x000055575c1aeffa in write_quorum_entry (opaque=0x7f2fddaf8c50) at ../block/quorum.c:699 16 0x000055575c2ee4db in coroutine_trampoline (i0=1578543808, i1=21847) at ../util/coroutine-ucontext.c:173 17 0x00007f2ff9855660 in __start_context () at ../sysdeps/unix/sysv/linux/x86_64/__start_context.S:91 When we do failover in COLO mode, QEMU will hang while it is waiting for the in-flight IO. From the call trace, we can see the IO request coroutine has yielded in nbd_co_send_request(). When we kill the NBD server, it will never be wake up. Actually, when we do IO stress test, it will have a lot of requests in free_sema queue. When the NBD server is killed, current MAX_NBD_REQUESTS finishes with errors but they wake up at most MAX_NBD_REQEUSTS from the queue. So, let's move qemu_co_queue_next out to fix this issue. Signed-off-by:
Lei Rao <lei.rao@intel.com> Message-Id: <20220309074844.275450-1-lei.rao@intel.com> Reviewed-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by:
Eric Blake <eblake@redhat.com>
-
- Mar 15, 2022
-
-
Philippe Mathieu-Daudé authored
When building on macOS 12 we get: block/file-posix.c:3335:18: warning: 'IOMasterPort' is deprecated: first deprecated in macOS 12.0 [-Wdeprecated-declarations] kernResult = IOMasterPort( MACH_PORT_NULL, &masterPort ); ^~~~~~~~~~~~ IOMainPort Replace by IOMainPort, redefining it to IOMasterPort if not available. Suggested-by:
Akihiko Odaki <akihiko.odaki@gmail.com> Reviewed-by:
Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed by: Cameron Esfahani <dirty@apple.com> Reviewed-by:
Akihiko Odaki <akihiko.odaki@gmail.com> Tested-by:
Akihiko Odaki <akihiko.odaki@gmail.com> Signed-off-by:
Philippe Mathieu-Daudé <f4bug@amsat.org>
-
- Mar 07, 2022
-
-
Daniel P. Berrangé authored
The TLS usage for NBD was restricted to IP sockets because validating x509 certificates requires knowledge of the hostname that the client is connecting to. TLS does not have to use x509 certificates though, as PSK (pre-shared keys) provide an alternative credential option. These have no requirement for a hostname and can thus be trivially used for UNIX sockets. Furthermore, with the ability to overide the default hostname for TLS validation in the previous patch, it is now also valid to want to use x509 certificates with FD passing and UNIX sockets. Reviewed-by:
Eric Blake <eblake@redhat.com> Signed-off-by:
Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220304193610.3293146-6-berrange@redhat.com> Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Daniel P. Berrangé authored
When connecting to an NBD server with TLS and x509 credentials, the client must validate the hostname it uses for the connection, against that published in the server's certificate. If the client is tunnelling its connection over some other channel, however, the hostname it uses may not match the info reported in the server's certificate. In such a case, the user needs to explicitly set an override for the hostname to use for certificate validation. This is achieved by adding a 'tls-hostname' property to the NBD block driver. Reviewed-by:
Eric Blake <eblake@redhat.com> Signed-off-by:
Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220304193610.3293146-4-berrange@redhat.com> Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Daniel P. Berrangé authored
In commit a71d597b Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Date: Thu Jun 10 13:08:00 2021 +0300 block/nbd: reuse nbd_co_do_establish_connection() in nbd_open() the use of the 'hostname' field from the BDRVNBDState struct was lost, and 'nbd_connect' just hardcoded it to match the IP socket address. This was a harmless bug at the time since we block use with anything other than IP sockets. Shortly though, we want to allow the caller to override the hostname used in the TLS certificate checks. This is to allow for TLS when doing port forwarding or tunneling. Thus we need to reinstate the passing along of the 'hostname'. Reviewed-by:
Eric Blake <eblake@redhat.com> Signed-off-by:
Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220304193610.3293146-3-berrange@redhat.com> Signed-off-by:
Eric Blake <eblake@redhat.com>
-
Peter Maydell authored
Move the various memalign-related functions out of osdep.h and into their own header, which we include only where they are used. While we're doing this, add some brief documentation comments. Signed-off-by:
Peter Maydell <peter.maydell@linaro.org> Reviewed-by:
Richard Henderson <richard.henderson@linaro.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20220226180723.1706285-10-peter.maydell@linaro.org
-
Vladimir Sementsov-Ogievskiy authored
Current scheme of image fleecing looks like this: [guest] [NBD export] | | |root | root v v [copy-before-write] -----> [temp.qcow2] | target | |file |backing v | [active disk] <-------------+ - On guest writes copy-before-write filter copies old data from active disk to temp.qcow2. So fleecing client (NBD export) when reads changed regions from temp.qcow2 image and unchanged from active disk through backing link. This patch makes possible new image fleecing scheme: [guest] [NBD export] | | | root | root v file v [copy-before-write]<------[snapshot-access] | | | file | target v v [active-disk] [temp.img] - copy-before-write does CBW operations and also provides snapshot-access API. The API may be accessed through snapshot-access driver. Benefits of new scheme: 1. Access control: if remote client try to read data that not covered by original dirty bitmap used on copy-before-write open, client gets -EACCES. 2. Discard support: if remote client do DISCARD, this additionally to discarding data in temp.img informs block-copy process to not copy these clusters. Next read from discarded area will return -EACCES. This is significant thing: when fleecing user reads data that was not yet copied to temp.img, we can avoid copying it on further guest write. 3. Synchronisation between client reads and block-copy write is more efficient. In old scheme we just rely on BDRV_REQ_SERIALISING flag used for writes to temp.qcow2. New scheme is less blocking: - fleecing reads are never blocked: if data region is untouched or in-flight, we just read from active-disk, otherwise we read from temp.img - writes to temp.img are not blocked by fleecing reads - still, guest writes of-course are blocked by in-flight fleecing reads, that currently read from active-disk - it's the minimum necessary blocking 4. Temporary image may be of any format, as we don't rely on backing feature. 5. Permission relation are simplified. With old scheme we have to share write permission on target child of copy-before-write, otherwise backing link conflicts with copy-before-write file child write permissions. With new scheme we don't have backing link, and copy-before-write node may have unshared access to temporary node. (Not realized in this commit, will be in future). 6. Having control on fleecing reads we'll be able to implement alternative behavior on failed copy-before-write operations. Currently we just break guest request (that's a historical behavior of backup). But in some scenarios it's a bad behavior: better is to drop the backup as failed but don't break guest request. With new scheme we can simply unset some bits in a bitmap on CBW failure and further fleecing reads will -EACCES, or something like this. (Not implemented in this commit, will be in future) Additional application for this is implementing timeout for CBW operations. Iotest 257 output is updated, as two more bitmaps now live in copy-before-write filter. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20220303194349.2304213-13-vsementsov@virtuozzo.com> Signed-off-by:
Hanna Reitz <hreitz@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
The new block driver simply utilizes snapshot-access API of underlying block node. In further patches we want to use it like this: [guest] [NBD export] | | | root | root v file v [copy-before-write]<------[snapshot-access] | | | file | target v v [active-disk] [temp.img] This way, NBD client will be able to read snapshotted state of active disk, when active disk is continued to be written by guest. This is known as "fleecing", and currently uses another scheme based on qcow2 temporary image which backing file is active-disk. New scheme comes with benefits - see next commit. The other possible application is exporting internal snapshots of qcow2, like this: [guest] [NBD export] | | | root | root v file v [qcow2]<---------[snapshot-access] For this, we'll need to implement snapshot-access API handlers in qcow2 driver, and improve snapshot-access block driver (and API) to make it possible to select snapshot by name. Another thing to improve is size of snapshot. Now for simplicity we just use size of bs->file, which is OK for backup, but for qcow2 snapshots export we'll need to imporve snapshot-access API to get size of snapshot. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20220303194349.2304213-12-vsementsov@virtuozzo.com> [hreitz: Rebased on block GS/IO split] Signed-off-by:
Hanna Reitz <hreitz@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
Add new block driver handlers and corresponding generic wrappers. It will be used to allow copy-before-write filter to provide reach fleecing interface in further commit. In future this approach may be used to allow reading qcow2 internal snapshots, for example to export them through NBD. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Hanna Reitz <hreitz@redhat.com> Message-Id: <20220303194349.2304213-11-vsementsov@virtuozzo.com> [hreitz: Rebased on block GS/IO split] Signed-off-by:
Hanna Reitz <hreitz@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
Add function to wait for all intersecting requests. To be used in the further commit. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Nikita Lapshin <nikita.lapshin@virtuozzo.com> Reviewed-by:
Hanna Reitz <hreitz@redhat.com> Message-Id: <20220303194349.2304213-10-vsementsov@virtuozzo.com> Signed-off-by:
Hanna Reitz <hreitz@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
Add a convenient function similar with bdrv_block_status() to get status of dirty bitmap. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Hanna Reitz <hreitz@redhat.com> Message-Id: <20220303194349.2304213-9-vsementsov@virtuozzo.com> Signed-off-by:
Hanna Reitz <hreitz@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
Let's reuse convenient helper. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Hanna Reitz <hreitz@redhat.com> Message-Id: <20220303194349.2304213-8-vsementsov@virtuozzo.com> Signed-off-by:
Hanna Reitz <hreitz@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
Split intersecting-requests functionality out of block-copy to be reused in copy-before-write filter. Note: while being here, fix tiny typo in MAINTAINERS. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Hanna Reitz <hreitz@redhat.com> Message-Id: <20220303194349.2304213-7-vsementsov@virtuozzo.com> Signed-off-by:
Hanna Reitz <hreitz@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
Split block_copy_reset() out of block_copy_reset_unallocated() to be used separately later. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Hanna Reitz <hreitz@redhat.com> Message-Id: <20220303194349.2304213-6-vsementsov@virtuozzo.com> Signed-off-by:
Hanna Reitz <hreitz@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
This brings "incremental" mode to copy-before-write filter: user can specify bitmap so that filter will copy only "dirty" areas. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20220303194349.2304213-5-vsementsov@virtuozzo.com> Signed-off-by:
Hanna Reitz <hreitz@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
This will be used in the following commit to bring "incremental" mode to copy-before-write filter. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Hanna Reitz <hreitz@redhat.com> Message-Id: <20220303194349.2304213-4-vsementsov@virtuozzo.com> Signed-off-by:
Hanna Reitz <hreitz@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
That simplifies handling failure in existing code and in further new usage of bdrv_merge_dirty_bitmap(). Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Hanna Reitz <hreitz@redhat.com> Message-Id: <20220303194349.2304213-3-vsementsov@virtuozzo.com> Signed-off-by:
Hanna Reitz <hreitz@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
We are going to complicate bitmap initialization in the further commit. And in future, backup job will be able to work without filter (when source is immutable), so we'll need same bitmap initialization in copy-before-write filter and in backup job. So, it's reasonable to do it in block-copy. Note that for now cbw_open() is the only caller of block_copy_state_new(). Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Hanna Reitz <hreitz@redhat.com> Message-Id: <20220303194349.2304213-2-vsementsov@virtuozzo.com> Signed-off-by:
Hanna Reitz <hreitz@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
There is a bug in handling BDRV_REQ_NO_WAIT flag: we still may wait in wait_serialising_requests() if request is unaligned. And this is possible for the only user of this flag (preallocate filter) if underlying file is unaligned to its request_alignment on start. So, we have to fix preallocate filter to do only aligned preallocate requests. Next, we should fix generic block/io.c somehow. Keeping in mind that preallocate is the only user of BDRV_REQ_NO_WAIT and that we have to fix its behavior now, it seems more safe to just assert that we never use BDRV_REQ_NO_WAIT with unaligned requests and add corresponding comment. Let's do so. Signed-off-by:
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by:
Denis V. Lunev <den@openvz.org> Message-Id: <20220215121609.38570-1-vsementsov@virtuozzo.com> [hreitz: Rebased on block GS/IO split] Signed-off-by:
Hanna Reitz <hreitz@redhat.com>
-
Peter Maydell authored
Coverity points out that we aren't checking the return value from curl_easy_setopt() for any of the calls to it we make in block/curl.c. Some of these options are documented as always succeeding (e.g. CURLOPT_VERBOSE) but others have documented failure cases (e.g. CURLOPT_URL). For consistency we check every call, even the ones that theoretically cannot fail. Fixes: Coverity CID 1459336, 1459482, 1460331 Signed-off-by:
Peter Maydell <peter.maydell@linaro.org> Message-Id: <20220222152341.850419-3-peter.maydell@linaro.org> Reviewed-by:
Hanna Reitz <hreitz@redhat.com> Signed-off-by:
Hanna Reitz <hreitz@redhat.com>
-
Peter Maydell authored
In curl_open(), the 'out' label assumes that the state->errmsg string has been set (either by curl_easy_perform() or by manually copying a string into it); however if curl_init_state() fails we will jump to that label without setting the string. Add the missing error string setup. (We can't be specific about the cause of failure: the documentation of curl_easy_init() just says "If this function returns NULL, something went wrong".) Signed-off-by:
Peter Maydell <peter.maydell@linaro.org> Message-Id: <20220222152341.850419-2-peter.maydell@linaro.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Hanna Reitz <hreitz@redhat.com> Signed-off-by:
Hanna Reitz <hreitz@redhat.com>
-
- Mar 04, 2022
-
-
Hanna Reitz authored
Otherwise, the BDS might be freed while the job is running, which would cause a use-after-free. Signed-off-by:
Hanna Reitz <hreitz@redhat.com> Message-Id: <20220304153729.711387-5-hreitz@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Hanna Reitz authored
.bdrv_amend_clean() says block drivers can use it to clean up what was done in .bdrv_amend_pre_run(). Therefore, it should always be called after .bdrv_amend_pre_run(), which means we need it to call it in the JobDriver.free() callback, not in JobDriver.clean(). Signed-off-by:
Hanna Reitz <hreitz@redhat.com> Message-Id: <20220304153729.711387-3-hreitz@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Hanna Reitz authored
bdrv_refresh_limits() recurses down to the node's children. That does not seem necessary: When we refresh limits on some node, and then recurse down and were to change one of its children's BlockLimits, then that would mean we noticed the changed limits by pure chance. The fact that we refresh the parent's limits has nothing to do with it, so the reason for the change probably happened before this point in time, and we should have refreshed the limits then. Consequently, we should actually propagate block limits changes upwards, not downwards. That is a separate and pre-existing issue, though, and so will not be addressed in this patch. The problem with recursing is that bdrv_refresh_limits() is not atomic. It begins with zeroing BDS.bl, and only then sets proper, valid limits. If we do not drain all nodes whose limits are refreshed, then concurrent I/O requests can encounter invalid request_alignment values and crash qemu. Therefore, a recursing bdrv_refresh_limits() requires the whole subtree to be drained, which is currently not ensured by most callers. A non-recursive bdrv_refresh_limits() only requires the node in question to not receive I/O requests, and this is done by most callers in some way or another: - bdrv_open_driver() deals with a new node with no parents yet - bdrv_set_file_or_backing_noperm() acts on a drained node - bdrv_reopen_commit() acts only on drained nodes - bdrv_append() should in theory require the node to be drained; in practice most callers just lock the AioContext, which should at least be enough to prevent concurrent I/O requests from accessing invalid limits So we can resolve the bug by making bdrv_refresh_limits() non-recursive. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1879437 Signed-off-by:
Hanna Reitz <hreitz@redhat.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Message-Id: <20220216105355.30729-2-hreitz@redhat.com> Reviewed-by:
Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Emanuele Giuseppe Esposito authored
Signed-off-by:
Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220303151616.325444-27-eesposit@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Emanuele Giuseppe Esposito authored
block coroutines functions run in different aiocontext, and are not protected by the BQL. Therefore are I/O. On the other side, generated_co_wrapper functions use BDRV_POLL_WHILE, meaning the caller can either be the main loop or a specific iothread. Signed-off-by:
Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220303151616.325444-25-eesposit@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Emanuele Giuseppe Esposito authored
copy-before-write functions always run under BQL. Signed-off-by:
Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220303151616.325444-24-eesposit@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Emanuele Giuseppe Esposito authored
Snapshots run also under the BQL, so they all are in the global state API. The aiocontext lock that they hold is currently an overkill and in future could be removed. Signed-off-by:
Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220303151616.325444-23-eesposit@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Emanuele Giuseppe Esposito authored
Signed-off-by:
Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220303151616.325444-22-eesposit@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Emanuele Giuseppe Esposito authored
Following the assertion derived from the API split, propagate the assertion also in the static functions. Signed-off-by:
Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220303151616.325444-18-eesposit@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Emanuele Giuseppe Esposito authored
Mark all I/O functions with IO_CODE, and all "I/O OR GS" with IO_OR_GS_CODE. Signed-off-by:
Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220303151616.325444-14-eesposit@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Emanuele Giuseppe Esposito authored
Signed-off-by:
Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220303151616.325444-13-eesposit@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-