- Nov 21, 2023
-
-
Kevin Wolf authored
bdrv_graph_wrunlock() calls aio_poll(), which may run callbacks that have a nested event loop. Nested event loops can depend on other iothreads making progress, so in order to allow them to make progress it must not hold the AioContext lock of another thread while calling aio_poll().

This introduces a @bs parameter to bdrv_graph_wrunlock() whose AioContext is temporarily dropped (which matches bdrv_graph_wrlock()), and a bdrv_graph_wrunlock_ctx() that can be used if the BlockDriverState doesn't necessarily exist any more when unlocking.

This also requires a change to bdrv_schedule_unref(), which was relying on the incorrectly taken lock. It needs to take the lock itself now. While this is a separate bug, it can't be fixed in a separate patch because otherwise the intermediate state would either deadlock or try to release a lock that we don't even hold.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231115172012.112727-3-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[kwolf: Fixed up bdrv_schedule_unref()]
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
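A minimal caller-side sketch of the two unlock variants (illustrative; the helper function and its body are assumptions, not taken from the patch):

    /* Hedged sketch: a graph-modifying helper using the new unlock variants. */
    static void example_detach(BlockDriverState *bs, BdrvChild *child)
    {
        AioContext *ctx = bdrv_get_aio_context(bs);

        bdrv_graph_wrlock(bs);             /* temporarily drops bs's AioContext lock */
        bdrv_root_unref_child(child);      /* may drop the last reference to bs */
        /* bs may no longer exist here, so unlock via the saved context */
        bdrv_graph_wrunlock_ctx(ctx);
    }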
-
- Nov 07, 2023
-
-
Kevin Wolf authored
Instead of taking the writer lock internally, require callers to already hold it when calling block_job_add_bdrv(). These callers will typically already hold the graph lock once the locking work is completed, which means that they can't call functions that take it internally.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-6-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
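A hedged sketch of the resulting calling convention (the job, node and permission values are illustrative; the graph lock/unlock signatures follow the newest entries above):

    /* Hedged sketch: the caller now wraps block_job_add_bdrv() in the
     * graph writer lock. */
    static int example_add_target(BlockJob *job, BlockDriverState *bs,
                                  Error **errp)
    {
        int ret;

        bdrv_graph_wrlock(bs);
        ret = block_job_add_bdrv(job, "target", bs, 0, BLK_PERM_ALL, errp);
        bdrv_graph_wrunlock(bs);
        return ret;
    }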
-
Kevin Wolf authored
Instead of taking the writer lock internally, require callers to already hold it when calling bdrv_root_attach_child(). These callers will typically already hold the graph lock once the locking work is completed, which means that they can't call functions that take it internally.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-5-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
- Oct 31, 2023
-
-
Fiona Ebner authored
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20231031135431.393137-9-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
Fiona Ebner authored
In preparation to turn BlockJobInfo into a union with @type as the discriminator. That requires it to be an enum. Even without that requirement, it's nicer to have an enum instead of a str here. No functional change is intended.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20231031135431.393137-7-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
Fiona Ebner authored
which will allow changing job-type-specific options after job creation. In the JobVerbTable, the same allow bits as for set-speed are used, because set-speed can be considered an existing change command.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20231031135431.393137-2-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
Fiona Ebner authored
Same rationale as in 31b2ddfe ("graph-lock: Unlock the AioContext while polling"). Otherwise, a deadlock can happen. The alternative would be to pass a BlockDriverState along to bdrv_graph_wrlock(), but there is no BlockDriverState readily available and it's also better conceptually, because the lock is held for the job.

The function is always called with the job's AioContext lock held, via one of the .abort, .clean, .free or .prepare job driver functions. Thus, it's safe to drop it. While mirror_exit_common() does hold a second AioContext lock while calling block_job_remove_all_bdrv(), that is for the main thread's AioContext and does not need to be dropped (bdrv_graph_wrlock(bs) also skips dropping the lock if bdrv_get_aio_context(bs) == qemu_get_aio_context()).

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20231019131936.414246-2-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
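A hedged sketch of the lock dance this describes (illustrative, not the verbatim patch):

    /* Drop the job's AioContext lock around the write-locked graph change so
     * that aio_poll() inside bdrv_graph_wrunlock() cannot deadlock; the main
     * context does not need to be dropped. */
    AioContext *ctx = job->job.aio_context;

    if (ctx != qemu_get_aio_context()) {
        aio_context_release(ctx);
    }
    bdrv_graph_wrlock(NULL);
    /* ... detach the job's nodes from the graph ... */
    bdrv_graph_wrunlock();
    if (ctx != qemu_get_aio_context()) {
        aio_context_acquire(ctx);
    }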
-
- Oct 12, 2023
-
-
Kevin Wolf authored
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_get_parent_name() need to hold a reader lock for the graph because it accesses the parents list of a node.

For some places, we know that they will hold the lock, but we don't have the GRAPH_RDLOCK annotations yet. In this case, add assume_graph_lock() with a FIXME comment. These places will be removed once everything is properly annotated.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-13-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
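A hedged sketch of the two mechanisms described above, the annotated declaration and the temporary assume_graph_lock() escape hatch (the caller function is illustrative):

    /* Annotated declaration: callers must hold the graph reader lock. */
    const char * GRAPH_RDLOCK bdrv_get_parent_name(const BlockDriverState *bs);

    /* A call site known to hold the lock but not yet annotated itself: */
    static void example_caller(BlockDriverState *bs)
    {
        /* FIXME: annotate the caller with GRAPH_RDLOCK instead */
        assume_graph_lock();
        const char *parent = bdrv_get_parent_name(bs);
        (void)parent;
    }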
-
- Sep 20, 2023
-
-
Kevin Wolf authored
Instead of taking the writer lock internally, require callers to already hold it when calling bdrv_root_unref_child(). These callers will typically already hold the graph lock once the locking work is completed, which means that they can't call functions that take it internally.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230911094620.45040-20-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
- Jun 28, 2023
-
-
Kevin Wolf authored
bdrv_root_attach_child() requires callers to hold the AioContext lock for child_bs. Take it in block_job_add_bdrv() before calling the function.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230605085711.21261-10-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
- May 19, 2023
-
-
Kevin Wolf authored
When jobs are sleeping, for example to enforce a given rate limit, they can be reentered early, in particular in order to get paused, to update the rate limit or to get cancelled. Before this patch, they behave in this case as if they had fully completed their rate limiting delay. This means that requests are sped up beyond their limit, violating the constraints that the user gave us.

Change the block jobs to sleep in a loop until the necessary delay is completed, while still allowing cancelling them immediately as well as pausing (handled by the pause point in job_sleep_ns()) and updating the rate limit.

This change is also motivated by iotests cases being prone to fail because drain operations pause and unpause them so often that block jobs complete earlier than they are supposed to. In particular, the next commit would fail iotests 030 without this change.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-8-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
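A hedged sketch of the looping sleep described above (the helper name and loop details are illustrative, not necessarily those used in the patch):

    /* Sleep until the rate-limit delay has really elapsed, instead of
     * treating an early wakeup as a completed delay. */
    static void coroutine_fn example_ratelimit_sleep(BlockJob *job)
    {
        uint64_t delay_ns;

        do {
            delay_ns = ratelimit_calculate_delay(&job->limit, 0);
            job_sleep_ns(&job->job, delay_ns);   /* contains a pause point */
        } while (delay_ns && !job_is_cancelled(&job->job));
    }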
-
- Jan 20, 2023
-
-
Markus Armbruster authored
We have two inclusion loops:

    block/block.h -> block/block-global-state.h -> block/block-common.h -> block/blockjob.h -> block/block.h
    block/block.h -> block/block-io.h -> block/block-common.h -> block/blockjob.h -> block/block.h

I believe these go back to Emanuele's reorganization of the block API, merged a few months ago in commit d7e2fe4a. Fortunately, breaking them is merely a matter of deleting unnecessary includes from headers, and adding them back in places where they are now missing.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221221133551.3967339-2-armbru@redhat.com>
-
- Jan 19, 2023
-
-
Markus Armbruster authored
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221221131435.3851212-2-armbru@redhat.com>
-
- Dec 15, 2022
-
-
Kevin Wolf authored
drained_end_counter is unused now, nobody changes its value any more. It can be removed.

In cases where we had two almost identical functions that only differed in whether the caller passes drained_end_counter, or whether they would poll for a local drained_end_counter to reach 0, these become a single function.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20221118174110.55183-5-kwolf@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
- Dec 14, 2022
-
-
Markus Armbruster authored
The has_FOO for pointer-valued FOO are redundant, except for arrays. They are also a nuisance to work with. Recent commit "qapi: Start to elide redundant has_FOO in generated C" provided the means to elide them step by step. This is the step for qapi/block*.json. Said commit explains the transformation in more detail.

There is one instance of the invariant violation mentioned there: qcow2_signal_corruption() passes false, "" when node_name is an empty string. Take care to pass NULL then. The previous two commits cleaned up two more. Additionally, helper bdrv_latency_histogram_stats() loses its output parameters and returns a value instead.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221104160712.3005652-11-armbru@redhat.com>
[Fixes for #ifndef LIBRBD_SUPPORTS_ENCRYPTION and MacOS squashed in]
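A hedged before/after sketch of the elision for an optional pointer member (the member and helper names are illustrative):

    /* Before: the generated struct carried a redundant has_node_name flag. */
    if (arg->has_node_name && strcmp(arg->node_name, "node0") == 0) {
        handle_node(arg->node_name);
    }

    /* After: the pointer itself signals presence; NULL means absent. */
    if (arg->node_name && strcmp(arg->node_name, "node0") == 0) {
        handle_node(arg->node_name);
    }

    /* The invariant mentioned above: an absent optional string must now be
     * passed as NULL, not as (false, ""), e.g. in qcow2_signal_corruption(). */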
-
- Nov 10, 2022
-
-
Hanna Reitz authored
We want to use bdrv_child_get_parent_aio_context() from bdrv_parent_drained_{begin,end}_single(), both of which are "I/O or GS" functions.

Prior to 3ed4f708, all the implementations were I/O code anyway. 3ed4f708 has put block jobs' AioContext field under the job mutex, so to make child_job_get_parent_aio_context() work in an I/O context, we need to take that lock there.

Furthermore, blk_root_get_parent_aio_context() is not marked as anything, but is safe to run in an I/O context, so mark it that way now. (blk_get_aio_context() is an I/O code function.)

With that done, all implementations explicitly are I/O code, so we can mark bdrv_child_get_parent_aio_context() as I/O code, too, so callers know it is safe to run from both GS and I/O contexts.

Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20221107151321.211175-2-hreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
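A hedged sketch of taking the job mutex in the block-job implementation (close to, but not necessarily identical to, the patch):

    static AioContext *child_job_get_parent_aio_context(BdrvChild *c)
    {
        BlockJob *job = c->opaque;

        IO_CODE();
        JOB_LOCK_GUARD();   /* job->job.aio_context is protected by the job mutex */
        return job->job.aio_context;
    }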
-
- Oct 27, 2022
-
-
Emanuele Giuseppe Esposito authored
Together with all _can_set_ and _set_ APIs, as they are not needed anymore.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221025084952.2139888-9-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
Emanuele Giuseppe Esposito authored
child_job_change_aio_ctx() is very similar to child_job_can_set_aio_ctx(), but it implements a new transaction so that if all checks pass, the new transaction's .commit() will take care of changing the BlockJob AioContext. child_job_set_aio_ctx_commit() is similar to child_job_set_aio_ctx(), but it doesn't need to invoke the recursion, as this is already taken care of by child_job_change_aio_ctx().

Note: bdrv_child_try_change_aio_context() is not called by anyone at this point.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221025084952.2139888-5-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
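A hedged, simplified sketch of the transaction-based variant (the state struct and its fields are illustrative):

    typedef struct JobCtxState {
        BlockJob *job;
        AioContext *new_ctx;
    } JobCtxState;

    static void child_job_set_aio_ctx_commit(void *opaque)
    {
        JobCtxState *s = opaque;

        job_set_aio_context(&s->job->job, s->new_ctx);
    }

    static TransactionActionDrv change_child_job_ctx = {
        .commit = child_job_set_aio_ctx_commit,
        .clean  = g_free,
    };

    static bool child_job_change_aio_ctx(BdrvChild *c, AioContext *ctx,
                                         GHashTable *visited, Transaction *tran,
                                         Error **errp)
    {
        BlockJob *job = c->opaque;
        JobCtxState *s;

        /* ... recurse into the job's other nodes and run the checks ... */

        s = g_new(JobCtxState, 1);
        *s = (JobCtxState) { .job = job, .new_ctx = ctx };
        tran_add(tran, &change_child_job_ctx, s);
        return true;
    }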
-
- Oct 07, 2022
-
-
Emanuele Giuseppe Esposito authored
These public functions are not used anywhere, thus can be dropped.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20220926093214.506243-21-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
Emanuele Giuseppe Esposito authored
Not sure what the atomic here was supposed to do, since job.busy is protected by the job lock. Since the whole function is called under job_mutex, just remove the atomic.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20220926093214.506243-20-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
Emanuele Giuseppe Esposito authored
iostatus is the only field (together with .job) that needs protection using the job mutex. It is set in the main loop (GLOBAL_STATE functions) but read in I/O code (block_job_error_action).

In order to protect it, change block_job_iostatus_set_err to block_job_iostatus_set_err_locked(), always called under job lock.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20220926093214.506243-17-eesposit@redhat.com>
[kwolf: Fixed up type of iostatus]
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
Emanuele Giuseppe Esposito authored
They are all called with job_lock held, in job_event_*_locked().

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20220926093214.506243-16-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
Emanuele Giuseppe Esposito authored
In order to make it thread safe, implement a "fake rwlock", where we allow reads under BQL *or* job_mutex held, but writes only under BQL *and* job_mutex.

The only write we have is in child_job_set_aio_ctx, which always happens under drain (so the job is paused). For this reason, introduce job_set_aio_context and make sure that the context is set under BQL, job_mutex and drain. Also make sure all other places where the AioContext is read are protected.

The reads in commit.c and mirror.c are actually safe, because they are always done under the BQL.

Note: at this stage, job_{lock/unlock} and job lock guard macros are *nop*.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220926093214.506243-14-eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
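A hedged sketch of the write side of the "fake rwlock" (close to, but not necessarily identical to, the patch):

    void job_set_aio_context(Job *job, AioContext *ctx)
    {
        GLOBAL_STATE_CODE();    /* writes require the BQL ... */
        JOB_LOCK_GUARD();       /* ... and job_mutex */
        /* the job must be quiescent (drained/paused) while the context changes */
        assert(job->paused || job_is_completed_locked(job));
        job->aio_context = ctx;
    }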
-
Emanuele Giuseppe Esposito authored
Now that the API also offers _locked() functions, take advantage of them and also give the caller control to take the lock and call the _locked functions. This makes sense especially when we have for loops, because it makes no sense to have:

    for (job = job_next(); ...)

where each job_next() takes the lock internally. Instead we want:

    JOB_LOCK_GUARD();
    for (job = job_next_locked(); ...)

In addition, protect also direct field accesses, by either creating a new critical section or widening the existing ones.

Note: at this stage, job_{lock/unlock} and job lock guard macros are *nop*.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20220926093214.506243-12-eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
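A hedged sketch of the caller-held-lock loop pattern described above (the loop body is illustrative):

    void example_iterate_jobs(void)
    {
        Job *job;

        JOB_LOCK_GUARD();
        for (job = job_next_locked(NULL); job; job = job_next_locked(job)) {
            /* direct field accesses are safe here: job_mutex is held */
            if (job->status == JOB_STATUS_RUNNING) {
                job_pause_locked(job);
            }
        }
    }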
-
Emanuele Giuseppe Esposito authored
Just as done with job.h, create _locked() functions in blockjob.h. These functions will be useful later when the caller has already taken the lock. All blockjob _locked functions call job _locked functions.

Note: at this stage, job_{lock/unlock} and job lock guard macros are *nop*.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220926093214.506243-8-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
Emanuele Giuseppe Esposito authored
This comment applies more to job; it was left in blockjob because in the past the whole job logic was implemented there.

Note: at this stage, job_{lock/unlock} and job lock guard macros are *nop*. No functional change intended.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220926093214.506243-7-eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
- Mar 04, 2022
-
-
Emanuele Giuseppe Esposito authored
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220303151616.325444-20-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
Emanuele Giuseppe Esposito authored
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220303151616.325444-17-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
- Dec 28, 2021
-
-
Vladimir Sementsov-Ogievskiy authored
It's unused now (except for permission handling)[*]. The only reasonable user of it was the block-stream job, recently updated to use its own blk. And other block jobs prefer to use their own source-node-related objects.

So, the arguments for dropping the field are:
 - block jobs prefer not to use it
 - block jobs usually have more than one node to operate on, and it is better to operate symmetrically (for example, having both source and target blk's in the specific block-job state structure)

*: BlockJob.blk is used to keep some permissions. We simply move the permissions to the block-job child created in block_job_create() together with blk. In mirror, we just should not care anymore about restoring the state of blk. Most probably this code could have been dropped long ago, after dropping the bs->job pointer. Now it finally goes away together with BlockJob.blk itself.

iotest 141 output is updated, as the "bdrv_has_blk(bs)" check in qmp_blockdev_del() doesn't fail (we don't have blk now). Still, the new error message looks even better.

In iotest 283 we need to add a job id, otherwise "Invalid job ID" now happens earlier than the permission check (as permissions moved from blk to the block-job node).

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Nikita Lapshin <nikita.lapshin@virtuozzo.com>
-
Vladimir Sementsov-Ogievskiy authored
We are going to drop BlockJob.blk. So let's retrieve the block job context from the underlying job instead of the main node.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Nikita Lapshin <nikita.lapshin@virtuozzo.com>
-
- Jun 25, 2021
-
-
Emanuele Giuseppe Esposito authored
Progressmeter is protected by the AioContext mutex, which is taken by the block jobs and their caller (like blockdev). We would like to remove the dependency of block layer code on the AioContext mutex, since most drivers and the core I/O code are already not relying on it.

Create a new C file to implement the ProgressMeter API, but keep the struct as public, to avoid forcing allocation on the heap. Also add a mutex to be able to provide an accurate snapshot of the progress values to the caller.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210614081130.22134-5-eesposit@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
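A hedged sketch of the mutex-protected snapshot described above (field names follow the obvious layout and may not match the patch exactly):

    typedef struct ProgressMeter {
        uint64_t current;
        uint64_t total;
        QemuMutex lock;     /* protects concurrent access to current/total */
    } ProgressMeter;

    void progress_get_snapshot(ProgressMeter *pm, uint64_t *current,
                               uint64_t *total)
    {
        QEMU_LOCK_GUARD(&pm->lock);

        *current = pm->current;
        *total = pm->total;
    }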
-
Paolo Bonzini authored
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20210614081130.22134-4-eesposit@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
-
- May 04, 2021
-
-
Paolo Bonzini authored
Right now, rate limiting is protected by the AioContext mutex, which is taken for example both by the block jobs and by qmp_block_job_set_speed (via find_block_job).

We would like to remove the dependency of block layer code on the AioContext mutex, since most drivers and the core I/O code are already not relying on it. However, there is no existing lock that can easily be taken by both ratelimit_set_speed and ratelimit_calculate_delay, especially because the latter might run in coroutine context (and therefore under a CoMutex) but the former will not.

Since concurrent calls to ratelimit_calculate_delay are not possible, one idea could be to use a seqlock to get a snapshot of slice_ns and slice_quota. But for now keep it simple, and just add a mutex to the RateLimit struct; block jobs are generally not performance critical to the point of optimizing the clock cycles spent in synchronization.

This also requires the introduction of init/destroy functions, so add them to the two users of ratelimit.h.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
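A hedged sketch of the mutex added to RateLimit and the new init/destroy helpers (struct layout abbreviated and illustrative):

    typedef struct {
        QemuMutex lock;
        int64_t slice_start_time;
        int64_t slice_end_time;
        uint64_t slice_quota;
        uint64_t slice_ns;
        uint64_t dispatched;
    } RateLimit;

    static inline void ratelimit_init(RateLimit *limit)
    {
        qemu_mutex_init(&limit->lock);
    }

    static inline void ratelimit_destroy(RateLimit *limit)
    {
        qemu_mutex_destroy(&limit->lock);
    }

    static inline void ratelimit_set_speed(RateLimit *limit, uint64_t speed,
                                           uint64_t slice_ns)
    {
        QEMU_LOCK_GUARD(&limit->lock);

        limit->slice_ns = slice_ns;
        limit->slice_quota = MAX(((double)speed * slice_ns) / 1000000000ULL, 1);
    }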
-
- Apr 30, 2021
-
-
Vladimir Sementsov-Ogievskiy authored
Passing the parent aio context is redundant, as the child_class and the parent opaque pointer are enough to retrieve it. Drop the argument and use the new bdrv_child_get_parent_aio_context() interface.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-7-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
Vladimir Sementsov-Ogievskiy authored
Add a new handler to get the aio context and implement it in all child classes. Add a corresponding public interface to be used soon.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-6-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
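A hedged sketch of the new handler and its public wrapper (the class is shown with only the relevant member):

    struct BdrvChildClass {
        /* ... other members ... */
        AioContext *(*get_parent_aio_context)(BdrvChild *child);
    };

    AioContext *bdrv_child_get_parent_aio_context(BdrvChild *c)
    {
        return c->klass->get_parent_aio_context(c);
    }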
-
- Mar 08, 2021
-
-
Vladimir Sementsov-Ogievskiy authored
Better to return status together with setting errp. It allows avoiding error propagation in the caller.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20210202124956.63146-8-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
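A hedged sketch of the convention this moves toward (the helper names are placeholders, not the functions touched by the patch):

    /* Before: void return; the caller needs a local Error and error_propagate(). */
    static void caller_before(BlockJob *job, Error **errp)
    {
        Error *local_err = NULL;

        example_op_void(job, &local_err);       /* hypothetical helper */
        if (local_err) {
            error_propagate(errp, local_err);
            return;
        }
    }

    /* After: status return; the caller can pass errp straight through. */
    static void caller_after(BlockJob *job, Error **errp)
    {
        if (!example_op_bool(job, errp)) {      /* hypothetical helper */
            return;
        }
    }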
-
Stefano Garzarella authored
When a block job fails, we report the strerror(-job->job.ret) error message, even if the job set an error object. Let's report a better error message using error_get_pretty(job->job.err). If an error object was not set, strerror(-job->ret) is used as a fallback, as explained in include/qemu/job.h:

    typedef struct Job {
        ...
        /**
         * Error object for a failed job.
         * If job->ret is nonzero and an error object was not set, it will be set
         * to strerror(-job->ret) during job_completed.
         */
        Error *err;
    } Job;

In block_job_query() there can be a transient state where 'job.err' has not yet been set by a scheduled bottom half. In that case we use strerror(-job->ret) as before.

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20210225103633.76746-1-sgarzare@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
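A hedged sketch of the message selection in block_job_query() described above (simplified):

    if (job->job.ret < 0) {
        info->error = job->job.err ?
                      g_strdup(error_get_pretty(job->job.err)) :
                      g_strdup(strerror(-job->job.ret));
    }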
-
- Feb 15, 2021
-
-
Michael Qiu authored
Currently, if the guest has workloads, the IO thread will acquire the aio_context lock before doing io_submit, which leads to a segmentation fault when doing block commit after a snapshot. Just like below:

    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 0x7f7c7d91f700 (LWP 99907)]
    0x00005576d0f65aab in bdrv_mirror_top_pwritev at ../block/mirror.c:1437
    1437 ../block/mirror.c: No such file or directory.
    (gdb) p s->job
    $17 = (MirrorBlockJob *) 0x0
    (gdb) p s->stop
    $18 = false

Call trace of IO thread:

    0 0x00005576d0f65aab in bdrv_mirror_top_pwritev at ../block/mirror.c:1437
    1 0x00005576d0f7f3ab in bdrv_driver_pwritev at ../block/io.c:1174
    2 0x00005576d0f8139d in bdrv_aligned_pwritev at ../block/io.c:1988
    3 0x00005576d0f81b65 in bdrv_co_pwritev_part at ../block/io.c:2156
    4 0x00005576d0f8e6b7 in blk_do_pwritev_part at ../block/block-backend.c:1260
    5 0x00005576d0f8e84d in blk_aio_write_entry at ../block/block-backend.c:1476
    ...

Switch to qemu main thread:

    0 0x00007f903be704ed in __lll_lock_wait at /lib/../lib64/libpthread.so.0
    1 0x00007f903be6bde6 in _L_lock_941 at /lib/../lib64/libpthread.so.0
    2 0x00007f903be6bcdf in pthread_mutex_lock at /lib/../lib64/libpthread.so.0
    3 0x0000564b21456889 in qemu_mutex_lock_impl at ../util/qemu-thread-posix.c:79
    4 0x0000564b213af8a5 in block_job_add_bdrv at ../blockjob.c:224
    5 0x0000564b213b00ad in block_job_create at ../blockjob.c:440
    6 0x0000564b21357c0a in mirror_start_job at ../block/mirror.c:1622
    7 0x0000564b2135a9af in commit_active_start at ../block/mirror.c:1867
    8 0x0000564b2133d132 in qmp_block_commit at ../blockdev.c:2768
    9 0x0000564b2141fef3 in qmp_marshal_block_commit at qapi/qapi-commands-block-core.c:346
    10 0x0000564b214503c9 in do_qmp_dispatch_bh at ../qapi/qmp-dispatch.c:110
    11 0x0000564b21451996 in aio_bh_poll at ../util/async.c:164
    12 0x0000564b2146018e in aio_dispatch at ../util/aio-posix.c:381
    13 0x0000564b2145187e in aio_ctx_dispatch at ../util/async.c:306
    14 0x00007f9040239049 in g_main_context_dispatch at /lib/../lib64/libglib-2.0.so.0
    15 0x0000564b21447368 in main_loop_wait at ../util/main-loop.c:232
    16 0x0000564b21447368 in main_loop_wait at ../util/main-loop.c:255
    17 0x0000564b21447368 in main_loop_wait at ../util/main-loop.c:531
    18 0x0000564b212304e1 in qemu_main_loop at ../softmmu/runstate.c:721
    19 0x0000564b20f7975e in main at ../softmmu/main.c:50

In the IO thread, when doing bdrv_mirror_top_pwritev, the job is NULL and the stop field is false. This means the MirrorBDSOpaque "s" object has not been initialized yet; this object is initialized by block_job_create(), but the initialization process is stuck acquiring the lock. In this situation, the IO thread reaches bdrv_mirror_top_pwritev(), which means that the mirror-top node is already inserted into the block graph, but its bs->opaque->job is not initialized.

The root cause is that the qemu main thread does a release/acquire while holding the lock; at the same time, the IO thread gets the lock after the release stage, and the crash occurs.

Actually, in this situation, job->job.aio_context will not be equal to qemu_get_aio_context() and will be the same as bs->aio_context; thus, there is no need to release the lock, because bdrv_root_attach_child() will not change the context. This patch fixes this issue.

Fixes: 132ada80 "block: Adjust AioContexts when attaching nodes"
Signed-off-by: Michael Qiu <qiudayu@huayun.com>
Message-Id: <20210203024059.52683-1-08005325@163.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
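A hedged sketch of the conditional lock dance described above (the attach call itself is elided; illustrative, not the verbatim patch):

    bool need_context_ops = bdrv_get_aio_context(bs) != job->job.aio_context;

    /* Only drop the job's AioContext lock when bdrv_root_attach_child() will
     * actually move bs to a different context; otherwise keeping the lock
     * prevents the iothread from racing with the still-uninitialized node. */
    if (need_context_ops && job->job.aio_context != qemu_get_aio_context()) {
        aio_context_release(job->job.aio_context);
    }
    /* ... bdrv_root_attach_child() call goes here ... */
    if (need_context_ops && job->job.aio_context != qemu_get_aio_context()) {
        aio_context_acquire(job->job.aio_context);
    }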
-
- Jan 26, 2021
-
-
Vladimir Sementsov-Ogievskiy authored
We are going to use the async block-copy call in backup, so we'll need to pass the backup speed setting through to the block-copy call.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210116214705.822267-9-vsementsov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
-
- Sep 23, 2020
-
-
Stefan Hajnoczi authored
clang's C11 atomic_fetch_*() functions only take a C11 atomic type pointer argument. QEMU uses direct types (int, etc) and this causes a compiler error when QEMU code calls these functions in a source file that also included <stdatomic.h> via a system header file:

    $ CC=clang CXX=clang++ ./configure ... && make
    ../util/async.c:79:17: error: address argument to atomic operation must be a pointer to _Atomic type ('unsigned int *' invalid)

Avoid using atomic_*() names in QEMU's atomic.h since that namespace is used by <stdatomic.h>. Prefix QEMU's APIs with 'q' so that atomic.h and <stdatomic.h> can co-exist. I checked /usr/include on my machine and searched GitHub for existing "qatomic_" users but there seem to be none.

This patch was generated using:

    $ git grep -h -o '\<atomic\(64\)\?_[a-z0-9_]\+' include/qemu/atomic.h | \
      sort -u >/tmp/changed_identifiers
    $ for identifier in $(</tmp/changed_identifiers); do
          sed -i "s%\<$identifier\>%q$identifier%g" \
              $(git grep -I -l "\<$identifier\>")
      done

I manually fixed line-wrap issues and misaligned rST tables.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200923105646.47864-1-stefanha@redhat.com>
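A hedged before/after sketch of the rename (the counter variable is illustrative):

    unsigned int counter, old, val;

    /* before: clashes with <stdatomic.h> under clang */
    old = atomic_fetch_inc(&counter);
    val = atomic_read(&counter);

    /* after: QEMU's own 'q' namespace */
    old = qatomic_fetch_inc(&counter);
    val = qatomic_read(&counter);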
-