  1. Aug 21, 2020
    • trace: switch position of headers to what Meson requires · 243af022
      Paolo Bonzini authored
      
      Meson doesn't enjoy the same flexibility we have with Make in choosing
      the include path.  In particular the tracing headers are using
      $(build_root)/$(<D).
      
      In order to keep the include directives unchanged,
      the simplest solution is to generate headers with patterns like
      "trace/trace-audio.h" and place forwarding headers in the source tree
      such that for example "audio/trace.h" includes "trace/trace-audio.h".
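
      For illustration, such a forwarding header can be as small as a
      single include directive (a sketch; the generated files may carry
      extra boilerplate):

          /* audio/trace.h (sketch): forward to the generated header */
          #include "trace/trace-audio.h"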
      
      This patch is too ugly to be applied to the Makefiles now.  It's only
      a way to separate the changes to the tracing header files from the
      Meson rewrite of the tracing logic.
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  2. Apr 07, 2020
    • job: take each job's lock individually in job_txn_apply · b660a84b
      Stefan Reiter authored
      
      All callers of job_txn_apply hold a single job's lock, but different
      jobs within a transaction can have different contexts, thus we need to
      lock each one individually before applying the callback function.
      
      Similar to job_completed_txn_abort, this also requires releasing the
      caller's context before and reacquiring it afterwards to avoid
      recursive locks which might break AIO_WAIT_WHILE in the callback.
      This is safe: existing code already has to take this into account,
      otherwise job_completed_txn_abort would already be broken.
      
      This also brings to light a different issue: when a callback function
      in job_txn_apply moves its job to a different AioContext, callers will
      try to release the wrong lock (now that we re-acquire the lock
      correctly; previously it would just continue with the old lock,
      leaving the job unlocked for the rest of the return path). Fix this by
      not caching the job's context.
      
      This is only necessary for qmp_block_job_finalize, qmp_job_finalize and
      job_exit, since everyone else calls through job_exit.
      
      One test needed adapting: it calls job_finalize directly, so it must
      manually acquire the correct context.
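
      A rough sketch of the resulting pattern (simplified; reference
      counting and the exact signature of the real job_txn_apply are
      omitted):

          /* Lock each job's context individually; release the caller's
           * lock first to avoid recursive locking in the callback. */
          static void job_txn_apply(Job *job, void fn(Job *))
          {
              AioContext *outer_ctx = job->aio_context;
              Job *other_job;

              aio_context_release(outer_ctx);

              QLIST_FOREACH(other_job, &job->txn->jobs, txn_list) {
                  AioContext *ctx = other_job->aio_context;
                  aio_context_acquire(ctx);
                  fn(other_job);
                  aio_context_release(ctx);
              }

              /* Do not reuse the cached outer_ctx: fn() may have moved
               * the job to a different AioContext. */
              aio_context_acquire(job->aio_context);
          }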
      
      Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
      Message-Id: <20200407115651.69472-2-s.reiter@proxmox.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
  3. Jun 12, 2019
    • Include qemu-common.h exactly where needed · a8d25326
      Markus Armbruster authored
      
      No header includes qemu-common.h after this commit, as prescribed by
      qemu-common.h's file comment.
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Message-Id: <20190523143508.25387-5-armbru@redhat.com>
      [Rebased with conflicts resolved automatically, except for
      include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c
      block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c
      target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h
      target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h
      target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h
      target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and
      net/tap-bsd.c fixed up]
  4. May 10, 2019
    • blockjob: Fix coroutine thread after AioContext change · 13726123
      Kevin Wolf authored
      
      Commit 463e0be1 ('blockjob: add AioContext attached callback') tried to
      make block jobs robust against AioContext changes of their main node,
      but it never made sure that the job coroutine actually runs in the new
      thread.
      
      Instead of waking up the job coroutine in whatever thread it ran before,
      let's always pass the AioContext where it should be running now.
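
      The gist of the change, as a sketch (aio_co_wake() derives the
      context from wherever the coroutine last ran):

          /* before: wakes the coroutine in its previous thread */
          aio_co_wake(job->co);

          /* after (sketch): enter it in the job's current context */
          aio_co_enter(job->aio_context, job->co);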
      
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
  5. Sep 25, 2018
    • block: Use a single global AioWait · cfe29d82
      Kevin Wolf authored
      
      When draining a block node, we recurse to its parent and for subtree
      drains also to its children. A single AIO_WAIT_WHILE() is then used to
      wait for bdrv_drain_poll() to become true, which depends on all of the
      nodes we recursed to. However, if the respective child or parent becomes
      quiescent and calls bdrv_wakeup(), only the AioWait of the child/parent
      is checked, while AIO_WAIT_WHILE() depends on the AioWait of the
      original node.
      
      Fix this by using a single AioWait for all callers of AIO_WAIT_WHILE().
      
      This may mean that the draining thread gets a few more unnecessary
      wakeups because an unrelated operation got completed, but we already
      wake it up when something _could_ have changed rather than only if it
      has certainly changed.
      
      Apart from that, drain is a slow path anyway. In theory it would be
      possible to use wakeups more selectively and still correctly, but the
      gains are likely not worth the additional complexity. In fact, this
      patch is a nice simplification for some places in the code.
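
      A simplified model of the idea (not the literal QEMU code):

          /* One AioWait shared by every AIO_WAIT_WHILE() caller. */
          typedef struct AioWait {
              unsigned num_waiters;  /* waiters currently polling */
          } AioWait;

          static AioWait global_aio_wait;

          /* Completion paths kick all waiters; spurious wakeups are fine
           * because each waiter re-evaluates its own condition. */
          void aio_wait_kick(void)
          {
              if (global_aio_wait.num_waiters) {
                  /* schedule a dummy BH so blocked aio_poll() calls
                   * return and AIO_WAIT_WHILE() re-checks its condition */
              }
          }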
      
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Reviewed-by: Max Reitz <mreitz@redhat.com>
    • job: Avoid deadlocks in job_completed_txn_abort() · 644f3a29
      Kevin Wolf authored
      
      Amongst others, job_finalize_single() calls the .prepare/.commit/.abort
      callbacks of the individual job driver. Recently, their use was
      adapted for all block jobs so that they now involve code calling
      AIO_WAIT_WHILE(). Such code must be called under the AioContext lock
      for the respective job, but without holding any other AioContext lock.
      
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      Reviewed-by: Max Reitz <mreitz@redhat.com>
    • blockjob: Lie better in child_job_drained_poll() · b5a7a057
      Kevin Wolf authored
      
      Block jobs claim in .drained_poll() that they are in a quiescent state
      as soon as job->deferred_to_main_loop is true. This is obviously wrong,
      they still have a completion BH to run. We only get away with this
      because commit 91af091f added an unconditional aio_poll(false) to the
      drain functions, but this is bypassing the regular drain mechanisms.
      
      However, just removing this and reporting the job as still active
      doesn't work either: the completion callbacks themselves call drain
      functions (directly, or indirectly via bdrv_reopen), so they would
      then deadlock.
      
      As a better lie, report the job as active as long as the BH is
      pending, but falsely call it quiescent from the point in the BH when the
      completion callback is called. At this point, nested drain calls won't
      deadlock because they ignore the job, and outer drains will wait for the
      job to really reach a quiescent state because the callback is already
      running.
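
      In sketch form (flag name hypothetical), the poll callback could look
      like this:

          static bool child_job_drained_poll(BdrvChild *c)
          {
              BlockJob *bjob = c->opaque;
              Job *job = &bjob->job;

              /* Quiescent from the point where the completion callback
               * runs: nested drains from the callback don't deadlock,
               * outer drains keep waiting for the job to really finish. */
              if (job->completion_cb_running) {  /* hypothetical flag */
                  return false;
              }

              /* Still active while the completion BH is merely pending. */
              return job->busy || job->deferred_to_main_loop;
          }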
      
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      Reviewed-by: Max Reitz <mreitz@redhat.com>
    • job: Use AIO_WAIT_WHILE() in job_finish_sync() · de0fbe64
      Kevin Wolf authored
      
      job_finish_sync() needs to release the AioContext lock of the job before
      calling aio_poll(). Otherwise, callbacks called by aio_poll() would
      possibly take the lock a second time and run into a deadlock with a
      nested AIO_WAIT_WHILE() call.
      
      Also, job_drain() without aio_poll() isn't necessarily enough to make
      progress on a job; it may depend on bottom halves being executed.
      
      Combine both open-coded while loops into a single AIO_WAIT_WHILE() call
      that solves both of these problems.
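
      Roughly, the combined wait becomes a single macro invocation (a
      sketch of the shape, not the verbatim code):

          /* AIO_WAIT_WHILE() drops the job's AioContext lock around
           * aio_poll(), so callbacks may take it themselves, and it also
           * runs the bottom halves the job may depend on. */
          AIO_WAIT_WHILE(job->aio_context,
                         (job_enter(job), !job_is_completed(job)));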
      
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      Reviewed-by: Fam Zheng <famz@redhat.com>
      Reviewed-by: Max Reitz <mreitz@redhat.com>
    • blockjob: Wake up BDS when job becomes idle · 34dc97b9
      Kevin Wolf authored
      
      In the context of draining a BDS, the .drained_poll callback of block
      jobs is called. If this returns true (i.e. there is still some activity
      pending), the drain operation may call aio_poll() with blocking=true to
      wait for completion.
      
      As soon as the pending activity is completed and the job finally arrives
      in a quiescent state (i.e. its coroutine either yields with busy=false
      or terminates), the block job must notify the aio_poll() loop to wake
      up, otherwise we get a deadlock if both are running in different
      threads.
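
      As a sketch (notifier name hypothetical), the notification can be an
      "idle" hook that wakes whoever is polling the job's BDS:

          static void block_job_on_idle(Notifier *n, void *opaque)
          {
              BlockJob *job = opaque;

              /* Wake a blocking aio_poll() in another thread so it
               * re-evaluates bdrv_drain_poll(). */
              bdrv_wakeup(blk_bs(job->blk));
          }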
      
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      Reviewed-by: Fam Zheng <famz@redhat.com>
      Reviewed-by: Max Reitz <mreitz@redhat.com>
    • job: Fix missing locking due to mismerge · d1756c78
      Kevin Wolf authored
      
      job_completed() had a problem with double locking that was recently
      fixed independently by two different commits:
      
      "job: Fix nested aio_poll() hanging in job_txn_apply"
      "jobs: add exit shim"
      
      One fix removed the first aio_context_acquire(), the other fix removed
      the other one. Now we have a bug again and the code is run without any
      locking.
      
      Add it back in one of the places.
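
      In sketch form, the restored locking around the completion call:

          AioContext *ctx = job->aio_context;

          aio_context_acquire(ctx);
          job_completed(job);
          aio_context_release(ctx);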
      
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      Reviewed-by: Max Reitz <mreitz@redhat.com>
      Reviewed-by: John Snow <jsnow@redhat.com>
    • job: Fix nested aio_poll() hanging in job_txn_apply · 49880165
      Fam Zheng authored
      
      All callers have acquired ctx already. Doing that again results in an
      aio_poll() hang. This fixes the problem that a BDRV_POLL_WHILE() in the
      callback cannot make progress because ctx is recursively locked, for
      example, when drive-backup finishes.
      
      There are two callers of job_finalize():
      
          fam@lemon:~/work/qemu [master]$ git grep -w -A1 '^\s*job_finalize'
          blockdev.c:    job_finalize(&job->job, errp);
          blockdev.c-    aio_context_release(aio_context);
          --
          job-qmp.c:    job_finalize(job, errp);
          job-qmp.c-    aio_context_release(aio_context);
          --
          tests/test-blockjob.c:    job_finalize(&job->job, &error_abort);
          tests/test-blockjob.c-    assert(job->job.status == JOB_STATUS_CONCLUDED);
      
      Ignoring the test, it's easy to see that both callers of job_finalize
      (and job_do_finalize) have acquired the context.
      
      Cc: qemu-stable@nongnu.org
      Reported-by: Gu Nini <ngu@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Fam Zheng <famz@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • jobs: remove .exit callback · ccbfb331
      John Snow authored
      
      Now that all of the jobs use the component finalization callbacks,
      there's no use for the heavy-hammer .exit callback anymore.
      
      job_exit becomes a glorified type shim so that we can call
      job_completed from aio_bh_schedule_oneshot.
      
      Move these three functions down into job.c to eliminate a
      forward reference.
      
      Signed-off-by: John Snow <jsnow@redhat.com>
      Reviewed-by: Max Reitz <mreitz@redhat.com>
      Message-id: 20180906130225.5118-12-jsnow@redhat.com
      Reviewed-by: Jeff Cody <jcody@redhat.com>
      Signed-off-by: Max Reitz <mreitz@redhat.com>
  6. Aug 31, 2018
    • jobs: remove job_defer_to_main_loop · e21a1c98
      John Snow authored
      
      Now that the job infrastructure is handling the job_completed call for
      all implemented jobs, we can remove the interface that allowed jobs to
      schedule their own completion.
      
      Signed-off-by: John Snow <jsnow@redhat.com>
      Reviewed-by: Max Reitz <mreitz@redhat.com>
      Message-id: 20180830015734.19765-10-jsnow@redhat.com
      Signed-off-by: Max Reitz <mreitz@redhat.com>
    • jobs: remove ret argument to job_completed; privatize it · 404ff28d
      John Snow authored
      
      Jobs are now expected to return their retcode on the stack, from the
      .run callback, so we can remove that argument.
      
      job_cancel does not need to set -ECANCELED because job_completed will
      update the return code itself if the job was canceled.
      
      While we're here, make job_completed static to job.c and remove it
      from job.h; move the documentation of the return code to the .run()
      callback and to the job->ret property accordingly.
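
      A sketch of the resulting flow inside job.c (names approximate):

          static void coroutine_fn job_co_entry(void *opaque)
          {
              Job *job = opaque;

              /* The retcode now comes straight from .run()'s return
               * value; job_completed() reads job->ret itself. */
              job->ret = job->driver->run(job, &job->err);
          }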
      
      Signed-off-by: John Snow <jsnow@redhat.com>
      Message-id: 20180830015734.19765-9-jsnow@redhat.com
      Reviewed-by: Max Reitz <mreitz@redhat.com>
      Signed-off-by: Max Reitz <mreitz@redhat.com>
    • jobs: add exit shim · 00359a71
      John Snow authored
      
      All jobs do the same thing when they leave their running loop:
      - store the return code in a structure,
      - wait to receive this structure in the main thread,
      - signal job completion via job_completed.
      
      Few jobs do anything beyond exactly this. Consolidate this exit
      logic for a net reduction in SLOC.
      
      More seriously, when we utilize job_defer_to_main_loop_bh to call
      a function that calls job_completed, job_finalize_single will run
      in a context where it has recursively taken the aio_context lock,
      which can cause hangs if it puts down a reference that causes a flush.
      
      You can observe this in practice by looking at mirror_exit's careful
      placement of job_completed and bdrv_unref calls.
      
      If we centralize job exiting, we can signal job completion from outside
      of the aio_context, which should allow for job cleanup code to run with
      only one lock, which makes cleanup callbacks less tricky to write.
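
      A sketch of the shim (close to the shape described above):

          /* Runs in the main loop, scheduled as a bottom half. */
          static void job_exit(void *opaque)
          {
              Job *job = (Job *)opaque;
              job_completed(job, job->ret);
          }

          /* ...at the end of the job's coroutine: */
          job->ret = ret;
          aio_bh_schedule_oneshot(qemu_get_aio_context(), job_exit, job);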
      
      Signed-off-by: John Snow <jsnow@redhat.com>
      Reviewed-by: Max Reitz <mreitz@redhat.com>
      Message-id: 20180830015734.19765-4-jsnow@redhat.com
      Reviewed-by: Jeff Cody <jcody@redhat.com>
      Signed-off-by: Max Reitz <mreitz@redhat.com>
    • jobs: canonize Error object · 3d1f8b07
      John Snow authored
      
      Jobs presently use an Error object in the case of the create job, and
      plain char strings for generic errors elsewhere.
      
      Unify the two paths as just j->err, and remove the extra argument from
      job_completed. The integer error code for job_completed is kept for now,
      to be removed shortly in a separate patch.
      
      Signed-off-by: John Snow <jsnow@redhat.com>
      Message-id: 20180830015734.19765-3-jsnow@redhat.com
      [mreitz: Dropped a superfluous g_strdup()]
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Max Reitz <mreitz@redhat.com>
    • jobs: change start callback to run callback · f67432a2
      John Snow authored
      
      Presently we codify the entry point for a job as the "start" callback,
      but a more apt name would be "run" to clarify the idea that when this
      function returns we consider the job to have "finished," except for
      any cleanup which occurs in separate callbacks later.
      
      As part of this clarification, change the signature to include an
      error object and a return code. The error pointer is not yet used, and
      the return code, while captured, will be overwritten by actions in the
      job_completed function.
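
      The reshaped entry point, as a sketch:

          /* In JobDriver: run until completion, return the retcode and
           * optionally set an error describing the failure. */
          int coroutine_fn (*run)(Job *job, Error **errp);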
      
      Signed-off-by: John Snow <jsnow@redhat.com>
      Reviewed-by: Max Reitz <mreitz@redhat.com>
      Message-id: 20180830015734.19765-2-jsnow@redhat.com
      Reviewed-by: Jeff Cody <jcody@redhat.com>
      Signed-off-by: Max Reitz <mreitz@redhat.com>
  7. May 30, 2018
    • job: Add error message for failing jobs · 1266c9b9
      Kevin Wolf authored
      
      So far we relied on job->ret and strerror() to produce an error message
      for failed jobs. Not surprisingly, this tends to result in completely
      useless messages.
      
      This adds a Job.error field that can contain an error string for a
      failing job, and a parameter to job_completed() that sets the field.
      By default, if NULL is passed, we continue to use strerror(job->ret).
      
      All existing callers are changed to pass NULL. They can be improved in
      separate patches.
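
      In sketch form, assuming the usual negative-errno convention for
      job->ret:

          if (error) {
              job->error = g_strdup(error);
          } else if (job->ret < 0) {
              /* fall back to a generic strerror() message */
              job->error = g_strdup(strerror(-job->ret));
          }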
      
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      Reviewed-by: Max Reitz <mreitz@redhat.com>
      Reviewed-by: Jeff Cody <jcody@redhat.com>