  1. Dec 19, 2017
    • iothread: fix iothread_stop() race condition · 2362a28e
      Stefan Hajnoczi authored
      
      There is a small chance that iothread_stop() hangs as follows:
      
        Thread 3 (Thread 0x7f63eba5f700 (LWP 16105)):
        #0  0x00007f64012c09b6 in ppoll () at /lib64/libc.so.6
        #1  0x000055959992eac9 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
        #2  0x000055959992eac9 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:322
        #3  0x0000559599930711 in aio_poll (ctx=0x55959bdb83c0, blocking=blocking@entry=true) at util/aio-posix.c:629
        #4  0x00005595996806fe in iothread_run (opaque=0x55959bd78400) at iothread.c:59
        #5  0x00007f640159f609 in start_thread () at /lib64/libpthread.so.0
        #6  0x00007f64012cce6f in clone () at /lib64/libc.so.6
      
        Thread 1 (Thread 0x7f640b45b280 (LWP 16103)):
        #0  0x00007f64015a0b6d in pthread_join () at /lib64/libpthread.so.0
        #1  0x00005595999332ef in qemu_thread_join (thread=<optimized out>) at util/qemu-thread-posix.c:547
        #2  0x00005595996808ae in iothread_stop (iothread=<optimized out>) at iothread.c:91
        #3  0x000055959968094d in iothread_stop_iter (object=<optimized out>, opaque=<optimized out>) at iothread.c:102
        #4  0x0000559599857d97 in do_object_child_foreach (obj=obj@entry=0x55959bdb8100, fn=fn@entry=0x559599680930 <iothread_stop_iter>, opaque=opaque@entry=0x0, recurse=recurse@entry=false) at qom/object.c:852
        #5  0x0000559599859477 in object_child_foreach (obj=obj@entry=0x55959bdb8100, fn=fn@entry=0x559599680930 <iothread_stop_iter>, opaque=opaque@entry=0x0) at qom/object.c:867
        #6  0x0000559599680a6e in iothread_stop_all () at iothread.c:341
        #7  0x000055959955b1d5 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4913
      
      The relevant code from iothread_run() is:
      
        while (!atomic_read(&iothread->stopping)) {
            aio_poll(iothread->ctx, true);
      
      and iothread_stop():
      
        iothread->stopping = true;
        aio_notify(iothread->ctx);
        ...
        qemu_thread_join(&iothread->thread);
      
      The following scenario can occur:
      
      1. IOThread:
        while (!atomic_read(&iothread->stopping)) -> stopping=false
      
      2. Main loop:
        iothread->stopping = true;
        aio_notify(iothread->ctx);
      
      3. IOThread:
        aio_poll(iothread->ctx, true); -> hang
      
      The bug is explained by the AioContext->notify_me doc comments:
      
        "If this field is 0, everything (file descriptors, bottom halves,
        timers) will be re-evaluated before the next blocking poll(), thus the
        event_notifier_set call can be skipped."
      
      The problem is that "everything" does not include checking
      iothread->stopping.  This means iothread_run() will block in aio_poll()
      if aio_notify() was called just before aio_poll().
      
      This patch fixes the hang by replacing aio_notify() with
      aio_bh_schedule_oneshot().  This forces aio_poll() or g_main_loop_run()
      to return.
      
      Implementing this properly required a new bool running flag.  The new
      flag avoids races that would be tricky to handle if we reused
      iothread->stopping for this purpose.
      Now iothread->stopping is purely for iothread_stop() and
      iothread->running is purely for the iothread_run() thread.
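
      Roughly, the resulting shape is the following (a sketch based on the
      description above; the helper name and exact guards in the real patch
      may differ):

        /* Runs inside the IOThread as a scheduled bottom half. */
        static void iothread_stop_bh(void *opaque)
        {
            IOThread *iothread = opaque;

            iothread->running = false;   /* iothread_run() loops on this */
        }

        void iothread_stop(IOThread *iothread)
        {
            iothread->stopping = true;
            /* Unlike aio_notify(), a scheduled bottom half guarantees that
             * aio_poll() returns and runs the callback above. */
            aio_bh_schedule_oneshot(iothread->ctx, iothread_stop_bh, iothread);
            qemu_thread_join(&iothread->thread);
        }

      iothread_run() then loops on while (atomic_read(&iothread->running))
      instead of checking iothread->stopping directly.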
      
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-id: 20171207201320.19284-6-stefanha@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • iothread: add iothread_by_id() API · fbcc6923
      Stefan Hajnoczi authored
      
      Encapsulate IOThread QOM object lookup so that callers don't need to
      know how and where IOThread objects live.
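
      One plausible shape for the helper is to resolve the id as a QOM path of
      the IOThread type (a sketch; the actual lookup in the patch may differ):

        IOThread *iothread_by_id(const char *id)
        {
            /* Returns NULL if no IOThread object with this id exists. */
            return IOTHREAD(object_resolve_path_type(id, TYPE_IOTHREAD, NULL));
        }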
      
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: Kevin Wolf <kwolf@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-id: 20171206144550.22295-8-stefanha@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
  2. Oct 03, 2017
  3. Sep 29, 2017
  4. Sep 08, 2017
  5. Feb 21, 2017
  6. Feb 03, 2017
    • iothread: enable AioContext polling by default · cdd7abfd
      Stefan Hajnoczi authored
      
      IOThread AioContexts are likely to consist only of event sources like
      virtqueue ioeventfds and LinuxAIO completion eventfds that are pollable
      from userspace (without system calls).
      
      We recently merged the AioContext polling feature but didn't enable it
      by default yet.  I have gone back over the performance data on the
      mailing list and picked a default polling value that gave good results.
      
      Let's enable AioContext polling by default so users don't have another
      switch they need to set manually.  If performance regressions are found
      we can still disable this for the QEMU 2.9 release.
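
      For reference, polling can still be tuned or switched off per IOThread
      from the command line via the poll-max-ns property (the id below is
      made up):

        # disable busy-polling for one iothread if a regression shows up
        -object iothread,id=iothread0,poll-max-ns=0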
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Karl Rister <krister@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Message-id: 20170126170119.27876-1-stefanha@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
  7. Jan 03, 2017
  8. Oct 28, 2016
  9. Sep 28, 2016
  10. Sep 13, 2016
    • iothread: Stop threads before main() quits · dce8921b
      Fam Zheng authored
      
      Right after main_loop ends, we release various resources but keep the
      iothreads alive.  The latter are not prepared for this sudden change of
      resources.
      
      Specifically, after bdrv_close_all(), virtio-scsi dataplane gets a
      surprise at the now-empty BlockBackend:
      
      (gdb) bt
          at /usr/src/debug/qemu-2.6.0/hw/scsi/virtio-scsi.c:543
          at /usr/src/debug/qemu-2.6.0/hw/scsi/virtio-scsi.c:577
      
      This happens because d->conf.blk->root has been set to NULL, so
      blk_get_aio_context() returns qemu_aio_context, whereas s->ctx still
      points to the iothread's AioContext:
      
          hw/scsi/virtio-scsi.c:543:
      
          if (s->dataplane_started) {
              assert(blk_get_aio_context(d->conf.blk) == s->ctx);
          }
      
      To fix this, let's stop iothreads before doing bdrv_close_all().
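
      In other words, the shutdown path at the end of main() should roughly
      order the calls like this (simplified sketch; other cleanup omitted):

        /* after the main loop has returned */
        iothread_stop_all();   /* quiesce IOThread event loops first...      */
        bdrv_close_all();      /* ...then it is safe to tear down the blocks */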
      
      Cc: qemu-stable@nongnu.org
      Signed-off-by: Fam Zheng <famz@redhat.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-id: 1473326931-9699-1-git-send-email-famz@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
  11. Feb 04, 2016
    • all: Clean up includes · d38ea87a
      Peter Maydell authored
      
      Clean up includes so that osdep.h is included first and headers
      which it implies are not included manually.
      
      This commit was created with scripts/clean-includes.
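
      The resulting convention looks roughly like this (the second header is
      only an illustration):

        #include "qemu/osdep.h"    /* always the very first include */
        #include "qemu/thread.h"   /* further headers as needed; headers that
                                    * osdep.h already pulls in (stdint.h,
                                    * stdbool.h, ...) are not repeated */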
      
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Message-id: 1454089805-5470-16-git-send-email-peter.maydell@linaro.org
  12. Dec 03, 2015
  13. Jul 24, 2015
  14. Jun 19, 2015
  15. Jun 12, 2015
  16. May 08, 2015
  17. Apr 28, 2015
  18. Sep 22, 2014
    • async: aio_context_new(): Handle event_notifier_init failure · 2f78e491
      Chrysostomos Nanakos authored
      
      On a system with a low limit on open files, initialization of the event
      notifier can fail and QEMU exits without printing any error information
      to the user.
      
      The problem can easily be reproduced by enforcing a low limit on open
      files and starting QEMU with enough I/O threads to hit this limit.
      
      The same problem arises, even without creating any I/O threads, while
      QEMU initializes the main event loop, if an even lower limit on open
      files is enforced.
      
      This commit adds an error message on failure:
      
       # qemu [...] -object iothread,id=iothread0 -object iothread,id=iothread1
       qemu: Failed to initialize event notifier: Too many open files in system
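
      The core of the change is a check along these lines inside
      aio_context_new(), which now takes an Error **errp parameter (a sketch;
      the cleanup label is illustrative):

        ret = event_notifier_init(&ctx->notifier, false);
        if (ret < 0) {
            /* Propagate the errno so the caller can report it and exit. */
            error_setg_errno(errp, -ret, "Failed to initialize event notifier");
            goto fail;
        }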
      
      Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
  19. Jul 14, 2014
    • AioContext: do not rely on aio_poll(ctx, true) result to end a loop · acfb23ad
      Paolo Bonzini authored
      
      Currently, whenever aio_poll(ctx, true) has completed all pending
      work it returns true *and* the next call to aio_poll(ctx, true)
      will not block.
      
      This invariant has its roots in qemu_aio_flush()'s implementation
      as "while (qemu_aio_wait()) {}".  However, qemu_aio_flush() does
      not exist anymore and bdrv_drain_all() is implemented differently;
      and this invariant is complicated to maintain and subtly different
      from the return value of GMainLoop's g_main_context_iteration.
      
      All calls to aio_poll(ctx, true) except one are guarded by a
      while() loop checking for a request to be incomplete, or a
      BlockDriverState to be idle.  The one remaining call (in
      iothread.c) uses this to delay the aio_context_release/acquire
      pair until the AioContext is quiescent, however:
      
      - we can do the same just by using non-blocking aio_poll,
        similar to how vl.c invokes main_loop_wait
      
      - it is buggy, because it does not ensure that the AioContext
        is released between an aio_notify and the next time the
        iothread goes to sleep.  This leads to hangs when stopping
        the dataplane thread.
      
      In the end, these semantics are a bad match for the current
      users of AioContext.  So modify that one exception in iothread.c, which
      also fixes the hangs, and adjust the testcase so that it uses the same
      idiom as the actual QEMU code.
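
      The iothread.c loop then ends up using roughly this non-blocking drain
      idiom (a sketch; field names follow the iothread code quoted elsewhere
      in this history):

        while (!iothread->stopping) {
            aio_context_acquire(iothread->ctx);
            /* Block only until the first event; once progress is being made,
             * poll without blocking so the context is released between an
             * aio_notify() and the next time this thread goes to sleep. */
            blocking = true;
            while (!iothread->stopping && aio_poll(iothread->ctx, blocking)) {
                blocking = false;
            }
            aio_context_release(iothread->ctx);
        }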
      
      Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
  20. Apr 04, 2014
  21. Mar 13, 2014
    • qmp: add query-iothreads command · dc3dd0d2
      Stefan Hajnoczi authored
      
      The "query-iothreads" command returns a list of information about
      iothreads.  See the patch for API documentation.
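
      An illustrative exchange (the id corresponds to an -object iothread
      definition; the thread-id value is made up):

        -> { "execute": "query-iothreads" }
        <- { "return": [ { "id": "iothread0", "thread-id": 25627 } ] }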
      
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • iothread: stash thread ID away · 88eb7c29
      Stefan Hajnoczi authored
      
      Keep the thread ID around so we can report it via QMP.
      
      There's only one problem: qemu_get_thread_id() (gettid() wrapper on
      Linux) must be called from the thread itself.  There is no way to get
      the thread ID from outside the thread.
      
      This patch uses a condvar to wait for iothread_run() to populate the
      thread_id inside the thread.
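
      The handshake looks roughly like this (a sketch; the lock and condvar
      field names are illustrative):

        /* inside iothread_run(), i.e. in the new thread */
        qemu_mutex_lock(&iothread->init_done_lock);
        iothread->thread_id = qemu_get_thread_id();
        qemu_cond_signal(&iothread->init_done_cond);
        qemu_mutex_unlock(&iothread->init_done_lock);

        /* in the creating thread, after qemu_thread_create() */
        qemu_mutex_lock(&iothread->init_done_lock);
        while (iothread->thread_id == -1) {
            qemu_cond_wait(&iothread->init_done_cond, &iothread->init_done_lock);
        }
        qemu_mutex_unlock(&iothread->init_done_lock);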
      
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • iothread: add I/O thread object · be8d8537
      Stefan Hajnoczi authored
      
      This is a stand-in for Michael Roth's QContext.  I expect this to be
      replaced once QContext is completed.
      
      The IOThread object is an AioContext event loop thread.  This patch adds
      the concept of multiple event loop threads, allowing users to define
      them.
      
      When SMP guests run on SMP hosts it makes sense to instantiate multiple
      IOThreads.  This spreads event loop processing across multiple cores.
      Note that additional patches are required to actually bind a device to
      an IOThread.
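
      The core of such an event loop thread is essentially the loop quoted in
      the iothread_stop() fix above (a sketch of the idea, not the exact
      patch):

        static void *iothread_run(void *opaque)
        {
            IOThread *iothread = opaque;

            while (!iothread->stopping) {
                /* Run one blocking iteration of this thread's AioContext. */
                aio_poll(iothread->ctx, true);
            }
            return NULL;
        }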
      
      [Andreas Färber <afaerber@suse.de> pointed out that the embedded parent
      object instance should be called "parent_obj" and have a newline
      afterwards.  This patch has been changed to reflect this.
      -- Stefan]
      
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>