  1. Sep 08, 2023
    • io: follow coroutine AioContext in qio_channel_yield() · 06e0f098
      Stefan Hajnoczi authored
      
      The ongoing QEMU multi-queue block layer effort makes it possible for multiple
      threads to process I/O in parallel. The nbd block driver is not compatible with
      the multi-queue block layer yet because QIOChannel cannot be used easily from
      coroutines running in multiple threads. This series changes the QIOChannel API
      to make that possible.
      
      In the current API, calling qio_channel_attach_aio_context() sets the
      AioContext where qio_channel_yield() installs an fd handler prior to yielding:
      
        qio_channel_attach_aio_context(ioc, my_ctx);
        ...
        qio_channel_yield(ioc); // my_ctx is used here
        ...
        qio_channel_detach_aio_context(ioc);
      
      This API design has limitations: reading and writing must be done in the same
      AioContext and moving between AioContexts involves a cumbersome sequence of API
      calls that is not suitable for doing on a per-request basis.
      
      There is no fundamental reason why a QIOChannel needs to run within the
      same AioContext every time qio_channel_yield() is called. QIOChannel
      only uses the AioContext while inside qio_channel_yield(). The rest of
      the time, QIOChannel is independent of any AioContext.
      
      In the new API, qio_channel_yield() queries the AioContext from the current
      coroutine using qemu_coroutine_get_aio_context(). There is no need to
      explicitly attach/detach AioContexts anymore and
      qio_channel_attach_aio_context() and qio_channel_detach_aio_context() are gone.
      One coroutine can read from the QIOChannel while another coroutine writes from
      a different AioContext.
      
      This API change allows the nbd block driver to use QIOChannel from any thread.
      It's important to keep in mind that the block driver already synchronizes
      QIOChannel access and ensures that two coroutines never read simultaneously or
      write simultaneously.
      
      This patch updates all users of qio_channel_attach_aio_context() to the
      new API. Most conversions are simple, but vhost-user-server requires a
      new qemu_coroutine_yield() call to quiesce the vu_client_trip()
      coroutine when not attached to any AioContext.
      
      While the API has become simpler, there is one wart: QIOChannel has a
      special case for the iohandler AioContext (used for handlers that must not run
      in nested event loops). I didn't find an elegant way to preserve that behavior,
      so I added a new API called qio_channel_set_follow_coroutine_ctx(ioc, true|false)
      for opting in to the new AioContext model. By default QIOChannel uses the
      iohandler AioContext. Code that formerly called
      qio_channel_attach_aio_context() now calls
      qio_channel_set_follow_coroutine_ctx(ioc, true) once after the QIOChannel is
      created.
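
      A minimal sketch of the converted pattern (the variable name ioc is
      illustrative, and qio_channel_yield() is shown without its GIOCondition
      argument, as in the example above):

        /* once, right after the QIOChannel is created */
        qio_channel_set_follow_coroutine_ctx(ioc, true);

        /* later, from any coroutine in any thread */
        qio_channel_yield(ioc); // the current coroutine's AioContext is used here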
      
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Acked-by: Daniel P. Berrangé <berrange@redhat.com>
      Message-ID: <20230830224802.493686-5-stefanha@redhat.com>
      [eblake: also fix migration/rdma.c]
      Signed-off-by: Eric Blake <eblake@redhat.com>
  2. Aug 31, 2023
  3. Aug 01, 2023
    • io: remove io watch if TLS channel is closed during handshake · 10be627d
      Daniel P. Berrangé authored
      
      The TLS handshake may take some time to complete, during which time an
      I/O watch might be registered with the main loop. If the owner of the
      I/O channel invokes qio_channel_close() while the handshake is waiting
      to continue, the I/O watch must be removed. Failing to remove it will
      later trigger the completion callback, which the owner is not expecting
      to receive. In the case of the VNC server, this results in a SEGV as
      vnc_disconnect_start() tries to shut down a client connection that is
      already gone / NULL.
      
      CVE-2023-3354
      Reported-by: jiangyegen <jiangyegen@huawei.com>
      Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
  4. May 19, 2023
    • nbd/server: Fix drained_poll to wake coroutine in right AioContext · 7c1f51bf
      Kevin Wolf authored
      
      nbd_drained_poll() generally runs in the main thread, not whatever
      iothread the NBD server coroutine is meant to run in, so it can't
      directly reenter the coroutines to wake them up.
      
      The code has the right intention: it specifies the correct
      AioContext when it calls qemu_aio_coroutine_enter(). However, this
      function doesn't schedule the coroutine to run in that AioContext; it
      assumes it is already being called from the AioContext's home thread.
      
      To fix this, add a new thread-safe qio_channel_wake_read() that can be
      called in the main thread to wake up the coroutine in its AioContext,
      and use this in nbd_drained_poll().
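
      A hedged sketch of the resulting call from the drained_poll callback
      (the client variable and the surrounding check are illustrative):

        /* runs in the main thread, not the coroutine's home AioContext */
        if (client->recv_coroutine) {
            qio_channel_wake_read(client->ioc); /* thread-safe wakeup */
        }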
      
      Cc: qemu-stable@nongnu.org
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      Message-Id: <20230517152834.277483-3-kwolf@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
  5. Apr 20, 2023
    • io: mark mixed functions that can suspend · 1dd91b22
      Paolo Bonzini authored
      
      There should be no paths from a coroutine_fn to aio_poll; in practice,
      however, a coroutine_mixed_fn will call aio_poll in the !qemu_in_coroutine()
      path.  By marking mixed functions, we can accurately track the call paths
      that execute entirely in coroutine context, and find more missing
      coroutine_fn markers.  This results in more accurate checks that
      coroutine code does not end up blocking.
      
      If the marking were extended transitively to all functions that call
      these ones, static analysis could be done much more efficiently.
      However, this is a start and makes it possible to use vrc's path-based
      searches to find potential bugs where coroutine_fns call blocking functions.
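
      A sketch of what such a marking looks like; the function names below are
      illustrative, not taken from the patch:

        /* callable both inside and outside coroutine context */
        int coroutine_mixed_fn my_pread(BlockBackend *blk, int64_t offset,
                                        QEMUIOVector *qiov)
        {
            if (qemu_in_coroutine()) {
                return my_co_pread(blk, offset, qiov); /* pure coroutine_fn path */
            }
            /* the !qemu_in_coroutine() path may call aio_poll() and block */
            return my_pread_poll(blk, offset, qiov);
        }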
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  6. Feb 06, 2023
  7. Jan 20, 2023
    • coroutine: Split qemu/coroutine-core.h off qemu/coroutine.h · 68ba85ce
      Markus Armbruster authored
      
      qemu/coroutine.h and qemu/lockable.h include each other.
      
      They need each other only in macro expansions, so we could simply drop
      both inclusions to break the loop, and add suitable includes to files
      that expand the macros.
      
      Instead, move a part of qemu/coroutine.h to a new qemu/coroutine-core.h
      so that qemu/coroutine-core.h doesn't need qemu/lockable.h, and
      qemu/lockable.h only needs qemu/coroutine-core.h.  Result:
      qemu/coroutine.h includes qemu/lockable.h, which includes
      qemu/coroutine-core.h.
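
      Sketched as the include each header ends up with:

        /* qemu/coroutine.h */
        #include "qemu/lockable.h"

        /* qemu/lockable.h */
        #include "qemu/coroutine-core.h" /* no longer the full qemu/coroutine.h */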
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Message-Id: <20221221131435.3851212-5-armbru@redhat.com>
      [Semantic rebase conflict with 7c10cb38 "accel/tcg: Add debuginfo
      support" resolved]
  8. Dec 14, 2022
  9. Oct 12, 2022
    • io/command: implement support for win32 · ec5b6c9c
      Marc-André Lureau authored
      
      The initial implementation changed the pipe state created by GLib
      to PIPE_NOWAIT, but it turns out that doesn't work (read/write returns an
      error). Since reading may return less than the requested amount, the pipe
      seems to be non-blocking already. However, the I/O operation may block
      until the FD is ready; I can't find good sources of information, so to be
      safe we just poll for readiness first.
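
      A sketch of the poll-before-read approach, assuming a readable fd from
      the GLib-created pipe (fd, buf and buflen are illustrative):

        GPollFD pfd = { .fd = fd, .events = G_IO_IN };
        g_poll(&pfd, 1, -1 /* wait until ready */);
        ssize_t n = read(fd, buf, buflen); /* may return less than buflen */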
      
      Alternatively, we could set up the FDs ourselves and use UNIX sockets on
      Windows, which can be used in blocking/non-blocking mode. I haven't
      tried it, as I am not sure it is necessary.
      
      Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
      Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
      Message-Id: <20221006113657.2656108-6-marcandre.lureau@redhat.com>
    • io/command: use glib GSpawn, instead of open-coding fork/exec · a95570e3
      Marc-André Lureau authored
      
      Simplify qio_channel_command_new_spawn() with the GSpawn API. This will
      allow building for WIN32 in the following patches.
      
      As pointed out by Daniel Berrangé: there is a change in semantics here
      too. The current code only touches stdin/stdout/stderr. Any other FDs
      which do NOT have O_CLOEXEC set will be inherited. With the new code,
      all FDs except stdin/out/err will be explicitly closed, because we don't
      set the flag G_SPAWN_LEAVE_DESCRIPTORS_OPEN. The only place we use
      QIOChannelCommand today is the migration exec: protocol, and that is
      only declared to use stdin/stdout.
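
      A minimal sketch of spawning through GSpawn without
      G_SPAWN_LEAVE_DESCRIPTORS_OPEN, so the child keeps only stdin/out/err
      (the argv is illustrative):

        const gchar *argv[] = { "/bin/cat", NULL };
        GPid pid;
        gint in_fd, out_fd;
        GError *err = NULL;

        if (!g_spawn_async_with_pipes(NULL, (gchar **)argv, NULL,
                                      G_SPAWN_DO_NOT_REAP_CHILD,
                                      NULL, NULL, &pid,
                                      &in_fd, &out_fd, NULL, &err)) {
            g_error("spawn failed: %s", err->message);
        }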
      
      Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
      Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
      Message-Id: <20221006113657.2656108-5-marcandre.lureau@redhat.com>
  10. Jun 22, 2022
  11. May 16, 2022
    • QIOChannelSocket: Implement io_writev zero copy flag & io_flush for CONFIG_LINUX · 2bc58ffc
      Leonardo Bras authored
      
      For CONFIG_LINUX, implement the new zero copy flag and the optional callback
      io_flush on QIOChannelSocket, enabling it only when the MSG_ZEROCOPY
      feature is available in the host kernel, which is checked in
      qio_channel_socket_connect_sync().
      
      qio_channel_socket_flush() was implemented by counting how many times
      sendmsg(...,MSG_ZEROCOPY) was successfully called, and then reading the
      socket's error queue to find out how many of those sends have finished.
      Flush loops until the two counters match, or until an error occurs.
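
      A sketch of that counting scheme using the raw Linux API (cmsg buffer
      sizing and error handling are simplified):

        #include <linux/errqueue.h>

        if (sendmsg(sockfd, &msg, MSG_ZEROCOPY) >= 0) {
            queued++;
        }
        ...
        while (completed < queued) { /* the flush loop */
            char cbuf[CMSG_SPACE(sizeof(struct sock_extended_err)) + 64];
            struct msghdr emsg = {
                .msg_control = cbuf, .msg_controllen = sizeof(cbuf),
            };
            if (recvmsg(sockfd, &emsg, MSG_ERRQUEUE) < 0) {
                break; /* e.g. EAGAIN: poll and retry */
            }
            struct sock_extended_err *serr =
                (void *)CMSG_DATA(CMSG_FIRSTHDR(&emsg));
            if (serr->ee_origin == SO_EE_ORIGIN_ZEROCOPY) {
                completed += serr->ee_data - serr->ee_info + 1; /* inclusive */
            }
        }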
      
      Notes on using writev() with QIO_CHANNEL_WRITE_FLAG_ZERO_COPY:
      1: Buffer
      - As MSG_ZEROCOPY tells the kernel to use the same user buffer to avoid copying,
      some caution is necessary to avoid overwriting any buffer before it's sent.
      If something like this happens, a newer version of the buffer may be sent instead.
      - If this is a problem, it's recommended to call qio_channel_flush() before freeing
      or re-using the buffer.
      
      2: Locked memory
      - When using MSG_ZEROCOPY, the buffer memory will be locked after being queued, and
      unlocked after it's sent.
      - Depending on the size of each buffer, and how often it's sent, this may require
      a larger amount of locked memory than is usually available to a non-root user.
      - If the required amount of locked memory is not available, writev_zero_copy
      will return an error, which can abort an operation like migration.
      - Because of this, when user code wants to add zero copy as a feature, it
      requires a mechanism to disable it, so the feature can still be accessible to less
      privileged users.
      
      Signed-off-by: Leonardo Bras <leobras@redhat.com>
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Message-Id: <20220513062836.965425-4-leobras@redhat.com>
      Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
    • QIOChannel: Add flags on io_writev and introduce io_flush callback · b88651cb
      Leonardo Bras authored
      
      Add flags to io_writev and introduce io_flush as optional callback to
      QIOChannelClass, allowing the implementation of zero copy writes by
      subclasses.
      
      How to use them:
      - Write data using qio_channel_writev*(...,QIO_CHANNEL_WRITE_FLAG_ZERO_COPY),
      - Wait for write completion with qio_channel_flush().
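
      A short usage sketch, assuming an established QIOChannel *ioc and a
      filled iov:

        Error *err = NULL;
        qio_channel_writev_full_all(ioc, iov, niov, NULL, 0,
                                    QIO_CHANNEL_WRITE_FLAG_ZERO_COPY, &err);
        ... /* the buffers in iov must stay untouched here */
        qio_channel_flush(ioc, &err); /* returns once queued writes finish */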
      
      Notes:
      As some zero copy write implementations work asynchronously, it's
      recommended to keep the write buffer untouched until qio_channel_flush()
      returns, to avoid the risk of sending an updated buffer instead of the
      buffer's state at the time of the write.
      
      As the io_flush callback is optional, if a subclass does not implement it,
      then io_flush will return 0 without changing anything.
      
      Also, some functions like qio_channel_writev_full_all() were adapted to
      receive a flags parameter. That allows code to be shared between zero copy
      and non-zero copy writev, and makes it easier to implement new flags.
      
      Signed-off-by: Leonardo Bras <leobras@redhat.com>
      Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Message-Id: <20220513062836.965425-3-leobras@redhat.com>
      Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  12. May 03, 2022
  13. Feb 10, 2021
  14. Jan 13, 2021
  15. Oct 29, 2020
  16. Oct 27, 2020
  17. Sep 18, 2020
  18. Sep 09, 2020
  19. Jun 10, 2020
  20. Dec 18, 2019
  21. Sep 03, 2019
  22. Jun 12, 2019
    • Include qemu-common.h exactly where needed · a8d25326
      Markus Armbruster authored
      
      No header includes qemu-common.h after this commit, as prescribed by
      qemu-common.h's file comment.
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Message-Id: <20190523143508.25387-5-armbru@redhat.com>
      [Rebased with conflicts resolved automatically, except for
      include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c
      block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c
      target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h
      target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h
      target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h
      target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and
      net/tap-bsd.c fixed up]
  23. Feb 25, 2019
  24. Feb 12, 2019
    • io: add qio_task_wait_thread to join with a background thread · dbb44504
      Daniel P. Berrangé authored
      
      Add the ability for a caller to wait for completion of the
      background thread to synchronously dispatch its result, without
      needing to wait for the main loop to run the idle callback.
      
      This method needs very careful usage to avoid a dangerous
      race condition with the freeing of the task. The completion
      callback is normally invoked from an idle callback registered
      with the main loop context. The qio_task_wait_thread method
      must only be called if the completion callback has not yet
      run. The only safe way to achieve this is to run the
      qio_task_wait_thread method from the thread that executes
      the main loop.
      
      It is generally a bad idea to use this method since it blocks
      execution of the main loop; however, the design of the character
      devices and their usage from vhost-user already requires blocking
      execution.
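
      A sketch of the intended usage, valid only from the main loop thread and
      only before the completion callback has run (worker and callback names
      are illustrative):

        QIOTask *task = qio_task_new(OBJECT(src), my_complete_cb, opaque, NULL);
        qio_task_run_in_thread(task, my_worker_func, data, NULL, NULL);
        ...
        qio_task_wait_thread(task); /* joins the thread, runs my_complete_cb */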
      
      Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
      Message-Id: <20190211182442.8542-3-berrange@redhat.com>
      Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
  25. Nov 19, 2018
    • io: return 0 for EOF in TLS session read after shutdown · a2458b6f
      Daniel P. Berrangé authored
      
      GNUTLS takes a paranoid approach when seeing 0 bytes returned by the
      underlying OS read() function. It will consider this an error and
      return GNUTLS_E_PREMATURE_TERMINATION instead of propagating the 0
      return value. It expects apps to arrange for clean termination at
      the protocol level and not rely on seeing EOF from a read call to
      detect shutdown. This is to harden apps against a malicious 3rd party
      causing termination of the sockets layer.
      
      This is unhelpful for the QEMU NBD code which does have a clean
      protocol level shutdown, but still relies on seeing 0 from the I/O
      channel read in the coroutine handling incoming replies.
      
      The upshot is that when using a plain NBD connection, shutdown is
      silent, but when using TLS, the client spams the console with
      
        Cannot read from TLS channel: Broken pipe
      
      The NBD connection has, however, called qio_channel_shutdown()
      at this point to indicate that it is done with I/O. This gives
      the opportunity to optimize the code such that when the channel
      has been shut down in the read direction, the error code
      GNUTLS_E_PREMATURE_TERMINATION gets turned into a '0' return
      instead of an error.
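
      A sketch of that translation inside the TLS read path (the shutdown
      bookkeeping field is illustrative):

        ret = gnutls_record_recv(session, buf, len);
        if (ret == GNUTLS_E_PREMATURE_TERMINATION &&
            (tioc->shutdown & QIO_CHANNEL_SHUTDOWN_READ)) {
            return 0; /* reads were shut down: report clean EOF, not an error */
        }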
      
      Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
      Message-Id: <20181119134228.11031-1-berrange@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
  26. Mar 06, 2018