  1. May 16, 2022
    • QIOChannelSocket: Implement io_writev zero copy flag & io_flush for CONFIG_LINUX · 2bc58ffc
      Leonardo Bras authored
      
      For CONFIG_LINUX, implement the new zero copy flag and the optional callback
      io_flush on QIOChannelSocket, but enable them only when the MSG_ZEROCOPY
      feature is available in the host kernel, which is checked in
      qio_channel_socket_connect_sync().
      
      qio_channel_socket_flush() was implemented by counting how many times
      sendmsg(...,MSG_ZEROCOPY) was successfully called, and then reading the
      socket's error queue, in order to find how many of them finished sending.
      Flush will loop until those counters are the same, or until some error occurs.
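The counting scheme described above can be sketched in isolation. This is a minimal model, not QEMU's code: the struct and function names are hypothetical, and it assumes (as the Linux MSG_ZEROCOPY notification format documents) that each error-queue notification reports an inclusive range of completed send calls, so one notification may cover several sends.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping; QEMU keeps similar counters in its own struct. */
struct zc_counters {
    uint64_t queued; /* successful sendmsg(..., MSG_ZEROCOPY) calls */
    uint64_t sent;   /* calls the kernel has reported as completed */
};

/* Call once after each successful zero copy sendmsg(). */
static void zc_note_queued(struct zc_counters *c)
{
    assert(c != NULL);
    c->queued++;
}

/*
 * Account for one completion notification covering the inclusive
 * range [lo, hi] of send calls. The subtraction is done in 32-bit
 * unsigned arithmetic so the count stays right if the ids wrap.
 */
static void zc_note_completed(struct zc_counters *c, uint32_t lo, uint32_t hi)
{
    assert(c != NULL);
    c->sent += (uint64_t)(uint32_t)(hi - lo) + 1;
}

/* Flush may stop looping once every queued send has completed. */
static bool zc_flush_done(const struct zc_counters *c)
{
    assert(c != NULL);
    return c->sent == c->queued;
}
```

A real flush would call something like zc_note_completed() for each notification read from the socket's error queue and keep waiting until zc_flush_done() holds or an error occurs.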
      
      Notes on using writev() with QIO_CHANNEL_WRITE_FLAG_ZERO_COPY:
      1: Buffer
      - As MSG_ZEROCOPY tells the kernel to use the same user buffer to avoid copying,
      some caution is necessary to avoid overwriting any buffer before it's sent.
      If something like this happens, a newer version of the buffer may be sent instead.
      - If this is a problem, it's recommended to call qio_channel_flush() before freeing
      or re-using the buffer.
      
      2: Locked memory
      - When using MSG_ZEROCOPY, the buffer memory will be locked once queued, and
      unlocked after it's sent.
      - Depending on the size of each buffer, and how often it's sent, it may require
      a larger amount of locked memory than usually available to non-root user.
      - If the required amount of locked memory is not available, writev_zero_copy
      will return an error, which can abort an operation like migration.
      - Because of this, when user code wants to add zero copy as a feature, it
      requires a mechanism to disable it, so it can still be accessible to less
      privileged users.
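One way a caller might decide whether to turn zero copy on is sketched below. This is an illustration under assumptions, not the mechanism QEMU uses: the function names and the idea of comparing an estimated byte count against RLIMIT_MEMLOCK are hypothetical, and the estimate itself is left to the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical check: does the soft locked-memory limit cover wanted_bytes? */
static bool zc_memlock_allows(size_t wanted_bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0) {
        return false; /* cannot tell, so stay on the copying path */
    }
    if (rl.rlim_cur == RLIM_INFINITY) {
        return true;
    }
    return (rlim_t)wanted_bytes <= rl.rlim_cur;
}

/*
 * Zero copy is used only if the user asked for it, the kernel
 * supports it, and the locked-memory limit looks sufficient.
 * Leaving user_requested false is the "mechanism to disable it".
 */
static bool zc_should_enable(bool user_requested, bool kernel_supports,
                             size_t wanted_bytes)
{
    return user_requested && kernel_supports &&
           zc_memlock_allows(wanted_bytes);
}
```

Even with such a check, a write can still fail later if the limit is reached, so the error path described above remains necessary.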
      
      Signed-off-by: Leonardo Bras <leobras@redhat.com>
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Message-Id: <20220513062836.965425-4-leobras@redhat.com>
      Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
    • QIOChannel: Add flags on io_writev and introduce io_flush callback · b88651cb
      Leonardo Bras authored
      
      Add flags to io_writev and introduce io_flush as optional callback to
      QIOChannelClass, allowing the implementation of zero copy writes by
      subclasses.
      
      How to use them:
      - Write data using qio_channel_writev*(...,QIO_CHANNEL_WRITE_FLAG_ZERO_COPY),
      - Wait write completion with qio_channel_flush().
      
      Notes:
      As some zero copy write implementations work asynchronously, it's
      recommended to keep the write buffer untouched until the return of
      qio_channel_flush(), to avoid the risk of sending an updated buffer
      instead of the buffer state during write.
      
      As io_flush callback is optional, if a subclass does not implement it, then:
      - io_flush will return 0 without changing anything.
      
      Also, some functions like qio_channel_writev_full_all() were adapted to
      receive a flag parameter. That allows shared code between zero copy and
      non-zero copy writev, and also an easier implementation on new flags.
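The "optional callback" behaviour described above can be modelled with a plain function-pointer table. This is a simplified sketch with hypothetical types (struct chan, struct chan_ops), not QEMU's QOM class machinery: a channel whose class leaves io_flush unset gets a flush that returns 0 and changes nothing.

```c
#include <assert.h>
#include <stddef.h>

struct chan;

/* Hypothetical per-class operations table. */
struct chan_ops {
    int (*io_flush)(struct chan *c); /* optional, may be NULL */
};

struct chan {
    const struct chan_ops *ops;
    int flush_calls; /* only here so the example can observe calls */
};

/* Public entry point: a missing io_flush is treated as "nothing to wait for". */
static int chan_flush(struct chan *c)
{
    assert(c != NULL && c->ops != NULL);
    if (c->ops->io_flush == NULL) {
        return 0;
    }
    return c->ops->io_flush(c);
}

/* Example implementation a subclass might provide. */
static int counting_flush(struct chan *c)
{
    c->flush_calls++;
    return 0;
}
```

Callers can then invoke chan_flush() unconditionally after a zero copy write, regardless of which channel type they hold.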
      
      Signed-off-by: Leonardo Bras <leobras@redhat.com>
      Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Message-Id: <20220513062836.965425-3-leobras@redhat.com>
      Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  2. May 03, 2022
  3. Feb 10, 2021
  4. Jan 13, 2021
  5. Oct 29, 2020
  6. Oct 27, 2020
  7. Sep 18, 2020
  8. Sep 09, 2020
  9. Jun 10, 2020
  10. Dec 18, 2019
  11. Sep 03, 2019
  12. Jun 12, 2019
    • Include qemu-common.h exactly where needed · a8d25326
      Markus Armbruster authored
      
      No header includes qemu-common.h after this commit, as prescribed by
      qemu-common.h's file comment.
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Message-Id: <20190523143508.25387-5-armbru@redhat.com>
      [Rebased with conflicts resolved automatically, except for
      include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c
      block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c
      target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h
      target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h
      target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h
      target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and
      net/tap-bsd.c fixed up]
  13. Feb 25, 2019
  14. Feb 12, 2019
    • io: add qio_task_wait_thread to join with a background thread · dbb44504
      Daniel P. Berrangé authored
      
      Add the ability for a caller to wait for completion of the
      background thread to synchronously dispatch its result, without
      needing to wait for the main loop to run the idle callback.
      
      This method needs very careful usage to avoid a dangerous
      race condition with the free'ing of the task. The completion
      callback is normally invoked from an idle callback registered
      with the main loop context. The qio_task_wait_thread method
      must only be called if the completion callback has not yet
      run. The only safe way to achieve this is to run the
      qio_task_wait_thread method from the thread that executes
      the main loop.
      
      It is generally a bad idea to use this method since it will
      block execution of the main loop. However, the design of
      the character devices and their usage from vhostuser already
      requires blocking execution.
      
      Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
      Message-Id: <20190211182442.8542-3-berrange@redhat.com>
      Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
  15. Nov 19, 2018
    • io: return 0 for EOF in TLS session read after shutdown · a2458b6f
      Daniel P. Berrangé authored
      
      GNUTLS takes a paranoid approach when seeing 0 bytes returned by the
      underlying OS read() function. It will consider this an error and
      return GNUTLS_E_PREMATURE_TERMINATION instead of propagating the 0
      return value. It expects apps to arrange for clean termination at
      the protocol level and not rely on seeing EOF from a read call to
      detect shutdown. This is to harden apps against a malicious 3rd party
      causing termination of the sockets layer.
      
      This is unhelpful for the QEMU NBD code which does have a clean
      protocol level shutdown, but still relies on seeing 0 from the I/O
      channel read in the coroutine handling incoming replies.
      
      The upshot is that when using a plain NBD connection, shutdown is
      silent, but when using TLS, the client spams the console with
      
        Cannot read from TLS channel: Broken pipe
      
      The NBD connection has, however, called qio_channel_shutdown()
      at this point to indicate that it is done with I/O. This gives
      the opportunity to optimize the code such that when the channel
      has been shut down in the read direction, the error code
      GNUTLS_E_PREMATURE_TERMINATION gets turned into a '0' return
      instead of an error.
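The translation described above amounts to a small mapping on the TLS read result. The sketch below is a stand-alone model with hypothetical names: SKETCH_E_PREMATURE_TERMINATION is a placeholder for the constant that real code would take from the GnuTLS headers, and read_shutdown stands for whatever state the channel records when qio_channel_shutdown() closes the read direction.

```c
#include <stdbool.h>
#include <sys/types.h> /* ssize_t */

/* Placeholder; real code would use GNUTLS_E_PREMATURE_TERMINATION. */
#define SKETCH_E_PREMATURE_TERMINATION (-110)

/*
 * Map a raw TLS read result to what the channel reports.
 * After a read-side shutdown, premature termination is an
 * expected EOF (0); in every other case the value passes through.
 */
static ssize_t tls_read_result(ssize_t tls_ret, bool read_shutdown)
{
    if (tls_ret == SKETCH_E_PREMATURE_TERMINATION && read_shutdown) {
        return 0;
    }
    return tls_ret;
}
```

Without the read_shutdown condition, a connection torn down by a third party would also look like a clean EOF, which is the case GnuTLS is trying to guard against.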
      
      Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
      Message-Id: <20181119134228.11031-1-berrange@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
  16. Mar 06, 2018
  17. Mar 02, 2018
  18. Feb 15, 2018
  19. Dec 15, 2017
  20. Oct 16, 2017
    • io: get rid of bounce buffering in websock write path · 8dfd5f96
      Daniel P. Berrangé authored
      
      Currently most outbound I/O on the websock channel gets copied into the
      rawoutput buffer, and then immediately copied again into the encoutput
      buffer, with a header prepended. Now that qio_channel_websock_encode
      accepts a struct iovec, we can trivially remove this bounce buffering
      and write directly to encoutput.
      
      In doing so, we also now correctly validate the encoutput size against
      the QIO_CHANNEL_WEBSOCK_MAX_BUFFER limit.
      
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
    • io: simplify websocket ping reply handling · 57b0cdf1
      Daniel P. Berrangé authored
      
      We must ensure we don't get flooded with ping replies if the outbound
      channel is slow. Currently we do this by keeping the ping reply in a
      separate temporary buffer and only writing it if the encoutput buffer
      is completely empty. This is overly pessimistic, as it is reasonable
      to add a ping reply to the encoutput buffer even if it has previous
      data in it, as long as that previous data doesn't include a ping
      reply.
      
      To track this better, put the ping reply directly into the encoutput
      buffer, and then record the size of encoutput at this time in
      pong_remain. As we write encoutput to the underlying channel, we
      can decrement the pong_remain counter. Once it hits zero, we can
      accept further ping replies for transmission.
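The pong_remain bookkeeping described above can be modelled with two counters. This is a sketch with hypothetical names (struct ws_out, ws_queue_pong, ws_note_written) that tracks only lengths, not the actual frame bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of the outbound state: lengths only. */
struct ws_out {
    size_t enc_len;     /* bytes currently waiting in encoutput */
    size_t pong_remain; /* bytes to write before the queued reply is out */
};

/*
 * Try to queue a ping reply of pong_frame_len bytes. It is refused
 * while an earlier reply is still somewhere in the buffer.
 */
static bool ws_queue_pong(struct ws_out *w, size_t pong_frame_len)
{
    assert(w != NULL);
    if (w->pong_remain != 0) {
        return false;
    }
    w->enc_len += pong_frame_len;
    w->pong_remain = w->enc_len; /* everything up to and including the reply */
    return true;
}

/* Record that n bytes of encoutput reached the underlying channel. */
static void ws_note_written(struct ws_out *w, size_t n)
{
    assert(w != NULL && n <= w->enc_len);
    w->enc_len -= n;
    w->pong_remain = (n >= w->pong_remain) ? 0 : w->pong_remain - n;
}
```

Data appended after the reply grows enc_len but not pong_remain, so new replies are accepted again as soon as the earlier one has been written, even if later data is still buffered.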
      
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
  21. Oct 04, 2017
  22. Sep 06, 2017
    • io: Add new qio_channel_read{, v}_all_eof functions · e8ffaa31
      Eric Blake authored
      
      Some callers want to distinguish between clean EOF (no bytes read)
      vs. a short read (at least one byte read, but EOF encountered
      before reaching the desired length), as it allows clients the
      ability to do a graceful shutdown when a server shuts down at
      defined safe points in the protocol, rather than treating all
      shutdown scenarios as an error due to EOF.  However, we don't want
      to require all callers to have to check for early EOF.  So add
      another wrapper function that can be used by the callers that care
      about the distinction.
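The distinction between clean EOF and a short read can be illustrated on a plain file descriptor. This sketch uses POSIX read() rather than the QIOChannel API, and the function name and the 1 / 0 / -1 return convention are assumptions chosen for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Read exactly len bytes from fd.
 * Returns 1 if all bytes were read, 0 on clean EOF (no bytes read
 * at all), and -1 on an I/O error or on EOF after a partial read.
 */
static int read_all_eof(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = read(fd, p + done, len - done);
        if (n < 0) {
            if (errno == EINTR) {
                continue; /* interrupted, retry */
            }
            return -1;
        }
        if (n == 0) {
            /* EOF: clean only if nothing was consumed yet */
            return done == 0 ? 0 : -1;
        }
        done += (size_t)n;
    }
    return 1;
}
```

A caller that does not care about the distinction can wrap this and treat 0 as an error, which mirrors the split between the plain and the _eof variants described above.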
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20170905191114.5959-3-eblake@redhat.com>
      Acked-by: Daniel P. Berrange <berrange@redhat.com>
  23. Sep 05, 2017
  24. May 10, 2017