  1. Sep 22, 2022
  2. Aug 05, 2022
      QIOChannelSocket: Add support for MSG_ZEROCOPY + IPV6 · 5258a7e2
      Leonardo Bras authored
      
      For using MSG_ZEROCOPY, there are two steps:
      1 - io_writev() the packet, which enqueues the packet for sending, and
      2 - io_flush(), which gets confirmation that all packets got correctly sent
      
      Currently, if MSG_ZEROCOPY is used to send packets over IPV6, no error will
      be reported in (1), but it will fail the first time (2) happens.
      
      This happens because (2) currently checks for cmsg_level & cmsg_type
      associated with IPV4 only, before reporting any error.
      
      Add checks for cmsg_level & cmsg_type associated with IPV6, and thus enable
      support for MSG_ZEROCOPY + IPV6.
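      The check described above can be sketched roughly as follows. This is a
      minimal illustration, not the code from the commit: the helper name
      is_zero_copy_errqueue_cmsg() is hypothetical, and the constants are
      assumed to come from the Linux/glibc headers.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* assumption: needed on some setups to expose SOL_IP etc. */
#endif
#include <stdbool.h>
#include <sys/socket.h>
#include <netinet/in.h> /* SOL_IP, IP_RECVERR, SOL_IPV6, IPV6_RECVERR */

/*
 * Hypothetical helper: returns true when a control message read from the
 * socket error queue has the level/type pair used for error-queue messages
 * on IPv4 or on IPv6. Checking only the IPv4 pair is the behaviour the
 * commit message describes as the bug.
 */
static bool is_zero_copy_errqueue_cmsg(int cmsg_level, int cmsg_type)
{
    if (cmsg_level == SOL_IP && cmsg_type == IP_RECVERR) {
        return true; /* IPv4 socket */
    }
    if (cmsg_level == SOL_IPV6 && cmsg_type == IPV6_RECVERR) {
        return true; /* IPv6 socket */
    }
    return false;
}
```

      A flush loop would typically treat any other level/type pair as an
      unexpected message and report an error.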
      
      Fixes: 2bc58ffc ("QIOChannelSocket: Implement io_writev zero copy flag & io_flush for CONFIG_LINUX")
      Signed-off-by: Leonardo Bras <leobras@redhat.com>
      Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
  3. Jul 20, 2022
  4. Jun 22, 2022
  5. May 16, 2022
      QIOChannelSocket: Implement io_writev zero copy flag & io_flush for CONFIG_LINUX · 2bc58ffc
      Leonardo Bras authored
      
      For CONFIG_LINUX, implement the new zero copy flag and the optional callback
      io_flush on QIOChannelSocket, but enable it only when the MSG_ZEROCOPY
      feature is available in the host kernel, which is checked in
      qio_channel_socket_connect_sync().
      
      qio_channel_socket_flush() was implemented by counting how many times
      sendmsg(...,MSG_ZEROCOPY) was successfully called, and then reading the
      socket's error queue, in order to find how many of them finished sending.
      Flush will loop until those counters are the same, or until some error occurs.
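      The counting scheme above can be illustrated with a small, self-contained
      sketch. It is not the QEMU implementation: ZeroCopyCounters and both
      helpers are hypothetical names, and it assumes each error-queue
      notification reports an inclusive range [lo, hi] of completed send calls,
      as the kernel's MSG_ZEROCOPY notifications do.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one socket. */
typedef struct {
    uint64_t queued; /* successful sendmsg(..., MSG_ZEROCOPY) calls */
    uint64_t sent;   /* calls reported as finished via the error queue */
} ZeroCopyCounters;

/*
 * Account for one notification covering send calls lo..hi (inclusive).
 * The subtraction is done in 32 bits so a wrapped range still counts right.
 */
static void zc_account_completion(ZeroCopyCounters *c, uint32_t lo, uint32_t hi)
{
    c->sent += (uint64_t)(uint32_t)(hi - lo) + 1;
}

/* Flush may stop looping once every queued call has been reported. */
static bool zc_flush_done(const ZeroCopyCounters *c)
{
    return c->sent == c->queued;
}
```

      For example, with five queued calls, a notification for [0, 2] leaves the
      flush pending and a later one for [3, 4] completes it.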
      
      Notes on using writev() with QIO_CHANNEL_WRITE_FLAG_ZERO_COPY:
      1: Buffer
      - As MSG_ZEROCOPY tells the kernel to use the same user buffer to avoid copying,
      some caution is necessary to avoid overwriting any buffer before it's sent.
      If this happens, a newer version of the buffer may be sent instead.
      - If this is a problem, it's recommended to call qio_channel_flush() before freeing
      or re-using the buffer.
      
      2: Locked memory
      - When using MSG_ZEROCOPY, the buffer memory will be locked once queued, and
      unlocked after it's sent.
      - Depending on the size of each buffer, and how often it's sent, it may require
      a larger amount of locked memory than is usually available to a non-root user.
      - If the required amount of locked memory is not available, writev_zero_copy
      will return an error, which can abort an operation like migration.
      - Because of this, when user code adds zero copy as a feature, it needs
      a mechanism to disable it, so the code remains usable by less
      privileged users.
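      One way to picture the "mechanism to disable it" is a guard that combines
      a user-facing switch with a look at the locked-memory limit. This is only
      a sketch under stated assumptions: the helper names are hypothetical, the
      commit itself relies on the write returning an error rather than on a
      pre-check, and RLIMIT_MEMLOCK is assumed to be the limit that applies.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/resource.h>

/* Pure decision: does `needed` bytes fit under the given soft limit? */
static bool fits_memlock_limit(rlim_t limit, size_t needed)
{
    return limit == RLIM_INFINITY || (rlim_t)needed <= limit;
}

/*
 * Hypothetical guard: use zero copy only if the user enabled it and the
 * estimated amount of in-flight buffer memory fits RLIMIT_MEMLOCK.
 */
static bool zero_copy_allowed(bool user_enabled, size_t needed)
{
    struct rlimit rl;

    if (!user_enabled) {
        return false; /* the switch that keeps the feature optional */
    }
    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0) {
        return false; /* cannot tell, so fall back to copying writes */
    }
    return fits_memlock_limit(rl.rlim_cur, needed);
}
```

      A caller that gets false would simply issue ordinary (copying) writes,
      which keeps the operation available to unprivileged users.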
      
      Signed-off-by: Leonardo Bras <leobras@redhat.com>
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Message-Id: <20220513062836.965425-4-leobras@redhat.com>
      Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
      QIOChannel: Add flags on io_writev and introduce io_flush callback · b88651cb
      Leonardo Bras authored
      
      Add flags to io_writev and introduce io_flush as optional callback to
      QIOChannelClass, allowing the implementation of zero copy writes by
      subclasses.
      
      How to use them:
      - Write data using qio_channel_writev*(...,QIO_CHANNEL_WRITE_FLAG_ZERO_COPY),
      - Wait for write completion with qio_channel_flush().
      
      Notes:
      As some zero copy write implementations work asynchronously, it's
      recommended to keep the write buffer untouched until the return of
      qio_channel_flush(), to avoid the risk of sending an updated buffer
      instead of the buffer state during write.
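      The hazard in the note above can be reproduced with a toy model. The
      ToyChannel type and its two functions are hypothetical stand-ins, not the
      QIOChannel API: the write only borrows the caller's pointer, and the data
      is read when the flush runs, which mimics an asynchronous zero copy send.

```c
#include <stddef.h>
#include <string.h>

/* Toy channel: remembers the caller's buffer instead of copying it. */
typedef struct {
    const char *pending; /* borrowed pointer to the caller's buffer */
    size_t len;
    char wire[64];       /* what actually "went out" */
} ToyChannel;

/* Queue a zero-copy style write: no data is copied at this point. */
static int toy_write_zero_copy(ToyChannel *c, const char *buf, size_t len)
{
    if (len > sizeof(c->wire)) {
        return -1;
    }
    c->pending = buf;
    c->len = len;
    return 0;
}

/* Complete the send: the buffer is read now, in its current state. */
static int toy_flush(ToyChannel *c)
{
    if (c->pending) {
        memcpy(c->wire, c->pending, c->len);
        c->pending = NULL;
    }
    return 0;
}
```

      If the caller writes "old", overwrites the buffer with "new", and only
      then flushes, the toy channel sends "new". Flushing before touching the
      buffer sends "old", which is the ordering the note recommends.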
      
      As io_flush callback is optional, if a subclass does not implement it, then:
      - io_flush will return 0 without changing anything.
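      The optional-callback behaviour can be sketched with a plain function
      pointer. ToyChannelClass, toy_channel_flush() and toy_counting_flush()
      are hypothetical names used for illustration only, not the QEMU
      definitions.

```c
#include <stddef.h>

/* Toy class with one optional callback. */
typedef struct {
    int (*io_flush)(void *opaque); /* may be NULL when unimplemented */
} ToyChannelClass;

/* Wrapper: a missing callback means "nothing to flush", so report 0. */
static int toy_channel_flush(const ToyChannelClass *klass, void *opaque)
{
    if (!klass->io_flush) {
        return 0;
    }
    return klass->io_flush(opaque);
}

/* Example callback: counts how often it was invoked. */
static int toy_counting_flush(void *opaque)
{
    int *calls = opaque;

    (*calls)++;
    return 0;
}
```

      Callers can then invoke the wrapper unconditionally, whether or not the
      subclass supports zero copy.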
      
      Also, some functions like qio_channel_writev_full_all() were adapted to
      receive a flag parameter. That allows shared code between zero copy and
      non-zero copy writev, and also an easier implementation of new flags.
      
      Signed-off-by: Leonardo Bras <leobras@redhat.com>
      Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Message-Id: <20220513062836.965425-3-leobras@redhat.com>
      Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  6. May 03, 2022
  7. Apr 06, 2022
  8. Mar 22, 2022
  9. Jan 12, 2022
      aio-posix: split poll check from ready handler · 826cc324
      Stefan Hajnoczi authored
      
      Adaptive polling measures the execution time of the polling check plus
      handlers called when a polled event becomes ready. Handlers can take a
      significant amount of time, making it look like polling was running for
      a long time when in fact the event handler was running for a long time.
      
      For example, on Linux the io_submit(2) syscall invoked when a virtio-blk
      device's virtqueue becomes ready can take tens of microseconds. This
      can exceed the default polling interval (32 microseconds) and cause
      adaptive polling to stop polling.
      
      By excluding the handler's execution time from the polling check we make
      the adaptive polling calculation more accurate. As a result, the event
      loop now stays in polling mode where previously it would have fallen
      back to file descriptor monitoring.
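      The effect on the measurement can be shown with a deliberately simplified
      model. It is not the adaptive polling algorithm itself: keeps_polling()
      is a hypothetical helper that only compares one measured duration against
      a maximum, and the 2 microsecond poll check used in the usage note is an
      assumed figure.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified model: polling continues while the measured time stays within
 * poll_max_ns. `count_handler` selects whether the handler's run time is
 * included in the measurement (old behaviour) or excluded (new behaviour).
 */
static bool keeps_polling(int64_t check_ns, int64_t handler_ns,
                          bool count_handler, int64_t poll_max_ns)
{
    int64_t measured_ns = check_ns + (count_handler ? handler_ns : 0);

    return measured_ns <= poll_max_ns;
}
```

      With a 35 microsecond handler and a 32 microsecond maximum,
      keeps_polling(2000, 35000, true, 32000) is false, while
      keeps_polling(2000, 35000, false, 32000) is true.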
      
      The following data was collected with virtio-blk num-queues=2
      event_idx=off using an IOThread. Before:
      
      168k IOPS, IOThread syscalls:
      
        9837.115 ( 0.020 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 16, iocbpp: 0x7fcb9f937db0)    = 16
        9837.158 ( 0.002 ms): IO iothread1/620155 write(fd: 103, buf: 0x556a2ef71b88, count: 8)                         = 8
        9837.161 ( 0.001 ms): IO iothread1/620155 write(fd: 104, buf: 0x556a2ef71b88, count: 8)                         = 8
        9837.163 ( 0.001 ms): IO iothread1/620155 ppoll(ufds: 0x7fcb90002800, nfds: 4, tsp: 0x7fcb9f1342d0, sigsetsize: 8) = 3
        9837.164 ( 0.001 ms): IO iothread1/620155 read(fd: 107, buf: 0x7fcb9f939cc0, count: 512)                        = 8
        9837.174 ( 0.001 ms): IO iothread1/620155 read(fd: 105, buf: 0x7fcb9f939cc0, count: 512)                        = 8
        9837.176 ( 0.001 ms): IO iothread1/620155 read(fd: 106, buf: 0x7fcb9f939cc0, count: 512)                        = 8
        9837.209 ( 0.035 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 32, iocbpp: 0x7fca7d0cebe0)    = 32
      
      After:
      
      174k IOPS (+3.6%), IOThread syscalls:
      
        9809.566 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0cdd62be0)    = 32
        9809.625 ( 0.001 ms): IO iothread1/623061 write(fd: 103, buf: 0x5647cfba5f58, count: 8)                         = 8
        9809.627 ( 0.002 ms): IO iothread1/623061 write(fd: 104, buf: 0x5647cfba5f58, count: 8)                         = 8
        9809.663 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0d0388b50)    = 32
      
      Notice that ppoll(2) and eventfd read(2) syscalls are eliminated because
      the IOThread stays in polling mode instead of falling back to file
      descriptor monitoring.
      
      As usual, polling is not implemented on Windows so this patch ignores
      the new io_poll_ready() callback in aio-win32.c.
      
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Message-id: 20211207132336.36627-2-stefanha@redhat.com
      
      [Fixed up aio_set_event_notifier() calls in
      tests/unit/test-fdmon-epoll.c added after this series was queued.
      --Stefan]
      
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
  10. Sep 30, 2021
  11. Jul 14, 2021
  12. Jun 08, 2021
  13. Jun 02, 2021
  14. Feb 12, 2021
  15. Feb 10, 2021
  16. Jan 13, 2021
  17. Jan 12, 2021
  18. Oct 29, 2020
  19. Oct 27, 2020
  20. Oct 12, 2020
  21. Sep 18, 2020
  22. Sep 16, 2020
  23. Aug 21, 2020
  24. Jun 10, 2020
  25. Apr 29, 2020
      io: Fix qio_channel_socket_close() error handling · fdceb4ab
      Markus Armbruster authored
      
      The Error ** argument must be NULL, &error_abort, &error_fatal, or a
      pointer to a variable containing NULL.  Passing an argument of the
      latter kind twice without clearing it in between is wrong: if the
      first call sets an error, it no longer points to NULL for the second
      call.
      
      qio_channel_socket_close() passes @errp first to
      socket_listen_cleanup(), and then, if closesocket() fails, to
      error_setg_errno().  If socket_listen_cleanup() failed, this will trip
      the assertion in error_setv().
      
      Fix by ignoring a second error.
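      The rule and the fix can be modelled with a toy error API. Everything
      below is a hypothetical re-implementation for illustration (the toy_*
      names are not QEMU functions): the setter asserts that the destination is
      still empty, and the propagate helper drops a second error instead of
      tripping that assertion.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Error {
    const char *msg;
} Error;

/* Toy setter: like the rule above, *errp must still be NULL here. */
static void toy_error_set(Error **errp, const char *msg)
{
    Error *e;

    if (!errp) {
        return; /* caller does not want error details */
    }
    assert(*errp == NULL);
    e = malloc(sizeof(*e));
    if (!e) {
        abort();
    }
    e->msg = msg;
    *errp = e;
}

/* Toy propagate: keeps the first error and discards any later one. */
static void toy_error_propagate(Error **dst, Error *local)
{
    if (!local) {
        return;
    }
    if (dst && *dst == NULL) {
        *dst = local;
    } else {
        free(local);
    }
}

/* Two steps that may both fail; the second reports via a local Error. */
static int toy_close(Error **errp, int cleanup_fails, int close_fails)
{
    Error *err = NULL;

    if (cleanup_fails) {
        toy_error_set(errp, "cleanup failed");
    }
    if (close_fails) {
        toy_error_set(&err, "close failed");
        toy_error_propagate(errp, err);
        return -1;
    }
    return 0;
}
```

      When both steps fail, the caller sees only the first message; passing
      errp straight to the second toy_error_set() call would instead hit the
      assertion.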
      
      Fixes: 73564c40
      Cc: Daniel P. Berrangé <berrange@redhat.com>
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
      Message-Id: <20200422130719.28225-11-armbru@redhat.com>
  26. Feb 07, 2020
  27. Sep 03, 2019