  1. Oct 26, 2022
    • virtio-blk: use BDRV_REQ_REGISTERED_BUF optimization hint · baf42268
      Stefan Hajnoczi authored
      
      Register guest RAM using BlockRAMRegistrar and set the
      BDRV_REQ_REGISTERED_BUF flag so block drivers can optimize memory
      accesses in I/O requests.
      
      This is for vdpa-blk, vhost-user-blk, and other I/O interfaces that rely
      on DMA mapping/unmapping.
      
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Message-id: 20221013185908.1297568-14-stefanha@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • blkio: implement BDRV_REQ_REGISTERED_BUF optimization · c5640b3e
      Stefan Hajnoczi authored
      
      Avoid bounce buffers when QEMUIOVector elements are within previously
      registered bdrv_register_buf() buffers.
      
      The idea is that emulated storage controllers will register guest RAM
      using bdrv_register_buf() and set the BDRV_REQ_REGISTERED_BUF flag on I/O
      requests. Therefore no blkio_map_mem_region() calls are necessary in the
      performance-critical I/O code path.
      
      This optimization doesn't apply if the I/O buffer is internally
      allocated by QEMU (e.g. qcow2 metadata). There we still take the slow
      path because BDRV_REQ_REGISTERED_BUF is not set.
      
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Message-id: 20221013185908.1297568-13-stefanha@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • stubs: add qemu_ram_block_from_host() and qemu_ram_get_fd() · 701bff24
      Stefan Hajnoczi authored
      
      The blkio block driver will need to look up the file descriptor for a
      given pointer. This is possible in softmmu builds where the RAMBlock API
      is available for querying guest RAM.
      
      Add stubs so tools like qemu-img that link the block layer still build
      successfully. In this case there is no guest RAM but that is fine.
      Bounce buffers and their file descriptors will be allocated with
      libblkio's blkio_alloc_mem_region() so we won't rely on QEMU's
      qemu_ram_get_fd() in that case.
      
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Message-id: 20221013185908.1297568-12-stefanha@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • exec/cpu-common: add qemu_ram_get_fd() · 6d998f3c
      Stefan Hajnoczi authored
      
      Add a function to get the file descriptor for a RAMBlock. Device
      emulation code typically uses the MemoryRegion APIs but vhost-style code
      may use RAMBlock directly for sharing guest memory with another process.
      
      This new API will be used by the libblkio block driver so it can share
      guest memory via .bdrv_register_buf().
      
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Message-id: 20221013185908.1297568-11-stefanha@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • block: add BlockRAMRegistrar · 7f9241d8
      Stefan Hajnoczi authored
      
      Emulated devices and other BlockBackend users wishing to take advantage
      of blk_register_buf() all have the same repetitive job: register
      RAMBlocks with the BlockBackend using RAMBlockNotifier.
      
      Add a BlockRAMRegistrar API to do this. A later commit will use this
      from hw/block/virtio-blk.c.
      
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Message-id: 20221013185908.1297568-10-stefanha@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • numa: use QLIST_FOREACH_SAFE() for RAM block notifiers · 4fdd0a1a
      Stefan Hajnoczi authored
      
      Make list traversal work when a callback removes a notifier
      mid-traversal. This is a cleanup to prevent bugs in the future.
      
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Message-id: 20221013185908.1297568-9-stefanha@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
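The traversal problem described in the QLIST_FOREACH_SAFE() commit can be sketched in isolation. This is a minimal illustration, not QEMU's code: the `Notifier` struct, `notifier_unlink()`, `notify_all()`, and the two callbacks are hypothetical names, and a hand-written singly linked list stands in for the QLIST macros. The point is only that the successor is read before the callback runs, so a callback that unlinks its own node cannot break the walk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical notifier node; QEMU's real list uses the QLIST macros. */
typedef struct Notifier Notifier;
struct Notifier {
    void (*cb)(Notifier **head, Notifier *self);
    int calls;       /* how many times the callback has run */
    Notifier *next;
};

/* Unlink n from the list and clear its link, as a freed node would be. */
static void notifier_unlink(Notifier **head, Notifier *n)
{
    for (Notifier **p = head; *p; p = &(*p)->next) {
        if (*p == n) {
            *p = n->next;
            n->next = NULL;
            return;
        }
    }
}

/* Safe traversal: save the successor before invoking the callback. */
static int notify_all(Notifier **head)
{
    int visited = 0;

    assert(head != NULL);
    for (Notifier *n = *head, *next; n; n = next) {
        next = n->next;      /* read before cb() can unlink n */
        n->calls++;
        n->cb(head, n);
        visited++;
    }
    return visited;
}

/* Callback that leaves the list alone. */
static void cb_keep(Notifier **head, Notifier *self)
{
    (void)head;
    (void)self;
}

/* Callback that removes its own node mid-traversal. */
static void cb_remove_self(Notifier **head, Notifier *self)
{
    notifier_unlink(head, self);
}
```

Saving the successor only protects against removal of the node currently being visited; a callback that removed the saved successor would still need a different scheme.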
    • block: return errors from bdrv_register_buf() · f4ec04ba
      Stefan Hajnoczi authored
      
      Registering an I/O buffer is only a performance optimization hint but it
      is still necessary to return errors when it fails.
      
      Later patches will need to detect errors when registering buffers but an
      immediate advantage is that error_report() calls are no longer needed in
      block driver .bdrv_register_buf() functions.
      
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Message-id: 20221013185908.1297568-8-stefanha@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • block: add BDRV_REQ_REGISTERED_BUF request flag · e8b65355
      Stefan Hajnoczi authored
      
      Block drivers may optimize I/O requests accessing buffers previously
      registered with bdrv_register_buf(). Checking whether all elements of a
      request's QEMUIOVector are within previously registered buffers is
      expensive, so we need a hint from the user to avoid costly checks.
      
      Add a BDRV_REQ_REGISTERED_BUF request flag to indicate that all
      QEMUIOVector elements in an I/O request are known to be within
      previously registered buffers.
      
      Always pass the flag through to driver read/write functions. There is
      little harm in passing the flag to a driver that does not use it, and
      doing so avoids changes across many block drivers. The alternative
      would require filter drivers to explicitly support the flag and pass it
      through to their children when the children support it. That's a lot of
      code changes and it's hard to remember to do that everywhere, leading to
      silently reduced performance when the flag is accidentally dropped.
      
      The only problematic scenario with the approach in this patch is when a
      driver passes the flag through to internal I/O requests that don't use
      the same I/O buffer. In that case the hint may be set when it should
      actually be clear. This is a rare case though so the risk is low.
      
      Some drivers have assert(!flags), which no longer works when
      BDRV_REQ_REGISTERED_BUF is passed in. These assertions aren't very
      useful anyway since the functions are called almost exclusively by
      bdrv_driver_preadv/pwritev() so if we get flags handling right there
      then the assertion is not needed.
      
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Message-id: 20221013185908.1297568-7-stefanha@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
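To make the cost argument in the BDRV_REQ_REGISTERED_BUF commit concrete, here is a rough sketch of the check the flag lets a driver skip. Everything in it is an assumption for illustration: `IoElem`, `RegisteredBuf`, `REQ_REGISTERED_BUF`, and the helper functions are hypothetical stand-ins, not QEMU's QEMUIOVector or its real flag value. Without the hint, a driver would have to compare every vector element against every registered buffer; with it, the caller vouches for the buffers and the scan is skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are not QEMU's real types. */
typedef struct { void *base; size_t len; } IoElem;          /* one vector element */
typedef struct { void *host; size_t size; } RegisteredBuf;  /* one registered buffer */

#define REQ_REGISTERED_BUF 0x1u /* illustrative value, not QEMU's */

/* True if [base, base + len) lies inside one registered buffer. */
static bool elem_is_registered(const IoElem *e,
                               const RegisteredBuf *bufs, size_t nbufs)
{
    uintptr_t start = (uintptr_t)e->base;

    for (size_t i = 0; i < nbufs; i++) {
        uintptr_t lo = (uintptr_t)bufs[i].host;

        /* Written to avoid overflow in start + len. */
        if (start >= lo && e->len <= bufs[i].size &&
            start - lo <= bufs[i].size - e->len) {
            return true;
        }
    }
    return false;
}

/* The costly path: O(niov * nbufs) comparisons per request. */
static bool iov_is_registered(const IoElem *iov, size_t niov,
                              const RegisteredBuf *bufs, size_t nbufs)
{
    assert(iov != NULL || niov == 0);
    for (size_t i = 0; i < niov; i++) {
        if (!elem_is_registered(&iov[i], bufs, nbufs)) {
            return false;
        }
    }
    return true;
}

/* The cheap path: trust the caller's hint and skip the scan. */
static bool can_skip_bounce_buffer(unsigned flags)
{
    return (flags & REQ_REGISTERED_BUF) != 0;
}
```

The sketch also shows why a wrongly set hint is a problem: `can_skip_bounce_buffer()` never looks at the buffers, so it cannot notice that an internal request uses a different, unregistered buffer.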
    • block: use BdrvRequestFlags type for supported flag fields · 98b3ddc7
      Stefan Hajnoczi authored
      
      Use the enum type so GDB displays the enum members instead of printing a
      numeric constant.
      
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Message-id: 20221013185908.1297568-6-stefanha@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • block: pass size to bdrv_unregister_buf() · 4f384011
      Stefan Hajnoczi authored
      
      The only implementor of bdrv_register_buf() is block/nvme.c, where the
      size is not needed when unregistering a buffer. This is because
      util/vfio-helpers.c can look up mappings by address.
      
      Future block drivers that implement bdrv_register_buf() may not be able
      to do their job given only the buffer address. Add a size argument to
      bdrv_unregister_buf().
      
      Also document the assumptions about
      bdrv_register_buf()/bdrv_unregister_buf() calls. The same <host, size>
      values that were given to bdrv_register_buf() must be given to
      bdrv_unregister_buf().
      
      gcc 11.2.1 emits a spurious warning that img_bench()'s buf_size local
      variable might be uninitialized, so it's necessary to silence the
      compiler.
      
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Message-id: 20221013185908.1297568-5-stefanha@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
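The <host, size> contract documented in the bdrv_unregister_buf() commit can be illustrated with a small table that only finds an entry when both values match. This is a sketch under stated assumptions, not the QEMU implementation: `BufRegistry`, `registry_add()`, and `registry_remove()` are hypothetical names, and returning `false` on a mismatch is a choice made here to make the contract visible, whereas the commit message only says callers must pass the same values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BUFS 8

/* Hypothetical registry of buffers keyed by <host, size>. */
typedef struct { void *host; size_t size; bool used; } BufEntry;
typedef struct { BufEntry entries[MAX_BUFS]; } BufRegistry;

/* Register a buffer; fails on bad arguments or when the table is full. */
static bool registry_add(BufRegistry *r, void *host, size_t size)
{
    assert(r != NULL);
    if (host == NULL || size == 0) {
        return false;
    }
    for (size_t i = 0; i < MAX_BUFS; i++) {
        if (!r->entries[i].used) {
            r->entries[i] = (BufEntry){ host, size, true };
            return true;
        }
    }
    return false;
}

/* Unregister a buffer; both host and size must match the registration. */
static bool registry_remove(BufRegistry *r, void *host, size_t size)
{
    assert(r != NULL);
    for (size_t i = 0; i < MAX_BUFS; i++) {
        BufEntry *e = &r->entries[i];

        if (e->used && e->host == host && e->size == size) {
            e->used = false;
            return true;
        }
    }
    return false;
}
```

A driver that can look up mappings by address alone would not need the size; the sketch models the other kind, where the pair is the key.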
    • numa: call ->ram_block_removed() in ram_block_notifer_remove() · 1f0fea38
      Stefan Hajnoczi authored
      
      When a RAMBlockNotifier is added, ->ram_block_added() is called with all
      existing RAMBlocks. There is no equivalent ->ram_block_removed() call
      when a RAMBlockNotifier is removed.
      
      The util/vfio-helpers.c code (the sole user of RAMBlockNotifier) is fine
      with this asymmetry because it does not rely on RAMBlockNotifier for
      cleanup. It walks its internal list of DMA mappings and unmaps them by
      itself.
      
      Future users of RAMBlockNotifier may not have an internal data structure
      that records added RAMBlocks so they will need ->ram_block_removed()
      callbacks.
      
      This patch makes ram_block_notifier_remove() symmetric with respect to
      callbacks. Now util/vfio-helpers.c needs to unmap remaining DMA mappings
      after ram_block_notifier_remove() has been called. This is necessary
      since users like block/nvme.c may create additional DMA mappings that do
      not originate from the RAMBlockNotifier.
      
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Message-id: 20221013185908.1297568-4-stefanha@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
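The symmetry that the ram_block_removed() commit introduces can be shown with a toy notifier that keeps no record of its own. The names below (`BlockNotifier`, `block_notifier_add()`, `block_notifier_remove()`, the counting callbacks, and the fixed `existing_blocks` array) are hypothetical and only model the behaviour described in the commit message: adding a notifier replays every existing block through the "added" callback, and removing it replays them through the "removed" callback.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_BLOCKS 3

/* Hypothetical notifier with paired callbacks. */
typedef struct BlockNotifier BlockNotifier;
struct BlockNotifier {
    void (*added)(BlockNotifier *n, int block_id);
    void (*removed)(BlockNotifier *n, int block_id);
    int live; /* blocks this notifier has been told about */
};

/* Stand-in for the set of RAM blocks that already exist. */
static const int existing_blocks[NUM_BLOCKS] = { 10, 11, 12 };

/* Adding a notifier reports every existing block to it. */
static void block_notifier_add(BlockNotifier *n)
{
    assert(n != NULL && n->added != NULL);
    for (size_t i = 0; i < NUM_BLOCKS; i++) {
        n->added(n, existing_blocks[i]);
    }
}

/* Removing a notifier reports every block as removed, mirroring add. */
static void block_notifier_remove(BlockNotifier *n)
{
    assert(n != NULL && n->removed != NULL);
    for (size_t i = 0; i < NUM_BLOCKS; i++) {
        n->removed(n, existing_blocks[i]);
    }
}

/* A user that tracks nothing but a count, relying on the callbacks. */
static void count_added(BlockNotifier *n, int block_id)
{
    (void)block_id;
    n->live++;
}

static void count_removed(BlockNotifier *n, int block_id)
{
    (void)block_id;
    n->live--;
}
```

With the asymmetric behaviour, `live` would stay at `NUM_BLOCKS` after removal and the user would need its own list to clean up.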
    • blkio: add libblkio block driver · fd66dbd4
      Stefan Hajnoczi authored
      libblkio (https://gitlab.com/libblkio/libblkio/) is a library for
      high-performance disk I/O. It currently supports io_uring,
      virtio-blk-vhost-user, and virtio-blk-vhost-vdpa with additional drivers
      under development.
      
      One of the reasons for developing libblkio is that other applications
      besides QEMU can use it. This will be particularly useful for
      virtio-blk-vhost-user which applications may wish to use for connecting
      to qemu-storage-daemon.
      
      libblkio also gives us an opportunity to develop in Rust behind a C API
      that is easy to consume from QEMU.
      
      This commit adds io_uring, nvme-io_uring, virtio-blk-vhost-user, and
      virtio-blk-vhost-vdpa BlockDrivers to QEMU using libblkio. It will be
      easy to add other libblkio drivers since they will share the majority of
      code.
      
      For now I/O buffers are copied through bounce buffers if the libblkio
      driver requires it. Later commits add an optimization for
      pre-registering guest RAM to avoid bounce buffers.
      
      The syntax is:
      
        --blockdev io_uring,node-name=drive0,filename=test.img,readonly=on|off,cache.direct=on|off
      
        --blockdev nvme-io_uring,node-name=drive0,filename=/dev/ng0n1,readonly=on|off,cache.direct=on
      
        --blockdev virtio-blk-vhost-vdpa,node-name=drive0,path=/dev/vdpa...,readonly=on|off,cache.direct=on
      
        --blockdev virtio-blk-vhost-user,node-name=drive0,path=vhost-user-blk.sock,readonly=on|off,cache.direct=on
      
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Acked-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Message-id: 20221013185908.1297568-3-stefanha@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • coroutine: add flag to re-queue at front of CoQueue · 0421b563
      Stefan Hajnoczi authored
      
      When a coroutine wakes up it may determine that it must re-queue.
      Normally coroutines are pushed onto the back of the CoQueue, but for
      fairness it may be necessary to push it onto the front of the CoQueue.
      
      Add a flag to specify that the coroutine should be pushed onto the front
      of the CoQueue. A later patch will use this to ensure fairness in the
      bounce buffer CoQueue used by the blkio BlockDriver.
      
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Message-id: 20221013185908.1297568-2-stefanha@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
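The fairness idea in the CoQueue commit can be sketched with a plain ring buffer of waiter ids. This is an illustration only: `WaitQueue`, `WaitFlags`, `WAIT_FRONT`, and the push/pop helpers are hypothetical names, and integers stand in for coroutines. A waiter that wakes up, finds it still cannot proceed, and re-queues with the front flag keeps its place ahead of waiters that arrived later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAP 8

/* Hypothetical FIFO of waiter ids standing in for a queue of coroutines. */
typedef struct {
    int items[QUEUE_CAP];
    size_t head; /* index of the next waiter to wake */
    size_t len;  /* number of queued waiters */
} WaitQueue;

typedef enum { WAIT_BACK = 0, WAIT_FRONT = 1 } WaitFlags;

/* Queue a waiter at the back (default) or at the front (WAIT_FRONT). */
static bool wait_queue_push(WaitQueue *q, int id, WaitFlags flags)
{
    assert(q != NULL);
    if (q->len == QUEUE_CAP) {
        return false;
    }
    if (flags & WAIT_FRONT) {
        q->head = (q->head + QUEUE_CAP - 1) % QUEUE_CAP;
        q->items[q->head] = id;
    } else {
        q->items[(q->head + q->len) % QUEUE_CAP] = id;
    }
    q->len++;
    return true;
}

/* Wake the waiter at the front of the queue. */
static bool wait_queue_pop(WaitQueue *q, int *id)
{
    assert(q != NULL && id != NULL);
    if (q->len == 0) {
        return false;
    }
    *id = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->len--;
    return true;
}
```

If waiter 1 re-queued at the back instead, waiter 2 would overtake it even though waiter 1 had been waiting longer.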
  2. Oct 25, 2022
    • Merge tag 'trivial-branch-for-7.2-pull-request' of https://gitlab.com/laurent_vivier/qemu into staging · 79fc2fb6
      Stefan Hajnoczi authored
      
      Pull request
      
      # -----BEGIN PGP SIGNATURE-----
      #
      # iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAmNXleQSHGxhdXJlbnRA
      # dml2aWVyLmV1AAoJEPMMOL0/L748TIsP/1gulTFpYAs3Kao6IZonsuCzrjQrJWqv
      # 5SD7cVb7isOWdOSNK3glE4dG54Q38PaS9GHaCvzIndjHxlWddCCUuwiw6p1Wdo70
      # fjNfcCOEPoalQbkZvLejhs5n2rlfTvS5JUnLKVD9+ton7hjnTyKGDDYao5mYhtzv
      # Kn9NpCD3m+K3orzG2Jj7jR1UAumg4cW4YQEpT8ItDT4Y5UAxjL6TZQ6CE220DQDq
      # YwDrHEgDYr/UKlTbIC/JwlKOLr0sh+UB1VV8GZS6e6pU9u5WpDDHlQZpU8W2tLLg
      # cG5m8tLG2avFxRMUFrPNZ8Lx2xKO8wL1PtgAO9w7qFK+r0soZvv+Zh4ev/t5zGLf
      # ciliItqf97yPYNIc3su75jqdQHed7lmZc3m9LBHg8VXN6rAatt8vWUbG90sAZuTU
      # tWBZHvQmG0s2MK4UYqeQ59tc21v9T2+VCiiv/1vjgEUr8tBhXS562jrDt/bNEqKa
      # eRzT4h4ffbP6BJRnyakxkFkQ7nd2OdlLNKUAr9Tk6T2fYuarfEdbYx//0950agqD
      # AAtdQ/AJm6Pq1Px0/RuMKK5WsL818BoAkfr6n7qXleunytJ1W5hjW9EmFIPZWPTR
      # ce/lSFHA0+MCpg6C8zAa4iNBg/Pk0p3GRrTeWyHK1FjV+Gep1QtE/a1vk/qiPzTM
      # qZVfPxa8cXXe
      # =caiq
      # -----END PGP SIGNATURE-----
      # gpg: Signature made Tue 25 Oct 2022 03:53:08 EDT
      # gpg:                using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C
      # gpg:                issuer "laurent@vivier.eu"
      # gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full]
      # gpg:                 aka "Laurent Vivier <laurent@vivier.eu>" [full]
      # gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full]
      # Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C
      
      * tag 'trivial-branch-for-7.2-pull-request' of https://gitlab.com/laurent_vivier/qemu:
        accel/tcg/tcg-accel-ops-rr: fix trivial typo
        ui: remove useless typecasts
        treewide: Remove the unnecessary space before semicolon
        include/hw/scsi/scsi.h: Remove unused scsi_legacy_handle_cmdline() prototype
        vmstate-static-checker:remove this redundant return
        tests/qtest: vhost-user-test: Fix [-Werror=format-overflow=] build warning
        tests/qtest: migration-test: Fix [-Werror=format-overflow=] build warning
        Drop useless casts from g_malloc() & friends to pointer
        elf2dmp: free memory in failure
        hw/core: Tidy up unnecessary casting away of const
        .gitignore: add multiple items to .gitignore
      
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • Merge tag 'linux-user-for-7.2-pull-request' of https://gitlab.com/laurent_vivier/qemu into staging · d3553079
      Stefan Hajnoczi authored
      linux-user pull request 20221025
      
      Add faccessat2()
      Fix ioctl(), execve(), pidfd_send_signal() and MIPS n32 syscall ABI
      Improve EXCP_DUMP()
      
      # -----BEGIN PGP SIGNATURE-----
      #
      # iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAmNXkawSHGxhdXJlbnRA
      # dml2aWVyLmV1AAoJEPMMOL0/L748uvUQAJ3Br5Yh+0uuT0524DvVHjvE/bYys43A
      # JRilXtYsTrmGhatiF5vaaOmhRbsQ8Ljq8l/R4D7b7cLmRUJ7Q0pbZM5k3PRAEYOa
      # rMdTY8aSNhlKPvioOhLE5Ha4eua17YGQfP1LJW4jvEGqrhNV2qhUPPFbN3WlZKyt
      # 6T4N8y3FWWVD3C/qGpmHic3xK9CZW5hUIT3rL2BLxNx23rjCVViFhU4uFz7/43d1
      # Rf3pKLWbNOsUB4P0g56otlviPrNRwGoKEr2MGAGr2pz6ZHvSPUCD0PnJvOZ/0iHa
      # jpswpStPYYpmEXHOjwTT6ua1Roe0EaNJfcI5FoUDBYjCMyoyQ+4XoPfMvm/SqPKr
      # TbK/cEBEUUej7anUX6faNaofh3mDz5BMF+/r7scCqHKem2+/ZnoBFdx8f/meKwYB
      # Te29eC8/y4eFGlI6RsE7dcvwH+wz/z0aVCdX4luxzX0pjWp7ZhIs9ljLjEbdelUO
      # D6+nWACUF1HnTLIGSGWY4oihF4ST/NaZ0u+NLHqE5WoS3vq4xgas9agqkr6f5HnM
      # 1hdjcDFOJs6Xjac+IM6bi3MX0vAeGrBWK1YA/3vQRaF91uOfwBRhNjHSXwI+dWwM
      # LL6pLjiDIIrEXY3QbO/TZFfFKRhooDVSOopiRvPkZVHeugbsYdKVwZ8geTyvGlmn
      # vsxDnihSUWot
      # =o10I
      # -----END PGP SIGNATURE-----
      # gpg: Signature made Tue 25 Oct 2022 03:35:08 EDT
      # gpg:                using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C
      # gpg:                issuer "laurent@vivier.eu"
      # gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full]
      # gpg:                 aka "Laurent Vivier <laurent@vivier.eu>" [full]
      # gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full]
      # Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C
      
      * tag 'linux-user-for-7.2-pull-request' of https://gitlab.com/laurent_vivier/qemu:
        linux-user: Add guest memory layout to exception dump
        linux-user: Implement faccessat2
        linux-user: remove conditionals for many fs.h ioctls
        linux-user: add more compat ioctl definitions
        linux-user: don't use AT_EXECFD in do_openat()
        linux-user: handle /proc/self/exe with execve() syscall
        linux-user: fix pidfd_send_signal()
        linux-user: Fix more MIPS n32 syscall ABI issues
      
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • linux-user: Add guest memory layout to exception dump · bd5ccd61
      Helge Deller authored
      
      When the emulation stops with a hard exception it's very useful for
      debugging purposes to dump the current guest memory layout (for an
      example see /proc/self/maps) alongside the CPU registers.
      
      The open_self_maps() function provides such a memory dump, but since
      it's located in the syscall.c file, various changes (add #includes, make
      this function externally visible, ...) are needed to be able to call it
      from the existing EXCP_DUMP() macro.
      
      This patch takes another approach by re-defining EXCP_DUMP() to call
      target_exception_dump(), which is in syscall.c, consolidates the log
      print functions, and allows adding the call to dump the memory layout.
      
      Besides a reduced code footprint, this approach keeps the changes across
      the various callers minimal, and keeps EXCP_DUMP() highlighted as an
      important macro/function.
      
      Signed-off-by: Helge Deller <deller@gmx.de>
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Message-Id: <Y1bzAWbw07WBKPxw@p100>
      [lv: remove pc declaration and setting]
      Signed-off-by: Laurent Vivier <laurent@vivier.eu>
  3. Oct 24, 2022