Skip to content
Snippets Groups Projects
  1. Aug 30, 2023
    • Stefan Hajnoczi's avatar
      aio-posix: zero out io_uring sqe user_data · 87ec6f55
      Stefan Hajnoczi authored
      
      liburing does not clear sqe->user_data. We must do it ourselves to avoid
      undefined behavior in process_cqe() when user_data is used.
      
      Note that fdmon-io_uring is currently disabled, so this is a latent bug
      that does not affect users. Let's merge this fix now to make it easier
      to enable fdmon-io_uring in the future (and I'm working on that).
      
      Signed-off-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      Message-ID: <20230426212639.82310-1-stefanha@redhat.com>
      87ec6f55
  2. Aug 09, 2023
  3. Aug 08, 2023
  4. Aug 03, 2023
  5. Aug 01, 2023
    • Anthony PERARD's avatar
      thread-pool: signal "request_cond" while locked · f4f71363
      Anthony PERARD authored
      thread_pool_free() might have been called on the `pool`, which would
      be a reason for worker_thread() to quit. In this case,
      `pool->request_cond` is been destroyed.
      
      If worker_thread() didn't managed to signal `request_cond` before it
      been destroyed by thread_pool_free(), we got:
          util/qemu-thread-posix.c:198: qemu_cond_signal: Assertion `cond->initialized' failed.
      
      One backtrace:
          __GI___assert_fail (assertion=0x55555614abcb "cond->initialized", file=0x55555614ab88 "util/qemu-thread-posix.c", line=198,
      	function=0x55555614ad80 <__PRETTY_FUNCTION__.17104> "qemu_cond_signal") at assert.c:101
          qemu_cond_signal (cond=0x7fffb800db30) at util/qemu-thread-posix.c:198
          worker_thread (opaque=0x7fffb800dab0) at util/thread-pool.c:129
          qemu_thread_start (args=0x7fffb8000b20) at util/qemu-thread-posix.c:505
          start_thread (arg=<optimized out>) at pthread_create.c:486
      
      Reported here:
          https://lore.kernel.org/all/ZJwoK50FcnTSfFZ8@MacBook-Air-de-Roger.local/T/#u
      
      
      
      To avoid issue, keep lock while sending a signal to `request_cond`.
      
      Fixes: 900fa208 ("thread-pool: replace semaphore with condition variable")
      Signed-off-by: default avatarAnthony PERARD <anthony.perard@citrix.com>
      Reviewed-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      Message-Id: <20230714152720.5077-1-anthony.perard@citrix.com>
      Signed-off-by: default avatarAnthony PERARD <anthony.perard@citrix.com>
      f4f71363
  6. Jul 31, 2023
  7. Jul 10, 2023
  8. Jul 08, 2023
  9. Jun 27, 2023
    • Marc-André Lureau's avatar
      console/win32: allocate shareable display surface · 09b4c198
      Marc-André Lureau authored
      
      Introduce qemu_win32_map_alloc() and qemu_win32_map_free() to allocate
      shared memory mapping. The handle can be used to share the mapping with
      another process.
      
      Teach qemu_create_displaysurface() to allocate shared memory. Following
      patches will introduce other places for shared memory allocation.
      
      Other patches for -display dbus will share the memory when possible with
      the client, to avoid expensive memory copy between the processes.
      
      Signed-off-by: default avatarMarc-André Lureau <marcandre.lureau@redhat.com>
      Message-Id: <20230606115658.677673-10-marcandre.lureau@redhat.com>
      09b4c198
  10. Jun 13, 2023
  11. Jun 06, 2023
    • Paolo Bonzini's avatar
      atomics: eliminate mb_read/mb_set · 06831001
      Paolo Bonzini authored
      
      qatomic_mb_read and qatomic_mb_set were the very first atomic primitives
      introduced for QEMU; their semantics are unclear and they provide a false
      sense of safety.
      
      The last use of qatomic_mb_read() has been removed, so delete it.
      qatomic_mb_set() instead can survive as an optimized
      qatomic_set()+smp_mb(), similar to Linux's smp_store_mb(), but
      rename it to qatomic_set_mb() to match the order of the two
      operations.
      
      Reviewed-by: default avatarRichard Henderson <richard.henderson@linaro.org>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      06831001
  12. Jun 05, 2023
    • Hanna Czenczek's avatar
      util/iov: Remove qemu_iovec_init_extended() · cc63f6f6
      Hanna Czenczek authored
      
      bdrv_pad_request() was the main user of qemu_iovec_init_extended().
      HEAD^ has removed that use, so we can remove qemu_iovec_init_extended()
      now.
      
      The only remaining user is qemu_iovec_init_slice(), which can easily
      inline the small part it really needs.
      
      Note that qemu_iovec_init_extended() offered a memcpy() optimization to
      initialize the new I/O vector.  qemu_iovec_concat_iov(), which is used
      to replace its functionality, does not, but calls qemu_iovec_add() for
      every single element.  If we decide this optimization was important, we
      will need to re-implement it in qemu_iovec_concat_iov(), which might
      also benefit its pre-existing users.
      
      Reviewed-by: default avatarEric Blake <eblake@redhat.com>
      Reviewed-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
      Signed-off-by: default avatarHanna Czenczek <hreitz@redhat.com>
      Message-Id: <20230411173418.19549-4-hreitz@redhat.com>
      cc63f6f6
    • Hanna Czenczek's avatar
      util/iov: Make qiov_slice() public · 3d06cea8
      Hanna Czenczek authored
      
      We want to inline qemu_iovec_init_extended() in block/io.c for padding
      requests, and having access to qiov_slice() is useful for this.  As a
      public function, it is renamed to qemu_iovec_slice().
      
      (We will need to count the number of I/O vector elements of a slice
      there, and then later process this slice.  Without qiov_slice(), we
      would need to call qemu_iovec_subvec_niov(), and all further
      IOV-processing functions may need to skip prefixing elements to
      accomodate for a qiov_offset.  Because qemu_iovec_subvec_niov()
      internally calls qiov_slice(), we can just have the block/io.c code call
      qiov_slice() itself, thus get the number of elements, and also create an
      iovec array with the superfluous prefixing elements stripped, so the
      following processing functions no longer need to skip them.)
      
      Reviewed-by: default avatarEric Blake <eblake@redhat.com>
      Reviewed-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
      Signed-off-by: default avatarHanna Czenczek <hreitz@redhat.com>
      Message-Id: <20230411173418.19549-2-hreitz@redhat.com>
      3d06cea8
  13. Jun 02, 2023
    • Eric Blake's avatar
      cutils: Improve qemu_strtosz handling of fractions · 42cc08d1
      Eric Blake authored
      We have several limitations and bugs worth fixing; they are
      inter-related enough that it is not worth splitting this patch into
      smaller pieces:
      
      * ".5k" should work to specify 512, just as "0.5k" does
      * "1.9999k" and "1." + "9"*50 + "k" should both produce the same
        result of 2048 after rounding
      * "1." + "0"*350 + "1B" should not be treated the same as "1.0B";
        underflow in the fraction should not be lost
      * "7.99e99" and "7.99e999" look similar, but our code was doing a
        read-out-of-bounds on the latter because it was not expecting ERANGE
        due to overflow. While we document that scientific notation is not
        supported, and the previous patch actually fixed
        qemu_strtod_finite() to no longer return ERANGE overflows, it is
        easier to pre-filter than to try and determine after the fact if
        strtod() consumed more than we wanted.  Note that this is a
        low-level semantic change (when endptr is not NULL, we can now
        successfully parse with a scale of 'E' and then report trailing
        junk, instead of failing outright with EINVAL); but an earlier
        commit already argued that this is not a high-level semantic change
        since the only caller passing in a non-NULL endptr also checks that
        the tail is whitespace-only.
      
      Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1629
      
      
      Fixes: cf923b78 ("utils: Improve qemu_strtosz() to have 64 bits of precision", 6.0.0)
      Fixes: 7625a1ed ("utils: Use fixed-point arithmetic in qemu_strtosz", 6.0.0)
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Reviewed-by: default avatarHanna Czenczek <hreitz@redhat.com>
      Message-Id: <20230522190441.64278-20-eblake@redhat.com>
      [eblake: tweak function comment for accuracy]
      42cc08d1
    • Eric Blake's avatar
      cutils: Improve qemu_strtod* error paths · c25b1683
      Eric Blake authored
      Previous patches changed all integral qemu_strto*() error paths to
      guarantee that *value is never left uninitialized.  Do likewise for
      qemu_strtod.  Also, tighten qemu_strtod_finite() to never return a
      non-finite value (prior to this patch, we were rejecting "inf" with
      -EINVAL and unspecified result 0.0, but failing "9e999" with -ERANGE
      and HUGE_VAL - which is infinite on IEEE machines - despite our
      function claiming to recognize only finite values).
      
      Auditing callers, we have no external callers of qemu_strtod, and
      among the callers of qemu_strtod_finite:
      
      - qapi/qobject-input-visitor.c:qobject_input_type_number_keyval() and
        qapi/string-input-visitor.c:parse_type_number() which reject all
        errors (does not matter what we store)
      
      - utils/cutils.c:do_strtosz() incorrectly assumes that *endptr points
        to '.' on all failures (that is, it is not distinguishing between
        EINVAL and ERANGE; and therefore still does the WRONG THING for
        "9.9e999".  The change here does not entirely fix that (a later
        patch will tackle this more systematically), but at least it fixes
        the read-out-of-bounds first diagnosed in
        https://gitlab.com/qemu-project/qemu/-/issues/1629
      
      
      
      - our testsuite, which we can update to match what we document
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Reviewed-by: default avatarHanna Czenczek <hreitz@redhat.com>
      CC: qemu-stable@nongnu.org
      Message-Id: <20230522190441.64278-19-eblake@redhat.com>
      c25b1683
    • Eric Blake's avatar
      cutils: Use parse_uint in qemu_strtosz for negative rejection · b87ac966
      Eric Blake authored
      
      Rather than open-coding two different ways to check for an unwanted
      negative sign, reuse the same code in both functions.  That way, if we
      decide down the road to accept "-0" instead of rejecting it, we have
      fewer places to change.  Also, it means we now get ERANGE instead of
      EINVAL for negative values in qemu_strtosz, which is reasonable for
      what it represents.  This in turn changes the expected output of a
      couple of iotests.
      
      The change is not quite complete: negative fractional scaled values
      can trip us up.  This will be fixed in a later patch addressing other
      issues with fractional scaled values.
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Reviewed-by: default avatarHanna Czenczek <hreitz@redhat.com>
      Message-Id: <20230522190441.64278-18-eblake@redhat.com>
      b87ac966
    • Eric Blake's avatar
      cutils: Set value in all integral qemu_strto* error paths · 3c5f2467
      Eric Blake authored
      
      Our goal in writing qemu_strtoi() and friends is to have an interface
      harder to abuse than libc's strtol().  Leaving the return value
      uninitialized on some but not all error paths does not lend itself
      well to this goal; and our documentation wasn't helpful on what to
      expect.
      
      Note that the previous patch changed all qemu_strtosz() EINVAL error
      paths to slam value to 0 rather than stay uninitialized, even when the
      EINVAL eror occurs because of trailing junk.  But for the remaining
      integral qemu_strto*, it's easier to return the parsed value than to
      force things back to zero, in part because of how check_strtox_error
      works; in part because people expect that from libc strto* (while
      there is no libc strtosz to compare to), and in part because doing so
      creates less churn in the testsuite.
      
      Here, the list of affected callers is much longer ('git grep
      "qemu_strto[ui]" "*.c" "**/*.c" | grep -v tests/ |wc -l' outputs 107,
      although a few of those are the implementation in in cutils.c), so
      touching as little as possible is the wisest course of action.
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Reviewed-by: default avatarHanna Czenczek <hreitz@redhat.com>
      Message-Id: <20230522190441.64278-17-eblake@redhat.com>
      3c5f2467
    • Eric Blake's avatar
      cutils: Set value in all qemu_strtosz* error paths · 896fbd90
      Eric Blake authored
      
      Making callers determine whether or not *value was populated on error
      is not nice for usability.  Pre-patch, we have unit tests that check
      that *result is left unchanged on most EINVAL errors and set to 0 on
      many ERANGE errors.  This is subtly different from libc strtoumax()
      behavior which returns UINT64_MAX on ERANGE errors, as well as
      different from our parse_uint() which slams to 0 on EINVAL on the
      grounds that we want our functions to be harder to mis-use than
      strtoumax().
      
      Let's audit callers:
      
      - hw/core/numa.c:parse_numa() fixed in the previous patch to check for
        errors
      
      - migration/migration-hmp-cmds.c:hmp_migrate_set_parameter(),
        monitor/hmp.c:monitor_parse_arguments(),
        qapi/opts-visitor.c:opts_type_size(),
        qapi/qobject-input-visitor.c:qobject_input_type_size_keyval(),
        qemu-img.c:cvtnum_full(), qemu-io-cmds.c:cvtnum(),
        target/i386/cpu.c:x86_cpu_parse_featurestr(), and
        util/qemu-option.c:parse_option_size() appear to reject all failures
        (although some with distinct messages for ERANGE as opposed to
        EINVAL), so it doesn't matter what is in the value parameter on
        error.
      
      - All remaining callers are in the testsuite, where we can tweak our
        expectations to match our new desired behavior.
      
      Advancing to the end of the string parsed on overflow (ERANGE), while
      still returning 0, makes sense (UINT64_MAX as a size is unlikely to be
      useful); likewise, our size parsing code is complex enough that it's
      easier to always return 0 when endptr is NULL but trailing garbage was
      found, rather than trying to return the value of the prefix actually
      parsed (no current caller cared about the value of the prefix).
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Reviewed-by: default avatarHanna Czenczek <hreitz@redhat.com>
      Message-Id: <20230522190441.64278-16-eblake@redhat.com>
      896fbd90
    • Eric Blake's avatar
      cutils: Allow NULL str in qemu_strtosz · f49371ec
      Eric Blake authored
      
      All the other qemu_strto* and parse_uint allow a NULL str.  Having
      qemu_strtosz not crash on qemu_strtosz(NULL, NULL, &value) is an easy
      fix that adds some consistency between our string parsers.
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Reviewed-by: default avatarPhilippe Mathieu-Daudé <philmd@linaro.org>
      Reviewed-by: default avatarHanna Czenczek <hreitz@redhat.com>
      Message-Id: <20230522190441.64278-13-eblake@redhat.com>
      f49371ec
    • Eric Blake's avatar
      cutils: Allow NULL endptr in parse_uint() · 52d606aa
      Eric Blake authored
      
      All the qemu_strto*() functions permit a NULL endptr, just like their
      libc counterparts, leaving parse_uint() as the oddball that caused
      SEGFAULT on NULL and required the user to call parse_uint_full()
      instead.  Relax things for consistency, even though the testsuite is
      the only impacted caller.  Add one more unit test to ensure even
      parse_uint_full(NULL, 0, &value) works.  This also fixes our code to
      uniformly favor EINVAL over ERANGE when both apply.
      
      Also fixes a doc mismatch @v vs. a parameter named value.
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Reviewed-by: default avatarHanna Czenczek <hreitz@redhat.com>
      Message-Id: <20230522190441.64278-9-eblake@redhat.com>
      52d606aa
    • Eric Blake's avatar
      cutils: Adjust signature of parse_uint[_full] · bd1386cc
      Eric Blake authored
      
      It's already confusing that we have two very similar functions for
      wrapping the parse of a 64-bit unsigned value, differing mainly on
      whether they permit leading '-'.  Adjust the signature of parse_uint()
      and parse_uint_full() to be like all of qemu_strto*(): put the result
      parameter last, use the same types (uint64_t and unsigned long long
      have the same width, but are not always the same type), and mark
      endptr const (this latter change only affects the rare caller of
      parse_uint).  Adjust all callers in the tree.
      
      While at it, note that since cutils.c already includes:
      
          QEMU_BUILD_BUG_ON(sizeof(int64_t) != sizeof(long long));
      
      we are guaranteed that the result of parse_uint* cannot exceed
      UINT64_MAX (or the build would have failed), so we can drop
      pre-existing dead comparisons in opts-visitor.c that were never false.
      
      Reviewed-by: default avatarHanna Czenczek <hreitz@redhat.com>
      Message-Id: <20230522190441.64278-8-eblake@redhat.com>
      [eblake: Drop dead code spotted by Markus]
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      bd1386cc
    • Eric Blake's avatar
      cutils: Document differences between parse_uint and qemu_strtou64 · 84760bbc
      Eric Blake authored
      
      These two functions are subtly different, and not just because of
      swapped parameter order.  It took me adding better unit tests to
      figure out why.  Document the differences to make it more obvious to
      developers trying to pick which one to use, as well as to aid in
      upcoming semantic changes.
      
      While touching the documentation, adjust a mis-statement: parse_uint
      does not return -EINVAL on invalid base, but assert()s, like all the
      other qemu_strto* functions that take a base argument.
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Reviewed-by: default avatarHanna Czenczek <hreitz@redhat.com>
      Message-Id: <20230522190441.64278-7-eblake@redhat.com>
      84760bbc
    • Eric Blake's avatar
      cutils: Fix wraparound parsing in qemu_strtoui · 56ddafde
      Eric Blake authored
      
      While we were matching 32-bit strtol in qemu_strtoi, our use of a
      64-bit parse was leaking through for some inaccurate answers in
      qemu_strtoui in comparison to a 32-bit strtoul (see the unit test for
      examples).  The comment for that function even described what we have
      to do for a correct parse, but didn't implement it correctly: since
      strtoull checks for overflow against the wrong values and then
      negates, we have to temporarily undo negation before checking for
      overflow against our desired value.
      
      Our int wrappers would be a lot easier to write if libc had a
      guaranteed 32-bit parser even on platforms with 64-bit long.
      
      Whether we parse C2x binary strings like "0b1000" is currently up to
      what libc does; our unit tests intentionally don't cover that at the
      moment, though.
      
      Fixes: 473a2a33 ("cutils: add qemu_strtoi & qemu_strtoui parsers for int/unsigned int types", v2.12.0)
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      CC: qemu-stable@nongnu.org
      Message-Id: <20230522190441.64278-6-eblake@redhat.com>
      Reviewed-by: default avatarHanna Czenczek <hreitz@redhat.com>
      56ddafde
  14. Jun 01, 2023
    • Alex Bennée's avatar
      accel/tcg: include cs_base in our hash calculations · 367189ef
      Alex Bennée authored
      
      We weren't using cs_base in the hash calculations before. Since the
      arm front end moved a chunk of flags in a378206a (target/arm: Move
      mode specific TB flags to tb->cs_base) they comprise of an important
      part of the execution state.
      
      Widen the tb_hash_func to include cs_base and expand to qemu_xxhash8()
      to accommodate it.
      
      My initial benchmark shows very little difference in the
      runtime.
      
      Before:
      
      armhf
      
      ➜  hyperfine -w 2 -m 20 "./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot"
      Benchmark 1: ./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot
        Time (mean ± σ):     24.627 s ±  2.708 s    [User: 34.309 s, System: 1.797 s]
        Range (min … max):   22.345 s … 29.864 s    20 runs
      
      arm64
      
      ➜  hyperfine -w 2 -n 20 "./qemu-system-aarch64 -cpu max,pauth-impdef=on -machine type=virt,virtualization=on,gic-version=3 -display none -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-arm64 -device scsi-hd,drive=hd -smp 4 -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image.gz -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark-pigz.service' -snapshot"
      Benchmark 1: 20
        Time (mean ± σ):     62.559 s ±  2.917 s    [User: 189.115 s, System: 4.089 s]
        Range (min … max):   59.997 s … 70.153 s    10 runs
      
      After:
      
      armhf
      
      Benchmark 1: ./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot
        Time (mean ± σ):     24.223 s ±  2.151 s    [User: 34.284 s, System: 1.906 s]
        Range (min … max):   22.000 s … 28.476 s    20 runs
      
      arm64
      
      hyperfine -w 2 -n 20 "./qemu-system-aarch64 -cpu max,pauth-impdef=on -machine type=virt,virtualization=on,gic-version=3 -display none -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-arm64 -device scsi-hd,drive=hd -smp 4 -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image.gz -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark-pigz.service' -snapshot"
      Benchmark 1: 20
        Time (mean ± σ):     62.769 s ±  1.978 s    [User: 188.431 s, System: 5.269 s]
        Range (min … max):   60.285 s … 66.868 s    10 runs
      
      Signed-off-by: default avatarAlex Bennée <alex.bennee@linaro.org>
      Reviewed-by: default avatarRichard Henderson <richard.henderson@linaro.org>
      Message-id: 20230526165401.574474-12-alex.bennee@linaro.org
      Message-Id: <20230524133952.3971948-11-alex.bennee@linaro.org>
      Signed-off-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      367189ef
  15. May 30, 2023
    • Stefan Hajnoczi's avatar
      aio: remove aio_disable_external() API · 60f782b6
      Stefan Hajnoczi authored
      All callers now pass is_external=false to aio_set_fd_handler() and
      aio_set_event_notifier(). The aio_disable_external() API that
      temporarily disables fd handlers that were registered is_external=true
      is therefore dead code.
      
      Remove aio_disable_external(), aio_enable_external(), and the
      is_external arguments to aio_set_fd_handler() and
      aio_set_event_notifier().
      
      The entire test-fdmon-epoll test is removed because its sole purpose was
      testing aio_disable_external().
      
      Parts of this patch were generated using the following coccinelle
      (https://coccinelle.lip6.fr/
      
      ) semantic patch:
      
        @@
        expression ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque;
        @@
        - aio_set_fd_handler(ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque)
        + aio_set_fd_handler(ctx, fd, io_read, io_write, io_poll, io_poll_ready, opaque)
      
        @@
        expression ctx, notifier, is_external, io_read, io_poll, io_poll_ready;
        @@
        - aio_set_event_notifier(ctx, notifier, is_external, io_read, io_poll, io_poll_ready)
        + aio_set_event_notifier(ctx, notifier, io_read, io_poll, io_poll_ready)
      
      Reviewed-by: default avatarJuan Quintela <quintela@redhat.com>
      Reviewed-by: default avatarPhilippe Mathieu-Daudé <philmd@linaro.org>
      Signed-off-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      Message-Id: <20230516190238.8401-21-stefanha@redhat.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      60f782b6
    • Stefan Hajnoczi's avatar
      block/export: stop using is_external in vhost-user-blk server · e1054cd4
      Stefan Hajnoczi authored
      
      vhost-user activity must be suspended during bdrv_drained_begin/end().
      This prevents new requests from interfering with whatever is happening
      in the drained section.
      
      Previously this was done using aio_set_fd_handler()'s is_external
      argument. In a multi-queue block layer world the aio_disable_external()
      API cannot be used since multiple AioContext may be processing I/O, not
      just one.
      
      Switch to BlockDevOps->drained_begin/end() callbacks.
      
      Signed-off-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: default avatarKevin Wolf <kwolf@redhat.com>
      Message-Id: <20230516190238.8401-8-stefanha@redhat.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      e1054cd4
    • Stefan Hajnoczi's avatar
      block/export: wait for vhost-user-blk requests when draining · 8f5e9a8e
      Stefan Hajnoczi authored
      
      Each vhost-user-blk request runs in a coroutine. When the BlockBackend
      enters a drained section we need to enter a quiescent state. Currently
      any in-flight requests race with bdrv_drained_begin() because it is
      unaware of vhost-user-blk requests.
      
      When blk_co_preadv/pwritev()/etc returns it wakes the
      bdrv_drained_begin() thread but vhost-user-blk request processing has
      not yet finished. The request coroutine continues executing while the
      main loop thread thinks it is in a drained section.
      
      One example where this is unsafe is for blk_set_aio_context() where
      bdrv_drained_begin() is called before .aio_context_detached() and
      .aio_context_attach(). If request coroutines are still running after
      bdrv_drained_begin(), then the AioContext could change underneath them
      and they race with new requests processed in the new AioContext. This
      could lead to virtqueue corruption, for example.
      
      (This example is theoretical, I came across this while reading the
      code and have not tried to reproduce it.)
      
      It's easy to make bdrv_drained_begin() wait for in-flight requests: add
      a .drained_poll() callback that checks the VuServer's in-flight counter.
      VuServer just needs an API that returns true when there are requests in
      flight. The in-flight counter needs to be atomic.
      
      Signed-off-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: default avatarKevin Wolf <kwolf@redhat.com>
      Message-Id: <20230516190238.8401-7-stefanha@redhat.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      8f5e9a8e
    • Stefan Hajnoczi's avatar
      util/vhost-user-server: rename refcount to in_flight counter · 75d33e85
      Stefan Hajnoczi authored
      
      The VuServer object has a refcount field and ref/unref APIs. The name is
      confusing because it's actually an in-flight request counter instead of
      a refcount.
      
      Normally a refcount destroys the object upon reaching zero. The VuServer
      counter is used to wake up the vhost-user coroutine when there are no
      more requests.
      
      Avoid confusing by renaming refcount and ref/unref to in_flight and
      inc/dec.
      
      Reviewed-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: default avatarPhilippe Mathieu-Daudé <philmd@linaro.org>
      Signed-off-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: default avatarKevin Wolf <kwolf@redhat.com>
      Message-Id: <20230516190238.8401-6-stefanha@redhat.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      75d33e85
  16. May 28, 2023
    • Marc-André Lureau's avatar
      win32: wrap socket close() with an exception handler · d89f30b4
      Marc-André Lureau authored
      
      Since commit abe34282 ("win32: avoid mixing SOCKET and file descriptor
      space"), we set HANDLE_FLAG_PROTECT_FROM_CLOSE on the socket FD, to
      prevent closing the HANDLE with CloseHandle. This raises an exception
      which under gdb is fatal, and qemu exits.
      
      Let's catch the expected error instead.
      
      Note: this appears to work, but the mingw64 macro is not well documented
      or tested, and it's not obvious how it is meant to be used.
      
      Signed-off-by: default avatarMarc-André Lureau <marcandre.lureau@redhat.com>
      Message-Id: <20230515132440.1025315-1-marcandre.lureau@redhat.com>
      d89f30b4
  17. May 24, 2023
    • Akihiko Odaki's avatar
      util/vfio-helpers: Use g_file_read_link() · dbdea0db
      Akihiko Odaki authored
      
      When _FORTIFY_SOURCE=2, glibc version is 2.35, and GCC version is
      12.1.0, the compiler complains as follows:
      
      In file included from /usr/include/features.h:490,
                       from /usr/include/bits/libc-header-start.h:33,
                       from /usr/include/stdint.h:26,
                       from /usr/lib/gcc/aarch64-unknown-linux-gnu/12.1.0/include/stdint.h:9,
                       from /home/alarm/q/var/qemu/include/qemu/osdep.h:94,
                       from ../util/vfio-helpers.c:13:
      In function 'readlink',
          inlined from 'sysfs_find_group_file' at ../util/vfio-helpers.c:116:9,
          inlined from 'qemu_vfio_init_pci' at ../util/vfio-helpers.c:326:18,
          inlined from 'qemu_vfio_open_pci' at ../util/vfio-helpers.c:517:9:
      /usr/include/bits/unistd.h:119:10: error: argument 2 is null but the corresponding size argument 3 value is 4095 [-Werror=nonnull]
        119 |   return __glibc_fortify (readlink, __len, sizeof (char),
            |          ^~~~~~~~~~~~~~~
      
      This error implies the allocated buffer can be NULL. Use
      g_file_read_link(), which allocates buffer automatically to avoid the
      error.
      
      Signed-off-by: default avatarAkihiko Odaki <akihiko.odaki@daynix.com>
      Reviewed-by: default avatarPhilippe Mathieu-Daudé <philmd@linaro.org>
      Reviewed-by: default avatarCédric Le Goater <clg@redhat.com>
      Signed-off-by: default avatarCédric Le Goater <clg@redhat.com>
      dbdea0db
  18. May 23, 2023
Loading