Skip to content
Snippets Groups Projects
  1. Mar 08, 2022
    • Eric Blake's avatar
      qemu-io: Allow larger write zeroes under no fallback · 395aecd0
      Eric Blake authored
      
      When writing zeroes can fall back to a slow write, permitting an
      overly large request can become an amplification denial of service
      attack in triggering a large amount of work from a small request.  But
      the whole point of the no fallback flag is to quickly determine if
      writing an entire device to zero can be done quickly (such as when it
      is already known that the device started with zero contents); in those
      cases, artificially capping things at 2G in qemu-io itself doesn't
      help us.
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Message-Id: <20211203231539.3900865-4-eblake@redhat.com>
      Reviewed-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      395aecd0
    • Eric Blake's avatar
      qemu-io: Utilize 64-bit status during map · 087f2fb3
      Eric Blake authored
      The block layer has supported 64-bit block status from drivers since
      commit 86a3d5c6 ("block: Add .bdrv_co_block_status() callback",
      v2.12) and friends, with individual driver callbacks responsible for
      capping things where necessary.  Artificially capping things below 2G
      in the qemu-io 'map' command, added in commit d6a644bb ("block: Make
      bdrv_is_allocated() byte-based", v2.10) is thus no longer necessary.
      
      One way to test this is with qemu-nbd as server on a raw file larger
      than 4G (the entire file should show as allocated), plus 'qemu-io -f
      raw -c map nbd://localhost
      
       --trace=nbd_\*' as client.  Prior to this
      patch, the NBD_CMD_BLOCK_STATUS requests are fragmented at 0x7ffffe00
      distances; with this patch, the fragmenting changes to 0x7fffffff
      (since the NBD protocol is currently still limited to 32-bit
      transactions - see block/nbd.c:nbd_client_co_block_status).  Then in
      later patches, once I add an NBD extension for a 64-bit block status,
      the same map command completes with just one NBD_CMD_BLOCK_STATUS.
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Message-Id: <20211203231539.3900865-3-eblake@redhat.com>
      Reviewed-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      087f2fb3
  2. Mar 07, 2022
  3. Jul 09, 2021
  4. Jun 02, 2021
  5. May 14, 2021
    • Vladimir Sementsov-Ogievskiy's avatar
      monitor: hmp_qemu_io: acquire aio contex, fix crash · 78632a3d
      Vladimir Sementsov-Ogievskiy authored
      
      Max reported the following bug:
      
      $ ./qemu-img create -f raw src.img 1G
      $ ./qemu-img create -f raw dst.img 1G
      
      $ (echo '
         {"execute":"qmp_capabilities"}
         {"execute":"blockdev-mirror",
          "arguments":{"job-id":"mirror",
                       "device":"source",
                       "target":"target",
                       "sync":"full",
                       "filter-node-name":"mirror-top"}}
      '; sleep 3; echo '
         {"execute":"human-monitor-command",
          "arguments":{"command-line":
                       "qemu-io mirror-top \"write 0 1G\""}}') \
      | x86_64-softmmu/qemu-system-x86_64 \
         -qmp stdio \
         -blockdev file,node-name=source,filename=src.img \
         -blockdev file,node-name=target,filename=dst.img \
         -object iothread,id=iothr0 \
         -device virtio-blk,drive=source,iothread=iothr0
      
      crashes:
      
      0  raise () at /usr/lib/libc.so.6
      1  abort () at /usr/lib/libc.so.6
      2  error_exit
         (err=<optimized out>,
         msg=msg@entry=0x55fbb1634790 <__func__.27> "qemu_mutex_unlock_impl")
         at ../util/qemu-thread-posix.c:37
      3  qemu_mutex_unlock_impl
         (mutex=mutex@entry=0x55fbb25ab6e0,
         file=file@entry=0x55fbb1636957 "../util/async.c",
         line=line@entry=650)
         at ../util/qemu-thread-posix.c:109
      4  aio_context_release (ctx=ctx@entry=0x55fbb25ab680) at ../util/async.c:650
      5  bdrv_do_drained_begin
         (bs=bs@entry=0x55fbb3a87000, recursive=recursive@entry=false,
         parent=parent@entry=0x0,
         ignore_bds_parents=ignore_bds_parents@entry=false,
         poll=poll@entry=true) at ../block/io.c:441
      6  bdrv_do_drained_begin
         (poll=true, ignore_bds_parents=false, parent=0x0, recursive=false,
         bs=0x55fbb3a87000) at ../block/io.c:448
      7  blk_drain (blk=0x55fbb26c5a00) at ../block/block-backend.c:1718
      8  blk_unref (blk=0x55fbb26c5a00) at ../block/block-backend.c:498
      9  blk_unref (blk=0x55fbb26c5a00) at ../block/block-backend.c:491
      10 hmp_qemu_io (mon=0x7fffaf3fc7d0, qdict=<optimized out>)
         at ../block/monitor/block-hmp-cmds.c:628
      
      man pthread_mutex_unlock
      ...
          EPERM  The  mutex type is PTHREAD_MUTEX_ERRORCHECK or
          PTHREAD_MUTEX_RECURSIVE, or the mutex is a robust mutex, and the
          current thread does not own the mutex.
      
      So, thread doesn't own the mutex. And we have iothread here.
      
      Next, note that AIO_WAIT_WHILE() documents that ctx must be acquired
      exactly once by caller. But where is it acquired in the call stack?
      Seems nowhere.
      
      qemuio_command do acquire aio context.. But we need context acquired
      around blk_unref() as well and actually around blk_insert_bs() too.
      
      Let's refactor qemuio_command so that it doesn't acquire aio context
      but callers do that instead. This way we can cleanly acquire aio
      context in hmp_qemu_io() around all three calls.
      
      Reported-by: default avatarMax Reitz <mreitz@redhat.com>
      Signed-off-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20210423134233.51495-1-vsementsov@virtuozzo.com>
      [mreitz: Fixed comment]
      Signed-off-by: default avatarMax Reitz <mreitz@redhat.com>
      78632a3d
  6. Dec 18, 2020
  7. Oct 02, 2020
    • Dr. David Alan Gilbert's avatar
      qemu-io-cmds: Simplify help_oneline · da16f4b8
      Dr. David Alan Gilbert authored
      help_oneline is declared and starts as:
      
        static void help_oneline(const char *cmd, const cmdinfo_t *ct)
        {
            if (cmd) {
                printf("%s ", cmd);
            } else {
                printf("%s ", ct->name);
                if (ct->altname) {
                    printf("(or %s) ", ct->altname);
                }
            }
      
      However, there are only two routes to help_oneline being called:
      
         help_f -> help_all -> help_oneline(ct->name, ct)
      
         help_f -> help_onecmd(argv[1], ct)
      
      In the first case, 'cmd' and 'ct->name' are the same thing,
      so it's impossible for the if (cmd) to be false and then validly
      print ct->name - this is upsetting gcc
      ( https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96739
      
       )
      
      In the second case, cmd is argv[1] and we know we've got argv[1]
      so again (cmd) is non-NULL.
      
      Simplify help_oneline by just printing cmd.
      (Also strengthen argc check just to be pedantic)
      
      Signed-off-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      Message-Id: <20200824102914.105619-1-dgilbert@redhat.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      da16f4b8
  8. Jul 28, 2020
  9. Apr 30, 2020
  10. Oct 28, 2019
  11. Sep 13, 2019
  12. Sep 03, 2019
  13. Jun 12, 2019
    • Alex Bennée's avatar
      qemu-io-cmds: use clock_gettime for benchmarking · 50290c00
      Alex Bennée authored
      
      The previous use of gettimeofday() ran into undefined behaviour when
      we ended up doing a div 0 for a very short operation. This is because
      gettimeofday only works at the microsecond level as well as being
      prone to discontinuous jumps in system time. Using clock_gettime with
      CLOCK_MONOTONIC gives greater precision and alleviates some of the
      potential problems with time jumping around.
      
      We could use CLOCK_MONOTONIC_RAW to avoid being tripped up by NTP and
      adjtime but that is Linux specific so I decided it would do for now.
      
      Signed-off-by: default avatarAlex Bennée <alex.bennee@linaro.org>
      50290c00
  14. May 20, 2019
  15. Apr 18, 2019
  16. Mar 26, 2019
  17. Mar 12, 2019
  18. Feb 11, 2019
  19. Jan 30, 2019
    • Richard W.M. Jones's avatar
      qemu-io: Add generic function for reinitializing optind. · d339d766
      Richard W.M. Jones authored
      
      On FreeBSD 11.2:
      
        $ nbdkit memory size=1M --run './qemu-io -f raw -c "aio_write 0 512" $nbd'
        Parsing error: non-numeric argument, or extraneous/unrecognized suffix -- aio_write
      
      After main option parsing, we reinitialize optind so we can parse each
      command.  However reinitializing optind to 0 does not work on FreeBSD.
      What happens when you do this is optind remains 0 after the option
      parsing loop, and the result is we try to parse argv[optind] ==
      argv[0] == "aio_write" as if it was the first parameter.
      
      The FreeBSD manual page says:
      
        In order to use getopt() to evaluate multiple sets of arguments, or to
        evaluate a single set of arguments multiple times, the variable optreset
        must be set to 1 before the second and each additional set of calls to
        getopt(), and the variable optind must be reinitialized.
      
      (From the rest of the man page it is clear that optind must be
      reinitialized to 1).
      
      The glibc man page says:
      
        A program that scans multiple argument vectors,  or  rescans  the  same
        vector  more than once, and wants to make use of GNU extensions such as
        '+' and '-' at  the  start  of  optstring,  or  changes  the  value  of
        POSIXLY_CORRECT  between scans, must reinitialize getopt() by resetting
        optind to 0, rather than the traditional value of 1.  (Resetting  to  0
        forces  the  invocation  of  an  internal  initialization  routine that
        rechecks POSIXLY_CORRECT and checks for GNU extensions in optstring.)
      
      This commit introduces an OS-portability function called
      qemu_reset_optind which provides a way of resetting optind that works
      on FreeBSD and platforms that use optreset, while keeping it the same
      as now on other platforms.
      
      Note that the qemu codebase sets optind in many other places, but in
      those other places it's setting a local variable and not using getopt.
      This change is only needed in places where we are using getopt and the
      associated global variable optind.
      
      Signed-off-by: default avatarRichard W.M. Jones <rjones@redhat.com>
      Message-id: 20190118101114.11759-2-rjones@redhat.com
      Reviewed-by: default avatarDaniel P. Berrangé <berrange@redhat.com>
      Reviewed-by: default avatarEric Blake <eblake@redhat.com>
      Signed-off-by: default avatarMax Reitz <mreitz@redhat.com>
      d339d766
  20. Dec 14, 2018
  21. Nov 05, 2018
  22. Oct 01, 2018
  23. Jun 11, 2018
  24. Feb 09, 2018
  25. Dec 22, 2017
  26. Oct 26, 2017
    • Eric Blake's avatar
      qemu-io: Relax 'alloc' now that block-status doesn't assert · f0a9c18f
      Eric Blake authored
      
      Previously, the alloc command required that input parameters be
      sector-aligned and clamped to 32 bits, because the underlying
      bdrv_is_allocated used a 32-bit parameter and asserted aligned
      inputs.  But now that we have fixed block status to report a
      64-bit bytes value, and to properly round requests on behalf of
      guests, we can pass any values, and can use qemu-io to add
      coverage that our rounding is correct regardless of the guest
      alignment constraints.
      
      Update iotest 177 to intentionally probe block status at
      unaligned boundaries as well as with a bytes value that does not
      map to 32-bit sectors, which also required tweaking the image
      prep to leave an unallocated portion to the image under test.
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      f0a9c18f
  27. Sep 26, 2017
    • Kevin Wolf's avatar
      qemu-io: Drop write permissions before read-only reopen · f3adefb2
      Kevin Wolf authored
      
      qemu-io provides a 'reopen' command that allows switching from writable
      to read-only access. We need to make sure that we don't try to keep
      write permissions to a BlockBackend that becomes read-only, otherwise
      things are going to fail.
      
      This requires a bdrv_drain() call because otherwise in-flight AIO
      write requests could issue new internal requests while the permission
      has already gone away, which would cause assertion failures. Draining
      the queue doesn't break AIO requests in any new way, bdrv_reopen() would
      drain it anyway only a few lines later.
      
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      Reviewed-by: default avatarFam Zheng <famz@redhat.com>
      f3adefb2
  28. Aug 08, 2017
  29. Jul 11, 2017
  30. Jul 10, 2017
    • Eric Blake's avatar
      block: Make bdrv_is_allocated() byte-based · d6a644bb
      Eric Blake authored
      
      We are gradually moving away from sector-based interfaces, towards
      byte-based.  In the common case, allocation is unlikely to ever use
      values that are not naturally sector-aligned, but it is possible
      that byte-based values will let us be more precise about allocation
      at the end of an unaligned file that can do byte-based access.
      
      Changing the signature of the function to use int64_t *pnum ensures
      that the compiler enforces that all callers are updated.  For now,
      the io.c layer still assert()s that all callers are sector-aligned
      on input and that *pnum is sector-aligned on return to the caller,
      but that can be relaxed when a later patch implements byte-based
      block status.  Therefore, this code adds usages like
      DIV_ROUND_UP(,BDRV_SECTOR_SIZE) to callers that still want aligned
      values, where the call might reasonbly give non-aligned results
      in the future; on the other hand, no rounding is needed for callers
      that should just continue to work with byte alignment.
      
      For the most part this patch is just the addition of scaling at the
      callers followed by inverse scaling at bdrv_is_allocated().  But
      some code, particularly bdrv_commit(), gets a lot simpler because it
      no longer has to mess with sectors; also, it is now possible to pass
      NULL if the caller does not care how much of the image is allocated
      beyond the initial offset.  Leave comments where we can further
      simplify once a later patch eliminates the need for sector-aligned
      requests through bdrv_is_allocated().
      
      For ease of review, bdrv_is_allocated_above() will be tackled
      separately.
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      d6a644bb
  31. Jun 26, 2017
  32. May 11, 2017
  33. Apr 28, 2017
Loading