  1. Oct 06, 2021
    • block: introduce max_hw_iov for use in scsi-generic · cc071629
      Paolo Bonzini authored
      
      Linux limits the size of iovecs to 1024 (UIO_MAXIOV in the kernel
      sources, IOV_MAX in POSIX).  Because of this, on some host adapters
      requests with many iovecs are rejected with -EINVAL by the
      io_submit() or readv()/writev() system calls.
      
      In fact, the same limit applies to SG_IO as well.  To fix both the
      EINVAL and the possible performance issues from using fewer iovecs
      than allowed by Linux (some HBAs have max_segments as low as 128),
      introduce a separate entry in BlockLimits to hold the max_segments
      value from sysfs.  This new limit is used only for SG_IO and clamped
      to bs->bl.max_iov anyway, just like max_hw_transfer is clamped to
      bs->bl.max_transfer.
      
      Reported-by: Halil Pasic <pasic@linux.ibm.com>
      Cc: Hanna Reitz <hreitz@redhat.com>
      Cc: Kevin Wolf <kwolf@redhat.com>
      Cc: qemu-block@nongnu.org
      Cc: qemu-stable@nongnu.org
      Fixes: 18473467 ("file-posix: try BLKSECTGET on block devices too, do not round to power of 2", 2021-06-25)
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-Id: <20210923130436.1187591-1-pbonzini@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
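      The clamping described in this commit can be sketched as follows. This
      is only an illustration, not the code from the patch: the helper names
      and the SKETCH_IOV_MAX constant are assumptions, and a limit of 0 is
      assumed to mean "no limit reported".

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's UIO_MAXIOV / POSIX IOV_MAX. */
#define SKETCH_IOV_MAX 1024

/* Smaller of two limits, treating 0 as "no limit". */
static int min_non_zero(int a, int b)
{
    if (a == 0) {
        return b;
    }
    if (b == 0) {
        return a;
    }
    return a < b ? a : b;
}

/*
 * Effective segment limit for an SG_IO request: the host adapter's
 * max_segments value (max_hw_iov) clamped to the general max_iov limit.
 * If neither limit is set, fall back to the assumed kernel maximum.
 */
int effective_hw_iov(int max_hw_iov, int max_iov)
{
    int limit;

    assert(max_hw_iov >= 0 && max_iov >= 0);
    limit = min_non_zero(max_hw_iov, max_iov);
    return limit == 0 ? SKETCH_IOV_MAX : limit;
}
```

      With an HBA that reports 128 segments the SG_IO path would use 128,
      while a value above max_iov is clamped down to max_iov.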
  2. Sep 29, 2021
    • block/io: allow 64bit discard requests · 6a8f3dbb
      Vladimir Sementsov-Ogievskiy authored
      
      Now that all drivers are updated by the previous commit, we can drop
      the last limiter on pdiscard path: INT_MAX in bdrv_co_pdiscard().
      
      Now everything is prepared for implementing incredibly cool and fast
      big-discard requests in NBD and qcow2. And any other driver which wants
      it of course.
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20210903102807.27127-12-vsementsov@virtuozzo.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
    • block: make BlockLimits::max_pdiscard 64bit · 39af49c0
      Vladimir Sementsov-Ogievskiy authored
      
      We are going to support 64 bit discard requests. Now update the
      limit variable. It's absolutely safe. The variable is set in some
      drivers, and used in bdrv_co_pdiscard().
      
      Update also max_pdiscard variable in bdrv_co_pdiscard(), so that
      bdrv_co_pdiscard() is now prepared for 64bit requests. The remaining
      logic including num, offset and bytes variables is already
      supporting 64bit requests.
      
      So the only thing that prevents 64 bit requests is limiting
      max_pdiscard variable to INT_MAX in bdrv_co_pdiscard().
      We'll drop this limitation after updating all block drivers.
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20210903102807.27127-10-vsementsov@virtuozzo.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
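      To illustrate why the width of the limit matters, here is a minimal
      sketch of a fragmentation loop that splits a discard by a per-call
      limit. The function name is hypothetical and the loop is a
      simplification (no alignment handling); it only counts the driver
      calls a request would need.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count how many driver calls a discard of `bytes` needs when each call
 * may cover at most `max_pdiscard` bytes (0 is assumed to mean "no limit").
 */
int64_t count_discard_calls(int64_t bytes, int64_t max_pdiscard)
{
    int64_t calls = 0;

    assert(bytes >= 0 && max_pdiscard >= 0);
    while (bytes > 0) {
        int64_t num = bytes;

        if (max_pdiscard > 0 && num > max_pdiscard) {
            num = max_pdiscard;   /* clamp this fragment to the limit */
        }
        bytes -= num;
        calls++;
    }
    return calls;
}
```

      An 8 GiB discard capped at INT32_MAX bytes per call is split into
      several fragments, whereas a 64-bit limit lets a capable driver take it
      in a single call.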
    • block/io: allow 64bit write-zeroes requests · 2aaa3f9b
      Vladimir Sementsov-Ogievskiy authored
      
      Now that all drivers are updated by previous commit, we can drop two
      last limiters on write-zeroes path: INT_MAX in
      bdrv_co_do_pwrite_zeroes() and bdrv_check_request32() in
      bdrv_co_pwritev_part().
      
      Now everything is prepared for implementing incredibly cool and fast
      big-write-zeroes in NBD and qcow2. And any other driver which wants it
      of course.
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20210903102807.27127-9-vsementsov@virtuozzo.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
    • block: make BlockLimits::max_pwrite_zeroes 64bit · d544f5d3
      Vladimir Sementsov-Ogievskiy authored
      
      We are going to support 64 bit write-zeroes requests. Now update the
      limit variable. It's absolutely safe. The variable is set in some
      drivers, and used in bdrv_co_do_pwrite_zeroes().
      
      Update also max_write_zeroes variable in bdrv_co_do_pwrite_zeroes(), so
      that bdrv_co_do_pwrite_zeroes() is now prepared for 64bit requests. The
      remaining logic including num, offset and bytes variables is already
      supporting 64bit requests.
      
      So the only thing that prevents 64 bit requests is limiting
      max_write_zeroes variable to INT_MAX in bdrv_co_do_pwrite_zeroes().
      We'll drop this limitation after updating all block drivers.
      
      Ah, we also have bdrv_check_request32() in bdrv_co_pwritev_part(). It
      will be modified to do bdrv_check_request() for write-zeroes path.
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20210903102807.27127-7-vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
    • block: use int64_t instead of uint64_t in driver write handlers · e75abeda
      Vladimir Sementsov-Ogievskiy authored
      
      We are generally moving to int64_t for both offset and bytes parameters
      on all io paths.
      
      Main motivation is realization of 64-bit write_zeroes operation for
      fast zeroing large disk chunks, up to the whole disk.
      
      We chose signed type, to be consistent with off_t (which is signed) and
      with possibility for signed return type (where negative value means
      error).
      
      So, convert driver write handlers parameters which are already 64bit to
      signed type.
      
      While being here, convert also flags parameter to be BdrvRequestFlags.
      
      Now let's consider all callers. Simple
      
        git grep '\->bdrv_\(aio\|co\)_pwritev\(_part\)\?'
      
      shows that there are three callers of the driver function:
      
       bdrv_driver_pwritev() and bdrv_driver_pwritev_compressed() in
       block/io.c, both pass int64_t, checked by bdrv_check_qiov_request() to
       be non-negative.
      
       qcow2_save_vmstate() does bdrv_check_qiov_request().
      
      Still, the functions may be called directly, not only by drv->...
      Let's check:
      
      git grep '\.bdrv_\(aio\|co\)_pwritev\(_part\)\?\s*=' | \
      awk '{print $4}' | sed 's/,//' | sed 's/&//' | sort | uniq | \
      while read func; do git grep "$func(" | \
      grep -v "$func(BlockDriverState"; done
      
      shows several callers:
      
      qcow2:
        qcow2_co_truncate() writes at most up to @offset, which is checked in
          generic qcow2_co_truncate() by bdrv_check_request().
        qcow2_co_pwritev_compressed_task() passes the request (or part of the
          request) that already went through the normal write path, so it
          should be OK
      
      qcow:
        qcow_co_pwritev_compressed() passes int64_t, it's updated by this patch
      
      quorum:
        quorum_co_pwrite_zeroes() passes int64_t and int - OK
      
      throttle:
        throttle_co_pwritev_compressed() passes int64_t, it's updated by this
        patch
      
      vmdk:
        vmdk_co_pwritev_compressed() passes int64_t, it's updated by this
        patch
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20210903102807.27127-5-vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
    • qcow2: check request on vmstate save/load path · 558902cc
      Vladimir Sementsov-Ogievskiy authored
      
      We modify the request by adding an offset to vmstate. Let's check the
      modified request. It will help us to safely move .bdrv_co_preadv_part
      and .bdrv_co_pwritev_part to int64_t type of offset and bytes.
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20210903102807.27127-3-vsementsov@virtuozzo.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
    • block/io: bring request check to bdrv_co_(read,write)v_vmstate · b984b296
      Vladimir Sementsov-Ogievskiy authored
      
      Only qcow2 driver supports vmstate.
      In qcow2 these requests go through .bdrv_co_p{read,write}v_part
      handlers.
      
      So, let's do our basic check for the request on vmstate generic
      handlers.
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20210903102807.27127-2-vsementsov@virtuozzo.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
  3. Sep 15, 2021
    • block: block-status cache for data regions · 0bc329fb
      Hanna Reitz authored
      As we have attempted before
      (https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg06451.html,
      "file-posix: Cache lseek result for data regions";
      https://lists.nongnu.org/archive/html/qemu-block/2021-02/msg00934.html,
      "file-posix: Cache next hole"), this patch seeks to reduce the number of
      SEEK_DATA/HOLE operations the file-posix driver has to perform.  The
      main difference is that this time it is implemented as part of the
      general block layer code.
      
      The problem we face is that on some filesystems or in some
      circumstances, SEEK_DATA/HOLE is unreasonably slow.  Given the
      implementation is outside of qemu, there is little we can do about its
      performance.
      
      We have already introduced the want_zero parameter to
      bdrv_co_block_status() to reduce the number of SEEK_DATA/HOLE calls
      unless we really want zero information; but sometimes we do want that
      information, because for files that consist largely of zero areas,
      special-casing those areas can give large performance boosts.  So the
      real problem is with files that consist largely of data, so that
      inquiring the block status does not gain us much performance, but where
      such an inquiry itself takes a lot of time.
      
      To address this, we want to cache data regions.  Most of the time, when
      bad performance is reported, it is in places where the image is iterated
      over from start to end (qemu-img convert or the mirror job), so a simple
      yet effective solution is to cache only the current data region.
      
      (Note that only caching data regions but not zero regions means that
      returning false information from the cache is not catastrophic: Treating
      zeroes as data is fine.  While we try to invalidate the cache on zero
      writes and discards, such incongruences may still occur when there are
      other processes writing to the image.)
      
      We only use the cache for nodes without children (i.e. protocol nodes),
      because that is where the problem is: Drivers that rely on block-status
      implementations outside of qemu (e.g. SEEK_DATA/HOLE).
      
      Resolves: https://gitlab.com/qemu-project/qemu/-/issues/307
      
      
      Signed-off-by: Hanna Reitz <hreitz@redhat.com>
      Message-Id: <20210812084148.14458-3-hreitz@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      [hreitz: Added `local_file == bs` assertion, as suggested by Vladimir]
      Signed-off-by: Hanna Reitz <hreitz@redhat.com>
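      The idea of caching only the current data region can be sketched as a
      single-entry cache. The struct and function names below are
      hypothetical and locking is omitted; this is an illustration of the
      approach described in the message, not the block layer's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical single-entry cache describing one known data region. */
typedef struct DataRegionCache {
    bool valid;
    int64_t start;   /* first byte of the cached data region */
    int64_t end;     /* one past the last byte */
} DataRegionCache;

/* Remember [start, start + len) as data, replacing any older entry. */
void cache_set(DataRegionCache *c, int64_t start, int64_t len)
{
    assert(start >= 0 && len > 0);
    c->valid = true;
    c->start = start;
    c->end = start + len;
}

/*
 * If `offset` lies in the cached region, report how many bytes of data
 * follow it and return true; otherwise the caller must ask the driver
 * (for example via SEEK_DATA/SEEK_HOLE).
 */
bool cache_lookup(const DataRegionCache *c, int64_t offset, int64_t *pnum)
{
    if (c->valid && offset >= c->start && offset < c->end) {
        *pnum = c->end - offset;
        return true;
    }
    return false;
}

/* Drop the entry when a zero write or discard overlaps it. */
void cache_invalidate(DataRegionCache *c, int64_t offset, int64_t len)
{
    if (c->valid && offset < c->end && offset + len > c->start) {
        c->valid = false;
    }
}
```

      Because only data regions are cached, a stale entry can at worst cause
      zeroes to be treated as data, which matches the safety argument in the
      commit message.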
  4. Aug 03, 2021
  5. Jul 06, 2021
  6. Jun 29, 2021
  7. Jun 25, 2021
    • block: add max_hw_transfer to BlockLimits · 24b36e98
      Paolo Bonzini authored
      
      For block host devices, I/O can happen through either the kernel file
      descriptor I/O system calls (preadv/pwritev, io_submit, io_uring)
      or the SCSI passthrough ioctl SG_IO.
      
      In the latter case, the size of each transfer can be limited by the
      HBA, while for file descriptor I/O the kernel is able to split and
      merge I/O in smaller pieces as needed.  Applying the HBA limits to
      file descriptor I/O results in more system calls and suboptimal
      performance, so this patch splits the max_transfer limit in two:
      max_transfer remains valid and is used in general, while max_hw_transfer
      is limited to the maximum hardware size.  max_hw_transfer can then be
      included by the scsi-generic driver in the block limits page, to ensure
      that the stricter hardware limit is used.
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  8. Jun 02, 2021
  9. May 14, 2021
  10. Apr 30, 2021
  11. Feb 12, 2021
  12. Feb 03, 2021
    • block/io: use int64_t bytes in copy_range · a5215b8f
      Vladimir Sementsov-Ogievskiy authored
      
      We are generally moving to int64_t for both offset and bytes parameters
      on all io paths.
      
      Main motivation is realization of 64-bit write_zeroes operation for
      fast zeroing large disk chunks, up to the whole disk.
      
      We chose signed type, to be consistent with off_t (which is signed) and
      with possibility for signed return type (where negative value means
      error).
      
      So, convert now copy_range parameters which are already 64bit to signed
      type.
      
      It's safe as we don't work with requests overflowing BDRV_MAX_LENGTH
      (which is less than INT64_MAX), and do check the requests in
      bdrv_co_copy_range_internal() (by bdrv_check_request32(), which calls
      bdrv_check_request()).
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20201211183934.169161-17-vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
    • block/io: support int64_t bytes in read/write wrappers · e9e52efd
      Vladimir Sementsov-Ogievskiy authored
      
      We are generally moving to int64_t for both offset and bytes parameters
      on all io paths.
      
      Main motivation is realization of 64-bit write_zeroes operation for
      fast zeroing large disk chunks, up to the whole disk.
      
      We chose signed type, to be consistent with off_t (which is signed) and
      with possibility for signed return type (where negative value means
      error).
      
      Now, since bdrv_co_preadv_part() and bdrv_co_pwritev_part() have been
      updated, update all their wrappers.
      
      For all of them type of 'bytes' is widening, so callers are safe. We
      have to update request_fn in blkverify.c simultaneously. Still it's just a
      pointer to one of bdrv_co_pwritev() or bdrv_co_preadv(), and type is
      widening for callers of the request_fn anyway.
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20201211183934.169161-16-vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      [eblake: grammar tweak]
      Signed-off-by: Eric Blake <eblake@redhat.com>
    • block/io: support int64_t bytes in bdrv_co_p{read,write}v_part() · 37e9403e
      Vladimir Sementsov-Ogievskiy authored
      
      We are generally moving to int64_t for both offset and bytes parameters
      on all io paths.
      
      Main motivation is realization of 64-bit write_zeroes operation for
      fast zeroing large disk chunks, up to the whole disk.
      
      We chose signed type, to be consistent with off_t (which is signed) and
      with possibility for signed return type (where negative value means
      error).
      
      So, prepare bdrv_co_preadv_part() and bdrv_co_pwritev_part() and their
      remaining dependencies now.
      
      bdrv_pad_request() is updated simultaneously, as a pointer to bytes is
      passed to it both from bdrv_co_pwritev_part() and bdrv_co_preadv_part().
      
      So, all callers of bdrv_pad_request() are updated to pass 64bit bytes.
      bdrv_pad_request() is already good for 64bit requests, add
      corresponding assertion.
      
      Look at bdrv_co_preadv_part() and bdrv_co_pwritev_part().
      Type is widening, so callers are safe. Let's look inside the functions.
      
      In bdrv_co_preadv_part() and bdrv_aligned_pwritev() we only pass bytes
      to other already int64_t interfaces (and some obviously safe
      calculations), it's OK.
      
      In bdrv_co_do_zero_pwritev() aligned_bytes may become large now, still
      it's passed to bdrv_aligned_pwritev which supports int64_t bytes.
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20201211183934.169161-15-vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
    • block/io: support int64_t bytes in bdrv_aligned_preadv() · 8b0c5d76
      Vladimir Sementsov-Ogievskiy authored
      
      We are generally moving to int64_t for both offset and bytes parameters
      on all io paths.
      
      Main motivation is realization of 64-bit write_zeroes operation for
      fast zeroing large disk chunks, up to the whole disk.
      
      We chose signed type, to be consistent with off_t (which is signed) and
      with possibility for signed return type (where negative value means
      error).
      
      So, prepare bdrv_aligned_preadv() now.
      
      Make the bytes variable in bdrv_padding_rmw_read() int64_t, as it is
      only used for pass-through to bdrv_aligned_preadv().
      
      All bdrv_aligned_preadv() callers are safe as type is widening. Let's
      look inside:
      
       - add a new-style assertion that request is good.
       - callees bdrv_is_allocated(), bdrv_co_do_copy_on_readv() support
         int64_t bytes
       - conversion of bytes_remaining is OK, as we never have requests
         overflowing BDRV_MAX_LENGTH
       - looping through bytes_remaining is ok, num is updated to int64_t
         - for bdrv_driver_preadv we have same limit of max_transfer
         - qemu_iovec_memset is OK, as bytes+qiov_offset should not overflow
           qiov->size anyway (thanks to bdrv_check_qiov_request())
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20201211183934.169161-14-vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      [eblake: grammar tweak]
      Signed-off-by: Eric Blake <eblake@redhat.com>
    • block/io: support int64_t bytes in bdrv_co_do_copy_on_readv() · 9df5afbd
      Vladimir Sementsov-Ogievskiy authored
      
      We are generally moving to int64_t for both offset and bytes parameters
      on all io paths.
      
      Main motivation is realization of 64-bit write_zeroes operation for
      fast zeroing large disk chunks, up to the whole disk.
      
      We chose signed type, to be consistent with off_t (which is signed) and
      with possibility for signed return type (where negative value means
      error).
      
      So, prepare bdrv_co_do_copy_on_readv() now.
      
      'bytes' type is widening, so callers are safe. Look at the function
      itself:
      
      bytes, skip_bytes and progress become int64_t.
      
      bdrv_round_to_clusters() is OK, cluster_bytes now may be large.
      trace_bdrv_co_do_copy_on_readv() is OK
      
      looping through cluster_bytes is still OK.
      
      pnum is still capped to max_transfer, and to MAX_BOUNCE_BUFFER when we
      are going to do COR operation. Therefore calculations in
      qemu_iovec_from_buf() and bdrv_driver_preadv() should not change.
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20201211183934.169161-13-vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
    • block/io: support int64_t bytes in bdrv_aligned_pwritev() · fcfd9ade
      Vladimir Sementsov-Ogievskiy authored
      
      We are generally moving to int64_t for both offset and bytes parameters
      on all io paths.
      
      Main motivation is realization of 64-bit write_zeroes operation for
      fast zeroing large disk chunks, up to the whole disk.
      
      We chose signed type, to be consistent with off_t (which is signed) and
      with possibility for signed return type (where negative value means
      error).
      
      So, prepare bdrv_aligned_pwritev() now and convert the dependencies:
      bdrv_co_write_req_prepare() and bdrv_co_write_req_finish() to signed
      type bytes.
      
      Conversion of bdrv_co_write_req_prepare() and
      bdrv_co_write_req_finish() is definitely safe, as all requests in
      block/io must not overflow BDRV_MAX_LENGTH. Still add assertions.
      
      For bdrv_aligned_pwritev() 'bytes' type is widened, so callers are
      safe. Let's check usage of the parameter inside the function.
      
      Passing to bdrv_co_write_req_prepare() and bdrv_co_write_req_finish()
      is OK.
      
      Passing to qemu_iovec_* is OK after new assertion. All other callees
      are already updated to int64_t.
      
      Checking alignment is not changed, offset + bytes and qiov_offset +
      bytes calculations are safe (thanks to new assertions).
      
      max_transfer is kept to be int for now. It has a default of INT_MAX
      here, and some drivers may rely on it. It's to be refactored later.
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20201211183934.169161-12-vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
    • block/io: support int64_t bytes in bdrv_co_do_pwrite_zeroes() · 5ae07b14
      Vladimir Sementsov-Ogievskiy authored
      
      We are generally moving to int64_t for both offset and bytes parameters
      on all io paths.
      
      Main motivation is realization of 64-bit write_zeroes operation for
      fast zeroing large disk chunks, up to the whole disk.
      
      We chose signed type, to be consistent with off_t (which is signed) and
      with possibility for signed return type (where negative value means
      error).
      
      So, prepare bdrv_co_do_pwrite_zeroes() now.
      
      Callers are safe, as converting int to int64_t is safe. Concentrate on
      'bytes' usage in the function (thx to Eric Blake):
      
          compute 'int tail' via % 'int alignment' - safe
          fragmentation loop 'int num' - still fragments with a cap on
            max_transfer
      
          use of 'num' within the loop
          MIN(bytes, max_transfer) as well as %alignment - still works, so
               calculations in if (head) {} are safe
          clamp size by 'int max_write_zeroes' - safe
          drv->bdrv_co_pwrite_zeroes(int) - safe because of clamping
          clamp size by 'int max_transfer' - safe
          buf allocation is still clamped to max_transfer
          qemu_iovec_init_buf(size_t) - safe because of clamping
          bdrv_driver_pwritev(uint64_t) - safe
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20201211183934.169161-11-vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
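      The fragmentation steps listed in this message (head via % alignment,
      tail kept for a fragment of its own, clamping by max_write_zeroes) can
      be sketched roughly as below. This is a simplified illustration under
      stated assumptions: the function name is hypothetical, max_transfer and
      the bounce-buffer fallback are left out, and max_write_zeroes is
      assumed to be at least one alignment unit.

```c
#include <assert.h>
#include <stdint.h>

static int64_t min64(int64_t a, int64_t b)
{
    return a < b ? a : b;
}

/*
 * Count the fragments a write-zeroes request is split into: an unaligned
 * head up to the next boundary, aligned middle parts clamped to
 * max_write_zeroes, and an unaligned tail.
 */
int count_zero_fragments(int64_t offset, int64_t bytes, int64_t alignment,
                         int64_t max_write_zeroes)
{
    int fragments = 0;
    int64_t head, tail;

    assert(offset >= 0 && bytes >= 0 && alignment > 0);
    assert(max_write_zeroes >= alignment);

    head = offset % alignment;
    tail = (offset + bytes) % alignment;

    while (bytes > 0) {
        int64_t num = bytes;

        if (head) {
            /* Unaligned start: write only up to the next boundary. */
            num = min64(bytes, alignment - head);
            head = (head + num) % alignment;
        } else if (tail && num > alignment) {
            /* Leave the unaligned tail for a later fragment. */
            num -= tail;
        }
        num = min64(num, max_write_zeroes);

        offset += num;
        bytes -= num;
        fragments++;
    }
    return fragments;
}
```

      For example, a 10000-byte request at offset 512 with 4096-byte
      alignment yields three fragments: 3584 bytes of head, one aligned
      4096-byte block, and a 2320-byte tail.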
    • block/io: use int64_t bytes in driver wrappers · 17abcbee
      Vladimir Sementsov-Ogievskiy authored
      
      We are generally moving to int64_t for both offset and bytes parameters
      on all io paths.
      
      Main motivation is realization of 64-bit write_zeroes operation for
      fast zeroing large disk chunks, up to the whole disk.
      
      We chose signed type, to be consistent with off_t (which is signed) and
      with possibility for signed return type (where negative value means
      error).
      
      So, convert driver wrappers parameters which are already 64bit to
      signed type.
      
      Requests in block/io.c must never exceed BDRV_MAX_LENGTH (which is less
      than INT64_MAX), which makes the conversion to signed 64bit type safe.
      
      Add corresponding assertions.
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20201211183934.169161-10-vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
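      The safety argument (requests never exceed a maximum length that is
      itself below INT64_MAX) can be sketched as a range check on signed
      64-bit values. SKETCH_MAX_LENGTH and both function names are
      assumptions made for illustration; the real BDRV_MAX_LENGTH value is
      not taken from this page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical upper bound on request range; the only property the sketch
 * relies on is that it is smaller than INT64_MAX.
 */
#define SKETCH_MAX_LENGTH (INT64_MAX - 65535)

/* True if [offset, offset + bytes) is non-negative and within the bound. */
bool request_is_valid(int64_t offset, int64_t bytes)
{
    if (offset < 0 || bytes < 0) {
        return false;
    }
    if (bytes > SKETCH_MAX_LENGTH) {
        return false;
    }
    /* Written as a subtraction so the comparison itself cannot overflow. */
    if (offset > SKETCH_MAX_LENGTH - bytes) {
        return false;
    }
    return true;
}

/* End offset of a request; safe to compute once the check has passed. */
int64_t request_end(int64_t offset, int64_t bytes)
{
    assert(request_is_valid(offset, bytes));
    return offset + bytes;
}
```

      Once a request passes such a check, offset + bytes fits in int64_t,
      which is what makes the switch from unsigned to signed parameters
      harmless.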
    • block: use int64_t as bytes type in tracked requests · 80247264
      Eric Blake authored
      
      We are generally moving to int64_t for both offset and bytes parameters
      on all io paths.
      
      Main motivation is realization of 64-bit write_zeroes operation for
      fast zeroing large disk chunks, up to the whole disk.
      
      We chose signed type, to be consistent with off_t (which is signed) and
      with possibility for signed return type (where negative value means
      error).
      
      All requests in block/io must not overflow BDRV_MAX_LENGTH, all
      external users of BdrvTrackedRequest already have corresponding
      assertions, so we are safe. Add some assertions still.
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20201211183934.169161-9-vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
    • block/io: improve bdrv_check_request: check qiov too · 63f4ad11
      Vladimir Sementsov-Ogievskiy authored
      
      Operations with qiov add more restrictions on bytes, let's cover it.
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20201211183934.169161-8-vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
    • block/io: bdrv_pad_request(): support qemu_iovec_init_extended failure · 98ca4549
      Vladimir Sementsov-Ogievskiy authored
      
      Make bdrv_pad_request() honest: return error if
      qemu_iovec_init_extended() failed.
      
      Update also bdrv_padding_destroy() to clean the structure for safety.
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20201211183934.169161-6-vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
    • block/io: refactor bdrv_pad_request(): move bdrv_pad_request() up · f0deecff
      Vladimir Sementsov-Ogievskiy authored
      
      Prepare for the following patch when bdrv_pad_request() will be able to
      fail. Update the comments.
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20201211183934.169161-5-vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      [eblake: grammar tweak]
      Signed-off-by: Eric Blake <eblake@redhat.com>
    • block: fix theoretical overflow in bdrv_init_padding() · a56ed80c
      Vladimir Sementsov-Ogievskiy authored
      
      Calculation of sum may theoretically overflow, so use 64bit type and
      add some good assertions.
      
      Use int64_t consistently.
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20201211183934.169161-4-vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      [eblake: tweak assertion order]
      Signed-off-by: Eric Blake <eblake@redhat.com>
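      A minimal sketch of computing head and tail padding with a 64-bit sum
      guarded by assertions, in the spirit of this fix. The struct and
      function names are hypothetical and the alignment is assumed to be a
      power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: bytes to add before and after the request. */
typedef struct PaddingSketch {
    int64_t head;
    int64_t tail;
} PaddingSketch;

PaddingSketch compute_padding(int64_t offset, int64_t bytes, int64_t align)
{
    PaddingSketch pad;
    int64_t sum;

    assert(align > 0 && (align & (align - 1)) == 0);  /* power of two */
    assert(offset >= 0 && bytes >= 0);
    assert(offset <= INT64_MAX - bytes);              /* sum cannot overflow */

    sum = offset + bytes;                 /* 64-bit, checked above */
    pad.head = offset & (align - 1);      /* distance from previous boundary */
    pad.tail = sum & (align - 1);
    if (pad.tail) {
        pad.tail = align - pad.tail;      /* distance to the next boundary */
    }
    return pad;
}
```

      For a 1000-byte request at offset 512 with 4096-byte alignment this
      gives a head of 512 and a tail of 2584, so the padded request covers
      exactly one aligned block.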
    • util/iov: make qemu_iovec_init_extended() honest · 4c002cef
      Vladimir Sementsov-Ogievskiy authored
      
      Actually, we can't extend the io vector in all cases. Handle possible
      MAX_IOV and size_t overflows.
      
      For now add assertion to callers (actually they rely on success anyway)
      and fix them in the following patch.
      
      Add also some additional good assertions to qemu_iovec_init_slice()
      while being here.
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20201211183934.169161-3-vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
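      The two failure modes named here (too many vector elements and size_t
      overflow) can be checked before an extended vector is built, roughly as
      in the sketch below. The function name, its parameters and the
      SKETCH_IOV_MAX constant are assumptions for illustration and do not
      mirror the real qemu_iovec_init_extended() signature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element limit, standing in for IOV_MAX. */
#define SKETCH_IOV_MAX 1024

/*
 * Decide whether a vector of mid_niov elements covering mid_len bytes can
 * be extended by an optional head and tail buffer. On success the combined
 * length and element count are stored through the out parameters.
 */
bool can_extend_iov(size_t head_len, size_t mid_len, int mid_niov,
                    size_t tail_len, size_t *total_len, int *total_niov)
{
    int niov;

    assert(mid_niov >= 0);

    /* Reject combined lengths that do not fit in size_t. */
    if (head_len > SIZE_MAX - mid_len) {
        return false;
    }
    if (head_len + mid_len > SIZE_MAX - tail_len) {
        return false;
    }

    /* Head and tail each add one element when non-empty. */
    if (mid_niov > SKETCH_IOV_MAX) {
        return false;
    }
    niov = mid_niov + (head_len ? 1 : 0) + (tail_len ? 1 : 0);
    if (niov > SKETCH_IOV_MAX) {
        return false;
    }

    *total_len = head_len + mid_len + tail_len;
    *total_niov = niov;
    return true;
}
```

      A caller that previously assumed success would instead check the return
      value and propagate an error, which is the behavior the following
      patches in the series introduce for bdrv_pad_request().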
    • block: refactor bdrv_check_request: add errp · 69b55e03
      Vladimir Sementsov-Ogievskiy authored
      
      It's better to pass &error_abort than just assert that result is 0: on
      crash, we'll immediately see the reason in the backtrace.
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20201211183934.169161-2-vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      [eblake: fix iotest 206 fallout]
      Signed-off-by: Eric Blake <eblake@redhat.com>
  13. Jan 26, 2021
  14. Dec 18, 2020
  15. Dec 11, 2020
    • block: Fix deadlock in bdrv_co_yield_to_drain() · 960d5fb3
      Kevin Wolf authored
      If bdrv_co_yield_to_drain() is called for draining a block node that
      runs in a different AioContext, it keeps that AioContext locked while it
      yields and schedules a BH in the AioContext to do the actual drain.
      
      As long as executing the BH is the very next thing that the event loop
      of the node's AioContext does, this actually happens to work, but when
      it tries to execute something else that wants to take the AioContext
      lock, it will deadlock. (In the bug report, this other thing is a
      virtio-scsi device running virtio_scsi_data_plane_handle_cmd().)
      
      Instead, always drop the AioContext lock across the yield and reacquire
      it only when the coroutine is reentered. The BH needs to unconditionally
      take the lock for itself now.
      
      This fixes the 'block_resize' QMP command on a block node that runs in
      an iothread.
      
      Cc: qemu-stable@nongnu.org
      Fixes: eb94b81a
      Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1903511
      
      
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      Message-Id: <20201203172311.68232-4-kwolf@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>