Skip to content
Snippets Groups Projects
  1. Nov 16, 2021
    • Hanna Reitz's avatar
      block: Let replace_child_noperm free children · b0a9f6fe
      Hanna Reitz authored
      
      In most of the block layer, especially when traversing down from other
      BlockDriverStates, we assume that BdrvChild.bs can never be NULL.  When
      it becomes NULL, it is expected that the corresponding BdrvChild pointer
      also becomes NULL and the BdrvChild object is freed.
      
      Therefore, once bdrv_replace_child_noperm() sets the BdrvChild.bs
      pointer to NULL, it should also immediately set the corresponding
      BdrvChild pointer (like bs->file or bs->backing) to NULL.
      
      In that context, it also makes sense for this function to free the
      child.  Sometimes we cannot do so, though, because it is called in a
      transactional context where the caller might still want to reinstate the
      child in the abort branch (and free it only on commit), so this behavior
      has to remain optional.
      
      In bdrv_replace_child_tran()'s abort handler, we now rely on the fact
      that the BdrvChild passed to bdrv_replace_child_tran() must have had a
      non-NULL .bs pointer initially.  Make a note of that and assert it.
      
      Signed-off-by: default avatarHanna Reitz <hreitz@redhat.com>
      Message-Id: <20211111120829.81329-10-hreitz@redhat.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      Message-Id: <20211115145409.176785-10-kwolf@redhat.com>
      Signed-off-by: default avatarHanna Reitz <hreitz@redhat.com>
      b0a9f6fe
    • Hanna Reitz's avatar
      block: Let replace_child_tran keep indirect pointer · 82b54cf5
      Hanna Reitz authored
      
      As of a future commit, bdrv_replace_child_noperm() will clear the
      indirect BdrvChild pointer passed to it if the new child BDS is NULL.
      bdrv_replace_child_tran() will want to let it do that, but revert this
      change in its abort handler.  For that, we need to have it receive a
      BdrvChild ** pointer, too, and keep it stored in the
      BdrvReplaceChildState object that we attach to the transaction.
      
      Note that we do not need to store it in the BdrvReplaceChildState when
      new_bs is not NULL, because then there is nothing to revert.  This is
      important so that bdrv_replace_node_noperm() can pass a pointer to a
      loop-local variable to bdrv_replace_child_tran() without worrying that
      this pointer will outlive one loop iteration.
      
      (Of course, for that to work, bdrv_replace_node_noperm() and in turn
      bdrv_replace_node() and its relatives may not be called with a NULL @to
      node.  Luckily, they already are not, but now we should assert this.)
      
      bdrv_remove_file_or_backing_child() on the other hand needs to ensure
      that the indirect pointer it passes will stay valid for the duration of
      the transaction.  Ensure this by keeping a strong reference to the BDS
      whose &bs->backing or &bs->file it passes to bdrv_replace_child_tran(),
      and giving up that reference only in the transaction .clean() handler.
      
      Signed-off-by: default avatarHanna Reitz <hreitz@redhat.com>
      Message-Id: <20211111120829.81329-9-hreitz@redhat.com>
      Reviewed-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      Message-Id: <20211115145409.176785-9-kwolf@redhat.com>
      Signed-off-by: default avatarHanna Reitz <hreitz@redhat.com>
      82b54cf5
    • Hanna Reitz's avatar
      block: Restructure remove_file_or_backing_child() · 562bda8b
      Hanna Reitz authored
      
      As of a future patch, bdrv_replace_child_tran() will take a BdrvChild **
      pointer.  Prepare for that by getting such a pointer and using it where
      applicable, and (dereferenced) as a parameter for
      bdrv_replace_child_tran().
      
      Signed-off-by: default avatarHanna Reitz <hreitz@redhat.com>
      Message-Id: <20211111120829.81329-7-hreitz@redhat.com>
      Reviewed-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      Message-Id: <20211115145409.176785-7-kwolf@redhat.com>
      Signed-off-by: default avatarHanna Reitz <hreitz@redhat.com>
      562bda8b
    • Hanna Reitz's avatar
      block: Pass BdrvChild ** to replace_child_noperm · be64bbb0
      Hanna Reitz authored
      
      bdrv_replace_child_noperm() modifies BdrvChild.bs, and can potentially
      set it to NULL.  That is dangerous, because BDS parents generally assume
      that their children's .bs pointer is never NULL.  We therefore want to
      let bdrv_replace_child_noperm() set the corresponding BdrvChild pointer
      to NULL, too.
      
      This patch lays the foundation for it by passing a BdrvChild ** pointer
      to bdrv_replace_child_noperm() so that it can later use it to NULL the
      BdrvChild pointer immediately after setting BdrvChild.bs to NULL.
      
      (We will still need to undertake some intermediate steps, though.)
      
      Signed-off-by: default avatarHanna Reitz <hreitz@redhat.com>
      Message-Id: <20211111120829.81329-6-hreitz@redhat.com>
      Reviewed-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      Message-Id: <20211115145409.176785-6-kwolf@redhat.com>
      Signed-off-by: default avatarHanna Reitz <hreitz@redhat.com>
      be64bbb0
    • Hanna Reitz's avatar
      block: Drop detached child from ignore list · 26518061
      Hanna Reitz authored
      
      bdrv_attach_child_common_abort() restores the parent's AioContext.  To
      do so, the child (which was supposed to be attached, but is now detached
      again by this abort handler) is added to the ignore list for the
      AioContext changing functions.
      
      However, since we modify a BDS's children list in the BdrvChildClass's
      .attach and .detach handlers, the child is already effectively detached
      from the parent by this point.  We do not need to put it into the ignore
      list.
      
      Use this opportunity to clean up the empty line structure: Keep setting
      the ignore list, invoking the AioContext function, and freeing the
      ignore list in blocks separated by empty lines.
      
      Signed-off-by: default avatarHanna Reitz <hreitz@redhat.com>
      Reviewed-by: default avatarKevin Wolf <kwolf@redhat.com>
      Reviewed-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20211111120829.81329-5-hreitz@redhat.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      Message-Id: <20211115145409.176785-5-kwolf@redhat.com>
      Signed-off-by: default avatarHanna Reitz <hreitz@redhat.com>
      26518061
    • Hanna Reitz's avatar
      block: Unite remove_empty_child and child_free · 04c9c3a5
      Hanna Reitz authored
      
      Now that bdrv_remove_empty_child() no longer removes the child from the
      parent's children list but only checks that it is not in such a list, it
      is only a wrapper around bdrv_child_free() that checks that the child is
      empty and unused.  That should apply to all children that we free, so
      put those checks into bdrv_child_free() and drop
      bdrv_remove_empty_child().
      
      Signed-off-by: default avatarHanna Reitz <hreitz@redhat.com>
      Reviewed-by: default avatarKevin Wolf <kwolf@redhat.com>
      Reviewed-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20211111120829.81329-4-hreitz@redhat.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      Message-Id: <20211115145409.176785-4-kwolf@redhat.com>
      Signed-off-by: default avatarHanna Reitz <hreitz@redhat.com>
      04c9c3a5
    • Hanna Reitz's avatar
      block: Manipulate children list in .attach/.detach · a225369b
      Hanna Reitz authored
      
      The children list is specific to BDS parents.  We should not modify it
      in the general children modification code, but let BDS parents deal with
      it in their .attach() and .detach() methods.
      
      This also has the advantage that a BdrvChild is removed from the
      children list before its .bs pointer can become NULL.  BDS parents
      generally assume that their children's .bs pointer is never NULL, so
      this is actually a bug fix.
      
      Signed-off-by: default avatarHanna Reitz <hreitz@redhat.com>
      Reviewed-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20211111120829.81329-3-hreitz@redhat.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      Message-Id: <20211115145409.176785-3-kwolf@redhat.com>
      Signed-off-by: default avatarHanna Reitz <hreitz@redhat.com>
      a225369b
  2. Nov 02, 2021
  3. Oct 06, 2021
  4. Sep 15, 2021
    • Vladimir Sementsov-Ogievskiy's avatar
      block: bdrv_inactivate_recurse(): check for permissions and fix crash · a13de40a
      Vladimir Sementsov-Ogievskiy authored
      
      We must not inactivate child when parent has write permissions on
      it.
      
      Calling .bdrv_inactivate() doesn't help: actually only qcow2 has this
      handler and it is used to flush caches, not for permission
      manipulations.
      
      So, let's simply check cumulative parent permissions before
      inactivating the node.
      
      This commit fixes a crash when we do migration during backup: prior to
      the commit nothing prevents all nodes inactivation at migration finish
      and following backup write to the target crashes on assertion
      "assert(!(bs->open_flags & BDRV_O_INACTIVE));" in
      bdrv_co_write_req_prepare().
      
      After the commit, we rely on the fact that copy-before-write filter
      keeps write permission on target node to be able to write to it. So
      inactivation fails and migration fails as expected.
      
      Corresponding test now passes, so, enable it.
      
      Signed-off-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Reviewed-by: default avatarHanna Reitz <hreitz@redhat.com>
      Message-Id: <20210911120027.8063-3-vsementsov@virtuozzo.com>
      Signed-off-by: default avatarHanna Reitz <hreitz@redhat.com>
      a13de40a
    • Hanna Reitz's avatar
      block: block-status cache for data regions · 0bc329fb
      Hanna Reitz authored
      As we have attempted before
      (https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg06451.html,
      "file-posix: Cache lseek result for data regions";
      https://lists.nongnu.org/archive/html/qemu-block/2021-02/msg00934.html,
      "file-posix: Cache next hole"), this patch seeks to reduce the number of
      SEEK_DATA/HOLE operations the file-posix driver has to perform.  The
      main difference is that this time it is implemented as part of the
      general block layer code.
      
      The problem we face is that on some filesystems or in some
      circumstances, SEEK_DATA/HOLE is unreasonably slow.  Given the
      implementation is outside of qemu, there is little we can do about its
      performance.
      
      We have already introduced the want_zero parameter to
      bdrv_co_block_status() to reduce the number of SEEK_DATA/HOLE calls
      unless we really want zero information; but sometimes we do want that
      information, because for files that consist largely of zero areas,
      special-casing those areas can give large performance boosts.  So the
      real problem is with files that consist largely of data, so that
      inquiring the block status does not gain us much performance, but where
      such an inquiry itself takes a lot of time.
      
      To address this, we want to cache data regions.  Most of the time, when
      bad performance is reported, it is in places where the image is iterated
      over from start to end (qemu-img convert or the mirror job), so a simple
      yet effective solution is to cache only the current data region.
      
      (Note that only caching data regions but not zero regions means that
      returning false information from the cache is not catastrophic: Treating
      zeroes as data is fine.  While we try to invalidate the cache on zero
      writes and discards, such incongruences may still occur when there are
      other processes writing to the image.)
      
      We only use the cache for nodes without children (i.e. protocol nodes),
      because that is where the problem is: Drivers that rely on block-status
      implementations outside of qemu (e.g. SEEK_DATA/HOLE).
      
      Resolves: https://gitlab.com/qemu-project/qemu/-/issues/307
      
      
      Signed-off-by: default avatarHanna Reitz <hreitz@redhat.com>
      Message-Id: <20210812084148.14458-3-hreitz@redhat.com>
      Reviewed-by: default avatarEric Blake <eblake@redhat.com>
      Reviewed-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      [hreitz: Added `local_file == bs` assertion, as suggested by Vladimir]
      Signed-off-by: default avatarHanna Reitz <hreitz@redhat.com>
      0bc329fb
  5. Sep 01, 2021
  6. Jul 20, 2021
    • Kevin Wolf's avatar
      block: Add option to use driver whitelist even in tools · e5f05f8c
      Kevin Wolf authored
      
      Currently, the block driver whitelists are only applied for the system
      emulator. All other binaries still give unrestricted access to all block
      drivers. There are use cases where this made sense because the main
      concern was avoiding customers running VMs on less optimised block
      drivers and getting bad performance. Allowing the same image format e.g.
      as a target for 'qemu-img convert' is not a problem then.
      
      However, if the concern is the supportability of the driver in general,
      either in full or when used read-write, not applying the list driver
      whitelist in tools doesn't help - especially since qemu-nbd and
      qemu-storage-daemon now give access to more or less the same operations
      in block drivers as running a system emulator.
      
      In order to address this, introduce a new configure option that enforces
      the driver whitelist in all binaries.
      
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      Message-Id: <20210709164141.254097-1-kwolf@redhat.com>
      Reviewed-by: default avatarEric Blake <eblake@redhat.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      e5f05f8c
  7. Jul 09, 2021
  8. Jun 29, 2021
  9. Jun 25, 2021
  10. Jun 02, 2021
    • Vladimir Sementsov-Ogievskiy's avatar
      block: improve permission conflict error message · 30ebb9aa
      Vladimir Sementsov-Ogievskiy authored
      
      Now permissions are updated as follows:
       1. do graph modifications ignoring permissions
       2. do permission update
      
       (of course, we rollback [1] if [2] fails)
      
      So, on stage [2] we can't say which users are "old" and which are
      "new" and exist only since [1]. And current error message is a bit
      outdated. Let's improve it, to make everything clean.
      
      While being here, add also a comment and some good assertions.
      
      iotests 283, 307, qsd-jobs outputs are updated.
      
      Signed-off-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20210601075218.79249-7-vsementsov@virtuozzo.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      30ebb9aa
    • Vladimir Sementsov-Ogievskiy's avatar
      block: simplify bdrv_child_user_desc() · da261b69
      Vladimir Sementsov-Ogievskiy authored
      
      All child classes have this callback. So, drop unreachable code.
      
      Still add an assertion to bdrv_attach_child_common(), to early detect
      bad classes.
      
      Signed-off-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20210601075218.79249-6-vsementsov@virtuozzo.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      da261b69
    • Vladimir Sementsov-Ogievskiy's avatar
      block: improve bdrv_child_get_parent_desc() · 2c0a3acb
      Vladimir Sementsov-Ogievskiy authored
      
      We have different types of parents: block nodes, block backends and
      jobs. So, it makes sense to specify type together with name.
      
      Next, this handler us used to compose an error message about permission
      conflict. And permission conflict occurs in a specific place of block
      graph. We shouldn't report name of parent device (as it refers another
      place in block graph), but exactly and only the name of the node. So,
      use bdrv_get_node_name() directly.
      
      iotest 283 output is updated.
      
      Signed-off-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Reviewed-by: default avatarAlberto Garcia <berto@igalia.com>
      Message-Id: <20210601075218.79249-4-vsementsov@virtuozzo.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      2c0a3acb
    • Vladimir Sementsov-Ogievskiy's avatar
      block: document child argument of bdrv_attach_child_common() · f8d2ad78
      Vladimir Sementsov-Ogievskiy authored
      
      The logic around **child is not obvious: this reference is used not
      only to return resulting child, but also to rollback NULL value on
      transaction abort.
      
      So, let's add documentation and some assertions.
      
      While being here, drop extra declaration of bdrv_attach_child_noperm().
      
      Signed-off-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20210601075218.79249-2-vsementsov@virtuozzo.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      f8d2ad78
    • Vladimir Sementsov-Ogievskiy's avatar
      block: drop BlockDriverState::read_only · 975da073
      Vladimir Sementsov-Ogievskiy authored
      
      This variable is just a cache for !(bs->open_flags & BDRV_O_RDWR),
      which we have to synchronize everywhere. Let's just drop it and
      consistently use bdrv_is_read_only().
      
      Signed-off-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20210527154056.70294-3-vsementsov@virtuozzo.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      975da073
    • Vladimir Sementsov-Ogievskiy's avatar
      block: consistently use bdrv_is_read_only() · 307261b2
      Vladimir Sementsov-Ogievskiy authored
      
      It's better to use accessor function instead of bs->read_only directly.
      In some places use bdrv_is_writable() instead of
      checking both BDRV_O_RDWR set and BDRV_O_INACTIVE not set.
      
      In bdrv_open_common() it's a bit strange to add one more variable, but
      we are going to drop bs->read_only in the next patch, so new ro local
      variable substitutes it here.
      
      Signed-off-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20210527154056.70294-2-vsementsov@virtuozzo.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      307261b2
    • Vladimir Sementsov-Ogievskiy's avatar
      block/vvfat: child_vvfat_qcow: add .get_parent_aio_context, fix crash · fb62b588
      Vladimir Sementsov-Ogievskiy authored
      
      Commit 3ca1f322
      "block: BdrvChildClass: add .get_parent_aio_context handler" introduced
      new handler and commit 228ca37e
      "block: drop ctx argument from bdrv_root_attach_child" made a generic
      use of it. But 3ca1f322 didn't update
      child_vvfat_qcow. Fix that.
      
      Before that fix the command
      
      ./build/qemu-system-x86_64 -usb -device usb-storage,drive=fat16 \
        -drive file=fat:rw:fat-type=16:"<path of a host folder>",id=fat16,format=raw,if=none
      
      crashes:
      
      1  bdrv_child_get_parent_aio_context (c=0x559d62426d20)
          at ../block.c:1440
      2  bdrv_attach_child_common
          (child_bs=0x559d62468190, child_name=0x559d606f9e3d "write-target",
           child_class=0x559d60c58d20 <child_vvfat_qcow>, child_role=3,
           perm=3, shared_perm=4, opaque=0x559d62445690,
           child=0x7ffc74c2acc8, tran=0x559d6246ddd0, errp=0x7ffc74c2ae60)
          at ../block.c:2795
      3  bdrv_attach_child_noperm
          (parent_bs=0x559d62445690, child_bs=0x559d62468190,
           child_name=0x559d606f9e3d "write-target",
           child_class=0x559d60c58d20 <child_vvfat_qcow>, child_role=3,
           child=0x7ffc74c2acc8, tran=0x559d6246ddd0, errp=0x7ffc74c2ae60) at
          ../block.c:2855
      4  bdrv_attach_child
          (parent_bs=0x559d62445690, child_bs=0x559d62468190,
           child_name=0x559d606f9e3d "write-target",
           child_class=0x559d60c58d20 <child_vvfat_qcow>, child_role=3,
           errp=0x7ffc74c2ae60) at ../block.c:2953
      5  bdrv_open_child
          (filename=0x559d62464b80 "/var/tmp/vl.h3TIS4",
           options=0x559d6246ec20, bdref_key=0x559d606f9e3d "write-target",
           parent=0x559d62445690, child_class=0x559d60c58d20
           <child_vvfat_qcow>, child_role=3, allow_none=false,
           errp=0x7ffc74c2ae60) at ../block.c:3351
      6  enable_write_target (bs=0x559d62445690, errp=0x7ffc74c2ae60) at
         ../block/vvfat.c:3176
      7  vvfat_open (bs=0x559d62445690, options=0x559d6244adb0, flags=155650,
                     errp=0x7ffc74c2ae60) at ../block/vvfat.c:1236
      8  bdrv_open_driver (bs=0x559d62445690, drv=0x559d60d4f7e0
                           <bdrv_vvfat>, node_name=0x0,
                           options=0x559d6244adb0, open_flags=155650,
                           errp=0x7ffc74c2af70) at ../block.c:1557
      9  bdrv_open_common (bs=0x559d62445690, file=0x0,
                           options=0x559d6244adb0, errp=0x7ffc74c2af70) at
      ...
      
      (gdb) fr 1
       #1  0x0000559d603ea3bf in bdrv_child_get_parent_aio_context
           (c=0x559d62426d20) at ../block.c:1440
      1440        return c->klass->get_parent_aio_context(c);
       (gdb) p c->klass
      $1 = (const BdrvChildClass *) 0x559d60c58d20 <child_vvfat_qcow>
       (gdb) p c->klass->get_parent_aio_context
      $2 = (AioContext *(*)(BdrvChild *)) 0x0
      
      Fixes: 3ca1f322
      Fixes: 228ca37e
      Reported-by: default avatarJohn Arbuckle <programmingkidx@gmail.com>
      Signed-off-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20210524101257.119377-2-vsementsov@virtuozzo.com>
      Tested-by: default avatarJohn Arbuckle <programmingkidx@gmail.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      fb62b588
  11. May 18, 2021
Loading