  1. Nov 03, 2020
    • block/vvfat: Fix bad printf format specifiers · c9eb2f3e
      Alex Chen authored
      
      We should use the printf format specifier "%u" instead of "%d" for
      arguments of type "unsigned int".
      In addition, fix two formatting errors found by checkpatch.pl:
      ERROR: space required after that ',' (ctx:VxV)
      +        fprintf(stderr,"%s attributes=0x%02x begin=%u size=%d\n",
                             ^
      ERROR: line over 90 characters
      +        fprintf(stderr, "%d, %s (%u, %d)\n", i, commit->path ? commit->path : "(null)", commit->param.rename.cluster, commit->action);
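      A minimal standalone sketch of the bug class (illustrative, not the
      vvfat code): "%d" reinterprets an unsigned argument as signed, so
      large values print as negative, while "%u" prints them correctly.

          #include <stdio.h>

          int main(void)
          {
              unsigned int size = 3000000000u;  /* does not fit in signed int */

              /* printf("size=%d\n", size);  wrong: prints a negative number */
              printf("size=%u\n", size);    /* right: prints 3000000000 */
              return 0;
          }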
      
      Reported-by: Euler Robot <euler.robot@huawei.com>
      Signed-off-by: Alex Chen <alex.chen@huawei.com>
      Message-Id: <5FA12620.6030705@huawei.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
  2. Oct 30, 2020
    • nbd: Add 'qemu-nbd -A' to expose allocation depth · dbc7b014
      Eric Blake authored
      
      Allow the server to expose an additional metacontext to be requested
      by savvy clients.  qemu-nbd adds a new option -A to expose the
      qemu:allocation-depth metacontext through NBD_CMD_BLOCK_STATUS; this
      can also be set via QMP when using block-export-add.
      
      qemu as client is hacked into viewing the key aspects of this new
      context by abusing the already-experimental x-dirty-bitmap option to
      collapse all depths greater than 2, which results in a tri-state value
      visible in the output of 'qemu-img map --output=json' (yes, that means
      x-dirty-bitmap is now a bit of a misnomer, but I didn't feel like
      renaming it as it would introduce a needless break of back-compat,
      even though we make no compat guarantees with x- members):
      
      unallocated (depth 0) => "zero":false, "data":true
      local (depth 1)       => "zero":false, "data":false
      backing (depth 2+)    => "zero":true,  "data":true
      
      libnbd as client is probably a nicer way to get at the information
      without having to decipher such hacks in qemu as client. ;)
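      As a standalone illustration (not QEMU code), the tri-state mapping
      quoted above can be reproduced like this:

          #include <stdio.h>

          /* Collapse an allocation depth into the "zero"/"data" pair shown
           * by 'qemu-img map --output=json' under this hack. */
          static void print_depth(unsigned depth)
          {
              const char *zero = depth >= 2 ? "true" : "false"; /* backing */
              const char *data = depth == 1 ? "false" : "true"; /* local */
              printf("depth=%u => \"zero\":%s, \"data\":%s\n", depth, zero, data);
          }

          int main(void)
          {
              print_depth(0);  /* unallocated */
              print_depth(1);  /* local */
              print_depth(2);  /* backing */
              return 0;
          }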
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20201027050556.269064-11-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
    • block: Return depth level during bdrv_is_allocated_above · a92b1b06
      Eric Blake authored
      
      When checking for allocation across a chain, it's already easy to
      count the depth within the chain at which the allocation is found.
      Instead of throwing that information away, return it to the caller.
      Existing callers only care about allocated vs. non-allocated, but the
      depth will be used by NBD in the next patch.
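      A standalone sketch of the idea with illustrative types (BdrvLayer
      and its callback are hypothetical, not QEMU's API): walk the chain
      from top to base and return the 1-based depth of the first layer
      that allocates the range, or 0 if none does.

          #include <stdbool.h>

          typedef struct BdrvLayer {
              struct BdrvLayer *backing;  /* next layer in the backing chain */
              bool (*is_allocated)(struct BdrvLayer *l, long ofs, long bytes);
          } BdrvLayer;

          static int allocated_depth(BdrvLayer *top, long ofs, long bytes)
          {
              int depth = 0;
              for (BdrvLayer *l = top; l; l = l->backing) {
                  depth++;
                  if (l->is_allocated(l, ofs, bytes)) {
                      return depth;  /* caller now learns *where* it is */
                  }
              }
              return 0;              /* unallocated in the whole chain */
          }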
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20201027050556.269064-9-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      [eblake: rebase to master]
      Signed-off-by: Eric Blake <eblake@redhat.com>
  3. Oct 27, 2020
    • block: End quiescent sections when a BDS is deleted · 1a6d3bd2
      Greg Kurz authored
      If a BDS gets deleted during blk_drain_all(), it might miss a
      call to bdrv_do_drained_end(). This means missing a call to
      aio_enable_external(), so the AIO context remains disabled
      forever. This can cause a device to become unresponsive and to
      disrupt guest execution, i.e. hang, loop forever or worse.
      
      This scenario is quite easy to encounter with virtio-scsi
      on POWER when punching multiple blockdev-create QMP commands
      while the guest is booting and it is still running the SLOF
      firmware. This happens because SLOF disables/re-enables PCI
      devices multiple times via IO/MEM/MASTER bits of PCI_COMMAND
      register after the initial probe/feature negotiation, as it
      tends to work with a single device at a time at various stages
      like probing and running block/network bootloaders without
      doing a full reset in-between. This naturally generates many
      dataplane stops and starts, and thus many drain sections that
      can race with blockdev_create_run(). In the end, SLOF bails
      out.
      
      It is somewhat reproducible on x86, but it requires generating
      artificial dataplane start/stop activity with stop/cont QMP
      commands. In this case, SeaBIOS ends up looping forever,
      waiting for the virtio-scsi device to send a response to
      a command it never received.
      
      Add a helper that pairs all previously called bdrv_do_drained_begin()
      with a bdrv_do_drained_end() and call it from bdrv_close().
      While at it, update the "/bdrv-drain/graph-change/drain_all"
      test in test-bdrv-drain so that it can catch the issue.
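      A standalone sketch of the helper's intent, with illustrative types
      and names (not QEMU's actual structures): drain the counter of open
      quiescent sections to zero before the BDS goes away, so every begin
      gets its matching end and external AIO is re-enabled.

          #include <assert.h>

          typedef struct {
              int quiesce_counter;  /* open bdrv_do_drained_begin() calls */
          } BDS;

          static void drained_end(BDS *bs)
          {
              assert(bs->quiesce_counter > 0);
              if (--bs->quiesce_counter == 0) {
                  /* the real code re-enables external AIO here */
              }
          }

          static void end_quiescent_sections(BDS *bs)
          {
              while (bs->quiesce_counter > 0) {
                  drained_end(bs);  /* pair every begin never ended */
              }
          }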
      
      BugId: https://bugzilla.redhat.com/show_bug.cgi?id=1874441
      
      
      Signed-off-by: Greg Kurz <groug@kaod.org>
      Message-Id: <160346526998.272601.9045392804399803158.stgit@bahia.lan>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • qcow2: Skip copy-on-write when allocating a zero cluster · 46cd1e8a
      Alberto Garcia authored
      
      Since commit c8bb23cb, when a write request results in a new
      allocation, QEMU first tries to see if the rest of the cluster
      outside the written area contains only zeroes.
      
      In that case, instead of doing a normal copy-on-write operation and
      writing explicit zero buffers to disk, the code zeroes the whole
      cluster efficiently using pwrite_zeroes() with BDRV_REQ_NO_FALLBACK.
      
      This improves performance very significantly but it only happens when
      we are writing to an area that was completely unallocated before. Zero
      clusters (QCOW2_CLUSTER_ZERO_*) are treated like normal clusters and
      are therefore slower to allocate.
      
      This happens because the code uses bdrv_is_allocated_above() rather
      than bdrv_block_status_above(). The former is not as accurate for
      this purpose but it is faster. However, in the case of qcow2 the
      underlying call already reports zero clusters just fine, so there is
      no reason why we cannot use that information.
      
      In tests of 4KB writes on an image that only contains zero clusters,
      this patch results in almost five times more IOPS.
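      A simplified sketch of the switch described above (QEMU-internal
      API; the loop is condensed from the real qcow2 code, not a drop-in
      replacement): ask for full block status, where zero clusters are
      visible, instead of the bare allocated/unallocated answer that
      bdrv_is_allocated_above() gives.

          static bool range_reads_as_zero(BlockDriverState *bs,
                                          int64_t offset, int64_t bytes)
          {
              while (bytes > 0) {
                  int64_t pnum;
                  int ret = bdrv_block_status_above(bs, NULL, offset, bytes,
                                                    &pnum, NULL, NULL);
                  if (ret < 0 || !(ret & BDRV_BLOCK_ZERO)) {
                      return false;  /* error, or not known to be zero */
                  }
                  offset += pnum;
                  bytes -= pnum;
              }
              return true;           /* whole range reads back as zeroes */
          }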
      
      Signed-off-by: Alberto Garcia <berto@igalia.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <6d77cab968c501c44d6e1089b9bc91b04170b49e.1603731354.git.berto@igalia.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • qcow2: Report BDRV_BLOCK_ZERO more accurately in bdrv_co_block_status() · d40f4a56
      Alberto Garcia authored
      
      If a BlockDriverState supports backing files but has none, then any
      unallocated area reads back as zeroes.

      bdrv_co_block_status() only reports this if want_zero is true, but
      this is an inexpensive test and there is no reason not to do it in
      all cases.
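      A sketch of the hoisted condition (a simplified fragment; the flag
      and field names follow QEMU's block layer, but this is not a
      drop-in): an unallocated region on a driver that supports backing
      files, with none attached, is known to read as zeroes regardless of
      want_zero.

          if (!(ret & (BDRV_BLOCK_DATA | BDRV_BLOCK_ZERO)) &&
              bs->drv->supports_backing && !bs->backing) {
              ret |= BDRV_BLOCK_ZERO;  /* cheap test, do it unconditionally */
          }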
      
      Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Signed-off-by: Alberto Garcia <berto@igalia.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <66fa0914a0e2b727ab6d1b63ca773d7cd29a9a9e.1603731354.git.berto@igalia.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
  4. Oct 23, 2020
  5. Oct 22, 2020
    • replay: do not build if TCG is not available · 9b1c9116
      Claudio Fontana authored
      
      This fixes non-TCG builds broken recently by replay reverse debugging.
      
      Stub the needed functions in stubs/, splitting roughly between functions
      needed only by system emulation, by system emulation and tools,
      and by everyone. This duplicates some code in replay/ and puts the
      logic for non-replay related events in the replay/ module (plus
      the stubs), so this should be revisited in the future.
      
      Surprisingly, only _one_ qtest was affected by this, ide-test.c, which
      resulted in a buzz as the bh events were never delivered, and the bh
      never executed.
      
      Many other subsystems _should_ have been affected.
      
      This fixes the immediate issue; however, a better way to group replay
      functionality into TCG-only code could be developed in the long term.
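      A sketch of the stub pattern (hedged: replay_bh_schedule_event() is
      one of the replay entry points for bh events, but this body is an
      illustration of the fallback shape, not copied from the patch): with
      replay compiled out, the entry point collapses to the plain
      non-replay behaviour, so callers need no #ifdefs.

          /* stubs/ (sketch) */
          void replay_bh_schedule_event(QEMUBH *bh)
          {
              qemu_bh_schedule(bh);  /* no recording: just run the event */
          }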
      
      Signed-off-by: Claudio Fontana <cfontana@suse.de>
      Message-Id: <20201013192123.22632-4-cfontana@suse.de>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  6. Oct 15, 2020
  7. Oct 13, 2020
    • block/blkdebug: fix memory leak · 5b4c95d0
      Elena Afanasova authored
      
      Spotted by PVS-Studio
      
      Signed-off-by: Elena Afanasova <eafanasova@gmail.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <1e903f928eb3da332cc95e2a6f87243bd9fe66e4.camel@gmail.com>
      Signed-off-by: Laurent Vivier <laurent@vivier.eu>
    • vmdk: fix maybe uninitialized warnings · cd466702
      Christian Borntraeger authored
      
      Fedora 32 gcc 10 seems to give false positives:
      
      Compiling C object libblock.fa.p/block_vmdk.c.o
      ../block/vmdk.c: In function ‘vmdk_parse_extents’:
      ../block/vmdk.c:587:5: error: ‘extent’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
        587 |     g_free(extent->l1_table);
            |     ^~~~~~~~~~~~~~~~~~~~~~~~
      ../block/vmdk.c:754:17: note: ‘extent’ was declared here
        754 |     VmdkExtent *extent;
            |                 ^~~~~~
      ../block/vmdk.c:620:11: error: ‘extent’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
        620 |     ret = vmdk_init_tables(bs, extent, errp);
            |           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      ../block/vmdk.c:598:17: note: ‘extent’ was declared here
        598 |     VmdkExtent *extent;
            |                 ^~~~~~
      ../block/vmdk.c:1178:39: error: ‘extent’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
       1178 |             extent->flat_start_offset = flat_offset << 9;
            |             ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~
      ../block/vmdk.c: In function ‘vmdk_open_vmdk4’:
      ../block/vmdk.c:581:22: error: ‘extent’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
        581 |     extent->l2_cache =
            |     ~~~~~~~~~~~~~~~~~^
        582 |         g_malloc(extent->entry_size * extent->l2_size * L2_CACHE_SIZE);
            |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      ../block/vmdk.c:872:17: note: ‘extent’ was declared here
        872 |     VmdkExtent *extent;
            |                 ^~~~~~
      ../block/vmdk.c: In function ‘vmdk_open’:
      ../block/vmdk.c:620:11: error: ‘extent’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
        620 |     ret = vmdk_init_tables(bs, extent, errp);
            |           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      ../block/vmdk.c:598:17: note: ‘extent’ was declared here
        598 |     VmdkExtent *extent;
            |                 ^~~~~~
      cc1: all warnings being treated as errors
      make: *** [Makefile.ninja:884: libblock.fa.p/block_vmdk.c.o] Error 1
      
      Fix them by assigning a default value.
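      The shape of the fix, per the declarations quoted in the warnings
      above (the initializer is the whole change):

          VmdkExtent *extent = NULL;  /* silence -Wmaybe-uninitialized */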
      
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: Fam Zheng <fam@euphon.net>
      Message-Id: <20200930155859.303148-2-borntraeger@de.ibm.com>
      Signed-off-by: Laurent Vivier <laurent@vivier.eu>
  8. Oct 09, 2020
    • block/nbd: nbd_co_reconnect_loop(): don't connect if drained · 99d72dba
      Vladimir Sementsov-Ogievskiy authored
      
      In a recent commit 12c75e20 we improved
      nbd_co_reconnect_loop() to not make drain wait for an additional sleep.
      Similarly, we shouldn't try to connect if the previous sleep was
      interrupted by drain begin; otherwise, drain_begin will have to wait
      for the whole connection attempt.
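      A sketch of the check (the field names are modelled on block/nbd.c of
      this era but simplified, not the exact patch): bail out of the
      reconnect loop before attempting a connection once a drained section
      has begun.

          if (s->drained) {
              s->wait_drained_end = true;
              qemu_coroutine_yield();  /* resume only after drained_end */
          }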
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20200903190301.367620-5-vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
    • block/nbd: fix reconnect-delay · 46f56631
      Vladimir Sementsov-Ogievskiy authored
      
      reconnect-delay has a design flaw: we handle it in the same loop where
      we make the connection attempt, so reconnect-delay may be exceeded by
      the unpredictable duration of that attempt.

      Let's use a separate timer instead.
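      A sketch of the separate-timer approach using QEMU's timer API
      (reconnect_delay_timer_cb and the state fields are illustrative, not
      the exact patch): arm the timer once when the connection is lost, so
      a slow connect() can no longer stretch the delay.

          s->reconnect_delay_timer =
              aio_timer_new(bdrv_get_aio_context(bs), QEMU_CLOCK_REALTIME,
                            SCALE_NS, reconnect_delay_timer_cb, s);
          timer_mod(s->reconnect_delay_timer,
                    qemu_clock_get_ns(QEMU_CLOCK_REALTIME) +
                    s->reconnect_delay * NANOSECONDS_PER_SECOND);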
      
      How to reproduce the bug:
      
      1. Create an image on node1:
         qemu-img create -f qcow2 xx 100M
      
      2. Start NBD server on node1:
         qemu-nbd xx
      
      3. On node2 start qemu-io:
      
      ./build/qemu-io --image-opts \
      driver=nbd,server.type=inet,server.host=192.168.100.5,server.port=10809,reconnect-delay=15
      
      4. Type 'read 0 512' in the qemu-io interface to check that the
         connection works
      
      Be careful: you should perform steps 5-7 within a short time, less
      than 15 seconds.
      
      5. Kill nbd server on node1
      
      6. Run 'read 0 512' in the qemu-io interface again, to be sure that
         the nbd client enters the reconnect loop.
      
      7. On node1 run the following command
      
         sudo iptables -A INPUT -p tcp --dport 10809 -j DROP
      
      This will make the connect() call of qemu-io on node2 take a long time.

      And you'll see that the read command in qemu-io hangs for a long time,
      more than the 15 seconds specified by the reconnect-delay parameter.
      That's the bug.
      
      8. Don't forget to drop iptables rule on node1:
      
         sudo iptables -D INPUT -p tcp --dport 10809 -j DROP
      
      Important note: Step [5] is necessary to reproduce _this_ bug. If we
      skip step [5], the read command (step 6) will still hang for a long
      time and this commit doesn't help, because there will be no long
      connect() to an unreachable host, but a long sendmsg() to an
      unreachable host, which should be fixed by enabling and adjusting
      keep-alive on the socket; that is a matter for a further patch set.
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20200903190301.367620-4-vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
    • block/nbd: correctly use qio_channel_detach_aio_context when needed · 8a509afd
      Vladimir Sementsov-Ogievskiy authored
      
      Don't use the nbd_client_detach_aio_context() driver handler where we
      want to finalize the connection. We should directly use
      qio_channel_detach_aio_context() in such cases. The driver handler
      may (and will) contain other things, unrelated to the qio channel.
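      The distinction in a sketch (simplified; s->ioc follows block/nbd.c
      naming, the surrounding comments are illustrative):

          /* Finalizing the connection: detach only the channel. */
          qio_channel_detach_aio_context(QIO_CHANNEL(s->ioc));

          /* nbd_client_detach_aio_context(bs) remains the driver-wide
           * hook; it also touches state unrelated to the qio channel. */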
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20200903190301.367620-3-vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
    • block/nbd: fix drain dead-lock because of nbd reconnect-delay · 8c517de2
      Vladimir Sementsov-Ogievskiy authored
      
      We pause the reconnect process during a drained section. So, if we
      have requests waiting for the reconnect, we should cancel them;
      otherwise they deadlock the drained section.
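      A sketch of the cancellation (the request bookkeeping is illustrative,
      not the exact patch): on drain, fail every request still parked
      waiting for the reconnect and wake its coroutine so the drained
      section can make progress.

          for (int i = 0; i < MAX_NBD_REQUESTS; i++) {
              NBDClientRequest *req = &s->requests[i];

              if (req->coroutine && req->receiving) {
                  req->ret = -EIO;              /* fail instead of waiting */
                  aio_co_wake(req->coroutine);  /* unblock the drain */
              }
          }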
      
      How to reproduce:
      
      1. Create an image:
         qemu-img create -f qcow2 xx 100M
      
      2. Start NBD server:
         qemu-nbd xx
      
      3. Start vm with second nbd disk on node2, like this:
      
        ./build/x86_64-softmmu/qemu-system-x86_64 -nodefaults -drive \
           file=/work/images/cent7.qcow2 -drive \
           driver=nbd,server.type=inet,server.host=192.168.100.5,server.port=10809,reconnect-delay=60 \
           -vnc :0 -m 2G -enable-kvm -vga std
      
      4. Access the vm through vnc (or some other way?), and check that NBD
         drive works:
      
         dd if=/dev/sdb of=/dev/null bs=1M count=10
      
         - the command should succeed.
      
      5. Now, kill the nbd server, and run dd in the guest again:
      
         dd if=/dev/sdb of=/dev/null bs=1M count=10
      
      Now QEMU is trying to reconnect, and the dd-generated requests are
      waiting for the connection (they will wait up to 60 seconds (see the
      reconnect-delay option above) and then fail). But suddenly, the VM
      may totally hang in the deadlock. You may need to increase the
      reconnect-delay period to catch the dead-lock.
      
      The VM doesn't respond because the drain dead-lock happens in the CPU
      thread with the global mutex taken. That's not a good thing by itself,
      and it is not fixed by this commit (the true way is using iothreads).
      Still, this commit fixes the drain dead-lock itself.
      
      Note: probably, we could instead continue to reconnect during the
      drained section. To achieve this, we may move negotiation to the
      connect thread to make it independent of the bs aio context. But
      expanding the drained section doesn't seem good anyway. So, let's fix
      the bug in the simplest way for now.
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20200903190301.367620-2-vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
  9. Oct 06, 2020
  10. Oct 05, 2020