Skip to content
Snippets Groups Projects
  1. Oct 09, 2020
    • Eric Blake's avatar
      nbd: Simplify meta-context parsing · ebd57062
      Eric Blake authored
      
      We had a premature optimization of trying to read as little from the
      wire as possible while handling NBD_OPT_SET_META_CONTEXT in phases.
      But in reality, we HAVE to read the entire string from the client
      before we can get to the next command, and it is easier to just read
      it all at once than it is to read it in pieces.  And once we do that,
      several functions end up no longer performing I/O, so they can drop
      length and errp parameters, and just return a bool instead of
      modifying through a pointer.
      
      Our iotests still pass; I also checked that libnbd's testsuite (which
      covers more corner cases of odd meta context requests) still passes.
      There are cases where the sequence of trace messages produced differs
      (for example, when no bitmap is exported, a query for "qemu:" now
      produces two trace lines instead of one), but trace points are for
      debug and have no effect on what the client sees.
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Message-Id: <20200930121105.667049-4-eblake@redhat.com>
      Reviewed-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      [eblake: enhance commit message]
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      ebd57062
    • Eric Blake's avatar
      nbd/server: Reject embedded NUL in NBD strings · d1e2c3e7
      Eric Blake authored
      
      The NBD spec is clear that any string sent from the client must not
      contain embedded NUL characters.  If the client passes "a\0", we
      should reject that option request rather than act on "a".
      
      Testing this is not possible with a compliant client, but I was able
      to use gdb to coerce libnbd into temporarily behaving as such a
      client.
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Message-Id: <20200930121105.667049-3-eblake@redhat.com>
      Reviewed-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      d1e2c3e7
    • Christian Borntraeger's avatar
      nbd: silence maybe-uninitialized warnings · bbc35fc2
      Christian Borntraeger authored
      
      gcc 10 from Fedora 32 gives me:
      
      Compiling C object libblock.fa.p/nbd_server.c.o
      ../nbd/server.c: In function ‘nbd_co_client_start’:
      ../nbd/server.c:625:14: error: ‘namelen’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
        625 |         rc = nbd_negotiate_send_info(client, NBD_INFO_NAME, namelen, name,
            |              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        626 |                                      errp);
            |                                      ~~~~~
      ../nbd/server.c:564:14: note: ‘namelen’ was declared here
        564 |     uint32_t namelen;
            |              ^~~~~~~
      cc1: all warnings being treated as errors
      
      As I cannot see how this can happen, let uns silence the warning.
      
      Signed-off-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      Message-Id: <20200930155859.303148-3-borntraeger@de.ibm.com>
      Reviewed-by: default avatarEric Blake <eblake@redhat.com>
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      bbc35fc2
  2. Oct 02, 2020
  3. Sep 07, 2020
  4. Aug 21, 2020
  5. Jul 28, 2020
    • Eric Blake's avatar
      nbd: Fix large trim/zero requests · 890cbccb
      Eric Blake authored
      Although qemu as NBD client limits requests to <2G, the NBD protocol
      allows clients to send requests almost all the way up to 4G.  But
      because our block layer is not yet 64-bit clean, we accidentally wrap
      such requests into a negative size, and fail with EIO instead of
      performing the intended operation.
      
      The bug is visible in modern systems with something as simple as:
      
      $ qemu-img create -f qcow2 /tmp/image.img 5G
      $ sudo qemu-nbd --connect=/dev/nbd0 /tmp/image.img
      $ sudo blkdiscard /dev/nbd0
      
      or with user-space only:
      
      $ truncate --size=3G file
      $ qemu-nbd -f raw file
      $ nbdsh -u nbd://localhost:10809 -c 'h.trim(3*1024*1024*1024,0)'
      
      Although both blk_co_pdiscard and blk_pwrite_zeroes currently return 0
      on success, this is also a good time to fix our code to a more robust
      paradigm that treats all non-negative values as success.
      
      Alas, our iotests do not currently make it easy to add external
      dependencies on blkdiscard or nbdsh, so we have to rely on manual
      testing for now.
      
      This patch can be reverted when we later improve the overall block
      layer to be 64-bit clean, but for now, a minimal fix was deemed less
      risky prior to release.
      
      CC: qemu-stable@nongnu.org
      Fixes: 1f4d6d18
      Fixes: 1c6c4bb7
      Fixes: https://github.com/systemd/systemd/issues/16242
      
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Message-Id: <20200722212231.535072-1-eblake@redhat.com>
      Reviewed-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      [eblake: rework success tests to use >=0]
      890cbccb
  6. Jul 17, 2020
    • Vladimir Sementsov-Ogievskiy's avatar
      nbd: make nbd_export_close_all() synchronous · 453cc6be
      Vladimir Sementsov-Ogievskiy authored
      
      Consider nbd_export_close_all(). The call-stack looks like this:
       nbd_export_close_all() -> nbd_export_close -> call client_close() for
      each client.
      
      client_close() doesn't guarantee that client is closed: nbd_trip()
      keeps reference to it. So, nbd_export_close_all() just reduce
      reference counter on export and removes it from the list, but doesn't
      guarantee that nbd_trip() finished neither export actually removed.
      
      Let's wait for all exports actually removed.
      
      Without this fix, the following crash is possible:
      
      - export bitmap through internal Qemu NBD server
      - connect a client
      - shutdown Qemu
      
      On shutdown nbd_export_close_all is called, but it actually don't wait
      for nbd_trip() to finish and to release its references. So, export is
      not release, and exported bitmap remains busy, and on try to remove the
      bitmap (which is part of bdrv_close()) the assertion fails:
      
      bdrv_release_dirty_bitmap_locked: Assertion `!bdrv_dirty_bitmap_busy(bitmap)' failed
      
      Signed-off-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Reviewed-by: default avatarEric Blake <eblake@redhat.com>
      Message-Id: <20200714162234.13113-2-vsementsov@virtuozzo.com>
      Signed-off-by: default avatarKevin Wolf <kwolf@redhat.com>
      453cc6be
  7. Jul 10, 2020
    • Vladimir Sementsov-Ogievskiy's avatar
      nbd: Use ERRP_GUARD() · 795d946d
      Vladimir Sementsov-Ogievskiy authored
      
      If we want to check error after errp-function call, we need to
      introduce local_err and then propagate it to errp. Instead, use
      the ERRP_GUARD() macro, benefits are:
      1. No need of explicit error_propagate call
      2. No need of explicit local_err variable: use errp directly
      3. ERRP_GUARD() leaves errp as is if it's not NULL or
         &error_fatal, this means that we don't break error_abort
         (we'll abort on error_set, not on error_propagate)
      
      If we want to add some info to errp (by error_prepend() or
      error_append_hint()), we must use the ERRP_GUARD() macro.
      Otherwise, this info will not be added when errp == &error_fatal
      (the program will exit prior to the error_append_hint() or
      error_prepend() call).  Fix several such cases, e.g. in nbd_read().
      
      This commit is generated by command
      
          sed -n '/^Network Block Device (NBD)$/,/^$/{s/^F: //p}' \
              MAINTAINERS | \
          xargs git ls-files | grep '\.[hc]$' | \
          xargs spatch \
              --sp-file scripts/coccinelle/errp-guard.cocci \
              --macro-file scripts/cocci-macro-file.h \
              --in-place --no-show-diff --max-width 80
      
      Reported-by: default avatarKevin Wolf <kwolf@redhat.com>
      Reported-by: default avatarGreg Kurz <groug@kaod.org>
      Signed-off-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Reviewed-by: default avatarMarkus Armbruster <armbru@redhat.com>
      [Commit message tweaked]
      Signed-off-by: default avatarMarkus Armbruster <armbru@redhat.com>
      Message-Id: <20200707165037.1026246-8-armbru@redhat.com>
      Reviewed-by: default avatarEric Blake <eblake@redhat.com>
      [ERRP_AUTO_PROPAGATE() renamed to ERRP_GUARD(), and
      auto-propagated-errp.cocci to errp-guard.cocci.  Commit message
      tweaked again.]
      795d946d
  8. Jun 10, 2020
    • Eric Blake's avatar
      nbd/server: Avoid long error message assertions CVE-2020-10761 · 5c4fe018
      Eric Blake authored
      
      Ever since commit 36683283 (v2.8), the server code asserts that error
      strings sent to the client are well-formed per the protocol by not
      exceeding the maximum string length of 4096.  At the time the server
      first started sending error messages, the assertion could not be
      triggered, because messages were completely under our control.
      However, over the years, we have added latent scenarios where a client
      could trigger the server to attempt an error message that would
      include the client's information if it passed other checks first:
      
      - requesting NBD_OPT_INFO/GO on an export name that is not present
        (commit 0cfae925 in v2.12 echoes the name)
      
      - requesting NBD_OPT_LIST/SET_META_CONTEXT on an export name that is
        not present (commit e7b1948d in v2.12 echoes the name)
      
      At the time, those were still safe because we flagged names larger
      than 256 bytes with a different message; but that changed in commit
      93676c88 (v4.2) when we raised the name limit to 4096 to match the NBD
      string limit.  (That commit also failed to change the magic number
      4096 in nbd_negotiate_send_rep_err to the just-introduced named
      constant.)  So with that commit, long client names appended to server
      text can now trigger the assertion, and thus be used as a denial of
      service attack against a server.  As a mitigating factor, if the
      server requires TLS, the client cannot trigger the problematic paths
      unless it first supplies TLS credentials, and such trusted clients are
      less likely to try to intentionally crash the server.
      
      We may later want to further sanitize the user-supplied strings we
      place into our error messages, such as scrubbing out control
      characters, but that is less important to the CVE fix, so it can be a
      later patch to the new nbd_sanitize_name.
      
      Consideration was given to changing the assertion in
      nbd_negotiate_send_rep_verr to instead merely log a server error and
      truncate the message, to avoid leaving a latent path that could
      trigger a future CVE DoS on any new error message.  However, this
      merely complicates the code for something that is already (correctly)
      flagging coding errors, and now that we are aware of the long message
      pitfall, we are less likely to introduce such errors in the future,
      which would make such error handling dead code.
      
      Reported-by: default avatarXueqiang Wei <xuwei@redhat.com>
      CC: qemu-stable@nongnu.org
      Fixes: https://bugzilla.redhat.com/1843684
      
       CVE-2020-10761
      Fixes: 93676c88
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Message-Id: <20200610163741.3745251-2-eblake@redhat.com>
      Reviewed-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      5c4fe018
  9. Mar 18, 2020
  10. Feb 26, 2020
    • Eric Blake's avatar
      nbd: Fix regression with multiple meta contexts · 73e064cc
      Eric Blake authored
      
      Detected by a hang in the libnbd testsuite.  If a client requests
      multiple meta contexts (both base:allocation and qemu:dirty-bitmap:x)
      at the same time, our attempt to silence a false-positive warning
      about a potential uninitialized variable introduced botched logic: we
      were short-circuiting the second context, and never sending the
      NBD_REPLY_FLAG_DONE.  Combining two 'if' into one 'if/else' in
      bdf200a5 was wrong (I'm a bit embarrassed that such a change was my
      initial suggestion after the v1 patch, then I did not review the v2
      patch that actually got committed). Revert that, and instead silence
      the false positive warning by replacing 'return ret' with 'return 0'
      (the value it always has at that point in the code, even though it
      eluded the deduction abilities of the robot that reported the false
      positive).
      
      Fixes: bdf200a5
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Message-Id: <20200206173832.130004-1-eblake@redhat.com>
      Reviewed-by: default avatarLaurent Vivier <laurent@vivier.eu>
      73e064cc
  11. Jan 08, 2020
  12. Nov 18, 2019
    • Eric Blake's avatar
      nbd: Don't send oversize strings · 93676c88
      Eric Blake authored
      
      Qemu as server currently won't accept export names larger than 256
      bytes, nor create dirty bitmap names longer than 1023 bytes, so most
      uses of qemu as client or server have no reason to get anywhere near
      the NBD spec maximum of a 4k limit per string.
      
      However, we weren't actually enforcing things, ignoring when the
      remote side violates the protocol on input, and also having several
      code paths where we send oversize strings on output (for example,
      qemu-nbd --description could easily send more than 4k).  Tighten
      things up as follows:
      
      client:
      - Perform bounds check on export name and dirty bitmap request prior
        to handing it to server
      - Validate that copied server replies are not too long (ignoring
        NBD_INFO_* replies that are not copied is not too bad)
      server:
      - Perform bounds check on export name and description prior to
        advertising it to client
      - Reject client name or metadata query that is too long
      - Adjust things to allow full 4k name limit rather than previous
        256 byte limit
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Message-Id: <20191114024635.11363-4-eblake@redhat.com>
      Reviewed-by: default avatarMaxim Levitsky <mlevitsk@redhat.com>
      Reviewed-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      93676c88
    • Eric Blake's avatar
      nbd/server: Prefer heap over stack for parsing client names · 9d7ab222
      Eric Blake authored
      
      As long as we limit NBD names to 256 bytes (the bare minimum permitted
      by the standard), stack-allocation works for parsing a name received
      from the client.  But as mentioned in a comment, we eventually want to
      permit up to the 4k maximum of the NBD standard, which is too large
      for stack allocation; so switch everything in the server to use heap
      allocation.  For now, there is no change in actually supported name
      length.
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Message-Id: <20191114024635.11363-2-eblake@redhat.com>
      [eblake: fix uninit variable compile failure]
      Reviewed-by: default avatarMaxim Levitsky <mlevitsk@redhat.com>
      Reviewed-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      9d7ab222
  13. Sep 24, 2019
    • Eric Blake's avatar
      nbd: Grab aio context lock in more places · 61bc846d
      Eric Blake authored
      
      When iothreads are in use, the failure to grab the aio context results
      in an assertion failure when trying to unlock things during blk_unref,
      when trying to unlock a mutex that was not locked.  In short, all
      calls to nbd_export_put need to done while within the correct aio
      context.  But since nbd_export_put can recursively reach itself via
      nbd_export_close, and recursively grabbing the context would deadlock,
      we can't do the context grab directly in those functions, but must do
      so in their callers.
      
      Hoist the use of the correct aio_context from nbd_export_new() to its
      caller qmp_nbd_server_add().  Then tweak qmp_nbd_server_remove(),
      nbd_eject_notifier(), and nbd_esport_close_all() to grab the right
      context, so that all callers during qemu now own the context before
      nbd_export_put() can call blk_unref().
      
      Remaining uses in qemu-nbd don't matter (since that use case does not
      support iothreads).
      
      Suggested-by: default avatarKevin Wolf <kwolf@redhat.com>
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Message-Id: <20190917023917.32226-1-eblake@redhat.com>
      Reviewed-by: default avatarSergio Lopez <slp@redhat.com>
      61bc846d
    • Sergio Lopez's avatar
      nbd/server: attach client channel to the export's AioContext · b4961249
      Sergio Lopez authored
      On creation, the export's AioContext is set to the same one as the
      BlockBackend, while the AioContext in the client QIOChannel is left
      untouched.
      
      As a result, when using data-plane, nbd_client_receive_next_request()
      schedules coroutines in the IOThread AioContext, while the client's
      QIOChannel is serviced from the main_loop, potentially triggering the
      assertion at qio_channel_restart_[read|write].
      
      To fix this, as soon we have the export corresponding to the client,
      we call qio_channel_attach_aio_context() to attach the QIOChannel
      context to the export's AioContext. This matches with the logic at
      blk_aio_attached().
      
      RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1748253
      
      
      Signed-off-by: default avatarSergio Lopez <slp@redhat.com>
      Message-Id: <20190912110032.26395-1-slp@redhat.com>
      Reviewed-by: default avatarEric Blake <eblake@redhat.com>
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      b4961249
    • Eric Blake's avatar
      nbd/client: Add hint when TLS is missing · 1b5c15ce
      Eric Blake authored
      
      I received an off-list report of failure to connect to an NBD server
      expecting an x509 certificate, when the client was attempting something
      similar to this command line:
      
      $ ./x86_64-softmmu/qemu-system-x86_64 -name 'blah' -machine q35 -nodefaults \
        -object tls-creds-x509,id=tls0,endpoint=client,dir=$path_to_certs \
        -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie.0,addr=0x6 \
        -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=raw,file=nbd:localhost:9000,werror=stop,rerror=stop,tls-creds=tls0 \
        -device scsi-hd,id=image1,drive=drive_image1,bootindex=0
      qemu-system-x86_64: -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=raw,file=nbd:localhost:9000,werror=stop,rerror=stop,tls-creds=tls0: TLS negotiation required before option 7 (go)
      server reported: Option 0x7 not permitted before TLS
      
      The problem?  As specified, -drive is trying to pass tls-creds to the
      raw format driver instead of the nbd protocol driver, but before we
      get to the point where we can detect that raw doesn't know what to do
      with tls-creds, the nbd driver has already failed because the server
      complained.  The fix to the broken command line?  Pass
      '...,file.tls-creds=tls0' to ensure the tls-creds option is handed to
      nbd, not raw.  But since the error message was rather cryptic, I'm
      trying to improve the error message.
      
      With this patch, the error message adds a line:
      
      qemu-system-x86_64: -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=raw,file=nbd:localhost:9000,werror=stop,rerror=stop,tls-creds=tls0: TLS negotiation required before option 7 (go)
      Did you forget a valid tls-creds?
      server reported: Option 0x7 not permitted before TLS
      
      And with luck, someone grepping for that error message will find this
      commit message and figure out their command line mistake.  Sadly, the
      only mention of file.tls-creds in our docs relates to an --image-opts
      use of PSK encryption with qemu-img as the client, rather than x509
      certificate encryption with qemu-kvm as the client.
      
      CC: Tingting Mao <timao@redhat.com>
      CC: Daniel P. Berrangé <berrange@redhat.com>
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Message-Id: <20190907172055.26870-1-eblake@redhat.com>
      [eblake: squash in iotest 233 fix]
      Reviewed-by: default avatarDaniel P. Berrangé <berrange@redhat.com>
      1b5c15ce
  14. Sep 18, 2019
  15. Sep 05, 2019
    • Eric Blake's avatar
      nbd: Implement server use of NBD FAST_ZERO · b491dbb7
      Eric Blake authored
      
      The server side is fairly straightforward: we can always advertise
      support for detection of fast zero, and implement it by mapping the
      request to the block layer BDRV_REQ_NO_FALLBACK.
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Message-Id: <20190823143726.27062-5-eblake@redhat.com>
      Reviewed-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      [eblake: update iotests 223, 233]
      b491dbb7
    • Eric Blake's avatar
      nbd: Prepare for NBD_CMD_FLAG_FAST_ZERO · 0a479545
      Eric Blake authored
      
      Commit fe0480d6 and friends added BDRV_REQ_NO_FALLBACK as a way to
      avoid wasting time on a preliminary write-zero request that will later
      be rewritten by actual data, if it is known that the write-zero
      request will use a slow fallback; but in doing so, could not optimize
      for NBD.  The NBD specification is now considering an extension that
      will allow passing on those semantics; this patch updates the new
      protocol bits and 'qemu-nbd --list' output to recognize the bit, as
      well as the new errno value possible when using the new flag; while
      upcoming patches will improve the client to use the feature when
      present, and the server to advertise support for it.
      
      The NBD spec recommends (but not requires) that ENOTSUP be avoided for
      all but failures of a fast zero (the only time it is mandatory to
      avoid an ENOTSUP failure is when fast zero is supported but not
      requested during write zeroes; the questionable use is for ENOTSUP to
      other actions like a normal write request).  However, clients that get
      an unexpected ENOTSUP will either already be treating it the same as
      EINVAL, or may appreciate the extra bit of information.  We were
      equally loose for returning EOVERFLOW in more situations than
      recommended by the spec, so if it turns out to be a problem in
      practice, a later patch can tighten handling for both error codes.
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Message-Id: <20190823143726.27062-3-eblake@redhat.com>
      Reviewed-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      [eblake: tweak commit message, also handle EOPNOTSUPP]
      0a479545
    • Eric Blake's avatar
      nbd: Improve per-export flag handling in server · dbb38caa
      Eric Blake authored
      
      When creating a read-only image, we are still advertising support for
      TRIM and WRITE_ZEROES to the client, even though the client should not
      be issuing those commands.  But seeing this requires looking across
      multiple functions:
      
      All callers to nbd_export_new() passed a single flag based solely on
      whether the export allows writes.  Later, we then pass a constant set
      of flags to nbd_negotiate_options() (namely, the set of flags which we
      always support, at least for writable images), which is then further
      dynamically modified with NBD_FLAG_SEND_DF based on client requests
      for structured options.  Finally, when processing NBD_OPT_EXPORT_NAME
      or NBD_OPT_EXPORT_GO we bitwise-or the original caller's flag with the
      runtime set of flags we've built up over several functions.
      
      Let's refactor things to instead compute a baseline of flags as soon
      as possible which gets shared between multiple clients, in
      nbd_export_new(), and changing the signature for the callers to pass
      in a simpler bool rather than having to figure out flags.  We can then
      get rid of the 'myflags' parameter to various functions, and instead
      refer to client for everything we need (we still have to perform a
      bitwise-OR for NBD_FLAG_SEND_DF during NBD_OPT_EXPORT_NAME and
      NBD_OPT_EXPORT_GO, but it's easier to see what is being computed).
      This lets us quit advertising senseless flags for read-only images, as
      well as making the next patch for exposing FAST_ZERO support easier to
      write.
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Message-Id: <20190823143726.27062-2-eblake@redhat.com>
      Reviewed-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      [eblake: improve commit message, update iotest 223]
      dbb38caa
Loading