  1. Sep 29, 2021
    • block/nbd: drop connection_co · 4ddb5d2f
      Vladimir Sementsov-Ogievskiy authored
      
      OK, that's a big rewrite of the logic.
      
      Pre-patch we have an always-running coroutine, connection_co, which
      does both reply receiving and reconnecting. This leads to a lot of
      difficult and unobvious code around drained sections and context
      switches. We also abuse the bs->in_flight counter: it is increased
      for connection_co and temporarily decreased at the points where we
      want to allow a drained section to begin. One of these places is
      even in another file: nbd_read_eof() in nbd/client.c.
      
      We also cancel the reconnect, and the requests waiting for it, on
      drained_begin, which is not correct; this patch fixes that too.
      
      Let's finally drop this always-running coroutine and take another
      approach: do both reconnecting and receiving in the request
      coroutines.
      
      The detailed list of changes is below (in the order of the diff
      hunks).
      
      1. Receiving coroutines are woken directly from nbd_channel_error(),
         when we change s->state.
      
      2. nbd_co_establish_connection_cancel(): we no longer have
         drain_begin, and in nbd_teardown_connection() all requests should
         already be finished (reconnecting is done from the requests
         themselves). So nbd_co_establish_connection_cancel() is called
         from nbd_cancel_in_flight() (to cancel a request that is inside
         nbd_co_establish_connection()) and from
         reconnect_delay_timer_cb(). Previously we didn't need the latter,
         as the reconnect delay should only cancel active requests, not
         the reconnection itself; but now the reconnection itself is done
         in a separate thread (we now call
         nbd_client_connection_enable_retry() in nbd_open()), and we need
         to cancel the requests waiting in nbd_co_establish_connection().
      
      2A. We now receive reply headers in the request coroutines. But we
         also have to dispatch replies for other pending requests. So
         nbd_connection_entry() is turned into nbd_receive_replies(),
         which dispatches replies while receiving other requests' headers,
         and returns when it receives the requested header.
      
      3. All the old logic around drained sections and context switches is
         dropped. In detail:
         - we don't need to move connection_co to a new aio context, as we
           don't have connection_co anymore
         - we don't have a fake "request" for connection_co (the extra
           in_flight increment), so we don't need to handle it in
           drain_begin/end
         - we no longer stop reconnection during a drained section. This
           means that drain_begin may wait for a long time (up to
           reconnect_delay). But that's an improvement and more correct
           behavior; see [*] below.
      
      4. In nbd_teardown_connection() we no longer have to wait for
         connection_co, as it is dropped. The cleanup of s->ioc and
         nbd_yank is moved here from the removed connection_co.
      
      5. In nbd_co_do_establish_connection() we now have to handle
         NBD_CLIENT_CONNECTING_NOWAIT: if a new request comes while we are
         in NBD_CLIENT_CONNECTING_NOWAIT, it should still call
         nbd_co_establish_connection() (who knows, maybe the connection
         was already established by another thread in the background). But
         it shouldn't wait: if nbd_co_establish_connection() can't return
         a new channel immediately, the request should fail (we are in the
         NBD_CLIENT_CONNECTING_NOWAIT state).
      
      6. nbd_reconnect_attempt() is simplified: it's now easier to wait
         for other requests in the caller, so here we just assert that
         fact. The delay timer is also initialized here now: we can easily
         detect the first attempt and start the timer.
      
      7. nbd_co_reconnect_loop() is dropped; we don't need it. Reconnect
         retries are fully handled by the thread (nbd/client-connection.c),
         the delay timer is initialized in nbd_reconnect_attempt(), and we
         don't have to bother with s->drained and friends.
         nbd_reconnect_attempt() is now called from nbd_co_send_request().
      
      8. nbd_connection_entry() is dropped: reconnecting is now handled by
         nbd_co_send_request() and receiving replies by
         nbd_receive_replies(); everything is handled from the request
         coroutines.
      
      9. So, welcome the new nbd_receive_replies(), called from the
         request coroutines, which receives reply headers instead of
         nbd_connection_entry(). As with sending requests, only one
         coroutine may receive at a time, so we introduce receive_mutex,
         which is locked around nbd_receive_reply(). It also protects some
         related fields. Still, a full audit of thread-safety in the nbd
         driver is a separate task. The new function waits until a reply
         with the specified handle has been received, and works rather
         simply:
      
         Under the mutex:
           - if the current handle is 0, receive by hand. If another
             handle is received, switch to that request's coroutine,
             release the mutex and yield. Otherwise return success
           - if the current handle == the requested handle, we are done
           - otherwise, release the mutex and yield
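
The hand-off described above can be modeled outside QEMU. Below is a standalone pthread sketch, not QEMU's coroutine code: the names, the condition variable, and the simulated wire order are all invented for illustration. Each "request" thread wants the reply with its own handle; whoever finds no reply pending does the receive, and foreign replies are handed over to their owners.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_REQS 4

static pthread_mutex_t receive_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t wake = PTHREAD_COND_INITIALIZER;

static int incoming[NUM_REQS] = {3, 1, 4, 2}; /* simulated wire order */
static int next_in;
static int current_handle;                    /* 0: no reply pending */
static int completed;

static int recv_one_handle(void)   /* stands in for nbd_receive_reply() */
{
    return next_in < NUM_REQS ? incoming[next_in++] : -1;
}

static void wait_for_reply(int handle)
{
    pthread_mutex_lock(&receive_mutex);
    for (;;) {
        if (current_handle == handle) {
            current_handle = 0;             /* our reply: consume it */
            pthread_cond_broadcast(&wake);  /* let someone else receive */
            break;
        }
        if (current_handle == 0) {
            current_handle = recv_one_handle(); /* receive "by hand" */
            if (current_handle != handle) {
                pthread_cond_broadcast(&wake);  /* wake the owner... */
                pthread_cond_wait(&wake, &receive_mutex); /* ...and yield */
            }
        } else {
            /* a reply for someone else is pending: just yield */
            pthread_cond_wait(&wake, &receive_mutex);
        }
    }
    completed++;
    pthread_mutex_unlock(&receive_mutex);
}

static void *request_entry(void *arg)
{
    wait_for_reply((int)(size_t)arg);
    return NULL;
}
```

The loop re-checks state under the mutex after every wakeup, so no wakeup can be lost even if a reply arrives before its owner starts waiting.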
      
      10. In nbd_co_send_request() we now do nbd_reconnect_attempt() if
          needed. Also, while waiting in the free_sema queue we now wait
          for one of two conditions:
          - connectED, in_flight < MAX_NBD_REQUESTS (so we can start a
            new request)
          - connectING, in_flight == 0 (so we can call
            nbd_reconnect_attempt())
          This logic is protected by s->send_mutex.
      
          Also, on failure we no longer have to care about the removed
          s->connection_co.
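
The two wake-up conditions can be written as a small predicate. This is a simplified sketch, not the driver's actual code; the state names and MAX_NBD_REQUESTS mirror the driver, but the types are reduced to plain C:

```c
#include <stdbool.h>

#define MAX_NBD_REQUESTS 16

typedef enum {
    NBD_CLIENT_CONNECTED,
    NBD_CLIENT_CONNECTING_WAIT,
    NBD_CLIENT_CONNECTING_NOWAIT,
    NBD_CLIENT_QUIT,
} NBDClientState;

static bool nbd_client_connecting(NBDClientState st)
{
    return st == NBD_CLIENT_CONNECTING_WAIT ||
           st == NBD_CLIENT_CONNECTING_NOWAIT;
}

/* May a request stop waiting in free_sema and proceed?  Checked under
 * s->send_mutex in the real driver. */
static bool nbd_can_proceed(NBDClientState st, int in_flight)
{
    return (st == NBD_CLIENT_CONNECTED && in_flight < MAX_NBD_REQUESTS) ||
           (nbd_client_connecting(st) && in_flight == 0);
}
```

The second condition makes the reconnect attempt wait until all other requests have drained, which is exactly the fact nbd_reconnect_attempt() can now simply assert.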
      
      11. nbd_co_do_receive_one_chunk(): instead of yield()ing and waiting
          for s->connection_co, we now just call the new
          nbd_receive_replies().
      
      12. nbd_co_receive_one_chunk(): the place where s->reply.handle
          becomes 0, which means that handling of the whole reply is
          finished. Here we need to wake one of the coroutines sleeping in
          nbd_receive_replies(); if none are sleeping, do nothing. That's
          another behavior change: we no longer have an endless recv()
          during idle time. That may be considered a drawback; if so, it
          can be fixed later.
      
      13. nbd_reply_chunk_iter_receive(): don't care about the removed
          connection_co, just wake the in_flight waiters.
      
      14. Don't create connection_co; enable retry in the connection
          thread instead (we don't have our own reconnect loop anymore).
      
      15. We now need to add a nbd_co_establish_connection_cancel() call in
          nbd_cancel_in_flight(), to cancel the request that is doing a
          connection attempt.
      
      [*] OK, so now we don't cancel the reconnect on drain_begin. That's
          correct: the reconnect feature leads to the possibility of
          long-running requests (up to the reconnect delay). Still,
          drain_begin is not a reason to kill long requests; we should
          wait for them.
      
          This also means that we can again reproduce the deadlock
          described in 8c517de2.
          Why we are OK with it:
          1. This is no longer an absolute deadlock: the VM is unfrozen
             after the reconnect delay. Actually, 8c517de2 fixed a bug in
             the NBD logic that was not described there and led to a
             permanent deadlock: nobody woke the free_sema queue, but
             drain_begin can't finish while a request is waiting in the
             free_sema queue. Now we have a reconnect delay timer that
             works well.
          2. It's not a problem of the NBD driver but of the IDE code,
             because it does drain_begin under the global mutex; the
             problem doesn't reproduce when using scsi instead of ide.
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20210902103805.25686-5-vsementsov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      [eblake: grammar and comment tweaks]
      Signed-off-by: Eric Blake <eblake@redhat.com>
  2. Jul 10, 2020
    • nbd: Use ERRP_GUARD() · 795d946d
      Vladimir Sementsov-Ogievskiy authored
      
      If we want to check for an error after a call to a function taking
      errp, we need to introduce a local_err and then propagate it to
      errp. Instead, use the ERRP_GUARD() macro; the benefits are:
      1. No need for an explicit error_propagate() call
      2. No need for an explicit local_err variable: use errp directly
      3. ERRP_GUARD() leaves errp as is if it's not NULL or &error_fatal,
         which means we don't break error_abort (we'll abort on
         error_set, not on error_propagate)
      
      If we want to add some info to errp (by error_prepend() or
      error_append_hint()), we must use the ERRP_GUARD() macro.
      Otherwise, this info will not be added when errp == &error_fatal
      (the program will exit before the error_append_hint() or
      error_prepend() call).  Fix several such cases, e.g. in nbd_read().
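
The pattern the macro replaces can be illustrated with a self-contained mock of the Error API. This is NOT QEMU's real implementation, and ERRP_GUARD() itself appears only as a comment, since the real macro relies on QEMU internals:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Minimal mock of the Error API, just enough to show the local_err dance
 * that ERRP_GUARD() removes. */
typedef struct Error { char msg[256]; } Error;

static void error_setg(Error **errp, const char *msg)
{
    if (errp && !*errp) {
        *errp = calloc(1, sizeof(Error));
        strncpy((*errp)->msg, msg, sizeof((*errp)->msg) - 1);
    }
}

static void error_prepend(Error **errp, const char *pre)
{
    char buf[256];
    if (!errp || !*errp) {
        return;
    }
    snprintf(buf, sizeof(buf), "%s%s", pre, (*errp)->msg);
    memcpy((*errp)->msg, buf, sizeof(buf));
}

static void error_propagate(Error **dst, Error *src)
{
    if (dst && !*dst) {
        *dst = src;
    } else {
        free(src);
    }
}

static int nbd_read_mock(Error **errp)  /* always fails, for illustration */
{
    error_setg(errp, "read failed");
    return -1;
}

/* Pre-patch shape: local_err plus explicit propagation, so that
 * error_prepend() runs even when the caller passed &error_fatal. */
static int old_style(Error **errp)
{
    Error *local_err = NULL;

    if (nbd_read_mock(&local_err) < 0) {
        error_prepend(&local_err, "request header: ");
        error_propagate(errp, local_err);
        return -1;
    }
    return 0;
}

/* Post-patch shape: with ERRP_GUARD() at the top of the real function,
 * errp can be used directly and is re-propagated automatically on
 * return. */
static int new_style(Error **errp)
{
    /* ERRP_GUARD(); */
    if (nbd_read_mock(errp) < 0) {
        error_prepend(errp, "request header: ");
        return -1;
    }
    return 0;
}
```

Both shapes produce the same prepended message for an ordinary errp; the difference only matters for the &error_fatal / &error_abort sentinels, which the mock does not model.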
      
      This commit is generated by command
      
          sed -n '/^Network Block Device (NBD)$/,/^$/{s/^F: //p}' \
              MAINTAINERS | \
          xargs git ls-files | grep '\.[hc]$' | \
          xargs spatch \
              --sp-file scripts/coccinelle/errp-guard.cocci \
              --macro-file scripts/cocci-macro-file.h \
              --in-place --no-show-diff --max-width 80
      
      Reported-by: Kevin Wolf <kwolf@redhat.com>
      Reported-by: Greg Kurz <groug@kaod.org>
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Reviewed-by: Markus Armbruster <armbru@redhat.com>
      [Commit message tweaked]
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Message-Id: <20200707165037.1026246-8-armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      [ERRP_AUTO_PROPAGATE() renamed to ERRP_GUARD(), and
      auto-propagated-errp.cocci to errp-guard.cocci.  Commit message
      tweaked again.]
  3. Nov 18, 2019
    • nbd: Don't send oversize strings · 93676c88
      Eric Blake authored
      
      Qemu as server currently won't accept export names larger than 256
      bytes, nor create dirty bitmap names longer than 1023 bytes, so most
      uses of qemu as client or server have no reason to get anywhere near
      the NBD spec maximum of a 4k limit per string.
      
      However, we weren't actually enforcing the limits: we ignored it
      when the remote side violated the protocol on input, and we had
      several code paths where we sent oversize strings on output (for
      example, qemu-nbd --description could easily send more than 4k).
      Tighten things up as follows:
      
      client:
      - Perform bounds checks on the export name and dirty bitmap request
        prior to handing them to the server
      - Validate that copied server replies are not too long (ignoring
        NBD_INFO_* replies that are not copied is not too bad)
      server:
      - Perform bounds check on export name and description prior to
        advertising it to client
      - Reject client name or metadata query that is too long
      - Adjust things to allow full 4k name limit rather than previous
        256 byte limit
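
A minimal sketch of such a bounds check (NBD_MAX_STRING_SIZE is QEMU's name for the NBD spec's 4k per-string limit; the helper name here is invented for illustration):

```c
#include <errno.h>
#include <string.h>

/* The NBD spec caps every transmitted string at 4k; QEMU defines this as
 * NBD_MAX_STRING_SIZE in its headers. */
#define NBD_MAX_STRING_SIZE 4096

/* Returns 0 if the string may be sent, -EINVAL if it exceeds the spec's
 * limit and must be rejected before it reaches the wire. */
static int nbd_check_string_length(const char *s)
{
    return strlen(s) <= NBD_MAX_STRING_SIZE ? 0 : -EINVAL;
}
```

The same check applies symmetrically: the client validates names before sending a request, and the server validates names and descriptions before advertising them.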
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20191114024635.11363-4-eblake@redhat.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  4. Sep 24, 2019
    • nbd/client: Add hint when TLS is missing · 1b5c15ce
      Eric Blake authored
      
      I received an off-list report of failure to connect to an NBD server
      expecting an x509 certificate, when the client was attempting something
      similar to this command line:
      
      $ ./x86_64-softmmu/qemu-system-x86_64 -name 'blah' -machine q35 -nodefaults \
        -object tls-creds-x509,id=tls0,endpoint=client,dir=$path_to_certs \
        -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie.0,addr=0x6 \
        -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=raw,file=nbd:localhost:9000,werror=stop,rerror=stop,tls-creds=tls0 \
        -device scsi-hd,id=image1,drive=drive_image1,bootindex=0
      qemu-system-x86_64: -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=raw,file=nbd:localhost:9000,werror=stop,rerror=stop,tls-creds=tls0: TLS negotiation required before option 7 (go)
      server reported: Option 0x7 not permitted before TLS
      
      The problem?  As specified, -drive is trying to pass tls-creds to the
      raw format driver instead of the nbd protocol driver, but before we
      get to the point where we can detect that raw doesn't know what to do
      with tls-creds, the nbd driver has already failed because the server
      complained.  The fix to the broken command line?  Pass
      '...,file.tls-creds=tls0' to ensure the tls-creds option is handed to
      nbd, not raw.  But since the error message was rather cryptic, I'm
      trying to improve the error message.
      
      With this patch, the error message adds a line:
      
      qemu-system-x86_64: -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=raw,file=nbd:localhost:9000,werror=stop,rerror=stop,tls-creds=tls0: TLS negotiation required before option 7 (go)
      Did you forget a valid tls-creds?
      server reported: Option 0x7 not permitted before TLS
      
      And with luck, someone grepping for that error message will find this
      commit message and figure out their command line mistake.  Sadly, the
      only mention of file.tls-creds in our docs relates to an --image-opts
      use of PSK encryption with qemu-img as the client, rather than x509
      certificate encryption with qemu-kvm as the client.
      
      CC: Tingting Mao <timao@redhat.com>
      CC: Daniel P. Berrangé <berrange@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20190907172055.26870-1-eblake@redhat.com>
      [eblake: squash in iotest 233 fix]
      Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
  5. Sep 05, 2019
    • nbd: Tolerate more errors to structured reply request · 5de47735
      Eric Blake authored
      
      A server may have a reason to reject a request for structured replies,
      beyond just not recognizing them as a valid request; similarly, it may
      have a reason for rejecting a request for a meta context.  It doesn't
      hurt us to continue talking to such a server; otherwise 'qemu-nbd
      --list' of such a server fails to display all available details about
      the export.
      
      Encountered when temporarily tweaking nbdkit to reply with
      NBD_REP_ERR_POLICY.  Present since structured reply support was first
      added (commit d795299b reused starttls handling, but starttls is
      different in that we can't fall back to other behavior on any error).
      
      Note that for an unencrypted client trying to connect to a server that
      requires encryption, this defers the point of failure to when we
      finally execute a strict command (such as NBD_OPT_GO or NBD_OPT_LIST),
      now that the intermediate NBD_OPT_STRUCTURED_REPLY does not diagnose
      NBD_REP_ERR_TLS_REQD as fatal; but as the protocol eventually gets us
      to a command where we can't continue onwards, the changed error
      message doesn't cause any security concerns.
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20190824172813.29720-3-eblake@redhat.com>
      Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
      [eblake: fix iotest 233]
    • nbd: Use g_autofree in a few places · df18c04e
      Eric Blake authored
      
      Thanks to our recent move to use glib's g_autofree, I can join the
      bandwagon.  Getting rid of gotos is fun ;)
      
      There are probably more places where we could register cleanup
      functions and get rid of more gotos; this patch just focuses on the
      labels that existed merely to call g_free.
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20190824172813.29720-2-eblake@redhat.com>
      Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  6. Aug 16, 2019
  7. Aug 15, 2019
  8. Apr 08, 2019
  9. Apr 01, 2019
    • nbd/client: Reject inaccessible tail of inconsistent server · 3add3ab7
      Eric Blake authored
      
      The NBD spec suggests that a server should never advertise a size
      inconsistent with its minimum block alignment, as that tail is
      effectively inaccessible to a compliant client obeying those block
      constraints. Since we have a habit of rounding up rather than
      truncating, to avoid losing the last few bytes of user input, and we
      cannot access the tail when the server advertises bogus block sizing,
      abort the connection to alert the server to fix their bug.  And
      rejecting such servers matches what we already did for a min_block
      that was not a power of 2 or which was larger than max_block.
      
      Does not impact either qemu (which always sends properly aligned
      sizes) or nbdkit (which does not send minimum block requirements yet);
      so this is mostly aimed at new NBD server implementations, and ensures
      that the rest of our code can assume the size is aligned.
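
The consistency check can be sketched as a small predicate (names invented; QEMU's real code uses its is_power_of_2() helper and reports the failure via errp before aborting the connection):

```c
#include <stdbool.h>
#include <stdint.h>

/* A server's advertised export size must be a multiple of its advertised
 * minimum block size, and min_block itself must be a nonzero power of 2
 * (the latter was already enforced before this patch). */
static bool nbd_size_is_consistent(uint64_t size, uint32_t min_block)
{
    if (min_block == 0 || (min_block & (min_block - 1)) != 0) {
        return false;             /* not a power of 2: already rejected */
    }
    return size % min_block == 0; /* no inaccessible tail */
}
```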
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20190330155704.24191-1-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  10. Feb 25, 2019
  11. Feb 04, 2019
  12. Jan 21, 2019
    • nbd/client: Work around 3.0 bug for listing meta contexts · 7c6f5ddc
      Eric Blake authored
      
      Commit 3d068aff forgot to advertise available qemu: contexts
      when the client requests a list with 0 queries. Furthermore,
      3.0 shipped with a qemu-img hack of x-dirty-bitmap (commit
      216ee365) that _silently_ acts as though the entire image is
      clean if a requested bitmap is not present.  Both bugs have
      been recently fixed, so that a modern qemu server gives full
      context output right away, and the client refuses a
      connection if a requested x-dirty-bitmap was not found.
      
      Still, it is likely that there will be users that have to
      work with a mix of old and new qemu versions, depending on
      which features get backported where, at which point being
       able to rely on 'qemu-nbd --list' output to know for sure
      whether a given NBD export has the desired dirty bitmap is
      much nicer than blindly connecting and risking that the
      entire image may appear clean.  We can make our --list code
      smart enough to work around buggy servers by tracking
      whether we've seen any qemu: replies in the original 0-query
      list; if not, repeat with a single query on "qemu:" (which
      may still have no replies, but then we know for sure we
      didn't trip up on the server bug).
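
A sketch of the fallback decision (the helper name is invented; the real code tracks this while parsing the NBD_REP_META_CONTEXT replies to the 0-query NBD_OPT_LIST_META_CONTEXT):

```c
#include <stdbool.h>
#include <string.h>

/* After the initial 0-query listing, decide whether we must repeat the
 * listing with the single query "qemu:", because a buggy 3.0 server
 * omits qemu: contexts from the 0-query reply. */
static bool nbd_need_qemu_requery(const char **replies, int n)
{
    for (int i = 0; i < n; i++) {
        if (strncmp(replies[i], "qemu:", 5) == 0) {
            return false;  /* saw a qemu: context: server is not buggy */
        }
    }
    return true;           /* none seen: re-list with the "qemu:" query */
}
```

Note the asymmetry the commit message describes: the re-query may legitimately return nothing, but then we know the empty result is real rather than an artifact of the server bug.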
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20190117193658.16413-21-eblake@redhat.com>
    • nbd/client: Add meta contexts to nbd_receive_export_list() · 0b576b6b
      Eric Blake authored
      
      We want to be able to detect whether a given qemu NBD server is
      exposing the right export(s) and dirty bitmaps, at least for
      regression testing.  We could use 'nbd-client -l' from the upstream
      NBD project to list exports, but it's annoying to rely on
      out-of-tree binaries; furthermore, nbd-client doesn't necessarily
      know about all of the qemu NBD extensions.  Thus, we plan on adding
      a new mode to qemu-nbd that merely sniffs all possible information
      from the server during handshake phase, then disconnects and dumps
      the information.
      
      This patch continues the work of the previous patch, by adding the
      ability to track the list of available meta contexts into
      NBDExportInfo.  It benefits from the recent refactoring patches
      with a new nbd_list_meta_contexts() that reuses much of the same
      framework as setting a meta context.
      
      Note: a malicious server could exhaust memory of a client by feeding
      an unending loop of contexts; perhaps we could place a limit on how
      many we are willing to receive. But this is no different from our
      earlier analysis on a server sending an unending list of exports,
      and the death of a client due to memory exhaustion when the client
      was going to exit soon anyways is not really a denial of service
      attack.
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20190117193658.16413-19-eblake@redhat.com>
    • nbd/client: Add nbd_receive_export_list() · d21a2d34
      Eric Blake authored
      
      We want to be able to detect whether a given qemu NBD server is
      exposing the right export(s) and dirty bitmaps, at least for
      regression testing.  We could use 'nbd-client -l' from the upstream
      NBD project to list exports, but it's annoying to rely on
      out-of-tree binaries; furthermore, nbd-client doesn't necessarily
      know about all of the qemu NBD extensions.  Thus, we plan on adding
      a new mode to qemu-nbd that merely sniffs all possible information
      from the server during handshake phase, then disconnects and dumps
      the information.
      
      This patch adds the low-level client code for grabbing the list
      of exports.  It benefits from the recent refactoring patches, in
      order to share as much code as possible when it comes to doing
      validation of server replies.  The resulting information is stored
      in an array of NBDExportInfo, which has been expanded to also hold a
      description string, along with a convenience function for freeing
      the list.
      
      Note: a malicious server could exhaust memory of a client by feeding
      an unending loop of exports; perhaps we should place a limit on how
      many we are willing to receive. But note that a server could
      reasonably be serving an export for every file in a large directory,
      where an arbitrary limit in the client means we can't list anything
      from such a server; the same happens if we just run until the client
      fails to malloc() and thus dies by an abort(), where the limit is
      no longer arbitrary but determined by available memory.  Since the
      client is already planning on being short-lived, it's hard to call
      this a denial of service attack that would starve off other uses,
      so it does not appear to be a security issue.
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
      Message-Id: <20190117193658.16413-18-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
    • nbd/client: Refactor nbd_opt_go() to support NBD_OPT_INFO · 138796d0
      Eric Blake authored
      
      Rename the function to nbd_opt_info_or_go() with an added parameter
      and slight changes to comments and trace messages, in order to
      reuse the function for NBD_OPT_INFO.
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20190117193658.16413-17-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
    • nbd/client: Pull out oldstyle size determination · b3c9d33b
      Eric Blake authored
      
      Another refactoring creating nbd_negotiate_finish_oldstyle()
      for further reuse during 'qemu-nbd --list'.
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
      Message-Id: <20190117193658.16413-16-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
    • nbd/client: Split handshake into two functions · 10b89988
      Eric Blake authored
      
      An upcoming patch will add the ability for qemu-nbd to list
      the services provided by an NBD server.  Share the common
      code of the TLS handshake by splitting the initial exchange
      into a separate function, leaving only the export handling
      in the original function.  Functionally, there should be no
      change in behavior in this patch, although some of the code
      motion may be difficult to follow due to indentation changes
      (view with 'git diff -w' for a smaller changeset).
      
      I considered an enum for the return code coordinating state
      between the two functions, but in the end just settled with
      ample comments.
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20190117193658.16413-15-eblake@redhat.com>
    • nbd/client: Refactor return of nbd_receive_negotiate() · 2b8d0954
      Eric Blake authored
      
      The function could only ever return 0 or -EINVAL; make this
      clearer by dropping a useless 'fail:' label.
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20190117193658.16413-14-eblake@redhat.com>
    • nbd/client: Split out nbd_receive_one_meta_context() · 0182c1ae
      Eric Blake authored
      
      Extract portions of nbd_negotiate_simple_meta_context() to
      a new function nbd_receive_one_meta_context() that copies the
      pattern of nbd_receive_list() for performing the argument
      validation of one reply.  The error message when the server
      replies with more than one context changes slightly, but
      that shouldn't happen in the common case.
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20190117193658.16413-13-eblake@redhat.com>
    • nbd/client: Split out nbd_send_meta_query() · 757b3ab9
      Eric Blake authored
      
      Refactor nbd_negotiate_simple_meta_context() to pull out the
      code that can be reused to send a LIST request for 0 or 1 query.
      No semantic change.  The old comment about 'sizeof(uint32_t)'
      being equivalent to '/* number of queries */' is no longer
      needed, now that we are computing 'sizeof(queries)' instead.
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
      Message-Id: <20190117193658.16413-12-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
    • nbd/client: Change signature of nbd_negotiate_simple_meta_context() · 2df94eb5
      Eric Blake authored
      
      Pass 'info' instead of three separate parameters related to info,
      when requesting the server to set the meta context.  Update the
      NBDExportInfo struct to rename the received id field to match the
      fact that we are currently overloading the field to match whatever
      context the user supplied through the x-dirty-bitmap hack, as well
      as adding a TODO comment to remind future patches about a desire
      to request two contexts at once.
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20190117193658.16413-11-eblake@redhat.com>
    • nbd/client: Move export name into NBDExportInfo · 6dc1667d
      Eric Blake authored
      
      Refactor the 'name' parameter of nbd_receive_negotiate() from
      being a separate parameter into being part of the in-out 'info'.
      This also spills over to a simplification of nbd_opt_go().
      
      The main driver for this refactoring is that an upcoming patch
      would like to add support to qemu-nbd to list information about
      all exports available on a server, where the name(s) will be
      provided by the server instead of the client.  But another benefit
      is that we can now allow the client to explicitly specify the
      empty export name "" even when connecting to an oldstyle server
      (even if qemu is no longer such a server after commit 7f7dfe2a).
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20190117193658.16413-10-eblake@redhat.com>
    • nbd/client: Refactor nbd_receive_list() · 091d0bf3
      Eric Blake authored
      
      Right now, nbd_receive_list() is only called by
      nbd_receive_query_exports(), which in turn is only called if the
      server lacks NBD_OPT_GO but has working option negotiation, and is
      merely used as a quality-of-implementation trick since servers
      can't give decent errors for NBD_OPT_EXPORT_NAME.  However, servers
      that lack NBD_OPT_GO are becoming increasingly rare (nbdkit was a
      latecomer, in Aug 2018, but qemu has been such a server since commit
      f37708f6 in July 2017 and released in 2.10), so it no longer makes
      sense to micro-optimize that function for performance.
      
      Furthermore, when debugging a server's implementation, tracing the
      full reply (both names and descriptions) is useful, not to mention
      that upcoming patches adding 'qemu-nbd --list' will want to collect
      that data.  And when you consider that a server can send an export
      name up to the NBD protocol length limit of 4k; but our current
      NBD_MAX_NAME_SIZE is only 256, we can't trace all valid server
      names without more storage, but 4k is large enough that the heap
      is better than the stack for long names.
      
      Thus, I'm changing the division of labor, with nbd_receive_list()
      now always malloc'ing a result on success (the malloc is bounded
      by the fact that we reject servers with a reply length larger
      than 32M), and moving the comparison to 'wantname' to the caller.
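      The new contract can be sketched as follows. This is a hypothetical
      helper (the names are illustrative, not the actual qemu code; the 32M
      cap mirrors the reply-length bound mentioned above): it rejects
      oversized replies up front, then always returns a heap-allocated name
      on success, leaving any 'wantname' comparison to the caller.

      ```c
      #include <stdint.h>
      #include <stdlib.h>
      #include <string.h>

      #define MAX_REPLY_LEN (32 * 1024 * 1024)  /* reject replies larger than 32M */

      /*
       * Hypothetical sketch of the post-refactor division of labor: the
       * callee always malloc's the result (bounded by the reply-length
       * check), since 4k names are too large to comfortably keep on the
       * stack; the caller does its own comparison and free.
       */
      static char *read_name(const uint8_t *reply, uint32_t len)
      {
          char *name;

          if (len > MAX_REPLY_LEN) {
              return NULL;          /* server reply too large: refuse */
          }
          name = malloc(len + 1);   /* heap, not stack: names may be 4k */
          if (!name) {
              return NULL;
          }
          memcpy(name, reply, len);
          name[len] = '\0';
          return name;
      }
      ```

      A caller wanting a specific export would then do the comparison
      itself: `char *n = read_name(buf, len); match = n && !strcmp(n, wantname); free(n);`.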
      
      There is a minor change in behavior where a server with 0 exports
      (an immediate NBD_REP_ACK reply) is now no longer distinguished
      from a server without LIST support (NBD_REP_ERR_UNSUP); this
      information could be preserved with a complication to the calling
      contract to provide a bit more information, but I didn't see the
      point.  After all, the worst that can happen if our guess at a
      match is wrong is that the caller will get a cryptic disconnect
      when NBD_OPT_EXPORT_NAME fails (which is no different from what
      would happen if we had not tried LIST), while treating an empty
      list as immediate failure would prevent connecting to really old
      servers that really did lack LIST.  Besides, NBD servers with 0
      exports are rare (qemu can do it when using QMP nbd-server-start
      without nbd-server-add - but qemu understands NBD_OPT_GO and
      thus won't tickle this change in behavior).
      
      Fix the spelling of foundExport to match coding standards while
      in the area.
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20190117193658.16413-9-eblake@redhat.com>
      091d0bf3
  13. Jan 05, 2019
    • nbd/client: Drop pointless buf variable · ef2e35fc
      Eric Blake authored
      
      There's no need to read into a temporary buffer (oversized
      since commit 7d3123e1) followed by a byteswap into a uint64_t
      to check for a magic number via memcmp(), when the code
      immediately below demonstrates reading into the uint64_t then
      byteswapping in place and checking for a magic number via
      integer math.  What's more, having a different error message
      when the server's first reply byte is 0 is unusual - it's no
      different from any other wrong magic number, and we already
      detected short reads. That whole strlen() issue has been
      present and useless since commit 1d45f8b5 in 2010; perhaps it
      was leftover debugging (since the correct magic number happens
      to be ASCII)?  Make the error messages more consistent and
      detailed while touching things.
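      The preferred pattern looks roughly like this. A self-contained
      sketch, not the actual qemu code: the byte-swap helper stands in for
      qemu's be64_to_cpu(), and the constant is the NBD initial magic,
      which is the ASCII bytes "NBDMAGIC".

      ```c
      #include <stdint.h>
      #include <string.h>

      /* Portable stand-in for qemu's be64_to_cpu() so the sketch is
       * self-contained: reinterpret the value's bytes as big-endian. */
      static uint64_t be64_to_host(uint64_t v)
      {
          const uint8_t *p = (const uint8_t *)&v;
          return ((uint64_t)p[0] << 56) | ((uint64_t)p[1] << 48) |
                 ((uint64_t)p[2] << 40) | ((uint64_t)p[3] << 32) |
                 ((uint64_t)p[4] << 24) | ((uint64_t)p[5] << 16) |
                 ((uint64_t)p[6] << 8)  |  (uint64_t)p[7];
      }

      #define NBD_INIT_MAGIC 0x4e42444d41474943ULL  /* ASCII "NBDMAGIC" */

      /* Read straight into a uint64_t, swap in place, and compare with
       * integer math -- no oversized temporary buffer, no memcmp(), and
       * no special-cased error message for a 0 first byte. */
      static int check_magic(const uint8_t wire[8])
      {
          uint64_t magic;

          memcpy(&magic, wire, sizeof(magic));
          magic = be64_to_host(magic);
          return magic == NBD_INIT_MAGIC;
      }
      ```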
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20181215135324.152629-9-eblake@redhat.com>
      ef2e35fc
    • qemu-nbd: Fail earlier for -c/-d on non-linux · 3c1fa35d
      Eric Blake authored
      
      Connecting to a /dev/nbdN device is a Linux-specific action.
      We were already masking -c and -d from 'qemu-nbd --help' on
      non-linux.  However, while -d fails with a sensible error
      message, it took hunting through a couple of files to prove
      that.  What's more, the code for -c doesn't fail until after
      it has created a pthread and tried to open a device - possibly
      even printing an error message with %m on a non-Linux platform
      in spite of the comment that %m is glibc-specific.  Make the
      failure happen sooner, then get rid of stubs that are no
      longer needed because of the early exits.
      
      While at it: tweak the blank newlines in --help output to be
      consistent, whether or not built on Linux.
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20181215135324.152629-7-eblake@redhat.com>
      Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      3c1fa35d
    • nbd/client: More consistent error messages · 6c5c0351
      Eric Blake authored
      
      Consolidate on using decimal (not hex), on outputting the
      option reply name (not just value), and a consistent comma between
      clauses, when the client reports protocol discrepancies from the
      server.  While it won't affect normal operation, it makes
      debugging additions easier.
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20181215135324.152629-6-eblake@redhat.com>
      6c5c0351
  14. Jan 04, 2019
  15. Oct 03, 2018
    • nbd: Don't take address of fields in packed structs · 80c7c2b0
      Peter Maydell authored
      
      Taking the address of a field in a packed struct is a bad idea, because
      it might not be actually aligned enough for that pointer type (and
      thus cause a crash on dereference on some host architectures). Newer
      versions of clang warn about this. Avoid the bug by not using the
      "modify in place" byte swapping functions.
      
      This patch was produced with the following spatch script:
      @@
      expression E;
      @@
      -be16_to_cpus(&E);
      +E = be16_to_cpu(E);
      @@
      expression E;
      @@
      -be32_to_cpus(&E);
      +E = be32_to_cpu(E);
      @@
      expression E;
      @@
      -be64_to_cpus(&E);
      +E = be64_to_cpu(E);
      @@
      expression E;
      @@
      -cpu_to_be16s(&E);
      +E = cpu_to_be16(E);
      @@
      expression E;
      @@
      -cpu_to_be32s(&E);
      +E = cpu_to_be32(E);
      @@
      expression E;
      @@
      -cpu_to_be64s(&E);
      +E = cpu_to_be64(E);
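      A minimal illustration of the bug and the rewritten form (the struct
      and field names here are made up for the example, and the byte-swap
      helper is a portable stand-in for qemu's be32_to_cpu()):

      ```c
      #include <stdint.h>
      #include <string.h>

      /* Portable stand-in for qemu's be32_to_cpu() so the sketch is
       * self-contained: reinterpret the value's bytes as big-endian. */
      static uint32_t be32_to_host(uint32_t v)
      {
          const uint8_t *p = (const uint8_t *)&v;
          return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                 ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
      }

      /* A packed wire-format header: 'len' sits at offset 1, so &hdr->len
       * is not suitably aligned for a uint32_t*, and dereferencing such a
       * pointer can crash on strict-alignment hosts (newer clang warns
       * about taking the address at all). */
      struct __attribute__((packed)) wire_hdr {
          uint8_t  type;
          uint32_t len;
      };

      static void fixup(struct wire_hdr *hdr)
      {
          /* Bad (pre-patch pattern):  be32_to_cpus(&hdr->len);
           * Good: read the field by value, swap, assign back.  The
           * compiler emits whatever unaligned accesses the target needs,
           * because it knows the field is in a packed struct. */
          hdr->len = be32_to_host(hdr->len);
      }
      ```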
      
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Message-Id: <20180927164200.15097-1-peter.maydell@linaro.org>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      [eblake: rebase, and squash in missed changes]
      Signed-off-by: Eric Blake <eblake@redhat.com>
      80c7c2b0
  16. Jul 02, 2018
    • nbd/client: Add x-dirty-bitmap to query bitmap from server · 216ee365
      Eric Blake authored
      
      In order to test that the NBD server is properly advertising
      dirty bitmaps, we need a bare minimum client that can request
      and read the context.  Since feature freeze for 3.0 is imminent,
      this is the smallest workable patch, which replaces the qemu
      block status report with the results of the NBD server's dirty
      bitmap (making it very easy to use 'qemu-img map --output=json'
      to learn where the dirty portions are).  Note that the NBD
      protocol defines a dirty section with the same bit but opposite
      sense that normal "base:allocation" uses to report an allocated
      section; so in qemu-img map output, "data":true corresponds to
      clean, "data":false corresponds to dirty.
      
      A more complete solution that allows dirty bitmaps to be queried
      at the same time as normal block status will be required before
      this addition can lose the x- prefix.  Until then, the fact that
      this replaces normal status with dirty status means actions
      like 'qemu-img convert' will likely misbehave due to treating
      dirty regions of the file as if they are unallocated.
      
      The next patch adds an iotest to exercise this new code.
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180702191458.28741-2-eblake@redhat.com>
      216ee365
  17. May 04, 2018
  18. Apr 02, 2018
    • nbd: trace meta context negotiation · 2b53af25
      Eric Blake authored
      
      Having a more detailed log of the interaction between client and
      server is invaluable in debugging how meta context negotiation
      actually works.
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180330130950.1931229-1-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      2b53af25
    • nbd/client: Correctly handle bad server REP_META_CONTEXT · 260e34db
      Eric Blake authored
      
      It's never a good idea to blindly read for size bytes as
      returned by the server without first validating that the size
      is within bounds; a malicious or buggy server could cause us
      to hang or get out of sync from reading further messages.
      
      It may be smarter to try and teach the client to cope with
      unexpected context ids by silently ignoring them instead of
      hanging up on the server, but for now, if the server doesn't
      reply with exactly the one context we expect, it's easier to
      just give up - however, if we give up for any reason other
      than an I/O failure, we might as well try to politely tell
      the server we are quitting rather than continuing.
      
      Fix some typos in the process.
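      The validation rule can be sketched like this (a hypothetical check,
      not the actual qemu code; the 4k cap and the 4-byte context-id size
      are illustrative assumptions): sanity-check the server-provided
      lengths before reading any payload, so a malicious or buggy server
      can't make us hang or lose framing.

      ```c
      #include <stdint.h>

      #define MAX_CTX_REPLY 4096  /* illustrative cap, not qemu's actual limit */

      /*
       * Validate a REP_META_CONTEXT-style reply: 'reply_len' is the
       * payload length the server claimed, which must hold a 4-byte
       * context id plus 'name_len' bytes of context name.  Reject
       * impossible sizes instead of blindly reading for reply_len bytes.
       */
      static int ctx_len_ok(uint32_t reply_len, uint32_t name_len)
      {
          if (reply_len < 4 || reply_len > MAX_CTX_REPLY) {
              return 0;           /* too short for a context id, or absurd */
          }
          if (name_len > reply_len - 4) {
              return 0;           /* name can't fit in the remaining payload */
          }
          return 1;
      }
      ```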
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180329231837.1914680-1-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      260e34db
  19. Mar 13, 2018
  20. Mar 06, 2018
  21. Mar 01, 2018
  22. Jan 10, 2018
  23. Nov 17, 2017
    • nbd/client: Don't hard-disconnect on ESHUTDOWN from server · 01b05c66
      Eric Blake authored
      
      The NBD spec says that a server may fail any transmission request
      with ESHUTDOWN when it is apparent that no further request from
      the client can be successfully honored.  The client is supposed
      to then initiate a soft shutdown (wait for all remaining in-flight
      requests to be answered, then send NBD_CMD_DISC).  However, since
      qemu's server never uses ESHUTDOWN errors, this code was mostly
      untested since its introduction in commit b6f5d3b5.
      
      More recently, I learned that nbdkit as the NBD server is able to
      send ESHUTDOWN errors, so I finally tested this code, and noticed
      that our client was special-casing ESHUTDOWN to cause a hard
      shutdown (immediate disconnect, with no NBD_CMD_DISC), but only
      if the server sends this error as a simple reply.  Further
      investigation found that commit d2febedb introduced a regression
      where structured replies behave differently than simple replies -
      but that the structured reply behavior is more in line with the
      spec (even if we still lack code in nbd-client.c to properly quit
      sending further requests).  So this patch reverts the portion of
      b6f5d3b5 that introduced an improper hard-disconnect special-case
      at the lower level, and leaves the future enhancement of a nicer
      soft-disconnect at the higher level for another day.
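      The distinction the spec draws can be sketched as a tiny decision
      table (a hypothetical sketch of the intended higher-level behavior,
      not qemu's actual code -- the commit notes that the soft-disconnect
      logic in nbd-client.c is still future work):

      ```c
      #include <errno.h>

      enum client_action {
          ACT_RETURN_ERROR,   /* fail this request, keep the connection */
          ACT_SOFT_DISC,      /* drain in-flight requests, then send NBD_CMD_DISC */
          ACT_HARD_DISC       /* drop the socket immediately (what the old
                                 simple-reply special case wrongly did) */
      };

      /*
       * Per-request error handling as the NBD spec intends: ESHUTDOWN
       * means "no further request can succeed", which calls for a soft
       * shutdown, while any other error is an ordinary per-request
       * failure.  Nothing here warrants a hard disconnect.
       */
      static enum client_action on_reply_error(int err)
      {
          switch (err) {
          case ESHUTDOWN:
              return ACT_SOFT_DISC;
          default:
              return ACT_RETURN_ERROR;
          }
      }
      ```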
      
      CC: qemu-stable@nongnu.org
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20171113194857.13933-1-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      01b05c66