Skip to content
Snippets Groups Projects
  1. May 13, 2021
  2. May 06, 2021
  3. Apr 06, 2021
  4. Mar 24, 2021
  5. Mar 15, 2021
  6. Mar 04, 2021
    • Dr. David Alan Gilbert's avatar
      virtiofs: drop remapped security.capability xattr as needed · e586edcb
      Dr. David Alan Gilbert authored
      
      On Linux, the 'security.capability' xattr holds a set of
      capabilities that can change when an executable is run, giving
      a limited form of privilege escalation to those programs that
      the writer of the file deemed worthy.
      
      Any write causes the 'security.capability' xattr to be dropped,
      stopping anyone from gaining privilege by modifying a blessed
      file.
      
      Fuse relies on the daemon to do this dropping, and in turn the
      daemon relies on the host kernel to drop the xattr for it.  However,
      with the addition of -o xattrmap, the xattr that the guest
      stores its capabilities in is now not the same as the one that
      the host kernel automatically clears.
      
      Where the mapping changes 'security.capability', explicitly clear
      the remapped name to preserve the same behaviour.
      
      This bug is assigned CVE-2021-20263.
      
      Signed-off-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      Reviewed-by: default avatarVivek Goyal <vgoyal@redhat.com>
      e586edcb
  7. Feb 16, 2021
    • Vivek Goyal's avatar
      virtiofsd: Do not use a thread pool by default · 26ec1909
      Vivek Goyal authored
      
      Currently we created a thread pool (With 64 max threads per pool) for
      each virtqueue. We hoped that this will provide us with better scalability
      and performance.
      
      But in practice, we are getting better numbers in most of the cases
      when we don't create a thread pool at all and a single thread per
      virtqueue receives the request and processes it.
      
      Hence, I am proposing that we switch to no thread pool by default
      (equivalent of --thread-pool-size=0). This will provide out of
      box better performance to most of the users. In fact other users
      have confirmed that not using a thread pool gives them better
      numbers. So why not use this as default. It can be changed when
      somebody can fix the issues with thread pool performance.
      
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Message-Id: <20210210182744.27324-2-vgoyal@redhat.com>
      Reviewed-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      Signed-off-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      26ec1909
    • Vivek Goyal's avatar
      viriofsd: Add support for FUSE_HANDLE_KILLPRIV_V2 · d64907ac
      Vivek Goyal authored
      
      This patch adds basic support for FUSE_HANDLE_KILLPRIV_V2. virtiofsd
      can enable/disable this by specifying option "-o killpriv_v2/no_killpriv_v2".
      By default this is enabled as long as client supports it
      
      Enabling this option helps with performance in write path. Without this
      option, currently every write is first preceeded with a getxattr() operation
      to find out if security.capability is set. (Write is supposed to clear
      security.capability). With this option enabled, server is signing up for
      clearing security.capability on every WRITE and also clearing suid/sgid
      subject to certain rules. This gets rid of extra getxattr() call for every
      WRITE and improves performance. This is true when virtiofsd is run with
      option -o xattr.
      
      What does enabling FUSE_HANDLE_KILLPRIV_V2 mean for file server implementation.
      It needs to adhere to following rules. Thanks to Miklos for this summary.
      
      - clear "security.capability" on write, truncate and chown unconditionally
      - clear suid/sgid in case of following. Note, sgid is cleared only if
        group executable bit is set.
          o setattr has FATTR_SIZE and FATTR_KILL_SUIDGID set.
          o setattr has FATTR_UID or FATTR_GID
          o open has O_TRUNC and FUSE_OPEN_KILL_SUIDGID
          o create has O_TRUNC and FUSE_OPEN_KILL_SUIDGID flag set.
          o write has FUSE_WRITE_KILL_SUIDGID
      
      >From Linux VFS client perspective, here are the requirements.
      
      - caps are always cleared on chown/write/truncate
      - suid is always cleared on chown, while for truncate/write it is cleared
        only if caller does not have CAP_FSETID.
      - sgid is always cleared on chown, while for truncate/write it is cleared
        only if caller does not have CAP_FSETID as well as file has group execute
        permission.
      
      virtiofsd implementation has not changed much to adhere to above ruls. And
      reason being that current assumption is that we are running on Linux
      and on top of filesystems like ext4/xfs which already follow above rules.
      On write, truncate, chown, seucurity.capability is cleared. And virtiofsd
      drops CAP_FSETID if need be and that will lead to clearing of suid/sgid.
      
      But if virtiofsd is running on top a filesystem which breaks above assumptions,
      then it will have to take extra actions to emulate above. That's a TODO
      for later when need arises.
      
      Note: create normally is supposed to be called only when file does not
            exist. So generally there should not be any question of clearing
            setuid/setgid. But it is possible that after client checks that
            file is not present, some other client creates file on server
            and this race can trigger sending FUSE_CREATE. In that case, if
            O_TRUNC is set, we should clear suid/sgid if FUSE_OPEN_KILL_SUIDGID
            is also set.
      
      v3:
        - Resolved conflicts due to lo_inode_open() changes.
        - Moved capability code in lo_do_open() so that both lo_open() and
          lo_create() can benefit from common code.
        - Dropped changes to kernel headers as these are part of qemu already.
      
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Acked-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      Message-Id: <20210208224024.43555-3-vgoyal@redhat.com>
      Signed-off-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      d64907ac
    • Vivek Goyal's avatar
      virtiofsd: Save error code early at the failure callsite · 1e08f164
      Vivek Goyal authored
      
      Change error code handling slightly in lo_setattr(). Right now we seem
      to jump to out_err and assume that "errno" is valid and use that to
      send reply.
      
      But if caller has to do some other operations before jumping to out_err,
      then it does the dance of first saving errno to saverr and the restore
      errno before jumping to out_err. This makes it more confusing.
      
      I am about to make more changes where caller will have to do some
      work after error before jumping to out_err. I found it easier to
      change the convention a bit. That is caller saves error in "saverr"
      before jumping to out_err. And out_err uses "saverr" to send error
      back and does not rely on "errno" having actual error.
      
      v3: Resolved conflicts in lo_setattr() due to lo_inode_open() changes.
      
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Reviewed-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      Message-Id: <20210208224024.43555-2-vgoyal@redhat.com>
      Signed-off-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      1e08f164
    • Philippe Mathieu-Daudé's avatar
      tools/virtiofsd: Replace the word 'whitelist' · a65963ef
      Philippe Mathieu-Daudé authored
      Follow the inclusive terminology from the "Conscious Language in your
      Open Source Projects" guidelines [*] and replace the words "whitelist"
      appropriately.
      
      [*] https://github.com/conscious-lang/conscious-lang-docs/blob/main/faq.md
      
      
      
      Reviewed-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      Reviewed-by: default avatarDaniel P. Berrangé <berrange@redhat.com>
      Signed-off-by: default avatarPhilippe Mathieu-Daudé <philmd@redhat.com>
      Message-Id: <20210205171817.2108907-3-philmd@redhat.com>
      Signed-off-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      a65963ef
    • Greg Kurz's avatar
      virtiofsd: vu_dispatch locking should never fail · 525a3030
      Greg Kurz authored
      
      pthread_rwlock_rdlock() and pthread_rwlock_wrlock() can fail if a
      deadlock condition is detected or the current thread already owns
      the lock. They can also fail, like pthread_rwlock_unlock(), if the
      mutex wasn't properly initialized. None of these are ever expected
      to happen with fv_VuDev::vu_dispatch_rwlock.
      
      Some users already check the return value and assert, some others
      don't. Introduce rdlock/wrlock/unlock wrappers that just do the
      former and use them everywhere for improved consistency and
      robustness.
      
      This is just cleanup. It doesn't fix any actual issue.
      
      Signed-off-by: default avatarGreg Kurz <groug@kaod.org>
      Message-Id: <20210203182434.93870-1-groug@kaod.org>
      Reviewed-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Reviewed-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      525a3030
    • Wainer dos Santos Moschetta's avatar
      virtiofsd: Allow to build it without the tools · 0958ee89
      Wainer dos Santos Moschetta authored
      
      This changed the Meson build script to allow virtiofsd be built even
      though the tools build is disabled, thus honoring the --enable-virtiofsd
      option.
      
      Fixes: cece116c (configure: add option for virtiofsd)
      Signed-off-by: default avatarWainer dos Santos Moschetta <wainersm@redhat.com>
      Message-Id: <20210201211456.1133364-2-wainersm@redhat.com>
      Reviewed-by: default avatarMisono Tomohiro <misono.tomohiro@jp.fujitsu.com>
      Reviewed-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      0958ee89
  8. Feb 04, 2021
  9. Jan 06, 2021
  10. Dec 18, 2020
    • Alex Chen's avatar
      virtiofsd: Remove useless code about send_notify_iov · 03350a1e
      Alex Chen authored
      
      The 'ch' will be NULL in the following stack:
      send_notify_iov()->fuse_send_msg()->virtio_send_msg(), and
      this may lead to NULL pointer dereferenced in virtio_send_msg().
      But send_notify_iov() was never called, so remove the useless code
      about send_notify_iov() to fix this problem.
      
      Signed-off-by: default avatarAlex Chen <alex.chen@huawei.com>
      Message-Id: <20201214121615.29967-1-alex.chen@huawei.com>
      Reviewed-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      03350a1e
    • Laszlo Ersek's avatar
      virtiofsd: update FUSE_FORGET comment on "lo_inode.nlookup" · d6211148
      Laszlo Ersek authored
      
      Miklos confirms it's *only* the FUSE_FORGET request that the client can
      use for decrementing "lo_inode.nlookup".
      
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Miklos Szeredi <mszeredi@redhat.com>
      Cc: Stefan Hajnoczi <stefanha@redhat.com>
      Fixes: 1222f015
      Signed-off-by: default avatarLaszlo Ersek <lersek@redhat.com>
      Message-Id: <20201208073936.8629-1-lersek@redhat.com>
      Reviewed-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      d6211148
    • Vivek Goyal's avatar
      virtiofsd: Check file type in lo_flush() · 31a4990f
      Vivek Goyal authored
      
      Currently lo_flush() is written in such a way that it expects to receive
      a FLUSH requests on a regular file (and not directories). For example,
      we call lo_fi_fd() which searches lo->fd_map. If we open directories
      using opendir(), we keep don't keep track of these in lo->fd_map instead
      we keep them in lo->dir_map. So we expect lo_flush() to be called on
      regular files only.
      
      Even linux fuse client calls FLUSH only for regular files and not
      directories. So put a check for filetype and return EBADF if
      lo_flush() is called on a non-regular file.
      
      Reported-by: default avatarLaszlo Ersek <lersek@redhat.com>
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Message-Id: <20201211142544.GB3285@redhat.com>
      Reviewed-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      Signed-off-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      31a4990f
    • Vivek Goyal's avatar
      virtiofsd: Disable posix_lock hash table if remote locks are not enabled · e7e8aa8a
      Vivek Goyal authored
      
      If remote posix locks are not enabled (lo->posix_lock == false), then disable
      code paths taken to initialize inode->posix_lock hash table and corresponding
      destruction and search etc.
      
      lo_getlk() and lo_setlk() have been modified to return ENOSYS if daemon
      does not support posix lock but client still sends a lock/unlock request.
      
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Message-Id: <20201207183021.22752-3-vgoyal@redhat.com>
      Reviewed-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      Signed-off-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      e7e8aa8a
    • Vivek Goyal's avatar
      virtiofsd: Set up posix_lock hash table for root inode · ad3bfe1b
      Vivek Goyal authored
      
      We setup per inode hash table ->posix_lock to support remote posix locks.
      But we forgot to initialize this table for root inode.
      
      Laszlo managed to trigger an issue where he sent a FUSE_FLUSH request for
      root inode and lo_flush() found inode with inode->posix_lock NULL and
      accessing this table crashed virtiofsd.
      
      May be we can get rid of initializing this hash table for directory
      objects completely. But that optimization is for another day.
      
      Reported-by: default avatarLaszlo Ersek <lersek@redhat.com>
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Message-Id: <20201207195539.GB3107@redhat.com>
      Reviewed-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      Signed-off-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      ad3bfe1b
    • Laszlo Ersek's avatar
      virtiofsd: make the debug log timestamp on stderr more human-readable · bebc3c24
      Laszlo Ersek authored
      
      The current timestamp format doesn't help me visually notice small jumps
      in time ("small" as defined on human scale, such as a few seconds or a few
      ten seconds). Replace it with a local time format where such differences
      stand out.
      
      Before:
      
      > [13316826770337] [ID: 00000004] unique: 62, opcode: RELEASEDIR (29), nodeid: 1, insize: 64, pid: 1
      > [13316826778175] [ID: 00000004]    unique: 62, success, outsize: 16
      > [13316826781156] [ID: 00000004] virtio_send_msg: elem 0: with 1 in desc of length 16
      > [15138279317927] [ID: 00000001] virtio_loop: Got VU event
      > [15138279504884] [ID: 00000001] fv_queue_set_started: qidx=1 started=0
      > [15138279519034] [ID: 00000003] fv_queue_thread: kill event on queue 1 - quitting
      > [15138280876463] [ID: 00000001] fv_remove_watch: TODO! fd=9
      > [15138280897381] [ID: 00000001] virtio_loop: Waiting for VU event
      > [15138280946834] [ID: 00000001] virtio_loop: Got VU event
      > [15138281175421] [ID: 00000001] virtio_loop: Waiting for VU event
      > [15138281182387] [ID: 00000001] virtio_loop: Got VU event
      > [15138281189474] [ID: 00000001] virtio_loop: Waiting for VU event
      > [15138309321936] [ID: 00000001] virtio_loop: Unexpected poll revents 11
      > [15138309434150] [ID: 00000001] virtio_loop: Exit
      
      (Notice how you don't (easily) notice the gap in time after
      "virtio_send_msg", and especially the amount of time passed is hard to
      estimate.)
      
      After:
      
      > [2020-12-08 06:43:22.58+0100] [ID: 00000004] unique: 51, opcode: RELEASEDIR (29), nodeid: 1, insize: 64, pid: 1
      > [2020-12-08 06:43:22.58+0100] [ID: 00000004]    unique: 51, success, outsize: 16
      > [2020-12-08 06:43:22.58+0100] [ID: 00000004] virtio_send_msg: elem 0: with 1 in desc of length 16
      > [2020-12-08 06:43:29.34+0100] [ID: 00000001] virtio_loop: Got VU event
      > [2020-12-08 06:43:29.34+0100] [ID: 00000001] fv_queue_set_started: qidx=1 started=0
      > [2020-12-08 06:43:29.34+0100] [ID: 00000003] fv_queue_thread: kill event on queue 1 - quitting
      > [2020-12-08 06:43:29.34+0100] [ID: 00000001] fv_remove_watch: TODO! fd=9
      > [2020-12-08 06:43:29.34+0100] [ID: 00000001] virtio_loop: Waiting for VU event
      > [2020-12-08 06:43:29.34+0100] [ID: 00000001] virtio_loop: Got VU event
      > [2020-12-08 06:43:29.34+0100] [ID: 00000001] virtio_loop: Waiting for VU event
      > [2020-12-08 06:43:29.34+0100] [ID: 00000001] virtio_loop: Got VU event
      > [2020-12-08 06:43:29.34+0100] [ID: 00000001] virtio_loop: Waiting for VU event
      > [2020-12-08 06:43:29.37+0100] [ID: 00000001] virtio_loop: Unexpected poll revents 11
      > [2020-12-08 06:43:29.37+0100] [ID: 00000001] virtio_loop: Exit
      
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: default avatarLaszlo Ersek <lersek@redhat.com>
      Message-Id: <20201208055043.31548-1-lersek@redhat.com>
      Reviewed-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      Signed-off-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      bebc3c24
Loading