  1. Jun 25, 2021
    • block: add max_hw_transfer to BlockLimits · 24b36e98
      Paolo Bonzini authored
      
      For block host devices, I/O can happen through either the kernel file
      descriptor I/O system calls (preadv/pwritev, io_submit, io_uring)
      or the SCSI passthrough ioctl SG_IO.
      
      In the latter case, the size of each transfer can be limited by the
      HBA, while for file descriptor I/O the kernel is able to split and
      merge I/O into smaller pieces as needed.  Applying the HBA limits to
      file descriptor I/O results in more system calls and suboptimal
      performance, so this patch splits the max_transfer limit into two:
      max_transfer remains valid and is used in general, while max_hw_transfer
      is limited to the maximum hardware size.  max_hw_transfer can then be
      included by the scsi-generic driver in the block limits page, to ensure
      that the stricter hardware limit is used.
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
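The split between the two limits can be sketched as follows. This is a minimal illustration, not the patch's code: the `TransferLimits` struct, the `min_non_zero` helper, and `effective_max_transfer` are hypothetical names, and treating 0 as "no limit" is an assumption made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two limits; 0 is assumed to mean "no limit". */
typedef struct {
    uint64_t max_transfer;    /* general limit, used for file descriptor I/O */
    uint64_t max_hw_transfer; /* limit imposed by the HBA, relevant to SG_IO */
} TransferLimits;

/* Return the smaller of two limits, treating 0 as "unlimited". */
static uint64_t min_non_zero(uint64_t a, uint64_t b)
{
    if (a == 0) {
        return b;
    }
    if (b == 0) {
        return a;
    }
    return a < b ? a : b;
}

/*
 * Pick the limit for one request: SCSI passthrough must also honor the
 * hardware limit, while file descriptor I/O only needs the general one
 * because the kernel splits and merges requests itself.
 */
static uint64_t effective_max_transfer(const TransferLimits *l, bool sg_io)
{
    if (sg_io) {
        return min_non_zero(l->max_transfer, l->max_hw_transfer);
    }
    return l->max_transfer;
}
```

With `max_transfer` unlimited and `max_hw_transfer` set to 64 KiB, the passthrough path is capped at 64 KiB while file descriptor I/O stays unlimited.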
    • osdep: provide ROUND_DOWN macro · c9797456
      Paolo Bonzini authored
      
      osdep.h provides a ROUND_UP macro to hide bitwise operations for the
      purpose of rounding a number up to a multiple of a power of two; add a
      ROUND_DOWN macro that does the same with truncation towards zero.
      
      While at it, change the formatting of some comments.
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
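The bitwise rounding that such macros hide can be sketched with two small functions. These are illustrative helpers with hypothetical names (`round_down_pow2`, `round_up_pow2`), not the definitions from osdep.h; they assume unsigned 64-bit values and that `d` is a power of two.

```c
#include <stdint.h>

/*
 * Round n down to a multiple of d. Assumes d is a power of two, so that
 * d - 1 is a mask of the low bits to clear.
 */
static inline uint64_t round_down_pow2(uint64_t n, uint64_t d)
{
    return n & ~(d - 1);
}

/*
 * Round n up to a multiple of d (same power-of-two assumption). Adding
 * d - 1 first pushes any non-aligned value past the next boundary.
 */
static inline uint64_t round_up_pow2(uint64_t n, uint64_t d)
{
    return (n + d - 1) & ~(d - 1);
}
```

For example, with `d = 4096`, the value 4097 rounds down to 4096 and up to 8192, while an already aligned 4096 is left unchanged by both.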
  2. Jun 21, 2021
  3. Jun 19, 2021
  4. Jun 18, 2021
  5. Jun 17, 2021
    • i386: Add ratelimit for bus locks acquired in guest · 035d1ef2
      Chenyi Qiang authored
      A bus lock is acquired through either split locked access to writeback
      (WB) memory or any locked access to non-WB memory. It is typically >1000
      cycles slower than an atomic operation within a cache and can also
      disrupt performance on other cores.
      
      Virtual machines can exploit bus locks to degrade the performance of
      the system. To address this kind of performance DoS attack coming from
      VMs, bus lock VM exit is introduced in KVM, and it can report the bus
      locks detected in the guest. If enabled in KVM, it exits to
      userspace to let the user enforce throttling policies once bus locks
      are acquired in VMs.
      
      The availability of bus lock VM exit can be detected through the
      KVM_CAP_X86_BUS_LOCK_EXIT capability. The returned bitmap contains the
      policies supported by KVM. The field KVM_BUS_LOCK_DETECTION_EXIT in the
      bitmap is the only supported strategy at present. It indicates that KVM
      will exit to userspace to handle the bus locks.
      
      This patch adds a ratelimit on the bus locks acquired in the guest as a
      mitigation policy.
      
      Introduce a new field "bus_lock_ratelimit" to record the rate limit
      for bus locks in the target VM. The user can specify it through the
      "bus-lock-ratelimit" machine property. In the current implementation,
      the default value is 0 per second, which means no
      restrictions on bus locks.
      
      As for the ratelimit on detected bus locks, simply set the ratelimit
      interval to 1s and restrict the quota of bus lock occurrences to the
      value of "bus_lock_ratelimit". A potential alternative is to introduce
      the time slice as a property, which can help the user achieve more
      precise control.
      
      Details of bus lock VM exit can be found in the spec:
      https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
      
      
      
      Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
      Message-Id: <20210521043820.29678-1-chenyi.qiang@intel.com>
      Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
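The policy described above (a 1 s interval with a quota equal to "bus_lock_ratelimit", and 0 meaning no restriction) can be sketched as a fixed-window limiter. This is not the patch's implementation: `BusLockLimiter`, `bus_lock_delay_ns`, and the fixed-window accounting are assumptions chosen for illustration, and the caller is assumed to pass a monotonic timestamp in nanoseconds.

```c
#include <stdint.h>

#define WINDOW_NS 1000000000ULL /* 1 s interval, as in the description */

/* Hypothetical limiter state; quota == 0 means "no restriction". */
typedef struct {
    uint64_t quota;           /* allowed bus locks per window */
    uint64_t window_start_ns; /* start of the current window */
    uint64_t count;           /* bus locks seen in the current window */
} BusLockLimiter;

/*
 * Account for one detected bus lock at time now_ns and return how long
 * the vCPU should be delayed (0 if still within quota).
 */
static uint64_t bus_lock_delay_ns(BusLockLimiter *rl, uint64_t now_ns)
{
    if (rl->quota == 0) {
        return 0;
    }
    if (now_ns - rl->window_start_ns >= WINDOW_NS) {
        /* The previous window has elapsed: start a fresh one. */
        rl->window_start_ns = now_ns;
        rl->count = 0;
    }
    rl->count++;
    if (rl->count <= rl->quota) {
        return 0;
    }
    /* Over quota: wait until the current window ends. */
    return rl->window_start_ns + WINDOW_NS - now_ns;
}
```

A caller handling the bus lock exit would sleep for the returned duration before resuming the vCPU, which caps the guest at roughly `quota` bus locks per second.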
    • Update Linux headers to 5.13-rc4 · 278f064e
      Eduardo Habkost authored
      
      Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
      Message-Id: <20210603191541.2862286-1-ehabkost@redhat.com>
      Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
  6. Jun 16, 2021
  7. Jun 15, 2021
    • hostmem: Wire up RAM_NORESERVE via "reserve" property · 9181fb70
      David Hildenbrand authored
      
      Let's provide a way to control the use of RAM_NORESERVE via memory
      backends using the "reserve" property which defaults to true (old
      behavior).
      
      Only Linux currently supports clearing the flag (and support is checked at
      runtime, depending on the setting of "/proc/sys/vm/overcommit_memory").
      Windows and POSIX systems other than Linux will bail out with
      "reserve=false".
      
      The target use case is virtio-mem, which dynamically exposes memory
      inside a large, sparse memory area to the VM. This essentially avoids
      having to set "/proc/sys/vm/overcommit_memory == 0" when using
      virtio-mem, and also allows supporting hugetlbfs in the future.
      
      As really only Linux implements RAM_NORESERVE right now, let's expose
      the property only with CONFIG_LINUX. Setting the property to "false"
      will then only fail in corner cases -- for example on very old kernels
      or when memory overcommit was completely disabled by the admin.
      
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
      Reviewed-by: Markus Armbruster <armbru@redhat.com>
      Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
      Cc: Markus Armbruster <armbru@redhat.com>
      Cc: Eric Blake <eblake@redhat.com>
      Cc: Igor Mammedov <imammedo@redhat.com>
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Message-Id: <20210510114328.21835-11-david@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • util/mmap-alloc: Support RAM_NORESERVE via MAP_NORESERVE under Linux · d94e0bc9
      David Hildenbrand authored
      Let's support RAM_NORESERVE via MAP_NORESERVE on Linux. The flag has no
      effect on most shared mappings - except for hugetlbfs and anonymous memory.
      
      Linux man page:
        "MAP_NORESERVE: Do not reserve swap space for this mapping. When swap
        space is reserved, one has the guarantee that it is possible to modify
        the mapping. When swap space is not reserved one might get SIGSEGV
        upon a write if no physical memory is available. See also the discussion
        of the file /proc/sys/vm/overcommit_memory in proc(5). In kernels before
        2.6, this flag had effect only for private writable mappings."
      
      Note that the "guarantee" part is wrong with memory overcommit in Linux.
      
      Also, in Linux hugetlbfs is treated differently - we configure reservation
      of huge pages from the pool, not reservation of swap space (huge pages
      cannot be swapped).
      
      The rough behavior is [1]:
      a) !Hugetlbfs:
      
        1) Without MAP_NORESERVE *or* with memory overcommit under Linux
           disabled ("/proc/sys/vm/overcommit_memory == 2"), the following
           accounting/reservation happens:
            For a file backed map
             SHARED or READ-only - 0 cost (the file is the map not swap)
             PRIVATE WRITABLE - size of mapping per instance
      
            For an anonymous or /dev/zero map
             SHARED   - size of mapping
             PRIVATE READ-only - 0 cost (but of little use)
             PRIVATE WRITABLE - size of mapping per instance
      
        2) With MAP_NORESERVE, no accounting/reservation happens.
      
      b) Hugetlbfs:
      
        1) Without MAP_NORESERVE, huge pages are reserved.
      
        2) With MAP_NORESERVE, no huge pages are reserved.
      
      Note: With "/proc/sys/vm/overcommit_memory == 0", we were already able
      to configure it for !hugetlbfs globally; this toggle now allows
      configuring it at a finer granularity, not for the whole system.
      
      The target use case is virtio-mem, which dynamically exposes memory
      inside a large, sparse memory area to the VM.
      
      [1] https://www.kernel.org/doc/Documentation/vm/overcommit-accounting
      
      
      
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Message-Id: <20210510114328.21835-10-david@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
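The effect of the flag on a private anonymous mapping can be sketched with a plain mmap() call on Linux. This is a minimal illustration, not the code from util/mmap-alloc: the `map_anonymous` helper and its `reserve` parameter are hypothetical names.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS and MAP_NORESERVE in strict modes */
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map `size` bytes of private anonymous memory. When `reserve` is false,
 * pass MAP_NORESERVE so that no accounting/reservation happens up front
 * (unless overcommit is disabled system-wide). Returns NULL on failure.
 */
static void *map_anonymous(size_t size, bool reserve)
{
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;

    if (!reserve) {
        flags |= MAP_NORESERVE;
    }

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, flags, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

A caller would release the area with `munmap(p, size)`. Pages in such a mapping are only backed by physical memory when they are first touched, which is what makes a large, sparse area cheap to set up.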