Skip to content
  • Stefan Hajnoczi's avatar
    3009edff
    vhost-user: fix VHOST_USER_ADD/REM_MEM_REG truncation · 3009edff
    Stefan Hajnoczi authored
    
    
    QEMU currently truncates the mmap_offset field when sending
    VHOST_USER_ADD_MEM_REG and VHOST_USER_REM_MEM_REG messages. The struct
    layout looks like this:
    
      typedef struct VhostUserMemoryRegion {
          uint64_t guest_phys_addr;
          uint64_t memory_size;
          uint64_t userspace_addr;
          uint64_t mmap_offset;
      } VhostUserMemoryRegion;
    
      typedef struct VhostUserMemRegMsg {
          uint32_t padding;
          /* WARNING: there is a 32-bit hole here! */
          VhostUserMemoryRegion region;
      } VhostUserMemRegMsg;
    
    The payload size is calculated as follows when sending the message in
    hw/virtio/vhost-user.c:
    
      msg->hdr.size = sizeof(msg->payload.mem_reg.padding) +
          sizeof(VhostUserMemoryRegion);
    
    This calculation produces an incorrect result of only 36 bytes.
    sizeof(VhostUserMemRegMsg) is actually 40 bytes.
    
    The consequence of this is that the final field, mmap_offset, is
    truncated. This breaks x86_64 TCG guests on s390 hosts. Other guest/host
    combinations may get lucky if either of the following holds:
    1. The guest memory layout does not need mmap_offset != 0.
    2. The host is little-endian and mmap_offset <= 0xffffffff so the
       truncation has no effect.
    
    Fix this by extending the existing 32-bit padding field to 64-bit. Now
    the padding reflects the actual compiler padding. This can be verified
    using pahole(1).
    
    Also document the layout properly in the vhost-user specification.  The
    vhost-user spec did not document the exact layout. It would be
    impossible to implement the spec without looking at the QEMU source
    code.
    
    Existing vhost-user frontends and device backends continue to work after
    this fix has been applied. The only change in the wire protocol is that
    QEMU now sets hdr.size to 40 instead of 36. If a vhost-user
    implementation has a hardcoded size check for 36 bytes, then it will
    fail with new QEMUs. Both QEMU and DPDK/SPDK don't check the exact
    payload size, so they continue to work.
    
    Fixes: f1aeb14b ("Transmit vhost-user memory regions individually")
    Cc: Raphael Norwitz <raphael.norwitz@nutanix.com>
    Cc: Cornelia Huck <cohuck@redhat.com>
    Cc: Michael S. Tsirkin <mst@redhat.com>
    Cc: Christian Borntraeger <borntraeger@de.ibm.com>
    Signed-off-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
    Message-Id: <20201109174355.1069147-1-stefanha@redhat.com>
    Reviewed-by: default avatarMichael S. Tsirkin <mst@redhat.com>
    Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
    Fixes: f1aeb14b ("Transmit vhost-user memory regions individually")
    Reviewed-by: default avatarCornelia Huck <cohuck@redhat.com>
    Reviewed-by: default avatarRaphael Norwitz <raphael.norwitz@nutanix.com>
    3009edff
    vhost-user: fix VHOST_USER_ADD/REM_MEM_REG truncation
    Stefan Hajnoczi authored
    
    
    QEMU currently truncates the mmap_offset field when sending
    VHOST_USER_ADD_MEM_REG and VHOST_USER_REM_MEM_REG messages. The struct
    layout looks like this:
    
      typedef struct VhostUserMemoryRegion {
          uint64_t guest_phys_addr;
          uint64_t memory_size;
          uint64_t userspace_addr;
          uint64_t mmap_offset;
      } VhostUserMemoryRegion;
    
      typedef struct VhostUserMemRegMsg {
          uint32_t padding;
          /* WARNING: there is a 32-bit hole here! */
          VhostUserMemoryRegion region;
      } VhostUserMemRegMsg;
    
    The payload size is calculated as follows when sending the message in
    hw/virtio/vhost-user.c:
    
      msg->hdr.size = sizeof(msg->payload.mem_reg.padding) +
          sizeof(VhostUserMemoryRegion);
    
    This calculation produces an incorrect result of only 36 bytes.
    sizeof(VhostUserMemRegMsg) is actually 40 bytes.
    
    The consequence of this is that the final field, mmap_offset, is
    truncated. This breaks x86_64 TCG guests on s390 hosts. Other guest/host
    combinations may get lucky if either of the following holds:
    1. The guest memory layout does not need mmap_offset != 0.
    2. The host is little-endian and mmap_offset <= 0xffffffff so the
       truncation has no effect.
    
    Fix this by extending the existing 32-bit padding field to 64-bit. Now
    the padding reflects the actual compiler padding. This can be verified
    using pahole(1).
    
    Also document the layout properly in the vhost-user specification.  The
    vhost-user spec did not document the exact layout. It would be
    impossible to implement the spec without looking at the QEMU source
    code.
    
    Existing vhost-user frontends and device backends continue to work after
    this fix has been applied. The only change in the wire protocol is that
    QEMU now sets hdr.size to 40 instead of 36. If a vhost-user
    implementation has a hardcoded size check for 36 bytes, then it will
    fail with new QEMUs. Both QEMU and DPDK/SPDK don't check the exact
    payload size, so they continue to work.
    
    Fixes: f1aeb14b ("Transmit vhost-user memory regions individually")
    Cc: Raphael Norwitz <raphael.norwitz@nutanix.com>
    Cc: Cornelia Huck <cohuck@redhat.com>
    Cc: Michael S. Tsirkin <mst@redhat.com>
    Cc: Christian Borntraeger <borntraeger@de.ibm.com>
    Signed-off-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
    Message-Id: <20201109174355.1069147-1-stefanha@redhat.com>
    Reviewed-by: default avatarMichael S. Tsirkin <mst@redhat.com>
    Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
    Fixes: f1aeb14b ("Transmit vhost-user memory regions individually")
    Reviewed-by: default avatarCornelia Huck <cohuck@redhat.com>
    Reviewed-by: default avatarRaphael Norwitz <raphael.norwitz@nutanix.com>
Loading