Skip to content
Snippets Groups Projects
  1. May 20, 2021
    • Eric Farman's avatar
      vfio-ccw: Attempt to clean up all IRQs on error · dcc9cf38
      Eric Farman authored
      
      The vfio_ccw_unrealize() routine makes an unconditional attempt to
      unregister every IRQ notifier, though they may not have been registered
      in the first place (when running on an older kernel, for example).
      
      Let's mirror this behavior in the error cleanups in vfio_ccw_realize()
      so that if/when new IRQs are added, it is less confusing to recognize
      the necessary procedures. The worst case scenario would be some extra
      messages about an undefined IRQ, but since this is an error exit that
      won't be the only thing to worry about.
      
      And regarding those messages, let's change it to a warning instead of
      an error, to better reflect their severity. The existing code in both
      paths handles everything anyway.
      
      Signed-off-by: default avatarEric Farman <farman@linux.ibm.com>
      Acked-by: default avatarMatthew Rosato <mjrosato@linux.ibm.com>
      Message-Id: <20210428143652.1571487-1-farman@linux.ibm.com>
      Signed-off-by: default avatarCornelia Huck <cohuck@redhat.com>
      dcc9cf38
    • Eric Farman's avatar
      vfio-ccw: Permit missing IRQs · 6178d468
      Eric Farman authored
      
      Commit 690e29b9 ("vfio-ccw: Refactor ccw irq handler") changed
      one of the checks for the IRQ notifier registration from saying
      "the host needs to recognize the only IRQ that exists" to saying
      "the host needs to recognize ANY IRQ that exists."
      
      And this worked fine, because the subsequent change to support the
      CRW IRQ notifier doesn't get into this code when running on an older
      kernel, thanks to a guard by a capability region. The later addition
      of the REQ(uest) IRQ by commit b2f96f9e ("vfio-ccw: Connect the
      device request notifier") broke this assumption because there is no
      matching capability region. Thus, running new QEMU on an older
      kernel fails with:
      
        vfio: unexpected number of irqs 2
      
      Let's adapt the message here so that there's a better clue of what
      IRQ is missing.
      
      Furthermore, let's make the REQ(uest) IRQ not fail when attempting
      to register it, to permit running vfio-ccw on a newer QEMU with an
      older kernel.
      
      Fixes: b2f96f9e ("vfio-ccw: Connect the device request notifier")
      Signed-off-by: default avatarEric Farman <farman@linux.ibm.com>
      Message-Id: <20210421152053.2379873-1-farman@linux.ibm.com>
      Signed-off-by: default avatarCornelia Huck <cohuck@redhat.com>
      6178d468
  2. May 02, 2021
  3. Mar 04, 2021
    • Eric Farman's avatar
      vfio-ccw: Do not read region ret_code after write · d6cd6631
      Eric Farman authored
      
      A pwrite() call returns the number of bytes written (or -1 on error),
      and vfio-ccw compares this number with the size of the region to
      determine if an error had occurred or not.
      
      If they are not equal, this is a failure and the errno is used to
      determine exactly how things failed. An errno of zero is possible
      (though unlikely) in this situation and would be translated to a
      successful operation.
      
      If they ARE equal, the ret_code field is read from the region to
      determine how to proceed. While the kernel sets the ret_code field
      as necessary, the region and thus this field is not "written back"
      to the user. So the value can only be what it was initialized to,
      which is zero.
      
      So, let's convert an unexpected length with errno of zero to a
      return code of -EFAULT, and explicitly set an expected length to
      a return code of zero. This will be a little safer and clearer.
      
      Suggested-by: default avatarMatthew Rosato <mjrosato@linux.ibm.com>
      Signed-off-by: default avatarEric Farman <farman@linux.ibm.com>
      Message-Id: <20210303160739.2179378-1-farman@linux.ibm.com>
      Signed-off-by: default avatarCornelia Huck <cohuck@redhat.com>
      d6cd6631
  4. Jan 21, 2021
  5. Oct 02, 2020
  6. Jul 02, 2020
  7. Jun 18, 2020
  8. Jun 05, 2020
  9. May 15, 2020
    • Markus Armbruster's avatar
      qdev: Unrealize must not fail · b69c3c21
      Markus Armbruster authored
      
      Devices may have component devices and buses.
      
      Device realization may fail.  Realization is recursive: a device's
      realize() method realizes its components, and device_set_realized()
      realizes its buses (which should in turn realize the devices on that
      bus, except bus_set_realized() doesn't implement that, yet).
      
      When realization of a component or bus fails, we need to roll back:
      unrealize everything we realized so far.  If any of these unrealizes
      failed, the device would be left in an inconsistent state.  Must not
      happen.
      
      device_set_realized() lets it happen: it ignores errors in the roll
      back code starting at label child_realize_fail.
      
      Since realization is recursive, unrealization must be recursive, too.
      But how could a partly failed unrealize be rolled back?  We'd have to
      re-realize, which can fail.  This design is fundamentally broken.
      
      device_set_realized() does not roll back at all.  Instead, it keeps
      unrealizing, ignoring further errors.
      
      It can screw up even for a device with no buses: if the lone
      dc->unrealize() fails, it still unregisters vmstate, and calls
      listeners' unrealize() callback.
      
      bus_set_realized() does not roll back either.  Instead, it stops
      unrealizing.
      
      Fortunately, no unrealize method can fail, as we'll see below.
      
      To fix the design error, drop parameter @errp from all the unrealize
      methods.
      
      Any unrealize method that uses @errp now needs an update.  This leads
      us to unrealize() methods that can fail.  Merely passing it to another
      unrealize method cannot cause failure, though.  Here are the ones that
      do other things with @errp:
      
      * virtio_serial_device_unrealize()
      
        Fails when qbus_set_hotplug_handler() fails, but still does all the
        other work.  On failure, the device would stay realized with its
        resources completely gone.  Oops.  Can't happen, because
        qbus_set_hotplug_handler() can't actually fail here.  Pass
        &error_abort to qbus_set_hotplug_handler() instead.
      
      * hw/ppc/spapr_drc.c's unrealize()
      
        Fails when object_property_del() fails, but all the other work is
        already done.  On failure, the device would stay realized with its
        vmstate registration gone.  Oops.  Can't happen, because
        object_property_del() can't actually fail here.  Pass &error_abort
        to object_property_del() instead.
      
      * spapr_phb_unrealize()
      
        Fails and bails out when remove_drcs() fails, but other work is
        already done.  On failure, the device would stay realized with some
        of its resources gone.  Oops.  remove_drcs() fails only when
        chassis_from_bus()'s object_property_get_uint() fails, and it can't
        here.  Pass &error_abort to remove_drcs() instead.
      
      Therefore, no unrealize method can fail before this patch.
      
      device_set_realized()'s recursive unrealization via bus uses
      object_property_set_bool().  Can't drop @errp there, so pass
      &error_abort.
      
      We similarly unrealize with object_property_set_bool() elsewhere,
      always ignoring errors.  Pass &error_abort instead.
      
      Several unrealize methods no longer handle errors from other unrealize
      methods: virtio_9p_device_unrealize(),
      virtio_input_device_unrealize(), scsi_qdev_unrealize(), ...
      Much of the deleted error handling looks wrong anyway.
      
      One unrealize methods no longer ignore such errors:
      usb_ehci_pci_exit().
      
      Several realize methods no longer ignore errors when rolling back:
      v9fs_device_realize_common(), pci_qdev_unrealize(),
      spapr_phb_realize(), usb_qdev_realize(), vfio_ccw_realize(),
      virtio_device_realize().
      
      Signed-off-by: default avatarMarkus Armbruster <armbru@redhat.com>
      Reviewed-by: default avatarPhilippe Mathieu-Daudé <philmd@redhat.com>
      Reviewed-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Message-Id: <20200505152926.18877-17-armbru@redhat.com>
      b69c3c21
  10. Jan 24, 2020
  11. Dec 14, 2019
  12. Aug 16, 2019
    • Markus Armbruster's avatar
      Include hw/qdev-properties.h less · a27bd6c7
      Markus Armbruster authored
      
      In my "build everything" tree, changing hw/qdev-properties.h triggers
      a recompile of some 2700 out of 6600 objects (not counting tests and
      objects that don't depend on qemu/osdep.h).
      
      Many places including hw/qdev-properties.h (directly or via hw/qdev.h)
      actually need only hw/qdev-core.h.  Include hw/qdev-core.h there
      instead.
      
      hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h
      and hw/qdev-properties.h, which in turn includes hw/qdev-core.h.
      Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h.
      
      While there, delete a few superfluous inclusions of hw/qdev-core.h.
      
      Touching hw/qdev-properties.h now recompiles some 1200 objects.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Daniel P. Berrangé" <berrange@redhat.com>
      Cc: Eduardo Habkost <ehabkost@redhat.com>
      Signed-off-by: default avatarMarkus Armbruster <armbru@redhat.com>
      Reviewed-by: default avatarEduardo Habkost <ehabkost@redhat.com>
      Message-Id: <20190812052359.30071-22-armbru@redhat.com>
      a27bd6c7
    • Markus Armbruster's avatar
      Include qemu/main-loop.h less · db725815
      Markus Armbruster authored
      
      In my "build everything" tree, changing qemu/main-loop.h triggers a
      recompile of some 5600 out of 6600 objects (not counting tests and
      objects that don't depend on qemu/osdep.h).  It includes block/aio.h,
      which in turn includes qemu/event_notifier.h, qemu/notify.h,
      qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h,
      qemu/thread.h, qemu/timer.h, and a few more.
      
      Include qemu/main-loop.h only where it's needed.  Touching it now
      recompiles only some 1700 objects.  For block/aio.h and
      qemu/event_notifier.h, these numbers drop from 5600 to 2800.  For the
      others, they shrink only slightly.
      
      Signed-off-by: default avatarMarkus Armbruster <armbru@redhat.com>
      Message-Id: <20190812052359.30071-21-armbru@redhat.com>
      Reviewed-by: default avatarAlex Bennée <alex.bennee@linaro.org>
      Reviewed-by: default avatarPhilippe Mathieu-Daudé <philmd@redhat.com>
      Tested-by: default avatarPhilippe Mathieu-Daudé <philmd@redhat.com>
      db725815
  13. Jul 08, 2019
  14. Jun 24, 2019
  15. Jun 12, 2019
  16. May 22, 2019
  17. Apr 12, 2019
  18. Apr 03, 2019
    • Daniel P. Berrangé's avatar
      hw/vfio/ccw: avoid taking address members in packed structs · e1d0b372
      Daniel P. Berrangé authored
      
      The GCC 9 compiler complains about many places in s390 code
      that take the address of members of the 'struct SCHIB' which
      is marked packed:
      
      hw/vfio/ccw.c: In function ‘vfio_ccw_io_notifier_handler’:
      hw/vfio/ccw.c:133:15: warning: taking address of packed member of ‘struct SCHIB’ may result in an unaligned pointer value \
      [-Waddress-of-packed-member]
        133 |     SCSW *s = &sch->curr_status.scsw;
            |               ^~~~~~~~~~~~~~~~~~~~~~
      hw/vfio/ccw.c:134:15: warning: taking address of packed member of ‘struct SCHIB’ may result in an unaligned pointer value \
      [-Waddress-of-packed-member]
        134 |     PMCW *p = &sch->curr_status.pmcw;
            |               ^~~~~~~~~~~~~~~~~~~~~~
      
      ...snip many more...
      
      Almost all of these are just done for convenience to avoid
      typing out long variable/field names when referencing struct
      members. We can get most of this convenience by taking the
      address of the 'struct SCHIB' instead, avoiding triggering
      the compiler warnings.
      
      In a couple of places we copy via a local variable which is
      a technique already applied elsewhere in s390 code for this
      problem.
      
      Signed-off-by: default avatarDaniel P. Berrangé <berrange@redhat.com>
      Message-Id: <20190329111104.17223-12-berrange@redhat.com>
      Reviewed-by: default avatarEric Farman <farman@linux.ibm.com>
      Reviewed-by: default avatarHalil Pasic <pasic@linux.ibm.com>
      Reviewed-by: default avatarFarhan Ali <alifm@linux.ibm.com>
      Signed-off-by: default avatarCornelia Huck <cohuck@redhat.com>
      e1d0b372
  19. Sep 24, 2018
  20. Aug 17, 2018
    • Alex Williamson's avatar
      vfio/ccw/pci: Allow devices to opt-in for ballooning · 238e9172
      Alex Williamson authored
      
      If a vfio assigned device makes use of a physical IOMMU, then memory
      ballooning is necessarily inhibited due to the page pinning, lack of
      page level granularity at the IOMMU, and sufficient notifiers to both
      remove the page on balloon inflation and add it back on deflation.
      However, not all devices are backed by a physical IOMMU.  In the case
      of mediated devices, if a vendor driver is well synchronized with the
      guest driver, such that only pages actively used by the guest driver
      are pinned by the host mdev vendor driver, then there should be no
      overlap between pages available for the balloon driver and pages
      actively in use by the device.  Under these conditions, ballooning
      should be safe.
      
      vfio-ccw devices are always mediated devices and always operate under
      the constraints above.  Therefore we can consider all vfio-ccw devices
      as balloon compatible.
      
      The situation is far from straightforward with vfio-pci.  These
      devices can be physical devices with physical IOMMU backing or
      mediated devices where it is unknown whether a physical IOMMU is in
      use or whether the vendor driver is well synchronized to the working
      set of the guest driver.  The safest approach is therefore to assume
      all vfio-pci devices are incompatible with ballooning, but allow user
      opt-in should they have further insight into mediated devices.
      
      Signed-off-by: default avatarAlex Williamson <alex.williamson@redhat.com>
      238e9172
  21. Jun 18, 2018
    • Halil Pasic's avatar
      vfio-ccw: add force unlimited prefetch property · 9a51c9ee
      Halil Pasic authored
      
      There is at least one guest (OS) such that although it does not rely on
      the guarantees provided by ORB 1 word 9 bit (aka unlimited prefetch, aka
      P bit) not being set, it fails to tell this to the machine.
      
      Usually this ain't a big deal, as the original purpose of the P bit is to
      allow for performance optimizations. vfio-ccw however can not provide the
      guarantees required if the bit is not set.
      
      It is not possible to implement support for the P bit not set without
      transitioning to lower level protocols for vfio-ccw.  So let's give the
      user the opportunity to force setting the P bit, if the user knows this
      is safe.  For self modifying channel programs forcing the P bit is not
      safe.  If the P bit is forced for a self modifying channel program things
      are expected to break in strange ways.
      
      Let's also avoid warning multiple about P bit not set in the ORB in case
      P bit is not told to be forced, and designate the affected vfio-ccw
      device.
      
      Signed-off-by: default avatarHalil Pasic <pasic@linux.ibm.com>
      Suggested-by: default avatarDong Jia Shi <bjsdjshi@linux.ibm.com>
      Acked-by: default avatarJason J. Herne <jjherne@linux.ibm.com>
      Tested-by: default avatarJason J. Herne <jjherne@linux.ibm.com>
      Message-Id: <20180524175828.3143-2-pasic@linux.ibm.com>
      Signed-off-by: default avatarCornelia Huck <cohuck@redhat.com>
      9a51c9ee
  22. May 31, 2018
  23. Apr 30, 2018
    • Greg Kurz's avatar
      vfio-ccw: introduce vfio_ccw_get_device() · c96f2c2a
      Greg Kurz authored
      
      A recent patch fixed leaks of the dynamically allocated vcdev->vdev.name
      field in vfio_ccw_realize(), but we now have three freeing sites for it.
      This is unfortunate and seems to indicate something is wrong with its
      life cycle.
      
      The root issue is that vcdev->vdev.name is set before vfio_get_device()
      is called, which theoretically prevents to call vfio_put_device() to
      do the freeing. Well actually, we could call it anyway  because
      vfio_put_base_device() is a nop if the device isn't attached, but this
      would be confusing.
      
      This patch hence moves all the logic of attaching the device, including
      the "already attached" check, to a separate vfio_ccw_get_device() function,
      counterpart of vfio_put_device(). While here, vfio_put_device() is renamed
      to vfio_ccw_put_device() for consistency.
      
      Signed-off-by: default avatarGreg Kurz <groug@kaod.org>
      Message-Id: <152326891065.266543.9487977590811413472.stgit@bahia.lan>
      Signed-off-by: default avatarCornelia Huck <cohuck@redhat.com>
      c96f2c2a
  24. Apr 09, 2018
    • Greg Kurz's avatar
      vfio-ccw: fix memory leaks in vfio_ccw_realize() · be4d026f
      Greg Kurz authored
      
      If the subchannel is already attached or if vfio_get_device() fails, the
      code jumps to the 'out_device_err' label and doesn't free the string it
      has just allocated.
      
      The code should be reworked so that vcdev->vdev.name only gets set when
      the device has been attached, and freed when it is about to be detached.
      This could be achieved  with the addition of a vfio_ccw_get_device()
      function that would be the counterpart of vfio_put_device(). But this is
      a more elaborate cleanup that should be done in a follow-up. For now,
      let's just add calls to g_free() on the buggy error paths.
      
      Signed-off-by: default avatarGreg Kurz <groug@kaod.org>
      Message-Id: <152311222681.203086.8874800175539040298.stgit@bahia>
      Signed-off-by: default avatarCornelia Huck <cohuck@redhat.com>
      be4d026f
  25. Mar 08, 2018
  26. Dec 18, 2017
  27. Oct 20, 2017
    • Halil Pasic's avatar
      s390x: improve error handling for SSCH and RSCH · 66dc50f7
      Halil Pasic authored
      
      Simplify the error handling of the SSCH and RSCH handler avoiding
      arbitrary and cryptic error codes being used to tell how the instruction
      is supposed to end.  Let the code detecting the condition tell how it's
      to be handled in a less ambiguous way.  It's best to handle SSCH and RSCH
      in one go as the emulation of the two shares a lot of code.
      
      For passthrough this change isn't pure refactoring, but changes the way
      kernel reported EFAULT is handled. After clarifying the kernel interface
      we decided that EFAULT shall be mapped to unit exception.  Same goes for
      unexpected error codes and absence of required ORB flags.
      
      Signed-off-by: default avatarHalil Pasic <pasic@linux.vnet.ibm.com>
      Message-Id: <20171017140453.51099-4-pasic@linux.vnet.ibm.com>
      Tested-by: default avatarDong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
      [CH: cosmetic changes]
      Reviewed-by: default avatarDong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
      Signed-off-by: default avatarCornelia Huck <cohuck@redhat.com>
      66dc50f7
  28. Oct 06, 2017
  29. Jul 25, 2017
  30. May 19, 2017
Loading