  1. May 08, 2018
  2. May 07, 2018
    • pc-dimm: factor out MemoryDevice interface · 2cc0e2e8
      David Hildenbrand authored
      
      On the QMP level, we already have the concept of memory devices:
          "query-memory-devices"
      Right now, we only support NVDIMM and PCDIMM.
      
      We want to map other devices into the address space of the guest later.
      Such devices could be, e.g., virtio devices. These devices will have a
      guest memory range assigned but won't be exposed via e.g. ACPI. We want
      to make them look like memory devices, but not be glued to pc-dimm.
      
      In particular, it will not always be possible to have TYPE_PC_DIMM as a
      parent class (e.g. for virtio devices). Let's use an interface instead.
      As a first step, convert the handling of
      - qmp_pc_dimm_device_list
      - get_plugged_memory_size
      to our new model. The plug/unplug handling etc. will follow later.
      
      A memory device will have to provide the following functions:
      - get_addr(): Necessary, as the property "addr" can e.g. not be used for
                    virtio devices (already defined).
      - get_plugged_size(): The amount this device offers to the guest as of
                            now.
      - get_region_size(): Because this can later on be bigger than the
                           plugged size.
      - fill_device_info(): Fill MemoryDeviceInfo, e.g. for qmp.
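
The four callbacks listed above can be pictured as a table of function pointers. The following is only a minimal sketch: all type and function names ending in "Sketch" or starting with "dimm_" are hypothetical stand-ins, the signatures are assumptions, and it does not use QEMU's QOM interface machinery.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in for MemoryDeviceInfo; the fields are illustrative only. */
typedef struct DeviceInfoSketch {
    uint64_t addr;
    uint64_t size;
} DeviceInfoSketch;

typedef struct MemoryDeviceSketch MemoryDeviceSketch;

/* The four callbacks named in the commit message (signatures are assumptions). */
typedef struct MemoryDeviceOps {
    uint64_t (*get_addr)(const MemoryDeviceSketch *md);
    uint64_t (*get_plugged_size)(const MemoryDeviceSketch *md);
    uint64_t (*get_region_size)(const MemoryDeviceSketch *md);
    void (*fill_device_info)(const MemoryDeviceSketch *md, DeviceInfoSketch *info);
} MemoryDeviceOps;

struct MemoryDeviceSketch {
    const MemoryDeviceOps *ops;
    uint64_t addr;
    uint64_t plugged_size;
    uint64_t region_size;   /* may be larger than plugged_size */
};

static uint64_t dimm_get_addr(const MemoryDeviceSketch *md)
{
    return md->addr;
}

static uint64_t dimm_get_plugged_size(const MemoryDeviceSketch *md)
{
    return md->plugged_size;
}

static uint64_t dimm_get_region_size(const MemoryDeviceSketch *md)
{
    return md->region_size;
}

static void dimm_fill_device_info(const MemoryDeviceSketch *md,
                                  DeviceInfoSketch *info)
{
    info->addr = md->addr;
    info->size = md->plugged_size;
}

static const MemoryDeviceOps dimm_ops = {
    .get_addr = dimm_get_addr,
    .get_plugged_size = dimm_get_plugged_size,
    .get_region_size = dimm_get_region_size,
    .fill_device_info = dimm_fill_device_info,
};

/* Sum the plugged sizes through the interface only, without knowing the
 * concrete device type. */
static uint64_t total_plugged_size(const MemoryDeviceSketch *devs, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += devs[i].ops->get_plugged_size(&devs[i]);
    }
    return total;
}
```

Callers that only go through the ops table do not care whether a device derives from pc-dimm, which is the point of using an interface instead of a parent class.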
      
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Message-Id: <20180423165126.15441-2-david@redhat.com>
      Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
  3. Apr 27, 2018
    • Clear mem_path if we fall back to anonymous RAM allocation · 6233b679
      David Gibson authored
      
      If the -mem-path option is set, we attempt to map the guest's RAM from a
      file in the given path; it's usually used to back guest RAM with hugepages.
      If we're unable to (e.g. not enough free hugepages) then we fall back to
      allocating normal anonymous pages.  This behaviour can be surprising, but a
      comment in allocate_system_memory_nonnuma() suggests it's legacy behaviour
      we can't change.
      
      What really isn't ok, though, is that in this case we leave mem_path set.
      That means functions which attempt to determine the pagesize of main RAM
      can erroneously think it is hugepage based on the requested path, even
      though it's not.
      
      This is particularly bad for the pseries machine type.  KVM HV limitations
      mean the guest can't use pagesizes larger than the host page size used to
      back RAM.  That means that such a fallback, rather than merely giving
      poorer performance than expected, will cause the guest to freeze up early
      in boot as it attempts to use large page mappings that can't work.
      
      This patch addresses the problem by clearing the mem_path variable when we
      fall back to anonymous pages, meaning that subsequent attempts to
      determine the RAM page size will get an accurate result.
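
The fallback described above can be sketched as follows. This is a simplified illustration, not the actual patch: mem_path_sketch, alloc_from_file_sketch() and allocate_ram_sketch() are hypothetical names, and the file-backed allocator is stubbed to always fail so that the fallback path is exercised.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical global playing the role of mem_path. */
static const char *mem_path_sketch;

/* Hypothetical file-backed allocator; it always fails here so that the
 * fallback path below is taken. */
static void *alloc_from_file_sketch(const char *path, size_t size)
{
    (void)path;
    (void)size;
    return NULL;
}

static void *allocate_ram_sketch(size_t size)
{
    void *ram = NULL;

    if (mem_path_sketch) {
        ram = alloc_from_file_sketch(mem_path_sketch, size);
        if (!ram) {
            /* Falling back to anonymous memory: forget the path so that
             * later page size queries do not assume hugepage backing. */
            mem_path_sketch = NULL;
        }
    }
    if (!ram) {
        ram = malloc(size);
    }
    return ram;
}

/* A later query that derives its answer from the (possibly cleared) path. */
static int ram_is_file_backed_sketch(void)
{
    return mem_path_sketch != NULL;
}
```

Without the assignment to NULL, ram_is_file_backed_sketch() would keep reporting file backing even though the memory came from malloc().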
      
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
  4. Mar 20, 2018
  5. Mar 08, 2018
    • numa: we don't implement NUMA for s390x · 81ce6aa5
      David Hildenbrand authored
      
      Right now it is possible to crash QEMU for s390x by providing e.g.
          -numa node,nodeid=0,cpus=0-1
      
      The problem is that numa.c uses mc->cpu_index_to_instance_props as an
      indicator of whether NUMA is supported by a machine type. We don't
      implement NUMA for s390x ("topology") yet. However, we need
      mc->cpu_index_to_instance_props for query-cpus.
      
      So let's fix this case by also checking for mc->get_default_cpu_node_id,
      which will be needed by machine_set_cpu_numa_node(). QEMU then rejects the
      option with:
      
      qemu-system-s390x: -numa node,nodeid=0,cpus=0-1: NUMA is not supported by
                         this machine-type
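
The stricter check can be sketched like this. MachineClassSketch is a hypothetical stand-in for the machine class, and the callback signatures are assumptions made only so that the example compiles on its own.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the machine class with the two callbacks
 * mentioned in the commit message (signatures are assumptions). */
typedef struct MachineClassSketch {
    int (*cpu_index_to_instance_props)(unsigned cpu_index);
    int (*get_default_cpu_node_id)(unsigned cpu_index);
} MachineClassSketch;

static int dummy_props_cb(unsigned cpu_index)
{
    return (int)cpu_index;
}

static int dummy_node_cb(unsigned cpu_index)
{
    (void)cpu_index;
    return 0;
}

/* NUMA counts as supported only if both callbacks are provided; a machine
 * that sets cpu_index_to_instance_props just for query-cpus is rejected. */
static bool machine_supports_numa(const MachineClassSketch *mc)
{
    return mc->cpu_index_to_instance_props != NULL &&
           mc->get_default_cpu_node_id != NULL;
}
```

A machine type in the position of s390x sets only the first callback, so the check returns false and the -numa option can be refused up front instead of crashing later.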
      
      While at it, make s390_cpu_index_to_props() look like on other
      architectures.
      
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Message-Id: <20180227110255.20999-1-david@redhat.com>
      Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Cornelia Huck <cohuck@redhat.com>
  6. Mar 02, 2018
    • qapi: Empty out qapi-schema.json · 112ed241
      Markus Armbruster authored
      
      The previous commit improved compile time by including less of the
      generated QAPI headers.  This is impossible for stuff defined directly
      in qapi-schema.json, because that ends up in headers that pull in
      everything.
      
      Move everything but include directives from qapi-schema.json to new
      sub-module qapi/misc.json, then include just the "misc" shard where
      possible.
      
      It's possible everywhere, except:
      
      * monitor.c needs qmp-command.h to get qmp_init_marshal()
      
      * monitor.c, ui/vnc.c and the generated qapi-event-FOO.c need
        qapi-event.h to get enum QAPIEvent
      
      Perhaps we'll get rid of those some other day.
      
      Adding a type to qapi/migration.json now recompiles some 120 instead
      of 2300 out of 5100 objects.
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Message-Id: <20180211093607.27351-25-armbru@redhat.com>
      [eblake: rebase to master]
      Signed-off-by: Eric Blake <eblake@redhat.com>
  7. Feb 09, 2018
  8. Feb 05, 2018
  9. Jan 19, 2018
    • hostmem-file: add "align" option · 98376843
      Haozhong Zhang authored
      
      When mapping the backend files with mmap(2), QEMU uses the host page size
      (getpagesize(2)) by default as the alignment of the mapping address.
      However, some backends may require alignments different from the page
      size. For example, mapping a device DAX (e.g., /dev/dax0.0) on Linux
      kernel 4.13 to an address that is 4K-aligned but not 2M-aligned
      fails with a kernel message like
      
      [617494.969768] dax dax0.0: qemu-system-x86: dax_mmap: fail, unaligned vma (0x7fa37c579000 - 0x7fa43c579000, 0x1fffff)
      
      Because there is no common way to query such an alignment requirement,
      we add the 'align' option to 'memory-backend-file', so that users or
      management utils, which have enough knowledge about the backend, can
      specify a proper alignment via this option.
      
      Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
      Message-Id: <20171211072806.2812-2-haozhong.zhang@intel.com>
      Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
      [ehabkost: fixed typo, fixed error_setg() format string]
      Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
  10. Dec 18, 2017
  11. Dec 14, 2017
  12. Nov 16, 2017
    • NUMA: Enable adding NUMA node implicitly · 7b8be49d
      Dou Liyang authored
      
      Linux and Windows need the ACPI SRAT table to make memory hotplug work properly;
      however, QEMU currently doesn't create an SRAT table if -numa options aren't
      present on the CLI.
      
      This breaks both Linux and Windows guests under certain conditions:
       * Windows: won't enable memory hotplug at all without an SRAT table
       * Linux: if QEMU is started with initial memory all below 4GB and no SRAT table
         is present, the guest kernel will use nommu DMA ops, which breaks 32-bit hw
         drivers when memory is hotplugged and the guest tries to use it with those
         drivers.
      
      Fix the above issues by automatically creating a NUMA node when QEMU is started
      with memory hotplug enabled but without '-numa' options on the CLI.
      (PS: auto-create the NUMA node only for new machine types so as not to break
      migration.)
      
      This provides an SRAT table to guests without explicit -numa options on the CLI
      and allows:
       * Windows: to enable memory hotplug
       * Linux: switch to SWIOTLB DMA ops, to bounce DMA transfers to 32bit allocated
         buffers that legacy drivers/hw can handle.
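
The condition for creating the implicit node can be sketched as below. MachineSketch and its field names are assumptions chosen for illustration; in particular, "memory hotplug enabled" is modelled here as a non-zero number of hotplug slots, and the per-machine-type opt-in as a boolean flag.

```c
#include <stdbool.h>

/* Hypothetical, flattened view of the machine state relevant here. */
typedef struct MachineSketch {
    int ram_slots;              /* > 0: memory hotplug was requested        */
    int nb_numa_nodes;          /* number of nodes given via -numa          */
    bool auto_enable_numa;      /* set only for new machine types (assumed) */
} MachineSketch;

/* Create a single implicit node when hotplug is enabled, no -numa option
 * was given, and the machine type opts in. Old machine types keep their
 * previous behaviour so that migration is not broken. */
static void maybe_add_implicit_node(MachineSketch *ms)
{
    if (ms->ram_slots > 0 && ms->nb_numa_nodes == 0 && ms->auto_enable_numa) {
        ms->nb_numa_nodes = 1;
    }
}
```

Once a node exists, the SRAT table is generated through the usual NUMA path, so no separate SRAT special case is needed.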
      
      [Rewritten by Igor]
      
      Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Suggested-by: Igor Mammedov <imammedo@redhat.com>
      Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Eduardo Habkost <ehabkost@redhat.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Marcel Apfelbaum <marcel@redhat.com>
      Cc: Igor Mammedov <imammedo@redhat.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Thomas Huth <thuth@redhat.com>
      Cc: Alistair Francis <alistair23@gmail.com>
      Cc: Takao Indoh <indou.takao@jp.fujitsu.com>
      Cc: Izumi Taku <izumi.taku@jp.fujitsu.com>
      Reviewed-by: Igor Mammedov <imammedo@redhat.com>
      Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  13. Oct 27, 2017
    • numa: fixup parsed NumaNodeOptions earlier · cc001888
      Igor Mammedov authored
      
      The numa 'mem' option, with or without a suffix, is possible
      only on CLI/HMP. Instead of fixing up the special suffix-less
      CLI case deep in parse_numa_node(), do it earlier, right
      after the option is parsed into NumaNodeOptions with OptsVisitor,
      so that the rest of the code uses valid values in
      NumaNodeOptions and doesn't have to reparse QemuOpts.
      
      This will help to isolate the CLI/HMP parts in parse_numa() and
      split out the processing of parsed NumaNodeOptions into a separate
      function that can be reused by a QMP handler, where we have
      only NumaNodeOptions and don't need any fixups.
      
      While at it reuse qemu_strtosz_MiB() instead of manually
      checking for suffixes.
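
The idea of "a bare number means MiB, an explicit suffix overrides that" can be sketched as follows. parse_size_default_mib() is a hypothetical, much simplified stand-in written for this illustration; it is not qemu_strtosz_MiB() and only understands decimal integers with an optional K, M or G suffix.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a size such as "512", "512M" or "2G" into bytes. A value without a
 * suffix is interpreted as MiB. Returns 0 on success, -1 on error. */
static int parse_size_default_mib(const char *s, uint64_t *out)
{
    char *end;
    unsigned long long value;
    unsigned shift;

    /* Reject signs and leading whitespace, which strtoull() would accept. */
    if (!isdigit((unsigned char)s[0])) {
        return -1;
    }

    errno = 0;
    value = strtoull(s, &end, 10);
    if (errno != 0) {
        return -1;
    }

    switch (*end) {
    case '\0':
        shift = 20;             /* no suffix: default unit is MiB */
        break;
    case 'K': case 'k':
        shift = 10;
        end++;
        break;
    case 'M': case 'm':
        shift = 20;
        end++;
        break;
    case 'G': case 'g':
        shift = 30;
        end++;
        break;
    default:
        return -1;
    }

    /* Trailing garbage after the suffix, or overflow when scaling. */
    if (*end != '\0' || value > (UINT64_MAX >> shift)) {
        return -1;
    }

    *out = (uint64_t)value << shift;
    return 0;
}
```

Doing this conversion once, right after parsing, means every later consumer sees a plain byte count and never needs to look at the original option string again.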
      
      Signed-off-by: Igor Mammedov <imammedo@redhat.com>
      Message-Id: <1507801198-98182-1-git-send-email-imammedo@redhat.com>
      Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
      Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
  14. Sep 19, 2017
  15. Sep 14, 2017
  16. Jul 14, 2017
  17. Jun 20, 2017
  18. Jun 05, 2017
  19. May 30, 2017
    • numa: Fix format string for "Invalid node" message · f892291e
      Eduardo Habkost authored
      
      Some compilers complain about the PRIu16 format string with the
      MAX(src, dst) and MAX_NODES arguments.  Example output from Apple LLVM
      version 7.3.0 (clang-703.0.31):
      
        numa.c:236:20: warning: format specifies type 'unsigned short' but the argument has type 'int' [-Wformat]
                           MAX(src, dst), MAX_NODES);
        ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
        include/qapi/error.h:163:35: note: expanded from macro 'error_setg'
                                (fmt), ## __VA_ARGS__)
                                          ^~~~~~~~~~~
        glib/2.52.2/include/glib-2.0/glib/gmacros.h:288:20: note: expanded from macro 'MAX'
        #define MAX(a, b)  (((a) > (b)) ? (a) : (b))
                           ^~~~~~~~~~~~~~~~~~~~~~~~~
        numa.c:236:35: warning: format specifies type 'unsigned short' but the argument has type 'int' [-Wformat]
                           MAX(src, dst), MAX_NODES);
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~
        include/qapi/error.h:163:35: note: expanded from macro 'error_setg'
                                (fmt), ## __VA_ARGS__)
                                          ^~~~~~~~~~~
        include/sysemu/sysemu.h:165:19: note: expanded from macro 'MAX_NODES'
        #define MAX_NODES 128
                          ^~~
      MAX(src, dst) promotes the src and dst arguments to int, and MAX_NODES
      is an int.  Use %d to silence those warnings.
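
The promotion can be checked in isolation. In the sketch below, MAX_SKETCH and MAX_NODES_SKETCH mirror the macros quoted in the warning output, while format_invalid_node() and its message text are hypothetical and exist only for this illustration. The example assumes a C11 compiler and a platform where int is wider than 16 bits.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_SKETCH(a, b)  (((a) > (b)) ? (a) : (b))
#define MAX_NODES_SKETCH  128

/* The operands of ?: undergo the integer promotions, so the result of
 * MAX_SKETCH() on two uint16_t values has type int, not uint16_t. */
_Static_assert(_Generic(MAX_SKETCH((uint16_t)1, (uint16_t)2),
                        int: 1, default: 0),
               "MAX of two uint16_t values is an int");

/* Both variadic arguments are ints, so %d is the matching conversion. */
static int format_invalid_node(char *buf, size_t len,
                               uint16_t src, uint16_t dst)
{
    return snprintf(buf, len, "Invalid node %d, max possible could be %d",
                    MAX_SKETCH(src, dst), MAX_NODES_SKETCH);
}
```

Since both arguments already arrive as int, switching to %d changes no values; it only makes the format string agree with the argument types the compiler sees.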
      
      Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
      Message-Id: <20170530184013.31044-1-ehabkost@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
  20. May 11, 2017
  21. May 07, 2017
  22. Mar 22, 2017
    • numa,spapr: align default numa node memory size to 256MB · 55641213
      Laurent Vivier authored
      
      Since commit 224245bf ("spapr: Add LMB DR connectors"), NUMA node
      memory size must be aligned to 256MB (SPAPR_MEMORY_BLOCK_SIZE).
      
      But when the "-numa" option is provided without a "mem" parameter,
      the memory is divided equally between the nodes, but only 8MB aligned.
      This may not be valid for pseries.
      
      In that case we can have:
      $ ./ppc64-softmmu/qemu-system-ppc64 -m 4G -numa node -numa node -numa node
      qemu-system-ppc64: Node 0 memory size 0x55000000 is not aligned to 256 MiB
      
      With this patch, we have:
      (qemu) info numa
      3 nodes
      node 0 cpus: 0
      node 0 size: 1280 MB
      node 1 cpus:
      node 1 size: 1280 MB
      node 2 cpus:
      node 2 size: 1536 MB
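
The numbers above can be reproduced with a small calculation: every node except the last gets total / nodes rounded down to the alignment, and the last node takes the remainder. split_ram_evenly() is a hypothetical helper written for this illustration, not the function changed by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/* Split `total` bytes across `nodes` NUMA nodes. Each node but the last
 * gets the even share rounded down to `align` (a power of two); the last
 * node receives whatever is left over. */
static void split_ram_evenly(uint64_t total, size_t nodes, uint64_t align,
                             uint64_t *sizes)
{
    uint64_t used = 0;

    assert(nodes > 0 && align != 0 && (align & (align - 1)) == 0);

    for (size_t i = 0; i + 1 < nodes; i++) {
        sizes[i] = (total / nodes) & ~(align - 1);
        used += sizes[i];
    }
    sizes[nodes - 1] = total - used;
}
```

For 4 GiB over three nodes, an 8 MiB alignment yields 0x55000000 bytes for node 0, which is the value rejected in the error message, while a 256 MiB alignment yields 1280, 1280 and 1536 MiB, matching the "info numa" output.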
      
      Signed-off-by: Laurent Vivier <lvivier@redhat.com>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
  23. Feb 22, 2017
  24. Jan 16, 2017
  25. Jan 12, 2017
  26. Oct 09, 2016