Skip to content
Snippets Groups Projects
  1. Mar 22, 2017
    • Laurent Vivier's avatar
      numa,spapr: align default numa node memory size to 256MB · 55641213
      Laurent Vivier authored
      
      Since commit 224245bf ("spapr: Add LMB DR connectors"), NUMA node
      memory size must be aligned to 256MB (SPAPR_MEMORY_BLOCK_SIZE).
      
      But when "-numa" option is provided without "mem" parameter,
      the memory is equally divided between nodes, but 8MB aligned.
      This can be not valid for pseries.
      
      In that case we can have:
      $ ./ppc64-softmmu/qemu-system-ppc64 -m 4G -numa node -numa node -numa node
      qemu-system-ppc64: Node 0 memory size 0x55000000 is not aligned to 256 MiB
      
      With this patch, we have:
      (qemu) info numa
      3 nodes
      node 0 cpus: 0
      node 0 size: 1280 MB
      node 1 cpus:
      node 1 size: 1280 MB
      node 2 cpus:
      node 2 size: 1536 MB
      
      Signed-off-by: default avatarLaurent Vivier <lvivier@redhat.com>
      Signed-off-by: default avatarDavid Gibson <david@gibson.dropbear.id.au>
      55641213
  2. Feb 22, 2017
  3. Jan 16, 2017
  4. Jan 12, 2017
  5. Oct 09, 2016
  6. Aug 07, 2016
  7. Aug 02, 2016
    • Greg Kurz's avatar
      numa: set the memory backend "is_mapped" field · 0b217571
      Greg Kurz authored
      
      Commit 2aece63c "hostmem: detect host backend memory is being used properly"
      added a way to know if a memory backend is busy or available for use. It
      caused a slight regression if we pass the same backend to a NUMA node and
      to a pc-dimm device:
      
      -m 1G,slots=2,maxmem=2G \
      -object memory-backend-ram,size=1G,id=mem-mem1 \
      -device pc-dimm,id=dimm-mem1,memdev=mem-mem1 \
      -numa node,nodeid=0,memdev=mem-mem1
      
      Before commit 2aece63c, this would cause QEMU to print an error message and
      to exit gracefully:
      
      qemu-system-ppc64: -device pc-dimm,id=dimm-mem1,memdev=mem-mem1:
          can't use already busy memdev: mem-mem1
      
      Since commit 2aece63c, QEMU hits an assertion in the memory code:
      
      qemu-system-ppc64: memory.c:1934: memory_region_add_subregion_common:
          Assertion `!subregion->container' failed.
      Aborted
      
      This happens because pc-dimm devices don't use memory_region_is_mapped()
      anymore and cannot guess the backend is already used by a NUMA node.
      
      Let's revert to the previous behavior by turning the NUMA code to also
      call host_memory_backend_set_mapped() when it uses a backend.
      
      Fixes: 2aece63c
      Signed-off-by: default avatarGreg Kurz <groug@kaod.org>
      Message-Id: <146891691503.15642.9817215371777203794.stgit@bahia.lan>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      0b217571
  8. Jul 06, 2016
  9. Mar 18, 2016
    • Eric Blake's avatar
      qapi: Don't special-case simple union wrappers · 32bafa8f
      Eric Blake authored
      
      Simple unions were carrying a special case that hid their 'data'
      QMP member from the resulting C struct, via the hack method
      QAPISchemaObjectTypeVariant.simple_union_type().  But by using
      the work we started by unboxing flat union and alternate
      branches, coupled with the ability to visit the members of an
      implicit type, we can now expose the simple union's implicit
      type in qapi-types.h:
      
      | struct q_obj_ImageInfoSpecificQCow2_wrapper {
      |     ImageInfoSpecificQCow2 *data;
      | };
      |
      | struct q_obj_ImageInfoSpecificVmdk_wrapper {
      |     ImageInfoSpecificVmdk *data;
      | };
      ...
      | struct ImageInfoSpecific {
      |     ImageInfoSpecificKind type;
      |     union { /* union tag is @type */
      |         void *data;
      |-        ImageInfoSpecificQCow2 *qcow2;
      |-        ImageInfoSpecificVmdk *vmdk;
      |+        q_obj_ImageInfoSpecificQCow2_wrapper qcow2;
      |+        q_obj_ImageInfoSpecificVmdk_wrapper vmdk;
      |     } u;
      | };
      
      Doing this removes asymmetry between QAPI's QMP side and its
      C side (both sides now expose 'data'), and means that the
      treatment of a simple union as sugar for a flat union is now
      equivalent in both languages (previously the two approaches used
      a different layer of dereferencing, where the simple union could
      be converted to a flat union with equivalent C layout but
      different {} on the wire, or to an equivalent QMP wire form
      but with different C representation).  Using the implicit type
      also lets us get rid of the simple_union_type() hack.
      
      Of course, now all clients of simple unions have to adjust from
      using su->u.member to using su->u.member.data; while this touches
      a number of files in the tree, some earlier cleanup patches
      helped minimize the change to the initialization of a temporary
      variable rather than every single member access.  The generated
      qapi-visit.c code is also affected by the layout change:
      
      |@@ -7393,10 +7393,10 @@ void visit_type_ImageInfoSpecific_member
      |     }
      |     switch (obj->type) {
      |     case IMAGE_INFO_SPECIFIC_KIND_QCOW2:
      |-        visit_type_ImageInfoSpecificQCow2(v, "data", &obj->u.qcow2, &err);
      |+        visit_type_q_obj_ImageInfoSpecificQCow2_wrapper_members(v, &obj->u.qcow2, &err);
      |         break;
      |     case IMAGE_INFO_SPECIFIC_KIND_VMDK:
      |-        visit_type_ImageInfoSpecificVmdk(v, "data", &obj->u.vmdk, &err);
      |+        visit_type_q_obj_ImageInfoSpecificVmdk_wrapper_members(v, &obj->u.vmdk, &err);
      |         break;
      |     default:
      |         abort();
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Message-Id: <1458254921-17042-13-git-send-email-eblake@redhat.com>
      Signed-off-by: default avatarMarkus Armbruster <armbru@redhat.com>
      32bafa8f
  10. Mar 04, 2016
    • Eric Blake's avatar
      qapi-dealloc: Reduce use outside of generated code · 96a1616c
      Eric Blake authored
      
      No need to roll our own use of the dealloc visitors when we can
      just directly use the qapi_free_FOO() functions that do what we
      want in one line.
      
      In net.c, inline net_visit() into its remaining lone caller.
      
      After this patch, test-visitor-serialization.c is the only
      non-generated file that needs to use a dealloc visitor, because
      it is testing low level aspects of the visitor interface.
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Message-Id: <1456262075-3311-2-git-send-email-eblake@redhat.com>
      Signed-off-by: default avatarMarkus Armbruster <armbru@redhat.com>
      96a1616c
  11. Feb 08, 2016
    • Eric Blake's avatar
      qapi: Swap visit_* arguments for consistent 'name' placement · 51e72bc1
      Eric Blake authored
      
      JSON uses "name":value, but many of our visitor interfaces were
      called with visit_type_FOO(v, &value, name, errp).  This can be
      a bit confusing to have to mentally swap the parameter order to
      match JSON order.  It's particularly bad for visit_start_struct(),
      where the 'name' parameter is smack in the middle of the
      otherwise-related group of 'obj, kind, size' parameters! It's
      time to do a global swap of the parameter ordering, so that the
      'name' parameter is always immediately after the Visitor argument.
      
      Additional reason in favor of the swap: the existing include/qjson.h
      prefers listing 'name' first in json_prop_*(), and I have plans to
      unify that file with the qapi visitors; listing 'name' first in
      qapi will minimize churn to the (admittedly few) qjson.h clients.
      
      Later patches will then fix docs, object.h, visitor-impl.h, and
      those clients to match.
      
      Done by first patching scripts/qapi*.py by hand to make generated
      files do what I want, then by running the following Coccinelle
      script to affect the rest of the code base:
       $ spatch --sp-file script `git grep -l '\bvisit_' -- '**/*.[ch]'`
      I then had to apply some touchups (Coccinelle insisted on TAB
      indentation in visitor.h, and botched the signature of
      visit_type_enum() by rewriting 'const char *const strings[]' to
      the syntactically invalid 'const char*const[] strings').  The
      movement of parameters is sufficient to provoke compiler errors
      if any callers were missed.
      
          // Part 1: Swap declaration order
          @@
          type TV, TErr, TObj, T1, T2;
          identifier OBJ, ARG1, ARG2;
          @@
           void visit_start_struct
          -(TV v, TObj OBJ, T1 ARG1, const char *name, T2 ARG2, TErr errp)
          +(TV v, const char *name, TObj OBJ, T1 ARG1, T2 ARG2, TErr errp)
           { ... }
      
          @@
          type bool, TV, T1;
          identifier ARG1;
          @@
           bool visit_optional
          -(TV v, T1 ARG1, const char *name)
          +(TV v, const char *name, T1 ARG1)
           { ... }
      
          @@
          type TV, TErr, TObj, T1;
          identifier OBJ, ARG1;
          @@
           void visit_get_next_type
          -(TV v, TObj OBJ, T1 ARG1, const char *name, TErr errp)
          +(TV v, const char *name, TObj OBJ, T1 ARG1, TErr errp)
           { ... }
      
          @@
          type TV, TErr, TObj, T1, T2;
          identifier OBJ, ARG1, ARG2;
          @@
           void visit_type_enum
          -(TV v, TObj OBJ, T1 ARG1, T2 ARG2, const char *name, TErr errp)
          +(TV v, const char *name, TObj OBJ, T1 ARG1, T2 ARG2, TErr errp)
           { ... }
      
          @@
          type TV, TErr, TObj;
          identifier OBJ;
          identifier VISIT_TYPE =~ "^visit_type_";
          @@
           void VISIT_TYPE
          -(TV v, TObj OBJ, const char *name, TErr errp)
          +(TV v, const char *name, TObj OBJ, TErr errp)
           { ... }
      
          // Part 2: swap caller order
          @@
          expression V, NAME, OBJ, ARG1, ARG2, ERR;
          identifier VISIT_TYPE =~ "^visit_type_";
          @@
          (
          -visit_start_struct(V, OBJ, ARG1, NAME, ARG2, ERR)
          +visit_start_struct(V, NAME, OBJ, ARG1, ARG2, ERR)
          |
          -visit_optional(V, ARG1, NAME)
          +visit_optional(V, NAME, ARG1)
          |
          -visit_get_next_type(V, OBJ, ARG1, NAME, ERR)
          +visit_get_next_type(V, NAME, OBJ, ARG1, ERR)
          |
          -visit_type_enum(V, OBJ, ARG1, ARG2, NAME, ERR)
          +visit_type_enum(V, NAME, OBJ, ARG1, ARG2, ERR)
          |
          -VISIT_TYPE(V, OBJ, NAME, ERR)
          +VISIT_TYPE(V, NAME, OBJ, ERR)
          )
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Reviewed-by: default avatarMarc-André Lureau <marcandre.lureau@redhat.com>
      Message-Id: <1454075341-13658-19-git-send-email-eblake@redhat.com>
      Signed-off-by: default avatarMarkus Armbruster <armbru@redhat.com>
      51e72bc1
  12. Feb 04, 2016
    • Peter Maydell's avatar
      all: Clean up includes · d38ea87a
      Peter Maydell authored
      
      Clean up includes so that osdep.h is included first and headers
      which it implies are not included manually.
      
      This commit was created with scripts/clean-includes.
      
      Signed-off-by: default avatarPeter Maydell <peter.maydell@linaro.org>
      Message-id: 1454089805-5470-16-git-send-email-peter.maydell@linaro.org
      d38ea87a
  13. Jan 26, 2016
  14. Jan 13, 2016
    • Markus Armbruster's avatar
      Use error_fatal to simplify obvious fatal errors · 007b0657
      Markus Armbruster authored
      
      Done with this Coccinelle semantic patch:
      
          @@
          type T;
          identifier FUN, RET;
          expression list ARGS;
          expression ERR, EC;
          @@
          (
          -    T RET = FUN(ARGS, &ERR);
          +    T RET = FUN(ARGS, &error_fatal);
          |
          -    RET = FUN(ARGS, &ERR);
          +    RET = FUN(ARGS, &error_fatal);
          |
          -    FUN(ARGS, &ERR);
          +    FUN(ARGS, &error_fatal);
          )
          -    if (ERR != NULL) {
          -        error_report_err(ERR);
          -        exit(EC);
          -    }
      
      This is actually a more elegant version of my initial semantic patch
      by courtesy of Eduardo.
      
      It leaves dead Error * variables behind, cleaned up manually.
      
      Cc: qemu-arm@nongnu.org
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Eduardo Habkost <ehabkost@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarMarkus Armbruster <armbru@redhat.com>
      Reviewed-by: default avatarEduardo Habkost <ehabkost@redhat.com>
      007b0657
  15. Dec 18, 2015
  16. Nov 02, 2015
    • Eric Blake's avatar
      memory: Convert to new qapi union layout · 1fd5d4fe
      Eric Blake authored
      
      We have two issues with our qapi union layout:
      1) Even though the QMP wire format spells the tag 'type', the
      C code spells it 'kind', requiring some hacks in the generator.
      2) The C struct uses an anonymous union, which places all tag
      values in the same namespace as all non-variant members. This
      leads to spurious collisions if a tag value matches a non-variant
      member's name.
      
      Make the conversion to the new layout for memory-related code.
      
      Signed-off-by: default avatarEric Blake <eblake@redhat.com>
      Message-Id: <1445898903-12082-21-git-send-email-eblake@redhat.com>
      [Commit message tweaked slightly]
      Signed-off-by: default avatarMarkus Armbruster <armbru@redhat.com>
      1fd5d4fe
  17. Sep 18, 2015
    • Markus Armbruster's avatar
      Fix bad error handling after memory_region_init_ram() · f8ed85ac
      Markus Armbruster authored
      
      Symptom:
      
          $ qemu-system-x86_64 -m 10000000
          Unexpected error in ram_block_add() at /work/armbru/qemu/exec.c:1456:
          upstream-qemu: cannot set up guest memory 'pc.ram': Cannot allocate memory
          Aborted (core dumped)
      
      Root cause: commit ef701d7b screwed up handling of out-of-memory
      conditions.  Before the commit, we report the error and exit(1), in
      one place, ram_block_add().  The commit lifts the error handling up
      the call chain some, to three places.  Fine.  Except it uses
      &error_abort in these places, changing the behavior from exit(1) to
      abort(), and thus undoing the work of commit 39228250 "exec: Don't
      abort when we can't allocate guest memory".
      
      The three places are:
      
      * memory_region_init_ram()
      
        Commit 49946538 (right after commit ef701d7b) lifted the error
        handling further, through memory_region_init_ram(), multiplying the
        incorrect use of &error_abort.  Later on, imitation of existing
        (bad) code may have created more.
      
      * memory_region_init_ram_ptr()
      
        The &error_abort is still there.
      
      * memory_region_init_rom_device()
      
        Doesn't need fixing, because commit 33e0eb52 (soon after commit
        ef701d7b) lifted the error handling further, and in the process
        changed it from &error_abort to passing it up the call chain.
        Correct, because the callers are realize() methods.
      
      Fix the error handling after memory_region_init_ram() with a
      Coccinelle semantic patch:
      
          @r@
          expression mr, owner, name, size, err;
          position p;
          @@
                  memory_region_init_ram(mr, owner, name, size,
          (
          -                              &error_abort
          +                              &error_fatal
          |
                                         err@p
          )
                                        );
          @script:python@
              p << r.p;
          @@
          print "%s:%s:%s" % (p[0].file, p[0].line, p[0].column)
      
      When the last argument is &error_abort, it gets replaced by
      &error_fatal.  This is the fix.
      
      If the last argument is anything else, its position is reported.  This
      lets us check the fix is complete.  Four positions get reported:
      
      * ram_backend_memory_alloc()
      
        Error is passed up the call chain, ultimately through
        user_creatable_complete().  As far as I can tell, it's callers all
        handle the error sanely.
      
      * fsl_imx25_realize(), fsl_imx31_realize(), dp8393x_realize()
      
        DeviceClass.realize() methods, errors handled sanely further up the
        call chain.
      
      We're good.  Test case again behaves:
      
          $ qemu-system-x86_64 -m 10000000
          qemu-system-x86_64: cannot set up guest memory 'pc.ram': Cannot allocate memory
          [Exit 1 ]
      
      The next commits will repair the rest of commit ef701d7b's damage.
      
      Signed-off-by: default avatarMarkus Armbruster <armbru@redhat.com>
      Message-Id: <1441983105-26376-3-git-send-email-armbru@redhat.com>
      Reviewed-by: default avatarPeter Crosthwaite <crosthwaite.peter@gmail.com>
      f8ed85ac
  18. Sep 11, 2015
  19. Jul 15, 2015
  20. Jul 03, 2015
  21. Jun 22, 2015
  22. Jun 19, 2015
  23. Jun 09, 2015
  24. Jun 08, 2015
    • Markus Armbruster's avatar
      QemuOpts: Drop qemu_opts_foreach() parameter abort_on_failure · a4c7367f
      Markus Armbruster authored
      
      When the argument is non-zero, qemu_opts_foreach() stops on callback
      returning non-zero, and returns that value.
      
      When the argument is zero, it doesn't stop, and returns the bit-wise
      inclusive or of all the return values.  Funky :)
      
      The callers that pass zero could just as well pass one, because their
      callbacks can't return anything but zero:
      
      * qemu_add_globals()'s callback qdev_add_one_global()
      
      * qemu_config_write()'s callback config_write_opts()
      
      * main()'s callbacks default_driver_check(), drive_enable_snapshot(),
        vnc_init_func()
      
      Drop the parameter, and always stop.
      
      Signed-off-by: default avatarMarkus Armbruster <armbru@redhat.com>
      Reviewed-by: default avatarEric Blake <eblake@redhat.com>
      Acked-by: default avatarKevin Wolf <kwolf@redhat.com>
      a4c7367f
  25. Mar 19, 2015
    • Eduardo Habkost's avatar
      numa: Print warning if no node is assigned to a CPU · 549fc54b
      Eduardo Habkost authored
      
      We need all possible CPUs (including hotplug ones) to be present in the
      SRAT when QEMU starts. QEMU already does that correctly today, the only
      problem is that when a CPU is omitted from the NUMA configuration, it is
      silently assigned to node 0.
      
      Check if all CPUs up to max_cpus are present in the NUMA configuration
      and warn about missing CPUs.
      
      Make it just a warning, to allow management software to be updated if
      necessary. In the future we may make it a fatal error instead.
      
      Command-line examples:
      
      * Correct, no warning:
      
        $ qemu-system-x86_64 -smp 2,maxcpus=4
        $ qemu-system-x86_64 -smp 2,maxcpus=4 -numa node,cpus=0-3
      
      * Incomplete, with warnings:
      
        $ qemu-system-x86_64 -smp 2,maxcpus=4 -numa node,cpus=0
        qemu-system-x86_64: warning: CPU(s) not present in any NUMA nodes: 1 2 3
        qemu-system-x86_64: warning: All CPU(s) up to maxcpus should be described in NUMA config
      
        $ qemu-system-x86_64 -smp 2,maxcpus=4 -numa node,cpus=0-2
        qemu-system-x86_64: warning: CPU(s) not present in any NUMA nodes: 3
        qemu-system-x86_64: warning: All CPU(s) up to maxcpus should be described in NUMA config
      
      Signed-off-by: default avatarEduardo Habkost <ehabkost@redhat.com>
      ---
      v1 -> v2: (no changes)
      
      v2 -> v3:
       * Use enumerate_cpus() and error_report() for error message
       * Simplify logic using bitmap_full()
      
      v3 -> v4:
       * Clarify error message, mention that all CPUs up to
         maxcpus need to be described in NUMA config
      
      v4 -> v5:
       * Commit log update, to make problem description clearer
      549fc54b
    • Igor Mammedov's avatar
      numa: introduce machine callback for VCPU to node mapping · 57924bcd
      Igor Mammedov authored
      
      Current default round-robin way of distributing VCPUs among
      NUMA nodes might be wrong in case on multi-core/threads
      CPUs. Making guests confused wrt topology where cores from
      the same socket are on different nodes.
      
      Allow a machine to override default mapping by providing
       MachineClass::cpu_index_to_socket_id()
      callback which would allow it group VCPUs from a socket
      on the same NUMA node.
      
      Signed-off-by: default avatarIgor Mammedov <imammedo@redhat.com>
      Reviewed-by: default avatarAndreas Färber <afaerber@suse.de>
      Signed-off-by: default avatarEduardo Habkost <ehabkost@redhat.com>
      57924bcd
    • Eduardo Habkost's avatar
      numa: Reject configuration if CPU appears on multiple nodes · 3ef71975
      Eduardo Habkost authored
      
      Each CPU can appear in only one NUMA node on the NUMA config. Reject
      configuration if a CPU appears in multiple nodes.
      
      Reviewed-by: default avatarIgor Mammedov <imammedo@redhat.com>
      Signed-off-by: default avatarEduardo Habkost <ehabkost@redhat.com>
      3ef71975
    • Eduardo Habkost's avatar
      numa: Reject CPU indexes > max_cpus · 8979c945
      Eduardo Habkost authored
      
      CPU index is always less than max_cpus, as documented at sysemu.h:
      
      > The following shall be true for all CPUs:
      >   cpu->cpu_index < max_cpus <= MAX_CPUMASK_BITS
      
      Reject configuration which uses invalid CPU indexes.
      
      Reviewed-by: default avatarIgor Mammedov <imammedo@redhat.com>
      Signed-off-by: default avatarEduardo Habkost <ehabkost@redhat.com>
      8979c945
    • Eduardo Habkost's avatar
      numa: Fix off-by-one error at MAX_CPUMASK_BITS check · ed26b922
      Eduardo Habkost authored
      
      Fix the CPU index check to ensure we don't go beyond the size of the
      node_cpu bitmap.
      
      CPU index is always less than MAX_CPUMASK_BITS, as documented at
      sysemu.h:
      
      > The following shall be true for all CPUs:
      >   cpu->cpu_index < max_cpus <= MAX_CPUMASK_BITS
      
      Reviewed-by: default avatarIgor Mammedov <imammedo@redhat.com>
      Signed-off-by: default avatarEduardo Habkost <ehabkost@redhat.com>
      ed26b922
  26. Mar 10, 2015
  27. Feb 23, 2015
Loading