Skip to content
  • Zhuocheng Ding's avatar
    958ac3c4
    system/cpus: Fix CPUState.nr_cores' calculation · 958ac3c4
    Zhuocheng Ding authored
    
    
    From CPUState.nr_cores' comment, it represents "number of cores within
    this CPU package".
    
    After 003f230e ("machine: Tweak the order of topology members in
    struct CpuTopology"), the meaning of smp.cores changed to "the number of
    cores in one die", but this commit missed to change CPUState.nr_cores'
    calculation, so that CPUState.nr_cores became wrong and now it
    misses to consider numbers of clusters and dies.
    
    At present, only i386 is using CPUState.nr_cores.
    
    But as for i386, which supports die level, the uses of CPUState.nr_cores
    are very confusing:
    
    Early uses are based on the meaning of "cores per package" (before die
    is introduced into i386), and later uses are based on "cores per die"
    (after die's introduction).
    
    This difference is due to that commit a94e1428 ("target/i386: Add
    CPUID.1F generation support for multi-dies PCMachine") misunderstood
    that CPUState.nr_cores means "cores per die" when calculated
    CPUID.1FH.01H:EBX. After that, the changes in i386 all followed this
    wrong understanding.
    
    With the influence of 003f230e and a94e1428, for i386 currently
    the result of CPUState.nr_cores is "cores per die", thus the original
    uses of CPUState.cores based on the meaning of "cores per package" are
    wrong when multiple dies exist:
    1. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.01H:EBX[bits 23:16] is
       incorrect because it expects "cpus per package" but now the
       result is "cpus per die".
    2. In cpu_x86_cpuid() of target/i386/cpu.c, for all leaves of CPUID.04H:
       EAX[bits 31:26] is incorrect because they expect "cpus per package"
       but now the result is "cpus per die". The error not only impacts the
       EAX calculation in cache_info_passthrough case, but also impacts other
       cases of setting cache topology for Intel CPU according to cpu
       topology (specifically, the incoming parameter "num_cores" expects
       "cores per package" in encode_cache_cpuid4()).
    3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.0BH.01H:EBX[bits
       15:00] is incorrect because the EBX of 0BH.01H (core level) expects
       "cpus per package", which may be different with 1FH.01H (The reason
       is 1FH can support more levels. For QEMU, 1FH also supports die,
       1FH.01H:EBX[bits 15:00] expects "cpus per die").
    4. In cpu_x86_cpuid() of target/i386/cpu.c, when CPUID.80000001H is
       calculated, here "cpus per package" is expected to be checked, but in
       fact, now it checks "cpus per die". Though "cpus per die" also works
       for this code logic, this isn't consistent with AMD's APM.
    5. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.80000008H:ECX expects
       "cpus per package" but it obtains "cpus per die".
    6. In simulate_rdmsr() of target/i386/hvf/x86_emu.c, in
       kvm_rdmsr_core_thread_count() of target/i386/kvm/kvm.c, and in
       helper_rdmsr() of target/i386/tcg/sysemu/misc_helper.c,
       MSR_CORE_THREAD_COUNT expects "cpus per package" and "cores per
       package", but in these functions, it obtains "cpus per die" and
       "cores per die".
    
    On the other hand, these uses are correct now (they are added in/after
    a94e1428):
    1. In cpu_x86_cpuid() of target/i386/cpu.c, topo_info.cores_per_die
       meets the actual meaning of CPUState.nr_cores ("cores per die").
    2. In cpu_x86_cpuid() of target/i386/cpu.c, vcpus_per_socket (in CPUID.
       04H's calculation) considers number of dies, so it's correct.
    3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.1FH.01H:EBX[bits
       15:00] needs "cpus per die" and it gets the correct result, and
       CPUID.1FH.02H:EBX[bits 15:00] gets correct "cpus per package".
    
    When CPUState.nr_cores is correctly changed to "cores per package" again
    , the above errors will be fixed without extra work, but the "currently"
    correct cases will go wrong and need special handling to pass correct
    "cpus/cores per die" they want.
    
    Fix CPUState.nr_cores' calculation to fit the original meaning "cores
    per package", as well as changing calculation of topo_info.cores_per_die,
    vcpus_per_socket and CPUID.1FH.
    
    Fixes: a94e1428 ("target/i386: Add CPUID.1F generation support for multi-dies PCMachine")
    Fixes: 003f230e ("machine: Tweak the order of topology members in struct CpuTopology")
    Signed-off-by: default avatarZhuocheng Ding <zhuocheng.ding@intel.com>
    Co-developed-by: default avatarZhao Liu <zhao1.liu@intel.com>
    Signed-off-by: default avatarZhao Liu <zhao1.liu@intel.com>
    Tested-by: default avatarBabu Moger <babu.moger@amd.com>
    Tested-by: default avatarYongwei Ma <yongwei.ma@intel.com>
    Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
    Message-ID: <20231024090323.1859210-4-zhao1.liu@linux.intel.com>
    Signed-off-by: default avatarPhilippe Mathieu-Daudé <philmd@linaro.org>
    958ac3c4
    system/cpus: Fix CPUState.nr_cores' calculation
    Zhuocheng Ding authored
    
    
    From CPUState.nr_cores' comment, it represents "number of cores within
    this CPU package".
    
    After 003f230e ("machine: Tweak the order of topology members in
    struct CpuTopology"), the meaning of smp.cores changed to "the number of
    cores in one die", but this commit missed to change CPUState.nr_cores'
    calculation, so that CPUState.nr_cores became wrong and now it
    misses to consider numbers of clusters and dies.
    
    At present, only i386 is using CPUState.nr_cores.
    
    But as for i386, which supports die level, the uses of CPUState.nr_cores
    are very confusing:
    
    Early uses are based on the meaning of "cores per package" (before die
    is introduced into i386), and later uses are based on "cores per die"
    (after die's introduction).
    
    This difference is due to that commit a94e1428 ("target/i386: Add
    CPUID.1F generation support for multi-dies PCMachine") misunderstood
    that CPUState.nr_cores means "cores per die" when calculated
    CPUID.1FH.01H:EBX. After that, the changes in i386 all followed this
    wrong understanding.
    
    With the influence of 003f230e and a94e1428, for i386 currently
    the result of CPUState.nr_cores is "cores per die", thus the original
    uses of CPUState.cores based on the meaning of "cores per package" are
    wrong when multiple dies exist:
    1. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.01H:EBX[bits 23:16] is
       incorrect because it expects "cpus per package" but now the
       result is "cpus per die".
    2. In cpu_x86_cpuid() of target/i386/cpu.c, for all leaves of CPUID.04H:
       EAX[bits 31:26] is incorrect because they expect "cpus per package"
       but now the result is "cpus per die". The error not only impacts the
       EAX calculation in cache_info_passthrough case, but also impacts other
       cases of setting cache topology for Intel CPU according to cpu
       topology (specifically, the incoming parameter "num_cores" expects
       "cores per package" in encode_cache_cpuid4()).
    3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.0BH.01H:EBX[bits
       15:00] is incorrect because the EBX of 0BH.01H (core level) expects
       "cpus per package", which may be different with 1FH.01H (The reason
       is 1FH can support more levels. For QEMU, 1FH also supports die,
       1FH.01H:EBX[bits 15:00] expects "cpus per die").
    4. In cpu_x86_cpuid() of target/i386/cpu.c, when CPUID.80000001H is
       calculated, here "cpus per package" is expected to be checked, but in
       fact, now it checks "cpus per die". Though "cpus per die" also works
       for this code logic, this isn't consistent with AMD's APM.
    5. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.80000008H:ECX expects
       "cpus per package" but it obtains "cpus per die".
    6. In simulate_rdmsr() of target/i386/hvf/x86_emu.c, in
       kvm_rdmsr_core_thread_count() of target/i386/kvm/kvm.c, and in
       helper_rdmsr() of target/i386/tcg/sysemu/misc_helper.c,
       MSR_CORE_THREAD_COUNT expects "cpus per package" and "cores per
       package", but in these functions, it obtains "cpus per die" and
       "cores per die".
    
    On the other hand, these uses are correct now (they are added in/after
    a94e1428):
    1. In cpu_x86_cpuid() of target/i386/cpu.c, topo_info.cores_per_die
       meets the actual meaning of CPUState.nr_cores ("cores per die").
    2. In cpu_x86_cpuid() of target/i386/cpu.c, vcpus_per_socket (in CPUID.
       04H's calculation) considers number of dies, so it's correct.
    3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.1FH.01H:EBX[bits
       15:00] needs "cpus per die" and it gets the correct result, and
       CPUID.1FH.02H:EBX[bits 15:00] gets correct "cpus per package".
    
    When CPUState.nr_cores is correctly changed to "cores per package" again
    , the above errors will be fixed without extra work, but the "currently"
    correct cases will go wrong and need special handling to pass correct
    "cpus/cores per die" they want.
    
    Fix CPUState.nr_cores' calculation to fit the original meaning "cores
    per package", as well as changing calculation of topo_info.cores_per_die,
    vcpus_per_socket and CPUID.1FH.
    
    Fixes: a94e1428 ("target/i386: Add CPUID.1F generation support for multi-dies PCMachine")
    Fixes: 003f230e ("machine: Tweak the order of topology members in struct CpuTopology")
    Signed-off-by: default avatarZhuocheng Ding <zhuocheng.ding@intel.com>
    Co-developed-by: default avatarZhao Liu <zhao1.liu@intel.com>
    Signed-off-by: default avatarZhao Liu <zhao1.liu@intel.com>
    Tested-by: default avatarBabu Moger <babu.moger@amd.com>
    Tested-by: default avatarYongwei Ma <yongwei.ma@intel.com>
    Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
    Message-ID: <20231024090323.1859210-4-zhao1.liu@linux.intel.com>
    Signed-off-by: default avatarPhilippe Mathieu-Daudé <philmd@linaro.org>
Loading