- Feb 04, 2022
-
-
Michael S. Tsirkin authored
__get_cpuid_max returns an unsigned value. For consistency, store the result in an unsigned variable. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by:
Michael S. Tsirkin <mst@redhat.com> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org>
-
- Apr 01, 2020
-
-
Robert Hoo authored
By increasing avx2 length_to_accel to 128, we can simplify its logic and reduce a branch. The authorship of this patch actually belongs to Richard Henderson <richard.henderson@linaro.org>, I just fixed a boundary case on his original patch. Suggested-by:
Richard Henderson <richard.henderson@linaro.org> Signed-off-by:
Robert Hoo <robert.hu@linux.intel.com> Message-Id: <1585119021-46593-2-git-send-email-robert.hu@linux.intel.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Robert Hoo authored
Because in unit test, init_accel() will be called several times, each with different accelerator type. Signed-off-by:
Robert Hoo <robert.hu@linux.intel.com> Message-Id: <1585119021-46593-1-git-send-email-robert.hu@linux.intel.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
- Mar 16, 2020
-
-
Robert Hoo authored
And intialize buffer_is_zero() with it, when Intel AVX512F is available on host. This function utilizes Intel AVX512 fundamental instructions which is faster than its implementation with AVX2 (in my unit test, with 4K buffer, on CascadeLake SP, ~36% faster, buffer_zero_avx512() V.S. buffer_zero_avx2()). Signed-off-by:
Robert Hoo <robert.hu@linux.intel.com> Reviewed-by:
Richard Henderson <richard.henderson@linaro.org> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
- Jun 12, 2019
-
-
Markus Armbruster authored
No header includes qemu-common.h after this commit, as prescribed by qemu-common.h's file comment. Signed-off-by:
Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-5-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and net/tap-bsd.c fixed up]
-
- Jul 24, 2017
-
-
Richard Henderson authored
Clang 3.9 passes the CONFIG_AVX2_OPT configure test. However, the supplied <cpuid.h> does not contain the bit_AVX2 define that we use when detecting whether the routine can be enabled. Introduce a qemu-specific header that uses the compiler's definition of __cpuid et al, but supplies any missing bit_* definitions needed. This avoids introducing any extra ifdefs to util/bufferiszero.c, and allows quite a few to be removed from tcg/i386/tcg-target.inc.c. Signed-off-by:
Richard Henderson <rth@twiddle.net> Reviewed-by:
Peter Maydell <peter.maydell@linaro.org> Message-id: 20170719044018.18063-1-rth@twiddle.net Signed-off-by:
Peter Maydell <peter.maydell@linaro.org>
-
- Sep 14, 2016
-
-
Richard Henderson authored
Handle alignment of buffers, so that the vector paths can be used more often. Signed-off-by:
Richard Henderson <rth@twiddle.net> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1473800239-13841-1-git-send-email-rth@twiddle.net> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
- Sep 13, 2016
-
-
Richard Henderson authored
There's no real knowledge of the cacheline size, just prefetching one loop ahead. Signed-off-by:
Richard Henderson <rth@twiddle.net> Message-Id: <1472496380-19706-7-git-send-email-rth@twiddle.net> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Richard Henderson authored
Signed-off-by:
Richard Henderson <rth@twiddle.net> Message-Id: <1472496380-19706-6-git-send-email-rth@twiddle.net> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Richard Henderson authored
For ppc64le, gcc6 does extremely poorly with the Altivec code. Moreover, on POWER7 and POWER8, a hand-optimized Altivec version turns out to be no faster than the revised integer version, and therefore not worth the effort. Signed-off-by:
Richard Henderson <rth@twiddle.net> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Richard Henderson authored
The revised integer version is 4 times faster than the neon version on an AppliedMicro Mustang. Even with hand scheduling and additional unrolling I cannot make any neon version run as fast as the integer. Signed-off-by:
Richard Henderson <rth@twiddle.net> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Richard Henderson authored
Allow selection of several acceleration functions based on the size and alignment of the buffer. Do not require ifunc support for AVX2 acceleration. Signed-off-by:
Richard Henderson <rth@twiddle.net> Message-Id: <1472496380-19706-5-git-send-email-rth@twiddle.net> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Richard Henderson authored
Since the two users don't make use of the returned offset, beyond ensuring that the entire buffer is zero, consider the can_use_buffer_find_nonzero_offset and buffer_find_nonzero_offset functions internal. Reviewed-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by:
Richard Henderson <rth@twiddle.net> Message-Id: <1472496380-19706-4-git-send-email-rth@twiddle.net> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Richard Henderson authored
This is unused and complicates the vector interface. Signed-off-by:
Richard Henderson <rth@twiddle.net> Message-Id: <1472496380-19706-3-git-send-email-rth@twiddle.net> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Richard Henderson authored
Signed-off-by:
Richard Henderson <rth@twiddle.net> Message-Id: <1472496380-19706-2-git-send-email-rth@twiddle.net> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-