- Sep 17, 2017
Richard Henderson authored
There was a potential problem here with an ILP32 host with 64 host registers.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
It's not even clear what the interface REG and VAL32 were supposed to mean. All uses had REG = 0 and VAL32 was the bitset assigned to the destination.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-
- Sep 07, 2017
Richard Henderson authored
A new shared header, tcg-pool.inc.c, adds new_pool_label for registering a tcg_target_ulong to be emitted after the generated code, plus relocation data to install a pointer to the data. A new pointer is added to the TCGContext, so that we dump the constant pool as data, not code.
Signed-off-by: Richard Henderson <rth@twiddle.net>
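To make the mechanism concrete, here is a minimal sketch of the pool bookkeeping (types and names are illustrative, not the actual tcg-pool.inc.c code):

```c
#include <stdint.h>
#include <stdlib.h>

/* Record a constant plus the code location that must be patched to
 * point at it; the pool is emitted as data after the generated code. */
typedef struct PoolEntry {
    uintptr_t value;            /* stands in for tcg_target_ulong */
    uint8_t *patch_site;        /* insn that will load the pooled value */
    struct PoolEntry *next;
} PoolEntry;

static PoolEntry *pool_head;

static void new_pool_label_sketch(uintptr_t value, uint8_t *patch_site)
{
    PoolEntry *e = malloc(sizeof(*e));
    e->value = value;
    e->patch_site = patch_site;
    e->next = pool_head;
    pool_head = e;              /* flushed to memory after codegen */
}
```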
-
Richard Henderson authored
Dispense with TCGBackendData, as it has never been used for more than holding a single pointer. Use a define in the cpu/tcg-target.h to signal requirement for TCGLabelQemuLdst, so that we can drop the no-op tcg-be-null.h stubs. Rename tcg-be-ldst.h to tcg-ldst.inc.c.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
-
Richard Henderson authored
Replace the USE_DIRECT_JUMP ifdef with a TCG_TARGET_HAS_direct_jump boolean test. Replace the tb_set_jmp_target1 ifdef with an unconditional function tb_target_set_jmp_target. While we're touching all backends, add a parameter for tb->tc_ptr; we're going to need it shortly for some backends. Move tb_set_jmp_target and tb_add_jump from exec-all.h to cpu-exec.c. This opens the possibility for TCG_TARGET_HAS_direct_jump to be a runtime decision -- based on host cpu capabilities, the size of code_gen_buffer, or a future debugging switch.
Signed-off-by: Richard Henderson <rth@twiddle.net>
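For intuition, a hedged sketch of the patching duty for a hypothetical backend using an x86-style 5-byte JMP rel32 (not any real QEMU backend, which must also worry about patching atomically):

```c
#include <stdint.h>
#include <string.h>

/* Retarget a direct jump inside a translated block: recompute the
 * 32-bit displacement relative to the end of the jump instruction,
 * write it into the immediate field, and flush the icache. */
static void set_jmp_target_sketch(uint8_t *jmp_insn, uint8_t *new_target)
{
    int32_t disp = (int32_t)(new_target - (jmp_insn + 5));
    memcpy(jmp_insn + 1, &disp, sizeof(disp));
    __builtin___clear_cache((char *)jmp_insn, (char *)(jmp_insn + 5));
}
```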
-
- Aug 03, 2017
Richard Henderson authored
For a 64-bit ILP32 host, aligning to sizeof(long) is not enough. Guess that the minimum for any host is 8, as that covers uint64_t. QEMU doesn't use a host long double or host vectors, except in extremely limited circumstances. Fixes a bus error for a sparc v8plus host.
Signed-off-by: Richard Henderson <rth@twiddle.net>
-
- Jul 04, 2017
Paolo Bonzini authored
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- Jun 19, 2017
Emilio G. Cota authored
Allocating an arbitrarily-sized array of TBs results in either (a) a lot of memory wasted or (b) unnecessary flushes of the code cache when we run out of TB structs in the array. An obvious solution would be to just malloc a TB struct when needed, and keep the TB array as an array of pointers (recall that tb_find_pc() needs the TB array to run in O(log n)). Perhaps a better solution, which is implemented in this patch, is to allocate TBs right before the translated code they describe. This results in some memory waste due to padding to keep code and TBs in separate cache lines -- for instance, I measured 4.7% of padding in the used portion of code_gen_buffer when booting aarch64 Linux on a host with 64-byte cache lines. However, it can allow for optimizations in some host architectures, since TCG backends could safely assume that the TB and the corresponding translated code are very close to each other in memory. See this message by rth for a detailed explanation:
https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05172.html
Subject: Re: GSoC 2017 Proposal: TCG performance enhancements
Message-ID: <1e67644b-4b30-887e-d329-1848e94c9484@twiddle.net>
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1496790745-314-3-git-send-email-cota@braap.org>
[rth: Simplify the arithmetic in tcg_tb_alloc]
Signed-off-by: Richard Henderson <rth@twiddle.net>
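A minimal sketch of the allocation scheme under stated assumptions (64-byte cache lines, illustrative TB fields; not the actual tcg_tb_alloc):

```c
#include <stdint.h>

/* Carve the TB header out of the code buffer itself, cache-line
 * aligned, so the header and its translated code stay adjacent. */
#define CACHE_LINE 64

typedef struct TranslationBlock {
    uint64_t pc;                /* illustrative fields only */
    void *tc_ptr;               /* start of this TB's translated code */
} TranslationBlock;

static TranslationBlock *tb_alloc_sketch(uint8_t **code_gen_ptr,
                                         uint8_t *buffer_end)
{
    uintptr_t p = ((uintptr_t)*code_gen_ptr + CACHE_LINE - 1)
                  & ~(uintptr_t)(CACHE_LINE - 1);
    if (p + sizeof(TranslationBlock) > (uintptr_t)buffer_end) {
        return NULL;            /* caller flushes the code cache, retries */
    }
    /* translated code will be emitted right after the header */
    *code_gen_ptr = (uint8_t *)(p + sizeof(TranslationBlock));
    return (TranslationBlock *)p;
}
```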
-
- Jun 05, 2017
Emilio G. Cota authored
Instead of exporting goto_ptr directly to TCG frontends, export tcg_gen_lookup_and_goto_ptr(), which calls goto_ptr with the pointer returned by the lookup_tb_ptr() helper. This is the only use case we have for goto_ptr and lookup_tb_ptr, so having this function is very convenient. Furthermore, it trivially allows us to avoid calling the lookup helper if goto_ptr is not implemented by the backend.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1493263764-18657-2-git-send-email-cota@braap.org>
Message-Id: <1493263764-18657-3-git-send-email-cota@braap.org>
Message-Id: <1493263764-18657-4-git-send-email-cota@braap.org>
Message-Id: <1493263764-18657-5-git-send-email-cota@braap.org>
[rth: Squashed 4 related commits.]
Signed-off-by: Richard Henderson <rth@twiddle.net>
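A hedged sketch of the lookup half (table shape and names are illustrative, not QEMU's actual implementation): map the current guest PC to translated code if we have it, else return the epilogue address so goto_ptr falls back to the main loop.

```c
#include <stdint.h>
#include <stddef.h>

#define TB_CACHE_SIZE 4096

struct tb { uint64_t pc; void *tc_ptr; };
static struct tb *tb_cache[TB_CACHE_SIZE];
static void *epilogue_ptr;     /* assumption: address of the TCG epilogue */

static void *lookup_tb_ptr_sketch(uint64_t pc)
{
    struct tb *tb = tb_cache[pc & (TB_CACHE_SIZE - 1)];
    if (tb && tb->pc == pc) {
        return tb->tc_ptr;     /* goto_ptr jumps straight into this code */
    }
    return epilogue_ptr;       /* miss: return control to the main loop */
}
```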
-
- Feb 24, 2017
Frederic Konrad authored
We know there will be cases where MTTCG won't work until additional work is done in the front/back ends to support it. It will however be useful to be able to turn it on. As a result, MTTCG will default to off unless the combination is supported; the user can still turn it on for the sake of testing.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
[AJB: move to -accel tcg,thread=multi|single, defaults]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
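For reference, the user-facing switch introduced here (per the [AJB] note above) is spelled -accel tcg,thread=multi on the command line to opt in, or -accel tcg,thread=single to force the single-threaded behaviour.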
-
Alex Bennée authored
We'll be using the memory ordering definitions to define values for both the host and guest. To avoid fighting with circular header dependencies just move these types into their own minimal header. Signed-off-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Richard Henderson <rth@twiddle.net>
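For orientation, a hedged sketch of roughly what the new standalone header (tcg-mo.h) defines; the flag names match the QEMU tree, but treat the exact values as quoted from memory:

```c
/* Which load/store orderings a barrier must preserve, plus the kind
 * of barrier; a frontend ORs these together when emitting a fence. */
typedef enum {
    TCG_MO_LD_LD  = 0x01,  /* earlier loads ordered before later loads */
    TCG_MO_ST_LD  = 0x02,
    TCG_MO_LD_ST  = 0x04,
    TCG_MO_ST_ST  = 0x08,
    TCG_MO_ALL    = 0x0F,
    TCG_BAR_LDAQ  = 0x10,  /* load-acquire semantics */
    TCG_BAR_STRL  = 0x20,  /* store-release semantics */
    TCG_BAR_SC    = 0x30,  /* full barrier */
} TCGBar;
```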
-
- Feb 22, 2017
Paolo Bonzini authored
The icount interrupt flag and tcg_exit_req serve almost the same purpose; let's make them completely the same. The former TB_EXIT_REQUESTED and TB_EXIT_ICOUNT_EXPIRED cases are unified, since we can distinguish them from the value of the interrupt flag.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
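As a sketch of the mechanism (union layout simplified from QEMU's icount_decr; little-endian host shown, names illustrative): setting the high half to -1 makes the next instruction-budget decrement go negative, so generated code exits the TB at its next check.

```c
#include <stdint.h>

typedef union {
    uint32_t u32;
    struct { uint16_t low, high; } u16;  /* little-endian view */
} IcountDecrSketch;

static void request_exit(IcountDecrSketch *d)
{
    /* the flag doubles as both "icount expired" and "exit requested" */
    __atomic_store_n(&d->u16.high, (uint16_t)-1, __ATOMIC_RELAXED);
}
```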
-
- Jan 10, 2017
Richard Henderson authored
The number of actual invocations of ctpop itself does not warrant an opcode, but it is very helpful for POWER7 to use in generating an expansion for ctz.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
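For intuition, the classic identity such an expansion can use (a sketch, not the actual backend lowering): since x & -x isolates the lowest set bit, ctz(x) = ctpop((x & -x) - 1).

```c
#include <stdint.h>

/* Count trailing zeros via population count; yields 64 for x == 0,
 * since (0 & -0) - 1 is all ones. */
static int ctz64_via_ctpop(uint64_t x)
{
    return __builtin_popcountll((x & -x) - 1);
}
```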
-
Richard Henderson authored
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
-
Richard Henderson authored
This will allow the target to tailor the constraints to the auto-detected ISA extensions.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
-
Richard Henderson authored
This is the same concept, and the same markup, as the early-clobber markup in GCC.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
-
Richard Henderson authored
Adds tcg_gen_extract_* and tcg_gen_sextract_* for extraction of fixed-position bitfields, much like we already have for deposit.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
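As a semantic sketch of what the new ops compute (the real API emits TCG opcodes; these pure-C helpers follow the shape of QEMU's extract32/sextract32 bitops, with the field at bits [ofs, ofs+len) and len in 1..32):

```c
#include <stdint.h>

/* zero-extended field: (value >> ofs) masked to len bits */
static uint32_t extract32_sketch(uint32_t value, int ofs, int len)
{
    return (value >> ofs) & (~0u >> (32 - len));
}

/* sign-extended field: shift it to the top, arithmetic-shift back */
static int32_t sextract32_sketch(uint32_t value, int ofs, int len)
{
    return (int32_t)(value << (32 - ofs - len)) >> (32 - len);
}
```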
-
- Nov 01, 2016
Peter Maydell authored
The typedefs we use for the TCGv_i32, TCGv_i64 and TCGv_ptr types are somewhat confusing, because we define them as pointers to structs, but the structs themselves are never defined. Explain in the comments a bit more clearly why this is OK and what is going on under the hood.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1477067922-26202-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
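The pattern being documented looks like this (struct tag names quoted from memory of tcg.h): because the structs are deliberately never defined, the types are opaque, cannot be dereferenced, and are mutually incompatible at compile time, even though internally they just wrap an index.

```c
/* pointers to intentionally-incomplete struct types */
typedef struct TCGv_i32_d *TCGv_i32;
typedef struct TCGv_i64_d *TCGv_i64;
typedef struct TCGv_ptr_d *TCGv_ptr;
```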
-
- Oct 31, 2016
Paolo Bonzini authored
softmmu requires more functions to be thread-safe, because translation blocks can be invalidated from e.g. notdirty callbacks. Probably the same holds for user-mode emulation; it's just that no one has ever tried to produce a coherent locking scheme there. This patch will guide the introduction of more tb_lock and tb_unlock calls for system emulation. Note that after this patch some (most) of the mentioned functions are still called outside tb_lock/tb_unlock. The next one will rectify this.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <20161027151030.20863-7-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- Oct 26, 2016
Richard Henderson authored
Allow QEMU to build on 32-bit hosts without 64-bit atomic ops. Even if we only allow 32-bit hosts to multi-thread emulate 32-bit guests, we still need some way to handle the 32-bit guest using a 64-bit atomic operation. Do so by dropping back to single-step.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
-
Richard Henderson authored
Force the use of cmpxchg16b on x86_64. Wikipedia suggests that only very old AMD64 (circa 2004) did not have this instruction. Further, it's required by Windows 8 so no new cpus will ever omit it. If we truely care about these, then we could check this at startup time and then avoid executing paths that use it. Reviewed-by:
Emilio G. Cota <cota@braap.org> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Signed-off-by:
Richard Henderson <rth@twiddle.net>
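A hedged sketch of the double-width CAS being relied on (GCC/clang builtin; on x86_64 it lowers to cmpxchg16b when the compiler is allowed to use it, e.g. with -mcx16, and may otherwise call into libatomic):

```c
#include <stdbool.h>

/* 16-byte compare-and-swap: on success *ptr becomes desired; on
 * failure *expected is updated with the value actually seen. */
static bool cas_16b(__int128 *ptr, __int128 *expected, __int128 desired)
{
    return __atomic_compare_exchange_n(ptr, expected, desired, false,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
```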
-
Richard Henderson authored
Add all of cmpxchg, op_fetch, fetch_op, and xchg. Handle both endiannesses, and sizes up to 8. Handle expanding non-atomically, when emulating in serial.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
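A sketch of one such helper's core, assuming the guest address has already been translated to a host pointer (the real helpers also handle endianness conversion and MMU lookup; names illustrative):

```c
#include <stdint.h>

/* fetch_op variant: atomically add val and return the old value */
static uint32_t fetch_add_u32(uint32_t *host_addr, uint32_t val)
{
    return __atomic_fetch_add(host_addr, val, __ATOMIC_SEQ_CST);
}
```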
-
Richard Henderson authored
When we cannot emulate an atomic operation within a parallel context, this exception allows us to stop the world and try again in a serial context.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
-
- Sep 16, 2016
Pranith Kumar authored
This commit introduces the TCGOpcode for the memory barrier instruction. This opcode takes an argument specifying the type of memory barrier to be generated.
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20160714202026.9727-2-bobby.prani@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
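Illustrative frontend usage (a snippet, not a standalone program), assuming the TCG_MO_*/TCG_BAR_* flags from the tcg-mo.h header introduced around the same time:

```c
/* emit a full sequentially consistent barrier from a frontend */
tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
/* or order only stores against later stores */
tcg_gen_mb(TCG_MO_ST_ST | TCG_BAR_SC);
```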
-
Richard Henderson authored
Previously we allowed fully unaligned operations, but not operations that are aligned but with less alignment than the operation size. In addition, arm32, ia64, mips, and sparc had been omitted from the previous overalignment patch, which would have led to that alignment being enforced. Signed-off-by:
Richard Henderson <rth@twiddle.net>
-
- Sep 15, 2016
Ladi Prosek authored
Unused function declarations were found using a simple gcc plugin and manually verified by grepping the sources.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
- Aug 05, 2016
Richard Henderson authored
Rather than rely on recursion during the middle of register allocation, lower indirect registers to loads and stores off the indirect base into plain temps. For an x86_64 host, with sufficient registers, this results in identical code, modulo the actual register assignments. For an i686 host, with insufficient registers, this means that temps can be (temporarily) spilled to the stack in order to satisfy an allocation. This is in contrast to the previous situation of possibly not being able to spill, needing to allocate a register for the indirect base in order to perform a spill.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
-
Richard Henderson authored
Reduce the size of other bitfields to make room. This reduces the cache footprint of compilation.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
-
Richard Henderson authored
Instead of using -1 as the end of chain, use 0, and link through the 0 entry as a fully circular doubly-linked list.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
-
Richard Henderson authored
This reduces both memory usage and per-insn cacheline usage during code generation.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
-
- Jul 17, 2016
Paolo Bonzini authored
Assertions help both Coverity and the clang static analyzer avoid false positives, but on the other hand both are confused when the condition is compiled as (void)(x != FOO). Always expand assertion macros when using Coverity or clang, through a new QEMU_STATIC_ANALYSIS preprocessor symbol. This fixes a couple of false positives in TCG.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
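The shape of the fix, sketched and simplified (the real change touches QEMU's assert macros; this is not the verbatim code):

```c
#include <assert.h>

/* Detect the analyzers, then keep assertions as real assert() calls so
 * the tools see the condition instead of (void)(x != FOO). */
#if defined(__COVERITY__) || defined(__clang_analyzer__)
#define QEMU_STATIC_ANALYSIS 1
#endif

#ifdef QEMU_STATIC_ANALYSIS
#define tcg_debug_assert(X) assert(X)
#else
#define tcg_debug_assert(X) \
    do { if (!(X)) { __builtin_unreachable(); } } while (0)
#endif
```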
-
- Jul 06, 2016
Sergey Sorokin authored
Some architectures (e.g. ARMv8) need an address to be aligned to a size larger than the size of the memory access itself. QEMU's current low-cost alignment check is sufficient to implement this, but we need a way to specify the alignment size separately from the access size.
Signed-off-by: Sergey Sorokin <afarallax@yandex.ru>
Message-Id: <1466705806-679898-1-git-send-email-afarallax@yandex.ru>
Signed-off-by: Richard Henderson <rth@twiddle.net>
[rth: Assert in tcg_canonicalize_memop. Leave get_alignment_bits available for, though unused by, user-mode. Retain logging difference based on ALIGNED_ONLY.]
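A sketch of the check this enables (encoding and names illustrative, not QEMU's actual memop bit layout): once the alignment size travels separately from the access size, an ARMv8-style "16-byte aligned, 8-byte access" test is a simple mask check.

```c
#include <stdint.h>

/* align_bits is log2 of the required alignment, e.g. 4 for 16 bytes */
static int is_misaligned(uint64_t addr, unsigned align_bits)
{
    return (addr & ((1ull << align_bits) - 1)) != 0;
}
/* e.g. is_misaligned(addr, 4) enforces 16-byte alignment for an
 * 8-byte access such as LDXP */
```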
-
- Jun 20, 2016
Lluís Vilanova authored
Information is tracked inside the TCGContext structure, and later used by tracing events with the 'tcg' and 'vcpu' properties. The 'cpu' field is used to check tracing of translation-time events ("*_trans"). The 'tcg_env' field is used to pass it to execution-time events ("*_exec").
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 146549350162.18437.3033661139638458143.stgit@fimbulvetr.bsc.es
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
-
- May 19, 2016
Paolo Bonzini authored
TCG backends do not need most of exec-all.h; extract what they actually need to a separate file or move it directly to tcg.h. The next patch will stop including exec-all.h from everywhere.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- May 13, 2016
Sergey Fedorov authored
The value returned from tcg_qemu_tb_exec() is the value passed to the corresponding tcg_gen_exit_tb() at translation time of the last TB attempted to execute. It is a little confusing to store it in a variable named 'next_tb'. In fact, it is a combination of a 4-byte-aligned pointer and additional information in its two least significant bits. Break it down right away into two variables named 'last_tb' and 'tb_exit', which are a pointer to the last TB attempted to execute and the TB exit reason, respectively. This simplifies the code and improves its readability. Correct a misleading documentation comment for tcg_qemu_tb_exec() and fix logging in cpu_tb_exec(). Also rename a misleading 'next_tb' in another couple of places.
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
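The decode this introduces, sketched (TB_EXIT_MASK and the variable names follow the commit; the helper itself is illustrative):

```c
#include <stdint.h>

#define TB_EXIT_MASK 3   /* low two bits of the returned value */

typedef struct TranslationBlock TranslationBlock;

/* Split tcg_qemu_tb_exec()'s return value into the 4-byte-aligned TB
 * pointer and the exit reason packed into its low two bits. */
static void decode_tb_ret(uintptr_t ret,
                          TranslationBlock **last_tb, int *tb_exit)
{
    *last_tb = (TranslationBlock *)(ret & ~(uintptr_t)TB_EXIT_MASK);
    *tb_exit = ret & TB_EXIT_MASK;
}
```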
-