- Jun 21, 2021
-
-
Peter Maydell authored
The Arm MVE VDUP implementation would like to be able to emit code to duplicate a byte or halfword value into an i32. We have code to do this already in tcg-op-gvec.c, so all we need to do is make the functions global. For consistency with other functions made available to the frontends: * we rename to tcg_gen_dup_* * we expose both the _i32 and _i64 forms * we provide the #define for a _tl form Suggested-by:
Richard Henderson <richard.henderson@linaro.org> Reviewed-by:
Richard Henderson <richard.henderson@linaro.org> Signed-off-by:
Peter Maydell <peter.maydell@linaro.org> Message-id: 20210617121628.20116-10-peter.maydell@linaro.org
-
- Jun 19, 2021
-
-
Richard Henderson authored
Assume that we'll have fewer temps allocated after restarting with a fewer number of instructions. Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
This function should have been updated for vector types when they were introduced. Fixes: d2fd745f Resolves: https://gitlab.com/qemu-project/qemu/-/issues/367 Cc: qemu-stable@nongnu.org Tested-by:
Stefan Weil <sw@weilnetz.de> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
We should not be aligning the offset in temp_allocate_frame, because the odd offset produces an aligned address in the end. Instead, pass the logical offset into tcg_set_frame and add the stack bias last. Cc: qemu-stable@nongnu.org Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
Wrap guest memory operations for tci like we do for cpu_ld*_data. We cannot actually use the cpu_ldst.h interface without duplicating the memory trace operations performed within, which will already have been expanded into the tcg opcode stream. Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
These macros are only used in one place. By expanding, we get to apply some common-subexpression elimination and create some local variables. Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
This reverts commit dc09f047. For tcg, tracepoints are expanded inline in tcg opcodes. Using a helper which generates a second tracepoint is incorrect. For system mode, the extraction and re-packing of MemOp and mmu_idx lost the alignment information from MemOp. So we were no longer raising alignment exceptions for !TARGET_ALIGNED_ONLY guests. This can be seen in tests/tcg/xtensa/test_load_store.S. For user mode, we must update to the new signature of g2h() so that the revert compiles. We can leave set_helper_retaddr for later. Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
We can share this code between 32-bit and 64-bit loads and stores. Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
We already had the 32-bit versions for a 32-bit host; expand this to 64-bit hosts as well. The 64-bit opcodes are new. Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
We already had mulu2_i32 for a 32-bit host; expand this to 64-bit hosts as well. The muls2_i32 and the 64-bit opcodes are new. Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
These were already present in tcg-target.c.inc, but not in the interpreter. Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
When this opcode is not available in the backend, tcg middle-end will expand this as a series of 5 opcodes. So implementing this saves bytecode space. Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
This operation is critical to staying within the interpretation loop longer, which avoids the overhead of setup and teardown for many TBs. The check in tcg_prologue_init is disabled because TCI does want to use NULL to indicate exit, as opposed to branching to a real epilogue. Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
This removes all of the problems with unaligned accesses to the bytecode stream. With an 8-bit opcode at the bottom, we have 24 bits remaining, which are generally split into 6 4-bit slots. This fits well with the maximum length opcodes, e.g. INDEX_op_add2_i32, which have 6 register operands. We have, in previous patches, rearranged things such that there are no operations with a label which have more than one other operand. Which leaves us with a 20-bit field in which to encode a label, giving us a maximum TB size of 512k -- easily large. Change the INDEX_op_tci_movi_{i32,i64} opcodes to tci_mov[il]. The former puts the immediate in the upper 20 bits of the insn, like we do for the label displacement. The later uses a label to reference an entry in the constant pool. Thus, in the worst case we still have a single memory reference for any constant, but now the constants are out-of-line of the bytecode and can be shared between different moves saving space. Change INDEX_op_call to use a label to reference a pair of pointers in the constant pool. This removes the only slightly dodgy link with the layout of struct TCGHelperInfo. The re-encode cannot be done in pieces. Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
Inline it into its one caller, tci_write_reg64. Drop the asserts that are redundant with tcg_read_r. Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
The encoding planned for tci does not have enough room for brcond2, with 4 registers and a condition as input as well as the label. Resolve the condition into TCG_REG_TMP, and relax brcond to one register plus a label, considering the condition to always be reg != 0. Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
We're about to adjust the offset range on host memory ops, and the format of branches. Both will require a temporary. Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
This requires adjusting where arguments are stored. Place them on the stack at left-aligned positions. Adjust the stack frame to be at entirely positive offsets. Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
As the only call-clobbered regs for TCI, these should receive the least priority. Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
The current setting is much too pessimistic. Indicating only the one or two registers that are actually assigned after a call should avoid unnecessary movement between the register array and the stack array. Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
Add libffi as a build requirement for TCI. Add libffi to the dockerfiles to satisfy that requirement. Construct an ffi_cif structure for each unique typemask. Record the result in a separate hash table for later lookup; this allows helper_table to stay const. Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
This will give us both flags and typemask for use later. We also fix a dumping bug, wherein calls generated for plugins fail tcg_find_helper and print (null) instead of either a name or the raw function pointer. Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
We're going to change how to look up the call flags from a TCGop, so extract it as a helper. Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
We will shortly be interested in distinguishing pointers from integers in the helper's declaration, as well as a true void return. We currently have two parallel 1 bit fields; merge them and expand to a 3 bit field. Our current maximum is 7 helper arguments, plus the return makes 8 * 3 = 24 bits used within the uint32_t typemask. Tested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
- Jun 14, 2021
-
-
Jose R. Ziviani authored
Commit 5e8892db fixed several function signatures but tcg_out_op for arm is missing. This patch fixes it as well. Signed-off-by:
Jose R. Ziviani <jziviani@suse.de> Message-Id: <20210610224450.23425-1-jziviani@suse.de> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
Introduce a function to remove everything emitted since a given point. Reviewed-by:
Peter Maydell <peter.maydell@linaro.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
These variables belong to the jit side, not the user side. Since tcg_init_ctx is no longer used outside of tcg/, move the declaration to tcg-internal.h. Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Luis Pires <luis.pires@eldorado.org.br> Suggested-by:
Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
There's a change in mprotect() behaviour [1] in the latest macOS on M1 and it's not yet clear if it's going to be fixed by Apple. In this case, instead of changing permissions of N guard pages, we change permissions of N rwx regions. The same number of syscalls are required either way. [1] https://gist.github.com/hikalium/75ae822466ee4da13cbbe486498a191f Reviewed-by:
Luis Pires <luis.pires@eldorado.org.br> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
Do not handle protections on a case-by-case basis in the various alloc_code_gen_buffer instances; do it within a single loop in tcg_region_init. Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Luis Pires <luis.pires@eldorado.org.br> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
If qemu_get_host_physmem returns an odd number of pages, then physmem / 8 will not be a multiple of the page size. The following was observed on a gitlab runner: ERROR qtest-arm/boot-serial-test - Bail out! ERROR:../util/osdep.c:80:qemu_mprotect__osdep: \ assertion failed: (!(size & ~qemu_real_host_page_mask)) Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Luis Pires <luis.pires@eldorado.org.br> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
Move the call out of the N versions of alloc_code_gen_buffer and into tcg_region_init. Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Luis Pires <luis.pires@eldorado.org.br> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
Change the interface from a boolean error indication to a negative error vs a non-negative protection. For the moment this is only interface change, not making use of the new data. Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Luis Pires <luis.pires@eldorado.org.br> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
Do not mess around with setting values within tcg_init_ctx. Put the values into 'region' directly, which is where they will live for the lifetime of the program. Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Luis Pires <luis.pires@eldorado.org.br> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
- Jun 11, 2021
-
-
Richard Henderson authored
Shortly, the full code_gen_buffer will only be visible to region.c, so move in_code_gen_buffer out-of-line. Move the debugging versions of tcg_splitwx_to_{rx,rw} to region.c as well, so that the compiler gets to see the implementation of in_code_gen_buffer. This leaves exactly one use of in_code_gen_buffer outside of region.c, in cpu_restore_state. Which, being on the exception path, is not performance critical. Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Luis Pires <luis.pires@eldorado.org.br> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
Return output buffer and size via output pointer arguments, rather than returning size via tcg_ctx->code_gen_buffer_size. Reviewed-by:
Luis Pires <luis.pires@eldorado.org.br> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
Compute the value using straight division and bounds, rather than a loop. Pass in tb_size rather than reading from tcg_init_ctx.code_gen_buffer_size, Reviewed-by:
Luis Pires <luis.pires@eldorado.org.br> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
Give the field a name reflecting its actual meaning. Reviewed-by:
Luis Pires <luis.pires@eldorado.org.br> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-