Commits · 1f546b709d6121d1a3e629e482ba14fb7bf10ce2 · Anton / libtcg

Jun 29, 2021

tcg/riscv: Remove MO_BSWAP handling · c86bd2dc


TCG_TARGET_HAS_MEMORY_BSWAP is already unset for this backend,
which means that MO_BSWAP be handled by the middle-end and
will never be seen by the backend.  Thus the indexes used with
qemu_{ld,st}_helpers will always be zero.

Tidy the comments and asserts in tcg_out_qemu_{ld,st}_direct.
It is not that we do not handle bswap "yet", but never will.

Acked-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

c86bd2dc

tcg/aarch64: Unset TCG_TARGET_HAS_MEMORY_BSWAP · 51c559c7

Richard Henderson authored 3 years ago


The memory bswap support in the aarch64 backend merely dates from
a time when it was required.  There is nothing special about the
backend support that could not have been provided by the middle-end
even prior to the introduction of the bswap flags.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

51c559c7

tcg/arm: Unset TCG_TARGET_HAS_MEMORY_BSWAP · 843b8242

Richard Henderson authored 3 years ago


Now that the middle-end can replicate the same tricks as tcg/arm
used for optimizing bswap for signed loads and for stores, do not
pretend to have these memory ops in the backend.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

843b8242

tcg: Make use of bswap flags in tcg_gen_qemu_st_* · b53357ac

Richard Henderson authored 3 years ago


By removing TCG_BSWAP_IZ we indicate that the input is
not zero-extended, and thus can remove an explicit extend.
By removing TCG_BSWAP_OZ, we allow the implementation to
leave high bits set, which will be ignored by the store.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

b53357ac

tcg: Make use of bswap flags in tcg_gen_qemu_ld_* · 359feba5

Richard Henderson authored 3 years ago


We can perform any required sign-extension via TCG_BSWAP_OS.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

359feba5

tcg: Add flags argument to tcg_gen_bswap16_*, tcg_gen_bswap32_i64 · 2b836c2a

Richard Henderson authored 3 years ago


Implement the new semantics in the fallback expansion.
Change all callers to supply the flags that keep the
semantics unchanged locally.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

2b836c2a

tcg: Handle new bswap flags during optimize · 0b76ff8f

Richard Henderson authored 3 years ago


Notice when the input is known to be zero-extended and force
the TCG_BSWAP_IZ flag on.  Honor the TCG_BSWAP_OS bit during
constant folding.  Propagate the input to the output mask.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

0b76ff8f

tcg/tci: Support bswap flags · 0d57d36a

Richard Henderson authored 3 years ago


The existing interpreter zero-extends, ignoring high bits.
Simply add a separate sign-extension opcode if required.
Ensure that the interpreter supports ext16s when bswap16 is enabled.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

0d57d36a

tcg/mips: Support bswap flags in tcg_out_bswap32 · 1fce6534

Richard Henderson authored 3 years ago


Merge tcg_out_bswap32 and tcg_out_bswap32s.
Use the flags in the internal uses for loads and stores.

For mips32r2 bswap32 with zero-extension, standardize on
WSBH+ROTR+DEXT.  This is the same number of insns as the
previous DSBH+DSHD+DSRL but fits in better with the flags check.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

1fce6534

tcg/mips: Support bswap flags in tcg_out_bswap16 · 27362b7b

Richard Henderson authored 3 years ago


Merge tcg_out_bswap16 and tcg_out_bswap16s.  Use the flags
in the internal uses for loads and stores.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

27362b7b

tcg/s390: Support bswap flags · 1619ee9e

Richard Henderson authored 3 years ago


For INDEX_op_bswap16_i64, use 64-bit instructions so that we can
easily provide the extension to 64-bits.  Drop the special case,
previously used, where the input is already zero-extended -- the
minor code size savings is not worth the complication.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

1619ee9e

tcg/ppc: Use power10 byte-reverse instructions · 780b573f

Richard Henderson authored 3 years ago


Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

780b573f

tcg/ppc: Support bswap flags · 26ce7005

Richard Henderson authored 3 years ago

For INDEX_op_bswap32_i32, pass 0 for flags: input not zero-extended,
output does not need extension within the host 64-bit register.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

26ce7005

tcg/ppc: Split out tcg_out_bswap64 · 674ba588

Richard Henderson authored 3 years ago


Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

674ba588

tcg/ppc: Split out tcg_out_bswap32 · 8a611d86

Richard Henderson authored 3 years ago


Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

8a611d86

tcg/ppc: Split out tcg_out_bswap16 · 783d3ecd

Richard Henderson authored 3 years ago


With the use of a suitable temporary, we can use the same
algorithm when src overlaps dst.  The result is the same
number of instructions either way.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

783d3ecd

tcg/ppc: Split out tcg_out_sari{32,64} · 05dd01fa

Richard Henderson authored 3 years ago


We will shortly require sari in other context;
split out both for cleanliness sake.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

05dd01fa

tcg/ppc: Split out tcg_out_ext{8,16,32}s · f4bf14f4

Richard Henderson authored 3 years ago


We will shortly require these in other context;
make the expansion as clear as possible.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

f4bf14f4

tcg/arm: Support bswap flags · 2ec89a78

Richard Henderson authored 3 years ago


Combine the three bswap16 routines, and differentiate via the flags.
Use the correct flags combination from the load/store routines, and
pass along the constant parameter from tcg_out_op.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

2ec89a78

tcg/aarch64: Support bswap flags · 8fcfc6bf

Richard Henderson authored 3 years ago


Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

8fcfc6bf

tcg/aarch64: Merge tcg_out_rev{16,32,64} · dfa24dfa

Richard Henderson authored 3 years ago

Pass in the input and output size. We currently use 3 of the 5
possible combinations; the others may be used by new tcg opcodes.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

dfa24dfa

tcg/i386: Support bswap flags · 7335a3d6

Richard Henderson authored 3 years ago

Retain the current rorw bswap16 expansion for the zero-in/zero-out case.
Otherwise, perform a wider bswap plus a right-shift or extend.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

7335a3d6

tcg: Add flags argument to bswap opcodes · 587195bd

Richard Henderson authored 3 years ago


This will eventually simplify front-end usage, and will allow
backends to unset TCG_TARGET_HAS_MEMORY_BSWAP without loss of
optimization.

The argument is added during expansion, not currently exposed to the
front end translators.  The backends currently only support a flags
value of either TCG_BSWAP_IZ, or (TCG_BSWAP_IZ | TCG_BSWAP_OZ),
since they all require zero top bytes and leave them that way.
At the existing call sites we pass in (TCG_BSWAP_IZ | TCG_BSWAP_OZ),
except for the flags-ignored cases of a 32-bit swap of a 32-bit
value and or a 64-bit swap of a 64-bit value, where we pass 0.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

587195bd

tcg: Add tcg_gen_vec_shl{shr}{sar}8i_i32 · 950ee590

LIU Zhiwei authored 3 years ago


Implement tcg_gen_vec_shl{shr}{sar}8i_tl by adding corresponging i32 OP.

Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
Message-Id: <20210624105023.3852-5-zhiwei_liu@c-sky.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

950ee590

tcg: Add tcg_gen_vec_shl{shr}{sar}16i_i32 · 04f2a8bb

LIU Zhiwei authored 3 years ago


Implement tcg_gen_vec_shl{shr}{sar}16i_tl by adding corresponging i32 OP.

Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
Message-Id: <20210624105023.3852-4-zhiwei_liu@c-sky.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

04f2a8bb

tcg: Add tcg_gen_vec_add{sub}8_i32 · 448e7aa2

LIU Zhiwei authored 3 years ago


Implement tcg_gen_vec_add{sub}8_tl by adding corresponging i32 OP.

Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
Message-Id: <20210624105023.3852-3-zhiwei_liu@c-sky.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

448e7aa2

tcg: Add tcg_gen_vec_add{sub}16_i32 · 3d066e5d

LIU Zhiwei authored 3 years ago


Implement tcg_gen_vec_add{sub}16_tl by adding corresponding i32 OP.

Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
Message-Id: <20210624105023.3852-2-zhiwei_liu@c-sky.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

3d066e5d

Jun 21, 2021

tcg: Make gen_dup_i32/i64() public as tcg_gen_dup_i32/i64 · 614dd4f3

Peter Maydell authored 3 years ago


The Arm MVE VDUP implementation would like to be able to emit code to
duplicate a byte or halfword value into an i32.  We have code to do
this already in tcg-op-gvec.c, so all we need to do is make the
functions global.

For consistency with other functions made available to the frontends:
 * we rename to tcg_gen_dup_*
 * we expose both the _i32 and _i64 forms
 * we provide the #define for a _tl form

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20210617121628.20116-10-peter.maydell@linaro.org

614dd4f3

Jun 19, 2021

tcg: Restart when exhausting the stack frame · 732d5897

Richard Henderson authored 3 years ago


Assume that we'll have fewer temps allocated after
restarting with a fewer number of instructions.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

732d5897

tcg: Allocate sufficient storage in temp_allocate_frame · c1c09194

Richard Henderson authored 3 years ago

This function should have been updated for vector types
when they were introduced.

Fixes: d2fd745f
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/367


Cc: qemu-stable@nongnu.org
Tested-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

c1c09194

tcg/sparc: Fix temp_allocate_frame vs sparc stack bias · 9defd1bd

Richard Henderson authored 3 years ago


We should not be aligning the offset in temp_allocate_frame,
because the odd offset produces an aligned address in the end.
Instead, pass the logical offset into tcg_set_frame and add
the stack bias last.

Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

9defd1bd

tcg/tci: Use {set,clear}_helper_retaddr · 2fc6f16c

Richard Henderson authored 3 years ago


Wrap guest memory operations for tci like we do for cpu_ld*_data.

We cannot actually use the cpu_ldst.h interface without duplicating
the memory trace operations performed within, which will already
have been expanded into the tcg opcode stream.

Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

2fc6f16c

tcg/tci: Remove the qemu_ld/st_type macros · d1b1348c

Richard Henderson authored 3 years ago


These macros are only used in one place.  By expanding,
we get to apply some common-subexpression elimination
and create some local variables.

Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

d1b1348c

Revert "tcg/tci: Use exec/cpu_ldst.h interfaces" · 5183f209

Richard Henderson authored 3 years ago


This reverts commit dc09f047.

For tcg, tracepoints are expanded inline in tcg opcodes.
Using a helper which generates a second tracepoint is incorrect.

For system mode, the extraction and re-packing of MemOp and mmu_idx
lost the alignment information from MemOp.  So we were no longer
raising alignment exceptions for !TARGET_ALIGNED_ONLY guests.
This can be seen in tests/tcg/xtensa/test_load_store.S.

For user mode, we must update to the new signature of g2h() so that
the revert compiles.  We can leave set_helper_retaddr for later.

Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

5183f209

tcg/tci: Split out tci_qemu_ld, tci_qemu_st · 69acc02a

Richard Henderson authored 3 years ago


We can share this code between 32-bit and 64-bit loads and stores.

Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

69acc02a

tcg/tci: Implement add2, sub2 · 08096b1a

Richard Henderson authored 4 years ago


We already had the 32-bit versions for a 32-bit host; expand this
to 64-bit hosts as well.  The 64-bit opcodes are new.

Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

08096b1a

tcg/tci: Implement mulu2, muls2 · f6db0d8d

Richard Henderson authored 4 years ago


We already had mulu2_i32 for a 32-bit host; expand this to 64-bit
hosts as well.  The muls2_i32 and the 64-bit opcodes are new.

Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

f6db0d8d

tcg/tci: Implement clz, ctz, ctpop · 5255f48c

Richard Henderson authored 4 years ago


Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

5255f48c

tcg/tci: Implement extract, sextract · 0f10d7c5

Richard Henderson authored 4 years ago


Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

0f10d7c5

tcg/tci: Implement andc, orc, eqv, nand, nor · a81520b9

Richard Henderson authored 4 years ago


These were already present in tcg-target.c.inc,
but not in the interpreter.

Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

a81520b9