Commits · 3399bca4514b5c8d513a88fa3e472756468cb4c6 · Anton / libtcg

Jul 02, 2021

target/arm: Implement MVE shifts by register · 04ea4d3c

Peter Maydell authored 3 years ago


Implement the MVE shifts by register, which perform
shifts on a single general-purpose register.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-19-peter.maydell@linaro.org

04ea4d3c

target/arm: Implement MVE shifts by immediate · 46321d47

Peter Maydell authored 3 years ago


Implement the MVE shifts by immediate, which perform shifts
on a single general-purpose register.

These patterns overlap with the long-shift-by-immediates,
so we have to rearrange the grouping a little here.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-18-peter.maydell@linaro.org

46321d47

target/arm: Implement MVE long shifts by register · 0aa4b4c3

Peter Maydell authored 3 years ago

Implement the MVE long shifts by register, which perform shifts on a
pair of general-purpose registers treated as a 64-bit quantity, with
the shift count in another general-purpose register, which might be
either positive or negative.

Like the long-shifts-by-immediate, these encodings sit in the space
that was previously the UNPREDICTABLE MOVS/ORRS with Rm==13,15.
Because LSLL_rr and ASRL_rr overlap with both MOV_rxri/ORR_rrri and
also with CSEL (as one of the previously-UNPREDICTABLE Rm==13 cases),
we have to move the CSEL pattern into the same decodetree group.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-17-peter.maydell@linaro.org

0aa4b4c3

target/arm: Implement MVE long shifts by immediate · f4ae6c8c

Peter Maydell authored 3 years ago


The MVE extension to v8.1M includes some new shift instructions which
sit entirely within the non-coprocessor part of the encoding space
and which operate only on general-purpose registers.  They take up
the space which was previously UNPREDICTABLE MOVS and ORRS encodings
with Rm == 13 or 15.

Implement the long shifts by immediate, which perform shifts on a
pair of general-purpose registers treated as a 64-bit quantity, with
an immediate shift count between 1 and 32.

Awkwardly, because the MOVS and ORRS trans functions do not UNDEF for
the Rm==13,15 case, we need to explicitly emit code to UNDEF for the
cases where v8.1M now requires that.  (Trying to change MOVS and ORRS
is too difficult, because the functions that generate the code are
shared between a dozen different kinds of arithmetic or logical
instruction for all A32, T16 and T32 encodings, and for some insns
and some encodings Rm==13,15 are valid.)

We make the helper functions we need for UQSHLL and SQSHLL take
a 32-bit value which the helper casts to int8_t because we'll need
these helpers also for the shift-by-register insns, where the shift
count might be < 0 or > 32.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-16-peter.maydell@linaro.org

f4ae6c8c

target/arm: Implement MVE VADDLV · d43ebd9d

Peter Maydell authored 3 years ago


Implement the MVE VADDLV insn; this is similar to VADDV, except
that it accumulates 32-bit elements into a 64-bit accumulator
stored in a pair of general-purpose registers.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-15-peter.maydell@linaro.org

d43ebd9d

target/arm: Implement MVE VSHLC · 2e6a4ce0

Peter Maydell authored 3 years ago


Implement the MVE VSHLC insn, which performs a shift left of the
entire vector with carry in bits provided from a general purpose
register and carry out bits written back to that register.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-14-peter.maydell@linaro.org

2e6a4ce0

target/arm: Implement MVE saturating narrowing shifts · d6f9e011

Peter Maydell authored 3 years ago


Implement the MVE saturating shift-right-and-narrow insns
VQSHRN, VQSHRUN, VQRSHRN and VQRSHRUN.

do_srshr() is borrowed from sve_helper.c.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-13-peter.maydell@linaro.org

d6f9e011

target/arm: Implement MVE VSHRN, VRSHRN · 162e2655

Peter Maydell authored 3 years ago


Implement the MVE shift-right-and-narrow insn VSHRN and VRSHRN.

do_urshr() is borrowed from sve_helper.c.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-12-peter.maydell@linaro.org

162e2655

target/arm: Implement MVE VSRI, VSLI · a78b25fa

Peter Maydell authored 3 years ago


Implement the MVE VSRI and VSLI insns, which perform a
shift-and-insert operation.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-11-peter.maydell@linaro.org

a78b25fa

target/arm: Implement MVE VSHLL · c2262707

Peter Maydell authored 3 years ago


Implement the MVE VHLL (vector shift left long) insn.  This has two
encodings: the T1 encoding is the usual shift-by-immediate format,
and the T2 encoding is a special case where the shift count is always
equal to the element size.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-10-peter.maydell@linaro.org

c2262707

target/arm: Implement MVE vector shift right by immediate insns · 3394116f

Peter Maydell authored 3 years ago


Implement the MVE vector shift right by immediate insns VSHRI and
VRSHRI.  As with Neon, we implement these by using helper functions
which perform left shifts but allow negative shift counts to indicate
right shifts.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-9-peter.maydell@linaro.org

3394116f

target/arm: Implement MVE vector shift left by immediate insns · f9ed6174

Peter Maydell authored 3 years ago


Implement the MVE shift-vector-left-by-immediate insns VSHL, VQSHL
and VQSHLU.

The size-and-immediate encoding here is the same as Neon, and we
handle it the same way neon-dp.decode does.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-8-peter.maydell@linaro.org

f9ed6174

target/arm: Implement MVE logical immediate insns · eab84139

Peter Maydell authored 3 years ago


Implement the MVE logical-immediate insns (VMOV, VMVN,
VORR and VBIC). These have essentially the same encoding
as their Neon equivalents, and we implement the decode
in the same way.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-7-peter.maydell@linaro.org

eab84139

target/arm: Use dup_const() instead of bitfield_replicate() · e4667a5b

Peter Maydell authored 3 years ago


Use dup_const() instead of bitfield_replicate() in
disas_simd_mod_imm().

(We can't replace the other use of bitfield_replicate() in this file,
in logic_imm_decode_wmask(), because that location needs to handle 2
and 4 bit elements, which dup_const() cannot.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-6-peter.maydell@linaro.org

e4667a5b

target/arm: Use asimd_imm_const for A64 decode · 2c0286db

Peter Maydell authored 3 years ago


The A64 AdvSIMD modified-immediate grouping uses almost the same
constant encoding that A32 Neon does; reuse asimd_imm_const() (to
which we add the AArch64-specific case for cmode 15 op 1) instead of
reimplementing it all.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-5-peter.maydell@linaro.org

2c0286db

target/arm: Make asimd_imm_const() public · dfd66bc0

Peter Maydell authored 3 years ago


The function asimd_imm_const() in translate-neon.c is an
implementation of the pseudocode AdvSIMDExpandImm(), which we will
also want for MVE.  Move the implementation to translate.c, with a
prototype in translate.h.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-4-peter.maydell@linaro.org

dfd66bc0

target/arm: Fix bugs in MVE VRMLALDAVH, VRMLSLDAVH · 303db86f

Peter Maydell authored 3 years ago


The initial implementation of the MVE VRMLALDAVH and VRMLSLDAVH
insns had some bugs:
 * the 32x32 multiply of elements was being done as 32x32->32,
   not 32x32->64
 * we were incorrectly maintaining the accumulator in its full
   72-bit form across all 4 beats of the insn; in the pseudocode
   it is squashed back into the 64 bits of the RdaHi:RdaLo
   registers after each beat

In particular, fixing the second of these allows us to recast
the implementation to avoid 128-bit arithmetic entirely.

Since the element size here is always 4, we can also drop the
parameterization of ESIZE to make the code a little more readable.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-3-peter.maydell@linaro.org

303db86f

target/arm: Fix MVE widening/narrowing VLDR/VSTR offset calculation · d59ccc30

Peter Maydell authored 3 years ago


In do_ldst(), the calculation of the offset needs to be based on the
size of the memory access, not the size of the elements in the
vector.  This meant we were getting it wrong for the widening and
narrowing variants of the various VLDR and VSTR insns.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-2-peter.maydell@linaro.org

d59ccc30

target/arm: Check NaN mode before silencing NaN · 103e7579

Joe Komlodi authored 3 years ago


If the CPU is running in default NaN mode (FPCR.DN == 1) and we execute
FRSQRTE, FRECPE, or FRECPX with a signaling NaN, parts_silence_nan_frac() will
assert due to fpst->default_nan_mode being set.

To avoid this, we check to see what NaN mode we're running in before we call
floatxx_silence_nan().

Signed-off-by: Joe Komlodi <joe.komlodi@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1624662174-175828-2-git-send-email-joe.komlodi@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

103e7579

target/mips: Extract nanoMIPS ISA translation routines · 3f178b8d

Philippe Mathieu-Daudé authored 4 years ago


Extract 4900 lines from the huge translate.c to a new file,
'nanomips_translate.c.inc'. As there are too many inter-
dependencies we don't compile it as another object, but
keep including it in the big translate.o. We gain in code
maintainability.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20201120210844.2625602-13-f4bug@amsat.org>

3f178b8d

target/mips: Extract the microMIPS ISA translation routines · bf52c45a

Philippe Mathieu-Daudé authored 4 years ago


Extract 3200+ lines from the huge translate.c to a new file,
'micromips_translate.c.inc'. As there are too many inter-
dependencies we don't compile it as another object, but
keep including it in the big translate.o. We gain in code
maintainability.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20201120210844.2625602-12-f4bug@amsat.org>

bf52c45a

target/mips: Extract Code Compaction ASE translation routines · 3230bad9

Philippe Mathieu-Daudé authored 4 years ago


Extract 1100+ lines from the huge translate.c to a new file,
'mips16e_translate.c.inc'. As there are too many inter-
dependencies we don't compile it as another object, but
keep including it in the big translate.o. We gain in code
maintainability.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20201120210844.2625602-10-f4bug@amsat.org>

3230bad9

target/mips: Add declarations for generic TCG helpers · d5076631

Philippe Mathieu-Daudé authored 3 years ago


We want to extract the microMIPS ISA and Code Compaction ASE to
new compilation units.

We will first extract this code as included source files (.c.inc),
then make them new compilation units afterward.

The following methods are going to be used externally:

  micromips_translate.c.inc:1778:   gen_ldxs(ctx, rs, rt, rd);
  micromips_translate.c.inc:1806:   gen_align(ctx, 32, rd, rs, ...
  micromips_translate.c.inc:2859:   gen_addiupc(ctx, reg, offset, ...
  mips16e_translate.c.inc:444:      gen_addiupc(ctx, ry, offset, ...

To avoid too much code churn, it is simpler to declare these
prototypes in "translate.h" now.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210617174907.2904067-2-f4bug@amsat.org>

d5076631

Jun 29, 2021

target/mips: Fix gen_mxu_s32ldd_s32lddr · 92ecfab5

Richard Henderson authored 3 years ago


There were two bugs here: (1) the required endianness was
not present in the MemOp, and (2) we were not providing a
zero-extended input to the bswap as semantics required.

The best fix is to fold the bswap into the memory operation,
producing the desired result directly.

Acked-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

92ecfab5

target/sh4: Improve swap.b translation · b983a0e1

Richard Henderson authored 3 years ago


Remove TCG_BSWAP_IZ and the preceding zero-extension.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

b983a0e1

target/i386: Improve bswap translation · 94fdf987

Richard Henderson authored 3 years ago


Use a break instead of an ifdefed else.
There's no need to move the values through s->T0.
Remove TCG_BSWAP_IZ and the preceding zero-extension.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

94fdf987

target/arm: Improve REVSH · ebdd503d

Richard Henderson authored 3 years ago


The new bswap flags can implement the semantics exactly.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

ebdd503d

target/arm: Improve vector REV · 50a7470e

Richard Henderson authored 3 years ago


We can eliminate the requirement for a zero-extended output,
because the following store will ignore any garbage high bits.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

50a7470e

target/arm: Improve REV32 · 2b0a39e5

Richard Henderson authored 3 years ago


For the sf version, we are performing two 32-bit bswaps
in either half of the register.  This is equivalent to
performing one 64-bit bswap followed by a rotate.

For the non-sf version, we can remove TCG_BSWAP_IZ
and the preceding zero-extension.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

2b0a39e5

tcg: Add flags argument to tcg_gen_bswap16_*, tcg_gen_bswap32_i64 · 2b836c2a

Richard Henderson authored 3 years ago


Implement the new semantics in the fallback expansion.
Change all callers to supply the flags that keep the
semantics unchanged locally.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

2b836c2a

target/cris: Do not exit tb for X_FLAG changes · 5f5a05cd

Richard Henderson authored 3 years ago


We always know the exact value of X, that's all that matters.
This avoids splitting the TB e.g. between "ax" and "addq".

Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

5f5a05cd

target/cris: Remove dc->flagx_known · 0ce97a31

Richard Henderson authored 3 years ago


Ever since 2a44f7f1, flagx_known is always true.
Fold away all of the tests against the flag.

Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

0ce97a31

target/cris: Improve JMP_INDIRECT · 3a1a80cc

Richard Henderson authored 3 years ago


Use movcond instead of brcond to set env_pc.
Discard the btarget and btaken variables to improve
register allocation and avoid unnecessary writeback.

Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

3a1a80cc

target/cris: Use tcg_gen_lookup_and_goto_ptr · e0a4620c

Richard Henderson authored 3 years ago


We can use this in gen_goto_tb and for DISAS_JUMP
to indirectly chain to the next TB.

Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

e0a4620c

target/cris: Add DISAS_DBRANCH · 31737151

Richard Henderson authored 3 years ago

Move delayed branch handling to tb_stop, where we can re-use other
end-of-tb code, e.g. the evaluation of flags. Honor single stepping.
Validate that we aren't losing state by overwriting is_jmp.

Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

31737151

target/cris: Add DISAS_UPDATE_NEXT · c9674752

Richard Henderson authored 3 years ago


Move this pc update into tb_stop.
We will be able to re-use this code shortly.

Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

c9674752

target/cris: Set cpustate_changed for rfe/rfn · 9e9f5ba0

Richard Henderson authored 3 years ago


These insns set DISAS_UPDATE without cpustate_changed,
which isn't quite right.

Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

9e9f5ba0

target/cris: Fold unhandled X_FLAG changes into cpustate_changed · afd5a331

Richard Henderson authored 3 years ago


We really do this already, by including them into the same test.
This just hoists the expression up a bit.

Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

afd5a331

target/cris: Mark static arrays const · 5899ce68

Richard Henderson authored 3 years ago


Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

5899ce68

target/cris: Mark helper_raise_exception noreturn · 71fc4615

Richard Henderson authored 3 years ago


Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

71fc4615