Oct 20, 2020
      target/arm: Implement FPSCR.LTPSIZE for M-profile LOB extension · 8128c8e8
      Peter Maydell authored
      
      If the M-profile low-overhead-branch extension is implemented, FPSCR
      bits [18:16] are a new field LTPSIZE.  If MVE is not implemented
      (currently always true for us) then this field always reads as 4 and
      ignores writes.
      
      These bits used to be the vector-length field for the old
      short-vector extension, so we need to take care that they are not
      misinterpreted as setting vec_len. We do this with a rearrangement
      of the vfp_set_fpscr() code that deals with vec_len, vec_stride
      and also the QC bit; this obviates the need for the M-profile
      only masking step that we used to have at the start of the function.
      
      We provide a new field in CPUState for LTPSIZE, even though this
      will always be 4, in preparation for MVE, so we don't have to
      come back later and split it out of the vfp.xregs[FPSCR] value.
      (This state struct field will be saved and restored as part of
      the FPSCR value via the vmstate_fpscr in machine.c.)
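      The reads-as-4, ignores-writes behaviour described above can be sketched as follows. This is an illustrative model with invented helper names, not QEMU's actual vfp_set_fpscr() code:

```c
#include <stdint.h>

/* Sketch only (invented helpers): FPSCR bits [18:16] are LTPSIZE.
 * Without MVE the field always reads as 4 and ignores writes. */
#define FPSCR_LTPSIZE_SHIFT 16
#define FPSCR_LTPSIZE_MASK  (7u << FPSCR_LTPSIZE_SHIFT)

static uint32_t fpscr_read_ltpsize(uint32_t fpscr)
{
    /* No MVE: the field reads as 4 regardless of stored state */
    return (fpscr & ~FPSCR_LTPSIZE_MASK) | (4u << FPSCR_LTPSIZE_SHIFT);
}

static uint32_t fpscr_write_ltpsize(uint32_t old, uint32_t val)
{
    /* Writes to LTPSIZE are ignored: preserve the old field value */
    return (val & ~FPSCR_LTPSIZE_MASK) | (old & FPSCR_LTPSIZE_MASK);
}
```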
      
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Message-id: 20201019151301.2046-11-peter.maydell@linaro.org
      target/arm: Allow M-profile CPUs with FP16 to set FPSCR.FZ16 · d31e2ce6
      Peter Maydell authored
      
      M-profile CPUs with half-precision floating point support should
      be able to write to FPSCR.FZ16, but an M-profile specific masking
      of the value at the top of vfp_set_fpscr() currently prevents that.
      This is not yet an active bug because we have no M-profile
      FP16 CPUs, but needs to be fixed before we can add any.
      
      The bits that the masking is effectively preventing from being
      set are the A-profile only short-vector Len and Stride fields,
      plus the Neon QC bit. Rearrange the order of the function so
      that those fields are handled earlier and only under a suitable
      guard; this allows us to drop the M-profile specific masking,
      making FZ16 writeable.
      
      This change also makes the QC bit correctly RAZ/WI for older
      no-Neon A-profile cores.
      
      This refactoring also paves the way for the low-overhead-branch
      LTPSIZE field, which uses some of the bits that are used for
      A-profile Stride and Len.
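      The shape of the rearrangement can be sketched like this (an illustrative model with invented flags and bit names, not the real vfp_set_fpscr()): the A-profile-only and Neon-only fields are accepted only under a guard, so no blanket M-profile mask is needed:

```c
#include <stdint.h>
#include <stdbool.h>

/* Sketch only: guard the A-profile short-vector and Neon QC fields
 * instead of masking the whole incoming value on M-profile. */
#define FPSCR_QC     (1u << 27)
#define FPSCR_LEN    (7u << 16)  /* short-vector Len */
#define FPSCR_STRIDE (3u << 20)  /* short-vector Stride */

static uint32_t set_fpscr_sketch(uint32_t val, bool m_profile, bool has_neon)
{
    uint32_t fpscr = 0;
    if (!m_profile) {
        /* Len/Stride exist only on A-profile */
        fpscr |= val & (FPSCR_LEN | FPSCR_STRIDE);
    }
    if (has_neon) {
        /* QC is RAZ/WI without Neon */
        fpscr |= val & FPSCR_QC;
    }
    /* ... remaining fields, FZ16 included, handled unconditionally ... */
    return fpscr;
}
```

      With the guards in place an M-profile write can no longer set Len/Stride/QC, so the FZ16 bit no longer needs to be caught by a profile-wide mask.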
      
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Message-id: 20201019151301.2046-10-peter.maydell@linaro.org
      target/arm: Fix has_vfp/has_neon ID reg squashing for M-profile · 532a3af5
      Peter Maydell authored
      
      In arm_cpu_realizefn(), if the CPU has VFP or Neon disabled then we
      squash the ID register fields so that we don't advertise it to the
      guest.  This code was written for A-profile and needs some tweaks to
      work correctly on M-profile:
      
       * A-profile only fields should not be zeroed on M-profile:
         - MVFR0.FPSHVEC,FPTRAP
         - MVFR1.SIMDLS,SIMDINT,SIMDSP,SIMDHP
         - MVFR2.SIMDMISC
       * M-profile only fields should be zeroed on M-profile:
         - MVFR1.FP16
      
      In particular, because MVFR1.SIMDHP on A-profile is the same field as
      MVFR1.FP16 on M-profile this code was incorrectly disabling FP16
      support on an M-profile CPU (where has_neon is always false).  This
      isn't a visible bug yet because we don't have any M-profile CPUs with
      FP16 support, but the change is necessary before we introduce any.
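      Because MVFR1 bits [23:20] mean SIMDHP on A-profile but FP16 on M-profile, the squashing has to key off the profile. A minimal sketch (invented helper, hard-coded masks for illustration):

```c
#include <stdint.h>
#include <stdbool.h>

/* Sketch only: MVFR1 fields SIMDLS/SIMDINT/SIMDSP/SIMDHP occupy
 * bits [23:8] on A-profile; on M-profile bits [23:20] are FP16. */
static uint32_t squash_mvfr1(uint32_t mvfr1, bool m_profile,
                             bool has_vfp, bool has_neon)
{
    if (!has_neon && !m_profile) {
        /* A-profile only: clear the SIMD fields */
        mvfr1 &= ~0x00ffff00u;
    }
    if (!has_vfp && m_profile) {
        /* M-profile only: clear FP16 */
        mvfr1 &= ~0x00f00000u;
    }
    return mvfr1;
}
```

      The old code applied the first clause unconditionally, which is what wiped out FP16 on M-profile CPUs.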
      
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Message-id: 20201019151301.2046-9-peter.maydell@linaro.org
      target/arm: Implement v8.1M low-overhead-loop instructions · b7226369
      Peter Maydell authored
      
      v8.1M's "low-overhead-loop" extension has three instructions
      for looping:
       * DLS (start of a do-loop)
       * WLS (start of a while-loop)
       * LE (end of a loop)
      
      The loop-start instructions are both simple operations to start a
      loop whose iteration count (if any) is in LR.  The loop-end
      instruction handles "decrement iteration count and jump back to loop
      start"; it also caches the information about the branch back to the
      start of the loop to improve performance of the branch on subsequent
      iterations.
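      The architecturally visible behaviour of the three insns can be modelled roughly as below (an illustrative sketch with invented C helpers, not QEMU's translator; LR holds the iteration count):

```c
#include <stdint.h>

/* DLS: load the iteration count into LR and fall into the loop body */
static void dls_start(uint32_t *lr, uint32_t count)
{
    *lr = count;
}

/* WLS: like DLS, but if the count is zero branch past the loop
 * entirely; returns nonzero when the body should be entered */
static int wls_enter(uint32_t *lr, uint32_t count)
{
    *lr = count;
    return count != 0;
}

/* LE: decrement the count and branch back while it is nonzero */
static int le_loop_again(uint32_t *lr)
{
    *lr -= 1;
    return *lr != 0;
}
```

      So a DLS...LE pair behaves like a C do/while that runs LR times, and WLS adds the zero-trip check of a while loop.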
      
      As with the branch-future instructions, the architecture permits an
      implementation to discard the LO_BRANCH_INFO cache at any time, and
      QEMU takes the IMPDEF option to never set it in the first place
      (equivalent to discarding it immediately), because for us a "real"
      implementation would be unnecessary complexity.
      
      (This implementation only provides the simple looping constructs; the
      vector extension MVE (Helium) adds some extra variants to handle
      looping across vectors.  We'll add those later when we implement
      MVE.)
      
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Message-id: 20201019151301.2046-8-peter.maydell@linaro.org
      target/arm: Implement v8.1M branch-future insns (as NOPs) · 05903f03
      Peter Maydell authored
      
      v8.1M implements a new 'branch future' feature, which is a
      set of instructions that request the CPU to perform a branch
      "in the future", when it reaches a particular execution address.
      In hardware, the expected implementation is that the information
      about the branch location and destination is cached and then
      acted upon when execution reaches the specified address.
      However the architecture permits an implementation to discard
      this cached information at any point, and so guest code must
      always include a normal branch insn at the branch point as
      a fallback. In particular, an implementation is specifically
      permitted to treat all BF insns as NOPs (which is equivalent
      to discarding the cached information immediately).
      
      For QEMU, implementing this caching of branch information
      would be complicated and would not improve the speed of
      execution at all, so we make the IMPDEF choice to implement
      all BF insns as NOPs.
      
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Message-id: 20201019151301.2046-7-peter.maydell@linaro.org
      target/arm: Don't allow BLX imm for M-profile · 920f04fa
      Peter Maydell authored
      
      The BLX immediate insn in the Thumb encoding always performs
      a switch from Thumb to Arm state. This would be totally useless
      in M-profile which has no Arm decoder, and so the instruction
      does not exist at all there. Make the encoding UNDEF for M-profile.
      
      (This part of the encoding space is used for the branch-future
      and low-overhead-loop insns in v8.1M.)
      
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Message-id: 20201019151301.2046-6-peter.maydell@linaro.org
      target/arm: Make the t32 insn[25:23]=111 group non-overlapping · 45f11876
      Peter Maydell authored
      
      The t32 decode has a group which represents a set of insns
      which overlap with B_cond_thumb because they have [25:23]=111
      (which is an invalid condition code field for the branch insn).
      This group is currently defined using the {} overlap-OK syntax,
      but it is almost entirely non-overlapping patterns. Switch
      it over to use a non-overlapping group.
      
      For this to be valid syntactically, CPS must move into the same
      overlapping-group as the hint insns (CPS vs hints was the
      only actual use of the overlap facility for the group).
      
      The non-overlapping subgroup for CLREX/DSB/DMB/ISB/SB is no longer
      necessary and so we can remove it (promoting those insns to
      be members of the parent group).
      
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Message-id: 20201019151301.2046-5-peter.maydell@linaro.org
      target/arm: Implement v8.1M conditional-select insns · cc73bbde
      Peter Maydell authored
      
      v8.1M brings four new insns to M-profile:
       * CSEL  : Rd = cond ? Rn : Rm
       * CSINC : Rd = cond ? Rn : Rm+1
       * CSINV : Rd = cond ? Rn : ~Rm
       * CSNEG : Rd = cond ? Rn : -Rm
      
      Implement these.
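      The four selects transcribe directly into C; the sketch below is just the semantics from the list above, not the decoder itself:

```c
#include <stdint.h>
#include <stdbool.h>

/* v8.1M conditional-select semantics (cond already evaluated) */
static uint32_t csel(bool c, uint32_t rn, uint32_t rm)  { return c ? rn : rm; }
static uint32_t csinc(bool c, uint32_t rn, uint32_t rm) { return c ? rn : rm + 1; }
static uint32_t csinv(bool c, uint32_t rn, uint32_t rm) { return c ? rn : ~rm; }
static uint32_t csneg(bool c, uint32_t rn, uint32_t rm) { return c ? rn : -rm; }
```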
      
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Message-id: 20201019151301.2046-4-peter.maydell@linaro.org
      target/arm: Implement v8.1M NOCP handling · 5d2555a1
      Peter Maydell authored
      
      From v8.1M, disabled-coprocessor handling changes slightly:
       * coprocessors 8, 9, 14 and 15 are also governed by the
         cp10 enable bit, like cp11
       * an extra range of instruction patterns is considered
         to be inside the coprocessor space
      
      We previously marked these up with TODO comments; implement the
      correct behaviour.
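      The first rule reduces to a small predicate. This is a hedged sketch with an invented helper name, not the actual QEMU check:

```c
#include <stdbool.h>

/* Sketch: from v8.1M, coprocessors 8, 9, 14 and 15 are governed by
 * the cp10 enable bit, as cp10/cp11 already were. */
static bool cp_governed_by_cp10(int coproc, bool v8_1m)
{
    if (coproc == 10 || coproc == 11) {
        return true;
    }
    if (v8_1m && (coproc == 8 || coproc == 9 ||
                  coproc == 14 || coproc == 15)) {
        return true;
    }
    return false;
}
```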
      
      Unfortunately there is no ID register field which indicates this
      behaviour.  We could in theory test an unrelated ID register which
      indicates guaranteed-to-be-in-v8.1M behaviour like ID_ISAR0.CmpBranch
      >= 3 (low-overhead-loops), but it seems better to simply define a new
      ARM_FEATURE_V8_1M feature flag and use it for this and other
      new-in-v8.1M behaviour that isn't identifiable from the ID registers.
      
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Message-id: 20201019151301.2046-3-peter.maydell@linaro.org