- Oct 30, 2019
-
-
Matus Kysel authored
Reintroduce float32_to_float64 that was removed here: https://lists.gnu.org/archive/html/qemu-devel/2018-04/msg00455.html - nbench test it not actually calling this function at all - SPECS 2006 significat number of tests impoved their runtime, just few of them showed small slowdown Reviewed-by:
Richard Henderson <richard.henderson@linaro.org> Signed-off-by:
Matus Kysel <mkysel@tachyum.com> Message-Id: <20191017142133.59439-1-mkysel@tachyum.com> [rth: Add comment about impossible inexact exceptions.] Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
- Aug 19, 2019
-
-
Alex Bennée authored
This is not a normal header and should only be included in the main softfloat.c file to bring in the various target specific specialisations. Indeed as it contains non-inlined C functions it is not even a legal header. Rename it to match our included C convention. Signed-off-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Richard Henderson <richard.henderson@linaro.org> Reviewed-by:
Philippe Mathieu-Daudé <philmd@redhat.com>
-
Alex Bennée authored
In our quest to eliminate the home rolled LIT64 macro we fixup usage inside the softfloat code. While we are at it we remove some of the extraneous spaces to closer fit the house style. Signed-off-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Richard Henderson <richard.henderson@linaro.org>
-
Alex Bennée authored
Remove some more use of LIT64 while making the meaning more clear. We also avoid the need of casts as the results by definition fit into the return type. Signed-off-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Richard Henderson <richard.henderson@linaro.org>
-
Alex Bennée authored
This also allows us to remove the extractFloat16exp/frac helpers. We avoid using the floatXX_pack_raw functions as they are slight overkill for masking out all but the top bit of the number. The generated code is almost exactly the same as makes no difference to the pre-conversion code. Signed-off-by:
Alex Bennée <alex.bennee@linaro.org>
-
Alex Bennée authored
We have a wrapper that does the right thing from stdint.h so lets use it for our constants in softfloat-specialize.h Signed-off-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Richard Henderson <richard.henderson@linaro.org>
-
- Mar 25, 2019
-
-
Kito Cheng authored
Before falling back to softfloat FMA, we do not restore the original values of inputs A and C. Fix it. This bug was caught by running gcc's testsuite on RISC-V qemu. Note that this change gives a small perf increase for fp-bench: Host: Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz Command: perf stat -r 3 taskset -c 0 ./fp-bench -o mulAdd -p $prec - $prec = single: - before: 101.71 MFlops 102.18 MFlops 100.96 MFlops - after: 103.63 MFlops 103.05 MFlops 102.96 MFlops - $prec = double: - before: 173.10 MFlops 173.93 MFlops 172.11 MFlops - after: 178.49 MFlops 178.88 MFlops 178.66 MFlops Signed-off-by:
Kito Cheng <kito.cheng@gmail.com> Signed-off-by:
Emilio G. Cota <cota@braap.org> Message-Id: <20190322204320.17777-1-cota@braap.org> Reviewed-by:
Richard Henderson <richard.henderson@linaro.org> Signed-off-by:
Alex Bennée <alex.bennee@linaro.org>
-
Mateja Marjanovic authored
Wrong type of NaN was generated for IEEE 754-2008 by MADDF.<D|S> and MSUBF.<D|S> instructions when the arguments were (Inf, Zero, NaN) or (Zero, Inf, NaN). The if-else statement establishes if the system conforms to IEEE 754-1985 or IEEE 754-2008, and defines different behaviors depending on that. In case of IEEE 754-2008, in mentioned cases of inputs, <MADDF|MSUBF>.<D|S> returns the input value 'c' [2] (page 53) and raises floating point exception 'Invalid Operation' [1] (pages 349, 350). These scenarios were tested and the results in QEMU emulation match the results obtained on the machine that has a MIPS64R6 CPU. [1] MIPS Architecture for Programmers Volume II-a: The MIPS64 Instruction Set Reference Manual, Revision 6.06 [2] MIPS Architecture for Programmers Volume IV-j: The MIPS64 SIMD Architecture Module, Revision 1.12 Signed-off-by:
Mateja Marjanovic <mateja.marjanovic@rt-rk.com> Message-Id: <1553008916-15274-2-git-send-email-mateja.marjanovic@rt-rk.com> Reviewed-by:
Peter Maydell <peter.maydell@linaro.org> [AJB: fixed up commit message] Signed-off-by:
Alex Bennée <alex.bennee@linaro.org>
-
- Feb 26, 2019
-
-
Richard Henderson authored
Previously this was only supported for roundAndPackFloat64. New support in round_canonical, round_to_int, float128_round_to_int, roundAndPackFloat32, roundAndPackInt32, roundAndPackInt64, roundAndPackUint64. This does not include any of the floatx80 routines, as we do not have users for that rounding mode there. Signed-off-by:
Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190215170225.15537-1-richard.henderson@linaro.org> Tested-by:
David Hildenbrand <david@redhat.com> [AJB: add missing break] Signed-off-by:
Alex Bennée <alex.bennee@linaro.org>
-
David Hildenbrand authored
Handling it just like float128_to_uint32_round_to_zero, that hopefully is free of bugs :) Documentation basically copied from float128_to_uint64 Signed-off-by:
David Hildenbrand <david@redhat.com> Reviewed-by:
Richard Henderson <richard.henderson@linaro.org> Signed-off-by:
Alex Bennée <alex.bennee@linaro.org>
-
- Jan 22, 2019
-
-
Emilio G. Cota authored
The added branch to the FMA ops is marked as unlikely and therefore its impact on performance (measured with fp-bench) is within noise range when measured on an Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz. Reported-by:
Laurent Desnogues <laurent.desnogues@gmail.com> Signed-off-by:
Emilio G. Cota <cota@braap.org> Reviewed-by:
Richard Henderson <richard.henderson@linaro.org> Signed-off-by:
Alex Bennée <alex.bennee@linaro.org>
-
- Dec 17, 2018
-
-
Emilio G. Cota authored
Performance results for fp-bench: Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: cmp-single: 110.98 MFlops cmp-double: 107.12 MFlops - after: cmp-single: 506.28 MFlops cmp-double: 524.77 MFlops Note that flattening both eq and eq_signaling versions would give us extra performance (695v506, 615v524 Mflops for single/double, respectively) but this would emit two essentially identical functions for each eq/signaling pair, which is a waste. Aggregate performance improvement for the last few patches: [ all charts in png: https://imgur.com/a/4yV8p ] 1. Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz qemu-aarch64 NBench score; higher is better Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz 16 +-+-----------+-------------+----===-------+---===-------+-----------+-+ 14 +-+..........................@@@&&.=.......@@@&&.=...................+-+ 12 +-+..........................@.@.&.=.......@.@.&.=.....+befor=== +-+ 10 +-+..........................@.@.&.=.......@.@.&.=.....+ad@@&& = +-+ 8 +-+.......................$$$%.@.&.=.......@.@.&.=.....+ @@u& = +-+ 6 +-+............@@@&&=+***##.$%.@.&.=***##$$%+@.&.=..###$$%%@i& = +-+ 4 +-+.......###$%%.@.&=.*.*.#.$%.@.&.=*.*.#.$%.@.&.=+**.#+$ +@m& = +-+ 2 +-+.....***.#$.%.@.&=.*.*.#.$%.@.&.=*.*.#.$%.@.&.=.**.#+$+sqr& = +-+ 0 +-+-----***##$%%@@&&=-***##$$%@@&&==***##$$%@@&&==-**##$$%+cmp==-----+-+ FOURIER NEURAL NELU DECOMPOSITION gmean qemu-aarch64 SPEC06fp (test set) speedup over QEMU 4c2c1015 Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz error bars: 95% confidence interval 4.5 +-+---+-----+----+-----+-----+-&---+-----+----+-----+-----+-----+----+-----+-----+-----+-----+----+-----+---+-+ 4 +-+..........................+@@+...........................................................................+-+ 3.5 +-+..............%%@&.........@@..............%%@&............................................+++dsub +-+ 2.5 +-+....&&+.......%%@&.......+%%@..+%%&+..@@&+.%%@&....................................+%%&+.+%@&++%%@& +-+ 2 +-+..+%%&..+%@&+.%%@&...+++..%%@...%%&.+$$@&..%%@&..%%@&.......+%%&+.%%@&+......+%%@&.+%%&++$$@&++d%@& %%@&+-+ 1.5 +-+**#$%&**#$@&**#%@&**$%@**#$%@**#$%&**#$@&**$%@&*#$%@**#$%@**#$%&**#%@&**$%@&*#$%@**#$%&**#$@&*+f%@&**$%@&+-+ 0.5 +-+**#$%&**#$@&**#%@&**$%@**#$%@**#$%&**#$@&**$%@&*#$%@**#$%@**#$%&**#%@&**$%@&*#$%@**#$%&**#$@&+sqr@&**$%@&+-+ 0 +-+**#$%&**#$@&**#%@&**$%@**#$%@**#$%&**#$@&**$%@&*#$%@**#$%@**#$%&**#%@&**$%@&*#$%@**#$%&**#$@&*+cmp&**$%@&+-+ 410.bw416.gam433.434.z435.436.cac437.lesli444.447.de450.so453454.ca459.GemsF465.tont470.lb4482.sphinxgeomean 2. Host: ARM Aarch64 A57 @ 2.4GHz qemu-aarch64 NBench score; higher is better Host: Applied Micro X-Gene, Aarch64 A57 @ 2.4 GHz 5 +-+-----------+-------------+-------------+-------------+-----------+-+ 4.5 +-+........................................@@@&==...................+-+ 3 4 +-+..........................@@@&==........@.@&.=.....+before +-+ 3 +-+..........................@.@&.=........@.@&.=.....+ad@@@&== +-+ 2.5 +-+.....................##$$%%.@&.=........@.@&.=.....+ @m@& = +-+ 2 +-+............@@@&==.***#.$.%.@&.=.***#$$%%.@&.=.***#$$%%d@& = +-+ 1.5 +-+.....***#$$%%.@&.=.*.*#.$.%.@&.=.*.*#.$.%.@&.=.*.*#+$ +f@& = +-+ 0.5 +-+.....*.*#.$.%.@&.=.*.*#.$.%.@&.=.*.*#.$.%.@&.=.*.*#+$+sqr& = +-+ 0 +-+-----***#$$%%@@&==-***#$$%%@@&==-***#$$%%@@&==-***#$$%+cmp==-----+-+ FOURIER NEURAL NLU DECOMPOSITION gmean Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Signed-off-by:
Emilio G. Cota <cota@braap.org> Signed-off-by:
Alex Bennée <alex.bennee@linaro.org>
-
Emilio G. Cota authored
Performance results for fp-bench: Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: sqrt-single: 42.30 MFlops sqrt-double: 22.97 MFlops - after: sqrt-single: 311.42 MFlops sqrt-double: 311.08 MFlops Here USE_FP makes a huge difference for f64's, with throughput going from ~200 MFlops to ~300 MFlops. Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Signed-off-by:
Emilio G. Cota <cota@braap.org> Signed-off-by:
Alex Bennée <alex.bennee@linaro.org>
-
Emilio G. Cota authored
Performance results for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: fma-single: 74.73 MFlops fma-double: 74.54 MFlops - after: fma-single: 203.37 MFlops fma-double: 169.37 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: fma-single: 23.24 MFlops fma-double: 23.70 MFlops - after: fma-single: 66.14 MFlops fma-double: 63.10 MFlops 3. IBM POWER8E @ 2.1 GHz - before: fma-single: 37.26 MFlops fma-double: 37.29 MFlops - after: fma-single: 48.90 MFlops fma-double: 59.51 MFlops Here having 3FP64 set to 1 pays off for x86_64: [1] 170.15 vs [0] 153.12 MFlops Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Signed-off-by:
Emilio G. Cota <cota@braap.org> Signed-off-by:
Alex Bennée <alex.bennee@linaro.org>
-
Emilio G. Cota authored
Performance results for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: div-single: 34.84 MFlops div-double: 34.04 MFlops - after: div-single: 275.23 MFlops div-double: 216.38 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: div-single: 9.33 MFlops div-double: 9.30 MFlops - after: div-single: 51.55 MFlops div-double: 15.09 MFlops 3. IBM POWER8E @ 2.1 GHz - before: div-single: 25.65 MFlops div-double: 24.91 MFlops - after: div-single: 96.83 MFlops div-double: 31.01 MFlops Here setting 2FP64_USE_FP to 1 pays off for x86_64: [1] 215.97 vs [0] 62.15 MFlops Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Signed-off-by:
Emilio G. Cota <cota@braap.org> Signed-off-by:
Alex Bennée <alex.bennee@linaro.org>
-
Emilio G. Cota authored
Performance results for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: mul-single: 126.91 MFlops mul-double: 118.28 MFlops - after: mul-single: 258.02 MFlops mul-double: 197.96 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: mul-single: 37.42 MFlops mul-double: 38.77 MFlops - after: mul-single: 73.41 MFlops mul-double: 76.93 MFlops 3. IBM POWER8E @ 2.1 GHz - before: mul-single: 58.40 MFlops mul-double: 59.33 MFlops - after: mul-single: 60.25 MFlops mul-double: 94.79 MFlops Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Signed-off-by:
Emilio G. Cota <cota@braap.org> Signed-off-by:
Alex Bennée <alex.bennee@linaro.org>
-
Emilio G. Cota authored
Performance results (single and double precision) for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: add-single: 135.07 MFlops add-double: 131.60 MFlops sub-single: 130.04 MFlops sub-double: 133.01 MFlops - after: add-single: 443.04 MFlops add-double: 301.95 MFlops sub-single: 411.36 MFlops sub-double: 293.15 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: add-single: 44.79 MFlops add-double: 49.20 MFlops sub-single: 44.55 MFlops sub-double: 49.06 MFlops - after: add-single: 93.28 MFlops add-double: 88.27 MFlops sub-single: 91.47 MFlops sub-double: 88.27 MFlops 3. IBM POWER8E @ 2.1 GHz - before: add-single: 72.59 MFlops add-double: 72.27 MFlops sub-single: 75.33 MFlops sub-double: 70.54 MFlops - after: add-single: 112.95 MFlops add-double: 201.11 MFlops sub-single: 116.80 MFlops sub-double: 188.72 MFlops Note that the IBM and ARM machines benefit from having HARDFLOAT_2F{32,64}_USE_FP set to 0. Otherwise their performance can suffer significantly: - IBM Power8: add-single: [1] 54.94 vs [0] 116.37 MFlops add-double: [1] 58.92 vs [0] 201.44 MFlops - Aarch64 A57: add-single: [1] 80.72 vs [0] 93.24 MFlops add-double: [1] 82.10 vs [0] 88.18 MFlops On the Intel machine, having 2F64 set to 1 pays off, but it doesn't for 2F32: - Intel i7-6700K: add-single: [1] 285.79 vs [0] 426.70 MFlops add-double: [1] 302.15 vs [0] 278.82 MFlops Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Signed-off-by:
Emilio G. Cota <cota@braap.org> Signed-off-by:
Alex Bennée <alex.bennee@linaro.org>
-
Emilio G. Cota authored
The appended paves the way for leveraging the host FPU for a subset of guest FP operations. For most guest workloads (e.g. FP flags aren't ever cleared, inexact occurs often and rounding is set to the default [to nearest]) this will yield sizable performance speedups. The approach followed here avoids checking the FP exception flags register. See the added comment for details. This assumes that QEMU is running on an IEEE754-compliant FPU and that the rounding is set to the default (to nearest). The implementation-dependent specifics of the FPU should not matter; things like tininess detection and snan representation are still dealt with in soft-fp. However, this approach will break on most hosts if we compile QEMU with flags that break IEEE compatibility. There is no way to detect all of these flags at compilation time, but at least we check for -ffast-math (which defines __FAST_MATH__) and disable hardfloat (plus emit a #warning) when it is set. This patch just adds common code. Some operations will be migrated to hardfloat in subsequent patches to ease bisection. Note: some architectures (at least PPC, there might be others) clear the status flags passed to softfloat before most FP operations. This precludes the use of hardfloat, so to avoid introducing a performance regression for those targets, we add a flag to disable hardfloat. In the long run though it would be good to fix the targets so that at least the inexact flag passed to softfloat is indeed sticky. Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Signed-off-by:
Emilio G. Cota <cota@braap.org> Signed-off-by:
Alex Bennée <alex.bennee@linaro.org>
-
Emilio G. Cota authored
glibc >= 2.25 defines canonicalize in commit eaf5ad0 (Add canonicalize, canonicalizef, canonicalizel., 2016-10-26). Given that we'll be including <math.h> soon, prepare for this by prefixing our canonicalize() with sf_ to avoid clashing with the libc's canonicalize(). Reported-by:
Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Tested-by:
Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Signed-off-by:
Emilio G. Cota <cota@braap.org> Signed-off-by:
Alex Bennée <alex.bennee@linaro.org>
-
- Oct 17, 2018
-
-
Thomas Huth authored
Older versions of Clang (before 3.5) and GCC (before 4.1) do not support the "__attribute__((flatten))" yet. We don't care about such old versions of GCC anymore, but since Clang 3.4 is still used in EPEL for RHEL7 / CentOS 7, we should not use this attribute directly but with a wrapper macro instead. Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Richard Henderson <richard.henderson@linaro.org> Signed-off-by:
Thomas Huth <thuth@redhat.com>
-
- Oct 05, 2018
-
-
Richard Henderson authored
The __udiv_qrnnd primitive that we nicked from gmp requires its inputs to be normalized. We were not doing that. Because the inputs are nearly normalized already, finishing that is trivial. Replace div128to64 with a "proper" udiv_qrnnd, so that this remains a reusable primitive. Fixes: cf07323d Fixes: https://bugs.launchpad.net/qemu/+bug/1793119 Tested-by:
Emilio G. Cota <cota@braap.org> Tested-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Thomas Huth authored
Our minimum required compiler for compiling QEMU is GCC 4.1 these days, so we can drop the support for compilers which do not provide the __builtin_clz*() functions yet. Since the countLeadingZeros32/64 are then identical to the clz32/64 functions, and we do not have to sync the softloat 2 codebase with upstream anymore (softloat 3 is a complete rewrite) we can simply replace the functions with our QEMU versions. Suggested-by:
Peter Maydell <peter.maydell@linaro.org> Acked-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Richard Henderson <richard.henderson@linaro.org> Reviewed-by:
Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by:
Thomas Huth <thuth@redhat.com> Message-Id: <1538118095-7003-1-git-send-email-thuth@redhat.com> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Emilio G. Cota authored
It has not had users since f83311e4 ("target-m68k: use floatx80 internally", 2017-06-21). Note that no other bit-width has floatX_trunc_to_int. Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Signed-off-by:
Emilio G. Cota <cota@braap.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
- Aug 24, 2018
-
-
Richard Henderson authored
Signed-off-by:
Richard Henderson <richard.henderson@linaro.org> Message-id: 20180814002653.12828-3-richard.henderson@linaro.org Reviewed-by:
Peter Maydell <peter.maydell@linaro.org> Signed-off-by:
Peter Maydell <peter.maydell@linaro.org>
-
Richard Henderson authored
Signed-off-by:
Richard Henderson <richard.henderson@linaro.org> Message-id: 20180814002653.12828-2-richard.henderson@linaro.org Reviewed-by:
Peter Maydell <peter.maydell@linaro.org> Signed-off-by:
Peter Maydell <peter.maydell@linaro.org>
-
- Aug 16, 2018
-
-
Richard Henderson authored
For 0x1.0000000000003p+0 + 0x1.ffffffep+14 = 0x1.0001fffp+15 we dropped the sticky bit and so failed to raise inexact. Reported-by:
Laurent Desnogues <laurent.desnogues@gmail.com> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org> Reviewed-by:
Laurent Desnogues <laurent.desnogues@gmail.com> Tested-by:
Laurent Desnogues <laurent.desnogues@gmail.com> Message-id: 20180810193129.1556-7-richard.henderson@linaro.org Signed-off-by:
Peter Maydell <peter.maydell@linaro.org>
-
- May 17, 2018
-
-
Richard Henderson authored
Isolate the target-specific choice to 3 functions instead of 6. The code in floatx80_default_nan tried to be over-general. There are only two targets that support this format: x86 and m68k. Thus there is no point in inventing a mechanism for snan_bit_is_one. Move routines that no longer have ifdefs out of softfloat-specialize.h. Tested-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
Reduce the number of ifdefs. Correct the result for OpenRISC and TriCore (although TriCore fixed in target-specific code). Tested-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
Isolate the target-specific choice to 2 functions instead of 6. The code in float16_default_nan was only correct for ARM, MIPS, and X86. Though float16 support is rare among our targets. The code in float128_default_nan was arguably wrong for Sparc. While QEMU supports the Sparc 128-bit insns, no real cpu enables it. The code in floatx80_default_nan tried to be over-general. There are only two targets that support this format: x86 and m68k. Thus there is no point in inventing a value for snan_bit_is_one. Move routines that no longer have ifdefs out of softfloat-specialize.h. Tested-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
For each operand, pass a single enumeration instead of a pair of booleans. The commit also merges multiple different ifdef-selected implementations of pickNaNMulAdd into a single function whose body is ifdef-selected. Tested-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Peter Maydell <peter.maydell@linaro.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
For each operand, pass a single enumeration instead of a pair of booleans. The commit also merges multiple different ifdef-selected implementations of pickNaN into a single function whose body is ifdef-selected. Tested-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Peter Maydell <peter.maydell@linaro.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
We will need these helpers within softfloat-specialize.h, so move the definitions above the include. After specialization, they will not always be used so mark them to avoid the Werror. Tested-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Peter Maydell <peter.maydell@linaro.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
Only MIPS requires snan_bit_is_one to be variable. While we are specializing softfloat behaviour, allow other targets to eliminate this runtime check. Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: Yongbok Kim <yongbok.kim@mips.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Alexander Graf <agraf@suse.de> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Tested-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Peter Maydell <peter.maydell@linaro.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
These functions are now unused. Tested-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Peter Maydell <peter.maydell@linaro.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
We have already checked the arguments for SNaN; we don't need to do it again. Tested-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Peter Maydell <peter.maydell@linaro.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Alex Bennée authored
This allows us to delete a lot of additional boilerplate code which is no longer needed. Reviewed-by:
Peter Maydell <peter.maydell@linaro.org> Signed-off-by:
Alex Bennée <alex.bennee@linaro.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Alex Bennée authored
For float16 ARM supports an alternative half-precision format which sacrifices the ability to represent NaN/Inf in return for a higher dynamic range. The new FloatFmt flag, arm_althp, is then used to modify the behaviour of canonicalize and round_canonical with respect to representation and exception raising. Usage of this new flag waits until we re-factor float-to-float conversions. Reviewed-by:
Peter Maydell <peter.maydell@linaro.org> Signed-off-by:
Alex Bennée <alex.bennee@linaro.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
With a canonical representation of NaNs, we can silence an SNaN immediately rather than delay until the final format is known. Tested-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Peter Maydell <peter.maydell@linaro.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
With a canonical representation of NaNs, we can return the default nan directly rather than delay the expansion until the final format is known. Note one case where we uselessly assigned to a.sign, which was overwritten/ignored later when expanding float_class_dnan. Tested-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Peter Maydell <peter.maydell@linaro.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-
Richard Henderson authored
Tested-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Reviewed-by:
Peter Maydell <peter.maydell@linaro.org> Signed-off-by:
Richard Henderson <richard.henderson@linaro.org>
-