author | Martin Storsjö <martin@martin.st> | 2020-06-17 11:53:17 +0300 |
---|---|---|
committer | Martin Storsjö <martin@martin.st> | 2020-06-19 12:24:05 +0300 |
commit | 53e7b21e34d0536c55b0b8ba120c2180726190b4 (patch) | |
tree | f444df676cd4ae86402cd02c5f3181b16738696c /tests | |
parent | 370200cd99bbc24c4d5c30e51ec2ef9bcf7f986e (diff) |
arm32: Add a NEON implementation of MSAC
Only use this in the cases when NEON can be used unconditionally
without runtime detection (when __ARM_NEON is defined).
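A minimal sketch of what compile-time selection of this kind can look like, assuming hypothetical function names (`decode_bool_c`, `decode_bool_neon`) and placeholder bodies that are not dav1d's actual symbols or code. The only identifier taken from the commit message is the compiler-defined macro `__ARM_NEON`.

```c
/* Hypothetical stand-in for the portable C decoder (placeholder body). */
unsigned decode_bool_c(unsigned v) { return v & 1u; }

#if defined(__ARM_NEON)
/* Hypothetical stand-in for the NEON decoder; in a real build this would
 * be an assembly routine. It is only referenced when the compiler itself
 * guarantees NEON, so no runtime CPU check is needed. */
unsigned decode_bool_neon(unsigned v) { return v & 1u; }
#define decode_bool decode_bool_neon
#define DECODE_BOOL_IMPL "neon"
#else
/* Without __ARM_NEON, every call resolves to the C version. */
#define decode_bool decode_bool_c
#define DECODE_BOOL_IMPL "c"
#endif

/* Reports which implementation the build selected. */
const char *decode_bool_impl(void) { return DECODE_BOOL_IMPL; }
```

Because the choice is made by the preprocessor, callers use `decode_bool(...)` directly and pay no function-pointer indirection.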
The speedup over the C code is very modest for the smaller functions
(and the NEON version actually is a little slower than the C code
on Cortex A7 for adapt4), but the speedup is around 2x for
adapt16.
Function | Cortex A7 | Cortex A8 | Cortex A9 | Cortex A53 | Cortex A72 | Cortex A73 |
---|---|---|---|---|---|---|
msac_decode_bool_c | 41.1 | 43.0 | 43.0 | 37.3 | 26.2 | 31.3 |
msac_decode_bool_neon | 40.2 | 42.0 | 37.2 | 32.8 | 19.9 | 25.5 |
msac_decode_bool_adapt_c | 65.1 | 70.4 | 58.5 | 54.3 | 33.2 | 40.8 |
msac_decode_bool_adapt_neon | 56.8 | 52.4 | 49.3 | 42.6 | 27.1 | 33.7 |
msac_decode_bool_equi_c | 36.9 | 37.2 | 42.8 | 32.6 | 22.7 | 42.3 |
msac_decode_bool_equi_neon | 34.9 | 35.1 | 36.4 | 29.7 | 19.5 | 36.4 |
msac_decode_symbol_adapt4_c | 114.2 | 139.0 | 111.6 | 99.9 | 65.5 | 83.5 |
msac_decode_symbol_adapt4_neon | 119.2 | 128.3 | 95.7 | 82.2 | 58.2 | 57.5 |
msac_decode_symbol_adapt8_c | 176.0 | 207.9 | 164.0 | 154.4 | 88.0 | 117.0 |
msac_decode_symbol_adapt8_neon | 128.3 | 130.3 | 110.7 | 85.1 | 59.9 | 61.4 |
msac_decode_symbol_adapt16_c | 292.1 | 320.5 | 256.4 | 246.4 | 129.1 | 173.3 |
msac_decode_symbol_adapt16_neon | 162.2 | 144.3 | 129.0 | 104.2 | 69.2 | 69.9 |
(Omitting msac_decode_hi_tok from the benchmark, as the "C" version
measured there uses the NEON version of msac_decode_symbol_adapt4.)
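The "around 2x" figure for adapt16 can be checked directly from the benchmark numbers. The sketch below copies the two adapt16 rows and divides the C timing by the NEON timing per core; the helper names (`speedup`, `mean_speedup`) are illustrative and not part of dav1d.

```c
#include <stddef.h>

/* adapt16 timings from the benchmark (lower is better),
 * in column order: Cortex A7, A8, A9, A53, A72, A73. */
const double adapt16_c[6]    = {292.1, 320.5, 256.4, 246.4, 129.1, 173.3};
const double adapt16_neon[6] = {162.2, 144.3, 129.0, 104.2,  69.2,  69.9};

/* Speedup of NEON over C: values above 1.0 mean NEON is faster. */
double speedup(double c_time, double neon_time) {
    return c_time / neon_time;
}

/* Arithmetic mean of the per-core speedups. */
double mean_speedup(const double *c, const double *neon, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += speedup(c[i], neon[i]);
    return sum / (double)n;
}
```

With these inputs the per-core ratio ranges from about 1.8 (Cortex A7) to about 2.5 (Cortex A73), while `speedup(114.2, 119.2)` for adapt4 on Cortex A7 comes out just below 1.0, matching the slowdown noted in the message.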
Diffstat (limited to 'tests')
-rw-r--r-- | tests/checkasm/msac.c | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/tests/checkasm/msac.c b/tests/checkasm/msac.c
index 33c0f0e..cdaf0de 100644
--- a/tests/checkasm/msac.c
+++ b/tests/checkasm/msac.c
@@ -239,7 +239,7 @@ void checkasm_check_msac(void) {
     c.bool = dav1d_msac_decode_bool_c;
     c.hi_tok = dav1d_msac_decode_hi_tok_c;
 
-#if ARCH_AARCH64 && HAVE_ASM
+#if (ARCH_AARCH64 || ARCH_ARM) && HAVE_ASM
     if (dav1d_get_cpu_flags() & DAV1D_ARM_CPU_FLAG_NEON) {
         c.symbol_adapt4 = dav1d_msac_decode_symbol_adapt4_neon;
         c.symbol_adapt8 = dav1d_msac_decode_symbol_adapt8_neon;
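The hunk follows a common dispatch pattern: a table of function pointers starts out with the C implementations, and entries are overridden when the build targets a supported architecture and the CPU reports the feature. The following is a simplified sketch of that pattern, not dav1d's actual code. The struct, the function names, the flag value, and the default macro values are all assumptions made so the example compiles anywhere.

```c
/* Assumed defaults so the sketch builds on any host; a real build
 * system would define these. */
#ifndef ARCH_AARCH64
#define ARCH_AARCH64 0
#endif
#ifndef ARCH_ARM
#define ARCH_ARM 1
#endif
#ifndef HAVE_ASM
#define HAVE_ASM 1
#endif

typedef unsigned (*decode_fn)(unsigned);

/* Hypothetical function-pointer table with a single entry. */
typedef struct { decode_fn symbol_adapt4; } MsacDSP;

/* Placeholder implementations; the "neon" one is plain C here. */
unsigned symbol_adapt4_c(unsigned v)    { return v % 4u; }
unsigned symbol_adapt4_neon(unsigned v) { return v % 4u; }

enum { CPU_FLAG_NEON = 1 << 0 }; /* hypothetical flag value */

void msac_dsp_init(MsacDSP *c, unsigned cpu_flags) {
    c->symbol_adapt4 = symbol_adapt4_c;     /* portable default */
#if (ARCH_AARCH64 || ARCH_ARM) && HAVE_ASM
    if (cpu_flags & CPU_FLAG_NEON)          /* override on either ARM target */
        c->symbol_adapt4 = symbol_adapt4_neon;
#else
    (void)cpu_flags;                        /* no optimized version built */
#endif
}
```

Widening the preprocessor condition from one architecture to two, as the hunk does, lets the same override block serve both 32-bit and 64-bit ARM builds.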