| Field | Value | Date |
|---|---|---|
| author | Martin Storsjö <martin@martin.st> | 2021-08-22 15:07:45 +0300 |
| committer | Martin Storsjö <martin@martin.st> | 2021-08-24 23:31:22 +0300 |
| commit | 5d14b4e6710057d8f16b1d5259ad1e6accef6f28 | |
| tree | a20bd11105e8bd3689f39de5732ce428d2ec08b9 /src/refmvs.c | |
| parent | 73851bc8a4fab0b18ed59be79362f07c86237fda | |
arm: Add NEON implementations of splat_mv
Relative speedup over C code, for arm64:
                   Cortex A53    A72    A73   Apple M1
splat_mv_w1_neon:        1.09   0.95   1.22          -
splat_mv_w2_neon:        1.76   1.32   1.74          -
splat_mv_w4_neon:        2.78   2.19   2.19      15.00
splat_mv_w8_neon:        3.59   2.06   2.59      12.00
splat_mv_w16_neon:       4.12   1.72   2.53       3.14
splat_mv_w32_neon:       4.07   1.60   2.40       3.00
(The resolution of the timer used on Apple M1 isn't enough to
measure the small versions of this function.)
Relative speedup over C code, for arm32:
                   Cortex A7     A8     A9    A53    A72    A73
splat_mv_w1_neon:       0.70   1.12   0.91   0.65   1.01   1.06
splat_mv_w2_neon:       0.94   2.16   2.01   0.99   2.52   1.63
splat_mv_w4_neon:       1.27   2.04   1.49   1.52   1.75   2.18
splat_mv_w8_neon:       1.75   2.47   1.16   2.88   1.95   2.58
splat_mv_w16_neon:      2.00   2.44   1.12   3.25   1.85   2.65
splat_mv_w32_neon:      1.43   2.28   1.19   3.55   1.77   2.65
Diffstat (limited to 'src/refmvs.c')
-rw-r--r-- | src/refmvs.c | 6 |
1 files changed, 5 insertions, 1 deletions
diff --git a/src/refmvs.c b/src/refmvs.c
index 724abb3..a845101 100644
--- a/src/refmvs.c
+++ b/src/refmvs.c
@@ -921,7 +921,11 @@ COLD void dav1d_refmvs_dsp_init(Dav1dRefmvsDSPContext *const c)
 {
     c->splat_mv = splat_mv_c;
 
-#if HAVE_ASM && ARCH_X86
+#if HAVE_ASM
+#if ARCH_AARCH64 || ARCH_ARM
+    dav1d_refmvs_dsp_init_arm(c);
+#elif ARCH_X86
     dav1d_refmvs_dsp_init_x86(c);
 #endif
+#endif
 }
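
The hunk above follows a common dispatch pattern: install the portable C function first, then let an architecture-specific init routine overwrite the pointer when assembly is available. The following is a minimal, self-contained sketch of that pattern, not dav1d's actual code. Every `Demo*`/`demo_*`/`DEMO_*` name is hypothetical, the 12-byte block layout is an assumption chosen for illustration, and the fill loop is only a plausible reading of what "splat" means here (copying one block value across a `bw4` by `bh4` region starting at column `bx4`).

```c
#include <stdint.h>

/* Hypothetical stand-ins for HAVE_ASM / ARCH_*; they default to 0 so the
 * sketch builds anywhere and keeps the portable C path. */
#ifndef DEMO_HAVE_ASM
#define DEMO_HAVE_ASM 0
#endif
#ifndef DEMO_ARCH_AARCH64
#define DEMO_ARCH_AARCH64 0
#endif
#ifndef DEMO_ARCH_ARM
#define DEMO_ARCH_ARM 0
#endif
#ifndef DEMO_ARCH_X86
#define DEMO_ARCH_X86 0
#endif

/* Assumed opaque record; the real block type is not shown in this diff. */
typedef struct DemoBlock { uint8_t bytes[12]; } DemoBlock;

typedef void (*demo_splat_fn)(DemoBlock **rr, const DemoBlock *rmv,
                              int bx4, int bw4, int bh4);

typedef struct DemoDSPContext { demo_splat_fn splat_mv; } DemoDSPContext;

/* Portable fallback: copy *rmv into bw4 columns of each of bh4 rows. */
static void demo_splat_mv_c(DemoBlock **rr, const DemoBlock *rmv,
                            int bx4, int bw4, int bh4)
{
    for (int y = 0; y < bh4; y++) {
        DemoBlock *row = rr[y] + bx4;
        for (int x = 0; x < bw4; x++)
            row[x] = *rmv;
    }
}

#if DEMO_HAVE_ASM
#if DEMO_ARCH_AARCH64 || DEMO_ARCH_ARM
/* Placeholder: a real version would check CPU flags and install a NEON
 * function pointer. Here it leaves the C fallback in place. */
static void demo_dsp_init_arm(DemoDSPContext *c) { (void)c; }
#elif DEMO_ARCH_X86
/* Placeholder for the x86 counterpart. */
static void demo_dsp_init_x86(DemoDSPContext *c) { (void)c; }
#endif
#endif

/* Same nesting as the patched function: one outer asm guard, then a
 * per-architecture branch inside it. */
void demo_dsp_init(DemoDSPContext *c)
{
    c->splat_mv = demo_splat_mv_c;

#if DEMO_HAVE_ASM
#if DEMO_ARCH_AARCH64 || DEMO_ARCH_ARM
    demo_dsp_init_arm(c);
#elif DEMO_ARCH_X86
    demo_dsp_init_x86(c);
#endif
#endif
}
```

After `demo_dsp_init(&c)`, callers invoke `c.splat_mv(...)` without knowing which implementation was installed. Splitting the old `HAVE_ASM && ARCH_X86` test into an outer `HAVE_ASM` guard with an inner `#if`/`#elif` chain lets further architectures be added as extra branches without repeating the asm check.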