author | Martin Storsjö <martin@martin.st> | 2020-11-18 15:20:18 +0300 |
---|---|---|
committer | Martin Storsjö <martin@martin.st> | 2020-11-20 23:32:12 +0300 |
commit | 018e64e714faadcf3840bd6a0a52cf6a8a3acde2 | |
tree | 72ae76fe323114d94fd31d40db30984466187f8c /src/meson.build | |
parent | e41a2a1fe0e10a8b2e22f238acb596c35b2f2f7f | |
arm32: cdef: Add NEON implementations of CDEF for 16 bpc
Use a shared template file for assembly functions that can be
templated into 8 and 16 bpc forms, just like in the arm64 version.
Checkasm benchmarks:
                             Cortex A7      A8     A53     A72     A73
cdef_dir_16bpc_neon:             975.9   853.2   555.2   378.7   386.9
cdef_filter_4x4_16bpc_neon:      746.9   521.7   481.2   333.0   340.8
cdef_filter_4x8_16bpc_neon:     1300.0   885.5   816.3   582.7   599.5
cdef_filter_8x8_16bpc_neon:     2282.5  1415.0  1417.6  1059.0  1076.3
Corresponding numbers for arm64, for comparison:
                             Cortex A53     A72     A73
cdef_dir_16bpc_neon:              418.0   306.7   310.7
cdef_filter_4x4_16bpc_neon:       453.4   282.9   297.4
cdef_filter_4x8_16bpc_neon:       807.5   514.2   533.8
cdef_filter_8x8_16bpc_neon:      1425.2   924.4   942.0
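
For readers who want the arm32/arm64 comparison as ratios, the two tables can be compared with a short script. This is only a sketch: the numbers are copied from the tables above for the three cores that appear in both, the helper name `arm32_over_arm64` is made up for illustration, and it assumes that a lower checkasm number means a faster function.

```python
# Checkasm numbers copied from the commit message (assumed: lower is faster).
CORES = ("A53", "A72", "A73")

arm32 = {
    "cdef_dir_16bpc_neon":        (555.2, 378.7, 386.9),
    "cdef_filter_4x4_16bpc_neon": (481.2, 333.0, 340.8),
    "cdef_filter_4x8_16bpc_neon": (816.3, 582.7, 599.5),
    "cdef_filter_8x8_16bpc_neon": (1417.6, 1059.0, 1076.3),
}

arm64 = {
    "cdef_dir_16bpc_neon":        (418.0, 306.7, 310.7),
    "cdef_filter_4x4_16bpc_neon": (453.4, 282.9, 297.4),
    "cdef_filter_4x8_16bpc_neon": (807.5, 514.2, 533.8),
    "cdef_filter_8x8_16bpc_neon": (1425.2, 924.4, 942.0),
}


def arm32_over_arm64(func: str, core: str) -> float:
    """Return the arm32 number divided by the arm64 number for one core.

    A value above 1.0 means the arm32 function reports a higher number
    than its arm64 counterpart on that core.
    """
    i = CORES.index(core)
    return arm32[func][i] / arm64[func][i]


if __name__ == "__main__":
    for func in arm32:
        ratios = ", ".join(
            f"{core}: {arm32_over_arm64(func, core):.2f}x" for core in CORES
        )
        print(f"{func:<28} {ratios}")
```

Under that assumption, the ratios show per function and per core how far the arm32 code is from the arm64 code it was modelled on.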
Diffstat (limited to 'src/meson.build')
-rw-r--r-- | src/meson.build | 1 |
1 file changed, 1 insertion, 0 deletions
diff --git a/src/meson.build b/src/meson.build
index 61dcc9e..acbf988 100644
--- a/src/meson.build
+++ b/src/meson.build
@@ -147,6 +147,7 @@ if is_asm_enabled
             if dav1d_bitdepths.contains('16')
                 libdav1d_sources_asm += files(
+                    'arm/32/cdef16.S',
                     'arm/32/looprestoration16.S',
                     'arm/32/mc16.S',
                 )