github.com/videolan/dav1d.git
author    Martin Storsjö <martin@martin.st>  2019-10-07 13:29:41 +0300
committer Jean-Baptiste Kempf <jb@videolan.org>  2019-10-08 09:18:25 +0300
commit    61442bee60f45b05da627ddbac10a9a63e243f47
tree      fc7bb14ebeb763b87e063c80d7f7897b21a8dba1 /src/arm/32/util.S
parent    5647a57eabc454e2e2360429aba494452af00cb3
arm: mc: Port the ARM64 warp filter to arm32
Relative speedup over C code:
Cortex                   A7     A8     A9    A53    A72    A73
warp_8x8_8bpc_neon:    2.79   5.45   4.18   3.96   4.16   4.51
warp_8x8t_8bpc_neon:   2.79   5.33   4.18   3.98   4.22   4.25

Comparison to original ARM64 assembly:

ARM64:
Cortex                    A53     A72     A73
warp_8x8_8bpc_neon:    1854.6  1072.5  1102.5
warp_8x8t_8bpc_neon:   1839.6  1069.4  1089.5
ARM32:
warp_8x8_8bpc_neon:    2132.5  1160.3  1218.0
warp_8x8t_8bpc_neon:   2113.7  1148.0  1209.1
Diffstat (limited to 'src/arm/32/util.S')
-rw-r--r-- src/arm/32/util.S | 15
1 files changed, 15 insertions, 0 deletions
diff --git a/src/arm/32/util.S b/src/arm/32/util.S
index 3f2e328..ddf157f 100644
--- a/src/arm/32/util.S
+++ b/src/arm/32/util.S
@@ -69,4 +69,19 @@
#endif
.endm
+.macro transpose_8x8b q0, q1, q2, q3, r0, r1, r2, r3, r4, r5, r6, r7
+ vtrn.32 \q0, \q2
+ vtrn.32 \q1, \q3
+
+ vtrn.16 \r0, \r2
+ vtrn.16 \r1, \r3
+ vtrn.16 \r4, \r6
+ vtrn.16 \r5, \r7
+
+ vtrn.8 \r0, \r1
+ vtrn.8 \r2, \r3
+ vtrn.8 \r4, \r5
+ vtrn.8 \r6, \r7
+.endm
+
#endif /* DAV1D_SRC_ARM_32_UTIL_S */
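
To make the data flow of the new transpose_8x8b macro easier to follow, here is a small Python emulation of its three VTRN stages. It is only a sketch, not part of the patch. It assumes the caller passes four q registers that alias the eight d registers pairwise (q0 = r0:r1, q1 = r2:r3, q2 = r4:r5, q3 = r6:r7), with each d register holding one row of eight bytes; the patch itself does not state this. The Python helper name `vtrn` and the list-of-lists representation are illustrative choices.

```python
def vtrn(a, b, lane_bytes):
    """Emulate NEON VTRN in place on two registers given as byte lists.

    Lanes are lane_bytes wide. For every even lane index i, lane i+1 of
    `a` is swapped with lane i of `b`, so that afterwards
    a = [a0, b0, a2, b2, ...] and b = [a1, b1, a3, b3, ...].
    """
    n = len(a) // lane_bytes
    la = [a[i * lane_bytes:(i + 1) * lane_bytes] for i in range(n)]
    lb = [b[i * lane_bytes:(i + 1) * lane_bytes] for i in range(n)]
    for i in range(0, n, 2):
        la[i + 1], lb[i] = lb[i], la[i + 1]
    a[:] = [x for lane in la for x in lane]
    b[:] = [x for lane in lb for x in lane]


def transpose_8x8b(rows):
    """Mirror the macro: rows[i] plays the role of d register r<i>."""
    d = [list(r) for r in rows]

    # vtrn.32 q0, q2 and vtrn.32 q1, q3
    # Assumption: q<k> is the concatenation of d[2k] (low half) and d[2k+1].
    for qa, qb in ((0, 2), (1, 3)):
        a = d[2 * qa] + d[2 * qa + 1]
        b = d[2 * qb] + d[2 * qb + 1]
        vtrn(a, b, 4)
        d[2 * qa], d[2 * qa + 1] = a[:8], a[8:]
        d[2 * qb], d[2 * qb + 1] = b[:8], b[8:]

    # vtrn.16 r0, r2 / r1, r3 / r4, r6 / r5, r7
    for i, j in ((0, 2), (1, 3), (4, 6), (5, 7)):
        vtrn(d[i], d[j], 2)

    # vtrn.8 r0, r1 / r2, r3 / r4, r5 / r6, r7
    for i, j in ((0, 1), (2, 3), (4, 5), (6, 7)):
        vtrn(d[i], d[j], 1)

    return d


if __name__ == "__main__":
    matrix = [[8 * r + c for c in range(8)] for r in range(8)]
    # Under the aliasing assumption above, the result is the plain transpose.
    assert transpose_8x8b(matrix) == [list(col) for col in zip(*matrix)]
```

Under that aliasing assumption, the 32-bit stage exchanges the off-diagonal 4x4 blocks, the 16-bit stage exchanges 2x2 blocks inside each 4x4 block, and the 8-bit stage swaps individual bytes, which together give a full 8x8 byte transpose.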