github.com/videolan/dav1d.git
Age  Commit message  Author
2019-03-05  arm64: cdef: Clarify a slightly confusing comment  Martin Storsjö
This might have said pri_taps[k]/sec_taps[k] at some earlier time.
2019-03-05  arm64: cdef: Use a smarter padding constant  Martin Storsjö
Pad with a value which works both as a large unsigned value and a negative signed value. This allows doing the max operation using signed max, avoiding the conditional altogether. Based on the same idea for x86 by Kyle Siefring.

Before:                    Cortex A53     A72     A73
cdef_filter_4x4_8bpc_neon:      645.5   401.9   422.5
cdef_filter_4x8_8bpc_neon:     1193.7   756.6   782.4
cdef_filter_8x8_8bpc_neon:     2162.4  1361.9  1375.6
After:
cdef_filter_4x4_8bpc_neon:      596.3   377.8   384.8
cdef_filter_4x8_8bpc_neon:     1097.4   705.5   707.1
cdef_filter_8x8_8bpc_neon:     1967.4  1232.3  1239.9
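
The padding idea in this commit can be sketched in scalar C rather than NEON assembly. This is an illustration only, not the code from the commit: the constant 0x8000 is an assumption (the message does not state the value used), and minmax_skip_padding, as_signed and PAD_VALUE are hypothetical names. It assumes real pixel values stay below 0x8000, so padding loses both an unsigned min and a signed max without any conditional on the padding value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed padding constant: 0x8000 is 32768 when read as uint16_t and
 * -32768 when read as int16_t. The commit does not state the exact value. */
#define PAD_VALUE ((uint16_t)0x8000)

/* Reinterpret the 16 bits as a signed value without relying on
 * implementation-defined integer conversion. */
static int16_t as_signed(uint16_t v) {
    int16_t s;
    memcpy(&s, &v, sizeof s);
    return s;
}

/* Hypothetical helper: min and max over the centre pixel px and n
 * neighbours, where out-of-frame neighbours hold PAD_VALUE. No
 * per-element test for padding is needed. */
static void minmax_skip_padding(const uint16_t *nb, size_t n, uint16_t px,
                                uint16_t *out_min, uint16_t *out_max) {
    assert(px < PAD_VALUE); /* real pixels must stay below the padding value */
    uint16_t mn = px;
    int16_t mx = as_signed(px);
    for (size_t i = 0; i < n; i++) {
        /* Unsigned min: padding is larger than any pixel, so it never wins. */
        if (nb[i] < mn) mn = nb[i];
        /* Signed max: padding is negative, so it never wins. */
        int16_t s = as_signed(nb[i]);
        if (s > mx) mx = s;
    }
    *out_min = mn;
    *out_max = (uint16_t)mx;
}
```

In a vector implementation the two comparisons would map to an unsigned min and a signed max instruction on the same register contents, which is what removes the conditional.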
2019-03-05  arm64: cdef: Do saturating subtractions to avoid max operations with 0  Martin Storsjö
Before:                    Cortex A53     A72     A73
cdef_filter_4x4_8bpc_neon:      677.4   433.9   452.9
cdef_filter_4x8_8bpc_neon:     1255.0   815.2   841.8
cdef_filter_8x8_8bpc_neon:     2278.5  1440.0  1505.0
After:
cdef_filter_4x4_8bpc_neon:      645.5   401.9   422.5
cdef_filter_4x8_8bpc_neon:     1193.7   756.6   782.4
cdef_filter_8x8_8bpc_neon:     2162.4  1361.9  1375.6
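
As a rough scalar illustration of the saturating-subtraction change, the sketch below compares a clamp written as max(0, a - b) with an unsigned saturating subtract. It is modelled on a CDEF-style constrain step as an assumption; the commit message does not say which max operations were replaced, and the function names here are hypothetical, not taken from the source.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Scalar model of an unsigned saturating subtract: the result clamps at 0. */
static uint16_t sat_sub_u16(uint16_t a, uint16_t b) {
    return a > b ? (uint16_t)(a - b) : 0;
}

/* Reference form with an explicit max against 0. */
static int constrain_with_max(int diff, int threshold, int shift) {
    int adiff = abs(diff);
    int limit = threshold - (adiff >> shift);
    if (limit < 0) limit = 0; /* max(0, ...) */
    int mag = adiff < limit ? adiff : limit;
    return diff < 0 ? -mag : mag;
}

/* Same result, but the clamp at 0 comes from the saturating subtract,
 * so no separate max operation is needed. */
static int constrain_with_sat_sub(int diff, int threshold, int shift) {
    assert(threshold >= 0 && shift >= 0);
    uint16_t adiff = (uint16_t)abs(diff);
    uint16_t limit = sat_sub_u16((uint16_t)threshold, (uint16_t)(adiff >> shift));
    uint16_t mag = adiff < limit ? adiff : limit;
    return diff < 0 ? -(int)mag : (int)mag;
}
```

Both forms agree whenever the threshold is non-negative and the operands fit in 16 bits, which is the range this sketch assumes.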
2019-02-14  arm64: cdef: NEON implementation of the dir function  Martin Storsjö
Speedup vs C code:    Cortex A53    A72    A73
cdef_dir_8bpc_neon:         4.43   3.51   4.39
2019-02-11  arm64: cdef: NEON optimized cdef filter function  Martin Storsjö
Speedup vs C code:           Cortex A53    A72    A73
cdef_filter_4x4_8bpc_neon:         4.62   4.48   4.76
cdef_filter_4x8_8bpc_neon:         4.82   4.80   5.08
cdef_filter_8x8_8bpc_neon:         5.29   5.33   5.79