github.com/videolan/dav1d.git
path: root/src/ppc
Age | Commit message | Author
2020-12-13 | x86: Rewrite wiener SSE2/SSSE3/AVX2 asm | Henrik Gramner
The previous implementation ran separate horizontal and vertical passes, storing the intermediate values in a buffer on the stack, which caused bad cache thrashing. Interleaving the two passes and keeping only a few rows at a time in a ring buffer improves performance significantly. The function is also split into 7-tap and 5-tap versions; the latter is faster and fairly common (always used for chroma, sometimes for luma).
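The interleaving idea can be illustrated in plain C. This is a minimal sketch, not the dav1d assembly: it assumes a toy 3-tap kernel instead of the 7-tap and 5-tap Wiener filters, edge replication, a stride equal to the width, and hypothetical names (`filter_two_pass`, `filter_ring`, `hfilter_row`, `MAX_W`, `MAX_H`). The two-pass version buffers the whole intermediate image, while the ring version keeps only `TAPS` horizontally filtered rows live.

```c
#include <assert.h>
#include <stdint.h>

enum { TAPS = 3, MAX_W = 64, MAX_H = 64 };      /* illustrative sizes */
static const int kernel[TAPS] = { 1, 2, 1 };    /* taps sum to 4 */

static int clampi(int v, int lo, int hi) {
    return v < lo ? lo : v > hi ? hi : v;
}

/* Horizontal pass over one row, replicating edge pixels. */
static void hfilter_row(int *out, const uint8_t *row, int w) {
    for (int x = 0; x < w; x++) {
        int sum = 0;
        for (int k = 0; k < TAPS; k++)
            sum += kernel[k] * row[clampi(x + k - TAPS / 2, 0, w - 1)];
        out[x] = sum;
    }
}

/* Two-pass reference: the whole intermediate image sits on the stack. */
static void filter_two_pass(uint8_t *dst, const uint8_t *src, int w, int h) {
    int tmp[MAX_H][MAX_W];
    assert(w >= 1 && w <= MAX_W && h >= 1 && h <= MAX_H);
    for (int y = 0; y < h; y++)
        hfilter_row(tmp[y], src + y * w, w);
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            int sum = 0;
            for (int k = 0; k < TAPS; k++)
                sum += kernel[k] * tmp[clampi(y + k - TAPS / 2, 0, h - 1)][x];
            dst[y * w + x] = (uint8_t)((sum + 8) >> 4);  /* divide by 4*4, rounded */
        }
}

/* Interleaved version: only TAPS intermediate rows are live at any time. */
static void filter_ring(uint8_t *dst, const uint8_t *src, int w, int h) {
    int ring[TAPS][MAX_W];
    int next = 0;  /* next source row awaiting its horizontal pass */
    assert(w >= 1 && w <= MAX_W && h >= 1);
    for (int y = 0; y < h; y++) {
        /* Filter horizontally just far enough ahead for output row y. */
        const int last = clampi(y + TAPS / 2, 0, h - 1);
        for (; next <= last; next++)
            hfilter_row(ring[next % TAPS], src + next * w, w);
        /* Vertical pass reads the rows it needs from their ring slots. */
        for (int x = 0; x < w; x++) {
            int sum = 0;
            for (int k = 0; k < TAPS; k++) {
                const int r = clampi(y + k - TAPS / 2, 0, h - 1);
                sum += kernel[k] * ring[r % TAPS][x];
            }
            dst[y * w + x] = (uint8_t)((sum + 8) >> 4);
        }
    }
}
```

Both functions produce identical output; the ring version simply touches far less intermediate memory per output row, which is the property the commit message credits for the reduced cache thrashing.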
2020-12-13 | Add miscellaneous minor wiener optimizations | Henrik Gramner
Combine horizontal and vertical filter pointers into a single parameter when calling the wiener DSP function. Eliminate the +128 filter coefficient handling where possible.
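As a rough illustration of combining the two filter pointers, the sketch below contrasts a two-pointer signature with a single two-row array parameter. The layout (row 0 horizontal, row 1 vertical, 8 entries per row) and all names here are assumptions made for the example, not the actual dav1d DSP prototype.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: row 0 holds horizontal taps, row 1 vertical taps. */
enum { FILTER_H = 0, FILTER_V = 1, MAX_TAPS = 8 };

static int tap_sum(const int16_t taps[MAX_TAPS]) {
    int sum = 0;
    assert(taps);
    for (int i = 0; i < MAX_TAPS; i++)
        sum += taps[i];
    return sum;
}

/* Before: the caller passes two separate pointers. */
static int dc_gain_two_ptrs(const int16_t *fh, const int16_t *fv) {
    return tap_sum(fh) * tap_sum(fv);
}

/* After: one parameter carries both filters, so the call uses one
 * fewer argument and the callee indexes into the same array. */
static int dc_gain_combined(const int16_t filter[2][MAX_TAPS]) {
    return tap_sum(filter[FILTER_H]) * tap_sum(filter[FILTER_V]);
}
```

In hand-written assembly the saving is mostly one fewer argument register or stack slot per call; the C version only shows the shape of the interface change.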
2020-02-01 | Rework the CDEF top edge handling | Henrik Gramner
Avoids some pointer chasing and simplifies the DSP code, at the cost of making the initialization a little bit more complicated. Also reduces memory usage by a small amount due to properly sizing the buffers instead of always allocating enough space for 4:4:4.
2019-10-09 | Add VSX wiener filter implementation | Michail Alvanos
2019-08-19 | Utilize the constraints in assertions to improve code generation | Henrik Gramner
When compiling in release mode, instead of just deleting assertions, use them to give hints to the compiler. This allows for slightly better code generation in some cases.
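One common way to turn assertions into optimizer hints is sketched below. The `ASSUME` macro name is hypothetical and this is not necessarily how dav1d defines its own assertion macro; it relies on the `__builtin_assume` (Clang), `__builtin_unreachable` (GCC) and `__assume` (MSVC) compiler intrinsics, and falls back to a no-op elsewhere.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro: checks the condition in debug builds and
 * promises it to the optimizer in release (NDEBUG) builds. */
#ifdef NDEBUG
#  if defined(__clang__)
#    define ASSUME(x) __builtin_assume(x)
#  elif defined(__GNUC__)
#    define ASSUME(x) do { if (!(x)) __builtin_unreachable(); } while (0)
#  elif defined(_MSC_VER)
#    define ASSUME(x) __assume(x)
#  else
#    define ASSUME(x) ((void)0)
#  endif
#else
#  define ASSUME(x) assert(x)
#endif

/* Sums n bytes. The caller guarantees n is a non-zero multiple of 4,
 * which lets the compiler skip empty-loop and remainder handling
 * when it unrolls or vectorizes the loop. */
static unsigned sum_bytes(const unsigned char *p, size_t n) {
    ASSUME(n > 0 && (n & 3) == 0);
    unsigned s = 0;
    for (size_t i = 0; i < n; i++)
        s += p[i];
    return s;
}
```

The trade-off is that a violated condition becomes undefined behaviour in release builds rather than a silently ignored check, so only conditions that are true invariants should be expressed this way.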
2019-07-27 | vsx: Add cdef_filter | Luca Barbato
clang-8:
cdef_filter_4x4_8bpc_c: 436.6
cdef_filter_4x4_8bpc_vsx: 101.1
cdef_filter_4x8_8bpc_c: 827.7
cdef_filter_4x8_8bpc_vsx: 183.5
cdef_filter_8x8_8bpc_c: 1510.2
cdef_filter_8x8_8bpc_vsx: 289.1

gcc-9:
cdef_filter_4x4_8bpc_c: 403.2
cdef_filter_4x4_8bpc_vsx: 105.6
cdef_filter_4x8_8bpc_c: 825.5
cdef_filter_4x8_8bpc_vsx: 192.2
cdef_filter_8x8_8bpc_c: 1586.3
cdef_filter_8x8_8bpc_vsx: 295.0
2019-07-27 | vsx: Add shorter types and unpack helpers | Luca Barbato
2019-06-10 | Initial PowerPC support | Luca Barbato
Limited to PowerPC64 LE for now.