github.com/FFmpeg/FFmpeg.git
Age | Commit message | Author
2019-11-11 | avcodec/arm/mlpdsp: add missing dependency for truehd | Aman Gupta
    Signed-off-by: Aman Gupta <aman@tmm1.net>
2019-03-22 | Merge commit '0676de935b1e81bc5b5698fef3e7d48ff2ea77ff' | James Almer
    * commit '0676de935b1e81bc5b5698fef3e7d48ff2ea77ff':
      arm: Implement a NEON version of 422 h264_h_loop_filter_chroma
    Merged-by: James Almer <jamrial@gmail.com>
2019-03-21 | arm: Implement a NEON version of 422 h264_h_loop_filter_chroma | Martin Storsjö
    Previously, the 420 version was used even for 422. This fixes occasional checkasm failures.
    Signed-off-by: Martin Storsjö <martin@martin.st>
2019-03-14 | Merge commit 'cef914e08310166112ac09567e66452a7679bfc8' | James Almer
    * commit 'cef914e08310166112ac09567e66452a7679bfc8':
      arm: vp8: Optimize put_epel16_h6v6 with vp8_epel8_v6_y2
    Merged-by: James Almer <jamrial@gmail.com>
2019-02-21 | arm/h264dsp: change loop filter stride argument to ptrdiff_t | James Almer
    This was missed in d5d699ab6e6f8a8290748d107416fd5c19757a1b
    Signed-off-by: James Almer <jamrial@gmail.com>
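Several entries in this log switch stride parameters to ptrdiff_t. As a hedged illustration of why that type suits a stride: the value is a byte distance between rows, it is combined with a pointer, and it may be negative when rows are visited bottom-up. The helper and buffer below are hypothetical and are not FFmpeg's loop filter; they only show the pointer arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not FFmpeg's loop filter): copy the first byte of each
 * row into out[]. Rows are `stride` bytes apart; a negative stride visits the
 * rows bottom-up, which is why a signed, pointer-sized type is a natural fit. */
static void first_column(const uint8_t *pix, ptrdiff_t stride, int rows,
                         uint8_t *out)
{
    for (int y = 0; y < rows; y++)
        out[y] = pix[y * stride];   /* signed byte offset from the start row */
}

static void demo_strides(void)
{
    /* 3 rows of 4 bytes; the first byte of each row is 1, 2, 3. */
    static const uint8_t img[12] = { 1, 0, 0, 0,  2, 0, 0, 0,  3, 0, 0, 0 };
    uint8_t out[3];

    first_column(img, 4, 3, out);          /* top-down */
    assert(out[0] == 1 && out[1] == 2 && out[2] == 3);

    first_column(img + 8, -4, 3, out);     /* bottom-up from the last row */
    assert(out[0] == 3 && out[1] == 2 && out[2] == 1);
}
```

Calling demo_strides() exercises both directions with the same function, which is the property a shared stride type has to preserve across C and assembly implementations.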
2019-02-19 | arm: vp8: Optimize put_epel16_h6v6 with vp8_epel8_v6_y2 | Martin Storsjö
    This makes it similar to put_epel16_v6, and gives a 10-25% speedup of this function.

    Before:                    Cortex A7      A8      A9     A53     A72
    vp8_put_epel16_h6v6_neon:     3058.0  2218.5  2459.8  2183.0  1572.2
    After:
    vp8_put_epel16_h6v6_neon:     2670.8  1934.2  2244.4  1729.4  1503.9

    Signed-off-by: Martin Storsjö <martin@martin.st>
2018-04-09 | avcodec/arm/hevcdsp_sao : add NEON optimization for sao | Meng Wang
    Signed-off-by: Meng Wang <wangmeng.kids@bytedance.com>
    Reviewed-by: Shengbin Meng <shengbinmeng@gmail.com>
    Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-03-31 | arm: hevcdsp: Add commas between macro arguments | Martin Storsjö
    When targeting darwin, clang requires commas between arguments, while the no-comma form is allowed for other targets.
    Since Xcode 9.3, the bundled clang supports altmacro and doesn't require using gas-preprocessor any longer.
    Signed-off-by: Martin Storsjö <martin@martin.st>
2018-03-31 | arm: hevcdsp: Avoid using macro expansion counters | Martin Storsjö
    Clang supports the macro expansion counter (used for making unique labels within macro expansions), but not when targeting darwin. Convert uses of the counter into normal local labels, as used elsewhere.
    Since Xcode 9.3, the bundled clang supports altmacro and doesn't require using gas-preprocessor any longer.
    Signed-off-by: Martin Storsjö <martin@martin.st>
2018-03-30 | Merge commit 'ab05d3934de8e932dbd77979a687e6598e67535c' | James Almer
    * commit 'ab05d3934de8e932dbd77979a687e6598e67535c':
      arm: vc1dsp: Add commas between macro arguments
    Merged-by: James Almer <jamrial@gmail.com>
2018-03-30 | arm: vc1dsp: Add commas between macro arguments | Martin Storsjö
    When targeting darwin, clang requires commas between arguments, while the no-comma form is allowed for other targets.
    Since Xcode 9.3, the bundled clang supports altmacro and doesn't require using gas-preprocessor any longer.
    Signed-off-by: Martin Storsjö <martin@martin.st>
2018-03-08 | sbcenc: add armv6 and neon asm optimizations | Aurelien Jacobs
    This was originally based on libsbc, and was fully integrated into ffmpeg.
2018-01-13 | avcodec/arm/sbrdsp_neon: Use a free register instead of putting 2 things in one | Michael Niedermayer
    Fixes high pitched shriek
    Fixes: 25420848_1478428308873746_4255813235963330560_n.mp4
    Reported-by: Dale Curtis <dalecurtis@google.com>
    Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
    Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-12-09 | arm/hevc_idct: fix compilation on Android | James Almer
    Fixes the "out of range" compilation error for armeabi-v7a.
    Compilation failed when building libvlc.aar for ARM7 Android on an Ubuntu 16.04 host. The error message is "Offset out of range". The reason for the error is that the assembler LDR directives in function "ff_hevc_transform_luma_4x4_neon_8" need local storage within a range of <1k, but no such storage is provided.
    Based on a patch by Ihor Bobalo <bob@eleks.com>
    Suggested-by: wbs
    Signed-off-by: James Almer <jamrial@gmail.com>
2017-12-09 | hevc: Add hevc_get_pixel_4/8/12/16/24/32/48/64 | Alexandra Hájková
    Checkasm timings:

    block size  bitdepth        C     NEON
             4     8 bit    146.7     48.7
                  10 bit    146.7     52.7
             8     8 bit    430.3     84.4
                  10 bit    430.4    119.5
            12     8 bit    812.8    141.0
                  10 bit    812.8    195.0
            16     8 bit   1499.1    268.0
                  10 bit   1498.9    368.4
            24     8 bit   4394.2    574.8
                  10 bit   3696.3    804.8
            32     8 bit   5108.6    568.9
                  10 bit   4249.6    918.8
            48     8 bit  16819.6   2304.9
                  10 bit  13882.0   3178.5
            64     8 bit  13490.8   1799.5
                  10 bit  11018.5   2519.4

    Signed-off-by: Martin Storsjö <martin@martin.st>
2017-11-12 | Merge commit 'b487add7ecf78efda36d49815f8f8757bd24d4cb' | James Almer
    * commit 'b487add7ecf78efda36d49815f8f8757bd24d4cb':
      arm: Remove a redundant check in fmtconvert_init_arm.c
    Merged-by: James Almer <jamrial@gmail.com>
2017-11-11 | Merge commit '9dde6ab06c48f9447cd16f39bee33569cddb7be4' | James Almer
    * commit '9dde6ab06c48f9447cd16f39bee33569cddb7be4':
      arm: Fix SIGBUS on ARM when compiled with binutils 2.29
    Merged-by: James Almer <jamrial@gmail.com>
2017-10-31 | Merge commit 'd7320ca3ed10f0d35b3740fa03341161e74275ea' | James Almer
    * commit 'd7320ca3ed10f0d35b3740fa03341161e74275ea':
      arm: Avoid using .dn register aliases
    Merged-by: James Almer <jamrial@gmail.com>
2017-10-31 | Merge commit 'ce080f47b8b55ab3d41eb00487b138d9906d114d' | James Almer
    * commit 'ce080f47b8b55ab3d41eb00487b138d9906d114d':
      hevc: Add NEON 32x32 IDCT
    Merged-by: James Almer <jamrial@gmail.com>
2017-10-31 | Merge commit '118dd4a321a2d67f67c21b076abd0b4d939ab642' | James Almer
    * commit '118dd4a321a2d67f67c21b076abd0b4d939ab642':
      hevc: 16x16 NEON idct: Use the right element size for loads/stores
    Merged-by: James Almer <jamrial@gmail.com>
2017-10-31 | Merge commit 'edbf0fffb15dde7a1de70b05855529d5fc769f14' | James Almer
    * commit 'edbf0fffb15dde7a1de70b05855529d5fc769f14':
      hevc: Add NEON add_residual for bitdepth 10
    Merged-by: James Almer <jamrial@gmail.com>
2017-10-30 | Merge commit 'e1c2453a4fac1f7116244d0d05310935c20887e6' | James Almer
    * commit 'e1c2453a4fac1f7116244d0d05310935c20887e6':
      arm: hevc_idct: Tune the add_res_8x8 and add_res_32x32 functions
    Merged-by: James Almer <jamrial@gmail.com>
2017-10-30 | Merge commit '0d4d43513786f1df4d561e1fac924fb0722c6700' | James Almer
    * commit '0d4d43513786f1df4d561e1fac924fb0722c6700':
      hevc: Add NEON add_residual for bitdepth 8
    See 03cecf45c134ebbaecb62505fe444ade423ea7dc
    Merged-by: James Almer <jamrial@gmail.com>
2017-10-30 | Merge commit '3d69dd65c6771c28d3bf4e8e53a905aa8cd01fd9' | James Almer
    * commit '3d69dd65c6771c28d3bf4e8e53a905aa8cd01fd9':
      hevc: Add support for bitdepth 10 for IDCT DC
    Merged-by: James Almer <jamrial@gmail.com>
2017-10-30 | Merge commit '358adef0305618219522858e471edf7e0cb4043e' | James Almer
    * commit '358adef0305618219522858e471edf7e0cb4043e':
      hevc: Add NEON IDCT DC functions for bitdepth 8
    See 03cecf45c134ebbaecb62505fe444ade423ea7dc
    Merged-by: James Almer <jamrial@gmail.com>
2017-10-28 | Merge commit '89d9869d2491d4209d707a8e7f29c58227ae5a4e' | James Almer
    * commit '89d9869d2491d4209d707a8e7f29c58227ae5a4e':
      hevc: Add NEON 16x16 IDCT
    Merged-by: James Almer <jamrial@gmail.com>
2017-10-25 | Merge commit '0b9a237b2386ff84a6f99716bd58fa27a1b767e7' | James Almer
    * commit '0b9a237b2386ff84a6f99716bd58fa27a1b767e7':
      hevc: Add NEON 4x4 and 8x8 IDCT

    [15:12:59] <@ubitux> hevc_idct_4x4_8_c: 389.1
    [15:13:00] <@ubitux> hevc_idct_4x4_8_neon: 126.6
    [15:13:02] <@ubitux> our ^
    [15:13:06] <@ubitux> hevc_idct_4x4_8_c: 389.3
    [15:13:08] <@ubitux> hevc_idct_4x4_8_neon: 107.8
    [15:13:10] <@ubitux> hevc_idct_4x4_10_c: 418.6
    [15:13:12] <@ubitux> hevc_idct_4x4_10_neon: 108.1
    [15:13:14] <@ubitux> libav ^
    [15:13:30] <@ubitux> so yeah, we can probably trash our versions here

    Merged-by: James Almer <jamrial@gmail.com>
2017-10-24 | arm: Remove a redundant check in fmtconvert_init_arm.c | Martin Storsjö
    This was missed in e2710e790c0, where have_vfp && !have_vfpv3 were converted into have_vfp_vm.
    Signed-off-by: Martin Storsjö <martin@martin.st>
2017-09-02 | arm: Fix SIGBUS on ARM when compiled with binutils 2.29 | Martin Storsjö
    In binutils 2.29, the behavior of the ADR instruction changed so that 1 is added to the address of a Thumb function (previously nothing was added). This allows the loaded address to be passed to a BLX instruction and the correct mode change will occur.
    See: https://sourceware.org/bugzilla/show_bug.cgi?id=21458
    By using adr with a label that isn't annotated as a thumb function, we avoid the new behaviour in binutils 2.29 and get the same behaviour as in prior releases, and as in other assemblers (ms armasm.exe, clang's built in assembler) - an idea that Janne Grunau came up with.
    Signed-off-by: Martin Storsjö <martin@martin.st>
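The "1 is added to the address" wording refers to ARM/Thumb interworking, where BX and BLX take the target instruction set from bit 0 of the branch address. The C model below only illustrates that address arithmetic; the helper names and the 0x8000 label address are made up for the example, and nothing here is binutils or FFmpeg code. It also shows one reason an unexpected +1 can matter if the value is used as a plain address: the result is no longer word aligned.

```c
#include <assert.h>
#include <stdint.h>

/* Model of ARM/Thumb interworking addresses (illustrative only). */

/* Address as BX/BLX expects it for a Thumb target: bit 0 set. */
static uint32_t thumb_branch_target(uint32_t label_addr)
{
    return label_addr | 1u;
}

/* Instruction set selected by an interworking branch to `target`. */
static int selects_thumb(uint32_t target)
{
    return (int)(target & 1u);
}

/* Where the code actually lives: bit 0 is a mode flag, not an address bit. */
static uint32_t code_location(uint32_t target)
{
    return target & ~1u;
}

static void demo_thumb_bit(void)
{
    uint32_t label = 0x8000u;                 /* hypothetical label address */
    uint32_t with_bit = thumb_branch_target(label);

    assert(with_bit == 0x8001u);
    assert(selects_thumb(with_bit));
    assert(!selects_thumb(label));
    assert(code_location(with_bit) == label);
    /* With bit 0 set, the value is no longer 4-byte aligned. */
    assert(with_bit % 4u != 0);
}
```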
2017-07-11 | avcodec/rdft: remove sintable | Muhammad Faiz
    It is redundant with costable. The first half of sintable is identical to the second half of costable. The second half of sintable is the negative of the first half of sintable.
    The computation is changed to handle the sign of the sin values, in both the C code and the ARM assembly code.
    Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
2017-06-28 | lavc/aacpsdsp: use ptrdiff_t for stride in hybrid_analysis | Clément Bœsch
2017-06-28 | lavc/arm: fix lack of precision in ff_ps_stereo_interpolate_neon | Clément Bœsch
    The code originally pre-multiplied the steps by 2, causing the running sum of the h factors to drift away due to the lack of precision. It quickly causes an inaccuracy > 0.01.
    I tried diverse approaches, such as multiplying by 2.0 (instead of adding the value to itself), without success. I'm unable to bench the impact of this change, feel free to compare.
    This commit fixes the incoming aacpsdsp tests.
    Following is an alternative simplified function (matching the incoming AArch64 code) that may be used:

        function ff_ps_stereo_interpolate_neon, export=1
                vld1.32         {q0}, [r2]
                vld1.32         {q1}, [r3]
                ldr             r12, [sp]
                vmov.f32        q8, q0
                vmov.f32        q9, q1
                vzip.32         q8, q0
                vzip.32         q9, q1
        1:
                vld1.32         {d4}, [r0,:64]
                vld1.32         {d6}, [r1,:64]
                vadd.f32        q8, q8, q9
                vadd.f32        q0, q0, q1
                vmov.f32        d5, d4
                vmov.f32        d7, d6
                vmul.f32        q2, q2, q8
                vmla.f32        q2, q3, q0
                vst1.32         {d4}, [r0,:64]!
                vst1.32         {d5}, [r1,:64]!
                subs            r12, r12, #1
                bgt             1b
                bx              lr
        endfunc
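Read as scalar code, the simplified assembly quoted in this entry appears to compute the following. This C sketch is only a reading of that listing, under the assumption that r0 and r1 point to the left and right complex samples, r2 and r3 to four h factors and their four steps, and the stacked argument is the sample count. The function name and flat array layout are illustrative, not FFmpeg's reference implementation.

```c
#include <assert.h>

/* Scalar reading of the simplified NEON routine quoted in the commit message.
 * l and r hold `len` complex samples as {re, im} pairs; h holds four factors
 * and h_step their per-sample increments. Names and layout are assumptions. */
static void stereo_interpolate_sketch(float (*l)[2], float (*r)[2],
                                      const float h[4], const float h_step[4],
                                      int len)
{
    float h0 = h[0], h1 = h[1], h2 = h[2], h3 = h[3];

    for (int n = 0; n < len; n++) {
        float l_re = l[n][0], l_im = l[n][1];
        float r_re = r[n][0], r_im = r[n][1];

        /* Advance each factor by its own step once per sample
         * (no pre-multiplied step). */
        h0 += h_step[0];
        h1 += h_step[1];
        h2 += h_step[2];
        h3 += h_step[3];

        l[n][0] = h0 * l_re + h2 * r_re;
        l[n][1] = h0 * l_im + h2 * r_im;
        r[n][0] = h1 * l_re + h3 * r_re;
        r[n][1] = h1 * l_im + h3 * r_im;
    }
}

static void demo_stereo_interpolate(void)
{
    /* All values are exactly representable, so the comparisons are exact. */
    float l[2][2] = { { 2.0f, 4.0f }, { 2.0f, 4.0f } };
    float r[2][2] = { { 8.0f, 16.0f }, { 8.0f, 16.0f } };
    const float h[4]      = { 1.0f, 0.0f,  0.0f,  1.0f };
    const float h_step[4] = { 0.5f, 0.25f, 0.25f, 0.5f };

    stereo_interpolate_sketch(l, r, h, h_step, 2);

    /* n = 0: factors are 1.5, 0.25, 0.25, 1.5 */
    assert(l[0][0] == 5.0f  && l[0][1] == 10.0f);
    assert(r[0][0] == 12.5f && r[0][1] == 25.0f);
    /* n = 1: factors are 2.0, 0.5, 0.5, 2.0 */
    assert(l[1][0] == 8.0f  && l[1][1] == 16.0f);
    assert(r[1][0] == 17.0f && r[1][1] == 34.0f);
}
```

Because the factors advance by one step per sample, the running sums follow the same sequence a per-sample reference would, which is the property the commit message says the pre-multiplied version lost.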
2017-05-15 | arm: Avoid using .dn register aliases | Martin Storsjö
    clang now (in the upcoming 5.0 version) is capable of building our arm assembly without relying on gas-preprocessor, although clang/LLVM doesn't support .dn register aliases.
    The VC1 MC assembly was only built and used if the chosen assembler supported the .dn directives though. This was supported as long as gas-preprocessor was used.
    This means that VC1 decoding got a speed regression on clang 5.0, unless the user manually chose using gas-preprocessor again. By avoiding using the .dn register aliases, we can build the VC1 MC assembly with the latest clang version.
    Support for the .dn/.qn directives in clang/LLVM isn't actively planned, see https://bugs.llvm.org/show_bug.cgi?id=18199.
    This partially reverts 896a5bff64264f4d01ed98eacc97a67260c1e17e.
    Signed-off-by: Martin Storsjö <martin@martin.st>
2017-05-04 | hevc: Add NEON 32x32 IDCT | Alexandra Hájková
    Signed-off-by: Martin Storsjö <martin@martin.st>
2017-05-04 | hevc: 16x16 NEON idct: Use the right element size for loads/stores | Alexandra Hájková
    This doesn't change the actual behaviour of the code but improves readability.
    Signed-off-by: Martin Storsjö <martin@martin.st>
2017-05-01 | hevc: Add NEON add_residual for bitdepth 10 | Alexandra Hájková
    Signed-off-by: Martin Storsjö <martin@martin.st>
2017-04-28 | arm: hevc_idct: Tune the add_res_8x8 and add_res_32x32 functions | Martin Storsjö
    Before:                     Cortex A7      A8      A9     A53
    hevc_add_res_8x8_8_neon:        116.0    58.7    80.2    90.7
    hevc_add_res_32x32_8_neon:     1230.0   737.5  1187.5   974.4
    After:
    hevc_add_res_8x8_8_neon:         97.7    57.0    73.7    80.0
    hevc_add_res_32x32_8_neon:     1216.0   698.7  1127.5   827.1

    Signed-off-by: Martin Storsjö <martin@martin.st>
2017-04-27 | hevc: Add NEON add_residual for bitdepth 8 | Seppo Tomperi
    Optimized by Alexandra Hájková.
    Signed-off-by: Martin Storsjö <martin@martin.st>
2017-04-25 | hevc: Add support for bitdepth 10 for IDCT DC | Alexandra Hájková
    Signed-off-by: Martin Storsjö <martin@martin.st>
2017-04-25 | hevc: Add NEON IDCT DC functions for bitdepth 8 | Seppo Tomperi
    Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>
    Signed-off-by: Martin Storsjö <martin@martin.st>
2017-04-12 | hevc: Add NEON 16x16 IDCT | Alexandra Hájková
    The speedup vs C code is around 6-13x.
    Signed-off-by: Martin Storsjö <martin@martin.st>
2017-04-06 | idct_arm: remove use of ff_put/add_pixels_clamped function pointer. | Ronald S. Bultje
    Instead, hardcode the use of the _arm implementation of add_pixels, and use the C version for put_pixels (as no arm-optimized version exists). Since there are separate implementations of idct{,_put,_add} for neon, this has no practical impact on performance.
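For readers unfamiliar with the helpers named in this entry: a "put" variant stores a block of IDCT output over the destination pixels after clamping to 0..255, while an "add" variant adds the block to the existing pixels and clamps each sum. The sketch below shows both for an 8x8 block. The names and signatures are hypothetical stand-ins chosen for the example, not the FFmpeg functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint8_t clamp_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
}

/* Store an 8x8 block of IDCT output, clamped to 8 bits. */
static void put_pixels_clamped_sketch(const int16_t *block, uint8_t *pixels,
                                      ptrdiff_t line_size)
{
    for (int y = 0; y < 8; y++)
        for (int x = 0; x < 8; x++)
            pixels[y * line_size + x] = clamp_u8(block[y * 8 + x]);
}

/* Add an 8x8 block to the existing pixels, clamping each sum to 8 bits. */
static void add_pixels_clamped_sketch(const int16_t *block, uint8_t *pixels,
                                      ptrdiff_t line_size)
{
    for (int y = 0; y < 8; y++)
        for (int x = 0; x < 8; x++)
            pixels[y * line_size + x] =
                clamp_u8(pixels[y * line_size + x] + block[y * 8 + x]);
}

static void demo_clamped(void)
{
    int16_t block[64] = { 0 };
    uint8_t pixels[64];

    for (int i = 0; i < 64; i++)
        pixels[i] = 200;
    block[0] = 100;    /* 200 + 100 saturates to 255 */
    block[1] = -250;   /* 200 - 250 saturates to 0   */
    block[2] = 5;      /* 200 + 5 stays in range     */

    add_pixels_clamped_sketch(block, pixels, 8);
    assert(pixels[0] == 255 && pixels[1] == 0 && pixels[2] == 205);
    assert(pixels[3] == 200);

    put_pixels_clamped_sketch(block, pixels, 8);
    assert(pixels[0] == 100 && pixels[1] == 0 && pixels[2] == 5);
    assert(pixels[3] == 0);
}
```

Calling such helpers directly, rather than through a function pointer, is the kind of hardcoding the commit message describes.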
2017-03-29 | vp9: split out generic decoding skeleton interface API from VP9 types. | Ronald S. Bultje
    This allows vp9dsp.h to only include the VP9 types header, and not the decoder skeleton interface which is for hardware decoders (dxva2/vaapi).
2017-03-29 | vp9: re-split the decoder/format/dsp interface header files. | Ronald S. Bultje
    The advantage here is that the internal software decoder interface is not exposed to the DSP functions or the hardware accelerations.
2017-03-28 | arm: Always build the hevcdsp_init_arm.c file | Martin Storsjö
    The main hevcdsp.c file calls this init function if HAVE_ARM is set, regardless of whether neon support is available or not. This fixes builds where neon isn't supported by the build tools at all.
    Signed-off-by: Martin Storsjö <martin@martin.st>
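The build problem described in this entry follows from a common dispatch pattern: a generic init installs C fallbacks, then calls an architecture init whenever that architecture is enabled, and the architecture init decides at run time whether to install SIMD versions. Since the call is unconditional for the architecture, the file defining it must always be built. The sketch below uses hypothetical names (DSPSketch, SKETCH_HAVE_ARM, SKETCH_CPU_NEON) rather than FFmpeg's real symbols.

```c
#include <assert.h>

#define SKETCH_HAVE_ARM 1          /* stand-in for the build-time arch switch */
#define SKETCH_CPU_NEON (1 << 0)   /* stand-in for a run-time CPU flag        */

typedef struct DSPSketch {
    int (*idct_dc)(int coeff);
} DSPSketch;

static int idct_dc_c(int coeff)    { return (coeff + 1) >> 1; }
/* Pretend NEON version; in a real build it would live in an assembly file. */
static int idct_dc_neon(int coeff) { return (coeff + 1) >> 1; }

/* Arch init: has to be compiled whenever the arch is enabled, because the
 * generic init below references it under SKETCH_HAVE_ARM alone. */
static void dsp_init_arm(DSPSketch *c, int cpu_flags)
{
    if (cpu_flags & SKETCH_CPU_NEON)
        c->idct_dc = idct_dc_neon;
}

static void dsp_init(DSPSketch *c, int cpu_flags)
{
    c->idct_dc = idct_dc_c;        /* portable fallback first */
#if SKETCH_HAVE_ARM
    dsp_init_arm(c, cpu_flags);    /* called regardless of NEON availability */
#else
    (void)cpu_flags;
#endif
}

static void demo_dispatch(void)
{
    DSPSketch c;

    dsp_init(&c, 0);                       /* CPU without NEON */
    assert(c.idct_dc == idct_dc_c);

    dsp_init(&c, SKETCH_CPU_NEON);         /* CPU with NEON */
    assert(c.idct_dc == idct_dc_neon);
    assert(c.idct_dc(7) == 4);
}
```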
2017-03-27 | hevc: Add NEON 4x4 and 8x8 IDCT | Alexandra Hájková
    Optimized by Martin Storsjö <martin@martin.st>.
    The speedup vs C code is around 3.2-4.4x.
    Signed-off-by: Martin Storsjö <martin@martin.st>
2017-03-27 | lavc/vp9: split into vp9{block,data,mvs} | Clément Bœsch
    This follows the Libav layout to ease merges.
2017-03-21 | Merge commit '2caa93b813adc5dbb7771dfe615da826a2947d18' | James Almer
    * commit '2caa93b813adc5dbb7771dfe615da826a2947d18':
      mpegaudiodsp: Change type of array stride parameters to ptrdiff_t
    Merged-by: James Almer <jamrial@gmail.com>
2017-03-21 | Merge commit 'e4a94d8b36c48d95a7d412c40d7b558422ff659c' | James Almer
    * commit 'e4a94d8b36c48d95a7d412c40d7b558422ff659c':
      h264chroma: Change type of stride parameters to ptrdiff_t
    Merged-by: James Almer <jamrial@gmail.com>
2017-03-21 | Merge commit '2ec9fa5ec60dcd10e1cb10d8b4e4437e634ea428' | James Almer
    * commit '2ec9fa5ec60dcd10e1cb10d8b4e4437e634ea428':
      idct: Change type of array stride parameters to ptrdiff_t
    Merged-by: James Almer <jamrial@gmail.com>