
github.com/FFmpeg/FFmpeg.git
Age | Commit message | Author
2019-09-16 | avfilter/x86/vf_360: add most of >8 depth asm | Paul B Mahol
2019-09-06 | x86/vf_v360: use a faster horizontal add in remap4_8bit_line_avx2 | James Almer
Signed-off-by: James Almer <jamrial@gmail.com>
2019-09-06 | x86/vf_v360: make remap{1,2}_8bit_line_avx2 work on x86_32 | James Almer
Signed-off-by: James Almer <jamrial@gmail.com>
2019-09-06 | avfilter/vf_v360: x86 SIMD for interpolations | Paul B Mahol
2019-08-07 | avfilter/vf_convolution: add x86 SIMD for filter_3x3() | Ruiling Song
Tested using a simple command (apply edge enhance):
    ./ffmpeg_g -i ~/Downloads/bbb_sunflower_1080p_30fps_normal.mp4 \
    -vf convolution="0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:5:1:1:1:0:128:128:128" \
    -an -vframes 1000 -f null /dev/null
The fps increase from 151 to 270 on my local machine.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
2019-06-12 | avfilter/vf_gblur: add missing preprocessor check | James Almer
Fixes compilation on x86_32
Signed-off-by: James Almer <jamrial@gmail.com>
2019-06-12 | avfilter/vf_gblur: add x86 SIMD optimizations | Ruiling Song
The horizontal pass get ~2x performance with the patch under single thread.
Tested overall performance using the command(avx2 enabled):
    ./ffmpeg -i 1080p.mp4 -vf gblur -f null /dev/null
    ./ffmpeg -i 1080p.mp4 -vf gblur=threads=1 -f null /dev/null
For single thread, the fps improves from 43 to 60, about 40%.
For multi-thread, the fps improves from 110 to 130, about 20%.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
2019-01-10 | avfilter: add anlmdn filter x86 SIMD optimizations | Paul B Mahol
2019-01-04 | x86/af_afir: use three operand form for some instructions | James Almer
Fixes compilation with old yasm versions.
Signed-off-by: James Almer <jamrial@gmail.com>
2019-01-03 | x86/af_afir: add ff_fcmul_add_avx() | James Almer
    fcmul_add_c: 1228.8
    fcmul_add_sse3: 334.3
    fcmul_add_avx: 186.3
Tested on a Core i5 4460 @ 3.2GHz
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2019-01-03 | avfilter/af_afir: split off fcmul_add into a DSP context | James Almer
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2019-01-03 | x86/af_afir: fix processing the last element | James Almer
ff_fcmul_add_sse3() is now identical to the C version.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
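The af_afir entries above revolve around an fcmul_add routine, which by its name appears to be a complex multiply-accumulate. The log does not show its code, so the following is only a minimal sketch of that operation under stated assumptions: the name `cmul_add_sketch`, the interleaved (re, im) float layout, and the signature are all hypothetical. Any special handling of a trailing element, which one of the entries mentions fixing, is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not FFmpeg's code: accumulate the complex product
 * t[n] * c[n] into sum[n] for len complex values stored as interleaved
 * (re, im) float pairs.
 */
static void cmul_add_sketch(float *sum, const float *t, const float *c,
                            ptrdiff_t len)
{
    assert(len >= 0);
    for (ptrdiff_t n = 0; n < len; n++) {
        const float tre = t[2 * n], tim = t[2 * n + 1];
        const float cre = c[2 * n], cim = c[2 * n + 1];

        sum[2 * n]     += tre * cre - tim * cim; /* real part */
        sum[2 * n + 1] += tre * cim + tim * cre; /* imaginary part */
    }
}
```

A scalar loop like this is the kind of reference a SIMD version would be compared against; each iteration is independent, which is what makes the operation a natural fit for vectorization.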
2018-11-22 | x86/scene_sad: fix link errors when HAVE_X86ASM is not defined | James Almer
Reviewed-by: Haihao Xiang <haihao.xiang@intel.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2018-11-15 | avfilter/vf_blend: add 10bit support | Paul B Mahol
2018-11-15 | avfilter/vf_bwdif: Use common yadif frame management logic | Philip Langdale
After adding field type management to the common yadif logic, we can remove the duplicate copy of that logic from bwdif.
2018-11-11 | avfilter/vf_framerate: factorize SAD functions which compute SAD for a whole frame | Marton Balint
Also add SIMD which works on lines because it is faster than calculating it on 8x8 blocks using pixelutils.
Signed-off-by: Marton Balint <cus@passwd.hu>
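The entry above describes computing the sum of absolute differences (SAD) for a whole frame line by line rather than in 8x8 blocks. A minimal scalar sketch of that idea for 8-bit planes follows. The function names, the stride parameters, and the 64-bit accumulator are assumptions made for illustration and are not taken from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: SAD of one line of 8-bit samples. */
static uint64_t line_sad_sketch(const uint8_t *a, const uint8_t *b, int width)
{
    uint64_t sad = 0;
    for (int x = 0; x < width; x++)
        sad += (uint64_t)abs(a[x] - b[x]);
    return sad;
}

/* Hypothetical helper: SAD of a whole plane, accumulated line by line. */
static uint64_t frame_sad_sketch(const uint8_t *a, ptrdiff_t a_stride,
                                 const uint8_t *b, ptrdiff_t b_stride,
                                 int width, int height)
{
    uint64_t sad = 0;
    assert(width >= 0 && height >= 0);
    for (int y = 0; y < height; y++) {
        sad += line_sad_sketch(a, b, width);
        a += a_stride; /* advance to the next line of each plane */
        b += b_stride;
    }
    return sad;
}
```

Working on whole lines keeps the inner loop long and contiguous, which is the property a per-line SIMD routine would exploit; padding bytes beyond `width` are never read.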
2018-05-03 | avfilter/vf_overlay: exclude nv12/nv21 formats from x86 asm check | Paul B Mahol
They are yet to be supported,
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2018-05-03 | avfilter/vf_overlay: add x86 SIMD | Paul B Mahol
Specifically for yuv444, yuv422, yuv420 format when main stream has no alpha, and alpha is straight.
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2018-04-24 | avfilter/vf_interlace: remove duplicate code with same functionality | Vasile Toncu
2018-04-05 | avfilter/x86/vf_blend : add SIMD for 16 bit version of | Martin Vignali
grainextract grainmerge average extremity negation
2018-04-05 | avfilter/x86/vf_blend : reorganize DIFFERENCE macro to reduce line duplication between 8bit and 16 bit version | Martin Vignali
2018-02-24 | avfilter/x86/vf_blend : add 16 bit version for BLEND_SIMPLE, phoenix, difference for SSE and AVX2 (x86_64) | Martin Vignali
2018-02-24 | avfilter/x86/vf_blend : indent | Martin Vignali
2018-02-24 | avfilter/x86/vf_blend : reorganize init in order to add 16 bit version | Martin Vignali
2018-01-28 | avfilter/x86/vf_blend : avfilter/x86/vf_blend : add AVX2 version for each func except divide and optimize average, grainextract, multiply, screen, grain merge | Martin Vignali
2018-01-28 | avfilter/vf_framerate: add SIMD functions for frame blending | Marton Balint
Blend function speedups on x86_64 Core i5 4460:
    ffmpeg -f lavfi -i allyuv -vf framerate=60:threads=1 -f null none
    C: 447548411 decicycles in Blend, 2048 runs, 0 skips
    SSSE3: 130020087 decicycles in Blend, 2048 runs, 0 skips
    AVX2: 128508221 decicycles in Blend, 2048 runs, 0 skips
    ffmpeg -f lavfi -i allyuv -vf format=yuv420p12,framerate=60:threads=1 -f null none
    C: 228932745 decicycles in Blend, 2048 runs, 0 skips
    SSE4: 123357781 decicycles in Blend, 2048 runs, 0 skips
    AVX2: 121215353 decicycles in Blend, 2048 runs, 0 skips
Signed-off-by: Marton Balint <cus@passwd.hu>
2018-01-11 | avfilter/x86/vf_interlace : add AVX2 version | Martin Vignali
2017-12-20 | Revert "avfilter/vf_interlace : add AVX2 for lowpass_line 8 and 16" | James Almer
This reverts commits 1a5865b6dcc97754a1d7eedc130fb58237d2a715 and 8fb1d63d919286971b8e6afad372730d6d6f25c8.
They made fate interlace tests fail when AVX2 was used.
Signed-off-by: James Almer <jamrial@gmail.com>
2017-12-19 | avfilter/x86/vf_hflip : indent | Martin Vignali
based on patch by Paul B Mahol
2017-12-19 | avfilter/x86/vf_hflip : add avx2 version for hflip_byte and hflip_short | Martin Vignali
2017-12-19 | avfilter/x86/vf_hflip : merge hflip byte and hflip short to one macro | Martin Vignali
2017-12-19 | avfilter/vf_tinterlace : add AVX2 func for lowpass_line 8 and 16 | Martin Vignali
2017-12-19 | avfilter/vf_interlace : add AVX2 for lowpass_line 8 and 16 | Martin Vignali
2017-12-19 | avfilter/vf_interlace : move func init in ff_interlace_init and add depth arg for ff_interlace_init_x86 | Martin Vignali
2017-12-15 | avfilter/x86/vf_interlace : avfilter/x86/vf_interlace : fix crash when using unaligned data in low_pass complex | Martin Vignali
related to ticket 6491
2017-12-15 | avfilter/x86/vf_interlace : avoid crash when data are unaligned | Martin Vignali
ticket 6491
2017-12-09 | avfilter/x86/vf_threshold : add threshold16 SIMD (SSE4 and AVX2) | Martin Vignali
2017-12-08 | x86/vf_hflip: use xor to zero initialize registers | James Almer
Signed-off-by: James Almer <jamrial@gmail.com>
2017-12-08 | x86/vf_hflip: don't load the width argument twice | James Almer
Signed-off-by: James Almer <jamrial@gmail.com>
2017-12-04 | x86/vf_threshold: make threshold8 functions work on x86_32 | James Almer
Signed-off-by: James Almer <jamrial@gmail.com>
2017-12-04 | avfilter/x86/vf_hflip.asm: fix building on x32 | Paul B Mahol
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-12-04 | avfilter: add hflip x86 SIMD | Paul B Mahol
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-12-04 | x86/vf_threshold: use the PBLENDVB macro | James Almer
Fixes building with yasm
Tested-by: stevenliu
Signed-off-by: James Almer <jamrial@gmail.com>
2017-12-03 | avfilter/x86/vf_threshold : cosmetic indent | Martin Vignali
2017-12-03 | avfilter/x86/vf_threshold : add avx2 version for threshold 8 | Martin Vignali
2017-12-03 | avfilter/x86/vf_threshold : make macro for threshold8 in order to add avx2 version | Martin Vignali
2017-12-02 | avfilter/vf_threshold: add x86 SIMD | Paul B Mahol
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-10-21 | Merge commit '994c4bc10751e39c7ed9f67ffd0c0dea5223daf2' | James Almer
* commit '994c4bc10751e39c7ed9f67ffd0c0dea5223daf2':
  x86util: Port all macros to cpuflags
See d5f8a642f6eb1c6e305c41dabddd0fd36ffb3f77
Merged-by: James Almer <jamrial@gmail.com>
2017-09-23 | avfilter/interlace: add support for 10 and 12 bit | Thomas Mundt
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Thomas Mundt <tmundt75@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2017-09-15 | avfilter/interlace: prevent over-sharpening with the complex low-pass filter | Thomas Mundt
The complex vertical low-pass filter slightly over-sharpens the picture. This becomes visible when several transcodings are cascaded and the error potentises, e.g. some generations of HD->SD SD->HD.
To prevent this behaviour the destination pixel must not exceed the source pixel when the average of the pixels above and below is less than the source pixel. And the other way around.
Tested and approved in a visual transcoding cascade test by video professionals.
SSIM/PSNR test with the first generation of an HD->SD file as a reference against the 6th generation(3 x SD->HD HD->SD):
Results without the patch:
    SSIM Y:0.956508 (13.615881) U:0.991601 (20.757750) V:0.993004 (21.551382) All:0.974405 (15.918463)
    PSNR y:31.838009 u:48.424280 v:48.962711 average:34.759466 min:31.699297 max:40.857847
Results with the patch:
    SSIM Y:0.970051 (15.236232) U:0.991883 (20.905857) V:0.993174 (21.658049) All:0.981290 (17.279202)
    PSNR y:34.412108 u:48.504454 v:48.969496 average:37.264644 min:34.310637 max:42.373392
Signed-off-by: Thomas Mundt <tmundt75@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
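The limiting rule stated in the entry above can be sketched for a single 8-bit sample as follows. This is an illustration only, not the patch itself: the function name and signature are hypothetical, the low-pass filter that produces `filtered` is taken as given, and leaving the value unchanged when the neighbour average equals the source pixel is a choice of this sketch that the commit message does not specify.

```c
#include <stdint.h>

/*
 * Hypothetical helper: limit an already low-pass-filtered sample so that it
 * does not overshoot the source sample.
 *   filtered     - output of the vertical low-pass filter for this pixel
 *   src          - source pixel on the current line
 *   above, below - source pixels on the lines directly above and below
 */
static uint8_t limit_sharpening_sketch(uint8_t filtered, uint8_t src,
                                       uint8_t above, uint8_t below)
{
    const int neighbours = above + below; /* twice the neighbour average */
    const int centre     = 2 * src;       /* twice the source pixel */

    /* Neighbour average below the source: result must not exceed it. */
    if (neighbours < centre && filtered > src)
        return src;
    /* Neighbour average above the source: result must not fall below it. */
    if (neighbours > centre && filtered < src)
        return src;
    return filtered;
}
```

Comparing `above + below` against `2 * src` avoids the rounding that an explicit division by two would introduce when forming the average.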