github.com/FFmpeg/FFmpeg.git

Age	Commit message	Author
2019-09-16	avfilter/x86/vf_360: add most of >8 depth asm	Paul B Mahol

2019-09-06	x86/vf_v360: use a faster horizontal add in remap4_8bit_line_avx2	James Almer
	Signed-off-by: James Almer <jamrial@gmail.com>
2019-09-06	x86/vf_v360: make remap{1,2}_8bit_line_avx2 work on x86_32	James Almer
	Signed-off-by: James Almer <jamrial@gmail.com>
2019-09-06	avfilter/vf_v360: x86 SIMD for interpolations	Paul B Mahol

2019-08-07	avfilter/vf_convolution: add x86 SIMD for filter_3x3()	Ruiling Song
	Tested using a simple command (apply edge enhance):
	./ffmpeg_g -i ~/Downloads/bbb_sunflower_1080p_30fps_normal.mp4 \
	    -vf convolution="0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:5:1:1:1:0:128:128:128" \
	    -an -vframes 1000 -f null /dev/null
	The fps increase from 151 to 270 on my local machine.
	Signed-off-by: Ruiling Song <ruiling.song@intel.com>
2019-06-12	avfilter/vf_gblur: add missing preprocessor check	James Almer
	Fixes compilation on x86_32 Signed-off-by: James Almer <jamrial@gmail.com>
2019-06-12	avfilter/vf_gblur: add x86 SIMD optimizations	Ruiling Song
	The horizontal pass get ~2x performance with the patch under single thread.
	Tested overall performance using the command (avx2 enabled):
	./ffmpeg -i 1080p.mp4 -vf gblur -f null /dev/null
	./ffmpeg -i 1080p.mp4 -vf gblur=threads=1 -f null /dev/null
	For single thread, the fps improves from 43 to 60, about 40%.
	For multi-thread, the fps improves from 110 to 130, about 20%.
	Signed-off-by: Ruiling Song <ruiling.song@intel.com>
2019-01-10	avfilter: add anlmdn filter x86 SIMD optimizations	Paul B Mahol

2019-01-04	x86/af_afir: use three operand form for some instructions	James Almer
	Fixes compilation with old yasm versions. Signed-off-by: James Almer <jamrial@gmail.com>
2019-01-03	x86/af_afir: add ff_fcmul_add_avx()	James Almer
	fcmul_add_c: 1228.8
	fcmul_add_sse3: 334.3
	fcmul_add_avx: 186.3
	Tested on a Core i5 4460 @ 3.2GHz
	Reviewed-by: Paul B Mahol <onemda@gmail.com>
	Signed-off-by: James Almer <jamrial@gmail.com>
2019-01-03	avfilter/af_afir: split off fcmul_add into a DSP context	James Almer
	Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
2019-01-03	x86/af_afir: fix processing the last element	James Almer
	ff_fcmul_add_sse3() is now identical to the C version. Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
2018-11-22	x86/scene_sad: fix link errors when HAVE_X86ASM is not defined	James Almer
	Reviewed-by: Haihao Xiang <haihao.xiang@intel.com> Signed-off-by: James Almer <jamrial@gmail.com>
2018-11-15	avfilter/vf_blend: add 10bit support	Paul B Mahol

2018-11-15	avfilter/vf_bwdif: Use common yadif frame management logic	Philip Langdale
	After adding field type management to the common yadif logic, we can remove the duplicate copy of that logic from bwdif.
2018-11-11	avfilter/vf_framerate: factorize SAD functions which compute SAD for a whole frame	Marton Balint
	Also add SIMD which works on lines because it is faster than calculating it on 8x8 blocks using pixelutils. Signed-off-by: Marton Balint <cus@passwd.hu>
2018-05-03	avfilter/vf_overlay: exclude nv12/nv21 formats from x86 asm check	Paul B Mahol
	They are yet to be supported. Signed-off-by: Paul B Mahol <onemda@gmail.com>
2018-05-03	avfilter/vf_overlay: add x86 SIMD	Paul B Mahol
	Specifically for yuv444, yuv422, yuv420 format when main stream has no alpha, and alpha is straight. Signed-off-by: Paul B Mahol <onemda@gmail.com>
2018-04-24	avfilter/vf_interlace: remove duplicate code with same functionality	Vasile Toncu

2018-04-05	avfilter/x86/vf_blend : add SIMD for 16 bit version of	Martin Vignali
	grainextract grainmerge average extremity negation
2018-04-05	avfilter/x86/vf_blend : reorganize DIFFERENCE macro to reduce line duplication between 8bit and 16 bit version	Martin Vignali
2018-02-24	avfilter/x86/vf_blend : add 16 bit version for BLEND_SIMPLE, phoenix, difference for SSE and AVX2 (x86_64)	Martin Vignali
2018-02-24	avfilter/x86/vf_blend : indent	Martin Vignali

2018-02-24	avfilter/x86/vf_blend : reorganize init in order to add 16 bit version	Martin Vignali

2018-01-28	avfilter/x86/vf_blend : add AVX2 version for each func except divide and optimize average, grainextract, multiply, screen, grain merge	Martin Vignali
2018-01-28	avfilter/vf_framerate: add SIMD functions for frame blending	Marton Balint
	Blend function speedups on x86_64 Core i5 4460:
	ffmpeg -f lavfi -i allyuv -vf framerate=60:threads=1 -f null none
	C: 447548411 decicycles in Blend, 2048 runs, 0 skips
	SSSE3: 130020087 decicycles in Blend, 2048 runs, 0 skips
	AVX2: 128508221 decicycles in Blend, 2048 runs, 0 skips
	ffmpeg -f lavfi -i allyuv -vf format=yuv420p12,framerate=60:threads=1 -f null none
	C: 228932745 decicycles in Blend, 2048 runs, 0 skips
	SSE4: 123357781 decicycles in Blend, 2048 runs, 0 skips
	AVX2: 121215353 decicycles in Blend, 2048 runs, 0 skips
	Signed-off-by: Marton Balint <cus@passwd.hu>
2018-01-11	avfilter/x86/vf_interlace : add AVX2 version	Martin Vignali

2017-12-20	Revert "avfilter/vf_interlace : add AVX2 for lowpass_line 8 and 16"	James Almer
	This reverts commits 1a5865b6dcc97754a1d7eedc130fb58237d2a715 and 8fb1d63d919286971b8e6afad372730d6d6f25c8. They made fate interlace tests fail when AVX2 was used. Signed-off-by: James Almer <jamrial@gmail.com>
2017-12-19	avfilter/x86/vf_hflip : indent	Martin Vignali
	based on patch by Paul B Mahol
2017-12-19	avfilter/x86/vf_hflip : add avx2 version for hflip_byte and hflip_short	Martin Vignali

2017-12-19	avfilter/x86/vf_hflip : merge hflip byte and hflip short to one macro	Martin Vignali

2017-12-19	avfilter/vf_tinterlace : add AVX2 func for lowpass_line 8 and 16	Martin Vignali

2017-12-19	avfilter/vf_interlace : add AVX2 for lowpass_line 8 and 16	Martin Vignali

2017-12-19	avfilter/vf_interlace : move func init in ff_interlace_init and add depth arg for ff_interlace_init_x86	Martin Vignali
2017-12-15	avfilter/x86/vf_interlace : fix crash when using unaligned data in low_pass complex	Martin Vignali
	related to ticket 6491
2017-12-15	avfilter/x86/vf_interlace : avoid crash when data are unaligned	Martin Vignali
	ticket 6491
2017-12-09	avfilter/x86/vf_threshold : add threshold16 SIMD (SSE4 and AVX2)	Martin Vignali

2017-12-08	x86/vf_hflip: use xor to zero initialize registers	James Almer
	Signed-off-by: James Almer <jamrial@gmail.com>
2017-12-08	x86/vf_hflip: don't load the width argument twice	James Almer
	Signed-off-by: James Almer <jamrial@gmail.com>
2017-12-04	x86/vf_threshold: make threshold8 functions work on x86_32	James Almer
	Signed-off-by: James Almer <jamrial@gmail.com>
2017-12-04	avfilter/x86/vf_hflip.asm: fix building on x32	Paul B Mahol
	Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-12-04	avfilter: add hflip x86 SIMD	Paul B Mahol
	Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-12-04	x86/vf_threshold: use the PBLENDVB macro	James Almer
	Fixes building with yasm. Tested-by: stevenliu Signed-off-by: James Almer <jamrial@gmail.com>
2017-12-03	avfilter/x86/vf_threshold : cosmetic indent	Martin Vignali

2017-12-03	avfilter/x86/vf_threshold : add avx2 version for threshold 8	Martin Vignali

2017-12-03	avfilter/x86/vf_threshold : make macro for threshold8 in order to add avx2 version	Martin Vignali
2017-12-02	avfilter/vf_threshold: add x86 SIMD	Paul B Mahol
	Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-10-21	Merge commit '994c4bc10751e39c7ed9f67ffd0c0dea5223daf2'	James Almer
	* commit '994c4bc10751e39c7ed9f67ffd0c0dea5223daf2':
	  x86util: Port all macros to cpuflags
	See d5f8a642f6eb1c6e305c41dabddd0fd36ffb3f77
	Merged-by: James Almer <jamrial@gmail.com>
2017-09-23	avfilter/interlace: add support for 10 and 12 bit	Thomas Mundt
	Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: Thomas Mundt <tmundt75@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
2017-09-15	avfilter/interlace: prevent over-sharpening with the complex low-pass filter	Thomas Mundt
	The complex vertical low-pass filter slightly over-sharpens the picture. This becomes visible when several transcodings are cascaded and the error potentises, e.g. some generations of HD->SD SD->HD.
	To prevent this behaviour the destination pixel must not exceed the source pixel when the average of the pixels above and below is less than the source pixel. And the other way around.
	Tested and approved in a visual transcoding cascade test by video professionals.
	SSIM/PSNR test with the first generation of an HD->SD file as a reference against the 6th generation (3 x SD->HD HD->SD):
	Results without the patch:
	SSIM Y:0.956508 (13.615881) U:0.991601 (20.757750) V:0.993004 (21.551382) All:0.974405 (15.918463)
	PSNR y:31.838009 u:48.424280 v:48.962711 average:34.759466 min:31.699297 max:40.857847
	Results with the patch:
	SSIM Y:0.970051 (15.236232) U:0.991883 (20.905857) V:0.993174 (21.658049) All:0.981290 (17.279202)
	PSNR y:34.412108 u:48.504454 v:48.969496 average:37.264644 min:34.310637 max:42.373392
	Signed-off-by: Thomas Mundt <tmundt75@gmail.com>
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
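
Note on the over-sharpening rule in the 2017-09-15 commit above: the clamp it describes can be sketched for a single 8-bit pixel as below. This is only an illustration, not the code from the commit. The function name lowpass_complex_pixel, the helper clip_u8 and the filter taps (-1, 2, 6, 2, -1)/8 are assumptions; the commit message itself states only the clamping rule.

```c
#include <stdint.h>

/* Keep v within the 8-bit range [0, 255]. */
static int clip_u8(int v)
{
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}

/* Hypothetical helper: filter one pixel from five vertically adjacent
 * samples (two lines above, one above, current, one below, two below). */
static uint8_t lowpass_complex_pixel(uint8_t above2, uint8_t above,
                                     uint8_t cur,
                                     uint8_t below, uint8_t below2)
{
    int src_x  = cur << 1;       /* 2 * current pixel */
    int src_ab = above + below;  /* 2 * average of the pixels above and below */

    /* Assumed taps (-1, 2, 6, 2, -1) / 8; the "+ 4" rounds to nearest. */
    int sum = 4 + ((cur + src_x + src_ab) << 1) - above2 - below2;
    int dst = sum < 0 ? 0 : clip_u8(sum >> 3);

    if (src_ab > src_x) {
        /* Neighbours are brighter than the source: do not undershoot it. */
        if (dst < cur)
            dst = cur;
    } else if (dst > cur) {
        /* Neighbours are not brighter than the source: do not overshoot it. */
        dst = cur;
    }
    return (uint8_t)dst;
}
```

With these assumed taps, a source pixel of 200 between neighbours of 190 (and outer neighbours of 0) would filter to 245 without the clamp. The clamp holds it at 200, which is the behaviour the commit message asks for.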