github.com/FFmpeg/FFmpeg.git

Age	Commit message	Author
2020-07-12	x86/h264_deblock: fix warning about trailing empty parameter	James Almer
	Fixes part of ticket #8771 Signed-off-by: James Almer <jamrial@gmail.com> (cherry picked from commit 2c844c98285ca03d9cc44db920da645cf0376c40)
2020-05-13	pixblockdsp, avdct: Add get_pixels_unaligned	Martin Storsjö
	Use this in vf_spp.c, where the get_pixels operation is done on unaligned source addresses. Hook up the x86 (mmx and sse) versions of get_pixels to this function pointer, as those implementations seem to support unaligned use. This fixes fate-filter-spp on armv7. Signed-off-by: Martin Storsjö <martin@martin.st>
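The idea of an alignment-agnostic `get_pixels` can be sketched in portable C. This is only an illustration, not the code added to pixblockdsp: the function names below are hypothetical, and the assumption is that the operation copies an 8x8 block of 8-bit pixels into a 16-bit block. `memcpy` stands in for an unaligned vector load because it makes no assumption about the source address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Load 8 bytes from any address; memcpy carries no alignment requirement. */
static uint64_t load64_unaligned(const uint8_t *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Hypothetical get_pixels-style helper: widen an 8x8 block of 8-bit
 * pixels into 16-bit coefficients, one 8-byte row load at a time. */
void get_pixels_unaligned_sketch(int16_t *block, const uint8_t *pixels,
                                 ptrdiff_t stride)
{
    for (int row = 0; row < 8; row++) {
        uint64_t v = load64_unaligned(pixels);
        uint8_t bytes[8];
        memcpy(bytes, &v, sizeof(bytes)); /* keeps memory order on any endianness */
        for (int col = 0; col < 8; col++)
            block[row * 8 + col] = bytes[col];
        pixels += stride;
    }
}

/* Reads a block that starts at an odd offset inside a larger buffer. */
void get_pixels_self_check(void)
{
    uint8_t buf[1 + 8 * 16];
    int16_t block[64];
    for (size_t i = 0; i < sizeof(buf); i++)
        buf[i] = (uint8_t)i;
    get_pixels_unaligned_sketch(block, buf + 1, 16);
    assert(block[0] == 1);    /* buf[1] */
    assert(block[7] == 8);    /* buf[8] */
    assert(block[8] == 17);   /* buf[1 + 16] */
    assert(block[63] == 120); /* buf[1 + 7 * 16 + 7] */
}
```

Calling `get_pixels_self_check()` from a test program exercises the misaligned case; a version built on aligned vector loads could fault on the same input on strict-alignment targets.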
2020-03-27	lavc/x86/hevc_add_res: Fix coeff overflow in ADD_RES_SSE_16_32_8	Linjie Fu
	Fix overflow for coeff -32768 in function ADD_RES_SSE_16_32_8 with no performance drop.(SSE2/AVX/AVX2) ./checkasm --test=hevc_add_res --bench Mainline: - hevc_add_res.add_residual [OK] hevc_add_res_32x32_8_sse2: 127.5 hevc_add_res_32x32_8_avx: 127.0 hevc_add_res_32x32_8_avx2: 86.5 Add overflow test case: - hevc_add_res.add_residual [FAILED] After: - hevc_add_res.add_residual [OK] hevc_add_res_32x32_8_sse2: 126.8 hevc_add_res_32x32_8_avx: 128.3 hevc_add_res_32x32_8_avx2: 86.8 Signed-off-by: Xu Guangxin <guangxin.xu@intel.com> Signed-off-by: Linjie Fu <linjie.fu@intel.com> Signed-off-by: Anton Khirnov <anton@khirnov.net>
2020-03-27	lavc/x86/hevc_add_res: Fix overflow in ADD_RES_SSE_8_8	Linjie Fu
	Fix overflow for coeff -32768 in function ADD_RES_SSE_8_8 with no performance drop. ./checkasm --test=hevc_add_res --bench Mainline: - hevc_add_res.add_residual [OK] hevc_add_res_8x8_8_sse2: 15.5 Add overflow test case: - hevc_add_res.add_residual [FAILED] After: - hevc_add_res.add_residual [OK] hevc_add_res_8x8_8_sse2: 15.5 Signed-off-by: Xu Guangxin <guangxin.xu@intel.com> Signed-off-by: Linjie Fu <linjie.fu@intel.com> Signed-off-by: Anton Khirnov <anton@khirnov.net>
2020-03-27	lavc/x86/hevc_add_res: Fix overflow in ADD_RES_MMX_4_8	Linjie Fu
	Fix overflow for coeff -32768 in function ADD_RES_MMX_4_8 with no performance drop. ./checkasm --test=hevc_add_res --bench Mainline: - hevc_add_res.add_residual [OK] hevc_add_res_4x4_8_mmxext: 15.5 Add overflow test case: - hevc_add_res.add_residual [FAILED] After: - hevc_add_res.add_residual [OK] hevc_add_res_4x4_8_mmxext: 15.0 Signed-off-by: Xu Guangxin <guangxin.xu@intel.com> Signed-off-by: Linjie Fu <linjie.fu@intel.com> Signed-off-by: Anton Khirnov <anton@khirnov.net>
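The three commits above do not spell out why -32768 is special, so the following is a sketch under an assumption: that the 8-bit path handled positive and negative residuals separately by negating the coefficient inside 16-bit lanes. In 16-bit two's complement, negating -32768 wraps back to -32768, so the "negative part" saturates to 0 instead of 255. The split scheme and every function name here are hypothetical and are not taken from hevc_add_res.asm.

```c
#include <assert.h>
#include <stdint.h>

static uint8_t clip_uint8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Reference: widen to int before adding, so every int16_t residual is safe. */
uint8_t add_res_ref(uint8_t dst, int16_t res)
{
    return clip_uint8((int)dst + res);
}

/* Negation confined to a 16-bit lane, like a SIMD word subtract 0 - v. */
int16_t negate_wrap16(int16_t v)
{
    uint16_t u = (uint16_t)(0u - (uint16_t)v); /* arithmetic modulo 2^16 */
    return u >= 0x8000u ? (int16_t)((int)u - 65536) : (int16_t)u;
}

/* Hypothetical split scheme: saturate +res and -res to bytes separately,
 * then add one and subtract the other with unsigned saturation. */
uint8_t add_res_split(uint8_t dst, int16_t res)
{
    uint8_t pos = clip_uint8(res);                /* unsigned-saturating pack of res     */
    uint8_t neg = clip_uint8(negate_wrap16(res)); /* unsigned-saturating pack of 0 - res */
    int v = dst + pos;
    if (v > 255)
        v = 255;                                  /* saturating byte add      */
    v -= neg;
    if (v < 0)
        v = 0;                                    /* saturating byte subtract */
    return (uint8_t)v;
}

void add_res_self_check(void)
{
    assert(negate_wrap16(-32768) == -32768);                   /* the wrap itself  */
    assert(add_res_split(200, -100) == add_res_ref(200, -100)); /* ordinary values agree */
    assert(add_res_ref(200, -32768) == 0);                     /* correct result   */
    assert(add_res_split(200, -32768) == 200);                 /* residual is lost */
}
```

Under this assumption the split scheme matches the reference for every residual except -32768, which is the kind of single-value failure the added checkasm test case is described as catching.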
2020-01-31	avcodec/x86/diracdsp: Fix high bits on Windows x86_64	Michael Niedermayer
	Found-by: james
2020-01-30	avcodec/x86/diracdsp: Fix incorrect src addressing in dequant_subband_32()	Michael Niedermayer
	Fixes: Segfault (not reproducible with asm, which made this hard to debug) Fixes: decoding errors Fixes: 19854/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_DIRAC_fuzzer-5729372837511168 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-10-30	vp4: prevent unaligned memory access in loop filter	Peter Ross
	VP4 applies a loop filter during motion compensation, so the block offset will often be unaligned. This produces a bus error on some platforms, namely ARMv7 NEON. This patch adds an unaligned version of the loop filter function pointer to VP3DSPContext. Reported-by: Mike Melanson <mike@multimedia.cx> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-09-12	x85/opusdsp: enable the functions on all FMA3 CPUs	James Almer
	It's not using ymm registers, so limiting it to CPUs with fast AVX is not necessary. Signed-off-by: James Almer <jamrial@gmail.com>
2019-09-12	x86/opusdps: clear the high bits from some gprs	James Almer
	Fixes checkasm on systems like win64. Reviewed-by: Lynne Signed-off-by: James Almer <jamrial@gmail.com>
2019-05-14	avcodec/Makefile: add missing pngdsp dependency to the lscr decoder	James Almer
	Signed-off-by: James Almer <jamrial@gmail.com>
2019-05-03	x86/v210dec: use named registers	James Almer
	Signed-off-by: James Almer <jamrial@gmail.com>
2019-05-03	x86/v210dec: don't reserve more xmm regs than needed	James Almer
	Prevents pointless register saving on win64 for the sse3 and avx versions of the function. Signed-off-by: James Almer <jamrial@gmail.com>
2019-05-03	x86/v210dec: remove duplicate load instruction	James Almer
	Signed-off-by: James Almer <jamrial@gmail.com>
2019-05-02	avcodec/x86/v210: fix operands of vpblendd used in new avx2 code	James Darnley
	Assembly failed when using yasm rather than nasm.
2019-05-02	libavcodec Adding ff_v210_planar_unpack AVX2	Michael Stoner
	Replaced VSHUFPS with VPBLENDD to relieve port 5 bottleneck. AVX2 is 1.4x faster than AVX.
2019-04-27	x86/opusdsp: replace loads with shuffles	Lynne
	Has a slight speedup. Can't be carried over to aarch64, since it has no shufps-like instruction. Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
2019-04-01	x86/opusdsp: fix WIN64 return value	Lynne
	Signed-off-by: James Almer <jamrial@gmail.com>
2019-04-01	x86/opusdsp: implement FMA3 accelerated postfilter and deemphasis	Lynne
	58893 decicycles in deemphasis_c, 130548 runs, 524 skips 9475 decicycles in deemphasis_fma3, 130686 runs, 386 skips -> 6.21x speedup 24866 decicycles in postfilter_c, 65386 runs, 150 skips 5268 decicycles in postfilter_fma3, 65505 runs, 31 skips -> 4.72x speedup Total decoder speedup: ~14% Deemphasis SIMD based on the following unrolling: const float c1 = CELT_EMPH_COEFF, c2 = c1*c1, c3 = c2*c1, c4 = c3*c1; float state = coeff; for (int i = 0; i < len; i += 4) { y[0] = x[0] + c1*state; y[1] = x[1] + c2*state + c1*x[0]; y[2] = x[2] + c3*state + c1*x[1] + c2*x[0]; y[3] = x[3] + c4*state + c1*x[2] + c2*x[1] + c3*x[0]; state = y[3]; y += 4; x += 4; }
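The unrolling quoted in the commit message above is the first-order recurrence y[i] = x[i] + c1*y[i-1] expanded four steps at a time, so that each of the four outputs depends only on the previous state and the current inputs. A minimal C sketch comparing the two forms follows. `EMPH_COEFF` is a placeholder value chosen for illustration (the real code uses `CELT_EMPH_COEFF`), and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder value for illustration; the real code uses CELT_EMPH_COEFF. */
#define EMPH_COEFF 0.85f

/* Scalar form: y[i] = x[i] + c1 * y[i-1], with `state` acting as y[-1]. */
float deemphasis_scalar(float *y, const float *x, float state, size_t len)
{
    for (size_t i = 0; i < len; i++)
        state = y[i] = x[i] + EMPH_COEFF * state;
    return state;
}

/* Four-way unrolling from the commit message; len must be a multiple of 4. */
float deemphasis_unrolled4(float *y, const float *x, float state, size_t len)
{
    const float c1 = EMPH_COEFF, c2 = c1 * c1, c3 = c2 * c1, c4 = c3 * c1;

    for (size_t i = 0; i < len; i += 4) {
        y[0] = x[0] + c1 * state;
        y[1] = x[1] + c2 * state + c1 * x[0];
        y[2] = x[2] + c3 * state + c1 * x[1] + c2 * x[0];
        y[3] = x[3] + c4 * state + c1 * x[2] + c2 * x[1] + c3 * x[0];

        state = y[3];
        y += 4;
        x += 4;
    }
    return state;
}

/* Largest absolute difference between the two forms on a small fixed input. */
float deemphasis_max_diff(void)
{
    float x[16], a[16], b[16], max = 0.0f;

    for (int i = 0; i < 16; i++)
        x[i] = (float)((i * 7) % 5) - 2.0f;
    deemphasis_scalar(a, x, 0.5f, 16);
    deemphasis_unrolled4(b, x, 0.5f, 16);
    for (int i = 0; i < 16; i++) {
        float d = a[i] - b[i];
        if (d < 0.0f)
            d = -d;
        if (d > max)
            max = d;
    }
    return max;
}

void deemphasis_self_check(void)
{
    /* The forms agree up to floating-point rounding, not bit for bit. */
    assert(deemphasis_max_diff() < 1e-4f);
}
```

The unrolled form trades a serial dependency on every sample for one dependency per group of four, which is what makes it a fit for 4-wide SIMD multiply-adds.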
2019-04-01	celt_pvq_init: only build when CONFIG_OPUS_ENCODER is enabled	Lynne
	The entire function was defined away before.
2019-04-01	x86/opus_dsp: rename to celt_pvq	Lynne
	It's only used in the encoder and in CELT's PVQ.
2019-02-20	avcodec/h264dsp: change loop filter stride argument to ptrdiff_t	James Almer

2018-12-02	avcodec/proresdsp indent after prev commit	Martin Vignali

2018-12-02	avcodec/proresdec : rename dsp part for 10b and check dspinit for supported bits per raw sample	Martin Vignali
	based on patch by Kieran Kunhya
2018-05-08	mdct15: simplify x86 exptab permutation	Rostislav Pehlivanov
	Removes an unneeded copy and does the 5-point permute in-place. Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2018-05-08	mdct15: simplify the fft15 x86 SIMD	Rostislav Pehlivanov
	Saves 1 gpr and 2 instructions and simplifies the macros a bit. Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2018-04-02	mpeg4video: Add support for MPEG-4 Simple Studio Profile.	Kieran Kunhya
	This is a profile supporting > 8-bit video and has a higher quality DCT
2018-03-08	sbcenc: add MMX optimizations	Aurelien Jacobs
	This was originally based on libsbc, and was fully integrated into ffmpeg. Rough speed test: C version: speed= 592x MMX version: speed= 785x
2018-02-12	h264_idct: enable unmacro on newer NASM versions	Rostislav Pehlivanov
	Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2018-01-28	avcodec/utvideoenc : add SIMD (avx) for sub_left_prediction	Martin Vignali
	asm code by Henrik Gramner
2018-01-12	avcodec: increase AV_INPUT_BUFFER_PADDING_SIZE to 64	James Almer
	AVX-512 support has been introduced, and even if no functions currently use zmm registers (able to load as much as 64 bytes of consecutive data per instruction), they will be added eventually. Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com> Tested-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: James Almer <jamrial@gmail.com>
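The reasoning above (a single zmm load can touch 64 consecutive bytes) translates into a simple rule for callers: allocate the payload plus the padding and zero the tail. A hedged C sketch follows. `PADDING_SIZE` and `alloc_padded` are hypothetical local names that mirror the new value of `AV_INPUT_BUFFER_PADDING_SIZE`, and the sketch uses plain `malloc` rather than any FFmpeg allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical local stand-in for the new AV_INPUT_BUFFER_PADDING_SIZE. */
#define PADDING_SIZE 64

/* Copy `size` payload bytes into a buffer with PADDING_SIZE zeroed bytes
 * after them, so a 64-byte load starting at the last payload byte still
 * stays inside the allocation. Returns NULL on failure. */
uint8_t *alloc_padded(const uint8_t *src, size_t size)
{
    uint8_t *buf;

    if (size > SIZE_MAX - PADDING_SIZE)
        return NULL;                     /* size + padding would overflow */
    buf = malloc(size + PADDING_SIZE);
    if (!buf)
        return NULL;
    if (size)
        memcpy(buf, src, size);
    memset(buf + size, 0, PADDING_SIZE); /* zeroed tail */
    return buf;
}

void alloc_padded_self_check(void)
{
    const uint8_t src[3] = { 1, 2, 3 };
    uint8_t *buf = alloc_padded(src, sizeof(src));

    assert(buf != NULL);
    assert(buf[0] == 1 && buf[2] == 3);
    for (size_t i = 0; i < PADDING_SIZE; i++)
        assert(buf[sizeof(src) + i] == 0);
    free(buf);
}
```

Code that hard-coded the previous padding value would under-allocate after this change, which is why using the macro rather than a literal matters.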
2017-12-10	x86/lossless_videodsp: rename ff_add_left_pred_int16_sse4 to ff_add_left_pred_int16_unaligned_ssse3	James Almer
	SSSE3_FAST is the proper check for it. Signed-off-by: James Almer <jamrial@gmail.com>
2017-12-10	x86/lossless_videodsp: don't overread the dst buffer in ff_add_left_pred_unaligned_avx2	James Almer
	Fixes valgrind Signed-off-by: James Almer <jamrial@gmail.com>
2017-12-09	avcodec/utvideodec : add SIMD (SSSE3 and AVX2) for gradient_pred	Martin Vignali

2017-12-09	avcodec/x86/lossless_videodsp : add avx2 version for add_left_pred	Martin Vignali

2017-12-09	avcodec/x86/lossless_videodsp.asm : make macro for add_left_pred_unaligned in order to add avx2 version	Martin Vignali

2017-12-02	avcodec/x86/bswapdsp : use macro for 128 bits constants loading in xmm or ymm	Martin Vignali

2017-11-25	avcodec/fft: fix INTERL macro on 3dnow	Mikulas Patocka
	The commit b7c16a3f2c4921f613319938b8ee0e3d6fa83e8d ("x86: fft: Port to cpuflags") breaks the opus decoder in ffmpeg when compiling for 3dnow. The output is audible, but there's a lot of noise. The reason for the breakage is that the commit unintentionally changed the INTERL macro so that it is empty when compiling for 3dnow. This patch fixes it. Signed-off-by: Mikulas Patocka <mikulas@twibright.com> Signed-off-by: James Almer <jamrial@gmail.com>
2017-11-23	avcodec/x86/exrdsp : use ymm constant for pb_80	Martin Vignali
	speed seems to be similar, but simplify code
2017-11-21	x86/utvideodsp: reuse shared constants	James Almer
	Remove the broadcast instructions as well now that they are wide enough. Signed-off-by: James Almer <jamrial@gmail.com>
2017-11-21	x86/constants: make pb_80 32 byte wide	James Almer
	Signed-off-by: James Almer <jamrial@gmail.com>
2017-11-21	avcodec/huffyuvdspenc : add diff_int16 AVX2 func	Martin Vignali

2017-11-21	avcodec/huffyuvdspenc : reorganize diff_int16	Martin Vignali

2017-11-21	avcodec/huffyuvdsp : add add_int16 AVX2 func	Martin Vignali

2017-11-21	avcodec/huffyuvdsp : reorganize add_int16 asm	Martin Vignali

2017-11-21	avcodec/huffyuvdsp(enc) : move duplicate macro to a template file	Martin Vignali

2017-11-21	avcodec/x86/utvideodsp.asm : cosmetic	Martin Vignali
	better func separator and add comment for the restore rgb planes10 declaration
2017-11-21	avcodec/utvideodsp : add avx2 version for the dsp	Martin Vignali

2017-11-21	avcodec/x86/utvideodsp : make macro for func	Martin Vignali

2017-11-21	x86/jpeg2000dsp: add ff_ict_float_{fma3,fma4}	James Almer
	jpeg2000_ict_float_c: 2296.0 jpeg2000_ict_float_sse: 628.0 jpeg2000_ict_float_avx: 317.0 jpeg2000_ict_float_fma3: 262.0 Signed-off-by: James Almer <jamrial@gmail.com>