github.com/FFmpeg/FFmpeg.git

Age	Commit message	Author
2017-11-14	avcodec/x86/mpegvideodsp: Fix signedness bug in need_emu	Michael Niedermayer
	Fixes: out of array read Fixes: 3516/attachment-311488.dat Found-by: Insu Yun, Georgia Tech. Tested-by: wuninsu@gmail.com Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
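The log entry does not include the patch itself, so what follows is only a sketch of the general bug class it names: an unsigned comparison against a signed difference that can go negative. The function names, the single-axis simplification, and the convention that a block reads columns ix through ix + w are all assumptions made for illustration, not the actual FFmpeg code.

```c
#include <assert.h>

/* Buggy form: width - w is a signed int. For the comparison it is
 * converted to unsigned, so a negative difference wraps to a huge
 * value and the out-of-range case is silently missed. */
static int need_emu_buggy(int ix, int width, int w)
{
    return (unsigned)ix >= width - w;
}

/* Checked form: reject width < w explicitly before relying on the
 * unsigned comparison (which also catches negative ix). */
static int need_emu_checked(int ix, int width, int w)
{
    return width < w || (unsigned)ix >= (unsigned)(width - w);
}

static void example_usage(void)
{
    /* Picture narrower than the block: the buggy check reports "no
     * edge emulation needed", the checked one reports that it is. */
    assert(need_emu_buggy(0, 4, 8) == 0);
    assert(need_emu_checked(0, 4, 8) == 1);
}
```

With a picture wider than the block, both forms agree, which is why this kind of bug tends to surface only on unusual input sizes.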
2017-11-13	Fix missing used attribute for inline assembly variables	Thomas Köppe
	Variables used in inline assembly need to be marked with attribute((used)). Static constants already were, via the define of DECLARE_ASM_CONST. But DECLARE_ALIGNED does not add this attribute, and some of the variables defined with it are const only used in inline assembly, and therefore appeared dead. This change adds a macro DECLARE_ASM_ALIGNED that marks variables as used. This change makes FFMPEG work with Clang's ThinLTO. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
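As a rough illustration of the idea, a macro of this kind could combine an alignment attribute with a "used" attribute so that a constant referenced only from inline assembly is not discarded by the optimizer or linker. The macro name SKETCH_ASM_ALIGNED and the table below are hypothetical, and the attribute spelling assumes a GCC- or Clang-compatible compiler; the real DECLARE_ASM_ALIGNED definition may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro (GCC/Clang attribute syntax): declare variable v of
 * type t, aligned to n bytes, and marked "used" so it is kept even when
 * the compiler sees no C-level reference to it. */
#define SKETCH_ASM_ALIGNED(n, t, v) \
    t __attribute__((used)) __attribute__((aligned(n))) v

/* A constant that, in real code, would be referenced only from inline asm. */
SKETCH_ASM_ALIGNED(16, const uint16_t, sketch_pw_1)[8] = { 1, 1, 1, 1, 1, 1, 1, 1 };

static void example_usage(void)
{
    /* The alignment request is honored for the definition above. */
    assert(((uintptr_t)sketch_pw_1 % 16) == 0);
}
```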
2017-11-07	libavcodec/lossless_video_dsp : cosmetic add better separator for each function, in order to make reading of the asm file easier	Martin Vignali
2017-11-07	libavcodec/lossless_videodsp : add add_bytes avx2 version	Martin Vignali

2017-10-30	x86/bswapdsp: add missing preprocessor wrappers for AVX2 functions	James Almer
	Fixes build with old nasm/yasm. Signed-off-by: James Almer <jamrial@gmail.com>
2017-10-29	libavcodec/bswapdsp : add AVX2 func for bswap_buf (swap uint32_t)	Martin Vignali
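For orientation, a plain C reference for what a 32-bit buffer byte swap computes might look like the sketch below. The function names and the int length parameter are assumptions for illustration; the AVX2 routine added by this commit is not shown in the log.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of one 32-bit word. */
static uint32_t bswap32_sketch(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* Byte-swap len 32-bit words from src into dst. */
static void bswap_buf_sketch(uint32_t *dst, const uint32_t *src, int len)
{
    assert(len >= 0);
    for (int i = 0; i < len; i++)
        dst[i] = bswap32_sketch(src[i]);
}
```

A SIMD version does the same work on several words per instruction, typically with a byte shuffle, so a scalar loop like this is what its output would be checked against.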

2017-10-21	Merge commit '681a86aba6cb09b98ad716d986182060c7795d20'	James Almer
	* commit '681a86aba6cb09b98ad716d986182060c7795d20': x86: fft: Port to cpuflags Merged-by: James Almer <jamrial@gmail.com>
2017-10-21	Merge commit 'e9bb77fb1012cba1951a82136df7071f71bce8fb'	James Almer
	* commit 'e9bb77fb1012cba1951a82136df7071f71bce8fb': x86: h264: Simplify DEQUANT macro with cpuflags Merged-by: James Almer <jamrial@gmail.com>
2017-10-21	Merge commit '307eb1a8ee363db1fcf869e427a8deb6d9538881'	James Almer
	* commit '307eb1a8ee363db1fcf869e427a8deb6d9538881': x86: vp8dsp: port FILTER_BILINEAR macro to cpuflags Merged-by: James Almer <jamrial@gmail.com>
2017-10-21	Merge commit '994c4bc10751e39c7ed9f67ffd0c0dea5223daf2'	James Almer
	* commit '994c4bc10751e39c7ed9f67ffd0c0dea5223daf2': x86util: Port all macros to cpuflags See d5f8a642f6eb1c6e305c41dabddd0fd36ffb3f77 Merged-by: James Almer <jamrial@gmail.com>
2017-10-12	Merge commit '6eef263aca281fb582e1fa3d841ac20ef747a252'	James Almer
	* commit '6eef263aca281fb582e1fa3d841ac20ef747a252': x86: Merge align directives into SECTION_RODATA declarations where possible Merged-by: James Almer <jamrial@gmail.com>
2017-10-05	x86/blockdsp: use three operand form for an instruction	James Almer
	Fixes assembling with old yasm.
2017-10-05	avcodec/x86/lossless_videoencdsp: Fix warning: signed dword value exceeds bounds	Michael Niedermayer
	Add () to regsize define Suggested-by: Henrik Gramner <henrik@gramner.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
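The actual define lives in a NASM/yasm source file and is not quoted in the log. As an analogy only, the C preprocessor shows the same hazard: a macro body that is an unparenthesized expression can bind differently once it is substituted into a larger expression. All names and values below are hypothetical.

```c
#include <assert.h>

#define SKETCH_MMSIZE         32                    /* hypothetical vector width in bytes */
#define SKETCH_REGSIZE_UNSAFE SKETCH_MMSIZE - 16    /* body without parentheses */
#define SKETCH_REGSIZE_SAFE   (SKETCH_MMSIZE - 16)  /* body with parentheses */

/* Expands to 2 * 32 - 16, which is 48. */
static int twice_unsafe(void) { return 2 * SKETCH_REGSIZE_UNSAFE; }

/* Expands to 2 * (32 - 16), which is 32. */
static int twice_safe(void)   { return 2 * SKETCH_REGSIZE_SAFE; }

static void example_usage(void)
{
    assert(twice_unsafe() == 48);
    assert(twice_safe() == 32);
}
```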
2017-10-05	avcodec/x86/lossless_videoencdsp: Fix handling of small widths	Michael Niedermayer
	Fixes out of array access Fixes: crash-huf.avi Regression since: 6b41b4414934cc930468ccd5db598dd6ef643987 This could also be fixed by adding checks in the C code that calls the dsp Found-by: Zhibin Hu and 连一汉 <lianyihan@360.cn> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-04	libavcodec/blockdsp : add AVX version	Martin Vignali
	Also modify the required alignment, to 32 instead of 16 for several codecs Signed-off-by: James Almer <jamrial@gmail.com>
2017-10-01	libavcodec/exr : add x86 SIMD for predictor	Martin Vignali
	Signed-off-by: James Almer <jamrial@gmail.com>
2017-09-27	Merge commit '7abdd026df6a9a52d07d8174505b33cc89db7bf6'	James Almer
	* commit '7abdd026df6a9a52d07d8174505b33cc89db7bf6': asm: Consistently uppercase SECTION markers Merged-by: James Almer <jamrial@gmail.com>
2017-09-26	Merge commit 'fd9212f2edfe9b107c3c08ba2df5fd2cba5ab9e3'	James Almer
	* commit 'fd9212f2edfe9b107c3c08ba2df5fd2cba5ab9e3': Mark some arrays that never change as const. Merged-by: James Almer <jamrial@gmail.com>
2017-09-19	x86/exrdsp: optimize ff_reorder_pixels_avx2()	Henrik Gramner
	Tested with "checkasm --test=exrdsp -bench"
	Before:
	reorder_pixels_c: 5187.8
	reorder_pixels_sse2: 377.0
	reorder_pixels_avx2: 331.3
	After:
	reorder_pixels_c: 5181.5
	reorder_pixels_sse2: 377.0
	reorder_pixels_avx2: 313.8
	Signed-off-by: James Almer <jamrial@gmail.com>
2017-09-18	avcodec/exrdsp: improve the ExrDSPContext->reorder_pixels prototype	James Almer
	Make dst be the first parameter and src const. It's more in line with the rest of the codebase. Signed-off-by: James Almer <jamrial@gmail.com>
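A scalar function following the dst-first, const-src convention described here could look like the sketch below. It assumes the routine interleaves the first and second halves of src into dst; the function name and the ptrdiff_t size parameter are illustrative, not taken from the log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* dst comes first and src is const, matching the convention the commit
 * describes. Assumed behavior: interleave the two halves of src. */
static void reorder_pixels_sketch(uint8_t *dst, const uint8_t *src, ptrdiff_t size)
{
    ptrdiff_t half = size / 2;

    assert(size >= 0 && size % 2 == 0);
    for (ptrdiff_t i = 0; i < half; i++) {
        dst[2 * i]     = src[i];        /* byte from the first half  */
        dst[2 * i + 1] = src[half + i]; /* byte from the second half */
    }
}
```

Marking src as const lets callers pass read-only buffers and documents at the call site which argument is written.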
2017-09-17	libavcodec/exr : add X86 SIMD for reorder_pixels	Martin Vignali
	Signed-off-by: James Almer <jamrial@gmail.com>
2017-08-22	avcodec/me_cmp: Fix crashes on ARM due to misalignment	Michael Niedermayer
	Adds a diff_pixels_unaligned() Fixes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=872503 Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
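To illustrate why an unaligned variant can be safe where an optimized one is not, here is a sketch of an 8x8 pixel difference that reads its inputs one byte at a time and therefore places no alignment requirement on s1 or s2. The name, signature, and 8x8 block size are assumptions for illustration, not the code added by the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compute block[y*8 + x] = s1[y*stride + x] - s2[y*stride + x] for an
 * 8x8 block. Byte-wise loads work for any pointer value, unlike wide
 * vector or word loads that may require aligned addresses. */
static void diff_pixels_unaligned_sketch(int16_t *block, const uint8_t *s1,
                                         const uint8_t *s2, ptrdiff_t stride)
{
    assert(stride >= 8);
    for (int y = 0; y < 8; y++)
        for (int x = 0; x < 8; x++)
            block[y * 8 + x] = (int16_t)(s1[y * stride + x] - s2[y * stride + x]);
}
```

Calling it with pointers offset by one byte from the start of a buffer, as an encoder might for an odd motion offset, is well defined.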
2017-08-20	opus_pvq_search: Restore the proper use of conditional define and simplify the function name suffix handling.	Ivan Kalvachev
	Using named define properly documents the code paths. It also avoids passing additional numbered arguments through multiple levels of macro templates. The suffix handling is done by concatenation, like in other asm functions and avoid having two separate "cglobal" defines. Signed-off-by: Ivan Kalvachev <ikalvachev@gmail.com>
2017-08-18	opus_pvq_search: split functions into exactness and only use the exact if its faster	Rostislav Pehlivanov
	This splits the asm function into exact and non-exact version. The exact version is as fast or faster on newer CPUs (which EXTERNAL_AVX_FAST describes well) whilst the non-exact version is faster than the exact on older CPUs. Also fixes yasm compilation which doesn't accept !cpuflags(avx) syntax. Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2017-08-18	opus_pvq_search: only use rsqrtps approximation on CPUs with avx	Rostislav Pehlivanov
	Makes the search produce identical results with the C version. Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2017-08-18	opus_pvq_search: remove dead macro	Rostislav Pehlivanov
	There's no point in toggling it, even for debugging. It's just worse. Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2017-08-18	SIMD opus pvq_search implementation	Ivan Kalvachev
	Explanation on the workings and methods used by the Pyramid Vector Quantization Search function could be found in the following Work-In-Progress mail threads: http://ffmpeg.org/pipermail/ffmpeg-devel/2017-June/212146.html http://ffmpeg.org/pipermail/ffmpeg-devel/2017-June/212816.html http://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213030.html http://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213436.html Signed-off-by: Ivan Kalvachev <ikalvachev@gmail.com>
2017-07-30	mdct15: add inverse transform postrotation SIMD	Rostislav Pehlivanov
	2.5ms frames:
	Before (c): 2638 decicycles in postrotate, 2097040 runs, 112 skips
	After (sse3): 1467 decicycles in postrotate, 2097083 runs, 69 skips
	After (avx2): 1244 decicycles in postrotate, 2097085 runs, 67 skips
	5ms frames:
	Before (c): 4987 decicycles in postrotate, 1048371 runs, 205 skips
	After (sse3): 2644 decicycles in postrotate, 1048509 runs, 67 skips
	After (avx2): 2031 decicycles in postrotate, 1048523 runs, 53 skips
	10ms frames:
	Before (c): 9153 decicycles in postrotate, 523575 runs, 713 skips
	After (sse3): 5110 decicycles in postrotate, 523726 runs, 562 skips
	After (avx2): 3738 decicycles in postrotate, 524223 runs, 65 skips
	20ms frames:
	Before (c): 17857 decicycles in postrotate, 261866 runs, 278 skips
	After (sse3): 10041 decicycles in postrotate, 261746 runs, 398 skips
	After (avx2): 7050 decicycles in postrotate, 262116 runs, 28 skips
	Improves total decoding performance for real world content by 9% with avx2.
	Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2017-07-21	avcodec/x86/cavsdsp: Delete #include "libavcodec/x86/idctdsp.h".	Wan-Teh Chang
	This file already has #include "idctdsp.h", which is resolved to the idctdsp.h header in the directory where this file resides by compilers. Two other files in this directory, libavcodec/x86/idctdsp_init.c and libavcodec/x86/xvididct_init.c, also rely on #include "idctdsp.h" working this way. Signed-off-by: Wan-Teh Chang <wtc@google.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-07-05	Revert "x86/sbrdsp: remove unnecessary sign extend instruction in apply_noise_main"	James Almer
	This reverts commit 24bb7db4037876c5722b0eecf7412502e7225634. noise has to after all be sign extended, not zero extended, on tests other than checkasm. Fixes most aac tests broken by the now reverted commit.
2017-07-05	x86/sbrdsp: remove unnecessary sign extend instruction in apply_noise_main	James Almer
	noise needs to be zero extended and it can be done implicitly as a side effect in a subsequent instruction. Signed-off-by: James Almer <jamrial@gmail.com>
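The difference between sign extension and zero extension only shows up for negative inputs, which may be why a test that feeds only non-negative values would not notice a swap between the two. The sketch below uses C integer conversions to mimic what the two kinds of widening instruction do; the function names are hypothetical and this is not the sbrdsp code.

```c
#include <assert.h>
#include <stdint.h>

/* Sign extension: the 32-bit value keeps its sign when widened to 64 bits. */
static int64_t widen_signed(int32_t v)
{
    return (int64_t)v;
}

/* Zero extension: the upper 32 bits are cleared, so a negative input
 * turns into a large positive number. */
static int64_t widen_zero(int32_t v)
{
    return (int64_t)(uint32_t)v;
}

static void example_usage(void)
{
    /* Non-negative inputs: both widenings agree. */
    assert(widen_signed(5) == widen_zero(5));
    /* Negative inputs: they differ. */
    assert(widen_signed(-1) == -1);
    assert(widen_zero(-1) == INT64_C(4294967295));
}
```

When such a value is used as an array index or pointer offset, the zero-extended form of a negative number points far outside the intended buffer.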
2017-07-05	x86/sbrdsp: zero extend m_max in apply_noise_main	James Almer
	Tested-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: James Almer <jamrial@gmail.com>
2017-07-05	x86/utvideodsp: make restore_rgb_planes functions work on x86_32	James Almer
	Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
2017-06-30	x86/sbrdsp: sign extend start and end gprs in ff_sbr_hf_gen_sse	James Almer
	Tested-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: James Almer <jamrial@gmail.com>
2017-06-28	avcodec/x86: use new x86-64 functions for -idct simple	James Darnley
	They now match according to FATE, barring any further bugs with untested parts
2017-06-28	avcodec/x86: add an 8-bit simple IDCT function based on the x86-64 high depth functions	James Darnley
	Includes add/put functions Rounding contributed by Ronald S. Bultje
2017-06-28	avcodec/x86: allow future 8-bit simple idct to have "DC only hack"	James Darnley
	Created by Ronald S. Bultje
2017-06-28	lavc/aacpsdsp: use ptrdiff_t for stride in hybrid_analysis	Clément Bœsch

2017-06-28	avcodec/x86/vp9dsp_init_16bpp: Fix linking to missing ff_vp9_ipred_dr_32x32_16_avx2() on 32bit	Michael Niedermayer
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-06-27	avcodec/vp9: add 64-bit ipred_dr_32x32_16 avx2 implementation	Ilia Valiakhmetov
	vp9_diag_downright_32x32_12bpp_c: 429.7
	vp9_diag_downright_32x32_12bpp_sse2: 158.9
	vp9_diag_downright_32x32_12bpp_ssse3: 144.6
	vp9_diag_downright_32x32_12bpp_avx: 141.0
	vp9_diag_downright_32x32_12bpp_avx2: 73.8
	Almost 50% faster than avx implementation
	Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2017-06-27	avcodec/utvideodec: add SIMD for restore_rgb_planes	Paul B Mahol
	Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-06-26	lavc/x86: clear r2 higher bits in ff_sbr_sum_square	Matthieu Bouron
	Suggested-by: James Almer <jamrial@gmail.com>
2017-06-24	x86/mdct15: use three operand form for some instructions	James Almer
	Fixes compilation with old yasm
2017-06-24	mdct15: add assembly optimizations for the 15-point FFT	Rostislav Pehlivanov
	c: 1802 decicycles in fft15, 16774635 runs, 2581 skips
	avx: 865 decicycles in fft15, 16776378 runs, 838 skips
	Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2017-06-21	build: Generalize yasm/nasm-related variable names	Diego Biurrun
	None of them are specific to the YASM assembler. (Cherry-picked from libav commit 39e208f4d4756367c7cd2d581847e0c1b8a429c1) Signed-off-by: James Almer <jamrial@gmail.com>
2017-06-20	avcodec/x86: allow future 8-bit simple idct to use slightly different coefficients	James Darnley
2017-06-20	avcodec/x86: modify simple_idct10 macros to add an action parameter	James Darnley

2017-06-20	avcodec/x86: cleanup simple_idct10	James Darnley
	Use named arguments for the functions so we can remove a define. The stride/linesize argument is now ptrdiff_t type so we no longer need to sign extend the register.
2017-06-20	avcodec/x86/mpegenc: support transpose permutation type	James Darnley

2017-06-20	avcodec/x86/mpegenc: check IDCT permutation type is a valid value	James Darnley