github.com/FFmpeg/FFmpeg.git

Age	Commit message	Author
2016-10-22	doc: fix spelling errors	Andreas Cadhalpun
	Thanks to Mathieu Malaterre <malat@debian.org> for reporting the Que/Queue typo. (https://bugs.debian.org/839542) Reviewed-by: Lou Logan <lou@lrcd.com> Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2016-10-18	aacenc: add SIMD optimizations for abs_pow34 and quantization	Rostislav Pehlivanov
	Performance improvements:
	quant_bands:
	with: 681 decicycles in quant_bands, 8388453 runs, 155 skips
	without: 1190 decicycles in quant_bands, 8388386 runs, 222 skips
	Around 42% for the function
	Twoloop coder:
	abs_pow34: with/without: 7.82s/8.17s
	Around 4% for the entire encoder
	Both: with/without: 7.15s/8.17s
	Around 12% for the entire encoder
	Fast coder:
	abs_pow34: with/without: 3.40s/3.77s
	Around 10% for the entire encoder
	Both: with/without: 3.02s/3.77s
	Around 20% faster for the entire encoder
	Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com> Tested-by: Michael Niedermayer <michael@niedermayer.cc> Reviewed-by: James Almer <jamrial@gmail.com>
2016-10-02	avcodec: fix arguments on xmm/neon clobber test wrappers	James Almer
	Signed-off-by: James Almer <jamrial@gmail.com>
2016-10-01	avcodec: add missing xmm/neon clobber test wrappers for the new encode API	James Almer
	Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
2016-09-23	x86/h264_weight: use appropriate register size for weight parameters	Hendrik Leppkes
	Fixes trac 5579 Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Acked-by: Michael Niedermayer <michael@niedermayer.cc>
2016-09-23	avcodec/h264: Use ptrdiff_t for (bi)weight functions	Michael Niedermayer
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-08-07	avcodec/ttadsp: cosmetics	James Almer
	Clean some header includes and use the same naming scheme as in ttaencdsp Signed-off-by: James Almer <jamrial@gmail.com>
2016-08-02	x86/ttaenc: add ff_ttaenc_filter_process_{ssse3,sse4}	James Almer
	Signed-off-by: James Almer <jamrial@gmail.com>
2016-07-29	Merge commit '9df889a5f116c1ee78c2f239e0ba599c492431aa'	Clément Bœsch
	* commit '9df889a5f116c1ee78c2f239e0ba599c492431aa': h264: rename h264.[ch] to h264dec.[ch] Merged-by: Clément Bœsch <u@pkh.me>
2016-07-26	vp9: add mmxext versions of the single-block (w=8,npx=8) h/v loopfilters.	Ronald S. Bultje
	Each takes about 0.1% of runtime in my profiles, and they didn't have any SIMD so far (we only had SIMD for npx=16 double-block versions).
2016-07-26	vp9: add mmxext versions of the single-block (w=4,npx=8) h/v loopfilters.	Ronald S. Bultje
	Each takes about 0.5% of runtime in my profiles, and they didn't have any SIMD so far (we only had SIMD for npx=16 double-block versions).
2016-07-26	vp9: add 32x32 idct AVX2 implementation.	Ronald S. Bultje
	About 1.8x speedup compared to AVX version for full IDCT. Other sub-IDCT scenarios also see speedups.
	Full --bench output for idct_32x32_add_{bpp}_${subidct}_${opt} (50k cycles):
	nop: 16.5
	vp9_inv_dct_dct_32x32_add_8_1_c: 2284.4
	vp9_inv_dct_dct_32x32_add_8_1_sse2: 145.0
	vp9_inv_dct_dct_32x32_add_8_1_ssse3: 137.4
	vp9_inv_dct_dct_32x32_add_8_1_avx: 137.1
	vp9_inv_dct_dct_32x32_add_8_1_avx2: 73.2
	vp9_inv_dct_dct_32x32_add_8_2_c: 14680.8
	vp9_inv_dct_dct_32x32_add_8_2_sse2: 2617.2
	vp9_inv_dct_dct_32x32_add_8_2_ssse3: 982.9
	vp9_inv_dct_dct_32x32_add_8_2_avx: 958.5
	vp9_inv_dct_dct_32x32_add_8_2_avx2: 704.2
	vp9_inv_dct_dct_32x32_add_8_4_c: 14443.1
	vp9_inv_dct_dct_32x32_add_8_4_sse2: 2717.1
	vp9_inv_dct_dct_32x32_add_8_4_ssse3: 965.7
	vp9_inv_dct_dct_32x32_add_8_4_avx: 1000.7
	vp9_inv_dct_dct_32x32_add_8_4_avx2: 717.1
	vp9_inv_dct_dct_32x32_add_8_8_c: 14436.4
	vp9_inv_dct_dct_32x32_add_8_8_sse2: 2671.8
	vp9_inv_dct_dct_32x32_add_8_8_ssse3: 1038.5
	vp9_inv_dct_dct_32x32_add_8_8_avx: 983.0
	vp9_inv_dct_dct_32x32_add_8_8_avx2: 729.4
	vp9_inv_dct_dct_32x32_add_8_16_c: 14614.7
	vp9_inv_dct_dct_32x32_add_8_16_sse2: 2701.7
	vp9_inv_dct_dct_32x32_add_8_16_ssse3: 1334.4
	vp9_inv_dct_dct_32x32_add_8_16_avx: 1276.7
	vp9_inv_dct_dct_32x32_add_8_16_avx2: 719.5
	vp9_inv_dct_dct_32x32_add_8_32_c: 14363.6
	vp9_inv_dct_dct_32x32_add_8_32_sse2: 2575.6
	vp9_inv_dct_dct_32x32_add_8_32_ssse3: 2633.9
	vp9_inv_dct_dct_32x32_add_8_32_avx: 2539.6
	vp9_inv_dct_dct_32x32_add_8_32_avx2: 1395.0
2016-07-20	x86/diracdsp: make ff_put_signed_rect_clamped_10_sse4 work on x86_32	James Almer
	Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
2016-07-12	diracdsp_init: add missing ARCH_X86_64 check	Rostislav Pehlivanov
	That SIMD is still x86_64 only for now. Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2016-07-12	diracdsp: add SIMD for the 10 bit version of put_signed_rect_clamped	Rostislav Pehlivanov
	Signed-off-by: Rostislav Pehlivanov <rpehlivanov@obe.tv>
2016-07-12	diracdsp: add dequantization SIMD	Rostislav Pehlivanov
	Currently unused, to be used in the following commits. Signed-off-by: Rostislav Pehlivanov <rpehlivanov@obe.tv>
2016-07-11	vp9: add 16x16 idct avx2 (8-bit).	Ronald S. Bultje
	checkasm --bench, 10k runs, for *_add_${bpc}_${sub_idct}_${opt}, shows that it's about 1.65x as fast as the AVX version for the full IDCT, and similar speedups for the sub-IDCTs:
	nop: 24.6
	vp9_inv_dct_dct_16x16_add_8_1_c: 6444.8
	vp9_inv_dct_dct_16x16_add_8_1_sse2: 638.6
	vp9_inv_dct_dct_16x16_add_8_1_ssse3: 484.4
	vp9_inv_dct_dct_16x16_add_8_1_avx: 661.2
	vp9_inv_dct_dct_16x16_add_8_1_avx2: 311.5
	vp9_inv_dct_dct_16x16_add_8_2_c: 6665.7
	vp9_inv_dct_dct_16x16_add_8_2_sse2: 646.9
	vp9_inv_dct_dct_16x16_add_8_2_ssse3: 455.2
	vp9_inv_dct_dct_16x16_add_8_2_avx: 521.9
	vp9_inv_dct_dct_16x16_add_8_2_avx2: 304.3
	vp9_inv_dct_dct_16x16_add_8_4_c: 7022.7
	vp9_inv_dct_dct_16x16_add_8_4_sse2: 647.4
	vp9_inv_dct_dct_16x16_add_8_4_ssse3: 467.1
	vp9_inv_dct_dct_16x16_add_8_4_avx: 446.1
	vp9_inv_dct_dct_16x16_add_8_4_avx2: 297.0
	vp9_inv_dct_dct_16x16_add_8_8_c: 6800.4
	vp9_inv_dct_dct_16x16_add_8_8_sse2: 598.6
	vp9_inv_dct_dct_16x16_add_8_8_ssse3: 465.7
	vp9_inv_dct_dct_16x16_add_8_8_avx: 440.9
	vp9_inv_dct_dct_16x16_add_8_8_avx2: 290.2
	vp9_inv_dct_dct_16x16_add_8_16_c: 6626.6
	vp9_inv_dct_dct_16x16_add_8_16_sse2: 599.5
	vp9_inv_dct_dct_16x16_add_8_16_ssse3: 475.0
	vp9_inv_dct_dct_16x16_add_8_16_avx: 469.9
	vp9_inv_dct_dct_16x16_add_8_16_avx2: 286.4
2016-07-09	Merge commit 'f1a9eee41c4b5ea35db9ff0088ce4e6f1e187f2c'	Clément Bœsch
	* commit 'f1a9eee41c4b5ea35db9ff0088ce4e6f1e187f2c': x86: Add missing movsxd for the int stride parameter Merged-by: Clément Bœsch <u@pkh.me>
2016-07-05	x86/dcadsp: optimize lfe_fir0_float_fma3 on x86_32	James Almer
	About 10% faster. Signed-off-by: James Almer <jamrial@gmail.com>
2016-07-04	avcodec: add missing xmm/neon clobber test wrappers for the new decode API	James Almer
	Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
2016-06-27	asm: FF_-prefix internal macros used in inline assembly	Matthieu Bouron
	See merge commit '39d6d3618d48625decaff7d9bdbb45b44ef2a805'.
2016-06-26	Merge commit 'dc40a70c5755bccfb1a1349639943e1f408bea50'	Hendrik Leppkes
	* commit 'dc40a70c5755bccfb1a1349639943e1f408bea50': Drop unnecessary libavutil/x86/asm.h #includes Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-06-22	Merge commit 'a6a750c7ef240b72ce01e9653343a0ddf247d196'	Clément Bœsch
	* commit 'a6a750c7ef240b72ce01e9653343a0ddf247d196': tests: Move all test programs to a subdirectory Merged-by: Clément Bœsch <clement@stupeflix.com>
2016-06-21	Merge commit '41ed7ab45fc693f7d7fc35664c0233f4c32d69bb'	Clément Bœsch
	* commit '41ed7ab45fc693f7d7fc35664c0233f4c32d69bb': cosmetics: Fix spelling mistakes Merged-by: Clément Bœsch <u@pkh.me>
2016-06-21	h264: rename h264.[ch] to h264dec.[ch]	Anton Khirnov
	This is more consistent with the naming of other decoders.
2016-06-17	x86: Add missing movsxd for the int stride parameter	Martin Storsjö
	Signed-off-by: Martin Storsjö <martin@martin.st>
2016-06-14	x86/aacpsdsp: optimize add_squares loop	James Almer
	Signed-off-by: James Almer <jamrial@gmail.com>
2016-06-08	x86/aacdec: use HADDPS macro	James Almer
	Signed-off-by: James Almer <jamrial@gmail.com>
2016-05-28	Drop unnecessary libavutil/x86/asm.h #includes	Diego Biurrun

2016-05-28	asm: FF_-prefix internal macros used in inline assembly	Diego Biurrun
	These macros conflict with system macros on Solaris, producing truckloads of warnings about macro redefinition.
2016-05-13	tests: Move all test programs to a subdirectory	Diego Biurrun

2016-05-08	x86: lossless audio: SSE4 madd 32bits	Christophe Gisquet
	The only user so far is wmalossless 24-bit. The few samples tested show an order of 8, so more unrolling or an AVX2 version does not make sense. Timings: 68 -> 49 cycles Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-05-04	cosmetics: Fix spelling mistakes	Vittorio Giovara
	Signed-off-by: Diego Biurrun <diego@biurrun.de>
2016-04-12	Merge commit '73ff983e8dd22ccee166403d0bbbc9c1cd543622'	Derek Buitenhuis
	* commit '73ff983e8dd22ccee166403d0bbbc9c1cd543622': fft: x86: cosmetics: Drop silly comments, add comment, whitespace Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2016-04-07	build: miscellaneous cosmetics	Diego Biurrun
	Restore alphabetical order in lists, break overly long lines, do some prettyprinting, add some explanatory section comments, group parts together that belong together logically.
2016-03-04	avcodec/fft: Add revtab32 for FFTs with more than 65536 samples	Michael Niedermayer
	x86 optimizations are used only for the cases they support (<=65536 samples) Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-03-04	avcodec: Extend fft to size 2^17	Michael Niedermayer
	Asked-for-by: durandal_1707 Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-03-01	fft: Split MDCT bits off from FFT	Diego Biurrun

2016-02-29	x86/vc1dsp: Split the file into MC and loopfilter	Timothy Gu

2016-02-26	fft: x86: cosmetics: Drop silly comments, add comment, whitespace	Diego Biurrun

2016-02-24	Merge commit '15a24614aef5836af3cd2c7cc3b2b737eee6bf3c'	Derek Buitenhuis
	* commit '15a24614aef5836af3cd2c7cc3b2b737eee6bf3c': build: Add vc1dsp component for more fine-grained dependencies Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2016-02-23	x86: hevc: Fix linking with both yasm and optimizations disabled	Diego Biurrun
	Some optimized functions reference optimized symbols, so the functions must be explicitly disabled when those symbols are unavailable.
2016-02-23	x86/dcadec: add ff_lfe_fir1_float_{sse3,avx}	James Almer
	Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
2016-02-19	build: Add vc1dsp component for more fine-grained dependencies	Diego Biurrun

2016-02-16	Merge commit 'e280fe13291e9c712a5f4aa13b5263f3e8afed45'	Derek Buitenhuis
	* commit 'e280fe13291e9c712a5f4aa13b5263f3e8afed45': v210: Use separate sample_factors Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2016-02-16	Merge commit 'eafb05fcf37cd19a910ca3b17824384f9006bc0a'	Derek Buitenhuis
	* commit 'eafb05fcf37cd19a910ca3b17824384f9006bc0a': v210: x86: Add the correct guards around the asm code Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2016-02-15	x86: use the new helper macros where useful	James Almer
	Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: James Almer <jamrial@gmail.com>
2016-02-14	x86/vc1dsp: Port vc1_*_hor_16b_shift2 to NASM format	Timothy Gu
	Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
2016-02-07	huffyuvencdsp: Undefine "i" macro after each use	Timothy Gu

2016-02-06	x86/dcadec: add ff_lfe_fir0_float_{sse,sse2,avx,fma3}	James Almer
	Up to ~4 times faster on x86_64, ~8 times on x86_32 if compiling using x87 fp math. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>