github.com/FFmpeg/FFmpeg.git
Age | Commit message | Author
2017-06-21 | build: Generalize yasm/nasm-related variable names | Diego Biurrun
None of them are specific to the YASM assembler.
(Cherry-picked from libav commit 39e208f4d4756367c7cd2d581847e0c1b8a429c1)
Signed-off-by: James Almer <jamrial@gmail.com>
2017-05-15 | avcodec/h264: add sse2 versions of previous idct functions | James Darnley
Kaby Lake Pentium:
- ff_h264_idct_add_8_sse2: ~1.18x faster than mmxext
- ff_h264_idct_dc_add_8_sse2: ~1.07x faster than mmxext
2017-05-15 | avcodec/h264: add avx 8-bit h264_idct_dc_add | James Darnley
Haswell:
- 1.02x faster (405±0.7 vs. 397±0.8 decicycles) compared with mmxext
Skylake-U:
- 1.06x faster (498±1.8 vs. 470±1.3 decicycles) compared with mmxext
2017-05-15 | avcodec/h264: add avx 8-bit h264_idct_add | James Darnley
Haswell:
- 1.11x faster (522±0.4 vs. 469±1.8 decicycles) compared with mmxext
Skylake-U:
- 1.21x faster (671±5.5 vs. 555±1.4 decicycles) compared with mmxext
2017-02-27 | avcodec/h264: enable sse2 chroma deblock/loop filter functions | James Darnley
Between 1.00 and 1.16 times faster on Intel Yorkfield Core 2 Quad.
Between 1.11 and 1.39 times faster on Intel Kaby Lake Pentium.
2017-02-27 | avcodec/h264: add avx 8-bit 4:2:2 chroma h intra deblock/loop filter | James Darnley
~1.37x faster (147 vs. 108 cycles) compared to mmxext function
2017-02-27 | avcodec/h264: add avx 8-bit 4:2:0 chroma h intra deblock/loop filter | James Darnley
~1.10x faster (69 vs. 63 cycles) compared to mmxext function
2017-02-27 | avcodec/h264: add avx 8-bit chroma v intra deblock/loop filter | James Darnley
~1.14x faster (90 vs 78 cycles) compared with mmxext
2017-02-27 | avcodec/h264: add avx 8-bit 4:2:2 chroma h deblock/loop filter | James Darnley
~1.21x faster (68 vs. 56 cycles) compared with mmxext function
2017-02-27 | avcodec/h264: add avx 8-bit 4:2:0 chroma h deblock/loop filter | James Darnley
~1.14x faster (93 vs. 81 cycles) compared with mmxext function
2017-02-27 | avcodec/h264: add avx 8-bit chroma v deblock/loop filter | James Darnley
~1.24x faster (101 vs. 81 cycles) compared with mmxext function
2017-02-18 | avcodec/h264: sse2, avx h luma mbaff deblock/loop filter | James Darnley
x86-64 only
Yorkfield:
- sse2: ~2.17x (434 vs. 200 cycles)
Nehalem:
- sse2: ~2.94x (409 vs. 139 cycles)
Skylake:
- sse2: ~3.10x (370 vs. 119 cycles)
- avx: ~3.29x (370 vs. 112 cycles)
2016-12-07 | avcodec/h264: mmx2, sse2, avx 10-bit 4:2:2 h chroma deblock/loop filter | James Darnley
Yorkfield:
- mmx2: 2.53x (504 vs. 199 cycles)
- sse2: 3.83x (504 vs. 131 cycles)
Nehalem:
- mmx2: 2.42x (365 vs. 151 cycles)
- sse2: 3.56x (365 vs. 103 cycles)
Skylake:
- mmx2: 1.81x (308 vs. 170 cycles)
- sse2: 2.84x (308 vs. 108 cycles)
- avx: 2.93x (308 vs. 105 cycles)
2016-12-07 | avcodec/h264: mmx2, sse2, avx 10-bit h chroma deblock/loop filter | James Darnley
Yorkfield:
- mmx2: 2.45x (279 vs. 114 cycles)
- sse2: 3.36x (279 vs. 83 cycles)
Nehalem:
- mmx2: 2.10x (192 vs. 92 cycles)
- sse2: 2.84x (192 vs. 68 cycles)
Skylake:
- mmx2: 1.75x (170 vs. 97 cycles)
- sse2: 2.47x (170 vs. 69 cycles)
- avx: 2.47x (170 vs. 69 cycles)
2016-12-07 | whitespace changes after last commit | James Darnley
2016-12-07 | avcodec/h264: clean up and expand x86 function definitions | James Darnley
2016-12-01 | avcodec/h264: sse2 and avx 4:2:2 idct add8 10-bit functions | James Darnley
Yorkfield:
- sse2:
  - complex: 4.13x faster (1514 vs. 367 cycles)
  - simple: 4.38x faster (1836 vs. 419 cycles)
Skylake:
- sse2:
  - complex: 3.61x faster (936 vs. 260 cycles)
  - simple: 3.97x faster (1126 vs. 284 cycles)
- avx (versus sse2):
  - complex: 1.07x faster (260 vs. 244 cycles)
  - simple: 1.03x faster (284 vs. 274 cycles)
2016-12-01 | avcodec/h264: mmx 4:2:2 idct add8 function | James Darnley
2.87 times faster (1830 vs. 638 cycles)
2016-12-01 | avcodec/h264: mmxext 4:2:2 chroma intra deblock/loop filter | James Darnley
2.1 times faster (401 vs. 194 cycles)
2016-09-23 | avcodec/h264: Use ptrdiff_t for (bi)weight functions | Michael Niedermayer
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-02-05 | avcodec/h264: mmxext 4:2:2 chroma deblock/loop filter | James Darnley
2.6 times faster (366 vs. 142 cycles)
2014-06-26 | Merge commit '5ab03e41e553452118113d0c224fa32b325e45e5' | Michael Niedermayer
* commit '5ab03e41e553452118113d0c224fa32b325e45e5':
  x86: h264dsp: Fix link failure with optimizations disabled
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-26 | x86: h264dsp: Fix link failure with optimizations disabled | Diego Biurrun
With optimizations disabled, compilers have trouble doing dead code elimination on 'if (foo && 0)' expressions, while 'if (0 && foo)' still works, so use the latter to avoid problems.
Bug-Id: 707
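The ordering issue described in this commit message can be illustrated with a minimal C sketch. All names here (`HAVE_EXTERNAL_ASM`, `CPU_FLAG_FAST`, `idct_add_asm`, `select_idct`) are hypothetical and are not taken from the FFmpeg sources. The sketch assumes a compiler that, as the message describes, discards a branch whose condition starts with a constant 0 even when optimizations are off; the C standard does not guarantee this.

```c
/* Hypothetical build-time switch: 0 means the assembly object is not built. */
#define HAVE_EXTERNAL_ASM 0

/* Hypothetical runtime CPU flag bit. */
#define CPU_FLAG_FAST 1

/* Declared but deliberately never defined: stands in for an assembly
 * routine that is absent from this build (hypothetical name). */
void idct_add_asm(void);

/* Portable fallback that is always available. */
static void idct_add_c(void) { }

typedef void (*idct_fn)(void);

static idct_fn select_idct(int cpu_flags)
{
    idct_fn fn = idct_add_c;

    /* Constant first: the condition is known to be 0 before the runtime
     * flag is even looked at, so the compiler is expected to drop the
     * branch together with its reference to idct_add_asm. */
    if (HAVE_EXTERNAL_ASM && (cpu_flags & CPU_FLAG_FAST))
        fn = idct_add_asm;

    return fn;
}
```

With the operands swapped, `(cpu_flags & CPU_FLAG_FAST) && HAVE_EXTERNAL_ASM`, the condition has the shape the commit message reports as troublesome: the branch may survive an unoptimized build and leave an unresolved reference to `idct_add_asm` at link time.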
2014-04-05 | Merge commit 'b42f49e42f8cde25a788b2d13d03e99ca2956647' | Michael Niedermayer
* commit 'b42f49e42f8cde25a788b2d13d03e99ca2956647':
  x86: dsputil: Eliminate some unnecessary dsputil_x86.h #includes
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-04 | x86: dsputil: Eliminate some unnecessary dsputil_x86.h #includes | Diego Biurrun
2014-01-06 | Merge commit 'a03a642d5ceb5f2f7c6ebbf56ff365dfbcdb65eb' | Michael Niedermayer
* commit 'a03a642d5ceb5f2f7c6ebbf56ff365dfbcdb65eb':
  h264: do not use 422 functions for monochrome
See: 07abf13da4a7c3d23ce6bc6542d72e6252161736
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-06 | h264: do not use 422 functions for monochrome | Anton Khirnov
Fixes invalid memory access.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC:libav-stable@libav.org
2013-08-30 | Merge commit 'e998b56362c711701b3daa34e7b956e7126336f4' | Michael Niedermayer
* commit 'e998b56362c711701b3daa34e7b956e7126336f4':
  x86: avcodec: Consistently structure CPU extension initialization
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-29 | x86: avcodec: Consistently structure CPU extension initialization | Diego Biurrun
2013-07-18 | Merge remote-tracking branch 'qatar/master' | Michael Niedermayer
* qatar/master:
  Consistently use "cpu_flags" as variable/parameter name for CPU flags
Conflicts:
  libavcodec/x86/dsputil_init.c
  libavcodec/x86/h264dsp_init.c
  libavcodec/x86/hpeldsp_init.c
  libavcodec/x86/motion_est.c
  libavcodec/x86/mpegvideo.c
  libavcodec/x86/proresdsp_init.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-18 | Consistently use "cpu_flags" as variable/parameter name for CPU flags | Diego Biurrun
2013-05-14 | Merge commit '1399931d07f0f37ef4526eb8d39d33c64e09618a' | Michael Niedermayer
* commit '1399931d07f0f37ef4526eb8d39d33c64e09618a':
  x86: dsputil: Rename dsputil_mmx.h --> dsputil_x86.h
Conflicts:
  libavcodec/x86/dsputil_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-05-13 | x86: dsputil: Rename dsputil_mmx.h --> dsputil_x86.h | Diego Biurrun
The header is not (anymore) MMX-specific.
2013-05-01 | Merge commit '7f75f2f2bd692857c1c1ca7f414eb30ece3de93d' | Michael Niedermayer
* commit '7f75f2f2bd692857c1c1ca7f414eb30ece3de93d':
  ppc: Drop unnecessary ff_ name prefixes from static functions
  x86: Drop unnecessary ff_ name prefixes from static functions
  arm: Drop unnecessary ff_ name prefixes from static functions
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-04-30 | x86: Drop unnecessary ff_ name prefixes from static functions | Diego Biurrun
2013-02-06 | Merge commit '620289a20e022b9c16c10d546ef86cc0bb77cc84' | Michael Niedermayer
* commit '620289a20e022b9c16c10d546ef86cc0bb77cc84':
  sh4: Fix silly type vs. variable name search and replace typo
  configure: Group all hwaccels together in a separate variable
  Add av_cold attributes to arch-specific init functions
Conflicts:
  configure
  libavcodec/arm/mpegvideo_armv5te.c
  libavcodec/x86/mlpdsp.c
  libavcodec/x86/motion_est.c
  libavcodec/x86/mpegvideoenc.c
  libavcodec/x86/videodsp_init.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-05 | Add av_cold attributes to arch-specific init functions | Diego Biurrun
2013-01-23 | Merge commit '88bd7fdc821aaa0cbcf44cf075c62aaa42121e3f' | Michael Niedermayer
* commit '88bd7fdc821aaa0cbcf44cf075c62aaa42121e3f':
  Drop DCTELEM typedef
Conflicts:
  libavcodec/alpha/dsputil_alpha.h
  libavcodec/alpha/motion_est_alpha.c
  libavcodec/arm/dsputil_init_armv6.c
  libavcodec/bfin/dsputil_bfin.h
  libavcodec/bfin/pixels_bfin.S
  libavcodec/cavs.c
  libavcodec/cavsdec.c
  libavcodec/dct-test.c
  libavcodec/dnxhdenc.c
  libavcodec/dsputil.c
  libavcodec/dsputil.h
  libavcodec/dsputil_template.c
  libavcodec/eamad.c
  libavcodec/h264_cavlc.c
  libavcodec/h264idct_template.c
  libavcodec/mpeg12.c
  libavcodec/mpegvideo.c
  libavcodec/mpegvideo.h
  libavcodec/mpegvideo_enc.c
  libavcodec/ppc/dsputil_altivec.c
  libavcodec/proresdsp.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-23 | Drop DCTELEM typedef | Diego Biurrun
It does not help as an abstraction and adds dsputil dependencies.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-12-12 | x86inc: support stack mem allocation and re-alignment in PROLOGUE. | Ronald S. Bultje
Use this in VP8/H264-8bit loopfilter functions so they can be used if there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-12 | x86inc: support stack mem allocation and re-alignment in PROLOGUE | Ronald S. Bultje
Use this in VP8/H264-8bit loopfilter functions so they can be used if there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2012-11-29 | Merge commit '1f3f896564501c23b44fcf605567c78ce066b539' | Michael Niedermayer
* commit '1f3f896564501c23b44fcf605567c78ce066b539':
  fate: Add dependencies for Vorbis, ProRes, QTRLE, utvideo tests
  fate: real: Add dependencies
  fate: lossless-audio: Add dependencies
  x86: h264dsp: Fix linking with yasm and optimizations disabled
Conflicts:
  libavcodec/x86/h264dsp_init.c
  tests/fate/lossless-audio.mak
  tests/fate/real.mak
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-28 | x86: h264dsp: Fix linking with yasm and optimizations disabled | Diego Biurrun
Some optimized functions reference optimized symbols, so the functions must be explicitly disabled when those symbols are unavailable.
2012-11-14 | Merge remote-tracking branch 'qatar/master' | Michael Niedermayer
* qatar/master:
  x86: mmx2 ---> mmxext in asm constructs
Conflicts:
  libavcodec/x86/h264_chromamc_10bit.asm
  libavcodec/x86/h264_deblock.asm
  libavcodec/x86/h264dsp_init.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-14 | x86: mmx2 ---> mmxext in asm constructs | Diego Biurrun
2012-11-01 | Merge commit 'fa8fcab1e0d31074c0644c4ac5194474c6c26415' | Michael Niedermayer
* commit 'fa8fcab1e0d31074c0644c4ac5194474c6c26415':
  x86: h264_chromamc_10bit: drop pointless PAVG %define
  x86: mmx2 ---> mmxext in function names
  swscale: do not forget to swap data in formats with different endianness
Conflicts:
  libavcodec/x86/dsputil_mmx.c
  libavfilter/x86/gradfun.c
  libswscale/input.c
  libswscale/utils.c
  libswscale/x86/swscale.c
  tests/ref/lavfi/pixfmts_scale
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-10-31 | x86: mmx2 ---> mmxext in function names | Diego Biurrun
2012-09-09 | x86/h264dsp_init: put a HAVE_YASM back | Michael Niedermayer
Should fix compilation on open solaris
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-09-09 | Merge remote-tracking branch 'qatar/master' | Michael Niedermayer
* qatar/master:
  swscale: Provide the right alignment for external mmx asm
  x86: Replace checks for CPU extensions and flags by convenience macros
  configure: msvc: fix/simplify setting of flags for hostcc
  x86: mlpdsp: mlp_filter_channel_x86 requires inline asm
Conflicts:
  libavcodec/x86/fft_init.c
  libavcodec/x86/h264_intrapred_init.c
  libavcodec/x86/h264dsp_init.c
  libavcodec/x86/mpegaudiodec.c
  libavcodec/x86/proresdsp_init.c
  libavutil/x86/float_dsp_init.c
  libswscale/utils.c
  libswscale/x86/swscale.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-09-08 | x86: Replace checks for CPU extensions and flags by convenience macros | Diego Biurrun
This separates code relying on inline from that relying on external assembly and fixes instances where the coalesced check was incorrect.
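As a rough illustration of what such convenience macros can look like, the sketch below folds a build-time switch and a runtime CPU flag into one check, with separate macros for inline and external assembly. Every macro, flag and function name in it (`CPU_CHECK`, `EXTERNAL_SSE2`, `INLINE_SSE2`, `CPU_FLAG_SSE2`, `pick_impl`) is a hypothetical stand-in chosen for this example, not FFmpeg's actual definition.

```c
/* Hypothetical build-time switches: external assembly is available in this
 * build, inline assembly is not. */
#define HAVE_SSE2_EXTERNAL 1
#define HAVE_SSE2_INLINE   0

/* Hypothetical runtime CPU flag bit. */
#define CPU_FLAG_SSE2 0x10

/* One macro combines the build-time switch with the runtime flag test.
 * Token pasting builds HAVE_<ext><suffix> and CPU_FLAG_<ext>. */
#define CPU_CHECK(flags, suffix, ext) \
    (HAVE_##ext##suffix && ((flags) & CPU_FLAG_##ext))

/* Separate checks for external and inline assembly, so neither kind of
 * code is enabled by a switch that belongs to the other. */
#define EXTERNAL_SSE2(flags) CPU_CHECK(flags, _EXTERNAL, SSE2)
#define INLINE_SSE2(flags)   CPU_CHECK(flags, _INLINE,   SSE2)

enum impl { IMPL_C, IMPL_SSE2_INLINE, IMPL_SSE2_EXTERNAL };

/* Pick an implementation the way an init function might: start with the
 * portable version and override it when a check passes. */
static enum impl pick_impl(int cpu_flags)
{
    enum impl chosen = IMPL_C;

    if (INLINE_SSE2(cpu_flags))
        chosen = IMPL_SSE2_INLINE;
    if (EXTERNAL_SSE2(cpu_flags))
        chosen = IMPL_SSE2_EXTERNAL;

    return chosen;
}
```

In this configuration `pick_impl(CPU_FLAG_SSE2)` selects the external variant and `pick_impl(0)` falls back to the C version. A single coalesced "have SSE2" switch could not tell the two assembly kinds apart.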