Age | Commit message | Author |
|
difference for SSE and AVX2 (x86_64)
|
|
|
|
|
|
func except divide
and optimize average, grainextract, multiply, screen, grain merge
|
|
Blend function speedups on x86_64 Core i5 4460:
ffmpeg -f lavfi -i allyuv -vf framerate=60:threads=1 -f null none
C: 447548411 decicycles in Blend, 2048 runs, 0 skips
SSSE3: 130020087 decicycles in Blend, 2048 runs, 0 skips
AVX2: 128508221 decicycles in Blend, 2048 runs, 0 skips
ffmpeg -f lavfi -i allyuv -vf format=yuv420p12,framerate=60:threads=1 -f null none
C: 228932745 decicycles in Blend, 2048 runs, 0 skips
SSE4: 123357781 decicycles in Blend, 2048 runs, 0 skips
AVX2: 121215353 decicycles in Blend, 2048 runs, 0 skips
Signed-off-by: Marton Balint <cus@passwd.hu>
|
|
|
|
This reverts commits 1a5865b6dcc97754a1d7eedc130fb58237d2a715 and
8fb1d63d919286971b8e6afad372730d6d6f25c8.
They made fate interlace tests fail when AVX2 was used.
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
based on patch by Paul B Mahol
|
|
|
|
|
|
|
|
|
|
arg for ff_interlace_init_x86
|
|
unaligned data in low_pass complex
related to ticket 6491
|
|
ticket 6491
|
|
|
|
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
Signed-off-by: Paul B Mahol <onemda@gmail.com>
|
|
Signed-off-by: Paul B Mahol <onemda@gmail.com>
|
|
Fixes building with yasm
Tested-by: stevenliu
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
|
|
|
|
version
|
|
Signed-off-by: Paul B Mahol <onemda@gmail.com>
|
|
* commit '994c4bc10751e39c7ed9f67ffd0c0dea5223daf2':
x86util: Port all macros to cpuflags
See d5f8a642f6eb1c6e305c41dabddd0fd36ffb3f77
Merged-by: James Almer <jamrial@gmail.com>
|
|
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Thomas Mundt <tmundt75@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
The complex vertical low-pass filter slightly over-sharpens the picture. This becomes visible when several transcodings are cascaded and the error compounds, e.g. over several generations of HD->SD and SD->HD conversion.
To prevent this, the destination pixel must not exceed the source pixel when the average of the pixels above and below is less than the source pixel, and must not fall below the source pixel in the opposite case.
Tested and approved in a visual transcoding cascade test by video professionals.
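The limiting rule described above can be sketched per pixel as follows. This is a minimal illustration, not the code from the patch: the helper names, the signature, the rounding constant and the assumption of 8-bit samples in the range 0..255 are all hypothetical.

```c
#include <stdint.h>

/* Clamp an int to the 8-bit pixel range. */
static uint8_t clip_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
}

/*
 * Hypothetical per-pixel helper (name and signature are assumptions).
 * Applies the (-1 2 6 2 -1)/8 vertical kernel with rounding, then limits
 * the result so that it cannot overshoot the source pixel `cur`.
 * All inputs are assumed to be 8-bit samples in 0..255.
 */
static uint8_t lowpass_complex_limited(int above2, int above, int cur,
                                       int below, int below2)
{
    int sum = 2 * above + 6 * cur + 2 * below - above2 - below2 + 4;
    uint8_t dst = clip_u8(sum < 0 ? 0 : sum >> 3);

    if (above + below > 2 * cur) {
        /* Neighbour average is above the source: dst must not drop below it. */
        if (dst < cur)
            dst = (uint8_t)cur;
    } else if (dst > cur) {
        /* Neighbour average is at or below the source: dst must not exceed it. */
        dst = (uint8_t)cur;
    }
    return dst;
}
```

With these assumptions, lowpass_complex_limited(0, 100, 120, 100, 0) returns 120: the unlimited kernel would produce 140, which overshoots the source value even though both direct neighbours are darker than it.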
SSIM/PSNR test with the first generation of an HD->SD file as a reference against the 6th generation (3 x SD->HD, HD->SD):
Results without the patch:
SSIM Y:0.956508 (13.615881) U:0.991601 (20.757750) V:0.993004 (21.551382) All:0.974405 (15.918463)
PSNR y:31.838009 u:48.424280 v:48.962711 average:34.759466 min:31.699297 max:40.857847
Results with the patch:
SSIM Y:0.970051 (15.236232) U:0.991883 (20.905857) V:0.993174 (21.658049) All:0.981290 (17.279202)
PSNR y:34.412108 u:48.504454 v:48.969496 average:37.264644 min:34.310637 max:42.373392
Signed-off-by: Thomas Mundt <tmundt75@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
|
grainextract
|
|
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
Signed-off-by: Paul B Mahol <onemda@gmail.com>
|
|
|
|
Process more pixels per loop.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
|
|
None of them are specific to the YASM assembler.
(Cherry-picked from libav commit 39e208f4d4756367c7cd2d581847e0c1b8a429c1)
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
Signed-off-by: Paul B Mahol <onemda@gmail.com>
|
|
This complex (-1 2 6 2 -1) filter reduces interlace 'twitter' slightly less, but retains detail and subjective sharpness better than the linear (1 2 1) filter.
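To make the difference between the two kernels concrete, here is a small per-pixel sketch. It is an illustration only: the function names, the round-to-nearest constants and the 8-bit sample range are assumptions, not taken from the filter source.

```c
#include <stdint.h>

/* Clamp an int to the 8-bit pixel range. */
static uint8_t clip_pixel(int v)
{
    return (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
}

/* Linear (1 2 1)/4 vertical kernel; inputs assumed to be in 0..255. */
static uint8_t lowpass_linear(int above, int cur, int below)
{
    return (uint8_t)((above + 2 * cur + below + 2) >> 2);
}

/* Complex (-1 2 6 2 -1)/8 vertical kernel; inputs assumed to be in 0..255. */
static uint8_t lowpass_complex(int above2, int above, int cur,
                               int below, int below2)
{
    int sum = 2 * above + 6 * cur + 2 * below - above2 - below2 + 4;

    /* The negative outer taps can push the sum out of range, so clamp it. */
    return clip_pixel(sum < 0 ? 0 : sum >> 3);
}
```

For an isolated line of value 200 on a black background, lowpass_linear(0, 200, 0) gives 100 while lowpass_complex(0, 0, 200, 0, 0) gives 150. The complex kernel keeps more of the line's contrast, which is consistent with better detail retention and with weaker twitter suppression.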
Signed-off-by: Thomas Mundt <tmundt75@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
Signed-off-by: Thomas Mundt <tmundt75@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
|
None of them are specific to the YASM assembler.
|
|
Signed-off-by: Paul B Mahol <onemda@gmail.com>
|
|
This fixes many warnings of the sort:
    warning: label alone on a line without a colon might be in error
|
|
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
See merge commit '39d6d3618d48625decaff7d9bdbb45b44ef2a805'.
|
|
* commit 'dc40a70c5755bccfb1a1349639943e1f408bea50':
Drop unnecessary libavutil/x86/asm.h #includes
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
|
|
Fixes failures with yasm 1.1.0 and older
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
Old yasm/nasm versions don't support some of these
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
on x86_64:
        time    PSNR
plain   3.303   inf
SSE     1.649   107.087535
SSE3    1.632   107.087535
AVX     1.409   106.986771
FMA3    1.265   107.108437
on x86_32 (PSNR compared to x86_64 plain):
        time    PSNR
plain   7.225   103.951979
SSE     1.827   105.859282
SSE3    1.819   105.859282
AVX     1.533   105.997661
FMA3    1.384   105.885377
FMA4 test is not available
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
|