author | Kyle Siefring <kylesiefring@gmail.com> | 2021-01-02 06:18:19 +0300 |
---|---|---|
committer | Jean-Baptiste Kempf <jb@videolan.org> | 2021-01-06 02:23:05 +0300 |
commit | 5dc55af6a1034e730dd8899b5380c424b161e235 (patch) | |
tree | c83c9266ca61707d8b8f23f92373afdeee4ba337 /tools | |
parent | 0d02b5e40bf059cb37d934f0778c757f4eaf12b0 (diff) |
SSE2, msac: Use bsr shortcut for 50% bool decoding
bsr has 3 cycles of latency for modern x86 processors. For this
function, it's possible to obtain the number of bits to shift by
alternative means.
I'd estimate roughly a 0.2% decrease in CPU usage, based on the
percentages associated with function symbols in perf report.
Benchmarks were run on a Ryzen 5 3600 (Zen 2). The clip used was the
original 1080p Chimera.
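
To illustrate the "alternative means" mentioned above, here is a minimal C sketch, not the actual SSE2 assembly from this commit. It assumes the equiprobable split is computed as `((rng >> 8) << 7) + 4`; that constant and the `0xBFFF` formula are assumptions for illustration, and the helper names are hypothetical. Under that assumption the new range always lies in a narrow window, so the normalization shift can only be 0, 1, or 2 and can be derived with a subtract and a shift instead of a bit scan.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: normalization shift found by scanning for the top set bit,
 * which is what a bsr/lzcnt-based path computes. Valid for 0 < rng <= 0xFFFF. */
static unsigned shift_scan(uint32_t rng) {
    unsigned d = 0;
    while (!(rng & 0x8000u)) {
        rng <<= 1;
        d++;
    }
    return d;
}

/* Shortcut (assumed form): valid only while 0x2000 <= rng < 0xC000,
 * where the required shift equals 2 - (rng >> 14). */
static unsigned shift_shortcut(uint32_t rng) {
    return (0xBFFFu - rng) >> 14;
}

/* Walk every normalized 16-bit range, apply the assumed 50% split, and
 * count the cases where the shortcut disagrees with the bit scan. */
static unsigned count_shift_mismatches(void) {
    unsigned bad = 0;
    for (uint32_t r = 0x8000u; r <= 0xFFFFu; r++) {
        uint32_t v = ((r >> 8) << 7) + 4;  /* assumed split point */
        bad += shift_scan(v) != shift_shortcut(v);          /* one outcome   */
        bad += shift_scan(r - v) != shift_shortcut(r - v);  /* other outcome */
    }
    return bad;
}
```

If `count_shift_mismatches()` returns 0, the two methods agree for every range the 50% path can produce under these assumptions, which is what lets the bit scan be dropped from the dependency chain.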
Diffstat (limited to 'tools')
0 files changed, 0 insertions, 0 deletions