
github.com/videolan/dav1d.git
Age Commit message (Author)
2020-06-21 Update NEWS for 0.7.1 [tag: 0.7.1] (Jean-Baptiste Kempf)
2020-06-20 Extract y related operations out of warp_affine inner loop (Luc Trudeau)
2020-06-19 x86: Branch before waiting on popcnt in ipred_z AVX2 functions (Henrik Gramner)
Some specific Haswell CPUs have a hardware bug where the popcnt instruction doesn't set the zero flag correctly, which causes the wrong branch to be taken. popcnt also has a 3-cycle latency on Intel CPUs, so branching on the input value instead of the output reduces the time wasted going down the wrong code path on a branch misprediction.
2020-06-19 arm32: Add a NEON implementation of MSAC (Martin Storsjö)
Only use this in the cases when NEON can be used unconditionally without runtime detection (when __ARM_NEON is defined). The speedup over the C code is very modest for the smaller functions (and the NEON version actually is a little slower than the C code on Cortex A7 for adapt4), but the speedup is around 2x for adapt16.
Cortex                                A7     A8     A9    A53    A72    A73
msac_decode_bool_c:                 41.1   43.0   43.0   37.3   26.2   31.3
msac_decode_bool_neon:              40.2   42.0   37.2   32.8   19.9   25.5
msac_decode_bool_adapt_c:           65.1   70.4   58.5   54.3   33.2   40.8
msac_decode_bool_adapt_neon:        56.8   52.4   49.3   42.6   27.1   33.7
msac_decode_bool_equi_c:            36.9   37.2   42.8   32.6   22.7   42.3
msac_decode_bool_equi_neon:         34.9   35.1   36.4   29.7   19.5   36.4
msac_decode_symbol_adapt4_c:       114.2  139.0  111.6   99.9   65.5   83.5
msac_decode_symbol_adapt4_neon:    119.2  128.3   95.7   82.2   58.2   57.5
msac_decode_symbol_adapt8_c:       176.0  207.9  164.0  154.4   88.0  117.0
msac_decode_symbol_adapt8_neon:    128.3  130.3  110.7   85.1   59.9   61.4
msac_decode_symbol_adapt16_c:      292.1  320.5  256.4  246.4  129.1  173.3
msac_decode_symbol_adapt16_neon:   162.2  144.3  129.0  104.2   69.2   69.9
(Omitting msac_decode_hi_tok from the benchmark, as the "C" version measured there uses the NEON version of msac_decode_symbol_adapt4.)
2020-06-18 arm64: msac: Add a special cased implementation of decode_hi_tok (Martin Storsjö)
The speedup (over the normal version, that just calls the existing assembly version of symbol_adapt4) is not very impressive on bigger cores, but looks decent on small cores. It's an improvement though, in any case.
Cortex                       A53    A72    A73
msac_decode_hi_tok_c:      175.7  136.2  138.1
msac_decode_hi_tok_neon:   146.8  129.4  125.9
2020-06-18 arm64: msac: Use a narrower vector length for adapt4 in one place (Martin Storsjö)
2020-06-18 arm64: msac: Clarify the register use in one macro (Martin Storsjö)
Include the letter prefix when calling the macro, making it slightly less obscure.
2020-06-18 cli: Avoid large intermediates in the windows get_time_nanos (Martin Storsjö)
By multiplying the performance counter value (within its own time base) by the intended target time base, and only then dividing, we reduce the available numeric range by the factor of the original time base times the new time base. On Windows 10 on ARM64, the performance counter frequency is 19200000 (on x86_64 in a virtual machine, it's 10000000), making the calculation overflow every (1 << 64) / (19200000 * 1000000000) = 960 seconds, i.e. 16 minutes - long before the actual uint64_t nanosecond return value wraps around.
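The overflow described above can be avoided by splitting the counter value into whole seconds and a sub-second remainder before scaling. The following is a minimal sketch of that idea, not the code from the commit: the function names `ticks_to_nanos_naive` and `ticks_to_nanos` are hypothetical, and the counter value and frequency are passed in as plain integers rather than read from the Windows performance counter.

```c
#include <assert.h>
#include <stdint.h>

#define NANOS_PER_SEC UINT64_C(1000000000)

/* Naive form: the intermediate product ticks * 1e9 wraps around once
 * ticks exceeds 2^64 / 1e9, i.e. after about 960 s at 19.2 MHz. */
static uint64_t ticks_to_nanos_naive(uint64_t ticks, uint64_t freq) {
    assert(freq != 0);
    return ticks * NANOS_PER_SEC / freq;
}

/* Split form: scale whole seconds and the sub-second remainder
 * separately, so the largest intermediate is (freq - 1) * 1e9. */
static uint64_t ticks_to_nanos(uint64_t ticks, uint64_t freq) {
    assert(freq != 0);
    const uint64_t secs = ticks / freq;
    const uint64_t frac = ticks % freq; /* always smaller than freq */
    return secs * NANOS_PER_SEC + frac * NANOS_PER_SEC / freq;
}
```

With a 19200000 Hz counter the split form stays exact well past the 960 second mark; it would only overflow in the remainder term for frequencies above roughly 1.8e10 Hz, or when the nanosecond result itself no longer fits in a uint64_t.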
2020-06-18 cli: Get the elapsed time if printing progress, regardless of the fps value (Martin Storsjö)
Even if we don't want to throttle decoding to realtime, and even if the file itself didn't contain a valid fps value, we may want to call the synchronize function to fetch the current elapsed decoding time, for displaying the fps value.
2020-06-18 Update NEWS for 0.7.1 (Jean-Baptiste Kempf)
2020-06-18 x86: Add put/prep_bilin_scaled AVX2 asm (Victorien Le Couviour--Tuffet)
Since bilin scaled is very rarely used, add a new table entry to mc_subpel_filters and jump to the put/prep_8tap_scaled code. AVX2 performance is the same as for the 8tap code; the speedup over C is much smaller though, as the C code is a true bilinear codepath and is auto-vectorized. Even so, the AVX2 code is always faster.
2020-06-18 x86: Add prep_8tap_scaled AVX2 asm (Victorien Le Couviour--Tuffet)
mct_scaled_8tap_regular_w4_8bpc_c: 872.1
mct_scaled_8tap_regular_w4_8bpc_avx2: 125.6
mct_scaled_8tap_regular_w4_dy1_8bpc_c: 886.3
mct_scaled_8tap_regular_w4_dy1_8bpc_avx2: 84.0
mct_scaled_8tap_regular_w4_dy2_8bpc_c: 1189.1
mct_scaled_8tap_regular_w4_dy2_8bpc_avx2: 84.7
mct_scaled_8tap_regular_w8_8bpc_c: 2261.0
mct_scaled_8tap_regular_w8_8bpc_avx2: 306.2
mct_scaled_8tap_regular_w8_dy1_8bpc_c: 2189.9
mct_scaled_8tap_regular_w8_dy1_8bpc_avx2: 233.8
mct_scaled_8tap_regular_w8_dy2_8bpc_c: 3060.3
mct_scaled_8tap_regular_w8_dy2_8bpc_avx2: 282.8
mct_scaled_8tap_regular_w16_8bpc_c: 4335.3
mct_scaled_8tap_regular_w16_8bpc_avx2: 680.7
mct_scaled_8tap_regular_w16_dy1_8bpc_c: 5137.2
mct_scaled_8tap_regular_w16_dy1_8bpc_avx2: 578.6
mct_scaled_8tap_regular_w16_dy2_8bpc_c: 7878.4
mct_scaled_8tap_regular_w16_dy2_8bpc_avx2: 774.6
mct_scaled_8tap_regular_w32_8bpc_c: 17871.9
mct_scaled_8tap_regular_w32_8bpc_avx2: 2954.8
mct_scaled_8tap_regular_w32_dy1_8bpc_c: 18594.7
mct_scaled_8tap_regular_w32_dy1_8bpc_avx2: 2073.9
mct_scaled_8tap_regular_w32_dy2_8bpc_c: 28696.0
mct_scaled_8tap_regular_w32_dy2_8bpc_avx2: 2852.1
mct_scaled_8tap_regular_w64_8bpc_c: 46967.5
mct_scaled_8tap_regular_w64_8bpc_avx2: 7527.5
mct_scaled_8tap_regular_w64_dy1_8bpc_c: 45564.2
mct_scaled_8tap_regular_w64_dy1_8bpc_avx2: 5262.9
mct_scaled_8tap_regular_w64_dy2_8bpc_c: 72793.3
mct_scaled_8tap_regular_w64_dy2_8bpc_avx2: 7535.9
mct_scaled_8tap_regular_w128_8bpc_c: 111190.8
mct_scaled_8tap_regular_w128_8bpc_avx2: 19386.8
mct_scaled_8tap_regular_w128_dy1_8bpc_c: 122625.0
mct_scaled_8tap_regular_w128_dy1_8bpc_avx2: 15376.1
mct_scaled_8tap_regular_w128_dy2_8bpc_c: 197120.6
mct_scaled_8tap_regular_w128_dy2_8bpc_avx2: 21871.0
2020-06-17 Clean up fraction calculation (Colin Lee)
2020-06-17 Add clamping back to mv projection (Colin Lee)
Clamping in the motion vector projection calculation is required by the spec. A rewrite of the function in commit aca57bf3db00c29e90605656f1015561d1d67c2d omitted the clamping. This commit re-adds it.
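The kind of clamping this commit restores can be sketched as follows. This is a minimal illustration and not dav1d's code: the helper names, the 14-bit fixed-point scale factor `frac`, and the bound of (1 << 14) - 1 are assumptions made for the example and should be checked against the AV1 specification.

```c
#include <stdint.h>

/* Assumed bound for a projected motion vector component
 * (hypothetical constant for this sketch). */
#define MV_PROJ_LIMIT ((1 << 14) - 1)

/* Clamp v to the inclusive range [lo, hi]. */
static int clip3(int lo, int hi, int v) {
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Divide by 2^n, rounding to nearest with ties away from zero. */
static int round2signed(int64_t x, unsigned n) {
    const int64_t half = (int64_t)1 << (n - 1);
    return x >= 0 ? (int)((x + half) >> n) : -(int)((-x + half) >> n);
}

/* Scale one motion vector component by a 14-bit fixed-point factor
 * (frac = 16384 means 1.0), then clamp the result. Without the final
 * clip3, large inputs would leave the representable range. */
static int project_mv_component(int mv, int frac) {
    const int scaled = round2signed((int64_t)mv * frac, 14);
    return clip3(-MV_PROJ_LIMIT, MV_PROJ_LIMIT, scaled);
}
```

For example, projecting a component of 30000 with a scale factor of 1.0 yields the clamped value 16383 instead of 30000.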
2020-06-16 arm64: itx: Simplify and clarify the sub_sp macro a little bit (Martin Storsjö)
Add an .error case for Windows if subtracting more than 8 KB, and simplify the generic subtraction case.
2020-06-16 arm: itx: Add NEON implementation of itx for 8 bpc (Martin Storsjö)
The transforms process vectors of up to 8 elements at a time, for transforms up to size 8; for larger transforms, it uses vectors of 4 elements. Overall, the speedup over C code seems to be around 8-14x for the larger transforms, and 10-19x for the smaller ones.
Relative speedup over C code (built with GCC 7.5) for a few functions:
Cortex                                       A7     A8     A9    A53    A72    A73
inv_txfm_add_4x4_dct_dct_0_8bpc_neon:      3.83   3.42   2.57   3.36   2.97   7.47
inv_txfm_add_4x4_dct_dct_1_8bpc_neon:      7.25  13.53   8.38   8.82   7.96  12.37
inv_txfm_add_8x8_dct_dct_0_8bpc_neon:      4.78   6.61   4.82   4.65   5.27   9.76
inv_txfm_add_8x8_dct_dct_1_8bpc_neon:     10.20  19.07  13.07  14.69  11.45  15.50
inv_txfm_add_16x16_dct_dct_0_8bpc_neon:    4.26   5.06   3.00   3.74   4.05   4.49
inv_txfm_add_16x16_dct_dct_1_8bpc_neon:   10.51  16.02  13.57  14.03  12.86  18.16
inv_txfm_add_16x16_dct_dct_2_8bpc_neon:    7.95  11.75   9.09  10.64  10.06  14.07
inv_txfm_add_32x32_dct_dct_0_8bpc_neon:    5.31   5.58   3.14   4.18   4.80   4.57
inv_txfm_add_32x32_dct_dct_1_8bpc_neon:   12.66  16.07  14.34  16.00  15.24  21.32
inv_txfm_add_32x32_dct_dct_4_8bpc_neon:    8.25  10.69   8.90  10.59  10.41  14.39
inv_txfm_add_64x64_dct_dct_0_8bpc_neon:    4.69   5.97   3.17   3.96   4.57   4.34
inv_txfm_add_64x64_dct_dct_1_8bpc_neon:   11.47  12.68  10.18  14.73  14.20  17.95
inv_txfm_add_64x64_dct_dct_4_8bpc_neon:    8.84  10.13   7.94  11.25  10.58  13.88
2020-06-11 meson: Use dav1d_src_root everywhere for consistency (Matthias Dressel)
2020-06-11 Remove redundant memset in itx DSP initialization (Henrik Gramner)
The struct is already zero-initialized when the function is called except for the checkasm test, so move the zeroing there instead.
2020-06-11 meson: Make docs generation subproject-safe (Matthias Dressel)
meson.source_root() returns the root of a parent project if dav1d is embedded as a subproject.
2020-06-11 x86: Adapt SSSE3 prep_8tap to SSE2 (Victorien Le Couviour--Tuffet)
---------------------
x86_64:
------------------------------------------
mct_8tap_regular_w4_h_8bpc_c: 302.3
mct_8tap_regular_w4_h_8bpc_sse2: 47.3
mct_8tap_regular_w4_h_8bpc_ssse3: 19.5
---------------------
mct_8tap_regular_w8_h_8bpc_c: 745.5
mct_8tap_regular_w8_h_8bpc_sse2: 235.2
mct_8tap_regular_w8_h_8bpc_ssse3: 70.4
---------------------
mct_8tap_regular_w16_h_8bpc_c: 1844.3
mct_8tap_regular_w16_h_8bpc_sse2: 755.6
mct_8tap_regular_w16_h_8bpc_ssse3: 225.9
---------------------
mct_8tap_regular_w32_h_8bpc_c: 6685.5
mct_8tap_regular_w32_h_8bpc_sse2: 2954.4
mct_8tap_regular_w32_h_8bpc_ssse3: 795.8
---------------------
mct_8tap_regular_w64_h_8bpc_c: 15633.5
mct_8tap_regular_w64_h_8bpc_sse2: 7120.4
mct_8tap_regular_w64_h_8bpc_ssse3: 1900.4
---------------------
mct_8tap_regular_w128_h_8bpc_c: 37772.1
mct_8tap_regular_w128_h_8bpc_sse2: 17698.1
mct_8tap_regular_w128_h_8bpc_ssse3: 4665.5
------------------------------------------
mct_8tap_regular_w4_v_8bpc_c: 306.5
mct_8tap_regular_w4_v_8bpc_sse2: 71.7
mct_8tap_regular_w4_v_8bpc_ssse3: 37.9
---------------------
mct_8tap_regular_w8_v_8bpc_c: 923.3
mct_8tap_regular_w8_v_8bpc_sse2: 168.7
mct_8tap_regular_w8_v_8bpc_ssse3: 71.3
---------------------
mct_8tap_regular_w16_v_8bpc_c: 3040.1
mct_8tap_regular_w16_v_8bpc_sse2: 505.1
mct_8tap_regular_w16_v_8bpc_ssse3: 199.7
---------------------
mct_8tap_regular_w32_v_8bpc_c: 12354.8
mct_8tap_regular_w32_v_8bpc_sse2: 1942.0
mct_8tap_regular_w32_v_8bpc_ssse3: 714.2
---------------------
mct_8tap_regular_w64_v_8bpc_c: 29427.9
mct_8tap_regular_w64_v_8bpc_sse2: 4637.4
mct_8tap_regular_w64_v_8bpc_ssse3: 1829.2
---------------------
mct_8tap_regular_w128_v_8bpc_c: 72756.9
mct_8tap_regular_w128_v_8bpc_sse2: 11301.0
mct_8tap_regular_w128_v_8bpc_ssse3: 5020.6
------------------------------------------
mct_8tap_regular_w4_hv_8bpc_c: 876.9
mct_8tap_regular_w4_hv_8bpc_sse2: 171.7
mct_8tap_regular_w4_hv_8bpc_ssse3: 112.2
---------------------
mct_8tap_regular_w8_hv_8bpc_c: 2215.1
mct_8tap_regular_w8_hv_8bpc_sse2: 730.2
mct_8tap_regular_w8_hv_8bpc_ssse3: 330.9
---------------------
mct_8tap_regular_w16_hv_8bpc_c: 6075.5
mct_8tap_regular_w16_hv_8bpc_sse2: 2252.1
mct_8tap_regular_w16_hv_8bpc_ssse3: 973.4
---------------------
mct_8tap_regular_w32_hv_8bpc_c: 22182.7
mct_8tap_regular_w32_hv_8bpc_sse2: 7692.6
mct_8tap_regular_w32_hv_8bpc_ssse3: 3599.8
---------------------
mct_8tap_regular_w64_hv_8bpc_c: 50876.8
mct_8tap_regular_w64_hv_8bpc_sse2: 18499.6
mct_8tap_regular_w64_hv_8bpc_ssse3: 8815.6
---------------------
mct_8tap_regular_w128_hv_8bpc_c: 122926.3
mct_8tap_regular_w128_hv_8bpc_sse2: 45120.0
mct_8tap_regular_w128_hv_8bpc_ssse3: 22085.7
------------------------------------------
2020-06-11 x86: Adapt SSSE3 prep_bilin to SSE2 (Victorien Le Couviour--Tuffet)
---------------------
x86_64:
------------------------------------------
mct_bilinear_w4_h_8bpc_c: 98.9
mct_bilinear_w4_h_8bpc_sse2: 30.2
mct_bilinear_w4_h_8bpc_ssse3: 11.5
---------------------
mct_bilinear_w8_h_8bpc_c: 175.3
mct_bilinear_w8_h_8bpc_sse2: 57.0
mct_bilinear_w8_h_8bpc_ssse3: 19.7
---------------------
mct_bilinear_w16_h_8bpc_c: 396.2
mct_bilinear_w16_h_8bpc_sse2: 179.3
mct_bilinear_w16_h_8bpc_ssse3: 50.9
---------------------
mct_bilinear_w32_h_8bpc_c: 1311.2
mct_bilinear_w32_h_8bpc_sse2: 718.8
mct_bilinear_w32_h_8bpc_ssse3: 243.9
---------------------
mct_bilinear_w64_h_8bpc_c: 2892.7
mct_bilinear_w64_h_8bpc_sse2: 1746.0
mct_bilinear_w64_h_8bpc_ssse3: 568.0
---------------------
mct_bilinear_w128_h_8bpc_c: 7192.6
mct_bilinear_w128_h_8bpc_sse2: 4339.8
mct_bilinear_w128_h_8bpc_ssse3: 1619.2
------------------------------------------
mct_bilinear_w4_v_8bpc_c: 129.7
mct_bilinear_w4_v_8bpc_sse2: 26.6
mct_bilinear_w4_v_8bpc_ssse3: 16.7
---------------------
mct_bilinear_w8_v_8bpc_c: 233.3
mct_bilinear_w8_v_8bpc_sse2: 55.0
mct_bilinear_w8_v_8bpc_ssse3: 24.7
---------------------
mct_bilinear_w16_v_8bpc_c: 498.9
mct_bilinear_w16_v_8bpc_sse2: 146.0
mct_bilinear_w16_v_8bpc_ssse3: 54.2
---------------------
mct_bilinear_w32_v_8bpc_c: 1562.2
mct_bilinear_w32_v_8bpc_sse2: 560.6
mct_bilinear_w32_v_8bpc_ssse3: 201.0
---------------------
mct_bilinear_w64_v_8bpc_c: 3221.3
mct_bilinear_w64_v_8bpc_sse2: 1380.6
mct_bilinear_w64_v_8bpc_ssse3: 499.3
---------------------
mct_bilinear_w128_v_8bpc_c: 7357.7
mct_bilinear_w128_v_8bpc_sse2: 3439.0
mct_bilinear_w128_v_8bpc_ssse3: 1489.1
------------------------------------------
mct_bilinear_w4_hv_8bpc_c: 185.0
mct_bilinear_w4_hv_8bpc_sse2: 54.5
mct_bilinear_w4_hv_8bpc_ssse3: 22.1
---------------------
mct_bilinear_w8_hv_8bpc_c: 377.8
mct_bilinear_w8_hv_8bpc_sse2: 104.3
mct_bilinear_w8_hv_8bpc_ssse3: 35.8
---------------------
mct_bilinear_w16_hv_8bpc_c: 1159.4
mct_bilinear_w16_hv_8bpc_sse2: 311.0
mct_bilinear_w16_hv_8bpc_ssse3: 106.3
---------------------
mct_bilinear_w32_hv_8bpc_c: 4436.2
mct_bilinear_w32_hv_8bpc_sse2: 1230.7
mct_bilinear_w32_hv_8bpc_ssse3: 400.7
---------------------
mct_bilinear_w64_hv_8bpc_c: 10627.7
mct_bilinear_w64_hv_8bpc_sse2: 2934.2
mct_bilinear_w64_hv_8bpc_ssse3: 957.2
---------------------
mct_bilinear_w128_hv_8bpc_c: 26048.9
mct_bilinear_w128_hv_8bpc_sse2: 7590.3
mct_bilinear_w128_hv_8bpc_ssse3: 2947.0
------------------------------------------
2020-06-10 arm64: itx16: Add a missed eob check in the 16x8 transform (Martin Storsjö)
This allows skipping half of the first transforms if the input coefficients lie within the upper 4x4 (but checkasm only tests in increments of 8x8 at the moment). With checkasm modified to test in smaller increments, the speedup is like this:
Before:                           Cortex    A53    A72    A73
inv_txfm_add_16x8_dct_dct_1_10bpc_neon:   874.4  709.0  707.3
After:
inv_txfm_add_16x8_dct_dct_1_10bpc_neon:   618.0  479.5  472.9
2020-06-10 arm64: itx16: Remove a leftover unused macro parameter (Martin Storsjö)
2020-06-09 x86: Fix compiler warnings when using nasm 2.15 (Henrik Gramner)
2020-06-09 Avoid compiling logging functions when logging is disabled (Henrik Gramner)
2020-06-07 CI: Enable coverage reports (Niklas Haas)
Blacklisted some files not directly relevant to the codebase (such as tests, tools and debugging functions). The coverage HTML report gets attached as a build artifact, although unfortunately we can't link directly to the `index.html`. We also attach the coverage XML as a cobertura report, although I'm not sure if it does anything.
2020-06-04 Range of operating point is 0 - 31, not 0 - 32 (Wan-Teh Chang)
2020-06-04 arm: Add an export parameter to the const macro (Martin Storsjö)
This is currently not used in dav1d (yet), but there's a need for it in rav1e, which shares this header with dav1d.
2020-06-01 x86: Add put_8tap_scaled AVX2 asm (Victorien Le Couviour--Tuffet)
mc_scaled_8tap_regular_w2_8bpc_c: 764.4
mc_scaled_8tap_regular_w2_8bpc_avx2: 191.3
mc_scaled_8tap_regular_w2_dy1_8bpc_c: 705.8
mc_scaled_8tap_regular_w2_dy1_8bpc_avx2: 89.5
mc_scaled_8tap_regular_w2_dy2_8bpc_c: 964.0
mc_scaled_8tap_regular_w2_dy2_8bpc_avx2: 120.3
mc_scaled_8tap_regular_w4_8bpc_c: 1355.7
mc_scaled_8tap_regular_w4_8bpc_avx2: 180.9
mc_scaled_8tap_regular_w4_dy1_8bpc_c: 1233.2
mc_scaled_8tap_regular_w4_dy1_8bpc_avx2: 115.3
mc_scaled_8tap_regular_w4_dy2_8bpc_c: 1707.6
mc_scaled_8tap_regular_w4_dy2_8bpc_avx2: 117.9
mc_scaled_8tap_regular_w8_8bpc_c: 2483.2
mc_scaled_8tap_regular_w8_8bpc_avx2: 294.8
mc_scaled_8tap_regular_w8_dy1_8bpc_c: 2166.4
mc_scaled_8tap_regular_w8_dy1_8bpc_avx2: 222.0
mc_scaled_8tap_regular_w8_dy2_8bpc_c: 3133.7
mc_scaled_8tap_regular_w8_dy2_8bpc_avx2: 292.6
mc_scaled_8tap_regular_w16_8bpc_c: 5239.2
mc_scaled_8tap_regular_w16_8bpc_avx2: 729.9
mc_scaled_8tap_regular_w16_dy1_8bpc_c: 5156.5
mc_scaled_8tap_regular_w16_dy1_8bpc_avx2: 602.2
mc_scaled_8tap_regular_w16_dy2_8bpc_c: 8018.4
mc_scaled_8tap_regular_w16_dy2_8bpc_avx2: 783.1
mc_scaled_8tap_regular_w32_8bpc_c: 14745.0
mc_scaled_8tap_regular_w32_8bpc_avx2: 2205.0
mc_scaled_8tap_regular_w32_dy1_8bpc_c: 14862.3
mc_scaled_8tap_regular_w32_dy1_8bpc_avx2: 1721.3
mc_scaled_8tap_regular_w32_dy2_8bpc_c: 23607.6
mc_scaled_8tap_regular_w32_dy2_8bpc_avx2: 2325.7
mc_scaled_8tap_regular_w64_8bpc_c: 54891.7
mc_scaled_8tap_regular_w64_8bpc_avx2: 8351.4
mc_scaled_8tap_regular_w64_dy1_8bpc_c: 50249.0
mc_scaled_8tap_regular_w64_dy1_8bpc_avx2: 5864.4
mc_scaled_8tap_regular_w64_dy2_8bpc_c: 79400.1
mc_scaled_8tap_regular_w64_dy2_8bpc_avx2: 8295.7
mc_scaled_8tap_regular_w128_8bpc_c: 121046.8
mc_scaled_8tap_regular_w128_8bpc_avx2: 21809.1
mc_scaled_8tap_regular_w128_dy1_8bpc_c: 133720.4
mc_scaled_8tap_regular_w128_dy1_8bpc_avx2: 16197.8
mc_scaled_8tap_regular_w128_dy2_8bpc_c: 218774.8
mc_scaled_8tap_regular_w128_dy2_8bpc_avx2: 22993.1
2020-05-28 meson: favor _aligned_malloc over posix_memalign (Steve Lhomme)
posix_memalign is defined as a built-in in gcc in msys2 but it's not available when linking with the Universal C Runtime. _aligned_malloc is available in the UCRT. That should only affect builds targeting Windows since _aligned_malloc is a MS thing.
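On the C side, the choice between the two allocators might look like the sketch below. It is illustrative only: the wrapper names `alloc_aligned_sketch` and `free_aligned_sketch` are hypothetical, and the `_WIN32` check stands in for whatever feature macro the build system actually defines after probing for `_aligned_malloc`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* exposes posix_memalign on POSIX systems */
#endif
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#ifdef _WIN32
#include <malloc.h> /* _aligned_malloc, _aligned_free */
#endif

/* Allocate size bytes aligned to align, which must be a power of two
 * and a multiple of sizeof(void *). Returns NULL on failure. */
static void *alloc_aligned_sketch(size_t size, size_t align) {
#ifdef _WIN32
    /* Note the argument order: size first, then alignment. */
    return _aligned_malloc(size, align);
#else
    void *ptr = NULL;
    if (posix_memalign(&ptr, align, size) != 0)
        return NULL;
    return ptr;
#endif
}

/* Memory from _aligned_malloc must be released with _aligned_free;
 * memory from posix_memalign is released with plain free. */
static void free_aligned_sketch(void *ptr) {
#ifdef _WIN32
    _aligned_free(ptr);
#else
    free(ptr);
#endif
}
```

Pairing each allocator with its matching release function matters: passing a pointer from `_aligned_malloc` to `free` is not valid with the Microsoft runtime.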
2020-05-26 x86: Add minor looprestoration asm optimizations (Henrik Gramner)
Eliminate store forwarding stalls. Use shorter instruction encodings where possible. Misc. tweaks.
2020-05-26 dav1dplay: use new pl_chroma_location API (Niklas Haas)
This one correctly sets the subsampling mode based on whether or not the plane is actually subsampled, and also infers PL_CHROMA_UNKNOWN as PL_CHROMA_TOP_LEFT in such cases.
2020-05-25 dav1dplay: allow resizing the window (Niklas Haas)
libplacebo v66 got helper functions that make preserving the aspect ratio in this case trivial. But we still need to make sure to clear the FBO to black if the image doesn't cover it fully.
2020-05-20 dav1dplay: don't freeze on render errors [tag: 0.7.0] (Niklas Haas)
Returning out of this function when pl_render_image() fails is the wrong thing to do, since that leaves the swapchain frame acquired but never submitted. Instead, just clear the target FBO to blank red (to make it clear that something went wrong) and continue on with presentation.
2020-05-19 Update NEWS for 0.7.0 (Jean-Baptiste Kempf)
2020-05-18 dav1dplay: support on-GPU film grain synthesis (Niklas Haas)
Annoying minor differences in this struct layout mean we can't just memcpy the entire thing. Oh well. Note: technically, PL_API_VER 33 added this API, but PL_API_VER 63 is the minimum version of libplacebo that doesn't have glaring bugs when generating chroma grain, so we require that as a minimum instead. (I tested this version on some 4:2:2 and 4:2:0, 8-bit and 10-bit grain samples I had lying around and made sure the output was identical up to differences in rounding / dithering.)
2020-05-18 dav1dplay: handle all supported csps/reprs/bitdepths (Niklas Haas)
Generalize the code to set the right pl_image metadata based on the values signaled in the Dav1dPictureParameters / Dav1dSequenceHeader. Some values are not mapped, in which case stdout will be spammed. Whatever. Hopefully somebody sees that error spam and opens a bug report for libplacebo to implement it.
2020-05-18 dav1dplay: move and simplify pl_image generation (Niklas Haas)
Having the pl_image generation live in upload_planes() rather than render() will make it easier to set the correct pl_image metadata based on the Dav1dPicture headers moving forwards. Rename the function to make more sense, semantically. Reduce some code duplication by turning per-plane fields into arrays wherever appropriate. As an aside, also apply the correct chroma location rather than hard-coding it as PL_CHROMA_LEFT.
2020-05-18 dav1dplay: don't write directly to iparams.extensions (Niklas Haas)
This is turned into a const array in upstream libplacebo, which generates warnings due to the implicit cast. Rewrite the code to have the mutable array live inside a separate variable `extensions` and only set `iparams.extensions` to this, rather than directly manipulating it.
2020-05-16 Fix swapped define guards in dav1dplay’s libplacebo renderer (Emmanuel Gil Peyrot)
Signed-off-by: Marvin Scholz <epirat07@gmail.com>
2020-05-16 Update NEWS for 0.7.0 (Jean-Baptiste Kempf)
2020-05-15 checkasm: x86: Check for stack corruption (Henrik Gramner)
Add code to check that a function doesn't accidentally overwrite anything in the area located just above the current stack frame.
2020-05-15 tools: add missing fopen error handling (Marvin Scholz)
2020-05-15 Dav1dPlay: Split placebo renderer into two (Marvin Scholz)
This allows selecting at runtime if placebo should use OpenGL or Vulkan for rendering.
2020-05-15 Dav1dPlay: Remove redundant log message (Marvin Scholz)
2020-05-15 Dav1dPlay: Remove unused renderer_info member (Marvin Scholz)
2020-05-15 Dav1dPlay: Allow runtime renderer selection (Marvin Scholz)
2020-05-15 Dav1dPlay: Fix renderer selection (Marvin Scholz)
2020-05-15 Dav1dPlay: Split renderers into different files (Marvin Scholz)
2020-05-14 Dav1dPlay: Add support for OpenGL with libplacebo (Marvin Scholz)