github.com/videolan/dav1d.git

Age	Commit message	Author
2020-06-21	Update NEWS for 0.7.1 (tag: 0.7.1)	Jean-Baptiste Kempf

2020-06-20	Extract y related operations out of warp_affine inner loop	Luc Trudeau

2020-06-19	x86: Branch before waiting on popcnt in ipred_z AVX2 functions	Henrik Gramner
	Some specific Haswell CPUs have a hardware bug where the popcnt instruction doesn't set the zero flag correctly, which causes the wrong branch to be taken. popcnt also has a 3-cycle latency on Intel CPUs, so doing the branch on the input value instead of the output reduces the amount of time wasted going down the wrong code path in case of branch mispredictions.
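	The commit itself changes AVX2 assembly, but the idea can be sketched in plain C. The following is only an illustration under the assumption that the branch tests for a zero population count; the function names are hypothetical and do not appear in dav1d. Because the population count is zero exactly when the input is zero, the branch can test the input directly and no longer has to wait for the count.

```c
#include <stdint.h>

/* Portable population count (clears the lowest set bit per iteration). */
unsigned popcount32(uint32_t x) {
    unsigned n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Hypothetical "before" shape: the branch depends on the popcount result,
 * so it cannot resolve until the count has been computed. */
int branch_on_output(uint32_t mask) {
    const unsigned n = popcount32(mask);
    if (n == 0)
        return -1; /* empty-mask path */
    return (int)n;
}

/* Hypothetical "after" shape: the branch depends only on the input value.
 * popcount(mask) == 0 exactly when mask == 0, so the result is the same. */
int branch_on_input(uint32_t mask) {
    if (mask == 0)
        return -1; /* empty-mask path */
    return (int)popcount32(mask);
}
```

	Both functions return the same value for every input; only the data dependency of the branch differs.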
2020-06-19	arm32: Add a NEON implementation of MSAC	Martin Storsjö
	Only use this in the cases when NEON can be used unconditionally without runtime detection (when __ARM_NEON is defined). The speedup over the C code is very modest for the smaller functions (and the NEON version actually is a little slower than the C code on Cortex A7 for adapt4), but the speedup is around 2x for adapt16.
	
	Cortex                                 A7     A8     A9    A53    A72    A73
	msac_decode_bool_c:                  41.1   43.0   43.0   37.3   26.2   31.3
	msac_decode_bool_neon:               40.2   42.0   37.2   32.8   19.9   25.5
	msac_decode_bool_adapt_c:            65.1   70.4   58.5   54.3   33.2   40.8
	msac_decode_bool_adapt_neon:         56.8   52.4   49.3   42.6   27.1   33.7
	msac_decode_bool_equi_c:             36.9   37.2   42.8   32.6   22.7   42.3
	msac_decode_bool_equi_neon:          34.9   35.1   36.4   29.7   19.5   36.4
	msac_decode_symbol_adapt4_c:        114.2  139.0  111.6   99.9   65.5   83.5
	msac_decode_symbol_adapt4_neon:     119.2  128.3   95.7   82.2   58.2   57.5
	msac_decode_symbol_adapt8_c:        176.0  207.9  164.0  154.4   88.0  117.0
	msac_decode_symbol_adapt8_neon:     128.3  130.3  110.7   85.1   59.9   61.4
	msac_decode_symbol_adapt16_c:       292.1  320.5  256.4  246.4  129.1  173.3
	msac_decode_symbol_adapt16_neon:    162.2  144.3  129.0  104.2   69.2   69.9
	
	(Omitting msac_decode_hi_tok from the benchmark, as the "C" version measured there uses the NEON version of msac_decode_symbol_adapt4.)
2020-06-18	arm64: msac: Add a special cased implementation of decode_hi_tok	Martin Storsjö
	The speedup (over the normal version, that just calls the existing assembly version of symbol_adapt4) is not very impressive on bigger cores, but looks decent on small cores. It's an improvement though, in any case.
	
	Cortex                       A53    A72    A73
	msac_decode_hi_tok_c:      175.7  136.2  138.1
	msac_decode_hi_tok_neon:   146.8  129.4  125.9
2020-06-18	arm64: msac: Use a narrower vector length for adapt4 in one place	Martin Storsjö

2020-06-18	arm64: msac: Clarify the register use in one macro	Martin Storsjö
	Include the letter prefix when calling the macro, making it slightly less obscure.
2020-06-18	cli: Avoid large intermediates in the windows get_time_nanos	Martin Storsjö
	By multiplying the performance counter value (within its own time base) by the intended target time base, and only then dividing, we reduce the available numeric range by the factor of the original time base times the new time base. On Windows 10 on ARM64, the performance counter frequency is 19200000 (on x86_64 in a virtual machine, it's 10000000), making the calculation overflow every (1 << 64) / (19200000 * 1000000000) = 960 seconds, i.e. 16 minutes - long before the actual uint64_t nanosecond return value wraps around.
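	One common way to avoid the large intermediate is to split the counter value into whole seconds and a sub-second remainder before scaling. The sketch below shows that technique on plain integers; it is not copied from the dav1d source, and the function names are hypothetical. On Windows the inputs would come from QueryPerformanceCounter and QueryPerformanceFrequency.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a counter reading to nanoseconds without
 * forming the product ticks * 1e9. */
uint64_t ticks_to_nanos(uint64_t ticks, uint64_t freq) {
    assert(freq != 0);
    const uint64_t seconds   = ticks / freq; /* whole seconds */
    const uint64_t fractions = ticks % freq; /* leftover ticks, always < freq */
    /* fractions * 1e9 is below freq * 1e9, which fits in 64 bits for any
     * frequency under roughly 1.8e10 Hz. */
    return seconds * UINT64_C(1000000000) +
           fractions * UINT64_C(1000000000) / freq;
}

/* Naive form, for comparison only: the product wraps around once ticks
 * exceeds UINT64_MAX / 1e9, i.e. after about 960 s at 19.2 MHz. */
uint64_t ticks_to_nanos_naive(uint64_t ticks, uint64_t freq) {
    assert(freq != 0);
    return ticks * UINT64_C(1000000000) / freq;
}
```

	With a 19200000 Hz counter, a reading taken 1000 seconds in is already past the point where the naive product wraps, while the split form still returns the exact value.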
2020-06-18	cli: Get the elapsed time if printing progress, regardless of the fps value	Martin Storsjö
	Even if we don't want to throttle decoding to realtime, and even if the file itself didn't contain a valid fps value, we may want to call the synchronize function to fetch the current elapsed decoding time, for displaying the fps value.
2020-06-18	Update NEWS for 0.7.1	Jean-Baptiste Kempf

2020-06-18	x86: Add put/prep_bilin_scaled AVX2 asm	Victorien Le Couviour--Tuffet
	Since bilin scaled is very rarely used, add a new table entry to mc_subpel_filters and jump to the put/prep_8tap_scaled code. AVX2 performance is the same as for the 8tap code, but the speedup is much smaller because the C code is a true bilinear code path that gets auto-vectorized. Even so, the AVX2 version is always faster.
2020-06-18	x86: Add prep_8tap_scaled AVX2 asm	Victorien Le Couviour--Tuffet
	mct_scaled_8tap_regular_w4_8bpc_c: 872.1
	mct_scaled_8tap_regular_w4_8bpc_avx2: 125.6
	mct_scaled_8tap_regular_w4_dy1_8bpc_c: 886.3
	mct_scaled_8tap_regular_w4_dy1_8bpc_avx2: 84.0
	mct_scaled_8tap_regular_w4_dy2_8bpc_c: 1189.1
	mct_scaled_8tap_regular_w4_dy2_8bpc_avx2: 84.7
	mct_scaled_8tap_regular_w8_8bpc_c: 2261.0
	mct_scaled_8tap_regular_w8_8bpc_avx2: 306.2
	mct_scaled_8tap_regular_w8_dy1_8bpc_c: 2189.9
	mct_scaled_8tap_regular_w8_dy1_8bpc_avx2: 233.8
	mct_scaled_8tap_regular_w8_dy2_8bpc_c: 3060.3
	mct_scaled_8tap_regular_w8_dy2_8bpc_avx2: 282.8
	mct_scaled_8tap_regular_w16_8bpc_c: 4335.3
	mct_scaled_8tap_regular_w16_8bpc_avx2: 680.7
	mct_scaled_8tap_regular_w16_dy1_8bpc_c: 5137.2
	mct_scaled_8tap_regular_w16_dy1_8bpc_avx2: 578.6
	mct_scaled_8tap_regular_w16_dy2_8bpc_c: 7878.4
	mct_scaled_8tap_regular_w16_dy2_8bpc_avx2: 774.6
	mct_scaled_8tap_regular_w32_8bpc_c: 17871.9
	mct_scaled_8tap_regular_w32_8bpc_avx2: 2954.8
	mct_scaled_8tap_regular_w32_dy1_8bpc_c: 18594.7
	mct_scaled_8tap_regular_w32_dy1_8bpc_avx2: 2073.9
	mct_scaled_8tap_regular_w32_dy2_8bpc_c: 28696.0
	mct_scaled_8tap_regular_w32_dy2_8bpc_avx2: 2852.1
	mct_scaled_8tap_regular_w64_8bpc_c: 46967.5
	mct_scaled_8tap_regular_w64_8bpc_avx2: 7527.5
	mct_scaled_8tap_regular_w64_dy1_8bpc_c: 45564.2
	mct_scaled_8tap_regular_w64_dy1_8bpc_avx2: 5262.9
	mct_scaled_8tap_regular_w64_dy2_8bpc_c: 72793.3
	mct_scaled_8tap_regular_w64_dy2_8bpc_avx2: 7535.9
	mct_scaled_8tap_regular_w128_8bpc_c: 111190.8
	mct_scaled_8tap_regular_w128_8bpc_avx2: 19386.8
	mct_scaled_8tap_regular_w128_dy1_8bpc_c: 122625.0
	mct_scaled_8tap_regular_w128_dy1_8bpc_avx2: 15376.1
	mct_scaled_8tap_regular_w128_dy2_8bpc_c: 197120.6
	mct_scaled_8tap_regular_w128_dy2_8bpc_avx2: 21871.0
2020-06-17	Clean up fraction calculation	Colin Lee

2020-06-17	Add clamping back to mv projection	Colin Lee
	Clamping in the motion vector projection calculation is required by the spec. A rewrite of the function in commit aca57bf3db00c29e90605656f1015561d1d67c2d omitted the clamping. This commit re-adds it.
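	A rough sketch of what a clamped projection of one motion vector component can look like follows. It is not the dav1d function. The clamp range of -(1 << 14) + 1 to (1 << 14) - 1, the 14-bit rounding, and the use of 16384 / den in place of the spec's division lookup table are assumptions based on a reading of the AV1 specification and should be checked against it; all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp v to the inclusive range [lo, hi]. */
int clip3(int lo, int hi, int v) {
    return v < lo ? lo : v > hi ? hi : v;
}

/* Hypothetical helper: scale one motion vector component by num / den in
 * 14-bit fixed point, round symmetrically around zero, then clamp. */
int project_mv_component(int mv, int num, int den) {
    assert(den > 0 && den < 32);
    assert(num > -32 && num < 32);
    /* Assumption: 16384 / den stands in for the spec's lookup table. */
    const int64_t prod = (int64_t)mv * num * (16384 / den);
    const int64_t mag = prod < 0 ? -prod : prod;
    const int64_t rounded = (mag + 8192) >> 14;
    const int scaled = (int)(prod < 0 ? -rounded : rounded);
    /* The step the commit restores: without it, large inputs produce
     * values outside the assumed valid range. */
    return clip3(-(1 << 14) + 1, (1 << 14) - 1, scaled);
}
```

	Small vectors pass through unchanged, while a component such as 16000 scaled by 2 saturates at the upper bound instead of doubling.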
2020-06-16	arm64: itx: Simplify and clarify the sub_sp macro a little bit	Martin Storsjö
	Add an .error case for Windows if subtracting more than 8 KB, and simplify the generic subtraction case.
2020-06-16	arm: itx: Add NEON implementation of itx for 8 bpc	Martin Storsjö
	The transforms process vectors of up to 8 elements at a time, for transforms up to size 8; for larger transforms, it uses vectors of 4 elements. Overall, the speedup over C code seems to be around 8-14x for the larger transforms, and 10-19x for the smaller ones.
	
	Relative speedup over C code (built with GCC 7.5) for a few functions:
	
	Cortex                                       A7     A8     A9    A53    A72    A73
	inv_txfm_add_4x4_dct_dct_0_8bpc_neon:      3.83   3.42   2.57   3.36   2.97   7.47
	inv_txfm_add_4x4_dct_dct_1_8bpc_neon:      7.25  13.53   8.38   8.82   7.96  12.37
	inv_txfm_add_8x8_dct_dct_0_8bpc_neon:      4.78   6.61   4.82   4.65   5.27   9.76
	inv_txfm_add_8x8_dct_dct_1_8bpc_neon:     10.20  19.07  13.07  14.69  11.45  15.50
	inv_txfm_add_16x16_dct_dct_0_8bpc_neon:    4.26   5.06   3.00   3.74   4.05   4.49
	inv_txfm_add_16x16_dct_dct_1_8bpc_neon:   10.51  16.02  13.57  14.03  12.86  18.16
	inv_txfm_add_16x16_dct_dct_2_8bpc_neon:    7.95  11.75   9.09  10.64  10.06  14.07
	inv_txfm_add_32x32_dct_dct_0_8bpc_neon:    5.31   5.58   3.14   4.18   4.80   4.57
	inv_txfm_add_32x32_dct_dct_1_8bpc_neon:   12.66  16.07  14.34  16.00  15.24  21.32
	inv_txfm_add_32x32_dct_dct_4_8bpc_neon:    8.25  10.69   8.90  10.59  10.41  14.39
	inv_txfm_add_64x64_dct_dct_0_8bpc_neon:    4.69   5.97   3.17   3.96   4.57   4.34
	inv_txfm_add_64x64_dct_dct_1_8bpc_neon:   11.47  12.68  10.18  14.73  14.20  17.95
	inv_txfm_add_64x64_dct_dct_4_8bpc_neon:    8.84  10.13   7.94  11.25  10.58  13.88
2020-06-11	meson: Use dav1d_src_root everywhere for consistency	Matthias Dressel

2020-06-11	Remove redundant memset in itx DSP initialization	Henrik Gramner
	The struct is already zero-initialized when the function is called except for the checkasm test, so move the zeroing there instead.
2020-06-11	meson: Make docs generation subproject-safe	Matthias Dressel
	meson.source_root() returns the root of a parent project if dav1d is embedded as a subproject.
2020-06-11	x86: Adapt SSSE3 prep_8tap to SSE2	Victorien Le Couviour--Tuffet
	---------------------
	x86_64:
	------------------------------------------
	mct_8tap_regular_w4_h_8bpc_c: 302.3
	mct_8tap_regular_w4_h_8bpc_sse2: 47.3
	mct_8tap_regular_w4_h_8bpc_ssse3: 19.5
	---------------------
	mct_8tap_regular_w8_h_8bpc_c: 745.5
	mct_8tap_regular_w8_h_8bpc_sse2: 235.2
	mct_8tap_regular_w8_h_8bpc_ssse3: 70.4
	---------------------
	mct_8tap_regular_w16_h_8bpc_c: 1844.3
	mct_8tap_regular_w16_h_8bpc_sse2: 755.6
	mct_8tap_regular_w16_h_8bpc_ssse3: 225.9
	---------------------
	mct_8tap_regular_w32_h_8bpc_c: 6685.5
	mct_8tap_regular_w32_h_8bpc_sse2: 2954.4
	mct_8tap_regular_w32_h_8bpc_ssse3: 795.8
	---------------------
	mct_8tap_regular_w64_h_8bpc_c: 15633.5
	mct_8tap_regular_w64_h_8bpc_sse2: 7120.4
	mct_8tap_regular_w64_h_8bpc_ssse3: 1900.4
	---------------------
	mct_8tap_regular_w128_h_8bpc_c: 37772.1
	mct_8tap_regular_w128_h_8bpc_sse2: 17698.1
	mct_8tap_regular_w128_h_8bpc_ssse3: 4665.5
	------------------------------------------
	mct_8tap_regular_w4_v_8bpc_c: 306.5
	mct_8tap_regular_w4_v_8bpc_sse2: 71.7
	mct_8tap_regular_w4_v_8bpc_ssse3: 37.9
	---------------------
	mct_8tap_regular_w8_v_8bpc_c: 923.3
	mct_8tap_regular_w8_v_8bpc_sse2: 168.7
	mct_8tap_regular_w8_v_8bpc_ssse3: 71.3
	---------------------
	mct_8tap_regular_w16_v_8bpc_c: 3040.1
	mct_8tap_regular_w16_v_8bpc_sse2: 505.1
	mct_8tap_regular_w16_v_8bpc_ssse3: 199.7
	---------------------
	mct_8tap_regular_w32_v_8bpc_c: 12354.8
	mct_8tap_regular_w32_v_8bpc_sse2: 1942.0
	mct_8tap_regular_w32_v_8bpc_ssse3: 714.2
	---------------------
	mct_8tap_regular_w64_v_8bpc_c: 29427.9
	mct_8tap_regular_w64_v_8bpc_sse2: 4637.4
	mct_8tap_regular_w64_v_8bpc_ssse3: 1829.2
	---------------------
	mct_8tap_regular_w128_v_8bpc_c: 72756.9
	mct_8tap_regular_w128_v_8bpc_sse2: 11301.0
	mct_8tap_regular_w128_v_8bpc_ssse3: 5020.6
	------------------------------------------
	mct_8tap_regular_w4_hv_8bpc_c: 876.9
	mct_8tap_regular_w4_hv_8bpc_sse2: 171.7
	mct_8tap_regular_w4_hv_8bpc_ssse3: 112.2
	---------------------
	mct_8tap_regular_w8_hv_8bpc_c: 2215.1
	mct_8tap_regular_w8_hv_8bpc_sse2: 730.2
	mct_8tap_regular_w8_hv_8bpc_ssse3: 330.9
	---------------------
	mct_8tap_regular_w16_hv_8bpc_c: 6075.5
	mct_8tap_regular_w16_hv_8bpc_sse2: 2252.1
	mct_8tap_regular_w16_hv_8bpc_ssse3: 973.4
	---------------------
	mct_8tap_regular_w32_hv_8bpc_c: 22182.7
	mct_8tap_regular_w32_hv_8bpc_sse2: 7692.6
	mct_8tap_regular_w32_hv_8bpc_ssse3: 3599.8
	---------------------
	mct_8tap_regular_w64_hv_8bpc_c: 50876.8
	mct_8tap_regular_w64_hv_8bpc_sse2: 18499.6
	mct_8tap_regular_w64_hv_8bpc_ssse3: 8815.6
	---------------------
	mct_8tap_regular_w128_hv_8bpc_c: 122926.3
	mct_8tap_regular_w128_hv_8bpc_sse2: 45120.0
	mct_8tap_regular_w128_hv_8bpc_ssse3: 22085.7
	------------------------------------------
2020-06-11	x86: Adapt SSSE3 prep_bilin to SSE2	Victorien Le Couviour--Tuffet
	---------------------
	x86_64:
	------------------------------------------
	mct_bilinear_w4_h_8bpc_c: 98.9
	mct_bilinear_w4_h_8bpc_sse2: 30.2
	mct_bilinear_w4_h_8bpc_ssse3: 11.5
	---------------------
	mct_bilinear_w8_h_8bpc_c: 175.3
	mct_bilinear_w8_h_8bpc_sse2: 57.0
	mct_bilinear_w8_h_8bpc_ssse3: 19.7
	---------------------
	mct_bilinear_w16_h_8bpc_c: 396.2
	mct_bilinear_w16_h_8bpc_sse2: 179.3
	mct_bilinear_w16_h_8bpc_ssse3: 50.9
	---------------------
	mct_bilinear_w32_h_8bpc_c: 1311.2
	mct_bilinear_w32_h_8bpc_sse2: 718.8
	mct_bilinear_w32_h_8bpc_ssse3: 243.9
	---------------------
	mct_bilinear_w64_h_8bpc_c: 2892.7
	mct_bilinear_w64_h_8bpc_sse2: 1746.0
	mct_bilinear_w64_h_8bpc_ssse3: 568.0
	---------------------
	mct_bilinear_w128_h_8bpc_c: 7192.6
	mct_bilinear_w128_h_8bpc_sse2: 4339.8
	mct_bilinear_w128_h_8bpc_ssse3: 1619.2
	------------------------------------------
	mct_bilinear_w4_v_8bpc_c: 129.7
	mct_bilinear_w4_v_8bpc_sse2: 26.6
	mct_bilinear_w4_v_8bpc_ssse3: 16.7
	---------------------
	mct_bilinear_w8_v_8bpc_c: 233.3
	mct_bilinear_w8_v_8bpc_sse2: 55.0
	mct_bilinear_w8_v_8bpc_ssse3: 24.7
	---------------------
	mct_bilinear_w16_v_8bpc_c: 498.9
	mct_bilinear_w16_v_8bpc_sse2: 146.0
	mct_bilinear_w16_v_8bpc_ssse3: 54.2
	---------------------
	mct_bilinear_w32_v_8bpc_c: 1562.2
	mct_bilinear_w32_v_8bpc_sse2: 560.6
	mct_bilinear_w32_v_8bpc_ssse3: 201.0
	---------------------
	mct_bilinear_w64_v_8bpc_c: 3221.3
	mct_bilinear_w64_v_8bpc_sse2: 1380.6
	mct_bilinear_w64_v_8bpc_ssse3: 499.3
	---------------------
	mct_bilinear_w128_v_8bpc_c: 7357.7
	mct_bilinear_w128_v_8bpc_sse2: 3439.0
	mct_bilinear_w128_v_8bpc_ssse3: 1489.1
	------------------------------------------
	mct_bilinear_w4_hv_8bpc_c: 185.0
	mct_bilinear_w4_hv_8bpc_sse2: 54.5
	mct_bilinear_w4_hv_8bpc_ssse3: 22.1
	---------------------
	mct_bilinear_w8_hv_8bpc_c: 377.8
	mct_bilinear_w8_hv_8bpc_sse2: 104.3
	mct_bilinear_w8_hv_8bpc_ssse3: 35.8
	---------------------
	mct_bilinear_w16_hv_8bpc_c: 1159.4
	mct_bilinear_w16_hv_8bpc_sse2: 311.0
	mct_bilinear_w16_hv_8bpc_ssse3: 106.3
	---------------------
	mct_bilinear_w32_hv_8bpc_c: 4436.2
	mct_bilinear_w32_hv_8bpc_sse2: 1230.7
	mct_bilinear_w32_hv_8bpc_ssse3: 400.7
	---------------------
	mct_bilinear_w64_hv_8bpc_c: 10627.7
	mct_bilinear_w64_hv_8bpc_sse2: 2934.2
	mct_bilinear_w64_hv_8bpc_ssse3: 957.2
	---------------------
	mct_bilinear_w128_hv_8bpc_c: 26048.9
	mct_bilinear_w128_hv_8bpc_sse2: 7590.3
	mct_bilinear_w128_hv_8bpc_ssse3: 2947.0
	------------------------------------------
2020-06-10	arm64: itx16: Add a missed eob check in the 16x8 transform	Martin Storsjö
	This allows skipping half of the first transforms if the input coefficients lie within the upper 4x4 (but checkasm only tests in increments of 8x8 at the moment). With checkasm modified to test in smaller increments, the speedup is like this:
	
	Before:                          Cortex    A53    A72    A73
	inv_txfm_add_16x8_dct_dct_1_10bpc_neon:  874.4  709.0  707.3
	After:
	inv_txfm_add_16x8_dct_dct_1_10bpc_neon:  618.0  479.5  472.9
2020-06-10	arm64: itx16: Remove a leftover unused macro parameter	Martin Storsjö

2020-06-09	x86: Fix compiler warnings when using nasm 2.15	Henrik Gramner

2020-06-09	Avoid compiling logging functions when logging is disabled	Henrik Gramner

2020-06-07	CI: Enable coverage reports	Niklas Haas
	Blacklisted some files not directly relevant to the codebase (such as tests, tools and debugging functions). The coverage HTML report gets attached as a build artifact, although unfortunately we can't link directly to the `index.html`. We also attach the coverage XML as a cobertura report, although I'm not sure if it does anything.
2020-06-04	Range of operating point is 0 - 31, not 0 - 32	Wan-Teh Chang

2020-06-04	arm: Add an export parameter to the const macro	Martin Storsjö
	This is currently not used in dav1d (yet), but there's a need for it in rav1e, which shares this header with dav1d.
2020-06-01	x86: Add put_8tap_scaled AVX2 asm	Victorien Le Couviour--Tuffet
	mc_scaled_8tap_regular_w2_8bpc_c: 764.4
	mc_scaled_8tap_regular_w2_8bpc_avx2: 191.3
	mc_scaled_8tap_regular_w2_dy1_8bpc_c: 705.8
	mc_scaled_8tap_regular_w2_dy1_8bpc_avx2: 89.5
	mc_scaled_8tap_regular_w2_dy2_8bpc_c: 964.0
	mc_scaled_8tap_regular_w2_dy2_8bpc_avx2: 120.3
	mc_scaled_8tap_regular_w4_8bpc_c: 1355.7
	mc_scaled_8tap_regular_w4_8bpc_avx2: 180.9
	mc_scaled_8tap_regular_w4_dy1_8bpc_c: 1233.2
	mc_scaled_8tap_regular_w4_dy1_8bpc_avx2: 115.3
	mc_scaled_8tap_regular_w4_dy2_8bpc_c: 1707.6
	mc_scaled_8tap_regular_w4_dy2_8bpc_avx2: 117.9
	mc_scaled_8tap_regular_w8_8bpc_c: 2483.2
	mc_scaled_8tap_regular_w8_8bpc_avx2: 294.8
	mc_scaled_8tap_regular_w8_dy1_8bpc_c: 2166.4
	mc_scaled_8tap_regular_w8_dy1_8bpc_avx2: 222.0
	mc_scaled_8tap_regular_w8_dy2_8bpc_c: 3133.7
	mc_scaled_8tap_regular_w8_dy2_8bpc_avx2: 292.6
	mc_scaled_8tap_regular_w16_8bpc_c: 5239.2
	mc_scaled_8tap_regular_w16_8bpc_avx2: 729.9
	mc_scaled_8tap_regular_w16_dy1_8bpc_c: 5156.5
	mc_scaled_8tap_regular_w16_dy1_8bpc_avx2: 602.2
	mc_scaled_8tap_regular_w16_dy2_8bpc_c: 8018.4
	mc_scaled_8tap_regular_w16_dy2_8bpc_avx2: 783.1
	mc_scaled_8tap_regular_w32_8bpc_c: 14745.0
	mc_scaled_8tap_regular_w32_8bpc_avx2: 2205.0
	mc_scaled_8tap_regular_w32_dy1_8bpc_c: 14862.3
	mc_scaled_8tap_regular_w32_dy1_8bpc_avx2: 1721.3
	mc_scaled_8tap_regular_w32_dy2_8bpc_c: 23607.6
	mc_scaled_8tap_regular_w32_dy2_8bpc_avx2: 2325.7
	mc_scaled_8tap_regular_w64_8bpc_c: 54891.7
	mc_scaled_8tap_regular_w64_8bpc_avx2: 8351.4
	mc_scaled_8tap_regular_w64_dy1_8bpc_c: 50249.0
	mc_scaled_8tap_regular_w64_dy1_8bpc_avx2: 5864.4
	mc_scaled_8tap_regular_w64_dy2_8bpc_c: 79400.1
	mc_scaled_8tap_regular_w64_dy2_8bpc_avx2: 8295.7
	mc_scaled_8tap_regular_w128_8bpc_c: 121046.8
	mc_scaled_8tap_regular_w128_8bpc_avx2: 21809.1
	mc_scaled_8tap_regular_w128_dy1_8bpc_c: 133720.4
	mc_scaled_8tap_regular_w128_dy1_8bpc_avx2: 16197.8
	mc_scaled_8tap_regular_w128_dy2_8bpc_c: 218774.8
	mc_scaled_8tap_regular_w128_dy2_8bpc_avx2: 22993.1
2020-05-28	meson: favor _aligned_malloc over posix_memalign	Steve Lhomme
	posix_memalign is defined as a built-in in gcc in msys2 but it's not available when linking with the Universal C Runtime. _aligned_malloc is available in the UCRT. That should only affect builds targeting Windows since _aligned_malloc is a MS thing.
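	On the C side, the preference order described above might look like the following sketch. It is an illustration only: HAVE_ALIGNED_MALLOC is assumed to be a macro set by the build system when _aligned_malloc is available, and the wrapper names are hypothetical. The point is that the _aligned_malloc branch is tested first, so posix_memalign is used only when the former is absent.

```c
#define _POSIX_C_SOURCE 200112L /* exposes posix_memalign on POSIX systems */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#if defined(HAVE_ALIGNED_MALLOC) || defined(_WIN32)
#include <malloc.h>
#endif

/* Hypothetical wrapper: allocate size bytes aligned to align, which must
 * be a power of two and at least sizeof(void *). Returns NULL on failure. */
void *alloc_aligned(size_t size, size_t align) {
    assert(align >= sizeof(void *) && (align & (align - 1)) == 0);
#if defined(HAVE_ALIGNED_MALLOC) || defined(_WIN32)
    /* Preferred: provided by the Microsoft C runtime, including the UCRT. */
    return _aligned_malloc(size, align);
#else
    void *ptr;
    if (posix_memalign(&ptr, align, size))
        return NULL;
    return ptr;
#endif
}

/* Memory from _aligned_malloc must be released with _aligned_free. */
void free_aligned(void *ptr) {
#if defined(HAVE_ALIGNED_MALLOC) || defined(_WIN32)
    _aligned_free(ptr);
#else
    free(ptr);
#endif
}
```

	Pairing each allocator with its matching free function matters, since memory returned by _aligned_malloc cannot be passed to free.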
2020-05-26	x86: Add minor looprestoration asm optimizations	Henrik Gramner
	Eliminate store forwarding stalls. Use shorter instruction encodings where possible. Misc. tweaks.
2020-05-26	dav1dplay: use new pl_chroma_location API	Niklas Haas
	This one correctly sets the subsampling mode based on whether or not the plane is actually subsampled, and also infers PL_CHROMA_UNKNOWN as PL_CHROMA_TOP_LEFT in such cases.
2020-05-25	dav1dplay: allow resizing the window	Niklas Haas
	libplacebo v66 got helper functions that make preserving the aspect ratio in this case trivial. But we still need to make sure to clear the FBO to black if the image doesn't cover it fully.
2020-05-20	dav1dplay: don't freeze on render errors (tag: 0.7.0)	Niklas Haas
	Returning out of this function when pl_render_image() fails is the wrong thing to do, since that leaves the swapchain frame acquired but never submitted. Instead, just clear the target FBO to blank red (to make it clear that something went wrong) and continue on with presentation.
2020-05-19	Update NEWS for 0.7.0	Jean-Baptiste Kempf

2020-05-18	dav1dplay: support on-GPU film grain synthesis	Niklas Haas
	Annoying minor differences in this struct layout mean we can't just memcpy the entire thing. Oh well. Note: technically, PL_API_VER 33 added this API, but PL_API_VER 63 is the minimum version of libplacebo that doesn't have glaring bugs when generating chroma grain, so we require that as a minimum instead. (I tested this version on some 4:2:2 and 4:2:0, 8-bit and 10-bit grain samples I had lying around and made sure the output was identical up to differences in rounding / dithering.)
2020-05-18	dav1dplay: handle all supported csps/reprs/bitdepths	Niklas Haas
	Generalize the code to set the right pl_image metadata based on the values signaled in the Dav1dPictureParameters / Dav1dSequenceHeader. Some values are not mapped, in which case stdout will be spammed. Whatever. Hopefully somebody sees that error spam and opens a bug report for libplacebo to implement it.
2020-05-18	dav1dplay: move and simplify pl_image generation	Niklas Haas
	Having the pl_image generation live in upload_planes() rather than render() will make it easier to set the correct pl_image metadata based on the Dav1dPicture headers moving forwards. Rename the function to make more sense, semantically. Reduce some code duplication by turning per-plane fields into arrays wherever appropriate. As an aside, also apply the correct chroma location rather than hard-coding it as PL_CHROMA_LEFT.
2020-05-18	dav1dplay: don't write directly to iparams.extensions	Niklas Haas
	This is turned into a const array in upstream libplacebo, which generates warnings due to the implicit cast. Rewrite the code to have the mutable array live inside a separate variable `extensions` and only set `iparams.extensions` to this, rather than directly manipulating it.
2020-05-16	Fix swapped define guards in dav1dplay’s libplacebo renderer	Emmanuel Gil Peyrot
	Signed-off-by: Marvin Scholz <epirat07@gmail.com>
2020-05-16	Update NEWS for 0.7.0	Jean-Baptiste Kempf

2020-05-15	checkasm: x86: Check for stack corruption	Henrik Gramner
	Add code to check that a function doesn't accidentally overwrite anything in the area located just above the current stack frame.
2020-05-15	tools: add missing fopen error handling	Marvin Scholz

2020-05-15	Dav1dPlay: Split placebo renderer into two	Marvin Scholz
	This allows selecting at runtime if placebo should use OpenGL or Vulkan for rendering.
2020-05-15	Dav1dPlay: Remove redundant log message	Marvin Scholz

2020-05-15	Dav1dPlay: Remove unused renderer_info member	Marvin Scholz

2020-05-15	Dav1dPlay: Allow runtime renderer selection	Marvin Scholz

2020-05-15	Dav1dPlay: Fix renderer selection	Marvin Scholz

2020-05-15	Dav1dPlay: Split renderers into different files	Marvin Scholz

2020-05-14	Dav1dPlay: Add support for OpenGL with libplacebo	Marvin Scholz