|
Returning out of this function when pl_render_image() fails is the wrong
thing to do, since that leaves the swapchain frame acquired but never
submitted. Instead, just clear the target FBO to blank red (to make it
clear that something went wrong) and continue on with presentation.
|
|
|
|
Annoying minor differences in this struct layout mean we can't just
memcpy the entire thing. Oh well.
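A minimal sketch of the field-by-field copy this implies — the struct names and fields here are hypothetical stand-ins, not the real Dav1dFilmGrainData / libplacebo layouts, but they show why differing field order rules out a single memcpy:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins: the two structs carry the same data but in a
 * different order, so memcpy'ing one over the other would scramble it. */
struct src_grain { int num_y_points; uint8_t y_points[14][2]; int scaling_shift; };
struct dst_grain { int scaling_shift; int num_y_points; uint8_t y_points[14][2]; };

static void copy_grain(struct dst_grain *dst, const struct src_grain *src) {
    /* Copy each field explicitly instead of memcpy'ing the whole struct */
    dst->scaling_shift = src->scaling_shift;
    dst->num_y_points  = src->num_y_points;
    memcpy(dst->y_points, src->y_points, sizeof(dst->y_points));
}
```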
Note: technically, PL_API_VER 33 added this API, but PL_API_VER 63 is
the minimum version of libplacebo that doesn't have glaring bugs when
generating chroma grain, so we require that as a minimum instead.
(I tested this version on some 4:2:2 and 4:2:0, 8-bit and 10-bit grain
samples I had lying around and made sure the output was identical up to
differences in rounding / dithering.)
|
|
Generalize the code to set the right pl_image metadata based on the
values signaled in the Dav1dPictureParameters / Dav1dSequenceHeader.
Some values are not mapped, in which case stdout will be spammed.
Whatever. Hopefully somebody sees that error spam and opens a bug report
for libplacebo to implement it.
|
|
Having the pl_image generation live in upload_planes() rather than
render() will make it easier to set the correct pl_image metadata based
on the Dav1dPicture headers moving forwards. Rename the function to make
more sense, semantically.
Reduce some code duplication by turning per-plane fields into arrays
wherever appropriate.
As an aside, also apply the correct chroma location rather than
hard-coding it as PL_CHROMA_LEFT.
|
|
This is turned into a const array in upstream libplacebo, which
generates warnings due to the implicit cast. Rewrite the code to have
the mutable array live inside a separate variable `extensions` and only
set `iparams.extensions` to this, rather than directly manipulating it.
|
|
Signed-off-by: Marvin Scholz <epirat07@gmail.com>
|
|
|
|
Add code to check that a function doesn't accidentally overwrite
anything in the area located just above the current stack frame.
|
|
|
|
This allows selecting at runtime whether libplacebo should use OpenGL
or Vulkan for rendering.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
To un-clutter the main dav1dplay.c, move the fifo to its own
file and header.
|
|
If the maximum number of arguments (currently 15) is changed into
an even number, and a function actually takes the full number of
arguments, we would have the situation where the checked spot on
the stack is at the same place as we store an inverted copy of it.
We already allocate enough space for two values though (for stack
alignment purposes, 16 bytes on arm64 and 8 bytes on arm32) so by
storing the reference in the upper half of this, the lower half of
it works as canary and isn't overwritten.
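The layout described above can be illustrated in C (this is a conceptual sketch, not the actual checkasm assembly; the struct and function names are invented for illustration). One 16-byte, stack-aligned slot holds both values: the upper half keeps the inverted reference copy, and the lower half — the part a callee overrunning its frame would hit first — serves as the canary:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the guard slot: 16 bytes, split into a canary
 * in the lower half and an inverted reference copy in the upper half. */
struct guard_slot {
    uint64_t canary;    /* lower half: clobbered first by a stack overrun */
    uint64_t reference; /* upper half: stored inverted, used for the check */
};

static int frame_was_clobbered(const struct guard_slot *g, uint64_t magic) {
    /* the reference is stored inverted, so an intact slot has canary ==
     * magic and reference == ~magic; anything else means a clobber */
    return g->canary != magic || g->reference != ~magic;
}
```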
|
|
checking for stack clobbering
|
|
checking for stack clobbering
|
|
Use 'unsigned' instead of 'unsigned int' for consistency.
Add 'const' to a few variables.
Make proper use of C99 features.
|
|
Also skip the AVX warmup.
|
|
If a function returns a float value, that value is stored in this
register.
|
|
We should just use a normal bl here, and the linker will add the 'x'
bit if necessary.
This fixes calling the checkasm_fail_func on windows, where the
code is built in thumb mode (and the linker doesn't clear the 'x'
bit in the blx instruction).
|
|
|
|
|
|
* The build from 'build-debian' is reused. 'logging' is not disabled
since that would trigger an almost full rebuild.
* All ASM tests are merged into one job which is expected to
  fail only seldom, so ease of debugging is traded for
  efficiency.
|
|
|
|
|
|
When benchmarking, the functions are called with a fixed width
of 64x32 or 32x16, while the test itself is run with a random size
in the range up to 128x32.
In 16 bpc mode, the source pixels must be within the valid range,
because they otherwise cause accesses out of bounds in the scaling
array.
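A small sketch of the bounds issue (the constants and function here are illustrative, not dav1d's actual film grain code): in high bit-depth mode the source pixel indexes a scaling table sized for the nominal bit depth, so a 16-bit sample outside [0, 2^bitdepth) would read past the end of the table unless the test generates (or clamps to) in-range values:

```c
#include <assert.h>
#include <stdint.h>

enum { BITDEPTH = 10, SCALING_SIZE = 1 << BITDEPTH };

/* Hypothetical scaling table, sized for the nominal bit depth */
static uint8_t scaling[SCALING_SIZE];

/* Clamp a 16-bit sample so it stays a valid index into the table */
static int safe_index(uint16_t px) {
    return px < SCALING_SIZE ? px : SCALING_SIZE - 1;
}
```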
|
|
Also avoid integer overflows by using 64-bit intermediate precision.
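The widening this refers to follows the usual pattern (a generic sketch, not the exact code): cast one operand to a 64-bit type before the multiply, so the product is computed at 64-bit precision instead of overflowing at 32 bits:

```c
#include <assert.h>
#include <stdint.h>

/* Multiply two 32-bit values and shift the result, using a 64-bit
 * intermediate so the product cannot overflow before the shift. */
static int32_t mul_shift(int32_t a, int32_t b, int shift) {
    return (int32_t)(((int64_t)a * b) >> shift);
}
```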
|
|
Allows for macro-op fusion.
|
|
Eliminates the x86-64 check from the meson configuration file to be
consistent with how other x86-64-exclusive code is handled.
|
|
Allows for constant propagation and tail call elimination in the
msac initialization, which is performed in each tile.
|
|
Utilize the unsigned representation of a signed integer to skip
the refill code if the count was already negative to begin with,
which saves a few clock cycles at the end of each tile.
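The underlying trick is the classic one (shown here in generic form; the actual msac refill condition differs): reinterpreting a signed value as unsigned makes any negative input compare as a huge number, so a single unsigned comparison covers both the "negative" and the range check in one branch:

```c
#include <assert.h>

/* Equivalent to (x >= 0 && x < n) in a single comparison: a negative x,
 * viewed as unsigned, wraps to a value larger than any reasonable n. */
static int in_range(int x, int n) {
    return (unsigned)x < (unsigned)n;
}
```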
|
|
Add an element size specifier to the existing individual transform
functions for 8 bpc, naming them e.g. inv_dct_8h_x8_neon, to clarify
that they operate on input vectors of 8h, and make the symbols
public, to let the 10 bpc case call them from a different object file.
The same convention is used in the new itx16.S, like inv_dct_4s_x8_neon.
Compile the existing itx.S regardless of whether 8 bpc support
is enabled. For builds with 8 bpc support disabled, this does pull in
the unused frontend functions, but that is hopefully tolerable, as it
avoids having to split the file into a sharable file for transforms
and a separate one for frontends.
This only implements the 10 bpc case, as that case can use transforms
operating on 16 bit coefficients in the second pass.
Relative speedup vs C for a few functions:
Cortex A53 A72 A73
inv_txfm_add_4x4_dct_dct_0_10bpc_neon: 4.14 4.06 4.49
inv_txfm_add_4x4_dct_dct_1_10bpc_neon: 6.51 6.49 6.42
inv_txfm_add_8x8_dct_dct_0_10bpc_neon: 5.02 4.63 6.23
inv_txfm_add_8x8_dct_dct_1_10bpc_neon: 8.54 7.13 11.96
inv_txfm_add_16x16_dct_dct_0_10bpc_neon: 5.52 6.60 8.03
inv_txfm_add_16x16_dct_dct_1_10bpc_neon: 11.27 9.62 12.22
inv_txfm_add_16x16_dct_dct_2_10bpc_neon: 9.60 6.97 8.59
inv_txfm_add_32x32_dct_dct_0_10bpc_neon: 2.60 3.48 3.19
inv_txfm_add_32x32_dct_dct_1_10bpc_neon: 14.65 12.64 16.86
inv_txfm_add_32x32_dct_dct_2_10bpc_neon: 11.57 8.80 12.68
inv_txfm_add_32x32_dct_dct_3_10bpc_neon: 8.79 8.00 9.21
inv_txfm_add_32x32_dct_dct_4_10bpc_neon: 7.58 6.21 7.80
inv_txfm_add_64x64_dct_dct_0_10bpc_neon: 2.41 2.85 2.75
inv_txfm_add_64x64_dct_dct_1_10bpc_neon: 12.91 10.27 12.24
inv_txfm_add_64x64_dct_dct_2_10bpc_neon: 10.96 7.97 10.31
inv_txfm_add_64x64_dct_dct_3_10bpc_neon: 8.95 7.42 9.55
inv_txfm_add_64x64_dct_dct_4_10bpc_neon: 7.97 6.12 7.82
|
|
This matches what is done in C by -fvisibility=hidden.
This avoids issues with relocations against other symbols exported
from another assembly file.
|
|
|
|
|
|
|
|
Before this, we never did the early exit from the first pass.
Before: Cortex A53 A72 A73
inv_txfm_add_64x16_dct_dct_1_8bpc_neon: 7275.7 5198.3 5250.9
inv_txfm_add_64x16_dct_dct_2_8bpc_neon: 7276.1 5197.0 5251.3
inv_txfm_add_64x16_dct_dct_3_8bpc_neon: 7275.8 5196.2 5254.5
inv_txfm_add_64x16_dct_dct_4_8bpc_neon: 7273.6 5198.8 5254.2
After:
inv_txfm_add_64x16_dct_dct_1_8bpc_neon: 5187.8 3763.8 3735.0
inv_txfm_add_64x16_dct_dct_2_8bpc_neon: 7280.6 5185.6 5256.3
inv_txfm_add_64x16_dct_dct_3_8bpc_neon: 7270.7 5179.8 5250.3
inv_txfm_add_64x16_dct_dct_4_8bpc_neon: 7271.7 5212.4 5256.4
The other related variants didn't have this bug and properly exited
early when possible.
|
|
Unify some loads and stores, avoiding some extra pointer moving.
|
|
This gives a couple of cycles' speedup.
|
|
|
|
This isn't used for a sqrdmulh in its current form here.
The one left in idct_coeffs[1] isn't used within the idct itself,
but inv_txfm_horz_scale_dct_32x8 relies on it being left there for
use with sqrdmulh scaling later.
|
|
These cases were removed from x86 to save space and simplify the code
in e0b88bd2b2c97a2695edcc498485e1cb3003e7f1, as those cases
were essentially unused in real world bitstreams.
|
|
The macro became unused in 9f084b0d2.
|
|
On windows and darwin (and modern android), the x18 register is reserved
and shouldn't be modified by user code, while it is freely available on
linux. Strictly avoid it, to keep the assembly code portable.
|
|
|