
github.com/videolan/dav1d.git
Age | Commit message | Author
2020-05-20 | dav1dplay: don't freeze on render errors [0.7.0] | Niklas Haas
Returning out of this function when pl_render_image() fails is the wrong thing to do, since that leaves the swapchain frame acquired but never submitted. Instead, just clear the target FBO to blank red (to make it clear that something went wrong) and continue on with presentation.
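A minimal sketch of the control flow described above, in C. The `frame` struct and the `render_image()` / `present_frame()` helpers are hypothetical stand-ins, not libplacebo or dav1dplay APIs; the only point illustrated is that a failed render falls through to submission instead of returning early.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a swapchain frame; not a libplacebo type. */
struct frame {
    bool acquired;   /* frame has been acquired from the swapchain */
    bool submitted;  /* frame has been handed back for presentation */
    float color[4];  /* RGBA contents, reduced to a single colour */
};

/* Hypothetical renderer that fails on demand so the error path can be exercised. */
bool render_image(struct frame *f, bool simulate_failure)
{
    if (simulate_failure)
        return false;
    f->color[0] = f->color[1] = f->color[2] = 0.5f;
    f->color[3] = 1.0f;
    return true;
}

void present_frame(struct frame *f, bool simulate_failure)
{
    assert(f != NULL);
    f->acquired = true;
    f->submitted = false;

    if (!render_image(f, simulate_failure)) {
        /* No early return here: clear to opaque red and carry on. */
        static const float red[4] = { 1.0f, 0.0f, 0.0f, 1.0f };
        for (int i = 0; i < 4; i++)
            f->color[i] = red[i];
    }

    /* Always reached, so an acquired frame is always submitted. */
    f->submitted = true;
    f->acquired = false;
}
```

Calling `present_frame(&f, true)` leaves `f` submitted and filled with red, whereas an early `return` in the error branch would leave `acquired` set.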
2020-05-19 | Update NEWS for 0.7.0 | Jean-Baptiste Kempf
2020-05-18 | dav1dplay: support on-GPU film grain synthesis | Niklas Haas
Annoying minor differences in this struct layout mean we can't just memcpy the entire thing. Oh well. Note: technically, PL_API_VER 33 added this API, but PL_API_VER 63 is the minimum version of libplacebo that doesn't have glaring bugs when generating chroma grain, so we require that as a minimum instead. (I tested this version on some 4:2:2 and 4:2:0, 8-bit and 10-bit grain samples I had lying around and made sure the output was identical up to differences in rounding / dithering.)
2020-05-18 | dav1dplay: handle all supported csps/reprs/bitdepths | Niklas Haas
Generalize the code to set the right pl_image metadata based on the values signaled in the Dav1dPictureParameters / Dav1dSequenceHeader. Some values are not mapped, in which case stdout will be spammed. Whatever. Hopefully somebody sees that error spam and opens a bug report for libplacebo to implement it.
2020-05-18 | dav1dplay: move and simplify pl_image generation | Niklas Haas
Having the pl_image generation live in upload_planes() rather than render() will make it easier to set the correct pl_image metadata based on the Dav1dPicture headers moving forwards. Rename the function to make more sense, semantically. Reduce some code duplication by turning per-plane fields into arrays wherever appropriate. As an aside, also apply the correct chroma location rather than hard-coding it as PL_CHROMA_LEFT.
2020-05-18 | dav1dplay: don't write directly to iparams.extensions | Niklas Haas
This is turned into a const array in upstream libplacebo, which generates warnings due to the implicit cast. Rewrite the code to have the mutable array live inside a separate variable `extensions` and only set `iparams.extensions` to this, rather than directly manipulating it.
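The pattern can be sketched in C as follows. `struct inst_params` and `set_extensions()` are hypothetical stand-ins for illustration, not the libplacebo definitions; the sketch only shows a mutable `extensions` array being filled first and then assigned to a const-qualified field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a parameter struct whose extension list is
 * const-qualified; not the real libplacebo definition. */
struct inst_params {
    const char *const *extensions;
    int num_extensions;
};

/* Fill the caller-provided mutable array, then publish it through the
 * const-qualified field. Returns the number of entries stored. */
int set_extensions(struct inst_params *iparams,
                   const char **extensions, int capacity,
                   const char *const *required, int num_required)
{
    assert(iparams != NULL && extensions != NULL && capacity >= 0);

    int n = 0;
    for (int i = 0; i < num_required && n < capacity; i++)
        extensions[n++] = required[i];  /* writes go to the mutable array */

    /* const char ** converts to const char *const * without a cast,
     * so no warning is generated here. */
    iparams->extensions = extensions;
    iparams->num_extensions = n;
    return n;
}
```

Writing through `iparams->extensions[i]` directly would not compile once the field is const-qualified, which is why the mutable array lives in its own variable.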
2020-05-16 | Fix swapped define guards in dav1dplay’s libplacebo renderer | Emmanuel Gil Peyrot
Signed-off-by: Marvin Scholz <epirat07@gmail.com>
2020-05-16 | Update NEWS for 0.7.0 | Jean-Baptiste Kempf
2020-05-15 | checkasm: x86: Check for stack corruption | Henrik Gramner
Add code to check that a function doesn't accidentally overwrite anything in the area located just above the current stack frame.
2020-05-15 | tools: add missing fopen error handling | Marvin Scholz
2020-05-15 | Dav1dPlay: Split placebo renderer into two | Marvin Scholz
This allows selecting at runtime if placebo should use OpenGL or Vulkan for rendering.
2020-05-15 | Dav1dPlay: Remove redundant log message | Marvin Scholz
2020-05-15 | Dav1dPlay: Remove unused renderer_info member | Marvin Scholz
2020-05-15 | Dav1dPlay: Allow runtime renderer selection | Marvin Scholz
2020-05-15 | Dav1dPlay: Fix renderer selection | Marvin Scholz
2020-05-15 | Dav1dPlay: Split renderers into different files | Marvin Scholz
2020-05-14 | Dav1dPlay: Add support for OpenGL with libplacebo | Marvin Scholz
2020-05-14 | Dav1dPlay: Split FIFO to different files | Marvin Scholz
To un-clutter the main dav1dplay.c, move the fifo to its own file and header.
2020-05-14 | checkasm: arm: Offset the location of the stack canary reference | Martin Storsjö
If the maximum number of arguments (currently 15) is changed into an even number, and a function actually takes the full number of arguments, we would have the situation where the checked spot on the stack is at the same place as we store an inverted copy of it. We already allocate enough space for two values though (for stack alignment purposes, 16 bytes on arm64 and 8 bytes on arm32) so by storing the reference in the upper half of this, the lower half of it works as canary and isn't overwritten.
2020-05-14 | checkasm: arm32: Take the number of stack arguments into account when checking for stack clobbering | Martin Storsjö
2020-05-14 | checkasm: arm64: Take the number of stack arguments into account when checking for stack clobbering | Martin Storsjö
2020-05-13 | checkasm: Cosmetics | Henrik Gramner
Use 'unsigned' instead of 'unsigned int' for consistency. Add 'const' to a few variables. Make proper use of C99 features.
2020-05-13 | checkasm: Skip printing the seed when using --list-functions | Henrik Gramner
Also skip the AVX warmup.
2020-05-13 | checkasm: arm64: Avoid overwriting the v0/q0/d0/s0 register | Matthieu Bouron
If functions return a float value, this value is stored in this register.
2020-05-13 | checkasm: arm: Don't use blx to call checkasm_fail_func | Martin Storsjö
We should just use a normal bl here, and the linker will add the 'x' bit if necessary. This fixes calling the checkasm_fail_func on windows, where the code is built in thumb mode (and the linker doesn't clear the 'x' bit in the blx instruction).
2020-05-12 | CI: Add 32 bit instruction set test | Matthias Dressel
2020-05-12 | CI: Optimise multi-threading tests | Matthias Dressel
2020-05-12 | CI: Optimise instruction set tests | Matthias Dressel
* The build from 'build-debian' is reused. 'logging' is not disabled since that would trigger an almost full rebuild.
* All ASM tests are merged into one job which is expected to seldom fail, therefore ease of debugging is traded in for efficiency.
2020-05-12 | CI: Add multi-threading to conformance tests | Matthias Dressel
2020-05-12 | CI: Run conformance tests with different instruction sets | Matthias Dressel
2020-05-12 | checkasm: filmgrain: Fix benchmarking in 16 bpc mode | Martin Storsjö
When benchmarking, the functions are called with a fixed width of 64x32 or 32x16, while the test itself is run with a random size in the range up to 128x32. In 16 bpc mode, the source pixels must be within the valid range, because they otherwise cause accesses out of bounds in the scaling array.
2020-05-11 | cli: Reduce fps fraction in ivf parsing | Henrik Gramner
Also avoid integer overflows by using 64-bit intermediate precision.
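As an illustration of the idea (not the code from this commit), the sketch below multiplies 32-bit operands in 64-bit precision, reduces the resulting fraction by its greatest common divisor, and only then narrows it back to 32 bits. The names `gcd_u64()` and `reduce_fraction()` and the four-operand signature are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Euclid's algorithm on 64-bit operands. */
uint64_t gcd_u64(uint64_t a, uint64_t b)
{
    while (b) {
        const uint64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Hypothetical helper: computes (num_a * num_b) / (den_a * den_b) as a
 * reduced 32-bit fraction. Returns 1 on success, 0 if the fraction is
 * degenerate (zero numerator or denominator). */
int reduce_fraction(uint32_t num_a, uint32_t num_b,
                    uint32_t den_a, uint32_t den_b,
                    uint32_t *out_num, uint32_t *out_den)
{
    assert(out_num != NULL && out_den != NULL);

    /* Widen before multiplying so the products cannot overflow. */
    uint64_t num = (uint64_t)num_a * num_b;
    uint64_t den = (uint64_t)den_a * den_b;
    if (!num || !den)
        return 0;

    const uint64_t g = gcd_u64(num, den);
    num /= g;
    den /= g;

    /* If the reduced fraction still does not fit, drop low bits
     * (this loses precision but keeps the ratio approximately intact). */
    while ((num | den) > UINT32_MAX) {
        num >>= 1;
        den >>= 1;
    }
    if (!num || !den)
        return 0;

    *out_num = (uint32_t)num;
    *out_den = (uint32_t)den;
    return 1;
}
```

For example, `reduce_fraction(100000, 100000, 400000, 100000, &n, &d)` yields 1/4 even though both products exceed 32 bits.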
2020-05-10 | x86: Use 'test' instead of 'or' to compare with zero | Henrik Gramner
Allows for macro-op fusion.
2020-05-10 | x86: Unconditionally compile msac_init.c | Henrik Gramner
Eliminates the x86-64 check from the meson configuration file to be consistent with how other x86-64-exclusive code is handled.
2020-05-10 | x86-64: Do msac refill before calling dav1d_msac_init_x86() | Henrik Gramner
Allows for constant propagation and tail call elimination in the msac initialization, which is performed in each tile.
2020-05-10 | msac: Avoid attempting to refill after eob has already been reached | Henrik Gramner
Utilize the unsigned representation of a signed integer to skip the refill code if the count was already negative to begin with, which saves a few clock cycles at the end of each tile.
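The trick can be illustrated in isolation. The sketch below is an illustration under stated assumptions, not the dav1d source: `needs_refill()` and its parameters are hypothetical names, `cnt` is the number of buffered bits before consumption (negative once the end of the data has been passed), and `used` is the non-negative number of bits just consumed.

```c
#include <assert.h>

/* Straightforward form: refill only if the count was non-negative
 * before consuming `used` bits and drops below zero afterwards. */
int needs_refill_reference(int cnt, int used)
{
    return cnt >= 0 && cnt - used < 0;
}

/* Single-comparison form: a negative cnt converts to a very large
 * unsigned value, so the comparison is false and the refill is skipped. */
int needs_refill(int cnt, int used)
{
    assert(used >= 0);
    return (unsigned)cnt < (unsigned)used;
}
```

Both functions agree for every `cnt` as long as `used` is non-negative; the second needs one comparison instead of two.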
2020-05-10 | arm64: itx: Add NEON implementation of itx for 10 bpc | Martin Storsjö
Add an element size specifier to the existing individual transform functions for 8 bpc, naming them e.g. inv_dct_8h_x8_neon, to clarify that they operate on input vectors of 8h, and make the symbols public, to let the 10 bpc case call them from a different object file. The same convention is used in the new itx16.S, like inv_dct_4s_x8_neon.

Make the existing itx.S compiled regardless of whether 8 bpc support is enabled. For builds with 8 bpc support disabled, this does include the unused frontend functions though, but this is hopefully tolerable to avoid having to split the file into a sharable file for transforms and a separate one for frontends.

This only implements the 10 bpc case, as that case can use transforms operating on 16 bit coefficients in the second pass.

Relative speedup vs C for a few functions:

Cortex                                       A53     A72     A73
inv_txfm_add_4x4_dct_dct_0_10bpc_neon:      4.14    4.06    4.49
inv_txfm_add_4x4_dct_dct_1_10bpc_neon:      6.51    6.49    6.42
inv_txfm_add_8x8_dct_dct_0_10bpc_neon:      5.02    4.63    6.23
inv_txfm_add_8x8_dct_dct_1_10bpc_neon:      8.54    7.13   11.96
inv_txfm_add_16x16_dct_dct_0_10bpc_neon:    5.52    6.60    8.03
inv_txfm_add_16x16_dct_dct_1_10bpc_neon:   11.27    9.62   12.22
inv_txfm_add_16x16_dct_dct_2_10bpc_neon:    9.60    6.97    8.59
inv_txfm_add_32x32_dct_dct_0_10bpc_neon:    2.60    3.48    3.19
inv_txfm_add_32x32_dct_dct_1_10bpc_neon:   14.65   12.64   16.86
inv_txfm_add_32x32_dct_dct_2_10bpc_neon:   11.57    8.80   12.68
inv_txfm_add_32x32_dct_dct_3_10bpc_neon:    8.79    8.00    9.21
inv_txfm_add_32x32_dct_dct_4_10bpc_neon:    7.58    6.21    7.80
inv_txfm_add_64x64_dct_dct_0_10bpc_neon:    2.41    2.85    2.75
inv_txfm_add_64x64_dct_dct_1_10bpc_neon:   12.91   10.27   12.24
inv_txfm_add_64x64_dct_dct_2_10bpc_neon:   10.96    7.97   10.31
inv_txfm_add_64x64_dct_dct_3_10bpc_neon:    8.95    7.42    9.55
inv_txfm_add_64x64_dct_dct_4_10bpc_neon:    7.97    6.12    7.82
2020-05-10 | arm: Mark global symbols hidden | Martin Storsjö
This matches what is done in C by -fvisibility=hidden. This avoids issues with relocations against other symbols exported from another assembly file.
2020-05-10 | arm64: itx: Prepare for other bitdepths | Martin Storsjö
2020-05-10 | itx: Add a bpc parameter to the itx dsp init function | Martin Storsjö
2020-05-10 | arm64: itx: Share code for the three horz_16x8 functions | Martin Storsjö
2020-05-10 | arm64: itx: Fix the eob checking for dct_dct_64x16 | Martin Storsjö
Before this, we never did the early exit from the first pass.

Before:                              Cortex A53      A72      A73
inv_txfm_add_64x16_dct_dct_1_8bpc_neon:  7275.7   5198.3   5250.9
inv_txfm_add_64x16_dct_dct_2_8bpc_neon:  7276.1   5197.0   5251.3
inv_txfm_add_64x16_dct_dct_3_8bpc_neon:  7275.8   5196.2   5254.5
inv_txfm_add_64x16_dct_dct_4_8bpc_neon:  7273.6   5198.8   5254.2
After:
inv_txfm_add_64x16_dct_dct_1_8bpc_neon:  5187.8   3763.8   3735.0
inv_txfm_add_64x16_dct_dct_2_8bpc_neon:  7280.6   5185.6   5256.3
inv_txfm_add_64x16_dct_dct_3_8bpc_neon:  7270.7   5179.8   5250.3
inv_txfm_add_64x16_dct_dct_4_8bpc_neon:  7271.7   5212.4   5256.4

The other related variants didn't have this bug and properly exited early when possible.
2020-05-10 | arm64: itx: Simplify inv_txfm_horz_dct_32x8 | Martin Storsjö
Unify some loads and stores, avoiding some extra pointer moving.
2020-05-10 | arm64: itx: Minor optimizations for the 8x32 functions | Martin Storsjö
This gives a couple cycles speedup.
2020-05-10 | arm64: itx: Cosmetic fix up | Martin Storsjö
2020-05-10 | arm64: itx: Remove an unused constant | Martin Storsjö
This isn't used for a sqrdmulh in its current form here. The one left in idct_coeffs[1] isn't used within the idct itself, but inv_txfm_horz_scale_dct_32x8 relies on it being left there for use with sqrdmulh scaling later.
2020-05-10 | arm64: itx: Remove a todo comment about more special cased functions | Martin Storsjö
These cases were removed from x86 to save space and simplify the code in e0b88bd2b2c97a2695edcc498485e1cb3003e7f1, as those cases were essentially unused in real world bitstreams.
2020-05-10 | arm64: itx: Remove a now unused macro | Martin Storsjö
The macro became unused in 9f084b0d2.
2020-05-10 | arm64: Explicitly forbid using the x18 register | Martin Storsjö
On windows and darwin (and modern android), the x18 register is reserved and shouldn't be modified by user code, while it is freely available on linux. Strictly avoid it, to keep the assembly code portable.
2020-05-06 | checkasm: arm32: Check for stack overflows | Martin Storsjö