github.com/videolan/dav1d.git
author: Kyle Siefring <kylesiefring@gmail.com> 2021-01-08 20:29:39 +0300
committer: Jean-Baptiste Kempf <jb@videolan.org> 2021-01-11 15:27:31 +0300
commit: 0bd57c6b22f42e9236707c0d7dc3b69a72df6225
tree: ea8431514af1f120b69bc18266afb3b86049a2e3 /src/decode.c
parent: 3ccfc25a8451d5f17d80b06eb9d6a622722ec08d
Rework the usage of noskip_mask
Remove half of the masks, since they are only used for cdef at an 8x8 level of granularity. Load the mask and combine the 16-bit sections into 32-bit sections outside of the inner cdef loop, which should save some registers. This results in mild performance improvements.
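The "combine the 16-bit sections into the 32-bit sections" step can be pictured with the sketch below. It is not the code from this commit: the helper names combine_noskip_row and block8_has_coefs are hypothetical, and the bit layout (half 0 holds 4x4 columns 0..15, half 1 holds columns 16..31, one bit per 4x4 column) is an assumption inferred from the mask arithmetic in the src/decode.c hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the commit): merge the two 16-bit halves of
 * one mask row into a single 32-bit word. Assumed layout: half 0 covers 4x4
 * columns 0..15, half 1 covers 4x4 columns 16..31. */
static uint32_t combine_noskip_row(const uint16_t row[2]) {
    return (uint32_t)row[0] | ((uint32_t)row[1] << 16);
}

/* Hypothetical per-8x8 test: the 8x8 block at column bx8 spans the 4x4
 * columns 2*bx8 and 2*bx8+1, so it is "not skipped" if either bit is set. */
static int block8_has_coefs(const uint32_t row_mask, const int bx8) {
    assert(bx8 >= 0 && bx8 < 16);
    return ((row_mask >> (2 * bx8)) & 3U) != 0;
}
```

Doing the merge once per row, before looping over the 8x8 columns, leaves the inner loop with a single 32-bit value to shift and test, which is one plausible reading of the register savings the message mentions.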
Diffstat (limited to 'src/decode.c')
-rw-r--r-- src/decode.c 4
1 files changed, 2 insertions, 2 deletions
diff --git a/src/decode.c b/src/decode.c
index 4b076ca..197af98 100644
--- a/src/decode.c
+++ b/src/decode.c
@@ -1984,10 +1984,10 @@ static int decode_b(Dav1dTileContext *const t,
 #undef set_ctx
         }
         if (!b->skip) {
-            uint16_t (*noskip_mask)[2] = &t->lf_mask->noskip_mask[by4];
+            uint16_t (*noskip_mask)[2] = &t->lf_mask->noskip_mask[by4 >> 1];
             const unsigned mask = (~0U >> (32 - bw4)) << (bx4 & 15);
             const int bx_idx = (bx4 & 16) >> 4;
-            for (int y = 0; y < bh4; y++, noskip_mask++) {
+            for (int y = 0; y < bh4; y += 2, noskip_mask++) {
                 (*noskip_mask)[bx_idx] |= mask;
                 if (bw4 == 32) // this should be mask >> 16, but it's 0xffffffff anyway
                     (*noskip_mask)[1] |= mask;
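
To make the effect of the new indexing concrete, here is a standalone sketch of the post-patch loop. The struct NoskipSketch and the function mark_noskip are hypothetical stand-ins for illustration; the array size of 16 rows (one per 8x8 block row of a 128-pixel superblock) is an assumption, not something shown in this hunk. The mask arithmetic is copied from the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the mask storage: one row per 8x8 block row
 * (assumed 16 rows), two 16-bit halves per row. */
typedef struct {
    uint16_t noskip_mask[16][2];
} NoskipSketch;

/* Mark a non-skipped block. bx4/by4 are the block position and bw4/bh4 its
 * size, all in 4x4 units relative to the superblock. */
static void mark_noskip(NoskipSketch *const m, const int bx4, const int by4,
                        const int bw4, const int bh4)
{
    assert(bw4 >= 1 && bw4 <= 32 && bh4 >= 1);
    assert(by4 >= 0 && (by4 >> 1) + ((bh4 + 1) >> 1) <= 16);
    /* Two 4x4 rows share one mask row, hence by4 >> 1. */
    uint16_t (*row)[2] = &m->noskip_mask[by4 >> 1];
    const unsigned mask = (~0U >> (32 - bw4)) << (bx4 & 15);
    const int bx_idx = (bx4 & 16) >> 4;
    /* Step by two 4x4 rows per iteration: one mask row per 8x8 row. */
    for (int y = 0; y < bh4; y += 2, row++) {
        (*row)[bx_idx] |= (uint16_t)mask;
        if (bw4 == 32) /* full-width block: set the second half as well */
            (*row)[1] |= (uint16_t)mask;
    }
}
```

For example, an 8x8 block (bw4 = bh4 = 2) at bx4 = 4, by4 = 6 touches only mask row 3 and sets bits 4 and 5 of its first half, where the pre-patch loop would have written rows 6 and 7 of a table twice as tall.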