Age | Commit message | Author |
|
|
|
Adds the ability to specify the data buffer directly and to state who owns it.
By not passing ownership to gawain, the memory block can be reused.
|
|
This makes it possible to draw the same VBO but only with a selected range (useful for selection with instancing/batching).
|
|
Everything was fine if a batch was always used with instancing. But a problem arises if the next drawcall for this batch does not use instancing, as the attrib divisor stays set to 1 in the VAO.
As instancing is used less often than normal drawing, I prefer to reset the divisor after drawing, since it is already set before drawing instances.
|
|
This improves eevee's cache performance by 13% in my test.
|
|
Tried 101 but it gives collisions.
I think 257 is enough now that we don't have thousands of uniforms.
This gives some noticeable performance improvement.
Could be refined further.
|
|
|
|
This changes quite a few things:
- Drops the allocation of inputs as a chunk.
- Merge the linked list system into the Gwn_ShaderInput.
- Put name buffer into another memory block, easily resizable.
- Use offset instead of char* to direct to input name.
- Add only requested uniforms dynamically to the Shader Interface.
This drops some minor optimisations and uses a bit more memory for small shaders (which have a fixed input count).
But this saves a lot of memory when using UBOs because the names and the Gwn_ShaderInput were alloc'ed for every UBO variable.
This also reduces the Shader Interface initial generation time.
The lookup time is left unchanged.
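The name-buffer change above can be sketched as follows. This is a minimal illustration, not the Gawain code: `SketchInput`, `SketchNameBuffer`, `name_buffer_add` and `input_name` are hypothetical names, and the doubling growth policy is an assumption. It shows why an offset survives a `realloc` of the name block where a stored `char*` would not.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical input record: stores an offset, not a char*. */
typedef struct {
    uint32_t name_offset; /* byte offset of the name inside the name buffer */
    int location;
} SketchInput;

/* Hypothetical resizable block holding all names back to back. */
typedef struct {
    char *names;
    uint32_t used;
    uint32_t capacity;
} SketchNameBuffer;

/* Append a name (with its terminator) and return its offset.
 * The block may move when it grows, which is why inputs keep offsets. */
static uint32_t name_buffer_add(SketchNameBuffer *nb, const char *name)
{
    const uint32_t len = (uint32_t)strlen(name) + 1;
    if (nb->used + len > nb->capacity) {
        uint32_t new_cap = nb->capacity ? nb->capacity : 16;
        while (new_cap < nb->used + len) {
            new_cap *= 2; /* assumed growth policy */
        }
        char *grown = realloc(nb->names, new_cap);
        assert(grown != NULL);
        nb->names = grown;
        nb->capacity = new_cap;
    }
    const uint32_t offset = nb->used;
    memcpy(nb->names + offset, name, len);
    nb->used += len;
    return offset;
}

/* Resolve an input's name at lookup time. */
static const char *input_name(const SketchNameBuffer *nb, const SketchInput *in)
{
    return nb->names + in->name_offset;
}
```

Resolving the name costs one pointer addition per lookup, and in exchange the name block can be resized freely.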
|
|
This reverts commit 5514d2df1c6d9f2f108336e46b0db14316610d24.
|
|
This is an internal structure, and we don't put it in a list for anything other
than hash collision resolution. No need to have a dedicated entry here, which saves us
an extra allocation and a pointer dereference.
|
|
|
|
This way we reduce the number of iterations from loop-over-all-inputs to
loop-over-collisions, which is expected to take far fewer CPU ticks.
There is still a possible optimization: use a memory pool of some sort
to manage the memory needed for hash entries, but that will only speed up
shader interface construction / destruction time.
There is also some trickery to speed up the process even more
in the case where no hash collisions are detected when constructing
the shader interface.
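A minimal sketch of the bucketed lookup described above, under stated assumptions: the type and function names are hypothetical, the djb2 hash is picked only for illustration, and the bucket count of 257 is borrowed from another entry in this log. The chain pointer lives inside the input itself, so a lookup only walks the inputs that share a bucket.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_NUM_BUCKETS 257 /* prime bucket count, assumed here */

/* Hypothetical input; the collision chain is embedded in the input itself. */
typedef struct SketchInput {
    struct SketchInput *next; /* next input in the same bucket */
    const char *name;
    unsigned int name_hash;
    int location;
} SketchInput;

typedef struct {
    SketchInput *buckets[SKETCH_NUM_BUCKETS];
} SketchInterface;

/* djb2 string hash, used only for illustration. */
static unsigned int sketch_hash(const char *str)
{
    unsigned int h = 5381;
    for (; *str; str++) {
        h = h * 33u + (unsigned char)*str;
    }
    return h;
}

/* Push the input onto the front of its bucket's chain. */
static void interface_add(SketchInterface *si, SketchInput *in, const char *name, int location)
{
    assert(si != NULL && in != NULL && name != NULL);
    in->name = name;
    in->name_hash = sketch_hash(name);
    in->location = location;
    SketchInput **bucket = &si->buckets[in->name_hash % SKETCH_NUM_BUCKETS];
    in->next = *bucket;
    *bucket = in;
}

/* Walk only the inputs that landed in the same bucket. */
static const SketchInput *interface_find(const SketchInterface *si, const char *name)
{
    const unsigned int h = sketch_hash(name);
    const SketchInput *in = si->buckets[h % SKETCH_NUM_BUCKETS];
    for (; in != NULL; in = in->next) {
        if (in->name_hash == h && strcmp(in->name, name) == 0) {
            return in;
        }
    }
    return NULL;
}
```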
|
|
Makes sure we don't waste CPU ticks on a function call in such a time-critical
area.
|
|
|
|
Also some minor corrections.
|
|
Wraps vertex-format, vertex-buffer and batches (enough for drawing).
Doesn't yet expose index-buffers or shaders.
|
|
Needed to reference without first including headers.
|
|
Use ownership flags instead.
|
|
Flag ownership for each index array and VBO
so we don't have to keep track of this manually and pick the right free call.
Instead this can be passed on creation.
See D2676
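Ownership flags of this kind could look roughly like the sketch below. All names (`SketchBatch`, `SKETCH_OWNS_VBO`, `SKETCH_OWNS_INDEX`, `batch_create`, `batch_discard`) are hypothetical and plain heap arrays stand in for the real buffer objects; the point is only that the flags are given at creation and a single discard call frees exactly what the batch owns.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical ownership bits, passed at creation time. */
enum {
    SKETCH_OWNS_VBO = (1 << 0),
    SKETCH_OWNS_INDEX = (1 << 1),
};

/* Hypothetical batch; plain arrays stand in for VBO and index buffer. */
typedef struct {
    float *verts;
    unsigned int *elems;
    unsigned int owns_flag;
} SketchBatch;

static SketchBatch batch_create(float *verts, unsigned int *elems, unsigned int owns_flag)
{
    assert(verts != NULL);
    SketchBatch batch = { verts, elems, owns_flag };
    return batch;
}

/* Free only what the batch owns; returns the flags of what was freed. */
static unsigned int batch_discard(SketchBatch *batch)
{
    unsigned int freed = 0;
    if (batch->owns_flag & SKETCH_OWNS_VBO) {
        free(batch->verts);
        freed |= SKETCH_OWNS_VBO;
    }
    if (batch->owns_flag & SKETCH_OWNS_INDEX) {
        free(batch->elems);
        freed |= SKETCH_OWNS_INDEX;
    }
    batch->verts = NULL;
    batch->elems = NULL;
    batch->owns_flag = 0;
    return freed;
}
```

A caller that shares an index array between several batches simply leaves the index bit unset and frees the array itself once.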
|
|
Looking up names project wide or setting breakpoints wasn't so.
Names like common.h or element.h are also too generic.
|
|
Needed to clear the buffer without freeing.
|
|
|
|
|
|
This avoids using GWN_vertbuf_attr_set which needs to calculate the
offset and perform a memcpy every call.
Exposing the data directly allows us to avoid a memcpy in some cases
and means we can write to the vertex buffer's memory directly.
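To illustrate the difference between the two access paths, here is a small sketch. `SketchVertBuf`, `vertbuf_attr_set` and `vertbuf_attr_raw` are hypothetical stand-ins and do not reproduce the Gawain signatures; an interleaved layout with a fixed stride is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical interleaved vertex buffer. */
typedef struct {
    unsigned char *data;
    size_t stride;     /* bytes per vertex */
    size_t vertex_len; /* number of vertices */
} SketchVertBuf;

/* Per-call path: recompute the offset and memcpy one value every call. */
static void vertbuf_attr_set(SketchVertBuf *vbo, size_t attr_offset, size_t attr_size,
                             size_t v_idx, const void *value)
{
    assert(v_idx < vbo->vertex_len);
    memcpy(vbo->data + v_idx * vbo->stride + attr_offset, value, attr_size);
}

/* Direct path: hand out a pointer to the first value of the attribute;
 * the caller steps by vbo->stride and writes in place. */
static void *vertbuf_attr_raw(SketchVertBuf *vbo, size_t attr_offset)
{
    return vbo->data + attr_offset;
}
```

With the direct path, a fill loop computes the base pointer once and then advances by `stride`, writing each value straight into the buffer with no intermediate copy.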
|
|
|
|
Use consistent prefix for gawain API names as well as
some abbreviations to avoid over-long names, see: D2678
|
|
|
|
Gawain doesn't include Blender's cross-platform "inline" definition. This change slipped in as part of D2697.
|
|
UNIFORM_NONE should never match a valid uniform (builtin or custom).
The logic for UNIFORM_CUSTOM was just wrong, since it returned the first custom uniform. This function should only accept builtin (non-custom) uniforms.
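The intended behaviour can be sketched as below. The enum values, struct and function name are hypothetical, and returning `NULL` for the rejected types is an assumption (the real code may assert instead). What matters is that neither `NONE` nor `CUSTOM` can ever match an entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical builtin uniform identifiers. */
typedef enum {
    SKETCH_UNIFORM_NONE = 0,
    SKETCH_UNIFORM_MVP,
    SKETCH_UNIFORM_COLOR,
    SKETCH_UNIFORM_CUSTOM, /* anything that is not a builtin */
} SketchBuiltinUniform;

typedef struct {
    SketchBuiltinUniform builtin_type;
    int location;
} SketchInput;

/* Look up a builtin uniform. NONE and CUSTOM are rejected up front,
 * so a custom uniform is never returned by accident. */
static const SketchInput *builtin_uniform_find(const SketchInput *inputs, size_t len,
                                               SketchBuiltinUniform type)
{
    assert(inputs != NULL || len == 0);
    if (type == SKETCH_UNIFORM_NONE || type == SKETCH_UNIFORM_CUSTOM) {
        return NULL;
    }
    for (size_t i = 0; i < len; i++) {
        if (inputs[i].builtin_type == type) {
            return &inputs[i];
        }
    }
    return NULL;
}
```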
|
|
Quick hash rejection instead of string comparison. Uniform lookups already work this way. I don't expect a major overall speedup since attributes are looked up less frequently than uniforms.
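A rough sketch of hash rejection before string comparison follows. `SketchAttrib`, `sketch_hash` and `attrib_find` are hypothetical, and the djb2 hash is an assumption chosen for brevity, not necessarily the hash Gawain uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical attribute entry; the hash is stored next to the name. */
typedef struct {
    const char *name;
    unsigned int name_hash;
    int location;
} SketchAttrib;

/* djb2 string hash, used only for illustration. */
static unsigned int sketch_hash(const char *str)
{
    unsigned int h = 5381;
    for (; *str; str++) {
        h = h * 33u + (unsigned char)*str;
    }
    return h;
}

/* Compare the cheap integer hash first; strcmp only runs on a hash match. */
static const SketchAttrib *attrib_find(const SketchAttrib *attribs, size_t len, const char *name)
{
    assert(name != NULL);
    const unsigned int h = sketch_hash(name);
    for (size_t i = 0; i < len; i++) {
        if (attribs[i].name_hash != h) {
            continue; /* quick rejection, no string comparison */
        }
        if (strcmp(attribs[i].name, name) == 0) {
            return &attribs[i];
        }
    }
    return NULL;
}
```

The final `strcmp` is still required because two different names can share a hash.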
|
|
Before this change Gawain was doing the list lookup twice,
doing a string comparison against each and every input, which
is not efficient and not friendly for CPUs with a small
cache size.
Now we store hash of input name together with actual
name and compare hashes first. Additionally, we do
everything in a single pass which is much better from
cache coherency point of view.
This brings Eevee cache population time from 80ms to
60ms on my desktop and from 800ms to 400ms for Clement
when navigating in a file from T50027.
Reviewers: merwin, dfelinto
Subscribers: fclem
Differential Revision: https://developer.blender.org/D2697
|
|
|
|
|
|
|
|
Not as if I'm totally fine with such a style, but I'd better be consistent
with whatever the project is using.
|
|
This function is not performance critical, but I prefer the branch-free code and no hack needed to appease gcc.
Follow-up to recent 23035cf46fb4dd6a0bf7e688b0f15128030c77d1 and f637145450010d14660fcb029d41560a138eae14.
|
|
|
|
Goal is to make most of the API independent of OpenGL, Vulkan, any other backend.
Able to remove the default case from ElementList_size because IndexType only covers index types, not those and *everything else* as GLenum does.
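As an illustration of why a dedicated enum makes the default case unnecessary, consider this sketch. The enum and function names are hypothetical, not the ones used in Gawain; every enumerator is handled explicitly, so a compiler with switch warnings enabled can flag any index type added later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical backend-independent index type: only index types, nothing else. */
typedef enum {
    SKETCH_INDEX_U8,
    SKETCH_INDEX_U16,
    SKETCH_INDEX_U32,
} SketchIndexType;

/* Every enumerator is handled, so no default case is needed. */
static size_t index_size(SketchIndexType type)
{
    switch (type) {
        case SKETCH_INDEX_U8:  return sizeof(uint8_t);
        case SKETCH_INDEX_U16: return sizeof(uint16_t);
        case SKETCH_INDEX_U32: return sizeof(uint32_t);
    }
    return 0; /* not reached for valid enum values; keeps compilers quiet */
}

/* Size in bytes of an element list holding index_ct indices. */
static size_t element_list_size(SketchIndexType type, size_t index_ct)
{
    assert(index_size(type) != 0);
    return index_ct * index_size(type);
}
```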
|
|
@fclem does this work for you?
|
|
|
|
Modern GL's glMapBufferRange works the same on all platforms.
Part of T49012
|
|
Recent versions of OpenGL support VAOs natively.
Part of T49012
|
|
This format is part of OpenGL 3.3, and one of the reasons for choosing 3.3 over 3.2.
Instead of checking #if USE_10_10_10 just use it wherever needed.
|
|
There is no longer any point in keeping those around. ES20 may need a special case
when/if we dabble with it again. Meanwhile there is no point in polluting the
code with this.
(ghost still has reference for the PROFILE, but that's reasonable)
|
|
Uninitialized var, I thought I was cleverer than this. :(
|
|
Get buffer size once, use it to both allocate and track VRAM.
|
|
same functionality
|
|
Revert 7a18ee62eb4d6c6028d05f1da259fe8695f49a3f and 1ff97bbfff78a0c375fb5256a9d9d37cd3973bbe after discussing with @fclem.
VertexBuffer_size should always report the same buffer size, but without asking/calling OpenGL.
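One way to report the size without touching OpenGL is to derive it from data the buffer already holds. The sketch below assumes the size is simply stride times vertex count; the struct, the VRAM counter and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical vertex buffer description. */
typedef struct {
    size_t stride;    /* bytes per vertex, taken from the vertex format */
    size_t vertex_ct; /* number of vertices */
} SketchVertexBuffer;

/* Hypothetical running total of tracked VRAM, in bytes. */
static size_t sketch_vram_usage = 0;

/* Pure computation from the buffer's own fields; no GL query involved. */
static size_t vertex_buffer_size(const SketchVertexBuffer *vbo)
{
    assert(vbo != NULL);
    return vbo->stride * vbo->vertex_ct;
}

/* Compute the size once and use it for both allocation and tracking. */
static void vertex_buffer_track_upload(const SketchVertexBuffer *vbo)
{
    const size_t size = vertex_buffer_size(vbo);
    /* A real implementation would pass `size` to the GL allocation call here. */
    sketch_vram_usage += size;
}
```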
|
|
|
|
|