Age | Commit message (Collapse) | Author |
|
|
|
|
|
This still does not make point density work in Cycles, but at least it passes
the depsgraph down the line.
Note this was working fine before the depsgraph/render refactor to pass
evaluated depsgraph to the engines.
|
|
Point density texture and motion blur are still broken, and many more changes
are needed in general to use evaluated datablocks.
|
|
User notes
----------
Compositing and rendering of multiple layers in Eevee should be fully working now.
Development notes
-----------------
Up until now we were still using the same depsgraph for rendering and viewport
evaluation, and we had to go out of our way to make sure the depsgraphs were
updated.
Now we iterate over the (to be rendered) view layers and create a fully
evaluated depsgraph for each one, then call the render engines (Cycles, Eevee, ...)
with this viewlayer/depsgraph/evaluation context.
At this time we are not handling data persistency: the depsgraph is created from
scratch prior to rendering each frame. So I got rid of most of the partial
update calls we had during the render pipeline.
Cycles: Brecht Van Lommel did a patch to tackle some of the required Cycles
changes, but this commit marks these changes as TODOs. Basically Cycles needs to
render one layer at a time.
Reviewers: sergey, brecht
Differential Revision: https://developer.blender.org/D3073
|
|
This avoids having non-null entries in shaderface->builtin_uniforms and a redundant check.
|
|
Apparently MSVC 2013 has trouble with stuff that's been declared
"static thread_local" (and/or maybe even the "thread_local" keyword).
https://stackoverflow.com/questions/29399494/what-is-the-current-state-of-support-for-thread-local-across-platforms
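One common way to work around this kind of compiler gap is a small macro that falls back to the MSVC-specific `__declspec(thread)` on toolchains older than Visual Studio 2015. The sketch below only illustrates that idea. The macro name `EXAMPLE_THREAD_LOCAL` and the counter are hypothetical, and this is not necessarily what the commit itself does.

```cpp
// Hypothetical portability macro; not taken from the Blender sources.
#if defined(_MSC_VER) && _MSC_VER < 1900
// Visual Studio 2013 and older: use the MSVC-specific storage class.
#  define EXAMPLE_THREAD_LOCAL __declspec(thread)
#else
// Everything else: use the standard C++11 keyword.
#  define EXAMPLE_THREAD_LOCAL thread_local
#endif

// Each thread gets its own copy of this counter.
static EXAMPLE_THREAD_LOCAL int example_counter = 0;

// Increments and returns the calling thread's private counter.
int example_bump_counter()
{
  return ++example_counter;
}
```

Note that `__declspec(thread)` only supports simple data types without constructors, so a fallback like this would only suit plain values such as the integer used here.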
|
|
|
|
|
|
Offscreen contexts are not attached to a window and can only be used for rendering to framebuffer objects.
CGL implementation : Brecht Van Lommel (brecht)
GLX implementation : Clément Foucault (fclem)
WGL implementation : Germano Cavalcante (mano-wii)
Other implementations are just placeholders for now.
|
|
Turns out to be the call that was destroying performance.
I get 18ms->6ms improvement of drawing time with 10 000 unique objects.
And we can still improve upon this!
|
|
Conflicts:
source/blender/blenkernel/BKE_blender_version.h
|
|
This is used to determine which voxels are to be considered empty space.
Previously it was hardcoded for converting dense grids to OpenVDB grids
to reduce disk space usage.
This value is also useful for rendering engines to know, i.e. to
optimize ray marching.
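As a rough illustration of how such a threshold separates empty space from active voxels, here is a minimal sketch. The function name `example_active_voxels` and the flat `std::vector<float>` grid are assumptions made for the example; they are not the smoke modifier or OpenVDB API.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the indices of the voxels that are NOT
// considered empty, i.e. whose absolute value exceeds the clipping value.
std::vector<std::size_t> example_active_voxels(const std::vector<float> &dense,
                                               float clipping)
{
  assert(clipping >= 0.0f);
  std::vector<std::size_t> active;
  for (std::size_t i = 0; i < dense.size(); ++i) {
    if (std::fabs(dense[i]) > clipping) {
      active.push_back(i);
    }
  }
  return active;
}
```

A sparse grid exporter would store only the returned voxels, and a ray marcher could use the same test to skip regions that are known to be empty.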
|
|
Some other software cannot handle grid names with spaces in them. We still check
for names with spaces so as not to break old files.
This fixes T53802.
|
|
|
|
Similar to the Principled BSDF, this should make it easier to set up volume
materials. Smoke and fire can be rendered with just a single principled
volume node; the appropriate attributes will be used when available. The node
also works for simpler homogeneous volumes like water or mist.
Differential Revision: https://developer.blender.org/D3033
|
|
This breaks backwards compatibility somewhat, making smoke colors brighter
than before. But it is also more correct this way.
|
|
A volume shader should be able to request attributes, and still be rendered
as homogeneous if no volume attributes are available for the object.
|
|
|
|
|
|
|
|
This is not an ideal solution, but Blender's freeing system is already quite tangled.
So tracking and clearing VAO caches when destroying contexts does prevent bad behaviour.
|
|
|
|
This mainly helps with dense volumes, rendering can be 30% faster with
little noise increase in such scenes.
|
|
We now continue transparent paths after diffuse/glossy/transmission/volume
bounces are exceeded. This avoids unexpected boundaries in volumes with
transparent boundaries. It is also required for MIS to work correctly with
transparent surfaces, as we also continue through these in shadow rays.
The main visible change is that volumes will now be lit by the background
even at volume bounces 0, same as surfaces.
Fixes T53914 and T54103.
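The rule described above can be sketched as a termination test in which transparent hits are limited only by their own counter. This is a simplified illustration with hypothetical names (`ExampleBounceLimits`, `ExamplePathState`, `example_path_continues`) and a single combined bounce counter; it is not the actual Cycles kernel code.

```cpp
// Hypothetical, simplified bounce settings.
struct ExampleBounceLimits {
  int max_bounce;              // stands in for diffuse/glossy/transmission/volume
  int max_transparent_bounce;  // separate limit for transparent hits
};

// Hypothetical, simplified per-path counters.
struct ExamplePathState {
  int bounce;
  int transparent_bounce;
};

// Returns true if the path may continue through the next hit.
bool example_path_continues(const ExamplePathState &state,
                            const ExampleBounceLimits &limits,
                            bool next_hit_is_transparent)
{
  if (next_hit_is_transparent) {
    // Transparent hits ignore the regular bounce limit, so a path that has
    // used up its other bounces can still pass through transparent
    // boundaries.
    return state.transparent_bounce < limits.max_transparent_bounce;
  }
  return state.bounce < limits.max_bounce;
}
```

With `max_bounce` set to 0, a path is still allowed to cross a transparent boundary, which matches the note that volumes are now lit by the background even at volume bounces 0.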
|
|
|
|
|
|
|
|
Unify the path and branched path indirect SSS code. No performance impact
found on CUDA; for the AMD split kernel the extra code was already there.
|
|
|
|
Reorganize struct elements by size, rename a constant.
|
|
A major bottleneck of the current implementation is the call to create_bindings() for basically every draw call.
This is due to the VAO being tagged dirty when assigning a new shader to the Batch, defeating the purpose of the Batch (reuse it for drawing).
Since managing hundreds of batches in DrawManager and DrawCache seems not fun enough to me, I preferred rewriting the batches themselves.
--- Batch changes ---
For this to happen I needed to change the instancing to be part of the Batch rather than being another batch supplied at draw time.
The Gwn_VertBuffers are copied from the batch to be instanced and a new Gwn_VertBuffer is supplied for instancing attribs.
This means a VAO can be generated and cached for this instancing case.
A Batch can be rendered with instancing, without instancing attribs and without the need for a new VAO, using GWN_batch_draw_range_ex with the force_instance parameter set to true.
--- Draw manager changes ---
The downside with this approach is that we must track the validity of the instanced batch (the original one). For this the only way (I could think of) is to set a callback for when the batch is freed.
This means a bit of refactoring in the DrawManager, with the separation of batching and instancing Batches.
--- VAO cache ---
Each VAO is generated for a given ShaderInterface. This means we can keep it alive as long as the shader interface lives.
If a ShaderInterface is discarded, it needs to destroy every VAO associated with it. Otherwise, a new ShaderInterface with the same address could be generated and reuse the same VAO with incorrect bindings.
The VAO cache itself uses a mix of a static array of VAOs and a dynamic array if there is not enough space in the static one.
Using this hybrid approach is a bit more performant than the dynamic array alone.
The array will not resize down, but empty entries will be filled up again. It's unlikely we get a buffer overflow from this. Resizing could be done on next allocation if needed.
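The hybrid cache described in this section could look roughly like the sketch below. All names (`ExampleVaoEntry`, `ExampleVaoCache`, `kFixedSlots`) are hypothetical, the shader interface is reduced to an opaque pointer, and the fixed size is kept tiny on purpose; the real Gawain data structures differ.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// One cache slot. A null shader_iface marks the slot as empty.
struct ExampleVaoEntry {
  const void *shader_iface;
  unsigned int vao_id;
};

class ExampleVaoCache {
 public:
  ExampleVaoCache()
  {
    for (ExampleVaoEntry &entry : fixed_) {
      entry.shader_iface = nullptr;
      entry.vao_id = 0;
    }
  }

  // Returns the cached VAO id for this interface, or 0 if there is none.
  unsigned int lookup(const void *shader_iface) const
  {
    if (shader_iface == nullptr) {
      return 0;
    }
    for (const ExampleVaoEntry &entry : fixed_) {
      if (entry.shader_iface == shader_iface) {
        return entry.vao_id;
      }
    }
    for (const ExampleVaoEntry &entry : overflow_) {
      if (entry.shader_iface == shader_iface) {
        return entry.vao_id;
      }
    }
    return 0;
  }

  // Stores a VAO, reusing an empty slot before growing the dynamic array.
  void insert(const void *shader_iface, unsigned int vao_id)
  {
    assert(shader_iface != nullptr && vao_id != 0);
    for (ExampleVaoEntry &entry : fixed_) {
      if (entry.shader_iface == nullptr) {
        entry.shader_iface = shader_iface;
        entry.vao_id = vao_id;
        return;
      }
    }
    for (ExampleVaoEntry &entry : overflow_) {
      if (entry.shader_iface == nullptr) {
        entry.shader_iface = shader_iface;
        entry.vao_id = vao_id;
        return;
      }
    }
    ExampleVaoEntry entry;
    entry.shader_iface = shader_iface;
    entry.vao_id = vao_id;
    overflow_.push_back(entry);
  }

  // Empties every slot built for a discarded interface. Slots are kept
  // (the arrays never shrink) so later inserts can fill them again.
  void remove_interface(const void *shader_iface)
  {
    for (ExampleVaoEntry &entry : fixed_) {
      if (entry.shader_iface == shader_iface) {
        entry.shader_iface = nullptr;
        entry.vao_id = 0;
      }
    }
    for (ExampleVaoEntry &entry : overflow_) {
      if (entry.shader_iface == shader_iface) {
        entry.shader_iface = nullptr;
        entry.vao_id = 0;
      }
    }
  }

  std::size_t overflow_slots() const { return overflow_.size(); }

 private:
  static constexpr std::size_t kFixedSlots = 2;
  std::array<ExampleVaoEntry, kFixedSlots> fixed_;
  std::vector<ExampleVaoEntry> overflow_;
};
```

Clearing the slots in `remove_interface` is what guards against the stale-address problem mentioned above: a new interface allocated at the same address finds no entry and gets a fresh VAO.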
--- Results ---
Using cached VAOs means that we are not querying each vertex attrib for each VBO for each draw call on every redraw!
In a CPU limited test scene (10000 cubes in Clay engine) I get a reduction of CPU drawing time from ~20ms to 13ms.
The only area that is not caching VAOs is the instancing from particles (see comment DRW_shgroup_instance_batch).
|
|
This allows allocation of VAOs from different OpenGL contexts and threads as long as the drawing happens in the same context.
Allocation is thread safe as long as we abide by the "one OpenGL context per thread" rule.
We can still free from any thread; the actual freeing will occur at the next VAO allocation or the next context binding.
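A minimal sketch of this deferred-free scheme, assuming a mutex-protected orphan list per context, is shown below. The class `ExampleGpuContext` and its members are hypothetical, and plain integers stand in for OpenGL object names so the example runs without a GL context.

```cpp
#include <algorithm>
#include <cstddef>
#include <mutex>
#include <vector>

class ExampleGpuContext {
 public:
  // Must be called from the thread that owns the context.
  unsigned int vao_alloc()
  {
    flush_orphans();
    const unsigned int id = next_id_++;
    live_.push_back(id);
    return id;
  }

  // May be called from any thread: only queues the id for later deletion.
  void vao_free(unsigned int id)
  {
    std::lock_guard<std::mutex> lock(orphans_mutex_);
    orphans_.push_back(id);
  }

  // Called by the owning thread when the context is made current.
  void bind() { flush_orphans(); }

  std::size_t live_count() const { return live_.size(); }

  std::size_t pending_count()
  {
    std::lock_guard<std::mutex> lock(orphans_mutex_);
    return orphans_.size();
  }

 private:
  // Performs the actual deletion of everything queued by vao_free().
  void flush_orphans()
  {
    std::lock_guard<std::mutex> lock(orphans_mutex_);
    for (unsigned int id : orphans_) {
      // A real implementation would delete the GL object here.
      live_.erase(std::remove(live_.begin(), live_.end(), id), live_.end());
    }
    orphans_.clear();
  }

  unsigned int next_id_ = 1;
  std::vector<unsigned int> live_;     // touched only by the owning thread
  std::vector<unsigned int> orphans_;  // shared, protected by the mutex
  std::mutex orphans_mutex_;
};
```

Only the orphan list needs locking here, because under the "one context per thread" rule the list of live objects is only ever touched by the owning thread.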
|
|
|
|
|
|
|
|
This should be the last Fermi removal commit, unless I missed something.
It's been a pleasure Fermi!
|
|
Did not touch Texture related defines, that comes next.
|
|
device_cuda.cpp.
Fermi code in Cycles kernel and texture system are coming next.
|
|
with some Intel cards
|
|
|
|
|
|
It seems that some OpenGL implementations return "[0]" after array names but others don't.
Remove the "[0]" so everything is consistent.
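The normalization amounts to trimming a trailing "[0]" from the name reported by the driver. A small sketch, using a hypothetical helper name `example_strip_array_suffix` rather than the actual function in the shader interface code:

```cpp
#include <string>

// Returns `name` without a trailing "[0]", or `name` unchanged if it does
// not end with that suffix.
std::string example_strip_array_suffix(const std::string &name)
{
  const std::string suffix = "[0]";
  if (name.size() >= suffix.size() &&
      name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0) {
    return name.substr(0, name.size() - suffix.size());
  }
  return name;
}
```

With this in place, a uniform array reported as either `lights[0]` or `lights` ends up under the same key.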
|
|
|
|
|
|
|
|
|
|
It still seems to be useful in cases where the particles are distributed in
a particular order or pattern, to colorize them along with that. This isn't
really well defined, but we might as well avoid breaking backwards compatibility
for now.
|
|
These batches keep their memory chunk allocated after transfer so it can be reused and updated very often.
NOTE: This commit breaks instancing in DRW (it is fixed in the next commit).
|
|
This allows drawing large amounts of primitives without any memory footprint.
|