Age | Commit message | Author |
|
|
|
This was caused by the SSR option resetting the accumulation while the
render passes were only cleared in the init phase. This means that
when SSR reset `taa_render_sample`, the render passes
would still contain 1 sample, so they were always
divided by the wrong number of samples.
The fix is to clear just before accumulation if the sample is 1.
The fact that it works for motion blur is kind of a blessing. This is because
we check `stl->effects->ssr_was_valid_double_buffer` before resetting the
sampling. So this only happens on the first motion step and does not affect
the rest of the rendering.
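The clear-on-first-sample idea can be sketched as follows. This is a minimal illustration, not the EEVEE code: `PassAccum`, `pass_accumulate` and `pass_resolve` are hypothetical names, and the buffer is reduced to one float per pixel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one render pass accumulation buffer. */
typedef struct PassAccum {
  float *pixels; /* Running sum of samples, one float per pixel. */
  size_t len;
} PassAccum;

/* Add one sample to the running sum. `sample` is 1-based and may be reset
 * back to 1 at any time (as SSR does with `taa_render_sample`). */
static void pass_accumulate(PassAccum *acc, const float *sample_pixels, int sample)
{
  assert(sample >= 1);
  if (sample == 1) {
    /* Clear here rather than only at init, so stale samples never leak in. */
    memset(acc->pixels, 0, sizeof(float) * acc->len);
  }
  for (size_t i = 0; i < acc->len; i++) {
    acc->pixels[i] += sample_pixels[i];
  }
}

/* Resolve a pixel by dividing the sum by the number of accumulated samples. */
static float pass_resolve(const PassAccum *acc, size_t i, int sample)
{
  return acc->pixels[i] / (float)sample;
}
```

If the sample counter is reset to 1 after a first sample was already accumulated, the sum restarts from zero, so the divisor and the number of accumulated samples stay in agreement.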
Differential Revision: https://developer.blender.org/D11033
|
|
|
|
with many objects
When an Eevee render has many distinct objects, the shadow caster gets exponentially slower as the number of (distinct) objects increases.
This is because of the way that frontbuffer->bbox (EEVEE_BoundBox array) and the associated frontbuffer->update bitmap are resized.
Currently the resizing is done by reserving space for SH_CASTER_ALLOC_CHUNK (32) objects at a time.
When the number of objects is large, MEM_reallocN() gets progressively slower because it must memcpy the entire bbox/bitmap data to the new memory chunk.
For a large scene this adds up to a lot of *memcpy* operations.
(Obviously there are a significant number of memory allocations/deallocations too - though this would be linear performance.)
I've switched to doubling the frontbuffer->alloc_count (buffer capacity) instead of adding SH_CASTER_ALLOC_CHUNK (32). As I understand it, this is the only way to eliminate the exponential slowdown. Just increasing the size of SH_CASTER_ALLOC_CHUNK would still result in exponential slowdown eventually.
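The difference between the two growth policies can be illustrated by counting how many elements get copied. This is a sketch under stated assumptions: `CHUNK` mirrors SH_CASTER_ALLOC_CHUNK, `copies_for_growth` is a hypothetical helper, and every reallocation is assumed to move the whole buffer (the worst case). Strictly speaking, the copy work with fixed-size chunks grows quadratically with the object count, while doubling keeps it linear.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK 32 /* Mirrors SH_CASTER_ALLOC_CHUNK from the message. */

/* Count elements copied by reallocations while appending `n` items,
 * assuming the worst case where every reallocation moves the whole buffer. */
static size_t copies_for_growth(size_t n, int use_doubling)
{
  assert(n > 0);
  size_t capacity = CHUNK;
  size_t copied = 0;
  for (size_t count = 0; count < n; count++) {
    if (count == capacity) {
      copied += count; /* Existing items are moved to the new block. */
      capacity = use_doubling ? capacity * 2 : capacity + CHUNK;
    }
  }
  return copied;
}
```

For 1024 appended items, `copies_for_growth(1024, 0)` gives 15872 copied elements with chunked growth, while `copies_for_growth(1024, 1)` gives 992 with doubling.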
In other changes, the "+ 1" in this expression is not necessary.
if (id + 1 >= frontbuffer->alloc_count)
The buffer is 0-based, so when the buffer is initially allocated, the entries bbox[0] to bbox[31] are valid. Hence the resizing should be triggered when frontbuffer->count == frontbuffer->alloc_count.
As it stands, the "+ 1" results in resizing the buffer while there is still capacity for one more object.
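The off-by-one can be made concrete with a small comparison. The two helpers below are hypothetical and only model the condition; `id` is assumed to be the 0-based slot about to be written.

```c
#include <assert.h>
#include <stdbool.h>

/* Original check from the message: resizes one slot too early. */
static bool needs_resize_old(int id, int alloc_count)
{
  return id + 1 >= alloc_count;
}

/* Proposed check: resize only when slot `id` is outside the buffer. */
static bool needs_resize_new(int id, int alloc_count)
{
  assert(id >= 0 && alloc_count > 0);
  return id >= alloc_count;
}
```

With a capacity of 32, the old check already asks for a resize at `id == 31` although bbox[31] is a valid slot, whereas the new check waits until `id == 32`.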
I've changed the initial buffer allocation to use MEM_mallocN() instead of MEM_callocN(). The difference is that malloc() does not zero the buffer when it is allocated. I've checked the code where new bbox records are created, and it does not rely on the buffer being initialised with zeros.
Anyway, isn't calloc() safer than using malloc()? Well no, it's actually the opposite in this case. Every time the buffer size is increased, it is done using realloc(), and this does not zero out the uninitialised portion of the buffer. So the code would break if it was modified to assume that the buffer contains zeros. Hence I believe initialising the buffer using calloc() could be misleading to a new developer.
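For contrast, here is what code would have to do if it did rely on zeroed memory after growing. This is a sketch using the standard realloc() rather than Blender's MEM_* allocator, and `grow_zeroed` is a hypothetical helper; the patch itself takes the opposite route and simply does not depend on zeros.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Grow `buf` from `old_count` to `new_count` ints and zero the new tail.
 * Returns NULL on allocation failure, leaving the old buffer untouched. */
static int *grow_zeroed(int *buf, size_t old_count, size_t new_count)
{
  assert(new_count > 0 && new_count >= old_count);
  int *grown = realloc(buf, new_count * sizeof(int));
  if (grown == NULL) {
    return NULL;
  }
  /* realloc() leaves the added region indeterminate, so clear it explicitly. */
  memset(grown + old_count, 0, (new_count - old_count) * sizeof(int));
  return grown;
}
```

Without the explicit memset(), only the first allocation made with calloc() would be zeroed, and every later growth step would expose uninitialised entries.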
Won't this result in increased memory usage? Yes, if you have millions of objects in your scene, then you are potentially using up to twice the memory for the shadow caster. (However, if you have millions of objects in your scene, you're probably finding the Eevee render times a bit slow.)
Note that once the render gets going the frontbuffer bbox/bitmap will be shrunk to a multiple of SH_CASTER_ALLOC_CHUNK (32), therefore releasing the overallocation of memory.
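The shrink target described above amounts to rounding the used count up to a multiple of the chunk size. A minimal sketch, assuming plain round-up semantics; `round_up_to_chunk` is a hypothetical helper, and the real code may clamp differently (for example to at least one chunk).

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK 32 /* Mirrors SH_CASTER_ALLOC_CHUNK from the message. */

/* Smallest multiple of CHUNK that can hold `count` entries. */
static size_t round_up_to_chunk(size_t count)
{
  assert(count <= (size_t)-1 - (CHUNK - 1)); /* Guard against overflow. */
  return ((count + CHUNK - 1) / CHUNK) * CHUNK;
}
```

A buffer that was doubled to 2048 entries but holds only 1000 objects would shrink to `round_up_to_chunk(1000)`, which is 1024.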
As observed in Visual Studio - this appears to be prior to peak memory usage anyway.
Note this shrinking is executed in EEVEE_shadows_update(), during the first render sample pass. If necessary you could consider shrinking the buffer immediately after EEVEE_shadows_caster_register() has done its work. (Note however that it appears you would need to add that function call in multiple places.)
Anyway as per the bug report I raised, I observed a 5% increase in peak-memory. And I'm unclear whether this difference in memory is due to me running the debug build. (It could be that there is no difference because of the shrinking.)
I couldn't figure out how the shadow caster backbuffer works. I see that EEVEE_shadows_init() has an explicit command to swap the front/back buffers. However this is done only when the buffers are first initialised and there is nothing in there yet. In my testing, backbuffer->count was always zero and EEVEE_shadows_update() never did anything with the backbuffer.
Finally, this problem is most evident when using Geometry Nodes or a Particle System to instantiate many objects. Objects created through, say, the array modifier do not cause any issues because the result is considered one object by the shadow caster.
Reviewed By: #eevee_viewport, fclem
Differential Revision: https://developer.blender.org/D10631
|
|
|
|
|
|
Follow our code style for doxygen sections.
|
|
|
|
Following the most widely used convention for including TODOs in
the code, that is: `TODO(name):`, `FIXME(name)` ... etc.
|
|
This will add the remaining static shaders to the eevee shader test suite.
- Downsampling
- GGX LUT generation
- Mist
- Motion Blur
- Ambient Occlusion
- Render Passes
- Screen Raytracing
- Shadows
- Subsurface
- Volumes
Reviewed By: Clément Foucault
Differential Revision: https://developer.blender.org/D8779
|
|
This is to make it easier to navigate captures in RenderDoc.
|
|
This follows the GPU module naming of other buffers.
We pass a name to distinguish each GPUUniformBuf in debug mode.
Also remove DRW_uniform_buffer interface.
|
|
|
|
- add the use of DRWShaderLibrary to EEVEE's glsl codebase to reduce code
complexity and duplication.
- split bsdf_common_lib.glsl into multiple sub-libraries which are now shared
with other engines.
- the surface shader code is now more organised and has its own files.
- change default world to use a material nodetree and make lookdev shader
clearer.
Reviewed By: jbakker
Differential Revision: https://developer.blender.org/D8306
|
|
This avoids having a much higher memory footprint, as the underlying texture
size allocated by the driver is likely to be much higher (rounded to the next
power of 2 or other alignment requirements).
|
|
This was caused by a missing DRWPass initialization.
Now we create the passes for every timestep but avoid clearing the
buffer after the first sample.
|
|
These are the modifications:
- With DRW modification we reduce the number of passes we need to populate.
- Rename passes for consistent naming.
- Reduce complexity in code compilation.
- Cleanup how renderpass accumulation passes are setup, using pass instances.
- Make sculpt mode compatible with shadows.
- Make hair passes compatible with SSS.
- Error shader and lookdev materials now use standalone materials.
- Support default shader (world and material) using a default nodetree internally.
- Change BLEND_CLIP to be emulated by gpu nodetree, making fewer shader variations.
- Use BLI_memblock for cache memory allocation.
- Renderpasses are handled by switching a UBO ref bind.
One major hack in this patch is the use of modified pointers as ghash keys.
This relies on the assumption that the keys will never overlap because the
number of options per key will never be bigger than the pointed struct.
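The pointer-key trick can be sketched as follows. This is an illustration only: `Material`, `KEY_OPTION_MAX` and `key_with_option` are hypothetical names, and the struct size is arbitrary. The point is that offsetting a pointer by an option index stays inside the pointed struct, so keys derived from two different structs cannot collide.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct whose address is used as the base of the hash key. */
typedef struct Material {
  char data[64];
} Material;

enum { KEY_OPTION_MAX = 8 }; /* Hypothetical number of options per key. */

/* The assumption from the message: fewer options than bytes in the struct. */
_Static_assert(KEY_OPTION_MAX <= sizeof(Material), "option keys would overlap");

/* Build a unique key by offsetting the struct address by the option index.
 * The result is only compared or hashed, never dereferenced. */
static const void *key_with_option(const Material *ma, unsigned option)
{
  assert(option < KEY_OPTION_MAX);
  return (const char *)ma + option;
}
```

Because every derived key points inside its own struct, the highest key of one material is still below the base address of the next one in memory.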
The use of one single nodetree to support the default material is also a bit hacky
since it won't support concurrent usage of this nodetree.
(see EEVEE_shader_default_surface_nodetree)
Another change is that objects with shader errors now appear solid magenta instead
of shaded magenta. This is only a side effect of code reuse and could be changed
if really needed.
Reviewed By: jbakker
Differential Revision: https://developer.blender.org/D7642
|
|
|
|
This patch adds new render passes to EEVEE. These passes include:
* Emission
* Diffuse Light
* Diffuse Color
* Glossy Light
* Glossy Color
* Environment
* Volume Scattering
* Volume Transmission
* Bloom
* Shadow
With these passes it will be possible to use EEVEE effectively for
compositing. During development we kept a close eye on how to get similar
results compared to Cycles render passes. There are some differences that
are related to how EEVEE works. For EEVEE we combined the passes to
`Diffuse` and `Specular`. There are no transmittance or SSS passes anymore.
Cycles will be changed accordingly.
Cycles volume transmittance is added to multiple surface color passes. For
EEVEE we left the volume transmittance as a separate pass.
Known Limitations
* All materials that use alpha blending will not be rendered in the render
passes. Other transparency modes are supported.
* More GPU memory is required to store the render passes. When rendering
an HD image with all render passes enabled, up to 570MB of extra GPU memory is
required.
Implementation Details
An overview of render passes is given in
https://wiki.blender.org/wiki/Source/Render/EEVEE/RenderPasses
Future Developments
* In this implementation the materials are re-rendered for Diffuse/Glossy
and Emission passes. We could use multi target rendering to improve the
render speed.
* Other passes can be added later
* Don't render material based passes when only requesting AO or Shadow.
* Add more passes to the system. These could include Cryptomatte, AOVs, Vector,
ObjectID, MaterialID, UV.
Reviewed By: Clément Foucault
Differential Revision: https://developer.blender.org/D6331
|
|
Also fixes the sampling of hashed shadows.
|
|
|
|
|
|
When the result isn't used, prefer post increment/decrement
(already used nearly everywhere in Blender).
|
|
Reviewed By: brecht
Differential Revision: http://developer.blender.org/D5659
|