Age | Commit message (Collapse) | Author |
|
|
|
This changes the sampling routine to use the method described in
"A Simpler and Exact Sampling Routine for the GGXDistribution of Visible Normals"
by Eric Heitz.
http://jcgt.org/published/0007/04/01/slides.pdf
This avoids generating bad rays and thus improve noise level in screen-
space reflections / refraction.
|
|
This changes the hitBuffer to store `ReflectionDir * HitTime, invPdf`
just as the reference presentation.
This avoids issues when the hit refinement produce a coordinate that
does not land on the correct surface.
We now store the pdf in the same texture and store it inversed so we can
remove some ALU from the resolve shader.
This also rewrite the resolve shader to not be vectorized to improve
readability and scalability.
|
|
|
|
No functional changes.
This function replaces some of the logic in
`DRW_select_buffer_find_nearest_to_point` that traverses a buffer in a
spiral way to search for a closer pixel (not the closest).
Differential Revision: https://developer.blender.org/D10548
|
|
with many objects
When you have many distinct objects, in an Eevee render then the shadow caster gets exponentially slower as the number of (distinct) objects increase.
This is because of the way that frontbuffer->bbox (EEVEE_BoundBox array) and the associated frontbuffer->update bitmap are resized.
Currently the resizing is done by reserving space for SH_CASTER_ALLOC_CHUNK (32) objects at a time.
When the number of objects is large, then the MEM_reallocN() gets progressively slower because it must memcpy the entire bbox/bitmap data to the new memory chunk.
And there will be a lot of *memcpy* operations for a large scene.
(Obviously there are a significant number of memory allocations/deallocations too - though this would be linear performance.)
I've switched to doubling the frontbuffer->alloc_count (buffer capacity) instead of adding SH_CASTER_ALLOC_CHUNK (32). As I understand this is the only way to eliminate exponential slowdown. Just increasing the size of SH_CASTER_ALLOC_CHUNK would still result in exponential slowdown eventually.
In other changes, the "+ 1" in this expression is not necessary.
if (id + 1 >= frontbuffer->alloc_count)
The buffer is 0-based. So when the buffer is initially allocated then id values from bbox[0] to bbox[31] are valid. Hence when frontbuffer->count == frontbuffer->alloc_count, is when the resizing should be triggered.
As it stands the "+ 1" results in resizing the buffer, when there is still capacity for one more object in the buffer.
I've changed the initial buffer allocation to use MEM_mallocN() instead of MEM_callocN(). The difference is that malloc() doesn't memset buffer (with zeros) when allocated. I've checked the code where new bbox records are created, and it does not rely on the buffer being initialised with zeros.
Anyway, isn't calloc() safer than using malloc()? Well no, it's actually the opposite in this case. Every time the buffer size is increased, it is done using realloc(), and this does not zero-out the uniniitialised portion of the buffer. So the code would break if it was modified to assume that the buffer contains zeros. Hence I believe initialising the buffer using calloc() could be misleading to a new developer.
Won't this result in increased memory usage? Yes, if you have millions of objects in your scene, then you are potentially using up-to twice the memory for the shadow caster. (However if you have millions of objects in your scene you're probably finding the Eevee render times a slow.)
Note that once the render gets going the frontbuffer bbox/bitmap will be shrunk to a multiple of SH_CASTER_ALLOC_CHUNK (32), therefore releasing the overallocation of memory.
As observed in Visual Studio - this appears to be prior to peak memory usage anyway.
Note this shrinking is executed in EEVEE_shadows_update() - during the first render sample pass. If necessary you could consider shrinking the buffer immediately after the EEVEE_shadows_caster_register() has done it's work. (Note however it appears you would need to add that function call is multiple places.)
Anyway as per the bug report I raised, I observed a 5% increase in peak-memory. And I'm unclear whether this difference in memory is due to me running the debug build. (It could be that there is no difference because of the shrinking.)
I couldn't figure out how the shadow caster backbuffer works. I see that EEVEE_shadows_init() has an explicit command to swap the front/back buffers. However this is done only when the buffers are first initialised and there is nothing in there yet. In my testing, the backbuffer->count was always zero, EEVEE_shadows_update() never did anything with the backbuffer.
Finally this problem is most evident when using Geometry Nodes or a Particle System to instantiate many objects. Objects created through say the array modifier do not cause any issues because it is considered one object by the shadow caster.
Reviewed By: #eevee_viewport, fclem
Differential Revision: https://developer.blender.org/D10631
|
|
This was due to the AO random sampling using the same "seed" as
the AA jitter. Decorelating the noise fixes the issue.
|
|
This just bypass the occlusion computation if there is no occlusion
data. This avoids weird looking occlusion due to the screen space
geometric normal reconstruction.
|
|
The sampling is now optimum with every samples being at least one pixel
appart. Also use a squared repartition to improve the sampling near the
center.
This also removes the thickness heuristic since it seems to remove
a lot of details and bias the AO too much.
|
|
This is useful for debugging raycasting.
|
|
This is a major rewrite that improves the screen space raytracing
a little bit.
This also decouple ray preparation from raytracing to be reuse in other
part of the code.
This changes a few things:
- Reflections have lower grazing angle failure
- Reflections have less self intersection issues
- Contact shadows are now fully opaque (faster)
Unrelated but some self intersection / incorrect bad rays are caused by
the ray reconstruction technique used by the SSR. This is not fixed by
this commit but I added a TODO.
|
|
This removes the need for per mipmap scalling factor and trilinear interpolation
issues. We pad the texture so that all mipmaps have pixels in the next mip.
This simplifies the downsampling shader too.
This also change the SSR radiance buffer as well in the same fashion.
|
|
See comment in code for more details.
Differential Revision: https://developer.blender.org/D10615
|
|
The environment (world) irradiance wasn't correctly skipped.
|
|
The SSS shader in Eevee has the following drawbacks (elaborated in {T79933}):
1. Glowing
2. Ringing. On low SSS jittering it is rendered a bunch of sharp lines
3. Overall blurriness due to the nature of the effect
4. Shadows near occlusions as in T65849
5. Too much SSS near the edge and on highly-tilted surfaces
{F9438636}
{F9427302}
In the original shader code there was a depth correction factor, as far as I can understand for fixing light bleeding from one object to another. But it was scaled incorrectly. I modified its scale to depend on SSS scale*radius and made it independent from the scene scale. The scale parameter (`-4`) is chosen so that it makes tilted surfaces to have visually the same SSS radius as straight surfaces (surfaces with normal pointed directly to the camera).
This depth correction factor alone fixes all the problems except for ringing (pt. 2). Because of float-point precision errors and irradiance interpolation some samples near the border of an object might leak light, causing sparkly or dashed (because of aliasing) patterns around the highlights. Switching from `texture()` to `texelFetch()` fixes this problem and makes textures on renders visually sharper.
An alternative solution would be to detect object borders and somehow prevent samples from crossing it. This can be done by:
1. Adding an `object_id` texture. I think it requires much more code changing and makes the shader more complicated. Again, `object_id` is not interpolatable.
2. Watch gradient of depth and discard samples if the gradient is too big. This solution depends on scene scale and requires more texture lookups. Since SSS is usually a minor effect, it probably doesn't require that level of accuracy.
I haven't notice it in practice, but I assume it can make visible SSS radius slightly off (up to 0.5 px in screen space, which is negligible). It is completely mitigated with render sampling.
Reviewed By: Clément Foucault
Differential Revision: https://developer.blender.org/D9740
|
|
This improve self intersection prevention. Also reduce the bias
that was making a lot of rays not being shoot at grazing angles.
|
|
This is ported from Cycles and fixes issues with bump/normal mapping
giving weird reflections/lighting.
Fixes T81070 Specular light should be limited to normal pointing toward the camera
Fixes T78501 Normal mapping making specular artifact
|
|
Also use doxygen comments for sculpt functions.
|
|
The shadowing was computed on the light distance squared,
leaking to much light since it was integrating the extinction behind
the ligth itself.
Also bump the maximum shadow max step to the actual UI values. Otherwise
we get shadowing under evaluated because `dd` is too small.
|
|
missing NULL check if there is no cache there to begin with.
Differential Revision: https://developer.blender.org/D10581
|
|
The performance debug menu isn't used that often anymore as render doc
also show the timings. This patch will make sure that enabling the
performance debug view (21) does not crash blender.
|
|
Bad copy paste...
|
|
This was caused by a missing version check.
|
|
This fixes the issue of bokeh size being smaller when using overblur.
The additional overblur needs to be centered on the outer radius.
|
|
Cryptomatte layers in Blender are predefined. Other render engines
might have other naming schemes. This patch will allow creation of
cryptomatte layers with other names. This will be used by D3959 to
load cryptomatte openexr files from other render engines.
EEVEE and Cycles still use our fix naming scheme so no changes are
detectable by users.
|
|
Was caused by recent changes in window_translate_m4 by rBbb2af40ec7dd.
|
|
|
|
This was creating incorrectly occluded overlays.
|
|
Current implementation was to restricting for future enhancements where
the CryptomatterLayer could be read from existing metadata.
|
|
With the previous implementation, we could have pixels with offset larger
than 1 pixel.
Also fix a bug when the closest_index is not last. The sample positions
were incorrect in this case.
|
|
This was caused by the window_translate_m4 not offsetting the winmat in the
right direction for perspective view. Thus leading to incorrect weights.
The workbench sample weight computation was also inverted.
This fix will change the sampling pattern for EEVEE too (it will just
mirror it in perspective view).
|
|
|
|
There was a similar report prior to the introduction of the Image
Engine, see T74586.
This was fixed by rB6a5bd812b569 at that time, but got lost in the
refactor it seems.
Above commit introduced the `ED_space_image_get_display_channel_mask`
function that will determine
the valid bitflags for the display channel of a given ImBuf.
But since the refactor, this is not called anymore (`draw_image_main` is
not called anymore)
Now it seems we can safely reuse that said function
`ED_space_image_get_display_channel_mask` also for the Image Engine.
Maniphest Tasks: T85895
Differential Revision: https://developer.blender.org/D10510
|
|
Taa offset was applied on first sample. This wasn't happening before
DOF refactor.
Also fixes T85618 Wireframe Displays Strangely in Eevee (Rendered,
material Preview)
|
|
Contact shadows needed correct `gl_FragCoord.z` but this is not
correctly set for fullscreen passes. Need to pass depth using a global
variable until we get rid of `cl_eval.tracing_depth`.
|
|
Was caused by non initialized render_timesteps.
|
|
Was caused by non initialized render_timesteps.
|
|
.. it's a reserved keyword on GL > 4.0.
|
|
Aperture for a cone is 2 times the cone angle.
|
|
... for local variables.
|
|
This removes the last places where this was not the case.
We follow cycles convention of P being for Postion.
|
|
This makes is clearer and avoid having to setup worldPosition if
shader is not a material shader.
|
|
This adds an approximation of inverted AO by reversing the max horizon
search (becoming a min horizon). The horizons are correctly clamped in
the reverse direction to the shading and geometric normals.
The arc integration is untouched as it seems to be symetrical.
The limitation of this technique is that since it is still screen-space
AO you don't get other hidden surfaces occlusion. This is more
problematic in the case of inverted AO than for normal AO but it's
better than no support AO.
Support of distance parameter was easy thanks to recent AO refactor.
|
|
Nice format to output high definition normals or normalized colors.
|
|
Fix regression with roughness not masking reflections when not using
Screen Space raytracing.
The trick was to only evaluate one planar per pixel, the one with
the most influence. This should not be too limiting since this is what
we do for SSR.
Also change evaluation order do not apply occlusion on planars probes.
|
|
AO is always on in this case.
|
|
- Fix noise/banding artifact on distant geometry.
- Fix overshadowing on un-occluded surfaces at grazing angle producing "fresnel"
like shadowing. Some of it still appears but this is caused to the low number
of horizons per pixel.
- Improve performance by using a fixed number of samples and fixing the
sampling area size. A better sampling pattern is planned to recover
the lost precision on large AO radius.
- Improved normal reconstruction for the AO pass.
- Improve Bent Normal reconstruction resulting in less faceted look on
smoothed geometry.
- Add Thickness heuristic to avoid overshadowing of thin objects.
Factor is currently hardcoded.
- Add bent normal support to Glossy reflections.
- Change Glossy occlusion to give less light leaks from lightprobes.
It can overshadow on smooth surface but this should be mitigated by
using SSR.
- Use Bent Normal for rough Glossy surfaces.
- Occlusion is now correctly evaluated for each BSDF. However this does make
everything slower. This is mitigated by the fact the search is a lot faster
than before.
|
|
|
|
As now is possible to use multiframe in Draw mode, the option to display only lines must be disabled.
|
|
The problem was introduced fixing task T85035.
As the frame was set again when render, if there was a time modifier, the frame was not remaped to the right frame number.
|