Age | Commit message | Author |
|
Still work in progress.
|
|
This reduces the complexity and avoids framebuffer setup costs.
This also "removes" the prefiltering of the glossy cubemaps in favor
of a simple bilinear filtering of the mipchain.
|
|
This better matches what hardware raytracing will be doing. Performance
is also more predictable.
|
|
This drastically changes the implementation to leverage arbitrary writes
in order to reduce complexity and memory usage and to increase speed.
Since we are no longer dependent on the framebuffer requirement, we can
allocate a bigger texture that fits all views and avoid the extra.
Transparency, holdout and emissions are no longer deferred and are now
composited using dual source blending.
The indirect lighting and raytracing are still not functional but will
also get a large refactor of their own.
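The dual source compositing can be illustrated on the CPU. This is only a sketch: it assumes the blend state is equivalent to `dst = radiance + transmittance * dst` (what blend factors ONE and SRC1_COLOR would give), which this message does not state. `Color3` and `blend_dual_source` are hypothetical names, not Blender API.

```cpp
#include <cassert>

// Hypothetical RGB triple, not a Blender type.
struct Color3 {
  float r, g, b;
};

// Emulates one dual source blend operation. The fragment shader outputs two
// colors: `radiance` (source 0) and `transmittance` (source 1). The blender
// then computes dst = radiance + transmittance * dst.
// Assumption: blend factors (ONE, SRC1_COLOR); the commit does not say.
inline Color3 blend_dual_source(Color3 dst, Color3 radiance, Color3 transmittance)
{
  assert(transmittance.r >= 0.0f && transmittance.g >= 0.0f && transmittance.b >= 0.0f);
  return {radiance.r + transmittance.r * dst.r,
          radiance.g + transmittance.g * dst.g,
          radiance.b + transmittance.b * dst.b};
}
```

With this equation, an opaque emissive surface is the case where `transmittance` is zero, and a colored transparent surface tints what is already in the buffer per channel.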
|
|
|
|
|
|
# Conflicts:
# source/blender/draw/engines/eevee/eevee_bloom.c
# source/blender/draw/engines/eevee/eevee_cryptomatte.c
# source/blender/draw/engines/eevee/eevee_data.c
# source/blender/draw/engines/eevee/eevee_depth_of_field.c
# source/blender/draw/engines/eevee/eevee_effects.c
# source/blender/draw/engines/eevee/eevee_engine.c
# source/blender/draw/engines/eevee/eevee_lightcache.c
# source/blender/draw/engines/eevee/eevee_lightprobes.c
# source/blender/draw/engines/eevee/eevee_lights.c
# source/blender/draw/engines/eevee/eevee_lookdev.c
# source/blender/draw/engines/eevee/eevee_lut_gen.c
# source/blender/draw/engines/eevee/eevee_materials.c
# source/blender/draw/engines/eevee/eevee_mist.c
# source/blender/draw/engines/eevee/eevee_motion_blur.c
# source/blender/draw/engines/eevee/eevee_occlusion.c
# source/blender/draw/engines/eevee/eevee_private.h
# source/blender/draw/engines/eevee/eevee_render.c
# source/blender/draw/engines/eevee/eevee_renderpasses.c
# source/blender/draw/engines/eevee/eevee_sampling.c
# source/blender/draw/engines/eevee/eevee_screen_raytrace.c
# source/blender/draw/engines/eevee/eevee_shaders.c
# source/blender/draw/engines/eevee/eevee_shadows.c
# source/blender/draw/engines/eevee/eevee_shadows_cascade.c
# source/blender/draw/engines/eevee/eevee_shadows_cube.c
# source/blender/draw/engines/eevee/eevee_subsurface.c
# source/blender/draw/engines/eevee/eevee_temporal_sampling.c
# source/blender/draw/engines/eevee/eevee_volumes.c
# source/blender/gpu/intern/gpu_codegen.c
# source/blender/gpu/intern/gpu_material_library.c
# source/blender/gpu/opengl/gl_compute.cc
# source/blender/makesrna/intern/rna_material.c
|
|
|
|
|
|
|
|
The defrag shader makes sure the free heap is free of holes, making
the allocation more straightforward.
Since we now only reference the pages using the tiles, we introduce
a debug shader that produces an image with page data in a visual way.
This replaces the debug 8 option.
This also fixes some bugs that were still present in the pipeline.
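A serial sketch of what defragmenting the free heap means, for illustration only. The real work happens in a compute shader; the `PAGE_NONE` sentinel, the flat `std::vector` layout and both function names are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical sentinel marking a hole (no page) in the free heap.
constexpr uint32_t PAGE_NONE = 0xFFFFFFFFu;

// Compacts the heap in place so that all valid page indices are contiguous
// at the front, preserving their order. Returns the number of free pages.
inline uint32_t defrag_free_heap(std::vector<uint32_t> &heap)
{
  uint32_t write = 0;
  for (uint32_t read = 0; read < heap.size(); read++) {
    if (heap[read] != PAGE_NONE) {
      heap[write++] = heap[read];
    }
  }
  for (uint32_t i = write; i < heap.size(); i++) {
    heap[i] = PAGE_NONE;
  }
  return write;
}

// Once the heap has no holes, allocating is just popping the last valid entry.
inline uint32_t page_alloc(std::vector<uint32_t> &heap, uint32_t &free_count)
{
  assert(free_count > 0);
  const uint32_t page = heap[--free_count];
  heap[free_count] = PAGE_NONE;
  return page;
}
```

Without holes, the allocator never has to search for a valid entry, which is the simplification this commit is after.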
|
|
This fixes an issue with tile reuse causing corruption.
|
|
We now scan the depth buffer after the prepass to tag the needed
shadow tiles.
This is much more precise than the bound box tagging which is now
reserved for transparent objects.
This also:
- fixes the pixel radius size.
- adds a dedicated info buffer to avoid having one unused tile.
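The depth buffer scan can be sketched on the CPU as follows. Assumptions for this example only: a depth of 1.0 means background, the tilemap is 16x16, and `to_light_uv` is a hypothetical stand-in for unprojecting a pixel to world space and projecting it into the light's tilemap. None of these names come from the actual code.

```cpp
#include <array>
#include <cassert>
#include <functional>
#include <vector>

constexpr int TILEMAP_RES = 16;  // Assumed tilemap resolution (16x16 tiles).

struct Float2 {
  float x, y;
};

using TileMask = std::array<bool, TILEMAP_RES * TILEMAP_RES>;

// Scans every pixel of the depth buffer and tags the shadow tile it falls in.
// `to_light_uv` (hypothetical) maps a pixel and its depth to tilemap UVs,
// where [0, 1) is inside the tilemap.
inline TileMask tag_tiles_from_depth(
    const std::vector<float> &depth,
    int width,
    int height,
    const std::function<Float2(int x, int y, float depth)> &to_light_uv)
{
  assert(int(depth.size()) == width * height);
  TileMask tagged{};  // All tiles start untagged.
  for (int y = 0; y < height; y++) {
    for (int x = 0; x < width; x++) {
      const float d = depth[y * width + x];
      if (d >= 1.0f) {
        continue;  // Background pixel: nothing to shadow.
      }
      const Float2 uv = to_light_uv(x, y, d);
      if (uv.x < 0.0f || uv.x >= 1.0f || uv.y < 0.0f || uv.y >= 1.0f) {
        continue;  // Outside of this tilemap.
      }
      const int tx = int(uv.x * TILEMAP_RES);
      const int ty = int(uv.y * TILEMAP_RES);
      tagged[ty * TILEMAP_RES + tx] = true;
    }
  }
  return tagged;
}
```

Only tiles that receive at least one visible pixel get tagged, which is why this is tighter than tagging from bounding boxes.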
|
|
This removes the light count limit for forward shaded objects. This
also provides a more efficient way of computing the culling directly on
the GPU. Moreover, this avoids doing multiple lighting passes for high
light counts in the deferred pipeline, improving performance.
|
|
This continues the effort to implement virtual shadow mapping.
This includes:
- Spot cone culling of tiles.
- Tile vs. view frustum tagging.
- Shadowmap page allocation / freeing.
- Rendering to a 4K buffer only the tiles that need it.
- Copying to the shadow atlas.
|
|
This is a total refactor of how shadows are handled.
We use virtual shadow maps with different levels of detail to
ensure a somewhat evenly distributed precision.
The shadow test is really crude for now and will be
improved in a later commit.
There is a pool of 4096 tilemaps that are distributed between
shadowed lights. These tilemaps are 16x16 each and reference
shadow map pages that are allocated in an atlas. Pages are only
allocated if needed (i.e. visible for rendering an object).
Page management is done on the GPU using compute shaders to reduce
CPU work.
On the CPU, only one draw pass per updated tilemap is issued.
This reduces the memory requirement of shadow mapping large scenes
with many lights.
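The tilemap to page indirection can be sketched like this. It is a CPU-side illustration under assumptions: the real page management runs in compute shaders, and `Tilemap`, `PageAtlas`, `ensure_page`, `release_page` and `PAGE_NONE` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

constexpr int TILEMAP_RES = 16;              // Tilemaps are 16x16 tiles.
constexpr uint32_t PAGE_NONE = 0xFFFFFFFFu;  // Hypothetical "no page" marker.

// One tilemap: each tile references a page in the atlas, or nothing.
struct Tilemap {
  std::array<uint32_t, TILEMAP_RES * TILEMAP_RES> pages;
  Tilemap() { pages.fill(PAGE_NONE); }
};

// Atlas of shadow map pages shared by all tilemaps.
struct PageAtlas {
  std::vector<uint32_t> free_pages;

  explicit PageAtlas(uint32_t page_count)
  {
    // Push in reverse so that page 0 is handed out first.
    for (uint32_t i = page_count; i > 0; i--) {
      free_pages.push_back(i - 1);
    }
  }

  // Allocates a page for a tile only if it has none yet.
  // Returns false when the atlas is exhausted.
  bool ensure_page(Tilemap &tilemap, int tx, int ty)
  {
    assert(tx >= 0 && tx < TILEMAP_RES && ty >= 0 && ty < TILEMAP_RES);
    uint32_t &entry = tilemap.pages[ty * TILEMAP_RES + tx];
    if (entry != PAGE_NONE) {
      return true;
    }
    if (free_pages.empty()) {
      return false;
    }
    entry = free_pages.back();
    free_pages.pop_back();
    return true;
  }

  // Returns the page of a tile that is no longer needed to the atlas.
  void release_page(Tilemap &tilemap, int tx, int ty)
  {
    assert(tx >= 0 && tx < TILEMAP_RES && ty >= 0 && ty < TILEMAP_RES);
    uint32_t &entry = tilemap.pages[ty * TILEMAP_RES + tx];
    if (entry == PAGE_NONE) {
      return;
    }
    free_pages.push_back(entry);
    entry = PAGE_NONE;
  }
};
```

Memory use then scales with the number of tiles that are actually needed rather than with the number of lights times the full shadow map resolution.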
|
|
Denoising makes use of more memory to store and reproject the result of
the previous frame to reduce noise. This only works in the viewport.
There is a final bilateral filter for cleaning up noise even more.
|
|
This simply reuses the reflection raytracing pipeline but with another
ray distribution. Only direct lighting, distant lighting and emissive
light are visible to diffuse rays.
Subsurface effect is not visible but transmittance effect is visible
to diffuse rays.
Indirect diffuse light is processed by the SSS filter.
|
|
The new pipeline is now cleaner and allows for deferred refraction.
The refractions are more accurate but are not denoised for now. More
research needs to be done in this area.
There is no feedback buffer for now, so reflections of metallic surfaces
will appear black.
The same restriction on refractive materials still holds true. They will
not appear in screen space tracing of other non refractive surfaces.
However, refractive surfaces (non-blended) can now reflect themselves
and the other surfaces with screen space reflections.
Half res tracing has not been reimplemented yet.
|
|
Pretty much identical.
Texture format is now always `GPU_R32F` to remove some workarounds.
|
|
This new implementation follows the technique described in
"Efficient screen space subsurface scattering Siggraph 2018".
Compared to the old implementation it fixes a lot of issues at
the cost of it being slower. This fixes:
- Light leaking between different objects.
- Light leaking between different surfaces with different depths.
- SSS radii are now "texturable" per pixel. No SSS surfaces limits.
- Noise should be lower.
- Precomputation is only done once for all SSS surfaces which lowers the
per material storage and precomputation time.
Implementation is also simpler as it is only a one pass processing.
We differ from the reference presentation by not precomputing the
RGB weights per samples. We actually compute them on the fly in order
to support varying SSS radii.
Notes:
- SSS IOR and SSS anisotropy are not supported.
- Object level light leak prevention might not work for a high number of
objects in the scene (> 1024). In this case light leaks might occur.
Adding or deleting (hiding) objects in the scene might change which
objects can leak.
|
|
This was the cause of a bug on Intel integrated GPUs and might as well
impact other platforms.
|
|
Nothing much different compared to the previous implementation.
The transparent BSDF and principled BSDF now detect when the material
is potentially transparent to select the best way to render it.
|
|
This does not include reference spheres rendering.
The approach is a bit different than before.
Now we use a `bNodeTree` to control the rendering of lookdev. This
generates a `GPUMaterial` that is stored per `Instance`. This way
rendering lookdev is just updating the temp light cache using this
material as world material, removing the need for a custom shader.
This introduces a small hack in order to bind the studiolight hdri after
the nodetree glsl parsing.
The background display however is still using a custom shader in order
to sample the world cubemap with different roughness.
The view space option of the studiolight is now faster by using a
transform before shading instead of rebaking the lightprobe constantly.
This should not have any particular impact on render time.
|
|
Nothing significantly different apart from code style.
|
|
Not much change apart from code organization and structure.
|
|
|
|
Only for background for now.
Support no longer relies on defines and just uses the correct globals and
uniforms to keep the same values as before.
|
|
This adds support for rendering gpencil objects.
There are a lot of features left to implement, especially the ones requiring
per object uniforms.
|
|
This is the first step towards the new evaluation scheme of EEVEE
closures.
This commit contains:
- Removal of the GPU_SOURCE_BUILTIN type, preferring globals instead. This
avoids a lot of boilerplate code since most of the old builtins are now
data that is always present (e.g. view matrices, normals).
- Rewriting of codegen in C++ to use `std::stringstream`.
- Added a callback to let the engine decide what to do with the codegen code.
This removes much of the need for defines caused by code order
dependency. The engine can insert the nodetree code in custom ways
to create advanced effects (e.g. adding displacement or vertex lighting).
The engine now returns the final shader strings.
- Closure node evaluation replacement is a placeholder for now.
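To illustrate the stringstream based codegen and the engine callback idea, here is a minimal sketch. Everything in it is hypothetical (`NodeCall`, `codegen_calls`, `CodegenCallback`, `engine_surface_shader`, and the generated GLSL names); it only shows the shape of the approach, not the actual GPU module interface.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical description of one node function call in the nodetree.
struct NodeCall {
  std::string function;
  std::vector<std::string> args;
};

// Serializes the calls into GLSL statements using a stringstream.
inline std::string codegen_calls(const std::vector<NodeCall> &calls)
{
  std::stringstream ss;
  for (const NodeCall &call : calls) {
    assert(!call.function.empty());
    ss << "  " << call.function << "(";
    for (std::size_t i = 0; i < call.args.size(); i++) {
      ss << (i ? ", " : "") << call.args[i];
    }
    ss << ");\n";
  }
  return ss.str();
}

// Hypothetical callback type: the engine receives the generated nodetree code
// and returns the final shader string.
using CodegenCallback = std::string (*)(const std::string &nodetree_code);

// Example engine callback that wraps the nodetree code in its own function,
// so the engine controls where the code ends up in the final shader.
inline std::string engine_surface_shader(const std::string &nodetree_code)
{
  std::stringstream ss;
  ss << "void nodetree_surface()\n{\n" << nodetree_code << "}\n";
  return ss.str();
}
```

Because the engine assembles the final string itself, the order of the code no longer has to be steered through preprocessor defines.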
|
|
This is a port of the old material grouping. This is a bit
cleaner as we use containers for each pass and other structures.
The nodetree is generated without major errors for simple materials but
it is not yet used as closures are not output.
|
|
This adds the transparency and volume handling in the deferred
render pipeline.
Implementation is still unfinished.
To have better naming convention, I renamed object shader to surface.
|
|
This introduces a fat Gbuffer layout that groups closure data in groups
of similar BSDF. The goal is to have at least one sample for each
group to avoid too much code complexity and expected worse performance.
There is a lot of room for buffer reuse to reduce memory usage but it is
not considered a priority for now.
|
|
Difference with previous implementation:
- Better texture space usage of cone and area light shadow.
- Shadows are packed in an atlas, reducing requirements for future
features.
- Sampling is simpler because shadow matrix does everything.
|
|
I did a small optimization pass to avoid some division and
redundant computation.
Also cleans up the Light vector usage.
|
|
This follows closely the implementation of 2.5D tiled light
culling described in the presentation:
"Improved Culling for Tiled and Clustered Rendering"
from Michal Drobot
http://advances.realtimerendering.com/s2017/2017_Sig_Improved_Culling_final.pdf
I chose the tile + Z binning approach for its high depth range support
and low CPU overhead & low memory consumption compared to the cluster
based culling. The downside is that the culling is a bit less precise in
some aspects but it is quite balanced.
The culling is done by the `Culling` object which is templated to easily
be reused for light probes culling.
The Z-binning process is described starting from slide 20 in the
reference pdf.
I also implemented a debug pass to visualize false negatives (lights
culled when they shouldn't be) and light evaluation density.
This is useful to detect failure cases and hotspots. This could be exposed
as a developer only render pass in the future.
Some optimizations of the reference implementation require extensions
not yet added to the GPU module and will be added later.
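A rough sketch of the Z-binning step, as I understand it from the referenced slides. Lights are assumed to be already sorted by depth, and each bin stores the first and last light index overlapping its depth slice. `LightRange`, `ZBin` and `build_zbins` are hypothetical names and the uniform bin distribution is an assumption of this example, not necessarily what the `Culling` object does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical view-space depth extent of one light.
struct LightRange {
  float z_min, z_max;
};

// Index range of the lights overlapping one depth slice.
// The bin is empty when min_index > max_index.
struct ZBin {
  uint16_t min_index = 0xFFFF;
  uint16_t max_index = 0;
};

// Builds the bins. `sorted_lights` must already be sorted by depth so that
// the [min_index, max_index] range of each bin stays tight.
inline std::vector<ZBin> build_zbins(const std::vector<LightRange> &sorted_lights,
                                     float z_near,
                                     float z_far,
                                     int bin_count)
{
  assert(bin_count > 0 && z_far > z_near);
  std::vector<ZBin> bins(bin_count);
  const float bin_size = (z_far - z_near) / bin_count;
  for (std::size_t i = 0; i < sorted_lights.size(); i++) {
    const LightRange &light = sorted_lights[i];
    if (light.z_max < z_near || light.z_min > z_far) {
      continue;  // Entirely outside of the binned depth range.
    }
    int first = int((light.z_min - z_near) / bin_size);
    int last = int((light.z_max - z_near) / bin_size);
    first = std::max(0, std::min(first, bin_count - 1));
    last = std::max(0, std::min(last, bin_count - 1));
    for (int b = first; b <= last; b++) {
      bins[b].min_index = std::min<uint16_t>(bins[b].min_index, uint16_t(i));
      bins[b].max_index = std::max<uint16_t>(bins[b].max_index, uint16_t(i));
    }
  }
  return bins;
}
```

At shading time, a pixel would look up the bin matching its depth and only consider the lights of its tile whose index falls inside that bin's range.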
|
|
This also wraps GPUFrameBuffer & GPUTexture inside eevee::Framebuffer
and eevee::Texture to improve management.
Another cleanup was to make all members of `Instance` public to
avoid much complexity in accessing the data with module
dependencies.
Also splits velocity view related data into `class Velocity` and
renames the previous `Velocity` to `VelocityModule`.
|
|
Only supports simple point lights for now.
|
|
This is almost the same thing as the old implementation.
Differences:
- We clamp the motion vectors to their maximum when sampling the velocity buffer.
- Velocity rendering (and data manager) is separated from motion blur. This allows
outputting the motion vector render pass and, in the future, using motion vectors to
reproject older frames.
- Vector render pass support (only if motion blur is disabled, just like Cycles).
- Velocity tiles are computed in one pass (simpler code, less CPU overhead, less
VRAM usage, maybe a bit slower but imperceptibly so (< 0.3ms)).
- Two velocity passes are output, one for motion blur fx (applied per shading view)
and one for the vector pass. This could be optimized further in the future.
- No current support for deformation & hair (to come).
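For the motion vector clamping mentioned in the first item, a minimal sketch could look like this. It assumes the clamp preserves the direction and limits the length; `max_length` and the function name are hypothetical, and the real limit used by the shader is not stated here.

```cpp
#include <cassert>
#include <cmath>

struct Float2 {
  float x, y;
};

// Clamps a motion vector to `max_length` (hypothetical parameter, in pixels)
// while preserving its direction. Shorter vectors are returned unchanged.
inline Float2 clamp_motion_vector(Float2 v, float max_length)
{
  assert(max_length > 0.0f);
  const float len = std::sqrt(v.x * v.x + v.y * v.y);
  if (len <= max_length) {
    return v;
  }
  const float scale = max_length / len;
  return {v.x * scale, v.y * scale};
}
```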
|
|
We have many dimensions to avoid correlation between effects.
|
|
Pretty much identical to the previous implementation, with the exception
of a temporary noise function and some simplification of the CoC
computation. This also fixes issues with the Ortho depth of field.
Most of the files were modified to comply with the new shader code style.
This also adds partial support of panoramic cameras (bokeh and
anamorphic are still buggy).
|
|
This cleans up a lot of confusion / complexity in the setup code.
Setup is closer to what Cycles does now.
Also duplicates some buggy behavior of Cycles for now until this
is fixed.
|
|
This removes the use of very ugly macros. Now we preprocess
the .hh string at startup to change enum definitions to a more
GLSL friendly variant.
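A possible way to do such a preprocessing step, shown as a sketch with `std::regex`. It assumes enums are written as `enum eName : uint32_t { A = 0u, B = 1u };` with explicit values and no trailing comma, and rewrites them to a `#define` plus a `const uint` declaration list. The function name and the exact output form are assumptions for this example.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Rewrites `enum eName : uint32_t { A = 0u, B = 1u };` into
// `#define eName uint` followed by `const uint A = 0u, B = 1u;`.
// Assumptions of this sketch: explicit values, no trailing comma after the
// last enumerator, and no nested braces inside the enum body.
inline std::string enum_to_glsl(const std::string &source)
{
  // A trailing comma would produce an invalid `const uint ..., ;` list.
  static const std::regex trailing_comma(R"(,\s*\}\s*;)");
  assert(!std::regex_search(source, trailing_comma));

  // Group 1 is the enum name, group 2 is the enumerator list.
  static const std::regex enum_def(R"(enum\s+(\w+)\s*:\s*uint32_t\s*\{([^}]*)\}\s*;)");
  return std::regex_replace(source, enum_def, "#define $1 uint\nconst uint$2;");
}
```

Text that does not match the enum pattern, such as struct definitions, is passed through unchanged, so the same string can still be compiled as C++ before preprocessing.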
|
|
|
|
This moves view resolution handling to the `Camera` class that will
in the future clip and trim each view in panoramic projection.
There is a new `CameraView` that contains the `DRWView` and subview.
This way each `ShadingView` is associated to a unique `CameraView`.
`ShadingView` & `CameraView` are all allocated & defined at creation time
but only the one activated by `Camera` will be rendered.
|
|
- Add eevee_ prefix to shaders to avoid name clashing.
- Remove plural of eevee_shaders.
- Rename eevee_shared.hh to eevee_shader_shared.hh.
|