Age | Commit message (Collapse) | Author |
|
These transparent shadows can be expansive to evaluate. Especially on the
GPU they can lead to poor occupancy when only some pixels require many kernel
launches to trace and evaluate many layers of transparency.
Baked transparency allows tracing a single ray in many cases by accumulating
the throughput directly in the intersection program without recording hits
or evaluating shaders. Transparency is baked at curve vertices and
interpolated, for most shaders this will look practically the same as actual
shader evaluation.
Fixes T91428, performance regression with spring demo file due to transparent
hair, and makes it render significantly faster than Blender 2.93.
Differential Revision: https://developer.blender.org/D12880
|
|
Ref D12889
|
|
|
|
* Rename struct KernelGlobals to struct KernelGlobalsCPU
* Add KernelGlobals, IntegratorState and ConstIntegratorState typedefs
that every device can define in its own way.
* Remove INTEGRATOR_STATE_ARGS and INTEGRATOR_STATE_PASS macros and
replace with these new typedefs.
* Add explicit state argument to INTEGRATOR_STATE and similar macros
In preparation for decoupling main and shadow paths.
Differential Revision: https://developer.blender.org/D12888
|
|
|
|
This is the first of a sequence of changes to support compiling Cycles kernels as MSL (Metal Shading Language) in preparation for a Metal GPU device implementation.
MSL requires that all pointer types be declared with explicit address space attributes (device, thread, etc...). There is already precedent for this with Cycles' address space macros (ccl_global, ccl_private, etc...), therefore the first step of MSL-enablement is to apply these consistently. Line-for-line this represents the largest change required to enable MSL. Applying this change first will simplify future patches as well as offering the emergent benefit of enhanced descriptiveness.
The vast majority of deltas in this patch fall into one of two cases:
- Ensuring ccl_private is specified for thread-local pointer types
- Ensuring ccl_global is specified for device-wide pointer types
Additionally, the ccl_addr_space qualifier can be removed. Prior to Cycles X, ccl_addr_space was used as a context-dependent address space qualifier, but now it is either redundant (e.g. in struct typedefs), or can be replaced by ccl_global in the case of pointer types. Associated function variants (e.g. lcg_step_float_addrspace) are also redundant.
In cases where address space qualifiers are chained with "const", this patch places the address space qualifier first. The rationale for this is that the choice of address space is likely to have the greater impact on runtime performance and overall architecture.
The final part of this patch is the addition of a metal/compat.h header. This is partially complete and will be extended in future patches, paving the way for the full Metal implementation.
Ref T92212
Reviewed By: brecht
Maniphest Tasks: T92212
Differential Revision: https://developer.blender.org/D12864
|
|
Avoids possible numerical issues in the path tracing kernel, which
is most important for displacement as non-finite values in BVH can
lead to infinite node recursion during traversal.
Differential Revision: https://developer.blender.org/D12690
|
|
This includes much improved GPU rendering performance, viewport interactivity,
new shadow catcher, revamped sampling settings, subsurface scattering anisotropy,
new GPU volume sampling, improved PMJ sampling pattern, and more.
Some features have also been removed or changed, breaking backwards compatibility.
Including the removal of the OpenCL backend, for which alternatives are under
development.
Release notes and code docs:
https://wiki.blender.org/wiki/Reference/Release_Notes/3.0/Cycles
https://wiki.blender.org/wiki/Source/Render/Cycles
Credits:
* Sergey Sharybin
* Brecht Van Lommel
* Patrick Mours (OptiX backend)
* Christophe Hery (subsurface scattering anisotropy)
* William Leeson (PMJ sampling pattern)
* Alaska (various fixes and tweaks)
* Thomas Dinges (various fixes)
For the full commit history, see the cycles-x branch. This squashes together
all the changes since intermediate changes would often fail building or tests.
Ref T87839, T87837, T87836
Fixes T90734, T89353, T80267, T80267, T77185, T69800
|
|
The goal: allow to easily use AO approximation in scenes which combines
both small and large scale objects.
The idea: use per-object AO distance which will allow to override world
settings. Instancer object will "propagate" its AO distance to all its
instances unless the instance defines own distance (this allows to
modify AO distance in the shot files, without requiring to modify props
used in the shots.
Available from the new Fats GI Approximation panel in object properties.
Differential Revision: https://developer.blender.org/D12112
|
|
Also use doxy style function reference `#` prefix chars when
referencing identifiers.
|
|
Ref D8237, T78710
|
|
|
|
Baking vertex colors per-corner leads to unwanted discontinuities when there is
sampling noise, for example in ambient occlusion or with a bevel shader node for
normals. For this reason the code used to always average results per-vertex.
However when using split normals, multiple materials or UV islands, we do want to
preserve discontinuities. So now bake per corner, but make sure the sampling seed
is shared for vertices.
Fix T85550: vertex color baking crash with split normals, Ref D10399
Fix T84663: vertex color baking blending at UV seams
|
|
|
|
Contributed by pembem22.
Differential Revision: https://developer.blender.org/D8947
|
|
There should be no user visible change from this, except that tile size
now affects performance. The goal here is to simplify bake denoising in
D3099, letting it reuse more denoising tiles and pass code.
A lot of code is now shared with regular rendering, with the two main
differences being that we read some render result passes from the bake API
when starting to render a tile, and call the bake kernel instead of the
path trace kernel.
With this kind of design where Cycles asks for tiles from the bake API,
it should eventually be easier to reduce memory usage, show tiles as
they are baked, or bake multiple passes at once, though there's still
quite some work needed for that.
Reviewers: #cycles
Subscribers: monio, wmatyjewicz, lukasstockner97, michaelknubben
Differential Revision: https://developer.blender.org/D3108
|
|
This simplifies compositors setups and will be consistent with Eevee render
passes from D6331. There's a continuum between these passes and it's not clear
there is much advantage to having them available separately.
Differential Revision: https://developer.blender.org/D6848
|
|
With upcoming light group passes, for them to sum up correctly to the combined
pass the clamping must be more fine grained.
This also has the advantage that if one light is particularly noisy, it does
not diminish the contribution from other lights which do not need as much
clamping.
Clamp values on existing scenes will need to be tweaked to get similar results,
there is no automatic conversion possible which would give the same results as
before.
Implemented by Lukas, with tweaks by Brecht.
Part of D4837
|
|
Custom render passes are added in the Shader AOVs panel in the view layer
settings, with a name and data type. In shader nodes, an AOV Output node
is then used to output either a value or color to the pass.
Arbitrary names can be used for these passes, as long as they don't conflict
with built-in passes that are enabled. The AOV Output node can be used in both
material and world shader nodes.
Implemented by Lukas, with tweaks by Brecht.
Differential Revision: https://developer.blender.org/D4837
|
|
|
|
Apply clang format as proposed in T53211.
For details on usage and instructions for migrating branches
without conflicts, see:
https://wiki.blender.org/wiki/Tools/ClangFormat
|
|
Skip shader evaluation then, as we already do for lights. Less than
1% faster in my tests, but might as well be consistent for both.
|
|
|
|
|
|
|
|
We now continue transparent paths after diffuse/glossy/transmission/volume
bounces are exceeded. This avoids unexpected boundaries in volumes with
transparent boundaries. It is also required for MIS to work correctly with
transparent surfaces, as we also continue through these in shadow rays.
The main visible changes is that volumes will now be lit by the background
even at volume bounces 0, same as surfaces.
Fixes T53914 and T54103.
|
|
Goal is to reduce OpenCL kernel recompilations.
Currently viewport renders are still set to use 64 closures as this seems to
be faster and we don't want to cause a performance regression there. Needs
to be investigated.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2775
|
|
The algorithm averages normals from nearby surfaces. It uses the same
sampling strategy as BSSRDFs, casting rays along the normal and two
orthogonal axes, and combining the samples with MIS.
The main concern here is that we are introducing raytracing inside
shader evaluation, which could be quite bad for GPU performance and
stack memory usage. In practice it doesn't seem so bad though.
Note that using this feature can easily slow down renders 20%, and
that if you care about performance then it's better to use a bevel
modifier. Mainly this is useful for baking, and for cases where the
mesh topology makes it difficult for the bevel modifier to work well.
Differential Revision: https://developer.blender.org/D2803
|
|
|
|
With a Titan Xp, reduces path trace local memory from 1092MB to 840MB.
Benchmark performance was within 1% with both RX 480 and Titan Xp.
Original patch was implemented by Sergey.
Differential Revision: https://developer.blender.org/D2249
|
|
|
|
|
|
This is done by storing only a subset of PathRadiance, and by storing
direct light immediately in the main PathRadiance. Saves about 10% of
CUDA stack memory, and simplifies subsurface indirect ray code.
|
|
Similar to what we did for area lights previously, this should help
preserve stratification when using multiple BSDFs in theory. Improvements
are not easily noticeable in practice though, because the number of BSDFs
is usually low. Still nice to eliminate one sampling dimension.
|
|
|
|
|
|
This was needed when we accessed OSL closure memory after shader evaluation,
which could get overwritten by another shader evaluation. But all closures
are immediatley converted to ShaderClosure now, so no longer needed.
|
|
|
|
Also pass by value and don't write back now that it is just a hash for seeding
and no longer an LCG state. Together this makes CUDA a tiny bit faster in my
tests, but mainly simplifies code.
|
|
|
|
|
|
No functional changes.
|
|
When using the Normal output of the Texture Coordinate node on Point and Spot lamps, the coordinates now depend on the rotation of the lamp.
On Area lamps, the Parametric output of the Geometry node now returns UV coordinates on the area lamp.
Credit for the Area lamp part goes to Stefan Werner (from D1995).
|
|
Using ones complement for detecting if transform has been applied was confusing
and led to several bugs. With this proper checks are made.
Also added a few transforms where they were missing, mostly affecting baking
and displacement when `P` is used in the shader (previously `P` was in the
wrong space for these shaders)
Also removed `TIME_INVALID` as this may have resulted in incorrect
transforms in some cases.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2192
|
|
All the changes are mainly giving explicit tips on inlining functions,
so they match how inlining worked with previous toolkit.
This make kernel compiled by CUDA 8 render in average with same speed
as previous kernels. Some scenes are somewhat faster, some of them are
somewhat slower. But slowdown is within 1% so far.
On a positive side it allows us to enable newer generation cards on
buildbots (so GTX 10x0 will be officially supported soon).
|
|
Glossy, Anisotropic and Glass BSDFs
This commit adds a new distribution to the Glossy, Anisotropic and Glass BSDFs that implements the
multiple-scattering microfacet model described in the paper "Multiple-Scattering Microfacet BSDFs with the Smith Model".
Essentially, the improvement is that unlike classical GGX, which only models single scattering and assumes
the contribution of multiple bounces to be zero, this new model performs a random walk on the microsurface until
the ray leaves it again, which ensures perfect energy conservation.
In practise, this means that the "darkening problem" - GGX materials becoming darker with increasing
roughness - is solved in a physically correct and efficient way.
The downside of this model is that it has no (known) analytic expression for evalation. However, it can be
evaluated stochastically, and although the correct PDF isn't known either, the properties of MIS and the
balance heuristic guarantee an unbiased result at the cost of slightly higher noise.
Reviewers: dingto, #cycles, brecht
Reviewed By: dingto, #cycles, brecht
Subscribers: bliblubli, ace_dragon, gregzaal, brecht, harvester, dingto, marcog, swerner, jtheninja, Blendify, nutel
Differential Revision: https://developer.blender.org/D2002
|
|
|
|
Saves about 15% for the branched path kernel.
|
|
57% less for path and 48% less for branched path.
|
|
|