Age | Commit message (Collapse) | Author |
|
This is still useful in some cases even if not used by OpenImageDenoise. In
the future this may be replaced with a more generic system to control render
passes and filtering, but for now this just does what it did before.
|
|
Similar to the Depth, for compositing the interpolated values between a far
and near object can be non-sensical.
|
|
|
|
* Rename struct KernelGlobals to struct KernelGlobalsCPU
* Add KernelGlobals, IntegratorState and ConstIntegratorState typedefs
that every device can define in its own way.
* Remove INTEGRATOR_STATE_ARGS and INTEGRATOR_STATE_PASS macros and
replace with these new typedefs.
* Add explicit state argument to INTEGRATOR_STATE and similar macros
In preparation for decoupling main and shadow paths.
Differential Revision: https://developer.blender.org/D12888
|
|
This is the first of a sequence of changes to support compiling Cycles kernels as MSL (Metal Shading Language) in preparation for a Metal GPU device implementation.
MSL requires that all pointer types be declared with explicit address space attributes (device, thread, etc...). There is already precedent for this with Cycles' address space macros (ccl_global, ccl_private, etc...), therefore the first step of MSL-enablement is to apply these consistently. Line-for-line this represents the largest change required to enable MSL. Applying this change first will simplify future patches as well as offering the emergent benefit of enhanced descriptiveness.
The vast majority of deltas in this patch fall into one of two cases:
- Ensuring ccl_private is specified for thread-local pointer types
- Ensuring ccl_global is specified for device-wide pointer types
Additionally, the ccl_addr_space qualifier can be removed. Prior to Cycles X, ccl_addr_space was used as a context-dependent address space qualifier, but now it is either redundant (e.g. in struct typedefs), or can be replaced by ccl_global in the case of pointer types. Associated function variants (e.g. lcg_step_float_addrspace) are also redundant.
In cases where address space qualifiers are chained with "const", this patch places the address space qualifier first. The rationale for this is that the choice of address space is likely to have the greater impact on runtime performance and overall architecture.
The final part of this patch is the addition of a metal/compat.h header. This is partially complete and will be extended in future patches, paving the way for the full Metal implementation.
Ref T92212
Reviewed By: brecht
Maniphest Tasks: T92212
Differential Revision: https://developer.blender.org/D12864
|
|
This includes much improved GPU rendering performance, viewport interactivity,
new shadow catcher, revamped sampling settings, subsurface scattering anisotropy,
new GPU volume sampling, improved PMJ sampling pattern, and more.
Some features have also been removed or changed, breaking backwards compatibility.
Including the removal of the OpenCL backend, for which alternatives are under
development.
Release notes and code docs:
https://wiki.blender.org/wiki/Reference/Release_Notes/3.0/Cycles
https://wiki.blender.org/wiki/Source/Render/Cycles
Credits:
* Sergey Sharybin
* Brecht Van Lommel
* Patrick Mours (OptiX backend)
* Christophe Hery (subsurface scattering anisotropy)
* William Leeson (PMJ sampling pattern)
* Alaska (various fixes and tweaks)
* Thomas Dinges (various fixes)
For the full commit history, see the cycles-x branch. This squashes together
all the changes since intermediate changes would often fail building or tests.
Ref T87839, T87837, T87836
Fixes T90734, T89353, T80267, T80267, T77185, T69800
|
|
WITH_CYCLES_DEBUG was used for rendering BVH debugging passes. But since we
mainly use Embree an OptiX now, this information is no longer important.
WITH_CYCLES_DEBUG_NAN will enable additional checks for NaNs and invalid values
in the kernel, for Cycles developers. Previously these asserts where enabled in
all debug builds, but this is too likely to crash Blender in scenes that render
fine regardless of the NaNs. So this is behind a CMake option now.
Fixes T90240
|
|
Ref D8237, T78710
|
|
Contributed by pembem22.
Differential Revision: https://developer.blender.org/D8947
|
|
Now it matches Eevee, OpenGL and other renderers. Panoramic camera depth passes
are unchanged, and are still distance from the camera center.
|
|
No functional changes yet, this is work towards making CPU and GPU results
match more closely.
|
|
|
|
This feature takes some inspiration from
"RenderMan: An Advanced Path Tracing Architecture for Movie Rendering" and
"A Hierarchical Automatic Stopping Condition for Monte Carlo Global Illumination"
The basic principle is as follows:
While samples are being added to a pixel, the adaptive sampler writes half
of the samples to a separate buffer. This gives it two separate estimates
of the same pixel, and by comparing their difference it estimates convergence.
Once convergence drops below a given threshold, the pixel is considered done.
When a pixel has not converged yet and needs more samples than the minimum,
its immediate neighbors are also set to take more samples. This is done in order
to more reliably detect sharp features such as caustics. A 3x3 box filter that
is run periodically over the tile buffer is used for that purpose.
After a tile has finished rendering, the values of all passes are scaled as if
they were rendered with the full number of samples. This way, any code operating
on these buffers, for example the denoiser, does not need to be changed for
per-pixel sample counts.
Reviewed By: brecht, #cycles
Differential Revision: https://developer.blender.org/D4686
|
|
This simplifies compositors setups and will be consistent with Eevee render
passes from D6331. There's a continuum between these passes and it's not clear
there is much advantage to having them available separately.
Differential Revision: https://developer.blender.org/D6848
|
|
denoising albedo pass
To determine the albedo pass, Cycles currently follows the path until a predominantly
diffuse-ish material is hit and then takes the albedo there.
This works fine for normal mirrors, but as it completely ignores the color of the bounces
before that diffuse-ish material, it also means that any textures that are applied to the
specular-ish BSDFs won't affect the albedo pass at all.
Therefore, this patch changes that behaviour so that Cycles also keeps track of the
throughput of all specular-ish closures along the path so far and includes that in
the albedo pass.
This fixes part of the issue described in T73043. However, since it has an effect on the
albedo pass in most scenes, it could cause cause regressions, which is why I'm uploading
it as a patch instead of just committing as a fix.
Differential Revision: https://developer.blender.org/D6640
|
|
Similar to the Microfacet Closures, the Principled BSDF Sheen closure is
added at a high weight but typically results in fairly low values.
Therefore, the default weight is a bad indicator of importance.
The fix here is the same as it was back then for Microfacets:
Compute an average weight using the normal as the half-vector
and use it to scale down the sample weight and the albedo channel.
In addition to drastically improving denoising of materials with
sheen when using the new Denoising node, this also can reduce noise
on such materials considerably.
|
|
transform_direction() can't handle parameters in constant address space.
Creating a local copy of the parameter satisfies the OpenCL compiler.
CUDA and CPU compilers should be able to optimize this away I hope.
|
|
This patch adds support for the OptiX denoiser as an alternative to the existing NLM denoiser in Cycles. It's re-using the same denoising architecture based on tiles and therefore implicitly also works with multiple GPUs.
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D6395
|
|
Custom render passes are added in the Shader AOVs panel in the view layer
settings, with a name and data type. In shader nodes, an AOV Output node
is then used to output either a value or color to the pass.
Arbitrary names can be used for these passes, as long as they don't conflict
with built-in passes that are enabled. The AOV Output node can be used in both
material and world shader nodes.
Implemented by Lukas, with tweaks by Brecht.
Differential Revision: https://developer.blender.org/D4837
|
|
average fresnel
The Principled BSDF uses Microfacet closures that include a fresnel term,
which are a special case since their weight tends to be near white even
if their average contribution is fairly low.
The sample weight is scaled by the average fresnel weight to account for
this, but the denoising albedo still used the unscaled weight.
This was fine for the original denoiser, but apparently OIDN can't handle
the resulting albedo pass well. Therefore, this commit adds the described
scaling to the albedo pass contribution as well.
This problem was described in T69770.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D6289
|
|
Small memory reduction change by only storing the pixels of the combined
pass when it is being shown in the viewport. Previously the combined pass
was always calculated and present in the output buffer. The combined pass
will still be calculated.
It is a limitation in Blender that Cycles always had a combined pass.
This patch will remove the limitation from the code base of Cycles.
Blender still has the limitation, but will always request the combined
renderpass when doing final rendering.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D5784
|
|
Ref D5363
|
|
|
|
Apply clang format as proposed in T53211.
For details on usage and instructions for migrating branches
without conflicts, see:
https://wiki.blender.org/wiki/Tools/ClangFormat
|
|
various parts of the CPU kernel
This commit adds a sample-based profiler that runs during CPU rendering and collects statistics on time spent in different parts of the kernel (ray intersection, shader evaluation etc.) as well as time spent per material and object.
The results are currently not exposed in the user interface or per Python yet, to see the stats on the console pass the "--cycles-print-stats" argument to Cycles (e.g. "./blender -- --cycles-print-stats").
Unfortunately, there is no clear way to extend this functionality to CUDA or OpenCL, so it is CPU-only for now.
Reviewers: brecht, sergey, swerner
Reviewed By: brecht, swerner
Differential Revision: https://developer.blender.org/D3892
|
|
|
|
It is supposed to be two spaces before comment stating which if
else/endif statements corresponds to. Was mainly violated in the
header guards.
|
|
This allows for extra output passes that encode automatic object and material masks
for the entire scene. It is an implementation of the Cryptomatte standard as
introduced by Psyop. A good future extension would be to add a manifest to the
export and to do plenty of testing to ensure that it is fully compatible with other
renderers and compositing programs that use Cryptomatte.
Internally, it adds the ability for Cycles to have several passes of the same type
that are distinguished by their name.
Differential Revision: https://developer.blender.org/D3538
|
|
|
|
Roughness baking previously defaulted to 1.0 for all diffuse materials,
now we also bake roughness values of Oren-Nayer and Principled Diffuse.
Differential Revision: https://developer.blender.org/D3115
|
|
This can be enabled in the Film panel, with an option to control the
transmisison roughness below which glass becomes transparent.
Differential Revision: https://developer.blender.org/D2904
|
|
No color pass because it's hard to define what to use as color in a volume.
Reviewers: sergey, brecht
Differential Revision: https://developer.blender.org/D2903
|
|
passes
|
|
Denoising of specular shaders
|
|
|
|
OpenCL.
The work size is still very conservative, and this doesn't help for progressive
refine. For that we will need to render multiple tiles at the same time. But this
should already help for denoising renders that require too much memory with big
tiles, and just generally soften the performance dropoff with small tiles.
Differential Revision: https://developer.blender.org/D2856
|
|
This was originally done with the first sample in the kernel for better
performance, but it doesn't work anymore with atomics. Any benefit was
very minor anyway, too small to measure it seems.
|
|
There is no significant difference in denoised benchmark scenes and
denoising ctests, so might as well make it all consistent.
|
|
|
|
Benchmarks peformance on GTX 1080 and RX 480 on Linux is the same for
bmw27, classroom, pabellon, and about 2% faster on fishy_cat and koro.
|
|
|
|
|
|
|
|
Was some mismatch in address space. Seems to be caused by recent additions.
Additionally, moved decoupled ray marching functions under ifdef, so they
don't try to use malloc() functions.
Thanks Mai for testing the patch!
|
|
Volume shaders without anything connected to the surface output are treated
as if they had a transparent BSDF as the surface shader in Cycles, so the
denoiser should skip feature pass writing for them just as it does with an
actual transparent BSDF.
|
|
|
|
Numerical inaccuracies would cause the XtWX matrix to be no longer
positive-semidefinite, which in turn caused the LSQ solver to fail.
|
|
This commit contains the first part of the new Cycles denoising option,
which filters the resulting image using information gathered during rendering
to get rid of noise while preserving visual features as well as possible.
To use the option, enable it in the render layer options. The default settings
fit a wide range of scenes, but the user can tweak individual settings to
control the tradeoff between a noise-free image, image details, and calculation
time.
Note that the denoiser may still change in the future and that some features
are not implemented yet. The most important missing feature is animation
denoising, which uses information from multiple frames at once to produce a
flicker-free and smoother result. These features will be added in the future.
Finally, thanks to all the people who supported this project:
- Google (through the GSoC) and Theory Studios for sponsoring the development
- The authors of the papers I used for implementing the denoiser (more details
on them will be included in the technical docs)
- The other Cycles devs for feedback on the code, especially Sergey for
mentoring the GSoC project and Brecht for the code review!
- And of course the users who helped with testing, reported bugs and things
that could and/or should work better!
|
|
|
|
This does a few things at once:
- Refactors host side split kernel logic into a new device
agnostic class `DeviceSplitKernel`.
- Removes tile splitting, a new work pool implementation takes its place and
allows as many threads as will fit in memory regardless of tile size, which
can give performance gains.
- Refactors split state buffers into one buffer, as well as reduces the
number of arguments passed to kernels. Means there's less code to deal
with overall.
- Moves kernel logic out of OpenCL kernel files so they can later be used by
other device types.
- Replaced OpenCL specific APIs with new generic versions
- Tiles can now be seen updating during rendering
|