Age | Commit message (Collapse) | Author |
|
Inspired by D12936 and D12929, this patch adds general purpose
"Combine Color" and "Separate Color" nodes to Geometry, Compositor,
Shader and Texture nodes.
- Within Geometry Nodes, it replaces the existing "Combine RGB" and
"Separate RGB" nodes.
- Within Compositor Nodes, it replaces the existing
"Combine RGBA/HSVA/YCbCrA/YUVA" and "Separate RGBA/HSVA/YCbCrA/YUVA"
nodes.
- Within Texture Nodes, it replaces the existing "Combine RGBA" and
"Separate RGBA" nodes.
- Within Shader Nodes, it replaces the existing "Combine RGB/HSV" and
"Separate RGB/HSV" nodes.
Python addons have not been updated to the new nodes yet.
**New shader code**
In node_color.h, color.h and gpu_shader_material_color_util.glsl,
missing methods hsl_to_rgb and rgb_to_hsl are added by directly
converting existing C code. They always produce the same result.
**Old code**
As requested by T96219, old nodes still exist but are not displayed in
the add menu. This means Python scripts can still create them as usual.
Otherwise, versioning replaces the old nodes with the new nodes when
opening .blend files.
Differential Revision: https://developer.blender.org/D14034
|
|
All released versions appear to work fine. Also slightly change wording.
|
|
|
|
* Float/double promotion warnings were mainly meant for avoiding slow
operatiosn in the kernel. Limit it to that to avoid hard to fix warnings
in Hydra.
* Const warnings in Hydra iterators.
* Unused variable warnings when building without glog.
* Wrong camera enum comparisons in assert.
* PASS_UNUSED is not a pass type, only for pass offsets.
|
|
This revision allows to specify CUDA host compiler (nvcc's -ccbin command
line option) when configuring the build. It addresses the case where the
C/C++ compiler to be used in CUDA toolchain should be different from the
default C/C++ compiler, for instance in case of compilers versions conflicts
or multiple installed compilers.
The new CMake option is named `CUDA_HOST_COMPILER` and can be used as follows:
`cmake -DCUDA_HOST_COMPILER=<path-to-host-compiler>`
If the option is not specified, the build configuration behaves as previously.
Differential Revision: https://developer.blender.org/D14248
|
|
This adds support for selective rendering of caustics in shadows of refractive
objects. Example uses are rendering of underwater caustics and eye caustics.
This is based on "Manifold Next Event Estimation", a method developed for
production rendering. The idea is to selectively enable shadow caustics on a
few objects in the scene where they have a big visual impact, without impacting
render performance for the rest of the scene.
The Shadow Caustic option must be manually enabled on light, caustic receiver
and caster objects. For such light paths, the Filter Glossy option will be
ignored and replaced by sharp caustics.
Currently this method has a various limitations:
* Only caustics in shadows of refractive objects work, which means no caustics
from reflection or caustics that outside shadows. Only up to 4 refractive
caustic bounces are supported.
* Caustic caster objects should have smooth normals.
* Not currently support for Metal GPU rendering.
In the future this method may be extended for more general caustics.
TECHNICAL DETAILS
This code adds manifold next event estimation through refractive surface(s) as a
new sampling technique for direct lighting, i.e. finding the point on the
refractive surface(s) along the path to a light sample, which satisfies Fermat's
principle for a given microfacet normal and the path's end points. This
technique involves walking on the "specular manifold" using a pseudo newton
solver. Such a manifold is defined by the specular constraint matrix from the
manifold exploration framework [2]. For each refractive interface, this
constraint is defined by enforcing that the generalized half-vector projection
onto the interface local tangent plane is null. The newton solver guides the
walk by linearizing the manifold locally before reprojecting the linear solution
onto the refractive surface. See paper [1] for more details about the technique
itself and [3] for the half-vector light transport formulation, from which it is
derived.
[1] Manifold Next Event Estimation
Johannes Hanika, Marc Droske, and Luca Fascione. 2015.
Comput. Graph. Forum 34, 4 (July 2015), 87–97.
https://jo.dreggn.org/home/2015_mnee.pdf
[2] Manifold exploration: a Markov Chain Monte Carlo technique for rendering
scenes with difficult specular transport Wenzel Jakob and Steve Marschner.
2012. ACM Trans. Graph. 31, 4, Article 58 (July 2012), 13 pages.
https://www.cs.cornell.edu/projects/manifolds-sg12/
[3] The Natural-Constraint Representation of the Path Space for Efficient
Light Transport Simulation. Anton S. Kaplanyan, Johannes Hanika, and Carsten
Dachsbacher. 2014. ACM Trans. Graph. 33, 4, Article 102 (July 2014), 13 pages.
https://cg.ivd.kit.edu/english/HSLT.php
The code for this samping technique was inserted at the light sampling stage
(direct lighting). If the walk is successful, it turns off path regularization
using a specialized flag in the path state (PATH_MNEE_SUCCESS). This flag tells
the integrator not to blur the brdf roughness further down the path (in a child
ray created from BSDF sampling). In addition, using a cascading mechanism of
flag values, we cull connections to caustic lights for this and children rays,
which should be resolved through MNEE.
This mechanism also cancels the MIS bsdf counter part at the casutic receiver
depth, in essence leaving MNEE as the only sampling technique from receivers
through refractive casters to caustic lights. This choice might not be optimal
when the light gets large wrt to the receiver, though this is usually not when
you want to use MNEE.
This connection culling strategy removes a fair amount of fireflies, at the cost
of introducing a slight bias. Because of the selective nature of the culling
mechanism, reflective caustics still benefit from the native path
regularization, which further removes fireflies on other surfaces (bouncing
light off casters).
Differential Revision: https://developer.blender.org/D13533
|
|
To make porting to other architectures easier, clarifying that this does not
need to be supported. The unused parallel_reduce implementation assumed warp
size 32, but is easy to update if we ever need it in the future.
|
|
* Replace license text in headers with SPDX identifiers.
* Remove specific license info from outdated readme.txt, instead leave details
to the source files.
* Add list of SPDX license identifiers used, and corresponding license texts.
* Update copyright dates while we're at it.
Ref D14069, T95597
|
|
Use a shorter/simpler license convention, stops the header taking so
much space.
Follow the SPDX license specification: https://spdx.org/licenses
- C/C++/objc/objc++
- Python
- Shell Scripts
- CMake, GNUmakefile
While most of the source tree has been included
- `./extern/` was left out.
- `./intern/cycles` & `./intern/atomic` are also excluded because they
use different header conventions.
doc/license/SPDX-license-identifiers.txt has been added to list SPDX all
used identifiers.
See P2788 for the script that automated these edits.
Reviewed By: brecht, mont29, sergey
Ref D14069
|
|
This add support for rendering of the point cloud object in Blender, as a native
geometry type in Cycles that is more memory and time efficient than instancing
sphere meshes. This can be useful for rendering sand, water splashes, particles,
motion graphics, etc.
Points are currently always rendered as spheres, with backface culling. More
shapes are likely to be added later, but this is the most important one and can
be customized with shaders.
For CPU rendering the Embree primitive is used, for GPU there is our own
intersection code. Motion blur is suppored. Volumes inside points are not
currently supported.
Implemented with help from:
* Kévin Dietrich: Alembic procedural integration
* Patrick Mourse: OptiX integration
* Josh Whelchel: update for cycles-x changes
Ref T92573
Differential Revision: https://developer.blender.org/D9887
|
|
This patch adds MetalRT support to Cycles kernel code. It is mostly additive in nature or confined to Metal-specific code, however there are a few areas where this interacts with other code:
- MetalRT closely follows the Optix implementation, and in some cases (notably handling of transforms) it makes sense to extend Optix special-casing to MetalRT. For these generalisations we now have `__KERNEL_GPU_RAYTRACING__` instead of `__KERNEL_OPTIX__`.
- MetalRT doesn't support primitive offsetting (as with `primitiveIndexOffset` in Optix), so we define and populate a new kernel texture, `__object_prim_offset`, containing per-object primitive / curve-segment offsets. This is referenced and applied in MetalRT intersection handlers.
- Two new BVH layout enum values have been added: `BVH_LAYOUT_METAL` and `BVH_LAYOUT_MULTI_METAL_EMBREE` for XPU mode). Some host-side enum case handling has been updated where it is trivial to do so.
Ref T92212
Reviewed By: brecht
Maniphest Tasks: T92212
Differential Revision: https://developer.blender.org/D13353
|
|
Contributed by luzpaz.
Differential Revision: https://developer.blender.org/D10447
|
|
MSL requires that constant address space literals be declared at program
scope. This patch moves the `blackbody_table_r/g/b` and `cie_colour_match`
constants into separate files so they can be declared at the appropriate scope.
Ref T92212
Differential Revision: https://developer.blender.org/D13241
|
|
|
|
Build HIP kernels with NanoVDB, and patch NanoVDB to work with HIP.
This is a header only library so no rebuild is needed. The changes are being
submitted upstream to openvdb, so this patch should be temporary.
Thanks Thomas for help testing this.
|
|
|
|
This patch adds a CMake option "WITH_CYCLES_DEBUG" which builds cycles with
a feature that allows debugging/selecting the direct-light sampling strategy.
The same option may later be used to add other debugging features that could
affect performance in release builds.
The three options are:
* Forward path tracing (e.g., via BSDF or phase function)
* Next-event estimation
* Multiple importance sampling combination of the previous two methods
Such a feature is useful for debugging light different sampling, evaluation,
and pdf methods (e.g., for light sources and BSDFs).
Differential Revision: https://developer.blender.org/D13152
|
|
rB3a4c8f406a3a3bf0627477c6183a594fa707a6e2 changed the macros that create the film
convert kernel entry points, but in the process accidentally changed the parameter definition
to one of those (which caused CUDA launch and misaligned address errors) and changed the
implementation as well. This restores the correct implementation from before.
In addition, the `ccl_gpu_kernel_threads` macro did not work as intended and caused the
generated launch bounds to end up with an incorrect input for the second parameter (it was
set to "thread_num_registers", rather than the result of the block number calculation). I'm
not entirely sure why, as the macro definition looked sound to me. Decided to simply go with
two separate macros instead, to simplify and solve this.
Also changed how state is captured with the `ccl_gpu_kernel_lambda` macro slightly, to avoid
a compiler warning (expression has no effect) that otherwise occurred.
Maniphest Tasks: T92985
Differential Revision: https://developer.blender.org/D13175
|
|
This patch adapts the shared kernel entrypoints so that they can be compiled as MSL (Metal Shading Language). Where possible, the adaptations avoid changes in common code.
In MSL, kernel function inputs are explicitly bound to resources. In the case of argument buffers, we declare a struct containing the kernel arguments, accessible via device pointer. This differs from CUDA and HIP where kernel function arguments are declared as traditional C-style function parameters. This patch adapts the entrypoints declared in kernel.h so that they can be translated via a new `ccl_gpu_kernel_signature` macro into the required parameter struct + kernel entrypoint pairing for MSL.
MSL buffer attribution must be applied to function parameters or non-static class data members. To allow universal access to the integrator state, kernel data, and texture fetch adapters, we wrap all of the shared kernel code in a `MetalKernelContext` class. This is achieved by bracketing the appropriate kernel headers with "context_begin.h" and "context_end.h" on Metal. When calling deeper into the kernel code, we must reference the context class (e.g. `context.integrator_init_from_camera`). This extra prefixing is performed by a set of defines in "context_end.h". These will require explicit maintenance if entrypoints change. We invite discussion on more maintainable ways to enforce correctness.
Lambda expressions are not supported on MSL, so a new `ccl_gpu_kernel_lambda` macro generates an inline function object and optionally capturing any required state. This yields the same behaviour. This approach is applied to all parallel_... implementations which are templated by operation. The lambda expressions in the film_convert... kernels don't adapt cleanly to use function objects. However, these entrypoints can be macro-generated more concisely to avoid lambda expressions entirely, instead relying on constant folding to handle the pixel/channel conversions.
A separate implementation of `gpu_parallel_active_index_array` is provided for Metal to workaround some subtle differences in SIMD width, and also to encapsulate some required thread parameters which must be declared as explicit entrypoint function parameters.
Ref T92212
Reviewed By: brecht
Maniphest Tasks: T92212
Differential Revision: https://developer.blender.org/D13109
|
|
|
|
Remove prefix of filenames that is the same as the folder name. This used
to help when #includes were using individual files, but now they are always
relative to the cycles root directory and so the prefixes are redundant.
For patches and branches, git merge and rebase should be able to detect the
renames and move over code to the right file.
|
|
* Split render/ into scene/ and session/. The scene/ folder now contains the
scene and its nodes. The session/ folder contains the render session and
associated data structures like drivers and render buffers.
* Move top level kernel headers into new folders kernel/camera/, kernel/film/,
kernel/light/, kernel/sample/, kernel/util/
* Move integrator related kernel headers into kernel/integrator/
* Move OSL shaders from kernel/shaders/ to kernel/osl/shaders/
For patches and branches, git merge and rebase should be able to detect the
renames and move over code to the right file.
|
|
|
|
* Additional structs added to the hipew loader for device props
* Adds hipRTC functions to the loader for future usage
* Enables CPU+GPU usage for HIP
* Cleanup to the adaptive kernel compilation process
* Fix for kernel compilation failures with HIP with latest master
Ref T92393, D12958
|
|
The motivation for this is twofold. It improves performance (5-10% on most
benchmark scenes), and will help to bring back transparency support for the
ambient occlusion pass.
* Duplicate some members from the main path state in the shadow path state.
* Add shadow paths incrementally to the array similar to what we do for
the shadow catchers.
* For the scheduling, allow running shade surface and shade volume kernels
as long as there is enough space in the shadow paths array. If not, execute
shadow kernels until it is empty.
* Add IntegratorShadowState and ConstIntegratorShadowState typedefs that
can be different between CPU and GPU. For GPU both main and shadow paths
juse have an integer for SoA access. Bt with CPU it's a different pointer
type so we get type safety checks in code shared between CPU and GPU.
* For CPU, add a separate IntegratorShadowStateCPU struct embedded in
IntegratorShadowState.
* Update various functions to take the shadow state, and make SVM take either
type of state using templates.
Differential Revision: https://developer.blender.org/D12889
|
|
There is not enough time before the release to improve Random Walk to handle
all cases this was used for, so restore it for now.
Since there is no more path splitting in cycles-x, this can increase noise in
non-flat areas for the sample number of samples, though fewer rays will be traced
also. This is fundamentally a trade-off we made in the new design and why Random
Walk is a better fit. However the importance resampling we do now does help to
reduce noise.
Differential Revision: https://developer.blender.org/D12800
|
|
|
|
And fix various broken things in the HIP kernel compilation.
|
|
|
|
NOTE: this feature is not ready for user testing, and not yet enabled in daily
builds. It is being merged now for easier collaboration on development.
HIP is a heterogenous compute interface allowing C++ code to be executed on
GPUs similar to CUDA. It is intended to bring back AMD GPU rendering support
on Windows and Linux.
https://github.com/ROCm-Developer-Tools/HIP.
As of the time of writing, it should compile and run on Linux with existing
HIP compilers and driver runtimes. Publicly available compilers and drivers
for Windows will come later.
See task T91571 for more details on the current status and work remaining
to be done.
Credits:
Sayak Biswas (AMD)
Arya Rafii (AMD)
Brian Savery (AMD)
Differential Revision: https://developer.blender.org/D12578
|
|
This includes much improved GPU rendering performance, viewport interactivity,
new shadow catcher, revamped sampling settings, subsurface scattering anisotropy,
new GPU volume sampling, improved PMJ sampling pattern, and more.
Some features have also been removed or changed, breaking backwards compatibility.
Including the removal of the OpenCL backend, for which alternatives are under
development.
Release notes and code docs:
https://wiki.blender.org/wiki/Reference/Release_Notes/3.0/Cycles
https://wiki.blender.org/wiki/Source/Render/Cycles
Credits:
* Sergey Sharybin
* Brecht Van Lommel
* Patrick Mours (OptiX backend)
* Christophe Hery (subsurface scattering anisotropy)
* William Leeson (PMJ sampling pattern)
* Alaska (various fixes and tweaks)
* Thomas Dinges (various fixes)
For the full commit history, see the cycles-x branch. This squashes together
all the changes since intermediate changes would often fail building or tests.
Ref T87839, T87837, T87836
Fixes T90734, T89353, T80267, T80267, T77185, T69800
|
|
WITH_CYCLES_DEBUG was used for rendering BVH debugging passes. But since we
mainly use Embree an OptiX now, this information is no longer important.
WITH_CYCLES_DEBUG_NAN will enable additional checks for NaNs and invalid values
in the kernel, for Cycles developers. Previously these asserts where enabled in
all debug builds, but this is too likely to crash Blender in scenes that render
fine regardless of the NaNs. So this is behind a CMake option now.
Fixes T90240
|
|
This fixes a performance regression on Ampere cards, on specific scenes like
classroom. For cycles-x there is little difference, but this is still helpful
for LTS releases, and we need to upgrade at some point anyway.
|
|
|
|
Support for the AO and bevel shader nodes requires calling "optixTrace" from within the shading
VM, which is only allowed from inlined functions to the raygen program or callables. This patch
therefore converts the shading VM to use direct callables to make it work. To prevent performance
regressions a separate kernel module is compiled and used for this purpose.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D9733
|
|
Replace 'set' with 'string(APPEND/PREPEND ...)'.
This avoids duplicating the variable name.
|
|
With this patch the build system checks whether the "CUDA10_NVCC_EXECUTABLE" CMake
variable is set and if so will use that to build sm_30 kernels. Similarily for sm_8x kernels it
checks "CUDA11_NVCC_EXECUTABLE". All other kernels are built using the default CUDA
toolkit. This makes it possible to use either the CUDA 10 or CUDA 11 toolkit by default and
only selectively use the other for the kernels where its a hard requirement.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D9179
|
|
NanoVDB is a platform-independent sparse volume data structure that makes it possible to
use OpenVDB volumes on the GPU. This patch uses it for volume rendering in Cycles,
replacing the previous usage of dense 3D textures.
Since it has a big impact on memory usage and performance and changes the OpenVDB
branch used for the rest of Blender as well, this is not enabled by default yet, which will
happen only after 2.82 was branched off. To enable it, build both dependencies and Blender
itself with the "WITH_NANOVDB" CMake option.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D8794
|
|
* Support precompiled libraries on Linux
* Add license headers
* Refactoring to deduplicate code
Includes work by Ray Molenkamp and Grische for precompiled libraries.
Ref D8769
|
|
|
|
one is found
This patch changes the discovery of pre-compiled kernels, to look for any PTX, even if
it does not match the current architecture version exactly. It works because the driver can
JIT-compile PTX generated for architectures less than or equal to the current one.
This e.g. makes it possible to render on a new GPU architecture even if no pre-compiled
binary kernel was distributed for it as part of the Blender installation.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D8332
|
|
Ref T73778
Depends on D8011
Maniphest Tasks: T73778
Differential Revision: https://developer.blender.org/D8012
|
|
This commit adds a new model to the Sky Texture node, which is based on a
method by Nishita et al. and works by basically simulating volumetric
scattering in the atmosphere.
By making some approximations (such as only considering single scattering),
we get a fairly simple and fast simulation code that takes into account
Rayleigh and Mie scattering as well as Ozone absorption.
This code is used to precompute a 512x128 texture which is then looked up
during render time, and is fast enough to allow real-time tweaking in the
viewport.
Due to the nature of the simulation, it exposes several parameters that
allow for lots of flexibility in choosing the look and matching real-world
conditions (such as Air/Dust/Ozone density and altitude).
Additionally, the same volumetric approach can be used to compute absorption
of the direct sunlight, so the model also supports adding direct sunlight.
This makes it significantly easier to set up Sun+Sky illumination where
the direction, intensity and color of the sun actually matches the sky.
In order to support properly sampling the direct sun component, the commit
also adds logic for sampling a specific area to the kernel light sampling
code. This is combined with portal and background map sampling using MIS.
This sampling logic works for the common case of having one Sky texture
going into the Background shader, but if a custom input to the Vector
node is used or if there are multiple Sky textures, it falls back to using
only background map sampling (while automatically setting the resolution to
4096x2048 if auto resolution is used).
More infos and preview can be found here:
https://docs.google.com/document/d/1gQta0ygFWXTrl5Pmvl_nZRgUw0mWg0FJeRuNKS36m08/view
Underlying model, implementation and documentation by Marco (@nacioss).
Improvements, cleanup and sun sampling by @lukasstockner.
Differential Revision: https://developer.blender.org/D7896
|
|
It appears to work fine after a recent bugfix and testing for the past few
weeks.
|
|
CMake: `WITH_CYCLES_DEVICE_OPTIX` did not respect `WITH_CYCLES_CUDA_BINARIES` causing the optix kernel to be always build at build time.
Code: `device_optix.cpp` did not count on the optix kernel not existing in the default location.
For this to work, one should have before starting blender
1) working nvcc environment
2) Optix SDK installed and the OPTIX_ROOT_DIR environment variable pointing to it which is not set by default
Differential Revision: https://developer.blender.org/D7400
Reviewed By: Brecht
|
|
Reviewed by: brecht pmoursnv
Differential Revision: https://developer.blender.org/D7136
|
|
This feature takes some inspiration from
"RenderMan: An Advanced Path Tracing Architecture for Movie Rendering" and
"A Hierarchical Automatic Stopping Condition for Monte Carlo Global Illumination"
The basic principle is as follows:
While samples are being added to a pixel, the adaptive sampler writes half
of the samples to a separate buffer. This gives it two separate estimates
of the same pixel, and by comparing their difference it estimates convergence.
Once convergence drops below a given threshold, the pixel is considered done.
When a pixel has not converged yet and needs more samples than the minimum,
its immediate neighbors are also set to take more samples. This is done in order
to more reliably detect sharp features such as caustics. A 3x3 box filter that
is run periodically over the tile buffer is used for that purpose.
After a tile has finished rendering, the values of all passes are scaled as if
they were rendered with the full number of samples. This way, any code operating
on these buffers, for example the denoiser, does not need to be changed for
per-pixel sample counts.
Reviewed By: brecht, #cycles
Differential Revision: https://developer.blender.org/D4686
|
|
This node provides the ability to rotate a vector around a `center` point using either `Axis Angle` , `Single Axis` or `Euler` methods.
Reviewed By: #cycles, brecht
Differential Revision: https://developer.blender.org/D3789
|
|
Custom render passes are added in the Shader AOVs panel in the view layer
settings, with a name and data type. In shader nodes, an AOV Output node
is then used to output either a value or color to the pass.
Arbitrary names can be used for these passes, as long as they don't conflict
with built-in passes that are enabled. The AOV Output node can be used in both
material and world shader nodes.
Implemented by Lukas, with tweaks by Brecht.
Differential Revision: https://developer.blender.org/D4837
|
|
|