Age | Commit message (Collapse) | Author |
|
Ref D8237, T78710
|
|
Contributed by pembem22.
Differential Revision: https://developer.blender.org/D8947
|
|
Now it matches Eevee, OpenGL and other renderers. Panoramic camera depth passes
are unchanged, and are still distance from the camera center.
|
|
No functional changes yet, this is work towards making CPU and GPU results
match more closely.
|
|
|
|
This feature takes some inspiration from
"RenderMan: An Advanced Path Tracing Architecture for Movie Rendering" and
"A Hierarchical Automatic Stopping Condition for Monte Carlo Global Illumination"
The basic principle is as follows:
While samples are being added to a pixel, the adaptive sampler writes half
of the samples to a separate buffer. This gives it two separate estimates
of the same pixel, and by comparing their difference it estimates convergence.
Once convergence drops below a given threshold, the pixel is considered done.
When a pixel has not converged yet and needs more samples than the minimum,
its immediate neighbors are also set to take more samples. This is done in order
to more reliably detect sharp features such as caustics. A 3x3 box filter that
is run periodically over the tile buffer is used for that purpose.
After a tile has finished rendering, the values of all passes are scaled as if
they were rendered with the full number of samples. This way, any code operating
on these buffers, for example the denoiser, does not need to be changed for
per-pixel sample counts.
Reviewed By: brecht, #cycles
Differential Revision: https://developer.blender.org/D4686
|
|
This simplifies compositors setups and will be consistent with Eevee render
passes from D6331. There's a continuum between these passes and it's not clear
there is much advantage to having them available separately.
Differential Revision: https://developer.blender.org/D6848
|
|
denoising albedo pass
To determine the albedo pass, Cycles currently follows the path until a predominantly
diffuse-ish material is hit and then takes the albedo there.
This works fine for normal mirrors, but as it completely ignores the color of the bounces
before that diffuse-ish material, it also means that any textures that are applied to the
specular-ish BSDFs won't affect the albedo pass at all.
Therefore, this patch changes that behaviour so that Cycles also keeps track of the
throughput of all specular-ish closures along the path so far and includes that in
the albedo pass.
This fixes part of the issue described in T73043. However, since it has an effect on the
albedo pass in most scenes, it could cause cause regressions, which is why I'm uploading
it as a patch instead of just committing as a fix.
Differential Revision: https://developer.blender.org/D6640
|
|
Similar to the Microfacet Closures, the Principled BSDF Sheen closure is
added at a high weight but typically results in fairly low values.
Therefore, the default weight is a bad indicator of importance.
The fix here is the same as it was back then for Microfacets:
Compute an average weight using the normal as the half-vector
and use it to scale down the sample weight and the albedo channel.
In addition to drastically improving denoising of materials with
sheen when using the new Denoising node, this also can reduce noise
on such materials considerably.
|
|
transform_direction() can't handle parameters in constant address space.
Creating a local copy of the parameter satisfies the OpenCL compiler.
CUDA and CPU compilers should be able to optimize this away I hope.
|
|
This patch adds support for the OptiX denoiser as an alternative to the existing NLM denoiser in Cycles. It's re-using the same denoising architecture based on tiles and therefore implicitly also works with multiple GPUs.
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D6395
|
|
Custom render passes are added in the Shader AOVs panel in the view layer
settings, with a name and data type. In shader nodes, an AOV Output node
is then used to output either a value or color to the pass.
Arbitrary names can be used for these passes, as long as they don't conflict
with built-in passes that are enabled. The AOV Output node can be used in both
material and world shader nodes.
Implemented by Lukas, with tweaks by Brecht.
Differential Revision: https://developer.blender.org/D4837
|
|
average fresnel
The Principled BSDF uses Microfacet closures that include a fresnel term,
which are a special case since their weight tends to be near white even
if their average contribution is fairly low.
The sample weight is scaled by the average fresnel weight to account for
this, but the denoising albedo still used the unscaled weight.
This was fine for the original denoiser, but apparently OIDN can't handle
the resulting albedo pass well. Therefore, this commit adds the described
scaling to the albedo pass contribution as well.
This problem was described in T69770.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D6289
|
|
Small memory reduction change by only storing the pixels of the combined
pass when it is being shown in the viewport. Previously the combined pass
was always calculated and present in the output buffer. The combined pass
will still be calculated.
It is a limitation in Blender that Cycles always had a combined pass.
This patch will remove the limitation from the code base of Cycles.
Blender still has the limitation, but will always request the combined
renderpass when doing final rendering.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D5784
|
|
Ref D5363
|
|
|
|
Apply clang format as proposed in T53211.
For details on usage and instructions for migrating branches
without conflicts, see:
https://wiki.blender.org/wiki/Tools/ClangFormat
|
|
various parts of the CPU kernel
This commit adds a sample-based profiler that runs during CPU rendering and collects statistics on time spent in different parts of the kernel (ray intersection, shader evaluation etc.) as well as time spent per material and object.
The results are currently not exposed in the user interface or per Python yet, to see the stats on the console pass the "--cycles-print-stats" argument to Cycles (e.g. "./blender -- --cycles-print-stats").
Unfortunately, there is no clear way to extend this functionality to CUDA or OpenCL, so it is CPU-only for now.
Reviewers: brecht, sergey, swerner
Reviewed By: brecht, swerner
Differential Revision: https://developer.blender.org/D3892
|
|
|
|
It is supposed to be two spaces before comment stating which if
else/endif statements corresponds to. Was mainly violated in the
header guards.
|
|
This allows for extra output passes that encode automatic object and material masks
for the entire scene. It is an implementation of the Cryptomatte standard as
introduced by Psyop. A good future extension would be to add a manifest to the
export and to do plenty of testing to ensure that it is fully compatible with other
renderers and compositing programs that use Cryptomatte.
Internally, it adds the ability for Cycles to have several passes of the same type
that are distinguished by their name.
Differential Revision: https://developer.blender.org/D3538
|
|
|
|
Roughness baking previously defaulted to 1.0 for all diffuse materials,
now we also bake roughness values of Oren-Nayer and Principled Diffuse.
Differential Revision: https://developer.blender.org/D3115
|
|
This can be enabled in the Film panel, with an option to control the
transmisison roughness below which glass becomes transparent.
Differential Revision: https://developer.blender.org/D2904
|
|
No color pass because it's hard to define what to use as color in a volume.
Reviewers: sergey, brecht
Differential Revision: https://developer.blender.org/D2903
|
|
passes
|
|
Denoising of specular shaders
|
|
|
|
OpenCL.
The work size is still very conservative, and this doesn't help for progressive
refine. For that we will need to render multiple tiles at the same time. But this
should already help for denoising renders that require too much memory with big
tiles, and just generally soften the performance dropoff with small tiles.
Differential Revision: https://developer.blender.org/D2856
|
|
This was originally done with the first sample in the kernel for better
performance, but it doesn't work anymore with atomics. Any benefit was
very minor anyway, too small to measure it seems.
|
|
There is no significant difference in denoised benchmark scenes and
denoising ctests, so might as well make it all consistent.
|
|
|
|
Benchmarks peformance on GTX 1080 and RX 480 on Linux is the same for
bmw27, classroom, pabellon, and about 2% faster on fishy_cat and koro.
|
|
|
|
|
|
|
|
Was some mismatch in address space. Seems to be caused by recent additions.
Additionally, moved decoupled ray marching functions under ifdef, so they
don't try to use malloc() functions.
Thanks Mai for testing the patch!
|
|
Volume shaders without anything connected to the surface output are treated
as if they had a transparent BSDF as the surface shader in Cycles, so the
denoiser should skip feature pass writing for them just as it does with an
actual transparent BSDF.
|
|
|
|
Numerical inaccuracies would cause the XtWX matrix to be no longer
positive-semidefinite, which in turn caused the LSQ solver to fail.
|
|
This commit contains the first part of the new Cycles denoising option,
which filters the resulting image using information gathered during rendering
to get rid of noise while preserving visual features as well as possible.
To use the option, enable it in the render layer options. The default settings
fit a wide range of scenes, but the user can tweak individual settings to
control the tradeoff between a noise-free image, image details, and calculation
time.
Note that the denoiser may still change in the future and that some features
are not implemented yet. The most important missing feature is animation
denoising, which uses information from multiple frames at once to produce a
flicker-free and smoother result. These features will be added in the future.
Finally, thanks to all the people who supported this project:
- Google (through the GSoC) and Theory Studios for sponsoring the development
- The authors of the papers I used for implementing the denoiser (more details
on them will be included in the technical docs)
- The other Cycles devs for feedback on the code, especially Sergey for
mentoring the GSoC project and Brecht for the code review!
- And of course the users who helped with testing, reported bugs and things
that could and/or should work better!
|
|
|
|
This does a few things at once:
- Refactors host side split kernel logic into a new device
agnostic class `DeviceSplitKernel`.
- Removes tile splitting, a new work pool implementation takes its place and
allows as many threads as will fit in memory regardless of tile size, which
can give performance gains.
- Refactors split state buffers into one buffer, as well as reduces the
number of arguments passed to kernels. Means there's less code to deal
with overall.
- Moves kernel logic out of OpenCL kernel files so they can later be used by
other device types.
- Replaced OpenCL specific APIs with new generic versions
- Tiles can now be seen updating during rendering
|
|
|
|
This commit contains all the work related on the AMD megakernel split work
which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus
some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely
someone else which we're forgetting to mention.
Currently only AMD cards are enabled for the new split kernel, but it is
possible to force split opencl kernel to be used by setting the following
environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1.
Not all the features are supported yet, and that being said no motion blur,
camera blur, SSS and volumetrics for now. Also transparent shadows are
disabled on AMD device because of some compiler bug.
This kernel is also only implements regular path tracing and supporting
branched one will take a bit. Branched path tracing is exposed to the
interface still, which is a bit misleading and will be hidden there soon.
More feature will be enabled once they're ported to the split kernel and
tested.
Neither regular CPU nor CUDA has any difference, they're generating the
same exact code, which means no regressions/improvements there.
Based on the research paper:
https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf
Here's the documentation:
https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit
Design discussion of the patch:
https://developer.blender.org/T44197
Differential Revision: https://developer.blender.org/D1200
|
|
This more a workaround for CUDA optimizer which can't optimize clamp(x, 0, 1)
into a single instruction and uses 4 instructions instead.
Original patch by @lockal with own modification:
Don't make changes outside of the kernel. They don't make any difference
anyway and term saturate() has a bit different meaning outside of kernel.
This gives around 2% of speedup in Barcelona file, but in more complex shader
setups with lots of math nodes with clamping speedup could be much nicer.
Subscribers: dingto
Projects: #cycles
Differential Revision: https://developer.blender.org/D1224
|
|
This was already mixed a bit, but the dot belongs there.
|
|
|
|
Z, Index, normal, UV and vector passes are only affected by surfaces with alpha
transparency equal to or higher than this threshold. With value 0.0 the first
surface hit will always write to these passes, regardless of transparency. With
higher values surfaces that are mostly transparent can be skipped until an opaque
surface is encountered.
|
|
This to avoids build conflicts with libc++ on FreeBSD, these __ prefixed values
are reserved for compilers. I apologize to anyone who has patches or branches
and has to go through the pain of merging this change, it may be easiest to do
these same replacements in your code and then apply/merge the patch.
Ref T37477.
|