Age | Commit message (Collapse) | Author |
|
Increasing the samplig dimensions like this is not optimal, I'm looking
into some deeper changes to reuse the random number and change the RR
probabilities, but this should fix the bug for now.
|
|
This is more important now that we will have tigther volume bounds that
we hit multiple times. It also avoids some noise due to RR previously
affecting these surfaces, which shouldn't have been the case and should
eventually be fixed for transparent BSDFs as well.
For non-volume scenes I found no performance impact on NVIDIA or AMD.
For volume scenes the noise decrease and fixed artifacts are worth the
little extra render time, when there is any.
|
|
We now continue transparent paths after diffuse/glossy/transmission/volume
bounces are exceeded. This avoids unexpected boundaries in volumes with
transparent boundaries. It is also required for MIS to work correctly with
transparent surfaces, as we also continue through these in shadow rays.
The main visible changes is that volumes will now be lit by the background
even at volume bounces 0, same as surfaces.
Fixes T53914 and T54103.
|
|
It is basically brute force volume scattering within the mesh, but part
of the SSS code for faster performance. The main difference with actual
volume scattering is that we assume the boundaries are diffuse and that
all lighting is coming through this boundary from outside the volume.
This gives much more accurate results for thin features and low density.
Some challenges remain however:
* Significantly more noisy than BSSRDF. Adding Dwivedi sampling may help
here, but it's unclear still how much it helps in real world cases.
* Due to this being a volumetric method, geometry like eyes or mouth can
darken the skin on the outside. We may be able to reduce this effect,
or users can compensate for it by reducing the scattering radius in
such areas.
* Sharp corners are quite bright. This matches actual volume rendering
and results in some other renderers, but maybe not so much real world
objects.
Differential Revision: https://developer.blender.org/D3054
|
|
This also fixes a subtle bug in the split kernel branched path SSS, the
volume stack update can't be shared between multiple hit points.
|
|
This is only needed for SSS which bounces to a different shading point.
|
|
Previously we stored each color channel in a single closure, which was
convenient for sampling a closure and channel together. But this doesn't
work so well for algorithms where we want to render multiple color
channels together.
|
|
This was broken in d750d18.
|
|
Goal is to reduce OpenCL kernel recompilations.
Currently viewport renders are still set to use 64 closures as this seems to
be faster and we don't want to cause a performance regression there. Needs
to be investigated.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2775
|
|
|
|
With a Titan Xp, reduces path trace local memory from 1092MB to 840MB.
Benchmark performance was within 1% with both RX 480 and Titan Xp.
Original patch was implemented by Sergey.
Differential Revision: https://developer.blender.org/D2249
|
|
|
|
|
|
This was originally done with the first sample in the kernel for better
performance, but it doesn't work anymore with atomics. Any benefit was
very minor anyway, too small to measure it seems.
|
|
A little faster on some benchmark scenes, a little slower on others, seems
about performance neutral on average and saves a little memory.
|
|
This is done by storing only a subset of PathRadiance, and by storing
direct light immediately in the main PathRadiance. Saves about 10% of
CUDA stack memory, and simplifies subsurface indirect ray code.
|
|
Similar to what we did for area lights previously, this should help
preserve stratification when using multiple BSDFs in theory. Improvements
are not easily noticeable in practice though, because the number of BSDFs
is usually low. Still nice to eliminate one sampling dimension.
|
|
Previously the Sobol pattern suffered from some correlation issues that
made the outline of objects like a smoke domain visible. This helps
simplify the code and also makes some other optimizations possible.
|
|
|
|
|
|
Benchmarks peformance on GTX 1080 and RX 480 on Linux is the same for
bmw27, classroom, pabellon, and about 2% faster on fishy_cat and koro.
|
|
|
|
Need to exit the volume stack when shadow ray laves the medium.
Thanks Brecht for review and help in troubleshooting!
|
|
This was needed when we accessed OSL closure memory after shader evaluation,
which could get overwritten by another shader evaluation. But all closures
are immediatley converted to ShaderClosure now, so no longer needed.
|
|
Also pass by value and don't write back now that it is just a hash for seeding
and no longer an LCG state. Together this makes CUDA a tiny bit faster in my
tests, but mainly simplifies code.
|
|
|
|
|
|
|
|
|
|
Added some extra tirckery to avoid background being tinted dark with transparent
surface. Maybe a bit hacky, but seems to work fine.
|
|
Since all the shadow catchers are already assumed to be in the footage,
the shadows they cast on each other are already in the footage too. So
don't just let shadow catchers skip self, but all shadow catchers.
Another justification is that it should not matter if the shadow catcher
is modeled as one object or multiple separate objects, the resulting
render should be the same.
Differential Revision: https://developer.blender.org/D2763
|
|
transparent object
Tweaked the path radiance summing and alpha to accommodate for possible contribution of
light by transparent surface bounces happening prior to shadow catcher intersection.
This commit will change the way how shadow catcher results looks when was behind semi
transparent object, but the old result seemed to be fully wrong: there were big artifacts
when alpha-overing the result on some actual footage.
|
|
In branched path tracing main loop is always a camera ray, with varying
number of transparent bounces.
|
|
This commit contains the first part of the new Cycles denoising option,
which filters the resulting image using information gathered during rendering
to get rid of noise while preserving visual features as well as possible.
To use the option, enable it in the render layer options. The default settings
fit a wide range of scenes, but the user can tweak individual settings to
control the tradeoff between a noise-free image, image details, and calculation
time.
Note that the denoiser may still change in the future and that some features
are not implemented yet. The most important missing feature is animation
denoising, which uses information from multiple frames at once to produce a
flicker-free and smoother result. These features will be added in the future.
Finally, thanks to all the people who supported this project:
- Google (through the GSoC) and Theory Studios for sponsoring the development
- The authors of the papers I used for implementing the denoiser (more details
on them will be included in the technical docs)
- The other Cycles devs for feedback on the code, especially Sergey for
mentoring the GSoC project and Brecht for the code review!
- And of course the users who helped with testing, reported bugs and things
that could and/or should work better!
|
|
This implements branched path tracing for the split kernel.
General approach is to store the ray state at a branch point, trace the
branched ray as normal, then restore the state as necessary before iterating
to the next part of the path. A state machine is used to advance the indirect
loop state, which avoids the need to add any new kernels. Each iteration the
state machine recreates as much state as possible from the stored ray to keep
overall storage down.
Its kind of hard to keep all the different integration loops in sync, so this
needs lots of testing to make sure everything is working correctly. We should
probably start trying to deduplicate the integration loops more now.
Nonbranched BMW is ~2% slower, while classroom is ~2% faster, other scenes
could use more testing still.
Reviewers: sergey, nirved
Reviewed By: nirved
Subscribers: Blendify, bliblubli
Differential Revision: https://developer.blender.org/D2611
|
|
Since 9d50175 this is no longer needed, at least not with the current
sampler we are using.
|
|
Simplifies code quite a bit, making it shorter and easier to extend.
Currently no functional changes for users, but is required for the
upcoming work of shadow catcher support with OpenCL.
|
|
It uses an idea of accumulating all possible light reachable across the
light path (without taking shadow blocked into account) and accumulating
total shaded light across the path. Dividing second figure by first one
seems to be giving good estimate of the shadow.
In fact, to my knowledge, it's something really similar to what is
happening in the denoising branch, so we are aligned here which is good.
The workflow is following:
- Create an object which matches real-life object on which shadow is
to be catched.
- Create approximate similar material on that object.
This is needed to make indirect light properly affecting CG objects
in the scene.
- Mark object as Shadow Catcher in the Object properties.
Ideally, after doing that it will be possible to render the image and
simply alpha-over it on top of real footage.
|
|
|
|
We started to run out of bits there, so now we separate flags
which came from __object_flags and which are either runtime or
coming from __shader_flags.
Rule now is: SD_OBJECT_* flags are to be tested against new
object_flags field of ShaderData, all the rest flags are to
be tested against flags field of ShaderData.
There should be no user-visible changes, and time difference
should be minimal. In fact, from tests here can only see hardly
measurable difference and sometimes the new code is somewhat
faster (all within a noise floor, so hard to tell for sure).
Reviewers: brecht, dingto, juicyfruit, lukasstockner97, maiself
Differential Revision: https://developer.blender.org/D2428
|
|
|
|
This way it's more clear whether some issue is caused by lots of geometry in
the node or by lots of "transparent" BVH nodes.
|
|
Discard the whole volume stack on the last bounce (but keep
world volume if present).
Volumes are expected to be closed manifol meshes, meaning if
ray entered the volume there should be an intersection event
of ray exisintg the volume. Case when ray hit nothing and
there are still non-world volumes in the stack can happen in
either of cases.
1. Mesh is not closed manifold.
Such configurations are not really supported anyway and should
not be used.
Previous code would have consider the infinite length of the
ray to sample across, so render result wasn't really correct
anyway.
2. Exit intersection is more far away than the camera far
clip distance.
This case also will behave differently now, but previously it
wasn't really correct either, so it's not like we're breaking
something which was working as expected.
3. We missed exit event due to intersection precision issues.
This is exact the case which this patch fixes and avoid
fireflies.
4. Volume has Camera only visibility (all the rest visibility
is set to off)
This is what could be considered a regression but could be
solved quite easily by checking volume stack's objects flags
and keep entries which doesn't have Volume Scatter visibility
(or even better: ensure Volume Scatter visibility for objects
with volume closure),
Fixes T46108: Cycles - Overlapping emissive volumes generates unexpected bright hotspots around the intersection
Also fixes fireflies appearing on the edges of cube with
emissive volue.
Reviewers: juicyfruit, brecht
Reviewed By: brecht
Maniphest Tasks: T46108
Differential Revision: https://developer.blender.org/D2212
|
|
`kernel_path.h` and `kernel_path_branched.h` have a lot of conditional code and
it was kind of hard to tell what code belonged to which directive. Should be
easier to read now.
|
|
Mostly this is making inlining match CUDA 7.5 in a few performance critical
places. The end result is that performance is now better than before, possibly
due to less register spilling or other CUDA 8.0 compiler improvements.
On benchmarks scenes, there are 3% to 35% render time reductions. Stack memory
usage is reduced a little too.
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D2269
|
|
|
|
Makes it so toolkit does exactly the same decision about what to inline,
but unfortunately it has really barely visible difference on GTX-980.
|
|
Glossy, Anisotropic and Glass BSDFs
This commit adds a new distribution to the Glossy, Anisotropic and Glass BSDFs that implements the
multiple-scattering microfacet model described in the paper "Multiple-Scattering Microfacet BSDFs with the Smith Model".
Essentially, the improvement is that unlike classical GGX, which only models single scattering and assumes
the contribution of multiple bounces to be zero, this new model performs a random walk on the microsurface until
the ray leaves it again, which ensures perfect energy conservation.
In practise, this means that the "darkening problem" - GGX materials becoming darker with increasing
roughness - is solved in a physically correct and efficient way.
The downside of this model is that it has no (known) analytic expression for evalation. However, it can be
evaluated stochastically, and although the correct PDF isn't known either, the properties of MIS and the
balance heuristic guarantee an unbiased result at the cost of slightly higher noise.
Reviewers: dingto, #cycles, brecht
Reviewed By: dingto, #cycles, brecht
Subscribers: bliblubli, ace_dragon, gregzaal, brecht, harvester, dingto, marcog, swerner, jtheninja, Blendify, nutel
Differential Revision: https://developer.blender.org/D2002
|
|
Saves about 15% for the branched path kernel.
|
|
57% less for path and 48% less for branched path.
|