Age | Commit message (Collapse) | Author |
|
|
|
These transparent shadows can be expansive to evaluate. Especially on the
GPU they can lead to poor occupancy when only some pixels require many kernel
launches to trace and evaluate many layers of transparency.
Baked transparency allows tracing a single ray in many cases by accumulating
the throughput directly in the intersection program without recording hits
or evaluating shaders. Transparency is baked at curve vertices and
interpolated, for most shaders this will look practically the same as actual
shader evaluation.
Fixes T91428, performance regression with spring demo file due to transparent
hair, and makes it render significantly faster than Blender 2.93.
Differential Revision: https://developer.blender.org/D12880
|
|
Helps save one OptiX payload and is a bit more efficient.
Differential Revision: https://developer.blender.org/D12909
|
|
The motivation for this is twofold. It improves performance (5-10% on most
benchmark scenes), and will help to bring back transparency support for the
ambient occlusion pass.
* Duplicate some members from the main path state in the shadow path state.
* Add shadow paths incrementally to the array similar to what we do for
the shadow catchers.
* For the scheduling, allow running shade surface and shade volume kernels
as long as there is enough space in the shadow paths array. If not, execute
shadow kernels until it is empty.
* Add IntegratorShadowState and ConstIntegratorShadowState typedefs that
can be different between CPU and GPU. For GPU both main and shadow paths
juse have an integer for SoA access. Bt with CPU it's a different pointer
type so we get type safety checks in code shared between CPU and GPU.
* For CPU, add a separate IntegratorShadowStateCPU struct embedded in
IntegratorShadowState.
* Update various functions to take the shadow state, and make SVM take either
type of state using templates.
Differential Revision: https://developer.blender.org/D12889
|
|
|
|
|
|
Need to initialize components for the full Diffuse BSDF.
Steps to reproduce:
- Default cube scene
- Switch to Cycles renderer
- Enable OSL backend
- Start viewport render
- Observe cube being much black
Differential Revision: https://developer.blender.org/D12921
|
|
A Clang-Format configuration to make the closure definition block to
be properly recognized as such.
Also small wrapper macro to avoid comma in the actual definition code
which was causing unwanted indentation of parameters definition.
Requires Clang-Format 7 or newer. The version we ship in the libs is
12, so for recommended development setup it should all be good.
Differential Revision: https://developer.blender.org/D12920
|
|
The thread affinity setting in OIDN can break multithreading on some CPUs.
While this leads to somewhat worse performance on CPUs that do work correctly,
it's better than having some CPUs use only half the cores.
|
|
Better not to tempt anyone from using unsafe access to device
functionality during host update.
|
|
The lookup tables are to be initialized after device free.
On Linux was only noticeable when rendering default cube scene with
an extra assert. On Windows it was causing an assert in STL in debug
builds.
Differential Revision: https://developer.blender.org/D12918
|
|
|
|
This reverts commit 3065d2609700d14100490a16c91152a6e71790e8. Causing crashes
in the spring scene.
|
|
Ref D12889
|
|
* isect Ng is no longer needed for shadows, for main path needed for SSS only
* Reduce rng_offset and queued_kernel to 16 bits
Ref D12889
|
|
Only copy the number of items used instead of the max items.
Ref D12889
|
|
This is only used by a single device, not need for thread safety.
|
|
|
|
* Rename struct KernelGlobals to struct KernelGlobalsCPU
* Add KernelGlobals, IntegratorState and ConstIntegratorState typedefs
that every device can define in its own way.
* Remove INTEGRATOR_STATE_ARGS and INTEGRATOR_STATE_PASS macros and
replace with these new typedefs.
* Add explicit state argument to INTEGRATOR_STATE and similar macros
In preparation for decoupling main and shadow paths.
Differential Revision: https://developer.blender.org/D12888
|
|
|
|
|
|
|
|
Caused a debug crash in Windows MSVS.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D12873
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This is the first of a sequence of changes to support compiling Cycles kernels as MSL (Metal Shading Language) in preparation for a Metal GPU device implementation.
MSL requires that all pointer types be declared with explicit address space attributes (device, thread, etc...). There is already precedent for this with Cycles' address space macros (ccl_global, ccl_private, etc...), therefore the first step of MSL-enablement is to apply these consistently. Line-for-line this represents the largest change required to enable MSL. Applying this change first will simplify future patches as well as offering the emergent benefit of enhanced descriptiveness.
The vast majority of deltas in this patch fall into one of two cases:
- Ensuring ccl_private is specified for thread-local pointer types
- Ensuring ccl_global is specified for device-wide pointer types
Additionally, the ccl_addr_space qualifier can be removed. Prior to Cycles X, ccl_addr_space was used as a context-dependent address space qualifier, but now it is either redundant (e.g. in struct typedefs), or can be replaced by ccl_global in the case of pointer types. Associated function variants (e.g. lcg_step_float_addrspace) are also redundant.
In cases where address space qualifiers are chained with "const", this patch places the address space qualifier first. The rationale for this is that the choice of address space is likely to have the greater impact on runtime performance and overall architecture.
The final part of this patch is the addition of a metal/compat.h header. This is partially complete and will be extended in future patches, paving the way for the full Metal implementation.
Ref T92212
Reviewed By: brecht
Maniphest Tasks: T92212
Differential Revision: https://developer.blender.org/D12864
|
|
The assumption about absent shadow path was wrong.
The rest of the changes are to ensure shadow paths are finished prior
to the split, so that they write to the proper passes.
The issue was caught by running regression tests on OptiX.
Differential Revision: https://developer.blender.org/D12857
|
|
Happens i.e. when changing compute device.
A more proper follow-up to the on-demand display driver creation change.
|
|
Create display early on, so that ready_to_reset() passes assert test
for use for display actually configured.
|
|
The pixel accessor was not aware of possible offset in the
pixel padding causing some slices of the result not being
properly padded.
|
|
Ensure math happens on size_t type instead of int followed by a cast
to the size_t.
|
|
When baking in a debug build running gdb it kept asserting because a GL context was being created outside the main thread.
To fix this the patch only creates the GL context is only created for rendering (when it is actually used).
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D12767
|
|
Commited by mistake
This reverts commit 6535779c92b90035870047f178cf3eff95f0bdf0.
|
|
This makes sure the previously bound context is restored after creating a
new context. This follows what is already happening on windows.
All system backend are patched.
This also removes the goto and some code duplication.
Differential Revision: https://developer.blender.org/D12455
|
|
Need to check allocation size, as the features do not change with
volume stack depth detection.
|
|
Addresses T77127 (Controller Drawing).
Adds VR controller visualization and custom drawing via draw
handlers. Add-ons can draw to the XR surface (headset display) and
mirror window by adding a View3D draw handler of region type 'XR' and
draw type 'POST_VIEW'. Controller drawing and custom overlays can be
toggled individually as XR session options, which will be added in a
future update to the VR Scene Inspection add-on.
For the actual drawing, the OpenXR XR_MSFT_controller_model extension
is used to load a glTF model provided by the XR runtime. The model's
vertex data is then used to create a GPUBatch in the XR session
state. Finally, this batch is drawn via the XR surface draw handler
mentioned above.
For runtimes that do not support the controller model extension, a
a simple fallback shape (sphere) is drawn instead.
Reviewed By: Severin, fclem
Differential Revision: https://developer.blender.org/D10948
|
|
|
|
Introduces `BKE_appdir_folder_caches` to get the folder that
can be used to store caches. On different OS's different folders
are used.
- Linux: `~/.cache/blender/`.
- MacOS: `Library/Caches/Blender/`.
- Windows: `(%USERPROFILE%\AppData\Local)\Blender Foundation\Blender\Cache\`.
Reviewed By: Severin
Differential Revision: https://developer.blender.org/D12822
|
|
|
|
For details see the "Extending the Disney BRDF to a BSDF with Integrated
Subsurface Scattering" paper.
We split the diffuse BSDF into a lambertian and retro-reflection component.
The retro-reflection component is always handled as a BSDF, while the
lambertian component can be replaced by a BSSRDF.
For the BSSRDF case, we compute Fresnel separately at the entry and exit
points, which may have different normals. As the scattering radius decreases
this converges to the BSDF case.
A downside is that this increases noise for subsurface scattering in the
Principled BSDF, due to some samples going to the retro-reflection component.
However the previous logic (also in 2.93) was simple wrong, using a
non-sensical view direction vector at the exit point. We use an importance
sampling weight estimate for the retro-reflection to try to better balance
samples between the BSDF and BSSRDF.
Differential Revision: https://developer.blender.org/D12801
|
|
There is not enough time before the release to improve Random Walk to handle
all cases this was used for, so restore it for now.
Since there is no more path splitting in cycles-x, this can increase noise in
non-flat areas for the sample number of samples, though fewer rays will be traced
also. This is fundamentally a trade-off we made in the new design and why Random
Walk is a better fit. However the importance resampling we do now does help to
reduce noise.
Differential Revision: https://developer.blender.org/D12800
|
|
It got missed in some of previous development.
Can not see a reason why the line needed to be removed, maybe just some
accident.
|
|
|
|
Only count volume objects after shader optimization.
Allows to discard objects which don't have effective volume
BSDF connected to the shader output (i.e. constant folded,
or non-volume BSDF used by mistake).
Solves memory regression reported in T92014.
There is still possibility to improve memory even further
for cases when there are a lot of non-intersecting volume
objects, but that requires a deeper refactor of update
process. Will happen as a followup development.
Differential Revision: https://developer.blender.org/D12797
|
|
The longer-term goal is to separate host-only scene update
from device update: make it possible to make kernel features
depend on actual scene state and flags.
This change makes it so shaders are compiled before kernel
load, making checks like "has_volume" available at the
kernel features calculation state.
No functional changes are expected at this point.
Differential Revision: https://developer.blender.org/D12795
|