Age | Commit message (Collapse) | Author |
|
artifacts
Enabling or disabling motion blur requires rebuilding the BVH of affected geometry and
uploading modified vertices to the device (since without motion blur the transform is
applied to the vertex positions, whereas with motion blur this is done during traversal).
Previously neither was happening when persistent data was enabled, since the relevant
node sockets were not tagged as modified after toggling motion blur.
The change to blender_object.cpp makes it so `geom->set_use_motion_blur()` is always
called (regardless of motion blur being toggled on or off), which will tag the geometry
as modified if that value changed and ensures the BVH is updated.
The change to hair.cpp/mesh.cpp was necessary since after motion blur is disabled,
the transform is applied to the vertex positions of a mesh, but those changes were not
uploaded to the device. This is fixed now that they are tagged as modified.
Maniphest Tasks: T90666
Differential Revision: https://developer.blender.org/D12781
|
|
|
|
|
|
Only copy required part of volume stack instead of entire stack.
Solves time regression introduced by D12759 and avoids need in
implementing volume stack calculation to exactly match what the
path tracing will do (as well as potentially makes scenes with
a lot of volumes ans a tiny bit of deeply nested ones render
faster).
Still need to look into memory aspect of the regression, but
that is for separate patch.
Ref T92014
Maniphest Tasks: T92014
Differential Revision: https://developer.blender.org/D12790
|
|
|
|
|
|
|
|
The Embree logic did not match the GPU.
|
|
|
|
Was causing extra overscan pixels, and was confusing multiple workers
check after fix T91994.
|
|
The state template iteration had difficult time dealing with 0-sized
arrays, causing iteration for until integer overflows.
|
|
The overscan change from D12599 lacked proper handling of window
when slicing buffer for multiple devices.
|
|
Previously the storage here was optimized to avoid indirections in BVH2
traversal. This helps improve performance a bit, but makes performance
and memory usage of Embree and OptiX BVHs a bit worse also. It also adds
code complexity in other parts of the code.
Now decouple triangle and curve primitive storage from BVH2.
* Reduced peak memory usage on all devices
* Bit better performance for OptiX and Embree
* Bit worse performance for CUDA
* Simplified code:
** Intersection.prim/object now matches ShaderData.prim/object
** No more offset manipulation for mesh displacement before a BVH is built
** Remove primitive packing code and flags for Embree and OptiX
** Curve segments are now stored in a KernelCurve struct
* Also happens to fix a bug in baking with incorrect prim/object
Fixes T91968, T91770, T91902
Differential Revision: https://developer.blender.org/D12766
|
|
MSVC does not support variable size array definition.
Use maximum possible stack, similar to the GPU case.
Not expected to have user-measurable difference.
|
|
Make volume stack allocated conditionally, potentially based on the
actual nested level of objects in the scene.
Currently the nested level is estimated by number of volume objects.
This is a non-expensive check which is probably enough in practice
to get almost perfect memory usage and performance.
The conditional allocation is a bit tricky.
For the CPU we declare and define maximum possible volume stack,
because there are only that many integrator states on the CPU.
On the GPU we declare outer SoA to have all volume stack elements,
but only allocate actually needed ones. The actually used volume
stack size is passed as a pre-processor, which seems to be easiest
and fastest for the GPU state copy.
There seems to be no speed regression in the demo files on RTX6000.
Note that scenes with high nested level of volume will now be slower
but correct.
Differential Revision: https://developer.blender.org/D12759
|
|
|
|
It's unclear why this code was added in the first place, but it seems
unnecessary, it can be restored if we find this breaks something.
The Embree docs mention that the same primitive may be hit multiple times, but
my understanding is that about e.g. curves where both the frontside and backside
may be hit. However those hits would be at different distances.
The context for this change is that we want to add an optimization where we
can immediately update throughput for transparent shadows instead of recording
intersections, and avoid duplicate would require extra work. However there is
an Embree example that does something similar without worrying about duplicate
hits either.
|
|
Fixes:{T91064}
Caused by {rBcd118c5581f482afc8554ff88b5b6f3b552b1682}
- Applies `ensure_valid_reflection()` to the normal input on all BSDFs for CPU and GPU.
- This doesn't affect hair.
- Removes `ensure_valid_reflection()` from the output of Bump Map and Normal Map nodes for CPU/GPU as it is not needed.
- The fix doesn't touch OSL.
Reviewed By: brecht, leesonw
Maniphest Tasks: T91064
Differential Revision: https://developer.blender.org/D12403
|
|
|
|
This effectively undoes some of the following commit:
rB4537e8558468c71a03bf53f59c60f888b3412de2
The tables in question were duplicated 5-6 times into the blender
executable due to the headers being used in multiple translation units.
This contributes ~6.3kb worth of duplicate data into the binary.
Some further details are in the below revision.
Differential Revision: https://developer.blender.org/D12724
|
|
|
|
Implement an overscan support for tiles, so that adaptive sampling can
rely on the pixels neighbourhood.
Differential Revision: https://developer.blender.org/D12599
|
|
Currently was only used for logging, but better to fix the size so
that it matches reality.
The issue was caused by decoupling number of shadow intersections
and using much higher number for CPU. This caused the total state
on GPU to be logged as 10s of gigabytes instead of 100s of megabytes.
Differential Revision: https://developer.blender.org/D12755
|
|
And fix various broken things in the HIP kernel compilation.
|
|
For example, crash when attempting to use OptiX denoiser on systems
without OptiX-capable device.
Perform check that scene update happened without errors.
Note that `et_error` makes progress to cancel, so the code was
simplified a bit.
|
|
Always sample background pass behind shadow catcher (if the pass
exists, of course), regardless of whether shadow catcher will be
used as approximate or accurate.
Allows to combine accurate shadows into an environment map.
Differential Revision: https://developer.blender.org/D12747
|
|
|
|
|
|
This previously only work for CPU rendering, and isn't that practical to get
working in the new architecture.
|
|
|
|
|
|
|
|
|
|
* Add OutputDriver, replacing function callbacks in Session.
* Add PathTraceTile, replacing tile access methods in Session.
* Add more detailed comments about how this driver should be implemented.
* Add OIIOOutputDriver for Cycles standalone to output an image.
Differential Revision: https://developer.blender.org/D12627
|
|
* Split GPUDisplay into two classes. PathTraceDisplay to implement the Cycles side,
and DisplayDriver to implement the host application side. The DisplayDriver is now
a fully abstract base class, embedded in the PathTraceDisplay.
* Move copy_pixels_to_texture implementation out of the host side into the Cycles side,
since it can be implemented in terms of the texture buffer mapping.
* Move definition of DeviceGraphicsInteropDestination into display driver header, so
that we do not need to expose private device headers in the public API.
* Add more detailed comments about how the DisplayDriver should be implemented.
The "driver" terminology might not be obvious, but is also used in other renderers.
Differential Revision: https://developer.blender.org/D12626
|
|
Replacement for float curve in legacy Attribute Curve Map node.
Float Curve defaults to [0.0-1.0] range.
Reviewed By: JacquesLucke, brecht
Differential Revision: https://developer.blender.org/D12683
|
|
Caused by a lack of synchronization between update process which sets
clear flag and the draw code checking the flag outside of a lock.
|
|
This struct is much bigger now, and does not actually need to be fully zero
initialized.
|
|
Similar to the previous change in the area: need to avoid ray
point and direction becoming a non-finite value.
Use the view direction when the geometrical normal can not be
calculated.
Collaboration and sanity inspiration with Brecht!
Differential Revision: https://developer.blender.org/D12703
|
|
Could cause an actual bug but probability is low in practice.
|
|
So we can do fewer intersection calls, only on the GPU do we need to save
memory and do this in small steps.
Ref T87836
|
|
The issue was caused by hair shader setup setting normal to a non
finite value, which then gets used to create a ray with non-finite
direction, making BVH traversal to run out of stack memory.
Happens with 150_0040_A.lighting.blend frame 112 of the Sprites
project.
Differential Revision: https://developer.blender.org/D12692
|
|
Avoids possible numerical issues in the path tracing kernel, which
is most important for displacement as non-finite values in BVH can
lead to infinite node recursion during traversal.
Differential Revision: https://developer.blender.org/D12690
|
|
Noticed while looking into flickering issues in viewport.
Doesn't seem to solve the flicker issue for me, but is something
what is supposed to be happening anyway.
Differential Revision: https://developer.blender.org/D12673
|
|
|
|
|
|
|
|
NOTE: this feature is not ready for user testing, and not yet enabled in daily
builds. It is being merged now for easier collaboration on development.
HIP is a heterogenous compute interface allowing C++ code to be executed on
GPUs similar to CUDA. It is intended to bring back AMD GPU rendering support
on Windows and Linux.
https://github.com/ROCm-Developer-Tools/HIP.
As of the time of writing, it should compile and run on Linux with existing
HIP compilers and driver runtimes. Publicly available compilers and drivers
for Windows will come later.
See task T91571 for more details on the current status and work remaining
to be done.
Credits:
Sayak Biswas (AMD)
Arya Rafii (AMD)
Brian Savery (AMD)
Differential Revision: https://developer.blender.org/D12578
|
|
This change fixes an issue when scene has a shadow catcher and film is
configured to be transparent. Starting viewport render and making the
background non-transparent will cause bad memory access (wrong render
and possibly crash).
Film passes depends on transparency of background, so check for this.
Demo file: F10650585
Differential Revision: https://developer.blender.org/D12666
|
|
Only do denoising on the full-frame result. Saves render time.
Can re-consider in the future when/if we'll want to support
denoising during rendering (similar to viewport) to allow artists
to stop rendering when they see image to be good enough. Until
there is a design for that workflow stick to a more time efficient
rendering.
Differential Revision: https://developer.blender.org/D12662
|