git.blender.org/blender.git
Age | Commit message | Author
2021-11-29 | Cycles: Metal readiness: Specify DeviceQueue::enqueue arg types | Michael Jones
This patch adds new arg-type parameters to `DeviceQueue::enqueue` and its overrides. This is in preparation for the Metal backend which needs this information for correct argument encoding. Ref T92212 Reviewed By: brecht Maniphest Tasks: T92212 Differential Revision: https://developer.blender.org/D13357
2021-11-22 | Merge branch 'blender-v3.0-release' | Sergey Sharybin
2021-11-22 | Fix T90308: Cycles crash copying memory from device to host | Sergey Sharybin
Happens when the device runs out of memory and Cycles is moving some textures to host memory. The delayed memory free for the OptiX BVH was moving data from one device_memory to another, leaving the original device memory in an invalid state. This was ruining the allocation map in the CUDA device, which uses a pointer to the device_memory. This change makes it so the memory pointer is stolen from the BVH into the delayed memory free list. Additionally, forbid copying and moving instances of device_memory and add sanity checks in the device implementation. Differential Revision: https://developer.blender.org/D13316
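The ownership rule described in the T90308 fix can be sketched roughly as follows. This is only an illustration under stated assumptions: `DeviceBuffer` and `DelayedFreeList` are hypothetical names, not the Cycles `device_memory` class or its real free list, and the device pointer is modeled as a plain integer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a device allocation handle (not Cycles' device_memory).
struct DeviceBuffer {
  std::uintptr_t device_pointer = 0;
  std::size_t size = 0;

  DeviceBuffer() = default;
  DeviceBuffer(std::uintptr_t ptr, std::size_t bytes) : device_pointer(ptr), size(bytes) {}

  // Copying or moving would leave two objects claiming one allocation, so forbid both.
  DeviceBuffer(const DeviceBuffer &) = delete;
  DeviceBuffer &operator=(const DeviceBuffer &) = delete;
  DeviceBuffer(DeviceBuffer &&) = delete;
  DeviceBuffer &operator=(DeviceBuffer &&) = delete;

  // Hand the raw pointer to the caller and leave this object empty.
  std::uintptr_t steal_pointer()
  {
    assert(device_pointer != 0 && "nothing to steal");
    const std::uintptr_t ptr = device_pointer;
    device_pointer = 0;
    size = 0;
    return ptr;
  }
};

static_assert(!std::is_copy_constructible<DeviceBuffer>::value, "copying is forbidden");
static_assert(!std::is_move_constructible<DeviceBuffer>::value, "moving is forbidden");

// Hypothetical delayed-free list that stores raw pointers instead of buffer objects.
struct DelayedFreeList {
  std::vector<std::uintptr_t> pending;

  void defer_free(DeviceBuffer &buffer)
  {
    if (buffer.device_pointer != 0) {
      pending.push_back(buffer.steal_pointer());
    }
  }
};
```

Because the list only ever holds the stolen pointer, no second buffer object exists that an allocation map could end up pointing at.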
2021-11-17 | Cycles: add build option to enable a debugging feature for MIS | Sebastian Herholz
This patch adds a CMake option "WITH_CYCLES_DEBUG" which builds Cycles with a feature that allows debugging/selecting the direct-light sampling strategy. The same option may later be used to add other debugging features that could affect performance in release builds.
The three options are:
* Forward path tracing (e.g., via BSDF or phase function)
* Next-event estimation
* Multiple importance sampling combination of the previous two methods
Such a feature is useful for debugging different light sampling, evaluation, and pdf methods (e.g., for light sources and BSDFs).
Differential Revision: https://developer.blender.org/D13152
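One way to picture the three direct-light strategies from the MIS debugging commit is as a switch over the weights given to each sampling technique. The sketch below is an assumption-laden illustration: the enum and function names are hypothetical, and the power heuristic with exponent 2 is used as a common choice, since the commit message does not say which heuristic is applied.

```cpp
#include <cassert>

// Hypothetical names; the commit message does not give the real identifiers.
enum class DirectLightStrategy { Forward, NextEvent, MIS };

// Power heuristic with exponent 2 (an assumed, commonly used MIS weighting).
float power_heuristic(float pdf_a, float pdf_b)
{
  assert(pdf_a >= 0.0f && pdf_b >= 0.0f);
  const float a2 = pdf_a * pdf_a;
  const float b2 = pdf_b * pdf_b;
  return (a2 + b2 > 0.0f) ? a2 / (a2 + b2) : 0.0f;
}

// Weight for a light hit found by forward sampling (BSDF or phase function).
float forward_weight(DirectLightStrategy strategy, float pdf_forward, float pdf_nee)
{
  switch (strategy) {
    case DirectLightStrategy::Forward:
      return 1.0f;
    case DirectLightStrategy::NextEvent:
      return 0.0f;
    case DirectLightStrategy::MIS:
      return power_heuristic(pdf_forward, pdf_nee);
  }
  return 0.0f;
}

// Weight for a light sample produced by next-event estimation.
float nee_weight(DirectLightStrategy strategy, float pdf_forward, float pdf_nee)
{
  switch (strategy) {
    case DirectLightStrategy::Forward:
      return 0.0f;
    case DirectLightStrategy::NextEvent:
      return 1.0f;
    case DirectLightStrategy::MIS:
      return power_heuristic(pdf_nee, pdf_forward);
  }
  return 0.0f;
}
```

With either single strategy selected, one technique carries the full weight and the other is disabled, which is what makes it possible to compare the estimators in isolation.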
2021-11-17 | Cleanup: Remove unused show_samples() device code in Cycles. | Thomas Dinges
2021-11-11 | Cleanup CUDA / HIP comments | Thomas Dinges
Remove outdated CUDA comments for bindless textures and clean up some HIP comments that still mentioned CUDA. Differential Revision: https://developer.blender.org/D13189
2021-11-01 | Merge branch 'blender-v3.0-release' | Clément Foucault
2021-11-01 | Cleanup: Remove Cycles device checks for half float. | Thomas Dinges
All supported devices support half float now, so we can remove the check. Differential Revision: https://developer.blender.org/D13021
2021-11-01 | Fix T92671: confusing Cycles debug logs about CPU architecture | Brecht Van Lommel
Instead of printing debug flags listing various CPU and GPU settings that may or may not be used, print when we are using them. This includes CPU kernel types, OptiX debugging, and CUDA and HIP adaptive compilation. BVH type was already printed.
2021-10-26 | Cycles: remove prefix from source code file names | Brecht Van Lommel
Remove prefix of filenames that is the same as the folder name. This used to help when #includes were using individual files, but now they are always relative to the cycles root directory and so the prefixes are redundant. For patches and branches, git merge and rebase should be able to detect the renames and move over code to the right file.
2021-10-26 | Cycles: changes to source code folders structure | Brecht Van Lommel
* Split render/ into scene/ and session/. The scene/ folder now contains the scene and its nodes. The session/ folder contains the render session and associated data structures like drivers and render buffers.
* Move top level kernel headers into new folders kernel/camera/, kernel/film/, kernel/light/, kernel/sample/, kernel/util/
* Move integrator related kernel headers into kernel/integrator/
* Move OSL shaders from kernel/shaders/ to kernel/osl/shaders/
For patches and branches, git merge and rebase should be able to detect the renames and move over code to the right file.
2021-10-21 | Cycles: add shadow path compaction for GPU rendering | Brecht Van Lommel
Similar to main path compaction that happens before adding work tiles, this compacts shadow paths before launching kernels that may add shadow paths. Only do it when more than 50% of space is wasted. It's not a clear win in all scenes, some are up to 1.5% slower. Likely caused by different order of scheduling kernels having an unpredictable performance impact. Still feels like compaction is just the right thing to avoid cases where a few shadow paths can hold up a lot of main paths. Differential Revision: https://developer.blender.org/D12944
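The "more than 50% of space is wasted" rule from the shadow path compaction commit can be sketched as a small decision helper. This is a hedged, serial illustration only: the function names and the slot representation are hypothetical, and the real GPU compaction runs as device kernels rather than a `std::remove` call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper. num_active is the number of live paths, and
// max_active_index is one past the highest occupied slot in the path array.
bool should_compact(std::size_t num_active, std::size_t max_active_index)
{
  assert(num_active <= max_active_index);
  const std::size_t wasted = max_active_index - num_active;
  // Compact only when more than half of the occupied range consists of holes.
  return wasted * 2 > max_active_index;
}

// Serial stand-in for compaction: move live entries to the front, keeping order.
// Returns the number of live entries that remain.
std::size_t compact_slots(std::vector<int> &slots, int empty_marker)
{
  slots.erase(std::remove(slots.begin(), slots.end(), empty_marker), slots.end());
  return slots.size();
}
```

Gating compaction on a threshold avoids paying its cost when the array is already mostly dense.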
2021-10-21 | Cleanup: make HIP and CUDA code more consistent | Brecht Van Lommel
Ref D12834
2021-10-20 | Cycles: reduce kernel reserved local memory when not using shader raytracing | Brecht Van Lommel
Ref T87836
2021-09-30 | Cycles: refactor API for GPU display | Brecht Van Lommel
* Split GPUDisplay into two classes. PathTraceDisplay to implement the Cycles side, and DisplayDriver to implement the host application side. The DisplayDriver is now a fully abstract base class, embedded in the PathTraceDisplay.
* Move copy_pixels_to_texture implementation out of the host side into the Cycles side, since it can be implemented in terms of the texture buffer mapping.
* Move definition of DeviceGraphicsInteropDestination into display driver header, so that we do not need to expose private device headers in the public API.
* Add more detailed comments about how the DisplayDriver should be implemented.
The "driver" terminology might not be obvious, but is also used in other renderers.
Differential Revision: https://developer.blender.org/D12626
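The split described in the GPU display refactor can be pictured with a simplified sketch. The class names `DisplayDriver` and `PathTraceDisplay` come from the commit message, but everything else here is an assumption made for illustration: the method signatures, the `float` pixel type, and the `MemoryDisplayDriver` test class are hypothetical and much simpler than the real Cycles API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Host application side: a fully abstract interface (simplified, assumed signatures).
class DisplayDriver {
 public:
  virtual ~DisplayDriver() = default;
  virtual std::size_t texture_size() const = 0;
  virtual float *map_texture_buffer() = 0;
  virtual void unmap_texture_buffer() = 0;
};

// Renderer side: owns the driver and builds higher-level operations on top of it.
class PathTraceDisplay {
 public:
  explicit PathTraceDisplay(std::unique_ptr<DisplayDriver> driver) : driver_(std::move(driver))
  {
    assert(driver_ != nullptr);
  }

  // Implemented purely in terms of buffer mapping, so the host does not need its own copy code.
  bool copy_pixels_to_texture(const float *pixels, std::size_t count)
  {
    if (count > driver_->texture_size()) {
      return false;
    }
    float *dst = driver_->map_texture_buffer();
    if (dst == nullptr) {
      return false;
    }
    std::copy(pixels, pixels + count, dst);
    driver_->unmap_texture_buffer();
    return true;
  }

 private:
  std::unique_ptr<DisplayDriver> driver_;
};

// Hypothetical host implementation backed by ordinary memory, for illustration only.
class MemoryDisplayDriver : public DisplayDriver {
 public:
  explicit MemoryDisplayDriver(std::size_t num_floats) : texture(num_floats, 0.0f) {}
  std::size_t texture_size() const override { return texture.size(); }
  float *map_texture_buffer() override
  {
    mapped = true;
    return texture.data();
  }
  void unmap_texture_buffer() override { mapped = false; }

  std::vector<float> texture;
  bool mapped = false;
};
```

A host application would subclass the driver for its own graphics API, while the renderer only ever talks to the abstract interface.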
2021-09-27 | Cycles: print name of kernels on errors in CUDA queue, for debugging | Brecht Van Lommel
2021-09-24 | Cleanup: remove unused device code and includes | Brecht Van Lommel
2021-09-23 | Fix T91641: crash rendering with 16k environment map in Cycles | Brecht Van Lommel
Protect against integer overflow.
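The commit message does not say which computation overflowed. Purely as an illustration of the general hazard, assume a 16384 x 8192 image with 4 float channels: its size in bytes is 2^31, one more than a signed 32-bit integer can hold. The helper names below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: compute an image's byte size in 64-bit arithmetic,
// widening the first operand so every multiplication happens in 64 bits.
std::uint64_t image_size_bytes(std::uint32_t width,
                               std::uint32_t height,
                               std::uint32_t channels,
                               std::uint32_t bytes_per_channel)
{
  return std::uint64_t(width) * height * channels * bytes_per_channel;
}

bool fits_in_int32(std::uint64_t value)
{
  return value <= std::uint64_t(std::numeric_limits<std::int32_t>::max());
}

// Narrow to int only after the range has been checked.
int to_int_checked(std::uint64_t value)
{
  assert(fits_in_int32(value));
  return static_cast<int>(value);
}
```

Doing the same multiplication in plain `int` would overflow for the assumed image size, which is the kind of error such a fix guards against.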
2021-09-22 | Cleanup: spelling in comments | Campbell Barton
2021-09-21 | Cycles: merge of cycles-x branch, a major update to the renderer | Brecht Van Lommel
This includes much improved GPU rendering performance, viewport interactivity, new shadow catcher, revamped sampling settings, subsurface scattering anisotropy, new GPU volume sampling, improved PMJ sampling pattern, and more.
Some features have also been removed or changed, breaking backwards compatibility, including the removal of the OpenCL backend, for which alternatives are under development.
Release notes and code docs:
https://wiki.blender.org/wiki/Reference/Release_Notes/3.0/Cycles
https://wiki.blender.org/wiki/Source/Render/Cycles
Credits:
* Sergey Sharybin
* Brecht Van Lommel
* Patrick Mours (OptiX backend)
* Christophe Hery (subsurface scattering anisotropy)
* William Leeson (PMJ sampling pattern)
* Alaska (various fixes and tweaks)
* Thomas Dinges (various fixes)
For the full commit history, see the cycles-x branch. This squashes together all the changes since intermediate changes would often fail building or tests.
Ref T87839, T87837, T87836
Fixes T90734, T89353, T80267, T77185, T69800
2021-07-28 | Cycles: remove WITH_CYCLES_DEBUG, add WITH_CYCLES_DEBUG_NAN | Brecht Van Lommel
WITH_CYCLES_DEBUG was used for rendering BVH debugging passes. But since we mainly use Embree and OptiX now, this information is no longer important. WITH_CYCLES_DEBUG_NAN will enable additional checks for NaNs and invalid values in the kernel, for Cycles developers. Previously these asserts were enabled in all debug builds, but this is too likely to crash Blender in scenes that render fine regardless of the NaNs. So this is behind a CMake option now. Fixes T90240
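Gating NaN checks behind a build option, as the WITH_CYCLES_DEBUG_NAN commit describes, usually comes down to a macro that compiles to nothing unless the option is set. The sketch below assumes the CMake option is passed to the compiler as a preprocessor define of the same name; the macro `DEBUG_ASSERT_FINITE` and the function `average_checked` are hypothetical and not taken from the Cycles kernel.

```cpp
#include <cassert>
#include <cmath>

// Assumption: the build system defines WITH_CYCLES_DEBUG_NAN when the CMake option is on.
#ifdef WITH_CYCLES_DEBUG_NAN
#  define DEBUG_ASSERT_FINITE(x) assert(std::isfinite(x))
#else
#  define DEBUG_ASSERT_FINITE(x) ((void)0)
#endif

// Hypothetical kernel-style function: the checks cost nothing unless the option is enabled.
float average_checked(float r, float g, float b)
{
  DEBUG_ASSERT_FINITE(r);
  DEBUG_ASSERT_FINITE(g);
  DEBUG_ASSERT_FINITE(b);
  return (r + g + b) / 3.0f;
}
```

With the option off, a NaN input simply propagates to the result instead of aborting, which matches the commit's goal of not crashing on scenes that render acceptably despite NaNs.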
2021-07-26 | Cycles: upgrade CUDA to 11.4 | Brecht Van Lommel
This fixes a performance regression on Ampere cards, on specific scenes like classroom. For cycles-x there is little difference, but this is still helpful for LTS releases, and we need to upgrade at some point anyway.
2021-03-11 | Cycles: Change device-only memory to actually only allocate on the device | Patrick Mours
This patch changes the `MEM_DEVICE_ONLY` type to only allocate on the device and to fail if that is no longer possible because the device is out of memory (since OptiX acceleration structures may not be allocated in host memory). It also fixes high peak memory usage during OptiX acceleration structure building. Reviewed By: brecht Maniphest Tasks: T85985 Differential Revision: https://developer.blender.org/D10535
2021-02-05 | Cleanup: correct spelling in comments | Campbell Barton
2021-01-27 | Fix T85089: Crash when rendering scene that does not fit into GPU memory with CUDA/OptiX | James Horsley
The "cuda_mem_map_mutex" was potentially being locked recursively during the call to "CUDADevice::move_textures_to_host", which crashed. This moves around the locking and unlocking of "cuda_mem_map_mutex", so that it doesn't call a function that locks it while still holding the lock. Reviewed By: pmoursnv Maniphest Tasks: T85089, T84734 Differential Revision: https://developer.blender.org/D10219
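The locking pattern behind the T85089 fix can be sketched as follows. Locking a `std::mutex` that the same thread already holds is undefined behavior, so the caller has to release the lock before calling a function that takes it. The `Device` class here is a hypothetical, heavily simplified model (sizes only, no real allocations), not the Cycles `CUDADevice`.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical device with a non-recursive mutex guarding its allocation bookkeeping.
class Device {
 public:
  explicit Device(std::size_t capacity) : free_bytes_(capacity) {}

  // Takes the mutex itself, so callers must not hold it when calling this.
  void move_textures_to_host(std::size_t bytes_needed)
  {
    std::lock_guard<std::mutex> lock(map_mutex_);
    while (!on_device_.empty() && free_bytes_ < bytes_needed) {
      free_bytes_ += on_device_.back();
      on_host_.push_back(on_device_.back());
      on_device_.pop_back();
    }
  }

  bool alloc(std::size_t bytes)
  {
    assert(bytes > 0);
    bool need_space = false;
    {
      std::lock_guard<std::mutex> lock(map_mutex_);
      need_space = free_bytes_ < bytes;
    }  // Lock released here, before calling a function that locks again.

    if (need_space) {
      move_textures_to_host(bytes);
    }

    std::lock_guard<std::mutex> lock(map_mutex_);
    if (free_bytes_ < bytes) {
      return false;
    }
    free_bytes_ -= bytes;
    on_device_.push_back(bytes);
    return true;
  }

  std::size_t host_count()
  {
    std::lock_guard<std::mutex> lock(map_mutex_);
    return on_host_.size();
  }

 private:
  std::mutex map_mutex_;
  std::size_t free_bytes_;
  std::vector<std::size_t> on_device_;
  std::vector<std::size_t> on_host_;
};
```

The sketch rechecks the free space after reacquiring the lock, since another thread could have changed it in between.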
2021-01-11 | Fix T82351: Cycles: Tile stealing glitches with adaptive sampling | Lukas Stockner
In my testing this works, but it requires me to remove the min(start_sample...) part in the adaptive sampling kernel, and I assume there's a reason why it was there? Reviewed By: brecht Maniphest Tasks: T82351 Differential Revision: https://developer.blender.org/D9445
2020-12-15 | Cleanup: spelling | Campbell Barton
2020-12-11 | Cycles: Add CPU+GPU rendering support with OptiX | Patrick Mours
Adds support for building multiple BVH types in order to support using both CPU and OptiX devices for rendering simultaneously. Primitive packing for Embree and OptiX is now standalone, so it only needs to be run once and can be shared between the two. Additionally, BVH building was made a device call, so that each device backend can decide how to perform the building. The multi-device for instance creates a special multi-BVH that holds references to several sub-BVHs, one for each sub-device. Reviewed By: brecht, kevindietrich Differential Revision: https://developer.blender.org/D9718
2020-10-05 | Cycles: Add NanoVDB support for rendering volumes | Patrick Mours
NanoVDB is a platform-independent sparse volume data structure that makes it possible to use OpenVDB volumes on the GPU. This patch uses it for volume rendering in Cycles, replacing the previous usage of dense 3D textures. Since it has a big impact on memory usage and performance and changes the OpenVDB branch used for the rest of Blender as well, this is not enabled by default yet; that will happen only after 2.82 has been branched off. To enable it, build both dependencies and Blender itself with the "WITH_NANOVDB" CMake option. Reviewed By: brecht Differential Revision: https://developer.blender.org/D8794
2020-07-20 | Cycles: Use pre-compiled PTX kernel for older generation when no matching one is found | Patrick Mours
This patch changes the discovery of pre-compiled kernels, to look for any PTX, even if it does not match the current architecture version exactly. It works because the driver can JIT-compile PTX generated for architectures less than or equal to the current one. This e.g. makes it possible to render on a new GPU architecture even if no pre-compiled binary kernel was distributed for it as part of the Blender installation. Reviewed By: brecht Differential Revision: https://developer.blender.org/D8332
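The fallback rule from the PTX discovery commit (use PTX built for any architecture less than or equal to the current one) can be sketched as a simple selection function. Two assumptions are made here for illustration: architectures are encoded as integers such as 75 for compute capability 7.5, and the newest compatible PTX is preferred. The function name is hypothetical and the commit message does not state the preference order.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: return the highest available PTX architecture that does not
// exceed the device architecture, or -1 when nothing compatible was shipped.
int pick_ptx_arch(const std::vector<int> &available_archs, int device_arch)
{
  assert(device_arch > 0);
  int best = -1;
  for (int arch : available_archs) {
    if (arch <= device_arch && arch > best) {
      best = arch;
    }
  }
  return best;
}
```

For example, a device of architecture 86 with PTX shipped for 50, 61 and 75 would pick 75 and rely on the driver to JIT-compile it.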
2020-07-10 | Cleanup: reduce hardcoded numbers in denoising neighbor tiles code | Brecht Van Lommel
2020-06-22 | Cleanup: remove task pool stop() and finished() | Brecht Van Lommel
2020-06-22 | Cleanup: use lambdas instead of functors for task pools, remove threadid | Brecht Van Lommel
2020-06-22 | Cleanup: minor refactoring around DeviceTask | Brecht Van Lommel
2020-06-17 | Cleanup: fix compiler warnings | Brecht Van Lommel
2020-06-12 | Cycles: Improve CUDA and OptiX error reporting in the viewport | Patrick Mours
This patch makes the infamous "Cancel" error in the viewport a thing of the past. Instead it now shows a more useful error message and streamlines the error handling process in CUDA. Reviewed By: brecht Differential Revision: https://developer.blender.org/D8008
2020-06-08 | Cycles: Add support for P2P memory distribution (e.g. via NVLink) | Patrick Mours
This change modifies the multi-device implementation to support memory distribution across devices, to reduce the overall memory footprint of large scenes and allow scenes to fit entirely into combined GPU memory that previously had to fall back to host memory. Reviewed By: brecht Differential Revision: https://developer.blender.org/D7426
2020-05-15 | Cycles: code refactor to bake using regular render session and tiles | Brecht Van Lommel
There should be no user visible change from this, except that tile size now affects performance. The goal here is to simplify bake denoising in D3099, letting it reuse more denoising tiles and pass code. A lot of code is now shared with regular rendering, with the two main differences being that we read some render result passes from the bake API when starting to render a tile, and call the bake kernel instead of the path trace kernel. With this kind of design where Cycles asks for tiles from the bake API, it should eventually be easier to reduce memory usage, show tiles as they are baked, or bake multiple passes at once, though there's still quite some work needed for that. Reviewers: #cycles Subscribers: monio, wmatyjewicz, lukasstockner97, michaelknubben Differential Revision: https://developer.blender.org/D3108
2020-05-05 | Cycles: mark CUDA 10.2 as officially supported | Brecht Van Lommel
It appears to work fine after a recent bugfix and testing for the past few weeks.
2020-03-19 | Cleanup: `make format` after SortedIncludes change | Dalai Felinto
2020-03-12 | Cleanup: add device_texture for images, distinct from other global memory | Brecht Van Lommel
There was too much image texture specific stuff in device_memory, and too much code duplication between devices.
2020-03-11 | Cleanup: stop encoding image data type in slot index | Brecht Van Lommel
This is legacy code from when we had a fixed number of textures.
2020-03-06 | Cleanup: spelling | Campbell Barton
2020-03-05 | Adaptive Sampling for Cycles. | Stefan Werner
This feature takes some inspiration from "RenderMan: An Advanced Path Tracing Architecture for Movie Rendering" and "A Hierarchical Automatic Stopping Condition for Monte Carlo Global Illumination".
The basic principle is as follows: While samples are being added to a pixel, the adaptive sampler writes half of the samples to a separate buffer. This gives it two separate estimates of the same pixel, and by comparing their difference it estimates convergence. Once the estimated error drops below a given threshold, the pixel is considered done.
When a pixel has not converged yet and needs more samples than the minimum, its immediate neighbors are also set to take more samples. This is done in order to more reliably detect sharp features such as caustics. A 3x3 box filter that is run periodically over the tile buffer is used for that purpose.
After a tile has finished rendering, the values of all passes are scaled as if they were rendered with the full number of samples. This way, any code operating on these buffers, for example the denoiser, does not need to be changed for per-pixel sample counts.
Reviewed By: brecht, #cycles Differential Revision: https://developer.blender.org/D4686
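The two-estimate idea in the adaptive sampling commit can be illustrated for a single scalar pixel value. This is a minimal sketch under stated assumptions: the struct and function names are hypothetical, the error metric is a plain relative difference rather than the formula Cycles actually uses, and the neighbor filter and final pass scaling are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical per-pixel accumulator: one sum over all samples, one over every other sample.
struct PixelEstimate {
  double sum_all = 0.0;
  double sum_half = 0.0;
  int num_samples = 0;
};

void add_sample(PixelEstimate &pixel, double value)
{
  pixel.sum_all += value;
  // Every second sample also goes to the separate "half" buffer.
  if (pixel.num_samples % 2 == 1) {
    pixel.sum_half += value;
  }
  pixel.num_samples++;
}

// Assumed error metric: relative difference between the two estimates of the pixel.
double estimate_error(const PixelEstimate &pixel)
{
  const int num_half = pixel.num_samples / 2;
  if (num_half == 0) {
    return std::numeric_limits<double>::infinity();
  }
  const double mean_all = pixel.sum_all / pixel.num_samples;
  const double mean_half = pixel.sum_half / num_half;
  return std::fabs(mean_all - mean_half) / std::max(std::fabs(mean_all), 1e-4);
}

// A pixel is done once it has the minimum sample count and its error is below the threshold.
bool is_converged(const PixelEstimate &pixel, double threshold, int min_samples)
{
  assert(threshold > 0.0);
  return pixel.num_samples >= min_samples && estimate_error(pixel) < threshold;
}
```

A pixel receiving constant samples converges immediately, while one alternating between 0 and 2 keeps a relative error of 1 and continues sampling.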
2020-02-28 | Cycles: Rework tile scheduling for denoising | Patrick Mours
This fixes denoising being delayed until after all rendering has finished. Instead, tile-based denoising is now part of the "RENDER" task again, so that it is all in one task and does not cause issues with dedicated task pools where tasks are serialized. Reviewed By: brecht Differential Revision: https://developer.blender.org/D6940
2020-02-25 | Cleanup: Remove superfluous "cuda_device_ptr" function | Patrick Mours
2020-02-19 | Cleanup: `make format` | Dalai Felinto
2020-02-17 | Cycles: Add support for adaptive kernel compilation to OptiX device | Patrick Mours
This modifies the common CUDA implementation for adaptive kernel compilation slightly to support both CUBIN and PTX output (the latter of which is then used in the OptiX device). It also fixes adaptive kernel compilation on Windows. Reviewed By: brecht Differential Revision: https://developer.blender.org/D6851
2020-02-12 | Fix Cycles build errors and clang-format after recent commit | Brecht Van Lommel
2020-02-12 | Cleanup: Move common CUDA/OptiX Cycles device code into separate file | Patrick Mours
This reduces code duplication between the CUDA and OptiX device implementations: The CUDA device class is now split into declaration and definition (similar to the OpenCL device) and the OptiX device class implements that and only overrides the functions it actually has to change, while using the CUDA implementation for everything else. Reviewed By: brecht Differential Revision: https://developer.blender.org/D6814