Welcome to mirror list, hosted at ThFree Co, Russian Federation.

git.blender.org/blender.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-07-28Cycles: remove WITH_CYCLES_DEBUG, add WITH_CYCLES_DEBUG_NANBrecht Van Lommel
WITH_CYCLES_DEBUG was used for rendering BVH debugging passes. But since we mainly use Embree an OptiX now, this information is no longer important. WITH_CYCLES_DEBUG_NAN will enable additional checks for NaNs and invalid values in the kernel, for Cycles developers. Previously these asserts where enabled in all debug builds, but this is too likely to crash Blender in scenes that render fine regardless of the NaNs. So this is behind a CMake option now. Fixes T90240
2021-07-26Cycles: upgrade CUDA to 11.4Brecht Van Lommel
This fixes a performance regression on Ampere cards, on specific scenes like classroom. For cycles-x there is little difference, but this is still helpful for LTS releases, and we need to upgrade at some point anyway.
2021-03-11Cycles: Change device-only memory to actually only allocate on the devicePatrick Mours
This patch changes the `MEM_DEVICE_ONLY` type to only allocate on the device and fail if that is not possible anymore because out-of-memory (since OptiX acceleration structures may not be allocated in host memory). It also fixes high peak memory usage during OptiX acceleration structure building. Reviewed By: brecht Maniphest Tasks: T85985 Differential Revision: https://developer.blender.org/D10535
2021-02-05Cleanup: correct spelling in commentsCampbell Barton
2021-01-27Fix T85089: Crash when rendering scene that does not fit into GPU memory ↵James Horsley
with CUDA/OptiX The "cuda_mem_map_mutex" was potentially being locked recursively during the call to "CUDADevice::move_textures_to_host", which crashed. This moves around the locking and unlocking of "cuda_mem_map_mutex", so that it doesn't call a function that locks it while still holding the lock. Reviewed By: pmoursnv Maniphest Tasks: T85089, T84734 Differential Revision: https://developer.blender.org/D10219
2021-01-11Fix T82351: Cycles: Tile stealing glitches with adaptive samplingLukas Stockner
In my testing this works, but it requires me to remove the min(start_sample...) part in the adaptive sampling kernel, and I assume there's a reason why it was there? Reviewed By: brecht Maniphest Tasks: T82351 Differential Revision: https://developer.blender.org/D9445
2020-12-15Cleanup: spellingCampbell Barton
2020-12-11Cycles: Add CPU+GPU rendering support with OptiXPatrick Mours
Adds support for building multiple BVH types in order to support using both CPU and OptiX devices for rendering simultaneously. Primitive packing for Embree and OptiX is now standalone, so it only needs to be run once and can be shared between the two. Additionally, BVH building was made a device call, so that each device backend can decide how to perform the building. The multi-device for instance creates a special multi-BVH that holds references to several sub-BVHs, one for each sub-device. Reviewed By: brecht, kevindietrich Differential Revision: https://developer.blender.org/D9718
2020-10-05Cycles: Add NanoVDB support for rendering volumesPatrick Mours
NanoVDB is a platform-independent sparse volume data structure that makes it possible to use OpenVDB volumes on the GPU. This patch uses it for volume rendering in Cycles, replacing the previous usage of dense 3D textures. Since it has a big impact on memory usage and performance and changes the OpenVDB branch used for the rest of Blender as well, this is not enabled by default yet, which will happen only after 2.82 was branched off. To enable it, build both dependencies and Blender itself with the "WITH_NANOVDB" CMake option. Reviewed By: brecht Differential Revision: https://developer.blender.org/D8794
2020-07-20Cycles: Use pre-compiled PTX kernel for older generation when no matching ↵Patrick Mours
one is found This patch changes the discovery of pre-compiled kernels, to look for any PTX, even if it does not match the current architecture version exactly. It works because the driver can JIT-compile PTX generated for architectures less than or equal to the current one. This e.g. makes it possible to render on a new GPU architecture even if no pre-compiled binary kernel was distributed for it as part of the Blender installation. Reviewed By: brecht Differential Revision: https://developer.blender.org/D8332
2020-07-10Cleanup: reduce hardcoded numbers in denoising neighbor tiles codeBrecht Van Lommel
2020-06-22Cleanup: remove task pool stop() and finished()Brecht Van Lommel
2020-06-22Cleanup: use lambdas instead of functors for task pools, remove threadidBrecht Van Lommel
2020-06-22Cleanup: minor refactoring around DeviceTaskBrecht Van Lommel
2020-06-12Cycles: Improve CUDA and OptiX error reporting in the viewportPatrick Mours
This patch makes the infamous "Cancel" error in the viewport a thing of the past. Instead it now shows a more useful error message and streamlines the error handling process in CUDA. Reviewed By: brecht Differential Revision: https://developer.blender.org/D8008
2020-06-08Cycles: Add support for P2P memory distribution (e.g. via NVLink)Patrick Mours
This change modifies the multi-device implementation to support memory distribution across devices, to reduce the overall memory footprint of large scenes and allow scenes to fit entirely into combined GPU memory that previously had to fall back to host memory. Reviewed By: brecht Differential Revision: https://developer.blender.org/D7426
2020-05-15Cycles: code refactor to bake using regular render session and tilesBrecht Van Lommel
There should be no user visible change from this, except that tile size now affects performance. The goal here is to simplify bake denoising in D3099, letting it reuse more denoising tiles and pass code. A lot of code is now shared with regular rendering, with the two main differences being that we read some render result passes from the bake API when starting to render a tile, and call the bake kernel instead of the path trace kernel. With this kind of design where Cycles asks for tiles from the bake API, it should eventually be easier to reduce memory usage, show tiles as they are baked, or bake multiple passes at once, though there's still quite some work needed for that. Reviewers: #cycles Subscribers: monio, wmatyjewicz, lukasstockner97, michaelknubben Differential Revision: https://developer.blender.org/D3108
2020-05-05Cycles: mark CUDA 10.2 as officially supportedBrecht Van Lommel
It appears to work fine after a recent bugfix and testing for the past few weeks.
2020-03-19Cleanup: `make format` after SortedIncludes changeDalai Felinto
2020-03-12Cleanup: add device_texture for images, distinct from other global memoryBrecht Van Lommel
There was too much image texture specific stuff in device_memory, and too much code duplication between devices.
2020-03-11Cleanup: stop encoding image data type in slot indexBrecht Van Lommel
This is legacy code from when we had a fixed number of textures.
2020-03-06Cleanup: spellingCampbell Barton
2020-03-05Adaptive Sampling for Cycles.Stefan Werner
This feature takes some inspiration from "RenderMan: An Advanced Path Tracing Architecture for Movie Rendering" and "A Hierarchical Automatic Stopping Condition for Monte Carlo Global Illumination" The basic principle is as follows: While samples are being added to a pixel, the adaptive sampler writes half of the samples to a separate buffer. This gives it two separate estimates of the same pixel, and by comparing their difference it estimates convergence. Once convergence drops below a given threshold, the pixel is considered done. When a pixel has not converged yet and needs more samples than the minimum, its immediate neighbors are also set to take more samples. This is done in order to more reliably detect sharp features such as caustics. A 3x3 box filter that is run periodically over the tile buffer is used for that purpose. After a tile has finished rendering, the values of all passes are scaled as if they were rendered with the full number of samples. This way, any code operating on these buffers, for example the denoiser, does not need to be changed for per-pixel sample counts. Reviewed By: brecht, #cycles Differential Revision: https://developer.blender.org/D4686
2020-02-28Cycles: Rework tile scheduling for denoisingPatrick Mours
This fixes denoising being delayed until after all rendering has finished. Instead, tile-based denoising is now part of the "RENDER" task again, so that it is all in one task and does not cause issues with dedicated task pools where tasks are serialized. Reviewed By: brecht Differential Revision: https://developer.blender.org/D6940
2020-02-25Cleanup: Remove superfluous "cuda_device_ptr" functionPatrick Mours
2020-02-19Cleanup: `make format`Dalai Felinto
2020-02-17Cycles: Add support for adaptive kernel compilation to OptiX devicePatrick Mours
This modifies the common CUDA implementation for adaptive kernel compilation slightly to support both CUBIN and PTX output (the latter which is then used in the OptiX device). It also fixes adaptive kernel compilation on Windows. Reviewed By: brecht Differential Revision: https://developer.blender.org/D6851
2020-02-12Cleanup: Move common CUDA/OptiX Cycles device code into separate filePatrick Mours
This reduces code duplication between the CUDA and OptiX device implementations: The CUDA device class is now split into declaration and definition (similar to the OpenCL device) and the OptiX device class implements that and only overrides the functions it actually has to change, while using the CUDA implementation for everything else. Reviewed By: brecht Differential Revision: https://developer.blender.org/D6814