Welcome to mirror list, hosted at ThFree Co, Russian Federation.

git.blender.org/blender.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-06-24Cleanup: compiler warningsBrecht Van Lommel
2020-06-22Cycles: port curve-ray intersection from Embree for use in Cycles GPUBrecht Van Lommel
This keeps render results compatible for combined CPU + GPU rendering. Peformance and quality primitives is quite different than before. There are now two options: * Rounded Ribbon: render hair as flat ribbon with (fake) rounded normals, for fast rendering. Hair curves are subdivided with a fixed number of user specified subdivisions. This gives relatively good results, especially when used with the Principled Hair BSDF and hair viewed from a typical distance. There are artifacts when viewed closed up, though this was also the case with all previous primitives (but different ones). * 3D Curve: render hair as 3D curve, for accurate results when viewing hair close up. This automatically subdivides the curve until it is smooth. This gives higher quality than any of the previous primitives, but does come at a performance cost and is somewhat slower than our previous Thick curves. The main problem here is performance. For CPU and OpenCL rendering performance seems usually quite close or better for similar quality results. However for CUDA and Optix, performance of 3D curve intersection is problematic, with e.g. 1.45x longer render time in Koro (though there is no equivalent quality and rounded ribbons seem fine for that scene). Any help or ideas to optimize this are welcome. Ref T73778 Depends on D8012 Maniphest Tasks: T73778 Differential Revision: https://developer.blender.org/D8013
2020-06-22Cycles: remove support for rendering hair as triangle and linesBrecht Van Lommel
Triangles were very memory intensive. The only reason they were not removed yet is that they gave more accurate results, but there will be an accurate 3D curve primitive added for this. Line rendering was always poor quality since the ends do not match up. To keep CPU and GPU compatibility we just remove them entirely. They could be brought back if an Embree compatible implementation is added, but it's not clear to me that there is a use case for these that we'd consider important. Ref T73778 Reviewers: #cycles Subscribers:
2020-05-27Merge branch 'blender-v2.83-release'Patrick Mours
2020-05-27Fix T76947: Optix realtime denoiser progressively reduces brightness of very ↵Patrick Mours
bright objects The input data to the OptiX denoiser was clamped to 0..10000 as required, but it could easily exceed that range with a high number of samples (since the data contains the overall sum). To fix that, divide by the number of samples first and multiply it back in after the denoiser ran.
2020-05-15Cycles: code refactor to bake using regular render session and tilesBrecht Van Lommel
There should be no user visible change from this, except that tile size now affects performance. The goal here is to simplify bake denoising in D3099, letting it reuse more denoising tiles and pass code. A lot of code is now shared with regular rendering, with the two main differences being that we read some render result passes from the bake API when starting to render a tile, and call the bake kernel instead of the path trace kernel. With this kind of design where Cycles asks for tiles from the bake API, it should eventually be easier to reduce memory usage, show tiles as they are baked, or bake multiple passes at once, though there's still quite some work needed for that. Reviewers: #cycles Subscribers: monio, wmatyjewicz, lukasstockner97, michaelknubben Differential Revision: https://developer.blender.org/D3108
2020-03-18Cycles: support for different 3D transform per volume gridBrecht Van Lommel
This is not yet fully supported by automatic volume bounds but works fine in most cases that will have mostly matching bounds. Ref T73201
2020-03-12Cleanup: add device_texture for images, distinct from other global memoryBrecht Van Lommel
There was too much image texture specific stuff in device_memory, and too much code duplication between devices.
2020-03-11Cleanup: stop encoding image data type in slot indexBrecht Van Lommel
This is legacy code from when we had a fixed number of textures.
2020-03-06Cleanup: tweak Cycles #includes in preparation for clang-format sortingBrecht Van Lommel
2020-03-05Adaptive Sampling for Cycles.Stefan Werner
This feature takes some inspiration from "RenderMan: An Advanced Path Tracing Architecture for Movie Rendering" and "A Hierarchical Automatic Stopping Condition for Monte Carlo Global Illumination" The basic principle is as follows: While samples are being added to a pixel, the adaptive sampler writes half of the samples to a separate buffer. This gives it two separate estimates of the same pixel, and by comparing their difference it estimates convergence. Once convergence drops below a given threshold, the pixel is considered done. When a pixel has not converged yet and needs more samples than the minimum, its immediate neighbors are also set to take more samples. This is done in order to more reliably detect sharp features such as caustics. A 3x3 box filter that is run periodically over the tile buffer is used for that purpose. After a tile has finished rendering, the values of all passes are scaled as if they were rendered with the full number of samples. This way, any code operating on these buffers, for example the denoiser, does not need to be changed for per-pixel sample counts. Reviewed By: brecht, #cycles Differential Revision: https://developer.blender.org/D4686
2020-02-11Cycles: Add support for denoising in the viewportPatrick Mours
The OptiX denoiser can be a great help when rendering in the viewport, since it is really fast and needs few samples to produce convincing results. This patch therefore adds support for using any Cycles denoiser in the viewport also (but only the OptiX one is selectable because the NLM one is too slow to be usable currently). It also adds support for denoising on a different device than rendering (so one can e.g. render with the CPU but denoise with OptiX). Reviewed By: #cycles, brecht Differential Revision: https://developer.blender.org/D6554
2020-01-08Cycles: Add OptiX AI denoiser supportPatrick Mours
This patch adds support for the OptiX denoiser as an alternative to the existing NLM denoiser in Cycles. It's re-using the same denoising architecture based on tiles and therefore implicitly also works with multiple GPUs. Reviewed By: sergey Differential Revision: https://developer.blender.org/D6395
2019-11-22Fix T71255: Particle hair not showing in viewport with OptiX after scalingPatrick Mours
The OptiX intersection program for curves uses "optixGetObjectRayDirection" to get the ray direction in object space (which was inverse transformed with the current transformation matrix). OptiX does no additional operations on it, so if there is a scaling transform, the direction is not normalized. But the curve intersection routine expects that. In addition, the distances used in "optixGetRayTmax()" and "optixReportIntersection()" are in world space, so need to adjust them accordingly.
2019-09-13Cycles: add Optix support in the kernelPatrick Mours
This adds all the kernel side changes for the Optix backend. Ref D5363
2019-08-16Fix T55054: possible use of unsupported instructions in Cycles texture codeLazydodo
Differential Revision: https://developer.blender.org/D5326
2019-05-01Cleanup: comments (long lines) in cyclesCampbell Barton
2019-04-17ClangFormat: apply to source, most of internCampbell Barton
Apply clang format as proposed in T53211. For details on usage and instructions for migrating branches without conflicts, see: https://wiki.blender.org/wiki/Tools/ClangFormat
2019-03-08Cycles OpenCL: Remove single programJeroen Bakker
Part of the cleanup of the OpenCL codebase. Single program is not effective when using OpenCL, it is slower to compile and slower during rendering (when used in for example `barbershop` or `victor`). Reviewers: brecht, #cycles Maniphest Tasks: T62267 Differential Revision: https://developer.blender.org/D4481
2019-02-26T61576: Do Not (Re-)Compile OpenCL kernelsJeroen Bakker
The goal of this patch is to have limit the number of times kernels needs to be compiled and are reused as kernels with different compile directives can lead to identical same binaries. The implementation does this by stripping the compile directives. and reshuffling kernels so the output is more likely to be the same. We focussed on the kernels where it was easy to detect and maintain (bundle, bake, displace, do_volume and background). More optimizations could be done but they are probably less obvious. Merged the data_init and state_buffer_size kernels to split_bundle. This patch will also remove empty kernels for do_volume and bake when their features are not enabled. When using the benchmark files there are less background, bake and do_volume kernels compiled. Fix: T61576, T61501, T61466 Reviewed By: brecht, #cycles Differential Revision: https://developer.blender.org/D4390
2019-02-21Fix: Missing closing brackets in includeJeroen Bakker
2019-02-21Fix: OpenCL Displacement and light samplingJeroen Bakker
The bake kernels are also used during mesh displacement and light importance sampling. We disabled the implementation of these kernels when baking was not enabled.
2019-02-20Cycles OpenCL: Remove OpenCL MegaKernelJeroen Bakker
Using OpenCL MegaKernel has been slow and therefore not usefull. This patch will remove the mega kernel from the OpenCL codebase and the OpenCLDeviceBase class. T61736: removal of mega kernel T61703: baking does not work with mega kernel Tags: #cycles Differential Revision: https://developer.blender.org/D4383
2019-02-19T61463: Separate Baking kernelsJeroen Bakker
Cycles OpenCL: Split baking kernels in own program Fix T61463. Before this patch baking was part of the base kernels. There are 3 baking kernels that and all 3 uses shader evaluation. Only for one of these kernels the functionality was wrapped in the __NO_BAKING__ compile directive. When you start baking this leads to long compile times. By separating in individual programs will reduce the compile times. Also wrapped all baking kernels with __NO_BAKING__ to reduce the compilation times. Impact on compilation time job | scene_name | previous | new | percentage --------+-----------------+----------+-------+------------ T61463 | empty | 10.63 | 7.27 | 32% T61463 | bmw | 17.91 | 14.24 | 20% T61463 | fishycat | 19.57 | 15.08 | 23% T61463 | barbershop | 54.10 | 48.18 | 11% T61463 | classroom | 17.55 | 14.42 | 18% T61463 | koro | 18.92 | 17.15 | 9% T61463 | pavillion | 17.43 | 14.23 | 18% T61463 | splash279 | 16.48 | 15.33 | 7% T61463 | volume_emission | 36.22 | 34.19 | 6% Impact on render time job | scene_name | previous | new | percentage --------+-----------------+----------+---------+------------ T61463 | empty | 21.06 | 20.54 | 2% T61463 | bmw | 198.44 | 189.59 | 4% T61463 | fishycat | 394.20 | 388.50 | 1% T61463 | barbershop | 1188.16 | 1185.49 | 0% T61463 | classroom | 341.08 | 339.27 | 1% T61463 | koro | 472.43 | 360.70 | 24% T61463 | pavillion | 905.77 | 902.14 | 0% T61463 | splash279 | 55.26 | 54.92 | 1% T61463 | volume_emission | 62.59 | 39.09 | 38% I don't have a grounded explanation why koro and volume_emission is this much faster; I have done several tests though... Maniphest Tasks: T61463 Differential Revision: https://developer.blender.org/D4376
2019-02-15Cycles: Support multithreaded compilation of kernelsBrecht Van Lommel
This patch implements a workaround to get the multithreaded compilation from D2231 working. So far, it only works for Blender, not for Cycles Standalone. Also, I have only tested the Linux codepath in the helper function. Depends on D2231. Patch by lukasstockner97, jbakker, brecht job | scene_name | compilation_time ----------+-----------------+------------------ Baseline | empty | 22.73 D2264 | empty | 13.94 Baseline | bmw | 56.44 D2264 | bmw | 41.32 Baseline | fishycat | 59.50 D2264 | fishycat | 45.19 Baseline | barbershop | 212.28 D2264 | barbershop | 169.81 Baseline | victor | 67.51 D2264 | victor | 53.60 Baseline | classroom | 51.46 D2264 | classroom | 39.02 Baseline | koro | 62.48 D2264 | koro | 49.03 Baseline | pavillion | 54.37 D2264 | pavillion | 38.82 Baseline | splash279 | 47.43 D2264 | splash279 | 37.94 Baseline | volume_emission | 145.22 D2264 | volume_emission | 121.10 This patch reduced compilation time as the split kernels and base kernels are compiled in parallel. In cycles debug mode (256) you can set unmark the opencl single program file, what reduces the compilation time even further (bmw 17 seconds, barbershop 53 seconds). Reviewers: brecht, dingto, sergey, juicyfruit, lukasstockner97 Reviewed By: brecht Subscribers: Loner, jbakker, candreacchio, 3dLuver, LazyDodo, bliblubli Differential Revision: https://developer.blender.org/D2264
2019-02-06Cycles: animation denoising support in the kernel.Lukas Stockner
This is the internal implementation, not available from the API or interface yet. The algorithm takes into account past and future frames, both to get more coherent animation and reduce noise. Ref D3889.
2019-02-06Cycles: prefilter feature passes separate from denoising.Lukas Stockner
Prefiltering of feature passes will happen during rendering, which can then be used for denoising immediately or written as a render pass for later (animation) denoising. The number of denoising data passes written is reduced because of this, leaving out the feature variance passes. The passes are now Normal, Albedo, Depth, Shadowing, Variance and Intensity. Ref D3889.
2018-12-04Cycles: add initial CUDA 10.0 support, but only recommend use for Turing cards.Brecht Van Lommel
There may still be rendering errors when used for older graphics cards.
2018-11-09Cycles: Cleanup, space after (void)Sergey Sharybin
It was used in like 95% of places.
2018-11-09Cycles: Cleanup, spacing after preprocessorSergey Sharybin
It is supposed to be two spaces before comment stating which if else/endif statements corresponds to. Was mainly violated in the header guards.
2018-10-28Cycles: Added Cryptomatte output.Stefan Werner
This allows for extra output passes that encode automatic object and material masks for the entire scene. It is an implementation of the Cryptomatte standard as introduced by Psyop. A good future extension would be to add a manifest to the export and to do plenty of testing to ensure that it is fully compatible with other renderers and compositing programs that use Cryptomatte. Internally, it adds the ability for Cycles to have several passes of the same type that are distinguished by their name. Differential Revision: https://developer.blender.org/D3538
2018-10-08Cycles: Use existing shared temporary memory in reconstruction step of the ↵Lukas Stockner
denoiser Previously the code allocated its own temporary memory, but it's possible to just use the existing shared one instead.
2018-10-06Cycles: Implement vectorized NLM kernels for faster CPU denoisingLukas Stockner
2018-07-06Cycles: Enabled half precision textures for OpenCL devices that support the ↵Stefan Werner
cl_khr_fp16 extension.
2018-07-06Cleanup: strip trailing space for cyclesCampbell Barton
2018-07-05Cycles: Adding native support for UINT16 textures.Stefan Werner
Textures in 16 bit integer format are sometimes used for displacement, bump and normal maps and can be exported by tools like Substance Painter. Without this patch, Cycles would promote those textures to single precision floating point, causing them to take up twice as much memory as needed. Reviewers: #cycles, brecht, sergey Reviewed By: #cycles, brecht, sergey Subscribers: sergey, dingto, #cycles Tags: #cycles Differential Revision: https://developer.blender.org/D3523
2018-07-04Cycles Denoising: Pass tile buffers to every OpenCL kernel to conform to ↵Lukas Stockner
standard and get rid of set_tile_info
2018-07-04Cycles Denoising: Cleanup: Rename tiles to tile_infoLukas Stockner
2018-06-14Cycles: Query XYZ to/from Scene Linear conversion from OCIO instead of ↵Lukas Stockner
assuming sRGB I've limited it to just the RGB<->XYZ stuff for now, correct image handling is the next step. Reviewers: brecht, sergey Differential Revision: https://developer.blender.org/D3478
2018-05-28Windows: Add support for building with clang.Ray Molenkamp
This commit contains the minimum to make clang build/work with blender, asan and ninja build support is forthcoming Things to note: 1) Builds and runs, and is able to pass all tests (except for the freestyle_stroke_material.blend test which was broken at that time for all platforms by the looks of it) 2) It's slightly faster than msvc when using cycles. (time in seconds, on an i7-3370) victor_cpu msvc:3099.51 clang:2796.43 pavillon_barcelona_cpu msvc:1872.05 clang:1827.72 koro_cpu msvc:1097.58 clang:1006.51 fishy_cat_cpu msvc:815.37 clang:722.2 classroom_cpu msvc:1705.39 clang:1575.43 bmw27_cpu msvc:552.38 clang:561.53 barbershop_interior_cpu msvc:2134.93 clang:1922.33 3) clang on windows uses a drop in replacement for the Microsoft cl.exe (takes some of the Microsoft parameters, but not all, and takes some of the clang parameters but not all) and uses ms headers + libraries + linker, so you still need visual studio installed and will use our existing vc14 svn libs. 4) X64 only currently, X86 builds but crashes on startup. 5) Tested with llvm/clang 6.0.0 6) Requires visual studio integration, available at https://github.com/LazyDodo/llvm-vs2017-integration 7) The Microsoft compiler spawns a few copies of cl in parallel to get faster build times, clang doesn't, so the build time is 3-4x slower than with msvc. 8) No openmp support yet. Have not looked at this much, the binary distribution of clang doesn't seem to include it on windows. 9) No ASAN support yet, some of the sanitizers can be made to work, but it was decided to leave support out of this commit. Reviewers: campbellbarton Differential Revision: https://developer.blender.org/D3304
2018-02-18Code cleanup: remove some more unused code after recent CUDA changes.Brecht Van Lommel
2018-02-18Cycles: Remove Fermi texture code.Thomas Dinges
This should be the last Fermi removal commit, unless I missed something. It's been a pleasure Fermi!
2018-02-18Cycles: Remove fermi related defines from the code.Thomas Dinges
Did not touch Texture related defines, that comes next.
2018-01-19Cycles: Remove util_debug include from kernel codeSergey Sharybin
Not sure why it was in there, all the debug flags stuff is to be handled outside of kernel.
2017-12-08Cycles: Fix difference in image Clip extension method between CPU and GPUSergey Sharybin
Our own implementation was behaving different comparing to OSL and GPU, namely on the border pixels OSL and CUDA was doing interpolation with black, but we were clamping coordinate. This partially fixes issue reported in T53452. Similar change should also be done for 3D interpolation perhaps, but this is to be investigated separately.
2017-12-08Cycles: Cleanup, split 2D interpolation functionSergey Sharybin
2017-11-30Cycles: Improve denoising speed on GPUs with small tile sizesLukas Stockner
Previously, the NLM kernels would be launched once per offset with one thread per pixel. However, with the smaller tile sizes that are now feasible, there wasn't enough work to fully occupy GPUs which results in a significant slowdown. Therefore, the kernels are now launched in a single call that handles all offsets at once. This has two downsides: Memory accesses to accumulating buffers are now atomic, and more importantly, the temporary memory now has to be allocated for every shift at once, increasing the required memory. On the other hand, of course, the smaller tiles significantly reduce the size of the memory. The main bottleneck right now is the construction of the transformation - there is nothing to be parallelized there, one thread per pixel is the maximum. I tried to parallelize the SVD implementation by storing the matrix in shared memory and launching one block per pixel, but that wasn't really going anywhere. To make the new code somewhat readable, the handling of rectangular regions was cleaned up a bit and commented, it should be easier to understand what's going on now. Also, some variables have been renamed to make the difference between buffer width and stride more apparent, in addition to some general style cleanup.
2017-11-21Cycles: Fixed compilation of CUDA kernels. Follow-up fix for my last commit.Stefan Werner
2017-11-21Cycles: Workaround for performance loss with the CUDA 9.0 SDK.Stefan Werner
CUDA 9.0.176 apparently caused some slow down on high-end Pascal cards that can be mitigated by increasing the number of registers. See https://developer.blender.org/F1142667 for a detailed comparison.
2017-11-05Code refactor: device memory cleanups, preparing for mapped host memory.Brecht Van Lommel