git.blender.org/blender.git

Age	Commit message	Author
2020-02-12	Cleanup: Move common CUDA/OptiX Cycles device code into separate file	Patrick Mours
	This reduces code duplication between the CUDA and OptiX device implementations: The CUDA device class is now split into declaration and definition (similar to the OpenCL device) and the OptiX device class implements that and only overrides the functions it actually has to change, while using the CUDA implementation for everything else. Reviewed By: brecht Differential Revision: https://developer.blender.org/D6814
2019-12-11	Cycles/OpenCL: Remove NULL PTR Workaround	Jeroen Bakker
	In the current OpenCL implementation we have a work-around for platforms that didn't support NULL pointers. We used to replace all NULLs and empty arrays with a pointer to a single byte on the OpenCL Device. During investigation of {T65924} it was asked to remove this work-around for testing. This change improves the render times.
	SCENE               | BEFORE | AFTER
	--------------------+--------+-------
	bmw27               | 108    | 89
	barbershop_interior | 867    | 673
	classroom           | 270    | 173
	fishy_cat           | 244    | 196
	koro                | 249    | 207
	pavillon_barcelona  | 582    | 414
	Note that this change does not fix T65924 it just improves the rendering performance for OpenCL. We haven't tested this patch on all platforms so we should keep an eye out on the tracker. Reviewed By: sergey Differential Revision: https://developer.blender.org/D6391
2019-07-07	Cleanup: spelling	Campbell Barton

2019-05-01	Cleanup: comments (long lines) in cycles	Campbell Barton

2019-04-17	ClangFormat: apply to source, most of intern	Campbell Barton
	Apply clang format as proposed in T53211. For details on usage and instructions for migrating branches without conflicts, see: https://wiki.blender.org/wiki/Tools/ClangFormat
2019-03-19	Cleanup: trailing space	Campbell Barton

2019-03-17	Cleanup: remove Cycles advanced shading features toggle.	Brecht Van Lommel
	It's effectively always enabled, only not on some unsupported OpenCL devices. For testing those it's not useful to disable these features. This is replaced by the more fine grained feature toggles that we have now.
2019-03-15	Cycles/OpenCL: Compile Kernels During Scene Update	Jeroen Bakker
	The main goal of this change is faster startup when using foreground rendering. This patch will build kernels in parallel to the update process of the scene. When these optimized kernels are not available (yet) an AO kernel will be used. These AO kernels are fast to compile (3-7 seconds) and can be reused by all scenes. When the final kernels become available we will switch to these kernels. In background mode the AO kernels will not be used. Some kernels are used during Scene update (displace, background light). When these kernels are needed, the process can halt until they become available. Reviewed By: brecht, #cycles Maniphest Tasks: T61752 Differential Revision: https://developer.blender.org/D4428
2019-03-12	Cycles/OpenCL: Reduce How Often Kernel Recompilations Are Needed	Jeroen Bakker
	This patch will reduce the number of times that we need to recompile kernels. It does this by (en/dis)abling features by default, so that when the user needs them the kernels are already available. Other features are enabled by default for background and foreground rendering. When in background rendering the user wants the best render performance. When in foreground rendering the user wants the least amount of recompilations. Enabling volumetrics or subdivision evaluation will still trigger a recompilation during foreground rendering. Reviewed By: #cycles, brecht Differential Revision: https://developer.blender.org/D4485
2019-03-08	Cycles OpenCL: Remove single program	Jeroen Bakker
	Part of the cleanup of the OpenCL codebase. Single program is not effective when using OpenCL, it is slower to compile and slower during rendering (when used in for example `barbershop` or `victor`). Reviewers: brecht, #cycles Maniphest Tasks: T62267 Differential Revision: https://developer.blender.org/D4481
2019-02-26	T61576: Do Not (Re-)Compile OpenCL kernels	Jeroen Bakker
	The goal of this patch is to limit the number of times kernels need to be compiled and to reuse kernels, as kernels with different compile directives can lead to identical binaries. The implementation does this by stripping the compile directives and reshuffling kernels so the output is more likely to be the same. We focussed on the kernels where it was easy to detect and maintain (bundle, bake, displace, do_volume and background). More optimizations could be done but they are probably less obvious. Merged the data_init and state_buffer_size kernels to split_bundle. This patch will also remove empty kernels for do_volume and bake when their features are not enabled. When using the benchmark files there are fewer background, bake and do_volume kernels compiled. Fix: T61576, T61501, T61466 Reviewed By: brecht, #cycles Differential Revision: https://developer.blender.org/D4390
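The idea of stripping compile directives can be illustrated with a small sketch. This is not the code from the patch: the function `kernel_cache_key` and the directive name `__FEATURE_X__` are hypothetical, and the sketch assumes each kernel knows which directives actually affect its source. Directives that a kernel does not depend on are dropped from its cache key, so two requests that differ only in irrelevant directives resolve to the same compiled binary.

```cpp
#include <set>
#include <string>

// Hypothetical helper, not part of Cycles: build a cache key for a kernel
// from only those compile directives that the kernel actually depends on.
std::string kernel_cache_key(const std::string &kernel_name,
                             const std::set<std::string> &requested,
                             const std::set<std::string> &relevant)
{
  std::string key = kernel_name;
  // std::set iterates in sorted order, so the key does not depend on the
  // order in which directives were requested.
  for (const std::string &directive : requested) {
    if (relevant.count(directive) != 0) {
      key += " -D" + directive;
    }
  }
  return key;
}
```

With such a key, a bake kernel requested with and without an unrelated feature toggle would map to one cache entry and be compiled once.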
2019-02-20	Cycles OpenCL: Remove OpenCL MegaKernel	Jeroen Bakker
	Using OpenCL MegaKernel has been slow and therefore not useful. This patch will remove the mega kernel from the OpenCL codebase and the OpenCLDeviceBase class. T61736: removal of mega kernel T61703: baking does not work with mega kernel Tags: #cycles Differential Revision: https://developer.blender.org/D4383
2019-02-19	T61463: Separate Baking kernels	Jeroen Bakker
	Cycles OpenCL: Split baking kernels in own program
	Fix T61463. Before this patch baking was part of the base kernels. There are 3 baking kernels and all 3 use shader evaluation. Only for one of these kernels was the functionality wrapped in the __NO_BAKING__ compile directive. When you start baking this leads to long compile times. Separating them into individual programs reduces the compile times. Also wrapped all baking kernels with __NO_BAKING__ to reduce the compilation times.
	Impact on compilation time
	job    | scene_name      | previous | new   | percentage
	-------+-----------------+----------+-------+------------
	T61463 | empty           | 10.63    | 7.27  | 32%
	T61463 | bmw             | 17.91    | 14.24 | 20%
	T61463 | fishycat        | 19.57    | 15.08 | 23%
	T61463 | barbershop      | 54.10    | 48.18 | 11%
	T61463 | classroom       | 17.55    | 14.42 | 18%
	T61463 | koro            | 18.92    | 17.15 | 9%
	T61463 | pavillion       | 17.43    | 14.23 | 18%
	T61463 | splash279       | 16.48    | 15.33 | 7%
	T61463 | volume_emission | 36.22    | 34.19 | 6%
	Impact on render time
	job    | scene_name      | previous | new     | percentage
	-------+-----------------+----------+---------+------------
	T61463 | empty           | 21.06    | 20.54   | 2%
	T61463 | bmw             | 198.44   | 189.59  | 4%
	T61463 | fishycat        | 394.20   | 388.50  | 1%
	T61463 | barbershop      | 1188.16  | 1185.49 | 0%
	T61463 | classroom       | 341.08   | 339.27  | 1%
	T61463 | koro            | 472.43   | 360.70  | 24%
	T61463 | pavillion       | 905.77   | 902.14  | 0%
	T61463 | splash279       | 55.26    | 54.92   | 1%
	T61463 | volume_emission | 62.59    | 39.09   | 38%
	I don't have a grounded explanation why koro and volume_emission are this much faster; I have done several tests though...
	Maniphest Tasks: T61463 Differential Revision: https://developer.blender.org/D4376
2019-02-15	Cycles: Support multithreaded compilation of kernels	Brecht Van Lommel
	This patch implements a workaround to get the multithreaded compilation from D2231 working. So far, it only works for Blender, not for Cycles Standalone. Also, I have only tested the Linux codepath in the helper function. Depends on D2231. Patch by lukasstockner97, jbakker, brecht
	job      | scene_name      | compilation_time
	---------+-----------------+------------------
	Baseline | empty           | 22.73
	D2264    | empty           | 13.94
	Baseline | bmw             | 56.44
	D2264    | bmw             | 41.32
	Baseline | fishycat        | 59.50
	D2264    | fishycat        | 45.19
	Baseline | barbershop      | 212.28
	D2264    | barbershop      | 169.81
	Baseline | victor          | 67.51
	D2264    | victor          | 53.60
	Baseline | classroom       | 51.46
	D2264    | classroom       | 39.02
	Baseline | koro            | 62.48
	D2264    | koro            | 49.03
	Baseline | pavillion       | 54.37
	D2264    | pavillion       | 38.82
	Baseline | splash279       | 47.43
	D2264    | splash279       | 37.94
	Baseline | volume_emission | 145.22
	D2264    | volume_emission | 121.10
	This patch reduces compilation time because the split kernels and base kernels are compiled in parallel. In cycles debug mode (256) you can unmark the opencl single program file, which reduces the compilation time even further (bmw 17 seconds, barbershop 53 seconds).
	Reviewers: brecht, dingto, sergey, juicyfruit, lukasstockner97 Reviewed By: brecht Subscribers: Loner, jbakker, candreacchio, 3dLuver, LazyDodo, bliblubli Differential Revision: https://developer.blender.org/D2264
2019-02-06	Cycles: animation denoising support in the kernel.	Lukas Stockner
	This is the internal implementation, not available from the API or interface yet. The algorithm takes into account past and future frames, both to get more coherent animation and reduce noise. Ref D3889.
2019-02-06	Cycles: prefilter feature passes separate from denoising.	Lukas Stockner
	Prefiltering of feature passes will happen during rendering, which can then be used for denoising immediately or written as a render pass for later (animation) denoising. The number of denoising data passes written is reduced because of this, leaving out the feature variance passes. The passes are now Normal, Albedo, Depth, Shadowing, Variance and Intensity. Ref D3889.
2018-11-30	Fix T58183: crash with CPU + GPU rendering after profiling changes.	Brecht Van Lommel
	Multi-device was not passing along profiler to the CPU.
2018-11-09	Cycles: Cleanup, space after (void)	Sergey Sharybin
	It was used in like 95% of places.
2018-08-14	Fix T56359: Uninitialized variable in Cycles OpenCL could cause crashes.	Stefan Werner

2018-07-06	Cycles: Enabled half precision textures for OpenCL devices that support the cl_khr_fp16 extension.	Stefan Werner

2018-07-04	Cycles Denoising: Pass tile buffers to every OpenCL kernel to conform to standard and get rid of set_tile_info	Lukas Stockner

2018-07-04	Cycles Denoising: Cleanup: Rename tiles to tile_info	Lukas Stockner

2018-07-04	Cycles Denoising: Refactor denoiser tile handling	Lukas Stockner
	This deduplicates the calls for tile (un)mapping and allows to have a target buffer that is different from the source buffer (needed for baking and animation denoising).
2018-02-06	Fix T54001: AMD OpenCL fails with certain resolutions, after recent changes.	Brecht Van Lommel
	We should actually be using CL_DEVICE_MEM_BASE_ADDR_ALIGN for sub buffers, previous change in this code was incorrect. Renamed the function now to make the specific purpose of this alignment clear, it's not required for data types in general.
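As a rough illustration of what this alignment means in practice: OpenCL reports `CL_DEVICE_MEM_BASE_ADDR_ALIGN` in bits, and a sub-buffer origin has to be a multiple of that alignment. The helper below is a minimal sketch, not the renamed Cycles function; the name `align_sub_buffer_offset` is hypothetical, and the alignment value is assumed to have been queried from the device beforehand.

```cpp
#include <cstddef>

// Hypothetical helper: round a byte offset up to the next multiple of the
// device's base address alignment. alignment_bits is the value reported for
// CL_DEVICE_MEM_BASE_ADDR_ALIGN, which OpenCL specifies in bits.
std::size_t align_sub_buffer_offset(std::size_t offset_bytes, std::size_t alignment_bits)
{
  const std::size_t alignment_bytes = alignment_bits / 8;
  if (alignment_bytes == 0) {
    // No usable alignment reported; leave the offset untouched.
    return offset_bytes;
  }
  return ((offset_bytes + alignment_bytes - 1) / alignment_bytes) * alignment_bytes;
}
```

For a device reporting 1024 bits, offsets are rounded up to multiples of 128 bytes before being used as a sub-buffer origin.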
2017-11-30	Cycles: Improve denoising speed on GPUs with small tile sizes	Lukas Stockner
	Previously, the NLM kernels would be launched once per offset with one thread per pixel. However, with the smaller tile sizes that are now feasible, there wasn't enough work to fully occupy GPUs which results in a significant slowdown. Therefore, the kernels are now launched in a single call that handles all offsets at once. This has two downsides: Memory accesses to accumulating buffers are now atomic, and more importantly, the temporary memory now has to be allocated for every shift at once, increasing the required memory. On the other hand, of course, the smaller tiles significantly reduce the size of the memory. The main bottleneck right now is the construction of the transformation - there is nothing to be parallelized there, one thread per pixel is the maximum. I tried to parallelize the SVD implementation by storing the matrix in shared memory and launching one block per pixel, but that wasn't really going anywhere. To make the new code somewhat readable, the handling of rectangular regions was cleaned up a bit and commented, it should be easier to understand what's going on now. Also, some variables have been renamed to make the difference between buffer width and stride more apparent, in addition to some general style cleanup.
2017-11-09	Cycles: avoid reallocating tile denoising memory many times during render.	Brecht Van Lommel

2017-10-24	Code refactor: use device_only_memory and device_vector in more places.	Brecht Van Lommel

2017-10-24	Code refactor: store device/interp/extension/type in each device_memory.	Brecht Van Lommel

2017-10-07	Code refactor: make texture code more consistent between devices.	Brecht Van Lommel
	* Use common TextureInfo struct for all devices, except CUDA fermi.
	* Move image sampling code to kernels//kernel__image.h files.
	* Use arrays for data textures on Fermi too, so device_vector<Struct> works.
2017-08-12	Code cleanup: fix warning and improve terminology.	Brecht Van Lommel

2017-08-08	Cycles: More fixes for Windows 32 bit	Sergey Sharybin
	- Apparently MSVC does not support compound literals in C++ (at least by the looks of it).
	- Not sure how opencl_device_assert was managing to set protected property of the Device class.
2017-08-08	Cycles: Pack kernel textures into buffers for OpenCL	Mai Lavelle
	Image textures were being packed into a single buffer for OpenCL, which limited the amount of memory available for images to the size of one buffer (usually 4gb on AMD hardware). By packing textures into multiple buffers that limit is removed, while simultaneously reducing the number of buffers that need to be passed to each kernel. Benchmarks were within 2%. Fixes T51554. Differential Revision: https://developer.blender.org/D2745
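A simplified picture of packing textures into several buffers is sketched below. It is an assumption-laden illustration rather than the algorithm used in the patch: `TextureSlot` and `pack_textures` are hypothetical names, sizes are plain byte counts, and the strategy is a simple next-fit that opens a new buffer whenever the current one would exceed the per-buffer limit.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical record of where a texture ends up: which buffer, and at what offset.
struct TextureSlot {
  std::size_t buffer;
  std::size_t offset;
};

// Next-fit packing: place textures in order and start a new buffer whenever
// the current one cannot hold the next texture.
std::vector<TextureSlot> pack_textures(const std::vector<std::size_t> &sizes,
                                       std::size_t max_buffer_size)
{
  std::vector<TextureSlot> slots;
  std::size_t buffer = 0;
  std::size_t used = 0;
  for (std::size_t size : sizes) {
    if (size > max_buffer_size) {
      throw std::invalid_argument("texture larger than a single buffer");
    }
    if (used + size > max_buffer_size) {
      ++buffer;
      used = 0;
    }
    slots.push_back({buffer, used});
    used += size;
  }
  return slots;
}
```

The total image memory is then bounded by the number of buffers the device can hold rather than by the size of one buffer.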
2017-07-11	Cycles: Disable OpenCL clFlush workarounds	Sergey Sharybin
	This is something which was reported to work fine by Mai, Benjamin and confirmed by myself. Disabling this workaround gains us some speedup:
	                    Before    Now
	bmw27               04:28.42  04:07.79
	classroom           09:26.48  08:54.53
	fishy_cat           08:44.01  08:18.70
	koro                09:17.98  08:57.18
	pavillon_barcelone  12:26.64  11:52.81
	Test environment is:
	- Ubuntu 16.04, with all updates installed
	- AMD RX 480 GPU
	- amdgpu pro driver version 17.10-450821
2017-07-05	Cycles: Pass string by const reference rather than by value	Sergey Sharybin
	Some of the functions might have been inlined, but for others I don't see how that was possible (don't think virtual functions can be inlined here). In any case, better be explicitly optimal in the code.
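The difference can be seen in a small sketch with hypothetical classes (`DeviceBase`, `DerivedDevice`, and `seen_name` are made up for illustration and do not exist in Cycles). A virtual function called through a base reference generally cannot be inlined, so a by-value `std::string` parameter would be copied on every call; a const reference passes the caller's object directly.

```cpp
#include <string>

// Hypothetical device interface, for illustration only.
struct DeviceBase {
  virtual ~DeviceBase() = default;
  // Returns the address of the argument it received, to make the absence of
  // a copy observable from the caller.
  virtual const std::string *seen_name(const std::string &name) const
  {
    return &name;
  }
};

struct DerivedDevice : DeviceBase {
  const std::string *seen_name(const std::string &name) const override
  {
    return &name;
  }
};
```

Calling `seen_name` through a `DeviceBase` reference returns the address of the caller's own string, which shows that no temporary copy was made.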
2017-06-10	Cycles: Blacklist unsupported OpenCL devices	Hristo Gueorguiev
	Due to various driver issues with AMD GCN 1 cards we can no longer support these GPUs. This patch makes them unavailable to select for Cycles rendering. GCN cards 2 and higher are still supported. Please use the most recent drivers available to ensure proper functionality. See here for a list to check which GPUs are supported: https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units
2017-06-09	Cycles Denoising: Merge outlier heuristic and confidence interval test	Lukas Stockner
	The previous outlier heuristic only checked whether the pixel is more than twice as bright compared to the 75% quantile of the 5x5 neighborhood. While this detected fireflies robustly, it also incorrectly marked a lot of legitimate small highlights as outliers and filtered them away. This commit adds an additional condition for marking a pixel as a firefly: In addition to being above the reference brightness, the lower end of the 3-sigma confidence interval has to be below it. Since the lower end approximates how low the true value of the pixel might be, this test separates pixels that are supposed to be very bright from pixels that are very bright due to random fireflies. Also, since there is now a reliable outlier filter as a preprocessing step, the additional confidence interval test in the reconstruction kernel is no longer needed.
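The combined test described above can be sketched as follows. This is a minimal illustration, not the Cycles kernel: the function name `is_firefly`, the use of a per-pixel variance estimate for sigma, and the choice of quantile index (element 18 of the 25 sorted neighbors) are assumptions.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical sketch of the merged outlier test for one pixel.
// neighborhood: brightness of the 5x5 neighborhood (taken by value, reordered locally).
// value: brightness of the pixel itself; variance: its estimated variance.
bool is_firefly(std::array<float, 25> neighborhood, float value, float variance)
{
  // 75% quantile of the neighborhood via partial sorting.
  const std::size_t q = (neighborhood.size() * 3) / 4;
  std::nth_element(neighborhood.begin(), neighborhood.begin() + q, neighborhood.end());

  // Reference brightness: twice the 75% quantile.
  const float reference = 2.0f * neighborhood[q];

  // Lower end of the 3-sigma confidence interval of the pixel value.
  const float lower_bound = value - 3.0f * std::sqrt(std::max(variance, 0.0f));

  // Flag only pixels that are above the reference AND whose true value
  // could plausibly lie below it.
  return value > reference && lower_bound < reference;
}
```

A bright pixel with low variance is kept as a legitimate highlight, while an equally bright pixel with high variance is flagged.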
2017-05-18	Cycles Denoising: Add more robust outlier heuristic to avoid artifacts	Lukas Stockner
	Extremely bright pixels in the rendered image cause the denoising algorithm to produce extremely noticeable artifacts. Therefore, a heuristic is needed to exclude these pixels from the filtering process. The new approach calculates the 75% percentile of the 5x5 neighborhood of each pixel and flags the pixel if it is more than twice as bright. During the reconstruction process, flagged pixels are skipped. Therefore, they don't cause any problems for neighboring pixels, and the outlier pixels themselves are replaced by a prediction of their actual value based on their feature pass values and the neighboring pixels. Therefore, the denoiser now also works as a smarter despeckling filter that uses a more accurate prediction of the pixel instead of a simple average. This can be used even if denoising isn't wanted by setting the denoising radius to 1.
2017-05-07	Cycles: Implement denoising option for reducing noise in the rendered image	Lukas Stockner
	This commit contains the first part of the new Cycles denoising option, which filters the resulting image using information gathered during rendering to get rid of noise while preserving visual features as well as possible. To use the option, enable it in the render layer options. The default settings fit a wide range of scenes, but the user can tweak individual settings to control the tradeoff between a noise-free image, image details, and calculation time. Note that the denoiser may still change in the future and that some features are not implemented yet. The most important missing feature is animation denoising, which uses information from multiple frames at once to produce a flicker-free and smoother result. These features will be added in the future.
	Finally, thanks to all the people who supported this project:
	- Google (through the GSoC) and Theory Studios for sponsoring the development
	- The authors of the papers I used for implementing the denoiser (more details on them will be included in the technical docs)
	- The other Cycles devs for feedback on the code, especially Sergey for mentoring the GSoC project and Brecht for the code review!
	- And of course the users who helped with testing, reported bugs and things that could and/or should work better!
2017-05-02	Cycles: Remove extra clFinish from driver workaround	Mai Lavelle
	These were causing problems with Nvidia OpenCL.
2017-03-29	Cycles: Make all #include statements relative to cycles source directory	Sergey Sharybin
	The idea is to make include statements more explicit and obvious where the file is coming from, additionally reducing chance of wrong header being picked up. For example, it was not obvious whether bvh.h was referring to builder or traversal, whether node.h is a generic graph node or a shader node and cases like that. Surely this might look obvious for the active developers, but after some time of not touching the code it becomes less obvious where file is coming from. This was briefly mentioned in T50824 and seems @brecht is fine with such explicitness, but need to agree with all active developers before committing this. Please note that this patch is lacking changes related to GPU/OpenCL support. This will be solved if/when we all agree this is a good idea to move forward. Reviewers: brecht, lukasstockner97, maiself, nirved, dingto, juicyfruit, swerner Reviewed By: lukasstockner97, maiself, nirved, dingto Subscribers: brecht Differential Revision: https://developer.blender.org/D2586
2017-03-21	Cycles: Use more friendly GPU device name for AMD cards	Sergey Sharybin
	For example, for RX480 you'll no longer see "Ellesmere" but will see "AMD Radeon RX 480 Graphics" which makes more sense and allows to easily distinguish which exact card it is when having multiple different cards of Ellesmere codenames (i.e. RX480 and WX7100) in the same machine.
2017-03-21	Cycles: Cleanup, add some utility functions to shorten access to low level API	Sergey Sharybin
	Should be no functional changes.
2017-03-09	Cycles: add single program debug option for split kernel	Hristo Gueorguiev
	Single program generally compiles kernels faster (2-3 times), loads faster, takes less drive space (2-3 times), and reduces the number of cached kernels.
2017-03-08	Cycles: Log which device kernels are being loaded for	Sergey Sharybin

2017-03-08	Cycles: Add names to buffer allocations	Mai Lavelle
	This is to help debug and track memory usage for generic buffers. We have similar for textures already since those require a name, but for buffers the name is only for debugging purposes.
2017-03-08	Cycles: Workaround for driver hangs	Mai Lavelle
	Simple workaround for some issues we've been having with AMD drivers hanging and rendering systems unresponsive. Unfortunately this makes things a bit slower, but it's better than having to do hard reboots. Will be removed when drivers have been fixed. Define CYCLES_DISABLE_DRIVER_WORKAROUNDS to disable for testing purposes.
2017-03-08	Cycles: OpenCL split kernel refactor	Mai Lavelle
	This does a few things at once:
	- Refactors host side split kernel logic into a new device agnostic class `DeviceSplitKernel`.
	- Removes tile splitting, a new work pool implementation takes its place and allows as many threads as will fit in memory regardless of tile size, which can give performance gains.
	- Refactors split state buffers into one buffer, as well as reduces the number of arguments passed to kernels. Means there's less code to deal with overall.
	- Moves kernel logic out of OpenCL kernel files so they can later be used by other device types.
	- Replaced OpenCL specific APIs with new generic versions
	- Tiles can now be seen updating during rendering
2016-11-21	Cycles: Attempt to fix compilation error on ppc64el	Sergey Sharybin
	There is some define conflict between system headers and clew, so delay include of clew.h as much as possible. This is something which needed to be done in the code before the refactor, hopefully such change will still work.
2016-11-07	Cycles: Refactor Device selection to allow individual GPU compute device selection	Lukas Stockner
	Previously, it was only possible to choose a single GPU or all of that type (CUDA or OpenCL). Now, a toggle button is displayed for every device. These settings are tied to the PCI Bus ID of the devices, so they're consistent across hardware addition and removal (but not when swapping/moving cards). From the code perspective, the more important change is that now, the compute device properties are stored in the Addon preferences of the Cycles addon, instead of directly in the User Preferences. This allows for a cleaner implementation, removing the Cycles C API functions that were called by the RNA code to specify the enum items. Note that this change is neither backwards- nor forwards-compatible, but since it's only a User Preference no existing files are broken. Reviewers: #cycles, brecht Reviewed By: #cycles, brecht Subscribers: brecht, juicyfruit, mib2berlin, Blendify Differential Revision: https://developer.blender.org/D2338
2016-10-17	Cycles: Improve OpenCL kernel compilation logging	Lukas Stockner
	The previous refactor changed the code to use a separate logging mechanism to support multithreaded compilation. However, since that's not supported by any frameworks yet, it just resulted in bad logging behaviour. So, this commit changes the logging to go directly to stdout/stderr once again by default.