git.blender.org/blender.git

Age	Commit message	Author
2020-06-24	Cleanup: compiler warnings	Brecht Van Lommel

2020-05-15	Cycles: code refactor to bake using regular render session and tiles	Brecht Van Lommel
	There should be no user visible change from this, except that tile size now affects performance. The goal here is to simplify bake denoising in D3099, letting it reuse more denoising tiles and pass code. A lot of code is now shared with regular rendering, with the two main differences being that we read some render result passes from the bake API when starting to render a tile, and call the bake kernel instead of the path trace kernel. With this kind of design where Cycles asks for tiles from the bake API, it should eventually be easier to reduce memory usage, show tiles as they are baked, or bake multiple passes at once, though there's still quite some work needed for that. Reviewers: #cycles Subscribers: monio, wmatyjewicz, lukasstockner97, michaelknubben Differential Revision: https://developer.blender.org/D3108
2020-03-18	Cycles: support for different 3D transform per volume grid	Brecht Van Lommel
	This is not yet fully supported by automatic volume bounds but works fine in most cases that will have mostly matching bounds. Ref T73201
2020-03-12	Cleanup: add device_texture for images, distinct from other global memory	Brecht Van Lommel
	There was too much image texture specific stuff in device_memory, and too much code duplication between devices.
2020-03-11	Cleanup: stop encoding image data type in slot index	Brecht Van Lommel
	This is legacy code from when we had a fixed number of textures.
2020-03-06	Cleanup: tweak Cycles #includes in preparation for clang-format sorting	Brecht Van Lommel

2020-03-05	Adaptive Sampling for Cycles.	Stefan Werner
	This feature takes some inspiration from "RenderMan: An Advanced Path Tracing Architecture for Movie Rendering" and "A Hierarchical Automatic Stopping Condition for Monte Carlo Global Illumination". The basic principle is as follows: While samples are being added to a pixel, the adaptive sampler writes half of the samples to a separate buffer. This gives it two separate estimates of the same pixel, and by comparing their difference it estimates convergence. Once the estimated error drops below a given threshold, the pixel is considered done. When a pixel has not converged yet and needs more samples than the minimum, its immediate neighbors are also set to take more samples. This is done in order to more reliably detect sharp features such as caustics. A 3x3 box filter that is run periodically over the tile buffer is used for that purpose. After a tile has finished rendering, the values of all passes are scaled as if they were rendered with the full number of samples. This way, any code operating on these buffers, for example the denoiser, does not need to be changed for per-pixel sample counts. Reviewed By: brecht, #cycles Differential Revision: https://developer.blender.org/D4686
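	A minimal sketch of the convergence test described above, written in Python rather than Cycles kernel code. The function names, the relative-error metric, and the default threshold and sample limits are assumptions made for illustration; the commit only states that two estimates of the same pixel are compared and that finished buffers are scaled to the full sample count.

```python
import random


def render_pixel_adaptive(sample_fn, threshold=0.01, min_samples=16, max_samples=4096):
    """Accumulate samples until two estimates of the pixel agree within `threshold`.

    Returns (sum_of_samples, number_of_samples_taken).
    """
    total = 0.0  # sum of all samples (the regular buffer)
    half = 0.0   # sum of every other sample (the separate buffer)
    n = 0
    while n < max_samples:
        value = sample_fn()
        total += value
        if n % 2 == 0:
            half += value
        n += 1
        # Only test once both buffers hold a meaningful number of samples.
        if n >= min_samples and n % 2 == 0:
            full_estimate = total / n
            half_estimate = half / (n // 2)
            # Hypothetical error metric: relative difference of the two estimates.
            error = abs(full_estimate - half_estimate) / max(abs(full_estimate), 1e-6)
            if error < threshold:
                break
    return total, n


def scale_to_full(sample_sum, samples_taken, full_samples):
    """Rescale an accumulated sum as if `full_samples` samples had been taken."""
    return sample_sum * full_samples / samples_taken


if __name__ == "__main__":
    rng = random.Random(0)
    total, n = render_pixel_adaptive(rng.random)
    print("samples taken:", n, "mean:", total / n)
```

	The rescaling step is what lets later stages, such as a denoiser, keep dividing by one global sample count even though each pixel may have stopped early.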
2019-08-16	Fix T55054: possible use of unsupported instructions in Cycles texture code	Lazydodo
	Differential Revision: https://developer.blender.org/D5326
2019-04-17	ClangFormat: apply to source, most of intern	Campbell Barton
	Apply clang format as proposed in T53211. For details on usage and instructions for migrating branches without conflicts, see: https://wiki.blender.org/wiki/Tools/ClangFormat
2019-02-06	Cycles: animation denoising support in the kernel.	Lukas Stockner
	This is the internal implementation, not available from the API or interface yet. The algorithm takes into account past and future frames, both to get more coherent animation and reduce noise. Ref D3889.
2019-02-06	Cycles: prefilter feature passes separate from denoising.	Lukas Stockner
	Prefiltering of feature passes will happen during rendering, which can then be used for denoising immediately or written as a render pass for later (animation) denoising. The number of denoising data passes written is reduced because of this, leaving out the feature variance passes. The passes are now Normal, Albedo, Depth, Shadowing, Variance and Intensity. Ref D3889.
2018-11-09	Cycles: Cleanup, space after (void)	Sergey Sharybin
	It was used in like 95% of places.
2018-11-09	Cycles: Cleanup, spacing after preprocessor	Sergey Sharybin
	There should be two spaces before the comment stating which if statement an else/endif corresponds to. This was mainly violated in the header guards.
2018-10-06	Cycles: Implement vectorized NLM kernels for faster CPU denoising	Lukas Stockner

2018-07-06	Cleanup: strip trailing space for cycles	Campbell Barton

2018-07-05	Cycles: Adding native support for UINT16 textures.	Stefan Werner
	Textures in 16 bit integer format are sometimes used for displacement, bump and normal maps and can be exported by tools like Substance Painter. Without this patch, Cycles would promote those textures to single precision floating point, causing them to take up twice as much memory as needed. Reviewers: #cycles, brecht, sergey Reviewed By: #cycles, brecht, sergey Subscribers: sergey, dingto, #cycles Tags: #cycles Differential Revision: https://developer.blender.org/D3523
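	The memory claim above can be checked with a little arithmetic. This is a sketch in Python; the helper names and the 4096x4096 RGBA example size are assumptions for illustration, and the normalization by 65535 is the usual convention for unsigned 16 bit data rather than something stated in the commit.

```python
# Bytes used by one channel of one texel for each storage type.
BYTES_PER_CHANNEL = {"uint16": 2, "float32": 4}


def texture_bytes(width, height, channels, data_type):
    """Memory needed to store an uncompressed texture of the given type."""
    return width * height * channels * BYTES_PER_CHANNEL[data_type]


def uint16_to_float(value):
    """Map a stored 16 bit integer to the [0, 1] range at lookup time."""
    return value / 65535.0


if __name__ == "__main__":
    native = texture_bytes(4096, 4096, 4, "uint16")
    promoted = texture_bytes(4096, 4096, 4, "float32")
    print(native // 2**20, "MiB native vs", promoted // 2**20, "MiB promoted")
```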
2018-07-04	Cycles Denoising: Cleanup: Rename tiles to tile_info	Lukas Stockner

2018-06-14	Cycles: Query XYZ to/from Scene Linear conversion from OCIO instead of assuming sRGB	Lukas Stockner
	I've limited it to just the RGB<->XYZ stuff for now, correct image handling is the next step. Reviewers: brecht, sergey Differential Revision: https://developer.blender.org/D3478
2018-05-28	Windows: Add support for building with clang.	Ray Molenkamp
	This commit contains the minimum to make clang build/work with blender, asan and ninja build support is forthcoming.
	Things to note:
	1) Builds and runs, and is able to pass all tests (except for the freestyle_stroke_material.blend test which was broken at that time for all platforms by the looks of it)
	2) It's slightly faster than msvc when using cycles. (time in seconds, on an i7-3370)
	   victor_cpu msvc:3099.51 clang:2796.43
	   pavillon_barcelona_cpu msvc:1872.05 clang:1827.72
	   koro_cpu msvc:1097.58 clang:1006.51
	   fishy_cat_cpu msvc:815.37 clang:722.2
	   classroom_cpu msvc:1705.39 clang:1575.43
	   bmw27_cpu msvc:552.38 clang:561.53
	   barbershop_interior_cpu msvc:2134.93 clang:1922.33
	3) clang on windows uses a drop in replacement for the Microsoft cl.exe (takes some of the Microsoft parameters, but not all, and takes some of the clang parameters but not all) and uses ms headers + libraries + linker, so you still need visual studio installed and will use our existing vc14 svn libs.
	4) X64 only currently, X86 builds but crashes on startup.
	5) Tested with llvm/clang 6.0.0
	6) Requires visual studio integration, available at https://github.com/LazyDodo/llvm-vs2017-integration
	7) The Microsoft compiler spawns a few copies of cl in parallel to get faster build times, clang doesn't, so the build time is 3-4x slower than with msvc.
	8) No openmp support yet. Have not looked at this much, the binary distribution of clang doesn't seem to include it on windows.
	9) No ASAN support yet, some of the sanitizers can be made to work, but it was decided to leave support out of this commit.
	Reviewers: campbellbarton Differential Revision: https://developer.blender.org/D3304
2018-02-18	Code cleanup: remove some more unused code after recent CUDA changes.	Brecht Van Lommel

2018-01-19	Cycles: Remove util_debug include from kernel code	Sergey Sharybin
	Not sure why it was in there, all the debug flags stuff is to be handled outside of kernel.
2017-12-08	Cycles: Fix difference in image Clip extension method between CPU and GPU	Sergey Sharybin
	Our own implementation was behaving differently compared to OSL and GPU: on the border pixels OSL and CUDA were interpolating with black, but we were clamping the coordinate. This partially fixes the issue reported in T53452. A similar change should perhaps also be done for 3D interpolation, but this is to be investigated separately.
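	The difference between the two border behaviours can be illustrated with a one-dimensional sketch in Python. The function names and the half-texel-center convention are assumptions; the actual change concerns the 2D CPU image sampling code.

```python
import math


def texel_clip(row, i):
    """Fetch a texel; indices outside the image read as black (Clip behaviour)."""
    return row[i] if 0 <= i < len(row) else 0.0


def texel_clamp(row, i):
    """Fetch a texel; indices outside the image repeat the edge texel."""
    return row[min(max(i, 0), len(row) - 1)]


def sample_linear(row, u, fetch):
    """Linearly interpolate a row of texels at normalized coordinate u in [0, 1]."""
    x = u * len(row) - 0.5  # assumed convention: texel centers at half-integers
    i = math.floor(x)
    t = x - i
    return (1.0 - t) * fetch(row, i) + t * fetch(row, i + 1)


if __name__ == "__main__":
    white = [1.0, 1.0, 1.0, 1.0]
    print(sample_linear(white, 0.0, texel_clip))   # border blends with black
    print(sample_linear(white, 0.0, texel_clamp))  # border stays white
```

	Away from the border both fetch functions return the same texels, so the two modes only disagree within half a texel of the image edge.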
2017-12-08	Cycles: Cleanup, split 2D interpolation function	Sergey Sharybin

2017-11-30	Cycles: Improve denoising speed on GPUs with small tile sizes	Lukas Stockner
	Previously, the NLM kernels would be launched once per offset with one thread per pixel. However, with the smaller tile sizes that are now feasible, there wasn't enough work to fully occupy GPUs which results in a significant slowdown. Therefore, the kernels are now launched in a single call that handles all offsets at once. This has two downsides: Memory accesses to accumulating buffers are now atomic, and more importantly, the temporary memory now has to be allocated for every shift at once, increasing the required memory. On the other hand, of course, the smaller tiles significantly reduce the size of the memory. The main bottleneck right now is the construction of the transformation - there is nothing to be parallelized there, one thread per pixel is the maximum. I tried to parallelize the SVD implementation by storing the matrix in shared memory and launching one block per pixel, but that wasn't really going anywhere. To make the new code somewhat readable, the handling of rectangular regions was cleaned up a bit and commented, it should be easier to understand what's going on now. Also, some variables have been renamed to make the difference between buffer width and stride more apparent, in addition to some general style cleanup.
2017-11-05	Code refactor: device memory cleanups, preparing for mapped host memory.	Brecht Van Lommel

2017-10-24	Code refactor: store device/interp/extension/type in each device_memory.	Brecht Van Lommel

2017-10-07	Cycles: CUDA bicubic and tricubic texture interpolation support.	Brecht Van Lommel
	While cubic interpolation is quite expensive on the CPU compared to linear interpolation, the difference on the GPU is quite small.
2017-10-07	Code refactor: make texture code more consistent between devices.	Brecht Van Lommel
	* Use common TextureInfo struct for all devices, except CUDA fermi.
	* Move image sampling code to kernels//kernel__image.h files.
	* Use arrays for data textures on Fermi too, so device_vector<Struct> works.
2017-10-05	Code refactor: split displace/background into separate kernels, remove luma.	Brecht Van Lommel

2017-10-04	Code refactor: use split variance calculation for mega kernels too.	Brecht Van Lommel
	There is no significant difference in denoised benchmark scenes and denoising ctests, so might as well make it all consistent.
2017-10-04	Code refactor: remove rng_state buffer and compute hash on the fly.	Brecht Van Lommel
	A little faster on some benchmark scenes, a little slower on others, seems about performance neutral on average and saves a little memory.
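	The idea of replacing a stored per-pixel RNG state with a hash computed on demand can be sketched as follows, in Python. The Wang-style integer hash and the way coordinates and seed are combined are assumptions chosen for illustration, not necessarily what Cycles uses.

```python
MASK32 = 0xFFFFFFFF


def wang_hash(key):
    """A small 32 bit integer hash (Wang-style), used here only as an example."""
    key &= MASK32
    key = (key ^ 61) ^ (key >> 16)
    key = (key * 9) & MASK32
    key ^= key >> 4
    key = (key * 0x27D4EB2D) & MASK32
    key ^= key >> 15
    return key


def pixel_rng_hash(x, y, seed):
    """Derive a per-pixel value from coordinates and seed, with no stored buffer."""
    return wang_hash(x ^ wang_hash(y ^ wang_hash(seed)))


if __name__ == "__main__":
    # The same pixel always yields the same value, so nothing needs to be stored.
    print(pixel_rng_hash(10, 20, 0), pixel_rng_hash(10, 20, 0))
```

	The trade-off matches the commit message: a few extra integer operations per lookup in exchange for dropping one buffer entry per pixel.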
2017-08-08	Cycles: Fix compilation error of filter kernels on 32 bit Windows	Sergey Sharybin
	We don't enable global SSE optimizations in the regular kernel, and we keep those disabled on Linux 32bit. One possible workaround would be to pass arguments by ccl_ref, but that touches quite a lot of code and is better done carefully.
2017-08-07	Code refactor: use float4 instead of intrinsics for CPU denoise filtering.	Brecht Van Lommel
	Differential Revision: https://developer.blender.org/D2764
2017-06-10	Cycles: Add kernel to enqueue inactive rays	Mai Lavelle
	The queue will be used to reuse inactive threads and keep the GPU busier.
2017-06-09	Cycles Denoising: Merge outlier heuristic and confidence interval test	Lukas Stockner
	The previous outlier heuristic only checked whether the pixel is more than twice as bright compared to the 75% quantile of the 5x5 neighborhood. While this detected fireflies robustly, it also incorrectly marked a lot of legitimate small highlights as outliers and filtered them away. This commit adds an additional condition for marking a pixel as a firefly: In addition to being above the reference brightness, the lower end of the 3-sigma confidence interval has to be below it. Since the lower end approximates how low the true value of the pixel might be, this test separates pixels that are supposed to be very bright from pixels that are very bright due to random fireflies. Also, since there is now a reliable outlier filter as a preprocessing step, the additional confidence interval test in the reconstruction kernel is no longer needed.
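	The combined test can be sketched in Python as below. The function name is hypothetical, and two details are assumptions: that the reference brightness is twice the 75% quantile of the neighborhood, and that `variance` is the variance of the pixel's estimated value.

```python
import math


def is_firefly(value, variance, quantile_75):
    """Flag a pixel only if it is too bright AND its true value might be much lower."""
    reference = 2.0 * quantile_75                    # brightness limit from the neighborhood
    lower_bound = value - 3.0 * math.sqrt(variance)  # lower end of the 3-sigma interval
    return value > reference and lower_bound < reference


if __name__ == "__main__":
    # Bright but well converged: treated as a legitimate highlight.
    print(is_firefly(10.0, 0.01, 1.0))
    # Bright and very uncertain: treated as a firefly.
    print(is_firefly(10.0, 16.0, 1.0))
```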
2017-05-19	Cycles: Cleanup, variable names	Sergey Sharybin
	Don't use camel case for variable names. Leave that for the structures.
2017-05-18	Cycles Denoising: Add more robust outlier heuristic to avoid artifacts	Lukas Stockner
	Extremely bright pixels in the rendered image cause the denoising algorithm to produce extremely noticeable artifacts. Therefore, a heuristic is needed to exclude these pixels from the filtering process. The new approach calculates the 75% percentile of the 5x5 neighborhood of each pixel and flags the pixel if it is more than twice as bright. During the reconstruction process, flagged pixels are skipped. Therefore, they don't cause any problems for neighboring pixels, and the outlier pixels themselves are replaced by a prediction of their actual value based on their feature pass values and the neighboring pixels. Therefore, the denoiser now also works as a smarter despeckling filter that uses a more accurate prediction of the pixel instead of a simple average. This can be used even if denoising isn't wanted by setting the denoising radius to 1.
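	A minimal sketch of the percentile heuristic in Python, operating on a single-channel image stored as a list of rows. The helper names, the nearest-rank quantile, the inclusion of the center pixel, and the cropping of the window at image borders are all assumptions for illustration.

```python
def neighborhood(image, x, y, radius=2):
    """Collect the values in the (2*radius+1)^2 window around (x, y), cropped to the image."""
    height, width = len(image), len(image[0])
    return [
        image[j][i]
        for j in range(max(0, y - radius), min(height, y + radius + 1))
        for i in range(max(0, x - radius), min(width, x + radius + 1))
    ]


def quantile_75(values):
    """Nearest-rank estimate of the 75% quantile."""
    ordered = sorted(values)
    return ordered[int(0.75 * (len(ordered) - 1))]


def is_outlier(image, x, y):
    """Flag a pixel that is more than twice as bright as its neighborhood's 75% quantile."""
    return image[y][x] > 2.0 * quantile_75(neighborhood(image, x, y))


if __name__ == "__main__":
    image = [[1.0] * 5 for _ in range(5)]
    image[2][2] = 100.0  # a single very bright pixel
    print(is_outlier(image, 2, 2), is_outlier(image, 0, 0))
```

	Because a quantile is used instead of the mean, one extreme value in the window barely moves the reference brightness.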
2017-05-16	Cycles: Fix building with native only option	Mai Lavelle
	Approach suggested by Lukas S.
2017-05-07	Cycles: Implement denoising option for reducing noise in the rendered image	Lukas Stockner
	This commit contains the first part of the new Cycles denoising option, which filters the resulting image using information gathered during rendering to get rid of noise while preserving visual features as well as possible. To use the option, enable it in the render layer options. The default settings fit a wide range of scenes, but the user can tweak individual settings to control the tradeoff between a noise-free image, image details, and calculation time. Note that the denoiser may still change in the future and that some features are not implemented yet. The most important missing feature is animation denoising, which uses information from multiple frames at once to produce a flicker-free and smoother result. These features will be added in the future.
	Finally, thanks to all the people who supported this project:
	- Google (through the GSoC) and Theory Studios for sponsoring the development
	- The authors of the papers I used for implementing the denoiser (more details on them will be included in the technical docs)
	- The other Cycles devs for feedback on the code, especially Sergey for mentoring the GSoC project and Brecht for the code review!
	- And of course the users who helped with testing, reported bugs and things that could and/or should work better!
2017-05-03	Cycles: Split kernel - sort shaders	Hristo Gueorguiev
	Reduce thread divergence in kernel_shader_eval. Rays are sorted in blocks of 2048 according to shader->id. On R9 290 Classroom is ~30% faster, and Pabellon Barcelona is ~8% faster. No sorting for CUDA split kernel. Reviewers: sergey, maiself Reviewed By: maiself Differential Revision: https://developer.blender.org/D2598
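	The block-wise sorting can be sketched on the host side in Python. This only shows the reordering idea; the function name is hypothetical, and the real implementation runs as part of the split kernel on the device.

```python
def sort_by_shader_in_blocks(ray_shader_ids, block_size=2048):
    """Return ray indices reordered so that, within each block, equal shader ids are adjacent."""
    order = []
    for start in range(0, len(ray_shader_ids), block_size):
        block = range(start, min(start + block_size, len(ray_shader_ids)))
        # sorted() is stable, so rays sharing a shader keep their relative order.
        order.extend(sorted(block, key=lambda i: ray_shader_ids[i]))
    return order


if __name__ == "__main__":
    shader_ids = [2, 0, 1, 0, 1, 1, 0, 2, 0]
    print(sort_by_shader_in_blocks(shader_ids, block_size=4))
```

	Sorting only within fixed-size blocks keeps the cost bounded while still letting neighboring threads evaluate the same shader, which is where the reduced divergence comes from.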
2017-05-02	Cycles: Branched path tracing for the split kernel	Mai Lavelle
	This implements branched path tracing for the split kernel. General approach is to store the ray state at a branch point, trace the branched ray as normal, then restore the state as necessary before iterating to the next part of the path. A state machine is used to advance the indirect loop state, which avoids the need to add any new kernels. Each iteration the state machine recreates as much state as possible from the stored ray to keep overall storage down. It's kind of hard to keep all the different integration loops in sync, so this needs lots of testing to make sure everything is working correctly. We should probably start trying to deduplicate the integration loops more now. Nonbranched BMW is ~2% slower, while classroom is ~2% faster, other scenes could use more testing still. Reviewers: sergey, nirved Reviewed By: nirved Subscribers: Blendify, bliblubli Differential Revision: https://developer.blender.org/D2611
2017-04-28	Cycles: Cleanup, indentation and trailing whitespace and wrapping	Sergey Sharybin

2017-04-27	Unlimited number of textures for Cycles	Stefan Werner
	This patch allows for an unlimited number of textures in Cycles where the hardware allows. It replaces a number of static arrays with dynamic arrays and changes the way the flat_slot indices are calculated. Eventually, I'd like to get to a point where there are only flat slots left and textures of all kinds are stored in a single array. Note that the arrays in DeviceScene are changed from containing device_vector<T> objects to device_vector<T>* pointers. Ideally, I'd like to store objects, but dynamic resizing of a std::vector in pre-C++11 calls the copy constructor, which for a good reason is not implemented for device_vector. Once we require C++11 for Cycles builds, we can implement a move constructor for device_vector and store objects again. The limits for CUDA Fermi hardware still apply. Reviewers: tod_baudais, InsigMathK, dingto, #cycles Reviewed By: dingto, #cycles Subscribers: dingto, smellslikedonkey Differential Revision: https://developer.blender.org/D2650
2017-03-29	Cycles: Make all #include statements relative to cycles source directory	Sergey Sharybin
	The idea is to make include statements more explicit and obvious about where the file is coming from, additionally reducing the chance of the wrong header being picked up. For example, it was not obvious whether bvh.h was referring to the builder or traversal, whether node.h is a generic graph node or a shader node, and cases like that. Surely this might look obvious for the active developers, but after some time of not touching the code it becomes less obvious where a file is coming from. This was briefly mentioned in T50824 and it seems @brecht is fine with such explicitness, but we need to agree with all active developers before committing this. Please note that this patch is lacking changes related to GPU/OpenCL support. This will be solved if/when we all agree this is a good idea to move forward. Reviewers: brecht, lukasstockner97, maiself, nirved, dingto, juicyfruit, swerner Reviewed By: lukasstockner97, maiself, nirved, dingto Subscribers: brecht Differential Revision: https://developer.blender.org/D2586
2017-03-16	Cycles: Define ccl_local variables in kernel functions	Sergey Sharybin
	Declaring ccl_local in a device function is not supported by certain compilers.
2017-03-09	Cycles: split kernel_shadow_blocked to AO & DL parts	Hristo Gueorguiev
	Reduces memory allocation for split kernel. This allows for faster rendering due to bigger global size, especially when GPU memory is limited.
	Performance results (R9 290, total render time):
	Scene                   Before  After   Change
	BMW                     4:37    4:34    -1.1 %
	Classroom               14:43   14:30   -1.5 %
	Fishy Cat               11:20   11:04   -2.4 %
	Koro                    12:11   12:04   -1.0 %
	Pabellon Barcelona      22:01   20:44   -5.8 %
	Pabellon Barcelona (*)  15:32   15:09   -2.5 %
	(*) without glossy connected to volume
2017-03-09	Cycles: SSS and Volume rendering in split kernel	Hristo Gueorguiev
	Decoupled ray marching is not supported yet. Transparent shadows are always enabled for volume rendering. Changes in kernel/bvh and kernel/geom are from Sergey. This simplifies code significantly, and prepares it for record-all transparent shadow function in split kernel.
2017-03-08	Cycles: Remove sum_all_radiance kernel	Mai Lavelle
	This was only needed for the previous implementation of parallel samples. As we don't have that any more it can be removed. The real reason for removal, though, is this: `per_sample_output_buffers` was being calculated too small and artifacts resulted. The tile buffer is already the correct size and calculating the size for `per_sample_output_buffers` is a bit difficult with the current layout of the code. As `per_sample_output_buffers` was only needed for `sum_all_radiance`, removing that kernel and writing output to the tile buffer directly fixes the artifacts.
2017-03-08	Cycles: Split path initialization into own kernel	Mai Lavelle
	This makes it easier to initialize things correctly in the data_init kernel before they are needed by path tracing.
2017-03-08	Cycles: CPU implementation of split kernel	Mai Lavelle