Welcome to mirror list, hosted at ThFree Co, Russian Federation.

git.blender.org/blender.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-07-04Cycles Denoising: Refactor denoiser tile handlingLukas Stockner
This deduplicates the calls for tile (un)mapping and allows to have a target buffer that is different from the source buffer (needed for baking and animation denoising).
2018-07-04Cycles Denoising: Split main function into logical stepsLukas Stockner
2018-06-26Revert "Turned off clang warnings in third party includes."Stefan Werner
This reverts commit d53093953f8f3b58600cb19020ecbe0b5f254b52.
2018-06-26Turned off clang warnings in third party includes.Stefan Werner
The latest clang compiler (at least the one in Xcode 9.4.1) warns about the register keyword and macro expansions using defined(). Since these warnings come from third party code, we can't address them directly in Blender. Silencing them via #pramgas will at least keep the warnings during a build down to the ones that are relevant to Blender code.
2018-02-06Fix T54001: AMD OpenCL fails with certain resolutions, after recent changes.Brecht Van Lommel
We should actually be using CL_DEVICE_MEM_BASE_ADDR_ALIGN for sub buffers, previous change in this code was incorrect. Renamed the function now to make the specific purpose of this alignment clear, it's not required for data types in general.
2018-01-22Cycles: Replace use_qbvh boolean flag with an enum-based propertySergey Sharybin
This was we can introduce other types of BVH, for example, wider ones, without causing too much mess around boolean flags. Thoughs: - Ideally device info should probably return bitflag of what BVH types it supports. It is possible to implement based on simple logic in device/ and mesh.cpp, rest of the changes will stay the same. - Not happy with workarounds in util_debug and duplicated enum in kernel. Maybe enbum should be stores in kernel, but then it's kind of weird to include kernel types from utils. Soudns some cyclkic dependency. Reviewers: brecht, maxim_d33 Reviewed By: brecht Differential Revision: https://developer.blender.org/D3011
2018-01-19Cycles: Make it more proper check on vectorization flags from DebugFlagsSergey Sharybin
Mimics to checks in system_cpu_support() checks.
2018-01-19Cycles: Cleanup, stop using debug flags in system utilitiesSergey Sharybin
Debug flags are to be controlling render behavior, nothing to do with low level system utilities. it was simple to hack, but logically is wrong. Lets do things where they are supposed to be done!
2017-12-06Cycles: Fix constness for load_kernels in device_cpu.cppLukas Stockner
2017-11-30Cycles: Improve denoising speed on GPUs with small tile sizesLukas Stockner
Previously, the NLM kernels would be launched once per offset with one thread per pixel. However, with the smaller tile sizes that are now feasible, there wasn't enough work to fully occupy GPUs which results in a significant slowdown. Therefore, the kernels are now launched in a single call that handles all offsets at once. This has two downsides: Memory accesses to accumulating buffers are now atomic, and more importantly, the temporary memory now has to be allocated for every shift at once, increasing the required memory. On the other hand, of course, the smaller tiles significantly reduce the size of the memory. The main bottleneck right now is the construction of the transformation - there is nothing to be parallelized there, one thread per pixel is the maximum. I tried to parallelize the SVD implementation by storing the matrix in shared memory and launching one block per pixel, but that wasn't really going anywhere. To make the new code somewhat readable, the handling of rectangular regions was cleaned up a bit and commented, it should be easier to understand what's going on now. Also, some variables have been renamed to make the difference between buffer width and stride more apparent, in addition to some general style cleanup.
2017-11-17Cycles: Add per-tile render time debug passLukas Stockner
Reviewers: sergey, brecht Differential Revision: https://developer.blender.org/D2920
2017-11-09Cycles: avoid reallocating tile denoising memory many times during render.Brecht Van Lommel
2017-11-09Cycles: Replace __MAX_CLOSURE__ build option with runtime integrator variableMai Lavelle
Goal is to reduce OpenCL kernel recompilations. Currently viewport renders are still set to use 64 closures as this seems to be faster and we don't want to cause a performance regression there. Needs to be investigated. Reviewed By: brecht Differential Revision: https://developer.blender.org/D2775
2017-11-05Code refactor: device memory cleanups, preparing for mapped host memory.Brecht Van Lommel
2017-11-03Fix T53247: mixed CPU + GPU render wrong texture limits.Brecht Van Lommel
2017-10-24Code refactor: move more memory allocation logic into device API.Brecht Van Lommel
* Remove tex_* and pixels_* functions, replace by mem_*. * Add MEM_TEXTURE and MEM_PIXELS as memory types recognized by devices. * No longer create device_memory and call mem_* directly, always go through device_only_memory, device_vector and device_pixels.
2017-10-24Code refactor: use device_only_memory and device_vector in more places.Brecht Van Lommel
2017-10-24Code refactor: store device/interp/extension/type in each device_memory.Brecht Van Lommel
2017-10-24Code refactor: pass device to scene, check OSL with device info.Brecht Van Lommel
2017-10-21Code refactor: avoid some unnecessary device memory copying.Brecht Van Lommel
2017-10-21Cycles: combined CPU + GPU rendering support.Brecht Van Lommel
CPU rendering will be restricted to a BVH2, which is not ideal for raytracing performance but can be shared with the GPU. Decoupled volume shading will be disabled to match GPU volume sampling. The number of CPU rendering threads is reduced to leave one core dedicated to each GPU. Viewport rendering will also only use GPU rendering still. So along with the BVH2 usage, perfect scaling should not be expected. Go to User Preferences > System to enable the CPU to render alongside the GPU. Differential Revision: https://developer.blender.org/D2873
2017-10-08Code refactor: use DeviceInfo to enable QBVH and decoupled volume shading.Brecht Van Lommel
2017-10-07Code refactor: make texture code more consistent between devices.Brecht Van Lommel
* Use common TextureInfo struct for all devices, except CUDA fermi. * Move image sampling code to kernels/*/kernel_*_image.h files. * Use arrays for data textures on Fermi too, so device_vector<Struct> works.
2017-10-05Code refactor: split displace/background into separate kernels, remove luma.Brecht Van Lommel
2017-10-04Code refactor: use split variance calculation for mega kernels too.Brecht Van Lommel
There is no significant difference in denoised benchmark scenes and denoising ctests, so might as well make it all consistent.
2017-10-04Code refactor: remove rng_state buffer and compute hash on the fly.Brecht Van Lommel
A little faster on some benchmark scenes, a little slower on others, seems about performance neutral on average and saves a little memory.
2017-08-25Cycles: Correct logging of sued CPU intrisicsSergey Sharybin
2017-08-08Cycles: Pack kernel textures into buffers for OpenCLMai Lavelle
Image textures were being packed into a single buffer for OpenCL, which limited the amount of memory available for images to the size of one buffer (usually 4gb on AMD hardware). By packing textures into multiple buffers that limit is removed, while simultaneously reducing the number of buffers that need to be passed to each kernel. Benchmarks were within 2%. Fixes T51554. Differential Revision: https://developer.blender.org/D2745
2017-08-07Code refactor: split defines into separate header, changes to SSE type headers.Brecht Van Lommel
I need to use some macros defined in util_simd.h for float3/float4, to emulate SSE4 instructions on SSE2. But due to issues with order of header includes this was not possible, this does some refactoring to make it work. Differential Revision: https://developer.blender.org/D2764
2017-07-05Cycles: Pass string by const reference rather than by valueSergey Sharybin
Some of the functions might have been inlined, but others i don't see how that was possible (don't think virtual functions can be inlined here). In any case, better be explicitly optimal in the code.
2017-07-03Cycles: Add missing split kernel to CPUDeviceLukas Stockner
2017-06-09Cycles Denoising: Merge outlier heuristic and confidence interval testLukas Stockner
The previous outlier heuristic only checked whether the pixel is more than twice as bright compared to the 75% quantile of the 5x5 neighborhood. While this detected fireflies robustly, it also incorrectly marked a lot of legitimate small highlights as outliers and filtered them away. This commit adds an additional condition for marking a pixel as a firefly: In addition to being above the reference brightness, the lower end of the 3-sigma confidence interval has to be below it. Since the lower end approximates how low the true value of the pixel might be, this test separates pixels that are supposed to be very bright from pixels that are very bright due to random fireflies. Also, since there is now a reliable outlier filter as a preprocessing step, the additional confidence interval test in the reconstruction kernel is no longer needed.
2017-05-20Cycles: Cleanup, style and unused argumentsSergey Sharybin
- Some arguments were inapproriatry tagged as unused using (void)foo semantic. Only use such semantic in tricky casses, when something needs to be ignored in release builds or something is dependent on tricky ifndef policy. For rest of the cases just use void foo(int /bar*/) semantic, which ensures variable is not used. Solves confusion and code running out of sync with later development. - Used proper unused semantic to some arguments. - Added braces to make code easier to follow, tricky indentation with ifdef, uh.
2017-05-18Cycles Denoising: Add more robust outlier heuristic to avoid artifactsLukas Stockner
Extremely bright pixels in the rendered image cause the denoising algorithm to produce extremely noticable artifacts. Therefore, a heuristic is needed to exclude these pixels from the filtering process. The new approach calculates the 75% percentile of the 5x5 neighborhood of each pixel and flags the pixel if it is more than twice as bright. During the reconstruction process, flagged pixels are skipped. Therefore, they don't cause any problems for neighboring pixels, and the outlier pixels themselves are replaced by a prediction of their actual value based on their feature pass values and the neighboring pixels. Therefore, the denoiser now also works as a smarter despeckling filter that uses a more accurate prediction of the pixel instead of a simple average. This can be used even if denoising isn't wanted by setting the denoising radius to 1.
2017-05-09Cycles: Properly free memory used by KernelGlobalsSergey Sharybin
Previous logic did not free memory used by vector classes which were storing images, causing memory leaks.
2017-05-07Cycles: Implement denoising option for reducing noise in the rendered imageLukas Stockner
This commit contains the first part of the new Cycles denoising option, which filters the resulting image using information gathered during rendering to get rid of noise while preserving visual features as well as possible. To use the option, enable it in the render layer options. The default settings fit a wide range of scenes, but the user can tweak individual settings to control the tradeoff between a noise-free image, image details, and calculation time. Note that the denoiser may still change in the future and that some features are not implemented yet. The most important missing feature is animation denoising, which uses information from multiple frames at once to produce a flicker-free and smoother result. These features will be added in the future. Finally, thanks to all the people who supported this project: - Google (through the GSoC) and Theory Studios for sponsoring the development - The authors of the papers I used for implementing the denoiser (more details on them will be included in the technical docs) - The other Cycles devs for feedback on the code, especially Sergey for mentoring the GSoC project and Brecht for the code review! - And of course the users who helped with testing, reported bugs and things that could and/or should work better!
2017-05-04Cycles: Fix crash when assigning KernelGlobalsLukas Stockner
The memory isn't initialized during allocation, so calling the assignment operator is a bad idea.
2017-04-07Cycles: Change work pool and global size of split CPU for easier debuggingMai Lavelle
2017-03-29Cycles: Make all #include statements relative to cycles source directorySergey Sharybin
The idea is to make include statements more explicit and obvious where the file is coming from, additionally reducing chance of wrong header being picked up. For example, it was not obvious whether bvh.h was refferring to builder or traversal, whenter node.h is a generic graph node or a shader node and cases like that. Surely this might look obvious for the active developers, but after some time of not touching the code it becomes less obvious where file is coming from. This was briefly mentioned in T50824 and seems @brecht is fine with such explicitness, but need to agree with all active developers before committing this. Please note that this patch is lacking changes related on GPU/OpenCL support. This will be solved if/when we all agree this is a good idea to move forward. Reviewers: brecht, lukasstockner97, maiself, nirved, dingto, juicyfruit, swerner Reviewed By: lukasstockner97, maiself, nirved, dingto Subscribers: brecht Differential Revision: https://developer.blender.org/D2586
2017-03-17Cycles: Silence strict compiler warningSergey Sharybin
2017-03-17Cycles: Improve memory usage of CPU split kernel by using smaller global sizeMai Lavelle
2017-03-11Fix T50888: Numeric overflow in split kernel state buffer size calculationMai Lavelle
Overflow led to the state buffer being too small and the split kernel to get stuck doing nothing forever.
2017-03-08Cycles: Fix indentationMai Lavelle
2017-03-08Cycles: Calculate size of split state buffer kernel sideMai Lavelle
By calculating the size of the state buffer in the kernel rather than the host less code is needed and the size actually reflects the requested features. Will also be a little faster in some cases because of larger global work size.
2017-03-08Cycles: Add names to buffer allocationsMai Lavelle
This is to help debug and track memory usage for generic buffers. We have similar for textures already since those require a name, but for buffers the name is only for debugging proposes.
2017-03-08Cycles: CPU implementation of split kernelMai Lavelle
2017-03-08Cycles: Allow device_memory to be used directlyMai Lavelle
This is useful for when theres no host side memory attched to the buffer
2016-12-03Cycles: Refactor Progress system to provide better estimatesLukas Stockner
The Progress system in Cycles had two limitations so far: - It just counted tiles, but ignored their size. For example, when rendering a 600x500 image with 512x512 tiles, the right 88x500 tile would count for 50% of the progress, although it only covers 15% of the image. - Scene update time was incorrectly counted as rendering time - therefore, the remaining time started very long and gradually decreased. This patch fixes both problems: First of all, the Progress now has a function to ignore time spans, and that is used to ignore scene update time. The larger change is the tile size: Instead of counting samples per tile, so that the final value is num_samples*num_tiles, the code now counts every sample for every pixel, so that the final value is num_samples*num_pixels. Along with that, some unused variables were removed from the Progress and Session classes. Reviewers: brecht, sergey, #cycles Subscribers: brecht, candreacchio, sergey Differential Revision: https://developer.blender.org/D2214
2016-05-31Cycles: Add human readable sizes to debug outputMai Lavelle
Some of these values can get quite large and are hard to read, adding this makes it easy to read them at a glance. Reviewed By: sergey Differential Revision: https://developer.blender.org/D2039
2016-05-18Cycles: Reduce amount of malloc() calls from the kernelSergey Sharybin
This commit makes it so malloc() is only happening once per volume and once per transparent shadow query (per thread), improving scalability of the code to multiple CPU cores. Hard to measure this with a low-bottom i7 here currently, but from quick tests seems volume sampling gave about 3-5% speedup. The idea is to store allocated memory in kernel globals, which are per thread on CPU already. Reviewers: dingto, juicyfruit, lukasstockner97, maiself, brecht Reviewed By: brecht Subscribers: Blendify, nutel Differential Revision: https://developer.blender.org/D1996