git.blender.org/blender.git
Age | Commit message | Author
2017-03-27 | Cycles: First implementation of shadow catcher | Sergey Sharybin
It accumulates all light that could reach the light path (ignoring shadow blocking) as well as the total shaded light along the path. Dividing the second figure by the first seems to give a good estimate of the shadow. In fact, to my knowledge, this is very similar to what happens in the denoising branch, so we are aligned here, which is good.
The workflow is as follows:
- Create an object which matches the real-life object on which the shadow is to be caught.
- Create an approximately similar material on that object. This is needed so that indirect light properly affects CG objects in the scene.
- Mark the object as Shadow Catcher in the Object properties.
Ideally, after doing that, it will be possible to render the image and simply alpha-over it on top of real footage.
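The ratio described above can be illustrated with a small sketch. This is not the Cycles implementation: the function name `shadow_estimate`, the list-of-pairs input, the clamping, and the handling of the zero-light case are all assumptions made for illustration.

```python
def shadow_estimate(contributions, eps=1e-8):
    """Estimate how much light survives shadowing, as a value in [0, 1].

    `contributions` is a list of (unoccluded, shaded) pairs, one per
    light contribution gathered along a path (hypothetical layout).
    """
    total_unoccluded = sum(unoccluded for unoccluded, _ in contributions)
    total_shaded = sum(shaded for _, shaded in contributions)
    if total_unoccluded <= eps:
        # Assumption: where no light could arrive at all, report "no shadow".
        return 1.0
    # Divide the shaded total by the unoccluded total, then clamp.
    return min(max(total_shaded / total_unoccluded, 0.0), 1.0)
```

A value of 1.0 means fully lit and 0.0 means fully shadowed; a compositor could use such a value to darken the real footage under the CG object.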
2017-03-24 | Cycles: Correct isfinite check used in integrator | Sergey Sharybin
Use the fast-math friendly version of this function. We should probably avoid unsafe fast math, but that has to be done with real care and with all the benchmarks properly done. For now, committing a much safer fix.
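The commit does not say how the fast-math friendly check works. One common approach, sketched here purely as an assumption, is to inspect the IEEE 754 bit pattern instead of relying on floating-point comparisons that an optimizer may remove when it assumes no NaN or infinity exists. The name `isfinite_bits` is hypothetical.

```python
import struct


def isfinite_bits(value):
    """Finite check on the IEEE 754 single-precision bit pattern.

    Assumes `value` is representable as a 32-bit float, or is inf/NaN.
    """
    # Reinterpret the float32 encoding as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    # Bits 23..30 hold the exponent; all ones means infinity or NaN.
    exponent = (bits >> 23) & 0xFF
    return exponent != 0xFF
```

Because the test is done on integers, it does not depend on how the compiler treats floating-point special values.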
2017-03-24 | Cycles: Workaround incorrect SSS with CUDA toolkit 8.0.61 | Sergey Sharybin
2017-03-23 | Cycles: Remove unused macro | Sergey Sharybin
2017-03-23 | Cycles: Use SSE-optimized version of triangle intersection for motion triangles | Sergey Sharybin
The title says it all actually. Gives up to 10% speedup on test scenes here on i7-6800K. Render times on GPU are unreliable here, but there might be some slowdown caused by the watertight nature of the intersections.
2017-03-23 | Cycles: Fix speed regression on GPU | Sergey Sharybin
Avoid construction of a temporary array and make the utility function force-inlined. Additionally, avoid calling float4_to_float3 twice. This brings render times back to the same values as before the current patch series.
2017-03-23 | Cycles: Use utility function for SSS triangle intersection | Sergey Sharybin
This effectively de-duplicates the triangle intersection logic implemented for both regular triangles and SSS triangles.
2017-03-23 | Cycles: Move watertight triangle intersection to a utility file | Sergey Sharybin
This way the code can be reused more easily.
2017-03-23 | Cycles: Move triangle intersection precalc to a util file | Sergey Sharybin
This is preparation work for the follow-up commit, which will move the remaining parts of the Woop intersection logic to a utility file. Doing it as a separate commit keeps changes more atomic and easier to bisect when/if needed.
2017-03-23 | Cycles: Cleanup, move utility function to utility file | Sergey Sharybin
Was an old TODO, this function is handy for some math utilities as well.
2017-03-23 | Cycles: Move intersection math to own header file | Sergey Sharybin
This has the following benefits:
- Modifying the intersection algorithm will not cause so much re-compilation.
- It works around header dependency hell and allows us to use vectorization types much more easily in there.
2017-03-23 | Cycles: Cleanup, inline AVX register construction from kernel global data | Sergey Sharybin
There should currently be no functional changes; this prepares for an upcoming refactor.
2017-03-22 | Fix/workaround T50533: Transparency shader doesn't cast shadows with curve segments | Sergey Sharybin
There seems to be a compiler bug in MSVC2013. The issue does not happen on Linux, and does not happen on Windows when building with MSVC2015. Since it's really a pain to debug release builds with MSVC2013, the AVX2 optimization is disabled for curve segments for this compiler.
2017-03-21 | Cycles: Fix building of OpenCL kernels | Mai Lavelle
There's no overloading of functions in OpenCL, so we can't make use of `safe_normalize` with `float2`.
2017-03-20 | Fix T50975: Cycles: Light sampling threshold inadvertently clamps negative lamps | Sergey Sharybin
2017-03-20 | Fix T50990: Random black pixels in Cycles when rendering material with Multiscatter GGX | Sergey Sharybin
2017-03-17 | Cycles: Fix mistake in previous split kernel commits | Sergey Sharybin
Own stupid mistake. Reported by nirved in IRC, thanks!
2017-03-17 | Cycles: Cleanup, indentation | Sergey Sharybin
2017-03-17 | Cycles: Fix compilation error of LCG RNG | Sergey Sharybin
2017-03-17 | Cycles: Fix handling of barriers | Mai Lavelle
2017-03-16 | Cycles: Define ccl_local variables in kernel functions | Sergey Sharybin
Declaring ccl_local in a device function is not supported by certain compilers.
2017-03-16 | Cycles: Workaround for compilation error caused by passing KernelGlobals | Sergey Sharybin
Pass globals as a bare pointer, the same as it used to be prior to the split kernel rework. AMD CPU platform and Intel OpenCL were complaining about this. Perhaps we shouldn't pass globals as a pointer at all; this isn't really portable and could cause issues on 32-bit.
2017-03-16 | Cycles: Avoid some ccl_local in various kernels | Sergey Sharybin
2017-03-14 | Cycles: Try to avoid infinite loops by catching invalid ray states | Mai Lavelle
2017-03-13 | Cycles: Cleanup, wipe obviously outdated parts of split kernel comments | Sergey Sharybin
2017-03-13 | fix msvc warnings about unknown opencl pragmas | lazydodo
2017-03-13 | Cycles: Add missing header in the file | Sergey Sharybin
2017-03-13 | Fix T50925: Add AO approximation to split kernel | Hristo Gueorguiev
2017-03-13 | Cycles: Make MESA compiler more happy | Sergey Sharybin
While this compiler is not officially supported yet, getting it to work is a nice thing because more and more AMD cards will fall under MESA driver. It's also nice to use explicit comparison with NULL, which makes it more clear whether variable is a boolean or pointer. Even Rust enforces this! Patch by Ian Bruce with own modifications.
2017-03-11 | Fix T50888: Numeric overflow in split kernel state buffer size calculation | Mai Lavelle
Overflow led to the state buffer being too small and the split kernel getting stuck doing nothing forever.
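How an overflowing size calculation ends up with a buffer that is too small can be shown with a short sketch. It assumes, without confirmation from the commit, that the size was computed in a 32-bit unsigned integer; the function names and the numbers are hypothetical.

```python
def wrap_uint32(value):
    """Emulate arithmetic that wraps around at 32 bits."""
    return value & 0xFFFFFFFF


def state_buffer_size_32(num_elements, bytes_per_element):
    """Size as a 32-bit multiplication would compute it (may wrap)."""
    return wrap_uint32(num_elements * bytes_per_element)


def state_buffer_size_64(num_elements, bytes_per_element):
    """Size computed without wrapping; Python integers do not overflow."""
    return num_elements * bytes_per_element


# Example: 3 * 2**20 elements of 2048 bytes each need 6 GiB,
# but the wrapped 32-bit result claims only 2 GiB.
elements = 3 * 2**20
true_size = state_buffer_size_64(elements, 2048)
wrapped_size = state_buffer_size_32(elements, 2048)
```

Any allocation based on `wrapped_size` would be far smaller than what the kernels expect to address.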
2017-03-10 | Cycles: Cleanup, extra semicolon and space | Sergey Sharybin
2017-03-10 | Cycles: Enable SSS and volumes for CUDA and Nvidia OpenCL split kernel | Mai Lavelle
2017-03-09 | Cycles: add single program debug option for split kernel | Hristo Gueorguiev
Single program generally compiles kernels faster (2-3 times), loads faster, takes less drive space (2-3 times), and reduces the number of cached kernels.
2017-03-09 | Cycles: split kernel_shadow_blocked to AO & DL parts | Hristo Gueorguiev
Reduces memory allocation for the split kernel. This allows for faster rendering due to a bigger global size, especially when GPU memory is limited.
Performance results, R9 290 total render time:
Scene                     Before    After    Change
BMW                         4:37     4:34    -1.1 %
Classroom                  14:43    14:30    -1.5 %
Fishy Cat                  11:20    11:04    -2.4 %
Koro                       12:11    12:04    -1.0 %
Pabellon Barcelona         22:01    20:44    -5.8 %
Pabellon Barcelona(*)      15:32    15:09    -2.5 %
(*) without glossy connected to volume
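The Change figures above follow from the Before and After times. A small sketch for reproducing them, with hypothetical helper names `to_seconds` and `percent_change`:

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' string such as '4:37' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def percent_change(before, after):
    """Relative change of render time in percent, rounded to one decimal."""
    before_s = to_seconds(before)
    after_s = to_seconds(after)
    return round((after_s - before_s) / before_s * 100.0, 1)
```

For instance, `percent_change("22:01", "20:44")` compares 1321 s with 1244 s, which matches the Pabellon Barcelona row.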
2017-03-09 | Cycles: Speedup transparent shadows in split kernel | Hristo Gueorguiev
This commit enables record-all transparent shadow rays.
Performance results, R9 290 render time (without synchronization), in seconds:
Scene                     Before     After    Change
BMW                        261.5     262.5    +0.4 %
Classroom                  869.6     867.3    -0.3 %
Fishy Cat                  657.4     639.8    -2.7 %
Koro                      1909.8     692.8   -63.7 %
Pabellon Barcelona        1633.3    1238.0   -24.2 %
Pabellon Barcelona(*)     1158.1     903.8   -22.0 %
(*) without glossy connected to volume
2017-03-09 | Cycles: SSS and Volume rendering in split kernel | Hristo Gueorguiev
Decoupled ray marching is not supported yet. Transparent shadows are always enabled for volume rendering. Changes in kernel/bvh and kernel/geom are from Sergey. This simplifies the code significantly and prepares it for the record-all transparent shadow function in the split kernel.
2017-03-09 | Cycles: Fix CUDA build error for some compilers | Mai Lavelle
Needed to include `util_types.h` before using `uint`.
2017-03-08 | Cycles: Make it possible to access KernelGlobals from split data initialization function | Sergey Sharybin
2017-03-08 | Cycles: Cleanup, remove residue of previous split kernel data | Sergey Sharybin
This is all in the split data state array.
2017-03-08 | Cycles: Fix indentation | Mai Lavelle
2017-03-08 | Cycles: Fix strict warning about unused variable | Mai Lavelle
2017-03-08 | Cycles: Calculate size of split state buffer kernel side | Mai Lavelle
By calculating the size of the state buffer in the kernel rather than on the host, less code is needed and the size actually reflects the requested features. This will also be a little faster in some cases because of the larger global work size.
2017-03-08 | Cycles: Initialize rng_state for split kernel | Mai Lavelle
Because the split kernel can render multiple samples in parallel, it is necessary to have everything initialized before rendering of any samples begins. The code that normally handles initialization of `rng_state` (`kernel_path_trace_setup()`) only does so for the first sample, which was causing artifacts in the split kernel due to uninitialized `rng_state` for some samples. Note that because the split kernel can render samples in parallel, the split kernel is incompatible with the LCG.
2017-03-08 | Cycles: Remove sum_all_radiance kernel | Mai Lavelle
This was only needed for the previous implementation of parallel samples. As we don't have that any more, it can be removed. The real reason for removal, though, is this: `per_sample_output_buffers` was being calculated too small and artifacts resulted. The tile buffer is already the correct size, and calculating the size for `per_sample_output_buffers` is a bit difficult with the current layout of the code. As `per_sample_output_buffers` was only needed for `sum_all_radiance`, removing that kernel and writing output to the tile buffer directly fixes the artifacts.
2017-03-08 | Cycles: Split path initialization into own kernel | Mai Lavelle
This makes it easier to initialize things correctly in the data_init kernel before they are needed by path tracing.
2017-03-08 | Cycles: CUDA implementation of split kernel | Mai Lavelle
2017-03-08 | Cycles: CPU implementation of split kernel | Mai Lavelle
2017-03-08 | Cycles: Remove ccl_fetch and SOA | Mai Lavelle
2017-03-08 | Cycles: OpenCL split kernel refactor | Mai Lavelle
This does a few things at once:
- Refactors host side split kernel logic into a new device agnostic class `DeviceSplitKernel`.
- Removes tile splitting. A new work pool implementation takes its place and allows as many threads as will fit in memory regardless of tile size, which can give performance gains.
- Refactors split state buffers into one buffer, and reduces the number of arguments passed to kernels. This means there's less code to deal with overall.
- Moves kernel logic out of OpenCL kernel files so it can later be used by other device types.
- Replaces OpenCL specific APIs with new generic versions.
- Tiles can now be seen updating during rendering.
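The commit does not show how the new work pool hands out work, so the following is only a rough sketch of the general idea: work is drawn from a shared counter in batches sized by what fits in memory rather than by tile size. `WorkPool`, `take`, and `schedule` are hypothetical names, not part of Cycles.

```python
class WorkPool:
    """Hands out contiguous ranges of work items until none are left."""

    def __init__(self, total_items):
        self.total_items = total_items
        self.next_item = 0

    def take(self, max_items):
        """Return (start, count) for the next batch; count is 0 when empty."""
        start = self.next_item
        count = min(max_items, self.total_items - start)
        self.next_item += count
        return start, count


def schedule(total_items, items_that_fit):
    """List the batches a device would process for a given memory budget."""
    pool = WorkPool(total_items)
    batches = []
    while True:
        start, count = pool.take(items_that_fit)
        if count == 0:
            break
        batches.append((start, count))
    return batches
```

With this layout, the batch size depends only on `items_that_fit`, so a device with more free memory simply takes larger batches from the same pool.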
2017-03-08 | Cycles: Add OpenCL kernel for zeroing memory buffers | Mai Lavelle
Transferring memory to the device was very slow and there's really no need when only zeroing a buffer.