Welcome to mirror list, hosted at ThFree Co, Russian Federation.

git.blender.org/blender.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-03-19Fix T54317: overlapping volume render bug after recent changes.Brecht Van Lommel
Increasing the samplig dimensions like this is not optimal, I'm looking into some deeper changes to reuse the random number and change the RR probabilities, but this should fix the bug for now.
2018-03-01Cycles: don't count volume boundaries as transparent bounces.Brecht Van Lommel
This is more important now that we will have tigther volume bounds that we hit multiple times. It also avoids some noise due to RR previously affecting these surfaces, which shouldn't have been the case and should eventually be fixed for transparent BSDFs as well. For non-volume scenes I found no performance impact on NVIDIA or AMD. For volume scenes the noise decrease and fixed artifacts are worth the little extra render time, when there is any.
2018-01-23Fix T53854: branched path tracing correlation bug with transparency in split ↵Brecht Van Lommel
kernel.
2017-11-16Cycles: Fix crash with split branched path tracingMai Lavelle
ShaderData memory was getting clobbered in the branched path code paths. Was caused by 087331c495b04ebd37903c0dc0e46262354cf026
2017-11-09Cycles: Replace __MAX_CLOSURE__ build option with runtime integrator variableMai Lavelle
Goal is to reduce OpenCL kernel recompilations. Currently viewport renders are still set to use 64 closures as this seems to be faster and we don't want to cause a performance regression there. Needs to be investigated. Reviewed By: brecht Differential Revision: https://developer.blender.org/D2775
2017-09-28Cycles: reduce subsurface stack memory usage.Brecht Van Lommel
This is done by storing only a subset of PathRadiance, and by storing direct light immediately in the main PathRadiance. Saves about 10% of CUDA stack memory, and simplifies subsurface indirect ray code.
2017-08-19Code cleanup: move rng into path state.Brecht Van Lommel
Also pass by value and don't write back now that it is just a hash for seeding and no longer an LCG state. Together this makes CUDA a tiny bit faster in my tests, but mainly simplifies code.
2017-06-10Cycles: Faster split branched path tracing by sharing samples with inactive ↵Mai Lavelle
threads Unlike regular path tracing, branched path tracing is usually used with lower sample counts, at least for primary rays. This means that are less samples for the GPU to work on in parallel and rendering is slower. As there is less work overall there is also more inactive threads during rendering with BPT. This patch makes use of those inactive rays to render branched samples in parallel with other samples. Each thread that is preparing for a branched sample will attempt to find an inactive thread and if one is found the state for the sample is copied to that thread. Potentially, if there are enough inactive threads, 100s of branched samples could be generated from the same originating thread and ran in parallel giving large speed ups. Gives 70% faster render for pavillion midday scene. 20-60% faster on BMW with car paint replaced with SSS/volumes.
2017-05-02Cycles: Branched path tracing for the split kernelMai Lavelle
This implements branched path tracing for the split kernel. General approach is to store the ray state at a branch point, trace the branched ray as normal, then restore the state as necessary before iterating to the next part of the path. A state machine is used to advance the indirect loop state, which avoids the need to add any new kernels. Each iteration the state machine recreates as much state as possible from the stored ray to keep overall storage down. Its kind of hard to keep all the different integration loops in sync, so this needs lots of testing to make sure everything is working correctly. We should probably start trying to deduplicate the integration loops more now. Nonbranched BMW is ~2% slower, while classroom is ~2% faster, other scenes could use more testing still. Reviewers: sergey, nirved Reviewed By: nirved Subscribers: Blendify, bliblubli Differential Revision: https://developer.blender.org/D2611
2017-03-27Cycles: Add OpenCL support for shadow catcher featureHristo Gueorguiev
The title says it all actually.
2017-03-27Cycles: Remove ccl_addr_space from RNG passed to functionsHristo Gueorguiev
Simplifies code quite a bit, making it shorter and easier to extend. Currently no functional changes for users, but is required for the upcoming work of shadow catcher support with OpenCL.
2017-03-16Cycles: Define ccl_local variables in kernel functionsSergey Sharybin
Declaring ccl_local in a device function is not supported by certain compilers.
2017-03-13Cycles: Cleanup, wipe obviously outdated parts of split kernel commentsSergey Sharybin
2017-03-08Cycles: Remove ccl_fetch and SOAMai Lavelle
2017-03-08Cycles: OpenCL split kernel refactorMai Lavelle
This does a few things at once: - Refactors host side split kernel logic into a new device agnostic class `DeviceSplitKernel`. - Removes tile splitting, a new work pool implementation takes its place and allows as many threads as will fit in memory regardless of tile size, which can give performance gains. - Refactors split state buffers into one buffer, as well as reduces the number of arguments passed to kernels. Means there's less code to deal with overall. - Moves kernel logic out of OpenCL kernel files so they can later be used by other device types. - Replaced OpenCL specific APIs with new generic versions - Tiles can now be seen updating during rendering
2016-09-19Cycles: Cleanup code style in split kernelSergey Sharybin
2016-02-03Cycles: Cleanup, indentation and bracesSergey Sharybin
2015-11-01Cycles: Remove unused argument from the split kernel functionsSergey Sharybin
Should be no functional changes, just simplifies operation with kernels.
2015-10-29Cycles: OpenCL split kernel cleanup, move casts from .h files to .cl filesSergey Sharybin
Ideally we shouldn't use char* at all, but for now we have to, so at least let's assume common .h files are free from pointer magic.
2015-07-03Cycles: Code cleanup in split kernel, whitespacesSergey Sharybin
2015-05-27Cycles: Code cleanup, split kernelSergey Sharybin
2015-05-26Fix T44833: Can't use ccl_local space in non-kernel functionsSergey Sharybin
This commit re-shuffles code in split kernel once again and makes it so common parts which is in the headers is only responsible to making all the work needed for specified ray index. Getting ray index, checking for it's validity and enqueuing tasks are now happening in the device specified part of the kernel. This actually makes sense because enqueuing is indeed device-specified and i.e. with CUDA we'll want to enqueue kernels from kernel and avoid CPU roundtrip. TODO: - Kernel comments are still placed in the common header files, but since queue related stuff is not passed to those functions those comments might need to be split as well. Just currently read them considering that they're also covering the way how all devices are invoking the common code path. - Arguments might need to be wrapped into KernelGlobals, so we don't ened to pass all them around as function arguments.
2015-05-25Fix T44833, OpenCL compile error on AMD.Thomas Dinges
This was broken after the kernel file restructure. Variables allocated in the __local address space can only be defined inside a __kernel function. We probably need to solve this a bit differently once we do the CUDA kernel split, but this fix shoud be good enough until then.
2015-05-22Cycles: Restructure kernel files organizationSergey Sharybin
Since the kernel split work we're now having quite a few of new files, majority of which are related on the kernel entry points. Keeping those files in the root kernel folder will eventually make it really hard to follow which files are actual implementation of Cycles kernel. Those files are now moved to kernel/kernels/<device_type>. This way adding extra entry points will be less noisy. It is also nice to have all device-specific files grouped together. Another change is in the way how split kernel invokes logic. Previously all the logic was implemented directly in the .cl files, which makes it a bit tricky to re-use the logic across other devices. Since we'll likely be looking into doing same split work for CUDA devices eventually it makes sense to move logic from .cl files to header files. Those files are stored in kernel/split. This does not mean the header files will not give error messages when tried to be included from other devices and their arguments will likely be changed, but having such separation is a good start anyway. There should be no functional changes. Reviewers: juicyfruit, dingto Differential Revision: https://developer.blender.org/D1314