git.blender.org/blender.git
AgeCommit messageAuthor
2017-09-23Cycles: Fix compilation error of OpenCL megakernel on AppleSergey Sharybin
2017-09-20Cycles: slightly improve BSDF sample stratification for path tracing.Brecht Van Lommel
Similar to what we did for area lights previously, this should help preserve stratification when using multiple BSDFs in theory. Improvements are not easily noticeable in practice though, because the number of BSDFs is usually low. Still nice to eliminate one sampling dimension.
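One way to eliminate a sampling dimension when choosing among several BSDFs is to reuse the random number that picked the BSDF, after rescaling it back to [0, 1). The following is a minimal Python sketch of that general idea, not the Cycles kernel code; `pick_and_rescale` and its arguments are hypothetical names.

```python
# Largest double strictly below 1.0, used to keep the result inside [0, 1).
ONE_MINUS_EPSILON = 1.0 - 2.0 ** -53


def pick_and_rescale(u, weights):
    """Pick an index with probability proportional to `weights` using u in [0, 1).

    Returns (index, u_rescaled). u_rescaled is u remapped from the chosen
    interval back onto [0, 1), so it can be reused as a sampling dimension.
    """
    if not weights or any(w <= 0.0 for w in weights):
        raise ValueError("weights must be non-empty and positive")
    total = sum(weights)
    lower = 0.0
    last = len(weights) - 1
    for i, w in enumerate(weights):
        upper = lower + w / total
        # The `i == last` guard absorbs floating-point round-off in `upper`.
        if u < upper or i == last:
            rescaled = (u - lower) / (upper - lower)
            return i, min(max(rescaled, 0.0), ONE_MINUS_EPSILON)
        lower = upper


if __name__ == "__main__":
    # Hypothetical weights: the second BSDF is three times as likely.
    print(pick_and_rescale(0.125, [1.0, 3.0]))
    print(pick_and_rescale(0.625, [1.0, 3.0]))
```

Because the rescaled value stays stratified within each chosen interval, a stratified input sequence remains stratified per BSDF.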
2017-09-16Cycles: Fix compilation error with OpenCL split kernelHristo Gueorguiev
2017-09-12Cycles: Tweaks to avoid compilation error of megakernelSergey Sharybin
Also moved code out of a deeply nested ifdef block, where it was quite confusing.
2017-09-05Cycles: Fix compilation error with CUDA after recent changesSergey Sharybin
2017-09-05Fix T52433: Volume Absorption color tintSergey Sharybin
Need to exit the volume stack when the shadow ray leaves the medium. Thanks Brecht for review and help in troubleshooting!
2017-09-05Cycles: Cleanup, styleSergey Sharybin
2017-08-24Code cleanup: remove shader context.Brecht Van Lommel
This was needed when we accessed OSL closure memory after shader evaluation, which could get overwritten by another shader evaluation. But all closures are immediately converted to ShaderClosure now, so it is no longer needed.
2017-08-19Code cleanup: move rng into path state.Brecht Van Lommel
Also pass by value and don't write back now that it is just a hash for seeding and no longer an LCG state. Together this makes CUDA a tiny bit faster in my tests, but mainly simplifies code.
2017-08-07Fix Cycles shadow catcher objects influencing each other.Brecht Van Lommel
Since all the shadow catchers are already assumed to be in the footage, the shadows they cast on each other are already in the footage too. So don't just let shadow catchers skip self, but all shadow catchers. Another justification is that it should not matter if the shadow catcher is modeled as one object or multiple separate objects, the resulting render should be the same. Differential Revision: https://developer.blender.org/D2763
2017-04-18Cycles: Cleanup, styleSergey Sharybin
2017-04-07Cycles: Fix indentationMai Lavelle
2017-03-27Cycles: First implementation of shadow catcherSergey Sharybin
It accumulates all light that could possibly be reached across the light path (without taking shadow blocking into account) and the total shaded light across the path. Dividing the second figure by the first seems to give a good estimate of the shadow. In fact, to my knowledge, it's something really similar to what is happening in the denoising branch, so we are aligned here, which is good.
The workflow is as follows:
- Create an object which matches the real-life object on which the shadow is to be caught.
- Create an approximately similar material on that object. This is needed to make indirect light properly affect CG objects in the scene.
- Mark the object as Shadow Catcher in the Object properties.
Ideally, after doing that it will be possible to render the image and simply alpha-over it on top of real footage.
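The ratio described in this commit message can be sketched as follows. This is an illustrative Python sketch only, not the kernel implementation; `shadow_catcher_factor` and the sample layout are hypothetical, and returning 1.0 when no light is reachable is an assumption made here.

```python
def shadow_catcher_factor(samples):
    """Estimate the shadow as shaded light divided by unoccluded light.

    `samples` is an iterable of (unoccluded, shaded) pairs: the light that
    could reach the surface ignoring blockers, and the light that actually
    arrived once blockers were taken into account.
    """
    total_unoccluded = 0.0
    total_shaded = 0.0
    for unoccluded, shaded in samples:
        total_unoccluded += unoccluded
        total_shaded += shaded
    if total_unoccluded <= 0.0:
        # Assumption: with no reachable light there is nothing to shadow.
        return 1.0
    return total_shaded / total_unoccluded


if __name__ == "__main__":
    # Two light samples of equal strength, one of them fully blocked.
    print(shadow_catcher_factor([(1.0, 1.0), (1.0, 0.0)]))
```

A factor of 1.0 means no shadow and 0.0 means fully shadowed, which is the kind of value that can drive the alpha-over step mentioned in the workflow.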
2017-03-09Cycles: Speedup transparent shadows in split kernelHristo Gueorguiev
This commit enables record-all transparent shadow rays.
Performance results, R9 290 render time (without synchronization), seconds:
Scene                      Before     After    Change
BMW                         261.5     262.5    +0.4 %
Classroom                   869.6     867.3    -0.3 %
Fishy Cat                   657.4     639.8    -2.7 %
Koro                       1909.8     692.8   -63.7 %
Pabellon Barcelona         1633.3    1238.0   -24.2 %
Pabellon Barcelona(*)      1158.1     903.8   -22.0 %
(*) without glossy connected to volume
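The Change column is the relative difference between the two timings. A small Python check of that arithmetic, using only the numbers from the results above (`percent_change` is a helper name introduced here):

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Render times in seconds, copied from the results above.
timings = {
    "BMW": (261.5, 262.5),
    "Classroom": (869.6, 867.3),
    "Fishy Cat": (657.4, 639.8),
    "Koro": (1909.8, 692.8),
    "Pabellon Barcelona": (1633.3, 1238.0),
    "Pabellon Barcelona(*)": (1158.1, 903.8),
}

if __name__ == "__main__":
    for scene, (before, after) in timings.items():
        print(f"{scene}: {percent_change(before, after):+.1f} %")
```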
2017-03-09Cycles: SSS and Volume rendering in split kernelHristo Gueorguiev
Decoupled ray marching is not supported yet. Transparent shadows are always enabled for volume rendering. Changes in kernel/bvh and kernel/geom are from Sergey. This simplifies code significantly, and prepares it for the record-all transparent shadow function in the split kernel.
2017-03-08Cycles: Remove ccl_fetch and SOAMai Lavelle
2017-03-08Cycles: OpenCL split kernel refactorMai Lavelle
This does a few things at once:
- Refactors host side split kernel logic into a new device agnostic class `DeviceSplitKernel`.
- Removes tile splitting; a new work pool implementation takes its place and allows as many threads as will fit in memory regardless of tile size, which can give performance gains.
- Refactors split state buffers into one buffer, as well as reduces the number of arguments passed to kernels. Means there's less code to deal with overall.
- Moves kernel logic out of OpenCL kernel files so they can later be used by other device types.
- Replaces OpenCL specific APIs with new generic versions.
- Tiles can now be seen updating during rendering.
2017-02-08Cycles: Fix regression with transparent shadows in volumeSergey Sharybin
2017-02-08Cycles: Solve speed regression by casting opaque ray firstSergey Sharybin
2017-02-08Cycles: Fix compilation error on OpenCLSergey Sharybin
2017-02-08Cycles: Split shadow functions to avoid some duplicated calculationsSergey Sharybin
2017-02-08Cycles: Store shadow intersections in the kernel globalsSergey Sharybin
It seems CUDA failed to de-duplicate the array across multiple inlined versions of shadow_blocked(), so it now gets a bit of help with that. This gives about a 100MB memory improvement on scenes after the previous commit and leaves the memory "regression" at only 100MB compared to the master branch.
2017-02-08Cycles: Implement record-all transparent shadow function for GPUSergey Sharybin
The idea is to record all possible transparent intersections when shooting a transparent ray on GPU (similar to what we were already doing on CPU). This avoids the need for a whole ray-to-scene intersection query for each intersection and greatly speeds up cases like transparent hair, at the cost of extra memory. This commit only lays the groundwork; the feature is kept disabled until some further tweaks are made.
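The record-all approach can be illustrated with a short Python sketch: all hits are gathered in one pass, sorted by distance, and the throughput is attenuated hit by hit. This is a simplified illustration under stated assumptions, not the kernel code; `shadow_throughput` is a hypothetical name and each hit is reduced to a scalar transmittance.

```python
def shadow_throughput(hits):
    """Attenuate a shadow ray through pre-recorded intersections.

    `hits` is a list of (distance, transmittance) pairs collected in a
    single traversal. A transmittance of 0.0 stands for an opaque surface.
    """
    throughput = 1.0
    # Visit hits front to back so an early opaque hit ends the walk sooner.
    # (For scalar products the order does not change the result; it would
    # matter once attenuation between hits, e.g. volumes, is added.)
    for _distance, transmittance in sorted(hits, key=lambda hit: hit[0]):
        throughput *= transmittance
        if throughput <= 0.0:
            return 0.0
    return throughput


if __name__ == "__main__":
    # Two half-transparent hair strands recorded out of order.
    print(shadow_throughput([(2.0, 0.5), (1.0, 0.5)]))
```

The trade-off named in the commit shows up here as the `hits` list: memory for every recorded intersection is paid up front in exchange for a single traversal.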
2017-02-08Cycles: Use an utility function to sort intersections arraySergey Sharybin
2017-02-08Cycles: Make GPU version of shadow_blocked() closer to CPUSergey Sharybin
Now we break the traversal cycle and then perform volume attenuation and check with zero throughput. Not sure it makes any measurable sense at this moment, but in the future it might help de-duplicating some extra logic here.
2017-02-08Cycles: De-duplicate transparent shadows attenuationSergey Sharybin
Fair amount of code was duplicated for CPU and GPU, now we are using inlined function to avoid such duplication.
2016-10-03Fix Cycles CUDA performance on CUDA 8.0.Brecht Van Lommel
Mostly this is making inlining match CUDA 7.5 in a few performance critical places. The end result is that performance is now better than before, possibly due to less register spilling or other CUDA 8.0 compiler improvements. On benchmark scenes, there are 3% to 35% render time reductions. Stack memory usage is reduced a little too. Reviewed By: sergey Differential Revision: https://developer.blender.org/D2269
2016-09-21Cycles: Make code more uniform across two versions of shadow_blocked()Sergey Sharybin
Just to make it easier to research ways of possible code de-duplication.
2016-09-21Cycles: Remove out of date commentSergey Sharybin
2016-07-26Cycles: Revert previous fixes to intersect_all functionsSergey Sharybin
While they prevented a legitimate write-past-the-array-boundary error, those fixes introduced a regression in behavior when there are exactly max_hits transparent intersections and nothing else. The previous code would have considered such a case totally opaque, which is not correct. Fixes T48941: Some materials don't get transparent shadows anymore
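The edge case described here can be made concrete with a small Python sketch. It reflects one reading of the commit message, not the actual intersect_all code; `is_blocked` is a hypothetical name. The point is that exactly max_hits transparent hits must not be treated as opaque, while one more hit than can be recorded has to be.

```python
def is_blocked(hit_is_opaque, max_hits):
    """Decide whether a shadow ray is fully blocked.

    `hit_is_opaque` holds one boolean per intersection along the ray.
    At most `max_hits` transparent intersections can be recorded.
    """
    recorded = 0
    for opaque in hit_is_opaque:
        if opaque:
            return True
        if recorded == max_hits:
            # One more transparent hit than there is room for:
            # give up and treat the ray as blocked.
            return True
        recorded += 1
    # Everything fit, including the case recorded == max_hits.
    return False


if __name__ == "__main__":
    print(is_blocked([False] * 4, 4))  # exactly max_hits transparent hits
    print(is_blocked([False] * 5, 4))  # one hit too many
```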
2016-07-14Cycles: Fix wrong termination criteria in intersect_all functionsSergey Sharybin
It was possible to miss the bounce termination criteria in these functions, mainly when max_hits was set to 0. Made the check more robust in traversal functions (which should not affect performance, it's an operation of the same complexity AFAIK). Also avoid doing ray-scene intersection from shadow_blocked when the limit of transparent bounces was already reached.
2016-06-23Cycles: Add multi-scattering, energy-conserving GGX as an option to the Glossy, Anisotropic and Glass BSDFsLukas Stockner
This commit adds a new distribution to the Glossy, Anisotropic and Glass BSDFs that implements the multiple-scattering microfacet model described in the paper "Multiple-Scattering Microfacet BSDFs with the Smith Model". Essentially, the improvement is that unlike classical GGX, which only models single scattering and assumes the contribution of multiple bounces to be zero, this new model performs a random walk on the microsurface until the ray leaves it again, which ensures perfect energy conservation. In practise, this means that the "darkening problem" - GGX materials becoming darker with increasing roughness - is solved in a physically correct and efficient way. The downside of this model is that it has no (known) analytic expression for evaluation. However, it can be evaluated stochastically, and although the correct PDF isn't known either, the properties of MIS and the balance heuristic guarantee an unbiased result at the cost of slightly higher noise. Reviewers: dingto, #cycles, brecht Reviewed By: dingto, #cycles, brecht Subscribers: bliblubli, ace_dragon, gregzaal, brecht, harvester, dingto, marcog, swerner, jtheninja, Blendify, nutel Differential Revision: https://developer.blender.org/D2002
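For reference, the balance heuristic mentioned above weights each sampling strategy by its share of the combined PDF. Below is a generic two-strategy Python sketch of that standard formula; it does not show how the multi-scattering GGX closure plugs into it, and `balance_heuristic` is simply a name chosen here.

```python
def balance_heuristic(pdf_a, pdf_b):
    """MIS weight for a sample drawn from strategy A, with B as the alternative."""
    total = pdf_a + pdf_b
    if total <= 0.0:
        # Neither strategy could have produced the sample.
        return 0.0
    return pdf_a / total


if __name__ == "__main__":
    w_a = balance_heuristic(1.0, 3.0)
    w_b = balance_heuristic(3.0, 1.0)
    # The two weights for the same sample always sum to one.
    print(w_a, w_b, w_a + w_b)
```

Because the weights of all strategies sum to one for any given sample, the combined estimator stays unbiased as long as the weights themselves are computed consistently, which is the property the commit message relies on.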
2016-05-24Fix T48508: Cycles Regression / CrashSergey Sharybin
2016-05-23Cycles CUDA: reduce stack memory by reusing ShaderData.Brecht Van Lommel
57% less for path and 48% less for branched path.
2016-05-18Cycles: Reduce amount of malloc() calls from the kernelSergey Sharybin
This commit makes it so malloc() is only happening once per volume and once per transparent shadow query (per thread), improving scalability of the code to multiple CPU cores. Hard to measure this with a low-bottom i7 here currently, but from quick tests seems volume sampling gave about 3-5% speedup. The idea is to store allocated memory in kernel globals, which are per thread on CPU already. Reviewers: dingto, juicyfruit, lukasstockner97, maiself, brecht Reviewed By: brecht Subscribers: Blendify, nutel Differential Revision: https://developer.blender.org/D1996
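The pattern of keeping an allocation in per-thread state and reusing it across queries can be sketched in Python with `threading.local`. This is an analogy only, since the kernel stores the memory in its per-thread kernel globals; `get_scratch` and `_thread_state` are hypothetical names.

```python
import threading

# Per-thread storage, standing in for the per-thread kernel globals.
_thread_state = threading.local()


def get_scratch(size):
    """Return a per-thread scratch buffer of at least `size` bytes.

    The buffer is allocated on first use in each thread and reused on
    later calls; it is only reallocated when a larger size is requested.
    """
    buf = getattr(_thread_state, "scratch", None)
    if buf is None or len(buf) < size:
        buf = bytearray(size)
        _thread_state.scratch = buf
    return buf


if __name__ == "__main__":
    first = get_scratch(16)
    second = get_scratch(8)
    print(first is second)  # the same buffer is reused within a thread
```

Since every thread owns its buffer, no locking is needed around it, which is what helps scalability across CPU cores.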
2016-01-30Cycles: Cleanup of OpenCL split kernel routinesSergey Sharybin
The idea is to switch from allocating separate buffers for shader data's structure of arrays to allocating one huge memory block and doing some index trickery to access it as SOA. This saves a reasonable number of lines of code in device_opencl and also makes it possible to get rid of the special declaration of the ShaderData structure. As a side effect it also makes it easier to experiment with SOA vs. AOS for the split kernel. Works fine here on NVidia GTX580, Intel CPU and AMD Fiji cards. Reviewers: #cycles, brecht, juicyfruit, dingto Differential Revision: https://developer.blender.org/D1593
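The "index trickery" can be illustrated with a flat buffer and two index functions, one per layout. This is a generic Python sketch of SOA versus AOS addressing, not the device_opencl code; the function names and the three-field layout are made up for the example.

```python
def soa_index(field, element, num_elements):
    """Structure of arrays: all values of one field are stored contiguously."""
    return field * num_elements + element


def aos_index(field, element, num_fields):
    """Array of structures: all fields of one element are stored contiguously."""
    return element * num_fields + field


if __name__ == "__main__":
    num_fields, num_elements = 3, 4  # hypothetical: fields x, y, z for 4 threads
    block = [0.0] * (num_fields * num_elements)  # one big allocation

    # Write field 1 of element 2 using the SOA layout.
    block[soa_index(1, 2, num_elements)] = 42.0
    print(block)
```

Switching between the two layouts only requires swapping the index function, which is why a single block makes SOA versus AOS experiments easier.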
2016-01-28Cycles: Remove few function arguments needed only for the split kernelSergey Sharybin
Use KernelGlobals to access all the global arrays for the intermediate storage instead of passing all this storage explicitly. Tested here with Intel OpenCL, NVIDIA GTX580 and AMD Fiji, didn't see any artifacts, so guess it's all good. Reviewers: juicyfruit, dingto, lukasstockner97 Differential Revision: https://developer.blender.org/D1736
2016-01-14Cycles: Tweak inline policy for some functionsSergey Sharybin
The goal is to make the Experimental kernel closer in performance to the official kernel, avoiding spills and such. There should not be a big impact on the official kernel; my own tests showed a few percent performance drop on a laptop's GPU. CPU was always the same speed on the AVX, AVX2 and SSE4.1 CPUs I've been testing here. This seems to be the last essential step before we can get rid of the Experimental kernel and enable SSS officially on GPU without causing major performance issues. Surely some more tweaks may be required, but that we can do until the cows come home anyway.
2016-01-07Cycles: Refactor how we pass bounce info to light path node.Thomas Dinges
This commit changes how we pass bounce information to the Light Path node. Instead of manually copying the bounces into ShaderData, we now directly pass PathState. This reduces the arguments that we need to pass around and also makes it easier to extend the feature. This commit also exposes the Transmission Bounce Depth to the Light Path node. It works similarly to the Transparent Depth Output: Replace a Transmission lightpath after X bounces with another shader, e.g. a Diffuse one. This can be used to avoid black surfaces caused by a low number of max bounces. Reviewed by Sergey and Brecht, thanks for some help with this. I tested compilation and usage on CPU (SVM and OSL), CUDA, OpenCL Split and Mega kernel. Hopefully this covers all devices. :)
2015-05-21Cleanup: Remove some outdated comments related to split kernel.Thomas Dinges
2015-05-09Cycles: OpenCL kernel splitGeorge Kyriazis
This commit contains all the work related to the AMD megakernel split work which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely someone else whom we're forgetting to mention. Currently only AMD cards are enabled for the new split kernel, but it is possible to force the split OpenCL kernel to be used by setting the following environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1. Not all features are supported yet: no motion blur, camera blur, SSS or volumetrics for now. Also, transparent shadows are disabled on AMD devices because of a compiler bug. This kernel also only implements regular path tracing, and supporting the branched one will take a while. Branched path tracing is still exposed in the interface, which is a bit misleading, and will be hidden there soon. More features will be enabled once they're ported to the split kernel and tested. Neither regular CPU nor CUDA is affected; they generate exactly the same code, which means no regressions or improvements there. Based on the research paper: https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf Here's the documentation: https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit Design discussion of the patch: https://developer.blender.org/T44197 Differential Revision: https://developer.blender.org/D1200
2015-04-30Cycles: Record all possible volume intersections for SSS and camera checksThomas Dinges
This replaces sequential ray moving followed with scene intersection with single BVH traversal, which gives us all possible intersections. Only implemented for CPU, due to qsort and a bigger memory usage on GPU which we rather avoid. GPU still uses the regular bvh volume intersection code, while CPU now uses the new code. This improves render performance for scenes with: a) Camera inside volume mesh b) SSS mesh intersecting a volume mesh/domain In simple volume files (not much geometry) performance is roughly the same (slightly faster). In files with a lot of geometry, the performance increase is larger. bmps.blend with a volume shader and camera inside the mesh, it renders ~10% faster here. Patch by Sergey and myself. Differential Revision: https://developer.blender.org/D1264
2014-12-25Cleanup: Fix Cycles Apache header.Thomas Dinges
This was already mixed a bit, but the dot belongs there.
2014-09-26Cycles: Keep STACK_MAX_HITS private in kernel_shadowSergey Sharybin
This way adding record_all for other things becomes easier and doesn't lead to naming conflicts.
2014-09-24Cleanup: Avoid some defines for scene_intersect(), related to Min Width.Thomas Dinges
2014-06-30Condition was inverted in the previous transparent shadows commitSergey Sharybin
Textbook example of what happens when you've got loads of patches and don't double-check things before committing.
2014-06-30Fix T40836: Cycles volume scattering shader crashSergey Sharybin
Volume scatter might happen before path termination, so we need to check transparent bounces and consider the shadow opaque when max transparent bounces are reached. TODO: CPU code seems to have different branching in its conditions, which made me think it does different things with volume attenuation, but from the render results it seems exactly the same things are happening there. Worth looking into simplifying the code a bit here to improve readability.
2014-05-21Fix T40289: Cycles leaking memoryCampbell Barton
error in recent commit
2014-05-15Fix cycles bug with new transparent shadow code, giving too much volume shadow.Brecht Van Lommel
2014-05-04Style cleanup: indentation, bracesCampbell Barton