git.blender.org/blender.git
Age, Commit message, Author
2018-02-18Code cleanup: remove some more unused code after recent CUDA changes.Brecht Van Lommel
2018-02-18Cycles: Remove fermi related defines from the code.Thomas Dinges
Did not touch Texture related defines, that comes next.
2018-02-14Cycles: restore Particle Info Index for now, keep it next to Random.Brecht Van Lommel
It still seems to be useful in cases where the particles are distributed in a particular order or pattern, to colorize them along with that. This isn't really well defined, but we might as well avoid breaking backwards compatibility for now.
2018-02-14Cycles: change Index output of Hair and Particle Info to Random, in 0..1 range.Brecht Van Lommel
These are used for randomization, so it's convenient if the index is already hashed and consistent with the Object Info node.
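As a rough illustration of what "already hashed" means here, the sketch below maps an integer index to a value in the 0..1 range. The function names are hypothetical and the Wang-style integer hash is only a stand-in; it is not necessarily the hash Cycles uses.

```python
def hash_uint32(x: int) -> int:
    """Wang-style 32-bit integer hash (stand-in, not the Cycles hash)."""
    x &= 0xFFFFFFFF
    x = ((x ^ 61) ^ (x >> 16)) & 0xFFFFFFFF
    x = (x * 9) & 0xFFFFFFFF
    x ^= x >> 4
    x = (x * 0x27D4EB2D) & 0xFFFFFFFF
    x ^= x >> 15
    return x


def index_to_random(index: int) -> float:
    """Map an integer index to a pseudo-random float in [0, 1)."""
    return hash_uint32(index) / 2**32
```

Neighbouring indices end up with unrelated values, which is what makes the output convenient for randomizing colors or sizes per particle or hair.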
2018-02-09Cycles: random walk subsurface scattering.Brecht Van Lommel
It is basically brute force volume scattering within the mesh, but part of the SSS code for faster performance. The main difference with actual volume scattering is that we assume the boundaries are diffuse and that all lighting is coming through this boundary from outside the volume. This gives much more accurate results for thin features and low density. Some challenges remain however:
* Significantly more noisy than BSSRDF. Adding Dwivedi sampling may help here, but it's unclear still how much it helps in real world cases.
* Due to this being a volumetric method, geometry like eyes or mouth can darken the skin on the outside. We may be able to reduce this effect, or users can compensate for it by reducing the scattering radius in such areas.
* Sharp corners are quite bright. This matches actual volume rendering and results in some other renderers, but maybe not so much real world objects.
Differential Revision: https://developer.blender.org/D3054
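To give a feel for what brute force scattering inside a volume means, here is a minimal sketch of a random walk through a homogeneous slab. It is not the Cycles implementation: the slab geometry, isotropic scattering, the function names and the parameters (`sigma_t`, `albedo`) are all simplifying assumptions, and the diffuse boundary handling described in the commit is left out.

```python
import math
import random


def random_walk_slab(thickness, sigma_t, albedo, rng, max_bounces=1000):
    """Walk one path through a slab; return how the path ends."""
    z, mu = 0.0, 1.0  # depth below the entry surface, direction cosine
    for _ in range(max_bounces):
        # Sample an exponentially distributed free-flight distance.
        distance = -math.log(1.0 - rng.random()) / sigma_t
        z += mu * distance
        if z < 0.0:
            return "reflected"    # left through the entry surface
        if z > thickness:
            return "transmitted"  # left through the far surface
        if rng.random() >= albedo:
            return "absorbed"
        mu = 2.0 * rng.random() - 1.0  # isotropic scattering
    return "absorbed"  # give up on very long paths


def simulate(num_paths, thickness, sigma_t, albedo, rng):
    """Count the outcomes of many independent walks."""
    counts = {"reflected": 0, "transmitted": 0, "absorbed": 0}
    for _ in range(num_paths):
        counts[random_walk_slab(thickness, sigma_t, albedo, rng)] += 1
    return counts
```

Each path needs many random steps before it leaves the medium, which hints at why the result is noisier than an analytic BSSRDF profile.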
2018-01-13Cycles: adaptive subdivision support for panoramic cameras.Mai Lavelle
Adds the code to get screen size of a point in world space, which is used for subdividing geometry to the correct level. The approximate method of treating the point as if it were directly in front of the camera is used, as panoramic projections can become very distorted near the edges of an image. This should be fine for most uses. There is also no support yet for offscreen dicing scale, though panorama cameras are often used for rendering 360° renders anyway. Fixes T49254. Differential Revision: https://developer.blender.org/D2468
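The approximation can be sketched as follows, assuming a projection where `fov_radians` is spread evenly over `resolution` pixels and the point is treated as if it sat directly in front of the camera. All names and the `dicing_rate` convention are hypothetical, not taken from the patch.

```python
import math


def world_size_per_pixel(distance, fov_radians, resolution):
    """Approximate world-space size covered by one pixel at a distance."""
    return distance * fov_radians / resolution


def subdivisions_for_edge(edge_length, distance, fov_radians, resolution,
                          dicing_rate=1.0):
    """Number of segments so each one spans about dicing_rate pixels."""
    target = dicing_rate * world_size_per_pixel(distance, fov_radians,
                                                resolution)
    return max(1, math.ceil(edge_length / target))
```

For example, a 1 unit edge seen from 10 units away by a 360° camera that is 1000 pixels wide would be split into 16 segments under these assumptions.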
2018-01-11Fix T53755: Cycles OpenCL lamp shaders have incorrect normal.Brecht Van Lommel
2018-01-11Cycles: support animated object scale in motion blur.Stefan Werner
This was disabled previously due to CUDA compiler bugs, see T32900. Differential Revision: https://developer.blender.org/D2937
2017-11-14Cycles: Make per-object random value output also work for LampsLukas Stockner
2017-11-08Cycles: add bevel shader, for raytrace based rounded edges.Brecht Van Lommel
The algorithm averages normals from nearby surfaces. It uses the same sampling strategy as BSSRDFs, casting rays along the normal and two orthogonal axes, and combining the samples with MIS. The main concern here is that we are introducing raytracing inside shader evaluation, which could be quite bad for GPU performance and stack memory usage. In practice it doesn't seem so bad though. Note that using this feature can easily slow down renders 20%, and that if you care about performance then it's better to use a bevel modifier. Mainly this is useful for baking, and for cases where the mesh topology makes it difficult for the bevel modifier to work well. Differential Revision: https://developer.blender.org/D2803
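The core idea of averaging normals can be sketched like this. The ray casting that finds the nearby surfaces and the MIS weighting are omitted; `average_normal` and its optional `weights` argument are hypothetical names for illustration only.

```python
import math


def average_normal(normals, weights=None):
    """Return the normalized weighted sum of a list of 3D normals."""
    if weights is None:
        weights = [1.0] * len(normals)
    sx = sum(w * n[0] for n, w in zip(normals, weights))
    sy = sum(w * n[1] for n, w in zip(normals, weights))
    sz = sum(w * n[2] for n, w in zip(normals, weights))
    length = math.sqrt(sx * sx + sy * sy + sz * sz)
    if length == 0.0:
        return normals[0]  # degenerate case: keep the original normal
    return (sx / length, sy / length, sz / length)
```

Near a sharp 90° edge, averaging the two face normals gives a normal halfway between them, which is what makes the edge shade as if it were rounded.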
2017-11-08Code refactor: rename subsurface to local traversal, for reuse.Brecht Van Lommel
2017-11-05Cycles: fix inefficient attribute map storage, saves 615MB in victor scene.Brecht Van Lommel
2017-10-07Cycles: CUDA bicubic and tricubic texture interpolation support.Brecht Van Lommel
While cubic interpolation is quite expensive on the CPU compared to linear interpolation, the difference on the GPU is quite small.
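A 1D sketch shows where the extra cost comes from: linear interpolation reads 2 texels, cubic reads 4 (and 16 or 64 in 2D or 3D). The cubic B-spline weights below are one common choice and are an assumption here, as are the function names and the clamped edge handling.

```python
import math


def bspline_weights(t):
    """Cubic B-spline weights for the four texels around a sample."""
    return (
        (1.0 - t) ** 3 / 6.0,
        (3.0 * t ** 3 - 6.0 * t ** 2 + 4.0) / 6.0,
        (-3.0 * t ** 3 + 3.0 * t ** 2 + 3.0 * t + 1.0) / 6.0,
        t ** 3 / 6.0,
    )


def sample_linear(data, x):
    """Linear interpolation: two texel reads."""
    i = math.floor(x)
    t = x - i
    last = len(data) - 1
    a = data[min(max(i, 0), last)]
    b = data[min(max(i + 1, 0), last)]
    return (1.0 - t) * a + t * b


def sample_cubic(data, x):
    """Cubic interpolation: four texel reads, clamped at the edges."""
    i = math.floor(x)
    w = bspline_weights(x - i)
    last = len(data) - 1
    return sum(w[k] * data[min(max(i - 1 + k, 0), last)] for k in range(4))
```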
2017-10-07Code refactor: make texture code more consistent between devices.Brecht Van Lommel
* Use common TextureInfo struct for all devices, except CUDA fermi.
* Move image sampling code to kernels/*/kernel_*_image.h files.
* Use arrays for data textures on Fermi too, so device_vector<Struct> works.
2017-09-06Fix T52660: CUDA volume texture rendering not working on Fermi GPUs.Brecht Van Lommel
2017-08-08Cycles: Cleanup, de-duplicate function parameter listSergey Sharybin
Was only needed to use const reference on CPU. Now it is done using ccl_ref.
2017-08-07Cycles: Cleanup, move curve intersection functions to own fileSergey Sharybin
This way curve file becomes much shorter and it's also easier to write a benchmark application to check performance before/after future changes.
2017-08-07Cycles: Cleanup, trailing whitespaceSergey Sharybin
2017-08-07Cycles: Cleanup, remove bvh prefix from curve functionsSergey Sharybin
Those have nothing to do with BVH, and can be used separately.
2017-08-07Code refactor: add, remove, optimize various SSE functions.Brecht Van Lommel
* Remove some unnecessary SSE emulation defines.
* Use full precision float division so we can enable it.
* Add sqrt(), sqr(), fabs(), shuffle variations, mask().
* Optimize reduce_add(), select().
Differential Revision: https://developer.blender.org/D2764
2017-05-07Cycles: Implement denoising option for reducing noise in the rendered imageLukas Stockner
This commit contains the first part of the new Cycles denoising option, which filters the resulting image using information gathered during rendering to get rid of noise while preserving visual features as well as possible.
To use the option, enable it in the render layer options. The default settings fit a wide range of scenes, but the user can tweak individual settings to control the tradeoff between a noise-free image, image details, and calculation time.
Note that the denoiser may still change in the future and that some features are not implemented yet. The most important missing feature is animation denoising, which uses information from multiple frames at once to produce a flicker-free and smoother result. These features will be added in the future.
Finally, thanks to all the people who supported this project:
- Google (through the GSoC) and Theory Studios for sponsoring the development
- The authors of the papers I used for implementing the denoiser (more details on them will be included in the technical docs)
- The other Cycles devs for feedback on the code, especially Sergey for mentoring the GSoC project and Brecht for the code review!
- And of course the users who helped with testing, reported bugs and things that could and/or should work better!
2017-04-13Cycles: Make vectorized types constructor from register explicitSergey Sharybin
This is not a cheap operation, and we don't want it to happen silently.
2017-04-10Cycles: Fix compilation error of AVX2 kernels with SSE optimization disabledSergey Sharybin
2017-03-29Cycles: Attempt to work around compilation errors of CUDA on sm_2xSergey Sharybin
2017-03-29Cycles: Make all #include statements relative to cycles source directorySergey Sharybin
The idea is to make include statements more explicit and make it obvious where the file is coming from, additionally reducing the chance of the wrong header being picked up.
For example, it was not obvious whether bvh.h was referring to builder or traversal, whether node.h is a generic graph node or a shader node, and cases like that.
Surely this might look obvious to the active developers, but after some time of not touching the code it becomes less obvious where a file is coming from.
This was briefly mentioned in T50824 and it seems @brecht is fine with such explicitness, but we need to agree with all active developers before committing this.
Please note that this patch is lacking changes related to GPU/OpenCL support. This will be solved if/when we all agree this is a good idea to move forward.
Reviewers: brecht, lukasstockner97, maiself, nirved, dingto, juicyfruit, swerner
Reviewed By: lukasstockner97, maiself, nirved, dingto
Subscribers: brecht
Differential Revision: https://developer.blender.org/D2586
2017-03-28Cycles: Switch to reformulated Pluecker ray/triangle intersectionSergey Sharybin
The intention of this commit is to address issues mentioned in the reports T43865, T50164 and T50452. The code is based on Embree code with some extra vectorization to speed up single ray to single triangle intersection.
Unfortunately, such a fix does not come for free. There is some slowdown on AVX2 processors, mainly due to different vectorization code, which causes a different number of instructions to be executed and different instructions-per-cycle counters. On the other hand, this commit makes pre-AVX2 platforms such as AVX and SSE4.1 a bit faster.
The performance is as follows:
            2.78c AVX2   2.78c AVX   Patch AVX2          Patch AVX
BMW         05:21.09     06:05.34    05:32.97 (+3.5%)    05:34.97 (-8.5%)
Classroom   16:55.36     18:24.51    17:10.41 (+1.4%)    17:15.87 (-6.3%)
Fishy Cat   08:08.49     08:36.26    08:09.19 (+0.2%)    08:12.25 (-4.7%)
Koro        11:22.54     11:45.24    11:13.25 (-1.5%)    11:43.81 (-0.3%)
Barcelone   14:18.32     16:09.46    14:15.20 (-0.4%)    14:25.15 (-10.8%)
On GPU the performance is about 1.5-2% slower in my tests on GTX1080, but I'm afraid we can't do much about that as part of this change, and consider it a price to pay for a more proper intersection check.
Made in collaboration with Maxym Dmytrychenko, big thanks to him!
Reviewers: brecht, juicyfruit, lukasstockner97, dingto
Differential Revision: https://developer.blender.org/D1574
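For readers unfamiliar with this family of tests, the following scalar sketch shows the general idea of a Pluecker-style ray/triangle test: move the triangle so the ray origin is at zero, then check that the ray direction passes on the same side of all three edges. It is an illustration written for this log, not the vectorized Embree or Cycles code, and the function names are hypothetical.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle_intersect(origin, direction, a, b, c, t_max=float("inf")):
    """Return (t, u, v) on a hit, or None on a miss."""
    # Vertices relative to the ray origin.
    v0, v1, v2 = sub(a, origin), sub(b, origin), sub(c, origin)
    # Signed volumes spanned by the ray direction and each edge.
    w0 = dot(direction, cross(v1, v2))
    w1 = dot(direction, cross(v2, v0))
    w2 = dot(direction, cross(v0, v1))
    # Mixed signs mean the ray passes outside at least one edge.
    if min(w0, w1, w2) < 0.0 and max(w0, w1, w2) > 0.0:
        return None
    normal = cross(sub(v1, v0), sub(v2, v0))
    den = dot(direction, normal)
    if den == 0.0:
        return None  # ray is parallel to the triangle plane
    t = dot(v0, normal) / den
    if t < 0.0 or t > t_max:
        return None
    # u and v are the barycentric weights of vertices b and c.
    return t, w1 / den, w2 / den
```

Because the same edge value is computed for both triangles sharing an edge, a ray that hits exactly on the shared edge is less likely to slip between them.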
2017-03-23Cycles: Remove unused macroSergey Sharybin
2017-03-23Cycles: Use SSE-optimized version of triangle intersection for motion trianglesSergey Sharybin
The title says it all actually. Gives up to 10% speedup on test scenes here on i7-6800K. Render times on GPU are unreliable here, but there might be some slowdown caused by watertight nature of intersections.
2017-03-23Cycles: Fix speed regression on GPUSergey Sharybin
Avoid construction of temporary array and make utility function force-inlined. Additionally avoid calling float4_to_float3 twice. This brings render times to the same values as before current patch series.
2017-03-23Cycles: Use utility function for SSS triangle intersectionSergey Sharybin
This effectively de-duplicates triangle intersection logic implemented for both regular triangle and SSS triangle.
2017-03-23Cycles: Move watertight triangle intersection to a utility fileSergey Sharybin
This way the code can be reused more easily.
2017-03-23Cycles: Move triangle intersection precalc to an util fileSergey Sharybin
This is preparation work for the followup commit, which will move the remaining parts of the Woop intersection logic to a utility file. Doing it as a separate commit to keep changes more atomic and easier to bisect when/if needed.
2017-03-23Cycles: Cleanup, move utility function to utility fileSergey Sharybin
Was an old TODO, this function is handy for some math utilities as well.
2017-03-23Cycles: Cleanup, inline AVX register construction from kernel global dataSergey Sharybin
Currently should be no functional changes, preparing for some upcoming refactor.
2017-03-22Fix/workaround T50533: Transparency shader doesn't cast shadows with curve segmentsSergey Sharybin
There seems to be a compiler bug in MSVC2013. The issue does not happen on Linux and does not happen on Windows when building with MSVC2015. Since it's really a pain to debug release builds with MSVC2013, the AVX2 optimization is disabled for curve segments for this compiler.
2017-03-09Cycles: SSS and Volume rendering in split kernelHristo Gueorguiev
Decoupled ray marching is not supported yet. Transparent shadows are always enabled for volume rendering. Changes in kernel/bvh and kernel/geom are from Sergey. This simplifies code significantly, and prepares it for record-all transparent shadow function in split kernel.
2017-03-08Cycles: Remove ccl_fetch and SOAMai Lavelle
2017-02-15Cycles: Don't calculate primitive time if BVH motion steps are not usedSergey Sharybin
Solves memory regression by the default configuration.
2017-02-15Cycles: Fix wrong hair render results when using BVH motion stepsSergey Sharybin
The issue here was mainly coming from the minimal pixel width feature, which is quite commonly enabled in production shots. This feature uses a probabilistic heuristic in the curve intersection function to check whether we need to return an intersection or not. This probability is calculated for every intersection check. Now, when we use multiple BVH nodes for curve primitives, we increase the probability of that primitive being considered a good intersection. This is similar to increasing the minimal width of the curve.
What is worse is that the change in the intersection probability fully depends on the exact layout of the BVH, meaning the probability might change differently depending on the view angle, the way the builder binned the primitives and such. This makes it impossible to do a simple check like dividing the probability by the number of BVH steps.
Another solution might have been to split the BVH into fully independent trees, but that would increase memory usage of all the static objects in the scenes, which is also not desirable.
For now the simplest but robust approach is used: store the BVH primitives' time and test it in the curve intersection functions. This solves the regression, but has two downsides:
- Uses more memory, which isn't surprising, and ANY solution to this problem will use more memory. What we still have to do is to avoid this memory increase for cases when we don't use BVH motion steps.
- Reduces the maximum number of available textures on pre-Kepler cards. There is not much we can do here: hardware gets old, but we need to move forward on more modern hardware.
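The effect and the fix can be sketched as follows. The data layout and names (`CurvePrimRef`, `time_from`, `time_to`) are hypothetical, and treating the repeated probabilistic tests as independent is a simplifying assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class CurvePrimRef:
    """One BVH reference to a curve, valid for a slice of shutter time."""
    curve_id: int
    time_from: float
    time_to: float


def refs_to_test(refs, ray_time):
    """Keep only the references whose time range contains the ray time."""
    return [r for r in refs if r.time_from <= ray_time < r.time_to]


def accept_probability(p_single, num_tests):
    """Chance that at least one of several independent tests accepts."""
    return 1.0 - (1.0 - p_single) ** num_tests
```

With three motion steps and a 50% per-test chance, testing every reference would accept the curve 87.5% of the time, while the time check leaves a single test at the intended 50%.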
2017-01-23Cycles: Split ShaderData object and shader flagsSergey Sharybin
We started to run out of bits there, so now we separate flags which came from __object_flags and which are either runtime or coming from __shader_flags.
Rule now is: SD_OBJECT_* flags are to be tested against the new object_flags field of ShaderData, all other flags are to be tested against the flags field of ShaderData.
There should be no user-visible changes, and time difference should be minimal. In fact, in tests here I can only see a hardly measurable difference, and sometimes the new code is somewhat faster (all within the noise floor, so hard to tell for sure).
Reviewers: brecht, dingto, juicyfruit, lukasstockner97, maiself
Differential Revision: https://developer.blender.org/D2428
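The rule can be illustrated with a small sketch. The flag names follow the SD_* naming from the commit message, but the bit values, the Python layout of `ShaderData` and the helper functions are made up for illustration and do not match the kernel headers.

```python
from dataclasses import dataclass

# Shader and runtime flags live in ShaderData.flags (illustrative values).
SD_EMISSION = 1 << 0
SD_HAS_VOLUME = 1 << 1

# Object flags live in ShaderData.object_flags and get their own bit space.
SD_OBJECT_MOTION = 1 << 0
SD_OBJECT_TRANSFORM_APPLIED = 1 << 1


@dataclass
class ShaderData:
    flags: int = 0          # test all non-object flags against this field
    object_flags: int = 0   # test SD_OBJECT_* flags against this field


def has_object_motion(sd: ShaderData) -> bool:
    return bool(sd.object_flags & SD_OBJECT_MOTION)


def is_emissive(sd: ShaderData) -> bool:
    return bool(sd.flags & SD_EMISSION)
```

Since the two fields no longer share one integer, each group can use the full range of bits, at the cost of having to test each flag against the right field.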
2017-01-23Cycles: Make object flag names more obvious that they are object and not shaderSergey Sharybin
2017-01-20Cycles: Split motion triangle file once again, avoids annoying forward declarationsSergey Sharybin
2017-01-20Cycles: Move motion triangle intersection functions to own fileSergey Sharybin
Mimics how regular triangles are working and makes it more clear where the stuff is located in the kernel. Needed to have some forward declarations because of the current placement of things in the kernel.
2017-01-20Cycles: Cleanup, commentsSergey Sharybin
2017-01-12Cycles: Cleanup, indentation within preprocessorSergey Sharybin
2016-12-12Cycles: Cleanup, variable namesSergey Sharybin
Use underscore again and also solve the confusing part where in BVH the same thing is called prim_addr but in intersection functions it was called triAddr.
2016-12-12Cycles: Cleanup, variables namesSergey Sharybin
Use underscore instead of camel case.
2016-12-02Cycles: Implement AVX2 path for curve intersection functionsSergey Sharybin
Gives little performance improvement on Linux and gives up to 2% speedup on koro.blend on Windows. Inspired by Maxym Dmytrychenko, thanks!
2016-11-03Cycles: Fix missing underscore in geom_object.hLukas Stockner
2016-11-03Cycles: Fix T49901: OpenCL build error after recent light texture coordinate commitLukas Stockner
Basically, the problem here was that the transform that's used to bring texture coordinates to world space is either fetched while setting up the shader (when Object Motion is enabled) or fetched when needed (otherwise). That helps to save ShaderData memory on OpenCL when Object Motion isn't needed.
Now, if OM is enabled, the Lamp transform can just be stored inside the ShaderData as well. The original commit just assumed it is. However, when it's not (on OpenCL by default, for example), there is no easy way to fetch it when needed, since the ShaderData doesn't store the Lamp index.
So, for now the lamps just don't support local texture coordinates anymore when Object Motion is disabled.
To fix and support this properly, one of the following could be done:
- Just always pre-fetch the transform. Downside: Memory usage increases when not using OM on OpenCL.
- Add a variable to ShaderData that stores the Lamp ID to allow fetching it when needed.
- Store the Lamp ID inside prim or object. Problem: Cycles currently checks these for whether an object was hit, so these checks would need to be changed.
- Enable OM whenever a Texture Coordinate's Normal output is used. Downside: Might not actually be needed.