Welcome to mirror list, hosted at ThFree Co, Russian Federation.

git.blender.org/blender.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-06-13Cycles: Silent paranoid uninitialized GCC warnings in release kernelsSergey Sharybin
2015-05-09Cycles: OpenCL kernel splitGeorge Kyriazis
This commit contains all the work related on the AMD megakernel split work which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely someone else which we're forgetting to mention. Currently only AMD cards are enabled for the new split kernel, but it is possible to force split opencl kernel to be used by setting the following environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1. Not all the features are supported yet, and that being said no motion blur, camera blur, SSS and volumetrics for now. Also transparent shadows are disabled on AMD device because of some compiler bug. This kernel is also only implements regular path tracing and supporting branched one will take a bit. Branched path tracing is exposed to the interface still, which is a bit misleading and will be hidden there soon. More feature will be enabled once they're ported to the split kernel and tested. Neither regular CPU nor CUDA has any difference, they're generating the same exact code, which means no regressions/improvements there. Based on the research paper: https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf Here's the documentation: https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit Design discussion of the patch: https://developer.blender.org/T44197 Differential Revision: https://developer.blender.org/D1200
2015-05-09Cycles: Initial work towards selective nodes support compilationSergey Sharybin
The goal is to be able to compile kernel with nodes which are actually needed to render current scene, hence improving performance of the kernel, The idea is: - Have few node groups, starting with a group which contains nodes are used really often, and then couple of groups which will be extension of this one. - Have feature-based nodes disabling, so it's possible to disable nodes related to features which are not used with the currently used nodes group. This commit only lays down needed routines for this approach, actual split will happen later after gathering statistics from bunch of production scenes.
2015-02-12cleanupCampbell Barton
2015-02-10Cycles: Add print functions for sse3f, sse3i and sse3bSergey Sharybin
2015-02-02Cycles: Ignore -Wmaybe-uninitialized from the kernel in release buildsSergey Sharybin
This warning provided too much false-positive issues in release version of the kernel, making it really easy to miss actual warnings.
2015-02-02Cycles: Remove redundant calculation of w in recent cubic commitSergey Sharybin
Was rather harmless since compiler will optimize it out, but nice to get rid of this anyway.
2015-02-02Cycles: Indentation fix for the previous commitSergey Sharybin
2015-02-02Cycles: Implement cubit image interpolation on CPUSergey Sharybin
Basically title says it all. Could be not totally optimized but the code is there now.
2014-12-25Cycles: Fix compilation error on non-SSE2 architecturesSergey Sharybin
2014-12-25Cleanup: Fix Cycles Apache header.Thomas Dinges
This was already mixed a bit, but the dot belongs there.
2014-12-25Cycles: Implement QBVH tree traversalSergey Sharybin
This commit implements traversal for QBVH tree, which is based on the old loop code for traversal itself and Embree for node intersection. This commit also does some changes to the loop inspired by Embree: - Visibility flags are only checked for primitives. Doing visibility check for every node cost quite reasonable amount of time and in most cases those checks are true-positive. Other idea here would be to do visibility checks for leaf nodes only, but this would need to be investigated further. - For minimum hair width we extend all the nodes' bounding boxes. Again doing curve visibility check is quite costly for each of the nodes and those checks returns truth for most of the hierarchy anyway. There are number of possible optimization still, but current state is good enough in terms it makes rendering faster a little bit after recent watertight commit. Currently QBVH is only implemented for CPU with SSE2 support at least. All other devices would need to be supported later (if that'd make sense from performance point of view). The code is enabled for compilation in kernel. but blender wouldn't use it still.
2014-12-25Cycles: Add some utility functions and structuresSergey Sharybin
Most of them are not currently used but are essential for the further work. - CPU kernels with SSE2 support will now have sse3b, sse3f and sse3i - Added templatedversions of min4, max4 which are handy to use with register variables. - Added util_swap function which gets arguments by pointers. So hopefully it'll be a portable version of std::swap.
2014-11-07Cycles: Tweak to the expf() speed workaroundSergey Sharybin
Add compile-time check for particular glibc version which fixed the issue. This makes it so own-compiled blender is the fastest in the world, and the only issue remains what should we do for release builds. After some discussion with Campbell we decided to keep it as is for now because slowdown is not that much noticeable. We'll disable this workaround for release builds when all the majority of the distros will switch to the new version of glibc.
2014-10-22Cycles: Expose volume voxel data interpolation to the interfaceSergey Sharybin
It is per-material setting which could be found under the Volume settings in the material and world context buttons. There could still be some code-wise improvements, like using variable-size macro for interp3d instead of having interp3d_ex to which you can pass the interpolation method.
2014-10-22Cycles: Implement tricubic b-spline interpolation for CPU texture_imageSergey Sharybin
This is the first step towards supporting cubic interpolation for voxel data (such as smoke and fire). It is not epxosed to the interface at all yet, this is to be done soon after this change.
2014-10-08Cycles: correct math wrappersCampbell Barton
include the parens around value before cast, in some cases was causing double/float promotion by only casting the left value.
2014-10-06Cycles: Workaround dead-slow expf() on 64bit linuxSergey Sharybin
Single precision exponent on 64bit linux tends to be order of magnitude slower than double precision version even with single<->double precision conversion. Some feedback in the mailing lists also suggests that logf() is also slow, but this i didn't confirm here in the studio yet. Depending on the shader setup it gives ~3% with the secret agent shot and up to around 15% with the bmw scene here.
2014-06-13Cycles Refactor: Add SSE Utility code from Embree for cleaner SSE code.Thomas Dinges
This makes the code a bit easier to understand, and might come in handy if we want to reuse more Embree code. Differential Revision: https://developer.blender.org/D482 Code by Brecht, with fixes by Lockal, Sergey and myself.
2014-05-06Cycles: revert part of the optimization from ff34c2dCampbell Barton
This was faster for my AMD system but slower for Intel. However with gcc4.9,-O3 I was able to get roughly the same speed before/after. Revert since this isnt giving such clear benefits on most systems.
2014-05-05Cycles: avoid int->float conversions for pixel lookupsCampbell Barton
Gives ~3% speedup for image.blend test, and 6% for image heavy file. Overall speedup in real-world use is likely much less.
2014-03-29Cycles code internals: add CPU kernel support for 3D image textures.Brecht Van Lommel
2014-03-13Fix cycles texture interpolation mode closest constant offset on some devicesMartijn Berger
2014-03-08Add support for multiple interpolation modes on cycles image texturesMartijn Berger
All textures are sampled bi-linear currently with the exception of OSL there texture sampling is fixed and set to smart bi-cubic. This patch adds user control to this setting. Added: - bits to DNA / RNA in the form of an enum for supporting multiple interpolations types - changes to the image texture node drawing code ( add enum) - to ImageManager (this needs to know to allocate second texture when interpolation type is different) - to node compiler (pass on interpolation type) - to device tex_alloc this also needs to get the concept of multiple interpolation types - implementation for doing non interpolated lookup for cuda and cpu - implementation where we pass this along to osl ( this makes OSL also do linear untill I add smartcubic to the interface / DNA/ RNA) Reviewers: brecht, dingto Reviewed By: brecht CC: dingto, venomgfx Differential Revision: https://developer.blender.org/D317
2014-02-27Cycles: fix crash in SSE hair and half-floats on x86+vc2008Sv. Lockal
MSVC 2008 ignores alignement attribute when assigning from unaligned float4 vector, returned from other function. Now Cycles uses unaligned loads instead of casts for win32 in x86 mode.
2014-02-04Fix cycles crash with float image textures on CPU without AVX support.Brecht Van Lommel
The AVX kernel functions for reading image textures could be get used from non-AVX kernels. These are C++ class methods and need to be marked for inlining, all other functions are static so they don't leak into other kernels.
2014-01-15Code cleanup: move half float functions to separate header file.Brecht Van Lommel
2013-12-28Cycles: Move SIMD utility functions into its own file.Thomas Dinges
Recently added SSE macros for noise texture can be moved here as well, but I leave this for later.
2013-08-18Cycles: relicense GNU GPL source code to Apache version 2.0.Brecht Van Lommel
More information in this post: http://code.blender.org/ Thanks to all contributes for giving their permission!
2013-06-07Code cleanup: avoid some warnings due to implicit uint/int/float/double ↵Brecht Van Lommel
conversion.
2013-04-02Cycles: initial subsurface multiple scattering support. It's not working asBrecht Van Lommel
well as I would like, but it works, just add a subsurface scattering node and you can use it like any other BSDF. It is using fully raytraced sampling compatible with progressive rendering and other more advanced rendering algorithms we might used in the future, and it uses no extra memory so it's suitable for complex scenes. Disadvantage is that it can be quite noisy and slow. Two limitations that will be solved are that it does not work with bump mapping yet, and that the falloff function used is a simple cubic function, it's not using the real BSSRDF falloff function yet. The node has a color input, along with a scattering radius for each RGB color channel along with an overall scale factor for the radii. There is also no GPU support yet, will test if I can get that working later. Node Documentation: http://wiki.blender.org/index.php/Doc:2.6/Manual/Render/Cycles/Nodes/Shaders#BSSRDF Implementation notes: http://wiki.blender.org/index.php/Dev:2.6/Source/Render/Cycles/Subsurface_Scattering
2013-04-02Cycles: code refactoring to add generic lookup table memory.Brecht Van Lommel
2012-12-12Fix #33486: cycles CPU image textures were offset wrong by half a pixel comparedBrecht Van Lommel
to OpenGL/CUDA/OSL rendering.
2012-09-04Cycles: merge of changes from tomato branch.Brecht Van Lommel
Regular rendering now works tiled, and supports save buffers to save memory during render and cache render results. Brick texture node by Thomas. http://wiki.blender.org/index.php/Doc:2.6/Manual/Render/Cycles/Nodes/Textures#Brick_Texture Image texture Blended Box Mapping. http://wiki.blender.org/index.php/Doc:2.6/Manual/Render/Cycles/Nodes/Textures#Image_Texture http://mango.blender.org/production/blended_box/ Various bug fixes by Sergey and Campbell. * Fix for reading freed memory in some node setups. * Fix incorrect memory read when synchronizing mesh motion. * Fix crash appearing when direct light usage is different on different layers. * Fix for vector pass gives wrong result in some circumstances. * Fix for wrong resolution used for rendering Render Layer node. * Option to cancel rendering when doing initial synchronization. * No more texture limit when using CPU render. * Many fixes for new tiled rendering.
2012-06-09style cleanup: block commentsCampbell Barton
2012-05-13Cycles: OpenCL image texture support, fix an attribute node issue and refactorBrecht Van Lommel
feature enabling #defines a bit.
2012-01-20Sample as Lamp option for world shaders, to enable multiple importance sampling.Brecht Van Lommel
By default lighting from the world is computed solely with indirect light sampling. However for more complex environment maps this can be too noisy, as sampling the BSDF may not easily find the highlights in the environment map image. By enabling this option, the world background will be sampled as a lamp, with lighter parts automatically given more samples. Map Resolution specifies the size of the importance map (res x res). Before rendering starts, an importance map is generated by "baking" a grayscale image from the world shader. This will then be used to determine which parts of the background are light and so should receive more samples than darker parts. Higher resolutions will result in more accurate sampling but take more setup time and memory. Patch by Mike Farnsworth, thanks!
2011-11-22Cycles: OpenCL tweaksBrecht Van Lommel
* Reduce kernel arguments size, helps compile for apple nvidia. * Fix use of unitialized variable in displace kernel. * Use build flags in opencl kernel md5 hash. * Reorganize code for kernel feature #defines a bit.
2011-11-10Cycles:Brecht Van Lommel
* Fix excessive fireflies in Velvet BSDF (patch by David). * Disable some unused SSE code * Remove RTTI disabling flags for now, this is giving some compile issues and was only needed of OSL which we're not using yet.
2011-04-27Cycles render engine, initial commit. This is the engine itself, blender ↵Ton Roosendaal
modifications and build instructions will follow later. Cycles uses code from some great open source projects, many thanks them: * BVH building and traversal code from NVidia's "Understanding the Efficiency of Ray Traversal on GPUs": http://code.google.com/p/understanding-the-efficiency-of-ray-traversal-on-gpus/ * Open Shading Language for a large part of the shading system: http://code.google.com/p/openshadinglanguage/ * Blender for procedural textures and a few other nodes. * Approximate Catmull Clark subdivision from NVidia Mesh tools: http://code.google.com/p/nvidia-mesh-tools/ * Sobol direction vectors from: http://web.maths.unsw.edu.au/~fkuo/sobol/ * Film response functions from: http://www.cs.columbia.edu/CAVE/software/softlib/dorf.php