
git.blender.org/blender.git
Age | Commit message | Author
2017-02-11 | Cycles: Refactor device split kernel code | Mai Lavelle
Moved all split kernel related stuff out of `Device` as it doesn't belong there. Those functions are now a part of `DeviceSplitKernel`, which must now be implemented for each device type supporting the split kernel. No functional changes.
2017-02-11 | Cycles: Move kgbuffer allocation out of split kernel code | Mai Lavelle
Allocating the buffer is the job of the device implementation, not the split kernel, so it makes more sense to separate that code.
2017-01-24 | Cycles: Add split_kernel_global_size function | Mai Lavelle
This is to allow devices to suggest a good global work size. Only implemented for OpenCL devices right now.
2017-01-24 | Cycles: Remove everything parallel samples from the split kernel | Mai Lavelle
Parallel samples never actually worked: they produced incorrect results or crashes and weren't any faster than work stealing, so this removes them.
2016-12-16 | Cycles: Add names to buffer allocations | Mai Lavelle
This is to help debug and track memory usage for generic buffers. We already have something similar for textures, since those require a name, but for buffers the name is only for debugging purposes.
2016-12-16 | Merge branch 'master' into cycles_split_kernel | Mai Lavelle
2016-12-09 | Land D2339 by bliblubli | lazydodo
2016-12-07 | Merge branch 'master' into cycles_split_kernel | Mai Lavelle
2016-12-07 | Cycles: Make split kernel methods in Device private | Mai Lavelle
These methods shouldn't really be called from anywhere except `DeviceSplitKernel`.
2016-12-07 | Cycles: Remove overloaded Device::mem_alloc function | Mai Lavelle
It was a bit confusing having two versions of this function, only one is needed really. Added `device_memory::resize()` to take over the behavior the overload provided.
2016-12-03 | Cycles: Refactor Progress system to provide better estimates | Lukas Stockner
The Progress system in Cycles had two limitations so far:
- It just counted tiles, but ignored their size. For example, when rendering a 600x500 image with 512x512 tiles, the right 88x500 tile would count for 50% of the progress, although it only covers 15% of the image.
- Scene update time was incorrectly counted as rendering time; therefore, the remaining time started very long and gradually decreased.
This patch fixes both problems:
First of all, the Progress now has a function to ignore time spans, and that is used to ignore scene update time.
The larger change is the tile size: Instead of counting samples per tile, so that the final value is num_samples*num_tiles, the code now counts every sample for every pixel, so that the final value is num_samples*num_pixels.
Along with that, some unused variables were removed from the Progress and Session classes.
Reviewers: brecht, sergey, #cycles
Subscribers: brecht, candreacchio, sergey
Differential Revision: https://developer.blender.org/D2214
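
The difference between the two counting schemes can be reproduced with a few lines of arithmetic. The following is only an illustrative Python sketch, not the Cycles implementation; the function names `tile_sizes`, `progress_by_tiles` and `progress_by_pixel_samples` are hypothetical, and the sample count of 128 is an arbitrary assumption.

```python
def tile_sizes(width, height, tile_size):
    """Return the (w, h) of every tile needed to cover the image, row by row."""
    sizes = []
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            sizes.append((min(tile_size, width - x), min(tile_size, height - y)))
    return sizes


def progress_by_tiles(finished, all_tiles):
    """Old scheme: every tile counts the same, whatever its size."""
    return len(finished) / len(all_tiles)


def progress_by_pixel_samples(finished, all_tiles, num_samples):
    """New scheme: count every sample of every pixel."""
    done = sum(w * h * num_samples for w, h in finished)
    total = sum(w * h * num_samples for w, h in all_tiles)
    return done / total


tiles = tile_sizes(600, 500, 512)   # [(512, 500), (88, 500)]
narrow = [tiles[1]]                 # only the 88x500 tile has finished
print(progress_by_tiles(narrow, tiles))                                     # 0.5
print(round(progress_by_pixel_samples(narrow, tiles, num_samples=128), 3))  # 0.147
```

With the 600x500 image from the commit message, the narrow tile accounts for half of the tile count but only about 15% of the pixel samples.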
2016-11-20 | Cycles: Simpler use of sizeof | Mai Lavelle
Less likely to make mistakes later if something here needs to change.
2016-11-10 | Merge branch 'master' into cycles_split_kernel | Mai Lavelle
2016-11-07 | Cycles: Refactor Device selection to allow individual GPU compute device selection | Lukas Stockner
Previously, it was only possible to choose a single GPU or all of that type (CUDA or OpenCL). Now, a toggle button is displayed for every device. These settings are tied to the PCI Bus ID of the devices, so they're consistent across hardware addition and removal (but not when swapping/moving cards).
From the code perspective, the more important change is that now, the compute device properties are stored in the Addon preferences of the Cycles addon, instead of directly in the User Preferences. This allows for a cleaner implementation, removing the Cycles C API functions that were called by the RNA code to specify the enum items.
Note that this change is neither backwards- nor forwards-compatible, but since it's only a User Preference no existing files are broken.
Reviewers: #cycles, brecht
Reviewed By: #cycles, brecht
Subscribers: brecht, juicyfruit, mib2berlin, Blendify
Differential Revision: https://developer.blender.org/D2338
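
To illustrate why keying the per-device toggles on the PCI bus location survives adding or removing hardware but not moving a card, here is a minimal Python sketch. It is not the addon's code: the `ComputeDevice` class, the `device_key` format and the bus ID strings are all hypothetical, and treating an unknown device as disabled is an assumption made only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeDevice:
    name: str
    device_type: str  # e.g. "CUDA" or "OPENCL"
    pci_bus_id: str   # hypothetical formatting of the PCI bus location


def device_key(device):
    """Build a preference key that stays the same while the card stays in its slot."""
    return f"{device.device_type}_{device.name}_{device.pci_bus_id}"


def enabled_devices(detected, saved_toggles):
    """Return the detected devices whose saved toggle is on.

    Assumption: a device without a saved entry is treated as disabled.
    """
    return [d for d in detected if saved_toggles.get(device_key(d), False)]


gpu_a = ComputeDevice("GPU A", "CUDA", "0000:01:00")
gpu_b = ComputeDevice("GPU B", "CUDA", "0000:02:00")
toggles = {device_key(gpu_a): True, device_key(gpu_b): False}

# Adding a third card leaves the existing choices untouched.
gpu_c = ComputeDevice("GPU C", "CUDA", "0000:03:00")
print([d.name for d in enabled_devices([gpu_a, gpu_b, gpu_c], toggles)])

# Moving GPU A to another slot changes its key, so its saved toggle no longer applies.
moved_a = ComputeDevice("GPU A", "CUDA", "0000:04:00")
print([d.name for d in enabled_devices([moved_a, gpu_b], toggles)])
```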
2016-10-27 | Cycles: Add function so each device can specify its ideal local work size | Mai Lavelle
2016-10-27 | Cycles: Add functions to allocate kernel globals for split kernel | Mai Lavelle
The CPU device needs to allocate and free thread-specific data; these functions are used to do that.
2016-10-20 | Cycles: Begin moving split kernel logic into own class | Mai Lavelle
The new class `DeviceSplitKernel` will handle all logic for enqueueing of kernels and memory management for the split kernel. Devices that support the split kernel will create an instance of this class and call its methods to run the split kernel. There's still some work to do yet to make this device independent and to deal with tile splitting.
2016-10-18 | Cycles: Implement enqueue_split_kernel_data_init for OpenCL devices | Mai Lavelle
The `enqueue_split_kernel_data_init()` function will allow each device type to set up the various data buffers however it needs to, without involving the rest of the split kernel logic.
2016-10-18 | Cycles: Add SplitKernelFunction with OpenCL implementation | Mai Lavelle
SplitKernelFunction can represent a split kernel function for any device it's been implemented for. Currently this is only for OpenCL, to simplify the enqueueing of the split kernels and move another step closer to a split kernel that can run on any device.
2016-10-18 | Cycles: Replace use of cl_mem with device_memory in split kernel device | Mai Lavelle
Working towards using only device agnostic types and methods in the host.
2016-08-15 | Cycles microdisplacement: Allow kernels to be built without patch evaluation | Mai Lavelle
Kernels can now be built without patch evaluation when not needed by the scene (Catmull-Clark subdivision not in use), giving a performance boost for some devices.
2016-05-19 | Cycles: Add support for bindless textures. | Thomas Dinges
This adds support for CUDA Texture objects (also known as Bindless textures) for Kepler GPUs (GeForce 6xx and above). This is used for all 2D/3D textures; data still uses arrays as before.
User benefits:
* No more limits of image textures on Kepler. We had 5 float4 and 145 byte4 slots there before, now we have 1024 float4 and 1024 byte4. This can be extended further if we need to (just change the define).
* Single channel texture slots (byte and float) are now supported on Kepler as well (1024 slots for each type).
ToDo / Issues:
* 3D textures don't work yet, at least don't show up during render. I have no idea what's wrong yet.
* Dynamically allocate bindless_mapping array?
I hope Fermi still works fine, but that should be tested on a Fermi card before pushing to master.
Part of my GSoC 2016.
Reviewers: sergey, #cycles, brecht
Subscribers: swerner, jtheninja, brecht, sergey
Differential Revision: https://developer.blender.org/D1999
2016-05-07 | Some fixes for CUDA runtime compile: | Thomas Dinges
* When Baking wasn't used we got an error.
* On top of Volume Nodes (NODES_FEATURE_VOLUME), we now also check if we need volume sampling code, so we can disable that as well and save some further compilation time.
2016-02-12 | Cycles: Always use guarded allocator of vectors | Sergey Sharybin
We don't have vector re-allocation happening multiple times from inside a loop anymore, so we can safely switch to a memory guarded allocator for vectors and keep track of the memory usage at various stages of rendering.
Additionally, when building from inside the Blender repository, Cycles will use Blender's guarded allocator, so actual memory usage will be displayed in the Space Info header.
There are a couple of tricky aspects of the patch:
- TaskScheduler::exit() now explicitly frees memory used by `threads`. This is needed because `threads` is a static member whose destructor isn't called on Blender's exit, which caused a memory leak print. This shouldn't give any measurable speed issues; reallocation of that vector is only one of a few zillion other allocations happening during synchronization.
- Use regular guarded malloc (not the aligned one). No idea why it was made to be aligned in the first place. Perhaps some corner case tests or so. Vector was never expected to be aligned anyway. Let's see if we'll have actual bugs with this.
Reviewers: dingto, lukasstockner97, juicyfruit, brecht
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D1774
2016-01-12 | Cycles: Use special debug panel to fine-tune debug flags | Sergey Sharybin
This panel is only visible when debug_value is set to 256 and has no effect in other cases. If debug_value is not set to this value, environment variables will be used to control which features are enabled, so in fact there are no visible changes for anyone.
Some changes were needed to prevent device re-enumeration every time a Cycles session is created.
Reviewers: juicyfruit, lukasstockner97, dingto, brecht
Reviewed By: lukasstockner97, dingto
Differential Revision: https://developer.blender.org/D1720
2015-11-22 | Cycles: Make branched path tracer covered with requested features | Sergey Sharybin
This gives a few percent extra memory saving for the CUDA kernel when using regular path tracing. Still more like an experiment, but will be handy in the future.
2015-11-21 | Cycles: Make requested features struct aware of subsurface BSDF | Sergey Sharybin
This way we'll be able to disable SSS for the scene-adaptive kernel.
2015-11-21 | Cycles: Move build options constructions to DeviceRequestedFeatures | Sergey Sharybin
This way it's easier to re-use requested features logic across multiple device implementations.
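
As a rough illustration of the idea, a single requested-features struct can own the translation into compiler defines, so every device implementation gets the same flags from one place. The Python sketch below is hypothetical and is not the `DeviceRequestedFeatures` code: the field names, the default closure count and the `-D...` macro names are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class RequestedFeatures:
    # All fields and defaults here are illustrative assumptions.
    experimental: bool = False
    max_closure: int = 64
    use_hair: bool = True
    use_object_motion: bool = True
    use_baking: bool = True
    use_subsurface: bool = True

    def build_options(self):
        """Turn the requested features into compiler defines (illustrative names)."""
        options = [f"-D__MAX_CLOSURE__={self.max_closure}"]
        if not self.use_hair:
            options.append("-D__NO_HAIR__")
        if not self.use_object_motion:
            options.append("-D__NO_OBJECT_MOTION__")
        if not self.use_baking:
            options.append("-D__NO_BAKING__")
        if not self.use_subsurface:
            options.append("-D__NO_SUBSURFACE__")
        if self.experimental:
            options.append("-D__KERNEL_EXPERIMENTAL__")
        return " ".join(options)


# A scene without hair or baking asks for a smaller kernel.
print(RequestedFeatures(use_hair=False, use_baking=False).build_options())
```

Keeping the string construction next to the struct means a new feature flag only has to be added in one place, whichever device compiles the kernel.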
2015-07-28 | Cycles: Prepare for more image extension types support | Sergey Sharybin
Basically just replace boolean periodic flag with extension type enum in the device API.
2015-07-18 | Cycles: Log requested device features | Sergey Sharybin
Useful to have this always logged because otherwise it's needed to remove cached kernels and check build flags to see which features are enabled.
2015-07-18 | Cycles: Make baking a feature-specific option | Sergey Sharybin
This means render devices now might skip building baking kernels in cases when only actual render-related functionality is used. For now it's only implemented for the OpenCL split kernel device and is mainly needed to work around some compiler-specific bugs which crash on building the kernel. Using OpenCL for baking might still crash the driver, but at least there is now a higher probability that the GPU will be usable to render the scene. The real fix should actually be done on the driver side.
2015-06-08 | Cycles: Make hair, object and motion blur selective compiled into OpenCL | Sergey Sharybin
These features are now based on the scene settings, so scenes without those features used are rendered even faster. This gives about 30% speedup on the AMD A10 APU here, but at the same time it does not mean such an improvement will happen on all the hardware. That being said, the Tonga device here seems to have no measurable difference. In any case it seems handy to have for the future, when we'll want to support SSS in the kernel or to port selective compilation/split kernel to CUDA devices.
2015-05-11 | Cycles: Get rid of one more OpenGL matrix manipulation/push/pop. | Antony Riakiotakis
2015-05-11 | Cycles: use vertex buffers when possible to draw tiles on the screen. | Antony Riakiotakis
Not terribly necessary in this case, since we are just drawing a quad, but makes blender overall more GL 3.x core ready.
2015-05-09 | Cycles: OpenCL kernel split | George Kyriazis
This commit contains all the work related to the AMD megakernel split work, which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely someone else whom we're forgetting to mention.
Currently only AMD cards are enabled for the new split kernel, but it is possible to force the split OpenCL kernel to be used by setting the following environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1.
Not all the features are supported yet: no motion blur, camera blur, SSS or volumetrics for now. Also, transparent shadows are disabled on AMD devices because of a compiler bug.
This kernel also only implements regular path tracing; supporting the branched one will take a bit. Branched path tracing is still exposed in the interface, which is a bit misleading, and will be hidden there soon.
More features will be enabled once they're ported to the split kernel and tested.
Neither regular CPU nor CUDA has any difference; they generate the exact same code, which means no regressions/improvements there.
Based on the research paper: https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf
Here's the documentation: https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit
Design discussion of the patch: https://developer.blender.org/T44197
Differential Revision: https://developer.blender.org/D1200
2015-05-09 | Cycles: Communicate number of closures and nodes feature set to the device | Sergey Sharybin
This way device can actually make a decision of how it can optimize the kernel in order to make it most efficient.
2015-05-09 | Cycles: Change the way how we pass requested capabilities to the device | Sergey Sharybin
Previously we only had an experimental flag passed to the device's load_kernel(), which was all fine. But since we're going to have some extra parameters passed there, it makes sense to wrap them into a single struct, which will make it easier to pass stuff around.
2015-03-29 | Optionally use c++11 stuff instead of boost in cycles where possible. We do and continue to depend on boost though | Martijn Berger
Reviewers: dingto, sergey
Reviewed By: sergey
Subscribers: #cycles
Differential Revision: https://developer.blender.org/D1185
2015-03-27 | Cycles: Code cleanup, prepare for strict C++ flags | Sergey Sharybin
2015-01-06 | Cycles: Report CPU and CUDA capabilities to system info operator | Sergey Sharybin
For CPU it gives the available instruction sets (SSE, AVX and so on). For GPU CUDA it reports most of the attribute values returned by cuDeviceGetAttribute(). Ideally we should only use the set of those which are driver-specific (so we don't clutter system info with values which we can get from GPU specifications and be sure they stay the same because the driver can't affect them).
2014-12-25 | Cleanup: Fix Cycles Apache header. | Thomas Dinges
This was already mixed a bit, but the dot belongs there.
2014-07-25 | Cycles Bake: show progress bar during bake | Dalai Felinto
Baking progress preview is not possible, in part due to the way the API was designed. But at least you get to see the progress bar while baking.
Reviewers: sergey
Differential Revision: https://developer.blender.org/D656
2014-05-11 | Cycles / CUDA: Increase maximum image textures on GPU. | Thomas Dinges
Instead of 95, we can use 145 images now. This only affects Kepler and above (sm_30, sm_35 and sm_50). This can be increased further if needed, but let's first test if this does not come with a performance impact. Originally developed during my GSoC 2013.
2014-03-26 | Fix T39420: Cycles viewport/preview flickers, when moving mouse across editors | Sergey Sharybin
The issue was caused by wrong usage of the OCIO GLSL binding API. To make it work properly on pre-GLSL-1.3 drivers, the shader must be enabled after the texture is bound to the OpenGL context. Otherwise it wouldn't know the proper texture size. This is actually a regression in 2.70 and should be ported to 'a'.
2014-03-08 | Add support for multiple interpolation modes on cycles image textures | Martijn Berger
All textures are currently sampled bi-linearly, with the exception of OSL, where texture sampling is fixed and set to smart bi-cubic. This patch adds user control to this setting.
Added:
- bits to DNA / RNA in the form of an enum for supporting multiple interpolation types
- changes to the image texture node drawing code (add enum)
- to ImageManager (this needs to know to allocate a second texture when the interpolation type is different)
- to node compiler (pass on interpolation type)
- to device tex_alloc, which also needs to get the concept of multiple interpolation types
- implementation for doing non-interpolated lookup for CUDA and CPU
- implementation where we pass this along to OSL (this makes OSL also do linear until I add smartcubic to the interface / DNA / RNA)
Reviewers: brecht, dingto
Reviewed By: brecht
CC: dingto, venomgfx
Differential Revision: https://developer.blender.org/D317
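
For readers unfamiliar with the two lookup modes, the following Python sketch contrasts a non-interpolated (closest texel) lookup with a bilinear one on a tiny single-channel texture. It is a generic illustration, not the Cycles kernel code: the function names are hypothetical, texel centres are assumed to sit at half-integer positions, and out-of-range coordinates are clamped to the edge as an assumed extension mode.

```python
import math


def _texel(tex, x, y):
    """Fetch a texel, clamping coordinates to the edge (an assumed extension mode)."""
    height, width = len(tex), len(tex[0])
    x = min(max(x, 0), width - 1)
    y = min(max(y, 0), height - 1)
    return tex[y][x]


def lookup_closest(tex, u, v):
    """Non-interpolated lookup: return the texel that contains (u, v)."""
    height, width = len(tex), len(tex[0])
    return _texel(tex, math.floor(u * width), math.floor(v * height))


def lookup_linear(tex, u, v):
    """Bilinear lookup: blend the four texels whose centres surround (u, v)."""
    height, width = len(tex), len(tex[0])
    fx = u * width - 0.5
    fy = v * height - 0.5
    x0, y0 = math.floor(fx), math.floor(fy)
    tx, ty = fx - x0, fy - y0
    top = (1 - tx) * _texel(tex, x0, y0) + tx * _texel(tex, x0 + 1, y0)
    bottom = (1 - tx) * _texel(tex, x0, y0 + 1) + tx * _texel(tex, x0 + 1, y0 + 1)
    return (1 - ty) * top + ty * bottom


# A 2x2 texture that is black on the left and white on the right.
tex = [[0.0, 1.0],
       [0.0, 1.0]]
print(lookup_closest(tex, 0.4, 0.5), lookup_closest(tex, 0.6, 0.5))  # hard edge
print(lookup_linear(tex, 0.5, 0.5))                                   # smooth blend
```

The closest lookup jumps from one texel value to the next at the texel boundary, while the linear lookup ramps smoothly between texel centres.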
2013-12-07 | Cycles: network render code updated for latest changes and improved | Martijn Berger
This actually works somewhat now, although viewport rendering is broken and any kind of network error or connection failure will kill Blender.
* Experimental WITH_CYCLES_NETWORK cmake option
* Networked Device is shown as an option next to CPU and GPU Compute
* Various updates to work with the latest Cycles code
* Locks and thread safety for RPC calls and tiles
* Refactored pointer mapping code
* Fix error in CPU brand string retrieval code
This includes work by Doug Gale, Martijn Berger and Brecht Van Lommel.
Reviewers: brecht
Differential Revision: http://developer.blender.org/D36
2013-08-18 | Cycles: relicense GNU GPL source code to Apache version 2.0. | Brecht Van Lommel
More information in this post: http://code.blender.org/ Thanks to all contributors for giving their permission!
2012-12-23 | Cycles: deal a bit better with errors when CUDA runs out of memory, try to avoid crashes. | Brecht Van Lommel
2012-11-08 | Fix #33107: cycles fixed threads 1 was still having two cores do work, | Brecht Van Lommel
because main thread works as well.
2012-11-05 | Cycles: memory usage report | Sergey Sharybin
This commit adds memory usage information while rendering. It reports memory used by the device, meaning:
- For CPU it'll report real memory consumption
- For GPU rendering it'll report GPU memory consumption, but it'll also mean the same memory is used on the host side.
This shows memory requested by Cycles, not memory really allocated on a device. Real memory usage might be higher because of memory fragmentation or an optimistic memory allocator. There's really nothing we can do against this.
Also, in contrast with Blender Internal's renderer, Cycles memory usage does not include memory used by the scene; only memory needed by Cycles itself will be displayed. So don't freak out if memory usage reported by Cycles is much lower than Blender Internal's.
This commit also adds the RenderEngine.update_memory_stats callback, which is used to report memory consumption from an external engine to Blender. This information is used to generate the information line after rendering is finished.
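
The distinction between requested and actually allocated memory can be pictured with a simple counter that is updated on every allocation and free request. This Python sketch is only an illustration under stated assumptions: the `MemoryStats` class, its method names and the `Mem:…M, Peak:…M` output format are hypothetical and are not taken from the Cycles or Blender sources.

```python
MIB = 1024 * 1024


class MemoryStats:
    """Tracks memory *requested* by the renderer, not what the device really allocated."""

    def __init__(self):
        self.mem_used = 0
        self.mem_peak = 0

    def mem_alloc(self, size):
        """Record a request for `size` bytes and update the peak."""
        self.mem_used += size
        self.mem_peak = max(self.mem_peak, self.mem_used)

    def mem_free(self, size):
        """Record that `size` previously requested bytes were released."""
        self.mem_used -= size


def format_memory(stats):
    """Format the counters for a status line (the format itself is an assumption)."""
    return f"Mem:{stats.mem_used / MIB:.2f}M, Peak:{stats.mem_peak / MIB:.2f}M"


stats = MemoryStats()
stats.mem_alloc(4 * MIB)   # e.g. a texture
stats.mem_alloc(2 * MIB)   # e.g. a render buffer
stats.mem_free(4 * MIB)    # texture released
print(format_memory(stats))
```

Because only requests are counted, fragmentation or allocator overhead on the device does not show up in these numbers, which matches the caveat in the commit message.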