git.blender.org/blender.git
Age | Commit message | Author
2017-11-05 | Cycles: reserve CUDA local memory ahead of time. | Brecht Van Lommel
This way we can log the amount of memory used, and it will be important for host mapped memory support.
2017-11-04 | Code refactor: replace CUDA array with linear memory for 1D and 2D textures. | Brecht Van Lommel
This is a prerequisite for getting host memory allocation to work. There appears to be no support for 3D textures using host memory. The original version of this code was written by Stefan Werner for D2056.
2017-11-03 | Fix T53247: mixed CPU + GPU render wrong texture limits. | Brecht Van Lommel
2017-11-02 | Cycles: Add another limit to OpenCL memory usage | Mai Lavelle
Some drivers may report very large allocation sizes, which could cause unnecessary memory usage. This is now limited to 2 GB, which should still be enough to get the needed performance benefits without waste.
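The limit described in this message can be sketched as a simple clamp. This is a minimal illustration, not the Cycles code: the helper name `clamp_reported_alloc_size` and the constant `kMaxAllocLimit` are hypothetical, and the 2 GB figure is assumed here to mean 2 GiB.

```cpp
#include <algorithm>
#include <cstdint>

// Assumed ceiling of 2 GiB; the commit message only says "2gb".
constexpr std::uint64_t kMaxAllocLimit = 2ull * 1024 * 1024 * 1024;

// Hypothetical helper: never trust a driver-reported maximum allocation
// size beyond the fixed ceiling above.
inline std::uint64_t clamp_reported_alloc_size(std::uint64_t reported_bytes)
{
    return std::min(reported_bytes, kMaxAllocLimit);
}
```

A driver that reports 16 GiB would be treated as if it reported 2 GiB, while smaller reported values pass through unchanged.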
2017-10-25 | Fix one more assert being triggered due to recent changes. | Brecht Van Lommel
2017-10-25 | Code refactor: remove MEM_WRITE_ONLY, always use MEM_READ_WRITE. | Brecht Van Lommel
It's unlikely the driver can do useful optimizations with this, and if we sum multiple samples we are reading from the memory anyway.
2017-10-24 | Fix T53146: incomplete multi GPU and CPU + GPU memory statistics. | Brecht Van Lommel
Partly due to recent changes, partly an old bug.
2017-10-24 | Cycles: Fix compilation in debug mode | Sergey Sharybin
Please check compilation before committing refactor changes!
2017-10-24 | Cycles: Fix compilation error without C++11 | Sergey Sharybin
2017-10-24 | Fix T53134: denoising with CPU + GPU render leaves some tiles noisy. | Brecht Van Lommel
2017-10-24 | Code refactor: move more memory allocation logic into device API. | Brecht Van Lommel
* Remove tex_* and pixels_* functions, replace by mem_*.
* Add MEM_TEXTURE and MEM_PIXELS as memory types recognized by devices.
* No longer create device_memory and call mem_* directly, always go through device_only_memory, device_vector and device_pixels.
2017-10-24 | Code refactor: use device_only_memory and device_vector in more places. | Brecht Van Lommel
2017-10-24 | Code refactor: store device/interp/extension/type in each device_memory. | Brecht Van Lommel
2017-10-24 | Code refactor: pass device to scene, check OSL with device info. | Brecht Van Lommel
2017-10-21 | Code refactor: avoid some unnecessary device memory copying. | Brecht Van Lommel
2017-10-21 | Cycles: combined CPU + GPU rendering support. | Brecht Van Lommel
CPU rendering will be restricted to a BVH2, which is not ideal for raytracing performance but can be shared with the GPU. Decoupled volume shading will be disabled to match GPU volume sampling. The number of CPU rendering threads is reduced to leave one core dedicated to each GPU. Viewport rendering will still use only the GPU. So along with the BVH2 usage, perfect scaling should not be expected.
Go to User Preferences > System to enable the CPU to render alongside the GPU.
Differential Revision: https://developer.blender.org/D2873
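The thread reduction mentioned in this message (one core left free per GPU) could look roughly like the sketch below. The function name `cpu_render_threads` is hypothetical, and clamping the result at zero is an assumption; the log does not say how Cycles handles machines with more GPUs than cores.

```cpp
#include <algorithm>

// Hypothetical helper: number of CPU render threads when each GPU gets
// one dedicated core. Clamping at zero is an assumption of this sketch.
inline int cpu_render_threads(int cpu_cores, int num_gpus)
{
    return std::max(cpu_cores - num_gpus, 0);
}
```

With 8 cores and 2 GPUs this yields 6 render threads.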
2017-10-19 | Cycles: Add extra logging in CUDA device detection code | Sergey Sharybin
2017-10-18 | Fix T53098, T53079: OpenCL world texture errors after recent changes. | Brecht Van Lommel
2017-10-11 | Cycles: Fix possible race condition when initializing devices list | Sergey Sharybin
2017-10-08 | Cycles: schedule more work for non-display and compute preemption CUDA cards. | Brecht Van Lommel
This change affects CUDA GPUs not connected to a display or connected to a display but supporting compute preemption so that the display does not freeze. I couldn't find an official list, but compute preemption seems to be only supported with GTX 1070+ and Linux (not GTX 1060- or Windows).
This helps improve small tile rendering performance further if there are sufficient samples x number of pixels in a single tile to keep the GPU busy.
2017-10-08 | Fix T53017: Cycles not detecting AMD GPU when there is an NVidia GPU too. | Mathieu Menuet
Best guess is that cuInit() somehow interferes with the AMD graphics driver on Windows, and switching the initialization order to do OpenCL first seems to solve the issue.
2017-10-08 | Code refactor: use DeviceInfo to enable QBVH and decoupled volume shading. | Brecht Van Lommel
2017-10-07 | Code refactor: make texture code more consistent between devices. | Brecht Van Lommel
* Use common TextureInfo struct for all devices, except CUDA fermi.
* Move image sampling code to kernels/*/kernel_*_image.h files.
* Use arrays for data textures on Fermi too, so device_vector<Struct> works.
2017-10-05 | Code refactor: split displace/background into separate kernels, remove luma. | Brecht Van Lommel
2017-10-05 | Fix incorrect CUDA remaining time estimate after previous commit. | Brecht Van Lommel
2017-10-04 | Cycles: CUDA faster rendering of small tiles, using multiple samples like OpenCL. | Brecht Van Lommel
The work size is still very conservative, and this doesn't help for progressive refine. For that we will need to render multiple tiles at the same time. But this should already help for denoising renders that require too much memory with big tiles, and just generally soften the performance dropoff with small tiles.
Differential Revision: https://developer.blender.org/D2856
2017-10-04 | Code refactor: use split variance calculation for mega kernels too. | Brecht Van Lommel
There is no significant difference in denoised benchmark scenes and denoising ctests, so might as well make it all consistent.
2017-10-04 | Code refactor: remove rng_state buffer and compute hash on the fly. | Brecht Van Lommel
A little faster on some benchmark scenes, a little slower on others, seems about performance neutral on average and saves a little memory.
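The idea of replacing a stored per-pixel RNG state with a hash computed on demand can be sketched as below. The log does not say which hash or which inputs Cycles uses, so everything here is an assumption: the mixer is a commonly cited 32-bit integer hash attributed to Thomas Wang, and `pixel_seed` is a hypothetical name.

```cpp
#include <cstdint>

// Example integer mixer (a widely circulated 32-bit hash attributed to
// Thomas Wang). Chosen only for illustration.
inline std::uint32_t wang_hash(std::uint32_t a)
{
    a = (a ^ 61u) ^ (a >> 16);
    a *= 9u;
    a ^= a >> 4;
    a *= 0x27d4eb2du;
    a ^= a >> 15;
    return a;
}

// Hypothetical helper: derive a per-pixel seed from the pixel coordinates
// and a scene seed instead of reading it from a preallocated buffer.
inline std::uint32_t pixel_seed(std::uint32_t x, std::uint32_t y, std::uint32_t scene_seed)
{
    return wang_hash(x + wang_hash(y + wang_hash(scene_seed)));
}
```

Because the seed is a pure function of its inputs, no per-pixel state has to be allocated or copied to the device, which matches the small memory saving the message mentions.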
2017-10-04 | Code refactor: add WorkTile struct for passing work to kernel. | Brecht Van Lommel
This makes sharing some code between mega/split in following commits a bit easier, and also paves the way for rendering multiple tiles later.
2017-09-27 | Code refactor: simplify CUDA context push/pop. | Brecht Van Lommel
Makes it possible to call a function like mem_alloc() when the context is already active. Also fixes some missing pops in case of errors.
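One common way to avoid missing pops on error paths is a scope guard that pushes in its constructor and pops in its destructor. The sketch below is not the Cycles code: `FakeContextStack`, `ScopedContext` and `alloc_with_context` are hypothetical stand-ins, and in real CUDA code the push and pop would be driver API calls such as `cuCtxPushCurrent` and `cuCtxPopCurrent`.

```cpp
#include <vector>

// Stand-in for the driver's context stack, so the sketch runs without CUDA.
struct FakeContextStack {
    std::vector<int> stack;
    void push(int ctx) { stack.push_back(ctx); }
    void pop() { stack.pop_back(); }
};

// Pushes a context on construction and pops it on destruction, so every
// exit path (including early error returns) restores the stack.
class ScopedContext {
public:
    ScopedContext(FakeContextStack &stack, int ctx) : stack_(stack) { stack_.push(ctx); }
    ~ScopedContext() { stack_.pop(); }
    ScopedContext(const ScopedContext &) = delete;
    ScopedContext &operator=(const ScopedContext &) = delete;

private:
    FakeContextStack &stack_;
};

// Hypothetical allocation routine: safe to call while the same context is
// already active, because pushes and pops are strictly paired.
inline bool alloc_with_context(FakeContextStack &stack, int ctx, bool simulate_failure)
{
    ScopedContext scope(stack, ctx);
    if (simulate_failure) {
        return false;  // early return; the destructor still pops
    }
    return true;
}
```

Nesting works because each guard only undoes its own push, so calling `alloc_with_context` inside another `ScopedContext` leaves the outer context in place.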
2017-08-30 | Cycles: Fix build with networking enabled | Mai Lavelle
2017-08-25 | Cycles: Correct logging of used CPU intrinsics | Sergey Sharybin
2017-08-21 | Cycles: attempt to recover from crashing CUDA/OpenCL drivers on Windows. | Brecht Van Lommel
I don't know if this will actually work, needs testing. Ref T52064.
2017-08-12 | Code cleanup: fix warning and improve terminology. | Brecht Van Lommel
2017-08-09 | Cycles: Remove ulong usage | Sergey Sharybin
This is a bit confusing, especially when one mixes OpenCL code, where ulong is equal to uint64_t, with CPU side code, where the name suggests ulong is something else. This commit makes it so we use an explicit name, common on all platforms.
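The confusion arises because OpenCL C defines `ulong` as a 64-bit unsigned integer, while the host type `unsigned long` is 32 bits on 64-bit Windows and 64 bits on typical 64-bit Linux and macOS builds. A small sketch of the difference, with the hypothetical helper name `host_unsigned_long_bits`:

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// The explicit name has the same width on every platform that provides it.
static_assert(sizeof(std::uint64_t) * CHAR_BIT == 64, "uint64_t is always 64 bits");

// The width of the host's unsigned long depends on the platform's data model.
inline std::size_t host_unsigned_long_bits()
{
    return sizeof(unsigned long) * CHAR_BIT;
}
```

Code shared between the host and OpenCL kernels therefore behaves consistently only when it uses the fixed-width name.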
2017-08-09 | Cycles: Proper fix for recent OpenCL image crash | Mai Lavelle
The problem was that some code checks whether device_pointer is null, and the new allocator wasn't setting the pointer to anything because it tracks the memory location separately. Setting the pointer to non-null keeps all users of device_pointer happy.
2017-08-08 | Cycles: More fixes for Windows 32 bit | Sergey Sharybin
- Apparently MSVC does not support compound literals in C++ (at least by the looks of it).
- Not sure how opencl_device_assert was managing to set a protected property of the Device class.
2017-08-08 | Cycles: Fix compilation error without C++11 | Sergey Sharybin
Come on folks, nobody considered master a C++11-only branch. Such a decision has to be made officially and will involve changes in quite a few infrastructure-related areas.
2017-08-08 | Cycles: Pack kernel textures into buffers for OpenCL | Mai Lavelle
Image textures were being packed into a single buffer for OpenCL, which limited the amount of memory available for images to the size of one buffer (usually 4 GB on AMD hardware). By packing textures into multiple buffers that limit is removed, while simultaneously reducing the number of buffers that need to be passed to each kernel. Benchmarks were within 2%.
Fixes T51554.
Differential Revision: https://developer.blender.org/D2745
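Packing textures into several buffers can be illustrated with a simple next-fit placement: fill the current buffer until a texture no longer fits, then open a new one. This is only a sketch under assumptions; the log does not describe the allocator Cycles uses, and `Placement` and `pack_textures` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

struct Placement {
    std::size_t buffer;    // index of the buffer that holds the texture
    std::uint64_t offset;  // byte offset of the texture inside that buffer
};

// Next-fit packing: textures are placed in order, and a new buffer is
// opened whenever the current one cannot hold the next texture.
inline std::vector<Placement> pack_textures(const std::vector<std::uint64_t> &sizes,
                                            std::uint64_t buffer_capacity)
{
    std::vector<Placement> placements;
    std::size_t current = 0;
    std::uint64_t used = 0;
    for (std::uint64_t size : sizes) {
        if (size > buffer_capacity) {
            throw std::invalid_argument("texture larger than one buffer");
        }
        if (size > buffer_capacity - used) {
            ++current;  // does not fit, start a fresh buffer
            used = 0;
        }
        placements.push_back({current, used});
        used += size;
    }
    return placements;
}
```

The total image memory is then bounded by the number of buffers rather than by the size of a single buffer, at the cost of each lookup needing a buffer index as well as an offset.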
2017-08-07 | Cycles: Cleanup, space after keyword | Sergey Sharybin
2017-08-07 | Code refactor: split defines into separate header, changes to SSE type headers. | Brecht Van Lommel
I need to use some macros defined in util_simd.h for float3/float4, to emulate SSE4 instructions on SSE2. But due to issues with the order of header includes this was not possible, so this commit does some refactoring to make it work.
Differential Revision: https://developer.blender.org/D2764
2017-08-05 | Cycles: CUDA split performance tweaks, still far from megakernel. | Brecht Van Lommel
On Pabellon, 25.8s mega, 35.4s split before, 32.7s split after.
2017-07-11 | Cycles: Disable OpenCL clFlush workarounds | Sergey Sharybin
This is something which was reported to work fine by Mai, Benjamin and confirmed by myself. Disabling this workaround gains us some speedup:

                     Before      Now
bmw27                04:28.42    04:07.79
classroom            09:26.48    08:54.53
fishy_cat            08:44.01    08:18.70
koro                 09:17.98    08:57.18
pavillon_barcelone   12:26.64    11:52.81

Test environment is:
- Ubuntu 16.04, with all updates installed
- AMD RX 480 GPU
- amdgpu pro driver version 17.10-450821
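To put the render times above in relative terms, they can be converted to seconds and compared. The helpers below are hypothetical names used only for this arithmetic.

```cpp
// Convert a minutes:seconds render time to seconds.
inline double to_seconds(int minutes, double seconds)
{
    return minutes * 60.0 + seconds;
}

// Percentage of render time saved when going from before_s to after_s.
inline double percent_saved(double before_s, double after_s)
{
    return 100.0 * (before_s - after_s) / before_s;
}
```

For bmw27, `percent_saved(to_seconds(4, 28.42), to_seconds(4, 7.79))` comes out at roughly 7.7%.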
2017-07-07 | Cycles: Fix ambiguity in call of min() function | Sergey Sharybin
2017-07-06 | Cycles: Add artificial memory limit debug option for OpenCL | Mai Lavelle
2017-07-06 | Cycles: Don't allow global size to fall to zero | Mai Lavelle
2017-07-06 | Cycles: Detect out of memory before buffer allocation in OpenCL devices | Mai Lavelle
2017-07-05 | Cycles: Pass string by const reference rather than by value | Sergey Sharybin
Some of the functions might have been inlined, but for others I don't see how that was possible (I don't think virtual functions can be inlined here). In any case, it is better to be explicitly optimal in the code.
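The point about virtual functions can be illustrated with a small sketch. The class names `DeviceBase` and `ExampleDevice` are hypothetical and not taken from Cycles; the example only shows the signature style the message argues for.

```cpp
#include <cstddef>
#include <string>

struct DeviceBase {
    virtual ~DeviceBase() = default;
    // A call through a base reference is usually dispatched at run time, so
    // the compiler often cannot inline it and elide a by-value copy.
    // Taking a const reference avoids the copy regardless.
    virtual std::size_t name_length(const std::string &name) const = 0;
};

struct ExampleDevice : DeviceBase {
    std::size_t name_length(const std::string &name) const override
    {
        return name.size();
    }
};
```

With a `std::string` parameter taken by value, each call would copy the caller's string, and for longer strings that typically means a heap allocation.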
2017-07-03 | Cycles: Add missing split kernel to CPUDevice | Lukas Stockner
2017-06-30 | Cycles: Disable baking in mega kernel when not in use to improve build times | Mai Lavelle