Welcome to mirror list, hosted at ThFree Co, Russian Federation.

git.blender.org/blender.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-07-28Cleanup: simplifications and consistency for vector typesBrecht Van Lommel
* OneAPI: remove separate float3 definition * OneAPI: disable operator[] to match other GPUs * OneAPI: make int3 compact to match other GPUs * Use #pragma once * Add __KERNEL_NATIVE_VECTOR_TYPES__ to simplify checks * Remove unused vector3
2022-07-27Cycles: switch Cycles triangle barycentric convention to match Embree/OptiXBrecht Van Lommel
Simplifies intersection code a little and slightly improves precision regarding self intersection. The parametric texture coordinate in shader nodes is still the same as before for compatibility.
2022-07-27Cleanup: remove unnecessary bvh_instance_motion_popBrecht Van Lommel
2022-07-27Fix broken BVH2 on CPU after recent changesBrecht Van Lommel
Runtime switching between Embree and BVH2 got lost.
2022-07-27Cycles oneAPI: simplify num_concurrent_states selectionXavier Hallade
The number of Execution Units and resident "threads" (simd width * threads per EUs) are now exposed and used to select the number of states using a simplified heuristic.
2022-07-26Cleanup: spelling in commentsCampbell Barton
2022-07-26Fix Cycles Metal build errors after recent changesBrecht Van Lommel
float8 is a reserved type in Metal, but is not implemented. So rename to float8_t for now. Also move back intersection handlers to kernel.metal, they can't be in the class that encapsulates the other Metal kernel functions.
2022-07-25Cycles: Nishita Sky: Fix sun disk imprecision for large elevationClément Foucault
The issue was introduced by rBad5e3d30a2d2 which made possible to use unbounded elevation angle. In order to not touch the shading code, we just remap the value to the expected range the shading code expects. This means that elevation angles above +/-PI/2 effectively flip the sun rotation angle.
2022-07-25Cleanup: remove __KERNEL_CPU__Brecht Van Lommel
This was tested in some places to check if code was being compiled for the CPU, however this is only defined in the kernel. Checking __KERNEL_GPU__ always works.
2022-07-25Cycles: add math functions for float8Andrii Symkin
This patch adds required math functions for float8 to make it possible using float8 instead of float3 for color data. Differential Revision: https://developer.blender.org/D15525
2022-07-25Cleanup: move device BVH code to kernel/device/*/bvh.hBrecht Van Lommel
Having the OptiX/MetalRT/Embree/MetalRT implementations all in one file with many #ifdefs became too confusing. Instead split it up per device, and also move it together with device specific hit/filter/intersect functions and associated data types.
2022-07-25Fix wrong Cycles SSS intersection distance after ray distance changesBrecht Van Lommel
No need anymore to have a difference between CPU/GPU, all distances remain in world space.
2022-07-25Cycles: simplify handling of ray distance in GPU renderingBrecht Van Lommel
All our intersections functions now work with unnormalized ray direction, which means we no longer need to transform ray distance between world and object space, they can all remain in world space. There doesn't seem to be any real performance difference one way or the other, but it does simplify the code.
2022-07-25Cycles: more closely match some math and intersection operations in EmbreeBrecht Van Lommel
This helps with debugging, and gives a slightly closer match between CPU and CUDA/HIP/Metal renders when it comes to ray tracing precision.
2022-07-25Fix build error with WITH_CYCLES_KERNEL_NATIVE_ONLY on macOS ArmBrecht Van Lommel
-march=native is not supported for all architectures.
2022-07-24Fix T98367: Light group passes do not work when shadow catcher is usedLukas Stockner
2022-07-21Cleanup: spelling in comments, typos in tool-tipsCampbell Barton
2022-07-20Point Cloud: Remove redundant custom data pointersHans Goudey
Similar to e9f82d3dc7eebadcc52, but for point clouds instead. Differential Revision: https://developer.blender.org/D15487
2022-07-20Curves: Remove redundant custom data pointersHans Goudey
These mutable pointers present problems with ownership in relation to proper copy-on-write for attributes. The simplest solution is to just remove them and retrieve the layers from `CustomData` when they are needed. This also removes the complexity and redundancy of having to update the pointers as the curves change. A similar change will apply to meshes and point clouds. One downside of this change is that it makes random access with RNA slower. However, it's simple to just use the RNA attribute API instead, which is unaffected. In this patch I updated Cycles to do that. With the future attribute CoW changes, this generic approach makes sense because Cycles can just request ownership of the existing arrays. Differential Revision: https://developer.blender.org/D15486
2022-07-19Cleanup: Remove compile option for curves objectHans Goudey
After becb1530b1c81a408e20 the new curves object type isn't hidden behind an experimental flag anymore, and other areas depend on this, so disabling curves at compile time doesn't make sense anymore.
2022-07-18Cleanup: change internal Cycles compact BVH default to match UIBrecht Van Lommel
2022-07-15Cycles: refactor rays to have start and end distance, fix precision issuesBrecht Van Lommel
For transparency, volume and light intersection rays, adjust these distances rather than the ray start position. This way we increment the start distance by the smallest possible float increment to avoid self intersections, and be sure it works as the distance compared to be will be exactly the same as before, due to the ray start position and direction remaining the same. Fix T98764, T96537, hair ray tracing precision issues. Differential Revision: https://developer.blender.org/D15455
2022-07-15Fix Cycles MetalRT error after recent specialization changesBrecht Van Lommel
2022-07-15Cleanup: compiler warningBrecht Van Lommel
2022-07-15Cycles: generalize shader sorting / locality heuristic to all GPU devicesBrecht Van Lommel
This was added for Metal, but also gives good results with CUDA and OptiX. Also enable it for future Apple GPUs instead of only M1 and M2, since this has been shown to help across multiple GPUs so the better bet seems to enable rather than disable it. Also moves some of the logic outside of the Metal device code, and always enables the code in the kernel since other devices don't do dynamic compile. Time per sample with OptiX + RTX A6000: new old barbershop_interior 0.0730s 0.0727s bmw27 0.0047s 0.0053s classroom 0.0428s 0.0464s fishy_cat 0.0102s 0.0108s junkshop 0.0366s 0.0395s koro 0.0567s 0.0578s monster 0.0206s 0.0223s pabellon 0.0158s 0.0174s sponza 0.0088s 0.0100s spring 0.1267s 0.1280s victor 0.0524s 0.0531s wdas_cloud 0.0817s 0.0816s Ref D15331, T87836
2022-07-15Cycles: Apple Silicon optimization to specialize intersection kernelsMichael Jones
The Metal backend now compiles and caches a second set of kernels which are optimized for scene contents, enabled for Apple Silicon. The implementation supports doing this both for intersection and shading kernels. However this is currently only enabled for intersection kernels that are quick to compile, and already give a good speedup. Enabling this for shading kernels would be faster still, however this also causes a long wait times and would need a good user interface to control this. M1 Max samples per minute (macOS 13.0): PSO_GENERIC PSO_SPECIALIZED_INTERSECT PSO_SPECIALIZED_SHADE barbershop_interior 83.4 89.5 93.7 bmw27 1486.1 1671.0 1825.8 classroom 175.2 196.8 206.3 fishy_cat 674.2 704.3 719.3 junkshop 205.4 212.0 257.7 koro 310.1 336.1 342.8 monster 376.7 418.6 424.1 pabellon 273.5 325.4 339.8 sponza 830.6 929.6 1142.4 victor 86.7 96.4 96.3 wdas_cloud 111.8 112.7 183.1 Code contributed by Jason Fielder, Morteza Mostajabodaveh and Michael Jones Differential Revision: https://developer.blender.org/D14645
2022-07-15Cycles: keep track of SVM nodes used in kernelsMichael Jones
To be used for specialization in Metal, to automatically leave out unused nodes from the kernel. Ref D14645
2022-07-15Cycles: refactor to move part of KernelData definition to template headerBrecht Van Lommel
To be used for specialization on Metal in a following commit, turning these members into compile time constants. Ref D14645
2022-07-15Render: camera depth of field support for armature bone targetsDamien Picard
This is useful when using an armature as a camera rig, to avoid creating and targetting an empty object. Differential Revision: https://developer.blender.org/D7012
2022-07-15Cleanup: make formatBrecht Van Lommel
2022-07-14Fix Cycles MNEE wrong results with area light spreadOlivier Maury
When the solve is successful, the light sample needs to be updated since the effective shading point is now on the last refractive interface. Spread was not taken into account, creating false caustics. Differential Revision: https://developer.blender.org/D15449
2022-07-14Cleanup: replace state flow macros in the kernel with functionsBrecht Van Lommel
2022-07-14Cycles: add presets to the Performance panelBrecht Van Lommel
With choices Default, Lower Memory and Faster Render. For convenience, and to help communicate what the various settings do. Differential Revision: https://developer.blender.org/D15446
2022-07-14Cycles: Improve cache usage on Apple GPUs by chunking active indicesMichael Jones
This patch partitions the active indices into chunks prior to sorting by material in order to tradeoff some material coherence for better locality. On Apple Silicon GPUs (particularly higher end M1-family GPUs), we observe overall render time speedups of up to 15%. The partitioning is implemented by repeating the range of `shader_sort_key` for each partition, and encoding a "locator" key which distributes the indices into sorted chunks. Reviewed By: brecht Differential Revision: https://developer.blender.org/D15331
2022-07-14Cleanup: spelling in commentsCampbell Barton
Also remove duplicate comments in bmesh_log.h, caused by automated comment relocation in [0]. [0]: c4e041da23b9c45273fcd4874308c536b6a315d1
2022-07-12Cycles: Make not-compact BVH the default for embreeXavier Hallade
Measurements shown on average a 1.08x speedup for a 1.04x increase in memory usage which is an acceptable trade off for a default setting, although discoverability of such settings influencing memory usage could be improved. Reviewed By: brecht Differential Revision: https://developer.blender.org/D15429
2022-07-12Cycles: fix and enable JIT oneAPI CentOS7 builds for drivers 23570+Xavier Hallade
The current specific CentOS7 workaround we have for AoT, which is to disable __FAST_MATH__ by using -fhonor-nans, now also fixes the compilation issue for JIT as well since at least driver 23570.
2022-07-11Fix T99218: light group add button should be disabled when name is emptyBrecht Van Lommel
Previously it was inactive but still clickable. Ref D15316
2022-07-08Curves: use consistent default radius for Cycles, Eevee, Set Curve Radius nodeBrecht Van Lommel
To avoid Cycles not showing any hair by default, and to avoid very slow render due to many overlaps with the previous 1 meter default in the node. Fixes T97584, T99319 Differential Revision: https://developer.blender.org/D15405
2022-07-08Cycles: enable oneAPI in Linux release buildsXavier Hallade
with a very high min-driver version requirement, placeholder until JIT CentOS runtime compilation issue gets fixed in a defined version. min-driver version check can be worked around by setting CYCLES_ONEAPI_ALL_DEVICES environment variable.
2022-07-06Cycles oneAPI: Remove direct dependency on Level-ZeroXavier Hallade
We used it only to access device id for explicitly allowing Arc GPUs. It made the backend require ze_loader.dll which could be problematic if we end up using direct linking. I've replaced filtering based on PCI device id by using other HW properties instead (EUs, threads per EU), that are now available through Level-Zero.
2022-07-06Cleanup: fix comments in oneAPI kernel.cppXavier Hallade
2022-07-06Cycles: Improve an occupancy for Intel GPUsNikita Sirgienko
Initially oneAPI implementation have waited after each memory operation, even if there was no need for this. Now, the implementation will wait only if it is really necessary - it have improved performance noticeble for some scenes and a bit for the rest of them.
2022-07-01Cycles: fix support for multiple Intel GPUsXavier Hallade
Identical Intel GPUs ended up with the same id. Added PCI BDF to the id to make it unique.
2022-07-01Cleanup: add missing license headers in Cycles oneAPI implementationXavier Hallade
2022-06-30Fix broken Cycles performance benchmark after recent logging changesBrecht Van Lommel
Ensure full render report is printed with default verbosity.
2022-06-30Cycles: add more math functions for float4Andrii Symkin
Add more math functions for float4 to make them on par with float3 ones. It makes it possible to change the types of float3 variables to float4 without additional work. Differential Revision: https://developer.blender.org/D15318
2022-06-30Cleanup: formatCampbell Barton
2022-06-30Cleanup: spelling in commentsCampbell Barton
2022-06-29Cycles: Add support for rendering on Intel GPUs using oneAPIXavier Hallade
This patch adds a new Cycles device with similar functionality to the existing GPU devices. Kernel compilation and runtime interaction happen via oneAPI DPC++ compiler and SYCL API. This implementation is primarly focusing on Intel® Arc™ GPUs and other future Intel GPUs. The first supported drivers are 101.1660 on Windows and 22.10.22597 on Linux. The necessary tools for compilation are: - A SYCL compiler such as oneAPI DPC++ compiler or https://github.com/intel/llvm - Intel® oneAPI Level Zero which is used for low level device queries: https://github.com/oneapi-src/level-zero - To optionally generate prebuilt graphics binaries: Intel® Graphics Compiler All are included in Linux precompiled libraries on svn: https://svn.blender.org/svnroot/bf-blender/trunk/lib The same goes for Windows precompiled binaries but for the graphics compiler, available as "Intel® Graphics Offline Compiler for OpenCL™ Code" from https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html, for which path can be set as OCLOC_INSTALL_DIR. Being based on the open SYCL standard, this implementation could also be extended to run on other compatible non-Intel hardware in the future. Reviewed By: sergey, brecht Differential Revision: https://developer.blender.org/D15254 Co-authored-by: Nikita Sirgienko <nikita.sirgienko@intel.com> Co-authored-by: Stefan Werner <stefan.werner@intel.com>