author | Patrick Mours <pmours@nvidia.com> | 2021-11-10 16:37:15 +0300 |
---|---|---|
committer | Patrick Mours <pmours@nvidia.com> | 2021-11-10 17:49:50 +0300 |
commit | f56562043521a5c160585aea3f28167b4d3bc77d (patch) | |
tree | af1e155ac9b25b6ad15acbe8dba8bb8a2d8edebf /intern/cycles/kernel/CMakeLists.txt | |
parent | a6e4cb092eb43b74379f99bdf82baab0db21603e (diff) |
Fix T92985: CUDA errors with Cycles film convert kernels
rB3a4c8f406a3a3bf0627477c6183a594fa707a6e2 changed the macros that create the film
convert kernel entry points, but in the process accidentally changed the parameter definition
of one of those entry points (which caused CUDA launch and misaligned address errors) and
changed its implementation as well. This restores the correct implementation from before.
In addition, the `ccl_gpu_kernel_threads` macro did not work as intended and caused the
generated launch bounds to end up with an incorrect input for the second parameter (it was
set to "thread_num_registers", rather than the result of the block number calculation). I'm
not entirely sure why, as the macro definition looked sound to me. I decided to simply use
two separate macros instead, which is simpler and fixes the problem.
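As an illustration of the two-macro approach, here is a minimal sketch, not the actual Cycles code. The names `EXAMPLE_MAX_REGISTERS`, `EXAMPLE_MIN_BLOCKS`, `EXAMPLE_KERNEL_THREADS`, `EXAMPLE_KERNEL_THREADS_REGISTERS` and both example kernels are hypothetical, and the register budget is an assumed value. It shows one macro that passes only the thread count to CUDA's `__launch_bounds__`, and a second one that also passes the result of a block number calculation as the second parameter. A host fallback is included so the sketch also builds without nvcc.

```cpp
// Assumed register budget per multiprocessor (hypothetical; depends on the GPU).
#define EXAMPLE_MAX_REGISTERS 65536

// Block number calculation: minimum resident blocks per multiprocessor for a
// given block size and per-thread register budget.
#define EXAMPLE_MIN_BLOCKS(block_num_threads, thread_num_registers) \
  (EXAMPLE_MAX_REGISTERS / ((block_num_threads) * (thread_num_registers)))

#ifdef __CUDACC__
// Variant 1: only the maximum number of threads per block is given.
#  define EXAMPLE_KERNEL_THREADS(block_num_threads) \
    extern "C" __global__ void __launch_bounds__(block_num_threads)
// Variant 2: the second launch-bounds parameter is the computed block number,
// not the raw register count.
#  define EXAMPLE_KERNEL_THREADS_REGISTERS(block_num_threads, thread_num_registers) \
    extern "C" __global__ void __launch_bounds__( \
        block_num_threads, EXAMPLE_MIN_BLOCKS(block_num_threads, thread_num_registers))
#else
// Host-only fallback so the sketch compiles with a plain C++ compiler.
#  define EXAMPLE_KERNEL_THREADS(block_num_threads) void
#  define EXAMPLE_KERNEL_THREADS_REGISTERS(block_num_threads, thread_num_registers) void
#endif

// With the assumed budget, 256 threads at 64 registers each allow 4 blocks.
static_assert(EXAMPLE_MIN_BLOCKS(256, 64) == 4, "unexpected block number");

// Hypothetical kernel using the thread-count-only variant.
EXAMPLE_KERNEL_THREADS(256) example_kernel_a(float *data, int n)
{
#ifdef __CUDACC__
  const int i = blockIdx.x * blockDim.x + threadIdx.x;
#else
  const int i = 0; /* Host fallback touches only the first element. */
#endif
  if (i < n) {
    data[i] *= 2.0f;
  }
}

// Hypothetical kernel using the variant with a register budget.
EXAMPLE_KERNEL_THREADS_REGISTERS(256, 64) example_kernel_b(float *data, int n)
{
#ifdef __CUDACC__
  const int i = blockIdx.x * blockDim.x + threadIdx.x;
#else
  const int i = 0;
#endif
  if (i < n) {
    data[i] += 1.0f;
  }
}
```

Keeping the two variants as separate macros means each one has a fixed argument count, so no optional-argument handling is needed in the preprocessor.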
Also changed how state is captured with the `ccl_gpu_kernel_lambda` macro slightly, to avoid
a compiler warning (expression has no effect) that otherwise occurred.
Maniphest Tasks: T92985
Differential Revision: https://developer.blender.org/D13175
Diffstat (limited to 'intern/cycles/kernel/CMakeLists.txt')
-rw-r--r-- | intern/cycles/kernel/CMakeLists.txt | 1 |
1 files changed, 0 insertions, 1 deletions
diff --git a/intern/cycles/kernel/CMakeLists.txt b/intern/cycles/kernel/CMakeLists.txt
index f311b0e74bb..39cb886b16e 100644
--- a/intern/cycles/kernel/CMakeLists.txt
+++ b/intern/cycles/kernel/CMakeLists.txt
@@ -379,7 +379,6 @@ if(WITH_CYCLES_CUDA_BINARIES)
 ${SRC_KERNEL_HEADERS}
 ${SRC_KERNEL_DEVICE_GPU_HEADERS}
 ${SRC_KERNEL_DEVICE_CUDA_HEADERS}
-${SRC_KERNEL_DEVICE_METAL_HEADERS}
 ${SRC_UTIL_HEADERS}
 )
 set(cuda_cubins)