author | Nikita Sirgienko <nikita.sirgienko@intel.com> | 2022-09-14 16:55:56 +0300 |
---|---|---|
committer | Nikita Sirgienko <nikita.sirgienko@intel.com> | 2022-09-27 23:15:00 +0300 |
commit | 2ead05d73878721703de5d2fe6a07eb9053168aa (patch) | |
tree | 72b51851a8c5e7f9995607ca20c94c9cdf022ff3 /intern/cycles/device/optix/queue.cpp | |
parent | b145cc9d361e21da3a8a0ff2ef3bad1f8e8fbae6 (diff) |
Cycles: Add optional per-kernel performance statistics
When verbose level 4 is enabled, Blender prints kernel performance
data for Cycles on GPU backends (except Metal, which doesn't use the
debug_enqueue_* methods) for groups of kernels.
These changes introduce a new CYCLES_DEBUG_PER_KERNEL_PERFORMANCE
environment variable that makes it possible to get timings for each
kernel separately rather than grouped with others. This is done by
adding explicit synchronization after each kernel execution.
Differential Revision: https://developer.blender.org/D15971
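The begin/end pairing described in the commit message can be sketched roughly as follows. This is a minimal, CPU-only illustration and not the Cycles implementation: `SketchQueue`, `per_kernel_from_env()`, `stats()` and the string kernel names are hypothetical, and treating any value of the environment variable as "enabled" is an assumption. Only the `debug_enqueue_begin` / `debug_enqueue_end` names and the `CYCLES_DEBUG_PER_KERNEL_PERFORMANCE` variable come from the commit.

```cpp
#include <cassert>
#include <chrono>
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a device queue; not the Cycles DeviceQueue class.
class SketchQueue {
 public:
  using Clock = std::chrono::steady_clock;
  // One timing record: the kernels covered by the measurement and its duration in ms.
  using Record = std::pair<std::vector<std::string>, double>;

  explicit SketchQueue(bool per_kernel) : per_kernel_(per_kernel) {}

  // Assumption: the mere presence of the variable switches on per-kernel mode.
  static bool per_kernel_from_env()
  {
    return std::getenv("CYCLES_DEBUG_PER_KERNEL_PERFORMANCE") != nullptr;
  }

  // Called at the start of enqueue(): remember the kernel and start the
  // timer if this kernel opens a new group.
  void debug_enqueue_begin(const std::string &kernel_name)
  {
    assert(!in_enqueue_);
    in_enqueue_ = true;
    if (pending_.empty()) {
      group_start_ = Clock::now();
    }
    pending_.push_back(kernel_name);
  }

  // Called at the end of enqueue(): in per-kernel mode, synchronize right
  // away so the measured interval covers exactly one kernel.
  void debug_enqueue_end()
  {
    assert(in_enqueue_);
    in_enqueue_ = false;
    if (per_kernel_) {
      synchronize();
    }
  }

  // A real queue would block on the GPU stream here; this sketch only
  // closes the current timing group.
  void synchronize()
  {
    if (pending_.empty()) {
      return;
    }
    const double ms =
        std::chrono::duration<double, std::milli>(Clock::now() - group_start_).count();
    stats_.emplace_back(pending_, ms);
    pending_.clear();
  }

  const std::vector<Record> &stats() const
  {
    return stats_;
  }

 private:
  bool per_kernel_;
  bool in_enqueue_ = false;
  Clock::time_point group_start_;
  std::vector<std::string> pending_;
  std::vector<Record> stats_;
};
```

In grouped mode, several begin/end pairs followed by one `synchronize()` produce a single record covering all of those kernels. In per-kernel mode each pair produces its own record, at the cost of a synchronization after every kernel, which is why the behaviour is opt-in.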
Diffstat (limited to 'intern/cycles/device/optix/queue.cpp')
-rw-r--r-- | intern/cycles/device/optix/queue.cpp | 4 |
1 files changed, 3 insertions, 1 deletions
diff --git a/intern/cycles/device/optix/queue.cpp b/intern/cycles/device/optix/queue.cpp
index f0d49ad6f6c..3bc547ed11d 100644
--- a/intern/cycles/device/optix/queue.cpp
+++ b/intern/cycles/device/optix/queue.cpp
@@ -46,7 +46,7 @@ bool OptiXDeviceQueue::enqueue(DeviceKernel kernel,
     return false;
   }
 
-  debug_enqueue(kernel, work_size);
+  debug_enqueue_begin(kernel, work_size);
 
   const CUDAContextScope scope(cuda_device_);
 
@@ -131,6 +131,8 @@ bool OptiXDeviceQueue::enqueue(DeviceKernel kernel,
                                    1,
                                    1));
 
+  debug_enqueue_end();
+
   return !(optix_device->have_error());
 }