author | Nikita Sirgienko <nikita.sirgienko@intel.com> | 2022-09-14 16:55:56 +0300 |
---|---|---|
committer | Nikita Sirgienko <nikita.sirgienko@intel.com> | 2022-09-27 23:15:00 +0300 |
commit | 2ead05d73878721703de5d2fe6a07eb9053168aa (patch) | |
tree | 72b51851a8c5e7f9995607ca20c94c9cdf022ff3 /intern/cycles/device/hip | |
parent | b145cc9d361e21da3a8a0ff2ef3bad1f8e8fbae6 (diff) |
Cycles: Add optional per-kernel performance statistics
When verbose level 4 is enabled, Blender prints kernel performance
data for Cycles on GPU backends (except Metal, which doesn't use the
debug_enqueue_* methods) for groups of kernels.
These changes introduce a new CYCLES_DEBUG_PER_KERNEL_PERFORMANCE
environment variable that allows getting timings for each kernel
separately rather than grouped with others. This is done by adding
explicit synchronization after each kernel execution.
Differential Revision: https://developer.blender.org/D15971
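
The begin/end pairing described above can be sketched roughly as follows. This is a minimal illustration, not the Cycles implementation: SketchQueue, per_kernel_timing_requested(), the counter that stands in for a device synchronization, and the choice to treat any set value of the environment variable as "enabled" are all assumptions made for the example. Only the names debug_enqueue_begin, debug_enqueue_end and CYCLES_DEBUG_PER_KERNEL_PERFORMANCE come from the commit.

```cpp
#include <cassert>
#include <chrono>
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Assumption: the variable counts as enabled whenever it is set at all.
inline bool per_kernel_timing_requested()
{
  return std::getenv("CYCLES_DEBUG_PER_KERNEL_PERFORMANCE") != nullptr;
}

// Hypothetical stand-in for a GPU queue; not the Cycles DeviceQueue API.
class SketchQueue {
 public:
  using Clock = std::chrono::steady_clock;

  explicit SketchQueue(bool per_kernel_timing) : per_kernel_timing_(per_kernel_timing) {}

  // Called right before a kernel launch: remember which kernel is in flight.
  void debug_enqueue_begin(const std::string &kernel_name)
  {
    current_kernel_ = kernel_name;
    if (per_kernel_timing_) {
      start_ = Clock::now();
    }
  }

  // Called right after a kernel launch. In per-kernel mode, wait for the
  // kernel to finish so the elapsed time belongs to this kernel alone.
  void debug_enqueue_end()
  {
    if (!per_kernel_timing_) {
      return;  // Grouped mode: no extra synchronization is added.
    }
    synchronize();
    const std::chrono::duration<double> elapsed = Clock::now() - start_;
    timings_.emplace_back(current_kernel_, elapsed.count());
  }

  // Mirrors the shape of the patched enqueue(): begin, launch, end.
  void enqueue(const std::string &kernel_name)
  {
    debug_enqueue_begin(kernel_name);
    // A real backend would launch the kernel asynchronously here.
    debug_enqueue_end();
  }

  // Placeholder for a blocking device synchronization; it only counts calls.
  void synchronize() { ++sync_count_; }

  int sync_count() const { return sync_count_; }
  const std::vector<std::pair<std::string, double>> &timings() const { return timings_; }

 private:
  bool per_kernel_timing_;
  int sync_count_ = 0;
  std::string current_kernel_;
  Clock::time_point start_{};
  std::vector<std::pair<std::string, double>> timings_;
};
```

A caller could construct the queue with SketchQueue queue(per_kernel_timing_requested()); so that the extra synchronization, and the launch overlap it gives up, is paid only when the variable is set.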
Diffstat (limited to 'intern/cycles/device/hip')
-rw-r--r-- | intern/cycles/device/hip/queue.cpp | 4 |
1 files changed, 3 insertions, 1 deletions
diff --git a/intern/cycles/device/hip/queue.cpp b/intern/cycles/device/hip/queue.cpp
index 8b3d963a32f..3f8b6267100 100644
--- a/intern/cycles/device/hip/queue.cpp
+++ b/intern/cycles/device/hip/queue.cpp
@@ -79,7 +79,7 @@ bool HIPDeviceQueue::enqueue(DeviceKernel kernel,
     return false;
   }
 
-  debug_enqueue(kernel, work_size);
+  debug_enqueue_begin(kernel, work_size);
 
   const HIPContextScope scope(hip_device_);
   const HIPDeviceKernel &hip_kernel = hip_device_->kernels.get(kernel);
@@ -120,6 +120,8 @@ bool HIPDeviceQueue::enqueue(DeviceKernel kernel,
                                        0),
                  "enqueue");
 
+  debug_enqueue_end();
+
   return !(hip_device_->have_error());
 }
 