git.blender.org/blender.git - Unnamed repository; edit this file 'description' to name the repository.
author Brecht Van Lommel <brecht@blender.org> 2022-07-14 17:42:43 +0300
committer Brecht Van Lommel <brecht@blender.org> 2022-07-15 14:42:47 +0300
commit 523bbf7065547a67e7c23f67f546a5ed6433f809
tree a054c6619cc5a1c1f330e65db92e45d4ded515c2 /intern/cycles/device/metal/queue.mm
parent da4ef05e4dfb700a61910e6d8e02183d7c272963
Cycles: generalize shader sorting / locality heuristic to all GPU devices

This was added for Metal, but also gives good results with CUDA and OptiX.
Also enable it for future Apple GPUs instead of only M1 and M2, since this has
been shown to help across multiple GPUs, so the better bet seems to be to
enable rather than disable it.

Also moves some of the logic outside of the Metal device code, and always
enables the code in the kernel since other devices don't do dynamic compile.

Time per sample with OptiX + RTX A6000:
                         new        old
barbershop_interior      0.0730s    0.0727s
bmw27                    0.0047s    0.0053s
classroom                0.0428s    0.0464s
fishy_cat                0.0102s    0.0108s
junkshop                 0.0366s    0.0395s
koro                     0.0567s    0.0578s
monster                  0.0206s    0.0223s
pabellon                 0.0158s    0.0174s
sponza                   0.0088s    0.0100s
spring                   0.1267s    0.1280s
victor                   0.0524s    0.0531s
wdas_cloud               0.0817s    0.0816s

Ref D15331, T87836

Diffstat (limited to 'intern/cycles/device/metal/queue.mm')
-rw-r--r-- intern/cycles/device/metal/queue.mm 16
1 files changed, 2 insertions, 14 deletions
diff --git a/intern/cycles/device/metal/queue.mm b/intern/cycles/device/metal/queue.mm
index 6a9cc552098..5ac63a16c61 100644
--- a/intern/cycles/device/metal/queue.mm
+++ b/intern/cycles/device/metal/queue.mm
@@ -293,21 +293,9 @@ int MetalDeviceQueue::num_concurrent_busy_states() const
   return result;
 }
 
-int MetalDeviceQueue::num_sort_partitions(const size_t state_size) const
+int MetalDeviceQueue::num_sort_partition_elements() const
 {
-  /* Sort partitioning becomes less effective when more shaders are in the wavefront. In lieu of a
-   * more sophisticated heuristic we simply disable sort partitioning if the shader count is high.
-   */
-  if (metal_device_->launch_params.data.max_shaders >= 300) {
-    return 1;
-  }
-
-  const int optimal_partition_elements = MetalInfo::optimal_sort_partition_elements(
-      metal_device_->mtlDevice);
-  if (optimal_partition_elements) {
-    return num_concurrent_states(state_size) / optimal_partition_elements;
-  }
-  return 1;
+  return MetalInfo::optimal_sort_partition_elements(metal_device_->mtlDevice);
 }
 
 void MetalDeviceQueue::init_execution()
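
For readers following the change: the queue now reports only the preferred number of elements per sort partition, so the partition count has to be derived by whoever calls it. The sketch below restates the removed lines as a standalone function to show that arithmetic. The name `num_sort_partitions_sketch` and its parameters are hypothetical and do not appear in this diff, and the clamp to at least one partition is an added assumption, not something the removed code did. How the device-independent code performs this step is not shown in this file.

```cpp
#include <algorithm>

/* Hypothetical helper, not part of Cycles: derives a partition count from the
 * per-partition element count that the queue reports. */
int num_sort_partitions_sketch(const int max_shaders,
                               const int num_concurrent_states,
                               const int partition_elements)
{
  /* Same cutoff as the removed Metal code: with many shaders in the wavefront,
   * partitioned sorting is disabled by using a single partition. */
  if (max_shaders >= 300) {
    return 1;
  }

  /* A zero element count means the device has no preferred partition size. */
  if (partition_elements > 0) {
    /* Assumption: clamp so that fewer states than one partition still yields 1. */
    return std::max(num_concurrent_states / partition_elements, 1);
  }

  return 1;
}
```

For example, 1048576 concurrent states with 65536 elements per partition gives 16 partitions, while a scene with 300 or more shaders always gets a single partition.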