author | Xavier Hallade <xavier.hallade@intel.com> | 2022-10-26 11:35:18 +0300 |
---|---|---|
committer | Xavier Hallade <xavier.hallade@intel.com> | 2022-10-26 11:53:23 +0300 |
commit | 4b14b33ea887e685937b7757af0c2093093b7c7e | |
tree | 93980583e87d4e73ce3cb030656da7a3c331df81 | |
parent | 633d314b75a1e84c9ed93e09047f87f34ddab802 |
Cycles: use packed float3 back for oneAPI
This fixes a 15% performance regression silently introduced by
79ab76e156d4bde937335be784cdf220294600d5 that aligned the compact
float3 on 16 bytes for oneAPI.
The current change is minimal. There are further cleanup opportunities,
such as removing the packed_float3 definition for oneAPI, but for some
reason doing so cuts the recovered speedup in half, so we're starting
with this small fix for now.
Reviewed by: brecht
Differential Revision: https://developer.blender.org/D16340
-rw-r--r-- | intern/cycles/util/types_float3.h | 5 |
1 files changed, 5 insertions, 0 deletions
diff --git a/intern/cycles/util/types_float3.h b/intern/cycles/util/types_float3.h
index 87c6b1d3654..34430945c38 100644
--- a/intern/cycles/util/types_float3.h
+++ b/intern/cycles/util/types_float3.h
@@ -10,7 +10,12 @@ CCL_NAMESPACE_BEGIN
 #ifndef __KERNEL_NATIVE_VECTOR_TYPES__
+# ifdef __KERNEL_ONEAPI__
+/* Define float3 as packed for oneAPI. */
+struct float3
+# else
 struct ccl_try_align(16) float3
+# endif
 {
 # ifdef __KERNEL_GPU__
 /* Compact structure for GPU. */
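To illustrate what the commit message means by a packed versus a 16-byte-aligned float3, here is a minimal standalone sketch. `PackedFloat3`, `AlignedFloat3`, and `array_bytes` are hypothetical names used only for this illustration; they are not the Cycles definitions, and standard `alignas(16)` stands in for the `ccl_try_align(16)` macro. The sketch assumes a 4-byte `float`, which holds on mainstream platforms. It shows only the layout difference. The commit does not state why the aligned layout was slower on oneAPI, so the larger memory footprint is at most one plausible factor.

```cpp
#include <cstddef>

// Hypothetical stand-in for a packed float3: three floats, no alignment request.
struct PackedFloat3 {
  float x, y, z;
};

// Hypothetical stand-in for a float3 aligned to 16 bytes.
// alignas(16) is used here in place of the ccl_try_align(16) macro (assumption).
struct alignas(16) AlignedFloat3 {
  float x, y, z;
};

// Assuming a 4-byte float, the packed struct occupies 12 bytes.
static_assert(sizeof(PackedFloat3) == 12, "packed float3 is 3 * 4 bytes");

// The size of a type is a multiple of its alignment, so 4 padding bytes are added.
static_assert(sizeof(AlignedFloat3) == 16, "aligned float3 is padded to 16 bytes");
static_assert(alignof(AlignedFloat3) == 16, "aligned float3 has 16-byte alignment");

// Bytes occupied by a contiguous array of n elements of type T.
template<typename T> constexpr std::size_t array_bytes(std::size_t n)
{
  return n * sizeof(T);
}
```

Under these assumptions, `array_bytes<PackedFloat3>(1000)` is 12000 and `array_bytes<AlignedFloat3>(1000)` is 16000, so an array of the aligned type takes a third more memory than the packed one.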