git.blender.org/blender.git
author Michael Jones <michael_p_jones@apple.com> 2022-01-20 13:11:58 +0300
committer Michael Jones <michael_p_jones@apple.com> 2022-01-20 18:37:49 +0300
commit f6c8a78ac684242ba067499511a0db2fa64657fe
tree f27b794be6b3fee030ccc753d9023ca52b1da00f /intern/cycles/util
parent 9315215b2068b1d92d7e218ca53934aab85d68d7
Cycles: Fix bvh2 gen on Apple Silicon and use it to speed up renders

This patch fixes a correctness issue discovered in the `int4 select(...)` function on Apple Silicon machines, which causes bad bvh2 builds. Although the generated bvh2s give correct renders, the resulting runtime performance is terrible. This fix allows us to switch over to bvh2 on Apple Silicon, giving a significant performance uplift for many of the standard benchmarking assets. It also fixes some unit test failures stemming from the use of MetalRT, and trivially enables the new pointcloud primitive.

Ref T92212

Reviewed By: brecht

Maniphest Tasks: T92212

Differential Revision: https://developer.blender.org/D13877
Diffstat (limited to 'intern/cycles/util')
-rw-r--r-- intern/cycles/util/math_int4.h | 5
1 file changed, 1 insertion, 4 deletions
diff --git a/intern/cycles/util/math_int4.h b/intern/cycles/util/math_int4.h
index 9e3f001efc2..eaa9be73b63 100644
--- a/intern/cycles/util/math_int4.h
+++ b/intern/cycles/util/math_int4.h
@@ -131,10 +131,7 @@ ccl_device_inline int4 clamp(const int4 &a, const int4 &mn, const int4 &mx)
 ccl_device_inline int4 select(const int4 &mask, const int4 &a, const int4 &b)
 {
 # ifdef __KERNEL_SSE__
-  const __m128 m = _mm_cvtepi32_ps(mask);
-  /* TODO(sergey): avoid cvt. */
-  return int4(_mm_castps_si128(
-      _mm_or_ps(_mm_and_ps(m, _mm_castsi128_ps(a)), _mm_andnot_ps(m, _mm_castsi128_ps(b)))));
+  return int4(_mm_or_si128(_mm_and_si128(mask, a), _mm_andnot_si128(mask, b)));
 # else
   return make_int4(
       (mask.x) ? a.x : b.x, (mask.y) ? a.y : b.y, (mask.z) ? a.z : b.z, (mask.w) ? a.w : b.w);