Cycles: Workaround dead-slow expf() on 64bit linux

Single-precision exponent on 64-bit Linux tends to be an order of magnitude slower than the double-precision version, even with the single<->double precision conversion involved. Some feedback on the mailing lists suggests that logf() is slow as well, but I haven't confirmed this here in the studio yet. Depending on the shader setup, the workaround gives a speedup of ~3% on the secret agent shot and up to around 15% on the bmw scene here.
author: Sergey Sharybin <sergey.vfx@gmail.com> 2014-10-06 11:43:23 +0400
committer: Sergey Sharybin <sergey.vfx@gmail.com> 2014-10-06 14:36:46 +0400
commit: cd6129d1ff6142c153a99917aa794b668e3b7dd2 (patch)
tree: fc3f4f6b6b1d3104334e7fe5c28b93832cb592b9 /intern/cycles/kernel/kernel_compat_cpu.h
parent: 1f1dcdfd76ee70d4c466af0e5917c2e40b39a989 (diff)
1 files changed, 7 insertions, 0 deletions
diff --git a/intern/cycles/kernel/kernel_compat_cpu.h b/intern/cycles/kernel/kernel_compat_cpu.h
index c2aab93c87b..25531843993 100644
--- a/intern/cycles/kernel/kernel_compat_cpu.h
+++ b/intern/cycles/kernel/kernel_compat_cpu.h
@@ -25,6 +25,13 @@
 #include "util_half.h"
 #include "util_types.h"
 
+/* On 64bit linux single precision exponent is really slow compared to the
+ * double precision version, even with float<->double conversion involved.
+ */
+#if !defined(__KERNEL_GPU__) && defined(__linux__) && defined(__x86_64__)
+#  define expf(x) ((float)exp((double)(x)))
+#endif
+
 CCL_NAMESPACE_BEGIN
 
 /* Assertions inside the kernel only work for the CPU device, so we wrap it in