author | Brecht Van Lommel <brechtvanlommel@gmail.com> | 2016-10-02 15:48:39 +0300 |
---|---|---|
committer | Brecht Van Lommel <brechtvanlommel@gmail.com> | 2016-10-03 23:15:25 +0300 |
commit | a3abb020e37a072eb71fd30de9ab125d1c16623a (patch) | |
tree | b525be7f8a0792eedecb2b95802ede88dc3f330e /intern/cycles/kernel/closure/bsdf_microfacet.h | |
parent | 49ad4215baf16d850d0e367f003ab688e4a3d08e (diff) |
Fix Cycles CUDA performance on CUDA 8.0.
Mostly this is making inlining match CUDA 7.5 in a few performance critical
places. The end result is that performance is now better than before, possibly
due to less register spilling or other CUDA 8.0 compiler improvements.
On benchmark scenes, render times are reduced by 3% to 35%. Stack memory
usage is also slightly lower.
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D2269
Diffstat (limited to 'intern/cycles/kernel/closure/bsdf_microfacet.h')
-rw-r--r-- | intern/cycles/kernel/closure/bsdf_microfacet.h | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/intern/cycles/kernel/closure/bsdf_microfacet.h b/intern/cycles/kernel/closure/bsdf_microfacet.h
index 7c36f05b6cc..0a8d14a00c2 100644
--- a/intern/cycles/kernel/closure/bsdf_microfacet.h
+++ b/intern/cycles/kernel/closure/bsdf_microfacet.h
@@ -183,7 +183,7 @@ ccl_device_inline void microfacet_ggx_sample_slopes(
 	*slope_y = S * z * safe_sqrtf(1.0f + (*slope_x)*(*slope_x));
 }
 
-ccl_device_inline float3 microfacet_sample_stretched(
+ccl_device_forceinline float3 microfacet_sample_stretched(
 	KernelGlobals *kg, const float3 omega_i,
 	const float alpha_x, const float alpha_y,
 	const float randu, const float randv,
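
The change swaps a plain inline qualifier for a forced-inline one. The following is a minimal sketch of what that difference typically means at the compiler level. The macro names `sketch_device_inline` and `sketch_device_forceinline` and both helper functions are hypothetical stand-ins invented for illustration; the actual `ccl_device_inline` and `ccl_device_forceinline` definitions are not shown in this commit view and may differ.

```cpp
// Hypothetical stand-ins for the kernel qualifier macros (assumption: the
// real Cycles macros expand to something comparable, not shown in this diff).
#if defined(__CUDACC__)
// CUDA: __inline__ is a hint, __forceinline__ requests inlining regardless
// of the compiler's own heuristics.
#  define sketch_device_inline      __host__ __device__ __inline__
#  define sketch_device_forceinline __host__ __device__ __forceinline__
#elif defined(_MSC_VER)
#  define sketch_device_inline      static inline
#  define sketch_device_forceinline static __forceinline
#else
// GCC / Clang host build.
#  define sketch_device_inline      static inline
#  define sketch_device_forceinline static inline __attribute__((always_inline))
#endif

// With a plain inline hint the compiler may still decide to emit a call.
sketch_device_inline float scale_hint(float x, float alpha)
{
	return x * alpha;
}

// With forced inlining the body is expanded at each call site, which is the
// kind of behaviour the commit message says it restores for hot code paths.
sketch_device_forceinline float scale_forced(float x, float alpha)
{
	return x * alpha;
}
```

Both functions compute the same result; only the inlining decision differs, so any effect shows up in generated code, register usage, and timing rather than in output values.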