git.blender.org/blender.git
author     Sergey Sharybin <sergey.vfx@gmail.com>  2016-05-17 13:30:46 +0300
committer  Sergey Sharybin <sergey.vfx@gmail.com>  2016-05-18 11:14:24 +0300
commit     7b356a856540a1affa5dc85360183418e6337a5a
tree       9acee7019c696f694c97d504e1a2fe678a7f0cd1  /intern/cycles/kernel/kernel_shadow.h
parent     2433a537fa12dad6cc8a1c323b1b73e5cad6cd4d
Cycles: Reduce amount of malloc() calls from the kernel
This commit makes malloc() happen only once per volume and once per transparent shadow query (per thread), which improves how the code scales across multiple CPU cores.

This is hard to measure on the low-end i7 available here at the moment, but in quick tests volume sampling showed a speedup of about 3-5%.

The idea is to store the allocated memory in kernel globals, which are already per-thread on CPU.

Reviewers: dingto, juicyfruit, lukasstockner97, maiself, brecht
Reviewed By: brecht
Subscribers: Blendify, nutel
Differential Revision: https://developer.blender.org/D1996
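The caching idea described in the commit message can be sketched in isolation as follows. This is a minimal illustration and not the actual Cycles code: `ThreadGlobals`, `shadow_hits_buffer` and `thread_globals_free` are hypothetical names, `Intersection` is a simplified stand-in, and the value of `STACK_MAX_HITS` is assumed. Only the field name `transparent_shadow_intersections` and the `transparent_max_bounce + 1` sizing are taken from the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define STACK_MAX_HITS 64 /* assumed value, for illustration only */

/* Simplified stand-in for the kernel's intersection record. */
typedef struct Intersection {
    float t;
    int prim;
} Intersection;

/* Hypothetical per-thread state; in the commit this role is played by
 * KernelGlobals, which is already per-thread on CPU. */
typedef struct ThreadGlobals {
    int transparent_max_bounce;
    Intersection *transparent_shadow_intersections; /* NULL until first needed */
} ThreadGlobals;

/* Return storage for max_hits + 1 intersections: the caller's stack array
 * when it is large enough, otherwise a heap buffer that is allocated once
 * per thread and then reused. Assumes max_hits <= transparent_max_bounce.
 * May return NULL if malloc() fails. */
static Intersection *shadow_hits_buffer(ThreadGlobals *kg,
                                        Intersection *hits_stack,
                                        unsigned int max_hits)
{
    assert(kg != NULL && hits_stack != NULL);
    if (max_hits + 1 <= STACK_MAX_HITS) {
        return hits_stack;
    }
    if (kg->transparent_shadow_intersections == NULL) {
        /* Size for the worst case so later queries never need to grow it. */
        kg->transparent_shadow_intersections = malloc(
            sizeof(Intersection) * ((size_t)kg->transparent_max_bounce + 1));
    }
    return kg->transparent_shadow_intersections;
}

/* Release the cached buffer once, when the thread's state is torn down. */
static void thread_globals_free(ThreadGlobals *kg)
{
    free(kg->transparent_shadow_intersections);
    kg->transparent_shadow_intersections = NULL;
}
```

Because the buffer is sized for the configured maximum rather than for the current query, the per-query free() calls can be dropped, which is what the second hunk of the diff does. The cached buffer then has to be released wherever the per-thread state is destroyed; that part is outside this file's diff.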
Diffstat (limited to 'intern/cycles/kernel/kernel_shadow.h')
-rw-r--r--  intern/cycles/kernel/kernel_shadow.h  18
1 file changed, 9 insertions, 9 deletions
diff --git a/intern/cycles/kernel/kernel_shadow.h b/intern/cycles/kernel/kernel_shadow.h
index 3b1111e5069..504ac2e40bc 100644
--- a/intern/cycles/kernel/kernel_shadow.h
+++ b/intern/cycles/kernel/kernel_shadow.h
@@ -59,14 +59,20 @@ ccl_device_inline bool shadow_blocked(KernelGlobals *kg, PathState *state, Ray *
/* intersect to find an opaque surface, or record all transparent surface hits */
Intersection hits_stack[STACK_MAX_HITS];
Intersection *hits = hits_stack;
- uint max_hits = kernel_data.integrator.transparent_max_bounce - state->transparent_bounce - 1;
+ const int transparent_max_bounce = kernel_data.integrator.transparent_max_bounce;
+ uint max_hits = transparent_max_bounce - state->transparent_bounce - 1;
/* prefer to use stack but use dynamic allocation if too deep max hits
* we need max_hits + 1 storage space due to the logic in
* scene_intersect_shadow_all which will first store and then check if
* the limit is exceeded */
- if(max_hits + 1 > STACK_MAX_HITS)
- hits = (Intersection*)malloc(sizeof(Intersection)*(max_hits + 1));
+ if(max_hits + 1 > STACK_MAX_HITS) {
+ if(kg->transparent_shadow_intersections == NULL) {
+ kg->transparent_shadow_intersections =
+ (Intersection*)malloc(sizeof(Intersection)*(transparent_max_bounce + 1));
+ }
+ hits = kg->transparent_shadow_intersections;
+ }
uint num_hits;
blocked = scene_intersect_shadow_all(kg, ray, hits, max_hits, &num_hits);
@@ -147,14 +153,8 @@ ccl_device_inline bool shadow_blocked(KernelGlobals *kg, PathState *state, Ray *
*shadow = throughput;
- if(hits != hits_stack)
- free(hits);
return is_zero(throughput);
}
-
- /* free dynamic storage */
- if(hits != hits_stack)
- free(hits);
}
else {
Intersection isect;