author    Sergey Sharybin <sergey.vfx@gmail.com>  2018-01-08 14:08:18 +0300
committer Sergey Sharybin <sergey.vfx@gmail.com>  2018-01-09 18:09:33 +0300
commit    c4e42d70a4949352f1233574cfc2da30c097439d (patch)
tree      01dd50d86510f84ec90a8d9fcf9e680599ec9ef0 /source/blender/blenlib/intern/task.c
parent    3144f0573a24a29363995d0fefeb8eeba1320f24 (diff)
Task scheduler: Add minimum number of iterations per thread in parallel range
The idea is to support the following: allow running a parallel for over a small
range where each iteration takes a lot of compute power, while limiting such a
range to a subset of threads.
For example, on a machine with 44 threads we can occupy only 4 threads to handle
a range of 64 elements, 16 elements per thread, where each block of 16 elements
is very expensive to compute.
The idea is to use this setting instead of the global use_threading flag, which
is based only on the size of the array. Proper use of the new setting will
improve threadability.
This commit only contains internal task scheduler changes; the setting is not
yet used by any area.
Diffstat (limited to 'source/blender/blenlib/intern/task.c')
-rw-r--r--  source/blender/blenlib/intern/task.c  9
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/source/blender/blenlib/intern/task.c b/source/blender/blenlib/intern/task.c
index f2a14aa9363..ba600be870b 100644
--- a/source/blender/blenlib/intern/task.c
+++ b/source/blender/blenlib/intern/task.c
@@ -1113,15 +1113,22 @@ void BLI_task_parallel_range(int start, int stop,
 	state.iter = start;
 	switch (settings->scheduling_mode) {
 		case TASK_SCHEDULING_STATIC:
-			state.chunk_size = max_ii(1, (stop - start) / (num_tasks));
+			state.chunk_size = max_ii(
+			        settings->min_iter_per_thread,
+			        (stop - start) / (num_tasks));
 			break;
 		case TASK_SCHEDULING_DYNAMIC:
+			/* TODO(sergey): Make it configurable from min_iter_per_thread. */
 			state.chunk_size = 32;
 			break;
 	}
 	num_tasks = min_ii(num_tasks, (stop - start) / state.chunk_size);
+
+	/* TODO(sergey): If number of tasks happened to be 1, use single threaded
+	 * path.
+	 */
 	/* NOTE: This way we are adding a memory barrier and ensure all worker
 	 * threads can read and modify the value, without any locks. */
 	atomic_fetch_and_add_int32(&state.iter, 0);