author    Sergey Sharybin <sergey.vfx@gmail.com>  2018-01-08 14:08:18 +0300
committer Sergey Sharybin <sergey.vfx@gmail.com>  2018-01-09 18:09:33 +0300
commit    c4e42d70a4949352f1233574cfc2da30c097439d (patch)
tree      01dd50d86510f84ec90a8d9fcf9e680599ec9ef0 /source/blender/blenlib/intern/task.c
parent    3144f0573a24a29363995d0fefeb8eeba1320f24 (diff)
Task scheduler: Add minimum number of iterations per thread in parallel range
The idea is to support the following: allow running a parallel for over a small
range where each iteration takes a lot of compute power, while limiting such a
range to a subset of threads.
For example, on a machine with 44 threads we can occupy only 4 threads to handle
a range of 64 elements, 16 elements per thread, where each block of 16 elements
is very expensive to compute.
The idea is to use this setting instead of the global use_threading flag, which
is based only on the size of the array. Proper use of the new setting will
improve threadability.
This commit only contains internal task scheduler changes; the setting is not
yet used by any area.
Diffstat (limited to 'source/blender/blenlib/intern/task.c')
-rw-r--r--  source/blender/blenlib/intern/task.c  9
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/source/blender/blenlib/intern/task.c b/source/blender/blenlib/intern/task.c
index f2a14aa9363..ba600be870b 100644
--- a/source/blender/blenlib/intern/task.c
+++ b/source/blender/blenlib/intern/task.c
@@ -1113,15 +1113,22 @@ void BLI_task_parallel_range(int start, int stop,
 	state.iter = start;
 	switch (settings->scheduling_mode) {
 		case TASK_SCHEDULING_STATIC:
-			state.chunk_size = max_ii(1, (stop - start) / (num_tasks));
+			state.chunk_size = max_ii(
+			        settings->min_iter_per_thread,
+			        (stop - start) / (num_tasks));
 			break;
 		case TASK_SCHEDULING_DYNAMIC:
+			/* TODO(sergey): Make it configurable from min_iter_per_thread. */
 			state.chunk_size = 32;
 			break;
 	}
 	num_tasks = min_ii(num_tasks, (stop - start) / state.chunk_size);
+
+	/* TODO(sergey): If number of tasks happened to be 1, use single threaded
+	 * path.
+	 */
 	/* NOTE: This way we are adding a memory barrier and ensure all worker
 	 * threads can read and modify the value, without any locks. */
 	atomic_fetch_and_add_int32(&state.iter, 0);