git.blender.org/blender.git

Age	Commit message	Author
2022-01-07	Cleanup: remove redundant const qualifiers for POD types	Campbell Barton
	MSVC used to warn about const mismatch for arguments passed by value. Remove these as newer versions of MSVC no longer show this warning.
2021-12-20	Docs: use doxygen formatting for BLI	Campbell Barton
	Differentiate doc-strings from title/section text.
2021-12-10	Cleanup: move public doc-strings into headers for various APIs	Campbell Barton
	Some doc-strings were skipped because of blank lines between the doc-string and the symbol and needed to be moved manually.
	- Added space below non doc-string comments to make it clear these aren't comments for the symbols directly below them.
	- Use doxy sections for some headers.
	Ref T92709
2021-12-09	Cleanup: move public doc-strings into headers for 'blenlib'	Campbell Barton
	- Added space below non doc-string comments to make it clear these aren't comments for the symbols directly below them.
	- Use doxy sections for some headers.
	- Minor improvements to doc-strings.
	Ref T92709
2021-10-19	Cleanup: use 'e' prefix for enum types	Campbell Barton

2021-07-15	BLI_task: add a callback to initialize TLS	Campbell Barton
	Useful when TLS requires its own allocated structures.
2021-06-24	Cleanup: comment blocks, trailing space in comments	Campbell Barton

2021-06-15	BLI: use explicit task isolation, no longer part of parallel operations	Brecht Van Lommel
	After looking into task isolation issues with Sergey, we couldn't find the reason behind the deadlocks that we are getting in T87938 and a Sprite Fright file involving motion blur renders. There is no apparent place where we are adding or waiting on tasks in a task group from different isolation regions, which is what is known to cause problems. Yet it still hangs. Either we do not understand some limitation of TBB isolation, or there is a bug in TBB, but we could not figure it out.
	Instead the idea is to use isolation only where we know we need it: when holding a mutex lock and then doing some multithreaded operation within that locked region. Three places where we do this now:
	* Generated images
	* Cached BVH tree building
	* OpenVDB lazy grid loading
	Compared to the more automatic approach previously used, there is the downside that it is easy to miss places where we need isolation. Yet doing it more automatically is also causing unexpected issues and bugs that we found no solution for, so this seems better.
	Patch implemented by Sergey and me.
	Differential Revision: https://developer.blender.org/D11603
2021-06-09	Cleanup: spelling in comments	Campbell Barton

2021-06-09	BLI_task: add TLS support to BLI_task_parallel_mempool	Campbell Barton
	Support thread local storage for BLI_task_parallel_mempool, as well as support for the reduce and free callbacks. mempool_iter_threadsafe_* functions have been moved into a private header that's only shared between task_iterator.c and BLI_mempool.c so the TLS can be made part of the iterator array without having to rely on passing in struct offsets. Add test task.MempoolIterTLS that ensures reduce and free are working as expected. Reviewed By: mont29 Ref D11548
2021-06-08	BLI: support disabling task isolation in task pool	Jacques Lucke
	Under some circumstances using task isolation can cause deadlocks. Previously, our task pool implementation would run all tasks in an isolated region. Now using task isolation is optional and can be turned on/off for individual task pools. Task pools that spawn new tasks recursively should never enable task isolation. There is a new check that finds these cases at runtime. Right now this check is disabled, so that this commit is a pure refactor. It will be enabled in an upcoming commit. This fixes T88598. Differential Revision: https://developer.blender.org/D11415
2021-01-15	Fix T84745: build error with TBB 2021	Brecht Van Lommel
	task_group::is_canceling() was removed.
2021-01-04	Cleanup: use 'pragma once'	Campbell Barton
	Add explanations for cases the header-guard defines are still used.
2020-09-04	Cleanup: Clang-Tidy readability-inconsistent-declaration-parameter-name fix	Sebastian Parborg
	No functional changes
2020-06-05	Cleanup: spelling	Campbell Barton

2020-05-25	Task: Graph Flow Task Scheduling	Jeroen Bakker
	Add TBB::flow graph scheduling to BLI_task. Using flow graphs, a graph of nodes (tasks) and links can be defined. Work can flow through the graph. During this process the execution of the nodes will be scheduled among the available threads. We are planning to use this to improve the threading in the draw manager. The implemented API is still limited: it only supports sequential flows. Joins and buffers are not supported. We could eventually support them as part of a C++ API. These features use compile-time templates, so it is hard to make a clean C API for them. Reviewed By: Sergey Sharybin, Brecht van Lommel Differential Revision: https://developer.blender.org/D7578
2020-05-09	Fix T76427: edit mesh undo hangs when building without TBB	Brecht Van Lommel
	Background task pools would not restart threads if reused multiple times, thanks to Jeroen for identifying the cause of this problem. Differential Revision: https://developer.blender.org/D7659
2020-05-09	Cleanup: spelling	Campbell Barton

2020-05-08	Cleanup: take includes out of 'extern "C"' blocks	Jacques Lucke
	Surrounding includes with an 'extern "C"' block is not necessary anymore. Also that made it harder to add any C++ code to some headers, or include headers that have "optional" C++ code like `MEM_guardedalloc.h`. I tested compilation on linux and windows (and got help from @LazyDodo). If this still breaks compilation due to some linker error, the header containing the symbol in question is probably missing an 'extern "C"' block. Differential Revision: https://developer.blender.org/D7653
2020-04-30	Task: Use TBB as Task Scheduler	Brecht Van Lommel
	This patch enables TBB as the default task scheduler. TBB stands for Threading Building Blocks and is developed by Intel. The library contains several threading patterns. This patch maps Blender's BLI_task_* functions to their TBB counterparts. After this patch we can add more patterns. A promising one is TBB:graph, which can be used for the depsgraph, draw manager and compositor.
	Performance changes depend on the actual hardware. It was tested on different hardware, from laptops to workstations, and we didn't detect any performance regression.
	* Linux Xeon E5-2699 v4 got an FPS boost from 12 to 17 using Spring's 04_010_A.anim.blend.
	* AMD Ryzen Threadripper 2990WX 32-Core: animation playback goes from 9.5-10.5 FPS to 13.0-14.0 FPS on Agent 327, 10_03_B.anim.blend.
	Reviewed By: brecht, sergey
	Differential Revision: https://developer.blender.org/D7475
2020-04-23	BLI: remove TaskParallelRangePool	Brecht Van Lommel
	This is not currently used and will take some work to support with TBB, so remove it until we have a new implementation based on TBB. Fixes T76005, parallel range pool tests failing. Ref D7475
2020-04-21	CleanUp: Remove thread_id from `TaskFreeFunction`	Jeroen Bakker
	It isn't used; cleanup related to {D7475}
2020-04-21	CleanUp: Renamed `BLI_task_pool_userdata` to `BLI_task_pool_user_data`	Jeroen Bakker
	In preparation for {D7475}
2020-04-17	Task: Separate Finalize into Reduce And Free	Jeroen Bakker
	In preparation of TBB we need to split the finalize function into reduce and free. Reduce is used to combine results and free for freeing any allocated memory. The reduce function is called to join user data chunk into another, to reduce the result to the original userdata_chunk memory. These functions should have no side effects so that they can be run on any thread. The free functions should free data created during execution (TaskParallelRangeFunc). Original patch by Brecht van Lommel {rB61f49db843cf5095203112226ae386f301be1e1a}. Reviewed By: Brecht van Lommel, Bastien Montagne Differential Revision: https://developer.blender.org/D7394
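	The split described above can be sketched as follows. This is a minimal standalone illustration using std::thread, not Blender's code: `SumChunk`, `process_index`, `reduce_chunk`, `free_chunk` and `parallel_sum` are hypothetical names, and the static scheduling is an assumption made to keep the example short.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical per-thread accumulator (the "userdata chunk" of the commit message).
struct SumChunk {
  long total = 0;
  std::vector<int> *scratch = nullptr;  // Allocated during processing, released by free_chunk().
};

// Process one index: lazily allocate scratch memory and accumulate into the thread's own chunk.
void process_index(int index, SumChunk &chunk)
{
  if (chunk.scratch == nullptr) {
    chunk.scratch = new std::vector<int>();
  }
  chunk.scratch->push_back(index);
  chunk.total += index;
}

// Reduce: join one chunk into another. It only touches `into`, so any thread may call it.
void reduce_chunk(SumChunk &into, const SumChunk &from)
{
  into.total += from.total;
}

// Free: release whatever process_index() allocated for this chunk.
void free_chunk(SumChunk &chunk)
{
  delete chunk.scratch;
  chunk.scratch = nullptr;
}

// Sum the indices in [start, stop) using `num_threads` workers.
long parallel_sum(int start, int stop, int num_threads)
{
  assert(num_threads > 0);
  std::vector<SumChunk> chunks(num_threads);
  std::vector<std::thread> workers;
  for (int t = 0; t < num_threads; t++) {
    workers.emplace_back([&chunks, t, start, stop, num_threads]() {
      // Static scheduling (an assumption): thread t handles every num_threads-th index.
      for (int i = start + t; i < stop; i += num_threads) {
        process_index(i, chunks[t]);
      }
    });
  }
  for (std::thread &worker : workers) {
    worker.join();
  }
  // Reduce every per-thread chunk into the result, then free it.
  SumChunk result;
  for (SumChunk &chunk : chunks) {
    reduce_chunk(result, chunk);
    free_chunk(chunk);
  }
  return result.total;
}
```

	Keeping reduce free of side effects other than updating its destination is what lets a scheduler call it from whichever thread finishes first; free is then the only place that has to know about memory allocated during processing.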
2020-04-09	TaskScheduler: Minor Preparations for TBB	Brecht Van Lommel
	Tasks: move priority from task to task pool {rBf7c18df4f599fe39ffc914e645e504fcdbee8636} Tasks: split task.c into task_pool.cc and task_iterator.c {rB4ada1d267749931ca934a74b14a82479bcaa92e0} Differential Revision: https://developer.blender.org/D7385
2020-01-07	Cleanup: Remove unused task scheduler constants	Sergey Sharybin

2019-12-17	Cleanup: redundant struct declarations	Campbell Barton

2019-11-26	BLI_task: Add pooled threaded index range iterator, Take II.	Bastien Montagne
	This code allows pushing a set of different operations all based on iterations over a range of indices, and then processing them all at once over multiple threads. This commit also adds unit tests for both the old un-pooled and the new pooled task_parallel_range family of functions, as well as some basic performance tests. This is mainly interesting for relatively low numbers of individual tasks, as expected. E.g. performance tests on a 32-thread machine, for a set of 10 different tasks, show the following improvements when using the pooled version instead of ten sequential calls to BLI_task_parallel_range():
	| Num Items | Sequential | Pooled  | Speed-up |
	| --------- | ---------- | ------- | -------- |
	| 10K       | 365 us     | 138 us  | 2.5 x    |
	| 100K      | 877 us     | 530 us  | 1.66 x   |
	| 1000K     | 5521 us    | 4625 us | 1.25 x   |
	Differential Revision: https://developer.blender.org/D6189
	Note: Compared to yesterday's commit, this reworks atomic handling in the parallel iter code, and fixes a dummy double-free bug. Now we should only use the two critical values for synchronization from the results of atomic calls, which is the proper way to do things. Reading a value after an atomic operation does not guarantee you will get the latest value in all cases (especially on Windows release builds it seems).
2019-11-25	Revert "BLI_task: Add pooled threaded index range iterator."	Bastien Montagne
	This reverts commit f9028a3be1f77c01edca44a68894e2ba9d9cfb14. This is giving a weird heisenbug crash on Windows release builds only... Reverting until we understand the issue.
2019-11-25	BLI_task: Add pooled threaded index range iterator.	Bastien Montagne
	This code allows pushing a set of different operations all based on iterations over a range of indices, and then processing them all at once over multiple threads. This commit also adds unit tests for both the old un-pooled and the new pooled `task_parallel_range` family of functions, as well as some basic performance tests. This is mainly interesting for relatively low numbers of individual tasks, as expected. E.g. performance tests on a 32-thread machine, for a set of 10 different tasks, show the following improvements when using the pooled version instead of ten sequential calls to `BLI_task_parallel_range()`:
	| Num Items | Sequential | Pooled  | Speed-up |
	| --------- | ---------- | ------- | -------- |
	| 10K       | 365 us     | 138 us  | 2.5 x    |
	| 100K      | 877 us     | 530 us  | 1.66 x   |
	| 1000K     | 5521 us    | 4625 us | 1.25 x   |
	Differential Revision: https://developer.blender.org/D6189
2019-10-30	BLI_task: Add new generic `BLI_task_parallel_iterator()`.	Bastien Montagne
	This new function is part of the 'parallel for loops' functions. It takes an iterator callback to generate items to be processed, in addition to the usual 'process' func callback. This allows using common code from BLI_task for a wide range of custom iterators, without having to re-invent the wheel of the whole tasks & data chunks handling. This supports all settings features from `BLI_task_parallel_range()`, including dynamic and static (if the total number of items is known) scheduling, TLS data and its finalize callback, etc.
	One question here is whether we should provide user code with a spinlock by default, or force it to always handle its own sync mechanism. I kept it, since imho it will be needed very often, and generating one is pretty cheap even if unused...
	----------
	Additionally, this commit converts (currently unused) `BLI_task_parallel_listbase()` to use that generic code. This was done mostly as a proof of concept, but performance-wise it shows some interesting data, roughly:
	- Very light processing (that should not be threaded anyway) is several times slower, which is expected due to more overhead in loop management code.
	- Heavier processing can be up to 10% quicker (probably thanks to the switch from dynamic to static scheduling, which greatly reduces the locking needed to fill in the per-task chunks of data). A similar speed-up in the non-threaded case comes as a surprise though, not sure what can explain that.
	While this conversion is not really needed, imho we should keep it (instead of the existing code for that function): it's easier to have complex handling logic in as few places as possible, for maintaining and for improving it.
	Note: That work was initially done to allow for D5372 to be possible... Unfortunately that one proved to be no better than the original code from a performance point of view.
	Reviewed By: sergey
	Differential Revision: https://developer.blender.org/D5371
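	The iterator-callback idea can be sketched roughly as below. This is a standalone illustration, not the `BLI_task_parallel_iterator()` API: `NextItemFn`, `ProcessFn`, `parallel_iterate` and `parallel_sum_list` are hypothetical names, items are plain `int`s for brevity, and a `std::mutex` stands in for the spinlock mentioned in the commit message.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <list>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical signatures: the iterator callback stores the next item in `r_item` and
// returns false once the sequence is exhausted (and keeps returning false afterwards).
using NextItemFn = std::function<bool(int &r_item)>;
using ProcessFn = std::function<void(int item)>;

void parallel_iterate(const NextItemFn &next_item,
                      const ProcessFn &process,
                      int num_threads,
                      int chunk_size)
{
  assert(num_threads > 0 && chunk_size > 0);
  std::mutex iter_mutex;  // Guards every call to the (non thread-safe) iterator callback.

  auto worker = [&]() {
    std::vector<int> chunk;
    while (true) {
      chunk.clear();
      {
        // Fill a chunk of items while holding the lock.
        std::lock_guard<std::mutex> lock(iter_mutex);
        int item = 0;
        while (static_cast<int>(chunk.size()) < chunk_size && next_item(item)) {
          chunk.push_back(item);
        }
      }
      if (chunk.empty()) {
        return;  // Nothing left to process.
      }
      // Process the chunk outside of the lock.
      for (int item : chunk) {
        process(item);
      }
    }
  };

  std::vector<std::thread> threads;
  for (int t = 0; t < num_threads; t++) {
    threads.emplace_back(worker);
  }
  for (std::thread &thread : threads) {
    thread.join();
  }
}

// Usage example: sum a linked list, which has no random access for range-based splitting.
int parallel_sum_list(const std::list<int> &items, int num_threads)
{
  std::list<int>::const_iterator it = items.begin();
  std::atomic<int> sum{0};
  parallel_iterate(
      [&](int &r_item) {
        if (it == items.end()) {
          return false;
        }
        r_item = *it;
        ++it;
        return true;
      },
      [&](int item) { sum += item; },
      num_threads,
      2);
  return sum.load();
}
```

	Fetching items in chunks keeps the time spent under the lock small relative to the processing, which is the trade-off the commit message describes for light versus heavy per-item work.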
2019-07-30	BLI_task: Cleanup: rename some structs to make them more generic.	Bastien Montagne
	TLS and Settings can be used by other types of parallel 'for loops', so removing 'Range' from their names. No functional changes expected here.
2019-07-30	BLI_task: tweak default chunk size for `BLI_task_parallel_range()`.	Bastien Montagne
	Previously we were setting it to 1 (aka no 'chunking'), to follow previous behavior. However, this is far from optimal, especially with CPUs that can have tens of threads nowadays. Now taking a heuristic approach (inspired by the one already existing for `BLI_task_parallel_listbase()`), which tries to guesstimate the best chunk sizes based on several factors (number of threads/parallel tasks, total number of items, ...). I think this is a reasonable baseline; more optimization here would of course be possible. Note that code that was already explicitly setting some value here won't be affected at all by that change.
2019-04-17	ClangFormat: apply to source, most of intern	Campbell Barton
	Apply clang format as proposed in T53211. For details on usage and instructions for migrating branches without conflicts, see: https://wiki.blender.org/wiki/Tools/ClangFormat
2019-04-16	Cleanup: trailing commas	Campbell Barton

2019-02-18	doxygen: add newline after \file	Campbell Barton
	While \file doesn't need an argument, it can't have another doxy command after it.
2019-02-06	Cleanup: remove redundant doxygen \file argument	Campbell Barton
	Move \ingroup onto same line to be more compact and make it clear the file is in the group.
2019-02-01	Cleanup: remove redundant, invalid info from headers	Campbell Barton
	BF-admins agree to remove header information that isn't useful, to reduce noise.
	- BEGIN/END license blocks
	  Developers should add non license comments as separate comment blocks. No need for separator text.
	- Contributors
	  This is often invalid, outdated or misleading especially when splitting files. It's more useful to git-blame to find out who has developed the code.
	See P901 for script to perform these edits.
2019-01-15	Cleanup: commas at the end of enums	Campbell Barton
	Without this clang-format may wrap them onto a single line.
2018-12-04	BLI_task: fix queue in work_and_wait, and support resetting.	Alexander Gavrilov
	To make the pool more usable for running multiple stages of tasks, fix local queue handling in BLI_task_pool_work_and_wait. Specifically, after the wait loop the local queue should be empty, or the wait part of the function contract isn't fulfilled. Instead, check and run any tasks in queue before the wait loop. Also, add a new function that resets the suspended state of the pool.
2018-06-29	Cleanup: trailing newlines	Campbell Barton

2018-06-01	Cleanup: trailing whitespace (comment blocks)	Campbell Barton
	Strip unindented comment blocks - mainly headers to avoid conflicts.
2018-01-21	Cleanup: style	Campbell Barton

2018-01-10	Task scheduler: Use restrict pointer qualifier	Sergey Sharybin
	Those pointers are never to be aliased, so let's be explicit about this and hope the compiler saves some CPU ticks.
2018-01-09	Task scheduler: Use const qualifiers in parallel range	Sergey Sharybin

2018-01-09	Task scheduler: Add minimum number of iterations per thread in parallel range	Sergey Sharybin
	The idea is to support the following: allow doing a parallel for on a small range, each iteration of which takes lots of compute power, but limit such a range to a subset of threads. For example, on a machine with 44 threads we can occupy 4 threads to handle a range of 64 elements, 16 elements per thread, where each block of 16 elements is very complex to compute. The idea is to use this setting instead of the global use_threading flag, which is only based on the size of the array. Proper use of the new flag will improve threadability. This commit only contains internal task scheduler changes; this setting is not used yet by any areas.
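	The arithmetic in the example above (64 elements, at least 16 per thread, 44 threads available) can be sketched as a small helper. `threads_for_range` is a hypothetical function written for illustration; it is not taken from the task scheduler, whose actual rounding and clamping may differ.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: cap the number of worker threads so that each one gets at least
// `min_iter_per_thread` iterations of the range.
int threads_for_range(int range_size, int min_iter_per_thread, int available_threads)
{
  assert(min_iter_per_thread > 0 && available_threads > 0);
  // Number of full blocks of `min_iter_per_thread` iterations, but never fewer than one.
  const int blocks = std::max(1, range_size / min_iter_per_thread);
  return std::min(blocks, available_threads);
}
```

	With these assumptions, `threads_for_range(64, 16, 44)` gives the 4 threads from the commit message, while a range smaller than the minimum stays on a single thread.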
2018-01-09	Task scheduler: Use single parallel range function with more flexible function	Sergey Sharybin
	Now all the fine-tuning happens through the parallel range settings structure, which avoids passing long lists of arguments, allows extending the fine-tuning further, and avoids having lots of functions which basically do the same thing.
2018-01-09	Task scheduler: Get rid of extended version of parallel range callback	Sergey Sharybin
	Wrap all arguments into TLS type of argument. Avoids some branching and also makes it easier to extend things in the future.
2017-11-23	Add a new parallel looper for MemPool items to BLI_task.	Bastien Montagne
	It merely uses the new thread-safe iterators system of mempool, quite straightforward. Note that to avoid possible confusion with two void pointers as parameters of the callback, a dummy opaque struct pointer is used instead for the second parameter (the pointer generated by iteration over the mempool); callback functions must explicitly convert it to the expected real type. Also added a basic gtest for this new feature.
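	The opaque-pointer convention can be sketched like this. All names (`PoolItem`, `ItemFunc`, `Vert`, `for_each_item`, `sum_vert`) are hypothetical and the "pool" is a plain array; only the idea of replacing the second `void *` with a never-defined struct pointer comes from the commit message.

```cpp
#include <cassert>

// Opaque type: declared but never defined, so it cannot be mixed up with the
// `void *userdata` parameter or dereferenced by accident.
struct PoolItem;

using ItemFunc = void (*)(void *userdata, PoolItem *item);

// The real element type stored in the (hypothetical) pool.
struct Vert {
  int value;
};

// Call `func` for every element, handing each one over as an opaque pointer.
void for_each_item(Vert *items, int count, void *userdata, ItemFunc func)
{
  assert(func != nullptr);
  for (int i = 0; i < count; i++) {
    func(userdata, reinterpret_cast<PoolItem *>(&items[i]));
  }
}

// Callback: explicitly converts the opaque pointer back to the expected real type.
void sum_vert(void *userdata, PoolItem *item)
{
  int *total = static_cast<int *>(userdata);
  const Vert *vert = reinterpret_cast<const Vert *>(item);
  *total += vert->value;
}
```

	Swapping the two arguments at a call site now fails to compile unless the caller adds an explicit cast, which is the confusion the commit message aims to avoid.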
2017-05-31	Task scheduler: Optimize subsequent pushing bunch of tasks	Sergey Sharybin
	The idea is to accumulate all new tasks in a thread local queue first without doing any thread synchronization (aka, locks and condition variables) and move those tasks to a scheduler queue once they are all ready. This way we avoid the per-task-pool lock and only have one lock per bunch of tasks. This is particularly handy when scheduling new dependency graph node children. Brings FPS of the cached simulation from the file linked below from ~30 to ~50. See documentation for BLI_task_pool_delayed_push_{begin, end} and for TaskThreadLocalStorage::do_delayed_push. Fixes T50027: Rigidbody playback and simulation performance regression with new depsgraph. Thanks Bastien for the review!
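	A minimal sketch of the delayed-push idea, assuming a single shared queue guarded by one mutex. `SchedulerQueue`, `DelayedPush` and `lock_count` are hypothetical names introduced for illustration; this is not the code behind BLI_task_pool_delayed_push_{begin, end}.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

using Task = std::function<void()>;

// Hypothetical scheduler queue shared by all threads.
struct SchedulerQueue {
  std::mutex mutex;
  std::deque<Task> tasks;
  int lock_count = 0;  // How often the lock was taken, kept for illustration only.

  // Move a whole batch into the shared queue while taking the lock once.
  void push_batch(std::vector<Task> &batch)
  {
    std::lock_guard<std::mutex> lock(mutex);
    lock_count++;
    for (Task &task : batch) {
      tasks.push_back(std::move(task));
    }
    batch.clear();
  }
};

// Hypothetical per-thread buffer: tasks pile up locally without any synchronization
// and reach the scheduler queue only when end() is called.
class DelayedPush {
 public:
  explicit DelayedPush(SchedulerQueue &queue) : queue_(queue) {}

  void push(Task task)
  {
    assert(task != nullptr);
    local_.push_back(std::move(task));  // No lock taken here.
  }

  void end()
  {
    if (!local_.empty()) {
      queue_.push_batch(local_);
    }
  }

 private:
  SchedulerQueue &queue_;
  std::vector<Task> local_;
};
```

	Pushing a hundred tasks through `DelayedPush` takes the scheduler lock once instead of a hundred times, which is where the reduced contention comes from.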