Age | Commit message (Collapse) | Author |
|
We often call `parallel_for` in places with very variable
sized workloads. When many elements are processed,
using multi-threading is great, but when processing
few elements (possibly many times) using `parallel_for`
can result in significant overhead.
I measured that this improves performance by >20% in
the refactored realize instances code I'm working on
separately. The change might also help with debugging
sometimes, because the stack trace is smaller and contains
fewer irrevelant symbols.
|
|
|
|
From patch D11780 from Erik Abrahamsson.
It parallelizes making the vertices, destruction of map entries,
finding if the result is PWN, finding triangle adjacencies,
and finding the ambient cell.
The latter needs a parallel_reduce from tbb, so added one into
BLI_task.hh so that if WITH_TBB is false, the code will still work.
On Erik's 6-core machine, the elapsed time went from 17.5s to 11.8s
(33% faster) on an intersection of two spheres with 3.1M faces.
On Howard's 24-core machine, the elapsed time went from 18.7s to 10.8s
for the same test.
|
|
This makes it easier to use task isolation in c++ code.
Previously, one either had to check `WITH_TBB` (possibly indirectly
through `WITH_OPENVDB`) or one had to use the C function which
is less convenient.
|
|
This namespace groups threading related functions/classes. This avoids
adding more threading related stuff to the blender namespace. Also it
makes naming a bit easier, e.g. the c++ version of BLI_task_isolate could
become blender::threading::isolate_task or something similar.
Differential Revision: https://developer.blender.org/D11624
|
|
* USD and OpenVDB headers use deprecated TBB headers, suppress all deprecation
warnings there since we have no control over them.
* For our own TBB includes, use the individual headers rather than the tbb.h that
includes everything to avoid warnings, rather than suppressing all.
This is in anticipation of the TBB 2020 upgrade in D10359. Ref D10361.
|
|
If a define of NOMINMAX was made before BLI_task.hh was included,
the compiler would emit a
warning C4005: 'NOMINMAX': macro redefinition
warning, to work around this only define it if it is not already
defined, and only undefine it if we were the ones that made the
define earlier.
|
|
|
|
TBB includes Windows.h which defines a min/max macro
leading to issues when you want to use std::min and
std::max.
This change prevents Windows.h from defining them
sidestepping the issue.
|
|
`BKE_mesh_calc_edges` was the main performance bottleneck in D9141.
While openvdb only needed ~115ms, calculating the edges afterwards
took ~960ms. Now with some parallelization this is reduced to ~210ms.
Parallelizing `BKE_mesh_calc_edges` is not entirely trivial, because it
has to perform deduplication and some other things that have to happen
in a certain order. Even though the multithreading improves performance
with more threads, there are diminishing returns when too many threads
are used in this function.
The speedup is mainly achieved by having multiple hash tables that are
filled in parallel. The distribution of the edges to hash tables is based on
a hash (that is different from the hash used in the actual hash tables).
I moved the function to C++, because that made it easier for me to
optimize it. Furthermore, I added `BLI_task.hh` which contains some
light tbb wrappers for parallelization.
Reviewers: campbellbarton
Differential Revision: https://developer.blender.org/D9151
|