Age | Commit message (Collapse) | Author |
|
Also removing the curve system manager which only stored a few curve intersection
settings. These are all changes towards making shape and subdivision settings
per-object instead of per-scene, but there is more work to do here.
Ref T73778
Depends on D8013
Maniphest Tasks: T73778
Differential Revision: https://developer.blender.org/D8014
|
|
No significant performance improvement is expected, but it means we have a
single thread pool throughout Blender. And it should make adding more
parallellization in the future easier.
After previous refactoring commits this is basically a drop-in replacement.
One difference is that the task pool had a mechanism for scheduling tasks to
the front of the queue to minimize memory usage. TBB has a smarter algorithm
to balance depth-first and breadth-first scheduling of tasks and we assume that
removes the need to manually provide hints to the scheduler.
Fixes T77533
|
|
|
|
Now that the rest of Blender also relies on TBB, no point in maintaining custom
code for paraller_for and thread local storage.
|
|
|
|
|
|
|
|
Apply clang format as proposed in T53211.
For details on usage and instructions for migrating branches
without conflicts, see:
https://wiki.blender.org/wiki/Tools/ClangFormat
|
|
Also try to move them from headers to implementation files as much as possible.
|
|
This is fully unpredictable for artists when one damaged object makes the whole
scene to render incorrectly. This involves two main changes:
- It is not enough to check triangle bounds to be valid when building BVH.
This is because triangle might have some finite vertices and some non-finite.
- We shouldn't add non-finite triangle area to the overall area for MIS.
|
|
|
|
|
|
Basically, make re-alloc and memcpy from the same lock, otherwise one
thread might be re-allocating thread while another one is trying to
copy data there.
Reported by Mohamed Sakr in IRC, thanks!
|
|
|
|
This makes us more sure that header files are more self-sufficient.
|
|
|
|
The idea is to make include statements more explicit and obvious where the
file is coming from, additionally reducing chance of wrong header being
picked up.
For example, it was not obvious whether bvh.h was refferring to builder
or traversal, whenter node.h is a generic graph node or a shader node
and cases like that.
Surely this might look obvious for the active developers, but after some
time of not touching the code it becomes less obvious where file is coming
from.
This was briefly mentioned in T50824 and seems @brecht is fine with such
explicitness, but need to agree with all active developers before committing
this.
Please note that this patch is lacking changes related on GPU/OpenCL
support. This will be solved if/when we all agree this is a good idea to move
forward.
Reviewers: brecht, lukasstockner97, maiself, nirved, dingto, juicyfruit, swerner
Reviewed By: lukasstockner97, maiself, nirved, dingto
Subscribers: brecht
Differential Revision: https://developer.blender.org/D2586
|
|
|
|
Solves memory regression by the default configuration.
|
|
The issue here was mainly coming from minimal pixel width feature
which is quite commonly enabled in production shots.
This feature will use some probabilistic heuristic in the curve
intersection function to check whether we need to return intersection
or not. This probability is calculated for every intersection check.
Now, when we use multiple BVH nodes for curve primitives we increase
probability of that primitive to be considered a good intersection
for us. This is similar to increasing minimal width of curve.
What is worst here is that change in the intersection probability
fully depends on exact layout of BVH, meaning probability might
change differently depending on a view angle, the way how builder
binned the primitives and such. This makes it impossible to do
simple check like dividing probability by number of BVH steps.
Other solution might have been to split BVH into fully independent
trees, but that will increase memory usage of all the static
objects in the scenes, which is also not something desirable.
For now used most simple but robust approach: store BVH primitives
time and test it in curve intersection functions. This solves the
regression, but has two downsides:
- Uses more memory.
which isn't surprising, and ANY solution to this problem will
use more memory.
What we still have to do is to avoid this memory increase for
cases when we don't use BVH motion steps.
- Reduces number of maximum available textures on pre-kepler cards.
There is not much we can do here, hardware gets old but we need
to move forward on more modern hardware..
|
|
|
|
|
|
|
|
This way we can stop traversing BVH node early on.
Gives about 2-2.5x times render time improvement with 3 BVH steps.
Hopefully this gives no measurable performance loss for scenes with
single BVH step.
Traversal is currently only implemented for QBVH, meaning old CPUs
and GPU do not benefit from this change.
|
|
Similar to the previous commit, the statistics goes as:
BVH Steps Render time (sec) Memory usage (MB)
0 46 260
1 27 373
2 18 598
3 15 826
Scene used for the tests is the agent's body from one of the barber
shop scenes (no textures or anything, just a diffuse material).
Once again this is limited to regular (non-spatial split) BVH,
Support of spatial split to this feature will come later.
|
|
The idea is to create several smaller BVH nodes for each of the motion
curve primitives. This acts as a forced spatial split for the single
primitive.
This gives up render time speedup of motion blurred hair in the cost
of extra memory usage. The numbers goes as:
BVH Steps Render time (sec) Memory usage (MB)
0 258 191
1 123 278
2 69 453
3 43 627
Scene used for the tests is the agent's hair from one of the barber
shop scenes.
Currently it's only limited to scenes without spatial split enabled,
since the spatial split builder requires some changes to work properly
with motion steps coordinates.
|
|
|
|
|
|
|
|
|
|
makes code slightly shorter and uses idea of const qualifiers.
|
|
Run into some nasty bugs while trying various things here.
Wouldn't mind enabling -Wshadow for Cycles actually..
|
|
This way we can have different limits for regular and motion curves
which we'll do in one of the upcoming commits in order to gain some
percents of speedup.
The reasoning here is that motion curves are usually intersecting
lots of others bounding boxes, which makes it inefficient to have
single primitive in the leaf node.
|
|
Maximal number of elements is supposed to be inclusive. That is what
it was always meant in this file and what @brecht considered still
the case in 6974b69c6172.
In fact, the commit message to that change mentions that we allowed
up to 2 curve primitives per leaf while in fact it was doing up to 1
curve primitive.
Making it real 2 primitives at a max gives about 5% slowdown for the
koro.blend scene. This is a reason why BVHParams.max_curve_leaf_size
was changed to 1 by this change.
|
|
|
|
It was possible to have non-initialized unaligned BVH split
to be used when regular BVH split SAH was inf. Now we ensure
that unaligned splitter is only used when it's really initialized.
It's a regression and should be in 2.78a.
|
|
|
|
|
|
This is a special builder type which is allowed to orient nodes to
strands direction, hence minimizing their surface area in comparison
with axis-aligned nodes. Such nodes are much more efficient for hair
rendering.
Implementation of BVH builder is based on Embree, and generally idea
there is to calculate axis-aligned SAH and oriented SAH and if SAH
of oriented node is smaller than axis-aligned SAH we create unaligned
node.
We store both aligned and unaligned nodes in the same tree (which
seems to be different from what Embree is doing) so we don't have
any any extra calculations needed to set up hair ray for BVH
traversal, hence avoiding any possible negative effect of this new
BVH nodes type.
This new builder is currently not in use, still need to make BVH
traversal code aware of unaligned nodes.
|
|
|
|
In certain types of animation it's possible to have some objects
scaling to zero. In this case we can save render times by avoid
traversing such instances.
Better to do ti ahead of a time, so traversal stays simple.
Reviewers: lukasstockner97, dingto, brecht
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2048
|
|
Some of these values can get quite large and are hard to read, adding this
makes it easy to read them at a glance.
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D2039
|
|
separate some arrays.
Differential Revision: https://developer.blender.org/D2016
|
|
|
|
|
|
|
|
|
|
Simple fix: release lock earlier.
Reduces spatial split build time from 96 to 53sec on the Bunny.blend
(using studio Intel for benchmark).
NOTE: Timing difference is not that spectacular when comparing numbers
with builds before memory optimization, but even then it's about 20%
faster build.
|
|
This was only visible on systems with lots of threads and root of the issue
was that we've been pre-allocating too much memory for all the threads.
Now we only pre-allocate data for the main thread and rest of the threads
does allocation on-demand.
This brings down memory usage from 36Gig to 6.9Gig when building spatial
split for the Bunny.blend file on our Intel beast.
Originally regression was happened by the threaded spacial split builder
commit.
|
|
|