Age | Commit message | Author |
|
This was a missing bit from b53ce9a.
|
|
per node
|
|
This way it's clearer whether an issue is caused by lots of geometry in
the node or by lots of "transparent" BVH nodes.
|
|
Gives up to ~1% speedup again.
While that seems small, it's still nice since the code is now actually
cleaner than it used to be.
|
|
Just preparing for a new optimization to be used in all traversal implementations.
Should be no measurable difference.
|
|
Several ideas here:
- Optimize calculation of near_{x,y,z} in a way that does not require
3 if() statements per update, which avoids the negative effect of wrong
branch prediction.
- Optimization of direction clamping for BVH.
- Optimization of point/direction transform.
Brings ~1.5% speedup again depending on the scene (unfortunately, this
speedup can't be summed across all previous commits because the speedup of
each of the changes varies from scene to scene, but it still seems to
be a nice solid speedup of a few percent on Linux, and a bigger speedup was
reported on Windows).
Once again, thanks Maxym for the inspiration!
Still TODO: We have multiple places where we need to calculate near
x,y,z indices in BVH; for now it's only done for the main BVH traversal.
Will try to move this calculation to a utility function and see if
that can be easily re-used across all the BVH flavors.
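The first idea above can be illustrated with a minimal sketch. It is not the code from the commit: the struct `NearFar`, the helper `near_far_index`, and the per-axis 0/1 index convention are all assumptions made for illustration. It only shows how a near/far index can be derived from the sign of the inverse direction without an if() statement.

```cpp
// Hypothetical helper, not the function from the actual code base.
// Assumed convention: index 0 selects the lower bound, 1 the upper bound.
struct NearFar {
    int near_idx;
    int far_idx;
};

inline NearFar near_far_index(float idir)
{
    // The comparison result converts to 0 or 1, so no if() is needed.
    // Written as !(idir >= 0) so that it gives the same result as an
    // if (idir >= 0.0f) { near = 0; } else { near = 1; } chain,
    // including for -0.0f and NaN inputs.
    const int is_negative = !(idir >= 0.0f);
    return NearFar{is_negative, is_negative ^ 1};
}
```

Calling such a helper once per axis would replace the three if() statements, and being a standalone function it could be shared between BVH flavors as the TODO suggests.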
|
|
Makes it simpler to compare different traversal algorithms.
|
|
Reviewers: brecht, sergey, dingto, juicyfruit
Differential Revision: https://developer.blender.org/D2220
|
|
Mostly this is making inlining match CUDA 7.5 in a few performance critical
places. The end result is that performance is now better than before, possibly
due to less register spilling or other CUDA 8.0 compiler improvements.
On benchmark scenes, there are 3% to 35% render time reductions. Stack memory
usage is reduced a little too.
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D2269
|
|
Some of the files were wrongly attributing code to other
organizations, and in a few places proper attribution was missing.
This is mainly either a copy-paste error (when a new file was
created from an existing one and the header wasn't updated) or due
to some refactor which split non-original-BF code from purely
BF code.
Should resolve some of the confusion around attribution.
|
|
for QBVH
|
|
All the changes are mainly giving explicit hints on inlining functions,
so they match how inlining worked with the previous toolkit.
This makes kernels compiled with CUDA 8 render, on average, at the same speed
as previous kernels. Some scenes are somewhat faster, some of them are
somewhat slower. But the slowdown is within 1% so far.
On the positive side, it allows us to enable newer generation cards on
buildbots (so GTX 10x0 will be officially supported soon).
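As a rough sketch of what explicit inlining hints can look like, the block below wraps compiler-specific keywords in macros. The macro names `EXAMPLE_FORCEINLINE` and `EXAMPLE_NOINLINE` and both functions are hypothetical and chosen for illustration only; the commit message does not say which functions were annotated or how the macros in the code base are named.

```cpp
// Hypothetical macro names; the real code base may use different ones.
#if defined(__CUDACC__)
#  define EXAMPLE_FORCEINLINE __host__ __device__ __forceinline__
#  define EXAMPLE_NOINLINE __host__ __device__ __noinline__
#elif defined(_MSC_VER)
#  define EXAMPLE_FORCEINLINE __forceinline
#  define EXAMPLE_NOINLINE __declspec(noinline)
#else
#  define EXAMPLE_FORCEINLINE inline __attribute__((always_inline))
#  define EXAMPLE_NOINLINE __attribute__((noinline))
#endif

// Small, hot helper: ask the compiler to always inline it.
EXAMPLE_FORCEINLINE float dot3(const float a[3], const float b[3])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Rarely used path: keep it out of line so it does not add to the
// register pressure of its callers.
EXAMPLE_NOINLINE float example_rare_path(const float v[3])
{
    return dot3(v, v);
}
```

Which functions deserve which hint is an empirical question; the numbers quoted in the commit message come from benchmarking, not from the hints themselves.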
|
|
While they prevent a legitimate write past the array boundary,
those fixes introduced a regression in behavior when there are exactly
max_hits transparent intersections and nothing else.
The previous code would have considered such a case totally opaque,
which is not correct.
Fixes T48941: Some materials don't get transparent shadows anymore
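The boundary case described above can be sketched as follows. This is an illustration under assumptions, not the traversal code itself: the `Hit` struct, the helper `record_transparent_hits`, and the use of a `std::vector` as the source of candidate intersections are all hypothetical. The point is the ordering of the checks: the capacity test happens before the write, so nothing is written past the array, and a ray with exactly `max_hits` transparent hits and nothing else is not reported as blocked.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical intersection record, for illustration only.
struct Hit {
    float t;
    bool opaque;
};

// Returns true when the shadow ray should be treated as blocked.
// `hits` must have room for `max_hits` entries.
inline bool record_transparent_hits(const std::vector<Hit> &candidates,
                                    Hit *hits,
                                    uint32_t max_hits,
                                    uint32_t *num_hits)
{
    *num_hits = 0;
    for (const Hit &hit : candidates) {
        if (hit.opaque) {
            return true;  // an opaque surface always blocks
        }
        if (*num_hits == max_hits) {
            // No room for one more hit: give up and treat as blocked.
            // Checked before the write, so this also covers max_hits == 0.
            return true;
        }
        hits[*num_hits] = hit;
        ++*num_hits;
    }
    // Everything fit, including the case of exactly max_hits hits.
    return false;
}
```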
|
|
It was possible to miss the bounce termination criteria in these functions,
mainly when max_hits was set to 0.
Made the check more robust in traversal functions (which should not
affect performance, it's an operation of the same complexity AFAIK).
Also avoid doing ray-scene intersection from shadow_blocked when the
limit of transparent bounces was already reached.
|
|
Seems there's some conflict around the `near` identifier in that configuration.
|
|
Code might have been writing past the array boundaries.
|
|
This way restrict can be used for CUDA and OpenCL as well.
From quick tests in the areas I've been testing, it might give a
barely measurable percentage of speedup, but it increases register pressure.
So use of this qualifier is still really limited.
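A portable restrict qualifier is usually provided through a macro. The sketch below shows one way this could look; the macro name `EXAMPLE_RESTRICT` and the function `scale_add` are hypothetical, and the commit message does not state the name actually used in the code base.

```cpp
// Hypothetical macro name, for illustration only.
#if defined(__CUDACC__)
#  define EXAMPLE_RESTRICT __restrict__   /* CUDA */
#elif defined(__OPENCL_VERSION__)
#  define EXAMPLE_RESTRICT restrict       /* OpenCL C */
#elif defined(_MSC_VER)
#  define EXAMPLE_RESTRICT __restrict     /* MSVC */
#else
#  define EXAMPLE_RESTRICT __restrict__   /* GCC, Clang */
#endif

// The qualifier promises that dst and src never alias, which lets the
// compiler keep loaded values in registers across the stores to dst.
inline void scale_add(float *EXAMPLE_RESTRICT dst,
                      const float *EXAMPLE_RESTRICT src,
                      float scale,
                      int n)
{
    for (int i = 0; i < n; ++i) {
        dst[i] += scale * src[i];
    }
}
```

Keeping more values in registers is also where the extra register pressure mentioned above can come from, so the qualifier is best applied selectively and measured.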
|
|
Using camel case for variables is something that didn't come from our original
code, but rather from third party libraries. Let's avoid it as much as possible.
|
|
Better matches the naming of the volume traversal files, where we've got
an optimized version of a single step of volume intersection and a
traversal which gathers all volume intersections.
|
|
BVH traversal is not really geometry, and we've got
quite a few traversals now. It makes sense to keep them separate
for the sake of source structure clarity.
|