Age | Commit message (Collapse) | Author |
|
It's apparently not nice to access 0th element of zero-size vector in C++.
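A minimal sketch of the kind of guard this implies. The helper name first_element_or_null is hypothetical and not taken from the Cycles sources; it only illustrates avoiding &v[0] on an empty std::vector, which is undefined behavior.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (an assumption for illustration, not Cycles code):
// return a pointer to the first element, or nullptr when the vector is
// empty, instead of evaluating &v[0] on a zero-size vector.
template<typename T>
const T *first_element_or_null(const std::vector<T> &v)
{
    return v.empty() ? nullptr : &v[0];
}
```

Callers then check the returned pointer (or the vector size) before copying any data.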
|
|
Overlooked this in one of the previous BVH commits.
|
|
This way we'll notice pretty easily in debug builds if leaf splitting didn't
happen correctly.
There is no impact on release builds.
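A rough sketch of such a debug-only check, assuming primitive types are stored as plain integer tags. The names leaf_is_homogeneous and visit_leaf are made up for illustration and are not the kernel's.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true when all primitive type tags in the leaf are equal
// (an empty leaf counts as homogeneous).
bool leaf_is_homogeneous(const std::vector<int> &prim_types)
{
    for(std::size_t i = 1; i < prim_types.size(); i++) {
        if(prim_types[i] != prim_types[0]) {
            return false;
        }
    }
    return true;
}

// Hypothetical leaf visitor: returns the number of primitives it would
// intersect. The assert() expands to nothing when NDEBUG is defined, which
// is the usual setup for release builds, so the check costs nothing there.
std::size_t visit_leaf(const std::vector<int> &prim_types)
{
    assert(leaf_is_homogeneous(prim_types));
    return prim_types.size();
}
```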
|
|
This commit enables splitting BVH leaf nodes by primitive type and makes the
BVH traversal code aware of this so it can benefit from it.
As was mentioned in the original commit, this change is crucial to be able to do
single ray to multiple triangle intersection. It also appears to give a
barely visible speedup in some scenes.
In any case there should be no noticeable slowdown, and this change is what
we need to have anyway.
|
|
Use variables allocated on the stack and avoid heap allocation, which should make
the leaf splitting code a bit faster.
|
|
The idea of this change is to make it possible to split leaf nodes by primitive
type, so that each leaf contains primitives of the same type.
This will come in handy when working on single ray to multiple triangles
intersection code, plus with a careful implementation it might give some extra
benefits in the BVH traversal code by avoiding the primitive type fetch and check
for each primitive in the node. But it is a bit tricky to benefit from this
change alone, because the depth of the BVH increases.
This option is not exposed to the interface at all and is not used even secretly;
the commit is only needed to help work further in this direction without
messing around with local patches and worrying about them going out of date.
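The splitting idea can be sketched roughly as follows. This is only an illustration under assumed names (PrimitiveType, Primitive and split_leaf_by_type are hypothetical), and it uses heap-allocated containers for brevity rather than whatever the BVH builder actually does.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Illustrative primitive types; the names are assumptions, not the kernel's.
enum PrimitiveType { PRIM_TRIANGLE, PRIM_MOTION_TRIANGLE, PRIM_CURVE };

struct Primitive {
    PrimitiveType type;
    int index;
};

// Split a mixed leaf into one leaf per primitive type. Primitives keep their
// relative order within each type; leaves come out ordered by type value.
std::vector<std::vector<Primitive>> split_leaf_by_type(const std::vector<Primitive> &leaf)
{
    std::map<PrimitiveType, std::vector<Primitive>> buckets;
    for(const Primitive &prim : leaf) {
        buckets[prim.type].push_back(prim);
    }
    std::vector<std::vector<Primitive>> leaves;
    for(const auto &bucket : buckets) {
        leaves.push_back(bucket.second);
    }
    return leaves;
}
```

With every leaf holding a single type, traversal can read the type once per leaf instead of once per primitive, at the price of more leaves and a deeper tree.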
|
|
|
|
For CPU it gives the available instruction sets (SSE, AVX and so on).
For GPU CUDA it reports most of the attribute values returned by
cuDeviceGetAttribute(). Ideally we should only report the subset of those
which are driver-specific (so we don't clutter the system info with
values which we can get from the GPU specifications and which are sure to
stay the same because the driver can't affect them).
|
|
|
|
There was a conflict in headers between clew and util_optimization.h.
|
|
|
|
OpenCL apparently does not support templates, so the idea of a generic
function for swapping is a bit of a failure. Now it is either inlined
into the code (in triangle intersection) or has a specific implementation
for QBVH.
This is probably even better, because we can't create a QBVH-specific
function in util_math anyway.
|
|
|
|
This commit contains all the tweaks which were missing in the initial patch
re-integration from the standalone Cycles repository.
This commit also contains a utility CMake macro to help link targets
with different libraries for release/debug builds. The name currently is
target_link_libraries_decoupled;
it takes a target and a list of libraries and makes sure debug builds are
using libraries with the "_d" suffix.
After all these changes it'll hopefully be easier to interchange patches
between the Blender and standalone repositories, because they're now nearly
identical.
|
|
Ensure AVX/AVX2 is not used when Cycles is configured with
WITH_CPU_SSE set to OFF.
|
|
This way it is now possible to use gflags >= 2.1, where all the
functions were moved from the google namespace to the gflags namespace.
This isn't currently used in Blender, but for the standalone repository
this change is essential.
|
|
Made it a dedicated macro to link release/debug targets against lib/lib_d
libraries, which helps keep the code a bit cleaner.
Also made it so MSVC is now happy about building debug Cycles with OSL
support.
Reshuffled the code a bit and added some comments about what's going on, which
should make it a bit clearer.
|
|
SSE checks could still be decoupled, to be able to compile the SSE2
kernel and not SSE4 depending on the CPU or so.
|
|
|
|
This applies only to an application compiled from the standalone Cycles
repository.
There's still no proper install target, so currently the pthreads
library has to be copied next to cycles.exe manually.
|
|
This was handy when troubleshooting issues in the studio,
plus this is exactly the same thing which would be helpful
when solving issues with paths to compiled shaders and cubins
for the standalone repository.
|
|
This commit generalizes the logging module a little bit, making it possible to use
Glog logging in the standalone Cycles repository.
|
|
|
|
The basic idea is to check whether OIIO is compiled with the embedded PugiXML parser
and if so use PugiXML from OIIO, otherwise find a standalone PugiXML library.
|
|
|
|
Not sure why it worked on Debian but didn't work on Arch, could have
been some indirect link dependency or so.
Anyway, we explicitly depend on pthreads, so we need to do the corresponding
find_package().
|
|
These changes were done in the original commit of the standalone Cycles repository
and are needed here for easier patch synchronization.
|
|
The main purpose of this is to bring in a new gflags library which is more likely
to have a fix for undefined order of static variable initialization, and
also to bring in a new glog where some compilation errors are fixed (which are
only visible with stricter checks with clang and C++11 enabled).
|
|
|
|
This reverts commit 1549fea9995c348bc14a9105df5e460644e2b33a.
After some further discussion with other developers in the team it became
clear there's no correct solution here. It is more a matter of what's
more convenient in a particular case.
We're just going back to the old code to avoid possible frustration with
older files in newer Blender versions. This also means all HSV/HSL is considered
to be "linear" in the shading nodes.
Will be ported to 2.73 final.
|
|
This way it is somewhat safer to troubleshoot possible stack overflow issues.
|
|
Traversal can now push up to 2x as many nodes to the stack, so the stack size
needs some tweaks.
|
|
This way we'll be sure (in debug builds) that regular BVH traversal is not used
for a QBVH tree (could happen because of a logic mismatch between kernel and render).
|
|
The reason for this is that we don't use SSE optimization on 32-bit platforms
because of T36316.
Things to look into:
- Nail down the root of the issue in that report
- Implement non-SSE traversal code for QBVH
|
|
|
|
The issue is that only the instance node contains proper visibility flags;
the flags on nodes from the instanced BVH are not correct.
|
|
Seems the parent check didn't go deep enough and only checked a single parent.
Now it checks the chain of parents, which seems to be correct but requires
much more intensive testing.
|
|
This commit implements a heuristic which allows skipping nodes pushed to the stack
from intersection if the distance to them is larger than the distance to the current
intersection.
This should solve a speed regression which I didn't notice in the original QBVH
commit (possibly because I had a WIP version of this patch applied in my
local branch).
From quick tests, speed seems to be much closer to what it was with regular BVH.
There's still some possible code cleanup, but it'll need a bit of assembly
code checking, and for now I want to make it so artists can happily use Cycles over the
holidays.
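The heuristic can be sketched roughly like this, assuming each stack entry stores the ray distance at which the node's bounding box was entered. StackEntry and pop_next_node are illustrative names, not the kernel's, and the real traversal keeps its stack in a different form.

```cpp
#include <cassert>
#include <vector>

// Assumed stack entry layout for illustration only.
struct StackEntry {
    int node;    // index of the node to visit
    float dist;  // distance along the ray to the node's bounding box
};

// Pop entries until one is found that may still contain a closer hit.
// Entries farther away than the current closest intersection are dropped
// without being visited. Returns -1 when the stack is exhausted.
int pop_next_node(std::vector<StackEntry> &stack, float closest_hit)
{
    while(!stack.empty()) {
        const StackEntry entry = stack.back();
        stack.pop_back();
        if(entry.dist <= closest_hit) {
            return entry.node;
        }
    }
    return -1;
}
```

The saving comes from nodes that were pushed before a closer intersection was found: by the time they are popped, their stored distance already rules them out.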
|
|
versions.
Dunno exactly why this was done earlier, but I propose not removing code that is not understood.
|
|
|
|
Basically shadow rays were totally broken and most of the time did not record
any intersections, leading to really bad rendering artifacts.
This commit makes it so the render result is the same regardless of the enabled
optimization level.
|
|
This issue doesn't happen with 6.5.12 and there's a slight hope it'll be
fixed in the next toolkit releases.
For now we're forcing CUDA to not inline ray precalculation. This could lead to
some speed regression, but it shouldn't be huge: this code does not
run that often compared to actual triangle intersection.
|
|
|
|
This was already mixed a bit, but the dot belongs there.
|
|
|
|
We might remove this again in the future, but for testing purposes
during the release cycle, this will be useful.
The setting defaults to QBVH, and can be found in the Performance panel.
|
|
This is harmless for now because the tail of the node is zero there, but it is better
to fix it early so that this code doesn't cause issues if BVH nodes are
extended.
|
|
This commit enables the QBVH optimization structure automatically if rendering
on CPU and SSE2 support is detected.
This brings the render time of the agent shot back to the speed it used to be before
the watertight intersections commit; single koro and sponza scenes are about
7% faster here.
|
|
This commit implements traversal for the QBVH tree, which is based on the old loop
code for the traversal itself and on Embree for node intersection.
This commit also makes some changes to the loop inspired by Embree:
- Visibility flags are only checked for primitives.
Doing a visibility check for every node costs a noticeable amount of time,
and in most cases those checks pass anyway.
Another idea here would be to do visibility checks for leaf nodes only, but
this would need to be investigated further.
- For minimum hair width we extend all the nodes' bounding boxes.
Again, doing a curve visibility check for each of the nodes is quite costly, and
those checks return true for most of the hierarchy anyway.
There are still a number of possible optimizations, but the current state is good
enough in that it makes rendering a little faster after the recent watertight
commit.
Currently QBVH is only implemented for CPUs with at least SSE2 support. All
other devices would need to be supported later (if that makes sense from a
performance point of view).
The code is enabled for compilation in the kernel, but Blender doesn't use it
yet.
|
|
The idea is to make sure those children would never be intersected by a ray,
so that the kernel never has to worry about the number of child nodes.
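One way to get this behavior, sketched under assumptions: give unused children inverted bounds (minimum above maximum) and use a slab test that picks near and far planes by ray direction sign. The exact bounds Cycles writes are not shown in this log; Bounds, empty_bounds and ray_hits_bounds are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cfloat>

struct Bounds {
    float min[3];
    float max[3];
};

// Inverted bounds for an unused child slot: min is above max on every axis.
Bounds empty_bounds()
{
    Bounds b = {{FLT_MAX, FLT_MAX, FLT_MAX}, {-FLT_MAX, -FLT_MAX, -FLT_MAX}};
    return b;
}

// Slab test which selects the near/far plane from the sign of the ray
// direction. With inverted bounds the near distance always ends up larger
// than the far distance, so such a box is never reported as hit.
bool ray_hits_bounds(const float origin[3], const float inv_dir[3],
                     float t_max, const Bounds &bounds)
{
    float t_near = 0.0f;
    float t_far = t_max;
    for(int axis = 0; axis < 3; axis++) {
        const bool forward = inv_dir[axis] >= 0.0f;
        const float near_plane = forward ? bounds.min[axis] : bounds.max[axis];
        const float far_plane = forward ? bounds.max[axis] : bounds.min[axis];
        t_near = std::max(t_near, (near_plane - origin[axis]) * inv_dir[axis]);
        t_far = std::min(t_far, (far_plane - origin[axis]) * inv_dir[axis]);
    }
    return t_near <= t_far;
}
```

Note that a slab test which swaps the two plane distances with min/max would undo the inversion and report a hit, so this trick depends on the sign-based plane selection.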
|