Age | Commit message | Author |
|
This was already mixed a bit, but the dot belongs there.
|
|
|
|
We might remove this again in the future, but for testing purposes
during the release cycle, this will be useful.
The setting defaults to QBVH, and can be found in the Performance panel.
|
|
This is harmless for now because the tail of the node is zero there, but it is
better to fix it early so that extending BVH nodes doesn't cause issues in
this code.
|
|
This commit enables the QBVH optimization structure automatically when rendering
on a CPU with SSE2 support detected.
This brings the render time of the agent shot back to what it was before
the watertight intersections commit. The single koro and sponza scenes are about
7% faster here.
|
|
This commit implements traversal for the QBVH tree, based on the old loop
code for the traversal itself and on Embree for node intersection.
This commit also makes some changes to the loop inspired by Embree:
- Visibility flags are only checked for primitives.
Doing a visibility check for every node costs a considerable amount of time,
and in most cases those checks pass anyway.
Another idea would be to do visibility checks for leaf nodes only, but
this would need to be investigated further.
- For minimum hair width we extend all the nodes' bounding boxes.
Again, doing the curve visibility check for each node is quite costly, and
those checks return true for most of the hierarchy anyway.
There are still a number of possible optimizations, but the current state is
good enough in that it makes rendering a little faster after the recent
watertight commit.
Currently QBVH is only implemented for CPUs with at least SSE2 support. All
other devices would need to be supported later (if that makes sense from a
performance point of view).
The code is enabled for compilation in the kernel, but Blender doesn't use it
yet.
|
|
The idea is to make sure those children are never intersected by a ray,
so that the kernel never has to worry about the number of child nodes.
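
One way to get this behaviour is sketched below. It is only an illustration, not the code from the commit: the `QBVHNode` and `Ray` layouts and both function names are hypothetical. Unused child slots get an inverted (empty) bounding box, so an ordinary slab test over all four slots rejects them without ever looking at a child count. The sketch is scalar for readability and assumes finite `inv_dir` values.

```cpp
#include <algorithm>
#include <limits>

// Hypothetical QBVH node: bounds of up to four children, stored per axis.
struct QBVHNode {
  float bounds_min[3][4];
  float bounds_max[3][4];
};

// Hypothetical ray with the reciprocal direction precomputed by the caller.
struct Ray {
  float origin[3];
  float inv_dir[3];
  float t_max;
};

// Mark a child slot as unused by giving it an inverted (empty) box.
inline void qbvh_node_set_empty_child(QBVHNode &node, int child)
{
  const float inf = std::numeric_limits<float>::infinity();
  for (int axis = 0; axis < 3; axis++) {
    node.bounds_min[axis][child] = inf;
    node.bounds_max[axis][child] = -inf;
  }
}

// Slab test against all four slots. Bit i of the result is set when the
// ray hits the box of child i within [0, t_max].
inline int qbvh_node_intersect(const QBVHNode &node, const Ray &ray)
{
  int mask = 0;
  for (int child = 0; child < 4; child++) {
    float t_near = 0.0f;
    float t_far = ray.t_max;
    for (int axis = 0; axis < 3; axis++) {
      // Pick the entry and exit planes from the sign of the ray direction.
      const bool positive = ray.inv_dir[axis] >= 0.0f;
      const float lo = positive ? node.bounds_min[axis][child] : node.bounds_max[axis][child];
      const float hi = positive ? node.bounds_max[axis][child] : node.bounds_min[axis][child];
      t_near = std::max(t_near, (lo - ray.origin[axis]) * ray.inv_dir[axis]);
      t_far = std::min(t_far, (hi - ray.origin[axis]) * ray.inv_dir[axis]);
    }
    if (t_near <= t_far) {
      mask |= 1 << child;
    }
  }
  return mask;
}
```

For an empty slot the entry distance becomes +infinity and the exit distance -infinity, so the final comparison fails for any ray direction.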
|
|
Previously every BVH traversal file defined a macro to check which features
should be compiled in; now this macro is defined in the parent header.
|
|
The basic idea is to allow multiple implementations per feature set, meaning
this commit tries to make it easier to hook in new BVH traversal algorithms.
|
|
Most of them are not currently used but are essential for the further work.
- CPU kernels with SSE2 support will now have sse3b, sse3f and sse3i
- Added templated versions of min4 and max4, which are handy to use with
register variables.
- Added a util_swap function which takes its arguments by pointer.
So hopefully it'll be a portable version of std::swap.
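
A minimal sketch of what such helpers could look like follows. The names come from the list above, but the exact signatures are assumptions made for illustration and may differ from the ones in the commit. Nothing here depends on the standard library, which is the point of a portable replacement.

```cpp
// Swap two values through pointers, without relying on std::swap.
// Signature is an assumption; the commit only says it takes pointers.
template<typename T> inline void util_swap(T *a, T *b)
{
  T tmp = *a;
  *a = *b;
  *b = tmp;
}

// Minimum of four values of any type that supports operator<.
template<typename T> inline T min4(const T &a, const T &b, const T &c, const T &d)
{
  const T &ab = (b < a) ? b : a;
  const T &cd = (d < c) ? d : c;
  return (cd < ab) ? cd : ab;
}

// Maximum of four values of any type that supports operator<.
template<typename T> inline T max4(const T &a, const T &b, const T &c, const T &d)
{
  const T &ab = (a < b) ? b : a;
  const T &cd = (c < d) ? d : c;
  return (ab < cd) ? cd : ab;
}
```

Usage would be along the lines of `util_swap(&near_index, &far_index)`, where both variable names are hypothetical.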
|
|
Using this paper: Sven Woop, Watertight Ray/Triangle Intersection
http://jcgt.org/published/0002/01/05/paper.pdf
This change is expected to address a fair number of reports from the
bug tracker, plus it might help reduce the noise in some scenes.
Unfortunately, it's currently about 7% slower than the previous solution with
pre-computed triangle plane equations, but maybe with some smart tweaks to the
code (reshuffling tests, using SIMD in a nice way or so) we can avoid the speed
regression.
But perhaps the smartest thing to do here would be to replace single triangle /
ray intersection with multiple triangles / ray intersection. That's how Embree
does this, and its watertight single ray intersection is not any faster than this.
Currently only triangle intersection is modified according to the paper; in
the future we would also want to modify the node / ray intersection.
Reviewers: brecht, juicyfruit
Subscribers: dingto, ton
Differential Revision: https://developer.blender.org/D819
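
For readers who don't have the paper at hand, here is a compact scalar sketch of the test it describes, without backface culling. It follows the structure of the published algorithm rather than the code in this commit; the `Float3` and `RayPrecalc` types and the function names are made up for the example.

```cpp
#include <cmath>
#include <utility>

struct Float3 { float v[3]; };

// Per-ray data: axis permutation and shear constants.
struct RayPrecalc { int kx, ky, kz; float Sx, Sy, Sz; };

inline RayPrecalc ray_precalc(const Float3 &dir)
{
  RayPrecalc p;
  // kz is the axis where the ray direction is largest in magnitude.
  p.kz = 0;
  if (std::fabs(dir.v[1]) > std::fabs(dir.v[p.kz])) p.kz = 1;
  if (std::fabs(dir.v[2]) > std::fabs(dir.v[p.kz])) p.kz = 2;
  p.kx = (p.kz + 1) % 3;
  p.ky = (p.kx + 1) % 3;
  // Swap kx and ky to preserve the winding order of triangles.
  if (dir.v[p.kz] < 0.0f) std::swap(p.kx, p.ky);
  p.Sx = dir.v[p.kx] / dir.v[p.kz];
  p.Sy = dir.v[p.ky] / dir.v[p.kz];
  p.Sz = 1.0f / dir.v[p.kz];
  return p;
}

// Returns true on a hit in [0, t_max]; u, v are the barycentric weights of
// v0 and v1 (the weight of v2 is 1 - u - v).
inline bool triangle_intersect(const RayPrecalc &p, const Float3 &org, float t_max,
                               const Float3 &v0, const Float3 &v1, const Float3 &v2,
                               float *t, float *u, float *v)
{
  // Vertices relative to the ray origin, sheared so the ray becomes +Z.
  const float Ax = (v0.v[p.kx] - org.v[p.kx]) - p.Sx * (v0.v[p.kz] - org.v[p.kz]);
  const float Ay = (v0.v[p.ky] - org.v[p.ky]) - p.Sy * (v0.v[p.kz] - org.v[p.kz]);
  const float Bx = (v1.v[p.kx] - org.v[p.kx]) - p.Sx * (v1.v[p.kz] - org.v[p.kz]);
  const float By = (v1.v[p.ky] - org.v[p.ky]) - p.Sy * (v1.v[p.kz] - org.v[p.kz]);
  const float Cx = (v2.v[p.kx] - org.v[p.kx]) - p.Sx * (v2.v[p.kz] - org.v[p.kz]);
  const float Cy = (v2.v[p.ky] - org.v[p.ky]) - p.Sy * (v2.v[p.kz] - org.v[p.kz]);

  // Scaled barycentric coordinates (2D edge functions).
  float U = Cx * By - Cy * Bx;
  float V = Ax * Cy - Ay * Cx;
  float W = Bx * Ay - By * Ax;
  // Exactly on an edge: recompute in double precision, as the paper does.
  if (U == 0.0f || V == 0.0f || W == 0.0f) {
    U = (float)((double)Cx * (double)By - (double)Cy * (double)Bx);
    V = (float)((double)Ax * (double)Cy - (double)Ay * (double)Cx);
    W = (float)((double)Bx * (double)Ay - (double)By * (double)Ax);
  }
  // Mixed signs mean the ray passes outside the triangle.
  if ((U < 0.0f || V < 0.0f || W < 0.0f) && (U > 0.0f || V > 0.0f || W > 0.0f)) return false;
  const float det = U + V + W;
  if (det == 0.0f) return false;

  // Scaled hit distance, checked against [0, t_max] before dividing.
  const float Az = p.Sz * (v0.v[p.kz] - org.v[p.kz]);
  const float Bz = p.Sz * (v1.v[p.kz] - org.v[p.kz]);
  const float Cz = p.Sz * (v2.v[p.kz] - org.v[p.kz]);
  const float T = U * Az + V * Bz + W * Cz;
  const float sign = (det < 0.0f) ? -1.0f : 1.0f;
  if (T * sign < 0.0f || T * sign > t_max * det * sign) return false;

  const float inv_det = 1.0f / det;
  *u = U * inv_det;
  *v = V * inv_det;
  *t = T * inv_det;
  return true;
}
```

The per-ray part (`ray_precalc`) is computed once and reused for every triangle the ray is tested against, which is where a multi-triangle variant would amortize further.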
|
|
The idea is to store visibility flags for leaf nodes only, since the visibility
check for inner nodes costs too much for QBVH and is therefore not worth
performing. Leaf QBVH nodes have plenty of space to store all sorts of flags, so
we can make nodes one element smaller, saving a noticeable amount of memory.
|
|
|
|
Previously offsets were calculated based on the BVH node size,
which is wrong and a real PITA in cases when some extra data is
to be added to (or removed from) the node.
Now use offsets which are not calculated from the node size.
|
|
|
|
This solves a considerable over-allocation in the BVH instance packing code.
Unfortunately, it's not a magic bullet for the memory bump caused
by the recent QBVH changes.
For that we'll likely need to decouple storage for leaf and inner
nodes. However, it's not really clear for now whether that's
important, since it would still be just a fraction of the memory compared
to all the hi-res textures.
|
|
Title says it all, quite a straightforward implementation.
I would only mention that there's a bit of code duplication around packing a
node into pack.nodes. Trying to de-duplicate it ends up in quite hairy code
(like functions with loads of arguments, some of which could be NULL in certain
circumstances, etc.). Solving this duplication is left for later.
|
|
Previously all the nodes were counted and allocated, leading to situations
where a lot of allocated memory went unused because a considerable number of
nodes are simply ignored.
|
|
Visibility flags are set to all visibility anyway, so there was no reason
to perform that test.
TODO: We need to investigate whether having primitive intersection functions
which don't do the visibility check gives any speedup here as well.
|
|
This way extending intersection routines with some pre-calculation step wouldn't
explode the single file size, hopefully keeping them all in a nice maintainable
state.
|
|
|
|
x
|
|
|
|
to the XML API.
(Changes from the standalone repo)
|
|
These nodes were assuming sRGB input/output, which is certainly wrong for the
shader pipeline, which works in linear space.
So now conversion to/from linear space happens in these nodes. This makes them
make sense in the shader context, but it might change the look and feel of
existing scenes.
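
To illustrate the kind of conversion involved, here is a sketch of the standard piecewise sRGB transfer functions for a single channel. The message does not say which formula or helper the nodes use, so treat both function names and the choice of the piecewise curve (rather than, say, a plain 2.2 gamma) as assumptions.

```cpp
#include <cmath>

// sRGB-encoded channel value in [0, 1] to linear light.
inline float srgb_to_linear(float c)
{
  return (c <= 0.04045f) ? c / 12.92f : std::pow((c + 0.055f) / 1.055f, 2.4f);
}

// Linear light channel value in [0, 1] to sRGB encoding.
inline float linear_to_srgb(float c)
{
  return (c <= 0.0031308f) ? c * 12.92f : 1.055f * std::pow(c, 1.0f / 2.4f) - 0.055f;
}
```

A node that internally expects sRGB values would convert its linear input with `linear_to_srgb`, do its work, and convert the result back with `srgb_to_linear`.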
|
|
Title says it all, just be more careful in the future.
|
|
The issue was caused by the way the RNA pointer was created for the bMain:
Cycles was using RNA_id_pointer_create to create the pointer, which
would then try to refine the pointer based on the ID type.
This is just wrong and only worked so far by coincidence. With the
file path from the bug report, the first letters of the ID name happened to be
NT, which corresponds to NodeTree, and refining such a pointer is bound to
fail.
Simple solution: use the proper way to create an RNA pointer for a non-ID block.
|
|
As discussed in rB983c71931b1886d4, we should print a warning in case of building on non-Windows and WITH_BF_IME enabled. We also terminate build in this case, so the warning isn't scrolled away. Was worked out together with @sergey.
|
|
SVM was normalizing the input normal, OSL did not. This led to render
result differences between the two shading systems.
|
|
This was caused by an internal optimization which evaluated SSS with a
size of zero as a BSDF but used a different ID, so the evaluation result
didn't appear in the regular diffuse pass.
This led to a situation where SSS data was stored nowhere if the
size was zero.
Now SSS with zero size and with close-to-zero size is handled in the
same way from the passes' point of view.
|
|
|
|
|
|
|
|
|
|
|
|
Sorry, my bad :/
|
|
Original patch by @random (D765) with some minor work done by @campbell
and me.
At this point, I'd like to call out a number of people who were involved and
deserve a big "Thank you!":
* First of all @randon, who developed and submitted the patch
* The Blendercn community, which helped a lot with testing, especially
  @yuzukyo, @leon_cheung and @kjym3
* @campbellbarton, @mont29 and @sergey for their help and advice during
  review
* @ton, who realized the importance of this early on and asked me to
  review it
We are still not finished, as this is only the first part of the
implementation, but there's more to come!
|
|
frontmost but whole process
|
|
|
|
SpaceMouse Wireless
SpaceMouse Pro Wireless
Device info is from user reports. I don’t yet have the new devices, so
these are untested but likely to work :D
|
|
|
|
This way CUDA errors are visible in the image info line,
which makes things behave the same across viewport and
final rendering.
That's right, errors are now reported via both reports and the info
line. This is based on feedback from our Gooseberry
team.
|
|
This way, when something goes wrong in Cycles (for example running out of VRAM,
hitting a time limit when launching the kernel, etc.) we'll have a nice report
in the Info space header.
Sure, it would be nice to have the error mentioned in the image editor's
information line, but that's for the future.
This fixes T42747: "CUDA error" appears only momentarily, then disappears
|
|
Currently it acts the same as set_cancel(), but this way we're able to
distinguish situations when rendering was aborted by user demand (for
example pressing Esc in standalone renderer) or if something went horribly
wrong (for example out of VRAM error).
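
A minimal sketch of the distinction described here is shown below. Only `set_cancel()` is named in the message; the `Progress` class name, `set_error()` and the getters are assumptions made for the example, and the real class is certainly more involved (locking, callbacks and so on).

```cpp
#include <string>

// Hypothetical progress tracker: an error also cancels the render, but is
// remembered separately so callers can tell the two situations apart.
class Progress {
 public:
  void set_cancel(const std::string &message)
  {
    cancel_ = true;
    cancel_message_ = message;
  }

  void set_error(const std::string &message)
  {
    error_ = true;
    error_message_ = message;
    // Acts the same as a cancel so running work stops as before.
    set_cancel(message);
  }

  bool get_cancel() const { return cancel_; }
  bool get_error() const { return error_; }
  const std::string &get_cancel_message() const { return cancel_message_; }
  const std::string &get_error_message() const { return error_message_; }

 private:
  bool cancel_ = false;
  bool error_ = false;
  std::string cancel_message_;
  std::string error_message_;
};
```

Code that only cares about stopping keeps checking `get_cancel()`, while the reporting side can check `get_error()` to decide whether a failure message should be shown.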
|
|
This way, for example, we don't wait ages for the BVH to build after the
GPU has run out of memory while loading images, just to see the render failure
message.
|
|
|
|
Forbid OSL from polluting the current context with obscure stuff from
windows.h, it's not useful and unhealthy anyway.
Maybe we should also forbid using abbreviated Glog constants as
well though.
|
|
|
|
Basically, the title says it all; the option is called WITH_BF_CYCLES_LOGGING
|
|
Since shader closures are now allocated aligned in the OSL memory pool,
this workaround is no longer needed.
Also added a comment which describes the desired layout of the structure,
so that the array of shader closures is nicely aligned.
|