Age | Commit message (Collapse) | Author |
|
The problem here was, as the title says, that the two kernels were swapped.
Since shader evaluation is only used for building the samling map when World MIS is enabled, rendering without it would still work fine, although baking also was broken.
|
|
The previous refactor changed the code to use a separate logging mechanism to support multithreaded compilation.
However, since that's not supported by any frameworks yes, it just resulted in bad logging behaviour.
So, this commit changes the logging to go diectly to stdout/stderr once again by default.
|
|
Reviewed By: sergey, brecht, dfelinto
Differential Revision: https://developer.blender.org/D1952
|
|
This was giving some speedup but made intersection tests to fail
from watertight point of view.
Needs deeper investigation, but need to quickly get it fixed for
the studio.
|
|
|
|
|
|
This is something tested by @LazyDodo and suggested by Maxym to make
MSVC happier.
|
|
This gives about 5% speedup on AVX2 kernels (other kernels still
have SSE disabled for math operations) and this solves the slowdown
of koro scene mention in the previous commit.
The title says it all actually. This commit also contains
changes to pass float3 as const reference in affected functions.
This should make MSVC happier without breaking OpenCL because it's
only done in areas which are ifdef-ed for non-OpenCL.
Another patch based on inspiration from Maxym Dmytrychenko, thanks!
|
|
This commit basically vectorizes existing code using AVX2 instructions
(without modifying algorithm itself). This gives quite nice speedups:
BMW: -8%
Classroom: -5%
Cat: -5%
Koro: +1%
Barcelona: -8%
That's on Linux machine, reported performance improvement on Windows
goes up to 20%.
Not currently sure why Koro is somewhat slower because it mainly uses
curve intersection tests, could be a time noise? Or osmething with the
cache utilization perhaps? In any case speedup in other scenes makes
me thinking that current state is acceptable for initial implementation.
This is again inspired by Maxym Dmytrychenko.
|
|
Based on existing ssef data type and to my knowledge it's also what happens in
Embree nowadays.
Inspired by Maxym Dmytrychenko and required for the upcoming triangle
intersection commit.
Hopefully the copyright message is correct.
|
|
Currently this does not give measurable difference, but is required
ground work for some upcoming further optimization of AVX2 kernels.
|
|
|
|
When ray hits curve segment with SSS shader it was possible to have
uninitialized hit_P variable used for sampling.
Seems that was a reason of our headache of difference between AVX2
and SSE4 render results here, so now we can revert all the nasty
ifdef-ed inline policies.
|
|
|
|
There are no user-visible changes, just some internal restructuring.
Differential Revision: https://developer.blender.org/D2231
|
|
|
|
|
|
|
|
|
|
Mostly this is making inlining match CUDA 7.5 in a few performance critical
places. The end result is that performance is now better than before, possibly
due to less register spilling or other CUDA 8.0 compiler improvements.
On benchmarks scenes, there are 3% to 35% render time reductions. Stack memory
usage is reduced a little too.
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D2269
|
|
|
|
|
|
Fixes for clamp-omp, fewer shared variables, fix some cases of threads writing
to the same memory location. Issue found by Jens Verwiebe, who reports 30%
speedup with 16 core CPU, when using this with a recent clang-omp version.
|
|
This is also an important mathematical operation that can be folded
if it is known that one argument is a certain constant. For colors
the operation is provided as a Gamma node.
The SVM Gamma node needs a small fix to make it follow the 0 ^ 0 == 1
rule, same as the Power node, or the Gamma node itself in OSL mode.
Reviewers: #cycles
Differential Revision: https://developer.blender.org/D2263
|
|
|
|
It will discard the whole tile, but it's still kind of more friendly than
fully locked interface (sort of) for until tile is fully sampled.
Sorry if it causes PITA to merge for the opencl split work, but this issue
bothering a lot when collecting benchmarks.
|
|
|
|
Previously it was falling back to just a path after #include
statement was finished. Now we fall back to a proper current
file name after dealing with the preprocessor statement.
|
|
Some of the files were wrongly attributing code to some other
organizations and in few places proper attribution was missing.
This is mainly either a copy-paste error (when new file was
created from an existing one and header wasn't updated) or due
to some refactor which split non-original-BF code with purely
BF code.
Should solve some confusion around.
|
|
In Windows, event dispatching code is throwing out the wheel scroll count value.
Despite of how many fast you move the wheel, it only make one-notch scroll event.
This patch convert wheel event to multiple 1-notch wheel events.
This also correct the handling of smooth scroll mouse wheel (which can report smaller than 1-notch wheel movement) by accumulating the small wheel delta values.
Reviewers: djnz, shadowrom, elubie, #platform:_windows, sergey, juicyfruit, brecht
Reviewed By: shadowrom, elubie, #platform:_windows, brecht
Subscribers: dingto, elubie, brachi, brecht
Differential Revision: https://developer.blender.org/D143
|
|
|
|
|
|
We need a long-term fix, but this will get 2.78 out the door.
|
|
|
|
Path Tracing
The light sampling functions calculate light sampling PDF for the case that the light has been randomly selected out of all lights.
However, since BPT handles lamps and meshlights separately, this isn't the case. So, to avoid a wrong result, the code just included the 0.5 factor in the throughput.
In theory, however, the correction should be made to the sampling probability, which needs to be doubled. Now, for the regular calculation, that's no real difference since the throughput is divided by the pdf.
However, it does matter for the MIS calculation - it's unbiased both ways, but including the factor in the PDF instead of the throughput should give slightly better results.
Reviewers: sergey, brecht, dingto, juicyfruit
Differential Revision: https://developer.blender.org/D2258
|
|
is enabled
|
|
|
|
|
|
|
|
Was an integer overflow issue when calculating offsets.
|
|
|
|
Just to make it easier to research ways of possible code de-duplication.
|
|
|
|
|
|
|
|
|
|
We raised the minimum to GL 2.1 in Blender 2.77, and dropped support for older GPUs (pre-2012 Intel mostly). On Windows you get a popup message, but on Mac we simply crashed. Every Mac has a builtin software renderer for GL 2.1 so let's use that when the GPU is not capable!
Run blender --debug-gpu to see version detection & software fallback.
|
|
|
|
|
|
|