Age | Commit message (Collapse) | Author |
|
The issue was caused by stupid workaorund for libav. Now things works for
FFmpeg. There might need some tweaks needed for Libav, but that one is
not really priority for support.
|
|
|
|
Numerical inaccuracies would cause the XtWX matrix to be no longer
positive-semidefinite, which in turn caused the LSQ solver to fail.
|
|
Old code was working quite unreliable in combination with fast math
flag, especially when compiling with Clang. It seems we were hitting
result of the following bug submitted to Clang [1].
Basically, it was happening so that (int)sqrtf(64) was 7 when Cycles
is built with Clang but was correct 8 when built with GCC.
This commit works this around. Annoying, but don't see other way to
keep sampling pattern the same for Clang and GCC.
[1] https://bugs.llvm.org//show_bug.cgi?id=24063
|
|
Previous logic did not free memory used by vector classes
which were storing images, causing memory leaks.
|
|
|
|
|
|
|
|
|
|
|
|
log(0) is undefined and should not have been included
log(1) == 0, dividing by zero is not recommended
|
|
This commit contains the first part of the new Cycles denoising option,
which filters the resulting image using information gathered during rendering
to get rid of noise while preserving visual features as well as possible.
To use the option, enable it in the render layer options. The default settings
fit a wide range of scenes, but the user can tweak individual settings to
control the tradeoff between a noise-free image, image details, and calculation
time.
Note that the denoiser may still change in the future and that some features
are not implemented yet. The most important missing feature is animation
denoising, which uses information from multiple frames at once to produce a
flicker-free and smoother result. These features will be added in the future.
Finally, thanks to all the people who supported this project:
- Google (through the GSoC) and Theory Studios for sponsoring the development
- The authors of the papers I used for implementing the denoiser (more details
on them will be included in the technical docs)
- The other Cycles devs for feedback on the code, especially Sergey for
mentoring the GSoC project and Brecht for the code review!
- And of course the users who helped with testing, reported bugs and things
that could and/or should work better!
|
|
|
|
|
|
|
|
|
|
Fix for SSS in BPT.
|
|
|
|
The memory isn't initialized during allocation, so calling the assignment operator is a bad idea.
|
|
The issue was caused by unlimited textures commit, root of the issue is that
displacement code updates some of the image slots directly, so it needs to
ensure device vectors are all proper size.
|
|
|
|
|
|
Previously, every RenderPass would have a bitfield that specified its type. That limits the number of passes to 32, which was reached a while ago.
However, most of the code already supported arbitrary RenderPasses since they were also used to store Multilayer EXR images.
Therefore, this commit completely removes the passflag from RenderPass and changes all code to use the unique pass name for identification.
Since Blender Internal relies on hardcoded passes and to preserve compatibility, 32 pass names are reserved for the old hardcoded passes.
To support these arbitrary passes, the Render Result compositor node now adds dynamic sockets. For compatibility, the old hardcoded sockets are always stored and just hidden when the corresponding pass isn't available.
To use these changes, the Render Engine API now includes a function that allows render engines to add arbitrary passes to the render result. To be able to add options for these passes, addons can now add their own properties to SceneRenderLayers.
To keep the compositor input node updated, render engine plugins have to implement a callback that registers all the passes that will be generated.
From a user perspective, nothing should change with this commit.
Differential Revision: https://developer.blender.org/D2443
Differential Revision: https://developer.blender.org/D2444
|
|
Reduce thread divergence in kernel_shader_eval.
Rays are sorted in blocks of 2048 according to shader->id.
On R9 290 Classroom is ~30% faster, and Pabellon Barcelone is ~8% faster.
No sorting for CUDA split kernel.
Reviewers: sergey, maiself
Reviewed By: maiself
Differential Revision: https://developer.blender.org/D2598
|
|
It is really confusing to have some functions available in some devices
and not on another devices.
|
|
viewport is used
Previously the logic was different for duplis and regular objects: regular objects
were using render visibility when Render Layer option is enabled which duplis were
always using viewport visibility when rendering from the viewport.
This was quite confusing because caused different results in viewport and render
when artists were expecting them to match 1:1.
|
|
Can not measure any performance difference, so seems the code is identical
and just shorter.
|
|
It will use SSE2 optimized version when is possible.
|
|
These were causing problems with Nvidia OpenCL.
|
|
|
|
This implements branched path tracing for the split kernel.
General approach is to store the ray state at a branch point, trace the
branched ray as normal, then restore the state as necessary before iterating
to the next part of the path. A state machine is used to advance the indirect
loop state, which avoids the need to add any new kernels. Each iteration the
state machine recreates as much state as possible from the stored ray to keep
overall storage down.
Its kind of hard to keep all the different integration loops in sync, so this
needs lots of testing to make sure everything is working correctly. We should
probably start trying to deduplicate the integration loops more now.
Nonbranched BMW is ~2% slower, while classroom is ~2% faster, other scenes
could use more testing still.
Reviewers: sergey, nirved
Reviewed By: nirved
Subscribers: Blendify, bliblubli
Differential Revision: https://developer.blender.org/D2611
|
|
Spotted by Mai in IRC, thanks!
|
|
Global size y needs to be a multiple of 16.
|
|
This way we don't re-load kernels for every sample in the viewport.
Additionally, we don't risk global size changed inbetween of samples.
|
|
Not sure if this is a proper fix, but was getting frequent crashes, so
committing this real quick just to make master sable again. Can be
reverted later if there's a better fix. The changes to images really
need a closer look...
|
|
This way moving Blender bundle around doesn't re-trigger kernels compilation.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Previous fix did not work for mixed textures. This one will over-allocate
information array, but it's better than not being able to render at all.
Some more cleanup and improvement is coming.
|
|
unlimited textures commit"
This reverts commit 8f4166ee495531fa38b676b0a5ef4c482e89f9a5.
The fix was not correct for cases when we've got float textures.
|
|
|
|
textures commit
The indexing was totally wrong in both image packing code and image sampling in kernel.
Fixes T51341: Cycles OpenCL corruption in todays buildbot
|
|
|
|
|
|
|
|
|