Age | Commit message (Collapse) | Author |
|
|
For animations, you often want an animated render seed (noise pattern).
This could be done by e.g. setting a driver on the seed value.
Now it's a small checkbox that can be enabled.
The animated seed is based on the current Blender frame and
the seed value itself. Simply enabling it already results in an animated
seed (different on each Blender frame), but it can be randomized further
by setting a different seed value.
Disabled by default, so no backward compatibility break.
Differential Revision: https://developer.blender.org/D1285
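A minimal sketch of how a frame number and a seed value could be combined, assuming a generic integer hash. The names `hash_2d` and `effective_seed` are hypothetical and the mixing constants are illustrative, not necessarily what Cycles uses:

```cpp
#include <cstdint>

// Hypothetical helper: mixes two integers into one scrambled value.
// Any reasonable integer hash would do here.
static uint32_t hash_2d(uint32_t a, uint32_t b)
{
    uint32_t h = (a * 0x9E3779B1u) ^ (b + 0x85EBCA6Bu);
    h ^= h >> 16;
    h *= 0x7FEB352Du;
    h ^= h >> 15;
    h *= 0x846CA68Bu;
    h ^= h >> 16;
    return h;
}

// Returns the seed to render with. With animation off the user seed is
// passed through unchanged, so existing files render as before.
uint32_t effective_seed(uint32_t user_seed, uint32_t frame, bool animate)
{
    return animate ? hash_2d(frame, user_seed) : user_seed;
}
```

With the checkbox off the seed is untouched; with it on, each frame gets its own seed, and changing the seed value changes the whole sequence.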
|
|
It is still disabled for AMD devices, since we can't test whether it works
fine on this hardware.
|
|
The experimental feature set is currently unavailable for the megakernel;
it'll require some changes to the cache system to distinguish cached regular
kernels from cached experimental kernels.
Currently unused, but some features will be enabled soon.
|
|
Previously it was explicitly described as an NVidia kernel related option,
but in fact it's also handy for the OpenCL kernel.
|
|
The driver fails to compile the kernel in reasonable time for those devices
here, so the bake kernel is disabled for now to make testing of the OpenCL
split kernel work easier.
|
|
This way it's possible to do device-selective feature disabling/enabling.
Currently only supported for NVidia devices via OpenCL extension.
|
|
This way it's easier to access platform name, device ID and other stuff which
might be needed to define build options.
|
|
This required allocating some memory related to the object transform needed
by ShaderData, and currently it is done for all the platforms. Since we're
targeting full feature-complete platforms this is rather acceptable at
this point, and in the future we'll do selective NO_HAIR/NO_SSS/NO_BLUR
kernels.
This is still experimental, and in fact there are some major issues on
the NVidia platform. It's not really clear if it's a bug in the compiler,
some uninitialized variable or another kind of issue.
|
|
Some trivial fixes like spaces around operators and a missing semicolon,
plus a fix for wrong detection of the ShaderData SOA size. That was harmless
since there's only one closure array, but it's still better to fix this.
|
|
This file was actually checking for features enabled on CPU, and all of
them were enabled, so removing the checks does not make any difference.
Ideally we'll need to do runtime feature detection and just pass some stuff
as NULL to the kernel, or maybe also have variadic kernel entry points, which
is also possible quite easily.
|
|
It's good for testing and seems to work quite reliably here.
This is probably not totally cheap in terms of performance, but we
could solve that quite easily by selective kernel compilation once other
things are tested and proven to be reliable.
|
|
configurations
|
|
The kernels are now compiling just fine, but there're some issues
during rendering. This is still to be investigated.
|
|
Doing this as a separate commit so it's easier to revert in the future, once
OpenCL 2.0 becomes our requirement.
|
|
This is required for OpenCL prior to 2.0 and those functions will become
handy when working on camera/motion blur support in split kernel.
|
|
Apart from simply enabling these features, the needed changes to the code were made.
Technical change, replacing SD access from "simple" structure to SOA.
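For illustration, the difference between the "simple" structure and an SOA layout can be sketched as below. The field names and the `std::vector` storage are assumptions made for the example, not the actual ShaderData definition:

```cpp
#include <cstddef>
#include <vector>

// "Simple" layout: one struct per ray, fields interleaved in memory.
struct ShaderDataAOS {
    float u, v;
    int prim;
};

// SOA layout: one array per field, all indexed by the ray index.
// Neighbouring rays then read neighbouring memory for the same field.
struct ShaderDataSOA {
    std::vector<float> u, v;
    std::vector<int> prim;

    explicit ShaderDataSOA(std::size_t num_rays)
        : u(num_rays), v(num_rays), prim(num_rays) {}
};
```

Access changes from `sd[ray].prim` to `sd.prim[ray]`, which is why every place touching SD has to be updated.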
|
|
|
No need to store them in the class, they're unlikely to be changed
and if they do change we're in big trouble anyway.
A more appropriate approach would then be to typedef these things in
kernel_types.h, but still use inlined sizeof().
|
|
Simple integer overflow issue.
TODO(sergey): Check on CPU cubic sampling, it might also need size_t.
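A sketch of the kind of overflow meant here, assuming the offset is computed from pixel coordinates (the function names and the channel parameter are hypothetical). The 32-bit variant wraps around for large images, while widening to size_t before multiplying keeps the result correct on 64-bit platforms:

```cpp
#include <cstddef>
#include <cstdint>

// 32-bit arithmetic: wraps modulo 2^32 once the image gets large enough.
uint32_t pixel_offset_32(uint32_t x, uint32_t y, uint32_t width, uint32_t channels)
{
    return (y * width + x) * channels;
}

// Every operand is widened to size_t before the multiplication happens.
size_t pixel_offset(int x, int y, int width, int channels)
{
    return ((size_t)y * (size_t)width + (size_t)x) * (size_t)channels;
}
```

Casting only the final result would not help, since the multiplication would already have overflowed by then.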
|
|
Not terribly necessary in this case, since we are just drawing a quad,
but it makes Blender overall more GL 3.x core ready.
|
|
simple addition.
|
|
Suggested by Brecht, tested with gcc > 4.4 and Clang
|
|
Reported by IRC user HG1.
|
|
Thanks to Dingto for noticing!
|
|
This makes OCIO viewport color correction a little bit faster (about -0.5s for 100 samples)
Also set max half float value to 65504.0 to conform with IEEE 754.
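For reference, 65504 is the largest finite value of the IEEE 754 binary16 format. A minimal sketch of clamping before a float to half conversion, assuming a symmetric clamp (the helper name is hypothetical and the actual code may clamp differently):

```cpp
#include <algorithm>

// Largest finite value representable as an IEEE 754 half float.
static const float HALF_MAX = 65504.0f;

// Keeps the value inside the finite half float range, so the later
// conversion can't produce infinity.
float clamp_to_half_range(float value)
{
    return std::min(std::max(value, -HALF_MAX), HALF_MAX);
}
```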
|
|
Was using platform as a device id accidentally.
|
|
Branched Path is not supported, neither in the Split nor Megakernel.
|
|
It was using direction transform, which is obviously wrong.
|
|
Only those ones are priority for now, all the rest are still testable
if CYCLES_OPENCL_TEST or CYCLES_OPENCL_SPLIT_KERNEL_TEST environment
variables are set.
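The check could look roughly like the sketch below. Only the two variable names come from the commit message; the helper names and the exact policy (any value counts as set) are assumptions:

```cpp
#include <cstdlib>

// True when the named environment variable is set to any value.
static bool env_is_set(const char *name)
{
    return std::getenv(name) != nullptr;
}

// Hypothetical policy: priority devices are always available, all the
// rest only when one of the test variables is present.
bool opencl_device_enabled(bool is_priority_device)
{
    return is_priority_device ||
           env_is_set("CYCLES_OPENCL_TEST") ||
           env_is_set("CYCLES_OPENCL_SPLIT_KERNEL_TEST");
}
```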
|
|
It's a bug in the compiler, but it's nice to have a working kernel until
that bug is fixed.
|
|
This commit contains all the work related to the AMD megakernel split,
which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus
some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely
someone else whom we're forgetting to mention.
Currently only AMD cards are enabled for the new split kernel, but it is
possible to force split opencl kernel to be used by setting the following
environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1.
Not all the features are supported yet: no motion blur, camera blur,
SSS or volumetrics for now. Also transparent shadows are
disabled on AMD devices because of a compiler bug.
This kernel also only implements regular path tracing, and supporting
the branched one will take a while. Branched path tracing is still exposed
in the interface, which is a bit misleading, and will be hidden there soon.
More features will be enabled once they're ported to the split kernel and
tested.
Neither regular CPU nor CUDA has any difference, they're generating the
same exact code, which means no regressions/improvements there.
Based on the research paper:
https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf
Here's the documentation:
https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit
Design discussion of the patch:
https://developer.blender.org/T44197
Differential Revision: https://developer.blender.org/D1200
|
|
This way the device can actually decide how to optimize the kernel
in order to make it most efficient.
|
|
The goal is to be able to compile the kernel with only the nodes which are
actually needed to render the current scene, hence improving kernel performance.
The idea is:
- Have a few node groups, starting with a group which contains nodes that are
used really often, and then a couple of groups which extend this one.
- Have feature-based node disabling, so it's possible to disable nodes related
to features which are not used with the currently used node group.
This commit only lays down needed routines for this approach, actual split will
happen later after gathering statistics from bunch of production scenes.
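One way to picture the approach is to turn the chosen node group and the set of used features into kernel build options. This is only a sketch: the feature bits, the define names and `node_build_options` are all hypothetical, not the routines added by this commit:

```cpp
#include <string>

// Hypothetical feature bits; a real set would live in the kernel headers.
enum NodeFeature : unsigned {
    NODE_FEATURE_VOLUME = 1u << 0,
    NODE_FEATURE_HAIR   = 1u << 1,
    NODE_FEATURE_BUMP   = 1u << 2,
};

// Builds compiler defines from the highest node group the scene needs
// and a bitmask of the features its nodes rely on.
std::string node_build_options(int max_group, unsigned features)
{
    return "-DNODES_MAX_GROUP=" + std::to_string(max_group) +
           " -DNODES_FEATURES=" + std::to_string(features);
}
```

The kernel source would then compile a node in only when its group is within the maximum and its feature bit is set.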
|
|
This will be used by split kernel in order to compile most optimal kernel.
Maximum number of closures is actually being cached in the session, so viewport
rendering will not trigger kernel re-loading when number of closures goes down.
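The caching behaviour can be sketched as follows, assuming the session keeps a running maximum (the struct and method names are hypothetical):

```cpp
// Hypothetical session-side cache of the closure count.
struct ClosureCache {
    int max_closures = 0;

    // Returns true when the kernel has to be reloaded, which is only the
    // case when the scene needs more closures than ever seen before.
    bool update(int scene_closures)
    {
        if (scene_closures <= max_closures) {
            return false;
        }
        max_closures = scene_closures;
        return true;
    }
};
```

A scene edit that lowers the closure count keeps the already loaded kernel, at the cost of it being compiled for more closures than strictly needed.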
|
|
Currently unused but will be needed soon for the split kernel work.
|
|
This is currently unused, but crucial for things like calculating the amount
of device memory required to deal with the tasks.
Maybe not really the best place to store it, but consider it good enough for now.
|
|
Previously we only had an experimental flag passed to the device's
load_kernel(), which was all fine. But since we're going to have some extra
parameters passed there, it makes sense to wrap them into a single struct,
which will make it easier to pass stuff around.
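In outline the change looks like the sketch below. The struct name, its fields and the `NullDevice` class are hypothetical and only illustrate the pattern:

```cpp
// Hypothetical parameter struct; new options become new fields instead
// of new function arguments.
struct RequestedFeatures {
    bool experimental = false;
    int max_closures = 0;
};

class Device {
public:
    virtual ~Device() {}
    // Before: virtual bool load_kernel(bool experimental) = 0;
    virtual bool load_kernel(const RequestedFeatures &features) = 0;
};

// Trivial device which just remembers what it was asked to load.
class NullDevice : public Device {
public:
    RequestedFeatures loaded;

    bool load_kernel(const RequestedFeatures &features) override
    {
        loaded = features;
        return true;
    }
};
```

Adding another parameter later then only touches the struct and the devices that care about it, not every load_kernel() signature.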
|