Age | Commit message (Collapse) | Author |
|
Remove prefix of filenames that is the same as the folder name. This used
to help when #includes were using individual files, but now they are always
relative to the cycles root directory and so the prefixes are redundant.
For patches and branches, git merge and rebase should be able to detect the
renames and move over code to the right file.
|
|
* Rename struct KernelGlobals to struct KernelGlobalsCPU
* Add KernelGlobals, IntegratorState and ConstIntegratorState typedefs
that every device can define in its own way.
* Remove INTEGRATOR_STATE_ARGS and INTEGRATOR_STATE_PASS macros and
replace with these new typedefs.
* Add explicit state argument to INTEGRATOR_STATE and similar macros
In preparation for decoupling main and shadow paths.
Differential Revision: https://developer.blender.org/D12888
|
|
This is the first of a sequence of changes to support compiling Cycles kernels as MSL (Metal Shading Language) in preparation for a Metal GPU device implementation.
MSL requires that all pointer types be declared with explicit address space attributes (device, thread, etc...). There is already precedent for this with Cycles' address space macros (ccl_global, ccl_private, etc...), therefore the first step of MSL-enablement is to apply these consistently. Line-for-line this represents the largest change required to enable MSL. Applying this change first will simplify future patches as well as offering the emergent benefit of enhanced descriptiveness.
The vast majority of deltas in this patch fall into one of two cases:
- Ensuring ccl_private is specified for thread-local pointer types
- Ensuring ccl_global is specified for device-wide pointer types
Additionally, the ccl_addr_space qualifier can be removed. Prior to Cycles X, ccl_addr_space was used as a context-dependent address space qualifier, but now it is either redundant (e.g. in struct typedefs), or can be replaced by ccl_global in the case of pointer types. Associated function variants (e.g. lcg_step_float_addrspace) are also redundant.
In cases where address space qualifiers are chained with "const", this patch places the address space qualifier first. The rationale for this is that the choice of address space is likely to have the greater impact on runtime performance and overall architecture.
The final part of this patch is the addition of a metal/compat.h header. This is partially complete and will be extended in future patches, paving the way for the full Metal implementation.
Ref T92212
Reviewed By: brecht
Maniphest Tasks: T92212
Differential Revision: https://developer.blender.org/D12864
|
|
Previously the storage here was optimized to avoid indirections in BVH2
traversal. This helps improve performance a bit, but makes performance
and memory usage of Embree and OptiX BVHs a bit worse also. It also adds
code complexity in other parts of the code.
Now decouple triangle and curve primitive storage from BVH2.
* Reduced peak memory usage on all devices
* Bit better performance for OptiX and Embree
* Bit worse performance for CUDA
* Simplified code:
** Intersection.prim/object now matches ShaderData.prim/object
** No more offset manipulation for mesh displacement before a BVH is built
** Remove primitive packing code and flags for Embree and OptiX
** Curve segments are now stored in a KernelCurve struct
* Also happens to fix a bug in baking with incorrect prim/object
Fixes T91968, T91770, T91902
Differential Revision: https://developer.blender.org/D12766
|
|
This includes much improved GPU rendering performance, viewport interactivity,
new shadow catcher, revamped sampling settings, subsurface scattering anisotropy,
new GPU volume sampling, improved PMJ sampling pattern, and more.
Some features have also been removed or changed, breaking backwards compatibility.
Including the removal of the OpenCL backend, for which alternatives are under
development.
Release notes and code docs:
https://wiki.blender.org/wiki/Reference/Release_Notes/3.0/Cycles
https://wiki.blender.org/wiki/Source/Render/Cycles
Credits:
* Sergey Sharybin
* Brecht Van Lommel
* Patrick Mours (OptiX backend)
* Christophe Hery (subsurface scattering anisotropy)
* William Leeson (PMJ sampling pattern)
* Alaska (various fixes and tweaks)
* Thomas Dinges (various fixes)
For the full commit history, see the cycles-x branch. This squashes together
all the changes since intermediate changes would often fail building or tests.
Ref T87839, T87837, T87836
Fixes T90734, T89353, T80267, T80267, T77185, T69800
|
|
Prior to rBb8ecdbcd964a normals were stored both in
DeviceScene.tri_vnormal and the float3 attributes buffer. However, the
normals in `DeviceScene.tri_vnormal` might have be transformed to world
space if the object's transformation was applied, while the data in the
float3 attributes buffer were not. This caused shading issues in cases
where the objects did have transformation applied, as the math expects
the normals to be in object space.
To fix this, convert the normals to object space if necessary before
applying the normal map.
Reviewed By: brecht
Maniphest Tasks: T90854
Differential Revision: https://developer.blender.org/D12294
|
|
Vertex normals are needed for normals maps and therefore are packed and send
to the device alongside the other float3 attributes. However, we already pack
and send vertex normals through `DeviceScene.tri_vnormal`.
This removes the packing of vertex normals from the attributes buffer, and
reuses `tri_vnormal` in the kernel for normals lookup for normal maps, which
reduces memory usage a bit, and speeds up device updates.
This also fixes potential missing normals updates following rB12a06292af86,
since the need for vertex normals for normals maps was overlooked.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D12237
|
|
|
|
Offset rays from the flat surface to match where they would be for a smooth
surface as specified by the normals. In the shading panel there is now a
Shading Offset (existing option) and Geometry Offset (new).
The Geometry Offset works as follows:
* 0: disabled
* 0.001: only terminated triangles (normal points to the light, geometry
doesn't) are affected
* 0.1 (default): triangles at grazing angles are affected, and the effect
fades out
* 1: all triangles are affected
Limitations:
* The artifact is still visible in some cases, it could be that some quads
require to be treated specifically as quads.
* Inconsistent normals cause artifacts.
* If small objects cast shadows to a big low poly surface, the shadows can
appear to be in a wrong place - because the surface moved slightly above
the geometry. This can be noticed only at grazing angles to light.
* Approximated surfaces of two non-intersecting low-poly objects can overlap
that causes off-the-wall shadows.
Generally, using one or a few levels of subdivision can get rid of artifacts
faster than before.
Differential Revision: https://developer.blender.org/D11065
|
|
|
|
|
|
Byte colors must be encoded in sRGB and converted to linear on lookup,
to avoid precision loss.
|
|
Ref D2057
|
|
Previously only float3 and byte4 was supported.
Ref D2057
|
|
Ref T78041
|
|
|
|
None are currently exported from Blender, this fixes the Cycles side
implementation.
Differential Revision: https://developer.blender.org/D6794
|
|
This patch adds a new Vertex Color node. The node also returns the alpha
of the vertex color layer as an output.
Reviewers: brecht
Differential Revision: https://developer.blender.org/D5767
|
|
Apply clang format as proposed in T53211.
For details on usage and instructions for migrating branches
without conflicts, see:
https://wiki.blender.org/wiki/Tools/ClangFormat
|
|
Float2 are now a new type for attributes in Cycles. Before, the choices
for attribute storage were float and float3, the latter padded to
float4. This meant that UV maps were inflated to twice the size
necessary.
Reviewers: brecht, sergey
Reviewed By: brecht
Subscribers: #cycles
Tags: #cycles
Differential Revision: https://developer.blender.org/D4409
|
|
This commit contains the first part of the new Cycles denoising option,
which filters the resulting image using information gathered during rendering
to get rid of noise while preserving visual features as well as possible.
To use the option, enable it in the render layer options. The default settings
fit a wide range of scenes, but the user can tweak individual settings to
control the tradeoff between a noise-free image, image details, and calculation
time.
Note that the denoiser may still change in the future and that some features
are not implemented yet. The most important missing feature is animation
denoising, which uses information from multiple frames at once to produce a
flicker-free and smoother result. These features will be added in the future.
Finally, thanks to all the people who supported this project:
- Google (through the GSoC) and Theory Studios for sponsoring the development
- The authors of the papers I used for implementing the denoiser (more details
on them will be included in the technical docs)
- The other Cycles devs for feedback on the code, especially Sergey for
mentoring the GSoC project and Brecht for the code review!
- And of course the users who helped with testing, reported bugs and things
that could and/or should work better!
|
|
|
|
We started to run out of bits there, so now we separate flags
which came from __object_flags and which are either runtime or
coming from __shader_flags.
Rule now is: SD_OBJECT_* flags are to be tested against new
object_flags field of ShaderData, all the rest flags are to
be tested against flags field of ShaderData.
There should be no user-visible changes, and time difference
should be minimal. In fact, from tests here can only see hardly
measurable difference and sometimes the new code is somewhat
faster (all within a noise floor, so hard to tell for sure).
Reviewers: brecht, dingto, juicyfruit, lukasstockner97, maiself
Differential Revision: https://developer.blender.org/D2428
|
|
|
|
Some of the files were wrongly attributing code to some other
organizations and in few places proper attribution was missing.
This is mainly either a copy-paste error (when new file was
created from an existing one and header wasn't updated) or due
to some refactor which split non-original-BF code with purely
BF code.
Should solve some confusion around.
|
|
Using ones complement for detecting if transform has been applied was confusing
and led to several bugs. With this proper checks are made.
Also added a few transforms where they were missing, mostly affecting baking
and displacement when `P` is used in the shader (previously `P` was in the
wrong space for these shaders)
Also removed `TIME_INVALID` as this may have resulted in incorrect
transforms in some cases.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2192
|
|
Adds a descriptor for attributes that can easily be passed around and extended
to contain more data. Will be used for attributes on subdivision meshes.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2110
|
|
There are several internal changes for this:
First idea is to make __tri_verts to behave similar to __tri_storage,
meaning, __tri_verts array now contains all vertices of all triangles
instead of just mesh vertices. This saves some lookup when reading
triangle coordinates in functions like triangle_normal().
In order to make it efficient needed to store global triangle offset
somewhere. So no __tri_vindex.w contains a global triangle index which
can be used to read triangle vertices.
Additionally, the order of vertices in that array is aligned with
primitives from BVH. This is needed to keep cache as much coherent as
possible for BVH traversal. This causes some extra tricks needed to
fill the array in and deal with True Displacement but those trickery
is fully required to prevent noticeable slowdown.
Next idea was to use this __tri_verts instead of __tri_storage in
intersection code. Unfortunately, this is quite tricky to do without
noticeable speed loss. Mainly this loss is caused by extra lookup
happening to access vertex coordinate.
Fortunately, tricks here and there (i,e, some types changes to avoid
casts which are not really coming for free) reduces those losses to
an acceptable level. So now they are within couple of percent only,
On a positive site we've achieved:
- Few percent of memory save with triangle-only scenes. Actual save
in this case is close to size of all vertices.
On a more fine-subdivided scenes this benefit might become more
obvious.
- Huge memory save of hairy scenes. For example, on koro.blend
there is about 20% memory save. Similar figure for bunny.blend.
This memory save was the main goal of this commit to move forward
with Hair BVH which required more memory per BVH node. So while
this sounds exciting, this memory optimization will become invisible
by upcoming Hair BVH work.
But again on a positive side, we can add an option to NOT use Hair
BVH and then we'll have same-ish render times as we've got currently
but will have this 20% memory benefit on hairy scenes.
|
|
This commit contains all the work related on the AMD megakernel split work
which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus
some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely
someone else which we're forgetting to mention.
Currently only AMD cards are enabled for the new split kernel, but it is
possible to force split opencl kernel to be used by setting the following
environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1.
Not all the features are supported yet, and that being said no motion blur,
camera blur, SSS and volumetrics for now. Also transparent shadows are
disabled on AMD device because of some compiler bug.
This kernel is also only implements regular path tracing and supporting
branched one will take a bit. Branched path tracing is exposed to the
interface still, which is a bit misleading and will be hidden there soon.
More feature will be enabled once they're ported to the split kernel and
tested.
Neither regular CPU nor CUDA has any difference, they're generating the
same exact code, which means no regressions/improvements there.
Based on the research paper:
https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf
Here's the documentation:
https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit
Design discussion of the patch:
https://developer.blender.org/T44197
Differential Revision: https://developer.blender.org/D1200
|
|
Using this paper: Sven Woop, Watertight Ray/Triangle Intersection
http://jcgt.org/published/0002/01/05/paper.pdf
This change is expected to address quite reasonable amount of reports from the
bug tracker, plus it might help reducing the noise in some scenes.
Unfortunately, it's currently about 7% slower than the previous solution with
pre-computed triangle plane equations, but maybe with some smart tweaks to the
code (tests reshuffle, using SIMD in a nice way or so) we can avoid the speed
regression.
But perhaps smartest thing to do here would be to change single triangle / ray
intersection with multiple triangles / ray intersections. That's how Embree does
this and it's watertight single ray intersection is not any faster that this.
Currently only triangle intersection is modified accordingly to the paper, in
the future we would also want to modify the node / ray intersection.
Reviewers: brecht, juicyfruit
Subscribers: dingto, ton
Differential Revision: https://developer.blender.org/D819
|
|
This way extending intersection routines with some pre-calculation step wouldn't
explode the single file size, hopefully keeping them all in a nice maintainable
state.
|
|
|
|
Not as if it gives noticeable changes render-time, but it's just weird to
convert float4 to float 3 to just access individual x/y/z components.
Plus some compilers might be more stupid than GCC and don't optimize this
out well.
|
|
tri_shader does no longer need to a float.
Reviewers: dingto, sergey
Reviewed By: dingto, sergey
Subscribers: dingto
Projects: #cycles
Differential Revision: https://developer.blender.org/D789
|
|
|
|
Root of the issue goes back to the on-fly normals commit and the
latest fix for it wasn't actually correct. I've mixed two fixes
in there.
So the idea here goes back to storing negative scaled object flag
and flip runtime-calculated normal if this flag is set, which is
pretty much the same as the original fix for the issue from me.
The issue with motion blur wasn't caused by the rumtime normals
patch and it had issues before, because it already did runtime
normals calculation. Now made it so motion triangles takes the
negative scale flag into account.
This actually makes code more clean imo and avoids rather confusing
flipping code in mesh.cpp.
|
|
|
|
Fix T41115: Motion Blur renders Objects Black - But not in Viewport Preview
This actually extends previous fix to normals and makes it all much nicer now.
Worth doing some intense testing, quick one worked just fine but there always
could be some corner cases.
|
|
Fix T41079: Solid black render of object with negative scale and smooth shading
In both cases the issue was caused by negative scaled objects with single mesh
users for which scale gets applied when using static BVH.
Since the on-fly normals calculation land normals for such cases weren't flipped
leading them to point to a wrong direction.
Added a special object flag for this, which is a bit of a bummer because now
we've got less bits for real useful things, but this is the only way to get
proper normals without adding more complexity in the on-fly calculations.
|
|
* Added support for uchar4 attributes to Cycles' attribute system.
* This is used for Vertex Colors now, which saves some memory (4 unsigned characters, instead of 4 floats).
* GPU Texture Limit on sm_20 and sm_21 decreased from 95 to 94, because we need a new texture for the uchar4 attributes. This is no problem for sm_30 or newer.
Part of my GSoC 2014.
|
|
Instead of pre-calculation and storage, we now calculate the face normal during render.
This gives a small slowdown (~1%) but decreases memory usage, which is especially important for GPUs,
where you have limited VRAM.
Part of my GSoC 2014.
|
|
This makes bmw scene in msvc 12 builds 6% faster.
It also gives a minor speedup for SSE hair in all compilers.
|
|
|
|
|
|
|
|
|
|
|
|
|