Age | Commit message | Author |
|
|
|
(reported on the bf-python mailing list and on my GitHub (!); let's hope that
in the future we get more reports on developer.blender.org instead ;))
|
|
This is rather harmless in master; it could only cause some issues with patches.
|
|
|
|
Replace the old code for area lamps, which was mostly incorrect, with a more
correct implementation that uses the following paper as a reference:
Carlos Urena et al.
An Area-Preserving Parametrization for Spherical Rectangles.
https://www.solidangle.com/research/egsr2013_spherical_rectangle.pdf
The implementation is straight from the paper. Currently the rectangle constants
are calculated for each of the samples; ideally we need to pre-calculate them.
Some comparison images are available here:
http://wiki.blender.org/index.php/Dev:Ref/Release_Notes/2.73/Cycles
Reviewers: brecht, juicyfruit
Subscribers: dingto, ton
Differential Revision: https://developer.blender.org/D823
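
The split between per-rectangle constants and per-sample work mentioned above
can be sketched roughly as follows. This is a minimal illustration of the
sampling scheme from the referenced paper, not the code from this commit: the
names Vec3, SphQuad, sph_quad_init() and sph_quad_sample() are assumptions made
for the example, and double precision is used for clarity.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };
static Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 operator*(Vec3 a, double s) { return {a.x * s, a.y * s, a.z * s}; }
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b) {
  return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
static Vec3 normalize(Vec3 a) { return a * (1.0 / std::sqrt(dot(a, a))); }
static double clampd(double v, double lo, double hi) { return std::max(lo, std::min(v, hi)); }
static double safe_acos(double c) { return std::acos(clampd(c, -1.0, 1.0)); }

/* Constants that depend only on the rectangle and the shading point `o`
 * (hypothetical struct, named for this sketch only). */
struct SphQuad {
  Vec3 o, x, y, z;          /* shading point and local reference frame */
  double z0, x0, y0, x1, y1; /* rectangle extents in the local frame */
  double b0, b1, k, S;       /* sampling constants; S is the solid angle */
};

/* `corner` is one rectangle corner, `ex` and `ey` are its edge vectors. */
SphQuad sph_quad_init(Vec3 corner, Vec3 ex, Vec3 ey, Vec3 o) {
  SphQuad q;
  const double ex_len = std::sqrt(dot(ex, ex)), ey_len = std::sqrt(dot(ey, ey));
  assert(ex_len > 0.0 && ey_len > 0.0);
  q.o = o;
  q.x = ex * (1.0 / ex_len);
  q.y = ey * (1.0 / ey_len);
  q.z = cross(q.x, q.y);
  const Vec3 d = corner - o;
  q.z0 = dot(d, q.z);
  if (q.z0 > 0.0) { /* flip z so that it points away from the rectangle */
    q.z = q.z * -1.0;
    q.z0 = -q.z0;
  }
  q.x0 = dot(d, q.x);
  q.y0 = dot(d, q.y);
  q.x1 = q.x0 + ex_len;
  q.y1 = q.y0 + ey_len;
  /* Vectors to the four corners and normals of the four edge planes. */
  const Vec3 v00{q.x0, q.y0, q.z0}, v01{q.x0, q.y1, q.z0};
  const Vec3 v10{q.x1, q.y0, q.z0}, v11{q.x1, q.y1, q.z0};
  const Vec3 n0 = normalize(cross(v00, v10)), n1 = normalize(cross(v10, v11));
  const Vec3 n2 = normalize(cross(v11, v01)), n3 = normalize(cross(v01, v00));
  /* Internal angles of the spherical rectangle. */
  const double g0 = safe_acos(-dot(n0, n1)), g1 = safe_acos(-dot(n1, n2));
  const double g2 = safe_acos(-dot(n2, n3)), g3 = safe_acos(-dot(n3, n0));
  q.b0 = n0.z;
  q.b1 = n2.z;
  q.k = 2.0 * std::acos(-1.0) - g2 - g3;
  q.S = g0 + g1 - q.k;
  return q;
}

/* Map (u, v) in [0, 1)^2 to a point on the rectangle; pdf is 1 / q.S. */
Vec3 sph_quad_sample(const SphQuad &q, double u, double v) {
  const double au = u * q.S + q.k;
  const double fu = (std::cos(au) * q.b0 - q.b1) / std::sin(au);
  double cu = (fu > 0.0 ? 1.0 : -1.0) / std::sqrt(fu * fu + q.b0 * q.b0);
  cu = clampd(cu, -1.0, 1.0);
  double xu = -(cu * q.z0) / std::sqrt(1.0 - cu * cu);
  xu = clampd(xu, q.x0, q.x1);
  const double d = std::sqrt(xu * xu + q.z0 * q.z0);
  const double h0 = q.y0 / std::sqrt(d * d + q.y0 * q.y0);
  const double h1 = q.y1 / std::sqrt(d * d + q.y1 * q.y1);
  const double hv = h0 + v * (h1 - h0), hv2 = hv * hv;
  const double yv = (hv2 < 1.0 - 1e-6) ? (hv * d) / std::sqrt(1.0 - hv2) : q.y1;
  return q.o + q.x * xu + q.y * yv + q.z * q.z0;
}
```

With this layout sph_quad_init() would run once per shading point and light,
and only sph_quad_sample() would run per sample. For a 2x2 square one unit away
from the shading point, q.S should come out as 2*pi/3, one sixth of the sphere.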
|
|
This is a so-called GPU limitation boundary hit: the compiler is now told NOT to
inline the volume bound function, otherwise some really weird things used to happen.
We might actually want to do the same for CPU; inlining everything is not
the way to get the fastest code.
|
|
Explicitly disable SSE kernels in Cycles when this option is used.
|
|
This should hopefully fix https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=765187
|
|
|
|
The new IDE did not carry over my former setting for preferring tabs
|
|
This basically fixes mixed usage of size_t and uintptr_t, which might differ in size.
|
|
Quite annoying; it's the same thing we do on the Blender side. On the positive
side, we can get rid of some utf8/utf16 conversions.
Hopefully it all works fine now; at least it works on my Russian Windows laptop.
|
|
|
|
respondsToSelector to simplify the whole detection
|
|
to C89 style to prevent jumping over constructors
|
|
|
|
Fix T42174.
|
|
comments and also in OSL.
|
|
|
|
|
|
|
|
implicit double to float conversion).
|
|
Also reduce the number of branches and multiplications a bit by inlining the branches.
This gives a barely measurable speedup, which in the case of the BMW scene is about 2% here.
|
|
make sure the window states are correct in the lion_fs animation phase.
This also ensures that CTX_wm_window(C) is okay.
|
|
This gives more precise information about memory usage, which might be really
handy when doing memory optimization.
It works well here as far as I can tell, but if for some reason you
experience some weird slowdown, please let me know.
|
|
Not that it noticeably changes render time, but it's just weird to
convert float4 to float3 just to access individual x/y/z components.
Plus some compilers might be less smart than GCC and not optimize this
out well.
|
|
We queried the wrong value when looking for the bound 2D texture. This
is not totally robust because the currently bound texture may not be a 2D
one, but it should work for now.
|
|
|
|
|
|
|
|
include the parens around the value before the cast;
in some cases this was causing double/float promotion because only the left operand was cast.
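
The promotion issue described above can be reproduced with a pair of
hypothetical macros. TO_FLOAT_BAD and TO_FLOAT_GOOD are invented for this
sketch and are not the macros touched by the commit; they only show why the
parentheses around the macro argument matter.

```cpp
#include <cassert>
#include <type_traits>

/* Hypothetical macros, for illustration only. */
#define TO_FLOAT_BAD(v) (float)v       /* cast binds to the leftmost operand only */
#define TO_FLOAT_GOOD(v) ((float)(v))  /* cast applies to the whole expression */

/* TO_FLOAT_BAD(1 + 0.5) expands to (float)1 + 0.5, i.e. float + double,
 * so the result is silently promoted back to double. */
constexpr bool bad_promotes_to_double =
    std::is_same<decltype(TO_FLOAT_BAD(1 + 0.5)), double>::value;

/* TO_FLOAT_GOOD(1 + 0.5) expands to ((float)(1 + 0.5)) and stays float. */
constexpr bool good_stays_float =
    std::is_same<decltype(TO_FLOAT_GOOD(1 + 0.5)), float>::value;

static_assert(bad_promotes_to_double, "unparenthesized cast promotes to double");
static_assert(good_stays_float, "parenthesized cast yields float");
```

Both static_asserts hold at compile time, so the difference is visible without
running anything.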
|
|
https://developer.blender.org/D643
Separates graphics context creation from window code in Ghost so that they can vary separately.
|
|
corner and blend is 0
After discussion with cambo here we decided it's better to choose an arbitrary side
of the box (in this case the X axis) and use the image from it. That's better than
rendering black.
P.S. This is literally a corner case anyway.
|
|
|
|
The ray should actually have infinite length, so we can detect a camera inside a
volume which is bigger than the far clipping distance of the camera.
This might also give some speedup (wouldn't expect much though) because we don't
need to re-calculate the ray direction and length after every bounce now.
|
|
basically we skip all non-volume objects now in the volume stack function.
Depending on the shot it might give a few percent of speedup.
Most of the speedup would be gained in scenes where an SSS object
intersects the volume and takes up a reasonable amount of frame space.
|
|
|
|
The single-precision exponent function on 64-bit Linux tends to be an order of
magnitude slower than the double-precision version, even with the
single<->double precision conversion.
Some feedback on the mailing lists also suggests that logf() is slow too, but
I haven't confirmed this here in the studio yet.
Depending on the shader setup it gives a ~3% speedup with the secret agent shot
and up to around 15% with the BMW scene here.
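
One way to route a single-precision exponent through the double-precision
routine looks like the sketch below. The helper name expf_via_double() is
hypothetical and not necessarily what this commit uses, and whether it is
actually faster depends on the libm in use, so it should be measured on the
target platform.

```cpp
#include <cassert>
#include <cmath>

/* Hypothetical helper: evaluate the exponent in double precision and round
 * the result back to float, paying for two conversions. */
static inline float expf_via_double(float x) {
  return static_cast<float>(std::exp(static_cast<double>(x)));
}
```

The result is rounded from a more precise intermediate value, so it should
agree with expf() to within float rounding.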
|
|
This is good practice anyway, plus it'll help with the upcoming change.
|
|
Signed-off-by: Thomas Dinges
|
|
we needed this.
|
|
|
|
|
|
|
|
Quite a straightforward change; the only annoying thing is that we can't use
indentation for include directives, just because of the way header inlining
works for OpenCL.
path_source_replace_includes() might do a smarter job, but I don't want to
spend time on this yet.
|
|
|
|
* sm_52 can run a sm_50 kernel, so tell runtime detection to use that until we build a dedicated sm_52 kernel.
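
The fallback described above could be sketched like this. The function
kernel_arch_for_device() is a hypothetical name invented for this example and
is not the actual runtime detection code in Cycles.

```cpp
#include <cassert>
#include <string>

/* Hypothetical helper: choose which prebuilt kernel architecture to load
 * for a device with the given compute capability. */
std::string kernel_arch_for_device(int major, int minor) {
  /* No dedicated sm_52 kernel is built yet, so reuse the sm_50 one. */
  if (major == 5 && minor == 2) {
    minor = 0;
  }
  return "sm_" + std::to_string(major) + std::to_string(minor);
}
```

Under this sketch a device reporting compute capability 5.2 resolves to
"sm_50", while other devices keep their own architecture string.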
|
|
* On sm_30 and above there is no change (it was already not inlined before); this just fixes a speed regression from yesterday (6359c36ba407).
* On sm_2x (tested with sm_21), I get a nice 8% speedup in the BMW scene with this. As a bonus, cubin compilation time and memory usage are significantly reduced. The regular cubin size went from 2.5MB to 2.0MB, the Experimental one from 3.8MB to 2.5MB.
|
|
|
|
Currently only the summed number of traversal steps and intersections used by the
camera ray intersection pass is implemented, but in the future we will support
more debug passes which would help check what makes a scene slow.
Examples of such extra passes could be the number of bounces, time spent on
shader tree evaluation, and so on.
The implementation on the Cycles side is pretty straightforward; the only thing
worth mentioning is that it's a build-time option disabled by default.
On the Blender side it's implemented as a PASS_DEBUG with several possible
subtypes. This way we don't need to create an extra DNA pass type for each of
the debug passes, saving us some bits.
Reviewers: campbellbarton
Reviewed By: campbellbarton
Differential Revision: https://developer.blender.org/D813
|