Welcome to mirror list, hosted at ThFree Co, Russian Federation.

git.blender.org/blender.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorLukas Stockner <lukas.stockner@freenet.de>2022-10-16 02:57:44 +0300
committerLukas Stockner <lukas.stockner@freenet.de>2022-10-16 03:34:10 +0300
commit0c50f9c4aaf90eb26240c31fbeff12af519a8ab0 (patch)
tree3d1bc724eac9f013e1252953fa1905b9d26acf9d /intern/cycles/util/ssef.h
parentb898330c3744669a8c3805f1560130821422cca1 (diff)
Fix T98672: Noise texture shows incorrect behaviour for large scales
This was a floating point precision issue - or, to be more precise, an issue with how Cycles split floats into the integer and fractional parts for Perlin noise. For coordinates below -2^24, the integer could be wrong, leading to the fractional part being outside of 0-1 range, which breaks all sorts of other things. 2^24 sounds like a lot, but due to how the detail octaves work, it's not that hard to reach when combined with a large scale. Since this code is originally based on OSL, I checked if they changed it in the meantime, and sure enough, there's a fix for it: https://github.com/OpenImageIO/oiio/commit/5c9dc68391e9 So, this basically just ports over that change to Cycles. The original code mentions being faster, but as pointed out in the linked commit, the performance impact is actually irrelevant. I also checked in a simple scene with eight Noise textures at detail 15 (with >90% of render time being spent on the noise), and the render time went from 13.06sec to 13.05sec. So, yeah, no issue.
Diffstat (limited to 'intern/cycles/util/ssef.h')
-rw-r--r--intern/cycles/util/ssef.h23
1 files changed, 11 insertions, 12 deletions
diff --git a/intern/cycles/util/ssef.h b/intern/cycles/util/ssef.h
index a2fff94303e..7f9e332a997 100644
--- a/intern/cycles/util/ssef.h
+++ b/intern/cycles/util/ssef.h
@@ -5,6 +5,8 @@
#ifndef __UTIL_SSEF_H__
#define __UTIL_SSEF_H__
+#include <math.h>
+
#include "util/ssei.h"
CCL_NAMESPACE_BEGIN
@@ -534,6 +536,12 @@ __forceinline const ssef ceil(const ssef &a)
return _mm_round_ps(a, _MM_FROUND_TO_POS_INF);
# endif
}
+# else
+/* Non-SSE4.1 fallback, needed for floorfrac. */
+__forceinline const ssef floor(const ssef &a)
+{
+ return _mm_set_ps(floorf(a.f[3]), floorf(a.f[2]), floorf(a.f[1]), floorf(a.f[0]));
+}
# endif
__forceinline ssei truncatei(const ssef &a)
@@ -541,20 +549,11 @@ __forceinline ssei truncatei(const ssef &a)
return _mm_cvttps_epi32(a.m128);
}
-/* This is about 25% faster than straightforward floor to integer conversion
- * due to better pipelining.
- *
- * Unsaturated add 0xffffffff (a < 0) is the same as subtract -1.
- */
-__forceinline ssei floori(const ssef &a)
-{
- return truncatei(a) + cast((a < 0.0f).m128);
-}
-
__forceinline ssef floorfrac(const ssef &x, ssei *i)
{
- *i = floori(x);
- return x - ssef(*i);
+ ssef f = floor(x);
+ *i = truncatei(f);
+ return x - f;
}
////////////////////////////////////////////////////////////////////////////////