diff options
author | Michael Jones <michael_p_jones@apple.com> | 2022-07-12 16:32:46 +0300 |
---|---|---|
committer | Brecht Van Lommel <brecht@blender.org> | 2022-07-15 14:40:04 +0300 |
commit | da4ef05e4dfb700a61910e6d8e02183d7c272963 (patch) | |
tree | 778f5655aaad65e36884e7a9c320ac29fbe3cc39 /intern/cycles/util | |
parent | 5653c5fcdd9f424dc05ddf73b18ba8294daf4788 (diff) |
Cycles: Apple Silicon optimization to specialize intersection kernels
The Metal backend now compiles and caches a second set of kernels which are
optimized for scene contents, enabled for Apple Silicon.
The implementation supports doing this both for intersection and shading
kernels. However this is currently only enabled for intersection kernels that
are quick to compile, and already give a good speedup. Enabling this for
shading kernels would be faster still, however this also causes a long wait
times and would need a good user interface to control this.
M1 Max samples per minute (macOS 13.0):
PSO_GENERIC PSO_SPECIALIZED_INTERSECT PSO_SPECIALIZED_SHADE
barbershop_interior 83.4 89.5 93.7
bmw27 1486.1 1671.0 1825.8
classroom 175.2 196.8 206.3
fishy_cat 674.2 704.3 719.3
junkshop 205.4 212.0 257.7
koro 310.1 336.1 342.8
monster 376.7 418.6 424.1
pabellon 273.5 325.4 339.8
sponza 830.6 929.6 1142.4
victor 86.7 96.4 96.3
wdas_cloud 111.8 112.7 183.1
Code contributed by Jason Fielder, Morteza Mostajabodaveh and Michael Jones
Differential Revision: https://developer.blender.org/D14645
Diffstat (limited to 'intern/cycles/util')
-rw-r--r-- | intern/cycles/util/string.cpp | 18 | ||||
-rw-r--r-- | intern/cycles/util/string.h | 2 |
2 files changed, 20 insertions, 0 deletions
diff --git a/intern/cycles/util/string.cpp b/intern/cycles/util/string.cpp index 66ff866ee10..0c318cea44a 100644 --- a/intern/cycles/util/string.cpp +++ b/intern/cycles/util/string.cpp @@ -136,6 +136,19 @@ void string_replace(string &haystack, const string &needle, const string &other) } } +void string_replace_same_length(string &haystack, const string &needle, const string &other) +{ + assert(needle.size() == other.size()); + size_t pos = 0; + while (pos != string::npos) { + pos = haystack.find(needle, pos); + if (pos != string::npos) { + memcpy(haystack.data() + pos, other.data(), other.size()); + pos += other.size(); + } + } +} + string string_remove_trademark(const string &s) { string result = s; @@ -164,6 +177,11 @@ string to_string(const char *str) return string(str); } +string to_string(const float4 &v) +{ + return string_printf("%f,%f,%f,%f", v.x, v.y, v.z, v.w); +} + string string_to_lower(const string &s) { string r = s; diff --git a/intern/cycles/util/string.h b/intern/cycles/util/string.h index a74feee1750..ecbe9e106c6 100644 --- a/intern/cycles/util/string.h +++ b/intern/cycles/util/string.h @@ -38,12 +38,14 @@ void string_split(vector<string> &tokens, const string &separators = "\t ", bool skip_empty_tokens = true); void string_replace(string &haystack, const string &needle, const string &other); +void string_replace_same_length(string &haystack, const string &needle, const string &other); bool string_startswith(string_view s, string_view start); bool string_endswith(string_view s, string_view end); string string_strip(const string &s); string string_remove_trademark(const string &s); string string_from_bool(const bool var); string to_string(const char *str); +string to_string(const float4 &v); string string_to_lower(const string &s); /* Wide char strings are only used on Windows to deal with non-ASCII |