author | Jacques Lucke <jacques@blender.org> | 2022-04-26 18:12:34 +0300 |
---|---|---|
committer | Jacques Lucke <jacques@blender.org> | 2022-04-26 18:12:34 +0300 |
commit | ae94e36cfb2f3bc9a99b638782092d9c71d4b3c7 (patch) | |
tree | dc54dc643a2c498af1d3de97b471115607a8d3b4 /source/blender/blenlib/BLI_parameter_pack_utils.hh | |
parent | 9a53599180041cf9501e2ac6150c9f900a3a3fc0 (diff) |
Geometry Nodes: refactor array devirtualization
Goals:
* Better high level control over where devirtualization occurs. There is always
a trade-off between performance and compile-time/binary-size.
* Simplify using array devirtualization.
* Better performance for cases where devirtualization wasn't used before.
Many geometry nodes accept fields as inputs. Internally, that means that the
execution functions have to accept so-called "virtual arrays" as inputs. Those
can be e.g. actual arrays, just single values, or lazily computed arrays.
Due to these different possible virtual array implementations, access to
individual elements is slower than it would be if everything was just a normal
array (access goes through a virtual function call). For more complex execution
functions, this overhead does not matter, but for small functions (like a simple
addition) it very much does. The virtual function call also prevents the compiler
from doing some optimizations (e.g. loop unrolling and inserting SIMD instructions).
The solution is to "devirtualize" the virtual arrays for small functions where the
overhead is measurable. Essentially, the function is generated many times with
different array types as input. Then there is a run-time dispatch that calls the
best implementation. We have been doing devirtualization in e.g. math nodes
for a long time already. This patch just generalizes the concept and makes it
easier to control. It also makes it easier to investigate the different trade-offs
when it comes to devirtualization.
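The idea described above can be illustrated with a minimal sketch. It is not the code from this patch: `VirtualArray`, `SpanArray`, `SingleValueArray`, `add_impl` and `add_arrays` are hypothetical names made up for illustration, and Blender's real virtual array types differ. The sketch assumes both inputs have the same size. The inner loop is instantiated once per combination of concrete array types, and a run-time dispatch picks the best instantiation.

```cpp
#include <cstddef>
#include <vector>

/* Hypothetical, simplified stand-in for a virtual array (not Blender's API). */
struct VirtualArray {
  virtual ~VirtualArray() = default;
  virtual float get(std::size_t index) const = 0;
  virtual std::size_t size() const = 0;
};

/* An actual array. `final` lets the compiler resolve `get` statically. */
struct SpanArray final : VirtualArray {
  const float *data;
  std::size_t count;
  SpanArray(const float *data, std::size_t count) : data(data), count(count) {}
  float get(std::size_t index) const override { return data[index]; }
  std::size_t size() const override { return count; }
};

/* A single value that is repeated for every index. */
struct SingleValueArray final : VirtualArray {
  float value;
  std::size_t count;
  SingleValueArray(float value, std::size_t count) : value(value), count(count) {}
  float get(std::size_t /*index*/) const override { return value; }
  std::size_t size() const override { return count; }
};

/* The small function. It is generated once per pair of array types. */
template<typename ArrayA, typename ArrayB>
void add_impl(const ArrayA &a, const ArrayB &b, float *out, std::size_t count)
{
  for (std::size_t i = 0; i < count; i++) {
    out[i] = a.get(i) + b.get(i);
  }
}

/* Run-time dispatch. Returns true if a devirtualized instantiation was used.
 * Assumes that `a` and `b` have the same size. */
bool add_arrays(const VirtualArray &a, const VirtualArray &b, std::vector<float> &out)
{
  const std::size_t count = a.size();
  out.resize(count);
  const auto *a_span = dynamic_cast<const SpanArray *>(&a);
  const auto *b_span = dynamic_cast<const SpanArray *>(&b);
  const auto *b_single = dynamic_cast<const SingleValueArray *>(&b);
  if (a_span && b_span) {
    add_impl(*a_span, *b_span, out.data(), count);
    return true;
  }
  if (a_span && b_single) {
    add_impl(*a_span, *b_single, out.data(), count);
    return true;
  }
  /* Fallback: every element access is a virtual function call. */
  add_impl(a, b, out.data(), count);
  return false;
}
```

Every additional combination handled in the dispatch adds another instantiation of the loop, which is where the trade-off between performance and compile-time/binary-size comes from.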
Nodes that were already optimized using devirtualization didn't get a speedup.
However, a couple of nodes that didn't use devirtualization before use it now.
Those got a 2-4x speedup in common cases:
* Map Range
* Random Value
* Switch
* Combine XYZ
Differential Revision: https://developer.blender.org/D14628
Diffstat (limited to 'source/blender/blenlib/BLI_parameter_pack_utils.hh')
-rw-r--r-- | source/blender/blenlib/BLI_parameter_pack_utils.hh | 122 |
1 files changed, 122 insertions, 0 deletions
diff --git a/source/blender/blenlib/BLI_parameter_pack_utils.hh b/source/blender/blenlib/BLI_parameter_pack_utils.hh
new file mode 100644
index 00000000000..d1ef7bcbc65
--- /dev/null
+++ b/source/blender/blenlib/BLI_parameter_pack_utils.hh
@@ -0,0 +1,122 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+#pragma once
+
+/** \file
+ * \ingroup bli
+ *
+ * C++ has a feature called "parameter packs" which allow building variadic templates.
+ * This file has some utilities to work with such parameter packs.
+ */
+
+#include <tuple>
+#include <type_traits>
+
+#include "BLI_utildefines.h"
+
+namespace blender {
+
+/**
+ * A type that encodes a specific value.
+ */
+template<typename T, T Element> struct TypeForValue {
+  static constexpr T value = Element;
+};
+
+/**
+ * A type that encodes a list of values of the same type.
+ * This is similar to #std::integer_sequence, but a bit more general. It's main purpose it to also
+ * support enums instead of just ints.
+ */
+template<typename T, T... Elements> struct ValueSequence {
+  /**
+   * Get the number of elements in the sequence.
+   */
+  static constexpr size_t size() noexcept
+  {
+    return sizeof...(Elements);
+  }
+
+  /**
+   * Get the element at a specific index.
+   */
+  template<size_t I> static constexpr T at_index()
+  {
+    static_assert(I < sizeof...(Elements));
+    return std::tuple_element_t<I, std::tuple<TypeForValue<T, Elements>...>>::value;
+  }
+
+  /**
+   * Return true if the element is in the sequence.
+   */
+  template<T Element> static constexpr bool contains()
+  {
+    return ((Element == Elements) || ...);
+  }
+};
+
+/**
+ * A type that encodes a list of types.
+ * #std::tuple can also encode a list of types, but has a much more complex implementation.
+ */
+template<typename... T> struct TypeSequence {
+  /**
+   * Get the number of types in the sequence.
+   */
+  static constexpr size_t size() noexcept
+  {
+    return sizeof...(T);
+  }
+
+  /**
+   * Get the type at a specific index.
+   */
+  template<size_t I> using at_index = std::tuple_element_t<I, std::tuple<T...>>;
+};
+
+namespace detail {
+
+template<typename T, T Value, size_t... I>
+inline ValueSequence<T, ((I == 0) ? Value : Value)...> make_value_sequence_impl(
+    std::index_sequence<I...> /* indices */)
+{
+  return {};
+}
+
+template<typename T, T Value1, T Value2, size_t... Value1Indices, size_t... I>
+inline ValueSequence<T,
+                     (ValueSequence<size_t, Value1Indices...>::template contains<I>() ? Value1 :
+                                                                                        Value2)...>
+    make_two_value_sequence_impl(ValueSequence<size_t, Value1Indices...> /* value1_indices */,
+                                 std::index_sequence<I...> /* indices */)
+{
+  return {};
+};
+
+} // namespace detail
+
+/**
+ * Utility to create a #ValueSequence that has the same value at every index.
+ */
+template<typename T, T Value, size_t Size>
+using make_value_sequence = decltype(detail::make_value_sequence_impl<T, Value>(
+    std::make_index_sequence<Size>()));
+
+/**
+ * Utility to create a #ValueSequence that contains two different values. The indices of where the
+ * first value should be used are passed in.
+ */
+template<typename T, T Value1, T Value2, size_t Size, size_t... Value1Indices>
+using make_two_value_sequence = decltype(detail::make_two_value_sequence_impl<T, Value1, Value2>(
+    ValueSequence<size_t, Value1Indices...>(), std::make_index_sequence<Size>()));
+
+namespace parameter_pack_utils_static_tests {
+enum class MyEnum { A, B };
+static_assert(std::is_same_v<make_value_sequence<MyEnum, MyEnum::A, 3>,
+                             ValueSequence<MyEnum, MyEnum::A, MyEnum::A, MyEnum::A>>);
+static_assert(
+    std::is_same_v<make_two_value_sequence<MyEnum, MyEnum::A, MyEnum::B, 5, 1, 2>,
+                   ValueSequence<MyEnum, MyEnum::B, MyEnum::A, MyEnum::A, MyEnum::B, MyEnum::B>>);
+} // namespace parameter_pack_utils_static_tests
+
+} // namespace blender
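As a usage sketch, the `ValueSequence` members added in this file could be exercised as shown below. The definitions are condensed copies of `TypeForValue` and `ValueSequence` from the diff, placed in a hypothetical `sketch` namespace so that the example compiles on its own with C++17. The `Mode` enum and the `Modes` alias are assumptions made up for illustration and are not part of the patch.

```cpp
#include <cstddef>
#include <tuple>

namespace sketch {

/* Condensed copy of #TypeForValue from the diff. */
template<typename T, T Element> struct TypeForValue {
  static constexpr T value = Element;
};

/* Condensed copy of #ValueSequence from the diff. */
template<typename T, T... Elements> struct ValueSequence {
  static constexpr std::size_t size() noexcept
  {
    return sizeof...(Elements);
  }

  template<std::size_t I> static constexpr T at_index()
  {
    static_assert(I < sizeof...(Elements));
    return std::tuple_element_t<I, std::tuple<TypeForValue<T, Elements>...>>::value;
  }

  template<T Element> static constexpr bool contains()
  {
    /* Fold expression over all elements; false for an empty sequence. */
    return ((Element == Elements) || ...);
  }
};

/* Hypothetical enum, only used for this example. */
enum class Mode { Span, Single, Lazy };

/* A sequence of enum values, which #std::integer_sequence is not meant for. */
using Modes = ValueSequence<Mode, Mode::Span, Mode::Single, Mode::Span>;

/* Everything is evaluated at compile time. */
static_assert(Modes::size() == 3);
static_assert(Modes::at_index<1>() == Mode::Single);
static_assert(Modes::contains<Mode::Span>());
static_assert(!Modes::contains<Mode::Lazy>());

}  // namespace sketch
```

Because all members are `constexpr`, such a sequence can be inspected entirely at compile time, for example inside `static_assert` or `if constexpr`.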