github.com/google/ruy.git
Age | Commit message | Author
2021-11-02 | Update TFLite kernel to use Ruy 16x8 Gemm instead of reference kernel. [test_406772541] | Dayeong Lee
PiperOrigin-RevId: 406772541
2021-11-02 | Ruy: Support 8x16 avx512 kernel | Dayeong Lee
PiperOrigin-RevId: 407005437
2021-11-01 | Ruy: Support 8x16 avx2_fma kernel | Dayeong Lee
PiperOrigin-RevId: 406766575
2021-10-27 | fix inheritance of kernels on x86. When an AVX2 kernel is not available, fall back on AVX, not StandardCpp | Benoit Jacob
PiperOrigin-RevId: 405900310
2021-10-21 | test i8xi16 cases | Benoit Jacob
PiperOrigin-RevId: 404698692
2021-10-21 | Disable the internal test-only variants of the StandardCpp path in benchmarks | Benoit Jacob
PiperOrigin-RevId: 404697829
2021-09-13 | Add missing volatile qualifier in Pack8bitRowMajorForNeonDotprod | Keichi Takahashi
I was getting incorrect results on some environments and this turned out to be the cause.
Closes https://github.com/google/ruy/pull/276
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/ruy/pull/276 from keichi:add-missing-volatile e2d89fe29ce36510a08c704b603f513729713faf
PiperOrigin-RevId: 396351130
2021-09-10 | Fix error when compiling ruy_test_overflow_dst_zero_point with GCC | Keichi Takahashi
This fixes the following compilation error:
```
In file included from /usr/include/c++/10/vector:67,
                 from /home/keichi/Projects/ruy/ruy/test_overflow_dst_zero_point.cc:32:
/usr/include/c++/10/bits/stl_vector.h: In instantiation of ‘class std::vector<const signed char>’:
/home/keichi/Projects/ruy/ruy/test_overflow_dst_zero_point.cc:75:24:   required from here
/usr/include/c++/10/bits/stl_vector.h:401:66: error: static assertion failed: std::vector must have a non-const, non-volatile value_type
  401 |       static_assert(is_same<typename remove_cv<_Tp>::type, _Tp>::value,
      |                                                                  ^~~~~
```
Closes https://github.com/google/ruy/pull/278
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/ruy/pull/278 from keichi:fix-compilation-err 2e30471e9ce525f3a62337078cf2e80f17c966ff
PiperOrigin-RevId: 395965795
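The error above comes from instantiating std::vector with a const element type, which libstdc++ rejects. The sketch below shows one common way to get read-only semantics without a const value_type; it is an illustration only, not necessarily the change made in the pull request, and the names MakeTestData and SumReadOnly are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// std::vector<const std::int8_t> is rejected by libstdc++: the element type
// must be non-const and non-volatile. Keep the elements mutable inside the
// container and enforce read-only access through a const reference instead.
// MakeTestData and SumReadOnly are hypothetical names used for illustration.
std::vector<std::int8_t> MakeTestData(int size, std::int8_t value) {
  return std::vector<std::int8_t>(static_cast<std::size_t>(size), value);
}

int SumReadOnly(const std::vector<std::int8_t>& data) {
  int sum = 0;
  for (std::int8_t v : data) {
    sum += v;  // elements cannot be modified through the const reference
  }
  return sum;
}
```

Declaring the variable itself as `const std::vector<std::int8_t>` gives the same protection at the call site.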
2021-06-22 | Fix typo in Windows on ARM 32bit | metarutaiga
The build failed on ARM 32-bit; I think it's just a typo.
Closes https://github.com/google/ruy/pull/274
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/ruy/pull/274 from metarutaiga:master f40e88396c1031289dbda5a0c98893557509542e
PiperOrigin-RevId: 380820893
2021-06-18 | Fix the bazel build by dropping a xtensa-specific select entry. | Benoit Jacob
PiperOrigin-RevId: 380178286
2021-06-16 | Fix integer overflow causing incorrect results. | Benoit Jacob
Kernels perform the addition of the destination zero_point in int16. This addition needed to be saturating to avoid wrapping around. Thanks to Marat Dukhan for reporting and debugging this issue.
Additionally, this commit:
- makes the new Cortex-X1 tuned kernels tested.
- adds Context::get_runtime_enabled_paths() to query the runtime CPU detection from the public Context interface.
- updates the Bazel-to-CMake converter to support some minor recent BUILD changes.
PiperOrigin-RevId: 379778779
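The difference between a wrapping and a saturating int16 addition, as described in the commit message above, can be sketched in scalar C++. This is only an illustration of the arithmetic; the function names are hypothetical and the actual kernels use SIMD instructions rather than this code.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Wrapping addition: the sum is reduced modulo 2^16. The final conversion to
// int16 is modular on mainstream compilers (and guaranteed since C++20).
std::int16_t WrappingAddInt16(std::int16_t a, std::int16_t b) {
  const std::uint16_t sum = static_cast<std::uint16_t>(
      static_cast<std::uint16_t>(a) + static_cast<std::uint16_t>(b));
  return static_cast<std::int16_t>(sum);
}

// Saturating addition: compute in int32, then clamp to the int16 range so an
// out-of-range sum sticks at the nearest representable value.
std::int16_t SaturatingAddInt16(std::int16_t a, std::int16_t b) {
  const std::int32_t sum = static_cast<std::int32_t>(a) + b;
  const std::int32_t lo = std::numeric_limits<std::int16_t>::min();
  const std::int32_t hi = std::numeric_limits<std::int16_t>::max();
  return static_cast<std::int16_t>(std::min(std::max(sum, lo), hi));
}
```

With an accumulator of 32000 and a zero_point of 1000, the wrapping version yields -32536, while the saturating version yields 32767.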
2021-05-17 | remove pthread requirement for cc_target_os:xtensa | Ruy Contributors
PiperOrigin-RevId: 374225894
2021-05-11 | Fork Neon Float kernel for X1 | T.J. Alumbaugh
PiperOrigin-RevId: 373147434
2021-05-06 | IWYU: include limits for std::numeric_limits (#253) | stha09
2021-05-01 | Remove runtime assertion on size of shift in reference code | T.J. Alumbaugh
PiperOrigin-RevId: 371435881
2021-04-26 | Remove non-ASCII character in comment | T.J. Alumbaugh
PiperOrigin-RevId: 370495857
2021-04-22 | 1.02x speedup of Ruy AVX2 f32 and AVX-512 f32/i8 | Ruy Contributors
AVX-512:
- broadcast without extra instruction (code size)
- use native mask ops
- re-roll mmm loop
AVX2: avoid slow permute, especially for AMD
PiperOrigin-RevId: 369907385
2021-04-20 | Fork 8bit Neon Dotprod kernel for X1 and support resolving to X1 core | T.J. Alumbaugh
PiperOrigin-RevId: 369496892
2021-04-06 | Create a utility library to suppress floating-point denormals, and apply it to every task execution of every thread. | Chao Mei
PiperOrigin-RevId: 366919663
2021-03-10 | Simplify some code and add release assertions to help debug a crash in an application. | Benoit Jacob
PiperOrigin-RevId: 361953871
2021-03-10 | rollback hopefully fixing some application crash | Benoit Jacob
PiperOrigin-RevId: 361951187
2021-03-02 | Use std::ptrdiff_t instead of int when calculating memory size to avoid int overflow. | Chao Mei
PiperOrigin-RevId: 360298662
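As an illustration of the overflow this commit guards against, the sketch below computes a buffer size by widening to std::ptrdiff_t before multiplying. The helper MatrixSizeBytes is hypothetical, not a ruy function, and the example assumes a target where std::ptrdiff_t is 64 bits wide.

```cpp
#include <cstddef>

// Hypothetical helper: byte size of a rows x cols matrix. Converting the
// first operand to std::ptrdiff_t makes the whole product use the wider
// type. With plain int arithmetic, rows * cols could exceed INT_MAX, which
// is undefined behavior for signed integers.
// Assumes std::ptrdiff_t is 64 bits wide on the target.
std::ptrdiff_t MatrixSizeBytes(int rows, int cols, int element_size) {
  return static_cast<std::ptrdiff_t>(rows) * cols * element_size;
}
```

For example, a 50000 x 50000 matrix of 1-byte elements needs 2500000000 bytes, which does not fit in a 32-bit int.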
2021-02-09 | Simplify quantized multiplier | Georgios Pinitas
Alter sequence to a single rounded scaling with normal rounded shift. Double rounding and symmetric rounding are removed compared to reference. Double rounding seems unnecessary and can complicate implementations. Moreover, symmetric rounding also adds implementation complexity. For NEON the new sequence can be translated to VQDMULH + VRSHR.
Closes https://github.com/google/ruy/pull/227
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/ruy/pull/227 from GeorgeARM:mul_pr dec00bd87a8815fdad79d302494430aa63522752
PiperOrigin-RevId: 356539687
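The VQDMULH + VRSHR sequence mentioned above can be modelled in scalar C++ roughly as follows. This is a sketch of what the two instructions compute on 32-bit lanes, not ruy's reference implementation; all function names are hypothetical, and the right shifts assume arithmetic shifting of negative values (true on mainstream compilers, guaranteed since C++20).

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Scalar model of VQDMULH on 32-bit lanes: saturating doubling multiply,
// keeping the high 32 bits of the doubled product (truncating, no rounding).
std::int32_t SaturatingDoublingHighMul(std::int32_t a, std::int32_t b) {
  const std::int32_t kMin = std::numeric_limits<std::int32_t>::min();
  if (a == kMin && b == kMin) {
    // The only case where the doubled product overflows: saturate.
    return std::numeric_limits<std::int32_t>::max();
  }
  const std::int64_t product = static_cast<std::int64_t>(a) * b;
  return static_cast<std::int32_t>(product >> 31);  // same as (2 * product) >> 32
}

// Scalar model of VRSHR: right shift with round-to-nearest, ties rounded
// toward positive infinity.
std::int32_t RoundingShiftRight(std::int32_t x, int shift) {
  assert(shift >= 1 && shift <= 31);
  const std::int64_t rounded =
      static_cast<std::int64_t>(x) + (std::int64_t{1} << (shift - 1));
  return static_cast<std::int32_t>(rounded >> shift);
}

// Hypothetical wrapper: scale an int32 accumulator by a fixed-point
// multiplier (Q0.31, so 1 << 30 represents 0.5), then shift right.
std::int32_t ScaleAccumulator(std::int32_t acc, std::int32_t multiplier,
                              int right_shift) {
  return RoundingShiftRight(SaturatingDoublingHighMul(acc, multiplier),
                            right_shift);
}
```

For instance, scaling 1000 by a multiplier of `1 << 30` (0.5) and a right shift of 2 gives 500 after the multiply and 125 after the rounding shift.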
2021-02-09 | Update test tolerance ahead of merging PR #227 | bjacob
Closes https://github.com/google/ruy/pull/251
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/ruy/pull/251 from bjacob:relax c8d2cf94d15abd4a9fd4222619c42413952f0fb1
PiperOrigin-RevId: 356340585
2021-01-23 | Allow late definitions of cpuinfo but only when ruy is a subdir. (#250) | bjacob
2021-01-22 | Disable tests by default when ruy is a subproject. | bjacob
Closes https://github.com/google/ruy/pull/249
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/ruy/pull/249 from bjacob:tests-disabled-when-submodule 3a33bb081acadca3520edeae2c226827e9fe0f89
PiperOrigin-RevId: 353298619
2021-01-21 | Change the default MulParams multiplier values to multiply by 1, not 0. | bjacob
Multiplying by 0 by default is unfriendly to people getting familiar with ruy having to debug why their output values are all 0. With a default of 1, tiny toy examples might output sane values, anything beyond that will saturate, and seeing all saturated values will be a hint that something needs to be set to rescale values.
Closes https://github.com/google/ruy/pull/248
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/ruy/pull/248 from bjacob:multiplier-default 3fb1152e899fffc1f9fa9103b533348599ca494f
PiperOrigin-RevId: 353077204
2021-01-21 | Add basic gitignore (#246) | Geoffrey Martin-Noble
2021-01-21 | Simplify cpuinfo build overlay (#247) | Geoffrey Martin-Noble
2021-01-21 | Fixes for builds in open source projects with cpuinfo and googletest deps. | Benoit Jacob
- Following XNNPACK's example, in CMakeLists.txt, skip including our own third_party/ directories if the target is already defined. This means that IREE embedding ruy as a third_party/ dep does not need to have its submodules checked out, ruy can use IREE's own cpuinfo and googletest.
- Switch open-source builds to using the stripped-include-paths flavor of cpuinfo (like IREE is already using).
PiperOrigin-RevId: 352871140
2021-01-21 | Update depgraph | bjacob
- Switch to same colors as in ruy html traces
- Move `:thread_pool` to its own yellow color for consistency with ruy traces
- Drop `:validate`
- Drop the legend, will be redundant in the context of markdown docs showing these different materials in the same context.
preview: https://github.com/google/ruy/blob/84dd41f433b3befad6c711248a5d0f00fd8b2711/doc/depgraph.svg
Closes https://github.com/google/ruy/pull/241
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/ruy/pull/241 from bjacob:depgraph-update 8f2fa1d9a178c62b80fcc940c9d6ca5cf8ce3c41
PiperOrigin-RevId: 352858626
2021-01-20 | Revert "Revert "Add CMake support with a converter from Bazel"" | bjacob
Reverts google/ruy#243
Closes https://github.com/google/ruy/pull/244
PiperOrigin-RevId: 352711630
2021-01-20 | Corrected macro for detecting ppc platform (#83) | Nishidha
(Sorry I merged this the wrong way the first time)
PiperOrigin-RevId: 352705468
2021-01-20 | Add a tracing framework (really just logging). | Benoit Jacob
This isn't a performance tracing framework (unlike the old ruy tracing). This is about understanding what happens inside a ruy::Mul with a view toward documenting how ruy works.
Added a 'parametrized_example' to help play with this tracing on any flavor of ruy::Mul call. This also serves as a more elaborate example of how to call ruy::Mul, and as a single binary instantiating several different instantiations of the ruy::Mul template, which is useful for measuring binary size and showing a breakdown of ruy symbols in a document.
A few code changes beyond tracing slipped in:
- Improved logic in determining the traversal order in MakeBlockMap: In rectangular cases, since we first do the top-level rectangularness subdivision with linear traversal anyway, the traversal order only applies within each subdivision past that, so it should be based on sizes already divided by rectangularness. In practice this nudges 1000x400x2000 from kFractalHilbert to kFractalU on Pixel4, without making an observable perf difference in that case.
- Removed the old RUY_BLOCK_MAP_DEBUG logging code: superseded. Kept only a minimal hook to force a block_size_log2 choice.
- Wrote new comments on BlockMap internals.
- Fixed Ctx::set_runtime_enabled_paths to behave as documented: passing Path::kNone reverts to the default behavior (auto detect).
- Exposed Context::set_runtime_enabled_paths.
- Renamed UseSimpleLoop -> GetUseSimpleLoop (easier to read trace).
PiperOrigin-RevId: 352695092
2021-01-20 | Revert "Add CMake support with a converter from Bazel (#233)" (#243) | bjacob
This reverts commit b87d6d2e65ca24ba38e9afbf1e9d0744dbda82d3.
2021-01-20 | Add CMake support with a converter from Bazel (#233) | bjacob
* Add CMake support with a converter from Bazel, update by running: cmake/bazel_to_cmake.sh
This supports building and running tests also on Android, e.g.
```
cmake ../ruy -G Ninja \
  -DCMAKE_TOOLCHAIN_FILE=~/android-ndk-r21d/build/cmake/android.toolchain.cmake \
  -DANDROID_ABI=arm64-v8a \
  -DANDROID_PLATFORM=android-29
cmake --build . -j12
ctest . -j12
```
Some parts of this were forked from IREE's cmake setup.
2021-01-19 | Corrected macro for detecting ppc platform. (#83) | Nishidha
2021-01-19 | Move submodules to where they belong. (#240) | bjacob
2021-01-16 | Add git submodules: googletest and cpuinfo (#235) | bjacob
2021-01-16 | Bazel submodules (#236) | bjacob
* Add git submodules: googletest and cpuinfo
* Let the Bazel WORKSPACE point to the git submodules.
2021-01-15 | Fix doc paths in README | Benoit Jacob
PiperOrigin-RevId: 351931688
2021-01-15 | Add a trimmed dependency graph and its generator, for doc purposes. | Benoit Jacob
PiperOrigin-RevId: 351929118
2021-01-14 | Drop unneeded dependency from :context. | Benoit Jacob
PiperOrigin-RevId: 351657429
2021-01-08 | Cosmetics: class-ify TrMulTask, in particular put the trailing _ where they belong. Also remove a useless #include in context.h. | Benoit Jacob
PiperOrigin-RevId: 350645020
2020-12-22 | Fix the new raw accumulators example - being raw accumulators, it's not 'per channel', as there is no multiplier here. | Benoit Jacob
PiperOrigin-RevId: 348522764
2020-12-22 | Relax test tolerance against Eigen, adapting to a recent Eigen change between Eigen commits 011e0db31d1bed8b7f73662be6d57d9f30fa457a and bec72345d69917f475e577d23df0ca4ed967a4f0. | Benoit Jacob
PiperOrigin-RevId: 348522159
2020-12-22 | fix gcc warnings | Benoit Jacob
PiperOrigin-RevId: 348517342
2020-12-21 | Move the example out of the ruy/ruy directory, and add an example returning raw int32 accumulators. | Benoit Jacob
PiperOrigin-RevId: 348511323
2020-11-15 | Fixing warnings on MSVC (comparing a bool with >). | Ben Vanik
PiperOrigin-RevId: 342509771
2020-11-03 | Enforce x86 bit exactness | T.J. Alumbaugh
PiperOrigin-RevId: 340457081