Age | Commit message | Author |
2020-07-24 | Let cpu_backend_gemm support all storage order combinations, unconditionally ...test_323013778 | benoitjacob |
2020-07-24 | Optimized packing code path for row-major 8bit inputs for the x86 paths. | Benoit Jacob |
2020-07-24 | Optimized packing code path for row-major 8bit inputs for the kNeon path. Wri... | Benoit Jacob |
2020-07-21 | Use lambdas to shorten Kernel8bitAvx512's source code, and to split the resul... | Benoit Jacob |
2020-07-21 | Optimized packing code path for row-major float inputs. | Benoit Jacob |
2020-07-20 | Optimized packing code path for row-major 8bit inputs for the kNeonDotprod path. | Benoit Jacob |
2020-07-15 | Fix the build on some toolchains - a missing #include<cstring> and some avx51... | Benoit Jacob |
2020-07-15 | Rename packing code implementation functions now that they are explicitly abo... | Benoit Jacob |
2020-07-15 | Templatize packing code paths on the source order, so that we support any com... | Benoit Jacob |
2020-07-14 | Simplification of FallBackToStandardCpp now that we are past the incremental ... | Benoit Jacob |
2020-07-14 | Efficient support for any channel_dimension for quantized kernels on AVX-512,... | Benoit Jacob |
2020-07-14 | Efficient support for any channel_dimension for quantized kernels on AVX-512,... | Benoit Jacob |
2020-07-14 | Efficient support for any channel_dimension for quantized kernels on AVX2. | Benoit Jacob |
2020-07-14 | Simplify x86 kernels by using the fact that there always is a per-channel buf... | Benoit Jacob |
2020-07-14 | Simplify x86 kernels thanks to the new fact that perchannel buffers are round... | Benoit Jacob |
2020-07-13 | Fix runtime detection of support for our AVX2+FMA code path: we were only che... | Benoit Jacob |
2020-07-13 | FMA is technically a separate ISA extension from AVX2. | Benoit Jacob |
2020-07-13 | Efficient support for any channel_dimension for float kernels on AVX-512. | Benoit Jacob |
2020-07-13 | Efficient support for any channel_dimension for float kernels on AVX2. | Benoit Jacob |
2020-07-13 | Allow the user to specify that they have allocated a slightly larger capacity... | Benoit Jacob |
2020-07-09 | Fix ARM32 packing code reading past the end of the source matrix, and finishi... | Benoit Jacob |
2020-07-09 | Add comments and some minor simplifications to packing code. | Benoit Jacob |
2020-07-09 | Avoid overrunning per-channel buffers, whose size is that of the correspondin... | Benoit Jacob |
2020-07-09 | Minor optimization of in-order arm64 kernels, interleave the dup's used in th... | Benoit Jacob |
2020-07-09 | Minor simplification of arm32 assembly: the add instruction itself can be con... | Benoit Jacob |
2020-07-09 | Efficient support for any channel_dimension for quantized kernels on ARM32. | Benoit Jacob |
2020-07-09 | Efficient support for any channel_dimension for float kernels on ARM32. | Benoit Jacob |
2020-07-08 | Efficient support for any channel_dimension for kNeonDotprod quantized kernel... | Benoit Jacob |
2020-07-08 | Efficient support for any channel_dimension for kNeon quantized kernels on AR... | Benoit Jacob |
2020-07-08 | Ensure that the 1Col kernels are not used with channel_dimension==kCol, so th... | Benoit Jacob |
2020-07-08 | Efficient support for any channel_dimension for float kernels on ARM64. | Benoit Jacob |
2020-07-08 | Groundwork to pass channel_dimension down to kernels and to incrementally ena... | Benoit Jacob |
2020-07-07 | Revisiting RUY_OPT(AVOID_ALIASING). | Benoit Jacob |
2020-07-07 | Fix benchmarking of caching. | Benoit Jacob |
2020-07-06 | Allow benchmarking any combination of storage orders, and disable the randomi... | Benoit Jacob |
2020-07-06 | Allow disabling the reference path in the benchmark. | Benoit Jacob |
2020-07-02 | Start of a documentation directory. | Benoit Jacob |
2020-07-02 | Remove RUY_OPT(NATIVE_ROUNDING) or rather, the ability to disable it. | Benoit Jacob |
2020-07-02 | Make the reference/standard-cpp code in ApplyMultiplier match the ARM code, b... | Benoit Jacob |
2020-07-01 | Avoid relying on std::max being constexpr, which is c++14 behavior but is not... | Benoit Jacob |
2020-06-30 | Remove ExpectedOutcome support, it was used for death tests in test_special_m... | Benoit Jacob |
2020-06-29 | Store perchannel members in a union with their non-perchannel counterpart. | Benoit Jacob |
2020-06-29 | Split the storage of MulParams data members into 3 separate template speciali... | Benoit Jacob |
2020-06-26 | Remove cpuinfo from s390x build as there is no support yet | cdavoudian |
2020-06-26 | Reduce to the case of column-major destination matrix by transposing the whol... | Benoit Jacob |
2020-06-26 | Some refactoring in create_trmul_params.* ahead of implementing the transposi... | Benoit Jacob |
2020-06-25 | Implement the channels_dimension==kCol case. | Benoit Jacob |
2020-06-25 | Change Transpose functions to returning the result by value. | Benoit Jacob |
2020-06-25 | Store the MulParams by value, in a char[] buffer, in TrMulParams. | Benoit Jacob |
2020-06-25 | Add a channel_dimension member to MulParams, bringing the last piece to make ... | Benoit Jacob |