github.com/marian-nmt/FBGEMM.git
author: Daya S Khudia <dskhudia@fb.com> 2019-03-21 20:03:36 +0300
committer: Facebook Github Bot <facebook-github-bot@users.noreply.github.com> 2019-03-21 20:07:54 +0300
commit: 452627c5f29412528c26b57880f27914b1068d6e
tree: 3d7f6af4a78a15e72212e4a25916d238f0c9699a /include/fbgemm/PackingTraits-inl.h
parent: d53c0220cf1749802736bba192c5e37f430df7a0
Allocate some registers for B matrix loading and reuse loaded results
Summary: Instead of loading B matrix values with every vpmaddubsw instruction, load them once and reuse them. The downside is that some registers must now hold these B matrix values, registers that could otherwise have been used for C accumulations.

Reviewed By: jianyuh

Differential Revision: D14529495

fbshipit-source-id: 54bd4bcdcf14ac2f25a433ac60bfc08b7359453f
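The register trade-off described in the summary can be sketched with a little compile-time arithmetic. This is an illustrative estimate, not code from FBGEMM: it assumes the 32 architectural 512-bit vector registers of AVX-512 and that each row of the C tile keeps NR int16 accumulators in registers. The helper names (`accumulatorRegsPerRow`, `freeRegs`) are hypothetical.

```cpp
#include <cstdint>

// AVX-512 exposes 32 vector registers of 64 bytes (512 bits) each.
constexpr int kZmmRegisters = 32;
constexpr int kZmmBytes = 64;
// NR as it appears in the PackingTraits specialization touched by this commit.
constexpr int kNR = 128;

// Registers needed for one row of NR int16 accumulators
// (assumption: the whole row of the C tile stays in registers).
constexpr int accumulatorRegsPerRow() {
  return kNR * static_cast<int>(sizeof(std::int16_t)) / kZmmBytes;
}

// Registers left for everything else (B values, A broadcasts, temporaries)
// once MR rows of accumulators have been reserved.
constexpr int freeRegs(int mr) {
  return kZmmRegisters - mr * accumulatorRegsPerRow();
}

static_assert(accumulatorRegsPerRow() == 4, "128 int16 values span 4 registers");
static_assert(freeRegs(7) == 4, "MR = 7 leaves 4 registers");
static_assert(freeRegs(6) == 8, "MR = 6 leaves 8 registers");
```

Under these assumptions, dropping one row of accumulators frees four registers, which is consistent with the summary's point that holding B values in registers costs accumulation capacity.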
Diffstat (limited to 'include/fbgemm/PackingTraits-inl.h')
-rw-r--r-- include/fbgemm/PackingTraits-inl.h 4
1 files changed, 2 insertions, 2 deletions
diff --git a/include/fbgemm/PackingTraits-inl.h b/include/fbgemm/PackingTraits-inl.h
index 465c498..6bf34d5 100644
--- a/include/fbgemm/PackingTraits-inl.h
+++ b/include/fbgemm/PackingTraits-inl.h
@@ -186,7 +186,7 @@ struct PackingTraits<
std::int16_t,
inst_set_t::avx512,
typename std::enable_if<is_8bit<T>::value>::type> {
- static constexpr int MR{7}; ///< Register block for M dimension
+ static constexpr int MR{6}; ///< Register block for M dimension
static constexpr int NR{
128}; ///< Register block for N dimension;
///< Must be a multiple of 32 because 32*ROW_INTERLEAVE int8
@@ -200,7 +200,7 @@ struct PackingTraits<
///< B matrix.
static constexpr int MCB{
- 56}; ///< Cache block for M dimension (multiple of MR).
+ 60}; ///< Cache block for M dimension (multiple of MR).
static constexpr int NCB{
128}; ///< Cache block for N dimension (multiple of NR).
static constexpr int KCB{256}; ///< Cache block for K dimension.
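
The MCB change follows from the constraint stated in its comment: MCB must be a multiple of MR. A minimal sketch of that check is shown here. The helpers `isMultiple` and `roundUpToMultiple` are hypothetical and not part of FBGEMM, and whether the author picked 60 by rounding up is an assumption (54 would also satisfy the constraint).

```cpp
// True when the cache block size is a whole number of register blocks.
constexpr bool isMultiple(int block, int reg) {
  return reg > 0 && block % reg == 0;
}

// Smallest multiple of `step` that is >= `value` (for positive inputs).
constexpr int roundUpToMultiple(int value, int step) {
  return (value + step - 1) / step * step;
}

static_assert(isMultiple(56, 7), "old pair: MCB = 56, MR = 7");
static_assert(!isMultiple(56, 6), "old MCB no longer fits the new MR = 6");
static_assert(isMultiple(60, 6), "new pair: MCB = 60, MR = 6");
static_assert(roundUpToMultiple(56, 6) == 60, "rounding the old MCB up gives 60");
```

A similar `static_assert` placed next to the trait definitions would catch a mismatched MR/MCB pair at compile time.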