github.com/marian-nmt/FBGEMM.git
author Daya S Khudia <dskhudia@fb.com> 2019-03-21 20:03:36 +0300
committer Facebook Github Bot <facebook-github-bot@users.noreply.github.com> 2019-03-21 20:07:54 +0300
commit f65f0ebe54f0512d8f42ee10025b596e3f42e0b8
tree 8a80b9de7c8d5ae034d707b27ac7c84cecd83d0d /include/fbgemm
parent 452627c5f29412528c26b57880f27914b1068d6e
Improves small N cases back to what they were

Summary: In D14507536 and D14516232 small N cases suffered if we increased the NR. This fixes those cases.

Reviewed By: jianyuh

Differential Revision: D14529494

fbshipit-source-id: 6f53797948de760d6ed24b767cbbe8d27768660f

Diffstat (limited to 'include/fbgemm')
-rw-r--r-- include/fbgemm/PackingTraits-inl.h | 8
1 file changed, 8 insertions, 0 deletions
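
The comments in the diff below justify each NR_MIN value by register width: NR_MIN*ROW_INTERLEAVE int8 elements should exactly fill one 512-bit vector. The following is a minimal sketch of that arithmetic, not FBGEMM code. The two structs are illustrative stand-ins for the two avx512 specializations, and the ROW_INTERLEAVE values (4 and 2) are assumptions, since the hunks do not show them. The helper effectiveNR is hypothetical and only illustrates how a smaller register block could be chosen for small N; the commit does not show how NR_MIN is actually consumed.

```cpp
#include <cstdint>

// Illustrative stand-ins for the two avx512 PackingTraits specializations.
// NR_MIN and NR are copied from the diff; ROW_INTERLEAVE is an ASSUMPTION.
struct Acc32TraitsSketch {
  static constexpr int ROW_INTERLEAVE = 4; // assumed, not shown in the diff
  static constexpr int NR_MIN = 16;
  static constexpr int NR = 32;
};

struct Acc16TraitsSketch {
  static constexpr int ROW_INTERLEAVE = 2; // assumed, not shown in the diff
  static constexpr int NR_MIN = 32;
  static constexpr int NR = 128;
};

constexpr int kVectorBits = 512; // width of one AVX-512 register

// Bits occupied by NR_MIN * ROW_INTERLEAVE int8 elements.
template <typename Traits>
constexpr int bitsPerMinBlock() {
  return Traits::NR_MIN * Traits::ROW_INTERLEAVE * 8 *
      static_cast<int>(sizeof(std::int8_t));
}

// Round n up to the next multiple of NR_MIN.
template <typename Traits>
constexpr int roundUpToNrMin(int n) {
  return ((n + Traits::NR_MIN - 1) / Traits::NR_MIN) * Traits::NR_MIN;
}

// HYPOTHETICAL helper: pick a register block for a given N by rounding up
// to a multiple of NR_MIN and capping at NR. This is not FBGEMM's logic.
template <typename Traits>
constexpr int effectiveNR(int n) {
  return roundUpToNrMin<Traits>(n) < Traits::NR ? roundUpToNrMin<Traits>(n)
                                                : Traits::NR;
}

// Under the assumed ROW_INTERLEAVE values, one minimum block fills a vector.
static_assert(bitsPerMinBlock<Acc32TraitsSketch>() == kVectorBits,
              "16 * 4 int8 elements should fill 512 bits");
static_assert(bitsPerMinBlock<Acc16TraitsSketch>() == kVectorBits,
              "32 * 2 int8 elements should fill 512 bits");

// NR stays a multiple of NR_MIN in both sketches.
static_assert(Acc32TraitsSketch::NR % Acc32TraitsSketch::NR_MIN == 0,
              "NR must be a multiple of NR_MIN");
static_assert(Acc16TraitsSketch::NR % Acc16TraitsSketch::NR_MIN == 0,
              "NR must be a multiple of NR_MIN");
```

With these stand-ins, a call such as effectiveNR<Acc32TraitsSketch>(8) yields 16 rather than the full NR of 32, which is the kind of small-N saving the commit message alludes to.
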
diff --git a/include/fbgemm/PackingTraits-inl.h b/include/fbgemm/PackingTraits-inl.h
index 6bf34d5..5b50bc9 100644
--- a/include/fbgemm/PackingTraits-inl.h
+++ b/include/fbgemm/PackingTraits-inl.h
@@ -154,6 +154,10 @@ struct PackingTraits<
inst_set_t::avx512,
typename std::enable_if<is_8bit<T>::value>::type> {
static constexpr int MR{14}; ///< Register block for M dimension.
+ static constexpr int NR_MIN{
+ 16}; ///< Minimum register block for N dimension.
+ ///< 16 because 16*ROW_INTERLEAVE int8 elements
+ ///< completely fill a 512-bit wide vector.
static constexpr int NR{
32}; ///< Register block for N dimension.
///< Must be a multiple of 16 because 16*ROW_INTERLEAVE int8 elements
@@ -187,6 +191,10 @@ struct PackingTraits<
inst_set_t::avx512,
typename std::enable_if<is_8bit<T>::value>::type> {
static constexpr int MR{6}; ///< Register block for M dimension
+ static constexpr int NR_MIN{
+ 32}; ///< Minimum register block for N dimension;
+ ///< 32 because 32*ROW_INTERLEAVE int8 elements
+ ///< completely fill a 512-bit wide vector.
static constexpr int NR{
128}; ///< Register block for N dimension;
///< Must be a multiple of 32 because 32*ROW_INTERLEAVE int8