
github.com/marian-nmt/FBGEMM.git
author: Mike Tsai <miketsai@fb.com>, 2019-06-15 03:04:25 +0300
committer: Facebook Github Bot <facebook-github-bot@users.noreply.github.com>, 2019-06-15 03:10:29 +0300
commit: 604575ff5de717b2ee712190634840981a9c8fba
tree: 198817e0992810a7dff5ac0ed2a99a9b08834346 /src
parent: 5e71d2c304663f3b4e50cee723b8e98a867d11ca
Update the logic of checking valid parameters.
Summary: Add the check on NR_MIN and fix ymm/zmm register checks.
Reviewed By: dskhudia
Differential Revision: D15772144
fbshipit-source-id: 11e2c67fb3d47c5570b38ceaf9828ced0e60e65b
Diffstat (limited to 'src')
-rw-r--r-- src/GenerateKernelU8S8S32ACC16Avx512.cc | 7
1 files changed, 4 insertions, 3 deletions
diff --git a/src/GenerateKernelU8S8S32ACC16Avx512.cc b/src/GenerateKernelU8S8S32ACC16Avx512.cc
index 505fec1..e5687eb 100644
--- a/src/GenerateKernelU8S8S32ACC16Avx512.cc
+++ b/src/GenerateKernelU8S8S32ACC16Avx512.cc
@@ -201,9 +201,10 @@ CodeGenBase<uint8_t, int8_t, int32_t, int16_t>::getOrCreate<inst_set_t::avx512>(
int maxMRegs = mRegBlockSize;
int maxNRegs = nRegBlockSize * row_interleave / VLEN_;
assert(
- maxMRegs * maxNRegs <= 24 &&
- "MR*(NR*ROW_INTERLEAVE*8/512) \
- must be <= 24(available registers constraint)");
+ (maxMRegs+1) * maxNRegs <= 28 &&
+ "number of zmm registers for C + one row for loading B: \
+ MR*(NR*ROW_INTERLEAVE*8/512) + (NR*ROW_INTERLEAVE*8/512) \
+ must be <= 28(available registers constraint)");
int mRegBlocks = mc / mRegBlockSize;
int mRegBlocksRem = mc % mRegBlockSize;
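
The arithmetic behind the changed assert can be sketched as a standalone check. This is only an illustration under stated assumptions: the helper names (kZmmBudget, zmmRegistersNeeded, fitsZmmBudget, fitsOldBudget) are hypothetical and not part of FBGEMM, vlen is passed in as a parameter rather than read from VLEN_, and the block sizes used in the checks are made-up values chosen to show where the old and new conditions disagree.

```cpp
#include <cassert>

// Hypothetical helpers for illustration only; none of these names exist in FBGEMM.
// The limits 28 and 24 are copied from the new and old asserts in the diff.
constexpr int kZmmBudget = 28;

// zmm registers needed by the new rule:
// MR rows of C accumulators plus one extra row for loading B.
constexpr int zmmRegistersNeeded(int mRegBlockSize, int nRegBlockSize,
                                 int rowInterleave, int vlen) {
  return (mRegBlockSize + 1) * (nRegBlockSize * rowInterleave / vlen);
}

// New condition: (maxMRegs + 1) * maxNRegs <= 28.
constexpr bool fitsZmmBudget(int mRegBlockSize, int nRegBlockSize,
                             int rowInterleave, int vlen) {
  return zmmRegistersNeeded(mRegBlockSize, nRegBlockSize, rowInterleave, vlen)
      <= kZmmBudget;
}

// Old condition: maxMRegs * maxNRegs <= 24 (only the C accumulators counted).
constexpr bool fitsOldBudget(int mRegBlockSize, int nRegBlockSize,
                             int rowInterleave, int vlen) {
  return mRegBlockSize * (nRegBlockSize * rowInterleave / vlen) <= 24;
}

// Runtime variant in the style of the patched assert.
inline void checkZmmBudget(int mRegBlockSize, int nRegBlockSize,
                           int rowInterleave, int vlen) {
  assert(fitsZmmBudget(mRegBlockSize, nRegBlockSize, rowInterleave, vlen) &&
         "(MR + 1) * (NR * ROW_INTERLEAVE / VLEN) must be <= 28");
}

// Illustrative values: 13 x 2 registers for C plus 2 for B is exactly 28,
// which the old rule rejected (26 > 24) and the new rule accepts.
static_assert(!fitsOldBudget(13, 32, 2, 32), "old rule rejects MR=13, 2 N regs");
static_assert(fitsZmmBudget(13, 32, 2, 32), "new rule accepts MR=13, 2 N regs");

// Illustrative values: 3 x 8 registers for C passed the old rule (24 <= 24),
// but the extra B row pushes the total to 32, which the new rule rejects.
static_assert(fitsOldBudget(3, 128, 2, 32), "old rule accepts MR=3, 8 N regs");
static_assert(!fitsZmmBudget(3, 128, 2, 32), "new rule rejects MR=3, 8 N regs");
```

The two pairs of static_assert lines show that the new condition is not simply looser or stricter than the old one: it raises the limit from 24 to 28 but also charges one additional row of registers for B, so wide N blocks can now fail where they previously passed.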