barebone int8-acc16 and int8-acc32 benchmarks

Summary: adding barebone gemm benchmarks for comparisons

**Performance on Skylake T6 (turbo off; single thread)**

M, N, K, Type, GOPS
64, 800, 320, MKL_fp32, 91.1
64, 800, 320, FBGEMM_i8_acc32, 118.7
64, 800, 320, FBGEMM_i8_acc16, 137.0
64, 768, 512, MKL_fp32, 102.0
64, 768, 512, FBGEMM_i8_acc32, 132.2
64, 768, 512, FBGEMM_i8_acc16, 160.1
16, 256, 512, MKL_fp32, 39.8
16, 256, 512, FBGEMM_i8_acc32, 55.3
16, 256, 512, FBGEMM_i8_acc16, 63.4
128, 128, 128, MKL_fp32, 49.2
128, 128, 128, FBGEMM_i8_acc32, 54.1
128, 128, 128, FBGEMM_i8_acc16, 54.4
256, 512, 256, MKL_fp32, 97.7
256, 512, 256, FBGEMM_i8_acc32, 126.2
256, 512, 256, FBGEMM_i8_acc16, 170.1
1024, 1024, 1024, MKL_fp32, 114.3
1024, 1024, 1024, FBGEMM_i8_acc32, 150.8
1024, 1024, 1024, FBGEMM_i8_acc16, 202.9

**Breakdown**

M, N, K, Type, Packing (us), Kernel (us), Postproc (us), Total (us), GOPS
64, 800, 320, MKL_fp32, 0, 0, 0, 0, 95.7
64, 800, 320, FBGEMM_i8_acc32, 5.9, 261.9, 2.0, 275.9, 115.5
64, 800, 320, FBGEMM_i8_acc16, 17.4, 210.6, 3.3, 238.2, 132.1
64, 768, 512, MKL_fp32, 0, 0, 0, 0, 103.2
64, 768, 512, FBGEMM_i8_acc32, 9.0, 366.2, 1.9, 383.2, 128.0
64, 768, 512, FBGEMM_i8_acc16, 9.9, 298.3, 1.5, 314.8, 155.4
16, 256, 512, MKL_fp32, 0, 0, 0, 0, 40.8
16, 256, 512, FBGEMM_i8_acc32, 3.3, 60.5, 1.0, 68.3, 54.3
16, 256, 512, FBGEMM_i8_acc16, 3.2, 55.2, 0.5, 61.2, 60.6
128, 128, 128, MKL_fp32, 0, 0, 0, 0, 51.3
128, 128, 128, FBGEMM_i8_acc32, 8.1, 60.4, 0.6, 71.0, 52.4
128, 128, 128, FBGEMM_i8_acc16, 16.0, 44.8, 0.4, 64.6, 56.4
256, 512, 256, MKL_fp32, 0, 0, 0, 0, 95.0
256, 512, 256, FBGEMM_i8_acc32, 12.9, 512.1, 3.9, 542.1, 122.1
256, 512, 256, FBGEMM_i8_acc16, 12.1, 376.4, 2.3, 396.2, 165.8
1024, 1024, 1024, MKL_fp32, 0, 0, 0, 0, 114.9
1024, 1024, 1024, FBGEMM_i8_acc32, 116.9, 13999.2, 47.9, 14276.1, 150.3
1024, 1024, 1024, FBGEMM_i8_acc16, 125.7, 10490.3, 31.8, 10730.1, 200.0

TODO: add mkl-dnn as well.
Reviewed By: jianyuh

Differential Revision: D14196397

fbshipit-source-id: 4cfb22374a6553a774d2f92ef37e295b7296de8d
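The commit message does not say how the GOPS column is derived. A minimal sketch, assuming the usual GEMM convention of one multiply and one add per (m, n, k) triple, so 2*M*N*K operations in total. The helper name `gops` is hypothetical and is not taken from the benchmark source:

```cpp
// Hypothetical helper, not part of FBGEMM: converts a GEMM shape and a
// measured time into GOPS.
// Assumption: the operation count is 2*M*N*K (one multiply + one add per
// inner-product term); the commit does not state the formula it uses.
double gops(long long m, long long n, long long k, double total_us) {
  const double ops = 2.0 * static_cast<double>(m) *
      static_cast<double>(n) * static_cast<double>(k);
  const double seconds = total_us * 1e-6;
  return ops / seconds / 1e9;
}
```

Under that assumption, `gops(1024, 1024, 1024, 14276.1)` comes out at roughly 150, in line with the FBGEMM_i8_acc32 breakdown row for that shape. The smaller shapes agree less closely with their Total (us) values, so the reported GOPS may have been taken from separately timed runs.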
author: Daya S Khudia <dskhudia@fb.com> 2019-02-26 00:11:40 +0300
committer: Facebook Github Bot <facebook-github-bot@users.noreply.github.com> 2019-02-26 00:14:47 +0300
commit: 426d7be717a3d2f5cef5346ef10d81bb636e625a (patch)
tree: 449cb184525b295144e98cc390257cb2a78e3997 /src
parent: 7b99ce43df15ab1fb5c6be383e3bc2f651cde44c (diff)
1 files changed, 5 insertions, 1 deletions
diff --git a/src/PackAMatrix.cc b/src/PackAMatrix.cc
index 8469a39..f52684a 100644
--- a/src/PackAMatrix.cc
+++ b/src/PackAMatrix.cc
@@ -44,7 +44,11 @@ PackAMatrix<T, accT>::PackAMatrix(
         "groups = " + std::to_string(groups) +
         " does not divide numCols = " + std::to_string(BaseType::numCols()));
   }
-  if (!pmat) {
+  if (pmat) {
+    BaseType::buf_ = pmat;
+  }
+  else {
+    BaseType::bufAllocatedHere_ = true;
     BaseType::buf_ = (T*)fbgemmAlignedAlloc(
         64, BaseType::brow_ * BaseType::bcol_ * sizeof(T));
   }
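The change above separates two cases: a caller-supplied buffer is stored as-is, while a buffer allocated in the constructor is flagged through `bufAllocatedHere_`, presumably so that only the latter is released later. A minimal standalone sketch of that ownership-flag pattern follows. `PackedBuffer`, `data()`, and `ownsBuffer()` are hypothetical names, and plain `new[]` stands in for the 64-byte `fbgemmAlignedAlloc` call, so alignment is not reproduced here:

```cpp
#include <cstddef>

// Hypothetical stand-in for a packed-matrix holder; not FBGEMM's actual class.
template <typename T>
class PackedBuffer {
 public:
  PackedBuffer(T* pmat, std::size_t rows, std::size_t cols) {
    if (pmat) {
      // Caller owns the memory: remember the pointer, never free it.
      buf_ = pmat;
    } else {
      // No buffer supplied: allocate one and record that it is ours.
      // Assumption: plain new[] replaces the aligned allocator of the diff.
      bufAllocatedHere_ = true;
      buf_ = new T[rows * cols]();
    }
  }

  ~PackedBuffer() {
    // Release only memory this object allocated itself.
    if (bufAllocatedHere_) {
      delete[] buf_;
    }
  }

  // Copying would lead to a double free of an owned buffer.
  PackedBuffer(const PackedBuffer&) = delete;
  PackedBuffer& operator=(const PackedBuffer&) = delete;

  T* data() const { return buf_; }
  bool ownsBuffer() const { return bufAllocatedHere_; }

 private:
  T* buf_ = nullptr;
  bool bufAllocatedHere_ = false;
};
```

Without the flag, a destructor has no way to tell a borrowed pointer from an owned one, which leads either to a leak or to freeing the caller's memory.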