author | Jongsoo Park <jongsoo@fb.com> | 2019-09-24 17:06:47 +0300 |
---|---|---|
committer | Facebook Github Bot <facebook-github-bot@users.noreply.github.com> | 2019-09-24 17:25:20 +0300 |
commit | 518d8a1832cf1eb1dda2feace1a278e9e4f302ba (patch) | |
tree | 532f3e479fa8a96644689c65fe7891b9ce30bcf0 /CMakeLists.txt | |
parent | 53f0c0d175ae4283609a5b251052f9c6598b8aee (diff) |
remove template parameter from PackedDepthWiseConvMatrix (#128)
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/128
We don't need KERNEL_PROD to be a compile-time constant template parameter of PackedDepthWiseConvMatrix for performance. Removing the template parameter will make it easier to generalize depth-wise convolution to non-3x3 cases.
This diff only changes fbgemm while maintaining the old interface. The follow-up diff will change Caffe2 code using the old interface and remove the old interface.
This diff also splits FbgemmI8DepthwiseAvx2.cc into FbgemmI8Depthwise3DAvx2.cc and PackDepthwiseConvMatrixAvx2.cc to avoid compilation timeouts in OSS build tests.
Reviewed By: dskhudia
Differential Revision: D17514003
fbshipit-source-id: 2214637ac0762a585f619f0035d3449cc4f7669e
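The change described in the summary can be pictured with a minimal C++ sketch. This is not the FBGEMM code: the class names `PackedWeightsStatic` and `PackedWeightsDynamic`, their members, and the plain row-major copy used in place of the real packing layout are all assumptions made for illustration. It only shows the general move from a compile-time kernel size to a runtime one.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical "before" shape (illustrative names, not FBGEMM's API):
// the kernel size is part of the type, so each kernel size needs its
// own template instantiation.
template <int KERNEL_PROD>
class PackedWeightsStatic {
 public:
  PackedWeightsStatic(int channels, const std::int8_t* src)
      : channels_(channels), data_(src, src + channels * KERNEL_PROD) {}

  int kernelProd() const { return KERNEL_PROD; }
  std::int8_t at(int c, int k) const { return data_[c * KERNEL_PROD + k]; }

 private:
  int channels_;
  std::vector<std::int8_t> data_;  // plain copy; real packing would reorder
};

// Hypothetical "after" shape: the kernel size is an ordinary runtime
// member, so one class covers 3x3 (9), 5x5 (25), 3x3x3 (27), and so on.
class PackedWeightsDynamic {
 public:
  PackedWeightsDynamic(int channels, int kernel_prod, const std::int8_t* src)
      : channels_(channels),
        kernel_prod_(kernel_prod),
        data_(src, src + channels * kernel_prod) {
    assert(channels > 0 && kernel_prod > 0);
  }

  int kernelProd() const { return kernel_prod_; }
  std::int8_t at(int c, int k) const { return data_[c * kernel_prod_ + k]; }

 private:
  int channels_;
  int kernel_prod_;
  std::vector<std::int8_t> data_;  // plain copy; real packing would reorder
};
```

In this sketch, `PackedWeightsStatic<9>` and `PackedWeightsDynamic(channels, 9, src)` hold the same data, while only the second form can be constructed with a kernel size chosen at run time.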
Diffstat (limited to 'CMakeLists.txt')
-rw-r--r-- | CMakeLists.txt | 2 |
1 files changed, 2 insertions, 0 deletions
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 817f699..8bf6371 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -75,8 +75,10 @@ endif()
 #All the source files that either use avx2 instructions statically
 set(FBGEMM_AVX2_SRCS
   src/FbgemmFP16UKernelsAvx2.cc
+  src/FbgemmI8Depthwise3DAvx2.cc
   src/FbgemmI8DepthwiseAvx2.cc
   src/OptimizedKernelsAvx2.cc
+  src/PackDepthwiseConvMatrixAvx2.cc
   src/QuantUtilsAvx2.cc
   src/UtilsAvx2.cc)