
Commit history of github.com/marian-nmt/FBGEMM.git (the marian-nmt fork of FBGEMM).
2021-11-19  Merge pull request #11 from marian-nmt/gcc11support [HEAD, master] (Nikolay Bogoychev)
            Add GCC 11 support.
2021-10-29  support gcc11 [gcc11support] (Nikolay Bogoychev)
            GCC 11 now explicitly requires inclusion of <memory>, <thread>, <limits>, and <utility>, breaking some older code.
2021-03-22  Remove unused comment (Young Jin Kim)
2021-03-22  gcc 9.3+ build fix (#10) (Young Jin Kim)
            * Turn -march=native off when using gcc 9.3+ (-march=x86-64)
2020-09-30  Fix unit test errors - skip unnecessary acc16 test for VNNI CPUs (#9) (Young Jin Kim)
2020-09-03  Restore CMake 3.5.1 compatibility by reimplementing list(TRANSFORM ... PREPEND) with a foreach() (#8) (Aaron Burke)
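The compatibility issue here is that list(TRANSFORM ... PREPEND) was only added in CMake 3.12, so older CMake needs the loop spelled out. A sketch of the equivalent foreach() pattern (variable names are hypothetical, not the actual FBGEMM change):

```cmake
# CMake >= 3.12 only:
#   list(TRANSFORM FBGEMM_HEADERS PREPEND "include/")
# CMake 3.5.1-compatible equivalent:
set(FBGEMM_HEADERS "a.h;b.h;c.h")
set(_prefixed "")
foreach(_hdr IN LISTS FBGEMM_HEADERS)
  list(APPEND _prefixed "include/${_hdr}")
endforeach()
set(FBGEMM_HEADERS "${_prefixed}")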
2020-08-21  Fix dependent library interface include directories to use build/install generator expressions (#7) (Aaron Burke)
            I verified it working well with marian-dev master and stand-alone fbgemm.
2020-08-12  Fix public header property in cpuinfo and clog to support submodule installs (#6) (Aaron Burke)
            Looks good. Thanks!
2020-06-24  Add more include <stdexcept> (#5) (Young Jin Kim)
2020-06-24  Merge pull request #4 from marian-nmt/youki/fix-stdexcept (Young Jin Kim)
            Fix stdexcept compile error
2020-06-24  Add stdexcept [youki/fix-stdexcept] (Young Jin Kim)
2020-05-23  Merge pull request #3 from marian-nmt/youki/improve-mem-alloc-marian (Young Jin Kim)
            Remove an unnecessary memory allocation
2020-05-22  Remove an unnecessary memory allocation [youki/improve-mem-alloc-marian] (Young Jin Kim)
2020-03-04  Merge pull request #2 from XapaJIaMnu/restore_mac_support (Young Jin Kim)
            Support mac again
2020-02-25  Restore mac support (Nikolay Bogoychev)
2019-12-03  Merge pull request #1 from marian-nmt/youki/win-jit-debug-int8 (Young Jin Kim)
            Youki/win jit debug int8
2019-12-03  Merge branch 'master' into youki/win-jit-debug-int8 (Young Jin Kim)
2019-10-19  Remove unused code (Young Jin Kim)
2019-10-18  Change AVX2 compile check to runtime check (Young Jin Kim)
2019-09-26  Linux memory fix (Young Jin Kim)
2019-09-26  debugging linux unit tests (Young Jin Kim)
2019-09-25  fix linux build error (Young Jin Kim)
2019-09-25  All functions are running well on windows (Young Jin Kim)
2019-09-25  Fix windows build errors (Young Jin Kim)
2019-09-25  Merge remote-tracking branch 'upstream/master' into youki/win-jit-debug-int8 (Young Jin Kim)
            Fix for windows build errors
2019-09-25  Fix jit code (AVX512) on windows (Young Jin Kim)
2019-09-25  JIT code working on windows (AVX512) (Young Jin Kim)
2019-09-24  remove template parameter from PackedDepthWiseConvMatrix (#128) (Jongsoo Park)
            Summary: Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/128
            We don't really need to have KERNEL_PROD as a compile-time constant template parameter in PackedDepthWiseConvMatrix for performance. Removing the template parameter will make generalizing depth-wise convolution to non-3x3 cases easier. This diff only changes fbgemm while maintaining the old interface; the follow-up diff will change the Caffe2 code using the old interface and then remove it. This diff also splits FbgemmI8DepthwiseAvx2.cc into FbgemmI8Depthwise3DAvx2.cc and PackDepthwiseConvMatrixAvx2.cc to avoid compilation timeouts in OSS build tests.
            Reviewed By: dskhudia
            Differential Revision: D17514003
            fbshipit-source-id: 2214637ac0762a585f619f0035d3449cc4f7669e
2019-09-17  Enable AVX2 query API when compiled with AVX (Young Jin Kim)
2019-09-16  A bit more refactoring (Aleks Zi)
            Summary: Small refactor of the avx2 acc32 generator
            Reviewed By: dskhudia
            Differential Revision: D17138005
            fbshipit-source-id: 06ded92c5bebb35070a45578feb96e418f8d8489
2019-09-16  Small refactoring of FBGEMM GenerateKernel class (Aleks Zi)
            Summary: Removed unnecessary member variables; using sstream instead of strings.
            Reviewed By: dskhudia
            Differential Revision: D17134969
            fbshipit-source-id: 147d0b39cde9edf5fb70762558e90dced5ba0ab1
2019-09-14  (fixed an error message) (Frank Seide)
2019-09-14  fixed a build error for non-AVX2 builds (Frank Seide)
2019-09-14  Minor changes in initialization of dilation (#126) (Daya Khudia)
            Summary: Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/126
            The default value for dilation is in the function definition itself.
            Reviewed By: protonu
            Differential Revision: D17371791
            fbshipit-source-id: c3430dfa3faccf549dc066aa8dcd422b910dbcaa
2019-09-13  add missing instantiation for float bias for gconv (#127) (Daya Khudia)
            Summary: Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/127
            float bias was going through a slow path; adding the missing specialization.
            Reviewed By: protonu, jianyuh
            Differential Revision: D17346881
            fbshipit-source-id: dd6b40d80c3c429b438ea6b4e1520b935e582c4a
2019-09-11  fbgemmPacked and fbgemmConv APIs with float bias + tests (Daya Khudia)
            Summary: fbgemmPacked and fbgemmConv API changes to take float bias.
            Reviewed By: jianyuh
            Differential Revision: D17244262
            fbshipit-source-id: 0531c829190d20e31cb957a3f1861d4a65645cee
2019-09-11  ReQuantization with FP32 bias (Daya Khudia)
            Summary: There is an issue in eager mode if we quantize bias using input_scale*weight_scale. See the following doc: https://fb.quip.com/ru2eAqzsjwXc
            Reviewed By: jianyuh
            Differential Revision: D16948098
            fbshipit-source-id: ff2c2bc560c2c14da1941d65a15c96e18f407569
2019-09-11  API changes to take unquantized bias for depthwise conv (Daya Khudia)
            Summary: Changing the interface for on-the-fly bias quantization, and adding code to quantize bias on the fly.
            Reviewed By: jianyuh
            Differential Revision: D17099709
            fbshipit-source-id: 5cca79189c00710e703044350260a9fcaca77bb3
2019-09-11  Add assert to ensure the divisor is not 0 (#25960) (Jianyu Huang)
            Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25960
            Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/124
            Reviewed By: dskhudia
            Differential Revision: D17292372
            fbshipit-source-id: 71a72f87b99c65b3b956bd8361694b1de05fc333
2019-09-05  CodeCache implemented with correct initialization of static variables (#123) (Aleks Zi)
            Summary: Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/123
            Same as D16968373, but with the static initialization order dependency problem fixed (https://isocpp.org/wiki/faq/ctors#static-init-order).
            Reviewed By: dskhudia
            Differential Revision: D17194751
            fbshipit-source-id: 274f111996ab4f1c4386bd3b9ee8f3790739fdcd
2019-09-05  Revert D16968373: Introduced CodeCache container to share the microkernels among different threads. (Edward Yang)
            Differential Revision: D16968373
            Original commit changeset: 22d66e50d9b3
            fbshipit-source-id: 6163979bdb36cb0b1b95bfa1caeab67e7d23eee5
2019-09-05  Modifying PackAWithIm2Col to support dilated convolution and adding test cases (Protonu Basu)
            Summary: Modifying PackAWithIm2Col to support dilated convolution and adding test cases.
            Reviewed By: dskhudia
            Differential Revision: D17184638
            fbshipit-source-id: e2935b1e1577505440019f732d03be630d1be040
2019-09-04  remove dw conv refs and use conv_ref instead (#122) (Jongsoo Park)
            Summary: Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/122
            To prepare for depth-wise convolution other than 3x3: the existing reference depth-wise convolution is limited to 3x3, and we should reuse the conv_ref implementation for easier maintenance.
            Reviewed By: dskhudia
            Differential Revision: D17176591
            fbshipit-source-id: 9f6f90a801a0ad95091f1d085e66861f86c3a8f1
2019-09-04  Introduced CodeCache container to share the microkernels among different threads. (Aleks Zi)
            Summary: CodeCache is thread safe and ensures single creation of each microkernel. It uses a single jitRuntime, written to under a lock. The CodeHolder was removed from the class members, as it is only a temporary class and can be created/destroyed on demand; there is no need to keep the metadata of the last generated microkernel.
            Reviewed By: dskhudia
            Differential Revision: D16968373
            fbshipit-source-id: 22d66e50d9b3173c542e28daa322e7869eb52b14
2019-09-04  Modifying reference conv2d/3d, im2col 2d/3d to support dilated convolutions (Protonu Basu)
            Summary: Modifying reference conv2d/3d, im2col 2d/3d to support dilated convolutions.
            Reviewed By: dskhudia
            Differential Revision: D17169707
            fbshipit-source-id: f6862f79d9cf10f0b72df1b6feafc3d35ba7e5d5
2019-09-04  Adding support for dilations in the conv_param_t constructor (Protonu Basu)
            Summary: (PART 1) Adding support for convolutions with dilation: modifications to the constructor.
            Reviewed By: jianyuh
            Differential Revision: D17165387
            fbshipit-source-id: e005c416683d9d40a4413f8aba1b5f21a7afc156
2019-09-03  disable clang formatting in a few array definitions (#121) (Jongsoo Park)
            Summary: Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/121
            By adding "// clang-format off" and "// clang-format on" we can still apply clang-format to these files.
            Reviewed By: jianyuh
            Differential Revision: D17159312
            fbshipit-source-id: de523536df4c33f0efe332f9bc7b0290cdac1ba0
2019-08-30  Adopt Contributor Covenant (Paul O'Shannessy)
            Summary: In order to foster healthy open source communities, we're adopting the Contributor Covenant (https://www.contributor-covenant.org/). It has been built by open source community members and represents a shared understanding of what is expected from a healthy community.
            Reviewed By: josephsavona, danobi, rdzhabarov
            Differential Revision: D17104640
            fbshipit-source-id: d210000de686c5f0d97d602b50472d5869bc6a49
2019-08-29  int8 specialization for AVX2 Quantize routine (#120) (James Reed)
            Summary: This adds a specialization for int8 to the AVX2 Quantize routine. I also tried adding a specialization for int32 (the final datatype we support in PyTorch quantization), but it seemed to introduce numerical issues stemming from the difference in implementations: https://github.com/pytorch/FBGEMM/blob/master/include/fbgemm/QuantUtils.h#L63 vs https://github.com/pytorch/FBGEMM/blob/master/src/QuantUtilsAvx2.cc#L82
            Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/120
            Reviewed By: driazati
            Differential Revision: D17115198
            Pulled By: jamesr66a
            fbshipit-source-id: 119145bb99235a7545389afa61483060200cc2b7
2019-08-21  Per channel support in fbgemmConv (#119) (Daya Khudia)
            Summary: Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/119
            Some paths in fbgemmConv had missing support for per-channel quantization; this diff adds per-channel as well as groupwise quantization support.
            Reviewed By: jianyuh
            Differential Revision: D16894740
            fbshipit-source-id: 43a2c08d1c8d1b01775f875224774c39fae280bc