Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/121
By adding "// clang-format off" and "// clang-format on" we can still apply clang-format to these files.
Reviewed By: jianyuh
Differential Revision: D17159312
fbshipit-source-id: de523536df4c33f0efe332f9bc7b0290cdac1ba0
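For reference, a minimal sketch of how such guards look in practice (the code below is illustrative, not from FBGEMM): everything outside the off/on pair is still formatted by clang-format.
```
// clang-format off
static const int kLookup[4] = {  1,   2,
                                 4,   8 };  // hand-aligned table, left as-is
// clang-format on

int lookupEntry(int i) { return kLookup[i]; }  // formatted normally
```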
---
Summary:
In order to foster healthy open source communities, we're adopting the
[Contributor Covenant](https://www.contributor-covenant.org/). It has been
built by open source community members and represents a shared understanding of
what is expected from a healthy community.
Reviewed By: josephsavona, danobi, rdzhabarov
Differential Revision: D17104640
fbshipit-source-id: d210000de686c5f0d97d602b50472d5869bc6a49
---
Summary:
This adds a specialization for `int8` to the AVX2 `Quantize` routine.
I also tried adding a specialization for `int32` (the final datatype we support in PyTorch quantization), but it seemed to introduce numerical issues stemming from the difference between the two implementations:
https://github.com/pytorch/FBGEMM/blob/master/include/fbgemm/QuantUtils.h#L63
vs
https://github.com/pytorch/FBGEMM/blob/master/src/QuantUtilsAvx2.cc#L82
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/120
Reviewed By: driazati
Differential Revision: D17115198
Pulled By: jamesr66a
fbshipit-source-id: 119145bb99235a7545389afa61483060200cc2b7
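For context, a hedged sketch of the scalar affine-quantization rule that the QuantUtils.h reference follows (simplified signature, not FBGEMM's exact API); the AVX2 path vectorizes the same rule, and small differences in rounding/saturation order between the two files can change results for wider types:
```
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <limits>

// Simplified scalar quantization: scale, add zero point, clamp to the
// target type's range, round to nearest.
template <typename T>
T QuantizeScalar(float x, float scale, std::int32_t zero_point) {
  float v = zero_point + x / scale;
  v = std::max(v, static_cast<float>(std::numeric_limits<T>::min()));
  v = std::min(v, static_cast<float>(std::numeric_limits<T>::max()));
  return static_cast<T>(std::nearbyint(v));
}
```
Note that for `int32` the float clamp bounds themselves lose precision, which is one plausible source of the numerical issues mentioned above.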
---
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/119
Some paths in fbgemmConv were missing support for per-channel quantization. This diff adds support for per-channel as well as groupwise quantization.
Reviewed By: jianyuh
Differential Revision: D16894740
fbshipit-source-id: 43a2c08d1c8d1b01775f875224774c39fae280bc
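As a rough illustration of the distinction (hypothetical helper, not the fbgemmConv API): with per-channel quantization every output channel carries its own scale and zero point, so requantization indexes those arrays by channel; groupwise quantization is the same idea at group granularity.
```
#include <cstdint>
#include <vector>

// Hypothetical sketch: requantize accumulators with one (scale, zero_point)
// pair per output channel instead of a single tensor-wide pair.
void requantizePerChannel(const std::vector<std::int32_t>& acc,
                          const std::vector<float>& scale,      // per channel
                          const std::vector<std::int32_t>& zp,  // per channel
                          int rows, int channels,
                          std::vector<std::uint8_t>& out) {
  for (int r = 0; r < rows; ++r) {
    for (int c = 0; c < channels; ++c) {
      float v = acc[r * channels + c] * scale[c] + zp[c];
      v = v < 0.f ? 0.f : (v > 255.f ? 255.f : v);  // clamp to uint8 range
      out[r * channels + c] = static_cast<std::uint8_t>(v + 0.5f);
    }
  }
}
```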
---
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/118
Same as title
Reviewed By: jianyuh
Differential Revision: D16807867
fbshipit-source-id: f94e31f3710438aaf4665eadd541571af0afc618
---
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/117
Fixes the error message for mismatching convolution parameters.
Before:
```
[FBGEMM_CONV_ERROR] Prepacked weights can't be used with these convolution parameters!
```
After:
```
[FBGEMM_CONV_ERROR] Convolution parameters mismatch between pre-packed weights and conv invocation! stride [1, 1] vs [2, 1]; Please pack weights using the same parameters with which convolution operation is invoked!
```
Reviewed By: jianyuh
Differential Revision: D16749007
fbshipit-source-id: 7a3083f2955b798ae28d25ce1963c7de63654551
---
Summary:
As Title says.
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/116
Test Plan: CI
Differential Revision: D16747927
Pulled By: jianyuh
fbshipit-source-id: 6d60a12e11dad7da20ce0224de8bc611b2e44578
---
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/114
Adding VNNI support in FBGEMM.
Previously, we had an issue with the CMake version: the PyTorch and FBGEMM OSS tests use CMake 3.5, while ASMJIT requires CMake 3.8+. This caused build failures on some platforms. The CMake version issue is now resolved by a PR to ASMJIT that downgrades its CMake requirement: https://github.com/asmjit/asmjit/pull/252.
Reviewed By: dskhudia
Differential Revision: D16720839
fbshipit-source-id: e5e5f2d26f924df8d9fb955f4a3758561fa73288
---
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/112
We need to unpack the layout to support non-CPU architectures.
Reviewed By: jianyuh
Differential Revision: D16584449
fbshipit-source-id: 309acaf8f2406e39d6975c0e9fef3e849a6d3950
---
Summary:
Original commit changeset: fcaa13cc3159
ASMJIT requires CMake 3.8+, while FBGEMM and PyTorch only need CMake 3.5+. This caused a build failure in FBGEMM:
https://circleci.com/gh/pytorch/FBGEMM/122#build-timing/containers/0
Reviewed By: dskhudia
Differential Revision: D16670547
fbshipit-source-id: 506714c3db1cb82cf98895f58f82f235128f5285
---
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/113
Adding VNNI support in FBGEMM.
Reviewed By: dskhudia
Differential Revision: D16276574
fbshipit-source-id: 832ccdb27339489ebc138f3b2678e53d107c1b79
---
Summary:
Pass blocking params in to compute the correct buffer size for each group.
Fix the bug for this conv shape:
```
conv_param_t<2>(1, 32, 16, {12, 14}, 4, {3, 3}, {1, 1}, {0, 0, 0, 0})
```
(corresponding M, N, K = 120, 4, 288) with these params:
```
BlockingFactors params;
params.MCB = 48;
params.NCB = 16;
params.KCB = 256;
params.MR = 1;
params.NR = 16;
params.ROW_INTERLEAVE = 4;
params.NR_MIN = 16;
```
Reviewed By: jianyuh
Differential Revision: D16571367
fbshipit-source-id: 27c9b003d37c4d3d13767227e8343d44668823d6
---
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/110
Because the ASMJIT submodule will be updated in the near future, we add additional instructions on updating submodules for OSS users.
Reviewed By: dskhudia
Differential Revision: D16415043
fbshipit-source-id: 0488dda34a5916a40eee948a5b7455cf8770d72d
---
Summary: std::multiplier is not found.
Reviewed By: jspark1105
Differential Revision: D16373256
fbshipit-source-id: ae273a3f447f95e4b26d3f1a43e7ddad288b78ab
---
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/108
Pointwise gets converted to direct GEMM
Reviewed By: jianyuh
Differential Revision: D16296356
fbshipit-source-id: 68c88df90e5de669bfcddf426c6488e2a04d55d6
---
Summary: Add blocking params as argument of rowOffsetBufferSize() so the allocated vector will be sized correctlly.
Reviewed By: dskhudia, jianyuh
Differential Revision: D16348913
fbshipit-source-id: c70a05f2f69db3ce71ec2c27a8db4d143649ddd6
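A hedged usage sketch, assuming the post-diff signature takes an optional `BlockingFactors` pointer (check the FBGEMM headers for the exact form):
```
#include <cstdint>
#include <vector>
#include "fbgemm/Fbgemm.h"

void sizeRowOffsetBuffer() {
  fbgemm::BlockingFactors params;
  params.MCB = 48;
  params.NCB = 16;
  params.KCB = 256;
  params.MR = 1;
  params.NR = 16;
  params.NR_MIN = 16;
  params.ROW_INTERLEAVE = 4;
  // Sized for the NCB actually in use, not for the default blocking.
  std::vector<std::int32_t> row_offset_buf(
      fbgemm::PackAWithRowOffset<std::uint8_t>::rowOffsetBufferSize(&params));
}
```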
---
Summary: This is to detect inadvertently calling fbgemmConv with one set of conv parameters when packing was done with another set of parameters.
Reviewed By: jspark1105
Differential Revision: D16269293
fbshipit-source-id: 9a166f5298d8246047e40fc880dd87e1037e0456
---
Summary:
Changes to remove warnings when building FBGEMM in opt mode.
Cleanup to address initialization of MCB, KCB, NCB.
Reviewed By: jianyuh
Differential Revision: D16283443
fbshipit-source-id: 0829aee45ed1d262a18bcf4dd294393ef018a688
---
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/106
The values returned by these functions are needed while unpacking weights.
Reviewed By: jianyuh
Differential Revision: D16193425
fbshipit-source-id: 8ee3a0dc46768d7cb572bf383be1ce2b450c44c9
---
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/105
Support for calling unpack through the unified interface for packing convolution weights.
Reviewed By: jianyuh
Differential Revision: D16190534
fbshipit-source-id: daebd7b6d1846921232f8391c816e2f0678d813f
---
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/104
For consistency, we always assume that weights passed to PackWeightsForConv are in the format K R S C/G, which is the same as G K/G R S C/G.
cc: Huihan Liu: Please note this change.
Reviewed By: jianyuh
Differential Revision: D16186932
fbshipit-source-id: 9ca2562f213d6b296ef8bd2eca1e5b6e98c436ec
---
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/103
In the same spirit as D16085552, we do the following in this Diff:
- Refactor the pack/unpack code for PackB: use the same ```pack_unpack_``` function for both ```pack``` and ```unpack```.
- Add a unit test.
Reviewed By: dskhudia
Differential Revision: D16160767
fbshipit-source-id: 7fb7006750537b0705a180f2014c786298a1c615
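A minimal sketch of the shared-routine idea (names hypothetical): the layout index math lives in one function and a boolean merely flips the copy direction, so pack and unpack cannot drift apart.
```
#include <cstdint>

// Hypothetical pack/unpack core: `a` is always the source, `b` the
// destination; `ispack` selects which side uses the packed index.
void pack_unpack_core(const std::int8_t* a, std::int8_t* b, int n,
                      bool ispack) {
  for (int i = 0; i < n; ++i) {
    int p = i;  // stand-in for the real layout-specific index math
    if (ispack) {
      b[p] = a[i];  // pack: plain layout -> packed layout
    } else {
      b[i] = a[p];  // unpack: packed layout -> plain layout
    }
  }
}
```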
---
Summary: unpack weight for 3x3 depthwise and 3x3x3 depthwise convolutions.
Reviewed By: jspark1105
Differential Revision: D16076463
fbshipit-source-id: 767749c1a10caefef4c76c2c51323d1a3041621a
---
Summary: Implement ::unpack() for PackWeightMatrixForGConv. Unpack index calculation is the inverse of ::pack().
Reviewed By: dskhudia
Differential Revision: D16085552
fbshipit-source-id: b8866365dc425fee2cb985b3e48c627198ebc29a
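A toy illustration of that inverse relationship (the real index math in PackWeightMatrixForGConv is more involved): if pack() sends element (r, c) to position c*rows + r, then unpack() uses the same formula with source and destination swapped.
```
// Toy pack = row-major to column-major; unpack applies the exact inverse
// index mapping, so unpack(pack(x)) == x.
void packToy(const float* src, float* dst, int rows, int cols) {
  for (int r = 0; r < rows; ++r)
    for (int c = 0; c < cols; ++c)
      dst[c * rows + r] = src[r * cols + c];
}
void unpackToy(const float* src, float* dst, int rows, int cols) {
  for (int r = 0; r < rows; ++r)
    for (int c = 0; c < cols; ++c)
      dst[r * cols + c] = src[c * rows + r];
}
```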
---
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/102
The Avx512 and Avx2 branches can be merged.
Reviewed By: dskhudia
Differential Revision: D16068952
fbshipit-source-id: b39beb32e80dc168d0c17db9dff8a67bb0fe976f
---
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/101
Some code cleanup:
- Both ```leadingDimCReg``` and ```leadingDimCRegAssign``` are used in ```GenerateKernelU8S8S32ACC32.cc```. We unify them to use one variable name.
- Remove the redundant register variable ```asmjit::X86Ymm tmpReg = x86::ymm14;```.
Reviewed By: dskhudia
Differential Revision: D15673269
fbshipit-source-id: 81eb3673d0ff97391557413a13f1972561a1f2db
---
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/100
The test fails in some cases depending on what random values get generated. See the comment in the diff for why it fails.
Reviewed By: jspark1105
Differential Revision: D15954045
fbshipit-source-id: d128ab7fa61f1b3210274120ac8f1e14c998f063
---
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/99
A function to do per-channel and groupwise quantization.
Reviewed By: jspark1105
Differential Revision: D15567272
fbshipit-source-id: e2f326ea7c7463b5c47b3f590e003344a9e41960
---
Summary: Add the check on NR_MIN and fix ymm/zmm register checks.
Reviewed By: dskhudia
Differential Revision: D15772144
fbshipit-source-id: 11e2c67fb3d47c5570b38ceaf9828ced0e60e65b
---
Summary: Same as title. We were only printing the packed matrix for group 0.
Reviewed By: jianyuh
Differential Revision: D15775235
fbshipit-source-id: 747550c9ae229a2eeb912409897c1331ada81e2b
---
Summary:
Delete a duplicated header.
Remove #ifndef include guards and replace them with #pragma once.
Reviewed By: jianyuh
Differential Revision: D15669744
fbshipit-source-id: 8895f6c867e626ac5813a8952837435e76b09370
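The mechanical change, sketched on an illustrative header (the macro name is made up):
```
// Before: classic include guard.
#ifndef FBGEMM_EXAMPLE_H_
#define FBGEMM_EXAMPLE_H_
// ... declarations ...
#endif  // FBGEMM_EXAMPLE_H_

// After: one directive, no macro name to keep unique or in sync.
#pragma once
// ... declarations ...
```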
---
Summary: We want to combine three different convolution interfaces under one top-level function.
Reviewed By: protonu
Differential Revision: D15399811
fbshipit-source-id: 7390616d92783506fc156f0f6017f10b5f7f8e30
---
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/97
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20721
- FBGEMM: Add an unpack function to the PackBMatrix class: unpack the pmat buffer to origin_buf (used for serialization to recover the weight matrix).
- PyTorch Quantizer: Add quantized::fbgemm_linear_unpack operator for serialization.
Reviewed By: zafartahirov
Differential Revision: D15314568
fbshipit-source-id: 12080c8887ce31dc849d23e132ae1766ac319407
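A hedged round-trip sketch of how the new unpack supports serialization (constructor arguments abbreviated; see the FBGEMM headers for the exact signatures):
```
#include <cstddef>
#include <cstdint>
#include <vector>
#include "fbgemm/Fbgemm.h"

void serializeWeights(const std::int8_t* B, int K, int N) {
  // Pack B once, as done when the weights are first prepared.
  fbgemm::PackBMatrix<std::int8_t> packedB(
      fbgemm::matrix_op_t::NoTranspose, K, N, B, /*ld=*/N);
  // To serialize, recover the original row-major buffer from the packed one.
  std::vector<std::int8_t> recovered(static_cast<std::size_t>(K) * N);
  packedB.unpack(recovered.data());
  // `recovered` now matches `B` and can be written to disk.
}
```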
---
Summary: Adding this flag makes up for the perf difference seen with the CMake build system.
Reviewed By: dskhudia
Differential Revision: D15377782
fbshipit-source-id: cf5308ff2b5d8d42ac57b555a94d845268a857c6
---
Summary: Fix contbuild failure.
Reviewed By: hx89
Differential Revision: D15487798
fbshipit-source-id: 9d766cd623a79f8ab65b9af1506ae5f37869232f
---
Summary: Remove the extra line in ifdef block for kernel logging.
Reviewed By: jianyuh
Differential Revision: D15483193
fbshipit-source-id: 8ee25b07ab0a45e6f3d366876241599c87ab0c2d
---
Summary: Readme update for submodules
Reviewed By: protonu
Differential Revision: D15345910
fbshipit-source-id: 1dfe4f9ae602f4b3801064a1bb68a506c4d954cf
---
Summary: Fix compiler warnings for uninitialized MR, NCB, KCB.
Reviewed By: dskhudia
Differential Revision: D15362047
fbshipit-source-id: 57428f0610c8c12f9ff1f07fe8e472e5ff56bc82
---
Summary: Fixing a mistake in config
Reviewed By: jianyuh
Differential Revision: D15343772
fbshipit-source-id: 738866ed71590f606c3902a5925e012812820031
---
Summary: Check out submodules in the CircleCI build.
Reviewed By: jianyuh
Differential Revision: D15342627
fbshipit-source-id: 02b92497dff941c535047c0073399d4cfcfdccee
---
Summary: Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/96
Reviewed By: jianyuh
Differential Revision: D15336047
Pulled By: dskhudia
fbshipit-source-id: 93435ba920baa3a712c5741e60c479901c95115d
---
Summary: Original commit changeset: 9a33573ba34b
Reviewed By: jianyuh
Differential Revision: D15320950
fbshipit-source-id: f6501b57346cc5e82fa2198dcf6b60b26cd4f7c6
---
Summary:
I created a pull request for #87. I tend to do a lot of hacking without an internet connection, and it is nice to have the required library available offline. I also get a cryptic error message when I build PyTorch without an internet connection because these modules aren't available.
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/95
Reviewed By: jianyuh
Differential Revision: D15299133
Pulled By: dskhudia
fbshipit-source-id: 6cf9ed47482eceee5f0444a8361720e0cfe25a13
---
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/94
If we don't call cpuinfo_initialize beforehand, fbgemmHasAvx2Support/fbgemmHasAvx512Support will always return false. We should be really careful about this.
Reviewed By: jianyuh
Differential Revision: D14994129
fbshipit-source-id: b78028f0543d05595caaa627be2feb743d0694b1
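A hedged sketch of the safe call order (`cpuinfo_initialize` comes from the cpuinfo dependency; `fbgemmHasAvx2Support` is declared in fbgemm/Utils.h):
```
#include <cpuinfo.h>
#include "fbgemm/Utils.h"

bool canUseAvx2Path() {
  // Without this call, the feature query below reads uninitialized state
  // and reports no AVX2/AVX-512 support.
  if (!cpuinfo_initialize()) {
    return false;  // CPU detection itself failed
  }
  return fbgemm::fbgemmHasAvx2Support();
}
```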
---
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/73
Skip computing row_offset if B uses symmetric quantization. Skip adding col_offset if A uses symmetric quantization.
Reviewed By: jianyuh
Differential Revision: D14055973
fbshipit-source-id: 91da8f0755b2f90175e94a893b5a3ad6342c506d
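Spelling out the reasoning with the standard expansion (z_A and z_B are the zero points of A and B; K is the inner dimension): a symmetric B means z_B = 0, which removes the row_offset term, and a symmetric A means z_A = 0, which removes the col_offset term.
```
C_{ij} = \sum_k (A_{ik} - z_A)(B_{kj} - z_B)
       = \sum_k A_{ik} B_{kj}
         - z_B \sum_k A_{ik}   % = z_B \cdot row_offset(A)_i
         - z_A \sum_k B_{kj}   % = z_A \cdot col_offset(B)_j
         + K z_A z_B
```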
---
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/90
Exposing tuning parameters in FBGEMM (MCB, NCB, KCB, MR, NR, Row Interleave)
Reviewed By: dskhudia
Differential Revision: D14358148
fbshipit-source-id: 783fb4653fd696dbbd4075ad56cb8682db3011a5
---
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/91
We were getting timeouts in tests
Reviewed By: protonu
Differential Revision: D14707740
fbshipit-source-id: abfdd81531f85140fe511c47db64b91efd155a9e
---
Summary: Packing B documentation
Reviewed By: jianyuh
Differential Revision: D14579163
fbshipit-source-id: e18cb1eea56024fbe54f654b15ca79d10c42e17c
---
Summary: In D14507536 and D14516232, small-N cases suffered when we increased NR. This fixes those cases.
Reviewed By: jianyuh
Differential Revision: D14529494
fbshipit-source-id: 6f53797948de760d6ed24b767cbbe8d27768660f
---
Summary: Instead of loading B matrix values with every vpmaddubsw instruction, load once and reuse. The downside is that we need some registers to hold these B matrix values, registers that could otherwise have been used for C accumulations.
Reviewed By: jianyuh
Differential Revision: D14529495
fbshipit-source-id: 54bd4bcdcf14ac2f25a433ac60bfc08b7359453f
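A hedged intrinsics analogy of the trade-off (the actual change is in the asmjit-generated kernel; this is illustrative only):
```
#include <cstdint>
#include <immintrin.h>

// Load one 32-byte chunk of B into `vb` once and reuse it across several
// multiply-adds, instead of reloading before each vpmaddubsw. The cost:
// `vb` pins a ymm register that could otherwise hold another C tile.
void maddubsReuse(const std::uint8_t* a, const std::int8_t* b,
                  __m256i* c, int m) {
  const __m256i vb =
      _mm256_loadu_si256(reinterpret_cast<const __m256i*>(b));  // load once
  for (int i = 0; i < m; ++i) {
    const __m256i va = _mm256_loadu_si256(
        reinterpret_cast<const __m256i*>(a + 32 * i));
    // vpmaddubsw: u8 x s8 -> horizontally-added s16 pairs, then a
    // saturating s16 accumulate into the C tile.
    c[i] = _mm256_adds_epi16(c[i], _mm256_maddubs_epi16(va, vb));
  }
}
```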