github.com/marian-nmt/FBGEMM.git

Age	Commit message	Author
2019-02-15	simple spmdm optimization (#76)	Jongsoo Park
2019-02-14	clean up depthwise conv interface (#72)	Jongsoo Park
2019-02-14	fix bug in group conv + avx512 (#75)	Jongsoo Park
2019-02-14	JIT kernel should only handle a small portion of NCB for the last block: mult...	Jianyu Huang
2019-02-14	Fix PackBMatrix<T, accT>::printPackedMatrix issues	Jianyu Huang
2019-02-13	optimize gconv for b symmetric quantization (#70)	Jongsoo Park
2019-02-13	no need to subtract col offset if a_zp is 0 (#69)	Jongsoo Park
2019-02-13	isZeroPointZero_ -> isAZeroPointZero_ (#71)	Jongsoo Park
2019-02-13	group conv optimized for 16 channels per group (#68)	Jongsoo Park
2019-02-02	gconv optimized for 8 channels per group (#65)	Jongsoo Park
2019-02-02	minor optimization in handling zero points for row offset (#63)	Jongsoo Park
2019-02-01	make G slowest moving dim of packed weight of gconv (#62)	Jongsoo Park
2019-02-01	more careful about movaps (#60)	Jongsoo Park
2019-02-01	specialized requantization for gconv (#61)	Jongsoo Park
2019-01-31	optimize requantization remainder (#64)	Jongsoo Park
2019-01-31	Add threading for FBGEMM FP16	Jianyu Huang
2019-01-15	mac build fix (#58)	Daya S Khudia
2019-01-14	Groupwise direct convolution when number of channels per group is small	Daya S Khudia
2019-01-12	3x3x3 depthwise convolution with per channel quantization (#15775)	Jongsoo Park
2019-01-03	fix shared lib build	Daya S Khudia
2019-01-02	Fix a bug in FbgemmFP16 (#52)	Feiteng
2018-12-21	Update with clang format (#51)	Jianyu Huang
2018-12-11	instantiate more kernels for PackAmatrix (#47)	Jongsoo Park
2018-12-06	Fix duplicate symbols for thread local member variables (#43)	James Reed
2018-12-06	Add missing <algorithm> include (#42)	James Reed
2018-12-06	Final cleanup for avx2 isolation and consistent file names (#40)	Daya S Khudia
2018-12-06	avx2 intrinsic separation from OutputProcessing-inl.h (#38)	Daya S Khudia
2018-12-06	Separate out avx2 code from dense x sparse matrix multiplication (#39)	Daya S Khudia
2018-12-06	File name change for FbgemmI8Depthwise.h and FbgemmI8Depthwise.cc (#14725)	Daya S Khudia
2018-12-06	remove usage of c++ stdlib templates from FbgemmI8Depthwise (#37)	Daya S Khudia
2018-12-05	clean up PackAWithQuantRowOffset from avx2 intrinsics (#36)	Daya S Khudia
2018-12-05	Move avx2 intrinsics from PackAWithIm2Col (#35)	Daya S Khudia
2018-12-05	Removed avx2 code from PackAWithRowOffset.cc (#34)	Daya S Khudia
2018-12-05	avx2 specific code in a separate file for QuantUtils (#29)	Daya S Khudia
2018-12-05	Move avx2 specific code in different source files (#28)	Daya S Khudia
2018-12-01	Fix a bug in conv_ref	Jianyu Huang
2018-11-30	Only export symbols that are required while building shared library	Daya S Khudia
2018-11-29	sparse convolution output processing (#27)	Jongsoo Park
2018-11-27	per-group and per-channel quantization (#14340)	Jongsoo Park
2018-11-27	fix group convention in B packing (#26)	Jongsoo Park
2018-11-26	remove unnecessary zero_point argument from constructors (#14323)	Jongsoo Park
2018-11-26	minimize code compiled with avx2 and header includes from them (#14313)	Jongsoo Park
2018-11-23	parallelization over groups (#23)	Jongsoo Park
2018-11-22	adding quantization utility functions (#19)	Jongsoo Park
2018-11-22	use avx512 packing trait in PackWithQuantRowOffset (#20)	Jongsoo Park
2018-11-22	Unify the PackA file names (#21)	Jianyu Huang
2018-11-21	Fix assert failure	Jianyu Huang
2018-11-20	Optimize parallelization performance (#15)	Jianyu Huang
2018-11-20	Simple parallelism, add -openmp flags and omp parallel for Acc16/32 Unit Test...	Jianyu Huang
2018-11-20	A function to check if we are running on a fbgemm supported cpu (#13)	Daya S Khudia