github.com/marian-nmt/FBGEMM.git - Unnamed repository; edit this file 'description' to name the repository.
path: root/src
Age | Commit message | Author
2019-02-15 | simple spmdm optimization (#76) | Jongsoo Park
2019-02-14 | clean up depthwise conv interface (#72) | Jongsoo Park
2019-02-14 | fix bug in group conv + avx512 (#75) | Jongsoo Park
2019-02-14 | JIT kernel should only handle a small portion of NCB for the last block: mult... | Jianyu Huang
2019-02-14 | Fix PackBMatrix<T, accT>::printPackedMatrix issues | Jianyu Huang
2019-02-13 | optimize gconv for b symmetric quantization (#70) | Jongsoo Park
2019-02-13 | no need to subtract col offset if a_zp is 0 (#69) | Jongsoo Park
2019-02-13 | isZeroPointZero_ -> isAZeroPointZero_ (#71) | Jongsoo Park
2019-02-13 | group conv optimized for 16 channels per group (#68) | Jongsoo Park
2019-02-02 | gconv optimized for 8 channels per group (#65) | Jongsoo Park
2019-02-02 | minor optimization in handling zero points for row offset (#63) | Jongsoo Park
2019-02-01 | make G slowest moving dim of packed weight of gconv (#62) | Jongsoo Park
2019-02-01 | more careful about movaps (#60) | Jongsoo Park
2019-02-01 | specialized requantization for gconv (#61) | Jongsoo Park
2019-01-31 | optimize requantization remainder (#64) | Jongsoo Park
2019-01-31 | Add threading for FBGEMM FP16 | Jianyu Huang
2019-01-15 | mac build fix (#58) | Daya S Khudia
2019-01-14 | Groupwise direct convolution when number of channels per group is small | Daya S Khudia
2019-01-12 | 3x3x3 depthwise convolution with per channel quantization (#15775) | Jongsoo Park
2019-01-03 | fix shared lib build | Daya S Khudia
2019-01-02 | Fix a bug in FbgemmFP16 (#52) | Feiteng
2018-12-21 | Update with clang format (#51) | Jianyu Huang
2018-12-11 | instantiate more kernels for PackAmatrix (#47) | Jongsoo Park
2018-12-06 | Fix duplicate symbols for thread local member variables (#43) | James Reed
2018-12-06 | Add missing <algorithm> include (#42) | James Reed
2018-12-06 | Final cleanup for avx2 isolation and consistent file names (#40) | Daya S Khudia
2018-12-06 | avx2 intrinsic separation from OutputProcessing-inl.h (#38) | Daya S Khudia
2018-12-06 | Separate out avx2 code from dense x sparse matrix multiplication (#39) | Daya S Khudia
2018-12-06 | File name change for FbgemmI8Depthwise.h and FbgemmI8Depthwise.cc (#14725) | Daya S Khudia
2018-12-06 | remove usage of c++ stdlib templates from FbgemmI8Depthwise (#37) | Daya S Khudia
2018-12-05 | clean up PackAWithQuantRowOffset from avx2 intrinsics (#36) | Daya S Khudia
2018-12-05 | Move avx2 intrinsics from PackAWithIm2Col (#35) | Daya S Khudia
2018-12-05 | Removed avx2 code from PackAWithRowOffset.cc (#34) | Daya S Khudia
2018-12-05 | avx2 specific code in a separate file for QuantUtils (#29) | Daya S Khudia
2018-12-05 | Move avx2 specific code in different source files (#28) | Daya S Khudia
2018-12-01 | Fix a bug in conv_ref | Jianyu Huang
2018-11-30 | Only export symbols that are required while building shared library | Daya S Khudia
2018-11-29 | sparse convolution output processing (#27) | Jongsoo Park
2018-11-27 | per-group and per-channel quantization (#14340) | Jongsoo Park
2018-11-27 | fix group convention in B packing (#26) | Jongsoo Park
2018-11-26 | remove unnecessary zero_point argument from constructors (#14323) | Jongsoo Park
2018-11-26 | minimize code compiled with avx2 and header includes from them (#14313) | Jongsoo Park
2018-11-23 | parallelization over groups (#23) | Jongsoo Park
2018-11-22 | adding quantization utility functions (#19) | Jongsoo Park
2018-11-22 | use avx512 packing trait in PackWithQuantRowOffset (#20) | Jongsoo Park
2018-11-22 | Unify the PackA file names (#21) | Jianyu Huang
2018-11-21 | Fix assert failure | Jianyu Huang
2018-11-20 | Optimize parallelization performance (#15) | Jianyu Huang
2018-11-20 | Simple parallelism, add -openmp flags and omp parallel for Acc16/32 Unit Test... | Jianyu Huang
2018-11-20 | A function to check if we are running on a fbgemm supported cpu (#13) | Daya S Khudia