field | value | date |
---|---|---|
author | Jongsoo Park <jongsoo@fb.com> | 2019-09-30 20:19:48 +0300 |
committer | Facebook Github Bot <facebook-github-bot@users.noreply.github.com> | 2019-09-30 20:25:42 +0300 |
commit | 0d7da7c36f50276b5a550d46508516d139522687 | |
tree | 58650c60ed36f4da04cece487cf50f16f039cecc /src/ExecuteKernelU8S8.cc | |
parent | 7dfeddb5ba976f47471275b2468909dfd9b577e1 | |
fp16 gemm using avx512 (#135)
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/135
fp16 GEMM was not using avx512, so it fell behind fp32 performance for large m cases.
This diff enables avx512 for fp16 GEMM. The register blocking size may need further tuning.
Longer term, we would also need to use JIT'ing for fp16.
Reviewed By: dskhudia
Differential Revision: D17623727
fbshipit-source-id: 6605bcecf391141c457f257415b7ffb30d68fb29
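
The summary above mentions two ideas that a small sketch may make more concrete: choosing a wider vector path when avx512 is available, and the register blocking size that goes with it. The code below is not the code from this commit and does not use FBGEMM's API. It is a portable scalar illustration under stated assumptions: A is fp32, B holds raw fp16 bit patterns, all matrices are row-major, and the names `half_to_float`, `gemm_fp16_blocked`, `gemm_fp16`, `has_avx512`, `kLanes`, and `kRows` are hypothetical. The lane counts 8 and 16 stand for the number of fp32 values in a 256-bit and a 512-bit register; the row counts are arbitrary placeholders for a tuned blocking size.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <limits>

// Convert one IEEE 754 half-precision bit pattern to float (hypothetical helper).
inline float half_to_float(std::uint16_t h) {
  const int sign = (h >> 15) & 0x1;
  const int exp = (h >> 10) & 0x1f;
  const int mant = h & 0x3ff;
  float v;
  if (exp == 0) {
    v = std::ldexp(static_cast<float>(mant), -24);  // zero or subnormal
  } else if (exp == 31) {
    v = mant ? std::numeric_limits<float>::quiet_NaN()
             : std::numeric_limits<float>::infinity();
  } else {
    v = std::ldexp(static_cast<float>(mant | 0x400), exp - 25);  // normal
  }
  return sign ? -v : v;
}

// C[m x n] += A[m x k] * B[k x n], with B stored as fp16 bits.
// acc[kRows][kLanes] models the block of accumulators that a vectorized
// kernel would keep in registers: kRows registers of kLanes floats each.
template <int kLanes, int kRows>
void gemm_fp16_blocked(int m, int n, int k, const float* A,
                       const std::uint16_t* B, float* C) {
  for (int i0 = 0; i0 < m; i0 += kRows) {
    const int rows = std::min(kRows, m - i0);
    for (int j0 = 0; j0 < n; j0 += kLanes) {
      const int lanes = std::min(kLanes, n - j0);
      float acc[kRows][kLanes] = {};
      for (int p = 0; p < k; ++p) {
        float b[kLanes];
        // Widen one row segment of B from fp16 to fp32 once per k step.
        for (int j = 0; j < lanes; ++j) {
          b[j] = half_to_float(B[p * n + j0 + j]);
        }
        // Broadcast one A element per row and accumulate across the lanes.
        for (int i = 0; i < rows; ++i) {
          const float a = A[(i0 + i) * k + p];
          for (int j = 0; j < lanes; ++j) acc[i][j] += a * b[j];
        }
      }
      for (int i = 0; i < rows; ++i) {
        for (int j = 0; j < lanes; ++j) C[(i0 + i) * n + j0 + j] += acc[i][j];
      }
    }
  }
}

// Hypothetical dispatcher: the caller supplies has_avx512 (for example from a
// CPU feature query) and the wider blocking is selected when it is true.
inline void gemm_fp16(bool has_avx512, int m, int n, int k, const float* A,
                      const std::uint16_t* B, float* C) {
  if (has_avx512) {
    gemm_fp16_blocked<16, 4>(m, n, k, A, B, C);  // assumed 512-bit blocking
  } else {
    gemm_fp16_blocked<8, 4>(m, n, k, A, B, C);   // assumed 256-bit blocking
  }
}
```

Both paths compute the same result; only the accumulator block shape differs. In a real vectorized kernel, `kRows` times the number of registers per row has to fit in the register file together with the broadcast and load registers, which is presumably why the summary notes that the blocking size may need further tuning for avx512.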
Diffstat (limited to 'src/ExecuteKernelU8S8.cc')
0 files changed, 0 insertions, 0 deletions