author | Jianyu Huang <jianyuhuang@fb.com> | 2018-12-21 22:24:53 +0300 |
---|---|---|
committer | Facebook Github Bot <facebook-github-bot@users.noreply.github.com> | 2018-12-21 22:32:19 +0300 |
commit | bd35fce789f33bbb026617b1dff722d173586951 (patch) | |
tree | 375c0a2828fe118bbf27b95ddb5f7d0cdc7e5904 | |
parent | 4691d5bcb0756b69baf4f54e45d42ba75d133464 (diff) |
Update the profiling format for Acc32 Benchmark (#50)
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/50
Before this DIFF:
M, N, K, Packing (ms), Kernel (ms), Postprocessing (ms), Total (ms), GOPs
3136, 256, 64, MKL_fp32, 64.5
0.1, 1.3, 0.3, 1.8, 3136, 256, 64, FBGEMM_i8_acc32, 55.7
3136, 64, 64, MKL_fp32, 54.9
0.1, 0.3, 0.1, 0.5, 3136, 64, 64, FBGEMM_i8_acc32, 50.7
3136, 64, 576, MKL_fp32, 60.9
0.4, 2.7, 0.1, 3.3, 3136, 64, 576, FBGEMM_i8_acc32, 70.3
...
After this DIFF:
M, N, K, Packing (ms), Kernel (ms), Postprocessing (ms), Total (ms), GOPs
3136, 256, 64, MKL_fp32, 62.4
3136, 256, 64, 0.1, 1.3, 0.3, 1.8, FBGEMM_i8_acc32, 54.8
3136, 64, 64, MKL_fp32, 49.4
3136, 64, 64, 0.1, 0.3, 0.1, 0.5, FBGEMM_i8_acc32, 46.3
3136, 64, 576, MKL_fp32, 65.6
3136, 64, 576, 0.4, 2.7, 0.1, 3.3, FBGEMM_i8_acc32, 70.0
...
Reviewed By: dskhudia
Differential Revision: D13531989
fbshipit-source-id: 267b8aea76bd11cd0aedec05b2f9b1ae75c10779
-rw-r--r-- | bench/PackedRequantizeAcc32Benchmark.cc | 6 |
1 files changed, 4 insertions, 2 deletions
diff --git a/bench/PackedRequantizeAcc32Benchmark.cc b/bench/PackedRequantizeAcc32Benchmark.cc
index 5096475..56066a0 100644
--- a/bench/PackedRequantizeAcc32Benchmark.cc
+++ b/bench/PackedRequantizeAcc32Benchmark.cc
@@ -220,6 +220,9 @@ void performance_test() {
       double total_postprocessing_time = 0.0;
       double total_run_time = 0.0;
 #endif
+      cout << setw(6) << m << ", " << setw(6) << n << ", " << setw(6) << k
+           << ", ";
+
       for (auto i = 0; i < NWARMUP + NITER; ++i) {
 #ifdef FBGEMM_MEASURE_TIME_BREAKDOWN
         packing_time = 0.0;
@@ -308,8 +311,7 @@ void performance_test() {
              << total_postprocessing_time / (double)NITER / 1e6 << ", "
              << total_run_time / (double)NITER / 1e6 << ", ";
 #endif
-      cout << setw(6) << m << ", " << setw(6) << n << ", " << setw(6) << k << ", "
-           << setw(16) << runType << ", " << setw(5) << fixed << setw(5)
+      cout << setw(16) << runType << ", " << setw(5) << fixed << setw(5)
            << setprecision(1) << nops / ttot << endl;
       cout << endl;
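
The column reordering in this patch can be illustrated with a small standalone sketch. The helper `format_row` and its parameters are hypothetical and are not part of FBGEMM; the sketch only mirrors the order in which the patched benchmark emits its fields (shape first, then the optional time breakdown, then run type and GOPs). The formatting of the breakdown columns is an assumption based on the sample output in the commit message.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (not FBGEMM code): builds one benchmark row using the
// column order produced after this commit.
std::string format_row(int m, int n, int k, const std::string& run_type,
                       double gops, bool with_breakdown = false,
                       double packing_ms = 0.0, double kernel_ms = 0.0,
                       double postprocessing_ms = 0.0,
                       double total_ms = 0.0) {
  std::ostringstream out;
  out << std::fixed << std::setprecision(1);

  // 1. Problem shape is written first, so every row starts with M, N, K.
  out << std::setw(6) << m << ", " << std::setw(6) << n << ", "
      << std::setw(6) << k << ", ";

  // 2. Optional per-phase timings (corresponds to the section guarded by
  //    FBGEMM_MEASURE_TIME_BREAKDOWN in the benchmark). Assumed unpadded.
  if (with_breakdown) {
    out << packing_ms << ", " << kernel_ms << ", " << postprocessing_ms
        << ", " << total_ms << ", ";
  }

  // 3. Run type and throughput close the row.
  out << std::setw(16) << run_type << ", " << std::setw(5) << gops;
  return out.str();
}
```

Calling `format_row(3136, 256, 64, "FBGEMM_i8_acc32", 54.8, true, 0.1, 1.3, 0.3, 1.8)` yields a row whose first three fields are the shape, so it lines up with the `MKL_fp32` row for the same problem size.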