github.com/llvm/llvm-project.git
author: spupyrev <spupyrev@fb.com> 2022-07-08 20:14:26 +0300
committer: spupyrev <spupyrev@fb.com> 2022-07-11 19:31:52 +0300
commit: 7228371054746fd37a729b7f7f72f4689b68e890
tree: 2b2fb647fbc72cb328fa8f45d0fd5173954f210f /bolt
parent: 76029cc53e838e6d86b13b0c39152f474fb09263
[BOLT] Do not merge cold and hot chains of basic blocks

The ext-tsp block reordering has a post-processing step that merges some blocks into chains. This maintains the original block order in the absence of profile data and can be beneficial for code size (when fallthroughs are merged). In the earlier version, hot chains could be merged with cold chains (those with zero execution count), and these were later split again by SplitFunction.cpp (when split-all-cold=1). This diff eliminates the redundant merging.

The change is unlikely to affect the performance of a binary in a measurable way, as it mostly operates on cold basic blocks. However, after the diff the impact of split-all-cold is almost negligible, so the extra function splitting can be avoided.

Measuring on the clang binary (negative is good, positive is a regression):

**clang12**
benchmark1: `0.0253`
benchmark2: `-0.1843`
benchmark3: `0.3234`
benchmark4: `0.0333`

**clang10**
benchmark1: `-0.2517`
benchmark2: `-0.3703`
benchmark3: `-0.1186`
benchmark4: `-0.3822`

**clang7**
benchmark1: `0.2526`
benchmark2: `0.0500`
benchmark3: `0.3024`
benchmark4: `-0.0489`

**Overall**: `-0.0671 ± 0.1172` (insignificant)

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D129397

Diffstat (limited to 'bolt')
-rw-r--r-- bolt/lib/Passes/ExtTSPReorderAlgorithm.cpp | 11
1 files changed, 8 insertions, 3 deletions
diff --git a/bolt/lib/Passes/ExtTSPReorderAlgorithm.cpp b/bolt/lib/Passes/ExtTSPReorderAlgorithm.cpp
index 7281d8290f26..5435b2313bbc 100644
--- a/bolt/lib/Passes/ExtTSPReorderAlgorithm.cpp
+++ b/bolt/lib/Passes/ExtTSPReorderAlgorithm.cpp
@@ -642,20 +642,25 @@ private:
     }
   }
 
-  /// Merge cold blocks to reduce code size
+  /// Merge remaining blocks into chains w/o taking jump counts into
+  /// consideration. This allows to maintain the original block order in the
+  /// absense of profile data
   void mergeColdChains() {
     for (BinaryBasicBlock *SrcBB : BF.layout()) {
       // Iterating in reverse order to make sure original fallthrough jumps are
-      // merged first
+      // merged first; this might be beneficial for code size.
       for (auto Itr = SrcBB->succ_rbegin(); Itr != SrcBB->succ_rend(); ++Itr) {
         BinaryBasicBlock *DstBB = *Itr;
         size_t SrcIndex = SrcBB->getLayoutIndex();
         size_t DstIndex = DstBB->getLayoutIndex();
         Chain *SrcChain = AllBlocks[SrcIndex].CurChain;
         Chain *DstChain = AllBlocks[DstIndex].CurChain;
+        bool IsColdSrc = SrcChain->executionCount() == 0;
+        bool IsColdDst = DstChain->executionCount() == 0;
         if (SrcChain != DstChain && !DstChain->isEntryPoint() &&
             SrcChain->blocks().back()->Index == SrcIndex &&
-            DstChain->blocks().front()->Index == DstIndex)
+            DstChain->blocks().front()->Index == DstIndex &&
+            IsColdSrc == IsColdDst)
           mergeChains(SrcChain, DstChain, 0, MergeTypeTy::X_Y);
       }
     }
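
The merge condition in this hunk can be read in isolation as a predicate over two chains. The following is a minimal standalone sketch of that predicate, not the BOLT implementation: `SimpleChain` and `canMergeFallthrough` are hypothetical names introduced here for illustration, and the struct keeps only the three properties the condition inspects (block order, execution count, entry-point flag).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for the Chain class used in
// ExtTSPReorderAlgorithm.cpp; it holds only what the check needs.
struct SimpleChain {
  std::vector<size_t> Blocks;  // layout indices of the blocks, in chain order
  uint64_t ExecutionCount = 0; // zero means the chain is cold
  bool IsEntryPoint = false;   // true if the chain holds the function entry
};

// Returns true if Dst may be appended to Src along the edge
// SrcIndex -> DstIndex, mirroring the condition in the diff:
// distinct chains, Dst is not the entry chain, the edge connects the tail
// of Src to the head of Dst, and both chains are hot or both are cold.
bool canMergeFallthrough(const SimpleChain &Src, const SimpleChain &Dst,
                         size_t SrcIndex, size_t DstIndex) {
  assert(!Src.Blocks.empty() && !Dst.Blocks.empty() &&
         "chains must contain at least one block");
  bool IsColdSrc = Src.ExecutionCount == 0;
  bool IsColdDst = Dst.ExecutionCount == 0;
  return &Src != &Dst && !Dst.IsEntryPoint &&
         Src.Blocks.back() == SrcIndex &&
         Dst.Blocks.front() == DstIndex &&
         IsColdSrc == IsColdDst;
}
```

Under these assumptions, a hot chain ending in block 1 and a cold chain starting at block 2 are no longer merged, while two cold chains (or two hot chains) connected tail to head still are. Comparing the two flags for equality, rather than requiring both to be cold, keeps the original-order merging for hot chains as well.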