From e754dd7e8be86e1adc9d4d13fb1105b848c11752 Mon Sep 17 00:00:00 2001
From: Leo Yan
Date: Thu, 11 Aug 2022 14:24:51 +0800
Subject: perf c2c: Update documentation for new display option 'peer'

Since the new display option 'peer' is introduced, this patch is to
update the documentation to reflect it.

Reviewed-by: Ali Saidi
Signed-off-by: Leo Yan
Acked-by: Ian Rogers
Cc: Adrian Hunter
Cc: Alexander Shishkin
Cc: Anshuman Khandual
Cc: German Gomez
Cc: Gustavo A. R. Silva
Cc: Ingo Molnar
Cc: James Clark
Cc: Jiri Olsa
Cc: John Garry
Cc: Kajol Jain
Cc: Like Xu
Cc: Mark Rutland
Cc: Mike Leach
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Timothy Hayes
Cc: Will Deacon
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-16-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/Documentation/perf-c2c.txt | 31 ++++++++++++++++++++++++-------
 1 file changed, 24 insertions(+), 7 deletions(-)

diff --git a/tools/perf/Documentation/perf-c2c.txt b/tools/perf/Documentation/perf-c2c.txt
index 6f69173731aa..f1f7ae6b08d1 100644
--- a/tools/perf/Documentation/perf-c2c.txt
+++ b/tools/perf/Documentation/perf-c2c.txt
@@ -109,7 +109,9 @@ REPORT OPTIONS

 -d::
 --display::
-	Switch to HITM type (rmt, lcl) to display and sort on. Total HITMs as default.
+	Switch to HITM type (rmt, lcl) or peer snooping type (peer) to display
+	and sort on. Total HITMs (tot) as default, except Arm64 uses peer mode
+	as default.

 --stitch-lbr::
 	Show callgraph with stitched LBRs, which may have more complete
@@ -174,12 +176,18 @@ For each cacheline in the 1) list we display following data:
   Cacheline
   - cacheline address (hex number)

-  Rmt/Lcl Hitm
+  Rmt/Lcl Hitm (Display with HITM types)
   - cacheline percentage of all Remote/Local HITM accesses

-  LLC Load Hitm - Total, LclHitm, RmtHitm
+  Peer Snoop (Display with peer type)
+  - cacheline percentage of all peer accesses
+
+  LLC Load Hitm - Total, LclHitm, RmtHitm (For display with HITM types)
   - count of Total/Local/Remote load HITMs

+  Load Peer - Total, Local, Remote (For display with peer type)
+  - count of Total/Local/Remote load from peer cache or DRAM
+
   Total records
   - sum of all cachelines accesses

@@ -201,16 +209,21 @@ For each cacheline in the 1) list we display following data:
   - count of LLC load accesses, includes LLC hits and LLC HITMs

   RMT Load Hit - RmtHit, RmtHitm
-  - count of remote load accesses, includes remote hits and remote HITMs
+  - count of remote load accesses, includes remote hits and remote HITMs;
+    on Arm neoverse cores, RmtHit is used to account remote accesses,
+    includes remote DRAM or any upward cache level in remote node

   Load Dram - Lcl, Rmt
   - count of local and remote DRAM accesses

 For each offset in the 2) list we display following data:

-  HITM - Rmt, Lcl
+  HITM - Rmt, Lcl (Display with HITM types)
   - % of Remote/Local HITM accesses for given offset within cacheline

+  Peer Snoop - Rmt, Lcl (Display with peer type)
+  - % of Remote/Local peer accesses for given offset within cacheline
+
   Store Refs - L1 Hit, L1 Miss, N/A
   - % of store accesses that hit L1, missed L1 and N/A (no available) memory
     level for given offset within cacheline
@@ -227,9 +240,12 @@ For each offset in the 2) list we display following data:
   Code address
   - code address responsible for the accesses

-  cycles - rmt hitm, lcl hitm, load
+  cycles - rmt hitm, lcl hitm, load (Display with HITM types)
   - sum of cycles for given accesses - Remote/Local HITM and generic load

+  cycles - rmt peer, lcl peer, load (Display with peer type)
+  - sum of cycles for given accesses - Remote/Local peer load and generic load
+
   cpu cnt
   - number of cpus that participated on the access

@@ -251,7 +267,8 @@ The 'Node' field displays nodes that accesses given cacheline offset.
 Its output comes in 3 flavors:
   - node IDs separated by ','
   - node IDs with stats for each ID, in following format:
-      Node{cpus %hitms %stores}
+      Node{cpus %hitms %stores} (Display with HITM types)
+      Node{cpus %peers %stores} (Display with peer type)
   - node IDs with list of affected CPUs in following format:
       Node{cpu list}

--
cgit v1.2.3
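
Illustration only, not part of the patch: a minimal Python sketch of the
display-type rule the updated text describes (tot by default, peer by default
on Arm64). The names DISPLAY_TYPES, default_display, resolve_display and
node_stats_format are hypothetical and do not reflect how perf implements the
option; detecting Arm64 through platform.machine() is likewise an assumption.

```python
import platform

# Display types named in the documentation text: total HITMs, remote HITMs,
# local HITMs, and the new peer snooping type.
DISPLAY_TYPES = ("tot", "rmt", "lcl", "peer")


def default_display(machine=None):
    """Return the documented default display type for a machine string.

    Assumption: Arm64 is recognised by the strings "aarch64" or "arm64".
    """
    if machine is None:
        machine = platform.machine()
    return "peer" if machine.lower() in ("aarch64", "arm64") else "tot"


def resolve_display(option=None, machine=None):
    """Pick the display type: an explicit option wins, else the default."""
    if option is None:
        return default_display(machine)
    if option not in DISPLAY_TYPES:
        raise ValueError(f"unknown display type: {option!r}")
    return option


def node_stats_format(display):
    """Return the per-node stats format the text gives for a display type."""
    if display == "peer":
        return "Node{cpus %peers %stores}"
    return "Node{cpus %hitms %stores}"


if __name__ == "__main__":
    for machine in ("x86_64", "aarch64"):
        display = resolve_display(machine=machine)
        print(machine, display, node_stats_format(display))
```

Passing the machine string explicitly keeps the sketch testable on any host;
an explicit option such as "rmt" always overrides the per-architecture default.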