github.com/dotnet/runtime.git
author	Jakob Botsch Nielsen <Jakob.botsch.nielsen@gmail.com>	2021-05-18 00:51:34 +0300
committer	GitHub <noreply@github.com>	2021-05-18 00:51:34 +0300
commit	62df49298ae15a1b7b2bd6d093a6a37da512d1e0 (patch)
tree	145409e1e6721335843df697aeb3f241d06544b1 /src/coreclr/tools
parent	6f5ab2b939ccadef92341affdd089b0f7c003280 (diff)
Add SPGO support and MIBC comparison in dotnet-pgo (#52765)
This allows dotnet-pgo to generate .mibc files using the sample data stored in the trace that it is processing. It implements support for both last branch record (LBR) data and normal IP samples. The latter can be produced with PerfView as normal, while the former currently requires using xperf with LBR mode enabled. For posterity, the following commands can be used (on Windows) to enable both the required .NET events and LBR:

```
xperf.exe -start "NT Kernel Logger" -on LOADER+PROC_THREAD+PMC_PROFILE -MinBuffers 4096 -MaxBuffers 4096 -BufferSize 4096 -pmcprofile BranchInstructionRetired -LastBranch PmcInterrupt -setProfInt BranchInstructionRetired 65537 -start clr -on e13c0d23-ccbc-4e12-931b-d9cc2eee27e4:0x40000A0018:0x5 -MinBuffers 4096 -MaxBuffers 4096 -BufferSize 4096
scenario.exe
xperf.exe -stop "NT Kernel Logger" -stop clr -d xperftrace.etl
```

SPGO does not currently do well with optimized code, as the IP<->IL mappings the JIT produces there are not sufficiently accurate. To collect data in tier-0, two environment variables can be set before running the scenario:

```
$env:COMPlus_TC_QuickJitForLoops=1
$env:COMPlus_TC_CallCounting=0
```

When samples are used, the associated counts will typically not look valid; that is, they will not satisfy flow conservation. To remedy this, dotnet-pgo performs a smoothing step after assigning samples to the flow graph of each method. The smoothing is based on [1] and the code comes from Midori.

The commit adds some new commands to dotnet-pgo. The --spgo flag can be specified with create-mibc to use samples to create the .mibc file. Even if --spgo is specified, instrumented data will still be preferred if it is available in the trace. If --spgo is not specified, the behavior should be the same as before. --spgo-with-block-counts and --spgo-with-edge-counts control whether dotnet-pgo outputs the smoothed block counts, the smoothed edge counts, or both; by default, block counts are output. The JIT can use both forms of counts but will be most happy if only one kind is present for each method. --spgo-min-samples controls how many samples must be in a method before smoothing is applied and the result is included in the .mibc. SPGO is quite sensitive to low sample counts, and the produced results are not good when the number of samples is low. By default, this value is 50.

The commit also adds a new compare-mibc command that allows comparing two .mibc files. Usage: dotnet-pgo compare-mibc --input file1.mibc --input file2.mibc. For example, comparing a .mibc produced via instrumentation and one produced via SPGO (in tier-0) for some JIT benchmarks produces the following:

```
Comparing instrumented.mibc to spgo.mibc

Statistics for instrumented.mibc
# Methods: 3490
# Methods with any profile data: 865
# Methods with 32-bit block counts: 0
# Methods with 64-bit block counts: 865
# Methods with 32-bit edge counts: 0
# Methods with 64-bit edge counts: 0
# Methods with type handle histograms: 184
# Methods with GetLikelyClass data: 0
# Profiled methods in instrumented.mibc not in spgo.mibc: 652

Statistics for spgo.mibc
# Methods: 1107
# Methods with any profile data: 286
# Methods with 32-bit block counts: 286
# Methods with 64-bit block counts: 0
# Methods with 32-bit edge counts: 0
# Methods with 64-bit edge counts: 0
# Methods with type handle histograms: 0
# Methods with GetLikelyClass data: 0
# Profiled methods in spgo.mibc not in instrumented.mibc: 73

Comparison
# Methods with profile data in both .mibc files: 213
  Of these, 213 have matching flow-graphs and the remaining 0 do not

When comparing the flow-graphs of the matching methods, their overlaps break down as follows:
100% █ (1.9%)
>95% █████████████████████████████████▌ (61.0%)
>90% ████████ (14.6%)
>85% ████▏ (7.5%)
>80% ████▋ (8.5%)
>75% █▊ (3.3%)
>70% █ (1.9%)
>65% ▎ (0.5%)
>60% ▎ (0.5%)
>55% ▏ (0.0%)
>50% ▏ (0.0%)
>45% ▏ (0.0%)
>40% ▎ (0.5%)
>35% ▏ (0.0%)
>30% ▏ (0.0%)
>25% ▏ (0.0%)
>20% ▏ (0.0%)
>15% ▏ (0.0%)
>10% ▏ (0.0%)
> 5% ▏ (0.0%)
> 0% ▏ (0.0%)
(using block counts)
```

I also made the dump command print some statistics about the .mibc that was dumped. Hopefully some of this tooling can help track down #51908.

[1] Levin R., Newman I., Haber G. (2008). Complementing Missing and Inaccurate Profiling Using a Minimum Cost Circulation Algorithm. In: Stenström P., Dubois M., Katevenis M., Gupta R., Ungerer T. (eds), High Performance Embedded Architectures and Compilers. HiPEAC 2008. Lecture Notes in Computer Science, vol 4917. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77560-7_20
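The overlap metric behind the histogram above normalizes each method's block counts into a probability distribution and sums the element-wise minimum, so identical profile shapes score 1.0 regardless of absolute sample counts. A standalone sketch of that computation (in Python for illustration; the tool itself implements this in C# over the .mibc schema data):

```python
def profile_overlap(counts1, counts2):
    """Overlap of two block-count profiles keyed by IL offset.

    Each profile is normalized so its counts sum to 1, then the
    element-wise minimum is summed: 1.0 means identical shape,
    0.0 means completely disjoint weight. Assumes both profiles
    cover the same blocks (i.e. matching flow-graphs).
    """
    total1 = sum(counts1.values())
    total2 = sum(counts2.values())
    return sum(
        min(counts1[k] / total1, counts2[k] / total2)
        for k in counts1
    )

# Identical shapes overlap fully even if absolute counts differ.
instrumented = {0x00: 100, 0x10: 300, 0x20: 600}
sampled      = {0x00: 10,  0x10: 30,  0x20: 60}
print(round(profile_overlap(instrumented, sampled), 6))  # → 1.0
```

This is why the comparison only computes overlaps for methods whose flow-graphs match: the element-wise minimum is only meaningful when both profiles weight the same set of blocks.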
Diffstat (limited to 'src/coreclr/tools')
-rw-r--r--  src/coreclr/tools/Common/Pgo/PgoFormat.cs                      2
-rw-r--r--  src/coreclr/tools/Common/TypeSystem/IL/FlowGraph.cs          273
-rw-r--r--  src/coreclr/tools/Common/TypeSystem/IL/ILOpcodeHelper.cs       5
-rw-r--r--  src/coreclr/tools/dotnet-pgo/CommandLineOptions.cs            26
-rw-r--r--  src/coreclr/tools/dotnet-pgo/MethodMemoryMap.cs              177
-rw-r--r--  src/coreclr/tools/dotnet-pgo/Program.cs                      728
-rw-r--r--  src/coreclr/tools/dotnet-pgo/SPGO/CirculationGraph.cs        237
-rw-r--r--  src/coreclr/tools/dotnet-pgo/SPGO/FlowSmoothing.cs           479
-rw-r--r--  src/coreclr/tools/dotnet-pgo/SPGO/LbrEntry.cs                 80
-rw-r--r--  src/coreclr/tools/dotnet-pgo/SPGO/MinimumCostCirculation.cs  144
-rw-r--r--  src/coreclr/tools/dotnet-pgo/SPGO/NativeToILMap.cs            80
-rw-r--r--  src/coreclr/tools/dotnet-pgo/SPGO/SampleProfile.cs            84
-rw-r--r--  src/coreclr/tools/dotnet-pgo/dotnet-pgo.csproj                 1
13 files changed, 2132 insertions, 184 deletions
diff --git a/src/coreclr/tools/Common/Pgo/PgoFormat.cs b/src/coreclr/tools/Common/Pgo/PgoFormat.cs
index e9357ad3614..b68912328aa 100644
--- a/src/coreclr/tools/Common/Pgo/PgoFormat.cs
+++ b/src/coreclr/tools/Common/Pgo/PgoFormat.cs
@@ -562,7 +562,9 @@ namespace Internal.Pgo
switch (existingSchemaItem.InstrumentationKind)
{
case PgoInstrumentationKind.BasicBlockIntCount:
+ case PgoInstrumentationKind.BasicBlockLongCount:
case PgoInstrumentationKind.EdgeIntCount:
+ case PgoInstrumentationKind.EdgeLongCount:
case PgoInstrumentationKind.TypeHandleHistogramCount:
if ((existingSchemaItem.Count != 1) || (schema.Count != 1))
{
diff --git a/src/coreclr/tools/Common/TypeSystem/IL/FlowGraph.cs b/src/coreclr/tools/Common/TypeSystem/IL/FlowGraph.cs
new file mode 100644
index 00000000000..7fd2bb8cba2
--- /dev/null
+++ b/src/coreclr/tools/Common/TypeSystem/IL/FlowGraph.cs
@@ -0,0 +1,273 @@
+// Licensed to the .NET Foundation under one or more agreements.
+// The .NET Foundation licenses this file to you under the MIT license.
+
+using System;
+using System.Collections;
+using System.Collections.Generic;
+using System.Diagnostics;
+using System.Linq;
+using System.Text;
+using System.Threading.Tasks;
+
+namespace Internal.IL
+{
+ internal class BasicBlock : IEquatable<BasicBlock>
+ {
+ public BasicBlock(int start, int size)
+ => (Start, Size) = (start, size);
+
+ // First IL offset
+ public int Start { get; }
+ // Number of IL bytes in this basic block
+ public int Size { get; }
+
+ public HashSet<BasicBlock> Sources { get; } = new HashSet<BasicBlock>();
+ public HashSet<BasicBlock> Targets { get; } = new HashSet<BasicBlock>();
+
+ public override string ToString() => $"Start={Start}, Size={Size}";
+
+ public override bool Equals(object obj) => Equals(obj as BasicBlock);
+ public bool Equals(BasicBlock other) => other != null && Start == other.Start;
+ public override int GetHashCode() => HashCode.Combine(Start);
+
+ public static bool operator ==(BasicBlock left, BasicBlock right) => EqualityComparer<BasicBlock>.Default.Equals(left, right);
+ public static bool operator !=(BasicBlock left, BasicBlock right) => !(left == right);
+ }
+
+ internal class FlowGraph
+ {
+ private readonly int[] _bbKeys;
+
+ private FlowGraph(IEnumerable<BasicBlock> bbs)
+ {
+ BasicBlocks = bbs.OrderBy(bb => bb.Start).ToList();
+ _bbKeys = BasicBlocks.Select(bb => bb.Start).ToArray();
+ }
+
+ /// <summary>Basic blocks, ordered by start IL offset.</summary>
+ public List<BasicBlock> BasicBlocks { get; }
+
+ /// <summary>Find index of basic block containing IL offset.</summary>
+ public int LookupIndex(int ilOffset)
+ {
+ int index = Array.BinarySearch(_bbKeys, ilOffset);
+ if (index < 0)
+ index = ~index - 1;
+
+ // If ilOffset is negative (more generally, before the first BB)
+ // then binarySearch will return ~0 since index 0 is the first BB
+ // that's greater.
+ if (index < 0)
+ return -1;
+
+ // If this is the last BB we could be after as well.
+ BasicBlock bb = BasicBlocks[index];
+ if (ilOffset >= bb.Start + bb.Size)
+ return -1;
+
+ return index;
+ }
+
+ public BasicBlock Lookup(int ilOffset)
+ => LookupIndex(ilOffset) switch
+ {
+ -1 => null,
+ int idx => BasicBlocks[idx]
+ };
+
+ public IEnumerable<BasicBlock> LookupRange(int ilOffsetStart, int ilOffsetEnd)
+ {
+ if (ilOffsetStart < BasicBlocks[0].Start)
+ ilOffsetStart = BasicBlocks[0].Start;
+
+ if (ilOffsetEnd > BasicBlocks.Last().Start)
+ ilOffsetEnd = BasicBlocks.Last().Start;
+
+ int end = LookupIndex(ilOffsetEnd);
+ for (int i = LookupIndex(ilOffsetStart); i <= end; i++)
+ yield return BasicBlocks[i];
+ }
+
+ internal string Dump(Func<BasicBlock, string> getNodeAnnot, Func<(BasicBlock, BasicBlock), string> getEdgeAnnot)
+ {
+ var sb = new StringBuilder();
+ sb.AppendLine("digraph G {");
+ sb.AppendLine(" forcelabels=true;");
+ sb.AppendLine();
+ Dictionary<long, int> bbToIndex = new Dictionary<long, int>();
+ for (int i = 0; i < BasicBlocks.Count; i++)
+ bbToIndex.Add(BasicBlocks[i].Start, i);
+
+ foreach (BasicBlock bb in BasicBlocks)
+ {
+ string label = $"[{bb.Start:x}..{bb.Start + bb.Size:x})\\n{getNodeAnnot(bb)}";
+ sb.AppendLine($" BB{bbToIndex[bb.Start]} [label=\"{label}\"];");
+ }
+
+ sb.AppendLine();
+
+ foreach (BasicBlock bb in BasicBlocks)
+ {
+ foreach (BasicBlock tar in bb.Targets)
+ {
+ string label = getEdgeAnnot((bb, tar));
+ string postfix = string.IsNullOrEmpty(label) ? "" : $" [label=\"{label}\"]";
+ sb.AppendLine($" BB{bbToIndex[bb.Start]} -> BB{bbToIndex[tar.Start]}{postfix};");
+ }
+ }
+
+ // Write ranks with BFS.
+ List<BasicBlock> curRank = new List<BasicBlock> { BasicBlocks.Single(bb => bb.Start == 0) };
+ HashSet<BasicBlock> seen = new HashSet<BasicBlock>(curRank);
+ while (curRank.Count > 0)
+ {
+ sb.AppendLine($" {{rank = same; {string.Concat(curRank.Select(bb => $"BB{bbToIndex[bb.Start]}; "))}}}");
+ curRank = curRank.SelectMany(bb => bb.Targets).Where(seen.Add).ToList();
+ }
+
+ sb.AppendLine("}");
+ return sb.ToString();
+ }
+
+ public static FlowGraph Create(MethodIL il)
+ {
+ HashSet<int> bbStarts = GetBasicBlockStarts(il);
+
+ List<BasicBlock> bbs = new List<BasicBlock>();
+ void AddBB(int start, int count)
+ {
+ if (count > 0)
+ bbs.Add(new BasicBlock(start, count));
+ }
+
+ int prevStart = 0;
+ foreach (int ofs in bbStarts.OrderBy(o => o))
+ {
+ AddBB(prevStart, ofs - prevStart);
+ prevStart = ofs;
+ }
+
+ AddBB(prevStart, il.GetILBytes().Length - prevStart);
+
+ FlowGraph fg = new FlowGraph(bbs);
+
+ // We know where each basic block starts now. Proceed by linking them together.
+ ILReader reader = new ILReader(il.GetILBytes());
+ foreach (BasicBlock bb in bbs)
+ {
+ reader.Seek(bb.Start);
+ while (reader.HasNext)
+ {
+ Debug.Assert(fg.Lookup(reader.Offset) == bb);
+ ILOpcode opc = reader.ReadILOpcode();
+ if (opc.IsBranch())
+ {
+ int tar = reader.ReadBranchDestination(opc);
+ bb.Targets.Add(fg.Lookup(tar));
+ if (!opc.IsUnconditionalBranch())
+ bb.Targets.Add(fg.Lookup(reader.Offset));
+
+ break;
+ }
+
+ if (opc == ILOpcode.switch_)
+ {
+ uint numCases = reader.ReadILUInt32();
+ int jmpBase = reader.Offset + checked((int)(numCases * 4));
+ bb.Targets.Add(fg.Lookup(jmpBase));
+
+ for (uint i = 0; i < numCases; i++)
+ {
+ int caseOfs = jmpBase + (int)reader.ReadILUInt32();
+ bb.Targets.Add(fg.Lookup(caseOfs));
+ }
+
+ break;
+ }
+
+ if (opc == ILOpcode.ret || opc == ILOpcode.endfinally || opc == ILOpcode.endfilter || opc == ILOpcode.throw_ || opc == ILOpcode.rethrow)
+ {
+ break;
+ }
+
+ reader.Skip(opc);
+ // Check fall through
+ if (reader.HasNext)
+ {
+ BasicBlock nextBB = fg.Lookup(reader.Offset);
+ if (nextBB != bb)
+ {
+ // Falling through
+ bb.Targets.Add(nextBB);
+ break;
+ }
+ }
+ }
+ }
+
+ foreach (BasicBlock bb in bbs)
+ {
+ foreach (BasicBlock tar in bb.Targets)
+ tar.Sources.Add(bb);
+ }
+
+ return fg;
+ }
+
+ /// <summary>
+ /// Find IL offsets at which basic blocks begin.
+ /// </summary>
+ private static HashSet<int> GetBasicBlockStarts(MethodIL il)
+ {
+ ILReader reader = new ILReader(il.GetILBytes());
+ HashSet<int> bbStarts = new HashSet<int>();
+ bbStarts.Add(0);
+ while (reader.HasNext)
+ {
+ ILOpcode opc = reader.ReadILOpcode();
+ if (opc.IsBranch())
+ {
+ int tar = reader.ReadBranchDestination(opc);
+ bbStarts.Add(tar);
+ // Conditional branches can fall through.
+ if (!opc.IsUnconditionalBranch())
+ bbStarts.Add(reader.Offset);
+ }
+ else if (opc == ILOpcode.switch_)
+ {
+ uint numCases = reader.ReadILUInt32();
+ int jmpBase = reader.Offset + checked((int)(numCases * 4));
+ // Default case is at jmpBase.
+ bbStarts.Add(jmpBase);
+
+ for (uint i = 0; i < numCases; i++)
+ {
+ int caseOfs = jmpBase + (int)reader.ReadILUInt32();
+ bbStarts.Add(caseOfs);
+ }
+ }
+ else if (opc == ILOpcode.ret || opc == ILOpcode.endfinally || opc == ILOpcode.endfilter || opc == ILOpcode.throw_ || opc == ILOpcode.rethrow)
+ {
+ if (reader.HasNext)
+ bbStarts.Add(reader.Offset);
+ }
+ else
+ {
+ reader.Skip(opc);
+ }
+ }
+
+ foreach (ILExceptionRegion ehRegion in il.GetExceptionRegions())
+ {
+ bbStarts.Add(ehRegion.TryOffset);
+ bbStarts.Add(ehRegion.TryOffset + ehRegion.TryLength);
+ bbStarts.Add(ehRegion.HandlerOffset);
+ bbStarts.Add(ehRegion.HandlerOffset + ehRegion.HandlerLength);
+ if (ehRegion.Kind.HasFlag(ILExceptionRegionKind.Filter))
+ bbStarts.Add(ehRegion.FilterOffset);
+ }
+
+ return bbStarts;
+ }
+ }
+}
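FlowGraph.LookupIndex above resolves an IL offset to its containing basic block with a binary search over sorted block start offsets: find the rightmost start at or below the offset, then verify the offset falls before that block's end. The same pattern, as a standalone Python sketch using bisect (hypothetical names; the C# version uses Array.BinarySearch and the one's-complement convention for a missing key):

```python
import bisect

def lookup_index(starts, sizes, offset):
    """Index of the interval containing `offset`, or -1.

    `starts` is sorted ascending; interval i covers
    [starts[i], starts[i] + sizes[i]). Mirrors FlowGraph.LookupIndex:
    binary-search for the rightmost start <= offset, then check that
    the offset lies before that interval's end.
    """
    index = bisect.bisect_right(starts, offset) - 1
    if index < 0:
        return -1  # before the first interval
    if offset >= starts[index] + sizes[index]:
        return -1  # past the last interval (or in a gap)
    return index

starts, sizes = [0, 4, 10], [4, 6, 2]
print(lookup_index(starts, sizes, 5))   # → 1
print(lookup_index(starts, sizes, 12))  # → -1
```

For a flow graph the blocks are contiguous, so the end check only fires for offsets past the final block; keeping it makes the helper safe for gapped interval sets too.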
diff --git a/src/coreclr/tools/Common/TypeSystem/IL/ILOpcodeHelper.cs b/src/coreclr/tools/Common/TypeSystem/IL/ILOpcodeHelper.cs
index 99a3fbcc9b0..c3671c75ce8 100644
--- a/src/coreclr/tools/Common/TypeSystem/IL/ILOpcodeHelper.cs
+++ b/src/coreclr/tools/Common/TypeSystem/IL/ILOpcodeHelper.cs
@@ -50,6 +50,11 @@ namespace Internal.IL
return false;
}
+ public static bool IsUnconditionalBranch(this ILOpcode opcode)
+ {
+ return opcode == ILOpcode.br || opcode == ILOpcode.br_s || opcode == ILOpcode.leave || opcode == ILOpcode.leave_s;
+ }
+
private static readonly byte[] s_opcodeSizes = new byte[]
{
1, // nop = 0x00,
diff --git a/src/coreclr/tools/dotnet-pgo/CommandLineOptions.cs b/src/coreclr/tools/dotnet-pgo/CommandLineOptions.cs
index 70a8b2d081d..3da9ded7063 100644
--- a/src/coreclr/tools/dotnet-pgo/CommandLineOptions.cs
+++ b/src/coreclr/tools/dotnet-pgo/CommandLineOptions.cs
@@ -28,6 +28,10 @@ namespace Microsoft.Diagnostics.Tools.Pgo
public bool DisplayProcessedEvents;
public bool ValidateOutputFile;
public bool GenerateCallGraph;
+ public bool Spgo;
+ public bool SpgoIncludeBlockCounts;
+ public bool SpgoIncludeEdgeCounts;
+ public int SpgoMinSamples = 50;
public bool VerboseWarnings;
public jittraceoptions JitTraceOptions;
public double ExcludeEventsBefore;
@@ -40,6 +44,7 @@ namespace Microsoft.Diagnostics.Tools.Pgo
public List<AssemblyName> IncludedAssemblies = new List<AssemblyName>();
public bool DumpMibc = false;
public FileInfo InputFileToDump;
+ public List<FileInfo> CompareMibc;
public string[] HelpArgs = Array.Empty<string>();
@@ -189,6 +194,15 @@ namespace Microsoft.Diagnostics.Tools.Pgo
#endif
CommonOptions();
CompressedOption();
+
+ syntax.DefineOption(name: "spgo", value: ref Spgo, help: "Base profile on samples in the input. Uses last branch records if available and otherwise raw IP samples.", requireValue: false);
+ syntax.DefineOption(name: "spgo-with-block-counts", value: ref SpgoIncludeBlockCounts, help: "Include block counts in the written .mibc file. If neither this nor spgo-with-edge-counts are specified, then defaults to true.", requireValue: false);
+ syntax.DefineOption(name: "spgo-with-edge-counts", value: ref SpgoIncludeEdgeCounts, help: "Include edge counts in the written .mibc file.", requireValue: false);
+ syntax.DefineOption(name: "spgo-min-samples", value: ref SpgoMinSamples, help: $"The minimum number of total samples a function must have before generating profile data for it with SPGO. Default: {SpgoMinSamples}", requireValue: false);
+
+ if (!SpgoIncludeBlockCounts && !SpgoIncludeEdgeCounts)
+ SpgoIncludeBlockCounts = true;
+
HelpOption();
}
@@ -231,7 +245,7 @@ namespace Microsoft.Diagnostics.Tools.Pgo
{
HelpArgs = new string[] { "merge", "--help", "--output", "output", "--input", "input"};
- InputFilesToMerge = DefineFileOptionList(name: "i|input", help: "If a reference is not located on disk at the same location as used in the process, it may be specified with a --reference parameter. Multiple --reference parameters may be specified. The wild cards * and ? are supported by this option.");
+ InputFilesToMerge = DefineFileOptionList(name: "i|input", help: "Input .mibc files to be merged. Multiple input arguments are specified as --input file1.mibc --input file2.mibc");
OutputOption();
IReadOnlyList<string> assemblyNamesAsStrings = null;
@@ -281,6 +295,14 @@ namespace Microsoft.Diagnostics.Tools.Pgo
OutputFileName = new FileInfo(outputFile);
}
+ var compareMibcCommand = syntax.DefineCommand(name: "compare-mibc", value: ref command, help: "Compare two .mibc files");
+ if (compareMibcCommand.IsActive)
+ {
+ HelpArgs = new[] { "compare-mibc", "--input", "first.mibc", "--input", "second.mibc" };
+ CompareMibc = DefineFileOptionList(name: "i|input", help: "The input .mibc files to be compared. Specify as --input file1.mibc --input file2.mibc");
+ if (CompareMibc.Count != 2)
+ Help = true;
+ }
if (syntax.ActiveCommand == null)
{
@@ -363,7 +385,7 @@ Example tracing commands used to generate the input to this tool:
private void ParseCommmandLineHelper(string[] args)
{
ArgumentSyntax argSyntax = ArgumentSyntax.Parse(args, DefineArgumentSyntax);
- if (Help || (!FileType.HasValue && (InputFilesToMerge == null) && !DumpMibc))
+ if (Help || (!FileType.HasValue && (InputFilesToMerge == null) && !DumpMibc && CompareMibc == null))
{
Help = true;
}
diff --git a/src/coreclr/tools/dotnet-pgo/MethodMemoryMap.cs b/src/coreclr/tools/dotnet-pgo/MethodMemoryMap.cs
new file mode 100644
index 00000000000..1f0f3c5b044
--- /dev/null
+++ b/src/coreclr/tools/dotnet-pgo/MethodMemoryMap.cs
@@ -0,0 +1,177 @@
+// Licensed to the .NET Foundation under one or more agreements.
+// The .NET Foundation licenses this file to you under the MIT license.
+
+using System;
+using System.Collections.Generic;
+using System.Diagnostics;
+using System.IO;
+using System.Linq;
+using System.Text;
+using System.Threading.Tasks;
+using Internal.TypeSystem;
+using Microsoft.Diagnostics.Tracing.Etlx;
+using Microsoft.Diagnostics.Tracing.Parsers.Clr;
+
+namespace Microsoft.Diagnostics.Tools.Pgo
+{
+ // A map that can be used to resolve memory addresses back to the MethodDesc.
+ internal class MethodMemoryMap
+ {
+ private readonly ulong[] _infoKeys;
+ public readonly MemoryRegionInfo[] _infos;
+
+ public MethodMemoryMap(
+ TraceProcess p,
+ TraceTypeSystemContext tsc,
+ TraceRuntimeDescToTypeSystemDesc idParser,
+ int clrInstanceID)
+ {
+ // Capture the addresses of jitted code
+ List<MemoryRegionInfo> infos = new List<MemoryRegionInfo>();
+ Dictionary<long, MemoryRegionInfo> info = new Dictionary<long, MemoryRegionInfo>();
+ foreach (var e in p.EventsInProcess.ByEventType<MethodLoadUnloadTraceData>())
+ {
+ if (e.ClrInstanceID != clrInstanceID)
+ {
+ continue;
+ }
+
+ MethodDesc method = null;
+ try
+ {
+ method = idParser.ResolveMethodID(e.MethodID);
+ }
+ catch
+ {
+ }
+
+ if (method != null)
+ {
+ infos.Add(new MemoryRegionInfo
+ {
+ StartAddress = e.MethodStartAddress,
+ EndAddress = e.MethodStartAddress + checked((uint)e.MethodSize),
+ MethodID = e.MethodID,
+ Method = method,
+ });
+ }
+ }
+
+ foreach (var e in p.EventsInProcess.ByEventType<MethodLoadUnloadVerboseTraceData>())
+ {
+ if (e.ClrInstanceID != clrInstanceID)
+ {
+ continue;
+ }
+
+ MethodDesc method = null;
+ try
+ {
+ method = idParser.ResolveMethodID(e.MethodID);
+ }
+ catch
+ {
+ }
+
+ if (method != null)
+ {
+ infos.Add(new MemoryRegionInfo
+ {
+ StartAddress = e.MethodStartAddress,
+ EndAddress = e.MethodStartAddress + checked((uint)e.MethodSize),
+ MethodID = e.MethodID,
+ Method = method,
+ });
+ }
+ }
+
+ var sigProvider = new R2RSignatureTypeProvider(tsc);
+ foreach (var module in p.LoadedModules)
+ {
+ if (module.FilePath == "")
+ continue;
+
+ if (!File.Exists(module.FilePath))
+ continue;
+
+ try
+ {
+ byte[] image = File.ReadAllBytes(module.FilePath);
+ using (FileStream fstream = new FileStream(module.FilePath, FileMode.Open, FileAccess.Read, FileShare.Read))
+ {
+ var r2rCheckPEReader = new System.Reflection.PortableExecutable.PEReader(fstream, System.Reflection.PortableExecutable.PEStreamOptions.LeaveOpen);
+
+ if (!ILCompiler.Reflection.ReadyToRun.ReadyToRunReader.IsReadyToRunImage(r2rCheckPEReader))
+ continue;
+ }
+
+ var reader = new ILCompiler.Reflection.ReadyToRun.ReadyToRunReader(tsc, module.FilePath);
+ foreach (var methodEntry in reader.GetCustomMethodToRuntimeFunctionMapping<TypeDesc, MethodDesc, R2RSigProviderContext>(sigProvider))
+ {
+ foreach (var runtimeFunction in methodEntry.Value.RuntimeFunctions)
+ {
+ infos.Add(new MemoryRegionInfo
+ {
+ StartAddress = module.ImageBase + (ulong)runtimeFunction.StartAddress,
+ EndAddress = module.ImageBase + (ulong)runtimeFunction.StartAddress + (uint)runtimeFunction.Size,
+ Method = methodEntry.Key,
+ NativeToILMap = runtimeFunction.DebugInfo != null ? NativeToILMap.FromR2RBounds(runtimeFunction.DebugInfo.BoundsList) : null,
+ });
+ }
+ }
+ }
+ catch { }
+ }
+
+ // Can have duplicate events, so pick first for each
+ var byMethodID = infos.GroupBy(i => i.MethodID).ToDictionary(g => g.Key, g => g.First());
+ foreach (MethodILToNativeMapTraceData e in p.EventsInProcess.ByEventType<MethodILToNativeMapTraceData>())
+ {
+ if (byMethodID.TryGetValue(e.MethodID, out MemoryRegionInfo inf))
+ inf.NativeToILMap = NativeToILMap.FromEvent(e);
+ }
+
+ _infos = byMethodID.Values.OrderBy(i => i.StartAddress).ToArray();
+ _infoKeys = _infos.Select(i => i.StartAddress).ToArray();
+
+#if DEBUG
+ for (int i = 0; i < _infos.Length - 1; i++)
+ {
+ var cur = _infos[i];
+ var next = _infos[i + 1];
+ if (cur.EndAddress <= next.StartAddress)
+ continue;
+
+ Debug.Fail("Overlap in memory ranges");
+ }
+#endif
+ }
+
+ public MemoryRegionInfo GetInfo(ulong ip)
+ {
+ int index = Array.BinarySearch(_infoKeys, ip);
+ if (index < 0)
+ index = ~index - 1;
+
+ if (index < 0)
+ return null; // Before first
+
+ var info = _infos[index];
+ if (ip < info.StartAddress || ip >= info.EndAddress)
+ return null;
+
+ return info;
+ }
+
+ public MethodDesc GetMethod(ulong ip) => GetInfo(ip)?.Method;
+ }
+
+ public class MemoryRegionInfo
+ {
+ public ulong StartAddress { get; set; }
+ public ulong EndAddress { get; set; }
+ public long MethodID { get; set; }
+ public MethodDesc Method { get; set; }
+ public NativeToILMap NativeToILMap { get; set; }
+ }
+}
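MethodMemoryMap above tolerates duplicate load events by keeping the first record per method ID, sorts the surviving regions by start address (asserting, in DEBUG builds, that they do not overlap), and then resolves instruction pointers with the same rightmost-start binary search used by FlowGraph. A hypothetical standalone sketch of that construction and lookup, with regions reduced to (method_id, start, end) tuples:

```python
from bisect import bisect_right

def build_map(regions):
    """Deduplicate (method_id, start, end) regions by method_id,
    keeping the first occurrence, and sort by start address."""
    seen = {}
    for method_id, start, end in regions:
        seen.setdefault(method_id, (method_id, start, end))
    ordered = sorted(seen.values(), key=lambda r: r[1])
    # Mirror the DEBUG check in MethodMemoryMap: ranges must not overlap.
    for (_, _, end), (_, start, _) in zip(ordered, ordered[1:]):
        assert end <= start, "Overlap in memory ranges"
    return ordered

def resolve(ordered, ip):
    """Map an instruction pointer to the owning method id, or None."""
    starts = [r[1] for r in ordered]
    i = bisect_right(starts, ip) - 1
    if i < 0:
        return None  # before the first region
    method_id, start, end = ordered[i]
    return method_id if start <= ip < end else None

regions = [(7, 0x1000, 0x1080), (7, 0x1000, 0x1080), (9, 0x2000, 0x2040)]
m = build_map(regions)
print(resolve(m, 0x1010))  # → 7
print(resolve(m, 0x1F00))  # → None
```

Unlike the contiguous basic blocks of a flow graph, jitted code regions have gaps between them, which is why the end-of-region check here does real work: an IP between two methods resolves to nothing rather than to the preceding method.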
diff --git a/src/coreclr/tools/dotnet-pgo/Program.cs b/src/coreclr/tools/dotnet-pgo/Program.cs
index 86448cfb57c..b8828d7df6f 100644
--- a/src/coreclr/tools/dotnet-pgo/Program.cs
+++ b/src/coreclr/tools/dotnet-pgo/Program.cs
@@ -70,13 +70,13 @@ namespace Microsoft.Diagnostics.Tools.Pgo
return new TypeSystemEntityOrUnknown(0);
TypeDesc type = null;
-
+
try
{
type = _idParser.ResolveTypeHandle(input, false);
}
catch
- {}
+ { }
if (type != null)
{
return new TypeSystemEntityOrUnknown(type);
@@ -115,7 +115,7 @@ namespace Microsoft.Diagnostics.Tools.Pgo
class Program
{
static Logger s_logger = new Logger();
- static int Main(string []args)
+ static int Main(string[] args)
{
var options = CommandLineOptions.ParseCommandLine(args);
@@ -164,31 +164,6 @@ namespace Microsoft.Diagnostics.Tools.Pgo
s_logger.PrintOutput(output);
}
- struct InstructionPointerRange : IComparable<InstructionPointerRange>
- {
- public InstructionPointerRange(ulong startAddress, int size)
- {
- StartAddress = startAddress;
- EndAddress = startAddress + (ulong)size;
- }
-
- public ulong StartAddress;
- public ulong EndAddress;
-
- public int CompareTo(InstructionPointerRange other)
- {
- if (StartAddress < other.StartAddress)
- {
- return -1;
- }
- if (StartAddress > other.StartAddress)
- {
- return 1;
- }
- return (int)((long)EndAddress - (long)other.EndAddress);
- }
- }
-
internal static void UnZipIfNecessary(ref string inputFileName, TextWriter log)
{
if (inputFileName.EndsWith(".trace.zip", StringComparison.OrdinalIgnoreCase))
@@ -252,10 +227,12 @@ namespace Microsoft.Diagnostics.Tools.Pgo
{
return InnerMergeMain(commandLineOptions);
}
- else
+ if (commandLineOptions.CompareMibc != null)
{
- return InnerProcessTraceFileMain(commandLineOptions);
+ return InnerCompareMibcMain(commandLineOptions);
}
+
+ return InnerProcessTraceFileMain(commandLineOptions);
}
static int InnerDumpMain(CommandLineOptions commandLineOptions)
@@ -278,6 +255,7 @@ namespace Microsoft.Diagnostics.Tools.Pgo
PrintDetailedMessage($"Parsing {commandLineOptions.InputFileToDump}");
var profileData = MIbcProfileParser.ParseMIbcFile(tsc, mibcPeReader, null, onlyDefinedInAssembly: null);
+ PrintMibcStats(profileData);
using (FileStream outputFile = new FileStream(commandLineOptions.OutputFileName.FullName, FileMode.Create, FileAccess.Write))
{
@@ -411,8 +389,244 @@ namespace Microsoft.Diagnostics.Tools.Pgo
}
}
+ static int InnerCompareMibcMain(CommandLineOptions options)
+ {
+ // Command line parser should require exactly 2 files
+ Trace.Assert(options.CompareMibc?.Count == 2);
+ FileInfo file1 = options.CompareMibc[0];
+ FileInfo file2 = options.CompareMibc[1];
+
+ PEReader mibc1 = MIbcProfileParser.OpenMibcAsPEReader(file1.FullName);
+ PEReader mibc2 = MIbcProfileParser.OpenMibcAsPEReader(file2.FullName);
+ var tsc = new TypeRefTypeSystem.TypeRefTypeSystemContext(new PEReader[] { mibc1, mibc2 });
+
+ ProfileData profile1 = MIbcProfileParser.ParseMIbcFile(tsc, mibc1, null, onlyDefinedInAssembly: null);
+ ProfileData profile2 = MIbcProfileParser.ParseMIbcFile(tsc, mibc2, null, onlyDefinedInAssembly: null);
+ PrintOutput($"Comparing {file1.Name} to {file2.Name}");
+ PrintOutput($"Statistics for {file1.Name}");
+ PrintStats(profile1, file1.Name, profile2, file2.Name);
+ PrintOutput("");
+ PrintOutput($"Statistics for {file2.Name}");
+ PrintStats(profile2, file2.Name, profile1, file1.Name);
+
+ static void PrintStats(ProfileData self, string selfName, ProfileData other, string otherName)
+ {
+ var methods = self.GetAllMethodProfileData().ToList();
+ var profiledMethods = methods.Where(spd => spd.SchemaData != null).ToList();
+ var otherMethods = other.GetAllMethodProfileData().ToList();
+ var otherProfiledMethods = otherMethods.Where(spd => spd.SchemaData != null).ToList();
+ PrintMibcStats(self);
+ PrintOutput($"# Profiled methods in {selfName} not in {otherName}: {profiledMethods.Select(m => m.Method).Except(otherProfiledMethods.Select(m => m.Method)).Count()}");
+ }
+
+ PrintOutput("");
+ PrintOutput("Comparison");
+ var methods1 = profile1.GetAllMethodProfileData().ToList();
+ var methods2 = profile2.GetAllMethodProfileData().ToList();
+ var profiledMethods1 = methods1.Where(m => m.SchemaData != null).ToList();
+ var profiledMethods2 = methods2.Where(m => m.SchemaData != null).ToList();
+
+ PrintOutput($"# Methods with profile data in both .mibc files: {profiledMethods1.Select(m => m.Method).Intersect(profiledMethods2.Select(m => m.Method)).Count()}");
+ var fgMatches = new List<(MethodProfileData prof1, MethodProfileData prof2)>();
+ var fgMismatches = new List<(MethodProfileData prof1, MethodProfileData prof2, List<string> mismatches)>();
+
+ foreach (MethodProfileData prof1 in profiledMethods1)
+ {
+ MethodProfileData prof2 = profile2.GetMethodProfileData(prof1.Method);
+ if (prof2?.SchemaData == null)
+ continue;
+
+ Dictionary<int, PgoSchemaElem> GroupBlocks(MethodProfileData data)
+ => data.SchemaData
+ .Where(e => e.InstrumentationKind == PgoInstrumentationKind.BasicBlockIntCount || e.InstrumentationKind == PgoInstrumentationKind.BasicBlockLongCount)
+ .ToDictionary(e => e.ILOffset);
+
+ Dictionary<(int, int), PgoSchemaElem> GroupEdges(MethodProfileData data)
+ => data.SchemaData
+ .Where(e => e.InstrumentationKind == PgoInstrumentationKind.EdgeIntCount || e.InstrumentationKind == PgoInstrumentationKind.EdgeLongCount)
+ .ToDictionary(e => (e.ILOffset, e.Other));
+
+ var (blocks1, blocks2) = (GroupBlocks(prof1), GroupBlocks(prof2));
+ var (edges1, edges2) = (GroupEdges(prof1), GroupEdges(prof2));
+
+ List<string> mismatches = new List<string>();
+ if (blocks1.Count > 0 && blocks2.Count > 0)
+ {
+ var in1 = blocks1.Keys.Where(k => !blocks2.ContainsKey(k)).ToList();
+ var in2 = blocks2.Keys.Where(k => !blocks1.ContainsKey(k)).ToList();
+
+ foreach (var m1 in in1)
+ mismatches.Add($"{file1.Name} has a block at {m1:x} not present in {file2.Name}");
+ foreach (var m2 in in2)
+ mismatches.Add($"{file2.Name} has a block at {m2:x} not present in {file1.Name}");
+ }
+
+ if (edges1.Count > 0 && edges2.Count > 0)
+ {
+ var in1 = edges1.Keys.Where(k => !edges2.ContainsKey(k)).ToList();
+ var in2 = edges2.Keys.Where(k => !edges1.ContainsKey(k)).ToList();
+
+ foreach (var (from, to) in in1)
+ mismatches.Add($"{file1.Name} has an edge {from:x}->{to:x} not present in {file2.Name}");
+ foreach (var (from, to) in in2)
+ mismatches.Add($"{file2.Name} has an edge {from:x}->{to:x} not present in {file1.Name}");
+ }
+
+ if (mismatches.Count > 0)
+ fgMismatches.Add((prof1, prof2, mismatches));
+ else
+ fgMatches.Add((prof1, prof2));
+ }
+
+ PrintOutput($" Of these, {fgMatches.Count} have matching flow-graphs and the remaining {fgMismatches.Count} do not");
+
+ if (fgMismatches.Count > 0)
+ {
+ PrintOutput("");
+ PrintOutput("Methods with mismatched flow-graphs:");
+ foreach ((MethodProfileData prof1, MethodProfileData prof2, List<string> mismatches) in fgMismatches)
+ {
+ PrintOutput($"{prof1.Method}");
+ foreach (string s in mismatches)
+ PrintOutput($" {s}");
+ }
+ }
+
+ if (fgMatches.Count > 0)
+ {
+ PrintOutput("");
+ PrintOutput($"When comparing the flow-graphs of the matching methods, their overlaps break down as follows:");
+
+ List<double> blockOverlaps = new List<double>();
+ List<double> edgeOverlaps = new List<double>();
+
+ foreach ((MethodProfileData prof1, MethodProfileData prof2) in fgMatches)
+ {
+ var (blocks1, blocks2) = (GroupBlocks(prof1), GroupBlocks(prof2));
+ var (edges1, edges2) = (GroupEdges(prof1), GroupEdges(prof2));
+
+ double Overlap<TKey>(Dictionary<TKey, PgoSchemaElem> left, Dictionary<TKey, PgoSchemaElem> right)
+ {
+ long leftTotal = left.Values.Sum(e => e.DataLong);
+ long rightTotal = right.Values.Sum(e => e.DataLong);
+ Debug.Assert(left.Keys.All(k => right.ContainsKey(k)));
+ Debug.Assert(right.Keys.All(k => left.ContainsKey(k)));
+
+ var leftPW = left.ToDictionary(k => k.Key, k => k.Value.DataLong / (double)leftTotal);
+ var rightPW = right.ToDictionary(k => k.Key, k => k.Value.DataLong / (double)rightTotal);
+
+ double overlap = leftPW.Sum(k => Math.Min(k.Value, rightPW[k.Key]));
+ return overlap;
+ }
+
+ if (blocks1.Count > 0 && blocks2.Count > 0)
+ blockOverlaps.Add(Overlap(blocks1, blocks2));
+
+ if (edges1.Count > 0 && edges2.Count > 0)
+ edgeOverlaps.Add(Overlap(edges1, edges2));
+ }
+
+ void PrintHistogram(List<double> overlaps)
+ {
+ int maxWidth = Console.WindowWidth - 10;
+ const int maxLabelWidth = 4; // to print "100%".
+ int barMaxWidth = maxWidth - (maxLabelWidth + 10); // Leave 10 chars for writing other things on the line
+ const int bucketSize = 5;
+ var sorted = overlaps.OrderByDescending(d => d).ToList();
+
+ void PrintBar(string label, ref int curIndex, Func<double, bool> include, bool forcePrint)
+ {
+ int count = 0;
+ while (curIndex < sorted.Count && include(sorted[curIndex]))
+ {
+ count++;
+ curIndex++;
+ }
+
+ if (count == 0 && !forcePrint)
+ return;
+
+ double proportion = count / (double)sorted.Count;
+
+ int numFullBlocks = (int)(proportion * barMaxWidth);
+ double fractionalPart = proportion * barMaxWidth - numFullBlocks;
+
+ const char fullBlock = '\u2588';
+ string bar = new string(fullBlock, numFullBlocks);
+ if ((int)(fractionalPart * 8) != 0)
+ {
+ // After full block comes a 7/8 block, then 6/8, then 5/8 etc.
+ bar += (char)(fullBlock + (8 - (int)(fractionalPart * 8)));
+ }
+
+ // If empty, use the left one-eighth block to show a line where 0 is.
+ if (bar == "")
+ bar = "\u258f";
+
+ string line = FormattableString.Invariant($"{label,-maxLabelWidth} {bar} ({proportion*100:F1}%)");
+ PrintOutput(line);
+ }
+
+ // If there are any at 100%, then print those separately
+ int curIndex = 0;
+ PrintBar("100%", ref curIndex, d => d >= (1 - 0.000000001), false);
+ for (int proportion = 100 - bucketSize; proportion >= 0; proportion -= bucketSize)
+ PrintBar($">{proportion,2}%", ref curIndex, d => d * 100 > proportion, true);
+ PrintBar("0%", ref curIndex, d => true, false);
+ }
+
+ // Need UTF8 for the block chars.
+ Console.OutputEncoding = Encoding.UTF8;
+ if (blockOverlaps.Count > 0)
+ {
+ PrintHistogram(blockOverlaps);
+ PrintOutput("(using block counts)");
+ }
+
+ if (edgeOverlaps.Count > 0)
+ {
+ if (blockOverlaps.Count > 0)
+ PrintOutput("");
+
+ PrintHistogram(edgeOverlaps);
+ PrintOutput("(using edge counts)");
+ }
+ }
+
+ return 0;
+ }
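As an illustrative aside (not part of the patch): the `Overlap` helper in the comparison code above is a histogram intersection. Each side's counts are normalized into probability weights over the shared keys, and the element-wise minimum is summed, yielding 1.0 for identical distributions and 0.0 for disjoint ones. A minimal Python sketch:

```python
def overlap(left, right):
    """Histogram intersection of two count maps with identical key sets."""
    left_total = sum(left.values())
    right_total = sum(right.values())
    # Normalize each side into probability weights.
    left_pw = {k: v / left_total for k, v in left.items()}
    right_pw = {k: v / right_total for k, v in right.items()}
    # Sum of element-wise minima: 1.0 means identical distributions.
    return sum(min(w, right_pw[k]) for k, w in left_pw.items())
```

Scaling one side by a constant does not change the result, which is why raw sample counts and instrumented counts can be compared directly.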
+
+ static void PrintMibcStats(ProfileData data)
+ {
+ List<MethodProfileData> methods = data.GetAllMethodProfileData().ToList();
+ List<MethodProfileData> profiledMethods = methods.Where(spd => spd.SchemaData != null).ToList();
+ PrintOutput($"# Methods: {methods.Count}");
+ PrintOutput($"# Methods with any profile data: {profiledMethods.Count(spd => spd.SchemaData.Length > 0)}");
+ PrintOutput($"# Methods with 32-bit block counts: {profiledMethods.Count(spd => spd.SchemaData.Any(elem => elem.InstrumentationKind == PgoInstrumentationKind.BasicBlockIntCount))}");
+ PrintOutput($"# Methods with 64-bit block counts: {profiledMethods.Count(spd => spd.SchemaData.Any(elem => elem.InstrumentationKind == PgoInstrumentationKind.BasicBlockLongCount))}");
+ PrintOutput($"# Methods with 32-bit edge counts: {profiledMethods.Count(spd => spd.SchemaData.Any(elem => elem.InstrumentationKind == PgoInstrumentationKind.EdgeIntCount))}");
+ PrintOutput($"# Methods with 64-bit edge counts: {profiledMethods.Count(spd => spd.SchemaData.Any(elem => elem.InstrumentationKind == PgoInstrumentationKind.EdgeLongCount))}");
+ PrintOutput($"# Methods with type handle histograms: {profiledMethods.Count(spd => spd.SchemaData.Any(elem => elem.InstrumentationKind == PgoInstrumentationKind.TypeHandleHistogramTypeHandle))}");
+ PrintOutput($"# Methods with GetLikelyClass data: {profiledMethods.Count(spd => spd.SchemaData.Any(elem => elem.InstrumentationKind == PgoInstrumentationKind.GetLikelyClass))}");
+ }
+
+ private static Dictionary<int, PgoSchemaElem> GroupBlocks(MethodProfileData data)
+ {
+ return data.SchemaData
+ .Where(e => e.InstrumentationKind == PgoInstrumentationKind.BasicBlockIntCount || e.InstrumentationKind == PgoInstrumentationKind.BasicBlockLongCount)
+ .ToDictionary(e => e.ILOffset);
+ }
+
+ private static Dictionary<(int, int), PgoSchemaElem> GroupEdges(MethodProfileData data)
+ {
+ return data.SchemaData
+ .Where(e => e.InstrumentationKind == PgoInstrumentationKind.EdgeIntCount || e.InstrumentationKind == PgoInstrumentationKind.EdgeLongCount)
+ .ToDictionary(e => (e.ILOffset, e.Other));
+ }
+
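As another aside, the bar rendering in `PrintHistogram` above uses the Unicode left-block characters U+2588 (full) through U+258F (one eighth) to get sub-character resolution. A small Python sketch of that rendering logic:

```python
def bar(proportion, max_width):
    """Render a proportion in [0, 1] as full blocks plus one partial block."""
    full = int(proportion * max_width)
    frac = proportion * max_width - full
    s = "\u2588" * full  # U+2588 FULL BLOCK
    eighths = int(frac * 8)
    if eighths != 0:
        # U+2589..U+258F are the 7/8 down to 1/8 left blocks.
        s += chr(0x2588 + (8 - eighths))
    # An empty bar still shows a thin one-eighth block marking where zero is.
    return s or "\u258f"
```

This is why the C# code switches the console to UTF-8 before printing the histograms.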
static int InnerProcessTraceFileMain(CommandLineOptions commandLineOptions)
- {
+ {
if (commandLineOptions.TraceFile == null)
{
PrintUsage(commandLineOptions, "--trace must be specified");
@@ -425,34 +639,31 @@ namespace Microsoft.Diagnostics.Tools.Pgo
return -8;
}
- if (commandLineOptions.OutputFileName != null)
+ if (!commandLineOptions.FileType.HasValue)
+ {
+ PrintUsage(commandLineOptions, $"--pgo-file-type must be specified");
+ return -9;
+ }
+ if ((commandLineOptions.FileType.Value != PgoFileType.jittrace) && (commandLineOptions.FileType != PgoFileType.mibc))
{
- if (!commandLineOptions.FileType.HasValue)
+ PrintUsage(commandLineOptions, $"Invalid output pgo type {commandLineOptions.FileType} specified.");
+ return -9;
+ }
+ if (commandLineOptions.FileType == PgoFileType.jittrace)
+ {
+ if (!commandLineOptions.OutputFileName.Name.EndsWith(".jittrace"))
{
- PrintUsage(commandLineOptions, $"--pgo-file-type must be specified");
+ PrintUsage(commandLineOptions, $"jittrace output file name must end with .jittrace");
return -9;
}
- if ((commandLineOptions.FileType.Value != PgoFileType.jittrace) && (commandLineOptions.FileType != PgoFileType.mibc))
+ }
+ if (commandLineOptions.FileType == PgoFileType.mibc)
+ {
+ if (!commandLineOptions.OutputFileName.Name.EndsWith(".mibc"))
{
- PrintUsage(commandLineOptions, $"Invalid output pgo type {commandLineOptions.FileType} specified.");
+ PrintUsage(commandLineOptions, $"mibc output file name must end with .mibc");
return -9;
}
- if (commandLineOptions.FileType == PgoFileType.jittrace)
- {
- if (!commandLineOptions.OutputFileName.Name.EndsWith(".jittrace"))
- {
- PrintUsage(commandLineOptions, $"jittrace output file name must end with .jittrace");
- return -9;
- }
- }
- if (commandLineOptions.FileType == PgoFileType.mibc)
- {
- if (!commandLineOptions.OutputFileName.Name.EndsWith(".mibc"))
- {
- PrintUsage(commandLineOptions, $"mibc output file name must end with .mibc");
- return -9;
- }
- }
}
string etlFileName = commandLineOptions.TraceFile.FullName;
@@ -460,7 +671,7 @@ namespace Microsoft.Diagnostics.Tools.Pgo
{
if (commandLineOptions.TraceFile.FullName.EndsWith(nettraceExtension))
{
- etlFileName = commandLineOptions.TraceFile.FullName.Substring(0, commandLineOptions.TraceFile.FullName.Length - nettraceExtension.Length) + ".etlx";
+ etlFileName = Path.ChangeExtension(commandLineOptions.TraceFile.FullName, ".etlx");
PrintMessage($"Creating ETLX file {etlFileName} from {commandLineOptions.TraceFile.FullName}");
TraceLog.CreateFromEventPipeDataFile(commandLineOptions.TraceFile.FullName, etlFileName);
}
@@ -469,14 +680,20 @@ namespace Microsoft.Diagnostics.Tools.Pgo
string lttngExtension = ".trace.zip";
if (commandLineOptions.TraceFile.FullName.EndsWith(lttngExtension))
{
- etlFileName = commandLineOptions.TraceFile.FullName.Substring(0, commandLineOptions.TraceFile.FullName.Length - lttngExtension.Length) + ".etlx";
+ etlFileName = Path.ChangeExtension(commandLineOptions.TraceFile.FullName, ".etlx");
PrintMessage($"Creating ETLX file {etlFileName} from {commandLineOptions.TraceFile.FullName}");
TraceLog.CreateFromLttngTextDataFile(commandLineOptions.TraceFile.FullName, etlFileName);
}
UnZipIfNecessary(ref etlFileName, commandLineOptions.BasicProgressMessages ? Console.Out : new StringWriter());
- using (var traceLog = TraceLog.OpenOrConvert(etlFileName))
+ // For SPGO we need to be able to map raw IPs back to IL offsets in methods.
+ // Normally TraceEvent facilitates this remapping automatically and discards the IL<->IP mapping table events.
+ // However, we have found TraceEvent's remapping to be imprecise (see https://github.com/microsoft/perfview/issues/1410).
+ // Thus, when SPGO is requested, we need to keep these events.
+ // Note that we always request these events to be kept: the converted .etlx file is cached, so if one switched
+ // back and forth between SPGO and non-SPGO, a cached conversion without these events would otherwise be reused.
+ using (var traceLog = TraceLog.OpenOrConvert(etlFileName, new TraceLogOptions { KeepAllEvents = true }))
{
if ((!commandLineOptions.Pid.HasValue && commandLineOptions.ProcessName == null) && traceLog.Processes.Count != 1)
{
@@ -720,6 +937,21 @@ namespace Microsoft.Diagnostics.Tools.Pgo
}
}
+ MethodMemoryMap methodMemMap = null;
+ MethodMemoryMap GetMethodMemMap()
+ {
+ if (methodMemMap == null)
+ {
+ methodMemMap = new MethodMemoryMap(
+ p,
+ tsc,
+ idParser,
+ clrInstanceId.Value);
+ }
+
+ return methodMemMap;
+ }
+
Dictionary<MethodDesc, Dictionary<MethodDesc, int>> callGraph = null;
Dictionary<MethodDesc, int> exclusiveSamples = null;
if (commandLineOptions.GenerateCallGraph)
@@ -732,93 +964,8 @@ namespace Microsoft.Diagnostics.Tools.Pgo
callGraph = new Dictionary<MethodDesc, Dictionary<MethodDesc, int>>();
exclusiveSamples = new Dictionary<MethodDesc, int>();
- // Capture the addresses of jitted code
- List<ValueTuple<InstructionPointerRange, MethodDesc>> codeLocations = new List<(InstructionPointerRange, MethodDesc)>();
- foreach (var e in p.EventsInProcess.ByEventType<MethodLoadUnloadTraceData>())
- {
- if (e.ClrInstanceID != clrInstanceId.Value)
- {
- continue;
- }
-
- MethodDesc method = null;
- try
- {
- method = idParser.ResolveMethodID(e.MethodID, commandLineOptions.VerboseWarnings);
- }
- catch (Exception)
- {
- }
-
- if (method != null)
- {
- codeLocations.Add((new InstructionPointerRange(e.MethodStartAddress, e.MethodSize), method));
- }
- }
- foreach (var e in p.EventsInProcess.ByEventType<MethodLoadUnloadVerboseTraceData>())
- {
- if (e.ClrInstanceID != clrInstanceId.Value)
- {
- continue;
- }
-
- MethodDesc method = null;
- try
- {
- method = idParser.ResolveMethodID(e.MethodID, commandLineOptions.VerboseWarnings);
- }
- catch (Exception)
- {
- }
-
- if (method != null)
- {
- codeLocations.Add((new InstructionPointerRange(e.MethodStartAddress, e.MethodSize), method));
- }
- }
-
- var sigProvider = new R2RSignatureTypeProvider(tsc);
- foreach (var module in p.LoadedModules)
- {
- if (module.FilePath == "")
- continue;
-
- if (!File.Exists(module.FilePath))
- continue;
-
- try
- {
- byte[] image = File.ReadAllBytes(module.FilePath);
- using (FileStream fstream = new FileStream(module.FilePath, FileMode.Open, FileAccess.Read, FileShare.Read))
- {
- var r2rCheckPEReader = new System.Reflection.PortableExecutable.PEReader(fstream, System.Reflection.PortableExecutable.PEStreamOptions.LeaveOpen);
-
- if (!ILCompiler.Reflection.ReadyToRun.ReadyToRunReader.IsReadyToRunImage(r2rCheckPEReader))
- continue;
- }
-
- var reader = new ILCompiler.Reflection.ReadyToRun.ReadyToRunReader(tsc, module.FilePath);
- foreach (var methodEntry in reader.GetCustomMethodToRuntimeFunctionMapping<TypeDesc, MethodDesc, R2RSigProviderContext>(sigProvider))
- {
- foreach (var runtimeFunction in methodEntry.Value.RuntimeFunctions)
- {
- codeLocations.Add((new InstructionPointerRange(module.ImageBase + (ulong)runtimeFunction.StartAddress, runtimeFunction.Size), methodEntry.Key));
- }
- }
- }
- catch { }
- }
-
- InstructionPointerRange[] instructionPointerRanges = new InstructionPointerRange[codeLocations.Count];
- MethodDesc[] methods = new MethodDesc[codeLocations.Count];
- for (int i = 0; i < codeLocations.Count; i++)
- {
- instructionPointerRanges[i] = codeLocations[i].Item1;
- methods[i] = codeLocations[i].Item2;
- }
-
- Array.Sort(instructionPointerRanges, methods);
+ MethodMemoryMap mmap = GetMethodMemMap();
foreach (var e in p.EventsInProcess.ByEventType<SampledProfileTraceData>())
{
var callstack = e.CallStack();
@@ -826,12 +973,12 @@ namespace Microsoft.Diagnostics.Tools.Pgo
continue;
ulong address1 = callstack.CodeAddress.Address;
- MethodDesc topOfStackMethod = LookupMethodByAddress(address1);
+ MethodDesc topOfStackMethod = mmap.GetMethod(address1);
MethodDesc nextMethod = null;
if (callstack.Caller != null)
{
ulong address2 = callstack.Caller.CodeAddress.Address;
- nextMethod = LookupMethodByAddress(address2);
+ nextMethod = mmap.GetMethod(address2);
}
if (topOfStackMethod != null)
@@ -875,45 +1022,6 @@ namespace Microsoft.Diagnostics.Tools.Pgo
}
}
}
-
- MethodDesc LookupMethodByAddress(ulong address)
- {
- int index = Array.BinarySearch(instructionPointerRanges, new InstructionPointerRange(address, 1));
-
- if (index >= 0)
- {
- return methods[index];
- }
- else
- {
- index = ~index;
- if (index >= instructionPointerRanges.Length)
- return null;
-
- if (instructionPointerRanges[index].StartAddress < address)
- {
- if (instructionPointerRanges[index].EndAddress > address)
- {
- return methods[index];
- }
- }
-
- if (index == 0)
- return null;
-
- index--;
-
- if (instructionPointerRanges[index].StartAddress < address)
- {
- if (instructionPointerRanges[index].EndAddress > address)
- {
- return methods[index];
- }
- }
-
- return null;
- }
- }
}
Dictionary<MethodDesc, MethodChunks> instrumentationDataByMethod = new Dictionary<MethodDesc, MethodChunks>();
@@ -965,6 +1073,226 @@ namespace Microsoft.Diagnostics.Tools.Pgo
}
}
+ Dictionary<MethodDesc, SampleProfile> sampleProfiles = new Dictionary<MethodDesc, SampleProfile>();
+ if (commandLineOptions.Spgo)
+ {
+ MethodMemoryMap mmap = GetMethodMemMap();
+ Dictionary<MethodDesc, MethodIL> ils = new Dictionary<MethodDesc, MethodIL>();
+ Dictionary<MethodDesc, FlowGraph> flowGraphs = new Dictionary<MethodDesc, FlowGraph>();
+
+ MethodIL GetMethodIL(MethodDesc desc)
+ {
+ if (!ils.TryGetValue(desc, out MethodIL il))
+ {
+ il = desc switch
+ {
+ EcmaMethod em => EcmaMethodIL.Create(em),
+ var m => new InstantiatedMethodIL(m, EcmaMethodIL.Create((EcmaMethod)m.GetTypicalMethodDefinition())),
+ };
+
+ ils.Add(desc, il);
+ }
+
+ return il;
+ }
+
+ FlowGraph GetFlowGraph(MethodDesc desc)
+ {
+ if (!flowGraphs.TryGetValue(desc, out FlowGraph fg))
+ {
+ flowGraphs.Add(desc, fg = FlowGraph.Create(GetMethodIL(desc)));
+ }
+
+ return fg;
+ }
+
+ Guid lbrGuid = Guid.Parse("99134383-5248-43fc-834b-529454e75df3");
+ bool hasLbr = traceLog.Events.Any(e => e.TaskGuid == lbrGuid);
+
+ if (!hasLbr)
+ {
+ // No LBR data, use standard IP samples. First convert each sample to a tuple of (Method, raw IP, IL offset).
+ (MethodDesc Method, ulong IP, int Offset) GetTuple(SampledProfileTraceData e)
+ {
+ MemoryRegionInfo info = mmap.GetInfo(e.InstructionPointer);
+ if (info == null)
+ return (null, e.InstructionPointer, -1);
+
+ int offset = info.NativeToILMap?.Lookup(checked((uint)(e.InstructionPointer - info.StartAddress))) ?? -1;
+ return (info.Method, e.InstructionPointer, offset);
+ }
+
+ var samples =
+ p.EventsInProcess.ByEventType<SampledProfileTraceData>()
+ .Select(GetTuple)
+ .Where(t => t.Method != null && t.Offset >= 0)
+ .ToList();
+
+ // Now find all samples in each method.
+ foreach (var g in samples.GroupBy(t => t.Method))
+ {
+ // SPGO is quite sensitive to low sample counts, so skip generating SPGO data for methods with too few samples.
+ if (g.Count() < commandLineOptions.SpgoMinSamples)
+ continue;
+
+ MethodIL il = GetMethodIL(g.Key);
+ SampleProfile sp = SampleProfile.Create(il, GetFlowGraph(g.Key), g.Select(t => t.Offset));
+ sampleProfiles.Add(g.Key, sp);
+ }
+ }
+ else
+ {
+ // We have LBR data. We use the LBR data to collect straight-line runs that the CPU did in this process inside managed methods.
+ // That is, if we first see a branch from A -> B followed by a branch from C -> D, then we can conclude that the CPU executed
+ // code from B -> C. We call this a 'run' and collect each run and its multiplicity.
+ // Later, we will find all IL offsets on this path and assign samples to the distinct basic blocks corresponding to those IL offsets.
+ Dictionary<(ulong startRun, ulong endRun), long> runs = new Dictionary<(ulong startRun, ulong endRun), long>();
+ List<(ulong start, ulong end)> lbrRuns = new List<(ulong start, ulong end)>();
+ LbrEntry64[] lbr64Arr = null;
+ foreach (var e in traceLog.Events)
+ {
+ if (e.TaskGuid != lbrGuid)
+ continue;
+
+ // Opcode is always 32 for the LBR event.
+ if (e.Opcode != (TraceEventOpcode)32)
+ continue;
+
+ unsafe
+ {
+ Span<LbrEntry64> lbr;
+ if (traceLog.PointerSize == 4)
+ {
+ // For 32-bit machines we convert the data into a 64-bit format first.
+ LbrTraceEventData32* data = (LbrTraceEventData32*)e.DataStart;
+ if (data->ProcessId != p.ProcessID)
+ continue;
+
+ Span<LbrEntry32> lbr32 = data->Entries(e.EventDataLength);
+ if (lbr64Arr == null || lbr64Arr.Length < lbr32.Length)
+ lbr64Arr = new LbrEntry64[lbr32.Length];
+
+ for (int i = 0; i < lbr32.Length; i++)
+ {
+ ref LbrEntry64 entry = ref lbr64Arr[i];
+ entry.FromAddress = lbr32[i].FromAddress;
+ entry.ToAddress = lbr32[i].ToAddress;
+ entry.Reserved = lbr32[i].Reserved;
+ }
+
+ lbr = lbr64Arr[0..lbr32.Length];
+ }
+ else
+ {
+ Trace.Assert(traceLog.PointerSize == 8, $"Unexpected PointerSize {traceLog.PointerSize}");
+
+ LbrTraceEventData64* data = (LbrTraceEventData64*)e.DataStart;
+ // TODO: The process ID check is not sufficient as PIDs can be reused, so we need to use timestamps too,
+ // but we do not have access to PerfView functions to convert it. Hopefully TraceEvent will handle this
+ // for us in the future.
+ if (data->ProcessId != p.ProcessID)
+ continue;
+
+ lbr = data->Entries(e.EventDataLength);
+ }
+
+ // Store runs. LBR is chronological with most recent branches first.
+ // To avoid double-counting blocks containing calls when the LBR buffer contains
+ // both the call and the return from the call, we need some extra bookkeeping
+ // when seeing cross-function branches, so we first collect runs into a temporary list
+ // and only then merge them into the global dictionary.
+ lbrRuns.Clear();
+ for (int i = lbr.Length - 2; i >= 0; i--)
+ {
+ ulong prevFrom = lbr[i + 1].FromAddress;
+ ulong prevTo = lbr[i + 1].ToAddress;
+ ulong curFrom = lbr[i].FromAddress;
+ MemoryRegionInfo prevFromMeth = methodMemMap.GetInfo(prevFrom);
+ MemoryRegionInfo prevToMeth = methodMemMap.GetInfo(prevTo);
+ MemoryRegionInfo curFromMeth = methodMemMap.GetInfo(curFrom);
+ // If this run is not in the same function then ignore it.
+ if (prevToMeth == null || prevToMeth != curFromMeth)
+ continue;
+
+ // Otherwise, if this run follows right after jumping back into the function, we might need to extend
+ // a previous run instead. This happens if we previously did a call out of this function and now returned back.
+ // TODO: Handle recursion here. The same function could return to itself and we wouldn't realize it from this check.
+ if (prevFromMeth != prevToMeth)
+ {
+ bool extendedPrevRun = false;
+ // Try to find a previous run. Iterate in reverse to simulate stack behavior of calls.
+ FlowGraph toFG = null;
+ for (int j = lbrRuns.Count - 1; j >= 0; j--)
+ {
+ MemoryRegionInfo endRunMeth = methodMemMap.GetInfo(lbrRuns[j].end);
+ if (endRunMeth != prevToMeth)
+ continue;
+
+ // Same function at least, check for same basic block
+ toFG ??= GetFlowGraph(endRunMeth.Method);
+ BasicBlock endRunBB = toFG.Lookup(endRunMeth.NativeToILMap.Lookup((uint)(lbrRuns[j].end - endRunMeth.StartAddress)));
+ BasicBlock toBB = toFG.Lookup(endRunMeth.NativeToILMap.Lookup((uint)(prevTo - endRunMeth.StartAddress)));
+ if (endRunBB == toBB && prevTo > lbrRuns[j].end)
+ {
+ // Same BB and the jump is to after where the previous run ends. Take that as a return to after that call and extend the previous run.
+ lbrRuns[j] = (lbrRuns[j].start, curFrom);
+ extendedPrevRun = true;
+ break;
+ }
+ }
+
+ if (extendedPrevRun)
+ continue;
+ }
+
+ lbrRuns.Add((prevTo, curFrom));
+ }
+
+ // Now insert runs.
+ foreach (var pair in lbrRuns)
+ {
+ if (runs.TryGetValue(pair, out long count))
+ runs[pair] = count + 1;
+ else
+ runs.Add(pair, 1);
+ }
+ }
+ }
+
+ // Group runs by memory region info, which corresponds to each .NET method.
+ var groupedRuns =
+ runs
+ .Select(r => (start: r.Key.startRun, end: r.Key.endRun, count: r.Value, info: methodMemMap.GetInfo(r.Key.startRun)))
+ .GroupBy(t => t.info);
+
+ foreach (var g in groupedRuns)
+ {
+ if (g.Key == null || g.Key.NativeToILMap == null)
+ continue;
+
+ // Collect relative IPs of samples. Note that we cannot translate the end-points of runs from IPs to IL offsets
+ // as we cannot assume that a straight-line execution between two IPs corresponds to a straight-line execution between
+ // two IL offsets. SampleProfile.CreateFromLbr will be responsible for assigning samples based on the relative IPs,
+ // the IP<->IL mapping and the flow graph.
+ List<(uint start, uint end, long count)> samples =
+ g
+ .Where(t => t.end >= t.start && t.end < g.Key.EndAddress)
+ .Select(t => ((uint)(t.start - g.Key.StartAddress), (uint)(t.end - g.Key.StartAddress), t.count))
+ .ToList();
+
+ if (samples.Sum(t => t.count) < commandLineOptions.SpgoMinSamples)
+ continue;
+
+ SampleProfile ep = SampleProfile.CreateFromLbr(
+ GetMethodIL(g.Key.Method),
+ GetFlowGraph(g.Key.Method),
+ g.Key.NativeToILMap,
+ samples);
+
+ sampleProfiles.Add(g.Key.Method, ep);
+ }
+ }
+ }
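As a simplified aside on the run collection above: the LBR buffer is ordered most recent first, so consecutive entries imply the CPU executed straight-line code from the older branch's target to the newer branch's source. The sketch below (hypothetical names; it omits the call/return extension and recursion handling the real code attempts) shows the core idea:

```python
def collect_runs(lbr, method_of):
    """lbr: (from_addr, to_addr) pairs, most recent branch first.
    method_of: maps an address to a method id, or None if unknown."""
    runs = {}
    # Walk from oldest pair to newest, pairing each branch with its predecessor.
    for i in range(len(lbr) - 2, -1, -1):
        prev_to = lbr[i + 1][1]   # target of the older branch
        cur_from = lbr[i][0]      # source of the newer branch
        m = method_of(prev_to)
        if m is None or m != method_of(cur_from):
            continue  # cross-function run; the real code tries to extend a prior run
        key = (prev_to, cur_from)
        runs[key] = runs.get(key, 0) + 1
    return runs
```

Each run `(start, end)` with its multiplicity is later translated into per-basic-block sample counts using the IP-to-IL mapping and the flow graph.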
if (commandLineOptions.DisplayProcessedEvents)
{
@@ -1019,6 +1347,42 @@ namespace Microsoft.Diagnostics.Tools.Pgo
var intDecompressor = new PgoProcessor.PgoEncodedCompressedIntParser(instrumentationData, 0);
methodData.InstrumentationData = PgoProcessor.ParsePgoData<TypeSystemEntityOrUnknown>(pgoDataLoader, intDecompressor, true).ToArray();
}
+ else if (sampleProfiles.TryGetValue(methodData.Method, out SampleProfile sp))
+ {
+ IEnumerable<PgoSchemaElem> schema = Enumerable.Empty<PgoSchemaElem>();
+
+ if (commandLineOptions.SpgoIncludeBlockCounts)
+ {
+ schema = schema.Concat(
+ sp.SmoothedSamples
+ .Select(kvp =>
+ new PgoSchemaElem
+ {
+ InstrumentationKind = kvp.Value > uint.MaxValue ? PgoInstrumentationKind.BasicBlockLongCount : PgoInstrumentationKind.BasicBlockIntCount,
+ ILOffset = kvp.Key.Start,
+ Count = 1,
+ DataLong = kvp.Value,
+ }));
+ }
+
+ if (commandLineOptions.SpgoIncludeEdgeCounts)
+ {
+ schema = schema.Concat(
+ sp.SmoothedEdgeSamples
+ .Select(kvp =>
+ new PgoSchemaElem
+ {
+ InstrumentationKind = kvp.Value > uint.MaxValue ? PgoInstrumentationKind.EdgeLongCount : PgoInstrumentationKind.EdgeIntCount,
+ ILOffset = kvp.Key.Item1.Start,
+ Other = kvp.Key.Item2.Start,
+ Count = 1,
+ DataLong = kvp.Value
+ }));
+ }
+
+ methodData.InstrumentationData = schema.ToArray();
+ }
+
methodsUsedInProcess.Add(methodData);
}
}
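As a brief aside on the schema conversion above: each smoothed per-block count becomes one schema record, and the 64-bit count kind is chosen only when the value no longer fits in 32 bits. A hedged Python sketch (the dict shape and kind names here are illustrative stand-ins for `PgoSchemaElem`):

```python
UINT32_MAX = 2**32 - 1

def to_block_schema(smoothed):
    """smoothed: maps a basic block's IL start offset to its smoothed count."""
    return [
        {
            # Widen to the long-count kind only when the value exceeds 32 bits.
            "kind": "BasicBlockLongCount" if count > UINT32_MAX else "BasicBlockIntCount",
            "il_offset": offset,
            "count": count,
        }
        for offset, count in sorted(smoothed.items())
    ]
```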
diff --git a/src/coreclr/tools/dotnet-pgo/SPGO/CirculationGraph.cs b/src/coreclr/tools/dotnet-pgo/SPGO/CirculationGraph.cs
new file mode 100644
index 00000000000..150c2e5c8db
--- /dev/null
+++ b/src/coreclr/tools/dotnet-pgo/SPGO/CirculationGraph.cs
@@ -0,0 +1,237 @@
+// Licensed to the .NET Foundation under one or more agreements.
+// The .NET Foundation licenses this file to you under the MIT license.
+
+using System;
+using System.Collections.Generic;
+using System.Text;
+
+namespace Microsoft.Diagnostics.Tools.Pgo
+{
+ public sealed class CirculationGraph
+ {
+ public List<Node> Nodes;
+ public List<Edge> Edges;
+
+ public CirculationGraph()
+ {
+ this.Nodes = new List<Node>();
+ this.Edges = new List<Edge>();
+ }
+
+ // Limit the stateful interface to adding Nodes; the addition of Edges is implicit.
+ // If all out-edges are added to a node before the node itself is added, the graph's edge lists stay consistent.
+ public void AddNode(Node toAdd)
+ {
+ this.Edges.AddRange(toAdd.OutEdgeList);
+ this.Nodes.Add(toAdd);
+ }
+
+ public void CheckConsistentCirculation()
+ {
+ // First check that flow is within capacities.
+
+ foreach (Edge e in this.Edges)
+ {
+ e.CheckEdgeConsistency();
+ }
+ // Then check that flow in = flow out.
+ // Note that back-edges carry negated flow, so only the positive flow on each edge is counted below.
+
+ foreach (Node n in this.Nodes)
+ {
+ long inFlow = n.NetInFlow();
+ long outFlow = n.NetOutFlow();
+ if (inFlow != outFlow)
+ {
+ throw new Exception(string.Format("Node {0}: Has in-flow of {1} and out-flow of {2}", n.ID, inFlow, outFlow));
+ }
+ }
+ }
+
+ public long TotalCirculationCost()
+ {
+ long totalCost = 0;
+ foreach (Edge e in this.Edges)
+ {
+ totalCost += e.Cost * e.Flow;
+ }
+ // Divide the total by two because back-edges cause double-counting.
+
+ return totalCost / 2;
+ }
+ }
+
+ ///////////////////////////////////////////////////////////////////////////////////////////////////
+ //
+ // class Edge
+ //
+ ///////////////////////////////////////////////////////////////////////////////////////////////////
+
+ public sealed class Edge
+ {
+ static int s_idCounter = 0;
+
+ public Node Source;
+ public Node Target;
+ public Edge BackEdge;
+ public long MinCapacity;
+ public long MaxCapacity;
+ public long Flow;
+ public long Free;
+ public long Cost;
+ public long ID;
+
+ public Edge(Node source, Node target, long minCapacity, long maxCapacity, long cost)
+ {
+ this.Source = source;
+ this.Target = target;
+ this.MinCapacity = minCapacity;
+ this.MaxCapacity = maxCapacity;
+ this.Flow = minCapacity;
+ this.Free = maxCapacity - this.Flow;
+ this.Cost = cost;
+ this.ID = s_idCounter++;
+ this.BackEdge = new Edge(this);
+
+ // Make sure that the source and target Nodes of this edge know of its existence;
+ // This asymmetry is because Node objects will be initialized first in building the graph.
+ this.Target.AddInEdge(this);
+ this.Source.AddOutEdge(this);
+ }
+
+ // Constructor to create backedge.
+ public Edge(Edge backEdge)
+ {
+ this.Source = backEdge.Target;
+ this.Target = backEdge.Source;
+ this.BackEdge = backEdge;
+ this.MinCapacity = -backEdge.MaxCapacity;
+ this.MaxCapacity = -backEdge.MinCapacity;
+ this.Flow = -backEdge.Flow;
+ this.Free = this.MaxCapacity - this.Flow;
+ this.Cost = -backEdge.Cost;
+ this.ID = s_idCounter++;
+
+ this.Target.AddInEdge(this);
+ this.Source.AddOutEdge(this);
+ }
+
+ // Adds flow to the given edge and appropriately modifies backedge, throwing an exception if capacities are violated
+ public void AddFlow(long delta)
+ {
+ if (this.Flow + delta < this.MinCapacity || this.Flow + delta > this.MaxCapacity)
+ {
+ throw new Exception(string.Format("Edge {0}: Tried to assign flow of {1} with capacity range [{2}, {3}]", this.ID, this.Flow + delta, this.MinCapacity, this.MaxCapacity));
+ }
+ this.Flow += delta;
+ this.Free -= delta;
+ this.BackEdge.Flow -= delta;
+ this.BackEdge.Free += delta;
+ }
+
+ // Checks whether flow is within the capacities, and that the backedge is consistent.
+ public void CheckEdgeConsistency()
+ {
+ if (this.Flow < this.MinCapacity || this.Flow > this.MaxCapacity)
+ {
+ throw new Exception(string.Format("Edge {0}: Flow of {1} falls outside of capacity range [{2}, {3}]", this.ID, this.Flow, this.MinCapacity, this.MaxCapacity));
+ }
+
+ if (this.Free != this.MaxCapacity - this.Flow)
+ {
+ throw new Exception(string.Format("Edge {0}: Annotated with {1} free capacity, while it should have {2}", this.ID, this.Free, this.MaxCapacity - this.Flow));
+ }
+
+ if (this.Flow != -this.BackEdge.Flow)
+ {
+ throw new Exception(string.Format("Edge {0}: Has {1} flow while backedge has {2}", this.ID, this.Flow, this.BackEdge.Flow));
+ }
+ }
+
+ public override string ToString() => $"{Source} -> {Target}";
+ }
+
+ ///////////////////////////////////////////////////////////////////////////////////////////////////
+ //
+ // class Node
+ //
+ ///////////////////////////////////////////////////////////////////////////////////////////////////
+
+ public sealed class Node
+ {
+ static int s_idCounter = 0;
+
+ public List<Edge> InEdgeList;
+ public List<Edge> OutEdgeList;
+ public Dictionary<Node, Edge> InEdgeMap;
+ public Dictionary<Node, Edge> OutEdgeMap;
+ public NodeMetaData MetaData;
+ public int ID;
+
+ public Node()
+ {
+ this.InEdgeList = new List<Edge>();
+ this.OutEdgeList = new List<Edge>();
+ this.InEdgeMap = new Dictionary<Node, Edge>();
+ this.OutEdgeMap = new Dictionary<Node, Edge>();
+ this.MetaData = new NodeMetaData();
+ this.ID = s_idCounter++;
+ }
+
+ public void AddInEdge(Edge toAdd)
+ {
+ this.InEdgeList.Add(toAdd);
+ this.InEdgeMap[toAdd.Source] = toAdd; // Key in-edges by the neighboring (source) node.
+ }
+
+ public void AddOutEdge(Edge toAdd)
+ {
+ this.OutEdgeList.Add(toAdd);
+ this.OutEdgeMap[toAdd.Target] = toAdd;
+ }
+
+ public long NetInFlow()
+ {
+ long inFlow = 0;
+ foreach (Edge inEdge in this.InEdgeList)
+ {
+ // Only count positive flow edges to derive net in-flow (every positive edge has a negative back-edge)
+
+ inFlow += Math.Max(0, inEdge.Flow);
+ }
+ return inFlow;
+ }
+
+ public long NetOutFlow()
+ {
+ long outFlow = 0;
+ foreach (Edge outEdge in this.OutEdgeList)
+ {
+ outFlow += Math.Max(0, outEdge.Flow);
+ }
+ return outFlow;
+ }
+
+ public override string ToString() => $"{ID}";
+ }
+
+ ///////////////////////////////////////////////////////////////////////////////////////////////////
+ //
+ // class NodeMetaData
+ //
+ ///////////////////////////////////////////////////////////////////////////////////////////////////
+
+ // This class carries around the meta-data to expedite Bellman-Ford's minimum cost algorithm in MinimumCostCirculation.
+ public sealed class NodeMetaData
+ {
+ public long Distance;
+ public Edge PredEdge;
+
+ public NodeMetaData()
+ {
+ this.Distance = 0;
+ this.PredEdge = null;
+ }
+ }
+}
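As an aside on the `Edge`/back-edge pairing above: every forward edge with capacity `[lo, hi]` is mirrored by a back-edge with capacity `[-hi, -lo]` and negated flow, so pushing flow forward automatically keeps `edge.Flow == -edge.BackEdge.Flow`, the invariant `CheckEdgeConsistency` verifies. A minimal Python sketch of that invariant:

```python
class Edge:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.flow = lo  # flow starts at the minimum capacity
        self.back = None

def make_pair(lo, hi):
    """Forward edge with capacity [lo, hi]; back-edge mirrors it with [-hi, -lo]."""
    e = Edge(lo, hi)
    b = Edge(-hi, -lo)
    e.back, b.back = b, e
    b.flow = -e.flow
    return e

def add_flow(e, delta):
    # Reject the update if it would leave the capacity range.
    if not (e.lo <= e.flow + delta <= e.hi):
        raise ValueError("capacity violated")
    e.flow += delta
    e.back.flow -= delta  # keep the mirrored invariant
```

Negated-capacity back-edges are what let the min-cost circulation search "undo" flow along an edge without special-casing direction.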
diff --git a/src/coreclr/tools/dotnet-pgo/SPGO/FlowSmoothing.cs b/src/coreclr/tools/dotnet-pgo/SPGO/FlowSmoothing.cs
new file mode 100644
index 00000000000..9c5613c71e5
--- /dev/null
+++ b/src/coreclr/tools/dotnet-pgo/SPGO/FlowSmoothing.cs
@@ -0,0 +1,479 @@
+// Licensed to the .NET Foundation under one or more agreements.
+// The .NET Foundation licenses this file to you under the MIT license.
+
+/********
+ * This class handles smoothing over a circulation graph to be consistent and cost-minimal.
+ *
+ * A circulation graph consists of nodes v, directed edges e, and two functions on the edges:
+ *
+ * cost(e) = the cost of each positive unit of flow on the edge
+ * capacity(e) = the range of possible values of flow on the edge
+ *
+ * where flow is a function on the edges such that, for every node, the flow on in-edges adds up
+ * to the flow on out-edges.
+ *
+ * The objective of this class's main function (SmoothFlowGraph) is to take an inconsistent count of
+ * each node's net flow and map it onto a consistent circulation. This circulation is constructed to map
+ * back onto a consistent flow, and when a minimum cost circulation is found (by using a call to
+ * MinimumCostCirculation.FindMinCostCirculation), the flow it maps back onto will also minimize
+ * a cost metric. In other words, the parameter 'Func<T, bool, long> costFunction' assigns to each
+ * node T a cost to increasing its net flow (when the bool is true) and a cost to decreasing its
+ * net flow (when the bool is false.) SmoothFlowGraph then constructs a consistent circulation whose
+ * cost will be minimized exactly when the cost of changing the net flows of the blocks is minimized.
+ *
+ * The translation is outlined in detail in Section 4 of "Complementing Incomplete Edge Profile by applying
+ * Minimum Cost Circulation Algorithms" (Levin 2007)
+ ********/
+
+using System;
+using System.Collections.Generic;
+using System.Diagnostics;
+using System.Linq;
+using System.Text;
+
+namespace Microsoft.Diagnostics.Tools.Pgo
+{
+ public class FlowSmoothing<T>
+ {
+ public Dictionary<T, long> NodeResults = new Dictionary<T, long>();
+ public Dictionary<(T, T), long> EdgeResults = new Dictionary<(T, T), long>();
+
+ Dictionary<T, long> m_sampleData;
+ T m_startBlock;
+ Func<T, HashSet<T>> m_successorFunction;
+ Func<T, bool, long> m_costFunction;
+
+ public FlowSmoothing(Dictionary<T, long> sampleData, T startBlock, Func<T, HashSet<T>> successorFunction, Func<T, bool, long> costFunction)
+ {
+ m_sampleData = sampleData;
+ m_startBlock = startBlock;
+ m_successorFunction = successorFunction;
+ m_costFunction = costFunction;
+ }
+
+ // Using sampleData, the graph given by successor function, and a cost assigned to each node,
+ // find a minimum-cost circulation to get a consistent block count.
+
+ public void Perform(int smoothingIterations = -1)
+ {
+ // Graph to run the circulation on.
+ CirculationGraph graph = new CirculationGraph();
+
+ // Map each concrete block T to a pair of Nodes and an Edge in the circulation graph: entrance, exit, and backedge.
+ Dictionary<T, (Node entrance, Node exit, Edge back)> abstractNodeMap = new Dictionary<T, (Node entrance, Node exit, Edge back)>();
+
+ // Create privileged nodes source and target that will be connected to induce flow.
+ Node source = new Node();
+ Node target = new Node();
+
+ // Sum all the weights so we can properly induce flow from target to source
+ long totalWeight = 0;
+
+ // Now generate the nodes of the graph.
+ foreach (T basicBlock in m_sampleData.Keys)
+ {
+ // ----------------------------------
+ // Create a subgraph structure:
+ // ( ENTRY )
+ // cost = c- | | c+
+ //cap. = [w,w] ^ v cap. = [0, infty]
+ // flow = w | | f=0
+ // ( EXIT )
+ // ----------------------------------
+ Node entryNode = new Node();
+ Node exitNode = new Node();
+
+ // Make the blockWeight correspond to the counts acquired from sampling.
+ long blockWeight = m_sampleData[basicBlock];
+ totalWeight += blockWeight;
+
+ // Create an edge from source to exit with capacity equal to the block's weight; likewise from entry to target.
+ new Edge(source, exitNode, blockWeight, blockWeight, 0);
+ new Edge(entryNode, target, blockWeight, blockWeight, 0);
+
+ // Add the edges for node-splitting with costs given by costFunction
+ new Edge(entryNode, exitNode, 0, long.MaxValue, m_costFunction(basicBlock, true));
+ Edge backEdge = new Edge(exitNode, entryNode, 0, blockWeight, m_costFunction(basicBlock, false));
+ backEdge.AddFlow(blockWeight);
+
+ // Create the entry for basicBlock in abstractNodeMap.
+ abstractNodeMap.Add(basicBlock, (entryNode, exitNode, backEdge));
+ }
+ // Create edges from the exit node of each subgraph to the entry of subgraphs corresponding to the concrete block's successors.
+
+ Dictionary<(T, T), Edge> abstractEdgeMap = new Dictionary<(T, T), Edge>();
+
+ foreach (T predecessorBlock in abstractNodeMap.Keys)
+ {
+ foreach (T successorBlock in m_successorFunction(predecessorBlock))
+ {
+ Node predecessor = abstractNodeMap[predecessorBlock].exit;
+ Node successor = abstractNodeMap[successorBlock].entrance;
+ Edge newEdge = new Edge(predecessor, successor, 0, long.MaxValue, 0);
+ abstractEdgeMap[(predecessorBlock, successorBlock)] = newEdge;
+ }
+ if (m_successorFunction(predecessorBlock).Count == 0)
+ {
+ new Edge(abstractNodeMap[predecessorBlock].exit, abstractNodeMap[m_startBlock].entrance, 0, long.MaxValue, 0);
+ }
+ }
+ // Add the entrance/exit nodes, as well as the source and target nodes, to the graph.
+
+ foreach (T basicBlock in abstractNodeMap.Keys)
+ {
+ graph.AddNode(abstractNodeMap[basicBlock].entrance);
+ graph.AddNode(abstractNodeMap[basicBlock].exit);
+ }
+ (new Edge(target, source, 0, long.MaxValue, 0)).AddFlow(totalWeight);
+ graph.AddNode(source);
+ graph.AddNode(target);
+
+ MinimumCostCirculation.FindMinCostCirculation(graph, smoothingIterations);
+
+ // Derive the new concrete block hit counts by subtracting the backflow from the inflow of the entry node.
+ foreach (T concreteNode in abstractNodeMap.Keys)
+ {
+ long entryNodeFlow = abstractNodeMap[concreteNode].entrance.NetInFlow();
+ long backEdgeFlow = abstractNodeMap[concreteNode].back.Flow;
+ NodeResults[concreteNode] = entryNodeFlow - backEdgeFlow;
+ }
+ // Now log all the edge values back into the EdgeResults dictionary.
+
+ foreach (var concreteEdge in abstractEdgeMap.Keys)
+ {
+ EdgeResults[concreteEdge] = abstractEdgeMap[concreteEdge].Flow;
+ }
+
+ MakeGraphFeasible();
+ CheckGraphConsistency();
+ }
+
+ // Helper function to perform parametric mapping on the NodeResults dictionary.
+ public Dictionary<T, S> MapNodes<S>(Func<T, long, S> transformation)
+ {
+ Dictionary<T, S> results = new Dictionary<T, S>();
+
+ foreach (T node in NodeResults.Keys)
+ {
+ results[node] = transformation(node, NodeResults[node]);
+ }
+
+ return results;
+ }
+
+ // Helper function to perform parametric mapping on the EdgeResults dictionary.
+ public Dictionary<(T, T), S> MapEdges<S>(Func<(T, T), long, S> transformation)
+ {
+ Dictionary<(T, T), S> results = new Dictionary<(T, T), S>();
+
+ foreach (var edge in EdgeResults.Keys)
+ {
+ results[edge] = transformation(edge, EdgeResults[edge]);
+ }
+
+ return results;
+ }
+
+ // Current "hacky" function to ensure that the profile counts are feasible in some execution.
+ // Looks for blocks with non-zero counts that are not connected to the start and end by positive counts.
+ // By invariants of the smoothing algorithm, such blocks necessarily form strongly-connected components before this function runs.
+ // Then, perform a DFS from the start block to any block of such a strongly-connected component; once found, light up
+ // that path with incremented counts; repeat the same from that block to any exit block.
+ // There are several invariants that must hold for this to work; as such, this is provisional but seems to work quickly and adequately
+ // without error.
+ public void MakeGraphFeasible()
+ {
+ // Keep a HashSet of which blocks are accessible from m_startBlock, traveling only over positive edges.
+
+ HashSet<T> reachableFromStart = new HashSet<T>();
+ Queue<T> toExamine = new Queue<T>();
+ reachableFromStart.Add(m_startBlock);
+ toExamine.Enqueue(m_startBlock);
+
+ // Perform a BFS to populate reachableFromStart; use toExamine as the auxiliary data structure.
+
+ while (toExamine.Count > 0)
+ {
+ T predBlock = toExamine.Dequeue();
+
+ foreach (T succBlock in m_successorFunction(predBlock))
+ {
+ if (EdgeResults[(predBlock, succBlock)] > 0 && !reachableFromStart.Contains(succBlock))
+ {
+ reachableFromStart.Add(succBlock);
+ toExamine.Enqueue(succBlock);
+ }
+ }
+ }
+
+ // Iterate over each block, checking for the conditions of needing a path "lighted up"
+
+ foreach (T block in m_sampleData.Keys)
+ {
+ if (NodeResults[block] > 0 && !reachableFromStart.Contains(block))
+ {
+ HashSet<T> stronglyConnectedComponent = new HashSet<T>();
+ stronglyConnectedComponent.Add(block);
+ toExamine.Enqueue(block);
+
+ // Build a set containing the strongly connected component. It is possible that it is now connected
+ // due to another "lighting up" iteration, but this case is ignored at present and does not affect feasibility.
+
+ while (toExamine.Count > 0)
+ {
+ T predBlock = toExamine.Dequeue();
+
+ foreach (T succBlock in m_successorFunction(predBlock))
+ {
+ if (EdgeResults[(predBlock, succBlock)] > 0 && !stronglyConnectedComponent.Contains(succBlock))
+ {
+ stronglyConnectedComponent.Add(succBlock);
+ toExamine.Enqueue(succBlock);
+ }
+ }
+ }
+
+ // Now perform a search from the start to the component and from the component to the end.
+ // Increment zero edges along the way along with their two ends.
+ // For now use DFS.
+
+ HashSet<T> visited = new HashSet<T>();
+ Stack<T> trace = new Stack<T>();
+
+ visited.Add(m_startBlock);
+ trace.Push(m_startBlock);
+
+ while (!stronglyConnectedComponent.Contains(trace.Peek()))
+ {
+ bool foundSuccessor = false;
+ foreach (T succBlock in m_successorFunction(trace.Peek()))
+ {
+ if (!visited.Contains(succBlock))
+ {
+ visited.Add(succBlock);
+ trace.Push(succBlock);
+ foundSuccessor = true;
+ break;
+ }
+ }
+
+ if (!foundSuccessor)
+ {
+ trace.Pop();
+ }
+ }
+
+ // Exhaust stack, "lighting up" path on the way through.
+
+ T destination = trace.Peek();
+ while (trace.Count > 1)
+ {
+ T succBlock = trace.Pop();
+ T predBlock = trace.Peek();
+ EdgeResults[(predBlock, succBlock)]++;
+ NodeResults[succBlock]++;
+ reachableFromStart.Add(predBlock);
+ }
+
+ NodeResults[m_startBlock]++;
+ reachableFromStart.UnionWith(stronglyConnectedComponent);
+
+ // Repeat similar for any node in the strongly connected component to an end block.
+ // Start from the very end of the last path.
+
+ visited = new HashSet<T>();
+ trace = new Stack<T>();
+
+ visited.Add(destination);
+ trace.Push(destination);
+
+ while (trace.Count > 0 && m_successorFunction(trace.Peek()).Count > 0)
+ {
+ bool foundSuccessor = false;
+ foreach (T succBlock in m_successorFunction(trace.Peek()))
+ {
+ if (!visited.Contains(succBlock))
+ {
+ visited.Add(succBlock);
+ trace.Push(succBlock);
+ foundSuccessor = true;
+ break;
+ }
+ }
+
+ if (!foundSuccessor)
+ {
+ trace.Pop();
+ }
+ }
+
+ if (trace.Count == 0)
+ {
+ Console.WriteLine("WARNING: No trace found to an exit node; lighting up all visited blocks");
+ foreach (T predBlock in visited)
+ {
+ foreach (T succBlock in m_successorFunction(predBlock))
+ {
+ if (visited.Contains(succBlock))
+ {
+ EdgeResults[(predBlock, succBlock)]++;
+ NodeResults[succBlock]++;
+ reachableFromStart.Add(predBlock);
+ }
+ }
+ }
+ }
+ else
+ {
+ while (trace.Count > 1)
+ {
+ T succBlock = trace.Pop();
+ T predBlock = trace.Peek();
+ EdgeResults[(predBlock, succBlock)]++;
+ NodeResults[succBlock]++;
+ reachableFromStart.Add(predBlock);
+ }
+ }
+ }
+ }
+ }
+
+ // For now, checks that the flow constraints hold and that the entry block has a count if any block has a count.
+ // Writes a descriptive warning to the console if an inconsistency is found.
+ public void CheckGraphConsistency()
+ {
+ // Logs the in-flow for each node from all of its in-edges.
+ Dictionary<T, long> inFlow = new Dictionary<T, long>();
+ long totalFlow = 0;
+
+ // Initialize the Dictionary.
+
+ foreach (T node in NodeResults.Keys)
+ {
+ inFlow[node] = 0;
+ }
+
+ foreach (T predNode in NodeResults.Keys)
+ {
+ long outFlow = 0;
+ long flow;
+
+ foreach (T succNode in m_successorFunction(predNode))
+ {
+ flow = EdgeResults[(predNode, succNode)];
+ inFlow[succNode] += flow;
+ outFlow += flow;
+ totalFlow += flow;
+ }
+ // Directs all flow to the entry node if there are no successors.
+
+ if (m_successorFunction(predNode).Count == 0)
+ {
+ flow = NodeResults[predNode];
+ inFlow[m_startBlock] += flow;
+ outFlow += flow;
+ totalFlow += flow;
+ }
+ // Checks for the condition that the node emits as much flow as recorded by NodeResults
+
+ if (NodeResults[predNode] != outFlow)
+ {
+ Console.WriteLine(string.Format("WARNING: Node's count is {0}, but emits {1} flow to its successors", NodeResults[predNode], outFlow));
+ }
+ }
+ // Now check that the inFlow of each node adds up correctly.
+
+ foreach (T node in NodeResults.Keys)
+ {
+ if (NodeResults[node] != inFlow[node])
+ {
+ Console.WriteLine(string.Format("WARNING: Node's count is {0}, but accepts {1} from its predecessors", NodeResults[node], inFlow[node]));
+ }
+ }
+ // Preliminary check that the start node has positive count as long as any node in the graph has positive count.
+
+ if (NodeResults[m_startBlock] == 0 && totalFlow > 0)
+ {
+ Console.WriteLine("WARNING: Graph has positive flow somewhere but zero flow at the entry");
+ }
+ // Check in more detail whether the graph is feasible. That is, whether for every block with a non-zero count there is a
+ // positive-count trace from the start through that block to an exit node. First check accessibility from the start using DFS.
+
+ HashSet<T> accessibleFromStart = new HashSet<T>();
+ Stack<T> toSee = new Stack<T>();
+ toSee.Push(m_startBlock);
+ accessibleFromStart.Add(m_startBlock);
+
+ while (toSee.Count > 0)
+ {
+ T predNode = toSee.Pop();
+ foreach (T succNode in m_successorFunction(predNode))
+ {
+ if (EdgeResults[(predNode, succNode)] > 0 && !accessibleFromStart.Contains(succNode))
+ {
+ accessibleFromStart.Add(succNode);
+ toSee.Push(succNode);
+ }
+ }
+ }
+
+ foreach (T node in NodeResults.Keys)
+ {
+ if (NodeResults[node] > 0 && !accessibleFromStart.Contains(node))
+ {
+ Console.WriteLine("WARNING: Node has positive count but is not accessible from the start");
+ }
+ }
+ // Now, check for blocks that are hit but don't lead to exit nodes.
+ // First, reverse the direction of the successor function and find the exit nodes.
+
+ Dictionary<T, List<T>> predMap = new Dictionary<T, List<T>>();
+ Stack<T> exitableNodes = new Stack<T>();
+ HashSet<T> accessibleFromExit = new HashSet<T>();
+
+ // Initialize the predecessor map.
+
+ foreach (T node in NodeResults.Keys)
+ {
+ predMap[node] = new List<T>();
+ }
+
+ foreach (T predNode in NodeResults.Keys)
+ {
+ foreach (T succNode in m_successorFunction(predNode))
+ {
+ predMap[succNode].Add(predNode);
+ }
+ if (m_successorFunction(predNode).Count == 0)
+ {
+ exitableNodes.Push(predNode);
+ accessibleFromExit.Add(predNode);
+ }
+ }
+
+ // Then, produce the exit-able blocks by DFS.
+
+ while (exitableNodes.Count > 0)
+ {
+ T succNode = exitableNodes.Pop();
+ foreach (T predNode in predMap[succNode])
+ {
+ if (EdgeResults[(predNode, succNode)] > 0 && !accessibleFromExit.Contains(predNode))
+ {
+ exitableNodes.Push(predNode);
+ accessibleFromExit.Add(predNode);
+ }
+ }
+ }
+ // Finally, iterate over all the nodes and check that if they have positive count, they are accessible from the exit.
+
+ foreach (T node in NodeResults.Keys)
+ {
+ if (NodeResults[node] > 0 && !accessibleFromExit.Contains(node))
+ {
+ Console.WriteLine("WARNING: Node has positive count but does not reach an exit node");
+ }
+ }
+ }
+ }
+}
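The consistency check in CheckGraphConsistency above boils down to comparing each node's count against the flow carried by its incident edges, with exit blocks feeding their flow back to the start. A minimal Python sketch of the same idea (the data-structure shapes here are assumptions for illustration, not the dotnet-pgo types):

```python
def check_flow_conservation(node_counts, edge_counts, successors, start):
    """Collect warnings where a node's count disagrees with its edge flow.

    node_counts: {node: count}; edge_counts: {(pred, succ): count} for every
    successor pair; successors: {node: set of successors}. Exit blocks (no
    successors) route their count back to `start`, mirroring the scheme above.
    """
    warnings = []
    in_flow = {n: 0 for n in node_counts}
    for pred in node_counts:
        out_flow = 0
        for succ in successors.get(pred, ()):
            f = edge_counts[(pred, succ)]
            in_flow[succ] += f
            out_flow += f
        if not successors.get(pred):
            # Exit node: all of its flow returns to the entry block.
            in_flow[start] += node_counts[pred]
            out_flow += node_counts[pred]
        if node_counts[pred] != out_flow:
            warnings.append((pred, "out", node_counts[pred], out_flow))
    for n in node_counts:
        if node_counts[n] != in_flow[n]:
            warnings.append((n, "in", node_counts[n], in_flow[n]))
    return warnings
```

On a consistent graph this returns an empty list; any mismatch produces one tuple per violated constraint, much like the console warnings above.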
diff --git a/src/coreclr/tools/dotnet-pgo/SPGO/LbrEntry.cs b/src/coreclr/tools/dotnet-pgo/SPGO/LbrEntry.cs
new file mode 100644
index 00000000000..5f01ddfab8b
--- /dev/null
+++ b/src/coreclr/tools/dotnet-pgo/SPGO/LbrEntry.cs
@@ -0,0 +1,80 @@
+// Licensed to the .NET Foundation under one or more agreements.
+// The .NET Foundation licenses this file to you under the MIT license.
+
+using System;
+using System.Collections.Generic;
+using System.Linq;
+using System.Runtime.CompilerServices;
+using System.Runtime.InteropServices;
+using System.Text;
+using System.Threading.Tasks;
+
+namespace Microsoft.Diagnostics.Tools.Pgo
+{
+ [StructLayout(LayoutKind.Sequential)]
+ public struct LbrEntry32
+ {
+ public uint FromAddress;
+ public uint ToAddress;
+ public uint Reserved;
+
+ public override string ToString() => $"{FromAddress:x} -> {ToAddress:x}";
+ }
+
+ [StructLayout(LayoutKind.Sequential)]
+ public struct LbrEntry64
+ {
+ public ulong FromAddress;
+ public ulong ToAddress;
+ public ulong Reserved;
+
+ public override string ToString() => $"{FromAddress:x} -> {ToAddress:x}";
+ }
+
+ [Flags]
+ public enum LbrOptionFlags
+ {
+ FilterKernel = 1 << 0,
+ FilterUser = 1 << 1,
+ FilterJcc = 1 << 2,
+ FilterNearRelCall = 1 << 3,
+ FilterNearIndCall = 1 << 4,
+ FilterNearRet = 1 << 5,
+ FilterNearIndJmp = 1 << 6,
+ FilterNearRelJmp = 1 << 7,
+ FilterFarBranch = 1 << 8,
+ CallstackEnable = 1 << 9,
+ }
+
+ [StructLayout(LayoutKind.Sequential)]
+ public unsafe struct LbrTraceEventData32
+ {
+ public ulong TimeStamp;
+ public uint ProcessId;
+ public uint ThreadId;
+ public LbrOptionFlags Options;
+ private LbrEntry32 _entries;
+
+ public Span<LbrEntry32> Entries(int totalSize)
+ {
+ IntPtr entriesOffset = Unsafe.ByteOffset(ref Unsafe.As<LbrTraceEventData32, byte>(ref this), ref Unsafe.As<LbrEntry32, byte>(ref _entries));
+ return MemoryMarshal.CreateSpan(ref _entries, (totalSize - (int)entriesOffset) / sizeof(LbrEntry32));
+ }
+ }
+
+ [StructLayout(LayoutKind.Sequential)]
+ public unsafe struct LbrTraceEventData64
+ {
+ public ulong TimeStamp;
+ public uint ProcessId;
+ public uint ThreadId;
+ public LbrOptionFlags Options;
+ private LbrEntry64 _entries;
+
+ public Span<LbrEntry64> Entries(int totalSize)
+ {
+ IntPtr entriesOffset = Unsafe.ByteOffset(ref Unsafe.As<LbrTraceEventData64, byte>(ref this), ref Unsafe.As<LbrEntry64, byte>(ref _entries));
+ return MemoryMarshal.CreateSpan(ref _entries, (totalSize - (int)entriesOffset) / sizeof(LbrEntry64));
+ }
+ }
+}
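The 64-bit entry layout above is three consecutive unsigned 64-bit fields per record, so a raw run of entries can also be decoded outside the tool. A small Python sketch based only on the struct layout declared in this file (the sample payload in the usage is made up):

```python
import struct
from collections import namedtuple

LbrEntry64 = namedtuple("LbrEntry64", "from_address to_address reserved")

def parse_lbr_entries64(payload: bytes):
    """Decode consecutive LbrEntry64 records: three little-endian u64 fields each."""
    entry_size = struct.calcsize("<QQQ")  # 24 bytes per entry
    count = len(payload) // entry_size
    return [LbrEntry64(*struct.unpack_from("<QQQ", payload, i * entry_size))
            for i in range(count)]
```

This mirrors how `Entries(totalSize)` computes the record count from the remaining payload size divided by the entry size.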
diff --git a/src/coreclr/tools/dotnet-pgo/SPGO/MinimumCostCirculation.cs b/src/coreclr/tools/dotnet-pgo/SPGO/MinimumCostCirculation.cs
new file mode 100644
index 00000000000..8aaaf1c3ba3
--- /dev/null
+++ b/src/coreclr/tools/dotnet-pgo/SPGO/MinimumCostCirculation.cs
@@ -0,0 +1,144 @@
+// Licensed to the .NET Foundation under one or more agreements.
+// The .NET Foundation licenses this file to you under the MIT license.
+
+using System;
+using System.Collections.Generic;
+using System.Linq;
+using System.Text;
+using System.Threading.Tasks;
+
+namespace Microsoft.Diagnostics.Tools.Pgo
+{
+
+ /********
+ * This class finds the minimum-cost circulation on a circulation graph.
+ *
+ * The CirculationGraph object that this acts on seeks to minimize the value of
+ * TotalCirculationCost() while maintaining flow invariants (that the flow into a node
+ * equals the flow out of a node.)
+ *
+ * The standard way to solve this problem is to start with a consistent circulation,
+ * which probably has a non-minimum cost, and then to find cycles whose edges
+ *
+ * (1) all have positive free capacity, and
+ * (2) have a negative sum of costs.
+ *
+ * Then the algorithm will force as much flow around this cycle as possible, thus decreasing
+ * the cost. It is possible to prove that iterating this algorithm until no negative cycles
+ * exist will always find the optimal solution as long as the costs/capacities are integers.
+ *
+ * In this implementation, the Bellman-Ford shortest-path algorithm is used to find
+ * negative cycles (it can detect whether and where a negative cycle exists because it
+ * fails to converge on such graphs: "shortest path" is not well-defined for all pairs
+ * of nodes when a negative cycle is present).
+ *
+ * This algorithm has by far the worst asymptotic runtime complexity in the literature,
+ * but it is very simple to implement. If finding min-cost circulations becomes a
+ * bottleneck, much more efficient algorithms exist.
+ ********/
+
+
+ public class MinimumCostCirculation
+ {
+
+ // Changes graph state into a minimum-cost circulation, if it exists.
+ public static void FindMinCostCirculation(CirculationGraph graph, int smoothingIterations = -1)
+ {
+ int numIterations = 0;
+
+ // Represent a cycle as a tuple of the edges on the cycle and the minimum free capacity.
+ Tuple<List<Edge>, long> cycle = FindNegativeCycle(graph);
+ while (cycle.Item1 != null && numIterations != smoothingIterations)
+ {
+
+ // Force flow equal to the minimum free capacity through all the edges on the negative cycle.
+ foreach (Edge e in cycle.Item1)
+ {
+ e.AddFlow(cycle.Item2);
+ }
+
+ // Ensure that our new flow does not violate any flow conditions.
+ graph.CheckConsistentCirculation();
+ cycle = FindNegativeCycle(graph);
+ numIterations++;
+ }
+ }
+
+ // Returns a negative cycle on the graph, if it exists.
+ // Judicious choice of this cycle is the main way to get asymptotic speed-up.
+ // Current implementation: Application of Bellman-Ford shortest path algorithm.
+ public static Tuple<List<Edge>, long> FindNegativeCycle(CirculationGraph graph)
+ {
+ // First reset the metadata associated with this algorithm.
+
+ foreach (Node n in graph.Nodes)
+ {
+ n.MetaData = new NodeMetaData();
+ }
+ // Only edges with positive free capacity are considered for increasing flow.
+
+ List<Edge> viableEdges = new List<Edge>();
+ foreach (Edge e in graph.Edges)
+ {
+ if (e.Free > 0)
+ {
+ viableEdges.Add(e);
+ }
+ }
+
+ // Iterate Bellman-Ford n-1 times.
+ for (int i = 0; i < graph.Nodes.Count - 1; i++)
+ {
+ foreach (Edge e in viableEdges)
+ {
+ if (e.Target.MetaData.Distance > e.Source.MetaData.Distance + e.Cost)
+ {
+ e.Target.MetaData.Distance = e.Source.MetaData.Distance + e.Cost;
+ e.Target.MetaData.PredEdge = e;
+ }
+ }
+ }
+ // Iterate over all edges one last time to find negative cycles.
+
+ foreach (Edge e in viableEdges)
+ {
+ if (e.Target.MetaData.Distance > e.Source.MetaData.Distance + e.Cost)
+ {
+ return FindBellmanFordCycle(e.Source);
+ }
+ }
+ // Return a null cycle if no negative cycles are found; signals that there are no more negative cycles.
+
+ return Tuple.Create<List<Edge>, long>(null, 0);
+ }
+
+ // After Bellman-Ford is run and a negative cycle is signalled, find that negative cost cycle by traversing the
+ // parent edges until a repeat node is reached.
+ public static Tuple<List<Edge>, long> FindBellmanFordCycle(Node currentNode)
+ {
+ // Set of seen nodes; once there is a repeat we know we are on a cycle.
+
+ HashSet<Node> seenNodes = new HashSet<Node>();
+ while (!seenNodes.Contains(currentNode))
+ {
+ seenNodes.Add(currentNode);
+ currentNode = currentNode.MetaData.PredEdge.Source;
+ }
+ // Now keep traversing up this cycle until a repeat node is found, deriving the edges and min capacity.
+
+ List<Edge> cycleEdges = new List<Edge>();
+ long minCapacity = long.MaxValue;
+ seenNodes = new HashSet<Node>();
+
+ while (!seenNodes.Contains(currentNode))
+ {
+ seenNodes.Add(currentNode);
+ minCapacity = Math.Min(minCapacity, currentNode.MetaData.PredEdge.Free);
+ cycleEdges.Add(currentNode.MetaData.PredEdge);
+ currentNode = currentNode.MetaData.PredEdge.Source;
+ }
+
+ return Tuple.Create<List<Edge>, long>(cycleEdges, minCapacity);
+ }
+ }
+}
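The search performed by FindNegativeCycle can be sketched generically: run Bellman-Ford for n passes with every distance starting at zero, and if the final pass still relaxes an edge, walk predecessor pointers n steps to land on the negative cycle, then collect it. A Python sketch of that pattern (plain tuples instead of the CirculationGraph types):

```python
def find_negative_cycle(nodes, edges):
    """edges: list of (source, target, cost). Returns a list of nodes forming a
    negative-cost cycle, or None if no such cycle exists."""
    dist = {n: 0 for n in nodes}   # all-zero start, as the class above does
    pred = {n: None for n in nodes}
    flagged = None
    for _ in range(len(nodes)):
        flagged = None
        for u, v, c in edges:
            if dist[u] + c < dist[v]:
                dist[v] = dist[u] + c
                pred[v] = u
                flagged = v
    if flagged is None:
        return None                # converged: no negative cycle
    # A relaxation in pass n means a negative cycle; walking pred n times
    # from the relaxed node is guaranteed to land on that cycle.
    node = flagged
    for _ in range(len(nodes)):
        node = pred[node]
    cycle, cur = [node], pred[node]
    while cur != node:
        cycle.append(cur)
        cur = pred[cur]
    return cycle
```

As the comment block above notes, this O(V*E)-per-search approach is simple rather than fast; smarter cycle selection is where asymptotic speed-ups come from.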
diff --git a/src/coreclr/tools/dotnet-pgo/SPGO/NativeToILMap.cs b/src/coreclr/tools/dotnet-pgo/SPGO/NativeToILMap.cs
new file mode 100644
index 00000000000..a61c2484dac
--- /dev/null
+++ b/src/coreclr/tools/dotnet-pgo/SPGO/NativeToILMap.cs
@@ -0,0 +1,80 @@
+// Licensed to the .NET Foundation under one or more agreements.
+// The .NET Foundation licenses this file to you under the MIT license.
+
+using System;
+using System.Collections.Generic;
+using System.Linq;
+using System.Text;
+using System.Threading.Tasks;
+using ILCompiler.Reflection.ReadyToRun;
+using Microsoft.Diagnostics.Tracing.Parsers.Clr;
+
+namespace Microsoft.Diagnostics.Tools.Pgo
+{
+ public class NativeToILMap
+ {
+ // Native offsets in order
+ private uint[] _nativeOffsets;
+ // Map from native offset to IL offset
+ private int[] _ilOffsets;
+
+ public NativeToILMap(uint[] nativeOffsets, int[] ilOffsets)
+ {
+ _nativeOffsets = nativeOffsets;
+ _ilOffsets = ilOffsets;
+ }
+
+ private int LookupIndex(uint rva)
+ {
+ int index = Array.BinarySearch(_nativeOffsets, rva);
+ if (index < 0)
+ index = ~index - 1;
+
+ // If rva precedes the first entry, binary search returns ~0, so index becomes -1.
+ if (index < 0)
+ return -1;
+
+ return index;
+ }
+
+ /// <summary>Look up IL offset associated with block that contains RVA.</summary>
+ public int Lookup(uint rva)
+ => LookupIndex(rva) switch
+ {
+ -1 => -1,
+ int index => _ilOffsets[index]
+ };
+
+ public IEnumerable<int> LookupRange(uint rvaStart, uint rvaEnd)
+ {
+ int start = LookupIndex(rvaStart);
+ if (start < 0)
+ start = 0;
+
+ int end = LookupIndex(rvaEnd);
+ if (end < 0)
+ yield break;
+
+ for (int i = start; i <= end; i++)
+ yield return _ilOffsets[i];
+ }
+
+ internal static NativeToILMap FromR2RBounds(List<DebugInfoBoundsEntry> boundsList)
+ {
+ List<DebugInfoBoundsEntry> sorted = boundsList.OrderBy(e => e.NativeOffset).ToList();
+
+ return new NativeToILMap(sorted.Select(e => e.NativeOffset).ToArray(), sorted.Select(e => (int)e.ILOffset).ToArray());
+ }
+
+ internal static NativeToILMap FromEvent(MethodILToNativeMapTraceData ev)
+ {
+ List<(uint rva, int ilOffset)> pairs = new List<(uint rva, int ilOffset)>(ev.CountOfMapEntries);
+ for (int i = 0; i < ev.CountOfMapEntries; i++)
+ pairs.Add(((uint)ev.NativeOffset(i), ev.ILOffset(i)));
+
+ pairs.RemoveAll(p => p.ilOffset < 0);
+ pairs.Sort((p1, p2) => p1.rva.CompareTo(p2.rva));
+ return new NativeToILMap(pairs.Select(p => p.rva).ToArray(), pairs.Select(p => p.ilOffset).ToArray());
+ }
+ }
+}
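LookupIndex above resolves an RVA to the last mapping entry whose native offset is at or below it, treating `Array.BinarySearch`'s negative complement as "step back one entry". The same lookup in Python using bisect (an illustrative helper, not the tool's API):

```python
import bisect

def lookup_il_offset(native_offsets, il_offsets, rva):
    """native_offsets must be sorted ascending; returns the IL offset of the
    entry covering rva, or -1 if rva precedes the first entry."""
    # bisect_right - 1 selects the last entry with native offset <= rva,
    # matching the ~index - 1 handling of Array.BinarySearch above.
    index = bisect.bisect_right(native_offsets, rva) - 1
    return -1 if index < 0 else il_offsets[index]
```

An exact match lands on its own entry, an in-between RVA falls back to the preceding entry, and anything before the first entry yields -1.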
diff --git a/src/coreclr/tools/dotnet-pgo/SPGO/SampleProfile.cs b/src/coreclr/tools/dotnet-pgo/SPGO/SampleProfile.cs
new file mode 100644
index 00000000000..f60753b0b57
--- /dev/null
+++ b/src/coreclr/tools/dotnet-pgo/SPGO/SampleProfile.cs
@@ -0,0 +1,84 @@
+// Licensed to the .NET Foundation under one or more agreements.
+// The .NET Foundation licenses this file to you under the MIT license.
+
+using System;
+using System.Collections.Generic;
+using System.Diagnostics;
+using System.Linq;
+using System.Runtime.CompilerServices;
+using System.Runtime.InteropServices;
+using System.Text;
+using Internal.IL;
+using Internal.TypeSystem;
+using Internal.TypeSystem.Ecma;
+
+namespace Microsoft.Diagnostics.Tools.Pgo
+{
+ internal class SampleProfile
+ {
+ public SampleProfile(
+ MethodIL methodIL,
+ FlowGraph fg,
+ Dictionary<BasicBlock, long> samples,
+ Dictionary<BasicBlock, long> smoothedSamples,
+ Dictionary<(BasicBlock, BasicBlock), long> smoothedEdgeSamples)
+ {
+ MethodIL = methodIL;
+ FlowGraph = fg;
+ Samples = samples;
+ SmoothedSamples = smoothedSamples;
+ SmoothedEdgeSamples = smoothedEdgeSamples;
+ }
+
+ public MethodIL MethodIL { get; }
+ public FlowGraph FlowGraph { get; }
+ public Dictionary<BasicBlock, long> Samples { get; }
+ public Dictionary<BasicBlock, long> SmoothedSamples { get; }
+ public Dictionary<(BasicBlock, BasicBlock), long> SmoothedEdgeSamples { get; }
+
+ /// <summary>
+ /// Given pairs of runs (as relative IPs in this function), create a sample profile.
+ /// </summary>
+ public static SampleProfile CreateFromLbr(MethodIL il, FlowGraph fg, NativeToILMap map, IEnumerable<(uint fromRva, uint toRva, long count)> runs)
+ {
+ Dictionary<BasicBlock, long> bbSamples = fg.BasicBlocks.ToDictionary(bb => bb, bb => 0L);
+ foreach ((uint from, uint to, long count) in runs)
+ {
+ foreach (BasicBlock bb in map.LookupRange(from, to).Select(fg.Lookup).Distinct())
+ {
+ if (bb != null)
+ bbSamples[bb] += count;
+ }
+ }
+
+ FlowSmoothing<BasicBlock> flowSmooth = new FlowSmoothing<BasicBlock>(bbSamples, fg.Lookup(0), bb => bb.Targets, (bb, isForward) => bb.Size * (isForward ? 1 : 50) + 2);
+ flowSmooth.Perform();
+
+ return new SampleProfile(il, fg, bbSamples, flowSmooth.NodeResults, flowSmooth.EdgeResults);
+ }
+
+ /// <summary>
+ /// Given some IL offset samples into a method, construct a profile.
+ /// </summary>
+ public static SampleProfile Create(MethodIL il, FlowGraph fg, IEnumerable<int> ilOffsetSamples)
+ {
+ // Now associate raw IL-offset samples with basic blocks.
+ Dictionary<BasicBlock, long> bbSamples = fg.BasicBlocks.ToDictionary(bb => bb, bb => 0L);
+ foreach (int ofs in ilOffsetSamples)
+ {
+ if (ofs == -1)
+ continue;
+
+ BasicBlock bb = fg.Lookup(ofs);
+ if (bb != null)
+ bbSamples[bb]++;
+ }
+
+ // Smooth the graph to produce something that satisfies flow conservation.
+ FlowSmoothing<BasicBlock> flowSmooth = new FlowSmoothing<BasicBlock>(bbSamples, fg.Lookup(0), bb => bb.Targets, (bb, isForward) => bb.Size * (isForward ? 1 : 50) + 2);
+ flowSmooth.Perform();
+
+ return new SampleProfile(il, fg, bbSamples, flowSmooth.NodeResults, flowSmooth.EdgeResults);
+ }
+ }
+}
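Before smoothing, Create simply buckets raw IL-offset samples into basic blocks, skipping unmapped (-1) samples. A Python sketch of that attribution step (representing blocks as half-open [start, end) IL ranges is an assumption for illustration; the real code uses the FlowGraph's lookup):

```python
def attribute_samples(blocks, il_offset_samples):
    """blocks: list of (start, end) half-open IL ranges, one per basic block.
    Returns a per-block hit count, dropping samples with no IL mapping."""
    counts = {b: 0 for b in blocks}
    for ofs in il_offset_samples:
        if ofs == -1:
            continue  # sample had no IL mapping
        for (start, end) in blocks:
            if start <= ofs < end:
                counts[(start, end)] += 1
                break
    return counts
```

The resulting per-block counts generally violate flow conservation, which is exactly what the FlowSmoothing pass afterwards repairs.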
diff --git a/src/coreclr/tools/dotnet-pgo/dotnet-pgo.csproj b/src/coreclr/tools/dotnet-pgo/dotnet-pgo.csproj
index aa7fb075b91..daa58a4bcc4 100644
--- a/src/coreclr/tools/dotnet-pgo/dotnet-pgo.csproj
+++ b/src/coreclr/tools/dotnet-pgo/dotnet-pgo.csproj
@@ -29,6 +29,7 @@
<Compile Include="..\aot\ILCompiler.ReadyToRun\IBC\IBCProfileData.cs" Link="IBCProfileData.cs" />
<Compile Include="..\aot\ILCompiler.ReadyToRun\Compiler\ProfileData.cs" Link="ProfileData.cs" />
<Compile Include="..\Common\Pgo\TypeSystemEntityOrUnknown.cs" Link="TypeSystemEntityOrUnknown.cs" />
+ <Compile Include="..\Common\TypeSystem\IL\FlowGraph.cs" Link="SPGO\FlowGraph.cs" />
<Compile Include="..\Common\TypeSystem\IL\ILReader.cs" Link="ILReader.cs" />
<Compile Include="..\Common\TypeSystem\MetadataEmitter\TypeSystemMetadataEmitter.cs" Link="TypeSystemMetadataEmitter" />
</ItemGroup>