# Standalone GC Loader Design

Author: Sean Gillespie (@swgillespie) - 2017

This document aims to provide a specification for how a standalone GC is
to be loaded and what is to happen in the case of version mismatches.

## Definitions

Before diving into the specification, it's useful to precisely define
some terms that will be used often in this document.

* The **Execution Engine**, or **EE** - The component of the CLR responsible for *executing* programs.
  This is an intentionally vague definition. The GC does not care how (or even *if*) programs are
  compiled or executed, so it is up to the EE to invoke the GC whenever an executing
  program does something that requires the GC's attention. The EE is notable because the implementation
  of an execution engine varies widely between runtimes; the CoreRT EE is primarily in managed code
  (C#), while the EE of CoreCLR (and the .NET Framework) is primarily in C++.
* The **GC**, or **Garbage Collector** - The component of the CLR responsible for allocating managed
  objects and reclaiming unused memory. It is written in C++ and the code is shared by multiple runtimes.
  (That is, CoreCLR/CoreRT may have different execution engines, but they share the *same* GC code.)
* The **DAC**, or **Data Access Component** - A subset of the execution engine that is compiled in
  such a way that it can be run *out of process*, when debugging .NET code using a debugger. The DAC
  is used by higher-level components such as SOS (Son of Strike, a windbg/lldb debugger extension for
  debugging managed code) and the DBI (a COM interface). The full details about the DAC are covered in
  [this document](https://github.com/dotnet/runtime/blob/main/docs/design/coreclr/botr/dac-notes.md).

## Rationale and Goals of a Standalone GC

A GC that is "standalone" is one that is able to be built as a dynamic shared library and loaded
dynamically at startup. This definition is useful because it provides a number of benefits
to the codebase:

* A standalone GC that can be loaded dynamically at runtime can also be *substituted* easily by
  loading any GC-containing dynamic shared library specified by a user. This is especially interesting
  for prototyping and testing GC changes since it would be possible to make changes to the GC
  without having to re-compile the runtime.
* A standalone GC that can be *built* as a dynamic shared library imposes a strong requirement that
  the interfaces that the GC uses to interact with other runtime components be complete and
  correct. A standalone GC will not link successfully if it refers to symbols defined within
  the EE. This makes the GC codebase significantly easier to share between different execution
  engine implementations; as long as the GC implements its side of the interface and the EE
  implements its side of the interface, we can expect that changes within the GC itself
  will be more portable to other runtime implementations.

Worth noting is that the JIT (both RyuJIT and the legacy JIT(s) before it) can be built standalone
and has realized these same benefits. The existence of an interface and an implementation loadable
from shared libraries has enabled RyuJIT in particular to be used as the code generator for both the
CoreRT compiler and crossgen, while still being flexible enough to be tested using tools that implement
very non-standard execution engines such as [SuperPMI](https://github.com/dotnet/runtime/blob/main/src/coreclr/ToolBox/superpmi/readme.txt).

The loading protocol below is inspired directly by the JIT loader, and many aspects of the GC loader are identical
to what the JIT does when loading dynamic shared libraries.

## Loading a Standalone GC

Given that it is possible to create a GC that resides in a dynamic shared library, it is important
that the runtime have a protocol for locating and loading such GCs. The JIT is capable of being loaded
in this manner and, because of this, a significant amount of prior art exists for loading components
from shared libraries on the file system. This specification is based heavily on the ways that a
standalone JIT can be loaded.

Fundamentally, the algorithm for loading a standalone GC consists of these steps:

0. Identify whether or not we should be using a standalone GC at all.
1. Identify *where* the standalone GC will be loaded from.
2. Load the dynamic shared library and ask it to identify itself (name and version).
3. Check that the version numbers are compatible.
4. If so, initialize the GC and continue on with EE startup. If not, reject the dynamic shared library
   and raise an appropriate user-visible error.

The algorithm for initializing the DAC against a target process using a standalone GC consists of these steps:

1. Identify whether or not the target process is using a standalone GC at all. If not, no further
   checks are necessary.
2. If so, inspect the version number of the standalone GC in the target process and determine whether
   or not the DAC is compatible with that version. If not, present a notification of some kind
   that the debugging experience will be degraded.
3. Continue onwards.

Each one of these steps will be explained in detail below.

### Identifying candidate shared libraries

The question of whether or not the EE should attempt to locate and load a standalone GC
is answered by the EE's configuration system (`EEConfig`). EEConfig has the ability to
query configuration information from environment variables. Using this subsystem, users
can specify a specific environment variable to indicate that they are interested in
loading a standalone GC.

There is one environment variable that governs the behavior of the standalone GC loader:
`COMPlus_GCName`. It should be set to be a path to a dynamic shared library containing
the GC that the EE intends to load. Its presence informs the EE that, first, a standalone GC
is to be loaded and, second, precisely where the EE should load it from.

The EE will call `LoadLibrary` using the path given by `COMPlus_GCName`.
If this succeeds, the EE will move to the next step in the loading process.
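
As a rough illustration of the decision described above, the following sketch reads `COMPlus_GCName` and reports whether a standalone GC was requested. It does not use the real `EEConfig` API; `StandaloneGCRequest`, `ParseGCNameSetting`, and `QueryStandaloneGCRequest` are hypothetical names, and treating an empty value as "not set" is an assumption of this sketch.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical result type: whether a standalone GC was requested and, if so,
// the path that would be handed to the platform's library loader.
struct StandaloneGCRequest {
  bool requested;
  std::string path;
};

// Interprets the raw value of COMPlus_GCName. A null or empty value is
// treated as "not set" (an assumption of this sketch).
StandaloneGCRequest ParseGCNameSetting(const char* rawValue) {
  if (rawValue == nullptr || *rawValue == '\0') {
    return StandaloneGCRequest{false, std::string()};
  }
  return StandaloneGCRequest{true, std::string(rawValue)};
}

// Reads the setting from the process environment.
StandaloneGCRequest QueryStandaloneGCRequest() {
  return ParseGCNameSetting(std::getenv("COMPlus_GCName"));
}
```

When `requested` is true, the EE would pass `path` to `LoadLibrary`; otherwise it would skip the standalone GC loading steps entirely.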

### Verifying the version of a candidate GC

Once the EE has successfully loaded a candidate GC dynamic shared library, it must then check that the candidate GC is
version-compatible with the version of the EE that is doing the loading. It does this in three phases. First, the
candidate GC must expose a function with the given name and signature:

```c++
struct VersionInfo {
  uint32_t MajorVersion;
  uint32_t MinorVersion;
  uint32_t BuildVersion;
  const char* Name;
};

extern "C" void GC_VersionInfo(
  /* Out */ VersionInfo*
);
```

The EE will call `GetProcAddress` on the library, looking for `GC_VersionInfo`. It is a fatal error if this symbol
is not found.
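
The lookup could be sketched roughly as follows. To keep the example platform-neutral, the symbol lookup is abstracted behind a callback; `SymbolResolver`, `GC_VersionInfoFunction`, and `TryGetGCVersion` are hypothetical names introduced for illustration and are not part of the runtime.

```cpp
#include <cstdint>
#include <functional>

struct VersionInfo {
  uint32_t MajorVersion;
  uint32_t MinorVersion;
  uint32_t BuildVersion;
  const char* Name;
};

// Pointer type matching the GC_VersionInfo signature shown above.
using GC_VersionInfoFunction = void (*)(VersionInfo*);

// Hypothetical stand-in for a symbol lookup bound to the loaded library.
// It returns nullptr when the symbol is not exported.
using SymbolResolver = std::function<void*(const char*)>;

// Returns false if the candidate GC does not export GC_VersionInfo; the
// caller is expected to treat that as a fatal error. On success, *out holds
// the version information reported by the candidate GC.
bool TryGetGCVersion(const SymbolResolver& resolve, VersionInfo* out) {
  void* symbol = resolve("GC_VersionInfo");
  if (symbol == nullptr) {
    return false;
  }
  auto getVersion = reinterpret_cast<GC_VersionInfoFunction>(symbol);
  getVersion(out);
  return true;
}
```

On Windows the resolver would wrap `GetProcAddress` on the loaded module; on POSIX platforms it would wrap `dlsym`.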

Next, the EE will call this function and receive back a `VersionInfo` structure. Each EE capable of loading
standalone GCs has a major version number and minor version number that is obtained from the version of
`gcinterface.h` that the EE built against. It will compare these numbers against the numbers it receives from
`GC_VersionInfo` in this way:

* If the EE's MajorVersion is not equal to the MajorVersion obtained from the candidate GC, reject. Major version
  changes occur when there are breaking changes in the EE/GC interface and it is not possible to interoperate with
  incompatible interfaces. A change is considered breaking if it alters the semantics of an existing method or if
  it deletes or renames existing methods so that VTable layouts are not compatible.
* If the EE's MinorVersion is greater than the MinorVersion obtained from the candidate GC, accept
  (Forward compatibility). The EE must take care not to call any new APIs that are not present in the version of
  the candidate GC.
* Otherwise, accept (Backward compatibility). It is perfectly safe to use a GC whose MinorVersion exceeds the EE's
  MinorVersion.

The build version and name are not considered and are provided only for display/debug purposes.

If this succeeds, the EE will transition to the next step in the loading sequence.
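
The comparison rules above can be expressed as a small function. This is a minimal sketch rather than the runtime's actual code: `VersionCheck` and `CheckGCVersion` are hypothetical names, and the EE's own version numbers are passed in as plain parameters.

```cpp
#include <cstdint>

struct VersionInfo {
  uint32_t MajorVersion;
  uint32_t MinorVersion;
  uint32_t BuildVersion;
  const char* Name;
};

// Hypothetical outcome of the version check.
enum class VersionCheck {
  Reject,               // Major versions differ.
  AcceptOlderGC,        // EE minor > GC minor: EE must avoid newer APIs.
  AcceptSameOrNewerGC,  // EE minor <= GC minor: safe to use as-is.
};

// Applies the major/minor rules. BuildVersion and Name are deliberately
// ignored, since they exist only for display and debugging.
VersionCheck CheckGCVersion(uint32_t eeMajor, uint32_t eeMinor,
                            const VersionInfo& gc) {
  if (eeMajor != gc.MajorVersion) {
    return VersionCheck::Reject;
  }
  if (eeMinor > gc.MinorVersion) {
    return VersionCheck::AcceptOlderGC;
  }
  return VersionCheck::AcceptSameOrNewerGC;
}
```

Returning a distinct value for the "older GC" case lets the caller record that it must not call APIs added after the candidate GC's minor version.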

### Initializing the GC

Once the EE has verified that the version of the candidate GC is valid, it then proceeds to initialize the
GC. It does so by loading (via `GetProcAddress`) and executing a function with this signature:

```c++
extern "C" HRESULT GC_Initialize(
  /* In  */ IGCToCLR*,
  /* Out */ IGCHeap**,
  /* Out */ IGCHandleManager**,
  /* Out */ GcDacVars*
);
```

The EE will provide its implementation of `IGCToCLR` to the GC and the GC will provide its implementations of
`IGCHeap`, `IGCHandleManager`, and `GcDacVars` to the EE. From here, if `GC_Initialize` returns a successful
HRESULT, the GC is considered initialized and the remainder of EE startup continues. If `GC_Initialize` returns
an error HRESULT, the initialization has failed.
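
A sketch of how the EE side of this call might look is shown below. The interface types are empty placeholders standing in for the real definitions in `gcinterface.h`, `HRESULT`, `S_OK`, and `E_FAIL` are local stand-ins for the Windows definitions, and `GCInterfaces` and `InitializeStandaloneGC` are hypothetical names.

```cpp
#include <cstdint>

// Local stand-ins for the Windows definitions.
typedef int32_t HRESULT;
const HRESULT S_OK = 0;
const HRESULT E_FAIL = static_cast<HRESULT>(0x80004005);

// Empty placeholders for the real types declared in gcinterface.h.
struct IGCToCLR {};
struct IGCHeap {};
struct IGCHandleManager {};
struct GcDacVars {};

using GC_InitializeFunction = HRESULT (*)(IGCToCLR*, IGCHeap**,
                                          IGCHandleManager**, GcDacVars*);

// Hypothetical bundle of everything the GC hands back to the EE.
struct GCInterfaces {
  IGCHeap* heap = nullptr;
  IGCHandleManager* handleManager = nullptr;
  GcDacVars dacVars;
};

// Calls GC_Initialize and publishes the GC's interfaces only on success.
// A negative HRESULT indicates failure, as with the FAILED macro.
HRESULT InitializeStandaloneGC(GC_InitializeFunction initialize,
                               IGCToCLR* gcToClr, GCInterfaces* out) {
  IGCHeap* heap = nullptr;
  IGCHandleManager* handleManager = nullptr;
  GcDacVars dacVars = {};
  HRESULT hr = initialize(gcToClr, &heap, &handleManager, &dacVars);
  if (hr < 0) {
    return hr;  // Initialization failed; leave *out untouched.
  }
  out->heap = heap;
  out->handleManager = handleManager;
  out->dacVars = dacVars;
  return hr;
}
```

Writing to `out` only after a successful HRESULT keeps the EE from observing partially initialized interface pointers.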

### Initializing the DAC

The existence of a standalone GC in a debuggee process has implications for how the DAC is loaded and
initializes itself. The DAC has access to implementation details of the GC that are not normally exposed as part
of the `GC/EE` interfaces, and as such it is versioned separately.

When the DAC is being initialized and it loads the `GcDacVars` structure from the debuggee process's memory, it
must check the major and minor versions of the DAC, which are themselves DAC variables exposed by a standalone GC.
It then decides whether or not the loaded GC is compatible with the DAC that is currently executing. It does this
in the same manner that the EE does:

* If the major versions of the DAC and loaded GC do not agree, reject.
* If the minor version of the DAC is greater than the minor version of the GC, accept but take care
  not to invoke any new code paths not present in the target GC.
* If the minor version of the DAC is less than or equal to the minor version of the GC, accept.

If a DAC rejects a loaded GC, it will return `E_FAIL` from DAC APIs that would otherwise need to interact with the
GC.
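
One way to picture this behavior is a small gate object that the DAC consults before touching GC state. This is only a sketch under stated assumptions: `DacGcVersion` and `DacGcGate` are hypothetical, the version fields do not reflect the real layout of `GcDacVars`, and `HRESULT`, `S_OK`, and `E_FAIL` are local stand-ins for the Windows definitions.

```cpp
#include <cstdint>

// Local stand-ins for the Windows definitions.
typedef int32_t HRESULT;
const HRESULT S_OK = 0;
const HRESULT E_FAIL = static_cast<HRESULT>(0x80004005);

// Hypothetical pair of version numbers; the real values would be read
// from the debuggee's memory during DAC initialization.
struct DacGcVersion {
  uint32_t major;
  uint32_t minor;
};

// Decides once, at DAC initialization, whether GC-dependent DAC APIs
// may proceed against the target process.
class DacGcGate {
 public:
  DacGcGate(DacGcVersion dac, DacGcVersion targetGc)
      : compatible_(dac.major == targetGc.major),
        avoidNewerPaths_(compatible_ && dac.minor > targetGc.minor) {}

  // GC-dependent DAC APIs would call this first and return its result
  // unchanged when it is not S_OK.
  HRESULT CheckGCAccess() const { return compatible_ ? S_OK : E_FAIL; }

  // True when the DAC is newer than the target GC and must not use code
  // paths that the target GC does not have.
  bool MustAvoidNewerCodePaths() const { return avoidNewerPaths_; }

 private:
  bool compatible_;
  bool avoidNewerPaths_;
};
```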

## Outstanding Questions

How can we provide the most useful error message when a standalone GC fails to load? In the past it has been difficult
to determine precisely what has gone wrong when `coreclr_initialize` returns an HRESULT and no indication of what occurred.

Same question for the DAC - Is `E_FAIL` the best we can do? If we could define our own error for DAC/GC version
mismatches, that would be nice; however, that is technically a breaking change in the DAC.