Interop Guidelines
==================

## Goals
We have the following goals related to interop code being used in CoreFX:

- Minimize code duplication for interop.
  - We should only define a given interop signature in a single place.
    This stuff is tricky, and we shouldn't be copy-and-pasting it.
- Minimize unnecessary IL in assemblies.
  - Interop signatures should only be compiled into the assemblies that
    actually consume them. Having extra signatures bloats assemblies and
    makes it more difficult to do static analysis over assemblies to
    understand what they actually use. It also leads to problems when such
    static verification is used as a gate, e.g. if a store verifies that
    only certain APIs are used by apps in the store.
- Keep interop code isolated and consolidated.
  - This is both for good hygiene and to help keep platform-specific code
    separated from platform-neutral code, which is important for maximizing
    reusable code above PAL layers.
- Ensure maximal managed code reuse across different OS flavors which have
  the same API but not the same ABI.
  - This is the case for UNIX, and addressing it is a work in progress (see
    issue #2137 and the section on "shims" below).

## Approach

### Interop type
- All code related to interop signatures (DllImports, interop structs
  used in DllImports, constants that map to native values, etc.) should
  live in a partial, static, and internal “Interop” class in the root
  namespace, e.g.

```C#
internal static partial class Interop { ... }
```

- Declarations shouldn't be in Interop directly, but rather within a
  partial, static, internal nested type named for a given library or set
  of libraries, e.g.

```C#
internal static partial class Interop
{
    internal static partial class libc { ... }
}
...
internal static partial class Interop
{
    internal static partial class mincore { ... }
}
```
- With few exceptions, the only methods that should be defined in these
  interop types are DllImports.
  - Exceptions are limited to times when most or every consumer of a
    particular DllImport will need to wrap its invocation in a helper, e.g.
    to provide additional marshaling support, to hide thread-safety issues
    in the underlying OS implementation, to do any required manipulation of
    safe handles, etc. In such cases, the DllImport should be private
    whenever possible rather than internal, with the helper code exposed to
    consumers rather than having the DllImport exposed directly.

### File organization

- The Interop partial class definitions should live in Interop.*.cs
  files. These Interop.*.cs files should all live under Common rather than
  within a given assembly's folder.
  - The only exception to this should be when an assembly P/Invokes to its
    own native library that isn't available to or consumed by anyone else,
    e.g. System.IO.Compression P/Invoking to clrcompression.dll. In such
    cases, System.IO.Compression should have its own Interop folder which
    follows a similar scheme as outlined in this proposal, but just for
    these private P/Invokes.
- Under Common\src\Interop, we'll have a folder for each target
  platform, and within each platform, for each library from which
  functionality is being consumed. The Interop.*.cs files will live within
  those library folders, e.g.

```
\Common\src\Interop
    \Windows
        \mincore
            ... interop files
    \Unix
        \libc
            ... interop files
    \Linux
        \libc
            ... interop files
```

As shown above, platforms may be additive, in that an assembly may use functionality from multiple folders, e.g. System.IO.FileSystem's Linux build will use functionality both from Unix (common across all Unix systems) and from Linux (specific to Linux and not available across non-Linux Unix systems).
			 
- Interop.*.cs files are scoped so that every assembly consuming a file
  needs every DllImport it contains.
  - If multiple related DllImports will all be needed by every consumer,
    they may be declared in the same file, named for the functionality
    grouping, e.g. Interop.IOErrors.cs.
  - Otherwise, in the limit (and the expected case for most situations)
    each Interop.*.cs file will contain a single DllImport and associated
    interop types (e.g. the structs used with that signature) and helper
    wrappers, e.g. Interop.strerror.cs.

```
\Common\src\Interop
    \Unix
        \libc
            \Interop.strerror.cs
    \Windows
        \mincore
            \Interop.OutputDebugString.cs
```

- If structs/constants will be used on their own without an associated
  DllImport, or if they may be used with multiple DllImports not in the
  same file, they should be declared in a separate file.
- In the case of multiple overloads of the same DllImport (e.g. some
  overloads taking a SafeHandle and others taking an IntPtr, or overloads
  taking different kinds of SafeHandles), if they can't all be declared in
  the same file (because they won't all be consumed by all consumers), the
  file should be qualified with the key differentiator, e.g.

```
\Common\src\Interop
    \Windows
        \mincore
            \Interop.DuplicateHandle_SafeTokenHandle.cs
            \Interop.DuplicateHandle_IntPtr.cs
```

- The library names used per-platform are stored as internal constants
  in a private Libraries class nested in the Interop class, in a
  per-platform file named Interop.Libraries.cs. These constants are then
  used for all DllImports to that library, rather than having the string
  duplicated each time, e.g.

```C#
internal static partial class Interop // contents of Common\src\Interop\Windows\Interop.Libraries.cs
{
    private static class Libraries
    {
        internal const string Kernel32 = "kernel32.dll";
        internal const string Localization = "api-ms-win-core-localization-l1-2-0.dll";
        internal const string Handle = "api-ms-win-core-handle-l1-1-0.dll";
        internal const string ProcessThreads = "api-ms-win-core-processthreads-l1-1-0.dll";
        internal const string File = "api-ms-win-core-file-l1-1-0.dll";
        internal const string NamedPipe = "api-ms-win-core-namedpipe-l1-1-0.dll";
        internal const string IO = "api-ms-win-core-io-l1-1-0.dll";
        ...
    }
}

```
(Note that this will likely result in some extra constants being defined
in each assembly that uses interop, which technically violates the goal of
minimizing unnecessary IL, but the cost is negligible.)
			 
- .csproj project files then include the interop code they need, e.g.
```XML
<ItemGroup Condition=" '$(TargetsUnix)' == 'true' ">
    <Compile Include="Interop\Unix\Interop.Libraries.cs" />
    <Compile Include="Interop\Unix\libc\Interop.strerror.cs" />
    <Compile Include="Interop\Unix\libc\Interop.getenv.cs" />
    <Compile Include="Interop\Unix\libc\Interop.open64.cs" />
    <Compile Include="Interop\Unix\libc\Interop.close.cs" />
    <Compile Include="Interop\Unix\libc\Interop.snprintf.cs" />
    ...
</ItemGroup>
```

### Build System
When building CoreFX, we use the "OSGroup" property to control which
target platform we are building for. The valid values for this property
are Windows_NT (which is the default value from MSBuild when running on
Windows), Linux, and OSX.

The build system sets a few MSBuild properties, depending on the OSGroup
setting:

* TargetsWindows
* TargetsLinux
* TargetsOSX
* TargetsUnix

TargetsUnix is true for both OSX and Linux builds and can be used to
include code that can be used on both Linux and OSX (e.g. it is written
against a POSIX API that is present on both platforms).

Do not test the value of the OSGroup property directly; instead, use
one of the properties above.

#### Project Files
Whenever possible, a single .csproj should be used per assembly,
spanning all target platforms, e.g. System.Console.csproj includes
conditional entries for when targeting Windows vs when targeting Linux.
A property can be passed to msbuild to control which flavor is built,
e.g. msbuild /p:OSGroup=OSX System.Console.csproj.

### Constants
- Wherever possible, constants should be defined as "const". Only if the
  data type doesn't support this (e.g. IntPtr) should they instead be
  static readonly fields.

- Related constants should be grouped under a partial, static, internal
  type, e.g. for error codes they'd be grouped under an Errors type:

```C#
internal static partial class Interop
{
    internal static partial class libc
    {
        internal static partial class Errors
        {
            internal const int ENOENT = 2;
            internal const int EINTR = 4;
            internal const int EWOULDBLOCK = 11;
            internal const int EACCES = 13;
            internal const int EEXIST = 17;
            internal const int EXDEV = 18;
            internal const int EISDIR = 21;
            internal const int EINVAL = 22;
            internal const int EFBIG = 27;
            internal const int ENAMETOOLONG = 36;
            internal const int ECANCELED = 125;
            ...
        }
    }
}
```

Using enums instead of partial, static classes can lead to needing lots
of casts at call sites and can cause problems if such a type needs to be
split across multiple files (enums can't currently be partial). However,
enums can be valuable in making it clear in a DllImport signature what
values are permissible. Enums may be used in limited circumstances where
these aren't concerns: the full set of values can be represented in the
enum, and the interop signature can be defined to use the enum type
rather than the underlying integral type.

## Naming

- Interop signatures / structs / constants should be defined using the
  same name / capitalization / etc. that's used in the corresponding
  native code.
  - We should not rename any of these based on managed coding guidelines.
    The only exception to this is for the constant grouping type, which
    should be named with the most discoverable name possible; if that name
    is a concept (e.g. Errors), it can be named using managed naming
    guidelines.


## UNIX shims

Often, various UNIX flavors offer the same API from the point of view of
compatibility with C/C++ source code, but they do not have the same ABI. For
example, fields can be laid out differently, constants can have different
numeric values, exports can be named differently, etc. There are differences
not only between operating systems (Mac OS X vs. Ubuntu vs. FreeBSD), but also
between the underlying processor architectures (x64 vs. x86 vs. ARM).

This leaves us with a situation where we can't write portable P/Invoke declarations
that will work on all flavors, and writing separate declarations per flavor is quite
fragile and won't scale.

To address this, we're moving to a model where all UNIX interop from corefx starts with 
a P/Invoke to a C++ lib written specifically for corefx. These libs -- System.*.Native.so 
(aka "shims") -- are intended to be very thin layers over underlying platform libraries. 
Generally, they are not there to add any significant abstraction, but to create a 
stable ABI such that the same IL assembly can work across UNIX flavors.
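
As a rough illustration of what such a thin layer might look like, the sketch below wraps `stat` and copies the platform's `struct stat` into a struct whose layout is fixed by the shim. The names `FileStatus` and `SystemNative_Stat`, and the choice of fields, are assumptions made for this example and are not taken from an actual corefx shim.

```cpp
#include <cassert>
#include <cstdint>
#include <sys/stat.h>

// Hypothetical shim-owned struct. Every field has an explicit width, so the
// managed side can declare one matching struct for all UNIX flavors,
// regardless of how the platform lays out its own struct stat.
struct FileStatus
{
    int64_t Size;   // from st_size (off_t varies in width across platforms)
    int64_t MTime;  // from st_mtime, in seconds since the epoch
    int32_t Mode;   // from st_mode (mode_t varies in width across platforms)
};

// Thin wrapper over stat(2). Returns 0 on success and -1 on failure,
// leaving errno as set by the platform call.
extern "C" int32_t SystemNative_Stat(const char* path, FileStatus* output)
{
    assert(path != nullptr);
    assert(output != nullptr);

    struct stat result;
    int ret = stat(path, &result);
    if (ret == 0)
    {
        // Field-by-field copy: the only place that knows the platform layout.
        output->Size = static_cast<int64_t>(result.st_size);
        output->MTime = static_cast<int64_t>(result.st_mtime);
        output->Mode = static_cast<int32_t>(result.st_mode);
    }

    return static_cast<int32_t>(ret);
}
```

Because the copy happens in native code compiled per flavor, the P/Invoke declaration on the managed side only has to match `FileStatus`, never the platform's own `struct stat`.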

Guidelines for shim C++ API:

- Keep them as "thin"/1:1 as possible. 
  - We want to write the majority of code in C#. 
- Never skip the shim and P/Invoke directly to the underlying platform API. It's
  easy to assume something is safe/guaranteed when it isn't.
- Don't cheat and take advantage of coincidental agreement between
  one flavor's ABI and the shim's ABI.
- Use PascalCase in a style closer to Win32 than libc.
  - If an export point has a 1:1 correspondence to the platform API, then name
    it after the platform API in PascalCase (e.g. stat -> Stat, fstat -> FStat).
  - If an export is not 1:1, then spell things out as we typically would in
    CoreFX code (i.e. don't use abbreviations unless they come from the
    underlying API).
  - At first, it seemed that we'd want to use 1:1 names throughout, but it
    turns out there are many cases where being strictly 1:1 isn't practical.
  - In order to reduce the chance of collisions when linking with CoreRT, all
    exports should have a prefix that corresponds to the library's name, e.g.
    "SystemNative_" or "CryptoNative_", so that export names stay unique.
    See https://github.com/dotnet/corefx/issues/4818.
- Stick to data types which are guaranteed not to vary in size across flavors.
  - Use int32_t, int64_t, etc. from stdint.h and not int, long, etc.
  - Use char* for ASCII or UTF-8 strings and uint8_t* for byte buffers.
     - Note that sizeof(char) == 1 is guaranteed.
  - Do not use size_t in the shim API. Always pick a fixed size. Often, it is
    most convenient to line up with the managed int as int32_t (e.g. scratch
    buffer size for read/write), but sometimes we need to handle huge sizes
    (e.g. memory-mapped files) and therefore use uint64_t.
  - Use int64_t for native off_t values.
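
The guidelines above can be sketched in a pair of exports. This is a minimal illustration and not the actual corefx implementation: the `PAL_SEEK_*` constants and the exact signatures are assumptions chosen for the example. It shows the library prefix, PascalCase names derived from the platform API, fixed-width types in place of `size_t` and `off_t`, and an explicit translation of constants so that nothing relies on coincidental agreement with one platform's values.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <unistd.h>

// Hypothetical shim-defined constants. Managed code uses these values; the
// shim maps them to whatever the platform headers define.
enum : int32_t
{
    PAL_SEEK_SET = 0,
    PAL_SEEK_CUR = 1,
    PAL_SEEK_END = 2,
};

// 1:1 with read(2): uint8_t* for the byte buffer, int32_t for the size so it
// lines up with the managed int.
extern "C" int32_t SystemNative_Read(int32_t fd, uint8_t* buffer, int32_t bufferSize)
{
    assert(buffer != nullptr || bufferSize == 0);

    if (bufferSize < 0)
    {
        errno = EINVAL;
        return -1;
    }

    ssize_t count = read(fd, buffer, static_cast<size_t>(bufferSize));

    // count is -1 or at most bufferSize, so it always fits in int32_t.
    return static_cast<int32_t>(count);
}

// 1:1 with lseek(2): int64_t stands in for the native off_t.
extern "C" int64_t SystemNative_LSeek(int32_t fd, int64_t offset, int32_t whence)
{
    int platformWhence;
    switch (whence)
    {
        case PAL_SEEK_SET: platformWhence = SEEK_SET; break;
        case PAL_SEEK_CUR: platformWhence = SEEK_CUR; break;
        case PAL_SEEK_END: platformWhence = SEEK_END; break;
        default:
            errno = EINVAL;
            return -1;
    }

    // Note: a production shim would also reject offsets that do not fit in
    // the platform's off_t before narrowing.
    return static_cast<int64_t>(lseek(fd, static_cast<off_t>(offset), platformWhence));
}
```

The managed declarations for these exports would then use only `int`, `long`, and `byte*`, which have the same size on every flavor.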