Welcome to mirror list, hosted at ThFree Co, Russian Federation.

OptionalFields.h « inc « Runtime « Native « src - github.com/mono/corert.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
blob: 4a512c736075dd582fadc36cf0bceb281507a422 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.

//
// Support for optional fields attached out-of-line to EETypes (or any other data structure for that matter).
// These should be used for attributes that exist for only a small subset of EETypes or are accessed only
// rarely. The idea is to avoid bloating the size of the most common EETypes and to move some of the colder
// data out-of-line to improve the density of the hot data. The basic idea is that the EEType contains a
// pointer to an OptionalFields structure (which may be NULL) and that structure contains a somewhat
// compressed version of the optional fields.
//
// For each OptionalFields instance we encode only the fields that are present so that the structure is as
// small as possible while retaining reasonable access costs.
//
// This implies some tricky tradeoffs:
//  * The more we compress the data the greater the access costs in terms of CPU.
//  * More effective compression schemes tend to lead to the payload data being unaligned. This itself can
//    result in overhead but on some architectures it's worse than that and the unaligned nature of the data
//    requires special handling in client code. Generally it would be more robust and clean not to leak out
//    such requirements to our callers. For small fields we can imagine copying the data into aligned storage
//    (and indeed that might be a natural part of the decompression process). It might be more problematic for
//    larger data items.
//
// In order to get the best of both worlds we employ a hybrid approach. Small values (typically single small
// integers) get encoded inline in a compressed format. Decoding them will automatically copy them into
// aligned storage. Larger values (such as complex data structures) will be stored out-of-line, naturally
// aligned and uncompressed (at least by this layer of the software). The entry in the optional field record
// will instead contain a reference to this out-of-line structure.
//
// Pointers are large (especially on 64-bit) and incur overhead in terms of base relocs and complexity (since
// the locations requiring relocs may not be aligned). To mitigate this we can encode references to these
// out-of-line records as deltas from a base address and by carefully ordering the layout of the out-of-line
// records we can share the same base address amongst multiple OptionalFields structures.
//
// Taking this to one end of the logical extreme we could store a single base address such as the module base
// address and encode all OptionalFields references as offsets from this; basically RVAs. This is cheap in the
// respect that we only need one base address (and associated reloc) but the majority of OptionalFields
// references will encode as fairly large deltas. As we'll touch on later our mechanism for compressing inline
// values in OptionalRecords is based on discarding insignificant leading zero bits; i.e. we encode small
// integers more effectively. So ideally we want to store multiple base addresses so we can lower the average
// encoding cost of the deltas.
//
// An additional concern is how these base addresses are located. Take the module base address example: we
// have no direct means of locating this based on an OptionalFields (or even the EEType that owns it). To
// obtain this value we're likely to have to perform some operation akin to a range lookup and there are
// interesting edge cases such as EETypes for generic types, which don't reside in modules.
//
// The approach taken here addresses several of the concerns above. The algorithm stores base addresses
// interleaved with the OptionalFields. They are located at well-known locations by aligning their addresses
// to a specific value (we can tune this but assume for the purposes of this explanation that the value is 64
// bytes). This implies that the address requiring a base reloc is always aligned plus it can be located
// cheaply from an OptionalFields address by masking off the low-order bits of that address.
//
// As OptionalFields are added any out-of-line data they reference is stored linearly in the same order (this
// does imply that all out-of-line records must live in the same section and thus must have the same access
// attributes). This provides locality: adjacent OptionalFields may encode deltas to different out-of-line
// records but since the out-of-line records are adjacent (or nearly so) as well, both deltas will be about
// the same size. Once we've filled in the space between stored base addresses (some padding might be needed
// near the end where a full OptionalField won't fit, but this should be small given good compression of
// OptionalFields) then we write out a new base address. This is chosen based on the first out-of-line record
// referenced by the next OptionalField (i.e. it will make the first delta zero and keep the subsequent ones
// small).
//
// Consider the following example where for the sake of simplicity we assume each OptionalFields structure has
// precisely one out-of-line reference:
//
//    +-----------------+                        Out-of-line Records
//    | Base Address    |----------------------> +--------------------+
//    +-----------------+                        | #1                 |
//    | OptionalFields  |                        +--------------------+
//    |   Record #1     |                        | #2                 |
//    |                 |                        |                    |
//    +-----------------+                        +--------------------+
//    | OptionalFields  |                        | #3                 |
//    |   Record #2     |         /------------> +--------------------+
//    |                 |        /               | #4                 |
//    +-----------------+       /                |                    |
//    | OptionalFields  |      /                 |                    |
//    |   Record #3     |     /                  +--------------------+
//    |                 |    /                   | #5                 |
//    +-----------------+   /                    |                    |
//    | Padding         |  /                     +--------------------+
//    +-----------------+ /                      :                    :
//    | Base Address    |-
//    +-----------------+
//    | OptionalFields  |
//    |   Record #4     |
//    |                 |
//    +-----------------+
//    | OptionalFields  |
//    |   Record #5     |
//    :                 :
//
// Each optional field uses the base address defined above it (at the lower memory address determined by
// masking off the alignment bits). No matter which out-of-line records they reference the deltas will be as
// small as we can make them.
//
// Lowering the alignment requirement introduces more base addresses and as a result also lowers the number of
// OptionalFields that share the same base address, leading to smaller encodings for out-of-line deltas. But
// at the same time it increases the number of pointers (and associated base relocs) that we must store.
// Additionally the compression of the deltas is not completely linear: certain ranges of delta magnitude will
// result in exactly the same storage being used when compressed. See the details of the delta encoding below
// to see how we can use this to our advantage when tuning the alignment of base addresses.
//
// We optimize the case where OptionalFields structs don't contain any out-of-line references. We collect
// those together and emit them in a single run with no interleaved base addresses.
//
// The OptionalFields record encoding itself is a byte stream representing one or more fields. The first byte
// is a field header: it contains a field type tag in the low-order 7 bits (giving us 128 possible field
// types) and the most significant bit indicates whether this is the last field of the structure. The field
// value (a 32-bit unsigned number) is encoded using the existing VarInt support which encodes the value in
// byte chunks taking between 1 and 5 bytes to do so.
//
// If the field value is out-of-line we decode the delta from the base address in much the same way as for
// inline field values. Before adding the delta to the base address, however, we scale it based on the natural
// alignment of the out-of-line data record it references. Since the out-of-line data is aligned on the same
// basis this scaling avoids encoding bits that will always be zero and thus allows us to reference a greater
// range of memory with a delta that encodes using less bytes.
//
// The value compression algorithm above gives us the non-linearity of compression referenced earlier. 32-bit
// values will encode in a given number of bytes based on the having a given number of significant
// (non-leading zero) bits:
//      5 bytes : 25 - 32 significant bits
//      4 bytes : 18 - 24 significant bits
//      3 bytes : 11 - 17 significant bits
//      2 bytes : 4 - 10 significant bits
//      1 byte  : 0 - 3 significant bits
//
// We can use this to our advantage when choosing an alignment at which to store base addresses. Assuming that
// most out-of-line data will have an alignment requirement of at least 4 bytes we note that the 2 byte
// encoding already gives us an addressable range of 2^10 * 4 == 4KB which is likely to be enough for the vast
// majority of cases. That is we can raise the granularity of base addresses until the average amount of
// out-of-line data addressed begins to approach 4KB which lowers the cost of storing the base addresses while
// not impacting the encoding size of deltas at all (there's no point in storing base addresses more
// frequently because it won't make the encodings of deltas any smaller).
//
// Trying to tune for one byte deltas all the time is probably not worth it. The addressability range (again
// assuming 4 byte alignment) is only 32 bytes and unless we start storing a lot of small data structures
// out-of-line tuning for this will involve placing the base addresses very frequently and our costs will be
// dominated by the size of the base address pointers and their relocs.
//

// Define enumeration of optional field tags.
enum OptionalFieldTag
{
#define DEFINE_INLINE_OPTIONAL_FIELD(_name, _type) OFT_##_name,
#define DEFINE_OUTLINE_OPTIONAL_FIELD(_name, _type) OFT_##_name,
#include "OptionalFieldDefinitions.h"
    OFT_Count // Number of field types we support
};

// Array that indicates whether a given field type is inline (true) or out-of-line (false).
static bool g_rgOptionalFieldTypeIsInline[OFT_Count] = {
#define DEFINE_INLINE_OPTIONAL_FIELD(_name, _type) true,
#define DEFINE_OUTLINE_OPTIONAL_FIELD(_name, _type) false,
#include "OptionalFieldDefinitions.h"
};

// Various random global constants we can tweak for performance tuning.
enum OptionalFieldConstants
{
    // Constants determining how often we interleave a "header" containing a base address for out-of-line
    // records into the stream of OptionalFields structures. These will occur at some power of 2 alignment of
    // memory address. The alignment must at least exceed that of a pointer (since we'll store a pointer in
    // the header and we need room for at least one OptionalFields record between each header). As the
    // alignment goes up we store less headers but may impose a larger one-time padding cost at the start of
    // the optional fields memory block as well as increasing the average encoding size for out-of-line record
    // deltas in each optional field record.
    //
    // Note that if you change these constants you must be sure to modify the alignment of the optional field
    // virtual section in ZapImage.cpp as well as ensuring the alignment of the containing physical section is
    // at least as high (this latter cases matters for the COFF output case only, when we're generating PE
    // images directly the physical section will get page alignment).
    OFC_HeaderAlignmentShift    = 7,
    OFC_HeaderAlignmentBytes    = 1 << OFC_HeaderAlignmentShift,
    OFC_HeaderAlignmentMask     = OFC_HeaderAlignmentBytes - 1,
};

// Support for some simple statistics gathering. Only used for performance tweaking and debugging of the
// algorithm so left disabled most of the time.
#ifdef BINDER
//#define FEATURE_OPTIONAL_FIELD_STATS 1
#endif
#ifdef FEATURE_OPTIONAL_FIELD_STATS
struct OptionalFieldStats
{
    UInt32      m_cOptionalFieldsStructs;
    UInt32      m_rgFieldCounts[OFT_Count];
    UInt32      m_rgSizeDist[8];
    UInt32      m_cHeaders;
    UInt32      m_cbPadding;
};
extern "C" OptionalFieldStats g_sOFStats;
#define OFS_COUNTER_INC(_name) g_sOFStats.m_c##_name++;
#else
#define OFS_COUNTER_INC(_name) 
#endif

#ifndef RHDUMP
typedef DPTR(class OptionalFields) PTR_OptionalFields;
typedef DPTR(PTR_OptionalFields) PTR_PTR_OptionalFields;
#endif // !RHDUMP


class OptionalFields
{
public:

    // Return the number of bytes necessary to encode the given integer.
    static UInt32 EncodingSize(UInt32 uiValue);

    // Encode the given field type and integer into the buffer provided (which is guaranteed to have enough
    // space). Update the pointer into the buffer to point just past the newly encoded bytes. Note that any
    // processing of the value for use with out-of-line records has already been performed; we're given the
    // raw value to encode.
    static void EncodeField(UInt8 ** ppFields, OptionalFieldTag eTag, bool fLastField, UInt32 uiValue);

#ifndef BINDER

    // Define accessors for each field type.
#define DEFINE_INLINE_OPTIONAL_FIELD(_name, _type)                       \
    _type Get##_name(_type defaultValue)                                 \
    {                                                                    \
    return (_type)GetInlineField(OFT_##_name, (UInt32)defaultValue); \
    }

#define DEFINE_OUTLINE_OPTIONAL_FIELD(_name, _type)                      \
    _type * Get##_name()                                                 \
    {                                                                    \
    return (_type*)GetOutlineField(OFT_##_name, _alignof(_type));    \
    }

#include "OptionalFieldDefinitions.h"

private:
    friend struct OptionalFieldsRuntimeBuilder;

    // Reads a field value (or the basis for an out-of-line record delta) starting from the first byte after
    // the field header. Advances the field location to the start of the next field.
    static OptionalFieldTag DecodeFieldTag(PTR_UInt8 * ppFields, bool *pfLastField);

    // Reads a field value (or the basis for an out-of-line record delta) starting from the first byte of a
    // field description. Advances the field location to the start of the next field.
    static UInt32 DecodeFieldValue(PTR_UInt8 * ppFields);

    UInt32 GetInlineField(OptionalFieldTag eTag, UInt32 uiDefaultValue);
#if 0
    // Currently not used for ProjectN, implementation needs to be "DAC'ified" if needed again
    void * GetOutlineField(OptionalFieldTag eTag, size_t cbValueAlignment);
#endif

#endif // !BINDER
};

#ifdef BINDER

class OptionalFieldsManager;

//
// Binder support for laying out OptionalFields structures. Note that this is relatively simple currently since
// none of the optional fields contain pointers. This will have to be revisited to support fixups and
// understand ZapNodes if pointers are introduced.
//
// Mostly this class just caches the fields declared for a particular OptionalFields and the resulting ZapNode
// once those fields have been encoded. The layout of OptionalFields in memory is handled by
// OptionalFieldsManager while the details of the encoding of each individual field are handled by the
// OptionalFields class itself.
//
class OptionalFieldsBuilder
{
public:
    // At construction time associate this build with a manager.
    OptionalFieldsBuilder(OptionalFieldsManager *pManager);

    // Once all fields have been added the ZapNode representing the OptionalFields structure can be retrieved
    // (and not before; this will be asserted in debug builds).
    ZapNode *GetNode();

    // Define setters for each field type.
#define DEFINE_INLINE_OPTIONAL_FIELD(_name, _type)  \
    void Add##_name(_type value)                    \
    {                                               \
        AddInlineField(OFT_##_name, (UInt32)value); \
    }

    // Note that the ZapBlob provided here must not have been placed since the OptionalFieldsManager will
    // place it later in a specific order within a special section.
#define DEFINE_OUTLINE_OPTIONAL_FIELD(_name, _type) \
    void Add##_name(ZapBlob * pValueBlob)           \
    {                                               \
        assert(!pValueBlob->IsPlaced());            \
        AddOutlineField(OFT_##_name, pValueBlob);   \
    }

#include "OptionalFieldDefinitions.h"

private:
    friend class OptionalFieldsManager;

    void AddInlineField(OptionalFieldTag eTag, UInt32 uiValue);
    void AddOutlineField(OptionalFieldTag eTag, ZapBlob * pValueBlob);

    // Cached field values specficied by one of the Add* methods above.
    struct OptionalField
    {
        bool                    m_fPresent;         // This field was added (we keep an array of all possible
                                                    // field types).
        UInt32                  m_uiOffset;         // Offset of image copy of the value from the base of the
                                                    // out-of-line record section. Set by manager during field
                                                    // encoding and used only for out-of-line fields.
        union
        {
            UInt32              m_uiValue;          // Inline value.
            ZapBlob *           m_pValueBlob;       // Out-of-line value.
        };
    };

    OptionalFieldsManager *     m_pManager;
    ZapNode *                   m_pNode;
    UInt32                      m_cFields;
    bool                        m_fContainsOutOfLineFields;
    OptionalField               m_rgFields[OFT_Count];
};

// The OptionalFieldsManager takes care of the layout of the two virtual sections used by OptionalFields: the
// OptionalFields themselves (possibly with base addresses interleaved) and the out-of-line data records.
class OptionalFieldsManager
{
public:
    OptionalFieldsManager(ZapImage * pZapImage);

    // Encode all the fields for one OptionalFields description cached in the given OptionalFieldsBuilder.
    ZapNode * EncodeFields(OptionalFieldsBuilder * pBuilder);

    // Place all the nodes created by this manager once all OptionalFields creation is complete.
    void Place();

private:
    // When we collect all the out-of-line records for later placement we also wish to cache the offset into
    // the virtual section at which each node lies. We could work this out from the nodes themselves but that
    // would be very expensive. This information is used to calculate the deltas between OptionalFields
    // out-of-line data references and the last out-of-line data record that was recorded as a base address.
    struct OutOfLineRecord
    {
        ZapNode *       m_pNode;
        UInt32          m_uiOffset;
    };

    // Given a builder calculate the size of the encoded version of the OptionalFields structure given the
    // current state of the OptionalFieldsManager (i.e. which base address is current etc.).
    UInt32 PlanEncoding(OptionalFieldsBuilder * pBuilder);

    // Actually encode the builder into an OptionalFields structure. The size of the encoding must have been
    // calculated by a previous call to PlanEncoding with no intervening manager state updates.
    ZapNode * PerformEncoding(OptionalFieldsBuilder * pBuilder, UInt32 cbEncoding);

    // Emit a new base address record using the given out-of-line data record as the new base. It's assumed
    // that sufficient padding has already been emitted such that the optional fields section is properly
    // aligned for this header record.
    void AddNewBaseAddressHeader(UInt32 idxBaseOutOfLineRecord);

    // Go through the builder looking for out-of-line records (it's assumed there is at least one if this is
    // called) adding copies of the data to the out-of-line records section. Returns the index of the first
    // record referenced by the builder which is the record that should be used as the base address if this is
    // the first OptionalFields emitted after a base address header.
    UInt32 AddOutOfLineRecords(OptionalFieldsBuilder * pBuilder);

    ZapImage *              m_pZapImage;
    vector<ZapNode*>        m_vecSimpleFields;              // OptionalFields without out-of-line records
    vector<ZapNode*>        m_vecComplexFields;             // OptionalFields with at least one out-of-line record
    vector<OutOfLineRecord> m_vecOutOfLineRecords;          // Out-of-line records referenced by above
    UInt32                  m_cbNextOutOfLineRecordOffset;  // Offset into section the next OOL record will occupy
    UInt32                  m_idxCurrentBaseOutOfLineRecord;// Index of OOL record currently being used as base
    UInt32                  m_cbFreeSpaceInCurrentGroup;    // Count of bytes left before next header is emitted
    bool                    m_fHeaderEmitted;               // Has at least one base address header been emitted?
};

#endif // BINDER

//
// Optional field encoder/decoder for dynamic types built at runtime
//
struct OptionalFieldsRuntimeBuilder
{
    void Decode(OptionalFields * pOptionalFields);

    UInt32 EncodingSize();

    UInt32 Encode(OptionalFields * pOptionalFields);

   struct OptionalField
    {
        bool                    m_fPresent;
        union
        {
            UInt32              m_uiValue;
            void *              m_pValueBlob;
        };
    };

    OptionalField               m_rgFields[OFT_Count];
};