github.com/mono/mono.git
author Zoltan Varga <vargaz@gmail.com> 2009-02-12 22:22:39 +0300
committer Zoltan Varga <vargaz@gmail.com> 2009-02-12 22:22:39 +0300
commit 18b99ebc85b2f466d0852fa641da3264e60d3349
tree 1e4d1e64f38b4c9e92216aea55ad564b0d3a445c /docs
parent 4445ef40a6bf1997053a976d3919180e865b90ac
2009-02-12 Zoltan Varga <vargaz@gmail.com>
* memory-management.txt thread-safety.txt aot-compiler.txt jit-regalloc exception-handling.txt: Remove documents which are now on the wiki. svn path=/trunk/mono/; revision=126745
Diffstat (limited to 'docs')
-rw-r--r-- docs/ChangeLog 5
-rw-r--r-- docs/aot-compiler.txt 393
-rw-r--r-- docs/exception-handling.txt 330
-rw-r--r-- docs/jit-regalloc 283
-rw-r--r-- docs/memory-management.txt 32
-rw-r--r-- docs/thread-safety.txt 118
6 files changed, 5 insertions, 1156 deletions
diff --git a/docs/ChangeLog b/docs/ChangeLog
index 79952cb6787..4742f7c4d63 100644
--- a/docs/ChangeLog
+++ b/docs/ChangeLog
@@ -1,3 +1,8 @@
+2009-02-12 Zoltan Varga <vargaz@gmail.com>
+
+ * memory-management.txt thread-safety.txt aot-compiler.txt jit-regalloc
+ exception-handling.txt: Remove documents which are now on the wiki.
+
2009-02-11 Rodrigo Kumpera <rkumpera@novell.com>
* thread-safety.txt: Improve the docs about image lock.
diff --git a/docs/aot-compiler.txt b/docs/aot-compiler.txt
deleted file mode 100644
index 3d77c0a11ca..00000000000
--- a/docs/aot-compiler.txt
+++ /dev/null
@@ -1,393 +0,0 @@
-Mono Ahead Of Time Compiler
-===========================
-
- The Ahead of Time compilation feature in Mono allows Mono to
- precompile assemblies to minimize JIT time, reduce memory
- usage at runtime and increase the code sharing across multiple
- running Mono applications.
-
- To precompile an assembly use the following command:
-
- mono --aot -O=all assembly.exe
-
- The `--aot' flag instructs Mono to ahead-of-time compile your
- assembly, while the -O=all flag instructs Mono to use all the
- available optimizations.
-
-* Caching metadata
-------------------
-
- Besides code, the AOT file also contains cached metadata information which allows
- the runtime to avoid certain computations at runtime, like the computation of
- generic vtables. This reduces both startup time and memory usage. It is possible
- to create an AOT image which contains only this cached information and no code by
- using the 'metadata-only' option during compilation:
-
- mono --aot=metadata-only assembly.exe
-
- This works even on platforms where AOT is not normally supported.
-
-* Position Independent Code
----------------------------
-
- On x86 and x86-64 the code generated by Ahead-of-Time compiled
- images is position-independent code. This allows the same
- precompiled image to be reused across multiple applications
- without having different copies. ELF shared libraries work the
- same way: the code produced can be relocated to any
- address.
-
- The implementation of Position Independent Code had a
- performance impact on Ahead-of-Time compiled images but
- compiler bootstraps are still faster than JIT-compiled images,
- especially with all the new optimizations provided by the Mono
- engine.
-
-* How to support Position Independent Code in new Mono Ports
-------------------------------------------------------------
-
- Generated native code needs to reference various runtime
- structures/functions whose address is only known at run
- time. JITted code can simply embed the address into the native
- code, but AOT code needs to do an indirection. This
- indirection is done through a table called the Global Offset
- Table (GOT), which is similar to the GOT in the ELF
- spec. When the runtime saves the AOT image, it saves some
- information for each method describing the GOT table entries
- used by that method. When loading a method from an AOT image,
- the runtime will fill out the GOT entries needed by the
- method.
-
- * Computing the address of the GOT
-
- Methods which need to access the GOT first need to compute its
- address. On the x86 it is done by code like this:
-
- call <IP + 5>
- pop ebx
- add <OFFSET TO GOT>, ebx
- <save got addr to a register>
-
- The variable representing the GOT is stored in
- cfg->got_var. It is always allocated to a global register to
- prevent some problems with branches + basic blocks.
-
- * Referencing GOT entries
-
- Any time the native code needs to access some other runtime
- structure/function (i.e. any time the backend calls
- mono_add_patch_info ()), the code pointed to by the patch needs
- to load the value from the GOT. For example, instead of:
-
- call <ABSOLUTE ADDR>
- it needs to do:
- call *<OFFSET>(<GOT REG>)
-
- Here, the <OFFSET> can be 0; it will be fixed up by the AOT compiler.
-
- For more examples on the changes required, see
-
- svn diff -r 37739:38213 mini-x86.c
-
- * The Procedure Linkage Table
-
- As in ELF, calls made from AOT code do not go through the GOT. Instead, a direct call is
- made to an entry in the Procedure Linkage Table (PLT). This is based on the fact that on
- most architectures, call instructions use a displacement instead of an absolute address, so
- they are already position independent. A PLT entry is usually a jump instruction, which
- initially points to some trampoline code which transfers control to the AOT loader, which
- will compile the called method, and patch the PLT entry so that further calls are made
- directly to the called method.
- If the called method is in the same assembly, and does not need initialization (i.e. it
- doesn't have GOT slots etc), then the call is made directly, bypassing the PLT.
-
-* Implementation
-----------------
-
-** The Precompiled File Format
------------------------------
-
- We use the native object format of the platform. That way it
- is possible to reuse existing tools like objdump and the
- dynamic loader. All we need is a working assembler, i.e. we
- write out a text file which is then passed to gas (the gnu
- assembler) to generate the object file.
-
- The precompiled image is stored in a file next to the original
- assembly that is precompiled, with the native extension for a shared
- library appended to the file name (on Linux it is ".so").
-
- For example: basic.exe -> basic.exe.so; corlib.dll -> corlib.dll.so
-
- To avoid symbol lookup overhead and to save space, some things like the
- compiled code of the individual methods are not identified by specific symbols
- like method_code_1234. Instead, they are stored in one big array and the
- offsets inside this array are stored in another array, requiring just two
- symbols. The offsets array is usually named 'FOO_offsets', where FOO is the
- array the offsets refer to, like 'methods', and 'method_offsets'.
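The two-symbol layout described above can be sketched in C as follows. This is a minimal illustration, not the runtime's actual loader code: the array contents, the 32-bit offset type, and the helper name lookup_method_code are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two exported symbols: one blob holding all
 * method bodies back to back, and one table of offsets into that blob. */
static const uint8_t methods[] = {
    0x90, 0xc3,             /* method 0: 2 bytes */
    0x31, 0xc0, 0xc3,       /* method 1: 3 bytes */
    0xc3                    /* method 2: 1 byte  */
};
static const uint32_t method_offsets[] = { 0, 2, 5 };
static const size_t method_count =
    sizeof (method_offsets) / sizeof (method_offsets[0]);

/* Return the start of the code for method_index. One table lookup replaces
 * a per-method symbol lookup. */
static const uint8_t *
lookup_method_code (size_t method_index)
{
    assert (method_index < method_count);
    return methods + method_offsets[method_index];
}
```

Only 'methods' and 'method_offsets' would need to be resolved through the dynamic loader; every individual method is then reached by index.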
-
- Generating code using an assembler and linker has some disadvantages:
- - it requires GNU binutils or an equivalent package to be installed on the
- machine running the aot compilation.
- - it is slow.
-
- There is some support in the aot compiler for directly emitting ELF files, but
- it is not complete (yet).
-
- The following things are saved in the object file and can be
- looked up using the equivalent to dlsym:
-
- mono_assembly_guid
-
- A copy of the assembly GUID.
-
- mono_aot_version
-
- The version of the AOT file format.
-
- mono_aot_opt_flags
-
- The optimization flags used to build this
- precompiled image.
-
- method_infos
-
- Contains additional information needed by the runtime for using the
- precompiled method, like the GOT entries it uses.
-
- method_info_offsets
-
- Maps method indexes to offsets in the method_infos array.
-
- mono_icall_table
-
- A table that lists all the internal calls
- referenced by the precompiled image.
-
- mono_image_table
-
- A list of assemblies referenced by this AOT
- module.
-
- methods
-
- The precompiled code itself.
-
- method_offsets
-
- Maps method indexes to offsets in the methods array.
-
- ex_info
-
- Contains information about methods which is rarely used during normal execution,
- like exception and debug info.
-
- ex_info_offsets
-
- Maps method indexes to offsets in the ex_info array.
-
- class_info
-
- Contains precomputed metadata used to speed up various runtime functions.
-
- class_info_offsets
-
- Maps class indexes to offsets in the class_info array.
-
- class_name_table
-
- A hash table mapping class names to class indexes. Used to speed up
- mono_class_from_name ().
-
- plt
-
- The Procedure Linkage Table
-
- plt_info
-
- Contains information needed to find the method belonging to a given PLT entry.
-
-** Source file structure
------------------------------
-
- The AOT infrastructure is split into two files, aot-compiler.c and
- aot-runtime.c. aot-compiler.c contains the AOT compiler which is invoked by
- --aot, while aot-runtime.c contains the runtime support needed for loading
- code and other things from the aot files.
-
-** Compilation process
-----------------------------
-
- AOT compilation consists of the following stages:
- - collecting the methods to be compiled.
- - compiling them using the JIT.
- - emitting the JITted code and other information into an assembly file (.s).
- - assembling the file using the system assembler.
- - linking the resulting object file into a shared library using the system
- linker.
-
-** Handling compiled code
-----------------------------
-
- Each method is identified by a method index. For normal methods, this is
- equivalent to its index in the METHOD metadata table. For runtime generated
- methods (wrappers), it is an arbitrary number.
- Compiled code is created by invoking the JIT, requesting it to create AOT
- code instead of normal code. This is done by the compile_method () function.
- The output of the JIT is compiled code and a set of patches (relocations). Each
- relocation specifies an offset inside the compiled code, and a runtime object
- whose address is accessed at that offset.
- Patches are described by a MonoJumpInfo structure. From the perspective
- of the AOT compiler, there are two kinds of patches:
- - calls, which require an entry in the PLT.
- - everything else, which requires an entry in the GOT.
- How patches are handled is described in the next section.
- After all the methods are compiled, they are emitted into the output file into
- a byte array called 'methods'. The emission
- is done by the emit_method_code () and emit_and_reloc_code () functions. Each
- piece of compiled code is identified by the local symbol .Lm_<method index>.
- While compiled code is emitted, all the locations which have an associated patch
- are rewritten using a platform specific process so the final generated code will
- refer to the PLT and GOT entries belonging to the patches.
- The compiled code array can be accessed using the 'methods' global
- symbol.
-
-** Handling patches
-----------------------------
-
- Before a piece of AOTed code can be used, the GOT entries used by it must be
- filled out with the addresses of runtime objects. Those objects are identified
- by MonoJumpInfo structures. These structures are saved in a serialized form in
- the AOT file, so the AOT loader can reconstruct them. The serialization is done
- by the encode_patch () function, while the deserialization is done by the
- decode_patch_info () function.
- Every method has an associated method info blob inside the 'method_info' byte
- array in the AOT file. This contains all the information required to load the
- method at runtime:
- - the first got entry used by the method.
- - the number of got entries used by the method.
- - the serialized patch info for the got entries.
- Some patches, like vtables and icalls, are very common, so instead of emitting their
- info every time they are used by a method, we emit the info only once into a
- byte array named 'got_info', and only emit an index into this array for every
- access.
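The loading step described above (first GOT entry, number of entries, patch info) might look roughly like this in C. It is a sketch under simplifying assumptions: the real method info is a serialized byte blob, and the struct, its field names, and init_method_got_slots are invented for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified method info record: the real blob is a serialized
 * byte stream, and these field names are illustrative only. */
typedef struct {
    size_t first_got_entry;     /* index of the first GOT slot the method uses */
    size_t got_entry_count;     /* number of consecutive slots it uses */
    void *const *targets;       /* already-decoded patch targets, one per slot */
} MethodInfoSketch;

#define GOT_SIZE 8
static void *got[GOT_SIZE];     /* slots start out NULL, i.e. unresolved */

/* Fill the GOT slots a method needs before its code may run. */
static void
init_method_got_slots (const MethodInfoSketch *info)
{
    assert (info->first_got_entry + info->got_entry_count <= GOT_SIZE);
    for (size_t i = 0; i < info->got_entry_count; i++) {
        void **slot = &got[info->first_got_entry + i];
        if (*slot == NULL)      /* an already-resolved slot is left alone */
            *slot = info->targets[i];
    }
}
```

Slots outside the range named by the method info are left untouched, so they are only resolved when a method that uses them is loaded.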
-
-** The Procedure Linkage Table (PLT)
-------------------------------------
-
- Our PLT is similar to the ELF PLT. It is used to handle calls between methods.
- If method A needs to call method B, then an entry is allocated in the PLT for
- method B, and A calls that entry instead of B directly. This is useful because
- in some cases the runtime needs to do some processing the first time B is
- called.
- There are two cases:
- - if B is in another assembly, then it needs to be looked up, then JITted or the
- corresponding AOT code needs to be found.
- - if B is in the same assembly, but has got slots, then the got slots need to be
- initialized.
- If none of these cases is true, then the PLT is not used, and the call is made
- directly to the native code of the target method.
- A PLT entry is usually implemented by a jump through a jump table, where the
- jump table entries are initially filled up with the address of a trampoline so
- the runtime can get control, and after the native code of the called method is
- created/found, the jump table entry is changed to point to the native code.
- All PLT entries also embed an integer offset after the jump which indexes into
- the 'plt_info' table, which stores the information required to find the called
- method. The PLT is emitted by the emit_plt () function.
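The patch-on-first-call behaviour of a PLT entry can be modelled in plain C with a function pointer standing in for the jump table slot. This is only an analogy under stated assumptions: real PLT entries are machine-code jumps emitted by emit_plt (), and all names below (plt_slot, plt_trampoline, method_b) are hypothetical.

```c
#include <assert.h>

typedef int (*MethodFn) (int);

static int resolve_count;           /* how often the slow path ran */
static int plt_trampoline (int arg);

/* One-entry jump table; it initially points at the trampoline. */
static MethodFn plt_slot = plt_trampoline;

/* Stand-in for the native code of the called method B. */
static int
method_b (int arg)
{
    return arg * 2;
}

/* Slow path: find (or compile) the target, patch the slot, then forward
 * the original call. */
static int
plt_trampoline (int arg)
{
    resolve_count++;
    plt_slot = method_b;
    return plt_slot (arg);
}

/* What a call site in method A does: an indirect call through the slot. */
static int
call_b_through_plt (int arg)
{
    assert (plt_slot != 0);
    return plt_slot (arg);
}
```

The first call pays for the lookup; every later call goes straight to the target, which is why the slow path runs only once.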
-
-** Exception/Debug info
-----------------------------
-
- Each compiled method has some additional info generated by the JIT, usable
- for debugging (IL offset-native offset maps) and exception handling
- (saved registers, native offsets of try/catch clauses). Since this info is
- rarely needed, it is saved into a separate byte array called 'ex_info'.
-
-** Cached metadata
----------------------------
-
- When the runtime loads a class, it needs to compute a variety of information
- which is not readily available in the metadata, like the instance size,
- vtable, whether the class has a finalizer/type initializer etc. Computing this
- information requires a lot of time, causes the loading of lots of metadata,
- and it usually involves the creation of many runtime data structures
- (MonoMethod/MonoMethodSignature etc), which are long-lived, and usually persist
- for the lifetime of the app. To avoid this, we compute the required information
- at aot compilation time, and save it into the aot image, into an array called
- 'class_info'. The runtime can query this information using the
- mono_aot_get_cached_class_info () function, and if the information is available,
- it can avoid computing it.
-
-** Full AOT mode
--------------------------
-
- Some platforms like the iPhone prohibit JITted code, using technical and/or
- legal means. This is a significant problem for the mono runtime, since it
- generates a lot of code dynamically, using either the JIT or more low-level
- code generation macros. To solve this, the AOT compiler is able to function in
- full-aot or aot-only mode, where it generates and saves all the necessary code
- in the aot image, so at runtime, no code needs to be generated.
- There are two kinds of code which need to be considered:
- - wrapper methods, that is methods whose IL is generated dynamically by the
- runtime. They are handled by generating them in the add_wrappers () function,
- then emitting them the same way as the 'normal' methods. The only problem is
- that these methods do not have a methoddef token, so we need a separate table
- in the aot image ('wrapper_info') to find their method index.
- - trampolines and other small hand generated pieces of code. They are handled
- in an ad-hoc way in the emit_trampolines () function.
-
-* Performance considerations
-----------------------------
-
- Using AOT code is a trade-off which might lead to higher or
- lower performance, depending on a lot of circumstances. Some
- of these are:
-
- - AOT code needs to be loaded from disk before being used, so
- cold startup of an application using AOT code MIGHT be
- slower than using JITed code. Warm startup (when the code is
- already in the machine's cache) should be faster. Also,
- JITing code takes time, and the JIT compiler also needs to
- load additional metadata for the method from the disk, so
- startup can be faster even in the cold startup case.
-
- - AOT code is usually compiled with all optimizations turned
- on, while JITted code is usually compiled with default
- optimizations, so the generated code in the AOT case should
- be faster.
-
- - JITted code can directly access runtime data structures and
- helper functions, while AOT code needs to go through an
- indirection (the GOT) to access them, so it will be slower
- and somewhat bigger as well.
-
- - When JITting code, the JIT compiler needs to load a lot of
- metadata about methods and types into memory.
-
- - JITted code has better locality, meaning that if method A
- calls B, then the native code for A and B is usually quite
- close in memory, leading to better cache behaviour thus
- improved performance. In contrast, the native code of
- methods inside the AOT file is in a somewhat random order.
-
-* Future Work
--------------
-
- - Currently, when an AOT module is loaded, all of its
- dependent assemblies are also loaded eagerly, and these
- assemblies need to be exactly the same as the ones loaded
- when the AOT module was created ('hard binding'). Non-hard
- binding should be allowed.
-
- - On x86, the generated code uses call 0, pop REG, add
- GOTOFFSET, REG to materialize the GOT address. Newer
- versions of gcc use a separate function to do this, maybe we
- need to do the same.
-
- - Currently, we get vtable addresses from the GOT. Another
- solution would be to store the data from the vtables in the
- .bss section, so accessing them would involve less
- indirection.
-
-
-
diff --git a/docs/exception-handling.txt b/docs/exception-handling.txt
deleted file mode 100644
index 1fae4e4e1b7..00000000000
--- a/docs/exception-handling.txt
+++ /dev/null
@@ -1,330 +0,0 @@
-
- Exception Handling In the Mono Runtime
- --------------------------------------
-
-* Introduction
---------------
-
- There are many types of exceptions which the runtime needs to
- handle. These are:
-
- - exceptions thrown from managed code using the 'throw' or 'rethrow' CIL
- instructions.
-
- - exceptions thrown by some IL instructions like InvalidCastException thrown
- by the 'castclass' CIL instruction.
-
- - exceptions thrown by runtime code
-
- - synchronous signals received while in managed code
-
- - synchronous signals received while in native code
-
- - asynchronous signals
-
- Since exception handling is very arch dependent, parts of the
- exception handling code reside in the arch specific
- exceptions-<ARCH>.c files. The architecture independent parts
- are in mini-exceptions.c. The different exception types listed
- above are generated in different parts of the runtime, but
- ultimately, they all end up in the mono_handle_exception ()
- function in mini-exceptions.c.
-
-* Exceptions thrown programmatically from managed code
------------------------------------------------------
-
- These exceptions are thrown from managed code using 'throw' or
- 'rethrow' CIL instructions. The JIT compiler will translate
- them to a call to a helper function called
- 'mono_arch_throw/rethrow_exception'.
-
- These helper functions do not exist at compile time, they are
- created dynamically at run time by the code in the
- exceptions-<ARCH>.c files.
-
- They perform various stack manipulation magic, then call a
- helper function usually named throw_exception (), which does
- further processing in C code, then calls
- mono_handle_exception() to do the rest.
-
-* Exceptions thrown implicitly from managed code
-------------------------------------------------
-
- These exceptions are thrown by some IL instructions when
- something goes wrong. When the JIT needs to throw such an
- exception, it emits a forward conditional branch and remembers
- its position, along with the exception which needs to be
- emitted. This is usually done in macros named
- EMIT_COND_SYSTEM_EXCEPTION in the mini-<ARCH>.c files.
-
- After the machine code for the method is emitted, the JIT
- calls the arch dependent mono_arch_emit_exceptions () function
- which will add the exception throwing code to the end of the
- method, and patches up the previous forward branches so they
- will point to this code.
-
- This has the advantage that the rarely-executed exception
- throwing code is kept separate from the method body, leading
- to better icache performance.
-
- The exception throwing code branches to the dynamically
- generated mono_arch_throw_corlib_exception helper function,
- which creates the proper exception object, does some stack
- manipulation, then calls throw_exception ().
-
-* Exceptions thrown by runtime code
------------------------------------
-
- These exceptions are usually thrown by the implementations of
- InternalCalls (icalls). First an appropriate exception object
- is created with the help of various helper functions in
- metadata/exception.c, which has a separate helper function for
- allocating each kind of exception object used by the runtime
- code. Then the mono_raise_exception () function is called to
- actually throw the exception. That function never returns.
-
- An example:
-
- if (something_is_wrong)
- mono_raise_exception (mono_get_exception_index_out_of_range ());
-
- mono_raise_exception () simply passes the exception to the JIT
- side through an API, where it will be received by the helper
- created by mono_arch_throw_exception (). From now on, it is
- treated as an exception thrown from managed code.
-
-* Synchronous signals
----------------------
-
- For performance reasons, the runtime does not do some checks
- required by the CLI spec. Instead, it relies on the CPU to do
- them. The two main checks which are omitted are null-pointer
- checks, and arithmetic checks. When a null pointer is
- dereferenced by JITted code, the CPU will notify the kernel
- through an interrupt, and the kernel will send a SIGSEGV
- signal to the process. The runtime installs a signal handler
- for SIGSEGV, which is sigsegv_signal_handler () in mini.c. The
- signal handler creates the appropriate exception object and
- calls mono_handle_exception () with it. Arithmetic exceptions
- like division by zero are handled similarly.
-
-* Synchronous signals in native code
-------------------------------------
-
- Receiving a signal such as SIGSEGV while in native code means
- something very bad has happened. Because of this, the runtime
- will abort after trying to print a managed plus a native stack
- trace. The logic is in the mono_handle_native_sigsegv ()
- function.
-
- Note that there are two kinds of native code which can be the
- source of the signal:
-
- - code inside the runtime
- - code inside a native library loaded by an application, e.g. libgtk+
-
-* Stack overflow checking
--------------------------
-
- Stack overflow exceptions need special handling. When a thread
- overflows its stack, the kernel sends it a normal SIGSEGV
- signal, but the signal handler tries to execute on the same stack as
- the thread, leading to a further SIGSEGV which will terminate
- the thread. A solution is to use an alternative signal stack
- supported by UNIX operating systems through the sigaltstack
- (2) system call. When a thread starts up, the runtime will
- install an altstack using the mono_setup_altstack () function
- in mini-exceptions.c. When a SIGSEGV is received, the signal
- handler checks whether the fault address is near the bottom
- of the thread's normal stack. If it is, a
- StackOverflowException is created instead of a
- NullReferenceException. This exception is handled like any other
- exception, with some minor differences.
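Installing an alternate signal stack uses the standard POSIX calls sigaltstack () and sigaction () with SA_ONSTACK. The sketch below shows the general pattern only; it is not the body of mono_setup_altstack (), and the function and handler names are made up for the example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Install an alternate signal stack for the calling thread and route SIGSEGV
 * to it. Returns 0 on success, -1 on failure. */
static int
install_segv_altstack (void (*handler) (int, siginfo_t *, void *))
{
    stack_t ss;
    struct sigaction sa;

    assert (handler != NULL);

    ss.ss_sp = malloc (SIGSTKSZ);
    if (ss.ss_sp == NULL)
        return -1;
    ss.ss_size = SIGSTKSZ;
    ss.ss_flags = 0;
    if (sigaltstack (&ss, NULL) != 0) {
        free (ss.ss_sp);
        return -1;
    }

    memset (&sa, 0, sizeof (sa));
    sa.sa_sigaction = handler;
    sa.sa_flags = SA_SIGINFO | SA_ONSTACK;  /* SA_ONSTACK selects the altstack */
    sigemptyset (&sa.sa_mask);
    return sigaction (SIGSEGV, &sa, NULL);
}

/* Placeholder handler; a real one would inspect info->si_addr to tell a stack
 * overflow from a null dereference. */
static void
segv_handler_sketch (int signo, siginfo_t *info, void *context)
{
    (void) signo; (void) info; (void) context;
    _Exit (1);
}
```

Because the handler runs on its own stack, it can still execute after the thread's normal stack is exhausted.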
-
- There are two reasons why sigaltstack is disabled by default:
-
- * The main problem with sigaltstack() is that the stack
- employed by it is not visible to the GC and it is possible
- that the GC will miss it.
-
- * Working sigaltstack support is very much os/kernel/libc
- dependent, so it is disabled by default.
-
-
-* Asynchronous signals
-----------------------
-
- Async signals are used by the runtime to notify a thread that
- it needs to change its state somehow. Currently, it is used
- for implementing thread abort/suspend/resume.
-
- Handling async signals correctly is a very hard problem,
- since the receiving thread can be in basically any state upon
- receipt of the signal. It can execute managed code, native
- code, it can hold various managed/native locks, or it can be
- in a process of acquiring them, it can be starting up,
- shutting down etc. Most of the C APIs used by the runtime are
- not async-signal safe, meaning it is not safe to call them
- from an async signal handler. In particular, the pthread
- locking functions are not async-safe, so if a signal handler
- interrupted code which was in the process of acquiring a lock,
- and the signal handler tries to acquire a lock, the thread
- will deadlock. Unfortunately, the current signal handling
- code does acquire locks, so sometimes it does deadlock.
-
- When receiving an async signal, the signal handler first tries
- to determine whether the thread was executing managed code
- when it was interrupted. If it was, then it is safe to
- interrupt it, so a ThreadAbortException is constructed and
- thrown. If the thread was executing native code, then it is
- generally not safe to interrupt it. In this case, the runtime
- sets a flag then returns from the signal handler. That flag is
- checked every time the runtime returns from native code to
- managed code, and the exception is thrown then. Also, a
- platform specific mechanism is used to cause the thread to
- interrupt any blocking operation it might be doing.
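The "set a flag, check it later" half of this scheme can be sketched with standard C signal handling. The names here are hypothetical and the example ignores the managed-code case; it only shows a handler that takes no locks and a safe point that consumes the request.

```c
#include <signal.h>

/* Set by the handler, consumed at a safe point. volatile sig_atomic_t is
 * the object type the C standard guarantees a handler may write. */
static volatile sig_atomic_t interruption_requested;

/* Async signal handler: takes no locks and calls nothing, it only records
 * the request. */
static void
interruption_signal_handler (int signo)
{
    (void) signo;
    interruption_requested = 1;
}

/* Hypothetical safe point, e.g. on return from native to managed code.
 * Returns 1 if an interruption was pending and has now been consumed. */
static int
check_pending_interruption (void)
{
    if (!interruption_requested)
        return 0;
    interruption_requested = 0;
    return 1;   /* the runtime would throw ThreadAbortException here */
}
```

Keeping the handler this small sidesteps the deadlock risk mentioned above, since nothing in it can block on a lock held by the interrupted code.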
-
- The async signal handler is in sigusr1_signal_handler () in
- mini.c, while the logic which determines whether an exception
- is safe to be thrown is in mono_thread_request_interruption
- ().
-
-* Stack unwinding during exception handling
--------------------------------------------
-
- The execution state of a thread during exception handling is
- stored in an arch-specific structure called MonoContext. This
- structure contains the values of all the CPU registers
- relevant during exception handling, which usually means:
-
- - IP (instruction pointer)
- - SP (stack pointer)
- - FP (frame pointer)
- - callee saved registers
-
- Callee saved registers are the registers which are required by
- any procedure to be saved/restored before/after using
- them. They are usually defined by each platform's ABI
- (Application Binary Interface). For example, on x86, they are
- EBX, ESI and EDI.
-
- The code which calls mono_handle_exception () is required to
- construct the initial MonoContext. How this is done depends on
- the caller. For exceptions thrown from managed code, the
- mono_arch_throw_exception helper function saves the values of
- the required registers and passes them to throw_exception (),
- which will save them in the MonoContext structure. For
- exceptions thrown from signal handlers, the MonoContext
- structure is initialized from the signal info received from the
- kernel.
-
- During exception handling, the runtime needs to 'unwind' the
- stack, i.e. given the state of the thread at a stack frame,
- construct the state at its callers. Since this is platform
- specific, it is done by a platform specific function called
- mono_arch_find_jit_info ().
-
- Two kinds of stack frames need handling:
-
- - Managed frames are easier. The JIT will store some
- information about each managed method, like which
- callee-saved registers it uses. Based on this information,
- mono_arch_find_jit_info () can find the values of the
- registers on the thread stack, and restore them.
-
- - Native frames are problematic, since we have no information
- about how to unwind through them. Some compilers generate
- unwind information for code, some don't. Also, there is no
- general purpose library to obtain and decode this unwind
- information. So the runtime uses a different solution. When
- managed code needs to call into native code, it does so through
- a managed->native wrapper function, which is generated by
- the JIT. This function is responsible for saving the machine
- state into a per-thread structure called MonoLMF (Last
- Managed Frame). These LMF structures are stored on the
- thread's stack, and are linked together using one of their
- fields. When the unwinder encounters a native frame, it
- simply pops one entry off the LMF 'stack', and uses it to
- restore the frame state to the moment before control passed
- to native code. In effect, all successive native frames are
- skipped together.
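The LMF chain is in essence an intrusive linked list whose nodes live in wrapper stack frames. A reduced sketch, assuming a struct with only three fields (the real MonoLMF is arch-specific and holds more saved state) and hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced LMF: the real MonoLMF is arch-specific and holds
 * more saved state. */
typedef struct LmfSketch {
    struct LmfSketch *previous;   /* link to the next older entry */
    void *saved_ip;               /* managed return address */
    void *saved_sp;               /* managed stack pointer */
} LmfSketch;

/* Per-thread list head; a single global stands in for thread-local storage. */
static LmfSketch *lmf_top;

/* Called by the managed->native wrapper before entering native code. The
 * entry lives in the wrapper's own stack frame. */
static void
lmf_push (LmfSketch *entry, void *ip, void *sp)
{
    entry->saved_ip = ip;
    entry->saved_sp = sp;
    entry->previous = lmf_top;
    lmf_top = entry;
}

/* Called by the unwinder on reaching a native frame: pop one entry and
 * resume unwinding from the managed state it recorded. */
static LmfSketch *
lmf_pop (void)
{
    LmfSketch *entry = lmf_top;
    assert (entry != NULL);
    lmf_top = entry->previous;
    return entry;
}
```

Popping one entry jumps over every native frame between two managed frames at once, which is the "skipped together" behaviour described above.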
-
-Problems/future work
---------------------
-
-1. Async signal safety
-----------------------
-
- The current async signal handling code is not async safe, so
- it can and does deadlock in practice. It needs to be rewritten
- to avoid taking locks at least until it can determine that it
- was interrupting managed code.
-
- Another problem is the managed stack frame unwinding code. It
- blindly assumes that if the IP points into a managed frame,
- then all the callee saved registers + the stack pointer are
- saved on the stack. This is not true if the thread was
- interrupted while executing the method prolog/epilog.
-
-2. Raising exceptions from native code
---------------------------------------
-
- Currently, exceptions are raised by calling
- mono_raise_exception () in the middle of runtime code. This
- has two problems:
-
- - No cleanup is done, i.e. if the caller of the function which
- throws an exception has taken locks, or allocated memory,
- that is not cleaned up. For this reason, it is only safe to
- call mono_raise_exception () 'very close' to managed code,
- i.e. in the icall functions themselves.
-
- - To allow mono_raise_exception () to unwind through native
- code, we need to save the LMF structures which can add a lot
- of overhead even in the common case when no exception is
- thrown. So this is not zero-cost exception handling.
-
- An alternative might be to use a JNI style
- set-pending-exception API. Runtime code could call
- mono_set_pending_exception (), then return to its caller with
- an error indication allowing the caller to clean up. When
- execution returns to managed code, the managed->native
- wrapper could check whether there is a pending exception and
- throw it if necessary. Since we already check for pending
- thread interruption, this would have no overhead, allowing us
- to drop the LMF saving/restoring code, or significant parts of
- it.
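The proposed set-pending-exception style could look roughly like the following. The proposal above is only an outline, so everything here is an assumption: the exception type, the global used in place of per-thread state, and the function names are illustrative and not an existing runtime API.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a managed exception object. */
typedef struct { const char *name; } ExceptionSketch;

/* Per-thread in a real runtime; a global keeps the sketch short. */
static ExceptionSketch *pending_exception;

static void
set_pending_exception (ExceptionSketch *exc)
{
    assert (exc != NULL);
    pending_exception = exc;
}

/* Runtime code: record the exception and return an error code, so callers
 * can still release locks and memory. */
static int
icall_get_element (const int *array, size_t length, size_t index, int *result)
{
    static ExceptionSketch index_out_of_range = { "IndexOutOfRangeException" };

    if (index >= length) {
        set_pending_exception (&index_out_of_range);
        return -1;
    }
    *result = array[index];
    return 0;
}

/* What the managed->native wrapper would do on return: take the pending
 * exception, if any, and throw it from managed code. */
static ExceptionSketch *
take_pending_exception (void)
{
    ExceptionSketch *exc = pending_exception;
    pending_exception = NULL;
    return exc;
}
```

Unlike mono_raise_exception (), nothing here unwinds through native frames, so each native caller gets a normal return and a chance to clean up.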
-
-3. libunwind
-------------
-
- There is an OSS project called libunwind which is a standalone
- stack unwinding library. It is currently in development, but
- it is used by default by gcc on ia64 for its stack
- unwinding. The mono runtime also uses it on ia64. It has
- several advantages in relation to our current unwinding code:
-
- - it has a platform independent API, i.e. the same unwinding
- code can be used on multiple platforms.
-
- - it can generate unwind tables which are correct at every
- instruction, i.e. can be used for unwinding from async
- signals.
-
- - given sufficient unwind info generated by a C compiler, it
- can unwind through C code.
-
- - most of its API is async-safe
-
- - it implements the gcc C++ exception handling API, so in
- theory it can be used to implement mixed-language exception
- handling (i.e. C++ exception caught in mono, mono exception
- caught in C++).
-
- - it is MIT licensed
-
- The biggest problem with libunwind is its platform support. ia64 support is
- complete/well tested, while support for other platforms is missing/incomplete.
-
- http://www.hpl.hp.com/research/linux/libunwind/
-
diff --git a/docs/jit-regalloc b/docs/jit-regalloc
deleted file mode 100644
index 47a277046c8..00000000000
--- a/docs/jit-regalloc
+++ /dev/null
@@ -1,283 +0,0 @@
-Register Allocation
-===================
-
-The current JIT implementation uses a tree matcher to generate code. We use a
-simple algorithm to allocate registers in trees, and limit the number of used
-temporary registers to 4 when evaluating trees. So we can use 2 registers for
-global register allocation.
-
-Register Allocation for Trees
-=============================
-
-We always evaluate trees from left to right. When there are no more registers
-available we need to spill values to memory. Here is the simplified algorithm.
-
-gboolean
-tree_allocate_regs (tree, exclude_reg)
-{
-    if (!tree_allocate_regs (tree->left, -1))
-        return FALSE;
-
-    if (!tree_allocate_regs (tree->right, -1)) {
-
-        /* not enough registers: mark the left result as spilled and retry */
-        tree->left->spilled = TRUE;
-
-        free_used_regs (tree->left);
-
-        if (!tree_allocate_regs (tree->right, tree->left->reg))
-            return FALSE;
-    }
-
-    free_used_regs (tree->left);
-    free_used_regs (tree->right);
-
-    /* try to allocate a register (reg != exclude_reg) */
-    if ((tree->reg = next_free_reg (exclude_reg)) != -1)
-        return TRUE;
-
-    return FALSE;
-}
-
-The emit routine actually spills the registers:
-
-tree_emit (tree)
-{
-
-    tree_emit (tree->left);
-
-    if (tree->left->spilled)
-        save_reg (tree->left->reg);
-
-    tree_emit (tree->right);
-
-    if (tree->left->spilled)
-        restore_reg (tree->left->reg);
-
-
-    emit_code (tree);
-}
-
-
-Global Register Allocation
-==========================
-
-TODO.
-
-Local Register Allocation
-=========================
-
-This section describes the cross-platform local register allocator which is
-in the file mini-codegen.c.
-
-The input to the allocator is a basic block which contains linear IL, i.e.
-instructions of the form:
-
- DEST <- SRC1 OP SRC2
-
-where DEST, SRC1, and SRC2 are virtual registers (vregs). The job of the
-allocator is to assign hard or physical registers (hregs) to each virtual
-register so the vreg references in the instructions can be replaced with their
-assigned hreg, allowing machine code to be generated later.
-
-The allocator needs information about the number and types of arguments of
-instructions. It takes this information from the machine description files. It
-also needs arch specific information, like the number and type of the hard
-registers. It gets this information from arch-specific macros.
-
-Currently, the vregs and hregs are partitioned into two classes: integer and
-floating point.
-
-The allocator consists of two phases: In the first phase, a forward pass is
-made over the instructions, collecting liveness information for vregs. In the
-second phase, a backward pass is made over the instructions, assigning
-registers. This backward mode of operation makes the allocator
-somewhat difficult to understand, but leads to better code in most cases.
-
-Allocator state
-===============
-
-The state of the allocator is stored in two arrays: iassign and isymbolic.
-iassign maps vregs to hregs, while isymbolic maps hregs back to vregs.
-For a vreg, iassign [vreg] can contain the following values:
-
- * -1 vreg has no assigned hreg
-
- * hreg index (>= 0) vreg is assigned to the given hreg. This means
- later instructions (which we have already
- processed due to the backward direction) expect
- the value of vreg to be found in hreg.
-
- * spill slot index (< -1) vreg is spilled to the given spill slot. This
- means later instructions expect the value of
- vreg to be found on the stack in the given
- spill slot.
-
-Also, the allocator keeps track of which hregs are free and which are used.
-This information is stored in a bitmask called ifree_mask.
-
-There is a similar set of data structures for floating point registers.
-
-Spilling
-========
-
-When the allocator needs a free hreg, but all of them are assigned, it needs to
-free up one of them. It does this by spilling the contents of the vreg which
-is currently assigned to the selected hreg. Since later instructions expect
-the vreg to be found in the selected hreg, the allocator emits a spill-load
-instruction to load the value from the spill slot into the hreg after the
-currently processed instruction. When the vreg which is spilled is a
-destination in an instruction, the allocator will emit a spill-store to store
-the value into the spill slot.
-
-Fixed registers
-===============
-
-Some architectures, notably x86/amd64, require that the arguments/results of
-some instructions be assigned to specific hregs. An example is the shift
-opcodes on x86, where the second argument must be in ECX. The allocator
-has support for this. It tries to allocate the vreg to the required hreg. If
-that's not possible, then it will emit compensation code which moves values to
-the correct registers before/after the instruction.
-
-Fixed registers are mainly used on x86, but they are useful on more regular
-architectures as well, for example to model that after a call instruction, the
-return value of the call is in a specific register.
-
-A special case of fixed registers is two address architectures, like the x86,
-where the instructions place their results into their first argument. This is
-modelled in the allocator by allocating SRC1 and DEST to the same hreg.
-
-Global registers
-================
-
-Some variables might already be allocated to hardware registers during the
-global allocation phase. In this case, SRC1, SRC2 and DEST might already be
-a hardware register. The allocator needs to do nothing in this case, except
-when the architecture uses fixed registers, in which case it needs to emit
-compensation code.
-
-Register pairs
-==============
-
-64 bit arithmetic on 32 bit machines requires instructions whose arguments are
-not registers, but register pairs. The allocator has support for this, both
-for freely allocatable register pairs, and for register pairs which are
-constrained to specific hregs (EDX:EAX on x86).
-
-Floating point stack
-====================
-
-The x86 architecture uses a floating point register stack instead of a set of
-fp registers. The allocator supports this by keeping track of the height of the
-fp stack, and spilling/loading values from the stack as necessary.
-
-Calls
-=====
-
-Calls need special handling for two reasons: first, they will clobber all
-caller-save registers, meaning their contents will need to be spilled. Also,
-some architectures pass arguments in registers. The registers used for
-passing arguments are usually the same as the ones used for local allocation,
-so the allocator needs to handle them specially. This is done as follows:
-the MonoInst for the call instruction contains a map mapping vregs which
-contain the argument values to hregs where the argument needs to be placed,
-like this (on amd64):
-
-R33 -> RDI
-R34 -> RSI
-...
-
-When the allocator processes the call instruction, it allocates the vregs
-in the map to their associated hregs. So the call instruction is processed as
-if it had a variable number of arguments with fixed register assignments.
-
-An example:
-
- R33 <- 1
- R34 <- 2
- call
-
-When the call instruction is processed, R33 is assigned to RDI, and R34 is
-assigned to RSI. Later, when the two assignment instructions are processed,
-R33 and R34 are already assigned to a hreg, so they are replaced with the
-associated hreg leading to the following final code:
-
- RDI <- 1
- RSI <- 2
- call
-
-Machine description files
-=========================
-
-A typical entry in the machine description files looks like this:
-
- shl: dest:i src1:i src2:s clob:1 len:2
-
-The allocator is only interested in the dest, src1, src2 and clob fields.
-It understands the following values for the dest, src1, src2 fields:
-
- i - integer register
- f - fp register
- b - base register (same as i, but the instruction does not modify the reg)
- m - fp register, even if an fp stack is used (no fp stack tracking)
-
-It understands the following values for the clob field:
-
- 1 - sreg1 needs to be the same as dreg
- c - instruction clobbers the caller-save registers
-
-Besides these values, an architecture can define additional values (like the 's'
-in the example). The allocator depends on a set of arch-specific macros to
-convert these values to information it needs during allocation.
-
-Arch specific macros
-====================
-
-These macros usually receive a value from the machine description file (like
-the 's' in the example). The examples below are for x86.
-
-/*
- * A bitmask selecting the caller-save registers (these are used for local
- * allocation).
- */
-#define MONO_ARCH_CALLEE_REGS X86_CALLEE_REGS
-
-/*
- * A bitmask selecting the callee-saved registers (these are usually used for
- * global allocation).
- */
-#define MONO_ARCH_CALLEE_SAVED_REGS X86_CALLER_REGS
-
-/* Same for the floating point registers */
-#define MONO_ARCH_CALLEE_FREGS 0
-#define MONO_ARCH_CALLEE_SAVED_FREGS 0
-
-/* Whether the target uses a floating point stack */
-#define MONO_ARCH_USE_FPSTACK TRUE
-
-/* The size of the floating point stack */
-#define MONO_ARCH_FPSTACK_SIZE 6
-
-/*
- * Given a descriptor value from the machine description file, return the fixed
- * hard reg corresponding to that value.
- */
-#define MONO_ARCH_INST_FIXED_REG(desc) ((desc == 's') ? X86_ECX : ((desc == 'a') ? X86_EAX : ((desc == 'd') ? X86_EDX : ((desc == 'y') ? X86_EAX : ((desc == 'l') ? X86_EAX : -1)))))
-
-/*
- * A bitmask selecting the hregs which can be used for allocating sreg2 for
- * a given instruction.
- */
-#define MONO_ARCH_INST_SREG2_MASK(ins) (((ins [MONO_INST_CLOB] == 'a') || (ins [MONO_INST_CLOB] == 'd')) ? (1 << X86_EDX) : 0)
-
-/*
- * Given a descriptor value, return whether it denotes a register pair.
- */
-#define MONO_ARCH_INST_IS_REGPAIR(desc) (desc == 'l' || desc == 'L')
-
-/*
- * Given a descriptor value, and the first register of a regpair, return a
- * bitmask selecting the hregs which can be used for allocating the second
- * register of the regpair.
- */
-#define MONO_ARCH_INST_REGPAIR_REG2(desc,hreg1) (desc == 'l' ? X86_EDX : -1)
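The backward pass and the call handling described in the "Calls" section above can be illustrated with a small sketch. This is not the code from mini-codegen.c: every `Demo`/`demo_` name, the two-register machine, and the instruction layout are assumptions made for illustration. Only the roles of iassign, isymbolic and ifree_mask are taken from the text. Spilling is left out, so the sketch simply fails when a destination vreg has no hreg.

```c
/* Hypothetical miniature of the allocator state; not Mono's real types. */
enum { DEMO_RDI = 0, DEMO_RSI = 1, DEMO_NUM_HREGS = 2, DEMO_NUM_VREGS = 64 };

typedef enum { DEMO_OP_CONST, DEMO_OP_CALL } DemoOpcode;

typedef struct {
    DemoOpcode op;
    int dreg;          /* vreg before allocation, hreg after; -1 if unused */
    int value;         /* constant loaded by DEMO_OP_CONST */
    int nargs;         /* DEMO_OP_CALL: number of entries in the map */
    int arg_vreg [2];  /* DEMO_OP_CALL: vregs holding the argument values */
    int arg_hreg [2];  /* DEMO_OP_CALL: hregs the arguments must be in */
} DemoInst;

typedef struct {
    int iassign [DEMO_NUM_VREGS];    /* vreg -> hreg, or -1 if unassigned */
    int isymbolic [DEMO_NUM_HREGS];  /* hreg -> vreg, or -1 if free */
    unsigned ifree_mask;             /* bit n set means hreg n is free */
} DemoRegState;

static void
demo_state_init (DemoRegState *rs)
{
    int i;

    for (i = 0; i < DEMO_NUM_VREGS; ++i)
        rs->iassign [i] = -1;
    for (i = 0; i < DEMO_NUM_HREGS; ++i)
        rs->isymbolic [i] = -1;
    rs->ifree_mask = (1u << DEMO_NUM_HREGS) - 1;
}

static void
demo_assign (DemoRegState *rs, int vreg, int hreg)
{
    rs->iassign [vreg] = hreg;
    rs->isymbolic [hreg] = vreg;
    rs->ifree_mask &= ~(1u << hreg);
}

/*
 * Walk the instructions from last to first. Returns 0 on success and -1
 * when a destination vreg has no hreg (a real allocator would pick a free
 * hreg or spill instead of giving up).
 */
static int
demo_local_regalloc (DemoInst *ins, int count)
{
    DemoRegState rs;
    int i, j;

    demo_state_init (&rs);
    for (i = count - 1; i >= 0; --i) {
        if (ins [i].op == DEMO_OP_CALL) {
            /* The call pins each argument vreg to its fixed hreg. */
            for (j = 0; j < ins [i].nargs; ++j)
                demo_assign (&rs, ins [i].arg_vreg [j], ins [i].arg_hreg [j]);
        } else {
            int vreg = ins [i].dreg;
            int hreg;

            if (vreg < 0 || vreg >= DEMO_NUM_VREGS)
                return -1;
            hreg = rs.iassign [vreg];
            if (hreg < 0)
                return -1;
            /* Later instructions expect the value in hreg: rewrite the
             * definition, then release the hreg for earlier instructions. */
            ins [i].dreg = hreg;
            rs.iassign [vreg] = -1;
            rs.isymbolic [hreg] = -1;
            rs.ifree_mask |= 1u << hreg;
        }
    }
    return 0;
}
```

Feeding it the three-instruction example (R33 <- 1, R34 <- 2, call with the map R33 -> RDI, R34 -> RSI) rewrites the two assignments to target RDI and RSI respectively, which matches the final code shown in the "Calls" section.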
diff --git a/docs/memory-management.txt b/docs/memory-management.txt
deleted file mode 100644
index a78ab5e3bf0..00000000000
--- a/docs/memory-management.txt
+++ /dev/null
@@ -1,32 +0,0 @@
-Metadata memory management
---------------------------
-
-Most metadata structures have a lifetime equal to that of the MonoImage they are
-loaded from. These structures should be allocated from the memory pool of the
-corresponding MonoImage. The memory pool is protected by the loader lock.
-Examples of metadata structures in this category:
-- MonoClass
-- MonoMethod
-- MonoType
-Memory owned by these structures should be allocated from the image mempool as well.
-Examples include: klass->methods, klass->fields, method->signature etc.
-
-Generics complicate things. A generic class could have many instantiations where the
-generic arguments are from different assemblies. Where should we allocate memory for
-instantiations? We can allocate from the mempool of the image which contains the
-generic type definition, but that would mean that the instantiations would remain in
-memory even after the assemblies containing their type arguments are unloaded, leading
-to a memory leak. Therefore, we do the following:
-- data structures representing the generic definitions are allocated from the image
- mempool as usual. These include:
- - generic class definition (MonoGenericClass->container_class)
- - generic method definitions
- - type parameters (MonoGenericParam)
-- data structures representing inflated classes/images are allocated from the heap. These
- structures are kept in a cache, indexed by type arguments of the instantiations. When
- an assembly is unloaded, this cache is searched and all instantiations referencing
- types from the assembly are freed. This is done by mono_metadata_clean_for_image ()
- in metadata.c. The structures handled this way include:
- - MonoGenericClass
- - MonoGenericInst
- - inflated MonoMethods
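The cache cleanup described above for heap-allocated instantiations can be sketched as follows. This is an illustration only, not the code of mono_metadata_clean_for_image (): all `Demo`/`demo_` names are hypothetical, the cache is a plain linked list rather than a table indexed by type arguments, and locking is omitted.

```c
#include <stdlib.h>

#define DEMO_MAX_TYPE_ARGS 4

/* Hypothetical stand-in for an image; only its identity matters here. */
typedef struct DemoImage { const char *name; } DemoImage;

/* Hypothetical heap-allocated instantiation record. */
typedef struct DemoGenericInst {
    DemoImage *container_image;                  /* image of the generic definition */
    DemoImage *arg_images [DEMO_MAX_TYPE_ARGS];  /* images of the type arguments */
    int argc;
    struct DemoGenericInst *next;
} DemoGenericInst;

static DemoGenericInst *demo_inst_cache;

/* Allocate an instantiation from the heap and put it in the cache. */
DemoGenericInst *
demo_inst_cache_add (DemoImage *container, DemoImage **args, int argc)
{
    DemoGenericInst *inst;
    int i;

    if (argc < 0 || argc > DEMO_MAX_TYPE_ARGS)
        return NULL;
    inst = calloc (1, sizeof (DemoGenericInst));
    if (!inst)
        return NULL;
    inst->container_image = container;
    inst->argc = argc;
    for (i = 0; i < argc; ++i)
        inst->arg_images [i] = args [i];
    inst->next = demo_inst_cache;
    demo_inst_cache = inst;
    return inst;
}

static int
demo_inst_references_image (const DemoGenericInst *inst, const DemoImage *image)
{
    int i;

    if (inst->container_image == image)
        return 1;
    for (i = 0; i < inst->argc; ++i)
        if (inst->arg_images [i] == image)
            return 1;
    return 0;
}

/* Free every cached instantiation that references IMAGE; return the count. */
int
demo_inst_cache_clean_for_image (const DemoImage *image)
{
    DemoGenericInst **link = &demo_inst_cache;
    int freed = 0;

    while (*link) {
        DemoGenericInst *inst = *link;

        if (demo_inst_references_image (inst, image)) {
            *link = inst->next;   /* unlink before freeing */
            free (inst);
            ++freed;
        } else {
            link = &inst->next;
        }
    }
    return freed;
}

/* Number of instantiations currently cached. */
int
demo_inst_cache_count (void)
{
    const DemoGenericInst *inst;
    int n = 0;

    for (inst = demo_inst_cache; inst; inst = inst->next)
        ++n;
    return n;
}
```

Because the records come from the heap rather than from an image mempool, unloading one assembly can release exactly the instantiations that mention it while the rest stay cached.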
diff --git a/docs/thread-safety.txt b/docs/thread-safety.txt
deleted file mode 100644
index c1e5d7f8720..00000000000
--- a/docs/thread-safety.txt
+++ /dev/null
@@ -1,118 +0,0 @@
-
-1. Thread safety of metadata structures
-----------------------------------------
-
-1.1 Synchronization of read-only data
--------------------------------------
-
-Read-only data is data which is not modified after creation, like the
-actual binary metadata in the metadata tables.
-
-There are three kinds of threads with regard to read-only data:
-- readers
-- the creator of the data
-- the destroyer of the data
-
-Most threads are readers.
-
-- synchronization between readers is not necessary
-- synchronization between the writers is done using locks.
-- synchronization between the readers and the creator is done by not exposing
- the data to readers before it is fully constructed.
-- synchronization between the readers and the destroyer: TBD.
-
-1.2 Deadlock prevention plan
-----------------------------
-
-Hold locks for the shortest time possible. Avoid calling functions inside
-locks which might obtain global locks (i.e. locks known outside this module).
-
-1.3 Locks
-----------
-
-1.3.1 Simple locks
-------------------
-
- There are a lot of global data structures which can be protected by a 'simple' lock. Simple means:
- - the lock protects only this data structure or it only protects the data structures in a given C module.
- An example would be the appdomains list in domain.c
- - the lock can span many modules, but it still protects access to a single resource or set of resources.
- An example would be the image lock, which protects all data structures that belong to a given MonoImage.
- - the lock is only held for a short amount of time, and no other lock is acquired inside this simple lock. Thus there is
- no possibility of deadlock.
-
- Simple locks include, at least, the following:
- - the per-image lock acquired by using mono_image_(un)lock functions.
-
-1.3.2 The class loader lock
----------------------------
-
-This lock is held by the class loading routines in class.c and loader.c. It
-protects the various caches inside MonoImage which are used by these modules.
-
-1.3.3 The domain lock
----------------------
-
-Each appdomain has a lock which protects the per-domain data structures.
-
-1.3.4 The locking hierarchy
----------------------------
-
-It is useful to model locks by a locking hierarchy, which is a relation between locks that is reflexive, transitive,
-and antisymmetric, in other words, a partial order. If a thread wants to acquire a lock B while already holding A, it can only
-do it if A < B. If all threads work this way, then no deadlocks can occur.
-
-Our locking hierarchy so far looks like this:
- <DOMAIN LOCK>
- \
- <CLASS LOADER LOCK>
- \ \
- <SIMPLE LOCK 1> <SIMPLE LOCK 2>
-
-1.4 Notes
-----------
-
-Some common scenarios:
- if a function needs to access a data structure, then it should lock it itself, and not count on its caller locking it.
- So for example, the image->class_cache hash table would be locked by mono_class_get().
-
-- there are lots of places where a runtime data structure is created and stored in a cache. In these places, care must be
- taken to avoid multiple threads creating the same runtime structure, for example, two threads might call mono_class_get ()
- with the same class name. There are two choices here:
-
-    <enter mutex>
-    <check that item is created>
-    if (created) {
-        <leave mutex>
-        return item
-    }
-    <create item>
-    <store it in cache>
-    <leave mutex>
-
- This is the easiest solution, but it requires holding the lock for the whole time which might create a scalability problem, and could also lead to deadlock.
-
-    <enter mutex>
-    <check that item is created>
-    <leave mutex>
-    if (created) {
-        return item
-    }
-    <create item>
-    <enter mutex>
-    <check that item is created>
-    if (created) {
-        /* Another thread already created and stored the same item */
-        <free our item>
-        <leave mutex>
-        return orig item
-    }
-    else {
-        <store item in cache>
-        <leave mutex>
-        return item
-    }
-
- This solution does not present scalability problems, but the created item might be hard to destroy (like a MonoClass).
-
-- lazy initialization of hashtables etc. is not thread safe
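The second caching pattern from section 1.4 (create the item outside the lock, then re-check before storing it) might look like this with a POSIX mutex. It is a sketch under assumptions, not runtime code: the `Demo`/`demo_` names are hypothetical, the cache is a linked list keyed by name, and the item is trivial to free, which the text notes is not true for something like a MonoClass. Unlike the pseudocode, the losing thread frees its duplicate after leaving the mutex to keep the critical section short.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cached item, identified by name. */
typedef struct DemoItem {
    char *name;
    struct DemoItem *next;
} DemoItem;

static pthread_mutex_t demo_cache_mutex = PTHREAD_MUTEX_INITIALIZER;
static DemoItem *demo_cache;   /* protected by demo_cache_mutex */

/* Caller must hold demo_cache_mutex. */
static DemoItem *
demo_cache_lookup (const char *name)
{
    DemoItem *it;

    for (it = demo_cache; it; it = it->next)
        if (strcmp (it->name, name) == 0)
            return it;
    return NULL;
}

static DemoItem *
demo_item_new (const char *name)
{
    size_t len = strlen (name) + 1;
    DemoItem *it = malloc (sizeof (DemoItem));

    if (!it)
        return NULL;
    it->name = malloc (len);
    if (!it->name) {
        free (it);
        return NULL;
    }
    memcpy (it->name, name, len);
    it->next = NULL;
    return it;
}

static void
demo_item_free (DemoItem *it)
{
    free (it->name);
    free (it);
}

DemoItem *
demo_get_item (const char *name)
{
    DemoItem *item, *orig;

    /* First check: short critical section, no creation under the lock. */
    pthread_mutex_lock (&demo_cache_mutex);
    item = demo_cache_lookup (name);
    pthread_mutex_unlock (&demo_cache_mutex);
    if (item)
        return item;

    /* Potentially slow creation happens without the lock held. */
    item = demo_item_new (name);
    if (!item)
        return NULL;

    /* Second check: another thread may have stored the same item meanwhile. */
    pthread_mutex_lock (&demo_cache_mutex);
    orig = demo_cache_lookup (name);
    if (orig) {
        pthread_mutex_unlock (&demo_cache_mutex);
        demo_item_free (item);
        return orig;
    }
    item->next = demo_cache;
    demo_cache = item;
    pthread_mutex_unlock (&demo_cache_mutex);
    return item;
}
```

Two calls with the same name return the same pointer, whichever thread wins the race to store it.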