github.com/mono/mono.git
Diffstat (limited to 'mcs/docs/compiler')
-rwxr-xr-x mcs/docs/compiler 374
1 files changed, 0 insertions, 374 deletions
diff --git a/mcs/docs/compiler b/mcs/docs/compiler
deleted file mode 100755
index 91ac4980107..00000000000
--- a/mcs/docs/compiler
+++ /dev/null
@@ -1,374 +0,0 @@
- The Internals of the Mono C# Compiler
-
- Miguel de Icaza
- (miguel@ximian.com)
- 2002
-
-* Abstract
-
- The Mono C# compiler is a C# compiler written in C# itself.
- Its goal is to provide a free, alternative implementation
- of the C# language. The Mono C# compiler generates ECMA CIL
- images through the System.Reflection.Emit API, which
- enables the compiler to be platform independent.
-
-* Overview: How the compiler fits together
-
- The compilation process is managed by the compiler driver (it
- lives in driver.cs).
-
- The compiler reads a set of C# source code files, and parses
- them. Any assemblies or modules that the user might want to
- use with their project are loaded after parsing is done.
-
- Once all the files have been parsed, the type hierarchy is
- resolved. First interfaces are resolved, then types and
- enumerations.
-
- Once the type hierarchy is resolved, every type is populated:
- fields, methods, indexers, properties, events and delegates
- are entered into the type system.
-
- At this point the program skeleton has been completed. The
- next process is to actually emit the code for each of the
- executable methods. The compiler drives this from
- RootContext.EmitCode.
-
- Each type then has to populate its methods: populating a
- method requires creating a structure that is used as the state
- of the block being emitted (this is the EmitContext class) and
- then generating code for the topmost statement (the Block).
-
- Code generation has two phases: the first phase is the semantic
- analysis (Resolve method) that resolves any pending tasks, and
- guarantees that the code is correct. The second phase is the
- actual code emission. All errors are flagged during the
- "Resolution" process.
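The two phases can be pictured with a small sketch. It is not the compiler's own code: `Node`, `LoadVariable` and `TwoPhase` are hypothetical names, and the "emitted code" is just a list of strings. It only illustrates the idea that every node is resolved first and that emission happens only when resolution flagged no errors.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical miniature of the two-phase design; not the mcs classes.
abstract class Node {
    // Phase 1: semantic analysis. Returns false after recording an error.
    public abstract bool Resolve (List<string> errors);
    // Phase 2: code emission. Only called once everything resolved cleanly.
    public abstract void Emit (List<string> output);
}

class LoadVariable : Node {
    readonly string name;
    readonly HashSet<string> declared;

    public LoadVariable (string name, HashSet<string> declared)
    {
        this.name = name;
        this.declared = declared;
    }

    public override bool Resolve (List<string> errors)
    {
        if (declared.Contains (name))
            return true;
        errors.Add ("undeclared name `" + name + "'");
        return false;
    }

    public override void Emit (List<string> output)
    {
        output.Add ("load " + name);
    }
}

static class TwoPhase {
    // Resolves every node first; emits only if no errors were flagged.
    public static bool Compile (IEnumerable<Node> nodes,
                                List<string> errors, List<string> output)
    {
        var resolved = new List<Node> ();
        foreach (Node n in nodes)
            if (n.Resolve (errors))
                resolved.Add (n);
        if (errors.Count > 0)
            return false;
        foreach (Node n in resolved)
            n.Emit (output);
        return true;
    }
}
```

Keeping all diagnostics in the first phase means the emission code never has to defend against half-valid input.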
-
- After all code has been emitted, the compiler closes all
- the types (this basically tells the Reflection.Emit library to
- finish up the types). Resources and the definition of the entry
- point are also handled at this point, and the output is saved to
- disk.
-
-* The parsing process
-
- All the input files that make up a program need to be read in
- advance, because C# allows declarations to happen after an
- entity is used, for example, the following is a valid program:
-
-     class X : Y {
-         static void Main ()
-         {
-             // `a' and `b' are used before they are declared
-             a = "hello"; b = "world";
-         }
-         static string a;
-     }
-
-     class Y {
-         public static string b;
-     }
-
- At the time the assignment expression `a = "hello"' is parsed,
- it is not known whether a is a class field from this class, or
- its parents, or whether it is a property access or a variable
- reference. The actual meaning of `a' will not be discovered
- until the semantic analysis phase.
-
-** The Tokenizer and the pre-processor
-
- The tokenizer is contained in the file `cs-tokenizer.cs', and
- the main entry point is the `token ()' method. The tokenizer
- implements the `yyParser.yyInput' interface, which is what the
- Yacc/Jay parser will use when fetching tokens.
-
- Token definitions are generated by jay during the compilation
- process, and those can be referenced from the tokenizer class
- with the `Token.' prefix.
-
- Each time a token is returned, the location for the token is
- recorded into the `Location' property, which can be accessed by
- the parser. The parser retrieves the Location properties as
- it builds its internal representation to allow the semantic
- analysis phase to produce error messages that can pinpoint
- the location of the problem.
-
- Some tokens have values associated with them, for example when
- the tokenizer encounters a string, it will return a
- LITERAL_STRING token, and the actual string parsed will be
- available in the `Value' property of the tokenizer. The same
- mechanism is used to return integers and floating point
- numbers.
-
- C# has a limited pre-processor that allows conditional
- compilation, but it is not as fully featured as the C
- pre-processor, and most notably, macros are missing. This
- makes it simple to implement in very few lines and mesh it
- with the tokenizer.
-
- The `handle_preprocessing_directive' method in the tokenizer
- handles all the pre-processing, and it is invoked when the '#'
- symbol is found as the first token in a line.
-
- The state of the pre-processor is contained in a Stack called
- `ifstack'. This state is used to track the if/elif/else/endif
- nesting and the current state. The state is encoded in the
- top of the stack as a combination of the values `TAKING',
- `TAKEN_BEFORE', `ELSE_SEEN', `PARENT_TAKING'.
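A rough sketch of how such a stack could drive conditional compilation follows. The flag names are the ones listed above; the numeric values, the `IfStackSketch` class and its `If`/`Else`/`EndIf` methods are assumptions made for illustration and are not taken from cs-tokenizer.cs.

```csharp
using System;
using System.Collections.Generic;

static class IfStackSketch {
    // Flag names come from the text; the numeric values are assumptions.
    const int TAKING        = 1;  // the current region is being compiled
    const int TAKEN_BEFORE  = 2;  // an earlier branch of this #if was taken
    const int ELSE_SEEN     = 4;  // #else has already appeared
    const int PARENT_TAKING = 8;  // the enclosing region is being compiled

    static readonly Stack<int> ifstack = new Stack<int> ();

    // True when tokens at the current position should reach the parser.
    public static bool Taking {
        get { return ifstack.Count == 0 || (ifstack.Peek () & TAKING) != 0; }
    }

    public static void If (bool condition)
    {
        int state = Taking ? PARENT_TAKING : 0;
        if (condition && state != 0)
            state |= TAKING | TAKEN_BEFORE;
        ifstack.Push (state);
    }

    public static void Else ()
    {
        int state = ifstack.Pop ();
        if ((state & ELSE_SEEN) != 0)
            throw new InvalidOperationException ("duplicate #else");
        bool take = (state & PARENT_TAKING) != 0 && (state & TAKEN_BEFORE) == 0;
        state &= ~TAKING;
        if (take)
            state |= TAKING | TAKEN_BEFORE;
        ifstack.Push (state | ELSE_SEEN);
    }

    public static void EndIf ()
    {
        ifstack.Pop ();
    }
}
```

Only the top of the stack needs to be inspected to decide whether a token is kept, because `PARENT_TAKING` carries the enclosing region's state down into each nested level.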
-
-** Locations
-
- Locations are encoded as a 32-bit number (the Location
- struct) that maps each input source line to a linear number.
- As new files are parsed, the Location manager is informed of
- the new file, to allow it to map back from an int constant to
- a file + line number.
-
- The tokenizer also tracks the column number for a token, but
- this is currently not being used or encoded. It could
- probably be encoded in the low 9 bits, allowing for columns
- from 1 to 512 to be encoded.
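The column idea suggested above could look roughly like this. The sketch is hypothetical (the text says columns are currently not encoded at all): `LocationSketch` and its methods are invented names, and the layout simply puts the 1-based column, stored as column minus one, in the low 9 bits and the linear line number in the remaining bits.

```csharp
using System;

static class LocationSketch {
    const int ColumnBits = 9;
    const int ColumnMask = (1 << ColumnBits) - 1;   // 0x1FF

    // Packs a linear line number and a 1-based column into one int.
    // Columns past 512 are clamped, since only 9 bits are available.
    public static int Encode (int linearLine, int column)
    {
        int col = Math.Min (Math.Max (column, 1), ColumnMask + 1) - 1;
        return (linearLine << ColumnBits) | col;
    }

    public static int Line (int location)
    {
        return location >> ColumnBits;
    }

    public static int Column (int location)
    {
        return (location & ColumnMask) + 1;
    }
}
```

The trade-off is that every bit given to the column is taken from the range of linear line numbers that a signed 32-bit value can hold.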
-
-* The Parser
-
- The parser is written using Jay, which is a port of Berkeley
- Yacc to Java, that I later ported to C#.
-
- Many people ask why the grammar of the parser does not match
- exactly the definition in the C# specification. The reason is
- simple: the grammar in the C# specification is designed to be
- consumed by humans, and not by a computer program. Before
- you can feed this grammar to a tool, it needs to be simplified
- to allow the tool to generate a correct parser for it.
-
- In the Mono C# compiler, we use a class for each of the
- statements and expressions in the C# language. For example,
- there is a `While' class for the `while' statement, a
- `Cast' class to represent a cast expression and so on.
-
- There is a Statement class, and an Expression class which are
- the base classes for statements and expressions.
-
-** Namespaces
-
- Using list.
-
-* Internal Representation
-
-** Expressions
-
-*** The Expression Class
-
- The utility functions that can be called by all children of
- Expression.
-
-** Constants
-
- Constants in the Mono C# compiler are represented by the
- abstract class `Constant'. Constant is in turn derived from
- Expression. The base constructor for `Constant' just sets the
- expression class to be an `ExprClass.Value'. Constants are
- born in a fully resolved state, so the `DoResolve' method
- only returns a reference to itself.
-
- Each Constant should implement the `GetValue' method which
- returns an object with the actual contents of this constant. A
- utility virtual method called `AsString' is used to render a
- diagnostic message. The output of AsString is shown to the
- developer when an error or a warning is triggered.
-
- Constant classes also participate in the constant folding
- process. Constant folding is invoked by those expressions
- that can be constant folded, which call into the functionality
- provided by the ConstantFold class (cfold.cs).
-
- Each Constant has to implement a number of methods to convert
- itself into a Constant of a different type. These methods are
- called `ConvertToXXXX' and they are invoked by the wrapper
- functions `ToXXXX'. These methods only perform implicit
- numeric conversions. Explicit conversions are handled by the
- `Cast' expression class.
-
- The `ToXXXX' methods are the entry point, and provide error
- reporting in case a conversion can not be performed.
-
-** Constant Folding
-
- The C# language requires constant folding to be implemented.
- Constant folding is hooked up in the Binary.Resolve method.
- If both sides of a binary expression are constants, then the
- ConstantFold.BinaryFold routine is invoked.
-
- This routine implements all the binary operator rules. It
- mirrors the code that generates code for binary operators,
- except that the latter produces code to be evaluated at runtime.
-
- If the constants can be folded, then a new constant expression
- is returned. If not, then the null value is returned (for
- example, the concatenation of a string constant and a numeric
- constant is deferred to the runtime).
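The "fold or return null" contract can be sketched as follows. This is a simplified stand-in, not ConstantFold.BinaryFold itself: `FoldSketch` is a hypothetical class that works on boxed values instead of Constant objects and only knows a handful of operators.

```csharp
using System;

static class FoldSketch {
    // Folds `left op right' when both operands are compile-time values
    // of a supported type; returns null to defer the operation to runtime.
    public static object BinaryFold (char op, object left, object right)
    {
        if (left is int && right is int) {
            int l = (int) left, r = (int) right;
            switch (op) {
            // checked: an overflowing constant expression throws
            // OverflowException here instead of silently wrapping.
            case '+': return checked (l + r);
            case '-': return checked (l - r);
            case '*': return checked (l * r);
            }
            return null;
        }
        if (op == '+' && left is string && right is string)
            return (string) left + (string) right;
        return null;   // e.g. string + int is left for the runtime
    }
}
```

A caller would replace the binary expression with the folded value when the result is non-null, and otherwise fall through to normal code generation.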
-
-** Side effects
-
- a [i++]++
- a [i++] += 5;
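The section gives only the two expressions, so the following is an interpretation rather than the author's explanation: both are cases where the index expression has a side effect and must be evaluated exactly once, even though the element is both read and written. The `SideEffectSketch` class below is a hypothetical illustration of the behaviour that the generated code has to preserve.

```csharp
static class SideEffectSketch {
    public static int[] Run ()
    {
        int[] a = { 10, 20, 30 };
        int i = 0;
        a [i++]++;      // index 0 is computed once: a[0] becomes 11, i becomes 1
        a [i++] += 5;   // index 1 is computed once: a[1] becomes 25, i becomes 2
        return new int[] { a [0], a [1], a [2], i };
    }
}
```

Expanding `a [i++] += 5' naively into `a [i++] = a [i++] + 5' would increment `i' twice and touch two different elements.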
-
-** Statements
-
-* The semantic analysis
-
- Because declarations may follow their uses, the compiler
- driver has to parse all the input files first.
- Once all the input files have been parsed, and an internal
- representation of the input program exists, the following
- steps are taken:
-
- * The interface hierarchy is resolved first.
- As the interface hierarchy is constructed,
- TypeBuilder objects are created for each one of
- them.
-
- * The class and structure hierarchy is resolved next;
- TypeBuilder objects are created for them.
-
- * Constants and enumerations are resolved.
-
- * Method, indexer, property, delegate and event
- definitions are now entered into the TypeBuilders.
-
- * Elements that contain code are now invoked to
- perform semantic analysis and code generation.
-
-* Output Generation
-
-** Code Generation
-
- The EmitContext class is created any time that IL code is to
- be generated (methods, properties, indexers and attributes all
- create EmitContexts).
-
- The EmitContext keeps track of the current namespace and type
- container. This is used during name resolution.
-
- An EmitContext is used by the underlying code generation
- facilities to track the state of code generation:
-
- * The ILGenerator used to generate code for this
- method.
-
- * The TypeContainer where the code lives, this is used
- to access the TypeBuilder.
-
- * The DeclSpace, this is used to resolve names through
- RootContext.LookupType in the various statements and
- expressions.
-
- Code generation state is also tracked here:
-
- * CheckState:
-
- This variable tracks the `checked' state of the
- compilation, it controls whether we should generate
- code that does overflow checking, or if we generate
- code that ignores overflows.
-
- The default setting comes from the command line
- option to generate checked or unchecked code plus
- any source code changes using the checked/unchecked
- statements or expressions. Contrast this with the
- ConstantCheckState flag.
-
- * ConstantCheckState
-
- The constant check state is always set to `true' and
- cannot be changed from the command line. The source
- code can change this setting with the `checked' and
- `unchecked' statements and expressions.
-
- * IsStatic
-
- Whether we are emitting code inside a static or
- instance method
-
- * ReturnType
-
- The value that is allowed to be returned or NULL if
- there is no return type.
-
-
- * ContainerType
-
- Points to the Type (extracted from the
- TypeContainer) that declares this body of code.
-
-
- * IsConstructor
-
- Whether this is generating code for a constructor
-
- * CurrentBlock
-
- Tracks the current block being generated.
-
- * ReturnLabel
-
- The location that a return statement has to jump to in
- order to return the value.
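What the CheckState flag selects can be seen from the language side with a short example. It uses only the standard `checked' and `unchecked' expressions of C#; the `CheckedSketch' class and its method names are hypothetical and have nothing to do with the EmitContext implementation.

```csharp
using System;

static class CheckedSketch {
    // With overflow checking on, x + 1 throws when it does not fit in an int.
    public static bool OverflowsWhenChecked (int x)
    {
        try {
            int y = checked (x + 1);
            return y < x;   // not reached on overflow; false otherwise
        } catch (OverflowException) {
            return true;
        }
    }

    // With overflow checking off, the result silently wraps around.
    public static int WrapsWhenUnchecked (int x)
    {
        return unchecked (x + 1);
    }
}
```

A compiler therefore has to know, at every arithmetic operation it emits, which of the two behaviours is in force at that point of the source.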
-
- A few variables are used to track the state for checking in
- for loops, or in try/catch statements:
-
- * InFinally
-
- Whether we are in a Finally block
-
- * InTry
-
- Whether we are in a Try block
-
- * InCatch
-
- Whether we are in a Catch block
-
- * InUnsafe
- Whether we are inside an unsafe block
-
-* Miscellaneous
-
-** Error Processing.
-
- Errors are reported during the various stages of the
- compilation process. The compiler stops its processing if
- there are errors between the various phases. This simplifies
- the code, because it is always safe to assume that the data
- structures that the compiler is operating on are
- consistent.
-
- The error codes in the Mono C# compiler are the same as those
- found in the Microsoft C# compiler, with a few exceptions
- (where we report a few more errors, those are documented in
- mcs/errors/errors.txt). The goal is to reduce confusion for
- the users, and also to help us track the progress of the
- compiler in terms of the errors we report.
-
- The Report class provides error and warning display functions,
- and also keeps an error count which is used to stop the
- compiler between the phases.
-
- A couple of debugging tools are available here, and are useful
- when extending or fixing bugs in the compiler. If the
- `--fatal' flag is passed to the compiler, the Report.Error
- routine will throw an exception. This can be used to pinpoint
- the location of the bug and examine the variables around the
- error location.
-
- Warnings can be turned into errors by using the `--werror'
- flag to the compiler.
-
- The Report class also ignores warnings that have been
- specified on the command line with the `--nowarn' flag.
-
- Finally, code in the compiler uses the global variable
- RootContext.WarningLevel in a few places to decide whether a
- warning is worth reporting to the user or not.
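The behaviour described in this section can be summarised in a small sketch. It is not the compiler's Report class: `ReportSketch`, its fields and the message format are assumptions, with `Fatal' standing in for the effect of the `--fatal' flag.

```csharp
using System;

static class ReportSketch {
    public static int Errors;     // checked between phases to stop the compiler
    public static bool Fatal;     // stands in for the `--fatal' flag

    public static void Error (int code, string message)
    {
        Errors++;
        string text = String.Format ("error CS{0:0000}: {1}", code, message);
        if (Fatal)
            throw new Exception (text);   // stack trace points at the reporter
        Console.Error.WriteLine (text);
    }
}
```

Throwing from the reporting routine is what makes the flag useful under a debugger: execution stops at the exact place in the compiler that detected the problem.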
-