diff options
author | Massimiliano Mantione <massi@mono-cvs.ximian.com> | 2006-03-31 10:59:23 +0400 |
---|---|---|
committer | Massimiliano Mantione <massi@mono-cvs.ximian.com> | 2006-03-31 10:59:23 +0400 |
commit | b37add239c289a64f1d54cb0f367d2e081c0984f (patch) | |
tree | d7659abcba0bfebe93f8839026d42f35d4825467 /docs/tree-mover.txt | |
parent | 693aaf3bcc01559d2f2f33fa8c67ed8d96634c90 (diff) |
2006-03-31 Massimiliano Mantione <massi@ximian.com>
* docs/tree-mover.txt: Added tree mover documentation.
svn path=/trunk/mono/; revision=58830
Diffstat (limited to 'docs/tree-mover.txt')
-rw-r--r-- | docs/tree-mover.txt | 261 |
1 files changed, 261 insertions, 0 deletions
diff --git a/docs/tree-mover.txt b/docs/tree-mover.txt new file mode 100644 index 00000000000..3ee836a5b2f --- /dev/null +++ b/docs/tree-mover.txt @@ -0,0 +1,261 @@ + +Purpose + +Especially when inlining is active, it can happen that temporary +variables add pressure to the register allocator, producing bad +code. + +The idea is that some of these temporaries can be totally eliminated +my moving the MonoInst tree that defines them directly to the use +point in the code (so the name "tree mover"). + +Please note that this is *not* an optimization: it is mostly a +workaround to issues we have in the regalloc. +Actually, with the new linear IR this will not be possible at all +(there will be no more trees in the code!). +Anyway, this workaround turns out to be useful in the current state +of things... + +----------------------------------------------------------------------- + +Base logic + +If a local is defined by a value which is a proper expression (a tree +of MonoInst, not just another local or a constant), and this definition +is used only once, the tree can be moved directly to the use location, +and the definition eliminated. +Of course, none of the variables used in the tree must be defined in +the code path between the definition and the use, and the tree must be +free of side effects. +We do not handle the cases when the tree is just a local or a constant +because they are handled by copyprop and consprop, respectively. + +To make things simpler, we restrict the tree move to the case when: +- the definition and the use are in the same BB, and +- the use is followed by another definition in the same BB (it is not + possible that the 1st value is used again), or alternatively there + is no BB in the whole CFG that contains a use of this local before a + definition (so, again, there is no code path that can lead to a + subsequent use). + +To handle this, we maintain an ACT array (Available Copy Tree, similar +to the ACP), where we store the "state" of every local. +Ideally, every local can be in the following state: +[E] Undefined (by a tree, it could be in the ACP but we don't care). +[D] Defined (by a tree), and waiting for a use. +[U] Used, with a tree definition available in the same BB, but still + without a definition following the use (always in the same BB). +Of course state [E] (empty) is the initial one. + +Besides, there are two sort of "meta states", or flags: +[W] Still waiting for a use or definition in this BB (we have seen no + occurrence of the local yet). +[X] Used without being previously defined in the same BB (note that if + there is a definition that precedes the use in the same BB, even if + the definition is not a tree or is not available because of side + effects or because the tree value has changed the local is not in + state [X]). +Also note that state [X] is a sort of "global" condition, which if set +in one BB will stay valid for the whole CFG, even if the local will +otherwise change state. The idea of flagging a local as [X] is that if +there is a definition/use pair that reaches the end of a BB, it could +be that there is a CFG path that then leads to the BB flagging it as +[X] (which contains a use), so the tree cannot be moved. +So state [X] will always be set, and never examined in all the state +transitions we will describe. +In practice, we use flag [W] to set state [X]: if, when traversing a +BB, we find a use for a local in state [W], then that local is flagged +[X]. + + +For each BB, we initialize all states to [E] and [W], and then we +traverse the code one inst at a time, and update the variable states +in the ACT in the following ways: + +[Definition] + - Flag [W] is cleared. + - All "affected trees" are killed (go from state [D] to [E]). + The "affected trees" are the trees which contain (use) the defined + local, and the rationale is that the tree value changed, so the + tree is no longer available. + - If the local was in state [U], *that* tree move is marked "safe" + (because *this* definition makes us sure that the previous tree + cannot be used again in any way). + The idea is that "safe" moves can happen even if the local is + flagged [X], because the second definition "covers" the use. + The tree move is then saved in the "todo" list (and the affecting + nodes are cleared). + - If the local was defined by a tree, it goes to state [D], the tree + is recorded, and all the locals used in it are marked as "affecting + this tree" (of course these markers are lists, because each local + could affect more than one tree). + +[IndirectDefinition] + - All potentially affected trees (in state [D]) are killed. + +[Use] + - If the local is still [W], it is flagged [X] (the [W] goes away). + - If the local is in state [D], it goes to state [U]. + The tree move must not yet be recorded in the "todo" list, it still + stays in the ACT slot belonging to this local. + Anyway, the "affecting" nodes are updated, because now a definition + of a local used in this tree will affect only "indirect" (or also + "propagated") moves, but not *this* move (see below). + - If the local is in state [U], then the tree cannot be moved (it is + used two times): the move is canceled, and the state goes [E]. + - If the local is in state [E], the use is ignored. + +[IndirectUse] + - All potentially affected trees (in state [D] or [U]) are killed. + +[SideEffect] + - Tree is marked as "unmovable". + +Then, at the end of the BB, for each ACT slot: + - If state is [U], the tree move is recorded in the "todo" list, but + flagged "unsafe". + - Anyway, state goes to [E], the [W] flag is set, and all "affecting" + lists are cleared (we get ready to traverse the next BB). +Finally, when all BBs has been scanned, we traverse the "todo" list, +moving all "safe" entries, and moving "unsafe" ones only if their ACT +slot is not flagged [X]. + +So far, so good. +But there are two issues that make things harder :-( + +The first is the concept of "indirect tree move". +It can happen that a tree is scheduled for moving, and its destination +is a use that is located in a second tree, which could also be moved. +The main issue is that a definition of a variable of the 1st tree on +the path between the definition and the use of the 2nd one must prevent +the move. +But which move? The 1st or the 2nd? +Well, any of the two! +The point is, the 2nd move must be prevented *only* if the 1st one +happens: if it is aborted (for an [X] flag or any other reason), the +2nd move is OK, and vice versa... +We must handle this in the following way: +- The ACT must still remember if a slot is scheduled for moving in + this BB, and if it is, all the locals used in the tree. + We say that the slot is in state [M]. + Note that [M] is (like [X] and [W]) a sort of "meta state": a local + is flagged [M] when it goes to state [U], and the flag is cleared + when the tree move is cancelled +- A tree that uses a local whose slot is in state [M] is also using all + the locals used by the tree in state [M], but the use is "indirect". + These use nodes are also included in the "affecting" lists. +- The definition of a variable used in an "indirect" way has the + effect of "linking" the two involved tree moves, saying that only one + of the two can happen in practice, but not both. +- When the 2nd tree is scheduled for moving, the 1st one is *still* in + state [M], because a third move could "carry it forward", and all the + *three* moves should be mutually exclusive (to be safe!). + +The second tricky complication is the "tree forwarding" that can happen +when copyprop is involved. +It is conceptually similar to the "indirect tree move". +Only, the 2nd tree is not really a tree, it is just the local defined +in the 1st tree move. +It can happen that copyprop will propagate the definition. +We cannot make treeprop do the same job of copyprop, because copyprop +has less constraints, and is therefore more powerful in its scope. +The main issue is that treeprop cannot propagate a tree to *two* uses, +while copyprop is perfectly capable of propagating one definition to +two (or more) different places. +So we must let copyprop do its job otherwise we'll miss optimizations, +but we must also make it play safe with treeprop. +Let's clarify with an example: + a = v1 + v2; //a is defined by a tree, state [D], uses v2 and v2 + b = a; //a is used, state [U] with move scheduled, and + //b is defined by a, ACP[b] is a, and b is in state [DC] + c = b + v3; // b is used, goes to state [U] +The real trouble is that copyprop happens *immediately*, while treeprop +is deferred to the end of the CFG traversal. +So, in the 3rd statement, the "b" is immediately turned into an "a" by +copyprop, regardless of what treeprop will do. +Anyway, if we are careful, this is not so bad. +First of all, we must "accept" the fact that in the 3rd statement the +"b" is in fact an "a", as treeprop must happen *after* copyprop. +The real problem is that "a" is used twice: in the 2nd and 3rd lines. +In our usual setup, the 2nd line would set it to [U], and the 3rd line +would kill the move (and set "a" to [E]). +I have tried to play tricks, and reason as of copyprop didn't happen, +but everything becomes really messy. +Instead, we should note that the 2nd line is very likely to be dead. +At least in this BB, copyprop will turn all "b"s into "a"s as long as +it can, and when it cannot, it will be because either "a" or "b" have +been redefined, which would be after the tree move anyway. +So, the reasoning gets different: let's pretend that "b" will be dead. +This will make the "a" use in the 2nd statement useless, so there we +can "reset" "a" to [D], but also take note that if "b" will end up +not being dead, the tree move associated to this [D] must be aborted. +We can detect this in the following way: +- Either "b" is used before being defined in this BB, or +- It will be flagged "unsafe". +Both things are very easy to check. +The only quirk is that the "affecting" lists must not be cleared when +a slot goes to state [U], because a "propagation" could put it back +to state [D] (where those lists are needed, because it can be killed +by a definition to a used slot). + +----------------------------------------------------------------------- + +Implementation notes + +All the implementation runs inside the existing mono_local_cprop +function, and a separate memory pool is used to hold the temporary +data. + +A struct, MonoTreeMover, contains the pointers to the pool, the ACT, +the list of scheduled moves and auxiliary things. +This struct is allocated if the tree move pass is requested, and is +then passed along to all the involved functions, which are therefore +aware of the tree mover state. + +The ACT is an array of slots, obviously one per local. +Each slot is of type MonoTreeMoverActSlot, and contains the used and +affected locals, a pointer to the pending tree move and the "waiting" +and "unsafe" flags. + +The "affecting" lists a built from "dependency nodes", of type +MonoTreeMoverDependencyNode. +Each of the nodes contains the used and affected local, and is in +two lists: the locals used by a slot, and the locals affected by a +slot (obviously a different one). +So, each node means: "variable x is used in tree t, so a definition +of x affects tree t". +The "affecting" lists are doubly linked, to allow for O(1) deletion. +The "used" lists are simply linked, but when they are mantained there +is always a pointer to the last element to allow for O(1) list moving. +When a used list is dismissed (which happens often, any time a node is +killed), its nodes are unlinked from their respective affecting lists +and are then put in a "free" list in the MonoTreeMover to be reused. + +Each tree move is represented by a struct (MonoTreeMoverTreeMove), +which contains: +- the definition and use points, +- the "affected" moves (recall the concept of "indirect tree move"), +- the "must be dead" slots (recall "tree forwarding"). and +- a few utility flags. +The tree moves stays in the relevant ACT slot until it is ready to be +scheduled for moving, at which point it is put in a list in the +MonoTreeMover. +The tree moves structs are reused when they are killed, so there is +also a "free" list for them in the MonoTreeMover. + +The tree mover code has been added to all the relevant functions that +participate in consprop and copyprop, particularly: +- mono_cprop_copy_values takes care of variable uses (transitions from + states [D] to [U] and [U] to [E] because of killing), +- mono_cprop_invalidate_values takes care of side effects (indirect + accesses, calls...), +- mono_local_cprop_bb sets up and cleans the traversals for each BB, + and for each MonoInst it takes care of variable definitions. +To each of them has been added a MonoTreeMover parameter, which is not +NULL if the tree mover is running. +After mono_local_cprop_bb has run for all BBs, the MonoTreeMover has +the list of all the pending moves, which must be walked to actually +perform the moves (when possible, because "unsafe" flags, "affected" +moves and "must be dead" slots can still have their effects, which +must be handled now because they are fully known only at the end of +the CFG traversal). |