Improve Reverse PInvoke performance in ProjectN (#1138)

Recently, while experimenting with interop performance micro-benchmarks, I observed that the Reverse PInvoke code path in ProjectN is 8-10x slower than on Desktop CLR. Morgan also noticed the slowdown and extra memory allocation on this code path during the Unity app investigation. This change aims to bring this code path to parity with Desktop CLR.

This change improves Reverse PInvoke performance by:

1. Removing the dictionary in McgModuleManager that maps between thunk address and delegate. Instead, a weak GCHandle for the delegate is stored in the thunk data section.
2. Optimizing open static delegates by storing the static function pointer directly in the thunk data, so that the reverse PInvoke code path can do a CallI directly on the function pointer. MCG was modified to generate a special function to handle open static delegates.
3. Storing the jump stub's target address when storing the function pointer. A runtime helper was added to get the jump stub target.
4. Reordering some instructions in the RhpReversePInvoke and InteropNative_CommonStub functions so that the hot path gets better instruction cache use.

Results:

X86: 33% slower than Desktop CLR (75 vs 92 instructions); 5.5x faster than latest ProjectN
AMD64: 9% faster than Desktop CLR (54 vs 67 instructions); 7.5x faster than latest ProjectN

[tfs-changeset: 1596255]
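For readers less familiar with x86 assembly, the control-flow change in item 4 can be modeled roughly in C. This is only a sketch under stated assumptions: thread_model, rpi_result, reverse_pinvoke_enter, check_bad_transition and the flag value are hypothetical names invented for illustration, not the runtime's actual types, and clearing the frame pointer on the hot path is an assumption about lines the diff below does not show. The point is that the common case (a non-null transition frame) now falls straight through, while the rare "bad transition" check lives out of line.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the runtime's Thread fields; not the real layout. */
#define TSF_DO_NOT_TRIGGER_GC 0x10u /* illustrative value only */

typedef struct {
    void    *transition_frame; /* models m_pTransitionFrame */
    unsigned state_flags;      /* models m_ThreadStateFlags */
} thread_model;

typedef enum {
    ENTERED_MANAGED,       /* normal reverse PInvoke entry */
    ENTERED_NO_GC_CALLOUT, /* allowed 'bad transition' in DoNotTriggerGc mode */
    BAD_TRANSITION         /* models the jump to BadTransition */
} rpi_result;

/* Cold path, kept out of line (models the CheckBadTransition label). */
static rpi_result check_bad_transition(const thread_model *t, void **saved_frame)
{
    if ((t->state_flags & TSF_DO_NOT_TRIGGER_GC) == 0)
        return BAD_TRANSITION;
    *saved_frame = NULL; /* zero out the 'previous transition frame' save slot */
    return ENTERED_NO_GC_CALLOUT;
}

/* Hot path: the common case takes no branch to a distant label. */
static rpi_result reverse_pinvoke_enter(thread_model *t, void **saved_frame)
{
    if (t->transition_frame == NULL)
        return check_bad_transition(t, saved_frame);

    /* Save the previous frame before the mode transition. */
    *saved_frame = t->transition_frame;
    t->transition_frame = NULL; /* assumed: models the switch to cooperative mode */
    return ENTERED_MANAGED;
}
```

Before the change, the equivalent of check_bad_transition sat inline between the comparison and the hot-path code, so the common case had to jump over it.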
author: dotnet bot <dotnet-bot@microsoft.com> 2016-04-14 23:59:59 +0300
committer: Michal Strehovský <MichalStrehovsky@users.noreply.github.com> 2016-04-14 23:59:59 +0300
commit: 9ed921ffded2f76e55e9a24fd8b8c9d3d2570c43 (patch)
tree: 62a773a7c336793f6b17f4be98c1ebbbd99d82f5 /src/Native/Runtime/i386
parent: 9804f42dce998498b693915ff1b2051478dadc97 (diff)
1 files changed, 16 insertions, 14 deletions
diff --git a/src/Native/Runtime/i386/PInvoke.asm b/src/Native/Runtime/i386/PInvoke.asm
index 219387ad1..2e0c0ca9d 100644
--- a/src/Native/Runtime/i386/PInvoke.asm
+++ b/src/Native/Runtime/i386/PInvoke.asm
@@ -135,21 +135,8 @@ ThreadAttached:
         ;;     2) Performing a managed delegate invoke on a reverse pinvoke delegate.
         ;;
         cmp         dword ptr [edx + OFFSETOF__Thread__m_pTransitionFrame], 0
-        jne         ValidTransition
+        je          CheckBadTransition
 
-        ;; Allow 'bad transitions' in when the TSF_DoNotTriggerGc mode is set.  This allows us to have 
-        ;; [NativeCallable] methods that are called via the "restricted GC callouts" as well as from native,
-        ;; which is necessary because the methods are CCW vtable methods on interfaces passed to native.
-        test        dword ptr [edx + OFFSETOF__Thread__m_ThreadStateFlags], TSF_DoNotTriggerGc
-        jz          BadTransition
-
-        ;; zero-out our 'previous transition frame' save slot
-        mov         dword ptr [eax], 0
-
-        ;; nothing more to do
-        jmp         AllDone
-
-ValidTransition:
         ; Save previous TransitionFrame prior to making the mode transition so that it is always valid 
         ; whenever we might attempt to hijack this thread.
         mov         ecx, [edx + OFFSETOF__Thread__m_pTransitionFrame]
@@ -164,6 +151,21 @@ AllDone:
         pop         edx         ; restore arg reg
         pop         ecx         ; restore arg reg
         ret
+        
+CheckBadTransition:
+        ;; Allow 'bad transitions' in when the TSF_DoNotTriggerGc mode is set.  This allows us to have 
+        ;; [NativeCallable] methods that are called via the "restricted GC callouts" as well as from native,
+        ;; which is necessary because the methods are CCW vtable methods on interfaces passed to native.
+        test        dword ptr [edx + OFFSETOF__Thread__m_ThreadStateFlags], TSF_DoNotTriggerGc
+        jz          BadTransition
+
+        ;; zero-out our 'previous transition frame' save slot
+        mov         dword ptr [eax], 0
+
+        ;; nothing more to do
+        jmp         AllDone
+
+
 
 AttachThread:
         ;;