(Equality)Comparer<T>.Default (#50446)
Co-authored-by: Andy Ayers <andya@microsoft.com>
|
|
* Poison address exposed user variables in debug code
Fix #13072
* Run jit-format
* Use named scratch register and kill it in LSRA
* Enable it unconditionally for testing purposes
* Remove unnecessary modified reg on ARM
* Fix OSR and get rid of test code
* Remove a declaration
* Undo modified comment and use modulo instead of and
* Add a test
* Rephrase comment
Co-authored-by: Kunal Pathak <Kunal.Pathak@microsoft.com>
* Disable poisoning test on mono
* Remove outdated line
Co-authored-by: Kunal Pathak <Kunal.Pathak@microsoft.com>
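Conceptually, the poisoning writes a recognizable byte pattern into each address-exposed local in the method prolog, so that reads of uninitialized memory are easier to spot in debug code. A minimal userland sketch, assuming a hypothetical 0xCD pattern byte (the actual pattern and the emitted stores are internal to the JIT):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Fill a local's stack slot with a recognizable poison pattern.
// The JIT emits equivalent prolog stores for address-exposed locals
// when poisoning is enabled; 0xCD is a hypothetical pattern byte here.
void PoisonSlot(void* slot, std::size_t size)
{
    std::memset(slot, 0xCD, size);
}
```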
|
|
Add more iterators compatible with range-based `for` syntax for various data structures. These iterators all assume (and some check) that the underlying data structures determining the iteration are not changed during the iteration. For example, don't use these to iterate over the predecessor edges if you are changing the order or contents of the predecessor edge list.
- BasicBlock: iterate over all blocks in the function, a subset starting not at the first block, or a specified range of blocks. Removed uses of the `foreach_block` macro. E.g.:
```
for (BasicBlock* const block : Blocks()) // all blocks in function
for (BasicBlock* const block : BasicBlockSimpleList(fgFirstBB->bbNext)) // all blocks starting at fgFirstBB->bbNext
for (BasicBlock* const testBlock : BasicBlockRangeList(firstNonLoopBlock, lastNonLoopBlock)) // all blocks in range (inclusive)
```
- block predecessors: iterate over all predecessor edges, or all predecessor blocks, e.g.:
```
for (flowList* const edge : block->PredEdges())
for (BasicBlock* const predBlock : block->PredBlocks())
```
- block successors: iterate over all block successors using the `NumSucc()`/`GetSucc()` or `NumSucc(Compiler*)`/`GetSucc(Compiler*)` pairs, e.g.:
```
for (BasicBlock* const succ : Succs())
for (BasicBlock* const succ : Succs(compiler))
```
Note that the existing `AllSuccessorsIter` iterates over block successors, including possible EH successors, e.g.:
```
for (BasicBlock* succ : block->GetAllSuccs(m_pCompiler))
```
- switch targets (namely, the successors of `BBJ_SWITCH` blocks), e.g.:
```
for (BasicBlock* const bTarget : block->SwitchTargets())
```
- loop blocks: iterate over all the blocks in a loop, e.g.:
```
for (BasicBlock* const blk : optLoopTable[loopInd].LoopBlocks())
```
- Statements: added an iterator shortcut for the non-phi statements, e.g.:
```
for (Statement* const stmt : block->NonPhiStatements())
```
Note that there already exists an iterator over all statements, e.g.:
```
for (Statement* const stmt : block->Statements())
```
- EH clauses, e.g.:
```
for (EHblkDsc* const HBtab : EHClauses(this))
```
- GenTree in linear order (but not LIR, which already has an iterator), namely, using the `gtNext` links, e.g.:
```
for (GenTree* const call : stmt->TreeList())
```
This is a no-diff change.
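The wrappers listed above all follow the standard C++ range pattern: a small range type whose `begin()`/`end()` return a forward iterator over the next-links. A self-contained sketch with hypothetical `Node`/`NodeList` names (not the JIT's own types):

```cpp
#include <cassert>

// Stand-in for BasicBlock with its bbNext link (illustrative names).
struct Node
{
    int   value;
    Node* next;
};

// Forward iterator over the next-links. Like the JIT's iterators, it
// assumes the list is not modified during the iteration.
class NodeIterator
{
    Node* m_node;

public:
    explicit NodeIterator(Node* node) : m_node(node) {}
    Node* operator*() const { return m_node; }
    NodeIterator& operator++() { m_node = m_node->next; return *this; }
    bool operator!=(const NodeIterator& other) const { return m_node != other.m_node; }
};

// Range type usable with range-based `for`, analogous to BasicBlockSimpleList.
class NodeList
{
    Node* m_first;

public:
    explicit NodeList(Node* first) : m_first(first) {}
    NodeIterator begin() const { return NodeIterator(m_first); }
    NodeIterator end() const { return NodeIterator(nullptr); }
};

int SumAll(Node* first)
{
    int sum = 0;
    for (Node* const node : NodeList(first))
    {
        sum += node->value;
    }
    return sum;
}
```

As with the JIT's own iterators, nothing here guards against the list being mutated mid-iteration; that remains the caller's responsibility.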
|
|
* Introduce enum for BasicBlock and loop flags
This gives a better debugging experience in Visual Studio. It
also improves type checking: there were a few places still using
`unsigned` instead of `unsigned __int64` when manipulating
BasicBlock flags.
* Make sure debugreturn and contracts are disabled for the JIT build
* Convert GenTree flags, debug flags, and call flags to enums
* Remove bad GT_HWINTRINSIC copy/paste code from GenTree::GetRegSpillFlagByIdx
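The typed-flags pattern this converts to can be sketched as below; the enum and flag names are illustrative, not the JIT's actual definitions. The point of the explicit 64-bit underlying type is that a flag above bit 31 cannot be silently truncated the way it could in code that used a plain `unsigned`:

```cpp
#include <cassert>
#include <cstdint>

// Typed 64-bit flag enum in the style of the converted BasicBlock flags.
// A debugger can show the symbolic flag names, and the enum type catches
// accidental mixing with untyped integers.
enum BlockFlags : uint64_t
{
    BF_EMPTY     = 0,
    BF_VISITED   = 1ULL << 0,
    BF_LOOP_HEAD = 1ULL << 33, // would be lost in a 32-bit `unsigned`
};

inline BlockFlags operator|(BlockFlags a, BlockFlags b)
{
    return static_cast<BlockFlags>(static_cast<uint64_t>(a) | static_cast<uint64_t>(b));
}

inline BlockFlags operator&(BlockFlags a, BlockFlags b)
{
    return static_cast<BlockFlags>(static_cast<uint64_t>(a) & static_cast<uint64_t>(b));
}

bool HasFlag(BlockFlags flags, BlockFlags flag)
{
    return (flags & flag) != BF_EMPTY;
}
```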
|
|
Instead, add and modify the appropriate preds when the mechanical
cloning is performed. This will preserve existing profile data
on the edges.
Contributes to #49030
No x86/x64 SPMI asm diffs.
|
|
* Simplify JIT label handling
Remove the BBF_JMP_TARGET flag that was set early and attempted
to be maintained all through compilation. Instead, use
BBF_USE_LABEL to indicate to the emitter where we need labels.
Also, stop setting and maintaining BBF_USE_LABEL early.
Add a pass over the blocks when preparing for codegen that sets
most of the necessary BBF_USE_LABEL flags. This flag will never
be set before codegen. A few places set the flag after codegen starts,
namely `genCreateTempLabel` and BBJ_COND handling for alignment.
Note that this flag must be set before the block is processed for
codegen (and an insGroup is created).
Together, these changes make it easier to work with the flow graph
without worrying about maintaining these bits of information through
various optimizations.
Add a few more details about alignment processing to the dump.
There are a few code asm diffs due to alignment processing not previously
creating a label to help compute how large a loop is.
There are a lot of textual asm diffs due to there being (mostly)
fewer labels, plus some additional insGroup output. This can happen if
a block was labeled with `BBF_JMP_TARGET` or `BBF_USE_LABEL` before,
but didn't need to be, perhaps after some optimizations. Now, the flag is
never added in the first place.
There are a large number of GC info diffs. Labels are where GC info state
changes are recorded between codegen and the emitter. If we eliminate an
unnecessary emitter label, then we also eliminate a capture of the current
codegen GC state. Since the emitter is lazy at marking GC deaths, this
means that we see a lot of lengthened GC lifetimes -- until the next
label, or some other cause of GC kill. Often, you see a register kill
followed by register birth just disappear, and the register is maintained
alive across the interim.
* Remove loop align flag if we decide a loop is no longer a loop
|
|
* Delete `JitDoOldStructRetyping`.
* delete unnecessary spilling in `fgUpdateInlineReturnExpressionPlaceHolder`.
|
|
Move to fgopt, and rework to make its operation a bit more obvious.
|
|
|
|
Stop trying to update the common return block profile data during return
merging, as it is not yet known which return blocks will become tail calls.
Start updating constant return block profile data during return merging
as this is when we transform the flow.
Update the common return block profile data later, in morph (adding more
counts) and when creating tail calls (removing counts).
Update profile consistency checker to handle switches properly and to use
tolerant compares.
Add extra dumping when solving for edge weights or adjusting flow edge
weights to help track down where errors are coming from.
Add new FMT_WT formatting string for profile weights, and start using it
in fgprofile. Use %g so we don't see huge digit strings.
Handle constant return merges too.
Also fix dump output from `setEdgeWeights` and pass in the destination
of the edge.
Refactor `setBBProfileWeight` to also handle setting/clearing rarely run.
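Two of the ideas above can be sketched in isolation: a FMT_WT-style `%g` formatting macro, and a tolerant compare for profile weights (the 1% relative tolerance here is an illustrative choice, not the checker's actual threshold):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <cstring>

// FMT_WT-style formatting macro: %g keeps very large or very small
// weights readable instead of printing huge digit strings.
#define FMT_WT "%g"

int FormatWeight(char* buf, std::size_t size, double weight)
{
    return std::snprintf(buf, size, FMT_WT, weight);
}

// Tolerant compare for profile weights, in the spirit of the consistency
// checker change; the relative tolerance is an illustrative assumption.
bool WeightsApproxEqual(double w1, double w2, double relTol = 0.01)
{
    double diff  = std::fabs(w1 - w2);
    double scale = std::fmax(std::fabs(w1), std::fabs(w2));
    return diff <= relTol * scale;
}
```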
|
|
|
|
Create a number of smaller files with cohesive sets of methods.
|
|
* Only pass the method context when tracking transitions.
* Add another separate helper for exit. Use a jit flag to signify when to track transitions
* Apply suggestions from code review
Co-authored-by: Jan Kotas <jkotas@microsoft.com>
* Wire up R2R helpers.
* Revert "Wire up R2R helpers."
This reverts commit 80a4749232312a89752518bcb8741a2b8e0129cc.
* Update CorInfoHelpFunc to have the new entries. Don't handle the new helpers since they won't be used.
* Simplify if check since CORProfilerTrackTransitions is process-wide.
* Remove unneeded assignment.
* Fix formatting.
Co-authored-by: Jan Kotas <jkotas@microsoft.com>
|
|
* Implement emitting an unmanaged calling convention entry point with the correct argument order and register usage on x86.
* Move Unix x86 to the UnmanagedCallersOnly plan now that we don't need to do argument shuffling.
* Add SEH hookup and profiler/debugger hooks to Reverse P/Invoke entry helper to match custom x86 thunk.
Fixes #46177
* Remove Windows x86 assembly stub for individual reverse p/invokes. Move Windows x86 unmanaged callers only to not have extra overhead and put reverse P/Invoke stubs for Windows x86 on the UnmanagedCallersOnly plan.
* Further cleanup
* Remove extraneous UnmanagedCallersOnly block now that x86 UnmanagedCallersOnly has been simplified.
* Undo ArgOrder size specifier since it isn't needed and it doesn't work.
* Fix copy constructor reverse marshalling. Now that we don't have the emitted unmanaged thunk stub, we need to handle the x86 differences for copy-constructed parameters in the IL stub.
* Fix version guid syntax.
* Remove FastNExportHandler.
* Revert "Remove FastNExportHandler."
This reverts commit 423f70ee4d564147dc0ce370d38b3a38404f8e22.
* Fix setting up entry frame for new thread.
* Allow the NExportSEH record to live below ESP so we don't need to create a new stack frame.
* Fix formatting.
* Assign an offset for the return buffer on x86 since it might come in on the stack.
* Make sure we use the TC block we just put in on x86 as well.
* Shrink the ReversePInvokeFrame on non-x86 back to master's size.
* Fix arch-specific R2R constant.
* Pass the return address of the ReversePInvokeEnter helper to TraceCall instead of the entry point and call TraceCall from all JIT_ReversePInvokeEnter* helpers.
* Fix ILVerification and ILVerify
* fix R2R constants for crossgen1
* Don't assert ReversePInvokeFrame size for cross-bitness scenarios.
|
|
* Fix uses of uninitialized data
Fixes #46961
* Fix typo
* Formatting
|
|
Phase 1 of replacing existing infrastructure around handling of pgo data with more flexible schema based approach.
The schema based approach allows the JIT to define the form of data needed for instrumentation.
- The schema associates four 32-bit integers with each data collection point (ILOffset, InstrumentationKind, Count, and Other)
- Rich meaning is attached to InstrumentationKind, and Count
- InstrumentationKind defines the size and layout of individual instrumentation data items
- Count allows a single schema item to be repeated
- ILOffset and Other are not processed in any specific way by the infrastructure
Changes part of this phase
- PgoManager holds an arbitrary amount of pgo data instead of a slab
- Aware of collectible assemblies
- Matching with pgo data utilizes a hash of the IL body in addition to IL size information, for greater accuracy
- JIT no longer uses block count apis, and instead uses schema based apis
- JIT now explicitly defines the shape of data collected for both basic block and type probes
- The rest of the system handles that without deep knowledge of what those formats are
- Text file format for pgo data updated
- Existing IBC infrastructure adjusted to speak in terms of schema concept
- Uncompressed and binary encoded implementation of Pgo schema handling
- Update SuperPMI to handle new apis
Future Changes for static Pgo
- Move Pgo type handle histogram processing into JIT
- Extract Pgo data from process using Event infrastructure
- Triggers for controlling Pgo data extraction
- Instrumented Pgo processing as part of dotnet-pgo tool
- Pgo data flow in crossgen2
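A minimal sketch of one schema record under the four-integer layout described above; the kind values and the size rule are illustrative assumptions, not the real encoding:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical instrumentation kinds; the real enumeration defines the
// size and layout of each instrumentation data item.
enum class InstrumentationKind : int32_t
{
    BasicBlockIntCount = 1, // 4-byte block hit counter (assumed size)
    TypeHandle         = 2, // pointer-sized type probe (assumed size)
};

// One schema record: four 32-bit integers per data collection point.
struct PgoSchemaElem
{
    int32_t             ILOffset; // not interpreted by the infrastructure
    InstrumentationKind Kind;     // defines size/layout of the data item
    int32_t             Count;    // lets a single schema item repeat
    int32_t             Other;    // not interpreted by the infrastructure
};

// Size in bytes of the instrumentation data one schema element describes
// (illustrative only: the real kinds encode their item sizes differently).
std::size_t DataSize(const PgoSchemaElem& elem, std::size_t pointerSize)
{
    std::size_t itemSize = (elem.Kind == InstrumentationKind::TypeHandle) ? pointerSize : 4;
    return itemSize * static_cast<std::size_t>(elem.Count);
}
```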
|
|
* Detect inner loop and add 10 bytes of padding at the beginning
* generate nop in previous blocks
* TODO: figure out if anything needs to be done in optCanonicalizeLoop
* Add COMPlus_JitAlignLoopMinBlockWeight and COMPlus_JitAlignLoopMaxCodeSize
- Add 2 variables to control which loops get aligned
- Moved padding after the conditional/unconditional jump of previous block
* Reuse AlignLoops flag for dynamic loop alignment
* Detect back edge and count no. of instructions before doing loop alignment
* fix bugs
* propagate the basic block flag
* Switch from instrCount to codeSize
* JitAlignLoopWith32BPadding
* Add emitLoopAlign32Bytes()
* wip
* Add logic to avoid emitting nop if not needed
* fix a condition
* Several things:
- Replaced JitAlignLoopWith32BPadding with JitAlignLoopBoundary
- Added JitAlignLoopForJcc
- Added logging of boundary and point where instruction splitting happens
- Add logic to take into consideration JCC.
* Added JitAlignLoopAdaptive algorithm
* wip
* revert emitarm64.cpp changes
* fix errors during merge
* fix build errors
* refactoring and cleanup
* refactoring and build errors fix
* jit format
* one more build error
* Add emitLoopAlignAdjustments()
* Update emitLoopAlignAdjustments to just include loopSize calc
* Remove #ifdef ADAPTIVE_LOOP_ALIGNMENT
* Code cleanup
* minor fixes
* Fix issues:
- Make sure all `align` instructions for non-adaptive fall under same IG
- Convert some variables to `unsigned short`
- Fixed the maxPadding amount for adaptive alignment calculation
* Other fixes
* Remove align_loops flag from coreclr
* Review feedback
- Do not align loop if it has call
- Created `emitSetLoopBackEdge()` to isolate `emitCurIG` inside emitter class
- Created `emitOutputAlign()` to move the align instruction output logic
- Renamed emitVariableLoopAlign() to emitLongLoopAlign()
- Created `optIdentifyLoopsForAlignment()` to identify loops that need alignment
- Added comments at various places
* jit format
* Add FEATURE_LOOP_ALIGN
* remove special case for align
* Do not propagate BBF_LOOP_ALIGN in certain cases
* Introduce instrDescAlign and emitLastAlignedIgNum
* Several changes:
- Perform an accurate padding-size calculation before outputting the align instruction
- During output, double-check that the padding needed matches what was calculated
- If instruction sizes are over-estimated anywhere before the last align instruction,
  compensate by adding NOPs
- As part of the above, do not perform the "VEX prefix shortening" encoding if an align
  instruction comes later
- Fix edge cases where, because of loop cloning or the resolution phase of the register
  allocator, loops are marked such that they cover loops already marked for alignment.
  Fix by resetting their IGF_LOOP_ALIGN flag.
- During loop size calculation, if the last IG also has the `align` flag, do not count
  the align instruction's size, because it is reserved for the next loop
* jit format
* fix issue related to needLabel
* align memory correctly in superpmi
* A few more fixes:
- emitOffsAdj accounts for any mis-prediction of a jump. If we compensate for that mis-prediction, back off that adjustment.
- Record the lastAlignIG only for valid non-zero align instructions
* minor JITDUMP messages
* Review comments
* missing check
* Mark the last align IG the one that has non-zero padding
* More review comments
* Propagate BBF_LOOP_ALIGN for compacting blocks
* Handle ALIGN_LOOP flag for loops that are unrolled
* jit format
* Loop size up to last back-edge instead of first back-edge
* Take loop weight in consideration
* remove align flag if loop is no longer valid
* Adjust loop block weight to 4 instead of 8
* missing space after rebase
* fix the enum values after rebase
* review feedback
* Add missing #ifdef DEBUG
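The core padding arithmetic behind the work above can be sketched as follows. This is a simplification of the adaptive heuristics, assuming a power-of-two boundary such as the 32-byte default mentioned in the commit; the real decision also weighs code size, loop weight, and calls:

```cpp
#include <cassert>
#include <cstddef>

// Bytes of padding needed so `offset` lands on an alignment boundary
// (boundary must be a power of two, e.g. the 32-byte default).
std::size_t PaddingFor(std::size_t offset, std::size_t boundary)
{
    return (boundary - (offset % boundary)) % boundary;
}

// Pad only when alignment reduces the number of boundary-sized chunks
// the loop body spans -- a simplified version of the "avoid emitting
// nop if not needed" logic; the real adaptive heuristics do more.
std::size_t LoopAlignPadding(std::size_t offset, std::size_t loopSize, std::size_t boundary)
{
    std::size_t chunksUnaligned = (offset % boundary + loopSize + boundary - 1) / boundary;
    std::size_t chunksAligned   = (loopSize + boundary - 1) / boundary;
    return (chunksAligned < chunksUnaligned) ? PaddingFor(offset, boundary) : 0;
}
```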
|
|
Follow up to #45615.
We compute edge weights twice. Make sure we either enable or suppress
both the computations.
|
|
We only want the patchpoint flags to apply to the original block,
not the split remainder block.
|
|
Whenever blocks are renumbered or a block is swapped into an existing
pred list entry, ensure the pred list remains properly ordered.
Closes #8720.
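Keeping a singly-linked pred list ordered on insertion can be sketched like this; the types and field names are illustrative stand-ins, not the JIT's `flowList`:

```cpp
#include <cassert>

// Pred-list entry keyed by block number (stand-in for a flow edge
// keyed by the predecessor's bbNum).
struct PredEntry
{
    unsigned   bbNum;
    PredEntry* next;
};

// Insert an entry keeping the list sorted by ascending bbNum -- the
// invariant the commit maintains across renumbering and swapping.
PredEntry* InsertOrdered(PredEntry* head, PredEntry* entry)
{
    PredEntry** link = &head;
    while ((*link != nullptr) && ((*link)->bbNum < entry->bbNum))
    {
        link = &(*link)->next;
    }
    entry->next = *link;
    *link       = entry;
    return head;
}
```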
|
|
* Allow non-primitive struct returns to not require a stub. Fixes #35928.
* Support propagating the UnmanagedCallersOnly calling convention to the JIT. Block UnmanagedCallersOnly in crossgen1 since the attribute parsing code isn't included.
* Support passing through the calling convention for UnmanagedCallersOnly in crossgen2
* Fix clang errors.
* Fix stack calculation.
* Fix usings
* Clean up jitinterface.
* Remove invalid assert.
* Fix up stdcall name mangling lookup.
* Fix flag condition.
* Use the register var type when copying from the register to the stack.
* Change flag check for readability.
* Rename variables to remove shadowing.
* Fix formatting.
* Create new getEntryPointCallConv method on the EE-JIT interface to handle all calling convention resolution and support extensible calling conventions.
* Remove unreachable code.
* Remove now unused getUnmanagedCallConv jitinterface method (replaced by getEntryPointCallConv).
* Fix formatting.
* Rename getEntryPointCallConv and only call it with a method when it's a P/Invoke or Reverse P/Invoke.
* Pass SuppressGCTransition through the getUnmanagedCallConv JIT-EE interface API.
* Refactor callconv handling so we can handle reverse P/Invokes with the callconv in the signature (and not in an UnmanagedCallersOnly attribute).
* Clean up whitespace.
* Pass MethodIL as the scope for the signature to enable propagating down annotations for calli's in P/Invokes.
* Remove usages of CORINFO_CALLCONV_FOO where FOO is an unmanaged callconv. Move the definitions of those to the interpreter since that's the only place they're used.
* SuppressGC cleanup.
* Rename superpmi struct
* Add default condition to make clang happy.
* change enums to make clang happy.
* Remove CORINFO_CALLCONV_C and family from interpreter.
* Fix up handling of managed function pointers and remove invalid assert.
* Continue to use sigflag for suppressgc workaround.
* Clean up comment wording.
Signed-off-by: Jeremy Koritzinsky <jekoritz@microsoft.com>
* Remove more MethodIL passing we don't need any more
* Update src/coreclr/tools/Common/JitInterface/CorInfoImpl.cs
Co-authored-by: Jan Kotas <jkotas@microsoft.com>
* Fix SigTypeContext creation.
* Pass context by ptr.
* Fix formatting.
* Clear the Reverse P/Invoke flag when compiling inlinees. It's only needed on the entry-point functions.
Co-authored-by: Jan Kotas <jkotas@microsoft.com>
|
|
Under stress, optionally limit methods to having just one return block,
to stress the logic involved in merging returns.
|
|
This doesn't do any actual propagation, so remove it and the associated
tracking flags.
|
|
(src/coreclr/src becomes src/coreclr) (#44973)
* Move src/coreclr/src/Directory.Build.targets to src/coreclr
Merge src/coreclr/src/CMakeLists.txt into src/coreclr/CMakeLists.txt
* Mechanical move of src/coreclr/src to src/coreclr
* Scripts adjustments to reflect the changed paths
|