<feed xmlns='http://www.w3.org/2005/Atom'>
<title>Ryujinx/ARMeilleure/Common, branch master</title>
<subtitle>A backup of the Ryujinx master git branch.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.benis.co.uk/Ryujinx/'/>
<entry>
<title>Move solution and projects to src</title>
<updated>2023-04-27T21:51:14+00:00</updated>
<author>
<name>TSR Berry</name>
<email>20988865+TSRBerry@users.noreply.github.com</email>
</author>
<published>2023-04-07T23:22:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.benis.co.uk/Ryujinx/commit/?id=cee712105850ac3385cd0091a923438167433f9f'/>
<id>cee712105850ac3385cd0091a923438167433f9f</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Use new ArgumentNullException and ObjectDisposedException throw-helper API (#4163)</title>
<updated>2022-12-27T19:27:11+00:00</updated>
<author>
<name>Berkan Diler</name>
<email>berkan.diler1@ingka.ikea.com</email>
</author>
<published>2022-12-27T19:27:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.benis.co.uk/Ryujinx/commit/?id=0d3b82477ecbf7128340b6725a79413427c68748'/>
<id>0d3b82477ecbf7128340b6725a79413427c68748</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Use ReadOnlySpan&lt;byte&gt; compiler optimization in more places (#3853)</title>
<updated>2022-11-18T03:10:44+00:00</updated>
<author>
<name>Berkan Diler</name>
<email>b.diler@gmx.de</email>
</author>
<published>2022-11-18T03:10:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.benis.co.uk/Ryujinx/commit/?id=c1372ed775e11aa4759fd3460f2e01d16372205a'/>
<id>c1372ed775e11aa4759fd3460f2e01d16372205a</id>
<content type='text'>
* Use ReadOnlySpan&lt;byte&gt; compiler optimization in more places

* Revert changes in ShaderBinaries.cs

* Remove unused using;

* Use ReadOnlySpan&lt;byte&gt; compiler optimization in more places</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Use ReadOnlySpan&lt;byte&gt; compiler optimization in more places

* Revert changes in ShaderBinaries.cs

* Remove unused using;

* Use ReadOnlySpan&lt;byte&gt; compiler optimization in more places</pre>
</div>
</content>
</entry>
<entry>
<title>Clean up rejit queue (#2751)</title>
<updated>2022-09-08T23:14:08+00:00</updated>
<author>
<name>FICTURE7</name>
<email>FICTURE7@gmail.com</email>
</author>
<published>2022-09-08T23:14:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.benis.co.uk/Ryujinx/commit/?id=ee1825219b8ccca13df7198d4e9ffb966e44c883'/>
<id>ee1825219b8ccca13df7198d4e9ffb966e44c883</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>A few minor documentation fixes. (#3599)</title>
<updated>2022-08-19T21:21:06+00:00</updated>
<author>
<name>Nicholas Rodine</name>
<email>halfofastaple@gmail.com</email>
</author>
<published>2022-08-19T21:21:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.benis.co.uk/Ryujinx/commit/?id=7defc59b9dee5459e52f394642643dbd59c9b32f'/>
<id>7defc59b9dee5459e52f394642643dbd59c9b32f</id>
<content type='text'>
* A few minor documentation fixes.

* Removed more invalid inheritdoc instances.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* A few minor documentation fixes.

* Removed more invalid inheritdoc instances.</pre>
</div>
</content>
</entry>
<entry>
<title>Removed unused usings. (#3593)</title>
<updated>2022-08-18T16:04:54+00:00</updated>
<author>
<name>Nicholas Rodine</name>
<email>halfofastaple@gmail.com</email>
</author>
<published>2022-08-18T16:04:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.benis.co.uk/Ryujinx/commit/?id=951700fdd8f54fb34ffe8a3fb328a68b5bf37abe'/>
<id>951700fdd8f54fb34ffe8a3fb328a68b5bf37abe</id>
<content type='text'>
* Removed unused usings.

* Added back using, now that it's used.

* Removed extra whitespace.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Removed unused usings.

* Added back using, now that it's used.

* Removed extra whitespace.</pre>
</div>
</content>
</entry>
<entry>
<title>Optimize LSRA (#2563)</title>
<updated>2021-10-08T21:15:44+00:00</updated>
<author>
<name>FICTURE7</name>
<email>FICTURE7@gmail.com</email>
</author>
<published>2021-10-08T21:15:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.benis.co.uk/Ryujinx/commit/?id=69093cf2d69490862aff974f170cee63a0016fd0'/>
<id>69093cf2d69490862aff974f170cee63a0016fd0</id>
<content type='text'>
* Optimize `TryAllocateRegWithoutSpill` a bit

* Add a fast path for when all registers are live.
* Do not query `GetOverlapPosition` if the register is already in use
  (i.e: free position is 0).

* Do not allocate child split list if not parent

* Turn `LiveRange` into a reference struct

`LiveRange` is now a reference wrapping struct like `Operand` and
`Operation`.

It has also been changed into a singly linked-list. In micro-benchmarks
traversing the linked-list was faster than binary search on `List&lt;T&gt;`.
Even for quite large input sizes (e.g: 1,000,000), surprisingly.

Could be because the code gen for traversing the linked-list is much
much cleaner and there is no virtual dispatch happening when checking if
intervals overlap.

* Turn `LiveInterval` into an iterator

The LSRA allocates in forward order and never inspects previous
`LiveInterval`s once they have expired. Something similar can be done for
the `LiveRange`s within the `LiveInterval`s themselves.

The `LiveInterval` is turned into an iterator which expires the
`LiveRange`s within it. The iterator is moved forward along with the
interval walking code, i.e: AllocateInterval(context, interval, cIndex).

* Remove `LinearScanAllocator.Sources`

Local methods are less likely to allocate than lambdas.

* Optimize `GetOverlapPosition(interval)` a bit

Time complexity should be in O(n+m) instead of O(nm) now.

* Optimize `NumberLocals` a bit

Use the same idea as in `HybridAllocator` to store the visited state
in the MSB of the Operand's value instead of using a `HashSet&lt;T&gt;`.

* Optimize `InsertSplitCopies` a bit

Avoid allocating a redundant `CopyResolver`.

* Optimize `InsertSplitCopiesAtEdges` a bit

Avoid redundant allocations of `CopyResolver`.

* Use stack allocation for `freePositions`

Avoid redundant computations.

* Add `UseList`

Replace `SortedIntegerList` with an even more specialized data
structure. It allocates memory on the arena allocators and does not
require copying use positions when splitting it.

* Turn `LiveInterval` into a reference struct

`LiveInterval` is now a reference wrapping struct like `Operand` and
`Operation`.

The rationale for turning this into a reference wrapping struct is that
a `LiveInterval` is associated with each local variable, and these
intervals may themselves be split further. I've seen translations
having up to 8000 local variables.

To make the `LiveInterval` unmanaged, a new data structure called
`LiveIntervalList` was added to store child splits. This differs from
`SortedList&lt;,&gt;` because it can contain intervals with the same start
position.

Really wished we got some more of C++ template in C#. :^(

* Optimize `GetChildSplit` a bit

No need to inspect the remaining ranges if we've reached a range which
starts after position, since the split list is ordered.

* Optimize `CopyResolver` a bit

Lazily allocate the fill, spill and parallel copy structures since most
of the time only one of them is needed.

* Optimize `BitMap.Enumerator` a bit

Marking `MoveNext` as `AggressiveInlining` allows RyuJIT to promote the
`Enumerator` struct into registers completely, reducing load/store code
a lot since it does not have to store the struct on the stack for ABI
purposes.

* Use stack allocation for `use/blockedPositions`

* Optimize `AllocateWithSpill` a bit

* Address feedback

* Make `LiveInterval.AddRange(,)` more conservative

Produces no diff against master, but just for good measure.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Optimize `TryAllocateRegWithoutSpill` a bit

* Add a fast path for when all registers are live.
* Do not query `GetOverlapPosition` if the register is already in use
  (i.e: free position is 0).

* Do not allocate child split list if not parent

* Turn `LiveRange` into a reference struct

`LiveRange` is now a reference wrapping struct like `Operand` and
`Operation`.

It has also been changed into a singly linked-list. In micro-benchmarks
traversing the linked-list was faster than binary search on `List&lt;T&gt;`.
Even for quite large input sizes (e.g: 1,000,000), surprisingly.

Could be because the code gen for traversing the linked-list is much
much cleaner and there is no virtual dispatch happening when checking if
intervals overlap.

* Turn `LiveInterval` into an iterator

The LSRA allocates in forward order and never inspects previous
`LiveInterval`s once they have expired. Something similar can be done for
the `LiveRange`s within the `LiveInterval`s themselves.

The `LiveInterval` is turned into an iterator which expires the
`LiveRange`s within it. The iterator is moved forward along with the
interval walking code, i.e: AllocateInterval(context, interval, cIndex).

* Remove `LinearScanAllocator.Sources`

Local methods are less likely to allocate than lambdas.

* Optimize `GetOverlapPosition(interval)` a bit

Time complexity should be in O(n+m) instead of O(nm) now.

* Optimize `NumberLocals` a bit

Use the same idea as in `HybridAllocator` to store the visited state
in the MSB of the Operand's value instead of using a `HashSet&lt;T&gt;`.

* Optimize `InsertSplitCopies` a bit

Avoid allocating a redundant `CopyResolver`.

* Optimize `InsertSplitCopiesAtEdges` a bit

Avoid redundant allocations of `CopyResolver`.

* Use stack allocation for `freePositions`

Avoid redundant computations.

* Add `UseList`

Replace `SortedIntegerList` with an even more specialized data
structure. It allocates memory on the arena allocators and does not
require copying use positions when splitting it.

* Turn `LiveInterval` into a reference struct

`LiveInterval` is now a reference wrapping struct like `Operand` and
`Operation`.

The rationale for turning this into a reference wrapping struct is that
a `LiveInterval` is associated with each local variable, and these
intervals may themselves be split further. I've seen translations
having up to 8000 local variables.

To make the `LiveInterval` unmanaged, a new data structure called
`LiveIntervalList` was added to store child splits. This differs from
`SortedList&lt;,&gt;` because it can contain intervals with the same start
position.

Really wished we got some more of C++ template in C#. :^(

* Optimize `GetChildSplit` a bit

No need to inspect the remaining ranges if we've reached a range which
starts after position, since the split list is ordered.

* Optimize `CopyResolver` a bit

Lazily allocate the fill, spill and parallel copy structures since most
of the time only one of them is needed.

* Optimize `BitMap.Enumerator` a bit

Marking `MoveNext` as `AggressiveInlining` allows RyuJIT to promote the
`Enumerator` struct into registers completely, reducing load/store code
a lot since it does not have to store the struct on the stack for ABI
purposes.

* Use stack allocation for `use/blockedPositions`

* Optimize `AllocateWithSpill` a bit

* Address feedback

* Make `LiveInterval.AddRange(,)` more conservative

Produces no diff against master, but just for good measure.</pre>
</div>
</content>
</entry>
<entry>
<title>Add `Operand.Label` support to `Assembler` (#2680)</title>
<updated>2021-10-05T17:04:55+00:00</updated>
<author>
<name>FICTURE7</name>
<email>FICTURE7@gmail.com</email>
</author>
<published>2021-10-05T17:04:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.benis.co.uk/Ryujinx/commit/?id=ecc64c934da43f881c2821bc9bc52ee42e55af2f'/>
<id>ecc64c934da43f881c2821bc9bc52ee42e55af2f</id>
<content type='text'>
* Add `Operand.Label` support to `Assembler`

This adds label support to `Assembler` and enables branch tightening
when compiling with relocatables. Jump management and patching have been
moved to the `Assembler`.

* Move instruction table to `Assembler.Table`

* Set PTC internal version

* Rename `Assembler.Table` to `AssemblerTable`</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Add `Operand.Label` support to `Assembler`

This adds label support to `Assembler` and enables branch tightening
when compiling with relocatables. Jump management and patching have been
moved to the `Assembler`.

* Move instruction table to `Assembler.Table`

* Set PTC internal version

* Rename `Assembler.Table` to `AssemblerTable`</pre>
</div>
</content>
</entry>
<entry>
<title>Reduce JIT GC allocations (#2515)</title>
<updated>2021-08-17T18:08:34+00:00</updated>
<author>
<name>FICTURE7</name>
<email>FICTURE7@gmail.com</email>
</author>
<published>2021-08-17T18:08:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.benis.co.uk/Ryujinx/commit/?id=22b2cb39af00fb8881e908fd671fbf57a6e2db2a'/>
<id>22b2cb39af00fb8881e908fd671fbf57a6e2db2a</id>
<content type='text'>
* Turn `MemoryOperand` into a struct

* Remove `IntrinsicOperation`

* Remove `PhiNode`

* Remove `Node`

* Turn `Operand` into a struct

* Turn `Operation` into a struct

* Clean up pool management methods

* Add `Arena` allocator

* Move `OperationHelper` to `Operation.Factory`

* Move `OperandHelper` to `Operand.Factory`

* Optimize `Operation` a bit

* Fix `Arena` initialization

* Rename `NativeList&lt;T&gt;` to `ArenaList&lt;T&gt;`

* Reduce `Operand` size from 88 to 56 bytes

* Reduce `Operation` size from 56 to 40 bytes

* Add optimistic interning of Register &amp; Constant operands

* Optimize `RegisterUsage` pass a bit

* Optimize `RemoveUnusedNodes` pass a bit

Iterating in reverse-order allows killing dependency chains in a single
pass.

* Fix PPTC symbols

* Optimize `BasicBlock` a bit

Reduce allocations from `_successor` &amp; `DominanceFrontiers`

* Fix `Operation` resize

* Make `Arena` expandable

Change the arena allocator to be expandable by allocating in pages, with
some of them being pooled. Currently 32 pages are pooled. An LRU removal
mechanism should probably be added to it.

Apparently MHR can allocate bitmaps large enough to exceed the 16MB
limit for the type.

* Move `Arena` &amp; `ArenaList` to `Common`

* Remove `ThreadStaticPool` &amp; co

* Add `PhiOperation`

* Reduce `Operand` size from 56 to 48 bytes

* Add linear-probing to `Operand` intern table

* Optimize `HybridAllocator` a bit

* Add `Allocators` class

* Tune `ArenaAllocator` sizes

* Add page removal mechanism to `ArenaAllocator`

Remove pages which have not been used for more than 5s after each reset.

I am on the fence about whether this would be better using a Gen2
callback object like the one in System.Buffers.ArrayPool&lt;T&gt;, to trim
the pool. Because right now if a large translation happens, the pages
will be freed only after a reset. This reset may not happen for a while
because no new translation is hit, but the arena base sizes are rather
small.

* Fix `OOM` when allocating larger than page size in `ArenaAllocator`

Tweak resizing mechanism for Operand.Uses and Assignments.

* Optimize `Optimizer` a bit

* Optimize `Operand.Add&lt;T&gt;/Remove&lt;T&gt;` a bit

* Clean up `PreAllocator`

* Fix phi insertion order

Reduce codegen diffs.

* Fix code alignment

* Use new heuristics for degree of parallelism

* Suppress warnings

* Address gdkchan's feedback

Renamed `GetValue()` to `GetValueUnsafe()` to make it more clear that
`Operand.Value` should usually not be modified directly.

* Add fast path to `ArenaAllocator`

* Assembly for `ArenaAllocator.Allocate(ulong)`:

  .L0:
    mov rax, [rcx+0x18]
    lea r8, [rax+rdx]
    cmp r8, [rcx+0x10]
    ja short .L2
  .L1:
    mov rdx, [rcx+8]
    add rax, [rdx+8]
    mov [rcx+0x18], r8
    ret
  .L2:
    jmp ArenaAllocator.AllocateSlow(UInt64)

  A few variables/fields had to be changed to ulong so that RyuJIT avoids
  emitting zero-extends.

* Implement a new heuristic to free pooled pages.

  If an arena is used often, it is more likely that its pages will be
  needed, so the pages are kept for longer (e.g: during PPTC rebuild or
  bursts of compilations). If it is not used often, then it is more likely
  that its pages will not be needed (e.g: after PPTC rebuild or bursts
  of compilations).

* Address riperiperi's feedback

* Use `EqualityComparer&lt;T&gt;` in `IntrusiveList&lt;T&gt;`

Avoids a potential GC hole in `Equals(T, T)`.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Turn `MemoryOperand` into a struct

* Remove `IntrinsicOperation`

* Remove `PhiNode`

* Remove `Node`

* Turn `Operand` into a struct

* Turn `Operation` into a struct

* Clean up pool management methods

* Add `Arena` allocator

* Move `OperationHelper` to `Operation.Factory`

* Move `OperandHelper` to `Operand.Factory`

* Optimize `Operation` a bit

* Fix `Arena` initialization

* Rename `NativeList&lt;T&gt;` to `ArenaList&lt;T&gt;`

* Reduce `Operand` size from 88 to 56 bytes

* Reduce `Operation` size from 56 to 40 bytes

* Add optimistic interning of Register &amp; Constant operands

* Optimize `RegisterUsage` pass a bit

* Optimize `RemoveUnusedNodes` pass a bit

Iterating in reverse-order allows killing dependency chains in a single
pass.

* Fix PPTC symbols

* Optimize `BasicBlock` a bit

Reduce allocations from `_successor` &amp; `DominanceFrontiers`

* Fix `Operation` resize

* Make `Arena` expandable

Change the arena allocator to be expandable by allocating in pages, with
some of them being pooled. Currently 32 pages are pooled. An LRU removal
mechanism should probably be added to it.

Apparently MHR can allocate bitmaps large enough to exceed the 16MB
limit for the type.

* Move `Arena` &amp; `ArenaList` to `Common`

* Remove `ThreadStaticPool` &amp; co

* Add `PhiOperation`

* Reduce `Operand` size from 56 to 48 bytes

* Add linear-probing to `Operand` intern table

* Optimize `HybridAllocator` a bit

* Add `Allocators` class

* Tune `ArenaAllocator` sizes

* Add page removal mechanism to `ArenaAllocator`

Remove pages which have not been used for more than 5s after each reset.

I am on the fence about whether this would be better using a Gen2
callback object like the one in System.Buffers.ArrayPool&lt;T&gt;, to trim
the pool. Because right now if a large translation happens, the pages
will be freed only after a reset. This reset may not happen for a while
because no new translation is hit, but the arena base sizes are rather
small.

* Fix `OOM` when allocating larger than page size in `ArenaAllocator`

Tweak resizing mechanism for Operand.Uses and Assignments.

* Optimize `Optimizer` a bit

* Optimize `Operand.Add&lt;T&gt;/Remove&lt;T&gt;` a bit

* Clean up `PreAllocator`

* Fix phi insertion order

Reduce codegen diffs.

* Fix code alignment

* Use new heuristics for degree of parallelism

* Suppress warnings

* Address gdkchan's feedback

Renamed `GetValue()` to `GetValueUnsafe()` to make it more clear that
`Operand.Value` should usually not be modified directly.

* Add fast path to `ArenaAllocator`

* Assembly for `ArenaAllocator.Allocate(ulong)`:

  .L0:
    mov rax, [rcx+0x18]
    lea r8, [rax+rdx]
    cmp r8, [rcx+0x10]
    ja short .L2
  .L1:
    mov rdx, [rcx+8]
    add rax, [rdx+8]
    mov [rcx+0x18], r8
    ret
  .L2:
    jmp ArenaAllocator.AllocateSlow(UInt64)

  A few variables/fields had to be changed to ulong so that RyuJIT avoids
  emitting zero-extends.

* Implement a new heuristic to free pooled pages.

  If an arena is used often, it is more likely that its pages will be
  needed, so the pages are kept for longer (e.g: during PPTC rebuild or
  bursts of compilations). If it is not used often, then it is more likely
  that its pages will not be needed (e.g: after PPTC rebuild or bursts
  of compilations).

* Address riperiperi's feedback

* Use `EqualityComparer&lt;T&gt;` in `IntrusiveList&lt;T&gt;`

Avoids a potential GC hole in `Equals(T, T)`.</pre>
</div>
</content>
</entry>
<entry>
<title>Add multi-level function table (#2228)</title>
<updated>2021-05-29T21:06:28+00:00</updated>
<author>
<name>FICTURE7</name>
<email>FICTURE7@gmail.com</email>
</author>
<published>2021-05-29T21:06:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.benis.co.uk/Ryujinx/commit/?id=9d7627af6484e090ebbc3209bc7301f0bdf47d24'/>
<id>9d7627af6484e090ebbc3209bc7301f0bdf47d24</id>
<content type='text'>
* Add AddressTable&lt;T&gt;

* Use AddressTable&lt;T&gt; for dispatch

* Remove JumpTable &amp; co.

* Add fallback for out of range addresses

* Add PPTC support

* Add documentation to `AddressTable&lt;T&gt;`

* Make AddressTable&lt;T&gt; configurable

* Fix table walk

* Fix IsMapped check

* Remove CountTableCapacity

* Add PPTC support for fast path

* Rename IsMapped to IsValid

* Remove stale comment

* Change format of address in exception message

* Add TranslatorStubs

* Split DispatchStub

Avoids recompilation of stubs during tests.

* Add hint for 64bit or 32bit

* Add documentation to `Symbol`

* Add documentation to `TranslatorStubs`

Make `TranslatorStubs` disposable as well.

* Add documentation to `SymbolType`

* Add `AddressTableEventSource` to monitor function table size

Add an EventSource which measures the amount of unmanaged bytes
allocated by AddressTable&lt;T&gt; instances.

 dotnet-counters monitor -n Ryujinx --counters ARMeilleure

* Add `AllowLcqInFunctionTable` optimization toggle

This is to reduce the impact this change has on the test duration.
Before, every time a test was run, the FunctionTable would be initialized
and populated so that the newly compiled test would get registered to
it.

* Implement unmanaged dispatcher

Uses the DispatchStub to dispatch into the next translation, which
allows execution to stay in unmanaged for longer and skips a
ConcurrentDictionary look up when the target translation has been
registered to the FunctionTable.

* Remove redundant null check

* Tune levels of FunctionTable

Uses 5 levels instead of 4 and changes the unit of
AddressTableEventSource from KB to MB.

* Use 64-bit function table

Improves codegen for direct branches:

    mov qword [rax+0x408],0x10603560
 -  mov rcx,sub_10603560_OFFSET
 -  mov ecx,[rcx]
 -  mov ecx,ecx
 -  mov rdx,JIT_CACHE_BASE
 -  add rdx,rcx
 +  mov rcx,sub_10603560
 +  mov rdx,[rcx]
    mov rcx,rax

Improves codegen for dispatch stub:

    and rax,byte +0x1f
 -  mov eax,[rcx+rax*4]
 -  mov eax,eax
 -  mov rcx,JIT_CACHE_BASE
 -  lea rax,[rcx+rax]
 +  mov rax,[rcx+rax*8]
    mov rcx,rbx

* Remove `JitCacheSymbol` &amp; `JitCache.Offset`

* Turn `Translator.Translate` into an instance method

We do not have to add more parameters to this method and related ones as
new structures are added &amp; needed for translation.

* Add symbol only when PTC is enabled

Address LDj3SNuD's feedback

* Change `NativeContext.Running` to a 32-bit integer

* Fix PageTable symbol for host mapped</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Add AddressTable&lt;T&gt;

* Use AddressTable&lt;T&gt; for dispatch

* Remove JumpTable &amp; co.

* Add fallback for out of range addresses

* Add PPTC support

* Add documentation to `AddressTable&lt;T&gt;`

* Make AddressTable&lt;T&gt; configurable

* Fix table walk

* Fix IsMapped check

* Remove CountTableCapacity

* Add PPTC support for fast path

* Rename IsMapped to IsValid

* Remove stale comment

* Change format of address in exception message

* Add TranslatorStubs

* Split DispatchStub

Avoids recompilation of stubs during tests.

* Add hint for 64bit or 32bit

* Add documentation to `Symbol`

* Add documentation to `TranslatorStubs`

Make `TranslatorStubs` disposable as well.

* Add documentation to `SymbolType`

* Add `AddressTableEventSource` to monitor function table size

Add an EventSource which measures the amount of unmanaged bytes
allocated by AddressTable&lt;T&gt; instances.

 dotnet-counters monitor -n Ryujinx --counters ARMeilleure

* Add `AllowLcqInFunctionTable` optimization toggle

This is to reduce the impact this change has on the test duration.
Before, every time a test was run, the FunctionTable would be initialized
and populated so that the newly compiled test would get registered to
it.

* Implement unmanaged dispatcher

Uses the DispatchStub to dispatch into the next translation, which
allows execution to stay in unmanaged for longer and skips a
ConcurrentDictionary look up when the target translation has been
registered to the FunctionTable.

* Remove redundant null check

* Tune levels of FunctionTable

Uses 5 levels instead of 4 and changes the unit of
AddressTableEventSource from KB to MB.

* Use 64-bit function table

Improves codegen for direct branches:

    mov qword [rax+0x408],0x10603560
 -  mov rcx,sub_10603560_OFFSET
 -  mov ecx,[rcx]
 -  mov ecx,ecx
 -  mov rdx,JIT_CACHE_BASE
 -  add rdx,rcx
 +  mov rcx,sub_10603560
 +  mov rdx,[rcx]
    mov rcx,rax

Improves codegen for dispatch stub:

    and rax,byte +0x1f
 -  mov eax,[rcx+rax*4]
 -  mov eax,eax
 -  mov rcx,JIT_CACHE_BASE
 -  lea rax,[rcx+rax]
 +  mov rax,[rcx+rax*8]
    mov rcx,rbx

* Remove `JitCacheSymbol` &amp; `JitCache.Offset`

* Turn `Translator.Translate` into an instance method

We do not have to add more parameters to this method and related ones as
new structures are added &amp; needed for translation.

* Add symbol only when PTC is enabled

Address LDj3SNuD's feedback

* Change `NativeContext.Running` to a 32-bit integer

* Fix PageTable symbol for host mapped</pre>
</div>
</content>
</entry>
</feed>
