| Age | Commit message (Collapse) | Author |
|
|
|
* Implement remaining HINT instructions as NOP
* Split HINT encodings more to account for CSDB
|
|
* use Array.Empty() where instead of allocating new zero-length arrays
* structure for loops in a way that the JIT will elide array/Span bounds checking
* avoiding function calls in for loop condition tests
* avoid LINQ in a hot path
* conform with code style
* fix mistake in GetNextWaitingObject()
* fix GetNextWaitingObject() possibility of returning null if all list items have TimePoint == long.MaxValue
* make GetNextWaitingObject() behave FIFO behavior for multiple items with the same TimePoint
|
|
(#3933)
* Optimize string memory usage. Use ReadOnlySpan<char> and StringBuilder where possible.
* Fix copypaste error
* Code generator review fixes
* Use if statement instead of switch
* Code style fixes
Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>
* Another code style fix
* Styling fix
Co-authored-by: Mary-nyan <thog@protonmail.com>
* Styling fix
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>
Co-authored-by: Mary-nyan <thog@protonmail.com>
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
|
|
|
|
* Make all structs readonly when applicable. It should reduce amount of needless defensive copies
* Make structs with trivial boilerplate equality code record structs
* Remove unnecessary readonly modifiers from TextureCreateInfo
* Make BitMap structs readonly too
|
|
* A32: Implement VCVTT, VCVTB
* A32: F16C implementation of VCVTT/VCVTB
|
|
(#3694)
* OpCodeTable: Implement Hint instructions (CSDB, SEV, SEVL, WFE, WFI, YIELD)
* A64: Remove catch-all Hint instruction
* T16: Handle unallocated hint instructions
Some thumb tests execute these assuming that they're nops.
* T32: Fill out other Hint instructions
* A32: Fill out other hint instructions
|
|
both A32 and T32 (#3693)
|
|
|
|
* Fix increment on Arm32 NEON VLDn/VSTn instructions with regs > 1
* PPTC version bump
* PR feedback
|
|
|
|
|
|
instructions (#3687)
* Implement Thumb (32-bit) memory (ordered), multiply and bitfield instructions
* Remove public from interface
* Fix T32 BL immediate and implement signed and unsigned extend instructions
|
|
thumb instructions (#3683)
* Add ADD (zx imm12), NOP, MOV (register shifted), LDA, TBB, TBH, MOV (zx imm16) and CLZ thumb instructions, fix LDRD, STRD, CBZ, CBNZ and BLX (reg)
* Bump PPTC version
|
|
VPADDL, VSUBL, VQDMULH and VMLAL Arm32 NEON instructions (#3677)
* Implement VRSRA, VRSHRN, VQSHRUN, VQMOVN, VQMOVUN, VQADD, VQSUB, VRHADD, VPADDL, VSUBL, VQDMULH and VMLAL Arm32 NEON instructions
* PPTC version
* Fix VQADD/VQSUB
* Improve MRC/MCR handling and exception messages
In case data is being recompiled as code, we don't want to throw at emit stage, instead we should only throw if it actually tries to execute
|
|
* Implement some 32-bit Thumb instructions
* Optimize OpCode32MemMult using PopCount
|
|
* Implement Arm32 Sha256 and MRS Rd, CPSR instructions
* Add tests using Arm64 outputs
|
|
* T32: Implement load/store single (immediate)
* tests
* tidy formatting
* address comments
|
|
* T32: Implement Data Processing (Modified Immediate) instructions
* Update tests
* switch -> lookup table
|
|
* Decoders: Make IsThumb a function of OpCode32
* OpCode32: Fix GetPc
* T32: Implement B, B.cond, BL, BLX
* rm usings
|
|
* T32: Implement ADC, ADD, AND, BIC, CMN, CMP, EOR, MOV, MVN, ORN, ORR, RSB, SBC, SUB, TEQ, TST (shifted register)
* OpCodeTable: Sort T32 list
* Tests: Rename RandomTestCase to PrecomputedThumbTestCase
* T32: Tests for AluRsImm instructions
* fix nit
* fix nit 2
|
|
* Decoders: Add InITBlock argument
* OpCodeTable: Minor cleanup
* OpCodeTable: Remove existing thumb instruction implementations
* OpCodeTable: Prepare for thumb instructions
* OpCodeTables: Improve thumb fast lookup
* Tests: Prepare for thumb tests
* T16: Implement BX
* T16: Implement LSL/LSR/ASR (imm)
* T16: Implement ADDS, SUBS (reg)
* T16: Implement ADDS, SUBS (3-bit immediate)
* T16: Implement MOVS, CMP, ADDS, SUBS (8-bit immediate)
* T16: Implement ANDS, EORS, LSLS, LSRS, ASRS, ADCS, SBCS, RORS, TST, NEGS, CMP, CMN, ORRS, MULS, BICS, MVNS (low registers)
* T16: Implement ADD, CMP, MOV (high reg)
* T16: Implement BLX (reg)
* T16: Implement LDR (literal)
* T16: Implement {LDR,STR}{,H,B,SB,SH} (register)
* T16: Implement {LDR,STR}{,B,H} (immediate)
* T16: Implement LDR/STR (SP)
* T16: Implement ADR
* T16: Implement Add to SP (immediate)
* T16: Implement ADD/SUB (SP)
* T16: Implement SXTH, SXTB, UXTH, UTXB
* T16: Implement CBZ, CBNZ
* T16: Implement PUSH, POP
* T16: Implement REV, REV16, REVSH
* T16: Implement NOP
* T16: Implement LDM, STM
* T16: Implement SVC
* T16: Implement B (conditional)
* T16: Implement B (unconditional)
* T16: Implement IT
* fixup! T16: Implement ADD/SUB (SP)
* fixup! T16: Implement Add to SP (immediate)
* fixup! T16: Implement IT
* CpuTestThumb: Add randomized tests
* Remove inITBlock argument
* Address nits
* Use index to handle IfThenBlockState
* Reduce line noise
* fixup
* nit
|
|
* ARMeilleure: A32: Implement UHSUB8
* ARMeilleure: A32: Implement SHSUB8
|
|
|
|
|
|
* Implement FCVTNS (Scalar GP)
* Update Ptc Version
|
|
* Add FCVTMS_V Implementation to Armeilleure
* Fix opcode designation
* Add tests
* Amend Ptc version
* Fix OpCode / Tests
* Create Math.Floor helper method + Update implementation
* Address gdk comments
* Re-address gdk comments
* Update ARMeilleure/Decoders/OpCodeTable.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Update Tests to use 2S (4S) and 2D
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
|
|
|
|
* Implement UHADD8 instruction along with a test unit
* Update PTC revision number
|
|
* Implement MSR instruction
Fix #1342.
Now Pocket Rumble is playable.
* Address gdkchan's comments
* Address gdkchan's comments
* Address gdkchan's comment
|
|
|
|
|
|
* Implement VCNT based on AArch64 CNT
Add tests
* Update PTC version
* Address LDj's comments
* Explicit size in encoding
* Tighter tests
* Replace SoftFallback with IR helper
Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>
* Reduce one BitwiseAnd from IR fallback
Based on popcount64b from https://en.wikipedia.org/wiki/Hamming_weight#Efficient_implementation
* Rename parameter and add assert
Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>
Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>
|
|
* Implement PRFM (register variant) as NOP
Fix typo pfrm -> prfm
Add comments to distinguish variants
* Increment PTC version
|
|
Tests. (#1894)
|
|
variant & Sse fast path and slow path for both the "8/16B -> 8H" and "1/2D -> 1Q" variants; with Test. (#1817)
* Add Pmull_V Sse fast path only, both "8/16B -> 8H" and "1/2D -> 1Q" variants; with Test.
* Add Clmul fast path for the 128 bits variant.
* Small optimisation (save 60 instructions) for the Sse fast path about the 128 bits variant.
* Add slow path, both variants. Fix V128 Shl/Shr when shift = 0.
* A32: Add Vmull_I P64 variant (slow path); not tested.
* A32: Add Vmull_I_P8_P64 Test and fix P64 variant.
|
|
slow paths (using fused inst.s). Fix Vfma_V slow path not using StandardFPSCRValue(). (#1775)
* Fix Vnmls_S fast path (F64: losing input d value). Fix Vnmla_S & Vnmls_S slow paths (using fused inst.s).
Add Vfma_S & Vfms_S Fma fast paths.
Add Vfnma_S inst. with Fma/Sse fast paths and slow path.
Add Vfnms_S Sse fast path.
Add Tests for affected inst.s.
Nits.
* InternalVersion = 1775
* Nits.
* Fix Vfma_V slow path not using StandardFPSCRValue().
* Nit: Fix Vfma_V order.
* Add Vfms_V Sse fast path and slow path.
* Add Vfma_V and Vfms_V Test.
|
|
* Start implementation
* Draft
* Updated opcode.
Needs verification.
* Clean up code.
* Update implementation and tests.
* Update implemenation + tests
* Get RM from FPSCR + Do not use emit/addintrinsic
* Remove "fast" path, as recommended by gdk.
* Variable DELETED.
* Update ARMeilleure/Decoders/OpCodeTable.cs
Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>
* Update ARMeilleure/Instructions/InstEmitSimdCvt32.cs
Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>
* Update ARMeilleure/Instructions/InstEmitSimdCvt32.cs
Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>
* Update ARMeilleure/Instructions/InstEmitSimdCvt32.cs
Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>
* Move method
* stringing things together :)
Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>
|
|
* Implement VFMA.F64
* Simplify switch
* Simplify FMA Instructions into their own IntrinsicType.
* Remove whitespace
* Fix indentation
* Change tests for Vfnms -- disable inf / nan
* Move args up, not description ;)
* Implementation Complete.
All Tests Pass (Slow / Fast Path)
* Move location of function in assembler + test updates.
* Shift params upwards
* Remove unused function
* Update PTC version.
* Add comments / re-oreder opcode table.
* Remove whitespace
* Fix nit
* Fix nit.
* Fix whitespace
* Wrong opcode was used by a bad merge.
* Addressed rip's comments.
|
|
* Implement VFNMA.F<32/64>
* Update PTC Version
* Update Implementation & Renames & Correct Order
* Fix alignment
* Update implementation to not trigger assert
* Actually use the intrinsic that makes sense :)
|
|
* Add necessary methods / op-code
* Enable Support for FMA Instruction Set
* Add Intrinsics / Assembly Opcodes for VFMSUB231XX.
* Add X86 Instructions for VFMSUB231XX
* Implement VFNMS
* Implement VFNMS Tests
* Add special cases for FMA instructions.
* Update PPTC Version
* Remove unused Op
* Move Check into Assert / Cleanup
* Rename and cleanup
* Whitespace
* Whitespace / Rename
* Re-sort
* Address final requests
* Implement VFMA.F64
* Simplify switch
* Simplify FMA Instructions into their own IntrinsicType.
* Remove whitespace
* Fix indentation
* Change tests for Vfnms -- disable inf / nan
* Move args up, not description ;)
* Undo vfma
* Completely remove vfms code.,
* Fix order of instruction in assembler
|
|
* Get rid of Reflection.Emit dependency on CPU and Shader projects
* Remove useless private sets
* Missed those due to the alignment
|
|
paths). (#1577)
* Add Umaal & Vabd_I, Vabdl_I, Vaddl_I, Vhadd, Vqshrn, Vshll inst.s (slow paths).
No test provided (i.e. draft).
* Ptc InternalVersion = 1577
|
|
* SIMD&FP load/store with scale > 4 should be undefined
* Catch more invalid encodings for FP&SIMD LDR/STR (reg variant)
* Set PTC version to PR number
|
|
|
|
* Fix Vcvt_FI & Vcvt_RM; Add Vfma_S & Vfms_S. Add Tests.
* Address PR feedback & Nit.
|
|
* Added some 32 bits instructions:
* VBIC
* VTST
* VSRA
* Incremented the PTC
* Add tests and fix implementation
* Fixed VBIC immediate opcode mapping
* Hey hey!
* Nit.
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
Co-authored-by: LDj3SNuD <dvitiello@gmail.com>
Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>
|
|
* Fix Vabs_V & Vneg_V (S8, S16, S32 & F32); add Tests.
* Update Ptc.cs
|
|
|