aboutsummaryrefslogtreecommitdiff
path: root/ARMeilleure
AgeCommit message (Collapse)Author
2020-12-02IPC refactor part 2: Use ReplyAndReceive on HLE services and remove special ↵gdkchan
handling from kernel (#1458) * IPC refactor part 2: Use ReplyAndReceive on HLE services and remove special handling from kernel * Fix for applet transfer memory + some nits * Keep handles if possible to avoid server handle table exhaustion * Fix IPC ZeroFill bug * am: Correctly implement CreateManagedDisplayLayer and implement CreateManagedDisplaySeparableLayer CreateManagedDisplaySeparableLayer is requires since 10.x+ when appletResourceUserId != 0 * Make it exit properly * Make ServiceNotImplementedException show the full message again * Allow yielding execution to avoid starving other threads * Only wait if active * Merge IVirtualMemoryManager and IAddressSpaceManager * Fix Ro loading data from the wrong process Co-authored-by: Thog <me@thog.eu>
2020-11-20Use backup when PTC compression is corrupt (#1704)riperiperi
* Use backup when PTC compression is corrupt * Apply suggestions from code review Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com> Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>
2020-11-18CPU (A64): Add FP16/FP32 fast paths (F16C Intrinsics) for Fcvt_S, Fcvtl_V & ↵LDj3SNuD
Fcvtn_V Instructions. Now HardwareCapabilities uses CpuId. (#1650) * net5.0 * CPU (A64): Add FP16/FP32 fast paths (F16C Intrinsics) for Fcvt_S, Fcvtl_V & Fcvtn_V Instructions. Switch to .NET 5.0. Nits. Tests performed successfully in both debug and release mode (for all instructions involved). * Address comment. * Update appveyor.yml * Revert "Update appveyor.yml" This reverts commit 27cdd59e8b90e227e6924d9c162af26c00a89013. * Remove Assembler CpuId. * Update appveyor.yml * Address comment.
2020-11-17shader cache: Fix Linux boot issues (#1709)Mary
* shader cache: Fix Linux boot issues This rollback the init logic back to previous state, and replicate the way PTC handle initialization. * shader cache: set default state of ready for translation event to false * Fix cpu unit tests
2020-11-15infra: Migrate to .NET 5 (#1694)Mary
* infra: Migrate to .NET 5 This migrate projects and CI to .NET 5 * Remove language version restrictions (now on 9.0 by default) * infra: pin .NET 5 to avoid later issues * infra: Cleanup csproj files * infra: update dependencies * infra: Add temporary workaround for a bug in Vector128.Create see https://github.com/dotnet/runtime/issues/44704 for more informations
2020-11-04Fix LiveInterval.Split (#1660)FICTURE7
Before when splitting intervals, the end of the range would be included in the split check, this can produce empty ranges in the child split. This in turn can affect spilling decisions since the child split will have a different start position and this empty range will get a register and move to the active set for a brief moment. For example: A = [153, 172[; [1899, 1916[; [1991, 2010[; [2397, 2414[; ... Split(A, 1916) A0 = [153, 172[; [1899, 1916[ A1 = [1916, 1916[; [1991, 2010[; [2397, 2414[; ...
2020-10-21Get rid of Reflection.Emit dependency on CPU and Shader projects (#1626)gdkchan
* Get rid of Reflection.Emit dependency on CPU and Shader projects * Remove useless private sets * Missed those due to the alignment
2020-10-16Memory Read/Write Tracking using Region Handles (#1272)riperiperi
* WIP Range Tracking - Texture invalidation seems to have large problems - Buffer/Pool invalidation may have problems - Mirror memory tracking puts an additional `add` in compiled code, we likely just want to make HLE access slower if this is the final solution. - Native project is in the messiest possible location. - [HACK] JIT memory access always uses native "fast" path - [HACK] Trying some things with texture invalidation and views. It works :) Still a few hacks, messy things, slow things More work in progress stuff (also move to memory project) Quite a bit faster now. - Unmapping GPU VA and CPU VA will now correctly update write tracking regions, and invalidate textures for the former. - The Virtual range list is now non-overlapping like the physical one. - Fixed some bugs where regions could leak. - Introduced a weird bug that I still need to track down (consistent invalid buffer in MK8 ribbon road) Move some stuff. I think we'll eventually just put the dll and so for this in a nuget package. Fix rebase. [WIP] MultiRegionHandle variable size ranges - Avoid reprotecting regions that change often (needs some tweaking) - There's still a bug in buffers, somehow. - Might want different api for minimum granularity Fix rebase issue Commit everything needed for software only tracking. Remove native components. Remove more native stuff. Cleanup Use a separate window for the background context, update opentk. (fixes linux) Some experimental changes Should get things working up to scratch - still need to try some things with flush/modification and res scale. Include address with the region action. Initial work to make range tracking work Still a ton of bugs Fix some issues with the new stuff. * Fix texture flush instability There's still some weird behaviour, but it's much improved without this. (textures with cpu modified data were flushing over it) * Find the destination texture for Buffer->Texture full copy Greatly improves performance for nvdec videos (with range tracking) * Further improve texture tracking * Disable Memory Tracking for view parents This is a temporary approach to better match behaviour on master (where invalidations would be soaked up by views, rather than trigger twice) The assumption is that when views are created to a texture, they will cover all of its data anyways. Of course, this can easily be improved in future. * Introduce some tracking tests. WIP * Complete base tests. * Add more tests for multiregion, fix existing test. * Cleanup Part 1 * Remove unnecessary code from memory tracking * Fix some inconsistencies with 3D texture rule. * Add dispose tests. * Use a background thread for the background context. Rather than setting and unsetting a context as current, doing the work on a dedicated thread with signals seems to be a bit faster. Also nerf the multithreading test a bit. * Copy to texture with matching alignment This extends the copy to work for some videos with unusual size, such as tutorial videos in SMO. It will only occur if the destination texture already exists at XCount size. * Track reads for buffer copies. Synchronize new buffers before copying overlaps. * Remove old texture flushing mechanisms. Range tracking all the way, baby. * Wake the background thread when disposing. Avoids a deadlock when games are closed. * Address Feedback 1 * Separate TextureCopy instance for background thread Also `BackgroundContextWorker.InBackground` for a more sensible idenfifier for if we're in a background thread. * Add missing XML docs. * Address Feedback * Maybe I should start drinking coffee. * Some more feedback. * Remove flush warning, Refocus window after making background context
2020-10-13Add Umaal & Vabd_I, Vabdl_I, Vaddl_I, Vhadd, Vqshrn, Vshll inst.s (slow ↵LDj3SNuD
paths). (#1577) * Add Umaal & Vabd_I, Vabdl_I, Vaddl_I, Vhadd, Vqshrn, Vshll inst.s (slow paths). No test provided (i.e. draft). * Ptc InternalVersion = 1577
2020-09-30Remove old, unused CPU optimization (#1586)gdkchan
2020-09-22IPC refactor part 1: Use explicit separate threads to process requests (#1447)gdkchan
* Changes to allow explicit management of service threads * Remove now unused code * Remove ThreadCounter, its no longer needed * Allow and use separate server per service, also fix exit issues * New policy change: PTC version now uses PR number
2020-09-19Fix host stack overflow caused by some recursive guest methods. (#1528)LDj3SNuD
* Fix host stack overflow caused by some recursive guest methods. * PPTC flag up. * Address comments. Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2020-09-19Implement block placement (#1549)FICTURE7
* Implement block placement Implement a simple pass which re-orders cold blocks at the end of the list of blocks in the CFG. * Set PPTC version * Use Array.Resize Address gdkchan's feedback
2020-09-12Relax block ordering constraints (#1535)FICTURE7
* Relax block ordering constraints Before `block.Next` had to follow `block.ListNext`, now it does not. Instead `CodeGenerator` will now emit the necessary jump instructions to ensure control flow. This makes control flow and block order modifications easier. It also eliminates some simple cases of redundant branches. * Set PPTC version
2020-09-07Do not emit StoreToContext before Return (#1537)FICTURE7
* Do not emit StoreToContext before Return * Set PPTC version
2020-09-01SIMD&FP load/store with scale > 4 should be undefined (#1522)gdkchan
* SIMD&FP load/store with scale > 4 should be undefined * Catch more invalid encodings for FP&SIMD LDR/STR (reg variant) * Set PTC version to PR number
2020-08-31Improve static branch prediction along fast path for memory accesses (#1484)FICTURE7
* Improve static branch prediction along fast path for memory accesses * Set PPTC interval version
2020-08-31CPU (A64): Add Scvtf_S_Fixed & Ucvtf_S_Fixed with Tests. (#1492)LDj3SNuD
2020-08-30Remove the Ryujinx.Debugger project (#1506)Mary
This project wasn't really used by anyone and isn't worth mantaining. This commit remove the profiler entirely from Ryujinx and remove the associated CI tasks.
2020-08-30Allow launching with custom data directories (#1505)mageven
* Allow launching with custom data directories Don't load alternate keys when using custom directory * Address gdkchan's comments * Misc fixes to log levels Added more enabled log levels by default Moved successful config updation to Notice as 1. It's not a warning 2. Warnings could've been disabled by the config load and hence message would be lost
2020-08-17Amadeus: Final Act (#1481)Mary
* Amadeus: Final Act This is my requiem, I present to you Amadeus, a complete reimplementation of the Audio Renderer! This reimplementation is based on my reversing of every version of the audio system module that I carried for the past 10 months. This supports every revision (at the time of writing REV1 to REV8 included) and all features proposed by the Audio Renderer on real hardware. Because this component could be used outside an emulation context, and to avoid possible "inspirations" not crediting the project, I decided to license the Ryujinx.Audio.Renderer project under LGPLv3. - FE3H voices in videos and chapter intro are not present. - Games that use two audio renderer **at the same time** are probably going to have issues right now **until we rewrite the audio output interface** (Crash Team Racing is the only known game to use two renderer at the same time). - Persona 5 Scrambler now goes ingame but audio is garbage. This is caused by the fact that the game engine is syncing audio and video in a really aggressive way. This will disappears the day this game run at full speed. * Make timing more precise when sleeping on Windows Improve precision to a 1ms resolution on Windows NT based OS. This is used to avoid having totally erratic timings and unify all Windows users to the same resolution. NOTE: This is only active when emulation is running.
2020-08-13Fix Vcvt_FI & Vcvt_RM; Add Vfma_S & Vfms_S. Add Tests. (#1471)LDj3SNuD
* Fix Vcvt_FI & Vcvt_RM; Add Vfma_S & Vfms_S. Add Tests. * Address PR feedback & Nit.
2020-08-09Fix PTC version increment from #1433 (#1462)mageven
2020-08-08CPU: This PR fixes Fpscr, among other things. (#1433)LDj3SNuD
* CPU: This PR fixes Fpscr, among other things. * Add Fpscr.Qc = 1 if sat. for Vqrshrn & Vqrshrun. * Fix Vcmp & Vcmpe opcode table. * Revert "Fix Vcmp & Vcmpe opcode table." This reverts commit c117d9410d693185ff5f8ee8e457ffbfb2027dd5. * Address PR feedbacks.
2020-08-05Improve branch operations (#1442)Ficture Seven
* Add Compare instruction * Add BranchIf instruction * Use test when BranchIf & Compare against 0 * Propagate Compare into BranchIfTrue/False use - Propagate Compare operations into their BranchIfTrue/False use and turn these into a BranchIf. - Clean up Comparison enum. * Replace BranchIfTrue/False with BranchIf * Use BranchIf in EmitPtPointerLoad - Using BranchIf early instead of BranchIfTrue/False improves LCQ and reduces the amount of work needed by the Optimizer. EmitPtPointerLoader was a/the big producer of BranchIfTrue/False. - Fix asserts firing when assembling BitwiseAnd because of type mismatch in EmitStoreExclusive. This is harmless and should not cause any diffs. * Increment PPTC interval version * Improve IRDumper for BranchIf & Compare * Use BranchIf in EmitNativeCall * Clean up * Do not emit test when immediately preceded by and
2020-08-04Improved Logger (#1292)mageven
* Logger class changes only Now compile-time checking is possible with the help of Nullable Value types. * Misc formatting * Manual optimizations PrintGuestLog PrintGuestStackTrace Surfaceflinger DequeueBuffer * Reduce SendVibrationXX log level to Debug * Add Notice log level This level is always enabled and used to print system info, etc... Also, rewrite LogColor to switch expression as colors are static * Unify unhandled exception event handlers * Print enabled LogLevels during init * Re-add App Exit disposes in proper order nit: switch case spacing * Revert PrintGuestStackTrace to Info logs due to #1407 PrintGuestStackTrace is now called in some critical error handlers so revert to old behavior as KThread isn't part of Guest. * Batch replace Logger statements
2020-07-30 Implement inline memory load/store exclusive and ordered (#1413)gdkchan
* Implement inline memory load/store exclusive * Fix missing REX prefix on 8-bits CMPXCHG * Increment PTC version due to bugfix * Remove redundant memory checks * Address PR feedback * Increment PPTC version
2020-07-30 Print guest stack trace on invalid memory access (#1407)gdkchan
* Print guest stack trace on invalid memory access * Improve XML docs
2020-07-30Use movd,movq for i32/64 VectorExtract %x, 0x0 (#1439)Ficture Seven
* Use movd,movq for i32/64 VectorExtract %x, 0x0 * Increment PPTC interval version * Use else-if instead - Address gdkchan's feedback. - Clean up Debug.Assert calls * Inline `count` expression into Debug.Assert Apparently the CoreCLR JIT will not eliminate this. :(
2020-07-25PPTC version increment (#1427)gdkchan
2020-07-24Refactor NativeContext (#1410)gdkchan
* Refactor NativeContext * Fix bugs * Use correct counts * Check index using register count constants
2020-07-19Implements some 32-bit instructions (VBIC, VTST, VSRA) (#1192)Valentin PONS
* Added some 32 bits instructions: * VBIC * VTST * VSRA * Incremented the PTC * Add tests and fix implementation * Fixed VBIC immediate opcode mapping * Hey hey! * Nit. Co-authored-by: gdkchan <gab.dark.100@gmail.com> Co-authored-by: LDj3SNuD <dvitiello@gmail.com> Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>
2020-07-17CPU: A32: Fix Vabs_V & Vneg_V (S8, S16, S32 & F32); add Tests. (#1394)LDj3SNuD
* Fix Vabs_V & Vneg_V (S8, S16, S32 & F32); add Tests. * Update Ptc.cs
2020-07-17CPU: A32: Add Vadd & Vsub Wide (S/U_8/16/32) Inst.s with Test. (#1390)LDj3SNuD
2020-07-15Fix Decode exception condition (#1377)Ficture Seven
2020-07-13Add Fmax/minv_V & S/Ushl_S Inst.s with Tests. Fix Maxps/d & Minps/d d… (#1335)LDj3SNuD
* Add Fmax/minv_V & S/Ushl_S Inst.s with Tests. Fix Maxps/d & Minps/d double zero sign handling. Allows better handling of NaNs. * Optimized EmitSse2VectorIsNaNOpF() for multiple uses per opF.
2020-07-13Add SSE4.2 Path for CRC32, add A32 variant, add tests for non-castagnoli ↵riperiperi
variants. (#1328) * Add CRC32 A32 instructions. * Fix CRC32 instructions. * Add CRC intrinsic and fast path. Loop is currently unrolled, will look into adding temp vars after tests are added. * Begin work on Crc tests * Fix SSE4.2 path for CRC32C, finialize tests. * Remove unused IR path. * Fix spacing between prefix checks. * This should be Src. * PTC Version * OpCodeTable Order * Integer check improvement. Value and Crc can be either 32 or 64 size. * This wasn't necessary... * If size is 3, value type must be I64. * Fix same src+dest handling for non crc intrinsics. * Pre-fix (ha) issue with vex encodings
2020-07-13Fix Node Uses/Assignments (#1376)Ficture Seven
* Fix Node Uses/Assignments * Bump PPTC Version Number Co-authored-by: jduncanator <1518948+jduncanator@users.noreply.github.com>
2020-07-13Fix folding of ConvertI64ToI32 imm64 (#1383)Ficture Seven
* Fix folding of ConvertI64ToI32 imm64 * Increment PTC internal version * Clean up
2020-07-11Mask shift constants on x86 backend (#1382)gdkchan
* Mask shift constants on x86 backendd * Version bump
2020-07-11Fold ZeroExtend8/16/32 imm32/64 (#1358)Ficture Seven
* Fold ZeroExtend8/16/32 imm32/64 * Increment PTC version
2020-07-11Fold ConvertI64ToI32 imm64 (#1359)Ficture Seven
* Fold ConvertI64ToI32 imm64 * Increment PTC version * Bump PPTC InternalVersion Co-authored-by: jduncanator <1518948+jduncanator@users.noreply.github.com>
2020-07-09Fix PPTC on Windows 7. (#1369)LDj3SNuD
* Fix PPTC on Windows 7. * Address gdkchan comment.
2020-06-24Fix VMVN (immediate), Add VPMIN, VPMAX, VMVN (register) (#1303)riperiperi
* Add Vmvn (register), tests for both Vmvn variants. * Add Vpmin, Vpmax, improve Non-FastFp accuracy for Vpadd * Rebase on top of PTC. * Add Nopcode * Increment PTC version. * Fix nits.
2020-06-18Increment PTC version (#1311)Thog
Fix issues caused by https://github.com/Ryujinx/Ryujinx/commit/2421186d974446ef4183420c50bc37e58d9fe213
2020-06-18Generalize tail continues (#1298)Ficture Seven
* Generalize tail continues * Fix DecodeBasicBlock `Next` and `Branch` would be null, which is not the state expected by the branch instructions. They end up branching or falling into a block which is never populated by the `Translator`. This causes an assert to be fired when building the CFG. * Clean up Decode overloads * Do not synchronize when branching into exit block If we're branching into an exit block, that exit block will tail continue into another translation which already has a synchronization. * Remove A32 predicate tail continue If `block` is not an exit block then the `block.Next` must exist (as per the last instruction of `block`). * Throw if decoded 0 blocks Address gdkchan's feedback * Rebuild block list instead of setting to null Address gdkchan's feedback
2020-06-16Add Profiled Persistent Translation Cache. (#769)LDj3SNuD
* Delete DelegateTypes.cs * Delete DelegateCache.cs * Add files via upload * Update Horizon.cs * Update Program.cs * Update MainWindow.cs * Update Aot.cs * Update RelocEntry.cs * Update Translator.cs * Update MemoryManager.cs * Update InstEmitMemoryHelper.cs * Update Delegates.cs * Nit. * Nit. * Nit. * 10 fewer MSIL bytes for us * Add comment. Nits. * Update Translator.cs * Update Aot.cs * Nits. * Opt.. * Opt.. * Opt.. * Opt.. * Allow to change compression level. * Update MemoryManager.cs * Update Translator.cs * Manage corner cases during the save phase. Nits. * Update Aot.cs * Translator response tweak for Aot disabled. Nit. * Nit. * Nits. * Create DelegateHelpers.cs * Update Delegates.cs * Nit. * Nit. * Nits. * Fix due to #784. * Fixes due to #757 & #841. * Fix due to #846. * Fix due to #847. * Use MethodInfo for managed method calls. Use IR methods instead of managed methods about Max/Min (S/U). Follow-ups & Nits. * Add missing exception messages. Reintroduce slow path for Fmov_Vi. Implement slow path for Fmov_Si. * Switch to the new folder structure. Nits. * Impl. index-based relocation information. Impl. cache file version field. * Nit. * Address gdkchan comments. Mainly: - fixed cache file corruption issue on exit; - exposed a way to disable AOT on the GUI. * Address AcK77 comment. * Address Thealexbarney, jduncanator & emmauss comments. Header magic, CpuId (FI) & Aot -> Ptc. * Adaptation to the new application reloading system. Improvements to the call system of managed methods. Follow-ups. Nits. * Get the same boot times as on master when PTC is disabled. * Profiled Aot. * A32 support (#897). * #975 support (1 of 2). * #975 support (2 of 2). * Rebase fix & nits. * Some fixes and nits (still one bug left). * One fix & nits. * Tests fix (by gdk) & nits. * Support translations not only in high quality and rejit. Nits. * Added possibility to skip translations and continue execution, using `ESC` key. * Update SettingsWindow.cs * Update GLRenderer.cs * Update Ptc.cs * Disabled Profiled PTC by default as requested in the past by gdk. * Fix rejit bug. Increased number of parallel translations. Add stack unwinding stuffs support (1 of 2). Nits. * Add stack unwinding stuffs support (2 of 2). Tuned number of parallel translations. * Restored the ability to assemble jumps with 8-bit offset when Profiled PTC is disabled or during profiling. Modifications due to rebase. Nits. * Limited profiling of the functions to be translated to the addresses belonging to the range of static objects only. * Nits. * Nits. * Update Delegates.cs * Nit. * Update InstEmitSimdArithmetic.cs * Address riperiperi comments. * Fixed the issue of unjustifiably longer boot times at the second boot than at the first boot, measured at the same time or reference point and with the same number of translated functions. * Implemented a simple redundant load/save mechanism. Halved the value of Decoder.MaxInstsPerFunction more appropriate for the current performance of the Translator. Replaced by Logger.PrintError to Logger.PrintDebug in TexturePool.cs about the supposed invalid texture format to avoid the spawn of the log. Nits. * Nit. Improved Logger.PrintError in TexturePool.cs to avoid log spawn. Added missing code for FZ handling (in output) for fp max/min instructions (slow paths). * Add configuration migration for PTC Co-authored-by: Thog <me@thog.eu>
2020-06-14VABS takes one input register, not two. (#1300)riperiperi
2020-06-05Faster crc32 implementation (#1294)merry
* Add Pclmulqdq intrinsic * Implement crc32 in terms of pclmulqdq * Address PR comments
2020-05-27Add FMaxNmV & FMinNmV Inst.s with Test. (#1279)LDj3SNuD
Successful unit testing on Windows (debug and release mode).