path: root/Ryujinx.Graphics.GAL
Age Commit message Author
2022-01-11 Fix render target clear when sizes mismatch (#2994) gdkchan
2022-01-09 Texture Sync, incompatible overlap handling, data flush improvements. (#2971) riperiperi
* Initial test for texture sync * WIP new texture flushing setup * Improve rules for incompatible overlaps Fixes a lot of issues with Unreal Engine games. Still a few minor issues (some caused by dma fast path?) Needs docs and cleanup. * Cleanup, improvements Improve rules for fast DMA * Small tweak to group together flushes of overlapping handles. * Fixes, flush overlapping texture data for ASTC and BC4/5 compressed textures. Fixes the new Life is Strange game. * Flush overlaps before init data, fix 3d texture size/overlap stuff * Fix 3D Textures, faster single layer flush Note: nosy people can no longer merge this with Vulkan. (unless they are nosy enough to implement the new backend methods) * Remove unused method * Minor cleanup * More cleanup * Use the More Fun and Hopefully No Driver Bugs method for getting compressed tex too This one's for metro * Address feedback, ASTC+ETC to FormatClass * Change offset to use Span slice rather than IntPtr Add * Fix this too
2022-01-08 Add support for render scale to vertex stage. (#2763) riperiperi
* Add support for render scale to vertex stage. Occasionally games read off textureSize on the vertex stage to inform the fragment shader what size a texture is without querying in there. Scales were not present in the vertex shader to correct the sizes, so games were providing the raw upscaled texture size to the fragment shader, which was incorrect. One downside is that the fragment and vertex support buffer description must be identical, so the full size scales array must be defined when used. I don't think this will have an impact though. Another is that the fragment texture count must be updated when vertex shader textures are used. I'd like to correct this so that the update is folded into the update for the scales. Also cleans up a bunch of things, like it making no sense to call CommitRenderScale for each stage. Fixes render scale causing a weird offset bloom in Super Mario Party and Clubhouse Games. Clubhouse Games still has a pixelated look in a number of its games due to something else it does in the shader. * Split out support buffer update, lazy updates. * Commit support buffer before compute dispatch * Remove unnecessary qualifier. * Address Feedback
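As a rough illustration of the size correction described in the entry above, the following Python sketch maps an upscaled host texture size back to the size the guest expects. It is not the emulator's C# or shader code; the function name and the rounding choice are assumptions made for the example.

```python
def guest_texture_size(host_size, render_scale):
    """Map the size of an upscaled host texture back to the guest-visible size."""
    if render_scale <= 0:
        raise ValueError("render_scale must be positive")
    # Divide each dimension by the scale; never report a dimension below 1.
    return tuple(max(1, round(dim / render_scale)) for dim in host_size)


# A 1280x720 guest texture rendered at 2x is 2560x1440 on the host.
print(guest_texture_size((2560, 1440), 2.0))
```

Without the division, a vertex shader reading the texture size would pass the raw upscaled size on to the fragment stage, which is the mismatch the entry describes.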
2021-12-31 Force crop when presentation cached texture size mismatches (#2957) gdkchan
2021-12-30 Add support for the R4G4 texture format (#2956) gdkchan
2021-11-28 infra: Migrate to .NET 6 (#2829) Mary
* infra: Migrate to .NET 6 * Rollback version naming change * Workaround .NET 6 ZipArchive API issues * ci: Switch to VS 2022 for AppVeyor CI is now ready for .NET 6 * Suppress WebClient warning in DoUpdateWithMultipleThreads * Attempt to workaround System.Drawing.Common changes on 6.0.0 * Change keyboard rendering from System.Drawing to ImageSharp * Make the software keyboard renderer multithreaded * Bump ImageSharp version to 1.0.4 to fix a bug in Image.Load * Add fallback fonts to the keyboard renderer * Fix warnings * Address caian's comment * Clean up linux workaround as it's unneeded now * Update readme Co-authored-by: Caian Benedicto <caianbene@gmail.com>
2021-11-10 Implement DrawTexture functionality (#2747) gdkchan
* Implement DrawTexture functionality * Non-NVIDIA support * Disable some features that should not affect draw texture (slow path) * Remove space from shader source * Match 2D engine names * Fix resolution scale and add missing XML docs * Disable transform feedback for draw texture fallback
2021-11-03 Clamp number of mipmap levels to avoid API errors due to invalid textures (#2808) gdkchan
2021-10-28 Add support for fragment shader interlock (#2768) gdkchan
* Support coherent images * Add support for fragment shader interlock * Change to tree based match approach * Refactor + check for branch targets and external registers * Make detection more robust * Use Intel fragment shader ordering if interlock is not available, use nothing if both are not available * Remove unused field
2021-10-18 Initial tessellation shader support (#2534) gdkchan
* Initial tessellation shader support * Nits * Re-arrange built-in table * This is not needed anymore * PR feedback
2021-09-19 Use shader subgroup extensions if shader ballot is not supported (#2627) gdkchan
* Use shader subgroup extensions if shader ballot is not supported * Shader cache version bump + cleanup * The type is still required on the table
2021-08-27 Add a Multithreading layer for the GAL, multi-thread shader compilation at runtime (#2501) riperiperi
* Initial Implementation About as fast as nvidia GL multithreading, can be improved with faster command queuing. * Struct based command list Speeds up a bit. Still a lot of time lost to resource copy. * Do shader init while the render thread is active. * Introduce circular span pool V1 Ideally should be able to use structs instead of references for storing these spans on commands. Will try that next. * Refactor SpanRef some more Use a struct to represent SpanRef, rather than a reference. * Flush buffers on background thread * Use a span for UpdateRenderScale. Much faster than copying the array. * Calculate command size using reflection * WIP parallel shaders * Some minor optimisation * Only 2 max refs per command now. The command with 3 refs is gone. :relieved: * Don't cast on the GPU side * Remove redundant casts, force sync on window present * Fix Shader Cache * Fix host shader save. * Fixup to work with new renderer stuff * Make command Run static, use array of delegates as lookup Profile says this takes less time than the previous way. * Bring up to date * Add settings toggle. Fix Multithreading Off mode. * Fix warning. * Release tracking lock for flushes * Fix Conditional Render fast path with threaded gal * Make handle iteration safe when releasing the lock This is mostly temporary. * Attempt to set backend threading on driver Only really works on nvidia before launching a game. * Fix race condition with BufferModifiedRangeList, exceptions in tracking actions * Update buffer set commands * Some cleanup * Only use stutter workaround when using opengl renderer non-threaded * Add host-conditional reservation of counter events There has always been the possibility that conditional rendering could use a query object just as it is disposed by the counter queue. This change makes it so that when the host decides to use host conditional rendering, the query object is reserved so that it cannot be deleted.
Counter events can optionally start reserved, as the threaded implementation can reserve them before the backend creates them, and there would otherwise be a short amount of time where the counter queue could dispose the event before a call to reserve it could be made. * Address Feedback * Make counter flush tracked again. Hopefully does not cause any issues this time. * Wait for FlushTo on the main queue thread. Currently assumes only one thread will want to FlushTo (in this case, the GPU thread) * Add SDL2 headless integration * Add HLE macro commands. Co-authored-by: Mary <mary@mary.zone>
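The reservation idea in the entry above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the project's C# implementation: the class and method names (`CounterEvent`, `reserve`, `release`, `try_dispose`) are hypothetical, and a real backend would guard an actual host query object rather than a flag.

```python
import threading


class CounterEvent:
    """Hypothetical stand-in for a counter event that owns a host query object."""

    def __init__(self, start_reserved=False):
        self._lock = threading.Lock()
        # Starting reserved closes the window between creation and reservation.
        self._reserved = start_reserved
        self.disposed = False

    def reserve(self):
        """Claim the event for host conditional rendering; fails once disposed."""
        with self._lock:
            if self.disposed:
                return False
            self._reserved = True
            return True

    def release(self):
        """Drop the reservation so the counter queue may dispose the event."""
        with self._lock:
            self._reserved = False

    def try_dispose(self):
        """Called by the counter queue; refuses while the event is reserved."""
        with self._lock:
            if self._reserved or self.disposed:
                return False
            self.disposed = True
            return True
```

An event created with `start_reserved=True` cannot be disposed until `release()` is called, which mirrors the case where the threaded layer reserves an event before the backend has created it.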
2021-08-26 Add support for HLE macros and accelerate MultiDrawElementsIndirectCount #2 (#2557) mpnico
* Add support for HLE macros and accelerate MultiDrawElementsIndirectCount * Add missing barrier * Fix index buffer count * Add support check for each macro hle before use * Add missing xml doc Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2021-08-20 Swap BGR components for 16-bit BGR texture formats (#2567) gdkchan
2021-08-11 Workaround for Intel FrontFacing built-in variable bug (#2540) gdkchan
2021-08-11 Replace BGRA and scale uniforms with a uniform block (#2496) gdkchan
* Replace BGRA and scale uniforms with a uniform block * Setting the data again on program change is no longer needed * Optimize and resolve some warnings * Avoid redundant support buffer updates * Some optimizations to BindBuffers (now inlined) * Unify render scale arrays
2021-07-19 Return mapped buffer pointer directly for flush, WriteableRegion for textures (#2494) riperiperi
* Return mapped buffer pointer directly for flush, WriteableRegion for textures A few changes here to generally improve performance, even for platforms not using the persistent buffer flush. - Texture and buffer flush now return a ReadOnlySpan<byte>. It's guaranteed that this span is pinned in memory, but it will be overwritten on the next flush from that thread, so it is expected that the data is used before calling again. - As a result, persistent mappings no longer copy to a new array - rather the persistent map is returned directly as a Span<>. A similar host array is used for the glGet flushes instead of allocating new arrays each time. - Texture flushes now do their layout conversion into a WriteableRegion when the texture is not MultiRange, which allows the flush to happen directly into guest memory rather than into a temporary span, then copied over. This avoids another copy when doing layout conversion. Overall, this saves 1 data copy for buffer flush, 1 copy for linear textures with matching source/target stride, and 2 copies for block textures or linear textures with mismatching strides. * Fix tests * Fix array pointer for Mesa/Intel path * Address some feedback * Update method for getting array pointer.
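The "use the data before flushing again" contract described in the entry above can be illustrated with a small Python sketch. It uses a read-only `memoryview` over a reused `bytearray` as a loose analogue of returning a span over shared storage; the `FlushBuffer` class is hypothetical and does not reflect the project's actual types.

```python
class FlushBuffer:
    """Hypothetical reusable staging buffer, one per flushing thread."""

    def __init__(self):
        self._storage = bytearray()

    def flush(self, data: bytes) -> memoryview:
        # Allocate only when the existing storage is too small.
        if len(self._storage) < len(data):
            self._storage = bytearray(len(data))
        # Overwrite the shared storage in place; no per-flush allocation.
        self._storage[:len(data)] = data
        return memoryview(self._storage)[:len(data)].toreadonly()


buf = FlushBuffer()
first = buf.flush(b"AAAA")
kept = bytes(first)          # copy out before the next flush
second = buf.flush(b"BBBB")  # reuses the same storage
print(kept, bytes(first), bytes(second))
```

After the second flush, `first` shows the new contents because both views share one buffer, so a caller that needs the old data has to copy it out first, as `kept` does.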
2021-06-28 Add Screenshot Feature (#2354) emmauss
* Add internal screenshot capabilities * update version notice
2021-06-25 Add support for custom line widths (#2406) gdkchan
2021-06-25 Fix texture sampling with depth compare and LOD level or bias (#2404) gdkchan
* Fix texture sampling with depth compare and LOD level or bias * Shader cache version bump * nit: Sorting
2021-05-19 Merge pull request #2177 from riperiperi/feature/parallel-shader-cache EmulationFanatic
Allow parallel shader compilation when loading a shader cache
2021-05-16 Use copy dependencies for the Intel/AMD view format workaround (#2144) riperiperi
* This might help AMD a bit * Removal of old workaround.
2021-04-18 Implement parallel host shader cache compilation. riperiperi
2021-03-02 Texture Cache: "Texture Groups" and "Texture Dependencies" (#2001) riperiperi
* Initial implementation (3d tex mips broken) This works rather well for most games, just need to fix 3d texture mips. * Cleanup * Address feedback * Copy Dependencies and various other fixes * Fix layer/level offset for copy from view<->view. * Remove dirty flag from dependency The dirty flag behaviour is not needed - DeferredCopy is all we need. * Fix tracking mip slices. * Propagate granularity (fix astral chain) * Address Feedback pt 1 * Save slice sizes as part of SizeInfo * Fix nits * Fix disposing multiple dependencies causing a crash This list is obviously modified when removing dependencies, so create a copy of it.
2021-02-08 Implement ETC2 (RGB) texture format (#2000) gdkchan
* Implement ETC2 format * Fix component counts for compressed formats
2021-01-27 Avoid some redundant GL calls (#1958) gdkchan
2021-01-17 Implement lazy flush-on-read for Buffers (SSBO/Copy) (#1790) riperiperi
* Initial implementation of buffer flush (VERY WIP) * Host shaders need to be rebuilt for the SSBO write flag. * New approach with reserved regions and gl sync * Fix a ton of buffer issues. * Remove unused buffer unmapped behaviour * Revert "Remove unused buffer unmapped behaviour" This reverts commit f1700e52fb8760180ac5e0987a07d409d1e70ece. * Delete modified ranges on unmap Fixes potential crashes in Super Smash Bros, where a previously modified range could lie on either side of an unmap. * Cache some more delegates. * Dispose Sync on Close * Also create host sync for GPFifo syncpoint increment. * Copy buffer optimization, add docs * Fix race condition with OpenGL Sync * Enable read tracking on CommandBuffer, insert syncpoint on WaitForIdle * Performance: Only flush individual pages of SSBO at a time This avoids flushing large amounts of data when only a small amount is actually used. * Signal Modified rather than flushing after clear * Fix some docs and code style. * Introduce a new test for tracking memory protection. Successfully demonstrates that the bug causing write protection to be cleared by a read action has been fixed. (these tests fail on master) * Address Comments * Add host sync for SetReference This ensures that any indirect draws will correctly flush any related buffer data written before them. Fixes some flashing and misplaced world geometry in MH rise. * Make PageAlign static * Re-enable read tracking, for reads.
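The "only flush individual pages" step in the entry above relies on widening an accessed range to page boundaries. A minimal Python sketch of that arithmetic follows; the 4 KiB page size and the `page_align` signature are assumptions for illustration and are not taken from the project's code.

```python
PAGE_SIZE = 0x1000  # assumed 4 KiB page granularity


def page_align(address, size, page_size=PAGE_SIZE):
    """Expand [address, address + size) to whole pages; returns (start, length)."""
    mask = ~(page_size - 1)
    start = address & mask                          # round start down
    end = (address + size + page_size - 1) & mask   # round end up
    return start, end - start


# A 16-byte read inside one page flushes just that page.
print(page_align(0x1234, 0x10))
```

A read that straddles a boundary, such as `page_align(0x1FF0, 0x20)`, widens to two pages, which is still far less than flushing the whole buffer.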
2021-01-13 Implement clear buffer (fast path) (#1902) gdkchan
* Implement clear buffer (fast path) * Remove blank line
2021-01-05 gpu: Implement missing texture formats (#1867) Ac_K
* gpu: Implement Etc2Rgba texture format * Add more format * Fix wrong pixel format
2020-12-15 gui/gpu: Implement setting and toggle for Aspect Ratio (#1777) Ac_K
* gui/gpu: Implement setting and toggle for Aspect Ratio * address gdkchan feedback and add 16:10 * fix config.json file * Fix rebase * Address gdkchan feedback * Address rip feedback * Fix aspectWidth
2020-11-20 Allow copy destination to have a different scale from source (#1711) riperiperi
* Allow copy destination to have a different scale from source Will result in more scaled copy destinations, but allows scaling in some games that copy textures to the output framebuffer. * Support copying multiple levels/layers Uses glFramebufferTextureLayer to copy multiple layers, copies levels individually (and scales the regions). Remove CopyArrayScaled, since the backend copy handles it now.
2020-11-15 infra: Migrate to .NET 5 (#1694) Mary
* infra: Migrate to .NET 5 This migrates projects and CI to .NET 5 * Remove language version restrictions (now on 9.0 by default) * infra: pin .NET 5 to avoid later issues * infra: Cleanup csproj files * infra: update dependencies * infra: Add temporary workaround for a bug in Vector128.Create see https://github.com/dotnet/runtime/issues/44704 for more information
2020-11-13 Salieri: shader cache (#1701) Mary
Here comes Salieri, my implementation of a disk shader cache! "I'm sure you know why I named it that." "It doesn't really mean anything." This implementation collects shaders at runtime and caches them to be later compiled when starting a game.
2020-11-08 Use explicit buffer and texture bindings on shaders (#1666) gdkchan
* Use explicit buffer and texture bindings on shaders * More XML docs and other nits
2020-11-02 Add seamless cubemap flag in sampler parameters. (#1658) riperiperi
* Add seamless cubemap flag in sampler parameters. * Check for the extension
2020-11-02 Support res scale on images, correctly blacklist for SUST, move logic out of backend. (#1657) riperiperi
* Support res scale on images, correctly blacklist for SUST, move logic out of backend. * Fix Typo
2020-11-01 Support 3D BC4 and BC5 compressed textures (#1655) gdkchan
* Support 3D BC4 and BC5 compressed textures * PR feedback * Fix some typos
2020-10-25 Fix transform feedback errors caused by host pause/resume and multiple uses (#1634) gdkchan
* Fix transform feedback errors caused by host pause/resume * Fix TFB being used as something else issue with copies * This is supposed to be StreamCopy
2020-10-20 Fix image binding format (#1625) gdkchan
* Fix image binding format * XML doc
2020-10-16 Memory Read/Write Tracking using Region Handles (#1272) riperiperi
* WIP Range Tracking - Texture invalidation seems to have large problems - Buffer/Pool invalidation may have problems - Mirror memory tracking puts an additional `add` in compiled code, we likely just want to make HLE access slower if this is the final solution. - Native project is in the messiest possible location. - [HACK] JIT memory access always uses native "fast" path - [HACK] Trying some things with texture invalidation and views. It works :) Still a few hacks, messy things, slow things More work in progress stuff (also move to memory project) Quite a bit faster now. - Unmapping GPU VA and CPU VA will now correctly update write tracking regions, and invalidate textures for the former. - The Virtual range list is now non-overlapping like the physical one. - Fixed some bugs where regions could leak. - Introduced a weird bug that I still need to track down (consistent invalid buffer in MK8 ribbon road) Move some stuff. I think we'll eventually just put the dll and so for this in a nuget package. Fix rebase. [WIP] MultiRegionHandle variable size ranges - Avoid reprotecting regions that change often (needs some tweaking) - There's still a bug in buffers, somehow. - Might want different api for minimum granularity Fix rebase issue Commit everything needed for software only tracking. Remove native components. Remove more native stuff. Cleanup Use a separate window for the background context, update opentk. (fixes linux) Some experimental changes Should get things working up to scratch - still need to try some things with flush/modification and res scale. Include address with the region action. Initial work to make range tracking work Still a ton of bugs Fix some issues with the new stuff. * Fix texture flush instability There's still some weird behaviour, but it's much improved without this. 
(textures with cpu modified data were flushing over it) * Find the destination texture for Buffer->Texture full copy Greatly improves performance for nvdec videos (with range tracking) * Further improve texture tracking * Disable Memory Tracking for view parents This is a temporary approach to better match behaviour on master (where invalidations would be soaked up by views, rather than trigger twice) The assumption is that when views are created to a texture, they will cover all of its data anyways. Of course, this can easily be improved in future. * Introduce some tracking tests. WIP * Complete base tests. * Add more tests for multiregion, fix existing test. * Cleanup Part 1 * Remove unnecessary code from memory tracking * Fix some inconsistencies with 3D texture rule. * Add dispose tests. * Use a background thread for the background context. Rather than setting and unsetting a context as current, doing the work on a dedicated thread with signals seems to be a bit faster. Also nerf the multithreading test a bit. * Copy to texture with matching alignment This extends the copy to work for some videos with unusual size, such as tutorial videos in SMO. It will only occur if the destination texture already exists at XCount size. * Track reads for buffer copies. Synchronize new buffers before copying overlaps. * Remove old texture flushing mechanisms. Range tracking all the way, baby. * Wake the background thread when disposing. Avoids a deadlock when games are closed. * Address Feedback 1 * Separate TextureCopy instance for background thread Also `BackgroundContextWorker.InBackground` for a more sensible identifier for if we're in a background thread. * Add missing XML docs. * Address Feedback * Maybe I should start drinking coffee. * Some more feedback. * Remove flush warning, Refocus window after making background context
2020-10-13 Fix incorrect GPU GL blend func values (#1612) gdkchan
2020-09-19 Better viewport flipping and depth mode detection method (#1556) gdkchan
* Use a better viewport flipping approach * New approach to detect depth mode * nit: Sort method on the OpenGL backend * Adjust spacing on comment * Unswap near and far parameters based on ScaleZ
2020-09-11 Allow swizzles to match with "undefined" components (#1538) riperiperi
* Add swizzle matching rules. Improves rules which try to match incompatible formats as perfect, such as D32 float -> R32 float. Remove Format.HasOneComponent, since this information is now available via the FormatInfo struct. * Fix this rule. * Update component counts for depth formats.
2020-09-10 Texture/Buffer Memory Management Improvements (#1408) riperiperi
* Initial implementation. Still pending better valid-overlap handling, disposed pool, compressed format flush fix. * Very messy backend resource cache. * Oops * Dispose -> Release * Improve Release/Dispose. * More rule refinement. * View compatibility levels as an enum - you can always know if a view is only copy compatible. * General cleanup. Use locking on the resource cache, as it is likely to be used by other threads in future. * Rename resource cache to resource pool. * Address some of the smaller nits. * Fix regression with MK8 lens flare Texture flushes done the old way should trigger memory tracking. * Use TextureCreateInfo as a key. It now implements IEquatable and generates a hashcode based on width/height. * Fix size change for compressed+non-compressed view combos. Before, this could set either the compressed or non compressed texture with a size with the wrong size, depending on which texture had its size changed. This caused exceptions when flushing the texture. Now it correctly takes the block size into account, assuming that these textures are only related because a pixel in the non-compressed texture represents a block in the compressed one. * Implement JD's suggestion for HashCode Combine Co-authored-by: jduncanator <1518948+jduncanator@users.noreply.github.com> * Address feedback * Address feedback. Co-authored-by: jduncanator <1518948+jduncanator@users.noreply.github.com>
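The entry above mentions keying the resource pool on the texture creation info, with equality over the full description and a hash code derived from width and height. A rough Python analogue is sketched below; `TextureKey`, its fields, and `ResourcePool` are hypothetical names chosen for the example, and the real type is a C# struct implementing `IEquatable`. Locking, which the entry also mentions, is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True, eq=True)
class TextureKey:
    """Hypothetical creation-info key: equality uses all fields."""
    width: int
    height: int
    format: str

    def __hash__(self):
        # Cheap hash on width/height only; collisions are resolved by __eq__.
        return hash((self.width, self.height))


class ResourcePool:
    """Keeps released textures around so a matching request can reuse them."""

    def __init__(self):
        self._free = {}

    def release(self, key, texture):
        self._free.setdefault(key, []).append(texture)

    def acquire(self, key):
        bucket = self._free.get(key)
        return bucket.pop() if bucket else None
```

Two keys with the same dimensions but different formats land in the same hash bucket yet compare unequal, so a request for one format never receives a pooled texture of another.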
2020-08-02 Facilitate OpenGL debug logging via GUI (#1373) mageven
* Allow printing GL Debug logs with GUI options Improve GL Debugger Make the new option persistent Address gdkchan's comments - Rename enum to GraphicsDebugLevel - Move Debugger Init to Renderer Init - Fix formatting * nit: newlines
2020-07-28 Implement alpha test using legacy functions (#1426) gdkchan
2020-07-26 Implement BGRA texture support (#1418) gdkchan
* Implement BGRA texture support * Missing AppendLine * Remove empty lines * Address PR feedback
2020-07-20 GL: Implement more Point parameters (#1399) mageven
* Fix GL_INVALID_VALUE on glPointSize calls * Implement more of Point primitive state * Use existing Origin enum
2020-07-15 Initial transform feedback support (#1370) gdkchan
* Initial transform feedback support * Some nits and fixes * Update ReportCounterType and Write method * Can't change shader or TFB bindings while TFB is active * Fix geometry shader input names with new naming
2020-07-10 Implement Logical Operation registers and functionality (#1380) riperiperi
* Implement Logical Operation registers and functionality. * Address Feedback 1