path: root/src/video_core/renderer_opengl
Age  Commit message  Author
2020-06-16gl_device: Reserve at least 4 image bindings for fragment stageMorph
Due to the limitation of GL_MAX_IMAGE_UNITS being low (8) on Intel's and Nvidia's proprietary drivers, we have to reserve an appropriate amount of image bindings for each of the stages. So far games have been observed to use 4 image bindings on the fragment stage (Kirby Star Allies) and 1 on the vertex stage (TWD series). No games thus far in my limited testing used more than 4 images concurrently and across all currently active programs. This fixes shader compilation errors on Kirby Star Allies on OpenGL (GLSL/GLASM)
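The reservation described above can be sketched as handing out consecutive image binding ranges per stage. This is only an illustration: the stage order, the `kReserved` table and the `AssignImageBindings` helper are hypothetical names, and only the counts of 4 (fragment) and 1 (vertex) and the limit of 8 come from the commit message.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stage order; not taken from the emulator's source.
enum Stage : std::size_t { Vertex, TessControl, TessEval, Geometry, Fragment, NumStages };

struct ImageBindingRange {
    std::uint32_t base;   // first image unit owned by the stage
    std::uint32_t count;  // number of image units reserved for the stage
};

// 1 for vertex and 4 for fragment follow the commit message; the zeros are assumed.
constexpr std::array<std::uint32_t, NumStages> kReserved = {1, 0, 0, 0, 4};

// Hands out consecutive binding ranges. Returns false when the reservations
// do not fit into the limit reported by the driver (GL_MAX_IMAGE_UNITS).
inline bool AssignImageBindings(std::uint32_t max_image_units,
                                std::array<ImageBindingRange, NumStages>& out) {
    std::uint32_t next = 0;
    for (std::size_t stage = 0; stage < NumStages; ++stage) {
        out[stage] = {next, kReserved[stage]};
        next += kReserved[stage];
    }
    return next <= max_image_units;
}
```

With a limit of 8 the vertex stage would own unit 0 and the fragment stage units 1 to 4, leaving three units unassigned in this sketch.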
2020-06-15Merge pull request #4066 from ReinUsesLisp/shared-ptr-bufRodrigo Locatti
buffer_cache: Avoid passing references of shared pointers and misc style changes
2020-06-14Merge pull request #4064 from ReinUsesLisp/invalidate-buffersbunnei
gl_rasterizer: Mark vertex buffers as dirty after buffer cache invalidation
2020-06-13Merge pull request #4049 from ReinUsesLisp/separate-samplersbunnei
shader/texture: Join separate image and sampler pairs offline
2020-06-12Merge pull request #3986 from ReinUsesLisp/shader-cachebunnei
shader_cache: Implement a generic runtime shader cache
2020-06-11gl_arb_decompiler: Implement FSwizzleAddReinUsesLisp
2020-06-11gl_arb_decompiler: Implement an assembly shader decompilerReinUsesLisp
Emit code compatible with NV_gpu_program5. This should emit code compatible with Fermi, but it wasn't tested on that architecture. Pascal has some issues not present on Turing GPUs.
2020-06-09Merge pull request #4027 from ReinUsesLisp/3d-slicesbunnei
texture_cache: Implement rendering to 3D textures
2020-06-09buffer_cache: Avoid passing references of shared pointers and misc style changesReinUsesLisp
Instead of using a shared pointer as the template argument, use the underlying type and manage shared pointers explicitly. This can make it easier to remove shared pointers from the cache later. While we are at it, make some misc style changes and general improvements (like insert_or_assign instead of operator[] + operator=).
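For illustration, the insert_or_assign idiom mentioned above looks roughly like this with a standard map. `Buffer`, `BufferMap` and `Register` are made-up names for the sketch, not the cache's real types.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <utility>

// Hypothetical stand-in for a cached buffer object.
struct Buffer {
    std::uint64_t size;
};

using BufferMap = std::map<std::uint64_t, std::shared_ptr<Buffer>>;

// insert_or_assign inserts or overwrites in a single lookup and does not
// default-construct the mapped value first, unlike operator[] + operator=.
// Returns true if a new entry was created, false if one was overwritten.
inline bool Register(BufferMap& map, std::uint64_t addr, std::shared_ptr<Buffer> buffer) {
    const auto [it, inserted] = map.insert_or_assign(addr, std::move(buffer));
    return inserted;
}
```

The returned flag is a side benefit: the caller learns whether an existing entry was replaced without a separate find.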
2020-06-08gl_rasterizer: Mark vertex buffers as dirty after buffer cache invalidationReinUsesLisp
Vertex buffer bindings become invalid after the stream buffer is invalidated. We were originally doing this, but it got lost at some point. - Fixes Animal Crossing: New Horizons, but it affects everything.
2020-06-08Merge pull request #4040 from ReinUsesLisp/nv-transform-feedbackbunnei
gl_rasterizer: Use NV_transform_feedback for XFB on assembly shaders
2020-06-08Merge pull request #4052 from ReinUsesLisp/debug-outputbunnei
renderer_opengl: Only enable DEBUG_OUTPUT when graphics debugging is enabled
2020-06-08texture_cache: Handle 3D texture blits with one layerReinUsesLisp
2020-06-08texture_cache: Implement rendering to 3D texturesReinUsesLisp
This allows rendering to 3D textures with more than one slice. Applications are allowed to render to more than one slice of a texture using gl_Layer from a VTG shader. This also requires reworking how 3D texture collisions are handled. For now, this commit allows rendering to slices but not to miplevels. When a render target attempts to write to a mipmap, we fall back to the previous implementation (copying or flushing as needed). - Fixes color correction 3D textures on UE4 games (rainbow effects). - Allows Xenoblade games to render to 3D textures directly.
2020-06-07rasterizer_cache: Remove files and includesReinUsesLisp
The rasterizer cache is no longer used. Each cache has its own generic implementation optimized for the cached data.
2020-06-07vk_pipeline_cache: Use generic shader cacheReinUsesLisp
Trivially port the generic shader cache to Vulkan.
2020-06-07gl_shader_cache: Use generic shader cacheReinUsesLisp
Trivially port the generic shader cache to OpenGL.
2020-06-06gl_device: Black list NVIDIA 443.24 for fast buffer uploadsReinUsesLisp
Skip fast buffer uploads on the Nvidia 443.24 Vulkan beta driver on OpenGL. This driver throws the error quoted below when calling BufferSubData or BufferData on buffers that are candidates for fast constant buffer uploads. These uploads are the equivalent of push constants on Vulkan, except that they can access the full buffer. The error: Unknown internal debug message. The NVIDIA OpenGL driver has encountered an out of memory error. This application might behave inconsistently and fail. If this error persists on future drivers, we might have to look deeper into this issue. For now, we can black list it and log it as a temporary solution.
2020-06-05renderer_opengl: Only enable DEBUG_OUTPUT when graphics debugging is enabledReinUsesLisp
Avoids logging when it's not relevant. This can potentially reduce the driver's internal thread overhead.
2020-06-05shader/texture: Join separate image and sampler pairs offlineReinUsesLisp
Games using D3D idioms can join images and samplers when a shader executes, instead of baking them into a combined sampler image. This is also possible on Vulkan. One approach would be to use separate samplers on Vulkan and leave this unimplemented on OpenGL, but we can't do this because there's no consistent way of determining which constant buffer holds a sampler and which one an image. We could in theory find the first bit and if it's in the TIC area, it's an image; but this falls apart when an image or sampler handle uses an index of zero. The approach used is to track a LOP.OR operation (this is done at an IR level, not at an ISA level), track again the constant buffers used as sources and store this pair. Then, outside of shader execution, join the sampler and image pair with a bitwise or operation. This approach won't work on games that truly use separate samplers in a meaningful way. For example, pooling textures in a 2D array and determining at runtime what sampler to use. This invalidates OpenGL's disk shader cache :) - Used mostly by D3D ports to Switch
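A minimal sketch of the offline join step described above, assuming the tracked pair is stored as two constant buffer locations. `CbufLocation`, `SeparatePair`, `ConstBuffers` and `JoinHandle` are hypothetical names, and the example values simply assume the two halves occupy disjoint bits.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical record of where one half of a handle lives.
struct CbufLocation {
    std::uint32_t buffer;  // constant buffer index
    std::uint32_t offset;  // offset in 32-bit words
};

// The pair stored after tracking the sources of the LOP.OR operation.
struct SeparatePair {
    CbufLocation image;
    CbufLocation sampler;
};

// Stand-in for guest constant buffer contents, one vector of words per buffer.
using ConstBuffers = std::vector<std::vector<std::uint32_t>>;

// Joins the two halves outside of shader execution with a bitwise or,
// mirroring what the shader itself would have computed at runtime.
inline std::uint32_t JoinHandle(const ConstBuffers& cbufs, const SeparatePair& pair) {
    const std::uint32_t image = cbufs[pair.image.buffer][pair.image.offset];
    const std::uint32_t sampler = cbufs[pair.sampler.buffer][pair.sampler.offset];
    return image | sampler;
}
```

The joined value can then be treated like any combined handle, which is why this only works when the pair is fixed per shader rather than chosen at runtime.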
2020-06-04Merge pull request #4031 from Morph1984/fix-gs-outputsbunnei
gl_shader_decompiler: Fix geometry shader outputs on Intel drivers
2020-06-03gl_rasterizer: Use NV_transform_feedback for XFB on assembly shadersReinUsesLisp
NV_transform_feedback, NV_transform_feedback2 and ARB_transform_feedback3 with NV_transform_feedback interactions allow implementing transform feedbacks as dynamic state. Maxwell implements transform feedbacks as dynamic state, so using these extensions with TransformFeedbackStreamAttribsNV allows us to properly emulate transform feedbacks without having to recompile shaders when the state changes.
2020-06-02Merge pull request #4014 from ReinUsesLisp/astc-nvidiabunnei
gl_device: Avoid devices with CAVEAT_SUPPORT on ASTC
2020-06-02Merge pull request #4006 from ReinUsesLisp/squash-ubosbunnei
glsl: Squash constant buffers into a single SSBO when we hit the limit
2020-06-01gl_shader_decompiler: Declare gl_Layer and gl_ViewportIndex within gl_PerVertex for vertex and tessellation shadersMorph
2020-06-01gl_shader_decompiler: Fix geometry shader outputs for Intel driversMorph
On Intel's proprietary drivers, gl_Layer and gl_ViewportIndex are not allowed as members of the gl_PerVertex block, causing the shader to fail to compile. Fix this by declaring these variables outside of gl_PerVertex.
2020-06-01Merge pull request #3996 from ReinUsesLisp/front-facesbunnei
fixed_pipeline_state,gl_rasterizer: Swap negative viewport checks for front faces
2020-05-31gl_device: Avoid devices with CAVEAT_SUPPORT on ASTCReinUsesLisp
This avoids using Nvidia's ASTC decoder on OpenGL. The last time it was profiled, it was slower than yuzu's decoder. While we are at it, fix a bug in the texture cache when native ASTC is not supported.
2020-05-31glsl: Squash constant buffers into a single SSBO when we hit the limitReinUsesLisp
Avoids compilation errors at the cost of shader build times and runtime performance when a game hits the limit of uniform buffers we can use.
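One way to picture the squashing is as a layout computation that packs every constant buffer into one large buffer at aligned offsets. This is a sketch under assumptions, not the commit's actual code: `SquashedLayout`, `SquashConstBuffers` and the alignment parameter are all hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical result: where each constant buffer starts inside the single SSBO.
struct SquashedLayout {
    std::vector<std::uint32_t> offsets;  // byte offset of each constant buffer
    std::uint32_t total_size;            // size in bytes of the combined buffer
};

// Packs the buffers back to back, rounding each size up to `alignment` bytes
// so that every buffer starts on an aligned offset. `alignment` must be non-zero.
inline SquashedLayout SquashConstBuffers(const std::vector<std::uint32_t>& sizes,
                                         std::uint32_t alignment) {
    SquashedLayout layout{};
    std::uint32_t cursor = 0;
    for (const std::uint32_t size : sizes) {
        layout.offsets.push_back(cursor);
        cursor += (size + alignment - 1) / alignment * alignment;
    }
    layout.total_size = cursor;
    return layout;
}
```

A shader generated for this path would then add the per-buffer offset to every constant buffer read, which is consistent with the runtime cost the commit message mentions.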
2020-05-31Merge pull request #3958 from FernandoS27/gl-debugbunnei
OpenGL: Enable Debug Context and Synchronous debugging when graphics debugging is enabled
2020-05-31gl_device: Enable compute shaders for Intel proprietary driversMorph
Previously we were disabling compute shaders on Intel's proprietary driver due to broken compute. This has been fixed in the latest Intel drivers. Re-enable compute for Intel proprietary drivers and remove the check for broken compute.
2020-05-30Merge pull request #3982 from ReinUsesLisp/membar-ctsbunnei
shader/other: Implement MEMBAR.CTS
2020-05-28Merge pull request #3991 from ReinUsesLisp/depth-samplingbunnei
texture_cache: Implement depth stencil texture swizzles
2020-05-28Merge pull request #3993 from ReinUsesLisp/fix-zlabunnei
gl_shader_manager: Unbind GLSL program when binding a host pipeline
2020-05-27shader/other: Implement MEMBAR.CTSReinUsesLisp
This silences an assertion we were hitting and uses workgroup memory barriers when the game requests it.
2020-05-26gl_texture_cache: Implement small texture view cache for swizzlesReinUsesLisp
This fixes cases where the texture swizzle was applied twice on the same draw to a texture bound to two different slots.
2020-05-26texture_cache: Implement depth stencil texture swizzlesReinUsesLisp
Stop ignoring image swizzles on depth and stencil images. This doesn't fix a known issue on Xenoblade Chronicles 2 where an OpenGL texture changes swizzles twice before being used. A proper fix would be having a small texture view cache for this like we do on Vulkan.
2020-05-26gl_rasterizer: Port front face flip check from VulkanReinUsesLisp
While Vulkan was assuming we had no negative viewports, OpenGL code was assuming we had them. Port the old code from Vulkan to OpenGL, checking if the first viewport is negative before flipping faces. This is not a complete implementation since we only check for the first viewport to be negative. That said, unless a game is using Vulkan, OpenGL and NVN games should be fine here, and we can always compare with our Vulkan backend to see if there's a difference.
2020-05-26Merge pull request #3981 from ReinUsesLisp/barbunnei
shader/other: Implement BAR.SYNC 0x0
2020-05-26gl_shader_manager: Unbind GLSL program when binding a host pipelineReinUsesLisp
Fixes regression in Link's Awakening caused by 420cc13248350ef5c2d19e0b961cb4185cd16a8a
2020-05-25Merge pull request #3978 from ReinUsesLisp/write-rzbunnei
shader_decompiler: Visit source nodes even when they assign to RZ
2020-05-24Merge pull request #3905 from FernandoS27/vulkan-fixbunnei
Correct a series of crashes and instructions on Async GPU and Vulkan Pipeline
2020-05-24Merge pull request #3964 from ReinUsesLisp/arb-integrationbunnei
renderer_opengl: Add assembly program code paths
2020-05-24Merge pull request #3979 from ReinUsesLisp/thread-groupbunnei
shader/other: Implement thread comparisons (NV_shader_thread_group)
2020-05-21shader/other: Implement BAR.SYNC 0x0ReinUsesLisp
Trivially implement this particular case of BAR. Unless games use OpenCL or CUDA barriers, we shouldn't hit any other case here.
2020-05-21shader/other: Implement thread comparisons (NV_shader_thread_group)ReinUsesLisp
Hardware S2R special registers match gl_Thread*MaskNV. We can trivially implement these using Nvidia's extension on OpenGL or naively stubbing them with the ARB instructions to match. This might cause issues if the host device warp size doesn't match Nvidia's. That said, this is unlikely on proper shaders. Refer to the attached url for more documentation about these flags. https://www.khronos.org/registry/OpenGL/extensions/NV/NV_shader_thread_group.txt
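The five masks exposed by NV_shader_thread_group (gl_ThreadEqMaskNV, gl_ThreadGeMaskNV, gl_ThreadGtMaskNV, gl_ThreadLeMaskNV and gl_ThreadLtMaskNV) can be reproduced on the CPU to see what a stub has to return. The sketch below assumes a 32-wide warp; `ThreadMasks` and `MakeThreadMasks` are illustrative names only.

```cpp
#include <cstdint>

// One bit per thread of the warp, indexed by the thread's position in the warp.
struct ThreadMasks {
    std::uint32_t eq;  // only this thread's bit
    std::uint32_t ge;  // this thread and all higher ones
    std::uint32_t gt;  // all higher threads
    std::uint32_t le;  // this thread and all lower ones
    std::uint32_t lt;  // all lower threads
};

// Assumes a warp size of 32 and lane < 32.
constexpr ThreadMasks MakeThreadMasks(std::uint32_t lane) {
    const std::uint32_t eq = 1u << lane;
    const std::uint32_t lt = eq - 1u;
    const std::uint32_t le = lt | eq;
    return ThreadMasks{eq, ~lt, ~le, le, lt};
}
```

If the host warp size were not 32, these bit patterns would no longer line up with the guest's expectations, which is the mismatch the commit message warns about.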
2020-05-21shader_decompiler: Visit source nodes even when they assign to RZReinUsesLisp
Some operations like atomicMin were ignored because their return values were being stored to RZ. These operations have side effects, which were being dropped along with the result.
2020-05-21buffer_cache: Use boost::intrusive::set for cachingReinUsesLisp
Instead of using boost::icl::interval_map for caching, use boost::intrusive::set. interval_map is intended as a container where the keys can overlap with one another; we don't need this for caching buffers and a std::set-like data structure that allows us to search with lower_bound is enough.
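The kind of ordered lookup this relies on can be sketched with std::set standing in for boost::intrusive::set. The example uses upper_bound, the sibling of lower_bound, to find the block covering an address; `Block`, `BlockLess`, `BlockSet` and `FindBlock` are hypothetical names, and the blocks are assumed not to overlap.

```cpp
#include <cstdint>
#include <set>

// Hypothetical cached block covering the address range [begin, end).
struct Block {
    std::uint64_t begin;
    std::uint64_t end;
};

// Orders blocks by start address and allows searching by a raw address.
struct BlockLess {
    using is_transparent = void;
    bool operator()(const Block& a, const Block& b) const { return a.begin < b.begin; }
    bool operator()(const Block& a, std::uint64_t addr) const { return a.begin < addr; }
    bool operator()(std::uint64_t addr, const Block& b) const { return addr < b.begin; }
};

using BlockSet = std::set<Block, BlockLess>;

// Returns the block containing addr, or nullptr if no block covers it.
inline const Block* FindBlock(const BlockSet& blocks, std::uint64_t addr) {
    auto it = blocks.upper_bound(addr);  // first block starting after addr
    if (it == blocks.begin()) {
        return nullptr;  // every block starts after addr
    }
    --it;  // last block starting at or before addr
    return addr < it->end ? &*it : nullptr;
}
```

Because the keys never overlap, a single ordered search plus one bounds check is enough, which is the point the commit message makes against interval_map.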
2020-05-19renderer_opengl: Add assembly program code pathsReinUsesLisp
Add the code required to use OpenGL assembly programs based on NV_gpu_program5. This does **not** include ARB decompilation, which is intended to be added in a follow-up commit, so this code path is not yet in a usable state.
The intention behind assembly programs is to reduce shader stutter significantly on drivers supporting NV_gpu_program5 (and other required extensions). Currently only Nvidia's proprietary driver supports these extensions. Add a UI option, hidden for now to avoid people enabling it accidentally.
This code path has some limitations that OpenGL compatibility doesn't have:
- NV_shader_storage_buffer_object is limited to 16 entries for a single OpenGL context state (I don't know if this is an intended limitation, a specification issue or I am missing something). This currently causes issues on The Legend of Zelda: Link's Awakening.
- NV_parameter_buffer_object can't bind buffers using an offset different from zero. The workaround used is to copy to a temporary buffer (this doesn't happen often, so it's not an issue).
On the other hand, it has the following advantages:
- Shaders build a lot faster.
- We have control over how floating point rounding is done over individual instructions (SPIR-V on Vulkan can't do this).
- Operations on shared memory can be unsigned and signed.
- Transform feedbacks are dynamic state (not yet implemented).
- Parameter buffers (uniform buffers) are per stage, matching NVN and hardware's behavior.
- The API to bind and create assembly programs makes sense, unlike ARB_separate_shader_objects.
2020-05-17OpenGL: Enable Debug Context and Synchronous debugging when graphics debugging is enabledFernando Sahmkow
This commit aims to ease debugging of driver crashes without having to modify existing code.