path: root/src/video_core/renderer_opengl
Age  Commit message  Author
2020-06-01  Merge pull request #3996 from ReinUsesLisp/front-faces  bunnei
fixed_pipeline_state,gl_rasterizer: Swap negative viewport checks for front faces
2020-05-31  Merge pull request #3958 from FernandoS27/gl-debug  bunnei
OpenGL: Enable Debug Context and Synchronous debugging when graphics debugging is enabled
2020-05-31  gl_device: Enable compute shaders for Intel proprietary drivers  Morph
Previously we were disabling compute shaders on Intel's proprietary driver due to broken compute. This has been fixed in the latest Intel drivers. Re-enable compute for Intel proprietary drivers and remove the check for broken compute.
2020-05-30  Merge pull request #3982 from ReinUsesLisp/membar-cts  bunnei
shader/other: Implement MEMBAR.CTS
2020-05-28  Merge pull request #3991 from ReinUsesLisp/depth-sampling  bunnei
texture_cache: Implement depth stencil texture swizzles
2020-05-28  Merge pull request #3993 from ReinUsesLisp/fix-zla  bunnei
gl_shader_manager: Unbind GLSL program when binding a host pipeline
2020-05-27  shader/other: Implement MEMBAR.CTS  ReinUsesLisp
This silences an assertion we were hitting and uses workgroup memory barriers when the game requests it.
2020-05-26  gl_texture_cache: Implement small texture view cache for swizzles  ReinUsesLisp
This fixes cases where the texture swizzle was applied twice on the same draw to a texture bound to two different slots.
2020-05-26  texture_cache: Implement depth stencil texture swizzles  ReinUsesLisp
Stop ignoring image swizzles on depth and stencil images. This doesn't fix a known issue on Xenoblade Chronicles 2 where an OpenGL texture changes swizzles twice before being used. A proper fix would be having a small texture view cache for this like we do on Vulkan.
2020-05-26  gl_rasterizer: Port front face flip check from Vulkan  ReinUsesLisp
While Vulkan was assuming we had no negative viewports, OpenGL code was assuming we had them. Port the old code from Vulkan to OpenGL, checking if the first viewport is negative before flipping faces. This is not a complete implementation since we only check for the first viewport to be negative. That said, unless a game is using Vulkan, OpenGL and NVN games should be fine here, and we can always compare with our Vulkan backend to see if there's a difference.
2020-05-26  Merge pull request #3981 from ReinUsesLisp/bar  bunnei
shader/other: Implement BAR.SYNC 0x0
2020-05-26  gl_shader_manager: Unbind GLSL program when binding a host pipeline  ReinUsesLisp
Fixes regression in Link's Awakening caused by 420cc13248350ef5c2d19e0b961cb4185cd16a8a
2020-05-25  Merge pull request #3978 from ReinUsesLisp/write-rz  bunnei
shader_decompiler: Visit source nodes even when they assign to RZ
2020-05-24  Merge pull request #3905 from FernandoS27/vulkan-fix  bunnei
Correct a series of crashes and instructions on Async GPU and Vulkan Pipeline
2020-05-24  Merge pull request #3964 from ReinUsesLisp/arb-integration  bunnei
renderer_opengl: Add assembly program code paths
2020-05-24  Merge pull request #3979 from ReinUsesLisp/thread-group  bunnei
shader/other: Implement thread comparisons (NV_shader_thread_group)
2020-05-21  shader/other: Implement BAR.SYNC 0x0  ReinUsesLisp
Trivially implement this particular case of BAR. Unless games use OpenCL or CUDA barriers, we shouldn't hit any other case here.
2020-05-21  shader/other: Implement thread comparisons (NV_shader_thread_group)  ReinUsesLisp
Hardware S2R special registers match gl_Thread*MaskNV. We can trivially implement these using Nvidia's extension on OpenGL or naively stub them with the ARB instructions to match. This might cause issues if the host device warp size doesn't match Nvidia's. That said, this is unlikely on proper shaders. Refer to the attached URL for more documentation about these flags: https://www.khronos.org/registry/OpenGL/extensions/NV/NV_shader_thread_group.txt
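As a rough illustration of what the gl_Thread*MaskNV values look like, the sketch below derives the five masks from a lane index. It assumes a 32-lane warp (the Nvidia case the commit message refers to), and the function names are hypothetical; they are not taken from yuzu or from the extension.

```cpp
#include <cstdint>

// Hypothetical helpers. Assumes a 32-lane warp; lane must be in [0, 31].

// Only the bit of the current lane is set.
constexpr std::uint32_t ThreadEqMask(unsigned lane) { return 1u << lane; }

// Bits of all lanes with a lower index than the current lane.
constexpr std::uint32_t ThreadLtMask(unsigned lane) { return ThreadEqMask(lane) - 1u; }

// Bits of all lanes with an index lower than or equal to the current lane.
constexpr std::uint32_t ThreadLeMask(unsigned lane) {
    return ThreadLtMask(lane) | ThreadEqMask(lane);
}

// Bits of all lanes with a higher index than the current lane.
constexpr std::uint32_t ThreadGtMask(unsigned lane) { return ~ThreadLeMask(lane); }

// Bits of all lanes with an index higher than or equal to the current lane.
constexpr std::uint32_t ThreadGeMask(unsigned lane) { return ~ThreadLtMask(lane); }
```

If the host warp size were not 32, these masks would no longer line up with what the guest shader expects, which is the mismatch the commit message warns about.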
2020-05-21  shader_decompiler: Visit source nodes even when they assign to RZ  ReinUsesLisp
Some operations like atomicMin were ignored because their return values were being stored to RZ. These operations have side effects, which were being ignored.
2020-05-21  buffer_cache: Use boost::intrusive::set for caching  ReinUsesLisp
Instead of using boost::icl::interval_map for caching, use boost::intrusive::set. interval_map is intended as a container where the keys can overlap with one another; we don't need this for caching buffers and a std::set-like data structure that allows us to search with lower_bound is enough.
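The lookup pattern described above can be sketched with a plain std::set standing in for boost::intrusive::set. This is only an illustration under stated assumptions: the Buffer type, the ByBegin comparator and FindBuffer are hypothetical names, and the cached buffers are assumed not to overlap.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical cached buffer covering the address range [begin, begin + size).
struct Buffer {
    std::uint64_t begin;
    std::uint64_t size;
    std::uint64_t End() const { return begin + size; }
};

// Order buffers by start address only; buffers are assumed not to overlap.
struct ByBegin {
    bool operator()(const Buffer& lhs, const Buffer& rhs) const {
        return lhs.begin < rhs.begin;
    }
};

using BufferCache = std::set<Buffer, ByBegin>;

// Returns the buffer containing addr, or nullptr when no cached buffer covers it.
const Buffer* FindBuffer(const BufferCache& cache, std::uint64_t addr) {
    // First buffer whose start address is >= addr.
    auto it = cache.lower_bound(Buffer{addr, 0});
    if (it != cache.end() && it->begin == addr) {
        return &*it;
    }
    // Otherwise only the previous buffer (start < addr) can contain addr.
    if (it == cache.begin()) {
        return nullptr;
    }
    --it;
    return addr < it->End() ? &*it : nullptr;
}

// Small usage example.
void Example() {
    const BufferCache cache{{0x1000, 0x100}, {0x2000, 0x80}};
    assert(FindBuffer(cache, 0x10FF) != nullptr);
    assert(FindBuffer(cache, 0x1100) == nullptr);
}
```

Because the keys never overlap, a single lower_bound search plus a check of the preceding element is enough, which is why an interval container is not required here.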
2020-05-19  renderer_opengl: Add assembly program code paths  ReinUsesLisp
Add code required to use OpenGL assembly programs based on NV_gpu_program5. Decompilation for ARB programs is intended to be added in a follow-up commit. This does **not** include ARB decompilation and it's not in a usable state.

The intention behind assembly programs is to reduce shader stutter significantly on drivers supporting NV_gpu_program5 (and other required extensions). Currently only Nvidia's proprietary driver supports these extensions.

Add a UI option, hidden for now, to avoid people enabling this option accidentally.

This code path has some limitations that OpenGL compatibility doesn't have:
- NV_shader_storage_buffer_object is limited to 16 entries for a single OpenGL context state (I don't know if this is an intended limitation, a specification issue or I am missing something). Currently causes issues on The Legend of Zelda: Link's Awakening.
- NV_parameter_buffer_object can't bind buffers using an offset different to zero. The workaround used is to copy to a temporary buffer (this doesn't happen often so it's not an issue).

On the other hand, it has the following advantages:
- Shaders build a lot faster.
- We have control over how floating point rounding is done over individual instructions (SPIR-V on Vulkan can't do this).
- Operations on shared memory can be unsigned and signed.
- Transform feedbacks are dynamic state (not yet implemented).
- Parameter buffers (uniform buffers) are per stage, matching NVN and hardware's behavior.
- The API to bind and create assembly programs makes sense, unlike ARB_separate_shader_objects.
2020-05-17  OpenGL: Enable Debug Context and Synchronous debugging when graphics debugging is enabled.  Fernando Sahmkow
This commit aims to ease debugging of driver crashes without having to modify existing code.
2020-05-13  Merge pull request #3899 from ReinUsesLisp/float-comparisons  bunnei
shader_ir: Add separate instructions for ordered and unordered comparisons and fix NE on GLSL
2020-05-10  gl_shader_decompiler: Properly emulate NaN behaviour on NE  ReinUsesLisp
"Not equal" operators on GLSL seem to behave as unordered when we expect an ordered comparison. Manually emulate this checking for LGE values (numbers, not-NaNs).
2020-05-09  VideoCore: Use SyncGuestMemory mechanism for Shader/Pipeline Cache invalidation.  Fernando Sahmkow
2020-05-09  Merge pull request #3839 from Morph1984/r8g8ui  Rodrigo Locatti
texture: Implement R8G8UI
2020-05-09  shader_ir: Separate floating-point comparisons into ordered and unordered  ReinUsesLisp
This allows us to use native SPIR-V instructions without having to manually check for NaN.
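The difference between the two comparison families can be sketched for "not equal" as follows. This is a minimal illustration, not yuzu's IR: the function names are hypothetical, and NaN is detected with the usual x != x test (which assumes IEEE semantics, i.e. no fast-math flags).

```cpp
#include <limits>

// Hypothetical helpers illustrating ordered vs. unordered "not equal".
constexpr float kNaN = std::numeric_limits<float>::quiet_NaN();

// NaN is the only value that compares unequal to itself.
constexpr bool IsNaN(float x) { return x != x; }

// Ordered: true only if neither operand is NaN and the values differ.
constexpr bool OrderedNotEqual(float a, float b) {
    return !IsNaN(a) && !IsNaN(b) && a != b;
}

// Unordered: true if either operand is NaN or the values differ.
constexpr bool UnorderedNotEqual(float a, float b) {
    return IsNaN(a) || IsNaN(b) || a != b;
}
```

The two functions only disagree when at least one operand is NaN, which is why a backend without separate instructions has to add the explicit NaN check by hand.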
2020-05-04  gl_rasterizer: Implement viewport swizzles with NV_viewport_swizzle  ReinUsesLisp
2020-05-03  Merge pull request #3808 from ReinUsesLisp/wait-for-idle  bunnei
{maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers
2020-05-02  Merge pull request #3693 from ReinUsesLisp/clean-samplers  bunnei
shader/texture: Support multiple unknown sampler properties
2020-04-30  texture: Implement R8G8UI  Morph
- Used by The Walking Dead: The Final Season
2020-04-30  Merge pull request #3807 from ReinUsesLisp/fix-depth-clamp  bunnei
maxwell_3d: Fix depth clamping register
2020-04-30  Merge pull request #3799 from ReinUsesLisp/iadd-cc  bunnei
shader: Implement P2R CC, IADD Rd.CC and IADD.X
2020-04-30  Merge pull request #3805 from ReinUsesLisp/preserve-contents  bunnei
texture_cache: Reintroduce preserve_contents accurately
2020-04-28  Merge pull request #3784 from ReinUsesLisp/shader-memory-util  bunnei
shader/memory_util: Deduplicate code
2020-04-28  {maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers  ReinUsesLisp
Drop MemoryBarrier from the buffer cache and use Maxwell3D's register WaitForIdle. To implement this on OpenGL we just call glMemoryBarrier with the necessary bits. Vulkan lacks this synchronization primitive, so we set an event and immediately wait for it. This is not a pretty solution, but it's what Vulkan can do without submitting the current command buffer to the queue (which ends up being more expensive on the CPU).
2020-04-27  maxwell_3d: Fix depth clamping register  ReinUsesLisp
Using deko3d as reference:
https://github.com/devkitPro/deko3d/blob/4e47ba0013552e592a86ab7a2510d1e7dadf236a/source/maxwell/gpu_3d_state.cpp#L42

We were using bits 3 and 4 to determine depth clamping, but these are the same both enabled and disabled:
    state->depthClampEnable ? 0x101A : 0x181D

The same happens on Nvidia's OpenGL driver, where they do something like this (default capabilities, GL 4.5 compatibility):
    (state & DEPTH_CLAMP) != 0 ? 0x201a : 0x281c

There's always a difference between the first bits in this register, but bit 11 is consistently disabled on both deko3d/NVN and OpenGL. This commit changes yuzu's behaviour to use bit 11 to determine depth clamping.
- Fixes depth issues on Super Mario Odyssey's intro.
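The reasoning above can be checked against the four register values quoted in the message. The sketch below is an illustration only: the helper names are hypothetical, and the polarity (bit 11 set means clamping is disabled) is an assumption inferred from those quoted values rather than from hardware documentation.

```cpp
#include <cstdint>

// Assumption: bit 11 is set when depth clamping is disabled.
constexpr std::uint32_t kDepthClampDisabledBit = 1u << 11;

// New check: depth clamping is enabled when bit 11 is clear.
constexpr bool IsDepthClampEnabled(std::uint32_t reg) {
    return (reg & kDepthClampDisabledBit) == 0;
}

// Old check, for comparison: extracts bits 3 and 4 of the register.
constexpr std::uint32_t OldClampBits(std::uint32_t reg) {
    return (reg >> 3) & 0x3u;
}
```

For all four quoted values (0x101A, 0x181D, 0x201A, 0x281C) bits 3 and 4 are both set, so the old check cannot tell the enabled and disabled states apart, while bit 11 differs between them.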
2020-04-26  texture_cache: Reintroduce preserve_contents accurately  ReinUsesLisp
This reverts commit 94b0e2e5dae4e0bd0021ac2d8fe1ff904a93ee69. preserve_contents proved to be a meaningful optimization. This commit reintroduces it, but properly implemented on OpenGL. We have to make sure the clear removes all the previous contents of the image. It's not currently implemented on Vulkan because we can do smarter things there, which are better introduced in a separate commit.
2020-04-26  Merge pull request #3753 from ReinUsesLisp/ac-vulkan  Rodrigo Locatti
{gl,vk}_rasterizer: Add lazy default buffer maker and use it for empty buffers
2020-04-26  shader/memory_util: Deduplicate code  ReinUsesLisp
Deduplicate code shared between vk_pipeline_cache and gl_shader_cache as well as shader decoder code. While we are at it, fix a bug in gl_shader_cache where compute shaders had a start offset of a stage shader.
2020-04-25  shader/arithmetic_integer: Implement CC for IADD  ReinUsesLisp
2020-04-23  shader_ir: Turn classes into data structures  ReinUsesLisp
2020-04-22  GL_Fence_Manager: use GL_TIMEOUT_IGNORED instead of a loop  Fernando Sahmkow
2020-04-22  Async GPU: Correct flushing behavior to be similar to old async GPU behavior.  Fernando Sahmkow
2020-04-22  ShaderCache/PipelineCache: Cache null shaders.  Fernando Sahmkow
2020-04-22  Address Feedback.  Fernando Sahmkow
2020-04-22  Fix GCC error.  Fernando Sahmkow
2020-04-22  QueryCache: Implement Async Flushes.  Fernando Sahmkow
2020-04-22  OpenGL: Guarantee writes to Buffers.  Fernando Sahmkow
2020-04-22  GPU: Implement Flush Requests for Async mode.  Fernando Sahmkow