| Age | Commit message | Author |
|
shader/other: Implement MEMBAR.CTS
|
|
|
|
maxwell_3d: Reduce severity of logs that can be spammed
|
|
texture_cache: Implement depth stencil texture swizzles
|
|
These logs were killing performance on some games when they were
spammed. Reduce them to Debug severity.
|
|
gl_shader_manager: Unbind GLSL program when binding a host pipeline
|
|
maxwell_to_vk: Add format B8G8R8A8_SRGB and add Attachable capability for B8G8R8A8_UNORM
|
|
This silences an assertion we were hitting and uses workgroup memory
barriers when the game requests it.
|
|
Null texture cubes were not considered arrays, causing issues on Vulkan
and OpenGL when creating views.
|
|
This fixes cases where the texture swizzle was applied twice on the same
draw to a texture bound to two different slots.
|
|
Stop ignoring image swizzles on depth and stencil images.
This doesn't fix a known issue on Xenoblade Chronicles 2 where an OpenGL
texture changes swizzles twice before being used. A proper fix would be
having a small texture view cache for this like we do on Vulkan.
|
|
shader/other: Implement BAR.SYNC 0x0
|
|
shader/memory: Implement non-addition operations in RED
|
|
Fixes regression in Link's Awakening caused by 420cc13248350ef5c2d19e0b961cb4185cd16a8a
|
|
shader_decompiler: Visit source nodes even when they assign to RZ
|
|
Correct a series of crashes and instructions on Async GPU and Vulkan Pipeline
|
|
renderer_opengl: Add assembly program code paths
|
|
shader/other: Implement thread comparisons (NV_shader_thread_group)
|
|
Trivially implement this particular case of BAR. Unless games use OpenCL
or CUDA barriers, we shouldn't hit any other case here.
|
|
Trivially implement these instructions. They are used in Astral Chain.
|
|
Hardware S2R special registers match gl_Thread*MaskNV. We can trivially
implement these using Nvidia's extension on OpenGL or naively stubbing
them with the ARB instructions to match. This might cause issues if the
host device warp size doesn't match Nvidia's. That said, this is
unlikely on proper shaders.
Refer to the URL below for more documentation about these flags.
https://www.khronos.org/registry/OpenGL/extensions/NV/NV_shader_thread_group.txt
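As a rough illustration of what these masks contain, here is a minimal sketch that computes the five lane masks for one thread, assuming a 32-wide warp. `LaneMasks` and `MakeLaneMasks` are hypothetical names used only for this example; they are not part of the emulator or of the extension.

```cpp
#include <cstdint>

// Hypothetical helper types, for illustration only.
// Each mask has one bit per lane of an assumed 32-wide warp.
struct LaneMasks {
    std::uint32_t eq; // only the bit of this lane
    std::uint32_t ge; // bits of lanes with index >= this lane
    std::uint32_t gt; // bits of lanes with index >  this lane
    std::uint32_t le; // bits of lanes with index <= this lane
    std::uint32_t lt; // bits of lanes with index <  this lane
};

// lane must be in the range [0, 31].
constexpr LaneMasks MakeLaneMasks(unsigned lane) {
    const std::uint32_t eq = std::uint32_t{1} << lane;
    const std::uint32_t ge = ~std::uint32_t{0} << lane;
    const std::uint32_t gt = ge & ~eq;
    const std::uint32_t le = ~gt;
    const std::uint32_t lt = ~ge;
    return {eq, ge, gt, le, lt};
}
```

On a host whose subgroup is not 32 lanes wide, masks built this way would not line up with what the guest shader expects, which is the mismatch mentioned above.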
|
|
Some operations like atomicMin were ignored because their return values
were being stored to RZ. These operations have a side effect and it was
being ignored.
|
|
Atomic instructions can be used without returning anything and this is
valid code. Remove the assert.
|
|
|
|
Drop the std::list hack to allocate memory indefinitely.
Instead use a custom allocator that keeps references valid until
destruction. This allocates fixed chunks of memory and puts pointers in
a free list. When an allocation is no longer used, put it back in the
free list. This doesn't heap allocate because std::vector doesn't change
its capacity. If the free list is empty, allocate a new chunk.
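The scheme described above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the allocator from the commit: `ChunkPool`, `Create`, `Destroy` and the default chunk size are hypothetical, and the caller is assumed to destroy every live object before the pool itself goes away.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <utility>
#include <vector>

// Hypothetical pool: hands out slots from fixed-size chunks, so pointers to
// live objects stay valid until the pool is destroyed.
template <typename T, std::size_t ChunkSize = 64>
class ChunkPool {
public:
    template <typename... Args>
    T* Create(Args&&... args) {
        if (free_list.empty()) {
            AllocateChunk();
        }
        void* const slot = free_list.back();
        free_list.pop_back();
        return ::new (slot) T(std::forward<Args>(args)...);
    }

    void Destroy(T* object) {
        object->~T();
        // Capacity already covers every slot ever allocated, so this
        // push_back never reallocates.
        free_list.push_back(object);
    }

private:
    struct Slot {
        alignas(T) unsigned char storage[sizeof(T)];
    };

    void AllocateChunk() {
        chunks.push_back(std::make_unique<Slot[]>(ChunkSize));
        Slot* const chunk = chunks.back().get();
        // Reserve room for all slots of all chunks up front.
        free_list.reserve(chunks.size() * ChunkSize);
        for (std::size_t i = 0; i < ChunkSize; ++i) {
            free_list.push_back(&chunk[i]);
        }
    }

    std::vector<std::unique_ptr<Slot[]>> chunks; // owns the raw storage
    std::vector<void*> free_list;                // slots ready for reuse
};
```

Because chunks are never moved or released while the pool is alive, growing the pool does not invalidate earlier pointers, which is the property the std::list approach was used for.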
|
|
Most overlaps in the buffer cache only contain one mapped address.
We can avoid close to all heap allocations once the buffer cache is
warmed up by using a small_vector with a stack size of one.
|
|
Instead of using boost::icl::interval_map for caching, use
boost::intrusive::set. interval_map is intended as a container where the
keys can overlap with one another; we don't need this for caching
buffers and a std::set-like data structure that allows us to search with
lower_bound is enough.
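A minimal sketch of the lookup pattern, assuming buffers never overlap one another. To keep the example free of dependencies it uses std::map keyed by start address instead of boost::intrusive::set; `Buffer`, its fields and `FindBuffer` are hypothetical names for illustration only.

```cpp
#include <cstdint>
#include <map>

// Hypothetical cached buffer; the map key is its start address.
struct Buffer {
    std::uint64_t size;
    int id;
};

// Returns the buffer containing addr, or nullptr if no buffer covers it.
const Buffer* FindBuffer(const std::map<std::uint64_t, Buffer>& buffers,
                         std::uint64_t addr) {
    // First buffer that starts at or after addr.
    auto it = buffers.lower_bound(addr);
    if (it != buffers.end() && it->first == addr) {
        return &it->second;
    }
    if (it == buffers.begin()) {
        return nullptr; // nothing starts before addr
    }
    // Otherwise only the previous buffer can contain addr.
    --it;
    return addr < it->first + it->second.size ? &it->second : nullptr;
}
```

With non-overlapping entries, a single ordered search plus one step back is enough, so the interval bookkeeping of interval_map is not needed.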
|
|
Removing shared pointers is a first step to be able to use intrusive
objects and keep allocations close to one another in memory.
|
|
Minor style changes. Mostly done so I avoid editing it while doing other
changes.
|
|
Add code required to use OpenGL assembly programs based on
NV_gpu_program5. Decompilation for ARB programs is intended to be added
in a follow-up commit. This does **not** include ARB decompilation and
it's not in a usable state.
The intention behind assembly programs is to reduce shader stutter
significantly on drivers supporting NV_gpu_program5 (and other required
extensions). Currently only Nvidia's proprietary driver supports these
extensions.
Add a UI option, hidden for now to avoid people enabling it
accidentally.
This code path has some limitations that OpenGL compatibility doesn't
have:
- NV_shader_storage_buffer_object is limited to 16 entries for a single
OpenGL context state (I don't know if this is an intended limitation, a
specification issue or I am missing something). Currently causes issues
on The Legend of Zelda: Link's Awakening.
- NV_parameter_buffer_object can't bind buffers using a non-zero
offset. The workaround used is to copy to a temporary buffer
(this doesn't happen often so it's not an issue).
On the other hand, it has the following advantages:
- Shaders build a lot faster.
- We have control over how floating point rounding is done over
individual instructions (SPIR-V on Vulkan can't do this).
- Operations on shared memory can be unsigned and signed.
- Transform feedbacks are dynamic state (not yet implemented).
- Parameter buffers (uniform buffers) are per stage, matching NVN and
hardware's behavior.
- The API to bind and create assembly programs makes sense, unlike
ARB_separate_shader_objects.
|
|
Add format B8G8R8A8_SRGB and add Attachable capability for B8G8R8A8_UNORM
Used by Bravely Default II
|
|
|
|
Match OpenGL's behavior. This can fix or simplify bisecting issues on
Vulkan.
|
|
shader_ir: Add separate instructions for ordered and unordered comparisons and fix NE on GLSL
|
|
vk_graphics_pipeline: Implement rasterizer_enable on Vulkan
|
|
"Not equal" operators on GLSL seem to behave as unordered when we expect
an ordered comparison.
Manually emulate this by checking for LGE values (numbers, not NaNs).
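The difference between the two comparisons can be sketched in C++, whose built-in `!=` on floats also behaves as unordered (it is true when either operand is NaN). The function names are hypothetical and only illustrate the idea; this is not the code the decompiler emits.

```cpp
#include <cmath>

// "LGE": true when a and b are less, greater or equal, i.e. neither is NaN.
inline bool IsOrdered(float a, float b) {
    return !std::isnan(a) && !std::isnan(b);
}

// Unordered not-equal: also true when either operand is NaN.
inline bool UnorderedNotEqual(float a, float b) {
    return a != b;
}

// Ordered not-equal: false as soon as either operand is NaN.
inline bool OrderedNotEqual(float a, float b) {
    return IsOrdered(a, b) && a != b;
}
```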
|
|
|
|
|
|
|
|
texture: Implement R8G8UI
|
|
This allows us to use native SPIR-V instructions without having to
manually check for NaN.
|
|
maxwell_to_vk: implement missing signed int formats
|
|
video_core: Implement viewport swizzles with NV_viewport_swizzle
|
|
vk_sampler_cache: Use VK_EXT_custom_border_color when available
|
|
GPU: More optimizations to GPU Command List Processing and DMA Copy Optimizations
|
|
Co-authored-by: David <25727384+ogniK5377@users.noreply.github.com>
|
|
Co-authored-by: David <25727384+ogniK5377@users.noreply.github.com>
|
|
This should fix grass interactions on Breath of the Wild on Vulkan.
It is currently untested against validation layers.
Nvidia's Windows 443.09 beta driver or Linux 440.66.12 is required for
now.
|
|
|
|
|