| Age | Commit message | Author |
|
maxwell_3d: Initialize more registers to their expected value
|
|
These logs were killing performance on some games when they were
spammed. Reduce them to Debug severity.
|
|
Initialize line widths to avoid setting a line width of zero.
|
|
NVN expects this to be initialized as Fill, otherwise games that never
bind a rasterizer state will log an invalid polygon mode.
|
|
shader_ir: Add separate instructions for ordered and unordered comparisons and fix NE on GLSL
|
|
This allows us to use native SPIR-V instructions without having to
manually check for NaN.
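
As a rough illustration of the distinction this commit draws, the two flavours of "not equal" can be written out by hand. This is only a sketch: the function names are hypothetical and are not taken from shader_ir. SPIR-V exposes the same pair natively as OpFOrdNotEqual and OpFUnordNotEqual, which is what makes the manual NaN check unnecessary there.

```cpp
#include <cmath>

// Hypothetical helper: ordered "not equal".
// True only when neither operand is NaN and the values differ.
inline bool OrderedNotEqual(float a, float b) {
    return !std::isnan(a) && !std::isnan(b) && a != b;
}

// Hypothetical helper: unordered "not equal".
// True when either operand is NaN or the values differ.
inline bool UnorderedNotEqual(float a, float b) {
    return std::isnan(a) || std::isnan(b) || a != b;
}
```

The two functions only disagree when at least one input is NaN, which is exactly the case a backend without separate instructions has to test for manually.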
|
|
video_core: Implement viewport swizzles with NV_viewport_swizzle
|
|
GPU: More optimizations to GPU Command List Processing and DMA Copy Optimizations
|
|
{maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers
|
|
maxwell_3d: Fix depth clamping register
|
|
shader: Implement P2R CC, IADD Rd.CC and IADD.X
|
|
Drop MemoryBarrier from the buffer cache and use Maxwell3D's register
WaitForIdle.
To implement this on OpenGL we just call glMemoryBarrier with the
necessary bits.
Vulkan lacks this synchronization primitive, so we set an event and
immediately wait for it. This is not a pretty solution, but it's what
Vulkan can do without submitting the current command buffer to the queue
(which ends up being more expensive on the CPU).
|
|
Using deko3d as reference:
https://github.com/devkitPro/deko3d/blob/4e47ba0013552e592a86ab7a2510d1e7dadf236a/source/maxwell/gpu_3d_state.cpp#L42
We were using bits 3 and 4 to determine depth clamping, but these are
the same both enabled and disabled:
state->depthClampEnable ? 0x101A : 0x181D
The same happens on Nvidia's OpenGL driver, where they do something like
this (default capabilities, GL 4.5 compatibility):
(state & DEPTH_CLAMP) != 0 ? 0x201a : 0x281c
The low bits of this register vary, but bit 11 is consistently clear
when depth clamping is enabled and set when it is disabled, on both
deko3d/NVN and OpenGL. This commit changes yuzu's behaviour to use
bit 11 to determine depth clamping.
- Fixes depth issues on Super Mario Odyssey's intro.
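
The four register values quoted above can be checked mechanically. The sketch below uses hypothetical names (they are not yuzu's identifiers) and assumes only what the message states: bits 3 and 4 carry no information, while bit 11 is clear when clamping is enabled.

```cpp
#include <cstdint>

// Hypothetical names, for illustration only.
constexpr std::uint32_t kOldMask = (1u << 3) | (1u << 4);  // bits used before this commit
constexpr std::uint32_t kBit11 = 1u << 11;                 // bit used after this commit

// Depth clamping is treated as enabled when bit 11 is clear.
constexpr bool IsDepthClampEnabled(std::uint32_t reg) {
    return (reg & kBit11) == 0;
}

// deko3d/NVN values: enabled = 0x101A, disabled = 0x181D.
static_assert((0x101Au & kOldMask) == (0x181Du & kOldMask), "bits 3 and 4 do not change");
static_assert(IsDepthClampEnabled(0x101Au), "bit 11 clear means enabled");
static_assert(!IsDepthClampEnabled(0x181Du), "bit 11 set means disabled");

// Nvidia OpenGL values: enabled = 0x201a, disabled = 0x281c.
static_assert((0x201Au & kOldMask) == (0x281Cu & kOldMask), "bits 3 and 4 do not change");
static_assert(IsDepthClampEnabled(0x201Au), "bit 11 clear means enabled");
static_assert(!IsDepthClampEnabled(0x281Cu), "bit 11 set means disabled");
```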
|
|
Optimize GPU Command Lists and Introduce Fast GPU Time Option
|
|
{gl,vk}_rasterizer: Add lazy default buffer maker and use it for empty buffers
|
|
IADD.X takes the carry flag and adds it to the result. This is generally
used to emulate 64-bit operations with 32-bit registers.
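
A minimal sketch of that pattern, assuming the low-half addition records a carry (as IADD Rd.CC does) and the high-half addition consumes it (as IADD.X does). The function name is hypothetical and the code models the idea on the host, not the shader decoder itself.

```cpp
#include <cstdint>

// Hypothetical helper: adds two 64-bit values using only 32-bit arithmetic.
inline std::uint64_t Add64ViaCarry(std::uint64_t a, std::uint64_t b) {
    const std::uint32_t a_lo = static_cast<std::uint32_t>(a);
    const std::uint32_t a_hi = static_cast<std::uint32_t>(a >> 32);
    const std::uint32_t b_lo = static_cast<std::uint32_t>(b);
    const std::uint32_t b_hi = static_cast<std::uint32_t>(b >> 32);

    // Low half: plain add that also produces a carry flag (IADD Rd.CC).
    const std::uint32_t lo = a_lo + b_lo;
    const std::uint32_t carry = lo < a_lo ? 1u : 0u;

    // High half: add plus the carry from the low half (IADD.X).
    const std::uint32_t hi = a_hi + b_hi + carry;

    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}
```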
|
|
decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bits
|
|
The encoding for negation and absolute value was wrong.
The bits are now extracted manually. Similar instructions having
different encodings is the rule, not the exception, so extracting the
desired bit by hand keeps the decoder readable.
This is implemented against nxas:
https://github.com/ReinUsesLisp/nxas/blob/8dbc38995711cc12206aa370145a3a02665fd989/table.h#L68
That is itself tested against nvdisasm (Nvidia's official disassembler).
|
|
|
CMakeLists: Enable -Wmissing-declarations on Linux builds
|
|
On NVN, buffers can be enabled but have no size. According to deko3d and
the behavior we see in Animal Crossing: New Horizons, these buffers get
the special address of 0x1000 and limit themselves to 0xfff.
Implement buffers without a size by binding a null buffer to OpenGL
without a size.
https://github.com/devkitPro/deko3d/blob/1d1930beea093b5a663419e93b0649719a3ca5da/source/maxwell/gpu_3d_vbo.cpp#L62-L63
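
The special values fall out naturally if the limit is read as the address of the last valid byte. That reading is an assumption made for this sketch, and the helper names are hypothetical:

```cpp
#include <cstdint>

// Hypothetical helper: size of a buffer described by a start address and an
// inclusive limit address (assumed semantics; unsigned wrap-around is intended).
constexpr std::uint64_t BufferSize(std::uint64_t start, std::uint64_t limit) {
    return limit - start + 1;
}

// Hypothetical helper: a buffer with no size should be bound as a null buffer.
constexpr bool IsNullBuffer(std::uint64_t start, std::uint64_t limit) {
    return BufferSize(start, limit) == 0;
}

// The special pair from the message: address 0x1000, limit 0xfff.
static_assert(BufferSize(0x1000, 0xfff) == 0, "enabled buffer with no size");
static_assert(BufferSize(0x1000, 0x1fff) == 0x1000, "ordinary 4 KiB buffer");
```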
|
|
fixed_pipeline_state: Pack structure, use memcmp and CityHash on it
|
|
maxwell_3d: Initialize format attributes constant as one
|
|
Reduce FixedPipelineState's size from 1384 to 664 bytes
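
The general technique can be sketched as follows. None of this is the real FixedPipelineState layout: the fields, bit positions, and names are hypothetical, and FNV-1a stands in for CityHash so the example stays self-contained. The point is that once every field lives in fixed-width words with no padding, equality can be a memcmp and hashing can run over the raw bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical packed state: every bit is assigned, so there is no padding
// with indeterminate contents to disturb memcmp or hashing.
struct PackedState {
    std::uint32_t word0;  // bit 0: depth test, bit 1: cull, bits 2-5: topology
    std::uint32_t word1;  // stencil reference (illustrative)
};
static_assert(std::is_trivially_copyable<PackedState>::value, "raw byte access must be valid");
static_assert(sizeof(PackedState) == 8, "no padding expected");

inline PackedState Pack(bool depth_test, bool cull, std::uint32_t topology,
                        std::uint32_t stencil_ref) {
    PackedState state{};
    state.word0 = (depth_test ? 1u : 0u) | ((cull ? 1u : 0u) << 1) | ((topology & 0xFu) << 2);
    state.word1 = stencil_ref;
    return state;
}

// Equality is a single memcmp over the whole structure.
inline bool Equal(const PackedState& a, const PackedState& b) {
    return std::memcmp(&a, &b, sizeof(PackedState)) == 0;
}

// FNV-1a over the raw bytes, used here only as a stand-in for CityHash.
inline std::uint64_t Hash(const PackedState& state) {
    unsigned char bytes[sizeof(PackedState)];
    std::memcpy(bytes, &state, sizeof(PackedState));
    std::uint64_t hash = 14695981039346656037ull;
    for (std::size_t i = 0; i < sizeof(PackedState); ++i) {
        hash ^= bytes[i];
        hash *= 1099511628211ull;
    }
    return hash;
}
```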
|
|
nouveau expects this to be true but it doesn't set it.
|
|
Allows reporting more cases where logic errors may exist, such as
implicit fallthrough cases, etc.
We ignore unused parameters for now, since there are many cases where
this is intentional (virtual interfaces).
While we're at it, we can also tidy up any existing code that causes
warnings. This uncovered a few bugs as well.
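
For readers unfamiliar with the two cases mentioned, here is a small hypothetical example (not code from the repository). It shows a fallthrough that is marked as intentional with the C++17 `[[fallthrough]]` attribute, and a virtual interface whose implementation legitimately ignores a parameter.

```cpp
// Hypothetical function: case 0 deliberately continues into case 1.
inline int CountSteps(int stage) {
    int steps = 0;
    switch (stage) {
    case 0:
        ++steps;
        [[fallthrough]];  // marks the fallthrough as intentional
    case 1:
        ++steps;
        break;
    default:
        break;
    }
    return steps;
}

// Hypothetical interface: some implementations have no use for `value`.
struct Handler {
    virtual ~Handler() = default;
    virtual int Handle(int value) = 0;
};

struct NullHandler final : Handler {
    // Leaving the parameter unnamed documents that it is unused on purpose.
    int Handle(int /*value*/) override {
        return 0;
    }
};
```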
|
|
shader/memory: Implement RED.E.ADD and minor changes to ATOM
|
|
gl_rasterizer: Implement constant vertex attributes
|
|
Adds another variant of FCMP.
|
|
Credits go to gdkchan from Ryujinx for finding that constant attributes
are used in retail games.
|
|
Implements "legacy" features from OpenGL present on hardware such as
smooth lines and line width.
|
|
shader/video: Partially implement VMNMX
|
|
Implements the common usages for VMNMX. Inputs with a different size
than 32 bits are not supported and sign mismatches aren't supported
either.
VMNMX works as follows:
It grabs Ra and Rb and applies a maximum/minimum on them (this is
defined by .MX), taking the input sign into account. This result can
then be saturated. After the intermediate result is calculated, it applies
another operation on it using Rc. These operations are merges,
accumulations or another min/max pass.
This instruction makes it possible to implement, in a more flexible way,
operations such as GCN's min3 and max3 instructions.
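
The two-stage shape described above can be modelled on the host as below. This is a simplified sketch with hypothetical names: it covers only signed 32-bit inputs and a min/max second pass, and leaves out saturation, merges, and accumulation.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical subset of the second-stage operations.
enum class SecondOp { Min, Max };

// First stage: max (mx == true) or min of ra and rb.
// Second stage: combine the intermediate result with rc.
inline std::int32_t Vmnmx(std::int32_t ra, std::int32_t rb, std::int32_t rc, bool mx,
                          SecondOp op) {
    const std::int32_t first = mx ? std::max(ra, rb) : std::min(ra, rb);
    return op == SecondOp::Max ? std::max(first, rc) : std::min(first, rc);
}

// min3 and max3 expressed as a single VMNMX-style operation.
inline std::int32_t Max3(std::int32_t a, std::int32_t b, std::int32_t c) {
    return Vmnmx(a, b, c, true, SecondOp::Max);
}

inline std::int32_t Min3(std::int32_t a, std::int32_t b, std::int32_t c) {
    return Vmnmx(a, b, c, false, SecondOp::Min);
}
```

Mixing the stages, for example a maximum followed by a minimum against Rc, gives a clamp-like operation, which is part of what makes the instruction more flexible than a fixed min3 or max3.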
|