path: root/src/video_core/engines
Age | Commit message | Author
2020-06-01 | Merge pull request #3998 from ReinUsesLisp/init-3d | bunnei
maxwell_3d: Initialize more registers to their expected value
2020-05-28 | maxwell_3d: Reduce severity of logs that can be spammed | ReinUsesLisp
These logs were killing performance on some games when they were spammed. Reduce them to Debug severity.
2020-05-27 | maxwell_3d: Initialize line widths | ReinUsesLisp
Initialize line widths to avoid setting a line width of zero.
2020-05-27 | maxwell_3d: Initialize polygon modes | ReinUsesLisp
NVN expects this to be initialized as Fill, otherwise games that never bind a rasterizer state will log an invalid polygon mode.
2020-05-13 | Merge pull request #3899 from ReinUsesLisp/float-comparisons | bunnei
shader_ir: Add separate instructions for ordered and unordered comparisons and fix NE on GLSL
2020-05-09 | shader_ir: Separate float-point comparisons in ordered and unordered | ReinUsesLisp
This allows us to use native SPIR-V instructions without having to manually check for NaN.
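To illustrate the distinction this commit refers to, here is a minimal C++ sketch of ordered versus unordered "not equal". The function names are hypothetical and are not taken from yuzu's shader IR; the semantics follow the usual definition, where an ordered comparison is false if either operand is NaN and an unordered comparison is true if either operand is NaN.

```cpp
#include <cmath>

// Hypothetical helper: ordered "not equal".
// False whenever either operand is NaN.
inline bool OrderedNotEqual(float a, float b) {
    return !std::isnan(a) && !std::isnan(b) && a != b;
}

// Hypothetical helper: unordered "not equal".
// True whenever either operand is NaN.
inline bool UnorderedNotEqual(float a, float b) {
    return std::isnan(a) || std::isnan(b) || a != b;
}
```

A backend whose instruction set offers both variants natively can map each IR operation to one instruction instead of emitting the explicit NaN checks shown here.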
2020-05-08 | Merge pull request #3885 from ReinUsesLisp/viewport-swizzles | bunnei
video_core: Implement viewport swizzles with NV_viewport_swizzle
2020-05-05 | Merge pull request #3815 from FernandoS27/command-list-2 | bunnei
GPU: More optimizations to GPU Command List Processing and DMA Copy Optimizations
2020-05-04 | vk_graphics_pipeline: Implement viewport swizzles with NV_viewport_swizzle | ReinUsesLisp
2020-05-04 | maxwell_3d: Add viewport swizzles | ReinUsesLisp
2020-05-03 | Merge pull request #3808 from ReinUsesLisp/wait-for-idle | bunnei
{maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers
2020-04-30 | Merge pull request #3807 from ReinUsesLisp/fix-depth-clamp | bunnei
maxwell_3d: Fix depth clamping register
2020-04-30 | Merge pull request #3799 from ReinUsesLisp/iadd-cc | bunnei
shader: Implement P2R CC, IADD Rd.CC and IADD.X
2020-04-28 | Clang Format and Documentation. | Fernando Sahmkow
2020-04-28 | MaxwellDMA: Optimize micro copies. | Fernando Sahmkow
2020-04-28 | {maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers | ReinUsesLisp
Drop MemoryBarrier from the buffer cache and use Maxwell3D's register WaitForIdle. To implement this on OpenGL we just call glMemoryBarrier with the necessary bits. Vulkan lacks this synchronization primitive, so we set an event and immediately wait for it. This is not a pretty solution, but it's what Vulkan can do without submitting the current command buffer to the queue (which ends up being more expensive on the CPU).
2020-04-27 | VideoCore/Engines: Refactor Engines CallMethod. | Fernando Sahmkow
2020-04-27 | maxwell_3d: Fix depth clamping register | ReinUsesLisp
Using deko3d as reference: https://github.com/devkitPro/deko3d/blob/4e47ba0013552e592a86ab7a2510d1e7dadf236a/source/maxwell/gpu_3d_state.cpp#L42

We were using bits 3 and 4 to determine depth clamping, but these have the same value whether depth clamping is enabled or disabled:

    state->depthClampEnable ? 0x101A : 0x181D

The same happens on Nvidia's OpenGL driver, where they do something like this (default capabilities, GL 4.5 compatibility):

    (state & DEPTH_CLAMP) != 0 ? 0x201a : 0x281c

There's always a difference between the first bits in this register, but bit 11 is consistently clear when depth clamping is enabled and set when it is disabled, on both deko3d/NVN and OpenGL. This commit changes yuzu's behaviour to use bit 11 to determine depth clamping.

- Fixes depth issues on Super Mario Odyssey's intro.
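The reasoning above can be checked against the four register values quoted in the commit message. The following is a small sketch only: the constant and function names are hypothetical and do not reflect yuzu's actual register definitions.

```cpp
#include <cstdint>

// Hypothetical names. Bit 11 is set in the observed values when depth
// clamping is disabled and clear when it is enabled.
constexpr std::uint32_t kDepthClampDisabledBit = 1u << 11;

// Bits 3 and 4, which were used before this commit.
constexpr std::uint32_t kOldBitsMask = (1u << 3) | (1u << 4);

// New interpretation: depth clamping is enabled when bit 11 is clear.
constexpr bool IsDepthClampEnabled(std::uint32_t reg) {
    return (reg & kDepthClampDisabledBit) == 0;
}

// Returns true if bits 3 and 4 distinguish the two register values.
constexpr bool OldBitsDiffer(std::uint32_t enabled, std::uint32_t disabled) {
    return ((enabled ^ disabled) & kOldBitsMask) != 0;
}

// deko3d values from the commit message.
static_assert(IsDepthClampEnabled(0x101A), "clamp enabled");
static_assert(!IsDepthClampEnabled(0x181D), "clamp disabled");
static_assert(!OldBitsDiffer(0x101A, 0x181D), "bits 3 and 4 are identical");
```

The OpenGL driver values quoted above (0x201a and 0x281c) behave the same way under both checks.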
2020-04-27 | Merge pull request #3742 from FernandoS27/command-list | bunnei
Optimize GPU Command Lists and Introduce Fast GPU Time Option
2020-04-26 | Merge pull request #3753 from ReinUsesLisp/ac-vulkan | Rodrigo Locatti
{gl,vk}_rasterizer: Add lazy default buffer maker and use it for empty buffers
2020-04-25 | shader/arithmetic_integer: Implement IADD.X | ReinUsesLisp
IADD.X takes the carry flag and adds it to the result. This is generally used to emulate 64-bit operations with 32-bit registers.
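As a rough illustration of the 64-bit emulation mentioned above, the sketch below adds two 64-bit values using only 32-bit halves and a carry flag. The type and function names are hypothetical; the low-half add stands in for an add that writes the carry flag (IADD with .CC) and the high-half add stands in for IADD.X.

```cpp
#include <cstdint>

// Hypothetical result type: a 32-bit sum plus the carry-out flag.
struct AddResult {
    std::uint32_t value;
    bool carry;
};

// 32-bit add with carry-in and carry-out.
constexpr AddResult AddWithCarry(std::uint32_t a, std::uint32_t b, bool carry_in) {
    const std::uint64_t wide = std::uint64_t{a} + b + (carry_in ? 1u : 0u);
    return {static_cast<std::uint32_t>(wide), (wide >> 32) != 0};
}

// 64-bit add built from two 32-bit adds.
constexpr std::uint64_t Add64(std::uint64_t lhs, std::uint64_t rhs) {
    // Low halves: plain add that produces a carry flag.
    const AddResult lo = AddWithCarry(static_cast<std::uint32_t>(lhs),
                                      static_cast<std::uint32_t>(rhs), false);
    // High halves: add that consumes the carry flag, as IADD.X does.
    const AddResult hi = AddWithCarry(static_cast<std::uint32_t>(lhs >> 32),
                                      static_cast<std::uint32_t>(rhs >> 32), lo.carry);
    return (std::uint64_t{hi.value} << 32) | lo.value;
}
```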
2020-04-25 | Merge pull request #3734 from ReinUsesLisp/half-float-mods | bunnei
decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bits
2020-04-24 | Fix -Wdeprecated-copy warning. | Markus Wick
2020-04-23 | decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bits | ReinUsesLisp
The encoding for negation and absolute value was wrong. Extracting is now done manually. Similar instructions having different encodings is the rule, not the exception. To keep sanity and readability I preferred to extract the desired bit manually. This is implemented against nxas: https://github.com/ReinUsesLisp/nxas/blob/8dbc38995711cc12206aa370145a3a02665fd989/table.h#L68 That is itself tested against nvdisasm (Nvidia's official disassembler).
2020-04-23 | Clang Format. | Fernando Sahmkow
2020-04-23 | Maxwell3D: Process Macros on MultiMethod. | Fernando Sahmkow
2020-04-23 | DMAPusher: Propagate multimethod writes into the engines. | Fernando Sahmkow
2020-04-23 | Merge pull request #3697 from lioncash/declarations | bunnei
CMakeLists: Enable -Wmissing-declarations on Linux builds
2020-04-22 | MaxwellDMA: Correct copying on accuracy level. | Fernando Sahmkow
2020-04-22 | FenceManager: Manage syncpoints and rename fences to semaphores. | Fernando Sahmkow
2020-04-22 | Rasterizer: Document SignalFence & ReleaseFences and setup skeletons on Vulkan. | Fernando Sahmkow
2020-04-22 | GPU: Fix rebase errors. | Fernando Sahmkow
2020-04-22 | OpenGL: Implement Fencing backend. | Fernando Sahmkow
2020-04-22 | GPU: Delay Fences. | Fernando Sahmkow
2020-04-22 | GPU: Refactor synchronization on Async GPU | Fernando Sahmkow
2020-04-22 | UI: Replace accurate GPU option for GPU Accuracy Level | Fernando Sahmkow
2020-04-21 | gl_rasterizer: Fix buffers without size | ReinUsesLisp
On NVN, buffers can be enabled but have no size. According to deko3d and the behavior we see in Animal Crossing: New Horizons, these buffers get the special address of 0x1000 and limit themselves to 0xfff. Implement buffers without a size by binding a null buffer to OpenGL without a size. https://github.com/devkitPro/deko3d/blob/1d1930beea093b5a663419e93b0649719a3ca5da/source/maxwell/gpu_3d_vbo.cpp#L62-L63
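One way to read the special address and limit is sketched below. It assumes the limit is an inclusive end address, which this commit message does not state; under that assumption a start of 0x1000 with a limit of 0xfff describes a range of zero bytes. The function names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper. Assumes `limit` is the address of the last byte
// (inclusive) and that limit + 1 >= start.
constexpr std::uint64_t BufferSize(std::uint64_t start, std::uint64_t limit) {
    return limit + 1 - start;
}

// A buffer is treated as "without size" when the range is empty.
constexpr bool IsEmptyBuffer(std::uint64_t start, std::uint64_t limit) {
    return BufferSize(start, limit) == 0;
}
```

A rasterizer could use a check like `IsEmptyBuffer` to decide when to bind a null buffer instead of a real one.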
2020-04-21 | Merge pull request #3718 from ReinUsesLisp/better-pipeline-state | Rodrigo Locatti
fixed_pipeline_state: Pack structure, use memcmp and CityHash on it
2020-04-20 | Merge pull request #3695 from ReinUsesLisp/default-attributes | bunnei
maxwell_3d: Initialize format attributes constant as one
2020-04-18 | fixed_pipeline_state: Pack attribute state | ReinUsesLisp
Reduce FixedPipelineState's size from 1384 to 664 bytes
2020-04-16 | General: Resolve warnings related to missing declarations | Lioncash
2020-04-16 | maxwell_3d: Initialize format attributes constant as one | ReinUsesLisp
nouveau expects this to be true but it doesn't set it.
2020-04-15 | CMakeLists: Specify -Wextra on linux builds | Lioncash
Allows reporting more cases where logic errors may exist, such as implicit fallthrough cases. We currently ignore unused parameters, since we have many cases where this is intentional (virtual interfaces). While we're at it, we can also tidy up any existing code that causes warnings. This also uncovered a few bugs.
2020-04-15 | Merge pull request #3612 from ReinUsesLisp/red | Fernando Sahmkow
shader/memory: Implement RED.E.ADD and minor changes to ATOM
2020-04-15 | Merge pull request #3662 from ReinUsesLisp/constant-attrs | Mat M
gl_rasterizer: Implement constant vertex attributes
2020-04-14 | shader/arithmetic: Add FCMP_CR variant | ReinUsesLisp
Adds another variant of FCMP.
2020-04-14 | gl_rasterizer: Implement constant vertex attributes | ReinUsesLisp
Credits go to gdkchan from Ryujinx for finding constant attributes are used in retail games.
2020-04-13 | gl_rasterizer: Implement line widths and smooth lines | ReinUsesLisp
Implements "legacy" features from OpenGL present on hardware such as smooth lines and line width.
2020-04-12 | Merge pull request #3578 from ReinUsesLisp/vmnmx | Fernando Sahmkow
shader/video: Partially implement VMNMX
2020-04-12 | shader/video: Partially implement VMNMX | ReinUsesLisp
Implements the common usages for VMNMX. Inputs with a size other than 32 bits are not supported, and neither are sign mismatches.

VMNMX works as follows: it takes Ra and Rb and applies a maximum or minimum to them (selected by .MX), taking the input sign into account. This result can then be saturated. After the intermediate result is calculated, it applies another operation to it using Rc. These operations are merges, accumulations or another min/max pass.

This instruction allows a more flexible implementation of operations such as GCN's min3 and max3 instructions.
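The two-stage behaviour described above can be sketched in C++ for the supported case of signed 32-bit inputs. This is an illustration only: the enum, function and parameter names are hypothetical, saturation and the merge operations are left out, and the accumulate case does no overflow handling.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical selector for the second stage applied with Rc.
enum class SecondOp { Min, Max, Acc };

// Sketch of the two stages for signed 32-bit inputs.
// Stage 1: min or max of ra and rb. Stage 2: combine the result with rc.
constexpr std::int32_t VmnmxSketch(std::int32_t ra, std::int32_t rb, std::int32_t rc,
                                   bool is_max, SecondOp op) {
    const std::int32_t first = is_max ? std::max(ra, rb) : std::min(ra, rb);
    switch (op) {
    case SecondOp::Min:
        return std::min(first, rc);
    case SecondOp::Max:
        return std::max(first, rc);
    case SecondOp::Acc:
        return first + rc; // no overflow handling in this sketch
    }
    return first;
}
```

With a minimum in both stages this computes a three-input minimum, and with a maximum in both stages a three-input maximum, which is how it relates to min3 and max3.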