| Age | Commit message (Collapse) | Author |
|
shader/shift: Implement SHR wrapped and clamped variants
|
|
|
|
Nvidia hardware defaults to wrapped shifts, but out-of-range shift amounts
are undefined behaviour in OpenGL's spec. Explicitly mask or clamp the shift
amount according to what the guest shader requires.
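A minimal sketch of the two variants in plain C++, for illustration only. The helper names are hypothetical and not taken from the commit, and the clamp limit of 31 is an assumption about what "clamped" means for a 32-bit operand:

```cpp
#include <cstdint>

// Hypothetical helpers; names and the clamp limit are assumptions.

// Wrapped: only the low five bits of the shift amount are used,
// so a shift by 33 behaves like a shift by 1.
constexpr std::uint32_t ShiftRightWrapped(std::uint32_t value, std::uint32_t shift) {
    return value >> (shift & 31u);
}

// Clamped: shift amounts above 31 are limited to 31 (assumed limit),
// which keeps the host shift inside the defined range.
constexpr std::uint32_t ShiftRightClamped(std::uint32_t value, std::uint32_t shift) {
    return value >> (shift > 31u ? 31u : shift);
}
```

Both helpers keep the shift amount passed to the host within 0..31, which is the range where the result is defined.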
|
|
|
|
shader_ir/conversion: Implement F2I and F2F F16 selector
|
|
float_set_predicate: Add missing negation bit for the second operand
|
|
* texture_cache/surface_params: Remove unused local variable
* rasterizer_interface: Add missing documentation commentary
* maxwell_dma: Remove unused rasterizer reference
* video_core/gpu: Sort member declaration order to silence -Wreorder warning
* fermi_2d: Remove unused MemoryManager reference
* video_core: Silence unused variable warnings
* buffer_cache: Silence -Wreorder warnings
* kepler_memory: Remove unused MemoryManager reference
* gl_texture_cache: Add missing override
* buffer_cache: Add missing include
* shader/decode: Remove unused variables
|
|
shader/decode: Implement S2R Tic
|
Implement VOTE using Nvidia's intrinsics. Documentation about these can
be found here:
https://developer.nvidia.com/reading-between-threads-shader-intrinsics
Instead of using portable ARB instructions I opted to use Nvidia
intrinsics because these are the closest we have to how Tegra X1
hardware renders.
To stub VOTE on non-Nvidia drivers (including nouveau) this commit
simulates a GPU with a warp size of one, returning what is meaningful
for the instruction being emulated:
* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true
ballotARB, also known as "uint64_t(activeThreadsNV())", emits
    VOTE.ANY Rd, PT, PT;
on nouveau's compiler. This doesn't exactly match Nvidia's code
    VOTE.ALL Rd, PT, PT;
which is emulated with activeThreadsNV() by this commit. In theory this
shouldn't really matter since .ANY, .ALL and .EQ affect the predicates
(set to PT in those cases) and not the registers.
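The warp-size-one fallback listed above can be modelled in plain C++ as follows. This is only a sketch: the function names are hypothetical stand-ins for the GLSL intrinsics and are not the code the commit emits.

```cpp
// Hypothetical C++ model of the non-Nvidia stub: with a single thread per
// warp, every vote collapses to the calling thread's own value.

// anyThreadNV(value) -> value
constexpr bool AnyThread(bool value) {
    return value;
}

// allThreadsNV(value) -> value
constexpr bool AllThreads(bool value) {
    return value;
}

// allThreadsEqualNV(value) -> true, since a lone thread always agrees
// with itself.
constexpr bool AllThreadsEqual(bool /*value*/) {
    return true;
}
```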
|
|
half_set_predicate: Fix HSETP2_C constant buffer offset
|
|
Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
|
|
shader_ir: Implement NOP
|
Downgrade and suppress a series of GPU asserts and debug messages.
|
|
|
|
This commit implements the F16 variants of the conversion instructions
and makes sure conversions are done.
|
shader-ir: Minor cleanup-related changes
|
|
|
|
This commit reduces the severity of asserts for FP precision and
rounding, as these are well known and have little to no consequence for
the GPU's accuracy.
|
|
shader/decode/other: Correct branch indirect argument within BRA handling
|
|
This is more accurate in terms of describing what the functions are
actually doing. Temporal relates to time, not the setting of a temporary
itself.
|
|
This appears to have been a copy/paste error introduced within
8a6fc529a968e007f01464abadd32f9b5eb0a26c
|
|
While changing this code, simplify the tracking code to allow returning
the base address node, so that callers don't have to manually rebuild
it on each invocation.
|
|
shader/texture: Add F16 support for TLDS
|
Hardware testing revealed that SSY and PBK push to separate stacks,
allowing code like this:
SSY label1;
PBK label2;
SYNC;
label1: PBK;
label2: EXIT;
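A small interpreter sketch in C++ shows why separate stacks matter for the listing above. Everything here is hypothetical (the `Op` enum, `Instruction` struct and `Run` function are not from the commit), and it assumes the instruction at label1 acts as BRK, that is, it pops the PBK stack:

```cpp
#include <cstddef>
#include <stack>
#include <vector>

// Hypothetical opcode set, reduced to what the listing needs.
enum class Op { SSY, PBK, SYNC, BRK, EXIT };

struct Instruction {
    Op op;
    std::size_t target; // Only meaningful for SSY and PBK.
};

// Runs the program with one stack per instruction family and returns the
// visited program counters in order.
std::vector<std::size_t> Run(const std::vector<Instruction>& program) {
    std::stack<std::size_t> ssy_stack;
    std::stack<std::size_t> pbk_stack;
    std::vector<std::size_t> trace;
    std::size_t pc = 0;
    while (pc < program.size()) {
        trace.push_back(pc);
        const Instruction& instr = program[pc];
        switch (instr.op) {
        case Op::SSY:
            ssy_stack.push(instr.target);
            ++pc;
            break;
        case Op::PBK:
            pbk_stack.push(instr.target);
            ++pc;
            break;
        case Op::SYNC: // Pops only the SSY stack.
            if (ssy_stack.empty()) {
                return trace;
            }
            pc = ssy_stack.top();
            ssy_stack.pop();
            break;
        case Op::BRK: // Pops only the PBK stack.
            if (pbk_stack.empty()) {
                return trace;
            }
            pc = pbk_stack.top();
            pbk_stack.pop();
            break;
        case Op::EXIT:
            return trace;
        }
    }
    return trace;
}

// The listing above, with label1 at index 3 and label2 at index 4.
std::vector<Instruction> ExampleProgram() {
    return {
        {Op::SSY, 3},
        {Op::PBK, 4},
        {Op::SYNC, 0},
        {Op::BRK, 0},
        {Op::EXIT, 0},
    };
}
```

With separate stacks, SYNC jumps to label1 and the following pop of the PBK stack reaches label2. A single shared stack would make SYNC pop label2 instead and skip label1 entirely.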
|
|
Instead of storing unique_ptr in a vector and returning raw pointers
to its elements, use shared_ptr. While changing initialization code,
move it to a separate file when possible.
This is a first step to allow code analysis and node generation beyond
the ShaderIR class.
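The ownership change can be sketched like this. `NodeData`, `Node` and `MakeNode` are hypothetical names used for illustration and may not match the real types:

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical node payload; the real IR node carries far more than a string.
struct NodeData {
    std::string text;
};

// A node handle owns its data through shared_ptr, so it stays valid without
// a central vector that keeps unique_ptr instances alive.
using Node = std::shared_ptr<NodeData>;

// Hypothetical factory: creates a node that can outlive whoever built it.
Node MakeNode(std::string text) {
    return std::make_shared<NodeData>(NodeData{std::move(text)});
}
```

Because each handle shares ownership, code outside the class that created a node can keep or build nodes without depending on that class's storage lifetime.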
|
|
shader: Implement S2R Tid{XYZ} and CtaId{XYZ}
|
|
shader/memory: Implement generic memory stores and loads (ST and LD)
|
|
Keeps the shader code file endings consistent.
|
|
Amends cases where we were using things that were indirectly being
satisfied through other headers. This way, if those headers change and
eliminate dependencies on other headers in the future, we don't have
cascading compilation errors.
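As a generic illustration of the practice, not code from this commit, a file that uses `std::string` and `std::vector` includes both headers itself rather than relying on some other header to pull them in. `JoinNames` is a hypothetical function:

```cpp
#include <cstddef> // std::size_t is used directly below.
#include <string>  // std::string is used directly below.
#include <vector>  // std::vector is used directly below.

// Hypothetical helper: joins names with ", " between them.
std::string JoinNames(const std::vector<std::string>& names) {
    std::string result;
    for (std::size_t i = 0; i < names.size(); ++i) {
        if (i != 0) {
            result += ", ";
        }
        result += names[i];
    }
    return result;
}
```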
|