Fernando Sahmkow
5d3c5df7f4
gl_shader_decompiler: Implement AST decompiling
7 years ago
Fernando Sahmkow
f1ed22419c
shader_ir: Declare Manager and pass it to appropriate programs.
7 years ago
Fernando Sahmkow
9f61500df1
shader_ir: Corrections to outward movements and misc stuffs
7 years ago
Fernando Sahmkow
9581919b87
shader_ir: Add basic goto elimination
7 years ago
Fernando Sahmkow
a3d04b45a9
shader_ir: Initial Decompile Setup
7 years ago
ReinUsesLisp
79a7463f4c
gl_shader_decompiler: Use uint for images and fix SUATOM
In the process, remove the implementation of SUATOM.MIN and SUATOM.MAX, as
these require a distinction between U32 and S32. These have to be
implemented with an imageAtomicCompSwap loop.
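The compare-and-swap loop this commit refers to can be sketched on the CPU with std::atomic. This is only a C++ analogy under stated assumptions, not the GLSL the decompiler would emit; the names AtomicMin and DemoAtomicMin are hypothetical. It illustrates why MIN/MAX need the U32/S32 distinction: the same bit pattern orders differently depending on signedness.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: atomic minimum built from a compare-and-swap loop.
// T selects how the stored bits are interpreted (std::uint32_t or std::int32_t).
template <typename T>
T AtomicMin(std::atomic<T>& target, T value) {
    T observed = target.load();
    // Retry until the stored value is already <= value or our swap succeeds.
    while (value < observed && !target.compare_exchange_weak(observed, value)) {
        // On failure compare_exchange_weak refreshes `observed`; the loop re-tests.
    }
    return observed; // value seen before the operation
}

// The bits 0xFFFFFFFF are the largest U32 but -1 as S32, so the result differs.
inline void DemoAtomicMin() {
    std::atomic<std::uint32_t> as_u32{0xFFFFFFFFu};
    AtomicMin<std::uint32_t>(as_u32, 1u);
    assert(as_u32.load() == 1u); // unsigned: 1 is smaller, so it is stored

    std::atomic<std::int32_t> as_s32{-1};
    AtomicMin<std::int32_t>(as_s32, 1);
    assert(as_s32.load() == -1); // signed: -1 is smaller, so nothing changes
}
```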
6 years ago
ReinUsesLisp
331d140bb4
shader/image: Implement SULD and remove irrelevant code
* Implement SULD as float.
* Remove conditional declaration of GL_ARB_shader_viewport_layer_array.
6 years ago
Fernando Sahmkow
f02b9d37f0
Shader_IR: ICMP corrections and fixes
6 years ago
Fernando Sahmkow
01b8a78a8a
Shader_IR: Implement ICMP.
6 years ago
Fernando Sahmkow
ae03b1ebc7
VideoCore: Corrections to the MME Inliner and removal of hacky instance management.
6 years ago
ReinUsesLisp
42815d1d24
shader_ir/warp: Implement SHFL
6 years ago
ReinUsesLisp
2e6bebb3d2
shader/image: Implement SUATOM and fix SUST
7 years ago
ReinUsesLisp
e2aad88d51
gl_shader_decompiler: Keep track of written images and mark them as modified
6 years ago
ReinUsesLisp
9fb31b1b23
kepler_compute: Implement texture queries
7 years ago
ReinUsesLisp
b66b14a64f
shader_ir: Implement LD_S
Loads from shared memory.
7 years ago
ReinUsesLisp
df0203dd87
shader_ir: Implement ST_S
This instruction writes to a memory buffer shared with threads within
the same work group. It is known as "shared" memory in GLSL.
7 years ago
ReinUsesLisp
9b001821d9
shader/shift: Implement SHR wrapped and clamped variants
Nvidia defaults to wrapped shifts, but out-of-range shift amounts are
undefined behaviour in OpenGL's spec. Explicitly mask/clamp according to
what the guest shader requires.
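The difference between the two variants can be sketched in C++ as follows. This is a minimal illustration, assuming that "wrapped" means masking the shift amount to its low 5 bits and "clamped" means saturating it at 31 for a 32-bit unsigned value; the function names are hypothetical and not taken from the commit.

```cpp
#include <cstdint>

// Wrapped: only the low 5 bits of the shift amount are used (shift mod 32).
constexpr std::uint32_t ShiftRightWrapped(std::uint32_t value, std::uint32_t shift) {
    return value >> (shift & 31u);
}

// Clamped (assumption): shift amounts above 31 saturate at 31.
constexpr std::uint32_t ShiftRightClamped(std::uint32_t value, std::uint32_t shift) {
    return value >> (shift > 31u ? 31u : shift);
}

// A shift amount of 33 behaves as 1 when wrapped and as 31 when clamped.
static_assert(ShiftRightWrapped(0x80000000u, 33u) == 0x40000000u, "33 wraps to 1");
static_assert(ShiftRightClamped(0x80000000u, 33u) == 1u, "33 clamps to 31");
```

Either way the shift amount handed to the host operator stays within 0..31, so the result no longer depends on what the host driver does with out-of-range shifts.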
6 years ago
ReinUsesLisp
8ce5bb378f
half_set_predicate: Fix predicate assignments
6 years ago
Rodrigo Locatti
598157a8c9
video_core: Silence miscellaneous warnings (#2820)
* texture_cache/surface_params: Remove unused local variable
* rasterizer_interface: Add missing documentation commentary
* maxwell_dma: Remove unused rasterizer reference
* video_core/gpu: Sort member declaration order to silence -Wreorder warning
* fermi_2d: Remove unused MemoryManager reference
* video_core: Silence unused variable warnings
* buffer_cache: Silence -Wreorder warnings
* kepler_memory: Remove unused MemoryManager reference
* gl_texture_cache: Add missing override
* buffer_cache: Add missing include
* shader/decode: Remove unused variables
6 years ago
ReinUsesLisp
6f134adf2a
shader_ir/conversion: Split int and float selector and implement F2F H1
6 years ago
ReinUsesLisp
d9ad389777
shader_ir/conversion: Implement F2I F16 Ra.H1
6 years ago
ReinUsesLisp
d490cc5285
float_set_predicate: Add missing negation bit for the second operand
6 years ago
ReinUsesLisp
67f47b2f6a
shader_ir: Implement VOTE
Implement VOTE using Nvidia's intrinsics. Documentation about these can
be found here:
https://developer.nvidia.com/reading-between-threads-shader-intrinsics
Instead of using portable ARB instructions I opted to use Nvidia
intrinsics because these are the closest we have to how Tegra X1
hardware renders.
To stub VOTE on non-Nvidia drivers (including nouveau) this commit
simulates a GPU with a warp size of one, returning what is meaningful
for the instruction being emulated:
* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true
ballotARB, also known as "uint64_t(activeThreadsNV())", emits
VOTE.ANY Rd, PT, PT;
on nouveau's compiler. This doesn't exactly match Nvidia's code,
VOTE.ALL Rd, PT, PT;
which is emulated with activeThreadsNV() by this commit. In theory this
shouldn't really matter since .ANY, .ALL and .EQ affect the predicates
(set to PT in those cases) and not the registers.
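The warp-size-one stub described in this commit can be checked against a small CPU model. The sketch below is an assumption-laden illustration, not the emitted GLSL: Warp, AnyThread, AllThreads, AllThreadsEqual and CheckWarpOfOne are hypothetical names that model the vote reductions over a list of per-thread booleans.

```cpp
#include <cassert>
#include <vector>

// Hypothetical CPU model of a warp: one bool per active thread.
using Warp = std::vector<bool>;

// True if at least one thread passes `true`.
bool AnyThread(const Warp& warp) {
    for (bool v : warp) {
        if (v) return true;
    }
    return false;
}

// True if every thread passes `true`.
bool AllThreads(const Warp& warp) {
    for (bool v : warp) {
        if (!v) return false;
    }
    return true;
}

// True if every thread passes the same value.
bool AllThreadsEqual(const Warp& warp) {
    for (bool v : warp) {
        if (v != warp.front()) return false;
    }
    return true;
}

// With a single lane the reductions collapse to the stub values:
// any -> value, all -> value, all-equal -> true.
inline void CheckWarpOfOne(bool value) {
    const Warp warp(1, value);
    assert(AnyThread(warp) == value);
    assert(AllThreads(warp) == value);
    assert(AllThreadsEqual(warp));
}
```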
7 years ago
ReinUsesLisp
b6272eb8e2
shader_ir: Implement NOP
7 years ago
ReinUsesLisp
48e8b1ab74
half_set_predicate: Fix HSETP2_C constant buffer offset
7 years ago
ReinUsesLisp
5188570517
decode/half_set_predicate: Fix predicates
7 years ago
ReinUsesLisp
11138d67ad
shader/decode: Implement S2R Tic
7 years ago
Fernando Sahmkow
9a0fa90be2
Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
This commit takes care of implementing the F16 Variants of the
conversion instructions and makes sure conversions are done.
7 years ago
Fernando Sahmkow
9a4a346b3f
Shader_Ir: Change Debug Asserts for Log Warnings
7 years ago
ReinUsesLisp
2f76aafca9
shader/half_set_predicate: Fix HSETP2 implementation
7 years ago
ReinUsesLisp
edc43b2509
shader/half_set_predicate: Implement missing HSETP2 variants
7 years ago
Lioncash
2f1921b8f4
video_core/control_flow: Provide operator!= for types with operator==
Provides operational symmetry for the respective structures.
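As a minimal sketch of the pattern, with a hypothetical struct that is not one of the actual control-flow types, operator!= can be written in terms of operator== so the two can never disagree:

```cpp
// Hypothetical stand-in for a control-flow structure.
struct Condition {
    int predicate;
    bool negated;

    constexpr bool operator==(const Condition& other) const {
        return predicate == other.predicate && negated == other.negated;
    }

    // Defined through operator== so both operators always stay consistent.
    constexpr bool operator!=(const Condition& other) const {
        return !operator==(other);
    }
};

static_assert(Condition{1, false} == Condition{1, false}, "equal fields compare equal");
static_assert(Condition{1, false} != Condition{1, true}, "differing fields compare unequal");
```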
7 years ago
Lioncash
e792178598
video_core/control_flow: Prevent sign conversion in TryGetBlock()
The return value is a u32, not an s32, so this would result in an
implicit signedness conversion.
7 years ago
Lioncash
c3dd5c7667
video_core/control_flow: Remove unnecessary BlockStack copy constructor
This is the default behavior of the copy constructor, so it doesn't need
to be specified.
While we're at it we can make the other non-default constructor
explicit.
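Both points can be illustrated with a small hypothetical class (BlockStackSketch is an invented name, not the real BlockStack): the implicit copy constructor already copies each member, and marking the single-argument constructor explicit blocks silent conversions.

```cpp
#include <cstddef>
#include <stack>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a small stack wrapper.
class BlockStackSketch {
public:
    BlockStackSketch() = default;

    // explicit: a std::stack is never silently converted into a BlockStackSketch.
    explicit BlockStackSketch(std::stack<unsigned> entries_) : entries(std::move(entries_)) {}

    // No user-written copy constructor: the implicit one copies `entries` member-wise.

    std::size_t Size() const {
        return entries.size();
    }

private:
    std::stack<unsigned> entries;
};

static_assert(std::is_copy_constructible<BlockStackSketch>::value,
              "implicit copy constructor is still available");
static_assert(!std::is_convertible<std::stack<unsigned>, BlockStackSketch>::value,
              "explicit constructor prevents implicit conversion");
```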
7 years ago
Lioncash
095259a135
video_core/control_flow: Use std::move where applicable
Results in less work being done where avoidable.
7 years ago
Lioncash
0d287d3551
video_core/control_flow: Use the prefix variant of operator++ for iterators
Same thing, but potentially allows a standard library implementation to
pick a more efficient codepath.
7 years ago
Lioncash
da307b1c61
video_core/control_flow: Use empty() member function for checking emptiness
It's what it's there for.
7 years ago
Lioncash
f6250ef163
video_core: Resolve -Wreorder warnings
Ensures that the constructor members are always initialized in the order
that they're declared in.
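The rule behind -Wreorder is that members are initialized in declaration order regardless of the order written in the initializer list. A minimal sketch with a hypothetical struct (SurfaceSketch is not a type from the codebase):

```cpp
// Hypothetical struct: `size` depends on `width` and `height`, so it must be
// declared after them for the initializer below to read initialized values.
struct SurfaceSketch {
    // Initializer list written in declaration order: width, height, size.
    constexpr SurfaceSketch(unsigned w, unsigned h) : width{w}, height{h}, size{width * height} {}

    unsigned width;
    unsigned height;
    unsigned size;
};

static_assert(SurfaceSketch{4, 8}.size == 32, "size is computed from initialized members");
```

Had size been declared before width and height, it would be initialized first, from members that are not yet initialized, even with the initializer list written as above.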
7 years ago
Lioncash
fcc59b55f7
video_core/control_flow: Make program_size for ScanFlow() a std::size_t
Prevents a truncation warning from occurring with MSVC. Also the
internal data structures already treat it as a size_t, so this is just a
discrepancy in the interface.
7 years ago
Lioncash
1bad7650ec
video_core/control_flow: Place all internally linked types/functions within an anonymous namespace
Previously, quite a few functions were being linked with external
linkage.
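A minimal sketch of the idiom, using hypothetical helper names rather than the functions actually moved by this commit: anything declared inside the anonymous namespace gets internal linkage and cannot clash with symbols in other translation units.

```cpp
namespace {

// Internal linkage: invisible outside this translation unit.
constexpr unsigned AlignUp(unsigned value, unsigned alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

} // Anonymous namespace

// External linkage: the only symbol this file exposes to the rest of the program.
unsigned PaddedProgramSize(unsigned size) {
    return AlignUp(size, 8);
}

static_assert(AlignUp(13, 8) == 16, "helper is usable within this file");
```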
7 years ago
Lioncash
78f54de493
video_core/shader/decode: Prevent sign-conversion warnings
Makes it explicit that the conversions here are intentional.
7 years ago
Fernando Sahmkow
3e0f5631c3
Shader_Ir: correct clang format
7 years ago
Fernando Sahmkow
a13b47f080
Shader_Ir: Downgrade precision and rounding asserts to debug asserts.
This commit reduces the severity of asserts for FP precision and
rounding, as these are well known and have little to no consequence for
GPU accuracy.
7 years ago
Lioncash
41e2ad0f26
shader_ir: std::move Node instance where applicable
These are std::shared_ptr instances underneath the hood, which means
copying them isn't as cheap as a regular pointer. Particularly so on
weakly-ordered systems.
This avoids atomic reference count increments and decrements where they
aren't necessary for the core set of operations.
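The effect can be sketched with a stand-in type. The names NodeData, Block and DemoMove are hypothetical; the only detail taken from the commit is that Node is a std::shared_ptr underneath. Taking the pointer by value and moving it into place transfers ownership without touching the atomic reference count.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct NodeData {
    int value;
};
using Node = std::shared_ptr<NodeData>; // assumption mirroring the commit's description

class Block {
public:
    // Take by value, then move into the container: no refcount increment/decrement pair.
    void Push(Node node) {
        nodes.push_back(std::move(node));
    }

    const std::vector<Node>& Nodes() const {
        return nodes;
    }

private:
    std::vector<Node> nodes;
};

// Returns the reference count of the stored node after moving it in.
inline long DemoMove() {
    Block block;
    Node node = std::make_shared<NodeData>(NodeData{42});
    block.Push(std::move(node)); // ownership transferred; `node` is now empty
    assert(node == nullptr);
    return block.Nodes().front().use_count();
}
```

Calling block.Push(node) without std::move would instead copy the pointer, leaving two owners and costing one atomic increment now and one atomic decrement when node goes out of scope.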
7 years ago
Lioncash
4d02d971de
shader_ir: Rename Get/SetTemporal to Get/SetTemporary
This is more accurate in terms of describing what the functions are
actually doing. Temporal relates to time, not the setting of a temporary
itself.
7 years ago
Lioncash
40a74b1546
shader_ir: Remove unused includes
Removes unnecessary header dependencies.
7 years ago
Fernando Sahmkow
88fddaca00
Shader_Ir: Correct tracking to track from right to left
7 years ago
Lioncash
778d8fedfa
shader/decode/other: Correct branch indirect argument within BRA handling
This appears to have been a copy/paste error introduced within
d5d4cc30ec
7 years ago
ReinUsesLisp
a54be6ef96
shader: Allow tracking of indirect buffers without variable offset
While changing this code, simplify tracking code to allow returning
the base address node, so that callers don't have to manually rebuild
it on each invocation.
7 years ago
Fernando Sahmkow
3533ee4697
shader_ir: Add comments on missing instruction.
Also shows Nvidia's address space in comments.
7 years ago