If the bit 0 of the JMPU instruction is set, then the jump condition
will be inverted. That is, a jump will happen when the boolean is false
instead of when it is true.
Some disabled debugging functionality was being called from rendering
routines in VideoCore. Although disabled, many of them still allocated
memory or did some extra work that was enough to show up in a profiler.
Gives a slight (~2ms) speedup.
During testing, it was discovered that hardware does not interpolate
colors output by the vertex shader as-is. Rather, it drops the sign and
saturates the value to 1.0. This is done before interpolation, such that
(e.g.) interpolating outputs 1.5 and -0.5 is equivalent to as if the
shader had output the values 1.0 and 0.5 instead, with the interpolated
value never crossing 0.0.
This change has been tested against hardware.
memory.cpp/h contains definitions related to acessing memory and
configuring the address space
mem_map.cpp/h contains higher-level definitions related to configuring
the address space accoording to the kernel and allocating memory.
We now write create a temporary buffer for output registers and copy all of them to the actual output vertex structure after the shader has run. This is technically not necessary, but it's easier to vectorize in the future.
Involves making asserts use printf instead of the log functions (log functions are asynchronous and, as such, the log won't be printed in time)
As such, the log type argument was removed (printf obviously can't use it, and it's made obsolete by the file and line printing)
Also removed some GEKKO cruft.
Unused OutputVertex attributes were being left un-initialized. The
leftover garbage sometimes decoded as floating-point denormalized
values, causing fallbacks to microcode and massive slowdowns in the rest
of the rasterization pipeline even though the results were unused. By
zeroing the structure we ensure these attributes only contain harmless
zeros.