Uses arithmetic that compilers can more readily recognize as a byte swap
and optimize. Rather than shifting the halves of the value, swapping each
half, and then combining them, we can swap the bytes in place.
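As a rough illustration of the difference described above, the two forms might be sketched as follows. The names `swap16`, `swap32_halves`, and `swap32_in_place` are hypothetical and are not taken from the actual source; only the general shape of the arithmetic is assumed.

```cpp
#include <cstdint>

// Hypothetical helper: swaps the two bytes of a 16-bit value.
constexpr std::uint16_t swap16(std::uint16_t v) {
    return static_cast<std::uint16_t>((v >> 8) | (v << 8));
}

// "Halves" form (assumed shape of the old code): split the value into
// 16-bit halves, byte-swap each half, then recombine them in swapped order.
constexpr std::uint32_t swap32_halves(std::uint32_t v) {
    return (static_cast<std::uint32_t>(swap16(static_cast<std::uint16_t>(v))) << 16) |
           swap16(static_cast<std::uint16_t>(v >> 16));
}

// "In place" form (assumed shape of the new code): mask each byte and move
// it directly to its final position in a single expression.
constexpr std::uint32_t swap32_in_place(std::uint32_t v) {
    return ((v & 0xFF000000u) >> 24) | ((v & 0x00FF0000u) >> 8) |
           ((v & 0x0000FF00u) << 8) | ((v & 0x000000FFu) << 24);
}

// Both forms compute the same result; only the codegen differs.
static_assert(swap32_halves(0x11223344u) == 0x44332211u, "");
static_assert(swap32_in_place(0x11223344u) == 0x44332211u, "");
```

Both functions are semantically identical, so any difference in the generated assembly comes down to how easily each compiler's pattern matcher recognizes the expression as a byte swap.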
For example, with the original swap32 code on x86-64, clang 8.0 would
generate:

    mov ecx, edi
    rol cx, 8
    shl ecx, 16
    shr edi, 16
    rol di, 8
    movzx eax, di
    or eax, ecx
    ret

while GCC 8.3 would generate the ideal:

    mov eax, edi
    bswap eax
    ret

Now both generate the same optimal output.
MSVC used to generate the following with the old code:

    mov eax, ecx
    rol cx, 8
    shr eax, 16
    rol ax, 8
    movzx ecx, cx
    movzx eax, ax
    shl ecx, 16
    or eax, ecx
    ret 0

Now MSVC generates a slightly different, but equally optimal, result
compared to clang/GCC:

    bswap ecx
    mov eax, ecx
    ret 0
====
In the swap64 case, for the original code, clang 8.0 would generate:

    mov eax, edi
    bswap eax
    shl rax, 32
    shr rdi, 32
    bswap edi
    or rax, rdi
    ret

(almost there, but still missing the mark)

while, again, GCC 8.3 would generate the more ideal:

    mov rax, rdi
    bswap rax
    ret

Now clang generates the optimal sequence for this fallback as well.
This is a case where MSVC unfortunately falls short: despite the new
code, it still generates a doozy of an output:

    mov r8, rcx
    mov r9, rcx
    mov rax, 71776119061217280
    mov rdx, r8
    and r9, rax
    and edx, 65280
    mov rax, rcx
    shr rax, 16
    or r9, rax
    mov rax, rcx
    shr r9, 16
    mov rcx, 280375465082880
    and rax, rcx
    mov rcx, 1095216660480
    or r9, rax
    mov rax, r8
    and rax, rcx
    shr r9, 16
    or r9, rax
    mov rcx, r8
    mov rax, r8
    shr r9, 8
    shl rax, 16
    and ecx, 16711680
    or rdx, rax
    mov eax, -16777216
    and rax, r8
    shl rdx, 16
    or rdx, rcx
    shl rdx, 16
    or rax, rdx
    shl rax, 8
    or rax, r9
    ret 0

which is pretty unfortunate.
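
For reference, a minimal sketch of what an in-place 64-bit fallback could look like. The name `swap64_in_place` is hypothetical, and the exact source expression is an assumption; the masks are chosen to match the constants visible in the MSVC listing above (for example, 71776119061217280 is 0x00FF000000000000 and 65280 is 0xFF00).

```cpp
#include <cstdint>

// Hypothetical in-place 64-bit byte swap: each byte is masked out and
// shifted straight to its mirrored position, then all eight are OR-ed.
constexpr std::uint64_t swap64_in_place(std::uint64_t v) {
    return ((v & 0xFF00000000000000ull) >> 56) |
           ((v & 0x00FF000000000000ull) >> 40) |
           ((v & 0x0000FF0000000000ull) >> 24) |
           ((v & 0x000000FF00000000ull) >> 8) |
           ((v & 0x00000000FF000000ull) << 8) |
           ((v & 0x0000000000FF0000ull) << 24) |
           ((v & 0x000000000000FF00ull) << 40) |
           ((v & 0x00000000000000FFull) << 56);
}

// Compile-time sanity check: the byte order is fully reversed.
static_assert(swap64_in_place(0x0102030405060708ull) == 0x0807060504030201ull, "");
```

Where a compiler does not fold this pattern into a single bswap, as with MSVC here, the result is still correct, just slower than the one-instruction form.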