for core stuff:
just remove unique ptrs that dont need any pointer stability at all (afterall its an allocation within an allocation so yeah)
for fibers:
Main reasoning behind this is because virtualBuffer<> is stupidly fucking expensive and it also clutters my fstat view
ALSO mmap is a syscall, syscalls are bad for performance or whatever
ALSO std::vector<> is better suited for handling this kind of "fixed size thing where its like big but not THAT big" (512 KiB isn't going to kill your memory usage for each fiber...)
for core.cpp stuff
- inlines stuff into std::optional<> as opposed to std::unique_ptr<> (because yknow, we are making the Impl from an unique_ptr, allocating within an allocation is unnecessary)
- reorganizes the structures a bit so padding doesnt screw us up (it's not perfect but eh saves a measly 44 bytes)
- removes unused/dead code
- uses std::vector<> instead of std::deque<>
no perf impact expected, maybe some initialisation boost but very minimal impact nonethless
lto gets rid of most calls anyways - the heavy issue is with shared_ptr and the cache coherency from the atomics... but i clumped them together because well, they kinda do not suffer from cache coherency - hopefully not a mistake
this balloons the size of Impl to about 1.67 MB - which is fine because we throw it in the stack anyways
REST OF INTERFACES: most of them ballooned in size as well, but overhead is ok since its an allocation within an alloc, no stack is used (when it comes to storing these i mean)
Signed-off-by: lizzie lizzie@eden-emu.dev
Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/3306
Reviewed-by: CamilleLaVey <camillelavey99@gmail.com>
Reviewed-by: MaranBr <maranbr@eden-emu.dev>
Co-authored-by: lizzie <lizzie@eden-emu.dev>
Co-committed-by: lizzie <lizzie@eden-emu.dev>
Previously, we were mixing the raw CPU frequency and CNTFRQ.
The raw CPU frequency (1020 MHz) should've never been used as CNTPCT (whose frequency is CNTFRQ) is the only counter available.
The precision of sleep_for and wait_for is limited to 1-1.5ms on Windows.
Using SleepForOneTick() allows us to sleep for exactly one interval of the current timer resolution.
This allows us to take advantage of systems that have a timer resolution of 0.5ms to reduce CPU overhead in the event loop.
This formats all copyright comments according to SPDX formatting guidelines.
Additionally, this resolves the remaining GPLv2 only licensed files by relicensing them to GPLv2.0-or-later.
In addition to requiring nanosecond precision, using the native clock requires that the hardware TSC has a precision greater than the emulated CPU and its clock counter.
From -fsanitize=address, this code wasn't calling the proper destructor.
Adding virtual destructors for each inherited class and the base class
fixes this bug.
While we are at it, mark the functions as final.
Now that clang-format makes [[nodiscard]] attributes format sensibly, we
can apply them to several functions within the common library to allow
the compiler to complain about any misuses of the functions.