This repository has been archived by the owner on Oct 29, 2024. It is now read-only.

shader_jit_a64: Compact host executable memory #230

Merged 3 commits on Sep 1, 2024

Wunkolo
Contributor

@Wunkolo Wunkolo commented Aug 11, 2024

Generates position-independent assembly so that code can first be generated into a std::vector and then copied into executable memory. This allows a compact allocation for each shader rather than reserving 1MiB of executable memory per shader. Saves up to ~1MiB for each shader.
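As a rough illustration of the approach described above, here is a minimal sketch of the "generate into a vector, then copy into a right-sized executable mapping" step on a POSIX host. The names `ExecutableBlock`, `RoundUpToPage`, `CopyToExecutable` and `Release` are hypothetical and are not taken from this PR; the actual change may allocate and protect memory differently (on Apple Silicon, for example, a JIT typically needs `MAP_JIT`, which this sketch omits).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

#include <sys/mman.h>
#include <unistd.h>

// Hypothetical holder for a block of executable memory (not the PR's type).
struct ExecutableBlock {
    void* base;        // start of the mapping, or nullptr on failure
    std::size_t size;  // size of the mapping in bytes (page-rounded)
};

// Round `bytes` up to the next multiple of `page_size`.
inline std::size_t RoundUpToPage(std::size_t bytes, std::size_t page_size) {
    assert(page_size != 0);
    return (bytes + page_size - 1) / page_size * page_size;
}

// Copy already-generated, position-independent code out of a vector into a
// fresh mapping that is only as large as the code needs.
inline ExecutableBlock CopyToExecutable(const std::vector<std::uint32_t>& code) {
    const std::size_t page_size = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t code_bytes = code.size() * sizeof(std::uint32_t);
    const std::size_t map_bytes = RoundUpToPage(code_bytes, page_size);
    if (map_bytes == 0) {
        return ExecutableBlock{nullptr, 0};
    }

    // Map as read/write first so the code can be copied in.
    void* mem = mmap(nullptr, map_bytes, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        return ExecutableBlock{nullptr, 0};
    }
    std::memcpy(mem, code.data(), code_bytes);

    // Switch the mapping to read/execute once the copy is done.
    if (mprotect(mem, map_bytes, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, map_bytes);
        return ExecutableBlock{nullptr, 0};
    }

    // AArch64 needs the instruction cache synchronized before the code runs;
    // this GCC/Clang builtin does that (it is a no-op on x86).
    char* begin = static_cast<char*>(mem);
    __builtin___clear_cache(begin, begin + code_bytes);

    return ExecutableBlock{mem, map_bytes};
}

// Unmap a block returned by CopyToExecutable and reset it.
inline void Release(ExecutableBlock& block) {
    if (block.base != nullptr) {
        munmap(block.base, block.size);
    }
    block = ExecutableBlock{nullptr, 0};
}
```

Because the final address is unknown while the code sits in the vector, this only works if the generated code does not embed addresses of its own instructions, which is why the change is about position independence.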

@Wunkolo Wunkolo force-pushed the shader-jit-a64-compact-code branch from e40eae8 to 17c3a99 Compare August 11, 2024 21:00
@PabloMK7 PabloMK7 marked this pull request as ready for review August 11, 2024 22:17
@PabloMK7 PabloMK7 marked this pull request as draft August 11, 2024 22:18
@PabloMK7
Owner

Sorry, misclicked ^^'

@Wunkolo
Contributor Author

Wunkolo commented Aug 11, 2024

This currently passes all unit tests on my M2 Mac Mini but I wanted to do some more testing before I un-draft it. 👍

@PabloMK7
Owner

NOTE: You will need to rebase against master for CI to pass.

Use the templated `BasicCodeGenerator` type rather than the specialized
`CodeGenerator` type.
This allows `VectorCodeGenerator` to work with these functions.

`VectorCodeGenerator` will always do far-calls, since no absolute addresses can be resolved here.

Generates more position-independent assembly so that code can be
generated within a resizable vector before being copied into executable
memory. This allows more compact memory allocation and usage than a
statically defined worst case applied to every case.

`VectorCodeGenerator` needs to generate position-independent code
rather than rely on absolute addresses within its own code. All far
function calls made through `VectorCodeGenerator` are assumed to use
absolute addresses rather than a relative `BL` branch, which may no
longer be valid after the code is relocated.
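
To illustrate why a relative `BL` is a problem once code is relocated, the sketch below encodes both forms by hand. `EncodeBl` and `EncodeFarCall` are hypothetical helpers written for this illustration, not oaknut or PR code, and the `MOVZ`/`MOVK`/`BLR` sequence through `x16` is only one common way to make an absolute call; the sequence actually emitted by the JIT may differ.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Encode `BL target` for an instruction located at address `pc`.
// The immediate is (target - pc) / 4, so it depends on where the code lives.
inline std::uint32_t EncodeBl(std::uint64_t pc, std::uint64_t target) {
    const std::int64_t offset = static_cast<std::int64_t>(target - pc);
    assert(offset % 4 == 0);
    // BL can only reach +/-128 MiB from the instruction.
    assert(offset >= -(INT64_C(1) << 27) && offset < (INT64_C(1) << 27));
    return 0x94000000u | (static_cast<std::uint32_t>(offset / 4) & 0x03FFFFFFu);
}

// Encode an absolute "far call": load the 64-bit target into a scratch
// register with MOVZ + 3x MOVK, then BLR through it. Nothing here depends
// on the address of the instructions themselves.
inline std::array<std::uint32_t, 5> EncodeFarCall(std::uint64_t target,
                                                  std::uint32_t scratch_reg = 16) {
    assert(scratch_reg < 31);
    std::array<std::uint32_t, 5> out{};
    for (std::uint32_t hw = 0; hw < 4; ++hw) {
        const std::uint32_t imm16 =
            static_cast<std::uint32_t>((target >> (16 * hw)) & 0xFFFFu);
        // MOVZ for the lowest 16 bits, MOVK for the remaining three chunks.
        const std::uint32_t base = (hw == 0) ? 0xD2800000u : 0xF2800000u;
        out[hw] = base | (hw << 21) | (imm16 << 5) | scratch_reg;
    }
    out[4] = 0xD63F0000u | (scratch_reg << 5);  // BLR Xscratch
    return out;
}
```

With the same target, `EncodeBl(0x1000, 0x2000)` and `EncodeBl(0x5000, 0x2000)` produce different instruction words, so a `BL` emitted into the vector would point at the wrong place after the copy. `EncodeFarCall` returns the same five words wherever they end up, at the cost of a longer sequence.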
@Wunkolo Wunkolo force-pushed the shader-jit-a64-compact-code branch from 17c3a99 to c5194f9 Compare August 20, 2024 17:02
@Wunkolo Wunkolo marked this pull request as ready for review August 31, 2024 02:20
@Wunkolo
Contributor Author

Wunkolo commented Aug 31, 2024

Did some testing and this is ready to go!
In the shader unit tests alone, this saved over 122MiB of memory:

Code block size: 01048576(1024 KiB) -> 00066084(64 KiB)  | 1024KiB - 64KiB = 960KiB
Code block size: 01048576(1024 KiB) -> 00082584(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00066112(64 KiB)  | 1024KiB - 64KiB = 960KiB
Code block size: 01048576(1024 KiB) -> 00066108(64 KiB)  | 1024KiB - 64KiB = 960KiB
Code block size: 01048576(1024 KiB) -> 00066112(64 KiB)  | 1024KiB - 64KiB = 960KiB
Code block size: 01048576(1024 KiB) -> 00082464(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082464(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00066096(64 KiB)  | 1024KiB - 64KiB = 960KiB
Code block size: 01048576(1024 KiB) -> 00066088(64 KiB)  | 1024KiB - 64KiB = 960KiB
Code block size: 01048576(1024 KiB) -> 00066088(64 KiB)  | 1024KiB - 64KiB = 960KiB
Code block size: 01048576(1024 KiB) -> 00082456(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00066088(64 KiB)  | 1024KiB - 64KiB = 960KiB
Code block size: 01048576(1024 KiB) -> 00066088(64 KiB)  | 1024KiB - 64KiB = 960KiB
Code block size: 01048576(1024 KiB) -> 00082460(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082464(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00115224(112 KiB) | 1024KiB - 112KiB = 912KiB
Code block size: 01048576(1024 KiB) -> 00115224(112 KiB) | 1024KiB - 112KiB = 912KiB
Code block size: 01048576(1024 KiB) -> 00115216(112 KiB) | 1024KiB - 112KiB = 912KiB
Code block size: 01048576(1024 KiB) -> 00115216(112 KiB) | 1024KiB - 112KiB = 912KiB
Code block size: 01048576(1024 KiB) -> 00131596(128 KiB) | 1024KiB - 128KiB = 896KiB
Code block size: 01048576(1024 KiB) -> 00131596(128 KiB) | 1024KiB - 128KiB = 896KiB
Code block size: 01048576(1024 KiB) -> 00115216(112 KiB) | 1024KiB - 112KiB = 912KiB
Code block size: 01048576(1024 KiB) -> 00115216(112 KiB) | 1024KiB - 112KiB = 912KiB
Code block size: 01048576(1024 KiB) -> 00131596(128 KiB) | 1024KiB - 128KiB = 896KiB
Code block size: 01048576(1024 KiB) -> 00131596(128 KiB) | 1024KiB - 128KiB = 896KiB
Code block size: 01048576(1024 KiB) -> 00115216(112 KiB) | 1024KiB - 112KiB = 912KiB
Code block size: 01048576(1024 KiB) -> 00131596(128 KiB) | 1024KiB - 128KiB = 896KiB
Code block size: 01048576(1024 KiB) -> 00131596(128 KiB) | 1024KiB - 128KiB = 896KiB
Code block size: 01048576(1024 KiB) -> 00131596(128 KiB) | 1024KiB - 128KiB = 896KiB
Code block size: 01048576(1024 KiB) -> 00131596(128 KiB) | 1024KiB - 128KiB = 896KiB
Code block size: 01048576(1024 KiB) -> 00131596(128 KiB) | 1024KiB - 128KiB = 896KiB
Code block size: 01048576(1024 KiB) -> 00066076(64 KiB)  | 1024KiB - 64KiB = 960KiB
Code block size: 01048576(1024 KiB) -> 00066104(64 KiB)  | 1024KiB - 64KiB = 960KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082492(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082488(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082492(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082488(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082488(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082488(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082492(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082488(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082492(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082488(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082488(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082488(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082488(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082488(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082488(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082488(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082484(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00082480(80 KiB)  | 1024KiB - 80KiB = 944KiB
Code block size: 01048576(1024 KiB) -> 00066076(64 KiB)  | 1024KiB - 64KiB = 960KiB
Code block size: 01048576(1024 KiB) -> 00098836(96 KiB)  | 1024KiB - 96KiB = 928KiB
Code block size: 01048576(1024 KiB) -> 00098836(96 KiB)  | 1024KiB - 96KiB = 928KiB
Code block size: 01048576(1024 KiB) -> 00098836(96 KiB)  | 1024KiB - 96KiB = 928KiB
Code block size: 01048576(1024 KiB) -> 00066076(64 KiB)  | 1024KiB - 64KiB = 960KiB
Code block size: 01048576(1024 KiB) -> 00098836(96 KiB)  | 1024KiB - 96KiB = 928KiB
Code block size: 01048576(1024 KiB) -> 00098836(96 KiB)  | 1024KiB - 96KiB = 928KiB
Code block size: 01048576(1024 KiB) -> 00115216(112 KiB) | 1024KiB - 112KiB = 912KiB
Code block size: 01048576(1024 KiB) -> 00115216(112 KiB) | 1024KiB - 112KiB = 912KiB
Code block size: 01048576(1024 KiB) -> 00115216(112 KiB) | 1024KiB - 112KiB = 912KiB
Code block size: 01048576(1024 KiB) -> 00098836(96 KiB)  | 1024KiB - 96KiB = 928KiB
Code block size: 01048576(1024 KiB) -> 00115216(112 KiB) | 1024KiB - 112KiB = 912KiB
Code block size: 01048576(1024 KiB) -> 00066076(64 KiB)  | 1024KiB - 64KiB = 960KiB
Code block size: 01048576(1024 KiB) -> 00098836(96 KiB)  | 1024KiB - 96KiB = 928KiB
Code block size: 01048576(1024 KiB) -> 00115216(112 KiB) | 1024KiB - 112KiB = 912KiB
Code block size: 01048576(1024 KiB) -> 00131596(128 KiB) | 1024KiB - 128KiB = 896KiB
Code block size: 01048576(1024 KiB) -> 00115216(112 KiB) | 1024KiB - 112KiB = 912KiB
Code block size: 01048576(1024 KiB) -> 00131596(128 KiB) | 1024KiB - 128KiB = 896KiB
Code block size: 01048576(1024 KiB) -> 00066076(64 KiB)  | 1024KiB - 64KiB = 960KiB
Code block size: 01048576(1024 KiB) -> 00147976(144 KiB) | 1024KiB - 144KiB = 880KiB
Code block size: 01048576(1024 KiB) -> 00082456(80 KiB)  | 1024KiB - 80KiB = 944KiB

125616 KiB saved (122.7MiB)
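
For reference, the per-block arithmetic in the log above can be reproduced with a few lines. This is only a sketch of the bookkeeping, with hypothetical helper names; it assumes, as the logged values suggest, that byte counts are truncated to whole KiB (for example 66084 bytes is shown as 64 KiB).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Size of the old fixed allocation per shader, as reported in the log.
constexpr std::uint32_t kOldBlockBytes = 1048576;

// Truncating bytes -> KiB conversion, matching the values in the log.
constexpr std::uint32_t ToKiB(std::uint32_t bytes) { return bytes / 1024; }

// KiB saved for one shader: old block size minus new block size.
inline std::uint32_t SavedKiB(std::uint32_t new_bytes) {
    assert(new_bytes <= kOldBlockBytes);
    return ToKiB(kOldBlockBytes) - ToKiB(new_bytes);
}

// Sum of the per-shader savings over a list of new block sizes in bytes.
inline std::uint64_t TotalSavedKiB(const std::vector<std::uint32_t>& new_sizes) {
    std::uint64_t total = 0;
    for (std::uint32_t bytes : new_sizes) {
        total += SavedKiB(bytes);
    }
    return total;
}

// Convert a KiB total to MiB for the summary line.
inline double KiBToMiB(std::uint64_t kib) { return static_cast<double>(kib) / 1024.0; }
```

For example, `SavedKiB(66084)` gives 960 and `SavedKiB(147976)` gives 880, matching the first and largest entries above, and `KiBToMiB(125616)` is roughly 122.67.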

@PabloMK7 PabloMK7 merged commit 3e5bbac into PabloMK7:master Sep 1, 2024
12 checks passed
@Wunkolo Wunkolo deleted the shader-jit-a64-compact-code branch September 1, 2024 16:30
2 participants