-
Notifications
You must be signed in to change notification settings - Fork 261
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HiZ Buffer issues on Integrated GPUs (Intel, Apple Silicon) #940
Comments
Maybe we can add an enum to the device manager where we scan for Nvidia, AMD and Intel vendor devices and then have a cvar like in former times r_useIntelHacks wrapped around by #if defined(linux). |
I remember those old I would rather use conditional |
I can look into the mipmap gen pass but it's pretty much just copied from the Donut framework and it does not trigger any validation errors so it is more likely we have a driver bug on Linux. Is it the same behaviour on macOS? |
Yes, when running the master branch on Apple Silicon/arm64 (my M1 laptop), the HiZ buffer feature can cause an out-of-device-memory crash. It does not happen on my x86_64 machine with a discrete AMD 6600XT GPU. And I can't test using macOS with an Intel iGPU since my ancient Apple Intel laptop is too old and won't run the game - but that combo is mostly irrelevant these days in any case. I am suspicious that this is not a driver bug since it happens in two different environments: linux and macOS. And a clue when running on macOS with Apple Silicon: when I enable either new SSAO or TAA or both, the crash does not appear with HiZ enabled on the master branch. When both new SSAO and TAA are disabled, the crash happens immediately with HiZ enabled. Without HiZ no crash occurs in any situation. That's what makes me suspicious about a compute shader resource or sync issue with HiZ enabled. |
I looked into this and it was only used for SSAO and the Amstrad CPC postfx shaders. However I need the HiZ pass for the new screen space reflections because if I reference the _currentDepth texture and also have a material that can manipulate the depth buffer then it leads to a Vulkan error. So the new code definitely requires a similar function like bool R_UseTemporalAA() that checks for the cvar and if the system can handle it. The only potential driver problem I see with this texture is that it is the only FBO image with an explicit mip map chain. |
@RobertBeckebans I believe the root cause may be a driver limitation where shader subgroups do not work properly in all circumstances on macOS/MoltenVK and possibly on linux/Intel iGPUs - note this feature is used in the HiZ compute shader. I am playing around with MoltenVK's internal settings to see if I can overcome the problem for Apple Silicon. I currently don't have a solution for linux/Intel iGPUs other than disabling HiZ as suggested above, or rewriting the HiZ compute shader to not use subgroups. See comparable issue at godotengine/godot#67373 and solution at godotengine/godot#67915. |
When the Hierarchical Depth Buffer is enabled (either for the old SSAO, or the new Filmic Post FX passes), I see problems on Integrated GPUs as follows:
If I set
r_useHierarchicalDepthBuffers = 0
the problem goes away for both cases above. I have looked at the mipmapgen shader code and played with the parameters a bit, but I don't see anything obviously wrong, other than potential memory issues. I am wondering whether there is a subtle sync + resource release delay problem that is exposed on iGPUs that are slower and have more limited memory.If you don't want to research this, I can easily patch it by adding the following code to DeviceManager_VK.cpp:
Please let me know what you want to do with this.
The text was updated successfully, but these errors were encountered: