Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HiZ Buffer issues on Integrated GPUs (Intel, Apple Silicon) #940

Open
SRSaunders opened this issue Nov 7, 2024 · 6 comments
Open

HiZ Buffer issues on Integrated GPUs (Intel, Apple Silicon) #940

SRSaunders opened this issue Nov 7, 2024 · 6 comments
Labels

Comments

@SRSaunders
Copy link

SRSaunders commented Nov 7, 2024

When the Hierarchical Depth Buffer is enabled (either for the old SSAO, or the new Filmic Post FX passes), I see problems on Integrated GPUs as follows:

  1. On the master branch when running Vulkan on Linux + Intel iGPU: Game crashes after loading a level with a device lost error.
  2. On the push constants branch when running Vulkan on M1 Apple Silicon GPU: Game crashes after loading a level with device out of memory followed by device lost errors. I suspect this one does not show up on the master branch since the M1 is far too slow over there (< 10 fps). To get a reassonable frame rate I have to test on the push constants branch.

If I set r_useHierarchicalDepthBuffers = 0 the problem goes away for both cases above. I have looked at the mipmapgen shader code and played with the parameters a bit, but I don't see anything obviously wrong, other than potential memory issues. I am wondering whether there is a subtle sync + resource release delay problem that is exposed on iGPUs that are slower and have more limited memory.

If you don't want to research this, I can easily patch it by adding the following code to DeviceManager_VK.cpp:

	if( !otherGPUs.empty() )
	{
--->		//FIXME: On integrated GPUs disable HiZ buffer to avoid device out of memory/lost errors
--->		r_useHierarchicalDepthBuffer.SetBool( false );

		m_VulkanPhysicalDevice = otherGPUs[0];
		return true;
	}

Please let me know what you want to do with this.

@RobertBeckebans
Copy link
Owner

Maybe we can add an enum to the device manager where we scan for Nvidia, AMD and Intel vendor devices and then have a cvar like in former times r_useIntelHacks wrapped around by #if defined(linux).

@SRSaunders
Copy link
Author

SRSaunders commented Nov 11, 2024

I remember those old r_skipIntelWorkarounds cvars. However, they were used to influence the setting of booleans for feature availability (e.g. timer queries). In this case, the HiZ depth buffer feature is already controlled by the cvar r_useHierarchicalDepthBuffer. So using one cvar to set another seems over-complex unless there are other GPU-specific cases which I am not aware of at this point.

I would rather use conditional #if defined(linux, mac) logic combined with integrated GPU detection to directly control the r_useHierarchicalDepthBuffer cvar. Note my original suggestion above avoids both of these options and just disables it for iGPUs on all platforms. While required on linux and macos, I realize this might be unnecessary on Windows. However, in the past I have seen memory problems which cause crashes on linux be more general in nature, and while they don't crash on Windows there is indeed something wrong underneath. Linux can act like a "canary in the coal mine" for subtle issues. That's why I asked earlier if you want to look at this to see if you could see something wrong with mipmapgen (e.g. sync, memory, etc) that I have missed.

@RobertBeckebans
Copy link
Owner

I can look into the mipmap gen pass but it's pretty much just copied from the Donut framework and it does not trigger any validation errors so it is more likely we have a driver bug on Linux. Is it the same behaviour on macOS?

@SRSaunders
Copy link
Author

SRSaunders commented Nov 12, 2024

Is it the same behaviour on macOS?

Yes, when running the master branch on Apple Silicon/arm64 (my M1 laptop), the HiZ buffer feature can cause an out-of-device-memory crash. It does not happen on my x86_64 machine with a discrete AMD 6600XT GPU. And I can't test using macOS with an Intel iGPU since my ancient Apple Intel laptop is too old and won't run the game - but that combo is mostly irrelevant these days in any case.

I am suspicious that this is not a driver bug since it happens in two different environments: linux and macOS. And a clue when running on macOS with Apple Silicon: when I enable either new SSAO or TAA or both, the crash does not appear with HiZ enabled on the master branch. When both new SSAO and TAA are disabled, the crash happens immediately with HiZ enabled. Without HiZ no crash occurs in any situation. That's what makes me suspicious about a compute shader resource or sync issue with HiZ enabled.

@RobertBeckebans
Copy link
Owner

I looked into this and it was only used for SSAO and the Amstrad CPC postfx shaders. However I need the HiZ pass for the new screen space reflections because if I reference the _currentDepth texture and also have a material that can manipulate the depth buffer then it leads to a Vulkan error. So the new code definitely requires a similar function like bool R_UseTemporalAA() that checks for the cvar and if the system can handle it.

The only potential driver problem I see with this texture is that it is the only FBO image with an explicit mip map chain.
I can ask the NVRHI maintainer if he sees a problem in how we use the code.

@SRSaunders
Copy link
Author

SRSaunders commented Jan 19, 2025

@RobertBeckebans I believe the root cause may be a driver limitation where shader subgroups do not work properly in all circumstances on macOS/MoltenVK and possibly on linux/Intel iGPUs - note this feature is used in the HiZ compute shader. I am playing around with MoltenVK's internal settings to see if I can overcome the problem for Apple Silicon. I currently don't have a solution for linux/Intel iGPUs other than disabling HiZ as suggested above, or rewriting the HiZ compute shader to not use subgroups. See comparable issue at godotengine/godot#67373 and solution at godotengine/godot#67915.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: To do
Development

No branches or pull requests

2 participants