Draft: vulkan: Add bfloat16 support #12554

Open: jeffbolznv wants to merge 4 commits into master

jeffbolznv
Collaborator

This adds bfloat16 matrix multiply support based on VK_KHR_shader_bfloat16. The extension is required for coopmat multiply support, but matrix-vector multiply trivially promotes bf16 to fp32 and doesn't require the extension. The copy/get_rows shaders also don't require the extension.
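
For readers unfamiliar with why the promotion is trivial: a bf16 value has the same layout as the upper 16 bits of an IEEE-754 binary32 float, so widening is a 16-bit shift. The sketch below shows this on the host side in C++; the function name `bf16_bits_to_fp32` is hypothetical and is not taken from this PR or from ggml.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the PR): widen raw bf16 bits to fp32.
// bf16 keeps the sign, the 8 exponent bits, and the top 7 mantissa bits of a
// binary32 float, so the stored bits go into the high half of a 32-bit word
// and the low 16 mantissa bits are zero.
inline float bf16_bits_to_fp32(uint16_t bits) {
    uint32_t widened = static_cast<uint32_t>(bits) << 16;
    float result;
    std::memcpy(&result, &widened, sizeof(result));  // bit copy, avoids aliasing issues
    return result;
}
```

The conversion is exact, since every bf16 value is representable in fp32. A shader can do the equivalent with a shift and a bit cast, which is why no extension is needed on that path.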

It's probably possible to fall back to non-coopmat and promote to fp32 when the extension isn't supported, but this change doesn't do that.

The coopmat support also requires a glslc that supports the extension, which currently requires a custom build.

jeffbolznv requested a review from 0cc4m on March 24, 2025 21:16
github-actions bot added the Vulkan (issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) labels on Mar 24, 2025
@jeffbolznv
Collaborator Author

The tooling for Vulkan bfloat16 is not all merged yet - see KhronosGroup/SPIRV-Tools#6057 and KhronosGroup/glslang#3905. If anybody wants to try this locally, you'll need to build a custom glslc. I'll update when it's all merged.

NVIDIA will release a Vulkan developer driver with support for this extension, hopefully tomorrow.

@@ -221,7 +222,8 @@ void string_to_spv_func(const std::string& _name, const std::string& in_fname, c
 std::string target_env = (name.find("_cm2") != std::string::npos) ? "--target-env=vulkan1.3" : "--target-env=vulkan1.2";
 
 // disable spirv-opt for coopmat shaders for https://github.com/ggerganov/llama.cpp/issues/10734
-std::string opt_level = coopmat ? "" : "-O";
+// XXX do not submit - temporarily disabled for bfloat
+std::string opt_level = "";//coopmat ? "" : "-O";
Collaborator Author

I don't intend to submit this line as-is. spirv-opt needs a bit of work; hopefully that will be resolved soon.
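
To make the effect of the temporary change concrete, the flag selection from the hunk can be read as a standalone function. This is only a sketch: `SpvFlags`, `select_spv_flags`, and the `temporarily_disable_opt` parameter are hypothetical names, and `coopmat` is passed in as a parameter because its definition lies outside the quoted hunk.

```cpp
#include <string>

// Hypothetical container for the two glslc arguments chosen in the hunk.
struct SpvFlags {
    std::string target_env;
    std::string opt_level;
};

// Sketch of the selection logic shown in the diff.
// - name: shader variant name; a "_cm2" substring selects the Vulkan 1.3 target.
// - coopmat: whether this is a coopmat shader (computed outside the hunk).
// - temporarily_disable_opt: hypothetical switch standing in for the
//   "do not submit" edit, which drops "-O" for every shader.
inline SpvFlags select_spv_flags(const std::string& name, bool coopmat,
                                 bool temporarily_disable_opt) {
    SpvFlags flags;
    flags.target_env = (name.find("_cm2") != std::string::npos)
                           ? "--target-env=vulkan1.3"
                           : "--target-env=vulkan1.2";
    flags.opt_level = (coopmat || temporarily_disable_opt) ? "" : "-O";
    return flags;
}
```

With `temporarily_disable_opt` set to `false` this matches the line being replaced (`coopmat ? "" : "-O"`); with it set to `true` it matches the draft line, where `opt_level` is always empty.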

@jeffbolznv
Collaborator Author

I've rebased this, and added a fallback (promote to fp32) when bf16 coopmat support isn't available. Still waiting on the tooling to be merged, so still Draft for now.

…pport

Compile a variant of the scalar mul_mm shader that will promote the bf16
values to float, and use that when either the bf16 extension or the coopmat
extensions aren't available.
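
The rule in this commit message can be sketched as a small decision function. The names below (`DeviceCaps`, `Bf16MatMulPath`, `select_bf16_mul_mm_path`) are hypothetical and do not come from the backend source; the sketch only assumes that the device reports one flag for coopmat support and one for VK_KHR_shader_bfloat16.

```cpp
// Hypothetical capability flags; the backend's real device struct is not
// shown in this PR excerpt.
struct DeviceCaps {
    bool has_coopmat;      // a cooperative matrix extension is usable
    bool has_shader_bf16;  // VK_KHR_shader_bfloat16 is usable
};

enum class Bf16MatMulPath {
    CoopmatBf16,          // coopmat shader operating on bf16 directly
    ScalarPromoteToFp32   // scalar mul_mm variant that promotes bf16 to float
};

// The coopmat bf16 path needs both extensions; if either one is missing,
// fall back to the scalar variant that promotes to fp32.
inline Bf16MatMulPath select_bf16_mul_mm_path(const DeviceCaps& caps) {
    return (caps.has_coopmat && caps.has_shader_bf16)
               ? Bf16MatMulPath::CoopmatBf16
               : Bf16MatMulPath::ScalarPromoteToFp32;
}
```

Only the case with both flags set selects the coopmat path; the other three combinations take the fallback.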