sycl : variable sg_size support for mmvq kernels #12336

Open: 1 commit to be merged into master

Conversation

Collaborator

@Alcpz Alcpz commented Mar 11, 2025

The MMVQ kernels were based on CUDA's mmvq kernels, which were tailored for subgroups of size 32. Larger subgroups were considered, but smaller ones were not. These changes make the kernels more flexible and could reduce the code base size if we can make them fast enough on Intel architectures.

The introduced changes distribute more work per thread / workitem if WARP_SIZE < 32.
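
To illustrate the idea of giving each work-item more work when the subgroup is narrower than 32, here is a minimal host-side sketch in plain C++. It is not the code from this PR: `kReferenceSubgroupSize`, `passes_per_lane` and `subgroup_dot` are hypothetical names, the quantized blocks are replaced by plain floats, and the subgroup size is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical constant: the subgroup width the original kernels were tuned for.
constexpr int kReferenceSubgroupSize = 32;

// How many strided passes each lane makes per tile so that a subgroup of
// `sg_size` lanes covers what 32 lanes would cover in a single pass.
// Assumes sg_size is a power of two (so it divides 32 when smaller).
constexpr int passes_per_lane(int sg_size) {
    return sg_size >= kReferenceSubgroupSize ? 1
                                             : kReferenceSubgroupSize / sg_size;
}

// Host-side emulation of one subgroup computing a dot product.
// Each lane accumulates a private partial sum, then the partials are reduced,
// standing in for the subgroup reduction a real kernel would perform.
float subgroup_dot(const std::vector<float>& x, const std::vector<float>& y,
                   int sg_size) {
    assert(sg_size > 0 && x.size() == y.size());
    const int passes = passes_per_lane(sg_size);
    const std::size_t width = static_cast<std::size_t>(sg_size);
    const std::size_t tile = width * static_cast<std::size_t>(passes);

    std::vector<float> partial(width, 0.0f);
    for (std::size_t base = 0; base < x.size(); base += tile) {
        for (std::size_t lane = 0; lane < width; ++lane) {
            // A narrower subgroup loops more times here: more work per lane.
            for (int p = 0; p < passes; ++p) {
                const std::size_t i =
                    base + static_cast<std::size_t>(p) * width + lane;
                if (i < x.size()) {
                    partial[lane] += x[i] * y[i];
                }
            }
        }
    }

    float sum = 0.0f;
    for (float v : partial) {
        sum += v;
    }
    return sum;
}
```

With this layout a subgroup of 16 makes two passes per tile and a subgroup of 8 makes four, so the tile size and the result stay the same regardless of the subgroup width.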

This PR doesn't affect performance.

QK_WARP_SIZE is still needed for some kernels that still require subgroup size == 32, so I am renaming the existing variable for now.

Note: To enable these changes on Intel devices, further changes are required in the entry point of the MUL_MAT op.

Note 2: This code path requires FMA intrinsics that may not be available on all architectures, so further changes are required to make it the default path on Intel devices without potentially losing support for older architectures.

@github-actions github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and SYCL (https://en.wikipedia.org/wiki/SYCL - GPU programming language) labels Mar 11, 2025
Collaborator

@Rbiessy Rbiessy left a comment

LGTM

@Alcpz Alcpz force-pushed the Alcpz/variable-sg-mmvq branch from 403a5ad to 65ecd1a Compare March 11, 2025 20:43
Contributor

@qnixsynapse qnixsynapse left a comment

Awesome. Thank you!
LGTM

Labels: ggml (changes relating to the ggml tensor library for machine learning), SYCL (https://en.wikipedia.org/wiki/SYCL - GPU programming language)
3 participants