ggml-cpu: Build variant targeting Neoverse-V2 #14380

Open
wants to merge 4 commits into
base: master
ckastner
Collaborator

As a first improvement on the recently added generic ARM support for GGML_CPU_ALL_VARIANTS, this builds a variant that specifically targets Neoverse-V2 (e.g. Graviton4 or NVIDIA Grace).

  • The cmake part needed little change. Feature processing is unchanged, but the target is a specific -mcpu= rather than a generic -march=.
  • It also defines GGML_ARM_MCPU, which is passed on to the scoring function.
  • The scoring function parses the CPU part number from /proc/cpuinfo on Linux (Graviton4 is Linux-only, and I'd guess NVIDIA Grace is too) and uses it in scoring.
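
For illustration, reading the part number could look roughly like the sketch below. This is not the code from this PR: the helper names `parse_cpu_part` and `host_is_neoverse_v2` are hypothetical, and the sketch assumes the usual arm64 Linux layout where `/proc/cpuinfo` contains lines of the form `CPU part : 0xd4f` (0xd4f being the Arm part number for Neoverse V2). It only looks at the first such line, so it ignores heterogeneous systems and does not check the implementer field.

```cpp
#include <exception>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Arm MIDR part number for Neoverse V2.
constexpr long kNeoverseV2Part = 0xd4f;

// Hypothetical helper: returns the value of the first "CPU part" line,
// or -1 if no such line exists or it cannot be parsed.
long parse_cpu_part(std::istream & in) {
    std::string line;
    while (std::getline(in, line)) {
        if (line.rfind("CPU part", 0) != 0) {
            continue; // line does not start with "CPU part"
        }
        const auto colon = line.find(':');
        if (colon == std::string::npos) {
            continue;
        }
        try {
            // Base 16 accepts leading whitespace and an optional "0x" prefix.
            return static_cast<long>(std::stoul(line.substr(colon + 1), nullptr, 16));
        } catch (const std::exception &) {
            return -1;
        }
    }
    return -1;
}

// Hypothetical wrapper: true only if /proc/cpuinfo is readable and
// reports the Neoverse V2 part number.
bool host_is_neoverse_v2() {
    std::ifstream cpuinfo("/proc/cpuinfo");
    return cpuinfo && parse_cpu_part(cpuinfo) == kNeoverseV2Part;
}
```

Taking a `std::istream` rather than a file path keeps the parser testable on machines that are not arm64 Linux; on other platforms `host_is_neoverse_v2()` simply returns false.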

In the scoring function, I shifted features to the 9th bit and beyond. The idea is that features are more important than microarchitecture, platform, or similar distinctions, which can use bits 2-8 to rank themselves. Nuances like microarchitecture therefore become relevant in scoring only when two variants otherwise have equal features; in every other case, features win. I thought this might be a useful convention.
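
A minimal sketch of that bit layout, under stated assumptions: the names `Variant`, `Host`, `variant_score`, and the `FEAT_*` constants are hypothetical and not taken from the PR, and a part-number mismatch is assumed to earn no bonus rather than disqualify the variant. Feature bits start at `1 << 8` (the 9th bit), the microarchitecture bonus uses `1 << 1`, and the lowest bit is a baseline score of 1.

```cpp
#include <cstdint>

// Hypothetical feature flags, placed at the 9th bit and beyond.
constexpr uint32_t FEAT_DOTPROD = 1u << 8;
constexpr uint32_t FEAT_SVE     = 1u << 9;
constexpr uint32_t FEAT_I8MM    = 1u << 10;

// Hypothetical microarchitecture bonus, placed in the low bits (2-8).
constexpr uint32_t BONUS_MCPU_MATCH = 1u << 1;

struct Variant {
    uint32_t required_features; // features the variant was compiled with
    uint32_t mcpu_part;         // CPU part it was tuned for, 0 if generic
};

struct Host {
    uint32_t features; // features detected at runtime
    uint32_t cpu_part; // part number detected at runtime
};

// Returns 0 if the variant cannot run on the host, otherwise a score in
// which any extra feature outweighs every microarchitecture bonus.
uint32_t variant_score(const Variant & v, const Host & h) {
    if ((v.required_features & ~h.features) != 0) {
        return 0; // variant needs a feature the host lacks
    }
    uint32_t score = 1;           // baseline: the variant is usable
    score += v.required_features; // each feature contributes its own bit
    if (v.mcpu_part != 0 && v.mcpu_part == h.cpu_part) {
        score += BONUS_MCPU_MATCH; // tie-breaker among equal feature sets
    }
    return score;
}
```

With this layout, a tuned variant beats a generic one that has the same features, while a generic variant with one additional supported feature still beats the tuned one, because the smallest feature bit is larger than the sum of all the low bits.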

I tested this on Graviton4, where the neoverse-v2 variant indeed received a higher score than the armv8.6-a variant. The armv8.6-a variant would also work on Neoverse-V2, as it is armv8.6-a. neoverse-v2 is also what the GGML_NATIVE=ON build targets.

I did not see meaningful benchmark improvements over generic armv8.6-a, but I tested only a limited set of models, and only with 4 vCPUs. Some tests showed a 2-3% improvement, but this wasn't always reproducible. I hope to get more AWS resources in July, when I can properly test this on a dedicated box.

In any case, I think this would at least serve as an easy-to-copy template for other variants where this might matter more.

This supersedes #14332.

@github-actions github-actions bot added the ggml changes relating to the ggml tensor library for machine learning label Jun 25, 2025