ggml-cpu: Build variant targeting Neoverse-V2 #14380

ckastner · 2025-06-25T19:14:19Z

As a first improvement on the recently added generic ARM support for GGML_CPU_ALL_VARIANTS, this builds a variant targeting Neoverse-V2 specifically (e.g., Graviton4 or NVIDIA Grace).

The cmake part needed little change. Feature processing is unchanged, but the target is a specific -mcpu= rather than a generic -march=.
It also defines GGML_ARM_MCPU, which is passed on to the scoring function.
The scoring function parses the CPU part number from /proc/cpuinfo on Linux (Graviton4 is Linux-only, and I'd guess NVIDIA Grace is too) and uses it in scoring.
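
For illustration only (this is not the code in the PR), here is a minimal sketch of how a part number might be read from /proc/cpuinfo. The helper names `parse_cpu_part` and `read_cpu_part` are hypothetical. It assumes the usual arm64 Linux line format `CPU part	: 0xd4f`, where 0xd4f is the part number Arm assigns to Neoverse-V2, and it only looks at the first core listed.

```cpp
#include <cstdlib>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: return the first "CPU part" value found in
// cpuinfo-formatted text, or -1 if no such line is present.
static long parse_cpu_part(std::istream & in) {
    std::string line;
    while (std::getline(in, line)) {
        // Only lines starting with "CPU part" are of interest.
        if (line.compare(0, 8, "CPU part") != 0) {
            continue;
        }
        const auto colon = line.find(':');
        if (colon == std::string::npos) {
            continue;
        }
        // Base 0 lets strtol accept the "0x" prefix; leading spaces are skipped.
        return std::strtol(line.c_str() + colon + 1, nullptr, 0);
    }
    return -1;
}

// Hypothetical wrapper: read the part number of the first CPU on Linux.
static long read_cpu_part() {
    std::ifstream f("/proc/cpuinfo");
    return f ? parse_cpu_part(f) : -1;
}
```

Keeping the parser separate from the file access makes it testable with a `std::istringstream` on machines that are not ARM. A scoring function could then compare the result against the part number implied by GGML_ARM_MCPU.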

In the scoring function, I shifted features to the 9th bit and beyond. The idea is that features are more important than microarchitecture, platform, or anything else, which can use bits 2-8 to rank themselves. Nuances like the microarchitecture of two variants therefore become relevant in scoring only if the variants otherwise have equal features; in every other case, features win. I thought this might be a useful convention.
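
The bit layout described above can be sketched as follows. This is an assumption-laden illustration, not the PR's scoring function: `variant_score`, `feature_mask`, and `rank` are hypothetical names, bit 1 is assumed to mean "variant is usable", and the check that returns 0 for features the CPU lacks is left out.

```cpp
#include <cassert>
#include <cstdint>

// Features start at the 9th bit (counting from 1), i.e. a shift of 8.
constexpr int FEATURE_SHIFT = 8;

// feature_mask: one bit per feature the variant was built with (hypothetical).
// rank:         tie-breaker such as microarchitecture; it must fit into
//               bits 2-8, so the valid range is 0..127.
static uint32_t variant_score(uint32_t feature_mask, uint32_t rank) {
    assert(rank < 128);
    return 1u                               // bit 1: base score
         | (rank << 1)                      // bits 2-8: tie-breaker
         | (feature_mask << FEATURE_SHIFT); // bits 9+: features
}
```

With this layout, `variant_score(0x3, 1)` beats `variant_score(0x3, 0)` because the features are equal and the rank decides, while `variant_score(0x7, 0)` still beats `variant_score(0x3, 127)` because any difference in the feature bits outweighs the largest possible rank.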

I tested this on Graviton4, where the neoverse-v2 variant indeed received a higher score than the armv8.6-a variant, which would also work for Neoverse-V2 as it is armv8.6-a. neoverse-v2 is also what the GGML_NATIVE=ON build targets.

I did not see meaningful benchmark improvements over generic armv8.6-a, but I tested only a few models, and only with 4 vCPUs. Some tests showed a 2-3% improvement, but this wasn't always reproducible. I hope to get more AWS resources in July, when I can properly test this on a dedicated box.

In any case, I think this would at least serve as an easy-to-copy template for other variants where this might matter more.

This supersedes #14332.

cmake: Reduce unnecessary nesting

4f04a23

github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning) Jun 25, 2025

ckastner mentioned this pull request Jun 25, 2025

ggml-cpu: Pass on tag_name to the feature scoring #14332

Closed

ckastner added 3 commits June 25, 2025 21:16

ggml-cpu: Add ARM variant targeting neoverse-v2

15dd2f7

ggml-cpu: Split ARM backend scores

c02e0da

This allows for ranking backends when they otherwise support the same features.

ggml-cpu: Rank neoverse-v2 over generic ARM

14ca242

ckastner force-pushed the target-cpus branch from 94b2702 to 14ca242 Compare June 25, 2025 19:17
