
Build linux CUDA releases suitable for Colab & other platforms on 12.2 #11226

Draft · wants to merge 26 commits into master
Conversation

@ochafik ochafik commented Jan 14, 2025

  • Building specific archs separately to get maximum performance, the smallest package size & the shortest build times possible (compare a build for 7.5+8.0 vs. just 7.5, for instance: libggml-cuda.so is almost twice the size, ~70 MB per arch)

  • Colab one-liner to install CUDA llama.cpp (example usage below; it will need to be adjusted to point at GitHub releases, and an install script will follow once releases are available):

    # Temporarily, hosting binaries on my own server
    !wget -O llama-cpp.zip "https://download.ochafik.com/llama.cpp/llama-cpp-master-cuda-$( nvidia-smi | grep "CUDA Version: " | sed -E 's/.*?Version: ([0-9]+\.[0-9]+).*/\1/' )-cap-$( nvidia-smi --query-gpu=compute_cap --format=csv | tail -n 1 ).zip" && unzip -o llama-cpp.zip
    
    # Once this PR gets merged
    !wget -O llama-cpp.zip "$( curl --silent "https://api.github.com/repos/ggerganov/llama.cpp/releases/latest" | grep cuda-cu$( nvidia-smi | grep "CUDA Version: " | sed -E 's/.*?Version: ([0-9]+\.[0-9]+).*/\1/' )-cap$( nvidia-smi --query-gpu=compute_cap --format=csv | tail -n 1 ) | grep browser_download_url | sed -E 's/.*(https:.*)"/\1/' )"
    

TODO

  • Merge ci: ccache for all github workflows #11516
  • Fix build on ci
  • Compare benefit of separate archives for server / cli vs. full?
  • Trigger a branch release if possible (to test entire mechanics)
  • Incubate install.sh (Unix incl. WSL) & install.ps1 (Windows) scripts that detect OS, arch, CPU & GPU capabilities and install the right release (maybe through brew)
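The detection step of that last TODO item could look roughly like the following. This is a hypothetical sketch only: the helper names, the archive URL pattern, and the no-GPU fallback are all assumptions, not the final script.

```sh
#!/usr/bin/env sh
# Hypothetical sketch of install.sh's detection logic (names and URL
# pattern are assumptions, not the final script).

detect_os() { uname -s | tr '[:upper:]' '[:lower:]'; }   # e.g. linux, darwin
detect_arch() { uname -m; }                              # e.g. x86_64, aarch64

# Extract e.g. "12.2" from the `nvidia-smi` banner text passed on stdin.
cuda_version() { sed -nE 's/.*CUDA Version: ([0-9]+\.[0-9]+).*/\1/p' | head -n 1; }

# Keep the last line of `nvidia-smi --query-gpu=compute_cap --format=csv`
# output on stdin, which is the capability value (e.g. "7.5").
compute_cap() { tail -n 1; }

if command -v nvidia-smi >/dev/null 2>&1; then
  cu=$(nvidia-smi | cuda_version)
  cap=$(nvidia-smi --query-gpu=compute_cap --format=csv | compute_cap)
  echo "would fetch: llama-cpp-master-cuda-${cu}-cap-${cap}.zip ($(detect_os)/$(detect_arch))"
else
  echo "no NVIDIA GPU detected; would fall back to a CPU build"
fi
```

The same `nvidia-smi` parsing appears inline in the one-liners above; factoring it into functions is just what a reusable script would likely do.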

@github-actions github-actions bot added labels Jan 14, 2025: devops (improvements to build systems and github actions), ggml (changes relating to the ggml tensor library for machine learning)

slaren commented Jan 14, 2025

What's the reason for making a different release for each arch?


ochafik commented Jan 14, 2025

> What's the reason for making a different release for each arch?

@slaren Building for a single arch seems a lot faster, and having separate artefacts instead of fat CUDA binaries means smaller downloads and quicker setup on Colab. I couldn't finish a full build with all the architectures locally yet, though; maybe I'll try that to see how much overhead per arch we're talking about.
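For reference, restricting a build to one architecture comes down to the CUDA architecture list passed to CMake. A minimal sketch, assuming the `GGML_CUDA` CMake option (flag names have varied across llama.cpp versions, so treat these as assumptions):

```sh
# Build llama.cpp for a single CUDA arch (compute capability 7.5 -> "75").
# GGML_CUDA and CMAKE_CUDA_ARCHITECTURES values here are assumptions.
cmake -B build -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=75
cmake --build build --config Release -j

# A fat binary instead embeds code for each listed arch, roughly doubling
# libggml-cuda.so per added arch as noted above:
# cmake -B build -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES="75;80"
```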
