A minimalist open-source LLM inference stack: SGLang for structured outputs, fronted by a Bun web server with minimal API key auth for 3x higher throughput than FastAPI.
The stack is currently highly opinionated and is likely to change and become more general over time.
The following assumes you're running on a Linux machine (likely in the cloud) with a GPU.
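Conceptually, the Bun server sits in front of SGLang: it checks the API key and forwards OpenAI-compatible requests to the local SGLang process. The sketch below is illustrative only, not the repo's actual server code; the SGLANG_URL variable and the assumption that SGLang listens on port 30000 are placeholders.

```ts
// server-sketch.ts — a minimal sketch, not the repo's actual implementation.
// Assumes SGLang is running locally and serving its OpenAI-compatible API.
const SGLANG_URL = process.env.SGLANG_URL ?? "http://localhost:30000";

Bun.serve({
  port: 3000,
  async fetch(req) {
    // Reject requests that don't carry the expected bearer token.
    const auth = req.headers.get("authorization");
    if (auth !== `Bearer ${process.env.SERVER_API_KEY}`) {
      return new Response("Unauthorized", { status: 401 });
    }

    // Proxy the path, query, and body through to SGLang unchanged.
    const url = new URL(req.url);
    const body = req.method === "GET" ? undefined : await req.text();
    return fetch(`${SGLANG_URL}${url.pathname}${url.search}`, {
      method: req.method,
      headers: { "content-type": req.headers.get("content-type") ?? "application/json" },
      body,
    });
  },
});
```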
To build and run the Docker image:
docker build -t inference .
docker run -p 3000:3000 \
-e SERVER_API_KEY=your_api_key \
-e HF_TOKEN=your_hf_token \
inference
The pre-built Docker image is also available on GHCR as rubriclab/inference:latest. To run the pre-built image:
docker run -p 3000:3000 \
-e SERVER_API_KEY=your_api_key \
-e HF_TOKEN=your_hf_token \
ghcr.io/rubriclab/inference:latest
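Once the container is up, you can sanity-check it from any HTTP client. A minimal Bun/TypeScript check might look like the following; the /v1/models route and the Bearer auth header are assumptions based on the OpenAI-compatible API the server exposes.

```ts
// sanity-check.ts — run with: bun sanity-check.ts
// Assumes the server exposes the OpenAI-compatible GET /v1/models route
// and expects the API key as a Bearer token.
const res = await fetch("http://localhost:3000/v1/models", {
  headers: { authorization: `Bearer ${process.env.SERVER_API_KEY}` },
});

console.log(res.status, await res.json());
```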
Install SkyPilot and connect an infra provider. We recommend uv for fast setup, and either Vast for competitively priced GPUs or Runpod for a smoother experience.
First, grab an API key from your cloud provider (e.g. Vast or Runpod).
curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv --python 3.10
source .venv/bin/activate
uv pip install "skypilot[vast,runpod]"
# Vast
uv pip install "vastai-sdk>=0.1.12"
echo "<your_vast_api_key>" > ~/.vast_api_key
# Runpod
uv pip install "runpod>=1.6.1"
runpod config # then enter your API key
sky launch skypilot.yaml
Test the API from any OpenAI-compatible client:
cd test && bun i && touch .env
Populate your .env with:
BASE_URL=http://localhost:3000/v1
SERVER_API_KEY=your_api_key
Run the test:
bun index.ts
You should see a reasoning chain and a JSON payload conforming to the schema.
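For reference, here is a rough sketch of what a test client like test/index.ts could look like, using the openai package against the Bun server. The schema, prompt, and model name below are placeholders; the repo's actual test will differ.

```ts
// index.ts — illustrative only; the repo's actual test may differ.
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: process.env.BASE_URL,      // e.g. http://localhost:3000/v1
  apiKey: process.env.SERVER_API_KEY, // checked by the Bun server
});

// A hypothetical JSON schema for the structured output.
const schema = {
  type: "object",
  properties: {
    answer: { type: "string" },
    confidence: { type: "number" },
  },
  required: ["answer", "confidence"],
};

const completion = await client.chat.completions.create({
  model: "default", // placeholder; use whatever model the server is serving
  messages: [
    { role: "user", content: "Is the sky blue? Answer with a confidence score." },
  ],
  response_format: {
    type: "json_schema",
    json_schema: { name: "answer", schema, strict: true },
  },
});

console.log(completion.choices[0].message.content);
```

Depending on the model and server configuration, the reasoning chain may be returned alongside (or streamed before) the JSON payload.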