Stable Audio Open Modal

This repo includes Python code for running inference with the Stable Audio Open 1.0 model. It can be run locally or hosted on Modal.

generate_audio_sample.py tweaks the provided prompt to steer the model toward a one-shot sample, such as a single drum hit. It then post-processes the model output to trim extra hits and fade the audio out smoothly.
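
The trim-and-fade step described above can be sketched roughly as follows. This is a minimal illustration, not the repo's actual implementation: the function names, the amplitude threshold, the silence length, and the fade duration are all assumptions, and the audio is assumed to be a mono 1-D NumPy array.

```python
import numpy as np


def trim_to_first_hit(audio, sample_rate, threshold=0.02, min_silence_s=0.05):
    """Keep only the first hit: from the first loud sample up to the first
    sustained quiet stretch. Threshold and silence length are assumed values."""
    loud = np.flatnonzero(np.abs(audio) > threshold)
    if loud.size == 0:
        return audio  # nothing above the threshold, leave the signal untouched
    min_gap = int(min_silence_s * sample_rate)
    # A long gap between consecutive loud samples marks the end of the first hit.
    long_gaps = np.flatnonzero(np.diff(loud) >= min_gap)
    end = loud[long_gaps[0]] + 1 if long_gaps.size else loud[-1] + 1
    return audio[loud[0]:end]


def fade_out(audio, sample_rate, fade_s=0.01):
    """Apply a linear fade to the last fade_s seconds so the sample ends at zero."""
    n = min(len(audio), int(fade_s * sample_rate))
    out = audio.astype(np.float64, copy=True)
    if n > 0:
        out[-n:] *= np.linspace(1.0, 0.0, n)
    return out
```

A linear ramp is the simplest choice here; an exponential or cosine fade may sound more natural on some material.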

Hugging Face setup

In order to access the Stable Audio Open model, you'll need to:

1. Create a Hugging Face account.
2. Navigate to the Stable Audio Open 1.0 model page and opt in to gain access to the model.
3. Create a Hugging Face access token with read access.
4. Copy the token and add it to your local environment under the name HF_TOKEN:

For zsh, add this to your ~/.zshrc:

```
export HF_TOKEN=myhftoken
```

For fish, add this to your fish config (e.g. ~/.config/fish/config.fish):

```
set -Ux HF_TOKEN myhftoken
```
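
To confirm the token is visible to Python before running inference, a small check like the one below may help. It is only a sketch: `get_hf_token` is a hypothetical helper, not something the repo is known to provide.

```python
import os


def get_hf_token(env=None):
    """Return the HF_TOKEN value, or raise if it is missing or blank.

    Hypothetical helper; `env` defaults to os.environ and exists so the
    function can be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set. Export it in your shell config and open a new shell."
        )
    return token
```

Calling `get_hf_token()` in a fresh shell is a quick way to verify that the shell config change took effect.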

Local environment setup

1. Install miniconda: https://docs.conda.io/en/latest/miniconda.html
2. Set up the conda environment:
   ```
   conda env create -f environment.yml
   ```
3. Activate it:
   ```
   conda activate stable-audio-open-modal
   ```

Running locally

To run inference locally, you can run generate_audio.py after activating the conda environment.

For example:

```
python generate_audio.py --prompt "Massive metallic techno kick drum"
```

This will generate a file called output_0.wav in the current directory.

To see a list of available arguments to customize inference, run:

```
python generate_audio.py -h
```

Running on Modal

To deploy the app to run inference on Modal, you'll need to:

1. Create a Modal account.
2. Create a Hugging Face account and API token.
3. Sign the agreement to use the Stable Audio Open 1.0 model.
4. Set up secrets for the Modal app with the following environment variables:
   - HF_TOKEN: Your Hugging Face API token
   - AUTH_TOKEN: A Bearer auth token you create to authenticate requests to the Modal app
5. Deploy the app with the following command:
```
modal deploy src/api.py
```
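
For context on what AUTH_TOKEN is used for, a Bearer check on the server side usually looks something like the sketch below. This is an illustration under assumptions, not the code in src/api.py: `is_authorized` is a hypothetical name, and the real app may validate the header differently.

```python
import hmac


def is_authorized(authorization_header, expected_token):
    """Return True only if the header is 'Bearer <token>' with the expected token.

    Hypothetical helper. hmac.compare_digest performs a constant-time
    comparison, which avoids leaking token contents through timing.
    """
    if not authorization_header or not expected_token:
        return False
    scheme, _, presented = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    return hmac.compare_digest(presented.strip().encode(), expected_token.encode())
```

Any sufficiently long random string works as the token value, as long as the same value is stored in the Modal secret and sent by the client.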

You can test the endpoint before deploying with the following command:

```
modal serve src/api.py
```

Then send a POST request to the endpoint from your local machine. This assumes the AUTH_TOKEN environment variable is set in your shell.

```
curl -X POST https://your-modal-endpoint.modal.run \
  -H "Authorization: Bearer $AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Dub techno snare"
  }' --output "modal-out.wav"
```
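
The same request can be made from Python using only the standard library. This is a sketch that mirrors the curl call above; the function names are illustrative, the endpoint URL is a placeholder you would replace with your own, and it assumes the endpoint returns the WAV bytes directly in the response body, as the `--output` flag suggests.

```python
import json
import urllib.request


def build_request(endpoint_url, prompt, auth_token):
    """Build the POST request: JSON body with the prompt plus a Bearer auth header."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {auth_token}",
            "Content-Type": "application/json",
        },
    )


def generate_to_file(endpoint_url, prompt, auth_token, out_path="modal-out.wav"):
    """Send the request and write the response body to out_path."""
    request = build_request(endpoint_url, prompt, auth_token)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
    return out_path
```

For example, `generate_to_file("https://your-modal-endpoint.modal.run", "Dub techno snare", os.environ["AUTH_TOKEN"])` (after `import os`) would write modal-out.wav to the current directory.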

