A unified interface for seamless file operations across local, S3, and Hugging Face ecosystems.
unibox simplifies loading, saving, and exploring data, whether it's a local CSV, an S3-hosted image, or an entire Hugging Face dataset. A single API handles many file types and storage backends.
pip install unibox
Or with uv:
uv tool install unibox
Load anything, anywhere:
import unibox as ub
# Local parquet file
df = ub.loads("data/sample.parquet")
# S3-hosted text file
lines = ub.loads("s3://my-bucket/notes.txt")
# Hugging Face dataset
dataset = ub.loads("hf://user/repo")
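The three calls above differ only in the path prefix. As a rough illustration of how one entry point could tell backends apart, here is a minimal standard-library sketch. `classify_uri` is a hypothetical helper written for this example; it is not part of unibox and makes no claim about how unibox routes paths internally.

```python
from urllib.parse import urlparse


def classify_uri(uri: str) -> tuple[str, str]:
    """Return (backend, location) for a path or URI.

    Hypothetical helper for illustration only; not a unibox function.
    """
    parsed = urlparse(uri)
    if parsed.scheme in ("s3", "hf"):
        # For "s3://bucket/key", netloc is the bucket and path is "/key".
        # For "hf://user/repo", netloc is the user and path is "/repo".
        return parsed.scheme, f"{parsed.netloc}{parsed.path}"
    # Anything without a recognized scheme is treated as a local path.
    return "local", uri


for uri in ("data/sample.parquet", "s3://my-bucket/notes.txt", "hf://user/repo"):
    print(uri, "->", classify_uri(uri))
```

A real implementation would then hand the location to a backend-specific reader; this sketch stops at the routing decision.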
Save with ease:
ub.saves(df, "s3://my-bucket/processed.parquet")
ub.saves(dataset, "hf://my-org/new-dataset")
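The save calls above do not name a format; the file extension carries that information. The sketch below shows that idea using only the standard library and two formats. `save_by_extension` and `load_by_extension` are hypothetical names invented for this example, and the supported formats are an assumption; unibox's own format handling may work differently.

```python
import json
import tempfile
from pathlib import Path


def save_by_extension(obj, path):
    """Pick a writer from the file suffix (hypothetical helper, not unibox)."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".json":
        path.write_text(json.dumps(obj), encoding="utf-8")
    elif suffix == ".txt":
        # Expects an iterable of strings, written one per line.
        path.write_text("\n".join(obj), encoding="utf-8")
    else:
        raise ValueError(f"unsupported extension: {suffix!r}")


def load_by_extension(path):
    """Pick a reader from the file suffix (hypothetical helper, not unibox)."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".json":
        return json.loads(path.read_text(encoding="utf-8"))
    if suffix == ".txt":
        return path.read_text(encoding="utf-8").splitlines()
    raise ValueError(f"unsupported extension: {suffix!r}")


# Round trip in a temporary directory so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "record.json"
    save_by_extension({"id": 1, "tags": ["a", "b"]}, target)
    print(load_by_extension(target))
```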
List files or peek at data:
# List all JPGs in an S3 folder
images = ub.ls("s3://bucket/images", exts=[".jpg"])
# Preview a dataset
ub.peeks(dataset)
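To make the `exts` filter concrete, here is a local-only sketch of an extension-filtered listing built on `pathlib`. `list_files` is a hypothetical stand-in, not the `ub.ls` implementation; recursive traversal and case-insensitive suffix matching are assumptions of this sketch, not documented unibox behavior.

```python
import tempfile
from pathlib import Path


def list_files(root, exts=None):
    """List files under root, optionally keeping only the given suffixes.

    Hypothetical local stand-in for an extension-filtered listing.
    """
    # Normalize the wanted suffixes once; None means "keep everything".
    wanted = {e.lower() for e in exts} if exts else None
    return sorted(
        str(p)
        for p in Path(root).rglob("*")
        if p.is_file() and (wanted is None or p.suffix.lower() in wanted)
    )


# Build a small throwaway folder and list only the JPGs in it.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.jpg", "b.JPG", "notes.txt"):
        (Path(tmp) / name).write_bytes(b"")
    jpgs = list_files(tmp, exts=[".jpg"])
    print([Path(p).name for p in jpgs])
```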
- Versatile: Handles CSVs, images, datasets, and more, locally or remotely.
- Simple: One function call to load or save, no matter the source.
- Scalable: Covers everything from quick data previews to concurrent downloads.
Explore the full power in our documentation.
Love unibox? Join us! Check out CONTRIBUTING.md to get started.
Extra dev notes: see README_dev.md.