This repository provides a barebones template for ML projects using PyTorch, Transformers, and Hydra. It includes definitions for a Docker development container to streamline the environment setup in VS Code.
## Prerequisites

- Docker Engine (Installation Guide): follow the installation steps for your Linux distribution, or use Docker Desktop for Windows.
- NVIDIA Container Toolkit (Installation Guide): follow the Installation and Configuration steps.
## Move Docker's default data directory (only if needed)

On my system, I have a lot of free space at `/home` but very little in Docker's default data directory. Run the following commands to make Docker store its data in a different directory.
- Shut down the Docker service:

  ```shell
  sudo systemctl stop docker docker.socket
  sudo systemctl status docker
  ```

- Move the data to the new path (if it's not already there) and point Docker at it:

  ```shell
  sudo mkdir -p /etc/docker
  sudo rsync -avxP /var/lib/docker/ /home/docker/
  echo '{ "data-root": "/home/docker" }' | sudo tee /etc/docker/daemon.json
  ```

- Restart the Docker service:

  ```shell
  sudo systemctl restart docker
  ```
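A malformed `daemon.json` will keep the Docker daemon from starting, so it can be worth sanity-checking the file before restarting. A small sketch, written against a temporary copy (the path and contents mirror the example above):

```shell
# Write the example daemon.json to a temp dir and check that it parses as JSON.
tmpdir=$(mktemp -d)
echo '{ "data-root": "/home/docker" }' > "$tmpdir/daemon.json"
python3 -m json.tool "$tmpdir/daemon.json" && echo "daemon.json is valid JSON"
```

Running `python3 -m json.tool` against the real `/etc/docker/daemon.json` works the same way.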
## Getting started

Use this template to initialize a new project on GitHub. Then run the following:
```shell
git clone <repo-url> your_new_project
cd your_new_project

# Modify requirements.txt as needed, then build the container:
make docker-build
```
Your environment is now ready, and you can get to work. Access the development environment in one of two ways:
- VS Code Dev Container: open the project in VS Code, install the Dev Containers extension, and select "Dev Containers: Reopen in Container" from the command palette (`Ctrl+Shift+P`) to work inside the Docker environment.
- Docker run command: dispatch scripts to run in the container using, e.g.:

  ```shell
  docker run --rm -v $(pwd):/workspace $(basename $(pwd)):latest bash -c "./scripts/train.sh"
  ```
├── configs # Configuration files for Hydra
│ ├── paths #
│ │ └── default.yaml # Default paths configuration
│ └── train.yaml # Training-specific configuration
├── data # Data storage directory
├── logs # Logs generated from experiments
├── models # Saved models
├── notebooks # Jupyter notebooks for research and experimentation
│ └── template.ipynb # Notebook template
├── scripts # Shell scripts for automation
│ ├── eval.sh # Evaluation script
│ └── train.sh # Training script
├── src # Source code for the project
│ └── train.py # Barebones train script
├── Dockerfile # Docker environment setup
├── Makefile # Makefile for automation (build, train, format, etc.)
├── pyproject.toml # Python project configuration
├── README.md # Project documentation
├── requirements.txt # List of required Python packages
└── setup.py # Python package setup
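The files under `configs/` are composed by Hydra at run time. As an illustration only (the keys below are hypothetical, not the template's actual contents), a minimal `configs/train.yaml` might look like:

```yaml
# Hypothetical sketch of configs/train.yaml; all keys are illustrative.
defaults:
  - paths: default   # pulls in configs/paths/default.yaml
  - _self_

model_name: bert-base-uncased   # any Hugging Face model id
epochs: 3
lr: 3.0e-5
batch_size: 16
```

Any key in the composed config can then be overridden on the command line, e.g. `python src/train.py lr=1e-5` (assuming an `lr` key exists).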
## Features

- Pre-configured PyTorch environment using Docker
- Hydra-based configuration for flexibility in experiment settings
- Pre-commit hooks for enforcing code quality
- Sensible file structure to facilitate development
- Automated setup with Makefile
## Usage

- Configurations: modify `configs/train.yaml` to adjust training settings.
- Logs & checkpoints: stored in `outputs/` and `models/`, respectively.
- Extensibility: add new scripts to `scripts/` or modify the `Makefile` for custom workflows.
## Tips

- Run `make format` to run the pre-commit hooks before committing your code.
- Update `requirements.txt` whenever you install a new package in the container.
- The `configs/` folder is just a template. Consider cloning it outside the code base for day-to-day experiments, then use the command-line flags `--config-path` (`-cp`) and `--config-name` (`-cn`) to direct Hydra to those external locations. See Hydra's article on command line flags for more details.
- Run `make help` for more commands.
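The external-configs workflow above can be sketched as follows; the experiment directory and config contents here are examples, not part of the template:

```shell
# Stage a personal copy of the configs outside the repo (example paths).
EXP_CONFIGS=$(mktemp -d)            # stand-in for e.g. ~/experiments/configs
mkdir -p "$EXP_CONFIGS/paths"
printf 'data_dir: /workspace/data\n' > "$EXP_CONFIGS/paths/default.yaml"
printf 'defaults:\n  - paths: default\n' > "$EXP_CONFIGS/train.yaml"

# Point Hydra at the external directory at run time:
echo "python src/train.py --config-path $EXP_CONFIGS --config-name train"
```

Edits made under `$EXP_CONFIGS` then never touch the version-controlled `configs/` template.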