🚨 WARNING: This package is still under development and NOT ready for production use! 🚨
The idea is that when you have a dataset with computation-hungry transformations, you can wrap it with the cached dataset to cache the transformed samples, either on disk or in memory, and thus avoid recomputing the transformations at every epoch of your training.
Depending on the context this can save a lot of time, at the cost of disk or memory usage.
The package supports multiprocessing and can therefore apply and cache your transformations as quickly as possible.
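As an illustration, the kind of dataset this is meant for might look like the hypothetical sketch below, where __getitem__ simulates an expensive transformation (the class name and the artificial delay are invented for the example):

import time

import torch
from torch.utils.data import Dataset

class SlowTransformDataset(Dataset):
    # NOTE: hypothetical example; __getitem__ stands in for a heavy
    # transformation (decoding, resizing, feature extraction, ...).
    def __init__(self, size=128):
        self.size = size

    def __len__(self):
        return self.size

    def __getitem__(self, index):
        time.sleep(0.1)
        return torch.randn(3, 224, 224), index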
pip install git+https://github.com/raideno/cached-dataset.git
from cached_dataset import DiskCachedDataset
# NOTE: your usual torch dataset, with the transforms whose output you want to cache.
dataset = ...
# NOTE: the directory where you want to cache your dataset.
CACHING_DIRECTORY = "./cached-dataset"
cached_dataset = DiskCachedDataset.load_dataset_or_cache_it(
    dataset=dataset,
    base_path=CACHING_DIRECTORY,
    verbose=True,
    num_workers=0
)
for i, sample in enumerate(cached_dataset):
    print(f"[sample-{i}]: {sample}")
Depending on your CPU / GPU power, you might set the num_workers parameter to something other than 0 in order to speed up the caching process.
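For example, on a machine with several spare CPU cores the call above could look like this (the worker count is an arbitrary illustration, not a recommendation):

cached_dataset = DiskCachedDataset.load_dataset_or_cache_it(
    dataset=dataset,
    base_path=CACHING_DIRECTORY,
    verbose=True,
    # NOTE: 4 is an arbitrary example value; tune it to your machine.
    num_workers=4
)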
Note: for now, the only available caching location is disk; in-memory caching isn't supported yet.