Synthetic Dataset Generation with Few-shot Guidance

This repository contains the codebase of a series of projects on synthetic dataset generation with few-shot guidance.

LoFT: LoRA-fused Dataset Generation with Few-shot Guidance, arXiv.
DataDream: Few-shot Guided Dataset Generation, in ECCV, 2024.

Preliminary Setup

We use Stable-Diffusion-2-1-base as the base diffusion model.

The few-shot real data should be organized as follows. Each data file should be located at PATH_TO_REAL_FEWSHOT/$DATASET/shot${N_SHOT}_seed${FEWSHOT_SEED}/$CLASS_NAME/$FILE. The list of $CLASS_NAME values for each $DATASET can be found in sd-finetune/util.py. For instance, in a 16-shot setting, files should be laid out as follows:

📂 data
|_📂 real_train_fewshot
  |_📂 imagenet
    |_📂 shot16_seed0
      |_📂 abacus
        |_📄 n02666196_17944.JPEG
        |_📄 n02666196_10754.JPEG
        |_📄 n02666196_10341.JPEG
        ...
        |_📄 n02666196_16649.JPEG
      |_📂 clothes iron
      |_📂 great white shark
      |_📂 goldfish
      |_📂 tench
      ...
  |_📂 eurosat
    |_📂 shot16_seed0
      |_📂 AnnualCrop
      |_📂 Forest
      ...
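
Before fine-tuning, it can help to confirm that a few-shot split matches the layout above. The following is a minimal sketch, not part of this repository: the helper names `fewshot_dir`, `count_images_per_class`, and `find_incomplete_classes` are hypothetical, and the sketch assumes that every class folder should hold exactly N_SHOT files.

```python
from pathlib import Path


def fewshot_dir(root, dataset, n_shot, seed):
    """Build <root>/<dataset>/shot<N_SHOT>_seed<FEWSHOT_SEED> (hypothetical helper)."""
    return Path(root) / dataset / f"shot{n_shot}_seed{seed}"


def count_images_per_class(root, dataset, n_shot, seed):
    """Return {class_name: number_of_files} for one few-shot split."""
    split_dir = fewshot_dir(root, dataset, n_shot, seed)
    if not split_dir.is_dir():
        raise FileNotFoundError(f"Missing few-shot directory: {split_dir}")
    counts = {}
    # Each immediate subdirectory is treated as one class folder.
    for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(1 for f in class_dir.iterdir() if f.is_file())
    return counts


def find_incomplete_classes(root, dataset, n_shot, seed):
    """Return the classes whose file count differs from n_shot (assumed rule)."""
    counts = count_images_per_class(root, dataset, n_shot, seed)
    return {name: n for name, n in counts.items() if n != n_shot}
```

For the example tree, `find_incomplete_classes("data/real_train_fewshot", "imagenet", 16, 0)` would return an empty dict when every class folder contains 16 files. Class names with spaces, such as "clothes iron", need no special handling because they are used as plain directory names.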

Steps

You can run LoFT, DataDream-class, and DataDream-dataset methods by following the process below.

Install the dependencies listed in requirements.txt.
Fine-tune the diffusion model: Follow the instructions in the sd-finetune folder.
Generate the dataset: Follow the instructions in the generation folder.
Train the classification model with synthetic data: Follow the instructions in the classification folder.

Citation

If you use this code in your research, please cite the following papers:

@article{kim2025loft,
TBD
}

@article{kim2024datadream,
  title={DataDream: Few-shot Guided Dataset Generation},
  author={Kim, Jae Myung and Bader, Jessica and Alaniz, Stephan and Schmid, Cordelia and Akata, Zeynep},
  journal={arXiv preprint arXiv:2407.10910},
  year={2024}
}
