This is an experimental project extending *Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP* (Schick et al., 2021). It tests the hypothesis that taking the output of Schick's de-biasing procedure as labels and fine-tuning the model on them directly leads to similar or reduced toxicity scores according to Perspective API.
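As a rough illustration of the idea only (not this repo's actual implementation; the script name `finetune_gpt2_with_logits` suggests the de-biased logits serve as soft labels), one fine-tuning step of this kind could look like the sketch below. `input_ids` and `debiased_logits` are hypothetical placeholders for tokenized training text and precomputed de-biased outputs:

```python
# Sketch only: distill GPT-2 toward precomputed de-biased next-token logits.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2-xl")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def distillation_step(input_ids, debiased_logits):
    """One update: pull the model's per-position next-token distribution
    toward the de-biased distribution via KL divergence."""
    log_probs = F.log_softmax(model(input_ids).logits, dim=-1)
    target_probs = F.softmax(debiased_logits, dim=-1)
    loss = F.kl_div(log_probs, target_probs, reduction="batchmean")
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```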
To train with your own data, put the data in `model-input/prompts+continuations/` following the corresponding format, then run `python3 ./finetune_gpt2_with_logits`.
```python
from transformers import AutoModel

# Available fine-tuned checkpoints on the Hugging Face Hub:
#   0 = fine-tuned on 1k examples
#   1 = fine-tuned on 5k examples
#   2 = fine-tuned on 10k examples
#   3 = fine-tuned on 25k examples
model_idx = 0  # choose from 0, 1, 2, or 3
model = AutoModel.from_pretrained(f"newtonkwan/gpt2-xl-ft-{model_idx}")
```
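For text generation, the same checkpoint can be loaded with `AutoModelForCausalLM`, which attaches the language-modeling head that `AutoModel` omits. A minimal usage sketch, assuming the repositories ship tokenizer files (otherwise the base `gpt2-xl` tokenizer should work):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_idx = 0
tokenizer = AutoTokenizer.from_pretrained(f"newtonkwan/gpt2-xl-ft-{model_idx}")
model = AutoModelForCausalLM.from_pretrained(f"newtonkwan/gpt2-xl-ft-{model_idx}")

inputs = tokenizer("The quick brown fox", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=20, do_sample=True)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```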
RealToxicityPrompts - A dataset of 100k sentence snippets from the web for researchers to further address the risk of neural toxic degeneration in models (Gehman et al., 2020)
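The dataset can be pulled from the Hugging Face Hub; the hub ID `allenai/real-toxicity-prompts` and the field layout below are assumptions, not something this repo pins down:

```python
from datasets import load_dataset

# Assumed hub ID; each record holds nested "prompt" and "continuation" dicts.
dataset = load_dataset("allenai/real-toxicity-prompts", split="train")
print(dataset[0]["prompt"]["text"])
```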