pcheras/debiasing_model

Debiasing Model

This is an experimental project that extends Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP. It tests the hypothesis that using the output of Schick's de-biasing procedure as labels and fine-tuning the model directly on them leads to similar or lower toxicity scores, as measured by the Perspective API.
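As a rough illustration of what "using the output of the de-biasing procedure as labels" could mean, the sketch below computes a soft-label loss for a single token position: the KL divergence between a target distribution (standing in for the de-biased output) and the distribution of the model being fine-tuned. This is an assumption about the general technique, not the repository's training code; the function names and the toy logits are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax(logits):
    """Log-probabilities computed via the log-sum-exp trick."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def soft_label_kl(target_logits, student_logits):
    """KL(target || student) over the vocabulary at one token position.

    target_logits: hypothetical logits produced by the de-biasing procedure.
    student_logits: logits of the model being fine-tuned.
    """
    p = softmax(target_logits)
    log_p = log_softmax(target_logits)
    log_q = log_softmax(student_logits)
    return sum(pi * (lp - lq) for pi, lp, lq in zip(p, log_p, log_q))


if __name__ == "__main__":
    debiased = [2.0, 0.5, -1.0]   # toy 3-token vocabulary, illustrative only
    student = [1.0, 1.0, 1.0]
    print(soft_label_kl(debiased, student))
```

The loss is zero when the fine-tuned model reproduces the target distribution exactly and grows as the two diverge, so minimizing it pushes the model toward the de-biased output without running the de-biasing procedure at inference time.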

Pipeline

[Figure: pipeline diagram]

🩹 How to train with your own data

To train with your own data, place it in model-input/prompts+continuations/, following the format expected there, then run python3 ./finetune_gpt2_with_logits

Using our model

from transformers import AutoModel

# Checkpoint index -> size of the fine-tuning set:
#   0 = fine-tuned on 1k examples
#   1 = fine-tuned on 5k examples
#   2 = fine-tuned on 10k examples
#   3 = fine-tuned on 25k examples
model_idx = 0  # one of 0, 1, 2, 3
model = AutoModel.from_pretrained(f"newtonkwan/gpt2-xl-ft-{model_idx}")
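AutoModel loads the network without a language-modeling head, so it returns hidden states rather than text. To generate continuations, something like the sketch below may be more convenient. It assumes the checkpoints load with AutoModelForCausalLM and ship their own tokenizer files; if they do not, the stock gpt2-xl tokenizer would be the natural fallback. The helper names checkpoint_name and generate_continuation are hypothetical, not part of this repository.

```python
# Index -> number of fine-tuning examples, as listed above.
EXAMPLES_PER_CHECKPOINT = {0: 1_000, 1: 5_000, 2: 10_000, 3: 25_000}


def checkpoint_name(model_idx):
    """Build the Hugging Face Hub name for one of the fine-tuned checkpoints."""
    if model_idx not in EXAMPLES_PER_CHECKPOINT:
        raise ValueError(f"model_idx must be one of {sorted(EXAMPLES_PER_CHECKPOINT)}")
    return f"newtonkwan/gpt2-xl-ft-{model_idx}"


def generate_continuation(prompt, model_idx=0, max_new_tokens=20):
    """Greedy-decode a continuation of `prompt` (downloads the checkpoint)."""
    # Imported lazily so checkpoint_name() works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = checkpoint_name(model_idx)
    # Assumption: tokenizer files are published alongside the model weights.
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,                      # greedy decoding for repeatability
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 defines no pad token
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate_continuation("The weather today is", model_idx=0))
```

Greedy decoding is used here only to keep the output repeatable; sampling settings for toxicity evaluation are a separate choice that this sketch does not try to reproduce.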

Datasets

RealToxicityPrompts - A dataset of 100k sentence snippets from the web, released so that researchers can further address the risk of neural toxic degeneration in language models (Gehman et al., 2020)
