
Running the Docker files

First of all, make sure Docker is installed :)

Change into the lick-caption-bias directory and run the following commands.

For LSTM

docker build -f ./Dockerfiles/Dockerfile_lstm -t lstm_fact . 
docker run -it lstm_fact /bin/bash
# In the container
cd lick-caption-bias

For BERT

docker build -f ./Dockerfiles/Dockerfile_bert -t bert_fact . 
docker run -it bert_fact /bin/bash
# In the container
cd lick-caption-bias

You should be in the container now. Go to the lick-caption-bias directory and run the scripts as described below.

Quantifying Societal Bias Amplification in Image Captioning

This repository contains source code necessary to reproduce the results presented in the paper Quantifying Societal Bias Amplification in Image Captioning (CVPR 2022, Oral). Please check the project website here.

LIC metric

The LIC metric measures how biased a set of model-generated captions is with respect to the captions in the training dataset. LIC is computed as follows:

  1. Mask attribute-revealing words.

  2. Train 2 classifiers, 1 for human captions and 1 for generated captions, with attribute labels. The classifiers' goal is to predict the attribute (e.g. gender, race) of the person in the image using only the captions.

  3. Calculate the LIC score for each classifier. If the set of captions is not biased, the classifier accuracy should be close to random chance.

  4. To compute bias amplification, take the difference of the LIC scores between the 2 classifiers (see the sketch after this list).
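Conceptually, steps 2-4 amount to training one attribute classifier per caption set and comparing their leakage scores. The sketch below illustrates this with a generic scikit-learn classifier and toy data; the function and variable names are illustrative assumptions, not the repository's code (the actual scripts train LSTM/BERT classifiers on the COCO bias data).

# Illustrative sketch of LIC (steps 2-4); not the repository's exact code.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def leakage(captions, labels, seed=0):
    # Train an attribute classifier on the captions; return its held-out accuracy.
    x_tr, x_te, y_tr, y_te = train_test_split(
        captions, labels, test_size=0.5, random_state=seed, stratify=labels)
    vec = TfidfVectorizer()
    clf = LogisticRegression(max_iter=1000)
    clf.fit(vec.fit_transform(x_tr), y_tr)
    return accuracy_score(y_te, clf.predict(vec.transform(x_te)))

# Step 1 is assumed done: attribute-revealing words are already masked.
human_caps = ["a <mask> riding a horse", "a <mask> cooking dinner",
              "a <mask> playing tennis", "a <mask> holding a baby"]
model_caps = ["a <mask> on a horse", "a <mask> in a kitchen",
              "a <mask> with a racket", "a <mask> with a child"]
labels = ["male", "female", "male", "female"]

lic_human = leakage(human_caps, labels)  # leakage of human captions (LIC_D)
lic_model = leakage(model_caps, labels)  # leakage of generated captions (LIC_M)

# Step 4: bias amplification is the gap between the two classifiers' scores.
print("LIC amplification:", lic_model - lic_human)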

Setup

  1. Clone the repository.
  2. Download the data (folder name: bias_data) and place it in the current directory. The folder contains human/generated captions and the corresponding gender/racial annotations from the paper Understanding and Evaluating Racial Biases in Image Captioning.
  3. Install dependencies:

For LSTM classifier

- Python 3.7
- numpy 1.21.2 
- pytorch 1.9
- torchtext 0.10.0 
- spacy 3.4.0 
- sklearn 1.0.2 
- nltk 3.6.3

For BERT classifier

- Python 3.7
- numpy 1.21.2 
- pytorch 1.4
- transformers 4.0.1
- spacy 2.3
- sklearn 1.0.2 
- nltk 3.6.3
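
To sanity-check the environment, the snippet below simply prints the installed versions for comparison with the lists above (a minimal check script; torchtext is only needed for the LSTM setup and transformers only for the BERT setup):

# Print installed versions to compare against the dependency lists above.
import numpy, torch, spacy, sklearn, nltk

print("numpy       ", numpy.__version__)
print("pytorch     ", torch.__version__)
print("spacy       ", spacy.__version__)
print("sklearn     ", sklearn.__version__)
print("nltk        ", nltk.__version__)

try:
    import torchtext  # LSTM setup only
    print("torchtext   ", torchtext.__version__)
except ImportError:
    pass
try:
    import transformers  # BERT setup only
    print("transformers", transformers.__version__)
except ImportError:
    pass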

Compute LIC

We evaluate various captioning models (i.e. NIC, SAT, FC, Att2in, UpDn, Transformer, OSCAR, NIC+, and NIC+Equalizer). In the following commands, set $model_name to one of them (i.e. nic, sat, fc, att2in, updn, transformer, oscar, nic_equalizer, or nic_plus).

In the paper, LSTM or BERT is used as the classifier. Please run the following commands according to the classifier you prefer to use.


  • To train the LSTM classifier on human captions and compute LIC in terms of gender bias run:

    python3 lstm_leakage.py --seed $int --cap_model $model_name --calc_ann_leak True

    Where $int is an arbitrary integer used as the random seed and $model_name is the captioning model to be evaluated.


  • To train the LSTM classifier on generated captions and compute LIC in terms of gender bias run:

    python3 lstm_leakage.py --seed $int --cap_model $model_name --calc_model_leak True

    Where $model_name is the captioning model to be evaluated.


  • To train the BERT classifier on human captions and compute LIC in terms of gender bias run:

    python3 bert_leakage.py --seed $int --cap_model $model_name --calc_ann_leak True


  • To train the BERT classifier on generated captions and compute LIC in terms of gender bias run:

    python3 bert_leakage.py --seed $int --cap_model $model_name --calc_model_leak True


Note: To compute LIC in terms of racial bias, run race_lstm_leakage.py or race_bert_leakage.py instead.

Note: To use pre-trained BERT without fine-tuning, you can add --freeze_bert True, --num_epochs 20, and --learning_rate 5e-5.
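
For example, to train a frozen BERT classifier on generated captions with these settings (the model name here is just an example):

    python3 bert_leakage.py --seed 0 --cap_model oscar --calc_model_leak True --freeze_bert True --num_epochs 20 --learning_rate 5e-5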

Results

Gender bias

Racial bias

Note: The classifier is trained 10 times with random initializations, and the results are reported as the mean and standard deviation over the 10 runs.
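
If you aggregate your own runs in the same way, the snippet below is a minimal sketch (the scores listed are hypothetical placeholders, not values from the paper):

import numpy as np

# Per-seed LIC scores collected from 10 runs (e.g. --seed 0 .. 9); replace
# these placeholder values with the scores printed by the leakage scripts.
lic_scores = [0.41, 0.43, 0.40, 0.42, 0.44, 0.41, 0.39, 0.42, 0.43, 0.40]

print("LIC: {:.3f} +/- {:.3f}".format(np.mean(lic_scores), np.std(lic_scores)))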

Citation

@inproceedings{hirota2022quantifying,
  title={Quantifying Societal Bias Amplification in Image Captioning},
  author={Hirota, Yusuke and Nakashima, Yuta and Garcia, Noa},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={13450--13459},
  year={2022}
 }
