SpARC: Sparse Activation Regularization for Consistency

This repository is part of a course project for Stanford CS 224N (Winter 2022), taught by Prof. Chris Manning.

Installation

Install the required packages

Use pip to install the following packages:

pip install -U adapter-transformers
pip install sentencepiece

Visualizing the model activations in notebooks/Visualizer.ipynb additionally requires the BertViz package.
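
BertViz can be installed with pip:

pip install bertviz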

Clone the repository

Clone this repository using:

git clone https://github.com/SConsul/SpARC.git
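
Then change into the project directory:

cd SpARC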

Usage

Data Preprocessing

The BeliefBank dataset is available at https://allenai.org/data/beliefbank. The scripts below assume it is extracted into ./beliefbank-data-sep2021/ in the repository root (the default path used during evaluation).

To parse the relation graphs into question-answer pairs and generate the train, val, and test splits, run:

python src/utils/beliefbank_preprocess.py

This generates JSON files with the required question-answer splits (qa_train, qa_val, and qa_test), along with a separate qa_consistency split used to test the consistency of the model.

Finetuning

To finetune the MACAW-large transformer on the BeliefBank dataset, use the following command:

python src/train.py 

Required arguments:

| Parameter | Default | Description |
| --- | --- | --- |
| --train_path | None | JSON file containing the training data. |
| --model_path | None | Location to save the model weights. |

Optional arguments:

| Parameter | Default | Description |
| --- | --- | --- |
| --max_epochs | 10 | Number of epochs of finetuning. |
| --batch_size | 64 | Batch size; can be varied depending on the available GPU memory. |
| --lr | 3e-4 | Initial learning rate for the Adam optimizer. |
| --lr_decay | False | Boolean flag indicating whether the learning rate should decay. |
| --weight_decay | 0.1 | L2 regularization penalty on the model weights. |
| --num_workers | 4 | Number of workers; can be varied depending on the available CPU memory. |
| --l1_reg | None | Float weighing the penalty on the L1 norm of activations. |
| --freeze_backbone | False | Finetune only the final linear layer of the model. |
| --adapter | True | Set to true to enable adapter layers for finetuning. |
| --layer_names | None | List of layers whose activations are to be regularized. |
| --sim | None | Float hyperparameter weighing the similarity loss. |
| --ce_loss | 1.0 | Float hyperparameter weighing the standard cross-entropy loss. |
| --token_type | None | One of 'answer', 'question', 'eos', or 'link'. |
| --sim_type | None | Set to 'angle' to measure activation similarity using angles instead of dot products. |
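
For example, a finetuning run with activation regularization might look like the following (the file paths and hyperparameter values are illustrative placeholders, not recommended settings):

python src/train.py --train_path beliefbank-data-sep2021/qa_train.json --model_path checkpoints/sparc_adapter.pt --l1_reg 0.01 --sim 0.5 --token_type answer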

Evaluation

Generate inferences from a saved model by running:

python src/inference.py

Required arguments:

| Parameter | Default | Description |
| --- | --- | --- |
| --in_path | ./beliefbank-data-sep2021/qa_test.json | Dataset split to evaluate on. |
| --out_path | None | Path to save the inferences. |
| --batch_size | 512 | Batch size; can be varied depending on the available GPU memory. |
| --model_path | None | Path to the saved model weights. |
| --adapter | None | Add this flag if the network was finetuned using adapters. |
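
For example (paths are illustrative placeholders):

python src/inference.py --in_path beliefbank-data-sep2021/qa_test.json --out_path results/qa_test_preds.json --model_path checkpoints/sparc_adapter.pt --adapter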

To obtain a model's accuracy, collect its inferences on the val/test set and run:

python src/utils/accuracy.py --results_path <path to inferences on val/test set>

To obtain the model's consistency, collect its inferences on qa_consistency and run:

python src/utils/consistency_v2.py --results_path <path to inferences on qa_consistency>
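
Putting the two steps together for the consistency split (file names are illustrative placeholders):

python src/inference.py --in_path beliefbank-data-sep2021/qa_consistency.json --out_path results/qa_consistency_preds.json --model_path checkpoints/sparc_adapter.pt --adapter
python src/utils/consistency_v2.py --results_path results/qa_consistency_preds.json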

Team Members: Julia Xu (@jwxu), Samar Khanna (@samar-khanna), Sarthak Consul (@SConsul)