
VSE-HAL

Code release for HAL: Improved Text-Image Matching by Mitigating Visual Semantic Hubs [arxiv], published at AAAI 2020.

@inproceedings{liu2020hal,
  title={{HAL}: Improved text-image matching by mitigating visual semantic hubs},
  author={Liu, Fangyu and Ye, Rongtian and Wang, Xun and Li, Shuaipeng},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={34},
  number={07},
  pages={11563--11571},
  year={2020}
}

Upgrade your text-image matching model with a few lines of code:

class ContrastiveLoss(nn.Module):
    ...
    def forward(self, im, s, ...):
        bsize = im.size()[0]
        scores = self.sim(im, s)            # bsize x bsize image-caption similarity matrix
        ...
        # split the matching (diagonal) pairs from the mismatching ones
        tmp = torch.eye(bsize).cuda()
        s_diag = tmp * scores
        scores_ = scores - s_diag
        ...
        # local weighting: exponentially up-weight hard negatives (hubs)
        S_ = torch.exp(self.l_alpha * (scores_ - self.l_ep))
        loss_diag = -torch.log(1 + F.relu(s_diag.sum(0)))

        # aggregate over both retrieval directions (image-to-text and text-to-image)
        loss = torch.sum(
            torch.log(1 + S_.sum(0)) / self.l_alpha
            + torch.log(1 + S_.sum(1)) / self.l_alpha
            + loss_diag
        ) / bsize

        return loss
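
For reference, below is a minimal, self-contained sketch of how such a loss could be wired up and called on a batch of embeddings. The cosine_sim helper, the ContrastiveLossSketch class, the embedding dimensions, and the hyperparameter values are illustrative assumptions for this sketch, not the repository's exact API; see train.py and model.py for the actual implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

def cosine_sim(im, s):
    # illustrative similarity: cosine scores between all image/caption pairs
    im = F.normalize(im, dim=1)
    s = F.normalize(s, dim=1)
    return im @ s.t()                          # bsize x bsize

class ContrastiveLossSketch(nn.Module):
    # hypothetical, self-contained variant of the snippet above
    def __init__(self, l_alpha=30.0, l_ep=0.3):
        super().__init__()
        self.sim = cosine_sim
        self.l_alpha = l_alpha
        self.l_ep = l_ep

    def forward(self, im, s):
        bsize = im.size(0)
        scores = self.sim(im, s)
        tmp = torch.eye(bsize, device=scores.device)
        s_diag = tmp * scores
        scores_ = scores - s_diag
        S_ = torch.exp(self.l_alpha * (scores_ - self.l_ep))
        loss_diag = -torch.log(1 + F.relu(s_diag.sum(0)))
        loss = torch.sum(
            torch.log(1 + S_.sum(0)) / self.l_alpha
            + torch.log(1 + S_.sum(1)) / self.l_alpha
            + loss_diag
        ) / bsize
        return loss

# toy usage with random tensors standing in for encoder outputs
criterion = ContrastiveLossSketch()
im_emb = torch.randn(128, 1024, requires_grad=True)   # image embeddings
cap_emb = torch.randn(128, 1024, requires_grad=True)  # caption embeddings
loss = criterion(im_emb, cap_emb)
loss.backward()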

Dependencies

nltk==3.4.5
pycocotools==2.0.0
numpy==1.18.1
torch==1.5.1
torchvision==0.6.0
tensorboard_logger==0.1.0

Data

MS-COCO

[vgg_precomp]
[resnet_precomp]

Flickr30k

[vgg_precomp]

Train

Run train.py.

MS-COCO

Without global weighting:
python3 train.py \
	--data_path "data/data/resnet_precomp" \
	--vocab_path "data/vocab/" \
	--data_name coco_precomp \
	--batch_size 512 \
	--learning_rate 0.001 \
	--lr_update 8 \
	--num_epochs 13 \
	--img_dim 2048 \
	--logger_name runs/COCO \
	--local_alpha 30.00 \
	--local_ep 0.3

With global weighting:
python3 train.py \
	--data_path "data/data/resnet_precomp" \
	--vocab_path "data/vocab/" \
	--data_name coco_precomp \
	--batch_size 512 \
	--learning_rate 0.001 \
	--lr_update 8 \
	--num_epochs 13 \
	--img_dim 2048 \
	--logger_name runs/COCO_mb \
	--local_alpha 30.00 \
	--local_ep 0.3 \
	--memory_bank \
	--global_alpha 40.00 \
	--global_beta 40.00 \
	--global_ep_posi 0.20 \
	--global_ep_nega 0.10 \
 	--mb_rate 0.05 \
	--mb_k 250

Flickr30k

python3 train.py \
	--data_path "data/data" \
	--vocab_path "data/vocab/" \
	--data_name f30k_precomp \
	--batch_size 128 \
	--learning_rate 0.001 \
	--lr_update 8 \
	--num_epochs 13 \
	--logger_name runs/f30k \
	--local_alpha 60.00 \
	--local_ep 0.7

Evaluate

Run compute_results.py.

MS-COCO

python3 compute_results.py --data_path data/data/resnet_precomp --fold5 --model_path runs/COCO/model_best.pth.tar

Flickr30k

python3 compute_results.py --data_path data/data --model_path runs/f30k/model_best.pth.tar

Trained models

[Google Drive]

Note

Trained models and code for replicating the results on SCAN are coming soon.

Acknowledgments

This project would not have been possible without the open-source implementations of VSE++ and SCAN.

License

Apache License 2.0