
MMBS (Towards Robust Visual Question Answering: Making the Most of Biased Samples via Contrastive Learning)

Here is the implementation of our Findings of EMNLP-2022 paper Towards Robust Visual Question Answering: Making the Most of Biased Samples via Contrastive Learning.

This repository contains code modified from here for SAR+MMBS and here for SAR+LMH, many thanks!

Qualitative comparison of our method LMH+MMBS against the plain method UpDn and the debiasing method LMH. In VQA-CP v2 (upper), the question types (‘Does the’ and ‘How many’) bias UpDn toward the most common answers (see Fig. 5 for the answer distribution). LMH alleviates the language priors for yes/no questions (upper left), while it fails on the more difficult non-yes/no questions (upper right). Besides, LMH damages the ID performance, giving an uncommon answer to the common sample from VQA v2 (lower right). MMBS improves the OOD performance while maintaining the ID performance (lower right).


Overview of our method. The question category words are highlighted in yellow. The orange circle and blue triangle denote the cross-modality representations of the original sample and the positive sample. The other samples in the same batch are the negative samples, which are denoted by the gray circles.
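As a rough illustration of the batch-wise contrastive objective described above (not the exact implementation in this repository), the cross-modality representations of the original sample and its positive sample can be pulled together with an InfoNCE-style loss, with the other samples in the batch serving as negatives; the temperature value below is an illustrative placeholder:

import torch
import torch.nn.functional as F

def contrastive_loss(orig_repr, pos_repr, temperature=0.1):
    """orig_repr / pos_repr: [batch, dim] cross-modality representations of
    the original samples and their positive samples."""
    orig = F.normalize(orig_repr, dim=-1)
    pos = F.normalize(pos_repr, dim=-1)
    # Cosine similarity of every original sample to every positive sample.
    logits = orig @ pos.t() / temperature  # [batch, batch]
    # The matching positive lies on the diagonal; all other entries in the
    # batch act as negative samples.
    targets = torch.arange(orig.size(0), device=orig.device)
    return F.cross_entropy(logits, targets)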

Download and preprocess the data

The data preprocessing code follows that of https://github.com/CrossmodalGroup/SSL-VQA.

cd data 
bash download.sh
python preprocess_image.py --data trainval
python create_dictionary.py --dataroot vqacp2/
python preprocess_text.py --dataroot vqacp2/ --version v2
cd ..

Requirements

  • python 3.7.6
  • pytorch 1.5.0
  • zarr
  • tqdm
  • spacy
  • h5py

The code of LXMERT-MMBS will be released soon.

Reference

If you find this code useful, please cite the following paper:

@article{Si2022TowardsRV,
  title={Towards Robust Visual Question Answering: Making the Most of Biased Samples via Contrastive Learning},
  author={Qingyi Si and Yuanxin Liu and Fandong Meng and Zheng Lin and Peng Fu and Yanan Cao and Weiping Wang and Jie Zhou},
  journal={ArXiv},
  year={2022},
  volume={abs/2210.04563}
}
