| Backbone        | Test-dev | Test-std | url   | size  |
|-----------------|----------|----------|-------|-------|
| ResNet-101      | 62.48    | 61.99    | model | 3GB   |
| EfficientNet-B5 | 62.95    | 62.45    | model | 2.7GB |
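If you would rather work from a local copy of a checkpoint than stream it by url, you can download it first. The ResNet-101 url below is the same one used by the evaluation commands later in this section:

```bash
# Download the GQA-finetuned ResNet-101 checkpoint (~3GB).
wget https://zenodo.org/record/4721981/files/gqa_resnet101_checkpoint.pth
```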
The config for this dataset can be found in `configs/gqa.json` and is also shown below:
```json
{
    "combine_datasets": ["gqa"],
    "combine_datasets_val": ["gqa"],
    "vg_img_path": "",
    "gqa_ann_path": "mdetr_annotations/",
    "gqa_split_type": "balanced"
}
```
- Download the GQA images at GQA images and update `vg_img_path` to point to the folder containing the images.
- Download our pre-processed annotations, which are converted to COCO format (all datasets are present in the same zip folder for MDETR annotations): Pre-processed annotations. Update `gqa_ann_path` to point to the folder containing the pre-processed annotations. A filled-in example config is shown below.
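For example, if the images were extracted to /data/gqa/images and the annotations to /data/mdetr_annotations, the config would look like this (the two paths are illustrative; substitute your own download locations):

```json
{
    "combine_datasets": ["gqa"],
    "combine_datasets_val": ["gqa"],
    "vg_img_path": "/data/gqa/images",
    "gqa_ann_path": "/data/mdetr_annotations/",
    "gqa_split_type": "balanced"
}
```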
Model weights are linked in the table at the top of this section; they can also be loaded directly from the url, which is what `--resume` does in the commands below.
GQA has two types of splits: "all" and "balanced". Choose the one you are interested in by setting `gqa_split_type` in `configs/gqa.json`. To run evaluation:
```
python run_with_submitit.py --dataset_config configs/gqa.json --ngpus 1 --nodes 2 --ema --eval --do_qa --split_qa_heads --no_contrastive_align_loss --resume https://zenodo.org/record/4721981/files/gqa_resnet101_checkpoint.pth
```
To run on a single node with 2 gpus:

```
python -m torch.distributed.launch --nproc_per_node=2 --use_env main.py --dataset_config configs/gqa.json --ema --eval --do_qa --split_qa_heads --no_contrastive_align_loss --resume https://zenodo.org/record/4721981/files/gqa_resnet101_checkpoint.pth
```
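If you downloaded the checkpoint locally as shown earlier, `--resume` should also accept a filesystem path in place of the url; this follows the usual DETR-style checkpoint loading, so treat the exact behaviour as an assumption:

```bash
# Same evaluation as above, but resuming from a local checkpoint file.
python -m torch.distributed.launch --nproc_per_node=2 --use_env main.py --dataset_config configs/gqa.json --ema --eval --do_qa --split_qa_heads --no_contrastive_align_loss --resume gqa_resnet101_checkpoint.pth
```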
To run finetuning on the "all" split (this was run on 4 nodes of 8 gpus each, i.e. `--ngpus 8 --nodes 4` below, for an effective batch size of 128):
- Change `configs/gqa.json` to set `gqa_split_type` to "all" (a shell one-liner for this is sketched below).
```
python run_with_submitit.py --dataset_config configs/gqa.json --ngpus 8 --ema --epochs 125 --epoch_chunks 25 --do_qa --split_qa_heads --lr_drop 150 --load https://zenodo.org/record/4721981/files/pretrained_resnet101_checkpoint.pth --nodes 4 --batch_size 4 --no_aux_loss --qa_loss_coef 25 --lr 1.4e-4 --lr_backbone 1.4e-5 --text_encoder_lr 7e-5
```
To run on a single node with 8 gpus:

```
python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --dataset_config configs/gqa.json --ema --epochs 125 --epoch_chunks 25 --lr_drop 150 --do_qa --split_qa_heads --load https://zenodo.org/record/4721981/files/pretrained_resnet101_checkpoint.pth --no_aux_loss --qa_loss_coef 25
```
To generate a prediction file for upload to the GQA EvalAI server:
- Change `configs/gqa.json` to set `gqa_split_type` to "all".
- Set `--split` to either testdev or submission, depending on which prediction file you want to generate.
```
python run_with_submitit_gqa_eval.py --do_qa --eval --resume https://zenodo.org/record/4721981/files/gqa_resnet101_checkpoint.pth?download=1 --split_qa_heads --ngpus 1 --nodes 4 --ema --split testdev --dataset_config configs/gqa.json
```
- The resulting predictions will be saved in the experiment's output dir as `testdev_predictions.json` or `submission_predictions.json` respectively (a quick sanity check of this file is sketched at the end of this section).
You can also run this on just one node with 4 gpus:

```
python -m torch.distributed.launch --nproc_per_node=4 --use_env scripts/eval_gqa.py --do_qa --eval --resume https://zenodo.org/record/4721981/files/gqa_resnet101_checkpoint.pth?download=1 --split_qa_heads --ema --split testdev --dataset_config configs/gqa.json
```
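Before uploading, it can be worth sanity-checking the generated file. The sketch below assumes the predictions are stored as a JSON list of per-question entries, in line with the GQA challenge format of {"questionId", "prediction"} pairs, and that the output landed in a directory named output; adjust the path to your actual experiment output dir:

```bash
# Hypothetical output path; point this at your experiment's output dir.
PRED=output/testdev_predictions.json

jq 'length' "$PRED"   # number of answered questions
jq '.[0]' "$PRED"     # eyeball the first entry
```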