| Backbone        | Test-dev | Test-std | url   | size  |
|-----------------|----------|----------|-------|-------|
| ResNet-101      | 62.48    | 61.99    | model | 3GB   |
| EfficientNet-B5 | 62.95    | 62.45    | model | 2.7GB |
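If you would rather work from a local copy of a checkpoint than stream it by url, you can download it first. The ResNet-101 url below is the same one used by the evaluation commands later in this section:

```bash
# Download the GQA-finetuned ResNet-101 checkpoint (~3GB).
wget https://zenodo.org/record/4721981/files/gqa_resnet101_checkpoint.pth
```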
The config for this dataset can be found in `configs/gqa.json` and is also shown below:
```json
{
    "combine_datasets": ["gqa"],
    "combine_datasets_val": ["gqa"],
    "vg_img_path": "",
    "gqa_ann_path": "mdetr_annotations/",
    "gqa_split_type": "balanced"
}
```
- Download the GQA images at GQA images and update `vg_img_path` to point to the folder containing the images.
- Download our pre-processed annotations, which are converted to COCO format (all datasets are present in the same zip folder for MDETR annotations): Pre-processed annotations. Update `gqa_ann_path` to point to the folder containing the pre-processed annotations. A filled-in example config is shown below.
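For example, if the images were extracted to /data/gqa/images and the annotations to /data/mdetr_annotations, the config would look like this (the two paths are illustrative; substitute your own download locations):

```json
{
    "combine_datasets": ["gqa"],
    "combine_datasets_val": ["gqa"],
    "vg_img_path": "/data/gqa/images",
    "gqa_ann_path": "/data/mdetr_annotations/",
    "gqa_split_type": "balanced"
}
```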
Model weights are linked in the table at the top of this section; they can also be loaded directly from the url, which is what `--resume` does in the commands below.
GQA has two types of splits: "all" and "balanced". Choose the one you are interested in by setting `gqa_split_type` in `configs/gqa.json`. To run evaluation:
```
python run_with_submitit.py --dataset_config configs/gqa.json --ngpus 1 --nodes 2 --ema --eval --do_qa --split_qa_heads --no_contrastive_align_loss --resume https://zenodo.org/record/4721981/files/gqa_resnet101_checkpoint.pth
```
To run on a single node with 2 gpus:

```
python -m torch.distributed.launch --nproc_per_node=2 --use_env main.py --dataset_config configs/gqa.json --ema --eval --do_qa --split_qa_heads --no_contrastive_align_loss --resume https://zenodo.org/record/4721981/files/gqa_resnet101_checkpoint.pth
```
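If you downloaded the checkpoint locally as shown earlier, `--resume` should also accept a filesystem path in place of the url; this follows the usual DETR-style checkpoint loading, so treat the exact behaviour as an assumption:

```bash
# Same evaluation as above, but resuming from a local checkpoint file.
python -m torch.distributed.launch --nproc_per_node=2 --use_env main.py --dataset_config configs/gqa.json --ema --eval --do_qa --split_qa_heads --no_contrastive_align_loss --resume gqa_resnet101_checkpoint.pth
```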
To run finetuning on the "all" split (this was run on 4 nodes of 8 gpus each, i.e. `--ngpus 8 --nodes 4` below, for an effective batch size of 128):
- Change `configs/gqa.json` to set `gqa_split_type` to "all" (a shell one-liner for this is sketched below).
```
python run_with_submitit.py --dataset_config configs/gqa.json --ngpus 8 --ema --epochs 125 --epoch_chunks 25 --do_qa --split_qa_heads --lr_drop 150 --load https://zenodo.org/record/4721981/files/pretrained_resnet101_checkpoint.pth --nodes 4 --batch_size 4 --no_aux_loss --qa_loss_coef 25 --lr 1.4e-4 --lr_backbone 1.4e-5 --text_encoder_lr 7e-5
```
To run on a single node with 8 gpus:

```
python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --dataset_config configs/gqa.json --ema --epochs 125 --epoch_chunks 25 --lr_drop 150 --do_qa --split_qa_heads --load https://zenodo.org/record/4721981/files/pretrained_resnet101_checkpoint.pth --no_aux_loss --qa_loss_coef 25
```
To generate a prediction file for upload to the GQA EvalAI server:
- Change `configs/gqa.json` to set `gqa_split_type` to "all".
- Set `--split` to either testdev or submission, depending on which prediction file you want to generate.
```
python run_with_submitit_gqa_eval.py --do_qa --eval --resume https://zenodo.org/record/4721981/files/gqa_resnet101_checkpoint.pth?download=1 --split_qa_heads --ngpus 1 --nodes 4 --ema --split testdev --dataset_config configs/gqa.json
```
- The resulting predictions will be saved in the experiment's output dir as `testdev_predictions.json` or `submission_predictions.json` respectively (a quick sanity check of this file is sketched at the end of this section).
You can also run this on just one node with 4 gpus:

```
python -m torch.distributed.launch --nproc_per_node=4 --use_env scripts/eval_gqa.py --do_qa --eval --resume https://zenodo.org/record/4721981/files/gqa_resnet101_checkpoint.pth?download=1 --split_qa_heads --ema --split testdev --dataset_config configs/gqa.json
```
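Before uploading, it can be worth sanity-checking the generated file. The sketch below assumes the predictions are stored as a JSON list of per-question entries, in line with the GQA challenge format of {"questionId", "prediction"} pairs, and that the output landed in a directory named output; adjust the path to your actual experiment output dir:

```bash
# Hypothetical output path; point this at your experiment's output dir.
PRED=output/testdev_predictions.json

jq 'length' "$PRED"   # number of answered questions
jq '.[0]' "$PRED"     # eyeball the first entry
```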