Our paper StageInteractor: Query-based Object Detector with Cross-stage Interaction has been accepted by ICCV 2023.
Please refer to get_started.md for installation.
We also provide the requirements here:
conda create -n openmmdet python=3.7
conda activate openmmdet
conda install pytorch==1.10.0 cudatoolkit=11.3 -c pytorch
pip install openmim
mim install mmcv-full==1.3.3
pip install torchvision==0.11.1
pip install setuptools==59.5.0
pip install -e .
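After installation, it can help to confirm that the environment matches the pinned versions. The following is a minimal sketch, not part of the repository: the helper name `check_versions` is hypothetical, and it assumes each package exposes a `__version__` attribute on its importable module (`mmcv-full` is imported as `mmcv`).

```python
import importlib

# Importable module name -> version pinned by the install commands above.
# Note: the pip package "mmcv-full" is imported as "mmcv".
PINNED = {
    "torch": "1.10.0",
    "torchvision": "0.11.1",
    "mmcv": "1.3.3",
    "setuptools": "59.5.0",
}


def check_versions(pinned=PINNED):
    """Return {module: (expected, installed_or_None, matches)}."""
    report = {}
    for module_name, expected in pinned.items():
        try:
            module = importlib.import_module(module_name)
            installed = str(getattr(module, "__version__", None))
        except ImportError:
            installed = None
        # Builds may carry a local suffix such as "+cu113"; compare the base version only.
        ok = installed is not None and installed.split("+")[0] == expected
        report[module_name] = (expected, installed, ok)
    return report


if __name__ == "__main__":
    for name, (expected, installed, ok) in check_versions().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: expected {expected}, installed {installed} [{status}]")
```

A mismatch is not necessarily fatal, but it is a reasonable first thing to check if building or importing the package fails.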
Our code is mainly based on: AdaMixer and MMDetection.
Please see get_started.md for the basic usage of MMDetection. We provide a Colab tutorial and full guidance for a quick run with an existing dataset and with a new dataset for beginners. There are also tutorials for finetuning models, adding a new dataset, designing a data pipeline, customizing models, customizing runtime settings, and useful tools.
For frequently asked questions, you can refer to the issues of AdaMixer and the FAQ.
Here is an example of training our model with a resnext101_32x4d backbone:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 --master_port=10020 tools/train.py ./configs/stageinteractor/stageinteractor_dx101_300_query_crop_mstrain_480-800_3x_coco.py --launcher pytorch
Here is an example of training our model with a Swin-S backbone:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 --master_port=10021 tools/train.py ./configs/stageinteractor/stageinteractor_swin_s_300_query_crop_mstrain_480-800_3x_coco.py --launcher pytorch
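The two training commands above assume 8 GPUs. As a rough sketch of how the same command could be assembled for a different set of GPUs, the hypothetical helper `build_train_command` below (not part of the repository) keeps `CUDA_VISIBLE_DEVICES` and `--nproc_per_node` consistent with each other. Because MMDetection sets the batch size per GPU in the config, the effective batch size changes with the number of processes, so runs with fewer GPUs may not reproduce the reported numbers without adjusting the config.

```python
import shlex


def build_train_command(config, gpu_ids, master_port=10020):
    """Assemble the torch.distributed.launch training command for the given GPUs."""
    gpu_ids = list(gpu_ids)
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    visible = ",".join(str(g) for g in gpu_ids)
    args = [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={len(gpu_ids)}",  # one process per visible GPU
        f"--master_port={master_port}",
        "tools/train.py", config,
        "--launcher", "pytorch",
    ]
    return f"CUDA_VISIBLE_DEVICES={visible} " + " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    cfg = "./configs/stageinteractor/stageinteractor_swin_s_300_query_crop_mstrain_480-800_3x_coco.py"
    # Example: train on the first four GPUs only.
    print(build_train_command(cfg, gpu_ids=[0, 1, 2, 3], master_port=10021))
```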
Here is an example of evaluating a trained model with a resnext101_32x4d backbone:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 --master_port=10025 tools/test.py ./configs/stageinteractor/stageinteractor_dx101_300_query_crop_mstrain_480-800_3x_coco.py ./work_dirs/stageinteractor_dx101_300_query_crop_mstrain_480-800_3x_coco_0725_1348/epoch_36.pth --launcher pytorch --out ./work_dirs/stageinteractor_dx101_300_query_crop_mstrain_480-800_3x_coco_0725_1348/res.pkl --eval bbox
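The `--out` option in the command above writes the raw detections to a pickle file. A minimal sketch for inspecting that file follows. It assumes the usual MMDetection 2.x layout for box-only detectors (a list with one entry per image, each entry a list with one array per class, each row being `x1, y1, x2, y2, score`); the helper name `summarize_results` and the threshold of 0.3 are illustrative choices, not values from the paper.

```python
import pickle


def summarize_results(pkl_path, score_thr=0.3):
    """Count detections with score >= score_thr per class index across all images."""
    with open(pkl_path, "rb") as f:
        results = pickle.load(f)
    counts = {}
    for per_image in results:
        # Assumed layout: one array per class, each row = (x1, y1, x2, y2, score).
        for class_idx, dets in enumerate(per_image):
            kept = sum(1 for row in dets if row[4] >= score_thr)
            counts[class_idx] = counts.get(class_idx, 0) + kept
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python summarize.py path/to/res.pkl
    for class_idx, n in sorted(summarize_results(sys.argv[1]).items()):
        print(f"class {class_idx}: {n} detections")
```

If the file was produced by a model that also predicts masks, the per-image entries have a different structure and this sketch would need to be adapted.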
Here is an example of evaluating a trained model with a Swin-S backbone:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 --master_port=10025 tools/test.py ./configs/stageinteractor/stageinteractor_swin_s_300_query_crop_mstrain_480-800_3x_coco.py ./work_dirs/stageinteractor_swin_s_300_query_crop_mstrain_480-800_3x_coco/epoch_36.pth --launcher pytorch --eval bbox
Checkpoints and logs are available on Google Drive.
config | detector | backbone | AP (val) | AP (test) |
---|---|---|---|---|
config | StageInteractor (3x schedule, 300 queries) | X101-DCN | 51.3 | 51.3 |
config | StageInteractor (3x schedule, 300 queries) | Swin-S | 52.7 | 52.7 |
If you use this toolbox or benchmark in your research, please cite this project.
@InProceedings{Teng_2023_ICCV,
author = {Teng, Yao and Liu, Haisong and Guo, Sheng and Wang, Limin},
title = {StageInteractor: Query-based Object Detector with Cross-stage Interaction},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2023},
pages = {6577-6588}
}