Shweta Singh†, Aayan Yadav†, Jitesh Jain, Humphrey Shi, Justin Johnson, Karan Desai
† Equal Contribution
[arxiv] [Dataset Website]
Introducing COCO-ReM, a set of high-quality instance annotations for COCO images. COCO-ReM addresses imperfections prevalent in COCO-2017, such as coarse mask boundaries, non-exhaustive annotations, inconsistent handling of occlusions, and duplicate masks. Masks in COCO-ReM are of visibly better quality than those in COCO-2017, as shown below.
- News
- Setup Instructions
- Download COCO-ReM
- Mask Visualization
- Evaluation using COCO-ReM
- Training with COCO-ReM
- Annotation Pipeline
- Citation
- [July 7, 2024]: Dataset now available on HuggingFace and code is public!
- [July 1, 2024]: COCO-ReM is accepted to ECCV 2024!
- [March 27, 2024]: Dataset website and arXiv preprint are public!
Clone the repository, create a conda environment, and install all dependencies as follows:
git clone https://github.com/kdexd/coco-rem.git && cd coco-rem
conda create -n coco_rem python=3.10
conda activate coco_rem
Install PyTorch and torchvision following the instructions on pytorch.org.
Install Detectron2 by following its official installation instructions.
Then, install the remaining dependencies:
pip install -r requirements.txt
pip install git+https://github.com/facebookresearch/segment-anything.git
pip install git+https://github.com/bowenc0221/boundary-iou-api.git
python setup.py develop
COCO-ReM is hosted on Hugging Face Datasets at @kdexd/coco-rem. Download the annotation files:
for name in trainrem valrem; do
wget https://huggingface.co/datasets/kdexd/coco-rem/resolve/main/instances_$name.json.zip
unzip instances_$name.json.zip
done
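For environments without wget or unzip, the same download can be sketched in Python using only the standard library. The helper names `annotation_url` and `download_annotations`, and the default output directory, are illustrative assumptions and not part of the repository.

```python
import urllib.request
import zipfile
from pathlib import Path

# Base URL taken from the wget command above.
BASE_URL = "https://huggingface.co/datasets/kdexd/coco-rem/resolve/main"


def annotation_url(name: str) -> str:
    """Build the download URL for one split name, e.g. 'trainrem' or 'valrem'."""
    return f"{BASE_URL}/instances_{name}.json.zip"


def download_annotations(output_dir: str = "datasets/coco_rem",
                         names=("trainrem", "valrem")) -> None:
    """Download and extract each annotation archive into output_dir.

    The output directory is an assumption chosen to match the dataset
    layout described in this README.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name in names:
        zip_path = out / f"instances_{name}.json.zip"
        urllib.request.urlretrieve(annotation_url(name), zip_path)
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(out)
```

Calling `download_annotations()` from the project root would fetch both splits; nothing is downloaded when the module is merely imported.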
Dataset organization: COCO, COCO-ReM, and LVIS must be organized inside the datasets directory as follows.
$PROJECT_ROOT/datasets
    - coco/
        - train2017/         # Contains 118287 train images (.jpg files).
        - val2017/           # Contains 5000 val images (.jpg files).
        - annotations/
            - instances_train2017.json
            - instances_val2017.json
    - coco_rem/
        - instances_trainrem.json
        - instances_valrem.json
    - lvis/
        - lvis_v1_val.json
        - lvis_v1_train.json
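Before running any script, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the function name `missing_paths` is hypothetical, and the path list simply mirrors the tree shown above.

```python
from pathlib import Path

# Paths relative to the datasets directory, copied from the layout above.
EXPECTED_PATHS = [
    "coco/train2017",
    "coco/val2017",
    "coco/annotations/instances_train2017.json",
    "coco/annotations/instances_val2017.json",
    "coco_rem/instances_trainrem.json",
    "coco_rem/instances_valrem.json",
    "lvis/lvis_v1_val.json",
    "lvis/lvis_v1_train.json",
]


def missing_paths(datasets_root: str = "datasets") -> list:
    """Return the expected files and directories that do not exist yet."""
    root = Path(datasets_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]
```

An empty list from `missing_paths("datasets")` indicates that every expected entry was found.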
We include a lightweight script to quickly visualize masks of COCO-ReM and COCO-2017, both validation and training sets. For example, run the following command to visualize the masks for COCO-ReM validation set:
python scripts/visualize_coco.py \
--input-json datasets/coco_rem/instances_valrem.json \
--image-dir datasets/coco/val2017 \
--output visualization_output
Read the documentation (python scripts/visualize_coco.py --help) for details about other arguments.
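Alongside the visual check, a quick numeric summary of an annotation file can be useful. The sketch below assumes the COCO-ReM files follow the standard COCO instances JSON layout with top-level "images", "annotations", and "categories" keys; the function names are hypothetical and not part of the repository.

```python
import json
from collections import Counter


def summarize_annotations(coco_dict: dict) -> dict:
    """Count images, instances, and instances per category name.

    Assumes the standard COCO instances layout (an assumption, see above).
    """
    id_to_name = {c["id"]: c["name"] for c in coco_dict["categories"]}
    per_category = Counter(
        id_to_name[a["category_id"]] for a in coco_dict["annotations"]
    )
    return {
        "images": len(coco_dict["images"]),
        "instances": len(coco_dict["annotations"]),
        "per_category": dict(per_category),
    }


def summarize_file(json_path: str) -> dict:
    """Load an annotation file from disk and summarize it."""
    with open(json_path) as f:
        return summarize_annotations(json.load(f))
```

For example, `summarize_file("datasets/coco_rem/instances_valrem.json")` would report the counts for the validation set, which can then be compared against the corresponding COCO-2017 file.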
We support evaluation of all fifty object detectors available in the paper.
First, run python checkpoints/download.py to download all the pre-trained models from their official repositories and save them in checkpoints/pretrained_weights.
For example, to evaluate a Mask R-CNN ViTDet-B model using 8 GPUs and calculate average precision (AP) metrics, run the following command:
python scripts/train_net.py --num-gpus 8 --eval-only \
--config coco_rem/configs/vitdet/mask_rcnn_vitdet_b_100ep.py \
train.init_checkpoint=checkpoints/pretrained_weights/vitdet/mask_rcnn_vitdet_b_100ep.pkl \
dataloader.test.dataset.names=coco_rem_val \
train.output_dir=evaluation_results
We also support training ViTDet baselines on COCO-ReM using the Detectron2 library. Run the following command to train using 8 GPUs (with at least 32GB memory):
python scripts/train_net.py --num-gpus 8 \
--config coco_rem/configs/vitdet/mask_rcnn_vitdet_b_100ep.py \
dataloader.train.dataset.names=coco_rem_train \
dataloader.test.dataset.names=coco_rem_val \
train.output_dir=training_output \
dataloader.train.total_batch_size=16 train.grad_accum_steps=4
For GPUs with less memory, update the parameters in the last line above: halve the batch size and double the gradient accumulation steps to obtain the same results.
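The arithmetic behind this trade-off can be sketched as follows, assuming the number of images contributing to each optimizer step is the product of `total_batch_size` and `grad_accum_steps`. The helper names are hypothetical and only illustrate the bookkeeping; they are not part of the training script.

```python
def effective_batch_size(total_batch_size: int, grad_accum_steps: int) -> int:
    """Images per optimizer step, assuming it is the product of the two."""
    return total_batch_size * grad_accum_steps


def halve_batch(total_batch_size: int, grad_accum_steps: int) -> tuple:
    """Halve the batch size and double the accumulation steps."""
    if total_batch_size % 2 != 0:
        raise ValueError("total_batch_size must be even to be halved")
    return total_batch_size // 2, grad_accum_steps * 2


# Settings from the command above: 16 * 4 = 64 images per optimizer step.
default = (16, 4)
# Lower-memory alternative: 8 * 8 = 64, so the product is unchanged.
low_memory = halve_batch(*default)
```

Under this assumption, the lower-memory setting corresponds to `dataloader.train.total_batch_size=8 train.grad_accum_steps=8`.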
Download the SAM checkpoint from the segment-anything repository and place it in the checkpoint folder.
Run the following command to refine the boundaries of validation set masks using 8 GPUs:
python scripts/refine_boundaries.py \
--input-json datasets/coco/annotations/instances_val2017.json \
--image-dir datasets/coco/val2017 \
--num-gpus 8 \
--output datasets/intermediate/cocoval_boundary_refined.json
Read the documentation (python scripts/refine_boundaries.py --help) for details about other arguments.
Use the default values for the other optional arguments to follow the strategy used in the paper.
Run this stage for both the COCO and LVIS datasets before the merging stage.
Run the following command to merge LVIS annotations into the COCO validation set using the strategy described in the paper:
python scripts/merge_instances.py \
--coco-json datasets/intermediate/cocoval_boundary_refined.json \
--lvis-json datasets/intermediate/lvistrain_boundary_refined.json datasets/intermediate/lvisval_boundary_refined.json \
--split val \
--output datasets/intermediate/cocoval_lvis_merged.json
Read the documentation (python scripts/merge_instances.py --help) for details about the above arguments.
Handpicked (image, category) non-exhaustive instances from LVIS are merged into the validation set by the script of the next stage.
This stage is done only for the validation set.
python scripts/correct_labeling_errors.py \
--input datasets/intermediate/cocoval_lvis_merged.json \
--output datasets/cocoval_refined.json
Note: For the above JSON to be COCO-ReM, the manual parts of Stage 1 and Stage 2 must also be performed.
If you found COCO-ReM useful in your research, please consider starring ⭐ us on GitHub and citing 📚 us in your research!
@inproceedings{cocorem,
title={Benchmarking Object Detectors with COCO: A New Path Forward},
author={Singh, Shweta and Yadav, Aayan and Jain, Jitesh and Shi, Humphrey and Johnson, Justin and Desai, Karan},
booktitle={ECCV},
year={2024}
}