This repo is the official implementation of our IJCV paper: Logit Normalization for Long-Tail Object Detection, published on 08 January 2024.
Please refer to Springer and arXiv for more details about our paper!
Real-world data with skewed distributions poses a serious challenge to existing object detectors.
By attaching the LogitNormHead implemented below to calibrate the predicted logit vector, the long-tail bias is greatly alleviated!
In general, our LogN is training- and tuning-free (i.e., it requires no extra training or tuning process), model- and label-distribution-agnostic (i.e., it generalizes across different kinds of detectors and datasets), and plug-and-play (i.e., it can be applied directly, without any bells and whistles).
```python
import torch
import torch.nn as nn
from mmdet.models.roi_heads.bbox_heads import Shared2FCBBoxHead


class LogitNormHead(Shared2FCBBoxHead):

    def __init__(self, momentum=1e-4, *args, **kwargs):
        super(LogitNormHead, self).__init__(*args, **kwargs)
        # Affine-free BatchNorm1d is used purely to track the running
        # mean/variance of the class logits (num_classes + 1 for background).
        self.bn = nn.BatchNorm1d(self.num_classes + 1, eps=1e-05,
                                 momentum=momentum, affine=False)

    def get_statistics(self):
        mean_val = self.bn.running_mean
        mean_val[-1] = 0  # leave the background logit unshifted
        std_val = torch.sqrt(torch.clamp(self.bn.running_var, min=1e-11))
        std_val[-1] = 1  # leave the background logit unscaled
        # A shared offset keeps foreground logits comparable to background:
        # every foreground class is shifted by the smallest foreground mean.
        beta = torch.zeros_like(mean_val)
        beta[:-1] = mean_val[:-1].min()
        return mean_val.view(1, -1), std_val.view(1, -1), beta.view(1, -1)

    def forward(self, x):
        cls_score, bbox_pred = super(LogitNormHead, self).forward(x)
        if self.training:
            # Training/finetuning: let BatchNorm accumulate logit statistics.
            cls_score = self.bn(cls_score)
            return cls_score, bbox_pred
        else:
            # Inference: calibrate the logits with the aggregated statistics.
            mean_val, std_val, beta = self.get_statistics()
            cls_score = (cls_score - (mean_val - beta)) / std_val
            return cls_score, bbox_pred
```
Please refer to models/roi_heads/bbox_heads/logit_norm_head.py for more details!
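If the head is registered in mmdetection's model registry, switching an existing detector to LogN should only require pointing the config at it. The fragment below is a minimal sketch under that assumption; the registered name `LogitNormHead` and the surrounding field values mirror common Faster R-CNN configs for LVIS rather than this repo's exact files.

```python
# Hypothetical mmdetection-style config fragment (not verbatim repo code).
model = dict(
    roi_head=dict(
        bbox_head=dict(
            type='LogitNormHead',  # replaces the default 'Shared2FCBBoxHead'
            momentum=1e-4,         # EMA momentum of the statistics BatchNorm
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=1203,      # LVIS v1 has 1203 categories
        )
    )
)
```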
Please refer to get_started.md for installation.
For images
LVIS uses the same images as COCO, so you need to download the COCO dataset to a folder ($COCO) and link the train, val, and test image folders under the lvis folder ($LVIS).
```shell
mkdir -p data/lvis
ln -s $COCO/train2017 $LVIS
ln -s $COCO/val2017 $LVIS
ln -s $COCO/test2017 $LVIS
```
For annotations
Download the annotations from the LVIS website:
```shell
cd $LVIS
mkdir annotations
```
then place the annotations in that folder ($LVIS/annotations).
Finally, you will have a file structure like below:
```
data
├── lvis
│   ├── annotations
│   │   ├── lvis_v1_val.json
│   │   ├── lvis_v1_train.json
│   ├── train2017
│   │   ├── 000000004134.jpg
│   │   ├── 000000031817.jpg
│   │   ├── ......
│   ├── val2017
│   ├── test2017
```
For the API
The official lvis-api and mmlvis can lead to some multiprocessing bugs (see the issue), so you can install the LVIS API from my modified repo instead:
```shell
pip install git+https://github.com/tztztztztz/lvis-api.git
```
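As a quick smoke test (a sketch assuming the annotation files already sit under data/lvis/annotations as laid out above), you can load the validation set with the installed API:

```python
# Minimal check that the LVIS API installed correctly and can read the
# annotations; the path below assumes the directory layout described above.
from lvis import LVIS

lvis = LVIS('data/lvis/annotations/lvis_v1_val.json')
print(f'{len(lvis.get_img_ids())} images, {len(lvis.get_cat_ids())} categories')
```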
- Download the pretrained models from the URLs below. You can also train the baselines from scratch; please refer to scripts/baseline.sh.
- Please refer to scripts/logn.sh. Specifically, for logit-normalized calibration, we now use an online approach: finetune the pretrained model for 1 epoch with the LogitNormHead attached to aggregate statistics, and then directly perform the calibrated evaluation.
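For intuition, here is a toy, self-contained illustration of the inference-time standardization performed in forward(); the statistics are made-up numbers, not values aggregated from a real model:

```python
import torch

# Toy numbers: 4 foreground classes + 1 background logit (last index).
logits = torch.tensor([[3.0, 0.2, -1.5, -2.0, 1.0]])

mean = torch.tensor([2.0, 0.5, -1.0, -2.5, 0.0])  # running means; bg fixed to 0
std = torch.tensor([1.5, 1.0, 0.8, 0.6, 1.0])     # running stds; bg fixed to 1
beta = torch.zeros_like(mean)
beta[:-1] = mean[:-1].min()                       # shared offset = min fg mean

calibrated = (logits - (mean - beta)) / std       # same formula as in forward()
print(calibrated)
```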
If you use LogN in your research, please cite our paper.
```
@article{zhao2024logit,
  title={Logit Normalization for Long-tail Object Detection},
  author={Zhao, Liang and Teng, Yao and Wang, Limin},
  journal={International Journal of Computer Vision},
  year={2024},
  publisher={Springer}
}
```