In this paper, we propose a novel feature fusion framework of dual cross-attention transformers to model global feature interaction and capture complementary information across modalities simultaneously. In addition, we introduce an iterative interaction mechanism into the dual cross-attention transformers, which shares parameters among block-wise multimodal transformers to reduce model complexity and computation cost. The proposed method is general and can be effectively integrated into different detection frameworks and used with different backbones. Experimental results on the KAIST, FLIR, and VEDAI datasets show that the proposed method achieves superior performance and faster inference, making it suitable for various practical scenarios.
Paper download: https://arxiv.org/pdf/2308.07504.pdf
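To make the fusion idea concrete, below is a minimal PyTorch sketch of dual cross-attention with an iterative, parameter-shared interaction loop. It is an illustration only: the class name `DualCrossAttentionFusion`, the token dimensions, and the use of `nn.MultiheadAttention` are placeholder assumptions and do not correspond to the actual modules in this repository.

```python
import torch
import torch.nn as nn

class DualCrossAttentionFusion(nn.Module):
    """Illustrative dual cross-attention fusion of RGB and thermal features.

    Each modality queries the other, and the same pair of attention blocks is
    reused in every iteration (parameter sharing), so the parameter count does
    not grow with the number of interaction steps.
    """

    def __init__(self, dim=256, num_heads=8, num_iters=2):
        super().__init__()
        # One pair of cross-attention blocks shared across all iterations.
        self.rgb_from_ir = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.ir_from_rgb = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_rgb = nn.LayerNorm(dim)
        self.norm_ir = nn.LayerNorm(dim)
        self.num_iters = num_iters

    def forward(self, rgb_tokens, ir_tokens):
        # rgb_tokens, ir_tokens: (B, N, dim) flattened spatial features.
        for _ in range(self.num_iters):  # iterative interaction, shared weights
            rgb_upd, _ = self.rgb_from_ir(query=rgb_tokens, key=ir_tokens, value=ir_tokens)
            ir_upd, _ = self.ir_from_rgb(query=ir_tokens, key=rgb_tokens, value=rgb_tokens)
            rgb_tokens = self.norm_rgb(rgb_tokens + rgb_upd)
            ir_tokens = self.norm_ir(ir_tokens + ir_upd)
        # Fuse the complementary information from both streams.
        return rgb_tokens + ir_tokens

if __name__ == "__main__":
    rgb = torch.randn(2, 400, 256)  # e.g. a 20x20 feature map flattened to tokens
    ir = torch.randn(2, 400, 256)
    fused = DualCrossAttentionFusion()(rgb, ir)
    print(fused.shape)  # torch.Size([2, 400, 256])
```

The loop reuses the same attention weights at every step, which is what keeps model complexity independent of the number of interaction iterations.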
Clone the repo and install requirements.txt in a Python>=3.8.0 conda environment with PyTorch>=1.12.
```bash
git clone https://github.com/chanchanchan97/ICAFusion.git
cd ICAFusion
pip install -r requirements.txt
```
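As an optional sanity check (not part of the repository's scripts), you can confirm that the installed PyTorch meets the version requirement and can see your GPU:

```python
import torch

print(torch.__version__)          # expect >= 1.12
print(torch.cuda.is_available())  # True if a CUDA-capable GPU is visible
```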
- KAIST
  Link:https://pan.baidu.com/s/1UdwQJH-cHVL91pkMW-ij6g Code:ig3y
- FLIR-aligned
  Link:https://pan.baidu.com/s/1ljr8qJYdz-60Lj-iVEHBvg Code:uqzs
- VEDAI
  Link:https://pan.baidu.com/s/1UKSI0Go0Ddt62tXNIySz9w Code:5ett
- KAIST
  Link:https://pan.baidu.com/s/18UXctOSgjp6EUcJXIGbWTQ Code:9eku
- FLIR-aligned
  Link:https://pan.baidu.com/s/1VZbsTE4o6bw2XBypPW3zoA Code:xli9
Note: These are the txt files for evaluation. We continuously optimize our code, which may cause slight differences in detection performance. However, the code of the multimodal feature fusion modules remains consistent with the method proposed in the paper.
- KAIST Link:https://pan.baidu.com/s/1N7SNEPXKX7KFaO2Th7vq2g Code:zijw
If you find our work useful in your research, please consider citing:
@article{SHEN2023109913,
  title={ICAFusion: Iterative Cross-Attention Guided Feature Fusion for Multispectral Object Detection},
  author={Shen, Jifeng and Chen, Yifei and Liu, Yue and Zuo, Xin and Fan, Heng and Yang, Wankou},
  journal={Pattern Recognition},
  pages={109913},
  year={2023},
  issn={0031-3203},
  doi={https://doi.org/10.1016/j.patcog.2023.109913}
}