
ShaSTA: Modeling Shape and Spatio-Temporal Affinities for 3D Multi-Object Tracking

ShaSTA: Modeling Shape and Spatio-Temporal Affinities for 3D Multi-Object Tracking,
Tara Sadjadpour, Jie Li, Rares Ambrus, Jeannette Bohg,
arXiv technical report (arXiv:2211.03919)

@article{sadjadpour2023shasta,
  title={Shasta: Modeling shape and spatio-temporal affinities for 3d multi-object tracking},
  author={Sadjadpour, Tara and Li, Jie and Ambrus, Rares and Bohg, Jeannette},
  journal={IEEE Robotics and Automation Letters},
  year={2023},
  publisher={IEEE}
}

If you enjoy this work and are interested in multi-modal perception with camera-LiDAR fusion, please also see our follow-up work ShaSTA-Fuse: Camera-LiDAR Sensor Fusion to Model Shape and Spatio-Temporal Affinities for 3D Multi-Object Tracking.

Highlights

  • Simple: Two-sentence method summary: ShaSTA models shape and spatio-temporal affinities between tracks and detections in consecutive frames. By better understanding objects’ shapes and spatio-temporal contexts, ShaSTA improves data association, false-positive elimination, false-negative propagation, newborn initialization, dead-track termination, and track confidence refinement (a minimal association sketch follows this list).

  • Fast and Accurate: Our best model achieves 69.6 AMOTA on the nuScenes test set, ranking 1st among trackers that use CenterPoint detections.

  • Extensible: A simple framework for affinity-based 3D multi-object tracking that you can build on in your own algorithms.
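To make the association step concrete, here is a minimal sketch in Python of affinity-based matching between tracks and detections. It assumes an (M, N) affinity matrix is already available (in ShaSTA this matrix is learned from shape and spatio-temporal cues); the function name, threshold, and return structure below are illustrative, not this repo's actual API.

# Minimal sketch of affinity-based track-detection association.
# The affinity matrix would come from a learned model (as in ShaSTA);
# the names and threshold below are illustrative, not this repo's API.
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(affinity: np.ndarray, min_affinity: float = 0.3):
    """Match M tracks to N detections from an (M, N) affinity matrix.

    Unmatched tracks are candidates for termination; unmatched
    detections are candidates for newborn tracks.
    """
    rows, cols = linear_sum_assignment(-affinity)  # maximize total affinity
    matches = [(r, c) for r, c in zip(rows, cols) if affinity[r, c] >= min_affinity]
    matched_rows = {r for r, _ in matches}
    matched_cols = {c for _, c in matches}
    unmatched_tracks = [r for r in range(affinity.shape[0]) if r not in matched_rows]
    unmatched_dets = [c for c in range(affinity.shape[1]) if c not in matched_cols]
    return matches, unmatched_tracks, unmatched_dets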

Main Result

3D Tracking on nuScenes test set

Method   AMOTA ↑   AMOTP ↓
ShaSTA   69.6      0.540

Docs

Environment Setup

To reproduce our environment setup, please see ENV_SETUP.md.

Data

To download the nuScenes data and obtain our pre-processed data, please see DATA.md.

Reproduce Results and Extensions

To run our pre-trained models and reproduce our results, or to train and evaluate your own models with this framework, please see MODELS.md. We also include a link to download the validation tracking results reported in the paper.

Visualization

We provide a helpful script for visualizing your results through two views: (1) a top-down view with LiDAR point clouds projected onto the road map, and (2) a front-camera view with 3D boxes projected onto the scene. These two views appear in the qualitative results shown in our video demo.

Please see VISUALIZE.md. We highly recommend using our visualization tool to obtain qualitative results for your tracking project, both during development and for publication. Please cite our repo if you find it helpful.
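As a rough illustration of the front-camera view, the sketch below uses the public nuscenes-devkit to project 3D boxes onto a camera image. It renders ground-truth annotation boxes rather than tracking output, and the dataroot is a placeholder; treat it as a starting point and see VISUALIZE.md for our actual tooling.

# Sketch: project 3D boxes onto the front camera image with the
# public nuscenes-devkit (ground-truth boxes, placeholder dataroot).
import matplotlib.pyplot as plt
from PIL import Image
from nuscenes.nuscenes import NuScenes

nusc = NuScenes(version='v1.0-mini', dataroot='/data/nuscenes', verbose=False)
sample = nusc.sample[0]
cam_token = sample['data']['CAM_FRONT']

# get_sample_data returns the image path, boxes transformed into the
# camera frame, and the 3x3 camera intrinsic matrix.
data_path, boxes, cam_intrinsic = nusc.get_sample_data(cam_token)

fig, ax = plt.subplots(figsize=(9, 5))
ax.imshow(Image.open(data_path))
for box in boxes:
    box.render(ax, view=cam_intrinsic, normalize=True)
ax.axis('off')
plt.show()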

License

ShaSTA is released under the Creative Commons Non-Commercial License (see LICENSE). It is developed on a forked version of CenterPoint, and we also incorporate preprocessing code from SimpleTrack. Note that nuScenes is released under a non-commercial license.

Acknowledgement

This research was funded by the Toyota Research Institute.

This project would not be possible without multiple open-source codebases. We list some notable examples below.