A paper list on Multi-Target Multi-Camera (MTMC) tracking and related topics,
including application cases in vehicle tracking 🚗, pedestrian tracking 🙍, and sports player tracking ⚽.
- Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking, Cao et al. [paper] [code]
interesting to see a variant of SORT (observation-centric) achieve decent results
- PoserNet: Refining Relative Camera Poses Exploiting Object Detections, Taiana et al. 🌈 [paper] [code]
not a tracking paper, but seems applicable to multi-camera tracking: detect bboxes in images and match them roughly, then refine camera poses with an interesting GNN formulation (image as node, relative pose as edge, bbox info added during message passing)
first associate boxes with high detection scores, then associate boxes with low detection scores; improves tracking of occluded objects
instance similarity learning based on region proposal, flexible, no external data required
- TrackFormer: Multi-Object Tracking with Transformers, Meinhardt et al. [paper]
Transformer, detection and tracking simultaneously
- How To Train Your Deep Multi-Object Tracker, Xu et al. 🌈 [paper]
Deep Hungarian Net; approximates MOTA and MOTP so they can be used directly as the loss function
- Learning a Neural Solver for Multiple Object Tracking, Braso & Leal-Taixe 🌈 [paper]
appearance embedding (node) and geometry distance embedding (edge) for graph, edge classification with cross-entropy loss
- Deep learning in video multi-object tracking: A survey, Ciaparrone et al. [paper]
pipeline: detection, feature extraction, affinity, association
- Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking, Peng et al. 🌈 [paper] [code]
end-to-end MOT, use adjacent frames (chained) to combine detection, feature extraction and tracking
- Spatial-Temporal Relation Networks for Multi-Object Tracking, Xu et al. [paper]
use appearance, location and topology cues for similarity score, then graph solved by Hungarian algorithm
- Graph convolutional tracking, Gao et al. [paper]
GNN, Siamese network
motion and appearance extension -> Tracktor++
- Deep Learning for Visual Tracking: A Comprehensive Survey, Marvasti-Zadeh et al. [paper]
traditional and deep visual trackers
- A Review of Visual Trackers and Analysis of its Application to Mobile Robot, You et al. [paper]
correlation filter, deep learning and convolutional features
- Exploit the Connectivity: Multi-Object Tracking with TrackletNet, Wang et al. [paper]
use epipolar geometry, tracklet as node in graph
- Real-time Multiple People Tracking with Deeply Learned Candidate Selection and Person Re-Identification, Chen et al. [paper] [code]
online MOT tracker
- Multi-Object Tracking with Quadruplet Convolutional Neural Networks, Son et al. [paper]
learn statistics to normalize effect of camera poses, temporal adjacent constraint for data association
- Real-Time Multiple Object Tracking, Murray. [paper]
does not use appearance features; very fast but not accurate
IoU tracker, no visual cues used, fast
- Online Multi-Target Tracking Using Recurrent Neural Networks, Milan et al. [paper]
RNN as tracker, LSTM for data association
- Learning by tracking: Siamese CNN for robust target association, Leal-Taixe et al. [paper]
use Siamese CNN to learn similarity, for data association, graph solved by Linear Programming
- Learning an image-based motion context for multiple people tracking, Leal-Taixe et al. [paper]
interaction between objects, relax the dependency of tracking on detections
- Graph Convolutional Network for Multi-Target Multi-Camera Vehicle Tracking, Luna et al. [paper]
step 1: single camera tracking & generate appearance feature, step 2: multi camera association with GNN (single camera trajectories as node, averaged feature as node feature, cos(feature) as edge feature), weighted loss for imbalance
- DyGLIP: A Dynamic Graph Model with Link Prediction for Accurate Multi-Camera Multiple Object Tracking, Quach et al. [paper]
tracklet as node, link prediction for data association, works with or without overlapping views, uses a large amount of training data
- Online Clustering-based Multi-Camera Vehicle Tracking in Scenarios with overlapping FOVs, Luna et al. [paper]
detection-> feature extraction, homography -> cross-camera cluster -> incremental temporal association, small latency, not very accurate
- Real-time 3D Deep Multi-Camera Tracking, You & Jiang [paper]
fuses all views into a ground-plane occupancy heatmap
- City-Scale Multi-Camera Vehicle Tracking by Semantic Attribute Parsing and Cross-Camera Tracklet Matching, He et al. [paper]
tracklet representation with spatial-temporal attention, then tracklet-to-target assignment
tracklet-to-target assignment
- AI City Challenge 2020 – Computer Vision for Smart Transportation Applications, Chang et al. [paper]
single camera tracklet -> multi-camera tracklet fusion with appearance and physical features
- Multi-Camera Tracking of Vehicles based on Deep Features Re-ID and Trajectory-Based Camera Link Models, Hsu et al. [paper]
use TrackletNet for single camera trajectory -> inter-camera tracking
- ELECTRICITY: An Efficient Multi-camera Vehicle Tracking System for Intelligent City, Qian et al. [paper]
single camera tracking -> match tracklets across camera views
Reinforcement learning, collaborative multi-camera
camera synchronization, SfM, Bundle Adjustment, spline representation for drone trajectory
- The MTA Dataset for Multi Target Multi Camera Pedestrian Tracking by Weighted Distance Aggregation [paper]
combine appearance and homography for hierarchical clustering, requires known camera poses
- Cross-View Tracking for Multi-Human 3D Pose Estimation at over 100 FPS, Chen et al. [paper]
- People tracking in multi-camera systems: a review, Iguernaissi et al. [paper]
Centralized (combine cross-camera views before tracking, like Wen et al.) and Distributed methods (single-camera tracking before fusion)
- CityFlow: A City-Scale Benchmark for Multi-Target Multi-Camera Vehicle Tracking and Re-Identification, Tang et al. [paper]
- Real-Time Multi-Target Multi-Camera Tracking with Spatial-Temporal Information, Zhang & Izquierdo 🌈 [paper]
single camera detection -> create/match to track, with appearance, motion, spatial-temporal cues (cross-camera)
- Features for Multi-Target Multi-Camera Tracking and Re-Identification, Ristani & Tomasi [paper] [code]
tracklet -> single camera trajectory (correlation clustering) -> multi camera trajectory
single camera tracking -> CNN feature extraction -> multi camera tracking (KMeans)
- Multi-Camera Multi-Target Tracking with Space-Time-View Hyper-graph, Wen et al. 🌈 [paper]
3D position for affinity computation, requires known camera parameters, cross-view coupling before trajectory
- Persistent Tracking for Wide Area Aerial Surveillance, Prokaj & Medioni 🌈 [paper]
two trackers (detection and regression) run in parallel, measure their correspondence
- Hypergraphs for joint multi-view reconstruction and multi-object tracking, Hofmann et al. 🌈 [paper] [code]
detections as nodes in a hypergraph to find 3D reconstructions, which become nodes in a min-cost flow graph, solved by binary linear programming
- Branch-and-price global optimization for multi-view multi-target tracking, Leal-Taixé et al. [paper]