Rankings include: BetterDepth, ChronoDepth, Depth Any Video, Depth Anything, DepthCrafter, DPT, FutureDepth, GBDMF, GenPercept, GeoWizard, LeReS, LightedDepth, LFVRT, Marigold, Metric3D, MiDaS, NeWCRFs, NVDS, NVDS+, PatchFusion, UniDepth, ZoeDepth

AIVFI/Monocular-Depth-Estimation-Rankings-and-2D-to-3D-Video-Conversion-Rankings

Monocular Depth Estimation Rankings
and 2D to 3D Video Conversion Rankings

List of Rankings

Monocular Depth Estimation Rankings

I. New layout

  1. ScanNet++ (98 video clips with 32 frames each): TAE
  2. NYU-Depth V2: OPW<=0.37
  3. NYU-Depth V2: AbsRel<=0.045 [test: new layout]

II. Old layout [currently no longer up to date]

  1. NYU-Depth V2 (640×480): AbsRel<=0.058 [currently no longer up to date]
  2. DA-2K (mostly 1500×2000): Acc (%)>=86
  3. UnrealStereo4K (3840×2160): AbsRel<=0.04
  4. MVS-Synth (1920×1080): AbsRel<=0.06
  5. HRSD (1920×1080): AbsRel<=0.08
  6. Middlebury2021 (1920×1080): SqRel<=0.5

2D to 3D Video Conversion Rankings

I. Light Field Video Reconstruction from Monocular Video Rankings

  1. Hybrid with 7×7 synthetic light field views✖️: LPIPS😍 (no data)
  2. Hybrid with 7×7 synthetic light field views✖️: PSNR😞>=32dB

Appendices


ScanNet++ (98 video clips with 32 frames each): TAE

| RK | Model | Venue | Repository | TAE ↓ {Input fr.} (arXiv, DAV) |
|---|---|---|---|---|
| 1 | Depth Any Video | arXiv | GitHub | 2.1 {MF} |
| 2 | DepthCrafter | arXiv | GitHub | 2.2 {MF} |
| 3 | ChronoDepth | arXiv | GitHub | 2.3 {MF} |
| 4 | NVDS | ICCV | GitHub | 3.7 {4} |

Back to Top Back to the List of Rankings

NYU-Depth V2: OPW<=0.37

| RK | Model | Venue | Repository | OPW ↓ {Input fr.} (arXiv, FD) | OPW ↓ {Input fr.} (TPAMI, NVDS+) | OPW ↓ {Input fr.} (ICCV, NVDS) |
|---|---|---|---|---|---|---|
| 1 | FutureDepth | arXiv | - | 0.303 {4} | - | - |
| 2 | NVDS+ | TPAMI | GitHub | - | 0.339 {4} | - |
| 3 | NVDS | ICCV | GitHub | 0.364 {4} | - | 0.364 {4} |

Back to Top Back to the List of Rankings

NYU-Depth V2: AbsRel<=0.045 [test: new layout]

| RK | Model | Venue | Repository | AbsRel ↓ {Input fr.} (arXiv, BD) | AbsRel ↓ {Input fr.} (TPAMI, M3D v2) | AbsRel ↓ {Input fr.} (CVPR, DA) | AbsRel ↓ {Input fr.} (NeurIPS, DA V2) | - | - | - | - |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1-2 | BetterDepth | arXiv | - | 0.042 {1} | - | - | - | - | - | - | - |
| 1-2 | Metric3D v2 ViT-Large | TPAMI | GitHub | - | 0.042 {1} | - | - | - | - | - | - |
| 3 | Depth Anything Large | CVPR | GitHub | 0.043 {1} | 0.043 {1} | 0.043 {1} | 0.043 {1} | - | - | - | - |
| 4 | Depth Anything V2 Large | NeurIPS | GitHub | - | - | - | 0.045 {1} | - | - | - | - |
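
The tied rank "1-2" in the table above appears when two models report the same best value. The following is a minimal sketch of one way such a ranking could be derived, assuming each model is ranked by its best reported AbsRel across the source columns and is listed only if that value meets the 0.045 threshold. The repository does not state its procedure, and `REPORTED` and `rank_models` are hypothetical names used only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# AbsRel values copied from the table above; None stands for a "-" cell.
REPORTED: Dict[str, List[Optional[float]]] = {
    "BetterDepth": [0.042, None, None, None],
    "Metric3D v2 ViT-Large": [None, 0.042, None, None],
    "Depth Anything Large": [0.043, 0.043, 0.043, 0.043],
    "Depth Anything V2 Large": [None, None, None, 0.045],
}


def rank_models(
    reported: Dict[str, List[Optional[float]]], threshold: float
) -> List[Tuple[str, str, float]]:
    """Return (rank label, model, best value) rows, lower values first.

    Assumption: a model's score is its lowest reported value, and models
    with equal scores share a range label such as "1-2".
    """
    best = {
        model: min(v for v in values if v is not None)
        for model, values in reported.items()
        if any(v is not None for v in values)
    }
    kept = sorted((score, model) for model, score in best.items() if score <= threshold)

    ranked: List[Tuple[str, str, float]] = []
    i = 0
    while i < len(kept):
        # Extend j to cover every model tied with the one at position i.
        j = i
        while j + 1 < len(kept) and kept[j + 1][0] == kept[i][0]:
            j += 1
        label = str(i + 1) if i == j else f"{i + 1}-{j + 1}"
        for score, model in kept[i : j + 1]:
            ranked.append((label, model, score))
        i = j + 1
    return ranked


for label, model, score in rank_models(REPORTED, threshold=0.045):
    print(label, model, score)
```

Ties are detected by exact equality, which is adequate here because the values are the published three-decimal figures rather than computed results.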

Back to Top Back to the List of Rankings

NYU-Depth V2 (640×480): AbsRel<=0.058 [currently no longer up to date]

| RK | Model | AbsRel ↓ {Input fr.} | Training dataset | Official repository | Practical model | VapourSynth |
|---|---|---|---|---|---|---|
| 1-2 | BetterDepth (arXiv); Backbone: Depth Anything & Marigold | 0.042 {1} (arXiv) | Hypersim & Virtual KITTI | - | - | - |
| 1-2 | Metric3D v2 CSTM_label (ICCV; ENH: arXiv); Backbone: DINOv2 with registers (ViT-L/14) | 0.042 {1} (arXiv) | DDAD & Lyft & Driving Stereo & DIML & Argoverse2 & Cityscapes & DSEC & Mapillary PSD & Pandaset & UASOL & Virtual KITTI & Waymo & Matterport3d & Taskonomy & Replica & ScanNet & HM3d & Hypersim | GitHub | - | - |
| 3 | Depth Anything Large (CVPR); Backbone: DINOv2 (ViT-L/14) | 0.043 {1} (CVPR) | Pretraining: BlendedMVS & DIML & HR-WSI & IRS & MegaDepth & TartanAir; Training: BDD100K & Google Landmarks & ImageNet-21K & LSUN & Objects365 & Open Images V7 & Places365 & SA-1B | GitHub | - | - |
| 4 | MiDaS v3.1 BEiTL-512 (TPAMI; ENH: arXiv); Backbone: BEiT512-L (ViT-L/16) | 0.048 {1} (CVPR) | Pretraining: ReDWeb & HR-WSI & BlendedMVS & NYU-Depth V2 & KITTI; Training: ReDWeb & DIML & 3D Movies & MegaDepth & WSVD & TartanAir & HR-WSI & ApolloScape & BlendedMVS & IRS & NYU-Depth V2 & KITTI | GitHub | - | PyTorch (GitHub) |
| 5 | GeoWizard (arXiv); Backbone: Stable Diffusion v2 | 0.052 {1} (arXiv) | Hypersim & Replica & 3D Ken Burns & Objaverse & proprietary | GitHub | - | - |
| 6 | Marigold (CVPR); Backbone: Stable Diffusion v2 | 0.055 {1} (CVPR) | Hypersim & Virtual KITTI | GitHub | - | - |
| 7 | GenPercept (arXiv); Backbone: Stable Diffusion v2.1 | 0.056 {1} (arXiv) | Hypersim & Virtual KITTI | GitHub | - | - |
| 8 | NeWCRFs + LightedDepth (CVPR; ENH: CVPR) | 0.057 {2} (CVPR) | ENH: NYU-Depth V2 | GitHub; ENH: GitHub | - | - |
| 9 | UniDepth-V (CVPR); Backbone: DINOv2 (ViT-L/14) | 0.0578 {1} (CVPR) | A2D2 & Argoverse2 & BDD100k & CityScapes & DrivingStereo & Mapillary PSD & ScanNet & Taskonomy & Waymo | GitHub | - | - |

Back to Top Back to the List of Rankings

DA-2K (mostly 1500×2000): Acc (%)>=86

| RK | Model | Acc (%) ↑ {Input fr.} | Training dataset | Official repository | Practical model | VapourSynth |
|---|---|---|---|---|---|---|
| 1 | Depth Anything V2 Giant (CVPR; ENH: arXiv); Backbone: DINOv2 (ViT-G/14) | 97.4 {1} (arXiv) | Pretraining: BlendedMVS & Hypersim & IRS & TartanAir & VKITTI 2; Training: BDD100K & Google Landmarks & ImageNet-21K & LSUN & Objects365 & Open Images V7 & Places365 & SA-1B | GitHub; ENH: GitHub | - | - |
| 2 | GeoWizard (arXiv); Backbone: Stable Diffusion v2 | 88.1 {1} (arXiv) | Hypersim & Replica & 3D Ken Burns & Objaverse & proprietary | GitHub | - | - |
| 3 | Marigold (CVPR); Backbone: Stable Diffusion v2 | 86.8 {1} (arXiv) | Hypersim & Virtual KITTI | GitHub | - | - |

Back to Top Back to the List of Rankings

UnrealStereo4K (3840×2160): AbsRel<=0.04

| RK | Model | AbsRel ↓ {Input fr.} | Training dataset | Official repository | Practical model | VapourSynth |
|---|---|---|---|---|---|---|
| 1 | ZoeDepth +PFR=128 (arXiv; ENH: CVPR) | 0.0388 {1} (CVPR) | ENH: UnrealStereo4K | GitHub; ENH: GitHub | - | - |

Back to Top Back to the List of Rankings

MVS-Synth (1920×1080): AbsRel<=0.06

| RK | Model | AbsRel ↓ {Input fr.} | Training dataset | Official repository | Practical model | VapourSynth |
|---|---|---|---|---|---|---|
| 1 | ZoeDepth +PFR=128 (arXiv; ENH: CVPR) | 0.0589 {1} (CVPR) | ENH: MVS-Synth | GitHub; ENH: GitHub | - | - |

Back to Top Back to the List of Rankings

HRSD (1920×1080): AbsRel<=0.08

| RK | Model | AbsRel ↓ {Input fr.} | Training dataset | Official repository | Practical model | VapourSynth |
|---|---|---|---|---|---|---|
| 1 | DPT-B + R + AL (ICCV; ENH: CVPRW) | 0.074 {1} (CVPRW) | ENH: HRSD | GitHub; ENH: - | - | - |

Back to Top Back to the List of Rankings

Middlebury2021 (1920×1080): SqRel<=0.5

| RK | Model | SqRel ↓ {Input fr.} | Training dataset | Official repository | Practical model | VapourSynth |
|---|---|---|---|---|---|---|
| 1 | LeReS-GBDMF (CVPR; ENH: AAAI) | 0.444 {1} (AAAI) | ENH: HR-WSI | GitHub; ENH: GitHub | - | - |
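
The AbsRel and SqRel columns in the tables above are relative depth errors. As a rough illustration, the sketch below implements their commonly used definitions: the mean of |d_pred - d_gt| / d_gt and the mean of (d_pred - d_gt)^2 / d_gt over pixels with valid ground truth. The function names are hypothetical, and the cited papers may apply additional masking, depth caps, or scale and shift alignment before computing the metric, so this is not a reproduction of any listed result.

```python
from typing import List, Sequence, Tuple


def _valid_pairs(pred: Sequence[float], gt: Sequence[float]) -> List[Tuple[float, float]]:
    """Pair predictions with ground truth, keeping only pixels where gt > 0."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no pixels with valid ground-truth depth")
    return pairs


def abs_rel(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Absolute relative error: mean(|pred - gt| / gt)."""
    pairs = _valid_pairs(pred, gt)
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def sq_rel(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Squared relative error: mean((pred - gt)^2 / gt)."""
    pairs = _valid_pairs(pred, gt)
    return sum((p - g) ** 2 / g for p, g in pairs) / len(pairs)


# Toy depths in metres; the pixel with gt == 0 is treated as invalid and skipped.
predicted = [1.0, 2.2, 4.0, 3.0]
ground_truth = [1.0, 2.0, 5.0, 0.0]
print(abs_rel(predicted, ground_truth), sq_rel(predicted, ground_truth))
```

Both metrics are lower-is-better, which matches the ↓ marker in the column headers.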

Back to Top Back to the List of Rankings

Hybrid with 7×7 synthetic light field views✖️: PSNR😞>=32dB

| RK | Model | PSNR ↑ {Input fr.} | Training dataset | Official repository | Practical model | VapourSynth |
|---|---|---|---|---|---|---|
| 1 | LFVRT (ECCV); MDE: DPT (ICCV); Backbone: ViT | 32.66 {3+1D} (ECCV) | GoPro & TAMULF | GitHub; MDE: GitHub | - | - |

📝 Note: The above ranking includes only one model because the other methods are image-based and carry no temporal information, which makes them unsuitable for light field video reconstruction from monocular video.
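
The ranking above is thresholded on PSNR in dB. For reference, a minimal sketch of the standard PSNR formula, 10 * log10(MAX^2 / MSE), is given below. It assumes 8-bit pixel values (MAX = 255) in flat lists; `psnr` is a hypothetical helper, and the evaluation protocol of the listed paper (colour space, view averaging) is not specified here.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], reconstructed: Sequence[float], max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(reconstructed) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example: each pixel differs by one grey level, so MSE = 1.
print(psnr([100, 100], [101, 99]))
```

Higher values indicate a closer match to the reference, in line with the ↑ marker in the column header.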

Back to Top Back to the List of Rankings

Appendix 3: List of all research papers from the above rankings

| Method | Paper | Venue |
|---|---|---|
| BetterDepth | BetterDepth: Plug-and-Play Diffusion Refiner for Zero-Shot Monocular Depth Estimation | arXiv |
| ChronoDepth | Learning Temporally Consistent Video Depth from Video Diffusion Priors | arXiv |
| Depth Any Video | Depth Any Video with Scalable Synthetic Data | arXiv |
| Depth Anything | Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data | CVPR |
| Depth Anything V2 | Depth Anything V2 | NeurIPS |
| DepthCrafter | DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos | arXiv |
| DPT | Vision Transformers for Dense Prediction | ICCV |
| FutureDepth | FutureDepth: Learning to Predict the Future Improves Video Depth Estimation | arXiv |
| GBDMF | Multi-Resolution Monocular Depth Map Fusion by Self-Supervised Gradient-Based Composition | AAAI |
| GenPercept | Diffusion Models Trained with Large Data Are Transferable Visual Models | arXiv |
| GeoWizard | GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image | arXiv |
| LeReS | Learning to Recover 3D Scene Shape from a Single Image | CVPR |
| LightedDepth | LightedDepth: Video Depth Estimation in light of Limited Inference View Angles | CVPR |
| LFVRT | Synthesizing Light Field Video from Monocular Video | ECCV |
| Marigold | Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation | CVPR |
| Metric3D | Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image | ICCV |
| Metric3D v2 | Metric3D v2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation | TPAMI |
| MiDaS | Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-Shot Cross-Dataset Transfer | TPAMI |
| MiDaS v3.1 | MiDaS v3.1 – A Model Zoo for Robust Monocular Relative Depth Estimation | arXiv |
| NeWCRFs | Neural Window Fully-connected CRFs for Monocular Depth Estimation | CVPR |
| NVDS | Neural Video Depth Stabilizer | ICCV |
| NVDS+ | NVDS+: Towards Efficient and Versatile Neural Stabilizer for Video Depth Estimation | TPAMI |
| PatchFusion | PatchFusion: An End-to-End Tile-Based Framework for High-Resolution Monocular Metric Depth Estimation | CVPR |
| R + AL | High-Resolution Synthetic RGB-D Datasets for Monocular Depth Estimation | CVPRW |
| UniDepth | UniDepth: Universal Monocular Metric Depth Estimation | CVPR |
| ZoeDepth | ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth | arXiv |

Back to Top Back to the List of Rankings
