
Yioutpi/Awesome-3D-Perception

Awesome-3D-Perception

3D Perception

Object-level

  • PointLLM: Empowering Large Language Models to Understand Point Clouds [Paper] [Homepage] [Github]
  • Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following [Paper] [Demo] [Github]

Scene-level

3D With CLIP

  • ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding [Paper] [Github]
  • ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding [Paper] [Github]
  • OpenShape: Scaling Up 3D Shape Representation Towards Open-World Understanding [Paper] [Github] [Homepage]
  • CLIP²: Contrastive Language-Image-Point Pretraining from Real-World Point Cloud Data [Paper] [Github]
  • CLIP Goes 3D: Leveraging Prompt Tuning for Language Grounded 3D Recognition [Paper] [Github]
  • CLIP2Point: Transfer CLIP to Point Cloud Classification with Image-Depth Pre-Training [Paper] [Github]
  • Uni3D: Exploring Unified 3D Representation at Scale [Paper] [Github]
  • MixCon3D: Synergizing Multi-View and Cross-Modal Contrastive Learning for Enhancing 3D Representation [Paper] [Github]

3D-Dataset

Object-level

  • OmniObject3D (CVPR 2023 Award Candidate): 6K real-scanned 3D objects, 190 classes [Paper] [Homepage]
  • Objaverse-XL: 10M+ 3D objects [Paper] [Homepage] [Dataset]
  • Cap3D: 660K 3D-text pairs [Paper] [Download]
  • ULIP - Objaverse Triplets: 3D point clouds (800K), images (10M), language (100M) triplets [Download]
  • ULIP - ShapeNet Triplets: 3D point clouds (52.5K), images (3M), language (30M) triplets [Download]

Scene-level

  • ScanRefer: 3D object localization in RGB-D scans using natural language
  • SQA3D: 650 scenes, 6.8K situations, 20.4K descriptions, and 33.4K diverse reasoning questions for these situations [Paper] [Homepage]

Survey

  • Recent Advances in Multi-modal 3D Scene Understanding: A Comprehensive Survey and Evaluation [Paper]
  • JM3D & JM3D-LLM: Elevating 3D Representation with Joint Multi-modal Cues [Paper]
