ICCV 2023 Papers and Open-Source Projects (papers with code)!
2160 papers accepted!
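ICCV 2023 accepted paper IDs: https://t.co/A0mCH8gbOi
Note 1: Feel free to open an issue to share ICCV 2023 papers and open-source projects!
Note 2: For papers from previous top CV conferences, other high-quality CV papers, and roundups, see: https://github.com/amusi/daily-paper-computer-vision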
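If you want to keep up with the latest high-quality CV papers, open-source projects, and learning materials, you are welcome to scan the QR code and join the CVer academic exchange group (CVer学术交流群). Let's learn from each other and improve together!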
- Backbone
- CLIP
- MAE
- GAN
- GNN
- MLP
- NAS
- OCR
- NeRF
- DETR
- Prompt
- Diffusion Models
- Avatars
- ReID (Re-Identification)
- Long-Tail Distribution
- Vision Transformer
- Vision-Language
- Self-supervised Learning
- Data Augmentation
- Object Detection
- Visual Tracking
- Semantic Segmentation
- Instance Segmentation
- Panoptic Segmentation
- Medical Image Segmentation
- Video Object Segmentation
- Video Instance Segmentation
- Referring Image Segmentation
- Image Matting
- Low-level Vision
- Super-Resolution
- Denoising
- Deblur
- 3D Point Cloud
- 3D Object Detection
- 3D Semantic Segmentation
- 3D Object Tracking
- 3D Semantic Scene Completion
- 3D Registration
- 3D Human Pose Estimation
- 3D Human Mesh Estimation
- Medical Image
- Image Generation
- Video Generation
- Image Editing
- Video Editing
- Video Understanding
- Human Motion Generation
- Low-light Image Enhancement
- Image Retrieval
- Image Fusion
- Others
Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control
- Paper: https://arxiv.org/abs/2303.17606
- Code: https://github.com/songrise/AvatarCraft
Rethinking Mobile Block for Efficient Attention-based Models
IntrinsicNeRF: Learning Intrinsic Neural Radiance Fields for Editable Novel View Synthesis
- Homepage: https://zju3dv.github.io/intrinsic_nerf/
- Paper: https://arxiv.org/abs/2210.00647
- Code: https://github.com/zju3dv/IntrinsicNeRF
Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control
- Paper: https://arxiv.org/abs/2303.17606
- Code: https://github.com/songrise/AvatarCraft
PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment
FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model
BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion
BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction
DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion
FemtoDet: An Object Detection Baseline for Energy Versus Performance Tradeoffs
Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment
Cross-modal Orthogonal High-rank Augmentation for RGB-Event Transformer-trackers
- Paper: https://arxiv.org/abs/2307.04129
- Code: https://github.com/ZHU-Zhiyu/High-Rank_RGB-Event_Tracker
Segment Anything
- Homepage: https://segment-anything.com/
- Paper: https://arxiv.org/abs/2304.02643
- Code: https://github.com/facebookresearch/segment-anything
MARS: Model-agnostic Biased Object Removal without Additional Supervision for Weakly-Supervised Semantic Segmentation
FreeCOS: Self-Supervised Learning from Fractals and Unlabeled Images for Curvilinear Object Segmentation
DVIS: Decoupled Video Instance Segmentation Framework
Self-supervised Learning to Bring Dual Reversed Rolling Shutter Images Alive
Robo3D: Towards Robust and Reliable 3D Perception against Corruptions
- Homepage: https://ldkong.com/Robo3D
- Paper: https://arxiv.org/abs/2303.17597
- Code: https://github.com/ldkong1205/Robo3D
PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images
DQS3D: Densely-matched Quantization-aware Semi-supervised 3D Detection
SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection
StreamPETR: Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection
Cross Modal Transformer: Towards Fast and Robust 3D Object Detection
MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation
- Paper: https://arxiv.org/abs/2304.09801
- Project: https://chongjiange.github.io/metabev.html
- Code: https://github.com/ChongjianGE/MetaBEV
Revisiting Domain-Adaptive 3D Object Detection by Reliable, Diverse and Class-balanced Pseudo-Labeling
Rethinking Range View Representation for LiDAR Segmentation
- Homepage: https://ldkong.com/RangeFormer
- Paper: https://arxiv.org/abs/2303.05367
- Code: None
MBPTrack: Improving 3D Point Cloud Tracking with Memory Networks and Box Priors
Unmasked Teacher: Towards Training-Efficient Video Foundation Models
FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model
BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion
Simulating Fluids in Real-World Still Images
- Homepage: https://slr-sfs.github.io/
- Paper: https://arxiv.org/abs/2204.11335
- Code: https://github.com/simon3dv/SLR-SFS
Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing
- Paper: https://arxiv.org/abs/2304.02051
- Code: https://github.com/aimagelab/multimodal-garment-designer
FateZero: Fusing Attentions for Zero-shot Text-based Video Editing
- Project: https://fate-zero-edit.github.io/
- Paper: https://arxiv.org/abs/2303.09535
- Code: https://github.com/ChenyangQiQi/FateZero
BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction
Implicit Neural Representation for Cooperative Low-light Image Enhancement
Zero-Shot Composed Image Retrieval with Textual Inversion
DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion
MotionBERT: A Unified Perspective on Learning Human Motion Representations