Zhiwei Lin

TEOcc: Radar-camera Multi-modal Occupancy Prediction via Temporal Enhancement

Oct 15, 2024
Via arXiv

Training-Free Open-Ended Object Detection and Segmentation via Attention as Prompts

Oct 08, 2024
Via arXiv

RobustSVC: HuBERT-based Melody Extractor and Adversarial Learning for Robust Singing Voice Conversion

Sep 10, 2024
Via arXiv

RCBEVDet++: Toward High-accuracy Radar-Camera Fusion 3D Perception Network

Sep 08, 2024
Via arXiv

Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model

Apr 02, 2024
Via arXiv

RCBEVDet: Radar-camera Fusion in Bird's Eye View for 3D Object Detection

Mar 25, 2024
Via arXiv

Few-Shot Adversarial Prompt Learning on Vision-Language Models

Mar 21, 2024
Via arXiv

GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting

Feb 11, 2024
Via arXiv

LCVO: An Efficient Pretraining-Free Framework for Visual Question Answering Grounding

Jan 29, 2024
Via arXiv

Multi-view MidiVAE: Fusing Track- and Bar-view Representations for Long Multi-track Symbolic Music Generation

Jan 15, 2024
Via arXiv