Yiheng Li

CoreNet: Conflict Resolution Network for Point-Pixel Misalignment and Sub-Task Suppression of 3D LiDAR-Camera Object Detection

Jan 11, 2025
Via arXiv

RCTrans: Radar-Camera Transformer via Radar Densifier and Sequential Decoder for 3D Object Detection

Dec 17, 2024
Via arXiv

UniPose: A Unified Multimodal Framework for Human Pose Comprehension, Generation and Editing

Nov 25, 2024
Via arXiv

Mitigating Object Hallucination via Concentric Causal Attention

Oct 21, 2024
Via arXiv

Learning Content-Aware Multi-Modal Joint Input Pruning via Bird's-Eye-View Representation

Oct 09, 2024
Via arXiv

QuadBEV: An Efficient Quadruple-Task Perception Framework via Bird's-Eye-View Representation

Oct 09, 2024
Via arXiv

WOMD-Reasoning: A Large-Scale Language Dataset for Interaction and Driving Intentions Reasoning

Jul 05, 2024
Via arXiv

Immiscible Diffusion: Accelerating Diffusion Training with Noise Assignment

Jun 18, 2024
Via arXiv

SparseFusion: Efficient Sparse Multi-Modal Fusion Framework for Long-Range 3D Perception

Mar 15, 2024
Via arXiv

Towards Efficient 3D Object Detection in Bird's-Eye-View Space for Autonomous Driving: A Convolutional-Only Approach

Dec 01, 2023
Via arXiv