Fan Lu

Self-Consistent Latent Reasoning: Long Latent Sequence Reasoning for Vision-Language Model

May 13, 2026

MU-GeNeRF: Multi-view Uncertainty-guided Generalizable Neural Radiance Fields for Distractor-aware Scene

Apr 20, 2026

Visually-grounded Humanoid Agents

Apr 09, 2026

A Pragmatic VLA Foundation Model

Jan 26, 2026

LiDARDraft: Generating LiDAR Point Cloud from Versatile Inputs

Dec 23, 2025

Vision-Centric Activation and Coordination for Multimodal Large Language Models

Oct 16, 2025

UrbanCraft: Urban View Extrapolation via Hierarchical Sem-Geometric Priors

May 29, 2025

R2LDM: An Efficient 4D Radar Super-Resolution Framework Leveraging Diffusion Model

Mar 21, 2025

Range and Bird's Eye View Fused Cross-Modal Visual Place Recognition

Feb 17, 2025

Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning

Dec 12, 2024