Xiaohan Yu

M$^3$Searcher: Modular Multimodal Information Seeking Agency with Retrieval-Oriented Reasoning

Jan 14, 2026
Via arXiv

SCE-SLAM: Scale-Consistent Monocular SLAM via Scene Coordinate Embeddings

Jan 14, 2026
Via arXiv

SparseSurf: Sparse-View 3D Gaussian Splatting for Surface Reconstruction

Nov 18, 2025
Via arXiv

TableRAG: A Retrieval Augmented Generation Framework for Heterogeneous Document Reasoning

Jun 12, 2025
Via arXiv

Improving Medical Visual Representation Learning with Pathological-level Cross-Modal Alignment and Correlation Exploration

Jun 12, 2025
Via arXiv

X2C: A Dataset Featuring Nuanced Facial Expressions for Realistic Humanoid Imitation

May 16, 2025
Via arXiv

LGD: Leveraging Generative Descriptions for Zero-Shot Referring Image Segmentation

Apr 20, 2025
Via arXiv

Visual and Semantic Prompt Collaboration for Generalized Zero-Shot Learning

Mar 29, 2025
Via arXiv

Unveiling the Potential of Multimodal Retrieval Augmented Generation with Planning

Jan 26, 2025
Via arXiv

Explainable CTR Prediction via LLM Reasoning

Dec 03, 2024
Via arXiv