Sihang Cai

HVD: Human Vision-Driven Video Representation Learning for Text-Video Retrieval

Jan 22, 2026
Via arXiv

Chat-Driven Text Generation and Interaction for Person Retrieval

Sep 16, 2025
Via arXiv

Astrea: A MOE-based Visual Understanding Model with Progressive Alignment

Mar 12, 2025
Via arXiv

EAGER-LLM: Enhancing Large Language Models as Recommenders through Exogenous Behavior-Semantic Integration

Feb 20, 2025
Via arXiv