Yixing Peng

HumanOmni: A Large Vision-Speech Language Model for Human-Centric Video Understanding

Jan 25, 2025
Via arXiv

Cross-Modal Adaptive Dual Association for Text-to-Image Person Retrieval

Dec 04, 2023
Via arXiv