Xuemeng Song

COMBINER: Composed Image Retrieval Guided by Attribute-based Neighbor Relations

Jun 03, 2026

FashionLens: Toward Versatile Fashion Image Retrieval via Task-Adaptive Learning

May 21, 2026

OSGNet with MLLM Reranking @ Ego4D Episodic Memory Challenge 2026

May 20, 2026

UniCVR: From Alignment to Reranking for Unified Zero-Shot Composed Visual Retrieval

Apr 22, 2026

MELT: Improve Composed Image Retrieval via the Modification Frequentation-Rarity Balance Network

Mar 31, 2026

VideoTemp-o3: Harmonizing Temporal Grounding and Video Understanding in Agentic Thinking-with-Videos

Feb 08, 2026

Dual Knowledge-Enhanced Two-Stage Reasoner for Multimodal Dialog Systems

Sep 09, 2025

Mitigating Hallucination Through Theory-Consistent Symmetric Multimodal Preference Optimization

Jun 13, 2025

Modality Reliability Guided Multimodal Recommendation

Apr 23, 2025

Fine-grained Textual Inversion Network for Zero-Shot Composed Image Retrieval

Mar 25, 2025