Son Tran

CoLLM: A Large Language Model for Composed Image Retrieval

Mar 25, 2025
Via arXiv

M-LLM Based Video Frame Selection for Efficient Video Understanding

Feb 27, 2025
Via arXiv

Bringing Multimodality to Amazon Visual Search System

Dec 17, 2024
Via arXiv

DreamBlend: Advancing Personalized Fine-tuning of Text-to-Image Diffusion Models

Nov 28, 2024
Via arXiv

X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs

Jul 18, 2024
Via arXiv

Open Vocabulary Multi-Label Video Classification

Jul 12, 2024
Via arXiv

VidLA: Video-Language Alignment at Scale

Mar 21, 2024
Via arXiv

UnsMOT: Unified Framework for Unsupervised Multi-Object Tracking with Geometric Topology Guidance

Sep 03, 2023
Via arXiv

SurveyLM: A platform to explore emerging value perspectives in augmented language models' behaviors

Aug 01, 2023
Via arXiv

Vision-Language Pre-Training with Triple Contrastive Learning

Mar 28, 2022
Via arXiv