Adam Lee

Towards Efficient Visual-Language Alignment of the Q-Former for Visual Reasoning Tasks

Oct 12, 2024
Via arXiv

MERLIN: Multimodal Embedding Refinement via LLM-based Iterative Navigation for Text-Video Retrieval-Rerank Pipeline

Jul 17, 2024
Via arXiv

Contrastive Weighted Learning for Near-Infrared Gaze Estimation

Nov 06, 2022
Via arXiv