Chae Won Kim

Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech Representations

Mar 08, 2025

Personalized Lip Reading: Adapting to Your Unique Lip Movements with Vision and Language

Sep 02, 2024

TroL: Traversal of Layers for Large Language and Vision Models

Jun 18, 2024

Let's Go Real Talk: Spoken Dialogue Model for Face-to-Face Conversation

Jun 12, 2024

Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models

May 27, 2024

MoAI: Mixture of All Intelligence for Large Language and Vision Models

Mar 12, 2024

Persona Extraction Through Semantic Similarity for Emotional Support Conversation Generation

Mar 07, 2024

CoLLaVO: Crayon Large Language and Vision mOdel

Feb 20, 2024

Deep Visual Forced Alignment: Learning to Align Transcription with Talking Face Video

Feb 27, 2023