Joon Son Chung

LAVCap: LLM-based Audio-Visual Captioning using Optimal Transport

Jan 16, 2025

AdaptVC: High Quality Voice Conversion with Adaptive Learning

Jan 07, 2025

CrossSpeech++: Cross-lingual Speech Synthesis with Decoupled Language and Speaker Generation

Dec 28, 2024

VoiceDiT: Dual-Condition Diffusion Transformer for Environment-Aware Speech Synthesis

Dec 26, 2024

V2SFlow: Video-to-Speech Generation with Speech Decomposition and Rectified Flow

Nov 29, 2024

AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models

Oct 23, 2024

Accelerating Codec-based Speech Synthesis with Multi-Token Prediction and Speculative Decoding

Oct 17, 2024

Let Me Finish My Sentence: Video Temporal Grounding with Holistic Text Understanding

Oct 17, 2024

Text-To-Speech Synthesis In The Wild

Sep 13, 2024

The VoxCeleb Speaker Recognition Challenge: A Retrospective

Aug 27, 2024