Sang Hoon Woo

EnCLAP++: Analyzing the EnCLAP Framework for Optimizing Automated Audio Captioning Performance

Sep 02, 2024
Via arXiv

Expanding on EnCLAP with Auxiliary Retrieval Model for Automated Audio Captioning

Sep 02, 2024
Via arXiv

EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning

Jan 31, 2024
Via arXiv

SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech

Jun 24, 2022
Via arXiv

Talking Face Generation with Multilingual TTS

May 13, 2022
Via arXiv