Sang Hoon Woo

Expanding on EnCLAP with Auxiliary Retrieval Model for Automated Audio Captioning

Sep 02, 2024
Via arXiv

EnCLAP++: Analyzing the EnCLAP Framework for Optimizing Automated Audio Captioning Performance

Sep 02, 2024
Via arXiv

EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning

Jan 31, 2024
Via arXiv

SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech

Jun 24, 2022
Via arXiv

Talking Face Generation with Multilingual TTS

May 13, 2022
Via arXiv