Hoon-Young Cho

Triage knowledge distillation for speaker verification

Jan 21, 2026
Via arXiv

MATE: Matryoshka Audio-Text Embeddings for Open-Vocabulary Keyword Spotting

Jan 20, 2026
Via arXiv

DAME: Duration-Aware Matryoshka Embedding for Duration-Robust Speaker Verification

Jan 20, 2026
Via arXiv

Adversarial Deep Metric Learning for Cross-Modal Audio-Text Alignment in Open-Vocabulary Keyword Spotting

May 22, 2025
Via arXiv

Single-Channel Distance-Based Source Separation for Mobile GPU in Outdoor and Indoor Environments

Jan 06, 2025
Via arXiv

Text-Aware Adapter for Few-Shot Keyword Spotting

Dec 24, 2024
Via arXiv

FINALLY: fast and universal speech enhancement with studio-like quality

Oct 08, 2024
Via arXiv

Speech Boosting: Low-Latency Live Speech Enhancement for TWS Earbuds

Sep 27, 2024
Via arXiv

High Fidelity Text-to-Speech Via Discrete Tokens Using Token Transducer and Group Masked Language Model

Jun 25, 2024
Via arXiv

Relational Proxy Loss for Audio-Text based Keyword Spotting

Jun 08, 2024
Via arXiv