Yu Xi

UniCodec: Unified Audio Codec with Single Domain-Adaptive Codebook

Feb 27, 2025
Via arXiv

Neural Directed Speech Enhancement with Dual Microphone Array in High Noise Scenario

Dec 24, 2024
Via arXiv

NTC-KWS: Noise-aware CTC for Robust Keyword Spotting

Dec 17, 2024
Via arXiv

Streaming Keyword Spotting Boosted by Cross-layer Discrimination Consistency

Dec 17, 2024
Via arXiv

A Survey on Speech Large Language Models

Oct 24, 2024
Via arXiv

Semi-supervised Learning for Code-Switching ASR with Large Language Model Filter

Jul 05, 2024
Via arXiv

Romanization Encoding For Multilingual ASR

Jul 05, 2024
Via arXiv

Text-aware Speech Separation for Multi-talker Keyword Spotting

Jun 18, 2024
Via arXiv

TDT-KWS: Fast And Accurate Keyword Spotting Using Token-and-duration Transducer

Mar 20, 2024
Via arXiv

Contrastive Learning With Audio Discrimination For Customizable Keyword Spotting In Continuous Speech

Jan 12, 2024
Via arXiv