Yu Xi

TC-BiMamba: Trans-Chunk bidirectionally within BiMamba for unified streaming and non-streaming ASR

Feb 12, 2026

Detect, Attend and Extract: Keyword Guided Target Speaker Extraction

Feb 08, 2026

Qwen3-ASR Technical Report

Jan 29, 2026

Time-Layer Adaptive Alignment for Speaker Similarity in Flow-Matching Based Zero-Shot TTS

Nov 13, 2025

Joint decoding method for controllable contextual speech recognition based on Speech LLM

Aug 12, 2025

Low-Resource Domain Adaptation for Speech LLMs via Text-Only Fine-Tuning

Jun 06, 2025

Masked Self-distilled Transducer-based Keyword Spotting with Semi-autoregressive Decoding

May 30, 2025

Fewer Hallucinations, More Verification: A Three-Stage LLM-Based Framework for ASR Error Correction

May 30, 2025

MFA-KWS: Effective Keyword Spotting with Multi-head Frame-asynchronous Decoding

May 26, 2025

UniCodec: Unified Audio Codec with Single Domain-Adaptive Codebook

Feb 27, 2025