
Genshun Wan

Streaming Speech Recognition with Decoder-Only Large Language Models and Latency Optimization

Jan 30, 2026
Via arXiv

Adapting Speech Foundation Models with Large Language Models for Unified Speech Recognition

Oct 27, 2025
Via arXiv

Audio-Visual Representation Learning via Knowledge Distillation from Speech Foundation Models

Feb 09, 2025
Via arXiv

Deep CLAS: Deep Contextual Listen, Attend and Spell

Sep 26, 2024
Via arXiv

The USTC-NERCSLIP Systems for the CHiME-8 NOTSOFAR-1 Challenge

Sep 03, 2024
Via arXiv

The USTC-NERCSLIP Systems for the CHiME-7 DASR Challenge

Aug 28, 2023
Via arXiv

Reducing the gap between streaming and non-streaming Transducer-based ASR by adaptive two-stage knowledge distillation

Jun 27, 2023
Via arXiv

Improved Self-Supervised Multilingual Speech Representation Learning Combined with Auxiliary Language Information

Dec 07, 2022
Via arXiv

Progressive Multi-Scale Self-Supervised Learning for Speech Recognition

Dec 07, 2022
Via arXiv

Improved Speech Pre-Training with Supervision-Enhanced Acoustic Unit

Dec 07, 2022
Via arXiv