Xingchen Song

HydraFormer: One Encoder For All Subsampling Rates

Aug 08, 2024
Via arXiv

U2++ MoE: Scaling 4.7x parameters with minimal impact on RTF

Apr 25, 2024
Via arXiv

Spike-Triggered Contextual Biasing for End-to-End Mandarin Speech Recognition

Oct 07, 2023
Via arXiv

LightGrad: Lightweight Diffusion Probabilistic Model for Text-to-Speech

Aug 31, 2023
Via arXiv

ZeroPrompt: Streaming Acoustic Encoders are Zero-Shot Masked LMs

May 18, 2023
Via arXiv

CB-Conformer: Contextual biasing Conformer for biased word recognition

Apr 25, 2023
Via arXiv

Fast-U2++: Fast and Accurate End-to-End Speech Recognition in Joint CTC/Attention Frames

Nov 02, 2022
Via arXiv

TrimTail: Low-Latency Streaming ASR with Simple but Effective Spectrogram-Level Length Penalty

Nov 01, 2022
Via arXiv

FusionFormer: Fusing Operations in Transformer for Efficient Streaming Speech Recognition

Oct 31, 2022
Via arXiv

WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit

Mar 29, 2022
Via arXiv