Shaojin Ding

USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models

Jan 03, 2024
Via arXiv

2-bit Conformer quantization for automatic speech recognition

May 26, 2023
Via arXiv

RAND: Robustness Aware Norm Decay For Quantized Seq2seq Models

May 24, 2023
Via arXiv

Sharing Low Rank Conformer Weights for Tiny Always-On Ambient Speech Recognition Models

Mar 15, 2023
Via arXiv

A Unified Cascaded Encoder ASR Model for Dynamic Model Sizes

Apr 20, 2022
Via arXiv

Personal VAD 2.0: Optimizing Personal Voice Activity Detection for On-Device Speech Recognition

Apr 13, 2022
Via arXiv

4-bit Conformer with Native Quantization Aware Training for Speech Recognition

Mar 29, 2022
Via arXiv

Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis

Oct 09, 2021
Via arXiv

Textual Echo Cancellation

Aug 13, 2020
Via arXiv

AutoSpeech: Neural Architecture Search for Speaker Recognition

May 07, 2020
Via arXiv