Sining Sun

Streaming Decoder-Only Automatic Speech Recognition with Discrete Speech Units: A Pilot Study

Jun 27, 2024

Skipformer: A Skip-and-Recover Strategy for Efficient Speech Recognition

Mar 13, 2024

Self-Supervised Disentangled Representation Learning for Robust Target Speech Extraction

Dec 16, 2023

Key Frame Mechanism For Efficient Conformer Based End-to-end Speech Recognition

Oct 28, 2023

DCCRN-KWS: an audio bias based model for noise robust small-footprint keyword spotting

May 23, 2023

Two Stage Contextual Word Filtering for Context bias in Unified Streaming and Non-streaming Transducer

Jan 17, 2023

CaTT-KWS: A Multi-stage Customized Keyword Spotting Framework based on Cascaded Transducer-Transformer

Jul 04, 2022

Leveraging Acoustic Contextual Representation by Audio-textual Cross-modal Learning for Conversational ASR

Jul 03, 2022

Conversational Speech Recognition By Learning Conversation-level Characteristics

Feb 17, 2022

Improving Streaming Transformer Based ASR Under a Framework of Self-supervised Learning

Sep 15, 2021