Songjun Cao

A Transcription Prompt-based Efficient Audio Large Language Model for Robust Speech Recognition

Aug 18, 2024

DistillW2V2: A Small and Streaming Wav2vec 2.0 Based ASR Model

Mar 16, 2023

Censer: Curriculum Semi-supervised Learning for Speech Recognition Based on Self-supervised Pre-training

Jun 27, 2022

A practical framework for multi-domain speech recognition and an instance sampling method to neural language modeling

Mar 09, 2022

Improving CTC-based speech recognition via knowledge transferring from pre-trained language models

Feb 22, 2022

Improving Hybrid CTC/Attention End-to-end Speech Recognition with Pretrained Acoustic and Language Model

Dec 14, 2021

Improving Accent Identification and Accented Speech Recognition Under a Framework of Self-supervised Learning

Sep 15, 2021

Improving Streaming Transformer Based ASR Under a Framework of Self-supervised Learning

Sep 15, 2021

Improving Speech Recognition Accuracy of Local POI Using Geographical Models

Jul 07, 2021

Multi-head Monotonic Chunkwise Attention For Online Speech Recognition

May 01, 2020