Keyu An

Paraformer-v2: An improved non-autoregressive transformer for noise-robust speech recognition

Sep 26, 2024
Via arXiv

Are Transformers in Pre-trained LM A Good ASR Encoder? An Empirical Study

Sep 26, 2024
Via arXiv

Advancing VAD Systems Based on Multi-Task Learning with Improved Model Structures

Dec 19, 2023
Via arXiv

Exploring RWKV for Memory Efficient and Low Latency Streaming ASR

Sep 26, 2023
Via arXiv

BAT: Boundary aware transducer for memory-efficient and low-latency ASR

May 19, 2023
Via arXiv

An Empirical Study of Language Model Integration for Transducer based Speech Recognition

Mar 31, 2022
Via arXiv

CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming ASR

Mar 31, 2022
Via arXiv

Exploiting Single-Channel Speech for Multi-Channel End-to-End Speech Recognition: A Comparative Study

Mar 31, 2022
Via arXiv

Multilingual and crosslingual speech recognition using phonological-vector based phone embeddings

Jul 11, 2021
Via arXiv

Exploiting Single-Channel Speech For Multi-channel End-to-end Speech Recognition

Jul 06, 2021
Via arXiv