Li-Rong Dai

Adaptive Confidence Multi-View Hashing for Multimedia Retrieval

Dec 12, 2023
Via arXiv

Semantic VAD: Low-Latency Voice Activity Detection for Speech Interaction

May 21, 2023
Via arXiv

CASA-ASR: Context-Aware Speaker-Attributed ASR

May 21, 2023
Via arXiv

Joint Generative-Contrastive Representation Learning for Anomalous Sound Detection

May 20, 2023
Via arXiv

AST-SED: An Effective Sound Event Detection Method Based on Audio Spectrogram Transformer

Mar 07, 2023
Via arXiv

A Comparative Study on Multichannel Speaker-Attributed Automatic Speech Recognition in Multi-party Meetings

Nov 01, 2022
Via arXiv

Robust Data2vec: Noise-robust Speech Representation Learning for ASR by Combining Regression and Improved Contrastive Learning

Oct 27, 2022
Via arXiv

Joint Training of Speech Enhancement and Self-supervised Model for Noise-robust ASR

May 26, 2022
Via arXiv

A Complementary Joint Training Approach Using Unpaired Speech and Text for Low-Resource Automatic Speech Recognition

Apr 05, 2022
Via arXiv

Learning Contextually Fused Audio-visual Representations for Audio-visual Speech Recognition

Feb 15, 2022
Via arXiv