Xinjian Li

Bernie

Towards Robust Speech Representation Learning for Thousands of Languages

Jul 02, 2024
Via arXiv

YODAS: Youtube-Oriented Dataset for Audio and Speech

Jun 02, 2024
Via arXiv

Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data

Oct 02, 2023
Via arXiv

Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining

Feb 05, 2023
Via arXiv

Textless Direct Speech-to-Speech Translation with Discrete Speech Representation

Oct 31, 2022
Via arXiv

ASR2K: Speech Recognition for Around 2000 Languages without Audio

Sep 06, 2022
Via arXiv

On Adversarial Robustness of Large-scale Audio Visual Learning

Mar 23, 2022
Via arXiv

Multi-Faceted Hierarchical Multi-Task Learning for a Large Number of Tasks with Multi-dimensional Relations

Oct 26, 2021
Via arXiv

On Prosody Modeling for ASR+TTS based Voice Conversion

Jul 20, 2021
Via arXiv

Phoneme Recognition through Fine Tuning of Phonetic Representations: a Case Study on Luhya Language Varieties

Apr 04, 2021
Via arXiv