
Yi-Chen Chen

Speech Representation Learning Through Self-supervised Pretraining And Multi-task Finetuning

Oct 18, 2021

SpeechNet: A Universal Modularized Model for Speech Processing Tasks

May 31, 2021

Self-supervised Pre-training Reduces Label Permutation Instability of Speech Separation

Oct 29, 2020

DARTS-ASR: Differentiable Architecture Search for Multilingual Speech Recognition and Adaptation

May 13, 2020

AIPNet: Generative Adversarial Pre-training of Accent-invariant Networks for End-to-end Speech Recognition

Nov 27, 2019

From Semi-supervised to Almost-unsupervised Speech Recognition with Very-low Resource by Jointly Learning Phonetic Structures from Audio and Text Embeddings

Apr 10, 2019

Improved Audio Embeddings by Adjacency-Based Clustering with Applications in Spoken Term Detection

Nov 07, 2018

Almost-unsupervised Speech Recognition with Close-to-zero Resource Based on Phonetic Structures Learned from Very Small Unpaired Speech and Text Data

Oct 30, 2018

Phonetic-and-Semantic Embedding of Spoken Words with Applications in Spoken Content Retrieval

Sep 03, 2018

Towards Unsupervised Automatic Speech Recognition Trained by Unaligned Speech and Text only

Aug 11, 2018