Yue Gu

CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens

Jul 09, 2024

The Effect of Predictive Formal Modelling at Runtime on Performance in Human-Swarm Interaction

Jan 22, 2024

Improving Label Assignments Learning by Dynamic Sample Dropout Combined with Layer-wise Optimization in Speech Separation

Nov 20, 2023

On-Device Constrained Self-Supervised Speech Representation Learning for Keyword Spotting via Knowledge Distillation

Jul 06, 2023

Self-supervised speech representation learning for keyword-spotting with light-weight transformers

Mar 07, 2023

Time-Domain Mapping Based Single-Channel Speech Separation With Hierarchical Constraint Training

Oct 20, 2021

Recurrent Distillation based Crowd Counting

Jun 14, 2020

A Monaural Speech Enhancement Method for Robust Small-Footprint Keyword Spotting

Jun 20, 2019

RHR-Net: A Residual Hourglass Recurrent Neural Network for Speech Enhancement

Apr 15, 2019

Multimodal Affective Analysis Using Hierarchical Attention Strategy with Word-Level Alignment

May 22, 2018