Dan Su

DurIAN-E 2: Duration Informed Attention Network with Adaptive Variational Autoencoder and Adversarial Learning for Expressive Text-to-Speech Synthesis

Oct 17, 2024
Via arXiv

Nemotron-4 340B Technical Report

Jun 17, 2024
Via arXiv

Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer

Jun 03, 2024
Via arXiv

Fuse after Align: Improving Face-Voice Association Learning via Multimodal Encoder

Apr 15, 2024
Via arXiv

Nemotron-4 15B Technical Report

Feb 27, 2024
Via arXiv

MM-LLMs: Recent Advances in MultiModal Large Language Models

Jan 25, 2024
Via arXiv

A High Fidelity and Low Complexity Neural Audio Coding

Oct 17, 2023
Via arXiv

DurIAN-E: Duration Informed Attention Network For Expressive Text-to-Speech Synthesis

Sep 22, 2023
Via arXiv

Text-Only Domain Adaptation for End-to-End Speech Recognition through Down-Sampling Acoustic Representation

Sep 04, 2023
Via arXiv

Model Debiasing via Gradient-based Explanation on Representation

May 20, 2023
Via arXiv