
Xuedong Huang

ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation

May 24, 2023

i-Code Studio: A Configurable and Composable Framework for Integrative AI

May 23, 2023

i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data

May 21, 2023

Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization

Aug 21, 2022

i-Code: An Integrative and Composable Multimodal Learning Framework

May 05, 2022

Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention

Dec 14, 2021

Florence: A New Foundation Model for Computer Vision

Nov 22, 2021

One model to enhance them all: array geometry agnostic multi-channel personalized speech enhancement

Oct 20, 2021

Personalized Speech Enhancement: New Models and Comprehensive Evaluation

Oct 18, 2021

UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data

Jan 19, 2021