Arun Babu

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Aug 20, 2024
Via arXiv

Toward Joint Language Modeling for Speech Units and Text

Oct 12, 2023
Via arXiv

Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning

Sep 05, 2023
Via arXiv

Scaling Speech Technology to 1,000+ Languages

May 22, 2023
Via arXiv

Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language

Dec 14, 2022
Via arXiv

data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language

Feb 07, 2022
Via arXiv

XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale

Nov 19, 2021
Via arXiv

Latency-Aware Neural Architecture Search with Multi-Objective Bayesian Optimization

Jun 25, 2021
Via arXiv

Span Pointer Networks for Non-Autoregressive Task-Oriented Semantic Parsing

Apr 16, 2021
Via arXiv

Non-Autoregressive Semantic Parsing for Compositional Task-Oriented Dialog

Apr 11, 2021
Via arXiv