
Rodrigo Mira

Unified Speech Recognition: A Single Model for Auditory, Visual, and Audiovisual Inputs

Nov 04, 2024

RT-LA-VocE: Real-Time Low-SNR Audio-Visual Speech Enhancement

Jul 10, 2024

BRAVEn: Improving Self-Supervised Pre-training for Visual and Auditory Speech Recognition

Apr 02, 2024

Laughing Matters: Introducing Laughing-Face Generation using Diffusion Models

May 15, 2023

Jointly Learning Visual and Auditory Speech Representations from Raw Data

Dec 12, 2022

LA-VocE: Low-SNR Audio-visual Speech Enhancement using Neural Vocoders

Nov 20, 2022

SVTS: Scalable Video-to-Speech Synthesis

May 04, 2022

Leveraging Real Talking Faces via Self-Supervision for Robust Forgery Detection

Jan 18, 2022

LiRA: Learning Visual Speech Representations from Audio through Self-supervision

Jun 16, 2021

End-to-End Video-To-Speech Synthesis using Generative Adversarial Networks

Apr 30, 2021