Maja Pantic

Unified Speech Recognition: A Single Model for Auditory, Visual, and Audiovisual Inputs

Nov 04, 2024
Via arXiv

Full-Rank No More: Low-Rank Weight Training for Modern Speech Recognition Models

Oct 10, 2024
Via arXiv

RT-LA-VocE: Real-Time Low-SNR Audio-Visual Speech Enhancement

Jul 10, 2024
Via arXiv

Dynamic Data Pruning for Automatic Speech Recognition

Jun 26, 2024
Via arXiv

MSRS: Training Multimodal Speech Recognition Models from Scratch with Sparse Mask Optimization

Jun 25, 2024
Via arXiv

EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars

Apr 29, 2024
Via arXiv

BRAVEn: Improving Self-Supervised Pre-training for Visual and Auditory Speech Recognition

Apr 02, 2024
Via arXiv

Audio-visual video-to-speech synthesis with synthesized input audio

Jul 31, 2023
Via arXiv

SparseVSR: Lightweight and Noise Robust Visual Speech Recognition

Jul 10, 2023
Via arXiv

Large-scale unsupervised audio pre-training for video-to-speech synthesis

Jun 27, 2023
Via arXiv