Lip Reading


Chinese-LiPS: A Chinese audio-visual speech recognition dataset with Lip-reading and Presentation Slides

Apr 21, 2025

VALLR: Visual ASR Language Model for Lip Reading

Mar 27, 2025

GLaM-Sign: Greek Language Multimodal Lip Reading with Integrated Sign Language Accessibility

Jan 09, 2025

LipGen: Viseme-Guided Lip Video Generation for Enhancing Visual Speech Recognition

Jan 08, 2025

AVE Speech Dataset: A Comprehensive Benchmark for Multi-Modal Speech Recognition Integrating Audio, Visual, and Electromyographic Signals

Jan 28, 2025

Reading to Listen at the Cocktail Party: Multi-Modal Speech Separation

Jan 02, 2025

Spatio-temporal Transformers for Action Unit Classification with Event Cameras

Oct 29, 2024

Quantitative Analysis of Audio-Visual Tasks: An Information-Theoretic Perspective

Sep 29, 2024

Personalized Lip Reading: Adapting to Your Unique Lip Movements with Vision and Language

Sep 02, 2024

RAL: Redundancy-Aware Lipreading Model Based on Differential Learning with Symmetric Views

Sep 09, 2024