Mark Hasegawa-Johnson

SyncDiff: Diffusion-based Talking Head Synthesis with Bottlenecked Temporal Visual Prior for Improved Synchronization

Mar 17, 2025
Via arXiv

Robust Cross-Etiology and Speaker-Independent Dysarthric Speech Recognition

Jan 25, 2025
Via arXiv

R2I-rPPG: A Robust Region of Interest Selection Method for Remote Photoplethysmography to Extract Heart Rate

Oct 21, 2024
Via arXiv

Fine-Tuning Automatic Speech Recognition for People with Parkinson's: An Effective Strategy for Enhancing Speech Technology Accessibility

Sep 29, 2024
Via arXiv

Just ASR + LLM? A Study on Speech Large Language Models' Ability to Identify and Understand Speaker in Spoken Dialogue

Sep 07, 2024
Via arXiv

LI-TTA: Language Informed Test-Time Adaptation for Automatic Speech Recognition

Aug 11, 2024
Via arXiv

Sound Tagging in Infant-centric Home Soundscapes

Jun 25, 2024
Via arXiv

Towards Unsupervised Speech Recognition Without Pronunciation Models

Jun 12, 2024
Via arXiv

C-TPT: Calibrated Test-Time Prompt Tuning for Vision-Language Models via Text Feature Dispersion

Mar 31, 2024
Via arXiv

AdaMER-CTC: Connectionist Temporal Classification with Adaptive Maximum Entropy Regularization for Automatic Speech Recognition

Mar 18, 2024
Via arXiv