Peter Bell

Beyond Oversmoothing: Evaluating DDPM and MSE for Scalable Speech Synthesis in ASR

Oct 16, 2024

Semi-Supervised Cognitive State Classification from Speech with Multi-View Pseudo-Labeling

Sep 25, 2024

Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition

Sep 17, 2024

TTSDS -- Text-to-Speech Distribution Score

Jul 17, 2024

Speech Emotion Recognition with ASR Transcripts: A Comprehensive Study on Word Error Rate and Fusion Techniques

Jun 12, 2024

Phonetic Error Analysis of Raw Waveform Acoustic Models with Parametric and Non-Parametric CNNs

Jun 02, 2024

1st Place Solution to Odyssey Emotion Recognition Challenge Task1: Tackling Class Imbalance Problem

May 30, 2024

Explainable Attribute-Based Speaker Verification

May 30, 2024

Crossmodal ASR Error Correction with Discrete Speech Units

May 26, 2024

LLM-Personalize: Aligning LLM Planners with Human Preferences via Reinforced Self-Training for Housekeeping Robots

Apr 22, 2024