Picture for Hsin-Min Wang

Hsin-Min Wang

Improving Perceptual Audio Aesthetic Assessment via Triplet Loss and Self-Supervised Embeddings

Add code
Sep 03, 2025
Viaarxiv icon

Revealing the Role of Audio Channels in ASR Performance Degradation

Add code
Aug 12, 2025
Viaarxiv icon

QAMRO: Quality-aware Adaptive Margin Ranking Optimization for Human-aligned Assessment of Audio Generation Systems

Add code
Aug 12, 2025
Viaarxiv icon

A Study on Speech Assessment with Visual Cues

Add code
Jun 11, 2025
Viaarxiv icon

Towards Robust Assessment of Pathological Voices via Combined Low-Level Descriptors and Foundation Model Representations

Add code
May 30, 2025
Figure 1 for Towards Robust Assessment of Pathological Voices via Combined Low-Level Descriptors and Foundation Model Representations
Figure 2 for Towards Robust Assessment of Pathological Voices via Combined Low-Level Descriptors and Foundation Model Representations
Figure 3 for Towards Robust Assessment of Pathological Voices via Combined Low-Level Descriptors and Foundation Model Representations
Figure 4 for Towards Robust Assessment of Pathological Voices via Combined Low-Level Descriptors and Foundation Model Representations
Viaarxiv icon

Towards Robust Automated Perceptual Voice Quality Assessment with Speech Foundation Models

Add code
May 28, 2025
Figure 1 for Towards Robust Automated Perceptual Voice Quality Assessment with Speech Foundation Models
Figure 2 for Towards Robust Automated Perceptual Voice Quality Assessment with Speech Foundation Models
Figure 3 for Towards Robust Automated Perceptual Voice Quality Assessment with Speech Foundation Models
Figure 4 for Towards Robust Automated Perceptual Voice Quality Assessment with Speech Foundation Models
Viaarxiv icon

MSECG: Incorporating Mamba for Robust and Efficient ECG Super-Resolution

Add code
Dec 06, 2024
Viaarxiv icon

How Good is ChatGPT at Audiovisual Deepfake Detection: A Comparative Study of ChatGPT, AI Models and Human Perception

Add code
Nov 14, 2024
Figure 1 for How Good is ChatGPT at Audiovisual Deepfake Detection: A Comparative Study of ChatGPT, AI Models and Human Perception
Figure 2 for How Good is ChatGPT at Audiovisual Deepfake Detection: A Comparative Study of ChatGPT, AI Models and Human Perception
Figure 3 for How Good is ChatGPT at Audiovisual Deepfake Detection: A Comparative Study of ChatGPT, AI Models and Human Perception
Figure 4 for How Good is ChatGPT at Audiovisual Deepfake Detection: A Comparative Study of ChatGPT, AI Models and Human Perception
Viaarxiv icon

Understanding Audiovisual Deepfake Detection: Techniques, Challenges, Human Factors and Perceptual Insights

Add code
Nov 12, 2024
Figure 1 for Understanding Audiovisual Deepfake Detection: Techniques, Challenges, Human Factors and Perceptual Insights
Figure 2 for Understanding Audiovisual Deepfake Detection: Techniques, Challenges, Human Factors and Perceptual Insights
Figure 3 for Understanding Audiovisual Deepfake Detection: Techniques, Challenges, Human Factors and Perceptual Insights
Figure 4 for Understanding Audiovisual Deepfake Detection: Techniques, Challenges, Human Factors and Perceptual Insights
Viaarxiv icon

Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments with Advanced Post-Processing

Add code
Sep 22, 2024
Figure 1 for Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments with Advanced Post-Processing
Figure 2 for Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments with Advanced Post-Processing
Figure 3 for Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments with Advanced Post-Processing
Figure 4 for Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments with Advanced Post-Processing
Viaarxiv icon