Huaming Wang

PAM: Prompting Audio-Language Models for Audio Quality Assessment

Feb 01, 2024
Via arXiv

NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription

Jan 16, 2024
Via arXiv

Prompting Audios Using Acoustic Properties For Emotion Representation

Oct 05, 2023
Via arXiv

Training Audio Captioning Models without Audio

Sep 14, 2023
Via arXiv

Natural Language Supervision for General-Purpose Audio Representations

Sep 11, 2023
Via arXiv

Pengi: An Audio Language Model for Audio Tasks

May 19, 2023
Via arXiv

Real-Time Audio-Visual End-to-End Speech Enhancement

Mar 13, 2023
Via arXiv

Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling

Mar 07, 2023
Via arXiv

Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers

Jan 05, 2023
Via arXiv

Learning to mask: Towards generalized face forgery detection

Dec 29, 2022
Via arXiv