Huaming Wang

PAM: Prompting Audio-Language Models for Audio Quality Assessment

Feb 01, 2024
Via arXiv

NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription

Jan 16, 2024
Via arXiv

Prompting Audios Using Acoustic Properties For Emotion Representation

Oct 05, 2023
Via arXiv

Training Audio Captioning Models without Audio

Sep 14, 2023
Via arXiv

Natural Language Supervision for General-Purpose Audio Representations

Sep 11, 2023
Via arXiv

Pengi: An Audio Language Model for Audio Tasks

May 19, 2023
Via arXiv

Real-Time Audio-Visual End-to-End Speech Enhancement

Mar 13, 2023
Via arXiv

Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling

Mar 07, 2023
Via arXiv

Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers

Jan 05, 2023
Via arXiv

Learning to mask: Towards generalized face forgery detection

Dec 29, 2022
Via arXiv