Soham Deshmukh

Microsoft

MACE: Leveraging Audio for Evaluating Audio Captioning Systems

Nov 05, 2024

Audio Entailment: Assessing Deductive Reasoning for Audio Understanding

Jul 25, 2024

SELM: Enhancing Speech Emotion Recognition for Out-of-Domain Scenarios

Jul 22, 2024

Domain Adaptation for Contrastive Audio-Language Models

Feb 14, 2024

PAM: Prompting Audio-Language Models for Audio Quality Assessment

Feb 01, 2024

Prompting Audios Using Acoustic Properties For Emotion Representation

Oct 05, 2023

LoFT: Local Proxy Fine-tuning For Improving Transferability Of Adversarial Attacks Against Large Language Model

Oct 02, 2023

Training Audio Captioning Models without Audio

Sep 14, 2023

Natural Language Supervision for General-Purpose Audio Representations

Sep 11, 2023

Pengi: An Audio Language Model for Audio Tasks

May 19, 2023