Karel Mundnich

SpeechVerse: A Large-scale Generalizable Audio Language Model

May 14, 2024
Via arXiv

SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models

May 14, 2024
Via arXiv

Audiovisual Highlight Detection in Videos

Feb 11, 2021
Via arXiv

Characterizing dynamically varying acoustic scenes from egocentric audio recordings in workplace setting

Nov 10, 2019
Via arXiv

Generating Labels for Regression of Subjective Constructs using Triplet Embeddings

Apr 02, 2019
Via arXiv