Picture for Swarup Ranjan Behera

Swarup Ranjan Behera

Beyond Speech and More: Investigating the Emergent Ability of Speech Foundation Models for Classifying Physiological Time-Series Signals

Add code
Oct 16, 2024
Figure 1 for Beyond Speech and More: Investigating the Emergent Ability of Speech Foundation Models for Classifying Physiological Time-Series Signals
Figure 2 for Beyond Speech and More: Investigating the Emergent Ability of Speech Foundation Models for Classifying Physiological Time-Series Signals
Figure 3 for Beyond Speech and More: Investigating the Emergent Ability of Speech Foundation Models for Classifying Physiological Time-Series Signals
Figure 4 for Beyond Speech and More: Investigating the Emergent Ability of Speech Foundation Models for Classifying Physiological Time-Series Signals
Viaarxiv icon

SeQuiFi: Mitigating Catastrophic Forgetting in Speech Emotion Recognition with Sequential Class-Finetuning

Add code
Oct 16, 2024
Figure 1 for SeQuiFi: Mitigating Catastrophic Forgetting in Speech Emotion Recognition with Sequential Class-Finetuning
Figure 2 for SeQuiFi: Mitigating Catastrophic Forgetting in Speech Emotion Recognition with Sequential Class-Finetuning
Viaarxiv icon

Multi-View Multi-Task Modeling with Speech Foundation Models for Speech Forensic Tasks

Add code
Oct 16, 2024
Figure 1 for Multi-View Multi-Task Modeling with Speech Foundation Models for Speech Forensic Tasks
Figure 2 for Multi-View Multi-Task Modeling with Speech Foundation Models for Speech Forensic Tasks
Figure 3 for Multi-View Multi-Task Modeling with Speech Foundation Models for Speech Forensic Tasks
Figure 4 for Multi-View Multi-Task Modeling with Speech Foundation Models for Speech Forensic Tasks
Viaarxiv icon

Representation Loss Minimization with Randomized Selection Strategy for Efficient Environmental Fake Audio Detection

Add code
Sep 24, 2024
Figure 1 for Representation Loss Minimization with Randomized Selection Strategy for Efficient Environmental Fake Audio Detection
Figure 2 for Representation Loss Minimization with Randomized Selection Strategy for Efficient Environmental Fake Audio Detection
Figure 3 for Representation Loss Minimization with Randomized Selection Strategy for Efficient Environmental Fake Audio Detection
Figure 4 for Representation Loss Minimization with Randomized Selection Strategy for Efficient Environmental Fake Audio Detection
Viaarxiv icon

Avengers Assemble: Amalgamation of Non-Semantic Features for Depression Detection

Add code
Sep 22, 2024
Viaarxiv icon

Strong Alone, Stronger Together: Synergizing Modality-Binding Foundation Models with Optimal Transport for Non-Verbal Emotion Recognition

Add code
Sep 21, 2024
Figure 1 for Strong Alone, Stronger Together: Synergizing Modality-Binding Foundation Models with Optimal Transport for Non-Verbal Emotion Recognition
Figure 2 for Strong Alone, Stronger Together: Synergizing Modality-Binding Foundation Models with Optimal Transport for Non-Verbal Emotion Recognition
Figure 3 for Strong Alone, Stronger Together: Synergizing Modality-Binding Foundation Models with Optimal Transport for Non-Verbal Emotion Recognition
Figure 4 for Strong Alone, Stronger Together: Synergizing Modality-Binding Foundation Models with Optimal Transport for Non-Verbal Emotion Recognition
Viaarxiv icon

Are Music Foundation Models Better at Singing Voice Deepfake Detection? Far-Better Fuse them with Speech Foundation Models

Add code
Sep 21, 2024
Figure 1 for Are Music Foundation Models Better at Singing Voice Deepfake Detection? Far-Better Fuse them with Speech Foundation Models
Figure 2 for Are Music Foundation Models Better at Singing Voice Deepfake Detection? Far-Better Fuse them with Speech Foundation Models
Figure 3 for Are Music Foundation Models Better at Singing Voice Deepfake Detection? Far-Better Fuse them with Speech Foundation Models
Figure 4 for Are Music Foundation Models Better at Singing Voice Deepfake Detection? Far-Better Fuse them with Speech Foundation Models
Viaarxiv icon

Towards Multilingual Audio-Visual Question Answering

Add code
Jun 13, 2024
Viaarxiv icon

FastAST: Accelerating Audio Spectrogram Transformer via Token Merging and Cross-Model Knowledge Distillation

Add code
Jun 11, 2024
Viaarxiv icon

Spectral Clustering in Convex and Constrained Settings

Add code
Apr 03, 2024
Viaarxiv icon