Swarup Ranjan Behera

Multi-View Multi-Task Modeling with Speech Foundation Models for Speech Forensic Tasks

Oct 16, 2024

Beyond Speech and More: Investigating the Emergent Ability of Speech Foundation Models for Classifying Physiological Time-Series Signals

Oct 16, 2024

SeQuiFi: Mitigating Catastrophic Forgetting in Speech Emotion Recognition with Sequential Class-Finetuning

Oct 16, 2024

Representation Loss Minimization with Randomized Selection Strategy for Efficient Environmental Fake Audio Detection

Sep 24, 2024

Avengers Assemble: Amalgamation of Non-Semantic Features for Depression Detection

Sep 22, 2024

Are Music Foundation Models Better at Singing Voice Deepfake Detection? Far-Better Fuse them with Speech Foundation Models

Sep 21, 2024

Strong Alone, Stronger Together: Synergizing Modality-Binding Foundation Models with Optimal Transport for Non-Verbal Emotion Recognition

Sep 21, 2024

Towards Multilingual Audio-Visual Question Answering

Jun 13, 2024

FastAST: Accelerating Audio Spectrogram Transformer via Token Merging and Cross-Model Knowledge Distillation

Jun 11, 2024

Spectral Clustering in Convex and Constrained Settings

Apr 03, 2024