Orchid Chetia Phukan

Rethinking Cross-Corpus Speech Emotion Recognition Benchmarking: Are Paralinguistic Pre-Trained Representations Sufficient?

Sep 19, 2025
Via arXiv

Are Multimodal Foundation Models All That Is Needed for Emofake Detection?

Sep 19, 2025
Via arXiv

Towards Neural Audio Codec Source Parsing

Jun 14, 2025
Via arXiv

Investigating Prosodic Signatures via Speech Pre-Trained Models for Audio Deepfake Source Attribution

Dec 23, 2024
Via arXiv

Beyond Speech and More: Investigating the Emergent Ability of Speech Foundation Models for Classifying Physiological Time-Series Signals

Oct 16, 2024
Via arXiv

SeQuiFi: Mitigating Catastrophic Forgetting in Speech Emotion Recognition with Sequential Class-Finetuning

Oct 16, 2024
Via arXiv

Multi-View Multi-Task Modeling with Speech Foundation Models for Speech Forensic Tasks

Oct 16, 2024
Via arXiv

Representation Loss Minimization with Randomized Selection Strategy for Efficient Environmental Fake Audio Detection

Sep 24, 2024
Via arXiv

Avengers Assemble: Amalgamation of Non-Semantic Features for Depression Detection

Sep 22, 2024
Via arXiv

Are Music Foundation Models Better at Singing Voice Deepfake Detection? Far-Better Fuse them with Speech Foundation Models

Sep 21, 2024
Via arXiv