Digbalay Bose

Can Text-to-image Model Assist Multi-modal Learning for Visual Recognition with Visual Modality Missing?

Feb 14, 2024
Via arXiv

Does Video Summarization Require Videos? Quantifying the Effectiveness of Language in Video Summarization

Sep 18, 2023
Via arXiv

MM-AU: Towards Multimodal Understanding of Advertisement Videos

Aug 27, 2023
Via arXiv

FedMultimodal: A Benchmark For Multimodal Federated Learning

Jun 20, 2023
Via arXiv

Unlocking Foundation Models for Privacy-Enhancing Speech Understanding: An Early Study on Low Resource Speech Training Leveraging Label-guided Synthetic Speech Content

Jun 13, 2023
Via arXiv

Signal Processing Grand Challenge 2023 -- e-Prevention: Sleep Behavior as an Indicator of Relapses in Psychotic Patients

Apr 17, 2023
Via arXiv

Contextually-rich human affect perception using multimodal scene information

Mar 13, 2023
Via arXiv

A dataset for Audio-Visual Sound Event Detection in Movies

Feb 14, 2023
Via arXiv

Multimodal Estimation of Change Points of Physiological Arousal in Drivers

Oct 28, 2022
Via arXiv

Understanding of Emotion Perception from Art

Oct 13, 2021
Via arXiv