Bhiksha Raj

Language Technologies Institute, Carnegie Mellon University; Mohamed bin Zayed University of Artificial Intelligence

Deciphering GunType Hierarchy through Acoustic Analysis of Gunshot Recordings

Jun 25, 2025
Via arXiv

CoLMbo: Speaker Language Model for Descriptive Profiling

Jun 11, 2025
Via arXiv

Total-Editing: Head Avatar with Editable Appearance, Motion, and Lighting

May 26, 2025
Via arXiv

CAARMA: Class Augmentation with Adversarial Mixup Regularization

Mar 20, 2025
Via arXiv

Mellow: a small audio language model for reasoning

Mar 11, 2025
Via arXiv

Robust Latent Matters: Boosting Image Generation with Sampling Error

Mar 11, 2025
Via arXiv

On the Robust Approximation of ASR Metrics

Feb 18, 2025
Via arXiv

Lost in Transcription, Found in Distribution Shift: Demystifying Hallucination in Speech Foundation Models

Feb 18, 2025
Via arXiv

ADIFF: Explaining audio difference using natural language

Feb 06, 2025
Via arXiv

Masked Autoencoders Are Effective Tokenizers for Diffusion Models

Feb 05, 2025
Via arXiv