Bhiksha Raj

Language Technologies Institute, Carnegie Mellon University; Mohamed bin Zayed University of Artificial Intelligence

Less is More Tokens: Efficient Math Reasoning via Difficulty-Aware Chain-of-Thought Distillation

Sep 05, 2025, via arXiv

OleSpeech-IV: A Large-Scale Multispeaker and Multilingual Conversational Speech Dataset with Diverse Topics

Sep 04, 2025, via arXiv

Directed-Tokens: A Robust Multi-Modality Alignment Approach to Large Language-Vision Models

Aug 19, 2025, via arXiv

CarelessWhisper: Turning Whisper into a Causal Streaming Model

Aug 17, 2025, via arXiv

Deciphering GunType Hierarchy through Acoustic Analysis of Gunshot Recordings

Jun 25, 2025, via arXiv

CoLMbo: Speaker Language Model for Descriptive Profiling

Jun 11, 2025, via arXiv

Total-Editing: Head Avatar with Editable Appearance, Motion, and Lighting

May 26, 2025, via arXiv

CAARMA: Class Augmentation with Adversarial Mixup Regularization

Mar 20, 2025, via arXiv

Robust Latent Matters: Boosting Image Generation with Sampling Error

Mar 11, 2025, via arXiv

Mellow: a small audio language model for reasoning

Mar 11, 2025, via arXiv